Reset all


Content Types


AID systems



Data access

Data access restrictions

Database access

Database access restrictions

Database licenses

Data licenses

Data upload

Data upload restrictions

Enhanced publication

Institution responsibility type

Institution type


Metadata standards

PID systems

Provider types

Quality management

Repository languages



Repository types


  • * at the end of a keyword allows wildcard searches
  • " quotes can be used for searching phrases
  • + represents an AND search (default)
  • | represents an OR search
  • - represents a NOT operation
  • ( and ) implies priority
  • ~N after a word specifies the desired edit distance (fuzziness)
  • ~N after a phrase specifies the desired slop amount
Found 50 result(s)
The Cognitive Function and Ageing Studies (CFAS) are population based studies of individuals aged 65 years and over living in the community, including institutions, which is the only large multi-centred population-based study in the UK that has reached sufficient maturity. There are three main studies within the CFAS group. MRC CFAS, the original study began in 1989, with three of its sites providing a parent subset for the comparison two decades later with CFAS II (2008 onwards). Subsequently another CFAS study, CFAS Wales began in 2011.
Exposures in the period from conception to early childhood - including fetal growth, cell division, and organ functioning - may have long-lasting impact on health and disease susceptibility. To investigate these issues the Danish National Birth Cohort (Better health in generations) was established. A large cohort of pregnant women with long-term follow-up of the offspring was the obvious choice because many of the exposures of interest cannot be reconstructed with suffcient validity back in time. The study needed to be large, and the aim was to recruit 100,000 women early in pregnancy, and to continue follow-up for decades. Exposure information was collected by computer-assisted telephone interviews with the women twice during pregnancy and when their children were six and 18 months old. Participants were also asked to fill in a self-administered food frequency questionnaire in mid-pregnancy. Furthermore, a biological bank has been set up with blood taken from the mother twice during pregnancy and blood from theumbilical cord taken shortly after birth.
The Diabetes Study of Northern California (DISTANCE) conducts epidemiological and health services research in diabetes among a large, multiethnic cohort of patients in a large, integrated health care delivery system.
The data in the U of M’s Clinical Data Repository comes from the electronic health records (EHRs) of more than 2 million patients seen at 8 hospitals and more than 40 clinics. For each patient, data is available regarding the patient's demographics (age, gender, language, etc.), medical history, problem list, allergies, immunizations, outpatient vitals, diagnoses, procedures, medications, lab tests, visit locations, providers, provider specialties, and more.
Exposome-Explorer is the first database dedicated to biomarkers of exposure to environmental risk factors for diseases. It contains detailed information on the nature of biomarkers, populations and subjects where measured, samples analyzed, methods used for biomarker analyses, concentrations in biospecimens, correlations with external exposure measurements, and biological reproducibility over time.
The FDZ-DZA (Forschungsdatenzentrum DZA) is a facility of the German Centre of Gerontology (Deutsches Zentrum für Altersfragen, DZA) and has received accreditation as research data center DZA by the German Data Forum (RatSWD). Its main task is to make data of the German Ageing Survey DEAS and the German Survey on Volunteering (FWS) accessible to researchers by providing user-friendly Scientific Use Files (SUF), documentation of the contents and instruments as well support for scholars using the data.
TOXMAP® is a Geographic Information System (GIS) from the Division of Specialized Information Services of the US National Library of Medicine® (NLM) that uses maps of the United States and Canada to help users visually explore data primarily from the US Environmental Protection Agency (EPA)'s Toxics Release Inventory (TRI) and Superfund Program.
The NCAA Student-Athlete Experiences Data Archive provides access to data about student athletes and will grow to include a handful of user-friendly data collections related to graduation rates; team-level Academic Progress Rates in Division I; and individual-level data on the experiences of current and former student-athletes from the NCAA's Growth, Opportunities, Aspirations and Learning of Students in college study (GOALS), and the Study of College Outcomes and Recent Experiences (SCORE). In the long run, the NCAA expects to follow this initial release with the publication of as much data as possible from its archives. The data is used by college presidents, athletic personnel, faculty, student-athlete groups, media members, and researchers in looking at issues related to intercollegiate athletics and higher education.
Project Tycho® is a project at the University of Pittsburgh to advance the availability and use of public health data for science and policy making. Currently, the Project Tycho® database includes data from all weekly notifiable disease reports for the United States dating back to 1888. These data are freely available to anybody interested. Additional U.S. and international data will be released twice yearly.
The Cancer Immunome Database (TCIA) provides results of comprehensive immunogenomic analyses of next generation sequencing data (NGS) data for 19 solid cancers from The Cancer Genome Atlas (TCGA) and other datasource. The Cancer Immunome Atlas (TCIA) was developed and is maintained at the Division of Bioinformatics (ICBI). The database can be queried for the gene expression of specific immune-related gene sets, cellular composition of immune infiltrates (characterized using gene set enrichment analyses and deconvolution), neoantigens and cancer-germline antigens, HLA types, and tumor heterogeneity (estimated from cancer cell fractions). Moreover it provides survival analyses for different types immunological parameters. TCIA will be constantly updated with new data and results.
The Nord-Trøndelag Health Study (The HUNT Study) is one of the largest health studies ever performed. It is a unique database of personal and family medical histories collected during three intensive studies. The fundamental strategy is to earn and maintain the confidence of the population we work in and with as is necessary for any successful population study. This strategy has been successful and has resulted in extraordinarily high participation rates. There is enthusiastic public and political support for HUNT and for the HUNT Research Centre. This has created a good basis for further health surveys in the county and an excellent research environment. Today, the HUNT Study is a database with information about approximately 120,000 people that integrates family data and individual data and can be linked to national health registries.
The Human Mortality Database (HMD) was created to provide detailed mortality and population data to researchers, students, journalists, policy analysts, and others interested in the history of human longevity. The Human Mortality Database (HMD) contains original calculations of death rates and life tables for national populations (countries or areas), as well as the input data used in constructing those tables. The input data consist of death counts from vital statistics, plus census counts, birth counts, and population estimates from various sources.
TRAILS is a prospective cohort study, which started in 2001 with population cohort and 2004 with a clinical cohort (CC). Since then, a group of 2500 young people from the Northern part of the Netherlands has been closely monitored in order to chart and explain their mental, physical, and social development. These TRAILS participants have been measured every two to three years, by means of questionnaires, interviews, and all kinds of tests. By now, we have collected information that spans the total period from preadolescence up until young adulthood. One of the main goals of TRAILS is to contribute to the knowledge of the development of emotional and behavioral problems and the (social) functioning of preadolescents into adulthood, their determinants, and underlying mechanisms.
The Health and Retirement Study (HRS) is a longitudinal panel study that surveys a representative sample of more than 26,000 Americans over the age of 50 every two years. The study has collected information about income, work, assets, pension plans, health insurance, disability, physical health and functioning, cognitive functioning, genetic information and health care expenditures.
The Growing Up Today Study is a collaborative study between clinicians, researchers, and thousands of participants across the US and beyond. The aim of this study is to gain a deeper understanding of the factors that affect health throughout life. Together we are working to building one of the most powerful resources for fighting cancer, obesity, heart disease, depression, and so much more.
This site is dedicated to making high value health data more accessible to entrepreneurs, researchers, and policy makers in the hopes of better health outcomes for all. In a recent article, Todd Park, United States Chief Technology Officer, captured the essence of what the Health Data Initiative is all about and why our efforts here are so important.
The Integrated Fertility Survey Series (IFSS) is a project of the Population Studies aiming in view to produce a harmonized dataset of U.S. family and fertility surveys spanning five decades (1955-2002). IFSS integrates data from ten underlying component studies of family and fertility encompassing the Growth of American Families (GAF) in 1955 and 1960; National Fertility Surveys (NFS) in 1965 and 1970; as well as National Surveys of Family Growth (NSFG) in 1973, 1976, 1982, 1988, 1995, and 2002. The first release contains harmonized sociodemographic variables for all respondents from all ten component studies, including those related to marital status, race and ethnicity, etc. Thus it provides access to researchers, educators, students, policy makers, and others with a data resource to examine issues related to families and fertility in the United States. Potential users can download original/ harmonized datasets (along with documentation) and numerous analytic tools make it possible to quickly and easily explore the data and obtain information about changes in behaviors and attitudes across time.
Content type(s)
A small genotype data repository containing data used in recent papers from the Estonian Biocentre. Most of the data pertains to human population genetics. PDF files of the papers are also freely available.
A premier source for United States cancer statistics, SEER gathers information related to incidence, prevalence, and survival from specific geographic areas that represent 28 percent of the population, as well as compiles related reports and reports on the national cancer mortality rates. Their aim is to provide information related to cancer statistics and decrease the burden of cancer in the national population. SEER has been collecting data from cancer cases since 1973.
The Twenty-07 Study was set up in 1986 in order to investigate the reasons for differences in health by socio-economic circumstances, gender, area of residence, age, ethnic group, and family type. 4510 people are being followed for 20 years. The initial wave of data collection took place in 1987/8, when respondents were aged 15, 35 and 55. The final wave of data collection took place in 2007/08 when respondents were aged 35, 55 and 75. In this way the Twenty-07 Study provides us with unique opportunities to investigate both the changes in people's lives over 20 years and how they affect their health, and the differences in people's experiences at the same ages 20 years apart, and how these have different effects on their health. is the Centers for Disease Control and Prevention primary online communication channel. provides users with credible, reliable health information on Data and Statistics, Diseases and Conditions, Emergencies and Disasters, Environmental Health, Healthy Living, Injury, Violence and Safety,Life Stages and Populations, Travelers' Health, Workplace Safety and Health