  • * at the end of a keyword allows wildcard searches
  • " quotes can be used for searching phrases
  • + represents an AND search (default)
  • | represents an OR search
  • - represents a NOT operation
  • ( and ) imply priority
  • ~N after a word specifies the desired edit distance (fuzziness)
  • ~N after a phrase specifies the desired slop amount
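The operators above can be combined into a single query string. The following is a minimal sketch, assuming hypothetical helper names (phrase, prefix, fuzzy, any_of, all_of, exclude) that are not part of the site; it only builds the text to type into the search box and does not call any search service.

```python
# Hypothetical helpers that compose a query string from the operators listed above.
# They only build text; nothing here talks to a real search service.

def phrase(text: str) -> str:
    """Wrap text in quotes so it is searched as a phrase."""
    return f'"{text}"'

def prefix(word: str) -> str:
    """Append * for a wildcard search on the end of a keyword."""
    return f"{word}*"

def fuzzy(word: str, distance: int) -> str:
    """Append ~N to request an edit distance (fuzziness) of N."""
    return f"{word}~{distance}"

def any_of(*terms: str) -> str:
    """Join terms with | (OR) and group them with parentheses for priority."""
    return "(" + " | ".join(terms) + ")"

def all_of(*terms: str) -> str:
    """Join terms with + (AND, which is also the default)."""
    return " + ".join(terms)

def exclude(term: str) -> str:
    """Prefix a term with - for a NOT operation."""
    return f"-{term}"

query = all_of(
    phrase("cancer data"),
    any_of(prefix("imag"), fuzzy("genome", 1)),
    exclude("offline"),
)
print(query)  # "cancer data" + (imag* | genome~1) + -offline
```

The example terms are illustrative only; any keywords can be substituted.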
Found 44 result(s)
The CancerData site is an effort of the Medical Informatics and Knowledge Engineering team (MIKE for short) of Maastro Clinic, Maastricht, The Netherlands. Our activities in the field of medical image analysis and data modelling are visible in a number of projects we are running. CancerData offers several datasets, which are grouped in collections and can be public or private. You can search for public datasets in the NBIA (National Biomedical Imaging Archive) image archives without logging in.
Research Data Centres offer secure access to detailed microdata from Statistics Canada's surveys and Canadian census data, as well as to an increasing number of administrative data sets. The search engine was designed to help you find more easily which dataset among all the surveys available in the RDCs best suits your research needs.
Intrepid Bioinformatics serves as a community for genetic researchers and scientific programmers who need to achieve meaningful use of their genetic research data, but can’t spend tremendous amounts of time or money in the process. The Intrepid Bioinformatics system automates time-consuming manual processes, shortens workflow, and eliminates the threat of lost data in a faster, cheaper, and better environment than existing solutions. The system also provides the functionality and community features needed to analyze the large volumes of Next Generation Sequencing and Single Nucleotide Polymorphism data that are generated for a wide range of purposes, from disease tracking and animal breeding to medical diagnosis and treatment.
The WorldWide Antimalarial Resistance Network (WWARN) is a collaborative platform generating innovative resources and reliable evidence to inform the malaria community on the factors affecting the efficacy of antimalarial medicines. Access to data is provided through diverse Tools and Resources: WWARN Explorer, Molecular Surveyor K13 Methodology, Molecular Surveyor pfmdr1 & pfcrt, Molecular Surveyor dhfr & dhps.
caNanoLab is a data sharing portal designed to facilitate information sharing in the biomedical nanotechnology research community to expedite and validate the use of nanotechnology in biomedicine. caNanoLab provides support for the annotation of nanomaterials with characterizations resulting from physico-chemical and in vitro assays and the sharing of these characterizations and associated nanotechnology protocols in a secure fashion.
Human Proteinpedia is a community portal for sharing and integration of human protein data. This is a joint project between Pandey at Johns Hopkins University, and Institute of Bioinformatics, Bangalore. This portal allows research laboratories around the world to contribute and maintain protein annotations. Human Protein Reference Database (HPRD) integrates the data deposited in Human Proteinpedia with the existing literature-curated information in the context of an individual protein. All the public data contributed to Human Proteinpedia can be queried, viewed and downloaded. Data pertaining to post-translational modifications, protein interactions, tissue expression, expression in cell lines, subcellular localization and enzyme-substrate relationships may be deposited.
This interface provides access to several types of data related to the Chesapeake Bay. Bay Program databases can be queried based upon user-defined inputs such as geographic region and date range. Each query results in a downloadable, tab- or comma-delimited text file that can be imported to any program (e.g., SAS, Excel, Access) for further analysis. Comments regarding the interface are encouraged. Questions in reference to the data should be addressed to the contact provided on subsequent pages.
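A downloaded query result can also be read with a short script instead of a desktop program. The following is a minimal sketch, assuming a file with a header row; the function name read_delimited and the column names Station, SampleDate and Value are hypothetical and not taken from the Bay Program databases.

```python
import csv
import io

def read_delimited(text: str) -> list[dict[str, str]]:
    """Parse tab- or comma-delimited text with a header row into row dicts."""
    lines = text.splitlines()
    header = lines[0] if lines else ""
    # Assumption: a tab in the header row means the file is tab-delimited.
    delimiter = "\t" if "\t" in header else ","
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))

# Hypothetical sample content standing in for a downloaded file.
sample_tab = "Station\tSampleDate\tValue\nCB1.1\t2015-06-01\t7.2\n"
sample_csv = "Station,SampleDate,Value\nCB1.1,2015-06-01,7.2\n"

rows = read_delimited(sample_tab)
print(rows[0]["Station"], rows[0]["Value"])
```

For a real download, the text would come from reading the saved file, for example with open(path, newline="").read().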
The Cancer Immunome Database (TCIA) provides results of comprehensive immunogenomic analyses of next generation sequencing (NGS) data for 19 solid cancers from The Cancer Genome Atlas (TCGA) and other data sources. The Cancer Immunome Atlas (TCIA) was developed and is maintained at the Division of Bioinformatics (ICBI). The database can be queried for the gene expression of specific immune-related gene sets, cellular composition of immune infiltrates (characterized using gene set enrichment analyses and deconvolution), neoantigens and cancer-germline antigens, HLA types, and tumor heterogeneity (estimated from cancer cell fractions). Moreover, it provides survival analyses for different types of immunological parameters. TCIA will be constantly updated with new data and results.
!! Checked 20.03.2017: SumsDB was offline !! SumsDB (the Surface Management System DataBase) is a repository of brain-mapping data (surfaces & volumes; structural & functional data) from many laboratories.
NACDA acquires and preserves data relevant to gerontological research, processes them as needed to promote effective research use, disseminates them to researchers, and facilitates their use. By preserving and making available the largest library of electronic data on aging in the United States, NACDA offers opportunities for secondary analysis on major issues of scientific and policy relevance.
The MG-RAST server is an open source system for annotation and comparative analysis of metagenomes. Users can upload raw sequence data in fasta format; the sequences will be normalized and processed and summaries automatically generated. The server provides several methods to access the different data types, including phylogenetic and metabolic reconstructions, and the ability to compare the metabolism and annotations of one or more metagenomes and genomes. In addition, the server offers a comprehensive search capability. Access to the data is password protected, and all data generated by the automated pipeline is available for download in a variety of common formats. MG-RAST has become an unofficial repository for metagenomic data, providing a means to make your data public so that it is available for download and viewing of the analysis without registration, as well as a static link that you can use in publications. It also requires that you include experimental metadata about your sample when it is made public to increase the usefulness to the community.
TRAILS is a prospective cohort study, which started in 2001 with a population cohort and in 2004 with a clinical cohort (CC). Since then, a group of 2500 young people from the Northern part of the Netherlands has been closely monitored in order to chart and explain their mental, physical, and social development. These TRAILS participants have been measured every two to three years, by means of questionnaires, interviews, and all kinds of tests. By now, we have collected information that spans the total period from preadolescence up until young adulthood. One of the main goals of TRAILS is to contribute to the knowledge of the development of emotional and behavioral problems and the (social) functioning of preadolescents into adulthood, their determinants, and underlying mechanisms.
The ETH Data Archive is ETH Zurich's institutional digital long-term archive. Researchers who are affiliated with ETH Zurich, the Swiss Federal Institute of Technology, may deposit file-based research data from all domains. In particular, supplementary material to publications is deposited and published here. Research data includes raw data, processed data, software code and other data considered relevant to ensure reproducibility of research results or to facilitate re-use for new research questions. The ETH Data Archive contains both public research data with DOI and data with restricted access. Beyond this, born-digital and digitized documents and other data from libraries, collections and archives are preserved in the ETH Data Archive, usually in the form of a dark archive without public access. You can find open access data by searching the Knowledge Portal. You may narrow your search either to the Resource Type "Research Data" or to the Collection "ETH Data Archive".
The Health and Retirement Study (HRS) is a longitudinal panel study that surveys a representative sample of more than 26,000 Americans over the age of 50 every two years. The study has collected information about income, work, assets, pension plans, health insurance, disability, physical health and functioning, cognitive functioning, genetic information and health care expenditures.
CPES provides access to information that relates to mental disorders among the general population. Its primary goal is to collect data about the prevalence of mental disorders and their treatments in adult populations in the United States. It also allows for research related to cultural and ethnic influences on mental health. CPES combines the data collected in three different nationally representative surveys (National Comorbidity Survey Replication, National Survey of American Life, National Latino and Asian American Study).
CODEX is a database of NGS mouse and human experiments. Although the main focus of CODEX is haematopoiesis and embryonic systems, the database includes a large variety of cell types. In addition to the publicly available data, CODEX also includes a private site hosting non-published data. CODEX provides access to processed and curated NGS experiments. To use CODEX: (i) select a specialized repository (HAEMCODE or ESCODE) or choose the whole compendium (CODEX), then (ii) filter by organism and (iii) choose how to explore the database.
Synapse is an open source software platform that clinical and biological data scientists can use to carry out, track, and communicate their research in real time. Synapse enables co-location of scientific content (data, code, results) and narrative descriptions of that work.
NURSA began in 2002 with the objective to accrue, develop and communicate information about the nuclear receptor superfamily. Over the last ten years, NURSA has built a website that has grown into a comprehensive source of information about nuclear receptors and their co-regulators, ligands, and downstream targets. Through a series of integrated 'omics-scale and informatics projects, NURSA has fostered a systems biology understanding of nuclear receptor function, physiology and regulation of target gene networks in vivo.
PhysioNet is an on-line forum for the dissemination and exchange of recorded biomedical signals and open-source software for analyzing them. It provides facilities for the cooperative analysis of data and the evaluation of proposed new algorithms. In addition to providing free electronic access to PhysioBank data and PhysioToolkit software via the World Wide Web, PhysioNet offers services and training via on-line tutorials to assist users with varying levels of expertise. PhysioNet is a resource for biomedical research and development. It has three closely interdependent components. PhysioBank is a large and growing archive of well-characterized digital recordings of physiologic signals, time series, and related data for use by the biomedical research community. PhysioBank currently includes more than 60 collections of cardiopulmonary, neural, and other biomedical signals from healthy subjects and patients with a variety of conditions with major public health implications, including sudden cardiac death, congestive heart failure, epilepsy, gait disorders, sleep apnea, and aging. PhysioToolkit is a large and growing library of software for physiologic signal processing and analysis, detection of physiologically significant events using both classical techniques and novel methods based on statistical physics and nonlinear dynamics, interactive display and characterization of signals, creation of new databases, simulation of physiologic and other signals, quantitative evaluation and comparison of analysis methods, and analysis of nonequilibrium and nonstationary processes. PhysioNetWorks is a virtual laboratory where you can work together with us and with colleagues anywhere in the world to create, evaluate, improve, document, and prepare new data and software "works" for publication on PhysioNet. Unlike all other parts of the PhysioNet web site, access to PhysioNetWorks is password-protected. (Accounts are free and a password can be obtained in a minute or two.)
The Avian Knowledge Network (AKN) is an international network of governmental and non-governmental institutions and individuals linking avian conservation, monitoring and science through efficient data management and coordinated development of useful solutions using best-science practices based on the data.
The Twenty-07 Study was set up in 1986 in order to investigate the reasons for differences in health by socio-economic circumstances, gender, area of residence, age, ethnic group, and family type. 4510 people are being followed for 20 years. The initial wave of data collection took place in 1987/8, when respondents were aged 15, 35 and 55. The final wave of data collection took place in 2007/08 when respondents were aged 35, 55 and 75. In this way the Twenty-07 Study provides us with unique opportunities to investigate both the changes in people's lives over 20 years and how they affect their health, and the differences in people's experiences at the same ages 20 years apart, and how these have different effects on their health.
PSI is a global health organization dedicated to improving the health of people in the developing world by focusing on serious challenges like a lack of family planning, HIV and AIDS, barriers to maternal health, and the greatest threats to children under five, including malaria, diarrhea, pneumonia and malnutrition. A hallmark of PSI is a commitment to the principle that health services and products are most effective when they are accompanied by robust communications and distribution efforts that help ensure wide acceptance and proper use. PSI works in partnership with local governments, ministries of health and local organizations to create health solutions that are built to last. We use original data to monitor and evaluate our programs, generate consumer insight, estimate the impact of our solutions, and evaluate the health of the markets we work to strengthen.
!! OFFLINE !! A recent computer security audit has revealed security flaws in the legacy HapMap site that require NCBI to take it down immediately. We regret the inconvenience, but we are required to do this. That said, NCBI was planning to decommission this site in the near future anyway (although not quite so suddenly), as the 1,000 genomes (1KG) project has established itself as a research standard for population genetics and genomics. NCBI has observed a decline in usage of the HapMap dataset and website with its available resources over the past five years and it has come to the end of its useful life. The International HapMap Project is a multi-country effort to identify and catalog genetic similarities and differences in human beings. Using the information in the HapMap, researchers will be able to find genes that affect health, disease, and individual responses to medications and environmental factors. The Project is a collaboration among scientists and funding agencies from Japan, the United Kingdom, Canada, China, Nigeria, and the United States. All of the information generated by the Project will be released into the public domain. The goal of the International HapMap Project is to compare the genetic sequences of different individuals to identify chromosomal regions where genetic variants are shared. By making this information freely available, the Project will help biomedical researchers find genes involved in disease and responses to therapeutic drugs. In the initial phase of the Project, genetic data are being gathered from four populations with African, Asian, and European ancestry. Ongoing interactions with members of these populations are addressing potential ethical issues and providing valuable experience in conducting research with identified populations. Public and private organizations in six countries are participating in the International HapMap Project. Data generated by the Project can be downloaded with minimal constraints. 
The Project officially started with a meeting in October 2002 and is expected to take about three years.