Reset all


Content Types


AID systems



Data access

Data access restrictions

Database access

Database access restrictions

Database licenses

Data licenses

Data upload

Data upload restrictions

Enhanced publication

Institution responsibility type

Institution type


Metadata standards

PID systems

Provider types

Quality management

Repository languages



Repository types


  • * at the end of a keyword allows wildcard searches
  • " quotes can be used for searching phrases
  • + represents an AND search (default)
  • | represents an OR search
  • - represents a NOT operation
  • ( and ) implies priority
  • ~N after a word specifies the desired edit distance (fuzziness)
  • ~N after a phrase specifies the desired slop amount
Found 171 result(s)
Peptidome was a public repository that archived tandem mass spectrometry peptide and protein identification data generated by the scientific community. This repository is now offline and is in archival mode. All data may be obtained from the Peptidome FTP site. Due to budgetary constraints NCBI has discontinued the Peptidome Repository. All existing data and metadata files will continue to be made available from our ftp server a indefinitely. Those files are named according to their Peptidome accession number, allowing cited data to be identified and downloaded. All of the Peptidome studies have been made publicly available at the PRoteomics IDEntifications (PRIDE) database. A map of Peptidome to Pride accessions may be found at If you have any specific questions, please feel free to contact us at
The IMSR is a searchable online database of mouse strains, stocks, and mutant ES cell lines available worldwide, including inbred, mutant, and genetically engineered strains. The goal of the IMSR is to assist the international scientific community in locating and obtaining mouse resources for research. Note that the data content found in the IMSR is as supplied by strain repository holders. For each strain or cell line listed in the IMSR, users can obtain information about: Where that resource is available (Repository Site); What state(s) the resource is available as (e.g. live, cryopreserved embryo or germplasm, ES cells); Links to descriptive information about a strain or ES cell line; Links to mutant alleles carried by a strain or ES cell line; Links for ordering a strain or ES cell line from a Repository; Links for contacting the Repository to send a query
Clone DB contains information about genomic clones and cDNA and cell-based libraries for eukaryotic organisms. The database integrates this information with sequence data, map positions, and distributor information. At this time, Clone DB contains records for genomic clones and libraries, the collection of MICER mouse gene targeting clones and cell-based gene trap and gene targeting libraries from the International Knockout Mouse Consortium, Lexicon and the International Gene Trap Consortium. A planned expansion for Clone DB will add records for additional gene targeting and gene trap clones, as well as cDNA clones.
The Gulf of Mexico Research Initiative Information and Data Cooperative (GRIIDC) is a team of researchers, data specialists and computer system developers who are supporting the development of a data management system to store scientific data generated by Gulf of Mexico researchers. The Master Research Agreement between BP and the Gulf of Mexico Alliance that established the Gulf of Mexico Research Initiative (GoMRI) included provisions that all data collected or generated through the agreement must be made available to the public. The Gulf of Mexico Research Initiative Information and Data Cooperative (GRIIDC) is the vehicle through which GoMRI is fulfilling this requirement. The mission of GRIIDC is to ensure a data and information legacy that promotes continual scientific discovery and public awareness of the Gulf of Mexico Ecosystem.
INDEPTH is a global network of research centres that conduct longitudinal health and demographic evaluation of populations in low- and middle-income countries (LMICs). INDEPTH aims to strengthen global capacity for Health and Demographic Surveillance Systems (HDSSs), and to mount multi-site research to guide health priorities and policies in LMICs, based on up-to-date scientific evidence. The data collected by the INDEPTH Network members constitute a valuable resource of population and health data for LMIC countries. This repository aims to make well documented anonymised longitudinal microdata from these Centres available to data users.
Intrepid Bioinformatics serves as a community for genetic researchers and scientific programmers who need to achieve meaningful use of their genetic research data – but can’t spend tremendous amounts of time or money in the process. The Intrepid Bioinformatics system automates time consuming manual processes, shortens workflow, and eliminates the threat of lost data in a faster, cheaper, and better environment than existing solutions. The system also provides the functionality and community features needed to analyze the large volumes of Next Generation Sequencing and Single Nucleotide Polymorphism data, which is generated for a wide range of purposes from disease tracking and animal breeding to medical diagnosis and treatment.
The National Cancer Data Base (NCDB), a joint program of the Commission on Cancer (CoC) of the American College of Surgeons (ACoS) and the American Cancer Society (ACS), is a nationwide oncology outcomes database for more than 1,500 Commission-accredited cancer programs in the United States and Puerto Rico. Some 70 percent of all newly diagnosed cases of cancer in the United States are captured at the institutional level and reported to the NCDB. The NCDB, begun in 1989, now contains approximately 29 million records from hospital cancer registries across the United States. Data on all types of cancer are tracked and analyzed. These data are used to explore trends in cancer care, to create regional and state benchmarks for participating hospitals, and to serve as the basis for quality improvement.
The data in the U of M’s Clinical Data Repository comes from the electronic health records (EHRs) of more than 2 million patients seen at 8 hospitals and more than 40 clinics. For each patient, data is available regarding the patient's demographics (age, gender, language, etc.), medical history, problem list, allergies, immunizations, outpatient vitals, diagnoses, procedures, medications, lab tests, visit locations, providers, provider specialties, and more.
BindingDB is a public, web-accessible database of measured binding affinities, focusing chiefly on the interactions of proteins considered to be candidate drug-targets with ligands that are small, drug-like molecules. BindingDB supports medicinal chemistry and drug discovery via literature awareness and development of structure-activity relations (SAR and QSAR); validation of computational chemistry and molecular modeling approaches such as docking, scoring and free energy methods; chemical biology and chemical genomics; and basic studies of the physical chemistry of molecular recognition. BindingDB also includes a small collection of host-guest binding data of interest to chemists studying supramolecular systems. The data collection derives from a variety of measurement techniques, including enzyme inhibition and kinetics, isothermal titration calorimetry, NMR, and radioligand and competition assays. BindingDB includes data extracted from the literature by the BindingDB project, selected PubChem confirmatory BioAssays, and ChEMBL entries for which a well defined protein target ("TARGET_TYPE='PROTEIN'") is provided. Data extracted by BindingDB typically includes more details regarding experimental conditions, etc
The FREEBIRD website aims to facilitate data sharing in the area of injury and emergency research in a timely and responsible manner. It has been launched by providing open access to anonymised data on over 30,000 injured patients (the CRASH-1 and CRASH-2 trials).
The Comprehensive Epidemiologic Data Resource (CEDR) is the Department of Energy's (DOE) electronic database comprised of health studies of DOE contract workers and environmental studies of areas surrounding DOE facilities. DOE recognizes the benefits of data sharing and supports the public's right to know about worker and community health risks. CEDR provides independent researchers and the public with access to de-identified data collected since the Department's early production years. Current CEDR holdings include more than 80 studies of over 1 million workers at 31 DOE sites. Access to these data is at no cost to the user. Most of CEDR's holdings are derived from epidemiologic studies of DOE workers at many large nuclear weapons plants, such as Hanford, Los Alamos, the Oak Ridge reservation, Savannah River Site, and Rocky Flats. These studies primarily use death certificate information to identify excess deaths and patterns of disease among workers to determine what factors contribute to the risk of developing cancer and other illnesses. In addition, many of these studies have radiation exposure measurements on individual workers. CEDR is supported by the Oak Ridge Institute for Science and Education (ORISE) in Oak Ridge, Tennessee. Now a mature system in routine operational use, CEDR's modern internet-based systems respond to thousands of requests to its web server daily. With about 1,500 Internet sites pointing to CEDR's web site, CEDR is a national user facility, with a large audience for data that are not available elsewhere.
METLIN represents the largest MS/MS collection of data with the database generated at multiple collision energies and in positive and negative ionization modes. The data is generated on multiple instrument types including SCIEX, Agilent, Bruker and Waters QTOF mass spectrometers.
a collection of data at Agency for Healthcare Research and Quality (AHRQ) supporting research that helps people make more informed decisions and improves the quality of health care services. The portal contains U.S.Health Information Knowledgebase (USHIK) and Systematic Review Data Repository (SRDR) and other sources concerning cost, quality, accesibility and evaluation of healthcare and medical insurance.
The GHDx is our user-friendly and searchable data catalog for global health, demographic, and other health-related datasets. It provides detailed information about datasets ranging from censuses and surveys to health records and vital statistics, globally. It also serves as a platform for data owners to share their data with the public. The GDB Compare visualization, which allows the user to see rate of change in disease incidence, globally or by country, by age or across all ages, is especially powerful as a tool. Be sure to try adding a bottom chart, like the map, to augment the treemap that loads by default in the top chart.
The Health and Medical Care Archive (HMCA) is the data archive of the Robert Wood Johnson Foundation (RWJF), the largest philanthropy devoted exclusively to health and health care in the United States. Operated by the Inter-university Consortium for Political and Social Research (ICPSR) at the University of Michigan, HMCA preserves and disseminates data collected by selected research projects funded by the Foundation and facilitates secondary analyses of the data. Our goal is to increase understanding of health and health care in the United States through secondary analysis of RWJF-supported data collections
MGI is the international database resource for the laboratory mouse, providing integrated genetic, genomic, and biological data to facilitate the study of human health and disease. The projects contributing to this resource are: Mouse Genome Database (MGD) Project, Gene Expression Database (GXD) Project, Mouse Tumor Biology (MTB) Database Project, Gene Ontology (GO) Project at MGI, MouseMine Project, MouseCyc Project at MGI
The Brain Biodiversity Bank refers to the repository of images of and information about brain specimens contained in the collections associated with the National Museum of Health and Medicine at the Armed Forces Institute of Pathology in Washington, DC. These collections include, besides the Michigan State University Collection, the Welker Collection from the University of Wisconsin, the Yakovlev-Haleem Collection from Harvard University, the Meyer Collection from the Johns Hopkins University, and the Huber-Crosby and Crosby-Lauer Collections from the University of Michigan and the C.U. Ariëns Kappers brain collection from Amsterdam Netherlands.Introducing online atlases of the brains of humans, sheep, dolphins, and other animals. A world resource for illustrations of whole brains and stained sections from a great variety of mammals
OMIM is a comprehensive, authoritative compendium of human genes and genetic phenotypes that is freely available and updated daily. OMIM is authored and edited at the McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, under the direction of Dr. Ada Hamosh. Its official home is
DNASU is a central repository for plasmid clones and collections. Currently we store and distribute over 200,000 plasmids including 75,000 human and mouse plasmids, full genome collections, the protein expression plasmids from the Protein Structure Initiative as the PSI: Biology Material Repository (PSI : Biology-MR), and both small and large collections from individual researchers. We are also a founding member and distributor of the ORFeome Collaboration plasmid collection.
TOXMAP® is a Geographic Information System (GIS) from the Division of Specialized Information Services of the US National Library of Medicine® (NLM) that uses maps of the United States and Canada to help users visually explore data primarily from the US Environmental Protection Agency (EPA)'s Toxics Release Inventory (TRI) and Superfund Program.
Reactome is a manually curated, peer-reviewed pathway database, annotated by expert biologists and cross-referenced to bioinformatics databases. Its aim is to share information in the visual representations of biological pathways in a computationally accessible format. Pathway annotations are authored by expert biologists, in collaboration with Reactome editorial staff and cross-referenced to many bioinformatics databases. These include NCBI Gene, Ensembl and UniProt databases, the UCSC and HapMap Genome Browsers, the KEGG Compound and ChEBI small molecule databases, PubMed, and Gene Ontology.
The Cancer Imaging Archive is a freely accessible repository containing medical images and supporting data from cancer patients. Images are stored in DICOM file format. The images are organized as “Collections”, typically patients related by a common disease (e.g. lung cancer), image modality (MRI, CT, etc) or research focus. Search functionality allows users to query across Collections or within them to filter out only the data they are most interested in.
dbEST is a division of GenBank that contains sequence data and other information on "single-pass" cDNA sequences, or "Expressed Sequence Tags", from a number of organisms. Expressed Sequence Tags (ESTs) are short (usually about 300-500 bp), single-pass sequence reads from mRNA (cDNA). Typically they are produced in large batches. They represent a snapshot of genes expressed in a given tissue and/or at a given developmental stage. They are tags (some coding, others not) of expression for a given cDNA library. Most EST projects develop large numbers of sequences. These are commonly submitted to GenBank and dbEST as batches of dozens to thousands of entries, with a great deal of redundancy in the citation, submitter and library information. To improve the efficiency of the submission process for this type of data, we have designed a special streamlined submission process and data format. dbEST also includes sequences that are longer than the traditional ESTs, or are produced as single sequences or in small batches. Among these sequences are products of differential display experiments and RACE experiments. The thing that these sequences have in common with traditional ESTs, regardless of length, quality, or quantity, is that there is little information that can be annotated in the record. If a sequence is later characterized and annotated with biological features such as a coding region, 5'UTR, or 3'UTR, it should be submitted through the regular GenBank submissions procedure (via BankIt or Sequin), even if part of the sequence is already in dbEST. dbEST is reserved for single-pass reads. Assembled sequences should not be submitted to dbEST. GenBank will accept assembled EST submissions for the forthcoming TSA (Transcriptome Shotgun Assembly) division. The individual reads which make up the assembly should be submitted to dbEST, the Trace archive or the Short Read Archive (SRA) prior to the submission of the assemblies.