Reset all


Content Types


AID systems



Data access

Data access restrictions

Database access

Database access restrictions

Database licenses

Data licenses

Data upload

Data upload restrictions

Enhanced publication

Institution responsibility type

Institution type


Metadata standards

PID systems

Provider types

Quality management

Repository languages



Repository types


  • * at the end of a keyword allows wildcard searches
  • " quotes can be used for searching phrases
  • + represents an AND search (default)
  • | represents an OR search
  • - represents a NOT operation
  • ( and ) implies priority
  • ~N after a word specifies the desired edit distance (fuzziness)
  • ~N after a phrase specifies the desired slop amount
Found 268 result(s)
dbEST is a division of GenBank that contains sequence data and other information on "single-pass" cDNA sequences, or "Expressed Sequence Tags", from a number of organisms. Expressed Sequence Tags (ESTs) are short (usually about 300-500 bp), single-pass sequence reads from mRNA (cDNA). Typically they are produced in large batches. They represent a snapshot of genes expressed in a given tissue and/or at a given developmental stage. They are tags (some coding, others not) of expression for a given cDNA library. Most EST projects develop large numbers of sequences. These are commonly submitted to GenBank and dbEST as batches of dozens to thousands of entries, with a great deal of redundancy in the citation, submitter and library information. To improve the efficiency of the submission process for this type of data, we have designed a special streamlined submission process and data format. dbEST also includes sequences that are longer than the traditional ESTs, or are produced as single sequences or in small batches. Among these sequences are products of differential display experiments and RACE experiments. The thing that these sequences have in common with traditional ESTs, regardless of length, quality, or quantity, is that there is little information that can be annotated in the record. If a sequence is later characterized and annotated with biological features such as a coding region, 5'UTR, or 3'UTR, it should be submitted through the regular GenBank submissions procedure (via BankIt or Sequin), even if part of the sequence is already in dbEST. dbEST is reserved for single-pass reads. Assembled sequences should not be submitted to dbEST. GenBank will accept assembled EST submissions for the forthcoming TSA (Transcriptome Shotgun Assembly) division. The individual reads which make up the assembly should be submitted to dbEST, the Trace archive or the Short Read Archive (SRA) prior to the submission of the assemblies.
The Gene database provides detailed information for known and predicted genes defined by nucleotide sequence or map position. Gene supplies gene-specific connections in the nexus of map, sequence, expression, structure, function, citation, and homology data. Unique identifiers are assigned to genes with defining sequences, genes with known map positions, and genes inferred from phenotypic information. These gene identifiers are used throughout NCBI's databases and tracked through updates of annotation. Gene includes genomes represented by NCBI Reference Sequences (or RefSeqs) and is integrated for indexing and query and retrieval from NCBI's Entrez and E-Utilities systems.
The IMSR is a searchable online database of mouse strains, stocks, and mutant ES cell lines available worldwide, including inbred, mutant, and genetically engineered strains. The goal of the IMSR is to assist the international scientific community in locating and obtaining mouse resources for research. Note that the data content found in the IMSR is as supplied by strain repository holders. For each strain or cell line listed in the IMSR, users can obtain information about: Where that resource is available (Repository Site); What state(s) the resource is available as (e.g. live, cryopreserved embryo or germplasm, ES cells); Links to descriptive information about a strain or ES cell line; Links to mutant alleles carried by a strain or ES cell line; Links for ordering a strain or ES cell line from a Repository; Links for contacting the Repository to send a query
With ARS - Antimicrobial Resistance Surveillance in Germany - the infrastructure for a nationwide surveillance of antimicrobial resistance has been established, which covers both the inpatient medical care and the ambulatory care sector. This is intended to reliable data on the epidemiology of antimicrobial resistance in Germany and differential statements provided by structural features of the health care and by region are possible. ARS is designed as a laboratory-based surveillance system for continuous collection of resistance data from routine for the full range of clinically relevant bacterial pathogens. Project participants and thus data suppliers are laboratories that analyze samples of medical facilities and doctors' offices microbiologically.
OMIM is a comprehensive, authoritative compendium of human genes and genetic phenotypes that is freely available and updated daily. OMIM is authored and edited at the McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, under the direction of Dr. Ada Hamosh. Its official home is
Swiss Institute of Bioinformatics (SIB) coordinates research and education in bioinformatics throughout Switzerland and provides bioinformatics services to the national and international research community. ExPASy gives access to numerous repositories and databases of SIB. For example: array map, MetaNetX, SWISS-MODEL and World-2DPAGE, and many others see a list here
caNanoLab is a data sharing portal designed to facilitate information sharing in the biomedical nanotechnology research community to expedite and validate the use of nanotechnology in biomedicine. caNanoLab provides support for the annotation of nanomaterials with characterizations resulting from physico-chemical and in vitro assays and the sharing of these characterizations and associated nanotechnology protocols in a secure fashion.
Indian Genetic Disease Database (IGDD) is an initiative of CSIR Indian Institute of Chemical Biology. It is supported by Council of Scientific and Industrial Research (CSIR) and Department of Biotechnology (DBT) of India. The Indian people represent one-sixth of the world population and consists of a ethnically, geographically, and genetically diverse population. In some communities the ratio of genetic disorder is relatively high due to consanguineous marriage practiced in the community. This database has been created to keep track of mutations in the causal genes for genetic diseases common in India and help the physicians, geneticists, and other professionals retrieve and use the information for the benefit of the public. The database includes scientific information about these genetic diseases and disabilities, but also statistical information about these diseases in today's society. Data is categorized by body part affected and then by title of the disease.
This resource allows users to search for and compare influenza virus genomes and gene sequences taken from GenBank. It also provides a virus sequence annotation tool and links to other influenza resources: NIAID project, JCVI Flu, Influenza research database, CDC Flu, Vaccine Selection and WHO Flu.
The GHDx is our user-friendly and searchable data catalog for global health, demographic, and other health-related datasets. It provides detailed information about datasets ranging from censuses and surveys to health records and vital statistics, globally. It also serves as a platform for data owners to share their data with the public. The GDB Compare visualization, which allows the user to see rate of change in disease incidence, globally or by country, by age or across all ages, is especially powerful as a tool. Be sure to try adding a bottom chart, like the map, to augment the treemap that loads by default in the top chart.
The Health and Medical Care Archive (HMCA) is the data archive of the Robert Wood Johnson Foundation (RWJF), the largest philanthropy devoted exclusively to health and health care in the United States. Operated by the Inter-university Consortium for Political and Social Research (ICPSR) at the University of Michigan, HMCA preserves and disseminates data collected by selected research projects funded by the Foundation and facilitates secondary analyses of the data. Our goal is to increase understanding of health and health care in the United States through secondary analysis of RWJF-supported data collections
ModelDB is a curated database of published models in the broad domain of computational neuroscience. It addresses the need for access to such models in order to evaluate their validity and extend their use. It can handle computational models expressed in any textual form, including procedural or declarative languages (e.g. C++, XML dialects) and source code written for any simulation environment. The model source code doesn't even have to reside inside ModelDB; it just has to be available from some publicly accessible online repository or WWW site.
The project brings together national key players providing environmentally related biological data and services to develop the ‘German Federation for Biological Data' (GFBio). The overall goal is to provide a sustainable, service oriented, national data infrastructure facilitating data sharing and stimulating data intensive science in the fields of biological and environmental research.
Groundbreaking biomedical research requires access to cutting edge scientific resources; however such resources are often invisible beyond the laboratories or universities where they were developed. eagle-i is a discovery platform that helps biomedical scientists find previously invisible, but highly valuable, resources.
The tree of life links all biodiversity through a shared evolutionary history. This project will produce the first online, comprehensive first-draft tree of all 1.8 million named species, accessible to both the public and scientific communities. Assembly of the tree will incorporate previously-published results, with strong collaborations between computational and empirical biologists to develop, test and improve methods of data synthesis. This initial tree of life will not be static; instead, we will develop tools for scientists to update and revise the tree as new data come in. Early release of the tree and tools will motivate data sharing and facilitate ongoing synthesis of knowledge.
The Cognitive Function and Ageing Studies (CFAS) are population based studies of individuals aged 65 years and over living in the community, including institutions, which is the only large multi-centred population-based study in the UK that has reached sufficient maturity. There are three main studies within the CFAS group. MRC CFAS, the original study began in 1989, with three of its sites providing a parent subset for the comparison two decades later with CFAS II (2008 onwards). Subsequently another CFAS study, CFAS Wales began in 2011.
DEG hosts records of currently available essential genomic elements, such as protein-coding genes and non-coding RNAs, among bacteria, archaea and eukaryotes. Essential genes in a bacterium constitute a minimal genome, forming a set of functional modules, which play key roles in the emerging field, synthetic biology.
The Ontology Lookup Service (OLS) is a repository for biomedical ontologies that aims to provide a single point of access to the latest ontology versions. The user can browse the ontologies through the website as well as programmatically via the OLS API. The OLS provides a web service interface to query multiple ontologies from a single location with a unified output format.The OLS can integrate any ontology available in the Open Biomedical Ontology (OBO) format. The OLS is an open source project hosted on Google Code.
The Human Genetic Variation Database (HGVD) aims to provide a central resource to archive and display Japanese genetic variation and association between the variation and transcription level of genes. The database currently contains genetic variations determined by exome sequencing of 1,208 individuals and genotyping data of common variations obtained from a cohort of 3,248 individuals.
The German Neuroinformatics Node's data infrastructure (GIN) services provide a platform for comprehensive and reproducible management and sharing of neuroscience data. Building on well established versioning technology, GIN offers the power of a web based repository management service combined with a distributed file storage. The service addresses the range of research data workflows starting from data analysis on the local workstation to remote collaboration and data publication.
AmoebaDB belongs to the EuPathDB family of databases and is an integrated genomic and functional genomic database for Entamoeba and Acanthamoeba parasites. In its first iteration (released in early 2010), AmoebaDB contains the genomes of three Entamoeba species (see below). AmoebaDB integrates whole genome sequence and annotation and will rapidly expand to include experimental data and environmental isolate sequences provided by community researchers . The database includes supplemental bioinformatics analyses and a web interface for data-mining.
The RAMEDIS system is a platform independent, web-based information system for rare metabolic diseases based on filed case reports. It was developed in close cooperation with clinical partners to allow them to collect information on rare metabolic diseases with extensive details, e.g. about occurring symptoms, laboratory findings, therapy and molecular data.
The taxonomically broad EST database TBestDB serves as a repository for EST data from a wide range of eukaryotes, many of which have previously not been thoroughly investigated. Most of the data contained in TBestDB has been generated by the labs of the Protist EST Program located in six universities across Canada. PEP is a large interdisciplinaryresearch project, involving six Canadian universities. PEP aims at the exploration of the diversity of eukaryotic genomes in a systematic, comprehensive and integrated way. The focus is on unicellular microbial eukaryotes, known as protists. Protistan eukaryotes comprise more than a dozen major lineages that, together, encompass more evolutionary, ecological and probably biochemical diversity than the multicellular kingdoms of animals, plants and fungi combined. PEP is a unique endeavor in that it is the first phylogenetically-broad genomic investigation of protists.