Reset all


Content Types


AID systems


Data access

Data access restrictions

Database access

Database access restrictions

Database licenses

Data licenses

Data upload

Data upload restrictions

Enhanced publication

Institution responsibility type

Institution type


Metadata standards

PID systems

Provider types

Quality management

Repository languages



Repository types


  • * at the end of a keyword allows wildcard searches
  • " quotes can be used for searching phrases
  • + represents an AND search (default)
  • | represents an OR search
  • - represents a NOT operation
  • ( and ) implies priority
  • ~N after a word specifies the desired edit distance (fuzziness)
  • ~N after a phrase specifies the desired slop amount
Found 173 result(s)
The Barcode of Life Data Systems (BOLD) provides DNA barcode data. BOLD's online workbench supports data validation, annotation, and publication for specimen, distributional, and molecular data. The platform consists of four main modules: a data portal, a database of barcode clusters, an educational portal, and a data collection workbench. BOLD is the go-to site for DNA-based identification. As the central informatics platform for DNA barcoding, BOLD plays a crucial role in assimilating and organizing data gathered by the international barcode research community. Two iBOL (International Barcode of Life) Working Groups are supporting the ongoing development of BOLD.
The repository facilitates computation of a wide range of biosystem data. It also connects biosystem data with associated literature throughout the Entrez system.
I2D (Interologous Interaction Database) is an on-line database of known and predicted mammalian and eukaryotic protein-protein interactions. It has been built by mapping high-throughput (HTP) data between species. Thus, until experimentally verified, these interactions should be considered "predictions". It remains one of the most comprehensive sources of known and predicted eukaryotic PPI. I2D includes data for S. cerevisiae, C. elegans, D. melonogaster, R. norvegicus, M. musculus, and H. sapiens.
The Sequence Read Archive stores the raw sequencing data from such sequencing platforms as the Roche 454 GS System, the Illumina Genome Analyzer, the Applied Biosystems SOLiD System, the Helicos Heliscope, and the Complete Genomics. It archives the sequencing data associated with RNA-Seq, ChIP-Seq, Genomic and Transcriptomic assemblies, and 16S ribosomal RNA data.
GenBankĀ® is a comprehensive database that contains publicly available nucleotide sequences for almost 260 000 formally described species. These sequences are obtained primarily through submissions from individual laboratories and batch submissions from large-scale sequencing projects, including whole-genome shotgun (WGS) and environmental sampling projects. Most submissions are made using the web-based BankIt or standalone Sequin programs, and GenBank staff assigns accession numbers upon data receipt. Daily data exchange with the European Nucleotide Archive (ENA) and the DNA Data Bank of Japan (DDBJ) ensures worldwide coverage. GenBank is accessible through the NCBI Entrez retrieval system, which integrates data from the major DNA and protein sequence databases along with taxonomy, genome, mapping, protein structure and domain information, and the biomedical journal literature via PubMed. BLAST provides sequence similarity searches of GenBank and other sequence databases. Complete bimonthly releases and daily updates of the GenBank database are available by FTP.
The Drosophila Genetic Reference Panel (DGRP) is a population consisting of more than 200 inbred lines derived from the Raleigh, USA population. The DGRP is a living library of common polymorphisms affecting complex traits, and a community resource for whole genome association mapping of quantitative trait loci.
Stemformatics is a collaboration between the stem cell and bioinformatics community. We were motivated by the plethora of exciting cell models in the public and private domains, and the realisation that for many biologists these were mostly inaccessible. We wanted a fast way to find and visualise interesting genes in these exemplar stem cell datasets. We'd like you to explore. You'll find data from leading stem cell laboratories in a format that is easy to search, easy to visualise and easy to export.
BioGPS is a gene portal built with two guiding principles in mind -- customizability and extensibility. It is a complete resource for learning about gene and protein function. A free extensible and customizable gene annotation portal, a complete resource for learning about gene and protein function.
The Conserved Domain Database is a resource for the annotation of functional units in proteins. Its collection of domain models includes a set curated by NCBI, which utilizes 3D structure to provide insights into sequence/structure/function relationships
A curated database of mutations and polymorphisms associated with Lafora Progressive Myoclonus Epilepsy. The Lafora progressive myoclonus epilepsy mutation and polymorphism database is a collection of hand curated mutation and polymorphism data for the EPM2A and EPM2B (NHLRC1) from publicly available literature: databases and unpublished data. The database is continuously updated with information from in-house experimental data as well as data from published research studies.
TPA is a database that contains sequences built from the existing primary sequence data in GenBank. TPA records are retrieved through the Nucleotide Database and feature information on the sequence, how it was cataloged, and proper way to cite the sequence information.
InnateDB is a publicly available database of the genes, proteins, experimentally-verified interactions and signaling pathways involved in the innate immune response of humans, mice and bovines to microbial infection. The database captures an improved coverage of the innate immunity interactome by integrating known interactions and pathways from major public databases together with manually-curated data into a centralised resource. The database can be mined as a knowledgebase or used with our integrated bioinformatics and visualization tools for the systems level analysis of the innate immune response.
DOMINO is an open-access database comprising more than 3900 annotated experiments describing interactions mediated by protein-interaction domains. The curation effort aims at covering the interactions mediated by the following domains (SH3, SH2, 14-3-3, PDZ, PTB, WW, EVH, VHS, FHA, EH, FF, BRCT, Bromo, Chromo, GYF). The interactions deposited in DOMINO are annotated according to the PSI MI standard and can be easily analyzed in the context of the global protein interaction network as downloaded from major interaction databases like MINT, INTACT, DIP, MIPS/MPACT. DOMINO can be searched with a versatile search tool and the interaction networks can be visualized with a convenient graphic display applet that explicitly identifies the domains/sites involved in the interactions.
The Ensembl genome annotation system, developed jointly by the EBI and the Wellcome Trust Sanger Institute, has been used for the annotation, analysis and display of vertebrate genomes since 2000. Since 2009, the Ensembl site has been complemented by the creation of five new sites, for bacteria, protists, fungi, plants and invertebrate metazoa, enabling users to use a single collection of (interactive and programatic) interfaces for accessing and comparing genome-scale data from species of scientific interest from across the taxonomy. In each domain, we aim to bring the integrative power of Ensembl tools for comparative analysis, data mining and visualisation across genomes of scientific interest, working in collaboration with scientific communities to improve and deepen genome annotation and interpretation.
The Protein database is a collection of sequences from several sources, including translations from annotated coding regions in GenBank, RefSeq and TPA, as well as records from SwissProt, PIR, PRF, and PDB. Protein sequences are the fundamental determinants of biological structure and function.
The NCBI Nucleotide database collects sequences from such sources as GenBank, RefSeq, TPA, and PDB. Sequences collected relate to genome, gene, and transcript sequence data, and provide a foundation for research related to the biomedical field.
The GSS database collects unannotated, short, single-read, primary genomic sequences from GenBank and contains nucleic acid sequences. These sequences include random survey sequences, clone-end sequences, and exon-trapped sequences.
InterPro collects information about protein sequence analysis and classification, providing access to a database of predictive protein signatures used for the classification and automatic annotation of proteins and genomes. Sequences in InterPro are classified at superfamily, family, and subfamily. InterPro predicts the occurrence of functional domains, repeats, and important sites, and adds in-depth annotation such as GO terms to the protein signatures.
The Yeast Resource Center provides access to data about mass spectrometry, yeast two-hybrid arrays, deconvolution florescence microscopy, protein structure prediction and computational biology. These services are provided to further the goal of a complete understanding of the chemical interactions required for the maintenance and faithful reproduction of a living cell. The observation that the fundamental biological processes of yeast are conserved among all eukaryotes ensures that this knowledge will shape and advance our understanding of living systems.
The BioCyc database collection of Pathway/Genome Databases (PGDBs) provides a reference on the genomes and metabolic pathways of thousands of sequenced organisms. BioCyc PGDBs are generated by software that predict the metabolic pathways of completely sequenced organisms, predict which genes code for missing enzymes in metabolic pathways, and predict operons. BioCyc also integrates information from other bioinformatics databases, such as protein feature and Gene Ontology information from UniProt. The BioCyc website provides a suite of software tools for database searching and visualization, for omics data analysis, and for comparative genomics and comparative pathway questions.
BsubCyc is a model-organism database for the bacterium Bacillus subtilis and is based on the updated B. subtilis 168 genome sequence and annotation published by Barbe et al. in 2009. Gene function annotations are being updated when new literature is available.
INTEGRALL is a web-based platform dedicated to compile information on integrons and designed to organize all the data available for these genetic structures. INTEGRALL provides a public genetic repository for sequence data and nomenclature and offers to scientists an easy and interactive access to integron's DNA sequences, their molecular arrangements as well as their genetic contexts.
This site provides users with access to up-to-date information about mutations at the phenylalanine hydroxylase locus. Here you will have access to the content of the database in the form of electronic reports. The database is updated manually off-line by the curators to assure that no erroneous information is appended. The curators now also accept data electronically via the submission form.