Reset all


Content Types


AID systems


Data access

Data access restrictions

Database access

Database licenses

Data licenses

Data upload

Data upload restrictions

Enhanced publication

Institution responsibility type

Institution type


Metadata standards

PID systems

Provider types

Quality management

Repository languages



Repository types


  • * at the end of a keyword allows wildcard searches
  • " quotes can be used for searching phrases
  • + represents an AND search (default)
  • | represents an OR search
  • - represents a NOT operation
  • ( and ) implies priority
  • ~N after a word specifies the desired edit distance (fuzziness)
  • ~N after a phrase specifies the desired slop amount
Found 44 result(s)
The Exome Aggregation Consortium (ExAC) is a coalition of investigators seeking to aggregate and harmonize exome sequencing data from a wide variety of large-scale sequencing projects, and to make summary data available for the wider scientific community. The data set provided on this website spans 60,706 unrelated individuals sequenced as part of various disease-specific and population genetic studies.
The Global Proteome Machine (GPM) is a protein identification database. This data repository allows users to post and compare results. GPM's data is provided by contributors like The Informatics Factory, University of Michigan, and Pacific Northwestern National Laboratories. The GPM searchable databases are: GPMDB, pSYT, SNAP, MRM, PEPTIDE and HOT.
The Sequence Read Archive stores the raw sequencing data from such sequencing platforms as the Roche 454 GS System, the Illumina Genome Analyzer, the Applied Biosystems SOLiD System, the Helicos Heliscope, and the Complete Genomics. It archives the sequencing data associated with RNA-Seq, ChIP-Seq, Genomic and Transcriptomic assemblies, and 16S ribosomal RNA data.
The Conserved Domain Database is a resource for the annotation of functional units in proteins. Its collection of domain models includes a set curated by NCBI, which utilizes 3D structure to provide insights into sequence/structure/function relationships
GermOnline 4.0 is a cross-species database gateway focusing on high-throughput expression data relevant for germline development, the meiotic cell cycle and mitosis in healthy versus malignant cells. The portal provides access to the Saccharomyces Genomics Viewer (SGV) which facilitates online interpretation of complex data from experiments with high-density oligonucleotide tiling microarrays that cover the entire yeast genome.
The GeneDB project is a core part of the Sanger Institute's Pathogen Genomics activities. Its primary goals are: to provide reliable access to the latest sequence data and annotation/curation for the whole range of organisms sequenced by the Pathogen group. to develop the website and other tools to aid the community in accessing and obtaining the maximum value from these data.
The RESID Database of Protein Modifications is a comprehensive collection of annotations and structures for protein modifications including amino-terminal, carboxyl-terminal and peptide chain cross-link post-translational modifications.
Established by the HLA Informatics Group of the Anthony Nolan Research Institute, IPD provides a centralized system for studying the immune system's polymorphism in genes. The IPD maintains databases concerning the sequences of human Killer-cell Immunoglobulin-like Receptors (KIR), sequences of the major histocompatibility complex in a number of species, human platelet antigens (HPA), and tumor cell lines. Each subject has related, credible news, current research and publications, and a searchable database for highly specific, research grade genetic information.
The miRBase database is a searchable database of published miRNA sequences and annotation. Each entry in the miRBase Sequence database represents a predicted hairpin portion of a miRNA transcript (termed mir in the database), with information on the location and sequence of the mature miRNA sequence (termed miR). Both hairpin and mature sequences are available for searching and browsing, and entries can also be retrieved by name, keyword, references and annotation. All sequence and annotation data are also available for download. The miRBase Registry provides miRNA gene hunters with unique names for novel miRNA genes prior to publication of results.
It is an interactive website offering access to genome sequence data from a variety of vertebrate and invertebrate species and major model organisms, integrated with a large collection of aligned annotations. The Browser is a graphical viewer optimized to support fast interactive performance and is an open-source, web-based tool suite built on top of a MySQL database for rapid visualization, examination, and querying of the data at many levels.
dbSTS is an NCBI resource that contains sequence data for short genomic landmark sequences or Sequence Tagged Sites. STS sequences are incorporated into the STS Division of GenBank.
ALEXA is a microarray design platform for 'alternative expression analysis'. This platform facilitates the design of expression arrays for analysis of mRNA isoforms generated from a single locus by the use of alternative transcription initiation, splicing and polyadenylation sites. We use the term 'ALEXA' to describe a collection of novel genomic methods for 'alternative expression' analysis. 'Alternative expression' refers to the identification and quantification of alternative mRNA transcripts produced by alternative transcript initiation, alternative splicing and alternative polyadenylation. This website provides supplementary materials, source code and other downloads for recent publications describing our studies of alternative expression (AE). Most recently we have developed a method, 'ALEXA-Seq' and associated resources for alternative expression analysis by massively parallel RNA sequencing.
InterPro collects information about protein sequence analysis and classification, providing access to a database of predictive protein signatures used for the classification and automatic annotation of proteins and genomes. Sequences in InterPro are classified at superfamily, family, and subfamily. InterPro predicts the occurrence of functional domains, repeats, and important sites, and adds in-depth annotation such as GO terms to the protein signatures.
The NCBI Nucleotide database collects sequences from such sources as GenBank, RefSeq, TPA, and PDB. Sequences collected relate to genome, gene, and transcript sequence data, and provide a foundation for research related to the biomedical field.
The Ensembl genome annotation system, developed jointly by the EBI and the Wellcome Trust Sanger Institute, has been used for the annotation, analysis and display of vertebrate genomes since 2000. Since 2009, the Ensembl site has been complemented by the creation of five new sites, for bacteria, protists, fungi, plants and invertebrate metazoa, enabling users to use a single collection of (interactive and programatic) interfaces for accessing and comparing genome-scale data from species of scientific interest from across the taxonomy. In each domain, we aim to bring the integrative power of Ensembl tools for comparative analysis, data mining and visualisation across genomes of scientific interest, working in collaboration with scientific communities to improve and deepen genome annotation and interpretation.