  • * at the end of a keyword allows wildcard searches
  • " quotes can be used for searching phrases
  • + represents an AND search (default)
  • | represents an OR search
  • - represents a NOT operation
  • ( and ) group terms to set precedence
  • ~N after a word specifies the desired edit distance (fuzziness)
  • ~N after a phrase specifies the desired slop amount
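
The operators above can be combined into a single query. The following is a minimal sketch, not part of the site itself: the helper names (phrase, prefix, fuzzy, sloppy, any_of, exclude) are hypothetical, and it assumes the search box accepts space-separated terms as an AND search, as the list states is the default.

```python
# Hypothetical helpers that compose query strings using the operators listed above.
# None of these names come from the site; they only illustrate the syntax.

def phrase(text: str) -> str:
    """Wrap text in double quotes so it is searched as a phrase."""
    return f'"{text}"'

def prefix(stem: str) -> str:
    """Append * so the keyword matches any word starting with the stem."""
    return f"{stem}*"

def fuzzy(word: str, distance: int) -> str:
    """Append ~N to a word to allow up to N edits (fuzziness)."""
    return f"{word}~{distance}"

def sloppy(text: str, amount: int) -> str:
    """Append ~N to a quoted phrase to set the slop amount."""
    return f"{phrase(text)}~{amount}"

def any_of(*terms: str) -> str:
    """Join terms with | (OR) and parenthesize them so they bind together."""
    return "(" + " | ".join(terms) + ")"

def exclude(term: str) -> str:
    """Prefix a term with - (NOT)."""
    return f"-{term}"

# Terms separated by spaces are ANDed, assuming the documented default.
query = " ".join([
    prefix("genom"),
    any_of("bacteria", "archaea"),
    exclude(phrase("gene expression")),
    fuzzy("Ensembl", 1),
])
print(query)  # genom* (bacteria | archaea) -"gene expression" Ensembl~1
```

Read left to right, this query asks for records containing a word starting with "genom", at least one of "bacteria" or "archaea", no occurrence of the phrase "gene expression", and a word within one edit of "Ensembl".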
Found 24 result(s)
The Entrez Protein Clusters database contains annotation information, publications, structures and analysis tools for related protein sequences encoded by complete genomes. The data available in the Protein Clusters Database is generated from prokaryotic genomic studies and is intended to assist researchers studying micro-organism evolution as well as other biological sciences. Available genomes include plants and viruses as well as organelles and microbial genomes.
The Gene database provides detailed information for known and predicted genes defined by nucleotide sequence or map position. Gene supplies gene-specific connections in the nexus of map, sequence, expression, structure, function, citation, and homology data. Unique identifiers are assigned to genes with defining sequences, genes with known map positions, and genes inferred from phenotypic information. These gene identifiers are used throughout NCBI's databases and tracked through updates of annotation. Gene includes genomes represented by NCBI Reference Sequences (or RefSeqs) and is integrated for indexing, query, and retrieval in NCBI's Entrez and E-Utilities systems.
dictyBase is an integrated genetic and literature database that contains published Dictyostelium discoideum literature, genes, expressed sequence tags (ESTs), as well as the chromosomal and mitochondrial genome sequences. Direct access to the genome browser, a BLAST search tool, the Dictyostelium Stock Center, research tools, colleague databases, and much more is just a mouse click away. dictyBase is a genome portal for the Amoebozoa. dictyBase is funded by a grant from the National Institute of General Medical Sciences.
MycoCosm is the DOE JGI’s web-based fungal genomics resource, which integrates fungal genomics data and analytical tools for fungal biologists. It provides navigation through sequenced genomes, genome analysis in the context of comparative genomics, and a genome-centric view. MycoCosm promotes user community participation in data submission, annotation and analysis.
The Genome database contains annotations and analysis of eukaryotic and prokaryotic genomes, as well as tools that allow users to compare genomes and gene sequences from humans, microbes, plants, viruses and organelles. Users can browse by organism, and view genome maps and protein clusters.
We developed a method, ChIP-sequencing (ChIP-seq), combining chromatin immunoprecipitation (ChIP) and massively parallel sequencing to identify mammalian DNA sequences bound by transcription factors in vivo. We used ChIP-seq to map STAT1 targets in interferon-gamma (IFN-gamma)-stimulated and unstimulated human HeLa S3 cells, and compared the method's performance to ChIP-PCR and to ChIP-chip for four chromosomes. Data are available for chromatin immunoprecipitation of both transcription factors and histone modifications. Sequence files and the associated probability files are also provided.
The cisRED database holds conserved sequence motifs identified by genome-scale motif discovery, similarity, clustering, co-occurrence and coexpression calculations. Sequence inputs include low-coverage genome sequence data and ENCODE data. A Nucleic Acids Research article describes the system architecture.
Greengenes is an Earth Sciences website that assists clinical and environmental microbiologists from around the globe in classifying microorganisms from their local environments. A 16S rRNA gene database addresses limitations of public repositories by providing chimera screening, standard alignment, and taxonomic classification using multiple published taxonomies.
This site provides access to complete, annotated genomes from bacteria and archaea (present in the European Nucleotide Archive) through the Ensembl graphical user interface (genome browser). Ensembl Bacteria contains genomes from annotated INSDC records that are loaded into Ensembl multi-species databases, using the INSDC annotation import pipeline.
PhytoPath is a new bioinformatics resource that integrates genome-scale data from important plant pathogen species with literature-curated information about the phenotypes of host infection. Using the Ensembl Genomes browser, it provides access to complete genome assemblies and gene models of priority crop and model fungal, oomycete and bacterial phytopathogens. PhytoPath also links genes to disease progression using data from the curated PHI-base resource. The PhytoPath portal is a joint project bringing together Ensembl Genomes with PHI-base, a community-curated resource describing the role of genes in pathogenic infection. PhytoPath provides access to genomic and phenotypic data from fungal and oomycete plant pathogens, and has enabled a considerable increase in the coverage of phytopathogen genomes in Ensembl Fungi and Ensembl Protists. PhytoPath also provides enhanced searching of the PHI-base resource as well as the fungi and protists in Ensembl Genomes.
ASAP (a systematic annotation package for community analysis of genomes) is a relational database and web interface developed to store, update and distribute genome sequence data and gene expression data collected by or in collaboration with researchers at the University of Wisconsin-Madison. ASAP was designed to facilitate ongoing community annotation of genomes and to grow with genome projects as they move from the preliminary data stage through post-sequencing functional analysis. The ASAP database includes multiple genome sequences at various stages of analysis, and gene expression data from preliminary experiments.
CryptoDB is an integrated genomic and functional genomic database for the parasite Cryptosporidium and other related genera. CryptoDB integrates whole genome sequence and annotation along with experimental data and environmental isolate sequences provided by community researchers. The database includes supplemental bioinformatics analyses and a web interface for data-mining.
The UniPROBE (Universal PBM Resource for Oligonucleotide Binding Evaluation) database hosts data generated by universal protein binding microarray (PBM) technology on the in vitro DNA binding specificities of proteins. This initial release of the UniPROBE database provides a centralized resource for accessing comprehensive data on the preferences of proteins for all possible sequence variants ('words') of length k ('k-mers'), as well as position weight matrix (PWM) and graphical sequence logo representations of the k-mer data. In total, the database currently hosts DNA binding data for 406 nonredundant proteins from a diverse collection of organisms, including the prokaryote Vibrio harveyi, the eukaryotic malarial parasite Plasmodium falciparum, the parasitic Apicomplexan Cryptosporidium parvum, the yeast Saccharomyces cerevisiae, the worm Caenorhabditis elegans, mouse, and human. The database's web tools include a text-based search, a function for assessing motif similarity between user-entered data and database PWMs, and a function for locating putative binding sites along user-entered nucleotide sequences.
GallusReactome is a free, online, open-source, curated resource of core pathways and reactions in chicken biology. Information is authored by expert biological researchers, maintained by the GallusReactome editorial staff and cross-referenced to the NCBI Entrez Gene, Ensembl and UniProt databases, the KEGG and ChEBI small molecule databases, PubMed, and the Gene Ontology (GO).
The Mouse Atlas of Gene Expression is a quantitative and comprehensive atlas of gene expression in mouse development. Gene expression levels from 198 tissue samples were measured using 202 Serial Analysis of Gene Expression (SAGE) libraries. The emphasis was on mouse development, with samples taken at different developmental stages.
CEEHRC represents a multi-stage funding commitment by the Canadian Institutes of Health Research (CIHR) and multiple Canadian and international partners. The overall aim is to position Canada at the forefront of international efforts to translate new discoveries in the field of epigenetics into improved human health. The two sites will focus on sequencing human reference epigenomes and developing new technologies and protocols; they will also serve as platforms for other CEEHRC funding initiatives, such as catalyst and team grants. The complementary reference epigenome mapping efforts of the two sites will focus on a range of common human diseases. The Vancouver group will focus on the role of epigenetics in the development of cancer, including lymphoma and cancers of the ovary, colon, breast, and thyroid. The Montreal team will focus on autoimmune / inflammatory, cardio-metabolic, and neuropsychiatric diseases, using studies of identical twins as well as animal models of human disease.
GenBank® is a comprehensive database that contains publicly available nucleotide sequences for almost 260 000 formally described species. These sequences are obtained primarily through submissions from individual laboratories and batch submissions from large-scale sequencing projects, including whole-genome shotgun (WGS) and environmental sampling projects. Most submissions are made using the web-based BankIt or standalone Sequin programs, and GenBank staff assigns accession numbers upon data receipt. Daily data exchange with the European Nucleotide Archive (ENA) and the DNA Data Bank of Japan (DDBJ) ensures worldwide coverage. GenBank is accessible through the NCBI Entrez retrieval system, which integrates data from the major DNA and protein sequence databases along with taxonomy, genome, mapping, protein structure and domain information, and the biomedical journal literature via PubMed. BLAST provides sequence similarity searches of GenBank and other sequence databases. Complete bimonthly releases and daily updates of the GenBank database are available by FTP.
The Ensembl genome annotation system, developed jointly by the EBI and the Wellcome Trust Sanger Institute, has been used for the annotation, analysis and display of vertebrate genomes since 2000. Since 2009, the Ensembl site has been complemented by the creation of five new sites, for bacteria, protists, fungi, plants and invertebrate metazoa, enabling users to use a single collection of (interactive and programmatic) interfaces for accessing and comparing genome-scale data from species of scientific interest from across the taxonomy. In each domain, we aim to bring the integrative power of Ensembl tools for comparative analysis, data mining and visualisation across genomes of scientific interest, working in collaboration with scientific communities to improve and deepen genome annotation and interpretation.
The GSS database collects unannotated, short, single-read, primary genomic sequences from GenBank and contains nucleic acid sequences. These sequences include random survey sequences, clone-end sequences, and exon-trapped sequences.
The DNA Bank Network is a node of GGBN and hosts the GGBN Data Portal, Library, and Registry. The main focus of the DNA Bank Network is to enhance taxonomic, systematic, genetic, conservation and evolutionary studies by providing:
  • high-quality, long-term storage of DNA material on which molecular studies have been performed, so that results can be verified, extended, and complemented
  • complete online documentation of each sample, including the provenance of the original material, the place of voucher deposit, information about DNA quality and extraction methodology, digital images of vouchers and links to published molecular data if available
The NCBI Trace Archive is a permanent repository of DNA sequence chromatograms (traces), base calls, and quality estimates for single-pass reads from various large-scale sequencing projects. The Trace Archive serves as the repository of sequencing data from gel/capillary platforms such as Applied Biosystems ABI 3730®. The Sequence Read Archive (SRA) stores sequencing data from the next generation of sequencing platforms including Roche 454 GS System®, Illumina Genome Analyzer®, Applied Biosystems SOLiD® System, Helicos Heliscope®, and others. The Trace Assembly Archive stores pairwise alignment and multiple alignment of sequencing reads, linking basic trace data with finished genomic sequence.