  • * at the end of a keyword allows wildcard searches
  • " quotes can be used for searching phrases
  • + represents an AND search (default)
  • | represents an OR search
  • - represents a NOT operation
  • ( and ) imply priority
  • ~N after a word specifies the desired edit distance (fuzziness)
  • ~N after a phrase specifies the desired slop amount
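The operators listed above can be combined into a single query string. The following is a minimal sketch in Python that only assembles such strings; it does not contact any search service. The helper names (`phrase`, `fuzzy`, `prefix`, `any_of`, `all_of`, `exclude`) are hypothetical and chosen for illustration, and the example terms are arbitrary.

```python
# Hypothetical helpers that assemble a query string from the operators
# listed above. They only build text; nothing is sent to a server.

def phrase(text, slop=None):
    """Quote a phrase; an optional ~N after the phrase sets the slop amount."""
    quoted = '"' + text.replace('"', "") + '"'
    return quoted if slop is None else f"{quoted}~{slop}"

def fuzzy(word, distance):
    """~N after a single word sets the desired edit distance."""
    return f"{word}~{distance}"

def prefix(stem):
    """* at the end of a keyword requests a wildcard search."""
    return f"{stem}*"

def any_of(*terms):
    """| is OR; parentheses give the group priority."""
    return "(" + " | ".join(terms) + ")"

def all_of(*terms):
    """+ is AND (also the default between terms)."""
    return " + ".join(terms)

def exclude(term):
    """- is NOT."""
    return f"-{term}"

query = all_of(
    prefix("genom"),
    any_of(phrase("gene expression", slop=2), fuzzy("sequence", 1)),
) + " " + exclude("retired")

print(query)
```

Because AND is the default, the trailing NOT term is appended with a plain space rather than an explicit +.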
Found 60 result(s)
The Gene database provides detailed information for known and predicted genes defined by nucleotide sequence or map position. Gene supplies gene-specific connections in the nexus of map, sequence, expression, structure, function, citation, and homology data. Unique identifiers are assigned to genes with defining sequences, genes with known map positions, and genes inferred from phenotypic information. These gene identifiers are used throughout NCBI's databases and tracked through updates of annotation. Gene includes genomes represented by NCBI Reference Sequences (or RefSeqs) and is integrated for indexing, query, and retrieval through NCBI's Entrez and E-Utilities systems.
MozAtlas provides gene expression data of adult male and female mosquitoes as tables, expressions, trees and models. MozAtlas also provides sequence orthology relationships with data provided by FlyBase, Vectorbase, Beetlebase, BeeBase, and WormBase.
TPA is a database that contains sequences built from the existing primary sequence data in GenBank. TPA records are retrieved through the Nucleotide Database and feature information on the sequence, how it was cataloged, and the proper way to cite the sequence information.
The Arabidopsis Information Resource (TAIR) maintains a database of genetic and molecular biology data for the model higher plant Arabidopsis thaliana. Data available from TAIR includes the complete genome sequence along with gene structure, gene product information, metabolism, gene expression, DNA and seed stocks, genome maps, genetic and physical markers, publications, and information about the Arabidopsis research community. Gene product function data is updated every two weeks from the latest published research literature and community data submissions. Gene structures are updated 1-2 times per year using computational and manual methods as well as community submissions of new and updated genes. TAIR also provides extensive linkouts from our data pages to other Arabidopsis resources.
The goals of the Drosophila Genome Center are to finish the sequence of the euchromatic genome of Drosophila melanogaster to high quality and to generate and maintain biological annotations of this sequence. In addition to genomic sequencing, the BDGP is 1) producing gene disruptions using P element-mediated mutagenesis on a scale unprecedented in metazoans; 2) characterizing the sequence and expression of cDNAs; and 3) developing informatics tools that support the experimental process, identify features of DNA sequence, and allow us to present up-to-date information about the annotated sequence to the research community.
DDBJ (DNA Data Bank of Japan) is the sole nucleotide sequence data bank in Asia that is officially certified to collect nucleotide sequences from researchers and to issue internationally recognized accession numbers to data submitters. Since we exchange the collected data with EMBL-Bank/EBI (European Bioinformatics Institute) and GenBank/NCBI (National Center for Biotechnology Information) on a daily basis, the three data banks share virtually the same data at any given time. The virtually unified database is called INSD (International Nucleotide Sequence Database). DDBJ collects sequence data mainly from Japanese researchers, but also accepts data from, and issues accession numbers to, researchers in any other country.
>>>!!! NCBI announced plans to retire the Clone DB web interface. Pursuant to this retirement, starting on May 27, 2019, all web pages associated with Clone DB and CloneFinder will redirect to this blog post. Links to Clone DB from the NCBI home page will also be going away.!!!<<< Clone DB contains information about genomic clones and cDNA and cell-based libraries for eukaryotic organisms. The database integrates this information with sequence data, map positions, and distributor information. At this time, Clone DB contains records for genomic clones and libraries, the collection of MICER mouse gene targeting clones and cell-based gene trap and gene targeting libraries from the International Knockout Mouse Consortium, Lexicon and the International Gene Trap Consortium. A planned expansion for Clone DB will add records for additional gene targeting and gene trap clones, as well as cDNA clones.
The NCBI Nucleotide database collects sequences from such sources as GenBank, RefSeq, TPA, and PDB. Sequences collected relate to genome, gene, and transcript sequence data, and provide a foundation for research related to the biomedical field.
In early 2010 we updated the site to facilitate more rapid transfer of our data to the public database and focus our efforts on the core mission of providing expression pattern images to the research community. The original database reproduced functions available on FlyBase, complicating our updates by the requirement to re-synchronize with FlyBase updates. Our expression reports on the new site still link to FlyBase gene reports, but we no longer reproduce FlyBase functions and therefore can update expression data on an ongoing basis instead of more infrequent major releases. All the functions relating to the expression patterns remain and we soon will add an option to search expression patterns by image similarity, in addition to annotation term searches. In a transitional phase we will leave both the old and the new sites up, but the newer data (post Release 2) will appear only on the new website. We welcome any feedback or requests for additional features. - The goals of the Drosophila Genome Center are to finish the sequence of the euchromatic genome of Drosophila melanogaster to high quality and to generate and maintain biological annotations of this sequence. In addition to genomic sequencing, the BDGP is 1) producing gene disruptions using P element-mediated mutagenesis on a scale unprecedented in metazoans; 2) characterizing the sequence and expression of cDNAs; and 3) developing informatics tools that support the experimental process, identify features of DNA sequence, and allow us to present up-to-date information about the annotated sequence to the research community.
The Candida Genome Database (CGD) is a resource for genomic sequence data and gene and protein information for Candida albicans and related species. CGD is based on the Saccharomyces Genome Database. It provides online access to genomic sequence data and manually curated functional information about genes and proteins of the human pathogen Candida albicans and related species. C. albicans is the best studied of the human fungal pathogens. It is a common commensal organism of healthy individuals, but can cause debilitating mucosal infections and life-threatening systemic infections, especially in immunocompromised patients. C. albicans also serves as a model organism for the study of other fungal pathogens.
The miRBase database is a searchable database of published miRNA sequences and annotation. Each entry in the miRBase Sequence database represents a predicted hairpin portion of a miRNA transcript (termed mir in the database), with information on the location and sequence of the mature miRNA sequence (termed miR). Both hairpin and mature sequences are available for searching and browsing, and entries can also be retrieved by name, keyword, references and annotation. All sequence and annotation data are also available for download. The miRBase Registry provides miRNA gene hunters with unique names for novel miRNA genes prior to publication of results.
Gene Expression Omnibus: a public functional genomics data repository supporting MIAME-compliant data submissions. Array- and sequence-based data are accepted. Tools are provided to help users query and download experiments and curated gene expression profiles.
>>>!!! NCBI has retired the Probe Database !!!<<< Probe database provides a public registry of nucleic acid reagents as well as information on reagent distributors, sequence similarities and probe effectiveness. Database users have access to applications of gene expression, gene silencing and mapping, as well as reagent variation analysis and projects based on probe-generated data. The Probe database is constantly updated.
>>>!!!<<< Noticed 26.08.2020: The NCI CBIIT instance of the CGAP no longer exists on this website. The Mitelman Database of Chromosome Aberrations and Gene Fusions in Cancer has a new home at the NCI-funded Institute for Systems Biology Cancer Genomics Cloud available at the following location: >>>!!!<<<
The Reference Sequence (RefSeq) collection provides a comprehensive, integrated, non-redundant, well-annotated set of sequences, including genomic DNA, transcripts, and proteins. RefSeq sequences form a foundation for medical, functional, and diversity studies. They provide a stable reference for genome annotation, gene identification and characterization, mutation and polymorphism analysis (especially RefSeqGene records), expression studies, and comparative analyses.
BacMap is a picture atlas of annotated bacterial genomes. It is an interactive visual database containing hundreds of fully labeled, zoomable, and searchable maps of bacterial genomes.
The Saccharomyces Genome Database (SGD) provides comprehensive integrated biological information for the budding yeast Saccharomyces cerevisiae along with search and analysis tools to explore these data, enabling the discovery of functional relationships between sequence and gene products in fungi and higher organisms.
>>>>!!!!<<<< AspGD data are being integrated into FungiDB. Please click here for additional details. Discussion of how to maximize the value of FungiDB for the Aspergillus research community will be a major topic at the upcoming AsperFest12 meeting at Asilomar (March 16-17, 2015). >>>>!!!!<<<< AspGD is an organized collection of genetic and molecular biological information about the filamentous fungi of the genus Aspergillus. Among its many species, the genus contains an excellent model organism (A. nidulans, or its teleomorph Emericella nidulans), an important pathogen of the immunocompromised (A. fumigatus), an agriculturally important toxin producer (A. flavus), and two species used in industrial processes (A. niger and A. oryzae). AspGD contains information about genes and proteins of multiple Aspergillus species; descriptions and classifications of their biological roles, molecular functions, and subcellular localizations; gene, protein, and chromosome sequence information; tools for analysis and comparison of sequences; and links to literature information; as well as a multispecies comparative genomics browser tool (Sybil) for exploration of orthology and synteny across multiple sequenced Aspergillus species.
CBS offers comprehensive public databases of DNA and protein sequences, macromolecular structure, gene and protein expression levels, pathway organization and cell signalling, established to optimise scientific exploitation of the explosion of data within biology. Unlike many other groups in the field of biomolecular informatics, the Center for Biological Sequence Analysis directs its research primarily towards topics related to the elucidation of the functional aspects of complex biological mechanisms. Among contemporary bioinformatics concerns are reliable computational interpretation of a wide range of experimental data, and the detailed understanding of the molecular apparatus behind cellular mechanisms of sequence information. By exploiting available experimental data and evidence in the design of algorithms, sequence correlations and other features of biological significance can be inferred. In addition to the computational research, the center also has experimental efforts in gene expression analysis using DNA chips and data generation in relation to the physical and structural properties of DNA. In the last decade, the Center for Biological Sequence Analysis has produced a large number of computational methods, which are offered to others via WWW servers.
This database serves forest tree scientists by providing online access to hardwood tree genomic and genetic data, including assembled reference genomes, transcriptomes, and genetic mapping information. The web site also provides access to tools for mining and visualization of these data sets, including BLAST for comparing sequences, JBrowse for browsing genomes, Apollo for community annotation, and Expression Analysis for building gene expression heatmaps.
The ISSAID website gathers resources related to the systemic autoinflammatory diseases in order to facilitate contacts between interested physicians and researchers. The website provides support to share and rapidly disseminate information, thoughts, feelings and experiences to improve the quality of life of patients and families affected by systemic autoinflammatory diseases, and promote advances in the search for causes and cures.
The Chickpea Transcriptome Database (CTDB) has been developed with a view to providing the most comprehensive information about the chickpea transcriptome, the most relevant part of the genome. The database contains information and tools for transcriptome sequence, functional annotation, conserved domain(s), transcription factor families, molecular markers (microsatellites and single nucleotide polymorphisms), comprehensive gene expression, and comparative genomics with other legumes. The database is a freely available resource that gives scientists and breeders a portal to search, browse and query the data to facilitate functional and applied genomics research in chickpea and other legumes. The current release of the database provides transcriptome sequence from cultivated (Cicer arietinum desi (ICC4958) and kabuli (ICCV2)) and wild (Cicer reticulatum, PI489777) chickpea genotypes.
INTEGRALL is a web-based platform dedicated to compiling information on integrons and designed to organize all the data available for these genetic structures. INTEGRALL provides a public genetic repository for sequence data and nomenclature and offers scientists easy, interactive access to integron DNA sequences, their molecular arrangements, and their genetic contexts.
A repository for high-quality gene models produced by the manual annotation of vertebrate genomes. The final update of Vega, version 68, was released in February 2017 and is now archived at We plan to maintain this resource until Feb 2020.