  • * at the end of a keyword allows wildcard searches
  • " quotes can be used for searching phrases
  • + represents an AND search (default)
  • | represents an OR search
  • - represents a NOT operation
  • ( and ) indicate precedence
  • ~N after a word specifies the desired edit distance (fuzziness)
  • ~N after a phrase specifies the desired slop amount
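As an illustration of how the operators above combine, here is a minimal sketch in Python that assembles a query string. The helper names (phrase, fuzzy, prefix, any_of, all_of, exclude) are hypothetical and not part of any search API; the sketch only builds text and assumes, as stated in the list, that terms separated by a space are combined with AND by default.

```python
# Hypothetical helpers that build a query string from the operators listed above.
# They only produce text; no search service is contacted.

def phrase(text, slop=None):
    """Wrap text in quotes for a phrase search; ~N sets the slop amount."""
    quoted = f'"{text}"'
    return f"{quoted}~{slop}" if slop is not None else quoted

def fuzzy(word, distance):
    """Append ~N to a word to set the desired edit distance."""
    return f"{word}~{distance}"

def prefix(word):
    """Append * to a keyword for a wildcard search."""
    return f"{word}*"

def any_of(*terms):
    """Join terms with | (OR) and group them with parentheses."""
    return "(" + " | ".join(terms) + ")"

def exclude(term):
    """Prefix a term with - (NOT)."""
    return f"-{term}"

def all_of(*terms):
    """Join terms with a space, relying on AND being the default."""
    return " ".join(terms)

query = all_of(
    prefix("genom"),
    any_of(phrase("plant genome", slop=2), fuzzy("arabidopsis", 1)),
    exclude("virus"),
)
print(query)
```

The resulting string can be pasted into the search box; whether a given combination returns useful results depends on the search backend.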
Found 28 result(s)
The Structure database provides three-dimensional structures of macromolecules for a variety of research purposes and allows the user to retrieve structures for specific molecule types as well as structures for genes and proteins of interest. Structure comprises three main databases: the Molecular Modeling Database; Conserved Domains and Protein Classification; and the BioSystems Database. Structure also links to the PubChem databases to connect biological activity data to the macromolecular structures. Users can locate structural templates for proteins and interactively view structures and sequence data to closely examine sequence-structure relationships.
The Entrez Protein Clusters database contains annotation information, publications, structures and analysis tools for related protein sequences encoded by complete genomes. The data available in the Protein Clusters Database is generated from prokaryotic genomic studies and is intended to assist researchers studying micro-organism evolution as well as other biological sciences. Available genomes include plants and viruses as well as organelles and microbial genomes.
The species included in PlantTFDB 3.0 cover the main lineages of green plants, so PlantTFDB provides genomic TF repertoires across Viridiplantae. To provide comprehensive information on each TF family, a brief introduction and key references are presented for each family. Comprehensive annotations are made for each identified TF, including functional domains, 3D structures, gene ontology (GO), plant ontology (PO), expression information, expert-curated functional description, regulation information, interaction, conserved elements, references, and annotations in various databases such as UniProt, RefSeq, TransFac, STRING, and VISTA. Evolutionary relationships among the identified TFs were inferred by identifying orthologous groups and constructing phylogenetic trees. In addition, PlantTFDB has a simple and user-friendly interface that allows users to query based on combined conditions or run sequence similarity searches using BLAST.
This database hosts fungal data related to new classifications, including morphological, molecular and other important data. It allows deposition of taxonomic data, phenotypic details and other useful data, which will enhance our current taxonomic understanding and ultimately enable mycologists to gain better and more up-to-date insights into the current fungal classification system. In addition, the database will also allow access to comprehensive metadata, including descriptions of voucher and type specimens.
Xanthobase provides information on Xanthomonas oryzae pv. oryzae (Xoo), the rice (Oryza sativa) pathogenic bacterium in which genome sequencing has revealed very extensive race differentiation. The whole genome sequence of its native host has also been completed, and analysis of the host-parasite interaction on the basis of the two genomes can be expected to be useful.
Gramene is a platform for comparative genomic analysis of agriculturally important grasses, including maize, rice, sorghum, wheat and barley. Relationships between cereals are queried and displayed using controlled vocabularies (Gene, Plant, Trait, Environment, and Gramene Taxonomy) and web-based displays, including the Genes and Quantitative Trait Loci (QTL) modules.
dbEST is a division of GenBank that contains sequence data and other information on "single-pass" cDNA sequences, or "Expressed Sequence Tags", from a number of organisms. Expressed Sequence Tags (ESTs) are short (usually about 300-500 bp), single-pass sequence reads from mRNA (cDNA). Typically they are produced in large batches. They represent a snapshot of genes expressed in a given tissue and/or at a given developmental stage. They are tags (some coding, others not) of expression for a given cDNA library. Most EST projects develop large numbers of sequences. These are commonly submitted to GenBank and dbEST as batches of dozens to thousands of entries, with a great deal of redundancy in the citation, submitter and library information. To improve the efficiency of the submission process for this type of data, we have designed a special streamlined submission process and data format. dbEST also includes sequences that are longer than the traditional ESTs, or are produced as single sequences or in small batches. Among these sequences are products of differential display experiments and RACE experiments. The thing that these sequences have in common with traditional ESTs, regardless of length, quality, or quantity, is that there is little information that can be annotated in the record. If a sequence is later characterized and annotated with biological features such as a coding region, 5'UTR, or 3'UTR, it should be submitted through the regular GenBank submissions procedure (via BankIt or Sequin), even if part of the sequence is already in dbEST. dbEST is reserved for single-pass reads. Assembled sequences should not be submitted to dbEST. GenBank will accept assembled EST submissions for the forthcoming TSA (Transcriptome Shotgun Assembly) division. The individual reads which make up the assembly should be submitted to dbEST, the Trace archive or the Short Read Archive (SRA) prior to the submission of the assemblies.
The Genome database contains annotations and analysis of eukaryotic and prokaryotic genomes, as well as tools that allow users to compare genomes and gene sequences from humans, microbes, plants, viruses and organelles. Users can browse by organism, and view genome maps and protein clusters.
This database serves forest tree scientists by providing online access to hardwood tree genomic and genetic data, including assembled reference genomes, transcriptomes, and genetic mapping information. The web site also provides access to tools for mining and visualizing these data sets, including BLAST for comparing sequences, JBrowse for browsing genomes, Apollo for community annotation and Expression Analysis for building gene expression heatmaps.
Phytozome is the Plant Comparative Genomics portal of the Department of Energy's Joint Genome Institute. Families of related genes representing the modern descendants of ancestral genes are constructed at key phylogenetic nodes. These families allow easy access to clade-specific orthology/paralogy relationships as well as insights into clade-specific novelties and expansions.
AceView provides a curated, comprehensive and non-redundant sequence representation of all public mRNA sequences (mRNAs from GenBank or RefSeq, and single-pass cDNA sequences from dbEST and Trace). These experimental cDNA sequences are first co-aligned on the genome, then clustered into a minimal number of alternative transcript variants and grouped into genes. Using the available cDNA sequences exhaustively and to high quality standards reveals the beauty and complexity of the mammalian transcriptome, and the relative simplicity of the nematode and plant transcriptomes. Genes are classified according to their inferred coding potential; many presumably non-coding genes are discovered. Genes are named by Entrez Gene names when available, otherwise by AceView gene names, which are stable from release to release. Alternative features (promoters, introns and exons, polyadenylation signals) and coding potential, including motifs, domains, and homologies, are annotated in depth; tissues where expression has been observed are listed in order of representation; diseases, phenotypes, pathways, functions, localization or interactions are annotated by mining selected sources, in particular PubMed, GAD and Entrez Gene, and also by manual annotation, especially in the worm. In this way, both the anatomy and physiology of the human, mouse and nematode genes supported by experimental cDNA evidence are thoroughly annotated.
The Crop EST Database (CR-EST) is a publicly available online resource providing access to sequence, classification, clustering, and annotation data of crop EST projects at the IPK. An overview of this information gives summary figures on the genomic data for the species listed in the adjacent table.
A database for plant breeders and researchers to combine, visualize, and interrogate the wealth of phenotype and genotype data generated by the Triticeae Coordinated Agricultural Project (TCAP).
UniGene collects entries of transcript sequences from the transcription loci of genes or expressed pseudogenes. Entries also contain information on protein similarities, gene expression, cDNA clone reagents, and genomic locations.
The Atlas of Living Australia (ALA) combines and provides scientifically collected data from a wide range of sources such as museums, herbaria, community groups, government departments, individuals and universities. Data records consist of images, literature, molecular DNA data, identification keys, species interaction data, species profile data, nomenclature, source data, conservation indicators, and spatial data.
During the cell cycle, numerous proteins are temporally and spatially localized in distinct sub-cellular regions, including the centrosome (spindle pole in budding yeast), kinetochore/centromere, cleavage furrow/midbody (the related or homologous structures in plants and budding yeast are called the phragmoplast and bud neck, respectively), telomere and spindle. These sub-cellular regions play important roles in various biological processes. In this work, we have collected all proteins identified as localized on the kinetochore, centrosome, midbody, telomere and spindle from two fungi (S. cerevisiae and S. pombe) and five animals (C. elegans, D. melanogaster, X. laevis, M. musculus and H. sapiens), based on the rationale of "Seeing is believing" (Bloom K et al., 2005). Through ortholog searches, proteins potentially localized at these sub-cellular regions were detected in 144 eukaryotes. The integrated and searchable database MiCroKiTS (Midbody, Centrosome, Kinetochore, Telomere and Spindle) was then established.
GABI, acronym for "Genomanalyse im biologischen System Pflanze", is the name of a large collaborative network of different plant genomic research projects. Plant data from different ‘omics’ fronts representing more than 10 different model or crop species are integrated in GabiPD.
The portal is a web site for specialized georeferenced databases and tools for the analysis of marine bacterial, archaeal, and phage genomes and metagenomes. Megx offers three main functions: 1. Genes Mapserver - can be used to view georeferenced genome, metagenome and rRNA sampling sites and selected physicochemical and biological parameters. 2. Geographic-BLAST - can be used to query the genome and metagenome databases offered and to view the distribution of the georeferenced hits. 3. "Browse" functions - the "Browse" menu in the navigation bar offers additional functionality. The Microbial Metagenomic Traits Database (MiMeT DB) contains a pre-calculated set of metagenomic traits.
The Sequence Read Archive stores raw sequencing data from sequencing platforms such as the Roche 454 GS System, the Illumina Genome Analyzer, the Applied Biosystems SOLiD System, the Helicos Heliscope, and Complete Genomics. It archives the sequencing data associated with RNA-Seq, ChIP-Seq, genomic and transcriptomic assemblies, and 16S ribosomal RNA data.
The Arabidopsis Information Resource (TAIR) maintains a database of genetic and molecular biology data for the model higher plant Arabidopsis thaliana. Data available from TAIR include the complete genome sequence along with gene structure, gene product information, metabolism, gene expression, DNA and seed stocks, genome maps, genetic and physical markers, publications, and information about the Arabidopsis research community. Gene product function data is updated every two weeks from the latest published research literature and community data submissions. Gene structures are updated 1-2 times per year using computational and manual methods as well as community submissions of new and updated genes. TAIR also provides extensive linkouts from its data pages to other Arabidopsis resources.
The Protein database is a collection of sequences from several sources, including translations from annotated coding regions in GenBank, RefSeq and TPA, as well as records from SwissProt, PIR, PRF, and PDB. Protein sequences are the fundamental determinants of biological structure and function.