  • * at the end of a keyword allows wildcard searches
  • " quotes can be used for searching phrases
  • + represents an AND search (default)
  • | represents an OR search
  • - represents a NOT operation
  • ( and ) imply priority
  • ~N after a word specifies the desired edit distance (fuzziness)
  • ~N after a phrase specifies the desired slop amount
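As a rough illustration of how the operators listed above combine, the sketch below builds query strings with small helper functions. The helper names (`wildcard`, `phrase`, `fuzzy`, `all_of`, `any_of`, `none_of`) are hypothetical and exist only for this example. The code only assembles text; it does not contact any search service, and it is an assumption that the search box parses the resulting string exactly as written.

```python
from typing import Optional


def wildcard(stem: str) -> str:
    """Append * so the keyword matches any ending (wildcard search)."""
    return f"{stem}*"


def phrase(text: str, slop: Optional[int] = None) -> str:
    """Quote a phrase; optionally add ~N to allow N words of slop."""
    quoted = f'"{text}"'
    return quoted if slop is None else f"{quoted}~{slop}"


def fuzzy(word: str, distance: int) -> str:
    """Add ~N after a word to allow an edit distance of N."""
    return f"{word}~{distance}"


def all_of(*terms: str) -> str:
    """Join terms with + (AND)."""
    return " + ".join(terms)


def any_of(*terms: str) -> str:
    """Join terms with | (OR) and group them with parentheses."""
    return "(" + " | ".join(terms) + ")"


def none_of(term: str) -> str:
    """Prefix a term with - (NOT)."""
    return f"-{term}"


# Example: keywords starting with "genom", AND either maize or rice,
# AND the phrase "gene expression" with up to 2 words of slop.
query = all_of(
    wildcard("genom"),
    any_of("maize", "rice"),
    phrase("gene expression", slop=2),
)
print(query)
```

The printed string can be pasted into the search field; grouping the OR terms in parentheses keeps them from being split apart by the surrounding AND operators.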
Found 41 result(s)
The Entrez Protein Clusters database contains annotation information, publications, structures and analysis tools for related protein sequences encoded by complete genomes. The data available in the Protein Clusters Database is generated from prokaryotic genomic studies and is intended to assist researchers studying micro-organism evolution as well as other biological sciences. Available genomes include plants and viruses as well as organelles and microbial genomes.
The tree of life links all biodiversity through a shared evolutionary history. This project will produce the first online, comprehensive first-draft tree of all 1.8 million named species, accessible to both the public and scientific communities. Assembly of the tree will incorporate previously-published results, with strong collaborations between computational and empirical biologists to develop, test and improve methods of data synthesis. This initial tree of life will not be static; instead, we will develop tools for scientists to update and revise the tree as new data come in. Early release of the tree and tools will motivate data sharing and facilitate ongoing synthesis of knowledge.
The Expression Atlas provides information on gene expression patterns under different biological conditions such as a gene knock out, a plant treated with a compound, or in a particular organism part or cell. It includes both microarray and RNA-seq data. The data is re-analysed in-house to detect interesting expression patterns under the conditions of the original experiment. There are two components to the Expression Atlas, the Baseline Atlas and the Differential Atlas. The Baseline Atlas displays information about which gene products are present (and at what abundance) in "normal" conditions (e.g. tissue, cell type). It aims to answer questions such as "which genes are specifically expressed in human kidney?". This component of the Expression Atlas consists of highly-curated and quality-checked RNA-seq experiments from ArrayExpress. It has data for many different animal and plant species. New experiments are added as they become available. The Differential Atlas allows users to identify genes that are up- or down-regulated in a wide variety of different experimental conditions such as yeast mutants, cadmium treated plants, cystic fibrosis or the effect on gene expression of mind-body practice. Both microarray and RNA-seq experiments are included in the Differential Atlas. Experiments are selected from ArrayExpress and groups of samples are manually identified for comparison e.g. those with wild type genotype compared to those with a gene knock out. Each experiment is processed through our in-house differential expression statistical analysis pipeline to identify genes with a high probability of differential expression.
Gramene is a platform for comparative genomic analysis of agriculturally important grasses, including maize, rice, sorghum, wheat and barley. Relationships between cereals are queried and displayed using controlled vocabularies (Gene, Plant, Trait, Environment, and Gramene Taxonomy) and web-based displays, including the Genes and Quantitative Trait Loci (QTL) modules.
TreeBASE is a repository of phylogenetic information, specifically user-submitted phylogenetic trees and the data used to generate them. TreeBASE accepts all types of phylogenetic data (e.g., trees of species, trees of populations, trees of genes) representing all biotic taxa. Data in TreeBASE are exposed to the public if they are used in a publication that is in press or published in a peer-reviewed scientific journal, book, conference proceedings, or thesis. Data used in publications that are in preparation or in review can be submitted to TreeBASE but are only available to the authors, publication editors, or reviewers using a special access code.
VertNet is an NSF-funded collaborative project that makes biodiversity data free and available on the web. VertNet is a tool designed to help people discover, capture, and publish biodiversity data. It is also the core of a collaboration between hundreds of biocollections that contribute biodiversity data and work together to improve it. VertNet is an engine for training current and future professionals to use and build upon best practices in data quality, curation, research, and data publishing. Yet, VertNet is still the aggregate of all of the information that it mobilizes. To us, VertNet is all of these things and more.
The Maize Genetics and Genomics Database focuses on collecting data related to the crop plant and model organism Zea mays. The project's goals are to synthesize, display, and provide access to maize genomics and genetics data, prioritizing mutant and phenotype data and tools, structural and genetic map sets, and gene models. MaizeGDB also aims to make the Maize Newsletter available, and provide support services to the community of maize researchers. MaizeGDB is working with the Schnable lab, the Panzea project, The Genome Reference Consortium, and iPlant Collaborative to create a plan for archiving, disseminating, visualizing, and analyzing diversity data. MaizeGDB is short for Maize Genetics/Genomics Database. It is a USDA/ARS funded project to integrate the data found in MaizeDB and ZmDB into a single schema, develop an effective interface to access this data, and develop additional tools to make data analysis easier. Our goal in the long term is a true next-generation online maize database.
Antarctic marine and terrestrial biodiversity data is widely scattered, patchy and often not readily accessible. In many cases the data is in danger of being irretrievably lost. This resource establishes and supports a distributed system of interoperable databases, giving easy access through a single internet portal to a set of resources relevant to research, conservation and management pertaining to Antarctic biodiversity. It provides access to both marine and terrestrial Antarctic biodiversity data.
GLOBE (Global Collaboration Engine) is an online collaborative environment that enables land change researchers to share, compare and integrate local and regional studies with global data to assess the global relevance of their work.
The Genome database contains annotations and analysis of eukaryotic and prokaryotic genomes, as well as tools that allow users to compare genomes and gene sequences from humans, microbes, plants, viruses and organelles. Users can browse by organism, and view genome maps and protein clusters.
The Saccharomyces Genome Database (SGD) provides comprehensive integrated biological information for the budding yeast Saccharomyces cerevisiae along with search and analysis tools to explore these data, enabling the discovery of functional relationships between sequence and gene products in fungi and higher organisms.
iHUB is a collaborative environment that supports research that relates to the genes and gene networks that control the ionomes, mineral nutrient, and trace element compositions of tissues and organisms. It provides tools for sharing data and literature and for coordinating collection efforts, among others. It contains ionomic data on more than 200,000 samples.
Tropicos® was originally created for internal research but has since been made available to the world’s scientific community. All of the nomenclatural, bibliographic, and specimen data accumulated in MBG’s electronic databases during the past 30 years are publicly available here.
The Environmental Data Initiative Repository concentrates on studies of ecological processes that play out at time scales spanning decades to centuries including those of the NSF Long Term Ecological Research (LTER) program, the NSF Macrosystems Biology Program, the NSF Long Term Research in Environmental Biology (LTREB) program, the Organization of Biological Field Stations, and others. The repository hosts data that provide a context to evaluate the nature and pace of ecological change, to interpret its effects, and to forecast the range of future biological responses to change.
The Arctic Data Center is the primary data and software repository for the Arctic section of NSF Polar Programs. The Center helps the research community to reproducibly preserve and discover all products of NSF-funded research in the Arctic, including data, metadata, software, documents, and provenance that links these together. The repository is open to contributions from NSF Arctic investigators, and data are released under an open license (CC-BY, CC0, depending on the choice of the contributor). All science, engineering, and education research supported by the NSF Arctic research program is included, such as Natural Sciences (Geoscience, Earth Science, Oceanography, Ecology, Atmospheric Science, Biology, etc.) and Social Sciences (Archeology, Anthropology, Social Science, etc.). Key to the initiative is the partnership between NCEAS at UC Santa Barbara, DataONE, and NOAA’s NCEI, each of which brings critical capabilities to the Center. Infrastructure from the successful NSF-sponsored DataONE federation of data repositories enables data replication to NCEI, providing both offsite and institutional diversity that are critical to long term preservation.
Biological collections are replete with taxonomic, geographic, temporal, numerical, and historical information. This information is crucial for understanding and properly managing biodiversity and ecosystems, but is often difficult to access. Canadensys, operated from the Université de Montréal Biodiversity Centre, is a Canada-wide effort to unlock the biodiversity information held in biological collections.
SoyBase is a professionally curated repository for genetics, genomics and related data resources for soybean. It contains current genetic, physical and genomic sequence maps integrated with qualitative and quantitative traits. SoyBase includes annotated "Williams 82" genomic sequence and associated data mining tools. The repository maintains controlled vocabularies for soybean growth, development, and traits that are linked to more general plant ontologies.
UniGene collects entries of transcript sequences from transcription loci of genes or expressed pseudogenes. Entries also contain information on protein similarities, gene expression, cDNA clone reagents, and genomic locations.
GABI, acronym for "Genomanalyse im biologischen System Pflanze", is the name of a large collaborative network of different plant genomic research projects. Plant data from different ‘omics’ fronts representing more than 10 different model or crop species are integrated in GabiPD.
The Atlas of Living Australia (ALA) combines and provides scientifically collected data from a wide range of sources such as museums, herbaria, community groups, government departments, individuals and universities. Data records consist of images, literature, molecular DNA data, identification keys, species interaction data, species profile data, nomenclature, source data, conservation indicators, and spatial data.