Reset all


Content Types


AID systems


Data access

Data access restrictions

Database access

Database licenses

Data licenses

Data upload

Data upload restrictions

Enhanced publication

Institution responsibility type

Institution type


Metadata standards

PID systems

Provider types

Quality management

Repository languages



Repository types


  • * at the end of a keyword allows wildcard searches
  • " quotes can be used for searching phrases
  • + represents an AND search (default)
  • | represents an OR search
  • - represents a NOT operation
  • ( and ) implies priority
  • ~N after a word specifies the desired edit distance (fuzziness)
  • ~N after a phrase specifies the desired slop amount
Found 30 result(s)
The Entrez Protein Clusters database contains annotation information, publications, structures and analysis tools for related protein sequences encoded by complete genomes. The data available in the Protein Clusters Database is generated from prokaryotic genomic studies and is intended to assist researchers studying micro-organism evolution as well as other biological sciences. Available genomes include plants and viruses as well as organelles and microbial genomes.
VegBank is the vegetation plot database of the Ecological Society of America's Panel on Vegetation Classification. VegBank consists of three linked databases that contain the actual plot records, vegetation types recognized in the U.S. National Vegetation Classification and other vegetation types submitted by users, and all plant taxa recognized by ITIS/USDA as well as all other plant taxa recorded in plot records. Vegetation records, community types and plant taxa may be submitted to VegBank and may be subsequently searched, viewed, annotated, revised, interpreted, downloaded, and cited. VegBank receives its data from the VegBank community of users.
The Plant Metabolic Network (PMN) provides a broad network of plant metabolic pathway databases that contain curated information from the literature and computational analyses about the genes, enzymes, compounds, reactions, and pathways involved in primary and secondary metabolism in plants. The PMN currently houses one multi-species reference database called PlantCyc and 22 species/taxon-specific databases.
CPLM (Compendium of Protein Lysine Modifications) is an online data resource specifically designed for protein lysine modifications (PLMs). The CPLM database was extended and adapted from our CPLA 1.0 (Compendium of Protein Lysine Acetylation) database and the 2.0 release contains 203,972 modification events on 189,919 modified lysines in 45,748 proteins for 12 types of PLMs, including Nε-lysine acetylation, ubiquitination, methylation, sumoylation, glycation, butyrylation, crotonylation, malonylation, propionylation, succinylation, phosphoglycerylation and prokaryotic Pupylation.
The Genome database contains annotations and analysis of eukaryotic and prokaryotic genomes, as well as tools that allow users to compare genomes and gene sequences from humans, microbes, plants, viruses and organelles. Users can browse by organism, and view genome maps and protein clusters.
e!DAL stands for electronic Data Archive Library. It is a lightweight open source software software framework for publishing and sharing research data. e!DAL was developed based on experiences coming from decades of research data management and has grown towards being a general data archiving and publication infrastructure [doi:10.1186/1471-2105-15-214]. First research data repository is "Plant Genomics and Phenomics Research Data Repository".
The KNB Data Repository is an international repository intended to facilitate ecological, environmental and earth science research in the broadest senses. For scientists, the KNB Data Repository is an efficient way to share, discover, access and interpret complex ecological, environmental, earth science, and sociological data and the software used to create and manage those data. Due to rich contextual information provided with data in the KNB, scientists are able to integrate and analyze data with less effort. The data originate from a highly-distributed set of field stations, laboratories, research sites, and individual researchers. The KNB supports rich, detailed metadata to promote data discovery as well as automated and manual integration of data into new projects. The KNB supports a rich set of modern repository services, including the ability to assign Digital Object Identifiers (DOIs) so data sets can be confidently referenced in any publication, the ability to track the versions of datasets as they evolve through time, and metadata to establish the provenance relationships between source and derived data.
The Arctic Data Center is the primary data and software repository for the Arctic section of NSF Polar Programs. The Center helps the research community to reproducibly preserve and discover all products of NSF-funded research in the Arctic, including data, metadata, software, documents, and provenance that links these together. The repository is open to contributions from NSF Arctic investigators, and data are released under an open license (CC-BY, CC0, depending on the choice of the contributor). All science, engineering, and education research supported by the NSF Arctic research program are included, such as Natural Sciences (Geoscience, Earth Science, Oceanography, Ecology, Atmospheric Science, Biology, etc.) and Social Sciences (Archeology, Anthropology, Social Science, etc.). Key to the initiative is the partnership between NCEAS at UC Santa Barbara, DataONE, and NOAA’s NCEI, each of which bring critical capabilities to the Center. Infrastructure from the successful NSF-sponsored DataONE federation of data repositories enables data replication to NCEI, providing both offsite and institutional diversity that are critical to long term preservation.
The Sol Genomics Network (SGN) is a clade-oriented database dedicated to the biology of the Solanaceae family which includes a large number of closely related and many agronomically important species such as tomato, potato, tobacco, eggplant, pepper, and the ornamental Petunia hybrida. SGN is part of the International Solanaceae Initiative (SOL), which has the long-term goal of creating a network of resources and information to address key questions in plant adaptation and diversification
The Wellcome Trust Sanger Institute is a charitably funded genomic research centre located in Hinxton, nine miles south of Cambridge in the UK. We study diseases that have an impact on health globally by investigating genomes. Building on our past achievements and based on priorities that exploit the unique expertise of our Faculty of researchers, we will lead global efforts to understand the biology of genomes. We are convinced of the importance of making this research available and accessible for all audiences. reduce global health burdens.
Biological collections are replete with taxonomic, geographic, temporal, numerical, and historical information. This information is crucial for understanding and properly managing biodiversity and ecosystems, but is often difficult to access. Canadensys, operated from the Université de Montréal Biodiversity Centre, is a Canada-wide effort to unlock the biodiversity information held in biological collections.
TreeGenes is a genomic, phenotypic, and environmental data resource for forest tree species. The TreeGenes database and Dendrome project provide custom informatics tools to manage the flood of information.The database contains several curated modules that support the storage of data and provide the foundation for web-based searches and visualization tools. GMOD GUI tools such as CMAP for genetic maps and GBrowse for genome and transcriptome assemblies are implemented here. A sample tracking system, known as the Forest Tree Genetic Stock Center, sits at the forefront of most large-scale projects. Barcode identifiers assigned to the trees during sample collection are maintained in the database to identify an individual through DNA extraction, resequencing, genotyping and phenotyping. DiversiTree, a user-friendly desktop-style interface, queries the TreeGenes database and is designed for bulk retrieval of resequencing data. CartograTree combines geo-referenced individuals with relevant ecological and trait databases in a user-friendly map-based interface. ---- The Conifer Genome Network (CGN) is a virtual nexus for researchers working in conifer genomics. The CGN web site is maintained by the Dendrome Project at the University of California, Davis.
UniGene collects entries of transcript sequences from transcription loci from genes or expressed pseudogenes. Entries also contain information on the protein similarities, gene expressions, cDNA clone reagents, and genomic locations.
The Institute of Plant Genetics and Crop Plant Research IPK Gatersleben, is a nonprofit research institution for crop genetics and molecular biology, and is part of the Leibniz Association. The mission of the IPK Gatersleben is to conduct basic and applied research in the area of plant genetics and crop plant research. The results of this work are not only of significant benefit to plant breeders and the agricultural industry, but also to the food, feed, and chemical industry. An additional research area, the use of renewable raw materials, is increasingly gaining in importance.
During cell cycle, numerous proteins temporally and spatially localized in distinct sub-cellular regions including centrosome (spindle pole in budding yeast), kinetochore/centromere, cleavage furrow/midbody (related or homolog structures in plants and budding yeast called as phragmoplast and bud neck, respectively), telomere and spindle spatially and temporally. These sub-cellular regions play important roles in various biological processes. In this work, we have collected all proteins identified to be localized on kinetochore, centrosome, midbody, telomere and spindle from two fungi (S. cerevisiae and S. pombe) and five animals, including C. elegans, D. melanogaster, X. laevis, M. musculus and H. sapiens based on the rationale of "Seeing is believing" (Bloom K et al., 2005). Through ortholog searches, the proteins potentially localized at these sub-cellular regions were detected in 144 eukaryotes. Then the integrated and searchable database MiCroKiTS - Midbody, Centrosome, Kinetochore, Telomere and Spindle has been established.
The Global Proteome Machine (GPM) is a protein identification database. This data repository allows users to post and compare results. GPM's data is provided by contributors like The Informatics Factory, University of Michigan, and Pacific Northwestern National Laboratories. The GPM searchable databases are: GPMDB, pSYT, SNAP, MRM, PEPTIDE and HOT.
The portal is a web site for specialized georeferenced databases and tools for the analysis of marine bacterial, archaeal, and phage genomes and metagenomes. Megx offers three main functions: 1. Mapserver Popup The Genes Mapserver can be used to view georeferenced genome, metagenome and rRNA sampling sites and selected physicochemical and biological parameters. 2. Geographic-BLAST - can query the genome and metagenome databases we offer and view the distribution of your georeferenced hits. 3. "Browse" functions - The "Browse" menu in the navigation bar offers additional functionality. The Microbial Metagenomic Traits Database (MiMeT DB) contains a pre-calculated set of metagenomic traits. is a project of the University of Georgia’s Center for Invasive Species and Ecosystem Health and one of the four major parts of BugwoodImages. The Focus is on invasive and exotic species of North America. This can be animals, plants, insects, and pathogens. It provides an easily accessible archive of high quality images for use in educational applications. In most cases, the images found in this system were taken by and loaned to us by photographers other than ourselves. Most are in the realm of public sector images. The photographs are in this system to be used
PeanutBase is a peanut community resource providing genetic, genomic, gene function, and germplasm data to support peanut breeding and molecular research. This includes molecular markers, genetic maps, QTL data, genome assemblies, germplasm records, and traits. Data is curated from literature and submitted directly by researchers. Funding for PeanutBase is provided by the Peanut Foundation with in-kind contributions from the USDA-ARS.
The Sequence Read Archive stores the raw sequencing data from such sequencing platforms as the Roche 454 GS System, the Illumina Genome Analyzer, the Applied Biosystems SOLiD System, the Helicos Heliscope, and the Complete Genomics. It archives the sequencing data associated with RNA-Seq, ChIP-Seq, Genomic and Transcriptomic assemblies, and 16S ribosomal RNA data.
The Arabidopsis Information Resource (TAIR) maintains a database of genetic and molecular biology data for the model higher plant Arabidopsis thaliana . Data available from TAIR includes the complete genome sequence along with gene structure, gene product information, metabolism, gene expression, DNA and seed stocks, genome maps, genetic and physical markers, publications, and information about the Arabidopsis research community. Gene product function data is updated every two weeks from the latest published research literature and community data submissions. Gene structures are updated 1-2 times per year using computational and manual methods as well as community submissions of new and updated genes. TAIR also provides extensive linkouts from our data pages to other Arabidopsis resources.
The Protein database is a collection of sequences from several sources, including translations from annotated coding regions in GenBank, RefSeq and TPA, as well as records from SwissProt, PIR, PRF, and PDB. Protein sequences are the fundamental determinants of biological structure and function.