  • * at the end of a keyword allows wildcard searches
  • " quotes can be used for searching phrases
  • + represents an AND search (default)
  • | represents an OR search
  • - represents a NOT operation
  • ( and ) group terms to set precedence
  • ~N after a word specifies the desired edit distance (fuzziness)
  • ~N after a phrase specifies the desired slop amount
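As a rough illustration of how the operators above combine, the following sketch composes a query string in Python. The helper names (phrase, prefix, fuzzy, any_of, all_of, exclude) are hypothetical and exist only for this example; the code builds text and does not call any search service, so whether a given search box accepts the resulting string is an assumption.

```python
# Hypothetical helpers that only build query text from the operators listed above.

def phrase(text: str) -> str:
    """Wrap text in double quotes so it is treated as a phrase."""
    return f'"{text}"'

def prefix(stem: str) -> str:
    """Append * to request a wildcard match on the keyword ending."""
    return f"{stem}*"

def fuzzy(word: str, distance: int) -> str:
    """Append ~N to request matches within the given edit distance."""
    return f"{word}~{distance}"

def exclude(term: str) -> str:
    """Prefix a term with - to express a NOT operation."""
    return f"-{term}"

def any_of(*terms: str) -> str:
    """Join terms with | (OR) and parenthesize so the group binds first."""
    return "(" + " | ".join(terms) + ")"

def all_of(*terms: str) -> str:
    """Join terms with + (AND, the default operator)."""
    return " + ".join(terms)

query = all_of(
    phrase("protein structure"),
    any_of(prefix("computat"), fuzzy("simulation", 1)),
    exclude("virus"),
)
print(query)
```

Because AND is described as the default, the explicit + signs are optional; they are kept here only to make the grouping visible.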
Found 48 result(s)
The RESID Database of Protein Modifications is a comprehensive collection of annotations and structures for protein modifications including amino-terminal, carboxyl-terminal and peptide chain cross-link post-translational modifications.
Datanator is an integrated database of genomic and biochemical data designed to help investigators find data about specific molecules and reactions in specific organisms and specific environments for meta-analyses and mechanistic models. Datanator currently includes metabolite concentrations, RNA modifications and half-lives, protein abundances and modifications, and reaction kinetics integrated from several databases and numerous publications. The Datanator website and REST API provide tools for extracting clouds of data about specific molecules and reactions in specific organisms and specific environments, as well as data about similar molecules and reactions in taxonomically similar organisms.
VIPERdb is a database for icosahedral virus capsid structures. The emphasis of the resource is on providing data from structural and computational analyses on these systems, as well as high-quality renderings for visual exploration. In addition, all virus capsids are placed in a single icosahedral orientation convention, facilitating comparison between different structures. The web site includes powerful search utilities, links to other relevant databases, background information on virus capsid structure, and useful database interface tools.
DNASU is a central repository for plasmid clones and collections. Currently we store and distribute over 200,000 plasmids including 75,000 human and mouse plasmids, full genome collections, the protein expression plasmids from the Protein Structure Initiative as the PSI:Biology Material Repository (PSI:Biology-MR), and both small and large collections from individual researchers. We are also a founding member and distributor of the ORFeome Collaboration plasmid collection.
CBS offers comprehensive public databases of DNA and protein sequences, macromolecular structure, gene and protein expression levels, pathway organization and cell signalling, which have been established to optimise scientific exploitation of the explosion of data within biology. Unlike many other groups in the field of biomolecular informatics, the Center for Biological Sequence Analysis directs its research primarily towards topics related to the elucidation of the functional aspects of complex biological mechanisms. Among contemporary bioinformatics concerns are reliable computational interpretation of a wide range of experimental data, and the detailed understanding of the molecular apparatus behind cellular mechanisms of sequence information. By exploiting available experimental data and evidence in the design of algorithms, sequence correlations and other features of biological significance can be inferred. In addition to the computational research, the center also has experimental efforts in gene expression analysis using DNA chips and data generation in relation to the physical and structural properties of DNA. In the last decade, the Center for Biological Sequence Analysis has produced a large number of computational methods, which are offered to others via WWW servers.
MassIVE is a community resource developed by the NIH-funded Center for Computational Mass Spectrometry to promote the global, free exchange of mass spectrometry data. MassIVE datasets can be assigned ProteomeXchange accessions to satisfy publication requirements.
The Protein Circular Dichroism Data Bank (PCDDB) provides and accepts circular dichroism spectral data. The PCDDB and its parent organization, the Institute of Structural and Molecular Biology (ISMB), investigate molecular structure using techniques such as biomolecular nuclear magnetic resonance, X-ray crystallography and computational structure prediction, as well as methods for protein production and biological characterization.
The Comparative RNA Web (CRW) Site disseminates information about RNA structure and evolution that has been determined using comparative sequence analysis. We present both raw (sequences, structure models, metadata) and processed (analyses, evolution, accuracy) data, organized into four main sections.
OpenWorm aims to build the first comprehensive computational model of Caenorhabditis elegans (C. elegans), a microscopic roundworm. With only a thousand cells, it solves basic problems such as feeding, mate-finding and predator avoidance. Despite being extremely well studied in biology, this organism still eludes a deep, principled understanding of its biology. We are using a bottom-up approach, aimed at observing the worm's behaviour emerge from a simulation of data derived from scientific experiments carried out over the past decade. To do so we are incorporating the data available in the scientific community into software models. We are engineering Geppetto and Sibernetic, open-source simulation platforms, to be able to run these different models in concert. We are also forging new collaborations with universities and research institutes to collect data that fill in the gaps. All the code we produce in the OpenWorm project is Open Source and available on GitHub.
The Yeast Resource Center provides access to data about mass spectrometry, yeast two-hybrid arrays, deconvolution fluorescence microscopy, protein structure prediction and computational biology. These services are provided to further the goal of a complete understanding of the chemical interactions required for the maintenance and faithful reproduction of a living cell. The observation that the fundamental biological processes of yeast are conserved among all eukaryotes ensures that this knowledge will shape and advance our understanding of living systems.
The Plant Metabolic Network (PMN) provides a broad network of plant metabolic pathway databases that contain curated information from the literature and computational analyses about the genes, enzymes, compounds, reactions, and pathways involved in primary and secondary metabolism in plants. The PMN currently houses one multi-species reference database called PlantCyc and 22 species/taxon-specific databases.
SIDER contains information on marketed medicines and their recorded adverse drug reactions. The information is extracted from public documents and package inserts. The available information includes side effect frequency, drug and side effect classifications as well as links to further information, for example drug–target relations.
The Erythron Database is a resource dedicated to facilitating better understanding of the cellular and molecular underpinnings of mammalian erythropoiesis. The resource is built upon a searchable database of gene expression in murine primitive and definitive erythroid cells at progressive stages of maturation.
OrtholugeDB contains Ortholuge-based orthology predictions for completely sequenced bacterial and archaeal genomes. It is also a resource for reciprocal best BLAST-based ortholog predictions, in-paralog predictions (recently duplicated genes) and ortholog groups in Bacteria and Archaea. The Ortholuge method improves the specificity of high-throughput orthology prediction.
SimTK is a free project-hosting platform for the biomedical computation community that enables researchers to easily share their software, data, and models and provides the infrastructure so they can support and grow a community around their projects. It has over 62,000 members, hosts more than 960 projects from researchers around the world, and has had more than 500,000 files downloaded from it. Individuals have created SimTK projects to meet publisher and funding agencies’ software and data sharing requirements, run scientific challenges, create a collection of their community’s resources, and much more.
Pathway Commons is a convenient point of access to biological pathway information collected from public pathway databases. Information is sourced from public pathway databases and is readily searched, visualized, and downloaded. The data is freely available under the license terms of each contributing database.
FaceBase is a collaborative NIDCR-funded project that houses comprehensive data in support of advancing research into craniofacial development and malformation. It serves as a community resource by curating large datasets of a variety of types from the craniofacial research community and sharing them via this website. Practices emphasize a comprehensive and multidisciplinary approach to understanding the developmental processes that create the face. The data offered spotlight high-throughput genetic, molecular, biological, imaging and computational techniques. One of the missions of this project is to facilitate cooperation and collaboration between the central coordinating center (i.e., the Hub) and the craniofacial research community.
The Database of Protein Disorder (DisProt) is a curated database that provides information about proteins that lack fixed 3D structure in their putatively native states, either in their entirety or in part. DisProt is a community resource annotating protein sequences for intrinsically disordered regions from the literature. It classifies intrinsic disorder based on experimental methods and three ontologies for molecular function, transition and binding partner.
NetPath is currently one of the largest open-source repositories of human signaling pathways and is set to become a community standard to meet the challenges in functional genomics and systems biology. Signaling networks are the key to deciphering many of the complex networks that govern the machinery inside the cell. Several signaling molecules play an important role in disease processes that are a direct result of their altered functioning and are now recognized as potential therapeutic targets. Understanding how to restore the proper functioning of these pathways that have become deregulated in disease is needed for accelerating biomedical research. This resource is aimed at demystifying the biological pathways and highlights the key relationships and connections between them. Apart from this, pathways provide a way of reducing the dimensionality of high-throughput data, by grouping thousands of genes, proteins and metabolites at the functional level into a few hundred pathways for an experiment. Identifying the active pathways that differ between two conditions can have more explanatory power than a simple list of differentially expressed genes and proteins.
TAED is a database of phylogenetically indexed gene families. It contains multiple sequence alignments from MAFFT [1], maximum likelihood phylogenetic trees from PhyML [2], bootstrap values for each node, dN/dS ratios for each lineage from the free ratios model in PAML [3], and labels for each node of speciation or duplication from gene tree/species tree reconciliation using SoftParsMap [4]. The phylogenetic indexing enables simultaneous viewing of lineages with high dN/dS that occurred along the same species tree branches. Resources from the Protein Data Bank (PDB) and the Kyoto Encyclopedia of Genes and Genomes (KEGG) [5] have been incorporated into the TAED analysis to detect substitutions along each branch within the phylogenetic tree and to assess selection within pathways.
BioSimulations is a web application for sharing and re-using biomodels, simulations, and visualizations of simulation results. BioSimulations supports a wide range of modeling frameworks (e.g., kinetic, constraint-based, and logical modeling), model formats (e.g., BNGL, CellML, SBML), and simulation tools (e.g., COPASI, libRoadRunner/tellurium, NFSim, VCell). BioSimulations aims to help researchers discover published models that might be useful for their research and quickly try them via a simple web-based interface.
The Benchmark Energy & Geometry Database (BEGDB) collects results of highly accurate QM calculations of molecular structures, energies and properties. These data can serve as benchmarks for testing and parameterization of other computational methods.