  • * at the end of a keyword allows wildcard searches
  • " quotes can be used for searching phrases
  • + represents an AND search (default)
  • | represents an OR search
  • - represents a NOT operation
  • ( and ) implies priority
  • ~N after a word specifies the desired edit distance (fuzziness)
  • ~N after a phrase specifies the desired slop amount
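The operators listed above can be combined into a single query string. The following is a minimal sketch of how such strings might be assembled. The helper names (`phrase`, `prefix`, `fuzzy`, `slop`, `all_of`, `any_of`, `exclude`) are hypothetical and not part of any API of this site, and the exact spacing the search box accepts around operators is an assumption.

```python
# Hypothetical helpers that build query strings from the operators
# described in the search tips. They only format text; nothing is
# sent to any server.

def phrase(text: str) -> str:
    """Wrap text in double quotes so it is searched as a phrase."""
    return f'"{text}"'

def prefix(stem: str) -> str:
    """Append * so the keyword is treated as a wildcard (prefix) search."""
    return f"{stem}*"

def fuzzy(word: str, distance: int) -> str:
    """Append ~N to a word to request a given edit distance."""
    return f"{word}~{distance}"

def slop(text: str, amount: int) -> str:
    """Append ~N to a quoted phrase to request a given slop amount."""
    return f"{phrase(text)}~{amount}"

def all_of(*terms: str) -> str:
    """Join terms with + (AND, which the tips describe as the default)."""
    return " + ".join(terms)

def any_of(*terms: str) -> str:
    """Join terms with | (OR) and group them with parentheses."""
    return "(" + " | ".join(terms) + ")"

def exclude(term: str) -> str:
    """Prefix a term with - (NOT)."""
    return f"-{term}"

# Combine the pieces: a wildcard keyword AND (a phrase OR a fuzzy word),
# excluding one term.
query = (
    all_of(
        prefix("crystal"),
        any_of(phrase("molecular structure"), fuzzy("protien", 1)),
    )
    + " "
    + exclude("genome")
)
print(query)
```

Parentheses are used here so that the OR group is evaluated before it is combined with the other terms, matching the priority rule in the tips.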
Found 39 result(s)
The Reciprocal Net is a distributed database used by research crystallographers to store information about molecular structures; much of the data is available to the general public. The Reciprocal Net project is still under development. Currently, there are 18 participating crystallography laboratories online. The project is funded by the National Science Foundation (NSF) and is part of the National Science Digital Library. The contents of this collection will come principally from structures contributed by participating crystallography laboratories, thus providing a means for teachers, students, and the general public to connect better with current chemistry research. The Reciprocal Net's emphasis is on obtaining structures of general interest and usefulness to those several classes of digital library users.
The Benchmark Energy & Geometry Database (BEGDB) collects results of highly accurate QM calculations of molecular structures, energies and properties. These data can serve as benchmarks for testing and parameterization of other computational methods.
The Structure database provides three-dimensional structures of macromolecules for a variety of research purposes and allows the user to retrieve structures for specific molecule types as well as structures for genes and proteins of interest. Three main databases comprise Structure: the Molecular Modeling Database; Conserved Domains and Protein Classification; and the BioSystems Database. Structure also links to the PubChem databases to connect biological activity data to the macromolecular structures. Users can locate structural templates for proteins and interactively view structures and sequence data to closely examine sequence-structure relationships.
Established in 1965, the CSD is the world’s repository for small-molecule organic and metal-organic crystal structures. Containing the results of over one million X-ray and neutron diffraction analyses, this unique database of accurate 3D structures has become an essential resource to scientists around the world. The CSD records bibliographic, chemical and crystallographic information for organic molecules and metal-organic compounds whose 3D structures have been determined using X-ray diffraction or neutron diffraction. The CSD records results of single-crystal studies and powder diffraction studies that yield 3D atomic coordinate data for at least all non-H atoms. In some cases the CCDC is unable to obtain coordinates, and incomplete entries are archived to the CSD. The CSD includes crystal structure data arising from publications in the open literature and Private Communications to the CSD (via direct data deposition). The CSD contains directly deposited data that are not available anywhere else, known as CSD Communications.
PQR is an online database of molecular properties predicted from quantum mechanics with integrated capabilities for molecular visualization and data sharing. Based on the number of molecules, PQR is currently the largest open database of molecular quantum calculations. PQR features interactive high-quality rendering of molecular structures and properties on computers, tablets, and cell phones and allows efficient data sharing via digital object identifiers (DOI) and scannable QR barcodes.
nmrshiftdb is an NMR database (web database) for organic structures and their nuclear magnetic resonance (NMR) spectra. It allows for spectrum prediction (13C, 1H and other nuclei) as well as for searching spectra, structures and other properties. Last but not least, it features peer-reviewed submission of datasets by its users. The nmrshiftdb2 software is open source, and the data is published under an open content license. Please consult the documentation for more detailed information. nmrshiftdb2 is the continuation of the NMRShiftDB project with additional data, bugfixes and changes in the software.
The CyberCell database (CCDB) is a comprehensive collection of detailed enzymatic, biological, chemical, genetic, and molecular biological data about E. coli (strain K12, MG1655). It is intended to provide sufficient information and querying capacity for biologists and computer scientists to use computers or detailed mathematical models to simulate all or part of a bacterial cell at a nanoscopic (10^-9 m), mesoscopic (10^-8 m) and microscopic (10^-6 m) level. The CCDB actually consists of 4 browsable databases: 1) the main CyberCell database (CCDB), containing gene and protein information; 2) the 3D structure database (CC3D), containing information for structural proteomics; 3) the RNA database (CCRD), containing tRNA and rRNA information; and 4) the metabolite database (CCMD), containing metabolite information. Each of these databases is accessible through hyperlinked buttons located at the top of the CCDB homepage. All CCDB sub-databases are fully web enabled, permitting a wide variety of interactive browsing, search and display operations.
INTEGRALL is a web-based platform dedicated to compiling information on integrons and designed to organize all the data available for these genetic structures. INTEGRALL provides a public genetic repository for sequence data and nomenclature and offers scientists easy, interactive access to integron DNA sequences, their molecular arrangements and their genetic contexts.
ChEMBL is a database of bioactive drug-like small molecules. It contains 2-D structures, calculated properties (e.g. logP, Molecular Weight, Lipinski Parameters, etc.) and abstracted bioactivities (e.g. binding constants, pharmacology and ADMET data). The data are abstracted and curated from the primary scientific literature, and cover a significant fraction of the SAR and discovery of modern drugs. We attempt to normalise the bioactivities into a uniform set of end-points and units where possible, and also to tag the links between a molecular target and a published assay with a set of varying confidence levels. Additional data on clinical progress of compounds are being integrated into ChEMBL at the current time.
China National GeneBank DataBase (CNGBdb) is a unified platform built for biological big data sharing and application services to the research community. Based on big data and cloud computing technologies, it provides data services such as archiving, analysis, knowledge search, management authorization, and visualization. At present, CNGBdb has integrated large amounts of internal and external molecular data and other information from CNGB, NCBI, EBI, DDBJ, etc., indexed by search, covering 12 data structures. Moreover, CNGBdb correlates living sources, biological samples and bioinformatic data to realize the traceability of comprehensive data.
The European Bioinformatics Institute (EBI) has a long-standing mission to collect, organise and make available databases for biomolecular science. It makes available a collection of databases along with tools to search, download and analyse their content. These databases include DNA and protein sequences and structures, genome annotation, gene expression information, molecular interactions and pathways. Connected to these are linking and descriptive data resources such as protein motifs, ontologies and many others. In many of these efforts, the EBI is a European node in global data-sharing agreements involving, for example, the USA and Japan.
GlyTouCan is the international glycan structure repository. This repository is a freely available, uncurated registry for glycan structures that assigns globally unique accession numbers to any glycan independent of the level of information provided by the experimental method used to identify the structure(s). Any glycan structure, ranging in resolution from monosaccharide composition to fully defined structures, can be registered as long as there are no inconsistencies in the structure.
The CATH database is a hierarchical domain classification of protein structures in the Protein Data Bank. Protein structures are classified using a combination of automated and manual procedures. There are four major levels in the CATH hierarchy: Class, Architecture, Topology and Homologous superfamily.
APID Interactomes is a database that provides a comprehensive collection of protein interactomes for more than 400 organisms based on the integration of known experimentally validated protein-protein physical interactions (PPIs). Construction of the interactomes is done with a methodological approach to report quality levels and coverage over the proteomes for each organism included. In this way, APID provides interactomes from specific organisms that in 25 cases have more than 500 proteins. As a whole, APID includes a comprehensive compendium of 90,379 distinct proteins and 678,441 singular interactions. The analytical and integrative effort done in APID unifies PPIs from primary databases of molecular interactions (BIND, BioGRID, DIP, HPRD, IntAct, MINT) and also from experimentally resolved 3D structures (PDB) where more than two distinct proteins have been identified. In this way, 8,388 structures have been analyzed to find specific protein-protein interactions reported with details of their molecular interfaces. APID also includes a new data visualization web-tool that allows the construction of sub-interactomes using query lists of proteins of interest and the visual exploration of the corresponding networks, including an interactive selection of the properties of the interactions (i.e. the reliability of the "edges" in the network) and an interactive mapping of the functional environment of the proteins (i.e. the functional annotations of the "nodes" in the network).
MetaboLights is a database for Metabolomics experiments and derived information. The database is cross-species, cross-technique and covers metabolite structures and their reference spectra as well as their biological roles, locations and concentrations, and experimental data from metabolic experiments.
>>>!!!<<< CrystalEye has now been integrated into the Crystallography Open Database at >>>!!!<<< The Crystallography Open Database now includes data and software from CrystalEye, developed by Nick Day at the Department of Chemistry, University of Cambridge, under the supervision of Peter Murray-Rust. The aim of the CrystalEye project is to aggregate crystallography from web resources, and to provide methods to easily browse, search, and keep up to date with the latest published information. At present we are aggregating the crystallography from the supplementary data to articles at publishers' websites.
PDBj (Protein Data Bank Japan) provides a centralized PDB archive of macromolecular structures, integrated tools for data retrieval, visualization, and functional characterization. PDBj is supported by JST-NBDC and Osaka University.
The aim of the present volume is the compilation of experimental data. The Tables of energy levels are presented in a way similar to the "Atomic Energy levels the Rare Earth Elements", and incorporate additional data: isotope shifts and hyperfine structures. For each spectrum, they are separated into two lists of odd and even levels, the parity of the ground level being given first.
The Materials Project produces one of the world's foremost databases of computed information about inorganic, crystalline materials, along with powerful web-based apps for analyzing this information to aid the design of novel materials. Access is provided free of charge, with an API available, under a permissive license.
PDBe is the European resource for the collection, organisation and dissemination of data on biological macromolecular structures. In collaboration with the other worldwide Protein Data Bank (wwPDB) partners - the Research Collaboratory for Structural Bioinformatics (RCSB) and BioMagResBank (BMRB) in the USA and the Protein Data Bank of Japan (PDBj) - we work to collate, maintain and provide access to the global repository of macromolecular structure data. We develop tools, services and resources to make structure-related data more accessible to the biomedical community.
The aim of the EPPO Global Database is to provide a single portal for all pest-specific information that has been produced or collected by EPPO. The full database is available via the Internet, but when no Internet connection is available a subset of the database called ‘EPPO GD Desktop’ can be run as software (now replacing PQR).
Cryo electron microscopy enables the determination of 3D structures of macromolecular complexes and cells from 2 to 100 Å resolution. EMDataResource is the unified global portal for one-stop deposition and retrieval of 3DEM density maps, atomic models and associated metadata, and is a joint effort among investigators of the Stanford/SLAC CryoEM Facility and the Research Collaboratory for Structural Bioinformatics (RCSB) at Rutgers, in collaboration with the EMDB team at the European Bioinformatics Institute. EMDataResource also serves as a resource for news, events, software tools, data standards, and validation methods for the 3DEM community. The major goal of the EMDataResource project in the current funding period is to work with the 3DEM community to (1) establish data-validation methods that can be used in the process of structure determination, (2) define the key indicators of a well-determined structure that should accompany every deposition, and (3) implement appropriate validation procedures for maps and map-derived models into a 3DEM validation pipeline.
mzCloud is an extensively curated database of high-resolution tandem mass spectra that are arranged into spectral trees. MS/MS and multi-stage MSn spectra were acquired at various collision energies, precursor m/z, and isolation widths using collision-induced dissociation (CID) and higher-energy collisional dissociation (HCD). Each raw mass spectrum was filtered and recalibrated, giving rise to additional filtered and recalibrated spectral trees that are fully searchable. Besides the experimental and processed data, each database record contains the compound name with synonyms, the chemical structure, computationally and manually annotated fragments (peaks), identified adducts and multiply charged ions, molecular formulas, predicted precursor structures, detailed experimental information, peak accuracies, mass resolution, InChI, InChIKey, and other identifiers. mzCloud is a fully searchable library that allows spectra searches, tree searches, structure and substructure searches, monoisotopic mass searches, peak (m/z) searches, precursor searches, and name searches. mzCloud is free and available for public use online.