Reset all


Content Types


AID systems



Data access

Data access restrictions

Database access

Database access restrictions

Database licenses

Data licenses

Data upload

Data upload restrictions

Enhanced publication

Institution responsibility type

Institution type


Metadata standards

PID systems

Provider types

Quality management

Repository languages



Repository types


  • * at the end of a keyword allows wildcard searches
  • " quotes can be used for searching phrases
  • + represents an AND search (default)
  • | represents an OR search
  • - represents a NOT operation
  • ( and ) implies priority
  • ~N after a word specifies the desired edit distance (fuzziness)
  • ~N after a phrase specifies the desired slop amount
Found 42 result(s)
BindingDB is a public, web-accessible database of measured binding affinities, focusing chiefly on the interactions of proteins considered to be candidate drug-targets with ligands that are small, drug-like molecules. BindingDB supports medicinal chemistry and drug discovery via literature awareness and development of structure-activity relations (SAR and QSAR); validation of computational chemistry and molecular modeling approaches such as docking, scoring and free energy methods; chemical biology and chemical genomics; and basic studies of the physical chemistry of molecular recognition. BindingDB also includes a small collection of host-guest binding data of interest to chemists studying supramolecular systems. The data collection derives from a variety of measurement techniques, including enzyme inhibition and kinetics, isothermal titration calorimetry, NMR, and radioligand and competition assays. BindingDB includes data extracted from the literature by the BindingDB project, selected PubChem confirmatory BioAssays, and ChEMBL entries for which a well defined protein target ("TARGET_TYPE='PROTEIN'") is provided. Data extracted by BindingDB typically includes more details regarding experimental conditions, etc
eCrystals - Southampton is the archive for Crystal Structures generated by the Southampton Chemical Crystallography Group and the EPSRC UK National Crystallography Service.
The DOE Data Explorer (DDE) is an information tool to help you locate DOE's collections of data and non-text information and, at the same time, retrieve individual datasets within some of those collections. It includes collection citations prepared by the Office of Scientific and Technical Information, as well as citations for individual datasets submitted from DOE Data Centers and other organizations.
Pubchem contains 3 databases. 1. PubChem BioAssay: The PubChem BioAssay Database contains bioactivity screens of chemical substances described in PubChem Substance. It provides searchable descriptions of each bioassay, including descriptions of the conditions and readouts specific to that screening procedure. 2. PubChem Compound: The PubChem Compound Database contains validated chemical depiction information provided to describe substances in PubChem Substance. Structures stored within PubChem Compounds are pre-clustered and cross-referenced by identity and similarity groups. 3. PubChem Substance. The PubChem Substance Database contains descriptions of samples, from a variety of sources, and links to biological screening results that are available in PubChem BioAssay. If the chemical contents of a sample are known, the description includes links to PubChem Compound.
The Benchmark Energy & Geometry Database (BEGDB) collects results of highly accurate QM calculations of molecular structures, energies and properties. These data can serve as benchmarks for testing and parameterization of other computational methods.
UsefulChem is an Open Notebook Science project in chemistry led by the Bradley Laboratory at Drexel University. The main project currently involves the synthesis of novel anti-malarial compounds. The work is done under Open Notebook Science conditions with the actual detailed lab notebook.
The PeptideAtlas validates expressed proteins to provide eukaryotic genome data. Peptide Atlas provides data to advance biological discoveries in humans. The PeptideAtlas accepts proteomic data from high-throughput processes and encourages data submission.
EBI's CSA contains data documenting enzyme active sites and catalytic residues in enzymes of 3D structure. Entries in CSA may be original hand-annotated entries from primary literature or homologous entries found by PSI-BLAST alignment.
Edmond is the institutional repository of the Max Planck Society for public research data. It enables Max Planck scientists to create citable scientific assets by describing, enriching, sharing, exposing, linking, publishing and archiving research data of all kinds. A unique feature of Edmond is the dedicated metadata management, which supports a non-restrictive metadata schema definition, as simple as you like or as complex as your parameters require. Further on, all objects within Edmond have a unique identifier and therefore can be clearly referenced in publications or reused in other contexts.
jPOSTrepo (Japan ProteOme STandard Repository) is a repository of sharing MS raw/processed data. It consists of a high-speed file upload process, flexible file management system and easy-to-use interfaces. Users can release their "raw/processed" data via this site with a unique identifier number for the paper publication. Users also can suspend (or "embargo") their data until their paper is published. The file transfer from users’ computer to our repository server is very fast (roughly ten times faster than usual file transfer) and uses only web browsers – it does not require installing any additional software.
The PRIDE PRoteomics IDEntifications database is a centralized, standards compliant, public data repository for proteomics data, including protein and peptide identifications, post-translational modifications and supporting spectral evidence. PRIDE encourages and welcomes direct user submissions of mass spectrometry data to be published in peer-reviewed publications.
Database of mass spectra of known, unknown and provisionally identified substances. MassBank is the first public repository of mass spectral data for sharing them among scientific research community. MassBank data are useful for the chemical identification and structure elucidation of chemical compounds detected by mass spectrometry.
TOXNET (TOXicology Data NETwork) is a group of databases covering chemicals and drugs, diseases and the environment, environmental health, occupational safety and health, poisoning, risk assessment and regulations, and toxicology. Information in the TOXNET databases covers: Toxicology data: CCRIS (Chemical Carcinogenesis Research Information System), CPDB (Carcinogenic Potency Database), CTD (Comparative Toxicogenomics Database), GENE-TOX (Genetic Toxicology), HSDB® (Hazardous Substances Data Bank), Haz-Map®, Household Products Database, IRIS (Integrated Risk Information System), ITER (International Toxicity Estimates for Risk), LactMed® (Drugs and Lactation), TRI (Toxics Release Inventory), TOXMAP®, ; Chemical nomenclature: ChemIDplus; Toxicology literature: TOXLINE®, DART® (Developmental and Reproductive Toxicology Database).
The Protein Circular Dichroism Data Bank (PCDDB) provides and accepts a circular dichroism spectra data. The PCDDB and it's parent organization, the Institute of Structural and Molecular Biology (ISMB), investigate molecular structure using techniques such as biomolecular nuclear magnetic resonance, X-ray crystallography and computational structure prediction, as methods for protein production and biological characterization.
The Human Metabolome Database (HMDB) is a freely available electronic database containing detailed information about small molecule metabolites found in the human body. It is intended to be used for applications in metabolomics, clinical chemistry, biomarker discovery and general education.
The ProteomeXchange consortium has been set up to provide a single point of submission of MS proteomics data to the main existing proteomics repositories, and to encourage the data exchange between them for optimal data dissemination. Current members accepting submissions are: The PRIDE PRoteomics IDEntifications database at the European Bioinformatics Institute focusing mainly on shotgun mass spectrometry proteomics data PeptideAtlas/PASSEL focusing on SRM/MRM datasets.
PDBj (Protein Data Bank Japan) provides a centralized PDB archive of macromolecular structures, integrated tools for data retrieval, visualization, and functional characterization. PDBj is supported by JST-NBDC and Osaka University.
The CyberCell database (CCDB) is a comprehensive collection of detailed enzymatic, biological, chemical, genetic, and molecular biological data about E. coli (strain K12, MG1655). It is intended to provide sufficient information and querying capacity for biologists and computer scientists to use computers or detailed mathematical models to simulate all or part of a bacterial cell at a nanoscopic (10-9 m), mesoscopic (10-8 m).The CyberCell database CCDB actually consists of 4 browsable databases: 1) the main CyberCell database (CCDB - containing gene and protein information), 2) the 3D structure database (CC3D – containing information for structural proteomics), 3) the RNA database (CCRD – containing tRNA and rRNA information), and 4) the metabolite database (CCMD – containing metabolite information). Each of these databases is accessible through hyperlinked buttons located at the top of the CCDB homepage. All CCDB sub-databases are fully web enabled, permitting a wide variety of interactive browsing, search and display operations. and microscopic (10-6 m) level.
MassBank of North America (MoNA) is a metadata-centric, auto-curating repository designed for efficient storage and querying of mass spectral records. It intends to serve as a the framework for a centralized, collaborative database of metabolite mass spectra, metadata and associated compounds. MoNA currently contains over 200,000 mass spectral records from experimental and in-silico libraries as well as from user contributions.
The Protein Data Bank (PDB) is an archive of experimentally determined three-dimensional structures of biological macromolecules that serves a global community of researchers, educators, and students. The data contained in the archive include atomic coordinates, crystallographic structure factors and NMR experimental data. Aside from coordinates, each deposition also includes the names of molecules, primary and secondary structure information, sequence database references, where appropriate, and ligand and biological assembly information, details about data collection and structure solution, and bibliographic citations. The Worldwide Protein Data Bank (wwPDB) consists of organizations that act as deposition, data processing and distribution centers for PDB data. Members are: RCSB PDB (USA), PDBe (Europe) and PDBj (Japan), and BMRB (USA). The wwPDB's mission is to maintain a single PDB archive of macromolecular structural data that is freely and publicly available to the global community.
Rhea is a freely available and comprehensive resource of expert-curated biochemical reactions. It has been designed to provide a non-redundant set of chemical transformations for applications such as the functional annotation of enzymes, pathway inference and metabolic network reconstruction. There are three types of reaction participants (reactants and products): Small molecules, Rhea polymers, Generic compounds. All three types of reaction participants are linked to the ChEBI database (Chemical Entities of Biological Interest) which provides detailed information about structure, formula and charge. Rhea provides built-in validations that ensure both mass and charge balance of the reactions. We have populated the database with the reactions found in the enzyme classification (i.e. in the IntEnz and ENZYME databases), extending it with additional known reactions of biological interest. While the main focus of Rhea is enzyme-catalysed reactions, other biochemical reactions (including those that are often termed "spontaneous") also are included.
With the creation of the Metabolomics Data Repository managed by Data Repository and Coordination Center (DRCC), the NIH acknowledges the importance of data sharing for metabolomics. Metabolomics represents the systematic study of low molecular weight molecules found in a biological sample, providing a "snapshot" of the current and actual state of the cell or organism at a specific point in time. Thus, the metabolome represents the functional activity of biological systems. As with other ‘omics’, metabolites are conserved across animals, plants and microbial species, facilitating the extrapolation of research findings in laboratory animals to humans. Common technologies for measuring the metabolome include mass spectrometry (MS) and nuclear magnetic resonance spectroscopy (NMR), which can measure hundreds to thousands of unique chemical entities. Data sharing in metabolomics will include primary raw data and the biological and analytical meta-data necessary to interpret these data. Through cooperation between investigators, metabolomics laboratories and data coordinating centers, these data sets should provide a rich resource for the research community to enhance preclinical, clinical and translational research.
The 1000 Genomes Project is an international collaboration to produce an extensive public catalog of human genetic variation, including SNPs and structural variants, and their haplotype contexts. This resource will support genome-wide association studies and other medical research studies. The genomes of about 2500 unidentified people from about 25 populations around the world will be sequenced using next-generation sequencing technologies. The results of the study will be freely and publicly accessible to researchers worldwide.