  • * at the end of a keyword allows wildcard searches
  • " quotes can be used for searching phrases
  • + represents an AND search (default)
  • | represents an OR search
  • - represents a NOT operation
  • ( and ) imply priority
  • ~N after a word specifies the desired edit distance (fuzziness)
  • ~N after a phrase specifies the desired slop amount
Found 61 result(s)
PubChem contains three databases. 1. PubChem BioAssay: The PubChem BioAssay Database contains bioactivity screens of chemical substances described in PubChem Substance. It provides searchable descriptions of each bioassay, including descriptions of the conditions and readouts specific to that screening procedure. 2. PubChem Compound: The PubChem Compound Database contains validated chemical depiction information provided to describe substances in PubChem Substance. Structures stored within PubChem Compound are pre-clustered and cross-referenced by identity and similarity groups. 3. PubChem Substance: The PubChem Substance Database contains descriptions of samples, from a variety of sources, and links to biological screening results that are available in PubChem BioAssay. If the chemical contents of a sample are known, the description includes links to PubChem Compound.
The Space Physics Data Facility (SPDF) leads in the design and implementation of unique multi-mission and multi-disciplinary data services and software to strategically advance NASA's solar-terrestrial program, to extend our scientific understanding of the structure, physics and dynamics of the Heliosphere of our Sun and to support the science missions of NASA's Heliophysics Great Observatory. Major SPDF efforts include multi-mission data services such as the Heliophysics Data Portal (formerly VSPO), CDAWeb and CDAWeb Inside IDL, and OMNIWeb Plus (including COHOWeb, ATMOWeb, HelioWeb and CGM); science planning and orbit services such as SSCWeb; data tools such as the CDF software and tools; and a range of other science and technology research efforts. The staff supporting SPDF includes scientists and information technology experts.
A collection of data at the Agency for Healthcare Research and Quality (AHRQ) supporting research that helps people make more informed decisions and improves the quality of health care services. The portal contains the U.S. Health Information Knowledgebase (USHIK), the Systematic Review Data Repository (SRDR) and other sources concerning cost, quality, accessibility and evaluation of healthcare and medical insurance.
The European Bioinformatics Institute (EBI) has a long-standing mission to collect, organise and make available databases for biomolecular science. It makes available a collection of databases along with tools to search, download and analyse their content. These databases include DNA and protein sequences and structures, genome annotation, gene expression information, molecular interactions and pathways. Connected to these are linking and descriptive data resources such as protein motifs, ontologies and many others. In many of these efforts, the EBI is a European node in global data-sharing agreements involving, for example, the USA and Japan.
Gramene is a platform for comparative genomic analysis of agriculturally important grasses, including maize, rice, sorghum, wheat and barley. Relationships between cereals are queried and displayed using controlled vocabularies (Gene, Plant, Trait, Environment, and Gramene Taxonomy) and web-based displays, including the Genes and Quantitative Trait Loci (QTL) modules.
The Global Hydrology Resource Center (GHRC) provides both historical and current Earth science data, information, and products from satellite, airborne, and surface-based instruments. GHRC acquires basic data streams and produces derived products from many instruments spread across a variety of instrument platforms.
The Northern California Earthquake Data Center (NCEDC) is a permanent archive and distribution center primarily for multiple types of digital data relating to earthquakes in central and northern California. The NCEDC is located at the Berkeley Seismological Laboratory, and has been accessible to users via the Internet since mid-1992. The NCEDC was formed as a joint project of the Berkeley Seismological Laboratory (BSL) and the U.S. Geological Survey (USGS) at Menlo Park in 1991, and current USGS funding is provided under a cooperative agreement for seismic network operations.
The UniProtKB Sequence/Annotation Version Archive (UniSave) has the mission of providing the scientific community with a free repository containing every version of every Swiss-Prot/TrEMBL entry in the UniProt Knowledgebase (UniProtKB). This is achieved by archiving, at every release, the entry versions within the current release. The primary usage of this service is to provide open access to all entry versions of all entries. In addition to viewing their content, one can also filter, download and compare versions.
The Expression Atlas provides information on gene expression patterns under different biological conditions such as a gene knock out, a plant treated with a compound, or in a particular organism part or cell. It includes both microarray and RNA-seq data. The data is re-analysed in-house to detect interesting expression patterns under the conditions of the original experiment. There are two components to the Expression Atlas, the Baseline Atlas and the Differential Atlas. The Baseline Atlas displays information about which gene products are present (and at what abundance) in "normal" conditions (e.g. tissue, cell type). It aims to answer questions such as "which genes are specifically expressed in human kidney?". This component of the Expression Atlas consists of highly curated and quality-checked RNA-seq experiments from ArrayExpress. It has data for many different animal and plant species. New experiments are added as they become available. The Differential Atlas allows users to identify genes that are up- or down-regulated in a wide variety of different experimental conditions such as yeast mutants, cadmium-treated plants, cystic fibrosis or the effect on gene expression of mind-body practice. Both microarray and RNA-seq experiments are included in the Differential Atlas. Experiments are selected from ArrayExpress and groups of samples are manually identified for comparison, e.g. those with wild type genotype compared to those with a gene knock out. Each experiment is processed through our in-house differential expression statistical analysis pipeline to identify genes with a high probability of differential expression.
The CiardRING is a global directory of web-based information services and datasets for agricultural research for development (ARD). It is the principal tool created through the CIARD initiative to allow information providers to register their services and datasets in various categories and so facilitate the discovery of sources of agriculture-related information across the world. The RING aims to provide an infrastructure to improve the accessibility of the outputs of agricultural research and of information relevant to agriculture.
The ENCODE Encyclopedia organizes the most salient analysis products into annotations, and provides tools to search and visualize them. The Encyclopedia has two levels of annotations. Integrative-level annotations integrate multiple types of experimental data and ground-level annotations. Ground-level annotations are derived directly from the experimental data, typically produced by uniform processing pipelines.
The BGS is a data-rich organisation with over 400 datasets in its care, including environmental monitoring data, digital databases, physical collections (borehole core, rocks, minerals and fossils), records and archives. Our data is managed by the National Geoscience Data Centre.
The Rat Genome Database is a collaborative effort between leading research institutions involved in rat genetic and genomic research. Its goal, as stated in RFA: HL-99-013, is the establishment of a Rat Genome Database to collect, consolidate, and integrate data generated from ongoing rat genetic and genomic research efforts and make these data widely available to the scientific community. A secondary but critical goal is to provide curation of mapped positions for quantitative trait loci, known mutations and other phenotypic data.
Launched in 2000, WormBase is an international consortium of biologists and computer scientists dedicated to providing the research community with accurate, current, accessible information concerning the genetics, genomics and biology of C. elegans and some related nematodes. In addition to their curation work, all sites have ongoing programs in bioinformatics research to develop the next generations of WormBase structure, content and accessibility.
The DIP database catalogs experimentally determined interactions between proteins. It combines information from a variety of sources to create a single, consistent set of protein-protein interactions. The data stored within the DIP database were curated both manually by expert curators and automatically using computational approaches that utilize knowledge about the protein-protein interaction networks extracted from the most reliable, core subset of the DIP data. Please check the reference page to find articles describing the DIP database in greater detail. The Database of Ligand-Receptor Partners (DLRP) is a subset of DIP (Database of Interacting Proteins). The DLRP is a database of protein ligand and protein receptor pairs that are known to interact with each other. By interact we mean that the ligand and receptor are members of a ligand-receptor complex and, unless otherwise noted, transduce a signal. In some instances the ligand and/or receptor may form a heterocomplex with other ligands/receptors in order to be functional. We have entered the majority of interactions in DLRP as full DIP entries, with links to references and additional information.
The PRIDE (PRoteomics IDEntifications) database is a centralized, standards-compliant, public data repository for proteomics data, including protein and peptide identifications, post-translational modifications and supporting spectral evidence. PRIDE encourages and welcomes direct user submissions of mass spectrometry data to be published in peer-reviewed publications.
The MG-RAST server is an open source system for annotation and comparative analysis of metagenomes. Users can upload raw sequence data in FASTA format; the sequences will be normalized and processed and summaries automatically generated. The server provides several methods to access the different data types, including phylogenetic and metabolic reconstructions, and the ability to compare the metabolism and annotations of one or more metagenomes and genomes. In addition, the server offers a comprehensive search capability. Access to the data is password protected, and all data generated by the automated pipeline is available for download in a variety of common formats. MG-RAST has become an unofficial repository for metagenomic data, providing a means to make your data public so that it is available for download and viewing of the analysis without registration, as well as a static link that you can use in publications. It also requires that you include experimental metadata about your sample when it is made public to increase the usefulness to the community.
CDAAC is responsible for processing the science data received from COSMIC. The data are currently processed shortly after they are received: approximately eighty percent of radio occultation profiles are delivered to operational weather centers within 3 hours of observation. The data are also provided in a more accurate post-processed mode (within 8 weeks of observation).
As one of the cornerstones of the U.S. Geological Survey's (USGS) National Geospatial Program, The National Map is a collaborative effort among the USGS and other Federal, State, and local partners to improve and deliver topographic information for the Nation. It has many uses ranging from recreation to scientific analysis to emergency response. The National Map is easily accessible for display on the Web, as products and services, and as downloadable data. The geographic information available from The National Map includes orthoimagery (aerial photographs), elevation, geographic names, hydrography, boundaries, transportation, structures, and land cover. Other types of geographic information can be added within the viewer or brought in with The National Map data into a Geographic Information System to create specific types of maps or map views.
VectorBase provides data on arthropod vectors of human pathogens. Sequence data, gene expression data, images, population data, and insecticide resistance data for arthropod vectors are available for download. VectorBase also offers a genome browser, a gene expression and microarray repository, and BLAST searches for all VectorBase genomes. VectorBase genomes include Aedes aegypti, Anopheles gambiae, Culex quinquefasciatus, Ixodes scapularis, Pediculus humanus, and Rhodnius prolixus. VectorBase is one of the Bioinformatics Resource Centers (BRC) projects funded by the National Institute of Allergy and Infectious Diseases (NIAID).
The CDAWeb data system enables improved display and coordinated analysis of multi-instrument, multimission data bases of the kind whose analysis is critical to meeting the science objectives of the ISTP program and the InterAgency Consultative Group (IACG) Solar-Terrestrial Science Initiative. The system combines the client-server user interface technology of the World Wide Web with a powerful set of customized IDL routines to leverage the data format standards (CDF) and guidelines for implementation adopted by ISTP and the IACG. The system can be used with any collection of data granules following the extended set of ISTP/IACG standards. CDAWeb is being used both to support coordinated analysis of public and proprietary data and to provide better functional access to specific public data such as the ISTP-precursor CDAW 9 data base that is formatted to the ISTP/IACG standards. Many data sets are available through the Coordinated Data Analysis Web (CDAWeb) service and the data coverage continues to grow. These are largely, but not exclusively, magnetospheric data and nearby solar wind data of the ISTP era (1992-present) at time resolutions of approximately a minute. The CDAWeb service provides graphical browsing, data subsetting, screen listings, file creations and downloads (ASCII or CDF). Public data from current (1992-present) space physics missions (including Cluster, IMAGE, ISTP, FAST, IMP-8, SAMPEX and others). Public data from missions before 1992 (including IMP-8, ISIS1/2, Alouette2, Hawkeye and others). Public data from all current and past space physics missions. CDAWeb is part of the "Space Physics Data Facility" (
ISRIC - World Soil Information is an independent foundation. As a regular member of the ICSU World Data System it is also known as the World Data Centre for Soils (WDC-Soils). ISRIC was founded in 1966 through the International Soil Science Society (ISSS) and the United Nations Educational, Scientific and Cultural Organization (UNESCO). It has a mission to serve the international community with information about the world’s soil resources to help address major global issues. Our work is organised according to four work streams: 1) Setting standards and references, 2) Soil information provision (databases & soil mapping), 3) Capacity building and advocacy, and 4) Generation of derived products.
Ensembl Plants is a genome-centric portal for plant species. It is developed in coordination with other plant genomics and bioinformatics groups via the EBI's role in the transPLANT consortium.
The Protein Data Bank (PDB) archive is the single worldwide repository of information about the 3D structures of large biological molecules, including proteins and nucleic acids. These are the molecules of life that are found in all organisms including bacteria, yeast, plants, flies, other animals, and humans. Understanding the shape of a molecule helps to understand how it works. This knowledge can be used to help deduce a structure's role in human health and disease, and in drug development. The structures in the archive range from tiny proteins and bits of DNA to complex molecular machines like the ribosome.