  • * at the end of a keyword allows wildcard searches
  • " quotes can be used for searching phrases
  • + represents an AND search (default)
  • | represents an OR search
  • - represents a NOT operation
  • ( and ) implies priority
  • ~N after a word specifies the desired edit distance (fuzziness)
  • ~N after a phrase specifies the desired slop amount
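
As a rough illustration of how the operators listed above combine, the sketch below assembles a query string from small helper functions. The helper names (`phrase`, `prefix`, `fuzzy`, `any_of`, `all_of`, `exclude`) and the example search terms are hypothetical and only mirror the syntax described in the list; they are not part of any search API.

```python
# Minimal sketch: compose a query string from the operators described above.
# All helper names and search terms are illustrative assumptions.

def phrase(text: str) -> str:
    """Wrap text in double quotes so it is searched as a phrase."""
    return f'"{text}"'

def prefix(stem: str) -> str:
    """Append * so the keyword acts as a wildcard search."""
    return f"{stem}*"

def fuzzy(word: str, distance: int) -> str:
    """Append ~N to request an edit distance (fuzziness) of N."""
    return f"{word}~{distance}"

def any_of(*terms: str) -> str:
    """Join terms with | (OR) and group them with parentheses for priority."""
    return "(" + " | ".join(terms) + ")"

def all_of(*terms: str) -> str:
    """Join terms with spaces; AND is stated to be the default operator."""
    return " ".join(terms)

def exclude(term: str) -> str:
    """Prefix a term with - to express NOT."""
    return f"-{term}"

query = all_of(
    prefix("genom"),
    any_of(phrase("protein sequence"), fuzzy("proteome", 1)),
    exclude("plant"),
)
print(query)
```

Under these assumptions the printed string can be pasted into the search box as-is; the parentheses keep the OR group from absorbing the neighbouring terms.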
Found 177 result(s)
The Database of Genomic Variants archive provides curated archiving and distribution of publicly available genomic structural variants. Direct submissions are accepted as well as published data. The DGVa is the primary supplier of data to the Database of Genomic Variants (DGV) (hosted by The Centre for Applied Genomics in Toronto, Canada).
UniProtKB/Swiss-Prot is the manually annotated and reviewed section of the UniProt Knowledgebase (UniProtKB). It is a high-quality, annotated, non-redundant protein sequence database that brings together experimental results, computed features and scientific conclusions. Since 2002, it has been maintained by the UniProt consortium and is accessible via the UniProt website.
China National GeneBank DataBase (CNGBdb) is a unified platform built for biological big data sharing and application services to the research community. Based on big data and cloud computing technologies, it provides data services such as archiving, analysis, knowledge search, management authorization, and visualization. At present, CNGBdb has integrated large amounts of internal and external molecular data and other information from CNGB, NCBI, EBI, DDBJ, etc., indexed by search, covering 12 data structures. Moreover, CNGBdb correlates living sources, biological samples and bioinformatic data to make comprehensive data traceable.
The Expression Atlas provides information on gene expression patterns under different biological conditions such as a gene knock out, a plant treated with a compound, or in a particular organism part or cell. It includes both microarray and RNA-seq data. The data is re-analysed in-house to detect interesting expression patterns under the conditions of the original experiment. There are two components to the Expression Atlas, the Baseline Atlas and the Differential Atlas. The Baseline Atlas displays information about which gene products are present (and at what abundance) in "normal" conditions (e.g. tissue, cell type). It aims to answer questions such as "which genes are specifically expressed in human kidney?". This component of the Expression Atlas consists of highly-curated and quality-checked RNA-seq experiments from ArrayExpress. It has data for many different animal and plant species. New experiments are added as they become available. The Differential Atlas allows users to identify genes that are up- or down-regulated in a wide variety of different experimental conditions such as yeast mutants, cadmium treated plants, cystic fibrosis or the effect on gene expression of mind-body practice. Both microarray and RNA-seq experiments are included in the Differential Atlas. Experiments are selected from ArrayExpress and groups of samples are manually identified for comparison e.g. those with wild type genotype compared to those with a gene knock out. Each experiment is processed through our in-house differential expression statistical analysis pipeline to identify genes with a high probability of differential expression.
IntAct provides a freely available, open source database system and analysis tools for molecular interaction data. All interactions are derived from literature curation or direct user submissions and are freely available.
The Department of Energy (DOE) Joint Genome Institute (JGI) is a national user facility with massive-scale DNA sequencing and analysis capabilities dedicated to advancing genomics for bioenergy and environmental applications. Beyond generating tens of trillions of DNA bases annually, the Institute develops and maintains data management systems and specialized analytical capabilities to manage and interpret complex genomic data sets, and to enable an expanding community of users around the world to analyze these data in different contexts over the web. The JGI Genome Portal provides a unified access point to all JGI genomic databases and analytical tools. A user can find all DOE JGI sequencing projects and their status, search for and download assemblies and annotations of sequenced genomes, and interactively explore those genomes and compare them with other sequenced microbes, fungi, plants or metagenomes using specialized systems tailored to each particular class of organisms. Databases: Genome Online Database (GOLD), Integrated Microbial Genomes (IMG), MycoCosm, Phytozome.
PubChem contains 3 databases. 1. PubChem BioAssay: The PubChem BioAssay Database contains bioactivity screens of chemical substances described in PubChem Substance. It provides searchable descriptions of each bioassay, including descriptions of the conditions and readouts specific to that screening procedure. 2. PubChem Compound: The PubChem Compound Database contains validated chemical depiction information provided to describe substances in PubChem Substance. Structures stored within PubChem Compound are pre-clustered and cross-referenced by identity and similarity groups. 3. PubChem Substance: The PubChem Substance Database contains descriptions of samples, from a variety of sources, and links to biological screening results that are available in PubChem BioAssay. If the chemical contents of a sample are known, the description includes links to PubChem Compound.
This is the host website of the Center for Invasive Species and Ecosystem Health at the University of Georgia (formerly the Bugwood Network). The Center aims to develop, consolidate and disseminate information and programmes focused on invasive species, forest health, natural resources and agricultural management through technology development, programme implementation, training, applied research and public awareness at state, regional, national and international levels. The site gives details of its products (Bugwood Image Database, Early Detection and Distribution Mapping, and Bugwoodwiki) as well as its projects, services and personnel. Users can also access image databases on Forestry, Insects, IPM, Invasive Species, Forest Pests, weed and Bark Beetle.
The dbMHC database provides an open, publicly accessible platform for DNA and clinical data related to the human Major Histocompatibility Complex (MHC). The dbMHC provides access to human leukocyte antigen (HLA) sequences, HLA allele and haplotype frequencies, and clinical datasets.
AVISO stands for "Archiving, Validation and Interpretation of Satellite Oceanographic data". Here, you will find data, articles, news and tools to help you discover or improve your skills in the altimetry domain through four key themes: ocean, coast, hydrology and ice. Altimetry is a technique for measuring height. Satellite altimetry measures the time taken by a radar pulse to travel from the satellite antenna to the surface and back to the satellite receiver. Combined with precise satellite location data, altimetry measurements yield sea-surface heights.
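
The measurement principle described above can be sketched in a few lines. This is a simplified illustration only: it assumes the pulse travels at the vacuum speed of light and ignores the atmospheric, sea-state and instrument corrections that real altimetry processing applies. The function names and the example altitude and range are hypothetical.

```python
# Simplified sketch of satellite altimetry: range from the two-way travel
# time of a radar pulse, then sea-surface height from the satellite altitude.
# Assumption: vacuum speed of light, no atmospheric or instrument corrections.

SPEED_OF_LIGHT_M_S = 299_792_458.0

def range_from_round_trip(travel_time_s: float) -> float:
    """One-way distance in metres; the pulse travels to the surface and back."""
    return SPEED_OF_LIGHT_M_S * travel_time_s / 2.0

def sea_surface_height(satellite_altitude_m: float, travel_time_s: float) -> float:
    """Surface height above the reference used for the satellite altitude."""
    return satellite_altitude_m - range_from_round_trip(travel_time_s)

# Hypothetical numbers: a satellite 1,336,000 m above the reference surface
# and a pulse whose one-way range is 1,335,970 m.
example_altitude_m = 1_336_000.0
example_range_m = 1_335_970.0
example_travel_time_s = 2.0 * example_range_m / SPEED_OF_LIGHT_M_S

print(sea_surface_height(example_altitude_m, example_travel_time_s))
```

The satellite altitude here stands in for the "precise satellite location data" mentioned above; in practice it comes from orbit determination and is expressed relative to a reference ellipsoid.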
SilkDB is an integrated genome resource database for the silkworm, Bombyx mori. It provides access not only to genomic data, including functional annotation of genes, gene products and chromosomal mapping, but also to extensive biological information such as microarray expression data, ESTs and corresponding references. SilkDB will be useful for the silkworm research community as well as for comparative genomics.
AceView provides a curated, comprehensive and non-redundant sequence representation of all public mRNA sequences (mRNAs from GenBank or RefSeq, and single pass cDNA sequences from dbEST and Trace). These experimental cDNA sequences are first co-aligned on the genome then clustered into a minimal number of alternative transcript variants and grouped into genes. Using the available cDNA sequences exhaustively and to high quality standards reveals the beauty and complexity of the mammalian transcriptome, and the relative simplicity of the nematode and plant transcriptomes. Genes are classified according to their inferred coding potential; many presumably non-coding genes are discovered. Genes are named by Entrez Gene names when available, else by AceView gene names, stable from release to release. Alternative features (promoters, introns and exons, polyadenylation signals) and coding potential, including motifs, domains, and homologies are annotated in depth; tissues where expression has been observed are listed in order of representation; diseases, phenotypes, pathways, functions, localization or interactions are annotated by mining selected sources, in particular PubMed, GAD and Entrez Gene, and also by performing manual annotation, especially in the worm. In this way, both the anatomy and physiology of the experimentally cDNA-supported human, mouse and nematode genes are thoroughly annotated.
The DIP database catalogs experimentally determined interactions between proteins. It combines information from a variety of sources to create a single, consistent set of protein-protein interactions. The data stored within the DIP database were curated both manually by expert curators and automatically using computational approaches that utilize the knowledge about the protein-protein interaction networks extracted from the most reliable, core subset of the DIP data. Please check the reference page to find articles describing the DIP database in greater detail. The Database of Ligand-Receptor Partners (DLRP) is a subset of DIP (Database of Interacting Proteins). The DLRP is a database of protein ligand and protein receptor pairs that are known to interact with each other. By interact we mean that the ligand and receptor are members of a ligand-receptor complex and, unless otherwise noted, transduce a signal. In some instances the ligand and/or receptor may form a heterocomplex with other ligands/receptors in order to be functional. We have entered the majority of interactions in DLRP as full DIP entries, with links to references and additional information.
The Reciprocal Net is a distributed database used by research crystallographers to store information about molecular structures; much of the data is available to the general public. The Reciprocal Net project is still under development. Currently, there are 18 participating crystallography laboratories online. The project is funded by the National Science Foundation (NSF) and part of the National Science Digital Library. The contents of this collection will come principally from structures contributed by participating crystallography laboratories, thus providing a means for teachers, students, and the general public to connect better with current chemistry research. The Reciprocal Net's emphasis is on obtaining structures of general interest and usefulness to those several classes of digital library users.
The Protein Data Bank (PDB) archive is the single worldwide repository of information about the 3D structures of large biological molecules, including proteins and nucleic acids. These are the molecules of life that are found in all organisms including bacteria, yeast, plants, flies, other animals, and humans. Understanding the shape of a molecule helps to understand how it works. This knowledge can be used to help deduce a structure's role in human health and disease, and in drug development. The structures in the archive range from tiny proteins and bits of DNA to complex molecular machines like the ribosome.
On 2016-05-29, the IMOS eMarine Information Infrastructure (eMII) Facility and the Australian Ocean Data Network (AODN) officially merged into a single entity. The marine information Facility of IMOS is now the AODN. Enabling open access to marine data is core business for IMOS. The IMOS data will continue to be discoverable alongside a wider collection of Australian marine and climate data via the new-look AODN Portal. Visit the AODN Portal at - IMOS is designed to be a fully-integrated, national system, observing at ocean-basin and regional scales, and covering physical, chemical and biological variables. IMOS observations are guided by science planning undertaken collaboratively across the Nodes of the Australian marine and climate science community with input from government, industry and other stakeholders. There are five major research themes that unify IMOS science plans and related observations: Long-term ocean change, Climate variability and weather extremes, Boundary currents, Continental shelf and coastal processes, and Ecosystem responses. The observations and data streams are collected via ten technology platforms, or Facilities.
The PeptideAtlas validates expressed proteins to provide eukaryotic genome data. The PeptideAtlas provides data to advance biological discoveries in humans. It accepts proteomic data from high-throughput processes and encourages data submission.
The NOAA National Centers for Environmental Information (formerly the National Geophysical Data Center) provide scientific stewardship, products and services for sea floor and lakebed data, including geophysics (gravity, magnetics, seismic reflection, bathymetry, water column sonar), and data derived from sediment and rock samples. NCEI compiles coastal and global digital elevation models, high-resolution models for tsunami inundation studies, provides stewardship for NOS data supporting charts and navigation, and is the US national long-term archive for MGG data
The primary focus of the Upper Ocean Processes Group is the study of physical processes in the upper ocean and at the air-sea interface using moored surface buoys equipped with meteorological and oceanographic sensors. The Upper Ocean Processes Group provides technical support to upper ocean and air-sea interface science programs. Deep-ocean and shallow-water moored surface buoy arrays are designed, fabricated, instrumented, tested, and deployed at sea for periods of up to one year.
The US Department of Energy’s Atmospheric Radiation Measurement (ARM) Data Center is a long-term archive and distribution facility for various ground-based, aerial and model data products in support of atmospheric and climate research. The ARM facility currently operates over 400 instruments at various observatories. The ARM Data Center (ADC) Archive currently holds over 11,000 data products with a total holding of over 1.5 petabytes of data dating back to 1993; these include data from instruments, value added products, model outputs, field campaigns and PI contributed data. The data center archive also includes data collected by ARM from related programs (e.g., external data such as NASA satellite data).
CRAN is a network of ftp and web servers around the world that store identical, up-to-date, versions of code and documentation for R. R is ‘GNU S’, a freely available language and environment for statistical computing and graphics which provides a wide variety of statistical and graphical techniques: linear and nonlinear modelling, statistical tests, time series analysis, classification, clustering, etc. Please consult the R project homepage for further information.
The CDAWeb data system enables improved display and coordinated analysis of multi-instrument, multimission data bases of the kind whose analysis is critical to meeting the science objectives of the ISTP program and the InterAgency Consultative Group (IACG) Solar-Terrestrial Science Initiative. The system combines the client-server user interface technology of the World Wide Web with a powerful set of customized IDL routines to leverage the data format standards (CDF) and guidelines for implementation adopted by ISTP and the IACG. The system can be used with any collection of data granules following the extended set of ISTP/IACG standards. CDAWeb is being used both to support coordinated analysis of public and proprietary data and to provide better functional access to specific public data such as the ISTP-precursor CDAW 9 data base that is formatted to the ISTP/IACG standards. Many data sets are available through the Coordinated Data Analysis Web (CDAWeb) service and the data coverage continues to grow. These are largely, but not exclusively, magnetospheric data and nearby solar wind data of the ISTP era (1992-present) at time resolutions of approximately a minute. The CDAWeb service provides graphical browsing, data subsetting, screen listings, file creations and downloads (ASCII or CDF). Public data from current (1992-present) space physics missions (including Cluster, IMAGE, ISTP, FAST, IMP-8, SAMPEX and others). Public data from missions before 1992 (including IMP-8, ISIS1/2, Alouette2, Hawkeye and others). Public data from all current and past space physics missions. CDAWeb is part of the "Space Physics Data Facility".
The EPN (or EUREF Permanent Network) is a voluntary organization of several European agencies and universities that pool resources and permanent GNSS station data to generate precise GNSS products. The EPN was created under the umbrella of the International Association of Geodesy, more precisely by its sub-commission EUREF. The European Terrestrial Reference System 89 (ETRS89) is used as the standard precise GPS coordinate system throughout Europe. Supported by EuroGeographics and endorsed by the EU, this reference system forms the backbone for all geographic and geodynamic projects on European territory at both national and international levels.