  • * at the end of a keyword allows wildcard searches
  • " quotes can be used for searching phrases
  • + represents an AND search (default)
  • | represents an OR search
  • - represents a NOT operation
  • ( and ) implies priority
  • ~N after a word specifies the desired edit distance (fuzziness)
  • ~N after a phrase specifies the desired slop amount
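The operators listed above can be combined into a single query string. The following is a minimal sketch in Python, assuming only the syntax described in the list; the helper names (`phrase`, `fuzzy`, `prefix`, `any_of`, `all_of`, `exclude`) are hypothetical and not part of any documented API, and how the search backend actually parses a given combination is not confirmed by this page.

```python
from typing import Optional


def phrase(text: str, slop: Optional[int] = None) -> str:
    """Wrap text in quotes for a phrase search; ~N adds a slop amount."""
    quoted = f'"{text}"'
    return quoted if slop is None else f"{quoted}~{slop}"


def fuzzy(word: str, distance: int) -> str:
    """Append ~N to a single word to request an edit distance."""
    return f"{word}~{distance}"


def prefix(stem: str) -> str:
    """Append * to a keyword for a wildcard search."""
    return f"{stem}*"


def any_of(*terms: str) -> str:
    """Join terms with | (OR) and group them with parentheses."""
    return "(" + " | ".join(terms) + ")"


def all_of(*terms: str) -> str:
    """Join terms with + (AND, which the list above says is the default)."""
    return " + ".join(terms)


def exclude(term: str) -> str:
    """Prefix a term with - for a NOT operation."""
    return f"-{term}"


# Hypothetical example query, built only from the operators listed above.
query = all_of(
    prefix("genom"),
    any_of(phrase("protein interaction", slop=2), fuzzy("proteomics", 1)),
    exclude("soybean"),
)
print(query)
```

Keeping each operator in its own small function makes it easy to see which part of the final string comes from which rule in the list.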
Found 231 result(s)
The NCBI Short Genetic Variations database, commonly known as dbSNP, catalogs short variations in nucleotide sequences from a wide range of organisms. These variations include single nucleotide variations, short nucleotide insertions and deletions, short tandem repeats and microsatellites. Short Genetic Variations may be common, thus representing true polymorphisms, or they may be rare. Some rare human entries have additional information associated with them, including disease associations, genotype information and allele origin, as some variations are somatic rather than germline events. ***NCBI will phase out support for non-human organism data in dbSNP and dbVar beginning on September 1, 2017***
>>>!!!<<< as stated 2017-06-09 MPIDB is no longer available under URL >>>!!!<<< The microbial protein interaction database (MPIDB) aims to collect and provide all known physical microbial interactions. Currently, 24,295 experimentally determined interactions among proteins of 250 bacterial species/strains can be browsed and downloaded. These microbial interactions have been manually curated from the literature or imported from other databases (IntAct, DIP, BIND, MINT) and are linked to 26,578 experimental evidences (PubMed ID, PSI-MI methods). In contrast to these databases, interactions in MPIDB are further supported by 68,346 additional evidences based on interaction conservation, protein complex membership, and 3D domain contacts (iPfam, 3did). We do not include (spoke/matrix) binary interactions inferred from pull-down experiments.
The ProteomeXchange consortium has been set up to provide a single point of submission of MS proteomics data to the main existing proteomics repositories, and to encourage data exchange between them for optimal data dissemination. Current members accepting submissions are the PRIDE (PRoteomics IDEntifications) database at the European Bioinformatics Institute, focusing mainly on shotgun mass spectrometry proteomics data, and PeptideAtlas/PASSEL, focusing on SRM/MRM datasets.
The Argo observational network consists of a fleet of 3000+ profiling autonomous floats deployed by about a dozen teams worldwide. WHOI has built about 10% of the global fleet. The mission lifetime of each float is about 4 years. During a typical mission, each float reports a profile of the upper ocean every 10 days. The sensors onboard record fundamental physical properties of the ocean: temperature and conductivity (a measure of salinity) as a function of pressure. The depth range of the observed profile depends on the local stratification and the float's mechanical ability to adjust its buoyancy. The majority of Argo floats report profiles between 1-2 km depth. At each surfacing, measurements of temperature and salinity are relayed back to shore via satellite. Telemetry is usually received every 10 days, but floats at high latitudes that are iced over accumulate their data and transmit the entire record the next time satellite contact is established. With current battery technology, the best performing floats last 6+ years and record over 200 profiles.
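The figures quoted above (one profile every 10 days, a typical lifetime of about 4 years, and 6+ years for the best floats) can be cross-checked with simple arithmetic. The sketch below is illustrative only: the function name `expected_profiles` and the 365.25-day year are assumptions, not part of any Argo specification.

```python
DAYS_PER_YEAR = 365.25  # assumed average year length


def expected_profiles(lifetime_years: float, cycle_days: float = 10.0) -> int:
    """Count complete reporting cycles that fit into a float's lifetime."""
    return int(lifetime_years * DAYS_PER_YEAR // cycle_days)


typical = expected_profiles(4)  # typical mission of about 4 years
best = expected_profiles(6)     # best performing floats, 6+ years

print(f"typical mission: about {typical} profiles")
print(f"6-year float: about {best} profiles")
```

Under these assumptions a 6-year float completes 219 cycles, which agrees with the "over 200 profiles" stated above, while a typical 4-year mission yields about 146.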
The European Nucleotide Archive (ENA) captures and presents information relating to experimental workflows that are based around nucleotide sequencing. A typical workflow includes the isolation and preparation of material for sequencing, a run of a sequencing machine in which sequencing data are produced and a subsequent bioinformatic analysis pipeline. ENA records this information in a data model that covers input information (sample, experimental setup, machine configuration), output machine data (sequence traces, reads and quality scores) and interpreted information (assembly, mapping, functional annotation). Data arrive at ENA from a variety of sources. These include submissions of raw data, assembled sequences and annotation from small-scale sequencing efforts, data provision from the major European sequencing centres and routine and comprehensive exchange with our partners in the International Nucleotide Sequence Database Collaboration (INSDC). Provision of nucleotide sequence data to ENA or its INSDC partners has become a central and mandatory step in the dissemination of research findings to the scientific community. ENA works with publishers of scientific literature and funding bodies to ensure compliance with these principles and to provide optimal submission systems and data access tools that work seamlessly with the published literature.
SoyBase is a professionally curated repository for genetics, genomics and related data resources for soybean. It contains current genetic, physical and genomic sequence maps integrated with qualitative and quantitative traits. SoyBase includes annotated "Williams 82" genomic sequence and associated data mining tools. The repository maintains controlled vocabularies for soybean growth, development, and traits that are linked to more general plant ontologies.
The JPL Tropical Cyclone Information System (TCIS) was developed to support hurricane research. There are three components to TCIS: a global archive of multi-satellite hurricane observations 1999-2010 (Tropical Cyclone Data Archive), North Atlantic Hurricane Watch, and the NASA Convective Processes Experiment (CPEX) aircraft campaign. Together, data and visualizations from the real-time system and data archive can be used to study hurricane processes, validate and improve models, and assist in developing new algorithms and data assimilation techniques.
The Clouds and the Earth’s Radiant Energy System (CERES) is a key component of the Earth Observing System (EOS) program. CERES instruments provide radiometric measurements of the Earth’s atmosphere from three broadband channels. CERES products include both solar-reflected and Earth-emitted radiation from the top of the atmosphere to the Earth's surface.
Cryo electron microscopy enables the determination of 3D structures of macromolecular complexes and cells from 2 to 100 Å resolution. EMDataResource is the unified global portal for one-stop deposition and retrieval of 3DEM density maps, atomic models and associated metadata, and is a joint effort among investigators of the Stanford/SLAC CryoEM Facility and the Research Collaboratory for Structural Bioinformatics (RCSB) at Rutgers, in collaboration with the EMDB team at the European Bioinformatics Institute. EMDataResource also serves as a resource for news, events, software tools, data standards, and validation methods for the 3DEM community. The major goal of the EMDataResource project in the current funding period is to work with the 3DEM community to (1) establish data-validation methods that can be used in the process of structure determination, (2) define the key indicators of a well-determined structure that should accompany every deposition, and (3) implement appropriate validation procedures for maps and map-derived models into a 3DEM validation pipeline.
Satellite-tracked drifting buoys ("drifters") collect measurements of upper ocean currents and sea surface temperatures (SST) around the world as part of the Global Drifter Program. Drifter locations are estimated from 16-20 satellite fixes per day, per drifter. The Drifter Data Assembly Center (DAC) at NOAA's Atlantic Oceanographic and Meteorological Laboratory (AOML) assembles these raw data, applies quality control procedures, and interpolates them via kriging to regular six-hour intervals. The raw observations and processed data are archived at AOML and at the Marine Environmental Data Services (MEDS) in Canada. Two types of data are available: "metadata" contains deployment location and time, time of drogue (sea anchor) loss, date of final transmission, etc. for each drifter. "Interpolated data" contains the quality-controlled, interpolated drifter observations.
The MPC is responsible for the designation of minor bodies in the solar system: minor planets; comets, in conjunction with the Central Bureau for Astronomical Telegrams (CBAT); and natural satellites (also in conjunction with CBAT). The MPC is also responsible for the efficient collection, computation, checking and dissemination of astrometric observations and orbits for minor planets and comets.
UniGene collects entries of transcript sequences from transcription loci from genes or expressed pseudogenes. Entries also contain information on the protein similarities, gene expressions, cDNA clone reagents, and genomic locations.
The HomoloGene database provides a system for the automated detection of homologs among annotated genes of genomes across multiple species. These homologs are fully documented and organized by homology group. HomoloGene processing uses proteins from input organisms to compare and sequence homologs, mapping back to corresponding DNA sequences.
The IPD-IMGT/HLA Database provides a specialist database for sequences of the human major histocompatibility complex (MHC) and includes the official sequences named by the WHO Nomenclature Committee For Factors of the HLA System. The IPD-IMGT/HLA Database is part of the international ImMunoGeneTics project (IMGT). The database uses the 2010 naming convention for HLA alleles in all tools herein. To aid in the adoption of the new nomenclature, all search tools can be used with both the current and pre-2010 allele designations. The pre-2010 nomenclature designations are only used where older reports or outputs have been made available for download.
The CliSAP-Integrated Climate Data Center (ICDC) allows easy access to climate-relevant data from in-situ measurements and satellite remote sensing. These data are important for determining the status of, and changes in, the climate system. Additionally, some relevant re-analysis data are included, which are modeled on the basis of observational data.
The database includes world-wide cosmic-ray neutron observations (pressure-corrected 1-hour counts) since 1953. The data are provided in two formats: 4096-byte "longformat" data and 80-byte "cardformat" data. Since the "cardformat" data are prepared only for a quick check of the data, the "longformat" data, which include information for data usage (constants, factors, etc.), should be used for research work. PS files (compressed) of yearly plots are also available.
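Both formats are described by a fixed size in bytes, so a file in either format could in principle be read one fixed-length record at a time. The following Python sketch shows only that chunking step. It assumes the files are plain concatenations of records with no separators, which the description above does not confirm, and it makes no attempt to decode the fields inside a record, whose layout is not given here. The name `iter_records` is hypothetical.

```python
import io
from typing import BinaryIO, Iterator

LONGFORMAT_BYTES = 4096  # record size stated in the description
CARDFORMAT_BYTES = 80    # record size stated in the description


def iter_records(stream: BinaryIO, record_size: int) -> Iterator[bytes]:
    """Yield consecutive fixed-length records from a binary stream.

    Assumes records are stored back to back with no separators.
    Raises ValueError if the stream ends with a partial record.
    """
    while True:
        chunk = stream.read(record_size)
        if not chunk:
            return
        if len(chunk) != record_size:
            raise ValueError(
                f"trailing partial record: {len(chunk)} of {record_size} bytes"
            )
        yield chunk


# Usage with synthetic in-memory data (not real observations).
fake_file = io.BytesIO(b"A" * CARDFORMAT_BYTES + b"B" * CARDFORMAT_BYTES)
records = list(iter_records(fake_file, CARDFORMAT_BYTES))
print(len(records), "records of", len(records[0]), "bytes")
```

Raising on a short final record, rather than silently yielding it, makes a wrong record size or an unexpected separator visible early.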
The Nobeyama Radio Polarimeters (NoRP) observe the Sun at multiple frequencies in the microwave range. They measure the total incoming flux and the degree of circular polarization.
The modENCODE Project, Model Organism ENCyclopedia Of DNA Elements, was initiated by the funding of applications received in response to Requests for Applications (RFAs) HG-06-006, entitled Identification of All Functional Elements in Selected Model Organism Genomes and HG-06-007, entitled A Data Coordination Center for the Model Organism ENCODE Project (modENCODE). The modENCODE Project is being run as an open consortium and welcomes any investigator willing to abide by the criteria for participation that have been established for the project. Both computational and experimental approaches are being applied by modENCODE investigators to study the genomes of D. melanogaster and C. elegans. An added benefit of studying functional elements in model organisms is the ability to biologically validate the elements discovered using methods that cannot be applied in humans. The comprehensive dataset that is expected to result from the modENCODE Project will provide important insights into the biology of D. melanogaster and C. elegans as well as other organisms, including humans.
The POES satellite system offers the advantage of daily global coverage, by making nearly polar orbits 14 times per day approximately 520 miles above the surface of the Earth. The Earth's rotation allows the satellite to see a different view with each orbit, and each satellite provides two complete views of weather around the world each day. NOAA partners with the European Organisation for the Exploitation of Meteorological Satellites (EUMETSAT) to constantly operate two polar-orbiting satellites – one POES and one European polar-orbiting satellite called Metop. NOAA's Polar Orbiting Environmental Satellites (POES) carry a suite of instruments that measure the flux of energetic ions and electrons at the altitude of the satellite. This environment varies as a result of solar and geomagnetic activity. Beginning with the NOAA-15 satellite, an upgraded version of the Space Environment Monitor (SEM-2) has been flown.
The UCSD Signaling Gateway Molecule Pages provide essential information on thousands of proteins involved in cellular signaling. Each Molecule Page contains regularly updated information derived from public data sources as well as sequence analysis, references and links to other databases.
The USGODAE Project consists of United States academic, government and military researchers working to improve assimilative ocean modeling as part of the International GODAE Project. GODAE hopes to develop a global system of observations, communications, modeling and assimilation, that will deliver regular, comprehensive information on the state of the oceans, in a way that will promote and engender wide utility and availability of this resource for maximum benefit to the community. The USGODAE Argo GDAC is currently operational, serving daily data from the following national DACs: Australia (CSIRO), Canada (MEDS), China (2: CSIO and NMDIS), France (Coriolis), India (INCOIS), Japan (JMA), Korea (2: KMA and Kordi), UK (BODC), and US (AOML).
EartH2Observe brings together the findings from European FP projects DEWFORA, GLOWASIS, WATCH, GEOWOW and others. It will integrate available global earth observations (EO), in-situ datasets and models and will construct a global water resources re-analysis dataset of significant length (several decades). The resulting data will allow for improved insights on the full extent of available water and existing pressures on global water resources in all parts of the water cycle. The project will support efficient and globally consistent water management and decision making by providing comprehensive multi-scale (regional, continental and global) water resources observations. It will test new EO data sources, extend existing processing algorithms and combine data from multiple satellite missions in order to improve the overall resolution and reliability of EO data included in the re-analysis dataset. The resulting datasets will be made available through an open Water Cycle Integrator data portal: the European contribution to the GEOSS/WCI approach. The datasets will be downscaled for application in case-studies at regional and local levels, and optimized based on identified European and local needs supporting water management and decision making. Actual data access:
The Taenia solium genome project is a whole genome sequencing project of the parasite Taenia solium, the causal agent of human and porcine cysticercosis, a disease that remains a relevant public health problem in Mexico. It is being carried out by a consortium of scientists belonging to diverse institutions of the Universidad Nacional Autónoma de México (UNAM, the National Autonomous University of Mexico).
The Inter-regional Geomagnetic Data Center of the Russian-Ukrainian INTERMAGNET segment is operated by the Geophysical Center of the Russian Academy of Sciences (GC RAS). Geomagnetic data are transmitted from observatories and stations located in Russia and Ukraine. A distinctive feature of the Center is its automated system for real-time recognition of artificial (anthropogenic) disturbances in incoming data. Based on a fuzzy logic approach, this quality control system facilitates the preparation of definitive magnetograms from preliminary records, a task carried out manually by data experts. The collected geomagnetic data are stored in a relational database management system. The geomagnetic database is intended for storing both 1-minute and 1-second data. The results of anthropogenic disturbance recognition are also stored in the database.