Reset all


Content Types


AID systems



Data access

Data access restrictions

Database access

Database access restrictions

Database licenses

Data licenses

Data upload

Data upload restrictions

Enhanced publication

Institution responsibility type

Institution type


Metadata standards

PID systems

Provider types

Quality management

Repository languages



Repository types


  • * at the end of a keyword allows wildcard searches
  • " quotes can be used for searching phrases
  • + represents an AND search (default)
  • | represents an OR search
  • - represents a NOT operation
  • ( and ) implies priority
  • ~N after a word specifies the desired edit distance (fuzziness)
  • ~N after a phrase specifies the desired slop amount
Found 165 result(s)
This site provides access to complete, annotated genomes from bacteria and archaea (present in the European Nucleotide Archive) through the Ensembl graphical user interface (genome browser). Ensembl Bacteria contains genomes from annotated INSDC records that are loaded into Ensembl multi-species databases, using the INSDC annotation import pipeline.
ArrayExpress is one of the major international repositories for high-throughput functional genomics data from both microarray and high-throughput sequencing studies, many of which are supported by peer-reviewed publications. Data sets are submitted directly to ArrayExpress and curated by a team of specialist biological curators. In the past (until 2018) datasets from the NCBI Gene Expression Omnibus database were imported on a weekly basis. Data is collected to MIAME and MINSEQE standards.
The Electron Microscopy Data Bank (EMDB) is a public repository for electron microscopy density maps of macromolecular complexes and subcellular structures. It covers a variety of techniques, including single-particle analysis, electron tomography, and electron (2D) crystallography.
The European Centre for Medium-Range Weather Forecasts (ECMWF) is an independent intergovernmental organisation supported by 34 states. ECMWF is both a research institute and a 24/7 operational service, producing and disseminating numerical weather predictions to its Member States. This data is fully available to the national meteorological services in the Member States. The Centre also offers a catalogue of forecast data that can be purchased by businesses worldwide and other commercial customers Forecasts, analyses, climate re-analyses, reforecasts and multi-model data are available from our archive (MARS) or via dedicated data servers or via point-to-point dissemination
Paleoclimatology data are derived from natural sources such as tree rings, ice cores, corals, and ocean and lake sediments. These proxy climate data extend the archive of weather and climate information hundreds to millions of years. The data include geophysical or biological measurement time series and some reconstructed climate variables such as temperature and precipitation. NCEI provides the paleoclimatology data and information scientists need to understand natural climate variability and future climate change. We also operate the World Data Center for Paleoclimatology, which archives and distributes data contributed by scientists around the world.
The datacommons@psu was developed in 2005 to provide a resource for data sharing, discovery, and archiving for the Penn State research and teaching community. Access to information is vital to the research, teaching, and outreach conducted at Penn State. The datacommons@psu serves as a data discovery tool, a data archive for research data created by PSU for projects funded by agencies like the National Science Foundation, as well as a portal to data, applications, and resources throughout the university. The datacommons@psu facilitates interdisciplinary cooperation and collaboration by connecting people and resources and by: Acquiring, storing, documenting, and providing discovery tools for Penn State based research data, final reports, instruments, models and applications. Highlighting existing resources developed or housed by Penn State. Supporting access to project/program partners via collaborative map or web services. Providing metadata development citation information, Digital Object Identifiers (DOIs) and links to related publications and project websites. Members of the Penn State research community and their affiliates can easily share and house their data through the datacommons@psu. The datacommons@psu will also develop metadata for your data and provide information to support your NSF, NIH, or other agency data management plan.
>>>!!!<<< Noticed 26.08.2020: The NCI CBIIT instance of the CGAP no longer exist on this website. The Mitelman Database of Chromosome Aberrations and Gene Fusions in Cancer has a new home at the NCI-funded Institute for Systems Biology Cancer Genomics Cloud available at the following location: >>>!!!<<< The NCI’s Cancer Genome Anatomy Project (CGAP) is an online resource designed to provide the scientific community with detailed characterization of gene expression in biological tissues. By characterizing normal, pre-cancer and cancer cells, CGAP aims to improve detection, diagnosis and treatment for the patient. Moreover, CGAP provides access to cDNA clones to the research community through a variety of distributors. CGAP provides a wide range of genomic data and resources
Rhea is a freely available and comprehensive resource of expert-curated biochemical reactions. It has been designed to provide a non-redundant set of chemical transformations for applications such as the functional annotation of enzymes, pathway inference and metabolic network reconstruction. There are three types of reaction participants (reactants and products): Small molecules, Rhea polymers, Generic compounds. All three types of reaction participants are linked to the ChEBI database (Chemical Entities of Biological Interest) which provides detailed information about structure, formula and charge. Rhea provides built-in validations that ensure both mass and charge balance of the reactions. We have populated the database with the reactions found in the enzyme classification (i.e. in the IntEnz and ENZYME databases), extending it with additional known reactions of biological interest. While the main focus of Rhea is enzyme-catalysed reactions, other biochemical reactions (including those that are often termed "spontaneous") also are included.
The CDC Data Catalogue describes the Climate Data of the DWD and provides access to data, descriptions and access methods. Climate Data refers to observations, statistical indices and spatial analyses. CDC comprises Climate Data for Germany, but also global Climate Data, which were collected and processed in the framework of international co-operation. The CDC Data Catalogue is under construction and not yet complete. The purposes of the CDC Data Catalogue are: to provide uniform access to climate data centres and climate datasets of the DWD to describe the climate data according to international metadata standards to make the catalogue information available on the Internet to support the search for climate data to facilitate the access to climate data and climate data descriptions
SMOKA provides public science data obtained at Subaru Telescope, 188cm telescope at Okayama Astrophysical Observatory, 105cm Schmidt telescope at Kiso Observatory (University of Tokyo), MITSuME, and KANATA Telescope at Higashi-Hiroshima Observatory. It is intended mainly for astronomical researchers.
The Solar Data Analysis Center serves data from recent and current space-based solar-physics missions, funds and hosts much of the SolarSoft library, and leads the Virtual Solar Observatory (VSO) effort. SDAC is the active archive, providing network access to data from such missions as SOHO, Yohkoh, and TRACE.
The Arabidopsis Information Resource (TAIR) maintains a database of genetic and molecular biology data for the model higher plant Arabidopsis thaliana . Data available from TAIR includes the complete genome sequence along with gene structure, gene product information, metabolism, gene expression, DNA and seed stocks, genome maps, genetic and physical markers, publications, and information about the Arabidopsis research community. Gene product function data is updated every two weeks from the latest published research literature and community data submissions. Gene structures are updated 1-2 times per year using computational and manual methods as well as community submissions of new and updated genes. TAIR also provides extensive linkouts from our data pages to other Arabidopsis resources.
The WDCGG archives measurement data for greenhouse and related gases in the atmosphere and the ocean (77 gaseous species as of 31 December 2009). The data are classified into six categories according to the observation platforms or methods used (see WDCGG Data Submission and Dissemination Guide). Air observation at stationary platform Air observation by mobile platforms (e.g. aircraft, ships, etc.) Vertical profile observation of air (e.g. multi-height observation using a tower) Hydrographic sampling observation by ships Ice core observation Observation of surface seawater and overlying air
The Data Library and Archives (DLA) is part of the joint library system supported by the Marine Biological Laboratory and the Woods Hole Oceanographic Institution. The DLA holds collections of administrative records, photographs, scientists' data and papers, film and video, historical instruments, as well as books, journals and technical reports.
PDBe is the European resource for the collection, organisation and dissemination of data on biological macromolecular structures. In collaboration with the other worldwide Protein Data Bank (wwPDB) partners - the Research Collaboratory for Structural Bioinformatics (RCSB) and BioMagResBank (BMRB) in the USA and the Protein Data Bank of Japan (PDBj) - we work to collate, maintain and provide access to the global repository of macromolecular structure data. We develop tools, services and resources to make structure-related data more accessible to the biomedical community.
The Bavarian Archive for Speech Signals (BAS) is a public institution hosted by the University of Munich. This institution was founded with the aim of making corpora of current spoken German available to both the basic research and the speech technology communities via a maximally comprehensive digital speech-signal database. The speech material will be structured in a manner allowing flexible and precise access, with acoustic-phonetic and linguistic-phonetic evaluation forming an integral part of it.
The Coriolis Data Centre handles operational oceanography measurements made in situ, complementing the measurement of the ocean surface made using instruments aboard satellites. This work is realised through the establishment of permanent networks with data collected by ships or autonomous systems that are either fixed or drifting. This data can be used to construct a snapshot of water mass structure and current intensity.
The objective of this database is to stimulate the exchange of information and the collaboration between researchers within the ChArMEx community. However, this community is not exclusive and researchers not directly involved in ChArMEx, but who wish to contribute to the achievements of ChArMEx scientific and/or educational goals are welcome to join-in. The database is a depository for all the data collected during the various projects that contribute to ChArMEx coordinated program. It aims at documenting, storing and distributing the data produced or used by the project community. However, it is also intended to host datasets that were produced outside the ChArMEx program but which are meaningful to ChArMEx scientific and/or educational goals. Any data owner who wishes to add or link his dataset to ChArMEx database is welcome to contact the database manager in order to get help and support. The ChArMEx database includes past and recent geophysical in situ observations, satellite products and model outputs. The database organizes the data management and provides data services to end-users of ChArMEx data. The database system provides a detailed description of the products and uses standardized formats whenever it is possible. It defines the access rules to the data and details the mutual rights and obligations of data providers and users (see ChArMEx data and publication policy). The database is being developed jointly by : SEDOO, OMP Toulouse , ICARE, Lille and ESPRI, IPSL Paris
HYdrological cycle in the Mediterranean EXperiemnt. Considering the science and societal issues motivating HyMeX, the programme aims to : improve our understanding of the water cycle, with emphasis on extreme events, by monitoring and modelling the Mediterranean atmosphere-land-ocean coupled system, its variability from the event to the seasonal and interannual scales, and its characteristics over one decade (2010-2020) in the context of global change, assess the social and economic vulnerability to extreme events and adaptation capacity.The multidisciplinary research and the database developed within HyMeX should contribute to: improve observational and modelling systems, especially for coupled systems, better predict extreme events, simulate the long-term water-cycle more accurately, provide guidelines for adaptation measures, especially in the context of global change.
Originally named the Radiation Belt Storm Probes (RBSP), the mission was re-named the Van Allen Probes, following successful launch and commissioning. For simplicity and continuity, the RBSP short-form has been retained for existing documentation, file naming, and data product identification purposes. The RBSPICE investigation including the RBSPICE Instrument SOC maintains compliance with requirements levied in all applicable mission control documents.
The NCAR is a federally funded research and development center committed to research and education in atmospheric science and related scientific fields. NCAR seeks to support and enhance the scientific community nationally and globally by monitoring and researching the atmosphere and related physical and biological systems. Users can access climate and earth models created to better understand the atmosphere, the Earth and the Sun; as well as data from various NCAR research programs and projects. NCAR is sponsored by the National Science Foundation in addition to various other U.S. agencies.
HunCLARIN is a strategic research infrastructure of Hungary’s leading knowledge centres involved in R&D in speech- and language processing. It contains linguistic resources and tools that form the basis of research. The infrastructure has obtained an “SKI” qualification (Strategic Research Infrastructure) in 2010, and has been significantly expanded since. Currently comprising 36 members, the infrastructure includes several general- and specific-purpose text corpora, different language processing tools and analysers, linguistic databases as well as ontologies. RIL HAS was a co-founder of the European CLARIN project, which aims at supporting humanities and social sciences research with the help of language technology and by making digital linguistic resources more easily available. In accordance with these goals HunClarin makes the research infrastructures developed by the respective centres directly accessible for researchers through a common network entry point. A general goal of the infrastructure is to realise the interoperability of the collected research infrastructures and to enable comparing the performance of the respective alternatives and to coordinate different foci in R&D. The coordinator and contact person of the infrastructure is Tamás Váradi, RIL HAS.