  • * at the end of a keyword allows wildcard searches
  • " quotes can be used for searching phrases
  • + represents an AND search (default)
  • | represents an OR search
  • - represents a NOT operation
  • ( and ) implies priority
  • ~N after a word specifies the desired edit distance (fuzziness)
  • ~N after a phrase specifies the desired slop amount
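The operators listed above can be combined into a single query string. The following is a minimal sketch in Python that only builds such a string; the helper names (phrase, prefix, fuzzy, any_of, all_of, exclude) are hypothetical and not part of any site API, and the search terms are illustrative.

```python
# Hypothetical helpers that assemble a query string from the operators
# described in the list above. They only build text; nothing is sent anywhere.

def phrase(text: str) -> str:
    """Wrap text in double quotes so it is searched as a phrase."""
    return f'"{text}"'

def prefix(stem: str) -> str:
    """Append * so the keyword matches any word starting with the stem."""
    return f"{stem}*"

def fuzzy(word: str, distance: int) -> str:
    """Append ~N to request matches within the given edit distance."""
    return f"{word}~{distance}"

def exclude(term: str) -> str:
    """Prefix the term with - to express a NOT operation."""
    return f"-{term}"

def any_of(*terms: str) -> str:
    """Join terms with | (OR) and group them with parentheses."""
    return "(" + " | ".join(terms) + ")"

def all_of(*terms: str) -> str:
    """Join terms with spaces, relying on AND being the default operator."""
    return " ".join(terms)

query = all_of(
    any_of(prefix("climat"), phrase("ocean drilling")),
    exclude("cancer"),
    fuzzy("maize", 1),
)
print(query)
```

Because AND is the default, all_of simply separates its parts with spaces; the parentheses produced by any_of keep the OR group together when it is combined with other terms.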
Found 16 result(s)
THIN is a medical data collection scheme that collects anonymised patient data from its members through the healthcare software Vision. The UK Primary Care database contains longitudinal patient records for approximately 6% of the UK population. The anonymised data collection, which goes back to 1994, is nationally representative of the UK population.
The International Maize and Wheat Improvement Center (CIMMYT) provides a free, open access repository of research software, studies, and datasets produced and developed by CIMMYT scientists as well as the results of the Seeds of Discovery project, which makes available genetic profiles of wheat and maize, two of mankind's three major cereal crops.
The Research Data Centre (Forschungsdatenzentrum, FDZ) at the Institute for Educational Quality Improvement (Institut zur Qualitätsentwicklung im Bildungswesen, IQB) archives and documents data sets resulting from national and international assessment studies (such as DESI, PIRLS, PISA, IQB-Bildungstrends). Moreover, the FDZ makes these data sets available for re- and secondary analysis. Members of the scientific community can apply for access to the data sets archived at the FDZ.
The GML contributes to the continual improvement of access to and information about official microdata; provides a service and research infrastructure for these data; adopts the function of an intermediary between the Federal Statistical Office and empirical research; conducts exemplary research based upon official data. The GML is an integral part of the German data infrastructure and features as one of six institutions funded by the German Council of Social and Economic Data.
VertNet is an NSF-funded collaborative project that makes biodiversity data free and available on the web. VertNet is a tool designed to help people discover, capture, and publish biodiversity data. It is also the core of a collaboration between hundreds of biocollections that contribute biodiversity data and work together to improve it. VertNet is an engine for training current and future professionals to use and build upon best practices in data quality, curation, research, and data publishing. Yet, VertNet is still the aggregate of all of the information that it mobilizes. To us, VertNet is all of these things and more.
GnpIS is a multispecies integrative information system dedicated to plant and fungi pests. It bridges genetic and genomic data, giving researchers access to both genetic information (e.g. genetic maps, quantitative trait loci, association genetics, markers, polymorphisms, germplasms, phenotypes and genotypes) and genomic data (e.g. genomic sequences, physical maps, genome annotation and expression data) for species of agronomical interest. GnpIS is used by both large international projects and plant science departments at the French National Research Institute for Agriculture, Food and Environment. It is regularly improved and released several times per year. GnpIS is accessible through a web portal and allows users to browse different types of data either independently through dedicated interfaces or simultaneously using a quick search (a Google-like search) or advanced search tools (Biomart, Galaxy, Intermine).
The National Cancer Data Base (NCDB), a joint program of the Commission on Cancer (CoC) of the American College of Surgeons (ACoS) and the American Cancer Society (ACS), is a nationwide oncology outcomes database for more than 1,500 Commission-accredited cancer programs in the United States and Puerto Rico. Some 70 percent of all newly diagnosed cases of cancer in the United States are captured at the institutional level and reported to the NCDB. The NCDB, begun in 1989, now contains approximately 29 million records from hospital cancer registries across the United States. Data on all types of cancer are tracked and analyzed. These data are used to explore trends in cancer care, to create regional and state benchmarks for participating hospitals, and to serve as the basis for quality improvement.
The International Food Policy Research Institute (IFPRI) seeks sustainable solutions for ending hunger and poverty. In collaboration with institutions throughout the world, IFPRI is often involved in the collection of primary data and the compilation and processing of secondary data. The resulting datasets provide a wealth of information at the local (household and community), national, and global levels. IFPRI freely distributes as many of these datasets as possible and encourages their use in research and policy analysis. IFPRI Dataverse contains the following dataverses: Agricultural Science and Knowledge Indicators - ASTI, HarvestChoice, Statistics on Public Expenditures for Economic Development - SPEED, International Model for Policy Analysis of Agricultural Commodities and Trade - IMPACT, Africa RISING Dataverse and Food Security Portal Dataverse.
Under the World Climate Research Programme (WCRP) the Working Group on Coupled Modelling (WGCM) established the Coupled Model Intercomparison Project (CMIP) as a standard experimental protocol for studying the output of coupled atmosphere-ocean general circulation models (AOGCMs). CMIP provides a community-based infrastructure in support of climate model diagnosis, validation, intercomparison, documentation and data access. This framework enables a diverse community of scientists to analyze GCMs in a systematic fashion, a process which serves to facilitate model improvement. Virtually the entire international climate modeling community has participated in this project since its inception in 1995. The Program for Climate Model Diagnosis and Intercomparison (PCMDI) archives much of the CMIP data and provides other support for CMIP. We are now beginning the process towards the IPCC Fifth Assessment Report and with it the CMIP5 intercomparison activity. The CMIP5 (CMIP Phase 5) experiment design has been finalized with the following suites of experiments: (I) decadal hindcast and prediction simulations, (II) "long-term" simulations, and (III) "atmosphere-only" (prescribed SST) simulations for especially computationally demanding models. The new ESGF peer-to-peer (P2P) enterprise system is now the official site for CMIP5 model output. The old gateway is deprecated and has been shut down permanently.
NOAA's National Centers for Environmental Information (NCEI) are responsible for hosting and providing public access to one of the most significant archives for environmental data on Earth with over 20 petabytes of comprehensive atmospheric, coastal, oceanic, and geophysical data. NCEI headquarters are located in Asheville, North Carolina. Most employees work in the four main locations, but apart from those locations, NCEI has employees strategically located throughout the United States. The main locations are Cooperative Institute for Climate and Satellites–North Carolina (CICS-NC) at Asheville, North Carolina; Cooperative Institute for Research in Environmental Sciences (CIRES) at Boulder, Colorado; Cooperative Institute for Climate and Satellites–Maryland (CICS-MD) at Silver Spring, Maryland; and Stennis Space Center, Mississippi.
The International Ocean Discovery Program (IODP) is an international marine research collaboration that explores Earth's history and dynamics using ocean-going research platforms to recover data recorded in seafloor sediments and rocks and to monitor subseafloor environments. IODP depends on facilities funded by three platform providers with financial contributions from five additional partner agencies. Together, these entities represent 26 nations whose scientists are selected to staff IODP research expeditions conducted throughout the world's oceans. IODP expeditions are developed from hypothesis-driven science proposals aligned with the program's science plan "Illuminating Earth's Past, Present, and Future". The science plan identifies 14 challenge questions in the four areas of climate change, deep life, planetary dynamics, and geohazards. Until 2013 the program operated under the name Integrated Ocean Drilling Program.
In keeping with the open data policies of the U.S. Agency for International Development (USAID) and the Bill & Melinda Gates Foundation, the Cereal Systems Initiative for South Asia (CSISA) has launched the CSISA Data Repository to ensure public accessibility to key data sets, including crop cut data (directly observed crop yield estimates), on-station and on-farm research trial data and socioeconomic surveys. CSISA is a science-driven and impact-oriented regional initiative for increasing the productivity of cereal-based cropping systems in Bangladesh, India and Nepal, thus improving food security and farmers’ livelihoods. CSISA generates data that is of value and interest to a diverse audience of researchers, policymakers and the public. CSISA’s data repository is hosted on Dataverse, an open source web application developed at Harvard University to share, preserve, cite, explore and analyze research data. CSISA’s repository contains rich datasets, including on-station trial data from 2009–17 about crop and resource management practices for sustainable future cereal-based cropping systems. This data was collected during the long-term, on-station research trials conducted at the Indian Council of Agricultural Research – Research Complex for the Eastern Region in Bihar, India. The data include information on agronomic management for the sustainable intensification of cropping systems, mechanization, diversification, futuristic approaches to sustainable intensification, and the long-term effects of conservation agriculture practices on soil health and the pest spectrum. Additional trial data in the repository includes nutrient omission plot technique trials from Bihar, eastern Uttar Pradesh and Odisha, India, covering 2012–15, which help determine the indigenous nutrient supplying ability of the soil. This data helps develop precision nutrient management approaches that would be most effective in different types of soils.
CSISA’s most popular dataset thus far includes crop cut data on maize in Odisha, India and rice in Nepal. Crop cut datasets provide ground-truthed yield estimates, as well as valuable information on relevant agronomic and socioeconomic practices affecting production practices and yield. A variety of research data on wheat systems are also available from Bangladesh and India. Additional crop cut data will also be coming online soon. Cropping system-related data and socioeconomic data are in the repository, some of which are cross-listed with a Dataverse run by the International Food Policy Research Institute. The socioeconomic datasets contain baseline information that is crucial for technology targeting, as well as to assess the adoption and performance of CSISA-supported technologies under smallholder farmers’ constrained conditions, representing the ultimate litmus test of their potential for change at scale. Other highly interesting datasets include farm composition and productive trajectory information, based on a 20-year panel dataset, and numerous wheat crop cut and maize nutrient omission trial data from across Bangladesh.
To target the multidisciplinary, broad scale nature of empirical educational research in the Federal Republic of Germany, a networked research data infrastructure is required which brings together disparate services from different research data providers, delivering services to researchers in a usable, needs-oriented way. The Verbund Forschungsdaten Bildung (Educational Research Data Alliance, VFDB) therefore aims to cooperate with relevant actors from science, politics and research funding institutes to set up a powerful infrastructure for empirical educational research. This service is meant to adequately capture specific needs of the scientific communities and support empirical educational research in carrying out excellent research.
ICRISAT performs crop improvement research, using conventional methods as well as methods derived from biotechnology, on the following crops: Chickpea, Pigeonpea, Groundnut, Pearl millet, Sorghum and Small millets. ICRISAT's data repository collects and preserves the datasets produced by ICRISAT researchers and facilitates access to them for all interested users. The data include phenotypic, genotypic, social science, spatial, soil and weather data.
CLARIN is a European Research Infrastructure for the Humanities and Social Sciences, focusing on language resources (data and tools). It is being implemented and constantly improved at leading institutions in a large and growing number of European countries, with the aim of improving Europe's multilingual competence. CLARIN provides several services, such as access to language data and tools for analyzing data, the option to deposit research data, and direct access to knowledge about relevant topics in relation to (research on and with) language resources. The main tool is the 'Virtual Language Observatory', which provides metadata and access to the different national CLARIN centers and their data.