  • * at the end of a keyword allows wildcard searches
  • " quotes can be used for searching phrases
  • + represents an AND search (default)
  • | represents an OR search
  • - represents a NOT operation
  • ( and ) implies priority
  • ~N after a word specifies the desired edit distance (fuzziness)
  • ~N after a phrase specifies the desired slop amount
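
The operators listed above can be combined into a single query string. The following is a minimal sketch of how such strings might be assembled; the helper names (`phrase`, `prefix`, `fuzzy`, `any_of`, `build_query`) are hypothetical and not part of this site, and how the search engine interprets a given combination is an assumption to verify against actual results.

```python
from typing import Iterable, Optional


def phrase(text: str, slop: Optional[int] = None) -> str:
    """Wrap text in quotes for a phrase search; append ~N to set the slop amount."""
    quoted = f'"{text}"'
    return quoted if slop is None else f"{quoted}~{slop}"


def prefix(word: str) -> str:
    """Append * to a keyword for a wildcard search."""
    return f"{word}*"


def fuzzy(word: str, distance: int) -> str:
    """Append ~N to a word to set the desired edit distance."""
    return f"{word}~{distance}"


def any_of(*terms: str) -> str:
    """Join terms with | (OR) and group them with parentheses for priority."""
    return "(" + " | ".join(terms) + ")"


def build_query(required: Iterable[str], excluded: Iterable[str] = ()) -> str:
    """Join required terms with + (AND) and prefix each excluded term with - (NOT)."""
    query = " + ".join(required)
    for term in excluded:
        query += f" -{term}"
    return query


if __name__ == "__main__":
    # Hypothetical search: records about epidemiology that mention either the
    # phrase "cancer imaging" (with some slop) or a near match of "genome",
    # while excluding records that mention climate.
    print(
        build_query(
            [prefix("epidemiolog"), any_of(phrase("cancer imaging", 2), fuzzy("genome", 1))],
            excluded=["climate"],
        )
    )
```

The resulting string can be pasted into the search box; because AND is the default, the explicit `+` is optional and is written out here only for readability.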
Found 722 result(s)
The Comprehensive Epidemiologic Data Resource (CEDR) is the Department of Energy's (DOE) electronic database comprised of health studies of DOE contract workers and environmental studies of areas surrounding DOE facilities. DOE recognizes the benefits of data sharing and supports the public's right to know about worker and community health risks. CEDR provides independent researchers and the public with access to de-identified data collected since the Department's early production years. Current CEDR holdings include more than 80 studies of over 1 million workers at 31 DOE sites. Access to these data is at no cost to the user. Most of CEDR's holdings are derived from epidemiologic studies of DOE workers at many large nuclear weapons plants, such as Hanford, Los Alamos, the Oak Ridge reservation, Savannah River Site, and Rocky Flats. These studies primarily use death certificate information to identify excess deaths and patterns of disease among workers to determine what factors contribute to the risk of developing cancer and other illnesses. In addition, many of these studies have radiation exposure measurements on individual workers. CEDR is supported by the Oak Ridge Institute for Science and Education (ORISE) in Oak Ridge, Tennessee. Now a mature system in routine operational use, CEDR's modern internet-based systems respond to thousands of requests to its web server daily. With about 1,500 Internet sites pointing to CEDR's web site, CEDR is a national user facility, with a large audience for data that are not available elsewhere.
The National Archives and Records Administration (NARA) is the nation's record keeper. Of all documents and materials created in the course of business conducted by the United States Federal government, only 1%-3% are so important for legal or historical reasons that they are kept by us forever. Those valuable records are preserved and are available to you, whether you want to see if they contain clues about your family’s history, need to prove a veteran’s military service, or are researching an historical topic that interests you.
Country is an online data storage and synchronization service provided by the Danish e-Infrastructure Cooperation (DeIC), specifically aimed at researchers and scientists at Danish academic institutions. The service is primarily intended for working with and sharing active research data as well as for safekeeping of large datasets. Such data can be put in an area ('/Data') that is specifically not synced, i.e. not copied to desktops, laptops and mobile devices by the sync clients. Instead, the data can be accessed and manipulated via the web interface, file transfer clients or the command line. The service is built on and with open-source software from the ground up: FreeBSD, ZFS, Apache, PHP, ownCloud+apps. DeIC is actively engaged in community efforts on developing such apps, and some are available as previews of things to come - including apps for getting large amounts of data into the system and tagging with metadata. Our servers are attached directly to the 10-Gigabit backbone of "Forskningsnettet", meaning that wired upload and download speeds from Danish academic institutions are in principle comparable to those of an external USB hard drive.
The Eurac Research CLARIN Centre (ERCC) is a dedicated repository for language data. It is hosted by the Institute for Applied Linguistics (IAL) at Eurac Research, a private research centre based in Bolzano, South Tyrol. The Centre is part of the Europe-wide CLARIN infrastructure, which means that it follows well-defined international standards for (meta)data and procedures and is well-embedded in the wider European Linguistics infrastructure. The repository hosts data collected at the IAL, but is also open for data deposits from external collaborators.
The Cancer Imaging Archive is a freely accessible repository containing medical images and supporting data from cancer patients. Images are stored in DICOM file format. The images are organized as “Collections”, typically patients related by a common disease (e.g. lung cancer), image modality (MRI, CT, etc) or research focus. Search functionality allows users to query across Collections or within them to filter out only the data they are most interested in.
The Biological General Repository for Interaction Datasets (BioGRID) is a public database that archives and disseminates genetic and protein interaction data from model organisms and humans. BioGRID is an online interaction repository with data compiled through comprehensive curation efforts. All interaction data are freely provided through our search index and available via download in a wide variety of standardized formats.
The German Neuroinformatics Node's data infrastructure (GIN) services provide a platform for comprehensive and reproducible management and sharing of neuroscience data. Building on well established versioning technology, GIN offers the power of a web based repository management service combined with a distributed file storage. The service addresses the range of research data workflows starting from data analysis on the local workstation to remote collaboration and data publication.
The EBiSC Catalogue is a collection of human iPS cells being made available to academic and commercial researchers for use in disease modelling and other forms of preclinical research. The initial collection has been generated from a wide range of donors representing specific disease backgrounds and healthy controls. As the collection grows, more isogenic control lines will become available which will add further to the collection’s appeal.
SIDER contains information on marketed medicines and their recorded adverse drug reactions. The information is extracted from public documents and package inserts. The available information includes side effect frequency, drug and side effect classifications as well as links to further information, for example drug–target relations.
With the EnviDat program, we are developing a unified and managed access portal for WSL's rich reservoir of environmental monitoring and research data. EnviDat is designed as a portal to publish, connect and search across existing data but is not intended to become a large data centre hosting original data. While sharing of data is centrally facilitated, data management remains decentralised, and the know-how and responsibility to curate research data remain with the original data providers.
The National Sleep Research Resource (NSRR) offers free web access to large collections of de-identified physiological signals and clinical data elements collected in well-characterized research cohorts and clinical trials.
Academic Commons provides open, persistent access to the scholarship produced by researchers at Columbia University, Barnard College, Jewish Theological Seminary, Teachers College, and Union Theological Seminary. Academic Commons is a program of the Columbia University Libraries. Academic Commons accepts articles, dissertations, research data, presentations, working papers, videos, and more.
The Health and Medical Care Archive (HMCA) is the data archive of the Robert Wood Johnson Foundation (RWJF), the largest philanthropy devoted exclusively to health and health care in the United States. Operated by the Inter-university Consortium for Political and Social Research (ICPSR) at the University of Michigan, HMCA preserves and disseminates data collected by selected research projects funded by the Foundation and facilitates secondary analyses of the data. Our goal is to increase understanding of health and health care in the United States through secondary analysis of RWJF-supported data collections.
GovData, the data portal for Germany, offers consistent and central access to administrative data at the federal, state, and local level. The objective is to make data more available and easier to use at a single location. As set out in the concept of "open data", we attempt to facilitate the use of open licenses and to increase the supply of machine-readable raw data.
OMIM is a comprehensive, authoritative compendium of human genes and genetic phenotypes that is freely available and updated daily. OMIM is authored and edited at the McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, under the direction of Dr. Ada Hamosh. Its official home is
The National Climatic Data Center has merged into the National Centers for Environmental Information (NCEI). NOAA's National Climatic Data Center (NCDC) is responsible for preserving, monitoring, assessing, and providing public access to the Nation's treasure of climate and historical weather data and information.
Jason is a remote-controlled deep-diving vessel that gives shipboard scientists immediate, real-time access to the sea floor. Instead of making short, expensive dives in a submarine, scientists can stay on deck and guide Jason as deep as 6,500 meters (4 miles) to explore for days on end. Jason is a type of remotely operated vehicle (ROV), a free-swimming vessel connected by a long fiber-optic tether to its research ship. The 10-km (6-mile) tether delivers power and instructions to Jason and fetches data from it.
VAMDC aims to be an interoperable e-infrastructure that provides the international research community with access to a broad range of atomic and molecular (A&M) data compiled within a set of A&M databases accessible through the provision of this portal and of user software. Furthermore VAMDC aims to provide A&M data providers and compilers with a large dissemination platform for their work. VAMDC infrastructure was established to provide a service to a wide international research community and has been developed in conjunction with consultations and advice from the A&M user community.
This Animal Quantitative Trait Loci (QTL) database (Animal QTLdb) is designed to house all publicly available QTL and trait mapping data (i.e. trait and genome location association data; collectively called "QTL data" on this site) on livestock animal species for easily locating and making comparisons within and between species. New database tools are continually added to align the QTL and association data to other types of genome information, such as annotated genes, RH / SNP markers, and human genome maps. Besides the QTL data from species listed below, the QTLdb is open to house QTL/association data from other animal species where feasible. Note that the JAS, along with other journals, now requires that new QTL/association data be entered into a QTL database as part of its publication requirements.
The Scholarly Database (SDB) at Indiana University aims to serve researchers and practitioners interested in the analysis, modeling, and visualization of large-scale scholarly datasets. The online interface provides access to six datasets: MEDLINE papers, registered Clinical Trials, U.S. Patent and Trademark Office patents (USPTO), National Science Foundation (NSF) funding, National Institutes of Health (NIH) funding, and National Endowment for the Humanities funding – over 26 million records in total.
The Census Bureau releases TIGER/Line shapefiles and metadata each year to the public. TIGER/Line shapefiles are spatial extracts from the Census Bureau’s MAF/TIGER database. They contain features such as roads, railroads, hydrographic features and legal and statistical boundaries.