Reset all


Content Types


AID systems



Data access

Data access restrictions

Database access

Database access restrictions

Database licenses

Data licenses

Data upload

Data upload restrictions

Enhanced publication

Institution responsibility type

Institution type


Metadata standards

PID systems

Provider types

Quality management

Repository languages



Repository types


  • * at the end of a keyword allows wildcard searches
  • " quotes can be used for searching phrases
  • + represents an AND search (default)
  • | represents an OR search
  • - represents a NOT operation
  • ( and ) implies priority
  • ~N after a word specifies the desired edit distance (fuzziness)
  • ~N after a phrase specifies the desired slop amount
Found 280 result(s)
The CancerData site is an effort of the Medical Informatics and Knowledge Engineering team (MIKE for short) of Maastro Clinic, Maastricht, The Netherlands. Our activities in the field of medical image analysis and data modelling are visible in a number of projects we are running. CancerData is offering several datasets. They are grouped in collections and can be public or private. You can search for public datasets in the NBIA (National Biomedical Imaging Archive) image archives without logging in.
Intrepid Bioinformatics serves as a community for genetic researchers and scientific programmers who need to achieve meaningful use of their genetic research data – but can’t spend tremendous amounts of time or money in the process. The Intrepid Bioinformatics system automates time consuming manual processes, shortens workflow, and eliminates the threat of lost data in a faster, cheaper, and better environment than existing solutions. The system also provides the functionality and community features needed to analyze the large volumes of Next Generation Sequencing and Single Nucleotide Polymorphism data, which is generated for a wide range of purposes from disease tracking and animal breeding to medical diagnosis and treatment.
BindingDB is a public, web-accessible database of measured binding affinities, focusing chiefly on the interactions of proteins considered to be candidate drug-targets with ligands that are small, drug-like molecules. BindingDB supports medicinal chemistry and drug discovery via literature awareness and development of structure-activity relations (SAR and QSAR); validation of computational chemistry and molecular modeling approaches such as docking, scoring and free energy methods; chemical biology and chemical genomics; and basic studies of the physical chemistry of molecular recognition. BindingDB also includes a small collection of host-guest binding data of interest to chemists studying supramolecular systems. The data collection derives from a variety of measurement techniques, including enzyme inhibition and kinetics, isothermal titration calorimetry, NMR, and radioligand and competition assays. BindingDB includes data extracted from the literature by the BindingDB project, selected PubChem confirmatory BioAssays, and ChEMBL entries for which a well defined protein target ("TARGET_TYPE='PROTEIN'") is provided. Data extracted by BindingDB typically includes more details regarding experimental conditions, etc
The Autism Chromosome Rearrangement Database is a collection of hand curated breakpoints and other genomic features, related to autism, taken from publicly available literature: databases and unpublished data. The database is continuously updated with information from in-house experimental data as well as data from published research studies.
The CMU Multi-Modal Activity Database (CMU-MMAC) database contains multimodal measures of the human activity of subjects performing the tasks involved in cooking and food preparation. The CMU-MMAC database was collected in Carnegie Mellon's Motion Capture Lab. A kitchen was built and to date twenty-five subjects have been recorded cooking five different recipes: brownies, pizza, sandwich, salad, and scrambled eggs.
MIT’s implementation of OpenGeoportal is called MIT Geoweb. It was collaboratively developed as an open source, federated web application to discover, preview, and retrieve geospatial data from different repositories. Several of the country's leading universities and a state agency have formed a partnership to make thousands of geospatial data layers available through a single, open source interface. The application also incorporates some new innovative search techniques. Partners include Tufts, Harvard, MIT, Princeton, MassGIS, Stanford and UC Berkeley. The single interface is skinnable and may have slight differences in appearance based on the institution hosting the application. You can search for GIS data held in the MIT Geodata Repository and other local colleges.
The GHDx is our user-friendly and searchable data catalog for global health, demographic, and other health-related datasets. It provides detailed information about datasets ranging from censuses and surveys to health records and vital statistics, globally. It also serves as a platform for data owners to share their data with the public. The GDB Compare visualization, which allows the user to see rate of change in disease incidence, globally or by country, by age or across all ages, is especially powerful as a tool. Be sure to try adding a bottom chart, like the map, to augment the treemap that loads by default in the top chart.
>>>!!!<<< SMD has been retired. After approximately fifteen years of microarray-centric research service, the Stanford Microarray Database has been retired. We apologize for any inconvenience; please read below for possible resolutions to your queries. If you are looking for any raw data that was directly linked to SMD from a manuscript, please search one of the public repositories. NCBI Gene Expression Omnibus EBI ArrayExpress All published data were previously communicated to one (or both) of the public repositories. Alternatively, data for publications between 1997 and 2004 were likely migrated to the Princeton University MicroArray Database, and are accessible there. If you are looking for a manuscript supplement (i.e. from a domain other than, perhaps try searching the Internet Archive: Wayback Machine . >>>!!!<<< The Stanford Microarray Database (SMD) is a DNA microarray research database that provides a large amount of data for public use.
The RRUFF Project is creating a complete set of high quality spectral data from well characterized minerals and is developing the technology to share this information with the world. The collected data provides a standard for mineralogists, geoscientists, gemologists and the general public for the identification of minerals both on earth and for planetary exploration.Electron microprobe analysis is used to determine the chemistry of each mineral.
The GOES Space Environment Monitor archive is an important component of the National Space Weather Program --a interagency program to provide timely and reliable space environment observations and forecasts. GOES satellites carry onboard a Space Environment Monitor subsystem that measures X-rays, Energetic Particles and Magnetic Field at the Spacecraft.
In the digital collections, you can take a look at the digitized prints from the holdings of the ULB Düsseldorf free of cost. In special collections, the ULB unites rare, valuable and unique parts of holdings that are installed as an ensemble. Deposita, unpublished works, donations, acquisition of rare books etc. were and are an important source for the constant growth of the library. These treasures and specialties - beyond their academic value - also contribute substantially to the profile of the ULB.
META-SHARE, the open language resource exchange facility, is devoted to the sustainable sharing and dissemination of language resources (LRs) and aims at increasing access to such resources in a global scale. META-SHARE is an open, integrated, secure and interoperable sharing and exchange facility for LRs (datasets and tools) for the Human Language Technologies domain and other applicative domains where language plays a critical role. META-SHARE is implemented in the framework of the META-NET Network of Excellence. It is designed as a network of distributed repositories of LRs, including language data and basic language processing tools (e.g., morphological analysers, PoS taggers, speech recognisers, etc.). Data and tools can be both open and with restricted access rights, free and for-a-fee.
Online materials database (known as PAULING FILE project) with nearly 2 million entries: physical properties, crystal structures, phase diagrams, available via API, ready for modern data-intensive applications. The source of these entries are about 300,000 peer-reviewed publications in materials science, processed during the last 16 years by an international team of PhD editors. The results are presented online with a quick search interface. The basic access is provided for free.
Neotoma is a multiproxy paleoecological database that covers the Pliocene-Quaternary, including modern microfossil samples. The database is an international collaborative effort among individuals from 19 institutions, representing multiple constituent databases. There are over 20 data-types within the Neotoma Paleoecological Database, including pollen microfossils, plant macrofossils, vertebrate fauna, diatoms, charcoal, biomarkers, ostracodes, physical sedimentology and water chemistry. Neotoma provides an underlying cyberinfrastructure that enables the development of common software tools for data ingest, discovery, display, analysis, and distribution, while giving domain scientists control over critical taxonomic and other data quality issues.
HPIDB is a public resource, which integrates experimental PPIs from various databases into a single database. The Host-Pathogen Interaction Database (HPIDB) is a genomics resource devoted to understanding molecular interactions between key organisms and the pathogens to which they are susceptible.
The CiardRING is a global directory of web-based information services and datasets for agricultural research for development (ARD). It is the principal tool created through the CIARD initiative to allow information providers to register their services and datasets in various categories and so facilitate the discovery of sources of agriculture-related information across the world. The RING aims to provide an infrastructure to improve the accessibility of the outputs of agricultural research and of information relevant to agriculture.
The programme "International Oceanographic Data and Information Exchange" (IODE) of the "Intergovernmental Oceanographic Commission" (IOC) of UNESCO was established in 1961. Its purpose is to enhance marine research, exploitation and development, by facilitating the exchange of oceanographic data and information between participating Member States, and by meeting the needs of users for data and information products.
The UWA Research Repository contains research publications, research datasets and theses created by researchers and postgraduates affiliated with UWA. It is managed by the University Library and provides access to research datasets held at the University of Western Australia. The information about each dataset has been provided by UWA research groups. Dataset metadata is harvested into Research Data Australia (RDA: Language: The user interface language of the research data repository.
OASIS-3 is the latest release in the Open Access Series of Imaging Studies (OASIS) that aimed at making neuroimaging datasets freely available to the scientific community. By compiling and freely distributing this multi-modal dataset, we hope to facilitate future discoveries in basic and clinical neuroscience. Previously released data for OASIS-Cross-sectional (Marcus et al, 2007) and OASIS-Longitudinal (Marcus et al, 2010) have been utilized for hypothesis driven data analyses, development of neuroanatomical atlases, and development of segmentation algorithms. OASIS-3 is a longitudinal neuroimaging, clinical, cognitive, and biomarker dataset for normal aging and Alzheimer’s Disease. The OASIS datasets hosted by provide the community with open access to a significant database of neuroimaging and processed imaging data across a broad demographic, cognitive, and genetic spectrum an easily accessible platform for use in neuroimaging, clinical, and cognitive research on normal aging and cognitive decline. All data is available via
This interactive database provides complete access to statistics on seasonal cotton supply and use for each country and each region in the world, from 1920/21 to date. This project is part of ICAC’s efforts to improve the transparency of world cotton statistics.
Open Government Data Portal of Tamil Nadu is a platform (designed by the National Informatics Centre), for Open Data initiative of the Government of Tamil Nadu. The portal is intended to publish datasets collected by the Tamil Nadu Government for public uses in different perspective. It has been created under Software as A Service (SaaS) model of Open Government Data (OGD) and publishes dataset in open formats like CSV, XLS, ODS/OTS, XML, RDF, KML, GML, etc. This data portal has following modules, namely (a) Data Management System (DMS) for contributing data catalogs by various state government agencies for making those available on the front end website after a due approval process through a defined workflow; (b) Content Management System (CMS) for managing and updating various functionalities and content types; (c) Visitor Relationship Management (VRM) for collating and disseminating viewer feedback on various data catalogs; and (d) Communities module for community users to interact and share their views and common interests with others. It includes different types of datasets generated both in geospatial and non-spatial data classified as shareable data and non-shareable data. Geospatial data consists primarily of satellite data, maps, etc.; and non-spatial data derived from national accounts statistics, price index, census and surveys produced by a statistical mechanism. It follows the principle of data sharing and accessibility via Openness, Flexibility, Transparency, Quality, Security and Machine-readable.
The Durham High Energy Physics Database (HEPData), formerly: the Durham HEPData Project, has been built up over the past four decades as a unique open-access repository for scattering data from experimental particle physics. It currently comprises the data points from plots and tables related to several thousand publications including those from the Large Hadron Collider (LHC). The Durham HepData Project has for more than 25 years compiled the Reactions Database containing what can be loosly described as cross sections from HEP scattering experiments. The data comprise total and differential cross sections, structure functions, fragmentation functions, distributions of jet measures, polarisations, etc... from a wide range of interactions. In the new HEPData site (, you can explore new functionalities for data providers and data consumers, as well as the submission interface. HEPData is operated by CERN and IPPP at Durham University and is based on the digital library framework Invenio.
NED is a comprehensive database of multiwavelength data for extragalactic objects, providing a systematic, ongoing fusion of information integrated from hundreds of large sky surveys and tens of thousands of research publications. The contents and services span the entire observed spectrum from gamma rays through radio frequencies. As new observations are published, they are cross- identified or statistically associated with previous data and integrated into a unified database to simplify queries and retrieval. Seamless connectivity is also provided to data in NASA astrophysics mission archives (IRSA, HEASARC, MAST), to the astrophysics literature via ADS, and to other data centers around the world.
The Plant Metabolic Network (PMN) provides a broad network of plant metabolic pathway databases that contain curated information from the literature and computational analyses about the genes, enzymes, compounds, reactions, and pathways involved in primary and secondary metabolism in plants. The PMN currently houses one multi-species reference database called PlantCyc and 22 species/taxon-specific databases.
The Scholarly Database (SDB) at Indiana University aims to serve researchers and practitioners interested in the analysis, modeling, and visualization of large-scale scholarly datasets. The online interface provides access to six datasets: MEDLINE papers, registered Clinical Trials, U.S. Patent and Trademark Office patents (USPTO), National Science Foundation (NSF) funding, National Institutes of Health (NIH) funding, and National Endowment for the Humanities funding – over 26 million records in total.