Reset all


Content Types


AID systems



Data access

Data access restrictions

Database access

Database licenses

Data licenses

Data upload

Data upload restrictions

Enhanced publication

Institution responsibility type

Institution type


Metadata standards

PID systems

Provider types

Quality management

Repository languages



Repository types


  • * at the end of a keyword allows wildcard searches
  • " quotes can be used for searching phrases
  • + represents an AND search (default)
  • | represents an OR search
  • - represents a NOT operation
  • ( and ) implies priority
  • ~N after a word specifies the desired edit distance (fuzziness)
  • ~N after a phrase specifies the desired slop amount
  • 1 (current)
Found 23 result(s)
The OpenMadrigal project seeks to develop and support an on-line database for geospace data. The project has been led by MIT Haystack Observatory since 1980, but now has active support from Jicamarca Observatory and other community members. Madrigal is a robust, World Wide Web based system capable of managing and serving archival and real-time data, in a variety of formats, from a wide range of ground-based instruments. Madrigal is installed at a number of sites around the world. Data at each Madrigal site is locally controlled and can be updated at any time, but shared metadata between Madrigal sites allow searching of all Madrigal sites at once from any Madrigal site. Data is local; metadata is shared.
As a department of the United States Department of Agriculture (USDA) the National Agricultural Statistics Service (NASS) continually surveys and reports on U.S. agriculture. NASS reports include production and supplies of food and fiber, prices paid and received by farmers, farm labor and wages, farm finances, chemical use, and changes in the demographics of U.S. producers. NASS provides objective and unbiased statistics of states and counties, while safeguarding the privacy of farmers and ranchers.
>>>>!!!<<< As stated 2017-06-27 The website is no longer available; repository software is archived on github >>>!!!<<< The ResearchCompendia platform is an attempt to use the web to enhance the reproducibility and verifiability—and thus the reliability—of scientific research. we provide the tools to publish the "actual scholarship" by hosting data, code, and methods in a form that is accessible, trackable, and persistent. Some of our short term goals include: To expand and enhance the platform including adding executability for a greater variety of coding languages and frameworks, and enhancing output presentation. To expand usership and to test the ResearchCompendia model in a number of additional fields, including computational mathematics, statistics, and biostatistics. To pilot integration with existing scholarly platforms, enabling researchers to discover relevant Research Compendia websites when looking at online articles, code repositories, or data archives.
CRAWDAD is the Community Resource for Archiving Wireless Data, a wireless network data resource for the research community. This archive has the capacity to store wireless trace data from many contributing locations, and staff to develop better tools for collecting, anonymizing, and analyzing the data. We work with community leaders to ensure that the archive meets the needs of the research community.
The RESIF Seismic data portal offers access to seismological and other associated geophysical data from permanent and temporary seismic networks operated all over the world by French research institutions and international partners, to support research on source processes and imaging of the Earth's interior at all scales. RESIF (French seismologic and geodetic network) is a French national equipment for the observation and understanding of the solid Earth.
Avibase is an extensive database information system about all birds of the world, containing over 19 million records about 10,000 species and 22,000 subspecies of birds, including distribution information, taxonomy, synonyms in several languages and more. This site is managed by Denis Lepage and hosted by Bird Studies Canada, the Canadian copartner of Birdlife International. Avibase has been a work in progress since 1992 and I am now pleased to offer it as a service to the bird-watching and scientific community.
Country is a research infrastructure that preserves millions of files collected from the web since 1996 and provides a public search service over this information. It contains information in several languages. Periodically it collects and stores information published on the web. Then, it processes the collect data to make it searchable, providing a “Google-like” service that enables searching the past web (English user interface available at This preservation workflow is performed through a large-scale distributed information system and can also accessed through API.
Datatang is a professional data pre-processing company. We are engaged in data collecting, annotating, and customizing to meet our clients’ various needs. We assist our clients from university research labs and company R&D departments to waive trivial yet necessary data processing procedure and make their approach to the highest-value data in a more efficient way.
Socrata’s cloud-based solution allows government organizations to put their data online, make data-driven decisions, operate more efficiently, and share insights with citizens.
The German Text Archive (Deutsches Textarchiv, DTA) presents online a selection of key German-language works in various disciplines from the 17th to 19th centuries. The electronic full-texts are indexed linguistically and the search facilities tolerate a range of spelling variants. The DTA presents German-language printed works from around 1650 to 1900 as full text and as digital facsimile. The selection of texts was made on the basis of lexicographical criteria and includes scientific or scholarly texts, texts from everyday life, and literary works. The digitalisation was made from the first edition of each work. Using the digital images of these editions, the text was first typed up manually twice (‘double keying’). To represent the structure of the text, the electronic full-text was encoded in conformity with the XML standard TEI P5. The next stages complete the linguistic analysis, i.e. the text is tokenised, lemmatised, and the parts of speech are annotated. The DTA thus presents a linguistically analysed, historical full-text corpus, available for a range of questions in corpus linguistics. Thanks to the interdisciplinary nature of the DTA Corpus, it also offers valuable source-texts for neighbouring disciplines in the humanities, and for scientists, legal scholars and economists.
Kaggle is a platform for predictive modelling and analytics competitions in which statisticians and data miners compete to produce the best models for predicting and describing the datasets uploaded by companies and users. This crowdsourcing approach relies on the fact that there are countless strategies that can be applied to any predictive modelling task and it is impossible to know beforehand which technique or analyst will be most effective.
The World Glacier Monitoring Service (WGMS) collects standardized observations on changes in mass, volume, area and length of glaciers with time (glacier fluctuations), as well as statistical information on the distribution of perennial surface ice in space (glacier inventories). Such glacier fluctuation and inventory data are high priority key variables in climate system monitoring; they form a basis for hydrological modelling with respect to possible effects of atmospheric warming, and provide fundamental information in glaciology, glacial geomorphology and quaternary geology. The highest information density is found for the Alps and Scandinavia, where long and uninterrupted records are available. As a contribution to the Global Terrestrial/Climate Observing System (GTOS, GCOS), the Division of Early Warning and Assessment and the Global Environment Outlook of UNEP, and the International Hydrological Programme of UNESCO, the WGMS collects and publishes worldwide standardized glacier data.
The Open PHACTS project will develop an open source, open standards and open access innovation platform, Open Pharmacological Space (OPS), via a semantic web approach. OPS will comprise data, vocabularies and infrastructure needed to accelerate drugoriented research. This semantic integration hub will address key bottlenecks in small molecule drug discovery: disparate information sources, lack of standards and shared concept identifiers, guided by well defined research questions assembled from participating drug discovery teams. Open PHACTS draws together multiple sources of publicly-available pharmacological and physicochemical data, accessible via the Open PHACTS Explorer, an intuitive interface, and the powerful Open PHACTS API.
The Solar Dynamics Observatory (SDO) studies the solar atmosphere on small scales of space and time, in multiple wavelengths. This is a searchable database of all SDO data, including citizen scientist images, space weather and near real time data, and helioseismology data.
ChemSpider is a free chemical structure database providing fast access to over 58 million structures, properties and associated information. By integrating and linking compounds from more than 400 data sources, ChemSpider enables researchers to discover the most comprehensive view of freely available chemical data from a single online search. It is owned by the Royal Society of Chemistry. ChemSpider builds on the collected sources by adding additional properties, related information and links back to original data sources. ChemSpider offers text and structure searching to find compounds of interest and provides unique services to improve this data by curation and annotation and to integrate it with users’ applications.
The United States Health Information Knowledgebase (USHIK) is an on-line, publicly accessible registry and repository of healthcare related data, metadata, and standards.
Europeana is the trusted source of cultural heritage brought to you by the Europeana Foundation and a large number of European cultural institutions, projects and partners. It’s a real piece of team work. Ideas and inspiration can be found within the millions of items on Europeana. These objects include: Images - paintings, drawings, maps, photos and pictures of museum objects Texts - books, newspapers, letters, diaries and archival papers Sounds - music and spoken word from cylinders, tapes, discs and radio broadcasts Videos - films, newsreels and TV broadcasts All texts are CC BY-SA, images and media licensed individually.
Public Data Sets on AWS provides a centralized repository of public data sets that can be seamlessly integrated into AWS cloud-based applications. AWS is hosting the public data sets at no charge to their users. Anyone can access these data sets from their Amazon Elastic Compute Cloud (Amazon EC2) instances and start computing on the data within minutes. Users can also leverage the entire AWS ecosystem and easily collaborate with other AWS users.
The United States Census Bureau (officially the Bureau of the Census, as defined in Title 13 U.S.C. § 11) is the government agency that is responsible for the United States Census. It also gathers other national demographic and economic data. As a part of the United States Department of Commerce, the Census Bureau serves as a leading source of data about America's people and economy. The most visible role of the Census Bureau is to perform the official decennial (every 10 years) count of people living in the U.S. The most important result is the reallocation of the number of seats each state is allowed in the House of Representatives, but the results also affect a range of government programs received by each state. The agency director is a political appointee selected by the President of the United States.
The NASA Exoplanet Archive collects and serves public data to support the search for and characterization of extra-solar planets (exoplanets) and their host stars. The data include published light curves, images, spectra and parameters, and time-series data from surveys that aim to discover transiting exoplanets. Tools are provided to work with the data, particularly the display and analysis of transit data sets from Kepler and CoRoT. All data are validated by the Exoplanet Archive science staff and traced to their sources. The Exoplanet Archive is the U.S. data portal for the CoRoT mission.
InnateDB is a publicly available database of the genes, proteins, experimentally-verified interactions and signaling pathways involved in the innate immune response of humans, mice and bovines to microbial infection. The database captures an improved coverage of the innate immunity interactome by integrating known interactions and pathways from major public databases together with manually-curated data into a centralised resource. The database can be mined as a knowledgebase or used with our integrated bioinformatics and visualization tools for the systems level analysis of the innate immune response.
Trove helps you find and use resources relating to Australia. It’s more than a search engine. Trove brings together content from libraries, museums, archives and other research organisations and gives you tools to explore and build.