  • * at the end of a keyword allows wildcard searches
  • " quotes can be used for searching phrases
  • + represents an AND search (default)
  • | represents an OR search
  • - represents a NOT operation
  • ( and ) imply priority
  • ~N after a word specifies the desired edit distance (fuzziness)
  • ~N after a phrase specifies the desired slop amount
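The operators listed above can be combined into a single query string. The following is a minimal sketch in Python, assuming the search box accepts the operators exactly as listed. The helper names (`phrase`, `prefix`, `fuzzy`, `any_of`, `all_of`, `exclude`) are hypothetical; they only build a string and do not call any search service.

```python
from typing import Optional


def phrase(text: str, slop: Optional[int] = None) -> str:
    """Quote a phrase; optionally append ~N to allow N words of slop."""
    quoted = f'"{text}"'
    return quoted if slop is None else f"{quoted}~{slop}"


def prefix(stem: str) -> str:
    """Wildcard search: * at the end of a keyword."""
    return f"{stem}*"


def fuzzy(word: str, distance: int) -> str:
    """Fuzzy search: ~N after a word sets the edit distance."""
    return f"{word}~{distance}"


def any_of(*terms: str) -> str:
    """OR search, wrapped in parentheses to set priority."""
    return "(" + " | ".join(terms) + ")"


def all_of(*terms: str) -> str:
    """AND search (the default), written out explicitly with +."""
    return " + ".join(terms)


def exclude(term: str) -> str:
    """NOT operation: - in front of a term."""
    return f"-{term}"


query = all_of(
    prefix("climat"),
    any_of(phrase("data archive", slop=2), fuzzy("repository", 1)),
    exclude("genome"),
)
print(query)  # climat* + ("data archive"~2 | repository~1) + -genome
```

How the resulting string is parsed depends on the search backend, so the query is best checked against the live search form before relying on it.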
Found 24 result(s)
Stanford Network Analysis Platform (SNAP) is a general purpose network analysis and graph mining library. It is written in C++ and easily scales to massive networks with hundreds of millions of nodes and billions of edges. It efficiently manipulates large graphs, calculates structural properties, generates regular and random graphs, and supports attributes on nodes and edges. SNAP is also available through NodeXL, a graphical front-end that integrates network analysis into Microsoft Office and Excel. The SNAP library has been actively developed since 2004 and is growing organically as a result of our research pursuits in the analysis of large social and information networks. The largest network we have analyzed so far using the library was the Microsoft Instant Messenger network from 2006, with 240 million nodes and 1.3 billion edges. The datasets available on the website were mostly collected (scraped) for the purposes of our research. The website was launched in July 2009.
Research Data Australia is the data discovery service of the Australian National Data Service (ANDS). We do not store the data itself here but provide descriptions of, and links to, the data from our data publishing partners. ANDS is funded by the Australian Government through the National Collaborative Research Infrastructure Strategy (NCRIS).
The BCDC serves the research data obtained, and the data syntheses assembled, by researchers within the Bjerknes Centre for Climate Research. It is also open to all interested scientists, independent of institution. All data from the different disciplines (e.g. geology, oceanography, biology, model community) will be archived in a long-term repository, interconnected and made publicly available by the BCDC. BCDC collaborates with many international data repositories and actively archives metadata and data at those repositories, ensuring quality and FAIRness. BCDC's main focus is on data management services for externally and internally funded projects in the field of climate research; it provides data management plans and ensures that data is archived according to best practices in the field. The data management services range from project work for small externally funded projects to state-of-the-art data management services for research infrastructures on the ESFRI roadmap (e.g. RI ICOS – Integrated Carbon Observation System). BCDC also provides products and services for Copernicus Marine Environmental Monitoring Services. In addition, BCDC advises various communities on data management services, e.g. IOC UNESCO, OECD, IAEA and various funding agencies. BCDC will become an Associated Data Unit (ADU) under IODE, the International Oceanographic Data and Information Exchange, a worldwide network that operates under the auspices of the Intergovernmental Oceanographic Commission of UNESCO, and aims to become a part of the ICSU World Data System.
The Rat Genome Database is a collaborative effort between leading research institutions involved in rat genetic and genomic research. Its goal, as stated in RFA: HL-99-013, is the establishment of a Rat Genome Database to collect, consolidate, and integrate data generated from ongoing rat genetic and genomic research efforts and to make these data widely available to the scientific community. A secondary but critical goal is to provide curation of mapped positions for quantitative trait loci, known mutations and other phenotypic data.
The European Nucleotide Archive (ENA) captures and presents information relating to experimental workflows that are based around nucleotide sequencing. A typical workflow includes the isolation and preparation of material for sequencing, a run of a sequencing machine in which sequencing data are produced and a subsequent bioinformatic analysis pipeline. ENA records this information in a data model that covers input information (sample, experimental setup, machine configuration), output machine data (sequence traces, reads and quality scores) and interpreted information (assembly, mapping, functional annotation). Data arrive at ENA from a variety of sources. These include submissions of raw data, assembled sequences and annotation from small-scale sequencing efforts, data provision from the major European sequencing centres and routine and comprehensive exchange with our partners in the International Nucleotide Sequence Database Collaboration (INSDC). Provision of nucleotide sequence data to ENA or its INSDC partners has become a central and mandatory step in the dissemination of research findings to the scientific community. ENA works with publishers of scientific literature and funding bodies to ensure compliance with these principles and to provide optimal submission systems and data access tools that work seamlessly with the published literature.
The JPL Tropical Cyclone Information System (TCIS) was developed to support hurricane research. TCIS has three components: a global archive of multi-satellite hurricane observations from 1999 to 2010 (Tropical Cyclone Data Archive), the North Atlantic Hurricane Watch, and the NASA Convective Processes Experiment (CPEX) aircraft campaign. Together, data and visualizations from the real-time system and the data archive can be used to study hurricane processes, validate and improve models, and assist in developing new algorithms and data assimilation techniques.
Biological collections are replete with taxonomic, geographic, temporal, numerical, and historical information. This information is crucial for understanding and properly managing biodiversity and ecosystems, but is often difficult to access. Canadensys, operated from the Université de Montréal Biodiversity Centre, is a Canada-wide effort to unlock the biodiversity information held in biological collections.
ChemSpider is a free chemical structure database providing fast access to over 58 million structures, properties and associated information. By integrating and linking compounds from more than 400 data sources, ChemSpider enables researchers to discover the most comprehensive view of freely available chemical data from a single online search. It is owned by the Royal Society of Chemistry. ChemSpider builds on the collected sources by adding properties, related information and links back to the original data sources. ChemSpider offers text and structure searching to find compounds of interest, and provides unique services to improve this data through curation and annotation and to integrate it with users’ applications.
The Substance Abuse and Mental Health Data Archive (SAMHDA) is an initiative funded under contract HHSS283201500001C with the Center for Behavioral Health Statistics and Quality (CBHSQ), Substance Abuse and Mental Health Services Administration (SAMHSA), U.S. Department of Health and Human Services (HHS). CBHSQ has primary responsibility for the collection, analysis, and dissemination of SAMHSA's behavioral health data. Public use files and restricted use files are provided. CBHSQ promotes the access and use of the nation's substance abuse and mental health data through SAMHDA. SAMHDA provides public-use data files, file documentation, and access to restricted-use data files to support a better understanding of this critical area of public health.
GenBank® is a comprehensive database that contains publicly available nucleotide sequences for almost 260 000 formally described species. These sequences are obtained primarily through submissions from individual laboratories and batch submissions from large-scale sequencing projects, including whole-genome shotgun (WGS) and environmental sampling projects. Most submissions are made using the web-based BankIt or standalone Sequin programs, and GenBank staff assigns accession numbers upon data receipt. Daily data exchange with the European Nucleotide Archive (ENA) and the DNA Data Bank of Japan (DDBJ) ensures worldwide coverage. GenBank is accessible through the NCBI Entrez retrieval system, which integrates data from the major DNA and protein sequence databases along with taxonomy, genome, mapping, protein structure and domain information, and the biomedical journal literature via PubMed. BLAST provides sequence similarity searches of GenBank and other sequence databases. Complete bimonthly releases and daily updates of the GenBank database are available by FTP.
The Government is releasing public data to help people understand how government works and how policies are made. Some of this data is already available, but this site brings it together in one searchable website. Making this data easily available means it will be easier for people to make decisions and suggestions about government policies based on detailed information.
The Open PHACTS project will develop an open source, open standards and open access innovation platform, Open Pharmacological Space (OPS), via a semantic web approach. OPS will comprise the data, vocabularies and infrastructure needed to accelerate drug-oriented research. This semantic integration hub will address key bottlenecks in small molecule drug discovery: disparate information sources, and a lack of standards and shared concept identifiers. It is guided by well-defined research questions assembled from participating drug discovery teams. Open PHACTS draws together multiple sources of publicly available pharmacological and physicochemical data, accessible via the Open PHACTS Explorer, an intuitive interface, and the powerful Open PHACTS API.
Europeana is the trusted source of cultural heritage brought to you by the Europeana Foundation and a large number of European cultural institutions, projects and partners. It’s a real piece of teamwork. Ideas and inspiration can be found within the millions of items on Europeana. These objects include:
  • Images: paintings, drawings, maps, photos and pictures of museum objects
  • Texts: books, newspapers, letters, diaries and archival papers
  • Sounds: music and spoken word from cylinders, tapes, discs and radio broadcasts
  • Videos: films, newsreels and TV broadcasts
All texts are CC BY-SA; images and media are licensed individually.
Research Data Unipd is a data archive that supports research produced by the members of the University of Padova. The service aims to facilitate data discovery, data sharing, and reuse, as required by funding institutions (e.g. the European Commission). Datasets published in the archive have a set of metadata that ensures proper description and discoverability.
InnateDB is a publicly available database of the genes, proteins, experimentally-verified interactions and signaling pathways involved in the innate immune response of humans, mice and bovines to microbial infection. The database captures an improved coverage of the innate immunity interactome by integrating known interactions and pathways from major public databases together with manually-curated data into a centralised resource. The database can be mined as a knowledgebase or used with our integrated bioinformatics and visualization tools for the systems level analysis of the innate immune response.
The United States Census Bureau (officially the Bureau of the Census, as defined in Title 13 U.S.C. § 11) is the government agency that is responsible for the United States Census. It also gathers other national demographic and economic data. As a part of the United States Department of Commerce, the Census Bureau serves as a leading source of data about America's people and economy. The most visible role of the Census Bureau is to perform the official decennial (every 10 years) count of people living in the U.S. The most important result is the reallocation of the number of seats each state is allowed in the House of Representatives, but the results also affect a range of government programs received by each state. The agency director is a political appointee selected by the President of the United States.
The NASA Exoplanet Archive collects and serves public data to support the search for and characterization of extra-solar planets (exoplanets) and their host stars. The data include published light curves, images, spectra and parameters, and time-series data from surveys that aim to discover transiting exoplanets. Tools are provided to work with the data, particularly the display and analysis of transit data sets from Kepler and CoRoT. All data are validated by the Exoplanet Archive science staff and traced to their sources. The Exoplanet Archive is the U.S. data portal for the CoRoT mission.
The Energy Data eXchange (EDX) is an online collection of capabilities and resources that advance research and customize energy-related needs. EDX is developed and maintained by NETL-RIC researchers and technical computing teams to support private collaboration for ongoing research efforts, and tech transfer of finalized DOE NETL research products. EDX supports NETL-affiliated research by: coordinating historical and current data and information from a wide variety of sources to facilitate access to research that crosscuts multiple NETL projects/programs; providing external access to technical products and data published by NETL-affiliated research teams; and collaborating with a variety of organizations and institutions in a secure environment through EDX’s Collaborative Workspaces.
The Arctic Data Center is the primary data and software repository for the Arctic section of NSF Polar Programs. The Center helps the research community to reproducibly preserve and discover all products of NSF-funded research in the Arctic, including data, metadata, software, documents, and the provenance that links these together. The repository is open to contributions from NSF Arctic investigators, and data are released under an open license (CC-BY or CC0, depending on the choice of the contributor). All science, engineering, and education research supported by the NSF Arctic research program is included, such as Natural Sciences (Geoscience, Earth Science, Oceanography, Ecology, Atmospheric Science, Biology, etc.) and Social Sciences (Archeology, Anthropology, Social Science, etc.). Key to the initiative is the partnership between NCEAS at UC Santa Barbara, DataONE, and NOAA’s NCEI, each of which brings critical capabilities to the Center. Infrastructure from the successful NSF-sponsored DataONE federation of data repositories enables data replication to NCEI, providing both offsite and institutional diversity, which are critical to long-term preservation.
Rodare is the institutional research data repository at HZDR (Helmholtz-Zentrum Dresden-Rossendorf). Rodare allows HZDR researchers to upload their research data and enrich it with metadata to make it findable, accessible, interoperable and reusable (FAIR). Publishing all associated research data via Rodare can improve research reproducibility. Uploads receive a Digital Object Identifier (DOI) and can be harvested via an OAI-PMH interface.
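Harvesting over OAI-PMH comes down to sending HTTP requests with a `verb` parameter to the repository's endpoint. The following is a minimal sketch in Python that only builds such a request URL. The base URL is a placeholder (the listing above does not give Rodare's actual endpoint), and the helper name `list_records_url` is hypothetical. `ListRecords` and the `oai_dc` metadata prefix are part of the OAI-PMH protocol itself.

```python
from typing import Optional
from urllib.parse import urlencode

# Placeholder endpoint for illustration only; replace it with the
# OAI-PMH base URL documented by the repository you want to harvest.
BASE_URL = "https://repository.example.org/oai"


def list_records_url(
    base_url: str,
    metadata_prefix: str = "oai_dc",
    set_spec: Optional[str] = None,
) -> str:
    """Build an OAI-PMH ListRecords request URL."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec is not None:
        # Optional selective harvesting by set.
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


url = list_records_url(BASE_URL)
print(url)
```

The response to such a request is an XML document. Large result lists are paged with a `resumptionToken`, which a full harvester would need to follow.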