  • * at the end of a keyword allows wildcard searches
  • " quotes can be used for searching phrases
  • + represents an AND search (default)
  • | represents an OR search
  • - represents a NOT operation
  • ( and ) implies priority
  • ~N after a word specifies the desired edit distance (fuzziness)
  • ~N after a phrase specifies the desired slop amount
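
The operators listed above can be combined into a single query string. The following is a minimal sketch of how such strings might be assembled; the helper names (`phrase`, `prefix`, `fuzzy`, `sloppy`, `all_of`, `any_of`, `exclude`) are hypothetical and not part of any search API, and the spacing placed around `+` and `|` is an assumption about what the search box accepts.

```python
# Hypothetical helpers that only build query text from the operators in the
# list above; they do not contact any search service.

def phrase(text: str) -> str:
    """Wrap text in double quotes so it is searched as a phrase."""
    return f'"{text}"'

def prefix(stem: str) -> str:
    """Append * to a keyword for a wildcard search."""
    return f"{stem}*"

def fuzzy(word: str, distance: int) -> str:
    """Append ~N to a word to set the desired edit distance."""
    return f"{word}~{distance}"

def sloppy(text: str, slop: int) -> str:
    """Append ~N to a quoted phrase to set the desired slop amount."""
    return f"{phrase(text)}~{slop}"

def all_of(*terms: str) -> str:
    """Join terms with + (AND) and group them with parentheses."""
    return "(" + " + ".join(terms) + ")"

def any_of(*terms: str) -> str:
    """Join terms with | (OR) and group them with parentheses."""
    return "(" + " | ".join(terms) + ")"

def exclude(term: str) -> str:
    """Prefix a term with - for a NOT operation."""
    return f"-{term}"

query = all_of(
    prefix("genom"),
    any_of(phrase("gene expression"), fuzzy("proteome", 1)),
    exclude("clinical"),
)
print(query)
```

Grouping every AND and OR combination in parentheses keeps the priority explicit, so nested combinations do not depend on the engine's default operator precedence.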
Found 22 result(s)
The National Cancer Data Base (NCDB), a joint program of the Commission on Cancer (CoC) of the American College of Surgeons (ACoS) and the American Cancer Society (ACS), is a nationwide oncology outcomes database for more than 1,500 Commission-accredited cancer programs in the United States and Puerto Rico. Some 70 percent of all newly diagnosed cases of cancer in the United States are captured at the institutional level and reported to the NCDB. The NCDB, begun in 1989, now contains approximately 29 million records from hospital cancer registries across the United States. Data on all types of cancer are tracked and analyzed. These data are used to explore trends in cancer care, to create regional and state benchmarks for participating hospitals, and to serve as the basis for quality improvement.
The Open Government Data Portal of Tamil Nadu is a platform, designed by the National Informatics Centre, for the Open Data initiative of the Government of Tamil Nadu. The portal is intended to publish datasets collected by the Tamil Nadu Government for public use from different perspectives. It has been created under the Software as a Service (SaaS) model of Open Government Data (OGD) and publishes datasets in open formats such as CSV, XLS, ODS/OTS, XML, RDF, KML and GML. The portal has the following modules: (a) Data Management System (DMS), through which state government agencies contribute data catalogs that appear on the front-end website after approval through a defined workflow; (b) Content Management System (CMS), for managing and updating the portal's functionalities and content types; (c) Visitor Relationship Management (VRM), for collating and disseminating viewer feedback on data catalogs; and (d) Communities, where community users interact and share their views and common interests with others. The portal holds both geospatial and non-spatial datasets, classified as shareable or non-shareable. Geospatial data consists primarily of satellite data, maps, etc.; non-spatial data is derived from national accounts statistics, price indices, censuses and surveys produced by a statistical mechanism. It follows the data sharing and accessibility principles of openness, flexibility, transparency, quality, security and machine-readability.
The UniProt Knowledgebase (UniProtKB) is the central hub for the collection of functional information on proteins, with accurate, consistent and rich annotation. In addition to capturing the core data mandatory for each UniProtKB entry (mainly, the amino acid sequence, protein name or description, taxonomic data and citation information), as much annotation information as possible is added. This includes widely accepted biological ontologies, classifications and cross-references, and clear indications of the quality of annotation in the form of evidence attribution of experimental and computational data. The Universal Protein Resource (UniProt) is a comprehensive resource for protein sequence and annotation data. The UniProt databases are the UniProt Knowledgebase (UniProtKB), the UniProt Reference Clusters (UniRef), and the UniProt Archive (UniParc). The UniProt Metagenomic and Environmental Sequences (UniMES) database is a repository specifically developed for metagenomic and environmental data. The UniProt Knowledgebase is an expertly and richly curated protein database, consisting of two sections called UniProtKB/Swiss-Prot and UniProtKB/TrEMBL.
Edmond is the institutional repository of the Max Planck Society for public research data. It enables Max Planck scientists to create citable scientific assets by describing, enriching, sharing, exposing, linking, publishing and archiving research data of all kinds. A unique feature of Edmond is its dedicated metadata management, which supports a non-restrictive metadata schema definition, as simple as you like or as complex as your parameters require. Furthermore, all objects within Edmond have a unique identifier and can therefore be clearly referenced in publications or reused in other contexts.
Gemma is a database for the meta-analysis, re-use and sharing of genomics data, currently primarily targeted at the analysis of gene expression profiles. Gemma contains data from thousands of public studies, referencing thousands of published papers. Users can search, access and visualize co-expression and differential expression results.
The ABCD Data Repository houses all data generated by the Adolescent Brain Cognitive Development (ABCD) Study. The ABCD Study is supported by NIH partners (the National Institute on Drug Abuse, the National Institute on Alcohol Abuse and Alcoholism, the National Cancer Institute, the Eunice Kennedy Shriver National Institute of Child Health and Human Development, the National Institute of Mental Health, the National Institute on Minority Health and Health Disparities, the National Institute of Neurological Disorders and Stroke, the NIH Office of Behavioral and Social Sciences Research, and the NIH Office of Research on Women’s Health), as well as the Centers for Disease Control and Prevention – Division of Adolescent and School Health. This repository will store data generated by ABCD investigators, serve as a collaborative platform for harmonizing these data, and share those data with qualified researchers.
Curtin University has 222 data records in Research Data Australia, which cover 199 subject areas including Applied research, EARTH SCIENCES and GEOLOGY and involve 32 group(s).
GeneWeaver combines cross-species data and gene entity integration, scalable hierarchical analysis of user data with a community-built and curated data archive of gene sets and gene networks, and tools for data-driven comparison of user-defined biological, behavioral and disease concepts. GeneWeaver allows users to integrate gene sets across species, tissue and experimental platform. It differs from conventional gene set over-representation analysis tools in that it allows users to evaluate intersections among all combinations of a collection of gene sets, including, but not limited to, annotations to controlled vocabularies. There are numerous applications of this approach. Sets can be stored, shared and compared privately, among user-defined groups of investigators, and across all users.
This portal is a platform supporting the Open Data initiative of the Government of Odisha and is intended to publish datasets collected by the government for public use. It supports widely used file formats that are suitable for machine processing, thus opening avenues for many more innovative uses of government data from different perspectives. The portal has been created under the Software as a Service (SaaS) model of the Open Government Data (OGD) Platform India of NIC. The data available in the portal are owned by various departments and organizations of the Government of Odisha. It follows the data sharing and accessibility principles of openness, flexibility, transparency, quality, security and machine-readability. (Clinical trials) is a registry and results database of publicly and privately supported clinical studies of human participants conducted around the world.
MassBank of North America (MoNA) is a metadata-centric, auto-curating repository designed for efficient storage and querying of mass spectral records. It intends to serve as the framework for a centralized, collaborative database of metabolite mass spectra, metadata and associated compounds. MoNA currently contains over 200,000 mass spectral records from experimental and in-silico libraries as well as from user contributions.
The Global Proteome Machine (GPM) is a protein identification database. This data repository allows users to post and compare results. GPM's data is provided by contributors such as The Informatics Factory, the University of Michigan, and Pacific Northwest National Laboratory. The GPM searchable databases are: GPMDB, pSYT, SNAP, MRM, PEPTIDE and HOT.
The European Genome-phenome Archive (EGA) is designed to be a repository for all types of sequence and genotype experiments, including case-control, population, and family studies. We will include SNP and CNV genotypes from array-based methods and genotyping done with re-sequencing methods. The EGA will serve as a permanent archive that will hold several levels of data, including the raw data (which could, for example, be re-analysed in the future by other algorithms) as well as the genotype calls provided by the submitters. We are developing data mining and access tools for the database. For controlled access data, the EGA will provide the necessary security required to control access, and maintain patient confidentiality, while providing access to those researchers and clinicians authorised to view the data. In all cases, data access decisions will be made by the appropriate data access-granting organisation (DAO) and not by the EGA. The DAO will normally be the same organisation that approved and monitored the initial study protocol or a designate of this approving organisation. The European Genome-phenome Archive (EGA) allows you to explore datasets from genomic studies, provided by a range of data providers. Access to datasets must be approved by the specified Data Access Committee (DAC).
Kenya Open Data offers visualization tools, data downloads, and easy access for software developers. Kenya Open Data makes core government development, demographic, statistical and expenditure data available to researchers, policymakers, developers and the general public. Kenya is the first developing country to have an open government data portal, the first in sub-Saharan Africa and second on the continent after Morocco. The initiative has been widely acclaimed globally as one of the most significant steps Kenya has made to improve governance and implement the new Constitution’s provisions on access to information.
The Open Government Data Portal of Sikkim is a platform supporting the Open Data initiative of the Government of Sikkim. The portal is intended to be used by departments and organizations of the Government of Sikkim to publish datasets, documents, services, tools and applications collected by them for public use. It aims to increase transparency in the functioning of the state government and to open avenues for many more innovative uses of government data from different perspectives. The portal is designed and developed by the Open Government Data Division of the National Informatics Centre (NIC), Department of Electronics and Information Technology (DeitY), Government of India. It has been created under the Software as a Service (SaaS) model of the Open Government Data (OGD) Platform India of NIC. The data available in the portal are owned by various departments and organizations of the Government of Sikkim. The portal has the following modules: Data Management System (DMS), through which state government agencies contribute data catalogs that appear on the front-end website after approval through a defined workflow; Content Management System (CMS), for managing and updating the portal's functionalities and content types; Visitor Relationship Management (VRM), for collating and disseminating viewer feedback on data catalogs; and Communities, where community users interact and share their enthusiasm and views with others who have common interests.
The Gulf of Mexico Research Initiative Information and Data Cooperative (GRIIDC) is a team of researchers, data specialists and computer system developers who are supporting the development of a data management system to store scientific data generated by Gulf of Mexico researchers. The Master Research Agreement between BP and the Gulf of Mexico Alliance that established the Gulf of Mexico Research Initiative (GoMRI) included provisions that all data collected or generated through the agreement must be made available to the public. The Gulf of Mexico Research Initiative Information and Data Cooperative (GRIIDC) is the vehicle through which GoMRI is fulfilling this requirement. The mission of GRIIDC is to ensure a data and information legacy that promotes continual scientific discovery and public awareness of the Gulf of Mexico Ecosystem.
BioGPS is a free, extensible and customizable gene annotation portal built with two guiding principles in mind: customizability and extensibility. It is a complete resource for learning about gene and protein function.
The Government is releasing public data to help people understand how government works and how policies are made. Some of this data is already available, but this portal brings it together in one searchable website. Making this data easily available means it will be easier for people to make decisions and suggestions about government policies based on detailed information.
The Social Science Data Archive (SSDA) is still active and maintained as part of the UCLA Library Data Science Center. SSDA Dataverse is one of the archiving options offered by SSDA; data can also be archived by SSDA itself, by ICPSR, by the UCLA Library, or by the California Digital Library. The Social Science Data Archive serves the UCLA campus as an archive of faculty and graduate student survey research. We provide long-term storage of data files and documentation. We ensure that the data remain usable in the future by migrating files to new operating systems. We follow government standards and archival best practices. The mission of the Social Science Data Archive has been and continues to be to provide a foundation for social science research, with faculty support throughout an entire research project involving original data collection or the reuse of publicly available studies. Data Archive staff and researchers work as partners throughout all stages of the research process, beginning when a hypothesis or area of study is being developed, during grant and funding activities, while data collection and/or analysis is ongoing, and finally in long-term preservation of research results. Our role is to provide a collaborative environment where the focus is on understanding the nature and scope of the research approach and the management of research output throughout the entire life cycle of the project. Instructional support, especially support that links research with instruction, is also a mainstay of operations.
The SICAS Medical Image Repository (SMIR) is a freely accessible repository containing medical research data including medical images, surface models, clinical data, genomics data and statistical shape models. The data can be freely organized and shared on SMIR and made publicly accessible with a DOI. Dedicated data sets are organized as collections of anatomical regions (e.g. cochlea). The data can be filtered using a modular search and accessed on the web or through the SMIR API.
The Ensembl project produces genome databases for vertebrates and other eukaryotic species. Ensembl is a joint project between the European Bioinformatics Institute (EBI), an outstation of the European Molecular Biology Laboratory (EMBL), and the Wellcome Trust Sanger Institute (WTSI) to develop a software system that produces and maintains automatic annotation on selected genomes. The Ensembl project was started in 1999, some years before the draft human genome was completed. Even at that early stage it was clear that manual annotation of 3 billion base pairs of sequence would not be able to offer researchers timely access to the latest data. The goal of Ensembl was therefore to automatically annotate the genome, integrate this annotation with other available biological data and make all this publicly available via the web. Since the website's launch in July 2000, many more genomes have been added to Ensembl and the range of available data has also expanded to include comparative genomics, variation and regulatory data. Both institutes are located on the Wellcome Trust Genome Campus in Hinxton, south of the city of Cambridge, United Kingdom.