  • * at the end of a keyword allows wildcard searches
  • " quotes can be used for searching phrases
  • + represents an AND search (default)
  • | represents an OR search
  • - represents a NOT operation
  • ( and ) group terms to set precedence
  • ~N after a word specifies the desired edit distance (fuzziness)
  • ~N after a phrase specifies the desired slop amount
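As an illustration of how the operators listed above combine, here is a minimal sketch that assembles a query string in Python. The helper names (wildcard, phrase, fuzzy, any_of, all_of, without) and the sample search terms are hypothetical and not part of this site; the code only builds the text you would type into the search box and makes no claim about what the search returns.

```python
from typing import Optional


def wildcard(stem: str) -> str:
    """Append * so the keyword matches any ending (wildcard search)."""
    return f"{stem}*"


def phrase(text: str, slop: Optional[int] = None) -> str:
    """Wrap text in quotes for a phrase search; ~N adds an optional slop amount."""
    quoted = f'"{text}"'
    return quoted if slop is None else f"{quoted}~{slop}"


def fuzzy(word: str, distance: int) -> str:
    """Append ~N to a single word to request the given edit distance."""
    return f"{word}~{distance}"


def any_of(*terms: str) -> str:
    """Join terms with | (OR) and parenthesize them so they are evaluated together."""
    return "(" + " | ".join(terms) + ")"


def all_of(*terms: str) -> str:
    """Join terms with + (AND, which the list above describes as the default)."""
    return " + ".join(terms)


def without(query: str, term: str) -> str:
    """Append a -term (NOT) clause to an existing query."""
    return f"{query} -{term}"


# Hypothetical example terms, chosen only to exercise each operator once.
query = without(
    all_of(
        wildcard("proteom"),
        any_of(phrase("mass spectrometry"), fuzzy("peptide", 1)),
    ),
    "offline",
)
print(query)
```

The resulting string can be pasted into the search field as-is. Keeping each operator in its own small function makes it easy to see which part of the query a given symbol contributes.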
Found 249 result(s)
The CancerData site is an effort of the Medical Informatics and Knowledge Engineering team (MIKE for short) of Maastro Clinic, Maastricht, The Netherlands. Our activities in the field of medical image analysis and data modelling are visible in a number of projects we are running. CancerData is offering several datasets. They are grouped in collections and can be public or private. You can search for public datasets in the NBIA (National Biomedical Imaging Archive) image archives without logging in.
Peptidome was a public repository that archived tandem mass spectrometry peptide and protein identification data generated by the scientific community. This repository is now offline and is in archival mode. All data may be obtained from the Peptidome FTP site. Due to budgetary constraints NCBI has discontinued the Peptidome Repository. All existing data and metadata files will continue to be made available from our FTP server indefinitely. Those files are named according to their Peptidome accession number, allowing cited data to be identified and downloaded. All of the Peptidome studies have been made publicly available at the PRoteomics IDEntifications (PRIDE) database. A map of Peptidome to PRIDE accessions may be found at If you have any specific questions, please feel free to contact us at
The Entrez Protein Clusters database contains annotation information, publications, structures and analysis tools for related protein sequences encoded by complete genomes. The data available in the Protein Clusters Database is generated from prokaryotic genomic studies and is intended to assist researchers studying micro-organism evolution as well as other biological sciences. Available genomes include plants and viruses as well as organelles and microbial genomes.
ICD serves as the international standard for diagnostic classification for all general epidemiological purposes, many health management purposes, and clinical use. Its uses include the analysis of the general health situation of different population groups, monitoring of the incidence and prevalence of diseases in relation to the characteristics of the individuals affected, reimbursement, resource allocation, quality, and guidelines. The records provide the basis for the compilation of national mortality and morbidity statistics, and enable the storage and retrieval of diagnostic information for clinical, epidemiological and quality purposes.
The IMSR is a searchable online database of mouse strains, stocks, and mutant ES cell lines available worldwide, including inbred, mutant, and genetically engineered strains. The goal of the IMSR is to assist the international scientific community in locating and obtaining mouse resources for research. Note that the data content found in the IMSR is as supplied by strain repository holders. For each strain or cell line listed in the IMSR, users can obtain information about: Where that resource is available (Repository Site); What state(s) the resource is available as (e.g. live, cryopreserved embryo or germplasm, ES cells); Links to descriptive information about a strain or ES cell line; Links to mutant alleles carried by a strain or ES cell line; Links for ordering a strain or ES cell line from a Repository; Links for contacting the Repository to send a query
VectorBase provides data on arthropod vectors of human pathogens. Sequence data, gene expression data, images, population data, and insecticide resistance data for arthropod vectors are available for download. VectorBase also offers a genome browser, a gene expression and microarray repository, and BLAST searches for all VectorBase genomes. VectorBase genomes include Aedes aegypti, Anopheles gambiae, Culex quinquefasciatus, Ixodes scapularis, Pediculus humanus and Rhodnius prolixus. VectorBase is one of the Bioinformatics Resource Centers (BRC) projects funded by the National Institute of Allergy and Infectious Diseases (NIAID).
The project brings together national key players providing environmentally related biological data and services to develop the 'German Federation for Biological Data' (GFBio). The overall goal is to provide a sustainable, service-oriented, national data infrastructure facilitating data sharing and stimulating data-intensive science in the fields of biological and environmental research.
The PLANKTON*NET data provider at the Alfred Wegener Institute for Polar and Marine Research is an open access repository for plankton-related information. It covers all types of phytoplankton and zooplankton from marine and freshwater areas. PLANKTON*NET's greatest strength is its comprehensiveness: for the different taxa, image information as well as taxonomic descriptions can be archived. PLANKTON*NET also contains a glossary with accompanying images to illustrate the term definitions. PLANKTON*NET therefore presents a vital tool for the preservation of historic data sets as well as the archival of current research results. Because interoperability with international biodiversity data providers (e.g. GBIF) is one of our aims, the architecture behind the new planktonnet@awi repository is observation-centric and allows for multiple assignment of assets (images, references, animations, etc.) to any given observation. In addition, images can be grouped in sets and/or assigned tags to satisfy user-specific needs. Sets (and respective images) of relevance to the scientific community and/or general public have been assigned a persistent digital object identifier (DOI) for the purpose of long-term preservation (e.g. set "Plankton*Net celebrates 50 years of Roman Treaties", handle: 10013/de.awi.planktonnet.set.495).
The tree of life links all biodiversity through a shared evolutionary history. This project will produce the first online, comprehensive first-draft tree of all 1.8 million named species, accessible to both the public and scientific communities. Assembly of the tree will incorporate previously-published results, with strong collaborations between computational and empirical biologists to develop, test and improve methods of data synthesis. This initial tree of life will not be static; instead, we will develop tools for scientists to update and revise the tree as new data come in. Early release of the tree and tools will motivate data sharing and facilitate ongoing synthesis of knowledge.
The Cognitive Function and Ageing Studies (CFAS) are population-based studies of individuals aged 65 years and over living in the community, including institutions. Together they form the only large multi-centred population-based study in the UK that has reached sufficient maturity. There are three main studies within the CFAS group. MRC CFAS, the original study, began in 1989, with three of its sites providing a parent subset for the comparison two decades later with CFAS II (2008 onwards). Subsequently another CFAS study, CFAS Wales, began in 2011.
Exposures in the period from conception to early childhood - including fetal growth, cell division, and organ functioning - may have long-lasting impact on health and disease susceptibility. To investigate these issues the Danish National Birth Cohort (Better health in generations) was established. A large cohort of pregnant women with long-term follow-up of the offspring was the obvious choice because many of the exposures of interest cannot be reconstructed with sufficient validity back in time. The study needed to be large, and the aim was to recruit 100,000 women early in pregnancy, and to continue follow-up for decades. Exposure information was collected by computer-assisted telephone interviews with the women twice during pregnancy and when their children were six and 18 months old. Participants were also asked to fill in a self-administered food frequency questionnaire in mid-pregnancy. Furthermore, a biological bank has been set up with blood taken from the mother twice during pregnancy and blood from the umbilical cord taken shortly after birth.
The Diabetes Study of Northern California (DISTANCE) conducts epidemiological and health services research in diabetes among a large, multiethnic cohort of patients in a large, integrated health care delivery system.
The objective of this Research Coordination Network project is to develop an international network of researchers who use genetic methodologies to study the ecology and evolution of marine organisms in the Indo-Pacific, so that they can share data, ideas and methods. The tropical Indian and Pacific Oceans encompass the largest biogeographic region on the planet, the Indo-Pacific.
The WorldWide Antimalarial Resistance Network (WWARN) is a collaborative platform generating innovative resources and reliable evidence to inform the malaria community on the factors affecting the efficacy of antimalarial medicines. Access to data is provided through diverse Tools and Resources: WWARN Explorer, Molecular Surveyor K13 Methodology, Molecular Surveyor pfmdr1 & pfcrt, Molecular Surveyor dhfr & dhps.
The Breast Cancer Surveillance Consortium (BCSC) is a research resource for studies designed to assess the delivery and quality of breast cancer screening and related patient outcomes in the United States. The BCSC is a collaborative network of seven mammography registries with linkages to tumor and/or pathology registries. The network is supported by a central Statistical Coordinating Center.
DEG hosts records of currently available essential genomic elements, such as protein-coding genes and non-coding RNAs, among bacteria, archaea and eukaryotes. Essential genes in a bacterium constitute a minimal genome, forming a set of functional modules that play key roles in the emerging field of synthetic biology.
The National Sleep Research Resource (NSRR) offers free web access to large collections of de-identified physiological signals and clinical data elements collected in well-characterized research cohorts and clinical trials.
The National Cancer Data Base (NCDB), a joint program of the Commission on Cancer (CoC) of the American College of Surgeons (ACoS) and the American Cancer Society (ACS), is a nationwide oncology outcomes database for more than 1,500 Commission-accredited cancer programs in the United States and Puerto Rico. Some 70 percent of all newly diagnosed cases of cancer in the United States are captured at the institutional level and reported to the NCDB. The NCDB, begun in 1989, now contains approximately 29 million records from hospital cancer registries across the United States. Data on all types of cancer are tracked and analyzed. These data are used to explore trends in cancer care, to create regional and state benchmarks for participating hospitals, and to serve as the basis for quality improvement.
BindingDB is a public, web-accessible database of measured binding affinities, focusing chiefly on the interactions of proteins considered to be candidate drug-targets with ligands that are small, drug-like molecules. BindingDB supports medicinal chemistry and drug discovery via literature awareness and development of structure-activity relations (SAR and QSAR); validation of computational chemistry and molecular modeling approaches such as docking, scoring and free energy methods; chemical biology and chemical genomics; and basic studies of the physical chemistry of molecular recognition. BindingDB also includes a small collection of host-guest binding data of interest to chemists studying supramolecular systems. The data collection derives from a variety of measurement techniques, including enzyme inhibition and kinetics, isothermal titration calorimetry, NMR, and radioligand and competition assays. BindingDB includes data extracted from the literature by the BindingDB project, selected PubChem confirmatory BioAssays, and ChEMBL entries for which a well defined protein target ("TARGET_TYPE='PROTEIN'") is provided. Data extracted by BindingDB typically includes more details regarding experimental conditions, etc.
The taxonomically broad EST database TBestDB serves as a repository for EST data from a wide range of eukaryotes, many of which have previously not been thoroughly investigated. Most of the data contained in TBestDB has been generated by the labs of the Protist EST Program (PEP), located in six universities across Canada. PEP is a large interdisciplinary research project that aims at the exploration of the diversity of eukaryotic genomes in a systematic, comprehensive and integrated way. The focus is on unicellular microbial eukaryotes, known as protists. Protistan eukaryotes comprise more than a dozen major lineages that, together, encompass more evolutionary, ecological and probably biochemical diversity than the multicellular kingdoms of animals, plants and fungi combined. PEP is a unique endeavor in that it is the first phylogenetically-broad genomic investigation of protists.
The Autism Chromosome Rearrangement Database is a collection of hand-curated breakpoints and other genomic features related to autism, taken from publicly available literature, databases and unpublished data. The database is continuously updated with information from in-house experimental data as well as data from published research studies.
Silkworm Pathogen Database (SilkPathDB) is a comprehensive resource for studying pathogens of the silkworm, including microsporidia, fungi, bacteria and viruses. SilkPathDB provides access not only to genomic data, including functional annotation of genes and gene products, but also to extensive biological information for gene expression data and corresponding research. SilkPathDB will help with research on pathogens of the silkworm as well as other Lepidoptera insects.
Edinburgh DataShare is an online digital repository of multi-disciplinary research datasets produced at the University of Edinburgh, hosted by the Data Library in Information Services. Edinburgh University researchers who have produced research data associated with an existing or forthcoming publication, or which has potential use for other researchers, are invited to upload their dataset for sharing and safekeeping. A persistent identifier and suggested citation will be provided.
The Universitat de Barcelona Digital Repository is an institutional resource containing open-access digital versions of publications related to the teaching, research and institutional activities of the UB's teaching staff and other members of the university community, including research data.