Reset all


Content Types


AID systems



Data access

Data access restrictions

Database access

Database access restrictions

Database licenses

Data licenses

Data upload

Data upload restrictions

Enhanced publication

Institution responsibility type

Institution type


Metadata standards

PID systems

Provider types

Quality management

Repository languages



Repository types


  • * at the end of a keyword allows wildcard searches
  • " quotes can be used for searching phrases
  • + represents an AND search (default)
  • | represents an OR search
  • - represents a NOT operation
  • ( and ) implies priority
  • ~N after a word specifies the desired edit distance (fuzziness)
  • ~N after a phrase specifies the desired slop amount
Found 41 result(s)
The CancerData site is an effort of the Medical Informatics and Knowledge Engineering team (MIKE for short) of Maastro Clinic, Maastricht, The Netherlands. Our activities in the field of medical image analysis and data modelling are visible in a number of projects we are running. CancerData is offering several datasets. They are grouped in collections and can be public or private. You can search for public datasets in the NBIA (National Biomedical Imaging Archive) image archives without logging in.
MGI is the international database resource for the laboratory mouse, providing integrated genetic, genomic, and biological data to facilitate the study of human health and disease. The projects contributing to this resource are: Mouse Genome Database (MGD) Project, Gene Expression Database (GXD) Project, Mouse Tumor Biology (MTB) Database Project, Gene Ontology (GO) Project at MGI, MouseMine Project, MouseCyc Project at MGI
GESDB is a platform for sharing simulation data and discussion of simulation techniques for human genetic studies. The database contains simulation scripts, simulated data, and documentations from published manuscripts. The forum provides a platform for Q&A for the simulated data and exchanging simulation ideas. GESDB aims to promote transparency and efficiency in simulation studies for human genetic studies.
A collection of data at Agency for Healthcare Research and Quality (AHRQ) supporting research that helps people make more informed decisions and improves the quality of health care services. The portal contains U.S.Health Information Knowledgebase (USHIK) and Systematic Review Data Repository (SRDR) and other sources concerning cost, quality, accesibility and evaluation of healthcare and medical insurance.
The Gene database provides detailed information for known and predicted genes defined by nucleotide sequence or map position. Gene supplies gene-specific connections in the nexus of map, sequence, expression, structure, function, citation, and homology data. Unique identifiers are assigned to genes with defining sequences, genes with known map positions, and genes inferred from phenotypic information. These gene identifiers are used throughout NCBI's databases and tracked through updates of annotation. Gene includes genomes represented by NCBI Reference Sequences (or RefSeqs) and is integrated for indexing and query and retrieval from NCBI's Entrez and E-Utilities systems.
China National GeneBank DataBase (CNGBdb) is a unified platform built for biological big data sharing and application services to the research community. Based on the big data and cloud computing technologies, it provides data services such as archive, analysis, knowledge search, management authorization, and visualization. At present, CNGBdb has integrated large amounts of internal and external molecular data and other information from CNGB, NCBI, EBI, DDBJ, etc., indexed by search, covering 12 data structures. Moreover, CNGBdb correlates living sources, biological samples and bioinformatic data to realize the traceability of comprehensive data.
The Expression Atlas provides information on gene expression patterns under different biological conditions such as a gene knock out, a plant treated with a compound, or in a particular organism part or cell. It includes both microarray and RNA-seq data. The data is re-analysed in-house to detect interesting expression patterns under the conditions of the original experiment. There are two components to the Expression Atlas, the Baseline Atlas and the Differential Atlas. The Baseline Atlas displays information about which gene products are present (and at what abundance) in "normal" conditions (e.g. tissue, cell type). It aims to answer questions such as "which genes are specifically expressed in human kidney?". This component of the Expression Atlas consists of highly-curated and quality-checked RNA-seq experiments from ArrayExpress. It has data for many different animal and plant species. New experiments are added as they become available. The Differential Atlas allows users to identify genes that are up- or down-regulated in a wide variety of different experimental conditions such as yeast mutants, cadmium treated plants, cystic fibrosis or the effect on gene expression of mind-body practice. Both microarray and RNA-seq experiments are included in the Differential Atlas. Experiments are selected from ArrayExpress and groups of samples are manually identified for comparison e.g. those with wild type genotype compared to those with a gene knock out. Each experiment is processed through our in-house differential expression statistical analysis pipeline to identify genes with a high probability of differential expression.
The ENCODE Encyclopedia organizes the most salient analysis products into annotations, and provides tools to search and visualize them. The Encyclopedia has two levels of annotations: Integrative-level annotations integrate multiple types of experimental data and ground level annotations. Ground-level annotations are derived directly from the experimental data, typically produced by uniform processing pipelines.
GENCODE is a scientific project in genome research and part of the ENCODE (ENCyclopedia Of DNA Elements) scale-up project. The GENCODE consortium was initially formed as part of the pilot phase of the ENCODE project to identify and map all protein-coding genes within the ENCODE regions (approx. 1% of Human genome). Given the initial success of the project, GENCODE now aims to build an “Encyclopedia of genes and genes variants” by identifying all gene features in the human and mouse genome using a combination of computational analysis, manual annotation, and experimental validation, and annotating all evidence-based gene features in the entire human genome at a high accuracy.
The PeptideAtlas validates expressed proteins to provide eukaryotic genome data. Peptide Atlas provides data to advance biological discoveries in humans. The PeptideAtlas accepts proteomic data from high-throughput processes and encourages data submission.
Greengenes is an Earth Sciences website that assists clinical and environmental microbiologists from around the globe in classifying microorganisms from their local environments. A 16S rRNA gene database addresses limitations of public repositories by providing chimera screening, standard alignment, and taxonomic classification using multiple published taxonomies.
ArrayExpress is one of the major international repositories for high-throughput functional genomics data from both microarray and high-throughput sequencing studies, many of which are supported by peer-reviewed publications. Data sets are either submitted directly to ArrayExpress and curated by a team of specialist biological curators, or are imported systematically from the NCBI Gene Expression Omnibus database on a weekly basis. Data is collected to MIAME and MINSEQE standards.
The Japanese Genotype-phenotype Archive (JGA) is a service for permanent archiving and sharing of all types of individual-level genetic and de-identified phenotypic data resulting from biomedical research projects. The JGA contains exclusive data collected from individuals whose consent agreements authorize data release only for specific research use or to bona fide researchers. Strict protocols govern how information is managed, stored and distributed by the JGA. Once processed, all data are encrypted. Users can contact the JGA team from here. JGA services are provided in collaboration with National Bioscience Database Center (NBDC) of Japan Science and Technology Agency.
Probe database provides a public registry of nucleic acid reagents as well as information on reagent distributors, sequence similarities and probe effectiveness. Database users have access to applications of gene expression, gene silencing and mapping, as well as reagent variation analysis and projects based on probe-generated data. The Probe database is constantly updated.
The NCBI database of Genotypes and Phenotypes archives and distributes the results of studies that have investigated the interaction of genotype and phenotype, including genome-wide association studies, medical sequencing, molecular diagnostic assays, and association between genotype and non-clinical traits. The database provides summaries of studies, the contents of measured variables, and original study document text. dbGaP provides two types of access for users, open and controlled. Through the controlled access, users may access individual-level data such as phenotypic data tables and genotypes.
The Taenia solium genome project is a whole genome sequencing project of the parasite Taenia solium, the causal agent of human and porcine cysticercosis; a disease that is still a public health problem of relevance in Mexico. It is being carried out by a consortium of scientists belonging to diverse institutions of the Universidad Nacional Autónoma de México (UNAM, the National Autonomous University of Mexico).
DDBJ; DNA Data Bank of Japan is the sole nucleotide sequence data bank in Asia, which is officially certified to collect nucleotide sequences from researchers and to issue the internationally recognized accession number to data submitters.Since we exchange the collected data with EMBL-Bank/EBI; European Bioinformatics Institute and GenBank/NCBI; National Center for Biotechnology Information on a daily basis, the three data banks share virtually the same data at any given time. The virtually unified database is called "INSD; International Nucleotide Sequence Database DDBJ collects sequence data mainly from Japanese researchers, but of course accepts data and issue the accession number to researchers in any other countries.
This site provides access to complete, annotated genomes from bacteria and archaea (present in the European Nucleotide Archive) through the Ensembl graphical user interface (genome browser). Ensembl Bacteria contains genomes from annotated INSDC records that are loaded into Ensembl multi-species databases, using the INSDC annotation import pipeline.
The Electron Microscopy Data Bank (EMDB) is a public repository for electron microscopy density maps of macromolecular complexes and subcellular structures. It covers a variety of techniques, including single-particle analysis, electron tomography, and electron (2D) crystallography.
The European Nucleotide Archive (ENA) captures and presents information relating to experimental workflows that are based around nucleotide sequencing. A typical workflow includes the isolation and preparation of material for sequencing, a run of a sequencing machine in which sequencing data are produced and a subsequent bioinformatic analysis pipeline. ENA records this information in a data model that covers input information (sample, experimental setup, machine configuration), output machine data (sequence traces, reads and quality scores) and interpreted information (assembly, mapping, functional annotation). Data arrive at ENA from a variety of sources. These include submissions of raw data, assembled sequences and annotation from small-scale sequencing efforts, data provision from the major European sequencing centres and routine and comprehensive exchange with our partners in the International Nucleotide Sequence Database Collaboration (INSDC). Provision of nucleotide sequence data to ENA or its INSDC partners has become a central and mandatory step in the dissemination of research findings to the scientific community. ENA works with publishers of scientific literature and funding bodies to ensure compliance with these principles and to provide optimal submission systems and data access tools that work seamlessly with the published literature.
GermOnline 4.0 is a cross-species database gateway focusing on high-throughput expression data relevant for germline development, the meiotic cell cycle and mitosis in healthy versus malignant cells. The portal provides access to the Saccharomyces Genomics Viewer (SGV) which facilitates online interpretation of complex data from experiments with high-density oligonucleotide tiling microarrays that cover the entire yeast genome.