Reset all


Content Types


AID systems



Data access

Data access restrictions

Database access

Database licenses

Data licenses

Data upload

Data upload restrictions

Enhanced publication

Institution responsibility type

Institution type


Metadata standards

PID systems

Provider types

Quality management

Repository languages



Repository types


  • * at the end of a keyword allows wildcard searches
  • " quotes can be used for searching phrases
  • + represents an AND search (default)
  • | represents an OR search
  • - represents a NOT operation
  • ( and ) implies priority
  • ~N after a word specifies the desired edit distance (fuzziness)
  • ~N after a phrase specifies the desired slop amount
Found 79 result(s)
ALFRED is a free, web-accessible, curated compilation of allele frequency data on DNA sequence polymorphisms in anthropologically defined human populations. ALFRED is distinct from such databases as dbSNP, which catalogs sequence variation.
Edinburgh DataShare is an online digital repository of multi-disciplinary research datasets produced at the University of Edinburgh, hosted by the Data Library in Information Services. Edinburgh University researchers who have produced research data associated with an existing or forthcoming publication, or which has potential use for other researchers, are invited to upload their dataset for sharing and safekeeping. A persistent identifier and suggested citation will be provided.
The NCI’s Cancer Genome Anatomy Project (CGAP) is an online resource designed to provide the scientific community with detailed characterization of gene expression in biological tissues. By characterizing normal, pre-cancer and cancer cells, CGAP aims to improve detection, diagnosis and treatment for the patient. Moreover, CGAP provides access to cDNA clones to the research community through a variety of distributors. CGAP provides a wide range of genomic data and resources
The Fungal Genetics Stock Center has preserved and distributed strains of genetically characterized fungi since 1960. The collection includes over 20,000 accessioned strains of classical and genetically engineered mutants of key model, human, and plant pathogenic fungi. These materials are distributed as living stocks to researchers around the world.
The dbVar is a database of genomic structural variation containing data from multiple gene studies. Users can browse data containing the number of variant cells from each study, and filter studies by organism, study type, method and genomic variant. Organisms include human, mouse, cattle and several additional animals. ***NCBI will phase out support for non-human organism data in dbSNP and dbVar beginning on September 1, 2017 ***
GeneWeaver combines cross-species data and gene entity integration, scalable hierarchical analysis of user data with a community-built and curated data archive of gene sets and gene networks, and tools for data driven comparison of user-defined biological, behavioral and disease concepts. Gene Weaver allows users to integrate gene sets across species, tissue and experimental platform. It differs from conventional gene set over-representation analysis tools in that it allows users to evaluate intersections among all combinations of a collection of gene sets, including, but not limited to annotations to controlled vocabularies. There are numerous applications of this approach. Sets can be stored, shared and compared privately, among user defined groups of investigators, and across all users.
Content type(s)
The Centre for Applied Genomics hosts a variety of databases related to ongoing supported projects. Curation of these databases is performed in-house by TCAG Bioinformatics staff. The Autism Chromosome Rearrangement Database, The Cystic Fibrosis Mutation Database, TThe Lafora Progressive Myoclonus Epilepsy Mutation and Polymorphism Database are included. Large Scale Genomics Research resources include, the Database of Genomic Variants, The Chromosome 7 Annotation Project, The Human Genome Segmental Duplication Database, and the Non-Human Segmental Duplication Database
The Alzheimer Disease & Frontotemporal Dementia Mutation Database (AD&FTDMDB) aims at collecting all known mutations in the genes related to Alzheimer disease (AD) and fromtotemporal dementias (FTD). Mutations are collected from the literature and from presentations at scientific meetings. In addition, mutations can be submitted to AD&FTDMDB at this web site.
>>>>!!!!<<<< The Cancer Genomics Hub mission is now completed. The Cancer Genomics Hub was established in August 2011 to provide a repository to The Cancer Genome Atlas, the childhood cancer initiative Therapeutically Applicable Research to Generate Effective Treatments and the Cancer Genome Characterization Initiative. CGHub rapidly grew to be the largest database of cancer genomes in the world, storing more than 2.5 petabytes of data and serving downloads of nearly 3 petabytes per month. As the central repository for the foundational genome files, CGHub streamlined team science efforts as data became as easy to obtain as downloading from a hard drive. The convenient access to Big Data, and the collaborations that CGHub made possible, are now essential to cancer research. That work continues at the NCI's Genomic Data Commons. All files previously stored at CGHub can be found there. The Website for the Genomic Data Commons is here: >>>>!!!!<<<< The Cancer Genomics Hub (CGHub) is a secure repository for storing, cataloging, and accessing cancer genome sequences, alignments, and mutation information from the Cancer Genome Atlas (TCGA) consortium and related projects. Access to CGHub Data: All researchers using CGHub must meet the access and use criteria established by the National Institutes of Health (NIH) to ensure the privacy, security, and integrity of participant data. CGHub also hosts some publicly available data, in particular data from the Cancer Cell Line Encyclopedia. All metadata is publicly available and the catalog of metadata and associated BAMs can be explored using the CGHub Data Browser.
The Small Molecule Pathway Database (SMPDB) contains small molecule pathways found in humans, which are presented visually. All SMPDB pathways include information on the relevant organs, subcellular compartments, protein cofactors, protein locations, metabolite locations, chemical structures and protein quaternary structures. Accompanying data includes detailed descriptions and references, providing an overview of the pathway, condition or processes depicted in each diagram.
The Cancer Immunome Database (TCIA) provides results of comprehensive immunogenomic analyses of next generation sequencing data (NGS) data for 20 solid cancers from The Cancer Genome Atlas (TCGA) and other datasource. The Cancer Immunome Atlas (TCIA) was developed and is maintained at the Division of Bioinformatics (ICBI). The database can be queried for the gene expression of specific immune-related gene sets, cellular composition of immune infiltrates (characterized using gene set enrichment analyses and deconvolution), neoantigens and cancer-germline antigens, HLA types, and tumor heterogeneity (estimated from cancer cell fractions). Moreover it provides survival analyses for different types immunological parameters. TCIA will be constantly updated with new data and results.
Recode2 is a database of genes that utilize non-standard translation for gene expression purposes. Recoding events described in the database include programmed ribosomal frameshifting, translational bypassing (aka hopping) and mRNA specific codon redefinition. Frameshifting at a particular site often yields two protein products from one coding sequence and sometimes serves a regulatory purpose by acting as a sensor of the level of product protein or of some external ligand. Bypassing (hopping) allows the coupling of two ORFs separated on an mRNA by a coding gap. Codon redefinition occurs when a stop codon is decoded as a standard amino acid (often glutamine or tryptophan), or the 21st amino acid selenocysteine. These recoding events are in competition with standard decoding and are site specific. The efficiency of recoding is often modulated by cis-stimulators and sometimes by trans-factors. The sequences of the genes that use recoding for their expression are in the database. The recoding sites and the known stimulatory signals are annotated in the database together with notes on factors that are known to affect recoding efficiencies.
Complete Genomics provides free public access to a variety of whole human genome data sets generated from Complete Genomics’ sequencing service. The research community can explore and familiarize themselves with the quality of these data sets, review the data formats provided from our sequencing service, and augment their own research with additional summaries of genomic variation across a panel of diverse individuals. The quality of these data sets is representative of what a customer can expect to receive for their own samples. This public genome repository comprises genome results from both our Standard Sequencing Service (69 standard, non-diseased samples) and the Cancer Sequencing Service (two matched tumor and normal sample pairs). In March 2013 Complete Genomics was acquired by BGI-Shenzhen , the world’s largest genomics services company. BGI is a company headquartered in Shenzhen, China that provides comprehensive sequencing and bioinformatics services for commercial science, medical, agricultural and environmental applications. Complete Genomics is now focused on building a new generation of high-throughput sequencing technology and developing new and exciting research, clinical and consumer applications.
The Swedish Human Protein Atlas project has been set up to allow for a systematic exploration of the human proteome using Antibody-Based Proteomics. This is accomplished by combining high-throughput generation of affinity-purified antibodies with protein profiling in a multitude of tissues and cells assembled in tissue microarrays. Confocal microscopy analysis using human cell lines is performed for more detailed protein localization. The program hosts the Human Protein Atlas portal with expression profiles of human proteins in tissues and cells. The main objective of the resource centre is to produce specific antibodies to human target proteins using a high-throughput production method involving the cloning and protein expression of Protein Epitope Signature Tags (PrESTs). After purification, the antibodies are used to study expression profiles in cells and tissues and for functional analysis of the corresponding proteins in a wide range of platforms.
SILVA is a comprehensive, quality-controlled web resource for up-to-date aligned ribosomal RNA (rRNA) gene sequences from the Bacteria, Archaea and Eukaryota domains alongside supplementary online services. In addition to data products, SILVA provides various online tools such as alignment and classification, phylogenetic tree calculation and viewer, probe/primer matching, and an amplicon analysis pipeline. With every full release a curated guide tree is provided that contains the latest taxonomy and nomenclature based on multiple references. SILVA is an ELIXIR Core Data Resource.
The Plasmid Information Database (PlasmID) was established in 2004 to curate, maintain, and distribute cDNA and ORF constructs for use in basic molecular biological research. The materials deposited at our facility represent the culmination of several international collaborative efforts from 2004 to present: Beth Israel Deaconess Medical Center, Boston Children's Hospital, Brigham and Women's Hospital, Dana-Farber Cancer Institute, Harvard Medical School, Harvard School of Public Health, and Massachusetts General Hospital.
The Human Ageing Genomic Resources (HAGR) is a collection of databases and tools designed to help researchers study the genetics of human ageing using modern approaches such as functional genomics, network analyses, systems biology and evolutionary analyses.
The CPTAC Data Portal is the centralized repository for the dissemination of proteomic data collected by the Proteome Characterization Centers (PCCs) for the CPTAC program. The portal also hosts analyses of the mass spectrometry data (mapping of spectra to peptide sequences and protein identification) from the PCCs and from a CPTAC-sponsored common data analysis pipeline (CDAP).
The Melanoma Molecular Map Project (MMMP) is an open access, interactive web-based multidatabase dedicated to the research on melanoma biology and therapy. The aim of this non-profit project is to create an organized and continuously updated databank collecting the huge and ever growing amount of scientific knowledge on melanoma currently scattered in thousands of articles published in hundreds of Journals.
Stemformatics is a collaboration between the stem cell and bioinformatics community. We were motivated by the plethora of exciting cell models in the public and private domains, and the realisation that for many biologists these were mostly inaccessible. We wanted a fast way to find and visualise interesting genes in these exemplar stem cell datasets. We'd like you to explore. You'll find data from leading stem cell laboratories in a format that is easy to search, easy to visualise and easy to export.
The Conserved Domain Database is a resource for the annotation of functional units in proteins. Its collection of domain models includes a set curated by NCBI, which utilizes 3D structure to provide insights into sequence/structure/function relationships