Reset all


Content Types


AID systems



Data access

Data access restrictions

Database access

Database access restrictions

Database licenses

Data licenses

Data upload

Data upload restrictions

Enhanced publication

Institution responsibility type

Institution type


Metadata standards

PID systems

Provider types

Quality management

Repository languages



Repository types


  • * at the end of a keyword allows wildcard searches
  • " quotes can be used for searching phrases
  • + represents an AND search (default)
  • | represents an OR search
  • - represents a NOT operation
  • ( and ) implies priority
  • ~N after a word specifies the desired edit distance (fuzziness)
  • ~N after a phrase specifies the desired slop amount
Found 42 result(s)
Indian Genetic Disease Database (IGDD) is an initiative of CSIR Indian Institute of Chemical Biology. It is supported by Council of Scientific and Industrial Research (CSIR) and Department of Biotechnology (DBT) of India. The Indian people represent one-sixth of the world population and consists of a ethnically, geographically, and genetically diverse population. In some communities the ratio of genetic disorder is relatively high due to consanguineous marriage practiced in the community. This database has been created to keep track of mutations in the causal genes for genetic diseases common in India and help the physicians, geneticists, and other professionals retrieve and use the information for the benefit of the public. The database includes scientific information about these genetic diseases and disabilities, but also statistical information about these diseases in today's society. Data is categorized by body part affected and then by title of the disease.
The objective of this Research Coordination Network project is to develop an international network of researchers who use genetic methodologies to study the ecology and evolution of marine organisms in the Indo-Pacific to share data, ideas and methods. The tropical Indian and Pacific Oceans encompass the largest biogeographic region on the planet, the Indo-Pacific
GESDB is a platform for sharing simulation data and discussion of simulation techniques for human genetic studies. The database contains simulation scripts, simulated data, and documentations from published manuscripts. The forum provides a platform for Q&A for the simulated data and exchanging simulation ideas. GESDB aims to promote transparency and efficiency in simulation studies for human genetic studies.
Edinburgh DataShare is an online digital repository of multi-disciplinary research datasets produced at the University of Edinburgh, hosted by the Data Library in Information Services. Edinburgh University researchers who have produced research data associated with an existing or forthcoming publication, or which has potential use for other researchers, are invited to upload their dataset for sharing and safekeeping. A persistent identifier and suggested citation will be provided.
Swiss Institute of Bioinformatics (SIB) coordinates research and education in bioinformatics throughout Switzerland and provides bioinformatics services to the national and international research community. ExPASy gives access to numerous repositories and databases of SIB. For example: array map, MetaNetX, SWISS-MODEL and World-2DPAGE, and many others see a list here
dbEST is a division of GenBank that contains sequence data and other information on "single-pass" cDNA sequences, or "Expressed Sequence Tags", from a number of organisms. Expressed Sequence Tags (ESTs) are short (usually about 300-500 bp), single-pass sequence reads from mRNA (cDNA). Typically they are produced in large batches. They represent a snapshot of genes expressed in a given tissue and/or at a given developmental stage. They are tags (some coding, others not) of expression for a given cDNA library. Most EST projects develop large numbers of sequences. These are commonly submitted to GenBank and dbEST as batches of dozens to thousands of entries, with a great deal of redundancy in the citation, submitter and library information. To improve the efficiency of the submission process for this type of data, we have designed a special streamlined submission process and data format. dbEST also includes sequences that are longer than the traditional ESTs, or are produced as single sequences or in small batches. Among these sequences are products of differential display experiments and RACE experiments. The thing that these sequences have in common with traditional ESTs, regardless of length, quality, or quantity, is that there is little information that can be annotated in the record. If a sequence is later characterized and annotated with biological features such as a coding region, 5'UTR, or 3'UTR, it should be submitted through the regular GenBank submissions procedure (via BankIt or Sequin), even if part of the sequence is already in dbEST. dbEST is reserved for single-pass reads. Assembled sequences should not be submitted to dbEST. GenBank will accept assembled EST submissions for the forthcoming TSA (Transcriptome Shotgun Assembly) division. The individual reads which make up the assembly should be submitted to dbEST, the Trace archive or the Short Read Archive (SRA) prior to the submission of the assemblies.
The Gene database provides detailed information for known and predicted genes defined by nucleotide sequence or map position. Gene supplies gene-specific connections in the nexus of map, sequence, expression, structure, function, citation, and homology data. Unique identifiers are assigned to genes with defining sequences, genes with known map positions, and genes inferred from phenotypic information. These gene identifiers are used throughout NCBI's databases and tracked through updates of annotation. Gene includes genomes represented by NCBI Reference Sequences (or RefSeqs) and is integrated for indexing and query and retrieval from NCBI's Entrez and E-Utilities systems.
The dbMHC database provides an open, publicly accessible platform for DNA and clinical data related to the human Major Histocompatibility Complex (MHC). The dbMHC provides access to human leukocyte antigen (HLA) sequences, HLA allele and haplotype frequencies, and clinical datasets.
Human Proteinpedia is a community portal for sharing and integration of human protein data. This is a joint project between Pandey at Johns Hopkins University, and Institute of Bioinformatics, Bangalore. This portal allows research laboratories around the world to contribute and maintain protein annotations. Human Protein Reference Database (HPRD) integrates data, that is deposited in Human Proteinpedia along with the existing literature curated information in the context of an individual protein. All the public data contributed to Human Proteinpedia can be queried, viewed and downloaded. Data pertaining to post-translational modifications, protein interactions, tissue expression, expression in cell lines, subcellular localization and enzyme substrate relationships may be deposited.
The Genome database contains annotations and analysis of eukaryotic and prokaryotic genomes, as well as tools that allow users to compare genomes and gene sequences from humans, microbes, plants, viruses and organelles. Users can browse by organism, and view genome maps and protein clusters.
AceView provides a curated, comprehensive and non-redundant sequence representation of all public mRNA sequences (mRNAs from GenBank or RefSeq, and single pass cDNA sequences from dbEST and Trace). These experimental cDNA sequences are first co-aligned on the genome then clustered into a minimal number of alternative transcript variants and grouped into genes. Using exhaustively and with high quality standards the available cDNA sequences evidences the beauty and complexity of mammals’ transcriptome, and the relative simplicity of the nematode and plant transcriptomes. Genes are classified according to their inferred coding potential; many presumably non-coding genes are discovered. Genes are named by Entrez Gene names when available, else by AceView gene names, stable from release to release. Alternative features (promoters, introns and exons, polyadenylation signals) and coding potential, including motifs, domains, and homologies are annotated in depth; tissues where expression has been observed are listed in order of representation; diseases, phenotypes, pathways, functions, localization or interactions are annotated by mining selected sources, in particular PubMed, GAD and Entrez Gene, and also by performing manual annotation, especially in the worm. In this way, both the anatomy and physiology of the experimentally cDNA supported human, mouse and nematode genes are thoroughly annotated.
Project Achilles is a systematic effort aimed at identifying and cataloging genetic vulnerabilities across hundreds of genomically characterized cancer cell lines. The project uses genome-wide genetic perturbation reagents (shRNAs or Cas9/sgRNAs) to silence or knock-out individual genes and identify those genes that affect cell survival. Large-scale functional screening of cancer cell lines provides a complementary approach to those studies that aim to characterize the molecular alterations (e.g. mutations, copy number alterations) of primary tumors, such as The Cancer Genome Atlas (TCGA). The overall goal of the project is to identify cancer genetic dependencies and link them to molecular characteristics in order to prioritize targets for therapeutic development and identify the patient population that might benefit from such targets.
Addgene archives and distributes plasmids for researchers around the globe. They are working with thousands of laboratories to assemble a high-quality library of published plasmids for use in research and discovery. By linking plasmids with articles, scientists can always find data related to the materials they request.
GeneWeaver combines cross-species data and gene entity integration, scalable hierarchical analysis of user data with a community-built and curated data archive of gene sets and gene networks, and tools for data driven comparison of user-defined biological, behavioral and disease concepts. Gene Weaver allows users to integrate gene sets across species, tissue and experimental platform. It differs from conventional gene set over-representation analysis tools in that it allows users to evaluate intersections among all combinations of a collection of gene sets, including, but not limited to annotations to controlled vocabularies. There are numerous applications of this approach. Sets can be stored, shared and compared privately, among user defined groups of investigators, and across all users.
The Alzheimer Disease & Frontotemporal Dementia Mutation Database (AD&FTDMDB) aims at collecting all known mutations in the genes related to Alzheimer disease (AD) and fromtotemporal dementias (FTD). Mutations are collected from the literature and from presentations at scientific meetings. In addition, mutations can be submitted to AD&FTDMDB at this web site.
The Global Proteome Machine (GPM) is a protein identification database. This data repository allows users to post and compare results. GPM's data is provided by contributors like The Informatics Factory, University of Michigan, and Pacific Northwestern National Laboratories. The GPM searchable databases are: GPMDB, pSYT, SNAP, MRM, PEPTIDE and HOT.
Mapping, copy number analysis, sequence and gene expression data generated by the High Resolution Analysis of Follicular Lymphoma Genomes project. The data will be available for 24 patients with follicular lymphoma. All data will be made as widely and freely available as possible while safeguarding the privacy of participants, and protecting confidential and proprietary data.The data from this project will be submitted to public genomic data sources. These sources will be listed on this web site as the data becomes available in these external data sources.
The Cancer Cell Line Encyclopedia project is a collaboration between the Broad Institute, and the Novartis Institutes for Biomedical Research and its Genomics Institute of the Novartis Research Foundation to conduct a detailed genetic and pharmacologic characterization of a large panel of human cancer models, to develop integrated computational analyses that link distinct pharmacologic vulnerabilities to genomic patterns and to translate cell line integrative genomics into cancer patient stratification. The CCLE provides public access to genomic data, analysis and visualization for about 1000 cell lines.
HumanCyc provides an encyclopedic reference on human metabolic pathways. It provides a zoomable human metabolic map diagram, and it has been used to generate a steady-state quantitative model of human metabolism.
The Allele Frequency Net Database (AFND) is a public database which contains frequency information of several immune genes such as Human Leukocyte Antigens (HLA), Killer-cell Immunoglobulin-like Receptors (KIR), Major histocompatibility complex class I chain-related (MIC) genes, and a number of cytokine gene polymorphisms. The Allele Frequency Net Database (AFND) provides a central source, freely available to all, for the storage of allele frequencies from different polymorphic areas in the Human Genome. Users can contribute the results of their work into one common database and can perform database searches on information already available. We have currently collected data in allele, haplotype and genotype format. However, the success of this website will depend on you to contribute your data.