Reset all


Content Types


AID systems



Data access

Data access restrictions

Database access

Database access restrictions

Database licenses

Data licenses

Data upload

Data upload restrictions

Enhanced publication

Institution responsibility type

Institution type


Metadata standards

PID systems

Provider types

Quality management

Repository languages



Repository types


  • * at the end of a keyword allows wildcard searches
  • " quotes can be used for searching phrases
  • + represents an AND search (default)
  • | represents an OR search
  • - represents a NOT operation
  • ( and ) implies priority
  • ~N after a word specifies the desired edit distance (fuzziness)
  • ~N after a phrase specifies the desired slop amount
Found 203 result(s)
In response to the declaration of the Zika virus as a public health emergency, LabKey has launched the Zika Open-Research Portal to help facilitate collaborative research. This portal provides a platform for investigators to make Zika research data, commentary and results publicly available in real-time. Projects are freely available to researchers. If you are interested in sharing real-time research through the Zika Open-Research Portal, please contact LabKey to get started.
The CPTAC Data Portal is the centralized repository for the dissemination of proteomic data collected by the Proteome Characterization Centers (PCCs) for the CPTAC program. The portal also hosts analyses of the mass spectrometry data (mapping of spectra to peptide sequences and protein identification) from the PCCs and from a CPTAC-sponsored common data analysis pipeline (CDAP).
I2D (Interologous Interaction Database) is an on-line database of known and predicted mammalian and eukaryotic protein-protein interactions. It has been built by mapping high-throughput (HTP) data between species. Thus, until experimentally verified, these interactions should be considered "predictions". It remains one of the most comprehensive sources of known and predicted eukaryotic PPI. I2D includes data for S. cerevisiae, C. elegans, D. melonogaster, R. norvegicus, M. musculus, and H. sapiens.
The Sequence Read Archive stores the raw sequencing data from such sequencing platforms as the Roche 454 GS System, the Illumina Genome Analyzer, the Applied Biosystems SOLiD System, the Helicos Heliscope, and the Complete Genomics. It archives the sequencing data associated with RNA-Seq, ChIP-Seq, Genomic and Transcriptomic assemblies, and 16S ribosomal RNA data.
The Drosophila Genetic Reference Panel (DGRP) is a population consisting of more than 200 inbred lines derived from the Raleigh, USA population. The DGRP is a living library of common polymorphisms affecting complex traits, and a community resource for whole genome association mapping of quantitative trait loci.
Stemformatics is a collaboration between the stem cell and bioinformatics community. We were motivated by the plethora of exciting cell models in the public and private domains, and the realisation that for many biologists these were mostly inaccessible. We wanted a fast way to find and visualise interesting genes in these exemplar stem cell datasets. We'd like you to explore. You'll find data from leading stem cell laboratories in a format that is easy to search, easy to visualise and easy to export.
The Conserved Domain Database is a resource for the annotation of functional units in proteins. Its collection of domain models includes a set curated by NCBI, which utilizes 3D structure to provide insights into sequence/structure/function relationships
A curated database of mutations and polymorphisms associated with Lafora Progressive Myoclonus Epilepsy. The Lafora progressive myoclonus epilepsy mutation and polymorphism database is a collection of hand curated mutation and polymorphism data for the EPM2A and EPM2B (NHLRC1) from publicly available literature: databases and unpublished data. The database is continuously updated with information from in-house experimental data as well as data from published research studies.
IMGT/GENE-DB is the IMGT genome database for IG and TR genes from human, mouse and other vertebrates. IMGT/GENE-DB provides a full characterization of the genes and of their alleles: IMGT gene name and definition, chromosomal localization, number of alleles, and for each allele, the IMGT allele functionality, and the IMGT reference sequences and other sequences from the literature. IMGT/GENE-DB allele reference sequences are available in FASTA format (nucleotide and amino acid sequences with IMGT gaps according to the IMGT unique numbering, or without gaps).
arrayMap is a repository of cancer genome profiling data. Original) from primary repositories (e.g. NCBI GEO, EBI ArrayExpress, TCGA) is re-processed and annotated for metadata. Unique visualisation of the processed data allows critical evaluation of data quality and genome information. Structured metadata provides easy access to summary statistics, with a focus on copy number aberrations in cancer entities.
GenBankĀ® is a comprehensive database that contains publicly available nucleotide sequences for almost 260 000 formally described species. These sequences are obtained primarily through submissions from individual laboratories and batch submissions from large-scale sequencing projects, including whole-genome shotgun (WGS) and environmental sampling projects. Most submissions are made using the web-based BankIt or standalone Sequin programs, and GenBank staff assigns accession numbers upon data receipt. Daily data exchange with the European Nucleotide Archive (ENA) and the DNA Data Bank of Japan (DDBJ) ensures worldwide coverage. GenBank is accessible through the NCBI Entrez retrieval system, which integrates data from the major DNA and protein sequence databases along with taxonomy, genome, mapping, protein structure and domain information, and the biomedical journal literature via PubMed. BLAST provides sequence similarity searches of GenBank and other sequence databases. Complete bimonthly releases and daily updates of the GenBank database are available by FTP.
MetaCyc is a curated database of experimentally elucidated metabolic pathways from all domains of life. MetaCyc contains pathways involved in both primary and secondary metabolism, as well as associated metabolites, reactions, enzymes, and genes. The goal of MetaCyc is to catalog the universe of metabolism by storing a representative sample of each experimentally elucidated pathway. MetaCyc applications include: Online encyclopedia of metabolism, Prediction of metabolic pathways in sequenced genomes, Support metabolic engineering via enzyme database, Metabolite database aids. metabolomics research.
>>>>!!!!<<<< AspGD data are being integrated into FungiDB. Please click here for additional details . Discussion of how to maximize the value of FungiDB for the Aspergillus research community will be a major topic at the upcoming AsperFest12 meeting at Asilomar (March 16-17, 2015). >>>>!!!!<<<< AspGD is an organized collection of genetic and molecular biological information about the filamentous fungi of the genus Aspergillus. Among its many species, the genus contains an excellent model organism (A. nidulans, or its teleomorph Emericella nidulans), an important pathogen of the immunocompromised (A. fumigatus), an agriculturally important toxin producer (A. flavus), and two species used in industrial processes (A. niger and A. oryzae). AspGD contains information about genes and proteins of multiple Aspergillus species; descriptions and classifications of their biological roles, molecular functions, and subcellular localizations; gene, protein, and chromosome sequence information; tools for analysis and comparison of sequences; and links to literature information; as well as a multispecies comparative genomics browser tool (Sybil) for exploration of orthology and synteny across multiple sequenced Aspergillus species.
GermOnline 4.0 is a cross-species database gateway focusing on high-throughput expression data relevant for germline development, the meiotic cell cycle and mitosis in healthy versus malignant cells. The portal provides access to the Saccharomyces Genomics Viewer (SGV) which facilitates online interpretation of complex data from experiments with high-density oligonucleotide tiling microarrays that cover the entire yeast genome.
The GeneDB project is a core part of the Sanger Institute's Pathogen Genomics activities. Its primary goals are: to provide reliable access to the latest sequence data and annotation/curation for the whole range of organisms sequenced by the Pathogen group. to develop the website and other tools to aid the community in accessing and obtaining the maximum value from these data.
EcoGene is a database and website devoted to continuously improving the structural and functional annotation of Escherichia coli K-12, one of the most well understood model organisms, represented by the MG1655(Seq) genome sequence and annotations.
Mapping, copy number analysis, sequence and gene expression data generated by the High Resolution Analysis of Follicular Lymphoma Genomes project. The data will be available for 24 patients with follicular lymphoma. All data will be made as widely and freely available as possible while safeguarding the privacy of participants, and protecting confidential and proprietary data.The data from this project will be submitted to public genomic data sources. These sources will be listed on this web site as the data becomes available in these external data sources.
TPA is a database that contains sequences built from the existing primary sequence data in GenBank. TPA records are retrieved through the Nucleotide Database and feature information on the sequence, how it was cataloged, and proper way to cite the sequence information.
Content type(s)
A genome database for the genus Piroplasma. PiroplasmaDB is a member of pathogen-databases that are housed under the NIAID-funded EuPathDB Bioinformatics Resource Center (BRC) umbrella.
The Barcode of Life Data Systems (BOLD) provides DNA barcode data. BOLD's online workbench supports data validation, annotation, and publication for specimen, distributional, and molecular data. The platform consists of four main modules: a data portal, a database of barcode clusters, an educational portal, and a data collection workbench. BOLD is the go-to site for DNA-based identification. As the central informatics platform for DNA barcoding, BOLD plays a crucial role in assimilating and organizing data gathered by the international barcode research community. Two iBOL (International Barcode of Life) Working Groups are supporting the ongoing development of BOLD.
The Cancer Cell Line Encyclopedia project is a collaboration between the Broad Institute, and the Novartis Institutes for Biomedical Research and its Genomics Institute of the Novartis Research Foundation to conduct a detailed genetic and pharmacologic characterization of a large panel of human cancer models, to develop integrated computational analyses that link distinct pharmacologic vulnerabilities to genomic patterns and to translate cell line integrative genomics into cancer patient stratification. The CCLE provides public access to genomic data, analysis and visualization for about 1000 cell lines.