Reset all


Content Types


AID systems



Data access

Data access restrictions

Database access

Database access restrictions

Database licenses

Data licenses

Data upload

Data upload restrictions

Enhanced publication

Institution responsibility type

Institution type


Metadata standards

PID systems

Provider types

Quality management

Repository languages



Repository types


  • * at the end of a keyword allows wildcard searches
  • " quotes can be used for searching phrases
  • + represents an AND search (default)
  • | represents an OR search
  • - represents a NOT operation
  • ( and ) implies priority
  • ~N after a word specifies the desired edit distance (fuzziness)
  • ~N after a phrase specifies the desired slop amount
  • 1 (current)
Found 23 result(s)
The taxonomically broad EST database TBestDB serves as a repository for EST data from a wide range of eukaryotes, many of which have previously not been thoroughly investigated. Most of the data contained in TBestDB has been generated by the labs of the Protist EST Program located in six universities across Canada. PEP is a large interdisciplinaryresearch project, involving six Canadian universities. PEP aims at the exploration of the diversity of eukaryotic genomes in a systematic, comprehensive and integrated way. The focus is on unicellular microbial eukaryotes, known as protists. Protistan eukaryotes comprise more than a dozen major lineages that, together, encompass more evolutionary, ecological and probably biochemical diversity than the multicellular kingdoms of animals, plants and fungi combined. PEP is a unique endeavor in that it is the first phylogenetically-broad genomic investigation of protists.
This Animal Quantitative Trait Loci (QTL) database (Animal QTLdb) is designed to house all publicly available QTL and trait mapping data (i.e. trait and genome location association data; collectively called "QTL data" on this site) on livestock animal species for easily locating and making comparisons within and between species. New database tools are continuely added to align the QTL and association data to other types of genome information, such as annotated genes, RH / SNP markers, and human genome maps. Besides the QTL data from species listed below, the QTLdb is open to house QTL/association date from other animal species where feasible. Note that the JAS along with other journals, now require that new QTL/association data be entered into a QTL database as part of their publication requirements.
CBS offers Comprehensive public databases of DNA- and protein sequences, macromolecular structure, g ene and protein expression levels, pathway organization and cell signalling, have been established to optimise scientific exploitation of the explosion of data within biology. Unlike many other groups in the field of biomolecular informatics, Center for Biological Sequence Analysis directs its research primarily towards topics related to the elucidation of the functional aspects of complex biological mechanisms. Among contemporary bioinformatics concerns are reliable computational interpretation of a wide range of experimental data, and the detailed understanding of the molecular apparatus behind cellular mechanisms of sequence information. By exploiting available experimental data and evidence in the design of algorithms, sequence correlations and other features of biological significance can be inferred. In addition to the computational research the center also has experimental efforts in gene expression analysis using DNA chips and data generation in relation to the physical and structural properties of DNA. In the last decade, the Center for Biological Sequence Analysis has produced a large number of computational methods, which are offered to others via WWW servers.
METLIN represents the largest MS/MS collection of data with the database generated at multiple collision energies and in positive and negative ionization modes. The data is generated on multiple instrument types including SCIEX, Agilent, Bruker and Waters QTOF mass spectrometers.
The European Bioinformatics Institute (EBI) has a long-standing mission to collect, organise and make available databases for biomolecular science. It makes available a collection of databases along with tools to search, download and analyse their content. These databases include DNA and protein sequences and structures, genome annotation, gene expression information, molecular interactions and pathways. Connected to these are linking and descriptive data resources such as protein motifs, ontologies and many others. In many of these efforts, the EBI is a European node in global data-sharing agreements involving, for example, the USA and Japan.
The Expression Atlas provides information on gene expression patterns under different biological conditions such as a gene knock out, a plant treated with a compound, or in a particular organism part or cell. It includes both microarray and RNA-seq data. The data is re-analysed in-house to detect interesting expression patterns under the conditions of the original experiment. There are two components to the Expression Atlas, the Baseline Atlas and the Differential Atlas. The Baseline Atlas displays information about which gene products are present (and at what abundance) in "normal" conditions (e.g. tissue, cell type). It aims to answer questions such as "which genes are specifically expressed in human kidney?". This component of the Expression Atlas consists of highly-curated and quality-checked RNA-seq experiments from ArrayExpress. It has data for many different animal and plant species. New experiments are added as they become available. The Differential Atlas allows users to identify genes that are up- or down-regulated in a wide variety of different experimental conditions such as yeast mutants, cadmium treated plants, cystic fibrosis or the effect on gene expression of mind-body practice. Both microarray and RNA-seq experiments are included in the Differential Atlas. Experiments are selected from ArrayExpress and groups of samples are manually identified for comparison e.g. those with wild type genotype compared to those with a gene knock out. Each experiment is processed through our in-house differential expression statistical analysis pipeline to identify genes with a high probability of differential expression.
OrtholugeDB contains Ortholuge-based orthology predictions for completely sequenced bacterial and archaeal genomes. It is also a resource for reciprocal best BLAST-based ortholog predictions, in-paralog predictions (recently duplicated genes) and ortholog groups in Bacteria and Archaea. The Ortholuge method improves the specificity of high-throughput orthology prediction.
Giardia lamblia is a significant, environmentally transmitted, human pathogen and an amitochondriate protist. It is a major contributor to the enormous worldwide burden of human diarrheal diseases, yet the basic biology of this parasite is not well understood. No virulence factor has been identified. The Giardia lamblia genome contains only 12 million base pairs distributed onto five chromosomes. Its analysis promises to provide insights about the origins of nuclear genome organization, the metabolic pathways used by parasitic protists, and the cellular biology of host interaction and avoidance of host immune systems. Since the divergence of Giardia lamblia lies close to the transition between eukaryotes and prokaryotes in universal ribosomal RNA phylogenies, it is a valuable, if not unique, model for gaining basic insights into genetic innovations that led to formation of eukaryotic cells. In evolutionary terms, the divergence of this organism is at least twice as ancient as the common ancestor for yeast and man. A detailed study of its genome will provide insights into an early evolutionary stage of eukaryotic chromosome organization as well as other aspects of the prokaryotic / eukaryotic divergence.
The Rat Genome Database is a collaborative effort between leading research institutions involved in rat genetic and genomic research. Its goal, as stated in RFA: HL-99-013 is the establishment of a Rat Genome Database, to collect, consolidate, and integrate data generated from ongoing rat genetic and genomic research efforts and make these data widely available to the scientific community. A secondary, but critical goal is to provide curation of mapped positions for quantitative trait loci, known mutations and other phenotypic data.
ArrayExpress is one of the major international repositories for high-throughput functional genomics data from both microarray and high-throughput sequencing studies, many of which are supported by peer-reviewed publications. Data sets are either submitted directly to ArrayExpress and curated by a team of specialist biological curators, or are imported systematically from the NCBI Gene Expression Omnibus database on a weekly basis. Data is collected to MIAME and MINSEQE standards.
ZFIN serves as the zebrafish model organism database. The long term goals for ZFIN are a) to be the community database resource for the laboratory use of zebrafish, b) to develop and support integrated zebrafish genetic, genomic and developmental information, c) to maintain the definitive reference data sets of zebrafish research information, d) to link this information extensively to corresponding data in other model organism and human databases, e) to facilitate the use of zebrafish as a model for human biology and f) to serve the needs of the research community. ZIRC is the Zebrafish International Resource Center, an independent NIH-funded facility providing a wide range of zebrafish lines, probes and health services. ZFIN works closely with ZIRC to connect our genetic data with available probes and fish lines.
DDBJ; DNA Data Bank of Japan is the sole nucleotide sequence data bank in Asia, which is officially certified to collect nucleotide sequences from researchers and to issue the internationally recognized accession number to data submitters.Since we exchange the collected data with EMBL-Bank/EBI; European Bioinformatics Institute and GenBank/NCBI; National Center for Biotechnology Information on a daily basis, the three data banks share virtually the same data at any given time. The virtually unified database is called "INSD; International Nucleotide Sequence Database DDBJ collects sequence data mainly from Japanese researchers, but of course accepts data and issue the accession number to researchers in any other countries.
BioMagResBank (BMRB) is the publicly-accessible depository for NMR results from peptides, proteins, and nucleic acids recognized by the International Society of Magnetic Resonance and by the IUPAC-IUBMB-IUPAB Inter-Union Task Group on the Standardization of Data Bases of Protein and Nucleic Acid Structures Determined by NMR Spectroscopy. In addition, BMRB provides reference information and maintains a collection of NMR pulse sequences and computer software for biomolecular NMR
The Immunology Database and Analysis Portal (ImmPort) archives clinical study and trial data generated by NIAID/DAIT-funded investigators. Data types housed in ImmPort include subject assessments i.e., medical history, concomitant medications and adverse events as well as mechanistic assay data such as flow cytometry, ELISA, ELISPOT, etc. --- You won't need an ImmPort account to search for compelling studies, peruse study demographics, interventions and mechanistic assays. But why stop there? What you really want to do is download the study, look at each experiment in detail including individual ELISA results and flow cytometry files. Perhaps you want to take those flow cytometry files for a test drive using FLOCK in the ImmPort flow cytometry module. To download all that interesting data you will need to register for ImmPort access.
ArkDB is a generic, species-independent database built to capture the state of published information on genome mapping in a given species. It stores details of references, markers and loci and genetic linkage and cytogenetic maps which can be drawn using the online map-drawing application. Data from linkage maps held within the ArkDB system can be drawn alongside their corresponding genome sequence maps (extracted from ENSEMBL).
The National Database for Autism Research (NDAR) is an NIH-funded research data repository that aims to accelerate progress in autism spectrum disorders (ASD) research through data sharing, data harmonization, and the reporting of research results. NDAR also serves as a scientific community platform and portal to multiple other research repositories, allowing for aggregation and secondary analysis of data. NDAR combines the function of a data repository, which holds genetic, phenotypic, clinical, and medical imaging data, and the function of a scientific community platform, which defines the standard tools and policies to integrate the computational resources developed by scientific research institutions, private foundations, and other federal and state agencies supporting ASD research. Furthermore, NDAR is working to develop the means to connect relevant repositories together through data federation.
The 1000 Genomes Project is an international collaboration to produce an extensive public catalog of human genetic variation, including SNPs and structural variants, and their haplotype contexts. This resource will support genome-wide association studies and other medical research studies. The genomes of about 2500 unidentified people from about 25 populations around the world will be sequenced using next-generation sequencing technologies. The results of the study will be freely and publicly accessible to researchers worldwide. The International Genome Sample Resource (IGSR) has been established at EMBL-EBI to continue supporting data generated by the 1000 Genomes Project, supplemented with new data and new analysis.
The Arabidopsis Information Resource (TAIR) maintains a database of genetic and molecular biology data for the model higher plant Arabidopsis thaliana . Data available from TAIR includes the complete genome sequence along with gene structure, gene product information, metabolism, gene expression, DNA and seed stocks, genome maps, genetic and physical markers, publications, and information about the Arabidopsis research community. Gene product function data is updated every two weeks from the latest published research literature and community data submissions. Gene structures are updated 1-2 times per year using computational and manual methods as well as community submissions of new and updated genes. TAIR also provides extensive linkouts from our data pages to other Arabidopsis resources.
!!!Sprry.we are no longer in operation!!! The Beta Cell Biology Consortium (BCBC) was a team science initiative that was established by the National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK). It was initially funded in 2001 (RFA DK-01-014), and competitively continued both in 2005 (RFAs DK-01-17, DK-01-18) and in 2009 (RFA DK-09-011). Funding for the BCBC came to an end on August 1, 2015, and with it so did our ability to maintain active websites.!!! One of the many goals of the BCBC was to develop and maintain databases of useful research resources. A total of 813 different scientific resources were generated and submitted by BCBC investigators over the 14 years it existed. Information pertaining to 495 selected resources, judged to be the most scientifically-useful, has been converted into a static catalog, as shown below. In addition, the metadata for these 495 resources have been transferred to dkNET in the form of RDF descriptors, and all genomics data have been deposited to either ArrayExpress or GEO. Please direct questions or comments to the NIDDK Division of Diabetes, Endocrinology & Metabolic Diseases (DEM).
CottonGen is a new cotton community genomics, genetics and breeding database being developed to enable basic, translational and applied research in cotton. It is being built using the open-source Tripal database infrastructure. CottonGen consolidates and expands the data from CottonDB and the Cotton Marker Database, providing enhanced tools for easy querying, visualizing and downloading research data.
PathCards is an integrated database of human biological pathways and their annotations. Human pathways were clustered into SuperPaths based on gene content similarity. Each PathCard provides information on one SuperPath which represents one or more human pathways.