  • * at the end of a keyword allows wildcard searches
  • " quotes can be used for searching phrases
  • + represents an AND search (default)
  • | represents an OR search
  • - represents a NOT operation
  • ( and ) indicate precedence
  • ~N after a word specifies the desired edit distance (fuzziness)
  • ~N after a phrase specifies the desired slop amount
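The operators listed above can be combined into a single query string. The following is a minimal sketch in Python that only assembles such strings; the helper names (`phrase`, `wildcard`, `fuzzy`, `sloppy`, `all_of`, `any_of`, `none_of`) are hypothetical and not part of any search API, and how the search backend treats whitespace around operators is an assumption.

```python
# Hypothetical helpers that build query strings from the operators listed above.
# They only format text; nothing here contacts a search service.

def phrase(text: str) -> str:
    """Quote text so it is searched as a phrase."""
    return f'"{text}"'

def wildcard(keyword: str) -> str:
    """Append * to a keyword for a wildcard search."""
    return f"{keyword}*"

def fuzzy(word: str, distance: int) -> str:
    """Append ~N to a word to set the edit distance (fuzziness)."""
    return f"{word}~{distance}"

def sloppy(text: str, slop: int) -> str:
    """Append ~N to a quoted phrase to set the slop amount."""
    return f"{phrase(text)}~{slop}"

def all_of(*parts: str) -> str:
    """Join parts with + (AND, the default operator)."""
    return " + ".join(parts)

def any_of(*parts: str) -> str:
    """Join parts with | (OR) and group them with parentheses."""
    return "(" + " | ".join(parts) + ")"

def none_of(part: str) -> str:
    """Prefix a part with - (NOT)."""
    return f"-{part}"

query = all_of(
    wildcard("metabol"),
    any_of(phrase("mass spectrometry"), fuzzy("proteome", 1)),
    none_of("archive"),
)
print(query)  # metabol* + ("mass spectrometry" | proteome~1) + -archive
```

The resulting string can be pasted into the search box; the example terms are illustrative only.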
Found 182 result(s)
MetaCyc is a curated database of experimentally elucidated metabolic pathways from all domains of life. MetaCyc contains pathways involved in both primary and secondary metabolism, as well as associated metabolites, reactions, enzymes, and genes. The goal of MetaCyc is to catalog the universe of metabolism by storing a representative sample of each experimentally elucidated pathway. MetaCyc applications include serving as an online encyclopedia of metabolism, predicting metabolic pathways in sequenced genomes, supporting metabolic engineering via its enzyme database, and aiding metabolomics research via its metabolite database.
Academic Commons is a freely accessible digital collection of research and scholarship produced at Columbia University or one of its affiliate institutions (Barnard College, Teachers College, Union Theological Seminary, and Jewish Theological Seminary). The mission of Academic Commons is to collect and preserve the digital outputs of research and scholarship produced at Columbia and its affiliate institutions and present them to a global audience. Academic Commons accepts articles, dissertations, research data, presentations, working papers, videos, and more.
The Allele Frequency Net Database (AFND) is a public database which contains frequency information of several immune genes such as Human Leukocyte Antigens (HLA), Killer-cell Immunoglobulin-like Receptors (KIR), Major histocompatibility complex class I chain-related (MIC) genes, and a number of cytokine gene polymorphisms. The Allele Frequency Net Database (AFND) provides a central source, freely available to all, for the storage of allele frequencies from different polymorphic areas in the Human Genome. Users can contribute the results of their work into one common database and can perform database searches on information already available. We have currently collected data in allele, haplotype and genotype format. However, the success of this website will depend on you to contribute your data.
FANTOM stands for 'Functional Annotation of the Mammalian Genome' and is the name of an international research consortium organized by the RIKEN Omics Science Center. The FANTOM5 project aims to build a full understanding of transcriptional regulation in a human system by generating transcriptional regulatory networks that define every human cell type.
InnateDB is a publicly available database of the genes, proteins, experimentally-verified interactions and signaling pathways involved in the innate immune response of humans, mice and bovines to microbial infection. The database captures an improved coverage of the innate immunity interactome by integrating known interactions and pathways from major public databases together with manually-curated data into a centralised resource. The database can be mined as a knowledgebase or used with our integrated bioinformatics and visualization tools for the systems level analysis of the innate immune response.
DOMINO is an open-access database comprising more than 3900 annotated experiments describing interactions mediated by protein-interaction domains. The curation effort aims at covering the interactions mediated by the following domains: SH3, SH2, 14-3-3, PDZ, PTB, WW, EVH, VHS, FHA, EH, FF, BRCT, Bromo, Chromo, and GYF. The interactions deposited in DOMINO are annotated according to the PSI MI standard and can be easily analyzed in the context of the global protein interaction network as downloaded from major interaction databases such as MINT, INTACT, DIP, and MIPS/MPACT. DOMINO can be searched with a versatile search tool, and the interaction networks can be visualized with a convenient graphic display applet that explicitly identifies the domains/sites involved in the interactions.
The Ensembl genome annotation system, developed jointly by the EBI and the Wellcome Trust Sanger Institute, has been used for the annotation, analysis and display of vertebrate genomes since 2000. Since 2009, the Ensembl site has been complemented by the creation of five new sites, for bacteria, protists, fungi, plants and invertebrate metazoa, enabling users to use a single collection of (interactive and programmatic) interfaces for accessing and comparing genome-scale data from species of scientific interest from across the taxonomy. In each domain, we aim to bring the integrative power of Ensembl tools for comparative analysis, data mining and visualisation across genomes of scientific interest, working in collaboration with scientific communities to improve and deepen genome annotation and interpretation.
The GSS database collects unannotated, short, single-read, primary genomic sequences from GenBank and contains nucleic acid sequences. These sequences include random survey sequences, clone-end sequences, and exon-trapped sequences.
MetaboLights is a database for Metabolomics experiments and derived information. The database is cross-species, cross-technique and covers metabolite structures and their reference spectra as well as their biological roles, locations and concentrations, and experimental data from metabolic experiments.
The mission of the Influenza Research Database (IRD) is to provide a resource for the influenza virus research community that will facilitate an understanding of the influenza virus and how it interacts with the host organism, leading to new treatments and preventive actions. This resource will contain avian and non-human mammalian influenza surveillance data, human clinical data associated with virus extracts, phenotypic characteristics of viruses isolated from extracts, and all genomic and proteomic data available in public repositories for influenza viruses.
This database will provide a central location for scientists to browse uniquely observed proteoforms and to contribute their own datasets. Top-down proteomics is a method of protein identification that uses an ion trapping mass spectrometer to store an isolated protein ion for mass measurement and tandem mass spectrometry analysis.
‘Patient Reported Outcomes Following Initial treatment and Long term Evaluation of Survivorship (PROFILES)’ is a registry for the study of the physical and psychosocial impact of cancer and its treatment from a dynamic, growing population-based cohort of both short and long-term cancer survivors. Researchers from the Netherlands Comprehensive Cancer Centre and Tilburg University in Tilburg, The Netherlands, work together with medical specialists from national hospitals to set up different PROFILES studies, collect the necessary data, and present the results in scientific journals and at (inter)national conferences.
ChEMBL is a database of bioactive drug-like small molecules. It contains 2-D structures, calculated properties (e.g. logP, molecular weight, Lipinski parameters) and abstracted bioactivities (e.g. binding constants, pharmacology and ADMET data). The data are abstracted and curated from the primary scientific literature and cover a significant fraction of the SAR and discovery of modern drugs. We attempt to normalise the bioactivities into a uniform set of end-points and units where possible, and also to tag the links between a molecular target and a published assay with a set of varying confidence levels. Additional data on the clinical progress of compounds are currently being integrated into ChEMBL.
ArrayExpress is one of the major international repositories for high-throughput functional genomics data from both microarray and high-throughput sequencing studies, many of which are supported by peer-reviewed publications. Data sets are either submitted directly to ArrayExpress and curated by a team of specialist biological curators, or are imported systematically from the NCBI Gene Expression Omnibus database on a weekly basis. Data is collected to MIAME and MINSEQE standards.
WFCC-MIRCEN World Data Centre for Microorganisms (WDCM) provides a comprehensive directory of culture collections, databases on microbes and cell lines, and the gateway to biodiversity, molecular biology and genome projects. The WFCC is a Multidisciplinary Commission of the International Union of Biological Sciences (IUBS) and a Federation within the International Union of Microbiological Societies (IUMS). The WFCC is concerned with the collection, authentication, maintenance and distribution of cultures of microorganisms and cultured cells. Its aim is to promote and support the establishment of culture collections and related services, to provide liaison and set up an information network between the collections and their users, to organise workshops, conferences, publications and newsletters, and to work to ensure the long-term perpetuation of important collections.
openLandscapes is an open access information portal for landscape research. Amongst other things, the platform provides information about current research projects. In addition, it offers the scientific community the possibility to maintain a Wiki on landscape-related contents and to make available future primary data from landscape research. In openLandscapes, all technical contents are stored and organised in a networked manner, enabling technical terms to be linked to experts or institutions, as well as to data in the future.
The Reference Sequence (RefSeq) collection provides a comprehensive, integrated, non-redundant, well-annotated set of sequences, including genomic DNA, transcripts, and proteins. RefSeq sequences form a foundation for medical, functional, and diversity studies. They provide a stable reference for genome annotation, gene identification and characterization, mutation and polymorphism analysis (especially RefSeqGene records), expression studies, and comparative analyses.
ALEXA is a microarray design platform for 'alternative expression analysis'. This platform facilitates the design of expression arrays for analysis of mRNA isoforms generated from a single locus by the use of alternative transcription initiation, splicing and polyadenylation sites. We use the term 'ALEXA' to describe a collection of novel genomic methods for 'alternative expression' analysis. 'Alternative expression' refers to the identification and quantification of alternative mRNA transcripts produced by alternative transcript initiation, alternative splicing and alternative polyadenylation. This website provides supplementary materials, source code and other downloads for recent publications describing our studies of alternative expression (AE). Most recently we have developed a method, 'ALEXA-Seq' and associated resources for alternative expression analysis by massively parallel RNA sequencing.
GigaDB primarily serves as a repository to host data and tools associated with articles in GigaScience (GigaScience is an online, open-access journal). GigaDB defines a dataset as a group of files (e.g., sequencing data, analyses, imaging files, software programs) that are related to and support an article or study. GigaDB allows the integration of manuscript publication with supporting data and tools.
The NCBI Trace Archive is a permanent repository of DNA sequence chromatograms (traces), base calls, and quality estimates for single-pass reads from various large-scale sequencing projects. The Trace Archive serves as the repository of sequencing data from gel/capillary platforms such as Applied Biosystems ABI 3730®. The Sequence Read Archive (SRA) stores sequencing data from the next generation of sequencing platforms including Roche 454 GS System®, Illumina Genome Analyzer®, Applied Biosystems SOLiD® System, Helicos Heliscope®, and others. The Trace Assembly Archive stores pairwise alignment and multiple alignment of sequencing reads, linking basic trace data with finished genomic sequence.
NCBI Virus Variation is a specialized database that collects tools to provide searchable resources in the fields of Influenza virus, Dengue virus, and West Nile virus. Specific BLAST databases are listed. New publications are also available on the site. A Rotavirus database will be added to the site soon.
Note: This information is for reference purposes only. It was current when produced and may now be outdated. Archive material is no longer maintained, and some links may not work. HIV and AIDS Costs and Use is the first major research effort to collect information on a nationally representative sample of people in care for HIV infection. Also called the HIV Cost and Services Utilization Study (HCSUS), the core study is meant to help policymakers in the U.S. make informed decisions on the subject. The study describes the type of therapies available and costs of health care services for people with HIV/AIDS, as well as quality of care, social support, and non-medical services HIV/AIDS patients receive. Supplemental studies examine HIV care delivery in rural areas, prevalence of mental and substance abuse disorders, and other health issues of HIV/AIDS patients.
Here you will find authoritative taxonomic information on plants, animals, fungi, and microbes of North America and the world.