Reset all


Content Types


AID systems



Data access

Data access restrictions

Database access

Database access restrictions

Database licenses

Data licenses

Data upload

Data upload restrictions

Enhanced publication

Institution responsibility type

Institution type


Metadata standards

PID systems

Provider types

Quality management

Repository languages



Repository types


  • * at the end of a keyword allows wildcard searches
  • " quotes can be used for searching phrases
  • + represents an AND search (default)
  • | represents an OR search
  • - represents a NOT operation
  • ( and ) implies priority
  • ~N after a word specifies the desired edit distance (fuzziness)
  • ~N after a phrase specifies the desired slop amount
Found 43 result(s)
The aim of FlyReactome, based in the Department of Genetics, University of Cambridge, is to develop a curated repository for Drosophila melanogaster pathways and reactions. The information in this database is authored by biological researchers with expertise in their fields, maintained by the FlyReactome staff.
METLIN represents the largest MS/MS collection of data with the database generated at multiple collision energies and in positive and negative ionization modes. The data is generated on multiple instrument types including SCIEX, Agilent, Bruker and Waters QTOF mass spectrometers.
The Organelle Genome Megasequencing Program (OGMP) provides mitochondrial, chloroplast, and mitochondrial plasmid genome data. OGMP tools allow direct comparison of OGMP and NCBI validated records. Includes GOBASE, a taxonomically broad organelle genome database that organizes and integrates diverse data related to mitochondria and chloroplasts.
The IMEx consortium is an international collaboration between a group of major public interaction data providers who have agreed to share curation effort and develop and work to a single set of curation rules when capturing data from both directly deposited interaction data or from publications in peer-reviewed journals, capture full details of an interaction in a “deep” curation model, perform a complete curation of all protein-protein interactions experimentally demonstrated within a publication, make these interaction available in a single search interface on a common website, provide the data in standards compliant download formats, make all IMEx records freely accessible under the Creative Commons Attribution License
Online Mendelian Inheritance in Animals (OMIA) is a catalogue/compendium of inherited disorders, other (single-locus) traits, and genes in 218 animal species (other than human and mouse and rats, which have their own resources) authored by Professor Frank Nicholas of the University of Sydney, Australia, with help from many people over the years. OMIA information is stored in a database that contains textual information and references, as well as links to relevant PubMed and Gene records at the NCBI, and to OMIM and Ensembl.
It captures and catalogues ancient human genome and microbiome data, including raw sequence and processed data, along with metadata about its provenance and production. Included datasets are generated from ancient samples studied at the Australian Centre for Ancient DNA, University of Adelaide in collaboration with other research groups. Datasets and collections in OAGR are open data resources made freely available in a reusable form, using open file formats and licensed with minimal restrictions for reuse. Digital object identifiers (DOIs) are minted for included datasets and collections to facilitate persistent identification and citation.
Swiss Institute of Bioinformatics (SIB) coordinates research and education in bioinformatics throughout Switzerland and provides bioinformatics services to the national and international research community. ExPASy gives access to numerous repositories and databases of SIB. For example: array map, MetaNetX, SWISS-MODEL and World-2DPAGE, and many others see a list here
The PeptideAtlas validates expressed proteins to provide eukaryotic genome data. Peptide Atlas provides data to advance biological discoveries in humans. The PeptideAtlas accepts proteomic data from high-throughput processes and encourages data submission.
Content type(s)
The JCB DataViewer is an image hosting and presentation platform for original image datasets associated with articles published in The Journal of Cell Biology, a peer-reviewed journal from the Rockefeller University Press.
The DrugBank database is a unique bioinformatics and cheminformatics resource that combines detailed drug (i.e. chemical, pharmacological and pharmaceutical) data with comprehensive drug target (i.e. sequence, structure, and pathway) information. The database contains 6811 drug entries including 1528 FDA-approved small molecule drugs, 150 FDA-approved biotech (protein/peptide) drugs, 87 nutraceuticals and 5080 experimental drugs. Additionally, 4294 non-redundant protein (i.e. drug target/enzyme/transporter/carrier) sequences are linked to these drug entries. Each DrugCard entry contains more than 150 data fields with half of the information being devoted to drug/chemical data and the other half devoted to drug target or protein data.
Content type(s)
Since the first discovery of RNA pseudoknots more and many more pseudoknots have been found. However, not all of those pseudoknot data are easy to trace. Sometimes the information is hidden in a publication where the title gives no hint that pseudoknot information is there. This was the first reason that we thought that a general accessible information source for pseudoknots would be handy.
The Database of Protein Disorder (DisProt) is a curated database that provides information about proteins that lack fixed 3D structure in their putatively native states, either in their entirety or in part. DisProt is a community resource annotating protein sequences for intrinsically disorder regions from the literature. It classifies intrinsic disorder based on experimental methods and three ontologies for molecular function, transition and binding partner.
Oral Cancer Gene Database is an initiative of the Advanced Centre for Treatment, Research and Education in Cancer, Navi Mumbai. The present database, version II, consists of 374 genes. It is developed as a user friendly site that would provide the scientist, information and external links from one place. The database is accessed through a list of all genes, and Keyword Search using gene name or gene symbol, chromosomal location, CGH (in %), and molecular weight. Interaction Network shows the interaction between genes for particular biological processes and molecular functions.
Content type(s)
A small genotype data repository containing data used in recent papers from the Estonian Biocentre. Most of the data pertains to human population genetics. PDF files of the papers are also freely available.
The Taenia solium genome project is a whole genome sequencing project of the parasite Taenia solium, the causal agent of human and porcine cysticercosis; a disease that is still a public health problem of relevance in Mexico. It is being carried out by a consortium of scientists belonging to diverse institutions of the Universidad Nacional Autónoma de México (UNAM, the National Autonomous University of Mexico).
Complete Genomics provides free public access to a variety of whole human genome data sets generated from Complete Genomics’ sequencing service. The research community can explore and familiarize themselves with the quality of these data sets, review the data formats provided from our sequencing service, and augment their own research with additional summaries of genomic variation across a panel of diverse individuals. The quality of these data sets is representative of what a customer can expect to receive for their own samples. This public genome repository comprises genome results from both our Standard Sequencing Service (69 standard, non-diseased samples) and the Cancer Sequencing Service (two matched tumor and normal sample pairs). In March 2013 Complete Genomics was acquired by BGI-Shenzhen , the world’s largest genomics services company. BGI is a company headquartered in Shenzhen, China that provides comprehensive sequencing and bioinformatics services for commercial science, medical, agricultural and environmental applications. Complete Genomics is now focused on building a new generation of high-throughput sequencing technology and developing new and exciting research, clinical and consumer applications.
The Sol Genomics Network (SGN) is a clade-oriented database dedicated to the biology of the Solanaceae family which includes a large number of closely related and many agronomically important species such as tomato, potato, tobacco, eggplant, pepper, and the ornamental Petunia hybrida. SGN is part of the International Solanaceae Initiative (SOL), which has the long-term goal of creating a network of resources and information to address key questions in plant adaptation and diversification
The CyberCell database (CCDB) is a comprehensive collection of detailed enzymatic, biological, chemical, genetic, and molecular biological data about E. coli (strain K12, MG1655). It is intended to provide sufficient information and querying capacity for biologists and computer scientists to use computers or detailed mathematical models to simulate all or part of a bacterial cell at a nanoscopic (10-9 m), mesoscopic (10-8 m).The CyberCell database CCDB actually consists of 4 browsable databases: 1) the main CyberCell database (CCDB - containing gene and protein information), 2) the 3D structure database (CC3D – containing information for structural proteomics), 3) the RNA database (CCRD – containing tRNA and rRNA information), and 4) the metabolite database (CCMD – containing metabolite information). Each of these databases is accessible through hyperlinked buttons located at the top of the CCDB homepage. All CCDB sub-databases are fully web enabled, permitting a wide variety of interactive browsing, search and display operations. and microscopic (10-6 m) level.
DDBJ; DNA Data Bank of Japan is the sole nucleotide sequence data bank in Asia, which is officially certified to collect nucleotide sequences from researchers and to issue the internationally recognized accession number to data submitters.Since we exchange the collected data with EMBL-Bank/EBI; European Bioinformatics Institute and GenBank/NCBI; National Center for Biotechnology Information on a daily basis, the three data banks share virtually the same data at any given time. The virtually unified database is called "INSD; International Nucleotide Sequence Database DDBJ collects sequence data mainly from Japanese researchers, but of course accepts data and issue the accession number to researchers in any other countries.
EMAGE (e-Mouse Atlas of Gene Expression) is an online biological database of gene expression data in the developing mouse (Mus musculus) embryo. The data held in EMAGE is spatially annotated to a framework of 3D mouse embryo models produced by EMAP (e-Mouse Atlas Project). These spatial annotations allow users to query EMAGE by spatial pattern as well as by gene name, anatomy term or Gene Ontology (GO) term. EMAGE is a freely available web-based resource funded by the Medical Research Council (UK) and based at the MRC Human Genetics Unit in the Institute of Genetics and Molecular Medicine, Edinburgh, UK.
PDBj (Protein Data Bank Japan) provides a centralized PDB archive of macromolecular structures, integrated tools for data retrieval, visualization, and functional characterization. PDBj is supported by JST-NBDC and Osaka University.
The Protein Data Bank (PDB) is an archive of experimentally determined three-dimensional structures of biological macromolecules that serves a global community of researchers, educators, and students. The data contained in the archive include atomic coordinates, crystallographic structure factors and NMR experimental data. Aside from coordinates, each deposition also includes the names of molecules, primary and secondary structure information, sequence database references, where appropriate, and ligand and biological assembly information, details about data collection and structure solution, and bibliographic citations. The Worldwide Protein Data Bank (wwPDB) consists of organizations that act as deposition, data processing and distribution centers for PDB data. Members are: RCSB PDB (USA), PDBe (Europe) and PDBj (Japan), and BMRB (USA). The wwPDB's mission is to maintain a single PDB archive of macromolecular structural data that is freely and publicly available to the global community.
EMDataBank is a global portal for deposition and retrieval of cryo electron microscopy (3DEM) density maps, atomic models and associated metadata. It is a joint effort among investigators of the Protein Databank in Europe (PDBe) at the European Bioinformatics Institute, the Research Collaboratory for Structural Bioinformatics (RCSB) at Rutgers, and the National Center for Macromolecular Imaging (NCMI) at Baylor College of Medicine.