  • * at the end of a keyword allows wildcard searches
  • " quotes can be used for searching phrases
  • + represents an AND search (default)
  • | represents an OR search
  • - represents a NOT operation
  • ( and ) imply priority
  • ~N after a word specifies the desired edit distance (fuzziness)
  • ~N after a phrase specifies the desired slop amount
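As a rough illustration of how the operators above combine, the following sketch assembles query strings as plain text. The helper names (prefix, phrase, fuzzy, any_of, all_of, exclude) are hypothetical and are not part of any search API; the code only builds strings and does not contact the search service, so how a given query is actually interpreted remains up to the search engine.

```python
from typing import Optional


def prefix(word: str) -> str:
    """Wildcard search: '*' at the end of a keyword."""
    return f"{word}*"


def phrase(text: str, slop: Optional[int] = None) -> str:
    """Phrase search in quotes, optionally with a slop amount (~N)."""
    quoted = f'"{text}"'
    return quoted if slop is None else f"{quoted}~{slop}"


def fuzzy(word: str, distance: int) -> str:
    """Fuzzy search: ~N after a word sets the edit distance."""
    return f"{word}~{distance}"


def any_of(*terms: str) -> str:
    """OR search, wrapped in parentheses to set priority."""
    return "(" + " | ".join(terms) + ")"


def all_of(*terms: str) -> str:
    """AND search (AND is also the default between terms)."""
    return " + ".join(terms)


def exclude(term: str) -> str:
    """NOT operation: '-' in front of a term."""
    return f"-{term}"


# Hypothetical example: a phrase, combined with either of two organisms,
# excluding entries that mention microarrays.
query = (
    all_of(phrase("protein interaction"), any_of("yeast", "human"))
    + " "
    + exclude("microarray")
)
print(query)  # "protein interaction" + (yeast | human) -microarray
```

The resulting string can be pasted into the search box as-is; the helpers exist only to make the role of each operator explicit.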
Found 37 result(s)
STRING is a database of known and predicted protein interactions. The interactions include direct (physical) and indirect (functional) associations; they are derived from four sources: genomic context, high-throughput experiments, (conserved) coexpression, and previous knowledge. STRING quantitatively integrates interaction data from these sources for a large number of organisms, and transfers information between these organisms where applicable.
IntAct provides a freely available, open source database system and analysis tools for molecular interaction data. All interactions are derived from literature curation or direct user submissions and are freely available.
UniProtKB/Swiss-Prot is the manually annotated and reviewed section of the UniProt Knowledgebase (UniProtKB). It is a high-quality, annotated, non-redundant protein sequence database that brings together experimental results, computed features and scientific conclusions. Since 2002, it has been maintained by the UniProt consortium and is accessible via the UniProt website.
The Plant Metabolic Network (PMN) provides a broad network of plant metabolic pathway databases that contain curated information from the literature and computational analyses about the genes, enzymes, compounds, reactions, and pathways involved in primary and secondary metabolism in plants. The PMN currently houses one multi-species reference database called PlantCyc and 22 species/taxon-specific databases.
>>>!!!<<< SMD has been retired. After approximately fifteen years of microarray-centric research service, the Stanford Microarray Database has been retired. We apologize for any inconvenience; please read below for possible resolutions to your queries. If you are looking for any raw data that was directly linked to SMD from a manuscript, please search one of the public repositories: NCBI Gene Expression Omnibus or EBI ArrayExpress. All published data were previously communicated to one (or both) of the public repositories. Alternatively, data for publications between 1997 and 2004 were likely migrated to the Princeton University MicroArray Database, and are accessible there. If you are looking for a manuscript supplement (i.e. from a domain other than, perhaps try searching the Internet Archive: Wayback Machine. >>>!!!<<< The Stanford Microarray Database (SMD) is a DNA microarray research database that provides a large amount of data for public use.
JASPAR is the leading open-access database of matrix profiles describing the DNA-binding patterns of transcription factors and other proteins interacting with DNA in a sequence-specific manner.
A human interactome map. The sequencing of the human genome has revealed a surprisingly small number of genes, indicating that the complex organization of life is not reflected in the gene number but, rather, in the gene products, that is, in the proteins. These macromolecules regulate the vast majority of cellular processes by their ability to communicate with each other and to assemble into larger functional units. Therefore, the systematic analysis of protein-protein interactions is fundamental for the understanding of protein function, cellular processes and, ultimately, the complexity of life. Moreover, interactome maps are particularly needed to link new proteins to disease pathways and to identify novel drug targets.
AmoebaDB belongs to the EuPathDB family of databases and is an integrated genomic and functional genomic database for Entamoeba and Acanthamoeba parasites. In its first iteration (released in early 2010), AmoebaDB contains the genomes of three Entamoeba species (see below). AmoebaDB integrates whole genome sequence and annotation and will rapidly expand to include experimental data and environmental isolate sequences provided by community researchers. The database includes supplemental bioinformatics analyses and a web interface for data-mining.
The database contains all the variants published as pathogenic mutations in the international literature up to November 2007. In addition, unpublished Usher mutations and non-pathogenic variants from the laboratory of Montpellier have been included.
ToxoDB is a genome database for the genus Toxoplasma, a set of single-celled eukaryotic pathogens that cause human and animal diseases, including toxoplasmosis.
BindingDB is a public, web-accessible database of measured binding affinities, focusing chiefly on the interactions of proteins considered to be candidate drug-targets with ligands that are small, drug-like molecules. BindingDB supports medicinal chemistry and drug discovery via literature awareness and development of structure-activity relations (SAR and QSAR); validation of computational chemistry and molecular modeling approaches such as docking, scoring and free energy methods; chemical biology and chemical genomics; and basic studies of the physical chemistry of molecular recognition. BindingDB also includes a small collection of host-guest binding data of interest to chemists studying supramolecular systems. The data collection derives from a variety of measurement techniques, including enzyme inhibition and kinetics, isothermal titration calorimetry, NMR, and radioligand and competition assays. BindingDB includes data extracted from the literature by the BindingDB project, selected PubChem confirmatory BioAssays, and ChEMBL entries for which a well-defined protein target ("TARGET_TYPE='PROTEIN'") is provided. Data extracted by BindingDB typically include more details regarding experimental conditions, etc.
The main objective of our work is to understand the pathomechanisms of late-onset neurodegenerative disorders such as Huntington's, Parkinson's, Alzheimer's and Machado-Joseph disease and to develop causal therapies for them. The disease-causing proteins of these illnesses have been identified, but their functions in the unaffected organism are mostly unknown. Here, we have developed a strategy combining library and matrix yeast two-hybrid screens to generate a highly connected PPI network for Huntington's disease (HD).
GENCODE is a scientific project in genome research and part of the ENCODE (ENCyclopedia Of DNA Elements) scale-up project. The GENCODE consortium was initially formed as part of the pilot phase of the ENCODE project to identify and map all protein-coding genes within the ENCODE regions (approx. 1% of the human genome). Given the initial success of the project, GENCODE now aims to build an “Encyclopedia of genes and gene variants” by identifying all gene features in the human and mouse genomes using a combination of computational analysis, manual annotation, and experimental validation, and by annotating all evidence-based gene features in the entire human genome with high accuracy.
MycoCosm is the DOE JGI’s web-based fungal genomics resource, which integrates fungal genomics data and analytical tools for fungal biologists. It provides navigation through sequenced genomes, genome analysis in the context of comparative genomics, and a genome-centric view. MycoCosm promotes user community participation in data submission, annotation and analysis.
GeneCards is a searchable, integrative database that provides comprehensive, user-friendly information on all annotated and predicted human genes. It automatically integrates gene-centric data from ~125 web sources, including genomic, transcriptomic, proteomic, genetic, clinical and functional information.
The Yeast Metabolome Database (YMDB) is a manually curated database of small molecule metabolites found in or produced by Saccharomyces cerevisiae (also known as Baker’s yeast and Brewer’s yeast). This database covers metabolites described in textbooks, scientific journals, metabolic reconstructions and other electronic databases.
BenchSci is a free platform designed to help biomedical research scientists quickly and easily identify validated antibodies from publications. Using various filters, including technique, tissue, cell line, and more, scientists can find published data, along with the antibodies that match specific experimental contexts, within seconds. Registration and access are free for academic research scientists.
BioSamples stores and supplies descriptions and metadata about biological samples used in research and development by academia and industry. Samples are either 'reference' samples (e.g. from 1000 Genomes, HipSci, FAANG) or have been used in an assay database such as the European Nucleotide Archive (ENA) or ArrayExpress.
BioGRID ORCS is an open repository of CRISPR screens compiled through comprehensive curation efforts. The current index is version 1.0.3 and searches more than 49 publications and 58,161 genes to return more than 895 CRISPR screens from 3 major model organism species and 629 cell lines. All screen data are freely provided through our search index and available via download in a wide variety of standardized formats.
This site provides access to complete, annotated genomes from bacteria and archaea (present in the European Nucleotide Archive) through the Ensembl graphical user interface (genome browser). Ensembl Bacteria contains genomes from annotated INSDC records that are loaded into Ensembl multi-species databases, using the INSDC annotation import pipeline.
The UniPROBE (Universal PBM Resource for Oligonucleotide Binding Evaluation) database hosts data generated by universal protein binding microarray (PBM) technology on the in vitro DNA binding specificities of proteins. This initial release of the UniPROBE database provides a centralized resource for accessing comprehensive data on the preferences of proteins for all possible sequence variants ('words') of length k ('k-mers'), as well as position weight matrix (PWM) and graphical sequence logo representations of the k-mer data. In total, the database currently hosts DNA binding data for 406 nonredundant proteins from a diverse collection of organisms, including the prokaryote Vibrio harveyi, the eukaryotic malarial parasite Plasmodium falciparum, the parasitic Apicomplexan Cryptosporidium parvum, the yeast Saccharomyces cerevisiae, the worm Caenorhabditis elegans, mouse, and human. The database's web tools (on the right) include a text-based search, a function for assessing motif similarity between user-entered data and database PWMs, and a function for locating putative binding sites along user-entered nucleotide sequences.
>>>!!!<<< Efforts to obtain renewed funding after 2008 were unfortunately not successful. PANDIT has therefore been frozen since November 2008, and its data have not been updated since September 2005, when version 17.0 was released (corresponding to Pfam 17.0). The existing data and website remain available from these pages, and should remain stable and, we hope, useful. >>>!!!<<< PANDIT is a collection of multiple sequence alignments and phylogenetic trees. It contains corresponding amino acid and nucleotide sequence alignments, with trees inferred from each alignment. PANDIT is based on the Pfam database (Protein families database of alignments and HMMs), and includes the seed amino acid alignments of most families in the Pfam-A database. DNA sequences for as many members of each family as possible are extracted from the EMBL Nucleotide Sequence Database and aligned according to the amino acid alignment. PANDIT also contains a further copy of the amino acid alignments, restricted to the sequences for which DNA sequences were found.
This DOI repository provides permanent identifiers for data sets that are generated by life science researchers active in Sweden and for which no other suitable public repository is available. BILS is a distributed national research infrastructure supported by the Swedish Research Council (Vetenskapsrådet) that provides bioinformatics support to life science researchers in Sweden.
The Sol Genomics Network (SGN) is a clade-oriented database dedicated to the biology of the Solanaceae family, which includes a large number of closely related and many agronomically important species such as tomato, potato, tobacco, eggplant, pepper, and the ornamental Petunia hybrida. SGN is part of the International Solanaceae Initiative (SOL), which has the long-term goal of creating a network of resources and information to address key questions in plant adaptation and diversification.
The Pfam database is a large collection of protein families, each represented by multiple sequence alignments and hidden Markov models (HMMs).