Reset all


Content Types


AID systems



Data access

Data access restrictions

Database access

Database access restrictions

Database licenses

Data licenses

Data upload

Data upload restrictions

Enhanced publication

Institution responsibility type

Institution type


Metadata standards

PID systems

Provider types

Quality management

Repository languages



Repository types


  • * at the end of a keyword allows wildcard searches
  • " quotes can be used for searching phrases
  • + represents an AND search (default)
  • | represents an OR search
  • - represents a NOT operation
  • ( and ) implies priority
  • ~N after a word specifies the desired edit distance (fuzziness)
  • ~N after a phrase specifies the desired slop amount
Found 49 result(s)
The BioStudies database holds descriptions of biological studies, links to data from these studies in other databases at EMBL-EBI or outside, as well as data that do not fit in the structured archives at EMBL-EBI. The database accepts submissions via an online tool, or in a simple tab-delimited format. It also enables authors to submit supplementary information and link to it from the publication.
4TU.ResearchData, previously known as 3TU.Datacentrum, is an archive for research data. It offers the knowledge, experience and the tools to share and safely store scientific research data in a standardized, secure and well-documented manner. 4TU.Centre for Research Data provides the research community with: Advice and support on data management; A long-term archive for scientific research data; Support for current research projects; Tools for reusing research data.
STRING is a database of known and predicted protein interactions. The interactions include direct (physical) and indirect (functional) associations; they are derived from four sources: - Genomic Context - High-throughput Experiments - (Conserved) Coexpression - Previous Knowledge STRING quantitatively integrates interaction data from these sources for a large number of organisms, and transfers information between these organisms where applicable.
The UniProt Knowledgebase (UniProtKB) is the central hub for the collection of functional information on proteins, with accurate, consistent and rich annotation. In addition to capturing the core data mandatory for each UniProtKB entry (mainly, the amino acid sequence, protein name or description, taxonomic data and citation information), as much annotation information as possible is added. This includes widely accepted biological ontologies, classifications and cross-references, and clear indications of the quality of annotation in the form of evidence attribution of experimental and computational data. The Universal Protein Resource (UniProt) is a comprehensive resource for protein sequence and annotation data. The UniProt databases are the UniProt Knowledgebase (UniProtKB), the UniProt Reference Clusters (UniRef), and the UniProt Archive (UniParc). The UniProt Metagenomic and Environmental Sequences (UniMES) database is a repository specifically developed for metagenomic and environmental data. The UniProt Knowledgebase,is an expertly and richly curated protein database, consisting of two sections called UniProtKB/Swiss-Prot and UniProtKB/TrEMBL.
BRENDA is the main collection of enzyme functional data available to the scientific community worldwide. The enzymes are classified according to the Enzyme Commission list of enzymes. It is available free of charge for via the internet ( and as an in-house database for commercial users (requests to our distributor Biobase). The enzymes are classified according to the Enzyme Commission list of enzymes. Some 5000 "different" enzymes are covered. Frequently enzymes with very different properties are included under the same EC number. BRENDA includes biochemical and molecular information on classification, nomenclature, reaction, specificity, functional parameters, occurrence, enzyme structure, application, engineering, stability, disease, isolation, and preparation. The database also provides additional information on ligands, which function as natural or in vitro substrates/products, inhibitors, activating compounds, cofactors, bound metals, and other attributes.
Reactome is a manually curated, peer-reviewed pathway database, annotated by expert biologists and cross-referenced to bioinformatics databases. Its aim is to share information in the visual representations of biological pathways in a computationally accessible format. Pathway annotations are authored by expert biologists, in collaboration with Reactome editorial staff and cross-referenced to many bioinformatics databases. These include NCBI Gene, Ensembl and UniProt databases, the UCSC and HapMap Genome Browsers, the KEGG Compound and ChEBI small molecule databases, PubMed, and Gene Ontology.
ArachnoServer is a manually curated database containing information on the sequence, three-dimensional structure, and biological activity of protein toxins derived from spider venom. Spiders are the largest group of venomous animals and they are predicted to contain by far the largest number of pharmacologically active peptide toxins (Escoubas et al., 2006). ArachnoServer has been custom-built so that a wide range of biological scientists, including neuroscientists, pharmacologists, and toxinologists, can readily access key data relevant to their discipline without being overwhelmed by extraneous information.
The Database of Protein Disorder (DisProt) is a curated database that provides information about proteins that lack fixed 3D structure in their putatively native states, either in their entirety or in part. DisProt is a community resource annotating protein sequences for intrinsically disorder regions from the literature. It classifies intrinsic disorder based on experimental methods and three ontologies for molecular function, transition and binding partner.
jPOSTrepo (Japan ProteOme STandard Repository) is a repository of sharing MS raw/processed data. It consists of a high-speed file upload process, flexible file management system and easy-to-use interfaces. Users can release their "raw/processed" data via this site with a unique identifier number for the paper publication. Users also can suspend (or "embargo") their data until their paper is published. The file transfer from users’ computer to our repository server is very fast (roughly ten times faster than usual file transfer) and uses only web browsers – it does not require installing any additional software.
Database of mass spectra of known, unknown and provisionally identified substances. MassBank is the first public repository of mass spectral data for sharing them among scientific research community. MassBank data are useful for the chemical identification and structure elucidation of chemical compounds detected by mass spectrometry.
TopFIND is a protein-centric database for the annotation of protein termini currently in its third version. Non-canonical protein termini can be the result of multiple different biological processes, including pre-translational processes such as alternative splicing and alternative translation initiation or post-translational protein processing by proteases that cleave proteases as part of protein maturation or as a regulatory modification. Accordingly, protein termini evidence in TopFIND is inferred from other databases such as ENSEMBL transcripts, TISdb for alternative translation initiation, MEROPS for protein cleavage by proteases, and UniProt for canonical and protein isoform start sites.
The cisRED database holds conserved sequence motifs identified by genome scale motif discovery, similarity, clustering, co-occurrence and coexpression calculations. Sequence inputs include low-coverage genome sequence data and ENCODE data. A Nucleic Acids Research article describes the system architecture
MTD is focused on mammalian transcriptomes with a current version that contains data from humans, mice, rats and pigs. Regarding the core features, the MTD browses genes based on their neighboring genomic coordinates or joint KEGG pathway and provides expression information on exons, transcripts, and genes by integrating them into a genome browser. We developed a novel nomenclature for each transcript that considers its genomic position and transcriptional features.
The PRIDE PRoteomics IDEntifications database is a centralized, standards compliant, public data repository for proteomics data, including protein and peptide identifications, post-translational modifications and supporting spectral evidence. PRIDE encourages and welcomes direct user submissions of mass spectrometry data to be published in peer-reviewed publications.
DEIMS-SDR (Dynamic Ecological Information Management System - Site and dataset registry) is an information management system that allows you to discover long-term ecosystem research sites around the globe, along with the data gathered at those sites and the people and networks associated with them. DEIMS-SDR describes a wide range of sites, providing a wealth of information, including each site’s location, ecosystems, facilities, parameters measured and research themes. It is also possible to access a growing number of datasets and data products associated with the sites. All sites and dataset records can be referenced using unique identifiers that are generated by DEIMS-SDR. It is possible to search for sites via keyword, predefined filters or a map search. By including accurate, up to date information in DEIMS, site managers benefit from greater visibility for their LTER site, LTSER platform and datasets, which can help attract funding to support site investments. The aim of DEIMS-SDR is to be the globally most comprehensive catalogue of environmental research and monitoring facilities, featuring foremost but not exclusively information about all LTER sites on the globe and providing that information to science, politics and the public in general.
M-CSA is a database of enzyme reaction mechanisms. It provides annotation on the protein, catalytic residues, cofactors, and the reaction mechanisms of hundreds of enzymes. There are two kinds of entries in M-CSA. 'Detailed mechanism' entries are more complete and show the individual chemical steps of the mechanism as schemes with electron flow arrows. 'Catalytic Site' entries annotate the catalytic residues necessary for the reaction, but do not show the mechanism. The M-CSA (Mechanism and Catalytic Site Atlas) represents a unified resource that combines the data in both MACiE and the CSA
Xenbase's mission is to provide the international research community with a comprehensive, integrated and easy to use web based resource that gives access the diverse and rich genomic, expression and functional data available from Xenopus research. Xenbase also provides a critical data sharing infrastructure for many other NIH-funded projects, and is a focal point for the Xenopus community. In addition to our primary goal of supporting Xenopus researchers, Xenbase enhances the availability and visibility of Xenopus data to the broader biomedical research community.
OpenWorm aims to build the first comprehensive computational model of the Caenorhabditis elegans (C. elegans), a microscopic roundworm. With only a thousand cells, it solves basic problems such as feeding, mate-finding and predator avoidance. Despite being extremely well studied in biology, this organism still eludes a deep, principled understanding of its biology. We are using a bottom-up approach, aimed at observing the worm behaviour emerge from a simulation of data derived from scientific experiments carried out over the past decade. To do so we are incorporating the data available in the scientific community into software models. We are engineering Geppetto and Sibernetic, open-source simulation platforms, to be able to run these different models in concert. We are also forging new collaborations with universities and research institutes to collect data that fill in the gaps All the code we produce in the OpenWorm project is Open Source and available on GitHub.
The Database explores the interactions of chemicals and proteins. It integrates information about interactions from metabolic pathways, crystal structures, binding experiments and drug-target relationships. Inferred information from phenotypic effects, text mining and chemical structure similarity is used to predict relations between chemicals. STITCH further allows exploring the network of chemical relations, also in the context of associated binding proteins.
The CATH database is a hierarchical domain classification of protein structures in the Protein Data Bank. Protein structures are classified using a combination of automated and manual procedures. There are four major levels in the CATH hierarchy; Class, Architecture, Topology and Homologous superfamily.
This DOI repository provides permanent identifiers to data sets generated by Life Science researchers active in Sweden, and for which no other suitable public repository is available. BILS is a distributed national research infrastructure supported by the Swedish Research Council (Vetenskapsrådet) providing bioinformatics support to life science researchers in Sweden.
The ProteomeXchange consortium has been set up to provide a single point of submission of MS proteomics data to the main existing proteomics repositories, and to encourage the data exchange between them for optimal data dissemination. Current members accepting submissions are: The PRIDE PRoteomics IDEntifications database at the European Bioinformatics Institute focusing mainly on shotgun mass spectrometry proteomics data PeptideAtlas/PASSEL focusing on SRM/MRM datasets.
The IPD-IMGT/HLA Database provides a specialist database for sequences of the human major histocompatibility complex (MHC) and includes the official sequences named by the WHO Nomenclature Committee For Factors of the HLA System. The IPD-IMGT/HLA Database is part of the international ImMunoGeneTics project (IMGT). The database uses the 2010 naming convention for HLA alleles in all tools herein. To aid in the adoption of the new nomenclature, all search tools can be used with both the current and pre-2010 allele designations. The pre-2010 nomenclature designations are only used where older reports or outputs have been made available for download.