Reset all


Content Types


AID systems



Data access

Data access restrictions

Database access

Database licenses

Data licenses

Data upload

Data upload restrictions

Enhanced publication

Institution responsibility type

Institution type


Metadata standards

PID systems

Provider types

Quality management

Repository languages



Repository types


  • * at the end of a keyword allows wildcard searches
  • " quotes can be used for searching phrases
  • + represents an AND search (default)
  • | represents an OR search
  • - represents a NOT operation
  • ( and ) implies priority
  • ~N after a word specifies the desired edit distance (fuzziness)
  • ~N after a phrase specifies the desired slop amount
  • 1 (current)
Found 24 result(s)
OrthoMCL is a genome-scale algorithm for grouping orthologous protein sequences. It provides not only groups shared by two or more species/genomes, but also groups representing species-specific gene expansion families. So it serves as an important utility for automated eukaryotic genome annotation. OrthoMCL starts with reciprocal best hits within each genome as potential in-paralog/recent paralog pairs and reciprocal best hits across any two genomes as potential ortholog pairs. Related proteins are interlinked in a similarity graph. Then MCL (Markov Clustering algorithm,Van Dongen 2000; is invoked to split mega-clusters. This process is analogous to the manual review in COG construction. MCL clustering is based on weights between each pair of proteins, so to correct for differences in evolutionary distance the weights are normalized before running MCL.
AmoebaDB belongs to the EuPathDB family of databases and is an integrated genomic and functional genomic database for Entamoeba and Acanthamoeba parasites. In its first iteration (released in early 2010), AmoebaDB contains the genomes of three Entamoeba species (see below). AmoebaDB integrates whole genome sequence and annotation and will rapidly expand to include experimental data and environmental isolate sequences provided by community researchers . The database includes supplemental bioinformatics analyses and a web interface for data-mining.
ToxoDB is a genome database for the genus Toxoplasma, a set of single-celled eukaryotic pathogens that cause human and animal diseases, including toxoplasmosis.
FungiDB belongs to the EuPathDB family of databases and is an integrated genomic and functional genomic database for the kingdom Fungi. FungiDB was first released in early 2011 as a collaborative project between EuPathDB and the group of Jason Stajich (University of California, Riverside). At the end of 2015, FungiDB was integrated into the EuPathDB bioinformatic resource center. FungiDB integrates whole genome sequence and annotation and also includes experimental and environmental isolate sequence data. The database includes comparative genomics, analysis of gene expression, and supplemental bioinformatics analyses and a web interface for data-mining.
TriTrypDB is an integrated genomic and functional genomic database for pathogens of the family Trypanosomatidae, including organisms in both Leishmania and Trypanosoma genera. TriTrypDB and its continued development are possible through the collaborative efforts between EuPathDB, GeneDB and colleagues at the Seattle Biomedical Research Institute (SBRI).
Giardia lamblia is a significant, environmentally transmitted, human pathogen and an amitochondriate protist. It is a major contributor to the enormous worldwide burden of human diarrheal diseases, yet the basic biology of this parasite is not well understood. No virulence factor has been identified. The Giardia lamblia genome contains only 12 million base pairs distributed onto five chromosomes. Its analysis promises to provide insights about the origins of nuclear genome organization, the metabolic pathways used by parasitic protists, and the cellular biology of host interaction and avoidance of host immune systems. Since the divergence of Giardia lamblia lies close to the transition between eukaryotes and prokaryotes in universal ribosomal RNA phylogenies, it is a valuable, if not unique, model for gaining basic insights into genetic innovations that led to formation of eukaryotic cells. In evolutionary terms, the divergence of this organism is at least twice as ancient as the common ancestor for yeast and man. A detailed study of its genome will provide insights into an early evolutionary stage of eukaryotic chromosome organization as well as other aspects of the prokaryotic / eukaryotic divergence.
VectorBase provides data on arthropod vectors of human pathogens. Sequence data, gene expression data, images, population data, and insecticide resistance data for arthropod vectors are available for download. VectorBase also offers genome browser, gene expression and microarray repository, and BLAST searches for all VectorBase genomes. VectorBase Genomes include Aedes aegypti, Anopheles gambiae, Culex quinquefasciatus, Ixodes scapularis, Pediculus humanus, Rhodnius prolixus. VectorBase is one the Bioinformatics Resource Centers (BRC) projects which is funded by National Institute of Allergy and Infectious Diseases (NAID).
The Maize Genetics and Genomics Database focuses on collecting data related to the crop plant and model organism Zea mays. The project's goals are to synthesize, display, and provide access to maize genomics and genetics data, prioritizing mutant and phenotype data and tools, structural and genetic map sets, and gene models. MaizeGDB also aims to make the Maize Newsletter available, and provide support services to the community of maize researchers. MaizeGDB is working with the Schnable lab, the Panzea project, The Genome Reference Consortium, and iPlant Collaborative to create a plan for archiving, dessiminating, visualizing, and analyzing diversity data. MMaizeGDB is short for Maize Genetics/Genomics Database. It is a USDA/ARS funded project to integrate the data found in MaizeDB and ZmDB into a single schema, develop an effective interface to access this data, and develop additional tools to make data analysis easier. Our goal in the long term is a true next-generation online maize database.aize genetics and genomics database.
MicrosporidiaDB belongs to the EuPathDB family of databases and is an integrated genomic and functional genomic database for the phylum Microsporidia. In its first iteration (released in early 2010), MicrosporidiaDB contains the genomes of two Encephalitozoon species (see below). MicrosporidiaDB integrates whole genome sequence and annotation and will rapidly expand to include experimental data and environmental isolate sequences provided by community researchers. The database includes supplemental bioinformatics analyses and a web interface for data-mining.
CryptoDB is an integrated genomic and functional genomic database for the parasite Cryptosporidium and other related genera. CryptoDB integrates whole genome sequence and annotation along with experimental data and environmental isolate sequences provided by community researchers. The database includes supplemental bioinformatics analyses and a web interface for data-mining.
The Harvard Dataverse is open to all scientific data from all disciplines worldwide. It includes the world's largest collection of social science research data. It is hosting data for projects, archives, researchers, journals, organizations, and institutions.
OpenAgrar publication server now also features research data. Creators support further development as claimed on the Open-Access Days 2017 (Dresden, Germany)
The Immunology Database and Analysis Portal (ImmPort) archives clinical study and trial data generated by NIAID/DAIT-funded investigators. Data types housed in ImmPort include subject assessments i.e., medical history, concomitant medications and adverse events as well as mechanistic assay data such as flow cytometry, ELISA, ELISPOT, etc. --- You won't need an ImmPort account to search for compelling studies, peruse study demographics, interventions and mechanistic assays. But why stop there? What you really want to do is download the study, look at each experiment in detail including individual ELISA results and flow cytometry files. Perhaps you want to take those flow cytometry files for a test drive using FLOCK in the ImmPort flow cytometry module. To download all that interesting data you will need to register for ImmPort access.
TERN's AEKOS data portal is the original gateway to Australian ecology data. It is a ‘data and research methods’ data portal for Australia’s land-dwelling plants, animals and their environments. The primary focus of data content is raw co-located ‘species and environment’ ecological survey data that has been collected at the ‘plot’ level to describe biodiversity, its patterns and ecological processes. It is openly accessible with standard discovery metadata and user-oriented, contextual metadata critical for data reuse. Our services support the ecosystem science community, land managers and governments seeking to publish under COPE publishing ethics and the FAIR data publishing principles. AEKOS is registered with Thomson & Reuters Data Citation Index and is a recommended repository of Nature Publishing’s Scientific Data. There are currently 97,037 sites covering mostly plant biodiversity and co-located environmental data of Australia. The AEKOS initiative is supported by TERN (, hosted by The University of Adelaide and funded by the Australian Government’s National Research Infrastructure for Australia.
EuPathDB (formerly ApiDB) is an integrated database covering the eukaryotic pathogens in the genera Acanthamoeba, Annacaliia, Babesia, Crithidia, Cryptosporidium, Edhazardia, Eimeria, Encephalitozoon, Endotrypanum, Entamoeba, Enterocytozoon, Giardia, Gregarina, Hamiltosporidium, Leishmania, Nematocida, Neospora, Nosema, Plasmodium, Theileria, Toxoplasma, Trichomonas, Trypanosoma and Vavraia, Vittaforma). While each of these groups is supported by a taxon-specific database built upon the same infrastructure, the EuPathDB portal offers an entry point to all of these resources, and the opportunity to leverage orthology for searches across genera.
Content type(s)
A genome database for the genus Piroplasma. PiroplasmaDB is a member of pathogen-databases that are housed under the NIAID-funded EuPathDB Bioinformatics Resource Center (BRC) umbrella.
Rodare is the institutional research data repository at HZDR (Helmholtz-Zentrum Dresden-Rossendorf). Rodare allows HZDR researchers to upload their research data and enrich those with metadata to make them findable, accessible, interoperable and retrievable (FAIR). By publishing all associated research data via Rodare research reproducibility can be improved. Uploads receive a Digital Object Identfier (DOI) and can be harvested via a OAI-PMH interface.
Content type(s)
TrichDB integrated genomic resources for the eukaryotic protist pathogens Trichomonas vaginalis.
The Department of Energy Systems Biology Knowledgebase (KBase) is a software and data platform designed to meet the grand challenge of systems biology: predicting and designing biological function. KBase integrates data and tools in a unified graphical interface so users do not need to access them from numerous sources or learn multiple systems in order to create and run sophisticated systems biology workflows. Users can perform large-scale analyses and combine multiple lines of evidence to model plant and microbial physiology and community dynamics. KBase is the first large-scale bioinformatics system that enables users to upload their own data, analyze it (along with collaborator and public data), build increasingly realistic models, and share and publish their workflows and conclusions. KBase aims to provide a knowledgebase: an integrated environment where knowledge and insights are created and multiplied.
BBMRI-ERIC is a European research infrastructure for biobanking. We bring together all the main players from the biobanking field – researchers, biobankers, industry, and patients – to boost biomedical research. To that end, we offer quality management services, support with ethical, legal and societal issues, and a number of online tools and software solutions. Ultimately, our goal is to make new treatments possible. The Directory is a tool to share aggregate information about the biobanks that are willing external collaboration. It is based on the MIABIS 2.0 standard, which describes the samples and data in the biobanks at an aggregated level.
BacDive is a bacterial metadatabase that provides strain-linked information about bacterial and archaeal biodiversity. The database is a resource for different kind of phenotypic data like taxonomy, morphology, physiology, environment and molecular-biology. The majority of data is manually annotated and curated. With the release in April 2019 BacDive offers information for 80,584 strains. The database is hosted by the Leibniz Institute DSMZ - German Collection of Microorganisms and Cell Cultures GmbH and is part of de.NBI the German Network for Bioinformatics Infrastructure.