Reset all


Content Types


AID systems



Data access

Data access restrictions

Database access

Database access restrictions

Database licenses

Data licenses

Data upload

Data upload restrictions

Enhanced publication

Institution responsibility type

Institution type


Metadata standards

PID systems

Provider types

Quality management

Repository languages



Repository types


  • * at the end of a keyword allows wildcard searches
  • " quotes can be used for searching phrases
  • + represents an AND search (default)
  • | represents an OR search
  • - represents a NOT operation
  • ( and ) implies priority
  • ~N after a word specifies the desired edit distance (fuzziness)
  • ~N after a phrase specifies the desired slop amount
Found 74 result(s)
Intrepid Bioinformatics serves as a community for genetic researchers and scientific programmers who need to achieve meaningful use of their genetic research data – but can’t spend tremendous amounts of time or money in the process. The Intrepid Bioinformatics system automates time consuming manual processes, shortens workflow, and eliminates the threat of lost data in a faster, cheaper, and better environment than existing solutions. The system also provides the functionality and community features needed to analyze the large volumes of Next Generation Sequencing and Single Nucleotide Polymorphism data, which is generated for a wide range of purposes from disease tracking and animal breeding to medical diagnosis and treatment.
The aim of FlyReactome, based in the Department of Genetics, University of Cambridge, is to develop a curated repository for Drosophila melanogaster pathways and reactions. The information in this database is authored by biological researchers with expertise in their fields, maintained by the FlyReactome staff.
It captures and catalogues ancient human genome and microbiome data, including raw sequence and processed data, along with metadata about its provenance and production. Included datasets are generated from ancient samples studied at the Australian Centre for Ancient DNA, University of Adelaide in collaboration with other research groups. Datasets and collections in OAGR are open data resources made freely available in a reusable form, using open file formats and licensed with minimal restrictions for reuse. Digital object identifiers (DOIs) are minted for included datasets and collections to facilitate persistent identification and citation.
The Cancer Imaging Archive is a freely accessible repository containing medical images and supporting data from cancer patients. Images are stored in DICOM file format. The images are organized as “Collections”, typically patients related by a common disease (e.g. lung cancer), image modality (MRI, CT, etc) or research focus. Search functionality allows users to query across Collections or within them to filter out only the data they are most interested in.
Avibase is an extensive database information system about all birds of the world, containing over 19 million records about 10,000 species and 22,000 subspecies of birds, including distribution information, taxonomy, synonyms in several languages and more. This site is managed by Denis Lepage and hosted by Bird Studies Canada, the Canadian copartner of Birdlife International. Avibase has been a work in progress since 1992 and I am now pleased to offer it as a service to the bird-watching and scientific community.
The project brings together national key players providing environmentally related biological data and services to develop the ‘German Federation for Biological Data' (GFBio). The overall goal is to provide a sustainable, service oriented, national data infrastructure facilitating data sharing and stimulating data intensive science in the fields of biological and environmental research.
The FishNet network is a collaborative effort among fish collections around the world to share and distribute data on specimen holdings. There is an open invitation for any institution with a fish collection to join.
The UniProt Knowledgebase (UniProtKB) is the central hub for the collection of functional information on proteins, with accurate, consistent and rich annotation. In addition to capturing the core data mandatory for each UniProtKB entry (mainly, the amino acid sequence, protein name or description, taxonomic data and citation information), as much annotation information as possible is added. This includes widely accepted biological ontologies, classifications and cross-references, and clear indications of the quality of annotation in the form of evidence attribution of experimental and computational data. The Universal Protein Resource (UniProt) is a comprehensive resource for protein sequence and annotation data. The UniProt databases are the UniProt Knowledgebase (UniProtKB), the UniProt Reference Clusters (UniRef), and the UniProt Archive (UniParc). The UniProt Metagenomic and Environmental Sequences (UniMES) database is a repository specifically developed for metagenomic and environmental data. The UniProt Knowledgebase,is an expertly and richly curated protein database, consisting of two sections called UniProtKB/Swiss-Prot and UniProtKB/TrEMBL.
As with most biomedical databases, the first step is to identify relevant data from the research community. The Monarch Initiative is focused primarily on phenotype-related resources. We bring in data associated with those phenotypes so that our users can begin to make connections among other biological entities of interest. We import data from a variety of data sources. With many resources integrated into a single database, we can join across the various data sources to produce integrated views. We have started with the big players including ClinVar and OMIM, but are equally interested in boutique databases. You can learn more about the sources of data that populate our system from our data sources page
MycoCosm, the DOE JGI’s web-based fungal genomics resource, which integrates fungal genomics data and analytical tools for fungal biologists. It provides navigation through sequenced genomes, genome analysis in context of comparative genomics and genome-centric view. MycoCosm promotes user community participation in data submission, annotation and analysis.
CorrDB has data of cattle, relating to meat production, milk production, growth, health, and others. This database is designed to collect all published livestock genetic/phenotypic trait correlation data, aimed at facilitating genetic network analysis or systems biology studies.
Ag Data Commons (ADC) provides access to a wide variety of open data relevant to agricultural research. We are a centralized repository for data already on the web, as well as for new data being published for the first time. While compliance with the U.S. Federal public access and open data directives is important, we aim to surpass them. Our goal is that ADC will foster innovative data re-use, integration, and visualization to support bigger, better science and policy.
In the framework of the Collaborative Research Centre/Transregio 32 ‘Patterns in Soil-Vegetation-Atmosphere Systems: Monitoring, Modelling, and Data Assimilation’ (CRC/TR32,, funded by the German Research Foundation from 2007 to 2018, a RDM system was self-designed and implemented. The so-called CRC/TR32 project database (TR32DB, is operating online since early 2008. The TR32DB handles all data including metadata, which are created by the involved project participants from several institutions (e.g. Universities of Cologne, Bonn, Aachen, and the Research Centre Jülich) and research fields (e.g. soil and plant sciences, hydrology, geography, geophysics, meteorology, remote sensing). The data is resulting from several field measurement campaigns, meteorological monitoring, remote sensing, laboratory studies and modelling approaches. Furthermore, outcomes of the scientists such as publications, conference contributions, PhD reports and corresponding images are collected in the TR32DB.
GLOBE (Global Collaboration Engine) is an online collaborative environment that enables land change researchers to share, compare and integrate local and regional studies with global data to assess the global relevance of their work.
iNaturalist is a citizen science project and online social network of naturalists, citizen scientists, and biologists built on the concept of mapping and sharing observations of biodiversity across the globe. iNat is a platform for biodiversity research, where anyone can start up their own science project with a specific purpose and collaborate with other observers.
Project Achilles is a systematic effort aimed at identifying and cataloging genetic vulnerabilities across hundreds of genomically characterized cancer cell lines. The project uses genome-wide genetic perturbation reagents (shRNAs or Cas9/sgRNAs) to silence or knock-out individual genes and identify those genes that affect cell survival. Large-scale functional screening of cancer cell lines provides a complementary approach to those studies that aim to characterize the molecular alterations (e.g. mutations, copy number alterations) of primary tumors, such as The Cancer Genome Atlas (TCGA). The overall goal of the project is to identify cancer genetic dependencies and link them to molecular characteristics in order to prioritize targets for therapeutic development and identify the patient population that might benefit from such targets.
The CASRdb site is dedicated to providing information on published mutations and polymorphisms of the calcium-sensing receptor (CASR).
Content type(s)
The Northern Ontario Plant Database (NOPD) is a website that provides free public access to records of herbarium specimens housed in northern Ontario educational and government institutions. A herbarium is an archival collection of plants that have been pressed, dried, mounted, and labelled. It also provides up-to-date and accurate information on the flora of northern Ontario.
CODEX is a database of NGS mouse and human experiments. Although, the main focus of CODEX is Haematopoiesis and Embryonic systems, the database includes a large variety of cell types. In addition to the publically available data, CODEX also includes a private site hosting non-published data. CODEX provides access to processed and curated NGS experiments. To use CODEX: (i) select a specialized repository (HAEMCODE or ESCODE) or choose the whole compendium (CODEX), then (ii) filter by organism and (iii) choose how to explore the database.
BioPortal is an open repository of biomedical ontologies, a service that provides access to those ontologies, and a set of tools for working with them. BioPortal provides a wide range of such tools, either directly via the BioPortal web site, or using the BioPortal web service REST API. BioPortal also includes community features for adding notes, reviews, and even mappings to specific ontologies. BioPortal has four major product components: the web application; the API services; widgets, or applets, that can be installed on your own site; and a Virtual Appliance version that is available for download or through Amazon Web Services machine instance (AMI). There is also a beta release SPARQL endpoint.
Database of mass spectra of known, unknown and provisionally identified substances. MassBank is the first public repository of mass spectral data for sharing them among scientific research community. MassBank data are useful for the chemical identification and structure elucidation of chemical compounds detected by mass spectrometry.
Giardia lamblia is a significant, environmentally transmitted, human pathogen and an amitochondriate protist. It is a major contributor to the enormous worldwide burden of human diarrheal diseases, yet the basic biology of this parasite is not well understood. No virulence factor has been identified. The Giardia lamblia genome contains only 12 million base pairs distributed onto five chromosomes. Its analysis promises to provide insights about the origins of nuclear genome organization, the metabolic pathways used by parasitic protists, and the cellular biology of host interaction and avoidance of host immune systems. Since the divergence of Giardia lamblia lies close to the transition between eukaryotes and prokaryotes in universal ribosomal RNA phylogenies, it is a valuable, if not unique, model for gaining basic insights into genetic innovations that led to formation of eukaryotic cells. In evolutionary terms, the divergence of this organism is at least twice as ancient as the common ancestor for yeast and man. A detailed study of its genome will provide insights into an early evolutionary stage of eukaryotic chromosome organization as well as other aspects of the prokaryotic / eukaryotic divergence.