Reset all


Content Types


AID systems



Data access

Data access restrictions

Database access

Database access restrictions

Database licenses

Data licenses

Data upload

Data upload restrictions

Enhanced publication

Institution responsibility type

Institution type


Metadata standards

PID systems

Provider types

Quality management

Repository languages



Repository types


  • * at the end of a keyword allows wildcard searches
  • " quotes can be used for searching phrases
  • + represents an AND search (default)
  • | represents an OR search
  • - represents a NOT operation
  • ( and ) implies priority
  • ~N after a word specifies the desired edit distance (fuzziness)
  • ~N after a phrase specifies the desired slop amount
Found 35 result(s)
>>> !!!!! The Portal is no longer available. !!!! >>> The CARMEN pilot project seeks to create a virtual laboratory for experimental neurophysiology, enabling the sharing and collaborative exploitation of data, analysis code and expertise. This study by the DCC contributes to an understanding of the data curation requirements of the eScience community, through its extended observation of the CARMEN neurophysiology community’s specification and selection of solutions for the organisation, access and curation of digital research output.
The Biological General Repository for Interaction Datasets (BioGRID) is a public database that archives and disseminates genetic and protein interaction data from model organisms and humans. BioGRID is an online interaction repository with data compiled through comprehensive curation efforts. All interaction data are freely provided through our search index and available via download in a wide variety of standardized formats.
CRAWDAD is the Community Resource for Archiving Wireless Data, a wireless network data resource for the research community. This archive has the capacity to store wireless trace data from many contributing locations, and staff to develop better tools for collecting, anonymizing, and analyzing the data. We work with community leaders to ensure that the archive meets the needs of the research community.
Open Research Exeter (ORE) is the University of Exeter's repository for all types of research, including research papers, research data and theses. Research in ORE can be viewed and downloaded freely by anyone, anywhere: researchers, students, industry, business and the wider public. ORE's content includes journal articles, conference papers, working papers, reports, book chapters, videos, audio, images, multimedia research project outputs, raw data and analysed data. ORE's content is securely stored, managed and preserved to ensure free, permanent access.
The WorldWide Antimalarial Resistance Network (WWARN) is a collaborative platform generating innovative resources and reliable evidence to inform the malaria community on the factors affecting the efficacy of antimalarial medicines. Access to data is provided through diverse Tools and Resources: WWARN Explorer, Molecular Surveyor K13 Methodology, Molecular Surveyor pfmdr1 & pfcrt, Molecular Surveyor dhfr & dhps.
The HUGO Gene Nomenclature Committee (HGNC) assigned unique gene symbols and names to over 35,000 human loci, of which around 19,000 are protein coding. This curated online repository of HGNC-approved gene nomenclature and associated resources includes links to genomic, proteomic and phenotypic information, as well as dedicated gene family pages.
CODEX is a database of NGS mouse and human experiments. Although, the main focus of CODEX is Haematopoiesis and Embryonic systems, the database includes a large variety of cell types. In addition to the publically available data, CODEX also includes a private site hosting non-published data. CODEX provides access to processed and curated NGS experiments. To use CODEX: (i) select a specialized repository (HAEMCODE or ESCODE) or choose the whole compendium (CODEX), then (ii) filter by organism and (iii) choose how to explore the database.
MatrixDB is a freely available database focused on interactions established by extracellular proteins and polysaccharides. MatrixDB takes into account the multimetric nature of the extracellular proteins (e.g. collagens, laminins and thrombospondins are multimers). MatrixDB includes interaction data extracted from the literature by manual curation in our lab, and offers access to relevant data involving extracellular proteins provided by our IMEx partner databases through the PSICQUIC webservice, as well as data from the Human Protein Reference Database. MatrixDB is in charge of the curation of papers published in Matrix Biology since January 2009
EDINA delivers online services and tools to benefit students, teachers and researchers in UK Higher and Further Education and beyond.
The PRIDE PRoteomics IDEntifications database is a centralized, standards compliant, public data repository for proteomics data, including protein and peptide identifications, post-translational modifications and supporting spectral evidence. PRIDE encourages and welcomes direct user submissions of mass spectrometry data to be published in peer-reviewed publications.
VectorBase provides data on arthropod vectors of human pathogens. Sequence data, gene expression data, images, population data, and insecticide resistance data for arthropod vectors are available for download. VectorBase also offers genome browser, gene expression and microarray repository, and BLAST searches for all VectorBase genomes. VectorBase Genomes include Aedes aegypti, Anopheles gambiae, Culex quinquefasciatus, Ixodes scapularis, Pediculus humanus, Rhodnius prolixus. VectorBase is one the Bioinformatics Resource Centers (BRC) projects which is funded by National Institute of Allergy and Infectious Diseases (NAID).
The IMPC is a confederation of international mouse phenotyping projects working towards the agreed goals of the consortium: To undertake the phenotyping of 20,000 mouse mutants over a ten year period, providing the first functional annotation of a mammalian genome. Maintain and expand a world-wide consortium of institutions with capacity and expertise to produce germ line transmission of targeted knockout mutations in embryonic stem cells for 20,000 known and predicted mouse genes. Test each mutant mouse line through a broad based primary phenotyping pipeline in all the major adult organ systems and most areas of major human disease. Through this activity and employing data annotation tools, systematically aim to discover and ascribe biological function to each gene, driving new ideas and underpinning future research into biological systems; Maintain and expand collaborative “networks” with specialist phenotyping consortia or laboratories, providing standardized secondary level phenotyping that enriches the primary dataset, and end-user, project specific tertiary level phenotyping that adds value to the mammalian gene functional annotation and fosters hypothesis driven research; and Provide a centralized data centre and portal for free, unrestricted access to primary and secondary data by the scientific community, promoting sharing of data, genotype-phenotype annotation, standard operating protocols, and the development of open source data analysis tools. Members of the IMPC may include research centers, funding organizations and corporations.
Established in 1965, the CSD is the world’s repository for small-molecule organic and metal-organic crystal structures. Containing the results of over one million x-ray and neutron diffraction analyses this unique database of accurate 3D structures has become an essential resource to scientists around the world. The CSD records bibliographic, chemical and crystallographic information for:organic molecules, metal-organic compounds whose 3D structures have been determined using X-ray diffraction, neutron diffraction. The CSD records results of: single crystal studies, powder diffraction studies which yield 3D atomic coordinate data for at least all non-H atoms. In some cases the CCDC is unable to obtain coordinates, and incomplete entries are archived to the CSD. The CSD includes crystal structure data arising from: publications in the open literature and Private Communications to the CSD (via direct data deposition). The CSD contains directly deposited data that are not available anywhere else, known as CSD Communications.
The Wellcome Trust Sanger Institute is a charitably funded genomic research centre located in Hinxton, nine miles south of Cambridge in the UK. We study diseases that have an impact on health globally by investigating genomes. Building on our past achievements and based on priorities that exploit the unique expertise of our Faculty of researchers, we will lead global efforts to understand the biology of genomes. We are convinced of the importance of making this research available and accessible for all audiences. reduce global health burdens.
GeoCommons is the public community of GeoIQ users who are building an open repository of data and maps for the world. The GeoIQ platform includes a large number of features that empower you to easily access, visualize and analyze your data. The GeoIQ platform powers the growing GeoCommons community of over 25,000 members actively creating and sharing hundreds of thousands of datasets and maps across the world. With GeoCommons, anyone can contribute and share open data, easily build shareable maps and collaborate with others.
SeaDataNet is a standardized system for managing the large and diverse data sets collected by the oceanographic fleets and the automatic observation systems. The SeaDataNet infrastructure network and enhance the currently existing infrastructures, which are the national oceanographic data centres of 35 countries, active in data collection. The networking of these professional data centres, in a unique virtual data management system provide integrated data sets of standardized quality on-line. As a research infrastructure, SeaDataNet contributes to build research excellence in Europe.
Apollo (previously DSpace@Cambridge) is the University of Cambridge’s institutional repository, preserving and providing access to content created by members of the University. The repository stores a range of content and provides different levels of access, but its primary focus is on providing open access to the University’s research publications.
The IPD-IMGT/HLA Database provides a specialist database for sequences of the human major histocompatibility complex (MHC) and includes the official sequences named by the WHO Nomenclature Committee For Factors of the HLA System. The IPD-IMGT/HLA Database is part of the international ImMunoGeneTics project (IMGT). The database uses the 2010 naming convention for HLA alleles in all tools herein. To aid in the adoption of the new nomenclature, all search tools can be used with both the current and pre-2010 allele designations. The pre-2010 nomenclature designations are only used where older reports or outputs have been made available for download.
BioModels is a repository of mathematical models of biological and biomedical systems. It hosts a vast selection of existing literature-based physiologically and pharmaceutically relevant mechanistic models in standard formats. Our mission is to provide the systems modelling community with reproducible, high-quality, freely-accessible models published in the scientific literature.
The European Nucleotide Archive (ENA) captures and presents information relating to experimental workflows that are based around nucleotide sequencing. A typical workflow includes the isolation and preparation of material for sequencing, a run of a sequencing machine in which sequencing data are produced and a subsequent bioinformatic analysis pipeline. ENA records this information in a data model that covers input information (sample, experimental setup, machine configuration), output machine data (sequence traces, reads and quality scores) and interpreted information (assembly, mapping, functional annotation). Data arrive at ENA from a variety of sources. These include submissions of raw data, assembled sequences and annotation from small-scale sequencing efforts, data provision from the major European sequencing centres and routine and comprehensive exchange with our partners in the International Nucleotide Sequence Database Collaboration (INSDC). Provision of nucleotide sequence data to ENA or its INSDC partners has become a central and mandatory step in the dissemination of research findings to the scientific community. ENA works with publishers of scientific literature and funding bodies to ensure compliance with these principles and to provide optimal submission systems and data access tools that work seamlessly with the published literature.
BioVeL is a virtual e-laboratory that supports research on biodiversity issues using large amounts of data from cross-disciplinary sources. BioVeL supports the development and use of workflows to process data. It offers the possibility to either use already made workflows or create own. BioVeL workflows are stored in MyExperiment - Biovel Group They are underpinned by a range of analytical and data processing functions (generally provided as Web Services or R scripts) to support common biodiversity analysis tasks. You can find the Web Services catalogued in the BiodiversityCatalogue.
The Protein Data Bank (PDB) is an archive of experimentally determined three-dimensional structures of biological macromolecules that serves a global community of researchers, educators, and students. The data contained in the archive include atomic coordinates, crystallographic structure factors and NMR experimental data. Aside from coordinates, each deposition also includes the names of molecules, primary and secondary structure information, sequence database references, where appropriate, and ligand and biological assembly information, details about data collection and structure solution, and bibliographic citations. The Worldwide Protein Data Bank (wwPDB) consists of organizations that act as deposition, data processing and distribution centers for PDB data. Members are: RCSB PDB (USA), PDBe (Europe) and PDBj (Japan), and BMRB (USA). The wwPDB's mission is to maintain a single PDB archive of macromolecular structural data that is freely and publicly available to the global community.
I2D (Interologous Interaction Database) is an on-line database of known and predicted mammalian and eukaryotic protein-protein interactions. It has been built by mapping high-throughput (HTP) data between species. Thus, until experimentally verified, these interactions should be considered "predictions". It remains one of the most comprehensive sources of known and predicted eukaryotic PPI. I2D includes data for S. cerevisiae, C. elegans, D. melonogaster, R. norvegicus, M. musculus, and H. sapiens.