Reset all


Content Types


AID systems



Data access

Data access restrictions

Database access

Database access restrictions

Database licenses

Data licenses

Data upload

Data upload restrictions

Enhanced publication

Institution responsibility type

Institution type


Metadata standards

PID systems

Provider types

Quality management

Repository languages



Repository types


  • * at the end of a keyword allows wildcard searches
  • " quotes can be used for searching phrases
  • + represents an AND search (default)
  • | represents an OR search
  • - represents a NOT operation
  • ( and ) implies priority
  • ~N after a word specifies the desired edit distance (fuzziness)
  • ~N after a phrase specifies the desired slop amount
Found 50 result(s)
The Eurac Research CLARIN Centre (ERCC) is a dedicated repository for language data. It is hosted by the Institute for Applied Linguistics (IAL) at Eurac Research, a private research centre based in Bolzano, South Tyrol. The Centre is part of the Europe-wide CLARIN infrastructure, which means that it follows well-defined international standards for (meta)data and procedures and is well-embedded in the wider European Linguistics infrastructure. The repository hosts data collected at the IAL, but is also open for data deposits from external collaborators.
SeaBASS, the publicly shared archive of in situ oceanographic and atmospheric data maintained by the NASA Ocean Biology Processing Group (OBPG). High quality in situ measurements are prerequisite for satellite data product validation, algorithm development, and many climate-related inquiries. As such, the NASA Ocean Biology Processing Group (OBPG) maintains a local repository of in situ oceanographic and atmospheric data to support their regular scientific analyses. The SeaWiFS Project originally developed this system, SeaBASS, to catalog radiometric and phytoplankton pigment data used their calibration and validation activities. To facilitate the assembly of a global data set, SeaBASS was expanded with oceanographic and atmospheric data collected by participants in the SIMBIOS Program, under NASA Research Announcements NRA-96 and NRA-99, which has aided considerably in minimizing spatial bias and maximizing data acquisition rates. Archived data include measurements of apparent and inherent optical properties, phytoplankton pigment concentrations, and other related oceanographic and atmospheric data, such as water temperature, salinity, stimulated fluorescence, and aerosol optical thickness. Data are collected using a number of different instrument packages, such as profilers, buoys, and hand-held instruments, and manufacturers on a variety of platforms, including ships and moorings.
AmoebaDB belongs to the EuPathDB family of databases and is an integrated genomic and functional genomic database for Entamoeba and Acanthamoeba parasites. In its first iteration (released in early 2010), AmoebaDB contains the genomes of three Entamoeba species (see below). AmoebaDB integrates whole genome sequence and annotation and will rapidly expand to include experimental data and environmental isolate sequences provided by community researchers . The database includes supplemental bioinformatics analyses and a web interface for data-mining.
FungiDB belongs to the EuPathDB family of databases and is an integrated genomic and functional genomic database for the kingdom Fungi. FungiDB was first released in early 2011 as a collaborative project between EuPathDB and the group of Jason Stajich (University of California, Riverside). At the end of 2015, FungiDB was integrated into the EuPathDB bioinformatic resource center. FungiDB integrates whole genome sequence and annotation and also includes experimental and environmental isolate sequence data. The database includes comparative genomics, analysis of gene expression, and supplemental bioinformatics analyses and a web interface for data-mining.
nmrshiftdb is a NMR database (web database) for organic structures and their nuclear magnetic resonance (nmr) spectra. It allows for spectrum prediction (13C, 1H and other nuclei) as well as for searching spectra, structures and other properties. Last not least, it features peer-reviewed submission of datasets by its users. The nmrshiftdb2 software is open source, the data is published under an open content license. Please consult the documentation for more detailed information. nmrshiftdb2 is the continuation of the NMRShiftDB project with additional data and bugfixes and changes in the software.
The Abacus Dataverse Network is the research data repository of the British Columbia Research Libraries' Data Services, a collaboration involving the Data Libraries at Simon Fraser University (SFU), the University of British Columbia (UBC), the University of Northern British Columbia (UNBC) and the University of Victoria (UVic).
OpARA (Open Access Repository and Archive) is the repository for digital research data of the TU Dresden (TUD) and the TU Bergakademie Freiberg (TUBAF). It offers researchers the possibility of archiving their digital research data and optionally making it accessible to third parties under an Open Access license.
Pubchem contains 3 databases. 1. PubChem BioAssay: The PubChem BioAssay Database contains bioactivity screens of chemical substances described in PubChem Substance. It provides searchable descriptions of each bioassay, including descriptions of the conditions and readouts specific to that screening procedure. 2. PubChem Compound: The PubChem Compound Database contains validated chemical depiction information provided to describe substances in PubChem Substance. Structures stored within PubChem Compounds are pre-clustered and cross-referenced by identity and similarity groups. 3. PubChem Substance. The PubChem Substance Database contains descriptions of samples, from a variety of sources, and links to biological screening results that are available in PubChem BioAssay. If the chemical contents of a sample are known, the description includes links to PubChem Compound.
The dbMHC database provides an open, publicly accessible platform for DNA and clinical data related to the human Major Histocompatibility Complex (MHC). The dbMHC provides access to human leukocyte antigen (HLA) sequences, HLA allele and haplotype frequencies, and clinical datasets.
The European VLBI Network (EVN) is an interferometric array of radio telescopes located primarily in Europe and Asia, with additional telescopes in South Africa and Puerto Rico. The EVN performs high-resolution observations of cosmic radio sources at wavelenghts from 92cm to 7mm. The EVN Data Archive contains, among other things, the correlated data from EVN observations plus pipeline output, including the initial calibration tables to apply to the correlated data and preliminary images. In general, the correlated data and some pipeline results are proprietary for one year following distribution to the PI of the final epoch of observations resulting from a proposal after which the data enters the public domain; more details are in the "EVN Data Access Policy" linked via the archive-introduction page.
Socrata’s cloud-based solution allows government organizations to put their data online, make data-driven decisions, operate more efficiently, and share insights with citizens.
VectorBase provides data on arthropod vectors of human pathogens. Sequence data, gene expression data, images, population data, and insecticide resistance data for arthropod vectors are available for download. VectorBase also offers genome browser, gene expression and microarray repository, and BLAST searches for all VectorBase genomes. VectorBase Genomes include Aedes aegypti, Anopheles gambiae, Culex quinquefasciatus, Ixodes scapularis, Pediculus humanus, Rhodnius prolixus. VectorBase is one the Bioinformatics Resource Centers (BRC) projects which is funded by National Institute of Allergy and Infectious Diseases (NAID).
The Genome database contains annotations and analysis of eukaryotic and prokaryotic genomes, as well as tools that allow users to compare genomes and gene sequences from humans, microbes, plants, viruses and organelles. Users can browse by organism, and view genome maps and protein clusters.
The NCBI database of Genotypes and Phenotypes archives and distributes the results of studies that have investigated the interaction of genotype and phenotype, including genome-wide association studies, medical sequencing, molecular diagnostic assays, and association between genotype and non-clinical traits. The database provides summaries of studies, the contents of measured variables, and original study document text. dbGaP provides two types of access for users, open and controlled. Through the controlled access, users may access individual-level data such as phenotypic data tables and genotypes.
The SURF Data Repository is a user-friendly web-based data publication platform that allows researchers to store, annotate and publish research datasets of any size to ensure long-term preservation and availability of their data. The service allows any dataset to be stored, independent of volume, number of files and structure. A published dataset is enriched with complex metadata, unique identifiers are added and the data is preserved for an agreed-upon period of time. The service is domain-agnostic and supports multiple communities with different policy and metadata requirements.
The Maize Genetics and Genomics Database focuses on collecting data related to the crop plant and model organism Zea mays. The project's goals are to synthesize, display, and provide access to maize genomics and genetics data, prioritizing mutant and phenotype data and tools, structural and genetic map sets, and gene models. MaizeGDB also aims to make the Maize Newsletter available, and provide support services to the community of maize researchers. MaizeGDB is working with the Schnable lab, the Panzea project, The Genome Reference Consortium, and iPlant Collaborative to create a plan for archiving, dessiminating, visualizing, and analyzing diversity data. MMaizeGDB is short for Maize Genetics/Genomics Database. It is a USDA/ARS funded project to integrate the data found in MaizeDB and ZmDB into a single schema, develop an effective interface to access this data, and develop additional tools to make data analysis easier. Our goal in the long term is a true next-generation online maize database.aize genetics and genomics database.
ROSA P is the United States Department of Transportation (US DOT) National Transportation Library's (NTL) Repository and Open Science Access Portal (ROSA P). The name ROSA P was chosen to honor the role public transportation played in the civil rights movement, along with one of the important figures, Rosa Parks. To meet the requirements outlined in its legislative mandate, NTL collects research and resources across all modes of transportation and related disciplines, with specific focus on research, data, statistics, and information produced by USDOT, state DOTs, and other transportation organizations. Content types found in ROSA P include textual works, datasets, still image works, moving image works, other multimedia, and maps. These resources have value to federal, state, and local transportation decision makers, transportation analysts, and researchers.
IDEALS is an institutional repository that collects, disseminates, and provides persistent and reliable access to the research and scholarship of faculty, staff, and students at the University of Illinois at Urbana-Champaign. Faculty, staff, graduate students, and in some cases undergraduate students, can deposit their research and scholarship directly into IDEALS. Departments can use IDEALS to distribute their working papers, technical reports, or other research material. Contact us at for more information.
TreeGenes is a genomic, phenotypic, and environmental data resource for forest tree species. The TreeGenes database and Dendrome project provide custom informatics tools to manage the flood of information.The database contains several curated modules that support the storage of data and provide the foundation for web-based searches and visualization tools. GMOD GUI tools such as CMAP for genetic maps and GBrowse for genome and transcriptome assemblies are implemented here. A sample tracking system, known as the Forest Tree Genetic Stock Center, sits at the forefront of most large-scale projects. Barcode identifiers assigned to the trees during sample collection are maintained in the database to identify an individual through DNA extraction, resequencing, genotyping and phenotyping. DiversiTree, a user-friendly desktop-style interface, queries the TreeGenes database and is designed for bulk retrieval of resequencing data. CartograTree combines geo-referenced individuals with relevant ecological and trait databases in a user-friendly map-based interface. ---- The Conifer Genome Network (CGN) is a virtual nexus for researchers working in conifer genomics. The CGN web site is maintained by the Dendrome Project at the University of California, Davis.
The American National Election Studies (ANES) conducts national surveys and pilot studies and provides large, multifaceted datasets. Time Series Studies are conducted during years of national elections, with pre-election and post-election surveys conducted in presidential election years and post-election surveys conducted during congressional election years. Pilot Studies are normally conducted in years when there is no national election and are designed to test new, or to refine existing, instrumentation and study designs. Other Major Data Collections includes panel studies and other special studies.
The NCBI Short Genetic Variations database, commonly known as dbSNP, catalogs short variations in nucleotide sequences from a wide range of organisms. These variations include single nucleotide variations, short nucleotide insertions and deletions, short tandem repeats and microsatellites. Short Genetic Variations may be common, thus representing true polymorphisms, or they may be rare. Some rare human entries have additional information associated withthem, including disease associations, genotype information and allele origin, as some variations are somatic rather than germline events. ***NCBI will phase out support for non-human organism data in dbSNP and dbVar beginning on September 1, 2017***
Content type(s)
The Network for the Detection of Atmospheric Composition Change (NDACC), a major contributor to the worldwide atmospheric research effort, consists of a set of globally distributed research stations providing consistent, standardized, long-term measurements of atmospheric trace gases, particles, spectral UV radiation reaching the Earth's surface, and physical parameters, centered around the following priorities.
The Media Archive of the Arts is the platform for collaborative work, sharing and archiving of media at the ZHdK. It is available to students, lecturers and staff. The areas of application of the media archive are mainly focused on teaching and research, but the ZHdK departments archive and university communication also benefit. The media archive manages a wide range of visual and audiovisual content and supports collaborative forms of working.