Reset all


Content Types


AID systems



Data access

Data access restrictions

Database access

Database access restrictions

Database licenses

Data licenses

Data upload

Data upload restrictions

Enhanced publication

Institution responsibility type

Institution type


Metadata standards

PID systems

Provider types

Quality management

Repository languages



Repository types


  • * at the end of a keyword allows wildcard searches
  • " quotes can be used for searching phrases
  • + represents an AND search (default)
  • | represents an OR search
  • - represents a NOT operation
  • ( and ) implies priority
  • ~N after a word specifies the desired edit distance (fuzziness)
  • ~N after a phrase specifies the desired slop amount
Found 146 result(s)
The World Atlas of Language Structures (WALS) is a large database of structural (phonological, grammatical, lexical) properties of languages gathered from descriptive materials (such as reference grammars) by a team of 55 authors (many of them the leading authorities on the subject).
The Alternative Fuels Data Center (AFDC) is a comprehensive clearinghouse of information about advanced transportation technologies. The AFDC offers transportation decision makers unbiased information, data, and tools related to the deployment of alternative fuels and advanced vehicles. The AFDC launched in 1991 in response to the Alternative Motor Fuels Act of 1988 and the Clean Air Act Amendments of 1990. It originally served as a repository for alternative fuel performance data. The AFDC has since evolved to offer a broad array of information resources that support efforts to reduce petroleum use in transportation. The AFDC serves Clean Cities stakeholders, fleets regulated by the Energy Policy Act, businesses, policymakers, government agencies, and the general public.
The IMSR is a searchable online database of mouse strains, stocks, and mutant ES cell lines available worldwide, including inbred, mutant, and genetically engineered strains. The goal of the IMSR is to assist the international scientific community in locating and obtaining mouse resources for research. Note that the data content found in the IMSR is as supplied by strain repository holders. For each strain or cell line listed in the IMSR, users can obtain information about: Where that resource is available (Repository Site); What state(s) the resource is available as (e.g. live, cryopreserved embryo or germplasm, ES cells); Links to descriptive information about a strain or ES cell line; Links to mutant alleles carried by a strain or ES cell line; Links for ordering a strain or ES cell line from a Repository; Links for contacting the Repository to send a query
The Plant Metabolic Network (PMN) provides a broad network of plant metabolic pathway databases that contain curated information from the literature and computational analyses about the genes, enzymes, compounds, reactions, and pathways involved in primary and secondary metabolism in plants. The PMN currently houses one multi-species reference database called PlantCyc and 22 species/taxon-specific databases.
The Entrez Protein Clusters database contains annotation information, publications, structures and analysis tools for related protein sequences encoded by complete genomes. The data available in the Protein Clusters Database is generated from prokaryotic genomic studies and is intended to assist researchers studying micro-organism evolution as well as other biological sciences. Available genomes include plants and viruses as well as organelles and microbial genomes.
The NCAA Student-Athlete Experiences Data Archive provides access to data about student athletes and will grow to include a handful of user-friendly data collections related to graduation rates; team-level Academic Progress Rates in Division I; and individual-level data on the experiences of current and former student-athletes from the NCAA's Growth, Opportunities, Aspirations and Learning of Students in college study (GOALS), and the Study of College Outcomes and Recent Experiences (SCORE). In the long run, the NCAA expects to follow this initial release with the publication of as much data as possible from its archives. The data is used by college presidents, athletic personnel, faculty, student-athlete groups, media members, and researchers in looking at issues related to intercollegiate athletics and higher education.
Swiss Institute of Bioinformatics (SIB) coordinates research and education in bioinformatics throughout Switzerland and provides bioinformatics services to the national and international research community. ExPASy gives access to numerous repositories and databases of SIB. For example: array map, MetaNetX, SWISS-MODEL and World-2DPAGE, and many others see a list here
The HUGO Gene Nomenclature Committee (HGNC) assigned unique gene symbols and names to over 35,000 human loci, of which around 19,000 are protein coding. This curated online repository of HGNC-approved gene nomenclature and associated resources includes links to genomic, proteomic and phenotypic information, as well as dedicated gene family pages.
NED is a comprehensive database of multiwavelength data for extragalactic objects, providing a systematic, ongoing fusion of information integrated from hundreds of large sky surveys and tens of thousands of research publications. The contents and services span the entire observed spectrum from gamma rays through radio frequencies. As new observations are published, they are cross- identified or statistically associated with previous data and integrated into a unified database to simplify queries and retrieval. Seamless connectivity is also provided to data in NASA astrophysics mission archives (IRSA, HEASARC, MAST), to the astrophysics literature via ADS, and to other data centers around the world.
The WorldWide Antimalarial Resistance Network (WWARN) is a collaborative platform generating innovative resources and reliable evidence to inform the malaria community on the factors affecting the efficacy of antimalarial medicines. Access to data is provided through diverse Tools and Resources: WWARN Explorer, Molecular Surveyor K13 Methodology, Molecular Surveyor pfmdr1 & pfcrt, Molecular Surveyor dhfr & dhps.
The centerpiece of the Global Trade Analysis Project is a global data base describing bilateral trade patterns, production, consumption and intermediate use of commodities and services. The GTAP Data Base consists of bilateral trade, transport, and protection matrices that link individual country/regional economic data bases. The regional data bases are derived from individual country input-output tables, from varying years.
Silkworm Pathogen Database (SilkPathDB) is a comprehensive resource for studying on pathogens of silkworm, including microsporidia, fungi, bacteria and virus. SilkPathDB provides access to not only genomic data including functional annotation of genes and gene products, but also extensive biological information for gene expression data and corresponding researches. SilkPathDB will be help with researches on pathogens of silkworm as well as other Lepidoptera insects.
The taxonomically broad EST database TBestDB serves as a repository for EST data from a wide range of eukaryotes, many of which have previously not been thoroughly investigated. Most of the data contained in TBestDB has been generated by the labs of the Protist EST Program located in six universities across Canada. PEP is a large interdisciplinaryresearch project, involving six Canadian universities. PEP aims at the exploration of the diversity of eukaryotic genomes in a systematic, comprehensive and integrated way. The focus is on unicellular microbial eukaryotes, known as protists. Protistan eukaryotes comprise more than a dozen major lineages that, together, encompass more evolutionary, ecological and probably biochemical diversity than the multicellular kingdoms of animals, plants and fungi combined. PEP is a unique endeavor in that it is the first phylogenetically-broad genomic investigation of protists.
OpenTopography facilitates community access to high-resolution, Earth science-oriented, topography data, and related tools and resources. The OpenTopography Facility is based at the San Diego Supercomputer Center at the University of California, San Diego and is operated in collaboration with colleagues in the School of Earth and Space Exploration at Arizona State University. Core operational support for OpenTopography comes from the National Science Foundation Earth Sciences: Instrumentation and Facilities Program (EAR/IF) and the Office of Cyberinfrastructure. In addition, we receive funding from the NSF and NASA to support various OpenTopography related research and development activities.
Research Data Australia is the data discovery service of the Australian National Data Service (ANDS). We do not store the data itself here but provide descriptions of, and links to, the data from our data publishing partners. ANDS is funded by the Australian Government through the National Collaborative Research Infrastructure Strategy (NCRIS).
ArachnoServer is a manually curated database containing information on the sequence, three-dimensional structure, and biological activity of protein toxins derived from spider venom. Spiders are the largest group of venomous animals and they are predicted to contain by far the largest number of pharmacologically active peptide toxins (Escoubas et al., 2006). ArachnoServer has been custom-built so that a wide range of biological scientists, including neuroscientists, pharmacologists, and toxinologists, can readily access key data relevant to their discipline without being overwhelmed by extraneous information.
The BCDC serves the research data obtained, and the data syntheses assembled, by researchers within the Bjerknes Centre for Climate Research. Furthermore it is open for all interested scientists independent of institution. All data from the different disciplines (e.g. geology, oceanography, biology, model community) will be archived in a long-term repository, interconnected and made publicly available by the BCDC. BCDC has collaborations with many international data repositories and actively archives metadata and data at those ensuring quality and FAIRness. BCDC has it's main focus on services for data management for external and internal funded projects in the field of climate research, provides data management plans and ensures that data is archived accordingly according to the best practices in the field. The data management services rank from project work for small external funded project to top-of-the-art data management services for research infrastructures on the ESFRI roadmap (e.g. RI ICOS – Integrated Carbon Observation System) and for provides products and services for Copernicus Marine Environmental Monitoring Services. In addition BCDC is advising various communities on data management services e.g. IOC UNESCO, OECD, IAEA and various funding agencies. BCDC will become an Associated Data Unit (ADU) under IODE, International Oceanographic Data and Information Exchange, a worldwide network that operates under the auspices of the Intergovernmental Oceanographic Commission of UNESCO and aims at becoming a part of ICSU World Data System.
CaltechDATA is an institutional data repository for Caltech. Caltech library runs the repository to preserve the accomplishments of Caltech researchers and share their results with the world. Caltech-associated researchers can upload data, link data with their publications, and assign a permanent DOI so that others can reference the data set. The repository also preserves software and has automatic Github integration. All files present in the repository are open access or embargoed, and all metadata is always available to the public.
Vivli is a non-profit organization working to advance human health through the insights and discoveries gained by sharing and analyzing data. It is home to an independent global data-sharing and analytics platform which serves all elements of the international research community. The platform includes a data repository, in-depth search engine and cloud-based analytics, and harmonizes governance, policy and processes to make sharing data easier. Vivli acts as a neutral broker between data contributor and data user and the wider data sharing community.
The Duke Research Data Repository is a service of the Duke University Libraries that provides curation, access, and preservation of research data produced by the Duke community. Duke's RDR is a discipline agnostic institutional data repository that is intended to preserve and make public data related to the teaching and research mission of Duke University including data linked to a publication, research project, and/or class, as well as supplementary software code and documentation used to provide context for the data.
Stanford Network Analysis Platform (SNAP) is a general purpose network analysis and graph mining library. It is written in C++ and easily scales to massive networks with hundreds of millions of nodes, and billions of edges. It efficiently manipulates large graphs, calculates structural properties, generates regular and random graphs, and supports attributes on nodes and edges. SNAP is also available through the NodeXL which is a graphical front-end that integrates network analysis into Microsoft Office and Excel. The SNAP library is being actively developed since 2004 and is organically growing as a result of our research pursuits in analysis of large social and information networks. Largest network we analyzed so far using the library was the Microsoft Instant Messenger network from 2006 with 240 million nodes and 1.3 billion edges. The datasets available on the website were mostly collected (scraped) for the purposes of our research. The website was launched in July 2009.
MGI is the international database resource for the laboratory mouse, providing integrated genetic, genomic, and biological data to facilitate the study of human health and disease. The projects contributing to this resource are: Mouse Genome Database (MGD) Project, Gene Expression Database (GXD) Project, Mouse Tumor Biology (MTB) Database Project, Gene Ontology (GO) Project at MGI, MouseMine Project, MouseCyc Project at MGI
Reactome is a manually curated, peer-reviewed pathway database, annotated by expert biologists and cross-referenced to bioinformatics databases. Its aim is to share information in the visual representations of biological pathways in a computationally accessible format. Pathway annotations are authored by expert biologists, in collaboration with Reactome editorial staff and cross-referenced to many bioinformatics databases. These include NCBI Gene, Ensembl and UniProt databases, the UCSC and HapMap Genome Browsers, the KEGG Compound and ChEBI small molecule databases, PubMed, and Gene Ontology.