Reset all


Content Types


AID systems



Data access

Data access restrictions

Database access

Database access restrictions

Database licenses

Data licenses

Data upload

Data upload restrictions

Enhanced publication

Institution responsibility type

Institution type


Metadata standards

PID systems

Provider types

Quality management

Repository languages



Repository types


  • * at the end of a keyword allows wildcard searches
  • " quotes can be used for searching phrases
  • + represents an AND search (default)
  • | represents an OR search
  • - represents a NOT operation
  • ( and ) implies priority
  • ~N after a word specifies the desired edit distance (fuzziness)
  • ~N after a phrase specifies the desired slop amount
Found 1054 result(s)
The Buckeye Corpus of conversational speech contains high-quality recordings from 40 speakers in Columbus OH conversing freely with an interviewer. The speech has been orthographically transcribed and phonetically labeled. The audio and text files, together with time-aligned phonetic labels, are stored in a format for use with speech analysis software (Xwaves and Wavesurfer). Software for searching the transcription files is currently being written.
CHILDES is the child language component of the TalkBank system. TalkBank is a system for sharing and studying conversational interactions.
The centerpiece of the Global Trade Analysis Project is a global data base describing bilateral trade patterns, production, consumption and intermediate use of commodities and services. The GTAP Data Base consists of bilateral trade, transport, and protection matrices that link individual country/regional economic data bases. The regional data bases are derived from individual country input-output tables, from varying years.
Dataverse to host followup observations of galaxy clusters identified in South Pole Telescope SZ Surveys. This includes: 1) GMOS spectroscopy of low to moderate redshift galaxy clusters taken as a part of NOAO Large Survey Program 11A-0034 (PI: Christopher Stubbs).
The Census Bureau releases TIGER/Line shapefiles and metadata each year to the public. TIGER/Line shapefiles are spatial extracts from the Census Bureau’s MAF/TIGER database. They contain features such as roads, railroads, hydrographic features and legal and statistical boundaries.
Here you will find a collection of atomic microstructures that have been built by the atomic modeling community. Feel free to download any of these and use them in your own scientific explorations.The focus of this cyberinfrastructure is to advance the field of atomic-scale modeling of materials by acting as a forum for disseminating new atomistic scale methodologies, educating non-experts and the next generation of computational materials scientists, and serving as a bridge between the atomistic and complementary (electronic structure, mesoscale) modeling communities.
The VDC is a public, web-based search engine for accessing worldwide earthquake strong ground motion data. While the primary focus of the VDC is on data of engineering interest, it is also an interactive resource for scientific research and government and emergency response professionals. is the premier place for computational nanotechnology research, education, and collaboration. Our site hosts a rapidly growing collection of Simulation Programs for nanoscale phenomena that run in the cloud and are accessible through a web browser. In addition to simulation devices, nanoHUB provides Online Presentations, Courses, Learning Modules, Podcasts, Animations, Teaching Materials, and more. These resources help users learn about our simulation programs and about nanotechnology in general. Our site offers researchers a venue to explore, collaborate, and publish content, as well. Much of these collaborative efforts occur via Workspaces and User groups.
Atomic and Ionic UV/VUV Linelist . This facility permits selective searches of some atomic data compliled by R. L. Kelly. The data provided are: - vacuum wavelength [in nm], - intensity estimate, - E [in cm-1], j, and configuration for lower and upper levels, - multiplet (where available), - reference numbers of the sources of the data.
The National Cancer Data Base (NCDB), a joint program of the Commission on Cancer (CoC) of the American College of Surgeons (ACoS) and the American Cancer Society (ACS), is a nationwide oncology outcomes database for more than 1,500 Commission-accredited cancer programs in the United States and Puerto Rico. Some 70 percent of all newly diagnosed cases of cancer in the United States are captured at the institutional level and reported to the NCDB. The NCDB, begun in 1989, now contains approximately 29 million records from hospital cancer registries across the United States. Data on all types of cancer are tracked and analyzed. These data are used to explore trends in cancer care, to create regional and state benchmarks for participating hospitals, and to serve as the basis for quality improvement.
LSDA contains information, data, studies and materials from medical and biological experiments from the Mercury Project (1961) to current flight, flight analog and ground research. LSDA includes data from NASA's Human Research Program (HRP), NASA’s Space Biology Program (SP), The Human Health and Performance Directorate (HH&P) , and Lifetime Surveillance of Astronaut Health (LSAH)
The mission of NCHS is to provide statistical information that will guide actions and policies to improve the health of the American people. As the Nation's principal health statistics agency, NCHS is responsible for collecting accurate, relevant, and timely data. NCHS' mission, and those of its counterparts in the Federal statistics system, focuses on the collection, analysis, and dissemination of information that is of use to a broad range of us.
TES is the first satellite instrument to provide simultaneous concentrations of carbon monoxide, ozone, water vapor and methane throughout Earth’s lower atmosphere. This lower atmosphere (the troposphere) is situated between the surface and the height at which aircraft fly, and is an important part of the atmosphere that we often impact with our activities.
Search and access 201 data sets covering the Atmosphere, Ocean, Land and more. Explore climate indices, reanalyses and satellite data and understand their application to climate model metrics. This is the only data portal that combines data discovery, metadata, figures and world-class expertise on the strengths, limitations and applications of climate data.
The UC San Diego Library Digital Collections website gathers two categories of content managed by the Library: library collections (including digitized versions of selected collections covering topics such as art, film, music, history and anthropology) and research data collections (including research data generated by UC San Diego researchers).
ResearchWorks Archive is the University of Washington’s digital repository (also known as “institutional repository”) for disseminating and preserving scholarly work. ResearchWorks Archive can accept any digital file format or content (examples include numerical datasets, photographs and diagrams, working papers, technical reports, pre-prints and post-prints of published articles).
AIRS moves climate research and weather prediction into the 21st century. AIRS is one of six instruments on board the Aqua satellite, part of the NASA Earth Observing System. AIRS along with its partner microwave instrument the Advanced Microwave Sounding Unit AMSU-A, represents the most advanced atmospheric sounding system ever deployed in space. Together these instruments observe the global water and energy cycles, climate variation and trends, and the response of the climate system to increased greenhouse gases.
AmoebaDB belongs to the EuPathDB family of databases and is an integrated genomic and functional genomic database for Entamoeba and Acanthamoeba parasites. In its first iteration (released in early 2010), AmoebaDB contains the genomes of three Entamoeba species (see below). AmoebaDB integrates whole genome sequence and annotation and will rapidly expand to include experimental data and environmental isolate sequences provided by community researchers . The database includes supplemental bioinformatics analyses and a web interface for data-mining.
OrthoMCL is a genome-scale algorithm for grouping orthologous protein sequences. It provides not only groups shared by two or more species/genomes, but also groups representing species-specific gene expansion families. So it serves as an important utility for automated eukaryotic genome annotation. OrthoMCL starts with reciprocal best hits within each genome as potential in-paralog/recent paralog pairs and reciprocal best hits across any two genomes as potential ortholog pairs. Related proteins are interlinked in a similarity graph. Then MCL (Markov Clustering algorithm,Van Dongen 2000; is invoked to split mega-clusters. This process is analogous to the manual review in COG construction. MCL clustering is based on weights between each pair of proteins, so to correct for differences in evolutionary distance the weights are normalized before running MCL.
The CMU Multi-Modal Activity Database (CMU-MMAC) database contains multimodal measures of the human activity of subjects performing the tasks involved in cooking and food preparation. The CMU-MMAC database was collected in Carnegie Mellon's Motion Capture Lab. A kitchen was built and to date twenty-five subjects have been recorded cooking five different recipes: brownies, pizza, sandwich, salad, and scrambled eggs.
----<<<<< This repository is no longer available. This record is out dated >>>>>----- The aim of FlyReactome, based in the Department of Genetics, University of Cambridge, is to develop a curated repository for Drosophila melanogaster pathways and reactions. The information in this database is authored by biological researchers with expertise in their fields, maintained by the FlyReactome staff.
The FREEBIRD website aims to facilitate data sharing in the area of injury and emergency research in a timely and responsible manner. It has been launched by providing open access to anonymised data on over 30,000 injured patients (the CRASH-1 and CRASH-2 trials).
The Supreme Court Database is the definitive source for researchers, students, journalists, and citizens interested in the U.S. Supreme Court. The Database contains over two hundred pieces of information about each case decided by the Court between the 1791 and 2015 terms. Examples include the identity of the court whose decision the Supreme Court reviewed, the parties to the suit, the legal provisions considered in the case, and the votes of the Justices. The project started with Spaeth's original database. The analysis tools allow you to select and summarize cases from the Modern or Legacy Database based on your needs.
OHSU Digital Commons is a repository for the scholarly and creative work of Oregon Health & Science University. Developed by the OHSU Library, Digital Commons provides the university community with a platform for publishing and accessing content produced by students, faculty, and staff. OHSU Digital Commons documents the history and growth of the university, as well as current progress in education, research, and health care.
ToxoDB is a genome database for the genus Toxoplasma, a set of single-celled eukaryotic pathogens that cause human and animal diseases, including toxoplasmosis.