Filter
Reset all

Subjects

Content Types

Countries

AID systems

API

Certificates

Data access

Data access restrictions

Database access

Database licenses

Data licenses

Data upload

Data upload restrictions

Enhanced publication

Institution responsibility type

Institution type

Keywords

Metadata standards

PID systems

Provider types

Quality management

Repository languages

Software

Syndications

Repository types

Versioning

  • * at the end of a keyword allows wildcard searches
  • " quotes can be used for searching phrases
  • + represents an AND search (default)
  • | represents an OR search
  • - represents a NOT operation
  • ( and ) implies priority
  • ~N after a word specifies the desired edit distance (fuzziness)
  • ~N after a phrase specifies the desired slop amount
  • 1 (current)
Found 13 result(s)
The CLARIN­/Text+ repository at the Saxon Academy of Sciences and Humanities in Leipzig offers long­term preservation of digital resources, along with their descriptive metadata. The mission of the repository is to ensure the availability and long­term preservation of resources, to preserve knowledge gained in research, to aid the transfer of knowledge into new contexts, and to integrate new methods and resources into university curricula. Among the resources currently available in the Leipzig repository are a set of corpora of the Leipzig Corpora Collection (LCC), based on newspaper, Wikipedia and Web text. Furthermore several REST-based webservices are provided for a variety of different NLP-relevant tasks The repository is part of the CLARIN infrastructure and part of the NFDI consortium Text+. It is operated by the Saxon Academy of Sciences and Humanities in Leipzig.
The Comprehensive Epidemiologic Data Resource (CEDR) is the U.S. Department of Energy (DOE) electronic database comprised of health studies of DOE contract workers and environmental studies of areas surrounding DOE facilities. DOE recognizes the benefits of data sharing and supports the public's right to know about worker and community health risks. CEDR provides independent researchers and educators with access to de-identified data collected since the Department's early production years. Current CEDR holdings include more than 76 studies of over 1 million workers at 31 DOE sites. Access to these data is at no cost to the user.
OLAC, the Open Language Archives Community, is an international partnership of institutions and individuals who are creating a worldwide virtual library of language resources by: (i) developing consensus on best current practice for the digital archiving of language resources, and (ii) developing a network of interoperating repositories and services for housing and accessing such resources. The OLAC system has 2016 been integrated with the Linguistic Linked Open Data Cloud.
CiteSeerx is an evolving scientific literature digital library and search engine that focuses primarily on the literature in computer and information science. CiteSeerx aims to improve the dissemination of scientific literature and to provide improvements in functionality, usability, availability, cost, comprehensiveness, efficiency, and timeliness in the access of scientific and scholarly knowledge. Rather than creating just another digital library, CiteSeerx attempts to provide resources such as algorithms, data, metadata, services, techniques, and software that can be used to promote other digital libraries. CiteSeerx has developed new methods and algorithms to index PostScript and PDF research articles on the Web.
<<<!!!<<< The repository is offline >>>!!!>>> A collection of open content name datasets for Information Centric Networking. The "Content Name Collection" (CNC) lists and hosts open datasets of content names. These datasets are either derived from URL link databases or web traces. The names are typically used for research on Information Centric Networking (ICN), for example to measure cache hit/miss ratios in simulations.
NKN is now Research Computing and Data Services (RCDS)! We provide data management support for UI researchers and their regional, national, and international collaborators. This support keeps researchers at the cutting-edge of science and increases our institution's competitiveness for external research grants. Quality data and metadata developed in research projects and curated by RCDS (formerly NKN) is a valuable, long-term asset upon which to develop and build new research and science.
GroupLens is a research lab in the Department of Computer Science and Engineering at the University of Minnesota, Twin Cities specializing in recommender systems, online communities, mobile and ubiquitous technologies, digital libraries, and local geographic information systems.
US Department of Energy’s Atmospheric Radiation Measurement (ARM) Data Center is a long-term archive and distribution facility for various ground-based, aerial and model data products in support of atmospheric and climate research. ARM facility currently operates over 400 instruments at various observatories (https://www.arm.gov/capabilities/observatories/). ARM Data Center (ADC) Archive currently holds over 11,000 data products with a total holding of over 3 petabytes of data that dates back to 1993, these include data from instruments, value added products, model outputs, field campaign and PI contributed data. The data center archive also includes data collected by ARM from related program (e.g., external data such as NASA satellite).
The NASA Space Science Data Coordinated Archive serves as the permanent archive for NASA space science mission data. "Space science" means astronomy and astrophysics, solar and space plasma physics, and planetary and lunar science. As permanent archive, NSSDCA teams with NASA's discipline-specific space science "active archives" which provide access to data to researchers and, in some cases, to the general public. NSSDCA also serves as NASA's permanent archive for space physics mission data. It provides access to several geophysical models and to data from some non-NASA mission data. In addition to supporting active space physics and astrophysics researchers, NSSDCA also supports the general public both via several public-interest web-based services (e.g., the Photo Gallery) and via the offline mailing of CD-ROMs, photoprints, and other items.
The repository is part of the National Research Data Infrastructure initiative Text+, in which the University of Tübingen is a partner. It is housed at the Department of General and Computational Linguistics. The infrastructure is maintained in close cooperation with the Digital Humanities Centre, which is a core facility of the university, colaborating with the library and computing center of the university. Integration of the repository into the national CLARIN-D and international CLARIN infrastructures gives it wide exposure, increasing the likelihood that the resources will be used and further developed beyond the lifetime of the projects in which they were developed. Among the resources currently available in the Tübingen Center Repository, researchers can find widely used treebanks of German (e.g. TüBa-D/Z), the German wordnet (GermaNet), the first manually annotated digital treebank (Index Thomisticus), as well as descriptions of the tools used by the WebLicht ecosystem for natural language processing.