Filter
Reset all

Subjects

Content Types

Countries

AID systems

API

Certificates

Data access

Data access restrictions

Database access

Database access restrictions

Database licenses

Data licenses

Data upload

Data upload restrictions

Enhanced publication

Institution responsibility type

Institution type

Keywords

Metadata standards

PID systems

Provider types

Quality management

Repository languages

Software

Syndications

Repository types

Versioning

  • * at the end of a keyword allows wildcard searches
  • " quotes can be used for searching phrases
  • + represents an AND search (default)
  • | represents an OR search
  • - represents a NOT operation
  • ( and ) implies priority
  • ~N after a word specifies the desired edit distance (fuzziness)
  • ~N after a phrase specifies the desired slop amount
  • 1 (current)
Found 15 result(s)
Competence Centre IULA-UPF-CC CLARIN manages, disseminates and facilitates this catalogue, which provides access to reference information on the use of language technology projects and studies in different disciplines, especially with regard to Humanities and Social Sciences. The Catalog relates information that is organized by Áreas, (disciplines and research topics), Projects (of research that use or have used language technologies), Tasks (that make the tools), Tools (of language technology), Documentation (articles regarding the tools and how they are used) and resources such as Corpora (collections of annotated texts) and Lexica (collections of words for different uses).
CHILDES is the child language component of the TalkBank system. TalkBank is a system for sharing and studying conversational interactions.
LINDAT/CLARIN is designed as a Czech “node” of Clarin ERIC (Common Language Resources and Technology Infrastructure). It also supports the goals of the META-NET language technology network. Both networks aim at collection, annotation, development and free sharing of language data and basic technologies between institutions and individuals both in science and in all types of research. The Clarin ERIC infrastructural project is more focused on humanities, while META-NET aims at the development of language technologies and applications. The data stored in the repository are already being used in scientific publications in the Czech Republic.
>>>>> As of 01/12/2015, deposit of data on SLDR website will be suspended to allow the public opening of Ortolang platform https://www.ortolang.fr/#/market/home .<<<<<Speech & Language Data Repository (SLDR) is a Trusted Data Repository offering labs and scholars a free-of-charge service for sharing their oral/linguistic data and archiving it with the help of procedures compliant with the OAIS model for long-term preservation. Its entire storage is referenced in international repositories such as OLAC (Open Language Archives Community) and CLARIN Virtual Language Observatory (VLO). Currently, packages are distributed via the TGE-Adonis grid, now integrated in Huma-Num, hosted by CC-IN2P3 and preserved on the platform of CINES, an institutional archive beneficiary of the Data Seal of Approval.
The Text Laboratory provides assistance with databases, word lists, corpora and tailored solutions for language technology. We also work on research and development projects alone or in cooperation with others - locally, nationally and internationally. Services and tools: Word and frequency lists, Written corpora, Speech corpora, Multilingual corpora, Databases, Glossa Search Tool, The Oslo-Bergen Tagger, GREI grammar games, Audio files: dialects from Norway and America etc., Nordic Atlas of Language Structures (NALS) Journal, Norwegian in America, NEALT, Ethiopian Language Technology, Access to Corpora
The IDS Repository aims at long-term archival of linguistic resources and tools in the field of German studies. It provides data together with metadata in Dublin Core and CMDI formats. The Mannheim Corpus of historical newspapers and magazines consists of 21 german newspapers and magazines from the 18th and 19th century. It comprises about 750 individual volumes on 3532 pages overall. This corpus was assembled and digitized from 2009 to 2011.
The Linguistic Data Consortium (LDC) is an open consortium of universities, libraries, corporations and government research laboratories. It was formed in 1992 to address the critical data shortage then facing language technology research and development. Initially, LDC's primary role was as a repository and distribution point for language resources. Since that time, and with the help of its members, LDC has grown into an organization that creates and distributes a wide array of language resources. LDC also supports sponsored research programs and language-based technology evaluations by providing resources and contributing organizational expertise. LDC is hosted by the University of Pennsylvania and is a center within the University’s School of Arts and Sciences.
Answering an increasing demand for digital and collective research features in the humanities, TextGrid has, since its start in 2006, established the infrastructure for a respective virtual research environment. In continuous exchange with the scientific community, TextGrid has developed a variety of tools and services available for free download in a stable version. Together with the TextGrid Repository, the Virtual Research environment TextGrid offers humanist researcher in the humanities sustainable editing, storing and publishing of their data in a thoroughly tested and safe environment. The vision of a digital ecosystem is based on the open source idea, allowing for free exchange of tools and data, whereby adaptation concerning discipline-specific needs is made possible. Researchers from a wide range of humanistic disciplines such as philology, linguistics, musicology, art history, classical philology and musicology are actively working with TextGrid have joined the TextGrid consortium in the second phase of the project.
Lithuania became a full member of CLARIN ERIC in January of 2015 and soon CLARIN-LT consortium was founded by three partner universities: Vytautas Magnus University, Kaunas Technology University and Vilnius University. The main goal of the consortium is to become a CLARIN B centre, which will be able to serve language users in Lithuania and Europe for storing and accessing language resources.
Country
Created in 2005 by the CNRS, CNRTL unites in a single portal, a set of linguistic resources and tools for language processing. The CNRTL includes the identification, documentation (metadata), standardization, storage, enhancement and dissemination of resources. The sustainability of the service and the data is guaranteed by the backing of the UMR ATILF (CNRS - Université Nancy), support of the CNRS and its integration in the excellence equipment project ORTOLANG .
Currently the institute has more than 450 collections consisting of (digital) research data, digitized material, archival collections, printed material, handwritten questionnaires, maps and pictures. The focus is on resources relevant for the study of function, meaning and coherence of cultural expressions and resources relevant for the structural, dialectological and sociolinguistic study of language variation within the Dutch language. An overview is here http://www.meertens.knaw.nl/cms/en/collections/databases
In collaboration with other centres in the CLARIN-D consortium, the UdS CLARIN-D centre enables eHumanities by providing a service for hosting and processing language resources (notably corpora) for members of the research community. The UdS CLARIN-D centre thus contributes of lifting the fragmentation of language resources by assisting members of the research community in preparing language materials in such a way that easy discovery is ensured, interchange is facilitated and preservation is enabled by enriching such materials with meta-information, transforming them into sustainable formats and hosting them. We have an explicit mission to archive language resources especially multilingual corpora (parallel, comparable) and corpora including specific registers, both collected by associated researchers as well as researchers who are not affiliated with us.