  • * at the end of a keyword allows wildcard searches
  • " quotes can be used for searching phrases
  • + represents an AND search (default)
  • | represents an OR search
  • - represents a NOT operation
  • ( and ) group terms to set precedence
  • ~N after a word specifies the desired edit distance (fuzziness)
  • ~N after a phrase specifies the desired slop amount
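
The operators above can be combined into a single query string. The following is a minimal sketch, assuming the search box accepts exactly the syntax listed; the helper names (`prefix`, `phrase`, `fuzzy`, `all_of`, `any_of`, `exclude`) are hypothetical and only assemble text, they are not part of any registry API.

```python
from typing import Optional


def prefix(word: str) -> str:
    """Wildcard search: '*' at the end of a keyword."""
    return f"{word}*"


def phrase(text: str, slop: Optional[int] = None) -> str:
    """Phrase search in quotes; '~N' after the phrase sets the slop amount."""
    quoted = f'"{text}"'
    return quoted if slop is None else f"{quoted}~{slop}"


def fuzzy(word: str, distance: int) -> str:
    """'~N' after a word sets the desired edit distance (fuzziness)."""
    return f"{word}~{distance}"


def all_of(*terms: str) -> str:
    """AND search: terms joined with '+' (AND is also the default)."""
    return " + ".join(terms)


def any_of(*terms: str) -> str:
    """OR search, wrapped in parentheses so it is evaluated as one group."""
    return "(" + " | ".join(terms) + ")"


def exclude(term: str) -> str:
    """NOT operation: '-' in front of a term."""
    return f"-{term}"


# Hypothetical example terms, chosen only for illustration.
query = all_of(
    prefix("linguist"),
    any_of(phrase("text corpora", slop=2), fuzzy("lexicon", 1)),
)
negated = exclude("speech")

print(query)
print(negated)
```

Wrapping the OR group in parentheses keeps it from being split by the surrounding AND, which is the role the list assigns to ( and ).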
Found 8 result(s)
Currently, the institute holds more than 700 collections consisting of (digital) research data, digitized material, archival collections, printed material, handwritten questionnaires, maps and pictures. The focus is on resources relevant for the study of the function, meaning and coherence of cultural expressions and on resources relevant for the structural, dialectological and sociolinguistic study of language variation within the Dutch language. An overview is available at https://meertens.knaw.nl/en/datasets/
The Linguistic Data Consortium (LDC) is an open consortium of universities, libraries, corporations and government research laboratories. It was formed in 1992 to address the critical data shortage then facing language technology research and development. Initially, LDC's primary role was as a repository and distribution point for language resources. Since that time, and with the help of its members, LDC has grown into an organization that creates and distributes a wide array of language resources. LDC also supports sponsored research programs and language-based technology evaluations by providing resources and contributing organizational expertise. LDC is hosted by the University of Pennsylvania and is a center within the University’s School of Arts and Sciences.
Språkbanken was established in 1975 as a national center located in the Faculty of Arts, University of Gothenburg. Allén's groundbreaking corpus linguistic research resulted in the creation of one of the first large electronic text corpora in a language other than English, with one million words of newspaper text. The task of Språkbanken is to collect, develop, and store (Swedish) text corpora, and to make linguistic data extracted from the corpora available to researchers and to the public.
The focus of the CLARIN INT Portal is on resources that are relevant to the lexicological study of the Dutch language and on resources relevant for research in and development of language and speech technology, for example lexicons, lexical databases, text corpora, speech corpora, and language and speech technology tools. The resources are: Cornetto-LMF (Lexicon Markup Framework), Corpus of Contemporary Dutch (Corpus Hedendaags Nederlands), Corpus Gysseling, Corpus VU-DNC (VU University Diachronic News text Corpus), Dictionary of the Frisian Language (Woordenboek der Friese Taal), DuELME-LMF (Lexicon Markup Framework), Language Portal (Taalportaal), Namescape, NERD (Named Entity Recognition and Disambiguation) and TICCLops (Text-Induced Corpus Clean-up online processing system).
CLAPOP is the portal of the Dutch CLARIN community. It brings together all relevant resources that were created within the CLARIN NL project and are now part of the CLARIN NL infrastructure, or that were created by other projects but are essential for the functioning of the CLARIN (NL) infrastructure. CLARIN-NL has closely cooperated with CLARIN Flanders in a number of projects. The common results of this cooperation and the results of this cooperation created by CLARIN Flanders are included here as well.
Created in 2005 by the CNRS, CNRTL unites in a single portal a set of linguistic resources and tools for language processing. The CNRTL's work includes the identification, documentation (metadata), standardization, storage, enhancement and dissemination of resources. The sustainability of the service and the data is guaranteed by the backing of the UMR ATILF (CNRS - Université Nancy), the support of the CNRS and its integration in the excellence equipment project ORTOLANG.
ARCHE (A Resource Centre for the HumanitiEs) is a service aimed at offering stable and persistent hosting as well as dissemination of digital research data and resources for the Austrian humanities community. ARCHE welcomes data from all humanities fields. ARCHE is the successor of the Language Resources Portal (LRP) and acts as Austria’s connection point to the European network of CLARIN Centres for language resources.
ORTOLANG is an EQUIPEX project accepted in February 2012 in the framework of investissements d’avenir. Its aim is to construct a network infrastructure including a repository of language data (corpora, lexicons, dictionaries etc.) and readily available, well-documented tools for its processing. Expected outcomes comprise: raising research on the analysis, modelling and automatic processing of our language to the highest international level through effective resource pooling; facilitating the use and transfer of resources and tools set up within public laboratories to industrial partners, notably SMEs, which often cannot develop such resources and tools for language processing given the cost of investment; and promoting the French language and the regional languages of France by sharing expertise acquired by public laboratories. ORTOLANG is a service for the language, complementary to the service offered by Huma-Num (très grande infrastructure de recherche). ORTOLANG gives access to SLDR for speech and to CNRTL for text resources.