Reset all


Content Types


AID systems



Data access

Data access restrictions

Database access

Database access restrictions

Database licenses

Data licenses

Data upload

Data upload restrictions

Enhanced publication

Institution responsibility type

Institution type


Metadata standards

PID systems

Provider types

Quality management

Repository languages



Repository types


  • * at the end of a keyword allows wildcard searches
  • " quotes can be used for searching phrases
  • + represents an AND search (default)
  • | represents an OR search
  • - represents a NOT operation
  • ( and ) implies priority
  • ~N after a word specifies the desired edit distance (fuzziness)
  • ~N after a phrase specifies the desired slop amount
Found 145 result(s)
ORTOLANG is an EQUIPEX project accepted in February 2012 in the framework of investissements d’avenir. Its aim is to construct a network infrastructure including a repository of language data (corpora, lexicons, dictionaries etc.) and readily available, well-documented tools for its processing. Expected outcomes comprize: promoting research on analysis, modelling and automatic processing of our language to their highest international levels thanks to effective resource pooling; facilitating the use and transfer of resources and tools set up within public laboratories to industrial partners, notably SMEs which often cannot develop such resources and tools for language processing given the cost of investment; promoting French language and the regional languages of France by sharing expertise acquired by public laboratories. ORTOLANG is a service for the language, which is complementary to the service offered by Huma-Num (très grande infrastructure de recherche). Ortolang gives access to SLDR for speech, and CNRTL for text resources.
York Digital Library (YODL) is a University-wide Digital Library service for multimedia resources used in or created through teaching, research and study at the University of York. YODL complements the University's research publications, held in White Rose Research Online and PURE, and the digital teaching materials in the University's Yorkshare Virtual Learning Environment. YODL contains a range of collections, including images, past exam papers, masters dissertations and audio. Some of these are available only to members of the University of York, whilst other material is available to the public. YODL is expanding with more content being added all the time
The project is set up in order to improve the infrastructure for text-based linguistic research and development by building a huge, automatically annotated German text corpus and the corresponding tools for corpus annotation and exploitation. DeReKo constitutes the largest linguistically motivated collection of contemporary German texts, contains fictional, scientific and newspaper texts, as well as several other text types, contains only licenced texts, is encoded with rich meta-textual information, is fully annotated morphosyntactically (three concurrent annotations), is continually expanded, with a focus on size and stratification of data, may be analyzed free of charge via the query system COSMAS II, serves as a 'primordial sample' from which users may draw specialized sub-samples (socalled 'virtual corpora') to represent the language domain they wish to investigate.
The aim of the project was to compile a representative computerized corpus of German for the period 1650-1800. This is the first such corpus of early modern German and it is intended as a primary research resource in a number of disciplines. Its structure deliberately parallels that of extant historical corpora of English in order to facilitate systematic comparative studies. The regional dimension which was an essential feature of the projects also provides information about the link between language and changes in the relative cultural and political areas within Germany.
Methods of digital architectural documentation/polychromy (pilot project). Three architectural fragments were recorded with photography, architectural drawings by hand, different techniques of 3D scanning, and Reflectance Transformation Imaging (RTI).
eLaborate is an online work environment in which scholars can upload scans, transcribe and annotate text, and publish the results as on online text edition which is freely available to all users. Short information about and a link to already published editions is presented on the page Editions under Published. Information about editions currently being prepared is posted on the page Ongoing projects. The eLaborate work environment for the creation and publication of online digital editions is developed by the Huygens Institute for the History of the Netherlands of the Royal Netherlands Academy of Arts and Sciences. Although the institute considers itself primarily a research facility and does not maintain a public collection profile, Huygens ING actively maintains almost 200 digitally available resource collections.
Currently the institute has more than 450 collections consisting of (digital) research data, digitized material, archival collections, printed material, handwritten questionnaires, maps and pictures. The focus is on resources relevant for the study of function, meaning and coherence of cultural expressions and resources relevant for the structural, dialectological and sociolinguistic study of language variation within the Dutch language. An overview is here
CLAPOP is the portal of the Dutch CLARIN community. It brings together all relevant resources that were created within the CLARIN NL project and that now are part of the CLARIN NL infrastructure or that were created by other projects but are essential for the functioning of the CLARIN (NL) infrastructure. CLARIN-NL has closely cooperated with CLARIN Flanders in a number of projects. The common results of this cooperation and the results of this cooperation created by CLARIN Flanders are included here as well.
In the Wolfenbüttel Digital Library the Herzog August Bibliothek presents in digital facsimile selected items from its collections which are rare, outstanding, frequently used, or currently most relevant for research. All digitized titles may be accessed not only here, but also via the PICA-OPAC as long as they are monographs. The OPAC allows you to search for digitized books separately by limiting the search options within the database using the term Online Resources. Projects which provide additional indexing comprise a project-specific database, an inventory of digitized titles, information about tools and techniques, and references to literature. Here the main objective is to provide search facilities outside the scope of usual bibliographic description, such as page-related indexing.
The scores and libretti in this Virtual Collection include first and early editions and manuscript copies of music from the eighteenth and early nineteenth centuries by J.S. Bach and Bach family members, Mozart, Schubert and other composers, as well as multiple versions of nineteenth century opera scores, seminal works of musical modernism, and music of the Second Viennese School. Many, such as variant editions of nineteenth century operas and related libretti, fall into intellectually related sets that are meant to be seen and used together. As a group, they give scholars a window into the study of historical performance practice that cannot be duplicated using the holdings of any one other library.
Welcome to the UCLA Phonetics Lab Archive. For over half a century, the UCLA Phonetics Laboratory has collected recordings of hundreds of languages from around the world, providing source materials for phonetic and phonological research, of value to scholars, speakers of the languages, and language learners alike. The materials on this site comprise audio recordings illustrating phonetic structures from over 200 languages with phonetic transcriptions, plus scans of original field notes where relevant.
CLARIN.SI is the Slovenian node of the European CLARIN (Common Language Resources and Technology Infrastructure) Centers. The CLARIN.SI repository is hosted at the Jožef Stefan Institute and offers long-term preservation of deposited linguistic resources, along with their descriptive metadata. The integration of the repository with the CLARIN infrastructure gives the deposited resources wide exposure, so that they can be known, used and further developed beyond the lifetime of the projects in which they were produced. Among the resources currently available in the CLARIN.SI repository are the multilingual MULTEXT-East resources, the CC version of Slovenian reference corpus Gigafida, the morphological lexicon Sloleks, the IMP corpora and lexicons of historical Slovenian, as well as many other resources for a variety of languages. Furthermore, several REST-based web services are provided for different corpus-linguistic and NLP tasks.
Content type(s)
The aim of the research project TOPOI II A-2-4 was to re-evaluate archaeological records and finds resulting from earlier investigations within the context of recent and ongoing research across the region.
Content type(s)
Database of ancient sources concerning Roman Water Law. Specific legal sources, e.g. from the Corpus Iuris Civilis or the Codex Theodosianus, and literary sources, for example from Cicero, Frontinus, Hyginus, Siculus Flaccus or Vitruvius, were collected to give an overview of water related legal problems in ancient Rome. Furthermore, the aim of the database is to classify these sources into different legal topics, in order to facilitate the research for sources concerning specific questions regarding Roman Water Law.
The CRC806-Database platform is the Research Data Management infrastructure of the SFB / CRC 806. The infrastructure is implemented using Open Source software, and implements Open Science, Open Access and Open Data principles. The Collaborative Research Centre (CRC; ‘Sonderforschungsbereich’ or SFB) is designed to capture the complex nature of chronology, regional structure, climatic, environmental and socio-cultural contexts of major intercontinental and transcontinental events of dispersal of Modern Man from Africa to Western Eurasia, and particularly to Europe (Cited from introductory text on:
Since 1898 the Swiss Lawyers Society edits a collection of law sources which had been created on Swiss territory up to 1798, the Collection of Swiss Law Sources. The Collection contains material from the early middle ages until early modern times (1798). Over 100 volumes, or more than 70,000 pages of source material and comments from all language regions of Switzerland have been published so far.
Sinmin contains texts of different genres and styles of the modern and old Sinhala language. The main sources of electronic copies of texts for the corpus are online Sinhala newspapers, online Sinhala news sites, Sinhala school textbooks available in online, online Sinhala magazines, Sinhala Wikipedia, Sinhala fictions available in online, Mahawansa, Sinhala Blogs, Sinhala subtitles and Sri lankan gazette.