  • * at the end of a keyword allows wildcard searches
  • " quotes can be used for searching phrases
  • + represents an AND search (default)
  • | represents an OR search
  • - represents a NOT operation
  • ( and ) indicate precedence
  • ~N after a word specifies the desired edit distance (fuzziness)
  • ~N after a phrase specifies the desired slop amount
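The operators above can be combined into a single query string. The following is a minimal sketch in Python that only assembles such strings; the helper names (`phrase`, `prefix`, `fuzzy`, `sloppy`, `all_of`, `any_of`, `exclude`) are hypothetical and are not part of any API offered by this site. It assumes, as the list states, that AND is the default when terms are separated by a space.

```python
# Hypothetical helpers for composing query strings with the operators listed above.
# They only build text; they do not contact any search service.

def phrase(text: str) -> str:
    """Wrap text in double quotes so it is searched as a phrase."""
    return f'"{text}"'

def prefix(word: str) -> str:
    """Append * to allow a wildcard search on the end of a keyword."""
    return f"{word}*"

def fuzzy(word: str, distance: int) -> str:
    """Append ~N to a word to request the given edit distance."""
    return f"{word}~{distance}"

def sloppy(text: str, slop: int) -> str:
    """Append ~N to a quoted phrase to request the given slop amount."""
    return f"{phrase(text)}~{slop}"

def any_of(*terms: str) -> str:
    """Join terms with | (OR) and group them with parentheses."""
    return "(" + " | ".join(terms) + ")"

def exclude(term: str) -> str:
    """Prefix a term with - (NOT)."""
    return f"-{term}"

def all_of(*terms: str) -> str:
    """Join terms with spaces; AND is the default, so an explicit + is optional."""
    return " ".join(terms)

query = all_of(
    any_of(prefix("linguist"), phrase("language resources")),
    exclude("music"),
)
print(query)  # (linguist* | "language resources") -music
```

Grouping the OR terms in parentheses keeps the NOT term from binding to only one alternative.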
Found 255 result(s)
Content type(s)
Country
Fondo Antiguo is part of UVaDOC Repositorio Documental de la Universidad de Valladolid. It contains ancient printed documents.
The Polinsky Language Sciences Lab at Harvard University is a linguistics lab that examines questions of language structure and its effect on the ways in which people use and process language in real time. We engage in linguistic and interdisciplinary research projects ourselves; offer linguistic research capabilities for undergraduate and graduate students, faculty, and visitors; and build relationships with the linguistic communities in which we do our research. We are interested in a broad range of issues pertaining to syntax, interfaces, and cross-linguistic variation. We place a particular emphasis on novel experimental evidence that facilitates the construction of linguistic theory. We have a strong cross-linguistic focus, drawing upon English, Russian, Chinese, Korean, Mayan languages, Basque, Austronesian languages, languages of the Caucasus, and others. We believe that challenging existing theories with data from as broad a range of languages as possible is a crucial component of the successful development of linguistic theory. We investigate both fluent speakers and heritage speakers—those who grew up hearing or speaking a particular language but who are now more fluent in a different, societally dominant language. Heritage languages, a novel field of linguistic inquiry, are important because they provide new insights into processes of linguistic development and attrition in general, thus increasing our understanding of the human capacity to maintain and acquire language. Understanding language use and processing in real time and how children acquire language helps us improve language study and pedagogy, which in turn improves communication across the globe. Although our lab does not specialize in language acquisition, we have conducted some studies of acquisition of lesser-studied languages and heritage languages, with the purpose of comparing heritage speakers to adults.
ANPERSANA is the digital library of IKER (UMR 5478), a research centre specialized in Basque language and texts. The online library platform receives and disseminates primary sources of data issued from research in Basque language and culture. As of today, two corpora of documents have been published. The first is a collection of private letters written in an 18th-century variety of Basque, documented and transcribed into modern standard Basque. The discovery of the collection, named Le Dauphin, has raised new questions about the history and sociology of writing in minority languages, not only in France but across the whole Atlantic Arc. The second corpus is a selection of sound recordings about monodic chant in the Basque Country. The documents were collected as part of PhD thesis research that took place between 2003 and 2012. They comprise 50 hours of interviews with francophone and bascophone cultural representatives, carried out either at the informants' workplaces or in public areas. ANPERSANA is bundled with an advanced search engine. The documents have been indexed and geo-localized on an interactive map. The platform is committed to open access, and all the resources can be uploaded freely under the different Creative Commons (CC) licenses.
FactGrid is a Wikibase instance designed to be used by historians with a focus on international projects. The database is hosted by the University of Erfurt and coordinated at the Gotha Research Centre. Partners in joint ventures are Wikimedia Germany as the software provider and the German National Library in a project to open the GND to international research.
By stimulating inspiring research and producing innovative tools, Huygens ING intends to open up old and inaccessible sources, and to understand them better. Huygens ING’s focus is on Digital Humanities, History, History of Science, and Textual Scholarship. Huygens ING pursues research in the fields of History, Literary Studies, the History of Science and Digital Humanities. Huygens ING aims to publish digital sources and data responsibly and with care. Innovative tools are made as widely available as possible. We strive to share the available knowledge at the institute with both academic peers and the wider public.
MICASE provides a collection of transcripts of academic speech events recorded at the University of Michigan. The original DAT audiotapes are held in the English Language Institute and may be consulted by bona fide researchers under special arrangements. Additional access: https://lsa.umich.edu/eli/language-resources/micase-micusp.html
The Répertoire International des Sources Musicales (RISM) - International Inventory of Musical Sources - is an international, non-profit organization that aims to comprehensively document extant musical sources worldwide. These primary sources are music manuscripts or printed music editions, writings on music theory, and libretti. They are preserved in libraries, archives, churches, schools and private collections. RISM was founded in Paris in 1952 and is the largest and only international organization that documents written musical sources. RISM records what exists and where it can be found. As a result, by virtue of being cataloged in a comprehensive inventory, music traditions are protected while also being made available to musicologists and musicians alike. Such work is thus not an end in itself, but leads directly to practical applications.
LAUDATIO has developed an open access research data repository for historical corpora. For the access and (re-)use of historical corpora, the LAUDATIO repository uses a flexible and appropriate documentation schema with a subset of TEI customized by TEI ODD. The extensive metadata schema contains information about the preparation and checking methods applied to the data, tools, formats and annotation guidelines used in the project, as well as bibliographic metadata, and information on the research context (e.g. the research project). To provide complex and comprehensive search in the annotation data, the search and visualization tool ANNIS is integrated in the LAUDATIO-Repository.
An increasing number of Language Resources (LR) in the various fields of Human Language Technology (HLT) are distributed on behalf of ELRA via its operational body ELDA, thanks to the contribution of various players in the HLT community. Our aim is to provide Language Resources through this repository, so that researchers and developers need not invest effort in rebuilding resources that already exist, and to help them identify and access those resources.
The Digital Collections repository is a service that provides free and open access to the scholarship and creative works produced and owned by the Texas State University community. The Wittliff Collections, located on the seventh floor of the Albert B. Alkek Library at Texas State University, were founded by William D. Wittliff in 1987. The Wittliff Collections include two collections. 1. The Southwestern Writers Collection: This collection holds the papers of numerous 20th-century writers and the Southwestern & Mexican Photography Collection. The film holdings contain over 500 film and television screenplays as well as complete production archives for several popular films, including the television miniseries Lonesome Dove. The music holdings represent the breadth and scope of popular Texas sounds. 2. Mexican Photography Collection: The Southwestern & Mexican Photography Collection assembles a broad range of photographic work from the Southwestern United States and Mexico, from the 19th century to the present day.
The CERIC Data Portal allows users to consult and manage data related to experiments carried out at CERIC (Central European Research Infrastructure Consortium) partner facilities. Data made available includes scientific datasets collected during experiments, experiment proposals, samples used and publications, if any. Users can search for data based on related metadata (both their own data and other people's public data).
eLaborate is an online work environment in which scholars can upload scans, transcribe and annotate text, and publish the results as an online text edition which is freely available to all users. Brief information about, and links to, already published editions are presented on the page Editions under Published. Information about editions currently being prepared is posted on the page Ongoing projects. The eLaborate work environment for the creation and publication of online digital editions is developed by the Huygens Institute for the History of the Netherlands of the Royal Netherlands Academy of Arts and Sciences. Although the institute considers itself primarily a research facility and does not maintain a public collection profile, Huygens ING actively maintains almost 200 digitally available resource collections.
The German Text Archive (Deutsches Textarchiv, DTA) presents online a selection of key German-language works in various disciplines from the 17th to 19th centuries. The electronic full-texts are indexed linguistically and the search facilities tolerate a range of spelling variants. The DTA presents German-language printed works from around 1650 to 1900 as full text and as digital facsimile. The selection of texts was made on the basis of lexicographical criteria and includes scientific or scholarly texts, texts from everyday life, and literary works. The digitalisation was made from the first edition of each work. Using the digital images of these editions, the text was first typed up manually twice (‘double keying’). To represent the structure of the text, the electronic full-text was encoded in conformity with the XML standard TEI P5. The next stages complete the linguistic analysis, i.e. the text is tokenised, lemmatised, and the parts of speech are annotated. The DTA thus presents a linguistically analysed, historical full-text corpus, available for a range of questions in corpus linguistics. Thanks to the interdisciplinary nature of the DTA Corpus, it also offers valuable source-texts for neighbouring disciplines in the humanities, and for scientists, legal scholars and economists.
The Humanitarian Data Exchange (HDX) is an open platform for sharing data across crises and organisations. Launched in July 2014, the goal of HDX is to make humanitarian data easy to find and use for analysis. HDX is managed by OCHA's Centre for Humanitarian Data, which is located in The Hague. OCHA is part of the United Nations Secretariat and is responsible for bringing together humanitarian actors to ensure a coherent response to emergencies. The HDX team includes OCHA staff and a number of consultants who are based in North America, Europe and Africa.
The aim of the project is the systematic mapping of Czech, and of other languages in comparison with Czech. After free registration, CNC corpora are accessible to everybody interested in studying the language.
The Pacific Islands Families (PIF) Study is an ongoing longitudinal birth cohort study that has been tracking the health and development of 1,398 Pacific children and their parents since the children were born at Middlemore Hospital in South Auckland in the year 2000. It is the only prospective study specifically of Pacific peoples in the world.
RELMIN collects, studies and publishes legal texts defining the status of religious minorities in medieval Europe. The corpus of texts is rich and varied, spanning ten centuries over a broad geographical area; these texts, in Latin, Arabic, Greek, Hebrew and Aramaic (and also in Medieval Spanish, Portuguese, and other European vernaculars), are dispersed in libraries and archives across Europe. The texts are now gathered in the RELMIN Database in their original language, with translations and commentaries. They are made available to scholars, students and citizens at large. Access is unlimited, free and perennial, and users are invited to contribute to the work of compilation. RELMIN is building a digital database of legal, judicial and normative sources defining the status of religious minorities from the 5th to the 15th century.
The Linguistic Data Consortium (LDC) is an open consortium of universities, libraries, corporations and government research laboratories. It was formed in 1992 to address the critical data shortage then facing language technology research and development. Initially, LDC's primary role was as a repository and distribution point for language resources. Since that time, and with the help of its members, LDC has grown into an organization that creates and distributes a wide array of language resources. LDC also supports sponsored research programs and language-based technology evaluations by providing resources and contributing organizational expertise. LDC is hosted by the University of Pennsylvania and is a center within the University’s School of Arts and Sciences.
SWE-CLARIN is a national node of the Common Language Resources and Technology Infrastructure (CLARIN), an ESFRI initiative to build an infrastructure for e-science in the humanities and social sciences. SWE-CLARIN makes language-based materials available as research data using advanced processing tools and other resources. One basic idea is that the increasing amount of text and speech, contemporary and historical, available as digital research material enables new forms of e-science and new ways to tackle old research issues.
The Text Laboratory provides assistance with databases, word lists, corpora and tailored solutions for language technology. We also work on research and development projects alone or in cooperation with others - locally, nationally and internationally. Services and tools: Word and frequency lists, Written corpora, Speech corpora, Multilingual corpora, Databases, Glossa Search Tool, The Oslo-Bergen Tagger, GREI grammar games, Audio files: dialects from Norway and America etc., Nordic Atlas of Language Structures (NALS) Journal, Norwegian in America, NEALT, Ethiopian Language Technology, Access to Corpora
The Language Archive Cologne (LAC) is a research data repository for linguistics and all humanities disciplines working with audiovisual data. The archive forms a cluster of the Data Center for Humanities in cooperation with the Institute of Linguistics of the University of Cologne. The LAC is an archive for language resources that is freely available via web-based access. In addition, concrete technical and methodological advice is offered throughout the research data cycle, from the collection of the data, their preparation and archiving, to publication and reuse.
The architecture of the Myus Temple (Ionian coast) is preserved only in a few very fragmented parts. These components, currently housed in the Staatliche Museen zu Berlin - Antikensammlung, were digitized and will be used in the reconstruction of a column from a temple likely dedicated to Dionysos.
CLARIN-LV is a national node of CLARIN ERIC (Common Language Resources and Technology Infrastructure). The mission of the repository is to ensure the availability and long-term preservation of language resources. The data stored in the repository are being actively used and cited in scientific publications.