  • * at the end of a keyword allows wildcard searches
  • " quotes can be used for searching phrases
  • + represents an AND search (default)
  • | represents an OR search
  • - represents a NOT operation
  • ( and ) implies priority
  • ~N after a word specifies the desired edit distance (fuzziness)
  • ~N after a phrase specifies the desired slop amount
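The operators listed above can be combined into a single search string. The following is a minimal sketch in Python that only assembles such a string. The helper names (`phrase`, `fuzzy`, `wildcard`, `any_of`, `all_of`, `exclude`) are hypothetical and not part of any site API, and how the search backend interprets the final string is assumed from the list above rather than confirmed.

```python
# Hypothetical helpers that assemble a query string from the operators
# described in the list above. They only build text; they do not call
# any search service.

def phrase(text, slop=None):
    """Quote a phrase; an optional ~N suffix sets the slop amount."""
    quoted = f'"{text}"'
    return quoted if slop is None else f"{quoted}~{slop}"


def fuzzy(word, distance):
    """Append ~N to a word to request the given edit distance."""
    return f"{word}~{distance}"


def wildcard(stem):
    """Append * to a keyword stem for a wildcard search."""
    return f"{stem}*"


def any_of(*terms):
    """Join terms with | (OR) and group them with parentheses."""
    return "(" + " | ".join(terms) + ")"


def all_of(*terms):
    """Join terms with + (AND, which the list above gives as the default)."""
    return " + ".join(terms)


def exclude(term):
    """Prefix a term with - (NOT)."""
    return f"-{term}"


query = " ".join([
    all_of(
        wildcard("lingu"),
        any_of(phrase("text corpus", slop=2), fuzzy("archive", 1)),
    ),
    exclude("speech"),
])
print(query)
```

The parentheses keep the OR group together so that it is evaluated before the surrounding AND and NOT terms.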
Found 9 result(s)
ARCHE (A Resource Centre for the HumanitiEs) is a service aimed at offering stable and persistent hosting as well as dissemination of digital research data and resources for the Austrian humanities community. ARCHE welcomes data from all humanities fields. ARCHE is the successor of the Language Resources Portal (LRP) and acts as Austria’s connection point to the European network of CLARIN Centres for language resources.
The German Text Archive (Deutsches Textarchiv, DTA) presents online a selection of key German-language works in various disciplines from the 17th to 19th centuries. The electronic full-texts are indexed linguistically and the search facilities tolerate a range of spelling variants. The DTA presents German-language printed works from around 1650 to 1900 as full text and as digital facsimile. The selection of texts was made on the basis of lexicographical criteria and includes scientific or scholarly texts, texts from everyday life, and literary works. Each work was digitised from its first edition. Using the digital images of these editions, the text was first typed up manually twice (‘double keying’). To represent the structure of the text, the electronic full-text was encoded in conformity with the XML standard TEI P5. The next stages complete the linguistic analysis, i.e. the text is tokenised, lemmatised, and the parts of speech are annotated. The DTA thus presents a linguistically analysed, historical full-text corpus, available for a range of questions in corpus linguistics. Thanks to the interdisciplinary nature of the DTA Corpus, it also offers valuable source-texts for neighbouring disciplines in the humanities, and for scientists, legal scholars and economists.
IBICT provides a research data repository that handles long-term preservation and good archiving practices, so that researchers can share, maintain control of, and get recognition for their data. The repository supports research data sharing with persistent data citation, allowing the data to be reproduced. The Dataverse is a large open data repository of all disciplines, created by the Institute for Quantitative Social Science at Harvard University. The IBICT Dataverse repository provides a free means to deposit and find specific datasets stored by employees of the institutions participating in the Cariniana network.
Open Context is a free, open access resource for the electronic publication of primary field research from archaeology and related disciplines. It emerged as a means for scholars and students to easily find and reuse content created by others, which is key to advancing research and education. Open Context's technologies focus on ease of use, open licensing frameworks, informal data integration and, most importantly, data portability. Open Context currently publishes 132 projects.
Pandora is an open data platform devoted to the study of the human story. Data may be deposited from various disciplines and research topics that investigate humans from their early beginnings until the present, in addition to their environmental context (e.g. archeology, anthropology, history, ancient DNA, isotopes, zooarchaeology, archaeobotany, and paleoenvironmental and paleoclimatic studies). Pandora allows autonomous data communities to self-manage their webspace and community membership. Data communities self-curate their data plus other supporting resources. Datasets may be assigned a new DOI and a schema markup is employed to improve data findability. Pandora also allows for links to datasets that are stored externally and have previously assigned DOIs. Through this, it becomes possible to establish data networks devoted to specific topics that may combine a mix of datasets stored either within Pandora or externally.
The Bavarian Archive for Speech Signals (BAS) is a public institution hosted by the University of Munich. This institution was founded with the aim of making corpora of current spoken German available to both the basic research and the speech technology communities via a maximally comprehensive digital speech-signal database. The speech material will be structured in a manner allowing flexible and precise access, with acoustic-phonetic and linguistic-phonetic evaluation forming an integral part of it.
The project was set up to improve the infrastructure for text-based linguistic research and development by building a huge, automatically annotated German text corpus and the corresponding tools for corpus annotation and exploitation. DeReKo constitutes the largest linguistically motivated collection of contemporary German texts. It contains fictional, scientific and newspaper texts, as well as several other text types; contains only licensed texts; is encoded with rich meta-textual information; is fully annotated morphosyntactically (three concurrent annotations); is continually expanded, with a focus on size and stratification of data; may be analyzed free of charge via the query system COSMAS II; and serves as a 'primordial sample' from which users may draw specialized sub-samples (so-called 'virtual corpora') to represent the language domain they wish to investigate. !!! Access to data of Das Deutsche Referenzkorpus is also provided by: IDS Repository https://www.re3data.org/repository/r3d100010382 !!!
CLARIN.SI is the Slovenian node of the European CLARIN (Common Language Resources and Technology Infrastructure) Centers. The CLARIN.SI repository is hosted at the Jožef Stefan Institute and offers long-term preservation of deposited linguistic resources, along with their descriptive metadata. The integration of the repository with the CLARIN infrastructure gives the deposited resources wide exposure, so that they can be known, used and further developed beyond the lifetime of the projects in which they were produced. Among the resources currently available in the CLARIN.SI repository are the multilingual MULTEXT-East resources, the CC version of Slovenian reference corpus Gigafida, the morphological lexicon Sloleks, the IMP corpora and lexicons of historical Slovenian, as well as many other resources for a variety of languages. Furthermore, several REST-based web services are provided for different corpus-linguistic and NLP tasks.
The purpose of the Canadian Urban Data Repository (CUDR) is to provide a “home” for urban datasets. While primarily focused on datasets created by academia, it will also contain datasets created by NGOs, governments, citizens, and industry. Datasets stored in the repository will be open-access and will not contain personally identifiable information. The purpose of the Canadian Urban Data Catalogue (CUDC) is to enhance the awareness of urban datasets that exist across Canada by providing a catalogue of Canadian and Canadian-created urban datasets. It will catalogue datasets available in CUDR and external datasets available on other platforms and as web services. These external datasets may be open or closed. CUDC uses a rich metadata model that supports the documentation of, and search for, datasets relevant to a user’s needs. Catalogue entry metadata may be exported from and imported into CUDC.