Filter
Reset all

Subjects

Content Types

Countries

AID systems

API

Certificates

Data access

Data access restrictions

Database access

Database access restrictions

Database licenses

Data licenses

Data upload

Data upload restrictions

Enhanced publication

Institution responsibility type

Institution type

Keywords

Metadata standards

PID systems

Provider types

Quality management

Repository languages

Software

Syndications

Repository types

Versioning

  • * at the end of a keyword allows wildcard searches
  • " quotes can be used for searching phrases
  • + represents an AND search (default)
  • | represents an OR search
  • - represents a NOT operation
  • ( and ) implies priority
  • ~N after a word specifies the desired edit distance (fuzziness)
  • ~N after a phrase specifies the desired slop amount
  • 1 (current)
Found 20 result(s)
The Eurac Research CLARIN Centre (ERCC) is a dedicated repository for language data. It is hosted by the Institute for Applied Linguistics (IAL) at Eurac Research, a private research centre based in Bolzano, South Tyrol. The Centre is part of the Europe-wide CLARIN infrastructure, which means that it follows well-defined international standards for (meta)data and procedures and is well-embedded in the wider European Linguistics infrastructure. The repository hosts data collected at the IAL, but is also open for data deposits from external collaborators.
The CLARIN­/Text+ repository at the Saxon Academy of Sciences and Humanities in Leipzig offers long­term preservation of digital resources, along with their descriptive metadata. The mission of the repository is to ensure the availability and long­term preservation of resources, to preserve knowledge gained in research, to aid the transfer of knowledge into new contexts, and to integrate new methods and resources into university curricula. Among the resources currently available in the Leipzig repository are a set of corpora of the Leipzig Corpora Collection (LCC), based on newspaper, Wikipedia and Web text. Furthermore several REST-based webservices are provided for a variety of different NLP-relevant tasks The repository is part of the CLARIN infrastructure and part of the NFDI consortium Text+. It is operated by the Saxon Academy of Sciences and Humanities in Leipzig.
DataON is Korea's National Research Data Platform. It provides integrated search of metadata for KISTI's research data and domestic and international research data and links to raw data. DataON allows users (researchers, policy makers, etc.) to perform the following tasks: Easily search for various types of research data in all scientific fields. By registering research results, research data can be posted and cited. Build a community among researchers and enable collaborative research. It provides a data analysis environment that allows one-stop analysis of discovered research data.
CLARIN-LV is a national node of Clarin ERIC (Common Language Resources and Technology Infrastructure). The mission of the repository is to ensure the availability and long­ term preservation of language resources. The data stored in the repository are being actively used and cited in scientific publications.
Country
sciencedata.dk is a research data store provided by DTU, the Danish Technical University, specifically aimed at researchers and scientists at Danish academic institutions. The service is intended for working with and sharing active research data as well as for safekeeping of large datasets. The data can be accessed and manipulated via a web interface, synchronization clients, file transfer clients or the command line. The service is built on and with open-source software from the ground up: FreeBSD, ZFS, Apache, PHP, ownCloud/Nextcloud. DTU is actively engaged in community efforts on developing research-specific functionality for data stores. Our servers are attached directly to the 10-Gigabit backbone of "Forskningsnettet" (the National Research and Education Network of Denmark) - implying that up and download speed from Danish academic institutions is in principle comparable to those of an external USB hard drive. Data store for research data allowing private sharing and sharing via links / persistent URLs.
Discovery is the digital repository of research, and related activities, undertaken at the University of Dundee. The content held in Discovery is varied and ranges from traditional research outputs such as peer-reviewed articles and conference papers, books, chapters and post-graduate research theses and data to records for artefacts, exhibitions, multimedia and software. Where possible Discovery provides full-text access to a version of the research. Discovery is the data catalogue for datasets resulting from research undertaken at the University of Dundee and in some instances the publisher of research data.
LINDAT/CLARIN is designed as a Czech “node” of Clarin ERIC (Common Language Resources and Technology Infrastructure). It also supports the goals of the META-NET language technology network. Both networks aim at collection, annotation, development and free sharing of language data and basic technologies between institutions and individuals both in science and in all types of research. The Clarin ERIC infrastructural project is more focused on humanities, while META-NET aims at the development of language technologies and applications. The data stored in the repository are already being used in scientific publications in the Czech Republic. In 2019 LINDAT/CLARIAH-CZ was established as a unification of two research infrastructures, LINDAT/CLARIN and DARIAH-CZ.
ILC-CNR for CLARIN-IT repository is a library for linguistic data and tools. Including: Text Processing and Computational Philology; Natural Language Processing and Knowledge Extraction; Resources, Standards and Infrastructures; Computational Models of Language Usage. The studies carried out within each area are highly interdisciplinary and involve different professional skills and expertises that extend across the disciplines of Linguistics, Computational Linguistics, Computer Science and Bio-Engineering.
Country
Repository "Open Science Resource Atlas 2.0" aims to increase the accessibility, improve the quality and extend the reusability of science resources. Repository focuses on the digital sharing of resources of great importance to the field of science and economy. These include publications, scripts, lectures, 3D models, audio and video recordings, photos, input and output files of various computer programs, databases collecting data from various fields, machines, systems, language corpora and many others. The target group, apart from academics, students and doctoral students, is everyone interested, including entrepreneurs and, what is important and unique - disabled, blind, visually impaired and deaf people.
CLARIN.SI is the Slovenian node of the European CLARIN (Common Language Resources and Technology Infrastructure) Centers. The CLARIN.SI repository is hosted at the Jožef Stefan Institute and offers long-term preservation of deposited linguistic resources, along with their descriptive metadata. The integration of the repository with the CLARIN infrastructure gives the deposited resources wide exposure, so that they can be known, used and further developed beyond the lifetime of the projects in which they were produced. Among the resources currently available in the CLARIN.SI repository are the multilingual MULTEXT-East resources, the CC version of Slovenian reference corpus Gigafida, the morphological lexicon Sloleks, the IMP corpora and lexicons of historical Slovenian, as well as many other resources for a variety of languages. Furthermore, several REST-based web services are provided for different corpus-linguistic and NLP tasks.
The repository is part of the National Research Data Infrastructure initiative Text+, in which the University of Tübingen is a partner. It is housed at the Department of General and Computational Linguistics. The infrastructure is maintained in close cooperation with the Digital Humanities Centre, which is a core facility of the university, colaborating with the library and computing center of the university. Integration of the repository into the national CLARIN-D and international CLARIN infrastructures gives it wide exposure, increasing the likelihood that the resources will be used and further developed beyond the lifetime of the projects in which they were developed. Among the resources currently available in the Tübingen Center Repository, researchers can find widely used treebanks of German (e.g. TüBa-D/Z), the German wordnet (GermaNet), the first manually annotated digital treebank (Index Thomisticus), as well as descriptions of the tools used by the WebLicht ecosystem for natural language processing.
This is the KONECT project, a project in the area of network science with the goal to collect network datasets, analyse them, and make available all analyses online. KONECT stands for Koblenz Network Collection, as the project has roots at the University of Koblenz–Landau in Germany. All source code is made available as Free Software, and includes a network analysis toolbox for GNU Octave, a network extraction library, as well as code to generate these web pages, including all statistics and plots. KONECT contains over a hundred network datasets of various types, including directed, undirected, bipartite, weighted, unweighted, signed and rating networks. The networks of KONECT are collected from many diverse areas such as social networks, hyperlink networks, authorship networks, physical networks, interaction networks and communication networks. The KONECT project has developed network analysis tools which are used to compute network statistics, to draw plots and to implement various link prediction algorithms. The result of these analyses are presented on these pages. Whenever we are allowed to do so, we provide a download of the networks.
The University of Guelph Research Data Repositories provide long-term stewardship of research data created at or in cooperation with the University of Guelph. The Data Repositories are guided by the FAIR Guiding Principles for scientific data management and stewardship which aim to improve the Findability, Accessibility, Interoperability and Reuse of research data. The Data Repositories is composed of two main collections: the Agri-environmental Research Data collection which houses agricultural and environmental research data, and the Cross-disciplinary Research Data collection which houses all other disciplinary research data.