  • * at the end of a keyword allows wildcard searches
  • " quotes can be used for searching phrases
  • + represents an AND search (default)
  • | represents an OR search
  • - represents a NOT operation
  • ( and ) group terms to set precedence
  • ~N after a word specifies the desired edit distance (fuzziness)
  • ~N after a phrase specifies the desired slop amount
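As a rough illustration of how the operators listed above combine, the sketch below assembles a query string from small helper functions. The helper names (`phrase`, `prefix`, `fuzzy`, `all_of`, `any_of`, `exclude`) are hypothetical and not part of any search API; the code only builds the text a user would type into the search box, and how the site interprets a particular combination is an assumption.

```python
from typing import Optional


def phrase(text: str, slop: Optional[int] = None) -> str:
    """Quote text for a phrase search; an optional ~N sets the slop amount."""
    quoted = f'"{text}"'
    return quoted if slop is None else f"{quoted}~{slop}"


def prefix(stem: str) -> str:
    """Append * so the keyword matches any word starting with the stem."""
    return f"{stem}*"


def fuzzy(word: str, distance: int) -> str:
    """Append ~N to a single word to allow the given edit distance."""
    return f"{word}~{distance}"


def all_of(*terms: str) -> str:
    """Join terms with + (AND, which is also the default)."""
    return " + ".join(terms)


def any_of(*terms: str) -> str:
    """Join terms with | (OR) and wrap them in parentheses for precedence."""
    return "(" + " | ".join(terms) + ")"


def exclude(term: str) -> str:
    """Prefix a term with - (NOT)."""
    return f"-{term}"


# Hypothetical example query: the phrase "research data", combined with either
# a wildcard or a fuzzy term, and excluding one keyword.
query = " ".join([
    all_of(phrase("research data"), any_of(prefix("linguist"), fuzzy("corpus", 1))),
    exclude("astronomy"),
])
print(query)
```

The parentheses produced by `any_of` keep the OR group together, so the AND applies to the group as a whole rather than to its first term only.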
Found 64 result(s)
The focus of PolMine is on texts published by public institutions in Germany. Corpora of parliamentary protocols are at the heart of the project: Parliamentary proceedings are available for long stretches of time, cover a broad set of public policies and are in the public domain, making them a valuable text resource for political science. The project develops repositories of textual data in a sustainable fashion to suit the research needs of political science. Concerning data, the focus is on converting text issued by public institutions into a sustainable digital format (TEI/XML).
FULIR Data is a research data repository that gathers, permanently stores and allows open access to primary data produced by researchers based at Ruđer Bošković Institute. Researchers deposit datasets by themselves (self-archiving) with the support given by the Centre for Scientific Information and their RDM experts.
The repository is part of the National Research Data Infrastructure initiative Text+, in which the University of Tübingen is a partner. It is housed at the Department of General and Computational Linguistics. The infrastructure is maintained in close cooperation with the Digital Humanities Centre, which is a core facility of the university, collaborating with the library and computing center of the university. Integration of the repository into the national CLARIN-D and international CLARIN infrastructures gives it wide exposure, increasing the likelihood that the resources will be used and further developed beyond the lifetime of the projects in which they were developed. Among the resources currently available in the Tübingen Center Repository, researchers can find widely used treebanks of German (e.g. TüBa-D/Z), the German wordnet (GermaNet), the first manually annotated digital treebank (Index Thomisticus), as well as descriptions of the tools used by the WebLicht ecosystem for natural language processing.
UM Dataverse is part of the Dataverse Project conceived of by Harvard University. It is an open source repository to assist researchers in the creation, management and dissemination of their research data. UM Dataverse allows for the creation of multiple collaborative environments containing datasets, metadata and digital objects. UM Dataverse provides formal scholarly data citations and can help with data requirements from publishers and funders.
SWISSUbase is a national cross-disciplinary solution for Swiss universities and other research organisations in need of local institutional data repositories for their researchers. SWISSUbase offers Swiss research institutions a tool for the curation, preservation, and dissemination of scientific research data and for the storage of information on ongoing and completed research projects in the country. The platform relies on international archiving standards and processes to ensure that data are preserved and accessible in the long term. Datasets are curated by Data Service Units (DSU). DSU LaRS – Language Repository of Switzerland: https://www.lars.uzh.ch/en.html
Currently, the institute has more than 700 collections consisting of (digital) research data, digitized material, archival collections, printed material, handwritten questionnaires, maps and pictures. The focus is on resources relevant for the study of function, meaning and coherence of cultural expressions and resources relevant for the structural, dialectological and sociolinguistic study of language variation within the Dutch language. An overview is available at https://meertens.knaw.nl/en/datasets/
The Numeric Data Services Dataverse provides access to the Cross National Time Series (Banks data), the ITERATE database, and selected survey data. The Dataverse of Harvard's Numeric Data Services houses a curated collection of datasets to meet the research and instructional needs of the Harvard community; the datasets are also openly accessible. The collection primarily covers the social sciences.
Launched in December 2013, Gaia is destined to create the most accurate map yet of the Milky Way. By making accurate measurements of the positions and motions of stars in the Milky Way, it will answer questions about the origin and evolution of our home galaxy. The first data release (2016) contains three-dimensional positions and two-dimensional motions of a subset of two million stars. The second data release (2018) increases that number to over 1.6 billion. Gaia’s measurements are as precise as planned, paving the way to a better understanding of our galaxy and its neighborhood. The AIP hosts the Gaia data as one of the external data centers, alongside the main Gaia archive maintained by ESAC, and provides access to the Gaia data releases as part of the Gaia Data Processing and Analysis Consortium (DPAC).
The Universidad del Rosario Research data repository is an institutional initiative launched in 2019 to preserve, provide access to, and promote the use of data resulting from Universidad del Rosario research projects. The Repository aims to consolidate an online, collaborative working space and data-sharing platform to support Universidad del Rosario researchers and their collaborators, and to ensure that research data is available to the community, in order to support further research and contribute to the democratization of knowledge. The Research data repository is the heart of an institutional strategy that seeks to ensure the generation of Findable, Accessible, Interoperable and Reusable (FAIR) data, with the aim of increasing its impact and visibility. This strategy follows the international philosophy of making research data “as open as possible and as closed as necessary”, in order to foster the expansion, valuation, acceleration and reusability of scientific research while safeguarding the privacy of the subjects. The platform stores, preserves and facilitates the management of research data from all disciplines, generated by the researchers of all the schools and faculties of the University, who work together to ensure research with the highest standards of quality and scientific integrity, encouraging innovation for the benefit of society.
The project was set up to improve the infrastructure for text-based linguistic research and development by building a huge, automatically annotated German text corpus and the corresponding tools for corpus annotation and exploitation. DeReKo constitutes the largest linguistically motivated collection of contemporary German texts; contains fictional, scientific and newspaper texts, as well as several other text types; contains only licensed texts; is encoded with rich meta-textual information; is fully annotated morphosyntactically (three concurrent annotations); is continually expanded, with a focus on size and stratification of data; may be analyzed free of charge via the query system COSMAS II; and serves as a 'primordial sample' from which users may draw specialized sub-samples (so-called 'virtual corpora') to represent the language domain they wish to investigate. !!! Access to data of Das Deutsche Referenzkorpus is also provided by: IDS Repository https://www.re3data.org/repository/r3d100010382 !!!
The UK Data Service is a national data service funded by the ESRC to provide research access to the UK’s largest collection of social, economic and population data including UK government-sponsored surveys, cross-national surveys, longitudinal studies, UK census data, international aggregate, business data, and qualitative data. Designed to meet the data needs of researchers, students and teachers from all sectors, including academia, central and local government, charities and foundations, independent research centres, think tanks, business consultants and analysts, communities and the commercial sector, the UK Data Service provides access to high-quality social and economic data; support for policy-relevant research; guidance and training for the development of skills in data use, and the development of best practice in digital preservation and sharing. Data users can browse collections online and register to analyse and download them. Open Data collections are available for anyone to use. Key partners include JISC, the University of Manchester, University of Edinburgh and University College London (UCL). The lead partner is the UK Data Archive (https://service.re3data.org/repository/r3d100010215) based at the University of Essex, a Trusted Digital Repository (TDR) certified against the CoreTrustSeal (https://www.coretrustseal.org/) and certified against ISO27001 for Information Security (https://www.iso.org/standard/27001). The UK Data Service replaces the earlier ESRC investments of the Economic and Social Data Service (ESDS), the Secure Data Service (SDS), the Survey Question Bank and elements of the ESRC Census Programme.
CLARIN.SI is the Slovenian node of the European CLARIN (Common Language Resources and Technology Infrastructure) Centers. The CLARIN.SI repository is hosted at the Jožef Stefan Institute and offers long-term preservation of deposited linguistic resources, along with their descriptive metadata. The integration of the repository with the CLARIN infrastructure gives the deposited resources wide exposure, so that they can be known, used and further developed beyond the lifetime of the projects in which they were produced. Among the resources currently available in the CLARIN.SI repository are the multilingual MULTEXT-East resources, the CC version of Slovenian reference corpus Gigafida, the morphological lexicon Sloleks, the IMP corpora and lexicons of historical Slovenian, as well as many other resources for a variety of languages. Furthermore, several REST-based web services are provided for different corpus-linguistic and NLP tasks.