Filter
Reset all

Subjects

Content Types

Countries

AID systems

API

Certificates

Data access

Data access restrictions

Database access

Database licenses

Data licenses

Data upload

Data upload restrictions

Enhanced publication

Institution responsibility type

Institution type

Keywords

Metadata standards

PID systems

Provider types

Quality management

Repository languages

Software

Syndications

Repository types

Versioning

  • * at the end of a keyword allows wildcard searches
  • " quotes can be used for searching phrases
  • + represents an AND search (default)
  • | represents an OR search
  • - represents a NOT operation
  • ( and ) implies priority
  • ~N after a word specifies the desired edit distance (fuzziness)
  • ~N after a phrase specifies the desired slop amount
Found 32 result(s)
Språkbanken was established in 1975 as a national center located in the Faculty of Arts, University of Gothenburg. Allén's groundbreaking corpus linguistic research resulted in the creation of one of the first large electronic text corpora in another language than English, with one million words of newspaper text. The task of Språkbanken is to collect, develop, and store (Swedish) text corpora, and to make linguistic data extracted from the corpora available to researchers and to the public.
Country
Lithuanian Data Archive for Social Sciences and Humanities (LiDA) is a virtual digital infrastructure for SSH data and research resources acquisition, long-term preservation and dissemination. All the data and research resources are documented in both English and Lithuanian according to international standards. Access to the resources is provided via Dataverse repository. LiDA curates different types of resources and they are published into catalogues according to the type: Survey Data, Aggregated Data (including Historical Statistics), Encoded Data (including News Media Studies), and Textual Data. Also, LiDA holds collections of social sciences and humanities data deposited by Lithuanian science and higher education institutions and Lithuanian state institutions (Data of Other Institutions). LiDA is hosted by the Centre for Data Analysis and Archiving of Kaunas University of Technology (data.ktu.edu).
CLARIN is a European Research Infrastructure for the Humanities and Social Sciences, focusing on language resources (data and tools). It is being implemented and constantly improved at leading institutions in a large and growing number of European countries, aiming at improving Europe's multi-linguality competence. CLARIN provides several services, such as access to language data and tools to analyze data, and offers to deposit research data, as well as direct access to knowledge about relevant topics in relation to (research on and with) language resources. The main tool is the 'Virtual Language Observatory' providing metadata and access to the different national CLARIN centers and their data.
Country
Agri-environmental Research Data collection is part of the University of Guelph Research Data Repositories (University of Guelph Dataverse). The purpose of this collection is to provide access to, and long-term stewardship of, agricultural and environmental data created at or in cooperation with the University of Guelph.
The CLARIN­/Text+ repository at the Saxon Academy of Sciences and Humanities in Leipzig offers long­term preservation of digital resources, along with their descriptive metadata. The mission of the repository is to ensure the availability and long­term preservation of resources, to preserve knowledge gained in research, to aid the transfer of knowledge into new contexts, and to integrate new methods and resources into university curricula. Among the resources currently available in the Leipzig repository are a set of corpora of the Leipzig Corpora Collection (LCC), based on newspaper, Wikipedia and Web text. Furthermore several REST-based webservices are provided for a variety of different NLP-relevant tasks The repository is part of the CLARIN infrastructure and part of the NFDI consortium Text+. It is operated by the Saxon Academy of Sciences and Humanities in Leipzig.
UCLA Library is adopting Dataverse, the open source web application designed for sharing, preserving and using research data. UCLA Dataverse will allow data, text, software, scripts, data visualizations, etc., created from research projects at UCLA to be made publicly available, widely discoverable, linkable, and ultimately, reusable
Country
MyTardis began at Monash University to solve the problem of users needing to store large datasets and share them with collaborators online. Its particular focus is on integration with scientific instruments, instrument facilities and research lab file storage. Our belief is that the less effort a researcher has to expend safely storing data, the more likely they are to do so. This approach has flourished with MyTardis capturing data from areas such as protein crystallography, electron microscopy, medical imaging and proteomics and with deployments at Australian institutions such as University of Queensland, RMIT, University of Sydney and the Australian Synchrotron. Data access via https://www.massive.org.au/ and https://store.erc.monash.edu.au/experiment/view/104/ and see 'remarks'.
The TextGrid Repository is a digital preservation archive for human sciences research data. It offers an extensive searchable and adaptable corpus of XML/TEI encoded texts, pictures and databases. Amongst the continuously growing corpus is the Digital Library of TextGrid, which consists of works of more than 600 authors of fiction (prose verse and drama) as well as nonfiction from the beginning of the printing press to the early 20th century written in or translated into German. The files are saved in different output formats (XML, ePub, PDF), published and made searchable. Different tools e.g. viewing or quantitative text-analysis tools can be used for visualization or to further research the text. The TextGrid Repository is part of the virtual research environment TextGrid, which besides offering digital preservation also offers open-source software for collaborative creations and publications of e.g. digital editions that are based on XML/TEI.
The Polinsky Language Sciences Lab at Harvard University is a linguistics lab that examines questions of language structure and its effect on the ways in which people use and process language in real time. We engage in linguistic and interdisciplinary research projects ourselves; offer linguistic research capabilities for undergraduate and graduate students, faculty, and visitors; and build relationships with the linguistic communities in which we do our research. We are interested in a broad range of issues pertaining to syntax, interfaces, and cross-linguistic variation. We place a particular emphasis on novel experimental evidence that facilitates the construction of linguistic theory. We have a strong cross-linguistic focus, drawing upon English, Russian, Chinese, Korean, Mayan languages, Basque, Austronesian languages, languages of the Caucasus, and others. We believe that challenging existing theories with data from as broad a range of languages as possible is a crucial component of the successful development of linguistic theory. We investigate both fluent speakers and heritage speakers—those who grew up hearing or speaking a particular language but who are now more fluent in a different, societally dominant language. Heritage languages, a novel field of linguistic inquiry, are important because they provide new insights into processes of linguistic development and attrition in general, thus increasing our understanding of the human capacity to maintain and acquire language. Understanding language use and processing in real time and how children acquire language helps us improve language study and pedagogy, which in turn improves communication across the globe. Although our lab does not specialize in language acquisition, we have conducted some studies of acquisition of lesser-studied languages and heritage languages, with the purpose of comparing heritage speakers to adults.
ICRISAT performs crop improvement research, using conventional as well as methods derived from biotechnology, on the following crops: Chickpea, Pigeonpea, Groundnut, Pearl millet,Sorghum and Small millets. ICRISAT's data repository collects, preserves and facilitates access to the datasets produced by ICRISAT researchers to all users who are interested in. Data includes Phenotypic, Genotypic, Social Science, and Spatial data, Soil and Weather.
eLaborate is an online work environment in which scholars can upload scans, transcribe and annotate text, and publish the results as on online text edition which is freely available to all users. Short information about and a link to already published editions is presented on the page Editions under Published. Information about editions currently being prepared is posted on the page Ongoing projects. The eLaborate work environment for the creation and publication of online digital editions is developed by the Huygens Institute for the History of the Netherlands of the Royal Netherlands Academy of Arts and Sciences. Although the institute considers itself primarily a research facility and does not maintain a public collection profile, Huygens ING actively maintains almost 200 digitally available resource collections.
The mission of World Data Center for Climate (WDCC) is to provide central support for the German and European climate research community. The WDCC is member of the ISC's World Data System. Emphasis is on development and implementation of best practice methods for Earth System data management. Data for and from climate research are collected, stored and disseminated. The WDCC is restricted to data products. Cooperations exist with thematically corresponding data centres of, e.g., earth observation, meteorology, oceanography, paleo climate and environmental sciences. The services of WDCC are also available to external users at cost price. A special service for the direct integration of research data in scientific publications has been developed. The editorial process at WDCC ensures the quality of metadata and research data in collaboration with the data producers. A citation code and a digital identifier (DOI) are provided and registered together with citation information at the DOI registration agency DataCite.
The Language Archive Cologne (LAC) is a research data repository for the linguistics and all humanities disciplines working with audiovisual data. The archive forms a cluster of the Data Center for Humanities in cooperation with the Institute of Linguistics of the University of Cologne. The LAC is an archive for language resources, which is freely available via a web-based access. In addition, concrete technical and methodological advice is offered in the research data cycle - from the collection of the data, their preparation and archiving, to publication and reuse.
Country
ArkeoGIS is a unified scientific data publishing platform. It is a multilingual Geographic Information System (GIS), initially developed in order to mutualize archaeological and paleoenvironmental data of the Rhine Valley. Today, it allows the pooling of spatialized scientific data concerning the past, from prehistory to the present day. The databases come from the work of institutional researchers, doctoral students, master students, private companies and archaeological services. They are stored on the TGIR Huma-Num service grid and archived as part of the Huma-Num/CINES long-term archiving service. Because of their sensitive nature, which could lead to the looting of archaeological deposits, access to the tool is reserved to archaeological professionals, from research institutions or non-profit organizations. Each user can query online all or part of the available databases and export the results of his query to other tools.
Country
ISIDORE is a international search engine and a discovery platform for open science allowing the access to digital materials from social sciences and humanities (SSH). Open to all and especially to teachers, researchers, PhD students, and students, it relies on the principles of Web of data and provides access to data in free access (open access). By its vocation, ISIDORE will foster access to open access data produced by research and higher education institutions, laboratories and research teams: digital publication, documentary databases, digitized collections of research libraries, research notebooks and scientific event announcements. ISIDORE collects, enriches and highlights digital data and documents from the Humanities and Social Sciences while providing unified access to them. More information see: https://isidore.science/about
Country
QSAR DataBank (QsarDB) is repository for (Quantitative) Structure-Activity Relationships ((Q)SAR) data and models. It also provides open domain-specific digital data exchange standards and associated tools that enable research groups, project teams and institutions to share and represent predictive in silico models.
The Abacus Data Network is a data repository collaboration involving Libraries at Simon Fraser University (SFU), the University of British Columbia (UBC), the University of Northern British Columbia (UNBC) and the University of Victoria (UVic).
The CESSDA Data Catalogue contains the metadata of all data in the holdings of CESSDA service providers. It is a one-stop-shop for search and discovery, enabling effective access to European research data for researchers. Details of over 40, 000 data collections are listed. These are harvested from fifteen different CESSDA Service Providers.
The Language Bank features text and speech corpora with different kinds of annotations in over 60 languages. There is also a selection of tools for working with them, from linguistic analyzers to programming environments. Corpora are also available via web interfaces, and users can be allowed to download some of them. The IP holders can monitor the use of their resources and view user statistics.
The Social Science Data Archive is still active and maintained as part of the UCLA Library Data Science Center. SSDA Dataverse is one of the archiving opportunities of SSDA, the others are: Data can be archived by SSDA itself or by ICPSR or by UCLA Library or by California Digital Library. The Social Science Data Archives serves the UCLA campus as an archive of faculty and graduate student survey research. We provide long term storage of data files and documentation. We ensure that the data are useable in the future by migrating files to new operating systems. We follow government standards and archival best practices. The mission of the Social Science Data Archive has been and continues to be to provide a foundation for social science research with faculty support throughout an entire research project involving original data collection or the reuse of publicly available studies. Data Archive staff and researchers work as partners throughout all stages of the research process, beginning when a hypothesis or area of study is being developed, during grant and funding activities, while data collection and/or analysis is ongoing, and finally in long term preservation of research results. Our role is to provide a collaborative environment where the focus is on understanding the nature and scope of research approach and management of research output throughout the entire life cycle of the project. Instructional support, especially support that links research with instruction is also a mainstay of operations.
The gift of the Stowell Datasets, a digital archive of psychographic data, to the College of Liberal Arts (and continued gift of new datasets) provide a unique opportunity for WSU to facilitate access to a valuable research resource. The datasets include over 350 individual major media market surveys (CATI, Random Digit Dialing telephone surveys) collected over the period 1989-2001 and feature approximately n=1,000+ respondents for each market for each year.
Country
NAKALA is a repository dedicated to SSH research data in France. Given its generalist and multi-disciplinary nature, all types of data are accepted, although certain formats are recommended to ensure longterm data preservation. It has been developed and is hosted by Huma-Num, the French national research infrastructure for digital humanities.