  • * at the end of a keyword allows wildcard searches
  • " quotes can be used for searching phrases
  • + represents an AND search (default)
  • | represents an OR search
  • - represents a NOT operation
  • ( and ) implies priority
  • ~N after a word specifies the desired edit distance (fuzziness)
  • ~N after a phrase specifies the desired slop amount
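The operators above can be combined into a single query string. The following is a minimal sketch in Python of how such a string might be assembled. The helper names (`phrase`, `prefix`, `fuzzy`, `all_of`, `any_of`, `exclude`) are hypothetical and not part of any registry API, and the example search terms are chosen only for illustration. The code builds the text you would type into the search box and sends no request.

```python
from typing import Optional


def phrase(text: str, slop: Optional[int] = None) -> str:
    """Wrap text in quotes for a phrase search; ~N adds a slop amount."""
    quoted = f'"{text}"'
    return quoted if slop is None else f"{quoted}~{slop}"


def prefix(stem: str) -> str:
    """Append * so the keyword matches as a wildcard search."""
    return f"{stem}*"


def fuzzy(word: str, distance: int) -> str:
    """Append ~N to a single word to set the edit distance (fuzziness)."""
    return f"{word}~{distance}"


def exclude(term: str) -> str:
    """Prefix a term with - for a NOT operation."""
    return f"-{term}"


def all_of(*terms: str) -> str:
    """Join terms with + (AND) and group them with parentheses."""
    return "(" + " + ".join(terms) + ")"


def any_of(*terms: str) -> str:
    """Join terms with | (OR) and group them with parentheses."""
    return "(" + " | ".join(terms) + ")"


# Hypothetical example: records about linguistics or "language data",
# tolerating one typo in "corpus", and excluding "software".
query = all_of(
    any_of(prefix("linguist"), phrase("language data")),
    fuzzy("corpus", 1),
    exclude("software"),
)
print(query)
```

Because AND is the default, the explicit `+` is optional. It is kept here so that the grouping stays visible when the string is read back.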
Found 27 result(s)
The Eurac Research CLARIN Centre (ERCC) is a dedicated repository for language data. It is hosted by the Institute for Applied Linguistics (IAL) at Eurac Research, a private research centre based in Bolzano, South Tyrol. The Centre is part of the Europe-wide CLARIN infrastructure, which means that it follows well-defined international standards for (meta)data and procedures and is well-embedded in the wider European Linguistics infrastructure. The repository hosts data collected at the IAL, but is also open for data deposits from external collaborators.
CLARIN.SI is the Slovenian node of the European CLARIN (Common Language Resources and Technology Infrastructure) Centers. The CLARIN.SI repository is hosted at the Jožef Stefan Institute and offers long-term preservation of deposited linguistic resources, along with their descriptive metadata. The integration of the repository with the CLARIN infrastructure gives the deposited resources wide exposure, so that they can be known, used and further developed beyond the lifetime of the projects in which they were produced. Among the resources currently available in the CLARIN.SI repository are the multilingual MULTEXT-East resources, the CC version of the Slovenian reference corpus Gigafida, the morphological lexicon Sloleks, the IMP corpora and lexicons of historical Slovenian, as well as many other resources for a variety of languages. Furthermore, several REST-based web services are provided for different corpus-linguistic and NLP tasks.
The UCI Machine Learning Repository is a collection of databases, domain theories, and data generators that are used by the machine learning community for the empirical analysis of machine learning algorithms. It is used by students, educators, and researchers all over the world as a primary source of machine learning data sets. As an indication of the impact of the archive, it has been cited over 1000 times.
The ILC-CNR for CLARIN-IT repository is a library of linguistic data and tools. It covers the following areas: Text Processing and Computational Philology; Natural Language Processing and Knowledge Extraction; Resources, Standards and Infrastructures; Computational Models of Language Usage. The studies carried out within each area are highly interdisciplinary and involve professional skills and expertise that extend across the disciplines of Linguistics, Computational Linguistics, Computer Science and Bio-Engineering.
The University of Cape Town (UCT) uses Figshare for institutions for its data repository, which was launched in 2017 and is called ZivaHub: Open Data UCT. ZivaHub serves principal investigators at the University of Cape Town who need a repository to store and openly disseminate the data that support their published research findings. The repository service is provided in terms of the UCT Research Data Management Policy. It provides open access to supplementary research data files and links to their respective scholarly publications (e.g. theses, dissertations, papers) hosted on other platforms, such as OpenUCT.
This repository stores and links openly available power-grid frequency recordings from across the globe. The database comprises open data from three sources: TSO data (recordings made public by Transmission System Operators); research projects (open-data database research projects); and independent gatherings (industrial, private, or personal recordings that were made publicly available).
RUresearch Data Portal, a subset of RUcore (Rutgers University Community Repository), provides a platform for Rutgers researchers to share their research data and supplementary resources with the global scholarly community. This data portal leverages all the capabilities of RUcore with additional tools and services specific to research data. It organizes data into clusters by research genre, such as experimental data, multivariate data, discrete data, continuous data, and time series data, and offers a search facility across them. It also supports individual research portals, including the Video Mosaic Collaborative (VMC), an NSF-funded collection of mathematics education videos for teaching and research. Its mission is to maintain the significant intellectual property of Rutgers University and to provide open access and the greatest possible impact for digital data collections in a responsible manner, in order to promote research and learning.
DataON is Korea's National Research Data Platform. It provides integrated search of metadata for KISTI's research data and for domestic and international research data, with links to the raw data. DataON allows users (researchers, policy makers, etc.) to search easily for various types of research data in all scientific fields; to post and cite research data by registering research results; and to build a community among researchers and enable collaborative research. It also provides a data analysis environment that allows one-stop analysis of discovered research data.
Academic Torrents is a distributed data repository. The academic torrents network is built for researchers, by researchers. Its distributed peer-to-peer library system automatically replicates your datasets on many servers, so you don't have to worry about managing your own servers or file availability. Everyone who has data becomes a mirror for those data so the system is fault-tolerant.
An increasing number of Language Resources (LT) in the various fields of Human Language Technology (HLT) are distributed on behalf of ELRA via its operational body ELDA, thanks to the contribution of various players of the HLT community. Our aim is to provide Language Resources, by means of this repository, so as to prevent researchers and developers from investing efforts to rebuild resources which already exist as well as help them identify and access those resources.
Data products developed and distributed by the National Institute of Standards and Technology span multiple disciplines of research and are widely used in research and development programs by industry and academia. NIST's publicly available data sets showcase its commitment to providing accurate, well-curated measurements of physical properties, exemplified by the Standard Reference Data program, as well as its commitment to advancing basic research. In accordance with U.S. Government Open Data Policy and the NIST Plan for providing public access to the results of federally funded research data, NIST maintains a publicly accessible listing of available data, the NIST Public Dataset List (json). Additionally, these data are assigned a Digital Object Identifier (DOI) to improve discovery of and access to research output; these DOIs are registered with DataCite and provide globally unique persistent identifiers. The NIST Science Data Portal provides a user-friendly discovery and exploration tool for publicly available datasets at NIST. This portal is designed and developed with data.gov Project Open Data standards and principles. The portal software is hosted in the usnistgov GitHub repository.
Bitbucket is a web-based version control repository hosting service owned by Atlassian, for source code and development projects that use either Mercurial or Git revision control systems.
-----<<<<< The repository is no longer available. This record is out-dated. >>>>>----- GEON is an open collaborative project that is developing cyberinfrastructure for integration of 3- and 4-dimensional earth science data. GEON will develop services for data integration and model integration, and associated model execution and visualization. The Mid-Atlantic test bed will focus on tectonothermal, paleogeographic, and biotic history from the late Proterozoic to mid-Paleozoic. The Rockies test bed will focus on integration of data with dynamic models, to better understand deformation history. GEON will develop the most comprehensive regional datasets in test bed areas.
The Repository stores in digital format all the academic and scientific documentation (Theses, Articles, Papers) generated by the institution. Its main objective is to promote open access to the scientific and technological output of the Institution. It is organized by collections: Thesis and Final Works, Research, Institutional History and Photographic Archive.
The Registry of Open Data on AWS provides a centralized repository of public data sets that can be seamlessly integrated into AWS cloud-based applications. AWS is hosting the public data sets at no charge to their users. Anyone can access these data sets from their Amazon Elastic Compute Cloud (Amazon EC2) instances and start computing on the data within minutes. Users can also leverage the entire AWS ecosystem and easily collaborate with other AWS users.
DataverseNO (https://dataverse.no) is a curated, FAIR-aligned national generic repository for open research data from all academic disciplines. DataverseNO is committed to keeping published data accessible and (re)usable in the long term. The repository is owned and operated by UiT The Arctic University of Norway. DataverseNO accepts submissions primarily from researchers at Norwegian research institutions. Datasets in DataverseNO are grouped into institutional collections as well as special collections. The technical infrastructure of the repository is based on the open source application Dataverse (https://dataverse.org), which is developed by an international developer and user community led by Harvard University.