Search | re3data.org

Eurac Research CLARIN Centre

ERCC

The Eurac Research CLARIN Centre (ERCC) is a dedicated repository for language data. It is hosted by the Institute for Applied Linguistics (IAL) at Eurac Research, a private research centre based in Bolzano, South Tyrol. The Centre is part of the Europe-wide CLARIN infrastructure, which means that it follows well-defined international standards for (meta)data and procedures and is well-embedded in the wider European Linguistics infrastructure. The repository hosts data collected at the IAL, but is also open for data deposits from external collaborators.

Bitbucket

Atlassian Bitbucket

Country

United Kingdom

Bitbucket is a web-based version control repository hosting service owned by Atlassian, for source code and development projects that use either Mercurial or Git revision control systems.

Savannah

Country

International

Savannah hosts the majority of GNU software and some non-GNU software. Savannah's focus is on hosting for free software projects. To ensure that only free software is hosted, Savannah implements very strict hosting policies, including a ban against the use of non-free formats (such as Macromedia Flash).

Monash Bridges

monash.figshare (formerly)

Bridges is Monash University's repository for research data, collections, and research activity outputs. It is also the home of the University's online archive of PhD and Masters by Research theses.

melbourne.figshare.com

University of Melbourne data repository

melbourne.figshare.com is a specialised service that has been tailored to the specific needs and requirements of the University and our community of researchers. The service offered at the University is free to use, provides 100 GB of data storage, and stores all data on the University's storage system.

Kaggle

Your home for data science

Country

United States

Kaggle is a platform for predictive modelling and analytics competitions in which statisticians and data miners compete to produce the best models for predicting and describing the datasets uploaded by companies and users. This crowdsourcing approach relies on the fact that there are countless strategies that can be applied to any predictive modelling task and it is impossible to know beforehand which technique or analyst will be most effective.

brainlife

brainlife.io

Country

United States

Brainlife promotes engagement and education in reproducible neuroscience. We do this by providing an online platform where users can publish code (Apps) and data, and make them "alive" by integrating various HPC and cloud computing resources to run those Apps. Brainlife also provides mechanisms to publish all research assets associated with a scientific project (data and analyses) embedded in a cloud computing environment and referenced by a single digital object identifier (DOI). The platform is unique because of its focus on supporting scientific reproducibility beyond open code and open data, by providing fundamental smart mechanisms for what we refer to as “Open Services.”

ILC-CNR for CLARIN-IT repository

ILC4CLARIN

The ILC-CNR for CLARIN-IT repository is a library for linguistic data and tools. It covers the following areas: Text Processing and Computational Philology; Natural Language Processing and Knowledge Extraction; Resources, Standards and Infrastructures; and Computational Models of Language Usage. The studies carried out within each area are highly interdisciplinary and involve different professional skills and expertise that extend across the disciplines of Linguistics, Computational Linguistics, Computer Science and Bio-Engineering.

INESC TEC Research Data Repository

Country

Portugal

The INESC TEC data repository showcases datasets produced or used by INESC TEC researchers and their partners. The repository is organized in four groups (institutional clusters): Computer Science, Power and Energy, Network and Intelligent Systems and Power and Energy.

OpenML

Open Machine Learning

OpenML is an open ecosystem for machine learning. By organizing all resources and results online, research becomes more efficient, useful and fun. OpenML is a platform to share detailed experimental results with the community at large and organize them for future reuse. Moreover, it will be directly integrated in today’s most popular data mining tools (for now: R, KNIME, RapidMiner and WEKA). Such an easy and free exchange of experiments has tremendous potential to speed up machine learning research, to engender larger, more detailed studies and to offer accurate advice to practitioners. Finally, it will also be a valuable resource for education in machine learning and data mining.

Loughborough Research Repository

Loughborough University Research Repository

Country

United Kingdom

Loughborough Research Repository is the institutional repository of Loughborough University, powered by figshare.

CLARIN repository at the University of Tübingen

CLARIN Center Tübingen

The repository is part of the National Research Data Infrastructure initiative Text+, in which the University of Tübingen is a partner. It is housed at the Department of General and Computational Linguistics. The infrastructure is maintained in close cooperation with the Digital Humanities Centre, a core facility of the university that collaborates with the university's library and computing center. Integration of the repository into the national CLARIN-D and international CLARIN infrastructures gives it wide exposure, increasing the likelihood that the resources will be used and further developed beyond the lifetime of the projects in which they were developed. Among the resources currently available in the Tübingen Center Repository, researchers can find widely used treebanks of German (e.g. TüBa-D/Z), the German wordnet (GermaNet), the first manually annotated digital treebank (Index Thomisticus), as well as descriptions of the tools used by the WebLicht ecosystem for natural language processing.
