  • * at the end of a keyword allows wildcard searches
  • " quotes can be used for searching phrases
  • + represents an AND search (default)
  • | represents an OR search
  • - represents a NOT operation
  • ( and ) imply priority
  • ~N after a word specifies the desired edit distance (fuzziness)
  • ~N after a phrase specifies the desired slop amount
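The operators listed above can be combined into a single query string. The following is a minimal sketch in Python; all helper names (`phrase`, `prefix`, `fuzzy`, `near`, `any_of`, `all_of`, `exclude`) are hypothetical and only build text for the search box, they do not call any API. The exact spacing the search field accepts is an assumption.

```python
# Hypothetical helpers that assemble a query string from the operators
# described in the list above. They only produce text; nothing is sent.

def phrase(text: str) -> str:
    """Wrap text in quotes to search for it as a phrase."""
    return f'"{text}"'

def prefix(stem: str) -> str:
    """Append * to allow a wildcard search on the end of a keyword."""
    return f"{stem}*"

def fuzzy(word: str, distance: int) -> str:
    """Append ~N to a word to set the desired edit distance."""
    return f"{word}~{distance}"

def near(text: str, slop: int) -> str:
    """Append ~N to a quoted phrase to set the desired slop amount."""
    return f'"{text}"~{slop}'

def any_of(*terms: str) -> str:
    """Join terms with | (OR) and group them with parentheses."""
    return "(" + " | ".join(terms) + ")"

def all_of(*terms: str) -> str:
    """Join terms with + (AND, which is also the default)."""
    return " + ".join(terms)

def exclude(term: str) -> str:
    """Prefix a term with - (NOT)."""
    return f"-{term}"

# Example: the phrase must match, plus either a wildcard or a fuzzy term,
# while results mentioning "speech" are excluded.
query = all_of(
    phrase("language resources"),
    any_of(prefix("treebank"), fuzzy("corpus", 1)),
)
query = f"{query} {exclude('speech')}"
print(query)
```

The resulting string can be pasted into the search field; grouping with parentheses keeps the OR alternatives from binding to the surrounding AND terms.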
Found 28 result(s)
META-SHARE, the open language resource exchange facility, is devoted to the sustainable sharing and dissemination of language resources (LRs) and aims at increasing access to such resources on a global scale. META-SHARE is an open, integrated, secure and interoperable sharing and exchange facility for LRs (datasets and tools) for the Human Language Technologies domain and other applicative domains where language plays a critical role. META-SHARE is implemented in the framework of the META-NET Network of Excellence. It is designed as a network of distributed repositories of LRs, including language data and basic language processing tools (e.g., morphological analysers, PoS taggers, speech recognisers, etc.). Data and tools can be open or access-restricted, and free or for a fee.
The Eurac Research CLARIN Centre (ERCC) is a dedicated repository for language data. It is hosted by the Institute for Applied Linguistics (IAL) at Eurac Research, a private research centre based in Bolzano, South Tyrol. The Centre is part of the Europe-wide CLARIN infrastructure, which means that it follows well-defined international standards for (meta)data and procedures and is well-embedded in the wider European Linguistics infrastructure. The repository hosts data collected at the IAL, but is also open for data deposits from external collaborators.
EBRAINS offers one of the most comprehensive platforms for sharing brain research data ranging in type as well as spatial and temporal scale. We provide the guidance and tools needed to overcome the hurdles associated with sharing data. The EBRAINS data curation service ensures that your dataset will be shared with maximum impact, visibility, reusability, and longevity (see https://ebrains.eu/services/data-knowledge/share-data). Find data, the user interface of the EBRAINS Knowledge Graph, allows you to easily find data of interest. EBRAINS hosts a wide range of data types and models from different species. All data are well described and can be accessed immediately for further analysis.
The Infrared Space Observatory (ISO) is designed to provide detailed infrared properties of selected Galactic and extragalactic sources. The sensitivity of the telescopic system is about one thousand times superior to that of the Infrared Astronomical Satellite (IRAS), since the ISO telescope enables integration of infrared flux from a source for several hours. Density waves in the interstellar medium, their role in star formation, the giant planets, asteroids, and comets of the solar system are among the objects of investigation. ISO was operated as an observatory with the majority of its observing time being distributed to the general astronomical community. One of the consequences of this is that the data set is not homogeneous, as would be expected from a survey. The observational data underwent sophisticated data processing, including validation and accuracy analysis. In total, the ISO Data Archive contains about 30,000 standard observations, 120,000 parallel, serendipity and calibration observations and 17,000 engineering measurements. In addition to the observational data products, the archive also contains satellite data, documentation, historical material and externally derived products, for a total of more than 400 GBytes stored on magnetic disks. The ISO Data Archive is constantly being improved both in contents and functionality throughout the Active Archive Phase, ending in December 2006.
The CLARIN-D Centre CEDIFOR provides a repository for long-term storage of resources and metadata. Resources hosted in the repository stem from the research of CEDIFOR members as well as associated research projects. This includes software and web services as well as corpora of text, lexicons, images and other data.
MODES focuses on the representation of the inertio-gravity circulation in numerical weather prediction models, reanalyses, ensemble prediction systems and climate simulations. The project methodology relies on the decomposition of global circulation in terms of 3D orthogonal normal-mode functions. It allows quantification of the role of inertio-gravity waves in atmospheric variability across the whole spectrum of resolved spatial and temporal scales. MODES is compiled using gfortran, although other options have been successfully tested. The application requires the netcdf and (optionally) grib-api libraries.
CLARIN-LV is a national node of CLARIN ERIC (Common Language Resources and Technology Infrastructure). The mission of the repository is to ensure the availability and long-term preservation of language resources. The data stored in the repository are being actively used and cited in scientific publications.
The Extreme Light Infrastructure (ELI) is the world's most advanced laser-based research infrastructure. The ELI Facilities provide access to a broad range of world-class high-power, high repetition-rate laser systems and secondary sources. This enables cutting-edge research and new regimes of high intensity physics in physical, chemical, medical, and materials sciences.
SeaDataNet is a standardized system for managing the large and diverse data sets collected by the oceanographic fleets and the automatic observation systems. The SeaDataNet infrastructure networks and enhances the currently existing infrastructures, which are the national oceanographic data centres of 35 countries, active in data collection. The networking of these professional data centres in a unique virtual data management system provides integrated data sets of standardized quality on-line. As a research infrastructure, SeaDataNet contributes to building research excellence in Europe.
The main goal of the ECCAD project is to provide scientific and policy users with datasets of surface emissions of atmospheric compounds, and ancillary data, i.e. data required to estimate or quantify surface emissions. The supply of ancillary data - such as maps of population density, maps of fire spots, burnt areas, land cover - could help improve and encourage the development of new emissions datasets. ECCAD offers:
  • Access to global and regional emission inventories and ancillary data, in a standardized format
  • Quick visualization of emission and ancillary data
  • Rationalization of the use of input data in algorithms or emission models
  • Analysis and comparison of emissions datasets and ancillary data
  • Tools for the evaluation of emissions and ancillary data
ECCAD is a dynamic and interactive database, providing the most up-to-date datasets, including data used within ongoing projects. Users are welcome to add their own datasets, or have their regional masks included in order to use ECCAD tools.
The Language Archive at the Max Planck Institute in Nijmegen provides a unique record of how people around the world use language in everyday life. It focuses on collecting spoken and signed language materials in audio and video form along with transcriptions, analyses, annotations and other types of relevant material (e.g. photos, accompanying notes).
EUMETSAT's primary objective is to establish, maintain and exploit European systems of operational meteorological satellites. EUMETSAT is responsible for the launch and operation of the satellites and for delivering satellite data to end-users as well as contributing to the operational monitoring of climate and the detection of global climate changes. The EUMETSAT Product Navigator is the catalogue for all EUMETSAT data and products.
The goal of the Center of Estonian Language Resources (CELR) is to create and manage an infrastructure that makes Estonian digital language resources (dictionaries, text and speech corpora, various language databases) and language technology tools (software) available to everyone working with digital language materials. CELR coordinates and organises the documentation and archiving of the resources, develops language technology standards, and draws up the necessary legal contracts and licences for different types of users (public, academic, commercial, etc.). In addition to collecting language resources, a system will be launched for introducing the resources to potential users and for informing and educating them. The main users of CELR are researchers from Estonian R&D institutions and Social Sciences and Humanities researchers all over the world via the CLARIN ERIC network of similar centers in Europe. Access to data is provided through different sites: Public Repository https://entu.keeleressursid.ee/public-document, Language resources https://keeleressursid.ee/en/resources/corpora, and MetaShare CELR https://metashare.ut.ee/
The FAIRDOMHub is built upon the SEEK software suite, which is an open source web platform for sharing scientific research assets, processes and outcomes. FAIRDOM will establish a support and service network for European Systems Biology. It will serve projects in standardizing, managing and disseminating data and models in a FAIR manner: Findable, Accessible, Interoperable and Reusable. FAIRDOM is an initiative to develop a community and establish an internationally sustained Data and Model Management service for the European Systems Biology community. FAIRDOM is a joint action of ERA-Net EraSysAPP and European Research Infrastructure ISBE.
The ILC-CNR for CLARIN-IT repository is a library for linguistic data and tools. It covers the following areas: Text Processing and Computational Philology; Natural Language Processing and Knowledge Extraction; Resources, Standards and Infrastructures; Computational Models of Language Usage. The studies carried out within each area are highly interdisciplinary and involve different professional skills and expertise that extend across the disciplines of Linguistics, Computational Linguistics, Computer Science and Bio-Engineering.
Europeana is the trusted source of cultural heritage brought to you by the Europeana Foundation and a large number of European cultural institutions, projects and partners. It’s a real piece of teamwork. Ideas and inspiration can be found within the millions of items on Europeana. These objects include:
  • Images - paintings, drawings, maps, photos and pictures of museum objects
  • Texts - books, newspapers, letters, diaries and archival papers
  • Sounds - music and spoken word from cylinders, tapes, discs and radio broadcasts
  • Videos - films, newsreels and TV broadcasts
All texts are CC BY-SA; images and media are licensed individually.
The CLARINO Bergen Center repository is the repository of CLARINO, the Norwegian infrastructure project. Its goal is to implement the Norwegian part of CLARIN. The ultimate aim is to make existing and future language resources easily accessible for researchers and to bring eScience to humanities disciplines. The repository includes INESS, the Norwegian Infrastructure for the Exploration of Syntax and Semantics. This infrastructure provides access to treebanks, which are databases of syntactically and semantically annotated sentences.
Polish CLARIN node – CLARIN-PL Language Technology Centre – is being built at Wrocław University of Technology. The LTC is addressed to scholars in the humanities and social sciences. Registered users are granted free access to digital language resources and advanced tools to explore them. They can also archive and share their own language data (in written, spoken, video or multimodal form).
CLARIN-UK is a consortium of centres of expertise involved in research and resource creation involving digital language data and tools. The consortium includes the national library, and academic departments and university centres in linguistics, languages, literature and computer science.
The GRSF, the Global Record of Stocks and Fisheries, integrates data from three authoritative sources: FIRMS (Fisheries and Resources Monitoring System), RAM (RAM Legacy Stock Assessment Database) and FishSource (Program of the Sustainable Fisheries Partnership). The GRSF content publicly disseminated through this catalogue is distributed as a beta version to test the logic used to generate unique identifiers for stocks and fisheries. Access to and review of collated stock and fishery data is restricted to selected users. This beta release can contain errors, and we welcome feedback on content and software performance, as well as the overall usability. Beta users are advised that information on this site is provided on an "as is" and "as available" basis. The accuracy, completeness or authenticity of the information on the GRSF catalogue is not guaranteed. The GRSF reserves the right to alter, limit or discontinue any part of this service at its discretion. Under no circumstances shall the GRSF be liable for any loss, damage, liability or expense suffered that is claimed to result from the use of information posted on this site, including without limitation, any fault, error, omission, interruption or delay. The GRSF is an active database; updates and additions will continue after the beta release. For further information, or for using the GRSF unique identifiers as a beta tester, please contact FIRMS-Secretariat@fao.org.
OpenML is an open ecosystem for machine learning. By organizing all resources and results online, research becomes more efficient, useful and fun. OpenML is a platform to share detailed experimental results with the community at large and organize them for future reuse. Moreover, it will be directly integrated in today’s most popular data mining tools (for now: R, KNIME, RapidMiner and WEKA). Such an easy and free exchange of experiments has tremendous potential to speed up machine learning research, to engender larger, more detailed studies and to offer accurate advice to practitioners. Finally, it will also be a valuable resource for education in machine learning and data mining.
The repository is part of the National Research Data Infrastructure initiative Text+, in which the University of Tübingen is a partner. It is housed at the Department of General and Computational Linguistics. The infrastructure is maintained in close cooperation with the Digital Humanities Centre, which is a core facility of the university, collaborating with the library and computing center of the university. Integration of the repository into the national CLARIN-D and international CLARIN infrastructures gives it wide exposure, increasing the likelihood that the resources will be used and further developed beyond the lifetime of the projects in which they were developed. Among the resources currently available in the Tübingen Center Repository, researchers can find widely used treebanks of German (e.g. TüBa-D/Z), the German wordnet (GermaNet), the first manually annotated digital treebank (Index Thomisticus), as well as descriptions of the tools used by the WebLicht ecosystem for natural language processing.