Filter
Reset all

Subjects

Content Types

Countries

AID systems

API

Certificates

Data access

Data access restrictions

Database access

Database access restrictions

Database licenses

Data licenses

Data upload

Data upload restrictions

Enhanced publication

Institution responsibility type

Institution type

Keywords

Metadata standards

PID systems

Provider types

Quality management

Repository languages

Software

Syndications

Repository types

Versioning

  • * at the end of a keyword allows wildcard searches
  • " quotes can be used for searching phrases
  • + represents an AND search (default)
  • | represents an OR search
  • - represents a NOT operation
  • ( and ) implies priority
  • ~N after a word specifies the desired edit distance (fuzziness)
  • ~N after a phrase specifies the desired slop amount
Found 27 result(s)
Stanford Network Analysis Platform (SNAP) is a general purpose network analysis and graph mining library. It is written in C++ and easily scales to massive networks with hundreds of millions of nodes, and billions of edges. It efficiently manipulates large graphs, calculates structural properties, generates regular and random graphs, and supports attributes on nodes and edges. SNAP is also available through the NodeXL which is a graphical front-end that integrates network analysis into Microsoft Office and Excel. The SNAP library is being actively developed since 2004 and is organically growing as a result of our research pursuits in analysis of large social and information networks. Largest network we analyzed so far using the library was the Microsoft Instant Messenger network from 2006 with 240 million nodes and 1.3 billion edges. The datasets available on the website were mostly collected (scraped) for the purposes of our research. The website was launched in July 2009.
<<<!!!<<< CRAWDAD has moved to IEEE-Dataport https://www.re3data.org/repository/r3d100012569 The datasets in the Community Resource for Archiving Wireless Data at Dartmouth (CRAWDAD) repository are now hosted as the CRAWDAD Collection on IEEE Dataport. After nearly two decades as a stand-alone archive at crawdad.org, the migration of the collection to IEEE DataPort provides permanence and new visibility. >>>!!!>>>
CLARIN is a European Research Infrastructure for the Humanities and Social Sciences, focusing on language resources (data and tools). It is being implemented and constantly improved at leading institutions in a large and growing number of European countries, aiming at improving Europe's multi-linguality competence. CLARIN provides several services, such as access to language data and tools to analyze data, and offers to deposit research data, as well as direct access to knowledge about relevant topics in relation to (research on and with) language resources. The main tool is the 'Virtual Language Observatory' providing metadata and access to the different national CLARIN centers and their data.
nanoHUB.org is the premier place for computational nanotechnology research, education, and collaboration. Our site hosts a rapidly growing collection of Simulation Programs for nanoscale phenomena that run in the cloud and are accessible through a web browser. In addition to simulation devices, nanoHUB provides Online Presentations, Courses, Learning Modules, Podcasts, Animations, Teaching Materials, and more. These resources help users learn about our simulation programs and about nanotechnology in general. Our site offers researchers a venue to explore, collaborate, and publish content, as well. Much of these collaborative efforts occur via Workspaces and User groups.
Country
DataverseNO (https://dataverse.no) is a curated, FAIR-aligned national generic repository for open research data from all academic disciplines. DataverseNO commits to facilitate that published data remain accessible and (re)usable in a long-term perspective. The repository is owned and operated by UiT The Arctic University of Norway. DataverseNO accepts submissions from researchers primarily from Norwegian research institutions. Datasets in DataverseNO are grouped into institutional collections as well as special collections. The technical infrastructure of the repository is based on the open source application Dataverse (https://dataverse.org), which is developed by an international developer and user community led by Harvard University.
In order to meet the needs of research data management for Peking University. The PKU library cooperate with the NSFC-PKU data center for management science, PKU science and research department, PKU social sciences department to jointly launch the Peking University Open Research Data Platform. PKU Open research data provides preservation, management and distribution services for research data. It encourage data owner to share data and data users to reuse data.
The Information Marketplace for Policy and Analysis of Cyber-risk & Trust (IMPACT) program supports global cyber risk research & development by coordinating, enhancing and developing real world data, analytics and information sharing capabilities, tools, models, and methodologies. In order to accelerate solutions around cyber risk issues and infrastructure security, IMPACT makes these data sharing components broadly available as national and international resources to support the three-way partnership among cyber security researchers, technology developers and policymakers in academia, industry and the government.
Subject(s)
Country
Edmond is the institutional repository of the Max Planck Society for public research data. It enables Max Planck scientists to create citable scientific assets by describing, enriching, sharing, exposing, linking, publishing and archiving research data of all kinds. Further on, all objects within Edmond have a unique identifier and therefore can be clearly referenced in publications or reused in other contexts.
Kaggle is a platform for predictive modelling and analytics competitions in which statisticians and data miners compete to produce the best models for predicting and describing the datasets uploaded by companies and users. This crowdsourcing approach relies on the fact that there are countless strategies that can be applied to any predictive modelling task and it is impossible to know beforehand which technique or analyst will be most effective.
Brainlife promotes engagement and education in reproducible neuroscience. We do this by providing an online platform where users can publish code (Apps), Data, and make it "alive" by integragrate various HPC and cloud computing resources to run those Apps. Brainlife also provide mechanisms to publish all research assets associated with a scientific project (data and analyses) embedded in a cloud computing environment and referenced by a single digital-object-identifier (DOI). The platform is unique because of its focus on supporting scientific reproducibility beyond open code and open data, by providing fundamental smart mechanisms for what we refer to as “Open Services.”
Country
GnpIS is a multispecies integrative information system dedicated to plant and fungi pests. It bridges genetic and genomic data, allowing researchers access to both genetic information (e.g. genetic maps, quantitative trait loci, association genetics, markers, polymorphisms, germplasms, phenotypes and genotypes) and genomic data (e.g. genomic sequences, physical maps, genome annotation and expression data) for species of agronomical interest. GnpIS is used by both large international projects and plant science departments at the French National Research Institute for Agriculture, Food and Environment. It is regularly improved and released several times per year. GnpIS is accessible through a web portal and allows to browse different types of data either independently through dedicated interfaces or simultaneously using a quick search ('google like search') or advanced search (Biomart, Galaxy, Intermine) tools.
<<<!!!<<< All user content from this site has been deleted. Visit SeedMeLab (https://seedmelab.org/) project as a new option for data hosting. >>>!!!>>> SeedMe is a result of a decade of onerous experience in preparing and sharing visualization results from supercomputing simulations with many researchers at different geographic locations using different operating systems. It’s been a labor–intensive process, unsupported by useful tools and procedures for sharing information. SeedMe provides a secure and easy-to-use functionality for efficiently and conveniently sharing results that aims to create transformative impact across many scientific domains.
GigaDB primarily serves as a repository to host data and tools associated with articles published by GigaScience Press; GigaScience and GigaByte (both are online, open-access journals). GigaDB defines a dataset as a group of files (e.g., sequencing data, analyses, imaging files, software programs) that are related to and support a unit-of-work (article or study). GigaDB allows the integration of manuscript publication with supporting data and tools.
The ColabFit Exchange is an online resource for the discovery, exploration and submission of datasets for data-driven interatomic potential (DDIP) development for materials science and chemistry applications. ColabFit's goal is to increase the Findability, Accessibility, Interoperability, and Reusability (FAIR) of DDIP data by providing convenient access to well-curated and standardized first-principles and experimental datasets. Content on the ColabFit Exchange is open source and freely available.
ETH Data Archive is ETH Zurich's long-term preservation solution for digital information such as research data, digitised content, archival records, or images. It serves as the backbone of data curation and for most of its content, it is a “dark archive” without public access. In this capacity, the ETH Data Archive also archives the content of ETH Zurich’s Research Collection which is the primary repository for members of the university and the first point of contact for publication of data at ETH Zurich. All data that was produced in the context of research at the ETH Zurich, can be published and archived in the Research Collection. An automated connection to the ETH Data Archive in the background ensures the medium to long-term preservation of all publications and research data. Direct access to the ETH Data Archive is intended only for customers who need to deposit software source code within the framework of ETH transfer Software Registration. Open Source code packages and other content from legacy workflows can be accessed via ETH Library @ swisscovery (https://library.ethz.ch/en/).
cIRcle is an open access digital repository for published and unpublished material created by the UBC community and its partners. In BIRS there are thousands of mathematics videos, which are primary research data. Our repository is the largest source of mathematics data with more than 10TB of primary research by the best mathematicians in the world, coming from more than 600 institutions.
LINDAT/CLARIN is designed as a Czech “node” of Clarin ERIC (Common Language Resources and Technology Infrastructure). It also supports the goals of the META-NET language technology network. Both networks aim at collection, annotation, development and free sharing of language data and basic technologies between institutions and individuals both in science and in all types of research. The Clarin ERIC infrastructural project is more focused on humanities, while META-NET aims at the development of language technologies and applications. The data stored in the repository are already being used in scientific publications in the Czech Republic. In 2019 LINDAT/CLARIAH-CZ was established as a unification of two research infrastructures, LINDAT/CLARIN and DARIAH-CZ.
Country
TUL Open Research Data Repository (RDB.open) is a service addressed to the scientific and research community of the Lodz University of Technology. The main purpose of RDB.open is to collect, share and store the open research data, both during the research and after its completion, at least for the minimum period indicated by the funder or the scientists. The RDB.open is a place where research data can be openly shared, accessed and then reused by others.
OpenKIM is an online suite of open source tools for molecular simulation of materials. These tools help to make molecular simulation more accessible and more reliable. Within OpenKIM, you will find an online resource for standardized testing and long-term warehousing of interatomic models and data, and an application programming interface (API) standard for coupling atomistic simulation codes and interatomic potential subroutines.
The UCI Machine Learning Repository is a collection of databases, domain theories, and data generators that are used by the machine learning community for the empirical analysis of machine learning algorithms. It is used by students, educators, and researchers all over the world as a primary source of machine learning data sets. As an indication of the impact of the archive, it has been cited over 1000 times.
Polish CLARIN node – CLARIN-PL Language Technology Centre – is being built at Wrocław University of Technology. The LTC is addressed to scholars in the humanities and social sciences. Registered users are granted free access to digital language resources and advanced tools to explore them. They can also archive and share their own language data (in written, spoken, video or multimodal form).
CLARIN.SI is the Slovenian node of the European CLARIN (Common Language Resources and Technology Infrastructure) Centers. The CLARIN.SI repository is hosted at the Jožef Stefan Institute and offers long-term preservation of deposited linguistic resources, along with their descriptive metadata. The integration of the repository with the CLARIN infrastructure gives the deposited resources wide exposure, so that they can be known, used and further developed beyond the lifetime of the projects in which they were produced. Among the resources currently available in the CLARIN.SI repository are the multilingual MULTEXT-East resources, the CC version of Slovenian reference corpus Gigafida, the morphological lexicon Sloleks, the IMP corpora and lexicons of historical Slovenian, as well as many other resources for a variety of languages. Furthermore, several REST-based web services are provided for different corpus-linguistic and NLP tasks.
US Department of Energy’s Atmospheric Radiation Measurement (ARM) Data Center is a long-term archive and distribution facility for various ground-based, aerial and model data products in support of atmospheric and climate research. ARM facility currently operates over 400 instruments at various observatories (https://www.arm.gov/capabilities/observatories/). ARM Data Center (ADC) Archive currently holds over 11,000 data products with a total holding of over 3 petabytes of data that dates back to 1993, these include data from instruments, value added products, model outputs, field campaign and PI contributed data. The data center archive also includes data collected by ARM from related program (e.g., external data such as NASA satellite).