Content Types


AID systems



Data access

Data access restrictions

Database access

Database access restrictions

Database licenses

Data licenses

Data upload

Data upload restrictions

Enhanced publication

Institution responsibility type

Institution type


Metadata standards

PID systems

Provider types

Quality management

Repository languages



Repository types


  • * at the end of a keyword allows wildcard searches
  • " quotes can be used for searching phrases
  • + represents an AND search (default)
  • | represents an OR search
  • - represents a NOT operation
  • ( and ) implies priority
  • ~N after a word specifies the desired edit distance (fuzziness)
  • ~N after a phrase specifies the desired slop amount
  • 1 (current)
Found 19 result(s)
This is the KONECT project, a project in the area of network science with the goal to collect network datasets, analyse them, and make available all analyses online. KONECT stands for Koblenz Network Collection, as the project has roots at the University of Koblenz–Landau in Germany. All source code is made available as Free Software, and includes a network analysis toolbox for GNU Octave, a network extraction library, as well as code to generate these web pages, including all statistics and plots. KONECT contains over a hundred network datasets of various types, including directed, undirected, bipartite, weighted, unweighted, signed and rating networks. The networks of KONECT are collected from many diverse areas such as social networks, hyperlink networks, authorship networks, physical networks, interaction networks and communication networks. The KONECT project has developed network analysis tools which are used to compute network statistics, to draw plots and to implement various link prediction algorithms. The result of these analyses are presented on these pages. Whenever we are allowed to do so, we provide a download of the networks.
Stanford Network Analysis Platform (SNAP) is a general purpose network analysis and graph mining library. It is written in C++ and easily scales to massive networks with hundreds of millions of nodes, and billions of edges. It efficiently manipulates large graphs, calculates structural properties, generates regular and random graphs, and supports attributes on nodes and edges. SNAP is also available through the NodeXL which is a graphical front-end that integrates network analysis into Microsoft Office and Excel. The SNAP library is being actively developed since 2004 and is organically growing as a result of our research pursuits in analysis of large social and information networks. Largest network we analyzed so far using the library was the Microsoft Instant Messenger network from 2006 with 240 million nodes and 1.3 billion edges. The datasets available on the website were mostly collected (scraped) for the purposes of our research. The website was launched in July 2009.
Network Repository is the first interactive data repository for graph and network data. It hosts graph and network datasets, containing hundreds of real-world networks and benchmark datasets. Unlike other data repositories, Network Repository provides interactive analysis and visualization capabilities to allow researchers to explore, compare, and investigate graph data in real-time on the web.
The Social Science Japan Data Archive (SSJDA) collects, maintains, and provides access to the academic community, a vast archive of social science data (quantitative data obtained from social surveys) for secondary analyses.
Social Computing Data Repository hosts data from a collection of many different social media sites, most of which have blogging capacity. Some of the prominent social media sites included in this repository are BlogCatalog, Twitter, MyBlogLog, Digg, StumbleUpon,, MySpace, LiveJournal, The Unofficial Apple Weblog (TUAW), Reddit, etc. The repository contains various facets of blog data including blog site metadata like, user defined tags, predefined categories, blog site description; blog post level metadata like, user defined tags, date and time of posting; blog posts; blog post mood (which is defined as the blogger's emotions when (s)he wrote the blog post); blogger name; blog post comments; and blogger social network.
The projects include airborne, ground-based and ocean measurements, social science surveys, satellite data use, modelling studies and value-added product development. Therefore, the BAOBAB data portal enables to access a great amount and a large variety of data: - 250 local observation datasets, that have been collected by operational networks since 1850, long term monitoring research networks and intensive scientific campaigns; - 1350 outputs of a socio-economics questionnaire; - 60 operational satellite products and several research products; - 10 output sets of meteorological and ocean operational models and 15 of research simulations. Data documentation complies with metadata international standards, and data are delivered into standard formats. The data request interface takes full advantage of the database relational structure and enables users to elaborate multicriteria requests (period, area, property…).
Cell phones have become an important platform for the understanding of social dynamics and influence, because of their pervasiveness, sensing capabilities, and computational power. Many applications have emerged in recent years in mobile health, mobile banking, location based services, media democracy, and social movements. With these new capabilities, we can potentially be able to identify exact points and times of infection for diseases, determine who most influences us to gain weight or become healthier, know exactly how information flows among employees and productivity emerges in our work spaces, and understand how rumors spread. In an attempt to address these challenges, we release several mobile data sets here in "Reality Commons" that contain the dynamics of several communities of about 100 people each. We invite researchers to propose and submit their own applications of the data to demonstrate the scientific and business values of these data sets, suggest how to meaningfully extend these experiments to larger populations, and develop the math that fits agent-based models or systems dynamics models to larger populations. These data sets were collected with tools developed in the MIT Human Dynamics Lab and are now available as open source projects or at cost.
IDSC is IZA's organizational unit whose purpose is to serve the scientific and infrastructural computing needs of IZA and its affiliated communities. IDSC is dedicated to supporting all users of data from the novice researcher to the experienced data analyst. IDSC aims at becoming the place for economically minded technologists and technologically savvy economists looking for data support, data access support and data services about labor economics. IDSC is actively involved in organizing events (see our next Red Cube Seminar Talk) for data professionals, data analysts, and scientific data users and young researchers to discuss and share findings and to establish contacts for future cooperation. All data collected are accessible to the scientific community as scientific use files for scholarly analyses free of charge. The Data Repository is available at
The Digital German Women's Archive (DDF) is an interactive specialist portal on the history of women's movements in Germany. It invites you to get to know topics, actors and networks of the women's movements from two centuries. For this purpose, the lesbian / women's archives, libraries and documentation centers, which are linked in the i.d.a. umbrella organization, present selected digital copies and further information from their holdings.
The sources of the data sets include data sets donated by researchers, surveys carried out by SRDA, as well as by government department and other academic organizations. Prior to the release of data sets, the confidentiality and sensitivity of every survey data set are evaluated. Standard data management and cleaning procedures are applied to ensure data accuracy and completeness. In addition, metadata and relevant supplement files are also edited and attached.
The Survey of Health, Ageing and Retirement in Europe (SHARE) is a multidisciplinary and cross-national panel database of micro data on health, socio-economic status and social and family networks of more than 140,000 individuals (approximately 530,000 interviews) aged 50 or over from 28 European countries and Israel.
ECDC is an EU agency aimed at strengthening Europe's defences against infectious diseases. The core functions cover a wide spectrum of activities: surveillance, epidemic intelligence, response, scientific advice, microbiology, preparedness, public health training, international relations, health communication, and the scientific journal Eurosurveillance. Within the field of its mission, the Centre shall: search for, collect, collate, evaluate and disseminate relevant scientific and technical data; provide scientific opinions and scientific and technical assistance including training; provide timely information to the Commission, the Member States, Community agencies and international organisations active within the field of public health; coordinate the European networking of bodies operating in the fields within the Centre's mission, including networks that emerge from public health activities supported by the Commission and operating the dedicated surveillance networks; exchange information, expertise and best practices, and facilitate the development and implementation of joint actions.
Pandora is an open data platform devoted to the study of the human story. Data may be deposited from various disciplines and research topics that investigate humans from their early beginnings until present in addition to their environmental context (e.g. archeology, anthropology history, ancient DNA, isotopes, zooarchaeology, archaeobotany, and paleoenvironmental and paleoclimatic studies, etc.). Pandora allows autonomous data communities to self-manage their webspace and community membership. Data communities self-curate their data plus other supporting resources. Datasets may be assigned a new DOI and a schema markup is employed to improve data findability. Pandora also allows for links to datasets stored externally and having previously assigned DOIs. Through this, it becomes possible to establish data networks devoted to specific topics that may combine a mix of datasets stored either within Pandora or externally.
The European Union Open Data Portal is the single point of access to a growing range of data from the institutions and other bodies of the European Union (EU). Data are free for you to use and reuse for commercial or non-commercial purposes. By providing easy and free access to data, the portal aims to promote their innovative use and unleash their economic potential. It also aims to help foster the transparency and the accountability of the institutions and other bodies of the EU. The EU Open Data Portal is managed by the Publications Office of the European Union. Implementation of the EU's open data policy is the responsibility of the Directorate-General for Communications Networks, Content and Technology of the European Commission.
GeneWeaver combines cross-species data and gene entity integration, scalable hierarchical analysis of user data with a community-built and curated data archive of gene sets and gene networks, and tools for data driven comparison of user-defined biological, behavioral and disease concepts. Gene Weaver allows users to integrate gene sets across species, tissue and experimental platform. It differs from conventional gene set over-representation analysis tools in that it allows users to evaluate intersections among all combinations of a collection of gene sets, including, but not limited to annotations to controlled vocabularies. There are numerous applications of this approach. Sets can be stored, shared and compared privately, among user defined groups of investigators, and across all users.
<<<!!!<<< duplicate >>>!!!>>> see This record is combined with 'NASA Socioeconomic Data and Applications Center' The World Data Center for Human Interactions in the Environment has been superseded by the NASA Socioeconomic Data and Applications Center (SEDAC), which is a regular member of the World Data System (WDS). The International Council for Science (ICSU) replaced the World Data Centers (WDC) with the WDS, which supports the provision of trusted scientific data services by certifying its members to ensure that they maintain the organizational capabilities and infrastructure for managing the data products and services that they offer. SEDAC focuses on human interactions in the environment and is one of the Distributed Active Archive Centers (DAACs) in the NASA Earth Observing System Data and Information System (EOSDIS). The NASA Earth Science Data and Information System (ESDIS) Project, a WDS Network Member, manages the EOSDIS science systems.
LINDAT/CLARIN is designed as a Czech “node” of Clarin ERIC (Common Language Resources and Technology Infrastructure). It also supports the goals of the META-NET language technology network. Both networks aim at collection, annotation, development and free sharing of language data and basic technologies between institutions and individuals both in science and in all types of research. The Clarin ERIC infrastructural project is more focused on humanities, while META-NET aims at the development of language technologies and applications. The data stored in the repository are already being used in scientific publications in the Czech Republic. In 2019 LINDAT/CLARIAH-CZ was established as a unification of two research infrastructures, LINDAT/CLARIN and DARIAH-CZ.
The Arctic Data Center is the primary data and software repository for the Arctic section of NSF Polar Programs. The Center helps the research community to reproducibly preserve and discover all products of NSF-funded research in the Arctic, including data, metadata, software, documents, and provenance that links these together. The repository is open to contributions from NSF Arctic investigators, and data are released under an open license (CC-BY, CC0, depending on the choice of the contributor). All science, engineering, and education research supported by the NSF Arctic research program are included, such as Natural Sciences (Geoscience, Earth Science, Oceanography, Ecology, Atmospheric Science, Biology, etc.) and Social Sciences (Archeology, Anthropology, Social Science, etc.). Key to the initiative is the partnership between NCEAS at UC Santa Barbara, DataONE, and NOAA’s NCEI, each of which bring critical capabilities to the Center. Infrastructure from the successful NSF-sponsored DataONE federation of data repositories enables data replication to NCEI, providing both offsite and institutional diversity that are critical to long term preservation.