Content Types


AID systems



Data access

Data access restrictions

Database access

Database access restrictions

Database licenses

Data licenses

Data upload

Data upload restrictions

Enhanced publication

Institution responsibility type

Institution type


Metadata standards

PID systems

Provider types

Quality management

Repository languages



Repository types


  • * at the end of a keyword allows wildcard searches
  • " quotes can be used for searching phrases
  • + represents an AND search (default)
  • | represents an OR search
  • - represents a NOT operation
  • ( and ) implies priority
  • ~N after a word specifies the desired edit distance (fuzziness)
  • ~N after a phrase specifies the desired slop amount
  • 1 (current)
Found 17 result(s)
ModelDB is a curated database of published models in the broad domain of computational neuroscience. It addresses the need for access to such models in order to evaluate their validity and extend their use. It can handle computational models expressed in any textual form, including procedural or declarative languages (e.g. C++, XML dialects) and source code written for any simulation environment. The model source code doesn't even have to reside inside ModelDB; it just has to be available from some publicly accessible online repository or WWW site.
QSAR DataBank (QsarDB) is repository for (Quantitative) Structure-Activity Relationships ((Q)SAR) data and models. It also provides open domain-specific digital data exchange standards and associated tools that enable research groups, project teams and institutions to share and represent predictive in silico models.
ILC-CNR for CLARIN-IT repository is a library for linguistic data and tools. Including: Text Processing and Computational Philology; Natural Language Processing and Knowledge Extraction; Resources, Standards and Infrastructures; Computational Models of Language Usage. The studies carried out within each area are highly interdisciplinary and involve different professional skills and expertises that extend across the disciplines of Linguistics, Computational Linguistics, Computer Science and Bio-Engineering.
This classic collection of test cases for validation of turbulence models started as an EU / ERCOFTAC project led by Pr. W. Rodi in 1995. It is maintained by Dr. T. Craft at Manchester since 1999. Initialy limited to experimental data, computational results, and results and conclusions drawn from the ERCOFTAC Workshops on Refined Turbulence Modelling (SIG15). At the moment, each case should contain at least a brief description, some data to download, and references to published work. Some cases contain significantly more information than this.
OpenWorm aims to build the first comprehensive computational model of the Caenorhabditis elegans (C. elegans), a microscopic roundworm. With only a thousand cells, it solves basic problems such as feeding, mate-finding and predator avoidance. Despite being extremely well studied in biology, this organism still eludes a deep, principled understanding of its biology. We are using a bottom-up approach, aimed at observing the worm behaviour emerge from a simulation of data derived from scientific experiments carried out over the past decade. To do so we are incorporating the data available in the scientific community into software models. We are engineering Geppetto and Sibernetic, open-source simulation platforms, to be able to run these different models in concert. We are also forging new collaborations with universities and research institutes to collect data that fill in the gaps All the code we produce in the OpenWorm project is Open Source and available on GitHub.
Content type(s)
Datanator is an integrated database of genomic and biochemical data designed to help investigators find data about specific molecules and reactions in specific organisms and specific environments for meta-analyses and mechanistic models. Datanator currently includes metabolite concentrations, RNA modifications and half-lives, protein abundances and modifications, and reaction kinetics integrated from several databases and numerous publications. The Datanator website and REST API provide tools for extracting clouds of data about specific molecules and reactions in specific organisms and specific environments, as well as data about similar molecules and reactions in taxonomically similar organisms.
This website makes data available from the first round of data sharing projects that were supported by the CRCNS funding program. To enable concerted efforts in understanding the brain experimental data and other resources such as stimuli and analysis tools should be widely shared by researchers all over the world. To serve this purpose, this website provides a marketplace and discussion forum for sharing tools and data in neuroscience. To date we host experimental data sets of high quality that will be valuable for testing computational models of the brain and new analysis methods. The data include physiological recordings from sensory and memory systems, as well as eye movement data.
The Comparative RNA Web (CRW) Site disseminates information about RNA structure and evolution that has been determined using comparative sequence analysis. We present both raw (sequences, structure models, metadata) and processed (analyses, evolution, accuracy) data, organized into four main sections.
Cell phones have become an important platform for the understanding of social dynamics and influence, because of their pervasiveness, sensing capabilities, and computational power. Many applications have emerged in recent years in mobile health, mobile banking, location based services, media democracy, and social movements. With these new capabilities, we can potentially be able to identify exact points and times of infection for diseases, determine who most influences us to gain weight or become healthier, know exactly how information flows among employees and productivity emerges in our work spaces, and understand how rumors spread. In an attempt to address these challenges, we release several mobile data sets here in "Reality Commons" that contain the dynamics of several communities of about 100 people each. We invite researchers to propose and submit their own applications of the data to demonstrate the scientific and business values of these data sets, suggest how to meaningfully extend these experiments to larger populations, and develop the math that fits agent-based models or systems dynamics models to larger populations. These data sets were collected with tools developed in the MIT Human Dynamics Lab and are now available as open source projects or at cost.
NetPath is currently one of the largest open-source repository of human signaling pathways that is all set to become a community standard to meet the challenges in functional genomics and systems biology. Signaling networks are the key to deciphering many of the complex networks that govern the machinery inside the cell. Several signaling molecules play an important role in disease processes that are a direct result of their altered functioning and are now recognized as potential therapeutic targets. Understanding how to restore the proper functioning of these pathways that have become deregulated in disease, is needed for accelerating biomedical research. This resource is aimed at demystifying the biological pathways and highlights the key relationships and connections between them. Apart from this, pathways provide a way of reducing the dimensionality of high throughput data, by grouping thousands of genes, proteins and metabolites at functional level into just several hundreds of pathways for an experiment. Identifying the active pathways that differ between two conditions can have more explanatory power than just a simple list of differentially expressed genes and proteins.
The Basis Set Exchange (BSE) provides a web-based user interface for downloading and uploading Gaussian-type (GTO) basis sets, including effective core potentials (ECPs), from the EMSL Basis Set Library. It provides an improved user interface and capabilities over its predecessor, the EMSL Basis Set Order Form, for exploring the contents of the EMSL Basis Set Library. The popular Basis Set Order Form and underlying Basis Set Library were originally developed by Dr. David Feller and have been available from the EMSL webpages since 1994.
SimTK is a free project-hosting platform for the biomedical computation community that enables researchers to easily share their software, data, and models and provides the infrastructure so they can support and grow a community around their projects. It has over 62,000 members, hosts more than 960 projects from researchers around the world, and has had more than 500,000 files downloaded from it. Individuals have created SimTK projects to meet publisher and funding agencies’ software and data sharing requirements, run scientific challenges, create a collection of their community’s resources, and much more.
The Cancer Cell Line Encyclopedia project is a collaboration between the Broad Institute, and the Novartis Institutes for Biomedical Research and its Genomics Institute of the Novartis Research Foundation to conduct a detailed genetic and pharmacologic characterization of a large panel of human cancer models, to develop integrated computational analyses that link distinct pharmacologic vulnerabilities to genomic patterns and to translate cell line integrative genomics into cancer patient stratification. The CCLE provides public access to genomic data, analysis and visualization for about 1000 cell lines.
BioSimulations is a web application for sharing and re-using biomodels, simulations, and visualizations of simulations results. BioSimulations supports a wide range of modeling frameworks (e.g., kinetic, constraint-based, and logical modeling), model formats (e.g., BNGL, CellML, SBML), and simulation tools (e.g., COPASI, libRoadRunner/tellurium, NFSim, VCell). BioSimulations aims to help researchers discover published models that might be useful for their research and quickly try them via a simple web-based interface.
DLESE is the Digital Library for Earth System Education, a geoscience community resource that supports teaching and learning about the Earth system. It is funded by the National Science Foundation and is being built by a community of educators, students, and scientists to support Earth system education at all levels and in both formal and informal settings. Resources in DLESE include lesson plans, scientific data, visualizations, interactive computer models, and virtual field trips - in short, any web-accessible teaching or learning material. Many of these resources are organized in collections, or groups of related resources that reflect a coherent, focused theme. In many ways, digital collections are analogous to collections in traditional bricks-and-mortar libraries.
This project is an open invitation to anyone and everyone to participate in a decentralized effort to explore the opportunities of open science in neuroimaging. We aim to document how much (scientific) value can be generated from a data release — from the publication of scientific findings derived from this dataset, algorithms and methods evaluated on this dataset, and/or extensions of this dataset by acquisition and incorporation of new data. The project involves the processing of acoustic stimuli. In this study, the scientists have demonstrated an audiodescription of classic "Forrest Gump" to subjects, while researchers using functional magnetic resonance imaging (fMRI) have captured the brain activity of test candidates in the processing of language, music, emotions, memories and pictorial representations.In collaboration with various labs in Magdeburg we acquired and published what is probably the most comprehensive sample of brain activation patterns of natural language processing. Volunteers listened to a two-hour audio movie version of the Hollywood feature film "Forrest Gump" in a 7T MRI scanner. High-resolution brain activation patterns and physiological measurements were recorded continuously. These data have been placed into the public domain, and are freely available to the scientific community and the general public.
When published in 2005, the Millennium Run was the largest ever simulation of the formation of structure within the ΛCDM cosmology. It uses 10(10) particles to follow the dark matter distribution in a cubic region 500h(−1)Mpc on a side, and has a spatial resolution of 5h−1kpc. Application of simplified modelling techniques to the stored output of this calculation allows the formation and evolution of the ~10(7) galaxies more luminous than the Small Magellanic Cloud to be simulated for a variety of assumptions about the detailed physics involved. As part of the activities of the German Astrophysical Virtual Observatory we have created relational databases to store the detailed assembly histories both of all the haloes and subhaloes resolved by the simulation, and of all the galaxies that form within these structures for two independent models of the galaxy formation physics. We have implemented a Structured Query Language (SQL) server on these databases. This allows easy access to many properties of the galaxies and halos, as well as to the spatial and temporal relations between them. Information is output in table format compatible with standard Virtual Observatory tools. With this announcement (from 1/8/2006) we are making these structures fully accessible to all users. Interested scientists can learn SQL and test queries on a small, openly accessible version of the Millennium Run (with volume 1/512 that of the full simulation). They can then request accounts to run similar queries on the databases for the full simulations. In 2008 and 2012 the simulations were repeated.