Reset all


Content Types


AID systems


Data access

Data access restrictions

Database access

Database access restrictions

Database licenses

Data licenses

Data upload

Data upload restrictions

Enhanced publication

Institution responsibility type

Institution type


Metadata standards

PID systems

Provider types

Quality management

Repository languages



Repository types


  • * at the end of a keyword allows wildcard searches
  • " quotes can be used for searching phrases
  • + represents an AND search (default)
  • | represents an OR search
  • - represents a NOT operation
  • ( and ) implies priority
  • ~N after a word specifies the desired edit distance (fuzziness)
  • ~N after a phrase specifies the desired slop amount
Found 47 result(s)
>>>!!!<<< SMD has been retired. After approximately fifteen years of microarray-centric research service, the Stanford Microarray Database has been retired. We apologize for any inconvenience; please read below for possible resolutions to your queries. If you are looking for any raw data that was directly linked to SMD from a manuscript, please search one of the public repositories. NCBI Gene Expression Omnibus EBI ArrayExpress All published data were previously communicated to one (or both) of the public repositories. Alternatively, data for publications between 1997 and 2004 were likely migrated to the Princeton University MicroArray Database, and are accessible there. If you are looking for a manuscript supplement (i.e. from a domain other than, perhaps try searching the Internet Archive: Wayback Machine . >>>!!!<<< The Stanford Microarray Database (SMD) is a DNA microarray research database that provides a large amount of data for public use.
dbEST is a division of GenBank that contains sequence data and other information on "single-pass" cDNA sequences, or "Expressed Sequence Tags", from a number of organisms. Expressed Sequence Tags (ESTs) are short (usually about 300-500 bp), single-pass sequence reads from mRNA (cDNA). Typically they are produced in large batches. They represent a snapshot of genes expressed in a given tissue and/or at a given developmental stage. They are tags (some coding, others not) of expression for a given cDNA library. Most EST projects develop large numbers of sequences. These are commonly submitted to GenBank and dbEST as batches of dozens to thousands of entries, with a great deal of redundancy in the citation, submitter and library information. To improve the efficiency of the submission process for this type of data, we have designed a special streamlined submission process and data format. dbEST also includes sequences that are longer than the traditional ESTs, or are produced as single sequences or in small batches. Among these sequences are products of differential display experiments and RACE experiments. The thing that these sequences have in common with traditional ESTs, regardless of length, quality, or quantity, is that there is little information that can be annotated in the record. If a sequence is later characterized and annotated with biological features such as a coding region, 5'UTR, or 3'UTR, it should be submitted through the regular GenBank submissions procedure (via BankIt or Sequin), even if part of the sequence is already in dbEST. dbEST is reserved for single-pass reads. Assembled sequences should not be submitted to dbEST. GenBank will accept assembled EST submissions for the forthcoming TSA (Transcriptome Shotgun Assembly) division. The individual reads which make up the assembly should be submitted to dbEST, the Trace archive or the Short Read Archive (SRA) prior to the submission of the assemblies.
The Gene database provides detailed information for known and predicted genes defined by nucleotide sequence or map position. Gene supplies gene-specific connections in the nexus of map, sequence, expression, structure, function, citation, and homology data. Unique identifiers are assigned to genes with defining sequences, genes with known map positions, and genes inferred from phenotypic information. These gene identifiers are used throughout NCBI's databases and tracked through updates of annotation. Gene includes genomes represented by NCBI Reference Sequences (or RefSeqs) and is integrated for indexing and query and retrieval from NCBI's Entrez and E-Utilities systems.
FungiDB belongs to the EuPathDB family of databases and is an integrated genomic and functional genomic database for the kingdom Fungi. FungiDB was first released in early 2011 as a collaborative project between EuPathDB and the group of Jason Stajich (University of California, Riverside). At the end of 2015, FungiDB was integrated into the EuPathDB bioinformatic resource center. FungiDB integrates whole genome sequence and annotation and also includes experimental and environmental isolate sequence data. The database includes comparative genomics, analysis of gene expression, and supplemental bioinformatics analyses and a web interface for data-mining.
The ENCODE Encyclopedia organizes the most salient analysis products into annotations, and provides tools to search and visualize them. The Encyclopedia has two levels of annotations: Integrative-level annotations integrate multiple types of experimental data and ground level annotations. Ground-level annotations are derived directly from the experimental data, typically produced by uniform processing pipelines.
NCBI Datasets is a continually evolving platform designed to provide easy and intuitive access to NCBI’s sequence data and metadata. NCBI Datasets is part of the NIH Comparative Genomics Resource (CGR). CGR facilitates reliable comparative genomics analyses for all eukaryotic organisms through an NCBI Toolkit and community collaboration.
Greengenes is an Earth Sciences website that assists clinical and environmental microbiologists from around the globe in classifying microorganisms from their local environments. A 16S rRNA gene database addresses limitations of public repositories by providing chimera screening, standard alignment, and taxonomic classification using multiple published taxonomies.
We developed a method, ChIP-sequencing (ChIP-seq), combining chromatin immunoprecipitation (ChIP) and massively parallel sequencing to identify mammalian DNA sequences bound by transcription factors in vivo. We used ChIP-seq to map STAT1 targets in interferon-gamma (IFN-gamma)-stimulated and unstimulated human HeLa S3 cells, and compared the method's performance to ChIP-PCR and to ChIP-chip for four chromosomes.For both Chromatin- immunoprecipation Transcription Factors and Histone modifications. Sequence files and the associated probability files are also provided.
The Cancer Genome Atlas (TCGA) Data Portal provides a platform for researchers to search, download, and analyze data sets generated by TCGA. It contains clinical information, genomic characterization data, and high level sequence analysis of the tumor genomes. The Data Coordinating Center (DCC) is the central provider of TCGA data. The DCC standardizes data formats and validates submitted data.
The Brain Transcriptome Database (BrainTx) project aims to create an integrated platform to visualize and analyze our original transcriptome data and publicly accessible transcriptome data related to the genetics that underlie the development, function, and dysfunction stages and states of the brain.
MycoCosm, the DOE JGI’s web-based fungal genomics resource, which integrates fungal genomics data and analytical tools for fungal biologists. It provides navigation through sequenced genomes, genome analysis in context of comparative genomics and genome-centric view. MycoCosm promotes user community participation in data submission, annotation and analysis.
BCCM/MUCL is a generalist fungal culture collection of over 30000 filamentous fungi, yeasts and arbuscular mycorrhizal fungi including type, reference and test strains. It provides curated documentation and information on these bioresourced in its database. The collections activities include the distribution of its holdings, the accession of new material in its public, safe and patent domains, and services valorising its holdings and/or expertise to cultivate, isolate and identify fungal diversity in natural and anthropological ecosystems, agro-food (food and feed transformation and spoilage), and fungal-plant interactions.
The European Xenopus Resource Centre (EXRC) is situated in Portsmouth, United Kingdom and provides tools and services to support researchers using Xenopus models. The EXRC depends on researchers to obtain and deposit Xenopus transgenic and mutant lines, Xenopus in-situ hybridization clones, Xenopus specific antibodies and other resources with the centre. EXRC staff perform quality assurance testing on these reagents and then makes them available to the community at cost. EXRC also supplies wild type Xenopus, embryos, oocytes, egg extracts, X.tropicalis Fosmids, X.laevis BACs and ORFeomes.
This library is a public and easily accessible resource database of images, videos, and animations of cells, capturing a wide diversity of organisms, cell types, and cellular processes. The Cell Image Library has been merged with "Cell Centered Database" in 2017. The purpose of the database is to advance research on cellular activity, with the ultimate goal of improving human health.
The UniPROBE (Universal PBM Resource for Oligonucleotide Binding Evaluation) database hosts data generated by universal protein binding microarray (PBM) technology on the in vitro DNA binding specificities of proteins. This initial release of the UniPROBE database provides a centralized resource for accessing comprehensive data on the preferences of proteins for all possible sequence variants ('words') of length k ('k-mers'), as well as position weight matrix (PWM) and graphical sequence logo representations of the k-mer data. In total, the database currently hosts DNA binding data for 406 nonredundant proteins from a diverse collection of organisms, including the prokaryote Vibrio harveyi, the eukaryotic malarial parasite Plasmodium falciparum, the parasitic Apicomplexan Cryptosporidium parvum, the yeast Saccharomyces cerevisiae, the worm Caenorhabditis elegans, mouse, and human. The database's web tools (on the right) include a text-based search, a function for assessing motif similarity between user-entered data and database PWMs, and a function for locating putative binding sites along user-entered nucleotide sequences
ASAP (a systematic annotation package for community analysis of genomes) is a relational database and web interface developed to store, update and distribute genome sequence data and gene expression data collected by or in collaboration with researchers at the University of Wisconsin - Madison. ASAP was designed to facilitate ongoing community annotation of genomes and to grow with genome projects as they move from the preliminary data stage through post-sequencing functional analysis. The ASAP database includes multiple genome sequences at various stages of analysis, and gene expression data from preliminary experiments.
<<<!!!<<< This repository is no longer available>>>!!!>>>. Although the web pages are no longer available, you will still be able to download the final UniGene builds as static content from the FTP site You will also be able to match UniGene cluster numbers to Gene records by searching Gene with UniGene cluster numbers. For best results, restrict to the “UniGene Cluster Number” field rather than all fields in Gene. For example, a search with Mm.2108[UniGene Cluster Number] finds the mouse transthyretin Gene record (Ttr). You can use the advanced search page to help construct these searches. Keep in mind that the Gene record contains selected Reference Sequences and GenBank mRNA sequences rather than the larger set of expressed sequences in the UniGene cluster.
<<<!!!<<< OFFLINE >>>!!!>>> A recent computer security audit has revealed security flaws in the legacy HapMap site that require NCBI to take it down immediately. We regret the inconvenience, but we are required to do this. That said, NCBI was planning to decommission this site in the near future anyway (although not quite so suddenly), as the 1,000 genomes (1KG) project has established itself as a research standard for population genetics and genomics. NCBI has observed a decline in usage of the HapMap dataset and website with its available resources over the past five years and it has come to the end of its useful life. The International HapMap Project is a multi-country effort to identify and catalog genetic similarities and differences in human beings. Using the information in the HapMap, researchers will be able to find genes that affect health, disease, and individual responses to medications and environmental factors. The Project is a collaboration among scientists and funding agencies from Japan, the United Kingdom, Canada, China, Nigeria, and the United States. All of the information generated by the Project will be released into the public domain. The goal of the International HapMap Project is to compare the genetic sequences of different individuals to identify chromosomal regions where genetic variants are shared. By making this information freely available, the Project will help biomedical researchers find genes involved in disease and responses to therapeutic drugs. In the initial phase of the Project, genetic data are being gathered from four populations with African, Asian, and European ancestry. Ongoing interactions with members of these populations are addressing potential ethical issues and providing valuable experience in conducting research with identified populations. Public and private organizations in six countries are participating in the International HapMap Project. Data generated by the Project can be downloaded with minimal constraints. The Project officially started with a meeting in October 2002 ( and is expected to take about three years.
GeneWeaver combines cross-species data and gene entity integration, scalable hierarchical analysis of user data with a community-built and curated data archive of gene sets and gene networks, and tools for data driven comparison of user-defined biological, behavioral and disease concepts. Gene Weaver allows users to integrate gene sets across species, tissue and experimental platform. It differs from conventional gene set over-representation analysis tools in that it allows users to evaluate intersections among all combinations of a collection of gene sets, including, but not limited to annotations to controlled vocabularies. There are numerous applications of this approach. Sets can be stored, shared and compared privately, among user defined groups of investigators, and across all users.
Content type(s)
Genome resource samples of wild animals, particularly those of endangered mammalian and avian species, are very difficult to collect. In Korea, many of these animals such as tigers, leopards, bears, wolves, foxes, gorals, and river otters, are either already extinct, long before the Korean biologists had the opportunity to study them, or are near extinction. Therefore, proposal for a systematic collection and preservation of genetic samples of these precious animals was adopted by Korea Science & Engineering Foundation (KOSEF). As an outcome, Conservation Genome Resource Bank for Korean Wildlife (CGRB; was established in 2002 at the College of Veterinary Medicine, Seoul National University as one of the Special Research Materials Bank supported by the Scientific and Research Infrastructure Building Program of KOSEF. CGRB operates in collaboration with Seoul Grand Park Zoo managed by Seoul Metropolitan Government, and has offices and laboratories at both Seoul National University and Seoul Grand Park, where duplicate samples are maintained, thereby assuring a long-term, safe preservation of the samples. Thus, CGRB is the first example of the collaborative scientific infrastructure program between university and zoo in Korea.
Addgene archives and distributes plasmids for researchers around the globe. They are working with thousands of laboratories to assemble a high-quality library of published plasmids for use in research and discovery. By linking plasmids with articles, scientists can always find data related to the materials they request.
<<<!!!<<< As of 2023, support to maintain the and sites have been retired following the end of funding. To access data from the modENCODE project, or for questions regarding the data they make available, please visit these databases: Fly data: FlyBase: ModENCODE data at FlyBase: FlyBase: Worm data: WormBase Data, including modENCODE and modERN project data, is also available at the ENCODE Portal: (search metadata and view datasets for Drosophila and Caenorhabditis!=*&status=released&replicates.library.biosample.donor.organism.scientific_name=Drosophila+melanogaster&replicates.library.biosample.donor.organism.scientific_name=Caenorhabditis+elegans&replicates.library.biosample.donor.organism.scientific_name=Drosophila+pseudoobscura&replicates.library.biosample.donor.organism.scientific_name=Drosophila+mojavensis). >>>!!!>>>
CryptoDB is an integrated genomic and functional genomic database for the parasite Cryptosporidium and other related genera. CryptoDB integrates whole genome sequence and annotation along with experimental data and environmental isolate sequences provided by community researchers. The database includes supplemental bioinformatics analyses and a web interface for data-mining.