Filter
Reset all

Subjects

Content Types

Countries

AID systems

API

Certificates

Data access

Data access restrictions

Database access

Database access restrictions

Database licenses

Data licenses

Data upload

Data upload restrictions

Enhanced publication

Institution responsibility type

Institution type

Keywords

Metadata standards

PID systems

Provider types

Quality management

Repository languages

Software

Syndications

Repository types

Versioning

  • * at the end of a keyword allows wildcard searches
  • " quotes can be used for searching phrases
  • + represents an AND search (default)
  • | represents an OR search
  • - represents a NOT operation
  • ( and ) implies priority
  • ~N after a word specifies the desired edit distance (fuzziness)
  • ~N after a phrase specifies the desired slop amount
Found 103 result(s)
STRING is a database of known and predicted protein interactions. The interactions include direct (physical) and indirect (functional) associations; they are derived from four sources: - Genomic Context - High-throughput Experiments - (Conserved) Coexpression - Previous Knowledge STRING quantitatively integrates interaction data from these sources for a large number of organisms, and transfers information between these organisms where applicable.
dbEST is a division of GenBank that contains sequence data and other information on "single-pass" cDNA sequences, or "Expressed Sequence Tags", from a number of organisms. Expressed Sequence Tags (ESTs) are short (usually about 300-500 bp), single-pass sequence reads from mRNA (cDNA). Typically they are produced in large batches. They represent a snapshot of genes expressed in a given tissue and/or at a given developmental stage. They are tags (some coding, others not) of expression for a given cDNA library. Most EST projects develop large numbers of sequences. These are commonly submitted to GenBank and dbEST as batches of dozens to thousands of entries, with a great deal of redundancy in the citation, submitter and library information. To improve the efficiency of the submission process for this type of data, we have designed a special streamlined submission process and data format. dbEST also includes sequences that are longer than the traditional ESTs, or are produced as single sequences or in small batches. Among these sequences are products of differential display experiments and RACE experiments. The thing that these sequences have in common with traditional ESTs, regardless of length, quality, or quantity, is that there is little information that can be annotated in the record. If a sequence is later characterized and annotated with biological features such as a coding region, 5'UTR, or 3'UTR, it should be submitted through the regular GenBank submissions procedure (via BankIt or Sequin), even if part of the sequence is already in dbEST. dbEST is reserved for single-pass reads. Assembled sequences should not be submitted to dbEST. GenBank will accept assembled EST submissions for the forthcoming TSA (Transcriptome Shotgun Assembly) division. The individual reads which make up the assembly should be submitted to dbEST, the Trace archive or the Short Read Archive (SRA) prior to the submission of the assemblies.
UniProtKB/Swiss-Prot is the manually annotated and reviewed section of the UniProt Knowledgebase (UniProtKB). It is a high quality annotated and non-redundant protein sequence database, which brings together experimental results, computed features and scientific conclusions. Since 2002, it is maintained by the UniProt consortium and is accessible via the UniProt website.
OMIM is a comprehensive, authoritative compendium of human genes and genetic phenotypes that is freely available and updated daily. OMIM is authored and edited at the McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, under the direction of Dr. Ada Hamosh. Its official home is omim.org.
The Structure database provides three-dimensional structures of macromolecules for a variety of research purposes and allows the user to retrieve structures for specific molecule types as well as structures for genes and proteins of interest. Three main databases comprise Structure-The Molecular Modeling Database; Conserved Domains and Protein Classification; and the BioSystems Database. Structure also links to the PubChem databases to connect biological activity data to the macromolecular structures. Users can locate structural templates for proteins and interactively view structures and sequence data to closely examine sequence-structure relationships.
Country
Xanthobase provides information on Xanthomonas oryzae pv oryzae (Xoo), the rice (Oryza sativa) pathogenic bacterium in which genome sequencing has revealed very extensive race differentiation. The whole genome sequence of its native host has also been completed, and analysis of the host parasite interaction on the basis of the two genomes can be expected to be useful.
Country
The Organelle Genome Megasequencing Program (OGMP) provides mitochondrial, chloroplast, and mitochondrial plasmid genome data. OGMP tools allow direct comparison of OGMP and NCBI validated records. Includes GOBASE, a taxonomically broad organelle genome database that organizes and integrates diverse data related to mitochondria and chloroplasts.
Country
PLMD (Protein Lysine Modifications Database) is an online data resource specifically designed for protein lysine modifications (PLMs). The PLMD 3.0 database was extended and adapted from CPLA 1.0 (Compendium of Protein Lysine Acetylation) database and CPLM 2.0 (Compendium of Protein Lysine Modifications) database
=== !!!!! Due to changes in technology and funding, the RAD website is no longer available !!!!! ===
Intrepid Bioinformatics serves as a community for genetic researchers and scientific programmers who need to achieve meaningful use of their genetic research data – but can’t spend tremendous amounts of time or money in the process. The Intrepid Bioinformatics system automates time consuming manual processes, shortens workflow, and eliminates the threat of lost data in a faster, cheaper, and better environment than existing solutions. The system also provides the functionality and community features needed to analyze the large volumes of Next Generation Sequencing and Single Nucleotide Polymorphism data, which is generated for a wide range of purposes from disease tracking and animal breeding to medical diagnosis and treatment.
Country
The Human Genetic Variation Database (HGVD) aims to provide a central resource to archive and display Japanese genetic variation and association between the variation and transcription level of genes. The database currently contains genetic variations determined by exome sequencing of 1,208 individuals and genotyping data of common variations obtained from a cohort of 3,248 individuals.
OrthoMCL is a genome-scale algorithm for grouping orthologous protein sequences. It provides not only groups shared by two or more species/genomes, but also groups representing species-specific gene expansion families. So it serves as an important utility for automated eukaryotic genome annotation. OrthoMCL starts with reciprocal best hits within each genome as potential in-paralog/recent paralog pairs and reciprocal best hits across any two genomes as potential ortholog pairs. Related proteins are interlinked in a similarity graph. Then MCL (Markov Clustering algorithm,Van Dongen 2000; www.micans.org/mcl) is invoked to split mega-clusters. This process is analogous to the manual review in COG construction. MCL clustering is based on weights between each pair of proteins, so to correct for differences in evolutionary distance the weights are normalized before running MCL.
Country
The taxonomically broad EST database TBestDB serves as a repository for EST data from a wide range of eukaryotes, many of which have previously not been thoroughly investigated. Most of the data contained in TBestDB has been generated by the labs of the Protist EST Program located in six universities across Canada. PEP is a large interdisciplinaryresearch project, involving six Canadian universities. PEP aims at the exploration of the diversity of eukaryotic genomes in a systematic, comprehensive and integrated way. The focus is on unicellular microbial eukaryotes, known as protists. Protistan eukaryotes comprise more than a dozen major lineages that, together, encompass more evolutionary, ecological and probably biochemical diversity than the multicellular kingdoms of animals, plants and fungi combined. PEP is a unique endeavor in that it is the first phylogenetically-broad genomic investigation of protists.
The Expression Atlas provides information on gene expression patterns under different biological conditions such as a gene knock out, a plant treated with a compound, or in a particular organism part or cell. It includes both microarray and RNA-seq data. The data is re-analysed in-house to detect interesting expression patterns under the conditions of the original experiment. There are two components to the Expression Atlas, the Baseline Atlas and the Differential Atlas. The Baseline Atlas displays information about which gene products are present (and at what abundance) in "normal" conditions (e.g. tissue, cell type). It aims to answer questions such as "which genes are specifically expressed in human kidney?". This component of the Expression Atlas consists of highly-curated and quality-checked RNA-seq experiments from ArrayExpress. It has data for many different animal and plant species. New experiments are added as they become available. The Differential Atlas allows users to identify genes that are up- or down-regulated in a wide variety of different experimental conditions such as yeast mutants, cadmium treated plants, cystic fibrosis or the effect on gene expression of mind-body practice. Both microarray and RNA-seq experiments are included in the Differential Atlas. Experiments are selected from ArrayExpress and groups of samples are manually identified for comparison e.g. those with wild type genotype compared to those with a gene knock out. Each experiment is processed through our in-house differential expression statistical analysis pipeline to identify genes with a high probability of differential expression.
Country
NONCODE is an integrated knowledge database dedicated to non-coding RNAs (excluding tRNAs and rRNAs). Now, there are 16 species in NONCODE(human, mouse, cow, rat, chicken, fruitfly, zebrafish, celegans, yeast, Arabidopsis, chimpanzee, gorilla, orangutan, rhesus macaque, opossum and platypus).The source of NONCODE includes literature and other public databases. We searched PubMed using key words ‘ncrna’, ‘noncoding’, ‘non-coding’,‘no code’, ‘non-code’, ‘lncrna’ or ‘lincrna. We retrieved the new identified lncRNAs and their annotation from the Supplementary Material or web site of these articles. Together with the newest data from Ensembl , RefSeq, lncRNAdb and GENCODE were processed through a standard pipeline for each species.
RDP provides quality-controlled, aligned and annotated Bacterial and Archaeal 16S rRNA sequences, and Fungal 28S rRNA sequences, and a suite of analysis tools to the scientific community.
Genomic Expression Archive (GEA) is a public database of functional genomics data such as gene expression, epigenetics and genotyping SNP array. Both microarray- and sequence-based data are accepted in the MAGE-TAB format in compliance with MIAME and MINSEQE guidelines, respectively. GEA issues accession numbers, E-GEAD-n to experiment and A-GEAD-n to array design. Data exchange between GEA and EBI ArrayExpress is planned.
IntAct provides a freely available, open source database system and analysis tools for molecular interaction data. All interactions are derived from literature curation or direct user submissions and are freely available.
ClinVar is a freely accessible, public archive of reports of the relationships among human variations and phenotypes, with supporting evidence. ClinVar thus facilitates access to and communication about the relationships asserted between human variation and observed health status, and the history of that interpretation. ClinVar processes submissions reporting variants found in patient samples, assertions made regarding their clinical significance, information about the submitter, and other supporting data. The alleles described in submissions are mapped to reference sequences, and reported according to the HGVS standard. ClinVar then presents the data for interactive users as well as those wishing to use ClinVar in daily workflows and other local applications. ClinVar works in collaboration with interested organizations to meet the needs of the medical genetics community as efficiently and effectively as possible
Country
A collection of high quality multiple sequence alignments for objective, comparative studies of alignment algorithms. The alignments are constructed based on 3D structure superposition and manually refined to ensure alignment of important functional residues. A number of subsets are defined covering many of the most important problems encountered when aligning real sets of proteins. It is specifically designed to serve as an evaluation resource to address all the problems encountered when aligning complete sequences. The first release provided sets of reference alignments dealing with the problems of high variability, unequal repartition and large N/C-terminal extensions and internal insertions. Version 2.0 of the database incorporates three new reference sets of alignments containing structural repeats, trans-membrane sequences and circular permutations to evaluate the accuracy of detection/prediction and alignment of these complex sequences. Within the resource, users can look at a list of all the alignments, download the whole database by ftp, get the "c" program to compare a test alignment with the BAliBASE reference (The source code for the program is freely available), or look at the results of a comparison study of several multiple alignment programs, using BAliBASE reference sets.
Country
The cisRED database holds conserved sequence motifs identified by genome scale motif discovery, similarity, clustering, co-occurrence and coexpression calculations. Sequence inputs include low-coverage genome sequence data and ENCODE data. A Nucleic Acids Research article describes the system architecture
AceView provides a curated, comprehensive and non-redundant sequence representation of all public mRNA sequences (mRNAs from GenBank or RefSeq, and single pass cDNA sequences from dbEST and Trace). These experimental cDNA sequences are first co-aligned on the genome then clustered into a minimal number of alternative transcript variants and grouped into genes. Using exhaustively and with high quality standards the available cDNA sequences evidences the beauty and complexity of mammals’ transcriptome, and the relative simplicity of the nematode and plant transcriptomes. Genes are classified according to their inferred coding potential; many presumably non-coding genes are discovered. Genes are named by Entrez Gene names when available, else by AceView gene names, stable from release to release. Alternative features (promoters, introns and exons, polyadenylation signals) and coding potential, including motifs, domains, and homologies are annotated in depth; tissues where expression has been observed are listed in order of representation; diseases, phenotypes, pathways, functions, localization or interactions are annotated by mining selected sources, in particular PubMed, GAD and Entrez Gene, and also by performing manual annotation, especially in the worm. In this way, both the anatomy and physiology of the experimentally cDNA supported human, mouse and nematode genes are thoroughly annotated.
Country
>>>!!!<<< 2017-06-02: We recently suffered a server failure and are working to bring the full ORegAnno website back online. In the meantime, you may download the complete database here: http://www.oreganno.org/dump/ ; Data are also available through UCSC Genome Browser (e.g., hg38 -> Regulation -> ORegAnno) https://genome.ucsc.edu/cgi-bin/hgTrackUi?hgsid=686342163_2it3aVMQVoXWn0wuCjkNOVX39wxy&c=chr1&g=oreganno >>>!!!<<< The Open REGulatory ANNOtation database (ORegAnno) is an open database for the curation of known regulatory elements from scientific literature. Annotation is collected from users worldwide for various biological assays and is automatically cross-referenced against PubMED, Entrez Gene, EnsEMBL, dbSNP, the eVOC: Cell type ontology, and the Taxonomy database, where appropriate, with information regarding the original experimentation performed (evidence). ORegAnno further provides an open validation process for all regulatory annotation in the public domain. Assigned validators receive notification of new records in the database and are able to cross-reference the citation to ensure record integrity. Validators have the ability to modify any record (deprecating the old record and creating a new one) if an error is found. Further, any contributor to the database can comment on any annotation by marking errors, or adding special reports into function as they see fit. These features of ORegAnno ensure that the collection is of the highest quality and uniquely provides a dynamic view of our changing understanding of gene regulation in the various genomes.
The DIP database catalogs experimentally determined interactions between proteins. It combines information from a variety of sources to create a single, consistent set of protein-protein interactions. The data stored within the DIP database were curated, both, manually by expert curators and also automatically using computational approaches that utilize the the knowledge about the protein-protein interaction networks extracted from the most reliable, core subset of the DIP data. Please, check the reference page to find articles describing the DIP database in greater detail. The Database of Ligand-Receptor Partners (DLRP) is a subset of DIP (Database of Interacting Proteins). The DLRP is a database of protein ligand and protein receptor pairs that are known to interact with each other. By interact we mean that the ligand and receptor are members of a ligand-receptor complex and, unless otherwise noted, transduce a signal. In some instances the ligand and/or receptor may form a heterocomplex with other ligands/receptors in order to be functional. We have entered the majority of interactions in DLRP as full DIP entries, with links to references and additional information