Filter
Reset all

Subjects

Content Types

Countries

AID systems

API

Data access

Data access restrictions

Database access

Data licenses

Data upload

Data upload restrictions

Enhanced publication

Institution responsibility type

Institution type

Keywords

Metadata standards

PID systems

Provider types

Quality management

Repository languages

Software

Syndications

Repository types

Versioning

  • * at the end of a keyword allows wildcard searches
  • " quotes can be used for searching phrases
  • + represents an AND search (default)
  • | represents an OR search
  • - represents a NOT operation
  • ( and ) implies priority
  • ~N after a word specifies the desired edit distance (fuzziness)
  • ~N after a phrase specifies the desired slop amount
  • 1 (current)
Found 16 result(s)
OrthoMCL is a genome-scale algorithm for grouping orthologous protein sequences. It provides not only groups shared by two or more species/genomes, but also groups representing species-specific gene expansion families. So it serves as an important utility for automated eukaryotic genome annotation. OrthoMCL starts with reciprocal best hits within each genome as potential in-paralog/recent paralog pairs and reciprocal best hits across any two genomes as potential ortholog pairs. Related proteins are interlinked in a similarity graph. Then MCL (Markov Clustering algorithm,Van Dongen 2000; www.micans.org/mcl) is invoked to split mega-clusters. This process is analogous to the manual review in COG construction. MCL clustering is based on weights between each pair of proteins, so to correct for differences in evolutionary distance the weights are normalized before running MCL.
<<<!!!<<< Effective May 2024, NCBI's Assembly resource will no longer be available. NCBI Assembly data can now be found on the NCBI Datasets genome pages. https://www.re3data.org/repository/r3d100014298 >>>!!!>>> A database providing information on the structure of assembled genomes, assembly names and other meta-data, statistical reports, and links to genomic sequence data.
As with most biomedical databases, the first step is to identify relevant data from the research community. The Monarch Initiative is focused primarily on phenotype-related resources. We bring in data associated with those phenotypes so that our users can begin to make connections among other biological entities of interest. We import data from a variety of data sources. With many resources integrated into a single database, we can join across the various data sources to produce integrated views. We have started with the big players including ClinVar and OMIM, but are equally interested in boutique databases. You can learn more about the sources of data that populate our system from our data sources page https://monarchinitiative.org/about/sources.
AceView provides a curated, comprehensive and non-redundant sequence representation of all public mRNA sequences (mRNAs from GenBank or RefSeq, and single pass cDNA sequences from dbEST and Trace). These experimental cDNA sequences are first co-aligned on the genome then clustered into a minimal number of alternative transcript variants and grouped into genes. Using exhaustively and with high quality standards the available cDNA sequences evidences the beauty and complexity of mammals’ transcriptome, and the relative simplicity of the nematode and plant transcriptomes. Genes are classified according to their inferred coding potential; many presumably non-coding genes are discovered. Genes are named by Entrez Gene names when available, else by AceView gene names, stable from release to release. Alternative features (promoters, introns and exons, polyadenylation signals) and coding potential, including motifs, domains, and homologies are annotated in depth; tissues where expression has been observed are listed in order of representation; diseases, phenotypes, pathways, functions, localization or interactions are annotated by mining selected sources, in particular PubMed, GAD and Entrez Gene, and also by performing manual annotation, especially in the worm. In this way, both the anatomy and physiology of the experimentally cDNA supported human, mouse and nematode genes are thoroughly annotated.
The NCBI Short Genetic Variations database, commonly known as dbSNP, catalogs short variations in nucleotide sequences from a wide range of organisms. These variations include single nucleotide variations, short nucleotide insertions and deletions, short tandem repeats and microsatellites. Short Genetic Variations may be common, thus representing true polymorphisms, or they may be rare. Some rare human entries have additional information associated withthem, including disease associations, genotype information and allele origin, as some variations are somatic rather than germline events. ***NCBI will phase out support for non-human organism data in dbSNP and dbVar beginning on September 1, 2017***
ASAP (a systematic annotation package for community analysis of genomes) is a relational database and web interface developed to store, update and distribute genome sequence data and gene expression data collected by or in collaboration with researchers at the University of Wisconsin - Madison. ASAP was designed to facilitate ongoing community annotation of genomes and to grow with genome projects as they move from the preliminary data stage through post-sequencing functional analysis. The ASAP database includes multiple genome sequences at various stages of analysis, and gene expression data from preliminary experiments.
Harmonized, indexed, searchable large-scale human FG data collection with extensive metadata. Provides scalable, unified way to easily access massive functional genomics (FG) and annotation data collections curated from large-scale genomic studies. Direct integration (API) with custom / high-throughput genetic and genomic analysis workflows.
<<<!!!<<< As of Aug. 15, 2019, we are suspending plasmid distribution from the collection. If you would like to request BioPlex ORF clones (Harper lab) or if you identify other clones in our collection for which you cannot find an alternative, please email us at plasmidhelp@hms.harvard.edu. >>>!!!>>>
>>>!!! <<< The Epigenomics database was retired on June 1, 2016. All epigenomics data are available in our GEO resource https://www.ncbi.nlm.nih.gov/geo >>> !!! <<< The Epigenomics database provides genomics maps of stable and reprogrammable nuclear changes that control gene expression and influence health. Users can browse current epigenomic experiments as well as search, compare and browse samples from multiple biological sources in gene-specific contexts. Many epigenomes contain modifications with histone marks, DNA methylation and chromatin structure activity. NCBI Epigenomics database contains datasets from the NIH Roadmap Epigenomics Project.
In response to emerging pathogens, LabKey launched the Open Research Portal in 2016 to help facilitate collaborative research. It was initially created as a platform for investigators to make Zika research data, commentary and results publicly available in real-time. It now includes other viruses like SARS-CoV-2 where there is a compelling need for real-time data sharing. Projects are freely available to researchers. If you are interested in sharing real-time data through the portal, please contact LabKey to get started.
The UniProt Reference Clusters (UniRef) provide clustered sets of sequences from the UniProt Knowledgebase (including isoforms) and selected UniParc records in order to obtain complete coverage of the sequence space at several resolutions while hiding redundant sequences (but not their descriptions) from view.
This site is now made possible by University of Canterbury, New Zealand, and serves as historical repository of collections of files once housed at datacenterhub (without the interactive tools that were available on that site). Former description: We offer a public platform to help researchers organize, share, and explore their research data. DataCenterHub provides a simple, standardized yet flexible platform to preserve and share data. In the future, this platform will offer data visualization tools and the ability to compare directly data from different sources.