OrthoDB
OrthoDB | |
---|---|
Content | |
Description | Catalog of Orthologs. |
Contact | |
Research center | Swiss Institute of Bioinformatics |
Laboratory | Computational Evolutionary Genomics Group |
Authors | Evgenia V. Kriventseva |
Primary citation | Kriventseva et al. (2015)[1] |
Release date | 2007 |
Access | |
Website | www |
OrthoDB[1][2][3][4] presents a catalog of orthologous protein-coding genes across vertebrates, arthropods, fungi, and bacteria. Orthology is defined relative to the last common ancestor of the species under consideration, and thus OrthoDB explicitly delineates orthologs at each radiation along the species phylogeny. The database presents available protein descriptors, together with Gene Ontology and InterPro attributes, which provide general descriptive annotations of the orthologous groups and facilitate comprehensive querying of the orthology database. OrthoDB also provides computed evolutionary traits of orthologs, such as gene duplicability and loss profiles, divergence rates, and sibling groups, now extended to detail intron-exon architectures, syntenic orthologs, and parent-child trees.
Methodology
Orthology is defined relative to the last common ancestor of the species being considered, which makes orthologous classifications inherently hierarchical. OrthoDB addresses this explicitly by applying the orthology delineation procedure at each radiation point of the species phylogeny, which is itself computed empirically from the super-alignment of single-copy orthologs using a maximum-likelihood approach. The OrthoDB implementation employs a Best-Reciprocal-Hit (BRH) clustering algorithm based on all-against-all Smith–Waterman protein sequence comparisons. Gene set pre-processing selects the longest protein-coding transcript of alternatively spliced genes and of very similar gene copies. The procedure triangulates BRHs to progressively build the clusters and requires an overall minimum sequence alignment overlap to avoid domain walking. These core clusters are then expanded to include all more closely related within-species in-paralogs, and the previously identified very similar gene copies.
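The BRH triangulation step described above can be illustrated with a minimal Python sketch. It is not the OrthoDB implementation: the function names, the `(species, gene_id)` gene representation, and the toy scores are assumptions made for illustration. The sketch takes precomputed pairwise scores as given (standing in for the Smith–Waterman comparisons) and omits the alignment-overlap requirement and the in-paralog expansion.

```python
from collections import defaultdict
from itertools import combinations


def best_reciprocal_hits(scores):
    """Return the set of best-reciprocal-hit (BRH) gene pairs.

    scores maps (gene_a, gene_b) -> alignment score. A gene is a
    hypothetical (species, gene_id) tuple; each unordered pair appears once.
    """
    best = {}  # (query gene, target species) -> (score, best hit)
    for (a, b), score in scores.items():
        if a[0] == b[0]:
            continue  # within-species pairs are not BRH candidates
        for query, hit in ((a, b), (b, a)):
            key = (query, hit[0])
            if key not in best or score > best[key][0]:
                best[key] = (score, hit)
    brhs = set()
    for (query, _species), (_score, hit) in best.items():
        # Keep the pair only if the hit's best match in the query's
        # species is the query itself.
        if best.get((hit, query[0]), (None, None))[1] == query:
            brhs.add(frozenset((query, hit)))
    return brhs


def triangulated_clusters(brhs):
    """Merge genes joined by BRH triangles (three mutually reciprocal hits)."""
    neighbours = defaultdict(set)
    for pair in brhs:
        a, b = tuple(pair)
        neighbours[a].add(b)
        neighbours[b].add(a)

    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a in neighbours:
        for b, c in combinations(sorted(neighbours[a]), 2):
            if c in neighbours[b]:  # a-b, a-c and b-c are all BRHs
                parent[find(b)] = find(a)
                parent[find(c)] = find(a)

    clusters = defaultdict(set)
    for gene in parent:
        clusters[find(gene)].add(gene)
    return sorted(sorted(members) for members in clusters.values())


# Toy data: f1, m1 and b1 form a triangle; f2 and m2 form only a pair.
f1, f2 = ("fly", "f1"), ("fly", "f2")
m1, m2 = ("mosquito", "m1"), ("mosquito", "m2")
b1 = ("bee", "b1")
scores = {
    (f1, m1): 100, (f1, b1): 90, (m1, b1): 95, (f2, m2): 80,
    (f1, m2): 30, (f2, m1): 20, (f2, b1): 10, (m2, b1): 15,
}
brhs = best_reciprocal_hits(scores)
clusters = triangulated_clusters(brhs)
print(clusters)
```

In this toy example only the fly, mosquito and bee genes that are mutual best hits in all three pairings end up in a cluster. The isolated f2-m2 pair is left out because the sketch seeds clusters from triangles only, which is a simplification chosen here rather than a documented OrthoDB rule.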
Data content
The database now contains over 300 eukaryotic species and more than 1000 bacteria,[2] sourced from Ensembl, UniProt, NCBI, FlyBase and several other databases. The ever-increasing sampling of sequenced genomes gives a clearer account of the majority of gene genealogies, which will facilitate informed hypotheses of gene function in newly sequenced genomes.
Examples of studies that have employed data from OrthoDB include comparative analyses of gene repertoire evolution,[5][6] comparisons of fruit fly and mosquito developmental genes,[7] analyses of bloodmeal- or infection-induced changes in gene expression in mosquitoes,[8][9][10] and analysis of the evolution of mammalian milk production.[11] Other studies citing OrthoDB can be found at PubMed.
Performance
OrthoDB has performed consistently well in benchmarking assessments alongside other orthology delineation procedures. Results were compared to reference trees for three well-conserved protein families,[12] and to a larger set of curated protein families.[13]
BUSCOs
Benchmarking sets of Universal Single-Copy Orthologs (BUSCOs) are orthologous groups selected from OrthoDB for the root-level classifications of arthropods, vertebrates, metazoans, and fungi. Groups are required to contain single-copy orthologs in at least 90% of the species (in the others they may be lost or duplicated), and the missing species cannot all be from the same clade. Species with frequent losses or duplications are removed from the selection unless they hold a key position in the phylogeny. BUSCOs are therefore expected to be found as single-copy orthologs in any newly sequenced genome from the appropriate phylogenetic clade, and can be used to assess the relative completeness of such genomes.
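The selection criteria above can be sketched as a simple filter. The following Python example is illustrative only and is not the actual BUSCO selection code: the function name, the input layout (per-group gene copy counts and a species-to-clade map), and the toy data are assumptions. It reads the clade rule literally, rejecting any group whose non-single-copy species all belong to one clade, and it omits the step that removes species with frequent losses or duplications.

```python
def select_buscos(groups, species_clade, min_fraction=0.9):
    """Return ids of groups that pass the single-copy criteria.

    groups: group_id -> {species: gene copy count} (hypothetical layout)
    species_clade: species -> clade label (hypothetical layout)
    """
    all_species = set(species_clade)
    selected = []
    for group_id, counts in groups.items():
        single_copy = {sp for sp in all_species if counts.get(sp, 0) == 1}
        # Require single-copy orthologs in at least min_fraction of species.
        if len(single_copy) / len(all_species) < min_fraction:
            continue
        # Species where the gene is lost (0 copies) or duplicated (>1).
        missing = all_species - single_copy
        missing_clades = {species_clade[sp] for sp in missing}
        # Literal reading of the clade rule: exceptions confined to a
        # single clade disqualify the group.
        if missing and len(missing_clades) == 1:
            continue
        selected.append(group_id)
    return selected


# Toy data: 20 species split evenly between two clades.
species_clade = {f"sp{i}": ("cladeA" if i < 10 else "cladeB") for i in range(20)}
complete = {sp: 1 for sp in species_clade}
groups = {
    "g_all": dict(complete),                                # single-copy everywhere
    "g_spread": {**complete, "sp0": 0, "sp10": 2},          # exceptions in both clades
    "g_clade": {**complete, "sp0": 0, "sp1": 0},            # exceptions in one clade
    "g_low": {**complete, "sp0": 0, "sp1": 0, "sp10": 0},   # only 85% single-copy
}
buscos = select_buscos(groups, species_clade)
print(buscos)
```

Under these assumptions, "g_all" and "g_spread" pass, "g_clade" fails the clade rule, and "g_low" fails the 90% threshold.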
Notes and references
- ↑ 1.0 1.1 Kriventseva et al. (2015). PMID 25428351.
- ↑ 2.0 2.1
- ↑
- ↑
- ↑
- ↑
- ↑
- ↑
- ↑
- ↑
- ↑
- ↑
- ↑ OrthoBench (http://eggnog.embl.de/orthobench)