Talk:PLOS/Inferring horizontal gene transfer

Title
I have moved the article from Inferring Horizontal Gene Transfer: Methods and Benchmarking to Methods of detecting horizontal gene transfer because the latter is closer to Wikipedia's article naming conventions. An alternative could be Detection of horizontal gene transfer. See also existing "Methods of" and "Detection of" articles. --Daniel Mietchen 16:05, 26 June 2012 (PDT)
 * I finally moved the article here. --Daniel Mietchen 00:04, 17 August 2012 (PDT)

Further wikification
Please have another look at the guidelines, especially the introductory section and Hyperlinks. Furthermore, please provide the images as SVG rather than PNG. Thank you. --Daniel Mietchen 20:34, 21 September 2012 (PDT)

Review (Tom Williams)
Overall this page is well-written and comprehensive --- it will certainly make a fine addition to Wikipedia (and PLOS). It gives a good exposition of the different methods used to detect HGT.


 * Authors' response: We thank the reviewer for his favourable assessment and the constructive and relevant feedback.

One aspect perhaps missing, which might improve the article, is a discussion of the relative performance of the different methods. Perhaps this could be added to the “Evaluation” section, which currently summarises *how* methods can be compared but not the results of those comparisons.


 * Authors' response: We have extended the evaluation section with pointers to published benchmark studies. We also discuss conceptual differences between the two approaches and point out that parametric and phylogenetic methods are generally difficult to compare because they are based on different aspects of the input data.

I think there is a general sense that phylogenetic methods are the gold standard --- particularly when the incongruent relationships are robustly recovered in terms of bootstrap or posterior probability support. Is this borne out by the empirical tests? Although the overall rate of HGT is unknown, it seems likely to be at least somewhat higher than the rates detected with current methods.


 * Authors' response: The 'evaluation' section has been edited to emphasise (and justify) the generally accepted notion that phylogenetic methods tend to be more conclusive than compositional ones. This point is supported by the fact that multiple features of phylogenetic methods make them conceptually preferable. At the same time, we highlight the fact that the two are difficult to compare because they exploit different sources of information. Finally, though we tend to agree with the referee that HGT rates are likely underestimated, our understanding is that the encyclopedic format of topic/Wikipedia pages makes it difficult to include such opinions.

Are phylogenetic methods more conservative than composition-based tests? For instance, weakly-supported incongruencies in single gene trees are frequently recovered in any large-scale gene tree inference project --- what biological meaning, if any, can be attached to these? Authors seem to vary in how much weight they attach to these trees, perhaps reflecting their own judgements about how much HGT is likely to occur.


 * Authors' response: Composition-based tests miss the majority of non-recent transfer events (Ragan et al. 2006) but are prone to produce false positives (Poptsova 2007). This is now emphasised in the text.

My other comments are very minor

1. I think the definition of HGT in the first sentence is a bit clunky (“occurs when” is vaguer than “is” or “refers to”). Why not model it on the definition given in the linked article on HGT? Something like: HGT is (or refers to) the transfer of genes/acquisition of genetic material between organisms other than by traditional reproduction, or parent-to-offspring vertical inheritance.


 * Authors' response: We have modified the sentence to “the transmission of portions of genomic DNA between organisms through a process decoupled from vertical inheritance.”

2. It may be worth stating more explicitly why it is interesting and important to infer HGT events — i.e. for identifying genes associated with niche-specific adaptation, antibiotic resistance, or when attempting to reconstruct species trees and networks.


 * Authors' response: We now state in the lead section “In the presence of HGT events, different fragments of the genome are the result of different evolutionary histories. This can therefore complicate the investigations of evolutionary relatedness of lineages and species. Also, as HGT can bring into genomes radically different genotypes from distant lineages, or even new genes bearing new functions, it is a major source of phenotypic innovation and a mechanism of niche adaptation. Of particular relevance to human health is the lateral transfer of antibiotic resistance and pathogenicity determinants leading to the emergence of pathogenic lineages.”

This review refers to this version of the page: http://topicpages.ploscompbiol.org/w/index.php?title=Inferring_horizontal_gene_transfer&oldid=4255

--- Tom Williams

Review (Rob Beiko)
This is a very well-written article that covers a good amount of ground in its topic area. It covers the main areas of detection that I am aware of (although see below re: detection of gene fragments) and tries to organize and contrast the methods to some extent.

I recommend some additional references below that could enrich the article by giving additional perspectives or slightly contrasting methods - of course it's not possible to be exhaustive, but maybe some of these will be useful. I apologize that some of these are papers that I have had a hand in.


 * Authors' response: We thank the reviewer for his favourable assessment and the numerous pointers to relevant studies we missed in the earlier version of the manuscript.

I agree with Tom's principal comment that evaluation of the performance of different methods would be valuable.


 * Authors' response: We have substantially expanded the “evaluation” section and now contrast the main approaches.

I have a few suggestions for references that may be helpful in this direction (or there may be better ones that I am not aware of). Tsirigos and Rigoutsos (2005) did comparative evaluations of different methods, as did Becq et al. (2010). Xiong et al. (2012) took a feature-combination approach that examined both the sensitivity and redundancy in different predictors. We (Ragan et al., 2006) used phylogenetic predictions as a gold-standard method to examine the predictions of four non-explicitly-phylogenetic approaches, and also benchmarked the accuracy of recombination detection methods in identifying fragmentary LGT events (Chan et al., 2006). Finally, Nicolas Galtier introduced an LGT simulation framework in 2007.


 * Authors' response: These helpful references and details were added at appropriate places in the review.

You may also want to expand on the question of *what* information you can get from different types of analysis. For example, in some cases phylogenetic methods may not be able to unambiguously identify partners (see, for example, Than et al. 2007); in other cases, they may identify the partners but not the directionality; and in some cases you may get everything. It depends on the tree topology, and people sometimes use Robinson-Foulds distances to try and sort this out. Similarly, if you're brave you can try to claim that a parametric method has identified the donor as well as the recipient.


 * Authors' response: While we generally agree with the reviewer, we believe that the current description (phylogenetic methods "can produce more detailed results than parametric approaches because the involved species, time and direction of transfer can potentially be identified") covers these ideas sufficiently. We cite Than et al. for their investigation of the confounding effect of incomplete lineage sorting on HGT.

Another interesting point is the transfer of gene fragments. In addition to the 2000 review and the saltern paper, Hao and Golding in several papers (most recently 2010) developed likelihood models of gene flux that consider gene transfer and subsequent loss, with explicit consideration of gene fragments.


 * Authors' response: We have added mention of this in the “Phylogenetic profiles” paragraph, along with the possibility of modelling gain/loss heterogeneity proposed by the same authors. Also, reference is made in the ‘Model-based reconciliation’ to the prospect of new network-based methods/models for evolution of gene fragments/domains/etc.

The paragraph that mentions incomplete lineage sorting would benefit from some references. I might recommend the same Than et al. article from above, as well as some of the other excellent work from Luay Nakhleh's group.


 * Authors' response: We have added the suggested reference.

Regarding the SPR paragraph, the recent algorithmic work by Chris Whidden (e.g., in showing that the problem is fixed-parameter tractable: http://epubs.siam.org/doi/abs/10.1137/110845045, and developing practical applications) has greatly sped up the inference of SPR moves. It might be worth mentioning another paper that ties SPR to LGT events (M. Baroni, S. Grünewald, V. Moulton, and C. Semple, "Bounding the number of hybridisation events for a consistent evolutionary history", J. Math. Biol., 51 (2005), pp. 171–182).


 * Authors' response: We have added the references as well as a short mention of the improvements.

Minor / Typographical:

1. Should "Implicit phylogenetic methods" be typeset in a larger font?


 * Authors' response: We have double-checked this but it appears that the heading level is correct.

2. Note that the current record holder (as far as I know) for AT-biased genomes is Candidatus Zinderia insecticola (13.5%); see McCutcheon JP, Moran NA. Functional convergence in reduced genomes of bacterial symbionts spanning 200 My of evolution. Genome Biol Evol. 2010;2:708-718.


 * Authors' response: The text was edited accordingly.

3. A couple of interesting "early" (e.g., 2004) detection methods include RIATA-HGT and LatTrans, which offer very clever algorithmic approaches to the problem.


 * Authors' response: We have added the references and explained the approaches to deal with statistical branch support in gene trees in the 'SPR' paragraph.

4. It is very good to see the Griffith experiment cited early in the paper. It's worth pointing out that Jones and Sneath, "Genetic transfer and bacterial taxonomy" in Bacteriol Rev (1970) is an excellent review of the field, and they cite Flu, P. C. 1927. Sur la nature du bacteriophage. C. R. Hebd. Seances Acad. Sci. Paris 96:1148-1149. I think there may be earlier references but they are difficult to find and/or written in various languages.
 * Found both papers: Jones and Sneath (1970), Flu (1927). --Daniel Mietchen (talk) 18:59, 1 October 2014 (PDT)
 * Authors' response: We've added a pointer to Jones and Sneath in the introduction. We read with great interest the article from P.-C. Flu. However, Flu’s account only provided evidence of the lytic activity of bacteriophages, and not that they can cause lateral gene transfer. Please note that the reference in Jones and Sneath is incorrect—it should be “Flu, P.-C. 1927. C.r. séances Soc. biol. ses. fil. 96(1): 1148-1149”.

Further wikification
I am going through the latest changes and will note here if I find things relevant to wikification that still need attention. --Daniel Mietchen (talk) 19:20, 1 October 2014 (PDT)
 * For references that have a PubMed ID, please use the Cite pmid template (example). This will facilitate the transfer to Wikipedia, where a similar template is in use. --Daniel Mietchen (talk) 19:20, 1 October 2014 (PDT)
 * Thanks - looks good now! --Daniel Mietchen (talk) 14:50, 2 October 2014 (PDT)


 * Please suggest some Wikipedia articles from which to link to this one. --Daniel Mietchen (talk) 20:28, 1 October 2014 (PDT)
 * Index_of_evolutionary_biology_articles
 * Horizontal_gene_transfer
 * Phylogenetic_tree
 * Phylogenetic_network
 * w:Category:Computational_biology
 * Bioinformatics
 * Comparative_genomics
 * Homology_(biology)
 * Thanks! --Daniel Mietchen (talk) 13:26, 2 October 2014 (PDT)


 * Please think about a possible Did you know phrase. --Daniel Mietchen (talk) 20:28, 1 October 2014 (PDT)
 * Authors' response: How about: "Did you know.... that some E. coli strains only have 40% of their genes in common, with the rest potentially acquired through horizontal gene transfer?"
 * That would link to horizontal gene transfer, but the purpose of the DYK would be to make people aware of the new article, so the text should link to Inferring horizontal gene transfer. --Daniel Mietchen (talk) 13:26, 2 October 2014 (PDT)


 * There is a relatively high number of words like "often" or "however", which should be avoided in the interest of verifiability and neutrality. --Daniel Mietchen (talk) 04:32, 2 October 2014 (PDT)
 * Authors' response: We have strongly reduced the number of instances of "often" but we feel that our use of "however" is justified and appropriate.
 * Thanks! --Daniel Mietchen (talk) 13:26, 2 October 2014 (PDT)


 * Please briefly explain what DLIGHT stands for. --Daniel Mietchen (talk) 18:00, 2 October 2014 (PDT)
 * Found it and put it in. --Daniel Mietchen (talk) 18:47, 2 October 2014 (PDT)

I just went through the article once more in detail and fixed any remaining wikification issues. --Daniel Mietchen (talk) 18:47, 2 October 2014 (PDT)
 * Authors' response: Thank you very much for all this editorial work!