Talk:PLOS/Micropeptide

Initial editorial comments
I think that the techniques and validation sections would be better placed before the examples, but that is up to the authors.

We also recommend the inclusion of hyperlinks to Wikipedia (e.g. [[Page name|words to display]] produces: words to display) so that readers can easily check terms in the article. It's typically best to hyperlink the first use of any significant technical term.

Otherwise, the submission's content is ready to send to peer reviewers. T Shafee (talk) 18:28, 9 January 2018 (PST)

Reviewer 1: Eric Olson
I find this Topic Page to be well written and focused on a subject that is likely of broad interest to the PLOS Genetics community and the field of molecular biology at large. I only have a few comments to help clarify some of the discussion points, which should strengthen the article.

1. The authors should make it clearer that they are only presenting a few examples of known micropeptides. It is a bit confusing when they state “below is a summary of the current understanding of eukaryotic micropeptide function.” They should more clearly state that they are just listing a fraction of the recently identified micropeptides. Notable missing examples include MRI-2 (Slavoff et al., J Biol Chem, 2014), Humanin (Guo et al., Nature, 2003), MOTS-c (Lee et al., Cell Metab, 2015), endoregulin and another-regulin (Anderson et al., Sci Signal, 2016), and SPAR (Matsumoto, A. et al., Nature, 2017).
 * Response: We agree the sentence is confusing and have modified it to state “While many are yet to be functionally characterized, and likely many more remain to be discovered, below is a summary of recently identified eukaryotic micropeptide functions.”
 * Additionally, we have included the additional suggested micropeptide examples.

2. The authors should also cite the paper describing ELABELA where they discuss Toddler (Chng, S.C. et al., Dev Cell 2013).
 * Response: We agree and have added this reference.

3. While it is true that many micropeptides are localized to the cytoplasm, there are also many examples of micropeptides that reside in membranes (myoregulin, DWORF, endoregulin, another-regulin and myomixer). Therefore, we feel the statement “Micropeptides lack an N-terminal signaling sequence, suggesting they are likely to be localized to the cytoplasm after translation” is a bit misleading and should be edited to include the likelihood that several of these small proteins are stabilized by their integration into membranes.
 * Response: We agree, and have added the following sentence: “However, as more micropeptides are studied, they have been found in other cell compartments, as indicated by the existence of transmembrane micropeptides.”

4. While it is true that “hundreds of thousands of micropeptides have been identified through various techniques in a multitude of organisms”, we do not feel that the authors made it clear that this is an estimation of “putative” micropeptides based on sequence analysis. It is likely that this is an overestimation and that a portion of those that have been identified as having coding potential may not actually be real. Please clarify this.
 * Response: We have clarified this statement by editing it to the following “Given their small size, sORFs were originally overlooked. However, hundreds of thousands of putative micropeptides have been identified through various techniques in a multitude of organisms. While only a small fraction of those that have been identified as having coding potential have been proven to be real, those that have been functionally characterized, in general, have roles in cell signaling, organogenesis, and cellular physiology.”

Reviewer 2: Gerben Menschaert
GENERAL

This Topic Page focusses on a topic that is (or certainly will become) of broad interest to the research community and the PLOS Genetics readership specifically. Overall, I find that the Topic Page can be much improved, both in writing and content-wise. Currently, the text is very fragmented; it reads like an assembly of different sentences and not like a consistent piece of text. I suggest carefully proofreading the text and adding sentences where necessary to better highlight the narrative flow of the Topic Page.

e.g. the introduction should be more elaborate in my opinion: currently, each and every piece of information is presented in one short sentence (I might be wrong, maybe that's the idea of such a Wiki Page?), hereunder some suggestion on additions:
 * Response: The introduction is intentionally concise, as this better follows the “wiki” style, which is unlike a full review article.

-	"Micropeptides are polypeptides less than 100 amino acids." Add: As compared to the normal protein length distribution of … Add: something on this threshold of 100 aa?
 * Response: We have added a short description of average protein length in both eukaryotes and prokaryotes. We intentionally kept it concise as to not lose the main focus of the topic page.

-	"They are distinguishable from bioactive peptides as the former is generated from short open reading frames (sORFs), whereas the latter is a cleavage product of a larger polypeptide." Add: references on this (e.g.: DOI: 10.1016/j.euprot.2014.02.006 and DOI: 10.1111/j.1440-169X.2008.00994.x) Add: something on normal bioactive peptides (proconvertase cleavage, references, examples?)
 * Response: We did not add additional information on bioactive peptides, as it is outside the scope of this topic page. If a bioactive peptide topic page is published in the future, we’ll link to it. We have added the suggested references.

-	"They are expressed in both prokaryotic and eukaryotic organisms." Add references on studies (see below in remark 3 for possible references) and reviews on this.
 * Response: We have included studies on micropeptides being expressed in prokaryotic and eukaryotic organisms in the example sections and added these to the introduction, as well as included the references.

-	"The sORFs from which micropeptides are translated can be encoded by small genes, polycistronic mRNAs, or genes originally characterized as long non-coding RNAs (lncRNAs)." This is far from complete and also a difference should be made between (i) putatively coding sORFs derived from predictions (based on RIBO-seq, phylogenetic conservation, sequence context …) and (ii) functionally characterized sORFs as is mentioned in the last paragraph of this introduction. See below for extra comments on this in remark 2.
 * Response: See #3 under “Remarks”.

-	"Only a small fraction of these with coding potential have been confirmed." Confirmed by what? Elaborate.
 * Response: We have changed this to: “Only a small fraction of these with coding potential have had their expression and function confirmed.”

-	"Those that have been functionally characterized, in general, have roles in cell signaling, organogenesis, and cellular physiology." See remark 5 below.
 * Response: See #5 under “Remarks”.

REMARKS:

1.	Add something on the nomenclature used for sORFs/micropeptides (see: DOI: 10.1002/pmic.201700035)
 * Response: We have added information about nomenclature to the introduction. “Micropeptides also known as microproteins or sORF-encoded peptides can also be named according to their genomic location. For example, the translated product of an upstream open reading frame (uORF) might be called a uORF-encoded peptide (uPEP).”

2.	There is still controversy on the upper size limit of what is considered a 'micropeptide'. While 100 AA is the most commonly accepted upper limit, a substantial number of papers consider 150 AA to be the upper limit. This might be worth mentioning. See also a recent review for extra information on this topic: DOI: 10.1002/pmic.201700219.
 * Response: We have altered the introduction to reflect that while 100aa is commonly used, 150aa is also considered a micropeptide.

3.	On "The sORFs from which micropeptides are translated can be encoded by small genes, polycistronic mRNAs, or genes originally characterized as long non-coding RNAs (lncRNAs)." And related figure 1. sORFs have been found in various genomic regions, not limited to the regions mentioned. For example, sORFs have been identified in known protein coding genes as well (e.g. uORFs, out-of-frame ORFs …). See Figure 4.3 in https://link.springer.com/chapter/10.1007/978-3-319-42316-6_4. Figure 1 should be adjusted as well.
 * Response: We agree that sORFs can be encoded from genomic regions other than those we had mentioned and have adjusted the text accordingly. The figure and legend have been updated.

4.	On "They are expressed in both prokaryotic and eukaryotic organisms." For overview of conserved sORFs see DOI: 10.1186/s13059-015-0742-x For public repository see www.sorfs.org (DOI: 10.1093/nar/gkx1130)
 * Response: See #14.

5.	It might be informative to mention why sORFs have been generally overlooked, and why identification of functional important sORFs is tedious. Recent reviews on micropeptides and more specifically this topic exist: DOI: 10.1146/annurev-cellbio-100616-060516 DOI: 10.1038/nchembio.1964 DOI: 10.1016/j.tcb.2017.04.006
 * Response: In the introduction we say, “Given their small size, sORFs were originally overlooked.”

6.	On "Those that have been functionally characterized, in general, have roles in cell signaling, organogenesis, and cellular physiology." Translational activity on sORFs is not solely attributed to the production of functional micropeptides. Translational activity on sORFs may also have a regulatory function (e.g. peptoswitch mechanism). Refer to reviews mentioned under point 4 above and also DOI: 10.1038/nrg3520.
 * Response: We have added to the introduction: “As more micropeptides are discovered so are more of their functions. One regulatory function is that of peptoswitches, which inhibit expression of downstream coding sequences by stalling ribosomes, through their direct or indirect activation by small molecules.”

7.	It should be mentioned that RNA-seq is not an ideal technique for identifying micropeptide-encoding sORFs, as RNA-seq is ignorant of sORF delineation. Because of their small size, sORFs have a high probability of occurring by chance, and RNA-seq provides no evidence of their translation. However, RNA-seq can provide additional information in combination with other techniques.
 * Response: We have added to the paragraph on RNA-seq: “Because of the strong likelihood of sORFs less than 100 aa occurring by chance, further study is necessary to determine the validity of data obtained through this method.”
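
The chance-occurrence argument in this remark can be made concrete with a rough back-of-the-envelope sketch. It assumes, purely for illustration, that bases are uniform and independent, so that 3 of the 64 codons are stop codons; real genomes deviate from this, and the function name `prob_open_frame` is hypothetical.

```python
# Illustrative sketch only: assumes uniform, independent bases,
# so each random codon is a stop codon with probability 3/64.
STOP_CODONS = 3
TOTAL_CODONS = 64


def prob_open_frame(n_codons: int) -> float:
    """Probability that n_codons consecutive random codons contain no stop codon."""
    return ((TOTAL_CODONS - STOP_CODONS) / TOTAL_CODONS) ** n_codons


if __name__ == "__main__":
    # Short stretches stay open by chance far more often than long ones.
    for length in (10, 30, 100, 300):
        print(f"{length:>4} codons: P(no stop) = {prob_open_frame(length):.2e}")
```

Under this simplified model the probability falls off geometrically with length, which is why a short open frame on its own is weak evidence of coding potential and why translation-level or conservation evidence is requested.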

8.	On "Ribosome profiling has been used to identify potential micropeptides in both zebrafish and humans." Ribosome profiling has been used in various species to identify micropeptides, not just human and mouse. Again, for an overview of conserved sORFs see DOI: 10.1186/s13059-015-0742-x; for a public repository see www.sorfs.org (DOI: 10.1093/nar/gkx1130).
 * Response: We have changed this to: “Ribosome profiling has been used to identify potential micropeptides in a growing number of organisms, including plants, fruit flies, zebrafish, mice and humans.”

9.	On "This method uses compounds such as harringtonine, puromycin or lactimidomycin to stop ribosomes at translation initiation sites." Different protocols are at hand to generate data on initiating ribosomes (using antibiotics mentioned above), but also on elongating ribosomes (using cycloheximide or no antibiotic). This should be mentioned.
 * Response: We have added “Translation elongation inhibitors, such as emetine or cycloheximide, may also be used to obtain ribosome footprints which are more likely to result in a translated ORF.”

10.	On "If a ribosome is bound at or near a sORF, it is likely to produce a micropeptide." Replace by: "If ribosomes are bound within a sORF region, it putatively encodes a micropeptide."
 * Response: We have changed “it is likely to produce a micropeptide” to “it putatively encodes a micropeptide”.

11.	In the paragraph on ribosome profiling, one might also add (i) specific metrics that are derived from this type of data to assess coding potential + references: triplet periodicity, ORF-score, FLOSS, RRS + ORF-delineation algorithms such as PRICE, Spectre, ORF-rater, RIBOtaper… (ii) Also, the technique called Poly-RIBO-seq might be introduced (see DOI: 10.7554/eLife.03528). (iii) Ribosome profiling (or RIBO-seq) can either capture translating or initiating ribosomes depending on the treatment protocol. Not solely initiating ribosomes (see also remark 8).
 * Response: We have hyperlinked to the Wikipedia page on “ribosome profiling”.
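
As a toy illustration of the triplet periodicity metric named in this remark, the sketch below tallies which reading frame each ribosome P-site position falls in, relative to a candidate ORF start. It is not the implementation of ORF-score, FLOSS, RRS, or any of the tools listed; the function name `frame_fractions` and the input format (a list of P-site coordinates on the transcript) are assumptions made for the example.

```python
from collections import Counter


def frame_fractions(psite_positions, orf_start):
    """Return the fraction of P-site positions in frames 0, 1 and 2.

    psite_positions: transcript coordinates of inferred ribosome P-sites
                     (hypothetical input format for this sketch).
    orf_start:       coordinate of the first base of the candidate start codon.
    """
    counts = Counter((pos - orf_start) % 3 for pos in psite_positions)
    total = sum(counts.values())
    if total == 0:
        return (0.0, 0.0, 0.0)
    return tuple(counts[frame] / total for frame in range(3))


if __name__ == "__main__":
    # Three footprints in frame 0 and one in frame 1 of an ORF starting at 100.
    print(frame_fractions([100, 103, 106, 107], orf_start=100))
```

A strong excess of footprints in frame 0 is the kind of signal that periodicity-based metrics look for; footprints spread evenly across the three frames would not support active translation of that ORF.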

12.	Proteogenomics is not a method or technique, rather an interdisciplinary field where genomics, translatomics and proteomics are combined. Indeed, micropeptide identification is a proteogenomic endeavor. Furthermore, proteogenomics is not restricted to the methodology described here. The methodology exemplified here is one possible proteogenomic application (see also https://link.springer.com/chapter/10.1007/978-3-319-42316-6_4). Also, the sentence "This technique was employed to discover 90 micropeptides in humans, of which 86 were previously uncharacterized. Interestingly, 57% of the start codons for these micropeptides were found to be at non-AUG sites.[7]" is far from complete. Novel studies were published that apply this specific approach of proteogenomics (DOI: 10.1186/s13059-015-0742-x, DOI: 10.1002/pmic.201700218, the public repository sorfs.org also has a MS-rescanning pipeline in place to specifically search for micropeptides in already available MS data deposited in EBI-PRIDE).
 * Response: We have changed the title to “Proteogenomic applications” and altered the paragraph to reflect this. We have also included a hyperlink to the Proteogenomics Wikipedia page.
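
The proteogenomic application discussed in this remark (a custom polypeptide database built from transcript sequences, then compared against MS-derived peptide sequences) can be sketched roughly as follows. This is a minimal illustration under stated assumptions: only the three forward frames are translated, only AUG start codons are considered (the article itself notes that non-AUG starts are common), and the helper names `short_orf_peptides` and `supported_by_ms` are hypothetical rather than taken from any published pipeline.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def short_orf_peptides(transcript, max_len=100):
    """Collect AUG-initiated, stop-terminated peptides of at most max_len residues."""
    seq = transcript.upper().replace("U", "T")
    peptides = set()
    for frame in range(3):
        protein = "".join(
            CODON_TABLE.get(seq[i:i + 3], "X")
            for i in range(frame, len(seq) - 2, 3)
        )
        # Drop the last segment: it is not terminated by a stop codon.
        for segment in protein.split("*")[:-1]:
            start = segment.find("M")
            if start != -1 and len(segment) - start <= max_len:
                peptides.add(segment[start:])
    return peptides


def supported_by_ms(peptides, observed_fragments):
    """Keep database peptides that contain at least one observed MS fragment."""
    return {p for p in peptides if any(f in p for f in observed_fragments)}


if __name__ == "__main__":
    database = short_orf_peptides("CCATGAAATGGTAACC")
    print(supported_by_ms(database, ["KW"]))
```

Real pipelines match spectra with dedicated search engines and control the false discovery rate; the substring match here only stands in for that comparison step.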

13.	Phylogenetic conservation should be mentioned as one of the techniques to identify functional important sORFs.
 * Response: We have added a new paragraph about phylogenetic conservation under the section for Techniques for identifying potential micropeptides.

14.	Add a paragraph on databases containing sORF-micropeptide information, such as sORFs.org (http://www.sorfs.org), the ARA-PEP repository (http://www.biw.kuleuven.be/CSB/ARA-PEPs/) and SmProt (http://bioinfo.ibp.ac.cn/SmProt/).
 * Response: We have added a section on databases/repositories of sORFs and putative micropeptides.

15.	Check the completeness of the prokaryotic and eukaryotic example list. See the following reviews: DOI: 10.1111/j.1440-169X.2008.00994.x DOI: 10.1016/j.euprot.2014.02.006 DOI: 10.1038/nrg3520 DOI: 10.1146/annurev-cellbio-100616-060516
 * Response: Our list of prokaryotic and eukaryotic micropeptides is a sampling and is not intended to be a complete list.

MINOR REMARKS

-	Replace "translational products" by "translation products"
 * Response: We have made the suggested change.

-	Replace "Deep sequencing of ribosomes" by "RIBO-Sequencing", "RIBO-seq" or "Deep sequencing of ribosome protected fragments"
 * Response: We have made the suggested change.

Reviewer 3: Lloyd Fricker
I read the article and have two major comments.

First, the term “micropeptide” is wrong. There’s nothing “micro” about the peptides that are discussed – they are in the typical peptide size range (10-100 amino acids). They should be called “microproteins”. Looking through the literature, there are 10x more PubMed articles on “microproteins” than “micropeptide” (and both describe the same thing – small proteins of <100 amino acids).

Suggestion: Either add the term “microprotein” to the subject of the article, or mention this term early in the article and say that they are synonymous.
 * Response: We have added the term “microprotein” to the article as a synonymous term for micropeptide.

Second, I have an issue with the following section:


 * "Proteogenomics
 * This method combines proteomics and genomics to discover micropeptides. This technique uses RNA-Seq data to create a custom database of all possible polypeptides. Liquid chromatography followed by tandem MS (LC-MS/MS) is performed to provide sequence information for translation products. Comparison of the transcriptomic and proteomics data can be used to confirm the presence of micropeptides.[1][7]
 * This technique was employed to discover 90 micropeptides in humans, of which 86 were previously uncharacterized. Interestingly, 57% of the start codons for these micropeptides were found to be at non-AUG sites.[8]"

The issue is that the referenced article (8) did NOT discover peptides. They treated their samples with trypsin and looked at the resulting peptide fragments. It was proteomics, not peptidomics. They did try to analyze endogenous peptides (I was a reviewer of this manuscript and insisted on this) and found evidence for 2 peptides in the absence of trypsin, but they couldn’t confirm by MS/MS sequencing. So it’s not exactly the gold standard identification suggested in the Wiki page.

Suggestion: Change the wording of this part: replace “to discover 90 micropeptides” with the phrase “to find evidence for 90 small proteins…”
 * Response: We have deleted the latter part of this section.