User:Aoholcombe/sandbox

Response to reviews of MOT preprint
Dear Dr. Leung,

Thank you for coordinating the process and thanks very much to the reviewers for their comments. Before detailing how I have revised the manuscript to address the reviewers, I would like to raise a question that you may be able to help me with. Because in the fields of engineering and computer vision the phrase "object tracking" refers to tracking by computers, I think the Wikipedia page should start with something like "For visual tracking of objects by computers, see Video tracking." I would appreciate your advice on whether you think that is appropriate and, if so, exactly how I should format such a phrase.

The edits I made in response to the reviewers' comments have substantially lengthened the manuscript. This made it important to add structure, so I've found some ways to make the article a bit more hierarchical rather than being restricted to top-level headings. To do this, I demoted the heading "Use in ability testing and training" to make it a sub-heading of "Human variation and development". By bulking up the Theories and Models content, I was also able to divide that section into "Serial or parallel?" and "Slots or resources?" subsections.

I have also made a lot of small edits to improve the writing of most of the sections.

Addressing Reviewer 2's Comments
''Reviewer 2 Comment 1. "the second paragraph in the procedure section should be expanded to describe major variations on the MOT method. For example, experiments where stimuli do not move randomly but rotate along a common circular path..."''

Thank you for this important point. I have rewritten the end of the Procedure section to address this, adding a few sentences.

Reviewer 2 Comment 2. As suggested, to address the jarring mention of "Harvard undergraduates", I deleted the phrase. To address the issue that the literature uses almost exclusively WEIRD participants, I added some sentences in the middle of the "Human variation" section, which I've now re-named "Human variation and development". To increase the representation of groups besides Western university undergraduates, I reported results from studies of children and of persons with autism spectrum disorders, using papers that include those mentioned by Reviewer 2 in his comment 3.

Reviewer 2 Comment 3. I really appreciate the reviewer's mention of two papers on Williams Syndrome, which I did not know about and have now added.

Reviewer 2 Comment 4. Reading the Thornton & Horowitz (2015) paper mentioned was helpful, but I wasn't sure how to describe the results in a way that fit well with my manuscript. The reviewer's comment suggested to me that there was no cost to action for MOT – although the authors did find that the dual-task cost was statistically significant, they characterized it as small. They're probably right about that, but it would require some subtle reasoning to justify a simple conclusion such as that action does not disrupt MOT, and I'm not sure how to make that argument for a Wikipedia audience. Note that for my book, which some of the points in the present manuscript are based on, I did start to draft a chapter on dual-task interference, but I abandoned it when I realized the book would be too long with that included. As a result, I don't feel I have enough mastery of the dual-task literature to go deeply into it. Also, the recent paper by Terry and Trick seems to paint a more complicated picture (although I haven't read the paper in detail).

Reviewer 2 Comment 5. I appreciate the reviewer prompting me to add more about models. This is a complex topic and I'd shied away from it because I haven't seen a lot of critical tests comparing models. Instead, people tend to use fairly flexible models that can explain the same things. For example, slots-plus-averaging can explain gradual performance decreases with display parameters, but this is rarely grappled with in the literature; the MOT literature is far behind the visual working memory literature on this. I've added a mention of additional models and a description of some of the issues they get at.

Reviewer 2 Comment 6. I left out the Oksama & Hyona 2004 paper for brevity, although I do cite that paper elsewhere in the manuscript.

Reviewer 2 PDF annotation 1. The reviewer describes as "perilously close to redescribing the data" my explanation that the dissociation between tracking and motion direction judgments is due to a dedicated motion system mediating direction judgments. I think the dissociation could be explained in other ways besides a dedicated motion system, ways that some might favor if we didn't already have as much evidence for a dedicated motion system as we do from several decades of motion perception research. For example, early researchers (and naive Wikipedia readers) might have expected that motion would be detected based on change in position, using an actual position tracker that is not separate from position perception, rather than using dedicated detectors such as those Reichardt suggested. In that case, the dissociation would be much more surprising, and naive readers of Wikipedia who don't know about the motion aftereffect and other independent evidence for detectors of motion might hesitate to conclude that there is a motion system distinct from that used for tracking extended trajectories, without the guiding hand of the sentence I included. Still, I appreciate the reviewer's comment as it pushed me to add more substance; I have revised the sentence to the following: "This dissociation between motion perception and object tracking is thought to reflect that direction judgments can be based on low-level and local motion detector responses that don't register the positions of objects."

Reviewer 2 PDF annotation 2. About the circular array display my manuscript refers to, the reviewer writes "Needs to be clearer that we've moved from the classic Pylyshyn style independent, random object motion to a rotational paradigm. Maybe briefly list the kinds of MOT paradigms above?" I want to avoid having to come up with a taxonomy of the wide variety of displays people use, which is somewhat implied by a list. Instead, I'm hoping to keep writing the article in a way that implies an infinite range of possible trajectories could be used. To better convey that, I have added to the second paragraph under the Procedure subsection. I also added a new sentence to introduce the circular display before the sentence commented on by the reviewer.

Reviewer 2 PDF annotation 3. Thank you for pointing out that the meaning of that sentence is obscure. I'm not sure I should get into specifics of the experiment as my manuscript has gotten pretty long, but I hope the following is clearer – I think it is clear enough that there is a dissociation, and the reader should consult the cited paper to understand the nature of that dissociation: "While one might expect this to tap into the same processing as an MIT task, the relationship between the two is unclear, as there is evidence that attentional tracking can occur along a different trajectory than that which is the basis of updating the memory of an object's features."

Reviewer 2 PDF annotation 4. The reviewer pointed out a very unclear part of a sentence, "the corresponding dimension", so I've simply deleted that phrase. It did mean something but it's unimportant for this article; it referred to a detail that a motivated reader can learn about by reading the cited paper.

Addressing Reviewer 1's Comments
''Reviewer 1 Comment 1. The work of Yantis (1992, Cognitive Psychology) should be discussed. This study demonstrated that the tracking performance critically depends on whether the targets can be maintained as a coherent but nonrigid virtual polygon. The tracking is significantly disrupted when that visual interpretation is violated (e.g., a vertex of the virtual polygon crosses over an opposite edge of the polygon).''

Thanks to the reviewer for pushing me to revisit this paper. To accommodate it and related work, I have added a new section on grouping.

''Reviewer 1 Comment 2. The work of Franconeri, Jonathan, and Scimeca (2010; Psychological Science) should be discussed. This study argued that MOT is limited only by object spacing (how close together objects can be), but not by speed, time, or capacity. This opinion is somewhat extreme and may not be completely right, but it has made an important point about the role of object spacing.''

Object spacing's effects speak to spatial interference, which is addressed in the "Spatiotemporal limits" section of the manuscript. The Franconeri et al. study, along with many related follow-up studies, has contributed to our understanding of spatial interference, with extensive discussion provided in the book of mine cited in that section. Combining the findings of the follow-up studies, as the reviewer may have been hinting, indicates that the strong Franconeri, Jonathan, & Scimeca (2010) claim is wrong, and the present manuscript cites the work that overturned that conclusion and supported the new, more nuanced understanding. As a result of that later work, it no longer seems important to me to discuss that paper explicitly in a Wikipedia article, as Wikipedia articles prioritize the historical development of theories even less than a scientific article might (unless the Wikipedia article is actually about that theory). I certainly could discuss this specific paper, and in fact I discuss it explicitly in the book that I cited, but I think such discussion would be only of historical interest (as an ambitious hypothesis that was subsequently disconfirmed), so in my opinion it is not a judicious use of space in this proposed Wikipedia article.

Thanks again to the reviewers and I hope this has shaped up to be a high-quality candidate Wikipedia article.

best,

Alex

Notes for future work
ADD SOMETHING ON AUTONOMY FROM COGNITION? But might need to cite my CogSci paper.

Draw on my CogSci paper on relation to feature attention, spatial attention, bottom-up attention, and disabling those to isolate spatial selection.

Visual routines: Pylyshyn, in his last book, seems to endorse Boolean map theory without naming it (p.127 of my pdf), in that you can use FINSTed locations as input to recognize spatial configurations. He also refers to using them "to perform visual operations on them, such as scanning focal attention between them" (p.126). So he seems to buy into not knowing which is which but them still being useful.