User:Wugapodes/WikiBreathing

Abstract
In this paper I develop a theory of "WikiBreathing" based on the historical records of three wikis: WikiWikiWeb, MeatBall, and Wikipedia. WikiBreathing was first documented on MeatBall as a model for understanding how participants on WikiWiki responded to increased attention brought on by the wiki boom of the mid aughts. Many large wikis saw increased engagement from cultural outsiders and bad faith actors which resulted in the wikis "breathing out" by repelling new contributors through technological means before "breathing in" and gradually opening themselves up to new contributors again. Using subsequent developments in wiki history, I reframe the concept of WikiBreathing as a pan-wiki process which predicts the outcomes of cultural encounters between wikizens and the super-culture they are situated within. Following the redevelopment of WikiWiki as a FederatedWiki and the reopening of MeatBall in the early 2020s, I propose that contemporary reader interactions are best described by a period of "breathing in". Finally, I apply this theory to how Wikipedia readers respond when they are allowed to edit certain high-traffic articles for the first time in over a decade.

Introduction
In this paper I develop a theory of "WikiBreathing" based on the historical records of three wikis: WikiWikiWeb, MeatBall, and Wikipedia. WikiBreathing was first documented on MeatBall as a model for understanding how participants on WikiWiki responded to increased attention brought on by the wiki boom of the mid aughts. Many large wikis saw increased engagement from cultural outsiders and bad faith actors which resulted in the wikis "breathing out" by repelling new contributors through technological means before "breathing in" and gradually opening themselves up to new contributors again. Using subsequent developments in wiki history, I reframe the concept of WikiBreathing as a pan-wiki process which predicts the outcomes of cultural encounters between wikizens and the super-culture they are situated within. Following the redevelopment of WikiWiki as a FederatedWiki and the reopening of MeatBall in the early 2020s, I propose that contemporary reader interactions are best described by a period of "breathing in". Finally, I apply this theory to how Wikipedia readers respond when they are allowed to edit certain high-traffic articles for the first time in over a decade.

Eternal September and the dynamics of reader incursion
The history of wikis is inextricable from the history of the early internet. Prior to the turn of the millennium, reliable access to inter-networked computers was not widespread. World Bank data estimate that in 1995 fewer than 10% of the United States population and fewer than 1% of the global population were internet users. Generally speaking, access to computers and the early internet was restricted by largely sociological barriers. To use the internet effectively, early internet users needed the funds, technological expertise, and free time to engage in what was still a relatively niche technology. Because of systemic inequality present in the wider culture, this early internet population was similarly stratified. By 2000, a majority of white Americans (53%) had internet access compared to 38% of Black Americans; 81% of Americans making over $75,000 annually used the internet, compared to 34% of those making under $30,000; of those who never attended college, only 40% used the internet compared to 67% of those who had at least some college education (Pew Research 2021). While the early internet was populated largely by high-income, educated, white anglophones, these demographic gaps have consistently been narrowing.

The mid 1990s marked a paradigm shift in this technocracy ushered in by America Online (AOL). AOL was an internet service provider that pioneered home internet access through the free distribution of software that would set up an internet subscription for users. This software was mass mailed to homes and was incredibly successful in recruiting new subscribers. AOL's former Chief Marketing Officer describes the scale of this operation: "At one point, 50% of the CD’s produced worldwide had an AOL logo on it. We were logging in new subscribers at the rate of one every six seconds" (Siegler 2010). In September 1993 AOL introduced a new feature for its subscribers which allowed them easy access to Usenet news groups. For the existing Usenet culture, this would become a turning point referred to in digital folklore as Eternal September.

As the internet population diversified, these early communities were faced with increasing and more intense reader incursions. Early internet communities, like all communities, had norms of social conduct which were expected to be followed. New members of the community are gradually socialized into the expectations of the culture before becoming culturally competent members of the community. These community socialization processes have a finite bandwidth and are generally limited by the ratio of existing members to prospective new members. In the online context, these new members are largely readers who try to move from that role to a more active role as a member of the group. If some external condition causes readers to simultaneously change their role, the internet community can become inundated; as the reader incursion grows in size and duration, the pre-existing community cannot adequately socialize them all and elite members become overworked. Ultimately, the community perceives an existential threat and takes steps to protect itself from the reader incursion.

Generally, growth is an explicit goal of online communities, and research has largely focused on the interventions which effectively recruit or discourage newcomers, but what happens when these recruitment efforts are particularly successful? The effects of reader incursion are understudied in the digital humanities, and this gap has implications for the success of interventions. When recruitment interventions succeed, a particular cultural encounter occurs: the existing community structure must accommodate a large influx of newcomers unfamiliar with its cultural norms. If the community cannot scale, then the recruitment intervention can counterintuitively harm the project and the community behind it. Conversely, communities which can scale to accommodate a large influx of new members will be better able to turn newcomers who explore the community into regular members. Kiene, Monroy-Hernández, and Mako Hill (2016) conducted interviews with a stratified sample of members of Reddit's r/NoSleep forum to investigate how that community responded to a sudden growth in popularity. Through these interviews, the authors identified three systems that allowed the community to effectively integrate its new participants:
 * 1) Strict and consistent enforcement of explicit community rules by moderators
 * 2) Participation of long-time members in moderating the community through Reddit's voting system
 * 3) Technical infrastructure which eases the maintenance burden on moderators

Contrary to previous work (Butler, Joyce, and Pike 2008; Halfaker, et al. 2012), Kiene et al. suggest that social regulation is critical to a community surviving a reader incursion. The data from r/NoSleep demonstrate how communities can succeed in managing a reader incursion, and in the following section I complement this with data from wikis which did not survive reader incursions. By investigating these failure modes, we can construct a more robust theory of how communities can effectively scale during periods of rapid growth.

The first decade of wikis: 1995 to 2005
Known as WikiWiki, the first wiki was founded in 1995 by Ward Cunningham as a companion site to the Portland Pattern Repository. Both WikiWiki and the Portland Pattern Repository were dedicated to the documentation of programming patterns, and so their audience and contributor base were programmers. WikiWiki grew quickly, going from roughly 15,000 page views per month in 1995 to 50,000 page views per month in 1997. In that same time, content grew fivefold from 2MB to 10MB. As the community grew, the interests of its members began to broaden as well, with some topics reaching sufficient participation to gradually shift the course of the wiki. This diversification of topics created tension within the community.

The community tension was resolved by community forks, marking the first cycle of wiki breathing at WikiWiki around 2000. With the ballooning size of WikiWiki, a group of community members labeled WikiReductionists collaborated to reduce the size and scope of content on WikiWiki. The WikiReductionists focused largely on meta-commentary, which was referred to internally as "WikiOnWiki" content. As the wider community noticed the coordinated deletions and page refactorings, the community began to debate whether the actions were justified and the proper scope of content on WikiWiki. In the second quarter of 2000, the community held a vote on WikiReductionism recorded at WikiReductionistVotes. While a substantial minority were opposed to WikiReductionism, the majority view was that most content unrelated to the documentation of programming or patterns should be deleted. The communities of practice who maintained these now out-of-scope pages created their own communities to continue their work in a process referred to as forking. Meta-wiki commentary---the center of WikiReductionist tension---found its new home on MeatBallWiki.

MeatBallWiki was created to document the experiences and theories of how collaborative online communities function. Reflecting its roots in WikiWiki, much of MeatBallWiki's content is informed by the social dynamics of WikiWiki, but as the number and popularity of wikis grew, the content began to take into account a greater breadth of data. MeatBallWiki documents the philosophies and theories that underpin wikis developed around this time, and many early Wikipedia policies had their start on MeatBallWiki. For example, the Wikipedia policy en:w:Project:Assume good faith was heavily influenced by the MeatBallWiki article of the same name (Beesley 2004), and the MeatBallWiki article on the RightToVanish was incorporated into Wikipedia's privacy policy until 2004 (Beesley, et al. 2004) and remains a guideline at en:w:Project:Courtesy Vanishing. The interconnection between MeatBallWiki and Wikipedia is largely due to their simultaneous development.

Wikipedia was founded in early 2001, and comprised many members of the early wiki communities including Ward Cunningham and Sunir Shah. Like other wikis, Wikipedia grew quickly, but the scale was unprecedented. The English Wikipedia passed 100,000 articles in 2003 with roughly 500,000 articles across all languages; the following twelve months saw the worldwide Wikipedia double in size to over one million articles across all languages by the end of 2004. Whereas the previous wikis discussed were of rather niche interests and developed by a relatively small group of connected collaborators, Wikipedia was quickly becoming part of popular culture. Wikipedia was cited and quoted in a federal court decision for the first time in 2004 (Peoples 2010), and by 2005 Wikipedia had become the largest and fastest growing educational website (Smilowitz 2005). This period of growth saw the audience quadruple from 3 million to 12 million readers in a single year, and this brought along a similarly large increase in edits. During the 211 weeks between January 2002 and March 2005, the English Wikipedia saw 10 million edits; it would take 25 weeks to get the next 10 million. The English Wikipedia had twice as many edits in the final 9 months of 2005 as in the previous three years.

WikiBreathing and the Eternal 2005
As Wikipedia entered the mainstream, increased attention was drawn to the smaller wiki communities which existed alongside it. As new editors explored the internal policies and meta wiki, these pages would often contain links to their intellectual predecessors on WikiWiki and MeatBallWiki. Wikipedians would follow these links, finding a community substantially different from the Wikipedia culture they came from. This catalyzed reader incursions which strained the ability of existing members to socialize newer members. The prominence of Wikipedia in search results likewise brought spammers who saw wikis as an opportunity to promote their content. Like the Eternal September of Usenet, the success of Wikipedia brought about a seemingly Eternal 2005 where increased attention on smaller wikis would strain the ability of community members to cope.

Ward Cunningham introduced a new restriction mechanism to WikiWiki in order to offset the effects of this reader incursion. Documented at WikiAccessRestricted, the mechanism restricted or disabled the ability to edit WikiWiki for most readers through an edit code word system. During periods of high server load, or when the reader was accessing the wiki from a "bad neighborhood", editors submitting a change were presented with a challenge similar to a captcha. The editor was asked to enter a code; sometimes the system provided that code, but in a format that computers could not easily decipher. In response to particularly bad disruption, the code might not be provided at all, preventing editing by certain people. Certain community members were provided with codes by Ward Cunningham that allowed them to edit the wiki even when the system did not present a code to others, functionally restricting contributions to the most trusted contributors. This process of periodically restricting and opening the wiki in response to disruption was termed WikiBreathing by contributors to MeatBallWiki.

Case Study: unprotection on the English Wikipedia
Following the rapid growth of the English Wikipedia in the late aughts, Wikipedia entered a "breathing out" period around 2009 in which editing by readers was made more difficult. This manifested primarily as page protection: similar to the edit code word used by WikiWiki in 2005, those without a particular tenure were excluded from editing particular pages during periods of high disruption. Page protection is a technical restriction applied by administrators which prevents changes by editors who do not meet some predefined criteria; this article focuses on semi-protection, which rejects changes submitted by editors without an account (called IPs as they are identified by their IP address) or by editors whose account has less than 4 days tenure or fewer than 10 edits (that is, accounts which have not yet earned the autoconfirmed user right). Pages can have three independent types of protection---edit, move, and creation---and the protection can be for either a finite length of time or indefinite.
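
The semi-protection rule just described can be expressed as a small predicate. The following is only an illustrative sketch, not MediaWiki's implementation: the `Editor` record and both function names are hypothetical, and it assumes that an account must meet both thresholds (4 days of tenure and 10 edits) before it counts as autoconfirmed.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# Thresholds taken from the description above.
MIN_TENURE = timedelta(days=4)
MIN_EDITS = 10


@dataclass
class Editor:
    """Hypothetical record of an editor; `registered` is None for IP editors."""
    registered: Optional[datetime]
    edit_count: int = 0


def is_autoconfirmed(editor: Editor, now: datetime) -> bool:
    """Assumed rule: the account exists and meets both thresholds."""
    if editor.registered is None:
        return False  # editors without an account are never autoconfirmed
    old_enough = now - editor.registered >= MIN_TENURE
    experienced = editor.edit_count >= MIN_EDITS
    return old_enough and experienced


def can_edit_semi_protected(editor: Editor, now: datetime) -> bool:
    """Semi-protection rejects every editor who is not autoconfirmed."""
    return is_autoconfirmed(editor, now)
```

Under this sketch an IP editor is always rejected, as is an account that is old enough but has too few edits, or one with many edits that was registered only yesterday.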

During this period of breathing out, many pages on the English Wikipedia became protected due to disruptive editing and long-term abuse. One such incident involved an editor who vandalized pages by moving them to variations on the name of the character Hagrid from the Harry Potter series. Administrators move-protected many pages for an indefinite period, and because of the minimal risk of move protection, these protections were often forgotten about even after more targeted anti-abuse systems were developed. In May 2021 the author participated in a community effort to review these and other indefinite move protections which had been in place for over a decade. During this review, a substantial number of pages were found to be not only move-protected but edit-protected. As part of the review process, the author also reviewed the edit protections to evaluate whether continued protection was justified, and pages were unprotected based on the author's judgment and experience as an English Wikipedia administrator.

Page protections are a last resort to stop disruption, and in rare cases pages are protected indefinitely rather than for a set time period. Many of these protections were likely forgotten about: placed on pop-culture pages that have faded out of the public eye or to combat widespread problems that have long-since been resolved. While protection stops disruption, that property makes it hard to determine when it is safe to remove protection. Since the only way to know is to remove protection and watch, pages can wind up protected for much longer than intended or even needed. The effects of page protection are immediate and obvious, so the documentation and policies of the English Wikipedia are quite robust. Editors and administrators have had ample opportunity to discuss when and why pages should be protected, and generally these policies recommend the least restrictive intervention necessary to stop disruption. This norm produces a kind of WikiBreathing similar to the edit code word introduced on WikiWiki in 2005 where contributions are restricted for brief periods based on the level of disruption.

Long-term protection of high-visibility articles represents a qualitatively different kind of WikiBreathing, more akin to the closures of MeatBallWiki and WikiWiki in 2013 and 2015, respectively. In these cases contributors are not simply delayed or temporarily inconvenienced; they are turned away outright. While indefinite page protection is not as extreme as making a wiki read-only, it still represents a marked shift from the traditional ways in which wikis work: consider that every page to be discussed has been protected for longer than Wikipedia had existed when the protection was applied. The WikiBreathing model was not developed for this time scale, and the implications of "breathing in" after a decade of "breathing out" are not well understood. In order to motivate a more robust theory of WikiBreathing, this section reports the results of 7 unprotections performed by the author in May 2021 and monitored for the following three months.

Methodology
An initial list of pages was created by Wikipedia editor Cryptic based on a database query using the Wikimedia Quarry interface. The list selected pages in the database which were indefinitely move protected and which were protected before 2010. A random subset of this list was selected by the author for move protection review and as of writing the first 55 have been reviewed. Of the 55 move protected articles, 19 were edit protected as well.
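
The selection step can be sketched in a few lines. This is not the Quarry query written by Cryptic, which is not reproduced here; it is a hypothetical in-memory equivalent in which the `Protection` record, its field names, and both functions are assumptions made for illustration.

```python
import random
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Protection:
    """Hypothetical protection record; `expiry` is None when indefinite."""
    title: str
    kind: str              # "move", "edit", or "create"
    protected_on: date
    expiry: Optional[date]


def select_candidates(protections: List[Protection],
                      cutoff: date = date(2010, 1, 1)) -> List[str]:
    """Titles that are indefinitely move protected since before the cutoff."""
    titles = {
        p.title
        for p in protections
        if p.kind == "move" and p.expiry is None and p.protected_on < cutoff
    }
    return sorted(titles)


def sample_for_review(titles: List[str], k: int, seed: int = 0) -> List[str]:
    """Random subset of at most k titles; the seed only makes the draw repeatable."""
    rng = random.Random(seed)
    return rng.sample(titles, min(k, len(titles)))
```

Edit protections on the sampled pages would then be looked up separately, since move and edit protection are independent of one another.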

Pages were unprotected based on a holistic assessment by the author, informed by administrative experience in page protection. As a result, firm criteria are not established, but basic principles can be articulated as to when unprotection was considered.


 * Considerations favoring removal of protection
 * Pages on popular culture topics were more likely to be unprotected. Often these pages experienced disruption due to the topic's presence in media and increased readership. A decade later, many of these pages are no longer in the public eye, suggesting that the original need for protection has passed.
 * Pages which were protected solely due to their page counts were more likely to be unprotected. English Wikipedia sentiment has changed since 2009, and the community in 2021 is not particularly tolerant of preemptive protection. Further, pages popular in 2009 have likely waned in popularity, and what was high visibility then may not be high visibility now. Given this combination, removing protection was considered a worthwhile test of its necessity.
 * Pages on general topics were more likely to be unprotected. While the increased visibility suggests that disruption may be more likely, that visibility brings with it more watchers who can quickly respond to incidents. This mitigated the likely damage that vandalism would cause, and if necessary the page could simply be protected again.


 * Considerations favoring retaining protection
 * Pages which were biographies of living people were less likely to be unprotected. The risk of harm to living people is substantial, and so these pages were treated with additional caution.
 * Pages whose main topic is politically fraught were less likely to be unprotected. Articles on political topics frequently see disruption from partisans looking to further or express their point of view, and so the success of unprotection was less likely than for politically neutral topics.

Of the 19 protected pages, 7 were unprotected. These pages were actively monitored for 31 days, with edits from IPs and new editors checked manually by the author once per day; the edits were reverted or improved, and the editors warned or welcomed, as necessary. Following the month of active monitoring, these interventions were reduced. If disruption was insignificant, the page was passively monitored, with observations made roughly once every few weeks and rare intervention. For pages which experienced disruption, monitoring was continued with reduced frequency and intervention; if the page experienced a large number of edits from IPs or new editors, or if established editors were repeatedly reverting these edits, the edits would be checked by hand, but rarely reverted unless they were blatant vandalism.

Edits were evaluated based upon whether they were "good faith" or "bad faith". Good faith edits are those which were intended to be helpful, and per the Wikipedia policy en:w:Project:Assume good faith, edits are considered good faith unless there is evidence to the contrary. Good faith edits are not necessarily unproblematic edits---often these edits are imperfect and require intervention from more experienced editors---rather the goal of the good faith editor seems to be improvement even if not to typical publication standards for Wikipedia. By contrast, bad faith edits are those which actively try to damage the encyclopedia. The most typical example is vandalism such as removing all content from the page ("page blanking") or inserting expletives.

Edits in either category were evaluated on different metrics. Good faith edits are generally evaluated on the success of their outcomes. The most successful are those which were retained in whole or in part without another editor reverting the contribution. Some edits may not be retained after discussion, and so even if the contribution is not accepted, they can be considered a success if the new editor engaged in what English Wikipedia editors call the bold, revert, discuss cycle. In this case, a new editor makes a bold change, is reverted by another with some explanation in the edit summary, and discussion continues either on the talk page or in the summary of a subsequent edit. Other edits were not retained but spurred further development. For example, a good faith editor might insert a question directly into the article. While this is not appropriate and will be reverted, the question may prompt established editors to improve the article so that the question is answered. Other measures of success are talk page interactions, whether the editor received a welcome, and edit.

Bad faith edits are evaluated on their severity and the time they were visible to readers. Changes which are low visibility, such as changes to non-visible content or the introduction of misspellings, are not as severe as page blanking or introducing expletives or slurs into the top of the article. While bad faith edits are always unacceptable, minor vandalism is less damaging than severe vandalism given similar times. That said, minor vandalism present for a long period can become a substantial problem due to its longevity. Similarly, severe vandalism which is instantly reverted is less damaging given that its transience made it less visible. For this reason the length of time before a revert is taken into consideration when evaluating the level of disruption particular bad faith edits create.

Finally, the articles are evaluated on how unprotection has affected them as a whole based on two metrics: stability and growth. Stability is the edit rate following unprotection compared to the same period before unprotection. For articles which were already stable, we would expect that stability to continue as readers would have relatively little to contribute. Growth represents how much the page has grown since unprotection. These two metrics are related but distinct, and it is possible for an article to be stable while experiencing significant growth. The main distinction is that growth represents how unprotection has prompted article improvement, while stability represents how much "noise" was added to the history following unprotection.
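
One way to make the two metrics concrete is sketched below. The text does not fix a formula, so the definitions here are assumptions: stability is taken as the ratio of edit counts in equal-length windows after and before unprotection, and growth as the change in page size in bytes. The function names and the `(timestamp, size)` revision format are hypothetical.

```python
from datetime import datetime, timedelta
from typing import List, Optional, Tuple

# Hypothetical revision format: (timestamp, page size in bytes after the edit).
Revision = Tuple[datetime, int]


def stability_ratio(revisions: List[Revision], unprotected_at: datetime,
                    window: timedelta) -> Optional[float]:
    """Edits in the window after unprotection divided by edits in the
    equal-length window before it; None if there were no edits before."""
    before = sum(1 for ts, _ in revisions
                 if unprotected_at - window <= ts < unprotected_at)
    after = sum(1 for ts, _ in revisions
                if unprotected_at <= ts < unprotected_at + window)
    if before == 0:
        return None  # ratio undefined without a baseline
    return after / before


def size_before(revisions: List[Revision], moment: datetime) -> Optional[int]:
    """Page size left by the latest revision strictly before `moment`."""
    earlier = [r for r in revisions if r[0] < moment]
    if not earlier:
        return None
    return max(earlier, key=lambda r: r[0])[1]


def growth_bytes(revisions: List[Revision], unprotected_at: datetime,
                 as_of: datetime) -> Optional[int]:
    """Change in page size between unprotection and `as_of`."""
    start = size_before(revisions, unprotected_at)
    end = size_before(revisions, as_of)
    if start is None or end is None:
        return None
    return end - start
```

Under these assumed definitions a ratio near 1 indicates an edit rate similar to the period before unprotection, while a page can show a ratio near 1 and still have large positive growth, which matches the distinction drawn above between noise in the history and article improvement.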

Cattle

 * Cattle (protected due to vandalism on 30 May 2008; unprotected 2 May 2021) 25 edits in 7 days after unprotection (prior 25 edits date to January). Two IPs made good faith but net negative changes which were reverted. Another was warned for test edits which stood for about one minute. Four IPs vandalized: one was immediately reverted by ClueBot, two were reverted within a minute by recent changes patrollers, and the fourth was reverted by a helpful IP after about 30 minutes. One IP and three registered users have made constructive, non-revert edits after nearly a week. The page was temporarily semi-protected on May 11 following a number of disruptive edits that were not promptly reverted.

Dance

 * Dance (protected due to vandalism on 15 January 2009; unprotected 3 May 2021) 6 edits since unprotection (last six edits date to February). One new user made productive contributions and was welcomed. An IP added personal opinion which was removed about three hours later. No bad faith edits after 8 days.

David Beckham

 * David Beckham (protected due to vandalism on 7 January 2009; Beckham retires from football May 2013; unprotected 2 May 2021). Two editors (one IP) were warned for making test edits; both edits were reverted after about 12 hours. An IP vandalized the article and was reverted by a recent changes patroller in about a minute. An IP made a good faith edit which led to a removal of poorly sourced information. Another editor removed a poorly sourced statement due to (presumably) increased PageChurn. An anonymous editor introduced unverified material on May 10 and 11 which was not promptly reverted. To prevent BLP violations, the page was pending changes protected for two weeks.

Dragon

 * Dragon (protected due to vandalism on 17 January 2009; unprotected 4 May 2021) An unconstructive edit was immediately reverted by ClueBot. An editor added an unreliable reference which was reverted by a presumed page watcher.

Flower

 * Flower (protected due to vandalism on 13 January 2009; unprotected 3 May 2021) Within 24 hours of unprotection, an IP was warned for making a test edit in which they intentionally misspelled a section heading; the edit was reverted 15 hours later. No other edits afterwards.

Frog

 * Frog (protected due to vandalism on 5 January 2009; unprotected 2 May 2021; protected indefinitely 3 May 2021) The page was unprotected for roughly 40 hours and experienced 11 edits. Of those, 7 edits were from spamming sock puppets and the remaining 4 were reverts. Because of the volume and persistence of the problem, the protection was reinstated.

To edit
At this point, data are limited but point towards generally positive outcomes. Of the six test cases, only one had immediate justification for reprotection, and anonymous editors have generally interacted positively with the remaining five articles. The most questionable is Cattle which has experienced disruption roughly once a day, but while anonymous editors have made both positive and negative contributions, existing soft-security processes such as recent changes patrolling and page watching have removed the negative contributions rather quickly. While many positive contributions were ultimately rejected, others were reworked or led to further improvements by more experienced editors all while introducing readers to the editing interface.

This increase in "reader engagement" with the editing interface likely comes even from the "negative" contributions such as test edits. While an apparent majority of anonymous edits were tests that ultimately required volunteer labor to fix, they also provided an opportunity to WelcomeNewcomers who seem interested if not yet constructive. Similarly, the test edits were generally "conservative", manifesting as changes to single words or digits. While nonetheless disruptive, they are easily reverted and help identify readers who might be productive if they were given direction. For this reason, it may be better to use the {{subst:Welcome}} series of templates (e.g., {{subst:welcome-anon-test}}), which provide links to introductory material, instead of {{subst:uw-test1}}, which links them to the sandbox and provides little direction for their curiosity. Perhaps uw-test1 could be improved by adding a "call to action" similar to.

Preliminary results are consistent with the theory of WikiBreathing and suggest that the way readers interact with our content has changed since 2009. The process of WikiBreathing was first documented on WikiWiki where the ability to contribute was periodically restricted and then loosened to manage wiki growth. As a model of the WikiLifeCycle, it theorizes that wikis go through a period of growth until a point where the community cannot cope with the influx of new members. At this point the wiki begins to "breathe out" by restricting contribution and allowing a net outflow of contributors. This is consistent with the history of Wikipedia prior to 2008 and the period of protections being reviewed. 2009 marks the peak of Wikipedia's entrance into the mainstream, with community size beginning to fall from that point (see Wikipedia). As the encyclopedia and wikis as a concept entered the mainstream, high-visibility pages became targets for test edits and general vandalism as the public began to experiment with an encyclopedia that anyone can edit. The existing community was not large enough to incorporate this influx of contributors and so the soft-security processes broke. To protect the encyclopedia, Wikipedia began to breathe out and make pages more difficult to edit. This trend continued into 2011 when the community added inactivity criteria for administrators (yielding our single highest year-over-year drop in administrators on record) and asked the Wikimedia Foundation to restrict article creation to auto-confirmed accounts (which the Foundation refused until 2017). This sequence mirrors other wikis (in direction, if not magnitude) such as MeatBall and WikiWiki which revoked editing access following major vandalism attacks between 2010 and 2015. The pages present in the data above are moderate- to high-visibility pages on major topics of public interest protected due to vandalism during a period of WikiExhaling, but have circumstances changed enough that continued WikiExhaling is actually harmful to the project?

While the protection accomplished its goal, the data present the possibility that the current effects are a net negative. The obvious harm is in preventing good faith anonymous edits. While many anonymous edits were not retained in full, readers were able to point out areas of the article that needed more attention through action rather than being required to write an essay on the talk page. Less apparent, though, is its effect on PageChurn and whether the increased presence on Special:RecentChanges increases edits from advanced editors. Preliminary results at David Beckham support the idea that the increased PageChurn positively affects articles. That article saw substantive copyedits by two editors new to the page (no edits to it within two years), and both occurred soon after an IP edit. While the page had been edited and expanded, the IP edits seemed to have instigated copy-editing distinct from what seems to have largely been writing and expansion over the last few edits. This suggests that the PageChurn brings controversial or sub-optimal sections of articles to the attention of experienced editors who are then able to apply the appropriate policies and improve the article. Long-term protection seems to have harmed these pages by making them harder to find and reducing how often regular editors are drawn to them, ultimately reducing the level of policy-informed copy-editing they receive. More data from more pages and for longer periods of unprotection are needed.