Vtuber Technology and History

Virtual YouTubers, also known as Vtubers, are online entertainers who use virtual avatars as their on-screen personas. Vtubers originated in Japan in 2016 and have grown steadily in popularity ever since. An avatar gives content creators a way to attach a face and a brand to their content without revealing their identities, or simply to match their appearance to the character they wish to portray. Real-time motion capture software tracks the user’s movements and makes the avatar move in the same way.

Model Art and Cutting
When creating a Vtuber avatar, artists need to account for the fact that their art will be viewed from many different angles. A traditional illustration only needs to work from one perspective, in one pose, on one layer. Because a Vtuber model moves to match the user, art drawn that way would show gaps wherever movement exposes an area the artist never planned to be seen. Additionally, because every feature would be flattened into a single image, the model would be extremely limited in its movement.

To realistically animate all of the ways a model can move, from blinking to breathing to smiling, artists draw the elements of a model on separate layers in a process called material separation, also known as cutting. Every piece of the model that is intended to move must be cut onto its own layer, including any reflections, shading, or highlights. The degree of material separation varies with the skill of the artist and the price of the model, but in general, the more detailed the cutting, the higher the quality of the model.

At the most basic level, cutting falls into three categories: hair, face, and body.

Hair
Hair is separated into three parts: the bangs, the sides of the hair, and the back.

Typically, the back can be drawn as one piece, as the majority of the movement will be in the bangs and the sides. The back is mostly used to complete the full hair shape and to fill in any gaps in the hair that might be caused by movement.

The bangs and the sides are typically separated into different chunks of hair depending on the specific hairstyle.

Face
The face is separated into five parts: the eyes, the mouth, the nose, the eyebrows, and the ears.

Because the Vtuber trend originated in Japan, the majority of models are drawn in an anime art style. This style tends not to put much detail into noses or eyebrows, and their shapes do not change significantly during movement, so it is enough for them to be separate from the rest of the head. The same applies to the ears, though they tend to be shaded more heavily than the nose and eyebrows, so they should be on a separate layer as well.

The mouth needs to be separated into the top lip, the bottom lip, and the inside of the mouth. The inside of the mouth must fully cover the space between the lips, so its shape is drawn slightly larger than the visible opening. So that the lips fully cover that larger shape, it is also recommended to add a skin-colored area to the lip layers.

Anime-style characters tend to put a lot of focus and detail into the eyes, and the eyes are the feature viewers look at most on a content creator’s face, so Vtuber eyes tend to have the most detail as well as the most complicated material separation. At minimum, the eyelashes, the whites of the eyes, the irises, and any reflections or moving designs in the eye must be separated. If the model’s eyes have any shading or makeup as part of the design, those will need to be separated as well.

Body
The body is separated into four main parts: the neck, torso, arms, and legs. However, the degree of separation in the body varies more from model to model than in any other area, even among the highest quality models, because it depends entirely on the details of the outfit. For example, a jacket with a lot of shiny metal details will require more separation than a bodycon dress that simply moves with the user. This article focuses on the four areas mentioned above.

The neck needs to be drawn larger than what will be shown so that it does not appear cut off when the face moves. It is recommended that the neck reach up to the character’s mouth. The torso can be drawn as one piece, or separated into layers depending on the complexity of the outfit and the desired movement effects. The arms and legs need to be separated into left and right sides. The legs can be drawn as one piece or separated at the knee, while the arms should be separated into the upper arm, the forearm, and the hand. Because streamers’ models are seen from the bust up the majority of the time, more intricate designs and movement tend to be saved for the upper half of the body.
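
To make the cutting described above easier to picture, here is a minimal sketch in Python of the separation as a tree of layers. The Layer class and the group helper are hypothetical names invented for this illustration, and the layer list is simplified (a real model would, for example, cut the left and right eyes separately and add shading and highlight layers).

```python
from dataclasses import dataclass, field


@dataclass
class Layer:
    """One separately drawn piece of the model (hypothetical structure)."""
    name: str
    children: list["Layer"] = field(default_factory=list)

    def leaves(self) -> list[str]:
        """Return the names of the layers that are actually drawn."""
        if not self.children:
            return [self.name]
        names: list[str] = []
        for child in self.children:
            names.extend(child.leaves())
        return names


def group(name: str, *parts: str) -> Layer:
    """Build a folder-like layer that holds several drawn pieces."""
    return Layer(name, [Layer(part) for part in parts])


# Simplified cut list following the hair / face / body breakdown.
model = Layer("model", [
    group("hair", "bangs", "sides", "back"),
    Layer("face", [
        group("eyes", "eyelashes", "eye whites", "iris", "reflections"),
        group("mouth", "top lip", "bottom lip", "mouth interior"),
        Layer("nose"),
        Layer("eyebrows"),
        Layer("ears"),
    ]),
    Layer("body", [
        Layer("neck"),
        Layer("torso"),
        group("left arm", "left upper arm", "left forearm", "left hand"),
        group("right arm", "right upper arm", "right forearm", "right hand"),
        Layer("left leg"),   # legs drawn as one piece each in this sketch
        Layer("right leg"),
    ]),
])

print(len(model.leaves()), "separately drawn layers")
```

Even this bare-bones breakdown produces a couple dozen layers, which gives a sense of why more detailed cutting raises both the quality and the price of a model.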

Rigging
Vtuber models can be incredibly expensive, with a 2D model costing around $2000 on average and around $5000 on the higher end. The majority of the cost comes from the process known as rigging. Rigging is known for being incredibly difficult to learn, which is why the most well-known riggers can charge thousands of dollars for their work. At the most basic level, rigging is the process of adding bones and joints to a model so that it can move.

Live2D, the software commonly used to rig 2D models, has four areas to work with: parts, deformers, inspector, and parameters. Parts lists all of the different parts of your model. It is usually organized as folders, one per part of the model, with each folder containing all of that part’s layers, known as textures in Live2D, and showing how they are layered from front to back.

Deformers allow you to bend and shape the art as it moves. For example, if the eyebrows of a model need to change shape to create an angry face, you would bend the eyebrows using deformers. This area also shows the deformer hierarchy of the model, which determines the order and priority in which elements of the model move.
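
The idea of a deformer hierarchy can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not how Live2D itself is implemented: the RotationDeformer class is a hypothetical, rotation-only stand-in, and the pivots and angles are made up. The point is that a child deformer moves its art first, and then every parent above it moves the result again.

```python
import math
from dataclasses import dataclass
from typing import Optional

Point = tuple[float, float]


@dataclass
class RotationDeformer:
    """Hypothetical rotation-only deformer with an optional parent."""
    name: str
    pivot: Point
    angle_deg: float = 0.0
    parent: Optional["RotationDeformer"] = None

    def apply(self, point: Point) -> Point:
        """Rotate the point around this pivot, then hand it to the parent."""
        angle = math.radians(self.angle_deg)
        dx = point[0] - self.pivot[0]
        dy = point[1] - self.pivot[1]
        rotated = (
            self.pivot[0] + dx * math.cos(angle) - dy * math.sin(angle),
            self.pivot[1] + dx * math.sin(angle) + dy * math.cos(angle),
        )
        return self.parent.apply(rotated) if self.parent else rotated


# The bangs sit under the head in the hierarchy, so turning the head
# carries the bangs along even though the bangs deformer is at rest.
head = RotationDeformer("head", pivot=(0.0, 0.0), angle_deg=90.0)
bangs = RotationDeformer("bangs", pivot=(0.0, 1.0), angle_deg=0.0, parent=head)

tip_of_bangs = (0.0, 2.0)
print(bangs.apply(tip_of_bangs))
```

Ordering the hierarchy this way is what lets small movements, such as hair sway, layer on top of large ones, such as a head turn, without the rigger animating every combination by hand.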

The inspector allows you to control the opacity, clipping, and blending of your textures. It is used to ensure that art elements are shown or hidden when they are supposed to be in any given animation.
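
For readers curious what opacity, clipping, and blending mean numerically, the sketch below uses formulas commonly found in image compositing, applied to a single color channel in the range 0 to 1. The function names are hypothetical and the formulas are general-purpose assumptions, not a description of Live2D’s own rendering.

```python
def blend_normal(dst: float, src: float, opacity: float) -> float:
    """Ordinary layering: the top layer covers the bottom by its opacity."""
    return src * opacity + dst * (1.0 - opacity)


def blend_multiply(dst: float, src: float, opacity: float) -> float:
    """Darkening blend, often used for shading layers."""
    return dst * (1.0 - opacity) + dst * src * opacity


def blend_additive(dst: float, src: float, opacity: float) -> float:
    """Brightening blend, often used for highlights and glows."""
    return min(1.0, dst + src * opacity)


def clipped_opacity(opacity: float, mask_alpha: float) -> float:
    """Clipping: a layer only shows where its mask layer has coverage."""
    return opacity * mask_alpha


skin = 0.8      # channel value of the layer underneath
shadow = 0.5    # channel value of the layer on top

print(blend_normal(skin, shadow, opacity=0.5))
print(blend_multiply(skin, shadow, opacity=1.0))
print(blend_additive(skin, shadow, opacity=1.0))

# An iris clipped to the eye white disappears wherever the eye white
# has no coverage, for example when the eyelid closes over it.
print(clipped_opacity(1.0, mask_alpha=0.0))
```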

Parameters define the extremes of the model’s movement. The face tracking software you use relies on the parameters you’ve set to determine how far the model moves when you move, and the parameters area is also where you set the key forms that animate the model.
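
As a rough illustration of how a tracked movement could turn into model movement, the sketch below maps a tracker reading onto a parameter’s range and then blends between the key forms set on that parameter. Everything here is an assumption made for the example: the function names, the ranges, and the use of straight linear interpolation are not taken from any particular tracking program or from Live2D.

```python
def to_parameter(tracked: float, tracked_min: float, tracked_max: float,
                 param_min: float, param_max: float) -> float:
    """Map a tracker reading into the parameter range, clamped to its extremes."""
    t = (tracked - tracked_min) / (tracked_max - tracked_min)
    t = max(0.0, min(1.0, t))
    return param_min + t * (param_max - param_min)


def interpolate_keyforms(value: float,
                         keyforms: list[tuple[float, list[float]]]) -> list[float]:
    """Blend linearly between the two key forms surrounding the value.

    Each key form is (parameter value, list of vertex coordinates).
    """
    keys = sorted(keyforms, key=lambda key: key[0])
    if value <= keys[0][0]:
        return list(keys[0][1])
    if value >= keys[-1][0]:
        return list(keys[-1][1])
    for (v0, shape0), (v1, shape1) in zip(keys, keys[1:]):
        if v0 <= value <= v1:
            t = (value - v0) / (v1 - v0)
            return [a + t * (b - a) for a, b in zip(shape0, shape1)]
    raise ValueError("key forms do not cover the requested value")


# Hypothetical head-turn parameter running from -30 to 30, with key forms
# that shift a single nose vertex left, to the center, and right.
nose_keyforms = [(-30.0, [-4.0]), (0.0, [0.0]), (30.0, [4.0])]

# The tracker reports head yaw in degrees, assumed here to span -45 to 45.
angle = to_parameter(15.0, -45.0, 45.0, -30.0, 30.0)
print(angle, interpolate_keyforms(angle, nose_keyforms))

# Turning further than the tracked range cannot push the model past
# the extreme the rigger set.
print(to_parameter(90.0, -45.0, 45.0, -30.0, 30.0))
```

The clamping step is the "extremes" idea in practice: however far you move, the model stops at the last key form the rigger prepared.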

The full range of movement of a 2D model is limited only by the rigger’s skill and by whether the art was prepared for the movement. However, because fully rigging every movement a human being could make would be extremely expensive and time consuming, some Vtubers opt to include toggled expressions in their rigging. A toggle is any feature that can be shown or hidden at the press of a button, including different hairstyles, accessories, or outfits, as well as different facial expressions. These expressions can include anything from crying to blushing to being angry, and they stay active for as long as the user wants. This allows users to portray their model in a certain way without having to consciously hold the facial expression themselves.
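
A toggle can be thought of as a switch that overrides whatever the face tracking reports until it is pressed again. The sketch below shows one way that logic could look; the ExpressionToggles class and the parameter names such as cheek_blush are hypothetical and do not come from any specific streaming software.

```python
class ExpressionToggles:
    """Hypothetical toggle handler: active expressions override tracked values."""

    def __init__(self, expressions: dict[str, dict[str, float]]):
        self.expressions = expressions
        self.active: set[str] = set()

    def press(self, name: str) -> bool:
        """Flip one expression on or off and report whether it is now active."""
        if name not in self.expressions:
            raise KeyError(f"unknown expression: {name}")
        if name in self.active:
            self.active.remove(name)
        else:
            self.active.add(name)
        return name in self.active

    def apply(self, tracked: dict[str, float]) -> dict[str, float]:
        """Start from the tracked values, then layer active expressions on top."""
        result = dict(tracked)
        for name in sorted(self.active):
            result.update(self.expressions[name])
        return result


toggles = ExpressionToggles({
    "blush": {"cheek_blush": 1.0},
    "angry": {"brow_angle": -1.0},
})

tracked_frame = {"mouth_open": 0.3, "cheek_blush": 0.0, "brow_angle": 0.2}

toggles.press("blush")                # blush stays on until pressed again
print(toggles.apply(tracked_frame))

toggles.press("blush")                # second press turns it back off
print(toggles.apply(tracked_frame))
```

Because the override sits on top of the tracked values, the rest of the face keeps following the user while the toggled expression stays in place.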

Kizuna AI
The term “Virtual YouTuber” was coined by a channel by the name of Kizuna AI. She gained popularity in late 2016, amassing over 2 million followers in 10 months. Kizuna AI was originally created under the name “Project A.I.” by the company Activ8. The channel focused on edited videos featuring Q&As, discussions, and video game playthroughs. Her success inspired others in Japan to take on Vtuber models as well, and clips translated from Japanese to English started to spread to western audiences. The number of active Vtubers had doubled by 2018, and when the Covid-19 pandemic hit worldwide in 2020, Vtubers on Twitch and YouTube gained more and more attention.

Corporations
In 2018, AnyColor Inc., then known as Ichikara Inc., founded the Vtuber agency Nijisanji. The company had developed a face tracking app, also titled Nijisanji, and began recruiting Japanese content creators to be virtual idols, now known as virtual livers. The name Nijisanji comes from a combination of the Japanese words for two dimensional (niji-gen) and three dimensional (sanji-gen) and was meant to symbolize a new era of entertainment that combines elements of the two dimensional and three dimensional worlds.

Shortly after, Cover Corporation founded the agency Hololive. Cover Corporation had been developing augmented reality and virtual reality software but pivoted to face tracking technology after being inspired by virtual stars such as the Vocaloid Hatsune Miku and Kizuna AI. Its app, also called hololive, was released, and the company began recruiting its own talents. Hololive is home to many of the most followed Vtubers on YouTube, both English speaking and Japanese speaking, such as Gawr Gura, who is currently the most followed Vtuber overall.

Together, these two companies popularized the use of 2D models over 3D, as well as live streams over edited and scripted videos. They also established branches in places such as India, China, and South Korea, as well as English-speaking branches, in order to capture a more global audience that had not necessarily been exposed to Vtubers.

In 2020, VShojo was established in San Francisco, and it is known as one of the first major western agencies focused on Vtuber content. VShojo is home to some of the most popular English-speaking Vtubers, including Ironmouse, who is currently the most followed Vtuber on Twitch and the most followed female content creator on Twitch as well.

Indie Vtubers
As Vtuber models became more popular and more people learned how to create them, indie Vtubers rose alongside corporate Vtubers. An indie Vtuber is simply anyone who operates without a major corporation behind them. They can design, draw, and rig their own models or hire artists to do so for them.

Indie vs Corporation
There are several reasons an aspiring content creator may choose to pursue either an indie career or a corporate career.

Working under a corporation is ultimately a job in which you have to abide by the company’s rules. For example, the company might limit your creative freedom by placing restrictions or guidelines on the type of content you are allowed to create, or on who you are allowed to create content with. Because you are working for the company, it also receives a cut of the money you make through donations or ad revenue.

Corporations also tend not to give creators any say over the design of the model itself. When hiring new talents, they commission artists to create models ahead of time and hire based on who they think best fits the personality intended for each model.

However, talents backed by corporations start their careers with an instant audience of viewers who trust the company to keep hiring worthy and entertaining talents. For example, Iluna, the 7th wave of the English branch of Nijisanji, was announced on July 19th, 2022 and began streaming on July 24th, 2022. Before they had even officially started, all 6 members of Iluna had passed 70 thousand subscribers on their channels in less than 6 days.

Talents hired by corporations also receive, free of charge, any equipment they need to stream successfully, as well as the model itself, which cuts down on initial costs. Corporate talents are also more likely to be approached with sponsorships, which results in more money for the talents.

They are also better protected in legal matters, as the company’s own legal team can step in in situations such as potential copyright infringement or harassment and defamation of talents.

Working as an indie Vtuber offers the opposite benefits and drawbacks. As an indie Vtuber, unless you have already made a name for yourself as a content creator, you will be building your audience from the ground up. This might have been easier from 2016 to 2020, when Vtubers were still an uncommon niche and simply being a Vtuber made you stand out. Starting in 2024 as an indie is more difficult because the market is now oversaturated with Vtubers. You will also need to pay for all of the equipment you need, as well as the model and any lore videos, starting and ending screens, emotes, and other assets you want for your debut stream.

On the other hand, you are limited only by your own imagination in terms of what content you make, and how and when you make it, while corporate Vtubers follow strict schedules. You also get to keep every cent you make during your career. Additionally, by paying for your own model and visual elements, you are fully in control of how your model appears and of any aesthetics or branding you wish to go with it.

Parasocialism in Vtubers
Parasocialism is defined as a one-sided relationship or bond with someone you don’t actually know. In a lot of cases, the people in question are celebrities or fictional characters. Vtubers sit in the middle: there is a real person behind the model, but that person is often encouraged to keep their real identity as secret as possible. Most corporate Vtubers, including those at Nijisanji and Hololive, are not allowed to reveal their real names, faces, or in some cases even their birthdays. In streams that require their actual hands, such as cooking or sculpting streams, talents are required to wear black gloves covering their hands and arms so as not to break the immersion.

In fact, corporate Vtubers tend to encourage deep parasocial relationships with their audiences. For example, Vox Akuma is an incredibly popular Vtuber under the English branch of Nijisanji. His is the most followed channel in the English branch and the third most followed in the entire company. He gained the majority of his popularity from ASMR streams in which he would pretend to be the viewer’s lover. Another example comes from one of Hololive’s most popular streamers, Uruha Rushia. For her birthday in 2022, Hololive sold a ring inspired by her model, marketed to fans as a “Lifelong Engagement ring”.

Parasocialism as a whole is neither an objectively good nor an objectively bad thing. Parasocial relationships, when pursued in a healthy way, can decrease feelings of stress, anxiety, and loneliness. Because much of the Vtuber trend’s growth came in 2020, when the pandemic hit and people were stuck in quarantine, parasocial relationships with Vtubers had positive effects on viewers.

However, because Vtubers operate by keeping their real identities hidden, it is also a lot easier for fans to forget that there are real people behind those models. Vtubers as a whole tend to lead their viewers towards obsessive parasocialism.

In May of 2022, Vox Akuma did an ASMR live stream in which he pretended to go on a date with the viewer. During that stream, another Nijisanji talent, Reimu Endou, entered the stream and asked him for help with a game she was playing that he was known for being good at. In response, Vox’s fans sent death threats to Reimu for “interrupting their date with their boyfriend.”

In February of that same year, Rushia was live streaming when a fellow singer and YouTuber named Mafumafu messaged her on Discord saying he was heading home. Her fans interpreted this to mean that the two were living together and thus dating, and they sent her death threats for lying about being single. In a now deleted tweet, Rushia stated, “I can’t eat, I can’t sleep, I have trouble walking, there are so many terrible falsehoods that I want to die right now.” She went on to retire from streaming as Uruha Rushia altogether less than 2 weeks after the incident. At the time of her retirement, she was the most super-chatted channel of all time.