Research in programming Wikidata/Ships

This article studies ship objects in Wikidata. Three examples of well-filled and poorly filled "ship" objects are singled out. SPARQL queries over objects of the "ship" type are used to solve the following tasks: listing all ships of the world, and listing ships that took part in military conflicts and are associated with a given country. An estimate of the completeness of Wikidata is also given. The article presents a graph showing the relationship between ships associated with Russia and the military conflicts in which they took part.

Instances of the object "ship"
ship (Q11446) is a large marine vessel.

Wikidata properties considered in the work:
 * instance of (P31);
 * operator (P137);
 * country (P17);
 * conflict (P607).

Let's build a list of all ships with their English labels.

SPARQL-query, 19 820 results (2017), 50 681 results (2020), 71 203 results (2021).
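A minimal sketch of what such a query might look like, written as a small Python helper that only builds the query text. It is not necessarily the query that produced the counts above: the helper name `build_ship_list_query` is hypothetical, and matching only direct instances (P31) of ship (Q11446), without subclasses, is an assumption.

```python
from typing import Optional

SHIP_CLASS = "Q11446"  # ship, as given in the text


def build_ship_list_query(class_qid: str = SHIP_CLASS,
                          language: str = "en",
                          limit: Optional[int] = None) -> str:
    """Return SPARQL text listing direct instances (P31) of a class with labels.

    Assumption: only direct instances are matched; subclasses are ignored.
    """
    query = (
        "SELECT ?ship ?shipLabel WHERE {\n"
        f"  ?ship wdt:P31 wd:{class_qid} .\n"
        "  SERVICE wikibase:label "
        f'{{ bd:serviceParam wikibase:language "{language}". }}\n'
        "}"
    )
    if limit is not None:
        query += f"\nLIMIT {limit}"
    return query


print(build_ship_list_query(limit=10))
```

The printed text can be pasted into the Wikidata Query Service; the `LIMIT` clause is optional and is only useful while experimenting.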

SPARQL-query, 107 results (2017), 578 results (2021).

Completeness of Wikidata
Finding the exact number of ships in the world is a difficult task: data about some ships are classified, and others are private vessels about which no information is published. Suppose that the total number of ships is about 1.6 million, as indicated in the vessel database. The script in the listing returned only 71 203 records, which is about 4.5% of the total number of ships.

As for Russian ships, the actual civil and military fleets include 17 657 ships, while the script in the listing returned only 578 records, which is only 3.27% of the total number of Russian ships.

In both cases, the gap between the actual number of ships and the query results is huge, which indicates that Wikidata is incomplete.
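The two estimates are simple ratios of query results to assumed fleet sizes. The sketch below reproduces the arithmetic using the 2021 result counts quoted with the listings (71 203 and 578) and the totals assumed in the text; the function name `completeness_percent` is hypothetical.

```python
def completeness_percent(found: int, total: float) -> float:
    """Share of known objects covered by Wikidata, in percent."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * found / total


# World fleet: 71 203 query results against an assumed 1.6 million ships.
world = completeness_percent(71_203, 1_600_000)
# Russian fleet: 578 query results against 17 657 ships.
russia = completeness_percent(578, 17_657)

print(f"world:  {world:.2f}%")   # world:  4.45%
print(f"russia: {russia:.2f}%")  # russia: 3.27%
```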

ProWD
Data was collected with ProWD.id, 2020. The graph and Gini coefficient show that completeness is not uniform.

According to the ProWD report, the ship Krasin (Q281147) has the largest number of properties (34). The ships Liven (Q99198666) (5 properties) and Dispatch (Q28155282) (4 properties) have the fewest.
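To make the Gini coefficient concrete, here is a minimal sketch that computes it for a list of per-object property counts. It uses the standard rank-weighted formula; how ProWD computes its own figure is not stated here, so this is an assumption. The three sample values are the counts mentioned above and serve only as an illustration, not as the ProWD dataset.

```python
from typing import Sequence


def gini(values: Sequence[float]) -> float:
    """Gini coefficient of non-negative values (0 means perfectly even)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        raise ValueError("need at least one positive value")
    # Rank-weighted form, equivalent to the mean absolute difference definition.
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


# Property counts of Krasin, Liven and Dispatch (illustration only).
property_counts = [34, 5, 4]
print(round(gini(property_counts), 3))
```

A value near 0 would mean that all ship objects carry a similar number of properties; the larger the value, the more the properties are concentrated in a few well-filled objects.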

Filling the properties of warships
The task is to find and fill in a hundred ship objects that are connected with Russia and took part in military conflicts.

SPARQL-query — 1400 ships (2017), 3586 ships (2020), 3567 ships (2021).

With the separator ";" in the script in the listing, it is possible to extract multiple properties of the same object in one statement. In this script two properties were extracted: the country (P17) of the operator (P137), and conflict (P607). Military conflicts and military operations, which are part of wars, are different concepts. The filled-in data on ships can be roughly divided into two types:
 * 1) Objects in which military operations are combined with military conflicts. For example, the Soviet destroyer Gremyashchiy has 10 wars/battles listed (see the listing below). This large number arises because the ship took part in many Arctic convoys, which are military operations.
 * 2) Objects in which military operations are separated from military conflicts. For example, for the British cruiser HMS Trinidad, participation in the military campaign and the Arctic convoy is listed as part of World War II with the qualifier including (P1012). Thus, in Wikidata this cruiser has one war/battle.
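The role of the ";" separator can be sketched as follows. This is an illustrative query, not the article's original script: the helper name `build_conflict_query` is hypothetical, and filtering on Russia (Q159) as the operator's country is an assumption about how "connected with Russia" is expressed.

```python
RUSSIA = "Q159"  # Russia; assumed target country for this sketch


def build_conflict_query(country_qid: str = RUSSIA) -> str:
    """Return SPARQL text for ships whose operator's country is given."""
    return f"""SELECT ?ship ?shipLabel ?conflict ?conflictLabel WHERE {{
  # ";" keeps the subject ?ship and starts a new predicate-object pair.
  ?ship wdt:P31 wd:Q11446 ;
        wdt:P137 ?operator ;
        wdt:P607 ?conflict .
  # The country is read from the operator, not from the ship itself.
  ?operator wdt:P17 wd:{country_qid} .
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en". }}
}}"""


print(build_conflict_query())
```

Without ";" the same pattern would need three separate triples, each repeating `?ship` as the subject.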

SPARQL-query, military conflicts of the destroyer Gremyashchiy and HMS Trinidad (Q1565575): 10 conflicts and 1 conflict are found respectively (2021).

SPARQL-query — 105 results (2017), 86 results (2020), 82 results (2021).

Note that the ships returned by the script in the listing are not necessarily connected only with Russia or the Soviet Union. For example, Kasato Maru (Q653477) is a Japanese ship, but it has several operators listed, including Dobroflot (Q3737187), which owned the ship for some time. The same ship may thus be owned by different operators in different periods.

Fig. 2 shows a graph relating ships associated with Russia to the military conflicts in which they took part.

You can see that most of the ships and military operations belong to the periods of the USSR and Russia. Note that this graph, like the data from which it was built, has one shortcoming. Russia can be divided into several different countries by period: the Russian Empire, the Russian Socialist Federative Soviet Republic, the USSR, and post-Soviet Russia. In the ship objects filled in by Wikidata editors, the country does not always match the period in which the ship existed. For example, Fig. 2 shows (and Wikidata confirms) that the battleship Borodino is assigned to Russia rather than to the Russian Empire, which is a mistake.

Despite these shortcomings, the graph clearly shows which ship took part in which battle. The figure shows this relationship for several periods of Russian history. It also makes it possible to find the military event with the most warships and, conversely, the ship that took part in the most wars.
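Finding the event with the most ships, or the ship with the most conflicts, amounts to counting edges per vertex in this bipartite graph. A minimal sketch, assuming the query results have been reduced to (ship, conflict) pairs; the function name `busiest` is hypothetical and the sample pairs are hand-written illustrations, not query output.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def busiest(pairs: Iterable[Tuple[str, str]]) -> Tuple[Tuple[str, int], Tuple[str, int]]:
    """Return (ship with most conflicts, conflict with most ships), with counts."""
    conflicts_of: Dict[str, Set[str]] = defaultdict(set)
    ships_in: Dict[str, Set[str]] = defaultdict(set)
    for ship, conflict in pairs:
        conflicts_of[ship].add(conflict)   # sets ignore duplicate rows
        ships_in[conflict].add(ship)
    if not conflicts_of:
        raise ValueError("no (ship, conflict) pairs given")
    top_ship = max(conflicts_of, key=lambda s: len(conflicts_of[s]))
    top_conflict = max(ships_in, key=lambda c: len(ships_in[c]))
    return ((top_ship, len(conflicts_of[top_ship])),
            (top_conflict, len(ships_in[top_conflict])))


# Hand-written sample rows (illustration only).
sample = [
    ("Gremyashchiy", "World War II"),
    ("Gremyashchiy", "Arctic convoys"),
    ("HMS Trinidad", "World War II"),
    ("Borodino", "Russo-Japanese War"),
]
print(busiest(sample))
```

On ties, `max` returns the first candidate in insertion order, so a real analysis might prefer to report all vertices that share the top count.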

Museum ships around the world
A museum ship (Q575727) is a ship that houses a museum exhibition dedicated to the history of the ship. Such ships are used for educational and memorial purposes. A ship's participation in a conflict (Q180684) may lead to the creation of a museum ship in memory of past events.

Let's build a graph of museum ships and the countries in which these ships are located. The vertices of the graph are country (Q6256) and museum ship (Q575727) objects. An edge between a ship and a country means that the ship is located in that country. An edge between two countries means that there were conflicts between them, and the weight of the edge equals the number of such conflicts. The script in the listing below builds this graph according to these rules.

SPARQL-query, 117 vertices are found (2021).
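The graph construction rules described above can be sketched in Python, independently of the query that fetches the data. The function name `build_museum_graph`, the input shapes, and the placeholder names ("Ship A", "Country X") are all hypothetical; one (country, country) pair is assumed to stand for one conflict.

```python
from collections import Counter
from typing import FrozenSet, Iterable, Set, Tuple


def build_museum_graph(
    ship_locations: Iterable[Tuple[str, str]],
    conflicts: Iterable[Tuple[str, str]],
) -> Tuple[Set[str], Set[Tuple[str, str]], "Counter[FrozenSet[str]]"]:
    """Return (vertices, ship-country edges, weighted country-country edges)."""
    vertices: Set[str] = set()
    ship_edges: Set[Tuple[str, str]] = set()
    for ship, country in ship_locations:
        vertices.update((ship, country))
        ship_edges.add((ship, country))    # the ship is located in the country
    country_edges: "Counter[FrozenSet[str]]" = Counter()
    for a, b in conflicts:
        if a == b:
            continue                       # skip self-loops
        vertices.update((a, b))
        # frozenset makes the edge undirected: (X, Y) and (Y, X) share a weight.
        country_edges[frozenset((a, b))] += 1
    return vertices, ship_edges, country_edges


vertices, ship_edges, country_edges = build_museum_graph(
    [("Ship A", "Country X"), ("Ship B", "Country Y")],
    [("Country X", "Country Y"), ("Country Y", "Country X")],
)
print(len(vertices), country_edges[frozenset(("Country X", "Country Y"))])
```

Storing country pairs as `frozenset` keys keeps the edge weight independent of the order in which the two countries appear in a result row.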

The fragment of the graph in Fig. 4 shows that the museum ships mostly belong to Germany, the USA and Australia. This "correlation" is quite logical, since these countries have a long history, during which they have participated in many conflicts. These countries also have access to the sea, which historically determines the presence of a fleet.

Future work

 * 1) Find the "Guinness ship" (to choose from: the largest, the longest, the most capacious).
 * 2) Output pictures of ships about which films were made. If there are none, output pictures of ships about which books were written.
 * 3) List the museum ships.

Exercises
{Based on the graph of ships and military operations, which countries have the most wars associated with ships?
 * type="[]"}
+ Soviet Union
+ Russia
- Russian Empire

{ Based on the graph of ships and military operations, which war accounts for the most ships?
 * type=""}
- Russo-Japanese War
+ World War II
- Crimean War

{ The figure shows the most famous Soviet destroyer of Project 7, awarded the title of "Guards". Name it.
 * type="{}"}
{ Гремящий | Gremyashchiy | Thundering }
