Research in programming Wikidata/Newspapers

A newspaper is a printed periodical. This article studies the Wikidata objects of type "newspaper". Using SPARQL queries run on Wikidata objects, the following results were obtained: there are 106 newspapers with geo-referencing in Wikidata (the "coordinate location" property), and most newspapers that have this data correspond to cities in Europe and America. It was found that the most popular newspaper genres in the world are satire, information, analytics, scientific journal and omniscience.

Instances of the object "Newspapers"
Let's create a list of newspapers around the world using the following script: SPARQL-query, 14949 results.
 * Item: newspaper (Q11032).
 * Property: instance (P31).
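
The script itself is only linked, not reproduced, so the following is a minimal sketch of what such a query could look like, wrapped in Python so that it can be sent to the public Wikidata Query Service. The function names `build_instance_query` and `run_query`, the User-Agent string and the exact query text are illustrative assumptions, not the original script.

```python
import json
import urllib.parse
import urllib.request

WDQS_ENDPOINT = "https://query.wikidata.org/sparql"  # public Wikidata Query Service


def build_instance_query(class_qid="Q11032", limit=None):
    """Build a SPARQL query for items that are an instance (P31) of class_qid."""
    query = f"SELECT ?item WHERE {{\n  ?item wdt:P31 wd:{class_qid} .\n}}"
    if limit is not None:
        query += f"\nLIMIT {int(limit)}"
    return query


def run_query(query):
    """Send the query to the endpoint and return the list of result rows."""
    url = WDQS_ENDPOINT + "?" + urllib.parse.urlencode(
        {"query": query, "format": "json"}
    )
    request = urllib.request.Request(
        url,
        # Hypothetical User-Agent; replace it with one that identifies your tool.
        headers={"User-Agent": "newspaper-study-sketch/0.1 (example)"},
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)["results"]["bindings"]


# Only the query text is printed here; run_query(...) needs network access.
print(build_instance_query(limit=10))
```

Calling `run_query(build_instance_query())` without a limit would return every matching item; the count will differ from the 14949 reported above as Wikidata changes over time.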

List of newspapers that have a label in Russian and English: SPARQL-query, 364 results.

The second script returned fewer entries than the first, because not all newspapers have labels in both Russian and English.
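
One way to express "has a label in each of several languages" is to require one `rdfs:label` per language and filter on its language tag. The sketch below builds such a query; the helper name `build_bilingual_label_query` and the variable naming scheme are assumptions made for illustration.

```python
def build_bilingual_label_query(class_qid="Q11032", languages=("ru", "en")):
    """Build a query for items that have a label in every listed language.

    Language codes are used inside variable names, so this sketch only
    supports simple codes such as "ru" or "en" (no hyphens).
    """
    variables = " ".join(f"?label_{lang}" for lang in languages)
    lines = [
        f"SELECT ?item {variables} WHERE {{",
        f"  ?item wdt:P31 wd:{class_qid} .",
    ]
    for lang in languages:
        # An item matches only if a label with this language tag exists.
        lines.append(f"  ?item rdfs:label ?label_{lang} .")
        lines.append(f'  FILTER(LANG(?label_{lang}) = "{lang}")')
    lines.append("}")
    return "\n".join(lines)


print(build_bilingual_label_query())
```

Because every language adds a required pattern, each extra language can only shrink the result set, which is consistent with the smaller number of results reported for the second script.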

The most complete and well-developed newspaper items on Wikidata are:
 * Zëri i Popullit
 * Observer
 * Izvestia
 * Times

The newspapers with the least information on Wikidata are:
 * Crimean truth
 * Russian disabled person
 * Kodima
 * Army and Navy of Workers' and Peasants' Russia
 * Mercurius Politicus
 * Pavlovo-Posadskie Izvestia
 * Dick Manouch
 * Bulletin of Manchuria
 * Bukhoroi Sharif
 * Kyym

Completeness of Wikidata
Let's analyze the completeness of Wikidata. According to a teaching aid, as of 2009 more than 50,000 print media outlets were registered in the Russian Federation, including 27425 newspapers and weekly newspapers and 20433 journals.

According to the category List of newspapers in Russia of English Wikipedia, there are 16 daily newspapers in Russia, as well as 9 newspapers published one to four times a week.

According to the category List of national newspapers, a newspaper is regarded as national if it is distributed throughout the country, unlike a local newspaper, which is published in a certain city or region. There are 87 national newspapers, including the capital's newspapers.

According to the category Russian newspapers in Russian Wikipedia, there are 115 newspapers printed in Russia. Many newspapers have not only a print edition but also a website; see, for example, the website of Russia Beyond the Headlines.

According to all the categories above, only 0.8% of the registered newspapers (27425 in total) are represented in Wikidata. This indicates that the coverage of Wikidata is low.
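
The text does not spell out how the 0.8% figure was obtained. One reading that reproduces it is to add up the four category counts quoted above and divide by the number of registered newspapers, assuming the categories do not overlap. The short calculation below follows that assumption.

```python
REGISTERED_NEWSPAPERS = 27425  # registered newspapers and weeklies quoted above

# Counts quoted in the preceding paragraphs (assumed not to overlap).
category_counts = {
    "daily newspapers": 16,
    "newspapers with one to four issues a week": 9,
    "national newspapers": 87,
    "Russian newspapers category": 115,
}

covered = sum(category_counts.values())
share_percent = 100 * covered / REGISTERED_NEWSPAPERS

print(f"{covered} of {REGISTERED_NEWSPAPERS} newspapers = {share_percent:.1f}%")
```

If some newspapers appear in more than one category, the real share would be lower still.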

The genre of newspapers
Newspaper materials should have a clear focus: careful consideration of all the features specific to the audience of the country or group of countries for which the publication is intended. There are three main genres in newspapers:
 * 1) informative
 * 2) analytical
 * 3) artistic and journalistic

The informative genre includes notes, reportage, interviews and reports. This genre promptly conveys recent events to the audience.

Analytical genres (correspondence, commentary, article, review, press review, letter, survey) have broader time boundaries; they study and analyze a system of facts and situations and draw generalizations and conclusions.

Artistic and journalistic genres (sketch, feuilleton, pamphlet) have greater emotional power and use figurative, expressive means.

Let us construct a bubble diagram of the distribution of newspapers by genre. SPARQL-query, 19 results.



The most popular genres were: satire, information, analytics, scientific journal, omniscience.
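
A distribution like this can be obtained by grouping newspapers by the value of a property and counting them. The sketch below builds such a query for any property; `build_count_query` is a hypothetical helper, P136 is the Wikidata property identifier for genre, and the `#defaultView:BubbleChart` comment is the Wikidata Query Service hint that asks for a bubble chart. The original query may have been written differently.

```python
def build_count_query(prop_pid, class_qid="Q11032", language="en"):
    """Build a query that counts instances of class_qid per value of prop_pid."""
    return (
        "#defaultView:BubbleChart\n"
        "SELECT ?value ?valueLabel (COUNT(?item) AS ?count) WHERE {\n"
        f"  ?item wdt:P31 wd:{class_qid} ;\n"
        f"        wdt:{prop_pid} ?value .\n"
        # The label service fills ?valueLabel in the requested language.
        f'  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "{language}". }}\n'
        "}\n"
        "GROUP BY ?value ?valueLabel\n"
        "ORDER BY DESC(?count)"
    )


# Genre distribution; passing "P921" instead would group by main subject.
print(build_count_query("P136"))
```

Newspapers without any value for the chosen property do not appear in the result at all, so the counts describe only the items that already have the property filled in.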

Newspapers on the map
The property "coordinate location" gives the geographical coordinates of the city in which the newspaper is printed. We will plot the newspapers that have the "coordinate location" property on a world map. For example, the newspaper "Banner" has the coordinates 54°30'34"N, 36°14'59"E in its "coordinate location" property, and these coordinates correspond to the city of Kaluga. This means that "Banner" is published in Kaluga. SPARQL-query, 95 results.



This script shows that the newspapers with the "coordinate location" property have geographical coordinates that, in most cases, correspond to cities in Europe and America.
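
As an illustration of how the map data can be handled, the sketch below contains a possible map query (P625 is the Wikidata identifier of "coordinate location"), converts the coordinates of "Banner" from degrees, minutes and seconds to decimal degrees, and parses the `Point(longitude latitude)` text form in which the query service returns coordinates. The names `MAP_QUERY`, `dms_to_decimal` and `parse_wkt_point` are assumptions made for this example.

```python
import re

# Possible map query; "#defaultView:Map" asks the query service for a map view.
MAP_QUERY = """#defaultView:Map
SELECT ?item ?coord WHERE {
  ?item wdt:P31 wd:Q11032 ;
        wdt:P625 ?coord .
}"""


def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees/minutes/seconds to signed decimal degrees."""
    value = degrees + minutes / 60 + seconds / 3600
    return -value if hemisphere in ("S", "W") else value


def parse_wkt_point(wkt):
    """Parse 'Point(lon lat)' and return (lat, lon) as floats."""
    match = re.fullmatch(r"Point\((-?\d+(?:\.\d+)?) (-?\d+(?:\.\d+)?)\)", wkt.strip())
    if match is None:
        raise ValueError(f"not a simple WKT point: {wkt!r}")
    lon, lat = (float(part) for part in match.groups())
    return lat, lon


# Coordinates of the newspaper "Banner" quoted in the text.
lat = dms_to_decimal(54, 30, 34, "N")
lon = dms_to_decimal(36, 14, 59, "E")
print(round(lat, 4), round(lon, 4))  # 54.5094 36.2497
```

Note that the text form puts longitude first, so `parse_wkt_point("Point(36.2497 54.5094)")` returns the latitude 54.5094 followed by the longitude 36.2497.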

Filling out the Wikidata
The property "genre" describes the way and form in which a newspaper conveys information. For example, the newspaper "Work" has the value "information" in its "genre" property, which means that this newspaper belongs to the information genre.

Let's build a list of newspapers that have neither the genre (P136) nor the main subject (P921) property filled in, in order to find out which newspapers need these properties added.

SPARQL-query, 26 results.
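
Items that lack a property can be selected with `FILTER NOT EXISTS`. The sketch below builds a query for newspapers that have neither genre (P136, the Wikidata property identifier for genre) nor main subject (P921). The original query is not shown, and its 26 results suggest it was narrower than all newspapers in the world, so the optional `country_qid` restriction (P17 is "country", Q159 is Russia) is only one possible assumption; `build_missing_properties_query` is a hypothetical name.

```python
def build_missing_properties_query(missing_pids=("P136", "P921"),
                                   class_qid="Q11032", country_qid=None):
    """Build a query for instances of class_qid lacking all of missing_pids."""
    lines = ["SELECT ?item WHERE {", f"  ?item wdt:P31 wd:{class_qid} ."]
    if country_qid is not None:
        # Assumed restriction: keep only items with this country (P17).
        lines.append(f"  ?item wdt:P17 wd:{country_qid} .")
    for index, pid in enumerate(missing_pids):
        # An item passes only if it has no statement for this property.
        lines.append(f"  FILTER NOT EXISTS {{ ?item wdt:{pid} ?v{index} . }}")
    lines.append("}")
    return "\n".join(lines)


print(build_missing_properties_query(country_qid="Q159"))
```

With several `FILTER NOT EXISTS` clauses an item is listed only when every one of the properties is missing; to list items that lack at least one of them, the clauses would have to be combined with `||` inside a single `FILTER`.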

The last script produced a list of 26 newspapers that lacked the genre and main subject properties, and these properties were filled in for all 26. A further 74 objects in the category Russian newspapers were also examined, and their genre and main subject properties were filled in.

In total, the genre and main subject properties were filled in for 100 objects (newspapers) in the course of this work.

Let us construct a bubble diagram by the "main subject" property for the newspapers of the whole world on Wikidata: SPARQL-query, 68 results.



This script showed that the most popular topics in the newspapers are:
 * news (66 newspapers),
 * politics (50 newspapers),
 * economic science (26 newspapers),
 * culture (21 newspapers),
 * sport (21 newspapers).

Let us construct a bubble diagram by the "genre" property for the newspapers of the whole world on Wikidata.

SPARQL-query, 20 results.



The main newspaper genres are:
 * information (103),
 * satire (18),
 * analytics (5).

The number of genres is much smaller than the number of main subjects, since a newspaper can have only one genre but several main subjects.

Future work

 * 1) Output 20 newspapers that have a circulation, using the property quantity (Q41792217).
 * 2) Find the newspaper with the longest printing history in Russia, using the inception (P571) property.
 * 3) Create a diagram that clearly shows where in the world the most newspapers with political and economic themes are produced, using the main subject (P921) property.

Test
The following newspaper titles are listed: New Look, Prinevsky Krai, Private Correspondent. Their years of creation are 1919, 1992 and 2008. Match each newspaper to the year of its creation.
 * New Look: 1992
 * Prinevsky Krai: 1919
 * Private Correspondent: 2008

Select the newspaper(s) that were printed only in Russia.
 * Kyym (correct)
 * Observer (incorrect)
 * Beep (correct)
 * Bulletin of Manchuria (incorrect)

The following newspapers are given: Le Temps, Kyym, Pavlovo-Posadskie Izvestia, Crimean truth. Each of them has its own circulation: Le Temps has 29.6 thousand, Kyym has 23 thousand, Pavlovo-Posadskie Izvestia has 4050 and Crimean truth has 30 thousand. Le Temps was published in France, which has a population of 66.6 million. Kyym, Pavlovo-Posadskie Izvestia and Crimean truth are Russian newspapers (the population of Russia is 146.8 million); Crimean truth was published in Crimea, which has a population of 2.3 million. Calculate how many people there are per copy of each newspaper and match each newspaper to one of the answers, given in ascending order: 78.03, 2251.5, 6383, 36248.
 * Le Temps: 2251.5
 * Kyym: 6383
 * Pavlovo-Posadskie Izvestia: 36248
 * Crimean truth: 78.03
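
The calculation behind the last question is population divided by circulation. The sketch below repeats it with the rounded figures quoted in the question, taking the population of Russia for Kyym and Pavlovo-Posadskie Izvestia and the population of Crimea for Crimean truth. With these rounded inputs the results come out close to, but not exactly equal to, the listed answers, which were presumably computed from unrounded figures.

```python
# (population, circulation) pairs, using the rounded figures from the question.
newspapers = {
    "Le Temps": (66.6e6, 29_600),                     # France
    "Kyym": (146.8e6, 23_000),                        # Russia
    "Pavlovo-Posadskie Izvestia": (146.8e6, 4_050),   # Russia
    "Crimean truth": (2.3e6, 30_000),                 # Crimea
}

# People per printed copy for each newspaper.
people_per_copy = {
    name: population / circulation
    for name, (population, circulation) in newspapers.items()
}

# Print the newspapers in ascending order of people per copy.
for name, ratio in sorted(people_per_copy.items(), key=lambda pair: pair[1]):
    print(f"{name}: {ratio:.1f}")
```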

SPARQL queries with answers:
 * List of all newspapers with years of creation and images,
 * List of newspapers in Russia,
 * List of newspapers with circulation, country of origin and population.

Literature

 * (PDF-version)




