Research in programming Wikidata/Musical Compositions

This article studies musical compositions using Wikidata, the knowledge base of an international project. SPARQL queries over items classified as "musical composition" produced the following: a list of all musical compositions, a list of musical compositions that have a composer, and a bubble diagram showing which composers have the most compositions. In addition, gaps in public-domain music were searched for, and the completeness of Wikidata was evaluated.

List of musical compositions
Let's build a list of all musical compositions.
 * Item: musical composition (Q207628).
 * Property: instance of (P31).

SPARQL query, 5494 records.
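
The query is linked rather than shown, so the following is only a minimal sketch of what such a query could look like, wrapped in Python so that it can be sent to the public Wikidata Query Service endpoint. The names `LIST_QUERY` and `run_query` and the User-Agent string are illustrative assumptions, not part of the original study.

```python
import json
import urllib.parse
import urllib.request

# Public SPARQL endpoint of the Wikidata Query Service.
ENDPOINT = "https://query.wikidata.org/sparql"

# Every item that is an instance of (P31) musical composition (Q207628).
LIST_QUERY = """
SELECT ?composition ?compositionLabel WHERE {
  ?composition wdt:P31 wd:Q207628 .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
"""


def run_query(query, endpoint=ENDPOINT):
    """Send a SPARQL query and return the list of result bindings."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query, "format": "json"})
    # Hypothetical User-Agent; the service asks clients to identify themselves.
    request = urllib.request.Request(
        url, headers={"User-Agent": "musical-compositions-example/0.1"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["results"]["bindings"]


if __name__ == "__main__":
    rows = run_query(LIST_QUERY)
    print(len(rows), "records")
```

The record count printed by such a script depends on the state of Wikidata at the time it is run, so it will not necessarily match the figure above.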

👍 The most complete and well-developed musical compositions on Wikidata are The Magic Flute, Für Elise, Mozart's Requiem, and Eine kleine Nachtmusik.

👎 Almost empty and uninformative musical compositions are Flight of the Bumblebee, Romeo and Juliet, Iron Foundry, Binks’ Waltz, The Rose-bud March, and Leola.

Searching for music gaps in the public domain
The task is to find musical works whose composers died more than 70 years ago and for which no audio is available on Wikimedia Commons. The list of compositions must be sorted in ascending order by publication date. The query can be used to find musical compositions that need to be digitized and then uploaded to Wikimedia Commons.

SPARQL query, 140 records.
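
As a hedged illustration of the task described above, the sketch below builds one possible form of the query. It assumes that "date of death" is P570 and "audio" is P51 (neither is named in this article), compares whole years only, which is slightly conservative, and skips compositions that have no publication date. `GAP_TEMPLATE` and `public_domain_gap_query` are hypothetical names.

```python
from datetime import date
from string import Template

# SPARQL template; $cutoff_year and $years are filled in by Python.
GAP_TEMPLATE = Template("""
SELECT ?composition ?compositionLabel ?composerLabel ?published WHERE {
  ?composition wdt:P31 wd:Q207628 ;      # instance of: musical composition
               wdt:P86 ?composer ;       # composer
               wdt:P577 ?published .     # publication date (required here)
  ?composer wdt:P570 ?died .             # date of death (assumed property)
  FILTER(YEAR(?died) < $cutoff_year)     # died more than $years years ago
  FILTER NOT EXISTS { ?composition wdt:P51 ?audio . }  # no audio (assumed property)
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
ORDER BY ASC(?published)
""")


def public_domain_gap_query(today=None, years=70):
    """Return the query text for composers who died more than `years` years ago."""
    today = today or date.today()
    # Whole-year comparison: a death in the cutoff year itself is excluded.
    return GAP_TEMPLATE.substitute(cutoff_year=today.year - years, years=years)


if __name__ == "__main__":
    print(public_domain_gap_query(date(2017, 10, 30)))
```

Making the publication date optional would return more compositions, at the cost of unsorted entries for items without a date.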

Completeness of Wikidata
Let's analyze the completeness of Wikidata.

According to the Grove Dictionary of Music and Musicians, there are 20374 composers.

According to the category "List of composers by name" of the Russian Wikipedia, there are 6130 composers.

According to the category "List of composers by name" of the English Wikipedia, there are 4685 composers.

A SPARQL query shows that 3862 musical compositions have the property "composer (P86)" filled in. The number of distinct composers is smaller still, because one composer may have written several compositions: Wolfgang Amadeus Mozart, for example, is the composer of 95 of them. Even so, 3862 is lower than the number of composers in both the Russian and the English Wikipedia, and substantially lower than the number of composers in the Grove Dictionary of Music and Musicians, which confirms the incompleteness of Wikidata.
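
The comparison can be made concrete with a little arithmetic. The sketch below uses only the figures quoted in this section and treats the number of compositions with a composer as an upper bound on the number of distinct composers, since several compositions may share one composer. The function name `coverage_upper_bounds` is hypothetical.

```python
# Compositions on Wikidata with "composer (P86)" filled in (figure from this section).
COMPOSITIONS_WITH_COMPOSER = 3862

# Composer counts of the reference sources (figures from this section).
REFERENCE_COMPOSER_COUNTS = {
    "Grove Dictionary of Music and Musicians": 20374,
    "Russian Wikipedia": 6130,
    "English Wikipedia": 4685,
}


def coverage_upper_bounds(found, references):
    """Return found/total for each reference source.

    Each ratio is an upper bound on composer coverage, because `found`
    counts compositions and distinct composers cannot outnumber them.
    """
    return {name: found / total for name, total in references.items()}


if __name__ == "__main__":
    bounds = coverage_upper_bounds(COMPOSITIONS_WITH_COMPOSER, REFERENCE_COMPOSER_COUNTS)
    for name, ratio in bounds.items():
        print(f"{name}: at most {ratio:.1%} of composers covered")
```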

A SPARQL query for compositions with the property "composer (P86)" filled in and the property "country of origin (P495)" set to "Russian Empire (Q34266)", "USSR (Q15180)" or "Russia (Q159)" returned only 8 compositions, so Russian musical compositions cannot be analyzed for lack of data.
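
One way to express the country restriction is a `VALUES` clause listing the three items. The following is a sketch under that assumption, not necessarily the query that was actually run; `RUSSIAN_STATES` and `country_filter_query` are hypothetical names.

```python
# Item identifiers quoted in the text: Russian Empire, USSR, Russia.
RUSSIAN_STATES = ["Q34266", "Q15180", "Q159"]


def country_filter_query(country_qids):
    """Build a query for compositions with a composer and one of the given countries."""
    values = " ".join(f"wd:{qid}" for qid in country_qids)
    return f"""
SELECT ?composition ?compositionLabel ?composerLabel ?countryLabel WHERE {{
  VALUES ?country {{ {values} }}
  ?composition wdt:P31 wd:Q207628 ;   # instance of: musical composition
               wdt:P86 ?composer ;    # composer
               wdt:P495 ?country .    # country of origin
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en". }}
}}
"""


if __name__ == "__main__":
    print(country_filter_query(RUSSIAN_STATES))
```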

Let's build the bubble diagram for composers of musical compositions.

SPARQL query, 773 records.



The size of a bubble reflects the number of musical compositions. The diagram shows that some composers have many more compositions than others. The top 5 are Niels Gade (173 compositions), Johann Sebastian Bach (155 compositions), Christian Sinding (125 compositions), Johan Halvorsen (121 compositions), and Alan Hovhaness (108 compositions).
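
A bubble diagram of this kind can be requested from the Wikidata Query Service with the `#defaultView:BubbleChart` directive and a grouped count. The sketch below shows one plausible form of such a query, together with a small helper that ranks composers offline. `BUBBLE_QUERY`, `SAMPLE_COUNTS`, and `top_composers` are hypothetical names, and the sample counts are simply the figures quoted in this article.

```python
# Grouped count of compositions per composer, rendered as a bubble chart
# by the Wikidata Query Service user interface.
BUBBLE_QUERY = """\
#defaultView:BubbleChart
SELECT ?composer ?composerLabel (COUNT(?composition) AS ?count) WHERE {
  ?composition wdt:P31 wd:Q207628 ;   # instance of: musical composition
               wdt:P86 ?composer .    # composer
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
GROUP BY ?composer ?composerLabel
ORDER BY DESC(?count)
"""

# Counts quoted in this article (not a live query result).
SAMPLE_COUNTS = {
    "Wolfgang Amadeus Mozart": 95,
    "Alan Hovhaness": 108,
    "Johan Halvorsen": 121,
    "Christian Sinding": 125,
    "Johann Sebastian Bach": 155,
    "Niels Gade": 173,
}


def top_composers(counts, n=5):
    """Return the n (composer, count) pairs with the highest counts."""
    return sorted(counts.items(), key=lambda item: item[1], reverse=True)[:n]


if __name__ == "__main__":
    for composer, count in top_composers(SAMPLE_COUNTS):
        print(f"{composer}: {count} compositions")
```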

Filling of Wikidata
It was decided to fill in the "composer (P86)" property for "musical composition (Q207628)" items, in order to improve the results of the query that searches for music gaps in the public domain.

Let's build a list of all musical compositions with the property "composer (P86)" filled in.

SPARQL query, 3864 records as of 30/10/2017, 10:51.

SPARQL query, 3965 records as of 30/10/2017, 12:47.

Future work

 * 1) Find a list of musical compositions that were created during the Age of Classicism (XVII–XVIII centuries). Property: "inception (P571)".
 * 2) Find the composer who wrote the most symphonies. Properties: "instance of (P31)", "composer (P86)".
 * 3) Build a histogram that displays the number of songs by The Beatles by year of publication. Properties: "performer (P175)", "publication date (P577)".
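
As a starting point for the first task, the sketch below turns a range of centuries into a year filter on "inception (P571)". It assumes the convention that the n-th century runs from year 100(n-1)+1 to year 100n, so the XVII to XVIII centuries cover 1601 to 1800. `century_bounds` and `inception_query` are hypothetical names, and the query has not been checked against live data.

```python
def century_bounds(century):
    """Return (first_year, last_year) of a century, e.g. 20 -> (1901, 2000)."""
    return 100 * (century - 1) + 1, 100 * century


def inception_query(first_century, last_century):
    """Build a query for compositions whose inception falls within the centuries."""
    start, _ = century_bounds(first_century)
    _, end = century_bounds(last_century)
    return f"""
SELECT ?composition ?compositionLabel ?inception WHERE {{
  ?composition wdt:P31 wd:Q207628 ;     # instance of: musical composition
               wdt:P571 ?inception .    # inception
  FILTER(YEAR(?inception) >= {start} && YEAR(?inception) <= {end})
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en". }}
}}
ORDER BY ASC(?inception)
"""


if __name__ == "__main__":
    # XVII to XVIII centuries, as in the first task above.
    print(inception_query(17, 18))
```

The same helper could serve single-century questions by passing the same century twice, for example `inception_query(20, 20)`.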

Tasks
{Which of the following composers wrote the most compositions?
 * type=""}
- Wolfgang Amadeus Mozart
- Igor Stravinsky
- Johann Sebastian Bach
+ Niels Gade

{Which of the following compositions were created in the Russian Empire?
 * type="[]"}
- The Magic Flute
+ The Firebird
+ Flight of the Bumblebee
- Requiem
+ Romeo and Juliet

{A character from which opera by Richard Wagner is shown in the picture?
 * type="{}"}
{ Lohengrin }

{Which of the following musical compositions has the most parts?
 * type=""}
- The Haunted Manor
+ Il trovatore
- Mathis der Maler
- Turandot

{In what century were the following compositions written?
 * type=""}
 * 20th| 21st
+- King Roger
-+ Sleep
-+ The Hope
+- Chamber Symphony No.2

{ Some compositions consist of many parts. Rank the following compositions by number of parts, from most to fewest.
 * type=""}
 * 1st,|2nd,|3rd,|4th
+--- The Well-Tempered Clavier
---+ The Magic Flute
--+- Turandot
-+-- Il trovatore

SPARQL queries with answers:
 * SPARQL query for the list of composers sorted in ascending order by number of compositions, 782 records.
 * SPARQL query for the list of musical compositions that were created in the Russian Empire, 4 records.
 * SPARQL query for the list of all musical compositions, 5259 records.
 * SPARQL query for the list of compositions that were created in the 21st century, 5 records.
 * SPARQL query for the list of compositions that were created in the 20th century, 49 records.
 * SPARQL query for musical compositions that contain logos, 9 records.