Trust in Data, Information and Knowledge

Trust in Scientific Results
Reproducibility of scientific results is key to creating trust in those results. Reproducibility is the ability of an entire analysis of an experiment or study to be duplicated, either by the same researcher or by someone else working independently, whereas repeating the experiment itself is called replicating it. Reproducibility and replicability together are among the main principles of the scientific method.

Requirements are:
 * Software/Scripts: access to the statistical or numerical scripts that are used for the analysis of collected data,
 * Data: Replicate the same experiment under the same requirements and constraints as the original experiment. Do we get the same findings?
 * Methodology: Even if we get the same results (results are reproducible), the methodology of the experimental design might build on inappropriate assumptions. E.g. the statistical analysis might assume that the data has an underlying normal distribution, while a deeper analysis of the experimental design shows that the data does not follow a normal distribution.
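The methodology point can be made concrete with a small check of the normality assumption. The following is a minimal sketch, not a prescribed procedure: it uses the Jarque-Bera statistic (based on sample skewness and kurtosis) only because it can be computed with the Python standard library, and its p-value is an asymptotic approximation that is rough for small samples. The function names `jarque_bera` and `looks_normal`, the significance level, and the two example samples are assumptions made for illustration.

```python
import math
from statistics import NormalDist, fmean


def jarque_bera(values):
    """Return (JB statistic, asymptotic p-value) for a sample.

    Under normality JB is approximately chi-square distributed with
    2 degrees of freedom, whose survival function is exp(-x / 2).
    """
    n = len(values)
    if n < 3:
        raise ValueError("need at least 3 observations")
    mean = fmean(values)
    # Central moments of order 2, 3 and 4 (population form, divided by n).
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    if m2 == 0:
        raise ValueError("all observations are identical")
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0
    jb = n / 6.0 * (skewness ** 2 + excess_kurtosis ** 2 / 4.0)
    return jb, math.exp(-jb / 2.0)


def looks_normal(values, alpha=0.05):
    """True if the test does NOT reject normality at level alpha."""
    _, p_value = jarque_bera(values)
    return p_value >= alpha


if __name__ == "__main__":
    # Hypothetical example data: evenly spaced quantiles of a normal
    # and of an exponential distribution (deterministic, no randomness).
    n = 200
    probs = [(i + 0.5) / n for i in range(n)]
    normal_like = [NormalDist(0, 1).inv_cdf(p) for p in probs]
    skewed = [-math.log(1 - p) for p in probs]
    for name, sample in [("normal-like", normal_like), ("skewed", skewed)]:
        jb, p = jarque_bera(sample)
        print(f"{name}: JB={jb:.2f}, p={p:.4f}, looks_normal={looks_normal(sample)}")
```

A non-rejecting test does not prove normality; it only fails to find evidence against it. The check is meant to complement, not replace, a critical look at the experimental design.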

The values obtained from distinct experimental trials are said to be commensurate if they are obtained according to the same reproducible experimental description and procedure. The basic idea can be seen in Aristotle's dictum that there is no scientific knowledge of the individual, where the word used for individual in Greek had the connotation of the idiosyncratic, or wholly isolated occurrence. Thus all knowledge, all science, necessarily involves the formation of general concepts and the invocation of their corresponding symbols in language (cf. Turner). Aristotle's conception that knowledge of the individual is unscientific is due to the lack of the field of statistics in his time, so he could not appeal to statistical averaging by the individual.

A particular experimentally obtained value is said to be reproducible if there is a high degree of agreement between measurements or observations conducted on replicate specimens in different locations by different people—that is, if the experimental value is found to have a high precision. However, in science, a very well reproduced result is one that can be confirmed using as many different experimental setups as possible and as many lines of evidence as possible (consilience).
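One simple way to quantify the "degree of agreement" between replicate measurements is the relative spread of the per-laboratory means. The sketch below is only an illustration under stated assumptions: the function name `between_lab_cv`, the choice of the coefficient of variation as the agreement measure, and the laboratory names and numbers are all hypothetical and not taken from any real study.

```python
from statistics import fmean, stdev


def between_lab_cv(measurements_by_lab):
    """Coefficient of variation of the per-lab mean values.

    A value near 0 indicates high agreement (high precision) between
    laboratories; larger values indicate poorer reproducibility.
    """
    if len(measurements_by_lab) < 2:
        raise ValueError("need measurements from at least 2 labs")
    lab_means = [fmean(values) for values in measurements_by_lab.values()]
    grand_mean = fmean(lab_means)
    if grand_mean == 0:
        raise ValueError("coefficient of variation undefined for mean 0")
    return stdev(lab_means) / abs(grand_mean)


if __name__ == "__main__":
    # Hypothetical replicate measurements of the same quantity.
    replicates = {
        "lab_A": [9.8, 10.1, 10.0],
        "lab_B": [10.2, 9.9, 10.1],
        "lab_C": [10.0, 10.3, 9.7],
    }
    print(f"between-lab CV: {between_lab_cv(replicates):.4f}")
```

Such a number only captures precision under one experimental setup. It says nothing about consilience, which requires agreement across different setups and lines of evidence.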

Learning Task

 * Watch the video on Reproducible Science or "Reproducible Research: True or False?" by John Ioannidis and draw your own conclusions on how to improve trust in scientific findings.
 * Look at KnitR to see how scripts and open data can lead to reproducible science.
 * Create your own first KnitR project in RStudio (YouTube video on KnitR in 5 min).
 * Replication/Reproducible Science: In a 2012 paper, it was suggested that researchers should publish data along with their works, and a dataset was released alongside as a demonstration. In 2015, Psychology became the first discipline to conduct and publish an open, registered empirical study of reproducibility called the Reproducibility Project. 270 researchers from around the world collaborated to replicate 100 empirical studies from three top Psychology journals. Fewer than half of the attempted replications were successful. Try to find explanations for why this high failure rate was not discovered earlier. Is the replication of experiments honored and rewarded by the scientific infrastructure?
 * Good training data is a core requirement for Open Machine Learning. Discuss the role of trust in training data and its evolutionary genesis (Who did what with the data, and when?)
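As a starting point for the discussion of the genesis of training data, the question "who did what, and when" can be recorded in a tamper-evident provenance log. The following is a minimal sketch under assumptions, not an established standard: each entry stores a SHA-256 fingerprint of the data and the hash of the previous entry, so that later changes to the history become detectable. The function names `append_entry` and `verify`, the field names, and the example actors and dates are hypothetical.

```python
import hashlib
import json


def _digest(payload):
    """SHA-256 of a dict, serialized with sorted keys for stability."""
    text = json.dumps(payload, sort_keys=True)
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def append_entry(log, who, when, what, data):
    """Append a provenance record for `data` (bytes) to `log` (a list)."""
    body = {
        "who": who,
        "when": when,
        "what": what,
        "data_sha256": hashlib.sha256(data).hexdigest(),
        # Link to the previous entry; empty string for the first one.
        "prev_hash": log[-1]["entry_hash"] if log else "",
    }
    entry = dict(body, entry_hash=_digest(body))
    log.append(entry)
    return entry


def verify(log):
    """Return True if no entry was altered and the chain is unbroken."""
    prev_hash = ""
    for entry in log:
        body = {k: v for k, v in entry.items() if k != "entry_hash"}
        if body["prev_hash"] != prev_hash:
            return False
        if _digest(body) != entry["entry_hash"]:
            return False
        prev_hash = entry["entry_hash"]
    return True


if __name__ == "__main__":
    log = []
    # Hypothetical history of a small dataset.
    append_entry(log, "alice", "2024-01-10", "collected raw data", b"a,b\n1,2\n")
    append_entry(log, "bob", "2024-01-12", "removed outliers", b"a,b\n1,2\n")
    print("log is consistent:", verify(log))
```

The log only shows that the recorded history has not been changed after the fact. Whether the recorded steps were appropriate still has to be judged by people who understand the data.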