OToPS/Messy Data/Impossible

A common problem is getting impossible values for a variable. For example, on an item scaled from 1 (Strongly Disagree) to 5 (Strongly Agree), a case might get a score of 0 or 11. The eleven is more likely when a human is typing in responses; it could be a keypunching error. The zero is likely to happen when a respondent skips an item and the survey software assigns a value of zero.
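A quick range check catches both kinds of impossible value before they reach an analysis. Here is a minimal sketch in Python with pandas, using a made-up item column (the name "item1" and the data are hypothetical, not from our survey):

```python
import pandas as pd

# Hypothetical responses to an item scaled 1 (Strongly Disagree) to 5 (Strongly Agree).
# 0 could come from a skipped item; 11 could come from a keypunching error.
df = pd.DataFrame({"item1": [3, 0, 5, 11, 2]})

# Flag any value outside the legal 1-5 range
impossible = df[(df["item1"] < 1) | (df["item1"] > 5)]
print(impossible)
```

Running the same check on every scaled item is a cheap screen for keypunching errors and software-assigned zeros alike.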

A sneakier version happens when we program scoring into Qualtrics (circa 2018). We often use internal scoring to give participants their score at the end of the survey (as a reinforcer for participating, and in assessment centers, as a core service). We refer to this as "piping" because it is possible to "pipe" the score onto a later page of the survey. When a person skips an item, Qualtrics internally gives it a score of zero. This is a terrible way to handle the missing value, because the person almost certainly would have had a higher score had they answered the item. Qualtrics saves these scores to the data frame, and they get exported along with the other variables.
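The effect of scoring a skipped item as zero is easy to demonstrate. This sketch (hypothetical items "q1" to "q3", not our actual scale) mimics the Qualtrics behavior by filling missing responses with 0 before summing:

```python
import numpy as np
import pandas as pd

# Two respondents on three 1-5 items; the second respondent skipped q2
items = pd.DataFrame({"q1": [4, 5], "q2": [4, np.nan], "q3": [4, 5]})

# Qualtrics-style scoring: a skipped item contributes 0 to the sum
qualtrics_score = items.fillna(0).sum(axis=1)
print(qualtrics_score.tolist())  # [12.0, 10.0]
```

The respondent who answered 5s on the items they saw ends up with a *lower* total (10) than the respondent who answered all 4s (12), which is exactly the bias the zero-fill introduces.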

We do not want to use these for presentation or analysis, because of the quirk in how Qualtrics handles missing data.

Example:

In SPSS, I ran descriptives:

descriptives /variables iapAuthoritarian iapChildCentered iapAuthoritative apqPoorMonP apqPosParP apqInconDiscP apqDadInvolveP apqMomInvolveP apqCorpPunP SC25 SC24 SC23 SC22 SC21 SC20.

It is the missing data that are creating the differences between the Qualtrics scoring and the syntax scoring. There are two ways I can tell:

First, the Ns for the versions computed by Qualtrics are all 153. The syntax versions all have lower Ns, and the way the syntax is written, it won't calculate a score unless most of the items are available. Qualtrics is treating missing items as scores of zero.
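The "don't score unless most items are answered" rule can be sketched like this. The 60% threshold and the prorating step are illustrative assumptions, not necessarily the exact rule in our SPSS syntax:

```python
import numpy as np
import pandas as pd

def scale_score(items: pd.DataFrame, min_prop: float = 0.6) -> pd.Series:
    """Prorated sum score; missing unless at least min_prop of items were answered."""
    prop_answered = items.notna().mean(axis=1)
    # Prorate: mean of the answered items, scaled up to the full item count
    score = items.mean(axis=1) * items.shape[1]
    return score.where(prop_answered >= min_prop)

# Hypothetical 1-5 items: complete case, one skip, and a mostly-skipped case
items = pd.DataFrame({
    "q1": [4, 5, np.nan],
    "q2": [4, np.nan, np.nan],
    "q3": [4, 5, 2],
})
print(scale_score(items).tolist())  # third case stays missing (NaN)
```

The complete case scores 12, the one-skip case is prorated to 15 instead of being dragged down by a phantom zero, and the mostly-skipped case is left missing, which is why the syntax versions have lower Ns than the Qualtrics versions.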

The second tell is that the minimum scale score for all the variables from Qualtrics is zero, but the items all go from 1 to 5 (or 9). Scale scores of zero are impossible in raw-score format. If you run FREQUENCIES on SC20 to SC25, you'll see a big pile of cases with scores of zero – those are all the people who skipped that part of the survey.

When we run correlations with SC20 to SC25, the zeros get treated as legitimate scores. This inflates the N, and it trashes the correlation estimate.
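The fix is to recode the impossible zeros to missing before correlating. This sketch uses two of the SC variable names with made-up values (the numbers are illustrative, not our data); pandas drops missing pairs from the correlation the way SPSS pairwise deletion would:

```python
import numpy as np
import pandas as pd

# Hypothetical Qualtrics-exported scale scores; zeros mark skipped sections
df = pd.DataFrame({
    "SC20": [10, 12, 0, 14, 0, 16],
    "SC21": [11, 13, 15, 15, 0, 17],
})

# Correlation with zeros treated as legitimate scores (inflated N, distorted r)
r_with_zeros = df["SC20"].corr(df["SC21"])

# Recode impossible zeros to missing, then correlate on the remaining pairs
clean = df.replace(0, np.nan)
r_clean = clean["SC20"].corr(clean["SC21"])
print(r_with_zeros, r_clean)
```

In this toy example the cleaned scores correlate perfectly, while the zeros drag the estimate well below that, on an N padded with people who never saw the items. In SPSS, the equivalent move is RECODE SC20 to SC25 (0=SYSMIS) or declaring 0 as a user-missing value before running CORRELATIONS.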