Straight-liners (sometimes called "flat-liners") are respondents who give the same response to every item in a categorical array. Data processing teams often detect and exclude these respondents from the dataset, on the presumption that they are not paying sufficient attention to the content of the questionnaire.
Identifying straight-liners means assessing each respondent's answers (at the case level) on one or more arrays, typically several.
Two functions enable you to do this: straightlineResponse() and rowDistinct(). The worked example below uses straightlineResponse().
The basic process
- Until July 2020, you need to have the developer version of the crunch R package installed. See here for loading details.
- Ensure you have loaded your dataset in RStudio.
- Use the straightlineResponse() function on each categorical array, storing the result in a variable.
- Use the results in the variables to set up an exclusion of your choice (for example, summing the number of arrays each respondent has straight-lined).
A theoretical example
Suppose you had a dataset with three categorical arrays that you want to evaluate for straight-lining (you may well have many more arrays than this, but we'll keep the example simple). The aliases of these arrays are 'Q1', 'Q2' and 'Q3'.
You could run straightlineResponse() on each array, storing the result in a variable, as follows:
ds$Q1_sl <- straightlineResponse(ds$Q1)
ds$Q2_sl <- straightlineResponse(ds$Q2)
ds$Q3_sl <- straightlineResponse(ds$Q3)
Each result is a dichotomous TRUE/FALSE variable (1s and 0s) that records, for each respondent, whether or not they straight-lined on that array.
Note: the example uses a very small sample (n=7) for illustrative purposes. In it, four respondents straight-lined Q1, two straight-lined Q2 and two straight-lined Q3.
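To make the TRUE/FALSE result concrete, here is a minimal base-R sketch of the same idea on an ordinary data frame. It is only a local illustration, not the crunch implementation: the data frame `q1`, its item columns and the helper `is_straightlined()` are hypothetical, and missing values are not handled.

```r
# Hypothetical answers from 7 respondents to a three-item array, coded 1-5.
# Rows are respondents, columns are the items of the array.
q1 <- data.frame(
  item_a = c(3, 1, 5, 2, 4, 2, 1),
  item_b = c(3, 1, 5, 4, 4, 3, 5),
  item_c = c(3, 1, 5, 1, 4, 2, 2)
)

# Hypothetical helper: TRUE when a respondent gave the same answer to every item.
is_straightlined <- function(array_df) {
  apply(array_df, 1, function(row) length(unique(row)) == 1)
}

q1_sl <- is_straightlined(q1)
print(q1_sl)       # one TRUE/FALSE flag per respondent
print(sum(q1_sl))  # number of respondents flagged as straight-liners
```

In this made-up data, respondents 1, 2, 3 and 5 gave identical answers across the three items, so they are the ones flagged.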
From here, how you want to deal with your 'straight-liners' is up to you.
- You may like to exclude only those who straight-line on all the categorical arrays. This can be done through an exclusion based on a filter.
- You may like to create another variable that sums the number of arrays each respondent has 'straight-lined' on. This is a simple sum of the above variables:
ds$sum <- as.numeric(ds$Q1_sl) + as.numeric(ds$Q2_sl) + as.numeric(ds$Q3_sl)
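Both options can be sketched locally in base R. The logical vectors below are hypothetical stand-ins for the three straight-lining flags (plain R vectors, not Crunch variables), made up so that 4, 2 and 2 of the 7 respondents straight-line Q1, Q2 and Q3 respectively.

```r
# Hypothetical straight-lining flags for 7 respondents, standing in for
# the Q1_sl, Q2_sl and Q3_sl results.
q1_sl <- c(TRUE, TRUE, TRUE, FALSE, TRUE, FALSE, FALSE)
q2_sl <- c(TRUE, FALSE, TRUE, FALSE, FALSE, FALSE, FALSE)
q3_sl <- c(TRUE, FALSE, FALSE, FALSE, TRUE, FALSE, FALSE)

# Number of arrays each respondent straight-lined (TRUE counts as 1).
sl_count <- as.numeric(q1_sl) + as.numeric(q2_sl) + as.numeric(q3_sl)

# Respondents who straight-lined every one of the three arrays.
sl_all <- sl_count == 3

print(sl_count)       # per-respondent count, between 0 and 3
print(which(sl_all))  # row positions of respondents to consider excluding
```

Using the count rather than the all-or-nothing flag leaves room for a softer rule, such as excluding anyone who straight-lined two or more arrays.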
An additional function
You may also like to explore the rowDistinct() function. It returns, for each respondent, the number of distinct responses they gave on a categorical array. A value of 1 therefore means the respondent straight-lined that array.
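As a rough local analogue of that idea (not the crunch function itself), the base-R sketch below counts the distinct answers in each row of a hypothetical data frame `q2`. The helper name `n_distinct_per_row()` is made up for this illustration, and missing values are not handled.

```r
# Hypothetical answers from 4 respondents to a three-item array, coded 1-5.
q2 <- data.frame(
  item_a = c(2, 4, 1, 5),
  item_b = c(2, 4, 3, 5),
  item_c = c(2, 1, 2, 4)
)

# Hypothetical helper: number of distinct answers each respondent gave.
n_distinct_per_row <- function(array_df) {
  apply(array_df, 1, function(row) length(unique(row)))
}

distinct_counts <- n_distinct_per_row(q2)

# A count of 1 means every item received the same answer.
straightlined <- distinct_counts == 1

print(distinct_counts)
print(straightlined)
```

A count also lets you see near-straight-liners, for example respondents who used only two distinct answers across a long array.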