By Aaron Moss & Leib Litman, PhD
As a researcher, you know that planning studies, designing materials, and collecting data each take a lot of work. So, when you get your hands on a new dataset, the first thing you want to do is start testing your ideas.
Were your hypotheses supported? Which marketing campaign should you launch? How do consumers feel about your new product? These are the types of questions you want answers to.
But before you can draw conclusions from your dataset, you need to inspect and clean it, which entails identifying and removing problem participants.
Most data quality problems in online studies stem from a lack of participant attention or effort. It isn’t always easy to distinguish between the two types of problems, but researchers have a number of tools at their disposal to identify low-quality respondents and remove them from a dataset.
Speeders work through studies as quickly as they can and often engage in what is known as satisficing — skimming questions and answer options until they find a response that meets some minimum threshold of acceptability.
Speeders can sometimes be identified by examining how long they take to complete individual parts of a study or the study as a whole. Other times, speeders can be identified by their failure to pass simple attention checks or by the overall quality of their data.
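To make the time-based approach concrete, here is a minimal Python sketch. The column names and the one-third-of-median cutoff are illustrative assumptions, not a standard; you should choose and justify a threshold that fits your own study.

```python
import pandas as pd

# Hypothetical survey export: one row per respondent, with total
# completion time in seconds. Column names are assumptions.
df = pd.DataFrame({
    "respondent_id": ["r1", "r2", "r3", "r4", "r5"],
    "duration_seconds": [612, 540, 95, 480, 120],
})

# Flag anyone who finished in less than one third of the median time.
# The cutoff is illustrative; set and justify one for your own study.
median_time = df["duration_seconds"].median()
df["flag_speeder"] = df["duration_seconds"] < median_time / 3

print(df)
```

In practice, timing data for individual pages or question blocks is often more diagnostic than total duration, since a speeder may rush one task and idle on another.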
Straightliners select the same answer for nearly every question in the study (e.g., selecting “Agree” for all questions).
Straightlining is less common than some other forms of inattentive responding because it is relatively easy to spot, and respondents may worry about having their submission rejected.
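As an illustration, the short Python sketch below flags respondents who gave (nearly) the same answer across an entire block of Likert items. The item names and the 0.9 cutoff are assumptions made for the example, not a recommended rule.

```python
import pandas as pd

# Hypothetical responses to a block of ten Likert items (1-5).
# Item column names (q1..q10) are assumptions about your export.
likert_items = [f"q{i}" for i in range(1, 11)]
df = pd.DataFrame(
    [[4, 4, 4, 4, 4, 4, 4, 4, 4, 4],   # straightliner
     [2, 4, 3, 5, 1, 4, 2, 3, 4, 5],
     [3, 3, 2, 4, 3, 3, 2, 4, 3, 3]],
    columns=likert_items,
)

# One simple indicator: the share of items matching the single most
# common answer. A respondent at or near 1.0 answered (almost) every
# item identically. The 0.9 cutoff is illustrative only.
df["modal_share"] = df[likert_items].apply(
    lambda row: row.value_counts().iloc[0] / len(row), axis=1
)
df["flag_straightliner"] = df["modal_share"] >= 0.9

print(df[["modal_share", "flag_straightliner"]])
```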
Slackers, or shirkers, lack the proper motivation to fully engage with your study. Several things may contribute to slacking: the pay for your study, the difficulty of the tasks, situational aspects of people’s environment, and an individual respondent’s level of commitment.
Slackers are usually identified by overall measures of data quality.
In the world of online research, some people may try to create scripts, or “bots,” that can automatically fill in question bubbles, essay boxes, and other simple questions.
While bots or scripts represent an extreme type of poor respondent (and one for which there isn't strong evidence of widespread existence), they nevertheless remain a large concern for researchers. Fortunately, the goal of someone using scripts to complete online studies is incompatible with providing quality data. Unusual response patterns and low-effort answers to open-ended questions mean bots are often easy to spot.
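A couple of rough text-based signals can be computed in a few lines. The sketch below, with an assumed column of open-ended responses, flags answers that are extremely short or identical across respondents; both signals call for human review rather than automatic exclusion.

```python
import pandas as pd

# Hypothetical open-ended responses; the column names are assumptions.
df = pd.DataFrame({
    "respondent_id": ["r1", "r2", "r3", "r4"],
    "open_ended": [
        "I liked the packaging but the price felt high for the size.",
        "good",
        "asdf asdf",
        "good",
    ],
})

text = df["open_ended"].fillna("").str.strip().str.lower()

# Two rough signals of scripted or low-effort answers:
# 1) responses only a few characters long,
# 2) identical text submitted by more than one respondent.
# Thresholds are illustrative.
df["flag_too_short"] = text.str.len() < 10
df["flag_duplicate_text"] = text.duplicated(keep=False) & (text != "")

print(df[["respondent_id", "flag_too_short", "flag_duplicate_text"]])
```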
There are several terms — imposters, fraudsters, liars — for people who provide false demographic information in order to gain access to studies that target specific populations. Regardless of what you call them, you want to keep these people out of your studies.
The most effective way to keep imposters out of a study is to remove the opportunity for people to misrepresent themselves. Dissociating screeners from the study itself can deny imposters the chance to gain access in the first place. In addition, examining the consistency of people's self-reported demographics over time or testing their relative knowledge in the domain of interest are other ways to prevent imposters from ruining your study.
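As a simple illustration of the consistency approach, the sketch below compares demographics reported in a standalone screener with the same questions asked again inside the main study. The data, column names, and tolerance are hypothetical choices for the example.

```python
import pandas as pd

# Hypothetical data: demographics reported in a standalone screener
# versus the same questions repeated inside the main study.
screener = pd.DataFrame({
    "respondent_id": ["r1", "r2", "r3"],
    "age": [34, 52, 29],
    "gender": ["female", "male", "female"],
})
main_study = pd.DataFrame({
    "respondent_id": ["r1", "r2", "r3"],
    "age": [34, 37, 29],                   # r2 now reports a different age
    "gender": ["female", "male", "male"],  # r3 now reports a different gender
})

merged = screener.merge(
    main_study, on="respondent_id", suffixes=("_screen", "_main")
)

# Flag anyone whose answers shifted more than a plausible amount
# (here: age differs by more than one year, or gender changed).
merged["flag_inconsistent"] = (
    (merged["age_screen"] - merged["age_main"]).abs() > 1
) | (merged["gender_screen"] != merged["gender_main"])

print(merged[["respondent_id", "flag_inconsistent"]])
```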
Although only a small percentage of people in online panels are willing to misrepresent themselves, the sheer number of respondents means that even a small percentage of imposters can result in a study full of false respondents (see Chandler & Paolacci, 2017).
In online participant platforms, some people complete many studies and remain on the platform for a long period of time. People who take many studies may become increasingly familiar with the flow of studies, typical instructions, and measures commonly used by researchers. As these participants become less naive, they may pose a threat to data quality.
Researchers can do one of two things to address the problems discussed above: select different participants or design different studies.
Not all online panels are created the same, and some data quality problems can be solved by selecting participants who are well-suited to the demands of the research task. This is what is known as finding participants who are “fit for purpose.”
When researchers cannot select different participants, they may choose to focus their energies on designing studies differently. Some changes to study design are simple and can dramatically improve data quality. For example, researchers can spend more time piloting materials and ensuring the study instructions are clear and easily understood. In other cases, changes to survey design are more extensive and require more effort.
One of the most common tools for detecting inattentive participants is the attention check question and its many varieties (e.g., trap questions, red herrings, instructional manipulation checks).
Attention check questions seek to identify people who are not paying attention by asking questions such as: “Have you ever had a fatal heart attack while watching TV?” Anyone reading this question should be able to easily indicate their attention by selecting “No.”
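Scoring such a question is straightforward. The sketch below assumes a survey export with one column per attention check, coded "yes"/"no", and flags anyone who did not select "No"; the column name and coding are assumptions about your data.

```python
import pandas as pd

# Hypothetical responses to the example check above.
df = pd.DataFrame({
    "respondent_id": ["r1", "r2", "r3"],
    "ac_fatal_heart_attack": ["no", "yes", "no"],
})

# Anyone answering anything other than "no" is flagged as failing
# this particular check.
df["failed_ac"] = (
    df["ac_fatal_heart_attack"].str.strip().str.lower() != "no"
)

print(df)
```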
Although researchers have traditionally used attention check questions to identify inattentive participants, recent research has found several negative consequences of relying on attention checks as the sole indication of data quality.
Specifically, some researchers construct long and elaborate attention check questions that require people to read large chunks of text and to ignore “lure” questions. What research shows, however, is that these sorts of attention check questions are not good measures of attention.
By requiring more attention from people than most other portions of the study, these questions introduce bias into decisions about which participants to keep and which ones to exclude. Online participants with less education and lower socioeconomic status are more likely to fail such checks than those with more education and higher socioeconomic standing.
Another problem with attention check questions is that they often lack validity. Research shows that people who pass one attention check question do not necessarily pass other attention checks in the same study. In addition, a participant who passes an attention check question in one study may not pass the same attention check in a different study.
With all this in mind, the current best practice with attention checks appears to be the inclusion of multiple brief attention check questions and the recognition that attention checks are simply one, not the sole, measure of data quality. In other words, you should not make decisions about who to include and exclude from analyses based simply on attention checks.
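One way to put this into practice is to combine several independent indicators into a simple quality index and then review, rather than automatically drop, respondents who trip more than one of them. The flags, column names, and two-indicator cutoff in the sketch below are illustrative assumptions, not a prescribed rule.

```python
import pandas as pd

# Hypothetical per-respondent quality flags produced by earlier steps
# (speed, straightlining, open-ended effort, several brief attention
# checks). Column names and cutoffs are assumptions.
df = pd.DataFrame({
    "respondent_id": ["r1", "r2", "r3"],
    "flag_speeder": [False, True, False],
    "flag_straightliner": [False, True, False],
    "flag_low_effort_text": [False, True, True],
    "n_attention_checks_failed": [0, 2, 1],
})

# Count how many independent indicators point to poor quality rather
# than excluding on any single check. Here a respondent is reviewed
# for exclusion only when at least two indicators agree.
flag_cols = ["flag_speeder", "flag_straightliner", "flag_low_effort_text"]
df["quality_indicators"] = df[flag_cols].sum(axis=1) + (
    df["n_attention_checks_failed"] >= 2
).astype(int)
df["review_for_exclusion"] = df["quality_indicators"] >= 2

print(df[["respondent_id", "quality_indicators", "review_for_exclusion"]])
```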
Identifying and controlling for the various types of poor respondents in online surveys is not an easy task. To make this work easier, CloudResearch engages in a number of practices to help researchers select appropriate participants for their projects. We prescreen participants in our online panels using patent-pending technology to ensure they pass basic language comprehension tests and are likely to pay attention during your study. We regularly publish research investigating participant behavior in different online platforms, best practices for screening participants and instructions on gathering high-quality data for various types of research projects. Contact us today to learn how you can draw on our expertise to improve your data collection methods.