Everyone loves grandma. Grandma provides unconditional love, sage advice, and often good food. Grandma also reminds us of a time when the world moved at a different pace and things were a little less complicated. In grandma’s day, for instance, no one had to worry about data quality fraud in online surveys because online surveys didn’t exist!
Alas, that’s not the world we live in today.
Online fraud today is a big deal and a big threat to research validity. In this blog, we will tell you how much online survey fraud exists, how it is measured, and what you can do to avoid it.
Few places are more rife with fraud than the internet. Consider that in addition to all the traditional scams—phishing attacks, Nigerian princes with an investment opportunity, debt elimination schemes—ransomware has become so lucrative in the last few years that criminal groups have adopted aspects of traditional business operations, like customer support and brand management. It’s as if they want to say, “We’re bad guys, but we’re not really bad guys.”
Fraud within online surveys is far less common than online scams in general, but that doesn’t mean online survey fraud doesn’t occur.
Respondents in online surveys sometimes:

- Lie about their demographic information
- Use sophisticated tools to hide the location of their web traffic (when accessing studies from outside the host country)
- Translate web pages into their native language (see below)
- Auto-fill multiple choice questions
- Provide open-ended answers that are copied and pasted from the web rather than the thoughtful answers an engaged respondent would provide
While it is extremely difficult to determine whether these behaviors are the product of individual bad actors or part of a coordinated criminal enterprise, what is not difficult is highlighting the threats they pose to data quality.
There are at least two ways to combat survey fraud. The first is for research teams to build fraud-detection measures into the survey itself. The second is to use data quality tools created by third parties.
Taking the do-it-yourself approach, research teams can add attention checks within surveys to flag inattentive and random responders. Teams may also add a system for flagging suspicious IP addresses and a process for evaluating the quality of open-ended responses. While these methods can help remove fraudulent respondents after the data are gathered, a more efficient approach may be to proactively keep fraudulent respondents out of surveys.
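To make these do-it-yourself checks concrete, here is a minimal sketch of post-hoc screening in Python. Everything in it is an illustrative assumption—the field names, the thresholds, and the three rules (failed attention checks, duplicate IP addresses, low-effort open-ended text) are hypothetical examples, not a prescribed method:

```python
# Hypothetical post-hoc screening sketch. The field names, thresholds,
# and rules below are illustrative assumptions, not a standard.
from collections import Counter

def flag_suspicious(responses, max_per_ip=1, min_checks_passed=2):
    """Return (id, reasons) pairs for responses flagged for manual review.

    Each response is assumed to be a dict with keys:
      'id', 'ip', 'checks_passed' (attention checks answered correctly),
      and 'open_ended' (a free-text answer).
    """
    ip_counts = Counter(r["ip"] for r in responses)
    flagged = []
    for r in responses:
        reasons = []
        if r["checks_passed"] < min_checks_passed:
            reasons.append("failed attention checks")
        if ip_counts[r["ip"]] > max_per_ip:
            reasons.append("duplicate IP address")
        # Very short open-ended answers are a common low-effort red flag.
        if len(r["open_ended"].split()) < 3:
            reasons.append("low-effort open-ended answer")
        if reasons:
            flagged.append((r["id"], reasons))
    return flagged

sample = [
    {"id": 1, "ip": "10.0.0.1", "checks_passed": 3,
     "open_ended": "I shop online weekly for groceries."},
    {"id": 2, "ip": "10.0.0.2", "checks_passed": 1,
     "open_ended": "good"},
    {"id": 3, "ip": "10.0.0.2", "checks_passed": 3,
     "open_ended": "Mostly I compare prices before buying."},
]
print(flag_suspicious(sample))
```

In this toy dataset, respondent 2 trips all three rules and respondent 3 is flagged only for sharing an IP address. Real screening is messier—legitimate respondents can share an IP (a household, a campus network), so rules like these identify candidates for review rather than automatic exclusions.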
This is how the system developed by CloudResearch operates. With our patented Sentry technology, we combine industry standard technical checks with advanced technology like an event streamer that monitors respondents’ behavior and flags unnatural mouse movements or translation of webpage text into foreign languages. We also have respondents complete a few questions from a library of thousands of validated items to spot problems like yea-saying and inattention. Together, these processes identify and block problematic respondents before they enter your survey, improving data quality, saving the time required for reconciliation, and providing a baseline for data quality across panels.
No one can say for sure how threats to data quality will evolve in the future. What we can say, however, is that our team at CloudResearch will continue to inform researchers about these problems and develop innovative solutions to protect data quality. Just as today’s online environment isn’t a world most grandmas would recognize, the solutions of the future will probably bear little resemblance to what works today.