Articles

Survey data quality: The 4 factors that matter most to researchers

Jane Hillman
May 18, 2022

You know that old hypothetical “If you were stranded on a desert island and could only bring three things...” Well, pretend you’re a researcher stuck in a deserted office. You can only bring four factors of data quality. What would they be? What are the most valuable? The most important for survival? (Or the most important to, you know, data quality?)

Well. In talking it over with legal we opted not to actually strand someone in a deserted office to test this out. Fortunately, we could do the next best thing — ask 129 research professionals, averaging five years of research experience apiece, via an online, non-representative survey.

So, what was most important to our research respondents? Choosing from 11 factors related to online survey data quality, their consensus was clear: comprehension, attention, honesty, and reliability (by a large margin, and in that order). This means, stuck in a deserted office or not, the ability to control for these four factors should be an absolute priority when sourcing high-quality data online.

Curious where our survey respondents came from? Learn about how we recommend recruiting high quality survey participants.

#1 - Comprehension: How well respondents understand what you’re asking

In academic and market research, comprehension is the measure of how well survey respondents understand what they’re being asked to do. No wonder our research professionals ranked it at the top of the list. If you can’t trust that respondents understood what you’d asked them to do, you can’t trust the results they provide.

To assess comprehension, researchers typically ask participants to fully review a set of instructions. Then, each is asked to summarize what they’ve been tasked to do in the survey. The summarization methodology typically involves asking participants to make a judgment, perform a specific task, or consider different alternatives to a certain goal once the instructions are read.

Online research platforms and panels that are serious about the quality of their results should make it straightforward to include comprehension checks as part of your survey design. And, since the experience of researchers may vary, platforms should also provide clear guidance to help ensure these checks will work as intended.

On Prolific, for instance, survey requesters must allow potential participants to freely re-read any key information related to a comprehension check. Participants must also get at least two chances to answer a question correctly. And, while not required, we do recommend researchers test comprehension at the very beginning of their online survey.

Doing so ensures participants who fail both attempts at their comprehension check are screened out before investing the time and effort it takes to complete the survey itself, rather than afterward, simply because they didn’t understand what you wanted them to do. (In other words, it’s just good form.)
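The two-attempt rule described above can be expressed as a short decision function. This is a minimal illustrative sketch, not Prolific's actual implementation; the function name and answer format are hypothetical.

```python
def passes_comprehension(attempts: list[str], correct: str, max_attempts: int = 2) -> bool:
    """Return True if the participant gives the correct answer
    within the allowed number of attempts (two, per the rule above)."""
    return correct in attempts[:max_attempts]

# A participant who misses the first attempt but corrects on the second still passes.
assert passes_comprehension(["B", "C"], correct="C") is True
# Two wrong attempts means the participant is screened out.
assert passes_comprehension(["A", "B"], correct="C") is False
```

Placing this check first in the survey flow means a failing participant is turned away after answering one short question, not after completing the whole study.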

#2 - Attention: Whether or not respondents are engaged with your survey

At first glance, attention and comprehension seem similar. While comprehension relates to whether or not participants can understand the contents of a survey, attention relates to whether or not they’re engaging with it (instead of, for instance, season 4 of 90 Day Fiancé on Netflix).

Specifically, attention measures whether survey participants take the time and care to read questions thoroughly before answering. This is where attention and comprehension differ. A respondent could successfully read the instructions and summarize them, thus passing a comprehension check at the start of a survey.

However, perhaps assuming they’ve “done a million of these” and “know what to do”, they might rush through answering multiple choice questions to collect their payment as quickly as possible. In this hypothetical, the survey requester would see that this respondent had passed the initial comprehension check. But without a way to measure the respondent’s level of attention, they might not realize their survey data quality is skewed. For this reason, attention is measured through the use of separate attention check questions (ACQs).

Alternately referred to as instructional manipulation checks, ACQs are simple, seemingly ordinary questions. But these questions are preceded by instructions telling the participant to choose one specific answer from the ones provided. Respondents answering these questions as instructed signal to the survey requester that they were paying attention.

Another methodology involves the use of nonsensical statements that participants answer by making a choice on a numbered scale. As an example of this approach, participants could be asked to read the statement, “I travel through the center of the Earth on my commute to work each day.” They would then make a choice from 1 (never) to 4 (every day). In this instance, respondents choosing “never” would be indicating that they were paying attention while taking the survey.
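The two kinds of attention check above can both be scored with a simple flagging pass over each respondent's answers. This is a hypothetical sketch; the item names, answer values, and scoring rules are assumptions for illustration only.

```python
def failed_attention_checks(response: dict) -> list[str]:
    """Return the names of attention checks this respondent failed."""
    checks = {
        # Instructed-response item: the instructions told participants
        # to select "strongly agree" regardless of their opinion.
        "instructed_item": lambda r: r["instructed_item"] == "strongly agree",
        # Nonsense item: "I travel through the center of the Earth on my
        # commute" rated 1 (never) to 4 (every day); only 1 is attentive.
        "nonsense_item": lambda r: r["nonsense_item"] == 1,
    }
    return [name for name, passed in checks.items() if not passed(response)]

respondent = {"instructed_item": "agree", "nonsense_item": 1}
failed_attention_checks(respondent)  # -> ["instructed_item"]
```

In practice researchers often exclude (or manually review) any respondent with one or more failed checks before analysis.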

#3 - Honesty: If respondents are being truthful (especially when you’re paying them)

Honesty refers to the extent to which a participant provides truthful responses. And honesty is especially important because poorly thought-out incentives can inadvertently skew decision making and, by extension, introduce data quality issues. This is particularly relevant when asking participants to self-report on performance-related tasks, or when researchers need them to answer specific demographic questions.

When participants must self-report during an incentivized survey or questionnaire, researchers must ensure they don’t create a situation where dishonesty could increase compensation. And researchers also need to pay attention to honesty when using demographic questions to determine participant eligibility. This last point is why, when using Prolific, we encourage researchers to validate any screening criteria within the survey or questionnaire itself.

Sometimes research may also involve personal or private aspects of people’s lives. And even the most honest of us can find it difficult to answer questions about sensitive topics (even when we’re getting paid to do so). This is why researchers can also encourage honesty by putting themselves in the participant’s position.

For instance, most potential participants working through an online research or survey platform like Prolific know their contributions will be anonymous. That said, it never hurts to remind participants of this at the beginning of your online survey or questionnaire, especially in light of recent online privacy legislation and advertising cookie concerns.

Phrasing to encourage honesty is also helpful. For sensitive topics, referencing commonly available sources like the news or statistics can help frame the subject matter as being normal and/or understood. Alternately, setting up the question as seeking the participant’s opinion is another way to keep answers honest and, by extension, survey data quality high.

#4 - Reliability: If your participants are consistent with how they respond

Finally, reliability was ranked fourth most important by research professionals. Unfortunately, reliability is slightly less straightforward than its three compatriots as a factor of research quality.

Partly, this is because reliability is chiefly a psychological construct. Comprehension, attention, and honesty are specific characteristics (i.e., behaviors) participants exhibit more or less of. Reliability, instead, measures the consistency of all these behaviors over time.

More than a problem of inconsistent answers, reliability equates to overall audience trust for researchers: trust that the audience will respond the same way to a survey no matter when, or how many times, it’s given. And while important, reliability is far harder to control for than our first three factors of quality. Smart survey design helps support reliability by reducing common method bias. But when conducting high-quality survey research online, you increasingly get what you pay for where reliability is concerned.

Professional-quality research platforms give researchers reason to trust the internal consistency of their potential participants. For platforms that leave you wondering, poor reliability may well be a factor. Remember, lingering questions related to the research you’re conducting are fantastic. Questions about who you’re paying to help complete it are not.
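One common way to put a number on the internal consistency mentioned above is Cronbach's alpha, computed over a set of related survey items. This is a general statistical illustration, not a Prolific feature; the function below is a minimal from-scratch sketch (libraries such as pingouin offer production implementations).

```python
def cronbach_alpha(item_scores: list[list[float]]) -> float:
    """Cronbach's alpha, where item_scores[i][j] is respondent i's
    score on item j. Values near 1.0 indicate consistent responding."""
    k = len(item_scores[0])  # number of items

    def var(xs: list[float]) -> float:
        # population variance
        mean = sum(xs) / len(xs)
        return sum((x - mean) ** 2 for x in xs) / len(xs)

    item_vars = [var([row[j] for row in item_scores]) for j in range(k)]
    total_var = var([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# Perfectly consistent respondents (each answers both items identically)
# yield an alpha of exactly 1.0.
cronbach_alpha([[1, 1], [2, 2], [3, 3]])  # -> 1.0
```

A conventional rule of thumb treats alpha above roughly 0.7 as acceptable consistency, though the appropriate threshold depends on the research context.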

Comprehension, attention, honesty, and reliability: Survey data quality essentials

It’s clear researchers looking to avoid poor quality data online should place a premium on maximizing comprehension, attention, honesty, and reliability as part of their survey design. But these factors alone act as a part, not the whole, of quality assurance.

It’s natural for researchers to look at other aspects of the process, like data collection, metrics, different forms of quality checks, and follow-up methodologies to ensure high quality online survey data. This is why it’s no accident that researchers have options that help them maximize the quality of data sourced through Prolific.

But it's easy to overlook the fact that survey data quality often hinges on how researchers recruit participants in the first place, which is why we recommend building on these insights first.

Log in or sign up to Prolific today and launch your study to thousands of participants in minutes