SRL sends out weekly news bulletins that cover various aspects of survey research and best practices. Below is an archive of the bulletins sent out to date by category; to view, click on any title.

If you would like to start receiving these weekly emails, please send your name, affiliation, and e-mail to

We are always looking for ideas and suggestions for future Survey News Bulletin topics. Click here to submit an idea for a future SNB.


General Education/Professional References

No. 22. Towards more transparency in survey research.

In an effort to encourage an open science of survey research, the American Association for Public Opinion Research (AAPOR) has formally launched a Transparency Initiative, designed to educate about and encourage the disclosure of methodological details regarding publicly released survey data. Building on its Code of Professional Ethics and Practices, the Transparency Initiative outlines a minimal set of technical details that member organizations are required to disclose when reporting survey data. These include the following:

  • Who sponsored the research study, who conducted it, and who funded it, including, to the extent known, all original funding sources.
  • The exact wording and presentation of questions and responses whose results are reported.
  • A definition of the population under study, its geographic location, and a description of the sampling frame used to identify this population. If the sampling frame was provided by a third party, the supplier shall be named. If no frame or list was utilized, this shall be indicated.
  • A description of the sample design, giving a clear indication of the method by which the respondents were selected (or self-selected) and recruited, along with any quotas or additional sample selection criteria applied within the survey instrument or post-fielding. The description of the sampling frame and sample design should include sufficient detail to determine whether the respondents were selected using probability or nonprobability methods.
  • Sample sizes and a discussion of the precision of the findings, including estimates of sampling error for probability samples and a description of the variables used in any weighting or estimating procedures. The discussion of the precision of the findings should state whether or not the reported margins of sampling error or statistical analyses have been adjusted for the design effect due to clustering and weighting, if any.
  • Which results are based on parts of the sample, rather than on the total sample, and the size of such parts.
  • Method and dates of data collection.
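The design-effect adjustment mentioned in the precision item above can be sketched numerically. This is a minimal illustration using the standard formula for a proportion's margin of sampling error, not code specified by AAPOR; the sample values are hypothetical.

```python
import math

def margin_of_error(p, n, deff=1.0, z=1.96):
    """95% margin of sampling error for a proportion p estimated from a
    sample of size n, inflated by the design effect (deff) due to
    clustering and/or weighting. deff=1.0 corresponds to simple random
    sampling (no adjustment)."""
    return z * math.sqrt(deff * p * (1 - p) / n)

# Hypothetical survey: 52% support a proposal, n = 1,000 respondents
unadjusted = margin_of_error(0.52, 1000)          # about +/- 3.1 points
adjusted = margin_of_error(0.52, 1000, deff=1.5)  # about +/- 3.8 points
print(f"unadjusted: {unadjusted:.3f}  adjusted: {adjusted:.3f}")
```

Reporting the adjusted figure, and stating that it has been adjusted, is exactly the kind of disclosure the precision item calls for.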

We strongly encourage all researchers to plan proactively to disclose these details about their surveys, along with other disclosure elements mentioned in AAPOR's Code, when reporting findings from their research.

For more information, see AAPOR's Transparency Initiative Web site and AAPOR's Code of Professional Ethics and Practices.


No. 30. SRL 50th History released.

In 2014, SRL reached its 50th anniversary, a landmark event that was celebrated with a symposium in Urbana and one in Chicago. Dr. Richard Warnecke, former director of SRL, spoke about the history of SRL, and Dr. Jon Krosnick and Dr. Norbert Schwarz gave invited lectures about the present and future of survey methodology. Many current and former SRL staff, students, and faculty attended these events. SRL's history is documented in a publication released this week, which includes a summary of the history of the lab and brief descriptions of every study conducted at SRL in the past 50 years; it is available at


No. 31. AAPOR releases new standard definitions for calculating response rates.

The 8th edition of the Standard Definitions document was recently released by the American Association for Public Opinion Research (AAPOR). This document contains updated standardized lists of disposition codes and formulas for the calculation of response rates, cooperation rates, and refusal rates for telephone and in-person household surveys, mail and Internet surveys of specifically named persons, mixed-mode surveys, and establishment surveys. It includes updated sections on establishment surveys and dual-frame telephone surveys. AAPOR plans to continue to update this document with new disposition codes and rate estimation best practices to address evolving survey practices and new technologies. Using these standardized disposition codes and formulas facilitates meaningful comparisons across surveys, and many professional journals now require their use when reporting findings from primary survey data collection efforts.
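As one illustration of the standardized formulas, AAPOR's Response Rate 1 (RR1) -- the most conservative of the response rates -- divides complete interviews by the sum of all interviews, non-interviews, and cases of unknown eligibility. A minimal sketch; the disposition counts below are hypothetical:

```python
def aapor_rr1(I, P, R, NC, O, UH, UO):
    """AAPOR Response Rate 1: complete interviews (I) divided by
    interviews (I + P), non-interviews (R + NC + O), and cases of
    unknown eligibility (UH + UO).
    I = complete, P = partial, R = refusal/break-off, NC = non-contact,
    O = other, UH = unknown if household, UO = unknown other."""
    return I / (I + P + R + NC + O + UH + UO)

# Hypothetical final dispositions from a telephone survey
rr1 = aapor_rr1(I=600, P=50, R=200, NC=100, O=10, UH=30, UO=10)
print(f"RR1 = {rr1:.3f}")  # 600 / 1,000 = 0.600
```

RR2 counts partial interviews in the numerator as well (I + P); the Standard Definitions document specifies six response rates (RR1 through RR6) in all.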

For more information, see

The American Association for Public Opinion Research. (2015). Standard definitions: Final dispositions of case codes and outcome rates for surveys (8th ed.). AAPOR.


No. 41. AAPOR revises its Code of Professional Ethics and Conduct.

The American Association for Public Opinion Research (AAPOR) recently updated its Code of Professional Ethics and Conduct. The Code is reviewed and updated every five years to ensure it remains relevant for the continuing practice of survey research. A key section of the Code focuses on disclosure of research methods. AAPOR believes that "Good professional practice imposes the obligation upon all public opinion and survey researchers to disclose sufficient information about how the research was conducted to allow for independent review and verification of research claims." In addition to revising its standards for disclosure of survey research methodology, the updated Code includes for the first time standards for disclosure of (1) qualitative methodologies, including focus groups, and (2) content analyses.

The revised Code of Professional Ethics and Conduct is available at


No. 42. Online education opportunities in survey methods and public opinion research at the University of Illinois.

For those interested in learning more about surveys and public opinion research, there are many online educational opportunities available at the University of Illinois. In addition to the Survey News Bulletins, the Survey Research Laboratory conducts free webinars each semester on a variety of topics related to survey research methods (see for a list of the live fall webinars and access to recordings of past webinars). Also, the Institute of Government and Public Affairs will host a live screening on the UIC campus of a webinar about racial attitudes and public opinion, co-sponsored by the Midwest Association for Public Opinion Research and the American Association for Public Opinion Research (see for details, to register to attend the live screening, and for information about how to register for the webinar if you cannot attend the live screening). More formal online education opportunities are available through courses offered as part of the Survey Research Methods Online Certificate Program (see ), which are open to both current and nondegree students.


No. 55. Health Survey Research Methods Proceedings Available.

Between 1975 and 2013, ten periodic conferences concerned with health survey research methodology were held in the United States. These conferences, supported primarily by government agencies, have continuously tracked developments, innovations, and challenges in the design and implementation of health surveys. Proceedings from these conferences contain summaries of the presentations given at each conference and represent a valuable resource for health survey researchers about advances in health survey methods. The Survey Research Laboratory has been a leader in organizing and hosting many of these meetings and has organized on its Web site PDF versions of the proceedings from all ten of these conferences. These documents can be accessed at


No. 61. Avoiding Research Misconduct.

The U.S. Department of Health and Human Services’ Office of Research Integrity (ORI) and Office for Human Research Protections (OHRP) have an online training module called The Research Clinic designed to teach both clinical and social researchers how to avoid research misconduct and protect subjects. In this module, participants can assume the role of principal investigator, clinical research coordinator, research assistant, or IRB coordinator. It is a valuable tool for teaching everyone involved in the research process--especially those new to subject recruitment and/or data collection--the consequences of deviating from established protocols.

The Research Clinic can be found at:


No. 86. What is Frugging? What is Sugging?

Ever agree to participate in a survey, only to realize after answering a few questions that you are really in the middle of a sales pitch or fund-raising call? If so, you may have been the victim of sugging (i.e., “selling-under-the-guise-of-research”) or frugging (i.e., “fund-raising-under-the-guise-of-research”). These commonplace and highly unethical practices, which disguise either selling or fundraising as a scientific survey, give legitimate researchers a bad name and lead many people to be suspicious of all research and resistant to almost any request for survey participation. Sugging and frugging have both been publicly condemned by the American Association for Public Opinion Research (AAPOR) and other professional research associations.

For more information, visit the following web pages:

AAPOR Condemned Survey Practices:

AAPOR Statement on Trump/Pence Campaign Web Survey:


No. 92. AAPOR Releases Evaluation of 2016 Election Polls.

On May 4, the American Association for Public Opinion Research (AAPOR) released its much anticipated report concerning the accuracy of 2016 national and state election polls in the U.S. Key conclusions from that report include:

  • "National polls were generally correct and accurate by historical standards"
  • "State-level polls showed a competitive, uncertain contest but clearly under-estimated Trump’s support in the Upper Midwest"
  • There were multiple reasons why the polls under-estimated support for Trump, including:
    • "Real late change in voter preference during the final week of the campaign"
    • Adjusting for the over-representation of college graduates was necessary, but many polls failed to do so
    • "Some Trump voters who participated in pre-election polls did not reveal themselves as Trump voters until after the election, and they out-numbered late-revealing Clinton supporters"
  • "Ballot order effects may have played a role in some state contests, but they do not go far in explaining the polling errors"
  • Predictions that Clinton had a very high probability of winning "helped crystalize the erroneous belief that Clinton was a shoo-in for president, with unknown consequences for turnout"
  • "A spotty year for election polls is not an indictment of all survey research or even all polling"

The full report (and technical appendices) can be found at:

A recording of the press conference during which the report was discussed can be found at:


No. 93. Academic Survey Centers.

Survey News Bulletins are written and distributed by the Survey Research Laboratory of the University of Illinois, an academic survey research center with offices in both Chicago and Urbana that provides services to the University community as well as to Chicagoland and other Illinois communities. A recent article published by Inside Higher Ed ( ) examines the role that academic survey centers play on campuses across the U.S. The article concludes that, in addition to supporting faculty research, these centers provide many benefits to a university: offering experiential learning opportunities to students; exposing students, staff, and faculty to cutting-edge research methodologies; contributing to the larger community; equipping students with practical and transferable skills; and collecting data about the university itself that can be used to improve its programs and services. You can learn more about the University of Illinois Survey Research Laboratory here:


No. 100. A Look Back at the First 100 Survey News Bulletins.

This week, we celebrate the 100th Survey News Bulletin! The ninety-nine bulletins produced since the SNB started in April 2014 have covered a wide range of topics, from data collection mode, sampling, and questionnaire design to analysis of survey data and important events related to surveys and survey methodology. Future SNBs will continue to bring you the most up-to-date best practices for designing and conducting survey research. As part of the 100th SNB celebration, we are also introducing a new feature: we would like your input on topics you would like covered in future SNBs! You can go to the SNB website (see ) to submit an idea. Next week, we'll resume our series of SNBs covering different types of data that can be combined with survey data.

We also want to remind you of the upcoming free Webinars in November from SRL. These Webinars on Social Desirability in Survey Research (Tim Johnson, presenter) and Survey Experiments (Allyson Holbrook, presenter) will be held at noon on November 1 and 8, respectively. These webinars are free to University faculty, staff, and students, but they do require advance online registration (see ). Recordings of past SRL webinars on a variety of survey-related topics are also available at:


No. 112. User Experience (UX) and Survey Research.

One potential future application of survey research is user experience (UX) research. UX research evaluates a product or service from the perspective of the end user in order to avoid the bias that can result from relying exclusively on the designer or developer. UX is an emerging field that has grown in prominence with the explosion of social media and other technological advances, and it is an outgrowth of the field of Human-Computer Interaction (HCI), which focuses on the interaction between humans and technology.

There is a distinction between usability testing and UX: the former concerns the ability of the respondent to use a “tool” to successfully complete a task, whereas the latter is a broader concept that considers the respondent’s whole experience, including beliefs, feelings, and perceptions. Given the increasing use of social media as a means of communication and a new way of human socialization, understanding user experience has become even more important.

Because surveys are one of the best ways to assess subjective perceptions, experiences, and beliefs, they have become increasingly valuable to UX research. Many online survey tools, such as Qualtrics, SurveyGizmo, and SurveyMonkey, make it possible to insert images into surveys and gather valuable data on several different metrics: perceived ease of use, likelihood to use, visual appeal, page layout, satisfaction, etc.

UX is constantly growing and focused on tackling new types of experiences, technologies, and innovative communication approaches. Surveys represent one tool that can be used in UX research to do so. UX research represents one likely area of major growth in the future of survey research.

For more information, see:

Hill, Craig A., et al. Social Media, Sociality, and Survey Research, Wiley, 2013. ProQuest Ebook Central,

Albert, William, and Thomas Tullis. Measuring the User Experience: Collecting, Analyzing, and Presenting Usability Metrics, Elsevier Science, 2013. ProQuest Ebook Central,


No. 114. Cognitive Aspects of Survey Methodology (CASM).

The cognitive aspects of survey methodology (CASM) movement began in the 1980s as an effort to create an interdisciplinary field that applied cognitive science to understand the process by which respondents answer survey questions. This approach treats answering survey questions as a cognitive task that respondents must complete.

Currently, there is widespread acceptance that respondents go through a four-step cognitive process when they answer survey questions (e.g., Tourangeau, Rips, and Rasinski 2000). First, respondents must understand the survey question and the task they are being asked to do; second, they must retrieve relevant information from memory; third, they must integrate that information into a judgment; and fourth, they must map that judgment onto the response format of the question (e.g., a set of specific response options or a scale).

Evidence from cognitive science about memory and retrieval processes has been applied to understanding how respondents complete each of these steps. The CASM movement has had an enormous impact on how researchers think about and write survey questions. It also led to the development of methods for cognitive pretesting of survey questionnaires (see SNB #71) and has been the major theoretical foundation on which much current work on questionnaire design is based.

For more information, see:

Schwarz, N. (2007). Cognitive aspects of survey methodology. Applied Cognitive Psychology, 21(2), 277-287.

Tourangeau, R., Rips, L. J., & Rasinski, K. (2000). The psychology of survey response. New York: Cambridge University Press.


No. 117. Survey Satisficing.

SNB #114 described the cognitive aspects of survey methodology (CASM) movement and the cognitive steps that respondents go through when they answer a survey question. Krosnick’s (1991) theory of survey satisficing suggests that when respondents fully and carefully complete these steps, they are “optimizing.” However, optimizing involves a great deal of cognitive work. Some respondents who agree to participate in a survey will be unable or unmotivated to optimize. Other respondents may begin a survey by optimizing but may become fatigued and lose their motivation or ability to carry out the required cognitive steps as they progress through a questionnaire.

As a result, some respondents may sometimes shortcut their cognitive processes by engaging in either weak or strong satisficing. Weak satisficing amounts to a relatively minor cutback in effort: a respondent executes all the cognitive steps involved in optimizing, but less completely and with bias. Judgment patterns like response order effects (to be covered in a future SNB), acquiescence response bias (see SNB #20 and #21), and nondifferentiation (see SNB #11) are thought to reflect weak satisficing. Strong satisficing occurs when a respondent loses motivation entirely and instead seeks to offer responses that will seem reasonable to the interviewer without doing any memory search or information integration. No-opinion responding (to be covered in a future SNB) and mental coin flipping are two response patterns thought to reflect strong satisficing.

The likelihood that a respondent will satisfice is thought to be a function of three classes of factors: respondent ability, respondent motivation, and task difficulty. People who have more limited abilities to carry out the cognitive processes required for optimizing are more likely to shortcut them. People who have minimal motivation to carry out these processes are likely to shortcut them as well. And people are most likely to shortcut when the cognitive effort required by optimizing is substantial.

Some satisficing can be minimized by avoiding particular types of questions that provide an easy cue for respondents (e.g., agree-disagree questions, long batteries of items that use the same response options, and questions with explicit no opinion or don’t know options), by minimizing questionnaire length, and by convincing respondents of the value and importance of the survey. In other cases, it may be useful to assess and control for the effects of satisficing (e.g., response order effects, whereby the order in which response options appear affects the distribution of responses) on survey responses.

For more information, see:

Krosnick, J. A. (1991). Response strategies for coping with the cognitive demands of attitude measures in surveys. Applied Cognitive Psychology, 5, 213–36.

Krosnick, J. A., Narayan, S., & Smith, W. R. (1996). Satisficing in surveys: Initial evidence. New Directions for Evaluation, 70, 29–44.


No. 120. Surveys as Conversations.

SNB #114 described the cognitive aspects of survey methodology (CASM) perspective whereby the process of answering survey questions is framed as a cognitive task. However, surveys are not simply a cognitive task. Surveys are also a type of conversation and a substantial amount of evidence suggests that respondents react to survey questions in ways that they would to everyday conversation.

One feature of conversations is that they are governed by conversational norms. Because speakers and listeners are cooperating to communicate effectively and efficiently with one another, they make assumptions about their communication that help them to do so. Many of these conversational norms were outlined by Grice as a series of maxims (Grice 1975). Listeners assume that speakers provide relevant and truthful information and do not provide information that is unnecessary. In surveys, one consequence of this is that survey respondents sometimes infer meaning from information in unintended ways – for example, they infer meaning from numbers assigned to response options for clerical purposes or from other design choices (e.g., Schwarz 1996).

Conversations are also guided by conversational conventions. In contrast to conversational norms, these are habits of speech that do not communicate added meaning. One such convention is that it is conventional to offer positive words before negative. This has implications for survey questions that use response options such as “favor or oppose,” “agree or disagree,” “for or against,” and “like or dislike.” Although one might want to rotate the order of these options to estimate and control for response order effects (see SNB #119), the conversational conventions perspective suggests that doing so (e.g., asking respondents if they oppose or favor a proposed policy) distracts respondents and introduces error into their answers (Holbrook et al. 1999). Rotating response option order for sets of response options where there are conventions about order (particularly those that use positive and negative response options) is therefore not recommended, although this practice can be useful for estimating and controlling for response option order effects when there is no ordering convention (e.g., for a question where the response options are a set of discrete categorical answer choices).

These studies suggest that it is important for survey researchers to treat survey questionnaires as conversations. Specifically, it is important to avoid providing unnecessary information to respondents that they may interpret in unintended ways and to be aware of how design decisions may be interpreted by respondents. It is also important to follow conversational conventions in writing survey questions and presenting response options.

For more information, see:

Grice, P. (1975). Logic and conversation. In P. Cole & J. Morgan (Eds.), Syntax and semantics (pp. 41-58). New York: Academic Press.

Holbrook, A. L., Krosnick, J. A., Carson, R. T., & Mitchell, R. C. (1999). Violating conversational conventions disrupts cognitive processing of attitudes questions. Journal of Experimental Social Psychology, 36(5), 465-494.

Schwarz, N. (1996). Cognition and communication: Judgmental biases, research methods, and the logic of conversation. New York: Psychology Press.


No. 128. Free Webinar Recordings Available.

The Survey Research Laboratory at the University of Illinois (SRL) first began offering workshops to faculty, students, and staff on the Urbana and Chicago campuses in 1998. Over about the past five years, these workshops have moved to an online webinar format so that they are accessible not only to University of Illinois faculty, students, and staff on multiple campuses, but also to people off campus. SRL has typically offered between 2 and 5 webinars each semester. Currently, SRL is engaged in a reorganization of its services and staff and will not be able to offer webinars this semester. However, recordings of past webinars are freely available on SRL’s website at: These past webinars cover a range of topics, including an introduction to survey sampling, sampling rare populations, sampling hard-to-reach respondents, web surveys, questionnaire design, making decisions about response format in survey questionnaire design, cognitive pretesting of survey questionnaires, conducting focus groups, ethics in survey research, culture and survey measurement, survey design and data quality, constructing survey data sets, survey data analysis, and the use of agree-disagree questions in surveys.


No. 129. AAPOR’s 2018 SurveyFest being hosted at UIC.

UIC will be hosting the American Association for Public Opinion Research’s (AAPOR) inaugural SurveyFest this fall. Part of AAPOR’s diversity initiative, SurveyFest is a free one-day conference to introduce undergraduate and graduate students in the Chicago area to opportunities related to survey and public opinion research. Panels will discuss careers, internships, and graduate school opportunities. The event will be held on November 3 in UIC’s Student Center East. More information is available here. The event is free to interested students, but pre-registration is required. Interested students can register here.


No. 131. Anonymous vs Confidential Data Collection.

The next few survey news bulletins will address issues of informed consent, confidentiality, and their implications for the Institutional Review Board (IRB) application process. For survey data collection to be anonymous, data must be collected without any name or identifiers, so that no one – including the investigator – can link an individual person with their responses. For example, if you ask residents who attend a community policing meeting to complete a paper questionnaire on their reactions to the meeting once it’s over, and that questionnaire does not ask for any personal or contact information, this would be considered anonymous data.

Web-based surveys that are accessed through a common URL (instead of via individualized links) may be considered anonymous if they do not ask the respondent for any identifying information. However, most Web survey tools (including Qualtrics) collect the IP address of the respondent’s computer, and when subjects are invited to participate through an individual link tied to their e-mail address, the data collection cannot be called anonymous. There are some cases, though, in which IRBs have allowed IP addresses to be collected while still classifying the protocol as anonymous, provided the IP addresses are not downloaded from the web survey platform with the survey data.

Confidential data are data that could theoretically be linked to the survey respondent. Confidential data are usually coded; this code identifies the subject but is kept separately from the data file. Coded data are never anonymous. For example, if you send a mail survey to school principals, you will assign each principal a unique identification code so that you can track who has not responded for the purpose of nonresponse follow-up. Your final data file will contain only that identification number, but the separate sample file will link that ID number to the principal. Your IRB application will need to be specific about how and where the coded sample file will be securely stored.
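The coded-data practice described above can be sketched as two separate files: a de-identified analysis file keyed only by the ID code, and a linkage file mapping IDs back to respondents that is stored separately and securely. The file names and records here are hypothetical; in practice the linkage file would live on a separate, access-controlled system.

```python
import csv

# Hypothetical tracking list for a mail survey of school principals
sample = [
    {"id": "P001", "name": "J. Rivera", "school": "Lincoln Elementary"},
    {"id": "P002", "name": "A. Chen", "school": "Washington Middle"},
]
responses = [{"id": "P001", "q1": 4, "q2": 2}]  # answers keyed by ID only

# Analysis file: contains the code but no identifying information
with open("survey_data.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["id", "q1", "q2"])
    writer.writeheader()
    writer.writerows(responses)

# Linkage file: stored separately (and securely) from the data file
with open("linkage_file.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["id", "name", "school"])
    writer.writeheader()
    writer.writerows(sample)

# Nonresponse follow-up: sampled principals with no response on file
responded = {r["id"] for r in responses}
followup = [s["name"] for s in sample if s["id"] not in responded]
print(followup)  # ['A. Chen']
```

Because the analysis file carries only the ID, a breach of that file alone does not identify respondents; the protection rests on keeping the linkage file locked down.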


No. 132. Consent vs Assent in Research.

Informed consent is the process by which potential research subjects (commonly referred to as human subjects) are informed about your research study. Human subjects may participate in your study only if they have been adequately informed about the risks, benefits, and procedures involved with your research and voluntarily agree to participate.

Consent may only be given by individuals who have reached the legal age of consent (in the U.S., this is age 18). Assent is the agreement of someone unable to give legal consent to participate in a research activity. This includes adults who do not have the legal capacity to consent for themselves, in which case the signature of a legal guardian is also required, as well as minor children. Federal regulations require that children be asked to provide assent whenever they are capable of doing so. Thus, any data collection with children not capable of giving consent requires the permission of the parent or legal guardian and the assent of the minor subject. Usually, children aged 7 and older are asked for verbal assent; older children may be asked to sign an assent form, which is written at an age-appropriate level. IRB requirements may vary regarding the age at which a signed assent form is required, and waivers may be granted under some circumstances.

For sample consent and assent templates, check out your institution’s IRB website; UIC OPRS examples can be found here:


No. 133. How do survey respondents provide consent?

The typical survey-based study will allow for the key elements of informed consent to be provided to respondents in a clear and brief manner. In a telephone survey, these elements can be provided in the brief introductory statement before beginning the interview. For a mail survey, the consent elements can either be included in the cover letter accompanying the questionnaire, or with a separate study information sheet. In a Web survey, the introductory page the respondent sees upon clicking their unique survey link would contain these consent elements.

Obtaining informed consent is always a precondition to participation in survey research. In some studies, signed consent forms may increase risks to respondents if confidentiality is breached, and therefore reduce cooperation (e.g., surveys of immigrants). Since nonresponse is a primary source of error in surveys, signed consent forms can increase nonresponse error and increase respondent burden without the gain of protecting respondents from significant risks. The federal regulations on human subjects protections (45 CFR 46.117(c)) acknowledge that documentation of informed consent is not necessary or desirable in every research setting. The regulations allow IRBs to waive requirements for informed consent under certain circumstances.


No. 134. IRB Guidance on Survey Incentives.

Incentives for survey participation are now common practice, as they have been shown to increase survey response rates (see SNB #14). For research governed by IRB review, it is important to note that compensation to research subjects is not a ‘benefit’ of research; rather, incentives are meant to offset the time and inconvenience of participation in your study (as well as motivate your respondent to participate).

Survey incentives usually take the form of cash, gift cards, or checks, though sometimes non-monetary incentives are used (such as gifts or extra credit). It is important that your protocol and informed consent language clearly specify the incentive structure and who is eligible for payment. For example, if your study involves two stages of participation -- a baseline Web survey and then a paper travel diary -- you will need to specify whether the incentive is paid only upon completion and submission of the travel diary. Compensation may also take the form of a lottery (for example, 20 respondents will be randomly selected to receive an Amazon gift card). IRBs will typically want to see that your protocol specifies a fair method for selecting winners and that the odds of winning are included in the consent language.
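The lottery odds that IRBs expect to see in the consent language are simple arithmetic; a minimal sketch, with hypothetical numbers:

```python
def lottery_odds(winners, expected_completes):
    """Chance of winning a survey incentive lottery, expressed as a
    proportion, for inclusion in the consent language."""
    return winners / expected_completes

# Hypothetical: 20 gift cards drawn among an expected 500 completes
odds = lottery_odds(20, 500)
print(f"Your chance of winning is about 1 in {round(1 / odds)}.")  # 1 in 25
```

The stated odds depend on the expected number of completes, so consent language typically hedges with "approximately" if the final sample size is uncertain.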

There are no specific federal regulations for what constitutes an appropriate incentive amount for a given study. IRBs typically look to norms for the respondent population along with study-specific factors. If the population is from a high-income stratum (for example, physicians), IRBs would likely approve a higher incentive amount to complete a 30-minute Web survey than they would for the general population. While the federal regulations do not directly address payments to subjects, incentives can become a concern if they are deemed to induce subjects to participate against their better judgment or otherwise affect their ability to fully consider the risks of participation. However, research suggests that typical survey incentives are not coercive (Singer and Bossarte 2006).

See also:

Singer, E., & Bossarte, R. M. (2006). Incentives for survey participation: When are they "coercive"? American Journal of Preventive Medicine, 31(5), 411-418.


No. 136. Proxy Reporting from Secondary Research Subjects.

A study design may call for secondary research subjects, or “proxies,” to be used in data collection. The Encyclopedia of Survey Research Methods defines a proxy as “a respondent who reports on the properties or activities of another person or group of persons (e.g., an entire household or a company).”

The survey design may have a component where proxies are always used for a specific part of a survey, such as parents or guardians answering survey questions about their child who is under a certain age. Or there may be special situations where proxy respondents are occasionally allowed, such as when the selected respondent has a chronic health condition interfering with their ability to participate in the survey, or a selected respondent who does not have the capacity to give informed consent.

Using a proxy respondent can enhance survey data by providing a response when none would otherwise be available, or by obtaining data from a population that would otherwise be hard to survey (e.g., young children). However, there are ethical considerations in the use of proxy respondents. These include:

  • Would the respondent wish to participate in the research?
  • How will you consistently determine whether a proxy respondent is needed? Make sure this part of recruitment and consent is covered in interviewer training if proxies are allowed.
  • Is the proxy equipped to answer questions on behalf of the respondent? What about certain questions that are more attitudinal?
  • When proxies are needed, who is the best proxy? Make sure to take characteristics of the proxy into account when analyzing the data.
  • Should ability to consent be re-examined along the survey process for those with more than one phase or interview? For instance, what if the respondent regains ability to consent – or reaches the eligible age to consent -- but was originally enrolled by a proxy?

It is important to remember that a proxy should not be substituted for the selected respondent in special situations if this was not anticipated by the research team and already approved by the Institutional Review Board. Always include potential proxy situations in the initial IRB protocol.

For more information, see:


No. 137. Effects of Low Response Rates in Telephone Surveys.

It is commonly understood that telephone survey response rates have dropped considerably over the past two decades. An interesting report from the Pew Research Center investigates the effects of the declining response rates on the quality of survey estimates obtained. Although we recommend you read the full report (see link below), their basic findings are very instructive. When a typical telephone survey is compared to benchmark data from the Census Bureau’s Current Population Survey, estimates derived from each are very similar across a variety of demographic, economic, and lifestyle benchmarks such as political party and religious affiliation. The telephone survey, however, appears to considerably overestimate multiple indicators of civic engagement, such as volunteer activities, trust and communication with neighbors, and helping to solve neighborhood problems. This suggests that low-response-rate telephone surveys tend to over-represent persons who actively participate in community activities. Taken as a whole, this research suggests that low response rates can lead to serious bias for some -- but not all -- measures.

For the full Pew report, see:


Data Collection/Survey Management

No. 6. Responsive survey design & interviewer ratings.

Response rates (i.e., the proportion of eligible respondents who participate in a survey) have been decreasing and are of great concern to researchers, particularly for telephone surveys. One response has been research exploring the usefulness of responsive design procedures, where contact efforts are guided by information and evidence collected from previous contacts. One source of such evidence is ratings from interviewers in both face-to-face and telephone surveys. Interviewer ratings of response likelihood were predictive of cooperation in a telephone survey, suggesting that interviewer ratings could be a useful input to responsive design procedures.
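As a hypothetical sketch of how such ratings might feed a responsive design (the case IDs and 1-5 rating scale are invented for illustration), remaining cases could be prioritized for the next contact round by rated cooperation likelihood:

```python
# Illustrative sketch: prioritize remaining cases by interviewer-rated
# likelihood of cooperation (1 = very unlikely ... 5 = very likely).
cases = [
    {"id": "C1", "rating": 2},
    {"id": "C2", "rating": 5},
    {"id": "C3", "rating": 4},
]

# Contact the most promising cases first in the next round of attempts.
call_order = sorted(cases, key=lambda c: c["rating"], reverse=True)

def response_rate(completes, eligible):
    """Response rate as defined above: completed cases / eligible cases."""
    return completes / eligible
```

In practice, responsive designs combine such ratings with paradata (contact history, timing) rather than relying on a single sort key.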

Eckman, S., Sinibaldi, J., & Montmann-Hertz, A. (2013). Can interviewers effectively rate the likelihood of cases to cooperate? Public Opinion Quarterly, 77, 561-573.
Groves, R. M., & Heeringa, S. G. (2006). Responsive design for household surveys: Tools for actively controlling survey errors and costs. Journal of the Royal Statistical Society: Series A, 169, 439-457.


No. 8. How long does it take to field a study?

Everyone wants their data fast, but it takes time to design and conduct a high-quality survey. A typical Web survey can take 3-5 months; a mail survey, 4-6 months; and the time needed to conduct a telephone or face-to-face survey depends on many factors that don't lend themselves to predictable time frames (such as geographical dispersion of the population and sample size). Steps that affect data collection time include the following:

  • Questionnaire development and testing (especially when new questions or measures are being developed, or when the instrument must be programmed for self-administration on the Web or for interviewer administration in CAI software). This can be particularly time consuming if many stakeholders are involved in the questionnaire development process.
  • Sample frame development.
  • IRB review and approval (always leave time to respond to modifications! Do not expect approval upon initial submission).
  • Cognitive pretesting.
  • A thorough pilot study.
  • Time to amend the IRB protocol based on the pilot study.
  • Adequate time to collect data--plan for more than you think (you may decide to do another mailing, for example, if returns have been slow to come in).
  • Time for data processing and cleaning before the final data set is ready.


No. 14. Noncontingent incentives are more effective than contingent incentives.

Meta-analyses of randomized experiments have demonstrated that providing potential respondents a prepaid (i.e., noncontingent) incentive is more effective than a promised (i.e., contingent) incentive for increasing survey response rates in both self-administered mail (Church, 1993) and interviewer-mediated telephone and face-to-face surveys (Singer et al., 1999). In addition, monetary incentives are consistently found to be more effective than gifts and other forms of non-monetary incentives for increasing response rates.

Church, A. H. (1993). Estimating the effect of incentives on mail survey response rates: A meta-analysis. Public Opinion Quarterly, 57, 62-79.
Singer, E., Van Hoewyk, J., Gebler, N., et al. (1999). The effect of incentives on response rates in interviewer-mediated surveys. Journal of Official Statistics, 15, 217-230.


No. 15. Survey budgeting.

Developing a survey requires trade-offs between data quality and the cost of obtaining the data. Beyond the labor costs associated with professional time for survey/sample design, questionnaire development and data analysis, there are myriad expenses that apply depending on your mode of data collection and study design. Whether you are fielding your own data collection effort or hiring a professional survey research firm to collect data, here is a sample list of expenses you should anticipate:

  • Questionnaire length (for mail surveys, affects printing, postage, and data entry costs; for in-person and telephone interviews, affects total interview time and interviewer costs)
  • Geographic dispersion of the sample
  • Whether screening is required to find target population
  • Printing costs
  • Postage
  • Telephone charges
  • Materials (envelopes, paper)
  • Translation of questionnaire and recruitment materials into other languages
  • Pretesting
  • Respondent incentives and disbursement (if they are mailed to subjects later, labor, materials, and postage on top of the incentive)
  • Number of contact attempts
  • Interviewer travel time and mileage
  • Validation
  • Data entry and processing
  • Software license fees for Web or computer-assisted interview questionnaires
  • Equipment purchases (computers or other electronic data collection tools such as tablets for in-person surveys in particular)

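To illustrate how some of these components might be combined for a mail survey, here is a hypothetical cost sketch; all unit costs and parameter names are invented placeholders, not price quotes:

```python
# Illustrative mail-survey cost sketch. Every unit cost here is a made-up
# placeholder; substitute your own vendor quotes and labor rates.
def mail_survey_cost(n_sampled, n_mailings, printing=1.50, postage=0.73,
                     materials=0.40, incentive=2.00, data_entry=3.00,
                     expected_return_rate=0.35):
    per_packet = printing + postage + materials
    mailing_cost = n_sampled * n_mailings * per_packet
    incentive_cost = n_sampled * incentive          # prepaid to everyone
    returns = round(n_sampled * expected_return_rate)
    entry_cost = returns * data_entry               # only returns are keyed
    return mailing_cost + incentive_cost + entry_cost

# Example: 1,000 sampled addresses, two mailings.
total = mail_survey_cost(n_sampled=1000, n_mailings=2)
```

A spreadsheet serves the same purpose; the point is that questionnaire length, number of mailings, and incentive structure each multiply across the whole sample, so small per-unit changes move the total substantially.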
See Blair, J. E., Czaja, R. F., & Blair, E. A. (2013). Designing surveys (3rd ed., pp. 337-343). Thousand Oaks, CA: Sage.


No. 32. Conducting health surveys.

Health-related research relies heavily on the use of survey methodologies, and these methodologies have become increasingly sophisticated in recent decades. Two recent books summarize much of the research literature on this topic. Aday and Cornelius (2006), in the third edition of Designing & conducting health surveys, walk investigators through each phase of the survey process, from conceptualization through operationalization, data collection, and basic analyses. Johnson's (2015) edited volume Handbook of health survey methods presents 29 chapters written by experts in various aspects of health survey methodology, covering detailed topics under the headings of Design and Sampling, Measurement, Field Data Collection, Special Populations, and Data Management and Analysis.

For further information about health survey research methods, see

Aday, L. A., & Cornelius L. J. (2006). Designing & conducting health surveys (3rd ed.). San Francisco: Jossey-Bass.

Johnson, T. P. (Ed.) (2015). Handbook of health survey methods. Hoboken, NJ: John Wiley & Sons.


No. 44. Contact Attempts.

Many researchers choose to hire students or graduate research assistants to collect data, particularly with list samples of subjects. It is important to consider when you are most likely to reach your population so that you can staff the study accordingly. For example, a list of high school principals would only need contact on weekdays during business hours. But if you are trying to reach parents who have children in day care, you will need evening and weekend contact attempts to maximize your chance of reaching your subjects. You therefore need to hire your staff with the availability that closely matches that of your population. Moreover, varying the dates and times of contact attempts will affect your response rate. If you have students who only work Monday, Wednesday, and Friday, then you are not making any contact attempts on the other days of the week. Also avoid making contact attempts only in the early evening or Saturday morning, for example, as you will repeat the same patterns of noncontact; you need variety not only in the days of the week attempts are made but also in the times of day. At a minimum, 10 varied contact attempts should be made on each case before finalizing it as a noncontact. Finally, consider how student semester schedules overlap with your data collection schedule. You should avoid fielding your study during known breaks when students will most likely be absent.
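One simple way to build the variety described above, sketched here with illustrative day and time-slot names, is to stagger attempts so the schedule rotates through both days of the week and times of day rather than repeating the same noncontact pattern:

```python
# Illustrative sketch of a varied contact-attempt schedule; the slot
# names and staggering rule are invented for demonstration.
days = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]
slots = ["morning", "afternoon", "evening"]

def attempt_schedule(max_attempts=10):
    """Return up to max_attempts (day, slot) pairs that vary both the
    day of the week and the time of day across successive attempts."""
    schedule = []
    for i in range(max_attempts):
        day = days[i % len(days)]
        # Shift the time slot after each full pass through the week so a
        # given day is not always attempted at the same time.
        slot = slots[(i + i // len(days)) % len(slots)]
        schedule.append((day, slot))
    return schedule

schedule = attempt_schedule()
```

Note the schedule includes weekend and evening slots; a real calling plan would also layer in case-level history (e.g., skip slots that already produced noncontacts).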


No. 46. Panel Retention.

Panel studies involve the collection of data over time from the same sample of respondents. Unlike other forms of longitudinal studies, panels allow for the study of individual behavior change over time. However, because the same individuals are followed, there is eventual attrition, or nonresponse, after the baseline data collection wave. Attrition is either due to the researcher’s inability to locate the respondent for additional waves of data collection or to the respondent declining to participate when located. Since the value of panel surveys is dependent upon the ability to study the same respondents at different points in time, reducing attrition is of major concern in social and behavioral research. Loss of respondents over time raises the possibility of bias if those who are lost to follow-up differ from those who remain in the panel on key dependent variables. Therefore, panel attrition can affect both the internal and external validity of the study (Cook and Campbell, 1979). There are three main factors that will affect the degree of attrition in any panel study: (1) recruiting the respondent into the study; (2) successfully locating the respondent for subsequent interviews; and (3) maintaining the respondent’s commitment to the panel.

For more information, see:

Parsons, J.A. (2015). Longitudinal research: Panel retention. In J. D. Wright (Ed.-in-chief), International Encyclopedia of the Social & Behavioral Sciences (2nd ed., Vol 14, pp. 354–357). Oxford: Elsevier.

Cook, T. D., & Campbell, D. T. (1979). Quasi-experimentation: Design and analysis issues for field settings. Boston: Houghton Mifflin.


No. 47. Using Mock Interviews for Interviewer Training.

Training of interviewers for telephone or face-to-face data collection should always include practice interviews. Sometimes referred to as "mock" interviews, this hands-on practice should be as close as possible to an actual interview interaction and involve a trainer or supervisor playing the role of the respondent. Interviewers should be expected to read and record exactly as they would in an actual interview, and trainers should provide realistic practice scenarios.

Mock interviews with a screening questionnaire allow interviewers to practice introductions, avert refusals, and answer common respondent questions about the study. Group sessions of practice interviews with the screener and introduction can be conducted with each interviewer taking turns reading the same text and responding to scenarios posed by the trainer. As these practice interviews progress, they can be used to test interviewers in various scenarios and allow evaluation of reading, coding, and note taking on open-ended captures.

During these mock interview sessions, interviewers should be given immediate feedback on pacing, verbatim reading, probing, and following instructions. Effective mock interviews will engage interviewers in the training process, help them work through nerves, and allow them to listen to other interviewers. They also reinforce the importance of standardization and establish the role of feedback in the interview process.

For more information:

Lavrakas, P. J. (2008). Role playing. In P. J. Lavrakas (Ed.), Encyclopedia of survey research methods: Volume 1 (p. 768). Thousand Oaks, CA: Sage.

Fowler, F. J., Jr. (2009). Survey research methods (4th ed., pp. 127-145). Thousand Oaks, CA: Sage.


No. 54. Monitoring and Interviewer Feedback.

Monitoring the work of interviewers is an essential part of data collection. Monitoring is done to gauge interviewer productivity, assess the quality of each interviewer’s work, minimize errors in data collection, and guard against falsification.

In a phone center, monitoring is typically done remotely, with a monitor listening to both the interviewer and respondent and watching data entry as it happens. For face-to-face interviewing, monitoring can be done by accompanying an interviewer into the field and observing the interview process. It can also be supported by validation, a process by which some respondents are re-contacted to confirm that the interview was conducted properly. Among the questions to consider during monitoring:

  • Are interviewers providing study information accurately to informants and respondents?
  • Are interviewers averting refusals effectively?
  • Are procedures for dialing cases or visiting sampled addresses being followed?
  • Are cases being coded correctly?
  • Is screening being conducted correctly?
  • Are notes accurate, succinct, and sufficiently detailed?
  • Are interviewers working efficiently?
  • Are questions being read verbatim, at the right pace, and with correct emphasis?
  • Are interviewers probing when necessary?
  • Is probing neutral and thorough?
  • Is data entry accurate?

Feedback given to interviewers immediately after monitoring should point out and validate correct behaviors and provide constructive feedback on things interviewers need to do better. It should be supported by specific examples from the interviewer’s work. Comments should be written and saved so there is a record of each monitoring, feedback given, and an overall snapshot of interviewer performance.

All new interviewers should be monitored early in a study to make sure they are following study procedures. Monitoring should continue throughout a study to make sure procedures are (still) being followed.

For more information:

Steve, K. W. (2008). Interviewer monitoring. In P. J. Lavrakas (Ed.), Encyclopedia of survey research methods: Volume 1 (pp. 372-375). Thousand Oaks, CA: Sage.

Czaja, R., & Blair, J. (2005). Designing surveys: A guide to decisions and procedures (2nd ed.). Thousand Oaks, CA: Pine Forge Press.


No. 56. Interviewer Training and Frequently Asked Questions (FAQs).

As part of training, interviewers should be given specific instructions about how to respond to commonly-asked informant or respondent questions. Scripted answers to frequently asked questions that might be posed (often called the FAQ) provide study-specific information while addressing respondent concerns. Written in everyday language, the FAQ are designed to help interviewers address respondent concerns and questions. In particular, FAQ often address respondent concerns that might be barriers to participation.

Reading the FAQ aloud during training is a first step in helping interviewers learn the responses they will have to provide while they are on the phone or in the field; they should be handy during mock practice interviews and refusal aversion practice (interviewer training specifically designed to help interviewers avoid respondent refusals). During a study, they are a resource for answers to questions that are asked less often or for providing details that can be hard to remember. For telephone surveys, FAQs are typically posted in booths in a phone center; they are carried by face-to-face interviewers in the field.

Answers should be scripted for any and all concerns that can be identified a priori, including sampling (How did you get this number? / Why did you pick me? / Can’t you interview someone else?); confidentiality (Who will see my answers? / How will this information be used?); respondent burden (How long will this take?); survey topic or knowledge concerns (What is this about? / Why do you want to know about x?); and who to call for more information. They can also script quick study-specific comebacks to refusals such as “I’m not interested” or “I’m too busy.” While the goal is for interviewers to be able to use the information in the FAQ to provide responses in their own words or phrasing as they gain experience on a study, FAQs should be short enough that they can be committed to memory early on. Researchers may want to revise or add to the FAQs if additional common questions or concerns are identified after data collection has begun.

For more information:

Moore, D. L., & Tarnai, J. (2008). Interviewer training. In P. J. Lavrakas (Ed.), Encyclopedia of survey research methods: Volume 1 (pp. 372-375). Thousand Oaks, CA: Sage.

Frey, J. H. (1989). Survey research by telephone (2nd ed.). Newbury Park, CA: Sage.


No. 59. Standardized Reading and Question Formatting.

The way questions are presented for an interviewer to read helps achieve the goal of standardized questionnaire administration. Standardized reading can be an ongoing challenge when multiple interviewers are working on a study, and attending to the details of question formatting and writing helps in this process.

  • Questions should be scripted so that interviewers are not tempted (or forced) to add anything to make a question sound complete. Questions that seem complete on paper may not be readable aloud. An absence of question stems (Would you say…) can lead to different interviewer readings. One interviewer may add a stem, but another may not. Similarly, interviewers may read response categories differently in the absence of punctuation such as commas.
  • Complicated question formats, while they can save space on a screen, often leave out things (such as question stems, punctuation, or even words) that make for good reading.
  • Use standard formatting for emphasis, text that should be read vs. not read, and for notation for acceptable readings in repetitive question series. Interviewers should understand the conventions that are used to express all of these things. Inconsistent question formatting or notation for emphasis can lead to different interviewer readings.
  • Avoid parentheticals (words that clarify other words); they usually are not readable aloud.
  • If interviewers are allowed to provide definitions, or if specific instructions are required for data entry or coding, they should be provided on screen, rather than left to interviewer memory.
  • Providing scripted transitions between sections can help interviewers avoid the temptation to add words in an effort to be "conversational."
  • The scripting of one mode (such as Web) will likely not transfer seamlessly to interviewer administration.

Read-throughs with interviewers and mock interview practice can reveal surprising things about how interviewers see questions and can be helpful in identifying shortcomings before a questionnaire is fielded.

For more information, see:

Fowler, F. J., & Mangione, T. W. (1990). Standardized survey interviewing: Minimizing interviewer-related error. Newbury Park, CA: Sage.

Houtkoop-Steenstra, H. (2000). Interaction and the standardized survey interview: The living questionnaire. Cambridge: Cambridge University Press.


No. 63. Interviewer Falsification.

A reality of survey research is that falsification by interviewers does happen -- and the risk is not just one for large, federally funded surveys administered by survey research organizations, but also for smaller studies where the principal investigator directly supervises a staff of interviewers. Falsification involves the interviewer’s intentional deviation from the study protocol, and includes fabricating all or part of an interview, changing outcomes of contact attempts with subjects, miscoding an answer to a question in order to skip out of follow-up questions, and interviewing a non-sampled person in order to reduce the amount of effort required to complete an interview. Preventing falsification involves fostering extrinsic and intrinsic motivation in study interviewers (Koczela et al., 2015). Detection of falsification requires resources that must be allocated at the budgeting and planning phase of your research.

For a summary of best practices on preventing and detecting interviewer falsification, see:

Interviewer Falsification in Survey Research: Current Best Methods for Prevention, Detection and Repair of Its Effects.


No. 68. What is the Difference between Interviewer Effects and Interviewer Variance?

Interviewers may unintentionally influence respondent behaviors in systematic ways during survey interactions. For example, considerable empirical research suggests that an interviewer’s observable characteristics -- such as gender, age and race/ethnicity -- may cue respondents to relevant social norms that then become integrated into their answers. This is believed to be most likely to happen when interviewer characteristics are directly relevant to the questions being asked. For example, interviewer gender may become relevant when respondents are answering questions about gender-related topics. These differential answers as a consequence of varying social identifiers are commonly referred to as interviewer effects.

In contrast, interviewer variance represents generalized differences across interviewers that are more idiosyncratic in nature, for example, how they phrase questions or probe responses. These differences may account for measurable amounts of unique variance across individual interviewers.

In most survey data analyses, both interviewer effects and interviewer variance remain unexamined, despite the fact that they may have significant influence on statistical estimates. The good news is that these can be evaluated using readily available software programs. Elliott and West (2015) present an example of an interviewer variance analysis. Davis et al. (2010) review the literature on interviewer effects.
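As a hypothetical illustration with synthetic data, the share of response variance attributable to interviewers can be approximated with a one-way ANOVA intraclass correlation; a real analysis would fit a multilevel model with interviewer random effects in dedicated software, but the sketch below shows the underlying logic:

```python
# Illustrative sketch: estimate the proportion of variance in an answer
# attributable to interviewers (an ANOVA-style intraclass correlation).
# The balanced-workload assumption (n0 = n / k) is a simplification.
def interviewer_icc(groups):
    """groups: list of lists, one inner list of responses per interviewer."""
    k = len(groups)                       # number of interviewers
    n = sum(len(g) for g in groups)       # total responses
    grand = sum(sum(g) for g in groups) / n
    ss_between = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    n0 = n / k                            # assumes roughly equal workloads
    var_between = max((ms_between - ms_within) / n0, 0.0)
    return var_between / (var_between + ms_within)
```

An ICC near zero suggests interviewers contribute little unique variance; larger values flag a clustering effect that, if ignored, understates the standard errors of survey estimates.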

For more information, see:

Davis, R. E., Couper, M. P., Janz, N. K., Caldwell, C. H., & Resnicow K. (2010). Interviewer effects in public health surveys. Health Education Research, 25(1), 14-26.

Elliott, M. R., & West, B. T. (2015). “Clustering by interviewer”: A source of variance that is unaccounted for in single-stage health surveys. American Journal of Epidemiology, 182(2), 118-126.


No. 70. Urban-Rural Differences in Survey Response Rates.

One of the most reliable findings in the survey nonresponse literature is the consistent, and sometimes large, urban-rural difference in survey response rates. The response rates obtained in rural areas to both telephone and in-person surveys continue to be higher than in more densely populated urban areas. This “urbanicity” effect has been documented both in terms of the ease with which respondents can be contacted (i.e., contact rates), as well as in terms of respondent willingness to participate in surveys once contacted (i.e., cooperation rates). In urban areas, longer work commuting times, greater proportions of single person households, and greater numbers of restricted access residences all pose barriers to successfully contacting potential respondents. Once contacted, crime fears, reluctance to engage with strangers, and the reduced social cohesion typical of many urban areas contribute to lower cooperation rates. These challenges suggest greater effort is necessary to complete field work in urban environments. Such efforts may include increasing the number and varying the timing of attempts to contact sampled households and individuals, longer data collection periods, decreased interviewer workloads, higher incentives, and more careful tailoring of interviewer-respondent interactions. More research, of course, is needed to address urban-rural disparities in survey response.


No. 73. Using Computer-Assisted Technology to Reduce Respondent Burden.

Surveys use computer-assisted survey information collection (CASIC) to aid in data collection in a variety of ways. Beyond its advantages in efficiency and error reduction, one of CASIC's main strengths is that it allows researchers to tailor survey questionnaires to each respondent to minimize respondent burden. Two specific strategies used to do so are programmed skip patterns and text fills.

Programmed skip patterns ask questions of respondents based on their responses to earlier questions. For example, a health survey might only ask about treatments for hypertension of those respondents who report they have been diagnosed with hypertension; or a post-election survey might only ask respondents who say they voted for whom they voted. The advantage of implementing skip patterns with CASIC is that it is possible in self-administered surveys to only show respondents the questions that apply to them. Similarly, in interviewer administered surveys, programmed skip patterns restrict interviewers to only see questions that should be asked of a given respondent. This is an advantage over paper-and-pencil questionnaires where either respondents (self-administered) or interviewers (interviewer-administered) need to follow instructions about which questions should be answered. Programmed skip patterns shorten the instrument, reduce cognitive burden and fatigue, and help to keep respondents engaged in the task of answering survey questions.
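A minimal sketch of such routing logic, using the hypertension and voting examples above (the question identifiers are invented for illustration), might look like this:

```python
# Illustrative skip-pattern routing: follow-up items are only shown to
# respondents whose earlier answers make them applicable.
def next_questions(answers):
    """Return the follow-up items a respondent should see, based on
    their answers to the filter questions."""
    questions = []
    if answers.get("diagnosed_hypertension") == "yes":
        questions.append("hypertension_treatment")
    if answers.get("voted") == "yes":
        questions.append("vote_choice")
    return questions
```

In a real CASIC instrument this logic lives in the survey software's routing rules rather than hand-written code, but the effect is the same: neither the respondent nor the interviewer ever sees an inapplicable item.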

Text fills are another CASIC tool: they involve filling in a word, phrase, or value in a question based on a respondent’s answers to one or more previous questions. For example, a respondent might first be asked to indicate the most important problem facing the country today. Later questions can be tailored to reference the problem s/he identified. A respondent who reports that “the economy” is the most important problem facing the country could be asked “Do you have more confidence in the Republican Party or the Democratic Party to deal with the economy?”, whereas a respondent who reports that the most important problem facing the country is "the moral climate" might be asked “Do you have more confidence in the Republican Party or the Democratic Party to deal with the moral climate?” Text fills may also be used to provide memory cues for later questions. For example, a question might ask a respondent the date of their last doctor’s appointment. That date can then be used as a cue in later questions about the appointment (e.g., "When you visited the doctor on [filled date], were you satisfied or dissatisfied with the amount of time the doctor spent with you?"). Text fills can also draw from more than one previous question and can involve calculations. For example, a series of questions in a survey might ask a respondent about the number of people in different age categories (e.g., under 5, 5-12, 13-17, and 18 or older) living in the respondent's household. A text fill in a later question could use the total household size (i.e., the sum of all these responses). Text fills simplify the task of answering follow-up questions because the respondent doesn’t need to remember his or her responses to earlier questions.
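A minimal sketch of text fills, using the examples above (the answer store, variable names, and dates are illustrative), might look like this:

```python
# Illustrative text fills: earlier answers are inserted into later
# question wording, including a value computed from several answers.
answers = {
    "important_problem": "the economy",
    "last_doctor_visit": "March 3",
    "hh_under5": 1, "hh_5to12": 2, "hh_13to17": 0, "hh_adults": 2,
}

party_q = ("Do you have more confidence in the Republican Party or the "
           f"Democratic Party to deal with {answers['important_problem']}?")

visit_q = (f"When you visited the doctor on {answers['last_doctor_visit']}, "
           "were you satisfied or dissatisfied with the amount of time "
           "the doctor spent with you?")

# A calculated fill: total household size from the age-category counts.
hh_total = (answers["hh_under5"] + answers["hh_5to12"]
            + answers["hh_13to17"] + answers["hh_adults"])
hh_q = f"You said {hh_total} people live in your household. Is that correct?"
```

Commercial CASIC packages express the same idea through fill codes in question text; the sketch simply makes the substitution and calculation steps explicit.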

Both skip patterns and text fills are made substantially easier by CASIC. These procedures help researchers avoid asking respondents questions that are irrelevant or unnecessary or requiring respondents to remember answers to previous questions. As a result, respondents are able to put their efforts toward answering the survey questions more carefully and are less likely to become fatigued or bored as the survey progresses.

For more information, see

Couper, M. P., Baker, R. P., Bethlehem, J., Clark, C. Z. F., Martin, J., Nicholls, W. L., & O’Reilly, J. M. (1998). Computer Assisted Survey Information Collection. New York: Wiley.


No. 87. Establishments as the Unit of Analysis.

Individuals typically serve as the unit of analysis in most survey research conducted in this country. Other units of analysis, of course, are possible, and one that is often used is the establishment. Establishments can represent one of many organizational forms. They can be for-profit or not-for-profit organizations, and they can have varying functions, such as business, government, education, or criminal justice, to name a few. They can also vary in size and can have multiple locations. Each of these dimensions can present challenges not usually found in traditional person-level surveys. For example, the optimal informant(s) within an establishment need to be identified prior to collecting data. There may be a distinction between those who have the authority and those who have the ability to report data about the establishment. In addition, it is important to recognize that participating in a survey is not always consistent with organizational priorities. A few tips for executing a successful establishment survey include the following:

  • Avoid times when workloads are likely to be heaviest and survey requests are most likely to be given low priority
  • Be prepared to negotiate with gatekeepers, particularly when trying to reach the leader(s) of an establishment
  • Remember that appeals for survey participation are most likely to be successful when framed as being consistent with organizational goals
  • Be sure to collect information regarding the position of the informants who actually complete the survey, as informants within organizations are often assigned based on convenience, rather than because they are the most qualified to respond

For additional information regarding establishment surveys:

Special Issue on Establishment Surveys. (2014). Journal of Official Statistics, 30(4).


No. 107. Advance notification – telephone surveys.

Sending advance letters to telephone respondents prior to the first contact by a field interviewer (FI) has long been a strategy to increase response rates and reduce nonresponse bias. Advance letters are thought to convey study and sponsor legitimacy, as well as communicate the importance of the survey and give the FI support in their field efforts (Groves and Snowden, 1987). In a comprehensive meta-analysis of advance letters in telephone surveys, de Leeuw and colleagues (2007) found that advance letters have a positive effect on the response rate, both with RDD and with list-based address samples, but the effect was greater when the sample was based on a list of known addresses.

One common perception about advance letters in telephone surveys is that they are cost effective because the cost of the mailings is offset by a reduced number of call or contact attempts needed to finalize a case. In one study, respondents received either no advance mailing, a postcard notification about the study, or an advance letter. The number of call attempts needed to reach a final disposition code was significantly lower with the advance letter than with the postcard or without any advance mailing, and the number of call attempts needed to complete an interview was also lower in the advance letter group (Hembroff et al., 2005).

One challenge in using advance letters in telephone surveys is that addresses are often not available for at least a portion of the telephone numbers in the sample. Another challenge in using advance letters, in any mode, is that unless the respondent is a specifically named person (as in a list frame), there is no way to tailor the advance letter to the selected respondent. In fact, the respondent often is not yet selected at the time the advance mailings are sent, making the use of advance letters in household telephone surveys (e.g., those that use RDD samples) a challenge. Moreover, advance letters can only have the desired effect if the letter is received, opened and read (ideally, by the household member who is eventually sampled to participate in the survey). Letters are of little utility if they are discarded without being opened, or read and discarded by someone in the household other than either the informant who completes the household screener or the selected respondent. To determine whether a respondent has actually received the advance letter, questions can be included in the survey interview to measure whether receiving the advance letter is recalled.

For more information, see:

Groves, R. M., & Snowden, C. (1987). The Effects of Advanced Letters on Response Rates in Linked Telephone Surveys. Proceedings of the American Statistical Association, Survey Research Methods Section, 633–638.

De Leeuw, E., Callegaro, M., Hox, J., Korendijk, E., & Lensvelt-Mulders, G. (2007). The Influence of Advance Letters on Response in Telephone Surveys: A Meta-Analysis. Public Opinion Quarterly, 71(3), 413-443.

Hembroff, L. A., Rusz, D., Rafferty, A., McGee, H., & Ehrlicher, N. (2005). The cost-effectiveness of alternative advance mailings in a telephone survey. Public Opinion Quarterly, 69(2), 232–245.


No. 110. Advance notification – mail and Web.

In SNB #107, we reviewed advance notification in telephone surveys. This bulletin focuses on the use of advance notification in mail and Web surveys.

For years, prevailing wisdom has suggested that pre-notification will improve response rates to mail surveys (Dillman, 2000). Prenotification letters have also been used successfully to increase response rates in Web surveys (Couper, 2008). Advance letters are believed to work because they legitimize the survey request and are particularly helpful if the survey sponsor is not known to the respondent; in Web surveys, they let potential respondents know to look out for an e-mail that will contain the URL for the survey.

The decision on whether to use pre-notification comes down to several considerations: the population; the nature of the data requested; budget; and length of time available for data collection. If your questionnaire requires the respondent to look up organizational or individual records, advance notification is an important first step in the process so that they have time to gather the information. Sometimes, a pre-notification letter is important when the quality of the list sample frame is in question. By sending an advance notification letter with address correction requested from the U.S. Postal Service (USPS), you will be able to weed out bad addresses, and look up new ones, before the more expensive and elaborate mail survey packets are sent. However, it can take 3-4 weeks for the USPS to return all of the undelivered letters, which adds considerable time to the data collection period.

However, in Dillman’s latest volume (2014), the authors suggest that as people are increasingly saturated with survey requests, the effectiveness of pre-notification letters may be in decline. In response, it is increasingly common for advance notification letters to include cash incentives to engage subjects and increase response rates.

For more information, see:

Couper, M. P. (2008). Designing effective web surveys. New York: Cambridge University Press.

Dillman, D. A., Smyth, J. D., & Christian, L. M. (2014). Internet, phone, mail, and mixed-mode surveys: The tailored design method. Hoboken, NJ: John Wiley & Sons.


No. 130. Spoofing versus Local Presence Dialing.

In the field of survey research, spoofing is the act of misrepresenting or misleading potential respondents regarding the identity of the individual or organization requesting their participation in a survey. In telephone surveys, it involves manipulating the information presented on the caller ID devices of persons being contacted, typically by falsifying the incoming phone number to make it appear to be local. Attempting to trick respondents into believing they are being called by a local number is also sometimes referred to as “neighbor spoofing,” and is intended to increase the likelihood that potential respondents will answer the phone call. Some otherwise reputable survey organizations employ this technique as a means of coping with declining response rates, but it is not widely accepted as an ethical practice and may be illegal.

Spoofing is distinct from a closely related practice called local presence dialing. Local presence dialing also allows survey researchers to have a local number show up on caller ID for potential respondents. However, local presence dialing involves purchasing numbers in each area code (or purchasing access to actual numbers in each area code) to use as outgoing call numbers in each area code. In contrast, spoofing involves outright falsifying the caller ID number so that it only appears to come from a number in the same area code. These terms are sometimes incorrectly used interchangeably and both practices are somewhat controversial in survey research today, but spoofing is considered more deceptive than local presence dialing.

For more information:

Federal Communications Commission:


No. 135. Ethical Issues in Using Third Party Sites to do Data Collection.

It is sometimes necessary to use a third party site to conduct all or a portion of the data collection for your research. When doing so, it is important to ensure that these sites have in place the required human subject protections that apply to your specific study. Every study is different and will have specific criteria requested by your Institutional Review Board (IRB). Your IRB will need to see the data collection procedures of the third party site, as well as your study procedures.

For third party sites involving interviewers, your IRB may require the interviewers to undergo additional training in human subjects research (even if such training is not normally required by the site). Also pay attention to the site’s recruiting strategies and introduction/consent language, its screening and eligibility assessment, and its standard callback and refusal procedures. Your IRB will need to review and approve these procedures.

For third party sites involving self-administered Web and mail surveys, pay attention to how contact information for the subjects is collected and used. For instance, is the information used only for the intended purpose (data collection for your specific study), and how is it maintained or destroyed after that use? Is the information collected absolutely necessary? Web survey packages often collect IP addresses automatically, so make sure that this information is not downloaded and maintained with your data. Also make sure that the data security information provided to the subjects is clear and easily accessible.


Survey Mode

No. 35. Survey mode I: In-person interviews.

One of the first and most central decisions to be made when designing a survey is the mode (or modes) in which survey data are to be collected. Each mode has advantages and disadvantages, and the specific choice of mode will depend on a variety of factors including the goals of the research project, the type of data one wants to collect, the planned sampling approach, the information available about potential respondents or households in the sampling frame, and the resources available to conduct the research. One of the four major modes of survey data collection used today is in-person interviewing.

The major advantages of in-person interviews are that they tend to have higher cooperation rates than other modes and that interviewers can use nonverbal cues in their communication with respondents to build rapport and identify problems. In-person interviews also allow researchers to present complex visual stimuli, conduct relatively long interviews, and provide the best mode for collecting non-self-report data (e.g., biophysical measures). They allow interviewers to provide documentation to establish the legitimacy of the survey request, provide clarification to respondents, and address respondent questions or problems; respondents interviewed in-person may work hard to answer survey questions carefully. Most in-person interviews today also use computer-assisted personal interviewing (CAPI), and respondents' answers are entered directly into a laptop or other electronic device, eliminating the need for additional data entry and allowing for complex skip patterns, randomizations, and/or fills to be programmed into the survey instrument. The disadvantages of in-person interviews are that they tend to be expensive and time consuming to conduct. They also have greater potential for interviewer bias and provide less privacy to respondents than do self-administered surveys. It is more difficult to supervise and monitor in-person interviews, and interviewer falsification may therefore be more likely than with telephone interviewing.

See: Holbrook, A. L., Green, M. C., & Krosnick, J. A. (2003). Telephone vs. face-to-face interviewing of national probability samples with long questionnaires: Comparisons of respondent satisficing and social desirability response bias. Public Opinion Quarterly, 67, 79-125.

Lyberg, L. E. & Kasprzyk D. (1991). Data collection methods and measurement error: An overview. In P. P. Biemer, R. M. Groves, L. E. Lyberg, N. A. Mathiowetz, & S. Sudman. (Eds.), Measurement errors in surveys (pp. 237-257). New York: Wiley.



No. 36. Survey mode II: Telephone interviews.

One of the first and most central decisions to be made when designing a survey is the mode (or modes) in which survey data are to be collected. Each mode has advantages and disadvantages and the specific choice of mode will depend on a variety of factors including the goals of the research project, the type of data one wants to collect, the planned sampling approach, the information available about potential respondents or households in the sampling frame, and the resources available to conduct the research. One of the four major modes of survey data collection used today is telephone interviewing.

The major advantage of telephone interviewing is that it can be used to collect data quickly. It is generally less expensive than in-person interviewing but more expensive than either mail or Web surveys (although this is not always the case). Data are entered directly into the computer by interviewers, eliminating the need for additional data entry and allowing for complex skip patterns, randomizations, or fills to be programmed into the survey instrument. Further, it allows for efficient monitoring and supervising of interviewers through the use of a centralized telephone facility and an electronic monitoring system that allows monitors to listen to interviewers as they make calls and talk to respondents. The disadvantages of telephone interviewing are that respondent cooperation rates are lower and the practical length of telephone interviews is generally shorter than in-person interviews. Although interviewers are available to answer questions and address potential problems, they are limited to verbal communication and cannot make use of nonverbal communication to build rapport with respondents or identify respondent problems. It is also more difficult for telephone interviewers to establish the legitimacy of the survey request because they cannot provide written documentation or ID. Telephone interviews have greater potential for interviewer effects on data than self-administered modes and may be less well-suited for asking sensitive questions than are other modes. Future challenges for telephone surveys include declining cooperation rates and the continuing development of sampling and interviewing strategies that incorporate cell phones.

For more information, see

Dillman, D., Smyth, J.,& Christian, L. (2014). Internet, phone, mail, and mixed-mode surveys: The tailored design method. Hoboken, NJ: Wiley.

Groves, R. M., Biemer, P. P., Lyberg, L. E., Massey, J. T., Nicholls, W. L., & Waksberg, J. (2001). Telephone survey methodology. Hoboken, NJ: Wiley.



No. 37. Survey mode III: Mail surveys.

One of the first and most central decisions to be made when designing a survey is the mode (or modes) in which survey data are to be collected. Each mode has advantages and disadvantages and the specific choice of mode will depend on a variety of factors including the goals of the research project, the type of data one wants to collect, the planned sampling approach, the information available about potential respondents or households in the sampling frame, and the resources available to conduct the research. One of the four major modes of survey data collection used today is mailed paper and pencil self-administered questionnaires.

Mailed paper and pencil questionnaires have the advantage of generally being less expensive than modes that require interviewers (although this is not always the case). Mail surveys (and self-administered questionnaires more generally) are preferable to interviewer-administered surveys for asking sensitive questions because they maximize respondent privacy. Mailed questionnaires also allow respondents to complete the survey at their own pace. It is possible to present visual material in mail surveys, but some forms of complex stimuli cannot be presented (e.g., a video clip). Mail surveys generally have lower response rates than either telephone or in-person interviews, although optimally designed mail surveys can in some cases obtain higher response rates. All communication in mail surveys is typically done in writing, so the legitimacy of the survey request and all instructions must be clear and well written. Relying on written communication also assumes, however, that respondents read materials and instructions carefully and the researcher has little or no control over whether or not they do so. Skip patterns must be conveyed in writing to respondents and require them to follow instructions. There is no interviewer present to answer questions or address problems. In addition, data must be entered into a computer once questionnaires are returned and this is an additional cost and a place where error can be introduced. Double entry of all questionnaires (with checks of any inconsistencies) is becoming industry standard for mail surveys. Finally, mail surveys are generally more suited to surveys of known individuals than to household surveys where within-household selection of a respondent is necessary, as it can be difficult to implement a within-household selection process without the presence of an interviewer.

For more information, see

Dillman, D., Smyth, J.,& Christian, L. (2014). Internet, phone, mail, and mixed-mode surveys: The tailored design method. Hoboken, NJ: Wiley.


No. 38. Survey mode IV: Web surveys.

One of the first and most central decisions to be made when designing a survey is the mode (or modes) in which survey data are to be collected. Each mode has advantages and disadvantages, and the specific choice of mode will depend on a variety of factors including the goals of the research project, the type of data one wants to collect, the planned sampling approach, the information available about potential respondents or households in the sampling frame, and the resources available to conduct the research. One of the four major modes of survey data collection used today is Web or Internet surveys.

Web surveys are increasingly popular and have a number of advantages. First, because respondents enter data directly, they avoid the additional cost and potential error introduced by data entry found with mailed questionnaires. Web surveys often can be conducted quite inexpensively and quickly by a single researcher. They also allow for complex skip patterns, randomizations, or fills to be programmed into the survey instrument and provide respondents with privacy when completing the questionnaire, making them good for asking sensitive questions. However, like paper-and-pencil self-administered questionnaires, all materials and instructions must be written. As a result, the potential for errors or misunderstandings is greater than for interviewer-administered surveys. In addition, the researcher has little ability (beyond written instructions) to motivate respondents to answer survey questions carefully and completely. Perhaps the biggest limitation of Web surveys is that they are primarily appropriate when the sample frame includes e-mail addresses. This limits their utility for conducting surveys with probability samples of many populations, including the general population (e.g., adults 18 and older in a particular geographic area). Because of these limitations, nonprobability sampling approaches are quite popular with Web surveys, but these bring their own set of problems.

For more information, see

Couper, M. (2000). Web surveys: A review of issues and approaches. Public Opinion Quarterly, 64, 464-494.

Dillman, D., Smyth, J., & Christian, L. (2014). Internet, phone, mail, and mixed-mode surveys: The tailored design method. Hoboken, NJ: Wiley.


No. 39. Survey mode V: Mixed-mode design.

One of the first and most central decisions to be made when designing a survey is the mode (or modes) in which survey data are to be collected. Each of the four major modes (in-person interviewing, telephone interviewing, mailed paper-and-pencil self-administered questionnaires, and Web or Internet questionnaires) has advantages and disadvantages that have been reviewed in the last four Survey News Bulletins (Nos. 35-38). Sometimes, instead of choosing one mode or another, researchers combine modes into what are called "mixed-mode" designs, often to capitalize on the different advantages of the modes being combined. This can be done in a number of ways. In some surveys, respondents are given a choice of modes in which to respond. There is some evidence, however, that this may actually decrease response rates in some cases (in particular, when mail surveys are offered with a Web option). In other cases, respondents who initially do not respond in one mode may be recontacted in another mode (e.g., nonrespondents to a mail survey may be contacted via telephone); this is generally a more successful strategy for increasing participation. Other mixed-mode surveys use one mode to recruit respondents and another to interview them (e.g., recruiting respondents from a telephone sample to participate in an in-person interview) or use different modes in different waves of a panel survey. Finally, some surveys may use one mode embedded in another. For example, respondents in an in-person interview may be asked to answer a subset of particularly sensitive questions by entering their responses directly into a laptop computer (known as computer-assisted self-interviewing, or CASI) in order to maximize privacy for those questions.

One disadvantage of a mixed-mode design is that it can confound mode (which can influence survey results) with other variables such as willingness to participate (e.g., when follow-up contacts are done in a different mode), respondent characteristics (e.g., when respondents self-select themselves into a particular mode), time (e.g., when waves of a panel survey are conducted in different modes), or question sensitivity (e.g., when a self-administered mode such as CASI is embedded in an in-person interview). As a result, the use of mixed-mode designs should be considered carefully within the goals of each particular study.

For more information see:

Dillman, D., Smyth, J.,& Christian, L. (2014). Internet, phone, mail, and mixed-mode surveys: The Tailored Design Method. Hoboken, NJ: Wiley.


No. 45. Key Informants.

Key informant interviews refer broadly to the collection of information about a particular organization or social problem through in-depth interviews of a select, nonrandom group of experts most knowledgeable about the organization or issue. They often are used as part of program evaluations and needs assessments, though they also can be used to supplement survey findings, particularly in their interpretation. In survey studies, key informant interviews can be valuable during questionnaire development, helping ensure that all question areas and possible response options are understood. Further, relying on this method is appropriate when the focus of the study requires in-depth, qualitative information that cannot be collected from representative survey respondents or archival records. While the selection of key informants is not random, it is important that there be a mix of persons interviewed, reflecting all possible sides of the issue under study. Key informant interviews are most commonly conducted in person and can include closed- and open-ended questions. They often are recorded and transcribed so that qualitative analyses of the interviews can be performed. Key informant interviews have a useful role in the beginning stages of research studies where information gathering and hypothesis building are the goal.

Excerpted From:

Parsons, J. A. (2008). Key informant. In P. J. Lavrakas (Ed.), Encyclopedia of survey research methods: Volume 1 (p. 407). Thousand Oaks, CA: Sage.


No. 48. Focus groups.

Focus groups are in-depth qualitative interviews with a small number of carefully selected people brought together to discuss a host of topics. They are often used to aid with designing survey questions or understand how to get cooperation from a target population. They are concerned with understanding attitudes, experiences, and motivation rather than measuring them. Their interactive nature allows a discussion to address "how" and "why." They are often less costly than surveys. However, the analysis is subjective, and one cannot generalize to the population (more for exploring, not representing).

When planning for a focus group session, make sure you have a clear “focus” for the group, define and locate your population depending on the topic, and consider hiring a professional moderator. In most cases, you will need to submit your discussion guide, consent form, recruiting and screening materials, and protocol to the Institutional Review Board. When designing your discussion guide, remember to start with three clear goals/objectives, ask questions that require reflection (how or why, not yes or no), and use the “funnel approach”: begin with broad questions/topics and move to narrow, specific ones toward the end. Try to place more neutral questions before sensitive ones. Be ready with appropriate follow-ups and probes.

Your recruitment will need to include a screening/scheduling questionnaire with key questions to screen out ineligible people (e.g., those under 18 years old). You can locate members of the population you are targeting through flyers, lists of members of relevant organizations, mailed invitations to people or households that are likely to be eligible, telephone screening, online list-servs, Craigslist, or ads in local newspapers (the latter is particularly effective if you are targeting geographical areas). Set up a unique e-mail address and/or voicemail for potential participants to contact you. E-mail a confirmation letter with directions and contact information soon after scheduling. We recommend scheduling 12 to 15 participants in order to have 8 attend (you can adjust this ratio as you go). Make reminder calls/texts two business days in advance of the group meeting (this gives you time to replace people who can no longer attend).

Avoid putting people who know each other or who are in a chain of command (supervisors and employees) in the same group. As much as possible, avoid “professional respondents” who have participated in many other focus groups and research studies (whom you may attract using Craigslist). When appropriate, consider matching moderators to participants by gender and/or race-ethnicity.

Analysis can include meeting afterwards to discuss observations, a summary report by each observer/moderator, basic demographic questionnaires, documentation of nonverbal behaviors, and transcriptions of the audio files to aid in identifying common themes and patterns (and deviations from those patterns).

For more information, see:

Krueger, R. A., & Casey, M. A. (2009). Focus groups: A practical guide for applied research. Thousand Oaks, CA: Sage.


No. 57. Concurrent Web Options in Mail Surveys.

It is tempting to offer respondents a choice of survey modes upon first contact in the hope of maximizing response rates. For example, a mail survey might also include a URL so the respondent can complete the questionnaire online. However, research consistently indicates that offering more than one option at once depresses response rates. Offering more than one mode may make the respondent’s decision more complex, leading to a delayed response as they mull over which option to select. A 2012 meta-analysis of studies that offered a concurrent Web option in mail surveys provides further evidence of this effect (Medway & Fulton, 2012). Best practice is therefore to offer a second mode only at later or final contact attempts. For an example of how to offer more than one mode to maximize response rates, see Chapter 2 of Dillman, Smyth, & Christian (2014).

Dillman, D. A., Smyth, J. D., & Christian, L. M. (2014). Internet, phone, mail and mixed-mode surveys: The tailored design method (4th ed.). Hoboken, NJ: John Wiley & Sons.

Medway, R. L., & Fulton, J. (2012). When more gets you less: A meta-analysis of the effect of concurrent web options on mail survey response rates. Public Opinion Quarterly, 76(4), 733–746.


No. 127. Push-to-web methodology.

Declining response rates and the rising costs of conducting surveys have led many survey researchers to look for ways to reduce costs. Web surveys are less expensive than many other modes because they do not require interviewers, postage, or printing. However, conducting a Web survey with a representative sample (e.g., a survey of the residents of a particular community) is challenging because lists of e-mail addresses are not available for many populations. One strategy researchers have explored is the use of mixed or multiple modes to increase response rates and reduce costs (see SNB #39). As described in SNB #57, however, this does not always work as one might expect. A substantial amount of research now suggests, for example, that offering more than one response mode as an initial choice, such as offering a link to a Web survey along with a mailed questionnaire, can depress response rates (e.g., Medway & Fulton, 2012).

An alternative to this is what is known as “push-to-web” or “web-push” methodology in which a web survey option is offered initially and a paper-and-pencil questionnaire is not offered until several contact attempts have been made (Dillman, Smyth, & Christian, 2014). This approach has typically been implemented by combining mail and web surveys. An initial sample of addresses/individuals is drawn and sent letters inviting them to participate in a web survey, often with a prepaid incentive. Several reminder postcards and letters are sent to nonrespondents requesting that they complete the survey via the web. After several contact attempts (often as the final contact attempt), respondents who have not yet participated are mailed a paper-and-pencil version of the questionnaire. This strategy allows many of the surveys to be completed via the less expensive mode (i.e., web), but including the final mailed questionnaire also increases response rates above the web-only option. Results are mixed as to whether this methodology results in higher response rates than mailed-questionnaire-only surveys (e.g., Dillman, Smyth, & Christian, 2014; McMaster et al., 2017), but pushing respondents to complete the web survey reduces costs relative to a mailed-questionnaire-only survey.

For more information, see:

Dillman, D. A. (2015). On climbing stairs many steps at a time: The new normal in survey methodology. Slides available at:

Dillman, D. A., Smyth, J. D., & Christian, L. M. (2014). Internet, phone, mail and mixed-mode surveys: The tailored design method (4th ed.). Hoboken, NJ: John Wiley & Sons.

Medway, R. L., & Fulton, J. (2012). When more gets you less: A meta-analysis of the effect of concurrent web options on mail survey response rates. Public Opinion Quarterly, 76(4), 733–746.

McMaster, H. S., LeardMann, C. A., Speigle, S., & Dillman, D. A. (2017). An experimental comparison of web-push vs. paper-only survey procedures for conducting an in-depth health survey of military spouses. BMC Medical Research Methodology, 17, 73.



No. 2. Probability & nonprobability Web survey panels.

Web surveys conducted with panels of survey respondents are increasingly popular. However, these Web panels differ in one important way: some are based on probability sampling and others on nonprobability sampling. In the past three years, the American Association for Public Opinion Research (AAPOR) has appointed two task forces to evaluate nonprobability Web panels and to explore when using probability versus nonprobability samples affects results. (See here, here, and here.)


No. 12. Hard-to-survey populations.

Populations can be hard to survey for a variety of reasons. Tourangeau et al. (2014) distinguish populations that are hard to sample, hard to identify, hard to find or contact, hard to persuade, and hard to interview. Hard-to-sample populations typically consist of groups that make up a small percentage of the overall population. Unless they are physically clustered in some way, it is often prohibitively expensive to screen the general population to find them. Populations that are hard to identify often are stigmatized in some way (e.g., illicit drug users or men who have sex with men) and are thus unlikely to admit membership in the population to an interviewer. Examples of hard-to-find populations include migrant laborers, homeless persons, and college students; the mobility of such populations makes them difficult to locate. Hard-to-persuade populations are those who refuse to participate in surveys when contacted. Research suggests that those who refuse to participate are less civically engaged than those who agree. Finally, hard-to-interview populations are challenging because of physical or cognitive barriers, language barriers, or vulnerability (e.g., prisoners or children).

For further information about various types of hard-to-survey populations, challenges specific to different populations, and strategies employed by researchers to obtain survey data from members of these populations, see Tourangeau, R., Edwards, B., Johnson, T. P., Wolter, K. M., & Bates, N. (Eds.). (2014). Hard-to-survey populations. Cambridge University Press.


No. 18. Standard formulas for calculating survey response rates.

The 7th edition of the Standard Definitions document, published in 2011 by the American Association for Public Opinion Research (AAPOR), contains standardized formulas for the calculation of response rates, cooperation rates, and refusal rates for telephone, in-person, mail, and Internet surveys. Use of these formulas facilitates meaningful comparisons across surveys, and many professional journals now require their use when reporting findings from primary survey data collection efforts. We strongly encourage their use. For more information, see

The American Association for Public Opinion Research. (2011). Standard definitions: Final dispositions of case codes and outcome rates for surveys (7th ed.). AAPOR.
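As an illustration of how the standardized formulas combine final case dispositions, here is a minimal sketch of AAPOR Response Rates 1 and 2; the disposition counts are hypothetical.

```python
def aapor_rr1(I, P, R, NC, O, UH, UO):
    """AAPOR Response Rate 1: complete interviews (I) divided by all
    interviews, non-interviews (refusals R, non-contacts NC, other O),
    and cases of unknown eligibility (UH, UO)."""
    return I / (I + P + R + NC + O + UH + UO)

def aapor_rr2(I, P, R, NC, O, UH, UO):
    """RR2 also counts partial interviews (P) in the numerator."""
    return (I + P) / (I + P + R + NC + O + UH + UO)

# Hypothetical final dispositions for a telephone survey
rr1 = aapor_rr1(I=600, P=50, R=200, NC=100, O=25, UH=20, UO=5)
rr2 = aapor_rr2(I=600, P=50, R=200, NC=100, O=25, UH=20, UO=5)
print(rr1, rr2)  # 0.6 0.65
```

The Standard Definitions document also specifies several other rate variants (e.g., RR3-RR6, which estimate the eligibility of unknown cases), so consult the AAPOR document itself before reporting rates.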


No. 23. Within-household respondent selection for general population mail surveys.

When conducting a mail survey of the general population, how do you select a respondent from the household to complete the survey? Battaglia et al. (2008) tested three respondent selection methods:

  • Any adult in the household
  • Adult with the next birthday
  • All adults in the household

Through follow-up telephone interviews, they obtained detailed information on who completed the questionnaire (and why other household members did not). They found that the next-birthday and all-adults methods show promise as selection methods: their household-level response rates were comparable to the any-adult method. At the respondent level, however, the response rate for the all-adults method was lower.

For more information, see Battaglia, M. P., Link, M. W., Frankel, M. R., Osborn, L., & Mokdad, A. H. (2008). An evaluation of respondent selection methods for household mail surveys. Public Opinion Quarterly, 72(3), 459-469.

No. 28. Cellphones: Dual-frame designs.

The proportion of adults living in cell-phone-only (CPO) households has increased dramatically in recent years, from less than 5 percent in 2003 to almost 40 percent at the end of 2013. Moreover, 65.7% of adults aged 25-29 live in CPO households, while only 13.6% of adults aged 65 and older do. Adults who rent their homes, live in poverty, or are Hispanic are also more likely to live in CPO households. Because so many people live in households without landlines, and because those who have landlines are older, wealthier, and more likely to own their homes, survey researchers can no longer sample exclusively from landline phone numbers to attain a sample that represents the general population. To include the entire adult population in phone studies, practitioners are combining landline and cellular phone numbers into a dual frame. The proportion of interviews that should be completed with respondents on a cellphone vs. a landline depends on the proportion of the population that is CPO, the proportion that uses both cell and landline, and the proportion that is landline only, as well as the cost differential between calling the landline vs. cell frames.

For information about the demographics of cellphone usage, see Blumberg, S. J., & Luke, J. V. (2014, July). Wireless substitution: Early release of estimates from the National Health Interview Survey, July-December 2013. National Center for Health Statistics.

For information about combining cellular and landline phone sample into a dual frame, see Levine, B., & Harter, R. (2015). Optimal allocation of cell-phone and landline respondents in dual-frame surveys. Public Opinion Quarterly, 79, 91-104.


No. 40. Address-based sampling.

In an effort to maximize coverage (and thus generalizability), many survey researchers are using address-based sampling (ABS) frames. ABS frames use the household address as the sampling unit. In urban areas, the coverage of ABS frames is virtually complete. Coverage is not as high in rural areas, but as rural-type addresses are updated to city-style addresses (i.e., addresses with street numbers and names) for the 911 system, ABS coverage in rural areas will continue to improve.

Addresses are sampled from the U.S. Postal Service Delivery Sequence File (DSF), to which sampling vendors have access. ABS samples are ordered by geography, which can be defined at almost any level: states, counties, tracts, blocks, ZIP codes, street boundaries, or even a radius around a single point. Once the geography is determined, a random sample of addresses can be selected.

Samples selected using ABS frames can be used on their own for face-to-face studies or mail studies. Because some addresses in ABS frames can be linked to phone numbers, ABS frames can also be used for multi-mode studies that include telephone interviews.

For further information, see:

Iannacchione, V. G. (2011). Research synthesis: The changing role of address-based sampling in survey research. Public Opinion Quarterly, 75, 556-575.


No. 50. Snowball Sampling.

Snowball sampling, also known as chain referral sampling, is a nonprobability method of survey sample selection that is commonly used to locate rare or difficult-to-find populations. Although there are several variations, this approach involves a minimum of two stages: (a) the identification of a sample of respondents with characteristic x during the initial stage, and (b) the solicitation of referrals to other potentially eligible respondents believed to have characteristic x during subsequent snowball stages. In many applications, this referral process continues (or snowballs) until an acceptable number of eligible respondents has been located. Statistical inferences can be drawn from the first stage of a snowball sample, assuming that probability methods of selection were used. However, samples drawn during the snowball stages, and samples that combine the initial and snowball stages, are not representative and cannot be used to make statistical inferences.

Beyond nonrandom selection procedures, other limitations include correlations between social network size and selection probabilities, reliance on the subjective judgments of informants, and confidentiality concerns.

Key advantages include low cost and the potential time efficiency with which samples can be recruited.

The following resources provide more information about snowball sampling:

Biernacki, P., & Waldorf, D. (1981). Snowball sampling: Problems and techniques of chain referral sampling. Sociological Methods & Research, 10, 141-163.

Sudman, S. (1976). Applied sampling. New York: Academic Press.


No. 51. Respondent-Driven Sampling.

Respondent-driven sampling (RDS) can be used to recruit individuals who belong to hidden, hard-to-reach, stigmatized populations, where the members of the population are known to each other—for example, illegal drug users.

RDS includes several steps. The recruitment of the initial respondents—called seeds—is done by the researcher; however, all subsequent recruitment is done by the selected respondents. Following initial interviews, the seeds are given coupons that they are asked to give to other eligible members of the network. If the second-stage recruits wish to reveal their identity, they can contact the researcher to be interviewed. Researchers also give incentives to respondents when they participate and tell them that they will get an additional incentive if their recruits also participate. If the second-stage respondents complete an interview, they are in turn given coupons and incentives for the recruitment of third-stage respondents. The process continues until the needed number of completed interviews has been attained.

The coupons that are given to respondents are returned to the researcher at the time of the interview; information included on the coupon enables the researcher to trace the links between initial and subsequent respondents.

See the references below for discussions of the modeling and weighting required to attain unbiased estimates from RDS samples. The first two references are papers by Heckathorn, who developed the RDS methodology. The third reference is an edited volume that contains several papers discussing RDS.

For more information, see:

Heckathorn, D. D. (1997). Respondent-driven sampling: A new approach to the study of hidden populations. Social Problems, 44(2), 174-199.

Heckathorn, D.D. (2002). Respondent-driven sampling II: Deriving valid population estimates from chain-referral samples of hidden populations. Social Problems, 49(1), 11-34.

Tourangeau, R., Edwards, B., Johnson, T. P., Wolter, K. M., & Bates, N. (Eds.). (2014). Hard-to-survey populations. Cambridge University Press.


No. 52. Web-Based Respondent-Driven Sampling.

The Internet can be employed to dramatically accelerate the recruitment of hard-to-find populations using Web-based respondent-driven sampling (WebRDS). In recent years, researchers have tested strategies for extending respondent-driven sampling [see also SRL News Bulletin 51] for use on the Internet. The references below provide key information regarding the development and testing of this methodology. Key advantages of WebRDS include the low cost and speed with which online data can be collected and the privacy afforded by online self-administration. Limitations include the requirement that members of the target population have e-mail access, the need to keep recruitment open long enough to compensate for variable e-mail usage, and the challenge of avoiding duplicate responses from individuals using more than one e-mail address.

For more information, see:

Bauermeister, J. A., Zimmerman, M. A., Johns, M. M., Glowacki, P., Stoddard, D., & Votz, E. (2012). Innovative recruitment using online networks: Lessons learned from an online study of alcohol and other drug use utilizing a web-based, respondent-driven sampling (webRDS) strategy. Journal of Studies on Alcohol and Drugs, 73, 834-838.

Strömdahl, S., Lu, X., Bengtsson, L., Liljeros, F., & Thorson, A. (2015). Implementation of web-based respondent driven sampling among men who have sex with men in Sweden. PLOS ONE, 10(10), e0138599.

Wejnert, C., & Heckathorn, D. D. (2008). Web-based network sampling: Efficiency and efficacy of respondent-driven sampling for online research. Sociological Methods & Research, 37(1), 105-134.


No. 53. Network Sampling.

Network sampling—also known as multiplicity sampling—is a probability sampling methodology that can be used to locate members of a rare population. At the outset, a random sample of respondents is selected from the population; those respondents are asked whether anyone in their network (including themselves) has the characteristic that makes them a member of the rare population.

The network must be clearly defined; one's immediate family (parents, siblings, and children) is one example. The initial respondent must be able to identify the individuals in the network.

In addition, the initial respondent must know whether or not network members have the sought-after rare characteristic. For example, the diagnosis of cancer would likely be known within the family network defined above, whereas tax evasion may not be something that even close family members would know.

Because network sizes will differ, the probability of identifying eligible respondents will differ across networks. For this reason, weights must be constructed to adjust for the different selection probabilities.
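The weighting idea can be sketched in the spirit of Sirken's (1970) multiplicity estimator; the function name and all numbers below are hypothetical.

```python
# Hypothetical sketch: estimating the number of people in a population
# with a rare characteristic from a network (multiplicity) sample.
# Assumes a simple random sample of n respondents from a population of
# N, each reporting cases within a clearly defined network; s_j is the
# multiplicity of case j (the number of network members who could have
# reported that case).
def multiplicity_estimate(N, n, reported_multiplicities):
    # Each reported case contributes 1/s_j, so cases reachable through
    # larger networks are not over-counted; scale by N/n for the sample.
    return (N / n) * sum(1 / s for s in reported_multiplicities)

# Three cases reported, reachable through networks of size 2, 4, and 4
est = multiplicity_estimate(N=10_000, n=500, reported_multiplicities=[2, 4, 4])
print(est)  # 20 * (0.5 + 0.25 + 0.25) = 20.0
```

The 1/s_j terms are exactly the adjustment for unequal selection probabilities described above: a case embedded in a large network has more chances of being reported and must be down-weighted accordingly.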

For more on network (multiplicity) sampling, please see:

Blair, E., & Blair, J. (2015). Applied survey sampling. Sage Publications, Inc.

Sirken, M. G. (1970). Household surveys with multiplicity. Journal of the American Statistical Association, 65 (329), 257-266.


No. 67. Census vs. Sample Survey.

A census is the enumeration or administration of a questionnaire to an entire population. A sample survey is the collection of data from a subset of the population of interest. While researchers may be tempted to use a census so all members of the population are included, a well-designed survey based on a representative sample can provide accurate representation of the study population.

A major disadvantage of conducting a census is that it can be very time consuming and expensive. A census is most appropriate when it is important to have information for the entire population, as with the US decennial census that is mandated by law.

A survey of a well-designed probability sample provides results that are representative of the population. It is faster and more cost-effective than a census and allows resources to be spent on follow-up efforts, possibly resulting in higher response rates than a census would achieve. A sample survey is most appropriate when the population size is larger than the number of observations required for the desired statistical power and when the resources needed to conduct a high-quality, rigorous census are not available.


No. 74. What is a Sample Frame?

Once the target population for a survey is defined, one of the first steps to selecting a sample will be to develop a list of the elements that represent that population. This list of elements is known as a sample frame.

For example, if the population of the State of Illinois is the target population, two possible sample frames would be a list of all the addresses of household residences within the State of Illinois or a list of all census blocks or tracts within the state. If the target population is college students within Illinois, an initial sample frame would be a list of all colleges and universities in the state. A sample frame could also be a list of the members of a professional organization.

Because the sampled cases will be selected from the sample frame, developing a quality frame is crucial to ensuring that the sampled cases will have good coverage of the target population. Only cases that are included on the sample frame have any possibility of being selected into the sample. To the extent that eligible cases are excluded from the sampling frame, and/or ineligible cases are included in the sample frame, coverage error may exist.

Over the next few weeks, we will build upon this information and discuss some common sample designs.

For more information on sample frames please see

Kalton, G. (1983). Introduction to survey sampling (Sage University Paper Series on Quantitative Applications in the Social Sciences, No. 35). Beverly Hills, CA: Sage Publications.

Maisel, R., & Persell, C. H. (1996). How sampling works. Thousand Oaks, CA: Pine Forge Press.


No. 75. Probability Sampling and Simple Random Sampling.

In order to generalize survey results to the population from which the sample was drawn, the sample must be a probability sample. Such a sample is, by definition, one in which the elements are randomly selected with a known, nonzero probability of selection.

In order to draw a sample with a known probability of selection, one must start with a sample frame [see SNB #74 for a description of sample frames]. If n cases are randomly selected from a sample frame that has N cases, the probability of selection for those cases is n/N.

The most basic type of probability sample is a Simple Random Sample. This is a sample in which every element has the same probability of being selected and every combination of elements has the same probability of selection.

Elements from a sample frame can be randomly selected using a random number generator that is available within most software programs.
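As a minimal sketch, a simple random sample can be drawn with any standard random number generator; the frame below is hypothetical.

```python
import random

# Minimal sketch: a simple random sample of n cases from a frame of N.
frame = list(range(1, 1001))      # hypothetical frame of N = 1,000 elements
n, N = 100, len(frame)

random.seed(42)                   # seed only for a reproducible draw
sample = random.sample(frame, n)  # each element selected with probability n/N

prob_selection = n / N            # 100/1000 = 0.1 for every sampled case
print(prob_selection)             # 0.1
```

Because every element shares the same selection probability n/N, the resulting sample is self-weighting: each sampled case represents N/n population elements.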

In the coming weeks we will introduce stratified and cluster sampling.

For more information on Probability Sampling, including Simple Random Sampling, see

Blair, E., & Blair, J. (2015). Applied Survey Sampling. Thousand Oaks, CA: Sage Publications.


No. 76. Stratified Sampling.

One strategy often used to ensure that a sample resembles some aspect of the population from which it is drawn is a stratified sampling design.

With stratification, members of the population of interest are assigned to mutually exclusive and exhaustive groups, referred to as strata. Members are then sampled from each stratum independently.

If, for example, we want to select a sample of undergraduates from a university and we want to be sure that the selected sample matches the population of undergraduates vis-à-vis their year in school, we can take a stratified random sample.

In a proportionate stratified sample, the same probability of selection is used for each of the strata.

Sometimes, when the strata are of different sizes and the researcher wants to make comparisons between two or more of the strata, a disproportionate stratified sample design should be used. If one wants to compare the views of freshmen and seniors who live in undergraduate dormitories—and assuming that there are many more freshmen than seniors living in the dorms—disproportionate random sampling can be used. After the sample frame containing all freshmen and seniors living in dormitories is divided into mutually exclusive strata, a simple random sample using a different probability of selection is taken from each of the strata. A higher probability of selection should be used to oversample the smaller stratum—the seniors—and a lower probability of selection should be used to sample the larger stratum—the freshmen. The sampling fractions can be calculated such that an equal number of cases is selected from each stratum; this equal allocation allows comparisons among the strata.
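The freshman/senior example can be sketched as follows; the frame sizes and the allocation of 150 cases per stratum are hypothetical.

```python
import random

# Sketch of a disproportionate stratified sample that draws an equal
# number of cases from unequal strata (hypothetical frame sizes).
random.seed(7)
strata = {
    "freshmen": [f"F{i}" for i in range(1200)],  # larger stratum
    "seniors":  [f"S{i}" for i in range(300)],   # smaller stratum
}
n_per_stratum = 150  # equal allocation supports between-stratum comparison

sample, fractions = {}, {}
for name, frame in strata.items():
    # independent simple random sample within each stratum
    sample[name] = random.sample(frame, n_per_stratum)
    fractions[name] = n_per_stratum / len(frame)  # differs across strata

print(fractions)  # {'freshmen': 0.125, 'seniors': 0.5}
```

Because the sampling fractions differ (seniors are oversampled at 0.5 versus 0.125 for freshmen), analysis of the combined sample requires weights, as the bulletin notes below.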

In order to use a stratified sample, the sample frame must include the information necessary (e.g., year in school, race, or gender) for assigning members to a stratum.

It is important to note that prior to data analysis, weights must be calculated when cases have been selected with different probabilities. Future Survey News Bulletins will cover weight construction.

For more information, see Chapter 5 Stratification and Stratified Random Sampling in

Levy, P. S., & Lemeshow, S. (2008). Sampling of populations: Methods and applications (4th ed.). Hoboken, NJ: John Wiley & Sons.


No. 77. Introduction to Cluster Sampling.

Cluster sampling is a design in which a group of individual enumeration units that are in some way associated with one another is sampled as a whole. The association is generally physical proximity (households on a block, students in a classroom, etc.). Cluster designs are used when a sample frame that lists individual enumeration units is not available, or when such a list is available but drawing a random sample from it would result in prohibitive interviewing costs. An example of the first case is a survey of high school students. One straightforward way to interview students would be to draw a random sample of classes in a school and then administer the questionnaire to all students in each of the selected classrooms. The classroom is the cluster, and the students are associated by virtue of being in the same class.

Another example of a cluster is a city block. If one were conducting a face-to-face household survey in the City of Chicago, for example, it is possible to obtain a complete listing of all addresses in Chicago. However, drawing a random sample from this list, then sending interviewers all over the city to interview households would be time-consuming and inefficient. To reduce travel costs and expenses, a sample of city blocks could first be drawn, then households on those blocks would be interviewed. The block is the cluster and households are associated by geography.

In a simple cluster design, once the clusters are sampled, all units within a selected cluster are subsequently sampled. In a multistage design, there are two or more stages of selection. In the first stage, the clusters are sampled. In subsequent stages, smaller clusters or individual sample units are sampled from the original clusters. For example, one might draw a sample of schools, then classrooms within the school, then students within the classroom.

Clusters can be selected using simple random sampling, in which case each cluster's probability of selection is the number of clusters sampled divided by the total number of clusters in the sampling frame. Alternatively, clusters can be selected using probability proportionate to size (PPS) sampling, where larger clusters have a greater probability of being selected than smaller clusters. If PPS sampling is used, the measure of size is determined by information in the sampling frame, such as the number of students in the colleges or universities or the number of housing units on a block. Under PPS, a cluster's probability of being sampled in a single draw is the number of elements in the cluster divided by the total number of elements across all eligible clusters.
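A small sketch of these two selection schemes, using hypothetical block sizes; with c draws, a cluster's PPS selection probability is approximately c times its single-draw probability (valid while that product stays below 1).

```python
# Sketch of cluster selection probabilities (hypothetical block sizes).
block_sizes = [40, 10, 30, 20]   # housing units per block (the size measure)
C, c = len(block_sizes), 2       # clusters on the frame, clusters to sample
total_units = sum(block_sizes)   # 100 housing units in all

# Simple random sampling of clusters: every block has probability c/C.
srs_prob = c / C                 # 0.5 for every block

# PPS: probability proportional to block size, c * size / total.
pps_probs = [c * m / total_units for m in block_sizes]

print(srs_prob, pps_probs)  # 0.5 [0.8, 0.2, 0.6, 0.4]
```

Note how PPS gives the 40-unit block four times the selection probability of the 10-unit block, which is what keeps overall element-level probabilities roughly equal when a fixed number of units is then sampled within each selected cluster.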

For more information, see:

Levy, P. S., & Lemeshow, S. (2013). Sampling of populations: Methods and applications. John Wiley & Sons.

Lohr, S. L. (2009). Sampling: Design and analysis. Cengage Learning.


No. 78. Complex Sample Designs.

A complex sample design is one in which several different components of sampling, such as stratification, clustering, or unequal probabilities of selection, are used in the same survey. While not technically the same as a multistage design (see Bulletin #77 on cluster sampling), complex survey designs generally incorporate multiple stages of selection. Many large, federally funded surveys, such as the National Health Interview Survey (NHIS), the National Health and Nutrition Examination Survey (NHANES), and the American Community Survey (ACS), use multistage, stratified, cluster designs with oversampling of populations of particular interest.

Primary sampling units (PSUs), such as Census blocks, are often stratified by some demographic characteristic, such as population size or race. Then the PSUs are sampled from the strata, housing units are sampled from the PSUs, and individuals are sampled from the households. In some cases, households with particular characteristics (e.g., dependent children in the household) are oversampled. Because clusters and individuals can be sampled at different rates, the computation of sample weights (see upcoming bulletin for more detail) is necessary to adjust for unequal probabilities of selection.

In addition, complex samples often produce larger variances than simple random samples of the same size. As a result, statistical procedures that assume a simple random sample will underestimate variances and overstate statistical significance. Thus, it is necessary to analyze data from complex surveys with software that can take the sample design into account. Many programs (e.g., Stata, SAS, and SPSS) have this functionality. If conducting secondary data analysis on a dataset collected with a complex sample design, it is critical to read the documentation carefully.
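As a rough illustration of this variance inflation (not from the bulletin itself), Kish's approximate design effect for equal-sized clusters, deff = 1 + (m - 1)ρ, shows how quickly clustering erodes the effective sample size; the numbers below are hypothetical.

```python
# Illustrative sketch: Kish's approximate design effect for a cluster
# sample with equal cluster sizes,
#   deff = 1 + (m - 1) * rho,
# where m is the cluster size and rho the intraclass correlation
# (the similarity of respondents within the same cluster).
def design_effect(m, rho):
    return 1 + (m - 1) * rho

deff = design_effect(m=25, rho=0.05)   # modest within-cluster similarity
effective_n = 1000 / deff              # nominal sample size of 1,000
print(round(deff, 2), round(effective_n))  # 2.2 455
```

Even a small intraclass correlation of .05 more than doubles the variance here, which is why software that ignores the design would report confidence intervals that are far too narrow.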

For more information, see:

Levy, P. S., & Lemeshow, S. (2013). Sampling of populations: Methods and applications. John Wiley & Sons.

Lohr, S. L. (2009). Sampling: Design and analysis. Cengage Learning.


No. 79. Sample Weights (part 1).

In the collection of survey data, a user may need to calculate sample weights. In its simplest form, a weight is the inverse of the probability of selection and indicates the number of population units each sample unit represents. For example, in a simple random sample in which 100 units are drawn from a population of 1,000, the probability of selection of each unit is 100/1,000, or 1/10. The base sample weight for each unit is therefore 10, and the weights sum to the population size. A sample in which each unit has the same probability of selection, and therefore the same weight, is referred to as a self-weighting sample.

However, few surveys actually incorporate simple random samples. Many use disproportionate stratified samples, cluster samples, or complex, multistage designs. In these sample types, probabilities of selection are not equal, and sample weights are used to correct for the oversampling of some units and the undersampling of others. For example, if a population of 1,000 consisted of 800 men and 200 women, and we drew a disproportionate stratified sample of 50 men and 50 women, the probability of selection of women would be 50/200 (.25), while the probability of selection of men would be 50/800 (.0625)—women would have 4 times the chance of being sampled as men. If sample weights were not included in the analysis, the results would overrepresent women.
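The two examples above can be verified with a few lines of arithmetic; in both cases the weights sum to the population size.

```python
# Simple random sample: 100 units drawn from a population of 1,000.
p = 100 / 1000
base_weight = 1 / p               # 10.0; each case represents 10 people
print(base_weight * 100)          # weights sum to the population: 1000.0

# Disproportionate stratified sample: 50 of 200 women, 50 of 800 men.
w_women = 1 / (50 / 200)          # 4.0
w_men = 1 / (50 / 800)            # 16.0
print(w_women * 50 + w_men * 50)  # 200 + 800 = 1000.0
```

The second print shows why the weights repair the disproportionate design: weighted women account for 200 of the 1,000 population units and weighted men for 800, matching the true population split.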

Additional types of weights will be presented in Bulletin #80.

For more information, see:

Lohr, S. L. (2009). Sampling: Design and analysis. Cengage Learning.


No. 80. Sample Weights (part 2).

In Survey News Bulletin #79, we introduced sample weights in their most basic form. This bulletin expands on that by covering other types of weights. A base weight is the inverse of the probability of selection and is equal to the number of population units each sample unit represents. In complex sample designs, there are multiple stages of selection, and therefore multiple probabilities of selection. In these designs, the probability of selection of the final sample unit is the product of the probability of selection at each stage. For example, a household survey utilizing a multistage design could include the probability of selection of the Primary Sampling Unit (PSU), which may be a census tract or block group; the probability of selection of the household; and the probability of selection of the respondent in the household. The base weight in this example would have three components.
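A sketch of the three-component base weight described above, using hypothetical stage probabilities (each stage weight is the inverse of that stage's selection probability).

```python
# Hypothetical three-stage household design:
#   PSU (e.g., a census tract) sampled with probability 1/50,
#   household sampled within the PSU with probability 1/40,
#   one adult selected from a two-adult household (probability 1/2).
w_psu, w_household, w_respondent = 50, 40, 2  # inverses of stage probabilities

# The overall selection probability is the product across stages
# (1/50 * 1/40 * 1/2 = 1/4000), so the base weight is the product of
# the stage weights.
base_weight = w_psu * w_household * w_respondent
print(base_weight)  # 4000
```

In other words, this hypothetical respondent stands in for 4,000 population members, and each of the three components would typically vary across cases.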

Adjusting for different probabilities of selection is often insufficient, as most surveys include some level of nonresponse, and nonresponse rates can vary across sample strata. Nonresponse weights are designed to inflate the weights of survey respondents to compensate for nonrespondents with similar characteristics. For example, assume we have a sample that is stratified by gender and that women have a response rate of 50% compared to 40% for men. Even after adjusting for probabilities of selection, men would be underrepresented, because they participated at a lower rate. Nonresponse weights are the inverse of the response rate (in this case, 2 for women and 2.5 for men) and indicate how many sampled units each responding unit represents.

Post-stratification weights are used to bring the sample proportions in demographic subgroups into agreement with the population proportions in those subgroups. After selection and nonresponse weights are calculated, the sample distribution of demographic characteristics may still differ from the population distribution. For example, the weighted sample may be 55% female and 45% male, while the population is 52% female and 48% male. A post-stratification weight is the ratio of the population percentage to the sample percentage; in this example, the post-stratification weight for women would be 52/55, or approximately .95. In most surveys that use them, post-stratification weights are computed for multiple demographic variables. The use of post-stratification weights requires an auxiliary dataset that allows the user to compare the sample characteristics to the population from which the sample was drawn.
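Chaining the numbers from this bulletin and from SNB #79, a final weight is the product of the base, nonresponse, and post-stratification components; combining these particular examples is illustrative only.

```python
# Base weights from SNB #79's stratified example (50 of 200 women,
# 50 of 800 men).
base = {"women": 4.0, "men": 16.0}

# Nonresponse adjustment: inverse of each stratum's response rate.
resp_rate = {"women": 0.50, "men": 0.40}
nonresponse = {g: 1 / r for g, r in resp_rate.items()}   # 2.0 and 2.5

# Post-stratification adjustment: population share / weighted sample share.
sample_pct = {"women": 55, "men": 45}
pop_pct = {"women": 52, "men": 48}
poststrat = {g: pop_pct[g] / sample_pct[g] for g in pop_pct}

# Final weight = product of the three components.
final = {g: base[g] * nonresponse[g] * poststrat[g] for g in base}
print({g: round(w, 2) for g, w in final.items()})
# {'women': 7.56, 'men': 42.67}
```

The product structure means an error in any one component (say, a miscalculated response rate) propagates directly into the final weight, which is one reason documentation for complex surveys must be read carefully.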

Many data analysis programs (SAS, Stata, SPSS) have weight statements that allow users to specify the variable containing the weight they would like to use. However, some surveys (e.g., many federally funded surveys, such as NHANES and the CPS) have complicated sample designs and therefore complicated weighting schemes, with different weights being used for different subpopulations. When performing secondary data analysis on complex surveys such as these, it is critical to read the documentation and use the appropriate weights.

Groves, R. M., Fowler, F. J., Jr., Couper, M. P., Lepkowski, J. M., Singer, E., & Tourangeau, R. (2011). Survey methodology (Vol. 561). John Wiley & Sons.


No. 108. Amazon’s Mechanical Turk (MTurk) and Respondent Characteristics.

Amazon’s Mechanical Turk (MTurk) platform has become a popular place to collect survey data via crowdsourcing; in 2015, MTurk had an estimated 1,278 active requesters per day. How large is MTurk’s pool of participants? Although Amazon claims that MTurk gives researchers access to more than 500,000 workers from 190 countries, some are skeptical of this claim. Researchers have therefore worked to independently estimate the size and characteristics of the MTurk worker pool. A number of findings indicate that the number of participants accessible on MTurk is far fewer than Amazon's claim of 500,000. For example, a study that employed a statistical method used in ecology to estimate species populations suggests that there are around 7,300 respondents on MTurk at any given time (Stewart et al., 2015). Such findings suggest that the number of participants available on MTurk may be small relative to the number of requesters.

Many researchers are also concerned about the characteristics of MTurk members and the extent to which they reflect the population as a whole. As a result, the characteristics of MTurk samples have also been studied extensively. Although MTurk samples are usually more diverse than other commonly used convenience samples, such as samples of college students (Berinsky, Huber, & Lenz, 2012; Buhrmester, Kwang, & Gosling, 2011; Paolacci, Chandler, & Ipeirotis, 2010), they are not and do not claim to be representative of the U.S. population.

One survey of MTurk workers in the U.S. (n = 2,144) compared their results with those of the population-based American National Election Studies (ANES) 2012 Time Series Study (Levay, Freese, & Druckman, 2016). The MTurk sample proved to be younger, more likely to be unmarried, lower in income, and much less racially and ethnically diverse than the ANES sample. MTurk respondents also differed from 2012 ANES respondents on other variables (e.g., political preferences), in part because of those demographic differences.

According to one 2015 blog post, about 80% of MTurk workers are from the US and 20% are from India. However, this proportion depends on the time of day: around 8-10am UTC, workers from India (~50%) outnumber those from the US, but their share falls to about 5% at 8-10pm UTC. Gender is balanced, and about half of the workers are around 30 years old. The median household income for U.S. MTurk workers is around $50K per year, on par with the median U.S. household income.

Further Resources:

The mturk-tracker tool shares live data about the demographics of MTurk workers, based on a short 5-question survey that workers answer.

For more information, see:

Berinsky, A. J., Huber, G. A., & Lenz, G. S. (2012). Evaluating online labor markets for experimental research: Amazon.com's Mechanical Turk. Political Analysis, 20(3), 351-368.

Buhrmester, M., Kwang, T., & Gosling, S. D. (2011). Amazon's Mechanical Turk: A new source of inexpensive, yet high-quality, data? Perspectives on Psychological Science, 6(1), 3-5.

Levay, K. E., Freese, J., & Druckman, J. N. (2016). The demographic and political composition of Mechanical Turk samples. Sage Open, 6(1), 2158244016636433.

Paolacci, G., Chandler, J., & Ipeirotis, P. G. (2010). Running experiments on Amazon Mechanical Turk. Judgment and Decision Making, 5, 411–419.


No. 109. Surveys of MTurk Participants and Data Quality.

Last week’s SNB addressed the demographic characteristics of samples from surveys conducted on MTurk. This week’s bulletin addresses the quality of data obtained from such surveys. Recent research suggests that MTurk participants can be a good source of high-quality data. Data obtained from MTurk participants are at least as reliable as those obtained through traditional methods. For example, psychometric properties (e.g., test-retest reliabilities in a set of individual difference measures administered 3 weeks apart via MTurk) compared favorably to those of traditional methods (Buhrmester, Kwang, & Gosling, 2011).

Another recent study compared the data quality obtained from MTurk participants, student samples, and professional panel samples (Kees, Berry, Burton, & Sheehan, 2017). To assess data quality, the authors examined a variety of outcomes, including reliabilities for dependent measures and a psychological construct used as a measured moderator, manipulation checks, response involvement, several attention checks, measures of prior research participation, general computer knowledge, and dependent variables for the advertising experiment. The manipulation check results showed that the manipulation was effective across all five samples. The αs for the MTurk sample were equal to or higher than those of all other samples for three of the four multi-item measures. Manipulation of a latent construct among MTurk participants was as effective as with student subjects. Furthermore, MTurk participants showed greater involvement and reported less multitasking than the other samples. In addition, MTurk workers wrote more text when responding to the two open-ended questions, exceeding both of the panel samples.

Methods are also available to improve data quality on MTurk. The most commonly discussed is the use of attention check questions (ACQs) to screen out inattentive respondents. This method can reduce sample size and lead to unequal experimental cell sizes and selection bias (Oppenheimer et al., 2009; see also SNB #106 for a discussion of some of the potential drawbacks of attention checks). Alternatively, one can restrict participation to MTurk workers with high reputation ratings (e.g., above 95% approval ratings). Peer and colleagues (2014) compared these two methods of ensuring data quality on MTurk. They found that high-reputation workers rarely failed ACQs and provided higher quality data than low-reputation workers; ACQs improved data quality only for low-reputation workers, and only in some cases. Sampling high-reputation workers to ensure high data quality without using ACQs avoids having to exclude participants after data collection. Also, ACQs may cause reactance and disrupt the natural flow of a study.

These findings together suggest that the quality of data obtained from MTurk participants can in many cases be comparable to that of other participant groups. To ensure high data quality, researchers may wish to restrict participation in their research to MTurk workers with high reputation scores rather than employing attention check questions in their surveys on MTurk.

For more information, see:

Buhrmester, M., Kwang, T., & Gosling, S. D. (2011). Amazon's Mechanical Turk: A new source of inexpensive, yet high-quality, data?. Perspectives on Psychological Science, 6(1), 3-5.

Kees, J., Berry, C., Burton, S., & Sheehan, K. (2017). An Analysis of Data Quality: Professional Panels, Student Subject Pools, and Amazon's Mechanical Turk. Journal of Advertising, 46(1), 141-155.

Peer, E., Vosgerau, J., & Acquisti, A. (2014). Reputation as a sufficient condition for data quality on Amazon Mechanical Turk. Behavior Research Methods, 46(4), 1023-1031.


No. 111. Using a Screener to Identify Eligible Respondents for a Study.

In some studies, researchers are interested in recruiting only respondents with particular characteristics. If respondents need to meet certain criteria in order to be eligible for a study, a “screener” can be used to identify those who qualify before the start of the substantive interview. For instance, the research may be targeted at respondents in only a certain age range, gender, or racial-ethnic group. You may be looking for a specific type of business, such as one that is family-owned or of a certain size. Or you may want respondents who have had certain experiences, such as crime victimization.

Screeners can be used with both nonprobability sampling approaches and probability sampling approaches. For example, one could screen potential respondents who call in response to fliers or other advertisements. Screening of a probability sample can also be done. For example, one could screen respondents from an initial general population RDD or ABS sample to find women between the ages of 18 and 25. An advantage of screening a probability sample is that it results in a representative sample of the population of interest. A limitation, however, is that it can be quite expensive if only a small portion of the population is likely to be eligible because it becomes necessary to contact and screen an unrealistically large initial sample.
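Why screening a rare population gets expensive can be seen with a little arithmetic. The sketch below works backward from a target number of completed interviews; all of the rates (and the function name) are illustrative assumptions, not figures from any particular study:

```python
# Back-of-the-envelope screening cost: how large must the initial sample
# be to reach a target number of completed interviews?
def required_screening_sample(target_completes, eligibility_rate,
                              screener_response_rate, interview_rate):
    """Initial sample size needed so that (screener responses x
    eligibility x interview completion) yields the target completes."""
    yield_per_sampled_unit = (screener_response_rate
                              * eligibility_rate
                              * interview_rate)
    return round(target_completes / yield_per_sampled_unit)

# 400 completes, 5% of the population eligible, 30% screener response,
# 80% of identified eligibles completing the interview:
print(required_screening_sample(400, 0.05, 0.30, 0.80))  # -> 33333
```

With only 5% of the population eligible, reaching 400 completes requires contacting and screening tens of thousands of sampled units.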

Screeners are typically brief, consisting of a few questions to assess eligibility, but not always. For instance, a researcher might wish to administer a lengthier scale that scores the potential respondent on whether they are experiencing current trauma, to make sure they are stable enough to participate in the interview.

In general, you want to avoid giving away the eligibility criteria to the extent possible. Respondents who want to avoid participation may take that easy way out, or respondents who want to qualify (e.g., for studies with a sizeable incentive) may be motivated to report a slightly different age, for instance. If respondents are ineligible, explain why (“Thank you for your time! We are only interviewing adults between 40 and 60 years old for this study. We appreciate your interest.”) You may consider offering information or resources to ineligible respondents if the recruiting and screening process was especially sensitive or time-consuming.

Keep in mind that the Institutional Review Board (IRB) considers respondents who are screened and found ineligible to also be research subjects. When submitting an IRB protocol that includes a screener, pay particular attention to how the screener data will be handled for those who are ineligible and don’t proceed with the interview.

Screeners can be administered in different survey modes. For Web, telephone, and in-person surveys they are often administered immediately before the interview and programmed so that respondents who do not qualify go directly to an ineligible script. For mail surveys, you can send a paper screener with instructions on how to fill it out and return it. The paper screener can then be data entered and eligible respondents will then make up your study sample.

If your survey is not self-administered, think about whether or not your screening questions need to be mandatory. If a respondent in a Web survey leaves a crucial screening question unanswered, you will not know if they are truly eligible. If you do not wish to make any questions mandatory, you can add a prompt telling respondents why it is important that they answer this particular question.

For more information, see:

Tourangeau, R., Kreuter, F., & Eckman, S. (2012). Motivated underreporting in screening interviews. Public Opinion Quarterly, 76(3), 453–469.


Data Analysis

No. 19. Missing Completely at Random (MCAR).

Missing completely at random is a term used to describe missing responses to items in a survey. Data are missing completely at random (MCAR) when missingness is independent of any of the variables in the model being estimated. For example, if missing data on an income question are unrelated to the respondent's actual income, the data are MCAR. Data can also be missing at random (MAR). Data are MAR when the probability of a variable being observed is independent of the true value of that variable, controlling for one or more other variables. For example, if missing income is unrelated to income within levels of education, then the data are MAR. When the likelihood of being observed depends on the variable itself, even when controlling for other factors, the data are not missing at random (NMAR). The strategies that can be used to address missing data in analysis depend on whether the data are MCAR, MAR, or NMAR. For further information, see

McKnight, P. E., McKnight, K. M., Sidani, S., Figueredo, A. J. (2007). Missing data: A gentle introduction. New York: Guilford Publications.
Little, R. J. A., & Rubin, D. B. (1987). Statistical analysis with missing data. New York: Wiley.
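The three missing-data mechanisms described above can be made concrete with a small simulation; the income-and-education model below is invented purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
education = rng.integers(0, 3, n)          # 0 = low, 1 = mid, 2 = high
income = 30_000 + 15_000 * education + rng.normal(0, 5_000, n)

# MCAR: missingness is unrelated to anything in the model.
miss_mcar = rng.random(n) < 0.2
# MAR: missingness depends on education (observed), not on income itself
# once education is controlled for.
miss_mar = rng.random(n) < 0.1 + 0.1 * education
# NMAR: missingness depends on the income value itself.
miss_nmar = rng.random(n) < (income > 50_000) * 0.4 + 0.05

for label, miss in [("MCAR", miss_mcar), ("MAR", miss_mar), ("NMAR", miss_nmar)]:
    print(label, round(income.mean()), "->", round(income[~miss].mean()))
```

Under MCAR, the mean of the observed cases matches the full-sample mean up to noise; under MAR and NMAR, simply dropping incomplete cases biases the estimate downward because high earners are more likely to be missing.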


No. 25. Paradata.

Originally, paradata were defined as process data that are a byproduct of data collection efforts, such as the number of call attempts (Couper, 1998, 2005). However, the definition has broadened to include any data collected or observed by interviewers that are not part of the questionnaire. This may include interviewer observations of neighborhood conditions or respondent characteristics, and specific information about contact attempts, such as disposition or time of contact. Olson (2013) describes five categories of paradata: interviewer observations of (1) the sampled unit's neighborhood, (2) the sampled housing unit, and (3) persons in the sampled housing unit; the last two categories are (4) call record information and (5) interviewers' observations about their interactions with respondents. With the continuing decline in response rates, paradata have become increasingly important in the analysis and adjustment of nonresponse bias. A complete presentation of the types and uses of paradata can be found in Kreuter (2013).
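As a minimal sketch of how call-record paradata get used, the hypothetical records below are summarized into contact rates by time of day, the kind of tabulation that might inform the scheduling of later call attempts:

```python
from collections import Counter

# Hypothetical call-record paradata: (case_id, hour_of_attempt, disposition)
call_records = [
    (1, 10, "no answer"), (1, 18, "contact"),
    (2, 11, "no answer"), (2, 19, "no answer"), (2, 20, "refusal"),
    (3, 18, "contact"), (4, 10, "contact"), (5, 19, "contact"),
]

# Contact rate by time slot: any human contact (including a refusal)
# counts as reaching someone at the household.
attempts, contacts = Counter(), Counter()
for _, hour, disposition in call_records:
    slot = "evening" if hour >= 17 else "daytime"
    attempts[slot] += 1
    if disposition in ("contact", "refusal"):
        contacts[slot] += 1

for slot in attempts:
    print(slot, contacts[slot] / attempts[slot])
```

In this toy example, evening attempts reach someone far more often than daytime attempts, which would argue for concentrating later calls in the evening.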

Couper, M. P. (1998). Measuring survey quality in a CASIC environment. In Proceedings of the Survey Research Methods Section (pp. 41-49). Alexandria, VA: American Statistical Association.
Couper, M. P. (2005). Technology trends in survey data collection. Social Science Computer Review, 23(4), 486-501.
Kreuter, F. (Ed.). (2013). Improving surveys with paradata: Analytic uses of process information. John Wiley & Sons.
Olson, K. (2013). Paradata for nonresponse adjustment. The Annals of the American Academy of Political and Social Science, 645, 142-170.


No. 27. Nonresponse bias.

[Figure: diagram illustrating nonresponse bias]

No. 33. Surveys and "Big data."

Big data is "data that is so large in context that handling the data becomes a problem in and of itself. Data can be hard to handle due its size (volume) and/or the speed of [sic] which it's generated (velocity) and/or the format in which it is generated, like documents of text or pictures (variety)" (AAPOR, 2015). In recent years, there has been a great deal of interest in Big Data as a potential alternative or complement to survey data. Big Data can be used to analyze economic or social systems at a macro level. Examples of Big Data are tweets posted on Twitter (or other social media messages) or the content of online searches (such as the "Google flu index"). Big Data are often secondary data that are "found" or "organic" rather than "made" or "design based" and researchers often cannot control or affect the specific format in which data is generated. Big Data represent a huge opportunity, but there are many challenges in using Big Data, including establishing the validity and reliability of measures based on Big Data, developing new analysis approaches to deal with the complexity of it, and establishing best practices and ethical guidelines for analyzing and reporting analysis of Big Data. Earlier this year, the American Association for Public Opinion Research (AAPOR) released a Report on Big Data (see below).

For more information see

AAPOR Big Data Task Force. (2015, February). AAPOR report on big data. AAPOR.


No. 58. p-Value.

p-values are commonly employed when analyzing survey data. A p-value is the probability of obtaining an observed result, or a more extreme result, under the assumption that the null hypothesis is true. For example, if a researcher wanted to test the efficacy of a drug against a gold standard, the null hypothesis would be that there is no difference in the efficacy of the two drugs. If the study showed that the new drug reduced symptoms to a greater degree than the gold standard and that result was statistically significant with a p-value of .05, it means that the probability of getting that result or a more extreme result (i.e., greater symptom reduction than what was observed) given the null hypothesis (i.e., no difference) is 5%. The p-value is routinely misunderstood and misused, so much so that the American Statistical Association recently issued a statement on statistical significance and p-values. p-values do not measure the probability that a hypothesis is true, or the probability that the data were produced by random chance alone. p-values provide a measure of statistical significance and provide no information about the substantive significance of a finding. A trivial outcome can be statistically significant if the sample size is large enough. A p-value is only one piece of information that can speak to the value of a scientific finding and, in and of itself, should not be considered evidence of the truth of a model or hypothesis.
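The point that a trivially small effect becomes statistically significant with a large enough sample can be demonstrated with a quick simulation, here using a simple normal-approximation two-sample test as a stand-in for whatever test a researcher would actually run:

```python
import math
import numpy as np

rng = np.random.default_rng(1)

def two_sample_p(x, y):
    """Two-sided p-value (normal approximation) for a difference in means."""
    se = math.sqrt(x.var(ddof=1) / len(x) + y.var(ddof=1) / len(y))
    z = (x.mean() - y.mean()) / se
    return math.erfc(abs(z) / math.sqrt(2))  # = 2 * P(Z > |z|)

# A substantively trivial true difference of 0.02 standard deviations:
for n in (100, 1_000_000):
    x = rng.normal(0.02, 1, n)
    y = rng.normal(0.00, 1, n)
    print(n, round(two_sample_p(x, y), 4))
```

At n = 100 a difference this small will rarely reach significance; at n = 1,000,000 the very same trivial difference virtually always produces a p-value near zero.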

The use of null hypothesis significance testing, which drives the use of p-values, has received some criticism (e.g., Gliner, Leech, & Morgan, 2002), but there are few viable alternatives to this statistical approach to testing whether a difference or relationship is meaningful. One criticism is that reliance on p-values biases the research process so that the process is designed to obtain significant results (see Greenwald, 1975). In addition to these broader potential biases in the research process, individual researchers may engage in what has come to be known as “p-hacking” – using one or more strategies specifically designed to obtain significant statistical tests (typically p-values less than .05). These strategies can include using some criteria to exclude participants, transforming the data, including covariates in models, analyzing and reporting results from only a subset of conditions in an experimental study, reporting results only from measures that show statistically significant results (while not reporting on results with measures that do not show statistically significant results), and stopping data collection once statistical tests are significant. While some of the strategies can be used for legitimate reasons (e.g., transforming variables), using one or more of these strategies specifically to obtain significant results is frowned upon and considered to be a form of statistical cheating.

For additional information, see:

Wasserstein, R. L., & Lazar, N. A. (2016). The ASA's statement on p-values: Context, process, and purpose. The American Statistician, 70(2), 129–133.

Cumming, G. (2013). Understanding the new statistics: Effect sizes, confidence intervals, and meta-analysis. Routledge.

Gliner, J. A., Leech, N. L., & Morgan, G. A. (2002). Problems with null hypothesis significance testing (NHST): What do the textbooks say? Journal of Experimental Education, 71(1), 83-92.

Greenwald, A. G. (1975). Consequences of prejudice against the null hypothesis. Psychological Bulletin, 82(1), 1-20.

Nuzzo, R. (2014). Statistical errors. Nature, 506(7487), 150-152.


No. 64. Analysis of Complex Samples.

Survey respondents are often sampled using complex sample designs, such as stratification, clustering, or a combination of the two. Analysis of survey data that ignores this complexity can result in incorrectly calculated standard errors and therefore overstated statistical significance. To avoid this, it is necessary to use a statistical analysis package that can incorporate the sample design into the analysis and to use the correct procedures within that package. Statistical programs that can adjust for complex designs include Stata, SAS, and SPSS.

Statistical procedures that can incorporate complex sample designs include, but are not limited to:

  • Power analysis
  • Descriptive statistics
  • Bivariate analysis of categorical data
  • Linear, logistic, multinomial logit, and other forms of regression
  • Structural equations modeling
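To sketch what such an adjustment involves, the code below computes a weighted mean and a design-based standard error using the standard ultimate-cluster (with-replacement) variance formula. This is a minimal illustration of the idea, not a substitute for the dedicated survey procedures in the statistical packages named above:

```python
import numpy as np

def design_weighted_mean(y, w, strata, clusters):
    """Weighted mean and design-based SE for a stratified, clustered
    sample, via the ultimate-cluster (with-replacement) formula."""
    y, w = np.asarray(y, float), np.asarray(w, float)
    strata, clusters = np.asarray(strata), np.asarray(clusters)
    total_w = w.sum()
    mean = (w * y).sum() / total_w
    z = w * (y - mean) / total_w           # linearized scores for the mean
    var = 0.0
    for h in np.unique(strata):
        in_h = strata == h
        # Aggregate scores to the primary sampling unit (PSU) level
        psu_totals = np.array([z[in_h & (clusters == c)].sum()
                               for c in np.unique(clusters[in_h])])
        n_h = len(psu_totals)
        if n_h > 1:                        # single-PSU strata contribute 0 here
            var += n_h / (n_h - 1) * ((psu_totals - psu_totals.mean())**2).sum()
    return mean, np.sqrt(var)

# Toy data: two strata with two PSUs (clusters) each
mean, se = design_weighted_mean(
    y=[3.1, 2.4, 4.0, 3.6, 2.9, 3.3],
    w=[1.0, 1.2, 0.8, 1.1, 1.0, 0.9],
    strata=[0, 0, 0, 1, 1, 1],
    clusters=[0, 0, 1, 2, 2, 3])
print(round(mean, 3), round(se, 3))
```

Compared with a naive standard error that treats the observations as a simple random sample, the design-based standard error is typically larger when responses within clusters are correlated.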

The Survey Research Laboratory (SRL) at the University of Illinois has extensive experience in both designing complex surveys as well as analyzing data collected from such surveys. Clients who have a need to analyze complex survey data can benefit from SRL’s expertise. If you would like assistance conducting analysis with complex survey data, please contact Linda Owens at

For more information, see:
Lohr, S. (2009). Sampling: design and analysis. Nelson Education.
Heeringa, S. G., West, B. T., & Berglund, P. A. (2010). Applied survey data analysis. CRC Press.


No. 69. Methods for Analyzing Polling Data and Poll Results.

Election polling represents one of the most visible examples of survey research. Especially during the campaign leading up to presidential elections, there are many media polls showing a variety of results. There are many aspects of survey design that can influence the results of a poll and one of these is the approach taken to analyze the data. To illustrate this point, Nate Cohn of the New York Times Upshot recently gave raw data from a pre-election poll conducted by Siena College to four pollsters. The data were also analyzed by researchers at the NYT Upshot.

These five different pollsters/organizations took different approaches to adjusting and analyzing the data to estimate support for Trump and Clinton. First, they took different approaches to making the survey sample representative of the population, using different estimates of the population (e.g., the Census or voter registration files) and different approaches for doing so (e.g., traditional weighting versus statistical modeling). Pre-election polls are unique in that their accuracy also is dependent on predicting who votes. The five analysts used different definitions of who is a likely voter (using self-report, voter history, or a combination of the two) and therefore included different subsets of the respondents when estimating support for the two presidential candidates. The findings for the presidential election question varied quite a bit, from one analysis showing Clinton up by 4 points to one showing Trump up by 1 point.

This exercise illustrates the importance of analysis decisions in affecting the outcome of political polls and surveys more broadly. It also demonstrates why it is valuable to average or aggregate results across surveys that use different methodologies or analysis approaches. Aggregating political polls is a strategy that has been made visible by websites like RealClearPolitics and by bloggers like Nate Silver and Sam Wang.

For more information about the Upshot exercise, see:


No. 72. The 2016 Election and Pre-Election Polls.

The results of the 2016 presidential election are in, and they raise some interesting questions about the methods used for pre-election polling (and survey methods more broadly). Although pre-election polls showed the race being tight, most showed Hillary Clinton winning by a small margin. Current reports of the vote outcome show that Clinton appears to have won the popular vote by a very small margin, but by less than recent national surveys predicted. Evidence from state polls in battleground states also suggests that they consistently overestimated support for Clinton relative to the election outcome. Although the causes of the discrepancies between pre-election polls and the election outcome will not be fully understood for some time, pending analysis by survey experts, there are a number of possible explanations:

  • The potential for nonresponse bias in pre-election polls is high because response rates are very low (typically less than 10%). Survey predictions of vote choice in an election will be inaccurate to the extent that vote choices among those who participate are different from those who do not. In other words, if Trump supporters were less likely to participate in surveys than Clinton supporters, surveys might have systematically overestimated support for Clinton.
  • The margin of error that is reported for polling estimates reflects sampling error, but it does not take into account non-sampling errors such as measurement error, coverage error, or nonresponse bias. The effects of these “non-sampling” sources of error are much more difficult to estimate and account for in our interpretation and analysis of survey data.
  • Surveys predominantly rely on respondents’ self-reports. That means that the accuracy of pre-election polls depends on respondents' willingness to honestly and accurately answer survey questions.
  • Predicting elections depends not only on predicting vote choice, but also on predicting the level of turnout and who will turn out to vote. Past polling failures to predict election outcomes (e.g., Jesse Ventura in the 1998 Minnesota gubernatorial election) have often been attributed to failures to accurately predict and understand turnout.
  • Finally, much pre-election polling has come to rely on non-probability sampling approaches including conducting surveys with members of Web panels constructed using nonprobability methods or simply interviewing the person who picks up the telephone in telephone surveys. The results of surveys that use nonprobability sampling may or may not accurately represent a population and it is nearly impossible to accurately estimate potential error in estimates from these surveys.
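The first mechanism, differential nonresponse, is easy to quantify with a simulation; the support shares and response rates below are entirely hypothetical:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 1_000_000
# Hypothetical electorate: 48% support candidate A, 52% candidate B.
supports_A = rng.random(n) < 0.48

# Differential nonresponse: A's supporters respond at 8%, B's at 10%.
responds = rng.random(n) < np.where(supports_A, 0.08, 0.10)

true_share = supports_A.mean()
observed_share = supports_A[responds].mean()
print(round(true_share, 3), round(observed_share, 3))
```

Even though both response rates are low and differ by only two percentage points, candidate A's support among respondents comes out several points below its true value, which is exactly the kind of systematic error the margin of error does not capture.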

Although pre-election polling represents only a small portion of survey research, thinking about the potential sources of error and inaccuracy in pre-election surveys is also useful for understanding sources of error in other surveys.


No. 82. Should I use sample weights in regression models?

We are often asked at SRL whether or not sample weights should be used when conducting multivariate analyses of survey data. This is one of those questions that has passionate believers on both sides of the fence. Because sample weights are commonly used to adjust for sample design and differential selection probabilities, failure to employ them may leave the analyst with a sample that is not representative of the population being examined. Employing weights comes with some cost, however, as they can increase the variance and standard errors of regression parameters. Hence, the decision as to whether or not to use weights is often based on the models to be examined. In particular, if the variables used to construct the sample weights (e.g., education or age) are of substantive interest as model covariates, we would recommend not using the sample weights, as they may distort those relationships. We also recommend that analysts plan to compare models with and without weights to determine the degree to which their inclusion affects the coefficients of interest, and report those comparisons (in a footnote if possible). Major differences between weighted and unweighted models are an indicator that the model may be misspecified, requiring further elaboration of the model being examined.
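A minimal sketch of that weighted-versus-unweighted comparison, using simulated data and arbitrary stand-in weights (not any real survey's design weights), fit by ordinary and weighted least squares:

```python
import numpy as np

def lstsq_beta(X, y, w=None):
    """OLS (w=None) or weighted least squares coefficients."""
    if w is not None:
        sw = np.sqrt(w)                    # transform so OLS on (X*, y*) = WLS
        X, y = X * sw[:, None], y * sw
    return np.linalg.lstsq(X, y, rcond=None)[0]

rng = np.random.default_rng(3)
n = 5_000
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n)     # true intercept 1, slope 2
w = rng.uniform(0.5, 2.0, n)               # stand-in sample weights
X = np.column_stack([np.ones(n), x])

b_unw = lstsq_beta(X, y)
b_w = lstsq_beta(X, y, w)
# When the weights are unrelated to the model's errors, the two sets of
# coefficients should agree closely; large gaps hint at misspecification.
print(np.round(b_unw, 2), np.round(b_w, 2))
```

Here the weights are independent of the error term, so the weighted and unweighted coefficients nearly coincide, illustrating the benign case of the comparison recommended above.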

For more information:

Bollen, K. A., Biemer, P. P., Karr, A. F., Tueller, S., & Berzofsky, M.E. (2016). Are survey weights needed? A review of diagnostic tests in regression analysis. Annual Review of Statistics and its Application, 3, 375–392.

Winship, C., & Radbill, L. (1994). Sampling weights and regression analysis. Sociological Methods and Research, 23(2), 230–257.


No. 95. Using Survey Data in Combination with Data from Other Sources: Introduction.

One limitation of survey data is that they rely primarily on respondent self-reports. As a result, the accuracy of such data relies on the assumptions that respondents are able and willing to provide honest and accurate responses, and these assumptions may not always hold. First, survey respondents may not always be able to provide all types of information. They may not always know all the reasons for their opinions or be able to provide detailed medical or financial information from memory. Respondents’ memory of a specific event may not be accurate (particularly for events that are frequent or regular, in which memories from similar events may be confounded), and respondents may have difficulty remembering when an event occurred (sometimes called event dating) even if they accurately recall the event itself. This is particularly problematic when respondents are asked to report on behaviors or experiences that occurred within a specific time frame (e.g., In the past 12 months, how many times did you go to see a doctor?). Second, respondents may not always be willing to answer honestly and completely. Survey questions sometimes ask respondents about topics that are sensitive (e.g., sexual history) or that respondents may want to answer in particular ways to give a more positive impression of themselves (e.g., turning out to vote in an election, attending church, or having egalitarian beliefs) or to avoid reporting negative opinions or behaviors (e.g., prejudice toward racial minorities, illegal drug use, or eating unhealthy foods). Despite these limitations, there is strong evidence that survey responses are typically quite valid and accurate, but researchers are increasingly combining survey data with data from other sources that provide information that is difficult or impossible to obtain via self-report.
Over the next few Survey News Bulletins, some of the kinds of data that are combined with survey data will be discussed and examples of their uses given.

For more information, see:

Tourangeau, R., Rips, L. J., & Rasinski, K. (2000). The Psychology of Survey Response. New York: Cambridge University Press.


No. 96. Appending Sample Frame Data.

While the primary focus of a survey is to collect the data captured by the survey instrument, auxiliary data can also provide information about the target population, the survey process, or members of the sample. As defined by Smith (2011), auxiliary data are all information not obtained directly from the interview. A sample frame (see SNB #74 for more information about sample frames) is one source of such auxiliary data. Sample frames can vary considerably in the amount of data they include, depending on the source of the frame. General population surveys typically have less information in the sample frame, although this varies by mode. An address-based sampling or ABS (see SNB #40) sample frame includes a mailing address for each sampled unit; in a high percentage of cases, the respondent name can also be appended. An RDD telephone frame often includes only a phone number, which provides very limited information about even the location of the sampled household. List samples of specific populations typically include more frame data. For example, a sample frame of members of an HMO might include the age, gender, height, weight, health conditions, etc. of sampled members. Similarly, a list of the members of a professional organization might include demographics and some information about their professional experience (e.g., the number of years they have been a member of the organization). While the primary purpose of the frame is to allow the researcher to make contact with the sampled members of the population, it can also be appended to the collected survey data. These data can be used directly in analysis of the survey data rather than collecting the information via self-report, or they can be used as a check on self-report data to assess its accuracy. However, two of the most common uses of sample frame data are calculating response rates for specific strata and conducting nonresponse bias analysis.
For example, knowing the gender of individuals in a sample frame of members of a professional organization would allow the researcher to assess the response rate separately for men and women and to compare the proportion of men and women who participated in the survey to the proportion in the sample frame. Similar analyses can be conducted with other sample frame characteristics to assess and correct for nonresponse bias.
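The gender example can be sketched in a few lines; the frame records and final dispositions below are hypothetical:

```python
# Hypothetical sample frame with gender appended, plus final dispositions.
frame = [
    {"id": 1, "gender": "F", "completed": True},
    {"id": 2, "gender": "F", "completed": False},
    {"id": 3, "gender": "M", "completed": True},
    {"id": 4, "gender": "M", "completed": False},
    {"id": 5, "gender": "F", "completed": True},
    {"id": 6, "gender": "M", "completed": False},
]

def rates_by(frame, key):
    """Response rate within each level of a frame characteristic."""
    out = {}
    for rec in frame:
        done, total = out.setdefault(rec[key], [0, 0])
        out[rec[key]] = [done + rec["completed"], total + 1]
    return {k: done / total for k, (done, total) in out.items()}

print(rates_by(frame, "gender"))  # response rate per gender
```

Comparing these rates, or the resulting respondent composition, to the frame composition is the raw material for the nonresponse bias analysis and weighting adjustments described above.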

For more information, see:

Smith, T. (2011). The report of the International Workshop on Using Multi-level Data from Sample Frames, Auxiliary Databases, Paradata and Related Sources to Detect and Adjust for Nonresponse Bias in Surveys. International Journal of Public Opinion Research, 23(3), 389–402.


No. 97. Using Survey Data in Combination with Data from Other Sources: Incorporating Biomeasures with Survey Data.

In this SNB, we consider one particular type of data that might be collected along with survey data: biomeasures. Although surveys predominantly rely on data from self-reports, it is increasingly common to collect other sorts of data as well. For example, particularly in health surveys, researchers may collect biomeasures. Biomeasures are physical measures taken by the survey interviewer or a researcher either at the time of the survey interview or (in some cases) at a later time during a physical exam. Examples of biomeasures collected with surveys include direct measures such as height, weight, waist/hip circumference, blood pressure; simple physical performance tests (walking, balance, strength, cognition); or collection of specimens such as saliva, urine, or blood samples.

There are several reasons why researchers may incorporate biomeasures with survey data: 1) to make population-representative inferences, if the survey sample is randomly drawn, 2) to serve as a reference to self-reported behaviors and health measures, 3) to better understand causal links between social / environmental exposures and health, and 4) to explore the role of genetics. Biomeasures may be particularly valuable when they provide data for variables that are difficult to assess via self-report, either because respondents may not be able to accurately answer survey questions on the topic (e.g., current blood pressure) or because the questions may be sensitive (e.g., illegal drug use).

When collecting biomeasures, it is important to consider what special training your interviewers will need in addition to standard study training, or whether you will require special personnel. Equipment needs and laboratory processing can greatly increase data collection costs, so it is important to build these into the budget. Also consider how respondent cooperation to the biomeasures will be handled in the study protocol: for instance, can respondents refuse that portion of the study and still complete the survey? Cooperation to biomeasures can range from very high with simple height / weight measurements to low with something more invasive such as blood or urine samples. Some studies build in an additional incentive for participation in the biomeasures component.

Giving results back to respondents is also important, whether that comes in the form of a sheet that interviewers complete on the spot, or a formal notice after the lab results are processed. Particularly if the biomeasures indicate potential health concerns, researchers are ethically obligated to share their findings with respondents.

For more information, see:

Sakshaug, J.W, Ofstedal, M.B., Guyer, H. & Beebe, T. (2015). The Collection of Biospecimens in Health Surveys. In T. P. Johnson (Ed.), Handbook of health survey methods. (pp. 383-419) Hoboken, NJ: John Wiley & Sons.


No. 98. Using Survey Data in Combination with Data from Other Sources: Geophysical data.

In this SNB, we consider another type of data that might be collected along with survey data: geophysical data or data about location. Researchers in many social science disciplines are increasingly aware that context and location are important and are taking space and geography into account in their theories, data collection efforts, and analyses. Linking survey data to data about location can be done in a number of ways. For example, face-to-face interviewers can take GPS coordinate readings for each interview or ABS samples (see SNB #40) can be linked to location.

Having data about location allows researchers to combine survey data with other data linked to location or context (e.g., distance from the closest transportation or source of fresh produce; population density or demographic composition of the neighborhood or ZIP code area). Having data about location also allows researchers to analyze data using GIS (Geographic Information Systems) tools. These allow data to be analyzed and visualized in ways that take space and location into account, using maps and other graphic representations (e.g., work like this is being done at UIC by the Urban Data Visualization Laboratory).
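For instance, once each case has coordinates attached, a measure like "distance to the nearest grocery store" reduces to a great-circle (haversine) calculation; the locations below are hypothetical:

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two (lat, lon) points."""
    r = 6371.0                              # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))

# Distance from a (hypothetical) respondent location to the nearest of
# several grocery stores:
respondent = (41.87, -87.65)
stores = [(41.88, -87.63), (41.80, -87.70), (41.90, -87.60)]
nearest = min(haversine_km(*respondent, *s) for s in stores)
print(round(nearest, 2))
```

The resulting distance can then be appended to the case record and analyzed alongside the survey responses, just like any other contextual variable.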

The value of linking location data to survey data is that it allows the researcher to understand how the data are distributed across the geographic area being considered. As with many of the data sources considered in this series of SNBs, appending location data to survey data enhances the usefulness of the survey data, allowing researchers to use the two sources together to answer questions that could not be answered with survey data or location data alone.

For more information about UIC resources for data visualization, see:

For more information about combining survey data with data from other sources, see:

Chen, J. T. (2015). Merging survey data with aggregate data from other sources: Opportunities and challenges. In T. P. Johnson (Ed.), Handbook of health survey methods (pp. 717-754). New York: Wiley.

Lohr, S. L. & Raghunathan, T. E. (2017). Combining survey data with other data sources. Statistical Science, 32(2), 293-312.


No. 99. Using Survey Data in Combination with Data from Other Sources: Neighborhood or Community Characteristics.

In SNB #99, we describe an additional type of data that can be combined with survey data: information about neighborhood or community characteristics. Neighborhood or community characteristics are related to the location information discussed in SNB #98 because it is necessary to assign an (at least approximate) location to cases in order to combine survey data with these characteristics. For example, researchers may collect or obtain information about the block, ZIP code, neighborhood, or community in which a respondent or responding organization is located.

Data about neighborhoods or communities can be obtained in a variety of ways. In face-to-face surveys, interviewers can record their observations about the areas in which their respondents live. In other survey modes, data from sources like the American Community Survey, the Decennial Census, or more local sources like crime statistics can be appended to the frame at the block, block group, tract, or community level, provided the sample frame contains the geographic information necessary for merging with auxiliary data (e.g., block group or ZIP code). Particularly for surveys covering large geographic areas (e.g., nationally representative surveys), these data may be available only at a broader geographic level, such as the community level. For more focused surveys, such as those of a limited geographic area (e.g., the City of Chicago), data can be collected and appended at smaller geographic units. The specific level of geography used may depend on the specificity of the location information available about cases in the survey or sample frame and the types of variables being measured.

Neighborhood or community characteristic data can be useful for multiple purposes, including understanding the context in which behavior captured by surveys occurs and in conducting nonresponse bias analysis. For example, researchers have examined the effects of neighborhood or community characteristics on issues such as child and youth development, health outcomes, and crime and delinquency, just to name a few. In general, there is some evidence that neighborhood characteristics can affect individual outcomes, even when controlling for individual level characteristics. In addition, researchers have assessed the effects of neighborhood or community characteristics on survey nonresponse. In particular, observations about economic conditions or social disorganization have been used to model response propensity and its relationship to the substantive data collected by the survey.

For more information see:

Olson, K. (2013). Paradata for nonresponse adjustment. The ANNALS of the American Academy of Political and Social Science, 645(1), 142-170.

Sampson, R. J., Morenoff, J. D., & Gannon-Rowley, T. (2002). Assessing “neighborhood effects”: Social processes and new directions in research. Annual Review of Sociology, 28, 443-478.


No. 101. Using Survey Data in Combination with Data from Other Sources: Census Data.

In SNB #99, we described a type of data that can be combined with survey data: information about neighborhood or community characteristics. One source of data on neighborhood or community characteristics is the US Census Bureau. While the US Census Bureau collects and disseminates data covering a variety of populations, topics, and geographies, the two sources of data used most widely by survey researchers are the Decennial Census and the American Community Survey.

Data from the Decennial Census are available at multiple levels and in several different products. Summary File 1 (SF1) includes 100% counts of the population and information about race, Hispanic/Latino ethnicity, and many population and housing characteristics. SF1 data are available at the block level. Summary File 2 (SF2) data include more detailed race information, but are available only at the tract level. The Census Redistricting Data are also available at the block level and include information about race and housing occupancy.

The advantages of the Decennial Census data products are the amount of information, the availability of block-level data, and the inclusion of 100% population counts. The disadvantage of using Decennial Census data is that they are collected every ten years and may be out of date by the time a researcher conducts his or her own survey.

The American Community Survey is an ongoing survey from which one-, three-, and five-year estimates are available. The one- and three-year estimates are available only for areas with larger populations, although the three-year estimates are based on a larger sample size. The five-year estimates are available down to the block group level. The one-year estimates are the most current but least reliable, while the five-year estimates are the most reliable but least current. Because the ACS data are based on samples, estimates include both point estimates and margins of error.

To append any of these data files to survey data, the survey data must include the geographic information corresponding to the Census data product (e.g., tract, block group, or block number).
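As a concrete illustration of this appending step, here is a minimal Python sketch that merges tract-level auxiliary variables onto survey cases by a shared tract identifier. All field names, tract codes, and values are invented for illustration, not drawn from any real Census product.

```python
# Tract-level auxiliary data (e.g., extracted from an ACS five-year file);
# keys are hypothetical 11-digit state+county+tract codes.
tract_data = {
    "17031010100": {"median_income": 54000, "pct_poverty": 0.18},
    "17031010201": {"median_income": 61000, "pct_poverty": 0.12},
}

# Survey cases, each already geocoded to a tract during data collection.
survey = [
    {"case_id": 1, "tract": "17031010100", "self_rated_health": 3},
    {"case_id": 2, "tract": "17031010201", "self_rated_health": 4},
]

def append_tract_data(cases, lookup):
    """Merge tract-level variables onto each survey case by tract code."""
    merged = []
    for case in cases:
        row = dict(case)                           # copy so input is unchanged
        row.update(lookup.get(case["tract"], {}))  # no-op if tract unmatched
        merged.append(row)
    return merged

merged = append_tract_data(survey, tract_data)
```

The same join logic applies at any level of geography (block, block group, ZIP code), as long as the identifier on the survey file matches the identifier on the auxiliary file.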

For more information see:


No. 102. Using Survey Data in Combination with Data from Other Sources: Data from Administrative Records.

Over the last several bulletins, types of data that are often combined with survey data have been described. In SNB #102, we finish this series with a discussion of incorporating data from administrative records with survey data. Administrative records include any data that are kept by a business, government agency, or other organization for the purposes of its work or function. For example, they might include school records of students’ grades, medical records, or the financial or tax records of an organization or business.

In some instances, administrative records may be used simply to construct a sample frame. For example, records about foster care placement might be used to construct a sample frame of all current foster parents in a county. Similarly, medical records could be used to identify patients with particular characteristics or experiences (e.g., who have had a colonoscopy in the past 12 months). Public records of tax status could similarly be used to identify and construct a sample frame of nonprofits in the state of Illinois.

As with many other external sources of data, data from administrative records can also be used to estimate nonresponse bias. For example, in a survey of the parents of children in a specific public school district, researchers could use administrative records to assess whether response rates are different across school or grade level. Similarly, given access they could compare the grades, test scores, or attendance records of students whose parents did and did not participate.

Administrative records can also contain useful substantive information. Medical records checks can be used to assess the accuracy of survey self-reports or to incorporate additional specific medical information with survey data. Similarly, administrative records of voter turnout, for example, are sometimes used as an objective measure of whether a respondent voted in an election. These records are used both to check the accuracy of respondents’ self-report of voting (because respondents tend to overreport turnout) and as a measure of turnout.

There are a number of challenges in using administrative records. One of these is gaining access to the records. In some cases (e.g., the tax status of an organization), these are public records or can be accessed through a Freedom of Information Act (FOIA) request. In many other cases (e.g., medical records or school records), however, these are protected documents and getting access may be difficult. Extracting the necessary information from administrative records and linking it with individual survey cases can also be challenging. Administrative records are typically formatted and stored in ways that make them most useful to the organization that originally collected the data – not always in a way that is most useful or accessible to researchers. So it is not uncommon for additional coding or matching to be necessary to put the data in a format that is usable by researchers or that allows them to link the records to individual survey cases.

For more information, see:

Davern, M., Roemer, M., & Thomas, W. (2015). Merging survey data with administrative data for health research purposes. In T. P. Johnson (Ed.), Handbook of health survey methods (pp. 695-716). New York: Wiley.

Chapter 12 of Dillman, D. A., Smyth, J. D., & Christian, L. M. (2014). Internet, Phone, Mail, and Mixed-Mode Surveys: The Tailored Design Method. New York: Wiley.


No. 106. Using Attention Checks to Identify Poor Quality Data.

With the ever-increasing popularity of self-administered modes of survey data collection, particularly online, attention checks have become a common approach to verifying that respondents are in fact giving due attention to the survey response task. Also known as “instructional manipulation checks” (IMCs) or “screeners,” attention checks are intended to identify individuals who satisfice when responding, typically by not reading questions carefully and hence failing to correctly follow instructions. Respondents unable to “pass” attention check questions are believed to provide poor quality, less reliable data, and those respondents are often excluded when conducting data analyses.

However, more recent empirical research is inconclusive regarding these assumptions about attention checks and the value of excluding those who “fail” them when analyzing study data. There is concern that, because failure of attention checks may be correlated with some sociodemographic variables, deleting these cases may have a detrimental effect on the composition of final samples, which may also affect data quality. There are additional concerns that attention check questions may influence subsequent respondent behavior in ways that can also damage data quality, by increasing respondent mistrust of researchers and by decreasing motivation to carefully answer subsequent questions. Consequently, recent research advises against using attention checks and removing these respondents (Anduiza & Galais, 2016; Berinsky et al., 2016).
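The concern above can be checked empirically before any cases are dropped. The Python sketch below flags attention-check failures and compares the demographic composition of passers and failers; the variable names, the check’s “correct” answer, and the tiny dataset are all invented for illustration.

```python
# Hypothetical respondent records: each row holds the answer given to an
# attention-check item and one demographic variable.
respondents = [
    {"id": 1, "attn_answer": "somewhat agree", "age_group": "18-29"},
    {"id": 2, "attn_answer": "strongly agree", "age_group": "18-29"},
    {"id": 3, "attn_answer": "somewhat agree", "age_group": "65+"},
]

CORRECT = "somewhat agree"  # the instructed response for this check (assumed)

passed = [r for r in respondents if r["attn_answer"] == CORRECT]
failed = [r for r in respondents if r["attn_answer"] != CORRECT]

def share(group, key, value):
    """Proportion of a group with a given value on a demographic variable."""
    return sum(1 for r in group if r[key] == value) / len(group) if group else 0.0

# If these shares differ markedly, excluding failers would change the
# composition of the final sample.
young_among_passed = share(passed, "age_group", "18-29")
young_among_failed = share(failed, "age_group", "18-29")
```

Comparing such shares (or running a formal test on a real dataset) lets the researcher see what exclusion would do to the sample before deciding whether it is warranted.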

For more information, see:

Anduiza & Galais. (2016). Answering without reading: IMCs and strong satisficing in online surveys. International Journal of Public Opinion Research 29(3): 497-519.

Berinsky, Margolis & Sances. (2016). Can we turn shirkers into workers? Journal of Experimental Social Psychology 66: 20-28.


No. 121. Extreme Response Style (ERS).

One source of error in survey responses is response effects – patterns of responding that do not reflect the content of the survey items. One such response effect, extreme response style (ERS), has been defined as the tendency of some people to prefer extreme responses to survey questions (rather than more moderate ones), regardless of the content of the question. Many researchers have measured ERS by counting the number of questions in a set for which a respondent selects the most extreme response on either end of the scale (e.g., Das and Dutta 1969; Greenleaf 1992), although this is not the only approach to measuring ERS.
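The counting approach to measuring ERS can be sketched in a few lines of Python. The function below assumes, for illustration, a 5-point rating scale and a ten-item battery; the example answers are invented.

```python
def ers_count(responses, low=1, high=5):
    """Number of items answered at the most extreme point of either end."""
    return sum(1 for r in responses if r in (low, high))

# One hypothetical respondent's answers to a ten-item battery on a 1-5 scale.
answers = [5, 4, 1, 3, 5, 2, 1, 5, 3, 4]
score = ers_count(answers)  # five of the ten responses are at an endpoint
```

The resulting count (or its proportion of the battery length) can then be compared across respondents or groups, or used as a control variable as suggested below.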

One of the reasons ERS is a concern in survey responses is that it has been shown to vary systematically across groups of respondents. One of the most consistent findings is that ERS tends to be higher among Blacks and Latinos than among Whites (i.e., Blacks and Latinos prefer more extreme responses than Whites; e.g., Bachman and O’Malley 1984a; 1984b; Bachman et al. 2011; Hui and Triandis 1989). At least some of the cross-group difference (e.g., White versus Latino respondents) is due to cultural differences, but culture does not entirely explain group differences (e.g., differences between White and Black respondents’ response patterns do not seem to be accounted for by cultural differences). Cross-group differences in ERS raise the possibility of differential measurement error across racial and ethnic groups.

Indeed, Bachman and O’Malley (1984b) argued that controlling for ERS eliminated the appearance of differences in mean self-esteem ratings between African-American and White respondents. Baumgartner and Steenkamp (2001) also found that ERS contaminated data collected in 11 European countries, and De Jong et al. (2008) demonstrated that ERS posed a threat to the validity of survey-based marketing research involving more than 12,000 consumers in 26 countries.

Researchers should be particularly concerned about ERS when comparing the results of multi-item scales across racial, ethnic, or cultural groups. One approach to addressing potential bias introduced by ERS is to include a standardized measure of ERS (see Greenleaf 1992) to use as a control variable in one’s analyses.

For more information, see:

Bachman, J. G. & O’Malley, P. M. (1984a). Black-White differences in self-esteem: Are they affected by response styles? The American Journal of Sociology, 90(3), 624-639.

Bachman, J. G. & O’Malley, P. M. (1984b). Yea-saying, nay-saying, and going to extremes: Black-white differences in response styles. Public Opinion Quarterly 48(2), 491-509.

Bachman, J. G., O’Malley, P. M., Freedman-Doan, P., Trzesniewski, K. H., & Donnellan, M. B. (2011). Adolescent self-esteem: Differences by race/ethnicity, gender, and age. Self and Identity, 10(4), 445-473.

Baumgartner, H. & Steenkamp, J. B. E. M. (2001). Response styles in marketing research: A cross-national investigation. Journal of Marketing Research, 38, 143-156.

Das, J. P. & Dutta, T. (1969). Some correlates of extreme response set. Acta Psychologica, 29, 85-92.

De Jong, M., Steenkamp, J. B., Fox, J. P., & Baumgartner, H. (2008). Using item response theory to measure extreme response style in marketing research: A global investigation. Journal of Marketing Research, 45(1), 104-115.

Hui, C. H. & Triandis, H. C. (1989). Effects of culture and response format on extreme response style. Journal of Cross-Cultural Psychology, 20, 296–309.


No. 124. Sources for Survey Data for Secondary Analysis.

Previous Survey News Bulletins have often focused on best practices and research findings related to designing and collecting survey data. Designing and conducting surveys can be expensive and time-consuming and requires a substantial amount of expertise to do well. Sometimes, however, designing and conducting an original survey is unnecessary because archived survey data are available that can be used to answer the question of interest. Survey data are among the most widely archived and publicly available social science resources, and archiving of survey data continues to increase as many funding sources require investigators to make their deidentified data available after they have finished analyzing it. The purpose of this SNB is to describe a few of the major sources of archived survey data:

  • The Inter-university Consortium for Political and Social Research (ICPSR) – This data archive is housed at the University of Michigan. The archive includes many government and academic surveys, including the American National Election Studies, the General Social Survey, and the National Health Interview Survey. It does include some media surveys, but far fewer than the Roper Center archive (see below). Questionnaires and methodology reports are publicly available. Data for more than 10,500 studies are available through ICPSR’s more than 700 member institutions across the world, which include most major research universities. Access to ICPSR is available through UIC’s library.
  • The Roper Center – The Roper Center archive is hosted by Cornell University. Although there is some overlap with ICPSR (i.e., some data are archived in both places), the Roper Center archive includes many more surveys conducted by commercial organizations and media outlets. The archive includes over 23,000 datasets that date back to the 1930s. Access to the Roper Center is available through UIC’s library.
  • Government Data Sources – Another online source of survey data is government agencies. In particular, many of the federal agencies that make up the Federal Statistical System routinely collect data using surveys of individuals, organizations, and businesses. These agencies include the Census Bureau, the Bureau of Labor Statistics, the Bureau of Justice Statistics, and the CDC’s National Center for Health Statistics. Their websites often include not only access to the data, but also online tools for analyzing the data and extensive reports of government analyses.
  • Nonprofits and Think Tanks – An additional source of survey data is data collected (often for media release) by nonprofit organizations and think tanks. These data are sometimes available directly from the organization, or they are archived in one or both of the archives described above. Two examples of such organizations are the Kaiser Family Foundation, which collects survey data from Americans about their health and healthcare, and the Pew Research Center, a nonpartisan think tank that conducts and analyzes survey data about current issues and trends.

Although there are disadvantages to analyzing data collected by someone else (e.g., inability to control design or measurement), conducting secondary data analysis is an efficient way to answer many research questions. These data sources are particularly valuable for researchers who may not have the resources to collect their own survey data and, in some cases (e.g., government data collections), involve data collection on a larger scale than can typically be funded in academic research. They can also serve as a starting point for developing survey questions or measures of constructs, as they allow researchers to see how questions have been measured in the past.


No. 138. Why Some Opinions may be Systematically Under-Represented in Surveys.

While surveys continue to be the primary method for measuring public opinion, research published by Adam Berinsky at Princeton University reminds us of the often ignored threat from item nonresponse in survey data. He demonstrated in a series of papers (see below) that persons holding certain opinions on a given topic were more likely not to answer survey questions concerning that topic, thereby ensuring their beliefs were systematically under-represented. He refers to this as “exclusion bias.” Berinsky’s work examined the effects of exclusion bias on opinions about social welfare policy, racial policy, and the Vietnam War. His findings serve as an important reminder of the multiple threats to measurement quality that must be considered when collecting and analyzing survey data.

We note that some researchers address the problem of item nonresponse in opinion surveys by providing an explicit “no opinion” response option. As discussed in News Bulletin 118, however, this can result in satisficing (discussed in News Bulletin 117), potentially exacerbating the effect.

For more information:

Berinsky, A. J. (1999). The two faces of public opinion. American Journal of Political Science, 43, 1209-1230.

Berinsky, A. J. (2002). Political context and survey response: The dynamics of racial policy opinion. Journal of Politics, 64, 567-584.

Berinsky, A. J. (2002). Silent voices: Social welfare policy opinions and political equality in America. American Journal of Political Science, 46(2), 276-287.

Berinsky, A. J. (2004). Silent Voices: Public Opinion and Political Participation in America. Princeton, NJ: Princeton University Press.


Questionnaire Design

No. 3. “Some/Other” questions in surveys.

In a recent article examining the use of "some/other" questions (i.e., those introduced by saying that "Some people think that…, but other people think…"), researchers concluded that these questions increase question complexity and length without improving the validity of responses. Respondents took longer to answer these questions, but there was little evidence that doing so improved their answers. The authors instead recommend short, direct questions that avoid unconventional response option order.

Yeager, D. S., & Krosnick, J. A. (2012). Does mentioning "some people" and "other people" in an opinion question improve measurement quality? Public Opinion Quarterly, 76, 131-141.


No. 4. Giving clarifying information in survey questions.

Researchers often want to provide instructions to clarify inclusion/exclusion criteria when asking a question in a survey (e.g., respondents must know what to count or not count as a "shoe" in a question asking about the number of pairs of shoes they own). Recent evidence suggests that it is better to give instructions before the question than after it, and that decomposing the question into its component parts (e.g., asking separate questions about different types of shoes) may result in the most accurate responses, although asking multiple questions takes longer than providing instructions (see Redline, C. (2013). Clarifying categorical concepts in a Web survey. Public Opinion Quarterly, 77, 81-105).


No. 9. Branching in bipolar rating scale survey items.

The reliability and validity of attitude reports can be improved greatly by breaking up the respondents' rating task into a series of steps (branching) rather than asking for a single rating in one step. For example, the branching alternative to a question asking respondents to rate how positive or negative they feel about an attitude object would be to first ask respondents how they feel toward the object (positive, negative, or neutral), and follow up with a question about the degree (extremely, somewhat, etc.) to which they feel this way. More recent research suggests that the optimal way to implement branching would be to offer respondents three options for the first part (e.g., positive, negative, or neutral) and in the follow-up offer three options (e.g., extremely, somewhat, and slightly) only to those who select the endpoints (i.e., positive or negative). The outcome will be a quality attitudinal report collected on a seven-point bipolar scale.
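The scoring of this branching design onto a single 7-point bipolar scale can be sketched as follows in Python. The response labels and numeric codes are illustrative assumptions, not from the cited study.

```python
# Intensity follow-up options, offered only to respondents who chose an
# endpoint (positive or negative) in the first question.
INTENSITY = {"slightly": 1, "somewhat": 2, "extremely": 3}

def bipolar_score(direction, intensity=None):
    """Map branched answers to 1 (extremely negative) .. 7 (extremely positive)."""
    if direction == "neutral":
        return 4                       # neutral respondents get the midpoint
    offset = INTENSITY[intensity]      # 1, 2, or 3 steps away from neutral
    return 4 + offset if direction == "positive" else 4 - offset

score = bipolar_score("negative", "extremely")  # maps to scale point 1
```

Two short questions per respondent thus yield the same seven scale points as a single seven-option rating, while keeping each individual choice simple.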

Malhotra, N., Krosnick, J. A., & Thomas, R. K. (2009). Optimal design of branching questions to measure bipolar constructs. Public Opinion Quarterly, 73, 304-324.


No. 10. Committee translation approach.

Although there are many variations, the basic committee translation approach involves several steps. First, a translation committee of three or more bilingual translators, including a team leader, is assembled. Second, each translator is assigned a random section of the questionnaire to translate. Third, the committee meets to review the complete translated document and determine the extent to which it is correct in meaning, grammar, and syntax, and uses language that is familiar and culturally appropriate to the target population. Corrections are made where there is consensus. Fourth, where there is no consensus, the team leader makes the final decision regarding the final version. Fifth, the final translation is pilot tested to identify any issues that require additional meetings of the translation committee to resolve.



No. 11. Nondifferentiation.

Researchers routinely ask people to answer a series of questionnaire items all using the identical rating scale for recording responses. Nondifferentiation occurs when a person selects the same or similar response to all items in the series so as to invest minimal effort while responding, rather than because these ratings genuinely reflect the person's views, a behavior termed "satisficing." Nondifferentiation is more likely to occur toward the end of a long questionnaire, when fatigue presumably sets in and motivation to provide optimal answers declines. Studies have found a negative relationship between nondifferentiation and respondents' educational levels, consistent with the notion that satisficing is more likely among respondents with less cognitive ability. Data quality would likely improve if questionnaires were designed to reduce the likelihood of nondifferentiation. One simple way to do this is to present questions individually rather than as a series.
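One simple way to screen for nondifferentiation in collected data is to compute, for each respondent, the standard deviation of their answers across a battery of items sharing one rating scale: a value of zero means fully identical ("straightlined") responses. This is a minimal sketch; the index, the example answer vectors, and any flagging threshold are illustrative assumptions, and other indices exist in the literature.

```python
import statistics

def nondifferentiation_index(responses):
    """Within-respondent SD across grid items; 0 = fully straightlined."""
    return statistics.pstdev(responses)

straightliner = [3, 3, 3, 3, 3, 3]   # identical answers to six grid items
engaged = [1, 4, 2, 5, 3, 2]          # differentiated answers

flagged = nondifferentiation_index(straightliner) == 0.0
```

Respondents with very low values on such an index can then be examined more closely, rather than dropped automatically.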

Krosnick, J. A. (1991). Response strategies for coping with the cognitive demands of attitude measures in surveys. Applied Cognitive Psychology, 5(3), 213-236.
Krosnick, J. A., Narayan, S., & Smith, W. R. (1996). Satisficing in surveys: Initial evidence. New Directions for Evaluation, 70, 29-44.


No. 20. Acquiescence response bias: Agree-disagree questions.

Agree-disagree questions are subjective survey questions that ask respondents to report whether they agree or disagree with a statement or a series of statements. These include questions that ask only for direction (e.g., Do you agree or disagree with the following statement?) and those that also ask for intensity (e.g., Do you strongly agree, somewhat agree, neither agree nor disagree, somewhat disagree, or strongly disagree with the following statement?). These questions are problematic because they are subject to acquiescence response bias (ARB), whereby respondents agree with statements regardless of their content. ARB may bias estimates based on either single agree-disagree items or scales composed of agree-disagree items. ARB is also influenced by cultural factors and has been shown to vary across countries and cultural groups within countries. One suggested solution to ARB, asking multiple agree-disagree questions (some positively worded and some negatively worded) and averaging them together, results in unnecessarily long and inefficient scales. Furthermore, respondents who demonstrate ARB in response to such scales are given scale values in the middle of the range, which may artificially reduce variance, and respondents from some cultures may be reluctant to endorse negatively worded statements. The recommended alternative to agree-disagree items is to use questions with construct-specific response options. To do this, the researcher should determine the underlying construct of interest and write a survey question explicitly designed to measure that construct. For example, one could conclude that a question asking respondents whether they agree or disagree with the statement "My doctor often treats me with respect" is designed to assess how much of the time the respondent's doctor treats him or her with respect. 
One could more directly assess this by asking respondents: "How often does your doctor treat you with respect? Would you say never, rarely, sometimes, often, or always?" The latter construct-specific question is more direct, minimizes the cognitive burden for respondents, and avoids ARB.

For more information, see

Saris, W., Revilla, M., Krosnick, J. A., & Shaeffer, E. M. (2010). Comparing questions with agree/disagree response options to questions with item-specific response options. Survey Research Methods, 4, 61-79.


No. 21. Acquiescence response bias: Yes-no questions.

Yes-no survey questions are those that ask respondents to answer yes or no to a question. For example, a survey respondent might be asked "Do you think the government has the responsibility to ensure that all Americans have access to jobs?" A problem with these questions is that they present only one possibility. In the example question, the other possibility is that the respondent does not think the government has this responsibility, but this possibility remains unspoken. As a result, respondents tend to start thinking about their answer to this question by considering reasons why they agree with the statement (i.e., reasons why this is the government's responsibility) rather than reasons why they disagree with it (i.e., reasons why this is not the government's responsibility). Because respondents' thinking may be biased in a confirmatory direction and because not all respondents may fully go through the process of considering both possibilities (i.e., they may stop before considering or fully considering why this is not the government's responsibility), respondents may demonstrate a form of acquiescence response bias (ARB) to these questions. ARB occurs when respondents agree or in this case, say "yes," to questions regardless of their content. Thus, yes-no questions may be biased toward overestimating the proportion of respondents who take the position (or engage in the behavior) framed as the "yes" response. The recommended solution to this problem is to revise the questions to present both positions in a more balanced way. For example, the question about government responsibility for providing jobs could be rewritten to read: "Do you think the government has the responsibility to ensure that all Americans have access to jobs, or do you not think that the government has this responsibility?" This presents both potential responses in an equal and balanced way. 
Although this question is slightly longer than the simpler yes-no question first presented, initial evidence suggests that it takes no longer to administer the balanced question in an interviewer-administered survey than it does to administer the simple yes-no question (Anand et al., 2010).

Anand, S., Owens, L. K., & Parsons, J. A. (2010). Forced-choice vs. yes-no questions: Data quality and administrative effort. Paper presented at the annual conference of the World Association for Public Opinion Research, Chicago.


No. 24. Time bounding and date prompts.

Time bounding is giving respondents a time window for their response. If the reference period is appropriate, respondents are more likely to actually count or estimate rather than use a vague quantifier. For instance, "How often do you exercise?" will yield a vague quantifier ("sometimes").

"How often in the past 30 days did you exercise?" is better. The respondent will probably figure out number of times per day and multiply by 30 or number of times per week and multiply by 4.

"How often have you exercised today?" is best; the respondent will count the number of times.

Often, when we give a time reference (past 30 days), we add date prompts that tell the respondent what those exact dates are. Without them, respondents interpret the reference period differently. Asking about the past 30 days in early August, for instance, could mean "July" to a respondent, while asking about the past 30 days in mid-August could mean the first half of August to that same respondent. Similarly, when asking about the "past 7 days" or "past week" late in a given week (e.g., on a Thursday), respondents tend to leave out the previous weekend. This can be avoided by using the prompt "Since last Friday…"
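For computerized instruments, the exact date prompt can be generated at interview time rather than worded by hand. A minimal Python sketch, assuming a "past 30 days" reference period and a month-and-day wording that is purely illustrative:

```python
from datetime import date, timedelta

def date_prompt(today, days=30):
    """Spell out an explicit window for a 'past N days' reference period."""
    start = today - timedelta(days=days)
    return f"between {start:%B %d} and {today:%B %d}"

# Interview conducted mid-August: every respondent anchors on the same dates.
prompt = date_prompt(date(2024, 8, 15))   # "between July 16 and August 15"
```

Generating the window programmatically guarantees that all respondents interviewed on the same day receive the same explicit dates.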

Also see Rockwood, T. (2015). Assessing physical health. In T. P. Johnson (Ed.), Handbook of health survey methods (pp. 107-142). Hoboken, NJ: Wiley.


No. 34. Asking sensitive questions.

One challenge of using surveys to collect data is that they rely almost exclusively on self-report data. As such, they rely on respondents to be both able and willing to honestly and completely answer survey questions. One type of question that is a particular challenge in surveys is the sensitive question. Such questions measure, for example, constructs like whether a respondent has engaged in risky sexual behavior or used illegal drugs, the extent to which a respondent holds negative racial attitudes, or whether a student has cheated on an exam. Tourangeau and Yan (2007) define sensitive questions as those that are "intrusive," questions where there is a "threat of disclosure," or questions for which there are answers that might make the respondent appear "socially undesirable" (p. 860). People may deal with surveys that contain sensitive questions by not participating in the survey (unit nonresponse), not answering specific questions (item nonresponse), or not answering them honestly (socially desirable responding). There are many factors that may affect responses to sensitive questions, including the respondent's tendency to engage in socially desirable responding, the mode of data collection, and interviewer characteristics and behavior. Indirect questioning techniques like the randomized response technique (RRT) and the list technique (aka item count technique or unmatched count technique) are strategies that allow respondents to answer a question to an interviewer in a way that protects their anonymity. Unfortunately, these methods are sometimes quite complex to implement, often result in reduced statistical power, and may not always work as intended. Other strategies include normalizing the undesirable behavior or opinion being asked about and providing reassurances about the confidentiality of responses.
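To make the logic of these indirect techniques concrete, here is a minimal sketch of the point estimators behind Warner's randomized response design and the list (item count) technique. The function names are ours, and a real analysis would also compute standard errors.

```python
def rrt_prevalence(observed_yes_rate, p):
    """Warner's randomized response design: with probability p the respondent
    truthfully answers the sensitive statement, otherwise its negation.
    Observed yes rate = p*pi + (1 - p)*(1 - pi); solve for true prevalence pi."""
    return (observed_yes_rate - (1 - p)) / (2 * p - 1)

def list_prevalence(treatment_counts, control_counts):
    """List (item count) technique: the treatment group's list adds the
    sensitive item, so the difference in mean counts estimates prevalence."""
    mean = lambda xs: sum(xs) / len(xs)
    return mean(treatment_counts) - mean(control_counts)

# If true prevalence is 0.20 and p = 0.75, the expected observed yes rate
# is 0.75*0.20 + 0.25*0.80 = 0.35; the estimator recovers roughly 0.20.
print(rrt_prevalence(0.35, 0.75))
print(list_prevalence([2, 3, 3, 4], [2, 2, 3, 3]))  # 0.5
```

The loss of statistical power mentioned above is visible here: each estimator divides a difference of noisy proportions by a constant, inflating its variance relative to a direct question.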

See Tourangeau, R., & Yan, T. (2007). Sensitive questions in surveys. Psychological Bulletin, 133, 859–883.


No. 43. Cross-cultural survey methods and respondent heterogeneity.

One growing issue in survey research today has to do with the effect of culture on survey responses and survey data collection. This is of increasing concern both because cross-national surveys are becoming more widespread (e.g., the European Social Survey) and because cultural heterogeneity is growing even within countries or other geographic areas. For example, the U.S. Census Bureau projects that by 2043, Latinos will be the largest racial or ethnic group in the U.S. but not a majority, and non-Hispanic Whites will be a minority. These population and research trends emphasize the importance of designing surveys that equivalently measure constructs across cultures. A great deal of research in survey methods has addressed these issues, and a 2015 special issue of Public Opinion Quarterly was dedicated to this methodological issue. The articles in this special volume addressed comparability of measurement (e.g., Davidov et al., 2015; Yu & Yang, 2015); differences in the likelihood of responding to surveys across cultural groups (e.g., Banducci & Stevens, 2015); response effects and response styles across cultures (e.g., Banducci & Stevens, 2015; He & Van De Vijver, 2015); and the measurement of cultures and values across cultures (Lakotos, 2015).

See for the special volume of POQ on cross-cultural survey methods.


No. 49. Question-by-Question Guidelines.

Question-by-Question Guidelines (Q x Qs) should be part of the materials provided to interviewers for every telephone and face-to-face interviewer training. A Q x Q is a complete and annotated version of the questionnaire and the screener (i.e., questions used to determine eligibility for the survey) and thus is an important tool for interviewers as they learn how to administer a questionnaire. It includes not only the questions interviewers will be asking respondents, but also instructions about additional data to be collected (e.g., observations or procedures) and additional details and clarification about the survey questions. As part of training, allow ample time for a read-through of the Q x Q on paper with the trainer leading discussion of the annotations and answering interviewer questions. The Q x Q can also serve as a resource for interviewers and supervisors after training.

The Q x Q should put in writing as many details as possible about the questionnaire and interview process that an interviewer might need to refer to later. Not all questions require annotation, but notes should

  • point out question-level things that are important for collecting good data, such as data entry guidelines and probing instructions,
  • call attention to critical data entry (e.g., why certain pieces of data are collected, especially if they are to be used later on, such as capturing phone numbers for validation),
  • document skip patterns and explain the overall purpose of sections of the questionnaire, and
  • always provide details about the screening process or interview requirements.

Discussion of the Q x Q will also help interviewers learn how to better manage the answers they are likely to get on certain questions. They should be encouraged to use their copies of the Q x Q to take notes during training, and also to refer to it later on as they learn the questionnaire and participate in mock interviews.


No. 60. Age versus Year-of-Birth.

Survey researchers often want to measure respondents’ age. Although it might seem most straightforward to simply ask respondents “How old are you?”, this type of question (one that requires respondents to give an integer numeric response) may result in what is called response heaping or rounding. Response heaping occurs when respondents disproportionately give answers that are divisible by 5 or 10 (see Holbrook et al., 2014; Pudney, 2008), and this type of heaping has been observed in reports of ages in the U.S. and in other countries. A simple histogram of age reports shows spikes of people answering with values that are divisible by 5 or 10. Because actual ages are not distributed this way, these spikes represent a source of error in age data. Because of the potential for this error in estimates of age, current best practices are to ask respondents their year (or date) of birth and to calculate age based on that report.

For more information, see:

Holbrook, A. L., Anand, S., Johnson, T. P., Cho, Y. I., Shavitt, S., Chavez, N., et al. (2014). Response heaping in interviewer-administered surveys: Is it really a form of satisficing? Public Opinion Quarterly, 78, 591-633.

Pudney, S. (2008). Heaping and leaping: Survey response behavior and the dynamics of self-reported consumption expenditure. Institute for Social and Economic Research, No. 2008-2009. Available at


No. 71. Cognitive Pretesting.

Cognitive pretesting (sometimes called cognitive interviewing) is used to develop and evaluate survey questionnaires. It originated in the Cognitive Aspects of Survey Methodology (CASM) movement, in which cognitive psychology is applied to the survey context. Cognitive pretesting is based on the theory that respondents in a survey go through a series of cognitive steps to answer a survey question. In general, respondents must understand the question and its purpose, retrieve relevant information from memory, integrate that information into a summary judgment, and report that judgment using the response format requested.

There are two broad approaches to cognitive pretesting, although some cognitive interviews combine the two. The first is the think-aloud interview, in which respondents are read proposed questions from a survey. However, instead of simply answering each question, respondents are instructed to report aloud their thoughts as they go through the process of thinking about and answering it. The advantage of this approach is that it provides rich, unbiased data about respondents' cognitive processes. Disadvantages are that it is fatiguing for respondents (and very difficult for some), and the process of articulating thoughts may actually change the answer a respondent would give. The data may also not always address concerns that researchers have about specific survey questions.

A second cognitive pretesting approach involves structured probes. Respondents answer each proposed survey question and then are asked a series of follow-up questions about their thought processes. For example, they may be asked how they interpreted a specific term or word in the question, how they came to an estimate of how frequently they performed a particular behavior, or whether or not they thought most other people would be comfortable answering a particular question. The advantage of this approach is that a researcher can target the specific concerns he or she has about a question. It is also substantially easier for respondents than "think alouds," and the data are easier to analyze and interpret. A disadvantage is that the researcher only collects information related to the specific probes he or she includes and may therefore miss important problems with proposed survey questions.

Some cognitive interviewing combines elements of these two approaches. Both rely on respondents' ability and willingness to report on their own cognitive processes. Cognitive interviewing is a very common part of the questionnaire development process (particularly when new questions or instruments are being developed) and is primarily used to identify problems with questions in order to revise and improve them. Unfortunately, less is known about how best to revise potentially problematic items so as to avoid introducing new problems, so it can be useful to use multiple iterations of cognitive pretesting to evaluate revised items.

For more information, see

Presser, S., Rothgeb, J. M., Couper, M. P., Lessler, J. T., Martin, E., Martin, J., & Singer, E. (2004). Methods for testing and evaluating survey questionnaires. Hoboken, NJ: Wiley and Sons.

Willis, G. (2004). Cognitive interviewing: A tool for improving questionnaire design. Thousand Oaks, CA: Sage Publications.


No. 83. Using Questions from Previous Survey Questionnaires.

One aspect of questionnaire design that researchers often struggle with is whether to use verbatim questions from past questionnaires, to modify questions from previous questionnaires, or to write new questions. We encourage researchers to be thoughtful about this decision and to use several criteria for doing so.

First, one goal in designing a questionnaire should be to reduce measurement error. Advances in questionnaire design mean that best practices for writing good questions have changed over time and it, therefore, often does not make sense to use questions based on old practices or knowledge. Also, not all previously used survey questions were created using equivalent methods. Some questions have been developed, pretested, and validated extensively and others have not, so it is important to consider whether there is evidence about the quality of questions when deciding whether or not to include them.

Second, it is important to keep in mind that the optimal question wording may change over time or with the population being studied. For example, in the 1970s, the General Social Survey asked questions about Negroes, which would clearly not be appropriate terminology today. Similarly, questions about communication and social support have evolved to include electronic communication, as technology has become an increasingly important part of how we communicate. The target population can also impact optimal question design. For example, questions designed for adults may need to be rewritten or revised to be appropriate for children.

A third factor that researchers should consider is the purpose of the questions. In some cases, there may not be existing questions that measure exactly the construct of interest; in that case it is better to write a new question - or modify an existing one - that is targeted to the study’s purpose rather than to use an existing question; even a question that has been pretested and validated should not be used if it is not well-suited to the study’s purpose. On the other hand, researchers may sometimes want to make direct (even statistical) comparisons to previous data. In this case, using exact questions from previous research may be important to the research goals. Even verbatim question wording, however, does not ensure that data from a survey will be directly comparable to data from previous surveys if the meaning of words in the question have changed over time. For example, Smith (1987) notes that a question asked in a Gallup survey in 1954 ("Which American city do you think has the gayest night life?") would be interpreted very differently if asked in the 1980s.

There are reasons to use existing questions under some conditions and reasons not to use existing questions under other conditions. We encourage researchers to thoughtfully consider whether it’s best to use existing questions or whether new or revised questions can reduce measurement error or better measure the construct of interest. We strongly encourage researchers to not simply include questions from past surveys and use precedent as the rationale for doing so.

Smith, T. W. (1987). The art of asking questions, 1936-1985. Public Opinion Quarterly, 51, 95-108.


No. 84. Asking About Gender and Sexual Orientation in Surveys.

Traditionally, survey questions simply asked for the respondent’s sex and offered two choices—male or female. While that question may address biological sex at birth, it does not capture the variations in gender identity. One option for asking gender identity is:

Which of the following best describes your gender identity?
  a.  Female
  b.  Male
  c.  Transgender Female
  d.  Transgender Male
  e.  Gender variant/non-conforming
  f.  Other (specify)
  g.  Prefer not to answer
There are other options as well. Please consult the list of references below.

Sexual Orientation
Sexual orientation is separate from gender and involves three separate components—identity, behavior, and attraction. For example, a man may feel attracted to another man, or even engage in sexual activity with him, but may not identify as gay. Thus, a question asking about sexual orientation needs to specify which of the three separate components it seeks information about. SMART (Sexual Minority Assessment Research Team) makes the following recommendations:

Self-identification: how one identifies one’s sexual orientation (gay, lesbian, bisexual, or heterosexual).

Recommended Item: Do you consider yourself to be:
  a.  Heterosexual or straight;
  b.  Gay or lesbian; or
  c.  Bisexual

Sexual behavior: the sex of sex partners (i.e., individuals of the same sex, different sex, or both sexes).

Recommended Item: In the past (time period e.g., year) with whom have you had sex?
  a.  Men only
  b.  Women only
  c.  Both men and women
  d.  I have not had sex in the past [time period]

Sexual attraction: the sex or gender of individuals that someone feels attracted to.

Recommended Item: People are different in their sexual attraction to other people. Which best describes your feelings? Are you:
  a.  Only attracted to females?
  b.  Mostly attracted to females?
  c.  Equally attracted to females and males?
  d.  Mostly attracted to males?
  e.  Only attracted to males?
  f.  Not sure?

Also see

Sexual Minority Assessment Research Team (SMART). (2009). Best practices for asking questions about sexual orientation on surveys. The Williams Institute. Retrieved March 1, 2017, from

Federal Interagency Working Group on Improving Measurement of Sexual Orientation and Gender Identity in Federal Surveys. (2016). Current measures of sexual orientation and gender identity in federal surveys. Working Paper. Retrieved March 1, 2017, from

Miller, K., & Ryan, J. M. (2011). Design, development and testing of the NHIS sexual identity question. Questionnaire Design Research Laboratory, Office of Research and Methodology, National Center for Health Statistics. Retrieved March 1, 2017, from

Fryrear, A. (2016). How to write gender questions for a survey. Survey Gizmo. Retrieved March 1, 2017, from


No. 85. Rosa’s Law and Surveys about Disabilities.

Question wording plays a critical role in how respondents interpret and answer survey questions. Question wording in surveys can change over time in response to advances in questionnaire design, changes in society or culture, or changes in definitions. This is particularly true when survey researchers are using terminology that is associated with a medical diagnosis or legal definition.

One such change occurred in October 2010, when President Obama signed what is known as Rosa’s Law. This legislation required the federal government to replace the term “mental retardation” with “intellectual disability.” The law is named after Rosa Marcellino, a girl with Down syndrome who was nine years old when it became law, and who, according to President Barack Obama, “worked with her parents and her siblings to have the words 'mentally retarded' officially removed from the health and education code in her home state of Maryland.” Rosa’s Law is part of a series of modifications to terminology - beginning in the early 1990s - that have been used to describe persons with what we now refer to as intellectual disabilities.

One result of this law is that federal surveys such as the National Health Interview Survey changed the terminology used in survey questions from asking about “mental retardation” to asking about “intellectual disability, also known as mental retardation.” Survey researchers using the NHIS data on intellectual disabilities should be aware of this change and the possible implications for prevalence estimates, particularly if data from before and after 2010 are being compared or combined. In addition, researchers who are designing surveys that measure intellectual disabilities may want to use terminology and question wording that is consistent with federal guidelines.

Also see:

Language of Rosa’s Law:

Zablotsky, B., Black, L. I, Maenner, M. J., Schieve, L. A., & Blumberg, S. J. (2015). Estimating prevalence of autism and other developmental disabilities following questionnaire changes in the 2014 National Health Interview Survey. National Health Statistics Reports, No. 87. Available at:


No. 88. Response Scales – The Difference Between Unipolar and Bipolar Scales.

One common response format in surveys is a response scale. A survey question that uses a response scale is a closed-ended question that presents respondents with a series of response options that fall along a continuum or dimension (e.g., satisfaction, amount, frequency). Response scales can be unipolar—meaning that one end of the scale represents the complete absence of or a minimum level of the dimension (i.e., a “0” point) and the other represents the maximum level of the dimension. For example, “Over the past month, how often have you felt tired? Would you say never, occasionally, sometimes, often, or always?” Other dimensions are bipolar—meaning that the two ends of the scale represent equally intense opposites and the midpoint of the scale represents a “0” or a neutral point. For example, “Do you think spending on education in the U.S. should be increased a great deal, increased some, increased a little, stay at current levels, decreased a little, decreased some, or decreased a great deal?” (see Survey News Bulletin #9 under “Questionnaire Design” for a discussion of using branching in these types of bipolar questions).

Some constructs are inherently unipolar (e.g., quantity, frequency) and should be measured using unipolar scales. Others are inherently bipolar (e.g., comparative judgments, change, evaluations of liking or quality) and should be measured using bipolar scales. Some constructs, however, can be assessed using either a unipolar or bipolar scale (e.g., satisfaction). Bipolar scales are more cognitively difficult and rely on the assumption that the two scale endpoints are opposing viewpoints on a single dimension (which might not always be the case). Unipolar scales are less cognitively difficult than bipolar scales and are much more clearly focused on a single dimension. Other aspects of response scales, such as the number and labeling of scale points for unipolar and bipolar scales will be addressed in future Survey News Bulletins.


No. 89. Response Scales—Number of Scale Points.

One question that arises when constructing survey scales is the number of scale points or response choices to offer respondents. Response scales used in questionnaires have varied from as few as 2 or 3 to as many as 101 scale points (see the feeling thermometer scales used in the American National Election Studies questionnaires). Researchers have tested the optimal number of scale points for both unipolar and bipolar scales, typically by manipulating the number of scale points and assessing the effect on data quality (e.g., reliability or validity). The preponderance of evidence suggests that 5 scale points are optimal for unipolar scales, and 7 scale points are optimal for bipolar scales (e.g., Krosnick & Fabrigar, 1997). However, others have argued that the optimal number may be as high as 10 or 11 scale points, although the evidence regarding increases in data quality obtained by increasing the number of scale points above 7 is mixed (see Krosnick and Presser, 2010, for a review). These mixed findings may result because the effect of adding more points to a scale may vary across individuals and contexts. For example, the benefits of including more scale points might depend on factors such as how able respondents are to make more fine-grained judgments, whether scale points are fully verbally labeled, and individual differences in ability and motivation to provide optimal responses to survey questions.

Krosnick, J. A., & Fabrigar, L. R. (1997). Designing rating scales for effective measurement in surveys. In L. Lyberg, P. Biemer, M. Collins, L. Decker, E. DeLeeuw, C. Dippo, N. Schwarz, and D. Trewin (Eds.), Survey Measurement and Process Quality. New York: Wiley-Interscience.

Krosnick, J. A., & Presser, S. (2010). Questionnaire design. In J. D. Wright & P. V. Marsden (Eds.), Handbook of Survey Research (Second Edition). West Yorkshire, England: Emerald Group.


No. 90. Response Scales—Labeling.

Survey questions that use a response scale format present respondents with a series of response options that fall along a continuum or dimension (e.g., satisfaction, amount, or frequency). The last two Survey News Bulletins have dealt with the difference between unipolar and bipolar response scales and the optimal number of scale points they should have (see Bulletins No. 88 and 89 in the Questionnaire Design category).

In addition, researchers need to make decisions about how to label the various points on such scales. Particularly in the case of visually presented scales (such as in web or mail surveys), there are several options for how to label scale points. Scale points can be labeled using numbers only, verbal labels only, or a combination. Scale endpoints alone can be labeled, scale endpoints and the midpoint can be labeled, or all scale points can be labeled. Because numeric values may be interpreted in ways not intended by the researcher (e.g., Schwarz et al., 1991), some researchers advocate using only verbal labels (Krosnick & Presser, 2010). However, if both verbal labels and numbers are being used, it is better to match verbal labels and numerical labels—for example to use negative values for the negative side of a bipolar scale and to use values from 0 to a positive number for a unipolar scale (Saris & Gallhofer, 2007). Several studies have also shown that scales with all points labeled (i.e., fully-labeled scales) provide more reliable data than do scales with labels on only some points (i.e., partially labeled scales; Alwin, 2007; Krosnick & Fabrigar, 1997; Saris & Gallhofer, 2007).

Alwin, D. F. (2007). Margins of error: A study of reliability in survey measurement. New York: John Wiley & Sons.

Krosnick, J. A., & Fabrigar, L. R. (1997). Designing rating scales for effective measurement in surveys. In L. Lyberg, P. Biemer, M. Collins, L. Decker, E. DeLeeuw, C. Dippo, N. Schwarz, and D. Trewin (Eds.), Survey Measurement and Process Quality. New York: Wiley-Interscience.

Krosnick, J. A., & Presser, S. (2010). Questionnaire design. In J. D. Wright & P. V. Marsden (Eds.), Handbook of Survey Research (Second Edition). West Yorkshire, England: Emerald Group.

Saris, W. E., & Gallhofer, I. N. (2007). Design, evaluation, and analysis of questionnaires for survey research. New York: John Wiley & Sons.

Schwarz, N., Knauper, B., Hippler, H. J., Noelle-Neumann, E., & Clark, L. (1991). Rating scales: Numeric values may change the meaning of scale labels. Public Opinion Quarterly, 55(4), 570–582.


No. 91. Response Scales – Including a Midpoint.

The last several bulletins have dealt with the use of response scales in survey questions. In this Survey News Bulletin we discuss whether or not to include a midpoint in response scales. Bulletin No. 89 (see the Questionnaire Design section of bulletins) recommends using 5 scale points for unipolar scales and 7 for bipolar scales, both of which include a midpoint. The primary argument for including a midpoint, particularly in bipolar scales, is that some respondents legitimately have a neutral position and should not be forced to choose an option that indicates that their position is closer to one end of the scale than the other (e.g., Schuman and Presser, 1981). However, some researchers have expressed concerns about providing a midpoint for response scales, particularly for bipolar ones. Specifically, researchers have expressed concerns that respondents might select the midpoint of a scale not because that response accurately reflects their answer to the question, but as a way of avoiding giving a potentially unfavorable response (i.e., social desirability), as a strategy for quickly providing an answer to the question without carefully constructing a response (i.e., satisficing; Krosnick, 1991), or as a way of saying “don’t know.”

Although Schuman and Presser (1981) found that there is somewhat less midpoint responding when a “don’t know” response is explicitly offered than when one is not explicitly offered, the bulk of the empirical evidence suggests that providing a midpoint does not have a negative effect on data quality and that respondents who select a midpoint do so because it best reflects their true answer. Narayan and Krosnick (1996) found that midpoint selection was not associated with cognitive abilities as one would expect if it were the result of satisficing. Malhotra, Krosnick, and Thomas (2009) found that using a follow-up question asking respondents who selected the midpoint whether they leaned toward one end of the scale or the other lowered data quality. Both these findings suggest that respondents do not select the midpoint because they are unable or unmotivated to do the cognitive work necessary to answer the question carefully and completely. Finally, Weijters, Cabooter, and Schillewaert (2010) found that including a midpoint reduces the likelihood that respondents will give contradictory responses to two items that have opposite meaning or are coded in opposite directions (e.g., agreeing - or disagreeing - with two statements that are the opposite of one another) and recommend using scales with a midpoint unless there are compelling reasons not to do so.

Krosnick, J. A. (1991). Response strategies for coping with the cognitive demands of attitude measures in surveys. Applied Cognitive Psychology, 5, 213-236.

Malhotra, N., Krosnick, J. A., & Thomas, R. K. (2009). Optimal design of branching questions to measure bipolar constructs. Public Opinion Quarterly, 71(3), 304-324.

Narayan, S., & Krosnick, J. A. (1996). Education moderates some response effects in attitude measurement. Public Opinion Quarterly, 60(1), 58-88.

Schuman, H. & Presser, S. (1981). Questions and answers in attitude surveys: Experiments in question form, wording, and content. New York: Harcourt Brace Jovanovich.

Weijters, B., Cabooter, E., & Schillewaert, N. (2010). The effect of rating scale format on response styles: The number of response categories and response category labels. International Journal of Research in Marketing, 27(3), 236-247.


No. 103. Instrument Validation: What is it and What Does it Mean?

In SNB #83, issues to consider when using questions from previous survey questionnaires were addressed. In this bulletin, we want to discuss the issue of validating measures. Validating a measure involves a series of steps to assess its construct validity, or the extent to which it measures the construct researchers believe it does. It is more typically done for health-related measures (e.g., Torke et al., 2017), but has also been used to develop measures of constructs like political knowledge (e.g., Delli Carpini & Keeter, 1996). It typically involves collecting pretest data from a sample of the target population for which the instrument is being validated, along with other measures used to assess convergent, divergent, and predictive validity (see below).

There are a number of approaches to validating an instrument. These include assessing its face validity (i.e., To what extent do the questions seem to be assessing the intended construct?); comparing the measure to objective criteria when possible (e.g., Does a self-report of whether or not a respondent voted match data about turnout from election records?); assessing the measure’s convergent validity (i.e., To what extent does the measure correlate with measures it should be associated with?); assessing the measure’s divergent validity (i.e., To what extent does the measure not correlate with measures it should not be associated with?); assessing the measure’s predictive validity (i.e., To what extent does the measure predict – or to what extent is it predicted by – theoretically predicted antecedents or consequences of the construct?); assessing whether the measure reflects a single underlying construct (e.g., using factor analysis); and assessing the reliability of the measure (e.g., by calculating an alpha reliability coefficient). Not all these steps are used to validate every measure, but validation typically involves showing validity using multiple approaches.
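As one concrete piece of this process, the alpha reliability coefficient mentioned above can be computed directly from respondent-by-item data. A minimal sketch (in practice one would use a statistics package):

```python
def cronbach_alpha(items):
    """Cronbach's alpha for a multi-item measure.
    items: one list of responses per questionnaire item, aligned by
    respondent, i.e. items[i][r] is respondent r's answer to item i."""
    k = len(items)
    n_resp = len(items[0])
    def var(xs):  # sample variance
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)
    # Each respondent's total score across the k items
    totals = [sum(item[r] for item in items) for r in range(n_resp)]
    return (k / (k - 1)) * (1 - sum(var(item) for item in items) / var(totals))

# Three respondents answering two perfectly consistent items -> alpha = 1.0
print(cronbach_alpha([[1, 2, 3], [1, 2, 3]]))
```

Convergent and divergent validity assessments are analogous computations (correlations with other measures) rather than a single coefficient.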

It is important to understand, however, that a measure that has been validated with one population or in one language cannot simply be used with a different population or translated into a different language. Although less often studied, the validity of a measure may also be affected by other aspects of the research context, such as the mode in which the measure is administered (e.g., completed in private by respondents on a paper-and-pencil instrument versus responding to an interviewer reading the questions in a face-to-face or telephone interview) or even the other measures or questions included in the broader survey.

In sum, establishing a measure’s validity requires considerable effort and such validation is applicable only in the context in which the measure has been validated. It is insufficient to claim that a measure has been “validated” somewhere, at some time, with some unspecified population. The burden of proof is always on the researcher to demonstrate that measures have been carefully selected and are appropriate for the context in which they will be used.

For additional information and examples, see:

Delli Carpini, M. X. & Keeter, S. (1996). What Americans know about politics and why it matters. New Haven: Yale University Press.

Gandek, B. & Ware, J. E. (1998). Methods for validating and norming translations of health status questionnaires: The IQOLA project approach. Journal of Clinical Epidemiology, 51(11), 953-959.

Torke, A. M., Monahan, P., Callahan, C. M., Helft, P. R., Sachs, G. A., Wocial, L. D., Slavens, J. E., Montz, K., Inger, L., & Burke, E. S. (2017). Validation of the Family Inpatient Communication Survey. Journal of Pain and Symptom Management, 53(1), 96-108.


No. 113. Federal Standards for Measuring Race & Ethnicity.

Race and ethnicity are commonly recorded in surveys conducted in the U.S. Ironically, there is no generally agreed-upon standard for measuring these important, and evolving, social constructs. One existing standard was established by the U.S. government’s Office of Management and Budget in 1977 and revised in 1997. These standards were designed to ensure consistency in reporting of race and ethnicity as part of efforts to monitor equal protection and civil rights compliance.

This approach requires separate questions to measure racial vs. ethnic identity. Here, race is classified into five groups: American Indian or Alaska Native, Asian, Black or African American, Native Hawaiian or Other Pacific Islander, and White. In addition, respondents are provided the opportunity to self-identify with more than one racial category. The ethnicity question is designed to be asked before the race question and is used to classify persons as to whether or not they are of Hispanic or Latino origin.

This basic classification scheme for race and ethnicity continues to be used today in federal statistical surveys, and by many other researchers. While useful for many purposes, there are several concerns with this approach to measuring these constructs. For many respondents, there remains confusion regarding the differences between measures of race and ethnicity. In addition, these items restrict concern with ethnicity to Hispanic vs. non-Hispanic status only, while the concept of ethnicity is generally viewed as having a far broader meaning. Non-federal researchers, of course, are free to employ other measures of race and ethnicity that may be more appropriate to their specific research needs. A future News Bulletin will review some of those approaches.

For More Information:

Federal Register. (1997). Revisions to the Standards for the Classification of Federal Data on Race and Ethnicity. Accessed at:

Research to Improve Data on Race and Ethnicity. (2017) Accessed at:


No. 115. Asking One Question at a Time Part 1: Double-Barreled Survey Questions.

SNB#114 described the CASM perspective on the process of answering survey questions. One implication of this perspective is that it is important to minimize cognitive burden in designing survey questionnaires. One way to do so is to write survey questions that ask about only a single construct. A common pitfall of survey writers is to include double-barreled questions such as the following:

Do you favor or oppose increased government spending on public education and Medicaid?

There are two constructs being measured in this example: support for increasing spending on education and support for increasing spending on Medicaid. If a respondent favors increased spending on education but opposes increased spending on Medicaid, which response option will s/he select? The construct validity of the question as written is very poor, because you do not know which construct the answer is measuring.

Instead, you will need to ask these as two separate questions:

  • Do you favor or oppose increased government spending on public education?
  • Do you favor or oppose increased government spending on Medicaid?

Doing so avoids any potential confusion or frustration from respondents who have different opinions about spending on public education and Medicaid, minimizes respondents’ cognitive burden by only asking them to answer one question at a time, and improves the construct validity of the measures.

For more information, see:

Olson, K. (2008). Double-barreled question. Encyclopedia of Survey Research Methods. Thousand Oaks: Sage.


No. 116. Asking One Question at a Time Part 2: Introducing a New Construct in Response Options.

In SNB #115, we reviewed double-barreled questions and the importance of limiting questions to ask about only one construct. A corresponding question-writing pitfall is to ask about a single construct in the question, but introduce another concept in the response options. For example:

Do you have Internet access at home?

  • Yes, DSL
  • Yes, Cable
  • Yes, Dial-up
  • Yes, some other form of access (please describe: __________________________)
  • No

Here the question itself asks about a single construct - whether the respondent has Internet access at home. The response options introduce an additional construct – the type of Internet access. This question and its response options may confuse respondents, both because there is a disconnect between what the question stem and the response options ask them to do (i.e., simply reporting whether or not they have Internet access versus also reporting the type of access they have) and because the item asks about two constructs in a single question. It also does not account for the possibility that a respondent has more than one type of Internet access at home.

Instead, it would be better to first ask a question about whether or not the respondent has Internet access (avoiding Yes/No questions in favor of fully balanced questions as recommended in SNB#21), and then ask a follow-up question to those who answer yes:

Do you have Internet access at home or do you not have access there?

  • Do have Internet access at home
  • Do NOT have Internet access at home

What type of Internet access do you have at home? Please check all that apply:

  • DSL
  • Cable
  • Dial-up
  • Some other type of access (please describe: _____________________________)

These two questions separate the two judgments respondents are being asked to make and simplify respondents’ task by asking just one question at a time, minimizing respondent burden.


For more information, see:

Olson, K. (2008). Double-barreled question. Encyclopedia of Survey Research Methods. Thousand Oaks: Sage.


No. 118. Explicitly Offered "Don’t Know" or "No Opinion" Responses in Survey Questions.

When designing a survey, a survey researcher must decide whether to include or omit an explicitly offered “don’t know” option. Explicitly offered “don’t know” responses are those that are read or shown explicitly to respondents. These are just one of a set of possible response choices such as “no opinion” or “do not have enough information” that reflect a lack of substantive response to the question. Sometimes these response choices are offered in a list with other response options and sometimes this option is explicitly provided to respondents in a filter question before a substantive question (e.g., “Do you have an opinion about taxing imports into the U.S.?”).

Nonsubstantive response options such as “don’t know” or “no opinion” are challenging for researchers because respondents who give these answers are often eliminated from statistical analysis. This can reduce the power of statistical analyses, particularly for multivariate analyses if respondents with nonsubstantive responses to any of multiple variables are eliminated from the statistical analysis. As a result, researchers typically want to avoid having substantial numbers of nonsubstantive responses.

Even when such an option is omitted, respondents can volunteer these options in interviewer-administered modes or leave questions blank in self-administered questionnaires in order to indicate a “don’t know” response. Some researchers have argued, however, that explicitly including such a response (either as a response choice or as a filter) is important because it provides a valid answer choice for survey respondents who really do not have an opinion or those who hold what are sometimes called “nonattitudes” (e.g., Converse 1964). However, research on survey satisficing (see SNB #117) suggests that selecting an explicitly offered “don’t know” option does not reflect true nonattitudes for many respondents, but instead is a strategy taken by respondents to provide a reasonable answer to the survey question without going through the cognitive effort necessary to answer it carefully (Krosnick, 1991).

Consistent with this latter perspective, not only do more people select an explicitly offered “don’t know” response option than volunteer one when it is omitted, this effect is larger under the conditions thought to foster satisficing: among respondents with the fewest cognitive skills, for questions asked later in a survey, and among respondents who report that they did not answer carefully (e.g., Krosnick et al. 2002). Furthermore, when respondents who initially select “don’t know” when it is explicitly offered are asked in a follow-up question to pick the option they lean toward, their responses are nonrandom and predictably related to their characteristics and other opinions (e.g., Krosnick et al. 2002). As a result, current best practice is to generally omit explicit “don’t know” response options from both attitude (Krosnick et al. 2002) and knowledge (Mondak and Davis 2001) questions.

For additional information, see:

Converse, P. E. (1964). The nature of belief systems in mass publics. In D. E. Apter (Ed.), Ideology and discontent. New York: Free Press.

Krosnick, J. A. (1991). Response strategies for coping with the cognitive demands of attitude measures in surveys. Applied Cognitive Psychology, 5, 213–36.

Krosnick, J. A., Holbrook, A. L., Berent, M. K., Carson, R. T., Hanemann, W. M., Kopp, R. J., Mitchell, R. C., Presser, S., Ruud, P. A., Smith, V. K, Green, M. C., & Conaway, M. (2002). The impact of “no opinion” response options on data quality: Non-attitude reduction or an invitation to satisfice? Public Opinion Quarterly, 66(3), 371-403.

Mondak, J. J. & Davis, B. C. (2001). Asked and answered: Knowledge levels when we will not take “don’t know” for an answer. Political Behavior, 23(3), 199-224.


No. 119. Response Order Effects.

SNB #117 described satisficing theory (Krosnick, 1991), which suggests that when respondents are not able or motivated to answer questions carefully, they look for shortcuts or easy ways to provide acceptable answers without completing the cognitive work necessary to do so. One way that satisficing may manifest itself in surveys is through response order effects. A response order effect occurs when the order in which response options are listed in a survey question affects the distribution of answers that are given by respondents. Two types of response order effects occur. Primacy effects occur when response options are selected more when presented at the beginning of a list of response options than when presented at the end of the list. In contrast, recency effects occur when response options are selected more when presented at the end of a list of response options than when presented at the beginning.

Response order effects occur as a result of satisficing when respondents select the first reasonable response option that they consider rather than fully considering all the possible responses. Whether recency or primacy effects occur depends on at least two factors. In questions with categorical response options, primacy effects predominate when response options are presented visually (e.g., in self-administered questionnaires or when showcards are used) because respondents begin by considering the first response option in the list and then move on to the next. If respondents satisfice and choose the first acceptable option, they are biased toward selecting those near the beginning of the list. However, when response options are presented aurally, listening to the response options interferes with processing them, so respondents are more likely to wait until all the response options have been read before considering any of them, and they are more likely to begin by considering options read last. In this case, recency effects result. In questions that use scale response formats (i.e., where the response options fall along a dimension), however, primacy effects are found regardless of mode because respondents do not need to listen to the whole list of response options to understand the judgment they are being asked to make (i.e., they can infer what later response options are likely to be from hearing earlier ones). Response order effects may also be influenced by other factors like the linguistic structure of the question (e.g., Holbrook et al., 2007) and are more likely under the conditions thought to foster satisficing (e.g., among respondents with few cognitive skills).

Response order effects can only be assessed using random assignment to response order – in other words, it is only possible to assess whether response order has an effect by comparing data from questions using different response option orders. This is most often done between subjects (i.e., different respondents receive the response options in different orders), but can also be done within respondents (i.e., the same respondent receives response options in different orders for different questions). For scales, this can be done easily by randomly assigning some respondents to receive the scaled responses in one order and other respondents to receive them in the opposite order. However, when a list of categorical response options is long, the number of possible orders is large and it may be impractical to include all of them in an experimental design. As a result, it is common under these circumstances to use a subset of the possible response option orders rather than a full rotation (e.g., for a question with five response options, the orders 1-2-3-4-5, 2-3-4-5-1, 3-4-5-1-2, 4-5-1-2-3, and 5-1-2-3-4, which place each response option in each of the five serial positions).
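
The cyclic rotation design described above can be generated programmatically. Below is a hypothetical sketch in Python (the option labels and assignment step are illustrative, not from the bulletin):

```python
# Hypothetical sketch: build the k cyclic orders for a k-option question
# and randomly assign a respondent to one of them.
import random

def cyclic_orders(options):
    """Return one order per option, each starting at a different position."""
    k = len(options)
    return [options[i:] + options[:i] for i in range(k)]

options = ["1", "2", "3", "4", "5"]  # placeholder response option labels
orders = cyclic_orders(options)
for order in orders:
    print("-".join(order))  # 1-2-3-4-5, 2-3-4-5-1, 3-4-5-1-2, ...

# Each respondent is randomly assigned one of the k orders, so across
# respondents every option appears in every serial position.
respondent_order = random.choice(orders)
```

Note that this design covers only k of the k! possible orders; it balances serial position but not which options precede which others.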

If a researcher is concerned about response order effects, s/he should consider incorporating response order experiments in his or her survey. This allows the researcher to assess whether response order is associated with responses and to control for the effects of response order if needed.

For more information, see:

Holbrook, A. L., Krosnick, J. A., Moore, D., & Tourangeau, R. (2007). Response order effects in dichotomous categorical questions presented orally: The impact of question and respondent attributes. Public Opinion Quarterly, 71(3), 325-348.

Krosnick, J. A. (1991). Response strategies for coping with the cognitive demands of attitude measures in surveys. Applied Cognitive Psychology, 5, 213–36.

Krosnick, J. A. & Alwin, D. F. (1987). An evaluation of a cognitive theory of response-order effects in survey measurement. Public Opinion Quarterly, 51(2), 201-219.


No. 122. Issues to Consider When Presenting Response Options Visually.

When designing web and mail surveys, it is important to consider the visual presentation of response options as part of the design (response options are sometimes presented visually on showcards in interviewer-administered surveys as well, particularly in face-to-face interviews).

Many aspects of visual presentation of both categorical and scale response options can affect survey responses. As reviewed in SNB#119, primacy effects can occur when either scale or categorical response options are presented visually. Consistent with this, research that measures eye movement among respondents shows that they do, as often theorized, spend more time with their eyes on the first several response options when presented with a list (Galesic, Tourangeau, Couper, & Conrad, 2008).

Other experiments show that response options such as “No opinion”, “Don’t know” or “Not applicable” (nonsubstantive response options) are considered to be part of the scale order by respondents unless they are visually separated or distinguished in some way from the substantive options. Respondents may include those nonsubstantive options in the scale when looking for a visual midpoint. For instance, in the example below, respondents may consider the scale to have 6 rather than 5 response options, and thus think of the midpoint as between “somewhat” and “slightly” effective (Tourangeau, Couper, & Conrad, 2004).

  • Very effective
  • Moderately effective
  • Somewhat effective
  • Slightly effective
  • Not at all effective
  • No opinion

Other researchers have found that the way response options are presented (e.g., double- or triple-banking response options rather than presenting them in a single list or having uneven spacing between response options) can also affect the distribution of responses (e.g., Christian & Dillman, 2004). Similarly, grouping long lists of response options into categories may also influence the distribution of responses selected and the extent to which a respondent successfully completes the task of selecting a response option (e.g., Smyth, Dillman, Christian, & Stern 2006).

Recommendations for presenting response options include randomly varying the position of response options within a list (while preserving any expected ordering, such as a scale's natural order), and separating the nonsubstantive response options from the substantive ones with a space between them.

For more information, see:

Christian, L. M. & Dillman, D. A. (2004). The influence of graphical and symbolic language manipulations on responses to self-administered questions. Public Opinion Quarterly, 68(1), 57-80.

Smyth, J. D., Dillman, D. A., Christian, L. M., & Stern, M. J. (2006). Effects of using visual design principles to group response options in web surveys. International Journal of Internet Science, 1(1), 6-16.

Tourangeau, R., Couper, M. P., & Conrad, F. (2004). Spacing, position, and order: Interpretive heuristics for visual features of survey questions. Public Opinion Quarterly, 68(3), 368–393.

Galesic, M., Tourangeau, R., Couper, M. P., & Conrad, F. G. (2008). Eye-tracking data: New insights on response order effects and other cognitive shortcuts in survey responding. Public Opinion Quarterly, 72(5), 892–913.


No. 123. Single-Item Measures.

Single-item measures use a single question to measure a construct of interest (e.g., well-being). This is a useful and attractive approach to obtaining information in surveys: single-item measures are easy to use, inexpensive and quick to administer, and respondent time and attention are at a premium in surveys. Single-item (SI) measures can be a valuable tool against “over-surveying” given declining response rates, and they are popular for their speed, parsimony, face validity, and flexibility, as they can be more readily adapted to different populations. SI measures are appropriate for comparative studies and can assess change over time in a straightforward way. They are prevalent in marketing research and are gaining ground as a cost-efficient tool in large companies, and also in clinical settings, where patients need to provide feedback briefly and accurately.

However, single-item measures can be problematic for measuring complex concepts that require understanding of rich and detailed subject matter. Some questions may also be too vague and require a series of additional questions to support an adequate conclusion: for example, the assessment of depression or emotional functioning. Single-item measures are vulnerable to random measurement error, and their internal-consistency reliability cannot be assessed; only test–retest reliability can be applied. Multi-item measures, by contrast, have higher reliability, validity, and precision, allow the imputation of missing values, and provide more thorough descriptions.

Ultimately, whether to use single-item measures depends on several factors, including sample size, effect sizes, and the homogeneity of the items. Single-item measures are best suited to exploratory research situations with relatively small samples, weak cross-item correlations, highly homogeneous items (in an internal-consistency sense), and a concrete construct.

Also see:

Bergkvist, L., & Rossiter, J. R. (2007). The predictive validity of multiple-item versus single-item measures of the same constructs. Journal of Marketing Research, 44(2), 175–184.

Bergkvist, L., & Rossiter, J. R. (2009). Tailor-made single-item measures of doubly concrete constructs. International Journal of Advertising, 28(4), 607–621.

Diamantopoulos, A., Sarstedt, M., Fuchs, C., Wilczynski, P., & Kaiser, S. (2012). Guidelines for choosing between multi-item and single-item scales for construct measurement: A predictive validity perspective. Journal of the Academy of Marketing Science, 40(3), 434-449.

Leung, S. O. & Xu, M. L. (2013). Single-item measures for subjective academic performance, self-esteem, and socioeconomic status. Journal of Social Service Research, 39(4), 511-520. DOI: 10.1080/01488376.2013.794757

Sloan, J. A., Aaronson, N., Cappelleri, J. C., Fairclough, D. L., & Varricchio, C. (2002). Assessing the clinical significance of single items relative to summated scores. Mayo Clinic Proceedings, 77(5), 479-487.


No. 125. Select-All-That-Apply Questions.

A select-all-that-apply question (also referred to as a checkbox question) is one in which the respondent is presented with multiple options and is asked to check all options that apply to him/her. An example of such a question is:

Which of the following Harry Potter books have you read? Select all that apply:

  • ☐ The Philosopher's Stone
  • ☐ The Chamber of Secrets
  • ☐ The Prisoner of Azkaban
  • ☐ The Goblet of Fire
  • ☐ The Order of the Phoenix
  • ☐ The Half-Blood Prince
  • ☐ The Deathly Hallows
  • ☐ None of the above

While the question stem in the above example suggests the respondent can pick more than one item from the list, it is best to explicitly state that respondents should select all that apply.

In addition, while one could infer that if the respondent does not select any of the items on the list, then she/he has not read any Harry Potter books, providing a “none of the above” option minimizes the ambiguity that might result if a respondent does not select any of the options. If a respondent does not check an item, it could mean he/she overlooked the item, does not remember, or did not read the book. Placing an explicit “none of the above” option makes it clear that unchecked options are intentional.

A checkbox or select-all question results in as many variables as there are response options. In a typical Web survey, each option has a value of 1 if the respondent checked it and is missing if the respondent did not. To include these items in data analysis, the missing values should be recoded to a valid value, such as zero or two.
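
As a hypothetical illustration of this recoding step (the variable names are invented), in a raw web-survey export each checkbox variable is 1 if checked and missing otherwise, and analysis requires an explicit "not selected" code:

```python
# Hypothetical sketch: recode missing checkbox values to 0 so that
# each select-all variable has a valid value for every respondent.
rows = [
    {"read_book1": 1,    "read_book2": None, "read_book3": 1},
    {"read_book1": None, "read_book2": None, "read_book3": 1},
]

checkbox_vars = ["read_book1", "read_book2", "read_book3"]
for row in rows:
    for var in checkbox_vars:
        if row[var] is None:   # unchecked option exported as missing
            row[var] = 0       # recode to a valid "not selected" value

print(rows[1])  # {'read_book1': 0, 'read_book2': 0, 'read_book3': 1}
```

Without this recoding, a respondent who checked nothing would be dropped from analyses as entirely missing rather than counted as having selected no options.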

It is common to ask questions such as this using a select-all-that-apply format in self-administered survey modes (e.g., mail or web), but to ask the same question as a series of forced choice questions (often yes/no) in interviewer-administered modes (e.g., telephone or in-person), with the assumption that the two processes result in similar data. However, researchers have argued that the processes of answering the two types of questions are quite different (Sudman & Bradburn, 1982). Several studies comparing the two strategies in Web surveys have found that respondents given the forced choice format take longer to answer and endorse more response options than respondents given a select-all-that-apply format (Smyth et al., 2006). Additional evidence suggests that select-all-that-apply lists are often affected by order effects, whereby respondents are more likely to select answers presented early in the list of response options than those presented later, particularly among respondents who spend little time on the task (Smyth et al., 2006).

As a result, we recommend using select-all-that-apply questions sparingly and instead using a series of forced choice questions. Because of concerns about acquiescence response bias in yes/no questions (see SNB#21), we recommend the use of balanced forced choices. For the example above, one might rewrite the question to read:

For each of the Harry Potter books listed below, please indicate whether you have read the book or have NOT read it.

                              Have read    Have NOT read
The Philosopher's Stone           ☐              ☐
The Chamber of Secrets            ☐              ☐
The Prisoner of Azkaban           ☐              ☐
The Goblet of Fire                ☐              ☐
The Order of the Phoenix          ☐              ☐
The Half-Blood Prince             ☐              ☐
The Deathly Hallows               ☐              ☐

The above format requires respondents to consider and make a judgment about each of the books while simultaneously avoiding acquiescence response bias.

For more information see:

Callegaro, M., Manfreda, K. L., & Vehovar, V. (2015). Web Survey Methodology. New York: Sage.

Smyth, J. D., Dillman, D. A., Christian, L. M., & Stern, M. J. (2006). Comparing check-all and forced-choice question formats in web surveys. Public Opinion Quarterly, 70(1), 66-77.

Sudman, S. & Bradburn, N. M. (1982). Asking Questions. San Francisco: Jossey-Bass.


Web Survey Design

No. 1. Effect of item position on ratings in Web surveys.

A recent article suggests that items shown higher up on the screen in a Web survey are rated more positively than the same items presented lower on the screen. The studies manipulated screen position in several different ways (e.g., showing the target above or below the rating scale, rotating two items to be shown at the top or bottom of the screen) and also rotated other characteristics of the question (e.g., how familiar the target was, the rating scale order). A meta-analysis of all six studies showed that the effect was homogeneous and reliable across studies.

Tourangeau, R., Couper, M. P., & Conrad, F. G. (2013). "Up means good": The effect of screen position on evaluative ratings in Web surveys. Public Opinion Quarterly, 77, 69-88.


No. 5. Web survey panel conditioning.

Web surveys conducted with panels of respondents are increasingly popular. One concern with using panels of respondents is that "experienced" respondents may somehow be substantially different from "inexperienced" ones. However, several recent articles suggest that panel conditioning does not affect substantive survey responses and that "experienced" respondents may actually give higher quality responses.

Binswanger, J., Schunk, D., & Toepoel, V. (2013). Panel conditioning in difficult attitudinal questions. Public Opinion Quarterly, 77, 783-797.
Dennis, J. M. (2001). Are Internet panels creating professional respondents? Marketing Research, 13, 34-38.


No. 7. Collecting data on mobile devices while offline.

Software packages that allow respondents to complete a survey questionnaire while online have been available for some time now. These packages are usually relatively inexpensive and offer menu-driven programming and sample management options. Owing to the proliferation of mobile devices, the software also allows programmed questionnaires to be displayed on and answered using mobile devices such as smartphones and tablets. A recent feature being incorporated into these packages is the completion of survey questionnaires while offline using mobile devices. The data are collected offline and can be uploaded to the server once a connection is available. This feature might come in handy, for example, when collecting data from students in a classroom where wireless Internet access is not available, or from participants at events such as fairs, without the need to set up kiosks or booths.


No. 13. Using grids in Web surveys.

In Web surveys, respondents often face the task of answering a series of questions using the same rating scale, presented in a grid or matrix format, with the items along the rows and the rating scale points in the columns. The use of this grid/matrix format might increase the tendency of respondents to nondifferentiate (i.e., to select the same or similar response to all items in the series to minimize cognitive effort while responding). Research has found higher correlations among items presented in this format, which is consistent with the occurrence of nondifferentiation. Also, reverse-worded items presented in this format were more likely to elicit an opposite-than-expected response, suggesting that respondents were not devoting cognitive effort to providing optimal answers. However, research also suggests that there are lower rates of missing data in grids, and grids take less time to complete than when the same items are presented on separate Web pages.

Tourangeau, R., Couper, M. P., & Conrad, F. (2004). Spacing, position, and order: Interpretive heuristics for visual features of survey questions. Public Opinion Quarterly, 68, 368-393.
Couper, M. P., Traugott, M. W., & Lamias, M. J. (2001). Web survey design and administration. Public Opinion Quarterly, 65, 230-253.
Toepoel, V., Das, M., & Van Soest, A. (2008). Effects of design in Web surveys comparing trained and fresh respondents. Public Opinion Quarterly, 72, 985-1007.


No. 16. Progress indicators can increase or decrease breakoffs in Web surveys.

Progress indicators are designed to provide Web survey respondents with continual feedback regarding how much of the questionnaire they have completed as they work their way through an online instrument. Providing this feedback is intended to minimize respondent breakoffs before completing the questionnaire by providing information regarding progress and a sense of accomplishment. Experimental evidence suggests that progress indicators will decrease breakoffs when the feedback is perceived as encouraging (i.e., for short questionnaires), but they can also increase breakoffs when perceived as discouraging (i.e., for long questionnaires). Consequently, it may be best to display progress indicators only intermittently when deploying long questionnaires, for which progress will be made at a slower pace than for shorter questionnaires. For more information, see

Conrad, F. G., Couper, M. P., Tourangeau, R., & Peytchev, A. (2010). Impact of progress indicators on task completion. Interacting with Computers, 22, 417-427.
Yan, T., Conrad, F. G., Tourangeau, R., & Couper, M. P. (2011). Should I stay or should I go: The effects of progress feedback, promised task duration, and length of questionnaire on completing Web surveys. International Journal of Public Opinion Research, 23, 131-147.


No 17. Web survey questionnaires: How information is presented may affect processing.

When designing Web questionnaires, an important caveat to remember is that choices about the visual presentation of information may be processed and used by respondents. As a result, aspects of formatting or design such as colors, fonts, and images must be chosen carefully so as not to provide respondents with unintentional cues. For example, an experiment by Couper, Conrad, and Tourangeau (2007) documented how the inclusion of images of a person exercising vs. lying in a hospital bed can influence self-rated health by serving as a frame for personal comparison. Respondents viewing an image of a person actively exercising rated their personal health lower than did those viewing an image of a person in a hospital bed. Respondents also use formatting and layout information in forming their judgments in other survey modes (e.g., self-administered mail or telephone or in-person interviews), but the myriad formatting and design choices in Web surveys make this a particular concern in this mode.

Couper, M. P., Conrad, F. G., & Tourangeau, R. (2007). Visual context effects in Web surveys. Public Opinion Quarterly, 71, 91-112.


No. 26. Visual context effects and verbal instructions.

One of the advantages of Web surveys is that pictures and other media can be used. However, pictures can also systematically influence respondents' answers to survey questions. For example, Witte et al. (2004) found in a National Geographic survey that images increased support for species protection. Similarly, Couper et al. (2007) found that when a picture of a fit person was shown with a question about respondents' health, respondents reported consistently lower health than when the same question was shown with a picture of a sick person. Toepoel and Couper (2011) found that respondents reacted to the content of images shown, giving higher-frequency reports when pictures of high-frequency events were shown and lower-frequency reports when pictures of low-frequency events were shown. In this study, the effects of pictures on survey responses were similar to assimilation effects found with verbal instructions (i.e., ratings became more similar or were "assimilated" to the context). Verbal and visual cues had independent effects and also interacted. Verbal instructions had stronger effects, were attended to first (before pictures), and took longer to process than did pictures. The effects of verbal instructions could counteract the effect of pictures when both were present and contradictory. This suggests that survey respondents pay attention to verbal instructions more than visual cues such as pictures and that good question writing with clear instruction can reduce context effects from visual cues.

Also see: Couper, M. P., Conrad, F. G., & Tourangeau, R. (2007). Visual context effects in Web surveys. Public Opinion Quarterly, 71, 623-634.

Toepoel, V., & Couper, M. P. (2011). Can verbal instructions counteract visual context effects in Web surveys? Public Opinion Quarterly, 75, 1-18.

Witte, J., Pargas, R., Mobley, C., & Hawdon, J. (2004). Instrument effects of images in Web surveys. Social Science Computer Review, 22, 1-7.


No. 29. New response formats: Slider bars.

Slider bars are a type of response format used in Web surveys in which respondents are asked to slide a marker along a continuum to indicate their response. For example, respondents might be asked to report their attitudes toward a target object on a continuum labeled with "strongly like" at one end and "strongly dislike" at the other. The hypothesized advantages of sliders are that they are more enjoyable for respondents (in line with the notion of gamification of the survey response process) and that they allow respondents to choose a response anywhere along the continuum, unlike traditional scales with a fixed number of response options. However, research investigating sliders suggests that they take longer to complete than traditional radio buttons or visual analog scales, in which respondents are asked to click on a point along a continuum rather than dragging and dropping a marker (as they are asked to do in a slider). Furthermore, sliders may reduce response rates, particularly on mobile devices, and may be difficult to use across a wide range of mobile devices. Because the distribution of responses obtained is similar to that obtained from more traditional radio buttons, there is little advantage to using slider bars in Web surveys. Current best practices suggest avoiding this response format.

See also Couper, M. P., Tourangeau, R., Conrad, F. G., & Singer, E. (2006). Evaluating the effectiveness of visual analog scales: A Web experiment. Social Science Computer Review, 24, 227-245.

Funke, F. (2015). Negative effects of slider scales compared to visual analogue scales and radio button scales. Social Science Computer Review. Advance online publication.


No. 65. Smartphone Use and Web Surveys.

As smartphones have become more prevalent, an increasing number of respondents may choose to complete web surveys using these mobile devices. Smartphones are different from desktop and laptop computers (and even tablets) because of their portability and small screen size. Typing a response is also very different on a smartphone because most rely on touch screens or very small keyboards. The increased use of smartphones and other mobile devices by respondents to complete web surveys has led researchers to study the effects of mobile technologies on survey responding. Responses to sensitive questions and rates of item nonresponse are similar across computers and mobile devices. However, researchers have found that completing web surveys using mobile devices can decrease the length of responses to open-ended questions and reduce the quality of responses to particular types of questions (in particular, those that require scrolling to view on mobile devices). It also takes respondents longer to complete web questionnaires when they complete them on a mobile device. This has led researchers to begin to develop web questionnaire formats that work well across device types and to collect data about the device on which a respondent completes a web survey.

For more information, see:

Buskirk, T. D., & Andrus, C. H. (2014). Making mobile browser surveys smarter: Results from a randomized experiment comparing online surveys completed via computer or smartphone. Field Methods, 26(4), 322-342.

de Bruijne, M., & Wijnant, A. (2013). Can mobile web surveys be taken on computers? A discussion on a multi-device survey design. Survey Practice, 6(4), 1-8.

Callegaro, M. (2010). Do you know which device your respondent has used to take your online survey? Survey Practice, 3(6), 1-12.

Mavletova, A. (2013). Data quality in PC and mobile web surveys. Social Science Computer Review, 31(6), 725-743.

Revilla, M., Toninelli, D., & Ochoa, C. (2016). Personal computers vs. smartphones in answering web surveys: Does the device make a difference? Survey Practice, 9(4), 1-6.


No. 66. What is an Online Survey Panel?

Online survey panels are sample frames of email addresses linked to individuals who have indicated their willingness to participate in future Web surveys. Some organizations have online panels that include tens of thousands of individuals who have volunteered or otherwise agreed to be contacted; others claim to have panels representing millions of persons. Most panels are recruited largely or exclusively through non-probability methods; a smaller number are based on probability sampling, which is more expensive and time-consuming. Researchers interested in developing representative estimates of population characteristics should avoid the use of non-probability online panels. Those considering the use of an online panel also need to look carefully at how the panel was initially recruited in order to understand whether or not it is probability-based. The degree to which those who maintain the panel are transparent regarding how it is constructed and maintained will determine the extent to which researchers are able to discern whether or not the sample is representative of the population from which it was drawn. The American Association for Public Opinion Research has released a Task Force Report on Online Panels that will be useful to researchers considering the use of an online panel for their research. This report is available free on the AAPOR website.

Additional information regarding online panels can be found in these references:

Callegaro, M., Baker, R., Bethlehem, J., Goritz, A. S., Krosnick, J. A., & Lavrakas, P. J. (2014). Online Panel Research: A Data Quality Perspective. Chichester, United Kingdom: Wiley.

Spijkerman, R., Knibbe, R., Knoops, K., Van De Mheen, D., & Van Den Eijnden, R. (2009). The utility of online panel surveys versus computer-assisted interviews in obtaining substance-use prevalence estimates in the Netherlands. Addiction, 104(10), 1641-1645.


No. 94. When should questions in a Web survey be mandatory?

Researchers concerned about item nonresponse (i.e., when respondents who participate in a survey do not answer a specific question) are often tempted to make responses in Web surveys mandatory. However, requiring respondents to answer every question is antithetical to the norm of survey participation as a voluntary effort (Couper, 2008) and can be detrimental to respondents' motivation to complete your survey. If respondents do not have an answer to your question but are forced to respond, they face two choices: quit the survey, or make up an answer (Dillman et al., 2014). Moreover, most institutional review boards will not allow your study protocol to require a response to every question. It is more common to tell potential respondents that if they elect to participate, they can skip any question they choose. Sometimes, however, a response is required in order to route respondents to the correct path of follow-up questions; for example, you need to know whether your respondent is faculty, staff, or a student in order to route them to the set of questions specific to their subgroup. In this case, it is recommended that you provide a brief instruction to let the respondent know why the response is required.
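The routing logic described above can be sketched in a few lines. This is a hypothetical illustration (the question names and role labels are invented, not from any particular survey package): a single required screener item determines which follow-up block a respondent sees, which is why that one response cannot be skipped.

```python
# Hypothetical sketch of screener-based routing in a Web survey.
# Question identifiers and roles are invented for illustration only.

FOLLOW_UP_BLOCKS = {
    "faculty": ["fac_q1_teaching_load", "fac_q2_research_support"],
    "staff": ["staff_q1_workload", "staff_q2_supervision"],
    "student": ["stu_q1_coursework", "stu_q2_advising"],
}

def route(role):
    """Return the follow-up questions for a respondent's reported role.

    The screener response is mandatory only because without it there is
    no way to choose among the role-specific blocks; every question
    inside each block can still be skipped.
    """
    if role not in FOLLOW_UP_BLOCKS:
        # A brief on-screen instruction should explain this requirement.
        raise ValueError("A response is required to route follow-up questions.")
    return FOLLOW_UP_BLOCKS[role]
```

In practice the same pattern applies regardless of survey software: mark only true routing items as required, and explain why.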

Couper, M. P. (2008). Designing effective web surveys. Cambridge: Cambridge University Press.

Dillman, D. A., Smyth, J. D., & Christian, L. M. (2014). Internet, phone, mail, and mixed-mode surveys: The tailored design method. Hoboken, NJ: John Wiley & Sons.


No. 126. Use of auto-advance features in online surveys.

Many online survey software packages provide an option that automatically advances respondents from page to page through questionnaires as they answer each question. Such auto-advance functions are believed to minimize respondent burden by reducing the number of clicks necessary to navigate through the questionnaire, and also to reduce the amount of time necessary to complete a survey task. Available experimental evidence, though, suggests potentially serious downsides to the use of automatic question advancement features. In separate studies comparing a questionnaire version employing the auto-advance feature with the traditional use of next and previous buttons for navigation, the auto-advance feature did not necessarily produce faster completion times (Arn et al., 2015) and appeared to contribute to more item missing data, presumably because some respondents skipped questions by continuing to use the next button in addition to the auto-advance feature (de Bruijne, 2015). While more research needs to be done, survey designers should consider the potential trade-offs between respondent burden and data quality when using auto-advance features in online surveys.

For more information:

Arn, Klug, & Kolodziejski (2015). Evaluation of an adapted design in a multi-device online panel: A DemoSCOPE case study. Methods, Data, Analyses, 9(2), 185-212.

De Bruijne, M. (2015). Designing web surveys for the multi-device internet. Doctoral Dissertation, Tilburg University.


Survey of Special Populations

No. 104. Surveys of Physicians.

Many surveys, such as those exploring topics of patient care, prescription drug use, health policy, and public health issues, involve physicians and/or associated health professionals as the target population of interest. While there are several sampling frames that can be used for surveys of physicians, in the U.S. the most common source is the AMA Masterfile. Mail surveys have most commonly been used for physician surveys, although internet-based surveys or mixed-mode designs are also used. However, possibly owing to competing priorities and a surfeit of survey requests, surveys of such populations are characterized by low and declining response rates, which raises concerns about the generalizability of the findings. Studies have investigated various techniques to improve response rates in physician surveys, and results indicate the following best practices: (i) mail surveys yield higher response rates than internet-based surveys, though mixed-mode surveys are about as effective as mail surveys in terms of response rates. In addition, at the present time, sample frames for physicians tend to include more complete and accurate contact information for a mail versus web mode of survey, further increasing concerns about the representativeness of a web-based sample. Results of studies also indicate that (ii) surveys offering monetary incentives, even as little as $1, yield higher response rates than either those offering non-monetary incentives or those offering no incentives, and that the monetary incentive is especially effective when (iia) offered in advance of the respondent completing the survey (that is, a prepaid incentive) and (iib) provided in the form of cash rather than a check. Studies have also found that (iii) following up with nonrespondents can increase response rates, and that (iv) shorter questionnaires, or questionnaires that minimize perceived burden, can improve response rates.

Useful references:

Cho, Y. I., Johnson, T. P., & VanGeest, J. B. (2013). Enhancing surveys of health care professionals: A meta-analysis of techniques to improve response. Evaluation & the Health Professions, 36(3), 382-407.

Klabunde, C. N., Willis, G. B., McLeod, C. C., Dillman, D. A., Johnson, T. P., Greene, S. M., & Brown, M. L. (2012). Improving the quality of surveys of physicians and medical groups: A research agenda. Evaluation and the Health Professions, 35, 477–506.

McLeod, C. C., Klabunde, C. N., Willis, G. B., & Stark, D. (2013). Health care provider surveys in the United States, 2000-2010: A review. Evaluation and the Health Professions, 36, 106-126.

VanGeest, J. B., Johnson, T. P., & Welch, V. L. (2007). Methodologies for improving response rates in surveys of physicians: A systematic review. Evaluation & the Health Professions, 30(4), 303-321.


No. 105. Availability of list samples for different populations.

When surveying a specific population, there are situations where a list of the people or organizations in the population exists and can be used either to survey everyone or to draw a random sample. Various types of employers, professional associations, and membership organizations tend to have directories. For instance, the American Medical Association (AMA) directory can serve as the sample frame for a physician survey. A list of business owners of certain types or sizes of businesses within a certain geographical area can be purchased from Dun & Bradstreet. University directories clearly work well for campus community or climate surveys. Similarly, stores may have lists of regular customers, or websites may have lists of registered users.

Keep in mind that your sample will only be as good as the list. Lists can quickly become outdated. You may have undercoverage: specific people or organizations, or certain subgroups within your target population, may be missing from the list. You may also have overcoverage: inclusion of people or organizations that are not actually in your target population. Lists can often include duplicate records, so add a duplication check to your sample list review.
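A duplication check of the kind suggested above can be sketched as follows. This is a minimal illustration with invented field names; real list reviews usually also need fuzzier matching (name variants, address standardization) than this exact-key comparison.

```python
# Minimal sketch of a duplication check for a list sample.
# Field names ("email", "name") are hypothetical examples.

def deduplicate(records, key_fields=("email",)):
    """Keep the first record for each key; return (unique, duplicates)."""
    seen, unique, duplicates = set(), [], []
    for rec in records:
        # Normalize the key so trivial variations do not hide duplicates.
        key = tuple(str(rec.get(f, "")).strip().lower() for f in key_fields)
        if key in seen:
            duplicates.append(rec)
        else:
            seen.add(key)
            unique.append(rec)
    return unique, duplicates

frame = [
    {"name": "A. Smith", "email": "asmith@example.edu"},
    {"name": "Ann Smith", "email": "ASmith@example.edu"},  # same address, different case
    {"name": "B. Jones", "email": "bjones@example.edu"},
]
unique, dupes = deduplicate(frame)  # 2 unique records, 1 duplicate flagged
```

Flagged duplicates should be reviewed by hand rather than deleted automatically, since two list entries can legitimately share contact information.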

One of the advantages of listed sample is that it often (although not always) allows the researcher to identify a specific respondent by name. This is helpful because it eliminates the need for sometimes difficult-to-implement steps such as within household selection. Furthermore, it allows the researcher to personalize communication about the survey, which enhances the likelihood that a selected respondent will agree to participate.

In order to be used for a survey, listed sample must include some form of contact information for the individuals or organizations on the list (e.g., mailing and/or email addresses or telephone numbers). The type of contact information available may restrict which survey modes can be used. For example, it would be difficult to do a telephone survey with a listed sample that only included mailing addresses. One challenge with listed sample arises when different types of contact information are available for different people or organizations on the list (e.g., telephone numbers are available for some portion of the list and addresses are available for a different portion). When this occurs, one can use a mode that covers the largest portion of the frame (e.g., using a mail survey if 90% of cases in the list have mailing addresses). Alternatively, a multi-mode design can be used to contact potential respondents in different modes.
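For the multi-mode alternative just described, the assignment of each list member to a contact mode can be sketched as below. The field names and the preference order (mail, then web, then telephone) are illustrative assumptions, not a prescription; the right ordering depends on the study's budget and coverage.

```python
# Hypothetical sketch: assign each list member a contact mode based on
# available contact information, preferring mail, then email, then phone.
# Field names and the preference order are assumptions for illustration.

def assign_mode(record):
    """Return the first usable contact mode for one sample record."""
    if record.get("mailing_address"):
        return "mail"
    if record.get("email"):
        return "web"
    if record.get("phone"):
        return "telephone"
    return "unreachable"  # no usable contact information on the list

frame = [
    {"mailing_address": "123 Main St", "phone": "555-0101"},
    {"email": "person@example.org"},
    {"phone": "555-0102"},
]
modes = [assign_mode(r) for r in frame]  # ["mail", "web", "telephone"]
```

Tallying the resulting modes also gives a quick check of the single-mode option: if one mode covers nearly the whole frame, a multi-mode design may not be worth its added complexity.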

For more information, see:

Tourangeau, R., Edwards, B., Johnson, T. P., Wolter, K. M., & Bates, N. (Eds.). (2014). Hard-to-Survey Populations. Cambridge: Cambridge University Press.

Levy, P. S., & Lemeshow, S. (2008). Sampling of Populations: Methods and Applications (4th ed.). Wiley.


SRL Survey News Bulletins Contributors:

Sowmya Anand
Marni Basic
Anne Diffenderffer
Isabel Farrar
Allyson Holbrook
Tim Johnson
Linda Owens
Jennifer Parsons
Karen Retzer
Marina Stavrakantonaki