An Empirical Assessment of the Random Response Method of Sensitive Data Collection

Laurie E. Linden

David J. Weiss

Department of Psychology, California State University

Los Angeles, CA 90032

The random response method for investigating sensitive aspects of one’s personal history was assessed by administering a written questionnaire to 285 undergraduate students. It was hypothesized that, in accord with previous findings, the protection of privacy afforded by randomization would yield more revelations in response to sensitive questions than would direct questioning. Members of the direct questioning group simply responded, while those in the random responding group rolled a die before answering each question to determine if a truthful response should be given. Overall, the results failed to show consistent differences between estimated population proportions obtained using the two methods of eliciting sensitive information. If sufficient care is paid to guaranteeing anonymity, straightforward questioning appears to be as effective for sensitive issues as it is for neutral ones.

Difficulties in extracting honest answers to personal questions have motivated researchers to use techniques incorporating randomness in an effort to obtain valid estimates of population proportions (Crino, Rubenfeld, & Willoughby, 1985; Frenette & Begin, 1979; Gelles, 1978; Martin & Newman, 1988). The additional layer of protection is hypothesized to elicit more sensitive revelations.

The random response technique was introduced by Warner (1965) as an aid in estimating population proportions of sensitive behaviors or attitudes. Prior to recording a dichotomous response on a questionnaire, the respondent engages a random device, such as a die, and the outcome of the random process is superimposed on the response. For example, the instructions may call for giving a truthful answer if the die comes up "2"-"6", but ask for the opposite answer if the die should come up "1".

Warner's presentation employed a straightforward scheme for correcting observed proportions of "yes" responses in accord with the directed probability of an inaccurate response. More complex algorithms have been discussed (Greenberg, Kuebler, Abernathy, & Horvitz, 1971; Levy, 1976, 1977; Liu & Chow, 1976; Pitz, 1980), but none has been established as a standard.

The rationale for the technique is that the guarantee of anonymity is enhanced, because the random element means that a particular response cannot be associated with an individual respondent even if the confidentiality of the questionnaire is breached. Thus the respondent can more freely admit to socially undesirable attitudes or behaviors. These closely related promises, anonymity and confidentiality, are crucial in appreciating a respondent's apprehension when considering whether to reveal sensitive information. The researcher's pledge of confidentiality provides less security than the condition of anonymity, wherein even the researcher cannot determine how a participant responded.

One might expect sexual issues to be among the most sensitive, and so it is in explorations of such matters that the superiority of random responding should be apparent. The landmark study in this domain was carried out by Fidler and Kleinknecht (1977). A written questionnaire of nine items, mostly of a sexual nature, was administered to two groups of sorority members (N = 200). Each group employed one of the aforementioned methods of gathering sensitive data. Each member of the direct questioning group filled out the questionnaire individually in the presence of the female researcher. Anonymity was promised to the respondents, but it was not guaranteed. This is problematic since the researcher could make a connection between the respondent and her questionnaire.

The random response group also partook in the study individually, but the researcher was not present. The subjects were informed, correctly, that this absence would enhance anonymity. A randomizing device was employed which the subjects had to be trained to use. The device dictated the manner in which they should respond to a given question. If a subject drew a red pellet from a globe, she was to answer the question truthfully. If a non-red pellet was drawn, she was to answer "yes" or "no", in accordance with what was written on the pellet.

Results of this study showed that, for questions deemed by the researchers to be of a less sensitive nature (e.g., Are you a Protestant?), the two methods yielded approximately the same population proportion estimates. However, when the question was of a more sensitive nature (e.g., Have you ever masturbated?), the random response method typically generated larger estimates of population proportions. The researchers concluded that the random response method allowed subjects to answer sensitive questions more honestly by providing more privacy than the direct questioning method.

This conclusion about the value of the random response method must be regarded with great caution. First, the use of a randomizing device which one must be trained to use seems awkward and possibly confusing. Although this potential difficulty was not addressed in the study, unintentional misuse of the device could result in inaccurate answers.

Second, and more seriously, Fidler and Kleinknecht did not treat the groups alike. The direct questioning group responded to questions in a one-on-one situation. Anonymity cannot be achieved under such circumstances. The researcher knows which responses were made by the subject. Whereas confidentiality could be promised (but was not), anonymity could not be promised (but was). Clearly, calling for answers to such sensitive questions as "Have you ever had a homosexual experience?" in the one-on-one setting can be an intimidating experience, one that could induce socially acceptable lies. On the other hand, the respondents in the random responding group were left alone, which might encourage the revelation of delicate behaviors. This procedural discrepancy could explain the direction of the outcome.

Begin, Boivin, and Bellerose (1979) conducted a survey similar to that of Fidler and Kleinknecht (1977), but with some important modifications. First, the researchers left the respondents in both groups alone to complete the questionnaire; however, they returned in a short while to collect the answers. While this addresses the problem of not treating the groups alike, gathering the questionnaires individually still threatens to compromise anonymity. Second, the sample comprised both male and female college students (N = 405). Subjects were randomly assigned to either the direct questioning group or the random response group. Third, the majority of the questionnaire items were not of a sexual nature. A broad range of social and political issues was included.

Begin et al. (1979) did not find consistent superiority for the random response method. For items deemed by the researchers to be very sensitive (i.e., masturbation, complete intercourse, cheating), no significant differences between the two methods were found. This is contradictory to the results of Fidler and Kleinknecht (1977), who did find significant differences on the questions they labeled as very sensitive. However, for some of the sensitive questions (i.e., abortion, legalized marijuana), a difference was obtained. Significance was also obtained for questions dealing with nuclear power plants, organ donation, halfway homes for mental patients and pleading madness. While these inconsistencies are difficult to interpret, they do not show random responding to be useful in the cases where one might expect the method to demonstrate an advantage.

The present study addressed the possible shortcomings of previous research by treating both groups alike and by taking additional measures to provide anonymity. Specifically, the researcher was present for both groups rather than leaving one group alone to complete the questionnaire. Also, participants were run in groups to avoid a perhaps intimidating one-on-one situation. Subjects knew that the gathering of the response sheets was carried out so as to eliminate identifiability, thus preserving anonymity from the subject’s perspective.

METHOD

Subjects

Two hundred eighty-five students enrolled in twelve undergraduate courses at a large urban public university participated in the survey. The set of courses comprised the entire undergraduate offering for that summer term. The subjects were run in classes assigned to the two conditions by random permutation, with one class shifted afterward to achieve near-equality of group sizes. Lower division and upper division courses were balanced over the conditions. A total of 141 subjects, in five classes, comprised the direct questioning group; 144 subjects, in seven classes, were in the random responding group. Students enrolled in more than one of the classes were excused from the room if they had already participated in the experiment. One student refused to participate.

Procedure

For both conditions, the experimenter went to the classroom in which the given course was being conducted. Volunteers were sought to "participate in a follow-up survey to one conducted by a professor at the university a few years earlier." The students were told that some of the questions would be "of a sensitive nature" and that their responses would be completely anonymous and confidential. They were also informed that the questionnaire consisted of fourteen yes or no items. Subjects were asked not to put their name or any other identifying mark on the paper and, if they did so, their paper would be immediately destroyed. Instructions also directed subjects to place their completed questionnaire in the cardboard box (30.5 x 45.7 x 25.4 cm) at the front of the room. They could fold their sheet in half or blindly mix it with others in the box if they wished (five blank questionnaires were placed in the box at the start by the experimenter so the first one done could do this). The paper was not to be handed directly to the experimenter, who stayed in the classroom. The rationale given for this procedure was the protection of their privacy by preventing the experimenter from making any connection between subject and response.

From this point, the directions differed for the two groups. Subjects in the direct questioning group were asked to answer all questions honestly. Each member of the random responding group was provided a die, which was described as fair in that all six numbers had an equal likelihood of being rolled. Specific instructions were given for answering each of the questions. Participants were told to roll the die before answering each of the fourteen questions, and if the die came up "2," "3," "4," "5," or "6" they were to answer the question truthfully. If they rolled a "1," they were to answer falsely (see Appendix A for verbatim instructions).
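The die-roll rule described above can be sketched in Python as follows. This is a minimal illustration, not part of the original study: the function names, the seed, and the simulated prevalence are assumptions chosen for the example.

```python
import random


def answer_given_roll(truth, roll):
    """Return the recorded answer: truthful on 2-6, the opposite on 1."""
    if roll not in (1, 2, 3, 4, 5, 6):
        raise ValueError("roll must be an integer from 1 to 6")
    return (not truth) if roll == 1 else truth


def simulate_yes_proportion(true_yes_rate, n_respondents, seed=0):
    """Simulate one question and return the observed proportion of 'yes'.

    true_yes_rate is a hypothetical population prevalence (an assumption
    of this sketch); each simulated respondent rolls a fair die once.
    """
    rng = random.Random(seed)
    yes_count = 0
    for _ in range(n_respondents):
        truth = rng.random() < true_yes_rate
        roll = rng.randint(1, 6)
        if answer_given_roll(truth, roll):
            yes_count += 1
    return yes_count / n_respondents


if __name__ == "__main__":
    # With a true prevalence of 0.30, the expected observed proportion is
    # 0.30 * 5/6 + 0.70 * 1/6, because one roll in six reverses the answer.
    print(simulate_yes_proportion(0.30, 200_000))
```

The observed proportion is pulled toward one half by the reversed answers, which is why the raw frequencies from the random responding group must be corrected before they are compared with those from the direct questioning group.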

Design

Of the fourteen items that comprised the questionnaire, seven of them were considered to be of a sensitive nature while the remaining seven were non-sensitive (see Appendix B). While the designation as sensitive is culture-dependent and to some extent subjective, the operational distinction adopted here is that sensitive questions explored sexual issues. Some of the sensitive questions were adapted from Fidler and Kleinknecht (1977) and Begin et al. (1979). The remainder of the questions were constructed by the researchers. All questions were asked in a fashion that was neutral for gender and sexual orientation. The response options were unequivocally "yes" or "no" (see Table 1 for the questions).

RESULTS

Overall, the results yielded little evidence for the superiority of the random response method. Frequencies and imputed frequencies are presented in Table 1.

For each question, the frequency from the random responding group has been corrected for the proportion answering falsely as directed by the roll of the die. The following formula, equivalent to that of Fidler and Kleinknecht (1977), was employed to achieve the imputation:

$$\hat{\pi} = \frac{n_y / N - P(L)}{1 - 2\,P(L)} \qquad (1)$$

where:

$\hat{\pi}$ = imputed proportion of true yes responses

P(L) = probability of die directing a false response = 1/6

n_y = number of people who respond "yes" to the question

N = size of sample
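The imputation can be sketched in Python under the assumption that the observed proportion of "yes" responses equals π(1 − P(L)) + (1 − π)P(L), where π is the true proportion; solving for π gives (n_y / N − P(L)) / (1 − 2P(L)). The function names are illustrative, and the raw count of 54 used below is back-calculated for the example rather than reported in the study.

```python
def impute_yes_proportion(n_yes, n_total, p_lie=1 / 6):
    """Estimate the true 'yes' proportion from randomized responses.

    p_lie is the probability that the die directs a false response.
    The estimate can fall outside [0, 1] when the observed proportion
    is very small or very large; no clipping is applied here.
    """
    if not 0 <= p_lie < 0.5:
        raise ValueError("p_lie must be in [0, 0.5)")
    observed = n_yes / n_total
    return (observed - p_lie) / (1 - 2 * p_lie)


def impute_yes_frequency(n_yes, n_total, p_lie=1 / 6):
    """Convert the imputed proportion back to a frequency out of n_total."""
    return n_total * impute_yes_proportion(n_yes, n_total, p_lie)


if __name__ == "__main__":
    # Hypothetical raw count: 54 'yes' responses among 144 respondents.
    print(impute_yes_proportion(54, 144))
    print(impute_yes_frequency(54, 144))
```

Under these assumptions, 54 observed "yes" responses among 144 respondents correspond to an imputed proportion of about 0.31 and an imputed frequency of about 45, the value shown in Table 1 for question 2.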

Chi Square was used to test significance of the difference between proportions. The frequency of "yes" and "no" responses for the direct questioning group and the imputed frequency for the random responding group were compared at the .05 level for each of the questionnaire items. Significance was obtained for only three (#2, #4, #12) of the fourteen questions¹. More direct questioning subjects stated they were not virgins than random responding subjects (χ² = 5.52, df = 1, p < .025). Direct questioning subjects reported having a sexual experience with someone the same gender as themselves significantly more than their random responding counterparts (χ² = 6.31, df = 1, p < .025). In addition, direct questioning subjects reported being sexually molested as a child at a significantly higher rate than the random response subjects (χ² = 11.68, df = 1, p < .001).
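The comparison can be illustrated with a short Python sketch. It assumes an uncorrected Pearson chi-square on the 2 x 2 table of direct and imputed frequencies, which reproduces the reported values closely; the function name is illustrative.

```python
def pearson_chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for [[a, b], [c, d]].

    Rows are the two groups; columns are 'yes' and 'no' frequencies.
    Imputed (non-integer) frequencies are accepted as given.
    """
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


if __name__ == "__main__":
    # Question 12 in Table 1: direct questioning 28 yes / 113 no,
    # random responding (imputed) 9.00 yes / 135.00 no.
    print(pearson_chi_square_2x2(28, 113, 9.0, 135.0))
    # Question 4 in Table 1: 15 / 126 versus 4.50 / 139.50.
    print(pearson_chi_square_2x2(15, 126, 4.5, 139.5))
```

For questions 12 and 4 this yields approximately 11.68 and 6.31, in agreement with Table 1.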

All three items yielding significant differences were from the subset of sensitive questions. However, the direction of the differences was the opposite of that expected. On the basis of prior results, we anticipated more revelations in response to sensitive questions from the random responding group; but as may be seen in Table 2, the direct questioning group gave a larger proportion of "yes" responses to questions #4 and #12, and a smaller proportion of "yes" responses to question #2 (that is, more reports of not being a virgin).

DISCUSSION

The results did not support the hypothesis that the random response method of data collection would yield more sensitive information than the direct questioning method. There were few differences in revelations. This conclusion depends upon the accuracy of the frequencies imputed for the random response method group. The observed near-equality of the frequencies for the non-sensitive questions is sufficient to demonstrate the computational efficacy of the correction. This accord alleviates one of the possible dangers in espousing a conclusion of no difference between methods, namely that with additional observations the increased power would generate significant disparity.

For two of the questions designated as sensitive, there was a significant difference obtained between the two methods. Surprisingly, though, more disclosure was obtained from the direct questioning group. These were questions #4 and #12, referring to same-gender sexual experience and to being molested as a child.

1 The Chi-Square tests employed here are approximate, since the imputed frequency has built into it not only the binomial variability of the actual frequency, but an additional component attributable to the random impact of the die. The latter component increases the variability, and so the use of standard Chi-Square procedure results in slightly larger obtained values than a corrected analysis would yield. Consequently, exact analyses would generate even stronger support for the conclusion that the two response methods do not produce different outcomes.
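The added variability described in this footnote can be illustrated with a short sketch. It assumes the standard sampling variance of a Warner-type estimate, λ(1 − λ) / [N(1 − 2P(L))²], where λ = π(1 − 2P(L)) + P(L) is the probability of an observed "yes"; the function names and the example prevalence are assumptions, not part of the original analysis.

```python
def variance_direct(true_rate, n):
    """Binomial variance of a proportion estimated by direct questioning."""
    return true_rate * (1 - true_rate) / n


def variance_randomized(true_rate, n, p_lie=1 / 6):
    """Variance of the imputed proportion under randomized responding."""
    observed_rate = true_rate * (1 - 2 * p_lie) + p_lie
    return observed_rate * (1 - observed_rate) / (n * (1 - 2 * p_lie) ** 2)


if __name__ == "__main__":
    # Illustrative prevalence of 0.3125 with 144 respondents.
    direct = variance_direct(0.3125, 144)
    randomized = variance_randomized(0.3125, 144)
    print(direct, randomized, randomized / direct)
```

The sketch covers only the variance of the estimate within the random responding group. How much the chi-square statistic is affected also depends on the direct questioning group, which carries no added variability, and that is not computed here.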

TABLE 1

Response Frequencies and Corrected Frequencies for Random Response Group

	"Yes"		"No"
	Direct Questioning	Random Responding	Direct Questioning	Random Responding	χ²	p
1. Are you a female?	99	97.50	42	46.50	0.21	>.25
2. Are you a virgin?	27	45.00	114	99.00	5.52	<.025
3. Do you smoke cigarettes on a daily basis?	10	10.50	131	133.50	0.004	>.25
4. Have you ever had a sexual experience with someone of the same gender as yourself?	15	4.50	126	139.50	6.31	<.025
5. Have you ever touched a partner's genitals with your mouth or have they ever touched yours with theirs?	106	96.00	35	48.00	2.50	>.10
6. Have you ever lied to your physician?	30	34.50	111	109.50	0.29	>.10
7. Do you exercise at least 3 times a week?	68	75.04	73	68.96	0.43	>.25
8. Have you ever been tested for HIV?	38	37.50	103	106.50	0.03	>.25
9. Would you practice "safe sex" if you had more than one sexual partner?	136	138.00	4	6.00	0.38	>.25
10. Do you drink alcoholic beverages on a weekly basis?	26	15.00	115	129.00	3.72	>.05
11. Do you know for a fact that you are HIV positive?	4	4.50	137	139.50	0.02	>.25
12. Were you ever sexually molested as a child?	28	9.00	113	135.00	11.68	<.001
13. Are you married?	27	22.50	114	121.50	0.62	>.25
14. Are you between 20 and 30 years old?	92	102.00	49	42.00	1.02	>.25

TABLE 2

Comparison of Proportions of "Yes" Responses for Direct Questioning and Random Response Groups

	"Yes"
	Direct Questioning	Random Responding	χ²	p

Sensitive Questions
2. Are you a virgin?	0.19	0.31	5.52	<.025
4. Have you ever had a sexual experience with someone of the same gender as yourself?	0.11	0.03	6.31	<.025
5. Have you ever touched a partner's genitals with your mouth or have they ever touched yours with theirs?	0.75	0.67	2.50	>.10
8. Have you ever been tested for HIV?	0.27	0.26	0.03	>.25
9. Would you practice "safe sex" if you had more than one sexual partner?	0.96	0.96	0.38	>.25
11. Do you know for a fact that you are HIV positive?	0.03	0.03	0.02	>.25
12. Were you ever sexually molested as a child?	0.20	0.06	11.68	<.001

Non-sensitive Questions
1. Are you a female?	0.70	0.67	0.21	>.25
3. Do you smoke cigarettes on a daily basis?	0.07	0.07	0.004	>.25
6. Have you ever lied to your physician?	0.21	0.24	0.29	>.10
7. Do you exercise at least 3 times a week?	0.48	0.52	0.43	>.25
10. Do you drink alcoholic beverages on a weekly basis?	0.18	0.10	3.72	>.05
13. Are you married?	0.19	0.16	0.62	>.25
14. Are you between 20 and 30 years old?	0.65	0.71	1.02	>.25

A significant difference between the two groups was also obtained on question #2. More members of the random responding group reported being a virgin than did those in the direct questioning group. This result is difficult to interpret as a revelation since it is not clear whether it is desirable to be a virgin in an urban university².

Our interpretation of these differences is that they are Type I errors. Since 14 independent Chi Square analyses were performed, the probability of at least one Type I error is not .05 but .51. Of course, this possibility is inherent in a design in which multiple independent tests are carried out. Had we chosen a Bonferroni procedure with accordingly reduced significance level (and reduced power as well), only question #12 would have yielded a significant difference. Note that this difference, namely a greater proportion of "yes" responses by the direct questioning group to a delicate item, is opposite in direction to that expected if the random response method produced more revelations of sensitive information.
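The familywise error rate and the Bonferroni threshold cited above can be checked with a brief sketch. It assumes independent tests, as the text does, and recovers approximate p values from the reported chi-square statistics with one degree of freedom; the variable and function names are illustrative.

```python
import math


def familywise_error_rate(alpha, n_tests):
    """Probability of at least one Type I error across independent tests."""
    return 1 - (1 - alpha) ** n_tests


def chi_square_p_value_df1(chi_square):
    """Upper-tail p value for a chi-square statistic with 1 degree of freedom.

    For df = 1 the statistic is a squared standard normal deviate, so the
    tail probability equals erfc(sqrt(chi_square / 2)).
    """
    return math.erfc(math.sqrt(chi_square / 2))


ALPHA = 0.05
N_TESTS = 14

# Chi-square values reported for the three significant questions.
reported_chi_squares = {2: 5.52, 4: 6.31, 12: 11.68}

fwer = familywise_error_rate(ALPHA, N_TESTS)
bonferroni_alpha = ALPHA / N_TESTS
survivors = [
    question
    for question, value in reported_chi_squares.items()
    if chi_square_p_value_df1(value) < bonferroni_alpha
]

if __name__ == "__main__":
    print(round(fwer, 2), round(bonferroni_alpha, 4), survivors)
```

With 14 tests at the .05 level the familywise rate is about .51 and the Bonferroni-adjusted level is about .0036, so that of the three questions only #12 remains significant.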

In contrast to the results obtained by Fidler and Kleinknecht (1977), and by Begin et al. (1979), the present data do not show superiority for the intuitively more promising random response method. The discrepancy is not likely to involve statistical power, as the present study employed a sample size intermediate between those of the cited predecessors. While running subjects within previously constituted groups can under some conditions reduce the effective size of the sample, those conditions are not likely to have arisen here. Of greatest concern is violation of independence among responses from different individuals, such as might occur if subjects communicated while filling out the questionnaires. Another possibility, unlikely given the nature of the sensitive topics (there was no class in sexuality offered), is that class discussion led to students answering the questions in similar ways.

Another possible bias is that students who choose to take a particular class might have like tendencies to reveal sensitive matters. We elected to carry out the randomization on a class basis in order to avoid the chaos that would arise from mixing random responding and direct responding within a classroom. If students with similar propensity to reveal were clustered in particular classes, then a difference in proportions of "yes" responses to sensitive questions should appear across classes. We compared the proportions of "yes" responses to those questions across the five classes which comprised the direct questioning group. Analysis of variance on these proportions showed no significant differences among the classes, F(4, 30) < 1. A similar analysis on the seven classes comprising the random responding group (using imputed proportions rather than observed values) showed a similar result, F(6, 42) < 1. Therefore, it appears that random assignment of classes, rather than individual subjects, to the experimental conditions had no adverse impact.

2 For most of the sensitive questions asked in the present study, the socially desirable response was presumed to be obvious. This cavalier approach to determining which answers may be regarded as revelations can scarcely be recommended. As Catania, Gibson, Chitwood, and Coates (1990) have observed, self-presentation bias is poorly understood, especially in the realm of sexual activity.

To reconcile the conflict between the negative results of the current study and the positive results of its predecessors, it may be useful to review some procedural aspects of the present study that differed from those of the earlier work. In the present investigation, subjects were run in groups and all questionnaires were placed in a large box by the respondents, rather than being handed directly to the researcher. There was no one-on-one contact between the researcher and subjects in either group. These measures assured both groups of anonymity, since the researcher had no way of knowing who made a given response. Perhaps with such strong guarantees of privacy, the subjects in the direct questioning group felt comfortable in answering honestly even the most sensitive questions.

The results of the current study suggest that revelations can be obtained if respondents believe in the privacy of their answers. The Achilles heel of face-to-face data collection may be in persuading subjects that their responses are truly private. Random responding may be of value in studies in which respondents are run individually. A well-controlled investigation of this question would be useful.

The investigator's perspective on privacy is likely to differ from that of the subject. To the researcher, a response is an objective piece of information. To the subject, a response is personal history, possibly emotionally charged. The prospect of another person associating that response with their identity may be frightening and inhibiting. In any study, the issue of privacy should be broached by the researcher; but because the perspectives differ, the distinction between confidentiality and anonymity may be glossed over.

Whereas a promise of confidentiality is routine among researchers investigating sensitive matters, the promise of anonymity entails restrictions on the scope of an inquiry. Studies calling for repeated measurement require some sort of subject identification. These include interventions and longitudinal projects, as well as those concerned with a methodological issue such as test-retest reliability. Anonymity can be maintained in such contexts by having subjects assign code numbers to their questionnaires. However, the random response method would not seem practical for such investigations, since an individual response cannot be taken at face value.

A related uncertainty is that associations are difficult to prove. In a health-oriented study, for example, one might wish to ask whether a person who answered positively to one question (e.g., have you engaged in anal sex within the last month) had an increased likelihood of answering positively to another (have you had an HIV test). Such personal questions would seem to be the natural playing field for random response methodology. In inquiries exploring these delicate issues, the researcher's interest is likely to be focused on connections between responses. Unfortunately, the random element means that the researcher cannot classify the subject with certainty on the basis of any of the answers given. While it is possible to estimate population proportions for paired responses³, the compounded variability reduces the reliability of the estimates.

The random response method does not offer broad applicability, nor is it particularly convenient. The present evidence shows that direct questioning with a strong guarantee of anonymity (such as that offered by the box in the current study) is equally effective in eliciting socially delicate responses. When anonymity is feasible, and made credible for respondents, random responding seems to have little practical value.

3 The estimate is computed from the observed frequencies of the four combinations of "yes" and "no" responses for any selected pair of questions. The estimated proportion of true "yes"-"yes" responses is given by:

where the quantities are defined analogously to those used in Equation 1. We are indebted to Donald Bamber for suggesting this approach.
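One way to compute such a paired estimate is sketched below. It assumes that the die rolls for the two questions are independent, so that each true combination of answers is recorded as each observed combination with a known probability, and it inverts that relationship for the "yes"-"yes" cell. The function and argument names are hypothetical, and the sketch is not necessarily identical to the expression in the original footnote.

```python
def impute_yes_yes_proportion(n_yy, n_yn, n_ny, n_nn, p_lie=1 / 6):
    """Estimate the true proportion answering 'yes' to both questions.

    n_yy, n_yn, n_ny, n_nn are the observed frequencies of the four
    response combinations; p_lie is the per-question probability that
    the die directs a false answer (rolls assumed independent).
    """
    if not 0 <= p_lie < 0.5:
        raise ValueError("p_lie must be in [0, 0.5)")
    n_total = n_yy + n_yn + n_ny + n_nn
    p_truth = 1 - p_lie
    numerator = (
        p_truth ** 2 * n_yy
        - p_lie * p_truth * (n_yn + n_ny)
        + p_lie ** 2 * n_nn
    )
    return numerator / (n_total * (1 - 2 * p_lie) ** 2)


if __name__ == "__main__":
    # Hypothetical counts constructed so that the true proportions are
    # 0.2 (yes-yes), 0.1 (yes-no), 0.3 (no-yes), and 0.4 (no-no).
    print(impute_yes_yes_proportion(740, 580, 1060, 1220))
```

Because the divisor is the square of the one used for a single question, sampling error in the observed frequencies is amplified more strongly than in the single-question estimate, consistent with the compounded variability noted in the text.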

APPENDIX A

Instructions for Direct Questioning Group

This survey is a follow-up to one conducted by Dr. Weiss a few years ago. Some of the questions will be of a sensitive nature. Please answer each question honestly. All answers will remain anonymous and confidential. Please do not put your name or any other identifying mark on the paper. Any papers found to have any mark which might identify a participant will be immediately destroyed. When you have completed the questionnaire, place it in the cardboard box at the front of the classroom. This way I can't tell which participant goes with which questionnaire. You may fold the questionnaire, if you wish, or mix it with other questionnaires in the box. Are there any questions?

Instructions for Random Response Group

This survey is a follow-up to one conducted by Dr. Weiss a few years ago. Some of the questions will be of a sensitive nature. In order to insure anonymity, a die is provided for you. You are to roll the die prior to answering each of the questions. All the dice are fair, so you have an equal likelihood of rolling each of the six numbers on the die. If you should roll a 2, 3, 4, 5, or 6, please answer the question truthfully. If you should roll a 1, you are to answer the question falsely. You are to roll the die for each of the fourteen questions and answer each question according to the instructions I just gave you. All responses will remain anonymous and confidential. Please do not put your name or any other identifying mark on the paper. Papers found to have any mark which might identify a participant will be immediately destroyed. When you have completed the questionnaire, place it in the cardboard box at the front of the classroom. This way I can't tell which participant goes with which questionnaire. You may fold the questionnaire, if you wish, or mix it with other questionnaires in the box. Are there any questions?

APPENDIX B

Questionnaire

Health Practices Survey

1. Are you a female?	Yes	No
2. Are you a virgin?	Yes	No
3. Do you smoke cigarettes on a daily basis?	Yes	No
4. Have you ever had a sexual experience with someone of the same gender as yourself?	Yes	No
5. Have you ever touched a partner's genitals with your mouth or have they ever touched yours with theirs?	Yes	No
6. Have you ever lied to your physician?	Yes	No
7. Do you exercise at least 3 times a week?	Yes	No
8. Have you ever been tested for HIV?	Yes	No
9. Would you practice "safe sex" if you had more than one sexual partner?	Yes	No
10. Do you drink alcoholic beverages on a weekly basis?	Yes	No
11. Do you know for a fact that you are HIV positive?	Yes	No
12. Were you ever sexually molested as a child?	Yes	No
13. Are you married?	Yes	No
14. Are you between 20 and 30 years old?	Yes	No

Authors' Notes: This report is based on a thesis submitted by the first author and supervised by the second to the Department of Psychology, California State University, Los Angeles in partial fulfillment of the requirements for the M.S. degree. We thank committee members Jerry Tate and David Fitzpatrick for their insights. We also are grateful to Donald Bamber for mathematical advice and to Burton Alperson for comments on an earlier version of the manuscript.

Correspondence concerning this article should be addressed to the second author.

Journal of Social Behavior and Personality, 1994, Vol 9, No. 4, 823-836.