Comprehensive Handbook of Psychological Assessment

Volume 2: Personality Assessment

Part 4: Specific Instruments

Author

Mark J. Hilsenroth (Editor), Daniel L. Segal (Editor), Michel Hersen (Editor-in-Chief)

ISBN: 978-0-471-41612-8

September 2003

Part Four: Specific Instruments

26 Rorschach Assessment: Current Status, by Irving B. Weiner

27 The Thematic Apperception Test (TAT), by Robert J. Moretti and Edward D. Rossini

28 The Use of Sentence Completion Tests with Adults, by Alissa Sherry, Eric Dahlen, and Margot Holaday

29 Use of Graphic Techniques in Personality Assessment: Reliability, Validity, and Clinical Utility, by Leonard Handler, Ashley Campbell, and Betty Martin

30 The Hand Test: Assessing Prototypical Attitudes and Action Tendencies, by Harry J. Sivec, Charles A. Waehler, and Paul E. Panek

31 Early Memories and Personality Assessment, by J. Christopher Fowler

32 The Adult Attachment Projective: Measuring Individual Differences in Attachment Security Using Projective Methodology, by Carol George and Malcolm West

CHAPTER 26 Rorschach Assessment: Current Status

IRVING B. WEINER

CONCEPTUAL BASIS
DEVELOPMENT AND PSYCHOMETRIC CHARACTERISTICS
  Intercoder Agreement
  Reliability
  Validity
  Normative Reference Data
UTILITY OF RORSCHACH APPLICATIONS
  Clinical Practice
  Forensic Practice
  Organizational Practice
LEGAL ISSUES
CROSS-CULTURAL CONSIDERATIONS
COMPUTERIZATION OF RESULTS
CURRENT AND FUTURE STATUS
REFERENCES

The Rorschach Inkblot Method (RIM) is a relatively unstructured, performance-based measure of personality functioning consisting of 10 inkblots printed individually on 6¾" by 9¾" cards. Five of the blots appear in shades of gray and black (Cards I and IV through VII); two in shades of red, gray, and black (Cards II and III); and the remaining three in shades of various pastel colors (Cards VIII through X). Hermann Rorschach selected these 10 inkblots from among many others with which he experimented on the basis of finding them particularly helpful in identifying the psychological makeup of persons he evaluated. His experiment resulted in the 1921 publication of Psychodiagnostics (Rorschach, 1921/1942), in which he described his new assessment method and presented findings and conclusions from the testing of 117 “normal” volunteers and 288 patients in a Swiss mental hospital where he worked as a staff psychiatrist. Rorschach’s 10 selected inkblots have since that time constituted the measure that bears his name, and they have been printed in the identical fashion and by the same publisher, Hans Huber in Bern, Switzerland, since 1922.

The RIM is administered by handing a respondent each of the cards one at a time, proceeding always from Card I through Card X, and asking, “What might this be?” Following this initial phase of the administration, called the “free association,” there is an inquiry phase in which respondents are shown the inkblots a second time and asked to indicate where they saw each of their percepts and what made them look the way they did. When properly used, this procedure generates useful information about numerous aspects of personality functioning. This chapter addresses the conceptual basis of Rorschach interpretation; the development and psychometric characteristics of the instrument; the utility of Rorschach applications; legal issues concerning the admissibility into evidence of Rorschach testimony; cross-cultural considerations in Rorschach assessment; the computerization of Rorschach results; and the current and future status of the instrument.

CONCEPTUAL BASIS

Rorschach responses provide three types of data, which are commonly referred to as the structural, the thematic, and the behavioral components of the test protocol (see Weiner, 2003, Chapters 5 through 7). The structural data consist of various codes used to designate features of how the inkblots have been perceived. These codes categorize such objective perceptual features of responses as where they are seen (e.g., in the whole blot or in some part of it), why they look as they do (e.g., because of their color or because of their shape), what they look like (e.g., a human figure, an animal), and whether they are frequently given responses (coded as popular). In the Rorschach Comprehensive System developed by Exner (2003), response codes are tabulated and combined to yield over 100 summary scores and indices for a test protocol. The interpretation of these summary scores and indices is based on two companion assumptions: first, that the manner in which people perceive the inkblots is a representative sample of how they generally perceive objects and events in their lives; and second, that how people perceive objects and events in their lives—that is, how they look at their world—indicates what kind of people they are, particularly with respect to their tendencies to think, feel, and act in certain ways.

For example, a person who gives responses mainly to the entire blot and rarely to blot details is likely to be someone who attends to global aspects of situations while overlooking their important details; conversely, a respondent who gives abundant detail responses but not many whole responses is probably the type of individual who becomes preoccupied with details and fails to see the forest for the trees. Similarly, people who report a relatively high frequency or percentage of commonly given percepts are more likely to see things and conduct themselves in conventional ways than people whose records contain a high frequency of percepts that rarely occur.

The thematic data in the RIM consist of the imagery with which most respondents embellish their reports of what the inkblots might be. Whereas structural data involve a perceptual process attuned to the stimulus properties of the inkblots (form, color, shading), thematic data derive from an associative process in which characteristics not intrinsic to the blot stimuli are nevertheless attributed to them. For example, human figures may be seen as moving, even though the blots are, in fact, static, and they may be described as being happy or sad, even though there are no objective perceptual indications of their feeling state. The conceptual basis for interpreting such attributions consists of assuming that, since they do not come directly from the blots, they must come from inside the person. Thematic imagery produced by associations to the blot stimuli is accordingly likely to provide clues to a person’s inner life of underlying needs, attitudes, conflicts, and concerns. If movement ascribed to human figures more frequently depicts aggressive than cooperative interactions, for example, the respondent may be inclined to view interpersonal interactions as being more likely to involve competition than collaboration. Similarly, respondents who frequently describe objects as being “broken,” “damaged,” “destroyed,” “worn down,” “rusting away,” or “in ruins” may well be preoccupied with concerns about having been harmed or injured themselves, or being vulnerable to harm or injury, or having become mentally or physically dysfunctional in some way.

Behavioral data in Rorschach testing consist of the manner in which respondents deal with the test situation, handle the test materials, relate to the examiner, and use language in expressing themselves. The conceptual rationale underlying interpretation of the behavioral data in a Rorschach examination closely resembles the guiding principles applicable in all kinds of clinical evaluations: namely, that the manner in which people being evaluated approach the tasks set for them and structure their relationship with the clinician provides a representative sample of their task and interpersonal orientation. As a relatively unstructured verbal interaction, the Rorschach situation creates ample opportunity to sample such orientations. Suppose, for example, a respondent being handed Card I takes it and asks, “Doctor, is it permitted to turn the card?” Suppose another respondent begins by tossing Card I on the table and asking, “Who thought up this stupid test?” These behavioral data would give reason to suspect that the first of these respondents is a passive and dependent individual who is inclined to structure interpersonal situations in an authoritarian way, whereas the second is a negativistic and antagonistic person who may characteristically resist being self-revealing and is fearful of being evaluated.

By integrating these structural, thematic, and behavioral features of the data, contemporary Rorschach assessors generate comprehensive descriptions of a respondent’s personality functioning. These descriptions typically address adaptive strengths and weaknesses in how people manage stress, attend to and perceive their surroundings, form concepts and ideas, experience and express affect, view themselves, and regard other people. Rorschach-based descriptions of personality characteristics, enriched by thematic clues to the content of a respondent’s concerns, facilitate in turn numerous applications of the instrument. As elaborated later in the chapter, these applications include helping to identify the presence and nature of psychological disorder, a person’s need for and amenability to treatment of various kinds, and whether the person is likely to function effectively or inadequately in certain kinds of situations.

DEVELOPMENT AND PSYCHOMETRIC CHARACTERISTICS

Hermann Rorschach died in 1922 at the age of 37, just one year after his monograph was published, leaving his work with his instrument largely unfinished. Many different approaches to Rorschach assessment were subsequently developed in countries where the inkblot method quickly captured the interest of scholars and practitioners, most notably in France, Germany, Italy, and Japan. In the 1920s the Rorschach plates were brought to the United States by David Levy, a psychiatrist who had taken postdoctoral training in Switzerland, and Levy later encouraged a psychology trainee from Columbia University, Samuel Beck, to do some research with the test. Thus encouraged, Beck chose for his doctoral dissertation a Rorschach study of mentally retarded children, and reports of his findings became the first English-language publications on the Rorschach (Beck, 1930a, 1930b). In the ensuing years, Beck’s way of using the inkblot method became one of five major Rorschach systems developed in the United States, the others emerging in the hands of Bruno Klopfer, Marguerite Hertz, Zygmunt Piotrowski, and the tandem of David Rapaport and Roy Schafer. Further information about these historical developments is provided by Ellenberger (1954), Exner (1969, 2003, Chapter 1), Schwarz (1996), and Wolf (2000).

The blossoming of these various Rorschach systems in the United States and abroad enriched the instrument for clinical purposes, but at a cost to its scientific development. The many Rorschach variations created by gifted and respected clinicians severely curtailed cumulative research on the psychometric properties of the instrument. This state of affairs led John Exner in the early 1970s to undertake a standardization of Rorschach administration and coding procedures. Relying on logical analysis and empirical findings, he combined seemingly rational and scientifically sound aspects of each of the five major U.S. systems into a Comprehensive System (CS), which was first published in 1974 (Exner, 1974). Exner subsequently provided CS normative reference data for 600 nonpatient adults ages 19 to 69 examined in various parts of the country and including 18% African American, Asian American, and Hispanic persons; for 1,390 nonpatient children subdivided by ages, from 5 to 16; for 535 adult outpatients presenting a broad variety of symptoms; and for 607 psychiatric inpatients, including 328 persons with a first admission for schizophrenia and 279 persons diagnosed with major depressive disorder (Exner, 2001, Tables 13 through 34).

As revised and enhanced over the years in light of research findings and increased clinical sophistication (see Exner, 1993, 2000; Weiner, 2003), the CS has become by far the most widely used Rorschach method worldwide. The standardization of procedure provided by the CS has fostered substantial advances in knowledge about the RIM, particularly with respect to the psychometric soundness of the instrument as demonstrated by its intercoder agreement, reliability, and validity. This information is presented next, along with some comments concerning the current adequacy of the CS normative reference data largely collected over 20 years ago.

Intercoder Agreement

In constructing the Rorschach CS, Exner included only variables on which his coders could achieve 80% agreement. Subsequent research has confirmed that the CS variables can be reliably coded with at least this level of agreement. Some variables show almost perfect concordance among coders, including whether a response has been given to the whole blot or just part of it, whether it is one of 13 specified popular responses, and whether it involves a single object or a pair of identical objects. Since 1991, the major assessment journals have required Rorschach studies to include evidence of adequate intercoder reliability (see Weiner, 1991), and an undiminished flow of published Rorschach research meeting this requirement bears witness to the psychometric soundness of the instrument in this respect.

Some critics of Rorschach assessment have questioned whether percentage of agreement is satisfactory as a measure of intercoder reliability for multidimensional instruments like the Rorschach and argued for using instead a statistic that corrects for chance agreements, such as kappa or intraclass correlation coefficients (ICCs; Wood, Nezworski, & Stejskal, 1996). This has proved an idle challenge, given that intercoder reliability remains impressive when measured by these coefficients. Meta-analytic reviews and studies with patient and nonpatient samples have identified mean kappa coefficients ranging from .79 to .88 across various CS coding categories, which for kappa coefficients is generally regarded as being in the good to excellent range (Acklin, McDowell, Verschell, & Chan, 2000; McDowell & Acklin, 1996; Meyer, 1997a, 1997b).

In a new data set including 219 protocols from four different samples, Meyer et al. (2002) found a median ICC of .93 for intercoder agreement across 138 regularly occurring Rorschach variables, 134 of which showed excellent chance-corrected reliability. There accordingly appears to be no rational or empirical basis for questioning the conclusion of Viglione and Hilsenroth (2001) that research findings “provide conclusive empirical evidence of strong interrater reliability for the great majority (95%) of Comprehensive System (CS) variables” (p. 452).
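The difference between percentage of agreement and a chance-corrected coefficient can be made concrete with a small calculation. The following is a minimal sketch of Cohen's kappa for two coders; the function names and the ten location codes (W for whole blot, D for detail) are hypothetical illustrations and are not data from any of the studies cited above.

```python
from collections import Counter


def percent_agreement(coder_a, coder_b):
    """Proportion of responses on which two coders assigned the same code."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("need two non-empty code lists of equal length")
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return matches / len(coder_a)


def cohens_kappa(coder_a, coder_b):
    """Agreement corrected for the agreement expected by chance alone."""
    p_observed = percent_agreement(coder_a, coder_b)
    n = len(coder_a)
    freq_a = Counter(coder_a)
    freq_b = Counter(coder_b)
    # Chance agreement: product of the two coders' marginal proportions per code.
    p_chance = sum(freq_a[code] * freq_b[code] for code in freq_a) / (n * n)
    if p_chance == 1.0:
        raise ValueError("kappa is undefined when both coders use one identical code")
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical location codes for ten responses (illustrative only).
coder_a = ["W", "W", "D", "D", "W", "D", "W", "W", "D", "W"]
coder_b = ["W", "W", "D", "W", "W", "D", "W", "W", "D", "W"]

agreement = percent_agreement(coder_a, coder_b)
kappa = cohens_kappa(coder_a, coder_b)
print(round(agreement, 2))  # 0.9
print(round(kappa, 2))      # 0.78
```

In this made-up case the coders agree on 9 of 10 responses, yet kappa is lower than the raw percentage because some of that agreement would be expected by chance given how often each coder uses W.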

Reliability

As summarized by Exner and Weiner (1995, pp. 21–27), the reliability of Rorschach summary scores and indices has been demonstrated in a series of retest studies with both children and adults over intervals ranging from 7 days to 3 years. Almost all of the variables coded in the CS that are conceptualized as relating to trait characteristics show substantial short-term and long-term stability in adults. Retest correlations for most of these variables exceed .75, and some approach .90 (e.g., the affective ratio and the egocentricity index). Only two Rorschach variables show consistently low retest correlations among adults—inanimate movement (m) and diffuse shading (Y)—and both of these variables have traditionally been construed as indices of situational distress. Among children, 3-week retest studies identify stability coefficients similar to those found in adults. Over a 2-year retest interval, however, young people initially fluctuate considerably in their Rorschach scores but then show steadily increasing long-term consistency as they grow older, just as would be expected in light of the gradual consolidation of personality characteristics that occurs during the developmental years. By ages 14 to 16, adolescents display the same level of 2-year retest stability as adults (Exner, Thomas, & Mason, 1985).

Concern has been raised by some authors that, because retest correlations have been published for only a portion of the variables coded in the CS, Rorschach reliability is yet to be established (Garb, Wood, Nezworski, Grove, & Stejskal, 2001). As pointed out by Viglione and Hilsenroth (2001), however, most of the “missing” retest correlations mentioned by these authors involve either (1) composite variables for which component part reliability data are available or (2) infrequently occurring variables with base rates so low as to preclude meaningful statistical treatment. As an illustration of the latter instance, a rarely coded variable like color projection (CP) could well show a frequency of zero in a sample of persons tested twice, in which case the first test would exactly predict the second test, but the resulting perfect correlation would be meaningless as a reliability estimate. As matters stand, currently available retest correlations for all regularly occurring Rorschach variables having interpretive significance for trait dimensions of personality show a degree of reliability that compares favorably with psychometric findings for other frequently used and highly regarded measures, including the Wechsler scales and the Minnesota Multiphasic Personality Inventory (MMPI).

The Rorschach retest data have important implications not only for the reliability of the instrument but for its intercoder agreement and validity as well. With respect to intercoder agreement, the substantial stability coefficients shown by most Rorschach variables attest to good interrater reliability among the many persons who did the coding for these retest studies. The correlation between two sets of scores is statistically limited by their reliability, and large correlations can emerge from a retest study only when both sets of scores have been reliably assigned. Regarding Rorschach validity, the previously mentioned high retest correlations for Rorschach variables considered to measure trait characteristics and low retest correlations for variables posited to measure situational or state characteristics lend construct validity to interpreting these variables as indices of trait or state dimensions of personality, respectively. Similarly, the gradual increase in 2-year retest correlations for most Rorschach variables during the developmental years validates the RIM as a measure of developmental progression in personality consolidation.
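The statistical limit on retest correlations mentioned above can be written out under the usual classical test theory assumption of uncorrelated measurement errors. The symbols below are introduced here for illustration and do not appear in the chapter.

```latex
% r_{12}: observed retest correlation between the first and second testing
% r_{11}, r_{22}: reliability of the coded scores at each testing
% \rho_{T_1 T_2}: correlation between the underlying true scores
r_{12} \;=\; \rho_{T_1 T_2}\,\sqrt{r_{11}\, r_{22}} \;\le\; \sqrt{r_{11}\, r_{22}}

% Worked case: equal reliability r at both testings, observed r_{12} = .90
\sqrt{r \cdot r} \;=\; r \;\ge\; r_{12} \;=\; .90
```

Read this way, a retest correlation near .90 could be observed only if the scores at each testing, coding included, were themselves about that reliable.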

Validity

The present author has elaborated in previous publications three critical considerations in evaluating the validity of the RIM and other multidimensional personality assessment instruments (Weiner, 1977, 1996, 2001a). First, the validity of personality measures that yield multiple scores and indices resides in the correlations of their individual scales with personality characteristics that these scales are intended to measure. Rorschach scales should accordingly be expected to correlate with phenomena or events rooted in personality characteristics, and their validity should not be judged by how well or poorly they predict complex behaviors in which personality characteristics exert only minor influence.

Second, the validity of multidimensional personality assessment instruments should be determined from the correlations of their scales with observed rather than inferred variables. Observed variables consist of directly noted features of how people think, feel, and act; inferred variables are hypotheses about how people are likely to think, feel, and act that are derived from indirect sources of information. Personality assessment instruments are themselves inferential measures, which means that their correlations with each other provide at best only modest indications of how valid they are for explaining or predicting aspects of observed behavior. Hence Rorschach validity studies should be focused on comparing Rorschach findings with observed behavioral manifestations of personality characteristics, not with the results of other assessment instruments.

Third, in addition to addressing the use of specific scales for specific personality-related purposes and correlating these scales with dimensions of observed behavior, validation studies should be conceptually based. In conceptually based personality assessment research, predictions are formulated in terms of particular personality characteristics that are believed to account both for particular test scores that measure these characteristics and for particular behaviors that reflect them. When validation research is designed in this way, significant correlations between test scores and observed behavior identify not only the co-occurrence of certain variables, which provides criterion validity, but also the reasons for this co-occurrence, which provides construct validity. The advantage of construct over criterion validity in psychodiagnostic assessment is the possibility it allows for assessors to comprehend and communicate why observed relationships exist and why accurate predictions hold true.

Turning to substantive data bearing on the validity of Rorschach assessment, the most comprehensive and incisive report available in the literature is a meta-analytic study conducted by Hiller, Rosenthal, Bornstein, Berry, and Brunell-Neuleib (1999; see also Rosenthal, Hiller, Bornstein, Berry, & Brunell-Neuleib, 2001). These investigators selected for their database a random sample of Rorschach and MMPI research studies published from 1977 to 1997 in which there was at least one external (i.e., nontest) variable and in which some reasonable basis had been posited for expecting associations between variables. This selection procedure resulted in examination of 2,276 Rorschach protocols and 5,007 MMPI protocols and led Hiller et al. to the following conclusions:

The validity of the Rorschach and the MMPI as measured by the average effect sizes in these studies is almost identical. The unweighted mean validity coefficients were .29 for Rorschach variables and .30 for MMPI variables, and there is no significant difference between these two validity estimates.
The obtained effect sizes for the RIM and the MMPI are sufficiently large to warrant confidence in using both instruments for their intended purposes. Hiller et al. conclude specifically in this regard that “validity for these instruments [Rorschach and MMPI] is about as good as can be expected for personality tests” (p. 291).
On the average, Rorschach variables are somewhat superior to MMPI variables in predicting behavioral outcomes, such as whether patients remain in or drop out of treatment (mean validity coefficients of .37 and .20, respectively). On the other hand, the MMPI shows higher effect sizes than the RIM in correlating with psychiatric diagnosis and self-reports (.37 vs. .18). These differences probably reflect the particular sensitivity of the RIM to persistent behavioral dispositions and the reliance of the MMPI on self-report methodology similar to that on which psychiatric diagnoses are based.

Two other meta-analytic studies illustrate the validity that can be demonstrated for specific Rorschach scales when they are used appropriately to measure personality characteristics that contribute to people behaving in certain ways. Focusing particularly on treatment variables, Meyer and Handler (1997) examined 20 effect sizes for the Rorschach Prognostic Rating Scale (RPRS) among 752 persons tested at the beginning of therapy. They found an average effect size of .44 between the RPRS and independent ratings of psychological treatment outcomes 1 year later. Working with the Rorschach Oral Dependency (ROD) scale, Bornstein (1999) found an average validity coefficient of .37 for 21 effect sizes in 538 test protocols used to predict independently observed dependency-related behaviors from the ROD. For comparison purposes in appreciating the magnitude of these coefficients, empirically demonstrated correlations between predictor and criterion variables referenced by Meyer et al. (2001) include the following: psychotherapy and subsequent well-being (.32); Hare Psychopathy Checklist scores and subsequent violent behavior (.33); MMPI scale scores and average ability to detect depression or psychotic disorders (.37); Viagra and improved male sexual functioning (.38); MMPI validity scales and detection of underreported psychopathology (.39); cardiac fluoroscopy and diagnosis of coronary artery disease (.43); and weight and height for U.S. adults (.44). Further documentation of the research base demonstrating the validity of Rorschach assessment is presented by Viglione (1999), Viglione and Hilsenroth (2001), and Weiner (2001a).

Normative Reference Data

The previously mentioned publication of normative reference data for the Rorschach CS enhanced the standardization of the instrument and contributed as well to improved decision making based on it. These reference data make it possible to state for a broad range of Rorschach findings how frequently they are likely to occur in nonpatient groups, in persons with various kinds of psychological disorder, and in young people at different ages. In recent years, however, questions have been raised concerning whether Exner’s reference data, collected mainly from 1973 to 1986, adequately reflect societal and age-group changes over the years, as well as altered definitions of disorder and refinements in Rorschach coding. Assessment instruments customarily undergo periodic renorming with such considerations in mind, and the RIM should be no exception. Recent normative studies in California with modest samples of nonpatient children (n = 100) and adults (n = 123) have in fact appeared to show some differences from Exner’s reference data (Hamel, Shaffer, & Erdberg, 2000; Shaffer, Erdberg, & Haroian, 1999), as have nonpatient data being accumulated in a collaborative 12-country international study (Erdberg & Shaffer, 1999, 2001).

The similarities to Exner’s norms in these California and cross-cultural studies far outweigh the differences, but there are nevertheless some deviations from the older data to be explained. The most likely explanation of these deviations points to methodological differences in from whom, by whom, and in what manner these reference data were collected. The participant samples in these studies differ in their educational level and in how adequately they represent the population from which they were drawn, and the examiners who did the testing vary among the studies in their level of experience in Rorschach administration. In elaborating on the implications of these and other issues in normative data collection, the present author has also noted that volunteer nonpatient respondents may not become seriously enough engaged in the Rorschach task to provide reliable reference data unless they are given adequate ego-involving instructions (Weiner, 2001b).

A second possible explanation for divergence in normative reference data, specifically applicable to cross-cultural studies, involves two alternative reasons why group findings might differ. On one hand, some personality characteristics may be manifest differently in different cultures, in which case Rorschach criteria for inferring these characteristics have to be adjusted on the basis of culture-specific norms. For example, a variable associated with dysphoria in one country when its frequency exceeds two may not become a marker for dysphoria in some other country until it exceeds three. As the alternative possibility, Rorschach criterion scores for inferring personality characteristics may be universal, in which case observed cross-cultural variations in Rorschach findings are reflecting actual cultural differences in modal personality patterns. Ephraim (2000) discusses in detail these alternative explanations for cross-cultural differences in Rorschach findings and describes the kinds of future research necessary to unravel them.

Presently in progress is a new normative data collection project being undertaken by Exner in the United States specifically for the purpose of updating the CS reference information. As in Exner’s earlier normative work, respondents are being solicited to provide a demographically representative sample, and they are being tested by experienced examiners proceeding with a uniform and carefully formulated set of instructions. Exner (2002) has published the findings for the first 175 persons tested in this project, and, despite the passage of time and concerns to the contrary, the new data thus far closely resemble the older CS reference data. Of particular note, conclusions by Wood, Nezworski, Garb, and Lilienfeld (2001) that the currently available CS norms are inaccurate and lead to excessive inference of psychopathology appear to be unwarranted. In the new and carefully collected Exner data, only 1 of the 175 nonpatients showed an elevation on the CS index for perceptual and thinking disorder (PTI > 2), only 16% elevated on the CS index for depression (DEPI > 4), and only 6% showed CS indices of deficient coping skills (CDI > 3).

Like the retest reliability findings for CS variables, the reference norms also bear implicit witness to the validity of Rorschach assessment. Two examples will be given here, one referring to degree of psychological disturbance and the other to developmental changes over time. The four adult groups for which Exner (2001) presents reference data (nonpatients, outpatients, hospitalized persons with major depressive disorder, and hospitalized persons with first admission schizophrenia) can reasonably be considered to represent increasing degrees of psychological disturbance. Two major Rorschach indices of psychological disturbance are X–% (an index of impaired reality testing) and WSum6 (an index of disordered thinking). If X–% and WSum6 are valid measures of disturbance, they would be expected to increase in linear fashion across these four reference groups—which they do. The mean value for X–% increases from .07 (nonpatients) to .16 (outpatients), .20 (inpatient depressives), and .37 (inpatient schizophrenics). The mean WSum6 values for the four groups, respectively, are 4.48, 9.36, 18.36, and 42.17 (Exner, 2001, Chapter 11).

As for developmental changes, young people are known to become decreasingly self-centered (i.e., less egocentric) and increasingly capable of moderating their affect (i.e., less emotionally intense). Self-centeredness is shown on the RIM by the egocentricity index and affect moderation by the balance between indices of relatively mature emotionality (FC) and relatively immature emotionality (CF). If these variables are valid measures of what they are posited to measure, their average values should change in the expected direction among children and adolescents at different ages—which they do. In the CS reference data, the mean egocentricity index is .67 at age 6 and decreases in almost linear fashion to .43 at age 16, which is just slightly higher than the adult mean of .40. The mean for FC increases steadily over time from 1.11 at age 6 to 3.43 at age 16 (compared to an adult mean of 3.56), while mean CF declines from 3.51 to 2.78 between age 6 and 16 (the adult mean is 2.41).
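The two normative examples above rest on a simple ordering argument, which can be checked directly against the means quoted in the text. The sketch below is only an ordering check on those quoted values; the helper names are hypothetical, and the check does not test linearity or statistical significance.

```python
def is_strictly_increasing(values):
    """True when each value is larger than the one before it."""
    return all(a < b for a, b in zip(values, values[1:]))


def is_strictly_decreasing(values):
    """True when each value is smaller than the one before it."""
    return all(a > b for a, b in zip(values, values[1:]))


# Means quoted in the text, ordered: nonpatients, outpatients,
# inpatient depressives, inpatient schizophrenics.
x_minus_pct = [0.07, 0.16, 0.20, 0.37]
wsum6 = [4.48, 9.36, 18.36, 42.17]

# Means quoted in the text, ordered: age 6, age 16, adults.
egocentricity = [0.67, 0.43, 0.40]
fc = [1.11, 3.43, 3.56]
cf = [3.51, 2.78, 2.41]

checks = {
    "X-% rises with degree of disturbance": is_strictly_increasing(x_minus_pct),
    "WSum6 rises with degree of disturbance": is_strictly_increasing(wsum6),
    "egocentricity index falls with age": is_strictly_decreasing(egocentricity),
    "FC rises with age": is_strictly_increasing(fc),
    "CF falls with age": is_strictly_decreasing(cf),
}

for label, passed in checks.items():
    print(f"{label}: {passed}")
```

All five checks hold for the quoted means, which is the pattern the validity argument predicts.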

UTILITY OF RORSCHACH APPLICATIONS

By virtue of its nature as a personality assessment instrument, the RIM is useful in helping to make decisions that are based in substantial part on personality characteristics. Need for personality-based decisions arises frequently in the practice of clinical, forensic, and organizational psychology, which accordingly provide the three main contexts for Rorschach applications.

Clinical Practice

Rorschach assessment contributes to clinical practice by facilitating differential diagnosis and treatment planning. Although not itself a diagnostic test, the RIM can through its various scales and indices identify states and traits that are considered to characterize particular conditions. To the extent that paranoia involves being hypervigilant and interpersonally aversive, for example, Rorschach indications of hypervigilance and interpersonal aversion (HVI) will suggest the presence of paranoid features in a respondent. Because schizophrenia is typically defined by disordered thinking and impaired reality testing, Rorschach evidence of these cognitive difficulties (poor form quality and an elevation in critical special scores) usually points to the possibility of a schizophrenia spectrum disorder. Depressive disorder is suggested by Rorschach indices of dysphoria (achromatic color and color-shading blends) and pessimistic thinking (morbids), obsessive-compulsive personality disorder by indices of pedantry and perfectionism (obsessive index), and so on (see Weiner, 1998).

In addition to helping establish the diagnostic status of persons seen clinically, Rorschach findings contribute to treatment planning by measuring personality characteristics that have a bearing on numerous decisions that must be made prior to and during an intervention process. As elaborated elsewhere (Weiner, 1999b), these decisions include whether a respondent is functioning sufficiently well to be treated as an outpatient or so poorly as to require inpatient care; whether the person’s coping capacities and level of adjustment call primarily for a supportive approach designed to relieve distress or an exploratory approach intended to enhance self-understanding; what kinds of problems or concerns the person has that should be identified as treatment targets and with what priority these targets should be addressed; what kinds of obstacles to progress in the treatment can be anticipated on the basis of the patient’s interpersonal needs and attitudes; and whether a person in treatment is ready for termination.

Forensic Practice

Parallel to the way clinical diagnosis with the RIM proceeds through linkages between personality characteristics that typify certain disorders and Rorschach variables that measure these characteristics, forensic applications of the instrument derive from translation of legal concepts into psychological terms. In criminal law, for example, competency to stand trial is based primarily on whether defendants can appreciate the nature of the charges against them and assist in their own defense. Sanity is determined solely or in part by whether defendants were able at the time of their alleged offense to recognize the wrongfulness of their actions. The factors to be considered in addressing competence and sanity include whether a defendant, because of a psychotic impairment of reality testing, is presently or was likely at some previous time to have been incapable of perceiving experience accurately. In civil law cases involving allegations of personal injury, a key legal question concerns the extent of psychological damage a plaintiff may have sustained as a consequence of some dereliction of duty by the defendant. The extent of psychological damage in such cases can often be measured by the severity of indices of anxiety, depression, or stress disorder that appear to have arisen subsequent to the alleged injury. Because both impaired reality testing and emotional upset can readily be identified from Rorschach data, the instrument can be usefully applied in these types of forensic cases.

Even more so than in criminal and personal injury cases, Rorschach assessment can prove valuable in family law cases involving termination of parental rights or determination of custody and visitation privileges. There is general agreement concerning kinds of personality characteristics that contribute to a person’s being a good parent, such as being relatively free of psychological disorder, possessed of reasonably good self-control and frustration tolerance, and capable of forming close and nurturing relationships with other people. In addition to measuring such personality characteristics in adults, Rorschach data can also help to identify special needs or adjustment problems in their offspring that will have a bearing on the court’s decision concerning to whom responsibility for them should be assigned.

Organizational Practice

Rorschach assessment in organizational practice is concerned primarily with the selection and evaluation of personnel. Personnel selection typically consists of determining whether a person applying for a position in an organization is a suitable candidate to fill it, or whether a person already in the organization is qualified for promotion to a position of increased responsibility. Standard psychological procedure in making these selection decisions consists of first identifying the personality requirements for success in the position in question and then determining the extent to which a candidate shows these personality characteristics. For example, a leadership position requiring initiative and rapid decision making would probably not be filled well by a person who is behaviorally passive and given to painstaking care in coming to conclusions, both of which are personality characteristics measured by Rorschach indices (p > a+1 and Zd > +3.0, respectively).
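The two indices named in this example are simple cutoff comparisons, and a brief sketch may make their form concrete. It assumes the cutoffs are p > a + 1 (passive movement responses exceeding active ones by more than one) and Zd > +3.0. The class and function names are hypothetical, and the sketch is not a substitute for interpretation within a full protocol.

```python
from dataclasses import dataclass


@dataclass
class MovementSummary:
    """Hypothetical container for three summary scores from a coded record."""
    active: int    # a: number of active movement responses
    passive: int   # p: number of passive movement responses
    zd: float      # Zd: processing efficiency difference score


def is_behaviorally_passive(summary):
    """Assumed cutoff: passive exceeds active by more than one (p > a + 1)."""
    return summary.passive > summary.active + 1


def is_painstaking(summary):
    """Assumed cutoff: Zd greater than +3.0."""
    return summary.zd > 3.0


# Illustrative (invented) summary scores, not data from any real respondent.
candidate = MovementSummary(active=2, passive=5, zd=4.5)
flags = (is_behaviorally_passive(candidate), is_painstaking(candidate))
```

In this invented case both flags are raised, which is the pattern the paragraph describes as a poor fit for a position demanding initiative and rapid decisions.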

Personnel evaluation usually involves assessing the current fitness for duty of persons whose ability to function has become impaired by psychological disorder. Most common in this regard is the onset of an anxiety, depressive, or stress disorder that prevents people from continuing to perform their job or practice their profession as competently as they had previously. Impaired professionals seen for psychological evaluation have also frequently had difficulties related to abuse of alcohol, drugs, or prescription medication. Because Rorschach data help to identify the extent to which people are anxious or depressed and whether they are struggling with more stress than they can manage, the RIM can often contribute to determinations of fitness for duty and to assessment of recovered capacity in impaired personnel participating in a treatment or rehabilitation program.

The kinds of applications to which the RIM contributes by virtue of measuring personality characteristics identify its limitations as well. In assessing psychopathology, for example, Rorschach data are of little use in determining the particular symptoms a person is manifesting. Someone with Rorschach indications of an obsessive-compulsive personality style may be a compulsive hand-washer, an obsessive prognosticator, or neither; someone with depressive preoccupations may be having crying spells, disturbed sleep, or neither. There is no isomorphic relationship between the personality characteristics of disturbed people and their specific symptoms. Accordingly, the nature of these symptoms is better determined from observing or asking directly about them than by speculating about their presence from Rorschach data. Similarly, Rorschach data do not indicate dependably whether a respondent has had certain life experiences or behaved in a particular way, unless there is a substantial known correlation between specific personality characteristics and the likelihood of these experiences or behavior having occurred. As a related general rule, the predictive validity of Rorschach findings will always be limited by the extent to which personality factors determine whatever is to be identified or predicted.

LEGAL ISSUES

The previously mentioned forensic applications of Rorschach assessment have raised legal issues concerning the utility of Rorschach findings in courtroom proceedings and the admissibility of these findings into evidence. With respect to their utility, Rorschach findings bring to expert mental health testimony three valuable types of information. First, the quantified indices and normative reference data provided by the CS allow examiners to specify in numerical terms the extent to which certain personality characteristics are present, such as a person’s level of reality testing and subjectively felt distress. Second, because the relatively unstructured nature of the RIM limits respondents’ awareness of what their percepts might signify, Rorschach responses often reflect aspects of personality functioning that people do not recognize in themselves or are reluctant to reveal during an interview or on a self-report inventory.

Third, the indirect manner in which the RIM measures personality states and traits makes it difficult for respondents to manufacture a false impression of themselves. Respondents trying to look more disturbed or impaired than they actually are typically overdo their efforts to appear incapacitated in ways that are obvious to experienced examiners. Respondents attempting to deny or conceal psychological difficulties may succeed in keeping these difficulties hidden, but they usually do so in ways that identify their guardedness and call into question the reliability of the test data they have given. This sensitivity of the RIM to attempted impression management, together with its quantification of personality characteristics and its capacity to transcend a respondent’s conscious awareness and intent, often generate forensically critical data that would not otherwise have become available.

The admissibility into evidence of Rorschach findings is determined according to legal guidelines that address (1) whether the expert witness testimony is likely to help the judge or jury make their decision and (2) whether this testimony is based on methods and principles that are scientifically reliable and generally accepted in the professional community (see Hess, 1999). With specific respect to Rorschach testimony, McCann (1998) has shown in a detailed analysis of contemporary research and practice that the RIM can and should fall within these guidelines for admissibility.

Consistent with McCann’s analysis, survey data indicate that the RIM has in fact been welcome in the courtroom. Weiner, Exner, and Sciara (1996) surveyed 93 Rorschach clinicians about their experience testifying during the previous 5 years in 4,024 criminal cases, 3,052 custody cases, and 858 personal injury cases. In only 6 (.08%) of these almost 8,000 cases was the integrity of the RIM seriously challenged, and in only 1 (.01%) instance was the Rorschach testimony ruled inadmissible into evidence. Meloy, Hansen, and Weiner (1997) examined Rorschach citations found in 247 cases heard between 1945 and 1995 in state, federal, and military courts of appeal. In only 26 (10.5%) of these cases did the reliability or validity of the Rorschach findings become an issue, and typically in these instances the questions that arose concerned the examiner’s interpretation of the data, not the nature of the instrument.
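The percentages reported in these two surveys follow from the case counts given in the text, as the short calculation below illustrates. The helper name and variable names are assumptions introduced for this sketch.

```python
# Case counts copied from the text (Weiner, Exner, & Sciara, 1996).
CASES = {"criminal": 4024, "custody": 3052, "personal injury": 858}


def percent(part, whole, digits):
    """Return part/whole expressed as a percentage, rounded to `digits` decimals."""
    return round(100 * part / whole, digits)


total_cases = sum(CASES.values())               # the "almost 8,000" cases
challenged_pct = percent(6, total_cases, 2)     # integrity seriously challenged
inadmissible_pct = percent(1, total_cases, 2)   # testimony ruled inadmissible

# Appellate citations (Meloy, Hansen, & Weiner, 1997): 26 of 247 cases.
appellate_issue_pct = percent(26, 247, 1)
```

The total comes to 7,934 cases, and the three rounded percentages agree with the .08%, .01%, and 10.5% figures cited above.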

Despite this background of conceptual analysis and empirical findings, Grove and Barden (1999) have asserted that Rorschach assessment using the Comprehensive System is not sufficiently relevant and reliable to provide an admissible basis for courtroom testimony. In making this assertion, however, these authors overlooked or minimized considerable evidence to the contrary. Noteworthy in this regard are (1) the just mentioned indications that Rorschach testimony is in fact regularly admitted into evidence; (2) the previously noted equivalence of the average validity coefficients found for the RIM and the MMPI, the latter being the measure generally regarded in forensic circles as the gold standard of clinical personality assessment; and (3) the substantial body of empirical data demonstrating the relevance and reliability of the RIM in evaluating specific aspects of personality functioning. These and other shortcomings of the Grove and Barden critique are elaborated by Ritzler, Erard, and Pettigrew (2002), who update the McCann analysis with additional data showing how Rorschach assessment satisfies several specific criteria for admissibility, including being standardized, testable, valid, reliable, extensively peer reviewed, associated with a reasonable error rate, accepted by a substantial scientific community, and relevant to a wide range of forensic issues.

CROSS-CULTURAL CONSIDERATIONS

Because of the nonverbal nature of its stimuli, the RIM is a culture-free measure that can be used in essentially identical fashion with persons in all walks of life and from all parts of the world, whatever their racial or ethnic background. Mirroring the fact that dimensions of personality are universal phenomena, psychological states and traits are reflected in the same Rorschach structural data wherever and whenever the test is administered. Among all groups of people, for example, some are more reflective than expressive in how they deal with experience, whereas others are more expressive than reflective, and this individual difference is universally measured on the Rorschach by the preferences respondents show for attributing human movement to the blots or for reacting to their chromatic features. Likewise, to reprise earlier examples, preoccupation with Rorschach details indicates difficulty forming global impressions, infrequent popular responses indicate unconventional perspectives, and numerous inaccurately perceived forms indicate impaired reality testing, no matter who the respondent is. Like a musician, then, who can read a musical score and play it properly anywhere in the world, a knowledgeable Rorschach clinician can translate any set of Rorschach scores and indices, no matter from whom obtained, into accurate inferences about the individual’s personality characteristics.

Nevertheless, adequate cross-cultural application of Rorschach assessment does require attention to four considerations that go beyond the interpretive significance of the structural data for particular personality characteristics. First, because the Rorschach task involves a verbal interaction, the test should always be administered in the native language of both the examiner and the respondent, or at least in a second language with which both are thoroughly familiar. Idiomatic expressions and subtleties of language usage can interfere sufficiently with communication to cast doubt on the validity of a Rorschach administration conducted through an interpreter or in a shaky second language for either participant. Second, the types of thematic imagery respondents produce and the symbolic significance they attach to particular objects and events are typically rooted in their cultural heritage. Examiners must accordingly be sufficiently sensitive to a respondent’s background to grasp the likely meaning of the person’s fantasy productions and judge whether there is anything strange or unusual in how they are being expressed.

Third, aside from the previously mentioned and as yet unresolved question of whether Rorschach interpretation should be guided by culture-specific normative standards, Rorschach scoring includes three codes that are based on population norms and may show cross-cultural differences. The popular (P) code in the CS is given to 13 responses that were found to occur in one third or more of 7,500 records in Exner’s U.S. database. There is reason to believe that similarly developed lists of populars in other countries will closely resemble but not necessarily be identical to the CS list (Mattlar & Fried, 1993; Nakamura, 2001). This means that the number of Ps coded for a record may vary with a respondent’s nationality. The other two normatively based CS codes that could be affected by cross-cultural norms pertain to whether a response has been given to a common or an unusual blot detail (based on a cutting score of 5% frequency of occurrence in the database) and whether a response consists of an object ordinarily seen as having the form of a blot or blot detail (based on 2% or more normative frequency of occurrence).
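Because all three of these codes rest on frequency cutoffs applied to a reference database, the same response could be coded differently against a different national sample. The sketch below illustrates only the cutoff logic described in the text. The function and constant names are hypothetical, and treating the 5% detail cutoff as inclusive is an assumption, since the text does not say how a response at exactly 5% is handled.

```python
from fractions import Fraction

# Cutoffs stated in the text, held as exact fractions to avoid rounding error.
POPULAR_MIN = Fraction(1, 3)           # "one third or more" of records
COMMON_DETAIL_MIN = Fraction(5, 100)   # 5% cutting score (inclusiveness assumed)
ORDINARY_FORM_MIN = Fraction(2, 100)   # "2% or more" normative frequency


def meets_cutoff(count, total, cutoff):
    """Return True if count/total is at or above the given cutoff."""
    if total <= 0:
        raise ValueError("total must be a positive number of records")
    return Fraction(count, total) >= cutoff


# With the 7,500 records mentioned in the text, the popular cutoff falls at 2,500.
popular_at_2500 = meets_cutoff(2500, 7500, POPULAR_MIN)
popular_at_2499 = meets_cutoff(2499, 7500, POPULAR_MIN)
```

A response seen in 2,500 of 7,500 records would qualify as popular under this rule and one seen in 2,499 would not, which shows how modest differences in response frequency between samples can change the list of populars.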

Fourth, the implications of Rorschach interpretations for the quality of a respondent’s life adjustment may depend on the kinds of personality characteristics that are valued by the society in which the person lives. Some personality characteristics usually prove advantageous or maladaptive in almost any surroundings, whereas the impact of others varies with cultural expectations and preferences. For example, good reality testing as inferred from accurate perception of the inkblots probably contributes to successful adaptation in any cultural context, and impaired reality testing is very likely a universal impediment to getting along well. By contrast, Rorschach indications of being a relatively passive, dependent, condescending, self-effacing, and altruistic person are more likely to be associated with good adjustment in a communal, group-oriented, and noncompetitive society than in a society that rewards assertiveness, individual achievement, and a self-centered focus. Although the interpretive significance of Rorschach data involves universally applicable descriptions of personality characteristics, then, the adaptive significance of these descriptions will be relative to the cultural surround.

COMPUTERIZATION OF RESULTS

Because of the interactive nature of Rorschach testing and the virtually infinite variability of the verbal responses it generates, little progress has been made in automating the administration of the RIM or the coding of individual responses. However, with the raw test data collected and the response codes determined, there are software programs to assist in the scoring and interpretation of Rorschach protocols. These programs print out a list of the response codes entered as the raw data, a table showing the summary scores and indices calculated from these codes, and a narrative interpretive report consisting of descriptive statements based on these summary scores and indices. Being derived solely from coded responses, these interpretive statements capture mainly the implications of the structural data in a Rorschach assessment. They note the potential significance of the thematic data only when codes are assigned to it (as in the previously mentioned instances of coding thematic imagery for morbid or aggressive content), and they do not take account of any behavioral data.

Rorschach computerization is exemplified by the current version of the Rorschach Interpretation Assistance Program (RIAP4 Plus) developed by Exner & Weiner (2001). Like computer-based test interpretation (CBTI) programs developed for other instruments, the RIAP is based on a combination of empirical findings and clinical judgment concerning the behavioral correlates of particular test patterns. Also in common with other CBTI programs, the narrative interpretive statements generated by RIAP do not necessarily or in all respects describe the individual respondent whose codes have been entered. Instead, these interpretations apply in general to people who give similar kinds of responses, and all individual respondents are likely to differ in some specific respects from persons with whom they otherwise have much in common. Accordingly, a typical computer-generated narrative will contain some statements that are clearly not applicable to the person who was examined and some statements that are not completely consistent with other statements. The specific implications of automated interpretive statements must therefore always be evaluated in light of interrelationships within the test data, the individual’s life context, and information from other sources concerning his or her personality functioning (Butcher, 2002).

CURRENT AND FUTURE STATUS

The status of assessment instruments is typically reflected in the frequency with which they are used and studied. As reviewed by Camara, Nathan, and Puente (2000); Viglione and Hilsenroth (2001); and Weiner (1999b), numerous surveys over the past 40 years have consistently shown substantial endorsement of Rorschach testing as a valuable skill to teach, learn, and practice. These surveys indicate that over 80% of clinical psychologists engaged in providing assessment services use the RIM in their work and believe that clinical students should be competent in Rorschach assessment, that over 80% of graduate programs teach the RIM, and that students find this training helpful in improving their understanding of their patients and developing other clinical skills. In recent comprehensive surveys of predoctoral internship sites, training directors commonly assigned considerable value to Rorschach testing, indicated that it was one of the three measures most frequently used in their test batteries (along with the Wechsler Adult Intelligence Scale [WAIS]/Wechsler Intelligence Scale for Children [WISC] and Minnesota Multiphasic Personality Inventory-2 [MMPI-2]/MMPI-Adolescent [MMPI-A]), and expressed a desire for their incoming interns to have had a Rorschach course or arrive with a good working knowledge of the instrument (Clemence & Handler, 2001; Stedman, Hatch, & Schoenfeld, 2000).

Survey findings indicate that Rorschach assessment has gained an established place in forensic as well as clinical practice. Data collected from forensic psychologists by Ackerman and Ackerman (1997), Boccaccini and Brodsky (1999), and Borum and Grisso (1995) showed 30% using the RIM in evaluations of competency to stand trial, 32% in evaluations of criminal responsibility, 41% in evaluations of personal injury, and 48% in evaluations of adults involved in custody disputes.

As for study of the instrument, the scientific status of the RIM has been attested over many years by a steady and substantial volume of published research concerning its nature and utility. Buros’s (1974) Tests in Print II identified 4,580 Rorschach references through 1971, with an average yearly rate of 92 publications. In the 1990s, Butcher and Rouse (1996) found an almost identical trend continuing from 1974 to 1994. An average of 96 Rorschach research articles appeared annually during this 20-year period in journals published in the United States, and the RIM was second only to the MMPI among personality assessment measures in the volume of research it generated. As implied in the earlier mention of cross-cultural collaboration in normative data collection, there is also a large international community of Rorschach scholars and practitioners whose research published in languages other than English has for many years made important contributions to the literature (see Weiner, 1999a). The international presence of the RIM is reflected in a survey of test use in Spain, Portugal, and Latin American countries by Muniz, Prieto, and Almeida (1998), in which the Rorschach emerged as the third most widely used psychological assessment instrument, following the Wechsler intelligence scales and versions of the MMPI. Finally of note in this regard, an international society for Rorschach and projective methods has been in existence since 1947, and triennial congresses sponsored by this society typically attract participants from over 30 countries.

Despite the information presented in this chapter concerning the psychometric soundness and numerous applications of the RIM and the frequency with which it is used and studied, not all psychologists look favorably on Rorschach assessment. Particularly in academic circles, there are some who remain unconvinced of its reliability and validity and argue against its being taught or studied in university programs (see Lilienfeld, Wood, & Garb, 2000). Let it be said that the RIM, like virtually all instruments used in psychological assessments, is neither perfectly understood nor the ultimate answer to all questions. Like all widely used tests in psychology, it is more valid for some purposes than others and awaits further research to clarify its characteristics and corollaries. As Meyer and Archer (2001) conclude in the most recent summary of the empirical evidence available at the time of this writing, “Given this evidence, and the limitations inherent in any assessment procedure, there is no reason to single out the Rorschach for praise or criticism” (p. 499). Regrettably, however, intractable Rorschach critics often appear immune to persuasion by the continuing accumulation of research data confirming the scientific merit of the instrument, and they often seem unacquainted with the practical utility of Rorschach findings, which would not exist if it were an unreliable or invalid instrument. Reviewing the Rorschach in the current edition of the Mental Measurements Yearbook, Hess, Zachar, and Kramer (2001) concur that “the Rorschach, employed with the Comprehensive System, is a better personality test than its opponents are willing to acknowledge” (p. 1037).

The future of Rorschach assessment holds some risk that its critics will curtail its teaching in those academic settings where their views are influential. Any such silencing of Rorschach instruction would be regrettable. As would be true for any widely used and apparently helpful method that is not yet perfectly understood or completely validated, who will be capable of pursuing an appropriate research agenda if no one is being taught to use it appropriately? Among knowledgeable assessment psychologists, however, there is no indication of flagging interest in using the RIM clinically or doing research with it. The literature is providing a constant flow of fresh ideas and improved guidelines for the practical application of Rorschach findings, and accumulating research results are steadily strengthening the psychometric foundations of the instrument and expanding comprehension of how it works. Societies around the world concerned with Rorschach assessment are thriving, and seminars and workshops on the Rorschach method continue to attract a large audience. The current status of Rorschach assessment, 80 years old at the time of this writing, appears healthy, vigorous, and poised for continued enhancement in the twenty-first century.

REFERENCES

Ackerman, M.J., & Ackerman, M.C. (1997). Custody evaluations in practice: A survey of experienced professionals (revisited). Professional Psychology, 28, 137–145.
Acklin, M.W., McDowell, C.J., Verschell, M.S., & Chan, D. (2000). Interobserver agreement, intraobserver agreement, and the Rorschach Comprehensive System. Journal of Personality Assessment, 74, 15–57.
Beck, S.J. (1930a). The Rorschach test and personality diagnosis. American Journal of Psychiatry, 10, 19–52.
Beck, S.J. (1930b). Personality diagnosis by means of the Rorschach test. American Journal of Orthopsychiatry, 1, 81–88.
Boccaccini, M.T., & Brodsky, S.L. (1999). Diagnostic test usage by forensic psychologists in emotional injury cases. Professional Psychology, 30, 253–259.
Bornstein, R.F. (1999). Criterion validity of objective and projective dependency tests: A meta-analytic assessment of behavioral prediction. Psychological Assessment, 11, 48–57.
Borum, R., & Grisso, T. (1995). Psychological test use in criminal forensic evaluations. Professional Psychology, 26, 465–473.
Buros, O.K. (Ed.) (1974). Tests in print II. Highland Park, NJ: Gryphon Press.

Butcher, J.N. (2002). How to use computer based reports. In J.N. Butcher (Ed.), Clinical personality assessment (2nd ed., pp. 109–126). New York: Oxford.
Butcher, J.N., & Rouse, S.V. (1996). Personality: Individual differences and clinical assessment. Annual Review of Psychology, 47, 87–111.
Camara, W., Nathan, J., & Puente, A. (2000). Psychological test usage: Implications in professional use. Professional Psychology, 31, 141–154.
Clemence, A., & Handler, L. (2001). Psychological assessment on internship: A survey of training directors and their expectations for students. Journal of Personality Assessment, 76, 18–47.
Ellenberger, H.F. (1954). The life and work of Hermann Rorschach (1884–1922). Bulletin of the Menninger Clinic, 18, 173–219.
Ephraim, D. (2000). Culturally relevant research and practice with the Rorschach Comprehensive System. In R.H. Dana (Ed.), Handbook of cross-cultural and multicultural personality assessment (pp. 303–328). Mahwah, NJ: Erlbaum.
Erdberg, P., & Shaffer, T.W. (1999, August). International symposium on Rorschach nonpatient data: Findings from around the world. Paper presented at the XVI International Congress of Rorschach and Projective Methods, Amsterdam, The Netherlands.
Erdberg, P., & Shaffer, T.W. (2001, March). International symposium on Rorschach nonpatient data: Worldwide findings. Symposium conducted at the meeting of the Society for Personality Assessment, Philadelphia, PA.
Exner, J.E., Jr. (1969). The Rorschach systems. New York: Grune & Stratton.
Exner, J.E., Jr. (1974). The Rorschach: A comprehensive system. New York: Wiley.
Exner, J.E., Jr. (2000). A primer for Rorschach interpretation. Asheville, NC: Rorschach Workshops.
Exner, J.E., Jr. (2001). A Rorschach workbook for the Comprehensive System (5th ed.). Asheville, NC: Rorschach Workshops.
Exner, J.E., Jr. (2002). A new nonpatient sample for the Rorschach Comprehensive System: A progress report. Journal of Personality Assessment, 78, 391–404.
Exner, J.E., Jr. (2003). The Rorschach: A comprehensive system: Volume 1. Basic foundations and principles of interpretation (4th ed.). Hoboken, NJ: Wiley.
Exner, J.E., Jr., Thomas, E.A., & Mason, B. (1985). Children’s Rorschachs: Description and prediction. Journal of Personality Assessment, 49, 13–20.
Exner, J.E., Jr., & Weiner, I.B. (1995). The Rorschach: A comprehensive system: Volume 3. Assessment of children and adolescents (2nd ed.). New York: Wiley.
Exner, J.E., Jr., & Weiner, I.B. (2001). Rorschach Interpretation Assistance Program: Version 4 Plus for Windows (RIAP4 Plus). Odessa, FL: Psychological Assessment Resources.
Garb, H.N., Wood, J.M., Nezworski, M.T., Grove, W.M., & Stejskal, W.J. (2001). Towards a resolution of the Rorschach controversy. Psychological Assessment, 13, 433–448.
Grove, W.M., & Barden, R.C. (1999). Protecting the integrity of the legal system: The admissibility of testimony from mental health experts under Daubert/Kumho analyses. Psychology, Public Policy, and the Law, 5, 224–242.
Hamel, M., Shaffer, T.W., & Erdberg, P. (2000). A study of nonpatient preadolescent Rorschach protocols. Journal of Personality Assessment, 75, 280–294.
Hess, A.K. (1999). Serving as an expert witness. In A.K. Hess & I.B. Weiner (Eds.), Handbook of forensic psychology (2nd ed., pp. 521–555). New York: Wiley.
Hess, A.K., Zachar, P., & Kramer, J. (2001). Rorschach. In B.S. Plake & J.S. Impara (Eds.), Fourteenth mental measurements yearbook (pp. 1033–1038). Lincoln: University of Nebraska Press.
Hiller, J.B., Rosenthal, R., Bornstein, R.F., Berry, D.T.R., & Brunell-Neuleib, S. (1999). A comparative meta-analysis of Rorschach and MMPI validity. Psychological Assessment, 11, 278–296.
Lilienfeld, S.O., Wood, J.M., & Garb, H.N. (2000). The scientific status of projective techniques. Psychological Science in the Public Interest, 1, 27–66.
Mattlar, C-E., & Fried, R. (1993). The Rorschach in Finland. Rorschachiana, 18, 105–125.
McCann, J.T. (1998). Defending the Rorschach in court: An analysis of admissibility using legal and professional standards. Journal of Personality Assessment, 70, 135–144.
McDowell, C.J., & Acklin, M.W. (1996). Standardizing procedures for calculating Rorschach interrater reliability. Journal of Personality Assessment, 66, 308–332.
Meloy, J.R., Hansen, T.L., & Weiner, I.B. (1997). Authority of the Rorschach: Legal citations during the past 50 years. Journal of Personality Assessment, 69, 53–62.
Meyer, G.J. (1997a). Assessing reliability: Critical correlations for a critical examination of the Rorschach Comprehensive System. Psychological Assessment, 9, 480–489.
Meyer, G.J. (1997b). Thinking clearly about reliability: More critical correlations regarding the Rorschach Comprehensive System. Psychological Assessment, 9, 495–498.
Meyer, G.J., & Archer, R.P. (2001). The hard science of Rorschach research: What do we know and where do we go? Psychological Assessment, 13, 486–502.
Meyer, G.J., Finn, S.E., Eyde, L.D., Kay, G.G., Moreland, K.L., Dies, R.R., et al. (2001). Psychological testing and psychological assessment: A review of evidence and issues. American Psychologist, 56, 128–165.
Meyer, G.J., & Handler, L. (1997). The ability of the Rorschach to predict subsequent outcome: Meta-analysis of the Rorschach Prognostic Rating Scale. Journal of Personality Assessment, 69, 1–38.
Meyer, G.J., Hilsenroth, M.J., Baxter, D., Exner, J.E., Jr., Fowler, J.C., Pers, C.C., et al. (2002). An examination of interrater reliability for scoring the Rorschach Comprehensive System in eight data sets. Journal of Personality Assessment, 78, 219–274.
Muniz, J., Prieto, G., & Almeida, L. (1998, August). Test use in Spain, Portugal, and Latin American countries. Paper presented at the 24th International Congress of Applied Psychology, San Francisco, CA.
Nakamura, N. (2001, March). Popular responses of 450 Japanese nonpatients compared to the U.S. and Spain. Paper presented at the meeting of the Society for Personality Assessment, Philadelphia, PA.
Ritzler, B., Erard, R., & Pettigrew, G. (2002). Protecting the integrity of Rorschach expert witnesses: A reply to Grove and Barden (1999) re: The admissibility of testimony under Daubert/Kumho analyses. Psychology, Public Policy, and the Law, 8, 201–215.
Rorschach, H. (1942). Psychodiagnostics: A diagnostic test based on perception. Bern, Switzerland: Hans Huber. (Original work published 1921)
Rosenthal, R., Hiller, J.B., Bornstein, R.F., Berry, D.T.R., & Brunell-Neuleib, S. (2001). Meta-analytic methods, the Rorschach, and the MMPI. Psychological Assessment, 13, 449–451.
Schwarz, W. (1996). Hermann Rorschach, M.D.: His life and work. Rorschachiana, 21, 6–17.
Shaffer, T.W., Erdberg, P., & Haroian, J. (1999). Current nonpatient data for the Rorschach, WAIS, and MMPI-2. Journal of Personality Assessment, 73, 305–316.
Stedman, J., Hatch, J., & Schoenfeld, L. (2000). Preinternship preparation in psychological testing and psychotherapy: What internship directors say they expect. Professional Psychology, 31, 321–326.
Viglione, D.J. (1999). A review of recent research addressing the utility of the Rorschach. Psychological Assessment, 11, 251–265.
Viglione, D.J., & Hilsenroth, M.J. (2001). The Rorschach: Facts, fictions, and future. Psychological Assessment, 13, 452–471.
Weiner, I.B. (1977). Approaches to Rorschach validation. In M.A. Rickers-Ovsiankina (Ed.), Rorschach psychology (2nd ed., pp. 575–608). Huntington, NY: Krieger.
Weiner, I.B. (1991). Editor’s note: Interscorer agreement in Rorschach research. Journal of Personality Assessment, 56, 1.
Weiner, I.B. (1996). Some observations on the validity of the Rorschach Inkblot Method. Psychological Assessment, 8, 206–213.
Weiner, I.B. (1999a). Contemporary perspectives on Rorschach assessment. European Journal of Psychological Assessment, 15, 78–86.
Weiner, I.B. (1999b). Rorschach Inkblot Method. In M. Maruish (Ed.), The use of psychological testing in treatment planning and outcome evaluation (2nd ed., pp. 1123–1156). Mahwah, NJ: Erlbaum.
Weiner, I.B. (2001a). Advancing the science of psychological assessment: The Rorschach Inkblot Method as exemplar. Psychological Assessment, 13, 423–432.
Weiner, I.B. (2001b). Considerations in collecting Rorschach reference data. Journal of Personality Assessment, 77, 122–127.
Weiner, I.B. (2003). Principles of Rorschach interpretation (2nd ed.). Mahwah, NJ: Erlbaum.
Weiner, I.B., Exner, J.E., Jr., & Sciara, A. (1996). Is the Rorschach welcome in the courtroom? Journal of Personality Assessment, 67, 422–424.
Wolf, E.B. (2000). Hermann Rorschach. In A.E. Kazdin (Ed.), Encyclopedia of psychology (pp. 115–117). Washington, DC: American Psychological Association.
Wood, J.M., Nezworski, M.T., Garb, H.N., & Lilienfeld, S.O. (2001). The misperception of psychopathology: Problems with the norms of the Comprehensive System of the Rorschach. Clinical Psychology, 8, 350–373.
Wood, J.M., Nezworski, M.T., & Stejskal, W.J. (1996). The Comprehensive System for the Rorschach: A critical examination. Psychological Science, 7, 3–10.

CHAPTER 27 The Thematic Apperception Test (TAT)

ROBERT J. MORETTI AND EDWARD D. ROSSINI

TEST DESCRIPTION; THEORETICAL BASIS; TEST DEVELOPMENT; PSYCHOMETRIC CHARACTERISTICS; RANGE OF APPLICABILITY AND LIMITATIONS; CROSS-CULTURAL FACTORS; ACCOMMODATION FOR POPULATIONS WITH DISABILITIES; LEGAL AND ETHICAL CONSIDERATIONS; COMPUTERIZATION; CURRENT RESEARCH STATUS; USE IN CLINICAL OR ORGANIZATIONAL PRACTICE; FUTURE DEVELOPMENTS; REFERENCES

TEST DESCRIPTION

Personality assessment and psychodiagnostic evaluation have been defining aspects of the professional history and the contemporary practice of clinical psychology. The utility and validity of such assessment are well established using any criterion of clinical efficacy (Meyer et al., 2001). For nearly 70 years, the Thematic Apperception Test (TAT) has been part of this rich tradition. A series of recent books on the TAT for frontline clinicians (Aronow, Weiss, & Reznikoff, 2001; Bellak & Abrams, 1997; Cramer, 1996; Teglasi, 2001) and academic clinical psychologists (Gieser & Stein, 1999a) speaks to its enduring popularity beyond the consistently high ranking of the TAT in psychodiagnostic test usage surveys.

The TAT is a semistructured projective technique, requiring subjects being examined to make up stories in response to pictures of intentionally varied ambiguity. Originally introduced in a medical journal for psychoanalytically oriented psychiatrists (Morgan & Murray, 1935) and then further developed for several years (see the Test Development section), the standard version of the TAT was published by Harvard University Press (Murray, 1943) for wide-scale use by clinical psychologists.

There are 31 achromatic pictures or cards, unchanged from first publication, which are adapted from works of art, photographs, or unique drawings. A majority of the cards portray interpersonal situations, while the others present a single figure or landscape. One of the cards is totally unstructured, being simply a blank white card. On the back of each card is a numerical designation and letters pertaining to the card’s recommended gender and age-level usage. Murray designated four partially overlapping sets of 20 cards each for administration to men, women, boys, or girls. According to Murray’s instructions, 10 cards are administered in a 50-minute session, and the remaining 10 in a following session. Pictures are presented one at a time, and the subject’s task is to

make up as dramatic a story as you can for each. Tell what has led up to the events shown in the picture, describe what is happening at the moment, what the characters are feeling and thinking; and then give the outcome. Speak your thoughts as they come to mind. (Murray, 1943, p. 3)

Murray originally instructed subjects that they were taking a test of imagination, one form of intelligence; modern TAT examiners rarely suggest this. Instructions for the second session, in which the more ambiguous cards are presented, encourage the subject to give freer rein to the imagination. In a third session, Murray recommends using an interview to inquire about possible sources for the plots used by the subject in the stories. It has become commonplace, however, for examiners to select their own preferred, and abbreviated, sets of cards in the interest of time or for a special purpose, and to administer these in a single session, with follow-up interview being uncommon.

The examiner adopts an attitude of encouragement and appreciation in order to stimulate the individual’s productivity, answers questions nondirectively, but avoids entering into discussion. The person may be reminded of the story requirements if they are not being met, through the asking of pertinent questions. Slightly altered instructions are provided to children, seriously disturbed patients, and adults of limited education.

The manual instructs the examiner to write down the exact words of the subject, but acknowledges that this is virtually impossible without knowing shorthand. Inexpensive tape recording was not available when the TAT first appeared, but it is commonly used today. Other examiners have resorted to having subjects write their stories.

Murray analyzed TAT content in terms of needs, which are forces emanating from the story’s main character or hero, and press, or forces emanating from the environment. Other material, such as story themes, outcomes, feelings of characters, and so forth, are also taken into account. The whole mix makes for a cumbersome and time-consuming process that has largely been abandoned by the typical practitioner. Various scoring systems have arisen, but have largely been neglected or ignored (see Use in Clinical or Organizational Practice section). Most clinicians today seem to rely upon their own impressionistic inferences, sometimes based in a particular theory, but more commonly pantheoretical in nature. Since Murray felt that the examiner needs to possess a carefully trained and critical intuition in order to interpret stories, perhaps it is not surprising that clinical practitioners have assumed these attributes as the backbone of their approach to the TAT.

The many alterations in TAT administration and scoring over the years have led to an inconsistent and even idiosyncratic employment of the technique. Even so, Henry Murray would be unlikely to object to these developments. As Caroline Murray (1999) noted in reference to her husband, “For Harry, there was no set way to use or interpret the TAT and, for that matter, no set TAT” (p. xi).

THEORETICAL BASIS

The TAT draws upon a familiar narrative tradition of putting into words a range of conscious and less than conscious personal experience. From perhaps the dawn of language use, human beings have tried to capture and retain experience in the form of stories. In primitive times, these stories may have referred simply to what had occurred that day—in the hunt, for example. With time, some particularly useful stories became embellished while being passed on through generations, taking the form that we call a folktale. Over many generations, some traditional stories became myths, embodying the worldview of a culture. In our more modern times, stories have lost none of their salience. We still tell them to each other, but now we also write them and read them in fiction, and show them and watch them in the form of movies. Many individuals also try to capture their nocturnal dreams, finding a way to cast the confusing content in the form of a coherent story. Undoubtedly, human beings have a powerful storytelling tendency, one that is universal and perhaps defining of what it means to be human.

Henry Murray’s genius was to realize the importance of storytelling and the way in which the stories we tell say something about who we are. As a voracious and widely read student of fiction (Gieser & Morgan, 1999; Robinson, 1992), he knew that some types of story content seem almost characteristic of individuals, as when a novelist’s genre of work often echoes similar themes repetitively across books. It is hardly surprising that stories should be of interest to psychologists studying human personality. After all, psychologists take personal and developmental “histories” and our psychotherapy patients tell us their life stories. With clinical seasoning, we come to learn that most people’s lives have recurrent themes, and that the plots and characters tell us a great deal about who individuals really are and what they struggle with in life (McAdams, 1985, 1993).

Similarly, through analysis of the stories told to the TAT, a trained assessor can be led to underlying variables in the individual’s personality—such as drives, sentiments, emotions, complexes, conflicts, and other tendencies (Murray, 1943). Morgan and Murray (1935) wrote that the TAT is

based on the well recognized fact that when someone attempts to interpret a complex social situation he is apt to tell as much about himself as he is about the phenomenon on which his attention is focused. At such times, the person is off his guard, since he believes he is merely explaining objective occurrences. To one with “double hearing,” however, he is exposing certain inner forces and arrangements, wishes, fears, and traces of past experiences. (p. 390)

This is a succinct description of the underpinnings of what is sometimes called the projective hypothesis (Frank, 1939). Human beings are constantly projecting aspects of themselves onto the outer world, usually without awareness, and a person similarly projects his or her personality into the content and structure of stories (Stein, 1999). Because the cards of the TAT are vague, complex, or ambiguous, there is lots of room for individuals to project aspects of themselves into the stories they tell in response to them. This is an understanding of projection that is more common and reflective of everyday life than Freud’s early notion of projection as a defense mechanism; it is similar in tone to the way in which Jung used projection to describe a nearly ubiquitous tendency of human nature (see Frey-Rohn, 1976).

Murray distinguished between perception, which is recognition based upon sensory impressions, as in the simple identification of what is pictured in the TAT cards, and apperception, which is the process by which additional meaning is assigned to those elements. The apperception that occurs in the TAT is assumed to be the result of projection, and Murray introduced the term “apperceptive projection” to describe the process (Anderson, 1999).

Perhaps anticipating his later critics, Murray fully realized the complicated nature of determining exactly what the projected material referred to or meant. Speaking of the personality tendencies elicited by the test, he said:

They represent (not literally in most cases but symbolically) (1) things the subject has done, or (2) things he has wanted to do or been tempted to do, or (3) elementary forces in his personality of which he has never been entirely conscious although they may have given rise to fantasies and dreams in childhood or later; and/or they represent (4) feelings and desires he is experiencing at the moment; and/or (5) anticipations of his future behavior, something he would like to do or will perhaps be forced to do, or something he does not want to do but feels he might do because of some half-recognized weakness in himself.

The second assumption is that the press variables represent forces in the subject’s apperceived environment, past, present, or future. They refer, literally or symbolically, to (1) situations he has actually encountered, or (2) situations which in reveries or dreams he has imagined encountering, out of hope or fear; or (3) the momentary situation (press of the examiner and the task) as he apperceives it; and/or (4) situations he expects to encounter, or dreads encountering. (Murray, 1943, p. 14)

And,

In any event, the conclusions that are reached by an analysis of TAT stories must be regarded as good “leads” or working hypotheses to be verified by other methods, rather than as proved facts. (p. 14; italics as in original)

TEST DEVELOPMENT

Henry Murray did not develop the TAT by himself, but in close collaboration with his largely forgotten, but personally influential coauthor, Christiana Morgan, who was his colleague at Harvard in the early days of the Harvard Psychological Clinic. Full-length biographies of both Murray (Robinson, 1992) and Morgan (Douglas, 1993) are available. Morgan was a Jungian-influenced artist and lay analyst who drew six of the cards currently in use (W.G. Morgan, 2000), played a major editorial role in the selection of the final TAT cards, and was probably the most experienced of the early users of the TAT (Douglas, 1993). Though Morgan’s name was dropped from later editions of the test without explanation, luminaries who were present at the TAT’s inception, such as Robert White and Saul Rosenzweig (Douglas, 1993; Rosenzweig, 1999) have stressed Morgan’s right to be cited as first author. Morgan was a complex person who had undergone an analysis with C.G. Jung in Zurich during 1926–1927, an analysis that involved the production of abundant fantasy material utilizing Jung’s method of active imagination. Her fantasy productions were later used extensively by Jung in a series of seminars (Jung, 1976). Drawing from interviews and a 15-year friendship with Murray, Anderson (1999) has described some of the ways in which Jung influenced both Morgan and Murray.

According to Tomkins (1947), at least three studies involving the telling of stories to pictures had been published before the TAT was developed. The one most evocative of the later TAT was conducted by Schwartz (1932), who administered eight pictures to juvenile delinquents and asked them to describe what the character portrayed in each was thinking and what he might do. Schwartz’s express purpose was to get the boy being examined to project aspects of his personality into the stories and in response to additional questions (Stein & Gieser, 1999). It is not known how much Schwartz’s work actually influenced Murray, but one of the pictures discarded for the final 1943 version of the TAT had special instructions that are similar to those used by Schwartz in her study (W.G. Morgan, 2000).

Murray had already been experimenting with obtaining people’s evaluations of pictures when a graduate student by the name of Cecelia Washburn Roberts told him how she had been able to evoke rich fantasy material by asking her young son to tell a story in response to a picture. This conversation appears to have galvanized Murray and Morgan’s development of the TAT. Next, Murray asked his mother and his daughter to tell stories to pictures he gave them. He was fairly astonished by the depth of the material and the accuracy with which it portrayed their dynamics, as well as by the fact that his subjects did not seem to realize that they had revealed psychological material they may have been otherwise unable to put into words (Anderson, 1999).

Morgan and Murray then put together a set of pictures that were chosen for several reasons. First, the pictures were intended to suggest a critical situation and to evoke fantasy related to the situation. Next, the pictures were purposely selected for having some degree of ambiguity, so that not everyone would tell the same story. Most pictures also portrayed at least one person into which the storyteller could easily project himself or herself. However, the overriding principle was how stimulating the pictures were to the production of rich material.

The TAT was just one of several experimental measures of personality assessment developed by an interdisciplinary team at the Harvard Psychological Clinic, under the direction of Henry Murray, and introduced as part of a large-scale study of 50 Harvard undergraduates. This intensive and multifaceted study, focusing upon strengths and competencies as well as symptoms and pathology, resulted in the classic book Explorations in Personality (Murray, 1938).

After three revisions of the cards, the final set of TAT cards and a brief accompanying manual were published in 1943 (Murray, 1943). Murray intended that the TAT be used in research and in clinical psychodiagnosis, but he especially recommended it as a method that could shorten the length of psychoanalysis or psychotherapy. Although he used the needs-press framework to interpret stories and was heavily influenced by psychoanalysis and analytical psychology, Murray did not intend for the TAT to be restricted to any one conceptual school (Anderson, 1999). He also was continually interested in new pictures to develop sets of cards for different purposes and would have probably continued to change the TAT cards if World War II hadn’t intervened. He developed separate sets of cards, for example, to elicit underlying Jungian archetypes of the unconscious; to determine identifications in Biblical stories of the New Testament; to help the U.S. Navy and U.S. Air Force select personnel; to select paratroopers for the Chinese army; and to probe the psyches of Russians in wartime (C. Murray, 1999). Murray and his colleagues continually experimented with different ways of administering the test. It would seem that he was fascinated with the technique of storytelling in response to pictures and creatively applied it in new ways.

Various modifications or extensions of the TAT have been introduced. Perhaps the best known of these are the Senior Apperception Technique (SAT), the Children’s Apperception Test (CAT), and the Apperceptive Personality Test (APT). The SAT and CAT are thoroughly discussed in Bellak and Abrams (1997). The APT was developed (Holmstrom, Silber, & Karp, 1990) to bypass the shortcomings of the TAT, especially by its use of an objective scoring system. Consisting of eight cards of moderate ambiguity, the test requires that a TAT-type story be told to each card. Following administration, the subject scores his or her own stories for a variety of personality categories on a series of rating scales that can be computer scored. As might be expected, this approach yields good reliabilities through the elimination of scorer errors or misjudgments. The authors suggest that the questionnaire used for scoring can also be used with TAT stories, although the normative data from the APT would not apply. The stories told to the APT can still be used in the usual TAT fashion by the assessor, so the test provides two sources of data, one objective and one projective. Validity studies have been conducted on several criterion groups and are reported in Karp (1999). Suggestions for clinical use are given in Silber, Karp, & Holmstrom (1990). Despite its promising premise, the APT has so far failed to gain wide attention.

PSYCHOMETRIC CHARACTERISTICS

“To be, or not to be a psychometric test?” This question concerning the TAT has divided personality assessors into opposing camps. Papers continue to appear presenting arguments of the advocates (Karon, 2000; Woike & McAdams, 2001) and detractors (Garb, 1998; Lilienfeld, Wood, & Garb, 2000) of the TAT.

The arguments can be largely settled by asking a simple question: How is the TAT being used—as a clinical technique or as a psychometric measure? When clinicians use their own idiosyncratic card sets, give instructions that vary, do not consult a set of norms, and score according to any system or no system at all, we cannot possibly be talking about the use of a psychometric test. We are instead talking about the use of a technique, one that resembles a novel type of semistructured interview. Trying to establish psychometric properties of a technique that is used in such varied ways is impossible. This does not take away, however, from the usefulness and helpfulness of the technique, any more than the lack of psychometric properties makes a clinical interview of no value. Much depends upon the quality of the individual interpreter’s intuition, experience, and clinical acumen.

Researchers, too, have not been consistent in using a standard set of cards when attempting to establish psychometric properties. Comparisons of studies and generalizations to practice seem hardly possible when such a wide range of stimulus materials has been accepted as constituting the TAT (Keiser & Prather, 1990).

Criticisms of the TAT’s reliability mostly have addressed its low internal consistency and its low test-retest stability. However, the very nature of the TAT’s construction explains quite well why internal consistency is an unreasonable expectation. Murray and his staff selected pictures for the TAT that covered a broad range of themes and that were likely to elicit different needs and perceived press. There is no expectation that the same needs or press, for example, will appear in all stories, and it would be considered unusual (and interpretively important!) if they did. Since each picture is different, often dramatically so, from the others in the set, it is hardly surprising that the needs expressed from story to story are not consistent.
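The internal consistency argument can be made concrete with a small calculation. The sketch below computes Cronbach's alpha, a commonly used internal consistency coefficient that the chapter itself does not name; the function name, the card scores, and the subjects are all hypothetical and chosen only for illustration. When each card elicits a different need, item scores do not covary, and alpha comes out low or even negative.

```python
from statistics import variance

def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of per-card score lists (one score per subject)."""
    k = len(item_scores)
    if k < 2:
        raise ValueError("need at least two items (cards)")
    n = len(item_scores[0])
    if n < 2 or any(len(item) != n for item in item_scores):
        raise ValueError("each item needs the same number (>= 2) of subject scores")
    # Total score per subject, summed across cards.
    totals = [sum(item[s] for item in item_scores) for s in range(n)]
    total_var = variance(totals)
    if total_var == 0:
        raise ValueError("alpha is undefined when total scores do not vary")
    item_var_sum = sum(variance(item) for item in item_scores)
    return (k / (k - 1)) * (1 - item_var_sum / total_var)

# Hypothetical scores for one need on three cards, five subjects each.
# Assumption: the cards pull for different themes, so scores do not covary.
card_scores = [
    [2, 0, 1, 3, 1],
    [0, 2, 1, 0, 3],
    [1, 1, 2, 0, 0],
]
alpha = cronbach_alpha(card_scores)
print(round(alpha, 2))
```

A low value here reflects the heterogeneity of the stimuli by design, which is the point made above, rather than measurement error alone.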

The test-retest reliability coefficients may be influenced by the usual instructions given on retest. The standard instructions, when used for retest purposes, seem to imply that the patient or research subject should produce a story as dissimilar from the original story as possible, which artificially lessens test-retest reliability. Whether alterations in the retest instructions are effective at improving reliability remains controversial (Kraiger, Hakel, & Cornelius, 1984; Winter & Stewart, 1977). However, in one test-retest study covering a 1-year span of time (Lundy, 1985), instructions were designed to break the implicit set to produce a new and different story from that previously told. Test-retest correlations for need for affiliation and need for intimacy were .48 and .56, respectively. The author notes that these are about the same test-retest reliabilities as obtained for the MMPI, 16PF, and California Psychological Inventory.
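A test-retest coefficient of the kind reported above is ordinarily a Pearson correlation between scores from the two sessions. The following is a minimal sketch under that assumption; the scores are invented for illustration and are not data from Lundy (1985) or any other study.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list has no variance")
    return cov / sqrt(var_x * var_y)

# Hypothetical motive scores for six subjects at first testing and at retest.
first_session = [2, 5, 3, 6, 4, 1]
second_session = [3, 4, 2, 6, 5, 2]
r = pearson_r(first_session, second_session)
print(round(r, 2))
```

With real protocols, the same calculation would be run separately for each scored need, as in the affiliation and intimacy coefficients cited above.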

Another threat to the TAT’s test-retest reliability is its sensitivity to situational variables. Anastasi cites Atkinson’s work (1958) and states that

A considerable body of experimental data is available to show that such conditions as hunger, sleep deprivation, social frustration, and the experience of failure in a preceding test situation significantly affect TAT responses. (Anastasi, 1988, p. 604)

Even music may sometimes affect TAT stories (McFarland, 1984). All of these situational variables are unlikely to be consistent across test and retest conditions, thereby limiting reliability. Yet it is in this very regard that the TAT may also be said to have extraordinary sensitivity to immediate and important, yet transiently variable needs that have been aroused within the person by the current life situation. What remains as critical, however, is for the examiner to understand how to differentiate between momentary influences and more central motives or needs, a process that has never been adequately described. But certainly a familiarity with the immediate circumstances of the patient, obtained through history and interview, enables the examiner to watch for and not overly emphasize content related to the circumstances.

Interscorer reliability is an essential foundation to the other types of reliability mentioned. It is our strong impression that the issue of interscorer reliability seldom gets raised in the training given to students learning the TAT, though we know of no data reported in the literature that confirm this. At least one notable exception lies in the work of Dana, who describes his way of training students:

For nearly forty years, I have taught graduate students to learn projective technique interpretation by examining the validity of the concepts in their own reports. This method requires that a group of students and one or more experienced assessors use the same data for preparation of independent reports. The concepts in both student and criterion reports are then compared for agreements and disagreements. Using a minimum of four separate data sets, I found a consistent decrease in concept disagreements and a concomitant increase in agreements on concepts (Dana, 1982). Student reports became indistinguishable in their contents from reports prepared by the more experienced assessors, although stylistic differences remained. (Dana, 1999, pp. 180–181)
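Interscorer agreement can be quantified once two scorers have coded the same stories. The sketch below uses Cohen's kappa, a chance-corrected agreement index; it is offered as one common option, not as the method Dana used, and the codes shown are hypothetical present/absent judgments for a single need across ten stories.

```python
def cohens_kappa(scorer_a, scorer_b):
    """Cohen's kappa for two scorers assigning categorical codes to the same stories."""
    if len(scorer_a) != len(scorer_b) or not scorer_a:
        raise ValueError("need two non-empty code lists of equal length")
    n = len(scorer_a)
    # Proportion of stories on which the two scorers gave the same code.
    observed = sum(a == b for a, b in zip(scorer_a, scorer_b)) / n
    # Agreement expected by chance from each scorer's marginal code frequencies.
    categories = set(scorer_a) | set(scorer_b)
    expected = sum(
        (scorer_a.count(c) / n) * (scorer_b.count(c) / n) for c in categories
    )
    if expected == 1.0:
        raise ValueError("kappa is undefined when both scorers use one identical code")
    return (observed - expected) / (1 - expected)

# Hypothetical codes: 1 = need judged present in the story, 0 = absent.
student = [1, 1, 0, 0, 1, 0, 1, 0, 1, 1]
experienced_assessor = [1, 0, 0, 0, 1, 0, 1, 1, 1, 1]
kappa = cohens_kappa(student, experienced_assessor)
print(round(kappa, 2))
```

Tracking such an index across successive training data sets would be one way to document the increase in agreement that Dana reports informally.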

Even critics of the TAT (Lilienfeld et al., 2000) concede that it can generate at least modest construct validities when carefully scored. One complicating issue, however, is the fact that a genuine need may or may not be reflected in overt behavior. As indicated in his comments quoted earlier (see the Theoretical Basis section), Murray clearly was aware of this. Straightforward relationships between needs expressed on the TAT and overt behavior are not necessarily even expected by the clinician. Anastasi (1988), citing older studies (Harrison, 1965; Mussen & Naylor, 1954; Pittluck, 1950) points out that,

Depending on other concomitant personality characteristics, high aggression in fantasy may be associated with either high or low overt aggression. There is some evidence suggesting that if strong aggressive tendencies are accompanied by high anxiety or fear of punishment, expressions of aggression will tend to be high in fantasy and low in overt behavior. (Anastasi, 1988, p. 619)

Nor should the needs or motives expressed in the TAT be expected to correlate with questionnaire measures of traits, since motives and traits are fundamentally different elements of personality that are conceptually distinct and empirically unrelated (Winter, John, Stewart, Klohnen, & Duncan, 1998).

All of this implies that any valid TAT portrait of the individual is bound to be quite complex, and it should be so. The clinical use of the TAT requires the clinician to create just such a portrait. Yet it is reasonable to ask how accurate the rendered portrait actually is. Some earlier studies allowed clinicians to use their own methods to interpret TAT protocols they were given, either alone (Henry & Farley, 1959) or as part of a battery of tests (Silverman, 1959). Results showed that the evaluations of personality given by the clinicians matched independently gathered case histories better than chance. However, the fact that some handful of experienced clinicians should be capable of making such matches should not blind us to the reality that many other clinicians may not be capable or may never have had to demonstrate their capability in this regard.

The TAT method can be used, and has been used, in the capacity of a test with appropriate psychometric characteristics, by researchers who focused upon the measurement of single motives or needs with the TAT. Some ultimately powerful examples of this come from the work of McClelland and Atkinson and their colleagues (Atkinson, 1958; McClelland, Atkinson, Clark, & Lowell, 1953; Smith, 1992). Winter (1973), for example, provides a detailed account of the development of the scoring system for the power motive, including its crossvalidation. Generally speaking, McClelland’s scoring systems involve rating of a discrete category and subcategories of content as present or absent, and assume that there is a relationship between the frequency with which a motive content appears and the intensity of the motive. Though critics such as Entwisle (1972) attack the McClelland use of the TAT because of low reliabilities and insufficient internal consistency, McClelland (1980) rebuts these arguments with his well-reasoned contention that traditional psychometric approaches do not make sense for the TAT. As for validity, McClelland and colleagues affirm that their TAT-based motives do not predict the same behaviors as questionnaires, which presumably tap more conscious-level traits, but instead predict long-term behavior or life outcomes (McClelland, Koestner, & Weinberger, 1989; Smith, 1992; Weinberger & McClelland, 1990; Winter, 1996). This is discussed further in the Current Research Status section.
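The present/absent logic of such scoring systems can be sketched in a few lines. The subcategory names below are hypothetical placeholders and are not taken from any published scoring manual; the sketch only illustrates the assumption that the frequency of motive content across stories stands in for motive intensity, and it simplifies by treating every subcategory as independently scorable.

```python
# Hypothetical subcategory labels, for illustration only.
SUBCATEGORIES = ("imagery", "instrumental_activity", "positive_affect")

def motive_score(scored_stories):
    """Sum present/absent codes across stories into a frequency-based motive score."""
    total = 0
    for story in scored_stories:
        for name in SUBCATEGORIES:
            # A subcategory missing from a story's codes is treated as absent.
            total += 1 if story.get(name, False) else 0
    return total

# Hypothetical codes for a three-story protocol from one subject.
protocol = [
    {"imagery": True, "instrumental_activity": True, "positive_affect": False},
    {"imagery": False},
    {"imagery": True, "instrumental_activity": False, "positive_affect": True},
]
score = motive_score(protocol)
print(score)
```

Because the total depends on how many stories are collected, comparisons between subjects presuppose the same cards and the same number of stories.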

Even limited scoring systems such as those of McClelland’s group can take a great deal of time and effort to develop and learn (Winter, 1998). One might wonder just how much time and training would be required to reach high interscorer agreement levels on an entire list of scored variables, sufficient to encompass the array of needs expressed in the typical clinical TAT protocol.

RANGE OF APPLICABILITY AND LIMITATIONS

The TAT is a lifespan personality assessment technique applicable from middle childhood through old age and suitable for most types of personality and psychodiagnostic assessment referrals. Clinical psychologists and counseling psychologists are the principal users (Camara, Nathan, & Puente, 2000), although 33% of clinical neuropsychologists use it (Butler, Retzlaff, & Van de Ploeg, 1991), as well as 6% of forensic neuropsychologists (Lees-Haley, Smith, Williams, & Dunn, 1996).

The nature of the TAT pictures tends to draw for darker and more somber stories, and depressive story content must therefore not be overinterpreted. Additionally, users of the TAT should be made aware of the potentially confounding issue of story length. Murstein and colleagues have demonstrated in two related TAT studies (Murstein & Mathes, 1996; Murstein & Wolf, 1970) that “. . . garrulous but otherwise healthy test-takers are in jeopardy because the more they talk or see on a projective technique, the more likely they are to be judged as pathological” (Murstein & Mathes, 1996, p. 345).

The TAT should not be used as a stand-alone instrument for psychodiagnostic purposes, but should be used in conjunction with other psychological tests, especially objective measures. This provides for comparing and contrasting hypotheses developed from different sources, a standard practice among clinicians. When used with minority populations, or cross culturally, psychologists must be cognizant of known limitations in applying personality concepts derived from one culture to understanding personality of another culture (see Cross-Cultural Factors).

CROSS-CULTURAL FACTORS

The TAT was developed by European Americans and has been widely used with the same population. Even so, the TAT has been popular in both research and clinical practice in Asia, Europe, and South America. Although figures on some of the TAT cards possess ambiguous racial features, the question of the TAT’s cross-cultural applicability has led to the development of culturally specific versions. For example, there have been early (Thompson, 1949) and more recent attempts (Bailey & Green, 1977) to develop an African American set of TAT cards parallel to the originals, but these attempts have apparently not caught on very well, despite the fact that they showed differences in response from Blacks and Caucasians. Chinese researchers have also developed a culturally specific version of the TAT (Zhang, Xu, Cai, & Chen, 1993).

The TAT and similar measures derived from it are used very commonly in cross-cultural research (Retief, 1987). However, Dana (1999) argues persuasively that a variety of requirements need to be met if we are to obtain accurate information about diverse cultures by using the TAT. These include using all of the following: relevant pictures as stimuli; scoring variables that have been normed in the culture of interest; norms that take into account educational level, social class, and acculturation status; direct participation by local people in the development of scoring categories; and interpretation that refers to culture-specific personality theory. Without meeting these requirements, we may end up misattributing psychopathology to individuals and groups whose cultural conception of self and personal boundaries may be quite different from that of the dominant culture in the United States. Dana’s recommendations constitute an important prescription for the proper use of TAT methodology across cultures. However, there apparently are very few instances in which researchers have seriously attended to, collected, maintained, or utilized the recommended data. As Dana himself has commented,

There are TAT studies from many countries throughout the world. These reports reflect useful descriptions of personality in culture, clinical case studies, examinations of specific hypotheses, and attempts to provide interpretation. . . . However, these illustrative studies compose only an interesting mosaic, providing glimpses of people who are not Anglo Americans because there has been little empirical collection of normative data (e.g., Avila-Espada, in press; Zhang, Xu, Cai, & Chen, 1993). (Dana, 1999, pp. 187–188)

The easiest place to start is to critically scrutinize the stimulus materials used for assessments of minorities in our own culture. For example, Constantino and Malgady (1999) report that the clinical analysis of projective tests given to Hispanic and Black children has resulted in conclusions about low verbal fluency and inferred emotional disturbance that are highly suspect. However, when culturally sensitive instruments are used, minority children are verbally articulate in their responses (Bailey & Green, 1977; Constantino & Malgady, 1983; Malgady, 1996).

“Yet cultural adaptations of traditional projective tests and the development of new culturally sensitive tests are especially rare,” according to Constantino and his associates (Constantino, Malgady, Colon-Malgady, & Bailey, 1992, p. 434). One important exception is the Tell-Me-A-Story or TEMAS (Constantino, Malgady, & Rogler, 1988), which was developed specifically to offer a TAT-type multicultural test for children and adolescents. Parallel sets of cards are available for Hispanic and African Americans (minority version) and for European Americans (nonminority version). The cards are less ambiguous than the TAT cards, having been designed to pull for specific personality functions, and having been validated by interjudge agreement as pulling for those functions. Each story is scored separately for cognitive, affective, and personality functions. In particular, the scoring for many of these variables is of interval-level statistical quality, and the test appears to have very good psychometric qualities, particularly in the areas of internal consistency and interrater reliabilities. Test-retest reliabilities are low to moderate. Predictive validities that have been investigated show good promise. The reader is referred to Constantino and Malgady (1999) for a concise yet thorough description of the TEMAS, its development, and its psychometric properties.

ACCOMMODATION FOR POPULATIONS WITH DISABILITIES

Issues and accommodations related to the projective assessment of persons with disabilities (Wachs, 1966) and older adults (La Rue & Watson, 1998) have been ongoing concerns for clinicians. The TAT should not be administered to all patients or used for all evaluations. For example, even though the TAT requires only that the examinee be able to see what is on the cards and have enough language to tell a story, these prerequisites deserve attention. The ambiguity of the TAT cards makes for a challenging visual information-processing task, and the production of useful stories requires elaborate verbal responses.

Many students and seasoned clinicians have been embarrassed by having inferred that significant perceptual distortions on the TAT were indicators of thought disorder, when in fact the patients being assessed suffered from hyperopia, astigmatism, visual deterioration common to normal aging, or neuropsychological deficits such as visual neglect. To get a sense of how the more common forms of uncorrected visual impairment might affect TAT performance, we sometimes recommend to our students that they look at the TAT cards without their glasses or contact lenses. The pictures lose clarity, and intentionally ambiguous features disappear, to the point where a card may lose its unique stimulus pull (e.g., gun/scissors on Card 3BM; Patalano, 1986). Some type of basic visual screening seems essential.

It is also best to have a measure of expressive language available, such as the Wechsler Verbal Comprehension Index (VCI), in order to properly interpret a TAT protocol. The clinician can then compare and contrast expressive language in the structured and projective situations. For example, defensiveness or emotional disturbance can be more readily inferred when a patient produces terse, concrete TAT narratives but has VCI of high average classification. Clinicians generally assume that patients with very low language abilities will produce impoverished TAT narratives of little value. However, at least one review concludes that the TAT is an excellent personality assessment device for developmentally disabled individuals (Hurley & Sovner, 1985).

LEGAL AND ETHICAL CONSIDERATIONS

While generally accepted in most routine clinical evaluations, the use of the TAT in forensic evaluations and litigation remains controversial. In 1993, the United States Supreme Court handed down a decision, Daubert v. Merrell Dow Pharmaceuticals, which essentially set forth criteria that federal courts must follow in admitting or excluding scientific evidence or expert testimony from consideration by juries (Gold, Zaremski, Rappaport, & Shefrin, 1993). Applying the now widely disseminated criteria outlined in that case, the TAT as ordinarily given may not qualify as a legally defined valid scientific measure. However, to date there have been no peer-reviewed articles addressing whether the TAT stands up to the Daubert case criteria. Clinicians doing forensic assessments and wanting to use the TAT need to review specialty works such as Ziskin’s legal textbook (1995), the special issue of Psychology, Public Policy, and Law (Shuman & Sales, 1999) entitled “Daubert’s Meanings for the Admissibility of Behavioral and Social Science Evidence,” as well as critiques of projective techniques in general (Lilienfeld, Wood, & Garb, 2000) in order to prepare for legal challenges to the use of the TAT.

Standard 6.04 of the “Ethical Principles of Psychologists and Code of Conduct” (American Psychological Association, 1992) includes projective techniques among the procedures that require “specialized training, licensure, or expertise” (p. 1607). To use the TAT in an ethical manner, the clinician is therefore required to have the appropriate experiential background. For a review of the applied aspects of competency and ethics in personality assessment, see Weiner (1989).

COMPUTERIZATION

While there are no technical barriers to creating a computer-administered TAT (excepting copyright protection of the TAT itself), neither are there any easily identified advantages to doing so. The cards are easily administered manually, and clinicians find hard copies of TAT narratives easier to review. On the other hand, word counts and lexical analyses are sometimes useful, as in the detection of dementia (Johnson, 1994), and word-processing programs can perform such analyses. In research applications where TAT narratives are written, it does not seem to matter whether they are handwritten or written at a keyboard (Blankenship & Zoota, 1998).
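As a rough illustration of the kind of word count and lexical analysis mentioned above, the sketch below computes the total number of words, the number of distinct words, and their ratio (the type-token ratio) for one transcribed narrative. The function name, the tokenizing rule, and the sample narrative are assumptions made for illustration; they are not drawn from Johnson (1994) or from any published procedure.

```python
import re
from collections import Counter


def lexical_summary(narrative: str) -> dict:
    """Return word count, distinct-word count, and type-token ratio."""
    # Simple tokenizer: lowercase runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", narrative.lower())
    counts = Counter(words)
    total = len(words)
    return {
        "word_count": total,
        "unique_words": len(counts),
        "type_token_ratio": len(counts) / total if total else 0.0,
    }


# Hypothetical narrative fragment, invented for this example.
sample = "The boy looks at the violin. The boy is tired."
summary = lexical_summary(sample)
print(summary)
```

Because longer stories tend to yield lower type-token ratios, any comparison across patients would need to take story length into account.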

But there are two important areas in which computer assistance will likely be considerably advantageous in the near future. The first of these is scoring the TAT. Computer coding of specific theoretical constructs, whether motives, defense mechanisms, levels of object relatedness, or needs-press interactions, would obviate the need for highly trained coders and resolve the perennial problem of interscorer reliability. Bellak (1999), a proponent of structured scoring systems and computer assistance, has said,

One would think that some bright young computer specialist would program his or her gadgets to not only count words but also analyze clusters of what Murray called syndromes of press and need—units of stimulus and drive. These clusters should give a lively picture of a personality and keep methodologists happy. (p. 138)

Another area that is on the verge of offering important time savings is that of speech-to-text computer transcribing. This type of voice transcription software is currently available, but at present relies upon “learning” an individual user’s speech patterns over a period of time in order to increase accuracy to acceptable levels and is therefore not usable for TAT narrative recording of multiple patients, each with a unique verbal and vocal style. This rapidly evolving technology will ultimately do away with inefficient and inaccurate attempts to manually write down whatever the patient says, or the sheer drudgery of hand transcription from tape recordings.

CURRENT RESEARCH STATUS

The TAT has been a prodigious generator of research. Within a decade of its publication in 1943, nearly 800 studies had been published (Bellak, 1954). In the last quarter century, over 1,000 published papers have appeared (Cramer, 1996). Dozens of doctoral dissertations and master’s theses using the TAT are written each year.

Running parallel to the early publications concerning general scoring approaches is a body of research that applied the TAT and TAT-type pictures in more tightly controlled experimental studies investigating single human needs or motives. David C. McClelland and his student, John W. Atkinson, were the primary inaugurators of this research, bringing to it the rigor rooted in McClelland’s background as an experimental psychologist and attracting like-minded followers. Beginning with the achievement motive (Atkinson, 1958; McClelland et al., 1953), and subsequently moving on to the affiliation motive (Atkinson, Heyns, & Veroff, 1954), power motive (Veroff, 1957; Winter, 1973), and intimacy motive (McAdams, 1982), these researchers produced data that powerfully bolstered the underlying premise that projection did indeed occur in TAT-type stories. They also importantly demonstrated how to develop reliable and valid scoring systems for a single human need, proving that the TAT could indeed be used as a test possessing appropriate psychometric qualities. Finally, their data indicated that TAT-measured motives are capable of predicting longitudinal outcomes, such as entrepreneurial behavior, overall life adjustment, and organizational leadership (Smith, 1992; Winter, 1996). They are also associated with susceptibility to disease (Jemmott, 1987; McClelland, 1989); elevated blood pressure and hypertensive pathology 20 years postassessment (McClelland, 1979); alterations in immune functioning (McClelland, 1989; McClelland & Krishnit, 1988); and release of motive-specific hormones in the bloodstream (McClelland, Davidson, Saron, & Floor, 1980). For two excellent histories of McClelland’s empirically derived TAT measures, readers should consult Winter (1998) and McClelland (1999). Cramer (1996) provides a helpful summary of how such scoring systems are developed.

Other researchers have taken the cue, and much recent TAT research focuses upon scoring methods for single TAT predictors of various clinical states and characteristics, or upon small systems of related predictors. Abrams (1999) and Cramer (1996) provide valuable summaries of this recent research. We limit ourselves here to the mention of four of the areas we consider to be most promising or useful. One cluster consists of the creation of a simple yet sophisticated scoring for ego defense mechanisms, their developmental progression over time, and their relationship to different levels of psychopathology and response to treatment (Cramer, 1991, 1996, 1999; Cramer & Blatt, 1990, 1993). In a second cluster, Westen and colleagues have developed a scoring system for object relations, called the Social Cognition and Object Relations Scale (SCORS; Westen, 1991a, 1991b). The SCORS has been shown to differentiate borderline personality disorder patients from other psychiatric patients as well as from normals (Westen, Lohr, Silk, Gold, & Kerber, 1990), and has shown the capacity to differentiate among DSM-IV Cluster B personality disorders (Ackerman, Clemence, Weatherill, & Hilsenroth, 1999). The third cluster of studies reports on the further development and application of the Singer and Wynne (1966) method of scoring communication deviance (CD), a type of thought disorder resembling an inner, autistic preoccupation while attempting to tell TAT stories. Although CD was originally believed to be most closely related to schizophrenia, research now indicates that it is also found in the parents of manic patients (Miklowitz, Velligan, Goldstein, & Nuechterlein, 1991) as well as the parents of schizophrenic patients (Sass, Gunderson, Singer, & Wynne, 1984). The presence of CD has also been established in parents of Norwegian (Rund, 1986) and Mexican American psychiatric patients (Doane et al., 1989), demonstrating that it is not a language- or culture-bound phenomenon.

Since so much TAT research has either an underlying or explicitly psychodynamic character, it has been refreshing to see the inception of a cognitive-behavioral research program investigating personal problem solving (Ronan, Colavito, & Hammontree, 1993; Ronan, Date, & Weisbrod, 1995). The TAT tends to draw for stories of problems and their resolution or lack of resolution, making this line of research particularly relevant.

USE IN CLINICAL OR ORGANIZATIONAL PRACTICE

Since shortly after its inception, the TAT has been a popular part of clinical practice, ranking among the top four or five tests used by psychologists in clinical settings (Archer, Maruish, Imhof, & Piotrowski, 1991; Piotrowski, Sherry, & Keller, 1985; Piotrowski & Zalewski, 1993; Watkins, Campbell, Nieberding, & Hallmark, 1995), and currently among the top three personality assessment methods (Butcher & Rouse, 1996; Watkins et al., 1995).

Clinicians almost always give the TAT as part of a larger assessment battery rather than alone, thereby generating opportunities to cross-check TAT-derived hypotheses against other clinical data (Bellak, 1999). Moreover, this approach is consistent with Murray’s use of the TAT as part of a multimethod assessment battery at the Harvard Psychological Clinic (Murray, 1938). In administering the TAT, psychologists tend to select their own preferred, abbreviated sets of cards (see Table 27.1), and do not necessarily follow a standard administration or scoring.

TABLE 27.1	TAT Card Sets Used by Various Clinicians

Source	Cards
Bellak (1986)	1, 2, 3BM, 4, 6BM, 7GF, 8BM, 9GF, 10, 13MF
Cramer (1996)	1, 6BM, 7GF, 8BM, 12M, 13MF, 14, 17BM
Dana (1996)	1, 2, 3BM, 4, 6BM, 7BM, 8BM, 12M, 13MF, 18BM
Hartman (1970)	1, 2, 3BM, 4, 6BM, 7BM, 13MF, 8BM, 12M, 13MF
Holt (1951), males	1, 2, 3BM, 4, 6BM, 7BM, 8BM, 12M, 13MF, 16, 18GF
Holt (1951), females	1, 2, 4, 7GF, 9GF, 10, 13MF, 16, 18GF, 12M, 3BM
Karon (1981)	1, 3BM, 4, 6BM, 7GF*, 10, 11, 12M, 13MF, 14, 16, 20
Peterson (1990)	10, 7GF, 7BM, 13B, 8BM, 1, 9BM, 4, 2, 17BM, 13MF, 12M

*For females.

Even though such administration is a technique rather than a test, we advise clinicians to apply some method in working up their interpretations. A structure that follows some reasonable rationale encourages thorough, consistent consideration of the material at hand. Numerous scoring or coding systems other than Murray’s have been proposed for the TAT, with several books appearing in the early years detailing how major users were approaching the test (Arnold, 1962; Aron, 1949; Bellak, 1954; Henry, 1956; Shneidman, 1951; Stein, 1948; Tomkins, 1947). Many of these approaches have been virtually ignored in clinical practice, probably because of their time-consuming nature. Henry’s (1956) book, Cramer’s recent book (1996), Karon’s (1981) chapter, and journal articles (e.g., Hartman, 1970; Schafer, 1958; see Rossini & Moretti, 1997, for other recommendations) best represent our own preferences, though Bellak’s very comprehensive approach (Bellak, 1986, 1993; Bellak & Abrams, 1997) and Murray’s (1943) original manual are more popularly used in training. An additional very useful approach stresses the importance of attending to the formal characteristics of story structure, paying particular attention to peculiar language, inability to maintain the storytelling frame of reference, strange turns in story plot, failure to include obvious card content, and so on (Hartman, 1949; Holt, 1958; Murstein, 1961; Rapaport, Gill, & Schafer, 1946; Schafer, 1958). Actuarial and quasi-normative datasets exist for reference, but these are somewhat more difficult to access and are likely to be outdated (e.g., Eron, 1950, 1953; Murstein, 1972). At least one study has demonstrated that cultural shifts over a time period as short as 10 years can significantly affect the portrayal of characters in TAT stories (Cramer, 1986), indicating that an updating of norms should be undertaken frequently and that it is currently long overdue. Many older casebooks have large numbers of full-text TAT protocols appended (e.g., Kobler, 1964). Modal time needed for administration, coding, and interpretation has been reported to be approximately 1.5 hours (Ball, Archer, & Imhof, 1994).

It may be helpful to consider the endeavor of psychodiagnostic testing in general in order to more clearly apprehend the continued popularity of the TAT. In clinical practice, most patients are referred for psychological assessment when there is a confusing differential diagnosis to be sorted out or, less commonly, when there is a seemingly insurmountable therapeutic impasse. The need for routine testing of patients has waned as the successive versions of the Diagnostic and Statistical Manual have become more objective and precise. Now more commonly than ever, the role of the personality assessment portion of the test battery, and the TAT in particular, is to generate an understanding of the patient’s inner world and dynamics that will hopefully advance the therapeutic process, rather than primarily to determine diagnosis. Considered in this regard, the variety of instructions, card sets, and interpretive schemes associated with the TAT, responsible in part for the criticisms of it as a test, are a moot point. Essentially, the TAT is given in most clinical assessments as a semistructured interview technique, and its elicitation of story narratives provides rich material that primes the understanding of the psychotherapist who will be providing treatment. The therapist effectively gets a head start in comprehending the complex individuality of the patient, and the fuller meaning of the TAT material becomes apparent within the context of treatment. This is exactly as Henry Murray intended it when he introduced the TAT as an aid to shortening the length of therapy or analysis, capitalizing on the technique’s ability to quickly reveal situational issues and psychodynamics (Morgan & Murray, 1935; Murray, 1943).

Up until 1948, the published references indicated that the TAT was rarely used as a diagnostic device in a psychiatric sense, but that it had been used extensively in exploring unconscious dynamics. We find it surprising that Murray’s original vision for the therapeutic use of the test rather quickly fell by the wayside, probably because of historical factors. The burgeoning development of clinical psychology as a diagnostic discipline that began in World War II may have influenced the subsequent direction of TAT use (Rosenzweig, 1948).

However, not everyone forgot about the TAT’s therapeutic potential. As Bellak (1999) has remarked,

Cooperation and insight for both therapist and patient are frequently gained when patients discover, to their surprise, in their TAT stories that they have unwittingly reproduced some of their most important problems. (p. 136)

Bellak (1999; Bellak & Abrams, 1997) describes several illustrative ways that he uses the TAT for psychotherapy: reading stories back to the patient; asking patients what they think about their stories; informing the patient that the stories are different from those of others; asking the patient to be the psychologist by identifying what the stories have in common; and using cards 3BM and 14 to get immobilized or suicidal patients to indirectly talk about their feelings. Rosenzweig (1948, 1999) uses a similar approach, by asking patients to free-associate to their TAT stories to induce catharsis or overcome blockages. Other clinicians have also written about the use of the TAT in therapy (Araoz, 1972; Hoffman & Kuperman, 1990; Peterson, 1990; Ullman, 1957). A very interesting recent approach derives from narrative psychotherapy. Patients who complete telling their stories to the TAT pictures go back over the cards and make up additional stories from the point of view of the secondary characters or antagonists. Taken together, all the stories express motives that make up the multiplicity of the self’s voices (Hermans, 1999; Hermans & Kempen, 1993).

Recent researchers have provided data that point to the value of collaboratively sharing test responses with patients in feedback sessions, moving from ego-syntonic information to interpersonal and intrapersonal themes drawn directly from projective material. The approach, which is consistent with therapeutic use of the TAT, has been shown to strengthen the therapeutic alliance, as reflected in decreased treatment dropout (Ackerman, Hilsenroth, Baity, & Blagys, 2000).

The TAT has been used in organizational psychology as part of a selection battery, beginning with the work of Murray and Stein (1943) in selecting combat officers and continuing into the complex assessment protocol of the Office of Strategic Services, forerunner to today’s Central Intelligence Agency, for the selection of spies (Office of Strategic Services Assessment Staff, 1948). Although still included in texts on personality assessment in industrial and organizational settings (Rothstein & Goffin, 2000), legal constraints on the types of tests permitted for personnel selection seem to have virtually eliminated the TAT’s use in this regard, and our colleagues in two international organizational psychology firms tell us they are unaware of anyone using the TAT for selection purposes.

McClelland reports having used a method in an organizational setting that did not use TAT cards but quite closely resembled the TAT experience. He used a TAT-type interview that asked race relations consultants to tell three stories of successful experiences as a consultant and three stories of unsuccessful experiences as a consultant, while probing for expression of a true report of thoughts, feelings, and actions of the person being interviewed. He was thereby able to distinguish how outstanding consultants differed from ordinary ones, and then proceeded to develop scoring systems for these competency differences that were similar to scorings of needs for the TAT. Eventually, he was able to reliably code seven different thematic differences and to spur the development of a training program designed to teach the competencies to new consultants. Dozens of similar studies were subsequently undertaken to identify the competencies associated with outstanding performance in different managerial or leadership positions (McClelland, 1999). This work was reported in Spencer and Spencer (1993). McClelland reports that about a dozen competencies have now been identified as being connected with success in managerial positions. Applications of this information within an organization have cut executive turnover, saved money, and helped in the selection of new employees who performed more successfully in their first year.

FUTURE DEVELOPMENTS

The TAT met the millennium with no signs of fading away, even though many other projective techniques have not similarly withstood the test of time. Its future seems assured, since many psychologists accept Henry Murray’s premise: People reveal who they are when they make up stories. Or, as McAdams has stated, one’s self-stories are the organizing structure of personality (McAdams, 1993, 2001).

New approaches to scoring will continue to emerge, though we do not foresee that anyone will soon validate a complete system of clinical interpretation such as those introduced by early users of the TAT; the task is simply too daunting. Instead, there is likely to continue to be an accumulation of studies establishing validities for single or small systems of TAT-derived predictors of various clinical states and personality characteristics. Whether this research will make its way into clinical practice remains to be seen, but the chances certainly seem greater for those constructs that are rapidly and easily scored and that can be taught so as to produce good accuracy across users.

We hope that we are entering an era where the TAT can be seen for what it truly and simply is: a pantheoretical projective technique that accesses a person’s unique narratives. It is a wonderfully flexible method, for these stories can be used in myriad ways. Whether employed primarily as a technique to learn more about a patient’s imaginings or fantasy life, as an adjunct to help unblock psychotherapy, as a method of uncovering the theme of an individual life story, as a tool to discern the multiple voices of the self, or as a psychometric measure of a focal personality construct—the effectiveness of the TAT as a vehicle for discovery is undeniable. But the inferences and clinical interpretations made from it, unless they have been validated, will remain open to debate.

REFERENCES

Abrams, D.M. (1999). Six decades of the Bellak scoring system, among others. In L. Gieser & M.I. Stein (Eds.), Evocative images: The Thematic Apperception Test and the art of projection (pp. 143–159). Washington, DC: American Psychological Association.
Ackerman, S.J., Clemence, A.J., Weatherill, R., & Hilsenroth, M.J. (1999). Use of the TAT in the assessment of DSM-IV Cluster B personality disorders. Journal of Personality Assessment, 73, 422–448.
Ackerman, S.J., Hilsenroth, M.J., Baity, M.R., & Blagys, M.D. (2000). Interaction of therapeutic process and alliance during psychological assessment. Journal of Personality Assessment, 75, 82–109.
American Psychological Association (1992). Ethical principles of psychologists and code of conduct. American Psychologist, 47, 1597–1611.
Anastasi, A. (1988). Psychological testing (6th ed.). New York: Macmillan.
Anderson, J.W. (1999). Henry A. Murray and the creation of the Thematic Apperception Test. In L. Gieser & M.I. Stein (Eds.), Evocative images: The Thematic Apperception Test and the art of projection (pp. 23–38). Washington, DC: American Psychological Association.
Araoz, D.L. (1972). The Thematic Apperception Test in marital therapy. Journal of Contemporary Psychotherapy, 5, 41–48.
Archer, R.P., Maruish, M., Imhof, E.A., & Piotrowski, C. (1991). Psychological test usage with adolescent clients: 1990 survey findings. Professional Psychology: Research and Practice, 22, 247–252.
Arnold, M.B. (1962). Story sequence analysis. New York: Columbia University Press.
Aron, B. (1949). A manual for analysis of the Thematic Apperception Test. Berkeley, CA: Willis E. Berg.
Aronow, E., Weiss, K.A., & Reznikoff, M. (2001). A practical guide to the Thematic Apperception Test: The TAT in clinical practice. Philadelphia: Brunner/Mazel.
Atkinson, J.W. (1958). Motives in fantasy, action, and society. Princeton, NJ: Van Nostrand.
Atkinson, J.W., Heyns, R.W., & Veroff, J. (1954). The effect of experimental arousal of the affiliation motive on thematic apperception. Journal of Abnormal and Social Psychology, 49, 405–410.
Avila-Espada, A. (in press). Objective scoring for the TAT. In R.H. Dana (Ed.), Handbook of cross-cultural and multicultural personality assessment. Hillsdale, NJ: Erlbaum.
Bailey, B.E., & Green, J., III. (1977). Black Thematic Apperception Test stimulus material. Journal of Personality Assessment, 41, 25–30.
Ball, J.D., Archer, R.P., & Imhof, E.A. (1994). Time requirements of psychological testing: A survey of practitioners. Journal of Personality Assessment, 63, 239–249.
Bellak, L. (1954). The T.A.T. and C.A.T. in clinical use. New York: Grune & Stratton.
Bellak, L. (1986). The Thematic Apperception Test, Children’s Apperception Test, and the Senior Apperception Technique in clinical use (4th ed.). Orlando, FL: Academic Press.
Bellak, L. (1993). The Thematic Apperception Test, Children’s Apperception Test, and Senior Apperception Technique in clinical use (5th ed.). Boston: Allyn & Bacon.
Bellak, L. (1999). My perceptions of the Thematic Apperception Test in psychodiagnosis and psychotherapy. In L. Gieser & M.I. Stein (Eds.), Evocative images: The Thematic Apperception Test and the art of projection (pp. 133–141). Washington, DC: American Psychological Association.
Bellak, L., & Abrams, D.M. (1997). The T.A.T., the C.A.T., and the S.A.T. in clinical use (6th ed.). Needham Heights, MA: Allyn & Bacon.
Blankenship, V., & Zoota, A.L. (1998). Comparing power imagery in TATs written by hand or on the computer. Behavior Research Methods, Instruments, and Computers, 30, 441–448.
Butcher, J.N., & Rouse, S.V. (1996). Personality: Individual differences and clinical assessment. Annual Review of Psychology, 47, 87–111.
Butler, M., Retzlaff, P.H., & Van de Ploeg, R. (1991). Neuropsychological test usage. Professional Psychology: Research and Practice, 22, 510–512.
Camara, W.J., Nathan, J.S., & Puente, A.E. (2000). Psychological test usage: Implications for professional psychology. Professional Psychology: Research and Practice, 31, 141–154.
Constantino, G., & Malgady, R.G. (1983). Verbal fluency of Hispanic, Black, and White children on TAT and TEMAS, a new thematic apperception test. Hispanic Journal of Behavioral Sciences, 5, 199–206.
Constantino, G., & Malgady, R.G. (1999). The Tell-Me-A-Story Test: A multicultural offspring of the Thematic Apperception Test. In L. Gieser & M.I. Stein (Eds.), Evocative images: The Thematic Apperception Test and the art of projection (pp. 191–206). Washington, DC: American Psychological Association.
Constantino, G., Malgady, R.G., Colon-Malgady, G., & Bailey, J. (1992). Clinical utility of the TEMAS with non-minority children. Journal of Personality Assessment, 59, 433–438.
Constantino, G., Malgady, R.G., & Rogler, L.H. (1988). TEMAS (Tell-Me-A-Story) manual. Los Angeles: Western Psychological Services.
Cramer, P. (1986). Fantasies of college men: Then and now. Psychoanalytic Review, 73, 567–578.
Cramer, P. (1991). The development of defense mechanisms: Theory, research, and assessment. New York: Springer-Verlag.
Cramer, P. (1996). Storytelling, narrative, and the Thematic Apperception Test. New York: Guilford Press.
Cramer, P. (1999). Future directions for the Thematic Apperception Test. Journal of Personality Assessment, 72, 74–92.
Cramer, P., & Blatt, S.J. (1990). Use of the TAT to measure change in defense mechanisms following intensive psychotherapy. Journal of Personality Assessment, 54, 236–251.
Cramer, P., & Blatt, S.J. (1993). Change in defense mechanisms following intensive treatment, as related to personality organization and gender. In U. Hentschel, G.J.W. Smith, W. Ehlers, & J.D. Draguns (Eds.), The concept of defense mechanisms in contemporary psychology (pp. 310–320). New York: Springer-Verlag.
Dana, R.H. (1982). A human science model for personality assessment with projective techniques. Springfield, IL: Charles C. Thomas.
Dana, R.H. (1996). The Thematic Apperception Test. In C.S. Newmark & D.M. McCord (Eds.), Major psychological assessment instruments (2nd ed.; pp. 110–124). Needham Heights, MA: Allyn & Bacon.
Dana, R.H. (1999). Cross-cultural–multicultural use of the Thematic Apperception Test. In L. Gieser & M.I. Stein (Eds.), Evocative images: The Thematic Apperception Test and the art of projection (pp. 177–190). Washington, DC: American Psychological Association.
Doane, J.A., Miklowitz, D.J., Oranchak, E., Apodaca, R.F., Karno, M., Strachan, A.M., et al. (1989). Parental communication deviance and schizophrenia: A cross-cultural comparison of Mexican- and Anglo-Americans. Journal of Abnormal Psychology, 98, 487–490.
Douglas, C. (1993). Translate this darkness: The life of Christiana Morgan. New York: Simon & Schuster.
Entwisle, D.R. (1972). To dispel fantasies about fantasy-based measures of achievement motivation. Psychological Bulletin, 83, 1131–1153.
Eron, L.D. (1950). A normative study of the Thematic Apperception Test. Psychological Monographs, 64 (Whole No. 315).
Eron, L.D. (1953). Responses of women to the Thematic Apperception Test. Journal of Consulting Psychology, 17, 269–282.
Frank, L.K. (1939). Projective methods for the study of personality. Journal of Psychology, 8, 343–389.
Frey-Rohn, L. (1976). From Freud to Jung: A comparative study of the psychology of the unconscious. New York: Delta.
Garb, H.H. (1998). Recommendations for training in the use of the Thematic Apperception Test. Professional Psychology: Research and Practice, 29, 621–622.
Gieser, L., & Morgan, W.G. (1999). Look homeward, Harry: Literary influence on the development of the Thematic Apperception Test. In L. Gieser & M.I. Stein (Eds.), Evocative images: The Thematic Apperception Test and the art of projection (pp. 53–64). Washington, DC: American Psychological Association.
Gieser, L., & Stein, M.I. (1999). Evocative images: The Thematic Apperception Test and the art of projection. Washington, DC: American Psychological Association.
Gold, J.A., Zaremski, M.J., Rappaport, E., & Shefrin, D.H. (1993). Daubert v. Merrell Dow: The Supreme Court tackles scientific evidence in the courtroom. Journal of the American Medical Association, 270, 2964–2967.
Harrison, R. (1965). Thematic apperception methods. In B.B. Wolman (Ed.), Handbook of clinical psychology (pp. 562–620). New York: McGraw-Hill.
Hartman, A.A. (1949). An experimental examination of the thematic apperception technique in clinical diagnosis. Psychological Monographs, 63(8, Whole No. 303).
Hartman, A.A. (1970). A basic TAT set. Journal of Projective Techniques, 34, 391–396.
Henry, W.E. (1956). The analysis of fantasy: The thematic apperception technique in the study of personality. New York: Wiley. (Reprinted 1973 in Huntington, NY, by Krieger.)
Henry, W.E., & Farley, J. (1959). Symposium on current aspects of the problem of validity: A study in validation of the Thematic Apperception Test. Journal of Projective Techniques, 23, 273–277.
Hermans, H.J.M. (1999). The Thematic Apperception Test and the multivoiced nature of the self. In L. Gieser & M.I. Stein (Eds.), Evocative images: The Thematic Apperception Test and the art of projection (pp. 207–211). Washington, DC: American Psychological Association.
Hermans, H.J.M., & Kempen, H.J.G. (1993). The dialogical self: Meaning as movement. San Diego, CA: Academic Press.
Hoffman, S., & Kuperman, N. (1990). Indirect treatment of traumatic psychological experiences: The use of the TAT cards. American Journal of Psychotherapy, 44, 107–115.
Holmstrom, R.W., Silber, D.E., & Karp, S.A. (1990). Development of the Apperceptive Personality Test. Journal of Personality Assessment, 54, 252–264.
Holt, R.R. (1951). Methods in clinical psychology: Volume I. Projective assessment. New York: Plenum Press.
Holt, R.R. (1958). Formal aspects of the TAT—a neglected resource. Journal of Projective Techniques, 22, 163–172.
Hurley, A.D., & Sovner, R. (1985). The use of the Thematic Apperception Test in mentally retarded persons. Psychiatric Aspects of Mental Retardation Reviews, 4, 9–12.
Jemmott, J.J. (1987). Social motives and susceptibility to disease. Journal of Personality, 55, 267–298.
Johnson, J.L. (1994). The Thematic Apperception Test and Alzheimer’s disease. Journal of Personality Assessment, 62, 314–319.
Jung, C.G. (1976). The visions seminars. Zurich, Switzerland: Spring Publications.
Karon, B.P. (1981). The Thematic Apperception Test. In A.I. Rabin (Ed.), Assessment with projective techniques: A concise introduction (pp. 85–120). New York: Springer.
Karon, B.P. (2000). The clinical interpretation of the Thematic Apperception Test, Rorschach, and other clinical data: A reexamination of statistical versus clinical prediction. Professional Psychology: Research and Practice, 31, 230–233.
Karp, S.A. (Ed.). (1999). Studies of objective/projective personality tests. Brooklandville, MD: Objective/Projective Tests.
Keiser, R.E., & Prather, E.N. (1990). What is the TAT? A review of ten years of research. Journal of Personality Assessment, 55, 800–803.
Kobler, F.J. (1964). Casebook in psychopathology. Staten Island, NY: Alba House.
Kraiger, K., Hakel, M.D., & Cornelius, E.T. (1984). Exploring fantasies of TAT reliability. Journal of Personality Assessment, 48, 365–370.
La Rue, A., & Watson, J. (1998). Psychological assessment of older adults. Professional Psychology: Research and Practice, 29, 5–14.
Lees-Haley, P.R., Smith, H.H., Williams, C.W., & Dunn, J.T. (1996). Forensic neuropsychological test usage: An empirical survey. Archives of Clinical Neuropsychology, 11, 45–51.
Lilienfeld, S.O., Wood, J.M., & Garb, H.N. (2000). The scientific status of projective techniques. Psychological Science in the Public Interest, 1, 27–66.
Lundy, A. (1985). The reliability of the Thematic Apperception Test. Journal of Personality Assessment, 49, 141–145.
Malgady, R.G. (1996). The question of cultural bias in assessment and diagnosis of ethnic minority clients: Let’s reject the null hypothesis. Professional Psychology: Research and Practice, 27, 101–105.
McAdams, D.P. (1982). Intimacy motivation. In A.J. Stewart (Ed.), Motivation and society (pp. 133–171). San Francisco: Jossey-Bass.
McAdams, D.P. (1985). Power, intimacy and the life story. New York: Guilford Press.
McAdams, D.P. (1993). Stories we live by: Personal myths and the making of the self. New York: Morrow.
McAdams, D.P. (2001). The person: An integrated introduction to personality psychology (3rd ed.). Fort Worth, TX: Harcourt College Publishing.
McClelland, D.C. (1979). Inhibited power motivation and high blood pressure in men. Journal of Abnormal Psychology, 88, 182–190.
McClelland, D.C. (1980). Motive dispositions: The merits of operant and respondent measures. In L. Wheeler (Ed.), Review of personality and social psychology (Vol. 1, pp. 10–41). Beverly Hills, CA: Sage.
McClelland, D.C. (1989). Motivational factors in health and disease. American Psychologist, 44, 675–683.
McClelland, D.C. (1999). How the test lives on: Extensions of the Thematic Apperception Test approach. In L. Gieser & M.I. Stein (Eds.), Evocative images: The Thematic Apperception Test and the art of projection (pp. 163–175). Washington, DC: American Psychological Association.
McClelland, D.C., Atkinson, J.W., Clark, R.A., & Lowell, E.L. (1953). The achievement motive. New York: Appleton-Century-Crofts.
McClelland, D.C., Davidson, R.J., Saron, C., & Floor, E. (1980). The need for power, brain norepinephrine turnover and learning. Biological Psychiatry, 10, 93–102.
McClelland, D.C., Koestner, R., & Weinberger, J. (1989). How do self-attributed and implicit motives differ? Psychological Review, 96, 690–702.
McClelland, D.C., & Kirshnit, C. (1988). The effect of motivational arousal through films on salivary immunoglobulin. Psychology and Health, 2, 31–52.
McFarland, R.A. (1984). Effects of music upon emotional content of TAT stories. Journal of Psychology, 11, 227–234.
Meyer, G.J., Finn, S.E., Eyde, L.D., Kay, G.G., Morland, K.L., Dies, R.R., et al. (2001). Psychological testing and psychological assessment. American Psychologist, 56, 128–165.
Miklowitz, D.J., Velligan, D.I., Goldstein, M.J., & Nuechterlein, K.H. (1991). Communication deviance in families of schizophrenic and manic patients. Journal of Abnormal Psychology, 100, 163–173.
Morgan, C.D., & Murray, H.A. (1935). A method for investigating fantasies: The Thematic Apperception Test. Archives of Neurology and Psychiatry, 34, 289–306.
Morgan, W.G. (2000). Origin and history of an early TAT card: Picture C. Journal of Personality Assessment, 74, 88–94.
Murray, C. (1999). Foreword: Harry’s compass. In L. Gieser & M.I. Stein (Eds.), Evocative images: The Thematic Apperception Test and the art of projection (pp. ix–xi). Washington, DC: American Psychological Association.
Murray, H.A. (Ed.). (1938). Explorations in personality: A clinical and experimental study of fifty men of college age. New York: Oxford University Press.
Murray, H.A. (1943). Thematic Apperception Test: Manual. Cambridge, MA: Harvard University Press.
Murray, H.A., & Stein, M.I. (1943). Note on the selection of combat officers. Psychosomatic Medicine, 5, 386–391.
Murstein, B.I. (1961). The role of the stimulus in the manifestation of fantasy. In J. Kagan & G.S. Lesser (Eds.), Contemporary issues in thematic apperceptive methods (pp. 229–273). Springfield, IL: Charles C. Thomas.
Murstein, B.I. (1972). Normative written TAT responses for a college sample. Journal of Personality Assessment, 36, 109–147.
Murstein, B.I., & Mathes, S. (1996). Projection on projective techniques = pathology: The problem that is not being addressed. Journal of Personality Assessment, 66, 337–349.
Murstein, B.I., & Wolf, S.R. (1970). Empirical test of the “levels” hypothesis with five projective techniques. Journal of Abnormal Psychology, 75, 38–44.
Mussen, P.H., & Naylor, H.K. (1954). The relationships between overt and fantasy aggression. Journal of Abnormal and Social Psychology, 49, 235–240.
Office of Strategic Services Assessment Staff. (1948). Assessment of men. New York: Rinehart.
Patalano, J. (1986). Creativity and the TAT blank card. Journal of Creative Behavior, 20, 127–133.
Peterson, C.A. (1990). Administration of the Thematic Apperception Test: Contributions of psychoanalytic psychotherapy. Journal of Contemporary Psychotherapy, 20, 191–200.
Piotrowski, C., Sherry, D., & Keller, J.W. (1985). Psychodiagnostic test usage: A survey of the Society for Personality Assessment. Journal of Personality Assessment, 49, 115–119.
Piotrowski, C., & Zalewski, C. (1993). Training in psychodiagnostic testing in APA-approved PsyD and PhD clinical training programs. Journal of Personality Assessment, 61, 394–405.
Pittluck, P. (1950). The relation between aggressive fantasy and overt behavior. Unpublished doctoral dissertation, Yale University, New Haven, CT.
Rapaport, D., Gill, M., & Schafer, R. (1945). Diagnostic psychological testing: The theory, statistical evaluation, and diagnostic application of a battery of tests. Chicago, IL: Year Book. (Revised Ed., 1968, R.R. Holt, Ed.).
Retief, A. (1987). Thematic apperception testing across cultures: Tests of selection versus tests of inclusion. South African Journal of Psychology, 17, 47–55.
Robinson, F.G. (1992). Love’s story told: A life of Henry A. Murray. Cambridge, MA: Harvard University Press.
Ronan, G.F., Colavito, V.A., & Hammontree, S.R. (1993). Personal problem-solving system for scoring TAT responses: Preliminary reliability and validity data. Journal of Personality Assessment, 61, 28–40.
Ronan, G.F., Date, A.L., & Weisbrod, M. (1995). Personal problem-solving scoring of the TAT: Sensitivity to training. Journal of Personality Assessment, 64, 119–131.
Rosenzweig, S. (1948). The thematic apperception technique in diagnosis and therapy. Journal of Personality, 16, 437–444.
Rosenzweig, S. (1999). Pioneer experiences in the clinical development of the Thematic Apperception Test. In L. Gieser & M.I. Stein (Eds.), Evocative images: The Thematic Apperception Test and the art of projection (pp. 39–50). Washington, DC: American Psychological Association.
Rossini, E.D., & Moretti, R.J. (1997). Thematic Apperception Test (TAT) interpretation: Practice recommendations from a survey of clinical psychology doctoral programs accredited by the American Psychological Association. Professional Psychology: Research and Practice, 28, 393–398.
Rothstein, M.G., & Goffin, R.D. (2000). The assessment of personality constructs in industrial-organizational psychology. In R.D. Goffin & E. Helmes (Eds.), Problems and solutions in human assessment: Honoring Douglas N. Jackson at seventy (pp. 215–248). Norwell, MA: Kluwer Academic Publishers.
Rund, B.R. (1986). Communication deviance in parents of schizophrenics. Family Process, 25, 133–147.
Sass, L.A., Gunderson, J.G., Singer, M.T., & Wynne, L.C. (1984). Parental communication deviance and forms of thinking in male schizophrenic offspring. Journal of Nervous and Mental Disease, 172, 513–520.
Schafer, R. (1958). How was this story told? Journal of Projective Techniques, 22, 181–210.
Schwartz, L.A. (1932). Social situation pictures in the psychiatric interview. American Journal of Orthopsychiatry, 2, 124–132.
Shneidman, E. (1951). Thematic test analysis. New York: Grune & Stratton.
Shuman, D.W., & Sales, B.D. (Eds.). (1999). Daubert’s meanings for the admissibility of behavioral and social science evidence. Special edition of Psychology, Public Policy, and Law, 5 (March).
Silber, D.E., Karp, S.A., & Holmstrom, R.W. (1990). Recommendations for the clinical use of the Apperceptive Personality Test. Journal of Personality Assessment, 55, 790–799.
Silverman, L.H. (1959). A Q-sort study of the validity of evaluations made from projective techniques. Psychological Monographs, 73(7, Whole No. 477), 28.
Singer, M.T., & Wynne, L.C. (1966). Principles for scoring communication defects and deviances in parents of schizophrenics: Rorschach and TAT scoring manuals. Psychiatry, 29, 260–288.
Smith, C.P. (Ed.). (1992). Motivation and personality: Handbook of thematic content analysis. New York: Cambridge University Press.
Spencer, L.M., Jr., & Spencer, S.M. (1993). Competence at work: Models for superior performance. New York: Wiley.
Stein, M.I. (1948). The Thematic Apperception Test: A manual for its clinical use with males. Cambridge, MA: Addison-Wesley.
Stein, M.I. (1999). A personological approach to the Thematic Apperception Test. In L. Gieser & M.I. Stein (Eds.), Evocative images: The Thematic Apperception Test and the art of projection (pp. 125–131). Washington, DC: American Psychological Association.
Stein, M.I., & Gieser, L. (1999). The zeitgeists and events surrounding the birth of the Thematic Apperception Test. In L. Gieser & M.I. Stein (Eds.), Evocative images: The Thematic Apperception Test and the art of projection (pp. 15–21). Washington, DC: American Psychological Association.
Teglasi, H. (2001). Essentials of TAT and other storytelling techniques assessment. New York: Wiley.
Thompson, C.E. (1949). The Thompson modification of the Thematic Apperception Test. Rorschach Research Exchange, 13, 469–478.
Tomkins, S.S. (1947). The Thematic Apperception Test: The theory and technique of interpretation. New York: Grune & Stratton.
Ullman, L. (1957). Selection of neuropsychiatric patients for group psychotherapy. Journal of Consulting Psychology, 21, 277–280.
Veroff, J. (1957). Development and validation of a projective measure of power motivation. Journal of Abnormal and Social Psychology, 54, 1–8.
Wachs, T.D. (1966). Personality testing and the handicapped: A review. Journal of Personality Assessment, 53, 827–831.
Watkins, C.E., Campbell, V.L., Nieberding, R., & Hallmark, R. (1995). Contemporary practice of psychological assessment by clinical psychologists. Professional Psychology: Research and Practice, 26, 54–60.
Weinberger, J., & McClelland, D.C. (1990). Cognitive versus traditional motivational models: Irreconcilable or complementary? In E.T. Higgins & R.M. Sorrentino (Eds.), Handbook of motivation and cognition (Vol. 2, pp. 562–597). New York: Guilford Press.
Weiner, I.B. (1989). On competence and ethicality in psychodiagnostic assessment. Journal of Personality Assessment, 53, 827–831.
Westen, D. (1991a). Clinical assessment of object relations using the TAT. Journal of Personality Assessment, 56, 56–74.
Westen, D. (1991b). Social cognition and object relations. Psychological Bulletin, 109, 429–455.
Westen, D., Lohr, N.E., Silk, K., Gold, L., & Kerber, K. (1990). Object relations and social cognition in borderlines, major depressives, and normals: A Thematic Apperception Test analysis. Psychological Assessment: A Journal of Consulting and Clinical Psychology, 2, 355–364.
Winter, D.G. (1973). The power motive. New York: Free Press.
Winter, D.G. (1996). Personality: Analysis and interpretation of lives. New York: McGraw-Hill.
Winter, D.G. (1998). “Toward a science of personality psychology”: David McClelland’s development of empirically derived TAT measures. History of Psychology, 1, 130–153.
Winter, D.G., John, O.P., Stewart, A.J., Klohnen, E.C., & Duncan, L.E. (1998). Traits and motives: Toward an integration of two traditions in personality research. Psychological Review, 105, 230–250.
Winter, D.G., & Stewart, A.J. (1977). Power motive reliability as a function of retest instructions. Journal of Consulting and Clinical Psychology, 45, 436–440.
Woike, B.A., & McAdams, D.P. (2001). TAT-based personality measures have considerable validity. American Psychological Society Observer, 14, 10.
Zhang, T., Xu, S., Cai, Z., & Chen, Z. (1993). Research on the Thematic Apperception Test: Chinese revision and its norms. Acta Psychologica Sinica, 25, 314–323.
Ziskin, J. (1995). Coping with psychiatric and psychological testimony (5th ed., Vol. II). Marina del Rey, CA: Law and Psychology Press.

CHAPTER 28 The Use of Sentence Completion Tests with Adults

ALISSA SHERRY, ERIC DAHLEN, AND MARGOT HOLADAY

HISTORY OF SENTENCE COMPLETION METHODS
ROTTER INCOMPLETE SENTENCES BLANK
Test Description
Theoretical Basis
Test Development
Psychometric Characteristics
Range of Applicability and Limitations of Sentence Completion Methods
Cross-Cultural and Diversity Factors
Accommodation for Populations With Disabilities
Computer-Based Testing
Current Research Status
Clinical Applications
Future Directions
APPENDIX: LIST OF SENTENCE COMPLETION METHODS FOUND IN THE LITERATURE
REFERENCES

Although sentence completion tests (SCTs) are among the most common approaches to personality assessment (Archer, Maruish, Imhof, & Piotrowski, 1991; Camara, Nathan, & Puente, 2000; Goh & Fuller, 1983; Kennedy, Faust, Willis, & Piotrowski, 1994; Piotrowski, 1985), they are not generally included in popular sources on assessment (e.g., Groth-Marnat, 1999). Part of the problem is that “SCT” is a generic label used to describe many verbal projective techniques. In fact, our literature review revealed over 40 SCTs used in research and practice settings (see Appendix), and it is likely that this is a conservative estimate, considering the number of such instruments that were never published.

Because a comprehensive review of all known SCTs was beyond the scope of this chapter and unlikely to be useful to readers of a volume such as this, it was necessary to limit our focus. First, we excluded those instruments that are primarily used in neuropsychological and cognitive assessment, those that are used exclusively with children, and those that had not been published in the literature. Second, the decision about which of the remaining SCTs to include was based on a recent study in which 60 members of the Society for Personality Assessment were surveyed about their use of SCTs (Holaday, Smith, & Sherry, 2000). Thus, this chapter discusses those sentence completion tests that are most commonly used by clinicians in order to facilitate personality assessment with adult clients. Because this survey found that the Rotter Incomplete Sentences Blank (Rotter, Lah, & Rafferty, 1992; Rotter & Rafferty, 1950) was the most widely used instrument by a wide margin, it will receive considerable attention here.

Following an overview of the history of SCTs, the Rotter Incomplete Sentences Blank will be described in terms of its theoretical rationale, development, and psychometric properties. We will then broaden our discussion to the range of applicability and limitations of SCTs in general, addressing their use with diverse populations and persons with disabilities, as well as their computerization, current research, and future directions.

HISTORY OF SENTENCE COMPLETION METHODS

As noted in many other places in this volume, projective techniques are based on the hypothesis that the manner in which an individual responds to a relatively unstructured task reveals latent aspects of his or her personality (Anastasi & Urbina, 1997). Because more ambiguous tasks are thought to be less likely to evoke defensive responses from the respondent, projective techniques are typically viewed as accessing aspects of one’s personality that are not open to self-report. Sentence completion instruments are generally considered to be one type of verbal projective technique, providing more structure than inkblots and some drawing techniques and less structure than many thematic methods.

Despite their frequent association with projective methods, SCTs did not begin as a projective technique. Hermann Ebbinghaus introduced the first known SCT in 1897 as a means of studying the reasoning ability and intellectual capacity of schoolchildren (Ebbinghaus, 1897, in Lah, 1989b, 2001). This early SCT is also considered by some to be one of the first modern intelligence tests, later inspiring Alfred Binet and Theodore Simon to incorporate a version of Ebbinghaus’s method in their early intelligence scale (Lah, 1989b, 2001).

The use of SCTs as projective techniques originated from Jung’s early (1916) use of word association as a method for studying personality. However, the use of a single stimulus word was found to be problematic because response frequencies varied widely with demographic and cultural factors (Anastasi & Urbina, 1997), and researchers began to explore the use of phrases and sentence stems in the elicitation of responses. In the late 1920s, Arthur Payne used SCTs to measure personal traits as an aid in vocational counseling of college students, marking the first use of these methods for studying personality (Lah, 1989b, 2001). Further in this lineage was Alexander Tendler (1930), who began to use the SCT with the more “projective” approach that many associate with SCTs. As a measure of emotional insight, he devised sentence stems that tapped into a person’s emotional states: Each stem began with “I,” was followed by an emotion word such as “love” or “hate,” and was to be finished by the subject.

SCTs are used in a variety of settings and populations in order to provide information about one’s overall adjustment, as well as qualitative data pertaining to one’s latent personality. They are often administered during the course of personality, vocational, and cognitive assessment batteries, and they are used with adults, adolescents, and children. The Rotter Incomplete Sentences Blank, originally developed in 1950 by Rotter and Rafferty, continues to be among the most popular SCTs, and it is to this influential measure that we now turn.

ROTTER INCOMPLETE SENTENCES BLANK

The Rotter Incomplete Sentences Blank (RISB; Rotter & Rafferty, 1950; Rotter et al., 1992) was developed primarily for clinical purposes and has enjoyed widespread use. In fact, a recent survey of members of the Society for Personality Assessment found that the RISB was the most commonly used of 15 SCTs about which respondents were asked (Holaday et al., 2000). The RISB is consistent with many accepted theories of personality, and it has been used in a variety of settings (e.g., industry, military, junior and senior high schools, research settings, and hospital and mental health clinics) both during the initial assessment interview and as part of a more thorough test battery.

Test Description

The RISB is currently in its second edition and is published by the Psychological Corporation. It consists of 40 sentence stems that were originally designed to aid in screening overall adjustment among college students (Rotter & Rafferty, 1950) and has since broadened to include high school students and adults (Rotter et al., 1992). Like many other SCTs, the RISB has a clear projective orientation, as the stems were constructed with relatively low face validity. Item stems from the Rotter are similar to the following:

I dislike . . .
The fondest time . . .
I wonder about . . .
Where I was raised . . .
I feel guilty when . . .
At night . . .
The opposite sex . . .
The most wonderful . . .
What gets on my nerves . . .
Everyone . . .

The 40 items are printed on both sides of one sheet of paper that is designed to be self-administered. This allows the RISB to be administered individually or in groups. Alternatively, the RISB can be administered orally. Administration of the RISB typically requires 20 to 40 minutes, depending on the level of detail provided. While no special training is needed for the administration of the RISB, interpretation should be attempted only by trained professionals (Rotter et al., 1992).

Theoretical Basis

Although it was originally designed as a screening instrument of overall adjustment among college students, the RISB is a semistructured projective personality assessment technique that, like other projective techniques, is assumed to tap into the latent personality of the respondent. Although SCTs are often viewed as an extension of word-association techniques, the RISB differs in that there is no pressure on the respondent for an immediate response. Thus, similar to the Thematic Apperception Test, the material presented in the response is usually that which the individual is willing to give rather than that which she or he cannot help but give (Rotter et al., 1992).

The RISB was designed to measure both adjustment and maladjustment on a graduated scale, and it is assumed that the individual’s level of each of these is reflected in his or her statements about himself or herself, his or her work, relationships with others, or other aspects of the individual’s life. Rotter and colleagues (1992) defined adjustment as “the relative freedom from prolonged unhappy/dysphoric states (emotions) of the individual, the ability to cope with frustration, the ability to initiate and maintain constructive activity, and the ability to establish and maintain satisfying interpersonal relationships” (p. 4). Similarly, maladjustment was defined as

the presence of prolonged unhappy/dysphoric states (emotions) of the individual, inability to cope or difficulty in coping with frustration, a lack of constructive activity or interference in initiating or maintaining such activity, or the inability to establish and maintain satisfying interpersonal relationships. (p. 5)

This distinction sets the RISB apart from many screening instruments that simply define adjustment as the absence of psychopathology.

Test Development

The original version of the RISB, the RISB-College Form, was published in 1950 by Rotter and Rafferty and was based on an experimental form used in the United States Army by Rotter and Willerman (1947). According to Rotter and colleagues (1992), two primary objectives guided the development of the RISB. First, they sought to create a projective measure that could be administered and scored easily enough to permit its widespread use in screening and research. Thus, their goal was the development of an instrument that would retain the advantages of projective methods while at the same time utilizing a standardized method of administration and an objective scoring system. Second, they wanted their new instrument to save clinicians time by providing specific diagnostic information. Unlike some other projective techniques, the RISB was not designed to provide information about the whole personality or to uncover deep structural variables. Instead, they intended to create a measure that would help clinicians structure early interviews, thereby improving diagnostic efficiency and aiding treatment planning.

The second edition of the RISB was published in 1992 by Rotter and colleagues, who noted, “Except for two slight changes, the 40 sentence stems of this second edition are identical to those of the first edition” (p. 1). This revision was undertaken in order to provide an updated literature review, along with updated normative data, scoring criteria, and examples for use in scoring the instrument. The current RISB has three forms: High School, College, and Adult. The College Form appeared first, and the other forms followed with slight changes in wording.

The RISB can be distinguished from most other SCTs on the basis of its objective scoring system. Each sentence stem is scored according to the degree of adjustment the response reflects using a 7-point Likert scale from 0 (most positive) to 6 (most conflict). Three types of responses are scored numerically: conflict, positive, and neutral responses. Conflict responses indicate an unhealthy or maladjusted state and would be indicated by pessimism, hostility, hopelessness, or suicidal thoughts. Positive responses are those indicating a healthy, well-adjusted frame of mind and would include indicators such as humor, optimism, acceptance, or positive feelings about self and others. Neutral responses are those responses that do not fall into either the conflict or positive categories. Stereotypes, catchphrases, song titles, or other cultural clichés are examples of neutral responses. Scoring a completed RISB form may take from 15 to 35 minutes, depending on experience. The RISB manual provides detailed scoring examples selected on the basis of their frequency in criterion protocols and their illustrative value. Although these examples are helpful, scoring is still largely dependent on clinical judgment.

Once all sentence stems have been scored, the overall adjustment score is calculated by summing scores for the 40 items. The recommended cut score for identifying maladjustment among college students is 145 for screening, selection, and research purposes. However, this score has varied between 120 and 160, depending on the purpose of the score (e.g., research vs. the identification of clinical populations). In addition, Rotter and colleagues (1992) point out that the cut score of 145 is not absolute, but rather a guide for which clinical judgment is ultimately necessary.
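
The arithmetic of the overall adjustment score can be sketched in a few lines of Python. This is only an illustration of the summation and cut-score comparison described above, not the manual's scoring procedure. The function names are hypothetical. The sketch assumes that all 40 stems have been scored and that a total at or above the cut score is the one flagged; the text does not specify either point.

```python
from typing import Sequence

N_ITEMS = 40                 # number of sentence stems, per the text
MIN_ITEM, MAX_ITEM = 0, 6    # 7-point item scale (0 = most positive, 6 = most conflict)
DEFAULT_CUT = 145            # cut score cited for college students


def overall_adjustment_score(item_scores: Sequence[int]) -> int:
    """Sum the 40 item scores after checking that each lies on the 0-6 scale."""
    if len(item_scores) != N_ITEMS:
        raise ValueError(f"expected {N_ITEMS} item scores, got {len(item_scores)}")
    for score in item_scores:
        if not MIN_ITEM <= score <= MAX_ITEM:
            raise ValueError(f"item score {score} is outside {MIN_ITEM}-{MAX_ITEM}")
    return sum(item_scores)


def flag_for_followup(total: int, cut: int = DEFAULT_CUT) -> bool:
    """Return True when the total reaches the cut score.

    Assumption: 'at or above' the cut is flagged; the source text does not
    say whether the boundary value itself counts.
    """
    return total >= cut


# A protocol scored 3 (neutral) on every stem sums to 120, below the default cut.
neutral_protocol = [3] * N_ITEMS
total = overall_adjustment_score(neutral_protocol)
flagged = flag_for_followup(total)
```

Because the cut score has varied between 120 and 160 across purposes, it is left as a parameter rather than fixed, and any flag it produces is a prompt for clinical judgment rather than a conclusion.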

Psychometric Characteristics

Psychometric characteristics are important when deciding whether to use an assessment instrument. Many of the sentence completion methods listed in the Appendix report reliability and validity information in the original manuscripts or manuals that accompany them. Several others, however, do not report pertinent psychometric characteristics, in most cases because of the method of data collection and analysis. To simplify the following section, only the psychometric properties of the Rotter Incomplete Sentences Blank will be discussed, rather than those of multiple tests.

Norms

The original version of the RISB-College Form was normed on representative samples of first-year college students at Ohio State University (Rotter & Rafferty, 1950). By 1972, it was clear that the 1950 norms were somewhat dated and could no longer be considered an accurate representation of college students (Cross & Davis, 1972; McCarthy & Rafferty, 1971; Snow, 1972). Data gathered in the 1980s continued to show that the 1950 norms were no longer relevant, and some authors recommended the development of new norms (Lah, 1989a; Lah & Rotter, 1981). In the manual for the most recent edition of the RISB, Rotter and colleagues (1992) presented new norms based on data collected from three studies conducted between 1977 and 1988. These samples do not appear to be representative of the college population in general, and the authors suggest that the development of local norms is likely to be more useful. In addition, given that these data were collected over 13 years ago, one must wonder whether they are still applicable to the modern college student. Moreover, although there are High School and Adult forms of the RISB, there are no separate norms for these groups included in the manual, and there is some evidence that the college norms are inappropriate for adolescents (Ames & Riggio, 1995).

Reliability

The RISB manual reports adequate internal consistency, stability, and interrater agreement (Rotter et al., 1992). Because the RISB is designed to sample broad content areas, assessing the internal consistency of the measure yields only conservative estimates of its reliability. However, the RISB still yields moderate reliability values for both split-half reliability estimates and estimates of Cronbach’s alpha. Split-half estimates for the different forms of the RISB range from .74 to .84 for males and .83 to .86 for females (Rotter, Rafferty, & Lotsoff, 1954; Rotter, Rafferty, & Schachtitz, 1949). Cronbach’s alpha was .69 for a sample of college men (Catanzaro, 1989). Thus, a moderate internal consistency is evident in spite of the RISB’s diverse content.
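
For readers who want to see how the two internal consistency indices mentioned above are computed, the following Python sketch may help. It is a generic illustration, not the procedure used in the cited studies. It assumes an odd-even split of the items and applies the Spearman-Brown step-up to the half-test correlation; whether the cited split-half values were obtained this way is not stated here. The function names and the tiny data set are hypothetical.

```python
from math import sqrt
from statistics import mean, variance
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two equally long lists of scores."""
    mx, my = mean(x), mean(y)
    cross = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ss_x = sum((a - mx) ** 2 for a in x)
    ss_y = sum((b - my) ** 2 for b in y)
    return cross / sqrt(ss_x * ss_y)


def split_half_reliability(protocols: List[Sequence[int]]) -> float:
    """Odd-even split-half estimate with the Spearman-Brown step-up.

    Each protocol is one respondent's list of item scores.
    """
    odd_half = [sum(p[0::2]) for p in protocols]
    even_half = [sum(p[1::2]) for p in protocols]
    r_halves = pearson(odd_half, even_half)
    return 2 * r_halves / (1 + r_halves)


def cronbach_alpha(protocols: List[Sequence[int]]) -> float:
    """Cronbach's alpha from item variances and total-score variance."""
    k = len(protocols[0])
    item_variances = [variance([p[i] for p in protocols]) for i in range(k)]
    total_variance = variance([sum(p) for p in protocols])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical, deliberately artificial data: three respondents, four items,
# every item perfectly consistent with the others.
toy_protocols = [[1, 1, 1, 1], [2, 2, 2, 2], [4, 4, 4, 4]]
toy_split_half = split_half_reliability(toy_protocols)
toy_alpha = cronbach_alpha(toy_protocols)
```

With perfectly consistent items both indices reach their ceiling of 1. An instrument that samples broad content areas will have less uniform items, which is why internal consistency gives a conservative picture of its reliability.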

Stability is especially important for the RISB because it was developed to assess change in adjustment over time as a result of treatment or intervention; however, there has been some question as to whether the RISB measures state or trait aspects of personality (Churchill & Crandall, 1955). One- to 2-week test-retest reliability coefficients average about .82 (Arnold & Walter, 1957; Richardson & Soucar, 1971), and little change in mean RISB scores was found over 8 weeks among members of a no-treatment control group (Shell, O’Mally, & Johnsgard, 1964). As expected, stability coefficients were smaller when the test-retest interval was extended, with 6-month intervals producing coefficients between .43 and .54 (Churchill & Crandall, 1955).

In terms of interscorer reliability, the original validity study of the RISB found coefficients of .91 for males and .96 for females (Rotter & Rafferty, 1950). Since that time, such estimates have been replicated in the literature, and coefficients of agreement have ranged from as high as .99 (Snow, 1972; Vernallis, Shipper, Butler, & Tomlinson, 1970) to a low of .72 (Feher, Vandecreek, & Teglasi, 1983). Over time, the consistency of the scorers’ ability to score protocols correctly was also impressive with interscorer reliability coefficients of .90, .93, and .95 for 3 sample years ranging from 4 to 15 years (Lah & Rotter, 1981).

Validity

Compared to other projective tests, sentence completion tests have been described as one of the most valid (Murstein, 1965), and among SCTs, the RISB has the most consistent evidence supporting its use in the diagnosis and assessment of adjustment (Goldberg, 1965). Initial studies conducted by Rotter and colleagues (1949) indicated that the RISB was able to correctly identify 78% of the adjusted respondents and 59% of the maladjusted respondents for women and 89% of the adjusted respondents and 52% of the maladjusted respondents for men. Correlations between the RISB scores and adjustment classification were .50 and .62 for women and men, respectively. More recent studies have been even more promising. For example, Lah (1989b) compared overall adjustment scores from a control sample with those of a clinical sample and found significant relationships between adjustment and group membership (.72 for males and .67 for females). In addition, a cut score of 145 was able to correctly identify 84% of the controls and 85% of those in the clinic sample. Unfortunately, no validity studies were found in the literature since Lah’s (1989a) study.
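
The percentages reported above are classification hit rates: the proportion of each group that falls on the expected side of a cut score. The Python sketch below shows that calculation. The function name and the example totals are hypothetical and are not data from any cited study. As elsewhere, treating a total at or above the cut as "maladjusted" is an assumption.

```python
from typing import Sequence, Tuple


def hit_rates(
    control_totals: Sequence[int],
    clinical_totals: Sequence[int],
    cut: int = 145,
) -> Tuple[float, float]:
    """Return (proportion of controls below cut, proportion of clinical cases at or above cut)."""
    controls_correct = sum(t < cut for t in control_totals) / len(control_totals)
    clinical_correct = sum(t >= cut for t in clinical_totals) / len(clinical_totals)
    return controls_correct, clinical_correct


# Made-up overall adjustment scores for illustration only.
example_controls = [120, 130, 150, 140]
example_clinical = [150, 160, 140, 170]
controls_ok, clinical_ok = hit_rates(example_controls, example_clinical)
```

Raising the cut score identifies more controls correctly and fewer clinical cases, and lowering it does the reverse. This trade-off is one reason a cut score may reasonably differ between research and clinical screening.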

Range of Applicability and Limitations of Sentence Completion Methods

According to a recent survey of members of the Society for Personality Assessment, SCTs are most often used as a part of an assessment battery (Holaday et al., 2000). Other common uses included attempts to explore a client’s personality structure, to provide “quotable quotes” from respondents, and as a part of a structured interview. Because SCTs are inexpensive, easy to score, and much less time consuming than many other personality instruments, they are versatile enough to be used in a wide range of settings. Although most SCTs are designed for written administration, many can be orally administered as well, permitting their use for illiterate individuals.

One advantage of the sentence completion methodology is that clinicians may choose to construct their own stems in order to assess a certain aspect of functioning with a client. Clinicians may also choose some stems from an already existing SCT and other stems from another one. There is also the possibility of using different scoring methods with various SCTs. The RISB, in particular, has a multitude of different scoring methods that have been developed over the years (Rotter et al., 1992). One may also use the stems of one SCT and the scoring criteria from another SCT. While such practices warrant careful consideration in terms of normative data and the appropriateness of certain approaches, this flexibility gives the SCT method the ability to be used with a variety of populations of all different ages.

Despite their impressive applicability, it would be a mistake to conclude that SCTs are appropriate for every client. One obvious limitation involves potential language barriers. This and other issues regarding the use of SCTs with diverse populations are discussed in the next section. A second limitation involves the use of SCTs with clients who have a more concrete cognitive style or a neurological disorder. In these cases, sentence completion tasks may not yield as much psychologically relevant material, other than identifying this style of responding. Other limitations may include their use with forensic populations. Some sentence stems have high face validity, and because there are no validity scales that guard against overreporting or underreporting, these are legitimate concerns with certain populations (Schretlen, 1997).

Cross-Cultural and Diversity Factors

As American society becomes increasingly diverse, greater attention has been focused on preparing psychologists to provide services to diverse groups (Sandoval, 1998). As a result, professional guidelines have been revised and expanded to address testing practices with individuals with different ethnic or cultural backgrounds, persons whose primary language is not English, those from different socioeconomic backgrounds, and individuals with disabilities (e.g., American Educational Research Association, American Psychological Association, & National Council on Measurement in Education, 1999; American Psychological Association, 2001). Rather than reviewing the general aspects of these guidelines that apply to all psychological tests, we will focus on those issues that are likely to be particularly problematic when using SCTs with diverse clients.

First, the issue of language is central to effective use of SCTs with diverse clients. Because SCTs rely heavily on written or spoken language, difficulties in translation and the manner in which certain words or phrases are interpreted by the client may complicate administration and interpretation. Ideally, sentence completion measures would be administered in the client’s native language. Of course, not every SCT has been translated into every language, and translation equivalence is notoriously difficult to accomplish (Sandoval & Durán, 1998). In addition, even among persons who share a common linguistic background, one may still encounter different understandings of meaning and uses of common metaphors (Dunnigan, McNall, & Mortimer, 1993). Thus, an understanding of various dialects and language variations is essential when working with these diverse individuals.

Second, the theoretical constructs on which various SCTs are based may not apply to members of diverse groups. For example, several studies have examined the validity of Loevinger’s (1966) ego development model with cross-cultural populations using Loevinger’s Sentence Completion Test. Some studies have found differences in the modal state of ego development and predominant themes for different groups (Lasker & Strodbeck, 1975; Ravinder, 1986), while other data support the cross-cultural applicability of the theory with minor revision (Snarey & Blasi, 1980). Similarly, Oshodi (1999) recently developed the Oshodi Sentence Completion Test (OSTC), an Africentric SCT that assesses motivation toward achievement. Because traditional methods of measuring achievement motivation come from Murray’s (1938) work with Caucasian samples, the OSTC was developed to reestablish the African personality perspective that revolves around a more nonlinear, spiritual, and holistic expression of oneself.

A third important concern that clinicians need to recognize is that the sentence completion format may be unfamiliar to persons from certain cultural groups. Cultural differences in the familiarity of SCT methods, cultural variability in response styles, and social desirability associated with personality assessment may affect responses to sentence stems (Van de Vijver, 2000). The open-ended, ambiguous nature of sentence stems may be particularly anxiety provoking for persons from certain cultures and may not be conducive to the response-generating process in which the clinician is interested.

Next, the possibilities of instrument and clinician bias are relevant in SCTs. Consider administering stems that have to do with the institution of marriage to a gay, lesbian, bisexual, or transgender (GLBT) individual. Remembering that the scoring procedure of the RISB uses positive and conflict scores to indicate one’s level of adjustment, this particular stem may measure unique aspects of adjustment in this population not intended to be measured by the instrument. Given the historical legal prejudice against same-sex marriages, it is likely that there may be aspects of adjustment toward marriage in the GLBT culture that cannot be measured or are not found in the heterosexual culture and vice versa. Obviously, clinician bias may be relevant here, too. Implicit views held by clinicians toward certain groups may influence both the selection of specific sentence completion measures and the interpretation of responses (Van de Vijver, 2000). Since the presence of a standardized scoring procedure would help to reduce clinician bias, clinicians should be encouraged to use tests such as the RISB that offer such a scoring system and should be informed of the importance of adhering to standardized administration and scoring procedures.

Despite noted limitations with diverse populations and the relatively few culture-specific SCTs currently available, SCTs have frequently been used as a way of qualitatively assessing cultural differences. Because of their open-ended framework, SCTs can be used in an exploratory manner, so that information can be gathered about cultural differences in regard to various theories within a research context that might provide guidance for clinicians working within various cultural frameworks. Such studies have provided information regarding the personality differences among Japanese, Americans, Italians, and Eskimos (Sofue, 1979), differences between Japanese and Americans on spontaneous causal attributions (Hayamizu, 1992), and differences between U.S. and Congo/Zaire elderly adults on the constructs of individualism and collectivism (Westerhof, Dittmann-Kohli, & Katzko, 2000).

Although most of the literature on SCTs utilizes etic approaches in which already existing SCTs that have been developed for one particular cultural population (usually White America) are applied to persons from other cultures, some research has adopted an emic approach, attempting to discover constructs that are important from within a particular culture (Lonner, 1985). For example, the Shanan Sentence Completion Test (SSCT; Shanan & Nissan, 1961) consists of 65 incomplete sentence stems designed to investigate four basic categories: (1) the ability to identify and express external goals, (2) the ability to detect and express external problems, (3) the readiness to actively cope with problems, and (4) self-esteem. The SSCT was developed for Israeli samples, was originally written in Hebrew, and is intended to measure aspects of personality and coping from an Israeli perspective.

Clinicians who use SCTs with diverse populations should carefully consider their choice of instrument in light of the composition of the normative sample. While not all sentence completion measures offer normative data, those that do should be considered first. Clearly, the use of SCTs with diverse groups would be enhanced by continued efforts to develop normative data, particularly data that include culturally diverse individuals (Potash, Crespo, Patel, & Ceravolo, 1990). Similarly, although not all SCTs offer standardized scoring criteria, those that do offer an advantage in minimizing clinician bias, provided such scoring criteria are sensitive to diversity issues and are minimally biased. Ideally, such scoring criteria would be developed with the particular cultural background of the individual in mind.

Accommodation for Populations With Disabilities

Additional sensitivity and training in the area of disabilities is also required when working with individuals with special needs. SCTs have been used in numerous studies as a way to assess various types of disabilities, particularly cognitive or learning disabilities. Such studies have used SCTs as a means of identifying underlying causes for the behavior of learning-disabled students (Katims & Zapata, 1988); identifying semantic, syntactic, and pragmatic inadequacies as they are related to reading comprehension (Vellutino & Shub, 1982); and as a means of identifying people with and without dyslexia (Rudel, Denckla, & Broman, 1981).

However, little research has been done regarding specific accommodations for people with disabilities as it relates to the administration of SCTs. Some of the accommodations for individuals with physical disabilities may entail reading sentence stems aloud and allowing the individual to respond verbally, administering the test in braille, or even using sign language or other visual cues as a means of administration. When working with persons with cognitive and learning disabilities, alternative administrations may also be considered, such as reading stems out loud or being available to answer questions about wording or spelling.

However, many concerns have been voiced about the appropriateness of projective techniques, including SCTs, for persons with cognitive disabilities. Concerns include (1) the possibility that some SCTs are too abstract and require too high a level of cognitive functioning, (2) the difficulty anticipating the manner in which a client’s impairment may impact his or her responses, (3) debate over how much variability exists in individuals with mental retardation, (4) the effects of medication on responses, and (5) the effects of possible long-term lack of environmental stimulation for those individuals who may have been institutionalized (Panek, 1997). In addition, the use of alternative administration procedures and administration to individuals other than those for whom the test was intended warrant extreme caution when scoring and interpreting these protocols. For example, the normative data available for the RISB are for written responses only, where the length of the space in which the subject can write his or her responses serves to control the response length. Such controls are not in place when the test is administered orally or in sign language. However, these issues are of less concern when the formal scoring procedure is not used and the goal of the SCT administered is to gather more interview information or screen for certain response sets.

Computer-Based Testing

The use of computers in psychological assessment is increasing, and it is likely that computers will become increasingly important for assessment in the future (Garb, 2000). Although a recent survey of clinical psychologists and neuropsychologists suggested that computer-based test administration is still relatively uncommon among clinicians, respondents reported using computer scoring with approximately 10% of the tests they used (Camara et al., 2000). Clinicians in this survey also reported using computer-scoring services most frequently with tests administered for the assessment of personality and/or psychopathology.

While computer scoring offers several advantages for objective personality inventories, such as the Minnesota Multiphasic Personality Inventory-2 (Graham, 2000), few such programs have been developed to assist clinicians with scoring projective measures. It seems unlikely that SCTs will benefit from computer scoring in the near future. Persons responding to sentence completion tests have so much freedom in formulating their responses to sentence stems that computer scoring is impractical (Megargee & Spielberger, 1992). In addition, because so few clinicians appear to follow standardized scoring procedures for these measures, there is little market for such computerized programs.

Despite the problems associated with computer-scoring services for SCTs, computer-based test administration of these measures is rather simple and may facilitate research with these measures. In fact, among projective techniques designed to assess personality or psychopathology, sentence completion measures may be the easiest to computerize due to the simplicity of the stimulus (i.e., brief sentence stems without pictorial stimuli) and the straightforward manner of responding by completing sentence stems (Rasulis, Schuldberg, & Murtagh, 1996). Rasulis and colleagues developed a computerized version of the RISB and found that the effects of administration format (i.e., computer or traditional) were minimal, even when attitudes toward computers were taken into account. Although these results need replication with other SCTs, their study suggests that computer-based administration of sentence completion measures is possible and that it generally yields results similar to traditional administration format.

Current Research Status

Sentence completion tests are widely used in research settings, primarily as a means to measure a variable of interest. Studies in just the past several years have used SCTs or sentence completion methods to study cognitive functioning in people with learning disabilities (Clark, Prior, & Kinsella, 2000), neurological deficits (Marangolo, Basso, & Rinaldi, 1999), and ways in which individuals process cognitive information in general (Whittlesea & Williams, 2000). They have also been used to learn more about various psychological disorders in a variety of settings (Evans, Brody, & Noam, 2001), people’s beliefs about body image and eating disorders (Kostanski & Gullone, 1999), and, as noted earlier, in multicultural and cross-cultural research (Liu, Wilson, McClure, & Higgins, 1999).

While the RISB has probably generated more research than any other SCT (Lah, 1989b), similar to the studies cited above, research on the RISB has tended to focus on other variables, using the RISB primarily as a brief index of adjustment for screening or comparing groups (Lah, 1989a). Thus, few studies have been conducted since the early validation research that have directly addressed the utility of the RISB.

Research using SCTs is quite diverse and prevalent. For example, SCTs are well suited for exploring differences between groups both quantitatively and qualitatively. While research using projectives can fall into a variety of categories such as the use of the technique in the evaluation of personality change or the testing of personality theories (Singer, 1968), the number of SCTs used in this capacity exceeds this chapter’s ability to cover all of these in depth. Because of the breadth of this information, the focus of this section will primarily be on the research that has been conducted on SCTs themselves in terms of presentation, administration, and scoring.

Turnbow and Dana (1981) explored the effects of stem length on SCTs using stems from the Forer Sentence Completion Test, the Miale-Holsopple Sentence Completion Test, the RISB-College Form, and the Sacks Sentence Completion Test. They found that structured stems (i.e., those that require the respondent to respond to specific areas that the items were designed to tap) elicited more feeling responses from respondents and more hypotheses from those scoring the protocols than unstructured stems (i.e., those that are more ambiguous and do not pull for certain responses). Interestingly, these results were present regardless of whether respondents were instructed to focus on their feelings during test administration or on the speed at which they completed the responses.

Similar findings occurred in a study investigating methods of presenting incomplete sentence stimuli with the RISB (Wood, 1969). No statistically significant differences on response type were found between protocols where some respondents were told the test was measuring cognitive speed and others were told the test was a personality measure. In addition, the RISB was altered so as to reflect stems that had pronouns, proper names, or neither as stimuli. These differences in test content were largely responsible for statistically significant score differences where proper names elicited more maladjusted scores and forms using neither pronouns nor proper names produced the least maladjusted scores.

Again, other research is prevalent using SCTs in evaluating treatment approaches, in exploring new information on diverse populations, and in evaluating the specific psychometric aspects of specific tests. Applicable to some of the earlier discussions in this chapter, Flynn (1974) investigated differences between oral and written administration of the RISB with hospitalized psychiatric patients. He found no statistically significant differences between the different administrations for this population. Similar research continues to be needed in order to assess the appropriateness of SCTs with various populations using various administration methods and scoring.

Clinical Applications

As previously noted, clinicians who use sentence completion measures do so most often as part of an assessment battery (Holaday et al., 2000). In this context, SCTs are typically included as a projective measure of personality or psychopathology. They provide the clinician with information about the client’s overall adjustment or maladjustment and may offer insight into aspects of his or her personality structure. Given the frequency with which one finds indicators of global adjustment or maladjustment on most widely used personality/diagnostic tests (e.g., the MMPI-2, Symptom Check List-90-Revised [SCL-90R], etc.), it seems that the real value of sentence completion tests involves their ability to reveal qualitative aspects of the respondent’s personality. For example, a client with a Spike 7 profile on the MMPI-2 might be described as intensely anxious and as experiencing obsessive thoughts (Graham, 2000); however, the addition of a sentence completion test might provide additional clues to the nature of the obsessive thoughts and the degree to which they may dominate the client’s thinking. This qualitative use of SCTs is certainly consistent with reports by clinicians who use these measures that they do not routinely utilize formal scoring procedures (Holaday et al., 2000). Although its authors consider the objective scoring system of the RISB to be a primary advantage (Rotter et al., 1992), clinicians seem to favor a clinical use of this and similar instruments, preferring to interpret them as verbal projective methods. This qualitative use of SCTs would be enhanced through oral administration so that the client would have the opportunity to elaborate on unusual responses or provide nonverbal cues that might be useful in understanding the relevance of his or her responses. Of course, clinicians using SCTs in this manner are also advised to develop their own local norms so that they have some way to recognize deviant responses.

The utility of SCTs in clinical settings is not limited to diagnostic or evaluative applications. Sentence completion tests are also used during the course of therapy to evaluate progress and treatment response (Albert, 1970) and as a therapeutic intervention in their own right. For example, clients in group, family, or couples therapy may be asked to respond to sentence stems in order to stimulate the therapeutic process (Gumina, 1980). In addition, sentence stems may be adapted to fit the theoretical orientation of the therapist. For example, a cognitive therapist might use sentence completion tests during therapy as a way of identifying a client’s cognitive distortions. The reader is referred to other sections of this chapter for additional guidance regarding the clinical applications of SCTs.

Future Directions

Recognition that SCTs have utility outside their traditional diagnostic role has allowed for many creative alternative uses. Looking toward the future, the traditional (i.e., typically psychodynamic) projective uses of SCTs may integrate nicely with some of the postmodern, constructivist approaches that have been emerging in cognitive psychology. The basic tenet of the constructivist perspective asserts that each individual’s reality is largely constructed by language and the manner in which that language is interpreted within one’s personal and cultural context (Lyddon & Weill, 1997; Terrell & Lyddon, 1996). Within this perspective, culture, gender, and individual diversity are highly regarded, assumptions of “normal” are reevaluated, and the notion of language plays a primary role.

Recent studies are beginning to recognize the utility of SCTs as a way of evaluating the meaning-making process of individuals in a certain context. For example, Pryzgoda and Chrisler (2000) used a sentence completion method in order to evaluate how different people understood the words gender and sex. Respondents gave examples ranging from the belief that gender and sex were the same thing to the concept that gender was associated with females and discrimination. Such qualitative investigations of meaning making within specific contexts can be extremely valuable.

While the discussion of constructivist psychology may seem antithetical to discussions about projective personality assessment, such views may have similarities when considering SCTs. The primary goal of the SCT is to learn more about the client through written or expressed answers to sentence stems. As such, some of the primary limitations discussed about SCTs are focused on the limitations of language from the perspective of the client (through responses) and the clinician (through interpretations). Constructivist approaches allow for a more open approach to interpretation by understanding that the client’s responses are a reflection of the client’s culture and contextual experience, rather than merely a means to collect specific data regarding maladjustment, for example. A combination of projective and constructivist approaches may assist professionals in forming diagnostic impressions and treatment approaches that are affirming and sensitive to diverse populations while still preserving the purpose and intent of SCTs and similar methods of projective personality assessment.

APPENDIX: LIST OF SENTENCE COMPLETION METHODS FOUND IN THE LITERATURE

Name of Instrument	Intended Purpose or Theory	Original or Relevant Citations
Aronoff Sentence Completion	Integration of sociology and Maslow’s theory of personality	(Aronoff, 1967)
Bloom Sentence Completion Survey	Designed to reveal global attitudes about important variables in everyday life situations	(Bloom, 1980)
Chillicothe Sentence Completion Test	65 stems, 40 of which came from Rotter’s ISB; used at the Chillicothe VA hospital to measure ward adjustment of patients	(Cromwell & Lundy, 1954)
Defense Mechanism Profile	Self-administered projective sentence completions	(Johnson & Gold, 1995)
Forer Sentence Completion Test	Focus on attitudes and value systems based on Murray’s theory of needs, press, and inner states	(Forer, 1960, 1963)
Hart Sentence Completion Test	Child personality information about family, school, peers, and self	(Hart, Kehle, & Davies, 1983)
Hartman & Hasher Sentence Completion Task	High-cloze sentence frames that look at retention of disconfirmed and target endings; indirect test of memory	(Hartman & Hasher, 1991)
Hayling Sentence Completion Test	Neuropsychological measure of executive functioning; stem responses are strongly cued by the structure of the stem	(Burgess & Shallice, 1997)
“I am” Sentence Completion Method	Measure of self-attitudes	(Kuhn & McPartland, 1954)
Incomplete Sentences Task	Used to identify emotional problems that might interfere with learning	(Lanyon & Lanyon, 1979)
Inselberg’s Sentence Completion Blank	Sentence completion technique for the measurement of marital satisfaction	(Inselberg, 1964)
Loevinger’s Sentence Completion Test (Washington University Sentence Completion Test)	Measures level of ego development based on Loevinger’s theory of personality	(Loevinger, 1998; Loevinger & Wessler, 1970; Loevinger, Wessler, & Redmore, 1970)
London Sentence Completion Test	Explores interpersonal relationships in adolescence	(Coleman, 1970)
Luther Hospital Sentence Completion	Screening measure for evaluating attitudes and emotional reactions essential for the field of nursing	(Thurston, Brunclik, & Feldhusen, 1968)

Mainord Sentence Completion Test	Explores the ego-analytic theory of coping style	(Mainord, 1956; also see Andrew, 1973)
Mayers’s Gravely Disabled Sentence Completion Task	Used to identify individuals with severely impaired mental status	(Mayers, 1991)
McKinney Sentence Completion Blank	Measure of emerging self-concept	(McKinney, 1967)
Miale-Holsopple Sentence Completion Test	Designed to permit the expression of thoughts and feelings in a nonthreatening manner	(Holsopple & Miale, 1954)
Michigan Sentence Completion Test	Measures four structured personality areas: opposite sex, guilt feelings, aggression, positive and negative interpersonal relations, and an unstructured stem set. Used in VA research for the selection of clinical psychologists	(Unpublished. See Kelly & Fiske, 1951)
Miner Sentence Completion Scale	Measure of managerial motivation	(Miner, 1964, 1968, 1978)
Mosher Incomplete Sentences Test	Measure of guilt	(Mosher, 1962)
Mukherjee Sentence Completion Test	50 forced-choice triads reflecting achievement orientation	(Mukherjee, 1965)
Oshodi Sentence Completion Test	Measure of achievement motives; items reflect a variety of African-centered theories of motivation	(Oshodi, 1999)
Peck Sentence Completion Test	Measure of mental health based on psychodynamic theory	(Peck, 1959; Peck & McGuire, 1959)
Personnel Reaction Blank	Designed to measure integrity for the purpose of selecting employees to fill nonmanagerial positions; based on a theory of antisocial personality	(Gough, 1971)
Quantified Self-Concept Inventory	Measure of self-concept	(Wattenberg & Clifford, 1962)
Rotter Incomplete Sentences Blank	Screening method for adjustment	(Rotter, 1951; Rotter et al., 1992; Rotter & Willerman, 1947)
SELE Sentence Completion Questionnaire	Assessment of personal meanings	(Dittmann-Kohli & Westerhof, 1997)
Self Focus Sentence Completion Scale	30 self-reference stems that provide an index of egocentricity or self-focused attention	(Exner, 1973)
The Sentence Completion Method	Based on Murray’s need theory, used to explore reactions and needs that lie deeper than those generally acknowledged	(Rohde, 1946, 1957)
Sentence Completion Method	Measure of coping responses	(Wayment & Zetlin, 1989)
Sentence Completion Series	Designed to identify psychological themes underlying current patient concerns and areas of distress	(Brown & Unger, 1998)
The Sentence Completion Test	Explores specific clusters of attitude or significant areas of an individual’s life	(Sacks & Levy, 1950)
Sentence Completion Test compiled by Y. Kataguchi (1957)	Cross-cultural application of a sentence completion test	(Sofue, 1979)

Sentence Completion Test for Depression	Measure of depressive symptoms	(See Barton & Morley, 1999)
Sentence Completion Test for Group Orientation	Elicits feelings about group and interpersonal relations on task orientation, interaction orientation, self in group orientation, and self-encapsulation	(Rothaus, Johnson, Hanson, Brown, & Lyle, 1967)
Sentence Completion Test for Psychiatric Diagnosis	Measures emotional and psychiatric symptoms	(Thelen et al., 1954)
Sentence Completion Test for the Office of Strategic Services Assessment Program (VA hospital)	Used by the VA to assess candidates’ personalities; based on psychodynamic theory	(Murray & MacKinnon, 1946)
Sentence Completion Test of Moral Attitudes	Measures the structure of moral thinking	(Musgrave, 1984)
Sentence Completion Test of Schroder & Streufert	Measure of cognitive complexity; each protocol is scored according to the level of cognitive structuring it reflects	(Schroder & Streufert, 1962; see Reilly & Sugerman, 1967, for application)
Sentence Contexts	Used to identify Alzheimer’s disease patients who have difficulty remembering words that follow obvious cues	(Hamberger, Friedman, & Rosen, 1996)
Shanan Sentence Completion Test	Relatively objective scoring based on four categories: external goals, detection of external problems, ability to actively cope, and self-esteem	(Mar’i & Levi, 1979; Shanan & Nissan, 1961)
Special Incomplete Sentence Test for Underachievers	Test to identify underachievers	(Riedel, Grossman, & Burger, 1971)
Stein Sentence Completion Test	Measure of attitude and maladjustment	(Stein, 1949)
Stotsky-Weinberg Sentence Completion Test	Focus on the work rehabilitation of chronic psychiatric patients; SCT of work attitudes	(Stotsky & Weinberg, 1956)
Taffel’s Sentence Completion Technique	Stimulus cards with a verb and six pronouns are presented for client to create sentences	(Taffel, 1955)
Tendler Sentence Completion Test	Used to help psychologists gain emotional insight into client problems	(Tendler, 1930)
Test of Egocentric Associations	Assesses an individual’s tendency to concentrate on self	(Szustrowa, 1976)

REFERENCES

Albert, G. (1970). Sentence completions as a measure of progress in therapy. Journal of Contemporary Psychotherapy, 3, 31–34.
American Educational Research Association, American Psychological Association, & National Council on Measurement in Education. (1999). Standards for educational and psychological testing. Washington DC: American Educational Research Association.
American Psychological Association. (2001). APA guidelines for providers of psychological services to ethnic, linguistic, and culturally diverse populations. Retrieved March 2001 from http://www.apa.org/pi/oema/guide.html
Ames, P.C., & Riggio, R.E. (1995). Use of the Rotter Incomplete Sentences Blank with adolescent populations: Implications for determining maladjustment. Journal of Personality Assessment, 64, 159–167.
Anastasi, A. & Urbina, S. (1997). Psychological testing (7th ed.). Upper Saddle River, NJ: Prentice Hall.
Andrew, J. (1973). Coping style and declining verbal abilities. Journal of Gerontology, 28, 179–183.
Archer, R.P., Maruish, M., Imhof, E.A., & Piotrowski, C. (1991). Psychological test usage with adolescent clients: 1990 survey findings. Professional Psychology: Research and Practice, 22, 247–252.
Arnold, F.C., & Walter, V.A. (1957). The relationship between a self- and other-reference sentence completion test. Journal of Counseling Psychology, 4, 65–70.
Aronoff, J. (1967). Psychological needs and cultural systems. Princeton, NJ: Van Nostrand.
Barton, S.B., & Morley, S. (1999). Specificity of reference patterns in depressive thinking: Agency and object roles in self-representation. Journal of Abnormal Psychology, 108, 655–661.
Bloom, M.W. (1980). Bloom sentence completion surveys: Adult: Instructional manual (revised). Chicago: Stoelting.
Brown, L.H., & Unger, M.A. (1998). PAR comprehensive catalog. Odessa, FL: Psychological Assessment Resources.
Burgess, P.W., & Shallice, T. (1997). Hayling Sentence Completion Test. Suffolk, England: Thames Valley Test Co. Ltd.
Camara, W.J., Nathan, J.S., & Puente, A.E. (2000). Psychological tests usage: Implications in professional psychology. Professional Psychology: Research and Practice, 31, 141–154.
Catanzaro, S.J. (1989). Effects of enhancement expectancies on expectancy and minimal goal statements. Journal of Psychology, 123, 91–100.
Churchill, R., & Crandall, V.J. (1955). The reliability and validity of the Rotter Incomplete Sentences Test. Journal of Consulting Psychology, 19, 345–350.
Clark, C., Prior, M., & Kinsella, G.J. (2000). Do executive function deficits differentiate between adolescents with ADHD and oppositional defiant/conduct disorder? A neuropsychological study using the Six Elements Test and Hayling Sentence Completion Test. Journal of Abnormal Child Psychology, 28, 403–414.
Coleman, J.C. (1970). The study of adolescent development using a sentence completion method. British Journal of Educational Psychology, 40, 27–34.
Cromwell, R.L., & Lundy, R.M. (1954). Productivity of clinical hypotheses on a sentence completion test. Journal of Consulting Psychology, 18, 421–424.
Cross, H.J., & Davis, G.L. (1972). College student adjustment and frequency of marijuana use. Journal of Counseling Psychology, 19, 65–67.
Dittmann-Kohli, F., & Westerhof, G.J. (1997). The SELE-Sentence Completion Questionnaire: A new instrument for the assessment of personal meanings in aging research. Anuario de Psicologia, 73, 7–18.
Dunningan, T., McNall, M., & Mortimer, J.T. (1993). The problem of metaphorical nonequivalence in cross-cultural survey research: Comparing the mental health statuses of Hmong refugee and general population adolescents. Journal of Cross-Cultural Psychology, 24, 344–365.
Evans, D.W., Brody, L., & Noam, G.G. (2001). Ego development, self-perception, and self-complexity in adolescence: A study of female psychiatric inpatients. American Journal of Orthopsychiatry, 7, 79–86.
Exner, J.E. (1973). The Self Focus Sentence Completion: A study of egocentricity. Journal of Personality Assessment, 37, 437–455.
Feher, E., Vandecreek, L., & Teglasi, H. (1983). The problem of art quality in the use of human figure drawing tests. Journal of Clinical Psychology, 39, 268–275.
Flynn, W. (1974). Oral vs. written administration of the Incomplete Sentences Blank. Newsletter for Research in Mental Health and Behavioral Sciences, 16, 19–20.
Forer, B. (1960). Word association and sentence completion methods. In A.I. Rabin & M.R. Haworth (Eds.), Projective techniques with children (pp. 210–224). New York: Grune & Stratton.
Forer, B. (1993). The Forer Structured Sentence Completion Test. Los Angeles: Western Psychological Services.
Garb, H.N. (2000). Computers will become increasingly important for psychological assessment: Not that there’s anything wrong with that! Psychological Assessment, 12, 31–39.
Goh, D.S., & Fuller, G.B. (1983). Current practices in the assessment of personality and behavior by school psychologists. School Psychology Review, 12, 240–243.
Goldberg, P.A. (1965). A review of sentence completion methods in personality assessment. Journal of Projective Techniques and Personality Assessment, 29, 12–45.
Gough, H.G. (1971). Preliminary manual for the “Personnel Reaction Blank.” Palo Alto, CA: Consulting Psychologists Press.
Graham, J.R. (2000). MMPI-2: Assessing personality and psychopathology (3rd ed.). New York: Oxford University Press.
Groth-Marnat, G. (1999). Handbook of psychological assessment (3rd ed.). New York: Wiley.
Gumina, J.M. (1980). Sentence-completion as an aid to sex therapy. Journal of Marital and Family Therapy, 6, 201–206.
Hamberger, M.J., Friedman, D., & Rosen, J. (1996). Completion norms collected from younger and older adults for 198 sentence contexts. Behavior Research Methods, Instruments, and Computers, 28, 102–108.
Hart, D.H., Kehle, T.J., & Davies, M.V. (1983). Effectiveness of sentence completion techniques: A review of the Hart Sentence Completion Test for Children. School Psychology Review, 12, 428–434.
Hartman, M., & Hasher, L. (1991). Aging and suppression: Memory for relevant information. Psychology and Aging, 6, 587–592.
Hayamizu, T. (1992). Spontaneous causal attributions: A cross-cultural study using the Sentence Completion Test. Psychological Reports, 71, 715–720.
Holaday, M., Smith, D.A., & Sherry, A. (2000). Sentence completion tests: A review of the literature and results of a survey of members of the Society for Personality Assessment. Journal of Personality Assessment, 74, 371–383.
Holsopple, J.Q., & Miale, F.R. (1954). Sentence completion: A projective method for the study of personality. Springfield, IL: Charles C. Thomas.
Inselberg, R.M. (1964). The sentence completion technique in the measurement of marital satisfaction. Journal of Marriage and the Family, 26, 339–341.
Johnson, N.L., & Gold, S.N. (1995). The Defense Mechanism Profile: A sentence completion test. In H.R. Conte, & R. Plutchik (Eds.), Ego defenses: Theory and measurement. Publication series of the Department of Psychiatry of Albert Einstein College of Medicine of Yeshiva University, No. 10 (pp. 247–262). New York: Wiley.
Jung, C. (1916). The association method. American Journal of Psychology, 21, 219–269.
Kataguchi, Y. (1957). The development of the Rorschach test in Japan. Journal of Projective Techniques, 21, 258–260.
Katims, D.S., & Zapata, J.T. (1988). Understanding student behavior. Academic Therapy, 24, 21–26.
Kelly, E.L., & Fiske, D.W. (1951). The prediction of performance in clinical psychology. Ann Arbor: University of Michigan.
Kennedy, M.L., Faust, D., Willis, W.G., & Piotrowski, C. (1994). Social emotional assessment practices in school psychology. Journal of Psychoeducational Assessment, 12, 228–240.
Kostanski, M., & Gullone, E. (1999). Dieting and body image in the child’s world: Conceptualization and behavior. Journal of Genetic Psychology, 160, 488–499.
Kuhn, M.H., & McPartland, T.S. (1954). An empirical investigation of self attitudes. American Sociological Review, 19, 68–76.
Lah, M.I. (1989a). New validity, normative, and scoring data for the Rotter Incomplete Sentences Blank. Journal of Personality Assessment, 53, 607–620.
Lah, M.I. (1989b). Sentence completion tests. In C.S. Newmark (Ed.), Major psychological assessment instruments: Vol. II (pp. 133–163). Boston: Allyn & Bacon.
Lah, M.I. (2001). Sentence Completion Tests. In W.I. Dorfman & M. Hersen (Eds.), Understanding psychological assessment. New York: Kluwer/Plenum.
Lah, M.I., & Rotter, J.B. (1981). Changing college student norms on the Rotter Incomplete Sentences Blank. Journal of Consulting and Clinical Psychology, 49, 985.
Lanyon, B.P., & Lanyon, R.I. (1979). Incomplete Sentence Test instruction manual. Chicago: Stoelting.
Lasker, H.M., & Strodbeck, F.L. (1975). Stratification and ego development in Curacao. In A.F. Marks & R.A. Romer (Eds.), Family and kinship in Middle America and the Caribbean. Proceedings of the 14th Seminar of the Committee of Family Research. Leiden: Royal Institute of Caribbean Studies.
Liu, J.H., Wilson, M.S., McClure, J., & Higgins, T.R. (1999). Social identity and the perception of history: Cultural representations of Aotearoa/New Zealand. European Journal of Social Psychology, 9, 1021–1047.
Loevinger, J. (1966). The meaning and measurement of ego development. American Psychologist, 21, 195–206.
Loevinger, J. (Ed.). (1998). Technical foundations for measuring ego development: The Washington University Sentence Completion Test. Mahwah, NJ: Erlbaum.
Loevinger, J., & Wessler, R. (1970). Measuring ego development: Volume 1. San Diego, CA: Jossey-Bass.
Loevinger, J., Wessler, R., & Redmore, C. (1970). Measuring ego development: Volume 2. San Diego, CA: Jossey-Bass.
Lonner, W.J. (1985). Issues in testing and assessment in cross-cultural counseling. Counseling Psychologist, 13, 599–614.
Lyddon, W.J., & Weill, R. (1997). Cognitive psychotherapy and postmodernism: Emerging themes and challenges. Journal of Cognitive Psychotherapy, 11, 75–90.
Mainord, W.A. (1956). Experimental repression related to coping and avoidance behavior in the recall and re-learning of nonsense syllables. Unpublished doctoral dissertation, University of Washington, Seattle.
Marangolo, P., Basso, A., & Rinaldi, M.C. (1999). Preserved confrontation naming and impaired sentence completion: A case study. Neurocase, 5, 213–221.
Mar’i, S.K., & Levi, A.M. (1979). Modernization or minority status: The coping style of Israel’s Arabs. Journal of Cross-Cultural Psychology, 10, 375–389.
Mayers, K.S. (1991). A sentence completion task for use in the assessment of psychotic patients. American Journal of Forensic Psychology, 9, 19–30.
McCarthy, B.W., & Rafferty, J.E. (1971). Effect of social desirability and self-concept on the measurement of adjustment. Journal of Personality Assessment, 35, 576–583.
McKinney, F. (1967). The sentence completion blank in assessing student self-actualization. Personnel and Guidance Journal, 45, 709–713.
Megargee, E.I., & Spielberger, C.D. (1992). Reflections on fifty years of personality assessment and future directions in the field. In E.I. Megargee & C.D. Spielberger (Eds.), Personality assessment in America: A retrospective on the occasion of the fiftieth anniversary of the Society for Personality Assessment (pp. 170–190). Hillsdale, NJ: Erlbaum.
Miner, J.B. (1964). Scoring guide for the Miner Sentence Completion Scale. New York: Springer.
Miner, J.B. (1968). The early identification of managerial talent. Personnel and Guidance Journal, 46, 586–591.
Miner, J.B. (1978). The Miner Sentence Completion Scale: A reappraisal. Academy of Management Journal, 21, 283–294.
Mosher, D.L. (1962). The development and validation of a sentence completion measure of guilt. Dissertation Abstracts, 22, 2468–2469.
Mukherjee, B.H. (1965). A forced choice test of achievement motivation. Journal of the Indian Academy of Applied Psychology, 2, 85–92.
Murray, H.A. (1938). Explorations in personality. New York: Oxford University Press.
Murray, H.A., & MacKinnon, D.W. (1946). Assessment of OSS personnel. Journal of Consulting Psychology, 10, 76–80.
Murstein, B.I. (Ed.). (1965). Handbook of projective techniques. New York: Basic Books.
Musgrave, P.W. (1984). Adolescent moral attitudes: Continuities in research. Journal of Moral Education, 13, 133–136.
Oshodi, J.E. (1999). The construction of an Africentric sentence completion test to assess the need for achievement. Journal of Black Studies, 30, 216–231.
Panek, P.E. (1997). The use of projective techniques with persons with mental retardation: A guide for assessment instrument selection. Springfield, IL: Charles C. Thomas.
Peck, R.F. (1959). Measuring the mental health of normal adults. Genetic Psychology Monographs, 60, 197–255.
Peck, R.F., & McGuire, C. (1959). Measuring changes in mental health with the sentence completion technique. Psychological Reports, 5, 151–160.
Piotrowski, C. (1985). Clinical assessment: Attitudes of the Society for Personality Assessment membership. Southern Psychologist, 2, 80–83.
Potash, H.M., Crespo, A., Patel, S., & Ceravolo, A. (1990). Cross-cultural attitude assessment with the Miale-Holsopple Sentence Completion Test. Journal of Personality Assessment, 55, 657–662.
Pryzgoda, J., & Chrisler, J.C. (2000). Definitions of gender and sex: The subtleties of meaning. Sex Roles, 43, 553–569.
Rasulis, R., Jr., Schuldberg, D., & Murtagh, M. (1996). Computer-administered testing with the Rotter Incomplete Sentences Blank. Computers in Human Behavior, 12, 497–513.
Ravinder, S. (1986). Loevinger’s Sentence Completion Test of Ego Development: A useful tool for cross-cultural researchers. International Journal of Psychology, 21, 679–684.
Reilly, D.H., & Sugerman, A.A. (1967). Conceptual complexity and psychological differentiation in alcoholics. Journal of Nervous and Mental Disease, 144, 14–17.
Richardson, L., & Soucar, E. (1971). Comparison of cognitive complexity with achievement and adjustment: A convergent-discriminant study. Psychological Reports, 29, 1087–1090.
Riedel, R.G., Grossman, J.H., & Burger, G. (1971). Special Incomplete Sentence Test for Underachievers: Further research. Psychological Reports, 29, 251–257.
Rohde, A.R. (1946). Exploration in psychology by the Sentence Completion Method. Journal of Applied Psychology, 30, 169–181.
Rohde, A.R. (1957). The Sentence Completion Method. New York: Ronald.
Rohde, B.R. (1960). Word association and sentence completion methods. In A.I. Rabin & M.R. Haworth (Eds.), Projective techniques with children (pp. 210–224). New York: Grune & Stratton.
Rothaus, P., Johnson, D.L., Hanson, P.G., Brown, J.B., & Lyle, F.A. (1967). Sentence-completion test prediction of autonomous and therapist-led group behavior. Journal of Counseling Psychology, 14, 28–34.
Rotter, J.B. (1951). Word association and sentence completion methods. In H.H. Anderson & G.L. Anderson (Eds.), An introduction to projection techniques (pp. 279–310). New York: Prentice Hall.
Rotter, J.B., Lah, M.I., & Rafferty, J.E. (1992). Rotter Incomplete Sentences Blank. San Antonio, TX: Harcourt Brace.
Rotter, J.B., & Rafferty, J.E. (1950). Manual: The Rotter Incomplete Sentences Blank: College Form. New York: Psychological Corporation.
Rotter, J.B., Rafferty, J.E., & Lotsoff, A.B. (1954). The validity of the Rotter Incomplete Sentences Blank: High School Form. Journal of Consulting Psychology, 18, 105–111.
Rotter, J.B., Rafferty, J.E., & Schachtitz, E. (1949). Validation of the Rotter Incomplete Sentence Blank for college screening. Journal of Consulting Psychology, 13, 348–355.
Rotter, J.B., & Willerman, B. (1947). The Incomplete Sentences Test as a method of studying personality. Journal of Consulting Psychology, 11, 43–48.
Rudel, R.G., Denckla, M.B., & Broman, M. (1981). The effect of varying stimulus context on word-finding ability: Dyslexia further differentiated from other learning disabilities. Brain and Language, 13, 130–144.
Sacks, J.M., & Levy, S. (1950). The Sentence Completion Test. In L.E. Abt & L. Bellak (Eds.), Projective psychology (pp. 357–402). New York: Knopf.
Sandoval, J. (1998). Testing in a changing world: An introduction. In J. Sandoval, C.L. Frisby, K.F. Geisinger, J.D. Scheuneman, & J.R. Grenier (Eds.), Test interpretation and diversity: Achieving equity in assessment (pp. 3–16). Washington, DC: American Psychological Association.
Sandoval, J., & Durán, R.P. (1998). Language. In J. Sandoval, C.L. Frisby, K.F. Geisinger, J.D. Scheuneman, & J.R. Grenier (Eds.), Test interpretation and diversity: Achieving equity in assessment (pp. 181–211). Washington, DC: American Psychological Association.
Schretlen, D.J. (1997). Dissimulation on the Rorschach and other projective measures. In R. Rogers (Ed.), Clinical assessment of malingering and deception (2nd ed., pp. 208–222). New York: Guilford Press.
Schroder, H.M., & Streufert, S. (1962). The measurement of four systems varying in level of abstractions (Sentence Completion Method) (Tech. Rep. No. 11 on project NR-171–055). Princeton, NJ: Princeton University.
Shanan, J., & Nissan, S. (1961). Sentence completion as a tool of assessing and studying personality. Megamot, 1, 232–252 (in Hebrew).
Shell, S.A., O’Mally, J.M., & Johnsgard, K.W. (1964). The semantic differential and inferred identification. Psychological Reports, 14, 547–558.
Singer, J.L. (1968). Research applications of projective methods. In A.I. Rabin (Ed.), Projective techniques in personality assessment (pp. 581–610). New York: Springer.
Snarey, J.R., & Blasi, J.R. (1980). Ego development among adult Kibbutzniks: A cross-cultural application of Loevinger’s theory. Genetic Psychology Monographs, 102, 117–155.
Snow, S.T. (1972). Factor analysis of Rotter’s Incomplete Sentences Blank. Dissertation Abstracts International, 32(12-B), 7325.
Sofue, T. (1979). Aspects of the personality of Japanese, Americans, Italians and Eskimos: Comparisons using the Sentence Completion Test. Journal of Psychological Anthropology, 2, 11–52.
Stein, M.I. (1949). The record and a sentence completion test. Journal of Consulting Psychology, 13, 448–449.
Stotsky, B.A., & Weinberg, H. (1956). The prediction of the psychiatric patient’s work adjustment. Journal of Counseling Psychology, 3, 3–7.
Szustrowa, T. (1976). Test of Egocentric Associations (TES). Polish Psychological Bulletin, 7, 263–267.
Taffel, C. (1955). Anxiety and conditioning of verbal behavior. Journal of Abnormal and Social Psychology, 51, 496–501.
Tendler, A. (1930). A preliminary report on a test for emotional insight. Journal of Applied Psychology, 14, 122–136.
Terrell, J.C., & Lyddon, W.J. (1996). Narrative and psychotherapy. Journal of Constructivist Psychology, 9, 27–44.
Thelen, H.A., Stock, D., Ren-Zeev, S., Gradolph, I., Gradolph, R., & Hill, W.F. (1954). Methods for studying work and emotionality in groups. Chicago: University of Chicago, Human Dynamics Laboratory.
Thurston, J.R., Brunclik, H.L., & Feldhusen, J.F. (1968). The relationship of personality to achievement in nursing education, phase II. Nursing Research, 17, 265–268.
Turnbow, K., & Dana, R.H. (1981). The effects of stem length and directions on sentence completion test responses. Journal of Personality Assessment, 45, 27–32.
Van de Vijver, F. (2000). The nature of bias. In R.H. Dana (Ed.), Handbook of cross-cultural and multicultural personality assessment. Personality and clinical psychology series (pp. 87–106). Mahwah, NJ: Erlbaum.
Vellutino, F.R., & Shub, M.J. (1982). Assessment of disorders in formal school language: Disorders in reading. Topics in Language Disorders, 2, 20–33.
Vernallis, F.F., Shipper, J.C., Butler, D.C., & Tomlinson, T.M. (1970). Saturation group psychotherapy in a weekend clinic: An outcome study. Psychotherapy: Theory, Research, and Practice, 7, 144–152.
Wattenberg, W., & Clifford, C. (1962). Relationships of the self-concept to beginning achievements in reading. Detroit, MI: Wayne State University.
Wayment, H.A., & Zetlin, A.G. (1989). Coping responses of adolescents with and without mild learning handicaps. Mental Retardation, 27, 311–316.
Westerhof, G.J., Dittmann-Kohli, F., & Katzko, M.W. (2000). Individualism and collectivism in the personal meaning system of elderly adults: The United States and Congo/Zaire as an example. Journal of Cross-Cultural Psychology, 31, 649–676.
Whittlesea, B.W.A., & Williams, L.D. (2000). The discrepancy-attribution hypothesis: II. Expectation, uncertainty, surprise, and feelings of familiarity. Journal of Experimental Psychology: Learning, Memory and Cognition, 27, 14–33.
Wood, F.A. (1969). An investigation of methods of presenting incomplete sentence stimuli. Journal of Abnormal Psychology, 74, 71–74.

CHAPTER 29 Use of Graphic Techniques in Personality Assessment: Reliability, Validity, and Clinical Utility

LEONARD HANDLER, ASHLEY CAMPBELL, AND BETTY MARTIN

INTRODUCTION
THE DRAW-A-PERSON TEST (DAP) AND THE HOUSE-TREE-PERSON DRAWING TEST (H-T-P)
Test Descriptions
Theoretical Basis
Test Development
Psychometric Characteristics
Range of Applicability and Limitations
Cross-Cultural Factors
Accommodation for Populations With Disabilities
Legal and Ethical Considerations
Computerization
Current Research Status
Use in Clinical Practice
Future Developments
Additional Assessment Strategies
THE KINETIC FAMILY DRAWING TECHNIQUE (K-F-D)
Test Description
Theoretical Basis
Test Development
Psychometric Characteristics
Range of Applicability and Limitations
Cross-Cultural Factors
Accommodation for Populations With Disabilities
Legal and Ethical Considerations
Computerization
Current Research Status
Use in Clinical Practice
Future Developments
Additional Assessment Strategies
APPENDIX: KEY WORKS AND CITATIONS FOR FURTHER READING
REFERENCES

INTRODUCTION

Little did Karen Machover realize when she published her slim monograph on the Draw-A-Person Test (DAP; Machover, 1949), that it would spark a controversy that has spanned the last 50 or so years. As in political and religious debates, emotions run high when psychologists discuss the validity of various drawing tests. This is due, in part, to American psychologists’ penchant for the importance of so-called “objectivity” and “science” as guides in assessment, which is often misinterpreted as mere quantification of data. European psychology, on the other hand, has more typically been characterized as emphasizing integrative efforts that focus on experiential variables, where quantification is less important. Rather than search for points of intersection of these approaches, adherents of each have collided head-on. Those influenced by the American tradition typically emphasize refinements in the measurement of details and the construction of comprehensive scoring systems, while those imbued with the importance of phenomenology search for ways to describe test phenomena that personalize meaning for the patient, in an attempt at a holistic description. This approach often comes into sharp conflict with the search for numerical values as ways in which to describe and understand personality functioning.

Graphic personality tests have suffered from a fate similar to the one suffered by other projective tests. Before the Comprehensive System (Exner, 1993) became popular there was no consistency in the way the Rorschach was administered, scored, and interpreted. Many applied clinicians and early researchers did not believe in the use of a quantified scoring system from which hypotheses could be extracted. Instead, they stressed an impressionistic approach, based upon the interpretation of content, from which they extracted “personal” meanings. The Rorschach has gained immeasurably in respectability from the use of the Comprehensive System, because of the standardization in administration and scoring. This accomplishment made it more amenable to significantly improved research methods, which allow the comparability of research findings among studies and provide a system for the integration of variables in a constellation rather than as separate indices. Nothing like this is available for the use of graphic techniques as personality measures; there is no standardized method for their administration. Many scoring methods exist for the analysis of the tests discussed in this chapter, but no single comprehensive scale has been devised that is reliable and valid. Therefore, graphic techniques are typically not looked upon as “respectable” measures by researchers (at least in the United States). However, the rich clinical data they generate make them quite popular with clinicians.

In this chapter we will describe the DAP, the House-Tree-Person Test (H-T-P), and the Kinetic Family Drawing Test (K-F-D). We will discuss the administration of each test and we will also supply information to assist the reader in evaluating each test concerning applicability and limitations, current research status, use in clinical practice, and a variety of other considerations.

THE DRAW-A-PERSON TEST (DAP) AND THE HOUSE-TREE-PERSON DRAWING TEST (H-T-P)

Test Descriptions

DAP

The DAP was devised by Karen Machover (1949), based upon the observation that productions on the Draw A Man Test (DAM; Goodenough, 1926) reflected personality issues in addition to providing the measure of intelligence for which the DAM was devised. Although there are several different sets of directions available, the typical instructions are described as follows:

The patient is given a sheet of 8½″ by 11″ unlined paper and a No. 2 pencil and is simply asked to “draw a person.” The examiner should answer all questions nondirectively: “Do it in any way you like.” The patient who expresses concern about his or her artistic ability should be told: “This is not a test of artistic ability; that’s not important.” If patients ask whether they should draw the entire figure or just the head, or whether a stick figure is acceptable, they should be instructed to do as they wish. However, if they draw only the head or if they draw a stick figure, they should then be given another sheet of paper and asked to draw an entire person or a person that is not a stick figure. The patient is then given another sheet of paper, is asked to draw a picture of the opposite sex (gender) and is then asked to make up stories about both drawings. If the patient cannot do so it is acceptable to ask a series of questions about the person drawn (see Handler, 1996).

Kissen (1986) outlined a modification of DAP administration, based on object-relations theory, which, he feels, enhances the “psychodynamic potentiality” of the DAP. He encourages the patient to “adopt an attitude of naivete and curiosity” toward his or her own drawn figures, and he then invites the patient to “explore psychologically some of the salient expressive characteristics of the human figures produced” (pp. 43–44). The patient becomes a consultant, “allow[ing] [him or her] to become spontaneous and open to inner experiential states” (p. 44). Kissen recommends saying: “I would like you to look at your first drawing as though it were drawn by somebody else. From the physical characteristics of the drawing, facial expression, posture, style of clothing—what sort of person comes through to you? What personal characteristics come to mind?” (p. 45). Questions are also asked concerning how the person might relate to others of his or her own gender and to others of the opposite gender. He draws a cartoon balloon coming from the mouths of the figures and asks the patient to “write in the balloon a statement that you can imagine the person you have described making . . . a typical statement that is characteristic of this sort of person” (pp. 45–46).

H-T-P

The H-T-P was developed by John Buck (1948) and Emanuel Hammer (1958). Buck, in the United States, and Emil Jucker, in Switzerland, independently noted that tree drawings could reflect underlying personality traits. Jucker’s student, Charles Koch (1952), developed the tree drawing as a projective test. It is now used extensively in Europe and in many non-European countries. Buck added the house drawing, as, in part, a representation of the “self,” to form the H-T-P. Hammer (1958) indicates that the house and tree drawings were also used because they were familiar items, even to very young children, and most people were quite willing to draw them. The reasons for using the H-T-P are essentially the same reasons cited for the DAP. In addition, the H-T-P is said to reflect patients’ feelings about their home situation, typically represented by the house drawing. The tree drawing is said to also reflect patients’ emotional history and to tap deeper layers of personality.

The examiner should use sheets of 8½″ by 11″ unlined paper and a No. 2 pencil and should ask the patient to draw, in order, as good a picture of a house (tree, person) as they can, each on a separate sheet of paper. The examiner could then also ask for a drawing of the opposite sex (gender). As with the DAP, the examiner should answer all questions about the drawing task in a noncommittal manner. It is also important to record all spontaneous comments to assist in the later interpretation of the data. Buck (1966) recommends that the patient be asked a list of questions regarding the drawings (see Handler, 1996). The first author asks patients to make up stories about each drawing. Some examiners substitute open-ended questions for stories (e.g., “Tell me about this house [tree, person]”).

Theoretical Basis

DAP

The DAP is described as a projective test; people are said to project underlying personality dynamics and personality traits into their drawings. It is typically believed that the patient’s DAP productions are a reflection in some way or another of himself or herself, but it is also possible that the drawings represent some other important person in the patient’s life. Projection of these personality traits and dynamics is not typically reflected directly or consciously, but is presented symbolically, in some indirect manner. Thus, people do not draw themselves as they appear, but as they experience themselves or as they wish to be. (See case illustrations in Handler, 1996, and Handler & Riethmiller, 1998.) Handler (1996) describes a case in which a short, chubby 11-year-old boy drew a fierce-looking, muscular warrior as an ego ideal or hero figure. The opposite sex drawing is often said to tap attitudes about people in the patient’s life who are of that gender.

Human beings have projected themselves into their artistic productions ever since the era of the caveman. Indeed, one Paleolithic cave painting discovered in the Trois Frères cave in France depicts a human form with deer antlers, a horse’s head, and bear paws, symbolizing the caveman’s incorporation of those animals’ strengths. Similar attempts can be seen when children are asked to draw imaginary animals. They combine the various parts of animals that have significance for them in terms of power and strength (Handler & Hilsenroth, 1994). Those who study art recognize that the artist’s personality is often evident from the style or the content of his or her artistic productions. Although many people recognize the symbolic content they view in museums and art galleries, they are often reluctant to use these insights in the interpretation of a patient’s artistic productions. We believe that artistic productions and scientifically based personality assessment can be brought together to help understand the people we assess. Typically the DAP is analyzed using ego psychological and object-relations theories, although recently Leibowitz (1999) described DAP interpretation from a self-psychology viewpoint.

H-T-P

The theoretical underpinning of the H-T-P is similar to that of the DAP; the patient is said to reflect underlying traits and personality dynamics indirectly. The house is also said to be a symbolic representation of the self and taps unconscious issues concerning the patient’s present or early family life (Hammer, 1958). Clinical observations also suggest that the house drawing represents the patient’s attitudes and emotions concerning present family relationships. Hammer (1958) and Koch (1952) believe that the tree drawing reflects deeper and more unconscious feelings about “self” compared with figure drawings, which represent a “closer to consciousness” view of “self” in the environment. The H-T-P was devised from psychoanalytic principles, but recently some clinicians have interpreted H-T-P data using a humanistic or a phenomenological approach (Burns, 1987).

Test Development

DAP

There are numerous DAP rating scales available in the literature to score various aspects of the drawings. For example, Witkin, in a series of studies (Witkin et al., 1954), developed and researched the concept of “field-dependence-independence” and later extended this concept as “psychological differentiation” (Witkin, Dyk, Faterson, Goodenough, & Karp, 1962). It is defined as a person’s articulation of his or her experience of the world and of the self, the development of a separate sense of identity, and the development of defensive style and structure. The findings from this monograph are quite consistent with present-day object-relations and attachment theories, although the authors did not include such conceptualizations in their work. In her earlier work (Witkin et al., 1954), Machover developed a DAP scoring instrument that was reported to correlate significantly with field dependence-field independence. In a series of studies (Epstein, 1957; Fliegel, 1955; Gruen, 1955; Linton, 1952; Rosenfeld, 1958; and Young, 1959), significant correlations were found between measures of field-dependence-independence and the DAP scale for children, adolescents, and young adults.

Witkin et al. (1962) developed a global scale derived from Machover’s work, called the “Sophistication-of-Body-Concept Scale,” based upon the degree of primitiveness and oversimplification of the drawings (e.g., circles or ovals for bodies, stick figures), omission and distortion of body parts, and the lack of sexual differentiation of the drawing. Many of these signs are those that some researchers describe as poor artistic ability. They are some of the same variables that Handler and Reyher (1965) describe as valid indicators of conflict/anxiety, and they are some of the same variables scored on the Harris-Goodenough DAM scale (Harris, 1963).

These findings, as well as those of Robins, Blatt, and Ford (1991) and others, demonstrate that, to a great extent, differences in artistic ability are related to personality style. Nevertheless, we recognize that there are indeed large differences in artistic ability among people. However, it is not difficult to differentiate drawings due to poor artistic ability from those that represent poor psychological differentiation of the self. The latter are drawings that contain a great deal of distortion, oversimplification (e.g., circles, ovals, rectangles, or squares for body parts; dots or circles for facial features; sticklike arms and legs; spikelike hands, fingers, and feet), lack of sexual differentiation, omission of major body parts, and unstable or otherwise slanted stance (Handler, 1996).

There are a number of conflict/anxiety scales in the literature for adults (e.g., Handler, 1967; Handler & Reyher, 1965) and for children (e.g., Koppitz, 1966a, 1966b; Naglieri, McNeish, & Bardos, 1991). Unfortunately, there is no scale available to score the verbal material obtained in the administration of the DAP. Koppitz (1966a) developed a list of 30 emotional indicators for the DAP by examining the drawings of over 1,500 children, ages 5 to 12. The scale, said to identify children with emotional problems, has very good interrater reliability and good validity, although follow-up studies by Fuller, Preuss, and Hawkins (1970); Eno, Elliot, and Woehlke (1981); and Johnson (1989), while validating many of Koppitz’s findings, indicated some problems with cutoff scores to determine presence of emotional problems, resulting in mixed findings for clinical utility.

Based on her earlier work, Koppitz identified certain emotional indices as representing different conscious or unconscious attitudes. For example, tiny figures and the omission of nose, mouth, and hands were associated with shyness, timidity, and withdrawal, whereas gross asymmetry of limbs; the presence of teeth, long arms, and big hands; and the presence of genitals were said to be associated with a hostile attitude and impulsivity. While the validity of these associations was supported by Handler and McIntosh (1971), it was not supported by Lingren (1971) or Black (1972, 1976). Although Koppitz was successful in generating a list of age-related (developmental) DAP signs, researchers found it was not always possible to differentiate these signs from those indicating emotional problems. In general, however, drawings from clinical groups contained more conflict indices than did drawings from comparison groups. For example, Hibbard and Hartman (1990) found more anxiety in the drawings of alleged sexual abuse victims, compared with a comparison group.

H-T-P

Buck (1948, 1966) constructed a quantitative H-T-P scale to measure intelligence, but it is rarely ever used clinically or in research (Handler, 1996). He also devised a very elaborate scoring system as a personality measure, but most clinicians do not use it in their clinical work. Instead, they analyze each drawing impressionistically, or they use Buck’s sign approach method. While specific H-T-P signs are often used in research and in clinical application, psychologists typically understand that no single interpretation is adequate for any drawing sign. For example, Hammer (1954) indicates that for some patients the chimney of a house may represent a phallic symbol, while for others it represents just an important detail of a house.

Psychometric Characteristics

DAP

Test-retest reliabilities range from fair to good (Guinan & Hurley, 1965; Handler, 1996). Interrater reliabilities for the various scales range from good to excellent (Handler, 1996). Norms are available for both children and adults. Those for children are quite good, but those for adults are rather poor. The reader is referred to Gilbert and Hall (1962), Handler (1996), Jones and Thomas (1964), Thomas (1966), Urban (1963), and Wagner and Schubert (1955) for adult norms. Norms for children are discussed in Groves and Fried (1991), Handler (1996), Koppitz (1967, 1968, 1969, 1984), Machover (1960), Saarni and Azara (1977), and Schildkrout, Shenker, and Sonnenblick (1972). Changes in children’s drawings follow a developmental pattern described by the authors cited. For example, Schildkrout et al. (1972) emphasize that drawings of normal adolescents typically reflect their predominant age- and stage-related problems or conflicts, which should not be confused with psychopathology. Saarni and Azara (1977) found that the drawings of male adolescents contained more aggressive-hostile indices than those of female adolescents, and that the drawings of high school girls showed twice as many insecure-labile signs as those of young adult females. Thus, Handler (1996) states, “There are apparently some DAP variables that are primarily related to personality factors, those that are primarily related to sociocultural factors, and those that reflect both [factors] as they interact” (p. 228). Gilbert and Hall (1962) found that as people age there is an increasing tendency for their drawings to become absurd, incongruous, fragmented, and primitive.

H-T-P

There are few available norms for the H-T-P and little material available concerning age-related differences. Little formal research has been done on the H-T-P since the 1970s, except for a recent study by Vass (1998), who produced a new scoring system and manual using hierarchical cluster analysis that has good interrater reliability. A scale by Van Hutton (1994) purports to measure sexual abuse in children, but it has limited clinical utility.

Range of Applicability and Limitations

DAP

The DAP is very useful because it takes very little time to complete, typically about 5 to 10 minutes, it is easy to administer, and it can be administered to a wide range of (sighted) patients, ranging in age from early childhood (perhaps 3 or 4) to old age. It is especially useful with shy, inhibited, or otherwise nonverbal children and adults, and it is typically nonthreatening. It is useful with patients who have a wide variety of language-related or speech-related problems and for people who do not speak the language of the clinician doing the assessment. The DAP also offers the examiner an opportunity to observe motor performance, an area that is hardly tapped in assessment batteries. The DAP has proven to be quite useful in reflecting improvement in psychotherapy (e.g., Harrower, 1965; Leibowitz, 1999; Robins et al., 1991) and in sex therapy (e.g., Hartman & Fithian, 1972; Sarrel & Sarrel, 1979; Sarrel, Sarrel, & Berman, 1981). Most people find the test nonthreatening and children typically find it quite engaging. A very important reason for including the DAP in an assessment battery is that it is the only test in which there is no external stimulus or structure provided for the patient. There are no designs to copy and no vague images to examine. Since there is no external stimulus to copy or interpret, the clinician has an opportunity to observe the patient’s functioning where structure and organization must come from within. For those patients who cannot provide the inner structure necessary to do well on these tasks, it is possible to postulate a lack of a developed sense of self. It is for this reason that the DAP is so sensitive to psychopathology and to intrapsychic and interpersonal changes due to psychotherapy. The DAP is often useful for the assessment of patients who are evasive and/or guarded. 
Such patients often give barren records on verbal tests, but they are often less guarded on drawing tests because they are not familiar with what constitutes sound performance.

Limitations include problems in movement and/or coordination and other visual-perceptual problems, such as those seen in some neurologically impaired patients and in some aged patients (Hayslip & Loman, 1986; Oberleder, 1967). Plutchik, Conte, Weiner, and Teresi (1978) indicate that the process of aging per se has effects on figure drawings that are similar to mental illness at any age, while other studies (Cumming & Henry, 1961; Lakin, 1960) indicate that it is possible to differentiate motor problems from emotional problems in the aged. However, it is probably not possible to separate the destructive cortical effects of aging from those that reflect emotional problems. With older patients, clinicians would do best to understand the quality of the patient’s functioning on the DAP rather than attempt to search for a specific diagnosis.

Patients with very low IQs produce rather meager drawings; the DAP is typically not as useful with these patients as a personality measure. The DAP is not useful in testing patients who have had little or no experience in drawing, especially in drawing the human figure. There are a number of cultures, for example, in which such drawing is forbidden or strongly discouraged (Dennis, 1966). Drawings from such people are quite primitive because of lack of drawing experience. Other problems with the DAP center around the interpretation of the drawings using an unvalidated sign approach.

H-T-P

The H-T-P is used with children, adolescents, and (less so) adults. It is not a useful instrument if the clinician is seeking a test that can be scored objectively. Essentially the same range of applicability and limitations exist for the H-T-P as for the DAP.

Cross-Cultural Factors

DAP

There are significant differences in the drawings of both children and adults from cultures other than our own (e.g., Dennis, 1966; Gardiner, 1974; Gonzales, 1982; Handler & Habenicht, 1994; Klepsch & Logie, 1982; Koppitz & Casullo, 1983; Koppitz & de Moreau, 1968; Mebane & Johnson, 1970; Meili-Dworetzki, 1982; Money & Nucombe, 1974; Smart & Smart, 1975; Zaidi, 1979). It is important, in the interpretation of drawings of patients from other cultures, to understand cultural effects so that they are not interpreted as personality issues (Handler & Clemence, 2003). Space limitations do not allow for a discussion of the effects of culture on various drawing tests. It is not ethical to use any of the drawing techniques discussed in this chapter unless separate norms are obtained for that culture or if it has been demonstrated that there are no cultural differences in performance on that test (Handler & Clemence, 2003). It is also helpful to determine the degree of acculturation of the patient to the mainstream culture, in order to determine whether the use of traditional norms is appropriate.

H-T-P

As with the DAP, there are significant differences among H-T-Ps of subjects and patients of various cultures (e.g., Alcade et al., 1982; Granela-Suarez, Alverez-Reyes, & Lopez-Enrich, 1985; McHugh, 1963; Sallery, 1968; Soutter, 1994).

Accommodation for Populations With Disabilities: DAP and H-T-P

It is possible to make accommodation for those patients with speech and/or hearing disabilities; the simple instructions can be written out for the hearing impaired, and the stories or associations to the drawings can also be written.

Legal and Ethical Considerations: DAP and H-T-P

Because the available objective scoring systems are primarily used in research and few have been used in clinical application, it is difficult to employ data from the DAP and the H-T-P in forensic settings. In fact, several writers, engaged in the controversy about whether drawings are valid, have described their use as “unethical” (e.g., Kahill, 1984; Martin, 1984; Motta, Little, & Tobin, 1993) because of the lack of validated scoring systems. These arguments have been rebutted by drawing test adherents (e.g., Bardos, 1993; Patterson & Janzen, 1984) who support their clinical use.

Computerization: DAP and H-T-P

With the use of various computer sketch-pad devices it is possible to draw a person, a house, or a tree on an electronic tablet and have the drawing appear on a computer screen. These drawings, answers to questions about the drawings, and stories can then easily be transmitted to the assessor. We doubt, however, that line quality and other formal aspects of the drawings would be the same, compared with the use of pencil and paper. Studies are needed to compare the two methods for individual patients.

Current Research Status

DAP

In the controversy concerning the validity of figure drawings, positive findings are typically not cited by those who are convinced that drawing techniques are invalid, whereas those who support the use of the DAP often ignore negative findings. Consequently, each side has claimed victory for their point of view. For example, Joiner, Schmidt, and Barnett (1996) describe their opinion of DAP validity as “50 years of unimpressive validity data” (p. 126). On the other hand, many studies demonstrate the validity of the DAP, especially in reflecting psychopathology and in reflecting change in psychotherapy (e.g., Handler & Reyher, 1964, 1965, 1966; Kahn & Jones, 1965; Kot, Handler, Toman, & Hilsenroth, 1994; Lewinsohn, 1965; Maloney & Glasser, 1982; Robins et al., 1991; Tharinger & Stark, 1990; Tolor & Tolor, 1955; Yama, 1990).

One problem concerning the validity of drawing techniques is that researchers choose procedures that will no doubt result in nonsignificant findings. For example, they isolate one or a few indices from a figure drawing rating scale, taken completely out of context, and then attempt to correlate these indices with measures they view as excellent criterion variables, such as self-report inventories. Joiner et al. (1996) isolated three anxiety/conflict variables, two of which have been determined to be rather poor (Handler & Reyher, 1965) and correlated children’s DAP scores with their scores on self-report measures of anxiety and depression. Not surprisingly, they obtained nonsignificant correlations. In such a study the possible conclusions are that neither the DAP nor the self-report measures are valid; that both are valid, but are tapping different levels of personality functioning; that the DAP is valid and the self-report measures are invalid; or the inverse. The authors unfortunately concluded that the self-report measures are valid but the DAP is not valid. Such studies, and there are many like them in the literature on the DAP, tell us nothing more concerning the validity of the DAP than we knew before the study was done. The issue of poor correlations between self-report inventories and projective tests has been discussed and researched by many psychologists (e.g., Bornstein, Bowers, & Bonner, 1996; Bornstein, Bowers, & Robinson, 1995; Ganellen, 1996; McClelland, Koestner, & Weinberger, 1989; Meyer, 1996, 1997; Shedler, Mayman, & Manis, 1993); the poor correlations have been found to be unrelated to the validity of the instruments in question.

To take one or a few DAP indices, out of context, and to correlate them with integrated scales, composed of many items, is considered to be poor research procedure. Rushton, Brainerd, and Priestly (1983) discuss the principle of aggregation, which says simply that any measure, and especially any individual scale item, will have error associated with it. A single item, then, is likely to yield low correlations with associated constructs unless it is combined with other measures that are measuring the same construct. If items are combined, error tends to average out, thus yielding a more accurate estimate of the true relationship between the variables. A single item or variable has little discriminative power when compared to an entire test or scale. It is for this reason that Handler and Habenicht (1994) state, in reference to the research on the K-F-D, “It is not surprising that no significant differences were found [using this research procedure], because we cannot expect that all children . . . will reflect their feelings in the same way graphically. The analysis of single signs or variables is to be discouraged” (p. 447).
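The aggregation principle can be illustrated with a small simulation. The Python sketch below is not drawn from Rushton et al. (1983) or from any DAP data set. Every quantity in it (the latent trait, the noise levels, the number of items, and the function names `pearson` and `simulate_aggregation`) is a hypothetical assumption, chosen only to show how item-level error tends to average out when items are combined.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def simulate_aggregation(n_people=2000, n_items=20, item_noise=2.0, seed=0):
    """Compare a single noisy item with a composite of many noisy items.

    Assumed model (hypothetical): each item = latent trait + independent error;
    the criterion = the same latent trait + its own independent error.
    """
    rng = random.Random(seed)
    single, composite, criterion = [], [], []
    for _ in range(n_people):
        trait = rng.gauss(0, 1)
        items = [trait + rng.gauss(0, item_noise) for _ in range(n_items)]
        single.append(items[0])                   # one isolated index
        composite.append(sum(items) / n_items)    # aggregated scale score
        criterion.append(trait + rng.gauss(0, 1))
    return pearson(single, criterion), pearson(composite, criterion)


r_single, r_composite = simulate_aggregation()
print(f"single item vs. criterion:     r = {r_single:.2f}")
print(f"20-item composite vs. criterion: r = {r_composite:.2f}")
```

Under these assumed values, the composite correlates more strongly with the criterion than the single item does, which is the pattern the aggregation principle predicts. The size of the gain depends entirely on the assumed noise levels and should not be read as an estimate for any actual drawing scale.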

DAP research in which only isolated variables are chosen for validation is reductionistic in nature, because this approach loses the richness drawings can convey in their entirety (Waehler, 1997). Gustafson and Waehler (1992) found that a composite score of drawing characteristics yielded a more significant relationship between the DAP and a measure of abstract thinking than did any single DAP variable alone. Therefore, an experimental design to determine DAP validity should allow for an examination of the relationship of a multivariable scale with other measures.

An additional complicating problem is that individual signs may have a complex relationship with psychopathology. For example, Handler and Reyher (1964, 1965, 1966) found that both small size and large size DAPs and use of light or heavy line indicate conflict. These extremes were found to be related to two different styles of reacting to stress or conflict. Two drawing patterns emerged under stress conditions: constriction of the drawing (characterized by heavy lines, mechanical breaks in the line, absence of line sketchiness, detached or semidetached body parts, and small size) and expansion (marked by diffusion of body boundaries; vagueness of body parts; extremely sketchy lines, loosely bound together; light lines; and large size). The authors theorized that these two configurations reflect two qualitatively different reactions to stress. The expansive pattern includes variables that suggest a desire to finish the drawing quickly and with little involvement; these individuals respond to anxiety by avoiding the anxiety-provoking situation. The constricted pattern involves a controlled, deliberate drawing approach, with great attention to detail, reflecting attempts to cope with the anxiety-provoking situation, rather than to avoid it. Therefore, the size of the drawing, the amount of detail present, the heaviness of the line, and the degree of definiteness of the body boundary could all vary, depending upon how the person copes with stress or conflict. Both of these patterns are associated with experienced stress and conflict; when this relationship is ignored, the scores of subjects with significantly smaller drawings, heavier lines, distinct body boundaries, and excessive detail could very likely cancel out the scores of subjects with significantly larger drawings, lighter lines, diffuse body boundaries, and sparse detail. Consequently, nonsignificant research findings would be obtained. 
Subjects who attempt to avoid the stress of the situation would demonstrate less shading and erasure, compared with those who attempt to cope with the stress, who would demonstrate significantly more shading and erasure. Therefore, in doing DAP research or interpreting DAPs in a clinical setting, the use of hypothesized “signs” in isolation does not work well, and it is not considered good science; isolated variables, taken out of context, often produce nonsignificant or misleading results. This situation may be seen when we examine findings for erasure of the drawing. One needs to determine whether the redrawn body part is improved after the erasure or is made worse.
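The cancellation effect described above can also be sketched numerically. The following Python simulation uses invented data, not data from Handler and Reyher. It assumes two equally sized groups of subjects, one that constricts its drawings under stress and one that expands them, with drawing size expressed as a deviation from a typical size. The names `simulate_cancellation`, `noise_sd`, and the specific distributions are hypothetical.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def simulate_cancellation(n_per_group=1000, noise_sd=0.3, seed=1):
    """Simulate two opposite drawing reactions to stress (assumed model).

    direction = -1: constriction (more stress -> smaller drawing)
    direction = +1: expansion    (more stress -> larger drawing)
    Size is a deviation from a typical size, so 0 means "average size".
    """
    rng = random.Random(seed)
    stress, size = [], []
    for direction in (-1, 1):
        for _ in range(n_per_group):
            s = rng.uniform(0, 1)
            stress.append(s)
            size.append(direction * s + rng.gauss(0, noise_sd))
    n = n_per_group
    pooled = pearson(stress, size)                         # groups mixed together
    constricted = pearson(stress[:n], size[:n])            # constriction group only
    expansive = pearson(stress[n:], size[n:])              # expansion group only
    extremity = pearson(stress, [abs(v) for v in size])    # small OR large, scored as deviation
    return pooled, constricted, expansive, extremity


r_pooled, r_constricted, r_expansive, r_extremity = simulate_cancellation()
print(f"pooled sample:        r = {r_pooled:.2f}")
print(f"constricted subgroup: r = {r_constricted:.2f}")
print(f"expansive subgroup:   r = {r_expansive:.2f}")
print(f"stress vs. extremity: r = {r_extremity:.2f}")
```

In this assumed setup, the pooled correlation between stress and size is close to zero even though each subgroup shows a substantial relationship, and scoring the extremity of size (in either direction) recovers it. This is only an illustration of why opposite patterns can produce nonsignificant pooled findings; it makes no claim about effect sizes in real samples.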

A second issue that makes research concerning the validity of the DAP difficult and complicated concerns the perceived source of the stress or conflict experienced by the subject. Handler and Reyher (1964, 1966) and Jacobson and Handler (1967) found that some variables (size, line lightness or heaviness, shading and erasure) seemed to be related to the subject’s response to external stress, while other variables (omission of body parts, diffuse boundaries, degree of distortion of the head or the body, simplification of the head or body, and vertical imbalance) were more related to intrapsychic conflicts. Handler and Reyher (1965) found evidence for the validity of these patterns in their summary of 51 studies. They found significant validity in 17 of 20 studies measuring distortion; 22 of 23 for omission; 11 of 12 for lack of detail; 15 of 20 for head or body simplification; and 4 of 5 for vertical imbalance. Small and large size, when scored separately; omission of major body parts; lack of detail; distortion of the head and/or the body; simplification of head or body; and vertical imbalance look quite good as valid DAP indices of conflict and anxiety. Robins et al. (1991), examining change in DAPs after a significant period of psychotherapy, found that these same variables changed dramatically. It appears that the more emotionally disturbed a patient is, the more there is disturbance in body image and in reality testing, thereby resulting in major distortions in the drawings of disturbed patients. The equivocal findings for some variables, such as shading, hair shading, and erasure are probably due to the fact that these indices are sometimes a sign of adaptation, if they improve the drawing, or they may reflect conflict/anxiety, especially if they degrade the drawing (Mogar, 1962). 
The large number of nonsignificant findings for these variables (Handler & Reyher, 1965) is probably due to a cancellation effect, where those subjects for whom these variables measured conflict/anxiety and those subjects for whom they might be measuring adaptation cancelled each other out.
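For readers who prefer proportions, the study tallies reported above from Handler and Reyher (1965) can be converted with a few lines of Python. The counts are those given in the text; the variable names `tallies` and `rates` are ours. The sketch does not pool the categories, because the text does not say whether the same studies contribute to more than one tally.

```python
# (studies with significant findings, studies examined), as reported in the text
tallies = {
    "distortion": (17, 20),
    "omission": (22, 23),
    "lack of detail": (11, 12),
    "head or body simplification": (15, 20),
    "vertical imbalance": (4, 5),
}

# Proportion of studies reporting significant validity for each variable
rates = {name: significant / total for name, (significant, total) in tallies.items()}

for name, rate in rates.items():
    print(f"{name}: {rate:.0%}")
```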

State-trait issues are important in research as well as in the clinical application of all assessment instruments. It is possible with the DAP to determine whether the results are due to situational (state) stress or to more central (trait) issues. Handler and Reyher (1964, 1966) devised a “neutral” (reflecting few intrapsychic conflicts) drawing, the drawing of an automobile, to control for artistic ability. However, some variables reflected the external stress situation and were present in the drawings of the automobile as well as in the drawings of the people, whereas other indices were present in only the figures and not in the automobile (Handler & Reyher, 1964, 1965, 1966; Jacobson & Handler, 1967). These variables were taken to represent intrapsychic issues rather than external or state stress.

A quite different approach to DAP validity emphasizes the importance of the role of the interpreter and of his or her ability to “empathize” with the drawing in order to reach meaningful and accurate interpretations. This approach is illustrated in the work of Scribner and Handler (1987), who found that more affiliative and empathic interpreters were significantly more accurate in their interpretations, compared with less affiliative and empathic interpreters. Burley and Handler (1997) found that good DAP interpreters were more empathic, intuitive, and cognitively flexible, compared with poor interpreters. Thus, one way in which to validate the DAP is to use an experiential approach in which the drawing contents and the examiner’s approach and skill are also key factors. Methods to improve DAP skill may be found in Handler (1996) and in Riethmiller and Handler (1997).

Lewinsohn (1965) found that the overall quality of drawings was related to external measures of patients’ adjustment, while Maloney and Glasser (1982) found that the overall quality of drawings differentiated various patient populations. Yama (1990) found that ratings of overall artistic quality, a rating of bizarreness of the figure, and a drawing estimate of overall adjustment in a group of Vietnamese foster children were all related to the frequency of foster home placements. The children with severe emotional problems, as reflected in the drawings, had a great deal of trouble adjusting to foster homes and therefore had to be moved frequently. These findings, as well as those of Robins et al., demonstrate that grossly poor artistic ability is related to personality dysfunction. Also, if the automobile drawing is of relatively good quality and the figures are poorly done, it would be safe to interpret poor drawings as an indication of conflict or psychopathology rather than as an indication of poor artistic ability.

H-T-P

There are relatively few reliability and validity studies available for the H-T-P. Many studies find significant differences between specific clinical groups and normal comparison or control groups, but not between or among various clinical subgroups. Thus, the H-T-P is a rough, nonspecific measure of pathology that does not differentiate among clinical subgroups.

Use in Clinical Practice

DAP

Drawing techniques are frequently used in clinical practice, for the reasons stated earlier (Archer, Maruish, Imhof, & Piotrowski, 1991; Kennedy, Faust, Willis, & Piotrowski, 1994; Piotrowski & Keller, 1989, 1992; Watkins, Campbell, Nieberding, & Hallmark, 1995). However, methods of interpretation are quite varied and are not clearly delineated. The focus on individual variables, examining each one in turn, encourages a molecular approach to our understanding of patients, leading the interpreter away from an understanding of the patient’s self-experience, resulting in an unintegrated and often simplistic analysis. Our clinical experience indicates that interpretations that are guided by impressions and emotional reactions to the drawings, further enriched by the use of an objective scoring system, are superior to interpretations that use a simple sign approach.

An example of such a phenomenologically based approach is one employed by Tharinger and Stark (1990), in which two doctoral students were asked to sort DAPs and K-F-Ds into five piles, ranging from absence of psychopathology to presence of severe emotional disturbance. The raters were then asked to describe those aspects of the drawings that led them to their ratings, based upon their affective experience when putting themselves in the place of the individual depicted in the drawings. They listed four criteria: (1) feelings of inhumaneness (feeling animalistic, grotesque, or missing body parts); (2) lack of agency (drawings gave the rater a feeling of being unable to change their environment); (3) lack of well-being (related to the facial features of the drawing); and (4) a hollow, vacant, or stilted feeling. Two new raters then rated the drawings using these four criteria. They successfully differentiated the drawings of children with mood disorders from those of a control group; none of the 30 individual signs, the total scores on the Koppitz Figure Drawing Scale, or the 37 K-F-D signs differentiated the groups.

Kot et al. (1994) used an experiential approach in evaluating DAPs from matched groups of homeless men, male psychiatric inpatients, and unemployed males enrolled in a vocational rehabilitation program. The four criteria extracted, posed as questions, were: (1) Is the person in the drawing frightened of the world? (2) Does he have intact thinking? (3) Is he comfortable with close relationships? and (4) Would you feel safe being with the person drawn? This approach significantly differentiated the three groups, while an objective scoring system did not do so. Based on this research and our own clinical experience we recommend the use of this approach to interpret various graphic tests. It is helpful if the interpreter attempts to imagine being the person drawn and to then describe feelings and thoughts, using self-reflection. The interpreter might ask, “How do I feel about myself?” “How do I approach others?” “What do I need in order to feel safe in my world?” The interpreter could also use knowledge of specific drawing variables to round out the interpretations. The clinician should then adopt the stance of the “generalized other” (Potash, 1998) by observing his or her own reactions concerning how others who might interact with the person drawn would experience him or her. Questions such as “What would it be like to be this person’s friend, his or her child, parent, or employer?” “How does this person approach new relationships with others?” and “How does this person want others to experience him or her?” help to understand the patient’s presentation of “self” in his or her interactions with others.

H-T-P

Although there is no agreement concerning what the house and the tree symbolize, other than a representation of the self, the house is sometimes said to represent the part of the self that is concerned with the body (Hammer, 1958, 1997; Jolles, 1971). However, it is also said to represent nurturance, stability, and a sense of belonging. While the tree, too, is said to represent the self, it is sometimes discussed in the literature as representing the patient’s sense of growth, vitality, and development. There is little research concerning interpretive guidelines, but the analysis of house and tree parts is often identified with various aspects of the self. For example, the roof is said to represent fantasy and/or intellectualization; the walls to represent ego strength; the windows and doors to represent interpersonal accessibility; and the chimney to represent either masculinity or warmth. These symbolic interpretations are extremely problematic because there are no research data to support them. The interpreter is better off, in our opinion, in using the H-T-P in an impressionistic manner, as described in the discussion of the DAP. This approach avoids the use of untested interpretations.

Future Developments

DAP

DAP research should focus on the clinician making interpretations from the drawings rather than on the physical properties of the drawings themselves. We also need more research to differentiate those drawing variables related to poor artistic ability and those that are related to psychopathology. The theoretical interpretive base should also be outlined more clearly, especially in the area of self-representation. In this regard it would be helpful to discuss the drawings with the patients who drew them so that self-representational issues and object-relational issues as they are reflected in the drawings are clarified.

H-T-P

The collection of more normative data is essential if this instrument is to be of any real clinical use. In addition, test-retest reliability and validity studies are necessary. The research foundation for the H-T-P is rather weak. The test remains a clinical exploratory tool, especially for children, in the identification and exploration of self and family attitudes and conflicts.

Additional Assessment Strategies

DAP

There are many drawing techniques related to the DAP, such as the Draw-A-Group Test, the Draw-A-Person-in-the-Rain Test, and the Most Unpleasant Concept Test (Hammer, 1958, 1997). Verinis, Lichtenberg, and Henrich (1974) describe various animal drawing tests. M. Miller (1997) devised a test in which the patient draws a tree, a tree in a storm, and a tree after a storm. This technique is said to measure a person’s ability to cope with stress. A relatively recent innovation is the Draw-A-Person: Screening Procedure for Emotional Disturbance (DAP:SPED), devised by Naglieri, McNeish, and Bardos (1991), in which the child is asked to draw a man, a woman, and a self-drawing. The test is well normed and it has excellent interrater reliability. The Projective Mother-and-Child Drawing approach (Gillespie, 1994) asks the patient to “draw a mother and a child.” Handler (Handler & Hilsenroth, 1994) developed the Fantasy Animal Drawing Technique as an assessment and therapeutic approach with children, where the two activities are combined. The child is asked to invent a make-believe animal, to draw it, and then to tell a story about it. The therapist then tells the child a story that is designed to symbolically express a therapeutic message to him or her.

H-T-P

Hammer (1958) devised the Chromatic H-T-P, in which the patient draws with colored art supplies. He believes the Chromatic H-T-P taps a deeper layer of the unconscious compared with the Achromatic H-T-P. Hammer presents some very compelling material to support his case, but he does not present any research data. Burns (1987) devised the Kinetic House-Tree-Person Test (K-H-T-P) and has discussed what constitutes healthy signs in this approach. Patients draw a house, tree, and person, “all on one page, and with each in action.” However, there are no data available concerning the validity or the clinical utility of this approach at this time.

THE KINETIC FAMILY DRAWING TECHNIQUE (K-F-D)

Test Description

The K-F-D is a projective technique that asks the individual “to draw everyone in your family, including you, doing something” (Burns, 1982; Burns & Kaufman, 1970, 1972). It is used in personality assessment, art therapy, family treatment settings, and in special education settings (Handler & Habenicht, 1994). The K-F-D is said to facilitate the emergence of the subject’s perception of dynamic relationships among family members, as well as individual adaptive and defensive styles in relation to family functioning (Handler & Habenicht, 1994).

Theoretical Basis

Burns indicates that by asking the patient to inject action into the drawing, it becomes more projective, tapping more unconscious feelings about family members and also illustrating patterns of family relationships. There is no single guiding theory that forms the basis of the K-F-D, except for the projective hypothesis. Recently, the K-F-D has been tied to attachment theory (e.g., Grzywa, Kucharaska-Pietura, & Jasiak, 1998; Pianta, Longmaid, & Ferguson, 1999) in several different settings.

Test Development

The K-F-D was developed from the Draw A Family Test (DAF; Hulse, 1951, 1952) in which the patient is asked to “draw a family” without instruction for action. The DAF, in turn, was based upon the work of Appel (1931) and Wolff (1946). Burns (1982) developed a scoring method that involves, in part, the actions depicted, the physical arrangements and proximities of the figures, and the presence and characteristics of certain drawing styles that are said to represent psychopathology. Other researchers have used different scoring variables, such as distortion of body parts, line quality, and sexual differentiation, thereby creating difficulty in the ability to compare studies concerning the reliability and validity of the K-F-D (Handler & Habenicht, 1994).

Psychometric Characteristics

Existing K-F-D scales can be scored with a high degree of interrater reliability (Handler & Habenicht, 1994). Test-retest reliabilities are also quite good (Handler & Habenicht, 1994). Normative findings are provided by Bauknight (1977), Brewer (1980), Jacobson (1973), Rodgers (1992), Shaw (1989), and Thompson (1975) for children and adolescents. Nevertheless, these norms are derived using a number of different scoring systems and they are not detailed enough. In order to interpret the K-F-D accurately, it is necessary to understand how age and sex-related norms are reflected in the drawings. Otherwise the clinician might interpret such variables as indicating psychopathology.

The validity of the K-F-D is difficult to ascertain due to the inconsistent use of various scoring systems. In a review by Gardano (1988), many studies using the Burns variables obtained significant findings. Other studies found the Burns scoring variables to be of questionable validity; sometimes more pathology variables were found in the K-F-Ds of normals than in those of the disturbed subjects (Handler & Habenicht, 1994). Such inconsistency in results may indicate poor K-F-D validity, but, as with the DAP, inappropriate research procedures have been employed. For example, some studies (e.g., Acosta, 1989; Schacker, 1983) used only one or two variables without finding significant results. Significant findings have been obtained in studies in which many K-F-D variables were employed (e.g., Annunziata, 1983; Layton, 1983).

Range of Applicability and Limitations

The K-F-D allows clinicians to obtain a quick understanding of the patient’s view of his or her family and of his or her position in the family, including interaction patterns and relationship patterns, as the patient experiences them. The K-F-D is helpful in understanding the presenting complaint and its meaning for family dynamics and it illuminates the effects of the prevailing culture and subculture on the patient’s personality development. Despite the current controversy regarding K-F-D validity, the technique continues to be used worldwide in a variety of settings, allowing for a burgeoning production of age, gender, and cross-cultural normative findings. Normative studies have been completed with elementary school children of various ages and with adolescents. Results indicate that caution is necessary when interpreting the K-F-D to ensure that variables are not scored as pathological rather than as reflecting developmental and/or gender differences. For example, Brewer (1980) found that 6- to 8-year-old children picture themselves as interacting more often than do 9- to 12-year-old children. Thompson (1975) discovered that adolescents depict isolation among family members, lending support to the idea that the K-F-D could be misinterpreted to indicate pathology rather than to reflect adolescents’ healthy need for individuation, independence, and separation from the family. Thompson also provides evidence that certain variables previously interpreted as pathological actually indicate healthy positive adaptive skills, possibly explaining the discrepant findings that show more pathology in normal groups versus abnormal or problem groups. Jacobson (1973) demonstrated that boys omit more body parts than girls and concluded that omission of detail by a boy may be due to a typical developmental lag rather than being an index of pathology.

Cross-Cultural Factors

Cross-cultural studies help clinicians differentiate those aspects of the K-F-D that may indicate poor psychological functioning from those that express cultural influences. Taiwanese and Chinese K-F-Ds of children ages 8 to 14 years demonstrate an emphasis on group orientation rather than on individualism and independence seen in American families (Cho, 1987; Nuttall, Chieh, & Nuttall, 1988). Other studies also emphasize the differences in portrayal of family figures and activities as an expression of cultural influences. For example, Japanese K-F-Ds reflect the importance of the father (drawn first or larger) due to a patriarchal family structure (Fukada, 1990), whereas other cultures, notably African and Native American cultures, emphasize the mother figure as a mirror of their matriarchal system (Deren, 1975; Gregory, 1992).

Cultural themes may also vary according to family integration patterns; some cultures emphasize active shared family activities, while others emphasize passive activities, where family members are engaged in isolated activities. Ledesma (1979) found social class differences among the drawings of Filipino adolescents; upper-class children drew more passive family interactions and lower-class children drew more active family interactions. In comparing Filipino children to Japanese children ages 9 to 12, Cabacungan (1985) discovered that Filipino K-F-Ds depict work and recreational activities, whereas Japanese families are often depicted in recreational activities alone. Chartouni (1992) found that K-F-Ds accurately portrayed cultural differences between American Lebanese children and American Caucasian children. The Lebanese drawings showed families as more intimate, cooperative, communicative, and nurturing compared with Caucasian families. These findings are consistent with the unique family qualities inherent in the two cultures; Lebanese families are traditionally close, with an emphasis on nurturance and interdependence, while American families tend to encourage autonomy and individuality. Additional cross-cultural findings are reported in Handler and Habenicht (1994).

Accommodation for Populations With Disabilities

The same accommodations are appropriate for the K-F-D as for the DAP.

Legal and Ethical Considerations

As with the DAP and H-T-P, the lack of a single reliable and valid scoring system and adequate norms weakens the use of the K-F-D in a variety of legal and forensic settings.

Computerization

The K-F-D can be administered by computer in the same way as described for the DAP.

Current Research Status

A detailed summary of validity findings can be found in Handler and Habenicht (1994). Unfortunately, many of the formal scoring variables described by Burns have not been supported by research. However, as with the DAP, many of the nonsignificant findings are the result of poor research procedures, such as exploring single variables rather than examining patterns of variables observed in context. In addition, a holistic research approach has been more successful in validating the K-F-D. For example, in a study by Hackbarth (1988), objective scores did not differentiate between K-F-Ds of sexually abused and non-sexually abused children, but a global rating of “like to live in this family” significantly differentiated the two groups. This variable reflected the raters’ impression of positive family relationships and the presence of an environment for personal growth, compared with a feeling that there was something wrong or hurtful in the family. Tharinger and Stark (1990) also obtained significant findings for the K-F-D used holistically, while a sum of scores did not significantly differentiate clinical from nonclinical groups.

Other research applications of the K-F-D have been quite productive. It has been used to study the effects of various family and cultural forces on children. For example, Malpique et al. (1998) used the K-F-D to study the effects of alcohol and violence in the family on children and adolescents and Krauthamer (1997) studied children from war conflict areas of the world.

The K-F-D continues to be employed in a wide variety of clinical applications. Veltman and Browne (2000) concluded that although the K-F-D is not reliable in the diagnosis of child maltreatment by teachers and mental health professionals, the drawings can be useful in eliciting information from children about distressful events. L. Miller (1995) determined that K-F-Ds of juvenile sexual offenders were significantly different from those of the general population in that offenders omitted body features and parental figures from the drawings, had more distorted body features, drew more barriers, depicted families lacking nurturance, and drew more dangerous objects or activities. Miller concluded that the K-F-D may be used to identify sexual offenders, but no individual predictive statistics were provided. Other studies continue to emphasize the clinical utility of the K-F-D in generating family dialogue among members and as a means of documenting therapeutic processes (Linesch, 1999; Stein, 1997). Cobia and Brazelton (1994) have used the K-F-D clinically with blended families and Stein (1997) has used it in primary care settings to provide a rapid understanding of the patient’s family relationships. Despite the lack of an empirical research base, the K-F-D continues to be used as a clinical assessment tool because it uniquely taps an understanding of family functioning as perceived by the patient (Thompson & Nurse, 1999). Amid the controversy concerning the validity and reliability of the K-F-D, it continues to be used to inform clinical decisions. Contemporary modifications include new scoring systems to measure attachment among family members (Pianta et al., 1999; Taylor, Kymissis, & Pressman, 1998) and methods to employ the K-F-D along with interview material (Linesch, 1993).

Use in Clinical Practice

The K-F-D is used in clinical practice primarily as a way to understand a child’s or adolescent’s view of his or her family structure and dynamics. It is best used with an interview and other assessment instruments to build a unique picture of the patient’s view of himself or herself as a family member. It is difficult to apply the available research scales to clinical practice; results are better if the examiner interprets the drawings using an impressionistic approach. Leads obtained from the K-F-D are quite useful in psychotherapy with children and adolescents and are also quite useful when (judiciously) shared with parents in order to sensitize them to the existence of family relationship problems.

Future Developments

Handler and Habenicht (1994) called for normative research that specifically addresses each age group across gender, ethnic, socioeconomic, and racial groups. They stressed the importance of exploring factors such as intelligence, birth order, family size, and drawing ability. There is also a need for studies highlighting a holistic, integrative approach in lieu of techniques that focus on the summation of a number of signs. With this in mind, future studies should also investigate the use of clinical inquiry along with the actual K-F-D and should employ clinical interpreters in actual research rather than base the results on individual scoring variables (Handler & Habenicht, 1994). The K-F-D remains primarily a clinical instrument with inadequate norms and questionable validity. It is also difficult to obtain a comprehensive picture of K-F-D research since it is scattered throughout a number of diverse areas and because much of the contemporary research exists as unpublished doctoral dissertations, which are difficult to obtain. Regardless, current research indicates that some of Handler and Habenicht’s suggestions have been incorporated into ongoing K-F-D research.

For example, Abate (1994) established K-F-D developmental trends for children ages 5 to 13 years, creating a normative sample for comparison. She noted significant gender and age differences in nine aspects of the K-F-D. Girls demonstrated quantitative superiority (e.g., frequency of details), while boys demonstrated qualitative excellence (e.g., use of shading and profiles). Abate also noted the developmental trend of younger children drawing their family members as closer and involved with one another, while with increasing age, children were more likely to draw members as separate and engaged in different activities. This finding lends support to previous suggestions that the K-F-D should be examined from a developmental viewpoint so that developmental differences are not misinterpreted as psychopathology. Abate also found a high frequency of some pathology indices in her normative sample, suggesting that these may be developmental rather than emotional indicators.

DeOrnellas (1997) explored K-F-Ds of third-grade African American, Hispanic, and American Caucasian children and found there were no significant differences. Wegmann and Lusebrink (2000) devised a scoring method for cross-cultural studies using Burns’s variables and variables proposed by other K-F-D researchers and determined the reliability of samples across three continents, for children ages 7 to 10 years. These researchers demonstrated an overall strong reliability for their scoring method, but the reliability of many variables differed significantly from one population to another, thus highlighting the importance of testing the reliability of these variables with each culture in which the K-F-D is used.

Gregory (1992) found that the K-F-D is a valid measure for Native American children; there were few significant differences between the drawings of Native American and Caucasian children, ages 6 to 14 years. McClements-Hammond (1993) specifically explored the special vulnerabilities of Vietnamese children uprooted from their native land and placed in American foster homes compared to Vietnamese minors traveling to America with their families and found that the K-F-Ds of the unaccompanied minors included a higher number of variables relating to family stress and conflict.

Additional Assessment Strategies

Prout and Phillips (1974) and Knoff and Prout (1985) devised the Kinetic School Drawing Test, in which the patient is asked to draw people doing something at school.

APPENDIX: KEY WORKS AND CITATIONS FOR FURTHER READING

DAP

DiLeo (1973). Hammer (1958, 1997). Handler (1996). Handler & Riethmiller (1998). Kissen (1988). Koppitz (1968, 1984). Leibowitz (1999). Machover (1949). Tharinger & Stark (1990).

H-T-P

Buck (1978). Buck & Hammer (1969). Hammer (1958, 1997). Handler (1996). Jolles (1971).

K-F-D

Burns (1982). Burns & Kaufman (1970, 1972). Handler (1996). Handler & Habenicht (1994). Linesch (1993, 1999). Thompson & Nurse (1999).

REFERENCES

Abate, M. (1994). Developmental trends in elementary school age children’s Kinetic Family Drawings. Dissertation Abstracts International, 55 (3-B), 1175.
Acosta, M. (1989). The Kinetic Family Drawing: A developmental and validity study. Unpublished doctoral dissertation, University of Washington, Seattle.
Alcade, N., Lapitz, L., Lopez, J., Marasa, F., Poch, J., Riera, A., et al. (1982). Normative study of the House-Tree-Person and Draw-an-Animal test. British Journal of Projective Psychology and Personality Study, 27, 1–4.
Annunziata, J. (1983). An empirical investigation of the Kinetic Family Drawings of children of divorce and children from intact families. Unpublished doctoral dissertation, Rutgers University, New Brunswick, NJ.
Appel, K. (1931). Drawings by children as aids to personality studies. American Journal of Orthopsychiatry, 1, 129–144.
Archer, R., Maruish, M., Imhof, E., & Piotrowski, C. (1991). Psychological test usage with adolescent clients: 1990 survey findings. Professional Psychology: Research and Practice, 22, 247–252.
Bardos, A. (1993). Human figure drawings: Abusing the abused. School Psychology Quarterly, 8, 177–181.
Bauknight, C. (1977). Parent-child interaction on the Family Drawing Test as an indication of withdrawn behavior in children. Unpublished doctoral dissertation, University of West Virginia, Morgantown.
Black, F. (1972). Factors related to human figure drawing size in children. Perceptual and Motor Skills, 35, 902.
Black, F. (1976). The size of human figure drawings of learning disabled children. Journal of Clinical Psychology, 32, 736–741.
Bornstein, R., Bowers, S., & Bonner, S. (1996). The effects of induced mood states on objective and projective dependency scores. Journal of Personality Assessment, 67, 324–340.
Bornstein, R., Bowers, S., & Robinson, K. (1995). Differential relationships of objective and projective dependency scores to self-reports of interpersonal life events in college students. Journal of Personality Assessment, 65, 255–269.
Brewer, F. (1980). Children’s interaction patterns in Kinetic Family Drawings. Unpublished doctoral dissertation, United States International University, San Diego, CA.
Buck, J. (1948). The H-T-P. Journal of Clinical Psychology, 4, 151–159.
Buck, J. (1966). The House-Tree-Person Technique, revised manual. Los Angeles: Western Psychological Services.

Buck, J. (1978). The H-T-P technique: A qualitative and quantitative scoring manual. Journal of Clinical Psychology, 4, 397–405.
Buck, J., & Hammer, E. (1969). (Eds.). Advances in the House-Tree-Person technique: Variations and applications. Los Angeles: Western Psychological Services.
Burley, T., & Handler, L. (1997). Personality factors in the accurate interpretation of projective tests. In E. Hammer (Ed.), Advances in projective drawing interpretation (pp. 359–377). Springfield, IL: Charles C. Thomas.
Burns, R. (1982). Self-growth in families: Kinetic Family Drawings (K-F-D) research and application. New York: Brunner/Mazel.
Burns, R. (1987). Kinetic-House-Tree-Person Drawings (K-H-T-P). New York: Brunner/Mazel.
Burns, R., & Kaufman, S. (1970). Kinetic Family Drawings (K-F-D): An introduction to understanding children through kinetic drawings. New York: Brunner/Mazel.
Burns, R., & Kaufman, S. (1972). Actions, styles, and symbols in Kinetic Family Drawings (K-F-D). New York: Brunner/Mazel.
Cabacungan, L. (1985). The child’s representation of his family in Kinetic Family Drawings (KFD): A cross cultural comparison. Psychologia, 28, 228–236.
Chartouni, T. (1992). Self-concept and family relations of American-Lebanese children. Unpublished doctoral dissertation, Andrews University, Berrien Springs, MI.
Cho, M. (1987). The validity of the Kinetic Family Drawing as a self-concept and parent/child relationship among Chinese children in Taiwan. Unpublished doctoral dissertation, Andrews University, Berrien Springs, MI.
Cobia, D., & Brazelton, E. (1994). The application of family drawing tests with remarriage families: Understanding the familial role. Elementary School Guidance and Counseling, 29, 129–136.
Cumming, E., & Henry, W. (1961). Growing old. New York: Basic Books.
DeOrnellas, K. (1997). A comparison of the Kinetic Family Drawings of African American, Hispanic, and Caucasian third graders. Unpublished doctoral dissertation, Texas Woman’s University, Dallas.
Dennis, W. (1966). Group values through children’s drawings. New York: Wiley.
Deren, S. (1975). An empirical evaluation of the validity of the Draw-A-Family-Test. Journal of Clinical Psychology, 31, 542–546.
DiLeo, J. (1973). Young children’s drawings as diagnostic aids. New York: Brunner/Mazel.
Eno, L., Elliott, C., & Woehlke, P. (1981). Koppitz emotional indicators in the human figure drawings of children with learning problems. Journal of Special Education, 15, 459–470.
Epstein, L. (1957). The relationship of certain aspects of the body image to the perception of the upright. Unpublished doctoral dissertation, New York University.
Exner, J. (1993). The Rorschach: A comprehensive system: Volume 1. Basic foundations (3rd ed.). New York: Wiley.
Fliegel, Z. (1955). Stability and change in perceptual performance of a late adolescent group in relation to personality variables. Unpublished doctoral dissertation, New School for Social Research, New York.
Fukada, N. (1990, July). Family drawing: A new device for cross-cultural comparison. Paper presented at the Twenty-Second International Congress of Applied Psychology, Kyoto, Japan.
Fuller, G., Preuss, M., & Hawkins, W. (1970). The validity of the human figure drawings with disturbed and normal children. Journal of School Psychology, 8, 4–56.
Ganellen, R. (1996). Integrating the Rorschach and the MMPI-2 in personality assessment. Mahwah, NJ: Erlbaum.
Gardano, A. (1988). A revised scoring method for Kinetic Family Drawings and its application to the evaluation of family structure with an emphasis on children from alcoholic families. Unpublished doctoral dissertation, George Washington University, Washington, DC.
Gardiner, H. (1974). Human figure drawings as indications of value development among Thai children. Journal of Cross Cultural Psychology, 5, 124–130.
Gilbert, J., & Hall, M. (1962). Changes with age in human figure drawings. Journal of Gerontology, 17, 397–404.
Gillespie, J. (1994). The projective use of mother-and-child drawings: A manual for clinicians. New York: Brunner/Mazel.
Gonzales, E. (1982). A cross-cultural comparison of the developmental items of five ethnic groups in the Southwest. Journal of Personality Assessment, 46, 26–31.
Goodenough, F. (1926). Measures of intelligence by drawings. New York: World Book.
Granela-Suarez, M., Alverez-Reyes, A., & Lopez-Enrich, M. (1985). Psychopathology and self. Boletin de Psicologia-Cuba, 8, 78–86.
Gregory, S. (1992). A validation and comparative study of Kinetic Family Drawings of Native-American children. Unpublished doctoral dissertation, Andrews University, Berrien Springs, MI.
Groves, J., & Fried, P. (1991). Developmental items on children’s human figure drawings: A replication and extension of Koppitz to younger children. Journal of Clinical Psychology, 47, 140–147.
Gruen, A. (1955). The relation of dancing experience and personality to perception. Psychological Monographs, 69 (Whole No. 399).
Grzywa, A., Kucharaska-Pietura, K., & Jasiak, E. (1998). The influence of treatment on changes in the family pictures in graphic works of paranoid schizophrenics. Psychology: A Journal of Human Behavior, 35, 31–39.
Guinan, J., & Hurley, J. (1965). An investigation of the reliability of human figure drawings. Journal of Projective Techniques, 29, 300–304.
Gustafson, J., & Waehler, C. (1992). Assessing concrete and abstract thinking with the Draw-A-Person Technique. Journal of Personality Assessment, 59, 439–447.
Hackbarth, S. (1988). A comparison of Kinetic Family Drawing variables of sexually abused children, unidentified children, and their mothers. Unpublished doctoral dissertation, East Texas State University, Commerce.
Hammer, E. (1954). A comparison of H-T-Ps of rapists and pedophiles. Journal of Projective Techniques, 18, 346–354.
Hammer, E. (Ed.). (1958). The clinical application of projective drawings. Springfield, IL: Charles C. Thomas.
Hammer, E. (1997). Advances in projective drawing interpretation. Springfield, IL: Charles C. Thomas.
Handler, L. (1967). Anxiety indexes in projective drawing: A scoring manual. Journal of Projective Techniques and Personality Assessment, 31, 46–57.
Handler, L. (1996). The clinical use of drawings. In C. Newmark (Ed.), Major psychological assessment instruments (2nd ed., pp. 206–293). Boston: Allyn & Bacon.
Handler, L., & Clemence, A. (2003). Education and training in psychological assessment. In J. Graham & J. Naglieri (Eds.), Handbook of assessment psychology: Volume 10. Assessment psychology (pp. 181–212). New York: Wiley.
Handler, L., & Habenicht, D. (1994). The Kinetic Family Drawing Technique: A review of the literature. Journal of Personality Assessment, 62, 440–464.
Handler, L., & Hilsenroth, M. (1994, April). The use of a fantasy animal drawing and story-telling technique in assessment and psychotherapy. Paper presented at the annual meeting of the Society for Personality Assessment, Chicago.
Handler, L., & McIntosh, J. (1971). Predicting aggression and withdrawal in children with the Draw-A-Person and the Bender-Gestalt. Journal of Personality Assessment, 35, 331–335.
Handler, L., & Reyher, J. (1964). The effects of stress on the Draw-A-Person Test. Journal of Consulting Psychology, 28, 259–264.
Handler, L., & Reyher, J. (1965). Figure drawing anxiety indices: A review of the literature. Journal of Projective Techniques and Personality Assessment, 29, 305–313.
Handler, L., & Reyher, J. (1966). Relationship between GSR and anxiety indexes in projective drawings. Journal of Consulting Psychology, 30, 60–67.
Handler, L., & Riethmiller, R. (1998). Teaching and learning the administration and interpretation of graphic techniques. In L. Handler & M. Hilsenroth (Eds.), Teaching and learning personality assessment (pp. 267–294). Mahwah, NJ: Erlbaum.
Harris, D. (1963). Children’s drawings as a measure of intellectual maturity. New York: Harcourt, Brace, & World.
Harrower, M. (1965). Psychodiagnostic testing: An empirical approach. Springfield, IL: Charles C. Thomas.
Hartman, W., & Fithian, M. (1972). Treatment of sexual dysfunction. Long Beach, CA: Center for Marital and Sexual Studies.
Hayslip, B., & Lowman, R. (1986). The clinical use of projective techniques with the aged: A critical review and synthesis. Clinical Gerontologist, 5, 63–94.
Hibbard, R., & Hartman, G. (1990). Emotional indicators in human figure drawings of sexually victimized and nonabused children. Journal of Clinical Psychology, 46, 211–219.
Hulse, W. (1951). The emotionally disturbed child draws his family. Quarterly Journal of Child Techniques, 3, 152–174.
Hulse, W. (1952). Childhood conflict expressed through family drawings. Journal of Projective Techniques, 16, 66–79.
Jacobson, D. (1973). A study of Kinetic Family Drawings of public school children ages six through nine. Dissertation Abstracts International, 34 (6-B), 2935.
Jacobson, H., & Handler, L. (1967). Extroversion-introversion and the effects of stress on the Draw-A-Person Test. Journal of Consulting Psychology, 31, 433.
Johnson, G. (1989). Emotional indicators in the human figure drawings of hearing-impaired children: A small validation study. American Annals of the Deaf, 134, 205–208.
Joiner, T., Schmidt, K., & Barnett, J. (1996). Size, detail, and line heaviness in children’s drawings as correlates of emotional distress: (More) negative evidence. Journal of Personality Assessment, 67, 127–141.
Jolles, I. (1971). A catalog for the qualitative interpretation of the H-T-P. Beverly Hills, CA: Western Psychological Services.
Jones, L., & Thomas, C. (1964). Studies on figure drawings: Structural and graphic characteristics. Psychiatric Quarterly Supplement, 38, 76–110.
Kahill, S. (1984). Human figure drawings in adults: An update of the empirical evidence, 1967–1982. Canadian Psychology, 25, 269–292.
Kahn, M., & Jones, N. (1965). Human figure drawings as predictors of admission to a psychiatric hospital. Journal of Projective Techniques and Personality Assessment, 29, 319–322.
Kennedy, M., Faust, D., Willis, W., & Piotrowski, C. (1994). Social-emotional assessment practices in school psychology. Journal of Psychoeducational Assessment, 12, 228–240.
Kissen, M. (1986). Object relations aspects of figure drawings. In M. Kissen (Ed.), Assessing object relations phenomena (pp. 175–191). Madison, CT: International Universities Press.
Klepsch, M., & Logie, L. (1982). Children draw and tell. New York: Brunner/Mazel.
Knoff, H., & Prout, H. (1985). The Kinetic Drawing System: A review and integration of the Kinetic Family and School Drawing techniques. Psychology in the Schools, 22, 50–59.
Koch, C. (1952). The tree test. New York: Grune & Stratton.
Koppitz, E. (1966a). Emotional indicators on human figure drawings of children: A validation study. Journal of Clinical Psychology, 22, 313–315.


Koppitz, E. (1966b). Emotional indicators on human figure drawings of shy and aggressive children. Journal of Clinical Psychology, 22, 466–469.
Koppitz, E. (1967). Expected and exceptional items on human figure drawing and IQ scores of children age 5 to 12. Journal of Clinical Psychology, 23, 81–83.
Koppitz, E. (1968). Psychological evaluation of children’s human figure drawings. New York: Grune & Stratton.
Koppitz, E. (1969). Emotional indicators on human figure drawings of boys and girls from lower- and middle-class backgrounds. Journal of Clinical Psychology, 25, 432–434.
Koppitz, E. (1984). Psychological evaluation of human figure drawings of middle school pupils. New York: Grune & Stratton.
Koppitz, E., & Casullo, M. (1983). Exploring cultural influences on the human figure drawings of young adolescents. Perceptual and Motor Skills, 57, 479–483.
Koppitz, E., & de Moreau, M. (1968). A comparison of emotional indicators on human figure drawings of children from Mexico and from the United States. Revista Interamericana de Psicologia, 2, 41–48.
Kot, J., Handler, L., Toman, K., & Hilsenroth, M. (1994, April). A psychological assessment of homeless men. Paper presented at the annual meeting of the Society for Personality Assessment, Chicago.
Krauthamer, K. (1997). The effects of war on children. Dissertation Abstracts International, 58 (6-B), 3319.
Lakin, M. (1960). Formal characteristics of human drawings by institutionalized and non institutionalized aged. Journal of Gerontology, 15, 76–78.
Layton, M. (1983). Special features in the Kinetic Family Drawings of children. Unpublished doctoral dissertation, Temple University, Philadelphia.
Ledesma, L. (1979). The Kinetic Family Drawings (KFD) of Filipino adolescents. Unpublished doctoral dissertation, Boston College.
Leibowitz, M. (1999). Interpreting projective drawings. Philadelphia: Brunner/Mazel.
Lewinsohn, P. (1965). Psychological correlates of overall quality of figure drawings. Journal of Consulting Psychology, 29, 504–512.
Linesch, D. (1993). Art therapy with families in crisis: Overcoming resistance through nonverbal experience. Philadelphia: Brunner/Mazel.
Linesch, D. (1999). Art making in family therapy. In D. Weiner (Ed.), Beyond talk therapy: Using more expressive techniques in clinical practice (pp. 225–243). Washington, DC: American Psychological Association.
Lingren, R. (1971). An attempted replication of emotional indicators in human figure drawings by shy and aggressive children. Psychological Reports, 29, 35–38.
Linton, H. (1952). Relations between mode of perception and tendency to conform. Unpublished doctoral dissertation, Yale University, New Haven, CT.
Machover, K. (1949). Personality projection in the drawing of the human figure. Springfield, IL: Charles C. Thomas.
Machover, K. (1960). Sex differences in the developmental pattern of children as seen in human figure drawings. In A. Rabin & M. Haworth (Eds.), Projective techniques with children (pp. 238–257). New York: Grune & Stratton.
Maloney, M., & Glasser, A. (1982). An evaluation of the clinical utility of the Draw-A-Person Test. Journal of Clinical Psychology, 38, 183–190.
Malpique, C., Barrias, P., Morais, L., Salgado, M., Pinta da Costa, I., & Rodriques, M. (1998). Violence and alcoholism in the family: How are the children affected? Alcohol and Alcoholism, 33, 42–46.
Martin, R. (1984). Martin responds to reactions to his projections column. School Psychologist, 38, 9.
McClelland, D., Koestner, R., & Weinberger, J. (1989). How do self-attributed and implicit motives differ? Psychological Review, 96, 690–702.
McClements-Hammond, R. (1993). Effects of separation of Vietnamese unaccompanied minors: Assessment through the use of the Kinetic Family Drawing Test, Hopkins Symptom Checklist-25 and the Vietnamese Depression Scale. Unpublished doctoral dissertation, Rutgers University, New Brunswick, NJ.
McHugh, A. (1963). The H-T-P proportion and perspective in Negro, Puerto Rican and white children. Journal of Clinical Psychology, 19, 312–313.
Mebane, D., & Johnson, D. (1970). A comparison of the performance of Mexican boys and girls on Witkin’s cognitive tasks. Revista Interamericana de Psicologia, 4, 227–239.
Meili-Dworetzki, G. (1982). Spielarten des Menschenbildes: Ein Japanischer und schweizerischer Kinder. Bern: Hans Huber.
Meyer, G. (1996). The Rorschach and MMPI: Toward a more scientifically differentiated understanding of cross-method assessment. Journal of Personality Assessment, 67, 558–578.
Meyer, G. (1997). On the integration of personality assessment methods: The Rorschach and MMPI. Journal of Personality Assessment, 68, 297–330.
Miller, L. (1995). Kinetic Family and human figure drawings of child and adolescent sexual offenders. Unpublished doctoral dissertation, Andrews University, Berrien Springs, MI.
Miller, M. (1997). Crisis assessment: The projective tree drawing before, during, and after a storm. In E. Hammer (Ed.), Advances in projective drawing interpretation (pp. 153–193). Springfield, IL: Charles C. Thomas.
Mogar, R. (1962). Anxiety indices in human drawings: A replication and extended report. Journal of Consulting Psychology, 26, 108.
Money, J., & Nucombe, B. (1974). Ability tests and cultural heritage: The Draw-A-Person and Bender Tests in Aboriginal Australia. Journal of Learning Disabilities, 7, 297–303.
Motta, R., Little, S., & Tobin, M. (1993). The use and abuse of human figure drawings. School Psychology Quarterly, 8, 177–181.
Naglieri, J., McNeish, T., & Bardos, A. (1991). Draw A Person Screening Procedure for Emotional Disturbance (DAP:SPED). Austin, TX: PRO-ED, Inc.
Nuttall, E., Chieh, L., & Nuttall, R. (1988). Views of the family by Chinese and U.S. children: A comparative study of Kinetic Family Drawings. Journal of School Psychology, 26, 191–194.
Oberleder, M. (1967). Adapting current psychological technique for use in testing the aged. The Gerontologist, 7, 188–191.
Patterson, J., & Janzen, H. (1984). Another reply to Martin: Projective procedures: An ethical dilemma. School Psychologist, 38, 8–9.
Pianta, R., Longmaid, K., & Ferguson, J. (1999). Attachment-based classifications of children’s family drawings: Psychometric properties and relations with children’s adjustment in kindergarten. Journal of Clinical Psychology, 28, 244–255.
Piotrowski, C., & Keller, J. (1989). Psychological testing in outpatient mental health facilities: A national study. Professional Psychology: Research and Practice, 20, 423–425.
Piotrowski, C., & Keller, J. (1992). Psychological testing in applied settings: A literature review from 1982–1992. Journal of Training and Practice in Professional Psychology, 6, 74–82.
Plutchik, R., Conte, H., Weiner, M., & Teresi, J. (1978). Studies of body image: IV. Figure drawings in normal and abnormal geriatric and nongeriatric groups. Journal of Gerontology, 33, 68–75.
Potash, H. (1998). Assessing the social subject. In L. Handler & M. Hilsenroth (Eds.), Teaching and learning personality assessment (pp. 137–148). Mahwah, NJ: Erlbaum.
Prout, H., & Phillips, D. (1974). A clinical note: The Kinetic School Drawing. Psychology in the Schools, 11, 303–306.
Riethmiller, R., & Handler, L. (1997). Problematic methods and unwarranted conclusions in DAP research: Suggestions for improved research procedures. Journal of Personality Assessment, 69, 459–475.
Robins, C., Blatt, S., & Ford, R. (1991). Changes in human figure drawings during intensive treatment. Journal of Personality Assessment, 57, 477–497.
Rodgers, P. (1992). A correlational-developmental study of sexual symbols, actions, and themes in children’s Kinetic Family and Human Figure Drawings. Unpublished doctoral dissertation, Andrews University, Berrien Springs, MI.
Rosenfeld, I. (1958). Mathematical ability as a function of perceptual field-dependency and certain personality variables. Unpublished doctoral dissertation, University of Oklahoma.
Rushton, P., Brainerd, C., & Priestly, M. (1983). Behavioral development and construct validity: The principle of aggregation. Psychological Bulletin, 94, 18–38.
Saarni, C., & Azara, V. (1977). Developmental analysis of human figure drawings in adolescence, young adulthood, and middle age. Journal of Personality Assessment, 41, 31–38.
Sallery, R. (1968). Artistic expression and self-description with Arab and Canadian students. Journal of Social Psychology, 76, 273–274.
Sarrel, P., & Sarrel, L. (1979). Sexual unfolding. Boston: Little, Brown.
Sarrel, P., Sarrel, L., & Berman, S. (1981). Using the Draw-A-Person (DAP) Test in sex therapy. Journal of Sexual and Marital Therapy, 7, 163–183.
Schacker, E. (1983). The Kinetic Family Drawings as an indicator of marital instability and family stress. Dissertation Abstracts International, 43 (7-A), 2323.
Schildkrout, M., Shenker, I., & Sonnenblick, M. (1972). Human figure drawings in adolescence. New York: Brunner/Mazel.
Scribner, C., & Handler, L. (1987). The interpreter’s personality in Draw-A-Person interpretation: A study of interpersonal style. Journal of Personality Assessment, 51, 112–122.
Shaw, J. (1989). A developmental study on the Kinetic Family Drawing for a nonclinic, Black child population in the Midwestern region of the United States. Unpublished doctoral dissertation, Andrews University, Berrien Springs, MI.
Shedler, J., Mayman, M., & Manis, K. (1993). The illusion of mental health. American Psychologist, 48, 1117–1131.
Smart, R., & Smart, M. (1975). Group values shown in preadolescents’ drawings in five English speaking countries. Journal of Social Psychology, 97, 23–37.
Soutter, A. (1994). A comparison of children’s drawings from Ireland and Oman. Irish Journal of Psychology, 15, 587–594.
Stein, M. (1997). The use of family drawings by children in pediatrics practice. Journal of Developmental and Behavioral Pediatrics, 18, 334–339.
Taylor, S., Kymissis, P., & Pressman, N. (1998). Prospective Kinetic Family Drawing and adolescent mentally ill chemical abusers. Arts in Psychotherapy, 25, 115–124.
Tharinger, D., & Stark, K. (1990). A qualitative versus quantitative approach to evaluating the Draw-A-Person and Kinetic Family Drawings: A study of mood- and anxiety-disordered children. Psychological Assessment, 2, 365–375.
Thomas, C. (1966). An atlas of figure drawing variables. Baltimore: Johns Hopkins Press.
Thompson, L. (1975). Kinetic Family Drawings of adolescents. Unpublished doctoral dissertation, California School of Professional Psychology, San Francisco.
Thompson, P., & Nurse, R. (1999). The KFD: Clues to family relationships in family assessment: Effective uses of personality tests with couples and families. New York: Wiley.
Tolor, A., & Tolor, B. (1955). Judgement of children’s personality from their human figure drawings. Journal of Projective Techniques, 19, 170–176.


Urban, W. (1963). The Draw-A-Person catalog for interpretive analysis. Los Angeles: Western Psychological Services.
Van Hutton, V. (1994). House-Tree-Person and Draw-A-Person as measures of abuse in children: A quantitative scoring system. Odessa, FL: Psychological Assessment Resources.
Vass, Z. (1998). The inner formal structure of the H-T-P drawings: An exploratory study. Journal of Clinical Psychology, 54, 611–619.
Veltman, M., & Browne, K. (2000). Pictures in the classroom: Can teachers and mental health professionals identify maltreated children’s drawings? Child Abuse Review, 9, 328–336.
Verinis, J., Lichtenberg, E., & Henrich, L. (1974). The Draw-a-Person in the Rain technique. Journal of Clinical Psychology, 30, 407–414.
Waehler, C. (1997). Drawing bridges between science and practice. Journal of Personality Assessment, 69, 482–487.
Wagner, M., & Schubert, H. (1955). DAP quality scale for late adolescents and young adults. Kenmore, NY: Delaware Letter Shop.
Watkins, C., Campbell, V., Nieberding, R., & Hallmark, R. (1995). Contemporary practices of psychological assessment by clinical psychologists. Professional Psychology: Research and Practice, 26, 54–60.

Wegmann, P., & Lusebrink, V. (2000). Kinetic Family Drawing scoring method for cross-cultural studies. Arts in Psychotherapy, 27, 179–190.
Witkin, H., Dyk, R., Faterson, H., Goodenough, D., & Karp, S. (1962). Psychological differentiation. New York: Wiley.
Witkin, H., Lewis, H., Hertzman, M., Machover, K., Meissner, P., & Wapner, S. (1954). Personality through perception. New York: Harper.
Wolff, W. (1946). The personality of pre-school children. New York: Grune & Stratton.
Yama, M. (1990). The usefulness of human figure drawings as an index of overall adjustment. Journal of Personality Assessment, 54, 78–86.
Young, H. (1959). A test of Witkin’s field-dependence hypothesis. Journal of Abnormal and Social Psychology, 59, 188–192.
Zaidi, S. (1979). Values expressed in Nigerian children’s drawings. International Journal of Psychology, 14, 163–169.

CHAPTER 30 The Hand Test: Assessing Prototypical Attitudes and Action Tendencies

HARRY J. SIVEC, CHARLES A. WAEHLER, AND PAUL E. PANEK

THEORETICAL BASIS AND TEST DEVELOPMENT
  The Hand Test Scoring System
  Interscorer Agreement
  Psychometric Characteristics
RANGE OF APPLICABILITY AND LIMITATIONS
  Hand Test Interpretation for Persons With Mental Retardation
  The Use of the Hand Test With Older Adults
ACCOMMODATION FOR POPULATIONS WITH DISABILITIES
LEGAL AND ETHICAL CONSIDERATIONS
COMPUTERIZATION
USE IN CLINICAL AND ORGANIZATIONAL PRACTICE
CURRENT RESEARCH STATUS
CROSS-CULTURAL FACTORS
FUTURE DEVELOPMENTS
REFERENCES

The world can only be grasped by action, not by contemplations. The hand is more important than the eye; the hand is the cutting edge of the mind.

—Jacob Bronowski (1908–1974), British scientist

People use hands to work and play, to communicate and relate, to help and hurt, to love and hate. Hands are both tools and the means to create other tools. As such, they are rich in function and purpose. They can also be powerful symbols that evoke attitude and action tendencies relevant to how productive or destructive, interactive or isolated, integrated or fragmented a person feels when engaging the world.

Dr. Edwin E. Wagner, no doubt, considered these musings as he crafted the Hand Test more than 40 years ago (Bricklin, Piotrowski, & Wagner, 1962). He developed a set of ten cards approximately 3 by 5 inches in size, the first nine of which portray a rough outline drawing of a hand in an ambiguous position. The cards are presented one at a time with the question, “What might this hand be doing?” The 10th card is blank and is given to the subject with the instructions: “This card is blank. I would like you to imagine a hand, and tell me what it might be doing.” Subjects are not limited in the number of responses they give to any individual card or the entire set (although they are encouraged with the instruction “anything else?” if they give only one response to the first card). The Hand Test administration is typically brief (about 10 minutes) and meant to supplement other material in a test battery. Integrating Hand Test responses with interview information and other test data can best be undertaken by considering its theoretical base.

THEORETICAL BASIS AND TEST DEVELOPMENT

The Hand Test was developed not only to meet the general criteria for a projective test but also to be easily classified for empirical examination. The scoring categories are meant to represent prototypical action tendencies that would lend themselves to validation against behavioral criteria. Specifically, the test presents relatively unstructured and ambiguous stimuli that invite a number of qualitatively different responses. The stimuli are somewhat disguised, and because they lack full face validity, are less amenable to conscious control. In addition, a premium is placed on generating subjective, idiosyncratic responses from the individual rather than forcing a choice (e.g., true or false) set upon the subject. At the same time these responses can be categorized and organized in order to provide meaningful information about the respondent.

Acknowledgments: The authors would like to thank Dr. Edwin E. Wagner for his input and review of this chapter. In addition, Harry would like to thank Cathy, Jacob, and Caitlin; Charlie would like to thank Tracy, Kailee, and Casey; and Paul would like to thank Christine for their patience and forbearance through this extended research and writing endeavor.

The Hand Test emerged at a time when projective tests were being criticized as subjective and their partisans as blithely unconcerned with empirical verification. Academics and clinicians were calling for measures that could anticipate real-life behavior. A cursory review of the current assessment literature reveals a continuance of this time-honored, and, at times, acrimonious polemic (see Meyer, 1997; Wood, Nezworski, & Stejskal, 1996). A pupil of renowned Rorschach expert Zygmunt Piotrowski, Wagner was well aware of both the great benefits and challenges associated with the process of projective testing. As such, Wagner sought to develop a test that could be used to directly predict behavior. He recalls being influenced by past research that showed that individuals could judge emotions from both hand positions and facial expressions (E.E. Wagner, personal communication, September 2001).

Inspired by the prospect of developing a projective technique unabashedly designed to predict behavior, Wagner drew pictures of hands in nine different poses. The drawings were modeled after his own hands. (It is interesting to note that the crooked finger on Card II depicts Wagner’s own hand injured playing baseball.) Imprecisely sketched hands were chosen over more clearly depicted hands under the assumption that greater ambiguity would yield more varied and trenchant responses. Piotrowski suggested that Card IX be inverted because he thought it would be more provocative positioned that way, especially with regard to sexual issues. All hands were drawn to pull for certain themes. Card III was intentionally selected to appear early in the series of hands to allow for a “reality check.” That is, Card II was thought to elicit conflicted feelings, while Card III, because it is simple and straightforward (pointing), provides an opportunity for the person to “regroup” psychologically. Card X was left blank (à la TAT Card 16) to elicit attitudes and action tendencies oriented toward the future.

The Hand Test is rooted in the assumption that important action tendencies would be projected into pictures of hands since hands are vital to interacting with the external world. It was further assumed, on a rational basis, that humans interact in a world with other living beings as well as inanimate objects (Wagner, 1962a). These initial observations led to creating two major categories of responses (Interpersonal and Environmental), which account for the majority of responses in most protocols. Two additional categories were added to encompass responses representing unsuccessful interactions with others or the environment. Responses that present internal conflicts or external, antagonistic pressures are coded within the Maladjustive category. Similarly, severe reactions to personal and life circumstances representing a collapse of rational responding, which may represent a concomitant withdrawal from reality contact, were assigned to the fourth major category, Withdrawal.

Wagner indicated that he was strongly influenced by the writings of Bhagavan Das (1953) in developing the interpersonal scoring category (E.E. Wagner, personal communication, May 2001). Das proposed a theory of emotion that postulated that individuals generally strive to associate with or dissociate from others based on whether they feel superior, inferior, or equal. Prototypal interpersonal attitudes and action tendencies were thus viewed in terms of basic patterns of engagement with others. For example, does an individual respond to the pictured hands in a manner that suggests a desire for affiliative interpersonal contact on an equal basis (e.g., Affection) or a need to dominate or control inferiors (e.g., Direction)?

Wagner developed the four major scoring categories with the initial breakdown of interpersonal responses influenced by Das. Reacting to Wagner’s initial proposal, however, Bricklin suggested that using common terminology for the interpersonal subcategories would make them more user friendly to American psychologists. Bricklin also suggested employing a formula to predict acting-out tendencies, which was termed the Acting Out Ratio (E.E. Wagner, personal communication, September 2001). Wagner conducted pilot research of the Hand Test at the Temple University Testing Bureau. The referral base primarily involved college students but also included individuals from the surrounding community. Soon after, prison and criminal populations were evaluated in order to establish the Hand Test’s ability to predict acting-out behavior. This early research led to the publication of the original Hand Test monograph (Bricklin et al., 1962). The outgrowth of these theoretical foundations and early research efforts has been the development of a relatively simple and direct scoring system for the Hand Test.

The Hand Test Scoring System

The Hand Test scoring system was developed so that the clinician can assign one of the 15 basic quantitative scores as he or she is recording the responses verbatim. After test administration, these individual codes are organized in terms of four major categories and integrated using three summary scores. (The following material is abbreviated from the more extensive explanation and examples available in Wagner [1983, pp. 6–16].)

Interpersonal (INT) is a major category that includes coding for responses that involve relations with other people. “INT responses represent sensitivity to, interest in, and the ability to interact with other people” (Wagner, 1983, p. 7). INT has the most variation among the major categories, with six subcategories: Affection (AFF) is coded for an interpersonal interchange or bestowal of pleasure, affection, or friendly feeling (e.g., “Patting someone on the back,” “Giving a comforting touch”); Dependence (DEP) responses involve expressed dependence on or need for help or aid from another person (e.g., “Begging, panhandling,” “Requesting help”); Communication (COM) is scored for a presentation or exchange of information (e.g., “Stressing a point in conversation,” “Someone saying ‘I don’t understand’”); Exhibition (EXH) responses involve displaying or exhibiting oneself in order to obtain approval from others (e.g., “Showing off his big ring,” “Child showing off her clean hands”); Direction (DIR) responses involve dominating, directing, or influencing the activities of others (e.g., “Police officer saying ‘Stop!,’” “Leading an orchestra”); Aggression (AGG) is scored when a response involves the giving of pain, hostility, or aggression (e.g., “Pushing someone off a cliff,” “Someone punching a guy”).

The Environmental (ENV) major category includes responses that are relatively noninterpersonal in nature, but that reflect a response to or coming to grips with the environment. There are three subcategories in ENV, including: Acquisition (ACQ), coded when the response involves an attempt to acquire a goal or object and the movement is ongoing or the goal is yet unobtained (e.g., “Reaching for something on a high shelf,” “Trying to catch a ball”); Active (ACT), in which an action constructively manipulates, attains, or alters an object or goal (e.g., “Picking up a pen,” “Throwing a ball”); Passive (PAS), which involves an attitude of rest and/or relaxation in relation to the force of gravity and a deliberate and appropriate withdrawal of energy from the hand (e.g., “Just resting,” “A sleeping hand”).

Within the protocols of “normal” individuals, roughly 90% of responses should fall in the first two major categories of INT or ENV (Wagner, 1983). The final two major categories are intended to include scores that represent difficulty in carrying out various action tendencies successfully due to subjectively experienced, inner weakness, and/or external prohibition (Maladjustive [MAL]) or the abandonment of meaningful or effective life roles (Withdrawal [WITH]). MAL has three subcategories: Tension (TEN) is coded when feelings of anxiety, strain, or tension are present in the response (e.g., “A fist clenched in anger,” “Hanging onto the edge of a cliff”); Crippled (CRIP) responses involve a hand that is dead, disfigured, sick, injured, or incapacitated (e.g., “A dead person’s hand,” “A broken hand”); Fear (FEAR) involves a response in which the hand is threatened with pain, injury, incapacitation, or death (e.g., “Trembling because it is afraid,” “Being sucked into quicksand”). WITH also has three subcategories: Description (DES), coded when examinees can do no more than acknowledge the presence of the hand (e.g., “Just a hand,” “A left hand, just there”); Bizarre (BIZ) responses, which partially or completely ignore the drawn contour of the hand and/or incorporate bizarre, idiosyncratic, or morbid content (e.g., “It is a spider,” “Could be the highway of love”); Failure (FAIL), coded when no scorable response is given to a particular card (e.g., “I can’t say anything it might be doing”).

Three summary scores are computed from these individual codes in order to help with protocol interpretation. Experience Ratio (ER) is a comparison of the frequency counts from the four major scoring categories (i.e., Sum INT : Sum ENV : Sum MAL : Sum WITH). ER “provides a useful overall estimate of basic personality structure and the disposition or psychological energy available for behavior” (Wagner, 1983, p. 11). Acting-Out Ratio (AOR) compares the sum of the positive INT scores (AFF + DEP + COM) to the more negative INT scores (DIR + AGG). AOR is often converted for research purposes to an Acting-Out Score (AOS) derived by subtracting the left side of the ratio (AFF + DEP + COM) from the right (DIR + AGG). Pathology (PATH), which is computed by combining the sum of all MAL scores with two times the sum of WITH scores, is meant to provide an overall estimate of the total amount of pathology in a response set (Wagner, 1983).
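The arithmetic behind these summary scores can be sketched in a few lines of Python. The sketch is illustrative only and is not taken from the Hand Test manual: the names MAJOR_CATEGORY and summary_scores and the example protocol are hypothetical, and the category membership and formulas are simply those stated in the text above (ER as the four major-category sums, AOS as DIR + AGG minus AFF + DEP + COM, and PATH as MAL plus twice WITH).

```python
from collections import Counter

# Subcategory -> major category, following the descriptions in the text.
# The dictionary name and layout are illustrative assumptions.
MAJOR_CATEGORY = {
    "AFF": "INT", "DEP": "INT", "COM": "INT",
    "EXH": "INT", "DIR": "INT", "AGG": "INT",
    "ACQ": "ENV", "ACT": "ENV", "PAS": "ENV",
    "TEN": "MAL", "CRIP": "MAL", "FEAR": "MAL",
    "DES": "WITH", "BIZ": "WITH", "FAIL": "WITH",
}


def summary_scores(codes):
    """Compute ER, AOR, AOS, and PATH from a list of subcategory codes."""
    counts = Counter(codes)
    unknown = set(counts) - set(MAJOR_CATEGORY)
    if unknown:
        raise ValueError(f"Unrecognized codes: {sorted(unknown)}")

    # Sum the subcategory counts within each major category.
    major = Counter()
    for code, n in counts.items():
        major[MAJOR_CATEGORY[code]] += n

    left = counts["AFF"] + counts["DEP"] + counts["COM"]   # positive INT scores
    right = counts["DIR"] + counts["AGG"]                  # more negative INT scores

    return {
        "ER": (major["INT"], major["ENV"], major["MAL"], major["WITH"]),
        "AOR": (left, right),
        "AOS": right - left,                       # right side minus left side
        "PATH": major["MAL"] + 2 * major["WITH"],  # MAL + 2 x WITH
    }


# Hypothetical protocol of 13 coded responses (not real data).
example_protocol = ["AFF", "COM", "COM", "DIR", "AGG", "AGG", "AGG",
                    "ACT", "ACT", "ACQ", "PAS", "TEN", "DES"]
example_scores = summary_scores(example_protocol)
```

For this made-up protocol the function returns an ER of 7:4:1:1, an AOR of 3:4, an AOS of 1, and a PATH of 3. Note that EXH counts toward Sum INT but appears on neither side of the AOR, as in the formula given above.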

There are also two quantitative scoring codes that reflect the time it takes to formulate responses to each card. These scores are predicated on the person administering the test counting the seconds that elapse from first turning up the card until the first scorable response. The Average Initial Response Time (AIRT) is calculated after the test is completed by dividing the total response time (in seconds) by 10. High Minus Low (H-L) is computed by subtracting the lowest initial response time from the highest initial response time.
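
As a rough illustration of the two timing scores, the following sketch computes AIRT and H-L from a list of initial response times. The function name and the sample times are hypothetical, and the sketch assumes one initial response time (in seconds) was recorded for each of the 10 cards.

```python
def timing_scores(initial_response_times):
    """Return (AIRT, H-L) from ten initial response times in seconds."""
    if len(initial_response_times) != 10:
        raise ValueError("expected one initial response time per card (10 cards)")
    # AIRT: total initial response time divided by 10.
    airt = sum(initial_response_times) / 10
    # H-L: highest initial response time minus the lowest.
    high_minus_low = max(initial_response_times) - min(initial_response_times)
    return airt, high_minus_low


# Hypothetical times for one examinee.
times = [3, 5, 2, 8, 4, 6, 3, 12, 5, 7]
print(timing_scores(times))  # (5.5, 10)
```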

(Wagner [1983] also includes 17 qualitative scores that can be used to add codes secondary to the quantitative scores for certain verbal processing styles [e.g., ambivalence, repetition, or automatic phrases] or content themes [e.g., sexual, sensual, hiding]. Although potentially valuable, little extant research has been performed with these qualitative codes.)

Interscorer Agreement

Researchers using the Hand Test have shown consistently high levels of interscorer agreement. Sivec and Hilsenroth (1994) reported the range of interscorer agreement from 85% to 90% across three separate studies. Moran and Carter (1991) reported intraclass correlations for raters scoring the combined categories ranging from .85 to .97. Maloney and Wagner (1979) reported that overall agreement between raters was 89% for scores of normal examinees. In a study designed to emulate clinical practice, Wendler and Zachary (1983; cited in Wagner, 1983) evaluated the interscorer agreement of raters who were not specifically trained in the use of the Hand Test. The two raters (doctoral-level psychologists) referred only to the test manual for scoring. The total agreement between two scorers was 72% on the 15 subcategories and 87% for the combined scores. These data suggest that the Hand Test scoring can be learned from the manual alone, but that specific training is likely to improve scoring reliability. Overall, the Hand Test scoring system is conducive to reliable scoring among raters.
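
The difference between agreement on the 15 subcategories and agreement on the combined categories can be made concrete with a small sketch. It assumes simple percent agreement (matching codes divided by total responses), which may not be the exact statistic used in every study cited above; the function names and the sample ratings are hypothetical, and the INT and ENV groupings are assumed from the standard scoring scheme.

```python
# Map each subcategory to its major (combined) category.
# INT and ENV groupings are assumed from the standard scoring scheme.
MAJOR = {}
for major, subs in {
    "INT": ("AFF", "DEP", "COM", "EXH", "DIR", "AGG"),
    "ENV": ("ACQ", "ACT", "PAS"),
    "MAL": ("TEN", "CRIP", "FEAR"),
    "WITH": ("DES", "BIZ", "FAIL"),
}.items():
    for code in subs:
        MAJOR[code] = major


def percent_agreement(rater_a, rater_b):
    """Simple percent agreement between two equally long code lists."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must code the same, nonempty set of responses")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return 100 * matches / len(rater_a)


def combined_agreement(rater_a, rater_b):
    """Percent agreement after collapsing codes to major categories."""
    return percent_agreement([MAJOR[c] for c in rater_a], [MAJOR[c] for c in rater_b])


# Hypothetical codes from two raters for the same four responses.
a = ["AFF", "ACT", "TEN", "DES"]
b = ["COM", "ACT", "FEAR", "DES"]
print(percent_agreement(a, b))   # 50.0
print(combined_agreement(a, b))  # 100.0
```

Disagreements within a major category (for example, AFF versus COM) disappear once codes are collapsed, which is consistent with combined-score agreement running higher than subcategory agreement.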

Psychometric Characteristics

Psychometric characteristics are those significant considerations that separate tests with scientific merit from subjective impressions. These factors include comparisons with other people, the consistency and accuracy of information gleaned from a technique, and the usefulness of observations made.

Norms

For any test, reference to normative data is essential to make valid interpretive statements. Criteria for evaluating normative information include representativeness, size, and relevance (Sattler, 1988). The Hand Test has two primary manuals for use with adults. Wagner (1962a) provides data for “normal adults” (N = 100, mean age = 34.1, SD = 11.6) representing “all walks of life and various occupations” (p. 18). Additional normative data are provided for Air Force pilots, college students, student nurses, high school students, and children (Wagner, 1962a). The revised Hand Test manual (Wagner, 1983) provides normative data for “normals” (N = 100, mean age = 23.91, range 17 to 60). Normative information is also provided for a range of clinical groups: alcoholics, mental retardation, organic brain syndromes, schizophrenia, conduct disorders, anxiety disorders, affective disorders, somatoform disorders, and personality disorders (see Wagner, 1983).

In terms of representativeness, the adult reference groups tend to be small and the sample in the most current Hand Test manual (Wagner, 1983) is overrepresented by college students (50%) and young adults and is not very complete in reporting racial makeup as “15% black and 85% white” (p. 17). Reference to the earlier manual would provide a greater range of normative information, though the norms are somewhat dated. There are a variety of diagnostic groups for which the Hand Test norms are available. This information is helpful in understanding Hand Test patterns commonly found in these groups and for comparing scores between different groups. Fortunately, additional normative data have been provided for older groups (Panek & Wagner, 1985) that can supplement the information provided in the manual. In terms of relevance, normative information is provided for both normal and various clinical groups in the manuals and in published research (e.g., Panek, Cohen, Barrett, & Matheson, 1998), making the Hand Test appropriate for a broad range of clientele.

Given the widespread use of the Hand Test with children, Wagner, Rasch, and Marsico (1991) provided a manual supplement for using the Hand Test with them. Moran and Carter (1991) also reported Hand Test norms for schoolchildren. In terms of size, the normative information for these combined resources is 853 children covering the age range from kindergarten to high school. Although the sample size is less than ideal (e.g., Sattler [1988] recommends N ≥ 100 per age group), it begins to approach an acceptable range. Also, the sample selected for norms seems to be largely representative of and relevant for use with urban schoolchildren (see Sivec & Hilsenroth, 1994).

Reliability

Reliability refers to the degree to which a particular measure will assess a particular attribute consistently. Although test-retest correlations ranging from .70 to .90 are considered ideal, this range of scores is probably too high for a test such as the Hand Test, which purports to measure attitudes and action tendencies that are by definition related to environmental activity. At the same time, there should be enough stability in the test scores to give credence to the notion that these action tendencies are typically present and presumably important in guiding interactions with the environment. For personality measures such as the Hand Test, correlations that range from .50 to .70 are considered to have moderately high stability (Beck, 1987). This range will serve as the criterion for satisfactory reliability for the Hand Test.

Panek and Stoner (1979) tested a sample of “normals” (N = 71) twice with a 2-week interval between tests in order to establish the test-retest reliability of the Hand Test. Although correlations ranged from .30 to .89, only two scores fell below .50 (H-L = .30 and AIRT = .43). Both of these scores reflect response time to the stimuli, which can be quite variable and which may be subject to learning effects. McGiboney and Carter (1982), evaluating a sample of students (N = 40) from an alternative school setting with a 3-week test interval, reported that only the ACQ variable (.21) fell below a correlation of .50 (range = .21 to .91). Wagner, Maloney, and Wilson (1981) reported generally lower test-retest reliability coefficients (range .33 to .66) for Hand Test variables. However, their retest interval ranged from 1 to 10 years, which is not particularly well suited to the intended meanings of most Hand Test scores. Sivec and Hilsenroth (1994) concluded that certain Hand Test variables were consistently shown to be unstable (AIRT, H-L), but that certain Hand Test pathological scores (e.g., WITH, PATH, FAIL) seem to be quite stable over brief intervals (e.g., 2 to 4 weeks). Overall, the majority of the other Hand Test scores fall within an adequate range for the intended use of this test.

The next level of evaluation involves examining the validity of the interpretations offered for Hand Test responses and scoring categories. In this regard, the construct and criterion-related validity demonstrated within the many published studies focusing on all or some of the Hand Test scores can help determine the extent to which a particular interpretation for a test score is accurate (i.e., corresponds to the behavior it is meant to explain [Moss, 1992]). Given that the Hand Test originated as a measure of overt aggressive tendencies, this Hand Test review begins with an examination of predicting acting-out behavior.

Validity—Predicting Acting-Out Behavior: AOS

The Hand Test was introduced to the assessment community as a projective test with “special reference to the prediction of overt aggressive behavior” (Bricklin et al., 1962). The Acting-Out Score (AOS) has been the most thoroughly examined Hand Test variable in this area. Rooted in Das’s theory of emotion, the AOS is a mathematical equation for estimating an individual’s propensity to foster positive relationships versus acting to control or disrupt relationships. The original form of the AOS was calculated by subtracting the sum of AFF, DEP, COM, and FEAR from the sum of AGG and DIR. The scale was later modified (Wagner, 1962a) by omitting the FEAR variable and by examining these variables as a ratio (AOR) rather than a score. Although reviewing variables as a ratio allows for a more thorough comparison of behavioral tendencies, the vast majority of research in this area has converted the ratio into a single score because this form is more conducive to statistical evaluation.

By way of introducing the validity of the Hand Test, Bricklin et al. (1962) demonstrated that the mean AOS of people known to be acting out (psychiatric patients and prison inmates, n = 76) was significantly higher than the mean AOS of comparable non-acting-out individuals (n = 72). Using a cutoff score of ≥ 1, 77% of participants were correctly classified as either acting out or non-acting out. Within the same volume, Bricklin et al. presented data comparing criminals on the AOS. Recidivists (n = 37) were defined as those individuals on parole who had two or more convictions. Nonrecidivists (n = 37) were those individuals on parole who had only one conviction. Although recidivists produced significantly higher mean scores on AOS, only 58% of the sample were correctly classified using a cutoff score of ≥ 1 on AOS.

In a large sample of military prisoners (N = 614), Brodsky and Brodsky (1967) found that prisoners with “person offenses” (e.g., assault, rape, murder) produced significantly higher mean AOS scores compared to “property offenders” (e.g., larceny, bad checks). All the prisoners were classified into three categories based on their behavior: disciplinary offender, middle group, and model prisoner. The mean AOS for the disciplinary offenders (M = .25) was significantly higher than for the model prisoners (M = -.80). However, efforts to correctly classify these groups with AOS cutoff scores did not provide better than chance classification levels.

Two other studies of prisoners reported by Miller and Young (1999) yielded positive results using the Hand Test AOS. Tariq and Ashfaq (1993) evaluated male prisoners while they were in jail. They found significant mean differences in the AOS when criminals were compared to noncriminals. Similarly, Porecki and Vandergroot (1978) evaluated maximum security inmates (N = 107) and found a significant correlation (r = .51) between inmates’ AOS scores and ratings of aggressive behavior over the course of 1 year.

Three early studies examining the AOS in samples of patients diagnosed with chronic schizophrenia produced contradictory results. Although Wagner and Medvedeff (1962) were able to classify correctly 67% of patients as either acting out or non-acting out using an AOS ≥ 1, Drummond (1966) and Himelstein and Von Grunau (1981) were only able to achieve chance levels of classification (50% and 53%, respectively) using this score criterion with patients diagnosed with schizophrenia. In their review of the AOS literature and schizophrenia, Zizolfi and Cilli (1999) found vast differences in methodology that may account for the mixed findings. They raised concerns about the lack of interrater reliability data, number of staff used in ratings, staff turnover, type of data used for rating, and, most critically, the length of time between the behavioral assessment and the administration of the Hand Test. Specifying time duration is critically important given that acting out in schizophrenia is probably more of a state phenomenon than a trait feature. In their own study, Zizolfi and Cilli (1999) examined a sample of outpatient schizophrenics carefully diagnosed by DSM-III-R/IV standards. Patients (N = 74) were classified as aggressive or nonaggressive by four different criteria: (1) case notes over 5 years; (2) case notes 1 year before testing; (3) concurrent ratings by psychiatrists and nurses 2 years before testing (IRR = .85); and (4) records of caregivers 1 month after testing. Using an AOS ≥ 1, 28.6% were classified correctly when case notes were reviewed for 5 years, 30.8% when notes were reviewed for the 1 year before testing, 50% based on concurrent ratings for 2 years prior to testing, and 85.7% for aggressive behavior documented by caregivers 1 month after testing. Clearly, direct ratings obtained closer to time of test administration were more consistent with inferences associated with the AOS.

Sivec and Hilsenroth (1994), in their review of the Hand Test with children, reported that the AOS was able to differentiate groups clearly classified as aggressive versus nonaggressive (e.g., Azcarate & Gutierrez, 1969; Oswald & Loftus, 1967). However, the AOS has not effectively classified school placement levels with more heterogeneous groups (e.g., Hilsenroth & Sivec, 1990; Waehler, Rasch, Sivec, & Hilsenroth, 1992).

Sivec and Hilsenroth (1994) cited a lack of research in the area of racial differences for using the AOS. An exception was King (1973), who compared African American adolescents who were identified for aggressive behavior and African American adolescents without a history of acting out. There were no significant differences between the two groups on the AOS. McGiboney and Huey (1982) provided Hand Test normative information for African American males described as disruptive. These authors tested eighth and ninth graders (N = 51) and found that a pattern of more Interpersonal category scores compared to Environmental category scores, especially in the presence of an elevated PATH score, may identify aggressive tendencies. However, the lack of a comparison group, limited sample size, and age restriction seriously limit the generalizability of these results.

Two recent studies by Clemence and colleagues clarified the use of the AOS with child and adolescent samples. Clemence, Hilsenroth, Sivec, Rasch, and Waehler (1998) found significant differences in the mean scores on the AOS in a sample of psychiatric inpatients, outpatients, and normal adolescents. The inpatient sample produced significantly higher mean AOS scores than the outpatients, and both groups produced higher mean AOS scores than normals. An AOS ≥ 2 provided the optimal classification rates (ranging from 43% to 63%) and a positive predictive power (i.e., the probability that an individual with a given test score actually belongs to the identified group) of .91 when both clinical groups were compared to the nonclinical group. Clemence et al.’s sample was quite homogeneous in terms of a restricted age range and the administration of the Hand Test temporally closer to actual emotional or behavioral problems (48 hours after admission). In this way, the Hand Test AOS tends to be more effective when the samples are clearly defined and when testing occurs closer to the time of the predicted behavioral activity. Clemence, Hilsenroth, Sivec, and Rasch (1999) found the mean AOS score to be significantly higher in a group of children referred for aggressive behavior compared to a matched normal group (N = 74). Using an AOS ≥ 0 resulted in an overall classification rate of 66%.
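
Because overall classification rate and positive predictive power answer different questions, a short sketch may help show how each is obtained from the same cutoff. The data, function name, and cutoff below are hypothetical and are not taken from any study cited here; the sketch also assumes that a score at or above the cutoff is treated as a positive test sign.

```python
def classification_stats(scores, acted_out, cutoff):
    """Cross-tabulate a cutoff rule against known group membership."""
    tp = fp = fn = tn = 0
    for score, actual in zip(scores, acted_out):
        flagged = score >= cutoff  # assumed decision rule
        if flagged and actual:
            tp += 1
        elif flagged and not actual:
            fp += 1
        elif not flagged and actual:
            fn += 1
        else:
            tn += 1
    return {
        # Proportion of all cases placed in the correct group.
        "overall": (tp + tn) / (tp + fp + fn + tn),
        # Of those flagged by the score, the proportion who truly acted out.
        "positive_predictive_power": tp / (tp + fp) if (tp + fp) else None,
        # Of those who acted out, the proportion flagged by the score.
        "sensitivity": tp / (tp + fn) if (tp + fn) else None,
        # Of those who did not act out, the proportion not flagged.
        "specificity": tn / (tn + fp) if (tn + fp) else None,
    }


# Hypothetical AOS values and criterion-group membership for eight cases.
aos = [3, 2, 0, -1, 2, 1, -2, 4]
acted_out = [True, True, True, False, False, False, False, True]
print(classification_stats(aos, acted_out, cutoff=2))
# overall .75, positive predictive power .75, sensitivity .75, specificity .75
```

With other distributions of scores, positive predictive power can be high even when the overall classification rate is modest, because it is computed only over the cases that the cutoff flags.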

The Hand Test AOS has also been found to be useful for predicting aggressive behavior in individuals with mental retardation. Panek, Wagner, and Suen (1979) designed a study to evaluate the AOS as a predictor of violent and destructive behavior for persons with mental retardation living in a residential facility. These researchers reported that there were significant positive correlations between the participants’ AOS and ACT (MOV) responses (see Wagner, 1983) and the criteria of violent and destructive behavior. Panek (1985) investigated the effectiveness of the AOS and movement content response in identifying persons with mental retardation who were currently on structured behavior management programs for the control of dangerous behavior. Examination of the Hand Test protocols indicated that those of 18 of the 24 residents on behavior management programs contained one or more of these indicators of aggressive behavior. Panek and Wagner (1989) reviewed the files of persons with mental retardation over a continuous 5-year period at a residential facility and identified individuals (n = 24) who were discharged for violent and destructive behavior and individuals (n = 12) who were discharged to a less restrictive environment. These protocols were inspected for the presence of the two investigated indices of violent and destructive behavior, AOS ≥ 1 and ACT-MOV ≥ 1. Twenty of the 24 residents (83%) discharged for aggressive behavior evinced one or more of the signs, whereas only 3 of the 12 residents (25%) discharged to a less restrictive environment exhibited one or more of the signs.
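
The percentages reported for the Panek and Wagner (1989) discharge groups follow directly from the counts given above, as the sketch below shows. The function names are hypothetical, and the sign rule assumes that either index at or above 1 counts as a sign. The overall figure is a derived calculation from the reported counts, not a value reported in the study.

```python
def has_sign(aos, act_mov):
    """Assumed rule: either index at or above 1 counts as a sign."""
    return aos >= 1 or act_mov >= 1


def hit_rates(flagged_target, n_target, flagged_comparison, n_comparison):
    """Percentages flagged in each group, plus overall percent correct."""
    correct = flagged_target + (n_comparison - flagged_comparison)
    return {
        "target_flagged_pct": 100 * flagged_target / n_target,
        "comparison_flagged_pct": 100 * flagged_comparison / n_comparison,
        "overall_correct_pct": 100 * correct / (n_target + n_comparison),
    }


# Counts as reported: 20 of 24 discharged for aggressive behavior and
# 3 of 12 discharged to a less restrictive environment showed a sign.
rates = hit_rates(20, 24, 3, 12)
print({name: round(value) for name, value in rates.items()})
# target 83, comparison 25, overall (derived) 81
```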

In summary, the AOS is a relatively stable score (McGiboney & Carter, 1982; Panek & Stoner, 1979) for short intervals, and it is less stable over long intervals (Breidenbaugh, Brozovich, & Matheson, 1974; Stoner & Lundquist, 1980; Zizolfi & Cilli, 1999). Groups who are identified as aggressive or acting out consistently produce higher mean AOS values compared to normal or non-acting-out groups. The AOS has not been effective in identifying different levels of placement for socially and emotionally maladjusted urban schoolchildren and African American youth identified for acting-out behavior. A cutoff score of AOS ≥ 1 is most commonly used, but it is not always the most effective in classifying acting-out individuals. Using this cutoff score, the AOS improves upon chance classification by roughly 10% to 35%, so it should be considered a modest index of acting-out behavior that should always be supplemented with additional data. As is true with most cutoff scores, specific values have been identified to fit individual samples. Without focusing on the single best value, however, the sum and substance of the available data indicate that an AOS imbalanced in favor of negative behavior is associated with a greater tendency to act out. It is also important to mention that an absence of a positive AOS does not necessarily preclude acting out. Other data (notably the PATH score) should always be reviewed when making inferences regarding acting-out potential.

Validity—Predicting Acting-Out Behavior: AGG

The AGG variable is integral to computing the AOS and has received considerable research attention in its own right. Normative data suggest the typical range for AGG responses is 0 to 2 (M = 1.17, SD = .91; Wagner, 1983). The AGG variable has shown moderate levels of test-retest reliability (r = .51, 2-week interval) for normal subjects (Panek & Stoner, 1979) and moderately high test-retest reliability (r = .67, 3-week interval) for a sample of adolescents in an alternative school for acting-out behavior (McGiboney & Carter, 1982).

Haramis and Wagner (1980) compared a matched group of 60 alcoholics classified as acting out versus non-acting out based upon behavior exhibited during intoxication. The acting-out group produced significantly more AGG responses. Clemence et al. (1999) found that youth in an aggression-referred group scored significantly higher on the AGG variable compared to a nonreferred group. Using an AGG score of ≥ 2, 69% of the sample were correctly classified. Wetsel, Shapiro, and Wagner (1967) similarly used an AGG score of ≥ 2 to correctly classify recidivist delinquents with 68% accuracy. Samples of youth defined as severely behavior handicapped (Waehler et al., 1992) and socially and emotionally disturbed (Hilsenroth & Sivec, 1990) have also produced significantly higher mean scores on AGG compared to “normal” control groups. In summary, the AGG variable seems to be reasonably stable over short intervals, and an elevated AGG score (e.g., ≥ 2) has been shown to correspond to a greater likelihood of aggressive behavior.

Identifying Psychopathology With PATH/WITH/MAL

Next to predicting acting-out behavior, a second research area for the Hand Test has been its potential for assessing psychopathology and/or emotional disturbance. In this regard, the Hand Test PATH score “provides an overall estimate of the total amount of pathology in an examinee’s protocol” (Wagner, 1983, p. 11). The PATH score is one of the more stable Hand Test variables for short intervals (r = .69 to .76; Wagner, 1983) and is relatively low in the normative sample (M = 1.22, SD = 1.31; Wagner, 1983).

Two studies of convergent validity support the PATH score in this regard. Wagner, Darbes, and Lechowick (1972) examined patients’ (N = 50) PATH scores in relation to staff ratings of psychopathology. The staff rated patients on a 4-point scale of pathology. Staff were trained on the rating materials and 88% of their judgments were either identical or one step removed. Despite an imperfect criterion and limited range of pathology, the PATH score was found to correlate significantly with ratings of pathology (Spearman rank-order correlation, rho = .51) in this sample largely represented by psychotic inpatients.

In a similar vein, Hilsenroth, Fowler, Sivec, and Waehler (1994) correlated the PATH score with Minnesota Multiphasic Personality Inventory-2 (MMPI-2) clinical scales in a sample of outpatients (N = 43). Using the PATH variable as a criterion, a multiple R of .71 was obtained using all 13 MMPI-2 scales and an r of .65 was obtained using only 7 clinical scales. Scale 7 (psychasthenia), which has been described as “a reliable index of psychological turmoil and discomfort” (Graham, 2000, p. 76), was the single MMPI-2 clinical scale most highly correlated with the PATH score (r = .61). Hilsenroth et al. also reported divergent validity for the PATH score due to its nonsignificant correlations with MMPI-2 scales thought to reflect exaggeration of symptoms (F) and defensiveness (L, K).

Several studies have examined the criterion-related validity of the PATH score by comparing identified clinical samples with nonclinical groups. For example, Lenihan and Kirk (1990) found significantly higher PATH scores in their sample of eating-disordered, college-age women (n = 34) compared to a group of non-eating-disordered women (n = 26). Young, Wagner, and Finn (1994) found a significantly higher mean PATH score in their sample of patients diagnosed with multiple personality disorder (n = 11) compared with a matched clinical control group (n = 11). Although the sample sizes in these studies have been limited, the PATH score has consistently been higher in clinical groups compared to nonclinical samples.

In a larger sample of Vietnam veterans (N = 108), Walter, Hilsenroth, Arsenault, Sloan, and Harvill (1998) evaluated Hand Test responses of veterans who met DSM criteria for post-traumatic stress disorder (PTSD; n = 85) and veterans who demonstrated some post-traumatic stress symptoms (PTSS), but who did not meet full criteria (n = 23). The PATH score was significantly higher in the PTSD group (M = 3.67, SD = 2.49), who by definition would be expected to express more severe psychopathology compared to the PTSS group (M = 2.17, SD = 3.31), providing strong support for the PATH score’s capacity for differentiating severity of illness.

In summary, the PATH score tends to be stable over time (McGiboney & Carter, 1982; Panek & Stoner, 1979). Clinical groups consistently produce higher mean PATH scores compared to nonclinical groups. Moreover, the PATH score has also been shown to be sensitive to degrees of severity within and across clinical groups (e.g., Walter et al., 1998; Young et al., 1994). The PATH score has also been used to identify groups demonstrating acting-out behavior (e.g., Azcarate & Gutierrez, 1969) and youth with social and emotional problems (Hilsenroth & Sivec, 1990; Waehler et al., 1992). In general, a PATH score of ≥ 3 can be used as a marker of psychopathology (Wagner, 1983) except with children (see Sivec & Waehler, 1999) in kindergarten through second grade, who tend to produce a higher mean PATH score (M = 3.65, SD = 4.1) than all other age groups (Wagner et al., 1991).

The WITH score represents the sum total of the most pathological of the Hand Test responses (DES, BIZ, FAIL). This category of scores was originally thought to reflect an “abandonment of meaningful, effective life roles” (Wagner, 1983, p. 21). These types of responses can be anticipated for those individuals suffering from the debilitating effects of severe illnesses and are rarely if ever found in normal protocols. For example, Wagner (1961, 1962b) found that an elevated WITH score (≥ 2) was more likely to be associated with schizophrenia compared to “normals” and “neurotics.” This WITH score effectively classified 84% of the schizophrenics versus normals, but only 50% (chance level) of schizophrenics versus neurotics. An elevated WITH category score, however, is not necessarily associated only with psychotic symptoms.

Studying patients suffering cerebral impairment due to a stroke, Wang and Smyers (1977) demonstrated that cognitively compromised stroke patients (n = 42) produced more WITH responses compared to a group (n = 32) of medical patients without cognitive impairment. In addition, the stroke patients produced more FAIL responses compared to the control group. These data suggest that WITH and FAIL responses may reflect a limited capacity to respond to novel stimuli, in this case, due to organic impairment.

Wagner et al. (1991) found the WITH score to be significantly higher in a sample of severely behavior-handicapped (SBH) children (n = 98) compared to a sample of normal children (n = 98) matched closely on age, gender, and race. SBH children are described as exhibiting severe emotional and behavioral problems. Similarly, Hilsenroth and Sivec (1990) found the WITH score to be higher in socially and emotionally disturbed child and adolescent groups compared to normals. Waehler et al. (1992) further demonstrated that a higher WITH score identified youth in a “special classroom” or “separate facility” (which are indications of severity of condition) compared to normals. These studies suggest that the WITH score is associated with youth who are not able to function effectively within the school environment.

In summary, WITH tends to be stable across short and longer term intervals (.60 to .86; Wagner, 1983). Studies to date have shown the WITH score to be elevated in a variety of disordered populations. At present, the WITH score measures disengagement from the task at hand that may reflect psychosis, overly concrete thinking, lack of investment, or unwillingness to follow prescribed rules. For this reason, it is important to examine carefully the variables that are inflating the overall WITH score. For example, a patient providing many BIZ responses is likely to present differently than the patient who gives the limited, but less distorted, FAIL or DES responses.

The DES response reflects a “feeble and safe reaction to reality” (Wagner, 1983, p. 21). When an individual can only provide a simple description (DES) of the drawings, this may signal a limitation in cognitive resources and/or depressive issues. For example, Wagner, Klein, and Walter (1978) divided patients (N = 100) into four groups based upon IQ scores. In this sample, the DES variable correlated -.28 with IQ. McCormick and Wagner (1983) found that a sample of “brain-damaged” patients (n = 50) produced a significantly higher mean number of DES responses compared with a functionally disturbed group (n = 50). Sivec, Hilsenroth, and Wagner (1989) found the DES variable negatively correlated with IQ in grade-school students (Stanford-Binet r = -.39, N = 60; WISC r = -.23, N = 112). Patients (N = 15) in the depressed phase of a bipolar illness also produced significantly more DES responses than in testing conducted during the manic phase of their illness (Wagner & Heise, 1981). In summary, DES scores tend to be moderately stable (Wagner, 1983), are inversely related to intelligence, are frequently found in mental retardation (Panek, 1999), and can sometimes be seen in organic conditions. DES scores may also be associated with depression.

The Maladjustive (MAL) category taps those responses that are thought to reflect “difficulty in carrying out action tendencies due to inner weakness and/or outer prohibition” (Wagner, 1983, p. 20). Of the scores comprising MAL, the FEAR variable is the rarest and perhaps most clinically significant when present. For example, Gianakos and Wagner (1987) examined the Hand Test responses of battered women residing in a shelter or receiving outpatient treatment. In this study, the sample of battered women (n = 27) produced significantly more FEAR responses compared to nonbattered women (n = 27). The FEAR variable was also significantly correlated with frequency of abuse, previous visits to a shelter, and history of medical treatment following battering. When the FEAR variable was combined with card shock and PAS (higher in nonbattered women), 78% of battered and 96% of nonbattered women were classified correctly. Of note, Dalton and Kantner (1983) did not find significant elevations in the FEAR response by comparing the Hand Test responses of battered and nonbattered women. Therefore, further study of the FEAR variable in abuse situations is warranted.

Rasch and Wagner (1989) found that children who had been sexually abused (n = 24) produced more MAL responses (of which FEAR is included) compared to a matched control group (FEAR was not reported separately in this article). Young et al. (1994) reported that patients (n = 11) diagnosed with multiple personality disorder (MPD) produced significantly more FEAR responses than a clinical control group (n = 11). The common thread in these studies is a history of recent or past abuse. Although no separate rating of fear or abuse was included in the Young et al. (1994) study, abuse has been clearly associated with cases of MPD (Putnam, 1989). In summary, although the FEAR variable is sometimes unstable (e.g., with older adults, Stoner and Lundquist [1980]), it does seem to accurately reflect “genuine apprehension about threats to ego integrity” (Wagner, 1983, p. 21) as in abuse situations.

CRIP, another variable in the Maladjustive category, is thought to reflect feelings of inadequacy, types of inferiority, and/or degrees of incapacitation (Wagner, 1983). In a straightforward examination of the inferences associated with CRIP, Wagner and Young (1999) found that pain patients produced significantly more CRIP responses compared to normals (N = 100). In addition, veterans diagnosed with PTSD also produced significantly more CRIP responses compared to patients with PTSS, but not the full diagnosis (Walter et al., 1998). Although physical incapacitation is an obvious feature of pain patients, it is not as clearly associated with PTSD. Knowing if the PTSD veterans also reported more physical problems compared to the PTSS group would help to clarify whether the CRIP response reflects sensitivity to physical damage or problems, a psychological sense of limitation or inadequacy, or both.

Other Noteworthy Hand Test Scoring Categories: EXH/ACT

An EXH response denotes the desire to receive attention from others, especially with the expectation of deriving pleasure due to one’s specialness. Wagner (1974) examined the validity of this interpretation by contrasting normals and clinical groups with a group of strippers (n = 7). The small group of strippers produced significantly more EXH responses (M = 4.0) compared to matched groups of schizophrenics and neurotics, but not significantly different from the normal comparison group. It should be noted that the sample size is very small and the comparisons with normals and clinical groups may not be representative. Within the same study, a group of males identified as “exhibitionists” (n = 12) were compared with a clinical control group matched on age and diagnostic grouping (neurotic, schizophrenic, character disorder). The male exhibitionists produced more EXH responses compared to the control group, but fewer EXH responses (M = 1.8) compared to the strippers. Although these data support notions associated with EXH, further studies are needed to clarify the inference that EXH responses are associated with a desire to draw attention to oneself.

The ACT response is thought to be given by individuals “who are involved in constructive accomplishment” (Wagner, 1983, p. 20). This response is quite common in normal populations (M = 3.77, SD = 2.93; Wagner, 1983). ACT has been studied as a potential marker of “good” workers. Wagner and Cooper (1963) found that unskilled to semiskilled workers rated as “Satisfactory” (n = 30) tended to produce more ACT responses than coworkers described as “Unsatisfactory” (n = 20). Using a median cutoff, with an ACT score of 5 marking a satisfactory worker, 45 out of 50 workers (90%) were classified correctly. Huberman (1964) was unable to cross-validate this finding in a smaller sample of workers (N = 18) rated as “high, average, or low.” However, there were only six workers in each of the three groups, which is not adequate for meaningful statistical comparisons. The ACT score was also positively correlated with rankings of workshop success in a sample (N = 27) of severely retarded adults (Wagner & Hawver, 1965). In addition, the ACT score differentiated mentally retarded workers classified as “good” (n = 28) from those classified as “poor” (n = 19), using a cutoff score of 2 to identify the “good workers” (Wagner & Capotosto, 1966).

The ACT variable has also been identified at lower levels in groups showing less involvement in reaching life and work goals for a number of reasons. For example, Gianakos and Wagner (1987) and Dalton and Kantner (1983) both found lower ACT scores in groups of battered women compared to normals. Walter et al. (1998) found significantly lower ACT scores in the PTSD group compared to a less severely disturbed group of veterans with post-traumatic stress symptoms. Wagner and Romanik (1976) found that marijuana-using (n = 30) and multi-drug-using (n = 30) college students produced significantly fewer ACT responses compared to a matched sample of normal college students (n = 30). These results were reported as supporting the hypothesis that marijuana users develop a passive attitude and may withdraw from pursuing challenging goals.

Overall, an elevated ACT score has been consistently associated with productive work performance in lower functioning individuals. The available studies suggest a moderate to high level of correspondence to the proposed interpretation of this variable. However, it should be noted that no single cutoff score exists for “productive” workers. The above studies have maximized the separation of groups based upon each group’s characteristics, suggesting that effective use of the ACT variable will require establishing local base rates and defining appropriate cutoff scores for each specific setting. Lower ACT scores also appear to be associated with certain clinical conditions (e.g., PTSD and drug abuse).
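
The point about local cutoffs can be made concrete with a minimal sketch. The Python below is an illustration under stated assumptions, not a procedure taken from the Hand Test literature: the function name `best_cutoff`, the convention that a score at or above the cutoff is classified as a “good” worker, and the ACT scores themselves are all hypothetical. It simply tries each observed score as a cutoff and keeps the one with the highest overall hit rate in the local sample.

```python
def best_cutoff(good_scores, poor_scores):
    """Return (cutoff, hit_rate) for the cutoff that classifies the most cases correctly.

    Assumed convention (not from the source): score >= cutoff -> "good" worker.
    Ties are resolved in favor of the lowest cutoff.
    """
    candidates = sorted(set(good_scores) | set(poor_scores))
    total = len(good_scores) + len(poor_scores)
    best = None
    for c in candidates:
        # Correct decisions: good workers at/above the cutoff, poor workers below it
        hits = sum(s >= c for s in good_scores) + sum(s < c for s in poor_scores)
        rate = hits / total
        if best is None or rate > best[1]:
            best = (c, rate)
    return best


# Hypothetical ACT scores, invented purely for illustration
good = [5, 6, 4, 7, 5, 4, 6, 8]
poor = [2, 1, 3, 0, 5, 2]

cutoff, rate = best_cutoff(good, poor)
print(f"cutoff = {cutoff}, hit rate = {rate:.1%}")
```

A cutoff chosen this way is fitted to the sample at hand, so its hit rate would be expected to shrink in a new sample. This is one reason a setting-specific cutoff should be cross-validated before it is used for decisions.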

RANGE OF APPLICABILITY AND LIMITATIONS

Some areas of research using the Hand Test have been sufficiently broad and systematic to warrant additional attention in this review. In this regard, the Hand Test has been used extensively with persons with mental retardation (MR) and older adults. These research areas have revealed distinct Hand Test patterns and characteristics for these groups. Therefore, the following sections provide summaries for Hand Test use with the mentally retarded and older adults.

Hand Test Interpretation for Persons With Mental Retardation

Although projective techniques have been used widely with persons with mental retardation since the 1930s, there exist fundamental concerns that they may be too abstract or complex, demand too much sustained attention, and require too high a level of verbal, cognitive, and perceptual-motor activity to be effective (Panek & Wagner, 1979; Prout & Strohmer, 1994). (See Panek [1997] for a complete discussion of using projective tests with persons with mental retardation.) The Hand Test overcomes many of these proposed challenges to assessing persons with mental retardation due in large part to the simplicity of the task demands of the test, the “concreteness” of the test stimuli, and the relatively brief time for administration (Wagner, Ryan, & Panek, 1991).

In this regard, the Hand Test has established norms for persons with mental retardation (Wagner, 1962a, 1983). In addition, Stoner (1985) demonstrated adequate test-retest reliability by administering the Hand Test to 60 institutionalized persons with mental retardation and then retesting them 6 weeks later. Significant correlations were found for 18 Hand Test scoring variables. Correlations for the various Hand Test variables ranged from low (.23 for AFF and CRIP) to high (.87 for DIR), with the majority being in the moderate range. Given the relatively long time interval between administrations (6 weeks), the strength of these correlations is within an acceptable range.

According to Panek (1997, 1999), although variability exists (usually as a function of IQ level), the Hand Test manual and other literature suggest there is a prototypical response pattern for persons with mental retardation at different stages of the life span. Specifically, compared to the typical person without mental retardation, persons with mental retardation tend to manifest a response pattern with seven characteristics:

1. Low R, characterized by 10 or fewer responses (Panek, 1999; Panek & Wagner, 1979, 1980; Wagner, 1962a, 1983);
2. Use of fewer response categories and a reduced number of responses within categories (Panek, 1999; Panek & Wagner, 1979, 1980; Wagner, 1962a, 1983);
3. Elevated DES, which can be the most common response category, often accompanied by Demonstrations (D) and Emulations (E) (Panek, 1997, 1999; Wagner, 1962a, 1983);
4. High repetition of responses in all scoring categories (Panek, 1997, 1999);
5. More FAIL scores, especially to Card X (Wagner, 1962a, 1983). (Note: When no comorbid Axis I condition exists, FAIL responses are often associated with a motivational deficit [Panek, 1997, 1999]);
6. Low MAL (if MAL responses occur at all, they typically indicate actual, recent physical, medical, or injury problems, such as a finger broken at a workshop [Panek, 1997, 1999]);
7. Elevated WITH and PATH due to the large number of DES and FAIL responses in the protocol (Panek, 1997, 1999). Generally, high WITH for persons with mental retardation indicates limited personality and cognitive resources rather than a lack of reality contact. A high PATH should not be automatically associated with psychopathology as with other populations, due to the higher average occurrence of DES and FAIL responses in mental retardation.

Persons with mental retardation are vulnerable to a full range of psychiatric impairment and in fact may even be at greater risk of mental illness than the general population due to their substantial biological and psychosocial challenges, as well as negative social conditions (Reiss, 1994). Panek and Wagner (1993) conducted an exploratory investigation to determine the effectiveness of the Hand Test in differentiating between institutionalized older adults with mental retardation (n = 17) and those with a dual diagnosis of mental retardation and a comorbid mental illness (n = 17). Noteworthy was the BIZ scoring category inasmuch as nine of the dually diagnosed individuals gave at least one BIZ, whereas none of the persons diagnosed solely with mental retardation produced a BIZ. The DES category was the next most discriminative variable, correctly classifying 17 of the remaining cases (and misclassifying 8). Using BIZ and DES sequentially (i.e., first pulling the cases with BIZ and then looking for those with fewer DES), it was possible to correctly identify 76% of the participants. Results indicated that the typical person with mental retardation not only has a low IQ but is also characterized by bland affect, superficial involvement with people and the environment, and crude, unembellished perceptions (chiefly reflected in the DES). On the other hand, the older adult who is mentally retarded and also has some form of mental illness is characterized by intrusive psychotic processes, with this lack of reality contact causing the idiosyncratic embellishments reflected in the BIZ scoring category. As with other projective techniques, the Hand Test literature regarding persons with a dual diagnosis is limited. Therefore, findings should be considered speculative and additional research on this population is needed.
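
The arithmetic behind the 76% figure can be checked with a short sketch. The Python below takes the counts reported for Panek and Wagner (1993) and assumes one reading of them: that the 17 correct and 8 incorrect DES classifications refer to the 25 cases left after the nine BIZ cases were pulled. The function `sequential_rule` and its `des_threshold` parameter are hypothetical, since the passage does not report the DES value used to separate the groups.

```python
# Counts as reported in the passage above
n_mr_only = 17       # mental retardation only
n_dual = 17          # mental retardation plus comorbid mental illness
biz_flagged = 9      # dual-diagnosis cases with at least one BIZ (no MR-only case gave a BIZ)
des_correct = 17     # cases then classified correctly by DES
des_incorrect = 8    # cases then misclassified by DES

total = n_mr_only + n_dual
remaining = total - biz_flagged
# Assumed reading: the DES step was applied only to the cases without BIZ
assert remaining == des_correct + des_incorrect

overall_correct = biz_flagged + des_correct
hit_rate = overall_correct / total
print(f"{overall_correct}/{total} = {hit_rate:.1%}")


def sequential_rule(biz, des, des_threshold):
    """Two-step screen: any BIZ first, then fewer DES than a threshold.

    des_threshold is a hypothetical, setting-specific value; none is given in the source.
    """
    if biz >= 1:
        return "dual diagnosis"
    return "dual diagnosis" if des < des_threshold else "MR only"
```

Under this reading, 26 of the 34 participants are classified correctly, which matches the reported figure when rounded down to a whole percentage.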

In summary, the Hand Test has demonstrated adequate test-retest reliability with the MR population (Panek, 1999). As noted earlier, the AOS has been used effectively to identify acting out in persons with mental retardation. Preliminary criterion-related validity for Hand Test scoring variables such as R, WITH, PATH, and ACT has also been demonstrated in the literature. The Hand Test has been found to be sensitive to individual variability among persons with mental retardation. The interpretation of some Hand Test variables requires slight modification when working with the MR population. For example, the WITH score indicates limited personality and cognitive resources rather than a lack of reality contact for persons with mental retardation.

The Use of the Hand Test With Older Adults

Although projective techniques have been used extensively with older adults (see Hayslip, 1999; Panek & Wagner, 1985; Panek, Wagner, & Kennedy-Zwergel, 1983), their use is not without criticism. Concerns about the effective use of projective techniques with older adults revolve around eight issues: (1) norms, (2) validity, (3) reliability, (4) test-taking behavior, (5) sensory and motor ability, (6) research designs, (7) relevance of test stimuli, and (8) adequate interpretation of responses (see Panek & Wagner, 1985; Panek et al., 1983, for an in-depth discussion of these problems/issues).

With regard to older adults, there are a number of Hand Test studies that provide normative and standardization data for community-living or normal (Maloney & Wagner, 1990; Panek, Sterns, & Wagner, 1976; Panek, Wagner, & Avolio, 1978; Stoner, Panek, & Satterfield, 1982) and clinical or institutionalized samples (Panek & Rush, 1979; Panek & Spencer, 1983). Hand Test reliability (Stoner & Lundquist, 1980; Wagner et al., 1991) and validity (Hayslip & Panek, 1982, 1983; Panek & Hayslip, 1980) studies are also available. For example, regarding reliability, Stoner and Lundquist (1980) administered the Hand Test twice (M test-retest interval = 34.9 days) to older adults (N = 50) residing in a nursing facility. These researchers reported significant test-retest correlations for 23 of the 24 investigated Hand Test variables; only FEAR was not significant. Significant test-retest correlations ranged from .29 (H-L) to .83 (INT), with the median correlation being in the moderate-high range (.61). Studies that have investigated norms and psychometric properties of the Hand Test with older adults have generally controlled or adjusted for potential sensory-motor factors. Studies investigating the effects of aging reflected in the Hand Test have employed cross-sectional, longitudinal (Wagner et al., 1991), and time-lagged (Hayslip, Panek, & Stoner, 1990) research designs. Based on these studies, Hayslip (1999) and Panek and Wagner (1985) concluded that many of the aforementioned concerns have been addressed satisfactorily in Hand Test research with the aged.
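
For readers less familiar with how such reliability summaries are built, the following minimal sketch shows one way a test-retest coefficient per variable and a median across variables could be computed. It assumes a Pearson correlation, which the passage does not specify, and every score in it is hypothetical; the variable labels INT and DIR are borrowed from the Hand Test only to make the example readable.

```python
from math import sqrt
from statistics import mean, median


def pearson_r(x, y):
    """Pearson correlation between paired scores from two administrations."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical scores for five examinees, invented for illustration only
time1 = {"INT": [2, 4, 1, 3, 5], "DIR": [1, 0, 2, 1, 3]}
time2 = {"INT": [2, 5, 1, 2, 5], "DIR": [1, 1, 2, 0, 3]}

# One test-retest coefficient per scoring variable
retest = {name: pearson_r(time1[name], time2[name]) for name in time1}

# Median coefficient across variables, the kind of summary reported above
median_r = median(retest.values())

for name, r in retest.items():
    print(f"{name}: r = {r:.2f}")
print(f"median r = {median_r:.2f}")
```

With real protocols, low-frequency categories tend to have restricted score ranges, which can lower a coefficient regardless of how stable the underlying tendency is.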

In general, although there are individual differences within and between older adults, studies with the Hand Test investigating age differences report findings similar to those obtained with the Rorschach and TAT (Panek & Wagner, 1985). Specifically, the responses of older adults indicate a withdrawal in a psychological or emotional sense from the environment (elevated WITH, PATH, and lower ACQ) and constricted interpersonal relationships (lowered INT, COM, elevated DIR). Older adults manifest an increased dependence on others (higher DEP) and demonstrate less assertive tendencies (lower ACQ). They tend to be slower to respond to novel situations (increased H-L, AIRT) and tend to exhibit more stereotypical response patterns (increased Repetition, Emulation).

The inferences associated with the significant Hand Test scores are largely consistent with findings from the Kansas City Studies of adult personality (Hayslip, 1999). That is, there seems to be an age-related propensity toward “interiority,” or the withdrawal of the aged person from the outside world with a concomitant reinvestment of personality resources into the self. However, as reported by Panek et al. (1998) these findings should not be interpreted so rigidly as to imply “withdrawal” in a maladaptive sense, since these differences may reflect successful adaptation to environmental and interpersonal events. Baltes (1997) suggested that with increasing age an individual must learn to compensate and be selective in all aspects of activity in order to maintain effective or optimal functioning. Thus, what appears as withdrawal may not be “negative,” but may be an adaptive way to compensate for declining abilities. Similar to other age ranges, gender differences in Hand Test scores are minimal in older adults (Panek et al., 1998; Stoner et al., 1982). As is true for other areas of Hand Test research, the majority of research for older adults occurred prior to 1990 and is in need of replication and extension with current cohorts of younger and older adults. Although the Hand Test has been used to differentiate clinical versus nonclinical groups, it is not advised to diagnose patients based upon Hand Test data alone. As noted throughout the Hand Test literature, additional data are always required when forming diagnostic conclusions.

ACCOMMODATION FOR POPULATIONS WITH DISABILITIES

As noted previously, the Hand Test is particularly suited to individuals with disabilities (e.g., MR) due to the simple nature of the task. It cannot be used with individuals who have serious visual impairments, but it is suitable for most other groups for which normative and research data are available.

LEGAL AND ETHICAL CONSIDERATIONS

As is true with all psychological tests, only those individuals adequately trained in the uses and limitations of the Hand Test should employ this technique when making clinical, work, or placement determinations (see APA ethics code; American Psychological Association, 2002).

COMPUTERIZATION

The Hand Test is not currently available in a computerized format. At present, the state of the research does not seem adequate to develop computer technology for interpretation. Additionally, the test is simple, straightforward, and brief, with few complicated mathematical conversions that would best be computed mechanically. Therefore, traditional handling of the test material and interpretation seems to provide the much-needed interpersonal element without overly taxing the resources of the assessor.

USE IN CLINICAL AND ORGANIZATIONAL PRACTICE

Overall, the Hand Test has demonstrated its usefulness as a general measure of psychopathology (PATH) and for predicting acting-out behavior (AOS). Also, the following variables have been associated with specific populations: DES (limited intelligence), FEAR (abused individuals), and CRIP (pain patients and PTSD). With a few exceptions, the Hand Test has not been used in work settings. As noted in this text, the Hand Test has been used with persons diagnosed with MR in various work settings. It has also been used to a limited extent in evaluating workers in semiskilled settings and with patrol officers (Rand & Wagner, 1973). Of note, O’Roark (1999) offers case examples of Hand Test assessments used to inform employment decisions.

CURRENT RESEARCH STATUS

The Hand Test continues to be evaluated in a variety of clinical and nonclinical settings. It has stood the test of time and it has stimulated interest in different countries. The Hand Test has been subjected to a variety of reliability and validity studies. As noted by Sivec and Hilsenroth (1994), the Hand Test boasts many studies supporting concurrent, criterion-related validity, but fewer studies that support the construct validity of specific interpretations. Since their review, only Hilsenroth et al. (1994) have provided further information regarding the convergent and divergent validity of the PATH score. Further work is needed in this domain.

One possible limitation of the Hand Test is that intentionally altered responses (e.g., fake good, fake bad) may not be detected. Singer and Dawson (1969) reported that college-aged students (N = 40) were able to produce Hand Test responses that corresponded to their “best” and “worst” impression. However, no further investigations of this claim have been provided to date.

CROSS-CULTURAL FACTORS

For projective tests to be useful in the new millennium they must address variations within and across different ethnic groups and cultures (Panek, 2001). Test interpretations must occur with an awareness of the range of responses found across different cultures. Thus, there is a need for projective techniques to develop norms for a variety of countries and ethnic groups within particular countries.

The Hand Test has been used effectively in a number of countries in Europe, Asia, and North America. For example, Panek et al. (1998) have begun to develop Canadian norms for the Hand Test. In Europe, although norms have not been developed, the Hand Test has been used effectively in Norway (Drs. Nils Lie and Arne Haeggernes), Italy (Drs. Salvatore Zizolfi, Gabriella Cilli, Vito Tummino, and associates), and Romania (Mr. Dan Murarasu). In Asia, Ms. Eiko Yamagami, Dr. Mari Yoshikawa, and Ms. Hiroko Sasaki have translated Wagner’s texts, conducted original research, and developed Japanese norms for the Hand Test. The Hand Test has also been used effectively in Pakistan (Drs. Zahid Mahmood and Rukhana Kausar). Sharing of Hand Test research across continents has been limited by language issues and by a lack of a common forum for sharing this information (with the exception of international conferences). The development of an international Hand Test manual or book of readings may bridge the work of these diverse authors.

FUTURE DEVELOPMENTS

The Hand Test would benefit from updating the norms with both clinical samples and nonclinical samples; norms for the current adult manual are over 20 years old (Wagner, 1983). In addition, it would be helpful to provide a centralized system for developing and distributing cross-cultural norms and findings. Current longitudinal and life-span data would also aid in interpretation efforts. Performing an update that further investigates the Hand Test as it relates to specific diagnostic and treatment applications would help support the advantages of this instrument as a measure of prototypical attitudes and action tendencies that are “close to the surface” of psychological activities. For example, diagnostic patterns have been reported for patients in manic and depressed states (Wagner & Heise, 1981), eating-disordered patients (Lenihan & Kirk, 1990), and patients with dissociative identity disorder (see Young et al., 1994). In these ways, some Hand Test patterns show potential to be related to specific DSM diagnoses and, along with these designations, could be helpful in specifying intervention strategies.

REFERENCES

American Psychological Association. (2002). Ethical principles of psychologists and code of conduct. American Psychologist, 57, 1060–1073.
Azcarate, E., & Gutierrez, M. (1969). Differentiation of institutional adjustment to juvenile delinquents with the Hand Test. Journal of Clinical Psychology, 25, 200–203.
Baltes, P.B. (1997). On the incomplete architecture of human ontogeny. American Psychologist, 52, 366–379.
Beck, S. (1987). Questionnaires and checklists. In C.L. Frame & J.L. Matson (Eds.), Handbook of assessment in childhood psychopathology: Applied issues in differential diagnosis and treatment evaluations (pp. 79–105). New York: Plenum.
Breidenbaugh, B., Brozovich, R., & Matheson, L. (1974). The Hand Test and other aggression indicators in emotionally disturbed children. Journal of Personality Assessment, 38, 332–334.
Bricklin, B., Piotrowski, Z.A., & Wagner, E.E. (1962). The Hand Test: A new projective test with special reference to the prediction of overt behavior. In M. Harrower (Ed.), American lecture series in psychology. Springfield, IL: Charles C. Thomas.
Brodsky, S.L., & Brodsky, A.M. (1967). Hand Test indicators of antisocial behavior. Journal of Projective Techniques and Personality Assessment, 31, 36–39.
Clemence, A.J., Hilsenroth, M.J., Sivec, H.J., & Rasch, M.A. (1999). Hand Test AGG and AOS variables: Relation with teacher rating of aggressiveness. Journal of Personality Assessment, 7, 334–344.
Clemence, A.J., Hilsenroth, M.J., Sivec, H.J., Rasch, M.A., & Waehler, C.A. (1998). Use of the Hand Test in the classification of psychiatric inpatient adolescents. Journal of Personality Assessment, 71, 228–241.
Dalton, D.A., & Kantner, J.E. (1983). Aggression in battered and non-battered women as reflected in the Hand Test. Psychological Reports, 53, 703–709.
Das, B. (1953). The science of emotions (4th ed.). Adyar, Madras, India: Theosophical Publishing House.
Drummond, F. (1966). Failure in the discrimination of aggressive behavior of undifferentiated schizophrenics with the Hand Test. Journal of Projective Techniques and Personality Assessment, 30, 275–279.
Gianakos, I., & Wagner, E.E. (1987). Relations between Hand Test variables and the psychological characteristics and behaviors of battered women. Journal of Personality Assessment, 51, 220–227.
Graham, J.A. (2000). MMPI-2: Assessing personality and psychopathology (3rd ed.). New York: Oxford University Press.
Haramis, S.L., & Wagner, E.E. (1980). Differentiation between acting out and non-acting out alcoholics with the Rorschach and Hand Test. Journal of Clinical Psychology, 36, 791–797.
Hayslip, B., Jr. (1999). The Hand Test and aging. In G.R. Young & E.E. Wagner (Eds.), The Hand Test: Advances in application and research (pp. 167–181). Malabar, FL: Krieger.
Hayslip, B., Jr., & Panek, P.E. (1982). Construct validation of the Hand Test with the aged: Replication and extension. Journal of Personality Assessment, 46, 345–349.
Hayslip, B., Jr., & Panek, P.E. (1983). Physical self-maintenance, mental status, and personality in institutionalized older adults. Journal of Clinical Psychology, 39, 479–485.
Hayslip, B., Jr., Panek, P.E., & Stoner, S.B. (1990). Cohort differences in Hand Test performance: A time lagged analysis. Journal of Personality Assessment, 54, 704–710.

Hilsenroth, M., Fowler, C., Sivec, H., & Waehler, C. (1994). Concurrent and discriminant validity between the Hand Test pathology score and the MMPI-2. Assessment, 1, 111–113.
Hilsenroth, M.J., & Sivec, H.J. (1990). Relationships between Hand Test variables and maladjustment in school children. Journal of Personality Assessment, 55, 344–349.
Himelstein, P., & Von Grunau, G. (1981). Differentiation of aggressive and nonaggressive schizophrenics with the Hand Test: Another failure. Psychological Reports, 49, 556.
Huberman, J. (1964). A failure of the Wagner Hand Test to discriminate among workers rated high, average and low on activity level and general acceptability. Journal of Projective Techniques and Personality Assessment, 28, 280–283.
King, G.T. (1973). A comparison of Hand Test responses of aggressive and non-aggressive black adolescents. Dissertation Abstracts International, 34, 1736A.
Lenihan, G.O., & Kirk, W.G. (1990). Personality characteristics of eating disordered outpatients as measured by the Hand Test. Journal of Personality Assessment, 55, 350–361.
Maloney, P., & Wagner, E.E. (1979). Interscorer reliability of the Hand Test with normal subjects. Perceptual and Motor Skills, 49, 181–182.
Maloney, P., & Wagner, E.E. (1990). Predicting normal age-related changes with intelligence, projective, and perceptual-motor variables. Perceptual and Motor Skills, 71, 1225–1226.
McCormick, M.K.T., & Wagner, E.E. (1983). Validity of the Hand Test for diagnosing organicity in a clinical setting. Perceptual and Motor Skills, 57, 607–610.
McGiboney, G.W., & Carter, C. (1982). Test-retest reliability of the Hand Test with acting-out adolescent subjects. Perceptual and Motor Skills, 55, 723–726.
McGiboney, G.W., & Huey, W.C. (1982). Hand Test norms for disruptive black adolescent males. Perceptual and Motor Skills, 54, 441–442.
Meyer, G.J. (1997). Thinking clearly about reliability: More critical corrections regarding the Rorschach Comprehensive System. Psychological Assessment, 9, 495–498.
Miller, H.A., & Young, G.R. (1999). The Hand Test in correctional settings: Literature review and research potential. In G.R. Young & E.E. Wagner (Eds.), The Hand Test: Advances in application and research (pp. 183–190). Malabar, FL: Krieger.
Moran, J.J., & Carter, D.E. (1991). Comparisons among children’s responses to the Hand Test by grade, race, sex, and social class. Journal of Clinical Psychology, 47, 647–664.
Moss, P.A. (1992). Shifting conceptions of validity in educational measurement: Implications for performance assessment. Review of Educational Research, 62, 229–258.
O’Roark, A.M. (1999). Workplace applications: Using the Hand Test in employee screening and development. In G.R. Young & E.E. Wagner (Eds.), The Hand Test: Advances in application and research (pp. 25–32). Malabar, FL: Krieger.
Oswald, M.O., & Loftus, A.P.T. (1967). A normative and comparative study of the Hand Test with normal and delinquent children. Journal of Projective Techniques and Personality Assessment, 31, 62–68.
Panek, P.E. (1985). Presence of Hand Test indices of aggressive behavior for mentally retarded adults on behavior management programs for aggressive behavior. Psychological Reports, 57, 1144–1146.
Panek, P.E. (1997). The use of projective techniques with persons with mental retardation: A guide for assessment instrument selection. Springfield, IL: Charles C. Thomas.
Panek, P.E. (1999). The appraisal of mental retardation with the Hand Test. In G.R. Young & E.E. Wagner (Eds.), The Hand Test: Advances in application and research (pp. 117–127). Malabar, FL: Krieger.
Panek, P.E. (2001). Projective psychology in the new millennium: Issues and challenges. Journal of Projective Psychology and Mental Health, 8, 73–74.
Panek, P.E., Cohen, A.J., Barrett, L., & Matheson, A. (1998). An exploratory investigation of age differences on the Hand Test in Atlantic Canada. Journal of Projective Psychology and Mental Health, 5, 145–149.
Panek, P.E., & Hayslip, B., Jr. (1980). Construct validation of the Hand Test withdrawal score on institutionalized older adults. Perceptual and Motor Skills, 51, 595–598.
Panek, P.E., & Rush, M.C. (1979). Intellectual and personality differences between community-living and institutionalized older adult females. Experimental Aging Research, 5, 239–250.
Panek, P.E., & Spencer, W.B. (1983). Hand Test personality correlates of aging in institutionalized mentally retarded adults. Perceptual and Motor Skills, 57, 1021–1022.
Panek, P.E., Sterns, H.L., & Wagner, E.E. (1976). An exploratory investigation of the personality correlates of aging using the Hand Test. Perceptual and Motor Skills, 43, 331–336.
Panek, P.E., & Stoner, S. (1979). Test-retest reliability of the Hand Test with normal subjects. Journal of Personality Assessment, 43, 135–137.
Panek, P.E., & Wagner, E.E. (1979). Relationships between Hand Test variables and mental retardation: A confirmation and extension. Journal of Personality Assessment, 43, 600–603.
Panek, P.E., & Wagner, E.E. (1980). Mental retardation as a facade self phenomenon: Construct validation. Perceptual and Motor Skills, 51, 823–826.
Panek, P.E., & Wagner, E.E. (1985). The use of the Hand Test with older adults. Springfield, IL: Charles C. Thomas.
Panek, P.E., & Wagner, E.E. (1989). Validation of two Hand Test indices of aggressive behavior in an institutional setting. Journal of Personality Assessment, 53, 169–172.
Panek, P.E., & Wagner, E.E. (1993). Hand Test characteristics of dual diagnosed mentally retarded older adults. Journal of Personality Assessment, 61, 324–328.
Panek, P.E., Wagner, E.E., & Avolio, B.J. (1978). Differences in the Hand Test responses of healthy females across the life-span. Journal of Personality Assessment, 42, 139–142.
Panek, P.E., Wagner, E.E., & Kennedy-Zwergel, K. (1983). A review of projective test findings with older adults. Journal of Personality Assessment, 47, 562–582.
Panek, P.E., Wagner, E.E., & Suen, H. (1979). Hand Test indices of violent and destructive behavior for institutionalized mental retardates. Journal of Personality Assessment, 43, 376–378.
Porecki, D., & Vandergroot, D. (1978). The Hand Test Acting-Out score as a predictor of acting out in correctional settings. Offender Rehabilitation, 2, 269–273.
Prout, H.T., & Strohmer, D.C. (1994). Assessment in counseling and psychotherapy. In H.T. Prout & D.C. Strohmer (Eds.), Counseling and psychotherapy with persons with mental retardation and borderline intelligence (pp. 79–102). Brandon, VT: Clinical Psychology Publishing.
Putnam, F.W. (1989). Diagnosis and treatment of multiple personality disorder. New York: Guilford Press.
Rand, T.M., & Wagner, E.E. (1973). Correlations between Hand Test variables and patrolman performances. Perceptual and Motor Skills, 37, 477–478.
Rasch, M.A., & Wagner, E.E. (1989). Initial psychological effects of sexual abuse on female children as reflected in the Hand Test. Journal of Personality Assessment, 53, 761–769.
Reiss, S. (1994). Handbook of challenging behavior: Mental health aspects of mental retardation. Worthington, OH: IDS Publishing.
Sattler, J. (1988). Assessment of children (3rd ed.). San Diego, CA: Jerome M. Sattler.
Singer, M.M., & Dawson, J.G. (1969). Experimental falsification of the Hand Test. Journal of Clinical Psychology, 25, 204–205.
Sivec, H.J., & Hilsenroth, M.J. (1994). The use of the Hand Test with children and adolescents: A review. School Psychology Review, 23, 526–545.
Sivec, H.J., Hilsenroth, M.J., & Wagner, E.E. (1989). Correlations between Hand Test variables and intelligence for public school students. Perceptual and Motor Skills, 69, 241–242.
Sivec, H.J., & Waehler, C.A. (1999). Behaviorally disturbed children and the Hand Test: Placement considerations. In G.R. Young & E.E. Wagner (Eds.), The Hand Test: Advances in application and research (pp. 137–153). Malabar, FL: Krieger.
Stoner, S. (1985). Test-retest reliability of the Hand Test with institutionalized mentally retarded adults. Psychological Reports, 56, 272–274.
Stoner, S.B., & Lundquist, T. (1980). Test-retest reliability of the Hand Test with older adults. Perceptual and Motor Skills, 50, 217–218.
Stoner, S.B., Panek, P.E., & Satterfield, T.G.T. (1982). Age and sex differences on the Hand Test. Journal of Personality Assessment, 46, 260–264.
Tariq, P.N., & Ashfaq, S. (1993). A comparison of criminals and non-criminals on Hand Test scores. British Journal of Projective Psychology, 38, 107–118.
Waehler, C.A., Rasch, M.A., Sivec, H.J., & Hilsenroth, M.J. (1992). Establishing a placement index for behaviorally disturbed children using the Hand Test. Journal of Personality Assessment, 58, 537–547.
Wagner, E.E. (1961). The use of drawings of hands as a projective medium for differentiating normals and schizophrenics. Journal of Clinical Psychology, 2, 279–280.
Wagner, E.E. (1962a). The Hand Test: Manual for administration, scoring and interpretation. Akron, OH: Mark James.
Wagner, E.E. (1962b). The use of drawings of hands as a projective medium for differentiating neurotics and schizophrenics. Journal of Clinical Psychology, 3, 208–209.
Wagner, E.E. (1974). Projective test data from two contrasted groups of exhibitionists. Perceptual and Motor Skills, 39, 131–140.
Wagner, E.E. (1983). Hand Test manual (Rev. ed.). Los Angeles: Western Psychological Services.
Wagner, E.E., & Capotosto, M. (1966). Discrimination of good and poor retarded workers with the Hand Test. American Journal of Mental Deficiency, 71, 126–128.
Wagner, E.E., & Cooper, J. (1963). Differentiation of satisfactory and unsatisfactory employees at Goodwill Industries with the Hand Test. Journal of Personality Assessment, 27, 354–356.
Wagner, E.E., Darbes, A., & Lechowick, T.P. (1972). A validation study of the Hand Test Pathology score. Journal of Personality Assessment, 36, 62–64.
Wagner, E.E., & Hawver, D.A. (1965). Correlations between psychological tests and sheltered workshop performance for severely retarded adults. American Journal of Mental Deficiency, 69, 685–691.
Wagner, E.E., & Heise, M. (1981). Rorschach and Hand Test data comparing bipolar patients in manic and depressive phases. Journal of Personality Assessment, 45, 240–249.
Wagner, E.E., Klein, I., & Walter, T. (1978). Differentiation of brain damage among low IQ subjects with three projective techniques. Journal of Personality Assessment, 42, 49–55.
Wagner, E.E., Maloney, P., & Wilson, D.G. (1981). Split-half and test-retest Hand Test reliabilities for pathological samples. Journal of Clinical Psychology, 37, 589–592.
Wagner, E.E., & Medvedeff, E. (1962). Differentiation of aggressive behavior of institutionalized schizophrenics with the Hand Test. Journal of Projective Techniques, 1, 111–113.
Wagner, E.E., Rasch, M.A., & Marsico, D.S. (1991). Hand Test manual supplement: Interpreting child and adolescent responses. Los Angeles: Western Psychological Services.
Wagner, E.E., & Romanik, D.G. (1976). Hand Test characteristics of marijuana-experienced and multiple-drug using college students. Perceptual and Motor Skills, 43, 1303–1306.

Wagner, E.E., Ryan, C.A., & Panek, P.E. (1991). Personality stability of institutionalized mentally retarded adults as measured by the Hand Test. Journal of Clinical Psychology, 47, 436–439.
Wagner, E.E., & Young, G.R. (1999). Hand Test characteristics of pain clinic patients. In G.R. Young & E.E. Wagner (Eds.), The Hand Test: Advances in application and research (pp. 205–211). Malabar, FL: Krieger.
Walter, C., Hilsenroth, M., Arsenault, L., Sloan, P., & Harvill, L. (1998). Use of the Hand Test in the assessment of combat-related stress. Journal of Personality Assessment, 70, 315–323.
Wang, P.L., & Smyers, P.L. (1977). Psychological status after stroke as measured by the Hand Test. Journal of Clinical Psychology, 33, 879–882.
Wendler, C.L.W., & Zachary, R.A. (1983, April). Reliability of scoring categories on a projective test. Paper presented at the meeting of the Western Psychological Association, San Francisco, CA.
Wetsel, H., Shapiro, R.J., & Wagner, E.E. (1967). Prediction of recidivism among juvenile delinquents with the Hand Test. Journal of Projective Techniques and Personality Assessment, 31, 69–72.
Wood, J.M., Nezworski, M.T., & Stejskal, W.J. (1996). The Comprehensive System for the Rorschach: A critical examination. Psychological Science, 7, 3–10.
Young, G.R., Wagner, E.E., & Finn, R.F. (1994). A comparison of three Rorschach diagnostic systems and use of the Hand Test for detecting multiple personality disorder in outpatients. Journal of Personality Assessment, 62, 485–497.
Zizolfi, S., & Cilli, G. (1999). Hand Test Acting-Out and Withdrawal scores and aggressive behavior of DSM-IV chronic schizophrenic outpatients. In G.R. Young & E.E. Wagner (Eds.), The Hand Test: Advances in application and research (pp. 155–164). Malabar, FL: Krieger.

CHAPTER 31 Early Memories and Personality Assessment

J. CHRISTOPHER FOWLER

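EARLY MEMORIES AND PERSONALITY ASSESSMENT
HISTORICAL AND CONCEPTUAL CONSIDERATIONS
EMPIRICAL EVIDENCE
  Diagnostic Assessment of Personality Types
  Assessment of Psychological Distress
  Assessing Personality Features of Aggressiveness and Alcoholism
  Object Relations and Affect
  Assessing Child and Adolescent Psychopathology
  Applications to Treatment
CONCLUSIONS
REFERENCES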

EARLY MEMORIES AND PERSONALITY ASSESSMENT

What someone thinks he remembers from his childhood is not a matter of indifference; as a rule the residual memories, which he himself does not understand, cloak priceless pieces of evidence about the most important features of his mental development.

—Sigmund Freud (1910)

This epigraph, drawn from Sigmund Freud’s analysis of Leonardo da Vinci’s early memory, captures the essential postulate that early childhood memories (EMs) can reveal crucial intrapsychic data for understanding the individual’s psychological life. Freud’s (1910/1957) analysis of da Vinci’s childhood memory is the first comprehensive use of early memories as a tool for assessing personality structure. This analysis was the product of a decade of clinical and theoretical struggle to comprehend the nature of autobiographical memory in relation to intrapsychic functioning and its expression in differing forms of neurosis. Early memory research began with Freud’s analysis of the defensive functions of screen memories, but has expanded to incorporate aspects of ego psychology, object-relations theory, individual psychology, and cognitive models.

The major clinical theories of early childhood memory will be summarized in order to contextualize the varieties of empirical research to follow. Based on the empirical evidence to date, it is proposed that the use of early childhood memories as a projective technique is a valid and reliable tool for assessing some aspects of personality functioning but fails to demonstrate validity in other arenas. As such, this chapter will serve as a resource and a reminder that considerably more empirical evidence is required before determining the validity of the sweeping claims of Freud and others.

HISTORICAL AND CONCEPTUAL CONSIDERATIONS

The pathogenic effect of early childhood trauma and its relationship to memory has been a cornerstone of psychological theory and therapy since Freud introduced the seduction theory of neurosis (Freud, 1896/1989). More than a century later, early life experiences are still considered by contemporary personality and developmental theorists as one of three major causative factors in the genesis of psychopathology.

Freud was the first to seriously struggle with the issue of accuracy in early childhood memories. Despite an earlier conviction that sexual trauma was the cause of all neuroses, he later rejected the seduction theory in favor of a constructivist view of memory that placed greater emphasis on the role that unconscious fantasy played in the distortion and reconstruction of memory and the formation of neurotic symptoms. By the time Freud had completed Screen Memories (1899/1962) and The Interpretation of Dreams (1900/1953) he had come to view early childhood memories as extremely subjective phenomena that are distorted under the pressure of present unconscious desires and motives: “It may indeed be questioned whether we have any memories at all from our childhood: memories relating to our childhood may be all we possess. Childhood memories did not, as people are accustomed to say, emerge; they were formed at that time. And a number of motives, with no concern for historical accuracy, had a part in forming them, as well as in the selection of the memories themselves” (1899/1962, p. 322).

From this understanding, Freud drew a profound conclusion about the importance of memory in the development of psychopathology: “the neurotic symptoms were not related directly to actual events but to wishful phantasies (sic), and that as far as the neurosis was concerned, psychic reality was of more importance than material reality” (Freud, 1925/1989, p. 21). He did not, as some critics claim, completely dismiss the role of actual traumatic life experiences; rather, he believed that sexual traumas and seductions did not account for all the neurotic reactions he had witnessed in his patients. While this new theory made it possible for analysts to approach EMs as data for assessing current psychic conflicts, Freud’s screen memory formulation emphasized the defensive, camouflaging function of early memories. In this model the manifest memory was regarded as a decoy; therefore, analysis of the patient’s associations to the memory was the only plausible means of revealing its underlying psychic meaning and significance.

Alfred Adler broke from Freud’s emphasis on the screen function to emphasize the importance of the manifest content of memories as they reveal central themes in the patient’s current view of the world and self (1931, 1937). Adler concurred with the reconstructive nature of autobiographical memory, but believed that manifest memories revealed as much as they concealed, therefore making free associations unnecessary for analyzing the psychological meaning. While accounting for pathological material, Adler’s emphasis clearly centered on the adaptive functions of EMs as “a story he repeats to himself to warn him or comfort him, to keep him concentrated on his goal, to prepare him, by means of past experiences, to meet the future with an already tested style of action” (1931, p. 73). This shift is important in two ways: it highlights early memories as a preconscious function of reinforcing self-schemas, and it transforms EMs into a projective tool because the manifest material becomes meaningful without the demand for further associations.

As psychoanalysis evolved to encompass ego psychology and object-relations theory, appreciation grew for how early memories could be used in diagnostic assessment of character pathology (Langs, 1965b) and object relations (Mayman, 1968). Equipped with a modern theory of ego psychology, Saul (Saul, Snyder, & Sheppard, 1956) proposed that early memories are similar in structure to dreams because they are selected and altered by the same motivational forces of the personality; however, early memories are superior in the sense that they are less influenced by day residue. From this he concluded that EMs have a significance equal to that of the first dream in psychoanalytic treatment because, “Earliest memories are absolutely specific, distinctive, and characteristic for each individual; moreover, they reveal, probably more clearly than any other single psychological datum, the central core of each person’s psychodynamics, his chief motivations, forms of neurosis, and emotional problem” (Saul et al., 1956, p. 229).

Working at the interface of modern object-relations theory and ego psychology, Mayman (1968) viewed memory as a principal factor in creating and maintaining distinctive representations, working models if you will, of the self and of important others in the individual’s life. These representations, according to Mayman, influence major facets of personality functioning.

I hope to show that early memories are not autobiographical truths, nor even “memories” in the strictest sense of the term, but largely retrospective inventions developed to express psychological truths rather than objective truths about the person’s life; that early memories are expressions of important fantasies around which a person’s character structure is organized; that early memories are selected (unconsciously) by the person to conform with and confirm ingrained images of himself and others around object relational themes . . . In short, I propose that a person’s adult character structure is organized around object-relational themes which intrude projectively into the structure and content of his early memories just as they occur repetitively in his (sic) relations with significant persons in his life (p. 304).

This approach to memory clearly places emphasis on the diagnosis of character as well as the implications for treatment, especially as EMs reveal potential transference patterns. Mayman developed the first systematic approach to gathering early memories in which he queried 16 specific memories. Unstructured memories such as the earliest memory and earliest memory of mother and father provided the primary data for analyzing object representations, while specific memory probes for the first day of school and feelings of anger, happiness, and fear provided insights into prototypic cognitive, affective, and behavioral reactions.

The latest development in early memory theory is Bruhn’s cognitive-perceptual model (1985, 1990, 1992a, 1992b). Bruhn’s basic theorem is built on cognitive and ego psychology principles, emphasizing the cognitive basis for memory distortion. “According to the cognitive-perceptual method, perception aims for a ‘general impression’ rather than a detailed picture of the whole, a point made long ago by Bartlett (1923). The basis of selectivity in perception is that needs, fears, interests, and major beliefs direct and orchestrate first the perceptual process itself and later the reconstruction of the events which are recalled” (Bruhn, 1985, p. 588). In addition to outlining a cognitive theory, Bruhn and his colleagues have constructed a systematic procedure for gathering data (Bruhn, 1990) and a Comprehensive Early Memories Scoring System (CEMSS; Last & Bruhn, 1983, 1985) used in a variety of empirical investigations.

EMPIRICAL EVIDENCE

Clinical case studies provided the first compelling evidence that skilled clinicians could interpret the structure of early memories to diagnose clinical syndromes and character organizations, but few efforts were made to empirically validate these claims until the 1960s. Invaluable as case reports and analyses of historical figures may be, emphasis in this chapter will be placed on empirical studies that provide some generalizability across individuals. Due to the breadth of coverage that follows, only an outline of the studies is possible. Conforming to current standards, only methodologically sound studies demonstrating significant results (probability values of .05 or better) with adequate interrater reliability coefficients will be presented. While not exhaustive in scope, the review to follow will outline the major contributions to clinical psychology. A considerable body of research utilizing EMs in career counseling and school psychology will not be reviewed here.

Diagnostic Assessment of Personality Types

Early efforts from ego psychologists included a manual for scoring the manifest content of the earliest memory (Langs, Rothenberg, Fishman, & Reiser, 1960). In addition to obtaining adequate reliability for manifest content and low-level dynamic inferences, thematic contents of punishment and discipline from others and physical attacking from the protagonist were significantly higher for inpatient hysterical characters (n = 10) than for inpatient paranoid schizophrenics (n = 10). This early study is one of eight known studies successfully discriminating schizophrenic patients from other disturbed psychiatric groups (Charry, 1959; Friedman, 1952; Friedman & Schiffman, 1962; Furlan, 1984; Hafner, Corrotto, & Fakouri, 1980; Hafner & Fakouri, 1978; Hafner, Fakouri, Ollendick, & Corrotto, 1979; Plutchik, Platman, & Fieve, 1970). A follow-up study (Langs, 1965b) of character types revealed modest differences between obsessive-compulsive (n = 12) characters and both hysteric (n = 9) and narcissistic (n = 13) characters. Obsessive patients produced first memories with a paucity of people and a higher degree of passivity. Langs (1965a) used the single earliest childhood memory to predict personality traits in a sample of 48 male actors. Correlational analyses of EMs, Rorschach, TAT, clinical interview, and intelligence test data revealed that themes of attack, damage, conflict, and illness in the earliest memory were significantly correlated with fears of loss of control over aggression and a greater degree of disorganization, yet negatively correlated with integration of identity and social values. While hampered by small sample sizes and an excessive number of exploratory analyses, these studies provided the first empirical evidence of the diagnostic value of EMs.

Several investigators have utilized EMs to assess narcissistic character traits. Harder (1979) used multiple projective measures (early memories, TAT, and Rorschach) to assess ambitious-narcissistic character traits in 40 male university students. Rorschach, TAT, and EM scores were found to have acceptable levels of reliability and showed cross-method correlations suggestive of adequate convergent validity. More importantly, the narcissistic features embedded in EMs successfully differentiated blind ratings of subjects as high in ambitious-narcissistic style. Shulman (Shulman, McCarthy, & Ferguson, 1988) applied DSM-III criteria to score EMs and TAT narratives in order to assess narcissistic traits in normal subjects (N = 40). The authors found adequate interrater reliability, as well as significant prediction of narcissistic traits as determined by a senior clinician who conducted extensive diagnostic interviews with each participant. Shulman and Ferguson (1988) also found EM and TAT scores to be significantly correlated with a self-report measure of self-absorption and self-admiration (N = 75). Using a similar methodology, Tibbals’ (1992) study of 70 male university students found that highly narcissistic subjects produced more early memories reflecting a need for admiration, high levels of grandiosity, and themes of interpersonal exploitation than did control subjects.

Assessment of Psychological Distress

In an impressive study of the diagnostic power of early memories, Shedler (Shedler, Mayman, & Manis, 1993) tested the hypothesis that individuals who underestimate their level of psychological distress on self-report measures but produce disturbed projective early memories (thereby engaging in defensive denial of psychological distress) would be prone to excessive physiological reaction. When subjects scored in the healthy range on both the self-report measures and the clinicians’ ratings of EMs, they were classified as “genuinely healthy” (n = 9). When both data sources indicated the subject was distressed, he or she was classified as “manifestly distressed” (n = 18). However, when the self-report data indicated that the subject was psychologically healthy but the clinicians’ ratings of EMs indicated distress, he or she was classified as having “illusory mental health” (n = 11). Subjects were exposed to mildly stressful conditions, with blood pressure and heart rate serving as physiological measures of arousal, to assess their reactivity to stress. The physiological reactivity for the “illusory mental health” group was about twice that of the manifestly distressed and healthy groups. The results for the illusory mental health group demonstrated not only that these individuals underestimated their level of distress but also that such defensive avoidance came at the cost of heightened coronary reactivity, a known risk factor for medical illness when it becomes chronic. Recently, Karliner, Westrich, Shedler, and Mayman (1996) applied the Adelphi Early Memory Index (AEMI) scores to the illusory mental health dataset (Shedler et al., 1993). They found that doctoral-level psychology students could reliably score the AEMI. To test the criterion validity of the scale, the authors created a health-distress index to classify EMs. The authors replicated all the findings using the AEMI in place of clinicians’ global ratings of health distress, and EM ratings provided a more accurate assessment of psychological distress in subjects who defensively denied their level of distress on self-report measures.
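
The three-group classification described above can be read as a simple decision rule that crosses two data sources. The following Python fragment is only an illustrative sketch of that logic, not the procedure used by Shedler and colleagues: the function name and the Boolean inputs are hypothetical, the cutoffs that define "healthy" on each measure are not specified here, and the fourth combination (distress on self-report but healthy EM ratings) is not described in the text, so the sketch leaves it unclassified.

```python
from typing import Optional


def classify_mental_health(self_report_healthy: bool,
                           em_rating_healthy: bool) -> Optional[str]:
    """Cross self-report status with clinicians' EM ratings (hypothetical helper).

    Both inputs are assumed to be already dichotomized as healthy vs. distressed;
    how that dichotomy is made is not covered by this sketch.
    """
    if self_report_healthy and em_rating_healthy:
        return "genuinely healthy"
    if not self_report_healthy and not em_rating_healthy:
        return "manifestly distressed"
    if self_report_healthy and not em_rating_healthy:
        # Self-report looks healthy, but the early memories suggest distress.
        return "illusory mental health"
    # Self-report distressed, EM rating healthy: not described in the text.
    return None


# Example: a subject who reports no distress but produces disturbed EMs.
group = classify_mental_health(self_report_healthy=True, em_rating_healthy=False)
print(group)
```

The point of the sketch is that the "illusory mental health" group can only be identified when the two data sources disagree in one particular direction, which is why neither measure alone would isolate it.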

Detecting the presence of mood disorders and degree of psychological distress is a complex arena of assessment that early memory researchers have undertaken with mixed results. Beck (1961) reported that hospitalized patients (N = 200) with high scores on the Beck Depression Inventory (BDI; Beck, Ward, Mendelson, Mock, & Erbaugh, 1961) produced EMs containing more themes of disappointment, rejection from others, and negative affect than did patients with low scores on the BDI. Acklin and colleagues (Acklin, Sauer, Alexander, & Dugoni, 1989) investigated the utility of EMs in predicting naturally occurring depressive moods in college students (N = 212). Using a modified version of the CEMSS (Last & Bruhn, 1983), the authors found that EM variables significantly predicted BDI scores, correctly classifying approximately 62% of the sample into depressed, mildly depressed, and nondepressed groups. Post hoc analyses of EM themes revealed that depressed students more frequently perceived others as frustrating their needs and perceived themselves as more damaged and threatened and the environment as unsafe and unpredictable. Depressed students also produced EMs with more negative affect tone than nondepressed students. Several additional studies (Allers, White, & Hornbuckle, 1990, 1992; Fakouri, Hartung, & Hafner, 1985) have found similar patterns of negative affect and passivity embedded in EMs of individuals with high BDI scores.

Saunders and Norcross (1988) examined the possible relationship between early memories (utilizing the CEMSS) and a broad spectrum of psychological and family functioning in a cohort of university students (N = 184). Among the significant correlations, more unpleasant emotional tone of EMs was related to greater disturbance in the subject’s report of the quality of family communications, role relationships, emotional responsiveness, affective involvement, and general quality of family functioning. In addition, the CEMSS variable of emotional tone was significantly correlated with Symptom Check List-90-Revised (SCL-90R) somatization, hostility, and paranoid ideation scores, while negative self-perceptions embedded in the EMs were associated with greater somatization, obsessive-compulsive symptoms, hostility, paranoid ideation, and psychoticism.

In a study of 122 psychiatric outpatients, Acklin (Acklin, Bibb, Boyer, & Jain, 1991) assessed aspects of object representations, quality of affect, and self-representations using a scale designed for this study, the Early Memory Relationship Scoring System (EMRSS). Comparing EMRSS scores with self-report measures of symptomatic distress, they found that the relationship scale was significantly correlated with 9 out of 10 Minnesota Multiphasic Personality Inventory (MMPI) clinical scales and all SCL-90R subscales. Follow-up multiple regression analyses revealed that the perception of the environment variable of the EMs accounted for more than 30% of the variance in reported distress from both MMPI and SCL-90R scales.

Utilizing Acklin’s EMRSS, Caruso and Spirrison (1994) investigated the links between EM themes and variations in personality functioning and coping in a large sample of university students (N = 134). Positive elements of self-esteem rated from EMs were negatively correlated with NEO Personality Inventory-Revised (NEO-PI-R) neuroticism scores, while evidence of greater social interest was positively correlated with NEO-PI-R extraversion scores. However, no other significant findings emerged between the EMRSS and the NEO-PI-R, suggesting limited convergent validity between EMs and this psychometrically sound self-report measure.

Assessing Personality Features of Aggressiveness and Alcoholism

Several studies have focused on the ability of EMs to inform clinicians regarding aggressive and delinquent behavior. Hankoff (1987) found incarcerated males developed EMs with dramatic and unpleasant themes, especially themes of disturbed and aggressive interactions with others. Quinn (1973), by contrast, found no differences among prison recidivists and nonrecidivists, nor among criminals who had committed crimes against individuals compared with those committing property crimes. Bruhn and Davidow (1983) used EMs to classify delinquent behavior in 32 adolescent males, 15 of whom had been arrested for property crimes. The EM scale consisted of bipolar coding of themes involving Injury to Self, Rule-Breaking, Self Alone Versus Interest in Others, Mastery Versus Failure, and Victimization. Using the total scores for all categories, the researchers correctly identified 12 of 15 delinquent adolescents and 18 of 18 nondelinquent males. Delinquents were more likely to recall traumatic personal injuries, whereas nondelinquent males were more likely to recall others getting injured. Delinquent males were also more likely to recall failures in attempts at mastery and were more likely to cast themselves as victims.
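
The Bruhn and Davidow (1983) classification counts reported above can be restated as conventional hit rates. The short Python sketch below does that arithmetic; the function name is hypothetical, and the figures are taken as reported in this chapter. Note that the reported group sizes (15 delinquent and 18 nondelinquent) sum to 33 rather than the 32 participants stated, so the overall rate should be read as approximate.

```python
def classification_rates(true_pos: int, n_pos: int,
                         true_neg: int, n_neg: int) -> tuple[float, float, float]:
    """Return (sensitivity, specificity, overall accuracy) from raw counts."""
    sensitivity = true_pos / n_pos              # delinquents correctly identified
    specificity = true_neg / n_neg              # nondelinquents correctly identified
    overall = (true_pos + true_neg) / (n_pos + n_neg)
    return sensitivity, specificity, overall


# Counts as reported in the text: 12 of 15 delinquent, 18 of 18 nondelinquent.
sens, spec, acc = classification_rates(12, 15, 18, 18)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}, overall={acc:.3f}")
# prints: sensitivity=0.80, specificity=1.00, overall=0.909
```

Separating sensitivity from specificity makes clear that the misclassifications in this sample were all missed delinquents, with no nondelinquent adolescent wrongly identified as delinquent.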

Tobey and Bruhn (1992), using the CEMSS-R and the Early Memory Aggressiveness Potential Score System (EMAPSS; Bruhn & Tobey, 1991), demonstrated criterion validity in the classification of the criminally dangerous. Using a sample of 30 dangerous and 30 nondangerous psychiatric inpatients, the authors found that 73% of the patients were accurately classified using the EM aggressive potential variable. Moreover, the false-positive rate for the EMAPSS was extremely low (6%), providing a high degree of utility in clinical and probate settings.

Assessing underlying personality structure associated with addictions, Chaplin and Orlofsky (1991) utilized Mayman’s psychosexual scoring system and the Manaster-Perryman scoring system to differentiate inpatient alcoholics (n = 45) from substance-free inpatients (n = 45). Alcoholics produced significantly more EMs representing oral and anal organization, whereas nonalcoholics produced significantly more mature themes suggestive of greater psychosexual maturity. Collateral evidence from analyses of the Manaster-Perryman variables strengthened this finding: Alcoholics’ EMs contained significantly less social interest, greater degrees of external locus of control, more negative affect, and lower self-concept than those of the control group. In a follow-up with the alcoholics completing an 18-day treatment program, the authors found that posttreatment EMs reflected an increase of internal locus of control. Hafner (Hafner, Fakouri, Ebrahim, & Chesney, 1988) assessed differences between female alcoholics (n = 27) and substance-free females (n = 30), finding significantly greater disturbance in alcoholics’ perception of relationships, more negative affect, and little capacity for accepting responsibility for their actions. A related study (Hafner, Fakouri, & Labrentz, 1982) found similar disturbance in object relations among male and female alcoholics.

Object Relations and Affect

Mayman (1968) articulated a formal procedure for collecting 16 early memory narratives and developed a prototype scale for assessing psychosexual conflicts, relationship paradigms, coping styles, defense mechanisms, and self-structure and object representations of mother, father, and ego ideal. Mayman and others used this structure in numerous case studies, with the first empirical studies appearing in the early 1970s. Krohn and Mayman (1974) assessed object representations in a psychiatric sample (N = 24) using the Rorschach, early memories, and dreams. First, reliability of the object representation variables for each data source was found to be moderate to high. Object representations manifest in Rorschach and EMs were highly correlated with scores derived from patients’ dreams and with therapist ratings of general psychiatric severity, as well as supervisors’ ratings of object relations. This provided the first evidence that object-relations patterns could be accurately interpreted from EMs and that these prototypic patterns emerged in the psychotherapies of these patients, as seen through the supervisor’s assessment of psychotherapy process.

Fowler (Fowler, Hilsenroth, & Handler, 1995) examined object relations, affect, and cognitive complexity across early memories, MMPI, and the Rorschach using Mayman’s standard queries, in addition to three novel queries. Comparing a clinical sample (n = 60) to a sample of university students (n = 58), the authors found that EMs (including the novel probes) of the clinical group manifested significantly greater negative affect and less complexity of object representations than those of the student sample. Additionally, the clinical sample revealed that negative affect tone in the EMs was correlated with higher MMPI Anger and lower Ego Strength scores. In a second study (Fowler, Hilsenroth, & Handler, 1996), the authors assessed the concurrent, predictive, and discriminant validity of the memory probe of feeding, being fed, or eating by comparing early memories to a variety of Rorschach measures and therapist ratings of patient in-session behavior. High degrees of dependency in early memories were highly correlated with Rorschach Oral Dependency scores as well as therapist ratings of dependent and clinging behavior in the psychotherapy. Equally important, patients who produced counterdependent memories manifested very low scores on the Rorschach Oral Dependency scale and were rated by their therapist as behaving in a hostile, hyperindependent fashion. The EM scores were not correlated with Rorschach scales assessing aggression and general object-relations development, providing adequate discriminant validity. A third study (Fowler, Hilsenroth, & Handler, 1998) demonstrated concurrent validity of the transitional phenomena probe, in which EM scores were positively correlated with the Rorschach Transitional Object Scale and with independent therapist ratings of their patients’ ability to engage in a useful transference, use of language to create humor, and capacity for evocative memory.

Acklin et al. (1991) developed an object-relations scale (EMRSS) with high levels of interrater reliability. These early memory scores were then found to demonstrate a high level of convergent and criterion validity with a number of self-report measures of attachment style, mood, psychiatric symptoms, and personality. The quality of relationships in early memories was associated with meaningful patterns of maladjustment on self-report measures. In addition to relational quality, the level of benevolent or malevolent affect expressed in early memory narratives also holds diagnostic significance. This has repeatedly been supported in studies of patients diagnosed with borderline personality disorder.

Frank and Paris (1981) found differences between borderline personality disordered patients and control subjects on number of negative memories. Similarly, Arnow and Harrison (1991) found that borderline patients had significantly more malevolent and fewer positively toned early memories, compared with neurotic or paranoid schizophrenic patients. Only the neurotic group had a majority of affectively positive memories. This was one of the first EM studies to lend support to the existing theoretical and diagnostic literature on borderline psychopathology that would expect higher levels of negative affect expressed and expected in relationships. This finding received some mixed support from Richman and Sokolove (1992), who found that borderline patients had significantly fewer positive early memories and more negative early memories than a neurotic comparison group. However, in this sample the authors found these differences to be mediated by the patients’ IQ and their level of depression. In still another examination of borderline patients, Nigg and colleagues (Nigg et al., 1991) found that a reported history of sexual abuse, but not a reported history of physical abuse, predicted the presence of extremely malevolent representations in EM narratives, including representations involving deliberate injury as well as affect tone (from malevolent to benevolent). Borderline personality diagnosis also exhibited a trend toward the prediction of average affect tone scores. Furthermore, patients with a diagnosis of borderline personality disorder who reported a history of sexual abuse were particularly likely to report malevolent expectations of others in these early memory narratives (Nigg et al., 1991). In a second study (Nigg, Lohr, Westen, Gold, & Silk, 1992), borderline personality disorder patients, with and without comorbid major depressive disorder, were discriminated from a group of patients with major depressive disorder and from nonclinical groups. The results of this study indicated that both borderline personality disorder groups showed greater maladjustment in their quality of affective representations, produced more extreme malevolent responses and more memories involving deliberate injury, and portrayed potential helpers as less helpful in their early memory narratives.

Assessing Child and Adolescent Psychopathology

Utilization of EMs in assessing psychopathology in child and adolescent populations was considered by some clinicians to yield far less useful information than for adults (see Bruhn, 1981, for the theoretical rationale). Several studies (Hedvig, 1965; Monahan, 1983; Weiland & Steisel, 1958) yielded negative findings for classifying children’s level of psychopathology, giving some credence to this position. Since that early phase, a series of studies have demonstrated criterion validity for early memories in classifying various pathological conditions and personality traits of children and adolescents. Lord (1971), for example, showed that the valence of affect in adolescent boys’ (N = 32) early memories was associated with TAT measures of identity formation, differentiation of body concept, and representations of activity level in human figure drawings. The EMs did not predict self-report measures of vocational goals or sense of effectiveness in coping with life stresses. Kopp and Der’s (1982) assessment of 18 adolescent outpatients demonstrated that level of activity in early memories differentiated acting-out adolescents from passive and withdrawn ones. As noted earlier, Bruhn and Davidow (1983) used EMs to accurately classify delinquent behavior in 32 adolescent males, 15 of whom had been arrested for property crimes. Last and Bruhn (1983) utilized the CEMSS to identify the degree of psychopathology in well-adjusted (n = 31), mildly maladjusted (n = 44), and severely maladjusted (n = 19) children. Exploratory analysis of 28 variables yielded several prediction models with varying degrees of accuracy. Discriminant function analysis revealed that the CEMSS object-relations variable was relatively successful in classifying well-adjusted (65%) and mildly maladjusted (66%) children, but not severely maladjusted children (0%).
Relation to reality, setting type, caretaking relatives, and affect variables combined to produce better results—well-adjusted children were accurately classified 65% of the time, mildly maladjusted 68%, and severely maladjusted 26% of the time.
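The per-group rates reported for Last and Bruhn (1983) can be combined into a single overall hit rate for each model. The sketch below is illustrative only: it assumes the rounded percentages quoted above apply to the full groups of 31, 44, and 19 children, and the function and variable names are hypothetical rather than taken from the original study. A majority-class baseline (assigning every child to the largest group) is included for comparison.

```python
# Group sizes and per-group hit rates as quoted in the text (Last & Bruhn, 1983).
# Assumption: the rounded percentages apply to the full groups listed here.
GROUP_SIZES = {
    "well_adjusted": 31,
    "mildly_maladjusted": 44,
    "severely_maladjusted": 19,
}
OBJECT_RELATIONS_ONLY = {
    "well_adjusted": 0.65,
    "mildly_maladjusted": 0.66,
    "severely_maladjusted": 0.00,
}
COMBINED_VARIABLES = {
    "well_adjusted": 0.65,
    "mildly_maladjusted": 0.68,
    "severely_maladjusted": 0.26,
}


def overall_hit_rate(rates, sizes):
    """Size-weighted proportion of children classified correctly."""
    total = sum(sizes.values())
    return sum(rates[group] * n for group, n in sizes.items()) / total


def majority_baseline(sizes):
    """Hit rate obtained by assigning every child to the largest group."""
    return max(sizes.values()) / sum(sizes.values())


if __name__ == "__main__":
    print(f"object relations only: {overall_hit_rate(OBJECT_RELATIONS_ONLY, GROUP_SIZES):.2f}")
    print(f"combined variables:    {overall_hit_rate(COMBINED_VARIABLES, GROUP_SIZES):.2f}")
    print(f"majority baseline:     {majority_baseline(GROUP_SIZES):.2f}")
```

Under these assumptions the overall hit rates come to roughly 52% for the object-relations variable alone and 59% for the combined variables, against roughly 47% for the majority baseline. The gain from combining variables comes mainly from the severely maladjusted group.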

Kroger (1990) examined the relationship between EM themes and identity status in late adolescence (N = 73). Working from Erik Erikson’s model of adolescent ego and identity development, Kroger predicted that adolescents’ early memories would mirror their stage of identity development. Utilizing Marcia’s (1966) Identity Status Interview to determine subjects’ ego identity status and Gushurst’s (1971) EM scoring procedure, Kroger found a significant main effect for identity status in four of the five EM types. Adolescents who were categorized as having achieved a high level of identity formation and role performance produced more memories in which they were contentedly moving alone or alongside an important authority figure. By contrast, those adolescents who were categorized as being stuck in a moratorium phase of identity development were more likely to recall memories in which they were moving away from significant others. Furthermore, adolescents who were classified as having prematurely foreclosed on their identity (who accepted a prescribed identity from parents or family without considering their interests or needs) were more prone to recall memories in which they sought support, security, and closeness to important authority figures. Individuals with a highly diffuse sense of identity expressed more themes of longing for relatedness in their EMs. In a follow-up study, Kroger (1995) examined EM themes of adolescents (N = 131) at the beginning of their university studies, and then followed 80 subjects for 2 years. Utilizing the same scales as the previous study, she found that those adolescents who had foreclosed on an identity by accepting the expectations of others produced significantly more EMs with themes of seeking security and support from others than other identity status groups.
Furthermore, she found that those with the highest EM scores of seeking security remained in a foreclosed identity status after 2 years, whereas adolescents who had lower seeking security scores at the outset but were identified as foreclosed or moratorium status were more likely to have moved beyond that identity phase and into a more stable identity status.

Applications to Treatment

A multitude of clinicians have presented applications of early memories to the psychotherapeutic endeavor, yet scant empirical evidence exists for their effectiveness in assessing change in personality functioning or for their utility in predicting crucial treatment processes. The major problem lies in the assumption that case studies and clinical experience are sufficient evidence and therefore no further examination is needed. The second problem involves the changing nature of memory. While early theorists tended to assume temporal stability of EMs, only one study reported test-retest reliability (Acklin et al., 1991). Coefficients for 10-week test-retest stability indicate that self-representation (r = .48), representation of others (r = .69), and perception of the environment (r = .41) are differentially affected by naturally occurring mood states at the time of testing. This finding complicates the use of EMs in longitudinal studies and their use in assessing intrapsychic change. The lack of exhaustive categories in the test-retest study creates further uncertainty regarding their use. In light of these limitations, the few studies that utilize EMs to examine therapeutic factors and therapeutic change will be reviewed.

Ryan and Bell (1984) assessed change in object-relations functioning manifest in the EMs of psychotic inpatients (N = 63) collected at admission, at 9 months into treatment, and at 6 months postdischarge. Psychotic patients manifested no discernible improvement in object-relations scores (as measured by the Ryan Object-Relations Scale) at 9 months of treatment, yet did demonstrate a trend toward improvement at discharge. Most notably, psychotic patients demonstrated a significant improvement in object representations at 6-month follow-up after discharge. Specific changes were noted in the complexity of representations and affect tone, from poorly differentiated, disorganized, and empty, to greater organization, albeit somewhat shallow and narcissistic. A subsample of patients (n = 48) was followed to examine object-relations scores in relation to relapse and rehospitalization. This analysis revealed that patients with greater disturbance in object relations reflected in EMs at 6-month follow-up were twice as likely to require later rehospitalization as those who manifested more organized and benevolent object relations.

Ryan and Cicchetti (1985) utilized EMs and other pretreatment projective data to predict the quality of alliance during the first psychotherapy hour. Memories were scored on the Ryan Object-Relations Scale, serving as the sole pretreatment measure of object relations. Approximately 40% of the variance for prediction of the quality of alliance was explained by pretreatment variables, with EMs being the single best predictor of alliance in the first hour (an impressive 30% of variance).

Burrows (1981) applied an EM scoring scheme for assessing relationship to authority figures, then assessed the behavior toward the group consultant of 15 members of a self-analytic group. The author found a robust correlation (r = .65) between EM Authority Figure Orientation rating and the Member-Leader Affect rating of actual behavior across the first week of group experience. While not a formal psychotherapy group, the findings do suggest that transference to the group consultant may be strongly influenced by internal representations expressed in the early memories.
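The Burrows correlation and the Ryan and Cicchetti variance figures are reported on different scales, which makes them hard to compare at a glance. The sketch below converts between the two, under the assumption (not confirmed by the text) that the 30% and 40% figures can be treated as simple squared correlations. The helper names are hypothetical, and the calculation is an illustrative reading rather than a reanalysis of either study.

```python
import math


def variance_explained(r: float) -> float:
    """Proportion of variance shared by two measures correlated at r."""
    return r ** 2


def equivalent_r(variance: float) -> float:
    """Correlation magnitude implied by a proportion of variance explained."""
    return math.sqrt(variance)


if __name__ == "__main__":
    # Burrows (1981): r = .65 between EM authority rating and observed behavior.
    print(f"Burrows shared variance: {variance_explained(0.65):.2f}")
    # Ryan and Cicchetti (1985): 30% of variance for EMs, 40% for all predictors.
    print(f"EMs alone, implied r:      {equivalent_r(0.30):.2f}")
    print(f"all predictors, implied R: {equivalent_r(0.40):.2f}")
```

Read this way, a correlation of .65 corresponds to roughly 42% shared variance, and 30% of variance corresponds to a correlation of roughly .55, so the two findings are of broadly similar magnitude. The Burrows sample of 15 group members is small, so the estimate carries wide uncertainty.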

CONCLUSIONS

The strength of early memories when applied to psychological assessment appears to support the contention of early theorists such as Adler, Langs, and Mayman that EMs reveal some of the complex interactions among self- and object representations, pathological formations, and ego strengths. Insofar as early childhood memories reveal aspects of inner life, they can be used to differentiate clinical groups from nonclinical controls, to distinguish among clinical populations (for example, schizophrenic from depressed patients), and to identify borderline personality disorder. Their use in nonclinical populations has also revealed their concurrent validity for assessing narcissism, degree of depression, identity formation, and transference phenomena in group settings. The illusory mental health studies (Karliner et al., 1996; Shedler et al., 1993) demonstrated that clinical judgments and scoring systems for early memories can substantially improve our understanding of defensive denial and its impact on physiological functioning, as well as demonstrate that EMs can be utilized to supplement self-report measures of distress and psychological disturbance.

While this projective technique has a number of strengths, researchers have yet to provide evidence that EMs, in Saul’s (Saul et al., 1956) words, “reveal, probably more clearly than any other single psychological datum, the central core of each person’s psychodynamics, his chief motivations, forms of neurosis, and emotional problem.” The arena of research most lacking at present is the use of early memories to assess treatment outcome and psychological development across time, both crucial arenas for future examination.

Across this body of research, one glaring shortcoming is evident: investigators have preferred to create new scales to assess an ever-expanding array of psychological functions, rather than create a program of research to replicate and build on previous studies (Malinoski, Lynn, & Sivec, 1998). It appears that the early memory research field is at the same developmental crossroad that Rorschach psychology was in the 1960s. There are many disparate systems for gathering and scoring EMs, and we lack comprehensive and standardized methods for building a substantial body of evidence. Several systems have been proposed to integrate and standardize administration and scoring (Bruhn’s CEMSS being the most comprehensive), but relatively few researchers have closed ranks to join in assessing these systems. Much work is required in order to further validate specific scoring systems, especially studies that replicate findings utilizing existing systems.

One model for how early memory researchers may go about organizing the assessment of early memories can be found in the field of adult attachment research. While not formally recognized as a projective early memory system, assessment of adult attachment styles via the Adult Attachment Interview (AAI; Main, 1991) is remarkably similar to EM research in that subjects report their early childhood memories. These memories are not viewed as accurate reports of childhood attachment style and are not scored for contents. Instead, the narrative structure of the memories is analyzed for coherence, cohesiveness, and plausibility and is then classified according to attachment style of the adult. The rigorous training of researchers and the systematic approach to research demonstrating the predictive validity of the AAI to adult attachment style, psychological distress, and parenting styles may be an ideal strategy for systematizing the early memory field. It may also be fruitful to build links between the AAI system and early memory scoring systems.

REFERENCES

Acklin, M.W., Bibb, J.L., Boyer, P., & Jain, V. (1991). Early memories as expressions of relationship paradigms: A preliminary investigation. Journal of Personality Assessment, 57, 177–192.
Acklin, M.W., Sauer, A., Alexander, G., & Dugoni, B. (1989). Predicting depression using earliest childhood memories. Journal of Personality Assessment, 53, 51–59.
Adler, A. (1931). What life should mean to you. New York: Grosset & Dunlap.
Adler, A. (1937). The significance of early recollections. International Journal of Individual Psychology, 3, 283–287.
Allers, C.T., White, J., & Hornbuckle, D. (1990). Early recollections: Detecting depression in the elderly. Individual Psychology, 46, 61–66.
Allers, C.T., White, J., & Hornbuckle, D. (1992). Early recollections: Detecting depression in college students. Individual Psychology, 48, 324–329.
Arnow, D. & Harrison, R.H. (1991). Affect in early memories of borderline patients. Journal of Personality Assessment, 56, 75–83.
Beck, A.T. (1961). A systematic investigation of depression. Comprehensive Psychiatry, 2, 162–170.
Beck, A.T., Ward, C.H., Mendelson, M., Mock, J., & Erbaugh, J. (1961). An inventory for measuring depression. Archives of General Psychiatry, 4, 561–571.
Bruhn, A.R. (1981). Children’s earliest memories: Their use in clinical practice. Journal of Personality Assessment, 45, 258–262.
Bruhn, A.R. (1985). Using early memories as a projective technique: The cognitive-perceptual method. Journal of Personality Assessment, 49, 587–597.
Bruhn, A.R. (1990). Earliest memories: Theory and application to clinical practice. New York: Praeger.
Bruhn, A.R. (1992a). The early memories procedure: A projective test of autobiographical memory, Part 1. Journal of Personality Assessment, 58, 1–15.
Bruhn, A.R. (1992b). The early memories procedure: A projective test of autobiographical memory, Part 2. Journal of Personality Assessment, 58, 326–346.
Bruhn, A.R., & Davidow, S. (1983). Earliest memories and the dynamics of delinquency. Journal of Personality Assessment, 47, 467–482.
Bruhn, A.R., & Tobey, L.H. (1991). Earliest memory aggressiveness potential score system (EMPASS). Unpublished manual.
Burrows, P.B. (1981). The family-group connection: Early memories as a measure of transference in a group. International Journal of Group Psychotherapy, 31, 3–23.
Caruso, J.C., & Spirrison, C.L. (1994). Early memories, normal personality function, and coping. Journal of Personality Assessment, 63, 517–533.
Chaplin, M.P., & Orlofsky, J.L. (1991). Personality characteristics of male alcoholics as revealed through their early recollections. Individual Psychology, 47, 356–371.
Charry, J.B. (1959). Childhood and teen-age memories in mentally ill and normal groups. Dissertation Abstracts International, 20, 1073.
Fakouri, M.E., Hartung, J.R., & Hafner, J.L. (1985). Early recollections of neurotic depressive patients. Psychological Reports, 57, 783–786.
Fowler, C., Hilsenroth, M.J., & Handler, L. (1995). Early memories: An exploration of theoretically derived queries and their clinical utility. Bulletin of the Menninger Clinic, 59, 79–98.
Fowler, C., Hilsenroth, M.J., & Handler, L. (1996). A multimethod approach to assessing dependency: The Early Memory Dependency Probe. Journal of Personality Assessment, 67, 399–413.
Fowler, C., Hilsenroth, M.J., & Handler, L. (1998). Assessing transitional relatedness with the transitional object early memory probe. Bulletin of the Menninger Clinic, 62, 455–474.
Frank, H., & Paris, J. (1981). Recollections of family experiences in borderline patients. Archives of General Psychiatry, 38, 1031–1034.
Freud, S. (1953). The interpretation of dreams (Standard ed., Vols. 4–5, pp. 1–751). London: Hogarth Press. (Original work published 1900)
Freud, S. (1957). Leonardo da Vinci and a memory of his childhood (Standard ed., Vol. 11, pp. 63–137). London: Hogarth Press. (Original work published 1910)
Freud, S. (1962). Screen memories (Standard ed., Vol. 3, pp. 299–322). London: Hogarth Press. (Original work published 1899)
Freud, S. (1989). The aetiology of hysteria. In P. Gay (Trans.), The Freud reader (pp. 96–111). New York: Norton. (Original work published 1896)
Freud, S. (1989). An autobiographical study. In P. Gay (Trans.), The Freud reader (pp. 3–41). New York: Norton. (Original work published 1925)
Friedman, A. (1952). Early childhood memories of mental patients. Journal of Child Psychiatry, 2, 266–269.
Friedman, A., & Schiffman, H. (1962). Early recollections of schizophrenic and depressed patients. Journal of Individual Psychology, 18, 57–61.
Furlan, P.M. (1984). “Recollection” on the individual psychotherapy of schizophrenia (7th International Symposium: Psychotherapy of schizophrenia, 1981, Heidelberg, W. Germany). Psychiatrica Fennica, 15, 57–61.
Gushurst, R.S. (1971). The reliability and concurrent validity of an idiographic approach to the interpretation of early recollections. Unpublished doctoral dissertation, University of Chicago.
Hafner, J.L., Corrotto, L.V., & Fakouri, M.E. (1980). Early recollections of schizophrenics. Psychological Reports, 46, 408–410.
Hafner, J.L., & Fakouri, M.E. (1978). Early recollections, present crises and future plans in psychotic patients. Psychological Reports, 43, 927–930.
Hafner, J.L., Fakouri, M.E., Ebrahim, M., & Chesney, S.M. (1988). Early recollections of alcoholic women. Journal of Clinical Psychology, 44, 302–306.
Hafner, J.L., Fakouri, M.E., & Labrentz, H.L. (1982). Early memories of “normal” and alcoholic individuals. Journal of Individual Psychology, 38, 238–244.
Hafner, J.L., Fakouri, M.E., Ollendick, T.H., & Corrotto, L.V. (1979). First memories of “normal” and of schizophrenic, paranoid type individuals. Journal of Clinical Psychology, 35, 731–733.
Hankoff, L.D. (1987). The earliest memories of criminals. International Journal of Offender Therapy and Comparative Criminology, 31, 195–201.
Harder, D.W. (1979). The assessment of ambitious-narcissistic character style with three projective tests: The early memories, TAT, and Rorschach. Journal of Personality Assessment, 43, 23–32.
Hedvig, E.B. (1965). Children’s early recollections as a basis for diagnosis. Journal of Individual Psychology, 21, 187–188.
Karliner, R., Westrich, E., Shedler, J., & Mayman, M. (1996). The Adelphi Early Memory Index: Bridging the gap between psychodynamic and scientific psychology. In J. Masling and R. Bornstein (Eds.), Psychoanalytic perspectives on developmental psychology (pp. 43–67). Washington, DC: American Psychological Association.
Kopp, R.R., & Der, D-F. (1982). Level of activity in adolescents’ early recollections: A validity study. Individual Psychology, 38, 213–222.
Kroger, J. (1990). Ego structuralization in late adolescence as seen through early memories and ego identity status. Journal of Adolescence, 13, 65–77.
Kroger, J. (1995). The differentiation of “firm” and “developmental” foreclosure identity statuses: A longitudinal study. Journal of Adolescent Research, 10, 317–337.
Krohn, A., & Mayman, M. (1974). Object representations in dreams and projective tests. Bulletin of the Menninger Clinic, 39, 445–466.
Langs, R.J. (1965a). Earliest memories and personality. Archives of General Psychiatry, 12, 379–390.
Langs, R.J. (1965b). First memories and characterological diagnosis. Journal of Nervous and Mental Disease, 141, 319–320.
Langs, R.J., Rothenberg, M.B., Fishman, J.R., & Reiser, M.F. (1960). A method for clinical and theoretical study of the earliest memory. Archives of General Psychiatry, 3, 523–534.

Last, J.M., & Bruhn, A.R. (1983). The psychodiagnostic value of children’s earliest memories. Journal of Personality Assessment, 47, 597–603.
Last, J.M., & Bruhn, A.R. (1985). Distinguishing child diagnostic types with early memories. Journal of Personality Assessment, 49, 187–192.
Lord, M.M. (1971). Activity and affect in early memories of adolescent boys. Journal of Personality Assessment, 45, 448–642.
Main, M. (1991). Metacognitive knowledge, metacognitive monitoring, and singular (coherent) vs. multiple (incoherent) model of attachment: Findings and directions for future research. In C.M. Parkes, J. Stevenson-Hinde, & P. Harris (Eds.), Attachment across the life cycle (pp. 127–159). London: Routledge.
Malinoski, P., Lynn, S.J., & Sivec, H. (1998). The assessment, validity, and determinants of early memory reports: A critical review. In S.J. Lynn & K.M. McConkey (Eds.), Truth in memory (pp. 109–136). New York: Guilford Press.
Marcia, J.E. (1966). Development and validation of ego identity status. Journal of Personality and Social Psychology, 3, 551–558.
Mayman, M. (1968). Early memories and character structure. Journal of Projective Techniques and Personality Assessment, 32, 303–316.
Monahan, R.T. (1983). Suicidal children’s and adolescents’ responses to Early Memories Test. Journal of Personality Assessment, 47, 257–264.
Nigg, J.T., Lohr, N.E., Westen, D., Gold, L.D., & Silk, K.R. (1992). Malevolent object representations in borderline personality disorder and major depression. Journal of Abnormal Psychology, 101, 51–67.
Nigg, J.T., Silk, K.R., Westen, D., Lohr, N.E., Gold, L.D., Goodrich, S., et al. (1991). Object representations in the early memories of sexually abused borderline patients. American Journal of Psychiatry, 148, 864–869.
Plutchik, R., Platman, S.R., & Fieve, R.R. (1970). Stability of the emotional content of early memories in manic-depressive patients. British Journal of Medical Psychology, 43, 177–181.
Quinn, J.R. (1973). Predicting recidivism and type of crime using early recollections of prison inmates. Dissertation Abstracts International, 35 (1-A), 197.
Richman, N.E., & Sokolove, R.L. (1992). The experience of aloneness, object representation, and evocative memory in borderline and neurotic patients. Psychoanalytic Psychology, 9, 77–91.
Ryan, E.R., & Bell, M.D. (1984). Changes in object relations from psychosis to recovery. Journal of Abnormal Psychology, 93, 209–219.
Ryan, E.R., & Cicchetti, D.V. (1985). Predicting quality of alliance in the initial psychotherapy interview. Journal of Nervous and Mental Disease, 173, 717–725.
Saul, L.J., Snyder, T.R., & Sheppard, E. (1956). On earliest memories. Psychoanalytic Quarterly, 25, 228–237.
Saunders, L.M.I., & Norcross, J.C. (1988). Earliest childhood memories: Relationship to ordinal position, family functioning, and psychiatric symptomatology. Individual Psychology, 44, 95–105.
Shedler, J., Mayman, M., & Manis, M. (1993). The illusion of mental health. American Psychologist, 48, 1117–1131.
Shulman, D.G., & Ferguson, G.R. (1988). Two methods of assessing narcissism: Comparison of the narcissism-projective (N-P) and the narcissistic personality inventory. Journal of Clinical Psychology, 44, 857–866.
Shulman, D.G., McCarthy, E.C., & Ferguson, G.R. (1988). The projective assessment of narcissism: Development, reliability, and validity of the N-P. Psychoanalytic Psychology, 5, 285–297.
Tibbals, C.J. (1992). The value of early memories in assessing narcissism. Dissertation Abstracts International, 52 (8-B), 4483.
Tobey, L.H., & Bruhn, A.R. (1992). Early memories and the criminally dangerous. Journal of Personality Assessment, 59, 137–152.
Weiland, J.H., & Steisel, I. (1958). An analysis of manifest content of the earliest memories of childhood. Journal of Genetic Psychology, 92, 1–52.

CHAPTER 32 The Adult Attachment Projective: Measuring Individual Differences in Attachment Security Using Projective Methodology

CAROL GEORGE AND MALCOLM WEST

ATTACHMENT THEORY AND DEFENSE 432 THE ADULT ATTACHMENT PROJECTIVE 434 Validation of the AAP 435 THE AAP CLASSIFICATION SYSTEM 436 Defensive Processes 436 Discourse and Story Content 439 ASSIGNING ATTACHMENT STATUS USING THE AAP 441 Secure Adult Attachment 441 Dismissing Adult Attachment 442 Preoccupied Attachment 443 Unresolved Attachment 444 Summary 445 REFERENCES 446

The projective tradition of personality assessment has long emphasized the idea that meaning in the content of an individual’s response is revealed in the ways in which underlying needs are transformed by defensive operations. Thus, Rapaport (1952) and Schafer (1954), writing from a psychoanalytic ego psychology viewpoint, gave to Rorschach test interpretation an illuminating analysis of individual differences in defensive style. The recent contributions of Lerner, Albert, and Walsh (1987) and Cooper and his colleagues (Cooper, Perry, & Arnow, 1988; Cooper, Perry, & O’Connell, 1991) significantly advanced the use of the Rorschach as a technique for assessing defensive operations. Additionally, as summarized by Cramer (1999), contemporary approaches to the interpretation of the Thematic Apperception Test have introduced coding systems that devote particular attention to defense mechanisms.

A major feature of attachment theory is Bowlby’s (1980) discussion of the conditions that lead individuals to defend attachment experiences from conscious awareness. Thus, defensive exclusion is one of the key attachment concepts. Bowlby, while acknowledging the influence of the Freudian mechanisms of defense upon his thinking, used an information-processing model to redefine defense. Defense, in the Bowlbian sense, refers to the process of defensive exclusion whereby attachment experiences and feelings that should be attended to as information instead are treated as unintelligible or unintegrated noise that is filtered and transformed prior to gaining access to conscious thought. This characterization of defense brings attachment theory into a close relationship with the psychoanalytic perspective on personality assessment, according to which the play of defensive operations needs to be integrated into the evaluation of the individual’s inner experience of thoughts and feelings about attachment.

The Adult Attachment Interview (AAI; George, Kaplan, & Main, 1984/1985/1996) was the first form of attachment assessment to examine this inner experience of attachment, or, following Main (1995), “the state of mind with respect to attachment.” The AAI is a clinical-style interview that leads individuals through a discussion of their childhood attachment experiences. Inferences regarding individuals’ current states of mind regarding attachment are drawn from variations in discourse coherence that emerge during the interview. Each pattern of adult attachment represents a particular pattern of thinking, speaking, and feeling in regard to attachment experiences. The hallmark of secure attachment, designated “autonomous” by Main (1995), is an unrestricted, free-flowing style of discourse. The patterns of insecure attachment (dismissing, preoccupied, and unresolved) derive from discussion of attachment experiences that is unintegrated; specifically, discourse that is restricted, diverted, or uncontrolled. Although the AAI system for identifying these patterns of adult attachment was not concerned specifically with defensive exclusion, varying forms of its expression may be inferred from the derivation of the AAI attachment groups. When viewed from this perspective, the AAI groups spread out over a continuum of defensive exclusion, from the relative absence of defense (i.e., secure) toward one end to defensive distortion of attachment information (i.e., insecure) toward the other.

Historically, child attachment researchers were the first to link Bowlby’s concept of defensive exclusion to specific patterns of attachment (Cassidy & Kobak, 1988; George & Solomon, 1996, 1999; Solomon, George, & De Jong, 1995). George and Solomon’s investigations established explicit and systematic definitions of forms of defensive exclusion that differentiate child attachment classification groups and maternal states of mind regarding caregiving. In this chapter, we extend the work of these researchers by focusing on individual differences in defensive patterns as they are manifested in the projective assessment of adult attachment. We describe here the Adult Attachment Projective (AAP), a new assessment methodology that, as the name denotes, uses adults’ story responses to pictures of hypothetical attachment situations to evaluate their “states of mind” or mental representations of attachment.

This chapter begins with a brief discussion of the attachment concept of defensive exclusion. We next describe the Adult Attachment Projective, providing a summary of the coding system and validation data for this new measure. We then take up again the discussion of defensive exclusion, using story examples from the AAP to illustrate how Bowlby’s conceptualization of defensive exclusion differentiates the four major adult attachment classification groups used in the field today. Insofar as other aspects of the AAP such as coherency of discourse are interwoven with defensive processing, the analysis of defensive exclusion contributes to the consideration of other equally important indications of each classification group. Finally, we present the AAP responses of four individuals to illustrate the defining features of the major classification groups.

ATTACHMENT THEORY AND DEFENSE

Despite its central place in Bowlby’s (1980) third volume of Attachment and Loss, his theory of defensive exclusion has received surprisingly little attention from attachment researchers. As noted above, defensive exclusion, like its psychoanalytic counterpart, repression, refers to those psychological operations that are intended to exclude information from awareness and thereby avoid the painful consequences that would accrue upon conscious awareness of this information.

In defining the role of defense in the development of attachment relationships, Bowlby described two general levels of defensive exclusion that he then used to differentiate patterns of attachment insecurity. He proposed that at one level, perceptual exclusion resulted in the deactivation of the attachment system with behavioral and representational pattern consequences that Bowlby termed compulsive self-sufficiency. At a second level, he suggested that preconscious exclusion led to stopping the processing of information prior to gaining access to conscious thought, thus resulting in the disconnection of some attachment information from awareness. In this case, activation of the attachment system is allowed but accurate interpretation of the meaning of activation disallowed. Bowlby proposed that two patterns of insecure attachment, compulsive caregiving and anxious attachment, resulted from this form of defensive exclusion.

Thus, for all conditions of insecure attachment, the normal operation of the attachment system is excluded defensively. Since deactivating and disconnecting strategies suppress direct expression of attachment memories, feelings, behavior, or thoughts, the concept of defense emphasizes that we must attend to what is substituted in order to differentiate patterns of insecurity. Before discussing insecurity, however, we focus briefly on defining attachment security. Bowlby’s discussion of defense never specifically addressed attachment security. Rather, our understanding of defense in relation to security is best derived from assessments that have been used to define internal working models of secure individuals.

As noted above, one prominent assessment method that has helped to define states of mind related to security is the AAI. The AAI requires individuals to tell their life’s story of attachment “on the spot”; individuals do not have the opportunity to reflect on or rehearse their responses in advance. This makes the AAI an excellent tool by which to observe defensive exclusion, as individuals struggle to complete the interview while protecting themselves, if necessary, from attachment distress activated by the interview questions. It is generally accepted that the coherency of discourse is synonymous with individuals’ “current states of mind with respect to attachment” (Main, 1995). As such, evaluations of the degree to which individuals can construct and tell a life story without obvious blockages, interruptions, interferences, or distortions indicate a good deal about the secure versus insecure organization of their states of mind with regard to attachment.

Like Ainsworth and her colleagues (Ainsworth, Blehar, Waters, & Wall, 1978), who were able to differentiate individual differences in infant attachment status based on patterns of behavior, Main and her colleagues (Main, 1995; Main, Kaplan, & Cassidy, 1985) differentiated individual differences in attachment status in adults based on representational characteristics of discourse in response to the AAI. As defined by Main and Goldwyn (1985/1991/1994), coherence is indicated by adherence to four discourse maxims as explicated by Grice (1975): quality (“be truthful and have evidence for what you say”), quantity (“be succinct, yet complete”), relation (“be relevant”), and manner (“be clear and orderly”). We propose that the varying degrees of coherence evidenced by these maxims as patterns of secure and insecure attachment reflect varying forms of defensive exclusion. We further propose that focusing on these varying forms of defensive exclusion provides a frame of reference for comprehending attachment organization in general and classifying patterns of attachment in particular. Additionally, they will furnish the necessary background for our discussion of how defensive exclusion is exhibited in responses to the Adult Attachment Projective.

According to Main, secure or “autonomous” attachment is defined by specific features of coherence, in particular the ability to recall attachment-related memories and feelings and speak about them in a thoughtful and reflective manner. Evaluating this definition in terms of defense, we have stressed in our work that it is the relative absence of defensive exclusion that makes it possible for secure individuals to elaborate accounts of their childhood attachment experiences clearly, without contradiction, distortion, or distraction (West & George, 1999; for other discussions, see Bretherton & Munholland, 1999, and Solomon et al., 1995).

Individuals who are not secure are by definition incoherent. Insecurity at the representational level is marked by defensive processing that excludes attachment information (including feelings) in the service of protecting the individual from attachment-related anxiety and distress. Thus, as a product of defense, insecure individuals compromise one or more of the elements Grice defined as necessary components of coherence.

Looking carefully at the discourse patterns associated with the insecure attachment groups, we see that typically different forms of incoherency (i.e., coherency errors or violations) are associated with different forms of insecurity. For example, dismissing individuals defend against attachment distress through deactivating strategies (George & Solomon, 1996, 1999; Solomon et al., 1995); that is, they attempt to minimize, avoid, or neutralize difficulties related to attachment experiences (Main, 1990). As a result, deactivating strategies allow dismissing adults to prototypically describe their childhood experiences with attachment figures more positively than can be supported by memories (violating the quality maxim). Defensive maneuvers to deactivate attachment often also mean that attachment as a topic of discussion is closed for them. Their responses to questions requiring them to describe attachment experiences (e.g., describing their relationships with parents or parental responses to injury, illness, or childhood fears) tend to be strikingly unreflective and terse (violating the quantity maxim). Interestingly, despite the fact that dismissing individuals never achieve full integration of attachment experience and affect, they typically do not appear to be bothered by this lack of integration. Quite to the contrary, as the result of deactivating strategies their descriptions of relationships and past caregiving experiences are presented as normal and supportive. For example, parents are described as involved and caring in ways that are applauded by our society. Their mothers are described as making school lunches, assuming leadership roles in child-centered activities (e.g., Brownie leader), and as listening and offering advice about problems at school or with peers. 
Their fathers are described as taking the family on vacations, teaching the individual the pragmatic necessities of life (e.g., gardening, how to work machines), and helping with academic projects (e.g., science projects). Deactivation, however, disrupts integration because these individuals strive for normalcy by editing out attachment from their generalized view of relationships and the self.

In contrast, attachment topics, while open for discussion, are also hyperarousing for preoccupied individuals. As a result, they dwell on the details of memories, frequently emphasizing past or current grievances against attachment figures (Main, 1990, 1995). Defense in this group is characterized by cognitive disconnection, the attempt to separate attachment information from the source of arousal or distress (Bowlby, 1980; George & Solomon, 1996, 1999; Solomon et al., 1995). Disconnection as a defense is less effective than deactivation in preventing or “smoothing out” attachment distress. Based on the style of discourse associated with cognitive disconnection, the disconnecting and sorting processes shown by these individuals during the AAI result in a different form of failed integration of relationships and self. Incoherency among preoccupied individuals is typically revealed by their immersion in lengthy descriptions of childhood experiences (violating the quantity maxim), tangential wandering off topic (violating the relation maxim), and a plethora of long run-on, entangled, and vague thoughts (violating the manner maxim). As a result, disconnection leads to contradiction, confusion, and a literal preoccupation with the issues related to attachment figures and their caregiving behavior.

The AAI identifies one other major insecure group: unresolved attachment. This is a superordinate pattern that occurs in conjunction with the states of mind that characterize the autonomous, preoccupied, or dismissing patterns. Similar to these patterns, unresolved attachment is also incoherent, although it does not adhere quite as clearly to the violations of Grice’s maxims. Mental representations of unresolved attachment occur as sequelae to experiences of attachment-threatening trauma, such as sexual or physical abuse or loss of an attachment figure through death (Main, 1995). Individuals judged unresolved exhibit a particular form of incoherency that appears when discussing the above traumatic events. In particular, individuals show striking lapses in their ability to monitor how they describe the details of these events (e.g., giving years later the minute details of the deceased on her deathbed) or their reasoning about the occurrence of these events (e.g., suggesting that physical abuse was in fact caused by the individual and, thus, deserved). In terms of defensive processing, we suggest that this quality of discourse is captured aptly by Bowlby’s (1980) concept of “segregated systems.” Segregated systems result from a pervasive repressive emphasis occasioned by strong attempts either to deactivate or to cognitively disconnect traumatic attachment information. We further suggest that the lapses in the monitoring of reasoning or discourse described by Main are the consequence of traumatic attachment material that emerges when defensive processes are failing (George & West, 1999, 2001; West & George, 1999). Thus, unresolved attachment means that defense is failing, that segregated systems material is consequently emerging, and that the individual is prone to dysregulation such that thought and discourse are likely to be disorganized and disoriented in quality.

THE ADULT ATTACHMENT PROJECTIVE

The AAP is a projective measure comprising a set of eight black-and-white line drawings developed in the projective tradition to contain only sufficient detail to identify the selected event. (Examples of three pictures from the projective set are provided in Figures 32.1, 32.2, and 32.3.) Facial expressions and other details were omitted or drawn ambiguously. The drawings were also developed carefully to avoid gender and racial bias.

The scenes in the AAP projective set were selected to capture three core features of attachment as defined by Bowlby (1969/1982). The first feature is activation of the attachment system. Drawing on the characteristics of behavioral systems described by ethologists, Bowlby stressed that the valid assessment of the attachment system depended on observing individuals under conditions that threatened or compromised physical or psychological safety. Therefore, in developing the projective set, we created pictures that depicted situations that were, according to attachment theory, likely to elicit attachment distress, such as separation, solitude, fear, and death.

Figure 32.1 AAP projective picture: Bench.

The second feature is the availability of an attachment figure. According to attachment theory, it is only the prompt and effective response of an attachment figure that can successfully alleviate attachment distress, resulting in deactivation of the attachment system (Ainsworth, 1964; Ainsworth et al., 1978; Bowlby, 1969/1982) and “felt security” (Sroufe & Fleeson, 1986). For infants and young children, termination of the attachment system requires the physical proximity of and access to attachment figures. For older children, adolescents, and especially adults, physical proximity is increasingly replaced by psychological proximity such that individuals can now appeal to internalized attachment figures (drawing on internal working models or mental representations of attachment figures) when the attachment system is activated. Some AAP scenes portray adult-adult or adult-child dyads, thus depicting physical proximity and the availability of a potential attachment figure. Other AAP scenes portray an adult or a child alone. Because an attachment figure is not present in these pictures, responses that reflect representations of internalized attachment figures may be elicited.

The third feature is Bowlby’s (1969/1982) life-span view of attachment: He proposed that the attachment system, together with the availability of real and internalized attachment figures, was an essential contributor to mental health from infancy through adulthood. We captured this feature in the projective set by including characters that represent a range of ages, from the young child to the elderly.

Similar to other attachment assessments, the AAP stimuli are administered in an order that is designed to gradually increase attachment distress. The AAP order of presentation parallels other methods of assessing attachment, including the Strange Situation (Ainsworth et al., 1978), child attachment assessment techniques using doll play or picture story stems (e.g., Bretherton, Ridgeway, & Cassidy, 1990; Kaplan, 1987; Solomon et al., 1995), and the AAI (George et al., 1984/1985/1996). The AAP begins with a warm-up picture depicting two children playing with a ball. Seven attachment scenes follow: Child at Window (a girl looks out a picture window); Departure (an adult man and woman with suitcases stand facing each other); Bench (a youth sits alone on a bench); Bed (a child and woman sit facing each other at opposite ends of the child’s bed); Ambulance (an older woman and a child watch as a stretcher is being loaded into an ambulance); Cemetery (a man stands at a grave site); and Child in Corner (a child stands askance in a corner with one arm extended outward). (We refer the reader to West and Sheldon-Keller [1994] and George and West [2001] for a discussion of the selection of the specific pictures that now comprise the AAP.)

Although the pictures were drawn as projective stimuli, the method of administration combines projective and interview techniques in the form of a semistructured interview. This technique has strong demonstrated success in adult and child attachment research (e.g., Bretherton et al., 1990; Cassidy, 1988; George et al., 1984/1985/1996; George & Solomon, 1996; Gloger-Tippelt, 1999; Green, Stanley, Smith, & Goldwyn, 2000; Slade, Belsky, Aber, & Phelps, 1999; Solomon et al., 1995; Zeanah & Barton, 1989). In the Adult Attachment Projective, the interviewer begins by asking the individual to describe what is happening in each AAP picture. The individual’s initial response is followed by probes, as needed, to obtain information about what led up to the events of their story, what the characters are thinking and feeling, and what will happen next.

Figure 32.2 AAP projective picture: Bed.

Figure 32.3 AAP projective picture: Cemetery.

Validation of the AAP

Based on Ainsworth’s seminal work (Ainsworth et al., 1978), the last three decades of attachment theory and research have concentrated on the differentiation of individuals in terms of their relative attachment security. Following this tradition, we developed the AAP classification scheme specifically to identify the four main attachment groups that are identified by the “gold standard” measure of adult attachment status, the AAI. The AAI identifies four main groups—secure, dismissing, preoccupied, and unresolved attachment.

We approached the development of the AAP classification scheme in two stages. The initial classification scheme was developed based on 13 AAP transcripts of men and women recruited from the community through newspaper advertisement. Because defensive processes influence both the content and the way in which a story is told, we examined verbatim transcripts of their AAP stories from a number of different aspects, including themes, specific content features, descriptive images, and discourse patterns. Nine of these individuals had also been given the AAI prior to administration of the AAP and classified blind by the first author. Subsequently, guided by attachment theory and research, we developed a set of coding categories for the AAP stories that allowed us to differentiate individuals classified into one of the four AAI attachment groups. We checked our AAP classifications against the AAI and then used our knowledge of the AAI classification to refine the AAP classification system on a case-by-case basis.

The next step was to test our scheme with larger samples. We began with a sample of 25 mothers drawn randomly from an ongoing study of infant risk conducted by Dr. Diane Benoit at the University of Toronto. Dr. Benoit collected AAIs and AAPs on this sample of women, randomly changing the order in which these two measures were administered. Dr. Benoit, a trained AAI judge, classified the AAIs. Dr. Benoit was blind to all information about the mother, including her infant’s status (risk vs. control) and her AAP stories. Three judges (the authors and our colleague, Dr. Odette Pettem) classified the AAP transcripts. We next tested our classification scheme with a sample of 23 women who participated in a large-scale study of depression (West, Rose, Spreng, Verhoef, & Bergman, 1999). The first author did blind AAI classifications. The second author and Dr. Pettem did blind AAP classifications. Recently we have been engaged in a large validity study for the AAP. To date we have completed data collection for a sample of 48 individuals (N = 42 women, 6 men) recruited through community, university, and clinical settings. We have followed the same AAI and AAP classification procedure on this dataset as described for the depression sample. (Note: This study was designed to examine test-retest reliability and any relation of intelligence and social desirability to the AAP. Data on these variables as related to the AAP are not available at this time.)

The results of our work to date demonstrate strong interjudge reliability and agreement between AAI and AAP classifications. Interjudge reliability and agreement between AAP and AAI classifications were calculated using percentage agreement among judges based on the samples described. AAP interjudge reliability for secure versus insecure classifications was .97 (kappa = .68, p < .000); interjudge reliability for the four major attachment groups was .92 (kappa = .86, p < .000). Convergence between AAP and AAI for secure versus insecure classifications was .96 (kappa = .76, p < .000); convergence between AAP and AAI classifications for the four major AAI classification groups was .94 (kappa = .86, p < .000).
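The two statistics reported here, percentage agreement and kappa, can be illustrated with a minimal sketch. The chapter does not say which kappa variant was used, so the two-rater Cohen's kappa below is an assumption. The function names and the ten classifications are hypothetical and are not data from the validation samples.

```python
from collections import Counter

def percent_agreement(ratings_a, ratings_b):
    """Proportion of cases on which two judges assign the same classification."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(ratings_a, ratings_b))
    return matches / len(ratings_a)

def cohens_kappa(ratings_a, ratings_b):
    """Agreement corrected for chance: (p_o - p_e) / (1 - p_e)."""
    n = len(ratings_a)
    p_o = percent_agreement(ratings_a, ratings_b)
    counts_a, counts_b = Counter(ratings_a), Counter(ratings_b)
    # Chance agreement: product of the two judges' marginal proportions per category.
    categories = counts_a.keys() | counts_b.keys()
    p_e = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_o - p_e) / (1 - p_e)

# Hypothetical classifications of ten transcripts by two judges (illustration only).
judge_a = ["secure", "secure", "dismissing", "dismissing", "preoccupied",
           "preoccupied", "unresolved", "unresolved", "secure", "dismissing"]
judge_b = ["secure", "secure", "dismissing", "dismissing", "preoccupied",
           "unresolved", "unresolved", "unresolved", "secure", "preoccupied"]

agreement = percent_agreement(judge_a, judge_b)
kappa = cohens_kappa(judge_a, judge_b)
print(f"agreement = {agreement:.2f}, kappa = {kappa:.2f}")
```

Because kappa subtracts the agreement expected by chance from the judges' marginal distributions, it is lower than the raw percentage agreement whenever any agreement could have arisen by chance.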

THE AAP CLASSIFICATION SYSTEM

Attachment classification using the AAP is based on the analysis of the verbatim transcript of the story responses to the seven attachment pictures. Three existing attachment classification schemes contributed to the initial development of the AAP classification system. The AAI (George et al., 1984/1985/1996; Main & Goldwyn, 1985/1991/1994), the Attachment Doll Play Procedure (Solomon et al., 1995), and the Caregiving Interview (George & Solomon, 1989, 1996) were instrumental to our thinking about coherency and defensive processes. The AAP classification system also includes several new discriminating features derived conceptually from attachment theory. As a result, the AAP classification system comprises a set of coding categories that evaluate three different dimensions of the stories: (1) defensive processes, (2) discourse, and (3) content. In this section we provide an overview of the markers that comprise each of these dimensions.

Defensive Processes

Like psychoanalysis, information-processing models describe how individuals represent (encode) and remember (retrieve from long-term memory storage) attachment-related experience, both at the conscious level (information in short-term memory) and the unconscious level (nonconscious, parallel processing). Unlike proponents of traditional cognitive models, Bowlby expanded the concept of “information” to include emotional information. Upon activation of the attachment system, defensive processes select, exclude, and transform behavior, thought, and emotional appraisals to allow, if possible, termination of the attachment system while preventing undue distress. During the course of administering the AAP, each projective picture increasingly activates the attachment system. The AAP, therefore, provides an excellent framework from which to observe individuals’ defenses “at work” and to identify the kind and pervasiveness of their defensive operations.

As we have seen, Bowlby distinguished three forms of defensive exclusion: deactivation, cognitive disconnection, and segregated systems. Recently, George and Solomon explicated the defining features of each form of defense to distinguish between child attachment groups (Solomon et al., 1995) and the corresponding maternal caregiving representations (George & Solomon, 1996, 1999). Based on this work, we have defined the identifying criteria for deactivation, cognitive disconnection, and segregated systems to differentiate the attachment groups on the AAP.

The task of evaluating defensive processes requires the AAP judge to record the details of each form of defense as expressed in the words, images, and language patterns in the story response to each attachment picture. It is not possible to describe this complex coding process in detail here. We describe instead the general characteristics that define each form of defense and provide examples of these characteristics in Table 32.1.

Deactivation

This form of defensive exclusion functions to diminish, dismiss, devalue, or minimize the importance or influence of attachment and is the form of defense that characterizes dismissing attachment. The goal of deactivation is to shift attention away from events or feelings that arouse the attachment system (similar to avoidance in the Strange Situation or dismissing discourse in the AAI). Deactivation enables the individual to complete the task of telling a story without being distracted by attendant attachment distress. A common form of deactivation is the development of story lines that avoid themes of personal distress; instead, themes emphasize relationships and interactions that are guided by stereotypical social roles, materialism, authority, or achievement. Frequently, characters are evaluated negatively, such as having done the wrong thing or gone against an authority or rules. Deactivation is also seen in story lines that seemingly avoid an attachment theme, emphasizing instead exploration (hitchhiking adventure), affiliation (friends), or romantic interludes (dating).

Cognitive Disconnection

According to Bowlby, cognitive disconnection functions to split attachment information, so to speak, so that distressing information and affect are literally disconnected from their source. George and Solomon (1996) proposed that the foundation of cognitive disconnection is uncertainty that results from the individual continually shifting back and forth in both attachment behavior and thought. In the AAP, cognitive disconnection is clearly inefficient and rarely functions to terminate the arousal of attachment distress (see also Solomon et al., 1995). Cognitive disconnection produces an inability to make decisions about the story line and uncertainty and ambivalence about events. Some individuals are unable to make up their minds as to what is going on in a story and are frequently unable to complete their thoughts. Cognitive disconnection is perhaps most clearly observed when individuals develop two diametrically opposed themes. For example, in Departure the man is sad because he wants the woman to stay and the woman is happy because she wants to leave. In Bed, theme opposition may be seen when the boy is described as either waking up in the morning or getting ready for bed at night.

TABLE 32.1 Defensive Processing Dimensions Coded in the AAP

Defense Variables	Stimuli Coded	Definition	Some Examples of Evidence in AAP Stories
Deactivation	All	Evidence of deactivation and demobilization.	Negative evaluation—e.g., person is wrong or being disciplined. Rejection—e.g., person is ignored; child requests hug but mother gives medicine instead. Social roles—e.g., a child this age should not act this way; gravestones should not be defaced. Authority—e.g., power (materialism, prestige); personal strength. Achievement—e.g., taking responsibility; problem solving.
Cognitive Disconnection	All	Evidence of uncertainty, ambivalence, and preoccupation.	Uncertainty—e.g., cannot decide who the character is; the story is left unfinished; characters are bored, confused, worried. Withdrawal—e.g., character leaves the scene prematurely; reserve. Withhold—e.g., hides face so as not to show sadness; surrender. Anger—e.g., fight, argument. Busy—parents have no time for the child; bake cookies to distract child from distress. Feisty—e.g., child is naughty, bratty. Entangled—e.g., tease, nag, scold. Glossing over—e.g., “He’ll grow out of it.”
Segregated Systems	All	Evidence of overwhelm or dysregulation by attachment trauma.	Danger—e.g., death, abuse. Failed protection—e.g., abandonment. Helplessness—e.g., overpowered, trapped. Out of control—violence, disintegration. Emptiness/Isolation—e.g., in jail, desperately alone. Dissociation—e.g., speaking to the dead. Intrusion—e.g., references to own loss or abuse.

Segregated Systems

As we noted earlier, Bowlby (1980) proposed that segregated systems were the product of an extreme form of defensive exclusion adopted by individuals who had experienced attachment trauma. The concept of segregated systems is complex. Before describing how a segregated system is identified in the AAP, it is important that we define the concept in more detail.

Bowlby developed the term “segregated system” carefully to capture both the psychoanalytic features of repression and the cognitive theory of mental representation. A segregated system represented to Bowlby the strongest form of repression. The system contained traumatic material that was blocked (thus, segregated) from conscious awareness by strong forms of defensive exclusion (deactivation or cognitive disconnection). According to attachment theory, behavioral systems such as attachment are organized by mental representational structures (internal working models) (Bowlby, 1969/1982, 1973, 1980). Thus, he used the term system here to suggest that this traumatic mental representation was organized; that is, it had its own representational rules, postulates, and appraisals.

Bowlby’s original thinking regarding segregated systems centered on explaining the lack of resolution of the loss of an attachment figure during childhood and seemingly unexplainable behavior subsequently exhibited in adulthood. His concept of lack of resolution has since been expanded to include other traumas, such as abuse or parental abandonment (Ainsworth & Eichberg, 1991; George & Solomon, 1996, 1999; Main et al., 1985; Solomon & George, 1999; Solomon et al., 1995; West & George, 1999).

George and Solomon noted that segregated systems are prone to defensive breakdown; that is, to a state of mental or behavioral dysregulation that results from the undermining or collapse of normative forms of deactivation or cognitive disconnection. Importantly, the failure of defense and the concomitant dysregulation of segregated systems appear to be associated with strong stressors to the attachment system and in most individuals are not a pervasive quality of their behavior or thought (Solomon & George, 1999). The breakdown of defensive processes results in disorganized, dysregulated behavior, or a complete shutdown. Bowlby discussed at length how, during moments of disorganization or dysregulation, an individual’s behavior might appear out of context and even bizarre. He proposed that this behavior resulted from the sudden and ill-organized emergence of attachment memories and the accompanying distress.

The identification of unresolved segregated systems material in AAP stories is the single most important feature for judging unresolved attachment status (see George, West, & Pettem, 1999, for an extensive discussion of the links between unresolved attachment status and attachment disorganization). Segregated systems markers in AAP stories are evaluated in a two-step process.

The first step is to identify the presence of segregated systems material in the story. Following George and Solomon’s work, segregated systems evidence or “markers” include those aspects of a story that connote helplessness, fear, failed protection, or abandonment (see Table 32.1), such as references to dangerous events, being helpless or out of control, or isolation. Some segregated systems markers have a dissociated or eerie quality, a feature that parallels Main’s (1995) link between unresolved attachment and dissociation. Others are manifested in the sudden intrusion of descriptions of the individual’s own traumatic experiences into a story, a feature similar to the intrusions observed in unresolved AAI transcripts.

The second step is to evaluate resolution of segregated systems markers. Resolution indicates that individuals, drawing upon their internal working models of attachment, successfully integrated or contained this material within the context of their stories. We stress once again that integration, as evidenced by resolution at the representational level, is the sole indicator that differentiates organized from disorganized or unresolved attachment status in children and adults (Main, 1995; Solomon & George, 1999; Solomon et al., 1995). AAP stories are considered resolved when the story content demonstrates that characters have drawn on internal resources to understand events or have taken action to protect the self. Other forms of resolution include the use of attachment figures to provide physical comfort or to provide the security needed to explore threatening events internally (see “haven of safety” and “internalized secure base” described in the next section). For example, in Ambulance, the grandmother comforts the child; in Cemetery, the man thinks about the importance of the deceased. Resolution through containment is noted when the individual is protected without appeal to attachment figures (e.g., protective services step in to prevent abuse) or the individual takes steps to change the situation (e.g., tells an abusive parent to “Stop”).

A story is judged unresolved when there is no evidence of integration or containment of segregated material. Typically, the unresolved story is devoid of events or people that provide comfort, protection, or help, or the character continues to be “haunted” or threatened by feelings of abandonment, fear, helplessness, and vulnerability.

In some instances, unresolved segregated systems are indicated by a total shutdown response (constriction). In this form, the individual is profoundly unable or refuses to engage in telling a story about one or more AAP pictures. The individual may, for example, pass the picture back to the administrator, recoiling from it as if the attachment stimulus is upsetting, dangerous, or personally threatening. It should be noted in this regard that an analogous form of constricted response to a projective stimulus is characteristic of some disorganized children, a child attachment group linked empirically to unresolved adult attachment (George et al., 1999; Lyons-Ruth & Jacobvitz, 1999; Main et al., 1985; Solomon et al., 1995).
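The two-step evaluation described above can be summarized as a small decision sketch. It records only the logic by which a judge's observations lead to a story-level outcome. Identifying markers and deciding whether they are integrated or contained remains a trained judge's qualitative judgment. The class, field, and function names are hypothetical and are not part of the AAP coding system.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class StoryCoding:
    """A judge's observations for one AAP story (hypothetical record layout)."""
    picture: str
    markers: List[str] = field(default_factory=list)  # step 1: segregated systems markers noted
    resolved: bool = False     # step 2: judge found integration or containment of the material
    constricted: bool = False  # individual could not or would not tell a story

def segregated_systems_status(coding: StoryCoding) -> str:
    """Map a judge's observations to a story-level outcome."""
    if coding.constricted:
        # A total shutdown response is itself treated as unresolved.
        return "unresolved"
    if not coding.markers:
        return "no segregated systems material"
    return "resolved" if coding.resolved else "unresolved"

# Hypothetical codings for illustration only.
examples = [
    StoryCoding("Ambulance", markers=["danger"], resolved=True),
    StoryCoding("Cemetery", markers=["dissociation"]),
    StoryCoding("Corner", constricted=True),
    StoryCoding("Bench"),
]
for coding in examples:
    print(coding.picture, "->", segregated_systems_status(coding))
```

Keeping the judge's resolution decision as an input, rather than deriving it from the markers, reflects the point that resolution is evaluated from story content and cannot be inferred from the mere presence of markers.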

So far in this chapter we have explored Bowlby’s concept of defensive exclusion as central to the regulation (activation and termination) of the attachment system. In terms of measuring attachment, defense is certainly related to attachment status. However, the identification of specific forms of defense is not sufficient to differentiate secure from insecure attachment patterns.

Discourse and Story Content

In addition to defensive processing, a complete evaluation of the AAP stories requires us to examine story discourse (language patterns related to how stories are told) and content (features of the characters and the plot) for each attachment picture. Of course, these features of a projective story are inextricably intertwined with defensive processing; however, the identification of the specific qualities of these features is essential to discriminating among the four attachment groups. Evidence for the defenses we just described tells us only how the attachment system is regulated, not the quality of its organization. Indeed, the features that are used to evaluate resolution of segregated systems, for example, are content features of the story.

Two aspects of discourse are evaluated: story coherence and references, during the telling of a story, to the individual’s personal experience. Coherence, as already described in detail in the introduction, evaluates the degree to which the story is logically connected, consistent, clearly articulated, and intelligible. Each attachment story is judged as high, moderate, or low in coherence based on the qualitative synthesis of the features of quality, quantity, relation, and manner as defined specifically for the AAP (see Table 32.2).

Personal experience is a particular form of a relation violation that is noted separately. In contrast to interview techniques, the AAP task is never defined as a context for telling about one’s own experience. Probes never ask individuals to connect events portrayed in the picture with their own life events. According to attachment theory, individuals whose internal working models of attachment are maximally balanced and flexible (i.e., secure) maintain self-other boundaries. By contrast, representational merging (i.e., the inability to keep the self and other separate) has been shown to be a defining feature of attachment disorganization (George & Solomon, 1996, 1999; Solomon & George, 1999) as well as a characteristic of a preoccupation with attachment. Thus, the personal experience marker tells us the degree to which the individual maintains boundaries between the self and the fictional character(s) in response to the pictures; the more stories in which personal experiences are present, the more preoccupied and potentially overwhelmed the individual is with his or her own attachment stress. Our evaluations of this dimension simply note whether or not reference(s) to personal experience is present in the story.

TABLE 32.2 AAP Coherency and Content Dimensions

Dimensions	Stimuli Coded	Definition	Rating Summary
Discourse Dimensions
Coherency	All	Degree of organization and integration in the story as a whole. Quality: The degree to which there is a basic plot with specific details to understand the basics: who, what, why, what happens next. Quantity: The degree to which the response is sufficient to tell a story. Relation: The degree to which the response is relevant to the story. Manner: The degree to which language is clear.	3-point rating scale combining quality, quantity, relation, manner.
Personal Experience	All	A particular form of relation violation in which the response includes reference to one’s own life experience.	Present; absent.
Content Dimensions
Agency of Self	Alone	Designates degree to which story character is portrayed as integrated and capable of action.	Internalized secure base, haven of safety; capacity to act; no agency.
Connectedness	Alone	Expression of desire to interact with others.	Clear signs of a relationship in the story. Relationship not possible (e.g., someone walks away, someone is dead); engaged in own activity.
Synchrony	Dyadic	Characters’ interactions are reciprocal and mutually engaging.	Mutual, reciprocal engagement; failed reciprocity; no relationship is acknowledged in the story (story told as if one of the characters is alone).

We developed a set of content dimensions to evaluate the portrayal of relationships in story events. Two content dimensions are coded for stimuli that depict characters as alone: agency of self and connectedness. Connectedness is coded only for the Window and Bench alone pictures, as this feature of relationships is compromised in scenes of death (Cemetery) or potential abuse (Corner). Only one content dimension is coded for stories that depict characters in dyads. This dimension is called synchrony.
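The assignment of content dimensions to pictures can be summarized as a small lookup. The following is only an illustrative sketch: the picture names and the alone/dyadic grouping are taken from the chapter text, while the names `PICTURES`, `CONNECTEDNESS_PICTURES`, and `content_dimensions` are hypothetical and are not part of the AAP coding system.

```python
from typing import Dict, List

# Picture names and their alone/dyadic grouping, as named in the chapter.
# The dictionary and function names are illustrative assumptions.
PICTURES: Dict[str, str] = {
    "Window": "alone",
    "Bench": "alone",
    "Cemetery": "alone",
    "Corner": "alone",
    "Bed": "dyadic",
    "Departure": "dyadic",
    "Ambulance": "dyadic",
}

# Connectedness is coded only for these two alone pictures.
CONNECTEDNESS_PICTURES = {"Window", "Bench"}


def content_dimensions(picture: str) -> List[str]:
    """Return the content dimensions coded for a given picture."""
    kind = PICTURES[picture]  # raises KeyError for an unknown picture name
    if kind == "dyadic":
        # Dyadic pictures are coded on a single content dimension.
        return ["synchrony"]
    # Alone pictures are always coded for agency of self.
    dims = ["agency of self"]
    if picture in CONNECTEDNESS_PICTURES:
        dims.append("connectedness")
    return dims
```

For example, `content_dimensions("Cemetery")` yields only agency of self, because connectedness is not coded for scenes of death or potential abuse.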

Following attachment theory, we developed agency of self to evaluate the story character’s ability to draw on internal or external resources in order to resolve personal stress or threat (see Table 32.2). This capacity is present when the character is depicted as distressed and subsequently resolves this distress either by appealing to an attachment figure as a haven of safety or by drawing on his or her own internal resources. We term this latter phenomenon internalized secure base.

The concept of the attachment figure providing protection and safety upon activation of the attachment system is central to attachment theory (Bowlby, 1969/1982). Bretherton (1985) used the term haven of safety to refer to this phenomenon, and we incorporated her term in the AAP. Haven of safety is coded when the story identifies events in which the character’s problem or distress results in a successful appeal to an attachment figure. Typically, these types of events are seen in stories in which the individual has specified the character as a child, as, for example, in the Child at Window picture.

Other characters, particularly adult ones, are depicted as drawing upon their own internal resources instead of appealing directly to attachment figures in response to activation of their attachment systems. We thus included a second form of agency of self, internalized secure base, to capture this internal capacity. In contrast to haven of safety, this form of agency is seen when the story character is portrayed as engaging in some form of self-reflection and/or using solitude to explore feelings and experiences. Internalized secure base is a new concept that has emerged from our work with the AAP and is central to attachment security. We pause briefly, therefore, to clarify how this concept fits within the framework of attachment theory.

The secure base phenomenon in early childhood is wholly dependent on the physical proximity and availability of the attachment figure; the attachment figure literally becomes the child’s secure base. In the developmental phase of the attachment relationship that Bowlby (1969/1982) called a goal-corrected partnership, the emerging ability of the child to form enduring internalized models of the relationship with the caregiver especially takes hold. Increasingly, mental representations of the attachment relationship have the capacity to supplement actual interactions with the caregiver; for secure children, separations are less likely to be threatening because representations of attachment figures allow the child to maintain secure models of them even in their physical absence. Over time, a more highly differentiated internal representational capacity emerges such that the older child’s sense of security is maintained not by seeking physical proximity to the attachment figure (except in times of high activation of the attachment system) but by reference to the internal working model of the attachment figure. In an essential way, the secure base effect in adults is demonstrated in the absence of the attachment figure; that is, maintenance of proximity to the attachment figure becomes almost exclusively an internalized representational process. Further, internalization of the attachment relationship informs and shapes mental representations of the self (Bowlby, 1969/1982; Sroufe & Fleeson, 1986), allowing the individual to not only explore the external world but to also explore the internal world of the self. We thus use the concept of internalized secure base to refer to that state in which the sense of security and integrity of self are derived largely from the individual’s internal relationship to the attachment figure.

There is one further elaboration with regard to the effect of internalized secure base, a feature similar to that which has been emphasized by Fonagy and Target (1997) in their discussion of reflective self-capacity. Because adults predominantly maintain proximity to their attachment figure by reference to an internal working model of this person, it becomes possible to use solitude for self-exploration. Just as the young child uses the caregiver as a secure base from which to initiate exploration, the presence of an internalized secure base provides the foundation for self-reflection. Thus, on the basis of the foregoing considerations, we define internalized secure base as story content in which characters are depicted as having entered and actively explored their internal working models of attachment.

Finally, we have identified a third form of agency of self called capacity to act. In this case, the story character demonstrates that he or she is able to do something constructive in response to stress or difficulty. In other words, capacity to act means that the central story characters can at least take action although they do not access external or internal attachment resources. Importantly, when attachment figures are not available, taking action at least keeps the individual organized. It may be helpful to think of capacity to act in the context of the AAP as a secondary attachment strategy. Main (1990) defined a secondary attachment strategy as one that enabled the child to resolve attachment stress indirectly; that is, in lieu of a direct approach or appeal to the attachment figure. Similarly, in terms of AAP story content, secondary strategies bypass direct appeals to internal working models of attachment; the character is instead described as engaging in some specific behavior or activity, such as going home, going to work, or becoming involved in an activity.

The Window and Bench pictures are also evaluated on the dimension of connectedness (see Table 32.2). Connectedness assesses a character’s desire to be with others. It is a more general evaluation of relationships than agency of self, which refers specifically to attachment relationships. According to ethology, in addition to attachment, the individual establishes other relationships such as friendships (affiliative behavioral system) and intimate adult relationships (sexual behavioral system) over the course of development (Bowlby, 1969/1982; Hinde, 1982). Connectedness, then, designates story content that indicates a character’s desire to be with others, including interactions, for example, with parents, friends, intimate partners, teachers, neighbors, protection authorities (e.g., police), or health professionals. Interestingly, our work to date suggests that individuals who are judged secure most frequently create story lines in which connectedness is depicted to real or internalized attachment figures. This is not the case for individuals judged insecure. For example, dismissing adults often show connectedness in stories that describe distressed characters “hanging out with” friends instead of turning to attachment figures. Notably, preoccupied adults characteristically portray characters as alone; that is, not connected to others in any type of relationship.

Synchrony is the analogous relationship dimension that is coded for dyadic pictures (see Table 32.2). The pictures themselves depict a potential attachment figure in the actual drawing (a mother figure in Bed, an adult partner in Departure, a grandmother figure in Ambulance). Synchrony, then, assesses whether the story content portrays the dyad as participating in a reciprocal, mutually engaging, and satisfying relationship. When a story character is distressed or vulnerable, the evaluation of synchrony indicates how the dyadic partner (by definition, an attachment figure) responds in order to solve a problem or reduce anxiety. An important feature of synchrony is that the actions and feelings of the dyad are coordinated; that is, the story describes characters as engaged in a goal-corrected partnership. For example, in the Ambulance story, content is evaluated as synchronous when the child is described as being upset and the adult is described as responding to the child immediately and appropriately by providing comfort or solace. By contrast, a story that depicts an adult attempting to calm a child who pushes the adult away is not a synchronous relationship. Nonsynchronous relationships also include stories in which the characters are not seen as related and stories told about only one of the characters with no reference to the other character in the picture.

ASSIGNING ATTACHMENT STATUS USING THE AAP

Classification using the AAP requires the judge to examine the pattern of attachment markers or dimensions across the entire set of stories. We describe in this section the general AAP patterns for secure, dismissing, preoccupied, and unresolved attachment. We illustrate the discussion with examples from each attachment group in response to the Bench picture (see Figure 32.1). We emphasize that classification requires coding of the full set of picture responses; it is never based on the individual’s response to only one picture.

Secure Adult Attachment

Secure attachment is characterized at the representational level by flexible and organized thought about attachment situations and relationships (George & Solomon, 1996, 1999; Main et al., 1985; Solomon & George, 1996). Securely attached individuals are confident that they can rely on attachment figures to achieve care, safety, and protection and, when alone, have access to internalized attachment relationships. Because of their ability to acknowledge and cope with distress, secure individuals do not rely excessively on defensive processes to modulate attachment anxiety. As such, their story content and discourse reveal little or no evidence of defensive exclusion. Many secure individuals have experienced attachment trauma and their stories sometimes include segregated systems markers. When these markers do appear, they are clearly and swiftly resolved.

The hallmark of security in the AAP is individuals’ depiction of attachment relationships as remedying the distress that follows upon activation of their attachment system by the projective stimuli. Further, only secure individuals demonstrate internalized secure base; that is, the capacity to use internal resources to resolve attachment stress. Secure individuals also show the importance of relationships more generally in their stories through expressing the desire to be connected to others (connectedness in alone pictures) and descriptions of balanced, reciprocal interaction (synchrony in dyadic pictures). Finally, secure individuals demonstrate moderate to high discourse coherency in the telling of their stories. Attachment security is rarely associated with markers for personal experience, thus demonstrating the ability of these individuals to maintain clear self-other boundaries in response to the pictures.

Many of these qualities of secure attachment are present in the story in Example 1. Italics in the story text in the left column indicate dimensions identified by our coding system for the AAP; annotated explanations of this text are provided in the right column.

The most striking feature of this story is the character’s use of internalized secure base to cope with her distress. In terms of story content, the girl is described as sitting on the bench gathering her thoughts. Drawing upon her own internal resources, she gets ready to face her problem again. This individual’s story content is relatively undefended. In terms of coding defensive exclusion, the story has only one form of cognitive disconnection (withdrawal to be by herself). It also contains a minor form of deactivation language (“deal with”) that hardly counts as defensive exclusion in the overall scope of the story. Like many of the stories of secure individuals, the strongest evidence of any kind of defensive processing is revealed in the story’s coherency. This story is only moderately coherent. The individual spends a lot of time discussing the girl’s thinking activity, but we only have a general notion of the preceding and following events. The actual manner of discourse would best be described as “windy” as the individual describes the thinking activity using a long run-on sentence.

Dismissing Adult Attachment

Dismissing attachment is characterized by the individual’s attempts to minimize, avoid, or neutralize attachment in an effort to modulate stress (George & Solomon, 1999; Main, 1995; Solomon et al., 1995). Dismissing individuals typically develop stories in which distress is discounted and attachment relationships (real or internalized) are not described as integral or important to remedying the situation. Although their AAP stories may portray characters as having the capacity to act, agency of self in the form of internalized secure base or haven of safety is notably lacking. Connectedness may be directed toward nonattachment figures, such as friends or sexual partners. Reciprocal forms of interaction indicative of synchrony are usually also lacking in their stories. Relationships often are “functional”; that is, these interactions are based on a basic script that fits a particular context. Examples of such scripts include descriptions of behavior that follows cultural rules for how people should act at a train station or when someone is hurt. In other instances, relationship synchrony may be violated by rejection, such as a mother refusing to give a child a hug at bedtime. Further still, their story content may be devoid of relationships entirely, with characters described only as involved in their own activities.

As we have stated, defensive deactivation differentiates dismissing attachment from other insecure groups. George and Solomon (1996) demonstrated that deactivation and cognitive disconnection defensive processing commonly characterize both the avoidant/dismissing and ambivalent/preoccupied attachment groups. What is uniquely characteristic of dismissing individuals in response to the AAP is the predominant use of deactivation in response to a significant number of the stories. We note that in terms of coherency, their coherency scores are often similar to those of secure individuals. Thus, both coherency and the story content markers must be examined in order to place an individual in the dismissing attachment group.

Example 1

This looks like someone who isn’t very happy. Maybe feeling a little, a little sad. Felt like they needed to get away and have some time to themselves so they went for a walk and they found this bench, decided to sit on it and think for a while and maybe feeling um, just trying to reflect on what’s going on in their life and feeling maybe a little overwhelmed or maybe something has happened that they’re saddened by and they need this time to get their—gather their thoughts and they’ll maybe um just sit there for a while and then have a good cry and feel better and be able to get back up and go home and deal with what they need to.

Defense: Cognitive disconnection—withdraw.

Agency of self, internalized secure base. Note the use of solitude.

More agency of self, internalized secure base.

Deactivation language: Weak evidence of deactivation.

Note the character’s resilience as the product of internalized secure base.

Example 2

Um, this is at school, and this person, again has no friends, and or maybe they’re being teased um, and, they’re sad and um, again lonely I guess and, um, it’s recess so that’s why there’s no other kids around cause she’s on the bench by herself while everyone else is at the playground. And maybe um, she doesn’t have friends not necessarily because other people are mean but maybe because she doesn’t she won’t make the effort to make friends. Um, she’s just afraid to. Um, and I guess probably one of the reasons is that everyone else or everyone she will be going in with everyone else and she’ll sit by herself again in class, and nothing will really change.

Defense: Deactivation—negative evaluation of a character.

Defense: Cognitive disconnection—entanglement.

Defense: Deactivation—negative evaluation of a character. The lack of friends is her fault.

Segregated systems marker: Danger.

Agency of self, capacity to act: The girl does not use attachment to resolve the danger but she does have the capacity to go into the classroom and sit down.

The inclusion of reference to personal experience while responding to the projective stimulus generally characterizes insecure attachment, but the presence of a personal experience marker does not clearly differentiate among the insecure attachment groups (dismissing, preoccupied, or unresolved attachment). Based on the work we have completed to date, it appears, however, that dismissing individuals are less likely to refer to personal experience as compared to preoccupied or unresolved individuals. In other words, deactivation appears to help dismissing individuals maintain the boundaries of self and other while they are engaged in the projective task.

Example 2 illustrates many of the features of the dismissing attachment group. The main theme of this story is negative evaluation of the story character, a strong indication of the defensive deactivation of the attachment system. Note that negative evaluation appears twice during the story; there is no doubt that the girl is the source of her own problem. We also see evidence of cognitive disconnection through the suggestion of peer teasing. Teasing stirs up feelings and results in relationship entanglement and mental preoccupation.

As is typical in the stories of dismissing individuals, the girl fails to demonstrate the use of attachment to terminate her distress. We see no use of an attachment figure as a secure base or of an internalized secure base. We do see some agency, as the girl is able to return to the classroom and sit down. Consistent with attachment theory, we see that behavioral action alone in the absence of attachment, however, does not result in personal transformation or change in her anxious state. Further, the individual drives this point home in the story by stating at the end, “nothing will really change.”

Defensive exclusion again affects the story’s coherency. The repeated statements of negative evaluation diminish the quality of the story. The individual also compromises quality by her indecision in the beginning—the girl’s condition is due to not having friends or being teased. The story plot is generally vague and is told in a manner that includes several run-on sentences. Finally, this story has a segregated systems marker in that the girl is described as “scared.” The story is resolved by her behavioral action that keeps her organized and moving forward. She may be afraid but, unlike the unresolved individual, her fear is not paralyzing.

Preoccupied Attachment

Preoccupied attachment is characterized by mental confusion, uncertainty, ambivalence, and preoccupation with attachment events, details, and emotions (particularly anger and sadness). As with the dismissing group, the AAP stories of preoccupied individuals portray nonconnected and nonsynchronous relationships. Unlike the capacity to act commonly seen in the stories of dismissing individuals, preoccupied individuals frequently describe characters as not taking any action at all, leaving them alone and often passive and immobilized. Consistent with these portrayals of agency (more correctly, the lack of agency), characters in the stories of preoccupied individuals are less likely to express the desire to be connected to others and, in response to dyadic pictures, do not demonstrate synchrony.

Example 2 (continued)

Note that without use of an attachment figure or internalized secure base nothing has been transformed.

Example 3

Well someone looks a bit up to it there I suppose, sitting on the bench having a bit of a cry, obviously something traumatizing happened before—sitting there thinking why did this happen to me and, I don’t know I wouldn’t know what happens next, I expect she gets up and walks away.

Passive language: Nonsense or jargon phrase.

Highly exaggerating language, not real trauma.

Defense: Cognitive disconnection—uncertainty. The character asks a question.

Defense: Cognitive disconnection—uncertainty. The individual is uncertain about how to continue the story.

Absence of agency of self.

Cognitive disconnection is the predominant form of defense used by individuals judged preoccupied. These individuals typically display a host of cognitive disconnection markers in any given story, particularly uncertainty and disconnected (i.e., split) story lines. Although some forms of deactivation may be present in one or two of the stories, the presence of deactivating defenses in the responses of preoccupied individuals is minimal. Cognitive disconnection interferes strongly with coherency of thought and discourse. The stories of preoccupied individuals are typically incoherent; contradictory story lines, a plethora of detail, run-on or unfinished sentences, jargon, stumbling, passive language, and an overall empty or vague quality of discourse encumber them. As well, it is often difficult for preoccupied individuals to maintain self-other boundaries, resulting in frequent and often lengthy descriptions of personal experience in their stories.

The Bench story in Example 3 exemplifies many of the features of the preoccupied attachment group. In this story, cognitive disconnection results in a meaningless story characterized by uncertainty and passivity in both the girl on the bench and the individual telling the story. Overall, this story says nothing and, with the exception of noting “a bit of a cry,” is devoid of attachment. Further, the girl’s distress is described in the prototypic manner of the preoccupied individual—vague jargon (“a bit up to it”) and overexaggeration (“traumatizing”). The story content fails to describe clear events that led up to the situation, any real activity while she is sitting on the bench, and the events that follow. In all respects, this story has no beginning, middle, or end. The uncertainty that results from attempts to disconnect events of attachment is also pervasive. The girl doesn’t know why this happened to her. The individual telling the story is “stuck” and doesn’t know what to say next. The more casual reader (i.e., one not trained to use AAP classification markers) might be tempted to suggest agency of self from the story content because the character is described as asking why this was happening to her. Looking at this statement to evaluate agency of self, we see that the question stops short of discovering a solution or transformation (internalized secure base). It also stops short of giving the girl the capacity to act to remedy her distress. She gets up and walks away, leaving the situation unchanged and herself alone with no expressed desire to be with others (lack of connectedness).

The uncertainty and passivity in this story adversely affect coherency, which is judged low. The reader may also note that cognitive disconnection results in a drawing out of this individual’s thoughts, as if she is buying time to figure out what is going on in order to tell a story. As a result, the story itself is essentially one long run-on sentence, a strong manner violation.

Unresolved Attachment

Unresolved segregated systems are the key features of defensive processing that characterize the unresolved attachment group. Unresolved individuals have not reworked and integrated trauma and loss experiences into their current mental representation of attachment. As a result, they are prone to dysregulation and the sudden emergence of segregated material when their attachment systems are activated. They are then “haunted” by feelings of failed protection, abandonment, vulnerability, threat, and extreme mental distress (George & Solomon, 1999; Solomon et al., 1995; West & George, 1999).

It is important to note that overall the other forms of defensive processes found in the stories of unresolved individuals are similar to those of individuals in the other attachment groups (i.e., they reveal similar patterns of deactivation and cognitive disconnection). We also note that the segregated systems markers of unresolved individuals are not necessarily autobiographical. The dysregulated, “unmetabolized” quality of unresolved attachment trauma, combined with other forms of defensive processing, typically results in low coherency and an absence of agency of self, connectedness, and synchrony.

Example 4

Um, again it’s a, well not again it looks to me like a picture of, of absolute despair or isolation, sitting on a bench looks like totally withdrawn I when I first saw it I thought either um, a jail situation or, you almost can maybe be a sauna situation but you wouldn’t sit in that posture in a sauna. Um so I think it’s a negative um, the person looks bare, as if they had everything stripped away from them, um, so to me, and because I’ve done so much third world development it immediately I immediately thought of a third world situation where something has been totally stripped away from the individual, and they are in total despair and anxiety and um, almost withdrawn. And I suppose it could be because of my physio-occupational therapy training it could be a mental patient who’s way back in the olden days had everything taken away from them and they’re in total despair. OK. What might happen next? I almost think they might even lie down and curl up in the fetal position. Anything else? No.

Segregated system marker: Emptiness/isolation.

Defense: Cognitive disconnection—Literal interpretation of the figure’s body posture convinces individual that the girl is in a severely isolating environment such as jail.

Segregated system marker: Continued elaboration of emptiness, despair, and isolation.

Defense: Cognitive disconnection—Anxiety is an entangling emotional state. Also withdrawn.

Personal experience.

Segregated system marker—Individual shifts theme to severe mental disorder and continued elaboration of isolation.

Unresolved: Complete withdrawal into the self.

The Bench story in Example 4 is from the transcript of an individual judged unresolved. This story is an excellent example of Bowlby’s (1980) predictions regarding the effects of “unlocking” unresolved material that has previously been kept segregated from consciousness. Here, the individual became dysregulated and the story told is disorganized and “unmetabolized.” In this story, segregated systems material is demonstrated in the intense, repetitive descriptions of personal emptiness, isolation, and despair. At one point, the individual attempts to get control of her attachment stress by a weak depiction of the girl as being in a sauna. This depiction does not work, however, as the individual is struck by the literal drawing of the figure on the bench. The individual appears resigned to the fact that the girl is helplessly alone (jail) and desperate. The girl’s seclusion on the bench leaves her vulnerable, in danger, abandoned, and unprotected. Attachment despair is never resolved; the dysregulated material is never reorganized, contained, or integrated. Instead, the girl withdraws even further into a helpless fetal position.

Summary

With the projective assessment of adult attachment as our frame of reference, we have described the intricacies of defensive operations as defined by Bowlby, the analysis of story content and discourse coherency, and theoretical and procedural aspects in classifying patterns of attachment. Avowedly, following the nature of defense and mental representation, this presentation of AAP coding and classification principles was necessarily complex and may have overburdened the reader. To supplement the study of AAP interpretation, it will be worthwhile to conclude these discussions by representing the classification process diagrammatically.

The integration of these features of the AAP to assign an attachment classification can be represented as a hierarchically integrated series of decision points (see Figure 32.4). Classification is assigned on the basis of analysis of the coding patterns for the entire set of seven attachment stories. A judge first notes if there is at least one unresolved segregated systems marker. If an unresolved segregated systems marker is present, the case is assigned the unresolved classification. If all segregated systems markers have been resolved, the judge then examines the pattern of codes used to differentiate secure from insecure cases (coherency, agency of self, connectedness, synchrony). If the case does not fit the secure pattern, the judge then proceeds to examine the specific patterns of defensive exclusion in order to differentiate dismissing and preoccupied attachment. Inspection of each decision point emphasizes that defensive processing is an integral aspect of internal working models of attachment and the interpretation of the Adult Attachment Projective.

Figure 32.4 Summary of classification process: AAP decision rules.
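The order of the decision points described above can also be written as a short function. This is a minimal sketch under stated assumptions, not the authors' scoring procedure: the class `CodedCase`, its field names, and the function `classify` are hypothetical, and each field stands for a judgment a trained coder has already made across all seven stories. The chapter gives no numeric thresholds, so none are encoded here.

```python
from dataclasses import dataclass


@dataclass
class CodedCase:
    """Judge's summary across the seven attachment stories (hypothetical structure)."""
    unresolved_marker_present: bool  # at least one unresolved segregated systems marker
    fits_secure_pattern: bool        # judged from coherency, agency of self, connectedness, synchrony
    predominant_defense: str         # "deactivation" or "cognitive disconnection"


def classify(case: CodedCase) -> str:
    """Apply the decision points in the order described in the text."""
    # Step 1: any unresolved segregated systems marker decides the case.
    if case.unresolved_marker_present:
        return "unresolved"
    # Step 2: differentiate secure from insecure cases.
    if case.fits_secure_pattern:
        return "secure"
    # Step 3: differentiate the insecure groups by pattern of defensive exclusion.
    if case.predominant_defense == "deactivation":
        return "dismissing"
    if case.predominant_defense == "cognitive disconnection":
        return "preoccupied"
    raise ValueError("predominant_defense must be 'deactivation' or 'cognitive disconnection'")
```

The ordering matters: a case with an unresolved marker is classified unresolved even if its other codes would otherwise resemble the secure pattern, which is why that check comes first.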

REFERENCES

Ainsworth, M.D.S. (1964). Patterns of attachment behavior shown by the infant in interaction with his mother. Merrill-Palmer Quarterly, 10, 51–58.
Ainsworth, M.D.S., Blehar, M.C., Waters, E., & Wall, S. (1978). Patterns of attachment: A psychological study of the Strange Situation. Hillsdale, NJ: Erlbaum.
Ainsworth, M.D.S., & Eichberg, C. (1991). Effects on infant-mother attachment of mother’s unresolved loss of an attachment figure, or other traumatic experience. In C.M. Parkes, J. Stevenson-Hinde, & P. Marris (Eds.), Attachment across the life cycle (pp. 160–186). New York: Routledge.
Bowlby, J. (1969/1982). Attachment and loss: Volume 1. Attachment. New York: Basic Books.
Bowlby, J. (1973). Attachment and loss: Volume 2. Separation. New York: Basic Books.
Bowlby, J. (1980). Attachment and loss: Volume 3. Loss. New York: Basic Books.
Bretherton, I. (1985). Attachment theory: Retrospect and prospect. In I. Bretherton & E. Waters (Eds.), Growing points in attachment theory and research. Monographs of the Society for Research in Child Development, 50 (1–2, Serial No. 209), 3–35.
Bretherton, I., & Munholland, K.A. (1999). Internal working models in attachment relationships: A construct revisited. In J. Cassidy & P.R. Shaver (Eds.), Handbook of attachment theory, research, and clinical implications (pp. 89–111). New York: Guilford Press.
Bretherton, I., Ridgeway, D., & Cassidy, J. (1990). Assessing internal working models of attachment relationships: An attachment story completion task for 3-year-olds. In M.T. Greenberg, D. Cicchetti, & E.M. Cummings (Eds.), Attachment in the preschool years (pp. 273–308). Chicago: University of Chicago Press.
Cassidy, J. (1988). Child-mother attachment and the self in six-yearolds. Child Development, 59, 121–134.
Cassidy, J., & Kobak, R.R. (1988). Avoidance and its relation to other defensive processes. In J. Belsky & T. Nezworski (Eds.), Clinical implications of attachment (pp. 300–323). Hillsdale, NJ: Erlbaum.
Cooper, S., Perry, J.C., & Arnow, D. (1988). An empirical approach to the study of defense mechanisms: I. Reliability and preliminary validity of the Rorschach Defense scales. Journal of Personality Assessment, 52, 187–203.
Cooper, S., Perry, J.C., O’Connell, M. (1991). The Rorschach Defense scales: II. Longitudinal perspectives. Journal of Personality Assessment, 56, 191–201.
Cramer, P. (1999). Future directions for the Thematic Apperception Test. Journal of Personality Assessment, 72, 740–792.
Fonagy, P., & Target, M. (1997). Attachment and reflective function: Their role in self-organization. Development and Psychopathology, 9, 679–700.
George, C., Kaplan, N., & Main, M. (1984/1985/1996). Attachment interview for adults. Unpublished manuscript, University of California, Berkeley.
George, C., & Solomon, J. (1989). Internal working models of parenting and quality of attachment at age six. Infant Mental Health Journal, 10, 222–237.
George, C., & Solomon, J. (1996). Representational models of relationships: Links between caregiving and attachment. Infant Mental Health Journal, 17, 198–216.
George, C., & Solomon, J. (1999). Attachment and caregiving: The caregiving behavioral system. In J. Cassidy & P.R. Shaver (Eds.), Handbook of attachment theory, research, and clinical implications (pp. 649–670). New York: Guilford Press.
George, C., & West, M. (1999). Developmental vs. social personality models of adult attachment and mental ill health. British Journal of Medical Psychology, 72, 285–303.
George, C., & West, M. (2001). The development and preliminary validation of a new measure of adult attachment: The Adult Attachment Projective. Attachment and Human Development, 3, 55–86.
George, C., West, M., & Pettem, O. (1999). The Adult Attachment Projective: Disorganization of adult attachment at the level of representation. In J. Solomon & C. George (Eds.), Attachment disorganization (pp. 462–507). New York: Guilford Press.
Gloger-Tippelt, G. (1999). Transmission von Bindung bei Muettern und ihren Kindern im Vorschulalter. Praaxis der Kinderpsychologie und Kinderpsychiatrie, 48, 113–128.
Green, J., Stanley, C., Smith, V., & Goldwyn, R. (2000). A new method of evaluating attachment representations in young school-age children: The Manchester Child Attachment Story Task. Attachment and Human Development, 2, 42, 64.
Grice, P. (1975). Logic and conversation. In P. Cole & J.L. Moran (Eds.), Syntax and semantics III: Speech acts (pp. 41–58). New York: Academic Press.
Hinde, R.A. (1982). Ethology. New York: Oxford University Press.
Kaplan, N. (1987). Individual differences in six-year-olds’ thoughts about separation: Predicted from attachment to mother at one year of age. Unpublished doctoral dissertation, University of California, Berkeley.
Lerner, H., Albert, C., & Walsh, M. (1987). The Rorschach assessment of borderline defenses: A concurrent validity study. Journal of Personality Assessment, 51, 334–348.
Lyons-Ruth, K., & Jacobvitz, D. (1999). Attachment disorganization: Unresolved loss, relational violence, and lapses in behav-

ioral and attentional strategies. In J. Cassidy & P.R. Shaver (Eds.), Handbook of attachment theory, research, and clinical implications (pp. 520–554). New York: Guilford Press.

Main, M. (1990). Cross-cultural studies of attachment organization: Recent studies, changing methodologies and the concept of conditional strategies. Human Development, 33, 48–61.
Main, M. (1995). Recent studies in attachment. In S. Goldberg, R. Muir, & J. Kerr (Eds.), Attachment theory: Social, developmental, and clinical perspectives (pp. 467–474). Hillsdale NJ: Analytic Press.
Main, M., & Goldwyn, R. (1985/1991/1994). Adult attachment scoring and classification systems. Unpublished classification manual, University of California, Berkeley.
Main, M., Kaplan, N., & Cassidy, J. (1985). Security in infancy, childhood, and adulthood: A move to the level of representation. In I. Bretherton & E. Waters (Eds.), Growing points in attachment theory and research. Monographs of the Society for Research in Child Development, 50 (1–2, Serial No. 209), 66–104.
Rapaport, D. (1952). Projective techniques and the theory of thinking. Journal of Projective Techniques, 16, 269–275.
Schafer, R. (1954). Psychoanalytic interpretation in Rorschach testing: Theory and application. New York: Grune & Stratton.
Slade, A., Belsky, J., Aber, J.L., & Phelps, J.L. (1999). Mothers’ representations of their relationships with their toddlers: Links to adult attachment and observed mothering. Developmental Psychology, 35, 611–619.
Solomon, J., & George, C. (1996). Defining the caregiving system: Toward a theory of caregiving. Infant Mental Health Journal, 17, 183–197.
Solomon, J., & George, C. (1999). The place of disorganization in attachment theory: Linking classic observations with contemporary findings. In J. Solomon & C. George (Eds.), Attachment disorganization (pp. 3–32). New York: Guilford Press.
Solomon, J., George, C., & De Jong, A. (1995). Children classified as controlling at age six: Evidence of disorganized representational strategies and aggression at home and school. Development and Psychopathology, 7, 447–464.
Sroufe, L.A., & Fleeson, J. (1986). Attachment and the construction of relationships. In W. Hartup & Z. Rubin (Eds.), The nature and development of relationships (pp. 51–71). Hillsdale, NJ: Erlbaum.
West, M.L., & George, C. (1999). Violence in intimate adult relationships: An attachment theory perspective. Attachment and Human Development, 1, 137–156.
West, M.L., & Sheldon-Keller, A.E. (1994). Patterns of relating: An adult attachment perspective. New York: Guilford Press.
West, M.L., Rose, S., Spreng, S., Verhoef, M., & Bergman, J. (1999). Anxious attachment and severity of depressive symptomatology in women. Women and Health, 29, 47–56.
Zeanah, C.H., & Barton, M.L. (1989). Introduction: Internal representations and parent-infant relationships. Infant Mental Health Journal, 10, 135–141.

Table of Contents

CHAPTER 26 Rorschach Assessment: Current Status

CONCEPTUAL BASIS

DEVELOPMENT AND PSYCHOMETRIC CHARACTERISTICS

Intercoder Agreement

Reliability

Validity

Normative Reference Data

UTILITY OF RORSCHACH APPLICATIONS

Clinical Practice

Forensic Practice

Organizational Practice

LEGAL ISSUES

CROSS-CULTURAL CONSIDERATIONS

COMPUTERIZATION OF RESULTS

CURRENT AND FUTURE STATUS

REFERENCES

CHAPTER 27 The Thematic Apperception Test (TAT)

TEST DESCRIPTION

THEORETICAL BASIS

TEST DEVELOPMENT

PSYCHOMETRIC CHARACTERISTICS

RANGE OF APPLICABILITY AND LIMITATIONS

CROSS-CULTURAL FACTORS

ACCOMMODATION FOR POPULATIONS WITH DISABILITIES

LEGAL AND ETHICAL CONSIDERATIONS

COMPUTERIZATION

CURRENT RESEARCH STATUS

USE IN CLINICAL OR ORGANIZATIONAL PRACTICE

FUTURE DEVELOPMENTS

REFERENCES

CHAPTER 28 The Use of Sentence Completion Tests with Adults

HISTORY OF SENTENCE COMPLETION METHODS

ROTTER INCOMPLETE SENTENCES BLANK

Test Description

Theoretical Basis

Test Development

Psychometric Characteristics

Norms

Reliability

Validity

Range of Applicability and Limitations of Sentence Completion Methods

Cross-Cultural and Diversity Factors

Accommodation for Populations With Disabilities

Computer-Based Testing

Current Research Status

Clinical Applications

Future Directions

APPENDIX: LIST OF SENTENCE COMPLETION METHODS FOUND IN THE LITERATURE

REFERENCES

CHAPTER 29 Use of Graphic Techniques in Personality Assessment: Reliability, Validity, and Clinical Utility

INTRODUCTION

THE DRAW-A-PERSON TEST (DAP) AND THE HOUSE-TREE-PERSON DRAWING TEST (H-T-P)

Test Descriptions

DAP

H-T-P

Theoretical Basis

DAP

H-T-P

Test Development

DAP

H-T-P

Psychometric Characteristics

DAP

H-T-P

Range of Applicability and Limitations

DAP

H-T-P

Cross-Cultural Factors

DAP

H-T-P

Accommodation for Populations With Disabilities: DAP and H-T-P

Legal and Ethical Considerations: DAP and H-T-P

Computerization: DAP and H-T-P

Current Research Status

DAP

H-T-P