Use of the TWEAK Test in Screening for Alcoholism/Heavy Drinking in Three Populations

Vol. 17, No. 6
November/December 1993
Arthur W. K. Chan, Edward A. Pristach, John W. Welte, and Marcia Russell
TWEAK is an acronym for Tolerance (T1, number of drinks to feel
high; T2, number of drinks one can hold), Worry about drinking,
Eye-opener (morning drinking), Amnesia (blackouts), and Cut down on
drinking (K/C). In this study, two versions (T1 and T2) of the TWEAK
were part of a questionnaire used to detect alcoholism or heavy
alcohol intake in three populations, namely, alcoholics in treatment,
patients in two outpatient clinics, and the general population. Similar
to the CAGE and the 10-item brief MAST (B-MAST), the TWEAK identified
most known alcoholics, but the TWEAK had a higher sensitivity and
specificity than the CAGE and B-MAST in detecting alcoholism/heavy
drinking in the clinical and general populations. Different cut-off
values for tolerance (T1 and T2) are recommended for screening
different populations.
Key Words: TWEAK, CAGE, Brief-MAST, Clinical Outpatients, Gen-
eral Population.
Alcoholics experience many drinking problems before seeking
professional help. Unfortunately, doctors, who are uniquely placed to
detect problem drinking in their patients, often fail to identify the
majority of them. One possible reason is that doctors may be
reluctant to spend time administering lengthy questionnaires or to
personally conduct a structured interview. Therefore, the availability
of brief and reliable screening instruments to detect alcohol problems
will increase the likelihood of doctors using these tools in their
clinical practice.
Currently, there are several short screening tests that
medical practitioners can use to detect hazardous drinking/probable
alcoholism in their patients. One example is the Cyr and Wartman
2-item questionnaire [“Have you ever had a drinking problem?” “When
was your last drink?” (24-hr cut-off)], which was reported to have a
sensitivity of 91.5% and a specificity of 89.7%. Using this
instrument, Goldberg et al. reported that 35.6% of the
patients in an academic, general medical clinic were
screened positive. More importantly, over four times as
many patients screened this way (10.8% vs. 2.3%) accepted
referrals for counseling, as did patients seen by doctors
providing standard care. Studies of elderly veterans and
women have indicated that this test was less useful in these
populations.
From the Research Institute on Addictions, New York State Office of
Alcoholism and Substance Abuse Services, Buffalo, New York.
Received for publication February 25, 1993; accepted June 1, 1993.
This work was supported in part by a Public Health Service grant.
Reprint requests: Arthur W. K. Chan, Ph.D., Research Institute on
Addictions, 1021 Main Street, Buffalo, NY 14203-1016.
Copyright © 1993 by The Research Society on Alcoholism.
CAGE is an acronym for: Cut down on drinking, Annoyed by criticism
of drinking, feeling Guilty about drinking, and Eye-opener (morning
drinking), with two or more positive responses being indicative of
problem drinking or probable alcoholism. The advantages of the CAGE
are its brevity and high clinical validity. Thus, many studies
have shown that the CAGE identified most alcoholics and
excessive drinkers. On the other hand, some studies
have reported relatively low sensitivity for the CAGE in
other selected populations, such as college students, DWI
offenders, emergency room patients,19 and elderly medical
patients.20 The disadvantages of the CAGE are that it
focuses on lifetime rather than current problems, because
of the phrasing “Have you ever . . .,” and that the questions
about “cutting down” and “feeling guilty” are often answered
“YES” by current light drinkers or abstainers.
These may lead to an overestimation of alcohol-related
problems by the CAGE.21 There have only been two
studies that used the CAGE in a general population survey
of drinking.
The T-ACE questionnaire retained three CAGE items,
namely, A, C, and E. T stands for tolerance to alcohol,
which was defined as needing two or more drinks to make
a female subject feel “high.” When used in the prenatal
detection of risk-drinking in gravid women, it correctly
identified 69% of the risk drinkers. In another study of
screening for risk-drinking in pregnant women,24 the
sensitivity of the T-ACE was 60-89%, and the specificity was
80-86%, depending on which one of the two definitions
of tolerance (T1 or T2) was used: number of drinks needed
to get high (T1 ≥ 3 drinks) or the number of drinks one
could hold (T2 ≥ 5 drinks). Other variations of the T-ACE
and CAGE are the 3-item NET (N, whether one is a
normal drinker; E, eye-opener; T, tolerance) and the 5-item
TWEAK [T, tolerance; W, worry about drinking; E,
eye-opener; A, amnesia (blackouts); and K/C, cut
down].24 When applied to the screening of risk-drinking
in pregnant women, the TWEAK (T1 high or T2 hold) had
a sensitivity of 68-79% and a specificity of 83-93%. In
contrast, the NET (high) had a 100% sensitivity but only
22% specificity. There has been no report dealing with
the use of the T-ACE, NET, or TWEAK in detecting
problem drinking in the general population.
In the present study, the TWEAK is part of a questionnaire
used to detect alcoholism or heavy alcohol intake in
three populations, namely, alcoholics in treatment, patients
in two outpatient clinics in a county hospital, and the
general population. The TWEAK is compared with the CAGE and the
10-item brief MAST.25
Subject Recruitment
Alcoholics admitted to the Alcoholism Treatment Center in a County
Medical Center in Buffalo were randomly recruited for the study by the
staff of the treatment center. To be eligible, they must have had their last
drink no longer than 4 days before the interview. This eligibility
requirement was not expected to affect the generalizability of the screening test
results in this study, but was intended to result in a uniform control
group for the determination of serum levels of carbohydrate-deficient
transferrin (CDT; data to be presented elsewhere); levels of CDT usually
decrease during abstinence. Outpatients from the Primary Health Care
Center and the Family Care Center in the same County Medical Center
were recruited via telephone by the project staff at least several days
before their scheduled appointments at the clinics. These patients mostly
belonged to lower- or lower middle-class households. A random-digit
dial telephone interview was conducted on household occupants residing
in the Buffalo metropolitan area. We used a computer-assisted telephone
interviewing (Sawtooth Software Incorporated, 1991) system for the
random selection and dialing of telephone numbers. The procedures for
oversampling Blacks and heavy drinkers were the same as previously
described.26 Briefly, Blacks were oversampled by using a higher
proportion of phone numbers in telephone districts with high Black populations.
Blacks were oversampled to permit adequate numbers for analysis of
possible ethnic differences in blood chemistry items or in responses to
questionnaire items. Because there were far fewer female heavy drinkers
(those consuming two or more drinks/day) than male drinkers, female
drinkers were oversampled by using biased screening tables such that the
average probability of selecting a female from a household was 6-8 times
higher than that of selecting a male. To produce unbiased samples of the
general population in data analysis, each case was weighted inversely
proportional to its probability of selection, taking into account these
oversampling procedures.
Informed consent was obtained from all three groups of subjects
before their participation in a face-to-face interview, during which a
questionnaire was completed. The age limit was 18-65 years, inclusive.
There were 1,635 subjects: 252 alcoholics in treatment (ALC), 390
clinical outpatients (CL), and 993 general population (GP) subjects. The
demographic characteristics of these subjects are summarized in Table
1. The GP subjects were significantly younger than the CL and ALC
subjects (p < 0.001). The male:female ratio for the ALC was ~3:1, and
that for the CL was ~1:2, but for the GP it was close to 1:1, due to the
oversampling of female drinkers. The distribution of White and Black
subjects was similar in the GP and ALC samples, being 63.4% White in
the former and 64.3% in the latter. In contrast, the CL sample was
composed of 41.5% White and 57.4% Black subjects. Other ethnic groups
(American Indians and Orientals) constituted between 1.1-2.1% of each
of the subject groups. The educational levels of the CL and ALC subjects
were very similar, with 34-39% having some college education; but
60.4% of the GP subjects had some college education. Among the CL
and GP subjects, 14.1% and 11.7%, respectively, were heavy drinkers.
Table 1. Demographic Characteristics of Three Groups of Subjects
Variable               GP              CL              ALC
n                      993             390             252
Age (years)*           34.7 ± 3.8†     41.3 ± 0.61     37.6 ± 0.60
Male                   48.3%           33.6%           73.0%
Female                 51.7%           66.4%           27.0%
White                  63.4%           41.8%           64.3%
Black                  35.0%           57.2%           33.3%
Others                 1.6%            1.0%            2.4%
Years of schooling*    13.2 ± 0.07‡    12.2 ± 0.24     12.1 ± 0.12
* Mean ± SE.
† Significantly different from the other two groups (p < 0.01).
‡ Significantly different from the other two groups (p < 0.001).
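The inverse-probability weighting applied to the oversampled GP sample can be illustrated with a minimal Python sketch. The selection probabilities below are hypothetical, and rescaling the weights so that they sum to the sample size is an assumption made here for convenience, not a detail reported in the study.

```python
def selection_weights(selection_probs):
    """Weight each case inversely to its probability of selection.

    Rescaling so the weights sum to the number of cases is an
    assumption of this sketch, not a step taken from the paper.
    """
    if any(p <= 0 or p > 1 for p in selection_probs):
        raise ValueError("selection probabilities must lie in (0, 1]")
    raw = [1.0 / p for p in selection_probs]
    scale = len(raw) / sum(raw)
    return [w * scale for w in raw]


def weighted_proportion(flags, weights):
    """Weighted share of cases for which the flag is true."""
    return sum(w for f, w in zip(flags, weights) if f) / sum(weights)


# Hypothetical example: the first and last cases were oversampled.
probs = [0.8, 0.1, 0.1, 0.4]
heavy = [True, False, False, True]
weights = selection_weights(probs)
share = weighted_proportion(heavy, weights)
```

In this toy sample half of the cases are heavy drinkers, but because those cases were much more likely to be selected, their weighted share is far lower than one half.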
All subjects received a face-to-face interview by research interviewers
who were trained to administer a standardized questionnaire. The
questionnaire elicited information on demographic characteristics such as
age, sex, race, occupation, and education; beverage-specific
quantity-frequency questions for beer, wine, and liquor in the past year, 5 years
ago, and 10 years ago; age of first drink and time (date) of last drink; use
of nonprescription drugs in the last 2 weeks and in the past year; current
use of prescription drugs; family history of alcoholism; the 4-item CAGE
questionnaire; the 10-item Brief MAST; the TWEAK;24 age of onset
of smoking; quantity and frequency of smoking for last year; lifetime
record of diseases such as hepatitis, cirrhosis, and other liver diseases,
seizures, etc.; recent significant health problems; depression; and
anxiety disorders. Some subjects in each of the three groups (ALC, n = 53;
CL, n = 84; GP, n = 369) were also asked questions pertaining to lifetime
and current (past year) DSM-III-R criteria for alcohol dependence.
The TWEAK contained the following questions:
1. How many drinks does it take before you begin to feel the first effects
of the alcohol?
2. How many drinks does it take before the alcohol makes you fall
asleep or pass out? Or, if you never drink until you pass out, what is
the largest number of drinks you have?
3. Have your friends or relatives Worried or complained about your
drinking in the past year?
4. Do you sometimes take a drink in the morning when you first get
up? (Eye-opener)
5. Are there times when you drink and afterwards you can’t remember
what you said or did? (Amnesia)
6. Do you sometimes feel the need to Cut down on your drinking? (K
or C)
Questions 1 (T1, high) and 2 (T2, hold) were the two versions of tolerance
described in the introduction. For initial data analysis, we used a cut-off
of ≥3 drinks for T1 and ≥5 drinks for T2, giving us two versions of
TWEAK, namely, T1WEAK and T2WEAK. As described in “Results,”
we also experimented with other cut-offs for T1 and T2. As suggested by
Russell et al.,24 both T and W were scored two points, whereas each of
the other items (E, A, K) was scored 1 point. A total score of 3 or more
was considered positive for T1WEAK or T2WEAK.
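The scoring rule just described can be written out as a short Python sketch. The class and function names are hypothetical and chosen only for illustration; the point values (2 for T and W, 1 for E, A, and K), the default tolerance cut-offs (3 drinks for T1, 5 for T2), and the total cut-off of 3 are those stated in the text.

```python
from dataclasses import dataclass


@dataclass
class TweakAnswers:
    drinks_to_feel_effects: float  # question 1 (T1, "high")
    drinks_can_hold: float         # question 2 (T2, "hold")
    worry: bool                    # question 3
    eye_opener: bool               # question 4
    amnesia: bool                  # question 5
    cut_down: bool                 # question 6


def tweak_score(answers, version="T1", tolerance_cutoff=None):
    """Total TWEAK score (0-7) for the chosen tolerance version."""
    if version == "T1":
        cutoff = 3 if tolerance_cutoff is None else tolerance_cutoff
        tolerant = answers.drinks_to_feel_effects >= cutoff
    elif version == "T2":
        cutoff = 5 if tolerance_cutoff is None else tolerance_cutoff
        tolerant = answers.drinks_can_hold >= cutoff
    else:
        raise ValueError("version must be 'T1' or 'T2'")
    # T and W score two points each; E, A, and K score one point each.
    return (2 * int(tolerant) + 2 * int(answers.worry)
            + int(answers.eye_opener) + int(answers.amnesia)
            + int(answers.cut_down))


def tweak_positive(answers, version="T1", tolerance_cutoff=None,
                   total_cutoff=3):
    """True when the total score reaches the total cut-off."""
    return tweak_score(answers, version, tolerance_cutoff) >= total_cutoff


example = TweakAnswers(3, 4, False, True, False, False)
```

For this example respondent, the T1 version scores 3 (tolerance plus eye-opener) and screens positive, whereas the T2 version scores 1 and screens negative, which shows how the two tolerance definitions can disagree.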
Drinker Definitions
Based on their self-reports of drinking for the past year, the clinical
outpatients and general population subjects were stratified into the
following drinker categories: Heavy drinkers were males consuming an
average of six or more drinks/day and females consuming four or more
drinks/day. Individuals drinking less than these amounts were considered
nonheavy drinkers. Abstainers were included as nonheavy drinkers in
the data analysis.
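A minimal Python sketch of this stratification follows. The function name and the string labels for sex are assumptions for illustration; the thresholds (six drinks/day for males, four for females) are those given above, and abstainers fall into the nonheavy category because their average intake is zero.

```python
def is_heavy_drinker(sex, avg_drinks_per_day):
    """Apply the heavy-drinking thresholds stated in the text."""
    thresholds = {"male": 6.0, "female": 4.0}
    if sex not in thresholds:
        raise ValueError("sex must be 'male' or 'female'")
    return avg_drinks_per_day >= thresholds[sex]
```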
Statistical Analysis
Data were analyzed by the Statistical Package for the Social Sciences
(SPSS-X) using one-way ANOVA, χ², multiple range test (Newman-Keuls),
and correlation analysis procedures, where appropriate. The level
of significance was defined at p < 0.05 throughout. Sensitivity and
specificity were computed using standard methodology. A Receiver
Operating Characteristic (ROC) curve was constructed as described by
Hsiao et al.30
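For readers unfamiliar with the two measures, the following Python sketch computes sensitivity and specificity from paired screening results and gold-standard classifications. It uses the usual textbook definitions; the function name and the toy data are illustrative assumptions, not material from the study.

```python
def sensitivity_specificity(test_positive, condition_present):
    """Return (sensitivity, specificity) as fractions between 0 and 1."""
    tp = fn = tn = fp = 0
    for test, truth in zip(test_positive, condition_present):
        if truth and test:
            tp += 1      # true positive
        elif truth:
            fn += 1      # false negative
        elif test:
            fp += 1      # false positive
        else:
            tn += 1      # true negative
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one case and one non-case")
    return tp / (tp + fn), tn / (tn + fp)


# Toy data: six subjects, three of whom meet the gold standard.
screen = [True, True, False, False, True, False]
truth = [True, True, True, False, False, False]
sens, spec = sensitivity_specificity(screen, truth)
```

Here two of the three true cases screen positive and two of the three non-cases screen negative, so both measures equal two thirds.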
Either version of the TWEAK (T1WEAK and
T2WEAK) correctly identified 96.2% of the ALC. In comparison,
the CAGE and B-MAST identified 98.3% and
99.2% of the ALC, respectively. The sensitivities and
specificities of T1WEAK and T2WEAK in the CL and GP
groups are compared with those of the CAGE and B-MAST
in Table 2. Based on the DSM-III-R gold standard
(A), the two versions of TWEAK had higher sensitivity
and specificity than the CAGE or B-MAST, except for the
higher sensitivity for CAGE in CL and the higher specificity
of B-MAST in GP. With heavy drinking as the gold
standard (B), the two versions of TWEAK correctly identified
more heavy drinkers than the CAGE or B-MAST,
and still had the lowest false positive rates except for the
higher specificity of B-MAST in GP. The sensitivity of
T2WEAK in the CL and GP samples (Table 2) approached
that for the ALC and its specificity was reasonably good.
Data shown in Table 2 are based on the criteria of T1 ≥
3 drinks and T2 ≥ 5 drinks as the positive indications of
tolerance. Because it is possible that CL and GP subjects,
as well as males and females, might need different cut-offs
for T1 and T2, we experimented with different values
for T1 and T2 without changing the total cut-off scores for
T1WEAK and T2WEAK (Table 3). As seen in Table 3,
increasing the T1 cut-off stepwise from 2 to 6 only slightly
decreased the sensitivity in CL males, but there was
a gradual increase in specificity as the T1 value increased.
However, in GP males, the sensitivity was highest when
T1 = 2 and lowest when T1 = 6, and the specificity
increased as the T1 value increased at the expense of a
corresponding loss in sensitivity. There was no change in the
100% sensitivity in CL males as T2 was increased from 5
to 8, with only a slight gain in specificity. Similarly,
increasing T2 from 5 to 8 only slightly changed the
sensitivity in GP males, with very little change in specificity.
Therefore, it appears that for males, the T1(3)WEAK
would be the test of choice for CL subjects if the objective
were to maximize sensitivity, but T1(6)WEAK would be
the measure of choice if a high specificity were desired.
The T2(5 or 6)WEAK or T1(2)WEAK would be good for
screening GP subjects if a high sensitivity is desired. In
contrast, the data in Table 3 indicate that for females, the
T1(2)WEAK would have the better sensitivity and specificity
for both CL and GP subjects.
Table 2. Sensitivity and Specificity of the TWEAK, CAGE, and B-MAST
            Gold standard A*               Gold standard B†
            Sensitivity/specificity (%)    Sensitivity/specificity (%)
            CL           GP                CL           GP
T1WEAK      94.4/95.5    88.7/82.2         85.4/83.0    86.1/72.4
T2WEAK      94.4/89.4    98.6/74.5         93.8/80.4    92.1/64.6
CAGE        100/68.2     84.5/71.1         72.9/65.2    75.5/65.3
B-MAST      77.8/80.3    47.9/84.9         72.9/?6.9    55.0/84.4
* DSM-III-R past year alcohol dependence (n = 84 and 369, for CL and GP,
respectively).
† Heavy drinking as defined in text (n = 390 and 992, for CL and GP,
respectively).
Table 3. Gender Differences in Sensitivity and Specificity of TWEAK
            Male                            Female
            Sensitivity/specificity† (%)    Sensitivity/specificity† (%)
Cut-off*    CL           GP                 CL           GP
T1 (2)      100/75.0     95.5/69.9          100/96.0     96.3/81.7
T1 (3)      100/81.3     88.6/74.8          83.3/100     88.9/87.4
T1 (4)      91.7/87.5    70.5/83.7          83.3/100     81.5/91.4
T1 (5)      91.7/87.5    63.6/88.6          83.3/100     81.5/94.9
T1 (6)      91.7/93.8    61.4/90.6          83.3/100     81.5/95.4
T2 (5)      100/68.8     100/67.5           83.3/96.0    96.3/79.4
T2 (6)      100/81.3     100/67.5           83.3/100     96.3/81.7
T2 (7)      100/81.3     97.7/71.5          83.3/100     88.9/84.0
T2 (8)      100/81.3     95.5/72.4          83.3/100     88.9/84.6
* Number in parentheses is the cut-off value for Tolerance (T1 or T2). Total
cut-off score (≥3) remained unchanged for T1WEAK or T2WEAK.
† The gold standard was DSM-III-R past-year alcohol dependence (CL, n = 28
males and 56 females; GP, n = 167 males and 202 females).
Fig. 1. ROC curve for T1WEAK (horizontal axis: False-Positive Rate, 0-100).
Computations for sensitivity and specificity (varying total cut-off scores)
were based on the following group definitions: ALC subjects plus all heavy
drinkers in CL and GP as “true-positives” and all the nonheavy drinkers in
CL and GP as the “true-negatives.”
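The reasoning used to pick a tolerance cut-off from Table 3 can be sketched in Python. The values below are the male columns of Table 3 as read from the printed table; the helper names and the tie-breaking rule (prefer the higher value of the other measure, then the first row listed) are assumptions of this sketch rather than a procedure stated by the authors.

```python
# (sensitivity %, specificity %) for males, as read from Table 3.
CL_MALES_T1 = {
    "T1(2)": (100.0, 75.0),
    "T1(3)": (100.0, 81.3),
    "T1(4)": (91.7, 87.5),
    "T1(5)": (91.7, 87.5),
    "T1(6)": (91.7, 93.8),
}
GP_MALES = {
    "T1(2)": (95.5, 69.9),
    "T1(3)": (88.6, 74.8),
    "T1(4)": (70.5, 83.7),
    "T1(5)": (63.6, 88.6),
    "T1(6)": (61.4, 90.6),
    "T2(5)": (100.0, 67.5),
    "T2(6)": (100.0, 67.5),
    "T2(7)": (97.7, 71.5),
    "T2(8)": (95.5, 72.4),
}


def most_sensitive(rows):
    """Cut-off with the highest sensitivity; ties go to higher specificity."""
    return max(rows, key=lambda k: (rows[k][0], rows[k][1]))


def most_specific(rows):
    """Cut-off with the highest specificity; ties go to higher sensitivity."""
    return max(rows, key=lambda k: (rows[k][1], rows[k][0]))


cl_sens_choice = most_sensitive(CL_MALES_T1)
cl_spec_choice = most_specific(CL_MALES_T1)
gp_sens_choice = most_sensitive(GP_MALES)
```

Under these rules the sketch selects T1(3) and T1(6) for CL males and a T2 cut-off of 5 (tied with 6) for GP males, in line with the choices discussed in the text.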
Figure 1 shows an ROC curve for T1(3)WEAK. “True-positives”
were defined as all the ALC subjects plus the
heavy drinkers in CL and GP, and “true-negatives” consisted
of all the nonheavy drinkers in CL and GP. The
ROC curve plots the true-positive rate (sensitivity) against
the false-positive rate (1 minus specificity), using different
total cut-off scores for T1WEAK. Steep parts of the
ROC curve indicate large gains in sensitivity, with small
loss of specificity, and horizontal parts indicate small gains
in sensitivity for large loss of specificity. It is seen from
Fig. 1 that a total score of 3 or 4 is an efficient cut-off for
the TWEAK. A similar result was obtained (data not
shown) when the “true-positives” included only the heavy
drinkers in CL and GP but not the ALC subjects.
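How the points of such a curve arise from varying the total cut-off score can be shown with a small Python sketch. The scores and case labels below are hypothetical and serve only to illustrate the computation; the maximum score of 7 follows from the item weights given in the Method section.

```python
def roc_points(scores, is_case, max_score=7):
    """Return (cut-off, false-positive rate %, true-positive rate %) tuples.

    A subject screens positive when the total score is >= the cut-off.
    """
    n_pos = sum(1 for c in is_case if c)
    n_neg = len(is_case) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one case and one non-case")
    points = []
    # A cut-off of max_score + 1 classifies nobody as positive.
    for cutoff in range(max_score + 2):
        tp = sum(1 for s, c in zip(scores, is_case) if c and s >= cutoff)
        fp = sum(1 for s, c in zip(scores, is_case) if not c and s >= cutoff)
        points.append((cutoff, 100 * fp / n_neg, 100 * tp / n_pos))
    return points


# Hypothetical total scores for ten subjects, five of them true cases.
scores = [7, 5, 4, 3, 3, 2, 1, 0, 0, 3]
is_case = [True, True, True, True, False, True, False, False, False, False]
points = roc_points(scores, is_case)
```

With these made-up data, a total cut-off of 3 gives a true-positive rate of 80% at a false-positive rate of 40%; plotting all the points from cut-off 0 to 8 traces the curve from the upper right corner to the origin.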
The clinical validity of the two versions of TWEAK has
been confirmed in this study in that either test identified
nearly all of the known alcoholics. Both the T1WEAK and
T2WEAK also have fairly high sensitivity and specificity
in detecting alcohol dependence or heavy alcohol consumption
in CL and GP subjects (Table 2). Phrasing of
the TWEAK questions has been deliberately cast in the
present tense, and in the “Worry” question, a past-year
criterion has been built in. These are aimed to avoid the
major disadvantage of the CAGE and B-MAST, namely,
inability to distinguish between lifetime and current alco-
hol problems because of the phrasing, “Have you ever
. . .” in the questions. It should be emphasized that the
TWEAK is only a screening instrument rather than a
diagnostic instrument. For screening purposes, a high
sensitivity of the test is desired, which the TWEAK can
achieve. The less-than-ideal specificity will inevitably re-
sult in some false-positives that can be ruled out by
supplementary tests, such as physical symptoms, biochem-
ical tests, or a more in-depth diagnostic interview.
On paper, two items of the TWEAK, namely, “cut-
down” and “eye-opener,” (see “Method” for phrasing of
these two TWEAK questions) appear to overlap with the
two similarly worded items in the CAGE. The two CAGE questions
are phrased as follows: “Have you ever felt you
ought to cut down on your drinking?” and “Have you
ever had a drink first thing in the morning to steady your
nerves or get rid of a hangover or for an eye-opener?”
However, results of correlation analysis and logistic regres-
sion analysis indicate that these similarly worded items
are not redundant. Correlation analysis between the
TWEAK and CAGE items in both the GP and CL samples
showed no correlation coefficients >0.62. In other words,
no one TWEAK item explains >37% of the variance in
any CAGE item. A logistic regression analysis of the GP
data in which the CAGE items were entered first followed
by insertion of the TWEAK items showed that the F-to-
enter values of the similarly worded (i.e., “cut-down” and
“eye-opener”) items were usually higher than those of the
nonsimilarly worded items. These data demonstrate that
the similarly worded items in the CAGE and TWEAK are
not redundant. Perhaps the different phrasing of these
similarly worded items contributes to the extra discriminatory
power of the TWEAK items over the CAGE items.
The ROC curve shown in Fig. 1 indicates that a
TWEAK score of 3 or 4 is an efficient cut-off. This is in
agreement with the initial choice of a cut-off score of 3 by
Russell et al.24 in their study of a special clinical popula-
tion, namely, pregnant women.
Based on the results, we recommend T1(3)WEAK or
T1(6)WEAK for CL males, T2(5 or 6)WEAK or
T1(2)WEAK for GP males, and T1(2)WEAK for CL and
GP females. Replications of these findings are necessary
before our recommendations can be extended to, or mod-
ified for use in, other clinical populations or other selected
populations such as college students, DWI offenders,
emergency room patients, etc. Data in Table 3 also suggest
that the TWEAK appears to have a higher specificity for
women than for men. However, it should be noted that
the number of subjects in the CL sample was small, and
this finding needs to be replicated with a larger sample
size. There is also a need to determine whether the
TWEAK will be equally effective if it is administered by
doctors themselves rather than by research personnel.
Plans are underway to conduct some of these studies.
Another challenge is to convince more doctors and health
professionals to adopt the screening of alcohol problems
in their patients as part of their clinical routine.
We thank Janet Berg for her skillful typing. We gratefully acknowledge
the cooperation of Dr. Charles Hershey, Dr. Barbara Majeroni, Dr.
Douglas Moffat, Mitch Lopez, and the staff of the Erie County Medical
Center Primary Health Care Clinic, Family Care Clinic, and Alcoholism
Treatment Center for their help in the recruitment of subjects. The
following individuals provided assistance in the recruitment and inter-
viewing of research subjects: Jorge Antonetti, Kevin Dees, Juan Figueroa,
Carol Marx, Colleen Marx, Florian Penetrante, Maria Penetrante, Sheila
Pilc, Catherine Rebmann, Richard Topolski, and Robin Truesdale.
