Regular Article
Reliability and Validity of the BASIS-24© Mental Health Survey for Whites, African-Americans, and Latinos
Susan V. Eisen, PhD
Mariana Gerena, PhD
Gayatri Ranganathan, MS
David Esch, PhD
Thomas Idiculla, MSW
Abstract
Increasing racial and ethnic diversity calls for mental health assessment instruments that are
appropriate, reliable, and valid for the wide range of cultures that comprise the current US
population. However, most assessment instruments have not been tested on diverse samples. This
study assessed psychometric properties and sensitivity to change of the revised Behavior and
Symptom Identification Scale (BASIS-24©) among the three largest race/ethnicity groups in the
USA: Whites, African-Americans, and Latinos. BASIS-24© assessments were obtained for 2436
inpatients and 2975 outpatients treated at one of 27 mental health and/or substance abuse
programs. Confirmatory factor analysis and several psychometric tests supported the factor
structure, reliability, concurrent validity, and sensitivity of the instrument within each race/
ethnicity group, although discriminant validity may be weaker for African-Americans and Latinos
than for Whites. Further research is needed to test and validate assessment instruments with other
race/ethnicity groups.
Address correspondence to Susan V. Eisen, PhD, Associate Professor, Health Services Department, Boston University
School of Public Health, Boston, MA, USA and a health research scientist at the Center for Health Quality, Outcomes &
Economic Research (CHQOER), Edith Nourse Rogers Memorial Veterans Hospital, 200 Springs Road (152), Bedford, MA
01730, USA. Phone: +1-781-6872858. Fax: +1-781-6873106. E-mail: seisen@bu.edu.
Mariana Gerena, PhD, is a senior research scientist at the Institute on Urban Health Research, 503 Stearns, Northeastern
University, 360 Huntington Avenue, Boston, MA 02115, USA. Phone: +1-617-3735177. Fax: +1-617-3737309. E-mail:
gerena@neu.edu.
Gayatri Ranganathan, MS, is a biostatistician at MetaWorks Inc., 10 President’s Landing, Medford, MA 02155, USA.
Phone: +1-781-3950700. Fax: +1-781-3957336. E-mail: gayatrir@bu.edu.
David Esch, PhD, is an analyst at MDT Advisers, 125 Cambridge Park Drive, Cambridge, MA 02140-2329, USA.
Phone: +1-617-2342200. Fax: +1-617-2342210. E-mail: esch@mdtai.com.
Thomas Idiculla, MSW, is a doctoral candidate at the Boston College Graduate School of Social Work and Quality
Indicators Data Manager, Department of Mental Health Services Evaluation, McLean Hospital, 115 Mill Street, Belmont,
MA 02478, USA. Phone: +1-617-8552432. Fax: +1-617-8552948. E-mail: idicult@mcleanpo.mclean.org.
Journal of Behavioral Health Services & Research, 2006. © 2006 National Council for Community Behavioral
Healthcare.
304 The Journal of Behavioral Health Services & Research 33:3 July 2006

Introduction
Patient-reported outcome measures have long been recognized in the research arena as
important indicators of treatment efficacy and effectiveness[1] and, more recently, have been
incorporated into routine outcomes monitoring for purposes of enhancing quality of clinical care,
continuous quality improvement (CQI), or performance measurement.[2–8] Accrediting organizations
including the Joint Commission on Accreditation of Healthcare Organizations and the
National Committee for Quality Assurance require monitoring of clinical outcomes as part of
accreditation requirements.[9,10] These national efforts require use of standardized instruments
across treatment facilities, so that provider and health-care system performance can be assessed
using the same metric.[11]
A wide variety of standardized self-report measures have been developed for these
purposes.[12–14] However, many have not been tested or validated on the major racial/ethnic
minority groups that comprise the US population, and previous research suggests that
standardized measures may be problematic when applied to diverse groups.[15–17] For example,
in evaluating the 36-item Short-Form Health Survey (SF-36®) among 24 patient subgroups
varying in sociodemographic characteristics, McHorney et al. reported that psychometric and data
quality criteria were well met in most patient subgroups. However, groups characterized by older
age, lower education, poverty status, or African-American race had higher rates of missing data,
lower item-scale correlations, poorer item discrimination, and/or lower scaling success rates.[15] In
an effort to confirm the factor structure of the 32-item Behavior and Symptom Identification Scale
(BASIS-32®), Chow et al. reported acceptable levels of cross-ethnic equivalence for Asian- and
African-Americans, but marginal fit for Latinos.[16]
The consequences of using inappropriate measures are not trivial. Because expression of
psychiatric symptoms and treatment outcomes may vary across racial/ethnic groups, the use of
inappropriate outcome measures may result in under- or overestimation of prevalence rates for
specific psychiatric disorders, as well as erroneous estimates of treatment effects.[18–21] Clearly,
increasing diversity calls for instruments that are appropriate, reliable, and valid for the wide range
of cultures that comprise the current US population. Testing and validation of assessment
instruments among diverse groups is an important step toward ensuring accurate and meaningful
understanding of mental health treatment outcomes. Consistent with this need, national and state
organizations, including the Substance Abuse and Mental Health Services Administration, National
Alliance for the Mentally Ill, National Committee for Quality Assurance, and California State
Department of Mental Health, have recommended development of standards for behavioral health
service competence that include guidelines for assessment measures, which are important for
accurately identifying disparities in mental health treatment and outcomes for diverse groups.[21]
The purpose of this study is to evaluate factor structure, reliability, validity, and sensitivity to
change of the revised BASIS-24© among the three largest race/ethnicity groups in the USA:
Whites, African-Americans, and Latinos. BASIS-24©, a revised version of the BASIS-32®, was
designed to increase applicability of the instrument across diverse populations and levels of care,
improve reliability and validity, and reduce redundancy.[22,23] The 24 items measure six symptom/
problem domains: depression/functioning (6 items), interpersonal relationships (5 items),
psychotic symptoms (4 items), alcohol/drug use (4 items), emotional lability (3 items), and
self-harm (2 items). Development of the revised instrument, additional background about the
study, and confirmation of the factor structure, reliability, and validity for inpatients and
outpatients across racial/ethnic groups have been previously reported.[23,24] This paper extends the
previously reported work by examining psychometric properties of the instrument and sensitivity
to change within the three largest racial/ethnic groups in the USA.
Reliability and Validity of the BASIS-24© Mental Health Survey EISEN et al. 305

Methods
Sample
The sample consisted of adult English-speaking inpatients (N = 2656) and outpatients (N =
3222) receiving mental health and/or substance abuse treatment at one of 27 participating
programs throughout the USA. All adult (over the age of 18) inpatient admissions and new
outpatient intakes arriving for treatment at one of the participating sites during the study period
(May 2001 through June 2002) were eligible for inclusion in the study. New outpatient intakes
were defined as individuals who had not been seen in the previous 6 months. Both mental health
and substance abuse treatment centers were included in the study to enable testing of the
instrument (which includes mental health and substance abuse domains) among diverse programs
and populations exhibiting a broad range of symptom presentations. All four of the major
geographic census regions (northeast, south, midwest, and west) were represented.[24] Among the
participants, 2620 inpatients and 3186 outpatients (99% of the sample) reported their ethnicity
and/or race. Ninety-three percent of the inpatients and outpatients could be categorized as White,
African-American, or Latino. The remaining 7% included Asian/Pacific Islander, American
Indian/Alaskan, and other (non-Latino). However, because there were insufficient numbers of any
other single race group, these cases were excluded from this paper. Results reported here are
based on the 2436 inpatients and 2975 outpatients for whom race/ethnicity data could be classified
as White, African-American, or Latino. Time 2 BASIS-24© assessments were available for 1302
of the inpatients (53%) and 780 of the outpatients (26%). Sample characteristics stratified by race/
ethnicity are presented in Table 1.
Measures
BASIS-24©
Development of the BASIS-24© has been described in detail in previous publications and is
briefly summarized here.[23,24] First, measures of general and mental health status, psychiatric
symptoms, substance abuse, social/community functioning, and quality of life were reviewed to
identify the optimal range of question stems, response options, wording, and content. Feedback
about the original BASIS-32® was obtained from 75 researchers, administrators, clinicians, and
consumers about the length of the instrument, items that seemed confusing or difficult to answer,
appropriateness of response options, time frame, domains covered, and sensitivity to different racial
and cultural groups. Readability analysis was conducted using software for seven widely used
readability formulas and vocabulary lists associated with specific grade levels. Principles of survey
and item construction were used to create items that were clear, concise, and simply written.
Following these procedures, a revised version of the instrument was developed for further
readability analysis and cognitive testing, a qualitative interview process for evaluating
comprehension of questionnaire items.[25,26] Ninety-seven cognitive interviews were completed
at 12 mental health treatment programs in each of the four major US census regions, with
oversampling of minority participants (25% African-American, 15% Latino, and 5% other non-
White). Based on analysis of the cognitive interviews, the instrument was further revised by
eliminating or modifying items that were poorly understood. A field test instrument written at a
fifth grade reading level was constructed, tested, and validated. Factor analytic, classical test
theory, and item response theory (IRT) methods were used to eliminate items that were redundant
or that did not contribute to the reliability and validity of the instrument.[27–31]

Table 1
Characteristics of White, African-American, and Latino inpatients and outpatients (in %)

                                  Inpatients (N = 2436)               Outpatients (N = 2975)
                                  White       African-    Latino      White       African-    Latino
                                              American                            American
                                  (n = 1663)  (n = 663)   (n = 110)   (n = 2371)  (n = 341)   (n = 263)
Age
  18–24                           15.2        12.4        22.4        20.1        13.2        19.4
  25–34                           20.4        24.9        28.2        28.1        26.7        42.6
  35–44                           30.9        34.5        22.7        29.3        35.8        24.0
  45–54                           20.5        21.1        16.4        16.3        18.5        11.4
  55–64                           7.0         4.5         3.6         4.6         5.3         2.3
  65+                             6.0         2.6         2.7         1.6         0.6         0.4
Gender
  Male                            51.0        63.8        57.3        43.9        42.5        46.8
  Female                          49.0        36.2        42.7        56.1        57.5        53.2
Marital status
  Never married                   40.6        57.0        49.1        10.1        54.8        50.8
  Married                         25.3        10.1        20.9        31.3        16.7        23.3
  Separated/divorced              30.1        29.3        29.1        26.6        24.4        24.0
  Widowed                         4.0         3.6         0.9         2.0         4.2         1.9
Education
  Eighth grade or less            5.4         9.7         10.1        1.5         3.8         5.4
  Some high school                14.5        31.9        22.0        11.3        24.5        29.5
  High school graduate/GED        29.6        35.1        36.7        30.6        33.3        30.3
  Some college                    27.9        17.3        22.0        33.9        29.8        23.8
  4-year college graduate         22.7        6.0         9.2         22.6        8.6         11.1
Employed in the past 30 days
  No                              63.4        77.6        62.4        42.9        64.1        51.2
  1–10 h                          4.9         6.7         6.4         5.1         3.9         5.4
  11–30 h                         8.7         4.4         12.8        13.6        11.3        11.5
  >30 h                           23.1        11.4        18.4        38.4        20.8        28.9
Student in the past 30 days       6.9         5.0         9.2         9.4         8.6         11.1
Volunteer in the past 30 days     7.9         9.4         16.6        9.8         9.0         7.7
Primary psychiatric diagnosis
  Schizophrenia/schizoaffective   17.0        40.8        21.8        4.1         7.4         10.5
  Depressive disorder             31.7        11.7        13.9        31.9        26.5        35.1

The final BASIS-24© includes 24 items, each with five ordered response options reporting
either the level of difficulty experienced (no difficulty to extreme difficulty) or the frequency with
which a symptom or problem has occurred (none of the time to all of the time). Respondents
answer each question in terms of how they have been during the past week; for example, “During
the past week, how much of the time did you feel sad or depressed?” The complete instrument
and information about how to obtain it are available on the Web site http://www.basissurvey.org.
Item response theory analyses were used initially to create standardized scores for each domain
of the instrument and to compute an overall summary score.[24] However, because computation of
IRT scores requires specialized software not available to many mental health programs, a linear
approximation to the IRT scores was developed by performing an ordinary least-squares
regression of the IRT domain and overall summary scores using raw scores as independent
variables. The coefficients were used as weights to compute weighted scores that maintain the
same range of values (0–4) as the BASIS-24© items and the same range of values as the original
BASIS-32®. The weighted scores have correlations ranging from .96 to >.99 with the IRT scores.
Lower scores indicate less symptom/problem difficulty or frequency, and higher scores indicate
greater symptom/problem difficulty or frequency.
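As a minimal illustration of how regression coefficients could be applied as scoring weights, the Python sketch below computes a weighted domain score from 0–4 item ratings. The weights are hypothetical placeholders, not the published BASIS-24 coefficients, and the sketch assumes non-negative weights that sum to 1 with no intercept, which is one simple way for a weighted score to stay within the 0–4 range.

```python
def weighted_score(item_scores, weights):
    """Weighted sum of 0-4 item ratings.

    Assumption: weights are non-negative and sum to 1, so the result
    stays in the same 0-4 range as the items.
    """
    if len(item_scores) != len(weights):
        raise ValueError("one weight per item is required")
    if any(not 0 <= s <= 4 for s in item_scores):
        raise ValueError("item ratings must lie between 0 and 4")
    return sum(w * s for w, s in zip(weights, item_scores))


# Hypothetical weights for a three-item domain (illustration only).
example_weights = [0.5, 0.3, 0.2]
example_items = [4, 2, 0]
example_score = weighted_score(example_items, example_weights)
```

Higher values of `example_score` correspond to greater symptom or problem difficulty, matching the direction of the item ratings.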
BASIS-24© differs from the BASIS-32® in several important ways: item wording is simplified,
double-barreled questions (incorporating multiple symptoms in one item) are eliminated, response
options are varied (most asking about frequency of symptom occurrence), items unlikely to apply
to many respondents (e.g., school or work functioning) are replaced with more broadly applicable
items, questions that were poorly understood were revised or eliminated, and alcohol and drug use
items were added. Thus, although most of the constructs represented in the original BASIS-32®
were retained, many of the items were revised.
Table 1 (continued)

                                  Inpatients (N = 2436)               Outpatients (N = 2975)
                                  White       African-    Latino      White       African-    Latino
                                              American                            American
                                  (n = 1663)  (n = 663)   (n = 110)   (n = 2371)  (n = 341)   (n = 263)
Primary psychiatric diagnosis (continued)
  Bipolar disorder                15.6        6.8         9.9         10.6        5.7         10.5
  Alcohol/drug use disorder       25.1        30.9        42.6        26.0        44.1        27.5
  Anxiety disorder                4.5         0.8         3.0         9.4         4.1         8.8
  Adjustment disorder             2.8         3.8         4.0         11.7        9.4         5.9
  Other Axis I disorder           2.9         4.4         4.0         6.3         2.9         1.8
  Medical/no Axis I disorder      0.5         0.9         1.0         0.1         0           0
GAF rating at time 1 (mean)       34.01       36.10       37.28       54.45       51.64       53.20

Validation measures
Validation measures included the 12-item Short-Form Health Survey (SF-12®), global ratings
of mental health and life satisfaction, and DSM-IV psychiatric diagnoses including Global
Assessment of Functioning (GAF) ratings.[22,32–34] The SF-12®, a widely used reliable and valid
self-report instrument, assesses physical and mental health status with two summary scales:
physical component summary (PCS) and mental component summary (MCS).[32] Global ratings of
mental health and satisfaction with life were five-point self-report scales ranging from 0 (poor) to
4 (excellent). The GAF (Axis V diagnosis) is a single-item clinician rating scale assessing overall
psychological symptoms and social and occupational functioning. Primary psychiatric diagnosis
was used to validate corresponding BASIS-24© domains. In addition, a comorbidity index was
computed by summing the total number of different diagnostic categories assigned to each
participant. Up to three comorbid diagnostic categories were included, yielding a comorbidity
index ranging from 0 to 3. Diagnostic categories included those shown in Table 1 as well as Axis
II personality disorders.
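The comorbidity index described above can be sketched in Python as a capped count of distinct diagnostic categories. The function name, the string-based category labels, and the normalization of case and whitespace are illustrative assumptions, not details taken from the study.

```python
def comorbidity_index(diagnostic_categories, cap=3):
    """Count distinct diagnostic categories for one participant, capped at `cap`.

    Assumption: categories arrive as free-text labels, so case and
    surrounding whitespace are normalized before counting.
    """
    distinct = {
        label.strip().lower()
        for label in diagnostic_categories
        if label and label.strip()
    }
    return min(len(distinct), cap)
```

A participant with no recorded categories scores 0, and any participant with three or more distinct categories scores 3, matching the 0 to 3 range given in the text.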
Demographic characteristics
Age, gender, Latino ethnicity, race, education, and marital status were obtained by patient self-
report questions appended to the BASIS-24©. Based on Office of Management and Budget
(OMB) census guidelines, Latino ethnicity was a separate question preceding the race question,
and reporting of multiple races was permitted.[35] For purposes of this study, mutually exclusive
race/ethnicity categories were constructed as follows. So as not to underrepresent Latinos, all
respondents who reported Latino ethnicity were categorized as Latino regardless of which race
categories were checked (n = 110 inpatients and n = 263 outpatients). All non-Latino respondents
who reported “White” as their only race were categorized as White (n = 1663 inpatients and n =
2371 outpatients). All non-Latino respondents who reported “Black or African-American” as their
only race were categorized as African-American (n = 663 inpatients and n = 341 outpatients). All
other respondents including those who reported more than one race were categorized as “other” (n =
184 inpatients and n = 211 outpatients) and were not included in the data analysis. Although
Latinos who reported more than one race were included in the Latino group, only nine
participants (2.4% of Latinos) did so. The largest single race category reported by Latinos was
“other” (46% of inpatients and 38% of outpatients), suggesting that many Latinos do not identify
with the standard race categories recommended by the OMB. An additional 7% of Latino
inpatients and 10% of Latino outpatients left the race question blank.
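The categorization rules in this section can be expressed as a short Python sketch. The function name and the representation of the race question as a collection of checked labels are assumptions made for illustration; the sketch also assumes it is applied only to respondents who answered the ethnicity and/or race questions.

```python
def classify_race_ethnicity(is_latino, races):
    """Assign one mutually exclusive study category.

    is_latino: answer to the separate Latino ethnicity question.
    races: iterable of race labels the respondent checked (may be empty).
    """
    checked = set(races)
    if is_latino:
        # Latino ethnicity takes precedence over any race categories checked.
        return "Latino"
    if checked == {"White"}:
        return "White"
    if checked == {"Black or African-American"}:
        return "African-American"
    # Multiple races or any other single race; excluded from the analysis.
    return "Other"
```

Checking ethnicity first reflects the stated aim of not underrepresenting Latinos, including those who left the race question blank.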
Procedure
The BASIS-24©, SF-12®, and other validation items were administered twice, upon admission
or intake, and in the 24-h period before discharge (for inpatients) or 4–8 weeks following intake
for outpatients. These instruments were administered by program staff within the context of CQI
programs as part of routine outcomes monitoring. Verbal consent was obtained from all
participants. This data collection process was approved by the Institutional Review Board of the
grantee institution and by each participating site. Demographic characteristics, admission and
discharge dates, payer, and DSM-IV psychiatric diagnoses including GAF ratings were extracted
from medical records or administrative databases.
Data analysis: overview
Data analyses included tests of data quality (examination of the rate of missing data and floor
and ceiling effects), confirmatory factor analysis (CFA) within each racial/ethnic group, tests of
internal consistency reliability, construct validity, and sensitivity to change. All analyses were
performed separately for White, African-American, and Latino subgroups. Following completion
of the missing data analysis, missing BASIS-24©, SF-12®, and mental health/life satisfaction item
ratings were imputed using SAS PROC MI procedures.[24,36]
Data quality
Item frequency distributions were generated to assess rates of missing data and floor and ceiling
effects for each domain. High rates of missing data can identify items that are confusing, difficult
to answer, or inapplicable to respondents. Extensive floor and ceiling effects can indicate
insensitivity of the instrument to individual differences in symptom levels at the extreme ends of
the continuum or inapplicability of items to the sample.
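A minimal Python sketch of these data quality checks follows. It assumes scores on the 0–4 scale with higher values indicating greater difficulty, so that, following the usage in this paper, a ceiling effect is a score of 0 (best possible functioning) and a floor effect is a score of 4 (worst possible functioning). Missing responses are represented as `None`, and the floor and ceiling percentages are computed among answered responses only; both choices are assumptions.

```python
def data_quality(scores, best=0, worst=4):
    """Percent missing, at ceiling (best functioning), and at floor (worst functioning)."""
    if not scores:
        raise ValueError("at least one response is required")
    answered = [s for s in scores if s is not None]
    missing_pct = 100.0 * (len(scores) - len(answered)) / len(scores)
    if not answered:
        return {"missing_pct": missing_pct, "ceiling_pct": None, "floor_pct": None}
    return {
        "missing_pct": missing_pct,
        "ceiling_pct": 100.0 * sum(s == best for s in answered) / len(answered),
        "floor_pct": 100.0 * sum(s == worst for s in answered) / len(answered),
    }


# Five respondents, one of whom skipped the question.
example = data_quality([0, 0, 4, 2, None])
```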
Confirmatory factor analysis
Confirmatory factor analysis for each of the three race/ethnicity subgroups was conducted using
LISREL 8.3 to confirm the six-factor structure that was previously obtained for the BASIS-24©
across race/ethnicity.[37]
Reliability and validity of the instrument
Reliability and validity analyses were conducted separately for each race/ethnicity group within
inpatient and outpatient samples. These analyses were performed separately for the two levels of
care to determine the instrument’s psychometric properties and potential utility for outcome
assessment in both inpatient and outpatient settings. Cronbach’s alpha and item–total correlations
were computed to assess internal consistency reliability of each subscale and the total scale.[38]
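For readers unfamiliar with these statistics, the sketch below computes Cronbach's alpha and item–total correlations in plain Python. Two points are assumptions rather than details from the study: the sketch uses the raw (covariance-based) alpha, whereas standardized coefficients would be derived from inter-item correlations, and it uses corrected item–total correlations, in which each item is correlated with the sum of the remaining items.

```python
from math import sqrt
from statistics import mean, variance


def cronbach_alpha(rows):
    """Raw Cronbach's alpha; each row holds one respondent's item ratings."""
    k = len(rows[0])
    item_variances = sum(variance(column) for column in zip(*rows))
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - item_variances / total_variance)


def pearson(x, y):
    """Pearson product moment correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def corrected_item_total(rows):
    """Correlate each item with the sum of the other items in the scale."""
    results = []
    for j, column in enumerate(zip(*rows)):
        rest = [sum(row) - row[j] for row in rows]
        results.append(pearson(column, rest))
    return results


# Made-up ratings for five respondents on a three-item scale.
ratings = [[0, 1, 0], [1, 1, 2], [2, 3, 2], [3, 3, 4], [4, 4, 3]]
alpha = cronbach_alpha(ratings)
item_total = corrected_item_total(ratings)
```

For a two-item scale, the corrected item–total correlation of each item reduces to the correlation between the two items, so both items share a single value.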
Known groups (discriminant) validity was assessed in two ways. First, two groups at different
levels of care (inpatients and outpatients) were compared, with the expectation that inpatients
would report higher levels of severity than outpatients. Second, score differences between
diagnostic groups corresponding to specific BASIS-24© subscales were examined. Individuals
diagnosed with depressive disorders were expected to report greater difficulty with depression/
functioning, those diagnosed with psychotic disorders (schizophrenia, schizophreniform,
schizoaffective, and other psychotic disorders) were expected to report greater frequency of
psychotic symptoms, those with substance use disorders were expected to report greater frequency
of substance abuse problems, and those with bipolar disorders were expected to report greater
emotional lability.
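As a rough illustration of the first known-groups comparison, the sketch below computes an unequal-variance (Welch) t statistic from summary statistics. The text does not state which form of t-test was used, so the choice of Welch's formula and the function name are assumptions. The example inputs are the means, SDs, and group sizes for White inpatients and outpatients on the depression/functioning subscale as reported in Table 4.

```python
from math import sqrt


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Unequal-variance t statistic for the difference between two group means."""
    standard_error = sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / standard_error


# White inpatients vs. outpatients, depression/functioning (values from Table 4).
t_white_depression = welch_t(2.41, 1.07, 1663, 1.82, 1.07, 2371)
```

With these rounded inputs the statistic comes out near 17.2, close to the 17.15 reported in Table 4; the small gap is plausibly due to rounding of the published means and SDs.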
Table 2
Fit indices for the six-factor model of BASIS-24© by race/ethnicity

                                            White    African-American    Latino
Adjusted goodness of fit index              .89      .87                 .84
Root mean squared error of approximation    .068     .069                .067
Standardized root mean squared residual     .053     .061                .067
Comparative fit index                       .92      .91                 .91
Non-normed fit index                        .91      .90                 .89

Table 3
Standardized internal consistency reliability (Cronbach’s alpha)ᵃ coefficients and item–total
correlations for inpatients and outpatients from three race/ethnicity groups

                                    Inpatient (N = 2436)                Outpatient (N = 2975)
BASIS-24© subscale                  White       African-    Latino      White       African-    Latino
                                                American                            American
                                    (n = 1663)  (n = 663)   (n = 110)   (n = 2371)  (n = 341)   (n = 263)
Depression/functioning              .88         .85         .89         .91         .91         .87
  Manage day-to-day life            .72         .66         .77         .77         .75         .64
  Cope with problems in life        .77         .71         .79         .82         .84         .77
  Concentrate                       .73         .68         .74         .77         .81         .76
  Feel confident in yourself        .57         .50         .53         .68         .55         .42
  Feel sad or depressed             .71         .65         .70         .78         .84         .81
  Feel nervous                      .63         .62         .66         .72         .74         .67
Interpersonal relationships         .82         .81         .80         .83         .86         .81
  Get along with people in family   .58         .54         .51         .62         .66         .54
  Get along outside family          .67         .66         .67         .68         .76         .68
  Get along in social situations    .64         .66         .69         .66         .76         .63
  Feel close to another person      .62         .59         .54         .62         .65         .64
  Have someone to turn to           .54         .56         .50         .59         .56         .51
Self-harmᵇ                          .90         .89         .82         .86         .86         .88
  Think about ending your life      .82         .81         .69         .76         .75         .78
  Think about hurting yourself
Emotional lability                  .75         .78         .66         .78         .79         .78
  Thoughts racing through head      .50         .55         .40         .47         .51         .50
  Have mood swings                  .66         .68         .61         .71         .74         .70
  Feel short-tempered               .59         .60         .41         .67         .65         .69
Psychotic symptoms                  .77         .77         .70         .74         .78         .83
  Have special powers               .46         .39         .39         .40         .41         .51
  Hear voices or see things         .56         .61         .58         .57         .61         .66

Concurrent construct validity was assessed in three ways. First, Pearson product moment
correlations between the BASIS-24© summary score and similar constructs (other measures of
mental health), as well as correlations between the summary score and dissimilar constructs
(physical health), were computed.[27] Higher correlations were expected with other mental health
measures (global mental health, life satisfaction, and MCS) than with physical health measures
(PCS). Second, BASIS-24© summary scores were correlated with GAF ratings extracted from
medical records. Third, the comorbidity index (number of comorbid diagnoses) was correlated
with the BASIS-24© summary score. Hypothesizing that comorbidity indicates increased
diagnostic complexity or intensity of illness, the expectation was that the more comorbid
diagnoses, the greater would be the self-reported symptom and problem difficulty.
Sensitivity to change
Sensitivity to change was assessed by comparing statistical significance and effect size of change
from admission to discharge for inpatients, and from intake to 2-month follow-up for outpatients
within each race/ethnicity group. Analyses were conducted separately for the inpatient and
outpatient samples because the difference in symptom severity between these two levels of care has
implications for expected levels of improvement. Inpatients, with higher time 1 symptom severity
levels, would be expected to improve more than outpatients.[22] However, it is important for an
outcome instrument to be sensitive to the smaller amounts of change that might characterize
individuals with lower symptom severity levels treated in less intensive (e.g., outpatient) settings.
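One way to quantify change between the two assessments is sketched below in Python. The text does not specify the test or the effect size formula, so the paired t statistic and the standardized effect size (mean change divided by the time 1 standard deviation) are assumptions chosen for illustration. Change is computed as time 1 minus time 2, so positive values indicate improvement on a scale where higher scores mean greater difficulty.

```python
from math import sqrt
from statistics import mean, stdev


def change_statistics(time1, time2):
    """Mean change, paired t statistic, and a standardized effect size."""
    # Positive differences mean scores fell, i.e. symptoms improved.
    differences = [before - after for before, after in zip(time1, time2)]
    mean_change = mean(differences)
    standard_error = stdev(differences) / sqrt(len(differences))
    return {
        "mean_change": mean_change,
        "paired_t": mean_change / standard_error,
        "effect_size": mean_change / stdev(time1),
    }


# Made-up summary scores for five participants at both time points.
admission = [3.0, 2.5, 2.0, 3.5, 1.0]
discharge = [2.0, 2.0, 1.5, 2.5, 1.0]
change = change_statistics(admission, discharge)
```

Dividing by the time 1 standard deviation expresses change relative to the spread of baseline severity, which makes values comparable across samples with different baseline levels.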
Table 3 (continued)

                                    Inpatient (N = 2436)                Outpatient (N = 2975)
BASIS-24© subscale                  White       African-    Latino      White       African-    Latino
                                                American                            American
                                    (n = 1663)  (n = 663)   (n = 110)   (n = 2371)  (n = 341)   (n = 263)
  Think people are watching you     .70         .70         .67         .64         .70         .80
  Think people are against you      .59         .61         .33         .52         .62         .67
Alcohol/drug use                    .89         .84         .88         .81         .83         .85
  Have urge to drink/take drugs     .75         .69         .76         .59         .66         .66
  Talk to you about alcohol/drugs   .72         .61         .68         .58         .61         .65
  Hide your drinking or drug use    .75         .64         .74         .63         .65         .71
  Problems from drinking/drugs      .81         .76         .77         .71         .75         .73
Overall summary score               .87         .89         .89         .90         .91         .90

ᵃ Cronbach’s alpha coefficients are shown in bold print.
ᵇ Because the self-harm subscale has only two items, there is only one item–total correlation.

Results
Data quality
The rate of missing data for each item ranged from <1% to 3.8% for both inpatients and
outpatients with virtually no variation among race/ethnicity groups. The highest missing data rate
of 3.8% occurred for no more than two items for any of the race/ethnicity groups. Regarding item
distributions, for both inpatients and outpatients, all possible response options were endorsed for
each item. For each domain, floor effects (worst possible functioning) occurred for no more than
5% of inpatients and outpatients within each race/ethnicity group. Ceiling effects (best possible
functioning) were infrequent for common domains such as depression and functioning, occurring
for up to 5% of inpatients and up to 7% of outpatients. However, ceiling effects were more
common for infrequently occurring domains such as self-harm, with 38–51% of inpatients and
56–67% of outpatients reporting no thoughts of self-harm during the past week. As expected,
outpatients generally had higher rates of ceiling effects (best functioning), and inpatients had
higher rates of floor effects (worst functioning) with relatively little variation among race/
ethnicity groups (data not shown).
Confirmatory factor analysis results
Confirmatory factor analysis conducted for each race/ethnicity subgroup yielded statistically
significant chi-squares for each of the race/ethnicity groups. However, because the chi-square test
of fit is affected by the large sample size, additional fit statistics were examined (Table 2).[39] The
adjusted goodness of fit index ranged from .84 to .89, indicating adequate fit to the model.[40]
Absolute fit indices, which assess the adequacy with which the model reproduces the data, include
the root mean squared error of approximation and standardized root mean squared residual.[39] These
values were less than .07 for each race/ethnicity group, indicating adequate model fit. Incremental
fit indices, which should have large values indicating that the model accounts for much of the
variation in the data, include the comparative fit index and the non-normed fit index. These values
were .89 or higher for each race/ethnicity group, indicating good fit.[40]
Reliability
Internal consistency reliability (Cronbach’s alpha) coefficients were .70 or higher for all six domains
and for all three race/ethnicity groups for both inpatients and outpatients, with one exception: for
Latino inpatients, the alpha was .66 for the emotional lability domain (Table 3). Item–total
correlations, also presented in Table 3, were .40 or higher for all six domains within each race/ethnicity
group, with the exception of two items in the psychosis domain for Latino inpatients and one item
in the psychosis domain for African-American inpatients. Item–total correlations with the overall
summary score ranged from .11 to .72, with more than 70% of the values ≥.40 within each race/
ethnicity group. Substance abuse and psychosis items generally had lower correlations with the
overall summary score than items in the other four domains (data not shown).
Discriminant (known groups) validity
As expected, White inpatients consistently reported significantly higher symptom/problem
levels than outpatients for all BASIS domains and for the overall summary score (Table 4).
However, this difference based on level of care was not as consistent for the two minority groups.
Among African-Americans, inpatients reported significantly higher levels of self-harm and
psychotic symptoms than outpatients, but there was no statistically significant difference in any of
the other domains or in the overall summary score. Among Latinos, inpatients reported sig-

Table 4
Mean (SD) BASIS-24© scores for inpatients and outpatients by level of care and race/ethnicity

White
BASIS-24 subscale             Inpatient (n = 1663)   Outpatient (n = 2371)   t        p <
Depression/functioning        2.41 (1.07)            1.82 (1.07)             17.15    .001
Interpersonal relationships   1.75 (1.03)            1.35 (.96)              12.62    .001
Self-harm                     1.22 (1.26)            0.46 (.83)              21.68    .001
Emotional lability            2.01 (1.07)            1.87 (1.08)             4.17     .001
Psychotic symptoms            0.94 (1.04)            0.54 (.79)              13.15    .001
Alcohol/drug use              1.27 (1.31)            0.65 (.88)              16.87    .001
Overall summary score         1.94 (.79)             1.46 (.79)              18.81    .001

African-American
BASIS-24 subscale             Inpatient (n = 663)    Outpatient (n = 341)    t        p <
Depression/functioning        1.83 (1.12)            1.79 (1.20)             0.53     ns
Interpersonal relationships   1.76 (1.09)            1.65 (1.10)             1.60     ns
Self-harm                     0.95 (1.23)            0.61 (1.00)             4.61     .001
Emotional lability            1.76 (1.45)            1.90 (.91)              -1.81    ns
Psychotic symptoms            1.45 (1.23)            0.91 (1.05)             7.26     .001
Alcohol/drug use              1.20 (1.23)            1.09 (1.17)             1.35     ns
Overall summary score         1.65 (.87)             1.56 (.89)              1.50     ns

Latino
BASIS-24 subscale             Inpatient (n = 110)    Outpatient (n = 263)    t        p <
Depression/functioning        1.89 (1.19)            2.06 (1.08)             -1.32    ns
Interpersonal relationships   1.74 (1.13)            1.65 (1.03)             0.71     ns
Self-harm                     1.17 (1.28)            0.73 (1.06)             3.22     .002
Emotional lability            1.78 (1.13)            2.14 (1.14)             -2.73    .007
Psychotic symptoms            1.04 (1.00)            0.93 (1.06)             0.93     ns
Alcohol/drug use              1.36 (1.29)            0.80 (1.06)             4.01     .001
Overall summary score         1.68 (.89)             1.71 (.84)              -0.37    ns

ns = not significant. The higher the number, the greater the symptom/problem severity.
314 The Journal of Behavioral Health Services & Research 33:3 July 2006

T
a
b
l
e
5
M
e
a
n
(
S
D
)
B
A
S
I
S
-
2
4
*
s
u
b
s
c
a
l
e
s
c
o
r
e
s
f
o
r
c
o
r
r
e
s
p
o
n
d
i
n
g
d
i
a
g
n
o
s
t
i
c
c
a
t
e
g
o
r
i
e
s
I
n
p
a
t
i
e
n
t
s
(
N
=
2
2
8
1
)
a
W
h
i
t
e
A
f
r
i
c
a
n
-
A
m
e
r
i
c
a
n
L
a
t
i
n
o
B
A
S
I
S
-
2
4
S
u
b
s
c
a
l
e
D
e
p
r
e
s
s
i
o
n
/
f
u
n
c
t
i
o
n
i
n
g
D
e
p
r
e
s
s
i
v
e
d
i
s
o
r
d
e
r
(
n
=
4
9
0
)
N
o
d
e
p
r
e
s
s
i
v
e
d
i
s
o
r
d
e
r
(
n
=
1
0
5
5
)
D
e
p
r
e
s
s
i
v
e
d
i
s
o
r
d
e
r
(
n
=
7
4
)
N
o
d
e
p
r
e
s
s
i
v
e
d
i
s
o
r
d
e
r
(
n
=
5
6
1
)
D
e
p
r
e
s
s
i
v
e
d
i
s
o
r
d
e
r
(
n
=
1
4
)
N
o
d
e
p
r
e
s
s
i
v
e
d
i
s
o
r
d
e
r
(
n
=
8
7
)
2
.
7
0
(
0
.
9
5
)
2
.
2
9
*
*
*
(
1
.
1
1
)
2
.
2
4
(
1
.
0
0
)
1
.
7
6
*
*
*
(
1
.
1
2
)
2
.
3
6
(
1
.
2
7
)
1
.
8
0
(
1
.
1
2
)
P
s
y
c
h
o
t
i
c
s
y
m
p
t
o
m
s
P
s
y
c
h
o
t
i
c
d
i
s
o
r
d
e
r
(
n
=
2
6
2
)
N
o
p
s
y
c
h
o
t
i
c
d
i
s
o
r
d
e
r
(
n
=
1
2
8
3
)
P
s
y
c
h
o
t
i
c
d
i
s
o
r
d
e
r
(
n
=
2
5
9
)
N
o
p
s
y
c
h
o
t
i
c
d
i
s
o
r
d
e
r
(
n
=
3
7
6
)
P
s
y
c
h
o
t
i
c
d
i
s
o
r
d
e
r
(
n
=
2
2
)
N
o
p
s
y
c
h
o
t
i
c
d
i
s
o
r
d
e
r
(
n
=
7
9
)
1
.
3
4
(
1
.
1
9
)
0
.
8
2
*
*
*
(
0
.
9
8
)
1
.
6
3
(
1
.
2
7
)
1
.
3
1
*
*
*
(
1
.
2
0
)
1
.
1
4
(
0
.
9
6
)
0
.
9
5
(
0
.
9
7
)
A
l
c
o
h
o
l
/
d
r
u
g
u
s
e
S
u
b
s
t
a
n
c
e
u
s
e
d
i
s
o
r
d
e
r
(
n
=
3
8
8
)
N
o
s
u
b
s
t
a
n
c
e
u
s
e
d
i
s
o
r
d
e
r
(
n
=
1
1
5
7
)
S
u
b
s
t
a
n
c
e
u
s
e
d
i
s
o
r
d
e
r
(
n
=
1
9
6
)
N
o
s
u
b
s
t
a
n
c
e
u
s
e
d
i
s
o
r
d
e
r
(
n
=
4
3
9
)
S
u
b
s
t
a
n
c
e
u
s
e
d
i
s
o
r
d
e
r
(
n
=
4
3
)
N
o
s
u
b
s
t
a
n
c
e
u
s
e
d
i
s
o
r
d
e
r
(
n
=
5
8
)
2
.
3
8
(
1
.
1
8
)
0
.
9
3
*
*
*
(
1
.
1
5
)
2
.
2
6
(
1
.
0
3
)
0
.
7
3
*
*
*
(
0
.
9
8
)
2
.
2
9
(
1
.
0
0
)
0
.
6
5
*
*
*
(
1
.
0
4
)
E
m
o
t
i
o
n
a
l
l
a
b
i
l
i
t
y
B
i
p
o
l
a
r
d
i
s
o
r
d
e
r
(
n
=
2
4
1
)
N
o
b
i
p
o
l
a
r
d
i
s
o
r
d
e
r
(
n
=
1
3
0
4
)
B
i
p
o
l
a
r
d
i
s
o
r
d
e
r
(
n
=
4
3
)
N
o
b
i
p
o
l
a
r
d
i
s
o
r
d
e
r
(
n
=
5
9
2
)
B
i
p
o
l
a
r
d
i
s
o
r
d
e
r
(
n
=
1
0
)
N
o
b
i
p
o
l
a
r
d
i
s
o
r
d
e
r
(
n
=
9
1
)
2
.
1
9
(
1
.
1
2
)
1
.
9
8
*
*
(
1
.
0
5
)
2
.
0
3
(
1
.
0
6
)
1
.
7
3
(
1
.
1
6
)
2
.
1
5
(
1
.
1
8
)
1
.
7
0
(
1
.
1
1
)
a
D
S
M
-
I
V
d
i
a
g
n
o
s
i
s
w
a
s
n
o
t
a
v
a
i
l
a
b
l
e
f
o
r
6
%
o
f
i
n
p
a
t
i
e
n
t
s
a
n
d
3
1
%
o
f
o
u
t
p
a
t
i
e
n
t
s
.
*
*
*
p
G
.
0
0
1
.
*
*
p
G
.
0
1
.
*
p
G
.
0
5
.
Reliability and Validity of the BASIS-24
*
Mental Health Survey EISEN et al. 315

T
a
b
l
e
5
(
c
o
n
t
i
n
u
e
d
)
O
u
t
p
a
t
i
e
n
t
s
(
N
=
2
0
4
5
)
a
W
h
i
t
e
A
f
r
i
c
a
n
-
A
m
e
r
i
c
a
n
L
a
t
i
n
o
D
e
p
r
e
s
s
i
v
e
d
i
s
o
r
d
e
r
(
n
=
5
2
0
)
N
o
d
e
p
r
e
s
s
i
v
e
d
i
s
o
r
d
e
r
(
n
=
1
1
0
9
)
D
e
p
r
e
s
s
i
v
e
d
i
s
o
r
d
e
r
(
n
=
6
5
)
N
o
d
e
p
r
e
s
s
i
v
e
d
i
s
o
r
d
e
r
(
n
=
1
8
0
)
D
e
p
r
e
s
s
i
v
e
d
i
s
o
r
d
e
r
(
n
=
6
0
)
N
o
d
e
p
r
e
s
s
i
v
e
d
i
s
o
r
d
e
r
(
n
=
1
1
1
)
2
.
2
8
(
0
.
9
2
)
1
.
6
6
*
*
*
(
1
.
0
6
)
2
.
2
1
(
1
.
0
6
)
1
.
6
6
*
*
*
(
1
.
2
6
)
2
.
4
9
(
0
.
8
3
)
1
.
8
5
*
*
*
(
1
.
1
3
)
P
s
y
c
h
o
t
i
c
d
i
s
o
r
d
e
r
(
n
=
6
6
)
N
o
p
s
y
c
h
o
t
i
c
d
i
s
o
r
d
e
r
(
n
=
1
5
6
3
)
P
s
y
c
h
o
t
i
c
d
i
s
o
r
d
e
r
(
n
=
1
8
)
N
o
p
s
y
c
h
o
t
i
c
d
i
s
o
r
d
e
r
(
n
=
2
2
7
)
P
s
y
c
h
o
t
i
c
d
i
s
o
r
d
e
r
(
n
=
1
8
)
N
o
p
s
y
c
h
o
t
i
c
d
i
s
o
r
d
e
r
(
n
=
1
5
3
)
1
.
1
5
(
1
.
1
8
)
0
.
5
2
*
*
*
(
0
.
7
6
)
1
.
3
1
(
1
.
2
1
)
0
.
9
1
(
1
.
0
5
)
2
.
1
5
(
1
.
3
4
)
0
.
8
5
*
*
*
(
0
.
9
9
)
S
u
b
s
t
a
n
c
e
u
s
e
d
i
s
o
r
d
e
r
(
n
=
4
2
3
)
N
o
s
u
b
s
t
a
n
c
e
u
s
e
d
i
s
o
r
d
e
r
(
n
=
1
2
0
6
)
S
u
b
s
t
a
n
c
e
u
s
e
d
i
s
o
r
d
e
r
(
n
=
1
0
8
)
N
o
s
u
b
s
t
a
n
c
e
u
s
e
d
i
s
o
r
d
e
r
(
n
=
1
3
7
)
S
u
b
s
t
a
n
c
e
u
s
e
d
i
s
o
r
d
e
r
(
n
=
4
7
)
N
o
s
u
b
s
t
a
n
c
e
u
s
e
d
i
s
o
r
d
e
r
(
n
=
1
2
4
)
1
.
0
2
(
1
.
0
2
)
0
.
5
1
*
*
*
(
0
.
7
8
)
1
.
7
0
(
1
.
1
9
)
0
.
6
7
*
*
*
(
0
.
9
2
)
1
.
0
2
1
.
2
2
0
.
7
1
(
1
.
0
3
)
B
i
p
o
l
a
r
d
i
s
o
r
d
e
r
(
n
=
1
7
2
)
N
o
b
i
p
o
l
a
r
d
i
s
o
r
d
e
r
(
n
=
1
4
5
7
)
B
i
p
o
l
a
r
d
i
s
o
r
d
e
r
(
n
=
1
4
)
N
o
b
i
p
o
l
a
r
d
i
s
o
r
d
e
r
(
n
=
2
3
1
)
B
i
p
o
l
a
r
d
i
s
o
r
d
e
r
(
n
=
1
8
)
N
o
b
i
p
o
l
a
r
d
i
s
o
r
d
e
r
(
n
=
1
5
3
)
2
.
1
8
(
1
.
0
8
)
1
.
8
8
*
*
*
(
1
.
0
8
)
2
.
1
7
(
1
.
1
5
)
1
.
8
9
(
1
.
1
6
)
2
.
0
1
(
1
.
4
3
)
2
.
2
1
(
1
.
1
5
)
316 The Journal of Behavioral Health Services & Research 33:3 July 2006

nificantly higher levels of self-harm and alcohol/drug problems, but there were no significant
differences with regard to depression/functioning, interpersonal relationships, psychotic symp-
toms, or the overall summary score; and contrary to expectation, Latino outpatients reported
significantly greater frequency of emotional lability than inpatients.
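
The inpatient versus outpatient comparisons in Table 4 can be approximated from the published summary statistics alone. The sketch below assumes a pooled-variance independent-samples t-test; the source does not state which form of the test was used, and the function name `pooled_t` is introduced here only for illustration. Under that assumption, the two comparisons shown come out near 17.2 and -2.8, close to the reported 17.15 and -2.73; small gaps are expected from rounding of the published means and SDs.

```python
import math

def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Independent-samples t statistic from summary statistics.

    Assumes equal population variances (pooled-variance t-test);
    this is an assumption, not a detail confirmed by the article.
    """
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se, df

# White inpatients vs. outpatients, depression/functioning (Table 4)
t_white, df_white = pooled_t(2.41, 1.07, 1663, 1.82, 1.07, 2371)

# Latino inpatients vs. outpatients, emotional lability (Table 4)
t_latino, df_latino = pooled_t(1.78, 1.13, 110, 2.14, 1.14, 263)

print(f"White depression/functioning: t = {t_white:.2f}, df = {df_white}")
print(f"Latino emotional lability:    t = {t_latino:.2f}, df = {df_latino}")
```

A negative t here simply means that the outpatient mean exceeds the inpatient mean, as reported for Latino emotional lability.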
Results regarding BASIS-24 subscale score differences for individuals with corresponding
diagnoses are presented in Table 5. For White inpatients and outpatients, results supported the
hypotheses that patients diagnosed with schizophrenia, depression, bipolar disorder, or substance
abuse would report greater symptom levels on corresponding BASIS-24 domains than patients
without these diagnoses. Results were generally in the expected direction for African-American
and Latino inpatients and outpatients, but were not statistically significant for all groups. The
depression/functioning subscale discriminated by diagnosis for all subgroups except Latino
inpatients. The substance abuse subscale discriminated by diagnosis for all subgroups except
Latino outpatients. The psychosis subscale discriminated by diagnosis for all subgroups except
Latino inpatients and African-American outpatients. The emotional lability subscale failed to
discriminate by diagnosis for African-American and Latino inpatients and outpatients.
Concurrent construct validity
Correlations between the time 1 BASIS-24 overall summary score and other measures of
mental and physical health status are presented in Table 6. For both inpatients and outpatients in
each race/ethnicity group, correlations of the BASIS summary score with other self-reported
measures of mental health status (MCS, global mental health, and satisfaction with life) ranged
from .59 to .82. Correlations between the summary score with PCS (physical functioning) were
consistently lower, ranging from .07 to .45.
Among outpatients, correlations between time 1 GAF ratings and BASIS-24 overall scores
were statistically significant (ranging from .27 to .29) for each race/ethnicity group, indicating that
greater self-reported symptom and problem difficulty was significantly correlated with greater
clinician-rated impairment. However, for inpatients, the correlation was statistically significant
only for African-Americans. Similarly, expected associations between self-reported severity and
diagnostic comorbidity were statistically significant for each race/ethnicity group among
outpatients (ranging from .22 to .32). For inpatients, the associations were weaker, but still
statistically significant for Whites and for Latinos (Table 6).

Table 6
Pearson correlations [a] of time 1 BASIS-24 overall summary score with measures of mental and
physical functioning, global functioning, and comorbidity

| Measure | Inpatient: White | Inpatient: African-American | Inpatient: Latino | Outpatient: White | Outpatient: African-American | Outpatient: Latino |
|---|---|---|---|---|---|---|
| SF-12 MCS | .71 | .68 | .73 | .78 | .82 | .72 |
| SF-12 PCS | .07 | .27 | .18 | .27 | .25 | .45 |
| Global mental health | .66 | .61 | .59 | .76 | .74 | .73 |
| Satisfaction with life | .64 | .61 | .61 | .72 | .70 | .71 |
| GAF | .04 [b] | .11 | .03 [b] | .27 | .27 | .29 |
| Comorbidity index | .15 | .06 [b] | .11 | .27 | .22 | .32 |

[a] All correlations are in the expected direction. For ease of presentation, absolute values of the correlations
are presented.
[b] Not statistically significant. All other correlations are statistically significant (p < .001).

Table 7
Mean (SD) BASIS-24 scores at two time points for three race/ethnicity groups

Inpatient (N = 1302)

| BASIS-24 subscale | White (n = 861): T1 | T2 | ES | African-American (n = 385): T1 | T2 | ES | Latino (n = 56): T1 | T2 | ES |
|---|---|---|---|---|---|---|---|---|---|
| Depression/functioning | 2.42 (1.08) | 1.28 (0.87) | 1.16 | 1.83 (1.14) | 0.96 (0.85) | .87 | 1.95 (1.17) | 1.04 (0.77) | .92 |
| Interpersonal relationships | 1.72 (1.03) | 1.26 (0.99) | .46 | 1.81 (1.11) | 1.29 (1.12) | .47 | 1.73 (1.08) | 1.34 (1.23) | .34 |
| Self-harm | 1.21 (1.25) | 0.41 (0.72) | .78 | 0.94 (1.22) | 0.41 (0.78) | .52 | 1.20 (1.32) | 0.27 (0.56) | .92 |
| Emotional lability | 2.05 (1.09) | 1.31 (0.92) | .73 | 1.77 (1.17) | 1.18 (1.00) | .54 | 1.86 (1.20) | 1.28 (1.02) | .52 |
| Psychotic symptoms | 0.93 (1.06) | 0.54 (0.78) | .42 | 1.50 (1.24) | 0.89 (1.00) | .54 | 1.07 (1.09) | 0.56 (0.86) | .52 |
| Alcohol/drug use | 1.27 (1.31) | 0.79 (0.96) | .42 | 1.16 (1.23) | 0.80 (0.95) | .33 | 1.26 (1.27) | 0.72 (0.96) | .48 |
| Overall summary score | 1.95 (0.79) | 1.11 (0.63) | 1.18 | 1.65 (0.87) | 0.98 (0.68) | .86 | 1.72 (0.90) | 0.99 (0.60) | .95 |

Outpatient (N = 780)

| BASIS-24 subscale | White (n = 654): T1 | T2 | ES | African-American (n = 67): T1 | T2 | ES | Latino (n = 59): T1 | T2 | ES |
|---|---|---|---|---|---|---|---|---|---|
| Depression/functioning | 1.81 (1.10) | 1.40 (0.99) | .39 | 1.91 (1.17) | 1.49 (1.07) | .37 | 2.12 (1.09) | 1.72 (1.1) | .37 |
| Interpersonal relationships | 1.35 (0.99) | 1.14 (0.90) | .22 | 1.70 (1.13) | 1.39 (0.90) | .30 | 1.87 (1.02) | 1.53 (1.00) | .34 |
| Self-harm | 0.50 (0.87) | 0.31 (0.67) | .24 | 0.60 (1.01) | 0.44 (0.88) | .17 [a] | 0.65 (0.93) | 0.50 (0.93) | .16 [a] |
| Emotional lability | 1.82 (1.11) | 1.50 (1.00) | .30 | 2.03 (1.14) | 1.86 (1.05) | .16 [a] | 2.25 (1.01) | 1.80 (1.16) | .41 |
| Psychotic symptoms | 0.53 (0.77) | 0.38 (0.67) | .21 | 1.00 (1.09) | 0.70 (0.84) | .31 | 1.05 (1.09) | 0.88 (0.99) | .16 [a] |
| Alcohol/drug use | 0.63 (0.86) | 0.41 (0.64) | .29 | 1.28 (1.33) | 0.82 (0.94) | .40 | 1.10 (1.26) | 0.65 (0.78) | .43 |
| Overall summary score | 1.46 (0.82) | 1.14 (0.72) | .41 | 1.65 (0.88) | 1.32 (0.81) | .39 | 1.81 (0.84) | 1.45 (0.87) | .42 |

T1 = time 1, T2 = time 2, ES = effect size.
[a] Not statistically significant. All other T1 to T2 differences are statistically significant (p < .05).

Sensitivity to change
Change from admission to discharge for inpatients, and from intake to 1-month follow-up for
outpatients, along with statistical significance and effect sizes are shown in Table 7. Across all
race/ethnicity groups, change was greater for inpatients than outpatients. Among inpatients, all
differences between admission and discharge were statistically significant, with effect sizes
ranging from small/medium (.33) to large (1.18) [41]. The largest effect sizes occurred for the
depression/functioning subscale and for the overall summary score. Although there was less
change over time and smaller effect sizes (.16 to .43) for outpatients than inpatients, among
Whites, all differences were statistically significant; for African-American and Latino outpatients,
time 1 to time 2 differences were statistically significant for four of the six domains and for the
summary score.
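
The effect sizes in Table 7 can be illustrated with a short calculation. The formula used by the authors is not given in this section; the sketch below assumes a standardized mean change in which the time 1 minus time 2 difference is divided by the root mean square of the two standard deviations. That assumption reproduces several published values to two decimals (for example, 1.16 and .39 for White inpatients and outpatients on depression/functioning), but it remains an inference from the table, and `effect_size` is a name chosen here for illustration.

```python
import math

def effect_size(mean_t1, sd_t1, mean_t2, sd_t2):
    """Standardized mean change between two time points.

    Assumed form: (T1 - T2) / sqrt((SD1^2 + SD2^2) / 2).
    Positive values indicate a drop in symptom/problem severity.
    """
    pooled_sd = math.sqrt((sd_t1**2 + sd_t2**2) / 2)
    return (mean_t1 - mean_t2) / pooled_sd

# Depression/functioning, White inpatients (Table 7: ES reported as 1.16)
es_inpatient = effect_size(2.42, 1.08, 1.28, 0.87)

# Depression/functioning, White outpatients (Table 7: ES reported as .39)
es_outpatient = effect_size(1.81, 1.10, 1.40, 0.99)

print(f"inpatient ES  = {es_inpatient:.2f}")
print(f"outpatient ES = {es_outpatient:.2f}")
```

Against the conventional benchmarks of .2, .5, and .8 for small, medium, and large effects [41], the inpatient change on this subscale is large and the outpatient change falls between small and medium.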
Because there was substantial attrition between time 1 and time 2, particularly for outpatients,
sample characteristics of those for whom time 2 data were obtained were compared to
characteristics of those who did not complete time 2 assessments. Among inpatients, time 2
respondents did not differ from nonrespondents with respect to any of the variables examined
[age, gender, education, employment, program type (mental health versus substance abuse/dual
diagnosis), primary diagnosis, or receipt of disability benefits]. Among outpatients, follow-up
respondents were significantly less educated (51% had at least some college compared to 57% of
nonrespondents, chi-square = 12.90, p < .01) and less likely to be treated in substance abuse/dual
diagnosis programs (26% of respondents versus 30% of nonrespondents, chi-square = 5.47,
p < .05). Other differences were not statistically significant.
Discussion
A number of analyses supported the psychometric strength of the BASIS-24 instrument within
the three major race/ethnicity groups. Confirmatory factor analyses were quite consistent within
each group and better than previously reported for the original BASIS-32 instrument [16,42]. Internal
consistency reliability was acceptable for all of the subscales and all race/ethnicity groups with one
exception: Latino inpatients, for whom the emotional lability and psychotic symptoms subscales
were weaker than for Whites and African-Americans. The weaker reliability values for Latinos may
relate to both cultural issues and to English proficiency. Research in mental health diagnostic
assessment and self-reported symptom distress has shown differences both between Latinos and
Whites, as well as among specific Latino subgroups [43,44]. In particular, specific psychotic symptoms
perceived in the majority (White) culture as signs of psychotic disorders (e.g., hearing voices or
having special powers) are consistent with cultural or religious beliefs in Latino cultures;
consequently, despite earlier qualitative testing, these items may not be understood in the same way
by Latinos as they are by individuals from the majority culture [45]. English proficiency may also
affect understanding and responses to items. Although all participants in this study responded to
English language questionnaires, 27% of Latinos reported that English was not their first language,
whereas only 2% of the non-Latino sample reported that English was not their first language.
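
Internal consistency reliability of a subscale is conventionally summarized with coefficient alpha [38]. As a minimal sketch of that statistic, assuming complete item-level responses, the function below computes alpha from a respondents-by-items table. The data are hypothetical and are not drawn from the study; `cronbach_alpha` and `toy_responses` are illustrative names.

```python
from statistics import variance

def cronbach_alpha(rows):
    """Coefficient alpha for one subscale.

    rows: list of respondents, each a list of item scores (no missing values).
    alpha = k/(k-1) * (1 - sum of item variances / variance of total scores)
    """
    k = len(rows[0])
    item_vars = [variance([r[i] for r in rows]) for i in range(k)]
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Hypothetical responses from five people to a three-item subscale
toy_responses = [
    [0, 1, 0],
    [1, 1, 2],
    [2, 3, 2],
    [4, 3, 4],
    [3, 4, 3],
]

alpha = cronbach_alpha(toy_responses)
print(f"alpha = {alpha:.2f}")
```

Items that rise and fall together across respondents push alpha toward 1; items that are understood differently by some respondents, as the paragraph above suggests may occur for the psychotic symptoms items, add item variance that the total score does not share and pull alpha down.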
Concurrent validity as determined by correlations with other measures of mental and physical
health was strong and not consistently different for the three race/ethnicity groups. The low-to-moderate
correlations between the overall BASIS-24 score and clinical measures (GAF and
comorbidity index) are consistent with findings in the literature [46,47].
Regarding sensitivity to change, statistically significant change was found among inpatients
from all three race/ethnicity groups, with moderate-to-large effect sizes and no systematic
difference among groups [41]. The smaller effect sizes for outpatients were expected, and the lack of
statistically significant change over time for African-Americans and Latinos in some areas may
relate to the small sample size for these two groups. Alternatively, it is possible that the outpatient
treatment provided may have been less effective for the two minority groups. Lower levels of
improvement may also be associated with the weaker discriminant validity among the minority
groups with respect to distinguishing between inpatients and outpatients, in that both of these
findings may relate to accessibility of services in a particular region and/or the point at which an
individual seeks treatment. If African-Americans and Latinos wait longer to seek or receive
outpatient treatment, they may report more symptom distress than White outpatients who receive
treatment earlier. Recent research from the National Comorbidity Survey Replication indicates that
African-Americans and Latinos are less likely than Whites to receive treatment for mental illness and
substance abuse over the previous 12 months and have longer delays in receiving
treatment for at least some disorders [48,49]. In support of this explanation for the study results, the
Latino outpatients in this study reported greater symptom severity/frequency than Whites in all
six domains, and African-American outpatients reported greater symptom severity/frequency in
five of the six domains. If minority outpatients wait longer to get treatment, improvement may be
slower and more difficult to accomplish, highlighting the potential need for earlier intervention for
mental health problems. Differential levels of improvement among the race/ethnicity groups also
raise questions about possible interactions between race/ethnicity, other respondent characteristics,
and program type (mental health versus substance abuse/dual diagnosis). Because the
BASIS-24 is intended to assess substance abuse outcomes in addition to mental health outcomes,
both program types were included in the study. However, examination of outcomes for each type
of program, while beyond the scope of this paper, would further enhance understanding of both
the usefulness of the instrument in each type of setting as well as understanding of the multitude
of factors that influence improvement following treatment.
Weaker discriminant validity with respect to the instrument subscales’ capacity to distinguish
between groups with corresponding diagnoses may be caused by several factors in addition to, or
rather than, unreliability of the instrument. First, the number of cases in particular diagnostic groups
was quite low in some race/ethnicity subgroups. For example, there were only 10 Latino inpatients
and 14 Latino outpatients with bipolar disorders. Second, diagnoses extracted from medical records
or administrative databases provided by each participating site may not be reliable.
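
The role of small diagnostic subgroups can be made concrete with the inpatient emotional lability values in Table 5. The sketch below compares the mean difference between patients with and without a bipolar diagnosis against its standard error, using an unequal-variance (Welch-type) standard error; that choice, and the helper name `diff_and_se`, are assumptions made for illustration rather than the authors' procedure.

```python
import math

def diff_and_se(mean_dx, sd_dx, n_dx, mean_no, sd_no, n_no):
    """Mean difference (diagnosis minus no diagnosis) and its standard error.

    Uses the unequal-variance form: sqrt(sd1^2/n1 + sd2^2/n2).
    """
    diff = mean_dx - mean_no
    se = math.sqrt(sd_dx**2 / n_dx + sd_no**2 / n_no)
    return diff, se

# Emotional lability, inpatients with vs. without bipolar disorder (Table 5)
white_diff, white_se = diff_and_se(2.19, 1.12, 241, 1.98, 1.05, 1304)
latino_diff, latino_se = diff_and_se(2.15, 1.18, 10, 1.70, 1.11, 91)

white_ratio = white_diff / white_se
latino_ratio = latino_diff / latino_se

print(f"White:  diff = {white_diff:.2f}, SE = {white_se:.3f}, diff/SE = {white_ratio:.2f}")
print(f"Latino: diff = {latino_diff:.2f}, SE = {latino_se:.3f}, diff/SE = {latino_ratio:.2f}")
```

Under these assumptions the Latino mean difference is roughly twice the White difference, yet its standard error is about five times larger because only 10 Latino inpatients carried the diagnosis, so the ratio stays well below conventional significance thresholds.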
Limitations
Although this study included a fairly large sample from multiple sites, several limitations
should be noted. First, this was not a probability sample of all mental health service recipients or
treatment programs. Participating sites were a convenience sample of programs that had previous
experience or interest in outcome assessment and agreed to participate in the study. Although
specific efforts were made to include programs with racially/ethnically diverse populations, it was
not possible to match programs by sample characteristics. Consequently, results may not be
representative of all mental health service recipients or programs.
Furthermore, because of demographic and diagnostic differences between the three race/
ethnicity groups, and because of confounding between race/ethnicity and program (the three race/
ethnicity groups were not evenly distributed across the 27 programs), it is not appropriate to
directly compare mean mental health status scores or improvement over time for the three groups.
Second, sample sizes for the smaller minority groups (e.g., Asians, American Indians, etc.),
specific Latino subgroups (e.g., Puerto Rican, Cuban, Mexican, etc.), and for individuals reporting
multiple races were insufficient to assess reliability and validity of the instrument among these
groups. Third, because the instrument was tested only in English, individuals who did not speak
English were not included. Finally, the clinical data available for use in the validation effort were
limited to one clinician-rated measure of mental health status (GAF) and psychiatric diagnoses
extracted from medical records or administrative databases. Both of these variables are subject to
unreliability. However, because there is currently no “gold standard” mental health assessment
measure, a range of clinical and self-report measures was used for validation purposes.
Implications for Behavioral Health
Results of this research provide support for the BASIS-24 instrument as a reliable and valid
tool for assessing mental health outcomes among English speakers from the three largest race/
ethnicity groups in the USA. Efforts to increase generalizability across diverse race/ethnicity
groups by oversampling minorities during the instrument development process were fairly
successful, as evidenced by stronger fit indices than were previously reported for racial/ethnic
minorities for the original BASIS-32 instrument [16]. Results of CFA, reliability, and validity tests
were generally consistent across the three race/ethnicity groups, with two exceptions. First, reliability
of two subscales (emotional lability and psychotic symptoms) was lower among Latino inpatients
compared with White and African-American inpatients, although this difference did not occur
among Latino outpatients. This finding should be examined in further research with larger samples
of Latinos to determine its validity. If confirmed, sources of this unreliability, possibly
deriving from cultural or language differences, should be explored. The second consistent
difference among race/ethnicity groups, i.e., much smaller differences in baseline scores between
inpatients and outpatients who were African-American or Latino, also warrants further
exploration to determine whether this is an indication of poor discriminant validity of the
instrument or race/ethnicity differences in access or entry into mental health treatment.
This study highlights the continuing challenge of developing culture-free mental health status
measures, a goal worth striving for, but one which has not been fully achieved. However, lack of a
perfect “gold standard” outcome measure should not deter efforts to assess mental health
treatment outcomes. BASIS-24 has been found to be feasible, reliable, and valid for routine
outcomes monitoring in a range of mental health and substance abuse programs treating diverse
samples of mental health consumers. Further research is needed to examine its use among other
race/ethnicity groups.
Acknowledgments
This research was supported by grant R01 MH58240 from the National Institute of Mental Health
and by the Veterans Administration Health Services Research & Development program. The views
expressed in this article are those of the authors and do not necessarily represent the views of the
Department of Veterans Affairs. The authors thank the clinical programs that participated in this
research and colleagues Barbara Bokhour, Ph.D., Rani Elwy, Ph.D., Donald Miller, Sc.D., and
Avron Spiro III, Ph.D., for their comments on earlier versions of this manuscript.
References
1. Donabedian A. Explorations in quality assessment and monitoring, Vol. I: The definition of quality and approaches to its assessment.
Ann Arbor, MI: Health Administration Press; 1980.
2. Mirin SM, Namerow MJ. Why study treatment outcome? Hospital and Community Psychiatry. 1991;42:1007–1013.
3. Eisen SV, Dickey B. Mental health outcome assessment: the new agenda. Psychotherapy. 1996;33:181–189.
4. Sederer LI, Dickey B, Hermann RC. The imperative of outcomes assessment in psychiatry. In: Sederer LI, Dickey B, eds. Outcomes
Assessment in Clinical Practice. Baltimore, MD: Williams & Wilkins; 1996:1–7.
5. Smith GR, Fischer EP, Nordquist CR, et al. Implementing outcomes management systems in mental health settings. Psychiatric
Services. 1997;48:364–368.
6. Smith GR, Manderscheid RW, Flynn LM, et al. Principles for assessment of patient outcomes in mental health care. Psychiatric
Services. 1997;48:1033–1036.
7. Hodges K, Wotring J. The role of monitoring outcomes in initiating implementation of evidence-based treatments at the state level.
Psychiatric Services. 2004;55:396–400.
8. Brown GS, Burlingame GM, Lambert MJ. Pushing the quality envelope: a new outcomes management system. Psychiatric Services.
2001;52:925–934.
9. Joint Commission on Accreditation of Healthcare Organizations. Oryx Outcomes: The Next Evolution in Accreditation. Oakbrook
Terrace, IL; 1997.
10. National Committee for Quality Assurance. 2001 Standards and Surveyor Guidelines for the Accreditation of MBHO’s. Washington,
DC; 2000.
11. Sechrest L, McKnight P, McKnight K. Calibration of measures for psychotherapy outcome studies. American Psychologist.
1996;51:1065–1071.
12. Lambert MJ, Gregersen AT, Burlingame GM. The Outcome Questionnaire-45. In: Maruish MM, ed. The Use of Psychological Testing
for Treatment Planning and Outcome Assessment, Third Edition. Volume 3. Mahwah, NJ: Lawrence Erlbaum Associates; 2004:191–
234.
13. McLellan AT, Cacciola J, Kushner H, et al. The fifth edition of the addiction severity index: cautions, additions and normative data.
Journal of Substance Abuse and Treatment. 1992;9:461–480.
14. Derogatis LR, Fitzpatrick M. The SCL-90-R, the Brief Symptom Inventory (BSI) and the BSI-18. In: Maruish MM, ed. The Use of
Psychological Testing for Treatment Planning and Outcome Assessment, Third Edition. Volume 3. Mahwah, NJ: Lawrence Erlbaum
Associates; 2004:1–41.
15. McHorney CA, Ware JE, Lu JF, et al. The MOS 36-item short-form health survey (SF-36): III. Tests of data quality, scaling
assumptions, and reliability across diverse patient groups. Medical Care. 1994;32:40–66.
16. Chow JC-C, Snowden LR, McConnell W. A confirmatory factor analysis of the BASIS-32 in racial and ethnic samples. Journal of
Behavioral Health Services & Research. 2001;28:400–411.
17. Stewart AL, Nápoles-Springer A. Health-related quality of life assessments in diverse population groups in the United States. Medical
Care. 2000;38(Supplement II):II102–II124.
18. Canino G, Bravo M. The adaptation and testing of diagnostic and outcome measures for cross-cultural research. International Review of
Psychiatry. 1994;6:281–286.
19. U.S. Department of Health & Human Services. Culture, Race and Ethnicity. A Supplement to Mental Health: A Report of the Surgeon
General. Chapter 6. Mental Health Care for Hispanic Americans. Office of the Surgeon General. Substance Abuse and Mental Health
Services Administration, Center for Mental Health Services; 2001:129–155.
20. Shrout PE, Canino GJ, Bird HR, et al. Mental health status among Puerto Ricans, Mexican Americans and non-Hispanic whites.
American Journal of Community Psychology. 1993;21:383–395.
21. Phillips D, Leff HS, Kaniasty E, et al. Culture, Race and Ethnicity in Performance Measurement: A Compendium of Resources. The
Evaluation Center @ HSRI. Cambridge, MA: Human Services Research Institute; 1999.
22. Eisen SV, Dill DL, Grob MC. Reliability and validity of a brief patient-report instrument for psychiatric outcome evaluation. Hospital
and Community Psychiatry. 1994;45:242–247.
23. Eisen SV, Normand SLT, Belanger AJ, et al. BASIS-32 and the Revised Behavior and Symptom Identification Scale (BASIS-R). In:
Maruish MM, ed. The Use of Psychological Testing for Treatment Planning and Outcome Assessment, Third Edition. Volume 3.
Mahwah, NJ: Lawrence Erlbaum Associates; 2004:79–115.
24. Eisen SV, Normand SLT, Belanger AJ, et al. The revised Behavior and Symptom Identification Scale (BASIS-24): reliability and
validity. Medical Care. 2004;42:1230–1241.
25. Lessler JT. Choosing questions that people can understand and answer. Medical Care. 1995;33:AS203–AS208.
26. DeMaio TJ, Rothgeb JM. Cognitive interviewing techniques: in the lab and in the field. In: Schwarz N, Sudman S, eds. Answering
Questions: Methodology for Determining Cognitive and Communicative Processes in Survey Research. San Francisco: Jossey-Bass;
1996:177–195.
27. Nunnally JC, Bernstein IH. Psychometric Theory. 3rd ed. New York: McGraw-Hill; 1994.
28. Embretson SE. The new rules of measurement. Psychological Assessment. 1996;8:341–349.
29. Hays RD, Morales LS, Reise SP. Item response theory and health outcomes measurement in the 21st century. Medical Care.
2000;38(9 Suppl):II28–II42.
30. Hambleton RK. Emergence of item response modeling in instrument development and data analysis. Medical Care. 2000;38(9 Suppl):II60–II65.
31. McHorney CA. Generic health measurement: past accomplishments and a measurement paradigm for the 21st century. Annals of
Internal Medicine. 1997;127:743–750.
32. Ware JE, Kosinski M, Keller S. A 12-item short-form health survey (SF-12): construction of scales and preliminary tests of reliability
and validity. Medical Care. 1996;34:220–233.
33. American Psychiatric Association. Diagnostic and Statistical Manual of Mental Disorders. Fourth Edition (DSM-IV). Washington, DC:
American Psychiatric Association; 1994.
34. Spitzer RL, Gibbon M, Endicott J. Global Assessment Scale (GAS), Global Assessment of Functioning (GAF) Scale, Social and
Occupational Functioning Assessment Scale (SOFAS). In: Handbook of Psychiatric Measures. Washington, DC: American Psychiatric
Association; 2000:96–100.
35. Friedman DJ, Cohen BB, Averbach AR, et al. Race/ethnicity and OMB directive 15: implications for state public health practice.
American Journal of Public Health. 2000;90:1714–1719.
36. Little RJA. Regression with missing X’s: a review. Journal of the American Statistical Association. 1992;87:1227–1237.
37. Jöreskog K, Sörbom D. LISREL 8.53: Structural Equation Modeling with the SIMPLIS Command Language. Chicago: Scientific
Software International; 2002.
38. Cronbach LJ. Coefficient alpha and the internal structure of tests. Psychometrika. 1951;16:297–334.
39. Hu L, Bentler PM. Cutoff criteria for fit indexes in covariance structure analysis: conventional criteria versus new alternatives.
Structural Equation Modeling. 1999;6:1–55.
40. Browne MW, Cudeck R. Alternative ways of assessing model fit. In: Bollen KA, Long JS, eds. Testing Structural Equation Models.
Newbury Park, CA: Sage; 1993:136–162.
41. Cohen J. Statistical Power Analysis for the Behavioral Sciences (2nd ed.). Hillsdale, NJ: Lawrence Erlbaum Associates; 1988.
42. Eisen SV, Culhane MA. Behavior and Symptom Identification Scale (BASIS-32). In: Maruish MM, ed. The Use of Psychological
Testing for Treatment Planning and Outcome Assessment, Second Edition. Mahwah, NJ: Lawrence Erlbaum Associates; 1999:759–790.
43. Canino G, Bravo M. The adaptation and testing of diagnostic and outcome measures for cross-cultural research. International Review of
Psychiatry. 1994;6:281–286.
44. U.S. Department of Health & Human Services. Culture, Race and Ethnicity. A Supplement to Mental Health: A Report of the Surgeon
General. Chapter 6, Mental health care for Hispanic Americans. Office of the Surgeon General. Substance Abuse and Mental Health
Services Administration, Center for Mental Health Services; 2001:129–155.
45. Canino G, Lewis-Fernandez R, Bravo M. Methodological challenges in cross-cultural mental health research. Transcultural Research.
1997;23:163–184.
46. Moos RH, McCoy L, Moos BS. Global assessment of functioning (GAF) ratings: determinants and role as predictors of one-year
treatment outcomes. Journal of Clinical Psychology. 2000;56:449–460.
47. Moos RH, Nichol AC, Moos BS. Global assessment of functioning ratings and the allocation and outcomes of mental health services.
Psychiatric Services. 2002;53:730–737.
48. Wang PS, Lane ML, Olfson M, et al. Twelve-month use of mental health services in the United States. Results from the national
comorbidity survey replication. Archives of General Psychiatry. 2005;62:629–640.
49. Wang PS, Berglund P, Olfson M. Failure and delay in initial treatment contact after first onset of mental disorders in the national
comorbidity survey replication. Archives of General Psychiatry. 2005;62:603–613.