Statistical software

Please read this very carefully and ask questions if you don’t understand it.

Software Examples (you are by no means limited to this list!  New examples are highly encouraged!!):

Tableau, KPI, WHONET, SAS, STATA, Vitalnet, SaTScan, R, JMP, Qlik, Minitab, VOSViewer, ATHENANet

You may also focus on SPSS capabilities that are not utilized in this course.  Examples: predictive analytics, forecasting

Statistical Software Paper

These articles are examples of acceptable sources.  Feel free to use them in your papers, use them and their references for inspiration or not use them at all.  🙂

Example articles:

· Software Article 2.pdf

· Software Article.pdf

· Statistical Software Paper.docx

· MOHA 570 Article Review.pdf

· NCHL competencies and definitions copy.pdf

Addresses course outcomes 3, 4
Addresses program outcome 10
Addresses NCHL competency objectives: analytical thinking L4, communication skills L1, performance
measurement L2, impact and influence L2, information seeking L2, innovation L3, strategic orientation
L1, professional and social responsibility L1


NCHL Leadership Competency Model – Quick Reference Guide

The following pages have been designed to facilitate the process of matching objectives to competencies. This quick reference guide format outlines only three competencies per page in alphabetical order, helping faculty to scan the categories and levels for an efficient matching process.

Healthcare Leadership Competency Model, Version 2.1

L1. Accountability

The ability to hold people accountable to standards of performance or ensure compliance using the power of one’s position or force of personality appropriately and effectively, with the long-term good of the organization in mind.

L1.1 Communicates Requirements and Expectations

Gives basic directions; Makes needs and requirements reasonably clear; Ensures understanding of task requirements and performance expectations; Explicitly delegates details of routine tasks in order to free self for more valuable or longer-range considerations

L1.2 Sets Limits

Establishes high but achievable performance, quality, and resource utilization standards; Firmly says no to unreasonable requests; Sets limits for others’ behavior and actions; Limits others’ options to force them to make desired resources available

L1.3 Demands High Performance

Imposes new, different, or higher standards of performance with little input from others; Insists on compliance with own orders or requests; Monitors performance against clear standards; Ensures promised results are achieved; Demands high performance, quality, and resources; Issues clear warnings about consequences for non-performance; Shares results with stakeholders

L1.4 Confronts Performance Problems

Openly and directly confronts individual and team performance shortfalls and problems; Holds people accountable for performance; Ensures timely resolution to performance deficiencies; Appropriately dismisses people for cause

L1.5 Creates Culture of Accountability

Creates a culture of strong accountability throughout the organization; Holds others accountable for demanding high performance and enforcing consequences of non-performance and taking action; Accepts responsibility for results of own work and that delegated to others

L2. Achievement Orientation

A concern for surpassing a standard of excellence. The standard may be one’s own past performance (striving for improvement); an objective measure (results orientation); outperforming others (competitiveness); challenging goals, or something that has not been done previously (innovation).

L2.1 Wants to Do Job Well

Tries to do the job well or right; Expresses a desire to do better; Expresses frustration at waste or inefficiency; Delivers expected results in line with job requirements

L2.2 Creates Own Measure of Excellence

Sets standard of personal expectation for excellence in both the quality and quantity of work; Tracks and measures outcomes against a standard of excellence – one that is higher and more precise – not imposed by others; Focuses on new or more precise ways of meeting goals set by others

L2.3 Improves Performance

Makes specific changes in the system or in own work methods to improve performance; Does something better, faster, at lower cost, more efficiently

L2.4 Sets and Works to Meet Challenging Goals

Establishes “stretch goals” for self and others that are realistic and possible to reach; Strives to achieve a unique standard (e.g., “No one had ever done it before.”); Compares specific measures of baseline performance compared with better performance at a later point in time (e.g., “When I took over, efficiency was 20%; now it is up to 85%.”)

L2.5 Makes Cost-Benefit Analyses

Makes decisions, sets priorities, or chooses goals on the basis of calculated inputs and outputs (e.g., makes explicit considerations of potential profit and risks or return on investment); Analyzes entrepreneurial opportunities in relation to risks, return on investment, and the scope and magnitude of the investments

L2.6 Takes Calculated Entrepreneurial Risks

Commits significant resources and/or time in the face of uncertain results when significantly increased or dramatic benefits could be the outcome (e.g., improved performance, a challenging goal)

L3. Analytical Thinking

The ability to understand a situation, issue, or problem by breaking it into smaller pieces or tracing its implications in a step-by-step way. It includes organizing the parts of a situation, issue, or problem systematically; making systematic comparisons of different features or aspects; setting priorities on a rational basis; and identifying time sequences, causal relationships, or if-then relationships.

L3.1 Breaks Down Problems

Breaks problems into simple lists of tasks or activities without assigning values; Lists items with no particular order or set of priorities

L3.2 Identifies Basic Relationships

Identifies the cause-and-effect relationship between two aspects of a situation; Separates situations into two parts: pro and con; Sorts out a list of tasks in order of importance

L3.3 Recognizes Multiple Relationships

Makes multiple causal links: several potential causes of events, several consequences of actions, or multiple-part chain of events (A leads to B leads to C leads to D); Analyzes relationships among several parts of a problem or situation (e.g., anticipates obstacles and thinks ahead about next steps, in detail, with multiple steps)

L3.4 Develops Complex Plans or Analyses

Identifies multiple elements of a problem and breaks down each of those elements in detail, showing causal relationships between them; Peels back multiple layers of a problem; Uses several analytical techniques to identify potential solutions and weigh the value of each


© Copyright 2006 National Center for Healthcare Leadership



L4. Change Leadership

The ability to energize stakeholders and sustain their commitment to changes in approaches, processes, and strategies.

L4.1 Identifies Areas for Change

Publicly defines one or more specific areas where change is needed; Identifies what needs to change, but may not completely describe the path to change

L4.2 Expresses Vision for Change

Defines an explicit vision for change (i.e., what should be different and how); Modifies or redefines a previous vision in specific terms; Outlines strategies for change

L4.3 Ensures Change Message is Heard

Delivers the message or vision for change to everyone affected; Repeats message wherever possible; Posts change messages (e.g., banners, plaques, or other physical and public reminders); Provides opportunities for others to engage in change initiatives

L4.4 Challenges Status Quo

Publicly challenges the status quo by comparing it to an ideal or a vision of change; Creates a realistic sense of crisis or a disequilibrium in order to prepare the ground for change; Energizes others for change

L4.5 Reinforces Change Vision Dramatically

Takes a dramatic action (other than giving a speech) to reinforce or enforce the change effort; Personally exemplifies or embodies the desired change through strong, symbolic actions that are consistent with the change

L4.6 Provides Calm During the Storm of Change

Maintains an eye on the strategic goals and values during the chaos of change; Provides focused, unswerving leadership to advance change initiatives; Exemplifies quiet confidence in the progress and benefits of change; Provides direction for overcoming adversity and resistance to change; Defines the vision for the next wave of change

L5. Collaboration

The ability to work cooperatively with others, to be part of a team, to work together, as opposed to working separately or competitively. Collaboration applies when a person is a member of a group of people functioning as a team, but not the leader.

L5.1 Conducts work in a cooperative manner

Supports team decisions; Does his or her share of the work; Keeps other team members informed and up-to-date about what is happening in the group; Shares all relevant or useful information

L5.2 Expresses Positive Attitudes and Expectations of Team or Team Members

Expresses positive attitudes and expectations of others in terms of their abilities, expected contributions, etc.; Speaks of team members in positive terms, either to the team member directly or to a third party; Develops effective working interactions with teammates

L5.3 Solicits Input

Genuinely values others’ input and expertise; Actively seeks the input of others to increase the quality of solutions developed; Displays willingness to learn from others, including subordinates and peers; Solicits ideas and opinions to help form specific decisions or plans; Works to create common mindset

L5.4 Encourages Others

Publicly credits others who have performed well; Encourages others; Empowers others

L5.5 Builds Team Commitment

Acts to promote good working relationships regardless of personal likes or dislikes; Breaks down barriers across groups; Builds good morale or cooperation within the team, including creating symbols of group identity or other actions to build cohesiveness; Encourages or facilitates a beneficial resolution to conflict; Creates conditions for high-performance teams

L6. Communication Skills

The ability to speak and write in a clear, logical, and grammatical manner in formal and informal situations to prepare cogent business presentations, and to facilitate a group.

L6.1 Uses Generally Accepted English

Uses subject-verb agreement and parallel structure; Uses rules of punctuation and sentence and paragraph construction; Uses concise thematic construction

L6.2 Prepares Effective Written Business Cases or Presentations

Uses accurate and complete presentation of facts; Uses logical presentation of arguments pro and con; Develops well-reasoned recommendations; Prepares concise executive summary

L6.3 Makes Persuasive Oral Presentations

Uses clear and understandable voice that is free of extraneous phrases (i.e., “uhm” and “you know”); Uses effective audiovisual media (presentation software, exhibits, etc.); Stays on the topic; Engages in non-defensive Q&A; Stays within time allotment

L6.4 Facilitates Group Interactions

Uses varied communication management techniques, brainstorming, consensus building, group problem solving, and conflict resolution; Demonstrates good meeting management techniques (e.g., agenda development, time


Workplace Emotions: The Role of Supervision and Leadership

Joyce E. Bono, Hannah Jackson Foldes, Gregory Vinson, and John P. Muros
University of Minnesota

In this experience sampling study, the authors examined the role of organizational leaders in employees’
emotional experiences. Data were collected from health care workers 4 times a day for 2 weeks. Results
indicate supervisors were associated with employee emotions in 3 ways: (a) Employees experienced
fewer positive emotions when interacting with their supervisors as compared with interactions with
coworkers and customers; (b) employees with supervisors high on transformational leadership experi-
enced more positive emotions throughout the workday, including interactions with coworkers and
customers; and (c) employees who regulated their emotions experienced decreased job satisfaction and
increased stress, but those with supervisors high on transformational leadership were less likely to
experience decreased job satisfaction. The results also suggest that the effects of emotional regulation on
stress are long lasting (up to 2 hr) and not easily reduced by leadership behaviors.

Keywords: emotional regulation, mood, stress, job satisfaction, leadership

Over the past 2 decades, there has been a growing interest in
affective and emotional experiences at work (Brief & Weiss,
2002), including interest in the role of mood and emotions in
employee motivation (Erez & Isen, 2002), job performance (Law,
Wong, & Song, 2004), creativity (George & Zhou, 2002), and job
attitudes (Weiss & Cropanzano, 1996). Emotional regulation at
work has also received attention, especially given evidence that
regulating emotions is associated with cardiovascular system ac-
tivation (Gross & Levenson, 1993, 1997), stress, emotional ex-
haustion (Pugliesi, 1999), and physical symptoms such as head-
aches (Schaubroeck & Jones, 2000).

There has also been interest in emotions in the leadership
domain. Most influential theories of transformational and charis-
matic leadership (e.g., Bass, 1985; Conger & Kanungo, 1998;
House, 1977; Shamir, House, & Arthur, 1993) posit emotional
links between leaders and followers, yet there is little empirical
research linking managers and their leadership behaviors to em-
ployees’ emotions. A few studies have been conducted, but these
studies have not fully elucidated emotional links between leaders
and followers because they either manipulated leader emotions
(Sy, Côté, & Saavedra, 2005); used simulated leaders and follow-
ers (Bono & Ilies, 2006); or used a single-time, single-source
survey to assess follower emotions and leader behavior (McColl-
Kennedy & Anderson, 2002). Therefore, the primary purpose of
our study was to examine the effects of supervisors and managers
on employees’ emotions in a natural work setting. First, we examined
the direct effects of supervisors and managers and their
leadership behaviors on employees’ emotional experiences (i.e.,
experienced emotions, expressed emotions, and emotional regula-
tion). Second, we examined the extent to which managers’ lead-
ership behaviors can buffer employees from the negative conse-
quences of emotional regulation. We used an experience sampling
methodology and within-person analyses, which allowed us to
focus on within-person covariations and the effects of leaders’
behaviors on these covariations. Figure 1 displays the associations
of interest in this research.

Leaders and Employee Emotions

Despite widespread beliefs that supervisors are a key source of
bad moods at work, there is little empirical research documenting
these effects. There is an extensive literature on workplace factors
associated with employee well-being and stress (see Danna &
Griffin, 1999, for a review), and a key assumption of this research
is that supervisors affect employees’ emotional experiences. From
a theoretical standpoint, there are at least two reasons why em-
ployees may feel increased anxiety during supervisory interac-
tions. First, supervisors are the individuals who directly evaluate
performance, and thus interactions with supervisors may increase
anxiety about performance, which can be directly evaluated by the
supervisor during interactions. Second, Ryan and Deci (2000)
provided compelling evidence that individuals have a need for
autonomy, which tends to be limited in the workplace by supervisors.
Interactions with supervisors allow for closer observations
of employee behavior, which may leave employees feeling mon-
itored and controlled (George & Zhou, 2001), leading to feelings
of irritation. Furthermore, research by Diefendorff and Richard
(2003) suggests that supervisors’ expectations may lead employees
to constrain their emotional expressions, which also leads to neg-
ative affect.

Joyce E. Bono, Hannah Jackson Foldes, Gregory Vinson, and John P. Muros, Department of Psychology, University of Minnesota.

Hannah Jackson Foldes is now employed at Personnel Decisions International, and Gregory Vinson is now employed at The Center for Victims of Torture.

This article was supported in part by a Grant-in-Aid of Research and Artistry, University of Minnesota.

Correspondence concerning this article should be addressed to Joyce E. Bono, Department of Psychology, University of Minnesota, 75 East River Road, Minneapolis, MN 55455. E-mail: [email protected]

Journal of Applied Psychology, 2007, Vol. 92, No. 5, 1357–1367. Copyright 2007 by the American Psychological Association. 0021-9010/07/$12.00 DOI: 10.1037/0021-9010.92.5.1357

Empirical research is also consistent with the notion that supervisors may negatively influence employee emotions. Glaso and Einarsen (2006) found four affective factors relevant to the supervisor–subordinate relationship, three of which were negative (i.e., frustration, violation, and uncertainty). Fitness (2000) interviewed employees about their experiences of anger and found that unfair treatment by supervisors, which tended to remain unresolved, was a key source of employee anger. A recent experience sampling study directly examined the link between mood and supervisory interactions (Miner, Glomb, & Hulin, 2005). This research revealed that employees rated 80% of their interactions with their supervisors as positive and only 20% as negative; however, the effects of negative interactions on employee mood were, in general, 5 times stronger than the effects of positive interactions. These findings suggest that even though most supervisory interactions are positive, the overall net effect of interactions with supervisors may be slightly negative because of the stronger effects of negative interactions on employee mood.¹
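
The net-effect reasoning in the preceding paragraph can be checked with simple arithmetic. The following is a minimal sketch, not a calculation taken from the cited study: it assumes that mood effects add linearly, that one positive interaction shifts mood by one arbitrary unit, and that one negative interaction shifts it five units the other way. The function name and the unit scale are illustrative assumptions.

```python
def net_mood_effect(share_positive, positive_effect, negative_multiplier):
    """Expected mood shift per interaction under a simple additive assumption.

    share_positive: proportion of interactions rated positive (0 to 1).
    positive_effect: mood shift from one positive interaction (arbitrary units).
    negative_multiplier: how many times stronger a negative interaction is.
    """
    share_negative = 1.0 - share_positive
    negative_effect = -negative_multiplier * positive_effect
    return share_positive * positive_effect + share_negative * negative_effect


# Proportions as summarized in the text: 80% positive, 20% negative,
# negative effects about 5 times stronger. The unit effect of 1.0 is assumed.
net = net_mood_effect(share_positive=0.80, positive_effect=1.0, negative_multiplier=5)
print(round(net, 2))  # -0.2
```

Under these assumptions the expected shift is slightly below zero, which matches the "slightly negative" net effect described in the text; a different weighting scheme would give a different number.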

Hypothesis 1: Employees experience more negative and
fewer positive emotions when interacting with supervisors
than when interacting with coworkers or clients and customers.

In addition to the general mood effects of interactions with
supervisors, we also expected supervisors’ typical behavioral pat-
terns (i.e., leadership behaviors) to influence employees’ emo-
tional experiences. Abusive supervisors (Tepper, 2000) might be
expected to elicit frustration, anxiety, and anger. In contrast, su-
pervisors who use transformational leadership behaviors may elicit
feelings of happiness and enthusiasm in employees. Transforma-
tional and charismatic leadership theories (Bass, 1985; Conger &
Kanungo, 1998; House, 1977) are unique in the emphasis they
place on the emotional aspects of leadership. Shamir et al. (1993)
focused on emotional attachment to the leader and the emotional
arousal of followers, and Bass (1985) suggested that transforma-
tional leadership “has an intense emotional component” (p. 36).
Testing these notions, Bono and Ilies (2006) recently linked ratings
of transformational leadership to managers’ use of positive affect
words (e.g., good, happy) in their written and verbal communica-
tions. Lab studies also suggest that positive emotions experienced
and expressed by leaders may be transferred to employees through
the process of emotional contagion. Sy et al. (2005) reported that
leader’s positive mood, which was induced in their study, influ-
enced group affect and performance. Bono and Ilies extended this
work by linking leaders’ expressions of enthusiasm to followers’
positive mood states. Whereas these results are promising, there
has been no research examining the effects of leadership behaviors
on employee mood throughout the day in a natural work setting.

Hypothesis 2: There is a positive association between super-
visors’ transformational leadership behaviors and employees’
experiences of positive emotions throughout the workday.

Leaders and Employee Emotional Regulation

The preceding arguments imply direct effects of supervisors and
their leadership behaviors on employees’ experienced emotions. A
second potential influence of leader behaviors involves leaders as
a coping resource for employees who regulate their emotions at
work. Emotional regulation refers to processes by which individ-
uals choose which emotions they express, relative to those they
experience, in either a controlled or an automatic way (Gross,
1998). People habitually regulate their emotions and emotional
displays to conform with norms and expectations of the workplace,
as well as job role demands. Initial theoretical work by Hochschild
(1979) and the more recent free trait theory (Little, 2000) both
view emotional regulation as harmful to employees because it
involves acting without authenticity. Empirical evidence supports
this notion, as researchers have found that suppressing emotions
has both physiological and cognitive costs, including cardiovascu-
lar activation and decreased memory for social information (see
Gross, Richards, & John, 2006, for a review), and is also associ-
ated with psychological strain (e.g., stress, emotional exhaustion,
and burnout) and physical complaints (Schaubroeck & Jones,
2000). Therefore, we expected the following:

Hypothesis 3a: Emotional regulation (both hiding negative
emotions and faking positive emotions) is negatively associ-
ated with job satisfaction within individuals.

Hypothesis 3b: Emotional regulation (both hiding negative
emotions and faking positive emotions) is positively associ-
ated with experienced stress within individuals.

¹ Although there is considerable evidence that positive and negative mood represent independent dimensions of affect experienced by individuals in general (Watson & Clark, 1997), when participants are asked to report their current mood, a single dimension (i.e., happiness vs. unhappiness) best represents mood data (Diener & Emmons, 1984). However, because we wished to examine the potential differential effects of regulating positive and negative emotions, we considered positive and negative mood separately, expecting them to be negatively correlated.

Figure 1. Model linking supervision to employees’ experienced emotions, emotional regulation, stress, and job satisfaction. H = Hypothesis.

Whereas these hypotheses are not unique in linking emotional regulation to stress and job satisfaction, existing research has examined only the association between employees’ general tendencies to regulate emotions at work and their general tendencies to experience stress and decreased job satisfaction. In contrast, we hypothesized and tested direct links between individual episodes of emotional regulation, job stress, and job satisfaction and their covariation within individuals. Using a repeated measures, within-subjects design, we sought to determine whether individuals’ levels of stress and job satisfaction fluctuate throughout the course of the workday in conjunction with emotional regulation. By examining variability around each individual’s mean level, we controlled for differences between individuals in the tendency to experience, remember, and report emotional experiences and stress. This design also allowed us to examine the association between emotional regulation and stress over time.
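
Examining variability around each individual's mean level is commonly implemented as person-mean centering: each momentary rating is re-expressed as a deviation from that person's own average. The sketch below illustrates the idea only; it is not the authors' analysis code, and the participant labels and ratings are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def person_mean_center(records):
    """Return (person_id, deviation) pairs, where deviation = score - that person's mean."""
    by_person = defaultdict(list)
    for person_id, score in records:
        by_person[person_id].append(score)
    person_means = {pid: mean(scores) for pid, scores in by_person.items()}
    return [(pid, score - person_means[pid]) for pid, score in records]


# Hypothetical momentary stress ratings (1-5 scale) for two participants.
ratings = [("A", 4), ("A", 5), ("A", 3), ("B", 1), ("B", 2), ("B", 3)]
centered = person_mean_center(ratings)
print(centered)
```

Participant A reports more stress than participant B overall, but after centering both show deviations in the same -1 to +1 range, so only the within-person fluctuation remains for analysis.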

In linking supervisors’ and managers’ leadership behaviors to
emotional regulation, we drew from two theoretical perspectives.
Self-determination theory (Ryan & Deci, 2000) and the goal self-
concordance model (Sheldon & Elliott, 1999) both posit links
between authentic self-expression and individual well-being.
Hochschild (1983) suggested that when employees regulate their
emotions at work, they experience feelings of depersonalization
and separation from self; however, it appears that this is not always
the case, as certain conditions may weaken or eliminate the neg-
ative effects of emotional regulation. In a review of the literature,
John and Gross (2004) reported that healthier patterns of social
functioning and greater well-being were linked to emotional reg-
ulation for individuals who changed the way they thought about an
emotional event as compared with individuals who changed only
their emotional expressions. In addition, Ashforth and Humphrey
(1993) suggested that employees who identify with their work are
more likely to feel authentic even when conforming to role expec-
tations, such as demands for emotional regulation. This is where
leadership can play a role. In both lab and field studies, Bono and
Judge (2003) demonstrated that managers’ transformational lead-
ership behaviors can influence employees’ identification with their
work. They also found that transformational leadership was a
positive predictor of the extent to which employees felt that their
work activities were self-congruent and consistent with their own
interests and values. This suggests that individuals who report to
transformational leaders, compared with those who do not, will be
more likely to identify with their work and to change the way they
think about role-required emotional regulation, thereby leading
them to experience less stress and greater job satisfaction.

Managers who engage in transformational leadership behaviors
may also provide greater social support for their employees. Social
support has been recognized as important to the experience of
emotional management at work because providers of social sup-
port (e.g., family, coworkers, and supervisors) are thought to
enable employees to cope better with job stressors and increase
their sense of control (Abraham, 1998; Zapf, 2002). Abraham
reported that when employees experienced high social support, no
association between emotional regulation and job satisfaction was
found; however, under conditions of low social support, emotional
labor negatively affected satisfaction. Transformational leaders are
typically characterized as empathetic, and the behavioral dimen-
sion of individualized consideration explicitly describes leaders as
attending to and supporting the individual needs of followers. Such
leaders may be able to help employees cope with emotional
regulation in more effective and less psychologically draining
ways by helping employees to understand why and how positive
emotional expressions contribute to work goals (e.g., Shamir et al.,
1993). Because they engender trust, transformational leaders are
also uniquely positioned to assist employees in coping with emo-
tional labor. Therefore, we expected the following:

Hypothesis 4a: Supervisors’ transformational leadership be-
haviors moderate the link between emotional regulation and
job satisfaction, such that when supervisors engage in more
transformational leadership behaviors, the negative associa-
tion between emotional regulation and job satisfaction is
weaker than when supervisors engage in less transformational
leadership behaviors.

Hypothesis 4b: Supervisors’ transformational leadership be-
haviors moderate the link between emotional regulation and
stress, such that when supervisors engage in more transfor-
mational leadership behaviors, the positive association be-
tween emotional regulation and stress is weaker than when
supervisors engage in less transformational leadership behaviors.
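
A moderation hypothesis of this kind is often expressed as a regression with an interaction term, where the slope of the outcome on emotional regulation depends on the level of leadership. The sketch below is a generic illustration under that assumption; it is not the model the authors estimated, and the coefficient values are hypothetical, chosen only to show the pattern Hypothesis 4a describes.

```python
def simple_slope(b_regulation, b_interaction, leadership_level):
    """Slope of satisfaction on emotional regulation at a given leadership level.

    Assumes the moderated model:
        satisfaction = b0 + b_regulation * regulation
                          + b_leadership * leadership
                          + b_interaction * regulation * leadership
    """
    return b_regulation + b_interaction * leadership_level


# Hypothetical coefficients (not estimates from the study).
B_REGULATION = -0.5
B_INTERACTION = 0.25

low = simple_slope(B_REGULATION, B_INTERACTION, leadership_level=-1.0)  # 1 SD below the mean
high = simple_slope(B_REGULATION, B_INTERACTION, leadership_level=1.0)  # 1 SD above the mean
print(low, high)  # -0.75 -0.25
```

With a negative main effect and a positive interaction coefficient, the slope stays negative but moves closer to zero as leadership increases, which is what "weaker negative association" means in the hypothesis.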



Participants were 57 employees of an ambulatory health care
organization, randomly selected from nonmanagement employees
participating in an organization-wide survey. As is common in
health care organizations, participants were predominantly women
(94%). They were, on average, 41 years old (SD = 10 years).
Eighty-six percent were Caucasian, 9% were African American,
and 5% were of Asian origin. Participants’ average job tenure was
5.24 years. Participants worked in family practice clinics (n = 14),
administrative offices (n = 10), and a billing office (n = 33), and
held a broad variety of jobs ranging from nurse, medical assistant,
and lab technician in the family practice clinics; patient services
support, case analyst, and account follow-up specialist in the
billing office; and accountant, case manager, and human resource
specialist in the administrative office.


Participants provided two types of data: survey data and expe-
rience sampling data. Paper surveys were used to gather general
job satisfaction and stress data. Experience sampling data were
obtained via a handheld computer that participants carried for 2
weeks and included momentary stress, job satisfaction, and affec-
tive experiences. A third type of data—leadership behaviors of
participants’ supervisors—was obtained from an organization-
wide survey.

As part of an organization-wide survey that we administered,
all employees (N = 365; 309 nonmanagement and 56 management) reported
on the leadership behaviors of their direct supervisor. Surveys,
completed in small group sessions by 253
(73%) nonmanagement and 56 (100%) management employees,
were anonymous (employees recorded only the name of their
supervisor) and were returned to us in unmarked, sealed enve-
lopes. Only aggregate data were provided to organization ex-
ecutives. One month following the organizational survey, we
invited 73 employees to participate in the study. Participants
were quasi-randomly selected, proportionately from each location,
by entering the names of those who participated in the
organization-wide survey into a spreadsheet, sorting by first
name, and then choosing every other person on the list until the
desired number of participants was achieved.
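
The selection procedure described above (sort by first name, then take every other person) can be sketched in a few lines. This is an illustration for a single location, not the authors' spreadsheet procedure: the roster is hypothetical, starting with the first name on the sorted list is an assumption, and the proportional allocation across locations is not shown.

```python
def select_every_other(names, desired_count):
    """Sort names alphabetically (case-insensitive), then take every other name."""
    ordered = sorted(names, key=str.lower)
    # ordered[::2] keeps positions 0, 2, 4, ...; the slice then caps the sample size.
    return ordered[::2][:desired_count]


# Hypothetical first names standing in for survey participants at one location.
roster = ["Maria", "alex", "Priya", "Chen", "Dana", "Omar", "Lee"]
selected = select_every_other(roster, desired_count=3)
print(selected)  # ['alex', 'Dana', 'Maria']
```

Because the outcome is fully determined by the alphabetical order, the procedure is systematic rather than truly random, which is why the text calls it quasi-random.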

Fifty-eight of the 73 employees contacted (79%) affirmed they
would attend an information session, 6 (8%) were interested but
could not make it to a scheduled information session, and 9 (12%)
indicated they were not interested in participating in this research.
At the orientation session, participants were informed that the
organization’s management had endorsed their participation in the
study and that they were allowed to respond to surveys during
normal working hours. They were also informed that they would
receive $25 as payment for participation. After obtaining this
information, 57 of the 58 employees agreed to participate.

After agreeing to enroll in the study, participants completed a
survey, after which they were trained in the use of a personal
digital assistant (PDA). For the experience sampling surveys,
participants were signaled four times each day, Monday through
Friday, for 2 weeks. The signal intervals were modified to accom-
modate participants’ regular working hours (e.g., 7 a.m. to 3 p.m.
or 8 a.m. to 4 p.m.). Participants were signaled once during each
2-hr segment of their 8-hr workday. Participants were given 10
min to respond to each signal, after which the survey was no longer
available. They were told to respond to the surveys based on the
emotions and attitudes they were experiencing “immediately be-
fore the beep went off.” In addition, they were instructed to
respond to as many signals as possible during the workday without
compromising their job performance or service to patients or
customers. Because we were not able to program the PDAs to stop
signaling on weekends (i.e., Saturday and Sunday), we asked
participants to leave the PDA either at work, in their car, or at
home in a location where the signals would not be annoying. Data
were collected for 10 working days.2

At each signal, participants were asked to report whether they
were at work. As we were attempting to collect data based spe-
cifically on emotional regulation in a social context, we next asked
whether they were interacting with others when the signal
sounded. Only surveys with yes responses to this item were in-
cluded in this research. Next, they were asked to report on their
level of stress and job satisfaction at that moment. Finally, they
were asked to report on their affective experiences at that moment,
including the extent to which they were feeling six emotions
(happiness, enthusiasm, optimism, anxiety, irritation, and anger)
and the extent to which they were faking the three positive emo-
tions (happiness, enthusiasm, and optimism) and hiding the three
negative emotions (anxiety, irritation, and anger). Hereinafter, we
refer to experienced emotions as emotions experienced or felt
emotions and to faking and hiding emotions as emotional regula-
tion. The 12 affective items (i.e., 6 experienced emotion items and
6 emotional regulation items) appeared in random order for each

At the end of 2 weeks, participants attended a debriefing
session, completed a second paper survey regarding general
stress at work, returned their handheld computers, and were
paid $25. At this time, they were also asked the name of their
supervisor so that we could obtain reports on their supervisor’s
leadership behaviors from the organizational survey. Table 1
displays the source and format of the data, along with the data
collection timeline.


Supervisors’ leadership behaviors. The leadership behaviors
of participants’ supervisors were measured using the 20-item Mul-
tifactor Leadership Questionnaire (MLQ – Form 5x; Avolio, Bass,
& Jung, 1995).3 The MLQ is the most frequently used measure of
transformational leadership and has exhibited high reliability and
convergent and discriminant validity (Avolio et al., 1995). Re-
sponses were evaluated on a 5-point scale ranging from 1 (not at
all) to 5 (frequently, if not always). Consistent with other research-
ers whose interest is the general construct of transformational
leadership behaviors, we combined items to form a single score for
each supervisor (e.g., Lim & Ployhart, 2004).

Job satisfaction. We assessed momentary job satisfaction by
asking participants to rate their agreement with the statement “At
this very moment, I am fairly satisfied with my job” on a 5-point
scale (1 = strongly disagree, 5 = strongly agree). Ilies and Judge

2 The PDAs were programmed to signal twice during the evening hours
as well. The optional evening surveys were used to collect pilot data for
another project. Thirty-five individuals responded to evening surveys,
which were not included in this study.

3 The MLQ, Form 5x (Copyright 1995 by Bernard Bass and Bruce
Avolio) is used with permission of Mind Garden, 1690 Woodside Road,
Suite 202, Redwood City, CA 94061.

Table 1
Overview of Data Collection

Variable                           Source                           Format             Timeline

Supervisors’ leadership behaviors  Aggregate of employee responses  Paper survey       T1: One month prior to ESM
                                   to organizational survey
General job satisfaction           Participant                      Paper survey       T2: Immediately prior to ESM
Work status and interactions       Participant                      Handheld computer  T3: During ESM
Momentary job satisfaction         Participant                      Handheld computer  T3: During ESM
Momentary stress                   Participant                      Handheld computer  T3: During ESM
Affective experiences              Participant                      Handheld computer  T3: During ESM
Overall stress                     Participant                      Paper survey       T4: Immediately following ESM

Note. The experience sampling method (ESM) period consisted of a 2-week time frame during which participants responded to multiple surveys
administered daily via a handheld computer. T1–T4 refer to the four distinct episodes of data collection.


(2002; Judge & Ilies, 2004) have found considerable variation in
job satisfaction over the course of the workday, which our mo-
mentary measure was designed to capture. We also obtained a
measure of general job satisfaction, using five Brayfield–Rothe
items (e.g., “Most days I am enthusiastic about my work”; see
Bono & Judge, 2003; Brayfield & Rothe, 1951). Responses for
these five items were on the same 5-point scale used for momen-
tary job satisfaction.

Work status and interactions. Immediately after being sig-
naled by the PDA, participants were asked to report whether they
were currently at work (yes or no). Next, they responded to the
following question: “When the beep went off, were you interacting
with any of the following?” 1 (supervisor), 2 (coworker), 3 (cus-
tomer/client), 4 (family/friends), 5 (no one), 6 (other).

Affective experiences. Although many taxonomies of affect
and emotion exist in the literature, we were constrained by our
method (i.e., surveys of only 1–2 min in length) to select a small
number of affect terms in attempting to cover the full range of
work-relevant affect and emotion. After reviewing the literature
(Ortony & Turner, 1990; Russell & Feldman Barrett, 1999;
Shaver, Schwartz, Kirson, & O’Connor, 1987) and conducting a
pilot study of workers who reported how frequently they expressed
each of 10 emotions at work, we selected three positive (happiness,
enthusiasm, and optimism) and three negative (anxiety, anger, and
irritation) work-relevant emotions. We excluded some basic emo-
tions (e.g., fear and love) and low-activation affect terms (e.g.,
contentment and calm), deeming them of low relevance to our
study. Participants reported the degree to which they felt the six
selected emotions at each signal.

Because Glomb and Tews (2004) illustrated the importance of
distinguishing between felt emotions and the act of emotional
regulation, we also asked participants about faking (i.e., expressing
an emotion they did not feel) the three positive emotions (happiness,
enthusiasm, and optimism) and hiding (i.e., feeling an emotion
they were not expressing) the three negative emotions (anxiety,
irritation, and anger). Responses for these items were
evaluated on a 7-point scale (1 = none at all, 7 = an intense
amount) for feeling, hiding, and faking.

Stress. We assessed momentary stress by asking participants
to respond to the statement “At this very moment, I am experi-
encing stress.” This measure was intended to capture variability in
stress throughout the day. We also obtained an overall, global
measure of stress, using Motowidlo, Packard, and Manning’s
(1986) four-item measure. Responses for both momentary and
overall stress items were on the same 5-point scale used for job
satisfaction.

Data Analyses

Prior to conducting our analyses of interest, we examined several
properties of our data. We excluded data from 3 participants
who had responded to fewer than three signals over the course of
the 2 weeks (1 because of job termination, 1 because of PDA
failure, and 1 for an unknown reason). Therefore, the total possible
number of responses was approximately 2,160 (4 responses per
day × 10 workdays × 54 participants = 2,160; however, because
of scheduling the drop-off and pick-up of PDAs, a few participants
responded to surveys on Days 11 and 12, raising the total possible
to 2,266). Participants responded to 1,983 signals (approximately
88% of the total possible), with an average of 37 responses per
participant (ranging from 11 to 42). We excluded data from
responses when participants were not at work (n = 287 episodes)
and for interactions with family or friends (n = 52 episodes),
resulting in 1,653 work-related responses. Our goal was to examine
emotional regulation in a social context. However, in many
cases participants were alone when the signal sounded (n = 764;
48% of work-related responses).4 Thus, our final data set consisted
of 889 responses (52% of total at-work responses; average of 15
per participant) from employees who were at work and involved in
a work-related interaction at the time of the signal.
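The response counts in this paragraph follow from simple arithmetic, sketched below. All figures are taken from the text; the variable names are illustrative only.

```python
# Planned signals: 4 per day over 10 workdays for the 54 retained participants.
signals_per_day = 4
workdays = 10
participants = 54
planned_total = signals_per_day * workdays * participants

# Reported totals from the text.
adjusted_total = 2266        # includes the few surveys answered on Days 11 and 12
responses = 1983             # signals actually answered
work_related = 1653          # at-work responses, excluding family/friend interactions
alone = 764                  # work-related responses with no interaction partner

response_rate = responses / adjusted_total
mean_responses = responses / participants
final_n = work_related - alone

print(planned_total, round(response_rate * 100), round(mean_responses), final_n)
```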

To form a single leadership score for the supervisor of each
participant, we aggregated the leadership survey responses of all
employees who completed a survey for the target supervisor
(average n = 5 reports per supervisor). Aggregating leadership reports
across followers was deemed justifiable in these data by a significant
ICC-1 value of .29 (p < .01) and an ICC-2 value of .72. The
ICC-2 value can be interpreted as the reliability of the aggregated
measure of leadership. An average rwg = .84 across groups
(James, Demaree, & Wolf, 1984; assuming a slight negative skew
in the data [Bono & Judge, 2003]) further supports aggregation.
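For readers unfamiliar with these aggregation indices, the following is a minimal sketch of how ICC-1 and ICC-2 are commonly computed from a one-way analysis of variance. It assumes the standard mean-square formulas and uses the average group size for unequal groups. The ratings are hypothetical, and the article does not state that the authors computed their values this way.

```python
from statistics import mean


def icc_from_groups(groups):
    """Return (ICC-1, ICC-2) from a one-way ANOVA on rating groups.

    `groups` is a list of lists; each inner list holds the ratings that
    different followers gave to the same supervisor.
    """
    n_groups = len(groups)
    n_total = sum(len(g) for g in groups)
    k = n_total / n_groups                      # average raters per supervisor
    grand_mean = mean(x for g in groups for x in g)

    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    ms_between = ss_between / (n_groups - 1)
    ms_within = ss_within / (n_total - n_groups)

    icc1 = (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)
    icc2 = (ms_between - ms_within) / ms_between
    return icc1, icc2


# Hypothetical ratings for three supervisors (not data from the study).
ratings = [[4, 5, 4], [2, 3, 2], [3, 3, 4]]
icc1, icc2 = icc_from_groups(ratings)
print(round(icc1, 2), round(icc2, 2))
```

ICC-1 indexes how much of a single rating reflects the supervisor rather than the rater, whereas ICC-2 indexes the reliability of the group mean, so it rises as more followers rate each supervisor.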

We also created several composite emotion variables. The three
positive (enthusiasm, happiness, and optimism) and three negative
(anger, irritation, and anxiety) emotions were highly correlated
within individuals for both felt emotions and emotional regulation
(average r = .87, .77, .93, and .82, for positive experienced,
negative experienced, positive faked, and negative hidden, respec-
tively). Therefore, we computed four composite emotion variables:
positive emotions experienced, negative emotions experienced,
positive emotions faked, and negative emotions hidden. Principal-
components analysis provided support for creating these compos-
ites, as we found that a single factor explained most of the variance
(ranging from 70%–82%) in each set of related emotions. We also
found that the two types of emotional regulation were highly
related (i.e., faking positive emotions and hiding negative emotions
tended to co-occur; r = .88, p < .01). Because we were
interested in possible differential effects of the two types of emo-
tional regulation, we included faking positive and hiding negative
in our analyses. However, we also formed a total emotional reg-
ulation score by averaging the two types of emotional regulation.

Our hypotheses address two levels of analysis. Within-subject
analyses must be used to link emotional regulation within an
individual with momentary variation in stress and job satisfaction
(Hypothesis 3). In these analyses, data from each individual are
centered around that individual’s mean score, effectively control-
ling for mean differences between individuals in emotional regu-
lation, job satisfaction, and stress. Cross-level (between- and
within-subjects) analyses are used to test our leadership hypothe-
ses (Hypotheses 1, 2, and 4), which posit associations between
supervisory behavior and employees’ emotional experiences.
Therefore, we analyzed our data using multilevel modeling tech-
niques (i.e., HLM 5; Raudenbush & Bryk, 2002).
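The within-person centering described above can be illustrated with a short sketch. The record layout and the function name `center_within_person` are assumptions made for this example; the authors ran their models in HLM 5, not in Python.

```python
from collections import defaultdict
from statistics import mean


def center_within_person(records, key):
    """Subtract each person's own mean on `key` from that person's scores."""
    scores_by_person = defaultdict(list)
    for record in records:
        scores_by_person[record["person"]].append(record[key])
    person_means = {p: mean(scores) for p, scores in scores_by_person.items()}
    return [record[key] - person_means[record["person"]] for record in records]


# Hypothetical momentary stress reports (not data from the study).
records = [
    {"person": "A", "stress": 2},
    {"person": "A", "stress": 4},
    {"person": "B", "stress": 5},
    {"person": "B", "stress": 5},
]
centered = center_within_person(records, "stress")
print(centered)
```

After centering, a positive score means "more stress than is typical for this person," so stable differences between people no longer contribute to the within-person associations.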

4 A comparison of affective experiences when interacting with others
versus alone revealed no significant differences in experienced positive and
negative emotions or in emotional regulation.



Means, standard deviations, and correlations are presented in
Table 2, revealing that participants generally experienced more
positive emotions than negative emotions at work and tended to
hide negative emotions only slightly more than they faked positive
emotions (mean hide negative = 1.67, and mean fake positive =
1.50). Mean levels of job satisfaction and stress were similar in the
momentary, PDA data (Level 1) and in the global, overall measure
(Level 2). We note a high correlation between negative mood and
emotional regulation (r = .78, .92, and .88, all ps < .01, for faking
positive, hiding negative, and total emotional regulation, respec-
tively). These high correlations are consistent with the notion that
positive emotions are valued (and negative emotions devalued) in
organizations, leading employees to suppress negative emotions.

Leaders and Employee Emotions

Hypotheses 1 and 2 address the role of supervisors in employ-
ees’ emotional experiences. We first examined the link between
employees’ affective experiences and interaction partner (e.g.,
supervisor vs. customer–client). Analysis of variance results (see
Table 3) revealed significant differences in affective experiences
based on interaction partner. Specifically, participants reported the
most positive emotions when interacting with their coworkers
(mean positive emotions experienced = 3.75) and fewest positive
emotions when interacting with their supervisors (mean positive
emotions experienced = 3.05). This was true for each of the
individual positive emotions, which we do not report in Table 3,
and the positive emotion composite, F(3, 886) = 5.74, p < .01.
Thus, Hypothesis 1 was supported for positive emotions; however,
interaction partner was significantly related neither to the experi-
ence of negative emotions nor to employees’ emotional regulation.

Next, we examined the direct effects of supervisors’ transfor-
mational leadership behaviors on employees’ momentary emo-
tional experiences, testing the theory that transformational leaders
influence employee emotions. In this analysis we linked the su-
pervisor’s typical leadership behaviors to employees’ emotions
across all interactions, including those with customers and cowork-
ers. Table 4 presents coefficients linking supervisors’ leadership
behaviors to employees’ experienced emotions and emotional regulation.
As hypothesized (Hypothesis 2), employees who work for
supervisors rated high on transformational leadership reported
experiencing more positive emotions throughout the course of
their workday (coefficient = .54, p < .01). This held true for each of the
individual positive emotions (results not reported in Table 4), as
well as the positive emotions composite. Leadership behaviors
were not linked to participants’ experience of negative emotions or
to reports of emotional regulation.

Leaders and Employee Emotional Regulation

Emotional regulation, job satisfaction, and stress. In addition
to main effects of supervisors’ leadership behaviors on employees’
affective experiences, we also hypothesized a moderating effect
for transformational leadership. We examined the data to deter-
mine whether emotional regulation covaries with stress and job
satisfaction within individuals and over time. Results in Table 5
(upper portion: momentary data) addressed the link between mo-
mentary emotional regulation, job satisfaction, and stress. Results
revealed a significant, negative association between both types of
emotional regulation and job satisfaction (coefficients = –.18 and –.18, p <
.01, for hiding negative and faking positive, respectively) and a
significant positive link between both types of emotional regulation
and stress (coefficients = .40 and .35, p < .01, for hiding negative and
faking positive, respectively). These results suggest that regardless
of how often certain individuals regulate their emotions, individual
episodes of emotional regulation are associated with increased
stress (above the mean for a given individual) and decreased job
satisfaction (below the mean for a given individual). This was true
for both faking positive and hiding negative emotions, supporting
Hypotheses 3a and 3b.

A unique advantage of experience sampling research is that we
can examine the link between emotional regulation at one point in
time (t) and stress or job satisfaction for that individual at the next
random signal (t + 1), allowing us to draw stronger causal inferences
and to determine how long these effects last. Results in
Table 5 (lower portion: lagged momentary data) showed that the
association between emotional regulation and stress extended over
time. Although weaker than the concurrent effects, results indi-
cated a significant positive association between emotional regula-
tion at one time point and stress approximately 2 hr later. No

Table 2
Means, Standard Deviations, and Intercorrelations Between Level 2 Variables and Aggregate of Level 1 Variables

Scale                           M     SD    1       2       3       4       5       6       7       8       9       10      11      12

1. Positive affect              3.78  .52   .82
2. Negative affect              1.66  .61   –.13    .91
3. Transformational leadership  3.24  .55   .24     –.19    .93
4. ESM job satisfaction         3.73  .59   .28*    –.17    .19     --
5. ESM job stress               2.22  .67   .05     .31*    .12     –.19    --
6. Feel positive emotions       3.65  .98   .31*    –.22    .31*    .56**   –.24    .95
7. Feel negative emotions       1.90  .81   .15     .38**   .03     –.38**  .66**   –.26    .90
8. Fake positive emotions       1.50  .67   .09     .24     .07     –.28*   .51**   –.06    .78**   .97
9. Hide negative emotions       1.67  .81   .15     .24     .12     –.37**  .58**   –.20    .92**   .88**   .92
10. Total emotional regulation  1.59  .71   .13     .25     .10     –.34**  .57**   –.14    .88**   .96**   .98**   --
11. General job satisfaction    3.74  .59   .39**   –.24    .29*    .74**   –.14    .43**   –.27**  –.21    –.21    –.22    .85
12. General stress              2.84  .82   –.20    .20     –.25    –.35**  .36**   –.28*   .31*    .26     .28*    .28*    –.38**  .79

Note. N = 54. Coefficient alphas are presented on the main diagonal. ESM = experience sampling method.
* p < .05. ** p < .01.


significant effects for emotional regulation (t) on stress were found
after 4 hr (t + 2), and sample sizes were too small to examine t + 3
data (i.e., average 6-hr time lag). Although our data are correlational
and preclude firm causal conclusions, these results lend
support to our proposed causal ordering of the variables (e.g.,
emotional regulation → stress), as we also tested the reverse
ordering (e.g., stress [t] → emotional regulation [t + 1]) and found
no significant effects. The effects of emotional regulation on job
satisfaction appear to be more fleeting than the effects on stress, as
emotional regulation did not predict job satisfaction after 2 hr.
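The lagged analysis pairs emotional regulation at one signal with the outcome at the next signal. A minimal sketch of how such pairs might be assembled is shown below. It assumes that a lag only counts when both signals come from the same person on the same day and are consecutive; the article does not spell out these rules, and the field names are hypothetical.

```python
def lagged_pairs(records):
    """Pair emotional regulation at signal t with stress at signal t + 1.

    Each record is a dict with keys: person, day, signal (1-4),
    regulation, stress. Only consecutive signals from the same person
    on the same day are paired (an assumption of this sketch).
    """
    ordered = sorted(records, key=lambda r: (r["person"], r["day"], r["signal"]))
    pairs = []
    for earlier, later in zip(ordered, ordered[1:]):
        same_person_day = (earlier["person"], earlier["day"]) == (later["person"], later["day"])
        if same_person_day and later["signal"] == earlier["signal"] + 1:
            pairs.append((earlier["regulation"], later["stress"]))
    return pairs


# Hypothetical responses (not data from the study).
records = [
    {"person": "A", "day": 1, "signal": 1, "regulation": 3, "stress": 2},
    {"person": "A", "day": 1, "signal": 2, "regulation": 1, "stress": 4},
    {"person": "A", "day": 1, "signal": 4, "regulation": 2, "stress": 1},
    {"person": "B", "day": 1, "signal": 1, "regulation": 5, "stress": 3},
]
pairs = lagged_pairs(records)
print(pairs)
```

Missed signals break the chain of consecutive responses, which is one plausible reason a lagged data set is much smaller than the concurrent one.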

Leadership as buffer. We suggested that leadership behaviors
would buffer employees from the negative effects of emotional
regulation. Results in Table 6 support Hypothesis 4a, revealing a
significant moderating effect for transformational leadership (coefficient =
.21, p < .01) on the association between emotional regulation and
job satisfaction. Figure 2 shows that when transformational leadership
behaviors were low (1 standard deviation below the mean),
episodes of emotional regulation were associated with decreased
job satisfaction. However, when transformational leadership be-
haviors were high (1 standard deviation above the mean), there
was little or no association between emotional regulation and job
satisfaction. A different picture emerged when we examined stress
(Hypothesis 4b). Contrary to our expectations, supervisors’ trans-
formational leadership behaviors did not protect employees from
the stress associated with emotional regulation (see Table 6).5
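The buffering pattern for job satisfaction can be illustrated with a simple-slopes calculation, in which the slope of job satisfaction on emotional regulation is evaluated at 1 standard deviation below and above the mean of leadership. The numbers below are hypothetical and chosen only to show the pattern; they are not the study's estimates, and the function name is an assumption of this sketch.

```python
def simple_slope(b_regulation, b_interaction, leadership_centered):
    """Slope of job satisfaction on emotional regulation at a given
    (mean-centered) level of transformational leadership."""
    return b_regulation + b_interaction * leadership_centered


# Hypothetical values for illustration only.
b_regulation = -0.20     # slope at the mean level of leadership
b_interaction = 0.25     # change in that slope per unit of leadership
leadership_sd = 0.80     # standard deviation of the leadership score

slope_low = simple_slope(b_regulation, b_interaction, -leadership_sd)
slope_high = simple_slope(b_regulation, b_interaction, +leadership_sd)
print(round(slope_low, 2), round(slope_high, 2))
```

With a positive interaction term, the negative slope shrinks toward zero as leadership increases, which is the shape of the moderating effect reported for job satisfaction.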


According to Brief and Weiss (2002), “the organizational liter-
ature is populated with many more ideas about the leader’s role in
the production of moods and emotions than it is with relevant data”
(p. 289). In this study we focused on the role of managers in
producing employee emotions and in buffering them from the
effects of emotional regulation. With respect to the former, our
results revealed that employees experience less optimism, happi-
ness, and enthusiasm when they interact with supervisors than
when they interact with customers, clients, and coworkers. We also
found that employees who report to supervisors who engage in
transformational leadership behaviors, compared with those who
do not, experience more optimism, happiness, and enthusiasm
throughout the day, including during their interactions with cus-
tomers and coworkers. With respect to the latter, our results
indicate that when employees regulate their emotions at work, they
experience increased stress and decreased job satisfaction. Further-

5 Several participants in our sample reported to the same supervisor,
creating dependencies in the leadership data that were not addressed in our
hierarchical linear modeling analyses and may have potentially produced
biased tests of significance. Thus, we repeated our moderator analyses
involving leadership (see Table 6) using only 1 participant per leader. This
reduced our sample size (from 54 to 27) but eliminated the dependencies.
All significant effects reported in Table 6 remained significant in this

Table 3
Analysis of Variance for Felt and Regulated Emotions by Interaction Partner


               Interaction                              % total variance explained
               partner           M      SD     F        by interaction partner

Feel positive  Supervisor        3.05   1.44   5.74**   36
               Coworker          3.75   1.25
               Customer-client   3.53   1.58
               Other             3.54   1.47

Feel negative  Supervisor        2.08   1.35   1.01     4
               Coworker          1.83   1.13
               Customer-client   1.90   1.14
               Other             1.84   1.19

Fake positive  Supervisor        1.38   0.91   0.30     6
               Coworker          1.47   0.88
               Customer-client   1.46   0.85
               Other             1.42   0.86

Hide negative  Supervisor        1.50   1.01   0.61     3
               Coworker          1.62   1.04
               Customer-client   1.67   1.05
               Other             1.56   0.99

Note. N = 889. dfs = 3, 886. The frequency of interactions was as follows: coworker, 55%; customer-client,
23%; supervisor, 8%; other (which included groups of individuals), 14%.
** p < .01.

Table 4
Hierarchical Linear Modeling Estimates of the Associations
Between Supervisors’ Leadership Behaviors and Participants’
Affective Experiences

                            Transformational leadership behavior
Affective experiences       Coefficient   SE     t

Feeling positive            .54**         .18    2.94
Feeling negative            .03           .22    0.15
Faking positive             .11           .16    0.73
Hiding negative             .18           .20    0.94
Total emotional regulation  .14           .17    0.85

Note. N = 54 participants; N = 889 observations. Coefficients for feeling
positive, feeling negative, faking positive, hiding negative, and total
emotional regulation were estimated independently. Approximate df = 51.
** p < .01.


more, although the effects of emotional regulation on job satisfac-
tion appear to be short-lived, the effects of emotional regulation on
stress are more long lasting (e.g., 2 hr). Managers who use trans-
formational leadership behaviors buffer employees who regulate
their emotions from decreased job satisfaction, but not from in-
creased stress. Given these results, two important questions re-
main: (a) How do these results contribute to our understanding of
“the leader’s role in the production of moods and emotions” (Brief
& Weiss, 2002, p. 289)?, and (b) How do these results contribute
to our understanding of the role of managers in workplace emo-
tional regulation?

With respect to leadership and emotions, results of this study
demonstrate the powerful role that managers and supervisors have
on employee emotions. Transformational leadership theory sug-
gests emotional links between leaders and followers but offers
little explanation of how such emotional links contribute to lead-
ership effectiveness. Based on theories of primitive emotional
contagion, lab studies (e.g., Barsade, 2002; Sy et al., 2005) have
documented the fleeting effects of a leader’s positive mood on the
mood of workers. Given evidence in this study that managers’
leadership behaviors have an ongoing influence on employee
optimism and enthusiasm, it may be time to look beyond primitive
emotional contagion as the primary process by which leaders
influence follower emotions and organizational outcomes. Our
results suggest that managers’ transformational leadership behaviors
may have broad, deep, and long-lasting effects on individual
employees and the organization as a whole. Beyond their imme-
diate effects on employee mood, the positive emotions elicited by
transformational leaders have the potential to influence the overall
work climate and customer satisfaction. Employees who are opti-
mistic are more likely to invest the time to help others (Lee &
Allen, 2002) and can be expected to persist in work tasks – even
in the face of difficulty (Seligman & Schulman, 1986). In the same
vein, research by Pugh (2001) linked positive emotional displays
by employees to customer evaluations of service quality.

Our study also contributes to theory and existing research on
emotional regulation. What we know from the existing literature is
that individuals who tend to regulate their emotions are also less
happy with their jobs and more stressed, although the link with job
satisfaction varies across studies. Our results show that discrete
episodes of emotional regulation are also associated with de-
creased job satisfaction and increased stress, even for individuals
who do not frequently regulate their emotions. This is an important
finding given that emotional regulation theory is focused on the
negative effects of sustained and frequent emotional regulation.
However, it is also important to note that we did not find any
time-lagged effects of emotional regulation on job satisfaction.
Thus, the effects of discrete episodes of emotional regulation on

Table 5
Hierarchical Linear Modeling Analysis Linking Emotional Regulation to Stress and Job Satisfaction


                                               Job satisfaction             Stress
                                               Coefficient  SE    t         Coefficient  SE    t

Momentary data (a)
  Intercept (coefficient 00)                   3.73**       0.08  46.47     2.21**       0.09  24.89
  Hiding negative emotions (coefficient 10)    –.18**       .04   –4.94     .40**        .07   5.92
  Faking positive emotions (coefficient 10)    –.18**       .05   –3.60     .35**        .06   5.78
  Total emotional regulation (coefficient 10)  –.23**       .05   –4.74     .49**        .07   6.88

Lagged momentary data (b)
  Intercept (coefficient 00)                   3.76**       0.10  39.08     2.25**       0.12  19.33
  Hiding negative emotions (coefficient 10)    –.05         .04   –1.24     .20**        .08   2.63
  Faking positive emotions (coefficient 10)    –.00         .05   –0.29     .19          .10   1.85
  Total emotional regulation (coefficient 10)  –.04         .05   –0.95     .25*         .10   2.41

Note. Coefficients for hiding negative, faking positive, and total emotional regulation were estimated independently. For lagged data, we used emotional
regulation at t and stress and job satisfaction at t + 1, where t represents a time point and t + 1 represents the time of the next random signal (2 hr later
on average).
(a) n = 821; approximate df = 819. (b) n = 284; approximate df = 281.
* p < .05. ** p < .01.

Table 6
Hierarchical Linear Modeling Estimates of the Moderating Effect of Transformational Leadership


                                                Hiding negative emotion    Faking positive emotion    Total emotion regulation
                                                Coefficient  SE    t       Coefficient  SE    t       Coefficient  SE    t

Moderator effects: job satisfaction
  Transformational leadership (coefficient 13)  .12**        .06   2.01    .24**        .07   3.21    .21**        .07   2.87
Moderator effects: stress
  Transformational leadership (coefficient 13)  .01**        .10   0.14    .02**        .11   0.14    .01**        .10   0.09

Note. N = 54 participants; N = 889 observations. Coefficients for hiding negative, faking positive, and total emotional regulation were estimated
independently. Approximate df = 51.
** p < .01.


job satisfaction appear to be very short-lived (less than 2 hr). This
is in contrast to the effects of emotional regulation on stress, which
remained significant 2 hr later. Hochschild’s (1983) seminal re-
search suggested that individuals who must frequently regulate
their emotions on the job will experience depersonalization and
stress. Although our results do not speak to cumulative effects over
time, they do suggest that even for individuals who rarely regulate
their emotions, doing so on a single occasion is associated with
increased stress, which may last for several hours. An important
topic for future research is to examine the long-term, sustained
effects of emotional regulation on employee health.

In organizational research on emotional regulation, it is common
to do as we did and consider both job satisfaction and stress as
important outcomes of emotional regulation; however, our results
suggest that different processes may underlie the effects of emo-
tional regulation on these two outcomes. We found that emotional
regulation was associated with stress 2 hr later, but the effects of
emotional regulation on job satisfaction were more fleeting (i.e.,
we found no time-lagged effects). We also found that managerial
behavior buffered employees from the effects of emotional regu-
lation on job satisfaction but not on stress, suggesting that the
processes linking emotional regulation with job satisfaction and
stress may differ. The link between emotional regulation and job
satisfaction may be cognitive in nature. One way that managers
provide support for employees is by enhancing internalized self-
regulation and drawing attention to outcomes associated with
emotional regulation (e.g., greater service for customers). There-
fore, when employees regulate their emotions, they may con-
sciously decide that it is worth doing because it helps them to
perform better in a job they care about and identify with. Thus,
during emotional regulation they may feel momentarily unhappy
with their job because it demands inauthentic emotional expres-
sions, but these effects are fleeting (and do not occur for employ-
ees who report to transformational managers) because employees
understand how emotional regulation contributes to their job.

In contrast, the effects of emotional regulation on stress may be,
at least in part, physiological. Empirical evidence shows that
emotional regulation leads to cardiac arousal (see Gross & Lev-
enson, 1993), which is linked to perceptions of stress (see Blas-
covich & Tomaka, 1996). It may be that the association between
emotional regulation and stress is stronger, lasts longer, and is less
amenable to reduction through managerial support because it has a
physiological basis. Clearly, we can only speculate about the
processes by which emotional regulation affects stress and job
satisfaction, but this is an important area for future research. If our
theoretical propositions are correct, managers may be able to help
employees more by reducing or eliminating the need for emotional
regulation (perhaps by encouraging and supporting authenticity)
rather than attempting to buffer them from the negative effects of
emotional regulation after it occurs.

Our study highlights the need for follow-up research. First, it is
important to establish causality by manipulating both emotional
regulation and leadership behavior in the lab. Some work has been
done in this regard by Bono and Vey (2007), who linked emotional
regulation to increased heart rate and stress. Controlled laboratory
settings would also be useful in assessing the longevity of the
effects of emotional regulation episodes and sustained emotional
regulation. Second, it is important to determine the direct cause of
emotional regulation. Although our theoretical model suggests that
emotional regulation causes stress, our methods did not allow us to
establish causality. Indeed, it is possible that certain events in the
workplace may cause both emotional regulation and stress. For
example, in cases in which workers are asked to be pleasant and
enthusiastic with customers, a rude customer may be the source of
emotional regulation, which then leads to stress. Alternatively, a
rude customer may be the source of both emotional regulation and
stress, creating the illusion of a causal association between them.
It is also plausible that an unpleasant interaction with a patient or
customer would have both direct effects on stress and indirect
effects, through emotional regulation. One possible next step

Figure 2. The moderating effect of transformational leadership on the relationship between hiding negative
emotions and faking positive emotions and job satisfaction. Job satisfaction scores are based on responses to the
question “At this very moment, I am fairly satisfied with my job,” rated on a 5-point scale (1 = strongly
disagree, 5 = strongly agree).


would be to use experience sampling methods to catalogue events
that lead to emotional regulation in the natural work environment
and determine how these events affect both emotions and stress.

Our study is unique in investigating leadership as an important
factor in employees’ experienced emotions and in the emotional
regulation process; it has some methodological strengths and
limitations worth noting. Concerning the former, we used a
longitudinal design in conjunction with an experience sampling
methodology. In doing so, we were able to collect daily emotional
regulation data across multiple locations and jobs as emotional
regulation occurred. Thus, we avoided retrospective and recall
biases that occur in reporting emotional experiences. Our
within-person design also allowed us to confidently link specific events of
emotional regulation to stress and job satisfaction, free from the
biasing effects of individual differences. Furthermore, our
measures of leadership behavior and employee emotional experiences
were relatively independent of each other. Our study also
addressed the call for more research to be conducted in health care
settings (Ehler, Major, & Fletcher, 2003).

Despite these strengths, our study also has some limitations.
First, although our participants held a variety of jobs, we did not
have the sample size to examine subgroups of employees in
service versus nonservice support jobs. Second, our sample was
drawn from a single organization. Thus, it is possible that
characteristics of the work climate, shared by all participants, influenced
the experience of both positive and negative emotions and overall
levels of emotional regulation. If the entire work climate was
particularly supportive or abusive, our results may under- or
overestimate the moderating effects of leadership behaviors. Third, our
sample—though representative of the larger organization and the
health care industry—was largely female, and we cannot be certain
that the associations we found would hold for men. Fourth, it is
possible that the PDA signals may have annoyed participants,
causing increased irritation and stress. If this is true, the effects
were small because mean stress from PDA data was not higher
than global reports of stress.

Considered as a whole, our results highlight the importance of
supervisors in employees’ emotional experiences. Our results are
provocative in that they suggest the possibility of different
theoretical explanations (and therefore the need for different
managerial interventions) for the effects of emotional regulation on stress
and job satisfaction. Our study also extends existing theory on
leader–follower emotional links by looking beyond the primitive
emotional contagion process. We hope this study will motivate
future longitudinal research in naturalistic settings along with
experimental research in the laboratory, including both
psychological and physiological measures of employee moods, emotional
regulation, and stress over time.


References

Abraham, R. (1998). Emotional dissonance in organizations: Antecedents,
consequences, and moderators. Genetic, Social, and General Psychology
Monographs, 124, 229–246.

Ashforth, B. E., & Humphrey, R. H. (1993). Emotional labor in service
roles: The influence of identity. Academy of Management Review, 18,
88–115.

Avolio, B. J., Bass, B. M., & Jung, D. I. (1995). Multifactor Leadership
Questionnaire technical report. Redwood City, CA: Mind Garden.

Barsade, S. G. (2002). The ripple effects: Emotional contagion and its
influence on group behavior. Administrative Science Quarterly, 47,
644–675.

Bass, B. M. (1985). Leadership and performance beyond expectations.
New York: Free Press.

Blascovich, J., & Tomaka, J. (1996). The biopsychosocial model of arousal
regulation. Advances in Experimental Social Psychology, 28, 1–51.

Bono, J. E., & Ilies, R. (2006). Charisma, positive emotions, and mood
contagion. Leadership Quarterly, 17, 317–334.

Bono, J. E., & Judge, T. A. (2003). Self concordance at work: Toward
understanding the motivational effects of transformational leaders.
Academy of Management Journal, 46, 554–571.

Bono, J. E., & Vey, M. A. (2007). Personality and emotional performance:
Extraversion, neuroticism, and self-monitoring. Journal of Occupational
Health Psychology, 12, 177–192.

Brayfield, A. H., & Rothe, H. F. (1951). An index of job satisfaction.
Journal of Applied Psychology, 35, 307–311.

Brief, A. P., & Weiss, H. M. (2002). Organizational behavior: Affect in the
workplace. Annual Review of Psychology, 53, 279–307.

Conger, J. A., & Kanungo, R. N. (1998). Charismatic leadership in
organizations. Thousand Oaks, CA: Sage.

Danna, K., & Griffin, R. W. (1999). Health and well-being in the
workplace: A review and synthesis of the literature. Journal of Management,
25, 357–384.

Diefendorff, J. M., & Richard, E. M. (2003). Antecedents and
consequences of emotional display rule perceptions. Journal of Applied
Psychology, 88, 284–294.

Diener, E., & Emmons, R. (1984). The independence of positive and
negative affect. Journal of Personality and Social Psychology, 47,

Ehler, M. L., Major, D. A., & Fletcher, T. D. (2003). Applying I-O to
medicine: Making the case that it can be done and that it should be done.
The Industrial–Organizational Psychologist, 41, 50–54.

Erez, A. I., & Isen, A. M. (2002). The influence of positive affect on the
components of expectancy motivation. Journal of Applied Psychology,
87, 1055–1067.

Fitness, J. (2000). Anger in the workplace: An emotion script approach to
anger episodes between workers and their superiors, coworkers, and
subordinates. Journal of Organizational Behavior, 21, 147–162.

George, J. M., & Zhou, J. (2001). When Openness to Experience and
Conscientiousness are related to creative behavior: An interactional
approach. Journal of Applied Psychology, 86, 513–524.

George, J. M., & Zhou, J. (2002). Understanding when bad moods foster
creativity and good ones don’t: The role of context and clarity of
feelings. Journal of Applied Psychology, 87, 687–697.

Glaso, L., & Einarsen, S. (2006). Experienced affects in leader–subordinate
relationships. Scandinavian Journal of Management, 22, 49–73.

Glomb, T. M., & Tews, M. J. (2004). Emotional labor: A conceptualization
and scale development. Journal of Vocational Behavior, 64, 1–23.

Gross, J. J. (1998). Antecedent- and response-focused emotion regulation:
Divergent consequences for experience, expression, and physiology.
Journal of Personality and Social Psychology, 74, 224–237.

Gross, J. J., & Levenson, R. W. (1993). Emotional suppression:
Physiology, self-report, and expressive behavior. Journal of Personality and
Social Psychology, 64, 970–986.

Gross, J. J., & Levenson, R. W. (1997). Hiding feelings: The acute effects
of inhibiting negative and positive emotion. Journal of Abnormal
Psychology, 106, 95–103.

Gross, J. J., Richards, J. M., & John, O. P. (2006). Emotion regulation in
everyday life. In D. K. Snyder, J. A. Simpson, & J. N. Hughes (Eds.),
Emotion regulation in couples and families: Pathways to dysfunction
and health (pp. 13–39). Washington, DC: American Psychological As-

Hochschild, A. R. (1979). Emotion work, feeling rules, and social
structure. American Journal of Sociology, 85, 551–575.

Hochschild, A. R. (1983). The managed heart. Berkeley: University of
California Press.

House, R. J. (1977). A 1976 theory of charismatic leadership. In J. G. Hunt
& L. L. Larsen (Eds.), Leadership: The cutting edge. Carbondale:
Southern Illinois University Press.

Ilies, R., & Judge, T. A. (2002). Understanding the dynamic relationships
among personality, mood, and job satisfaction: A field experience
sampling study. Organizational Behavior & Human Decision Processes, 89,
1119–1139.

James, L. R., Demaree, R. G., & Wolf, G. (1984). Estimating within-group
reliability with and without response bias. Journal of Applied
Psychology, 69, 85–98.

John, O. P., & Gross, J. J. (2004). Healthy and unhealthy emotional
regulation: Personality processes, individual differences and life span
development. Journal of Personality, 72, 1301–1333.

Judge, T. A., & Ilies, R. (2004). Affect and job satisfaction: A study of their
relationship at work and at home. Journal of Applied Psychology, 89,
661–673.

Law, K. S., Wong, C., & Song, L. (2004). The construct and criterion
validity of emotional intelligence and its potential utility for
management studies. Journal of Applied Psychology, 89, 483–496.

Lee, K., & Allen, N. J. (2002). Organizational citizenship behavior and
workplace deviance: The role of affect and cognitions. Journal of
Applied Psychology, 87, 131–142.

Lim, B. C., & Ployhart, R. E. (2004). Transformational leadership:
Relations to the five-factor model and team performance in typical and
maximum contexts. Journal of Applied Psychology, 89, 610–621.

Little, B. R. (2000). Free traits and personal contexts: Expanding a social
ecological model of well-being. In W. B. Walsh & K. H. Craik (Eds.),
Person–environment psychology: New directions and perspectives (2nd
ed., pp. 87–116). Mahwah, NJ: Erlbaum.

McColl-Kennedy, J. R., & Anderson, R. D. (2002). Impact of leadership
style and emotions on subordinate performance. Leadership Quarterly,
13, 545–559.

Miner, A., Glomb, T. M., & Hulin, C. (2005). Experience sampling mood
and its correlates at work. Journal of Occupational and Organizational
Psychology, 78, 171–193.

Motowidlo, S. J., Packard, J. S., & Manning, M. R. (1986). Occupational
stress: Its causes and consequences for job performance. Journal of
Applied Psychology, 71, 618–629.

Ortony, A., & Turner, T. (1990). What’s basic about basic emotions?
Psychological Review, 97, 313–331.

Pugh, S. D. (2001). Service with a smile: Emotional contagion in the
service encounter. Academy of Management Journal, 44, 1018–1027.

Pugliesi, K. (1999). The consequences of emotional labor: Effects on work
stress, job satisfaction, and well-being. Motivation and Emotion, 23,

Raudenbush, S. W., & Bryk, A. S. (2002). Hierarchical linear models:
Applications and data analysis methods. London: Sage.

Russell, J. A., & Feldman Barrett, L. (1999). Core affect, prototypical
emotional episodes, and other things called emotion: Dissecting the
elephant. Journal of Personality and Social Psychology, 76, 805–819.

Ryan, R. M., & Deci, E. L. (2000). Self-determination theory and the
facilitation of intrinsic motivation, social development, and well-being.
American Psychologist, 55, 68–78.

Schaubroeck, J., & Jones, J. R. (2000). Antecedents of workplace
emotional labor dimensions and moderators of their effects on physical
symptoms. Journal of Organizational Behavior, 21, 163–183.

Seligman, M. E., & Schulman, P. (1986). Explanatory style as a predictor
of productivity and quitting among life insurance sales agents. Journal
of Personality and Social Psychology, 50, 832–838.

Shamir, B., House, R. J., & Arthur, M. B. (1993). The motivational effects
of charismatic leadership: A self-concept based theory. Organization
Science, 4, 577–594.

Shaver, P., Schwartz, J., Kirson, D., & O’Connor, C. (1987). Emotion
knowledge: Further exploration of a prototype approach. Journal of
Personality and Social Psychology, 52, 1061–1086.

Sheldon, K. M., & Elliott, A. J. (1999). Goal striving, need satisfaction and
longitudinal well-being: The self-concordance model. Journal of
Personality and Social Psychology, 76, 482–497.

Sy, T., Côté, S., & Saavedra, R. (2005). The contagious leader: Impact of
the leader’s mood on the mood of group members, group affective tone,
and group processes. Journal of Applied Psychology, 90, 205–305.

Tepper, B. J. (2000). Consequences of abusive supervision. Academy of
Management Journal, 43, 178–190.

Watson, D., & Clark, L. A. (1997). Measurement and mismeasurement of
mood: Recurrent and emergent issues. Journal of Personality
Assessment, 68, 267–296.

Weiss, H. M., & Cropanzano, R. (1996). Affective events theory: A
theoretical discussion of the structure, causes and consequences of
affective experiences at work. In B. M. Staw & L. L. Cummings (Eds.),
Research in organizational behavior: An annual series of analytical
essays and critical reviews (pp. 1–74). Greenwich, CT: JAI Press.

Zapf, D. (2002). Emotion work and psychological well-being: A review of
the literature and some conceptual considerations. Human Resource
Management Review, 12, 237–268.

Received February 9, 2005
Revision received December 15, 2006

Accepted December 18, 2006


Health Services and Outcomes Research

Comparison of “Risk-Adjusted” Hospital Outcomes

David M. Shahian, MD; Sharon-Lise T. Normand, PhD

Background: A frequent challenge in outcomes research is the comparison of rates from different populations. One common example with substantial health policy implications involves the determination and comparison of hospital outcomes. The concept of “risk-adjusted” outcomes is frequently misunderstood, particularly when it is used to justify the direct comparison of performance at 2 specific institutions.

Methods and Results: Data from 14 Massachusetts hospitals were analyzed for 4393 adults undergoing isolated coronary artery bypass graft surgery in 2003. Mortality estimates were adjusted using clinical data prospectively collected by hospital personnel and submitted to a data coordinating center designated by the state. The primary outcome was hospital-specific, risk-standardized, 30-day all-cause mortality after surgery. Propensity scores were used to assess the comparability of case mix (covariate balance) for each Massachusetts hospital relative to the pool of patients undergoing coronary artery bypass grafting surgery at the remaining hospitals and for selected pairwise comparisons. Using hierarchical logistic regression, we indirectly standardized the mortality rate of each hospital using its expected rate. Predictive cross-validation was used to avoid underidentification of true outlying hospitals. Overall, there was sufficient overlap between the case mix of each hospital and that of all other Massachusetts hospitals to justify comparison of individual hospital performance with that of the remaining hospitals. As expected, some pairwise hospital comparisons indicated lack of comparability. This finding illustrates the fallacy of assuming that risk adjustment per se is sufficient to permit direct side-by-side comparison of healthcare providers. In some instances, such analyses may be facilitated by the use of propensity scores to improve covariate balance between institutions and to justify such comparisons.

Conclusions: Risk-adjusted outcomes, commonly the focus of public report cards, have a specific interpretation. Using indirect standardization, these outcomes reflect a provider’s performance for its specific case mix relative to the expected performance of an average provider for that same case mix. Unless study design or post hoc adjustments have resulted in reasonable overlap of case-mix distributions, such risk-adjusted outcomes should not be used to directly compare one institution with another. (Circulation. 2008;117:1955-1963.)

Key Words: health care quality assessment ■ outcomes research ■ risk ■ statistics

Outcomes research “seeks to understand the end results of particular health care practices and interventions.”1 This may involve investigation of a new drug or procedure compared with standard therapy through the use of either a randomized trial or an observational study. Because of the current health policy emphasis on measuring and improving provider performance,2,3 interest has also been increasing in another type of outcomes research referred to as provider profiling.4,5 This research focuses on the collection and analysis of outcomes data to evaluate the performance of a physician or a hospital.


Provider profiling has a number of features that distinguish it from other types of outcomes research. First, unlike trials of new medications or treatment regimens, randomization of patients to hospitals or physicians would often be both impractical and unethical. Thus, profiling studies are almost always observational in nature, relying on data from usual practice settings. In further contrast to drug trials that involve direct comparisons of outcomes for only a few treatments, profiling studies typically assess outcomes for many providers, usually with regard to some population reference standard. Finally, when profiling is based on outcomes measures such as mortality or morbidity, risk adjustment is necessary to account for preexisting conditions that may confound their assessment.

Despite their increasingly widespread use, considerable confusion exists among consumers, the media, payers, and providers as to the correct meaning and interpretation of risk-adjusted outcomes. For example, many incorrectly interpret such outcomes as having “leveled the playing field” to permit direct comparison of one provider with another. Direct comparability may sometimes be justified in an observational study, but this would be fortuitous and is not an inherent characteristic of the study design.

Downloaded from
by guest on February 20, 2017

Received November 9, 2007; accepted February 13, 2008.

From the Center for Quality and Safety, Department of Surgery, and Institute for Health Policy, Massachusetts General Hospital, and Harvard Medical School (D.M.S.), and Department of Health Care Policy, Harvard Medical School, and the Department of Biostatistics, Harvard School of Public Health (S.T.N.), Boston, Mass.

Guest Editor for this article was Harlan M. Krumholz, MD, SM.

Correspondence to Sharon-Lise T. Normand, Department of Health Care Policy, Harvard Medical School, 180 Longwood Ave, Boston, MA 02115. E-mail [email protected]

© 2008 American Heart Association, Inc.

Circulation is available at

DOI: 10.1161/CIRCULATIONAHA.107.747873

Correct interpretation of the concept of risk-adjusted outcomes is neither a trivial nor a strictly academic concern. Such outcomes are used to designate centers of excellence, to determine reimbursement levels in pay for performance programs, to rank institutions, and to classify providers as “outliers.” These determinations may have profound effects on patient access, hospital reputation, referrals, and financial survival.

The goal of this article is to systematically review the fundamental concepts from which the deceptively simple term “risk-adjusted outcome” is derived. We develop the concept of risk-adjusted outcomes in the context of causal inference theory and illustrate the derivation of indirectly standardized mortality ratios, often referred to as O/E (observed/expected) ratios. Key methodological concepts (eg, outlier determination and direct comparison of hospitals) are illustrated through the example of coronary artery bypass grafting surgery (CABG) mortality profiling, in which the difference in outcomes of a hospital compared with the reference standard is generally regarded as a reflection of quality of care.5


Background

It is useful to consider risk adjustment and standardization as specific applications of causal inference theory, a broad discipline with historical roots in philosophy, mathematical logic, and statistics.6–19 This is the foundation for understanding causal effects in health care,16,20–24 which can be thought of as the difference between the outcome for a patient when exposed to one treatment (or provider) and the outcome when exposed to another.

A fundamental precept of causality is that only one of a series of potential outcomes can be experienced at any one time.7,17,20,23,24 In CABG hospital profiling, a patient can undergo CABG at only one hospital on a given day. Therefore, some method must be used to estimate what would hypothetically have occurred to that patient had he or she undergone surgery at a different hospital. The observed result is referred to as the actual outcome, and the unobservable estimated outcome is the counterfactual.7,17,20,23,24 Estimation of this counterfactual outcome, the hypothetical result if treated under a different set of circumstances, is the primary motivator for risk model development. Several approaches have been developed to estimate these potential outcomes for individual patients and subsequently to assess the overall performance of a hospital.

Estimation of Counterfactuals for Risk Adjustment and Standardization

The simplest estimator of a counterfactual would be the average result of treating a similar condition (eg, a CABG procedure) in the overall population or at another specific institution. However, this estimator is likely to be both inaccurate and misleading. Patients are nonrandomly allocated among institutions, and use of crude mortality rates from other hospitals as the counterfactual outcomes would ignore systematic differences among patients such as acuity status. At the other end of the spectrum, the counterfactual outcomes could be determined through randomization,15,18,19,25 the most internally valid design. Both measured and unmeasured confounders would be balanced, so the mortality experience of patients undergoing CABG at one hospital could serve as the counterfactual outcome for patients treated at another hospital. However, it is implausible to think that most patients would consent to randomization for anything but truly experimental care; for this reason, almost all profiling studies are conducted with observational data. Matching and stratification are other methods sometimes used to derive counterfactuals, but they quickly become impractical when more than a few predictor variables are considered, the typical case in mortality profiling.

Most profiling studies have relied on regression modeling to derive counterfactual outcomes, and it is the method used here. Risk adjustment, the term commonly used for this approach, refers to the results of statistical regression models that relate the outcome for a specific patient to his or her observed characteristics.4,26–29 Then, because the main focus of profiling is to determine how the overall experience of a particular hospital compares to what would be “expected,” the next step is to standardize the results of an institution to the reference population.

Indirect standardization is used for almost all profiling and public report cards. With this method, the expected rate represents what the mortality rate would have been at a hospital given its actual distribution of patients but replacing its observed mortality rates with rates estimated from the entire group of providers. The indirectly standardized mortality ratio, often referred to as the ratio of observed to expected outcomes (O/E ratio), compares the outcomes for the specific distribution of patients at a hospital with their expected results had they been treated by an average provider in the reference population.

Indirect standardization is accomplished by first summing the individual risk probabilities for each patient within a given hospital using the coefficients estimated from the regression model and the patient’s specific distribution of confounders. This yields the expected total number of deaths for that hospital. This counterfactual hospital mortality often is used as the denominator of the ratio of observed to expected mortality (O/E ratio), a form of causal estimand. This O/E ratio is favorable if <1 and unfavorable if >1. As a final step, the O/E ratio may be multiplied by the unadjusted population mortality rate for the procedure to obtain what is often called the risk-adjusted mortality rate but which is more correctly designated the risk-standardized mortality rate (RSMR) or standardized mortality incidence rate (SMIR).30–34
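The arithmetic of indirect standardization can be illustrated with a minimal sketch. The function name, the patient risks, the observed death count, and the population rate below are all hypothetical values chosen for illustration; they are not taken from the Massachusetts data, and the predicted risks are assumed to come from an already fitted regression model.

```python
def indirect_standardization(predicted_risks, observed_deaths, population_rate):
    """Return (expected deaths, O/E ratio, risk-standardized mortality rate)."""
    # Expected deaths: sum of each patient's model-predicted probability of death.
    expected_deaths = sum(predicted_risks)
    # O/E ratio: observed deaths divided by expected deaths.
    oe_ratio = observed_deaths / expected_deaths
    # RSMR: O/E ratio scaled by the unadjusted population mortality rate.
    rsmr = oe_ratio * population_rate
    return expected_deaths, oe_ratio, rsmr


# Hypothetical hospital: 100 patients in three risk strata (illustrative values only).
hypothetical_risks = [0.01] * 50 + [0.03] * 30 + [0.10] * 20
expected, oe, rsmr = indirect_standardization(
    hypothetical_risks, observed_deaths=2, population_rate=0.022
)
print(f"expected deaths = {expected:.2f}, O/E = {oe:.2f}, RSMR = {rsmr:.4f}")
```

In this hypothetical case the expected count is about 3.4 deaths, so 2 observed deaths give an O/E ratio below 1, which would be read as favorable for that hospital's own case mix.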

Outlier Determination and the Direct Comparison of Hospitals

The main goal of outcomes profiling is to identify differences in hospital quality. Because the risk-standardized rates for each hospital are derived from the reference population, it is most appropriate to determine whether these rates are statistically different from the population average. If so, the hospital is regarded as a statistical outlier. Most commonly, this is achieved by determining whether the 95% interval for a hospital’s risk-standardized mortality estimate includes the overall state average mortality (or alternatively, whether the interval around its O/E ratio includes 1). If the interval excludes the reference value, the hospital typically is classified as an outlier. An important but overlooked aspect of outlier determination is the effect on expected outcomes when true outlying programs are included in the development of the statistical model. This problem and a potential solution (cross-validated P values) are described further in the Illustration.
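The interval-based rule described above can be sketched as follows. This is only an illustration of the decision rule: the function name and the interval values are hypothetical, and the sketch assumes the 95% interval has already been obtained by whatever method the analyst uses.

```python
def classify_outlier(lower, upper, reference):
    """Classify a hospital by whether its interval excludes the reference value."""
    if upper < reference:
        # Entire interval lies below the reference: better than average.
        return "low-mortality outlier"
    if lower > reference:
        # Entire interval lies above the reference: worse than average.
        return "high-mortality outlier"
    # Interval includes the reference value: not statistically different.
    return "not an outlier"


# Hypothetical intervals for risk-standardized mortality (%) against a 2.2% average.
print(classify_outlier(0.9, 2.4, reference=2.2))
print(classify_outlier(2.9, 5.1, reference=2.2))
# The same rule applied to a hypothetical O/E interval, with 1 as the reference.
print(classify_outlier(0.35, 0.92, reference=1.0))
```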

Risk Factor Distribution and Direct Comparability

In addition to comparing individual hospitals with the reference population to determine outlier status, some consumers also seek to directly compare individual hospitals with one another. A problem with direct comparisons that has been widely recognized by statisticians, and that was the motivation for the development of balancing methods such as propensity scores,14–16,18,19,35–41 is that of covariate imbalance. Absent randomization, the patient cohorts from 2 hospitals may be unbalanced with regard to the frequency of confounders. The implications of such imbalance have received little attention in the context of risk-adjusted outcomes profiling, which in turn has led to both misunderstanding and misuse.

In general, only the results for those patients with comparable risk profiles (eg, that overlap the risk distributions of the 2 providers) should be directly compared. Consider the extreme but not uncommon example of a state or region with many small community hospitals and 1 or 2 tertiary/quaternary hospitals. As a general principle, direct comparison of a community to a tertiary hospital would be appropriate only for the relatively small proportion of patients who overlap between the 2 hospitals. Although the results for the overlap group can be used to estimate expected outcomes for patients not in common between the 2 institutions, this form of extrapolation depends heavily on assumptions that are typically unverifiable. For example, the indirectly risk-standardized results at a community hospital apply to its specific type of patients, who might be relatively low risk compared with a tertiary center. It cannot be assumed that a favorable risk-standardized mortality at the community hospital, based on its lower risk case mix, could necessarily be achieved if it were confronted with the higher-risk case mix of the tertiary center, including some types of patients that it rarely, if ever, encounters.

Propensity scores are a useful method to construct treatment and control groups that may differ in number of subjects but are similar to randomized studies in their balanced distribution of all measured confounders.14–16,18,19,35–41 The propensity score is the likelihood of receiving treatment of one type compared with another (or in the case of profiling, exposure to one or another specific provider) on the basis of a patient’s set of observed characteristics. It provides a convenient scalar (1-number) summary of the information contained in all the patient’s measured covariates. The propensity score may then be used for matching, stratification, blocking, or weighting in regression modeling.

The problem of covariate imbalance has received little attention in provider profiling studies.42–45 If the propensity score provides a convenient summary estimate of individual patient risk, then each provider will have a specific distribution of propensity scores that characterizes its “case mix.” For 2 providers to be comparable, the area of overlap in their respective propensity score distributions should be identified. As shown in Figure 1A, 2 hypothetical hospitals (hospitals 1 and 2) might by chance (or as a result of randomization) have substantial overlap in their propensity score distributions. The area of shaded overlap in Figure 1A indicates that a majority of patients treated at hospital 2 have a similar propensity to have been treated at hospital 1. For almost every patient who underwent CABG at hospital 1, we can find a “similar” patient from among those having CABG at hospital 2.

Figure 1B depicts a different set of 2 hospitals with significant imbalance in their average patient risk as measured by their propensity score distributions. Only a small percentage of patients at the 2 institutions have comparable risk profiles. It is only the group of patients who overlap from which relative performance inferences should be drawn.
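One simple way to make the idea of overlap concrete is to compute the range of propensity scores shared by 2 hospitals and the fraction of each hospital's patients that falls inside it. The sketch below is an assumption for illustration, not the procedure used in this article (which examines the distributions graphically); the function names and the logit propensity scores are hypothetical.

```python
def common_support(scores_a, scores_b):
    """Return the (low, high) range covered by both score lists, or None."""
    low = max(min(scores_a), min(scores_b))
    high = min(max(scores_a), max(scores_b))
    # If the ranges do not intersect, there is no region of overlap.
    return (low, high) if low <= high else None


def fraction_within(scores, support):
    """Fraction of patients whose score lies inside the shared range."""
    if support is None:
        return 0.0
    low, high = support
    return sum(low <= s <= high for s in scores) / len(scores)


# Hypothetical logit propensity scores for surgery at hospital 1 (illustrative only).
hospital_1 = [-0.5, 0.0, 0.4, 0.9, 1.5]
hospital_2 = [-2.0, -1.2, -0.6, -0.1, 0.3]

support = common_support(hospital_1, hospital_2)
print("shared range:", support)
print("hospital 1 patients in range:", fraction_within(hospital_1, support))
print("hospital 2 patients in range:", fraction_within(hospital_2, support))
```

With these made-up scores only a minority of each hospital's patients lies in the shared range, which corresponds to the situation in which direct comparison should be restricted to the overlapping patients.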

Study Population

We examined data from all adults (>18 years of age) undergoing isolated CABG at all acute-care, nonfederal hospitals in Massachusetts between January 1, 2003, and December 31, 2003. Data collection is mandated by the Massachusetts Department of Public Health.

Data Sources

We used clinical data submitted to a data coordinating center (Mass-DAC) located in the Harvard Medical School Department of Health Care Policy. Data are collected by trained hospital personnel using the Society of Thoracic Surgeons National Adult Cardiac Database instrument.46 Supplemental patient and surgeon identifying information also is collected using additional data forms developed by Mass-DAC. The data are sent electronically to Mass-DAC, where they are cleaned, audited, and verified using internal and external procedures.

End Points

The primary end point is hospital-specific, risk-standardized, all-cause, 30-day mortality rate. Mortality data are obtained 2 ways. First, hospital personnel are responsible for collecting 30-day mortality for all patients undergoing cardiac surgery. Second, patient identifying information is linked to this registry from the Massachusetts Registry of Vital Records and Statistics to verify date of death. The registry includes mortality information for Massachusetts residents and all records of deaths that occur within the Commonwealth regardless of the state of residence. Because Mass-DAC has access to Social Security numbers, the Social Security Index Web site47 also is searched to identify deaths, including those reported to the Social Security Administration by funeral homes or by relatives.

Statistical Analyses

Distributions of clinical and demographic variables are computed and stratified by hospital to identify unusual or extreme values. Because of data collection protocols and auditing procedures, no data are missing in the clinical variables or outcomes for the mortality models.

Risk Adjustment

We first estimated a propensity score model in which the dependent variable was multinomial, assuming 13 distinct values corresponding to the 13 hospitals (1 hospital is the reference group). The specific clinical variables included in the model were selected from a literature review of existing models and expert opinion from a panel of senior cardiac surgeons. A multinomial logistic regression model was estimated, and predictions for each patient in the sample were subsequently obtained. Thus, each patient had 14 estimated probabilities, each reflecting the likelihood that the patient would undergo CABG at 1 specific hospital rather than 1 of the remaining 13 hospitals. For this reason, the sum of the 14 estimated probabilities for each patient was 1.

To compare the performance of each hospital with that of its peers, it is necessary to assess whether the population of patients undergoing surgery at a particular hospital is comparable to that of all other Massachusetts hospitals on the basis of their observed characteristics. To accomplish this, we examined the overlap between the distribution of the propensity scores for patients undergoing surgery at each hospital and the distribution of the propensity scores for patients not undergoing surgery at that hospital. Ideally, the estimated propensity scores of the latter group would cover the entire range of estimated propensity scores at the particular hospital being studied. This finding would provide support for the assumption that the 2 groups of patients (those treated at a particular hospital versus all others) were similar in terms of observable demographic characteristics and other comorbidities.
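The reason the 14 probabilities sum to 1 can be seen from the form of multinomial logistic predictions. In the sketch below, the function name and the linear predictor values are hypothetical; in practice each linear predictor would come from the fitted coefficients applied to a patient's covariates, with the reference hospital fixed at 0.

```python
import math


def multinomial_probabilities(linear_predictors):
    """Convert linear predictors for the non-reference hospitals into probabilities.

    The reference hospital has a linear predictor of 0 and is returned first.
    """
    scores = [0.0] + list(linear_predictors)
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(scores)
    exponentials = [math.exp(s - top) for s in scores]
    total = sum(exponentials)
    return [e / total for e in exponentials]


# Hypothetical linear predictors for one patient at the 13 non-reference hospitals.
patient_predictors = [0.2, -1.0, 0.5, 0.0, -0.3, 1.1, -2.0, 0.4, 0.0, -0.7, 0.9, -1.5, 0.1]
probabilities = multinomial_probabilities(patient_predictors)
print(len(probabilities), "probabilities, summing to", round(sum(probabilities), 10))
```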

We next estimated a regression model for the mortality outcomes. The dependent variable was binary, assuming a value of 1 if the patient died of any cause within 30 days of surgery and 0 otherwise. We included the same set of confounders used in the propensity score model. We included a random hospital-specific intercept that represented the underlying quality of the hospital and accounted for within-hospital correlation of patients. We calculated odds ratios (ORs) conditional on the hospital random effects that apply to comparisons of patients belonging to the same hospital (see Larsen and Merlo48 for a discussion of differences between conditional and unconditional ORs).

The size of between-hospital variation was summarized by the median OR (MOR).49 The MOR considers 2 CABG patients with the same set of observed risk factors but selected randomly from 2 different hospitals. The MOR is the median value of the OR between the patient with the higher probability of dying and the patient with the lower probability of dying. A MOR value >1 supports the hypothesis that between-hospital variation in mortality exists after adjustment for patient characteristics. If the between-hospital variation were 0, this would imply that differences in hospital outcomes, after adjustment for patient characteristics, are due only to random sampling variability. Although between-hospital variation will always be >0 in practice, some have suggested that small values can be effectively ignored by essentially setting the between-hospital variation component to 0. We see no reason to assume that between-hospital variation is 0 given that this value can be estimated.

[Figure 1, panels A and B: density curves for hospital 1 and hospital 2; x axis, Logit(P(Surgery at Hospital 1)).]

Figure 1. Covariate balance (shaded area) between patients treated at 2 fictitious hospitals. The x axis represents the log-odds of the probability that a patient has surgery at hospital 1 vs hospital 2; the y axis represents the density of patients. Substantial overlap is present in log-odds in A, and less overlap is present in B.

Downloaded from
by guest on February 20, 2017

We calculated the mortality risk for each patient using the observed values of his or her confounding variables. The individual risk factors were multiplied by the estimated coefficients from the regression model, transformed onto the probability scale, and summed to obtain the expected number of deaths at each hospital.
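The calculation of expected deaths can be sketched as follows. This is an illustration only: the function names and the 3 patients are hypothetical, and the model is cut down to 2 risk factors whose coefficients are the logs of the odds ratios reported in Table 3 (renal failure 3.35, cardiogenic shock 6.58), with the mean between-hospital intercept of -5.05 logits standing in for the intercept.

```python
import math

def inverse_logit(x):
    """Map a linear predictor (log-odds) onto the probability scale."""
    return 1.0 / (1.0 + math.exp(-x))

def expected_deaths(patients, intercept, coefficients):
    """Sum each patient's predicted probability of death.

    patients: list of dicts mapping risk-factor name -> value (0/1 here).
    coefficients: dict mapping risk-factor name -> log odds ratio."""
    total = 0.0
    for covariates in patients:
        linear = intercept + sum(coefficients[name] * value
                                 for name, value in covariates.items())
        total += inverse_logit(linear)
    return total

# Illustrative subset of the model; the full model has more covariates.
coefs = {"renal_failure": math.log(3.35), "cardiogenic_shock": math.log(6.58)}
toy_patients = [
    {"renal_failure": 0, "cardiogenic_shock": 0},
    {"renal_failure": 1, "cardiogenic_shock": 0},
    {"renal_failure": 1, "cardiogenic_shock": 1},
]
expected = expected_deaths(toy_patients, intercept=-5.05, coefficients=coefs)
print(expected)
```

Summing over all patients at one hospital gives that hospital's expected number of deaths for its own case mix.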

Hospital RSMRs

We next estimated a risk-standardized mortality ratio for each hospital by computing the ratio of the “observed” number of deaths to the expected number of deaths (RSMR). However, rather than use the actual numbers of deaths at a hospital, we used an adjusted number (called a shrinkage estimate) that avoids several statistical problems associated with the observed number, including small sample sizes and clustering.28,34,50,51 We then multiplied the standardized mortality ratio by the crude state mortality rate to obtain hospital-specific RSMRs. Ninety-five percent posterior intervals for each RSMR were computed.
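The arithmetic of the RSMR can be sketched as below. The function name and the hospital counts are hypothetical; only the state totals (99 deaths in 4393 admissions) come from the article, and the shrinkage estimate is taken as given rather than derived from the hierarchical model.

```python
def rsmr(shrunken_deaths, expected_deaths, crude_state_rate):
    """Risk-standardized mortality rate: the ratio of the shrinkage-adjusted
    "observed" deaths to expected deaths, scaled by the crude state rate."""
    return (shrunken_deaths / expected_deaths) * crude_state_rate

# Crude state mortality rate reported in the article: 99 deaths in 4393 admissions.
crude_rate = 99 / 4393

# Hypothetical hospital: 6.6 shrinkage-adjusted deaths against 6.0 expected.
example_rsmr = rsmr(6.6, 6.0, crude_rate)
print(round(100 * crude_rate, 2), round(100 * example_rsmr, 2))
```

A hospital whose adjusted deaths equal its expected deaths gets an RSMR equal to the crude state rate, which is why that rate serves as the benchmark.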


Because all hospitals contribute to the model used to estimate the expected number of deaths, each hospital helps to define its own expected behavior.50,51 If one hospital is truly “outlying,” with an unusually high or low mortality rate, it may “inflate” the estimated between-hospital variance component because the regression model adapts to incorporate the results of the unusual hospital. Consequently, this hospital will be less likely to be identified as an outlier. With a very large number of hospitals, the results of one institution are unlikely to distort the model substantially. However, with a smaller number of cardiac surgery hospitals, as in Massachusetts or other individual states, one aberrant hospital could substantially influence the counterfactual outcome and make the performance of that hospital less likely to be identified as an outlier.

We addressed this problem through cross-validation. In a second set of analyses, the data from each hospital were sequentially deleted from the determination of the counterfactual distribution for its particular patients. With this approach, the expected number of deaths for a hospital represents how well the rest of the hospitals in the state would fare with the patients from that specific hospital. We computed the difference between the observed numbers of deaths in each hospital and the number of deaths predicted using its case mix and the regression coefficients from a model based on all other hospitals. Posterior predictive probability values, which reflect the similarity of the mortality experience of a particular hospital to that of its peers, also were computed.50 Extreme predictive P values (P<0.01 or P>0.99) indicate a discrepancy between the observed data and what is predicted by the model developed from the remaining hospitals.
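The leave-one-hospital-out logic can be sketched as below. This is a deliberately simplified stand-in: a crude peer mortality rate replaces the case-mix-adjusted regression model used in the article, and the hospital names and counts are made up.

```python
def leave_one_out_differences(hospitals):
    """hospitals: dict mapping name -> (deaths, cases).

    Returns name -> observed minus predicted mortality in percentage points,
    where each hospital is predicted from its peers only. A crude peer rate
    stands in for the risk model, so case mix is NOT adjusted here."""
    total_deaths = sum(d for d, _ in hospitals.values())
    total_cases = sum(n for _, n in hospitals.values())
    result = {}
    for name, (deaths, cases) in hospitals.items():
        # Exclude the hospital's own data from its counterfactual.
        peer_rate = (total_deaths - deaths) / (total_cases - cases)
        observed_rate = deaths / cases
        result[name] = 100.0 * (observed_rate - peer_rate)
    return result

# Made-up counts for 3 hypothetical hospitals.
toy = {"H1": (4, 200), "H2": (6, 300), "H3": (15, 250)}
diffs = leave_one_out_differences(toy)
print(diffs)
```

Positive differences indicate higher-than-predicted mortality. Because the hospital's own results never enter its prediction, an aberrant hospital cannot pull its own benchmark toward itself.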

Table 1. Selected Patient Characteristics Stratified by Hospital: Massachusetts Adults Undergoing Isolated CABG Surgery During 2003

Columns: Hospital, %; All, %
Rows: Renal failure; Hx of PVD; Prior CABG; EF <30%; MI <6 h; Emergent or salvage; Cardiogenic shock; Preop IABP
[Cell values not recoverable from the extracted text.]

Hx of PVD indicates history of peripheral vascular disease; EF, ejection fraction; MI, myocardial infarction; and preop IABP, preoperative intraaortic balloon pump.

The authors had full access to and take full responsibility for the integrity of the data. All authors have read and agree to the manuscript as written.


The crude 30-day mortality rate is 2.25%, corresponding to 99 deaths out of 4393 isolated CABG admissions. The number of isolated CABG admissions ranged from a low of 44 to a high of 650. Not surprisingly, substantial differences were found in patient risk factors among hospitals (Table 1). For example, the percentage of admissions in which ejection fraction was <30% ranged from 1.8% to 15.0%, renal failure ranged from 1.8% to 13.0%, preoperative intraaortic balloon pump use varied from 2.3% to 29.0%, and emergent or salvage procedures ranged from 0% to 7.2%. Visual inspection of the covariate frequencies for hospitals B and F suggests that they represent, on average, quite different populations. For example, 7.2% of the patients at hospital B were emergent or salvage, the highest-acuity group, whereas only 0.9% of patients at hospital F were in that category. This imbalance is illustrated more formally in Figure 2B, a graphic depiction of the density of estimated propensity scores from hospital B compared with those of hospital F. This analysis is restricted to those patients who underwent surgery in those 2 hospitals. The propensity scores in Figure 2B were obtained by estimating a (binary) logistic regression model in which the response was an indicator assuming a value of 1 if the patient underwent CABG at hospital B and 0 if the patient underwent surgery at hospital F. The density estimates indicate that for 13% of the patients who underwent CABG at hospital B (solid line), no “similar” patient underwent the procedure in hospital F (dashed line). This percentage was calculated by identifying the fraction of hospital B patients with estimated log-odds of propensity scores >5 because this defined the area of nonoverlap (eg, no hospital F patient had an estimated log-odds of propensity score >5). The lack of overlap implies that a direct comparison of all patients treated at hospital B with those at hospital F may not be statistically valid.

[Figure 2. Panel A: Hospital B vs All Others, Multinomial Logistic; density curves for surgery at B and surgery not at B; x axis, Logit(P(Surgery at Hospital B)), ticks from -6.0 to 0.0. Panel B: Hospital B vs Hospital F, Binary Logistic; density curves for surgery at B and surgery at F; x axis, Logit(P(Surgery at Hospital B)), ticks from -6 to 26.]

Figure 2. Covariate balance for 2 comparisons using Massachusetts cardiac surgery programs. A, Substantial overlap is present in the log-odds of the probability of surgery at hospital B vs the remaining 13 cardiac surgery programs. B, The covariate balance for the direct comparison of hospital B to hospital F is much less.

Table 2. Comparison of Prevalence of Risk Factors Between Hospitals B and F

Columns: Risk Factor; Area of Nonoverlap, Hospital B; Area of Overlap, Hospital F; Area of Overlap, Hospital B
[Cell values not recoverable from the extracted text.]

PTCA indicates percutaneous coronary angioplasty. Area of overlap was defined by estimated log-odds of propensity score ≤5.

Table 3. Prevalence of Risk Factors and Conditional and Unconditional (Population-Averaged) Odds Ratios of 30-Day Mortality After Isolated CABG Surgery in Massachusetts (2003)

[Most prevalence values and some row labels did not survive extraction; surviving values are shown.]

Risk Factor                                    Prevalence, %   Odds Ratio (95% Posterior Limits)
Years >65 y, mean                              [missing]       1.06 (1.04, 1.09)
[label missing]                                [missing]       0.43 (0.28, 0.68)
Renal failure                                  [missing]       3.35 (1.81, 5.57)
Diabetes mellitus                              [missing]       1.21 (0.76, 1.84)
[label missing]                                [missing]       1.05 (0.58, 1.84)
Peripheral vascular disease                    [missing]       1.34 (0.78, 2.08)
Prior CABG surgery                             [missing]       2.67 (0.96, 5.49)
Prior PTCA surgery                             [missing]       1.32 (0.75, 2.15)
Cardiogenic shock                              [missing]       6.58 (2.70, 13.87)
Ejection fraction (reference, [missing])
  <30% or missing                              12.6            1.13 (0.55, 1.99)
  30% to 39%                                   12.4            2.05 (1.16, 3.29)
Myocardial infarction (reference, none)        49.7            1.00
  <6 h                                         [missing]       0.72 (0.09, 2.40)
  7–24 h                                       [missing]       1.77 (0.47, 4.51)
  1–7 d                                        [missing]       1.16 (0.63, 1.98)
  8–21 d                                       [missing]       1.26 (0.47, 2.61)
  >21 d                                        [missing]       1.01 (0.50, 1.78)
Status of CABG (reference, elective)           31.2            1.00
  [label missing]                              [missing]       0.74 (0.43, 1.20)
  [label missing]                              [missing]       2.16 (0.70, 4.96)
Preoperative intraaortic balloon pump          [missing]       1.71 (0.89, 2.91)
Between-hospital parameters*
  Mean, between-hospital (logits)              …               -5.05 (-5.76, -4.35)
  Variance, between-hospital (logits)          …               0.0939 (0.00111, 0.483)
  MOR                                          …               1.34

PTCA indicates percutaneous coronary angioplasty. Based on 4393 surgeries with 99 deaths (2.25%).

*Between-hospital parameter estimates are reported as means and 95% limits.

Table 2 illustrates the prevalence of the individual covariates from which these propensity score density distributions were derived. Column 1 shows the characteristics of the subset of patients at hospital B who do not overlap with hospital F (ie, for whom the log-odds of their propensity scores are >5). The prevalence of individual high-risk characteristics is quite elevated in this patient subset (eg, 24% renal failure, 17% reoperation, 10% cardiogenic shock, 52% emergent or salvage), and hospital F has no experience with patients having this overall level of acuity. The last 2 columns demonstrate the balancing properties of propensity scores in the area of overlap, in which patients are found from both hospitals with comparable log-odds of propensity score. For many of the most important covariates (eg, prior CABG, cardiogenic shock, recent myocardial infarction, urgent or emergent/salvage status), the prevalence was comparable for hospital B and F patients in the overlap region.

Table 4. Cross-Validation Results

Columns under Counterfactual=Hospital Peer Experience (Entire State Excluding Hospital): Between-Hospital MOR; Observed-“Predicted” Mortality, %; Predictive P
Column under Counterfactual=Entire State Experience (2.25%): RSMR (95% Posterior Limits)

[Hospital labels and the 3 cross-validation columns did not survive extraction. The 14 surviving entries of the last column are listed in their original order.]

2.49 (1.83, 3.83)
2.03 (1.16, 2.74)
2.39 (1.58, 3.69)
3.06 (2.07, 5.68)
2.64 (1.85, 4.49)
2.05 (0.89, 3.09)
2.12 (1.33, 2.88)
2.12 (1.18, 3.03)
2.33 (1.45, 3.68)
2.20 (1.47, 3.12)
2.58 (1.78, 4.25)
2.10 (1.06, 3.07)
2.11 (1.11, 3.09)
2.39 (1.29, 4.17)

MOR for entire state is 1.34. Positive values of observed-predicted indicate higher-than-predicted mortality rates, whereas negative values indicate lower-than-predicted mortality.

Although direct hospital-to-hospital covariate balance was poor, the overlap of estimated propensity score distributions for each hospital compared with the propensity score distribution for patients at most of the remaining hospitals was excellent. For example, Figure 2A displays the overlap for hospital B and all remaining hospitals based on the predictions obtained from the multinomial logistic regression model. This suggests that a comparison of the performance of hospital B relative to the overall group of other Massachusetts CABG providers is statistically valid.

The prevalence of the confounders and their relationship to 30-day mortality are presented in Table 3. Between-hospital variation measured by the MOR, after accounting for patient risk factors, is 1.34. This implies that for 2 patients with the same observed risk factors treated at 2 randomly chosen hospitals, the patient at the hospital with higher mortality risk has, in the median case, 1.34 times the odds of dying within 30 days of isolated CABG compared with the patient at the hospital with lower mortality risk.
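The reported MOR can be checked against the between-hospital variance in Table 3. The sketch below assumes the closed form commonly given for a normally distributed random intercept, MOR = exp(sqrt(2 × variance) × z), where z is the 75th percentile of the standard normal distribution; the article does not print this formula, so treat it as an assumption. The function name is hypothetical.

```python
import math
from statistics import NormalDist

# 75th percentile of the standard normal distribution (about 0.6745).
Z_75 = NormalDist().inv_cdf(0.75)

def median_odds_ratio(between_hospital_variance):
    """Median odds ratio implied by a random-intercept variance on the logit scale."""
    return math.exp(math.sqrt(2.0 * between_hospital_variance) * Z_75)

mor_all = median_odds_ratio(0.0939)      # variance with all 14 hospitals
mor_without_d = median_odds_ratio(0.048) # variance with hospital D excluded
print(round(mor_all, 2), round(mor_without_d, 2))
```

Under this assumption, the two variances reported in the article (0.0939 and 0.048) reproduce the reported MOR values of 1.34 and 1.23, and a variance of 0 gives a MOR of exactly 1.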

The last column of Table 4 depicts the typical profiling results that would be obtained with the entire state experience (all 14 hospitals) as the counterfactual. The 95% posterior interval of each hospital for its RSMR includes the state crude rate of 2.25%. This would imply that no hospital had higher- or lower-than-expected mortality rate given its case mix. In most public report cards, this finding would be regarded as sufficient evidence for the absence of statistical outliers, but as noted previously, this conclusion may be misleading. The 3 columns on the left demonstrate the results of analyses performed with cross-validation, sequentially deleting the results of each hospital from the determination of its own counterfactual. The result of this cross-validation predictive P value analysis was highly significant (P=0.01) for hospital D on the left side of Table 4. Supporting this concern is the fact that the between-hospital variation in risk-adjusted mortality is reduced by 50% when hospital D is excluded from the model (from 0.0939 to 0.048; data not shown), and the MOR decreases from 1.34 to 1.23. Finally, a 2.26% excess mortality rate results when hospital D is compared with its peers. These findings all suggest that hospital D is in fact a statistical outlier.


The study of variations in the provision of healthcare services has been a central activity of outcomes research for more than 2 decades. This variability has included both utilization of services and outcomes. Initial publication of hospital mortality rates in 1986 by the Health Care Financing Administration (now known as the Centers for Medicare and Medicaid Services, or CMS) was widely criticized for failing to adjust for patient risk.52 This motivated the development of numerous statistical risk models, particularly in cardiac surgery, to account for preoperative patient characteristics. It also stimulated CMS to look more closely at its risk models. It has now released new mortality models for acute myocardial infarction and heart failure that address many risk-adjustment issues and statistical deficiencies identified in their earlier releases.32,33 Nevertheless, although risk adjustment corrects for the case severity at a given institution using risk estimates derived from the entire population, it does not guarantee statistically valid direct hospital-to-hospital comparisons. When analyzing outcomes data, interested stakeholders should always consider these additional questions: To what type of patients can inferences about risk-standardized hospital outcomes be applied? What reference population was used to determine the counterfactual? If direct hospital-to-hospital comparison is the goal, is there sufficient covariate balance (overlap) to justify such comparison? A widely held view is that risk adjustment levels the playing field so that hospitals can be compared directly with one another over the broad spectrum of patient risk. We argue that this assumption often is invalid and that this common misinterpretation has profound health policy implications in today’s performance-centric environment.

Are current report cards useful? Yes, they are useful when interpreted in the correct context. Most outcomes report cards use indirect standardization. In this context, the RSMR of a hospital may be interpreted as a measure of quality for the type of patient it treats. Properly constructed and interpreted, report cards facilitate comparisons of hospitals with the entire experience of a larger population of providers (eg, a state or region). Such a comparison group for each hospital typically will be rich enough to support a valid assessment of their quality of care, and it provides meaningful information to payers, regulators, and healthcare consumers.


Outcomes research typically involves nonrandomized studies to assess the results of patient experience with the healthcare system. Virtually always, some form of adjustment is required. Although risk-standardized outcomes have been an important advance in adjusting provider results for differences in case mix, such results often have been misapplied. Assessing the performance of a hospital for its case mix compared with the expected performance of a reference group of providers for a similar case mix usually is justified. However, because of substantial differences in the distribution of risk factors, it may often be inappropriate to directly compare 2 hospitals using the results available in most public report cards.

Sources of Funding

Dr Normand is contracted by the Massachusetts Department of Public Health to monitor hospital cardiac quality and also receives funding from Yale University to develop risk models for CMS.




1. Agency for Healthcare Research and Quality. Outcomes Research: Fact Sheet. Available at: Accessed September 5, 2007.

2. Institute of Medicine. Crossing the Quality Chasm: A New Health System for the 21st Century. Washington, DC: National Academies Press; 2001.

3. Institute of Medicine. Performance Measurement: Accelerating Improvement. Washington, DC: National Academies Press; 2006.

4. Gatsonis CA. Profiling providers of medical care. In: Armitage P, Colton T, ed. Encyclopedia of Biostatistics, Volume 6. 2nd ed. Chichester, UK: John Wiley & Sons Ltd; 2005:4252–4254.

5. Normand S-LT. Quality of care. In: Armitage P, Colton T, ed. Encyclopedia of Biostatistics, Volume 6. 2nd ed. Chichester, UK: John Wiley & Sons Ltd; 2005:4348–4352.

6. Rubin DB. Comment: Neyman (1923) and causal inference in experiments and observational studies. Stat Sci. 1990;5:472–480.

7. Holland PW. Statistics and causal inference. J Am Stat Assoc. 1986;81:945–960.

8. Holland PW, Rubin DB. Causal inference in retrospective studies. Eval Rev. 1988;12:203–231.

9. Rothman KJ, Greenland S. Causation and causal inference in epidemiology. Am J Public Health. 2005;95:S144–S150.

10. Rothman KJ, Greenland S. Modern Epidemiology. Philadelphia, Pa: Lippincott-Raven; 1998.

11. Pearl J. Causality: Models, Reasoning, and Inference. Cambridge, UK: Cambridge University Press; 2000.

12. Robins JM, Greenland S. The role of model selection in causal inference from nonexperimental data. Am J Epidemiol. 1986;123:392–402.

13. Rosenbaum PR, Rubin DB. Estimating the effects caused by treatments: comment. J Am Stat Assoc. 1984;79:26–28.

14. Rosenbaum PR, Rubin DB. The central role of the propensity score in observational studies for causal effects. Biometrika. 1983;70:41–55.

15. Rosenbaum PR. Observational Studies. New York, NY: Springer; 2002.

16. Little RJ, Rubin DB. Causal effects in clinical and epidemiological studies via potential outcomes: concepts and analytical approaches. Annu Rev Public Health. 2000;21:121–145.

17. Rubin DB. Causal inference using potential outcomes: design, modeling, decisions. J Am Stat Assoc. 2005;100:322–331.

18. Gelman A. Applied Bayesian Modeling and Causal Inference From Incomplete Perspectives. Chichester, UK: Wiley; 2004.

19. Gelman A, Hill J. Data Analysis Using Regression and Multilevel/Hierarchical Models. Cambridge, UK: Cambridge University Press; 2007.

20. Maldonado G, Greenland S. Estimating causal effects. Int J Epidemiol. 2002;31:422–429.

21. Rubin DB. The design versus the analysis of observational studies for causal effects: parallels with the design of randomized trials. Stat Med. 2007;26:20–36.

22. Rubin DB. Direct and indirect causal effects via potential outcomes. Scand J Stat. 2004;31:161–170.

23. Rubin DB. Bayesian-inference for causal effects: role of randomization. Ann Stat. 1978;6:34–58.

24. Rubin DB. Estimating causal effects of treatments in randomized and nonrandomized studies. J Educ Psychol. 1974;66:688–701.

25. Fleiss JL, Levin BA, Paik MC. Statistical Methods for Rates and Proportions. Hoboken, NJ: J. Wiley; 2003.

26. Shahian DM, Blackstone EH, Edwards FH, Grover FL, Grunkemeier GL, Naftel DC, Nashef SA, Nugent WC, Peterson ED. Cardiac surgery risk models: a position article. Ann Thorac Surg. 2004;78:1868–1877.

27. Shahian DM, Normand SL, Torchiana DF, Lewis SM, Pastore JO, Kuntz RE, Dreyer PI. Cardiac surgery report cards: comprehensive review and statistical critique. Ann Thorac Surg. 2001;72:2155–2168.

28. Normand S-LT, Glickman ME, Gatsonis CA. Statistical methods for profiling providers of medical care: issues and applications. J Am Stat Assoc. 1997;92:803–814.

29. McNeil BJ, Pedersen SH, Gatsonis C. Current issues in profiling quality of care. Inquiry. 1992;29:298–307.

30. Hannan EL, Wu C, Ryan TJ, Bennett E, Culliford AT, Gold JP, Hartman A, Isom OW, Jones RH, McNeil B, Rose EA, Subramanian VA. Do hospitals and surgeons with higher coronary artery bypass graft surgery volumes still have lower risk-adjusted mortality rates? Circulation. 2003;108:795–801.

31. Hannan EL, Kumar D, Racz M, Siu AL, Chassin MR. New York State’s Cardiac Surgery Reporting System: four years later. Ann Thorac Surg. 1994;58:1852–1857.

32. Krumholz HM, Wang Y, Mattera JA, Wang Y, Han LF, Ingber MJ, Roman S, Normand SL. An administrative claims model suitable for profiling hospital performance based on 30-day mortality rates among patients with an acute myocardial infarction. Circulation. 2006;113:1683–1692.

33. Krumholz HM, Wang Y, Mattera JA, Wang Y, Han LF, Ingber MJ, Roman S, Normand SL. An administrative claims model suitable for profiling hospital performance based on 30-day mortality rates among patients with heart failure. Circulation. 2006;113:1693–1701.

34. Shahian DM, Torchiana DF, Shemin RJ, Rawn JD, Normand SL. Massachusetts cardiac surgery report card: implications of statistical methodology. Ann Thorac Surg. 2005;80:2106–2113.

35. Rosenbaum PR, Rubin DB. Constructing a control-group using multivariate matched sampling methods that incorporate the propensity score. Am Stat. 1985;39:33–38.

36. Rosenbaum PR, Rubin DB. Reducing bias in observational studies using subclassification on the propensity score. J Am Stat Assoc. 1984;79:516–524.

37. Rubin DB. The design versus the analysis of observational studies for causal effects: parallels with the design of randomized trials. Stat Med. 2007;26:20–36.

38. D’Agostino RB Jr. Propensity scores in cardiovascular research. Circulation. 2007;115:2340–2343.

39. Braitman LE, Rosenbaum PR. Rare outcomes, common treatments: analytic strategies using propensity scores. Ann Intern Med. 2002;137:693–695.

40. D’Agostino RB Jr. Propensity score methods for bias reduction in the comparison of a treatment to a non-randomized control group. Stat Med. 1998;17:2265–2281.

41. Joffe MM, Rosenbaum PR. Invited commentary: propensity scores. Am J Epidemiol. 1999;150:327–333.

42. Glance LG, Osler TM, Mukamel DB, Dick AW. Use of a matching algorithm to evaluate hospital coronary artery bypass grafting performance as an alternative to conventional risk adjustment. Med Care. 2007;45:292–299.

43. Huang IC, Frangakis C, Dominici F, Diette GB, Wu AW. Application of a propensity score approach for risk adjustment in profiling multiple physician groups on asthma care. Health Serv Res. 2005;40:253–278.

44. Dehejia RH, Wahba S. Causal effects in nonexperimental studies: reevaluating the evaluation of training programs. J Am Stat Assoc. 1999;94:1053–1062.

45. Tchernis R, Horvitz-Lennon M, Normand SL. On the use of discrete choice models for causal inference. Stat Med. 2005;24:2197–2212.

46. Society of Thoracic Surgeons. STS National Database. Available at: Accessed September 5, 2007.

47. Social Security Death Index interactive search. Available at: http:// Accessed September 5, 2007.

48. Larsen K, Merlo J. Appropriate assessment of neighborhood effects on individual health: integrating random and fixed effects in multilevel logistic regression. Am J Epidemiol. 2005;161:81–88.

49. Larsen K, Petersen JH, Budtz J, Endahl L. Interpreting parameters in the logistic regression model with random effects. Biometrics. 2000;56:909–914.

50. Normand ST, Shahian DM. Statistical and clinical aspects of hospital outcomes profiling. Stat Sci. 2007;22:206–226.

51. Draper D, Gittoes M. Statistical analysis of performance indicators in UK higher education. J Royal Stat Soc Ser A (Stat Soc). 2004;167:449–474.

52. Iezzoni LI. Risk Adjustment for Measuring Health Care Outcomes. 3rd ed. Chicago, Ill: Health Administration Press; 2003.



Risk-standardized outcomes are increasingly being used by various stakeholders to assess the quality of care delivered by healthcare providers. Although adjusted outcomes represent a substantial improvement over unadjusted results, they are […] Risk-standardized outcomes, as most commonly constructed, characterize a provider’s performance for a specific group of […] population (typically a state or a country). These indirectly standardized outcomes, based on providers’ actual case mix, […] of patients. Moreover, if the number of providers in the reference population is small, the inclusion of a true outlying […] In Massachusetts, this problem is mitigated through the use of cross-validation, obtained by sequentially removing each hospital from risk model development and then assessing its performance with a model derived from the remaining […] of the overall reference population, this does not imply that the outcomes of 2 providers can be directly compared with one […]

Comparison of “Risk-Adjusted” Hospital Outcomes

David M. Shahian and Sharon-Lise T. Normand

Circulation. 2008;117:1955-1963; originally published online April 7, 2008; doi: 10.1161/CIRCULATIONAHA.107.747873

Circulation is published by the American Heart Association, 7272 Greenville Avenue, Dallas, TX 75231 Copyright © 2008 American Heart Association, Inc. All rights reserved.

Print ISSN: 0009-7322. Online ISSN: 1524-4539

