Writing Measure Specifications
In writing specifications for the measure, the domain of quality
improvement sought is transformed into a quantitative measure. For example, if caregivers
were interested in improving timely delivery of medications, they could decide to
focus on delivery of antibiotics, then focus on
the first dose of antibiotics, and finally set the performance standard of delivery
of the first dose of antibiotics within 30 minutes. Improvement teams often define
the quality measure too broadly, such as improving safety, which limits their ability
to develop an empirical measure of quality. Through writing the design specifications,
providers move from a broad domain of quality to a detailed definition that should
be developed with an appropriate level of scientific rigor to minimize the risk of bias.
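To show how such a specification becomes computable, the sketch below (Python) encodes the antibiotic example as an explicit numerator and denominator; the field names, sample records, and 30-minute threshold are illustrative assumptions rather than part of any published measure.

```python
from datetime import datetime, timedelta

# Illustrative specification (hypothetical field names, not a published standard):
#   Denominator: patients with an order for a first antibiotic dose
#   Numerator:   patients whose first dose was given within 30 minutes of the order
THRESHOLD = timedelta(minutes=30)

patients = [
    {"id": "A", "order_time": datetime(2024, 1, 1, 8, 0), "first_dose_time": datetime(2024, 1, 1, 8, 20)},
    {"id": "B", "order_time": datetime(2024, 1, 1, 9, 0), "first_dose_time": datetime(2024, 1, 1, 9, 55)},
    {"id": "C", "order_time": datetime(2024, 1, 1, 10, 0), "first_dose_time": None},  # dose never documented
]

denominator = [p for p in patients if p["order_time"] is not None]
numerator = [
    p for p in denominator
    if p["first_dose_time"] is not None
    and p["first_dose_time"] - p["order_time"] <= THRESHOLD
]

rate = len(numerator) / len(denominator)
print(f"First antibiotic dose within 30 minutes: {len(numerator)}/{len(denominator)} = {rate:.0%}")
```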
Although many improvement efforts lack a data collection system,
the use of standardized data collection tools can help reduce the burden of data
collection and improve the validity and reliability of the measure. Pilot testing the
data collection system generally identifies areas requiring further modification.
For example, the measure specifications may need more detail, the required data
may not be available, or the burden of data collection may simply be too great.
Although pilot testing can be performed on a small sample, skipping
this step is perilous because doing so can compromise the validity of the measure.
Reliability is reproducibility. If measurements of the same data
are repeated, will the same result occur? Reliability can be improved by having
detailed specifications and data collection instruments and by systematically training
data collectors. In contrast, validity is the extent to which the measure accurately
evaluates the domain of quality being assessed. Although formal methods exist
to evaluate validity, we have found that the most practical approach for local quality
improvement efforts is to ask the consumer of the information: Does the measure, as defined,
accurately evaluate an aspect of the quality of care? To use the measure, the consumer
of the data must believe the measure is valid.[62][63]
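As a concrete check of reproducibility, one simple approach (an illustrative sketch, not a method prescribed here) is to have two data collectors abstract the same charts and compute their percent agreement; more formal statistics such as Cohen's kappa could be substituted.

```python
# Hypothetical yes/no abstractions by two independent data collectors reviewing
# the same ten charts ("was the first antibiotic dose given within 30 minutes?").
abstractor_1 = ["Y", "Y", "N", "Y", "N", "Y", "Y", "N", "Y", "Y"]
abstractor_2 = ["Y", "Y", "N", "N", "N", "Y", "Y", "N", "Y", "Y"]

agreements = sum(a == b for a, b in zip(abstractor_1, abstractor_2))
percent_agreement = agreements / len(abstractor_1)

# Low agreement points to vague specifications or inadequate collector training.
print(f"Percent agreement: {agreements}/{len(abstractor_1)} = {percent_agreement:.0%}")
```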
As an example of this consumer test of validity, a hospital measured return to the ICU
within 30 days as a measure of ICU quality (see Chapters 71, 74, and 75). Nonetheless,
the ICU physicians felt that, as defined, this measure did not accurately measure
quality of ICU care; readmission was probably determined by the patient's disease or by care
in the less acute units to which patients were transferred, rather than by the ICU discharge
decision. We changed the measure to readmission within 48 hours, and providers accepted
and used it regularly.
The team must decide how the data will be analyzed and how clinical
and statistical significance will be determined. Although the analyses of rates
or proportions are generally clear, the analysis of continuous data, such as time,
is less clear. For example, a team wanted to improve the timely delivery of medication.
After reviewing the available literature and discussing it, they decided that
timely delivery of antibiotics was important and focused attention on that area.
The team then needed to decide how to analyze the data. Would they evaluate the mean time,
the median time, or the percentage of patients who received antibiotics within some threshold time?
Which is correct? They all could be. The consumers of the information, informed
by evidence, must decide which to use. Scales and units of measure should be chosen
to optimize the learning of the user.[64]
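To make these analytic choices concrete, the sketch below computes all three candidate summaries from the same hypothetical delivery times; the sample values and the 30-minute threshold are made up for illustration.

```python
from statistics import mean, median

# Hypothetical minutes from antibiotic order to first dose for ten patients.
delivery_minutes = [12, 18, 25, 27, 33, 35, 41, 48, 62, 140]
THRESHOLD_MINUTES = 30  # illustrative performance standard

mean_time = mean(delivery_minutes)
median_time = median(delivery_minutes)
pct_within = sum(t <= THRESHOLD_MINUTES for t in delivery_minutes) / len(delivery_minutes)

# The same data yield very different summaries: the single 140-minute outlier
# inflates the mean far more than the median or the proportion within threshold.
print(f"Mean time:   {mean_time:.1f} min")
print(f"Median time: {median_time:.1f} min")
print(f"Within {THRESHOLD_MINUTES} min: {pct_within:.0%}")
```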
The microsystem and organizational structure may affect patient
outcomes. For example, improved team communication within an ICU reduced both ICU length
of stay and the rate of failed extubations.[38] Moreover, the magnitude of the effect of these
organizational factors may be as large as that of the best therapies. We found that ICU physician
staffing is associated with a 30% relative risk reduction in inpatient mortality, at least as large
a reduction in mortality as that achieved by our best ICU therapies.[65][66]
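For readers unfamiliar with the term, a relative risk reduction compares event rates with and without the factor of interest; the short calculation below uses made-up mortality rates chosen only to show how a 30% figure is derived, not the actual rates from the cited studies.

```python
# Illustrative (not actual) inpatient mortality rates, chosen only to show the arithmetic.
mortality_without_factor = 0.20  # hypothetical baseline mortality
mortality_with_factor = 0.14     # hypothetical mortality with the organizational factor

relative_risk = mortality_with_factor / mortality_without_factor
relative_risk_reduction = 1 - relative_risk

print(f"Relative risk:           {relative_risk:.2f}")            # 0.70 with these numbers
print(f"Relative risk reduction: {relative_risk_reduction:.0%}")  # 30% with these numbers
```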
The medical research community has adopted the randomized clinical
trial (i.e., efficacy study) as the optimal study design to evaluate therapies.
Our comfort with randomized clinical trials is warranted; they provide the least
risk of bias from observed or unobserved confounding variables.[67][68]
If our goal is to learn if therapy A is better
than therapy B, an ethically conducted randomized clinical trial is likely to provide
the least biased answer. Nonetheless, when applying the results of a randomized clinical
trial in routine practice (i.e., an effectiveness study), caregivers should consider
whether the results can be replicated, especially across different
microsystems (i.e., work units) and organizational structures.[61]
Because the microsystem and organizational structure are constantly
changing, providers implement interventions under varied conditions, potentially
altering their effects. Variation among organizations and changes over time can significantly
influence the ability to replicate results from randomized clinical trials, so caregivers
need tools to measure their own results.
The potential problem of assuming applicability of results of a randomized clinical
trial to a population of patients is illustrated in the carotid endarterectomy story.
The Asymptomatic Carotid Atherosclerosis Study (ACAS) found that the 30-day mortality
rate for asymptomatic patients having carotid endarterectomy was 1 death per 1000
patients.[69]
Coincident with the publication of
this trial was a significant increase in the rate of this operation between 1989
and 1995.[70]
However, an effectiveness study revealed
that the mortality rate for asymptomatic patients having a carotid endarterectomy
was 1 death per 100 patients, 10 times greater than in the clinical trial.[71]
This difference in mortality probably results from the trial's strict entry criteria:
patients with comorbid diseases, patients older than 79 years, and hospitals and surgeons
with poor outcomes were excluded from the clinical trial.