Author for correspondence. Fax 314-362-3016; e-mail parvin{at}wugcrc.wustl.edu
| Abstract |
|---|
|
|
|---|
| Introduction |
|---|
|
|
|---|
The traditional QC planning process assesses and compares QC strategies
on the basis of the probability of rejecting an analytical run when an
out-of-control error condition exists (Ped, see
Table 1
for a glossary of symbols) and the probability of falsely
rejecting an analytical run that is in control
(Pfr) (1). Comparing QC performance
by using traditional performance measures is difficult if the
definition of an analytical run differs between alternative QC
strategies.
|
As an illustration of the problem, imagine that two clinical laboratories wish to compare their QC performance. The total allowable error specification (TEa) for the analyte of interest is 5.0 multiples of the analytical SD. Laboratory A tests one control sample per analytical run (N = 1) and rejects the run if the control's result is more than 3.0 analytical SDs from target ($1_{3s}$ rule). Laboratory B tests two controls per analytical run (N = 2) and rejects if either one is more than 2.5 analytical SDs from target ($1_{2.5s}$ rule). Calculating Pfr gives a false-rejection probability of 0.003 for laboratory A and 0.025 for laboratory B. Following the traditional performance evaluation approach, "critical" out-of-control error conditions can be computed as SEc = TEa - 1.65 for a critical systematic error condition (in multiples of analytical SD) and REc = TEa/1.96 for a critical increase in analytical imprecision (3). Calculating Ped at SEc = 3.35 gives an error detection probability of 0.64 for laboratory A and 0.96 for laboratory B. Similar calculations can be performed for REc. The findings based on traditional performance measures are: Laboratory B uses twice as many control samples per analytical run as laboratory A; laboratory B's false-rejection rate is higher than laboratory A's; and laboratory B has good error detection ability for critical systematic errors, but laboratory A's error detection rate is too low.
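The probabilities quoted for the two laboratories can be checked numerically. The following is a minimal sketch, assuming normally distributed control results and the single-control rejection probability given in the Methods; the function names `phi` and `p_reject_1ks` are illustrative and do not come from the paper.

```python
from math import erf, sqrt


def phi(x):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def p_reject_1ks(k, n, se=0.0):
    """Probability that at least one of n controls lies more than k SDs from
    target when the process mean has shifted by se analytical SDs."""
    p_single = 1.0 - (phi(k - se) - phi(-k - se))
    return 1.0 - (1.0 - p_single) ** n


TEA = 5.0                 # total allowable error, in analytical SDs
SE_CRITICAL = TEA - 1.65  # critical systematic error, 3.35 SDs

# (control limit k, controls per run N) for each laboratory in the example
labs = {"A": (3.0, 1), "B": (2.5, 2)}

for name, (k, n) in labs.items():
    pfr = p_reject_1ks(k, n)               # in-control process
    ped = p_reject_1ks(k, n, SE_CRITICAL)  # critical systematic error
    print(f"Laboratory {name}: Pfr = {pfr:.3f}, Ped = {ped:.2f}")
```

Rounded to the precision used in the text, this gives Pfr = 0.003 and Ped = 0.64 for laboratory A, and Pfr = 0.025 and Ped = 0.96 for laboratory B.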
Suppose it is then revealed that laboratory A tests one QC sample every 10 patient samples, while laboratory B tests a pair of QC samples every 80 patient samples. Should these different analytical run definitions affect the comparison of the relative QC performance of the two laboratories? Knowledge that laboratory A and laboratory B define their analytical runs differently has no effect on the traditional performance measures Pfr and Ped.
The purpose of this paper is to introduce performance measures that can accommodate different definitions of an analytical run. The performance measures are based on the concept of the average number of patient samples (ANP) required to detect an out-of-control error condition. ANPed will denote the average number of patient specimens from the inception of an out-of-control error condition until it is detected, and ANPfr will denote the average number of patient specimens to rejection when no out-of-control error condition exists.
The clinical laboratory has traditionally specified quality requirements in terms of TEa. Within this context, the performance measures of primary interest should directly relate to the chances that a test result contains an analytical error that exceeds TEa (3). The average number of patient specimens that contain unacceptable analytical errors resulting from an out-of-control condition will be denoted ANPTE. ANPTE can be separated into two parts. The first part reflects the expected increase in the number of patient specimens with unacceptable analytical errors after the occurrence of an out-of-control condition, but before the next QC testing opportunity has arrived. This part will be denoted ANPE. The second part is the average number of unacceptable results attributable to the out-of-control condition starting from the first QC test after the error. This part will be denoted ANPQE. These performance measures are used to investigate how different analytical run definitions influence QC performance.
| Methods |
|---|
|
|
|---|
QC rules that test only the current group of N control samples are investigated. Two QC rules within this class are evaluated. The $1_{ks}$ rule rejects if any of the N control samples in the analytical run are more than k analytical SDs from target. The $\bar{X}(c)/R_{4s}$ rule rejects if the average z-score for the N control samples in the analytical run exceeds c SEMs or the range of the z-scores of the N control observations exceeds 4. A z-score is obtained by calculating the difference of a control observation from its expected mean and dividing by its analytical SD. A z-score has a mean value of 0 and an SD of 1, and the SEM of N z-score values is $1/\sqrt{N}$. For each analytical run definition, control limits for the $1_{ks}$ rule and the $\bar{X}(c)/R_{4s}$ rule are determined so that ANPfr = 2000 patient samples.
Given M patient specimens per analytical run, ANPfr is equal to the average number of analytical runs to false rejection (ARLfr) times the number of patient specimens per run, or ANPfr = ARLfr × M. Under the assumption that an out-of-control error condition can occur anywhere with equal probability and persists unchanged until detected, a straightforward (but lengthy) mathematical derivation shows that ANPed is closely approximated by M/2 + M(ARLed - 1), where ARLed denotes the average number of analytical runs to error detection. The first term is the average number of patient specimens after the error condition occurs but before the next QC testing opportunity. The second term is the average number of patient specimens after the first QC testing opportunity until a QC rejection. For QC rules that test only the control samples in the current analytical run, ARLed = 1/Ped (1).
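As an illustration of these two formulas, the sketch below applies them to the two laboratories from the Introduction (laboratory A: one control every 10 patient samples, 3.0 SD limit; laboratory B: two controls every 80 patient samples, 2.5 SD limits). These values are computed here and are not reported in the paper. The sketch assumes ARLfr = 1/Pfr by the same argument that gives ARLed = 1/Ped, and the function names are illustrative.

```python
from math import erf, sqrt


def phi(x):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def p_reject_1ks(k, n, se=0.0):
    """Rejection probability for n controls with limits at +/- k SDs and a
    systematic shift of se analytical SDs."""
    p_single = 1.0 - (phi(k - se) - phi(-k - se))
    return 1.0 - (1.0 - p_single) ** n


def anp_fr(m, p_fr):
    """ANPfr = ARLfr * M, assuming ARLfr = 1 / Pfr."""
    return m / p_fr


def anp_ed(m, p_ed):
    """ANPed ~= M/2 + M * (ARLed - 1), with ARLed = 1 / Ped."""
    return m / 2.0 + m * (1.0 / p_ed - 1.0)


SE = 3.35  # critical systematic error from the Introduction example

# (patient samples per run M, control limit k, controls per run N)
labs = {"A": (10, 3.0, 1), "B": (80, 2.5, 2)}

results = {}
for name, (m, k, n) in labs.items():
    fr = anp_fr(m, p_reject_1ks(k, n))
    ed = anp_ed(m, p_reject_1ks(k, n, SE))
    results[name] = (fr, ed)
    print(f"Laboratory {name}: ANPfr = {fr:.0f}, ANPed = {ed:.1f}")
```

Under these assumptions laboratory A processes roughly 11 patient specimens on average before detecting the shift, compared with roughly 43 for laboratory B, even though laboratory B has the higher Ped per run.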
Alternatively, an analytical run could be defined as TP units of time followed by N control samples (5). If it is assumed that an out-of-control error condition can occur with equal probability at any point in time and that the number of patient samples analyzed in a given time interval is proportional to the length of the time interval, then the above formula for ANPed still holds, where M denotes the average number of patient samples processed in TP units of time.
Let PE(SE) represent the probability that a test result contains an analytical error that exceeds TEa during the existence of an out-of-control error condition that causes a systematic error of SE analytical SDs (3). Then PE(0) is the probability that a test result contains an analytical error that exceeds TEa when the process is in control. The average number of patient samples that contain unacceptable analytical errors attributable to the existence of SE is ANPTE = ANPed [PE(SE) - PE(0)], which can be separated into the two components ANPE = (M/2) [PE(SE) - PE(0)] and ANPQE = M(ARLed - 1)[PE(SE) - PE(0)]. The same equations apply for an out-of-control error condition that causes an increase in analytical imprecision by a factor of RE with PE(SE) and PE(0) replaced by PE(RE) and PE(1) respectively.
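The decomposition of ANPTE into ANPE and ANPQE can be sketched as follows. This is a minimal illustration for a systematic error only, using the normal-distribution expression for PE(SE) given in the Methods. The example inputs (M = 20, Ped = 0.5, TEa = 5, SE = 3) are hypothetical values chosen for illustration, not values from the paper.

```python
from math import erf, sqrt


def phi(x):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def p_exceed_tea(tea, se):
    """PE(SE): probability that a result's analytical error exceeds TEa when
    a systematic error of se analytical SDs is present."""
    return 1.0 - (phi(tea - se) - phi(-tea - se))


def anp_components(m, p_ed, tea, se):
    """Return (ANPE, ANPQE, ANPTE) for a systematic error of se SDs.

    m    : patient specimens per analytical run
    p_ed : probability of rejecting a run while the error is present
    """
    excess = p_exceed_tea(tea, se) - p_exceed_tea(tea, 0.0)
    arl_ed = 1.0 / p_ed
    anp_e = (m / 2.0) * excess            # before the first QC opportunity
    anp_qe = m * (arl_ed - 1.0) * excess  # from the first QC test onward
    return anp_e, anp_qe, anp_e + anp_qe


# Hypothetical inputs, for illustration only
anp_e, anp_qe, anp_te = anp_components(m=20, p_ed=0.5, tea=5.0, se=3.0)
print(f"ANPE = {anp_e:.3f}, ANPQE = {anp_qe:.3f}, ANPTE = {anp_te:.3f}")
```

When Ped = 1 the error is caught at the first QC opportunity, so ANPQE is zero and ANPTE reduces to ANPE.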
All results were obtained by numerical analysis without use of simulations. Given $TE_a$, $P_E(SE) = 1 - [\Phi(TE_a - SE) - \Phi(-TE_a - SE)]$ during a systematic error condition equal to SE and $P_E(RE) = 1 - [\Phi(TE_a/RE) - \Phi(-TE_a/RE)]$ if analytical imprecision increases by a factor equal to RE, where $\Phi$ is the cumulative distribution function of the standard normal distribution. For the $1_{ks}$ rule with N control observations, $P_{ed} = 1 - (1 - P_1)^N$, where $P_1 = 1 - [\Phi(k - SE) - \Phi(-k - SE)]$ is the probability of a single control observation exceeding control limits when a systematic error condition equal to SE exists and $P_1 = 1 - [\Phi(k/RE) - \Phi(-k/RE)]$ for RE. For the $\bar{X}(c)$ rule with N control observations, the probability of rejection is $P_x = 1 - [\Phi(c - SE\sqrt{N}) - \Phi(-c - SE\sqrt{N})]$ and $P_x = 1 - [\Phi(c/RE) - \Phi(-c/RE)]$ for systematic and random error conditions respectively. For the $R_{4s}$ rule with N = 2, the probability of rejection, $P_R$, for any systematic error condition is $P_R = 1 - \chi^2_1(8)$, where $\chi^2_1$ is the cumulative $\chi^2$ distribution function with 1 degree of freedom. If a random error condition exists, $P_R = 1 - \chi^2_1(8/RE^2)$. When N > 2, $P_R$ is calculated by numerical integration (6). For the $\bar{X}(c)/R_{4s}$ rule, $P_{ed} = P_x + (1 - P_x)P_R$.
Calculations were performed by using the software package Stata (Stata
Corp.).
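The paper's calculations were done in Stata; the following Python sketch is only an illustration of the rejection probabilities described above for the mean/range rule with N = 2. It assumes the mean and range of normally distributed z-scores are independent, which is what the product form of Ped implies. The function names are illustrative. Allowing a shift `se` and an imprecision factor `re` in one call is a convenience of this sketch and reduces to the two separate cases in the text when only one of them departs from its default.

```python
from math import erf, sqrt


def phi(x):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def chi2_cdf_1df(x):
    """Cumulative chi-square distribution with 1 degree of freedom:
    P(Z**2 <= x) = erf(sqrt(x / 2)) for a standard normal Z."""
    return erf(sqrt(x / 2.0)) if x > 0.0 else 0.0


def p_xbar(c, n, se=0.0, re=1.0):
    """Probability that the mean z-score of n controls exceeds c SEMs."""
    shift = se * sqrt(n)  # the shift expressed in SEM units
    return 1.0 - (phi((c - shift) / re) - phi((-c - shift) / re))


def p_r4s_n2(re=1.0):
    """Probability that the range of two z-scores exceeds 4. A systematic
    shift moves both controls equally, so it does not affect the range."""
    return 1.0 - chi2_cdf_1df(8.0 / re ** 2)


def p_ed_xbar_r4s_n2(c, se=0.0, re=1.0):
    """Ped = Px + (1 - Px) * PR for the combined rule with N = 2."""
    px = p_xbar(c, 2, se, re)
    pr = p_r4s_n2(re)
    return px + (1.0 - px) * pr


# Illustrative control limit c = 3 SEMs (not a value taken from the paper)
print(f"in control:       {p_ed_xbar_r4s_n2(3.0):.4f}")
print(f"SE = 2 SDs:       {p_ed_xbar_r4s_n2(3.0, se=2.0):.4f}")
print(f"RE = 2 (doubled): {p_ed_xbar_r4s_n2(3.0, re=2.0):.4f}")
```

The N = 2 range term follows from the difference of two z-scores having variance 2, so the squared difference divided by 2 follows a chi-square distribution with 1 degree of freedom and a range above 4 corresponds to a chi-square value above 8.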
| Results |
|---|
|
|
|---|
|
To examine how QC performance is influenced by analytical run
definition, we compared three different analytical run lengths while
maintaining the same density of control samples to patient specimens.
Table 3
shows ANPed for the Shewhart
1ks rule with the alternative analytical run
definitions. For the 1ks rule there never
appears to be an advantage to defining longer analytical runs with N
>1 if the goal is to minimize the average number of patient specimens
processed during the existence of an out-of-control error condition.
The advantage of short runs increases as the magnitude of the
out-of-control error condition increases. The minimum value that
ANPed will attain is approximately M/2. This occurs when
the magnitude of the out-of-control error condition is so large that
the probability of a QC rejection is 1.0 at the first QC testing
opportunity after the error.
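The kind of comparison described above can be sketched for the single-control-limit rule. The sketch chooses each control limit so that ANPfr = 2000 and then evaluates ANPed with the approximation from the Methods. Only the (M, N) = (20, 1) definition is named in the text; the (40, 2) and (80, 4) definitions used here are assumptions chosen to keep the same control density, and the output is not intended to reproduce Table 3.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, SD 1


def control_limit_1ks(m, n, anp_fr=2000.0):
    """Control limit k such that ANPfr = M / Pfr equals anp_fr."""
    p_fr = m / anp_fr                            # required per-run Pfr
    p_single = 1.0 - (1.0 - p_fr) ** (1.0 / n)   # per-control probability
    return STD_NORMAL.inv_cdf(1.0 - p_single / 2.0)


def anp_ed_1ks(m, n, se, anp_fr=2000.0):
    """ANPed ~= M/2 + M * (1/Ped - 1) for a systematic error of se SDs."""
    k = control_limit_1ks(m, n, anp_fr)
    p_single = 1.0 - (STD_NORMAL.cdf(k - se) - STD_NORMAL.cdf(-k - se))
    p_ed = 1.0 - (1.0 - p_single) ** n
    return m / 2.0 + m * (1.0 / p_ed - 1.0)


# (20, 1) is from the text; (40, 2) and (80, 4) are hypothetical definitions
RUN_DEFINITIONS = [(20, 1), (40, 2), (80, 4)]

for se in (2.0, 3.0, 4.0):
    row = ", ".join(
        f"(M={m}, N={n}): {anp_ed_1ks(m, n, se):.1f}"
        for m, n in RUN_DEFINITIONS
    )
    print(f"SE = {se}: {row}")
```

In this sketch the (20, 1) definition gives the smallest ANPed at each of the three shifts shown, and for a very large shift ANPed approaches M/2, as the text describes.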
|
Table 4
shows ANPed for the
X(c)/R4s rule with the
three analytical run definitions. When N = 1, the R4s
rule cannot be invoked and the
X(c)/R4s rule reduces to
a 1ks rule. Therefore, the (M, N) = (20, 1)
columns in Tables 3
and 4
are identical. For RE conditions the
performance of the
X(c)/R4s rule is very
similar to the 1ks rule. For SE conditions, the
analytical run length with the shortest ANP to rejection for the
X(c)/R4s rule depends on
the magnitude of the out-of-control error condition. However,
consistent with the case for the 1ks rule, as
the magnitude of the error condition increases, shorter analytical run
lengths have a greater chance of early error detection.
|
Figure 1A plots ANPTE for the 1ks rule as a function of SE for the three different analytical run lengths. Figs. 1B and 1C plot the two components, ANPE and ANPQE, that comprise ANPTE. Note the different scales for the two components. In the patient specimens that are processed after an out-of-control error condition occurs, but before the next QC testing opportunity, the expected number of unacceptable results increases as the magnitude of the out-of-control error condition increases (Fig. 1B). Longer intervals between QC testing opportunities result in higher expected numbers of unacceptable results. Once a QC testing opportunity arrives, the expected number of unacceptable results caused by an out-of-control error condition depends on how long the error condition exists before it is detected. At this point, the longer analytical run definitions that test a larger group of control samples are associated with lower expected numbers of unacceptable results (Fig. 1C).
|
Figure 2
plots ANPTE for the 1ks
rule for different TEa requirements. The average number of
unacceptable results increases as TEa decreases, but the
relative performance of the three different analytical run lengths
remains the same. Graphs of ANPTE as a function of RE show
the same patterns (data not shown).
|
Figure 3
gives ANPTE for the
X(c)/R4s rule for the
three analytical run definitions as a function of SE at different
TEa specifications. The main difference between
ANPTE performance for the
X(c)/R4s rule and the
1ks rule is when the TEa
specification is low and the out-of-control error condition is small.
Longer run lengths with N >1 can produce smaller ANPTE for
the X(c)/R4s rule,
whereas there is no benefit to longer run lengths with N >1 for the
1ks rule even for small TEa and
small error conditions.
|
| Discussion |
|---|
|
|
|---|
Measuring performance in terms of the probability of rejecting an analytical run does not permit examination of the influence of different analytical run definitions. Consequently, this issue has not been investigated to any extent in the published literature. Westgard et al. looked at the effect of a number of QC practices, including batch size, on QC performance (8). They compared "test yield" for increasing batch sizes and concluded that larger batch sizes produced greater test yield. However, test yield is defined in terms of the probability of rejecting analytical runs that are in control or that contain "critical" out-of-control error conditions. Bishop and Nix argue that treating each control sample individually, as if N = 1, provides improved error detection for QC rules designed to detect persistent out-of-control error conditions, such as the $2_{2s}$ rule or $4_{1s}$ rule (9).
Different analytical run definitions are easily accommodated if QC performance is measured in terms of ANPed or ANPTE. Evaluation of these performance measures demonstrates that only in rare cases would less frequent testing of a large group of control samples be preferable to testing fewer control samples more frequently. In situations where it is necessary to improve QC performance, the strategy should be to decrease the interval of time between QC tests, rather than to increase N while keeping the same analytical run length.
We investigated two QC rules (1ks and X(c)/R4s) that are members of the class of QC rules that only test the control samples within a single analytical run. Our findings also apply to the other QC rules within this class. In general, any QC strategy that is based on the periodic testing of control samples will always be vulnerable to producing unacceptable test results during the interval between the occurrence of a large out-of-control error condition and the next QC testing opportunity. Short intervals between QC testing opportunities minimize this vulnerability. To improve QC performance further, strategies that do not depend on the periodic testing of control samples, such as QC methods that use patient results, will be required (10). While these findings are consistent with intuition, they have never been formally demonstrated because traditional methods of evaluating QC performance have had no way to objectively address the relevant questions.
If patients' results processed after an acceptable QC test are reported as they are obtained, ANPTE will reflect the expected increase in number of unacceptable results reported during the course of an out-of-control error condition. If test results are held in a batch and not reported until a QC test is performed at the end of the batch (sometimes referred to as "bracketing" patient samples by controls), then the unacceptable patient samples that are processed in the batch ending with the QC test that detects the out-of-control error condition will not be reported.
Depending on the stability of an analytical system, most of the time a process should be in control. The overall long-term rate of producing and reporting unacceptable results requires taking into account the frequency of out-of-control error conditions, the type and magnitude of an error condition when it does occur, and knowledge of ANPed and ANPTE over the range of possible error conditions.
In summary, the purpose of this paper has been to investigate the influence of different analytical run definitions on QC performance during the routine monitoring of an analytical process. The performance measures ANPed and ANPTE provide the ability to investigate this issue in a way that has not been possible with traditional approaches. These performance measures demonstrate that during routine QC monitoring, the length of the interval between QC tests can have a major influence on the expected number of unacceptable patient results produced during the existence of an out-of-control error condition.