Clinical Chemistry
Clinical Chemistry 43: 2149-2154, 1997;
(Clinical Chemistry. 1997;43:2149-2154.)
© 1997 American Association for Clinical Chemistry, Inc.


Articles

Effect of analytical run length on quality-control (QC) performance and the QC planning process

Curtis A. Parvin^a and Ann M. Gronowski

a Author for correspondence. Fax 314-362-3016; e-mail parvin{at}wugcrc.wustl.edu


   Abstract
The performance measure traditionally used in the quality-control (QC) planning process is the probability of rejecting an analytical run when an out-of-control error condition exists. A shortcoming of this performance measure is that it does not allow comparison of QC strategies that define analytical runs differently. Accommodating different analytical run definitions is straightforward if QC performance is measured in terms of the average number of patient samples to error detection, or the average number of patient samples containing an analytical error that exceeds total allowable error. Investigating the impact of different analytical run definitions on QC performance with these measures demonstrates that, during routine QC monitoring, the length of the interval between QC tests can have a major influence on the expected number of unacceptable results produced during the existence of an out-of-control error condition.


   Introduction
The quality-control (QC) process can be divided into multiple stages (1). One stage involves QC testing that is triggered by an event such as calibration. The QC test is performed after the event but before any patient samples are analyzed, to determine whether the analytical process is in control. A second stage involves the monitoring of an analytical process over time. In this stage the concept of an analytical run becomes important. In the modern clinical laboratory the definition of an analytical run is often arbitrary. Many times an analytical run is defined in units of time or number of patient specimens between QC samples. The issue of how to define an appropriate run length for routine QC testing has received increased attention recently (2).

The traditional QC planning process assesses and compares QC strategies on the basis of the probability of rejecting an analytical run when an out-of-control error condition exists (Ped, see Table 1 for a glossary of symbols) and the probability of falsely rejecting an analytical run that is in control (Pfr) (1). Comparing QC performance by using traditional performance measures is difficult if the definition of an analytical run differs between alternative QC strategies.


Table 1. Glossary of symbols.

As an illustration of the problem, imagine that two clinical laboratories wish to compare their QC performance. The total allowable error specification (TEa) for the analyte of interest is 5.0 multiples of the analytical SD. Laboratory A tests one control sample per analytical run (N = 1) and rejects the run if the control's result is more than 3.0 analytical SDs from target (1_3s rule). Laboratory B tests two controls per analytical run (N = 2) and rejects if either one is >2.5 analytical SDs from target (1_2.5s rule). Calculating Pfr gives a false-rejection probability of 0.003 for laboratory A and 0.025 for laboratory B. Following the traditional performance evaluation approach, "critical" out-of-control error conditions can be computed as SEc = TEa - 1.65 for a critical systematic error condition (in multiples of analytical SD) and REc = TEa/1.96 for a critical increase in analytical imprecision (3). Calculating Ped at SEc = 3.35 gives an error detection probability of 0.64 for laboratory A and 0.96 for laboratory B. Similar calculations can be performed for REc. The findings based on traditional performance measures are: Laboratory B uses twice as many control samples per analytical run as laboratory A; laboratory B's false-rejection rate is higher than laboratory A's; and laboratory B has good error detection ability for critical systematic errors, but laboratory A's error detection rate is too low.
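The figures in this illustration can be checked with a few lines of Python. What follows is a minimal sketch, assuming normally distributed control observations and the single-observation rejection probability stated in the Methods; the names `phi` and `p_reject_1ks` are illustrative and are not taken from the original analysis.

```python
import math

def phi(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def p_reject_1ks(k, n, se=0.0):
    """Probability that at least one of n controls falls outside +/- k SDs,
    given a systematic shift of se SDs (se = 0 means the process is in control)."""
    p1 = 1.0 - (phi(k - se) - phi(-k - se))
    return 1.0 - (1.0 - p1) ** n

TEA = 5.0
SE_C = TEA - 1.65  # critical systematic error: 3.35 SDs

# (control limit k, controls per run N) for each laboratory
labs = {"A": (3.0, 1), "B": (2.5, 2)}
results = {}
for name, (k, n) in labs.items():
    pfr = p_reject_1ks(k, n)        # in-control rejection probability
    ped = p_reject_1ks(k, n, SE_C)  # rejection probability at SEc
    results[name] = (pfr, ped)
    print(f"Laboratory {name}: Pfr = {pfr:.3f}, Ped = {ped:.2f}")
```

Rounded to the precision used in the text, this gives Pfr = 0.003 and Ped = 0.64 for laboratory A, and Pfr = 0.025 and Ped = 0.96 for laboratory B.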

Suppose it is then revealed that laboratory A tests one QC sample every 10 patient samples, while laboratory B tests a pair of QC samples every 80 patient samples. Should these different analytical run definitions affect the comparison of the relative QC performance of the two laboratories? Knowledge that laboratory A and laboratory B define their analytical runs differently has no effect on the traditional performance measures Pfr and Ped.

The purpose of this paper is to introduce performance measures that can accommodate different definitions of an analytical run. The performance measures are based on the concept of the average number of patient samples (ANP) required to detect an out-of-control error condition. ANPed will denote the average number of patient specimens from the inception of an out-of-control error condition until it is detected, and ANPfr will denote the average number of patient specimens to rejection when no out-of-control error condition exists.

The clinical laboratory has traditionally specified quality requirements in terms of TEa. Within this context, the performance measures of primary interest should directly relate to the chances that a test result contains an analytical error that exceeds TEa (3). The average number of patient specimens that contain unacceptable analytical errors resulting from an out-of-control condition will be denoted ANPTE. ANPTE can be separated into two parts. The first part reflects the expected increase in the number of patient specimens with unacceptable analytical errors after the occurrence of an out-of-control condition, but before the next QC testing opportunity has arrived. This part will be denoted ANPE. The second part is the average number of unacceptable results attributable to the out-of-control condition starting from the first QC test after the error. This part will be denoted ANPQE. These performance measures are used to investigate how different analytical run definitions influence QC performance.


   Methods
An analytical run will be defined as M patient specimens followed by N control samples. Analytical run definitions with (M, N) equal to (20, 1), (40, 2), and (80, 4) are compared. In each case the ratio of patient specimens to control samples, M/N, is the same. Out-of-control error conditions that cause a systematic error (SE) or an increase in analytical imprecision (RE) are evaluated. Analytical imprecision is assumed to be within-run imprecision. Between-run imprecision is not considered (4). It is assumed that an out-of-control error condition can occur with equal probability anywhere within the stream of specimens being analyzed and persists until it is detected.

QC rules that test only the current group of N control samples are investigated. Two QC rules within this class are evaluated. The 1_ks rule rejects if any of the N control samples in the analytical run are more than k analytical SDs from target. The X(c)/R_4s rule rejects if the average z-score for the N control samples in the analytical run exceeds c SEMs or the range of the z-scores of the N control observations exceeds 4. A z-score is obtained by calculating the difference of a control observation from its expected mean and dividing by its analytical SD. A z-score has a mean value of 0, a SD of 1, and the SEM of N z-score values is 1/√N. For each analytical run definition, control limits for the 1_ks rule and X(c)/R_4s rule are determined so that ANPfr = 2000 patient samples.
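For the 1_ks rule, the control limit that yields ANPfr = 2000 can be obtained in closed form. The sketch below assumes the relation ANPfr = ARLfr × M with ARLfr = 1/Pfr, as stated in the Methods, and inverts the in-control rejection probability with the standard normal quantile function. The function name `k_for_anpfr` is hypothetical, and this is not necessarily how the limits were computed in the original work. The X(c)/R_4s limits would require a numerical search and are not covered here.

```python
from statistics import NormalDist

def k_for_anpfr(m, n, anp_fr=2000.0):
    """Control limit k (in SDs) for the 1_ks rule such that an in-control
    process is rejected, on average, once per anp_fr patient samples."""
    p_fr = m / anp_fr                             # ANPfr = M / Pfr
    p1 = 1.0 - (1.0 - p_fr) ** (1.0 / n)          # per-control exceedance probability
    return NormalDist().inv_cdf(1.0 - p1 / 2.0)   # two-sided limit

for m, n in [(20, 1), (40, 2), (80, 4)]:
    print(f"(M, N) = ({m}, {n}): k = {k_for_anpfr(m, n):.3f}")
```

Because the ratio M/N is held fixed, the three limits come out close to one another, which is what makes the comparison between run definitions a comparison of run length rather than of false-rejection rate.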

Given M patient specimens per analytical run, ANPfr is equal to the average number of analytical runs to false rejection (ARLfr) times the number of patient specimens per run, or ANPfr = ARLfr × M. Under the assumption that an out-of-control error condition can occur anywhere with equal probability and persists unchanged until detected, a straightforward (but lengthy) mathematical derivation shows that ANPed is closely approximated by M/2 + M(ARLed - 1), where ARLed denotes the average number of analytical runs to error detection. The first term is the average number of patient specimens after the error condition occurs but before the next QC testing opportunity. The second term is the average number of patient specimens after the first QC testing opportunity until a QC rejection. For QC rules that test only the control samples in the current analytical run, ARLed = 1/Ped (1).
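The approximation for ANPed can be checked by simulation. The sketch below is an independent illustration, not part of the original analysis (which used no simulations). It assumes that the error begins just before a uniformly chosen patient specimen within a run and that each subsequent QC test detects it independently with probability Ped. The values M = 20 and Ped = 0.5 are arbitrary.

```python
import random

def simulate_anp_ed(m, p_ed, trials=200_000, seed=0):
    """Average number of patient samples between error onset and QC rejection."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        # The error starts just before a uniformly chosen patient in the run,
        # so between 1 and m patients are affected before the next QC test.
        affected = rng.randint(1, m)
        # Each QC test detects the error with probability p_ed; every miss
        # lets another full run of m patients through.
        while rng.random() >= p_ed:
            affected += m
        total += affected
    return total / trials

def approx_anp_ed(m, p_ed):
    """Closed-form approximation from the text: M/2 + M(ARLed - 1)."""
    return m / 2.0 + m * (1.0 / p_ed - 1.0)

M, P_ED = 20, 0.5
sim = simulate_anp_ed(M, P_ED)
print(f"simulated: {sim:.1f}, approximation: {approx_anp_ed(M, P_ED):.1f}")
```

Under this discrete onset assumption the first term averages (M + 1)/2 rather than M/2, so the simulated value should sit about half a specimen above the approximation.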

Alternatively, an analytical run could be defined as TP units of time followed by N control samples (5). If it is assumed that an out-of-control error condition can occur with equal probability at any point in time and that the number of patient samples analyzed in a given time interval is proportional to the length of the time interval, then the above formula for ANPed still holds, where M denotes the average number of patient samples processed in TP units of time.

Let PE(SE) represent the probability that a test result contains an analytical error that exceeds TEa during the existence of an out-of-control error condition that causes a systematic error of SE analytical SDs (3). Then PE(0) is the probability that a test result contains an analytical error that exceeds TEa when the process is in control. The average number of patient samples that contain unacceptable analytical errors attributable to the existence of SE is ANPTE = ANPed [PE(SE) - PE(0)], which can be separated into the two components ANPE = (M/2) [PE(SE) - PE(0)] and ANPQE = M(ARLed - 1)[PE(SE) - PE(0)]. The same equations apply for an out-of-control error condition that causes an increase in analytical imprecision by a factor of RE with PE(SE) and PE(0) replaced by PE(RE) and PE(1) respectively.
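The decomposition of ANPTE into its two components is simple arithmetic once the probabilities are known. The sketch below takes them as inputs; the function name `anp_components` and the numerical values in the usage lines are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def anp_components(m, arl_ed, pe_error, pe_control):
    """Return (ANPE, ANPQE, ANPTE) for an out-of-control condition.

    m          -- patient specimens per analytical run (M)
    arl_ed     -- average number of runs to error detection (ARLed)
    pe_error   -- probability of an unacceptable result during the error
    pe_control -- probability of an unacceptable result when in control
    """
    excess = pe_error - pe_control
    anp_e = (m / 2.0) * excess            # before the first QC test after the error
    anp_qe = m * (arl_ed - 1.0) * excess  # from the first QC test until rejection
    return anp_e, anp_qe, anp_e + anp_qe

# Hypothetical inputs: M = 20, ARLed = 2, 5% unacceptable results during the error
anp_e, anp_qe, anp_te = anp_components(20, 2.0, 0.05, 0.0)
anp_ed = 20 / 2.0 + 20 * (2.0 - 1.0)      # M/2 + M(ARLed - 1) = 30
print(anp_e, anp_qe, anp_te, anp_ed * 0.05)
```

The last two printed values agree, reflecting the identity ANPTE = ANPE + ANPQE = ANPed × (excess probability).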

All results were obtained by numerical analysis without use of simulations. Given TEa, $P_E(SE) = 1 - [\Phi(TE_a - SE) - \Phi(-TE_a - SE)]$ during a systematic error condition equal to SE and $P_E(RE) = 1 - [\Phi(TE_a/RE) - \Phi(-TE_a/RE)]$ if analytical imprecision increases by a factor equal to RE, where $\Phi$ is the cumulative distribution function of the standard normal distribution. For the 1_ks rule with N control observations, $P_{ed} = 1 - (1 - P_1)^N$, where $P_1 = 1 - [\Phi(k - SE) - \Phi(-k - SE)]$ is the probability of a single control observation exceeding control limits when a systematic error condition equal to SE exists and $P_1 = 1 - [\Phi(k/RE) - \Phi(-k/RE)]$ for RE. For the X(c) rule with N control observations, the probability of rejection is $P_x = 1 - [\Phi(c - SE\sqrt{N}) - \Phi(-c - SE\sqrt{N})]$ and $P_x = 1 - [\Phi(c/RE) - \Phi(-c/RE)]$ for systematic and random error conditions, respectively. For the R_4s rule with N = 2, the probability of rejection, $P_R$, for any systematic error condition is $P_R = 1 - \chi^2_1(8)$, where $\chi^2_1$ is the cumulative $\chi^2$ distribution function with 1 degree of freedom. If a random error condition exists, $P_R = 1 - \chi^2_1(8/RE^2)$. When N > 2, $P_R$ is calculated by numerical integration (6). For the X(c)/R_4s rule, $P_{ed} = P_x + (1 - P_x)P_R$. Calculations were performed by using the software package Stata (Stata Corp.).
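These formulas translate directly into code. The sketch below covers the probability of an unacceptable result and the X(c)/R_4s rule for N = 2 only, since N > 2 requires numerical integration. Two choices here are assumptions made for convenience rather than something the text specifies: the SE and RE cases are folded into a single expression that reduces to the formulas above when RE = 1 or SE = 0, and the chi-square distribution function with 1 degree of freedom is evaluated through the identity χ²₁(x) = erf(√(x/2)). All function names are illustrative.

```python
import math

def phi(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def chi2_cdf_1df(x):
    """Cumulative chi-square distribution function with 1 degree of freedom."""
    return math.erf(math.sqrt(x / 2.0))

def p_unacceptable(tea, se=0.0, re=1.0):
    """Probability that a result's analytical error exceeds TEa."""
    return 1.0 - (phi((tea - se) / re) - phi((-tea - se) / re))

def ped_xbar_r4s_n2(c, se=0.0, re=1.0):
    """Rejection probability of the X(c)/R_4s rule with N = 2 controls."""
    root_n = math.sqrt(2.0)
    # Mean rule: average z-score beyond c SEMs
    p_x = 1.0 - (phi((c - se * root_n) / re) - phi((-c - se * root_n) / re))
    # Range rule: |z1 - z2| > 4, unaffected by a systematic shift
    p_r = 1.0 - chi2_cdf_1df(8.0 / re ** 2)
    return p_x + (1.0 - p_x) * p_r

# c = 2.5 is an arbitrary illustrative limit, not a value from the paper
for se in (0.0, 1.0, 2.0, 3.0):
    print(f"SE = {se}: PE = {p_unacceptable(5.0, se):.4f}, "
          f"Ped = {ped_xbar_r4s_n2(2.5, se):.4f}")
```

With a very wide mean limit the rule reduces to the range rule alone, which gives a quick sanity check on `chi2_cdf_1df`.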


   Results
Table 2 presents the performance measures for the hypothetical comparison between two laboratories. When accounting for differences in analytical run definition, ANPfr is 3704 patient samples for laboratory A and 3241 patient samples for laboratory B. For SEc = 3.35, ANPed is 11 patient samples for laboratory A and 43 patient samples for laboratory B. With ANPed as the performance measure the findings are: Laboratory A uses four times as many controls as laboratory B; both laboratories have about the same false-rejection rate (approximately once every 3500 patient samples); and laboratory A detects a critical systematic error four times faster (on average) than laboratory B. These conclusions are substantially different from those based on the traditional performance measures.
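The values quoted for the two laboratories follow from the formulas in the Methods. The sketch below recomputes them, assuming normally distributed control observations, run lengths of M = 10 for laboratory A and M = 80 for laboratory B, and the approximation ANPed ≈ M/2 + M(ARLed - 1). The function names are illustrative.

```python
import math

def phi(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def p_reject_1ks(k, n, se=0.0):
    """Rejection probability of the 1_ks rule with n controls and shift se."""
    p1 = 1.0 - (phi(k - se) - phi(-k - se))
    return 1.0 - (1.0 - p1) ** n

def anp_fr(m, k, n):
    """ANPfr = ARLfr * M, with ARLfr = 1/Pfr."""
    return m / p_reject_1ks(k, n)

def anp_ed(m, k, n, se):
    """ANPed approximated by M/2 + M(ARLed - 1), with ARLed = 1/Ped."""
    arl_ed = 1.0 / p_reject_1ks(k, n, se)
    return m / 2.0 + m * (arl_ed - 1.0)

SE_C = 3.35  # critical systematic error for TEa = 5.0

# (patients per run M, control limit k, controls per run N)
labs = {"A": (10, 3.0, 1), "B": (80, 2.5, 2)}
table = {name: (anp_fr(m, k, n), anp_ed(m, k, n, SE_C))
         for name, (m, k, n) in labs.items()}
for name, (fr, ed) in table.items():
    print(f"Laboratory {name}: ANPfr = {fr:.0f}, ANPed = {ed:.0f}")
```

Rounded to whole specimens, the script returns 3704 and 11 for laboratory A, and 3241 and 43 for laboratory B, matching the values in the text.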


Table 2. Comparison of QC performance between two laboratories with different analytical run definitions.

To examine how QC performance is influenced by analytical run definition, we compared three different analytical run lengths while maintaining the same density of control samples to patient specimens. Table 3 shows ANPed for the Shewhart 1_ks rule with the alternative analytical run definitions. For the 1_ks rule there never appears to be an advantage to defining longer analytical runs with N > 1 if the goal is to minimize the average number of patient specimens processed during the existence of an out-of-control error condition. The advantage of short runs increases as the magnitude of the out-of-control error condition increases. The minimum value that ANPed will attain is approximately M/2. This occurs when the magnitude of the out-of-control error condition is so large that the probability of a QC rejection is 1.0 at the first QC testing opportunity after the error.


Table 3. Behavior of the 1_ks rule with different analytical run definitions.

Table 4 shows ANPed for the X(c)/R_4s rule with the three analytical run definitions. When N = 1, the R_4s rule cannot be invoked and the X(c)/R_4s rule reduces to a 1_ks rule. Therefore, the (M, N) = (20, 1) columns in Tables 3 and 4 are identical. For RE conditions the performance of the X(c)/R_4s rule is very similar to that of the 1_ks rule. For SE conditions, the analytical run length with the shortest ANP to rejection for the X(c)/R_4s rule depends on the magnitude of the out-of-control error condition. However, consistent with the case for the 1_ks rule, as the magnitude of the error condition increases, shorter analytical run lengths have a greater chance of early error detection.


Table 4. Behavior of the X(c)/R_4s rule with different analytical run definitions.

Figure 1A plots ANPTE for the 1_ks rule as a function of SE for the three different analytical run lengths. Figs. 1B and 1C plot the two components, ANPE and ANPQE, that make up ANPTE. Note the different scales for the two components. In the patient specimens that are processed after an out-of-control error condition occurs, but before the next QC testing opportunity, the expected number of unacceptable results increases as the magnitude of the out-of-control error condition increases (Fig. 1B). Longer intervals between QC testing opportunities result in higher expected numbers of unacceptable results. Once a QC testing opportunity arrives, the expected number of unacceptable results caused by an out-of-control error condition depends on how long the error condition exists before it is detected. At this point, the longer analytical run definitions that test a larger group of control samples are associated with lower expected numbers of unacceptable results (Fig. 1C).



Figure 1. ANPTE (A), ANPE (B), and ANPQE (C) as a function of SE for the 1_ks rule with three different analytical run lengths when TEa = 5.0.

The short dashed lines represent an analytical run defined as 20 patient specimens followed by one control sample, the medium dashed lines represent 40 patient specimens followed by two control samples, and the long dashed lines represent 80 patient samples followed by four control samples.

Figure 2 plots ANPTE for the 1_ks rule for different TEa requirements. The average number of unacceptable results increases as TEa decreases, but the relative performance of the three different analytical run lengths remains the same. Graphs of ANPTE as a function of RE show the same patterns (data not shown).



Figure 2. ANPTE as a function of SE for the 1_ks rule with three different analytical run lengths and a range of TEa specifications.

See Fig. 1 for further details.

Figure 3 gives ANPTE for the X(c)/R_4s rule for the three analytical run definitions as a function of SE at different TEa specifications. The main difference between ANPTE performance for the X(c)/R_4s rule and the 1_ks rule arises when the TEa specification is low and the out-of-control error condition is small. Longer run lengths with N > 1 can produce smaller ANPTE for the X(c)/R_4s rule, whereas there is no benefit to longer run lengths with N > 1 for the 1_ks rule even for small TEa and small error conditions.



Figure 3. ANPTE as a function of SE for the X(c)/R_4s rule with three different analytical run lengths and a range of TEa specifications.

See Fig. 1 for further details.


   Discussion
The probability of rejecting an analytical run when an out-of-control error condition exists is the performance measure that currently guides the QC planning process. On the basis of this performance measure and the TEa requirements, appropriate QC rules and Ns are recommended. For instance, a recent guideline advises a simple Shewhart rule with N = 2 if TEa > 5.65 (SEc > 4), a Shewhart rule or multirule procedure with N = 2 or 3 if 4.65 < TEa < 5.65, Ns of 3 or 4 if 3.65 < TEa < 4.65, and Ns of 4 to 8 if TEa < 3.65 (7).

Measuring performance in terms of the probability of rejecting an analytical run does not permit examination of the influence of different analytical run definitions. Consequently, this issue has not been investigated to any extent in the published literature. Westgard et al. looked at the effect of a number of QC practices, including batch size, on QC performance (8). They compared "test yield" for increasing batch sizes and concluded that larger batch sizes produced greater test yield. However, test yield is defined in terms of the probability of rejecting analytical runs that are in control or that contain "critical" out-of-control error conditions. Bishop and Nix argue that treating each control sample individually, as if N = 1, provides improved error detection for QC rules designed to detect persistent out-of-control error conditions, such as the 2_2s rule or 4_1s rule (9).

Different analytical run definitions are easily accommodated if QC performance is measured in terms of ANPed or ANPTE. Evaluation of these performance measures demonstrates that only in rare cases would less frequent testing of a large group of control samples be preferable to testing fewer control samples more frequently. In situations where it is necessary to improve QC performance, the strategy should be to decrease the interval of time between QC tests, rather than to increase N while keeping the same analytical run length.

We investigated two QC rules (1_ks and X(c)/R_4s) that are members of the class of QC rules that only test the control samples within a single analytical run. Our findings also apply to the other QC rules within this class. In general, any QC strategy that is based on the periodic testing of control samples will always be vulnerable to producing unacceptable test results during the interval between the occurrence of a large out-of-control error condition and the next QC testing opportunity. Short intervals between QC testing opportunities minimize this vulnerability. To improve QC performance further, strategies that do not depend on the periodic testing of control samples, such as QC methods that use patient results, will be required (10). While these findings are consistent with intuition, they have never been formally demonstrated because traditional methods of evaluating QC performance have had no way to objectively address the relevant questions.

If patients' results processed after an acceptable QC test are reported as they are obtained, ANPTE will reflect the expected increase in number of unacceptable results reported during the course of an out-of-control error condition. If test results are held in a batch and not reported until a QC test is performed at the end of the batch (sometimes referred to as "bracketing" patient samples by controls), then the unacceptable patient samples that are processed in the batch ending with the QC test that detects the out-of-control error condition will not be reported.

Depending on the stability of an analytical system, most of the time a process should be in control. The overall long-term rate of producing and reporting unacceptable results requires taking into account the frequency of out-of-control error conditions, the type and magnitude of an error condition when it does occur, and knowledge of ANPed and ANPTE over the range of possible error conditions.

In summary, the purpose of this paper has been to investigate the influence of different analytical run definitions on QC performance during the routine monitoring of an analytical process. The performance measures ANPed and ANPTE provide the ability to investigate this issue in a way that has not been possible with traditional approaches. These performance measures demonstrate that during routine QC monitoring, the length of the interval between QC tests can have a major influence on the expected number of unacceptable patient results produced during the existence of an out-of-control error condition.


   Footnotes
 
Division of Laboratory Medicine, Department of Pathology, Washington University School of Medicine, Box 8118, 660 S. Euclid Ave., St. Louis, MO 63110.


   References

  1. Westgard JO, Barry PL. Cost-effective quality control: managing the quality and productivity of analytical processes. Washington, DC: AACC Press, 1986:184.
  2. Witte DL, Astion ML. Panel discussion: how to monitor and minimize variation and mistakes. Clin Chem 1997;43:880-885.
  3. Parvin CA. Quality-control (QC) performance measures and the QC planning process. Clin Chem 1997;43:602-607.
  4. Parvin CA. New insight into the comparative power of quality-control rules that use control observations within a single analytical run. Clin Chem 1993;39:440-447.
  5. Gronowski AM, Parvin CA. The effect of the number of QC's evaluated and the length of time between evaluations on quality control performance [Abstract]. Clin Chem 1995;41:S214.
  6. Stuart A, Ord JK. Kendall's advanced theory of statistics, Vol. 1. New York: Oxford University Press, 1987:464-466.
  7. Westgard JO. Strategies for cost-effective quality control. Clin Lab News 1996;22:8-9.
  8. Westgard JO, Oryall JJ, Koch DD. Predicting effects of quality-control practices on the cost-effective operation of a stable, multitest analytical system. Clin Chem 1990;36:1760-1764.
  9. Bishop J, Nix ABJ. Comparison of quality-control rules used in clinical chemistry laboratories. Clin Chem 1993;39:1638-1649.
  10. Cembrowski GS. Thoughts on quality-control systems: a laboratorian's perspective. Clin Chem 1997;43:886-892.


