Laboratory Management |
a Address correspondence to this author at: Fax 32-9-264 81 98; e-mail Linda.Thienpont{at}rug.ac.be.
| Abstract |
|---|
| Introduction |
|---|
The specimens that were used for the evaluation consisted of a group of 60 single-donation serum samples that had been certified with a candidate reference method using ion chromatography (IC)1 (8)(9). The eight systems being evaluated for intrinsic quality used ion selective electrodes (ISEs) to measure K. The systems that used direct ISE were manufactured by AVL, Chiron, Dade (two systems), and Johnson & Johnson (J&J); the systems that used indirect ISE were manufactured by Boehringer Mannheim, Beckman, and Roche. For the Chiron, J&J, Boehringer, and Beckman systems, routine quality was evaluated in five clinical laboratories, using the same panel of samples. The results of the study are interpreted on the basis of the strictest limits for total error (TE, 6.3%) (10) and systematic error (SE, 1.6%) (11) of serum K analysis, derived from the biological variation of the analyte.
| Materials and Methods |
|---|
test systems performed by or under the responsibility of the manufacturers
The analyses in the application laboratories were performed in
duplicate in one assay under strict IQC. The manufacturers used their
own IQC samples at two or more concentrations. The IQC samples were
placed at the beginning of the assay (n = 6), after each tenth
duplicate (n = 3), and at the end of the assay (n = 6). Five
test systems that used direct ISE were investigated. The measurements
with the AVL system (AVL Medical Instruments) were performed at the
site of the Belgian distributor, Merck-Belgolabo, using an Omni system
and the Combitrol 1–3 (AVL) IQC samples. The measurements with the
Chiron system (Chiron Diagnostics) were done in the Chiron
application laboratory at Zaventem (Belgium) with a Chiron 654 analyzer
and the Certain Lyte Level 1–3 (Chiron Diagnostics) IQC samples. The
measurements with the two Dade systems (Dade International) were
performed in the Dade application laboratory at Munich (Germany), using
a Dimension AR system (direct ISE) and a Dimension RXL
analyzer [direct integrated multisensor technology (IMT)].
For IQC on both instruments, Monitrol I and Monitrol II (Dade) samples
were used. The measurements with the J&J system (Johnson & Johnson
Clinical Diagnostics) were done in the J&J application laboratory at
Illkirch (France) using a Vitros 250 analyzer and the X1395 and Y1397
(J&J) IQC samples.
Three test systems that used indirect ISE were investigated. The measurements with the Boehringer system (Boehringer Mannheim) were performed in the manufacturer's application laboratory using a Hitachi 917 instrument and the PNU 186720, PNU 189638, and PPU 188184 IQC samples. The analyses with the Beckman system were done at the site of the Belgian distributor, Analis, using a Synchron CX7 analyzer and the Decision Level 1–3 (Beckman) IQC samples. The analyses with the Roche system (Roche Diagnostics) were performed by the clinical routine laboratory of the Maria Middelares hospital in Ghent (Belgium), using the Integra instrument and the Roc SN and SP (Roche Diagnostics) IQC samples.
routine analyses performed in clinical laboratories
The routine analyses with the Beckman, Boehringer, Chiron, and J&J
test systems were performed in Belgian clinical laboratories (five
laboratories for each system). The measurements were done under
conditions strictly identical to those for measurement of patient
samples, i.e., in singlets and without any additional calibration or
IQC precautions. Measuring the samples in a single assay was not
mandatory. We guaranteed the anonymity of the participating routine
laboratories.
The instruments and the reference intervals they applied are listed in Table 1 (this table also includes the data from the manufacturers).
serum samples
The native samples were single donations from 60 healthy persons
(30 males and 30 females). They were purchased from WBAG Resources.
Fifteen donors were younger than 20 years, 5 were older than 50, and
the other age groups were 20–30 years of age (n = 15), 30–40
years of age (n = 15), and 40–50 years of age (n = 10). The
blood was allowed to coagulate for 1 h after sampling, after which
the serum was isolated by centrifugation. The resulting fresh sera were
stored at 4 °C for 3 days. They were then sterile-filtered,
fractionated into 500-µL portions in Eppendorf vials, and frozen. The
sera were kept at -20 °C for 2 weeks and sent on dry ice to the IC
laboratory in Ghent. The samples were treated according to the
instructions of the Ethical Commission of the University of Ghent. All
samples were sent on dry ice from the IC laboratory to the laboratories
participating in the study. When the samples arrived, the person
designated as responsible for the study checked whether the samples
were still frozen, which was always the case. The samples were kept at
-20 °C until analysis. Values for total protein, cholesterol,
bilirubin, and triglycerides were determined by WBAG Resources on a
Hitachi 717 instrument; almost all were within the reference range.
Sera with suspected interferences were not selected.
calculations
Outlier removal.
To prevent large errors from influencing the
method comparisons, we treated values outside the 2.5-SD limit as
outliers. The statistical probability of observing values outside this limit
is ~1%; in our case, this corresponded to ~1 of 60 values. In
practice, we rejected a maximum of three values to avoid
"overfitting" the data.
Imprecision.
For IC, we calculated the within-day imprecision
(CVwd) from the deviations of the within-day
duplicates and the mean K concentration according to the
formula below:
![]() |
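As an illustration only, imprecision is commonly estimated from duplicates as SD = sqrt(Σd²/2n), where d is the difference within each pair and n is the number of pairs, and then expressed relative to the mean concentration. The sketch below assumes that standard form; whether it matches the formula used in the study is an assumption, and the function name and example values are hypothetical.

```python
from math import sqrt

def cv_from_duplicates(pairs):
    """Percent CV estimated from duplicate measurements.

    Assumed form: SD = sqrt(sum(d^2) / (2n)), with d the difference
    within each duplicate pair, divided by the grand mean of all results.
    """
    n = len(pairs)
    sum_sq = sum((a - b) ** 2 for a, b in pairs)
    sd = sqrt(sum_sq / (2 * n))
    grand_mean = sum(a + b for a, b in pairs) / (2 * n)
    return 100 * sd / grand_mean

# Hypothetical duplicate K results (mmol/L), not data from the study
example_pairs = [(4.0, 4.0), (4.1, 3.9)]
cv_wd = cv_from_duplicates(example_pairs)
```

The same function could be applied to between-day duplicates to obtain a between-day estimate, since the text states that the same formula was used for both.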
We calculated the between-day imprecision
(CVbd) with the same formula but using the
deviations of the between-day duplicates. In addition, the total
imprecision (CVtot,REF) of the reference
method values was calculated with the following formula:
![]() |
The imprecision of the analyses in the application laboratories
was estimated using the differences between duplicates (see above) and
the results of the IQC measurements (CVIQC). The
total analytical imprecision for the method comparisons (single
measurements for the routine method vs quadruplicate measurements for
the reference method; CVtot,COMP) was calculated
with the following formula:
![]() |
This formula was used to calculate the 95% prediction limits of variation. The CVIQC was used to account for the total imprecision during the method comparison, including possible drifts and/or shifts. In contrast, the imprecision estimated from the differences between duplicates (CVwd) did not reflect shifts or drifts because the duplicates were measured in immediate succession.
For the method comparisons for the routine laboratories, no imprecision data were available. Therefore, the corresponding CVtot,COMP calculated for the application laboratories was used.
SE.
To estimate the SE of the test systems performed at the
manufacturer and at the routine laboratories, we calculated the mean
deviation of the results from IC and the 95% confidence interval.
Because the routine laboratories only measured the sera in singlets, we
also used the singlet results from the application laboratories. In
addition, we used linear regression to calculate the deviation for the
sample with the lowest (3.56 mmol/L) and the highest (5.42 mmol/L)
K concentration. These deviations were then evaluated
against the SE limit proposed by Fraser et al.
(11) for routine test systems for K
(1.6%).
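The SE estimate described above can be sketched as follows. This is a minimal illustration, assuming percentage deviations relative to the IC value, a normal-approximation confidence interval (1.96 × SEM), and ordinary least-squares regression of the deviations on the IC concentration; the authors' exact procedure may differ, and all function names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

SE_LIMIT = 1.6  # % limit for systematic error quoted in the text

def percent_deviations(test_values, ic_values):
    """Deviation (%) of each test-system result from the IC result."""
    return [100 * (t - r) / r for t, r in zip(test_values, ic_values)]

def mean_deviation_with_ci(deviations, z=1.96):
    """Mean deviation and an approximate 95% confidence interval."""
    m = mean(deviations)
    half_width = z * stdev(deviations) / sqrt(len(deviations))
    return m, (m - half_width, m + half_width)

def fit_line(x, y):
    """Ordinary least-squares slope and intercept."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

def deviation_at(ic_values, deviations, concentration):
    """Deviation (%) predicted by the regression line at one concentration."""
    slope, intercept = fit_line(ic_values, deviations)
    return slope * concentration + intercept
```

With the deviations in hand, `deviation_at(ic_values, deviations, 3.56)` and `deviation_at(ic_values, deviations, 5.42)` would give the regression-based deviations at the lowest and highest concentrations, each of which can then be compared with `SE_LIMIT`.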
data presentation
The results of the method comparisons are presented in the form of
bias plots. As indicated above, we used only single measurements (even
for the application laboratories, where the samples had been measured
in duplicate) for the method comparisons. This allowed us to evaluate
the performance of the test systems against the limits for TE, which
also referred to a single measurement. Therefore, in all data
presentations, the y-axis represents the percentage of the
deviations of the test system singlets from the mean of the IC
quadruplicates. The respective values (mmol/L) measured for the 60
samples by IC are plotted on the x-axis. The plots also
contain the regression lines for the deviations, together with the 95%
prediction limits calculated as 1.96 × CVtot,COMP. Finally, the strictest TE limit for
serum K analysis (TE = 6.3%), as proposed by
Ricós et al. (10), was included.
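The elements of such a bias plot can be summarized numerically. The sketch below is an assumption-laden illustration: it takes CVtot,COMP as a given input (in %), places the 95% prediction limits at ±1.96 × CVtot,COMP around the regression line, and counts the results outside those limits and outside the 6.3% TE limit. The function names and the returned dictionary are hypothetical.

```python
from statistics import mean

TE_LIMIT = 6.3  # % total error limit quoted in the text

def fit_line(x, y):
    """Ordinary least-squares slope and intercept."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

def summarize_bias_plot(ic_values, deviations, cv_tot_comp, z=1.96):
    """Count results outside the prediction limits and outside the TE limit.

    ic_values: IC results (mmol/L), plotted on the x-axis
    deviations: deviations (%) of the singlets from IC, plotted on the y-axis
    cv_tot_comp: total analytical imprecision of the comparison (%)
    """
    slope, intercept = fit_line(ic_values, deviations)
    half_width = z * cv_tot_comp
    outside_prediction = sum(
        1 for x, d in zip(ic_values, deviations)
        if abs(d - (slope * x + intercept)) > half_width
    )
    outside_te = sum(1 for d in deviations if abs(d) > TE_LIMIT)
    return {
        "slope": slope,
        "intercept": intercept,
        "prediction_half_width": half_width,
        "outside_prediction": outside_prediction,
        "outside_te": outside_te,
    }
```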
| Results and Discussion |
|---|
The SE and TE were controlled by use of the NIST 909b reference materials. The observed deviations were 0.20 ± 0.25% for the SRM 909b1 (certified value, 3.424 mmol/L) and 0.27 ± 0.18% for the SRM 909b2 (certified value, 6.278 mmol/L). Both values satisfied the 0.65% limit for SE cited above. Furthermore, neither the 1.5% CV limit (this also holds true for the 60 serum samples) nor the 3% TE limit for the quadruplicates was exceeded. The total imprecision, CVtot,REF, was 0.6%. From these results, we can conclude that the reference method was under control during the study.
intrinsic quality
Internal quality control.
During the study, a strict IQC
protocol was imposed on the respective application laboratories. At
least two IQC samples (with low and high K concentrations)
were measured in blocks at the beginning of the assay (n = 6),
after each tenth duplicate (n = 3), and at the end of the assay.
All manufacturers used their own IQC samples, each of which had a
method-dependent target value. The mean deviations from the respective
target values for the IQC samples with low, medium (if applicable), and
high concentration are presented in Table 2. For our study, we aimed for a maximum deviation of 2% between the
target and the mean of all IQC results per sample. However, this
limit could not always be realized in practice. The low-concentration
IQC samples of three of the manufacturers (AVL, Beckman, and Chiron)
had mean deviations >2.0%. These deviations had previously been
observed during experiments preliminary to the study. However, the
manufacturers claimed that this would not negatively influence the
results of the analyses of these samples. In the calculation of the
mean AVL deviations, the first IQC block was not taken into account
because the results for all samples showed a pronounced negative bias
(-5.56%, -3.23%, and -2.28%, respectively). The analyses were not
repeated because the system was stable after this first IQC block, and
the deviation was not reflected in the sample measurements.
|
Imprecision.
Imprecision was calculated using the IQC results
(CVIQC) and using the differences between
duplicates (CVwd); these values are shown in Table 3. Interestingly, for Boehringer and J&J, the
CVIQC values were substantially higher than the
CVwd values. This was due to a system drift over
the total run in the order of 2% for Boehringer and to a continuous
fluctuation of the IQC values in the same magnitude for J&J. The
influence of the system drift on the duplicate measurements was
negligible because both samples immediately followed each other. This
observation led us to calculate the CVtot,COMP
values (see above) from the CVIQC and not from
the CVwd data, because the total imprecision of
the routine results was involved. For the Dade ISE test, the
CVIQC and CVwd values
were similar and extremely low, indicating a high repeatability and a
negligible drift during the study. Imprecision was also very low for
the Chiron, Roche, and Beckman systems (the CVIQC and CVwd for all
three systems were 0.5%).
|
Method comparison: systematic error.
For each test system
evaluated in the application laboratories, the deviation (%) of the
singlet results for the group of sera from the results (mean of
quadruplicates) obtained by the IC reference method was calculated.
From these data, we derived the mean deviation to estimate the
systematic error of the test systems in comparison with IC (Table 4). Because of a concentration dependency of these deviations,
the deviations of the samples with the lowest (3.56 mmol/L) and the
highest (5.42 mmol/L) K concentrations were also
calculated. When comparing these deviations with the most stringent
limit for SE of 1.6% (11), one has to consider
that the reference method was performed under the condition of a
maximum SE of 0.65%. Therefore, the 1.6% limit has to be extended in
our case to 2.25% (1.6% + 0.65%). Only the Dade ISE system exceeded
this limit and showed a constant bias of ~3%. However, the AVL,
Beckman, and Chiron systems showed a slight undercalibration in the low
concentration range, and the Dade IMT system showed a slight
undercalibration in the high concentration range. Interestingly, this
fact is also visible in the respective IQC data (Table 2), which
stresses the importance of adequate IQC.
|
In summary, with the exception of the Dade ISE system, all of the investigated test systems were well calibrated in the reference range.
Method comparison: total error.
The method comparisons are shown in the form of bias plots (see Materials and Methods) in Fig. 1. As explained above, these bias plots were constructed from singlet results. This was done purposely because, in this way, the method comparisons directly give an impression of the magnitude of the TE. Therefore, Fig. 1 also reflects the high intrinsic quality of the investigated systems. None of them gave a substantial number of results outside the strictest TE limit of 6.3%. Nevertheless, the slightly poorer performance of the Dade IMT system (compared with the other systems) can also be observed in the bias plots. In particular, the variation around the regression line is considerably greater in the Dade IMT system than in the other test systems. This variation can originate from two sources, namely, from the total analytical variation of the reference and routine methods (CVtot,COMP) or from sample-related effects (15). To distinguish between the two sources, we included the 95% prediction limit for variation caused by the CVtot,COMP in Fig. 1. For the same purpose, we compared the predicted variation with the observed variation; the latter was obtained from regression analysis (Sy|x; Table 5). To make the data more comparable, we expressed the Sy|x values in percentages. For the Dade IMT system, Sy|x is considerably higher than CVtot,COMP. Similarly, it can be seen in the bias plot (Fig. 1) that the number of results outside the 95% prediction limit is considerably greater than the statistically tolerable number of three. Both of these problems might indicate some sensitivity of the Dade IMT system to sample-related matrix effects because no drifts or shifts were observed during analysis (shifts or drifts would also produce higher values for Sy|x). For the Dade ISE, the Beckman, and the Chiron systems, the values for Sy|x were greater than CVtot,COMP. Similarly, the number of outliers exceeded the statistically expected number (Fig. 1). For these systems, however, the respective CVtot,COMP values were very low. Therefore, it seems likely that the imprecision of those test systems has been underestimated.
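The comparison of observed and predicted variation can be illustrated with a short sketch. It assumes that the residual standard deviation of the regression is computed directly on the percentage deviations, so that it is already in percent and can be set against CVtot,COMP; how the percentages were obtained in the study is not stated, so this is an assumption. Function names are hypothetical.

```python
from math import sqrt
from statistics import mean

def residual_sd(x, y):
    """Residual standard deviation of an ordinary least-squares line.

    Computed as sqrt(sum(residual^2) / (n - 2)).
    """
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    return sqrt(ss_res / (len(x) - 2))

def excess_variation(ic_values, deviations, cv_tot_comp):
    """Ratio of observed scatter to the purely analytical prediction.

    A ratio well above 1 would point to sample-related effects or to an
    underestimated analytical imprecision.
    """
    return residual_sd(ic_values, deviations) / cv_tot_comp
```

A ratio near 1 would mean that the scatter around the regression line is explained by analytical imprecision alone.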
routine quality
The values for bias and Sy|x observed in the study for the test systems in the routine laboratories are shown in Table 6. For comparison, the corresponding values from the manufacturers are included. In the bias plots (Fig. 2), only the results of the laboratories with the best and the worst performance are shown (the prediction intervals are those of the respective manufacturers).
For Beckman laboratories I and IV, the Sy|x values considerably exceeded the values of the application laboratory. For laboratory IV, this was partly because it could report results with only two, instead of three, significant digits. For laboratory I, the greater variation was due to within-run recalibration, which produced blocks of successive samples that were biased when compared with the mean deviation. With the exception of laboratory I, all laboratories using the Beckman system produced results well within the 6.3% TE limit. The 2.25% SE limit, however, was exceeded in several laboratories, especially in the lower concentration range. The latter was caused to a great extent by the undercalibration of the Beckman system itself (Table 4 and Fig. 1). It should be mentioned in this connection that bias values as low as those required for serum K are generally difficult to realize in practice (16). Therefore, these findings should not be overinterpreted.
Most of the laboratories that used the Boehringer system could generally reproduce the intrinsic quality of the test. The slightly higher imprecision in laboratories I and II was because they reported results with only two significant digits. However, laboratories I and II showed a bias of -4% and 3%, respectively, in the lower concentration range and thus exceeded the 2.25% limit for SE. With the exception of one result in laboratory II, all deviations were very well within the 6.3% TE limit.
For the laboratories that used the Chiron system, only laboratories I and III were able to reproduce the intrinsic quality of the test. Laboratory IV showed a mean bias of -4.8% and an increased imprecision. The latter was observed in particular among the first 15 samples. The laboratory found no reason for this, but problems with electrode adaptation can occur. The combination of bias and increased imprecision produced a high number of results outside the 6.3% TE limit. Laboratories II and V showed greatly increased values of Sy|x and a pronounced negative bias in the low (-7%; laboratory II) and high (-11.5%; laboratory V) concentration ranges. The effect of the increased Sy|x value and bias was a high number of results outside the 6.3% TE limit. Laboratory II, however, used a different type of electrode than the one that was used by the application laboratory (Table 1).
For the J&J system, the majority of the routine laboratories could reproduce the intrinsic quality of the application laboratory. Only laboratory III showed a slightly increased value of Sy|x, with the consequence that some results were outside the 6.3% limit for TE. However, system calibration was a slight problem. Laboratories I and IV showed a bias of approximately -3% and thus exceeded the 2.25% SE limit.
Reference intervals.
To investigate whether the observed biases would be reflected in laboratory reference intervals, we asked the routine laboratories which reference intervals they used. Most laboratories followed the recommendations of the manufacturers and used a reference interval of 3.5–5.1 mmol/L. J&J used a reference interval of 3.6–5.0 mmol/L; however, the mean of 4.3 mmol/L was the same as that of the other reference intervals (Table 1). Four laboratories (Beckman laboratory II, Boehringer laboratory V, and Chiron laboratories III and IV) reported considerably different reference intervals. Most strikingly, their mean value was 7% higher than those of the others. However, this was not reflected in their calibration. Perhaps they accounted in this way for higher preanalytical errors that caused increased K values through hemolysis.
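The arithmetic behind the comparison of reference intervals is simple enough to check directly. The sketch below reads the two quoted intervals as 3.5 to 5.1 mmol/L and 3.6 to 5.0 mmol/L and takes the "mean" of an interval to be its midpoint, which is an assumption; the variable names are hypothetical.

```python
def midpoint(low, high):
    """Midpoint of a reference interval (mmol/L)."""
    return (low + high) / 2

common_mid = midpoint(3.5, 5.1)   # interval used by most laboratories
jj_mid = midpoint(3.6, 5.0)       # interval quoted for J&J

# A midpoint 7% higher than the common one, as reported for four laboratories
shifted_mid = common_mid * 1.07

print(round(common_mid, 2), round(jj_mid, 2), round(shifted_mid, 2))
```

Both intervals share a midpoint of 4.3 mmol/L, and a 7% upward shift corresponds to a midpoint of about 4.6 mmol/L.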
| Summary and Conclusions |
|---|
| Acknowledgments |
|---|
| Footnotes |
|---|
1 Nonstandard abbreviations: IC, ion chromatography; ISE, ion selective electrode; TE, total error; SE, systematic error; IQC, internal quality control; SRM, standard reference material; and IMT, integrated multisensor technology.
| References |
|---|