Laboratory Management
Laboratory of Clinical Biochemistry, Psychiatric University Hospital, DK-8240 Risskov, Denmark.
linnet{at}post7.tele.dk.
| Abstract |
|---|
), but in cases involving only single measurements by each method,
this ratio may be unknown and is often assigned a default value of one.
On the basis of simulations, this practice was evaluated in situations
with real error ratios deviating from one. Comparisons of two
electrolyte methods and two glucose methods were simulated. In the
first case, misspecification of the error ratio λ
produced a bias that amounted to
two-thirds of the maximum bias of the ordinary least-squares regression
method. Standard errors and the results of hypothesis-testing also
became misleading. In the second situation, a misspecified error ratio
resulted only in a negligible bias. Thus, given a short range of values
in relation to the measurement errors, it is important that the ratio λ is
correctly estimated either from duplicate sets of measurements or, in
the case of single measurement sets, specified from quality-control
data. However, even with a misspecified error ratio, Deming regression
analysis is likely to perform better than least-squares regression
analysis.
| Introduction |
|---|
| Materials and Methods |
|---|
For a given sample measured by two clinical chemistry methods, the following
relations exist:
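$$x_i = X_i + \varepsilon_i$$
$$y_i = Y_i + \delta_i$$
$$Y_i = \alpha_0 + \beta X_i$$
Here x_i and y_i are the measured values, X_i and Y_i are the target values, and ε_i and δ_i are the random measurement errors.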
To estimate the regression line by the Deming method it is, however,
necessary to estimate or assign a value to the ratio between the squared
analytical standard deviations for the x and y methods,
λ = SDax²/SDay² (see formula in Appendix). λ determines the angle at which
points are projected onto the line to minimize the sum of squared
deviations (5) (Fig. 1).
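The Appendix formula is not reproduced in this text, so the following is only a minimal sketch of the commonly used closed-form Deming slope, assuming the error ratio is defined as above (λ = SDax²/SDay²). The function name `deming_fit` and the variable names `u`, `q`, and `p` are illustrative choices, not taken from the article.

```python
import math


def deming_fit(x, y, lam=1.0):
    """Unweighted Deming fit (sketch).

    lam is the assumed ratio SDax^2 / SDay^2, i.e. the x-method error
    variance divided by the y-method error variance.
    Returns (slope, intercept).
    """
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    # Centered sums of squares and cross products.
    u = sum((xi - mx) ** 2 for xi in x)
    q = sum((yi - my) ** 2 for yi in y)
    p = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    if p == 0:
        raise ValueError("x and y are uncorrelated; slope is undefined")
    slope = ((lam * q - u) + math.sqrt((u - lam * q) ** 2 + 4 * lam * p ** 2)) / (2 * lam * p)
    intercept = my - slope * mx
    return slope, intercept


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0, 4.0, 5.0]
    y = [1.1, 1.9, 3.2, 3.8, 5.0]
    for lam in (0.25, 1.0, 4.0):
        print(lam, deming_fit(x, y, lam))
```

With data lying exactly on a line, the estimate does not depend on λ. The choice of λ matters only when the points scatter around the line, which is the situation studied here. As λ approaches zero (all error assigned to y), this expression approaches the ordinary least-squares slope p/u.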
For analytes with values extending over a considerable range, the
analytical SD usually increases with the measurement level
(6). Often a proportional relationship approximately exists
over the major part of the range, which implies that the analytical
coefficient of variation (CVa) is approximately constant.
In this situation, we have the relation
λ = CVax²/CVay².
Deming regression may in this case be carried out either in the simple,
unweighted form or in a weighted form in which weights are introduced
that are inversely proportional to the squared analytical standard
deviation at a given measurement level (see Appendix). The
latter procedure is the optimal one (3).
simulation procedure
The performance of regression methods is dependent on the relation
between the dispersions of measurement errors and the dispersions of
target values. The larger the dispersion of measurement errors is in
relation to the dispersion of target values, the larger is the
imprecision of the slope estimate. For the ordinary least-squares
procedure, the bias of the slope estimate depends on the ratio between
the dispersions of x measurements and X target
values (7). In clinical chemistry, one observes a wide range
of ratios between measurement error and target value dispersions. High
ratios occur for compounds that are tightly regulated in the body,
e.g., electrolytes, whereas small ratios may occur for substances with
a very wide physiological variation, e.g., various hormones whose serum
concentrations may span several decades. Intermediate ratios occur for
various metabolites such as glucose, urea, and others. In the present
study, the focus is on two prototype situations: an "electrolyte-like
situation" with a small dispersion of target values and a
"metabolite-like case" with a span of target values close to one
decade. The performance of the two regression methods was evaluated for
various measurement error combinations that should be realistic in
clinical chemistry. Random numbers were generated by a computer
according to specified distributions. The measurement error
distributions were assumed to be gaussian. The regression estimates
were computed and subjected to statistical tests as described
(3, 8). The computations were carried out
using a modification of the software program CBstat, which is a Windows
application for Deming regression analysis developed by the author.
Each simulated situation was repeated 5000 times to obtain reliable
performance measures. The performance measures were as follows:
(i) Bias of the slope estimate. The bias is the difference between the average of the estimated slope values for the 5000 (nrun) simulation runs (bm) and the true value (β). The true value was set to 1 in the simulated situations, corresponding to the null hypothesis of identity.
(ii) Root mean squared error of the slope
estimate. The root mean squared error (RMSE) is an estimate of the
total error of the slope estimate (b) and includes a
systematic part (bias) and a random error part (standard error):
$$\mathrm{RMSE} = \sqrt{\frac{1}{n_{run}} \sum_{i=1}^{n_{run}} (b_i - \beta)^2}$$
(iv) Average estimated standard error of the slope [SE(b)]. In each simulation run, a standard error of the slope is estimated as a result of the statistical analysis, so that a t-test of the null hypothesis can be performed. A prerequisite for a correct test is that the estimated standard error, on average, agrees with the real one.
(v) Hypothesis testing. The performance of hypothesis testing can be evaluated by comparing the observed and expected frequencies of rejection of the null hypothesis on the basis of the t-test for the slope carried out in each simulation run. Under the null hypothesis, one expects 50 rejections out of 1000 trials, when the nominal type I error is set to the usual level of 5%. Thus, if the observed number of rejections is 200, the actual type I error is four times the nominal or expected value, and testing of the null hypothesis is not performed in the correct way.
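The performance measures (i) and (ii) can be illustrated with a small Monte Carlo sketch. It is not the CBstat implementation. The target mean of 140.5 is inferred from the 1% CV corresponding to an SD of 1.405, and the target SD of 3.8 is an assumption chosen for illustration. The function names, the zero intercept, and the default seed are likewise hypothetical.

```python
import math
import random


def _sums(x, y):
    """Return centered sums u = Sxx, q = Syy, p = Sxy."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    u = sum((a - mx) ** 2 for a in x)
    q = sum((b - my) ** 2 for b in y)
    p = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return u, q, p


def ols_slope(x, y):
    u, _, p = _sums(x, y)
    return p / u


def deming_slope(x, y, lam=1.0):
    # lam is the assumed ratio SDax^2 / SDay^2.
    u, q, p = _sums(x, y)
    return ((lam * q - u) + math.sqrt((u - lam * q) ** 2 + 4 * lam * p * p)) / (2 * lam * p)


def simulate(sd_ax, sd_ay, lam_assumed=1.0, n=50, nrun=5000, beta=1.0,
             target_mean=140.5, target_sd=3.8, seed=1):
    """Return bias (bm - beta) and RMSE of the slope for OLS and Deming."""
    rng = random.Random(seed)
    slopes = {"ols": [], "deming": []}
    for _ in range(nrun):
        # Gaussian target values; target_sd is an assumed value.
        targets = [rng.gauss(target_mean, target_sd) for _ in range(n)]
        x = [t + rng.gauss(0.0, sd_ax) for t in targets]
        y = [beta * t + rng.gauss(0.0, sd_ay) for t in targets]
        slopes["ols"].append(ols_slope(x, y))
        slopes["deming"].append(deming_slope(x, y, lam_assumed))
    summary = {}
    for name, values in slopes.items():
        bm = sum(values) / nrun
        rmse = math.sqrt(sum((b - beta) ** 2 for b in values) / nrun)
        summary[name] = {"bias": bm - beta, "rmse": rmse}
    return summary


if __name__ == "__main__":
    # Correctly specified ratio (1:1), then a misspecified one (SDay = 2 * SDax).
    for sd_ay in (1.405, 2.81):
        print(sd_ay, simulate(1.405, sd_ay, nrun=1000))
```

Estimating the standard error within each run and counting t-test rejections, as in measures (iv) and (v), would need an additional step such as a jackknife; that part is left out of this sketch.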
| Results |
|---|
For the situations with analytical SD ratios different from 1, both
types of slope estimates become biased (Fig. 2, B and C). The worst
case is that for the least-squares method and an analytical CV of 2%
for the x measurements (bias, -35%). However, the Deming
method also gives slope results with a considerable bias in the
situations with a misspecified SD ratio, up to 24% in the example
shown here. The bias of the Deming method is positive when
SDay exceeds SDax and negative when
SDay is smaller than SDax. In contrast, the
bias of the OLR method is always negative and only depends on the
relation between x measurement errors and the dispersion of
X target values (see Appendix).
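The dependence of the OLR bias on the x measurement error alone can be checked with the standard large-sample attenuation relation for least-squares regression with error in x. This is a sketch only: the Appendix formula is not shown in this text, and the target SD of 3.8 is an assumed value, chosen because it gives biases close to those reported.

```python
def olr_expected_slope(beta, sd_target, sd_ax):
    """Large-sample OLS slope when x carries random measurement error.

    Standard attenuation relation: beta * var_X / (var_X + SDax^2).
    """
    var_target = sd_target ** 2
    return beta * var_target / (var_target + sd_ax ** 2)


if __name__ == "__main__":
    # sd_target = 3.8 is an assumption, not a value stated in the text.
    for sd_ax in (1.405, 2.81):
        b = olr_expected_slope(1.0, 3.8, sd_ax)
        print(f"SDax = {sd_ax}: expected slope {b:.3f}, bias {100 * (b - 1):.1f}%")
```

Under this assumption the relation gives biases of roughly -12% and -35% for the two SDax values, and SDay does not enter the expression at all.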
A more detailed account of the relation between analytical error ratio
and bias is displayed in Fig. 3. In Fig. 3A, SDax is fixed at 1.405 (CVax,
1%), and SDay increases from 1.405 to 2.81 (CVay,
2%). In this situation, the bias of the Deming method gradually
increases to a maximum value of 24% as the ratio deviates more and
more from the assumed value of one, whereas the bias of the OLR method
is constant (-12%). In Fig. 3B, SDay is kept fixed at 1.405
(CVay, 1%), and SDax increases from 1.405 to 2.81
(CVax, 2%). Now the biases of both procedures increase in magnitude with
increasing SDax value, so that the maximum biases are observed
at a SDay/SDax ratio of 0.5 (Fig. 3B). The bias of the
Deming procedure goes from zero to a maximum of -18%, whereas
the bias of OLR starts at -12% and reaches -35%.
When the null hypothesis has been tested in cases where no difference exists between the target values, the bias has led to the null hypothesis being rejected too frequently. For the OLR method, the frequency ranges from 19% to 98%. In the latter situation, the test will almost always produce a rejected null hypothesis, i.e., the conclusion of the statistical analysis is that there is a systematic difference between the two methods although no real difference exists. With regard to the Deming method, this frequency ranges from 5.9% to 44%.
In the example described above, the target values of the 50 samples
were randomly drawn from a gaussian distribution with a mean
corresponding to the mean of the reference interval. Therefore, the
majority of sample values are located in the middle of the reference
interval with relatively few extreme sample values. In method
comparison studies, the investigator may choose to deliberately select
samples with values in the periphery of the range of interest to obtain
a more uniform distribution over the studied interval and thereby
increase the precision of estimated regression parameters. To study
such a procedure, one may simulate drawing of sample sets from a
uniform distribution of target values covering the range of
interest (131–150 mmol/L) instead of a gaussian distribution, as in
the cases dealt with above. After this change, the Pearson coefficient
of correlation increased to 0.94 when both analytical SDs were 1.405
and to 0.87 when one of the analytical SDs was equal to 2.81. For this
model, the same general trends are observed, but on a reduced scale.
The Deming procedure with λ specified as one yields slope
biases of 10% and -9% for SDax/SDay ratios of 1:2
and 2:1, respectively. Similarly, the slope biases of the OLR procedure
are also reduced to about one-half of the values, -6% and -20%,
respectively. Thus, by assuring a more even distribution of sample
values over a given interval, the bias problems can be reduced but not
avoided.
metabolite case
Many clinical chemical analyses have range ratios in the interval
5–10, e.g., serum glucose determinations. For this analyte, patient
samples may typically range from ~2.5 to 20 mmol/L. In a method
comparison study, most of the values are likely to be located in the
lower half of this interval. Thus, in the simulation model it is
supposed that three-quarters of the observations are located in the
lower half and one-quarter in the upper half of the interval. The
measurement errors were assumed to be proportional to the
concentrations, which is a more common situation than that of constant
analytical errors for analytes with a considerable range
(6). Proportional measurement errors imply that the
analytical coefficients of variation, CVax and CVay,
are constant over the measurement range. Their values were set to 4%
or 8%, so that the ratios of analytical coefficients of
variation were 1:1, 1:2, or 2:1 (Table 2). The case with a ratio of 2:1 is
shown in Fig. 4. The Pearson coefficient of correlation was 0.99 when both CVs
equalled 4% and 0.98 when one CV was 8%. In accordance with the
specified model, the weighted form of Deming regression
analysis was carried out assuming proportional errors (see
Appendix). Again, the sample was composed of 50 single
observations for each method. All slope estimates obtained by the
least-squares regression method are biased, but because of the wider
range of target values, the bias is either negligible (0.7%) or small
(2.7%). Hypothesis-testing, however, is clearly misleading with type I
errors four to six times larger than the nominal one of 5%. This
is due partly to the bias and partly to an underestimated standard error.
The weighted Deming method yields an unbiased slope estimate in the case with an analytical CV ratio of 1:1 and biases up to only 0.7% in the other cases. Thus, in the present example, the maximum bias of the weighted Deming procedure amounts to only about one-fourth of the maximum bias of the OLR method and so may be regarded as negligible. Furthermore, estimation of the standard error of the slope, and thus hypothesis-testing, is rather insensitive to a misspecified error ratio. The observed type I error values (5.3–5.5%) are very close to the nominal value of 5%. Thus, although the analytical SD ratio might have been misspecified, hypothesis-testing is reliable in this situation, with a relatively wide range of target values compared with the dispersion of measurement errors.
| Discussion |
|---|
may, however, pose problems. The
easiest way to attain a correct value of the error ratio λ is to use duplicate
measurements in the method comparison study, allowing for simultaneous
estimation of the analytical SDs or CVs and thus of λ. The use of duplicate
sets of measurements in method comparison studies is generally
recommended (11). However, taking into account that most
method comparison evaluations appear to be based on single
measurements, one should strive to specify λ correctly. In most
cases, quality-control data will be available, and therefore λ may
be specified as the ratio between recorded squared analytical SDs. As
shown, misspecification of λ may induce a considerable slope bias for
the Deming method in electrolyte-like situations with a short range of
data. Hypothesis-testing is also seriously affected. Thus, a correct
specification of λ is important when applying the Deming method in
situations with a limited range of values. With regard to the
metabolite-like case, the bias problem is negligible; also with regard
to the testing of hypotheses, the Deming procedure appears to be rather
robust towards a misspecified error ratio in these examples.
More generally, when considering the sensitivity of the Deming procedure to a misspecified error ratio in a given situation, one may to some extent be guided by the value of the correlation coefficient. Electrolyte-like cases are characterized by low values for the correlation coefficient, whereas the correlation coefficient is higher in situations with a wider range of values. In cases with a misspecified error ratio, the bias of the Deming method is smaller than the maximum bias observed for OLR analysis. A value of 0.975 for the correlation coefficient has been suggested as the lower limit for acceptable performance of OLR in method comparison studies (11). Thus, according to this point of view, the Deming method may be regarded as relatively insensitive to misspecified error ratios in situations with correlation coefficients exceeding 0.975.
The impact of a misspecified error ratio on the performance of Deming regression analysis has also been considered in the theoretical literature (12). A theoretical evaluation shows that the RMSE of the Deming slope is at a minimum when the error ratio is correctly specified and increases the more the specified ratio deviates from the true ratio. However, for reasonable parameter examples, the RMSE stays lower than that of OLR (12).
Some other regression methods that take measurement errors for both x and y into account apparently do not exhibit the problem with specification of the SDax/SDay ratio. In the so-called standardized principal component analysis, the slope is computed in a slightly different way. The procedure actually presupposes that the error ratio is related to the slope, that is, (SDax/SDay) = 1/β (13). Only when this assumption holds does the method provide a slope estimate free of any bias. This procedure is thus not very flexible and, therefore, not as useful as the Deming procedure. Similarly, the rank regression method of Passing and Bablok operates with the same rigid assumption (14). If the relation does not hold true, the slope estimate becomes biased (3).
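The link between the standardized principal component slope and an implicit error ratio can be illustrated numerically. The sketch below assumes the usual definition of that slope as the ratio of the sample SDs of y and x (with the sign of the covariance) and the closed-form Deming slope with λ = SDax²/SDay². The function names and the example data are hypothetical.

```python
import math


def _sums(x, y):
    """Return centered sums u = Sxx, q = Syy, p = Sxy."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    u = sum((a - mx) ** 2 for a in x)
    q = sum((b - my) ** 2 for b in y)
    p = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return u, q, p


def spc_slope(x, y):
    """Standardized principal component slope: sign(Sxy) * sqrt(Syy / Sxx)."""
    u, q, p = _sums(x, y)
    return math.copysign(math.sqrt(q / u), p)


def deming_slope(x, y, lam):
    """Deming slope with lam = SDax^2 / SDay^2 (assumed definition)."""
    u, q, p = _sums(x, y)
    return ((lam * q - u) + math.sqrt((u - lam * q) ** 2 + 4 * lam * p * p)) / (2 * lam * p)


if __name__ == "__main__":
    # Made-up illustration data, not from the article.
    x = [1.0, 2.0, 3.0, 4.0, 5.0]
    y = [1.2, 1.9, 3.4, 3.9, 5.3]
    b = spc_slope(x, y)
    # Setting lam = 1 / b^2 corresponds to SDax / SDay = 1 / b.
    print(b, deming_slope(x, y, lam=1.0 / b ** 2))
```

The two printed values should agree up to rounding. In other words, this slope behaves like a Deming estimate whose error ratio is fixed by the data rather than by the actual analytical SDs.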
| References |
|---|