| Abstract |
|---|
Methods: Theoretical equations based on the Deming approach, further developed by physicists and extended herein, were applied to method-comparison data analysis. Monte Carlo simulations were used to demonstrate the validity of the new procedure and to compare its performance to ordinary linear regression (OLR) and simple Deming regression (SDR) procedures.
Results: Simulation studies included three types of data commonly encountered in method-comparison studies: (a) constant within-method SDs for both methods, (b) constant within-method CVs for both methods, and (c) neither SDs nor CVs constant for both methods. For all cases examined, OLR produced unreliable confidence intervals of the estimated bias. However, OLR point estimates of systematic bias were reliable when the correlation coefficient was >0.975. SDR produced reliable estimates of systematic bias for all cases studied, but the confidence intervals of systematic bias were unreliable when SDs of methods varied as a function of analyte concentration.
Conclusion: Only iteratively reweighted general Deming regression produced statistically unbiased estimates of systematic bias and reliable confidence intervals of bias for all cases.
| Introduction |
|---|
The estimated systematic bias ($\hat{B}_C$) at medical decision level $X_C$ is given by:

$$\hat{B}_C = a + (b - 1)X_C$$

The CI of the estimated bias is given by:

$$\hat{B}_C \pm t_{(1-\alpha/2;\,n-2)}\sqrt{\mathrm{Var}(\hat{B}_C)}$$

where $t_{(1-\alpha/2;\,n-2)}$ is the Student t-statistic at the desired confidence level $(1 - \alpha)$ with $(n - 2)$ degrees of freedom. The variance of the bias estimate is given by:

$$\mathrm{Var}(\hat{B}_C) = \mathrm{Var}(a) + X_C\,(X_C - 2\bar{x}_w)\,\mathrm{Var}(b)$$

where Var(a) and Var(b) are the variances of the estimated intercept and slope, respectively. Note that Var(a) = [SE(a)]² and Var(b) = [SE(b)]². The calculation of $\bar{x}_w$, the weighted mean of the comparative method values, is described below.
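The bias calculation described above can be sketched in a few lines. This is an illustration under stated assumptions rather than the author's code: the function name and parameters are hypothetical, the bias is taken as the departure of the fitted line from identity, a + (b - 1)X_C, and the covariance of intercept and slope is assumed to equal the negative of the weighted mean of x times Var(b), as holds when the intercept is computed as the weighted mean of y minus b times the weighted mean of x. The critical t value is passed in by the caller so that no statistics library is needed.

```python
import math


def bias_with_ci(a, b, var_a, var_b, x_w_mean, x_c, t_crit):
    """Return (bias, lower, upper) at medical decision level x_c.

    a, b         : estimated intercept and slope
    var_a, var_b : their variances, i.e. SE(a)**2 and SE(b)**2
    x_w_mean     : weighted mean of the comparative method values
    t_crit       : Student t quantile t(1 - alpha/2; n - 2), supplied by the caller
    """
    bias = a + (b - 1.0) * x_c
    # Assumption: Cov(a, b) = -x_w_mean * Var(b), so the variance of the bias
    # collapses to an expression in Var(a), Var(b) and x_w_mean only.
    var_bias = var_a + x_c * (x_c - 2.0 * x_w_mean) * var_b
    if var_bias < 0.0:
        raise ValueError("inconsistent variances: Var(bias) is negative")
    half_width = t_crit * math.sqrt(var_bias)
    return bias, bias - half_width, bias + half_width


# Example with made-up numbers; t(0.975; 48) is approximately 2.0106 for n = 50.
bias, lower, upper = bias_with_ci(0.5, 1.02, 0.04, 0.0001, 10.0, 10.0, 2.0106)
```

The interval is narrowest when the decision level sits at the weighted mean of the comparative method values and widens as the decision level moves away from it.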
The reliability of the estimated bias and its CI depends on the appropriateness of the regression procedure for analysis of the particular set of experimental data. In current practice, regression procedures are selected based on what assumptions can justifiably be made about the data. For example, in ordinary linear regression (OLR), the most commonly used regression procedure for method-comparison calculations, it is assumed that comparative method values are without random error and that test method random error is constant throughout the range of the data. Although these assumptions are never strictly justified, results of OLR are of acceptable accuracy and precision when the random error of the comparative method is small compared with the range of the data and when the test method data are not "significantly" heteroscedastic. When OLR cannot be used because of substantial violations of its assumptions, an appropriate form of Deming regression may be selected.
Deming regression is the term used in laboratory medicine to refer to linear regression analysis in which the random error of both the comparative and test methods is taken into account. Although Deming's approach to a generalized regression procedure was basically sound, he oversimplified the problem by expanding the straight line function in a Taylor series about assumed values of slope, intercept, and adjusted points. Because squared and higher terms in the expansion were neglected, Deming's original general exposition can lead to significant errors in some instances, as he recognized.
Nevertheless, Deming presented an exact solution for the particular case in which both x and y are subject to random error but in such a way that the ratio $\lambda = \mathrm{Var}(x)/\mathrm{Var}(y)$ is constant and not infinite or zero throughout the data range (1). With this constraint, he derived equations for the slope and intercept for a weighted least-squares regression model. When the variance of x is constant throughout the data range, the variance of y must also be constant, and the equations for the Deming slope and intercept reduce to the well-known formulae for simple Deming regression (SDR) called to our collective attention by Cornbleet and Gochman (2). More recently, Linnet (3) independently rederived the cited formulae for the Deming slope and intercept and focused on a special case in which random errors in both x and y are proportional to the overall average value of the test and comparative results for each sample. For convenience, we refer to the Linnet procedure as "constant CV Deming regression", although it is not precisely so. When assumptions about constant SDs or proportional SDs do not apply to the data, CIs of the bias are unreliable for the cited Deming methods.
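For orientation, the constant-variance case can be written out directly. The sketch below uses the commonly published closed form for the simple Deming slope, expressed with the error-variance ratio Var(x)/Var(y) passed as `lam`; the function name is hypothetical and the formula is the textbook one, not a transcription of the cited papers.

```python
import math


def simple_deming(x, y, lam=1.0):
    """Return (intercept, slope) for simple Deming regression.

    lam is the assumed constant ratio Var(x)/Var(y) of the random-error
    variances of the comparative (x) and test (y) methods.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    if sxy == 0.0:
        raise ValueError("x and y are uncorrelated; slope is undefined")
    d = lam * syy - sxx
    slope = (d + math.sqrt(d * d + 4.0 * lam * sxy * sxy)) / (2.0 * lam * sxy)
    return y_bar - slope * x_bar, slope


# Points lying exactly on y = 2x + 1 are recovered for any positive lam.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0 * v + 1.0 for v in xs]
intercept, slope = simple_deming(xs, ys, lam=1.0)
```

As `lam` approaches zero (no random error in x) the slope tends to the OLR slope of y on x, which is one way to see OLR as a limiting case of this model.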
We present here a generally applicable statistical procedure for Deming regression without constraints on random error of the test or comparative method. We then use Monte Carlo simulations to demonstrate the validity of the new procedure and to compare its performance to other regression procedures frequently used in current practice.
| Materials and Methods |
|---|
![]() |
where:
![]() |
![]() |
![]() |
Because $z_i$, $w_i$, $\bar{x}_w$, and $\bar{y}_w$ are functions of $b_D$, an iterative calculation procedure is required.
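For readers who want the quantities named above in explicit form, the following is the standard Williamson-York formulation of weighted straight-line fitting with errors in both variables, written in this section's symbols. It is offered as an assumption about the form intended here, not as a transcription of the article's equations; Var(x_i) and Var(y_i) denote the random-error variances of the comparative and test methods at point i.

```latex
\begin{aligned}
w_i &= \frac{1}{\mathrm{Var}(y_i) + b_D^{2}\,\mathrm{Var}(x_i)}, \qquad
\bar{x}_w = \frac{\sum w_i x_i}{\sum w_i}, \qquad
\bar{y}_w = \frac{\sum w_i y_i}{\sum w_i}, \\
z_i &= w_i\left[\mathrm{Var}(y_i)\,(x_i - \bar{x}_w) + b_D\,\mathrm{Var}(x_i)\,(y_i - \bar{y}_w)\right], \\
b_D &= \frac{\sum w_i z_i\,(y_i - \bar{y}_w)}{\sum w_i z_i\,(x_i - \bar{x}_w)}, \qquad
a_D = \bar{y}_w - b_D\,\bar{x}_w .
\end{aligned}
```

Because $b_D$ appears on both sides, the slope is obtained by substituting a current estimate on the right-hand side and repeating until it stops changing.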
Unbiased estimates of $a_D$ and $b_D$ are obtained with these equations when the true weights of the observed points $(x_i, y_i)$ are known. In method-comparison work where weights are typically some function of concentration, weights corresponding to observed points are not optimal because the observed points are subject to random error of the method. We therefore extend the procedure described above by estimating improved weights based on the adjusted values $(\hat{X}_i, \hat{Y}_i)$, which are those points through which the least-squares line is drawn and which represent our best estimates of the true values $(X_i, Y_i)$. Linnet (3) used a similar approach for weighting observed points in his development of constant CV Deming regression. The relationships between observed and adjusted points were given by York (4):
![]() |
![]() |
![]() |
Weights for each observed point are calculated iteratively. After an initial estimate of $a_D$ and $b_D$ based on weights derived from observed points, revised weights are computed using adjusted points, which in turn are used to calculate new values for $a_D$ and $b_D$. The process is repeated using updated estimates of adjusted values and weights until the correction to $b_D$ is <0.0001. In our experience, four or fewer iterations are required for convergence, even for extremely imprecise methods. We refer to this overall procedure for obtaining the unbiased slope and intercept as iteratively reweighted general Deming regression (IRGDR).
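The reweighting loop just described might be sketched as follows. This is a simplified illustration, not the author's program: it assumes the standard Williamson-York weighted slope equation, takes the adjusted points as the weighted means plus z_i (for x) and plus slope times z_i (for y), and folds the slope iteration and the reweighting into a single loop. The function name, the starting slope of 1.0, and the `sd_x`/`sd_y` imprecision-profile callables are all hypothetical.

```python
from typing import Callable, List, Sequence, Tuple


def irgdr(
    x: Sequence[float],
    y: Sequence[float],
    sd_x: Callable[[float], float],
    sd_y: Callable[[float], float],
    tol: float = 1e-4,
    max_iter: int = 50,
) -> Tuple[float, float, int]:
    """Return (intercept, slope, iterations used)."""
    b = 1.0  # assumed starting slope: the two methods roughly agree
    x_adj: List[float] = list(x)  # first pass weights come from observed points
    y_adj: List[float] = list(y)
    for iteration in range(1, max_iter + 1):
        # Error variances evaluated at the current adjusted points.
        var_x = [sd_x(c) ** 2 for c in x_adj]
        var_y = [sd_y(c) ** 2 for c in y_adj]
        w = [1.0 / (vy + b * b * vx) for vx, vy in zip(var_x, var_y)]
        sw = sum(w)
        x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
        y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
        u = [xi - x_bar for xi in x]
        v = [yi - y_bar for yi in y]
        z = [wi * (vy * ui + b * vx * vi)
             for wi, vx, vy, ui, vi in zip(w, var_x, var_y, u, v)]
        b_new = (sum(wi * zi * vi for wi, zi, vi in zip(w, z, v))
                 / sum(wi * zi * ui for wi, zi, ui in zip(w, z, u)))
        a = y_bar - b_new * x_bar
        # Adjusted points lie on the fitted line and drive the next weights.
        x_adj = [x_bar + zi for zi in z]
        y_adj = [y_bar + b_new * zi for zi in z]
        converged = abs(b_new - b) < tol
        b = b_new
        if converged:
            return a, b, iteration
    raise RuntimeError("slope did not converge within max_iter iterations")


# Error-free points on y = 2x + 1 with an assumed 5% CV for both methods.
xs = [float(k) for k in range(1, 11)]
ys = [2.0 * k + 1.0 for k in xs]
a_hat, b_hat, n_iter = irgdr(xs, ys, lambda c: 0.05 * c, lambda c: 0.05 * c)
```

The profile callables must return a positive SD at every adjusted point; with a proportional (constant CV) profile that requires the adjusted concentrations to stay above zero.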
Williamson (6) derived the variances of the estimated slope and intercept from first-order derivatives of $b_D$ and $a_D$ with respect to the observed points ($x_i$ and $y_i$):
![]() |
![]() |
where:
![]() |
![]() |
The derivatives used to calculate Var($b_D$) and Var($a_D$) may also be evaluated at adjusted points. For well-correlated data typically encountered in method-comparison studies, the difference in variances estimated by the two procedures is small, with values based on observed points being slightly larger.
Simulation studies
We compared the performance characteristics of the IRGDR procedure to those of OLR and SDR using Monte Carlo simulations. For each simulation run, we set the true slope at 1.0, the true intercept at 0.0, and n = 50 samples with duplicate values for test and comparative methods at each point. The random error for each simulated result had a Gaussian distribution. Regression calculations were performed by each procedure using only the first replicate of each analytical method to estimate the average bias and 95% CI of the bias at medical decision levels. For SDR calculations, duplicates of the test and comparative method results for each sample were used to estimate SDs, and SEs of the SDR slope and intercept were computed using the general Deming regression relationships with the constant SDs. Computations were performed with an adaptation of a Windows® application developed by the author (EP_Suite 9A, a module in EP_Suite for Windows™) that contains components for each of the regression procedures.
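One simulation run of the kind described above could be generated as in the sketch below. The true slope, intercept, sample count, duplicate measurements, and Gaussian errors follow the text; the uniform spread of true concentrations, the concentration range, and the constant-CV error model are placeholder assumptions, and all names are hypothetical.

```python
import random
from typing import List, Optional, Tuple

Duplicates = Tuple[float, float]


def simulate_run(
    n: int = 50,
    slope: float = 1.0,
    intercept: float = 0.0,
    cv_x: float = 0.05,      # assumed CV of the comparative method
    cv_y: float = 0.05,      # assumed CV of the test method
    low: float = 10.0,       # assumed lower limit of true concentrations
    high: float = 200.0,     # assumed upper limit of true concentrations
    rng: Optional[random.Random] = None,
) -> List[Tuple[Duplicates, Duplicates]]:
    """Return n samples, each as ((x1, x2), (y1, y2)) duplicate results."""
    rng = rng or random.Random()
    samples = []
    for _ in range(n):
        true_x = rng.uniform(low, high)
        true_y = intercept + slope * true_x
        x_dup = (rng.gauss(true_x, cv_x * true_x), rng.gauss(true_x, cv_x * true_x))
        y_dup = (rng.gauss(true_y, cv_y * true_y), rng.gauss(true_y, cv_y * true_y))
        samples.append((x_dup, y_dup))
    return samples


run = simulate_run(rng=random.Random(12345))
# Only the first replicate of each method feeds the regression itself.
x_first = [x_dup[0] for x_dup, _ in run]
y_first = [y_dup[0] for _, y_dup in run]
```

Seeding the generator makes a run reproducible, which is convenient when the same simulated data set must be analyzed by several regression procedures.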
| Results |
|---|
For each of 5000 simulation runs for each case, the slope, intercept, their respective SEs (based on observed values), and the 95% CI of the systematic bias were computed. The means and SDs of the 5000 slopes (and intercepts) are listed as the "average slope" (intercept) and SD of slopes (intercepts). The root-mean-square values of the 5000 computed SEs [SE(a) and SE(b)] are tabulated in their respective rows. The proportion of calculated CIs of systematic bias that did not include the true value of bias (0.0) at each medical decision level is represented by $\hat{\alpha}$. Thus $\hat{\alpha}$ is the confidence coefficient estimated from the simulation runs.
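The two summary statistics described above are simple to compute once the per-run results are collected. The helper names below are hypothetical; the sketch only assumes each CI is stored as a (lower, upper) pair.

```python
import math
from typing import Sequence, Tuple


def alpha_hat(intervals: Sequence[Tuple[float, float]], true_bias: float = 0.0) -> float:
    """Proportion of confidence intervals that do not contain the true bias."""
    misses = sum(1 for lower, upper in intervals if not (lower <= true_bias <= upper))
    return misses / len(intervals)


def root_mean_square(values: Sequence[float]) -> float:
    """Root-mean-square of the per-run standard errors."""
    return math.sqrt(sum(v * v for v in values) / len(values))


# Four made-up intervals: the second and third exclude the true bias of 0.0.
cis = [(-1.0, 1.0), (0.2, 0.5), (-0.3, -0.1), (-0.5, 0.5)]
miss_rate = alpha_hat(cis)
```

For a reliable 95% CI the miss rate over many runs should settle near 0.05, and the root-mean-square SE should be close to the SD of the corresponding estimates across runs.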
An adequate regression procedure must provide (a) statistically unbiased estimates of slope and intercept to compute unbiased point estimates of systematic bias at each medical decision level, and (b) an estimated confidence coefficient ($\hat{\alpha}$) equal to the preestablished value (0.05 in our study), thus indicating reliable CIs of the bias at each level. Review of the data presented in Table 1 leads to the following conclusions regarding the adequacy of the different regression procedures for the various cases:
Relative to point 4, we evaluated the use of the correlation coefficient (r) as a criterion for assessing whether the range of data is adequate for use of the OLR procedure. Among others, Westgard (9) and NCCLS (10) have indicated that if the correlation coefficient is ≥0.975, OLR may be used to estimate systematic bias. In our studies, r was <0.975 in 99.8% of simulation runs for case A, correctly indicating that OLR should not be used. For cases B and C, r was ≥0.975 in 99.8% and 100.0% of runs, respectively, indicating that OLR should be adequate under these conditions. Thus, our data support the usual correlation coefficient criteria for using OLR to estimate average systematic error. However, when the range of data is very broad, heteroscedasticity is likely and the CI of the bias based on OLR should be considered suspect, as revealed in cases B and C.
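The correlation criterion discussed above amounts to a one-line check. The sketch below computes the Pearson correlation coefficient directly and compares it with the 0.975 threshold; the function names are hypothetical, and a passing check speaks only to the OLR point estimate of bias, not to its CI.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of paired results."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return sxy / math.sqrt(sxx * syy)


def olr_range_adequate(x: Sequence[float], y: Sequence[float], threshold: float = 0.975) -> bool:
    """True when r meets the threshold for using OLR to estimate bias."""
    return pearson_r(x, y) >= threshold


comparative = [1.0, 2.0, 3.0, 4.0, 5.0]
test_method = [2.0, 4.0, 6.0, 8.0, 10.0]
use_olr = olr_range_adequate(comparative, test_method)
```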
| Discussion |
|---|
From a practical point of view, the improved reliability of general Deming regression comes at the cost of knowing (or determining) imprecision profiles for both test and comparative methods. In SDR, the constant imprecisions calculated from sample duplicates or from external imprecision studies serve the purpose well. When imprecision of one or both methods varies across the concentration interval, the task is less straightforward. Although several procedures exist for determining weighting functions directly from the method-comparison data (11)(12), such procedures may be risky in view of the not uncommon paucity of imprecision information at the extremes of the data. Furthermore, when weights are estimated directly from the data, calculated results are somewhat less reliable because estimating weights introduces another source of variability.
We have, therefore, preferred to use imprecision profiles from data external to the method-comparison experiment (e.g., imprecision studies, reportable range studies, quality control, or manufacturers' documentation). There are two primary requirements on such external imprecision data: (a) data must accurately reflect imprecision on authentic patient samples, and (b) the imprecision profile must span the entire range of data collected in the method-comparison experiment. With imprecisions on three to seven levels, we then use cubic spline calculations to create a continuous imprecision profile for each method.
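A continuous profile of this kind can be built with a short interpolation routine. The sketch below fits a natural cubic spline (zero second derivative at both ends) through the measured SDs; the boundary condition is an assumption, since the text does not say which spline variant is used, and the function name is hypothetical. Requests outside the measured levels are rejected, in keeping with requirement (b) above.

```python
from bisect import bisect_right
from typing import Callable, Sequence


def imprecision_profile(levels: Sequence[float], sds: Sequence[float]) -> Callable[[float], float]:
    """Return a function giving the interpolated SD at any concentration in range."""
    n = len(levels) - 1  # number of intervals between measured levels
    if n < 2 or len(sds) != len(levels):
        raise ValueError("need at least three levels, each with one SD")
    h = [levels[i + 1] - levels[i] for i in range(n)]
    if any(step <= 0 for step in h):
        raise ValueError("levels must be strictly increasing")
    # Tridiagonal system for the interior second derivatives m[1] .. m[n-1].
    diag = [2.0 * (h[k] + h[k + 1]) for k in range(n - 1)]
    rhs = [6.0 * ((sds[k + 2] - sds[k + 1]) / h[k + 1] - (sds[k + 1] - sds[k]) / h[k])
           for k in range(n - 1)]
    for k in range(1, n - 1):  # forward elimination
        factor = h[k] / diag[k - 1]
        diag[k] -= factor * h[k]
        rhs[k] -= factor * rhs[k - 1]
    m = [0.0] * (n + 1)  # natural spline: m[0] = m[n] = 0
    m[n - 1] = rhs[n - 2] / diag[n - 2]
    for k in range(n - 3, -1, -1):  # back substitution
        m[k + 1] = (rhs[k] - h[k + 1] * m[k + 2]) / diag[k]

    def profile(conc: float) -> float:
        if not levels[0] <= conc <= levels[-1]:
            raise ValueError("concentration outside the measured imprecision range")
        i = min(bisect_right(levels, conc) - 1, n - 1)
        left = levels[i + 1] - conc
        right = conc - levels[i]
        return ((m[i] * left ** 3 + m[i + 1] * right ** 3) / (6.0 * h[i])
                + (sds[i] / h[i] - m[i] * h[i] / 6.0) * left
                + (sds[i + 1] / h[i] - m[i + 1] * h[i] / 6.0) * right)

    return profile


# Made-up SDs at five concentration levels.
sd_at = imprecision_profile([10.0, 50.0, 100.0, 200.0, 400.0], [1.0, 2.0, 3.5, 8.0, 20.0])
sd_mid = sd_at(75.0)
```

A spline passes through every measured SD but can overshoot between widely spaced levels, so it is worth confirming that the interpolated SD stays positive across the whole range before using it to compute weights.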
As with any weighted regression procedure, performance depends on the adequacy of the imprecision profiles. To validate the IRGDR procedure, the simulation studies presented here used known (true) imprecision profiles, which in practice are rarely available. Thus, the simulation results presented may give a somewhat optimistic impression of performance of the method.
One of our Windows programs that performs IRGDR calculations is available as a data supplement from the Clinical Chemistry Web site. The file can be accessed by a link from the on-line Table of Contents (http://www.clinchem.org/content/vol46/issue1/). Executing the downloaded file named "Deming.exe" will create a Windows program group that includes the statistical program (GDR), instruction manuals (in Adobe® Acrobat® pdf format), and a Readme text file that contains important information about the system.
| Footnotes |
|---|
1 Nonstandard abbreviations: CI, confidence interval; OLR, ordinary linear regression; SDR, simple Deming regression; and IRGDR, iteratively reweighted general Deming regression.