Technical Briefs
Departments of 1 Pathology and Laboratory Medicine and Medical Biophysics, 2 Medical Oncology, and 5 Radiology, and 3 Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC, Canada;
4 Department of Radiology, University of British Columbia, Vancouver, BC, Canada;
6 Laboratory of Proteomics and Analytical Technologies, SAIC, Inc., Frederick, MD;
a Address correspondence to this author at: Department of Medical Biophysics, British Columbia Cancer Research Centre, 675 West 10th Ave., Vancouver, BC, Canada V5Z 1L3; fax 604-675-8049, e-mail akarsan{at}bccrc.ca
Currently available serum tumor markers lack sufficient specificity and sensitivity as stand-alone diagnostic or screening tests (1). Nevertheless, these assays are used extensively because of a lack of better alternatives. To accelerate the discovery of tumor markers for diagnosis and/or prognosis, there has been great enthusiasm for using mass spectrometry (MS)-based testing of serum to identify potential biomarkers or spectral patterns that can act as fingerprints for specific diseases (1). Analysis of serum by surface-enhanced laser desorption/ionization time-of-flight (SELDI-TOF) MS was recently reported to be able to predict the existence of ovarian cancer without missing a single case (2). In this method, capture of serum proteins on a biochip by use of surface chemistry is followed by MS analysis of all captured proteins, and data mining software is used to identify a pattern of spectral peaks that will predict the presence or absence of the particular disease state in question (2). Although the seminal study using this technique has been criticized by statisticians, bioinformaticians, and clinical chemists because of the likelihood of systematic bias, there has been no empirical testing of the potential for systematic bias in SELDI-TOF serum analysis (1)(3)(4)(5)(6). This study was conducted to determine whether spectral patterns generated by SELDI-TOF MS could distinguish between patients with cancer and those with benign disease among women presenting with suspicious breast abnormalities on mammography or physical examination.
The study was approved by the Research Ethics Board at the University of British Columbia, and all participating patients provided informed consent. We prospectively recruited 136 consecutive consenting patients attending 3 different clinics from September 2002 to April 2004 for a core needle biopsy for histopathologic diagnosis of a suspicious breast lump. Of the 136 patients, 3 were lost to follow-up, and 1 sample was not frozen within 6 h. Of the remaining 132 patients, 63 came from clinic A, 64 from clinic B, and 5 from clinic C. All patients with a positive core biopsy for malignancy as well as a subset with a negative core biopsy had excisional biopsies. A total of 96 patients (72.7%) received a histopathologic diagnosis of breast cancer (ductal carcinoma in situ, n = 13; invasive ductal carcinoma, n = 78; lobular, tubulolobular, or mixed, n = 5).
Serum samples were collected before biopsy in 7-mL glass serum tubes with no additive (BD Vacutainer®), aliquoted, and frozen at -80 °C within 6 h of phlebotomy until used for SELDI-TOF analysis. A pilot study using chips with different surface chemistries led us to choose the immobilized metal affinity capture (IMAC3) chips with Cu(II) as the metal ion to use for this study because of good reproducibility of duplicate spectra and the generation of multiple features on the spectra for analysis (data not shown). IMAC3 chips from the same lot were used for all samples run to avoid lot-to-lot variability. Chips were charged with 100 mmol/L CuSO4, fixed with 100 mmol/L sodium acetate (pH 4.0), and equilibrated with phosphate-buffered saline (PBS), pH 7.4. Serum samples (50 µL) were diluted 1:1 in 8 mol/L urea containing 10 mL/L CHAPS and after vortex-mixing were further diluted 1:5 in PBS. Diluted serum samples (100 µL) were then applied to IMAC3 chips in duplicate, on spots on different chips. After washes in PBS and water, chips were air-dried, and 1 µL of saturated sinapinic acid in 500 mL/L acetonitrile with 5 mL/L trifluoroacetic acid was applied to each spot. The 132 serum samples were prepared and spotted in duplicate on 3 consecutive days, and the spotted arrays were read on a PBS II ProteinChip reader (Ciphergen Systems) on 2 consecutive days (one-third on the first day and the remaining two-thirds on the following day).
Spectra were calibrated externally and analyzed by mapping the raw (nonfiltered, nonbaseline-subtracted) TOF spectra to mass spectra consisting of 16 384 channels, with mass calibration given by m/z = aC², where C is the channel number and a = 0.0001 m/z, and normalized to the same total area. An automated procedure to find and fit the peaks in the mass spectra has been developed by one of the authors and is freely available (sflibotte{at}bcgsc.ca). This procedure generates an average spectrum from all samples, which is divided into several sections by a heuristic approach to obtain the best possible fit. Each section is then fitted iteratively with the appropriate number of Gaussian peaks superimposed on a locally quadratic background. The duplicate pairs of spectra from each specimen in the dataset were averaged and fitted, with each section from the average spectrum used as a template. Duplicate spectra from individual samples showed a high degree of reproducibility as demonstrated by a median Pearson correlation coefficient of 0.9704 for all pairs of spectra evaluated. The Pearson correlation coefficients were calculated in the mass region used in the fitting procedure (m/z 533 to 26 840). Examples of duplicate raw spectra are shown in Fig. 1 of the Data Supplement that accompanies the online version of this Technical Brief at http://www.clinchem.org/content/vol51/issue8/. The position, width, and height of the peaks, and the local background were all fitted at the same time; this procedure corrects for local gain, matching variations between spectra because the absolute position of each peak is free to vary, although the relative position of each peak is fixed. A total of 445 peaks were fitted for each spectrum.
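The calibration, normalization, and duplicate-reproducibility steps described above can be illustrated with a minimal sketch. It is not the authors' fitting procedure: the function names are hypothetical, "total area" is approximated here as the simple sum of channel intensities, and the short intensity lists are toy values rather than study data.

```python
import math

A = 0.0001          # calibration constant a quoted in the text
N_CHANNELS = 16384  # number of channels quoted in the text


def channel_to_mz(channel, a=A):
    """Quadratic TOF calibration from the text: m/z = a * C^2."""
    return a * channel ** 2


def normalize_total_area(intensities):
    """Scale a spectrum so its intensities sum to 1.

    Assumption: total area is taken as the plain sum over channels.
    """
    total = sum(intensities)
    if total == 0:
        raise ValueError("spectrum has zero total area")
    return [value / total for value in intensities]


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length spectra."""
    if len(x) != len(y):
        raise ValueError("spectra must have the same number of channels")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    var_y = sum((yi - mean_y) ** 2 for yi in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant spectrum")
    return cov / math.sqrt(var_x * var_y)


# The last channel maps to the upper end of the mass range.
top_mz = channel_to_mz(N_CHANNELS)

# Toy duplicate spectra (illustrative values only, not study data).
duplicate_1 = normalize_total_area([1.0, 4.0, 9.0, 4.0, 1.0])
duplicate_2 = normalize_total_area([1.2, 3.8, 9.1, 4.1, 0.9])
r = pearson(duplicate_1, duplicate_2)
```

With a = 0.0001, the last channel maps to m/z of about 26 843.5, which is consistent with the upper limit of the fitted mass region quoted above (m/z 26 840).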
Two machine-learning algorithms, a support vector machine (SVM) and C4.5, were used in various analyses using all 445 peaks as described below. SVMs perform well in situations in which the number of samples in the dataset is not large compared with the number of attributes, i.e., peak areas in this case, and have been used successfully in microarray and SELDI-TOF experiments (7)(8). The C4.5 algorithm, a decision tree algorithm, is also a widely used machine learning algorithm applied in many settings (8). A 10-fold cross-validation was performed 10 times to assess each classification scheme. In other words, the dataset was divided into 10 equal groups, 9 of which were used to build a classifier and predict the classes of the samples in the remaining group. All 10 groups were assessed in this way. This 10-fold cross-validation procedure was repeated 9 more times with a breakdown of the dataset into 10 different but random groups. A majority predictor, which simply predicts the majority class in the dataset, was used as a comparator. Classification accuracy with means, SDs, and probability values to assess significant differences from the majority predictor (using a Student t-test) were generated by the Weka machine learning software (8).
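The repeated cross-validation scheme and the majority-predictor comparator can be sketched as follows. This is a standard-library illustration under stated assumptions, not the Weka implementation used in the study: the helper names are hypothetical, the folds are plain random partitions, and the class counts (96 cancer, with the remaining 36 of 132 treated as benign) are derived from the patient numbers given earlier.

```python
import random
from collections import Counter


def majority_predictor(train_labels):
    """Return the most common class among the training labels."""
    return Counter(train_labels).most_common(1)[0][0]


def k_fold_indices(n_samples, k, rng):
    """Shuffle sample indices and split them into k near-equal folds."""
    indices = list(range(n_samples))
    rng.shuffle(indices)
    return [indices[i::k] for i in range(k)]


def repeated_cv_accuracy(labels, fit_predict, k=10, repeats=10, seed=0):
    """Return one overall accuracy per repeat of k-fold cross-validation.

    fit_predict(train_idx, test_idx) must return one predicted label
    for each index in test_idx.
    """
    rng = random.Random(seed)
    accuracies = []
    for _ in range(repeats):
        folds = k_fold_indices(len(labels), k, rng)  # new random split
        correct = 0
        for held_out in range(k):
            test_idx = folds[held_out]
            train_idx = [i for f, fold in enumerate(folds)
                         if f != held_out for i in fold]
            predictions = fit_predict(train_idx, test_idx)
            correct += sum(p == labels[i]
                           for p, i in zip(predictions, test_idx))
        accuracies.append(correct / len(labels))
    return accuracies


# Assumed class counts: 96 cancer, 36 benign (132 patients in total).
labels = ["cancer"] * 96 + ["benign"] * 36


def majority_fit_predict(train_idx, test_idx):
    guess = majority_predictor([labels[i] for i in train_idx])
    return [guess] * len(test_idx)


baseline = repeated_cv_accuracy(labels, majority_fit_predict)
```

Under these assumptions the majority predictor always guesses "cancer", so every repeat scores 96/132 (about 72.7%). A classifier such as an SVM or a decision tree would be plugged in through `fit_predict`, and its 10 accuracies compared against this baseline.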
Our findings demonstrate that specimen collection and processing introduce significant biases in the spectral pattern, such that machine learning algorithms can distinguish the sample source, the day on which the chips were prepared, and the day on which they were read. In contrast, accuracy of predicting cancer was much poorer.
As demonstrated in Table 1, neither machine learning algorithm was able to classify patients with breast cancer any better than the majority predictor. Two previous studies using IMAC3 chips identified 2 different sets of peaks that were able to classify patients with breast cancer (9)(10). Attempts to classify the spectra by use of these published peaks were also unsuccessful. We then eliminated every sample in which the duplicate spectra could not be overlaid by visual inspection. Both algorithms performed slightly better than the majority predictor in classifying cancer in this reduced subset of 70 patients, but the results were not statistically significant. To reduce possible source-related biases, we next analyzed specimens that showed reproducible spectra but came from only one clinic. However, the use of samples from only one clinic did not improve classification accuracy by either machine learning algorithm (Table 1).
In contrast to the lack of predictive ability of the spectral patterns for the diagnosis of breast cancer, both machine learning algorithms demonstrated an excellent ability to predict on which day the chips were read and on which day they were prepared, although the second variable may be a function of the first (Table 1). Even more surprisingly, there were distinct spectral features that the algorithms successfully applied to classifying the clinics from which the samples were acquired (Table 1). Fig. 1 shows a dot plot demonstrating that, using only 2 peaks at m/z 2992 and 5643, the C4.5 algorithm was able to distinguish between samples obtained in clinic A or B.
The very high probability values assigned to the classifications of distinct analytical and preanalytical variables suggest that in previous reports there may have been inadvertent biases in sample collection, storage, or processing between patients from different groups being tested. There are several potential reasons for the analytical and preanalytical biases seen, but the specific mechanisms remain to be elucidated. The findings presented here empirically validate the concerns of clinical chemists and bioinformaticians that serum profiling of unfractionated serum may be detecting preanalytical and analytical variables that are not reflective of the disease state (3)(4)(11). The inability to use previously published peaks to classify breast cancer patients in this study also suggests that there are likely site-specific findings that reflect analytical and preanalytical biases for SELDI-TOF MS. As has been pointed out, it can be difficult to obtain stable, reproducible SELDI-TOF MS results over time and across laboratories (12). Although a recent study has shown some reproducibility across laboratories, only 28 optimal spectra out of a cohort of more than 1000 were used for the validation, and whether the reproducible peaks actually represent cancer biomarkers or artifacts was not addressed (13)(14). Recent critiques have argued that the two main potential problems with observational studies arise from chance and bias (15)(16)(17). The current study highlights the effects of bias; thus, future studies attempting to profile serum by proteomic approaches will have to take extreme care in specimen handling and storage, as well as in randomization of specimen preparation and spectrum collection times, to discover true disease-related spectral profiles.
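One way to act on the recommendation to randomize specimen preparation and spectrum collection is to assign samples to processing batches at random while spreading each group evenly across batches. The following is a minimal sketch of that idea, not a procedure used in this study: `randomize_batches` and the sample identifiers are hypothetical, and only the clinic counts (63, 64, and 5) are taken from the text.

```python
import random
from collections import defaultdict


def randomize_batches(sample_groups, n_batches, seed=0):
    """Assign samples to batches, spreading each group across batches.

    sample_groups maps a sample id to a group label (for example the
    clinic or the diagnosis). Returns a dict of sample id -> batch index.
    """
    rng = random.Random(seed)
    by_group = defaultdict(list)
    for sample_id, group in sample_groups.items():
        by_group[group].append(sample_id)

    assignment = {}
    position = 0  # running counter keeps overall batch sizes balanced
    for group in sorted(by_group):
        members = by_group[group]
        rng.shuffle(members)  # random order within each group
        for sample_id in members:
            assignment[sample_id] = position % n_batches
            position += 1
    return assignment


# Hypothetical sample ids; clinic sizes follow the counts in the text.
samples = {}
samples.update({f"A{i}": "A" for i in range(63)})
samples.update({f"B{i}": "B" for i in range(64)})
samples.update({f"C{i}": "C" for i in range(5)})

# Three batches, e.g. three preparation days.
plan = randomize_batches(samples, n_batches=3)
```

Because each clinic contributes nearly equal numbers of samples to every batch, a classifier can no longer use batch-related spectral differences as a proxy for clinic. The same function could be applied a second time, with a different seed, to assign reading days.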
Acknowledgments
We thank Ingrid Pollet and Fred Wong for assistance with serum sample freezing. This study was funded by grants to A.K. from the National Cancer Institute of Canada with funds from the Canadian Cancer Society and to A.K. and K.G. from the Canadian Breast Cancer Foundation (BC Chapter), and in part with US funds from the National Cancer Institute, National Institutes of Health, under Contract NO1-CO-12400 to T.V., T.P.C., and Z.X. A.K. is supported by a personnel award from the Heart and Stroke Foundation of Canada and a Scholarship from the Michael Smith Foundation for Health Research.