Proteomics and Protein Markers
1 PerkinElmer Life and Analytical Sciences, Wellesley, MA.
2 Center for Applied Proteomics and Molecular Medicine, George Mason University, Manassas, VA.
3 Clinical Proteomics Reference Laboratory, Gaithersburg, MD.
4 Department of Pathology, Division of Translational Pathology, University of Texas Southwestern Medical Center, Dallas, TX.
5 Nonlinear Dynamics, Cuthbert House, All Saints, Newcastle upon Tyne, United Kingdom.
6 Harvard Partners, Cambridge, MA.
7 New York University Medical School, Division of Obstetrics and Gynecology, New York, NY.
a Address correspondence to this author at: PerkinElmer Life and Analytical Sciences, 45 Williams St., Wellesley, MA 02481-4008. Fax 617-574-9864; e-mail mary.lopez{at}perkinelmer.com.
| Abstract |
|---|
15%, but 5-year survival approaches 90% when the cancer is detected early (stage I). To use mass spectrometry (MS) of serum proteins for early detection, a seamless workflow is needed that allows rapid profiling along with direct identification of the underpinning ions.
Methods: We used carrier protein-bound affinity enrichment of serum samples directly coupled with MALDI orthogonal TOF MS profiling to rapidly search for potential ion signatures that contained discriminatory power. These ions were then directly subjected to tandem MS for sequence identification.
Results: We discovered several biomarker panels that enabled differentiation of patients with stage I ovarian cancer from unaffected (age-matched) individuals with no evidence of ovarian cancer, with positive results in >93% of samples from patients with disease and negative results in 97% of disease-free controls. The carrier protein-based approach identified additional protein fragments, many from low-abundance proteins or proteins not previously seen in serum.
Conclusions: This workflow system using a highly reproducible, high-resolution MALDI-TOF platform enables rapid enrichment and profiling of large numbers of clinical samples for discovery of ion signatures and integration of direct sequencing and identification of the ions without need for additional offline, time-consuming purification strategies.
| Introduction |
|---|
15% (3). The 5-year survival approaches 90%, however, if cancer is detected when confined to the ovary (stage I) (3). More sensitive and specific tests for earlier detection may improve patient survival rates by facilitating early treatments such as surgical intervention (4)(5)(6)(7)(8). Recently, new methods of disease detection based on discriminant mass spectral analysis (serum pattern profiling) have been proposed (9)(10)(11)(12)(13)(14). The power of this approach is 4-fold: (a) it is unbiased and does not presuppose any particular disease mechanism, (b) multiple differences, i.e., multiple putative disease markers, are often discovered and combinations of markers are likely to be more powerful discriminators than single markers, (c) large numbers of samples (usually blood sera) from appropriate cohorts can be analyzed quickly for discovery and subsequent validation of putative marker sets, and (d) the method does not require a priori antibody development for success.
Serum peptide pattern profiling studies have shown promise, especially in clinical research (13)(14). The approach has, however, generated controversy centered primarily on 2 issues: the relative importance of obtaining definitive sequence identification for differentiating masses and the likelihood that peptides or protein fragments, as opposed to intact proteins, can be useful as disease biomarkers (14)(15)(16)(17). The clinical relevance of the serum peptidome has been vigorously debated, but recent publications have confirmed that specific protein fragments are correlated with disease stages (18)(19). A major area of controversy has been the lack of data consistency and reproducibility across the various published studies (17), although with proper attention to stringent experimental design and protocols, these issues can be addressed in future studies (20). High resolution and mass accuracy are required for accurate comparison of mass spectral data across a broad mass range. Sample preparation is another area of critical importance. Reduction of sample complexity is an essential first step for all bloodborne biomarker discovery methods because of the large dynamic range of protein concentrations (21). Low-throughput methods such as 2-dimensional gels coupled with mass spectrometry (MS) or shotgun proteomics have been used to mine the serum proteome for biomarkers (22)(23)(24)(25), but these methods have limited clinical value because they do not allow the simultaneous screening of adequately large numbers of samples. Numerous depletion strategies have been developed to remove the more common, abundant proteins in sera or plasma (26). However, many of these proteins, in part because of their high abundance in the blood, act as carrier proteins and bind a vast assortment of peptides and protein fragments (27).
With an albumin blood concentration >600 µmol/L, the probability is >98% that even molecules with relatively low binding affinities will be complexed with albumin (19). These carrier protein-bound protein fragments and peptides may provide potential diagnostic information for many diseases (18)(28)(29). Peptides and protein fragments from proteins catabolized by proteolytic cascades in tissues diffuse into the blood and are bound to highly abundant proteins such as albumin. This process effectively prolongs the bloodstream half-life of low molecular weight peptides and protein fragments that otherwise would be eliminated by the kidneys. The collection of peptides and protein fragments bound to carrier proteins therefore provides a metabolic snapshot of diseased and normal tissues, and the low molecular weight peptide archive can be viewed as a direct reflection of the ongoing pathophysiology that can facilitate a true systems biology approach to biomarker discovery. Thus, adsorption to albumin serendipitously provides an endogenous and very efficient enrichment process for rare or low-abundance protein fragments. Unfortunately, many potentially interesting biomarkers are likely to be inadvertently eliminated during commonly used strategies for the removal of albumin and other highly abundant proteins in blood serum, and sample fractionation protocols that omit the capture of carrier protein-bound peptides may yield preparations consisting almost exclusively of peptides derived from blood coagulation cascades (30)(31).
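The >98% figure quoted above can be illustrated with a simple one-site equilibrium binding model. The sketch below is an assumption made for illustration, not the calculation used in reference (19): it treats albumin as being in large excess over the ligand, so that the bound fraction is [albumin] / ([albumin] + Kd). The function names and the example Kd values are hypothetical.

```python
def fraction_bound(albumin_umol_per_l: float, kd_umol_per_l: float) -> float:
    """Fraction of a ligand complexed with albumin under a one-site model.

    Assumes albumin is in large excess, so free albumin ~ total albumin.
    """
    if albumin_umol_per_l < 0 or kd_umol_per_l <= 0:
        raise ValueError("concentration must be >= 0 and Kd must be > 0")
    return albumin_umol_per_l / (albumin_umol_per_l + kd_umol_per_l)


def max_kd_for_fraction(albumin_umol_per_l: float, fraction: float) -> float:
    """Largest Kd (weakest affinity) that still gives the requested bound fraction."""
    if not 0 < fraction < 1:
        raise ValueError("fraction must be strictly between 0 and 1")
    return albumin_umol_per_l * (1 - fraction) / fraction


if __name__ == "__main__":
    albumin = 600.0  # µmol/L, the serum concentration quoted in the text
    for kd in (1.0, 10.0, 100.0):  # hypothetical dissociation constants, µmol/L
        print(f"Kd = {kd:6.1f} µmol/L -> {fraction_bound(albumin, kd):.1%} bound")
    print(f"Kd giving 98% bound: {max_kd_for_fraction(albumin, 0.98):.1f} µmol/L")
```

Under this simplified model, a ligand with a dissociation constant below about 12 µmol/L would be more than 98% bound at an albumin concentration of 600 µmol/L.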
To develop a new workflow for biomarker candidate identification, we evaluated high-throughput carrier protein-bound affinity enrichment of serum samples coupled with high-resolution MALDI orthogonal TOF (OTOF) MS, discriminant analysis of the resulting mass spectral patterns, and sequence identification of the discriminating ions to search for putative early protein/peptide biomarkers in ovarian cancer serum samples.
| Materials and Methods |
|---|
40 years for all groups. To minimize institutional bias attributable to sample handling techniques (15), all samples were collected and handled with the same standard operating procedures. Ovarian cancer samples were obtained from patients before therapy and surgery; all patients were surgically staged. There was a predominance of serous cases (54.9%) with 7.1% clear cell and the remaining 38.0% listed only as adenocarcinoma. Each sample was accompanied by a verified pathologic diagnosis.
All serum samples were processed from blood drawn under strict National Cancer Institute/US Food and Drug Administration Proteomics Program standard operating guidelines as follows: Specimens were collected in red-top Vacutainer tubes and allowed to clot for 1 h on ice, followed by centrifugation at 4 °C for 10 min at 2000g. The serum supernatant was divided into aliquots and stored at −80 °C until needed. The serum samples were assayed for CA125 by use of the Elecsys CA125 reagent set (Roche Pharmaceuticals). Samples with a CA125 concentration >22 kIU/L were classified as high CA125 and those with a concentration <22 kIU/L as low CA125. We selected 22 kIU/L because it was the mean of the samples in the study; CA125 >35 kIU/L is generally considered to be the cutoff indicating likely disease recurrence. CA125 was below the limit of detection in the healthy controls.
sample processing
Samples from cancer patients and healthy controls were processed in a random order to account for any systematic errors and variations from experiment to experiment. Locations of all samples were random on each MALDI plate.
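The randomization described above can be sketched as follows. This is a minimal illustration, not the procedure actually used in the study: the function name, the sample labels, the 96-position plate size, and the seed are all assumptions chosen for the example.

```python
import random


def randomize_plate(sample_ids, n_positions=96, seed=None):
    """Assign each sample to a distinct, randomly chosen plate position.

    Returns a dict mapping position index (0-based) to sample ID.
    """
    sample_ids = list(sample_ids)
    if len(sample_ids) > n_positions:
        raise ValueError("more samples than plate positions")
    rng = random.Random(seed)  # a fixed seed makes the layout reproducible
    positions = rng.sample(range(n_positions), len(sample_ids))
    return dict(zip(positions, sample_ids))


if __name__ == "__main__":
    # Hypothetical sample labels; real IDs would come from the study records.
    samples = [f"cancer_{i:02d}" for i in range(1, 7)] + [
        f"healthy_{i:02d}" for i in range(1, 7)
    ]
    layout = randomize_plate(samples, n_positions=96, seed=2007)
    for pos in sorted(layout):
        row, col = divmod(pos, 12)  # 8 rows x 12 columns for a 96-position plate
        print(f"{'ABCDEFGH'[row]}{col + 1:02d}: {layout[pos]}")
```

Recording the seed alongside the plate map lets the same layout be regenerated later, which can help when tracing a position-dependent artifact.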
Serum samples were processed using prototype ProXPRESSION™ biomarker enrichment reagent sets (PerkinElmer) as described in (28). The Cibacron blue dye affinity chromatography-based technology is designed to capture high-abundance carrier proteins in blood (such as albumin) and dramatically enrich the peptides and protein fragments bound to the carrier proteins. ZipPlates™ and a vacuum manifold were purchased from Millipore. Millipore also provided custom-fitting adapters for direct spotting of samples on single-use MALDIchip™ prOTOF Target plates (PerkinElmer). Premixed alpha-cyano-4-hydroxycinnamic acid matrix was from PerkinElmer.
maldi-otof ms
Mass spectra were acquired on a prOTOF™ 2000 MALDI-OTOF Mass Spectrometer interfaced with TOFWorks™ software (PerkinElmer/SCIEX). Because of the orthogonal design, a single external mass calibrant was used to achieve better than 5 ppm mass accuracy over an entire sample plate (up to 384 samples). In this study, a 2-point external calibration of the prOTOF instrument was performed before acquiring the spectra in a batch mode from 96 samples. MALDI-OTOF-MS can collect data over a wide range of mass values (300 kDa) in a single acquisition. Typical resolution for peptides and proteins up to 10 kDa was >12 000 full width at half maximum.
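For readers less familiar with the two instrument figures quoted above, the following sketch shows how mass accuracy in ppm and the peak width implied by a given resolution are conventionally calculated. The function names and the example measurement are hypothetical; only the 12 000 resolution value is taken from the text.

```python
def ppm_error(measured_mz: float, theoretical_mz: float) -> float:
    """Signed mass error in parts per million."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


def fwhm_for_resolution(mz: float, resolution: float) -> float:
    """Peak width (full width at half maximum) implied by R = m / delta_m."""
    return mz / resolution


if __name__ == "__main__":
    # Hypothetical measurement of a peak whose theoretical m/z is 1739.900.
    print(f"{ppm_error(1739.905, 1739.900):.2f} ppm")
    # Width of a peak at m/z 2582.35 if the resolution is 12 000 (FWHM).
    print(f"{fwhm_for_resolution(2582.35, 12000):.3f} m/z units")
```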
The raw mass spectral data used in this study are accessible without restriction upon request to the corresponding authors.
ultrahigh resolution tandem ms
Pools (6) of either ovarian cancer samples or healthy samples were dissolved in 50 µL of 5% acetonitrile 0.1% formic acid/water, transferred to an MS plate, and lyophilized. Samples in 5% acetonitrile 0.1% formic acid were injected with a Famos Autosampler onto a 75 µm x 18 cm fused silica capillary column packed with C18 or C8 media, in a 250-nL/min gradient of 5% acetonitrile 0.1% formic acid to 50% acetonitrile 0.1% formic acid over the course of 100 min with a total run length of 150 min. For the 240-min runs, 25-cm columns were used to achieve higher chromatographic resolution and loading capacity. The LTQ-FT ultrahybrid mass spectrometer (Thermo Fisher Scientific) was run in a top 4 configuration at 200 K resolution for a full scan. Ions that were +1 or undefined in charge state were rejected for MS2 analysis. Dynamic exclusion was set to 1 with a limit of 180 s, with early expiration set to 6 full scans. Peptide identification was performed with Sequest through the Bioworks Browser 3.2 EF2 (Thermo Scientific). Database searches were made with a no-enzyme indexed version of the National Center for Biotechnology Information RefSeqhuman/reversed Refseqhuman database using differential oxidized methionines at a tolerance of 10 ppm. Peptide score cutoff values were chosen at Xcorr of 1.8 for singly charged ions, 2.0 for doubly charged ions, and 2.5 for triply charged ions, along with deltaCN values of 0.1, and rank score preliminary values of <10 with a peptide P value of 1e-3 or better. The small mass tolerance of the search ensured that only relevant peptides were matched. The cross-correlation values chosen for each peptide ensured a high-confidence match for the different charge states, and the deltaCN cutoff ensured the uniqueness of the peptide hit. The P value is the probability that a peptide hit is a random match. Typically, multiple peptide hits were obtained for any identified protein; for example, more than 28 separate hits were obtained for plasma kallikrein-sensitive glycoprotein (data not shown). However, a number of proteins were identified by single peptide hits.
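The score cutoffs listed above amount to a per-charge filter on the search results, which might be expressed as in the sketch below. This is an illustration only and does not reproduce Sequest or Bioworks behavior. The `PeptideHit` record, its field names, and the helper functions are hypothetical, and several details are assumptions: that each cutoff is inclusive, that deltaCN must be at least 0.1, that charge states above +3 are rejected, and that matches to the reversed database are flagged as decoys.

```python
from dataclasses import dataclass

# Minimum Xcorr by precursor charge, as quoted in the text.
XCORR_CUTOFFS = {1: 1.8, 2: 2.0, 3: 2.5}
DELTA_CN_CUTOFF = 0.1   # assumed to be a minimum (inclusive)
MAX_RSP = 10            # preliminary rank score must be below this value
MAX_P_VALUE = 1e-3      # "1e-3 or better"
PPM_TOLERANCE = 10.0    # precursor mass tolerance used in the search


@dataclass
class PeptideHit:
    sequence: str
    charge: int
    xcorr: float
    delta_cn: float
    rsp: int
    p_value: float
    ppm_error: float
    is_decoy: bool = False  # True if the match came from the reversed database


def passes_filters(hit: PeptideHit) -> bool:
    """Apply the score cutoffs; charges without a listed cutoff are rejected."""
    min_xcorr = XCORR_CUTOFFS.get(hit.charge)
    if min_xcorr is None:
        return False
    return (
        hit.xcorr >= min_xcorr
        and hit.delta_cn >= DELTA_CN_CUTOFF
        and hit.rsp < MAX_RSP
        and hit.p_value <= MAX_P_VALUE
        and abs(hit.ppm_error) <= PPM_TOLERANCE
    )


def decoy_fraction(hits) -> float:
    """Share of accepted hits that are decoys; a rough false-match indicator."""
    accepted = [h for h in hits if passes_filters(h)]
    if not accepted:
        return 0.0
    return sum(h.is_decoy for h in accepted) / len(accepted)
```

Searching a concatenated forward/reversed database, as described in the text, makes a check like `decoy_fraction` possible: accepted matches to reversed sequences give a rough indication of how many forward matches may be spurious.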
processing and analysis of spectral profiles
Progenesis PG600™ software (NonLinear Dynamics) was used to process and analyze the OTOF mass spectral data. Raw spectra from the OTOF were loaded directly into the PG600 program using the prOTOF loader program. Binning was set at 4. Analyses were performed to find discriminant markers between the following groups: healthy vs all cancer, healthy vs high CA125, healthy vs low CA125, and healthy vs stage I cancer. For the initial analysis, the stringency parameters for biomarker selection were set to include peaks with a mean quantity threshold of 75 (higher intensity peaks, to facilitate subsequent sequence identification by tandem MS) and a P value of 0.01 or better. Subsequent analyses were performed with a peak intensity stringency of 50 or 0 and a P value of 0.01 or better. These parameters ensured the detection of differentially expressed peaks that were highly significant. Once the putative peaks were detected, classification models were developed using flexible discriminant analysis and the R statistical package (32). The flexible discriminant analysis algorithm determined nonlinear decision boundaries that were better able to separate classes, resulting in a classification technique that was more powerful for high-dimensional data with complex interrelationships. We used independent stratified balanced random sampling to split the data into a training set and a test set. The training set was used to build a classifier model, and this model was evaluated on the test set. The classifier classified test cases as healthy or diseased, and these data were then used to create ROC curves. Monte Carlo cross-validation of training and test sets was used (33)(34). For this validation, the results of 100 runs of the sampling and classifier modeling procedure were averaged together to create the final ROC curve.
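The resampling scheme described above (repeated stratified, balanced train/test splits with results averaged over runs) can be sketched in a few lines. The example below is an illustration under stated assumptions, not the authors' implementation: it substitutes a simple nearest-centroid score for flexible discriminant analysis, it averages the area under the ROC curve rather than the curves themselves, it interprets "balanced" as equal numbers of training samples per class, and it runs on synthetic data. All function names are hypothetical.

```python
import random
from statistics import mean


def stratified_balanced_split(labels, train_fraction, rng):
    """Return (train_idx, test_idx) with equal numbers per class in training."""
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)
    n_train = int(train_fraction * min(len(v) for v in by_class.values()))
    if n_train < 1:
        raise ValueError("train_fraction too small for the smallest class")
    train, test = [], []
    for idx in by_class.values():
        shuffled = idx[:]
        rng.shuffle(shuffled)
        train += shuffled[:n_train]
        test += shuffled[n_train:]
    return train, test


def centroid(rows):
    """Column-wise mean of a list of feature vectors."""
    return [mean(col) for col in zip(*rows)]


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count one half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def monte_carlo_cv(X, y, n_runs=100, train_fraction=0.5, seed=0):
    """Average test-set AUC over repeated random train/test splits."""
    rng = random.Random(seed)
    aucs = []
    for _ in range(n_runs):
        train, test = stratified_balanced_split(y, train_fraction, rng)
        c1 = centroid([X[i] for i in train if y[i] == 1])
        c0 = centroid([X[i] for i in train if y[i] == 0])
        # Higher score = closer to the disease centroid than to the healthy one.
        scores = [sq_dist(X[i], c0) - sq_dist(X[i], c1) for i in test]
        aucs.append(auc(scores, [y[i] for i in test]))
    return mean(aucs)


if __name__ == "__main__":
    # Synthetic peak intensities (2 peaks per sample); not data from the study.
    rng = random.Random(1)
    X = [[rng.gauss(1.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(40)]
    X += [[rng.gauss(3.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(40)]
    y = [0] * 40 + [1] * 40
    print(f"mean AUC over 100 splits: {monte_carlo_cv(X, y):.3f}")
```

Because every run draws a fresh split, each sample appears in the test set of some runs and the training set of others, so the averaged figure is less dependent on any single partition than a one-off train/test split would be.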
| Results |
|---|
analysis and identification of discriminant peptide masses and putative biomarkers
An initial set of 9 discriminating peptides resulted from the first analysis, which compared the spectral profiles in 4 group comparisons: healthy vs all cancer (low CA125 + high CA125), healthy vs low CA125, healthy vs high CA125, and healthy vs stage I cancer (Table 2). The identities of these peptides included multiple peptide hits from complement component 3 and inter-alpha (globulin) inhibitor H4, and single peptides from complement component 4A, transthyretin, and fibrinogen (Table 3). Because the analysis parameters were purposefully set to screen out low-intensity masses (to improve our success rate with sequence identification), it is not surprising that these fragments are derived from relatively abundant serum proteins. Although these peptides were derived from highly abundant resident proteins, they are not necessarily nonspecific or themselves highly abundant. Examples of overlaid disease and healthy spectra are shown in Fig. 2; putative marker expression ranged from an approximately 3.6-fold increase to a 2.6-fold decrease in cancer samples (Table 2). To test the discriminating power of the 9-marker set, a flexible discriminant analysis classification model was built and tested. The discriminating power of the 9-marker model was good: 93% of samples from patients with disease were positive and 93% of samples from disease-free controls were negative (Table 2).
To expand the analysis, we reduced the stringency parameter slightly to allow the inclusion of lower intensity discriminating masses. This analysis yielded the set of 4 markers shown in Tables 2 and 3. Two of the peptides, at m/z 1739.9 and 2582.35, were also in the initial set of 9, and the remaining 2 masses, at m/z 2659.27 and 2989.49, remained unidentified. The 4-marker model delivered equivalent diagnostic sensitivity (93%) and better specificity (97%) than the 9-marker model (Table 2). In an effort to discover lower abundance discriminating peptides, or peptides not related to coagulation, we lowered the intensity stringency of the analysis to 0, keeping the P values at 0.01 or better. Five additional discriminating masses resulted from this analysis (Tables 2 and 3). None of the identified proteins in this set is related to coagulation, and all are correlated with or involved in cellular oncogenesis [casein kinase 2, transgelin (35)(36)(37)], proliferation [keratin 2, LARGE (38)(39)], or detoxification of ROS [diamine oxidase (40)]. A model created with these 5 markers plus 2 additional markers from the 4-marker set classified the healthy and low CA125 samples with 77% sensitivity and 85% specificity (Table 2, 7-marker model).
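The sensitivity and specificity figures reported for the marker models follow the usual definitions: the fraction of samples from patients with disease that test positive, and the fraction of samples from disease-free controls that test negative. A minimal sketch of the calculation is given below; the function name and the class labels are hypothetical, and no values from the study are reproduced.

```python
def sensitivity_specificity(true_labels, predicted_labels, positive="cancer"):
    """Return (sensitivity, specificity) for a two-class result."""
    tp = fn = tn = fp = 0
    for truth, pred in zip(true_labels, predicted_labels):
        if truth == positive:
            if pred == positive:
                tp += 1  # diseased sample correctly called positive
            else:
                fn += 1  # diseased sample missed
        else:
            if pred == positive:
                fp += 1  # disease-free sample wrongly called positive
            else:
                tn += 1  # disease-free sample correctly called negative
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one diseased and one disease-free sample")
    return tp / (tp + fn), tn / (tn + fp)


if __name__ == "__main__":
    # Hypothetical classifier output for 4 cancer and 4 healthy samples.
    truth = ["cancer"] * 4 + ["healthy"] * 4
    calls = ["cancer", "cancer", "cancer", "healthy",
             "healthy", "healthy", "healthy", "cancer"]
    sens, spec = sensitivity_specificity(truth, calls)
    print(f"sensitivity = {sens:.0%}, specificity = {spec:.0%}")
```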
additional protein fragments detected in cancer sera
In addition to the discriminating protein fragments identified above, additional sequence identities were obtained (162 total) for other peptides/proteins in the ovarian cancer serum samples (see Table 1 in the Data Supplement that accompanies the online version of this article at http://www.clinchem.org/content/vol53/issue6). Many of these fragments were not detected in the healthy sera. We could not be certain that the fragments were exclusive to the cancer sera, however, because we extensively sequenced peptides from only a limited number of pooled samples. Further sequencing experiments may determine the exclusivity of these peptides to disease or healthy samples and their potential as putative disease biomarkers. Many of the proteins from which these peptides are derived are present in very low abundance or have not been identified previously in blood (41).
| Discussion |
|---|
The carrier protein-based approach yielded a number of discriminating peptides, and we built flexible discriminant models with multiple marker sets (9, 4, and 7 markers). These models enabled classification with high specificity and sensitivity of samples from cancer patients and healthy controls. Perhaps not surprisingly, peptide fragments associated with the coagulation cascade provided the highest classification power. Substantial evidence supports the association between activation of blood coagulation and progression of cancer. Recent studies from several laboratories have linked malignant transformation (oncogenesis), tumor angiogenesis, and metastasis to the generation of clotting intermediates, clotting or platelet function inhibitors, or fibrinolysis inhibitors (42). Many researchers have published putative markers for cancer, and ovarian cancer in particular, that are related to coagulation or inflammation (43). Interestingly, when we lowered the stringency of our analysis to allow the inclusion of very low intensity discriminating signals, we identified a further 5 peptides not related to either coagulation or inflammation pathways. Among these lower intensity peptides, casein kinase 2 is oncogenic and upregulated in tumors (35), and transgelin has been reported previously as a putative marker for ovarian and endometrial cancer in other studies using widely different discovery methods including LC-MS and cDNA-representational difference analysis (36)(37). The other 3, keratin 2, glycosyl transferase (LARGE), and diamine oxidase, are also associated with processes related to cancer (38)(39)(40).
In addition to the discriminating peptides described above, a rich trove of protein fragments, many from low-abundance proteins or proteins not previously seen in serum (41), was recovered from the ovarian cancer sera. These results are consistent with those reported by Lowenthal et al. (19). In this study, we identified a number of proteins associated with cellular proliferation, cancer, and cancer signaling pathways in the ovarian cancer samples. Although we could not be certain that these peptides/proteins were exclusive to the cancer sera, many of the peptides were not found in the pooled healthy serum samples. A sampling of these proteins and their functions is described in Table 2 of the online Data Supplement.
Several proteins identified in this study (e.g., transthyretin) have also been reported by others as putative biomarkers for ovarian cancer (10)(45). Interestingly, the transthyretin fragment reported herein has a different mass from that reported in a previous serum-based ovarian cancer study (10).
The proteins we identified are involved in cellular inflammation, differentiation, signaling, apoptosis, transcriptional regulation, and other regulatory mechanisms. It is remarkable that this rich variety of low-abundance species is so well represented in the fraction bound to serum albumin.
In summary, in a period of 2–3 weeks we identified 162 proteins from peptides and protein fragments bound to carrier proteins from ovarian cancer patient serum samples. Within this study, 3 sets of the discriminating carrier protein-bound fragments differentiated samples from patients with ovarian cancer from those of apparently healthy controls with sensitivities and specificities of up to 93% and 97%, respectively. These values compare very favorably with published mean sensitivities and specificities of 50% for CA125, the current gold standard biomarker for ovarian cancer (4). Thus, this new high-throughput, top-down approach to biomarker discovery provides a clear path for the rapid detection of potential markers for early disease detection.
| Acknowledgments |
|---|
Financial disclosures: M.F.L., A.M., S.K., E.G., W.F.P., C.L., A.J., and W.M. are employees of PerkinElmer Life and Analytical Sciences, the vendor of ProXPRESSIONS reagent sets and the prOTOF 2000 orthogonal MALDI-TOF mass spectrometer described in this manuscript. L.A.L. and E.F.P. have US government- and University-assigned patents that cover certain aspects of the technology discussed. No other financial interests declared.