Proteomics and Protein Markers
1 Translational Research Laboratory, Institute of Women's Health, University College London, London, United Kingdom.
2 Cancer Proteomics Group, Ludwig Institute for Cancer Research, London Branch, London, United Kingdom.
3 Department of Computer Science, Royal Holloway College, University of London, London, United Kingdom.
4 Ciphergen Biosystems, Inc., Fremont, CA.
a Address correspondence to this author at: Translational Research Laboratory, Institute of Women's Health, University College London, Huntley Street, London, WC1E 6DH, UK. Fax 44-207-6796334; e-mail jtimms{at}ludwig.ucl.ac.uk.
| Abstract |
|---|
Methods: To examine the effect of tube type, clotting time, transport/incubation time, temperature, and storage method on protein profiles, we used 6 different handling methods to collect sera from 25 healthy volunteers. We used a high-throughput, prefractionation strategy to generate anion-exchange fractions and examined their protein profiles on CM10, IMAC30-Cu, and H50 arrays by using surface-enhanced laser desorption/ionization time-of-flight mass spectrometry.
Results: Prolonged transport and incubation at room temperature generated low mass peaks, resulting in distinctions among the protocols. The most and least stringent methods gave the lowest overall peak variances, indicating that proteolysis in the latter may have been nearly complete. For samples transported on ice there was little effect of clotting time, storage method, or transit time. Certain proteins (TTR, ApoCI, and transferrin) were unaffected by handling, but others (ITIH4 and hemoglobin β) displayed significant variability.
Conclusions: Changes in preanalytical handling variables affect profiles of serum proteins, including proposed disease biomarkers. Proteomic analysis of samples from serum banks collected using less stringent protocols is applicable if all samples are handled identically.
| Introduction |
|---|
Several different methods based on mass spectrometry (MS)1 have been applied in the search for cancer biomarkers [reviewed in (1)(7)]. Surface-enhanced laser desorption/ionization time-of-flight (SELDI-TOF) MS (8)(9) has been used extensively for serum profiling. In this method, high-throughput mass profiling with laser desorption/ionization MS instrumentation is performed on sample proteins bound selectively to chip surfaces with different chemical properties. Spectral patterns are then compared across samples to find discriminating masses or changes in peak intensities. Initial enthusiasm about these new technologies has been somewhat tempered by questions on the robustness of class discriminating algorithms and method reproducibility (7)(10)(11). Increasing evidence that sample collection and processing can affect protein profiles and the ability to differentiate between disease and control samples has cast further doubt on the validity of some studies (12)(13). Transit time, storage conditions, clotting time, and tube type can all affect serum profiles, irrespective of true biological variation (14)(15)(16)(17). It is likely that such introduced differences are primarily driven by proteolysis, although other variables may contribute, such as agglutination or differential adhesion of serum polypeptides to tube walls. These findings have raised concerns about using samples for case-control studies from older collections, where samples were collected and transported for different times at ambient temperatures. Many of these collections are unique, with samples predating cancer diagnosis. One such collection is the United Kingdom Collaborative Trial of Ovarian Cancer Screening (UKCTOCS), in which 202 638 postmenopausal women from 13 centers in the UK were randomized to screening vs controls; this serum bank will eventually have 500 000 samples, including serial samples from 50 000 women (see www.ukctocs.org.uk). 
The collection protocol for this trial allows blood samples to stand on the clot for 24–56 h before processing. Thus, if proteomic technologies are used for biomarker discovery from such collections, it is imperative to compare samples collected and processed using these less stringent protocols with those collected in accordance with protocols that involve immediate transport on ice.
We examined the impact of diverse serum handling protocols on protein profiles observed by SELDI-TOF MS. We also sought to determine the least variable and most clinically feasible handling method for prospective serum collections for proteomic studies.
| Materials and Methods |
|---|
Sample preparation
Sample preparation details are provided as Supplemental Data. Briefly, samples were thawed and randomized, and triplicate 25 µL aliquots were placed into 96-well plates. After denaturation in urea and dilution, the samples at pH 9 were transferred to filter plates containing rehydrated QHyperD® F resin (Pall Corporation) and incubated with shaking. Unbound material was removed on a vacuum manifold as fraction 1 (FR1), and proteins were eluted in a step-wise fashion by decreasing pH (FR2, pH 7; FR3, pH 5; FR4, pH 4; and FR5, pH 3) with a final organic solvent elution (FR6). The 6 fractions were applied to CM10 (weak cation-exchange) and IMAC30-Cu (immobilized metal affinity capture) arrays in 96-sample bioprocessors (Ciphergen Biosystems). FR6 samples were also applied to H50 (hydrophobic) arrays. Chip preparation, sample application, and matrix application were performed according to the manufacturer's instructions. All liquid handling steps were performed on an Aquarius workstation (Tecan).
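The fraction and array layout described above can be enumerated in a few lines as a bookkeeping aid. The following Python sketch is illustrative only: the constant and function names are hypothetical and are not part of the study's software. It simply counts the fraction/array combinations implied by the text, which gives the 13 fraction/chip types referred to in the Results.

```python
from itertools import product

# Anion-exchange fractions as described in the text:
# FR1 = unbound material, FR2-FR5 = step-wise pH elutions, FR6 = organic solvent elution.
FRACTIONS = ["FR1", "FR2", "FR3", "FR4", "FR5", "FR6"]

# Arrays that received all 6 fractions.
ARRAYS_ALL_FRACTIONS = ["CM10", "IMAC30-Cu"]

# Array that received FR6 only.
ARRAYS_FR6_ONLY = ["H50"]


def fraction_array_combinations():
    """Return every (fraction, array) pair profiled in the study design."""
    combos = list(product(FRACTIONS, ARRAYS_ALL_FRACTIONS))
    combos += [("FR6", array) for array in ARRAYS_FR6_ONLY]
    return combos


if __name__ == "__main__":
    combos = fraction_array_combinations()
    for fraction, array in combos:
        print(fraction, array)
    print("total combinations:", len(combos))
```

Each combination was acquired in two mass ranges, so the number of spectra per sample aliquot is twice the number of combinations.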
SELDI-TOF MS data acquisition and processing
Details of SELDI-TOF MS data acquisition and processing are provided as Supplemental Data. Briefly, spectra were acquired on an externally calibrated ProteinChip® System Series 4000 instrument, using 2 laser intensities for acquisition of low (2.5–20 kDa) and high (20–200 kDa) mass range data. Spectra were processed (baseline subtraction, denoising, normalization, spectral alignment, and peak detection) with CiphergenExpress software, as described in Supplemental Data. Peak numbers were recorded for the different handling methods and fraction types. We examined the differences between the handling methods by principal components analysis (PCA), hierarchical clustering, and examination of P values, using mean peak intensities from triplicate samples. Median variances were also calculated as descriptors of trends for different collection/handling methods.
Peak identification
Proteins were enriched by liquid chromatography and ultrafiltration followed by SDS-PAGE. Peptides <5 kDa were identified by direct sequencing by tandem mass spectrometry (see below). Proteins >5 kDa were purified by SDS-PAGE and stained using a Colloidal Blue Staining Kit (Invitrogen). Selected bands were excised, and one quarter of each was extracted using 50% formic acid/25% acetonitrile/15% isopropanol/10% water and reanalyzed by SELDI-TOF MS to confirm matching of the stained band with the peak of interest. The remainder was in-gel digested with trypsin and analyzed by tandem MS using a Q-STAR® XL equipped with a PCI-1000 ProteinChip Interface (Ciphergen Biosystems). MS/MS spectra were submitted to the database mining tool Mascot (Version 2.1.2; Matrix Science) and searched against the updated SwissProt or NCBInr databases with the following search parameters: trypsin, allowing up to 2 missed cleavages (or semitrypsin if the trypsin search was not successful); peptide tolerance ±50 ppm; MS/MS tolerance ±0.3 Da; peptide charge +1. Peak identifications were also confirmed using data from previous publications (3)(18)(19)(20)(21)(22)(23)(24)(25)(26)(27)(28)(29)(30)(31). All are abundant, ubiquitously expressed serum proteins or proteolytic fragments.
Role of the sponsors
Ciphergen Biosystems participated in the study design, donation of reagents, interpretation of results, and writing of this article, but was not involved in donor selection. Approval of the paper by Ciphergen Biosystems before its submission was not required.
| Results |
|---|
We next conducted PCA to assess how samples and handling methods grouped together. Protocol 1 (GN) was the most distinctive method, with most volunteer samples grouping together and away from samples collected using the other methods (Fig. 1). This was true for all fraction/chip surface combinations, but only in the low (2.5–20 kDa) mass range. A likely explanation for this separation is the extended transport/storage time used in this protocol. In support of this, partial separation was observed between protocols 2b (GY; 3 h on ice) and 2e (WH; 3 h at RT), suggesting that temperature before centrifugation is a major factor influencing spectral patterns (data not shown). Importantly, there were no differences among the protocols, in either mass range, when samples were transported on ice for 3 vs 6 h (GY vs OR), when samples were clotted for 60 min vs 5 min (YE vs GY), or when samples were strawed vs aliquoted into cryovials for storage (GY vs CR) (data not shown).
Protocols were also compared based on P values. For this, the 160 most frequent preprocessed peaks were selected and a median intensity value (n = 3) was obtained for each study participant and protocol. To test the null hypothesis that there is no difference between protocols, P values were calculated for all 160 peaks using the Wilcoxon sign test. The Pmin (the minimum of the 160 P values) was then used to find a measure of agreement between pairs of protocols by calculating the corresponding "conservative" P value according to the formula min(n × Pmin, 1), where n = 160 and "min" denotes the smaller of the 2 numbers n × Pmin and 1; the word "conservative" refers to the fact that, under the null hypothesis, the probability of the P value not exceeding epsilon is at most epsilon, for any epsilon between 0 and 1. To create a summary table characterizing the combination of all 13 fraction/chip types, the smallest P value in the fractions for each protocol pair was taken and adjusted according to the same formula using n = 13 (Table 2). Using this approach, protocols 2b (GY), 2c (CR), and 2d (OR) were most similar and protocol 1 (GN) most dissimilar, in agreement with the PCA. However, there was no significant difference between protocols 1 and 2c (P = 0.276), although protocol 2c included only 13 samples, making it difficult to draw reliable conclusions. This may also explain why all other protocols showed a strong agreement with protocol 2c. Notably, there was a significant difference between protocols 2a (YE) and 2b (GY), suggesting that clotting time does influence the protein profiles, a finding which was not apparent from the PCA.
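The two-stage adjustment described above, min(n × Pmin, 1) applied first across the 160 peaks and then across the 13 fraction/chip types, can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names are hypothetical, the per-peak P values are taken as given (the underlying test is not reimplemented here), and the example numbers are invented purely to show the arithmetic.

```python
def conservative_p(p_values, n=None):
    """Return min(n * Pmin, 1) for a collection of P values.

    n defaults to the number of P values supplied (160 peaks, or 13
    fraction/chip types, in the setting described in the text).
    """
    p_values = list(p_values)
    if not p_values:
        raise ValueError("at least one P value is required")
    if n is None:
        n = len(p_values)
    return min(n * min(p_values), 1.0)


def summary_p(per_combination_p):
    """Second stage: take the smallest adjusted P value across the
    fraction/chip combinations and adjust it again by their number."""
    return conservative_p(per_combination_p)


if __name__ == "__main__":
    # Invented per-peak P values for one protocol pair on one fraction/chip type.
    peak_p = [0.0004] + [0.2] * 159
    adjusted = conservative_p(peak_p)          # first stage, n = 160
    print("adjusted P for this fraction/chip type:", adjusted)

    # Invented adjusted P values for the 13 fraction/chip types.
    per_combination = [adjusted] + [1.0] * 12
    print("summary P for this protocol pair:", summary_p(per_combination))
```

Multiplying the smallest P value by the number of tests is the familiar Bonferroni-style bound, which is why the result can only overstate, never understate, the true P value.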
We next analyzed peak variances to assess the general stability of the handling methods. Data were calibrated internally based on known peaks, and median intensities and SDs were taken for each peak across the 25 study participants. These values were used to calculate a coefficient of variance (SD/median intensity) for each protocol and fraction type. Median variance values were also calculated and compared across protocols to give an overall measure of variability. Protocol 2a (YE) gave the lowest median variances in both mass ranges, followed by protocols 1 (GN) and 2e (WH), suggesting that these were the most stable methods (Fig. 2A and B). This was corroborated by the observation that these methods gave the highest numbers of peaks with a median variance <1.0 (Fig. 2C).
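The variability measure used here, SD divided by median intensity per peak, followed by a median across peaks, can be illustrated with a short Python sketch. It is a sketch under assumptions rather than the analysis actually run: the function names and the toy intensities are hypothetical, and the sample SD (n − 1 denominator) is assumed because the text does not say which SD estimator was used.

```python
import statistics


def coefficient_of_variance(intensities):
    """SD / median intensity for one peak across study participants.

    Assumption: sample SD (n - 1 denominator); the source does not specify.
    """
    median_intensity = statistics.median(intensities)
    if median_intensity == 0:
        raise ValueError("median intensity is zero; ratio is undefined")
    return statistics.stdev(intensities) / median_intensity


def median_variance(peak_intensities):
    """Median of the per-peak SD/median ratios for one protocol and fraction type.

    peak_intensities maps a peak label to its intensities across participants.
    """
    ratios = [coefficient_of_variance(v) for v in peak_intensities.values()]
    return statistics.median(ratios)


def count_stable_peaks(peak_intensities, threshold=1.0):
    """Number of peaks whose SD/median ratio falls below the threshold."""
    return sum(
        1 for v in peak_intensities.values()
        if coefficient_of_variance(v) < threshold
    )


if __name__ == "__main__":
    # Toy intensities for three hypothetical peaks (labels are m/z in Da).
    toy = {
        "4286": [8.0, 10.0, 12.0],
        "8144": [10.0, 10.0, 10.0],
        "13900": [1.0, 2.0, 30.0],
    }
    print("median variance:", median_variance(toy))
    print("peaks with ratio < 1.0:", count_stable_peaks(toy))
```

Using the median rather than the mean in the denominator makes the ratio less sensitive to a few participants with extreme intensities for a given peak.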
Selection of peaks for identification was based on altered intensity and variance across protocols, with emphasis on differences between protocols 1 (GN) and 2a (YE), considered the least and most stringent protocols, respectively. Identifications were in agreement with previous studies, where available (Table 3). Examples of two peaks that displayed increased intensity in protocol 1 samples, but not in those of protocol 2a, are shown in Fig. 3. Peak 4286 Da (Fig. 3A) was identified as the 4281.78 Da fragment of inter-α-trypsin inhibitor heavy chain H4 (ITIH4) [see (22)]. Two other ITIH4 fragments (3157.58 Da [see (27)] and 3955.48 Da [see (31)]) and their methionine-oxidized forms were also identified, with all displaying increased intensity in protocol 1 samples (Table 3). An 8144 Da peak appeared to be a superposition of two peaks. In most protocols the 8144 Da peak represented an 8141.59 Da form of platelet factor 4 (30), with an alternatively cleaved signal sequence (data not shown). In protocol 1, the up-regulated peak 8126 Da corresponded to a C-terminal-truncated fragment of C3a anaphylatoxin [8126.52 Da; see (3)(23)] (Fig. 3C). A C1 inhibitor C-terminal fragment (4152.87 Da), an albumin N-terminal fragment (3156.59 Da), and peaks corresponding to neutrophil defensins 1, 2, and 3 were also noticeably increased in protocol 1 samples. Other identifications were apolipoprotein C-I (ApoCI; 6330.59 Da) and its SPA adduct, a truncated form of ApoCII (8204.17 Da), albumin dimer (138 kDa), transferrin (79 kDa), and transthyretin (13.9 kDa), which were relatively stable across the different handling methods. In contrast, hemoglobin α (15126.36 Da) and β (15867.28 Da), and fibrinogen α fragments 3262.47 Da and 5904.22 Da and their modified forms, displayed lower intensities in protocols 1 (GN) and 2e (WH) (Table 3). Several peaks, including those representing the major form of platelet factor 4 [7765.10 Da; see (30)], showed altered intensity in protocol 2b (GY) vs 2a (YE) and 1 (GN), but not the other protocols, suggesting that their final serum concentration is affected by clotting time (highlighted in bold in Table 3).
| Discussion |
|---|
The stringent protocol with 1 h clotting and 3 h transit/storage on ice (YE) has been shown to give the most reproducible results in a serum profiling analysis performed using automated magnetic bead-based prefractionation and MALDI-TOF MS (14). However, our study showed that despite transit/storage at RT for 30 h, protocol 1 (GN) also gave a relatively low overall variance. This finding is critical, because it establishes that samples collected in older studies with longer transit times at RT can be used in case-control studies for novel biomarkers as long as all samples were handled similarly. Many of these biobanks are unique because they are associated with long follow-up, and contain samples stored many years before disease diagnosis.
The greater number of peaks (often with lower variances and higher peak intensities) in the low mass range for protocol 1 samples suggests that proteolysis in these samples has gone to completion. A similar trend was also apparent in protocol 2e (WH) samples, which were incubated for 3 h at RT before storage. These data are in agreement with a previous SELDI-TOF MS study showing increases in certain peaks with time between venipuncture and sample processing, with some overlap with the peaks identified here (15). In particular, the increased intensities of the ITIH4, C3a, and C1 inhibitor fragments and their modified forms in protocol 1 samples provide evidence that these degradation products are generated as a result of increased proteolysis due to extended transport/storage. Conversely, full-length hemoglobin α and β displayed decreased intensities in protocols 1 (GN) and 2e (WH), suggesting that they may be subject to degradation. Similarly, the decreases in fibrinogen α fragments 3262.47 and 5904.22 Da were consistent with further degradation to smaller undetected forms. It is harder to explain the increased levels of the neutrophil defensins in the protocol 1 samples. These disulfide bond-containing molecules are resistant to proteolysis, so an indirect mechanism must account for their altered intensities. Several other protein forms did not change significantly with collection method, revealing them to be relatively stable serum markers.
It has been suggested that many candidate disease biomarkers identified in SELDI-TOF MS profiling experiments are abundant acute-phase reactants, and are thus secondary effects of the diseased state (7). For example, serum transthyretin (TTR) is a known marker for nutritional status and the inflammatory acute-phase response. In ovarian cancer, TTR was identified as a potential early diagnostic marker, with decreased TTR levels reported in the sera of ovarian cancer patients compared with controls, without differences in its microheterogeneity (20)(32)(33). Notably, TTR and its modified forms were unaffected by the different handling conditions used here, and displayed relatively low variances across this healthy cohort. Thus, it would appear that TTR is relatively stable, making it a more robust disease biomarker. Similarly, SELDI-TOF MS was previously used to detect ApoCI and transferrin as classifiers of ovarian, colorectal, and other cancers (20)(34), and our data show that they are also stable under the conditions tested.
In a recent study, the putative acute-phase protein ITIH4 was shown to be extensively proteolytically processed, and its fragmentation patterns associated with different disease conditions (27)(31). Fragmentation was generally consistent with cleavages by endoproteases, followed by exoproteases, and the observed fragments were reported to change little under different assay conditions or processing procedures. An up-regulated cleavage fragment of ITIH4 was also shown to enable differentiation of patients with ovarian cancer from healthy controls or patients with benign pelvic masses (32). Our data provide evidence that ITIH4 is relatively unstable, with the generation of fragments increasing in serum maintained at RT for prolonged periods. Hemoglobin β has also been identified as a putative ovarian cancer biomarker (3)(20), but appears from our study to be relatively unstable in serum. With this in mind, ITIH4 fragments or hemoglobin β may not make robust disease biomarkers unless strict precautions are taken with sample handling. Future work will involve additional MS/MS-based identification of the unknown peaks that are discriminatory for the different collection methods. This will allow the assessment of their usefulness as potential disease biomarkers where they have been identified in other studies.
Our work establishes that samples from established serum banks, even where they were not collected in accordance with more stringent protocols, can be used for proteomic biomarker studies. The key factor is that all samples in the collection should have been handled in a similar manner. Cases and controls should be matched for transport time, and an assessment should be made of when proteolysis in these samples reaches completion. Biomarker discovery using a proteomic approach in such case-control sets, such as UKCTOCS, will involve stable biomarkers rather than labile proteins. For future studies, the key variable during specimen collection will be transport on ice, and it does not seem to matter whether transport times are then 3 or 6 h.
| Acknowledgments |
|---|
| Footnotes |
|---|
ITIH4, inter-α-trypsin inhibitor heavy chain H4; ApoCI, apolipoprotein C-I; TTR, transthyretin.
| References |
|---|