Proteomics and Protein Markers |
1 Centre for Emerging Infectious Diseases, and Departments of 2 Medicine and Therapeutics, 3 Chemical Pathology, and 4 Biology, The Chinese University of Hong Kong, Prince of Wales Hospital, Shatin, NT, Hong Kong SAR.
a Address correspondence to this author at: Department of Medicine and Therapeutics, The Chinese University of Hong Kong, Prince of Wales Hospital, Shatin, Hong Kong SAR. Fax 852-2648-8842; e-mail tcwpoon{at}cuhk.edu.hk.
| Abstract |
|---|
|
|
|---|
Methods: Using surface-enhanced laser desorption/ionization (SELDI) ProteinChip technology, we profiled and compared serum proteins of 39 patients with early-stage SARS infection and 39 non-SARS patients who were suspected cases during the SARS outbreak period. Proteomic patterns associated with SARS were identified by bioinformatic and biostatistical analyses. Features of interest were then purified and identified by tandem mass spectrometry.
Results: Twenty proteomic features were significantly different between the 2 groups. Fifteen were increased in the SARS group, and 5 were decreased. Their concentrations were correlated with 2 or more clinical and/or biochemical variables. Two were correlated with the SARS-CoV viral load. Hierarchical clustering analysis showed that a majority of the SARS patients (95%) had similar serum proteomic profiles and identified 2 subgroups with poor prognosis. ROC curve analysis identified individual features as potential biomarkers for SARS diagnosis (areas under ROC curves, 0.733-0.995). ROC curve areas were largest for an N-terminal fragment of complement C3c α chain (m/z 28 119) and an internal fragment of fibrinogen α-E chain (m/z 5908). Immunoglobulin κ light chain (m/z 24 505) positively correlated with viral load.
Conclusions: Specific proteomic fingerprints in the sera of adult SARS patients could be used to identify SARS cases early during onset with high specificity and sensitivity.
| Introduction |
|---|
|
|
|---|
Recently, advances in proteomics have provided new strategies to identify biomarkers and therapeutic targets and to study the pathology of diseases. Surface-enhanced laser desorption/ionization (SELDI) ProteinChip technology is a proteomic tool that has been applied to the discovery of diagnostic proteomic fingerprints for various diseases, including cancer and infectious diseases (7)(8)(9). Recent studies have shown that this technology could also be used to identify potential biomarkers for early diagnosis of SARS (7)(10)(11). In these studies, the control cases were either healthy persons or persons with non-SARS viral infections. Unfortunately, the degree of similarity of the symptoms between the SARS and control groups and the time point of blood collection were not considered in these studies. From the perspective of infectious disease diagnosis, one is not trying to differentiate healthy persons from infected patients, but rather is trying to identify the disease causing the symptoms in patients presenting with similar symptoms (12).
Bearing in mind the above issues, in the present study we attempted to profile and compare the serum proteomes between SARS patients and non-SARS patients. The non-SARS patients were those who had symptoms similar to those in SARS patients. They were admitted to the same hospital as the SARS patients and were later shown to be negative for SARS-CoV infection. Furthermore, for both SARS and non-SARS patients, sera were collected within 1 week after fever onset.
| Materials and Methods |
|---|
|
|
|---|
The serum SARS-CoV RNA test was performed on 31 patients, and 22 of them (71%) showed detectable SARS-CoV RNA in their serum. For the control group, the 39 non-SARS patients (21 with bacterial pneumonia, 1 with aspiration pneumonia, 2 with acute pulmonary edema, 2 with upper respiratory tract infection, 2 with influenza, 1 with acute bronchitis, 1 with bronchiectasis, 1 with secondary lung cancer, 1 with gastrointestinal bleeding, 1 with acute cholangitis, 1 with infectious mononucleosis, 1 with viral gastroenteritis, 1 with liver abscess, 1 with peritonitis and end-stage renal failure, 1 with cellulitis, and 1 with thalamic hemorrhage) were those who had symptoms similar to those of the SARS patients. They were admitted to the same hospital as the SARS patients and were later shown to be negative for SARS-CoV antibody by a serology test performed at least 6 weeks after the onset of symptoms. All biochemical information was collected from the same blood samples. All samples were stored as aliquots at -70 °C until ProteinChip array profiling analysis was performed.
serum proteomic profiling
In the SELDI ProteinChip analysis, the serum samples from the diseased and control groups were randomized, and the investigator was blinded to their identity. The SELDI ProteinChip analysis was performed as described previously (8)(10)(13). For each set of binding conditions, samples were analyzed in duplicate. Serum samples (2 µL) were denatured by addition of 4 µL of U9 buffer (9 mol/L urea, 20 g/L CHAPS, 50 mmol/L Tris-HCl, pH 9.0) and incubated on ice for 20 min. Denatured serum samples were diluted with 34 µL of T4 (50 mmol/L sodium acetate, 1 mL/L Triton X-100, pH 4.0) or T9 (10 mmol/L Tris, 1 mL/L Triton X-100, pH 9.0) binding buffer, respectively, to give a final dilution of 20-fold. CM10 ProteinChip arrays (Ciphergen Biosystems) were used in this study. After dilution, 5 µL of the diluted sample was applied to a preequilibrated ProteinChip array in duplicate in a bioprocessor and incubated with shaking for 90 min at room temperature. After incubation, the ProteinChip arrays were washed 5 times with the same binding buffer and rinsed twice with deionized water. After air-drying, sinapinic acid matrix in 500 mL/L acetonitrile containing 5 mL/L trifluoroacetic acid was added to each array. The ProteinChip arrays were read on the ProteinChip PBS II reader of a ProteinChip Biomarker System (Ciphergen Biosystems) to determine the masses and intensities of all peaks over the range m/z 1000 to 250 000. For each box (12 pieces containing 96 assay spots) of ProteinChip arrays, 1 array (8 assay spots) was used for calibration. Mixtures of peptide/protein calibrators [angiotensin (m/z 1296.51), corticotropin (clip 1-17; m/z 2093.46), corticotropin (clip 18-39; m/z 2465.72), doubly charged horse apomyoglobin (m/z 8475.8), Escherichia coli thioredoxin (m/z 11 673.5), horse apomyoglobin (m/z 16 951.6), bovine serum albumin (m/z 66 430), and bovine serum albumin dimer (m/z 132 861); Applied Biosystems Ltd.] were added to all 8 assay spots of the calibration array.
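The 20-fold figure quoted in the sample preparation can be checked with a few lines of arithmetic. The sketch below uses only the volumes stated in the protocol; the function and variable names are illustrative and do not come from the original methods.

```python
# Volumes in microlitres, as stated in the sample preparation protocol.
serum_ul = 2.0
u9_buffer_ul = 4.0
binding_buffer_ul = 34.0


def fold_dilution(sample_ul, *diluent_volumes_ul):
    """Return the total volume divided by the original sample volume."""
    total_ul = sample_ul + sum(diluent_volumes_ul)
    return total_ul / sample_ul


final_fold = fold_dilution(serum_ul, u9_buffer_ul, binding_buffer_ul)

# Volume of undiluted serum represented by the 5 uL applied to each spot.
serum_per_spot_ul = 5.0 / final_fold

print(f"final dilution: {final_fold:g}-fold")
print(f"serum equivalent per spot: {serum_per_spot_ul:g} uL")
```

Under these volumes, each assay spot receives the equivalent of a quarter of a microlitre of undiluted serum.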
Intensities of peaks between m/z 1000 and 20 000 were obtained at a laser setting of 183 and an optimized range of m/z 1000 to 20 000; intensities of peaks between m/z 20 000 and 250 000 were obtained at a laser setting of 188 and an optimized range of m/z 20 000 to 150 000. The spectra were smoothed, baseline-subtracted, and externally calibrated. The common peaks among the SELDI mass spectra were identified and quantified by use of Biomarker Wizard software (Ciphergen Biosystems). The peak intensities were normalized with the total ion current and, subsequently, with the total peak intensities. Before data mining, the normalized peak intensities of the duplicate measurements were averaged and log2-transformed. The intraassay CVs of the normalized intensities of various peaks were <15%.
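The preprocessing chain described above (normalization, averaging of duplicates, log2 transformation, and a CV check) can be illustrated with a minimal sketch. It is not the Biomarker Wizard implementation, whose internals are not described here. It assumes a single normalization step in which each spectrum is scaled by its summed peak intensity, and the duplicate spectra are made-up numbers.

```python
import math
from statistics import mean, stdev


def normalize_to_total(peaks):
    """Express each peak as a fraction of the summed peak intensity."""
    total = sum(peaks)
    return [p / total for p in peaks]


def average_and_log2(rep_a, rep_b):
    """Average duplicate normalized intensities peak by peak, then take log2."""
    return [math.log2((a + b) / 2.0) for a, b in zip(rep_a, rep_b)]


def percent_cv(values):
    """Coefficient of variation (sample SD / mean), as a percentage."""
    return 100.0 * stdev(values) / mean(values)


# Hypothetical duplicate spectra with three common peaks (arbitrary units).
replicate_1 = [10.0, 30.0, 60.0]
replicate_2 = [12.0, 28.0, 60.0]

norm_1 = normalize_to_total(replicate_1)
norm_2 = normalize_to_total(replicate_2)

# One log2 value per peak: this is the feature value used for data mining.
log2_features = average_and_log2(norm_1, norm_2)

# Intraassay CV of each peak across the duplicate measurements.
cvs = [percent_cv([a, b]) for a, b in zip(norm_1, norm_2)]
```

The log2 transformation makes a doubling and a halving of intensity symmetric around zero, which suits the rank-based and permutation-based statistics applied afterward.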
bioinformatic analysis
To identify proteomic features associated only with disease, we used 2 criteria: (a) the normalized peak intensities had to be significantly higher/lower in SARS patients than in non-SARS individuals; and (b) the normalized peak intensities had to correlate with 2 or more clinical/biochemical variables, indicating their biological meaningfulness.
The Significance Analysis of Microarray (SAM) algorithm (Stanford University) (8)(10)(13) was used to identify proteomic features with concentrations significantly different between the SARS and non-SARS patient groups. During SAM analysis, "two classed, unpaired data" were selected as the data type, and 5000 permutations were performed. The false significant discovery rate was set to zero to avoid the identification of falsely significant proteomic features caused by multiple comparisons. Correlations between the differential proteomic features and various clinical/biochemical features were examined by the Spearman rank-order correlation test.
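The idea of judging a group difference against permuted class labels can be sketched as follows. This is a deliberately simplified label-permutation test on the difference in group means. It is not the SAM algorithm, which uses a modified t-type statistic and estimates the false discovery rate across all features. The function name, the random seed, and the intensity values are hypothetical.

```python
import random
from statistics import mean


def permutation_p_value(group_a, group_b, n_perm=5000, seed=0):
    """Two-sided label-permutation test on the difference in group means."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign class labels at random
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimate away from an impossible zero.
    return (hits + 1) / (n_perm + 1)


# Hypothetical log2 intensities of one proteomic feature in the two groups.
sars_log2 = [5.1, 5.4, 5.0, 5.6, 5.3, 5.5, 5.2, 5.7]
non_sars_log2 = [4.1, 4.3, 4.0, 4.4, 4.2, 4.5, 4.6, 4.7]

p_value = permutation_p_value(sars_log2, non_sars_log2)
print(f"permutation P value: {p_value:.4f}")
```

Because the statistic is recomputed on shuffled labels, the test makes no assumption about the distribution of peak intensities, which is one reason permutation approaches are attractive for profiling data.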
The significant differential proteomic features that correlated with clinical/biochemical variables were subjected to two-way hierarchical clustering analysis, as described previously (8). Before clustering analysis, the data of each proteomic feature were subjected to zero-mean unit-variance normalization. The processed proteomic data and the serum samples were subjected to two-way hierarchical clustering analysis with the Cluster and TreeView programs (14). Spearman rank correlation was used to calculate the distance, and complete linkage clustering was performed.
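The clustering steps named above can be sketched in plain Python to show how they fit together. This is not the Cluster/TreeView implementation. It clusters samples only (one direction of the two-way analysis), takes the distance as 1 minus the Spearman correlation, and uses a small made-up matrix of 4 features by 4 samples.

```python
from statistics import mean, pstdev


def zscore(values):
    """Zero-mean, unit-variance scaling of one feature across all samples."""
    mu, sigma = mean(values), pstdev(values)  # assumes the feature is not constant
    return [(v - mu) / sigma for v in values]


def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0  # mean of 1-based positions i..j
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_distance(x, y):
    """1 - Spearman rho, where rho is the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return 1.0 - cov / (var_x * var_y) ** 0.5


def complete_linkage(profiles, n_clusters):
    """Agglomerative clustering; cluster distance is the largest pairwise distance."""
    n = len(profiles)
    dist = {(i, j): spearman_distance(profiles[i], profiles[j])
            for i in range(n) for j in range(i + 1, n)}

    def linkage(a, b):
        return max(dist[(min(i, j), max(i, j))] for i in a for j in b)

    clusters = [[i] for i in range(n)]
    while len(clusters) > n_clusters:
        # Merge the two clusters whose farthest members are closest.
        _, p, q = min((linkage(clusters[p], clusters[q]), p, q)
                      for p in range(len(clusters))
                      for q in range(p + 1, len(clusters)))
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]
    return clusters


# Hypothetical log2 intensities: rows are features, columns are samples 0..3.
features = [
    [1.0, 1.2, 3.0, 3.1],
    [2.0, 2.1, 0.5, 0.4],
    [0.3, 0.2, 1.5, 1.6],
    [5.0, 5.2, 4.0, 4.1],
]
scaled = [zscore(row) for row in features]
sample_profiles = [list(col) for col in zip(*scaled)]  # one profile per sample

clusters = complete_linkage(sample_profiles, n_clusters=2)
print(clusters)
```

Scaling each feature first keeps high-intensity peaks from dominating the sample profiles, and complete linkage tends to produce compact clusters because a merge is only as good as its most dissimilar pair.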
protein purification
For protein identification, proteins corresponding to the SELDI peaks were purified by cation-exchange chromatography with the use of CM10 ceramic beads (BioSepra) under binding conditions similar to those for CM10 ProteinChip arrays. Briefly, pooled serum samples were first denatured with U9 buffer and diluted with T4 or T9 sample binding buffer, respectively. After incubation for 120 min and subsequent washing, the bound proteins were eluted from the CM10 ceramic beads with 1 mol/L NaCl solution. C18 ZipTips were used to desalt the eluted proteins according to the manufacturer's instructions (Millipore). The desalted protein preparations were spotted on the gold-coated ProteinChip arrays and examined with the ProteinChip reader to confirm that the purified proteins had the same masses as the targeted SELDI proteomic features. After confirmation, the purified proteins were resolved by 2-dimensional gel electrophoresis in the absence of reducing agents. Proteins on gels were visualized by either colloidal blue (Invitrogen) or silver staining (GE Healthcare). The gel images were then digitized with a densitometer and analyzed by the PDQuest gel analysis software (Ver. 7.3.0; Bio-Rad). Protein spots with masses matching the differential proteomic features were excised and subjected to mass spectrometric (MS) analysis.
protein identification by tandem ms
Protein spots of interest were removed from the gel and subjected to trypsin digestion as described previously (15). The trypsin digests were then extracted and subjected to tandem MS (MS/MS) analysis in the ABI 4700 system (Applied Biosystems). Trypsin peaks, possible keratin contamination, and matrix cluster peaks were excluded from subsequent collision-induced dissociation. The MS/MS spectra were then processed with Data Explorer software (Ver. 4.4; Applied Biosystems). The spectra were subjected to Gaussian smoothing with a filter width of 5 points, and the baselines were corrected with default settings. Peaks were detected based on a signal-to-noise threshold of 15. The fragment masses and intensities of each MS/MS mass spectrum were subjected to an online Mascot MS/MS ion search (http://www.matrixscience.com/) to obtain the protein identities. For the search parameters, the maximum allowed missed cleavage in trypsin digestion was 1; partial oxidation of methionine, phosphorylation of serine/threonine/tyrosine, and iodoacetamide modification of cysteine residues were selected. The error tolerance values of the parent peptides and the MS/MS ion masses were 0.1 and 0.3 Da, respectively. A protein identification result was considered significant when the MS/MS ion profile matched a known protein in the NCBInr (2005/06/01) database with a P value <0.05. For each identified protein, an accession number in the UniProt protein database (Ver. 48.0) was reported when available.
| Results |
|---|
|
|
|---|
To avoid identification of falsely significant proteomic features caused by systematic bias, we considered only 20 differential proteomic features (Fig. 1 and Table 1) that were significantly correlated with at least 2 biochemical/clinical variables as SARS-specific (Table 2). These potential biomarkers were found to be significantly associated with SARS-CoV viral load (2 correlated with SARS-CoV RNA), acute-phase reaction [10 correlated with C-reactive protein (CRP)], lung damage [12 correlated with lactate dehydrogenase (LD)], liver function (15 correlated with albumin and/or total protein), immune response (11 correlated with neutrophil count; 3 correlated with total leukocyte count), and age (3 correlated), respectively. Thus, these biomarkers could reflect different physiologic conditions of the body after infection with SARS-CoV.
differentiation of sars by two-way hierarchical clustering analysis of serum proteomic fingerprints
In the dendrogram (Fig. 2), the majority of SARS cases (95%) were grouped under 4 clusters. There were significantly more cases with poor prognosis [required treatment in the intensive care unit (ICU) and/or supplemental oxygen during treatment] in SARS clusters 2 and 3.
|
diagnostic values of individual proteomic features
The areas under the ROC curves for most of the SARS-specific proteomic features were between 0.733 and 0.995. The 2 biomarkers at m/z 28 119 and 5908 gave the largest ROC curve areas. For the biomarker at m/z 28 119, the ROC curve area was 0.987 (95% confidence interval, 0.966-1.007; Fig. 3A). At 97% specificity, the sensitivity was 97%. For the biomarker at m/z 5908, the ROC curve area of 1/peak intensity was 0.995 (95% confidence interval, 0.985-1.004; Fig. 3B). At 95% specificity, the sensitivity was 100%.
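How an ROC curve area and a sensitivity at a fixed specificity are obtained from peak intensities can be illustrated with a minimal sketch. It computes the area as the Mann-Whitney probability that a case scores higher than a control, and it inverts the intensity for a marker that is decreased in cases, as was done for the feature at m/z 5908. The function names and all intensity values are hypothetical, and confidence intervals are not computed.

```python
def roc_auc(cases, controls):
    """ROC curve area as the probability that a case scores above a control
    (ties count as one half)."""
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def sensitivity_at_specificity(cases, controls, min_specificity):
    """Best sensitivity over cut-offs (score > cut-off is test-positive)
    whose specificity is at least min_specificity."""
    best = 0.0
    for cutoff in sorted(set(cases) | set(controls)):
        specificity = sum(k <= cutoff for k in controls) / len(controls)
        sensitivity = sum(c > cutoff for c in cases) / len(cases)
        if specificity >= min_specificity and sensitivity > best:
            best = sensitivity
    return best


# Hypothetical intensities of a marker that is increased in cases.
increased_cases = [8.0, 9.0, 7.5, 6.0, 9.5]
increased_controls = [5.0, 6.5, 4.0, 5.5, 3.0]
auc_increased = roc_auc(increased_cases, increased_controls)
sens_at_full_spec = sensitivity_at_specificity(increased_cases, increased_controls, 1.0)
sens_at_80_spec = sensitivity_at_specificity(increased_cases, increased_controls, 0.8)

# Hypothetical intensities of a marker that is decreased in cases:
# score each sample by 1/intensity so that higher scores indicate disease.
decreased_cases = [2.0, 1.0, 4.0]
decreased_controls = [5.0, 8.0, 10.0]
auc_decreased = roc_auc([1.0 / x for x in decreased_cases],
                        [1.0 / x for x in decreased_controls])
```

Scoring a decreased marker by its reciprocal leaves the ranking information intact while letting the same "higher means positive" convention serve both kinds of marker.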
|
protein identities of the potential biomarkers
Attempts were made to purify and identify the 2 biomarkers with the highest diagnostic values and the 2 biomarkers correlated with viral load. The CM10 ceramic beads captured proteomic features similar to those of the CM10 ProteinChip array. The proteins eluted from the CM10 ceramic beads were separated and concentrated as protein spots by 2-dimensional gel electrophoresis. We successfully identified the protein spots with masses corresponding to the proteomic features with m/z values of 5908, 24 505, and 28 119, which were an internal fragment of fibrinogen α-E chain, immunoglobulin κ light chain, and an N-terminal fragment of complement C3c α-chain, respectively (Fig. 4).
|
| Discussion |
|---|
|
|
|---|
The most sensitive marker, the proteomic feature at m/z 5908, which was negatively correlated with neutrophil count and with the largest chest radiographic changes, was found to be an internal fragment of fibrinogen α-E chain. On the one hand, we previously reported that a high neutrophil count was a risk factor associated with clinical deterioration in SARS (26). On the other hand, intravascular fibrin deposition has been observed in SARS patients (29). Vascular fibrin thrombi are often associated with pulmonary infarcts. Specific interactions between neutrophils and fibrin thrombi are well recognized (30). In addition, neutrophils produce a neutral peptide-generating protease that can cleave fibrinogen into peptide fragments (31). It is possible that fibrinogen and/or its fragments are involved in the pathophysiologic mechanism linking neutrophil activation and lung damage.
The next most sensitive biomarker is the proteomic feature at m/z 28 119. This feature was identified as the N-terminal fragment of complement C3c α-chain. Complement 3 (C3), which is composed of an α chain (Mr 115 000) and a β chain (Mr 75 000), is the central molecule in complement systems comprising the classic, alternative, or lectin pathways. On activation and subsequent inactivation of C3, several physiologic protein fragments, such as C3c, are produced. When examined by sodium dodecyl sulfate-polyacrylamide gel electrophoresis, C3c separates into a β-chain (Mr 75 000) and 2 fragments of α-chain (Mr 27 000 N-terminal fragment and Mr 43 000 C-terminal fragment) (32). The presence of free C3c α-chain N-terminal fragment in the blood circulation might be the result of degradation of C3c. In the SARS patients, the concentration of this C3c fragment was positively correlated with CRP, suggesting its positive association with the acute-phase reaction and with the activation of the complement system. It is worth noting that this C3c fragment contains a binding domain for complement receptor type 1 (33). Activation of complement receptor type 1 enhances phagocytosis of the neutrophils (34) and activates B-cell differentiation (35)(36). To date, information about the activation of the complement pathway in SARS patients has been limited. Liao et al. (17) reported that there was no significant difference in C3 concentrations between SARS and control patients. Another group demonstrated that SARS-CoV could trigger complement activation through the lectin pathway (37). The biological relevance of C3 in SARS remains unknown.
The proteomic feature at m/z 24 505, which was increased in SARS patients and was positively correlated with viral load, was found to be immunoglobulin κ light chain. This finding is consistent with our recent finding of anti-SARS-CoV IgG in 93% of SARS cases at the time of sampling. IgG was first detected on day 4 of illness (38). Higher IgG concentrations were detected in patients with poor outcome (i.e., requiring supplemental oxygen for hypoxia or treatment in the ICU).
Aside from using individual differential proteomic features as biomarkers, one could combine all of the differential features to form a SARS-specific fingerprint. The SARS-specific fingerprint not only could differentiate SARS from non-SARS diseases with similar symptoms, but also could be useful in identifying patients with poor prognosis. The prognostic capability could be explained by the fact that the identified differential proteomic features were correlated with clinical and biochemical features having prognostic values, including viral load (27)(28), neutrophil counts (26), LD (26)(39)(40)(41), and age (40)(41).
Previously, 2 research teams, using the SELDI ProteinChip technology, reported potential biomarkers in the sera of adult SARS patients (7)(11). In the present study, the intensity of the proteomic feature at m/z 7769 was significantly lower in SARS patients (Mann-Whitney test, P <0.001); in their study, Yip et al. (7) also reported that it was significantly lower in SARS patients (Mann-Whitney test, P = 4.9 × 10⁻⁸). Except for this proteomic feature, the SARS-associated proteomic features in these other studies were different from our findings. These differences might be attributable to different selection criteria for the controls. In the previous studies, the control cases were either healthy persons or persons with viral infections from other clinics. The extent of similarities of symptoms between the SARS and control groups and the time point of blood collection were not considered. In the present study, the control group comprised suspected SARS patients who were admitted to the same hospital as the SARS patients but were later shown to be negative for SARS-CoV infection. The symptoms in the SARS and control groups and the time point of blood collection were very similar. The biomarkers identified in the present study may thus have an advantage in an actual diagnostic setting compared with those identified in the previous studies.
The difference in the findings reported by Yip et al. (7) and the present study could also be attributable to the use of different profiling methodologies. In their study, Yip et al. (7) used a comprehensive profiling approach. After being denatured with urea and detergent, the serum proteins were first fractionated with anion-exchange beads to give 6 fractions, which were later analyzed with copper ProteinChip arrays and weak cation-exchange CM10 ProteinChip arrays. Use of the comprehensive profiling approach would increase the chance of identification of more potential protein markers (8). In the present study, we analyzed the serum proteins directly after purification with only CM10 ProteinChip arrays at 2 different binding conditions (pH 4 and pH 9). We chose the CM10 ProteinChip arrays (previously called WCX2) because Kang et al. (11) showed that this chip type gives the best profiling result when analyzing serum samples from SARS patients by a direct binding approach. Although the direct binding approach might lead to the discovery of fewer biomarkers, such direct binding assays have higher potential to be modified for use as a clinical assay that can be used even when the protein identities of the disease-specific SELDI peaks are not known.
The most commonly used assays for detecting SARS are based on the detection of viral RNA (6) or antibodies against the SARS-CoV (5). Detection of viral RNA is useful in the early phase of the disease, whereas the serology test for antibodies against SARS-CoV is useful from 21 days onward. The current study has demonstrated that within the first week after onset of fever, similar to viral RNA concentration, the serum proteome contains both diagnostic and prognostic information. The SELDI ProteinChip assay could be used for first-line detection of SARS, followed by the quantitative viral RNA assay for confirmation. Once the disease is confirmed, the treatment strategy could be adjusted according to the prognosis based on the SELDI ProteinChip profiling result and the viral RNA concentration.
In conclusion, we have demonstrated that disease-specific proteomic fingerprints are present in the sera of adult SARS patients. They could be used to identify SARS cases during the early stage of the disease with high specificity and sensitivity. These markers may provide information about the patient's physiologic status as well as prognostic information. The 2 proteomic features having the highest diagnostic value were the N-terminal fragment of complement C3c α-chain and an internal fragment of fibrinogen α-E chain. The proteomic feature (m/z 24 505) positively correlated with viral load was identified as immunoglobulin κ light chain.
| Acknowledgments |
|---|
| Footnotes |
|---|
| References |
|---|
|
|
|---|
-related cytokine storm in SARS patients. J Med Virol 2005;75:185-194.
chain of fibrinogen. Proc Natl Acad Sci U S A 1991;88:1044-1048.