Proteomics and Protein Markers
Departments of 1 Medicine and Therapeutics and 2 Anatomical and Cellular Pathology, The Chinese University of Hong Kong, Prince of Wales Hospital, New Territories, Hong Kong Special Administrative Region, China.
a Address correspondence to this author at: Department of Medicine and Therapeutics, The Chinese University of Hong Kong, Prince of Wales Hospital, 30-32 Ngan Shing St., Shatin, New Territories, Hong Kong Special Administrative Region, China. Fax 852-2648-8842; e-mail tcwpoon{at}cuhk.edu.hk.
| Abstract |
|---|
Methods: Serum proteins from 46 patients with chronic hepatitis B (CHB) were profiled quantitatively on surface-enhanced laser desorption/ionization (SELDI) ProteinChip arrays. The identified liver fibrosis-associated proteomic fingerprint was used to construct an artificial neural network (ANN) model that produced a fibrosis index with a range of 0–6. The clinical value of this index was evaluated by leave-one-out cross-validation.
Results: Thirty SELDI proteomic features were significantly associated with the degree of fibrosis. Cross-validation showed that the ANN fibrosis indices derived from the proteomic fingerprint strongly correlated with Ishak scores (r = 0.831) and were significantly different among stages of fibrosis. ROC curve areas in predicting significant fibrosis (Ishak score ≥3) and cirrhosis (Ishak score ≥5) were 0.906 and 0.921, respectively. At 89% specificity, the sensitivity of the ANN fibrosis index in predicting fibrosis was 89%. The sensitivity for prediction increased with degree of fibrosis, achieving 100% for patients with Ishak scores >4. The accuracy for prediction of cirrhosis was also 89%. Inclusion of International Normalized Ratio, total protein, bilirubin, alanine transaminase, and hemoglobin in the ANN model improved the predictive power, giving accuracies >90% for the prediction of fibrosis and cirrhosis.
Conclusions: A unique serum proteomic fingerprint is present in the sera of patients with fibrosis. An ANN fibrosis index derived from this fingerprint could differentiate between different stages of fibrosis and predict fibrosis and cirrhosis in CHB infection.
| Introduction |
|---|
Liver fibrosis and cirrhosis are reversible, and accurate diagnosis is crucial to the management of patients with CHB or CHC (4). Histologic diagnosis with liver biopsy has long been the gold standard for assessing the degree of fibrosis, but it is an invasive procedure with inherent risk and sampling variability (5)(6). In addition, the diagnostic accuracy depends on the size of the biopsy specimens (6)(7). Furthermore, intra- and interobserver variation for interpretation of biopsies is 10%–20%, even among experienced pathologists (8).
Serum-based tests of liver fibrosis have attracted more attention in recent years (9). Forns et al. (10) developed an index based on age, platelet count, cholesterol, and γ-glutamyltransferase concentration, which was independently validated by at least two other groups of investigators (11)(12). A simple index using only aspartate transaminase and platelet count was proposed by Wai et al. (13). The algorithm first used by Poynard et al. (14)(15) and Imbert-Bismut et al. (15) has been commercialized in the Fibrotest. All of these serum-based assays can be used to assess and monitor liver fibrosis from CHC with an area under the ROC curve between 0.7 and 0.9. In the case of CHB, similar models based on serum biochemical markers achieve a moderate correlation with liver fibrosis (area under ROC curve 0.8) (16).
Comparison of proteomes of disease and control serum samples has been shown to be a possible approach for discovering serum biomarkers of liver diseases associated with hepatitis B infection (17)(18)(19)(20). Surface-enhanced laser desorption/ionization (SELDI) ProteinChip technology is a proteomic tool that has been applied to the discovery of diagnostic proteomic signatures in the sera of patients with various diseases, including cancer and infectious diseases (18)(21)(22). In the present study, we aimed to identify serum proteomic signatures associated with different degrees of liver fibrosis by use of the SELDI ProteinChip technology and to develop a proteomic fingerprinting model for predicting fibrosis and cirrhosis in patients with CHB.
| Materials and Methods |
|---|
hematologic and biochemical tests
Complete blood screens and coagulation tests were performed on the day of biopsy for all 46 patients in the central hospital hematology laboratory with use of a GEN-S blood cell counter (Beckman-Coulter Inc) and Amax Mechanical CS 190 Coagulation Analyzer (Sigma Diagnostics), respectively. In 30 patients, liver function tests were performed on the day of biopsy. For the remaining 16 patients, no liver function tests were performed on the day of biopsy, so results of liver function tests obtained within 4 weeks before biopsy were used for the analysis. Bilirubin, total protein, albumin, alanine transaminase (ALT), and alkaline phosphatase (ALP) were measured by a Modular Analytic system (Roche Diagnostics). α-Fetoprotein was measured by a two-site sandwich chemiluminometric immunoassay (ACS 180 analyzer; Chiron Diagnostics).
histologic staging
Liver biopsies were obtained with 16-gauge Temno needles (Bauer Medical). The specimens were fixed with formalin, embedded in paraffin, and stained with hematoxylin-eosin. All specimens were at least 15 mm in length with a minimum of five portal tracts. Hepatic fibrosis was assessed by use of the Ishak fibrosis score (23) by a single pathologist blinded to the clinical and proteomic data. The mean (SD) Ishak fibrosis score was 3.5 (1.9). Ten patients had minimal fibrosis (Ishak score = 1), 9 had mild fibrosis (Ishak score = 2), 10 had moderate fibrosis (Ishak score = 3 or 4), 8 had severe fibrosis and incomplete cirrhosis (Ishak score = 5), and 9 had probable/definite cirrhosis (Ishak score = 6).
serum proteomic profiling
Serum samples were analyzed with the SELDI ProteinChip system (Ciphergen Biosystems) to obtain a quantitative proteomic profile with molecular masses ranging from 0.9 to 250 kDa, as described previously (18)(22). All samples were analyzed without knowledge of the fibrosis stages. Briefly, 2 µL of each sample was denatured by addition of 4 µL of U9 solution (9 mol/L urea, 20 g/L CHAPS, 50 mmol/L Tris-HCl, pH 9) and diluted with 34 µL of T4 (50 mmol/L sodium acetate, 1 mL/L Triton X-100, pH 4.0) or T9 (10 mmol/L Tris, 1 mL/L Triton X-100, pH 9.0) binding buffer to give a final dilution of 20-fold. CM10 ProteinChip arrays were preequilibrated twice with 5 µL of the binding buffer for 5 min, after which 5 µL of the diluted sample was applied to the ProteinChip array in duplicate and incubated with shaking at room temperature for 90 min. After the incubation, each array was washed five times with the binding buffer and rinsed twice with deionized water. After air-drying, sinapinic acid matrix in 500 mL/L acetonitrile and 5 mL/L trifluoroacetic acid was added to each array. The ProteinChip arrays were read on the ProteinChip PBS II reader of a ProteinChip Biomarker System to measure the masses and intensities of the protein peaks. The common peaks among the SELDI mass spectra were identified and quantified by use of the Ciphergen Biomarker Wizard software. The peak intensities were normalized with the total ion current and, subsequently, with the total peak intensities. Before data mining, the normalized peak intensities of the duplicate measurements were averaged, followed by log2 transformation.
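The last preprocessing steps (normalization to total peak intensity, averaging of duplicates, and log2 transformation) can be illustrated with a minimal Python sketch. It is not the Ciphergen software: the function names and the toy intensities are hypothetical, the total ion current step is omitted, and all intensities are assumed to be strictly positive.

```python
import math


def normalize_to_total(intensities):
    """Scale peak intensities so that they sum to 1 (total-intensity normalization)."""
    total = sum(intensities)
    if total <= 0:
        raise ValueError("total peak intensity must be positive")
    return [value / total for value in intensities]


def preprocess_duplicates(spectrum_a, spectrum_b):
    """Normalize two duplicate spectra, average them peak by peak, then take log2."""
    norm_a = normalize_to_total(spectrum_a)
    norm_b = normalize_to_total(spectrum_b)
    averaged = [(a + b) / 2 for a, b in zip(norm_a, norm_b)]
    # log2 requires strictly positive values (assumed here, not checked per peak).
    return [math.log2(value) for value in averaged]


# Hypothetical duplicate readings for three common peaks (arbitrary units).
features = preprocess_duplicates([2.0, 6.0, 8.0], [1.0, 3.0, 4.0])
```

In this toy case the two duplicates differ only by a constant factor, so normalization makes them identical before averaging.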
identification of fibrosis-associated proteomic features
To identify proteomic features associated only with disease, two criteria were used: (a) the normalized peak intensities must be significantly higher/lower in patients with typical fibrosis/cirrhosis than in individuals with minimal fibrosis; and (b) the normalized peak intensities must correlate with the degree of fibrosis.
We used the Significance Analysis of Microarray (SAM) Algorithm (Stanford University) (18)(22)(24) to identify the proteomic features that were significantly higher/lower in patients with fibrosis/cirrhosis by comparing the proteomic profiles of the cases with minimal fibrosis (Ishak score <2) with those for cases with typical fibrosis/cirrhosis (Ishak score >3) at a median false discovery rate of 0. In the SAM analysis, "two classed, unpaired data" were selected as the data type, and 5000 permutations were performed. Correlations between the degree of fibrosis (Ishak scores) and the peak intensities of the significant proteomic features were analyzed by the Spearman rank-order correlation test.
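Criterion (b), the rank correlation between Ishak scores and peak intensities, can be sketched in plain Python as below. This is an illustrative implementation of the Spearman coefficient only (Pearson correlation applied to tie-averaged ranks); it does not compute P values, it is unrelated to the SAM software, and the function names are hypothetical.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; raises ZeroDivisionError if either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman_rho(x, y):
    """Spearman rank-order correlation coefficient."""
    return pearson(average_ranks(x), average_ranks(y))
```

Because Ishak scores are integers from 1 to 6, ties are unavoidable in a cohort of 46 patients, which is why the ranks are averaged within tied groups.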
development of predictive artificial neural network model
We developed artificial neural network (ANN) models for prediction of liver fibrosis as described previously (18)(25). The ANN models were developed with EasyNN (Ver. 8.1; Stephen Wolstenholme). The networks were of the feed-forward type and were trained by weighted back-propagation. Both the learning rate and momentum were optimized automatically by the software. The ANN models for the present study were composed of three layers: one input layer, one hidden layer, and one output layer. There were six nodes in the hidden layer and one node in the output layer.
Two ANN models were developed from the liver fibrosis-associated proteomic features to generate an ANN fibrosis index for predicting the degree of fibrosis. Model 1 used the proteomic data alone as the input variables, whereas model 2 also included age, sex, platelet count, leukocyte count, hemoglobin, activated partial thromboplastin time, International Normalized Ratio, and concentrations of bilirubin, total protein, albumin, ALT, ALP, and α-fetoprotein. The laboratory indices chosen included those found to be independent predictors of liver fibrosis in CHC in previous studies (10)(13)(14). For both models, the output was the ANN fibrosis index in the range 0–6.000. During training, the ANN fibrosis index was equal to the Ishak score. Training was stopped when all output errors were <0.01. The maximum number of training cycles was restricted to 5000 to prevent overtraining.
evaluation of the ann models
The performance of the ANN models was evaluated by leave-one-out cross-validation (jack-knife). Among the common cross-validation techniques, leave-one-out cross-validation generates the most accurate estimation of the prediction performance for ANN models (26). Briefly, an ANN model was trained on (46 − 1) cases, and the trained model was then used to test the case that had been left out. This process was repeated until every case in the dataset had been used once as an unseen test case. The results were averaged across the 46 test cases to estimate the classifier's prediction performance.
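The leave-one-out loop described above can be sketched as follows. To keep the example self-contained, a 1-nearest-neighbor predictor stands in for the ANN; this stand-in, the function names, and the toy data are assumptions for illustration and are not the EasyNN models used in the study.

```python
def leave_one_out(features, targets, train, predict):
    """Return one held-out prediction per case, each from a model trained on the rest."""
    predictions = []
    for i in range(len(features)):
        # Train on every case except case i.
        train_x = features[:i] + features[i + 1:]
        train_y = targets[:i] + targets[i + 1:]
        model = train(train_x, train_y)
        # Test on the single case that was left out.
        predictions.append(predict(model, features[i]))
    return predictions


# Stand-in learner (NOT the paper's ANN): 1-nearest neighbor on squared distance.
def train_nearest(train_x, train_y):
    return list(zip(train_x, train_y))


def predict_nearest(model, x):
    def squared_distance(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))
    return min(model, key=lambda pair: squared_distance(pair[0], x))[1]


# Hypothetical one-feature cases with Ishak-like targets.
toy_x = [[0.0], [0.1], [1.0], [1.1]]
toy_y = [1, 1, 6, 6]
held_out = leave_one_out(toy_x, toy_y, train_nearest, predict_nearest)
```

Any `train`/`predict` pair with the same shape could be passed in, so the loop is independent of the underlying model.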
statistical analyses
We analyzed the correlation between the Ishak scores and the ANN fibrosis indices by use of the Spearman rank-order correlation test. The patients were separated into five study groups: minimal fibrosis (Ishak score = 1), mild fibrosis (Ishak score = 2), moderate fibrosis (Ishak score = 3 or 4), severe fibrosis and incomplete cirrhosis (Ishak score = 5), and cirrhosis (Ishak score = 6). We compared the ANN fibrosis indices of the five study groups by Kruskal–Wallis ANOVA on ranks. To isolate groups that differed from each other, we used the Dunn multiple comparison method. To construct ROC curves, we calculated the sensitivities and specificities of the ANN fibrosis indices at different cutoff points for differentiating patients with minimal/mild fibrosis (Ishak score <3) from patients with significant fibrosis (Ishak score ≥3), and for differentiating patients with (Ishak score ≥5) or without (Ishak score <5) cirrhosis. The sensitivity and specificity were calculated according to the standard formulas. The likelihood ratios were calculated by use of the standard formulas: positive likelihood ratio = sensitivity/(100% − specificity); negative likelihood ratio = (100% − sensitivity)/specificity.
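The sensitivity, specificity, and likelihood ratio formulas can be written out as a short Python sketch. Values are expressed as fractions rather than percentages, an index at or above the cutoff is assumed to count as a positive prediction, and the function names and toy data are hypothetical.

```python
def diagnostic_metrics(indices, has_condition, cutoff):
    """Return (sensitivity, specificity), calling index >= cutoff a positive prediction."""
    tp = fn = tn = fp = 0
    for index, positive in zip(indices, has_condition):
        predicted_positive = index >= cutoff  # assumption: cutoff is inclusive
        if positive and predicted_positive:
            tp += 1
        elif positive:
            fn += 1
        elif predicted_positive:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


def likelihood_ratios(sensitivity, specificity):
    """Return (positive LR, negative LR); undefined when specificity is 0 or 1."""
    positive_lr = sensitivity / (1 - specificity)
    negative_lr = (1 - sensitivity) / specificity
    return positive_lr, negative_lr


# Hypothetical fibrosis indices and true disease status for six cases.
toy_indices = [1.2, 4.0, 3.9, 5.1, 2.0, 3.0]
toy_status = [False, False, True, True, False, True]
sens, spec = diagnostic_metrics(toy_indices, toy_status, cutoff=3.7)
lr_pos, lr_neg = likelihood_ratios(sens, spec)
```

Repeating `diagnostic_metrics` over a range of cutoffs yields the (1 − specificity, sensitivity) pairs that make up a ROC curve.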
| Results |
|---|
correlations with liver function and inflammation
Among these 30 proteomic features, 4 were positively correlated with albumin concentration (r = 0.298–0.449; all P values <0.05); 6 were negatively correlated with activated partial thromboplastin time (r = −0.305 to −0.413; all P values <0.05); and 9 were negatively correlated with the International Normalized Ratio (r = −0.295 to −0.437; all P values <0.05), giving 14 proteomic features positively correlated with liver function. Concerning liver inflammation, only one proteomic feature positively correlated with ALT concentration (r = 0.313; P = 0.034). This proteomic feature did not have any correlation with the liver function markers. None of the proteomic features were correlated with ALP concentration, which reflected cholestasis. As a result, 50% (15 of 30) of the liver fibrosis-associated proteomic features did not have any significant association with either liver function or inflammation (Table 1).
ann fibrosis index correlated with degree of fibrosis
We developed two ANN models: model 1 used proteomic data alone as input variables, whereas model 2 also included the clinical and laboratory values for the patients. Initially all available clinical and laboratory values were included in model 2. After removal of the input variables that did not improve the predictive performance of the model, ALT, total protein, bilirubin, hemoglobin, and International Normalized Ratio were retained in model 2. The relative importance of each input variable in the final models was calculated, and the 10 most important variables for each model are listed in Table 2.
For evaluation of the two ANN models, the ANN fibrosis index for each case was obtained by leave-one-out cross-validation. The distribution patterns of the ANN fibrosis indices in patient groups with different degrees of fibrosis are shown in Fig. 2. The distribution patterns of ANN fibrosis indices generated by the two models were very similar. There was a low degree of overlap of the indices between the patient group with an Ishak score of 1 and the groups with Ishak scores ≥2. We observed considerable overlap only between the group with an Ishak score of 2 and the group with an Ishak score of 3 or 4.
Kruskal–Wallis one-way ANOVA on ranks showed that the ANN fibrosis indices were significantly different among the patient groups (P <0.001 for both models). The Dunn multiple comparison test indicated that the ANN fibrosis indices of the patient groups with Ishak scores >4 were significantly higher than those of the patients with minimal fibrosis regardless of the models (all P values <0.05). The ANN fibrosis indices were highest in the patients with cirrhosis (Ishak score = 6) and were significantly higher than the indices for patients with minimal or mild fibrosis (Ishak score <3, all P values <0.05). For both ANN models, there was a strong positive correlation between the ANN fibrosis indices and the Ishak scores (model 1, r = 0.831, P <0.0005; model 2, r = 0.861, P <0.0005). These findings indicate that the ANN fibrosis indices were highly correlated with the degrees of fibrosis and were significantly different in patients with different stages of fibrosis.
ann fibrosis index predicts fibrosis and cirrhosis with high accuracy
The ROC curve analysis (Fig. 3) showed that the ANN fibrosis index derived from the serum proteomic fingerprint (ANN model 1) was useful in predicting patients with significant fibrosis (Ishak score ≥3). The area under ROC curve was 0.906 (95% confidence interval, 0.813–1.000; P <0.0005). In addition, the ANN fibrosis index was useful in predicting patients with cirrhosis (Ishak score ≥5), giving a ROC curve area of 0.921 (95% confidence interval, 0.843–0.999; P <0.0005). At a cutoff of 3.7, the sensitivity and specificity of the ANN fibrosis index in predicting fibrosis (Ishak score ≥3) were 89% and 89%, respectively. It is worth noting that the accuracy in the prediction of fibrosis increased with the degree of fibrosis. All cases with severe fibrosis (Ishak score >4) were correctly identified. At a cutoff of 4.7, the sensitivity and specificity of the ANN fibrosis index in predicting cirrhosis (Ishak score >4) were 94% and 86%, respectively.
The ROC curve analysis (Fig. 3) showed that inclusion of ALT, total protein, bilirubin, hemoglobin, and International Normalized Ratio as input variables (ANN model 2) improved the predictive performance of the ANN model. The ROC curve areas for predicting significant fibrosis and for predicting cirrhosis were 0.930 (95% confidence interval, 0.845–1.014; P <0.0005) and 0.929 (0.850–1.008; P <0.0005), respectively. At 90% specificity, the sensitivities for predicting fibrosis and cirrhosis were 96% and 94%, respectively. The values of the overall accuracy were 93% and 91%, respectively. The clinical values of the ANN fibrosis indices of the two ANN models in predicting fibrosis and cirrhosis are summarized in Table 3.
| Discussion |
|---|
0.5%, patient discomfort, and expense. Liver biopsy is therefore not suitable for regular monitoring of disease progression. In this study, we attempted to develop a novel predictive model of liver fibrosis and cirrhosis, using a serum proteome-based fingerprinting approach. We found that 30 serum proteomic features formed a unique fingerprint for fibrosis. More experiments are needed to identify the nature of these 30 proteomic features. Nevertheless, because each protein has a unique m/z value, although their protein identities are not known, they can be unambiguously detected and quantified in patient sera. We generated two ANN models to interpret the serum proteomic fingerprints to produce a predictive ANN fibrosis index. Our results strongly indicate that the ANN algorithm is a useful tool for predicting liver fibrosis and cirrhosis through analysis of the fibrosis-associated proteomic fingerprint.
Assays for predicting liver fibrosis can be classified into two major groups: direct and indirect assays (28). For the direct assays, the serologic markers reflect the metabolism of extracellular matrix. For the indirect assays, the markers reflect alterations in hepatic function but do not directly reflect the metabolism of extracellular matrix. In this study, we found that one half of the 30 liver fibrosis-associated proteomic features were positively correlated with liver function or inflammation. Our proteomic fingerprint-based ANN models thus may reflect not only liver function and inflammation, but also other physiologic conditions.
The SELDI-derived serum proteome-based ANN fibrosis index is less invasive and easily performed in a high-throughput 96-well format. The test can be completed within 3 h. ANN fibrosis indices were highly correlated with the degree of fibrosis and were significantly different in patients with different stages of fibrosis, from minimal fibrosis to cirrhosis. Additionally, the ANN fibrosis indices could differentiate patients with minimal/mild fibrosis from patients with significant fibrosis or cirrhosis with high accuracy (90%) and ROC curve areas >0.9. Such accuracy compares favorably with models derived from regression analysis of routine clinical and laboratory values. For example, two recent independent studies indicated that for the Fibrotest, the ROC curve areas for the identification of METAVIR fibrosis stage F2–F4 in CHC patients were 0.74 and 0.84 (12)(29). In the case of CHB, similar models based on serum biochemical markers achieved only moderate correlation with liver fibrosis (ROC curve area 0.8) (16).
In the present study, we instituted various preventive measures to avoid generation of biased results caused by artifacts related to the nature of the clinical samples used, sample storage conditions, experimental details, the mass spectrometric instruments, and/or bioinformatic analyses (30)(31). All serum samples were collected and processed within the same clinical and laboratory settings. To ensure the quality of the serum samples, they were stored at −70 °C before analysis. Additionally, all samples were analyzed blindly without knowledge of the fibrosis stages, and SAM, which is a conservative multivariate bioinformatic test, was used in the identification of the fibrosis-associated proteomic features. This method allowed us to adjust the false discovery rate, thus providing high reliability of the identified proteomic features. In this study, we set the median false discovery rate at 0. At this rate, all 81 significant proteomic features identified by the SAM analysis should be true. Last but not least, we used stringent criteria to define a proteomic feature as fibrosis specific. The fibrosis-associated proteomic features were not only significantly higher or lower in patients with fibrosis but were also significantly correlated with the degrees of fibrosis. With these precautions, it was much less likely that the disease-specific proteomic features were identified by chance or were results of bias in patient selection. Furthermore, one half of the identified liver fibrosis-associated proteomic features were later found to be significantly associated with liver function or liver inflammation. This also serves as indirect evidence indicating that the identified proteomic features have clinical meanings associated with the disease of study.
It is important to emphasize that the current study was of the proof-of-principle type. Further studies are needed to validate the use of this model in CHB patients in a prospective manner. The same technology and analytical approaches should also be tested on patients with other chronic liver diseases to determine their general applicability to fibrosis related to other etiologies and, ultimately, the potential of serum proteomics as noninvasive markers of fibrosis and cirrhosis.