REVISTA ESPAÑOLA DE

Vol. 37, n.º 3, 2004

A comparative analysis of six current histological classification schemes and scoring systems used in chronic hepatitis reporting

Okechukwu Okafor, Segun Ojo

Department of Morbid Anatomy and Forensic Medicine, Obafemi Awolowo University Teaching Hospitals Complex, Ile-Ife. Nigeria. hillpoint@softhome.net

SUMMARY

The purpose of the present paper is to ascertain if and how the Old Qualitative Classification correlates with the newer semi-quantitative scoring systems, including the Knodell Histological Activity Index (HAI), Ishak Modified HAI, Scheuer, Batts-Ludwig, and the METAVIR scoring systems, and how these scoring systems correlate with each other. The study was carried out with liver biopsies from fifty consecutive cases of chronic hepatitis. Each case was classified and scored using all 6 systems and correlation studies were subsequently carried out. The semi-quantitative scoring systems showed good statistical correlation among themselves. It is therefore possible to compare chronic hepatitis biopsies from different centres where any of these 5 semi-quantitative scoring systems are used. Also the results of therapeutic trials using any of these scoring systems can be compared. The Old Qualitative Classification only showed significant association with the necroinflammatory activity score of the Knodell scheme but not with its fibrosis score.

Key words: Chronic hepatitis, necroinflammatory activity score, liver fibrosis score, semi-quantitative scoring, liver biopsy.

Estudio comparativo de seis clasificaciones histológicas actuales utilizadas en la valoración de la hepatitis crónica

RESUMEN

El objetivo de este artículo fue comprobar si y cómo la Antigua Clasificación Cualitativa se pone en correlación con los nuevos sistemas de medida semi-cuantitativa, incluyendo el Índice de Actividad Histológico de Knodell (IAH), IAH Modificado de Ishak, los sistemas de medida de Scheuer, Batts-Ludwig y METAVIR, y cómo estos sistemas de medida se relacionan entre ellos. El estudio se realizó con biopsias hepáticas provenientes de cincuenta casos consecutivos de hepatitis crónica. Cada caso se clasificó y se evaluó según los 6 sistemas y posteriormente se realizó estudios de correlación. Los sistemas de medida semi-cuantitativa mostraron buena correlación estadística entre sí. Es posible, por tanto, comparar biopsias de hepatitis crónica de varios centros donde se utiliza cualquiera de estos 5 sistemas de medida semi-cuantitativa. También se pueden comparar los resultados de pruebas terapéuticas que utilizan estos sistemas. La Antigua Clasificación Cualitativa solo mostró una asociación significativa con la medición de actividad necroinflamatoria del sistema de Knodell pero no con la medición de fibrosis.

Palabras clave: Hepatitis crónica, medida de actividad necroinflamatoria, medida de fibrosis hepática, medida semi-cuantitativa, biopsia de hígado.

INTRODUCTION

The diagnosis of chronic hepatitis has major prognostic and therapeutic implications. Thorough investigation of chronic hepatitis is mandatory to discover the cause of liver inflammation, assess its severity, and plan treatment. Liver biopsy assessment is cardinal in this process. The first histological classification, which was published by an international group in 1968 (1) and revised in 1977 (2) (in this article referred to as the Old Qualitative Classification), codified the terms chronic persistent, chronic lobular, and chronic active hepatitis. The classification, as then proposed, was aimed only at distinguishing subgroups according to the degree of disease activity and at providing prognostic information and criteria for the use of immunosuppressive therapy. In recent times, more information has become available concerning the causes, natural history, pathogenesis, serological features, and therapy of chronic hepatitis. Therefore, the categorisation of chronic hepatitis, hitherto based primarily upon histopathological features, has been replaced by a more informative classification based upon a combination of clinical, serological, and histological variables (3-7). Thus the overall classification of chronic hepatitis is now based upon aetiology, necroinflammatory activity, and degree of fibrosis.

The concepts of grading and staging have traditionally been applied to neoplasia; grading describes the degree of differentiation of a neoplasm, while staging describes the extent of its spread. The same principles have come to be applied, however, with some modifications, to chronic hepatitis (8); grading is used to describe the intensity of necroinflammatory activity while staging is an indication of architectural alteration thus signifying progression of the disease towards cirrhosis or end-stage liver disease. This histological activity is important for the patient and the clinician because it provides a measure of severity of the hepatitis at the time of biopsy, and this is not always matched by abnormal liver function tests (9-11). The application of this concept of grading and staging seeks to impute a prognostic value to biopsy assessment in chronic hepatitis (12). In furtherance of this, numerical scores have been attributed to both staging and grading, thus providing a semi-quantitative assessment of the observed histological features.

Even though the Knodell Histological Activity Index (Knodell HAI) (13) was the first of this kind and is the most widely used system (14), its apparent flaws detract from its universal use and hence several other scoring systems have been proposed (15-19). This lack of consensus makes it difficult to compare chronic hepatitis specimens from different centres where the various semi-quantitative scoring systems are used. It is also difficult to compare studies on chronic hepatitis therapies and management techniques when different scoring systems have been applied. The purpose of this study is to assess, using liver biopsies from Nigerian patients, the correlation between 6 systems, namely the Old Qualitative Classification, Knodell HAI, Scheuer system (20), Ishak Modified Histological Activity Index (Ishak Modified HAI) (12), Batts-Ludwig system (21), and METAVIR system (22). Inter- and intra-observer studies have been carried out to test the reproducibility of, and correlation between, the Knodell HAI, Scheuer system, and METAVIR system (23). To our knowledge, this study represents the first attempt to determine the degree of correlation between all 6 systems of scoring or classifying chronic hepatitis.

MATERIALS AND METHODS

The surgical daybooks of the Morbid Anatomy department of the Obafemi Awolowo University Teaching Hospitals Complex, Ile-Ife, covering January 1992 to December 2000, were searched for all liver biopsies. The original request cards were retrieved and relevant information extracted. The first 50 consecutive cases of chronic hepatitis were selected for the study. To be included, a liver biopsy had to show definite histological evidence of chronic hepatitis, with or without serological alterations suggestive of chronic hepatitis. The archival biopsies from these 50 cases were retrieved and sections from each were subjected to 6 stains: haematoxylin and eosin, Periodic Acid-Schiff (PAS) with diastase digestion, Masson’s trichrome, reticulin, orcein, and Perls’ iron. The exclusion criteria applied in the selection of cases were as follows:

a) Cases without either the complete pathology report or the relevant slides in the departmental archives.

b) Cases of liver disease in which, despite clinical and histological features suggestive of chronic hepatitis, there was evidence that the chronic inflammation seen could be attributed to other disease processes that are now excluded from chronic hepatitis. The excluded diseases are: primary biliary cirrhosis, primary sclerosing cholangitis, Wilson’s disease, haemochromatosis, alpha 1-antitrypsin deficiency disease of the liver, non-alcoholic steatohepatitis, and alcoholic liver disease.

c) Deteriorated slides, such as broken or faded slides, were rejected. Cases where the original blocks were still accessible and satisfactory recut sections could be taken were accepted and used for the study.

d) Sections of non-cirrhotic liver with fewer than 3 recognisable portal spaces.

Of the 50 patients, 29 were male and 21 were female. Ages ranged from 9 to 70 years, with an average of 36.1 years and a standard deviation of 17.2 years. Each biopsy was evaluated by a single pathologist (the first author) and classified using the 6 systems, namely the Old Qualitative Classification, Knodell HAI, Scheuer, Ishak Modified HAI, Batts-Ludwig, and METAVIR systems. The published guidelines for all these systems were strictly adhered to. For each scoring system, two scores were obtained: one for the degree of necroinflammatory activity and another for the degree of fibrosis. The Knodell HAI, which is traditionally reported as a single score, was divided into 2 scores in this study: a necroinflammatory activity score consisting of the sum of the first 3 criteria, and a fibrosis score, which is the fourth criterion in the table (13). This was done so that the scores obtained would be comparable with the corresponding scores of the other scoring systems.

Two statistical methods were employed to analyse the results: the Pearson Product Moment Correlation method and the Chi-square distribution method. The Pearson Product Moment Correlation method (24) was used to assess correlation between the Knodell HAI, Scheuer system, Ishak Modified HAI, Batts-Ludwig system, and METAVIR system only (excluding the Old Qualitative Classification), because these scoring systems generate numeric data and a parametric measure of correlation could therefore be used. The Old Qualitative Classification is essentially an assessment of the necroinflammatory activity in chronic hepatitis and does not generate numeric data. Therefore, a frequency distribution of its categories was tabulated against the severity of necroinflammatory activity (approximated as minimal, mild, moderate, or severe) derived from the Knodell HAI scores. The Chi-square distribution method (24) was then used to assess the relationship between the Old Qualitative Classification and the Knodell HAI. After all 50 cases had been scored, the analysis of the scores obtained was carried out in four separate steps.

(1) Firstly, the necroinflammatory activity scores of the 5 scoring systems were obtained. Four correlation tests were carried out between the Knodell HAI, on the one hand, and the Scheuer system, Ishak Modified HAI, Batts-Ludwig system, and METAVIR system respectively, on the other. The Pearson Product Moment Correlation method was used, and the 4 resulting correlation coefficients (r) were each tested for significance using the two-tailed t distribution table. The correlation was considered significant when p<0.001.

(2) Secondly, the fibrosis scores of the 5 scoring systems were obtained and analysed in the same manner as the necroinflammatory activity scores, again using the Pearson Product Moment Correlation method.

(3) Thirdly, all the 50 cases were classified by the Old Qualitative Classification into chronic persistent, chronic active, and chronic lobular hepatitis. Then using the necroinflammatory activity scores obtained from the Knodell HAI, these 50 cases were also classified into 4 groups namely, minimal, mild, moderate, and severe chronic hepatitis using the format suggested by Desmet et al. (8) (table 1). Hence, a frequency distribution was obtained with the histological classification on one part and the degree of necroinflammatory activity on the other. The Chi-square distribution method was then applied to test for any association between these two characteristics.

(4) Lastly, a similar association test was carried out between the Old Qualitative Classification and the fibrosis scores obtained from the Knodell HAI as well.
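Steps (1) and (2) can be sketched in code. The following is a minimal illustration, not the authors' actual computation: the function names and the six example scores are hypothetical, and the t statistic is left for comparison against a two-tailed t table with n - 2 degrees of freedom, as described in the text.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation coefficient of two score lists."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two lists of equal length with at least 3 cases")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    if abs(r) >= 1:
        return math.copysign(math.inf, r)
    return r * math.sqrt((n - 2) / (1 - r * r))


# Hypothetical scores for six biopsies (illustrative only, not study data).
knodell_grade = [1, 3, 4, 6, 9, 12]
other_grade = [0, 1, 1, 2, 3, 4]

r = pearson_r(knodell_grade, other_grade)
t = t_statistic(r, len(knodell_grade))
print(f"r = {r:.3f}, t = {t:.2f}, df = {len(knodell_grade) - 2}")
```

The same two functions would apply unchanged to the fibrosis scores in step (2); only the input lists differ.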

RESULTS

Table 2 shows the necroinflammatory activity scores obtained for each of the 50 biopsies studied, along with the corresponding histological type. All 4 of the other numerical scoring systems showed positive correlation with the Knodell HAI. The correlation coefficients obtained were all significant when tested (p<0.001); hence the positive correlation obtained for all 4 systems is unlikely to have occurred by chance. Of the 4 scoring systems, the Ishak Modified HAI correlated most closely with the Knodell HAI, while the METAVIR system correlated least (r=0.99912 and r=0.94045 respectively).

Table 3 shows the fibrosis scores obtained for each of the 50 biopsies studied. All 4 of the other numerical scoring systems showed positive correlation with the Knodell HAI. As in the case of necroinflammatory activity, the correlation coefficients obtained were all significant when tested (p<0.001); hence the positive correlation obtained for all 4 systems is unlikely to have occurred by chance. Of the 4 scoring systems, the Scheuer and Batts-Ludwig systems showed the same degree of correlation (r=0.99378), higher than that of the other two. The average correlation coefficient for the necroinflammatory scores is 0.96556 while that for the fibrosis scores is 0.99050, showing that the fibrosis scores correlated more closely than the necroinflammatory scores.

Table 4 is an m × n contingency table showing the frequency distribution of cases by category of the Old Qualitative Classification and by disease grade derived from the Knodell HAI necroinflammatory activity score. Using the Chi-square distribution method, an association was seen to exist between the two features at the 0.1% significance level (p<0.001). Twenty-two biopsies were classified as chronic persistent hepatitis (CPH), 27 as chronic active hepatitis (CAH), and 1 as chronic lobular hepatitis (CLH). Most of the cases of CPH (91%) were seen to have minimal activity, while most of the cases of CAH (59%) had mild activity. Severe necroinflammatory activity is not usually a feature of chronic hepatitis and in this study no case had severe activity.

Table 5 is another m × n contingency table showing the frequency distribution of cases by category of the Old Qualitative Classification and by Knodell HAI fibrosis score. Using the Chi-square distribution method, no association was found between the two features at the 5% significance level (p>0.05). The majority of biopsies in each of the 3 histological categories (CLH, CPH, and CAH) had no fibrosis (100%, 82%, and 59% respectively).
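The Chi-square test on an m × n contingency table, as used for tables 4 and 5, can be illustrated with a short sketch. The function name and the counts below are hypothetical and are not the study's tabulated data; the statistic would be compared against a Chi-square table at (m - 1)(n - 1) degrees of freedom.

```python
def chi_square(table):
    """Return (statistic, degrees_of_freedom) for an m x n table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    if total == 0 or 0 in row_totals or 0 in col_totals:
        raise ValueError("every row and column needs at least one case")
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


# Hypothetical counts: rows are histological categories,
# columns are activity grades (illustrative only).
counts = [
    [10, 2, 0],
    [3, 9, 6],
]
stat, dof = chi_square(counts)
print(f"chi-square = {stat:.2f}, df = {dof}")
```

Note that the Chi-square approximation becomes less reliable when expected counts are small, which may be relevant for a category containing very few cases.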

DISCUSSION

The new recommendations for nomenclature, grading, and staging of chronic hepatitis attempt to standardise the criteria and simplify the terminology used in making these diagnoses. Whether a system of grading and staging is numerical or descriptive, simple or complex, matters less than its ability to communicate important information about the degree of necroinflammatory activity (grade) and the extent of the disease (stage of fibrosis), which are likely to be of prognostic and therapeutic significance to the clinician. Similar to a previous report by Assy and Minuk (25), this study shows a significant association between the histological category (in the Old Qualitative Classification) and the degree of necroinflammatory activity obtained from the Knodell necroinflammatory score (table 4). Of the cases of CPH, 91% had a Knodell necroinflammatory activity score between 0 and 3 (minimal chronic hepatitis), while 93% of the cases of CAH had a score between 4 and 12 (mild or moderate chronic hepatitis). CPH thus generally connotes a lesser degree of necroinflammatory activity. There was no association when the Old Qualitative Classification was compared with the degree of fibrosis (table 5).
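The splitting of the Knodell HAI into a grade and a stage, and the conversion of the grade into a verbal category, can be sketched as follows. This is an illustration under stated assumptions: the field and function names are hypothetical, the 0 to 3 (minimal) and 4 to 12 (mild or moderate) bands come from the text above, and the finer cut-offs (4 to 8 mild, 9 to 12 moderate, 13 to 18 severe) are assumed here because table 1 is not reproduced in this text.

```python
from typing import NamedTuple


class KnodellScores(NamedTuple):
    """The four Knodell HAI criteria (field names are illustrative)."""
    periportal_necrosis: int    # criterion 1
    intralobular_necrosis: int  # criterion 2
    portal_inflammation: int    # criterion 3
    fibrosis: int               # criterion 4


def split_knodell(scores):
    """Return (necroinflammatory grade, fibrosis stage) as separate scores."""
    grade = (scores.periportal_necrosis
             + scores.intralobular_necrosis
             + scores.portal_inflammation)
    return grade, scores.fibrosis


def verbal_grade(grade):
    """Map a Knodell necroinflammatory score to a verbal category."""
    if not 0 <= grade <= 18:
        raise ValueError("Knodell necroinflammatory score must be 0 to 18")
    if grade <= 3:
        return "minimal"
    if grade <= 8:       # assumed boundary between mild and moderate
        return "mild"
    if grade <= 12:
        return "moderate"
    return "severe"      # assumed band: 13 to 18


# Hypothetical biopsy (illustrative only).
grade, stage = split_knodell(KnodellScores(3, 1, 1, 1))
print(grade, stage, verbal_grade(grade))
```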

This study shows significant correlation between the 5 new scoring systems assessed, both for the necroinflammatory activity and fibrosis scores. This is quite similar to the results obtained by Foschini and Dal Monte (23) who found good correlation between the Knodell HAI, Scheuer System, and Ishak Modified HAI (3 of the 5 scoring systems assessed in this study). The possibility of comparing chronic hepatitis specimens from different Histopathology centres where any one of these 5 scoring systems is applied is a major consequence of this finding (correlation). Also the results of therapeutic trials using these different scoring systems can be reasonably compared.

Scoring of chronic hepatitis specimens is now widespread but this practice must be recognised and utilised for what it is. These scores are not to be considered the equivalent of quantitative laboratory tests but rather as indicative of relative severity. The correlation observed in this study among the semi-quantitative scoring systems, being only a statistical approximation of numerical scores, is necessarily imperfect because each score can be derived in one of several ways, depending on the exact distribution of inflammation and the different types of necrosis. Scoring therefore represents an approximate summary of severity and cannot replace accurate verbal description of biopsy findings, nor should it be given priority over a statement of the aetiological subtype of chronic hepatitis. An exclusive focus on the global or summary score for a biopsy may lead one to overlook the contribution of individual components with clinical significance; indeed, it has been shown that different patterns of activity can produce the same final score (26). The various scoring systems have their advantages and disadvantages (27-28). None at present employs all the clinical, aetiological, and histological information available. Such a classification, were it to be developed, would indeed be more comprehensive than those used at present, but would also be more complex and thus less reproducible. As an alternative to an extensive subdivision, the histological report should have 3 parts, namely a histological description, a diagnosis statement, and a semi-quantitative score. The diagnosis statement should be an aetiological diagnosis which takes into consideration all the serological and immunohistochemical data available. A verbal approximation of the necroinflammatory activity and fibrosis stage (minimal, mild, moderate, or severe) should be applied to the aetiological diagnosis.
If Knodell HAI scoring was done, then table 1 (8) could be used to arrive at this approximation. It is also good practice to indicate indirectly the degree of certainty of the aetiological diagnosis. While serological and immunohistochemical data permit a sure diagnosis, histological pointers are sometimes quite vague or only suggestive of the noxious agent. The semi-quantitative score appears after the diagnosis statement, as an adjunct to it. It should be written with the various category scores appearing in the same order as in the original outline. The scores for disease grade and stage must be written on separate lines, even if the Knodell HAI is employed. It will thus prove to be a useful tool for statistical analysis and follow-up of patients.

An acceptable system of grading and staging must have several characteristics. Firstly, it must include all features known to be of value in the assessment of the severity and extent of the disease, as well as those believed to have prognostic potential for useful evaluation. Secondly, the system must be practical and not cumbersome to use, and must be shown to be reasonably reproducible, both by one pathologist at different times (low intra-observer variation) and by different pathologists (low inter-observer variation). Thirdly, the system must be useful to the clinician or evaluator of a clinical trial; reproducibility is not in itself valuable if the data generated cannot be used by the clinician. This often requires a compromise between complexity and reproducibility. The system that is most appropriate for clinical practice may not be the most informative for investigative work. The use of separate systems tailored to clinical and research purposes may be warranted if such a goal can be accomplished without imposing unnecessary difficulty on the pathologist or causing confusion among the clinical audience.
The choice of system depends to a large measure on whether it will be used for routine clinical work or for research purposes. The more detailed systems like the Knodell HAI and Ishak Modified HAI are better suited for research purposes. For routine clinical work, these systems are cumbersome and more liable to generate inter- and intra-observer differences (29), especially among pathologists who only see liver biopsy specimens from time to time. Since the results of this study show good correlation among all these scoring systems, the simpler systems like the METAVIR, Scheuer, or Batts-Ludwig system could be used for routine bench reporting and still be useful to the clinician for follow-up of patients. The authors also recommend that all the pathologists in any given department (institution) should use the same scoring system, agreed upon among them, and that this decision should in turn be communicated to the clinicians. This arrangement can easily be achieved in those centres where there are frequent clinicopathological meetings and discussion sessions.

In conclusion, this study has shown that there is a statistically significant association between the Old Qualitative Classification and the necroinflammatory score of the Knodell HAI but not with the fibrosis score. This old system should no longer be used because it does not include disease stage and aetiology and cannot be correlated with the newer semi-quantitative scoring systems. The system should rather be replaced by a system or combination of systems that takes into account the grade, stage, and possible aetiology of the hepatic inflammation. There is also significant statistical correlation between the Knodell HAI and the other grading and staging systems when disease grade and stage are compared separately.
We therefore suggest a system of reporting biopsies that includes a detailed histological description of all the relevant findings; a diagnosis statement, which contains the aetiological diagnosis and an approximation of the disease grade and stage; and finally a semi-quantitative score.
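The three-part report suggested above could be laid out along the following lines. This is only a sketch of one possible layout: the class, field names, section headings, and example values are all hypothetical, and the one point taken directly from the text is that the grade and stage scores appear on separate lines after the diagnosis statement.

```python
from dataclasses import dataclass


@dataclass
class BiopsyReport:
    """Hypothetical container for the three parts of the report."""
    description: str   # histological description
    aetiology: str     # aetiological diagnosis
    grade_label: str   # minimal, mild, moderate, or severe
    stage_label: str   # verbal approximation of the fibrosis stage
    system: str        # scoring system used
    grade_score: int
    stage_score: int


def format_report(report):
    """Render the report with grade and stage scores on separate lines."""
    return "\n".join([
        "HISTOLOGICAL DESCRIPTION",
        report.description,
        "",
        "DIAGNOSIS",
        f"Chronic hepatitis, {report.aetiology}; "
        f"{report.grade_label} activity, {report.stage_label} fibrosis",
        "",
        f"SEMI-QUANTITATIVE SCORE ({report.system})",
        f"Grade (necroinflammatory activity): {report.grade_score}",
        f"Stage (fibrosis): {report.stage_score}",
    ])


# Illustrative values only, not a case from the study.
example = BiopsyReport(
    description="Portal tracts expanded by lymphocytes with focal interface hepatitis.",
    aetiology="hepatitis B virus (serologically confirmed)",
    grade_label="mild",
    stage_label="mild",
    system="Knodell HAI",
    grade_score=5,
    stage_score=1,
)
print(format_report(example))
```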

REFERENCES

1. De Groote J, Desmet VJ, Gedigk P, Korb G, Popper H, Poulsen H, et al. A classification of chronic hepatitis. Lancet 1968; 2: 626-8.
2. International group. Acute and chronic hepatitis revisited. Lancet 1977; 2: 914-9.
3. Dienstag JL, Isselbacher KJ. Chronic hepatitis. In: Fauci AS, Braunwald E, Isselbacher KJ, Wilson JD, Martin JB, Kasper DL, et al., editors. Harrison’s principles of internal medicine. 14th ed. New York: McGraw-Hill; 1998. p. 1696-703.
4. Scheuer PJ. Changing views on chronic hepatitis. Histopathology 1986; 10: 1-4.
5. Working Party. Terminology of chronic hepatitis, allograft rejection, and nodular lesions of the liver. Summary of recommendations developed by an International Working Party, supported by the World Congress of Gastroenterology. Los Angeles, 1994. Am J Gastroenterol 1994; 89: S177-81.
6. Ishak KG. Chronic hepatitis: morphology and nomenclature. Mod Pathol 1994; 7: 690-713.
7. Hall PM. Chronic hepatitis: an update with guidelines for histopathological assessment of liver biopsies. Pathology 1998; 30: 369-80.
8. Desmet VJ, Gerber M, Hoofnagle J, Mann M, Scheuer P. Classification of chronic hepatitis. Diagnosis, grading and staging. Hepatology 1994; 19: 1513-20.
9. Van Thiel DH, Caraceni P, Molloy PJ, Hassanein T, Kania RJ, Gurakar A, et al. Chronic hepatitis C in patients with normal or near normal alanine aminotransferase levels. The role of interferon alpha-2b therapy. J Hepatol 1995; 23: 503-8.
10. Gholson CF, Morgan K, Catinis G, Favrot D, Taylor B, Gonzalez E, et al. Chronic hepatitis C with normal aminotransferase levels. A clinical histologic study. Am J Gastroenterol 1996; 92: 1788-92.
11. Scheuer PJ. Chronic hepatitis: what is activity and how should it be assessed? Histopathology 1997; 30: 103-5.
12. Ishak K, Baptista A, Bianchi L, Callea F, de Groote J, Gudat F, et al. Histological grading and staging of chronic hepatitis. J Hepatol 1995; 22: 696-9.
13. Knodell RG, Ishak KG, Black WC, Chen TS, Craig R, Kaplowitz N, et al. Formulation and application of a numerical scoring system for assessing histological activity in asymptomatic chronic active hepatitis. Hepatology 1981; 1: 431-5.
14. Brunt EM. Grading and staging the histopathological lesions of chronic hepatitis: the Knodell histology activity index and beyond. Hepatology 2000; 31: 241-6.
15. Scheuer PJ, Davies SE, Dhillon AP. Histopathological aspects of viral hepatitis. J Viral Hepat 1996; 3: 277-83.
16. Ludwig J. The nomenclature of chronic active hepatitis: an obituary. Gastroenterology 1993; 105: 274-8.
17. Sherlock S. Classifying chronic hepatitis. Lancet 1989; 2: 1168-70.
18. Zetterman RK. Chronic hepatitis. Is it persistent, active or just chronic? Am J Gastroenterol 1993; 88: 1-2.
19. Gabriel A, Ziokowski A. New terminology of chronic hepatitis, including a scoring scale. Med Sci Monit 1997; 3: 113-8.
20. Scheuer PJ. Classification of chronic viral hepatitis. A need for reassessment. J Hepatol 1991; 13: 372-4.
21. Batts KP, Ludwig J. Chronic hepatitis. An update on terminology and reporting. Am J Surg Pathol 1995; 19: 1409-17.
22. Bedossa P, Poynard T, The French METAVIR Cooperative Study Group. An algorithm for grading activity in chronic hepatitis C. Hepatology 1996; 24: 289-93.
23. Foschini MP, Dal Monte R. [Comparison of the different methods of grading and staging of chronic hepatitis]. Pathologica 1996; 88: 263-9.
24. Armitage P, Berry G. Statistical Methods in Medical Research. 2nd ed. Oxford: Blackwell Scientific Publications; 1994.
25. Assy N, Minuk G. A comparison between previous and present histologic assessments of chronic hepatitis C viral infections in humans. World J Gastroenterol 1999; 5: 107-10.
26. Hubscher SG. Histological grading and staging in chronic hepatitis: clinical applications and problems. J Hepatol 1998; 29: 1015-22.
27. Chen TJ, Liaw YF. The prognostic significance of bridging hepatic necrosis in chronic type B hepatitis: a histopathologic study. Liver 1988; 8: 10-6.
28. Westin J, Lagging LM, Wejstal R, Norkrans G, Dhillon AP. Interobserver study of liver histopathology using the Ishak score in patients with chronic hepatitis C virus infection. Liver 1999; 19: 183-7.
29. Goldin RD, Goldin JG, Burt AD, Dhillon PA, Hubscher S, Wyatt J, et al. Intra-observer and inter-observer variation in the histopathological assessment of chronic viral hepatitis. J Hepatol 1996; 25: 649-54.