Significance of results in cancer imaging

Cancer Imaging (2001) 2, 1–5
Introductory Lecture, Monday 15 October 2001, 09.10–09.50

Rodney H Reznek
Professor of Diagnostic Imaging, St Bartholomew’s Hospital, London, UK

1470-7330/01/010001 + 05 © 2001 International Cancer Imaging Society

Introduction

The effective management of patients with cancer requires a multidisciplinary team approach with the diagnostic radiologist playing an extremely important role in that team. Increasingly it is realized that it is often the responsibility of the radiologist to understand and elucidate the significance of the findings of a test. Its significance lies not only in the clinical context but also in appreciating the impact that the test will have on the patient’s outcome. The latter requires a knowledge of the cost-effectiveness of the use of any imaging technique.

Evaluation of the effect of imaging

The issue of cost-effective imaging is complex, and a detailed discussion is beyond the scope of this text. However, recognition of the importance of proper evaluation of imaging techniques and of their use in clinical practice should improve both cost-effectiveness and efficacy of cancer imaging. These issues are discussed in an excellent review on measuring the effects of imaging by MacKenzie and Dixon [1]. These authors point out that for diagnostic technologies it is not clear how a technology itself may directly affect the physical health of the patient, a factor which is particularly important in the case of diagnostic imaging development. A strategy has been devised, therefore, for evaluating the chain of events in which a trained observer makes an imaging report and the clinician combines the information in the report with clinical findings and other tests to make a diagnosis and choose appropriate therapy [2]. Fineberg et al. [3] introduced four levels to determine efficacy for diagnostic imaging which have subsequently been expanded to five levels:

Technical performance
Diagnostic performance
Diagnostic impact
Therapeutic impact
Impact on health

The positive effect of one level is determined by the level above and in turn determines the possibility of a positive result at the level below.

Technical performance relates to the ability to obtain high image quality in a reasonable time frame and whether these images permit correct interpretation.

Diagnostic performance is concerned with the ability of the technique to identify disease correctly. Thus, diagnostic performance is a measure of sensitivity, specificity, positive predictive value, negative predictive value and accuracy of the technique in a given clinical situation. This is a familiar method of evaluating imaging in cancer and the major method by which different imaging techniques are compared. Thus, the decision to use one imaging technique for staging cancer in preference to another is frequently based on information provided on diagnostic performance. While it is not possible to discuss the use of statistics in detail, it is important to recognize that studies should be designed to test a hypothesis and that the help of a statistician to design a study is likely to yield enormous benefits by reducing inappropriate methodology and bias [1].

Diagnostic impact is determined by the influence of the result of imaging on the clinician’s diagnostic confidence and by the ability of the new technology to replace older established methods. Displacement of older techniques by new imaging modalities is easy to demonstrate. For example, lymphangiography has now become obsolete in the staging of several cancers and myelography has also been superseded by magnetic resonance imaging (MRI) in the investigation of spinal cord compression [4,5].

Therapeutic impact reflects the alteration in management of a patient based on results of imaging. Dixon et al. [6] recorded changes in the proposed treatment in 182 of 200 patients referred for MRI of the head and spine and, in the same group of patients, surgery was considered to be appropriate in 50 patients before MRI, but in only 28 patients following the results of the examination.

Impact on health is much more difficult, if not impossible, to evaluate, particularly in oncology when diagnostic information may be in advance of the ability to treat the disease. However, progress in research in both diagnosis and treatment of cancer can only be made by furthering our understanding of the natural processes of therapeutic response and tumour regrowth. In this context, therefore, imaging has an important role in cancer even if there is no demonstrable impact on health. Furthermore, it must be emphasized that although imaging itself cannot make an impact on outcome, the results of imaging may directly influence management, allowing the clinician to make the optimum therapeutic decision. In this way diagnostic imaging, through therapy, does make an important contribution to final outcome.

Diagnostic performance

The diagnostic impact of imaging is most frequently assessed on the basis of studies designed to evaluate the ability of a technique to detect cancer accurately.
In a review entitled ‘A guide to clinical epidemiology for radiologists’, Goldin and Sayre [7] commented that the poor understanding by physicians of the principles of statistical analysis weakens many investigations. Their review discusses the different methods of statistical analysis and basic concepts used to select the appropriate technique and to interpret the results, and is recommended as an excellent overview of the subject.

In the text of the chapters that follow, many references are made to sensitivity, specificity, positive predictive value, negative predictive value and accuracy. Advising on the judicious use of imaging studies in the staging and evaluation of malignancy requires a thorough understanding of these basic tests of efficacy and of the receiver–operating characteristic (ROC) curve. These terms are defined below:

Sensitivity of an investigation is its ability to identify correctly those patients who have the disease, that is, the proportion of patients with the disease who have positive test results. Sensitivity is also referred to as the true-positive rate of the investigation.

$$\text{Sensitivity} = \frac{\text{true positives}}{\text{true positives} + \text{false negatives}}$$

Specificity of an investigation is its ability to identify correctly those patients who do not have the disease, that is, the proportion of patients without disease who have negative test results.

$$\text{Specificity} = \frac{\text{true negatives}}{\text{true negatives} + \text{false positives}}$$

The complement of the specificity (1 − specificity) is the false-positive rate of the test.

Accuracy of a test equals:

$$\text{Accuracy} = \frac{\text{true positives} + \text{true negatives}}{\text{true positives} + \text{true negatives} + \text{false positives} + \text{false negatives}}$$

The accuracy of a test is of less value than the sensitivity and specificity because it lumps together positive and negative results.

The positive predictive value (PPV) of a test indicates the probability that the disease is actually present if the test is positive.

$$\text{PPV} = \frac{\text{true positives}}{\text{true positives} + \text{false positives}}$$

Negative predictive value (NPV) indicates the probability that the disease is absent if the result is negative.

$$\text{NPV} = \frac{\text{true negatives}}{\text{true negatives} + \text{false negatives}}$$

Note that the NPV is not in general equal to 1 − PPV, since the two are calculated from different groups of patients (those with negative and those with positive test results, respectively).

The sensitivity and specificity of a test are generally independent of disease prevalence and are therefore often called the intrinsic operating characteristics of the test. On the other hand, the PPV (and NPV) and accuracy are highly dependent on the prevalence of the disease and cannot be generalized over settings where the prevalence varies. For this reason, reports of sensitivity and specificity are more reliable than reports of PPV and accuracy, which are greatly influenced by regional variation of disease prevalence.

Table 1 Interobserver agreement [8,9] (rows: radiologist A; columns: radiologist B)

| Radiologist A | Normal | Benign | Suspected cancer | Cancer | Total |
|---|---|---|---|---|---|
| Normal | 21 | 12 | 0 | 0 | 33 |
| Benign | 4 | 17 | 1 | 0 | 22 |
| Suspected cancer | 3 | 9 | 15 | 2 | 29 |
| Cancer | 0 | 0 | 0 | 1 | 1 |
| Total | 28 | 38 | 16 | 3 | 85 |

ROC

Other statistical methods such as receiver–operating characteristic (ROC) analysis and kappa statistics are commonly used. ROC analysis is a plot of sensitivity against 1 − specificity for different cut-off points of a particular test. By grading test results according to five categories (strongly positive, 5; weakly positive, 4; intermediate, 3; weakly negative, 2; strongly negative, 1) and plotting sensitivity against 1 − specificity, the ROC curve is generated (Fig. 1). Thus, as the criteria for calling a test result positive are made more stringent, specificity improves at the expense of sensitivity. Conversely, as the criteria are relaxed, sensitivity improves while specificity diminishes.

Figure 1 ROC curve: sensitivity plotted against 1 − specificity (false-positive rate).
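The definitions and the ROC construction described above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original lecture: the function names and the rating counts are hypothetical, and the predictive values are obtained from sensitivity, specificity and prevalence using Bayes' theorem.

```python
def roc_points(diseased, healthy):
    """Return (1 - specificity, sensitivity) pairs for every cut-off.

    diseased[k] and healthy[k] are the numbers of patients with and
    without disease who were given rating k + 1 (1 = strongly negative,
    5 = strongly positive). A result counts as positive when the rating
    is at or above the cut-off; cut-off 6 means nothing is called positive.
    """
    n_diseased, n_healthy = sum(diseased), sum(healthy)
    points = []
    for cutoff in range(6, 0, -1):          # most stringent criterion first
        true_pos = sum(diseased[cutoff - 1:])
        false_pos = sum(healthy[cutoff - 1:])
        points.append((false_pos / n_healthy, true_pos / n_diseased))
    return points


def ppv(sens, spec, prevalence):
    """Positive predictive value at a given disease prevalence."""
    true_pos = sens * prevalence
    false_pos = (1 - spec) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


def npv(sens, spec, prevalence):
    """Negative predictive value at a given disease prevalence."""
    true_neg = spec * (1 - prevalence)
    false_neg = (1 - sens) * prevalence
    return true_neg / (true_neg + false_neg)


if __name__ == "__main__":
    # Hypothetical rating counts for ratings 1..5 (illustrative only).
    diseased = [3, 7, 10, 30, 50]
    healthy = [50, 30, 10, 7, 3]
    for fpr, tpr in roc_points(diseased, healthy):
        print(f"1 - specificity = {fpr:.2f}, sensitivity = {tpr:.2f}")

    # Same test (sensitivity 0.9, specificity 0.9) in two populations.
    for prevalence in (0.5, 0.01):
        print(prevalence, ppv(0.9, 0.9, prevalence), npv(0.9, 0.9, prevalence))
```

With these made-up counts each step to a less stringent cut-off raises sensitivity and the false-positive rate together, and the same test gives a PPV of 0.9 at 50% prevalence but only about 0.08 at 1% prevalence, which is the prevalence dependence described in the text.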
The fundamental principle illustrated by the ROC curve is that there is an inherent limit to the diagnostic efficacy of a test. Once this limit has been reached, the interpreter can only improve sensitivity at the expense of specificity and vice versa. The ROC curve can be used to select the ‘best’ cut-off criteria for positivity, taking the positive predictive value and the relative costs (in terms of patient outcome) of false-positive and false-negative test results into account. This has particular relevance in the use of imaging in staging cancer where cut-off criteria for positive results are constantly being decided. An example of this is deciding on the upper limit of normal size for lymph nodes on cross-sectional imaging.

An understanding of the ROC curve is therefore essential for all radiologists and oncologists interpreting the results of imaging in staging cancer. First, the curve displays explicitly the trade-off between sensitivity and specificity which results from varying the criterion for interpretation. Second, it provides a graphical summary of how well a test performs for each method of interpretation, allowing one to compare two or more tests without the necessity of having to stipulate the positive criterion for each test.

Kappa statistics are used to demonstrate the agreement between observers or different tests [7].

Interobserver agreement (Kappa)

Altman [8] describes well how to measure interobserver agreement, using as data the assessments of 85 xeromammograms by two radiologists (A and B) where the xeromammogram reports are given as one of four results: normal, benign disease, suspected cancer, cancer. A measure of agreement is required between radiologist A and radiologist B rather than a test of association such as might be undertaken using the χ² test (Table 1).

As Altman points out, the simplest approach is to count how many exact agreements were observed between A and B, which from Table 1 is 54/85 = 0.64. However, the disadvantage of merely quoting a 64% measure of agreement is that it does not take into account where the agreements occurred, nor the fact that one would expect a certain amount of agreement between radiologist A and radiologist B purely by chance, even if they were guessing their assessments.

The complete theory underpinning the kappa (κ) test, including the calculation of confidence intervals and including a weighted kappa test where not all disagreements are treated equally, has been given by Altman [8]. The expected frequencies along the diagonal of Table 1 are given in Table 2, from which it is seen for these data that the number of agreements expected by chance is 26.2, which is 31% of the total, i.e. 26.2/85. What the kappa test gives is the answer to the question of how much better the radiologists were than 0.31.

Table 2 Calculation of the expected frequencies for the kappa test, after Altman [8]

| Assessment | Expected frequency |
|---|---|
| Normal | 33 × (28/85) = 10.87 |
| Benign | 22 × (38/85) = 9.84 |
| Suspected cancer | 29 × (16/85) = 5.46 |
| Cancer | 1 × (3/85) = 0.04 |
| Total | 26.2 (31%) |

The maximum agreement is 1.00 and the kappa statistic gives the radiologists’ agreement as a proportion of the possible scope for performing better than chance, which is 1.00 − 0.31.

$$\kappa = \frac{0.64 - 0.31}{1.00 - 0.31} = 0.47$$

There are no absolute definitions for interpreting κ [8,10] but it has been suggested that the guidelines in Table 3 can be followed, which in the example considered here means that there was moderate agreement between radiologist A and radiologist B.

Table 3 Guidelines for interpreting the κ statistic [8,10]

| κ values | Strength of agreement |
|---|---|
| <0.20 | Poor |
| 0.21–0.40 | Fair |
| 0.41–0.60 | Moderate |
| 0.61–0.80 | Good |
| 0.81–1.00 | Very good |

Imaging strategies

The radiologist should undoubtedly be at the forefront of deciding which test should be used in evaluating patients with malignant disease, and the appropriate and judicious use of radiological technology is a formidable challenge.

Based on the discussion above, it is clear that the proper use of imaging in cancer is a complex issue and at best only guidelines on the appropriate use of imaging techniques can be provided in the chapters which follow. Nevertheless, there are certain important issues that need to be addressed in the choice of a particular imaging technique which relate not only to technical and diagnostic performance but also to the purpose of imaging in an individual patient.

Imaging may be requested to answer a specific clinical question in an individual patient on cancer therapy or it may be requested as a routine investigation at the time of presentation for diagnostic and staging purposes. In those tumours where established therapy is available, imaging is required to measure therapeutic response. Imaging also has a major role in supporting clinical trials of new therapeutic agents and in this situation is used more frequently during the course of cancer than when used as a tool for management decisions. Imaging to support clinical trials is an increasingly important role for the radiologist with an interest in oncology. The very high accuracy and reproducibility of cross-sectional imaging (particularly computed tomography (CT) and MRI) make it extremely well suited to Phase II trials in which the oncologist is assessing the biological activity of new treatments. In Phase III trials, comparing the results of different treatments, survival is usually the final arbiter. If the size of the patient group is large enough, sophisticated staging is unnecessary as the stage will be randomized out. In practice, however, the groups tend to be small and one of the prognostic variables, namely the varying stage of the disease, can be removed from the study by achieving more accurate staging through imaging. Furthermore, in patients with advanced disease, where there is no obvious difference in survival and the end-point of the study becomes the response rate rather than survival, accurate imaging becomes an extremely valuable research tool.

An important impact of the use of sophisticated techniques to stage patients with cancer is the apparent continuous improvement in cancer survival rates reported over the last 25 years. Although this is quickly and easily attributable to earlier diagnosis and new and more effective treatments, the effect of more accurate staging may to some extent explain these improved results [11–13]. Feinstein et al. [11] found that a 1977 cohort of patients who had undergone lung cancer treatment survived significantly longer in each of three TNM subcategories than a cohort managed in the 1950s and 1960s, a finding which is not surprising. When, however, the recent cohort was staged on clinical grounds only, without the benefit of ultrasonography, CT and nuclear medicine, these survival differences disappeared. It was apparent that the improved survival rates were mainly an artefact of better staging; patients in the lower stages with clinically occult (usually nodal) disease were being identified with better imaging and were being placed in a more advanced stage (‘stage migration’). Better staging led to benefit to all: in the lower stages, patients with occult metastases would be removed, with benefit to those stages; in the higher stages, those patients with a lower tumour burden would be added to those with a higher one, with improvement in survival rates. Thus while individual prognosis did not change overall, survival in each stage improved. The stage migration phenomenon occurs when comparisons are made between groups of patients who have undergone less or more thorough staging techniques and as such is likely to occur when the comparisons are made over a time period which spans the introduction of new technology. It has been noted with numerous tumours including metastatic germ cell tumours [6,12] and gastric cancers [13].

Imaging may be used for surveillance of patients with no clinical evidence or imaging evidence of disease in order to identify relapse as early as possible. In patients with clinical suspicion of relapse, again imaging is required to detect recurrence in the previously treated patient. The choice of an imaging technique in this clinical setting depends on the ability of the different imaging methods, not only to identify an abnormality, but also to characterize a lesion and distinguish benign from malignant pathology in the presence of previously treated normal tissues which may have been damaged by therapy.

In all the situations outlined above, the imaging modality chosen will depend upon local factors which include the availability of equipment, the expertise of medical and ancillary personnel and the demands made on imaging by the workload of the department.

Best practice dictates that the imaging technique which provides the best diagnostic performance will be used in all circumstances, but this is not always possible. It is, however, incumbent on the radiologist to adhere to good practice using his knowledge of diagnostic imaging and of cancer to provide the optimum service within the local environment. Good practice requires close collaboration between radiologists and clinicians to define protocols. The issues to be addressed include:

The choice of a technique for different tumour types
For a given imaging technique, examination protocols should be agreed for every tumour
The timing of imaging in relation to treatment should be agreed
Follow-up studies should also be performed to an agreed protocol

Finally, the impact of diagnostic imaging in cancer is enormously improved by working in a multidisciplinary team with regular clinicoradiological review of imaging studies in relation to management decisions.

References

[1] MacKenzie R, Dixon AK. Review: measuring the effects of imaging: an evaluation framework. Clin Radiol 1995; 50: 513–8.
[2] Donabedian A. Evaluating the quality of medical care. Milbank Memorial Fund Quarterly 1966; 44: 166–206.
[3] Fineberg HV, Wittenberg J, Ferrucci JT. The clinical value of body computed tomography over time and technologic change. Am J Roentgenol 1983; 141: 1067–72.
[4] Libson E, Polliack A, Bloom RA. Value of lymphangiography in the staging of Hodgkin lymphoma. Radiology 1994; 193: 757–9.
[5] Williams MP, Cherryman GR, Husband JES. Magnetic resonance imaging in suspected metastatic spinal cord compression. Clin Radiol 1989; 40: 286–90.
[6] Dixon AK, Southern JP, Teale A et al. Magnetic resonance imaging for the head and spine: effective for the clinician or the patient? Br Med J 1991; 302: 78–82.
[7] Goldin J, Sayre JW. Review: a guide to clinical epidemiology for radiologists: part II statistical analysis. Clin Radiol 1996; 51: 317–24.
[8] Altman DG. Practical Statistics for Medical Research. London: Chapman & Hall, 1991: 404–9.
[9] Boyd NF, Wolfson C, Moskowitz M. Observer variation in the interpretation of xeromammograms. J Natl Cancer Inst 1982; 68: 357–63.
[10] Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics 1977; 33: 159–74.
[11] Feinstein AR, Sosin DM, Wells CK. The Will Rogers phenomenon. Stage migration and new diagnostic techniques as a source of misleading statistics for survival in cancer. New Engl J Med 1985; 312: 1604–8.
[12] Bosl GJ, Geller NL, Chan EY. Stage migration and the increasing proportion of complete responders in patients with advanced germ cell tumours. Cancer Res 1988; 48: 3524–7.
[13] Bunt AMG, Hermans J, Smit VTHBM, van de Velde CJH, Fleuren GJ, Bruijn JA. Surgical/pathologic stage migration confounds comparisons of gastric cancer survival rates between Japan and Western countries. J Clin Oncol 1995; 13: 19–25.

The digital object identifier for this article is: 10.1102/1470-7330.2001.004
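As a supplementary worked example for the interobserver agreement section, the kappa calculation can be reproduced directly from the cell counts of Table 1. The sketch below is illustrative only: the function name `cohen_kappa` is an assumption of this example, the statistic computed is the unweighted kappa, and confidence intervals and the weighted form described by Altman [8] are not covered.

```python
def cohen_kappa(table):
    """Unweighted kappa for a square agreement table.

    table[i][j] is the number of cases placed in category i by
    observer A (rows) and in category j by observer B (columns).
    Returns (observed agreement, chance agreement, kappa).
    """
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]

    # Proportion of cases on the diagonal (exact agreements).
    observed = sum(table[i][i] for i in range(len(table))) / n

    # Agreements expected by chance: sum of row total x column total / n,
    # expressed as a proportion of all cases.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / n ** 2

    kappa = (observed - expected) / (1 - expected)
    return observed, expected, kappa


# Cell counts from Table 1: normal, benign, suspected cancer, cancer.
TABLE_1 = [
    [21, 12, 0, 0],
    [4, 17, 1, 0],
    [3, 9, 15, 2],
    [0, 0, 0, 1],
]

if __name__ == "__main__":
    observed, expected, kappa = cohen_kappa(TABLE_1)
    print(f"observed = {observed:.2f}, expected = {expected:.2f}, kappa = {kappa:.2f}")
```

Rounded to two decimal places this gives an observed agreement of 0.64, a chance agreement of 0.31 and κ = 0.47, matching the figures quoted in the text.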

Cancer Imaging , Volume 2 (1) – May 5, 2015


Publisher
Springer Journals
Copyright
Copyright © 2001 by International Cancer Imaging Society
Subject
Medicine & Public Health; Imaging / Radiology; Cancer Research; Oncology
eISSN
1470-7330
DOI
10.1102/1470-7330.2001.004

no clinical evidence or imaging evidence of disease in Imaging also has a major role in supporting clinical order to identify relapse as early as possible. In patients trials of new therapeutic agents and in this situation is with clinical suspicion of relapse, again imaging is used more frequently during the course of cancer than required to detect recurrence in the previously treated when used as a tool for management decisions. Imaging patient. The choice of an imaging technique in this to support clinical trials is an increasingly important role clinical setting depends on the ability of the different for the radiologist with an interest in oncology. The very imaging methods, not only to identify an abnormality, high accuracy of and reproducibility of cross-sectional but also to characterize a lesion and distinguish benign imaging (particularly computed tomography (CT) and from malignant pathology in the presence of previously MRI) makes it extremely well suited to Phase II trials in treated normal tissues which may have been damaged by which the oncologist is assessing the biological activity therapy. of new treatments. In Phase III trials, comparing the In all the situations outlined above, the imaging results of different treatments, survival is usually the modality chosen will depend upon local factors which final arbiter. If the size of the patient group is large include the availability of equipment, the expertise of enough, sophisticated staging is unnecessary as the stage medical and ancillary personnel and the demands made will be randomized out. In practice, however, the groups on imaging by the workload of the department. tend to be small and one of the prognostic variables, Best practice dictates that the imaging technique namely the varying stage of the disease, can be removed which provides the best diagnostic performance will be from the study by achieving more accurate staging used in all circumstances, but this is not always possible. 
through imaging. Furthermore, in patients with ad- It is, however, incumbent on the radiologist to adhere to vanced disease, where there is no obvious difference in good practice using his knowledge of diagnostic imaging survival and the end-point of the study becomes the and of cancer to provide the optimum service within response rate rather than survival, accurate imaging the local environment. Good practice requires close becomes an extremely valuable research tool. collaboration between radiologists and clinicians to An important impact of the use of sophisticated define protocols. The issues to be addressed include: techniques to stage patients with cancer is the apparent The choice of a technique for different tumour types continuous improvement in cancer survival rates For a given imaging technique examination protocols reported over the last 25 years. Although this is quickly should be agreed for every tumour and easily attributable to earlier diagnosis and new and The timing of imaging in relation to treatment should more effective treatments, the effect of more accurate be agreed staging may to some extent explain these improved Follow-up studies should also be performed to an [11–13] [11] results . Feinstein et al. found that a 1977 cohort agreed protocol of patients who had undergone lung cancer treatment Finally, the impact of diagnostic imaging in cancer is survived significantly longer in each of three TNM enormously improved by working in a multidisciplinary Significance of results in cancer imaging 5 [7] Goldin J, Sayre JW. Review: a guide to clinical epidemiology team with regular clinicoradiological review of imaging for radiologists: part II statistical analysis. Clin Radiol 1996; studies in relation to management decisions. 51: 317–24. [8] Altman DG. Practical Statistics for Medical Research. London: Chapman & Hall, 1991: 404–9. [9] Boyd NF, Wolfson C, Moskowitz M. Observer variation in the interpretation of xeromammograms. 
J Natl Cancer Inst References 1982; 68: 357–63. [10] Landis JR, Koch GG. The measurement of observer [1] MacKenzie R, Dixon AK. Review: measuring the effects of agreement for categorical data. Biometrics 1977; 33: 159–74. imaging: an evaluation framework. Clin Radiol 1995; 50: [11] Feinstein AR, Sosin DM, Wells CK. The Will Rogers 513–8. phenomenon. Stage migration and new diagnostic techniques [2] Donabedian A. Evaluating the quality of medical care. as a source of misleading statistics for survival in cancer. New Millbank Memorial Fund Quarterly 1966; 44: 166–206. Engl J Med 1985; 312: 1604–8. [3] Fineberg HV, Wittenberg J, Ferrucci JT. The clinical value of [12] Bosi GJ, Geller NL, Chan EY. Stage migration and the body computed tomography over time and technologic increasing proportion of complete responders in patients with change. Am J Roentgenol 1983; 141: 1067–72. advanced germ cell tumours. Cancer Res 1988; 48: 3524–7. [4] Libson E, Polliack A, Bloom RA. Value of lymphangiography in the staging of Hodgkin lymphoma. Radiology 1994; 193: [13] Bunt AMG, Hermans J, Smit VTHBM, van de Velde CJH, 757–9. Fleuren GJ, Bruijn JA. Surgical/pathologic stage migration [5] Williams MP, Cherryman GR, Husband JES. Magnetic confounds comparisons of gastric cancer survival rates resonance imaging in suspected metastatic spinal cord between Japan and Western Countries. J Clin Oncol 1955; 13: compression. Clin Radiol 1989; 40: 286–90. 19–25. [6] Dixon AK, Southern JP, Teale A et al. Magnetic resonance The digital object identifier for this article is: 10.1102/ imaging for the head and spine: effective for the clinician or the patient? Br Med J 1991; 302: 78–82. 1470-7330.2001.004
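As a worked illustration of the ROC discussion in the main text, the sketch below shows how moving the cut-off along the five-grade scale trades sensitivity against specificity. The counts are invented purely for illustration and do not come from the article, and the name `roc_points` is likewise hypothetical.

```python
# Hypothetical numbers of readings per confidence grade
# (5 = strongly positive ... 1 = strongly negative).
# These counts are invented for illustration only.
DISEASED = {5: 30, 4: 10, 3: 5, 2: 3, 1: 2}
HEALTHY = {5: 2, 4: 4, 3: 8, 2: 16, 1: 20}


def roc_points(diseased, healthy):
    """Return (cut_off, sensitivity, 1 - specificity), strictest cut-off first.

    A reading counts as positive when its grade is >= cut_off.
    """
    n_diseased = sum(diseased.values())
    n_healthy = sum(healthy.values())
    points = []
    for cut_off in sorted(diseased, reverse=True):
        true_pos = sum(n for grade, n in diseased.items() if grade >= cut_off)
        false_pos = sum(n for grade, n in healthy.items() if grade >= cut_off)
        points.append((cut_off, true_pos / n_diseased, false_pos / n_healthy))
    return points


for cut_off, sens, one_minus_spec in roc_points(DISEASED, HEALTHY):
    print(f"grade >= {cut_off}: sensitivity = {sens:.2f}, "
          f"1 - specificity = {one_minus_spec:.2f}")
```

With these made-up counts, relaxing the cut-off from grade 5 to grade 1 raises sensitivity at every step while 1 − specificity rises with it, which is the trade-off the ROC curve displays.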
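The kappa calculation described under 'Interobserver agreement (Kappa)' can be checked directly from the cell counts of Table 1. The following is a minimal sketch in plain Python, assuming the unweighted kappa statistic; the function name `cohen_kappa` is illustrative and not taken from the source. Column totals are computed from the cells, so the 'cancer' column sums to 3, the value used in Table 2.

```python
# Cell counts from Table 1: rows are radiologist A, columns are radiologist B.
# Category order: normal, benign, suspected cancer, cancer.
TABLE_1 = [
    [21, 12, 0, 0],
    [4, 17, 1, 0],
    [3, 9, 15, 2],
    [0, 0, 0, 1],
]


def cohen_kappa(table):
    """Return (observed, expected, kappa) for a square agreement table."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    # Observed agreement: proportion of cases on the diagonal.
    observed = sum(table[i][i] for i in range(len(table))) / n
    # Chance agreement: sum of (row total x column total) / n^2 on the diagonal.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / n**2
    kappa = (observed - expected) / (1 - expected)
    return observed, expected, kappa


observed, expected, kappa = cohen_kappa(TABLE_1)
print(f"observed agreement = {observed:.2f}")  # 0.64
print(f"chance agreement   = {expected:.2f}")  # 0.31
print(f"kappa              = {kappa:.2f}")     # 0.47
```

Working from unrounded proportions gives κ ≈ 0.473, which rounds to the 0.47 quoted in the text; substituting the rounded values 0.64 and 0.31 into the formula by hand gives a slightly higher figure (about 0.48).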

Journal

Cancer Imaging, Springer Journals

Published: May 5, 2015
