Decoding tumour phenotype by noninvasive imaging using a quantitative radiomics approach

ARTICLE
Received 25 Nov 2013 | Accepted 29 Apr 2014 | Published 3 Jun 2014 | Updated 7 Aug 2014
DOI: 10.1038/ncomms5006
OPEN

Hugo J.W.L. Aerts1,2,3,4,*, Emmanuel Rios Velazquez1,2,*, Ralph T.H. Leijenaar1, Chintan Parmar1,2, Patrick Grossmann2, Sara Carvalho1, Johan Bussink5, René Monshouwer5, Benjamin Haibe-Kains6, Derek Rietveld7, Frank Hoebers1, Michelle M. Rietbergen8, C. René Leemans8, Andre Dekker1, John Quackenbush4, Robert J. Gillies9 & Philippe Lambin1

Human cancers exhibit strong phenotypic differences that can be visualized noninvasively by medical imaging. Radiomics refers to the comprehensive quantification of tumour phenotypes by applying a large number of quantitative image features. Here we present a radiomic analysis of 440 features quantifying tumour image intensity, shape and texture, which are extracted from computed tomography data of 1,019 patients with lung or head-and-neck cancer. We find that a large number of radiomic features have prognostic power in independent data sets of lung and head-and-neck cancer patients, many of which were not identified as significant before. Radiogenomics analysis reveals that a prognostic radiomic signature, capturing intratumour heterogeneity, is associated with underlying gene-expression patterns. These data suggest that radiomics identifies a general prognostic phenotype existing in both lung and head-and-neck cancer. This may have a clinical impact as imaging is routinely used in clinical practice, providing an unprecedented opportunity to improve decision-support in cancer treatment at low cost.

1 Department of Radiation Oncology (MAASTRO), Research Institute GROW, Maastricht University, 6229ET Maastricht, The Netherlands.
2 Department of Radiation Oncology, Dana-Farber Cancer Institute, Brigham and Women’s Hospital, Harvard Medical School, Boston, Massachusetts 02215-5450, USA.
3 Department of Radiology, Dana-Farber Cancer Institute, Brigham and Women’s Hospital, Harvard Medical School, Boston, Massachusetts 02215-5450, USA.
4 Department of Biostatistics & Computational Biology, Dana-Farber Cancer Institute, Boston, Massachusetts 02215-5450, USA.
5 Department of Radiation Oncology, Radboud University Medical Center Nijmegen, PB 9101, 6500HB Nijmegen, The Netherlands.
6 Princess Margaret Cancer Centre, University Health Network and Medical Biophysics Department, University of Toronto, Toronto, Ontario, Canada M5G 1L7.
7 Department of Radiation Oncology, VU University Medical Center, 1081 HZ Amsterdam, The Netherlands.
8 Department of Otolaryngology/Head and Neck Surgery, VU University Medical Center, 1081 HZ Amsterdam, The Netherlands.
9 Department of Cancer Imaging and Metabolism, H. Lee Moffitt Cancer Center and Research Institute, Tampa, Florida 33612, USA.
* These authors contributed equally to this work. Correspondence and requests for materials should be addressed to H.A. (email: Hugo_Aerts@dfci.harvard.edu).

NATURE COMMUNICATIONS | 5:4006 | DOI: 10.1038/ncomms5006 | www.nature.com/naturecommunications
© 2014 Macmillan Publishers Limited. All rights reserved.

Medical imaging is one of the major factors that have informed medical science and treatment. By assessing the characteristics of human tissue noninvasively, imaging is often used in clinical practice for oncologic diagnosis and treatment guidance1–3. A key goal of imaging is ‘personalized medicine’, where treatment is increasingly tailored on the basis of specific characteristics of the patient and their disease.

Much of the discussion of personalized medicine has focused on molecular characterization using genomic and proteomic technologies. However, as tumours are spatially and temporally heterogeneous, these techniques are limited. They require biopsies or invasive surgeries to extract and analyse what are generally small portions of tumour tissue, which do not allow for a complete characterization of the tumour. Imaging has great potential to guide therapy because it can provide a more comprehensive view of the entire tumour and it can be used on an ongoing basis to monitor the development and progression of the disease or its response to therapy. Further, imaging is noninvasive and is already often repeated during treatment in routine practice, in contrast to genomics or proteomics, which are still challenging to implement in clinical routine.

The most widely used imaging modality in oncology is X-ray computed tomography (CT), which assesses tissue density. Indeed, CT images of lung cancer tumours exhibit strong contrast reflecting differences in the intensity of a tumour on the image, intratumour texture and tumour shape (Fig. 1a). However, in clinical practice, tumour response to therapy is only measured using one- or two-dimensional descriptors of tumour size (RECIST and WHO, respectively). Although a change in tumour size can indicate response to therapy, it often does not predict overall or progression free survival6,7. Although some investigations have characterized the appearance of a tumour on CT images, these characteristics are typically described subjectively and qualitatively (‘moderate heterogeneity’, ‘highly spiculated’, ‘large necrotic core’). However, recent advances in image acquisition, standardization and image analysis allow for objective and precise quantitative imaging descriptors that could potentially be used as noninvasive prognostic or predictive biomarkers.

Figure 1 | Extracting radiomics data from images. (a) Tumours are different. Example computed tomography (CT) images of lung cancer patients. CT images with tumour contours left, three-dimensional visualizations right. Please note strong phenotypic differences that can be captured with routine CT imaging, such as intratumour heterogeneity and tumour shape. (b) Strategy for extracting radiomics data from images. (I) Experienced physicians contour the tumour areas on all CT slices. (II) Features are extracted from within the defined tumour contours on the CT images, quantifying tumour intensity, shape, texture and wavelet texture. (III) For the analysis the radiomics features are compared with clinical data and gene-expression data.

Radiomics is an emerging field that converts imaging data into a high dimensional mineable feature space using a large number of automatically extracted data-characterization algorithms8,9. We hypothesize that these imaging features capture distinct phenotypic differences of tumours and may have prognostic power and thus clinical significance across different diseases. Here we assess the clinical relevance of 440 radiomic features, many of which currently have no known clinical significance, in seven independent cohorts consisting of 1,019 lung cancer and head-and-neck cancer patients. Two data sets are used to assess the stability of the features, four data sets to assess the prognostic value of radiomic features on lung cancer patients and head-and-neck cancer patients, and one data set for association
with gene-expression profiles of lung cancer patients (Fig. 2). Our results reveal that radiomics data contain strong prognostic information in both lung and head-and-neck cancer patients, and are associated with the underlying gene-expression patterns. These results suggest that radiomics decodes a general prognostic phenotype existing in multiple cancer types. Radiomics can have a large clinical impact, as imaging is used in routine practice worldwide, providing a method that can quantify and monitor phenotypic changes during treatment.

Figure 2 | Analysis workflow. The defined radiomic features algorithms were applied to seven different data sets. Two data sets were used to calculate the feature stability ranks, RIDER test/retest and multiple delineation respectively (both orange). The Lung1 data set, containing data of 422 non-small cell lung cancer (NSCLC) patients, was used as training data set. Lung2 (n = 225), H&N1 (n = 136) and H&N2 (n = 95) were used as validation data sets. The Lung3 data set (n = 89) was used for association of the radiomic signature with gene expression profiles. For the multivariate analysis, only one fixed four-feature radiomic signature was tested in the validation data sets.

Results
Association of radiomic data with clinical data. To assess the value of radiomic features to capture phenotypic differences of tumours, we performed an integrated analysis assessing prognostic performance and association with gene expression in lung and head-and-neck cancer data sets. First, we defined 440 quantitative image features describing tumour phenotype characteristics by: (I) tumour image intensity, (II) shape, (III) texture and (IV) multiscale wavelet (Fig. 1b, Supplementary Methods). To investigate radiomic expression patterns we extracted radiomic features from the Lung1 data set, consisting of 422 non-small cell lung cancer (NSCLC) patients (Fig. 2). Unsupervised clustering revealed clusters of patients with similar radiomic expression patterns (Fig. 3). We compared the three main clusters of patients with clinical parameters (Fig. 3b), and found significant association with primary tumour stage (T-stage; P < 1 × 10^−20, χ² test) and overall stage (P = 3.4 × 10^−3, χ² test), wherein cluster I was associated with lower stages. N-stage (lymph node) and M-stage (metastasis), however, showed no correspondence with the radiomic expression patterns (P = 0.46 and P = 0.73, respectively, χ² test). Furthermore, a significant association with histology (P = 0.019, χ² test) was observed, wherein squamous cell carcinoma showed a higher presence in cluster II. Looking at the representation of the feature groups (Fig. 3c), there was no correspondence between the feature group and radiomic expression patterns.

Prognostic value of radiomic data. The possible association of radiomic features with survival was then explored by Kaplan–Meier survival analysis. For training we used the Lung1 data set, and for validation the Lung2, H&N1, H&N2 data sets (Fig. 2). The radiomic features were not normalized on any data set, and only the raw values were used that were directly computed from the DICOM images.

To ensure a completely independent validation, the median value of each feature was computed on the training Lung1 data set, and locked for use as a threshold in the validation data sets to assess the survival differences without retraining. In Supplementary Fig. 1 we show Kaplan–Meier survival curves for four representative features. Features describing heterogeneity in the primary tumour were associated with worse survival in all four data sets. Also, patients with more compact/spherical tumours had better survival probability.

Overall, the median threshold derived from Lung1 yielded a significant survival difference for 238 features (54% of total 440; G-rho test, false discovery rate (FDR) 10%) in the Lung2 validation data set. Furthermore, there was a significant survival difference for 135 features (31%) in H&N1 and for 186 features in H&N2 (42%). Sixty-six (15%) of the features derived from Lung1 were significant for survival in all three validation data sets (Lung2, H&N1 and H&N2).

Building prognostic radiomic signature. To build a prognostic radiomic signature, the analysis was divided in training and validation phases (Fig. 2). For the training phase, we first explored feature stability determined in both test-retest and inter-observer setting. Using the publicly available RIDER data set, consisting of 31 sets of test-retest CT scans that were acquired approximately 15 min apart, we tested how consistent the radiomic features were between the test and the retest scan. The multiple delineation data set, where five oncologists delineated lesions on CT scans from 21 patients, was used to test the stability of the radiomic features to variation in manual delineations.
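As an illustration of the locked median threshold described under "Prognostic value of radiomic data", the following is a minimal sketch, not the authors' code. The function names and the toy feature values are hypothetical; the only idea taken from the text is that each feature's median is computed on the training cohort and then reused unchanged to split the validation cohorts into "<= median" and "> median" groups.

```python
from statistics import median


def lock_thresholds(training_features):
    """Compute one median threshold per feature, using the training cohort only."""
    return {name: median(values) for name, values in training_features.items()}


def split_by_threshold(values, threshold):
    """Return True for patients in the '> median' group, False for '<= median'."""
    return [value > threshold for value in values]


# Toy, made-up feature values (hypothetical, not taken from the paper).
train = {"heterogeneity": [1.0, 2.0, 3.0, 4.0, 5.0]}
valid = {"heterogeneity": [0.5, 3.0, 3.5, 9.0]}

# The thresholds are fitted once on the training data and then frozen.
thresholds = lock_thresholds(train)

# The validation cohort is split with the frozen thresholds; nothing is refitted.
groups = {
    name: split_by_threshold(values, thresholds[name])
    for name, values in valid.items()
}

print(thresholds)
print(groups)
```

The two resulting groups would then be compared with a survival test of the reader's choice; that step is omitted here because it depends on the statistics package used.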
Figure 3 | Radiomics heat map. (a) Unsupervised clustering of lung cancer patients (Lung1 set, n = 422) on the y axis and radiomic feature expression (n = 440) on the x axis, revealed clusters of patients with similar radiomic expression patterns. (b) Clinical patient parameters for showing significant association of the radiomic expression patterns with primary tumour stage (T-stage; P < 1 × 10^−20, χ² test), overall stage (P = 3.4 × 10^−3, χ² test) and histology (P = 0.019, χ² test). (c) Correspondence of radiomic feature groups with the clustered expression patterns.

For each feature, we compared the stability ranks for test-retest and multiple delineation with prognosis in the Lung1 training data set. Although the stability ranks did not use any information about prognosis, in general, features with higher stability for test-retest and delineation inaccuracies showed higher prognostic performance (Supplementary Fig. 2). This is possibly due to a reduced amount of noise in the stable features and supports the use of stability ranks for feature selection.

To test the multivariate performance of a radiomic signature, we used the workflow depicted in Fig. 2 and Supplementary Fig. 3. We focused our analysis on the 100 most stable features, which were determined by averaging the stability ranks of the RIDER data set and the multiple delineation data set. To remove redundancy within the radiomic information, we selected the single best performing radiomic feature from each of the four feature groups, and combined these top four features into a multivariate Cox proportional hazards regression model for prediction of survival. The resulting radiomic signature consisted of (I) ‘Statistics Energy’ (Supplementary Methods Feature 1) describing the overall density of the tumour volume, (II) ‘Shape Compactness’ (Feature 16) quantifying how compact the tumour shape is, (III) ‘Grey Level Nonuniformity’ (Feature 48), a measure for intratumour heterogeneity, and (IV) wavelet ‘Grey Level Nonuniformity HLH’ (Feature Group 4), also describing intratumour heterogeneity after decomposing the image in mid-frequencies. The weights of each of the features in the signature were fitted on the training data set Lung1.

Prognostic validation of radiomic signature. The performance of the four-feature radiomic signature was validated in the data sets Lung2, H&N1 and H&N2 (Fig. 2) using the concordance index (CI), which is a generalization of the area under the ROC curve. The radiomic signature had good performance on the Lung2 data (CI = 0.65, P = 2.91 × 10^[…], Wilcoxon test), and a high performance in H&N1 (CI = 0.69, P = 7.99 × 10^−07, Wilcoxon test) and H&N2 (CI = 0.69, P = 3.53 × 10^−06, Wilcoxon test). In Fig. 4a the Kaplan–Meier curves are shown. Although volume had a good performance in all data sets, the radiomic signature performed significantly better, suggesting that radiomic features contain relevant, complementary information for prognosis (Supplementary Table 1). Furthermore, combining the radiomic signature with volume was significantly better than volume alone in all data sets.

Comparing the radiomic signature with the TNM staging, we see that the signature performance was better in both Lung2 and

Figure 4 | Prognostic performance and gene-expression association of the radiomics signature. (a) Radiomic signature performance. Kaplan–Meier curves demonstrating performance of the radiomic signature on the lung cancer data sets (left) and the head-and-neck cancer data sets (right). The signature was built on the Lung1 data (n = 422). The signature had a good performance in the Lung2 (CI = 0.65, P = 2.91 × 10^[…], Wilcoxon test, n = 225), and a high performance in H&N1 (CI = 0.69, P = 7.99 × 10^−07, Wilcoxon test, n = 136) and H&N2 (CI = 0.69, P = 3.53 × 10^−06, Wilcoxon test, n = 95) validation data sets. (b) Association of radiomic signature features and gene expression using gene-set enrichment analysis (GSEA) in the Lung3 data set (n = 89). Gene sets that have been significantly enriched (FDR = 20%) for at least one of the four radiomic features are indicated with an asterisk. The corresponding normalized enrichment scores (NES), GSEA’s primary statistic, for all radiomic signature features are displayed in a heat map, where light blue means low and dark blue means high NES.
[Panel a axes: survival time (days) against survival probability, with patients grouped as <= median and > median of the signature; left panel Lung1 (Maastro) and Lung2 (Radboud), right panel H&N1 (Maastro) and H&N2 (VU). Panel b rows: Statistics total energy, Shape compactness, RLGL grey level nonuniformity, Wavelet HLH RLGL grey level nonuniformity; colour key: enrichment score.]
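The concordance index (CI) reported for the signature can be illustrated with a short sketch. The paper does not describe its implementation, so the code below assumes the common pairwise definition (Harrell's C): among all pairs in which the patient with the shorter follow-up had an observed event, it counts the fraction in which that patient also received the higher risk score, with tied scores counted as one half. The function name, the tie handling and the toy data are assumptions for illustration only.

```python
def concordance_index(times, events, risk_scores):
    """Pairwise concordance (Harrell's C) for right-censored survival data.

    times       -- follow-up time per patient
    events      -- 1 if the event (death) was observed, 0 if censored
    risk_scores -- model output; a higher score means a higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable only if patient i had an observed event
            # strictly before patient j's follow-up time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs; the index is undefined")
    return concordant / comparable


# Toy, made-up data (hypothetical, not taken from the paper).
times = [5, 10, 15, 20]
events = [1, 1, 0, 1]
risk = [0.9, 0.7, 0.8, 0.1]

ci = concordance_index(times, events, risk)
print(ci)
```

Under this definition a value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which is why the index reduces to the area under the ROC curve when the outcome is binary and uncensored.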
H&N2 and comparable in H&N1 (Supplementary Table 1). Importantly, combining the radiomic signature with TNM staging showed a significant improvement in all data sets, compared with TNM staging alone. Furthermore, we assessed if the radiomics signature preserved the significant prognostic performance compared with the treatment that the patients received. We found that the signature preserved its prognostic performance for all the treatment groups (radiation or concurrent chemoradiation), for both Lung and H&N cancer patients (Supplementary Table 2), demonstrating the complementary value of radiomics for each treatment type.

Human papillomavirus (HPV) is an important determinant in head-and-neck cancer patients, especially those with oropharyngeal carcinoma for prognosis and may guide future treatment selection. We did not find a significant association between radiomic signature prediction and HPV status in a combined analysis in the H&N1 and H&N2 data set (P = 0.17, Wilcoxon test, Supplementary Table 3). However, we found that the signature preserved its prognostic performance in the HPV-negative group (CI = 0.66), consisting of the majority of patients (76%, n = 130), demonstrating the complementary value of radiomics to HPV screening.

To assess the association between the radiomic signature and the underlying biology, we compared the radiomic signature with gene-expression profiles (Lung3 data set, Fig. 2) using gene-set enrichment analysis (GSEA)1,14. We found significant associations between the signature features and gene-expression patterns (Fig. 4b). Further, the radiomic features are significantly associated with different biologic gene sets, demonstrating that radiomic features probe different biologic mechanisms. It is noteworthy that both intratumour heterogeneity features in the signature (Feature III and IV) were strongly correlated with cell cycling pathways, indicating an increased proliferation for more heterogeneous tumours.

[Figure 4b gene-set labels: EXTRACELLULAR_REGION_PART, EXTRACELLULAR_SPACE, REGULATION_OF_MULTICELLULAR_ORGANISMAL_PROCESS, DNA_DEPENDENT_DNA_REPLICATION, REGULATION_OF_IMMUNE_SYSTEM_PROCESS, TISSUE_DEVELOPMENT, LEUKOCYTE_ACTIVATION, MITOTIC_CELL_CYCLE_CHECKPOINT, PROTEIN_AMINO_ACID_LIPIDATION, LYMPHOCYTE_ACTIVATION, EXTRACELLULAR_REGION, PROTEIN_COMPLEX_BINDING, ECTODERM_DEVELOPMENT, EPIDERMIS_DEVELOPMENT, DNA_REPAIR, CHROMOSOME, MITOSIS, REGULATION_OF_DNA_METABOLIC_PROCESS, M_PHASE_OF_MITOTIC_CELL_CYCLE, CELL_CYCLE_PROCESS, CELL_CYCLE_PHASE, CELL_CYCLE_GO_0007049, MITOTIC_CELL_CYCLE, DNA_RECOMBINATION, M_PHASE.]

Discussion
Medical imaging is one of the major factors informing medical science and treatment. Its potential resides in its ability to assess the characteristics of human tissue noninvasively, and it is therefore routinely used in clinical practice for oncologic diagnosis and treatment guidance and monitoring.

However, traditionally, medical imaging has been a subjective or qualitative science. Recent advances in medical imaging acquisition and analysis allow the high-throughput extraction of informative imaging features to quantify the differences that oncologic tissues exhibit in medical imaging.

Radiomics applies advanced computational methodologies to medical imaging data to convert medical images into quantitative descriptors of oncologic tissues.

In this study, we analysed 440 radiomic features quantifying tumour phenotypic differences based on image intensity, shape and texture. In a large data set of 1,019 lung and head-and-neck cancer patients, from whose computed tomography images we extracted radiomic features, we found that a large number of radiomic features have prognostic power, many of whose prognostic implications have not been described before. Furthermore, our integrated analysis showed that features selected on the basis of their stability and reproducibility were also the most informative features, which indicates the power of integrating independent data sets for radiomic feature selection and model building.

We showed as well that a radiomic signature, capturing intratumour heterogeneity, was strongly prognostic and validated in three independent data sets of lung and head-and-neck cancer patients, and was associated with gene-expression profiles. To avoid any form of over-fitting or bias, we performed a robust statistical validation: only one radiomics signature (containing four radiomic features) was validated in data of 545 patients in independent validation data sets (Fig. 2 and Supplementary Fig. 3). The four features were selected on the basis of feature stability and prognostic performance in the discovery data set only.

The top performing feature ‘Grey Level Nonuniformity’ (Feature 48) and the most dominant features in the radiomic signature (Features III and IV) quantified intratumour heterogeneity. Indeed, it is often hypothesized that intratumour heterogeneity is exhibited on different spatial scales, for example at the radiological, macroscopic, cellular and the molecular (genetics) level. Radiological tumour phenotype characteristics may thus be useful to investigate the underlying evolving biology. It is known that multiple subclonal populations coexist within tumours, reflecting extensive intratumoral ‘somatic evolution’15,16. This heterogeneity is a clear barrier to the goal of personalized therapy based on molecular biopsy-based assays, as the identified mutations and gene expression do not always represent the entire population of tumour cells17,18. Radiomics circumvents this by assessing the comprehensive three-dimensional tumour bulk. The study presented here probes heterogeneity and demonstrates corresponding clinical importance in two cancer types. Furthermore, we demonstrated association of intratumour heterogeneity with proliferation, a general hallmark of cancer.

Overall, the lung-derived radiomic signature had better performance in head and neck compared with lung cancer. One reason could be that head-and-neck images were acquired with head immobilization, whereas lung images were acquired with free breathing and are affected by patient movement or respiration, resulting in relatively more image noise. Nonetheless, our results show that the radiomic signature could be transferred from lung to head-and-neck cancer, which suggests that the signature identifies a general prognostic tumour phenotype.

Our method provides a noninvasive (and therefore with no risk of infection or complications that accompany tissue biopsies), fast, low cost and repeatable way of investigating phenotypic information, potentially speeding up the development of personalized medicine. Furthermore, we show that the radiomic signature is significantly associated with the underlying gene-expression patterns, suggesting that inter-patient differences of gene expression are larger than intra-patient differences.

The clinical impact of our results is illustrated by the fact that they advance knowledge in the analysis and characterization of tumours in medical images, previously not done, and provide knowledge currently not used in the clinic. We showed the complementary performance of radiomic features with TNM staging for prediction of outcome, which illustrates the clinical importance of our findings as TNM is routinely used in the clinic. Currently, the TNM staging system is used for risk stratification and treatment decision making. However, the TNM staging system is primarily based on resectability of the tumour, whereas a larger number of NSCLC patients will receive primary treatment with radiotherapy either alone or combined with chemotherapy. Therefore, the TNM staging system is insufficient for risk stratification of this group of patients, in particular to make the decision between curative treatment (concomitant radiochemotherapy) or palliative treatment especially in elderly patients, a growing issue in western countries. Our results show that the radiomics signature is performing better in independent cohorts than the TNM classification. In future clinical trials, this inexpensive method can be used as well for pretreatment risk stratification (for example, high, low risk).

Furthermore, we have shown for the first time the translational capability of radiomics in two cancer types (lung and head-and-neck cancer). These results indicate that radiomics quantifies a general prognostic cancer phenotype that likely can broadly be applied to other cancer types. Similar observations have been made in gene-expression studies where signatures are prognostic across different diseases.

Analysis of image features applied to medical imaging has been a largely studied field and extensive literature exists. However, the majority of previous work describes the use of imaging features focused on the detection of small nodules in, for example, mammograms or chest CT/positron emission tomography (PET) scans, or in the differential diagnosis of malignant versus benign nodules (computer-aided diagnostics). However, applications and methodologies are distinct from our study. Quantitative imaging for personalized medicine is a recent field, with a limited number of publications12,20–27. The main clinical question of this research is not the diagnosis, but how to extract more useful information from the tumour phenotype that can be used for personalized medicine. Therefore, we assessed the association of radiomics with clinical factors, prognosis and gene-expression levels, using large amounts of features and with external and independent validation cohorts of patients. The most important message in our study is that there is prognostic and biologic information enclosed in routinely acquired CT imaging and that this was evident in two cancer types.

It is known that variability in image acquisition exists across hospitals and that this is a reality in clinical practice. However, in our analysis we used data directly generated from the scanner and the features were calculated from the RAW imaging data, without any pre-processing or normalization. As there was no correction by cohort or scanner type, this illustrates the translational potential of our results and it is a strong argument in favour of a multicentric application of radiomics. The radiomics signature had strong prognostic power in these independent data sets generated in daily clinical practice. Furthermore, we expect that with better standardization and imaging protocols, the power of radiomics will even further improve. Among others, the quantitative imaging network of the National Institute of Health, as well as the quantitative imaging biomarker alliance, investigates future directions by performing phantom studies and discussing open and standardized protocols for image acquisition with vendors2,3.

Due to the large availability of noninvasive imaging performed routinely in a large number of cancer patients and the automated feature algorithms, the results of this work could stimulate further research of image-based quantitative features. Also, we presented evidence that the defined radiomic feature-metrics are platform independent, though this should be studied further, and can potentially be applied to other image modalities, such as magnetic resonance imaging or PET. This approach can have a large impact as imaging is routinely used in clinical practice, worldwide, in all stages of diagnoses and treatment, providing an unprecedented opportunity to improve medical decision-support.

Methods
Radiomics features. We defined 440 radiomic image features that describe tumour characteristics and can be extracted in an automated way. The features can be divided into four groups: (I) tumour intensity, (II) shape, (III) texture and (IV) wavelet features. The first group quantified tumour intensity characteristics using first-order statistics, calculated from the histogram of all tumour voxel intensity values. Group 2 consists of features based on the shape of the tumour (for example, sphericity or compactness of the tumour). Group 3 consists of textural features that are able to quantify intratumour heterogeneity differences in the texture that is […]

Data sets. We applied a radiomic analysis to seven image data sets. An overview of the data sets is presented in Fig. 2. All research was carried out in accordance with Dutch law. The Institutional Review Boards of each of the participating centres approved the studies: Lung1, Lung3, H&N1 (Maastricht University Medical Center (MUMC+), Maastricht, The Netherlands), Lung2 (Radboud University Medical Center (RUMC), Nijmegen, The Netherlands) and H&N2 (VU University Medical Center (VUMC), Amsterdam, The Netherlands). The Multiple delineation data set is publicly available (downloaded from: www.cancerdata.org). This study was conducted according to national laws and guidelines and approved by the appropriate local trial committee at Maastricht University Medical Center (MUMC+), Maastricht, The Netherlands.

• The RIDER data set consists of 31 NSCLC patients with two CT scans acquired approximately 15 min apart. We used this data set to assess stability of the features for test-retest.
• The multiple delineation data set consists of 21 NSCLC patients where the tumour volume was delineated manually on CT/PET scans by five independent oncologists. We used this data set to assess stability of the features for delineation inaccuracies.
• The Lung1 data set consists of 422 NSCLC patients that were treated at MAASTRO Clinic, The Netherlands. For these patients, CT scans, manual delineations, clinical and survival data were available. We used this data set to assess the prognostic value of the radiomic features and to build a radiomic signature.
• The Lung2 data set consists of 225 NSCLC patients that were treated at Radboud University Nijmegen Medical Centre, The Netherlands. For these patients, CT scans, manual delineations, clinical and survival data were available. We used this data set to validate the prognostic value of the radiomic features and signature in an independent NSCLC cohort.
• The H&N1 data set consists of 136 head-and-neck squamous cell carcinoma (HNSCC) patients treated at MAASTRO Clinic, The Netherlands. For these patients, CT scans, manual delineations, clinical and survival data were available. We used this data set to validate the prognostic value of the radiomic features and signature in HNSCC patients.
• The H&N2 data set consists of 95 HNSCC patients treated at the VU University Medical Center Amsterdam, The Netherlands. For these patients, CT scans, manual delineations, clinical and survival data were available. We used this data set to validate the prognostic value of the radiomic features and signature in a second cohort of HNSCC patients.
• The Lung3 data set consists of 89 NSCLC patients that were treated at MAASTRO Clinic, The Netherlands. For these patients pretreatment CT scans, tumour delineations and gene expression profiles were available. We used this data set to associate imaging features with gene-expression profiles.

In the Supplementary Methods and Supplementary Tables 4–7, further descriptions of the data sets are presented. The discovery Lung1 data set, consisting of CT images for 422 NSCLC patients, and the Lung3 data set consisting of CT images and gene-expression profiling for 89 NSCLC patients, are publicly available at The Cancer Imaging Archive, Lung1: https://wiki.cancerimagingarchive.net/display/Public/NSCLC-Radiomics and Lung3: https://wiki.cancerimagingarchive.net/display/Public/NSCLC-Radiomics-Genomics, as well as on www.cancerdata.org.

Sample size. To reduce any form of over-fitting or bias in the multivariate analysis, we trained on the Lung1 data set (n = 422), selecting the features and fixing the weights, and tested only one signature (containing four features) in data of 545 patients in the independent validation data sets. There was no need for randomization as the patients originated from distinct groups. Patients were included in the analysis with the following criteria: confirmed primary tumour, patients underwent treatment with curative intent. Excluded from this analysis were patients receiving no or palliative treatment and patients with previous lung or head-and-neck cancer.

Data analysis. An overview of the analysis is shown in Fig. 2. The analysis was divided in training and validation phases. For the training phase, we first explored feature stability determined in both test-retest and inter-observer setting. The RIDER and multiple delineation data sets were used to assess stability of the features to select the most informative features for further investigation. Using the RIDER test-retest data set, we tested the stability of the radiomic features between test and retest. For each patient, we extracted the radiomic features from both scans. A stability rank was calculated for each feature, using the intraclass correlation coefficient, where a higher intraclass correlation coefficient rank corresponds to a more stable feature.
observable within the tumour volume. These features are calculated in all three- We assessed the feature stability for delineation inaccuracies using a multiple dimensional directions within the tumour volume, thereby taking the spatial location delineation data set . All radiomic features were computed for five delineations of each voxel compared with the surrounding voxels into account. Group 4 calcu- per patient, and a stability rank per feature was calculated using the Friedman test. lates the intensity and textural features from wavelet decompositions of the original The Friedman test is a nonparametric repeated measurement test for a non- image, thereby focusing the features on different frequency ranges within the tumour Gaussian population. A rank of 1 indicated the most stable feature for delineation volume (Supplementary Fig. 4). All feature algorithms were implemented in Matlab. inaccuracies and 440 the least stable feature. All 440 radiomic features were In the Supplementary Methods, the feature algorithms are described. extracted for the Lung1, Lung2, H&N1 and H&N2 data sets. The radiomic features NATURE COMMUNICATIONS | 5:4006 | DOI: 10.1038/ncomms5006 | www.nature.com/naturecommunications 7 & 2014 Macmillan Publishers Limited. All rights reserved. ARTICLE NATURE COMMUNICATIONS | DOI: 10.1038/ncomms5006 were not normalized on any data set, and only the raw values were used that were 13. Compton, C. C. et al. AJCC Cancer Staging Atlas (Springer, 2012). directly computed from the DICOM image. To explore the association of the 14. Subramanian, A. Gene set enrichment analysis: a knowledge-based approach radiomics features with survival, we used Kaplan–Meier analysis in a training and for interpreting genome-wide expression profiles. Proc. Natl Acad. Sci. USA validation phase. To ensure a completely independent validation, the median 102, 15545–15550 (2005). threshold of each feature on the Lung1 data set was computed, and then this 15. Yachida, S. et al. 
Distant metastasis occurs late during the genetic evolution of threshold was used in the validation data sets (Lung2, H&N1 and H&N2) to split pancreatic cancer. Nature 467, 1114–1117 (2010). the survival curves. We used the G-rho rank test for censored survival data to test 16. Gerlinger, M. et al. Intratumor heterogeneity and branched evolution revealed for significant differences between the two survival curves. P-values were corrected by multiregion sequencing. New Engl. J. Med. 366, 883–892 (2012). for multiple testing by controlling the FDR of 10%, the expected proportion of false 17. Gerlinger, M. & Swanton, C. How Darwinian models inform therapeutic failure discoveries amongst the rejected hypotheses. initiated by clonal heterogeneity in cancer medicine. Br. J. Cancer 103, To assess the multivariate performance of radiomic features we built a signature. 1139–1143 (2010). We selected the 100 most stable features, determined by averaging the stability ranks 18. Kern, S. E. Why your new cancer biomarker may never work: recurrent of RIDER data set and multiple delineation data set. Next, we computed the patterns and remarkable diversity in biomarker failures. Cancer Res. 72, performance in the Lung 1 data set of each of the selected 100 features using the 6097–6101 (2012). concordance index (CI) . This measure is comparable with the area under the curve 19. Starmans, M. H. W. et al. Independent and functional validation of a multi- but can also be used for Cox regression analysis. From each of the four-feature groups, tumour-type proliferation signature. Br. J. Cancer 107, 508–515 (2012). we selected the single best performing feature for prognosis in the Lung1 data set, and 20. Nair, V. S. et al. Prognostic PET 18F-FDG uptake imaging features are combined these top four features into a multivariate Cox proportional hazards associated with major oncogenomic alterations in patients with resected non- regression model for prediction of survival. 
The weights of the model were fitted on the Lung1 data set. We applied the radiomic signature to the validation data sets small cell lung cancer. Cancer Res. 72, 3725–3734 (2012). Lung2, H&N1 and H&N2, and the performance was assessed with the CI. To 21. Diehn, M. et al. Identification of noninvasive imaging surrogates for brain tumor calculate significance between two models we used a bootstrap approach, for 100 gene-expression modules. Proc. Natl Acad. Sci. USA 105, 5213–5218 (2008). times we calculated the CI of both models from 100 randomly selected samples. The 22. Segal, E. et al. Decoding global gene expression programs in liver cancer by Wilcoxon test was used to assess significance. noninvasive imaging. Nat. Biotechnol. 25, 675–680 (2007). A similar approach was used to assess if the signature had significant power, 23. Tixier, F. et al. Intratumor heterogeneity characterized by textural features on compared with random (CI¼ 0.5). We used a bootstrap approach, for 100 times we baseline 18F-FDG PET images predicts response to concomitant calculated the CI of the radiomics signature based on 100 randomly selected radiochemotherapy in esophageal cancer. J. Nucl. Med. 52, 369–378 (2011). samples with correct outcome data, as well as on 100 randomly chosen samples 24. Naqa El, I. et al. Exploring feature-based approaches in PET images for with random outcome data. The Wilcoxon test was used to assess significance, predicting cancer treatment outcomes. Pattern Recognit. 42, 1162–1171 (2009). between the two distributions. 25. Ganeshan, B., Panayiotou, E., Burnand, K., Dizdarevic, S. & Miles, K. Tumour To assess the complementary effect of the signature with clinical parameters, we heterogeneity in non-small cell lung carcinoma assessed by CT texture analysis: built a new model with the prediction of the signature as one input and the clinical a potential marker of survival. Eur. Radiol. 22, 796–802 (2011). parameter as the other input. 
The weight of the clinical parameter was fitted on the 26. Ganeshan, B., Skogen, K., Pressney, I., Coutroubis, D. & Miles, K. Tumour training data set Lung1. heterogeneity in oesophageal cancer assessed by CT texture analysis: To assess the association of the radiomic signature with gene expression, we preliminary evidence of an association with tumour metabolism, stage, and used the Lung3 data set. Gene expression of 89 patients was measured on survival. Clin. Radiol. 67, 157–164 (2012). Affymetrix chips with the custom chipset HuRSTA_2a520709 for 21,766 genes. 27. Gevaert, O. et al. Non-small cell lung cancer: identifying prognostic imaging Expression values were normalized with the RMA algorithm5 in the Affy package biomarkers by leveraging public gene expression microarray data: methods and in Bioconductor. For each of the four features in the radiomic signature, we preliminary results. Radiology 264, 387–396 (2012). calculated the Spearman rank correlation to gene expression and used the 28. Liberzon, A. et al. Molecular signatures database (MSigDB) 3.0. Bioinformatics corresponding P-values to obtain a rank of genes representing high-to-low 27, 1739–1740 (2011). agreement. Each of these gene ranks were used to perform a pre-ranked version of 14 28 GSEA on the C5 collection of MSigDB , which contains gene sets associated Acknowledgements with specific GO terms. We only regarded gene sets of size 15 to 500. Local FDRs were calculated on the normalized enrichment scores (NES), primary statistic of We acknowledge financial support from the National Institute of Health (NIH-USA U01 GSEA and only gene sets enriched with an FDR of r20% were retained. Figure 4b CA 143062-01, Radiomics of NSCLC), the CTMM framework (AIRFORCE project, grant displays gene sets that have been significantly enriched (FDR r20%) for at least 030-103), EU 6th and 7th framework program (METOXIA, EURECA, ARTFORCE), one of four radiomic features (indicated by an asterisk). 
The corresponding euroCAT (IVA Interreg—www.eurocat.info), and the Dutch Cancer Society (KWF UM absolute NES in all of the four features are given color-coded, where light blue 2011-5020, KWF UM 2009-4454). We also acknowledge financial support from the means low and dark blue means high NES. Innovative Medicines Initiative Joint Undertaking (www.imi.europa.eu), based on resources from the 7th framework program abd EFPIA companies’ kind contribution (Grant agreement No. 115151). References 1. Kurland, B. F. et al. Promise and pitfalls of quantitative imaging in oncology clinical trials. Magn. Reson. Imaging 30, 1301–1312 (2012). Author contributions 2. Buckler, A. J., Bresolin, L., Dunnick, N. R. & Sullivan, D. C. Group. A H.J.W.L.A., E.R.V., R.J.G. and P.L. conceived the project, analysed the data and wrote the collaborative enterprise for multi-stakeholder participation in the advancement paper. R.T.H.L., C.P. and S.C. collected the data and provided analysis on the data sets. of quantitative imaging. Radiology 258, 906–914 (2011). P.G., B.H.-K. and J.Q. provided bioinformatics analysis and support. J.B., D.R., R.M., 3. Buckler, A. J. et al. Quantitative imaging test approval and biomarker F.H., M.M.R., C.R.L. and A.D. provided expert knowledge, collection and availability of qualification: interrelated but distinct activities. Radiology 259, 875–884 (2011). the data. All authors edited the manuscript. 4. Lambin, P. et al. Predicting outcomes in radiation oncology—multifactorial decision support systems. Nat. Rev. Clin. Oncol. 10, 27–40 (2013). Additional information 5. Jaffe, C. C. Measures of response: RECIST, WHO, and new alternatives. J. Clin. Supplementary Information accompanies this paper at http://www.nature.com/ Oncol. 24, 3245–3251 (2006). naturecommunications 6. Burton, A. RECIST: right time to renovate? Lancet Oncol. 8, 464–465 (2007). 7. Birchard, K. R., Hoang, J. K., Herndon, J. E. & Patz, E. F. 
Early changes in Competing financial interests: The authors declare no competing financial interests. tumor size in patients treated for advanced stage nonsmall cell lung cancer do not correlate with survival. Cancer 115, 581–586 (2009). Reprints and permission information is available online at http://npg.nature.com/ 8. Lambin, P. et al. Radiomics: extracting more information from medical images reprintsandpermissions/ using advanced feature analysis. Eur. J. Cancer 48, 441–446 (2012). How to cite this article: Aerts, H. J. W. L. et al. Decoding tumour phenotype by 9. Kumar, V. et al. Radiomics: the process and the challenges. Magn. Reson. noninvasive imaging using a quantitative radiomics approach. Nat. Commun. 5:4006 Imaging 30, 1234–1248 (2012). doi: 10.1038/ncomms5006 (2014). 10. Zhao, B. et al. Evaluating variability in tumor measurements from same-day repeat CT scans of patients with non–small cell lung cancer. Radiology 252, This work is licensed under a Creative Commons Attribution- 263–272 (2009). NonCommercial-NoDerivs 3.0 Unported License. The images or other 11. van Baardwijk, A. et al. PET-CT-based auto-contouring in non-small-cell lung third party material in this article are included in the article’s Creative Commons license, cancer correlates with pathology and reduces interobserver variability in the unless indicated otherwise in the credit line; if the material is not included under the delineation of the primary tumor and involved nodal volumes. Int. J. Radiat. Creative Commons license, users will need to obtain permission from the license holder Oncol. Biol. Phys. 68, 771–778 (2007). to reproduce the material. To view a copy of this license, visit http://creativecommons. 12. Harrell, F. E. Regression Modeling Strategies: With Applications to Linear org/licenses/by-nc-nd/3.0/ Models, Logistic Regression, and Survival Analysis (Springer, 2001). 
Corrigendum: Decoding tumour phenotype by noninvasive imaging using a quantitative radiomics approach

DOI: 10.1038/ncomms5644 (Nature Communications 5:4644)

Hugo J.W.L. Aerts, Emmanuel Rios Velazquez, Ralph T.H. Leijenaar, Chintan Parmar, Patrick Grossmann, Sara Carvalho, Johan Bussink, René Monshouwer, Benjamin Haibe-Kains, Derek Rietveld, Frank Hoebers, Michelle M. Rietbergen, C. René Leemans, Andre Dekker, John Quackenbush, Robert J. Gillies & Philippe Lambin

Nature Communications 5:4006 doi: 10.1038/ncomms5006 (2014); Published 3 Jun 2014; Updated 7 Aug 2014

The original version of this Article contained a typographical error in the spelling of the author Sara Carvalho, which was incorrectly given as Sara Cavalho. This has now been corrected in both the PDF and HTML versions of the Article.


References (41)

Publisher
Springer Journals
Copyright
Copyright © 2014 by The Author(s)
Subject
Science, Humanities and Social Sciences, multidisciplinary; Science, multidisciplinary
eISSN
2041-1723
DOI
10.1038/ncomms5006

Abstract

ARTICLE

Received 25 Nov 2013 | Accepted 29 Apr 2014 | Published 3 Jun 2014 | Updated 7 Aug 2014

DOI: 10.1038/ncomms5006 OPEN

Decoding tumour phenotype by noninvasive imaging using a quantitative radiomics approach

Hugo J.W.L. Aerts (1,2,3,4,*), Emmanuel Rios Velazquez (1,2,*), Ralph T.H. Leijenaar (1), Chintan Parmar (1,2), Patrick Grossmann (2), Sara Carvalho (1), Johan Bussink (5), René Monshouwer (5), Benjamin Haibe-Kains (6), Derek Rietveld (7), Frank Hoebers (1), Michelle M. Rietbergen (8), C. René Leemans (8), Andre Dekker (1), John Quackenbush (4), Robert J. Gillies (9) & Philippe Lambin (1)

Human cancers exhibit strong phenotypic differences that can be visualized noninvasively by medical imaging. Radiomics refers to the comprehensive quantification of tumour phenotypes by applying a large number of quantitative image features. Here we present a radiomic analysis of 440 features quantifying tumour image intensity, shape and texture, which are extracted from computed tomography data of 1,019 patients with lung or head-and-neck cancer. We find that a large number of radiomic features have prognostic power in independent data sets of lung and head-and-neck cancer patients, many of which were not identified as significant before. Radiogenomics analysis reveals that a prognostic radiomic signature, capturing intratumour heterogeneity, is associated with underlying gene-expression patterns. These data suggest that radiomics identifies a general prognostic phenotype existing in both lung and head-and-neck cancer. This may have a clinical impact as imaging is routinely used in clinical practice, providing an unprecedented opportunity to improve decision-support in cancer treatment at low cost.

1 Department of Radiation Oncology (MAASTRO), Research Institute GROW, Maastricht University, 6229ET Maastricht, The Netherlands.
2 Department of Radiation Oncology, Dana-Farber Cancer Institute, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts 02215-5450, USA.
3 Department of Radiology, Dana-Farber Cancer Institute, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts 02215-5450, USA.
4 Department of Biostatistics & Computational Biology, Dana-Farber Cancer Institute, Boston, Massachusetts 02215-5450, USA.
5 Department of Radiation Oncology, Radboud University Medical Center Nijmegen, PB 9101, 6500HB Nijmegen, The Netherlands.
6 Princess Margaret Cancer Centre, University Health Network and Medical Biophysics Department, University of Toronto, Toronto, Ontario, Canada M5G 1L7.
7 Department of Radiation Oncology, VU University Medical Center, 1081 HZ Amsterdam, The Netherlands.
8 Department of Otolaryngology/Head and Neck Surgery, VU University Medical Center, 1081 HZ Amsterdam, The Netherlands.
9 Department of Cancer Imaging and Metabolism, H. Lee Moffitt Cancer Center and Research Institute, Tampa, Florida 33612, USA.
* These authors contributed equally to this work. Correspondence and requests for materials should be addressed to H.A. (email: Hugo_Aerts@dfci.harvard.edu).

Medical imaging is one of the major factors that have informed medical science and treatment. By assessing the characteristics of human tissue noninvasively, imaging is often used in clinical practice for oncologic diagnosis and treatment guidance [1–3]. A key goal of imaging is 'personalized medicine', where treatment is increasingly tailored on the basis of specific characteristics of the patient and their disease.

Much of the discussion of personalized medicine has focused on molecular characterization using genomic and proteomic technologies. However, as tumours are spatially and temporally heterogeneous, these techniques are limited. They require biopsies or invasive surgeries to extract and analyse what are generally small portions of tumour tissue, which do not allow for a complete characterization of the tumour. Imaging has great potential to guide therapy because it can provide a more comprehensive view of the entire tumour and it can be used on an ongoing basis to monitor the development and progression of the disease or its response to therapy. Further, imaging is noninvasive and is already often repeated during treatment in routine practice, in contrast to genomics or proteomics, which are still challenging to implement into clinical routine.

The most widely used imaging modality in oncology is X-ray computed tomography (CT), which assesses tissue density. Indeed, CT images of lung cancer tumours exhibit strong contrast reflecting differences in the intensity of a tumour on the image, intratumour texture and tumour shape (Fig. 1a). However, in clinical practice, tumour response to therapy is only measured using one- or two-dimensional descriptors of tumour size (RECIST and WHO, respectively). Although a change in tumour size can indicate response to therapy, it often does not predict overall or progression-free survival [6,7]. Although some investigations have characterized the appearance of a tumour on CT images, these characteristics are typically described subjectively and qualitatively ('moderate heterogeneity', 'highly spiculated', 'large necrotic core'). However, recent advances in image acquisition, standardization and image analysis allow for objective and precise quantitative imaging descriptors that could potentially be used as noninvasive prognostic or predictive biomarkers.

Radiomics is an emerging field that converts imaging data into a high dimensional mineable feature space using a large number of automatically extracted data-characterization algorithms [8,9]. We hypothesize that these imaging features capture distinct phenotypic differences of tumours and may have prognostic power and thus clinical significance across different diseases. Here we assess the clinical relevance of 440 radiomic features, many of which currently have no known clinical significance, in seven independent cohorts consisting of 1,019 lung cancer and head-and-neck cancer patients. Two data sets are used to assess the stability of the features, four data sets to assess the prognostic value of radiomic features on lung cancer patients and head-and-neck cancer patients, and one data set for association with gene-expression profiles of lung cancer patients (Fig. 2). Our results reveal that radiomics data contain strong prognostic information in both lung and head-and-neck cancer patients, and are associated with the underlying gene-expression patterns. These results suggest that radiomics decodes a general prognostic phenotype existing in multiple cancer types. Radiomics can have a large clinical impact, as imaging is used in routine practice worldwide, providing a method that can quantify and monitor phenotypic changes during treatment.

[Figure 1 image: panels (a) and (b); panel (b) steps: (I) CT imaging, (II) Feature extraction (tumour intensity, tumour shape, tumour texture, wavelet), (III) Analysis (radiomic features, gene expression, clinical data).]

Figure 1 | Extracting radiomics data from images. (a) Tumours are different. Example computed tomography (CT) images of lung cancer patients. CT images with tumour contours left, three-dimensional visualizations right. Please note strong phenotypic differences that can be captured with routine CT imaging, such as intratumour heterogeneity and tumour shape. (b) Strategy for extracting radiomics data from images. (I) Experienced physicians contour the tumour areas on all CT slices. (II) Features are extracted from within the defined tumour contours on the CT images, quantifying tumour intensity, shape, texture and wavelet texture. (III) For the analysis the radiomics features are compared with clinical data and gene-expression data.

[Figure 2 image: workflow diagram with training and validation branches over the RIDER test/retest, multiple delineation, Lung1, Lung2, H&N1, H&N2 and Lung3 data sets, leading from radiomics feature definition through stability ranks and feature selection to the four-feature radiomics signature, its prognostic validation and its association with gene expression.]

Figure 2 | Analysis workflow. The defined radiomic features algorithms were applied to seven different data sets. Two data sets were used to calculate the feature stability ranks, RIDER test/retest and multiple delineation respectively (both orange). The Lung1 data set, containing data of 422 non-small cell lung cancer (NSCLC) patients, was used as training data set. Lung2 (n = 225), H&N1 (n = 136) and H&N2 (n = 95) were used as validation data sets. The Lung3 data set (n = 89) was used for association of the radiomic signature with gene expression profiles. For the multivariate analysis, only one fixed four-feature radiomic signature was tested in the validation data sets.

Results

Association of radiomic data with clinical data. To assess the value of radiomic features to capture phenotypic differences of tumours, we performed an integrated analysis assessing prognostic performance and association with gene expression in lung and head-and-neck cancer data sets. First, we defined 440 quantitative image features describing tumour phenotype characteristics by: (I) tumour image intensity, (II) shape, (III) texture and (IV) multiscale wavelet (Fig. 1b, Supplementary Methods).

To investigate radiomic expression patterns we extracted radiomic features from the Lung1 data set, consisting of 422 non-small cell lung cancer (NSCLC) patients (Fig. 2). Unsupervised clustering revealed clusters of patients with similar radiomic expression patterns (Fig. 3). We compared the three main clusters of patients with clinical parameters (Fig. 3b), and found significant association with primary tumour stage (T-stage; P < 1 × 10^-20, χ² test) and overall stage (P = 3.4 × 10^-3, χ² test), wherein cluster I was associated with lower stages. N-stage (lymph node) and M-stage (metastasis), however, showed no correspondence with the radiomic expression patterns (P = 0.46 and P = 0.73, respectively, χ² test). Furthermore, a significant association with histology (P = 0.019, χ² test) was observed, wherein squamous cell carcinoma showed a higher presence in cluster II. Looking at the representation of the feature groups (Fig. 3c), there was no correspondence between the feature group and radiomic expression patterns.

Prognostic value of radiomic data. The possible association of radiomic features with survival was then explored by Kaplan–Meier survival analysis. For training we used the Lung1 data set, and for validation the Lung2, H&N1, H&N2 data sets (Fig. 2). The radiomic features were not normalized on any data set, and only the raw values were used that were directly computed from the DICOM images.

To ensure a completely independent validation, the median value of each feature was computed on the training Lung1 data set, and locked for use as a threshold in the validation data sets to assess the survival differences without retraining. In Supplementary Fig. 1 we show Kaplan–Meier survival curves for four representative features. Features describing heterogeneity in the primary tumour were associated with worse survival in all four data sets. Also, patients with more compact/spherical tumours had better survival probability.

Overall, the median threshold derived from Lung1 yielded a significant survival difference for 238 features (54% of total 440; G-rho test, false discovery rate (FDR) 10%) in the Lung2 validation data set. Furthermore, there was a significant survival difference for 135 features (31%) in H&N1 and for 186 features in H&N2 (42%). Sixty-six (15%) of the features derived from Lung1 were significant for survival in all three validation data sets (Lung2, H&N1 and H&N2).

Building prognostic radiomic signature. To build a prognostic radiomic signature, the analysis was divided in training and validation phases (Fig. 2). For the training phase, we first explored feature stability determined in both test-retest and inter-observer setting. Using the publicly available RIDER data set, consisting of 31 sets of test-retest CT scans that were acquired approximately 15 min apart, we tested how consistent the radiomic features were between the test and the retest scan. The multiple delineation data set, where five oncologists delineated lesions on CT scans from 21 patients, was used to test the stability of the radiomic features to variation in manual delineations.
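The test-retest stability ranking described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes a one-way random-effects intraclass correlation coefficient, ICC(1,1), a variant the text does not specify; it assumes that rank 1 denotes the most stable feature; and the names `icc_oneway`, `stability_ranks` and `most_stable`, like the toy numbers, are hypothetical.

```python
from statistics import mean


def icc_oneway(ratings):
    """One-way random-effects ICC(1,1) (assumed variant, not stated in the paper).

    `ratings` holds one tuple per patient, each with k repeated
    measurements of the same feature (for test-retest, k = 2).
    """
    n = len(ratings)
    k = len(ratings[0])
    grand = mean(v for row in ratings for v in row)
    row_means = [mean(row) for row in ratings]
    # Between-patient and within-patient mean squares.
    msb = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    msw = sum((v - m) ** 2
              for row, m in zip(ratings, row_means)
              for v in row) / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)


def stability_ranks(features):
    """Map feature name -> rank, where rank 1 is the highest ICC (assumed convention)."""
    iccs = {name: icc_oneway(rows) for name, rows in features.items()}
    ordered = sorted(iccs, key=iccs.get, reverse=True)
    return {name: rank for rank, name in enumerate(ordered, start=1)}


def most_stable(rank_a, rank_b, n_keep):
    """Average two rank tables and keep the n_keep features with the best mean rank."""
    avg = {name: (rank_a[name] + rank_b[name]) / 2 for name in rank_a}
    return sorted(avg, key=avg.get)[:n_keep]


# Toy test-retest data: two scans per patient for two hypothetical features.
toy = {
    "stable": [(1.0, 1.1), (2.0, 2.1), (3.0, 2.9), (4.0, 4.1)],
    "noisy": [(1.0, 3.0), (2.0, 0.5), (3.0, 4.5), (4.0, 1.0)],
}
ranks = stability_ranks(toy)
```

In the study, one rank table came from the RIDER test-retest scans and a second from the multiple delineation data set, where a Friedman test was used rather than the ICC; `most_stable` only illustrates the averaging step that selects the top features from two such tables.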
[Figure 3 image: heat map of patients against radiomics features, colour scale Z-score from –2 to 2; feature groups Intensity, Shape, Texture and Wavelet (HHH, HHL, HLH, HLL, LHH, LHL, LLH, LLL); clinical annotation rows for clusters (I, II, III), T-stage (1–4), N-stage (0–3), M-stage, overall stage (I, II, IIIA, IIIB, IV) and histology (adenocarcinoma, squamous cell carcinoma, large cell carcinoma, not otherwise specified (nos), NA).]

Figure 3 | Radiomics heat map. (a) Unsupervised clustering of lung cancer patients (Lung1 set, n = 422) on the y axis and radiomic feature expression (n = 440) on the x axis, revealed clusters of patients with similar radiomic expression patterns. (b) Clinical patient parameters, showing significant association of the radiomic expression patterns with primary tumour stage (T-stage; P < 1 × 10^-20, χ² test), overall stage (P = 3.4 × 10^-3, χ² test) and histology (P = 0.019, χ² test). (c) Correspondence of radiomic feature groups with the clustered expression patterns.

For each feature, we compared the stability ranks for test-retest and multiple delineation with prognosis in the Lung1 training data set. Although the stability ranks did not use any information about prognosis, in general, features with higher stability for test-retest and delineation inaccuracies showed higher prognostic performance (Supplementary Fig. 2). This is possibly due to reduced amount of noise in the stable features and supports the use of stability ranks for feature selection.

To test the multivariate performance of a radiomic signature, we used the workflow depicted in Fig. 2 and Supplementary Fig. 3. We focused our analysis on the 100 most stable features, which were determined by averaging the stability ranks of RIDER data set and multiple delineation data set. To remove redundancy within the radiomic information, we selected the single best performing radiomic feature from each of the four feature groups, and combined these top four features into a multivariate Cox proportional hazards regression model for prediction of survival. The resulting radiomic signature consisted of (I) 'Statistics Energy' (Supplementary Methods Feature 1) describing the overall density of the tumour volume, (II) 'Shape Compactness' (Feature 16) quantifying how compact the tumour shape is, (III) 'Grey Level Nonuniformity' (Feature 48) a measure for intratumour heterogeneity and (IV) wavelet 'Grey Level Nonuniformity HLH' (Feature Group 4), also describing intratumour heterogeneity after decomposing the image in mid-frequencies. The weights of each of the features in the signature were fitted on the training data set Lung1.

Prognostic validation of radiomic signature. The performance of the four-feature radiomic signature was validated in the data sets Lung2, H&N1 and H&N2 (Fig. 2) using the concordance index (CI), which is a generalization of the area under the ROC curve. The radiomic signature had good performance on the Lung2 data (CI = 0.65, P = 2.91 × 10^[…], Wilcoxon test), and a high performance in H&N1 (CI = 0.69, P = 7.99 × 10^-07, Wilcoxon test) and H&N2 (CI = 0.69, P = 3.53 × 10^-06, Wilcoxon test). In Fig. 4a the Kaplan–Meier curves are shown.

Although volume had a good performance in all data sets, the radiomic signature performed significantly better, suggesting that radiomic features contain relevant, complementary information for prognosis (Supplementary Table 1). Furthermore, combining the radiomic signature with volume was significantly better than volume alone in all data sets.

Comparing the radiomic signature with the TNM staging, we see that the signature performance was better in both Lung2 and
Radiomics features Wavelet NATURE COMMUNICATIONS | DOI: 10.1038/ncomms5006 ARTICLE a Kaplan−Meier radiomics signature Kaplan−Meier radiomics signature 1.0 1.0 <= Median <= Median > Median > Median 0.8 0.8 0.6 0.6 0.4 0.4 0.2 0.2 Lung1: Maastro H&N1: Maastro Lung2: RadBoud H&N2: VU 0.0 0 200 400 600 800 1,000 1,200 1,400 0 500 1,000 1,500 2,000 Survival time (days) Survival time (days) ******* ** ** Statistics total energy ******* ** Shape compactness ****** RLGL grey level nonuniformity Wavelet HLH ************** RLGL grey level nonuniformity Colour key 1 1.5 Enrichment score Figure 4 | Prognostic performance and gene-expression association of the radiomics signature. (a) Radiomic signature performance. Kaplan–Meier curves demonstrating performance of the radiomic signature on the lung cancer data sets (left) and the head-and-neck cancer data sets (right). The signature was built on the Lung1 data (n¼ 422). The signature had a good performance in the Lung2 (CI¼ 0.65, P¼ 2.91 10 , Wilcoxon test, n¼ 225), 07  06 and a high performance in H&N1 (CI¼ 0.69, P¼ 7.99 10 , Wilcoxon test, n¼ 136) and H&N2 (CI¼ 0.69, P¼ 3.53 10 , Wilcoxon test, n¼ 95) validation data sets. (b) Association of radiomic signature features and gene expression using gene-set enrichment analysis (GSEA) in the Lung3 data set (n¼ 89). Gene sets that have been significantly enriched (FDR¼ 20%) for at least one of the four radiomic features are indicated with an asterisk. The corresponding normalized enrichment scores (NES), GSEA’s primary statistic, for all radiomic signature features is displayed in a heat map, where light blue means low and dark blue means high NES. NATURE COMMUNICATIONS | 5:4006 | DOI: 10.1038/ncomms5006 | www.nature.com/naturecommunications 5 & 2014 Macmillan Publishers Limited. All rights reserved. 
Survival probability EXTRACELLULAR_REGION_PART EXTRACELLULAR_SPACE REGULATION_OF_MULTICELLULAR_ORGANISMAL_PROCESS DNA_DEPENDENT_DNA_REPLICATION REGULATION_OF_IMMUNE_SYSTEM_PROCESS TISSUE_DEVELOPMENT LEUKOCYTE_ACTIVATION MITOTIC_CELL_CYCLE_CHECKPOINT PROTEIN_AMINO_ACID_LIPIDATION LYMPHOCYTE_ACTIVATION EXTRACELLULAR_REGION PROTEIN_COMPLEX_BINDING ECTODERM_DEVELOPMENT EPIDERMIS_DEVELOPMENT Survival probability DNA_REPAIR CHROMOSOME MITOSIS REGULATION_OF_DNA_METABOLIC_PROCESS M_PHASE_OF_MITOTIC_CELL_CYCLE CELL_CYCLE_PROCESS CELL_CYCLE_PHASE CELL_CYCLE_GO_0007049 MITOTIC_CELL_CYCLE DNA_RECOMBINATION M_PHASE ARTICLE NATURE COMMUNICATIONS | DOI: 10.1038/ncomms5006 H&N2 and comparable in H&N1 (Supplementary Table 1). avoid any form of over-fitting or bias, we performed a robust Importantly, combining the radiomic signature with TNM statistical validation: only one radiomics signature (containing four staging showed a significant improvement in all data sets, radiomic features) was validated in data of 545 patients in compared with TNM staging alone. Furthermore, we assessed if independent validation data sets (Fig. 2 and Supplementary Fig. 3). the radiomics signature preserved the significant prognostic The four features were selected on the basis of feature stability and performance compared with the treatment that the patients prognostic performance in the discovery data set only. received. We found that the signature preserved its prognostic The top performing feature ‘Grey Level Nonuniformity’ (Feature performance for all the treatment groups (radiation or concurrent 48) and the most dominant features in the radiomic signature chemoradiation), for both Lung and H&N cancer patients (Features III and IV), quantified intratumour heterogeneity. Indeed, (Supplementary Table 2), demonstrating the complementary it is often hypothesized that intratumour heterogeneity is exhibited value of radiomics for each treatment type. 
on different spatial scales, for example at the radiological, Human papillomavirus (HPV) is an important determinant in macroscopic, cellular and the molecular (genetics) level. Radi- head-and-neck cancer patients, especially those with orophar- ological tumour phenotype characteristics may thus be useful to yngeal carcinoma for prognosis and may guide future treatment investigate the underlying evolving biology. It is known that selection. We did not find a significant association between multiple subclonal populations coexist within tumours, reflecting 15,16 radiomic signature prediction and HPV status in a combined extensive intratumoral ‘somatic evolution’ . This heterogeneity is analysis in the H&N1 and H&N2 data set (P¼ 0.17, Wilcoxon a clear barrier to the goal of personalized therapy based on test, Supplementary Table 3). However, we found that the molecular biopsy-based assays, as the identified mutations and signature preserved its prognostic performance in the HPV- gene-expression does not always represent the entire population of 17,18 negative group (CI¼ 0.66), consisting of the majority of patients tumour cells . Radiomics circumvents this by assessing the (76%, n¼ 130), demonstrating the complementary value of comprehensive three-dimensional tumour bulk. The study radiomics to HPV screening. presented here probes heterogeneity and demonstrates To assess the association between the radiomic signature and corresponding clinical importance in two cancer types. the underlying biology, we compared the radiomic signature Furthermore, we demonstrated association of intratumour with gene-expression profiles (Lung3 data set, Fig. 2) using gene- heterogeneity with proliferation, a general hallmark of cancer. 1,14 set enrichment analysis (GSEA) . We found significant Overall, the lung-derived radiomic signature had better associations between the signature features and gene-expression performance in head and neck compared with lung cancer. One patterns (Fig. 
4b). Further, the radiomic features are significantly reason could be that head-and-neck images were acquired with associated with different biologic gene sets, demonstrating that head immobilization, whereas lung images were acquired with radiomic features probe different biologic mechanisms. It is free breathing and are affected by patient movement or noteworthy that both intratumour heterogeneity features in the respiration, resulting in relatively more image noise. Nonetheless, signature (Feature III and IV) were strongly correlated with cell our results show that the radiomic signature could be transferred cycling pathways, indicating an increased proliferation for more from lung to head-and-neck cancer, which suggests that the heterogeneous tumours. signature identifies a general prognostic tumour phenotype. Our method provides a noninvasive (and therefore with no risk of infection or complications that accompany tissue biopsies), Discussion fast, low cost and repeatable way of investigating phenotypic Medical imaging is one of the major factors informing medical information, potentially speeding up the development of science and treatment. Its potential resides in its ability to assess personalized medicine. Furthermore, we show that the radiomic the characteristics of human tissue noninvasively, and therefore is signature is significantly associated with the underlying gene- routinely used in clinical practice for oncologic diagnosis and expression patterns, suggesting that inter-patient differences of treatment guidance and monitoring. gene expression are larger than intra-patient differences. However, traditionally, medical imaging has been a subjective The clinical impact of our results are illustrated by the fact that or qualitative science. 
Recent advances in medical imaging it advances knowledge in the analysis and characterization of acquisition and analysis allow the high-throughput extraction of tumours in medical images, previously not done, and provides informative imaging features to quantify the differences that knowledge currently not used in the clinic. We showed the oncologic tissues exhibit in medical imaging. complementary performance of radiomic features with TNM Radiomics applies advanced computational methodologies to staging for prediction of outcome, which illustrates the clinical medical imaging data to convert medical images into quantitative importance of our findings as TNM is routinely used in the clinic. descriptors of oncologic tissues . Currently, the TNM staging system is used for risk stratification In this study, we analysed 440 radiomic features quantifying and treatment decision making. However, the TNM staging tumour phenotypic differences based on its image intensity, shape system is primarily based on resectability of the tumour, whereas and texture. In a large data set of 1,019 lung and head-and-neck a larger number of NSCLC patients will receive primary cancer patients, of which we extracted radiomic features on treatment with radiotherapy either alone or combined with computed tomography images, we found that a large number of chemotherapy. Therefore, the TNM staging system is insufficient radiomic features have prognostic power, many of which their for risk stratification of this group of patients, in particular to prognostic implication have not been described before. Further- make the decision between curative treatment (concomitant more, our integrated analysis showed that features selected on the radiochemotherapy) or palliative treatment especially in elderly basis of their stability and reproducibility were also the most patients, a growing issue in western countries. 
Our results show informative features, which indicates the power of integrating that the radiomics signature is performing better in independent independent data sets for radiomic feature selection and model cohorts than the TNM classification. In future clinical trials, this building. inexpensive method can be used as well for pretreatment risk We showed as well that a radiomic signature, capturing stratification (for example, high, low risk). intratumour heterogeneity, was strongly prognostic and validated Furthermore, we have shown for the first time the translational in three independent data sets of lung and head-and-neck cancer capability of radiomics in two cancer types (lung and head-and- patients, and was associated with gene-expression profiles. To neck cancer). These results indicate that radiomics quantifies a 6 NATURE COMMUNICATIONS | 5:4006 | DOI: 10.1038/ncomms5006 | www.nature.com/naturecommunications & 2014 Macmillan Publishers Limited. All rights reserved. NATURE COMMUNICATIONS | DOI: 10.1038/ncomms5006 ARTICLE general prognostic cancer phenotype that likely can broadly be Data sets. We applied a radiomic analysis to seven image data sets. An overview of the data sets is presented in Fig. 2. All research was carried out in accordance applied to other cancer types. Similar observations have been with Dutch law. The Institutional Review Boards of each of the participating made in gene-expression studies where signatures are prognostic centres approved the studies: Lung1, Lung3, H&N1 (Maastricht University Medical across different diseases . Center (MUMCþ ), Maastricht, The Netherlands), Lung2 (Radboud University Analysis of image features applied to medical imaging has been Medical Center (RUMC), Nijmegen, The Netherlands) and H&N2 (VU University Medical Center (VUMC), Amsterdam, The Netherlands). The Multiple delineation a largely studied field and extensive literature exists. 
However, the data set is publicly available (downloaded from: www.cancerdata.org). This study majority of previous work describes the use of imaging features was conducted according to national laws and guidelines and approved by the focused in the detection of small nodules in, for example, appropriate local trial committee at Maastricht University Medical Center mammograms or chest CT/positron emission tomography (PET) (MUMC1), Maastricht, The Netherlands. scans, or in the differential diagnosis of malignant versus benign nodules (computed-aided diagnostics). However, applications  The RIDER data set consists of 31 NSCLC patients with two CT scans acquired approximately 15 min apart . We used this data set to assess stability of the and methodologies are distinct from our study. Quantitative features for test-retest. imaging for personalized medicine is a recent field, with a limited The multiple delineation data set consists of 21 NSCLC patients where the 12,20–27 number of publications . The main clinical question of this tumour volume was delineated manually on CT/PET scans by five independent research is not the diagnosis, but how to extract more useful oncologists . We used this data set to assess stability of the features for delineation inaccuracies. information from the tumour phenotype that can be used for The Lung1 data set consists of 422 NSCLC patients that were treated at personalized medicine. Therefore, we assessed the association of MAASTRO Clinic, The Netherlands. For these patients, CT scans, manual radiomics with clinical factors, prognosis and gene-expression delineations, clinical and survival data were available. We used this data set to levels, using large amounts of features and with external and assess the prognostic value of the radiomic features and to build a radiomic signature. independent validation cohorts of patients. 
The most important The Lung2 data set consists of 225 NSCLC patients that were treated at Radboud message in our study is that there is prognostic and biologic University Nijmegen Medical Centre, The Netherlands. For these patients, CT information enclosed in routinely acquired CT imaging and was scans, manual delineations, clinical and survival data were available. We used evident in two cancer types. this data set to validate the prognostic value of the radiomic features and signature in an independent NSCLC cohort. It is known that variability in image acquisition exists across The H&N1 data set consists of 136 head-and-neck squamous cell carcinoma hospitals and that this is a reality in clinical practice. However, in (HNSCC) patients treated at MAASTRO Clinic, The Netherlands. For these our analysis we used data directly generated from the scanner and patients, CT scans, manual delineations, clinical and survival data were available. the features were calculated from the RAW imaging data, without We used this data set to validate the prognostic value of the radiomic features and signature in HNSCC patients. any pre-processing or normalization. As there was no correction The H&N2 data set consists of 95 HNSCC patients treated at the VU University by cohort or scanner type, this illustrates the translational Medical Center Amsterdam, The Netherlands. For these patients, CT scans, potential of our results and it is a strong argument in favour of a manual delineations, clinical and survival data were available. We used this data multicentric application of radiomics. The radiomics signature set to validate the prognostic value of the radiomic features and signature in a had strong prognostic power in these independent data sets second cohort of HNSCC patients. The Lung3 data set consists of 89 NSCLC patients that were treated at generated in daily clinical practice. Furthermore, we expect that MAASTRO Clinic, The Netherlands. 
For these patients pretreatment CT scans, with better standardization and imaging protocols, the power of tumour delineations and gene expression profiles were available. We used this radiomics will even further improve. Among others, the data set to associate imaging features with gene-expression profiles. quantitative imaging network of the National Institute of Health, In the Supplementary Methods and Supplementary Tables 4–7, further as well as the quantitative imaging biomarker alliance, investi- descriptions of the data sets are presented. The discovery Lung1 data set, consisting gates future directions by performing phantom studies and of CT images for 422 NSCLC patients, and the Lung3 data set consisting of CT discussing with vendor’s open and standardized protocols for images and gene-expression profiling for 89 NSCLC patients, are publicly available 2,3 image acquisition . at The Cancer Imaging Archive, Lung1: https://wiki.cancerimagingarchive.net/ display/Public/NSCLC-Radiomics and Lung3: https://wiki.cancerimagingarchive. Due to the large availability of noninvasive imaging performed net/display/Public/NSCLC-Radiomics-Genomics, as well as on www.cancerdata.org. routinely in a large number of cancer patients and the automated feature algorithms, the results of this work could stimulate further research of image-based quantitative features. Also, we presented Sample size. To reduce any form of over-fitting or bias in the multivariate ana- lysis, we trained on data the Lung1 data sets (n¼ 422), selecting the features and evidence that the defined radiomic feature-metrics are platform fixing the weights, and tested only one signature (containing four features) in data independent, though this should be studied further, and can of 545 patients in the independent validation data sets. There was no need for potentially be applied to other image modalities, such as magnetic randomization as the patients originated from distinct groups. 
Patients were resonance imaging or PET. This approach can have a large included in the analysis with the following criteria: confirmed primary tumour, impact as imaging is routinely used in clinical practice, world- patients underwent treatment with curative intent. Excluded from this analysis were patients receiving no or palliative treatment and patients with previous lung wide, in all stages of diagnoses and treatment, providing an or head-and-neck cancer. unprecedented opportunity to improve medical decision-support. Data analysis. An overview of the analysis is shown in Fig. 2. The analysis was Methods divided in training and validation phases. For the training phase, we first explored Radiomics features. We defined 440 radiomic image features that describe tumour feature stability determined in both test-retest and inter-observer setting. The characteristics and can be extracted in an automated way. The features can be RIDER and multiple delineation data sets were used to assess stability of the divided into four groups: (I) tumour intensity, (II) shape, (III) texture and (IV) features to select the most informative features for further investigation. Using the wavelet features. The first group quantified tumour intensity characteristics using RIDER test-retest data set, we tested the stability of the radiomic features between first-order statistics, calculated from the histogram of all tumour voxel intensity test and retest . For each patient, we extracted the radiomic features from both values. Group 2 consists of features based on the shape of the tumour (for example, scans. A stability rank was calculated for each feature, using the intraclass sphericity or compactness of the tumour). Group 3 consists of textual features that correlation coefficient, where a higher intraclass correlation coefficient rank are able to quantify intratumour heterogeneity differences in the texture that is corresponds to a more stable feature. 
observable within the tumour volume. These features are calculated in all three- We assessed the feature stability for delineation inaccuracies using a multiple dimensional directions within the tumour volume, thereby taking the spatial location delineation data set . All radiomic features were computed for five delineations of each voxel compared with the surrounding voxels into account. Group 4 calcu- per patient, and a stability rank per feature was calculated using the Friedman test. lates the intensity and textural features from wavelet decompositions of the original The Friedman test is a nonparametric repeated measurement test for a non- image, thereby focusing the features on different frequency ranges within the tumour Gaussian population. A rank of 1 indicated the most stable feature for delineation volume (Supplementary Fig. 4). All feature algorithms were implemented in Matlab. inaccuracies and 440 the least stable feature. All 440 radiomic features were In the Supplementary Methods, the feature algorithms are described. extracted for the Lung1, Lung2, H&N1 and H&N2 data sets. The radiomic features NATURE COMMUNICATIONS | 5:4006 | DOI: 10.1038/ncomms5006 | www.nature.com/naturecommunications 7 & 2014 Macmillan Publishers Limited. All rights reserved. ARTICLE NATURE COMMUNICATIONS | DOI: 10.1038/ncomms5006 were not normalized on any data set, and only the raw values were used that were 13. Compton, C. C. et al. AJCC Cancer Staging Atlas (Springer, 2012). directly computed from the DICOM image. To explore the association of the 14. Subramanian, A. Gene set enrichment analysis: a knowledge-based approach radiomics features with survival, we used Kaplan–Meier analysis in a training and for interpreting genome-wide expression profiles. Proc. Natl Acad. Sci. USA validation phase. To ensure a completely independent validation, the median 102, 15545–15550 (2005). threshold of each feature on the Lung1 data set was computed, and then this 15. Yachida, S. et al. 
Distant metastasis occurs late during the genetic evolution of threshold was used in the validation data sets (Lung2, H&N1 and H&N2) to split pancreatic cancer. Nature 467, 1114–1117 (2010). the survival curves. We used the G-rho rank test for censored survival data to test 16. Gerlinger, M. et al. Intratumor heterogeneity and branched evolution revealed for significant differences between the two survival curves. P-values were corrected by multiregion sequencing. New Engl. J. Med. 366, 883–892 (2012). for multiple testing by controlling the FDR of 10%, the expected proportion of false 17. Gerlinger, M. & Swanton, C. How Darwinian models inform therapeutic failure discoveries amongst the rejected hypotheses. initiated by clonal heterogeneity in cancer medicine. Br. J. Cancer 103, To assess the multivariate performance of radiomic features we built a signature. 1139–1143 (2010). We selected the 100 most stable features, determined by averaging the stability ranks 18. Kern, S. E. Why your new cancer biomarker may never work: recurrent of RIDER data set and multiple delineation data set. Next, we computed the patterns and remarkable diversity in biomarker failures. Cancer Res. 72, performance in the Lung 1 data set of each of the selected 100 features using the 6097–6101 (2012). concordance index (CI) . This measure is comparable with the area under the curve 19. Starmans, M. H. W. et al. Independent and functional validation of a multi- but can also be used for Cox regression analysis. From each of the four-feature groups, tumour-type proliferation signature. Br. J. Cancer 107, 508–515 (2012). we selected the single best performing feature for prognosis in the Lung1 data set, and 20. Nair, V. S. et al. Prognostic PET 18F-FDG uptake imaging features are combined these top four features into a multivariate Cox proportional hazards associated with major oncogenomic alterations in patients with resected non- regression model for prediction of survival. 
The weights of the model were fitted on the Lung1 data set. We applied the radiomic signature to the validation data sets small cell lung cancer. Cancer Res. 72, 3725–3734 (2012). Lung2, H&N1 and H&N2, and the performance was assessed with the CI. To 21. Diehn, M. et al. Identification of noninvasive imaging surrogates for brain tumor calculate significance between two models we used a bootstrap approach, for 100 gene-expression modules. Proc. Natl Acad. Sci. USA 105, 5213–5218 (2008). times we calculated the CI of both models from 100 randomly selected samples. The 22. Segal, E. et al. Decoding global gene expression programs in liver cancer by Wilcoxon test was used to assess significance. noninvasive imaging. Nat. Biotechnol. 25, 675–680 (2007). A similar approach was used to assess if the signature had significant power, 23. Tixier, F. et al. Intratumor heterogeneity characterized by textural features on compared with random (CI¼ 0.5). We used a bootstrap approach, for 100 times we baseline 18F-FDG PET images predicts response to concomitant calculated the CI of the radiomics signature based on 100 randomly selected radiochemotherapy in esophageal cancer. J. Nucl. Med. 52, 369–378 (2011). samples with correct outcome data, as well as on 100 randomly chosen samples 24. Naqa El, I. et al. Exploring feature-based approaches in PET images for with random outcome data. The Wilcoxon test was used to assess significance, predicting cancer treatment outcomes. Pattern Recognit. 42, 1162–1171 (2009). between the two distributions. 25. Ganeshan, B., Panayiotou, E., Burnand, K., Dizdarevic, S. & Miles, K. Tumour To assess the complementary effect of the signature with clinical parameters, we heterogeneity in non-small cell lung carcinoma assessed by CT texture analysis: built a new model with the prediction of the signature as one input and the clinical a potential marker of survival. Eur. Radiol. 22, 796–802 (2011). parameter as the other input. 
The weight of the clinical parameter was fitted on the 26. Ganeshan, B., Skogen, K., Pressney, I., Coutroubis, D. & Miles, K. Tumour training data set Lung1. heterogeneity in oesophageal cancer assessed by CT texture analysis: To assess the association of the radiomic signature with gene expression, we preliminary evidence of an association with tumour metabolism, stage, and used the Lung3 data set. Gene expression of 89 patients was measured on survival. Clin. Radiol. 67, 157–164 (2012). Affymetrix chips with the custom chipset HuRSTA_2a520709 for 21,766 genes. 27. Gevaert, O. et al. Non-small cell lung cancer: identifying prognostic imaging Expression values were normalized with the RMA algorithm5 in the Affy package biomarkers by leveraging public gene expression microarray data: methods and in Bioconductor. For each of the four features in the radiomic signature, we preliminary results. Radiology 264, 387–396 (2012). calculated the Spearman rank correlation to gene expression and used the 28. Liberzon, A. et al. Molecular signatures database (MSigDB) 3.0. Bioinformatics corresponding P-values to obtain a rank of genes representing high-to-low 27, 1739–1740 (2011). agreement. Each of these gene ranks were used to perform a pre-ranked version of 14 28 GSEA on the C5 collection of MSigDB , which contains gene sets associated Acknowledgements with specific GO terms. We only regarded gene sets of size 15 to 500. Local FDRs were calculated on the normalized enrichment scores (NES), primary statistic of We acknowledge financial support from the National Institute of Health (NIH-USA U01 GSEA and only gene sets enriched with an FDR of r20% were retained. Figure 4b CA 143062-01, Radiomics of NSCLC), the CTMM framework (AIRFORCE project, grant displays gene sets that have been significantly enriched (FDR r20%) for at least 030-103), EU 6th and 7th framework program (METOXIA, EURECA, ARTFORCE), one of four radiomic features (indicated by an asterisk). 
The corresponding euroCAT (IVA Interreg—www.eurocat.info), and the Dutch Cancer Society (KWF UM absolute NES in all of the four features are given color-coded, where light blue 2011-5020, KWF UM 2009-4454). We also acknowledge financial support from the means low and dark blue means high NES. Innovative Medicines Initiative Joint Undertaking (www.imi.europa.eu), based on resources from the 7th framework program abd EFPIA companies’ kind contribution (Grant agreement No. 115151). References 1. Kurland, B. F. et al. Promise and pitfalls of quantitative imaging in oncology clinical trials. Magn. Reson. Imaging 30, 1301–1312 (2012). Author contributions 2. Buckler, A. J., Bresolin, L., Dunnick, N. R. & Sullivan, D. C. Group. A H.J.W.L.A., E.R.V., R.J.G. and P.L. conceived the project, analysed the data and wrote the collaborative enterprise for multi-stakeholder participation in the advancement paper. R.T.H.L., C.P. and S.C. collected the data and provided analysis on the data sets. of quantitative imaging. Radiology 258, 906–914 (2011). P.G., B.H.-K. and J.Q. provided bioinformatics analysis and support. J.B., D.R., R.M., 3. Buckler, A. J. et al. Quantitative imaging test approval and biomarker F.H., M.M.R., C.R.L. and A.D. provided expert knowledge, collection and availability of qualification: interrelated but distinct activities. Radiology 259, 875–884 (2011). the data. All authors edited the manuscript. 4. Lambin, P. et al. Predicting outcomes in radiation oncology—multifactorial decision support systems. Nat. Rev. Clin. Oncol. 10, 27–40 (2013). Additional information 5. Jaffe, C. C. Measures of response: RECIST, WHO, and new alternatives. J. Clin. Supplementary Information accompanies this paper at http://www.nature.com/ Oncol. 24, 3245–3251 (2006). naturecommunications 6. Burton, A. RECIST: right time to renovate? Lancet Oncol. 8, 464–465 (2007). 7. Birchard, K. R., Hoang, J. K., Herndon, J. E. & Patz, E. F. 
Early changes in Competing financial interests: The authors declare no competing financial interests. tumor size in patients treated for advanced stage nonsmall cell lung cancer do not correlate with survival. Cancer 115, 581–586 (2009). Reprints and permission information is available online at http://npg.nature.com/ 8. Lambin, P. et al. Radiomics: extracting more information from medical images reprintsandpermissions/ using advanced feature analysis. Eur. J. Cancer 48, 441–446 (2012). How to cite this article: Aerts, H. J. W. L. et al. Decoding tumour phenotype by 9. Kumar, V. et al. Radiomics: the process and the challenges. Magn. Reson. noninvasive imaging using a quantitative radiomics approach. Nat. Commun. 5:4006 Imaging 30, 1234–1248 (2012). doi: 10.1038/ncomms5006 (2014). 10. Zhao, B. et al. Evaluating variability in tumor measurements from same-day repeat CT scans of patients with non–small cell lung cancer. Radiology 252, This work is licensed under a Creative Commons Attribution- 263–272 (2009). NonCommercial-NoDerivs 3.0 Unported License. The images or other 11. van Baardwijk, A. et al. PET-CT-based auto-contouring in non-small-cell lung third party material in this article are included in the article’s Creative Commons license, cancer correlates with pathology and reduces interobserver variability in the unless indicated otherwise in the credit line; if the material is not included under the delineation of the primary tumor and involved nodal volumes. Int. J. Radiat. Creative Commons license, users will need to obtain permission from the license holder Oncol. Biol. Phys. 68, 771–778 (2007). to reproduce the material. To view a copy of this license, visit http://creativecommons. 12. Harrell, F. E. Regression Modeling Strategies: With Applications to Linear org/licenses/by-nc-nd/3.0/ Models, Logistic Regression, and Survival Analysis (Springer, 2001). 
Corrigendum: Decoding tumour phenotype by noninvasive imaging using a quantitative radiomics approach

DOI: 10.1038/ncomms5644 (Nature Communications 5:4644)

Hugo J.W.L. Aerts, Emmanuel Rios Velazquez, Ralph T.H. Leijenaar, Chintan Parmar, Patrick Grossmann, Sara Carvalho, Johan Bussink, René Monshouwer, Benjamin Haibe-Kains, Derek Rietveld, Frank Hoebers, Michelle M. Rietbergen, C. René Leemans, Andre Dekker, John Quackenbush, Robert J. Gillies & Philippe Lambin

Nature Communications 5:4006 doi: 10.1038/ncomms5006 (2014); Published 3 Jun 2014; Updated 7 Aug 2014

The original version of this Article contained a typographical error in the spelling of the author Sara Carvalho, which was incorrectly given as Sara Cavalho. This has now been corrected in both the PDF and HTML versions of the Article.

Journal

Nature Communications, Springer Journals

Published: Jun 3, 2014
