Whole-genome analysis informs breast cancer response to aromatase inhibition

ARTICLE doi:10.1038/nature11143
21 JUNE 2012 | VOL 486 | NATURE | 353 ©2012 Macmillan Publishers Limited. All rights reserved

Matthew J. Ellis¹,²,³*, Li Ding⁴,⁵*, Dong Shen⁴,⁵*, Jingqin Luo³,⁶, Vera J. Suman⁷, John W. Wallis⁴,⁵, Brian A. Van Tine¹, Jeremy Hoog¹, Reece J. Goiffon⁸,⁹,¹⁰, Theodore C. Goldstein¹¹, Sam Ng¹¹, Li Lin¹, Robert Crowder¹, Jacqueline Snider¹, Karla Ballman⁷, Jason Weber¹,⁸,¹², Ken Chen¹³, Daniel C. Koboldt⁴,⁵, Cyriac Kandoth⁴,⁵, William S. Schierding⁴,⁵, Joshua F. McMichael⁴,⁵, Christopher A. Miller⁴,⁵, Charles Lu⁴,⁵, Christopher C. Harris⁴,⁵, Michael D. McLellan⁴,⁵, Michael C. Wendl⁴,⁵, Katherine DeSchryver¹, D. Craig Allred³,¹⁴, Laura Esserman¹⁵, Gary Unzeitig¹⁶, Julie Margenthaler², G. V. Babiera¹³, P. Kelly Marcom¹⁷, J. M. Guenther¹⁸, Marilyn Leitch¹⁹, Kelly Hunt¹³, John Olson¹⁷, Yu Tao⁶, Christopher A. Maher¹,⁴, Lucinda L. Fulton⁴,⁵, Robert S. Fulton⁴,⁵, Michelle Harrison⁴,⁵, Ben Oberkfell⁴,⁵, Feiyu Du⁴,⁵, Ryan Demeter⁴,⁵, Tammi L. Vickery⁴,⁵, Adnan Elhammali⁸,⁹,¹⁰, Helen Piwnica-Worms⁸,¹²,²⁰,²¹, Sandra McDonald²,²², Mark Watson⁶,¹⁴,²², David J. Dooling⁴,⁵, David Ota²³, Li-Wei Chang³,¹⁴, Ron Bose²,³, Timothy J. Ley¹,²,⁴, David Piwnica-Worms⁸,⁹,¹⁰,¹²,²⁴, Joshua M. Stuart¹¹, Richard K. Wilson²,⁴,⁵ & Elaine R. Mardis²,⁴,⁵

¹Department of Internal Medicine, Division of Oncology, Washington University, St Louis, Missouri, USA. ²Siteman Cancer Center, Washington University, St Louis, Missouri 63110, USA. ³Breast Cancer Program, Washington University, St Louis, Missouri 63110, USA. ⁴The Genome Institute, Washington University, St Louis, Missouri 63108, USA. ⁵Department of Genetics, Washington University, St Louis, Missouri 63108, USA. ⁶Division of Biostatistics, Washington University, St Louis, Missouri 63110, USA. ⁷ACOSOG Statistical Center, Mayo Clinic, Rochester, Minnesota 55905, USA. ⁸BRIGHT Institute, Washington University School of Medicine, St Louis, Missouri 63110, USA. ⁹Molecular Imaging Center, Washington University, St Louis, Missouri 63110, USA. ¹⁰Mallinckrodt Institute of Radiology, Washington University, St Louis, Missouri 63110, USA. ¹¹Department of Biomolecular Engineering, University of California, Santa Cruz, California 95064, USA. ¹²Department of Cell Biology and Physiology, Washington University, St Louis, Missouri 63110, USA. ¹³M. D. Anderson Cancer Center, Houston, Texas 77030, USA. ¹⁴Department of Pathology and Immunology, Washington University, St Louis, Missouri 63110, USA. ¹⁵Helen Diller Cancer Center, University of California, San Francisco, California 94143, USA. ¹⁶Doctors Hospital of Laredo, Laredo, Texas 78045, USA. ¹⁷Duke University Cancer Center, Durham, North Carolina 27705, USA. ¹⁸Good Samaritan Hospital, Cincinnati, Ohio 45406, USA. ¹⁹Simmons Cancer Center, University of Texas Southwestern, Dallas, Texas 75390, USA. ²⁰Department of Internal Medicine, Washington University, St Louis, Missouri 63110, USA. ²¹Howard Hughes Medical Institute, Chevy Chase, Maryland 20815, USA. ²²ACOSOG Central Specimen Bank, Washington University, St Louis, Missouri 63110, USA. ²³ACOSOG Operations Center, Duke University, Durham, North Carolina 27705, USA. ²⁴Department of Developmental Biology, Washington University, St Louis, Missouri 63110, USA.
*These authors contributed equally to this work.

To correlate the variable clinical features of oestrogen-receptor-positive breast cancer with somatic alterations, we studied pretreatment tumour biopsies accrued from patients in two studies of neoadjuvant aromatase inhibitor therapy by massively parallel sequencing and analysis. Eighteen significantly mutated genes were identified, including five genes (RUNX1, CBFB, MYH9, MLL3 and SF3B1) previously linked to haematopoietic disorders. Mutant MAP3K1 was associated with luminal A status, low-grade histology and low proliferation rates, whereas mutant TP53 was associated with the opposite pattern. Moreover, mutant GATA3 correlated with suppression of proliferation upon aromatase inhibitor treatment. Pathway analysis demonstrated that mutations in MAP2K4, a MAP3K1 substrate, produced similar perturbations as MAP3K1 loss. Distinct phenotypes in oestrogen-receptor-positive breast cancer are associated with specific patterns of somatic mutations that map into cellular pathways linked to tumour biology, but most recurrent mutations are relatively infrequent. Prospective clinical trials based on these findings will require comprehensive genome sequencing.

Oestrogen-receptor-positive breast cancer exhibits highly variable prognosis, histological growth patterns and treatment outcomes. Neoadjuvant aromatase inhibitor treatment trials provide an opportunity to document oestrogen-receptor-positive breast cancer phenotypes in a setting where sample acquisition is easy, prospective consent for genomic analysis can be obtained and responsiveness to oestrogen deprivation therapy is documented. We therefore conducted massively parallel sequencing (MPS) on 77 samples accrued from two neoadjuvant aromatase inhibitor clinical trials²,³. Forty-six cases underwent whole-genome sequencing (WGS) and 31 cases underwent exome sequencing, followed by extensive analysis for somatic alterations and their association with aromatase inhibitor response. Case selection for discovery was based on the levels of the tumour proliferation marker Ki67 in the surgical specimen, because high cellular proliferation despite aromatase inhibitor treatment identifies poor prognosis tumours exhibiting oestrogen-independent growth⁴ (Supplementary Fig. 1). Twenty-nine samples had Ki67 levels above 10% (‘aromatase-inhibitor-resistant tumours’, median Ki67 21%, range 10.3–80%) and 48 were at or below 10% (‘aromatase-inhibitor-sensitive tumours’, median Ki67 1.2%, range 0–8%). Cases were also classified as luminal A or B by gene expression profiling. We subsequently examined interactions between Ki67 biomarker change, histological categories, intrinsic subtype and mutation status in selected recurrently mutated genes in 310 cases overall. Pathway analysis was applied to contrast the signalling perturbations in aromatase-inhibitor-sensitive versus aromatase-inhibitor-resistant tumours.

Results

The mutation landscape of luminal-type breast cancer

Using paired-end MPS, 46 tumour and normal genomes were sequenced to at least 30-fold and 25-fold haploid coverage, respectively, with diploid coverage of at least 95% based on concordance with SNP array data (Supplementary Table 1). Candidate somatic events were identified using multiple algorithms⁵,⁶, and were then verified by hybridization capture-based validation that targeted all putative somatic single-nucleotide variants (SNVs) and small insertions/deletions (indels) that
overlap coding exons, splice sites and RNA genes (tier 1), high-confidence SNVs and indels in non-coding conserved or regulatory regions (tier 2), as well as non-repetitive regions of the human genome (tier 3). In addition, somatic structural variants and germline structural variants that potentially affect coding sequences (Supplementary Information) were assessed. Digital sequencing data from captured target DNAs from the 46 tumour and normal pairs (Supplementary Table 2 and Supplementary Information) confirmed 81,858 mutations (point mutations and indels) and 773 somatic structural variants. The average numbers of somatic mutations and structural variants were 1,780 (range 44–11,619) and 16.8 (range 0–178) per case, respectively (Supplementary Table 3). Tier 1 point mutations and small indels predicted for all 46 cases also were validated using both 454 and Illumina sequencing (Supplementary Information). BRC25 was a clear outlier with only 44 validated tiers 1–3 mutations, all at low allele frequencies (ranging from 5% to 26.8%). This sample probably had low tumour content despite histopathology assessment, but the data are included to avoid bias.

The overall mutation rate was 1.18 validated mutations per megabase (Mb) (tier 1: 1.05; tier 2: 1.14; tier 3: 1.20). The mutation rate for tier 1 was higher than that observed for acute myeloid leukaemia (0.18–0.23)⁶,⁷, but lower than that reported for hepatocellular carcinoma (1.85)⁸, malignant melanoma (6.65)⁹ and lung cancers (3.05–8.93)¹⁰,¹¹ (Supplementary Table 4). The background mutation rate (BMR) across the 21 aromatase-inhibitor-resistant tumours was 1.62 per Mb, nearly twice that of the 25 aromatase-inhibitor-sensitive tumours at 0.824 per Mb (P = 0.02, one-sided t-test). A trend for more somatic structural variations in the aromatase-inhibitor-resistant group was also observed, as the validated somatic structural variation frequency in the 21 aromatase-inhibitor-resistant tumour genomes was 21.69 versus an average of 12.76 in 25 aromatase-inhibitor-sensitive tumours (P = 0.16, one-sided t-test) (Fig. 1). If ten TP53 mutated cases were excluded, the background mutation rate still tended to be higher in the aromatase-inhibitor-resistant group (P = 0.08). To demonstrate that a single-tumour core biopsy produced representative genomic data, whole-genome sequencing of two pre-treatment biopsies was conducted for 5 of the 46 cases. The frequency of mutations in the paired specimens showed high concordance in all cases (correlation coefficients ranged from 0.74 to 0.95) (Supplementary Fig. 2) and a somatic mutation was infrequently detected in only one of the two samples (4.65% overall).

Significantly mutated genes in luminal breast cancer

The discovery effort was extended by studying 31 additional cases by exome sequencing, producing an additional 1,371 tier 1 mutations. In total the 77 cases yielded 3,355 tier 1 somatic mutations, including 3,208 point mutations, 1 dinucleotide mutation and 146 indels, ranging from 1 to 28 nucleotides. The point mutations included 733 silent, 2,145 missense, 178 nonsense, 6 read-through, 69 splice-site mutations and 77 in RNA genes (Supplementary Table 5). Of 2,145 missense mutations, 1,551 were predicted to be deleterious by SIFT and/or PolyPhen¹³. The MuSiC package⁴⁵ was applied to determine the significance of the difference between observed versus expected mutation events in each gene, on the basis of the background mutation rate. This identified 18 significantly mutated genes with a convolution false discovery rate (FDR) < 0.26 (Table 1 and Supplementary Table 6). The list contains genes previously identified as mutated in breast cancer (PIK3CA¹⁴, TP53¹⁵, GATA3¹², CDH1¹³, RB1¹⁶, MLL3¹⁷, MAP3K1¹⁸ and CDKN1B¹⁹) as well as genes not previously observed in clinical breast cancer samples, including TBX3, RUNX1, LDLRAP1, MYH9, AGTR2, STMN2, SF3B1 and CBFB.

Thirteen mutations (3 nonsense, 6 frame-shift indels, 2 in-frame deletions and 2 missense) were identified in MAP3K1 (Table 1 and Fig. 2), a serine/threonine kinase that activates the ERK and JNK kinase pathways through phosphorylation of MAP2K1 and MAP2K4 (ref. 20). Of interest, a missense (S184L) and a splice-region mutation (e2+3, probably affecting splicing) in MAP2K4 were observed in two tumours with no MAP3K1 mutation (Fig. 2). Single nonsynonymous mutations in MAP3K12, MAP3K4, MAP4K3, MAP4K4, MAPK15 and MAPK3 were also detected (Supplementary Table 5). TBX3 harboured three small indels (one insertion and two deletions). TBX3 affects expansion of breast cancer stem-like cells through regulation of FGFR. Two truncating mutations in the tumour suppressor CDKN1B were

[Figure 1 image: Circos plots for six genomes. Top panel (aromatase-inhibitor-sensitive): BRC15, BRC17, BRC22. Bottom panel (aromatase-inhibitor-resistant): BRC44, BRC47, BRC50.]

Figure 1 | Genome-wide somatic mutations. Circos plots indicate validated somatic mutations comprising tier 1 point mutations and indels, genome-wide copy number alterations, and structural rearrangements in six representative genomes. Three on-treatment Ki67 less than or at 10% (top panel: BRC15, BRC17 and BRC22) and three on-treatment Ki67 greater than 10% (bottom panel: BRC44, BRC47 and BRC50) cases are shown. Significantly mutated genes are highlighted in red. No purity-based copy number corrections were used for plotting copy number.
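The mutation tallies quoted above can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only and is not part of the published analysis: the variable names are hypothetical, the counts are copied from the text, and the number of tier 1 mutations attributed to the 46 whole genomes is inferred here by subtraction rather than stated in the article.

```python
# Tier 1 point-mutation classes reported for the 77 discovery cases.
point_classes = {
    "silent": 733,
    "missense": 2145,
    "nonsense": 178,
    "read_through": 6,
    "splice_site": 69,
    "rna_gene": 77,
}
point_total = sum(point_classes.values())

# Tier 1 total = point mutations + 1 dinucleotide mutation + 146 indels.
tier1_total = point_total + 1 + 146

# Inferred, not stated in the text: tier 1 mutations from the 46 WGS cases,
# obtained by subtracting the 1,371 contributed by the 31 exome cases.
wgs_tier1 = tier1_total - 1371

# Share of missense mutations predicted deleterious by SIFT and/or PolyPhen.
deleterious_share = 1551 / point_classes["missense"]

# Background mutation rate (per Mb): resistant versus sensitive tumours.
bmr_ratio = 1.62 / 0.824

# Per-case averages across the 46 whole genomes.
mean_mutations = 81858 / 46
mean_structural_variants = 773 / 46

print(point_total, tier1_total, wgs_tier1)
print(f"{deleterious_share:.1%} deleterious; BMR ratio {bmr_ratio:.2f}")
print(f"{mean_mutations:.0f} mutations and {mean_structural_variants:.1f} SVs per case")
```

The class counts sum to the 3,208 point mutations and 3,355 tier 1 mutations given in the text, the per-case averages round to the reported 1,780 and 16.8, and the rate ratio of about 1.97 agrees with the description of the resistant-group rate as nearly twice the sensitive-group rate.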
Table 1 | Significantly mutated genes identified in 46 whole genomes and 31 exomes sequenced in luminal breast cancer patients

Gene               Total  MS  NS  Indel  SS  P value        FDR
MAP3K1             13     2   3   8      0   0              0
PIK3CA             45     44  0   1      0   0              0
TP53               18     13  1   1      1   0              0
GATA3              8      1   0   4      3   1.15 × 10⁻¹⁹   7.41 × 10⁻¹⁶
CDH1               8      1   1   5      1   3.07 × 10⁻¹⁵   1.59 × 10⁻¹¹
TBX3               3      0   0   3      0   2.58 × 10^[…]  0.011
ATR                6      6   0   0      0   3.73 × 10^[…]  0.014
RUNX1              4      4   0   0      0   6.59 × 10^[…]  0.021
ENSG00000212670*   2      2   0   0      0   2.31 × 10^[…]  0.066
RB1                4      2   1   0      1   2.76 × 10^[…]  0.071
LDLRAP1            2      1   1   0      0   4.27 × 10^[…]  0.092
STMN2              2      1   0   1      0   4.15 × 10^[…]  0.092
MYH9               4      1   1   2      0   8.96 × 10^[…]  0.178
MLL3               5      1   1   3      0   1.04 × 10⁻⁴    0.191
CDKN1B             2      0   1   1      0   1.39 × 10^[…]  0.240
AGTR2              2      2   0   0      0   1.71 × 10^[…]  0.256
SF3B1              3      3   0   0      0   1.79 × 10^[…]  0.256
CBFB               2      1   1   0      0   1.70 × 10^[…]  0.256

* ENSG00000212670 is not in RefSeq release 50.
MS, missense; NS, nonsense; SS, splice site.

identified. Four missense RUNX1 mutations were observed, with three in the RUNT domain clustered within the 8 amino acid putative ATP-binding site (R166Q, G168E and R169K). RUNX1 is a transcription factor affected by mutation and translocation in the M2 subtype of acute myeloid leukaemia and is implicated in tethering the oestrogen receptor to promoters independently of oestrogen response elements. Two mutations (N104S and N140*) were also identified in CBFB, the binding partner of RUNX1. Additional mutations included three missense mutations (two K700E and one K666Q) in SF3B1, a splicing factor implicated in myelodysplasia²⁴ and chronic lymphocytic leukaemia²⁵. One missense mutation, one nonsense mutation and two indels were found in the MYH9 gene, involved in hereditary macrothrombocytopenia as well as being observed in an ALK translocation in anaplastic large cell lymphoma²⁷.

We also identified three significantly mutated genes (LDLRAP1, AGTR2 and STMN2) not previously implicated in cancer. A missense and a nonsense mutation were observed in LDLRAP1, a gene associated with familial hypercholesterolaemia. AGTR2, angiotensin II receptor type 2, harboured two missense mutations (V184I and R251H). Angiotensin signalling and oestrogen receptor intersect in models of tissue fibrosis. STMN2, a gene activated by JNK family kinases³⁰,³¹ and therefore regulated by MAP3K1 and MAP2K4, harboured one frameshift deletion and one missense mutation. Three deletions and one point mutation (Supplementary Fig. 3) were identified in a large, infrequently spliced non-coding (lnc) RNA gene, MALAT1 (metastasis associated lung adenocarcinoma transcript 1), that regulates alternative splicing by modulating the phosphorylation of SR splicing factor³². Translocations and point mutations of MALAT1 have been reported in sarcoma³³ and colorectal cancer cell lines³⁴. Five additional MALAT1 mutations were found in the recurrent screening set (Supplementary Table 5d). The locations of these mutations clustered in a region of species homology (F1 and 2 domains) that could mediate interactions with SRSF1 (ref. 32, Supplementary Fig. 4). Non-coding mutation clusters were found in ATR, GPR126 and NRG3 (Supplementary Information and Supplementary Table 7).

[Figure 2 image: mutation positions along MAP3K1 (scale 0–1500 amino acid positions) and MAP2K4 (scale 0–400 amino acid positions). Legend: frame-shift deletion, frame-shift insertion, in-frame deletion, missense, nonsense, splice site, splice region, splice site insertion; domains: ATP-binding site, serine/threonine site, active site, RING, SWIM. MAP3K1 labels: V218fs, R273fs, S292*, F327fs, F360fs, R364G, Q367*, H393Q, H393fs, N406fs, S431fs, S438fs, P444fs, T542fs, S628*, L707fs, I761fs, Y790fs, R790fs, Q886*, K913fs, S939 in-frame deletion*, Q957*, S989fs, T999fs, K1003fs, R1012fs, Q1022fs, P1034S, L1052fs, Q1259*, T1267fs, M1269fs, Y1276fs, E1286 in-frame deletion, I1295fs, N1305D, I1307 in-frame deletion, Y1319fs, S1344*, V1346 in-frame deletion, L1352fs, I1366 in-frame deletion, LLID1375 in-frame deletion, A1396V, F1415C, S1471fs, Q1492*, Q1494*, R1509C. MAP2K4 labels: 72e2+3, R75fs, H79fs, R134W, S184L, D186G, 211e5+2, P272fs, 297e9-1, 346e9+1.]

Figure 2 | MAP3K1 and MAP2K4 mutations observed in 317 samples. Somatic status of all mutations was obtained by Sanger sequencing of PCR products or Illumina sequencing of targeted capture products. The locations of conserved protein domains are highlighted. Each nonsynonymous substitution, splice site mutation or indel is designated with a circle at the representative protein position with colour to indicate translation effects of the mutation. Asterisk, nonsense mutations that cause truncation of the open reading frame.

Correlating mutations with clinical data

To study clinical correlations, mutation recurrence screening was conducted on an additional 240 cases (Supplementary Table 8 and Supplementary Fig. 1). By combining WGS, exome and recurrence screening data, we determined the mutation frequency in PIK3CA to be 41.3% (131 of 317 tumours) (Supplementary Table 5a–d and Supplementary Fig. 3). TP53 was mutated in 51 of 317 tumours (16.1%) (Supplementary Table 5a–d and Supplementary Fig. 3). Additionally, 52 nonsynonymous MAP3K1 mutations in 39 tumours and 10 mutations in its substrate MAP2K4 were observed, representing a combined case frequency of 15.5% (Supplementary Table 5a–d and Fig. 3). Of note, 52 of the 62 non-silent mutations in MAP3K1 and MAP2K4 were scattered indels or other protein-truncating events, strongly suggesting functional inactivation. In addition, 13 tumours harboured two non-silent MAP3K1 mutations, indicative of bi-allelic loss and reinforcing the conclusion that this gene is a tumour suppressor. Twenty-nine tumours harboured a total of 30 mutations in GATA3, consisting of 25 truncation events, one in-frame insertion, and 4 missense mutations including 3 recurrent mutations at M294K (Supplementary Table 5a–d and Supplementary Fig. 3). BRC8 harboured a chromosome 10 deletion that includes GATA3. CDH1 mutation data were available for 169 samples and, as expected, its mutation status was strongly associated with lobular breast cancer (Table 2a). We applied a permutation-based approach in MuSiC⁴⁵ to ascertain relationships between mutated genes. Negative correlations were found between mutations in gene pairs such as GATA3 and PIK3CA (P = 0.0026), CDH1 and GATA3 (P = 0.015), and CDH1 and TP53 (P = 0.022). MAP3K1 and MAP2K4 mutations were mutually exclusive, albeit without reaching statistical significance (P = 0.3). In contrast, a positive correlation between MAP3K1/MAP2K4 and PIK3CA mutations was highly significant (P = 0.0002) (Supplementary Table 9).

[Figure 3 image: three panels showing read pairs and sequence depth for tumour and matched normal. Deletion involving MAP3K1 (BRC49, chr 5:56165381–56372730); deletion involving MAP2K4 (BRC47, chr 17:11571485–12568748); deletion involving NRG1 (BRC49, chr 8:28017654–32431503).]

Figure 3 | Structural variants in significantly mutated or frequently deleted genes. One MAP3K1 deletion in BRC49 and one MAP2K4 deletion in BRC47, and one ELP3-NRG1 fusion in BRC49 identified using Illumina paired-end reads from whole-genome sequence data. Arcs represent multiple breakpoint-spanning read pairs with sequence coverage depth plotted in black across the region. Chr, chromosome.

Two independent mutation data sets, designated ‘Set 1’ (discovery cohort) and ‘Set 2’ (validation cohort), from these clinical trial samples were analysed separately and then in combination, with a false discovery rate (FDR)-corrected P value to gauge the overall strength and consistency of genotype–phenotype relationships (Table 2a, b and Supplementary Fig. 1). TP53 mutations in both data sets correlated with significantly higher Ki67 levels, both at baseline (P = 0.0003) and at surgery (P = 0.001). Furthermore, TP53 mutations were significantly enriched in luminal B tumours (P = 0.04) and in higher histological grade tumours (P = 0.02). In contrast, MAP3K1 mutations were more frequent in luminal A tumours (P = 0.02), in grade 1 tumours (P = 0.005) and in tumours with lower Ki67 at baseline (P = 0.001) with consistent findings across both data sets. GATA3 mutation did not influence baseline Ki67 levels but was enriched in samples exhibiting greater percentage Ki67 decline (P = 0.01). This finding requires further verification because it was significant in Set 1 (uncorrected P value 0.003) but was a marginal finding in Set 2 (P = 0.08). However, it suggests GATA3 mutation may be a positive predictive marker for aromatase inhibitor response.

Structural variation and DNA repair mechanisms

Analysis of copy number alterations (CNAs) revealed arm-level gains for 1q, 5p, 8q, 16p, 17q, 20p and 20q and arm-level losses for 1p, 8p, 16q, and 17p in the 46 WGS tumour genomes (Supplementary Fig. 5). A total of 773 structural variants (579 deletions, 189 translocations and 5 inversions) identified by WGS were validated as somatic in 46 breast cancer genomes by capture validation. No recurrent translocations were detected but six in-frame fusion genes were validated by reverse transcription followed by PCR (Supplementary Information and Supplementary Tables 10–13). Seven tumours had multiple complex translocations with breakpoints suggestive of a catastrophic mitotic event (‘chromothripsis’; Supplementary Table 11). Analysis of the structural variant genomic breakpoints shows the spectra of putative chromothripsis-related events are the same as seen for other somatic events, with the majority of structural variants arising from non-homologous end-joining. We classified somatic (mitotic) and germline (meiotic) structural variants into four groups: variable number tandem repeat (VNTR), non-allelic homologous recombination (NAHR), microhomology-mediated end joining (MMEJ), and non-homologous end joining (NHEJ), according to criteria described in Supplementary Information. The fraction of each classification is shown for germline and somatic (mitotic) events (Supplementary Table 14). There were significantly more somatic NHEJ events in tumour genomes than the other three types (P < 2.2 × 10^[…]).

Pathways relevant to aromatase inhibitor response

PathScan analysis (Supplementary Table 15 and Supplementary Information) indicated that somatic mutations detected in the 77 discovery cases affect a number of pathways, including caspase cascade/apoptosis, ErbB signalling, Akt/PI3K/mTOR signalling, TP53/RB signalling and MAPK/JNK pathways (Fig. 4a). To discern the pathways relevant to aromatase inhibitor sensitivity, we conducted separate pathway analyses for aromatase-inhibitor-sensitive versus aromatase-inhibitor-resistant tumours. Whereas the majority of top altered pathways (FDR ≤ 0.15) in each group are shared, several pathways were enriched in the aromatase-inhibitor-resistant group, including the TP53 signalling pathway, DNA replication, and mismatch repair. Specifically, 38% of the aromatase-inhibitor-resistant group (11 of 29 tumours) have mutations in the TP53 pathway with three having double or triple hits involving TP53, ATR, APAF1 or THBS1. In contrast, only 16.6% (8 of 48 tumours) of the Ki67 low group had mutations in the TP53 signalling pathway, each with only a single hit in genes TP53, ATR, CCNE2 or IGF1 (Supplementary Table 16).

GeneGo pathway analysis of MetaCore interacting network objects was used to identify genes in the 77 luminal breast cancers with low-frequency mutations that cluster into pathway maps. Eight networks assembled from significant maps encompassed mutations from 71 (92%) of the tumours (Fig. 4b). Many of the network objects shared pathways with significantly mutated genes such as TP53, MAP3K1, PIK3CA and CDH1. GeneGo analysis also revealed that several genes with low-frequency mutations were actually subunits of complexes, resulting in higher mutation rates for that object, for example, the condensin complex (4 mutations in 4 genes) and the MRN complex (4 mutations in 3 genes). Several pathways without multiple significantly mutated genes, such as the apoptotic cascade, calcium/phospholipase signalling and G-protein-coupled receptors, were significantly affected by low-frequency mutations. Grouping tumours by significantly mutated genes and pathway mutation status showed that whereas 55 (71%) of the tumours contained significantly mutated genes in significant pathways, an additional 16 (21%) contained only non-significantly mutated genes in these pathways. Thus, tumours without a given significantly mutated gene often had other mutations in the same relevant pathway (Fig. 4b, Supplementary Fig. 6, Supplementary Table 17 and Supplementary Information).

[Figure 4 image: a, pathway diagram of receptor tyrosine kinases (ERBB2, PDGFRA, EPHA7, CSF1R, DDR1, MET, KIT) and downstream PI3K/AKT, MAPK, TP53/RB, cell cycle progression and cell death components, annotated with alteration frequencies. Legend: predicted functional inactivation; predicted functional activation; C, copy number alteration; M, mutation; S, structural variation; activation; inhibition. b, concentric circle diagram of eight networks: receptors; MAPKs (MAP3K1, STMN2); GTPases, GPCRs (AGTR2); DNA damage, cell cycle (MLL2/3 complex (MLL3*, MLL2, PAXIP1, NCOA6, PAXIP1-associated protein 1), ATR, SF3B1, GATA3, RB1, TP53); PI3K, Akt, mTor (PIK3CA); Ca2+, PLC signalling; apoptosis; Wnt (CDH1). Markers: SMG mutations; SMG in a complex with mutated non-SMGs.]

Figure 4 | Key cancer pathway components altered in luminal breast tumours. a, Only genetic alterations identified in 46 WGS cases are shown. Alterations were discovered in key genes in the TP53/RB, MAPK, PI3K/AKT/mTOR pathways. Genes coloured blue and red are predicted to be functionally inactivated and activated, respectively, through focused mutations including point mutations and small indels (M), copy number deletions (C), or other structural changes (S) that affect the gene. The inter-connectedness of this network (several pathways) shows that there are many different ways to perturb a pathway. b, Eight interaction networks from canonical maps are significantly over-represented by mutations in 77 luminal breast tumours (46 WGS and 31 exome cases). In the concentric circle diagram, tumours are arranged as radial spokes and categorized by their mutation status in each network (concentric ring colour) and significantly mutated gene mutation status (black dots). Tumour classification by pathway analysis shows many tumours unaffected by a given significantly mutated gene often harbour other mutations in the same network. For full annotation, see Supplementary Information and Supplementary Fig. 6. PLC, phospholipase C; SMG, significantly mutated gene.

We also applied PARADIGM to infer pathway-informed gene activities using gene expression and copy-number data to identify several ‘hubs’ of activity (Supplementary Fig. 7, Supplementary Fig. 8 and Supplementary Information). As expected, ESR1 and FOXA1 were among the hubs activated cohort-wide while other hubs exhibited high but differential changes in aromatase-inhibitor-resistant tumours including MYC, FOXM1 and MYB (Supplementary Fig. 8). The concordance among the 104 MetaCore maps from GeneGo analysis described above is significant, with 75 (72%) matching one of the PARADIGM subnetworks at the 0.05 significance level after multiple test correction (P < 4.4 × 10^[…]; Bonferroni-adjusted hypergeometric test) (Supplementary Fig. 9). We identified significant subnetworks associated with Ki67 biomarker status (Supplementary Fig. 10 and Supplementary Information) involving transcription factors controlling large regulons.

The PARADIGM-inferred pathway signatures were further used to derive a map of the genetic mechanisms that may underlie treatment response. A subnetwork was constructed in which interactions were retained only if they connected two features with higher than average absolute association with Ki67 biomarker status (Supplementary Figs 10 and 11 and Supplementary Information). Consistent with the PathScan results, among the largest of the hubs in the identified network were a central DNA damage hub with the second highest connectivity (55 regulatory interactions; 1% of the network) and TP53 with the 14th highest connectivity (26 connections; 0.5% of the network). Additional highly connected hubs identified in order of connectivity were MYC with 79 connections (1.4%), FYN with 45 (0.8%), MAPK3 with 43, JUN with 40, HDAC1 with 40, SHC1 with 39, and HIF1A/ARNT complex with 39 (Supplementary Fig. 11).

To identify higher-level connections between mutations and clinical features, we compared the samples on the basis of pathway-derived signatures. For each clinical attribute and each significantly mutated gene, we dichotomized the discovery samples into a positive and a negative group to derive pathway signatures that discriminated between the groups (see details in Supplementary Information). We then computed all pair-wise Pearson correlations between pathway signatures and clustered the resulting correlations (Fig. 5). The entire process was repeated using validated mutations and signatures derived from the validation set (Supplementary Fig. 12). In line with expectation, PIK3CA, MAP3K1, MAP2K4, and low risk preoperative endocrine prognostic index (PEPI) scores (PEPI is an index of recurrence risk post neoadjuvant aromatase inhibitor therapy) cluster with the luminal A subtypes and with each other, and are supported by the validation set analysis. The luminal B-like signatures included TP53, RB1, RUNX1 and MALAT1, which also associated with other poor outcome features such as high baseline and surgical Ki67 levels, high grade histology and high PEPI scores. The TP53 and MALAT1 associations in the discovery set also were supported by the validation set analysis.

[Figure 5 image: discovery set (n = 77), signature correlation between differential pathway signatures, grouped into luminal A and luminal B clusters. Colour scale: −1.0 (anticorrelation), 0.00 (no correlation), 1.0 (correlation). Differential signatures and their contrasts: MLL3, PIK3CA, MAP2K4, MAP3K1, CDH1, ATR, MALAT1, BIRC6, TP53, CDKN1B, RUNX1 and RB1 mutation (mutation vs wild type); histopathology type (lobular vs ductal); PEPI 0* (PEPI = 0 vs PEPI > 0); PAM50 subtype luminal A (luminal A vs luminal B); histopathology grade (grade II, III vs grade I); PAM50 subtype luminal B (luminal B vs luminal A); baseline Ki67 (above 14% vs below 14%); PEPI score (above mean vs below mean); end of treatment Ki67 (above 10% vs below 10%).]

Druggable gene analysis

We defined mutations in druggable tyrosine kinase domains including in ERBB2 (a V777L and a 755–759 (LRENT) in-frame deletion homologous to gefitinib-sensitizing EGFR mutations in lung cancer), as well as in DDR1 (A829V, R611C), DDR2 (E583D), CSF1R (D735H, M875L), and PDGFRA (E924K). In addition, pleckstrin homology domain mutations were observed in AKT1 (C77F) and AKT2 (S11F) and a kinase domain mutation was identified in RPS6KB1 (S375F) (Supplementary Table 18).

Discussion

The low frequency of many significantly mutated genes presents an enormous challenge for correlative analysis, but several statistically significant patterns were identified, including the relationship between MAP3K1 mutation, luminal A subtype, low tumour grade and low Ki67 proliferation index.
On this basis, for patients with MAP3K1 Figure 5 | Pathway signatures reveal connections between mutations and mutant luminal tumours, neoadjuvant aromatase inhibitor could pro- clinical outcomes. PARADIGM-based pathway signatures were derived for vide a favourable option. In contrast, tumours with TP53 mutations, tumour feature dichotomies including mutation driven gene signatures which are mostly aromatase inhibitor resistant, would be more appro- (mutant versus non-mutant), histopathology type (lobular versus ductal), priately treated with other modalities. MAP3K1 activates the ERK preoperative endocrine prognostic index (PEPI) score (PEPI5 0 favourable family, thus, loss of ERK signalling could explain the indolent nature versus PEPI .0 unfavourable), PAM50 (50-gene intrinsic breast cancer of MAP3K1-deficient tumours . However, MAP3K1 also activates subtype classifier) luminal A subtype (luminal A versus luminal B) and the JNK through MAP2K4, which also can be mutated . Loss of JNK reverse (luminal B versus luminal A), histopathology grade (grades II and III signalling produces a defect in apoptosis in response to stress, which versus I), baseline Ki67 levels ($ 14% versus , 14%), and end-of-treatment 39,40 would hypothetically explain why these mutations accumulate . Ki67 levels ($ 10% versus , 10%) and overall PEPI score (higher than mean unfavourable versus lower than mean favourable). Pearson correlations were PIK3CA harboured the most mutations (41.3%) but was neither asso- computed between all pair-wise signatures; positive correlations, red; negative ciated with clinical nor Ki67 response, confirming our earlier report . correlations, blue; column features ordered identically as rows. Correlation However, the positive association between MAP3K1/MAP2K4 muta- analysis on the 77 samples in the discovery set is shown. 
Asterisk: Ki67, 2.7%, tions and PIK3CA mutation at both the mutation and pathway levels oestrogen-receptor-positive, node negative and tumour size#5cm. suggests cooperativity (Fig. 4a). The finding of multiple significantly mutated genes linked previ- ously to benign and malignant haematopoietic disorders suggests that breast cancer, like leukaemia, can be viewed as a stem-cell disorder Table 2 | Correlations between mutations and clinical features a Luminal subtype and histology grade Gene Expression/histo-pathology variable Mutation frequency* Set1 P{ Set2 P{ Whole set FDR P{ TP53 Luminal subtype A 9.3% (13/140) 0.001 0.46 0.041 Luminal subtype B 21.5% (38/177) TP53 Histological grade I 4.5% (3/66) 0.05 0.067 0.02 Histological grade II/III 19.2% (48/250) MAP3K1 Luminal subtype A 20.0% (28/140) 0.018 0.028 0.005 Luminal subtype B 6.2% (11/177) MAP3K1 Histological grade I 25.8% (17/66) 0.061 0.011 0.005 Histological grade II/III 8.8% (22/250) 211 210 CDH1 Histological type ductal 5.9% (10/169) 0.411 2.8 3 10 3.9 3 10 Histological type lobular 50.0% (20/40) b Mutation and Ki67 index Gene Ki67 variable Wild type meanI Mutant meanI Set1 P" Set2 P" Whole set FDR P{ TP53 Baseline 13.1 25.1 3.7 3 10 0.012 0.0003 Surgery 1.4 4 0.0002 0.014 0.001 % change 289.2 284.3 0.09 0.28 0.24 MAP3K1 Baseline 15.8 8.1 0.049 0.001 0.002 Surgery 1.86 0.75 0.11 0.1 0.05 % change 288.3 290.5 0.49 0.65 0.55 GATA3 Baseline 14.8 11.5 0.13 0.95 0.56 Surgery 1.95 0.38 0.001 0.23 0.012 % change 286.8 296.9 0.003 0.08 0.012 * Mutation percentage (mutant cases/total cases in a category), counts are based on all cases (Set 1 and Set 2 combined). { Unadjusted P value from Fisher’s exact test or Chi-square test as appropriate. { Benjamini–Hochberg false discovery rate (FDR)-adjusted P value using all cases (Set1 and Set2 combined). 1 Only 77 cases in Set1 had CDH1 sequencing results. IGeometric means are based on all cases (Set1 and Set2 combined). "Unadjusted P value from Wilcoxon rank sum test. 
that produces indolent or aggressive tumours that display varying phenotypes depending on differentiation blocks generated by different mutation repertoires. Whereas only MLL3 showed statistical significance in the analysis of 46 WGS cases, multiple mutations in genes related to histone modification and chromatin remodelling are worth noting (Supplementary Table 19). An array of coding mutations and structural variations was discovered in methyltransferases (MLL2, MLL3, MLL4 and MLL5), demethyltransferases (KDM6A, KDM4A, KDM5B and KDM5C), and acetyltransferases (MYST1, MYST3 and MYST4). Furthermore, our analysis identified several adenine-thymine (AT)-rich interactive domain-containing protein genes (ARID1A, ARID2, ARID3B and ARID4B) that harboured mutations and large deletions, reinforcing the role of members from the SNF/SWI family in breast cancer.

Pathway analysis enables the evaluation of mutations with low recurrence frequency where statistical comparisons are conventionally underpowered. For example, the eight samples with MAP2K4 mutations were sufficient to derive a reliable pathway-based gene signature in PARADIGM that aligns with MAP3K1. This approach also pointed to a putative connection between MALAT1 and the TP53 pathway. Finally, we provide evidence that transcriptional associations to Ki67 response reside in a connected network under the control of several key ‘hub’ genes including MYC, FYN and MAP kinases, among others. Targeting these hubs in resistant tumours could produce therapeutic advances. In conclusion, the genomic information derived from unbiased sequencing is a logical new starting point for clinical investigation, where the mutation status of an individual patient is determined in advance and treatment decisions are driven by therapeutic hypotheses that stem from knowledge of the genomic sequence and its possible consequences. However, the accrual of large numbers of patients and the use of comprehensive sequencing and gene expression approaches will be required because of the extreme genomic heterogeneity documented by this investigation.

METHODS SUMMARY

Clinical trial samples were accessed from the preoperative letrozole phase 2 study (NCT00084396) that investigated the effect of letrozole for 16 to 24 weeks on surgical outcomes and from the American College of Surgeons Oncology Group (ACOSOG) Z1031 study (NCT00265759) that compared anastrozole with exemestane or letrozole for 16 to 18 weeks before surgery (REMARK flow charts, Supplementary Fig. 1). Baseline snap-frozen biopsy samples with greater than 70% tumour content (by nuclei) underwent DNA extraction and were paired with a peripheral blood DNA sample. Two formalin-fixed biopsies were obtained at baseline and at surgery, and were used to conduct oestrogen receptor and Ki67 immunohistochemistry as previously published. Paired end Illumina reads from tumours and normal samples were aligned to NCBI build36 using BWA. Somatic point mutations were identified using SomaticSniper, and indels were identified by combining results from a modified version of the Samtools indel caller (http://samtools.sourceforge.net/), GATK and Pindel. Structural variations were identified using BreakDancer and SquareDancer (unpublished). All putative somatic events found in 46 cases were validated by targeted custom capture arrays (Nimblegen)/Illumina sequencing and all tier 1 mutations for 46 WGS cases also were validated using PCR/454 sequencing. All statistical analyses, including significantly mutated gene, mutation relation and clinical correlation, were done using the MuSiC package and/or by standard statistical tests (Supplementary Information). Pathway analysis was performed with PathScan, GeneGo Metacore (http://www.genego.com/metacore.php) and PARADIGM. A complete description of the materials and methods used to generate this data set and results is provided in the Supplementary Methods section.

Received 16 June 2011; accepted 12 April 2012. Published online 10 June 2012.

1. Chia, Y. H., Ellis, M. J. & Ma, C. X. Neoadjuvant endocrine therapy in primary breast cancer: indications and use as a research tool. Br. J. Cancer 103, 759–764 (2010).
2. Olson, J. A. Jr et al. Improved surgical outcomes for breast cancer patients receiving neoadjuvant aromatase inhibitor therapy: results from a multicenter phase II trial. J. Am. Coll. Surg. 208, 906–914; discussion 915–906 (2009).
3. Ellis, M. J. et al. Randomized phase II neoadjuvant comparison between letrozole, anastrozole, and exemestane for postmenopausal women with estrogen receptor-rich stage 2 to 3 breast cancer: clinical and biomarker outcomes and predictive value of the baseline PAM50-based intrinsic subtype – ACOSOG Z1031. J. Clin. Oncol. 29, 2342–2349 (2011).
4. Ellis, M. J. et al. Outcome prediction for estrogen receptor-positive breast cancer based on postneoadjuvant endocrine therapy tumor characteristics. J. Natl. Cancer Inst. 100, 1380–1388 (2008).
5. Chen, K. et al. BreakDancer: an algorithm for high-resolution mapping of genomic structural variation. Nature Methods 6, 677–681 (2009).
6. Mardis, E. R. et al. Recurring mutations found by sequencing an acute myeloid leukemia genome. N. Engl. J. Med. 361, 1058–1066 (2009).
7. Ley, T. J. et al. DNA sequencing of a cytogenetically normal acute myeloid leukaemia genome. Nature 456, 66–72 (2008).
8. Totoki, Y. et al. High-resolution characterization of a hepatocellular carcinoma genome. Nature Genet. 43, 464–469 (2011).
9. Pleasance, E. D. et al. A comprehensive catalogue of somatic mutations from a human cancer genome. Nature 463, 191–196 (2010).
10. Pleasance, E. D. et al. A small-cell lung cancer genome with complex signatures of tobacco exposure. Nature 463, 184–190 (2010).
11. Lee, W. et al. The mutation spectrum revealed by paired genome sequences from a lung cancer patient. Nature 465, 473–477 (2010).
12. Usary, J. et al. Mutation of GATA3 in human breast tumors. Oncogene 23, 7669–7678 (2004).
13. Berx, G. et al. E-cadherin is a tumour/invasion suppressor gene mutated in human lobular breast cancers. EMBO J. 14, 6107–6115 (1995).
14. Samuels, Y. et al. High frequency of mutations of the PIK3CA gene in human cancers. Science 304, 554 (2004).
15. Prosser, J., Thompson, A. M., Cranston, G. & Evans, H. J. Evidence that p53 behaves as a tumour suppressor gene in sporadic breast tumours. Oncogene 5, 1573–1579 (1990).
16. T’Ang, A., Varley, J. M., Chakraborty, S., Murphree, A. L. & Fung, Y. K. Structural rearrangement of the retinoblastoma gene in human breast carcinoma. Science 242, 263–266 (1988).
17. Wang, X. X. et al. Somatic mutations of the mixed-lineage leukemia 3 (MLL3) gene in primary breast cancers. Pathol. Oncol. Res. 17, 429–433 (2011).
18. Kan, Z. et al. Diverse somatic mutation patterns and pathway alterations in human cancers. Nature 466, 869–873 (2010).
19. Spirin, K. S. et al. p27/Kip1 mutation found in breast cancer. Cancer Res. 56, 2400–2404 (1996).
20. Fanger, G. R., Johnson, N. L. & Johnson, G. L. MEK kinases are regulated by EGF and selectively interact with Rac/Cdc42. EMBO J. 16, 4961–4972 (1997).
21. Fillmore, C. M. et al. Estrogen expands breast cancer stem-like cells through paracrine FGF/Tbx3 signaling. Proc. Natl Acad. Sci. USA 107, 21737–21742 (2010).
22. Mao, S., Frank, R. C., Zhang, J., Miyazaki, Y. & Nimer, S. D. Functional and physical interactions between AML1 proteins and an ETS protein, MEF: implications for the pathogenesis of t(8;21)-positive leukemias. Mol. Cell. Biol. 19, 3635–3644 (1999).
23. Stender, J. D. et al. Genome-wide analysis of estrogen receptor α DNA binding and tethering mechanisms identifies Runx1 as a novel tethering factor in receptor-mediated transcriptional activation. Mol. Cell. Biol. 30, 3943–3955 (2010).
24. Papaemmanuil, E. et al. Somatic SF3B1 mutation in myelodysplasia with ring sideroblasts. N. Engl. J. Med. 365, 1384–1395 (2011).
25. Wang, L. et al. SF3B1 and other novel cancer genes in chronic lymphocytic leukemia. N. Engl. J. Med. 365, 2497–2506 (2011).
26. Chen, Z. et al. The May-Hegglin anomaly gene MYH9 is a negative regulator of platelet biogenesis modulated by the Rho-ROCK pathway. Blood 110, 171–179 (2007).
27. Lamant, L. et al. Non-muscle myosin heavy chain (MYH9): a new partner fused to ALK in anaplastic large cell lymphoma. Genes Chromosom. Cancer 37, 427–432 (2003).
28. Wilund, K. R. et al. Molecular mechanisms of autosomal recessive hypercholesterolemia. Hum. Mol. Genet. 11, 3019–3030 (2002).
29. Dellê, H. et al. Antifibrotic effect of tamoxifen in a model of progressive renal disease. J. Am. Soc. Nephrol. 23, 37–48 (2012).
30. Tararuk, T. et al. JNK1 phosphorylation of SCG10 determines microtubule dynamics and axodendritic length. J. Cell Biol. 173, 265–277 (2006).
31. Westerlund, N. et al. Phosphorylation of SCG10/stathmin-2 determines multipolar stage exit and neuronal migration rate. Nature Neurosci. 14, 305–313 (2011).
32. Tripathi, V. et al. The nuclear-retained noncoding RNA MALAT1 regulates alternative splicing by modulating SR splicing factor phosphorylation. Mol. Cell 39, 925–938 (2010).
33. Rajaram, V., Knezevich, S., Bove, K. E., Perry, A. & Pfeifer, J. D. DNA sequence of the translocation breakpoints in undifferentiated embryonal sarcoma arising in mesenchymal hamartoma of the liver harboring the t(11;19)(q11;q13.4) translocation. Genes Chromosom. Cancer 46, 508–513 (2007).
34. Xu, C., Yang, M., Tian, J., Wang, X. & Li, Z. MALAT-1: a long non-coding RNA and its important 3′ end functional motif in colorectal cancer metastasis. Int. J. Oncol. 39, 169–175 (2011).
35. Wendl, M. C. et al. PathScan: a tool for discerning mutational significance in groups of putative cancer genes. Bioinformatics 27, 1595–1602 (2011).
36. Vaske, C. J. et al. Inference of patient-specific pathway activities from multi-dimensional cancer genomics data using PARADIGM. Bioinformatics 26, i237–i245 (2010).
37. Lynch, T. J. et al. Activating mutations in the epidermal growth factor receptor underlying responsiveness of non-small-cell lung cancer to gefitinib. N. Engl. J. Med. 350, 2129–2139 (2004).
38. Johnson, G. L. & Lapadat, R. Mitogen-activated protein kinase pathways mediated by ERK, JNK, and p38 protein kinases. Science 298, 1911–1912 (2002).
39. Widmann, C., Johnson, N. L., Gardner, A. M., Smith, R. J. & Johnson, G. L. Potentiation of apoptosis by low dose stress stimuli in cells expressing activated MEK kinase 1. Oncogene 15, 2439–2447 (1997).
40. Wagner, E. F. & Nebreda, A. R. Signal integration by JNK and p38 MAPK pathways in cancer development. Nature Rev. Cancer 9, 537–549 (2009).
41. Ellis, M. J. et al. Phosphatidyl-inositol-3-kinase alpha catalytic subunit mutation and response to neoadjuvant endocrine therapy for estrogen receptor positive breast cancer. Breast Cancer Res. Treat. 119, 379–390 (2010).
42. Prat, A. & Perou, C. M. Mammary development meets cancer genomics. Nature Med. 15, 842–844 (2009).
43. Larson, D. E. et al. SomaticSniper: identification of somatic point mutations in whole genome sequencing data. Bioinformatics 28, 311–317 (2011).
44. Krzywinski, M. et al. Circos: an information aesthetic for comparative genomics. Genome Res. 19, 1639–1645 (2009).
45. Dees, N. et al. MuSiC: Identifying mutational significance in cancer genomes. Genome Res. (in the press).

Supplementary Information is linked to the online version of the paper at www.nature.com/nature.

Acknowledgements This article is dedicated to the memory of Evelyn Lauder in recognition of her efforts to eradicate breast cancer. We would like to thank the participating patients and their families, clinical investigators and their support staffs, and J. A. Zujewski and the Cancer Therapy Evaluation Program at the US National Cancer Institute. We would like to acknowledge the efforts of the following people and groups at The Genome Institute for their contributions to this manuscript: the Analysis Pipeline group for developing the automated analysis pipelines that generated alignments and somatic variants; the LIMS group for developing tools to manage validation array ordering, capture and sequencing, and J. Veizer and H. Schmidt for structural variant and recurrent screening analyses. We thank the many members of the Siteman Cancer Center at Washington University in St Louis for support, and the committed members of the American College of Surgeons Oncology Group and their patients for contributing samples to the Z1031 trial. This work was funded by grants to R.K.W. from the National Human Genome Research Institute (NHGRI U54 HG003079), grants to M.J.E. from the National Cancer Institute (NCI R01 CA095614, NCI U01 CA114722), the Susan G. Komen Breast Cancer Foundation (BCTR0707808), and the Fashion Footwear Charitable Foundation, Inc., grant awards to ACOSOG included NCI U10 CA076001, the Breast Cancer Research Foundation, and clinical trial support from Novartis and Pfizer, and a Center grant (NCI P50 CA94056) to D.P.-W. We also acknowledge institutional support in the form of the Washington University Cancer Genome Initiative (R.K.W.), and a productive partnership with Illumina, Inc. The tissue procurement core was supported by an NCI core grant to the Siteman Cancer Center (NCI 3P50 CA68438). The BRIGHT Institute is supported in part by an ATT/Emerson gift to the Siteman Cancer Center.

Author Contributions M.J.E. led the clinical investigations, biomarker analysis and chip-based genomics. E.R.M., M.J.E., L.D., R.S.F., T.J.L. and R.K.W. designed the experiments. L.D. and M.J.E. led data analysis. D.S., J.W.W., D.C.K., C.C.H., M.D.M., K.C., C.A.Mi., F.D., W.S.S., M.C.W., R.C. and C.K. performed data analysis. D.S., C.A.Ma., J.W.W., J.F.M., C.L. and L.D. prepared figures and tables. R.S.F., L.L.F., R.D., M.H., T.L.V., J.H., L.L., R.C. and J.S. performed laboratory experiments. L.E., G.U., J.M., G.V.B., P.K.M., J.M.G., M.L., K.H. and J.O. provided samples and clinical data. V.J.S., K.B., J.L., Y.T. and C.K. provided statistical and clinical correlation analysis. D.O. oversees the ACOSOG Operations Center that provides oversight and tracking for ACOSOG clinical trials. K.D., S.McD., D.C.A. and M.W. provided pathology analysis. B.A.V.T., J.W., R.J.G., A.E., D.P.-W., H.P.-W., J.M.S., T.C.G., S.N., C.K. and M.C.W. performed pathway analysis. L.-W.C. and R.B. analysed the druggable target mutation data. D.J.D. and B.O. provided informatics support. L.D., M.J.E. and E.R.M. wrote the manuscript. T.J.L., M.C.W. and R.K.W. critically read and commented on the manuscript.

Author Information DNA sequence data are deposited in the restricted access portal at dbGaP, accession number phs000472.v1.p1. Gene expression array data used in the Paradigm training set is deposited in GEO, accession number GSE29442, and a Superseries that covers both the Agilent gene expression data and the Agilent array CGH data used for the Paradigm test set is deposited in GEO, accession number GSE35191. Reprints and permissions information is available at www.nature.com/reprints. This paper is distributed under the terms of the Creative Commons Attribution-Non-Commercial-Share Alike licence, and is freely available to all readers at www.nature.com/nature. The authors declare no competing financial interests. Readers are welcome to comment on the online version of this article at www.nature.com/nature. Correspondence and requests for materials should be addressed to M.J.E., L.D. and E.R.M.
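As an illustration of the "standard statistical tests" named in the Methods Summary and in the Table 2 footnotes, the following minimal Python sketch pairs a two-sided Fisher's exact test with a Benjamini–Hochberg adjustment. It is not the MuSiC implementation or the authors' analysis code. The function names are hypothetical, the 2 × 2 counts are the whole-set TP53 luminal A and luminal B frequencies from Table 2, and the other P values passed to the adjustment are placeholders chosen only to show the call, so the output is not expected to reproduce the set-specific values reported in the table.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    lowest, highest = max(0, col1 - row2), min(row1, col1)
    observed = prob(a)
    # Add up every table that is no more probable than the observed one.
    p_value = sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= observed * (1 + 1e-9)
    )
    return min(1.0, p_value)


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg FDR-adjusted P values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Whole-set counts from Table 2: TP53 mutant / wild type in luminal A and B.
tp53_by_subtype = fisher_exact_two_sided(13, 140 - 13, 38, 177 - 38)

# Placeholder P values (hypothetical) adjusted together with the one above.
adjusted = benjamini_hochberg([tp53_by_subtype, 0.02, 0.2, 0.5])
print(tp53_by_subtype, adjusted)
```

The adjustment depends on how many tests are corrected together, which is why the placeholder list matters: with a different family of comparisons the same unadjusted P value maps to a different FDR-adjusted value.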

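The signature comparison summarized in Fig. 5 rests on pair-wise Pearson correlations between pathway signatures. A minimal Python sketch of that single step follows, assuming each signature is simply a list of per-feature scores. The function names and the toy signature values are hypothetical and invented for illustration; the PARADIGM-derived signatures and the clustering step described in the text are not reproduced here.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    dx = [v - mean_x for v in x]
    dy = [v - mean_y for v in y]
    covariance = sum(a * b for a, b in zip(dx, dy))
    return covariance / sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))


def correlation_matrix(signatures):
    """All pair-wise correlations between named signatures (dict of lists)."""
    names = list(signatures)
    return {
        a: {b: pearson(signatures[a], signatures[b]) for b in names}
        for a in names
    }


# Hypothetical toy signatures: one score per pathway feature (values invented).
toy_signatures = {
    "MAP3K1 mutation": [0.9, 0.4, -0.2, -0.7],
    "MAP2K4 mutation": [0.8, 0.5, -0.1, -0.6],
    "TP53 mutation": [-0.7, -0.3, 0.3, 0.8],
}
matrix = correlation_matrix(toy_signatures)
for name, row in matrix.items():
    print(name, {other: round(r, 2) for other, r in row.items()})
```

Rows and columns share one ordering here, matching the figure legend's note that column features are ordered identically to rows.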


Publisher
Springer Journals
Copyright
Copyright © 2012 by The Author(s)
Subject
Science, Humanities and Social Sciences, multidisciplinary; Science, multidisciplinary
ISSN
0028-0836
eISSN
1476-4687
DOI
10.1038/nature11143

Abstract

ARTICLE doi:10.1038/nature11143 Whole-genome analysis informs breast cancer response to aromatase inhibition 1,2,3 4,5 4,5 3,6 7 4,5 1 Matthew J. Ellis *,Li Ding *, Dong Shen *, Jingqin Luo , Vera J. Suman , John W. Wallis , Brian A. Van Tine , 1 8,9,10 11 11 1 1 1 Jeremy Hoog , Reece J. Goiffon , Theodore C. Goldstein , Sam Ng , Li Lin , Robert Crowder , Jacqueline Snider , 7 1,8,12 13 4,5 4,5 4,5 Karla Ballman , Jason Weber , Ken Chen , Daniel C. Koboldt , Cyriac Kandoth , William S. Schierding , 4,5 4,5 4,5 4,5 4,5 Joshua F. McMichael , Christopher A. Miller , Charles Lu , Christopher C. Harris , Michael D. McLellan , 4,5 1 3,14 15 16 2 Michael C. Wendl , Katherine DeSchryver , D. Craig Allred , Laura Esserman , Gary Unzeitig , Julie Margenthaler , 13 17 18 19 13 17 6 1,4 G. V. Babiera , P. Kelly Marcom , J. M. Guenther , Marilyn Leitch , Kelly Hunt , John Olson ,YuTao , Christopher A. Maher , 4,5 4,5 4,5 4,5 4,5 4,5 Lucinda L. Fulton , Robert S. Fulton , Michelle Harrison , Ben Oberkfell , Feiyu Du , Ryan Demeter , 4,5 8,9,10 8,12,20,21 2,22 6,14,22 Tammi L. Vickery , Adnan Elhammali , Helen Piwnica-Worms , Sandra McDonald , Mark Watson , 4,5 23 3,14 2,3 1,2,4 8,9,10,12,24 David J. Dooling , David Ota , Li-Wei Chang , Ron Bose , Timothy J. Ley , David Piwnica-Worms , 11 2,4,5 2,4,5 Joshua M. Stuart , Richard K. Wilson & Elaine R. Mardis To correlate the variable clinical features of oestrogen-receptor-positive breast cancer with somatic alterations, we studied pretreatment tumour biopsies accrued from patients in two studies of neoadjuvant aromatase inhibitor therapy by massively parallel sequencing and analysis. Eighteen significantly mutated genes were identified, including five genes (RUNX1, CBFB, MYH9, MLL3 and SF3B1) previously linked to haematopoietic disorders. Mutant MAP3K1 was associated with luminal A status, low-grade histology and low proliferation rates, whereas mutant TP53 was associated with the opposite pattern. 
Moreover, mutant GATA3 correlated with suppression of proliferation upon aromatase inhibitor treatment. Pathway analysis demonstrated that mutations in MAP2K4, a MAP3K1 substrate, produced similar perturbations as MAP3K1 loss. Distinct phenotypes in oestrogen-receptor-positive breast cancer are associated with specific patterns of somatic mutations that map into cellular pathways linked to tumour biology, but most recurrent mutations are relatively infrequent. Prospective clinical trials based on these findings will require comprehensive genome sequencing. Oestrogen-receptor-positive breast cancer exhibits highly variable 48 were at or below 10% (‘aromatase-inhibitor-sensitive tumours’, prognosis, histological growth patterns and treatment outcomes. median Ki67 1.2%, range 0–8%). Cases were also classified as luminal Neoadjuvant aromatase inhibitor treatment trials provide an opportunity A or B by gene expression profiling . We subsequently examined inter- to document oestrogen-receptor-positive breast cancer phenotypes in a actions between Ki67 biomarker change, histological categories, setting where sample acquisition is easy, prospective consent for geno- intrinsic subtype and mutation status in selected recurrently mutated mic analysis can be obtained and responsiveness to oestrogen depriva- genes in 310 cases overall. Pathway analysis was applied to contrast tion therapy is documented . We therefore conducted massively parallel the signalling perturbations in aromatase-inhibitor-sensitive versus sequencing (MPS) on 77 samples accrued from two neoadjuvant aromatase-inhibitor-resistant tumours. 2,3 aromatase inhibitor clinical trials . Forty-six cases underwent Results whole-genome sequencing (WGS) and 31 cases underwent exome The mutation landscape of luminal-type breast cancer sequencing, followed by extensive analysis for somatic alterations and their association with aromatase inhibitor response. 
Case selection Using paired-end MPS, 46 tumour and normal genomes were for discovery was based on the levels of the tumour proliferation sequenced to at least 30-fold and 25-fold haploid coverage, respectively, with diploid coverage of at least 95% based on concordance with SNP marker Ki67 in the surgical specimen, because high cellular prolifera- tion despite aromatase inhibitor treatment identifies poor prognosis array data (Supplementary Table 1). Candidate somatic events were 4 5,6 tumours exhibiting oestrogen-independent growth (Supplementary identified using multiple algorithms , and were then verified by hybrid- ization capture-based validation that targeted all putative somatic single- Fig. 1). Twenty-nine samples had Ki67 levels above 10% (‘aromatase- inhibitor-resistant tumours’, median Ki67 21%, range 10.3–80%) and nucleotide variants (SNVs) and small insertions/deletions (indels) that 1 2 3 Department of Internal Medicine, Division of Oncology, Washington University, St Louis, Missouri, USA. Siteman Cancer Center, Washington University, St Louis, Missouri 63110, USA. Breast Cancer 4 5 Program, Washington University, St Louis, Missouri 63110, USA. The Genome Institute, Washington University, St Louis, Missouri 63108, USA. Department of Genetics, Washington University, St Louis, 6 7 8 Missouri 63108, USA. Division of Biostatistics, Washington University, St Louis, Missouri 63110, USA. ACOSOG Statistical Center, Mayo Clinic, Rochester, Minnesota 55905, USA. BRIGHT Institute, 9 10 Washington University School of Medicine, St Louis, Missouri 63110, USA. Molecular Imaging Center, Washington University, St Louis, Missouri 63110, USA. Malinckrodt Institute of Radiology, 11 12 Washington University, St Louis, Missouri 63110, USA. Department of Biomolecular Engineering, University of California, Santa Cruz, California 95064, USA. Department of Cell Biology and Physiology, 13 14 Washington University, St Louis, Missouri 63110, USA. M. D. 
Anderson Cancer Center, Houston, Texas 77030, USA. Department of Pathology and Immunology, Washington University, St Louis, Missouri 15 16 17 63110, USA. Helen Diller Cancer Center, University of California, San Francisco, California 94143, USA. Doctors Hospital of Laredo, Laredo, Texas 78045, USA. Duke University Cancer Center, Durham, 18 19 20 North Carolina 27705, USA. Good Samaritan Hospital, Cincinnati, Ohio 45406, USA. Simmons Cancer Center, University of Texas Southwestern, Dallas, Texas 75390, USA. Department of Internal 21 22 Medicine, Washington University, St Louis, Missouri 63110, USA. Howard Hughes Medical Institute, Chevy Chase, Maryland 20815, USA. ACOSOG Central Specimen Bank, Washington University, St 23 24 Louis, Missouri 63110, USA. ACOSOG Operations Center, Duke University, Durham, North Carolina 27705, USA. Department of Developmental Biology, Washington University, St Louis, Missouri 63110, USA. *These authors contributed equally to this work. 2 1 JU NE 20 1 2 | V OL 48 6 | NAT URE | 3 5 3 ©2012 Macmillan Publishers Limited. 
All rights reserved

overlap coding exons, splice sites and RNA genes (tier 1), high-confidence SNVs and indels in non-coding conserved or regulatory regions (tier 2), as well as non-repetitive regions of the human genome (tier 3). In addition, somatic structural variants and germline structural variants that potentially affect coding sequences (Supplementary Information) were assessed. Digital sequencing data from captured target DNAs from the 46 tumour and normal pairs (Supplementary Table 2 and Supplementary Information) confirmed 81,858 mutations (point mutations and indels) and 773 somatic structural variants. The average numbers of somatic mutations and structural variants were 1,780 (range 44–11,619) and 16.8 (range 0–178) per case, respectively (Supplementary Table 3). Tier 1 point mutations and small indels predicted for all 46 cases also were validated using both 454 and Illumina sequencing (Supplementary Information). BRC25 was a clear outlier with only 44 validated tiers 1–3 mutations, all at low allele frequencies (ranging from 5% to 26.8%). This sample probably had low tumour content despite histopathology assessment, but the data are included to avoid bias.

The overall mutation rate was 1.18 validated mutations per megabase (Mb) (tier 1: 1.05; tier 2: 1.14; tier 3: 1.20). The mutation rate for tier 1 was higher than that observed for acute myeloid leukaemia (0.18–0.23)⁶,⁷, but lower than that reported for hepatocellular carcinoma (1.85)⁸, malignant melanoma (6.65)⁹ and lung cancers (3.05–8.93)¹⁰,¹¹ (Supplementary Table 4). The background mutation rate (BMR) across the 21 aromatase-inhibitor-resistant tumours was 1.62 per Mb, nearly twice that of the 25 aromatase-inhibitor-sensitive tumours at 0.824 per Mb (P = 0.02, one-sided t-test). A trend for more somatic structural variations in the aromatase-inhibitor-resistant group was also observed, as the validated somatic structural variation frequency in the 21 aromatase-inhibitor-resistant tumour genomes was 21.69 versus an average of 12.76 in 25 aromatase-inhibitor-sensitive tumours (P = 0.16, one-sided t-test) (Fig. 1). If ten TP53 mutated cases were excluded, the background mutation rate still tended to be higher in the aromatase-inhibitor-resistant group (P = 0.08). To demonstrate that a single-tumour core biopsy produced representative genomic data, whole-genome sequencing of two pre-treatment biopsies was conducted for 5 of the 46 cases. The frequency of mutations in the paired specimens showed high concordance in all cases (correlation coefficients ranged from 0.74 to 0.95) (Supplementary Fig. 2) and a somatic mutation was infrequently detected in only one of the two samples (4.65% overall).

Significantly mutated genes in luminal breast cancer
The discovery effort was extended by studying 31 additional cases by exome sequencing, producing an additional 1,371 tier 1 mutations. In total the 77 cases yielded 3,355 tier 1 somatic mutations, including 3,208 point mutations, 1 dinucleotide mutation and 146 indels, ranging from 1 to 28 nucleotides. The point mutations included 733 silent, 2,145 missense, 178 nonsense, 6 read-through, 69 splice-site mutations and 77 in RNA genes (Supplementary Table 5). Of 2,145 missense mutations, 1,551 were predicted to be deleterious by SIFT and/or PolyPhen¹³. The MuSiC package⁴⁵ was applied to determine the significance of the difference between observed versus expected mutation events in each gene, on the basis of the background mutation rate. This identified 18 significantly mutated genes with a convolution false discovery rate (FDR) < 0.26 (Table 1 and Supplementary Table 6). The list contains genes previously identified as mutated in breast cancer (PIK3CA¹⁴, TP53¹⁵, GATA3¹², CDH1¹³, RB1¹⁶, MLL3¹⁷, MAP3K1¹⁸ and CDKN1B¹⁹) as well as genes not previously observed in clinical breast cancer samples, including TBX3, RUNX1, LDLRAP1, MYH9, AGTR2, STMN2, SF3B1 and CBFB.

Thirteen mutations (3 nonsense, 6 frame-shift indels, 2 in-frame deletions and 2 missense) were identified in MAP3K1 (Table 1 and Fig. 2), a serine/threonine kinase that activates the ERK and JNK kinase pathways through phosphorylation of MAP2K1 and MAP2K4 (ref. 20). Of interest, a missense (S184L) and a splice-region mutation (e2+3, probably affecting splicing) in MAP2K4 were observed in two tumours with no MAP3K1 mutation (Fig. 2). Single nonsynonymous mutations in MAP3K12, MAP3K4, MAP4K3, MAP4K4, MAPK15 and MAPK3 were also detected (Supplementary Table 5). TBX3 harboured three small indels (one insertion and two deletions). TBX3 affects expansion of breast cancer stem-like cells through regulation of FGFR. Two truncating mutations in the tumour suppressor CDKN1B were

Figure 1 | Genome-wide somatic mutations. Circos plots indicate validated somatic mutations comprising tier 1 point mutations and indels, genome-wide copy number alterations, and structural rearrangements in six representative genomes. Three on-treatment Ki67 less than or at 10% (top panel, aromatase-inhibitor-sensitive: BRC15, BRC17 and BRC22) and three on-treatment Ki67 greater than 10% (bottom panel, aromatase-inhibitor-resistant: BRC44, BRC47 and BRC50) cases are shown. Significantly mutated genes are highlighted in red. No purity-based copy number corrections were used for plotting copy number.
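The comparison of observed with expected mutation counts per gene, given a background mutation rate, can be illustrated with a far simpler stand-in than the convolution test in MuSiC: a one-sided Poisson tail probability. The sketch below is not the MuSiC method. The function name `poisson_upper_tail`, the coding length of about 4,500 bases and the use of the tier 1 rate (1.05 mutations per Mb) as a uniform per-base rate are assumptions made purely for illustration.

```python
import math


def poisson_upper_tail(observed, expected, terms=500):
    """Return P(X >= observed) for X ~ Poisson(expected).

    The tail is summed directly (rather than as 1 - CDF) so that very
    small probabilities are not lost to floating-point cancellation.
    `terms` caps the number of summed terms, which is ample when the
    expected count is small, as it is for a single gene.
    """
    if expected <= 0:
        raise ValueError("expected count must be positive")
    if observed <= 0:
        return 1.0
    log_expected = math.log(expected)
    tail = 0.0
    for count in range(observed, observed + terms):
        # log of exp(-expected) * expected**count / count!
        log_term = -expected + count * log_expected - math.lgamma(count + 1)
        tail += math.exp(log_term)
    return min(tail, 1.0)


# Illustrative inputs (assumptions, not values taken from MuSiC):
coding_bases = 4_500            # hypothetical coding length of one gene
cases = 77                      # discovery cases reported in the text
rate_per_base = 1.05 / 1e6      # tier 1 rate of 1.05 mutations per Mb
observed_mutations = 13         # MAP3K1 mutations reported in the text

expected_mutations = coding_bases * cases * rate_per_base
p_value = poisson_upper_tail(observed_mutations, expected_mutations)
print(f"expected {expected_mutations:.3f}, observed {observed_mutations}, P = {p_value:.3g}")
```

Under these assumptions fewer than one mutation is expected across the cohort, so thirteen observed events give a very small tail probability. A real analysis would also account for gene-specific sequence context, coverage and mutation class, which this sketch ignores.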
Table 1 | Significantly mutated genes identified in 46 whole genomes and 31 exomes sequenced in luminal breast cancer patients

Gene               Total  MS  NS  Indel  SS  P value         FDR
MAP3K1             13     2   3   8      0   0               0
PIK3CA             45     44  0   1      0   0               0
TP53               18     13  1   1      1   0               0
GATA3              8      1   0   4      3   1.15 × 10^-19   7.41 × 10^-16
CDH1               8      1   1   5      1   3.07 × 10^-15   1.59 × 10^-11
TBX3               3      0   0   3      0   2.58 × 10^…     0.011
ATR                6      6   0   0      0   3.73 × 10^…     0.014
RUNX1              4      4   0   0      0   6.59 × 10^…     0.021
ENSG00000212670*   2      2   0   0      0   2.31 × 10^…     0.066
RB1                4      2   1   0      1   2.76 × 10^…     0.071
LDLRAP1            2      1   1   0      0   4.27 × 10^…     0.092
STMN2              2      1   0   1      0   4.15 × 10^…     0.092
MYH9               4      1   1   2      0   8.96 × 10^…     0.178
MLL3               5      1   1   3      0   1.04 × 10^-4    0.191
CDKN1B             2      0   1   1      0   1.39 × 10^…     0.240
AGTR2              2      2   0   0      0   1.71 × 10^…     0.256
SF3B1              3      3   0   0      0   1.79 × 10^…     0.256
CBFB               2      1   1   0      0   1.70 × 10^…     0.256

* ENSG00000212670 is not in RefSeq release 50.
MS, missense; NS, nonsense; SS, splice site.

identified. Four missense RUNX1 mutations were observed, with three in the RUNT domain clustered within the 8 amino acid putative ATP-binding site (R166Q, G168E and R169K). RUNX1 is a transcription factor affected by mutation and translocation in the M2 subtype of acute myeloid leukaemia and is implicated in tethering the oestrogen receptor to promoters independently of oestrogen response elements. Two mutations (N104S and N140*) were also identified in CBFB, the binding partner of RUNX1. Additional mutations included 3 missense (2 K700E and 1 K666Q) in SF3B1, a splicing factor implicated in myelodysplasia²⁴ and chronic lymphocytic leukaemia²⁵. One missense mutation, one nonsense mutation and two indels were found in the MYH9 gene, involved in hereditary macrothrombocytopenia as well as being observed in an ALK translocation in anaplastic large cell lymphoma²⁷.

We also identified three significantly mutated genes (LDLRAP1, AGTR2 and STMN2) not previously implicated in cancer. A missense and a nonsense mutation were observed in LDLRAP1, a gene associated with familial hypercholesterolaemia. AGTR2, angiotensin II receptor type 2, harboured two missense mutations (V184I and R251H). Angiotensin signalling and oestrogen receptor intersect in models of tissue fibrosis. STMN2, a gene activated by JNK family kinases³⁰,³¹ and therefore regulated by MAP3K1 and MAP2K4, harboured one frameshift deletion and one missense mutation. Three deletions and one point mutation (Supplementary Fig. 3) were identified in a large, infrequently spliced non-coding (lnc) RNA gene, MALAT1 (metastasis associated lung adenocarcinoma transcript 1), that regulates alternative splicing by modulating the phosphorylation of SR splicing factor³². Translocations and point mutations of MALAT1 have been reported in sarcoma³³ and colorectal cancer cell lines³⁴. Five additional MALAT1 mutations were found in the recurrent screening set (Supplementary Table 5d). The locations of these mutations clustered in a region of species homology (F1 and 2 domains) that could mediate interactions with SRSF1 (ref. 32, Supplementary Fig. 4). Non-coding mutation clusters were found in ATR, GPR126 and NRG3 (Supplementary Information and Supplementary Table 7).

[Figure 2 schematic: mutation positions along MAP3K1 (scale 0–1,500 amino acid positions) and MAP2K4 (scale 0–400 amino acid positions). Mutation classes in the legend: frame-shift deletion, frame-shift insertion, in-frame deletion, missense, nonsense, splice site, splice region and splice site insertion. Annotated protein features: ATP-binding site, serine/threonine site, active site, RING and SWIM.
MAP3K1 labels: V218fs, R273fs, S292*, F327fs, F360fs, R364G, Q367*, H393Q, H393fs, N406fs, S431fs, S438fs, P444fs, T542fs, S628*, L707fs, I761fs, Y790fs, R790fs, Q886*, K913fs, S939 in-frame deletion*, Q957*, S989fs, T999fs, K1003fs, R1012fs, Q1022fs, P1034S, L1052fs, Q1259*, T1267fs, M1269fs, Y1276fs, E1286 in-frame deletion, I1295fs, N1305D, I1307 in-frame deletion, Y1319fs, S1344*, V1346 in-frame deletion, L1352fs, I1366 in-frame deletion, LLID1375 in-frame deletion, A1396V, F1415C, S1471fs, Q1492*, Q1494*, R1509C.
MAP2K4 labels: 72e2+3, R75fs, H79fs, R134W, S184L, D186G, 211e5+2, P272fs, 297e9-1, 346e9+1.]

Figure 2 | MAP3K1 and MAP2K4 mutations observed in 317 samples. Somatic status of all mutations was obtained by Sanger sequencing of PCR products or Illumina sequencing of targeted capture products. The locations of conserved protein domains are highlighted. Each nonsynonymous substitution, splice site mutation or indel is designated with a circle at the representative protein position with colour to indicate translation effects of the mutation. Asterisk, nonsense mutations that cause truncation of the open reading frame.

Correlating mutations with clinical data
To study clinical correlations, mutation recurrence screening was conducted on an additional 240 cases (Supplementary Table 8 and Supplementary Fig. 1). By combining WGS, exome and recurrence screening data, we determined the mutation frequency in PIK3CA to be 41.3% (131 of 317 tumours) (Supplementary Table 5a–d and Supplementary Fig. 3). TP53 was mutated in 51 of 317 tumours (16.1%) (Supplementary Table 5a–d and Supplementary Fig. 3). Additionally, 52 nonsynonymous MAP3K1 mutations in 39 tumours and 10 mutations in its substrate MAP2K4 were observed, representing a combined case frequency of 15.5% (Supplementary Table 5a–d and Fig. 3). Of note, 52 of the 62 non-silent mutations in MAP3K1 and MAP2K4 were scattered indels or other protein-truncating events, strongly suggesting functional inactivation. In addition, 13 tumours harboured two non-silent MAP3K1 mutations, indicative of bi-allelic loss and reinforcing the conclusion that this gene is a tumour suppressor. Twenty-nine tumours harboured a total of 30 mutations in GATA3, consisting of 25 truncation events, one in-frame insertion, and 4 missense mutations including 3 recurrent mutations at M294K (Supplementary Table 5a–d and Supplementary Fig. 3). BRC8 harboured a chromosome 10 deletion that includes GATA3. CDH1 mutation data were available for 169 samples and, as expected, its mutation status was strongly associated with lobular breast cancer (Table 2a). We applied a permutation-based approach in MuSiC⁴⁵ to ascertain relationships between mutated genes. Negative correlations were found between mutations in gene pairs such as GATA3 and PIK3CA (P = 0.0026), CDH1 and GATA3 (P = 0.015), and CDH1 and TP53 (P = 0.022). MAP3K1 and MAP2K4 mutations were mutually exclusive, albeit without reaching statistical significance (P = 0.3). In contrast, a positive correlation between MAP3K1/MAP2K4 and PIK3CA mutations was highly significant (P = 0.0002) (Supplementary Table 9).

Two independent mutation data sets, designated ‘Set 1’ (discovery cohort) and ‘Set 2’ (validation cohort), from these clinical trial samples were analysed separately and then in combination, with a false discovery rate (FDR)-corrected P value to gauge the overall strength and consistency of genotype–phenotype relationships (Table 2a, b and Supplementary Fig. 1). TP53 mutations in both data sets correlated with significantly higher Ki67 levels, both at baseline (P = 0.0003) and at surgery (P = 0.001). Furthermore, TP53 mutations were significantly enriched in luminal B tumours (P = 0.04) and in higher histological grade tumours (P = 0.02). In contrast, MAP3K1 mutations were more frequent in luminal A tumours (P = 0.02), in grade 1 tumours (P = 0.005) and in tumours with lower Ki67 at baseline (P = 0.001) with consistent findings across both data sets. GATA3 mutation did not influence baseline Ki67 levels but was enriched in samples exhibiting greater percentage Ki67 decline (P = 0.01). This finding requires further verification because it was significant in Set 1 (uncorrected P value 0.003) but was a marginal finding in Set 2 (P = 0.08). However, it suggests GATA3 mutation may be a positive predictive marker for aromatase inhibitor response.

[Figure 3 panels: deletion involving MAP3K1 (BRC49 tumour versus normal, region 5:56165381 to 5:56372730), deletion involving MAP2K4 (BRC47 tumour versus normal, region 17:11571485 to 17:12568748) and deletion involving NRG1 (BRC49 tumour versus normal, region 8:28017654 to 8:32431503). Each panel plots read pairs and sequence depth along the chromosome. Annotated genes include MAP3K1 (XM_042066, exons 4–23 of 23), SETD9 (NM_153706), MIER3 (NM_152622), DNAH9 (NM_001372, exons 29–71 of 71), ZNF18 (NM_144680), MAP2K4 (NM_003010), ELP3 (NM_018091, exons 5–17 of 17) and NRG1 (ENST00000338921, exon 1 of 5).]

Figure 3 | Structural variants in significantly mutated or frequently deleted genes. One MAP3K1 deletion in BRC49 and one MAP2K4 deletion in BRC47, and one ELP3-NRG1 fusion in BRC49 identified using Illumina paired-end reads from whole-genome sequence data. Arcs represent multiple breakpoint-spanning read pairs with sequence coverage depth plotted in black across the region. Chr, chromosome.

Structural variation and DNA repair mechanisms
Analysis of copy number alterations (CNAs) revealed arm-level gains for 1q, 5p, 8q, 16p, 17q, 20p and 20q and arm-level losses for 1p, 8p, 16q, and 17p in the 46 WGS tumour genomes (Supplementary Fig. 5). A total of 773 structural variants (579 deletions, 189 translocations and 5 inversions) identified by WGS were validated as somatic in 46 breast cancer genomes by capture validation. No recurrent translocations were detected but six in-frame fusion genes were validated by reverse transcription followed by PCR (Supplementary Information and Supplementary Tables 10–13). Seven tumours had multiple complex translocations with breakpoints suggestive of a catastrophic mitotic event (‘chromothripsis’; Supplementary Table 11). Analysis of the structural variant genomic breakpoints shows the spectra of putative chromothripsis-related events are the same as seen for other somatic events, with the majority of structural variants arising from non-homologous end-joining. We classified somatic (mitotic) and germline (meiotic) structural variants into four groups: variable number tandem repeat (VNTR), non-allelic homologous recombination (NAHR), microhomology-mediated end joining (MMEJ), and non-homologous end joining (NHEJ), according to criteria described in Supplementary Information. The fraction of each classification is shown for germline and somatic (mitotic) events (Supplementary Table 14). There were significantly more somatic NHEJ events in tumour genomes than the other three types (P < 2.2 × 10^…).

Pathways relevant to aromatase inhibitor response
PathScan analysis (Supplementary Table 15 and Supplementary Information) indicated that somatic mutations detected in the 77 discovery cases affect a number of pathways, including caspase cascade/apoptosis, ErbB signalling, Akt/PI3K/mTOR signalling, TP53/RB signalling and MAPK/JNK pathways (Fig. 4a). To discern the pathways relevant to aromatase inhibitor sensitivity, we conducted separate pathway analyses for aromatase-inhibitor-sensitive versus aromatase-inhibitor-resistant tumours. Whereas the majority of top altered pathways (FDR ≤ 0.15) in each group are shared, several pathways were enriched in the aromatase-inhibitor-resistant group, including the TP53 signalling pathway, DNA replication, and mismatch repair. Specifically, 38% of the aromatase-inhibitor-resistant group (11 of 29 tumours) have mutations in the TP53 pathway with three having double or triple hits involving TP53, ATR, APAF1 or THBS1. In contrast, only 16.6% (8 of 48 tumours) of the Ki67 low group had mutations in the TP53 signalling pathway, each with only a single hit in genes TP53, ATR, CCNE2 or IGF1 (Supplementary Table 16).

GeneGo pathway analysis of MetaCore interacting network objects was used to identify genes in the 77 luminal breast cancers with low-frequency mutations that cluster into pathway maps. Eight networks assembled from significant maps encompassed mutations from 71 (92%) of the tumours (Fig. 4b). Many of the network objects shared pathways with significantly mutated genes such as TP53, MAP3K1, PIK3CA and CDH1. GeneGo analysis also revealed that several genes with low-frequency mutations were actually subunits of complexes, resulting in higher mutation rates for that object, for example, the condensin complex (4 mutations in 4 genes) and the MRN complex (4 mutations in 3 genes). Several pathways without multiple significantly mutated genes, such as the apoptotic cascade, calcium/phospholipase signalling and G-protein-coupled receptors, were significantly affected by low-frequency mutations. Grouping tumours by significantly mutated genes and pathway mutation status showed that whereas 55 (71%) of the tumours contained significantly mutated genes in significant pathways, an additional 16 (21%) contained only non-significantly mutated genes in these pathways. Thus, tumours without a given significantly mutated gene often had other mutations in the same relevant pathway (Fig. 4b, Supplementary Fig. 6, Supplementary Table 17 and Supplementary Information).

[Figure 4 diagrams: a, pathway schematic covering receptor tyrosine kinases (ERBB2, PDGFRA, EPHA7, CSF1R, DDR1, MET, KIT), PIK3CA, PTEN, AKT, KRAS, BRAF, MAP3K1, MAP3K4, MAP2K3, MAP2K4, MAPK8/JNK, MAPK14/P38, GATA3, ATM, ATR, TP53, CHK1, CHK2, MDM2, CDK4, CDC25, FOXO1, FOXO3, FOXO4, RB1, CDKN1B, CDK2 and E2F, leading to cell cycle progression and cell death; each gene is annotated with the percentage of cases carrying a copy number alteration (C), mutation (M) or structural variation (S), and with predicted functional activation or inactivation. b, concentric circle diagram with network categories: receptors; MAPKs (MAP3K1, STMN2); GTPases, GPCRs (AGTR2); DNA damage, cell cycle (MLL2/3 complex (MLL3*, MLL2, PAXIP1, NCOA6, PAXIP1-associated protein 1), ATR, SF3B1, GATA3, RB1, TP53); PI3K, Akt, mTor (PIK3CA); Ca²⁺, PLC signalling; apoptosis; Wnt (CDH1). Markers distinguish SMG mutations from SMGs in a complex with mutated non-SMGs.]

Figure 4 | Key cancer pathway components altered in luminal breast tumours. a, Only genetic alterations identified in 46 WGS cases are shown. Alterations were discovered in key genes in the TP53/RB, MAPK, PI3K/AKT/mTOR pathways. Genes coloured blue and red are predicted to be functionally inactivated and activated, respectively, through focused mutations including point mutations and small indels (M), copy number deletions (C), or other structural changes (S) that affect the gene. The inter-connectedness of this network (several pathways) shows that there are many different ways to perturb a pathway. b, Eight interaction networks from canonical maps are significantly over-represented by mutations in 77 luminal breast tumours (46 WGS and 31 exome cases). In the concentric circle diagram, tumours are arranged as radial spokes and categorized by their mutation status in each network (concentric ring colour) and significantly mutated gene mutation status (black dots). Tumour classification by pathway analysis shows many tumours unaffected by a given significantly mutated gene often harbour other mutations in the same network. For full annotation, see Supplementary Information and Supplementary Fig. 6. PLC, phospholipase C; SMG, significantly mutated gene.

We also applied PARADIGM to infer pathway-informed gene activities using gene expression and copy-number data to identify several ‘hubs’ of activity (Supplementary Fig. 7, Supplementary Fig. 8 and Supplementary Information). As expected, ESR1 and FOXA1 were among the hubs activated cohort-wide while other hubs exhibited high but differential changes in aromatase-inhibitor-resistant tumours including MYC, FOXM1 and MYB (Supplementary Fig. 8). The concordance among the 104 MetaCore maps from GeneGo analysis described above is significant, with 75 (72%) matching one of the PARADIGM subnetworks at the 0.05 significance level after multiple test correction (P < 4.4 × 10^…; Bonferroni-adjusted hypergeometric test) (Supplementary Fig. 9). We identified significant subnetworks associated with Ki67 biomarker status (Supplementary Fig. 10 and Supplementary Information) involving transcription factors controlling large regulons.

The PARADIGM-inferred pathway signatures were further used to derive a map of the genetic mechanisms that may underlie treatment response. A subnetwork was constructed in which interactions were retained only if they connected two features with higher than average absolute association with Ki67 biomarker status (Supplementary Figs 10 and 11 and Supplementary Information). Consistent with the PathScan results, among the largest of the hubs in the identified network were a central DNA damage hub with the second highest connectivity (55 regulatory interactions; 1% of the network) and TP53 with the 14th highest connectivity (26 connections; 0.5% of the network). Additional highly connected hubs identified in order of connectivity were MYC with 79 connections (1.4%), FYN with 45 (0.8%), MAPK3 with 43, JUN with 40, HDAC1 with 40, SHC1 with 39, and HIF1A/ARNT complex with 39 (Supplementary Fig. 11).

To identify higher-level connections between mutations and clinical features, we compared the samples on the basis of pathway-derived signatures. For each clinical attribute and each significantly mutated gene, we dichotomized the discovery samples into a positive and a negative group to derive pathway signatures that discriminated between the groups (see details in Supplementary Information). We then computed all pair-wise Pearson correlations between pathway signatures and clustered the resulting correlations (Fig. 5). The entire process was repeated using validated mutations and signatures derived from the validation set (Supplementary Fig. 12). In line with expectation, PIK3CA, MAP3K1, MAP2K4, and low risk preoperative endocrine prognostic index (PEPI) scores (PEPI is an index of recurrence risk post neoadjuvant aromatase inhibitor therapy) cluster with the luminal A subtypes and with each other, and are supported by the validation set analysis. The luminal B-like signatures included TP53, RB1, RUNX1 and MALAT1, which also associated with other poor outcome features such as high baseline and surgical Ki67 levels, high grade histology and high PEPI scores. The TP53 and MALAT1 associations in the discovery set also were supported by the validation set analysis.

[Figure 5 heat map: pairwise signature correlations for the discovery set (n = 77), with a colour scale running from −1.0 (anticorrelation) through 0.00 (no correlation) to 1.0 (correlation) and clusters labelled luminal A and luminal B. Differential signatures, in row order: MLL3 mutation, PIK3CA mutation, MAP2K4 mutation (each mutation versus wild type), histopathology type (lobular versus ductal), PEPI 0* (PEPI = 0 versus PEPI > 0), MAP3K1 mutation, PAM50 subtype luminal A (luminal A versus luminal B), CDH1 mutation, ATR mutation, MALAT1 mutation, BIRC6 mutation, histopathology grade (grade II, III versus grade I), PAM50 subtype luminal B (luminal B versus luminal A), baseline Ki67 (above 14% versus below 14%), TP53 mutation, CDKN1B mutation, RUNX1 mutation, PEPI score (above mean versus below mean), RB1 mutation, end of treatment Ki67 (above 10% versus below 10%).]

Figure 5 | Pathway signatures reveal connections between mutations and clinical outcomes. PARADIGM-based pathway signatures were derived for tumour feature dichotomies including mutation driven gene signatures (mutant versus non-mutant), histopathology type (lobular versus ductal), preoperative endocrine prognostic index (PEPI) score (PEPI = 0 favourable versus PEPI > 0 unfavourable), PAM50 (50-gene intrinsic breast cancer subtype classifier) luminal A subtype (luminal A versus luminal B) and the reverse (luminal B versus luminal A), histopathology grade (grades II and III versus I), baseline Ki67 levels (≥ 14% versus < 14%), and end-of-treatment Ki67 levels (≥ 10% versus < 10%) and overall PEPI score (higher than mean unfavourable versus lower than mean favourable). Pearson correlations were computed between all pair-wise signatures; positive correlations, red; negative correlations, blue; column features ordered identically as rows. Correlation analysis on the 77 samples in the discovery set is shown. Asterisk: Ki67 < 2.7%, oestrogen-receptor-positive, node negative and tumour size ≤ 5 cm.

Druggable gene analysis
We defined mutations in druggable tyrosine kinase domains including in ERBB2 (a V777L and a 755–759 (LRENT) in-frame deletion homologous to gefitinib-sensitizing EGFR mutations in lung cancer), as well as in DDR1 (A829V, R611C), DDR2 (E583D), CSF1R (D735H, M875L), and PDGFRA (E924K). In addition, pleckstrin homology domain mutations were observed in AKT1 (C77F) and AKT2 (S11F) and a kinase domain mutation was identified in RPS6KB1 (S375F) (Supplementary Table 18).

Discussion
The low frequency of many significantly mutated genes presents an enormous challenge for correlative analysis, but several statistically significant patterns were identified, including the relationship between MAP3K1 mutation, luminal A subtype, low tumour grade and low Ki67 proliferation index. On this basis, for patients with MAP3K1 mutant luminal tumours, neoadjuvant aromatase inhibitor could provide a favourable option. In contrast, tumours with TP53 mutations, which are mostly aromatase inhibitor resistant, would be more appropriately treated with other modalities. MAP3K1 activates the ERK family; thus, loss of ERK signalling could explain the indolent nature of MAP3K1-deficient tumours. However, MAP3K1 also activates JNK through MAP2K4, which also can be mutated. Loss of JNK signalling produces a defect in apoptosis in response to stress, which would hypothetically explain why these mutations accumulate³⁹,⁴⁰. PIK3CA harboured the most mutations (41.3%) but was associated with neither clinical nor Ki67 response, confirming our earlier report. However, the positive association between MAP3K1/MAP2K4 mutations and PIK3CA mutation at both the mutation and pathway levels suggests cooperativity (Fig. 4a).

The finding of multiple significantly mutated genes linked previously to benign and malignant haematopoietic disorders suggests that breast cancer, like leukaemia, can be viewed as a stem-cell disorder

Table 2 | Correlations between mutations and clinical features

a Luminal subtype and histology grade
Gene    Expression/histopathology variable  Mutation frequency*  Set1 P†  Set2 P†        Whole set FDR P‡
TP53    Luminal subtype A                   9.3% (13/140)        0.001    0.46           0.041
        Luminal subtype B                   21.5% (38/177)
TP53    Histological grade I                4.5% (3/66)          0.05     0.067          0.02
        Histological grade II/III           19.2% (48/250)
MAP3K1  Luminal subtype A                   20.0% (28/140)       0.018    0.028          0.005
        Luminal subtype B                   6.2% (11/177)
MAP3K1  Histological grade I                25.8% (17/66)        0.061    0.011          0.005
        Histological grade II/III           8.8% (22/250)
CDH1    Histological type ductal            5.9% (10/169)        0.41§    2.8 × 10^-11   3.9 × 10^-10
        Histological type lobular           50.0% (20/40)

b Mutation and Ki67 index
Gene    Ki67 variable  Wild type mean‖  Mutant mean‖  Set1 P¶       Set2 P¶  Whole set FDR P‡
TP53    Baseline       13.1             25.1          3.7 × 10^…    0.012    0.0003
        Surgery        1.4              4             0.0002        0.014    0.001
        % change       −89.2            −84.3         0.09          0.28     0.24
MAP3K1  Baseline       15.8             8.1           0.049         0.001    0.002
        Surgery        1.86             0.75          0.11          0.1      0.05
        % change       −88.3            −90.5         0.49          0.65     0.55
GATA3   Baseline       14.8             11.5          0.13          0.95     0.56
        Surgery        1.95             0.38          0.001         0.23     0.012
        % change       −86.8            −96.9         0.003         0.08     0.012

* Mutation percentage (mutant cases/total cases in a category); counts are based on all cases (Set 1 and Set 2 combined).
† Unadjusted P value from Fisher’s exact test or Chi-square test as appropriate.
‡ Benjamini–Hochberg false discovery rate (FDR)-adjusted P value using all cases (Set 1 and Set 2 combined).
§ Only 77 cases in Set 1 had CDH1 sequencing results.
‖ Geometric means are based on all cases (Set 1 and Set 2 combined).
¶ Unadjusted P value from Wilcoxon rank sum test.
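The footnotes to Table 2 name Fisher's exact test for the unadjusted P values and the Benjamini–Hochberg procedure for the FDR-adjusted ones. The following is a minimal, standard-library sketch of both steps, applied to the combined TP53 counts from Table 2a. The function names are hypothetical, the companion P values passed to the adjustment are made up for illustration, and the result is not expected to reproduce the published figures, which were computed per data set and adjusted across the authors' full set of comparisons.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def table_probability(x):
        # Hypergeometric probability of seeing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    observed = table_probability(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more likely than the observed one.
    p_value = sum(
        table_probability(x)
        for x in range(lowest, highest + 1)
        if table_probability(x) <= observed * (1 + 1e-9)
    )
    return min(1.0, p_value)


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# TP53 counts from Table 2a: 13 of 140 luminal A and 38 of 177 luminal B
# tumours carried a mutation (Set 1 and Set 2 combined).
p_tp53 = fisher_exact_two_sided(13, 140 - 13, 38, 177 - 38)

# Hypothetical companion P values, included only to show the adjustment.
adjusted = benjamini_hochberg([p_tp53, 0.03, 0.20])
print(p_tp53, adjusted)
```

Exact integer binomial coefficients keep the hypergeometric probabilities accurate at this sample size; for larger cohorts a log-space implementation or a statistics library would be the more practical choice.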
that produces indolent or aggressive tumours that display varying phenotypes depending on differentiation blocks generated by different mutation repertoires. Whereas only MLL3 showed statistical significance in the analysis of 46 WGS cases, multiple mutations in genes related to histone modification and chromatin remodelling are worth noting (Supplementary Table 19). An array of coding mutations and structural variations was discovered in methyltransferases (MLL2, MLL3, MLL4 and MLL5), demethyltransferases (KDM6A, KDM4A, KDM5B and KDM5C), and acetyltransferases (MYST1, MYST3 and MYST4). Furthermore, our analysis identified several adenine-thymine (AT)-rich interactive domain-containing protein genes (ARID1A, ARID2, ARID3B and ARID4B) that harboured mutations and large deletions, reinforcing the role of members from the SNF/SWI family in breast cancer.

Pathway analysis enables the evaluation of mutations with low recurrence frequency where statistical comparisons are conventionally underpowered. For example, the eight samples with MAP2K4 mutations were sufficient to derive a reliable pathway-based gene signature in PARADIGM that aligns with MAP3K1. This approach also pointed to a putative connection between MALAT1 and the TP53 pathway. Finally, we provide evidence that transcriptional associations to Ki67 response reside in a connected network under the control of several key ‘hub’ genes including MYC, FYN and MAP kinases, among others. Targeting these hubs in resistant tumours could produce therapeutic advances. In conclusion, the genomic information derived from unbiased sequencing is a logical new starting point for clinical investigation, where the mutation status of an individual patient is determined in advance and treatment decisions are driven by therapeutic hypotheses that stem from knowledge of the genomic sequence and its possible consequences. However, the accrual of large numbers of patients and the use of comprehensive sequencing and gene expression approaches will be required because of the extreme genomic heterogeneity documented by this investigation.

METHODS SUMMARY

Clinical trial samples were accessed from the preoperative letrozole phase 2 study (NCT00084396) that investigated the effect of letrozole for 16 to 24 weeks on surgical outcomes and from the American College of Surgeons Oncology Group (ACOSOG) Z1031 study (NCT00265759) that compared anastrozole with exemestane or letrozole for 16 to 18 weeks before surgery (REMARK flow charts, Supplementary Fig. 1). Baseline snap-frozen biopsy samples with greater than 70% tumour content (by nuclei) underwent DNA extraction and were paired with a peripheral blood DNA sample. Two formalin-fixed biopsies were obtained at baseline and at surgery, and were used to conduct oestrogen receptor and Ki67 immunohistochemistry as previously published. Paired end Illumina reads from tumours and normal samples were aligned to NCBI build36 using BWA. Somatic point mutations were identified using SomaticSniper, and indels were identified by combining results from a modified version of the Samtools indel caller (http://samtools.sourceforge.net/), GATK and Pindel. Structural variations were identified using BreakDancer and SquareDancer (unpublished). All putative somatic events found in 46 cases were validated by targeted custom capture arrays (Nimblegen)/Illumina sequencing and all tier 1 mutations for 46 WGS cases also were validated using PCR/454 sequencing. All statistical analyses, including significantly mutated gene, mutation relation and clinical correlation were done using the MuSiC package and/or by standard statistical tests (Supplementary Information). Pathway analysis was performed with PathScan, GeneGo Metacore (http://www.genego.com/metacore.php) and PARADIGM. A complete description of the materials and methods used to generate this data set and results is provided in the Supplementary Methods section.

Received 16 June 2011; accepted 12 April 2012.
Published online 10 June 2012.

1. Chia, Y. H., Ellis, M. J. & Ma, C. X. Neoadjuvant endocrine therapy in primary breast cancer: indications and use as a research tool. Br. J. Cancer 103, 759–764 (2010).
2. Olson, J. A. Jr et al. Improved surgical outcomes for breast cancer patients receiving neoadjuvant aromatase inhibitor therapy: results from a multicenter phase II trial. J. Am. Coll. Surg. 208, 906–914; discussion 915–906 (2009).
3. Ellis, M. J. et al. Randomized phase II neoadjuvant comparison between letrozole, anastrozole, and exemestane for postmenopausal women with estrogen receptor-rich stage 2 to 3 breast cancer: clinical and biomarker outcomes and predictive value of the baseline PAM50-based intrinsic subtype—ACOSOG Z1031. J. Clin. Oncol. 29, 2342–2349 (2011).
4. Ellis, M. J. et al. Outcome prediction for estrogen receptor-positive breast cancer based on postneoadjuvant endocrine therapy tumor characteristics. J. Natl. Cancer Inst. 100, 1380–1388 (2008).
5. Chen, K. et al. BreakDancer: an algorithm for high-resolution mapping of genomic structural variation. Nature Methods 6, 677–681 (2009).
6. Mardis, E. R. et al. Recurring mutations found by sequencing an acute myeloid leukemia genome. N. Engl. J. Med. 361, 1058–1066 (2009).
7. Ley, T. J. et al. DNA sequencing of a cytogenetically normal acute myeloid leukaemia genome. Nature 456, 66–72 (2008).
8. Totoki, Y. et al. High-resolution characterization of a hepatocellular carcinoma genome. Nature Genet. 43, 464–469 (2011).
9. Pleasance, E. D. et al. A comprehensive catalogue of somatic mutations from a human cancer genome. Nature 463, 191–196 (2010).
10. Pleasance, E. D. et al. A small-cell lung cancer genome with complex signatures of tobacco exposure. Nature 463, 184–190 (2010).
11. Lee, W. et al. The mutation spectrum revealed by paired genome sequences from a lung cancer patient. Nature 465, 473–477 (2010).
12. Usary, J. et al. Mutation of GATA3 in human breast tumors. Oncogene 23, 7669–7678 (2004).
13. Berx, G. et al. E-cadherin is a tumour/invasion suppressor gene mutated in human lobular breast cancers. EMBO J. 14, 6107–6115 (1995).
14. Samuels, Y. et al. High frequency of mutations of the PIK3CA gene in human cancers. Science 304, 554 (2004).
15. Prosser, J., Thompson, A. M., Cranston, G. & Evans, H. J. Evidence that p53 behaves as a tumour suppressor gene in sporadic breast tumours. Oncogene 5, 1573–1579 (1990).
16. T’Ang, A., Varley, J. M., Chakraborty, S., Murphree, A. L. & Fung, Y. K. Structural rearrangement of the retinoblastoma gene in human breast carcinoma. Science 242, 263–266 (1988).
17. Wang, X. X. et al. Somatic mutations of the mixed-lineage leukemia 3 (MLL3) gene in primary breast cancers. Pathol. Oncol. Res. 17, 429–433 (2011).
18. Kan, Z. et al. Diverse somatic mutation patterns and pathway alterations in human cancers. Nature 466, 869–873 (2010).
19. Spirin, K. S. et al. p27/Kip1 mutation found in breast cancer. Cancer Res. 56, 2400–2404 (1996).
20. Fanger, G. R., Johnson, N. L. & Johnson, G. L. MEK kinases are regulated by EGF and selectively interact with Rac/Cdc42. EMBO J. 16, 4961–4972 (1997).
21. Fillmore, C. M. et al. Estrogen expands breast cancer stem-like cells through paracrine FGF/Tbx3 signaling. Proc. Natl Acad. Sci. USA 107, 21737–21742 (2010).
22. Mao, S., Frank, R. C., Zhang, J., Miyazaki, Y. & Nimer, S. D. Functional and physical interactions between AML1 proteins and an ETS protein, MEF: implications for the pathogenesis of t(8;21)-positive leukemias. Mol. Cell. Biol. 19, 3635–3644 (1999).
23. Stender, J. D. et al. Genome-wide analysis of estrogen receptor α DNA binding and tethering mechanisms identifies Runx1 as a novel tethering factor in receptor-mediated transcriptional activation. Mol. Cell. Biol. 30, 3943–3955 (2010).
24. Papaemmanuil, E. et al. Somatic SF3B1 mutation in myelodysplasia with ring sideroblasts. N. Engl. J. Med. 365, 1384–1395 (2011).
25. Wang, L. et al. SF3B1 and other novel cancer genes in chronic lymphocytic leukemia. N. Engl. J. Med. 365, 2497–2506 (2011).
26. Chen, Z. et al. The May-Hegglin anomaly gene MYH9 is a negative regulator of platelet biogenesis modulated by the Rho-ROCK pathway. Blood 110, 171–179 (2007).
27. Lamant, L. et al. Non-muscle myosin heavy chain (MYH9): a new partner fused to ALK in anaplastic large cell lymphoma. Genes Chromosom. Cancer 37, 427–432 (2003).
28. Wilund, K. R. et al. Molecular mechanisms of autosomal recessive hypercholesterolemia. Hum. Mol. Genet. 11, 3019–3030 (2002).
29. Dellê, H. et al. Antifibrotic effect of tamoxifen in a model of progressive renal disease. J. Am. Soc. Nephrol. 23, 37–48 (2012).
30. Tararuk, T. et al. JNK1 phosphorylation of SCG10 determines microtubule dynamics and axodendritic length. J. Cell Biol. 173, 265–277 (2006).
31. Westerlund, N. et al. Phosphorylation of SCG10/stathmin-2 determines multipolar stage exit and neuronal migration rate. Nature Neurosci. 14, 305–313 (2011).
32. Tripathi, V. et al. The nuclear-retained noncoding RNA MALAT1 regulates alternative splicing by modulating SR splicing factor phosphorylation. Mol. Cell 39, 925–938 (2010).
33. Rajaram, V., Knezevich, S., Bove, K. E., Perry, A. & Pfeifer, J. D. DNA sequence of the translocation breakpoints in undifferentiated embryonal sarcoma arising in mesenchymal hamartoma of the liver harboring the t(11;19)(q11;q13.4) translocation. Genes Chromosom. Cancer 46, 508–513 (2007).
34. Xu, C., Yang, M., Tian, J., Wang, X. & Li, Z. MALAT-1: a long non-coding RNA and its important 3′ end functional motif in colorectal cancer metastasis. Int. J. Oncol. 39, 169–175 (2011).
35. Wendl, M. C. et al. PathScan: a tool for discerning mutational significance in groups of putative cancer genes. Bioinformatics 27, 1595–1602 (2011).
36. Vaske, C. J. et al. Inference of patient-specific pathway activities from multi-dimensional cancer genomics data using PARADIGM. Bioinformatics 26, i237–i245 (2010).
37. Lynch, T. J. et al. Activating mutations in the epidermal growth factor receptor underlying responsiveness of non-small-cell lung cancer to gefitinib. N. Engl. J. Med. 350, 2129–2139 (2004).
38. Johnson, G. L. & Lapadat, R. Mitogen-activated protein kinase pathways mediated by ERK, JNK, and p38 protein kinases. Science 298, 1911–1912 (2002).
39. Widmann, C., Johnson, N. L., Gardner, A. M., Smith, R. J. & Johnson, G. L. Potentiation of apoptosis by low dose stress stimuli in cells expressing activated MEK kinase 1. Oncogene 15, 2439–2447 (1997).
40. Wagner, E. F. & Nebreda, A. R. Signal integration by JNK and p38 MAPK pathways in cancer development. Nature Rev. Cancer 9, 537–549 (2009).
41. Ellis, M. J. et al. Phosphatidyl-inositol-3-kinase alpha catalytic subunit mutation and response to neoadjuvant endocrine therapy for estrogen receptor positive breast cancer. Breast Cancer Res. Treat. 119, 379–390 (2010).
42. Prat, A. & Perou, C. M. Mammary development meets cancer genomics. Nature Med. 15, 842–844 (2009).
43. Larson, D. E. et al. SomaticSniper: identification of somatic point mutations in whole genome sequencing data. Bioinformatics 28, 311–317 (2011).
44. Krzywinski, M. et al. Circos: an information aesthetic for comparative genomics. Genome Res. 19, 1639–1645 (2009).
45. Dees, N. et al. MuSiC: Identifying mutational significance in cancer genomes. Genome Res. (in the press).

Supplementary Information is linked to the online version of the paper at www.nature.com/nature.

Acknowledgements This article is dedicated to the memory of Evelyn Lauder in recognition of her efforts to eradicate breast cancer. We would like to thank the participating patients and their families, clinical investigators and their support staffs, and J. A. Zujewski and the Cancer Therapy Evaluation Program at the US National Cancer Institute. We would like to acknowledge the efforts of the following people and groups at The Genome Institute for their contributions to this manuscript: the Analysis Pipeline group for developing the automated analysis pipelines that generated alignments and somatic variants; the LIMS group for developing tools to manage validation array ordering, capture and sequencing, and J. Veizer and H. Schmidt for structural variant and recurrent screening analyses. We thank the many members of the Siteman Cancer Center at Washington University in St Louis for support, and the committed members of the American College of Surgeons Oncology Group and their patients for contributing samples to the Z1031 trial. This work was funded by grants to R.K.W. from the National Human Genome Research Institute (NHGRI U54 HG003079), grants to M.J.E. from the National Cancer Institute (NCI R01 CA095614, NCI U01 CA114722), the Susan G. Komen Breast Cancer Foundation (BCTR0707808), and the Fashion Footwear Charitable Foundation, Inc., grant awards to ACOSOG included NCI U10 CA076001, the Breast Cancer Research Foundation, and clinical trial support from Novartis and Pfizer, and a Center grant (NCI P50 CA94056) to D.P.-W. We also acknowledge institutional support in the form of the Washington University Cancer Genome Initiative (R.K.W.), and a productive partnership with Illumina, Inc. The tissue procurement core was supported by an NCI core grant to the Siteman Cancer Center (NCI 3P50 CA68438). The BRIGHT Institute is supported in part by an ATT/Emerson gift to the Siteman Cancer Center.

Author Contributions M.J.E. led the clinical investigations, biomarker analysis and chip-based genomics. E.R.M., M.J.E., L.D., R.S.F., T.J.L. and R.K.W. designed the experiments. L.D. and M.J.E. led data analysis. D.S., J.W.W., D.C.K., C.C.H., M.D.M., K.C., C.A.Mi., F.D., W.S.S., M.C.W., R.C. and C.K. performed data analysis. D.S., C.A.Ma., J.W.W., J.F.M., C.L. and L.D. prepared figures and tables. R.S.F., L.L.F., R.D., M.H., T.L.V., J.H., L.L., R.C. and J.S. performed laboratory experiments. L.E., G.U., J.M., G.V.B., P.K.M., J.M.G., M.L., K.H. and J.O. provided samples and clinical data. V.J.S., K.B., J.L., Y.T. and C.K. provided statistical and clinical correlation analysis. D.O. oversees the ACOSOG Operations Center that provides oversight and tracking for ACOSOG clinical trials. K.D., S.McD., D.C.A. and M.W. provided pathology analysis. B.A.V.T., J.W., R.J.G., A.E., D.P.-W., H.P.-W., J.M.S., T.C.G., S.N., C.K. and M.C.W. performed pathway analysis. L.-W.C. and R.B. analysed the druggable target mutation data. D.J.D. and B.O. provided informatics support. L.D., M.J.E. and E.R.M. wrote the manuscript. T.J.L., M.C.W. and R.K.W. critically read and commented on the manuscript.

Author Information DNA sequence data are deposited in the restricted access portal at dbGaP, accession number phs000472.v1.p1. Gene expression array data used in the Paradigm training set is deposited in GEO, accession number GSE29442, and a Superseries that covers both the Agilent gene expression data and the Agilent array CGH data used for the Paradigm test set is deposited in GEO, accession number GSE35191. Reprints and permissions information is available at www.nature.com/reprints. This paper is distributed under the terms of the Creative Commons Attribution-Non-Commercial-Share Alike licence, and is freely available to all readers at www.nature.com/nature. The authors declare no competing financial interests. Readers are welcome to comment on the online version of this article at www.nature.com/nature. Correspondence and requests for materials should be addressed to M.J.E., L.D. and E.R.M.

360 | NATURE | VOL 486 | 21 JUNE 2012 ©2012 Macmillan Publishers Limited. All rights reserved

Journal

Nature (Springer Journals)

Published: Jun 10, 2012
