Genetic Diversity and Breeding Signatures for Regional Indica Rice Improvement in Guangdong of Southern China

As the pioneer of the Green Revolution in China, Guangdong province witnessed the improvement and spread of semi-dwarf Xian/Indica rice cultivars and possesses diverse rice germplasm of landraces and cultivars. A total of 517 accessions, containing a core germplasm of 479 newly sequenced landraces and modern cultivars, were used to reveal breeding signatures and key variations for regional genetic improvement of indica rice from Guangdong. Four subpopulations were identified in the collection, including Ind IV, a novel subpopulation that is not covered by previously released accessions. Modern cultivars of subpopulation Ind II were inferred to have fewer deleterious variations, especially in yield related genes. About 15 Mb of genomic segments were identified as potential breeding signatures by the cross-population likelihood method (XP-CLR) applied to modern cultivars and landraces. The selected regions span multiple yield related QTLs (quantitative trait loci) identified by GWAS (genome-wide association studies) of the same population, and specific variations that were fixed in modern cultivars of Ind II were characterized. This study highlights genetic differences between traditional landraces and modern cultivars, which reveal the potential molecular basis of regional genetic improvement for Guangdong indica rice from southern China.

Keywords: Rice, Yield improvement, Resequencing, Breeding signature, GWAS

*Correspondence: Li Chen, lichen@gdaas.cn
Rice Research Institute, Guangdong Academy of Agricultural Sciences, Guangzhou 510640, China
Key Laboratory of Genetics and Breeding of High Quality Rice in Southern China (Co-construction by Ministry and Province), Ministry of Agriculture and Rural Affairs, Guangzhou 510640, China
Guangdong Key Laboratory of New Technology in Rice Breeding, Guangzhou 510640, China
Guangdong Rice Engineering Laboratory, Guangzhou 510640, China

© The Author(s) 2023. Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Hang et al. Rice (2023) 16:25

Background

Rice (Oryza sativa) feeds more than half of the world’s population, and rice yield is vital for world food security. Rice genetic improvement in China has facilitated the increase of its production over the past several decades. Guangdong of southern China witnessed the breeding, introduction and spread of semi-dwarf indica rice accessions. Since then, rice yield has increased about three-fold with the breakthrough of high-yield rice cultivars. However, with the spread of modern cultivars, rice landraces that were grown by local farmers are gradually disappearing. To increase production further, the use of the genetic diversity of this valuable germplasm needs to be enhanced in breeding programs. Tremendous efforts have been made by germplasm scientists to collect and conserve landraces and locally improved traditional cultivars of southern China. These landraces and cultivars represent the rice genetic diversity of southern China before and after the rice “Green Revolution”, and could be used to reveal the genetic trajectory of regional indica rice breeding and phenotype enhancement. Revealing the functional variations related to the success of rice breeding in southern China will promote the utilization of genetic resources for future rice breeding. Moreover, characterization of the genome sequences, genetic diversity and functional variations of these germplasm collections is becoming critical for the next potential breakthrough in rice production.

The advancement of sequencing technologies has enabled the analysis of genetic diversity in large germplasm collections, which promoted the discovery of domestication loci and accelerated the identification of functional genes. Genotypes of a large collection of 517 rice landraces were identified with onefold-coverage sequencing and an accurate imputation method, and population structure, genome-wide association and haplotype analyses were conducted using about 3.6 million nonredundant SNPs (Huang et al. 2010). Thereafter, higher-depth sequencing with more than 15-fold coverage was conducted on 40 cultivated and 10 wild rice accessions to identify selection signatures during domestication using nucleotide polymorphisms (Xu et al. 2012), and a larger collection of 446 diverse wild rice accessions and 1083 cultivated varieties was also genotyped by sequencing and used to identify 55 selective sweeps during domestication (Huang et al. 2012). Further, 10,074 F2 lines from 17 representative hybrid rice combinations were genotyped, and heterosis related loci were identified (Huang et al. 2016). The release of sequencing data from the “3000 Rice Genomes Project” (3KRGP) largely facilitated the identification of untapped variations and novel genes (Fuentes et al. 2019; Wang et al. 2018). Joint analysis of 3KRGP data and Indian long and short grain germplasm identified a long low-diversity region harboring a key gene regulating grain weight (Kumar et al. 2020). Recently, the sequencing of local germplasm and improved varieties has illustrated regional genetic diversity in detail. Sequencing of 239 elite japonica rice accessions from China, Japan and Korea identified 1131 novel genes and artificial selection signals (Liu et al. 2021). Analysis of 672 Vietnamese rice genomes described their classification and identified 21 unique QTLs from 19 traits (Higgins et al. 2021). Genotyping and systematic phenotyping of 200 japonica rice varieties grown in central China over the past 30 years revealed the genetic factors regulating the balance of yield, quality and blast resistance (Xiao et al. 2021).

Artificial selection and breeding signatures during the succession of rice varieties give deep insight into their genetic improvement. Low-coverage sequencing of 1479 landraces and modern cultivars from 73 countries revealed that 200 regions were differentially selected between two major indica subpopulations, and that yield was correlated with the number of signatures (Xie et al. 2015). Local selective sweeps also showed pressure during artificial selection and farmer cultivation. Genotyping by sequencing of 108 core on-farm conserved rice landraces from Yunnan revealed 186 and 183 potential selective sweeps between different collection dates (Cui et al. 2019). Different selection signatures during breeding were indicated by genetic differentiation between early and late cultivars of indica and japonica in Taiwan (Hour et al. 2020). As the pioneer of the “Green Revolution” for indica rice in China, Guangdong province has a rich diversity of germplasm. However, large-scale population genomics of rice landraces and improved varieties for the study of genetic diversity and the identification of regional breeding signatures is still lacking.

In this study, core germplasm of locally planted rice accessions, comprising landraces and cultivars from Guangdong of southern China before and after the rice “Green Revolution”, was collected. Agronomic traits were systematically investigated by field experiments, and the accessions were genotyped by 10-fold depth genome resequencing. Genetic diversity was analyzed, and breeding signatures for the modern cultivar subpopulation were identified and annotated by QTLs of eleven agronomic traits. Specific genetic variations and favorable alleles that were fixed in the modern cultivar subpopulation were identified and can be used for molecular marker assisted breeding.

Results

Genetic Diversity and Population Structure Analysis

A total of 517 accessions consisting mainly of indica rice germplasm, including 358 landraces and 159 artificially improved cultivars from Guangdong, China, were used to identify genomic variations (Additional file 1: Table S1). The 479 newly sequenced accessions generated 24.23 million 100 bp paired-end sequencing reads per accession. Quality assessment of these sequencing reads revealed that the average Q30 base quality (99.9% base call accuracy) percentage was 99.65% (Additional file 1: Table S2). The average mapping depth against the MSU7 reference genome was 12.02 with a 91.68% coverage ratio (Additional file 1: Table S3). On average, 2.04 million SNPs, 82.34 thousand insertions and 127.99 thousand deletions were identified for the 517 indica rice accessions (Additional file 1: Table S4).

Population structure analysis was conducted by principal components analysis (PCA), phylogenetic analysis and admixture analysis. In the admixture analysis, group 1 contained 211 accessions (201 landraces and 10 cultivars) and group 2 contained 306 accessions (157 landraces and 149 cultivars) when the subpopulation number (k) was 2. Three groups, namely group 1 (20 landraces and 116 cultivars), group 2 (189 landraces and 11 cultivars) and group 3 (149 landraces and 32 cultivars), could be identified when k = 3 (Additional file 2: Fig. S1). By integrating PCA (Fig. 1A), the phylogenetic tree (Fig. 1B) and admixture analysis (Fig. 1C), a total of 4 subpopulations were finally determined. With reference to 182 accessions (in three subpopulations named Ind I, Ind II and Ind III) from the RiceVarMap database, and to the phylogenetic relationship with accessions from 3KRG, these 4 subpopulations were named Ind I (landrace), Ind II (cultivar), Ind IV (landrace) and GJ-tmp in this study (Fig. 2A, Additional file 2: Fig. S2). GJ-tmp diverged from the other subpopulations, and most accessions in the GJ-tmp subpopulation were glutinous rice landraces. Subpopulation Ind I contained a total of 181 accessions, of which 151 were landraces and 30 were cultivars bred before the 1980s. Subpopulation Ind II contained 126 accessions, comprising 117 cultivars and 9 landraces. Subpopulation Ind IV had 189 landraces and 11 cultivars bred before the 1980s (Additional file 1: Table S1). A recently released and refined indica reference genome (9311) was also used to call genetic variations and conduct population structure analysis, which gave similar genetic clustering results for all accessions (Additional file 2: Fig. S3).

The length of linkage disequilibrium (LD) decay for subpopulations Ind I, Ind II and Ind IV (excluding the glutinous rice subpopulation GJ-tmp) was estimated using the square of the correlation coefficient (r²) between variations. The LD decay distances for Ind IV, Ind I and Ind II were 61.0 kb, 110.1 kb and 219.8 kb, respectively. The extended LD decay distance for Ind II indicated that the cultivar subpopulation Ind II underwent artificial selection pressure during genetic improvement. Interestingly, the landrace subpopulation Ind I probably experienced selection by regional farmer breeders, as its LD decay distance is longer than that of landrace subpopulation Ind IV (Fig. 2B). Genetic diversity (pi and Tajima’s D) and differentiation (fst) analyses were conducted for the three main subpopulations of Guangdong indica rice. The pi values for Ind IV, Ind I and Ind II were 0.0031, 0.0029 and 0.0029, respectively. Ind IV had the higher pi value, while Ind I and Ind II had similar values. The Tajima’s D value for Ind IV was positive, while it was negative for Ind I and Ind II, implying a potential selection effect in subpopulations Ind I and Ind II. Genetic divergence (fst) between Ind IV and Ind I is smaller than that between Ind IV and Ind II, which indicates that Ind II diverged further from Ind IV than Ind I did (Fig. 2C). A phylogenetic tree with 998 common wild rice (Oryza rufipogon Griff.) lines also indicates that the degree of differentiation from wild rice populations, from high to low, was Ind IV, Ind I and Ind II (Fig. 2D). Taking these results together, we deduced that the genetic differences among these regionally cultivated rice lines were attributable to the cultivation period, as modern cultivars may have high speed and flexible distance in their seed dispersal.

Fig. 1 Population structure analysis of Guangdong indica rice accessions. PCA plot (A), phylogenetic analyses (B) and population structure (C) showing genetic diversity and clustering of all accessions based on whole genome SNP variations. For the PCA plot, PC1 (principal component 1) and PC2 (principal component 2) are shown on the horizontal and vertical axes, and the percentages of variance explained are noted in parentheses

Fig. 2 Genetic diversity and phylogenetic relationship of subpopulations for Guangdong indica rice. A PCA plot depicting the comparison of genetic clusters of 479 Guangdong and 220 indica accessions from RiceVarMap database. B Linkage disequilibrium (LD) decay analysis of three main subpopulations. C Genetic diversity and differentiation of three subpopulations for Guangdong indica rice. The size of the circles represents the level of genetic diversity (pi) of the subpopulations, and the length of lines represents fst values between subpopulations. D Phylogenetic tree for Guangdong indica rice and common wild rice (Oryza rufipogon Griff.)

Phenotypic Comparison for Subpopulations

The selection pressure applied by local breeders during rice improvement over the past half century largely changed the agronomic traits between traditional and modern rice accessions. Genetic diversity and LD decay analysis indicates potential selection pressure in Ind I, and an even stronger selection effect in modern cultivar subpopulation Ind II. The alteration of agronomic traits across these subpopulations recorded the trajectory of these selection effects. A total of eleven important agronomic traits, including plant height (PH), heading date (HD), yield per plant (YPP), panicle number (PN), grain number per panicle (GNPP), seed setting (SS), thousand grain weight (TGW), panicle length (PL), grain length (GL), grain width (GW) and grain length width ratio (GLWR), were investigated and analyzed.

During the improvement progress across Ind IV, Ind I and Ind II, the values of the eleven agronomic traits showed four different types of trend. Seed setting rate (Fig. 3a) and grain length (Fig. 3b) increased during the modern breeding process, as shown by the comparison of Ind IV, Ind I and Ind II. Plant height (Fig. 3c) and panicle length (Fig. 3d) decreased during this process, which represents the main phenotype alteration for the semi-dwarf rice cultivars released during the rice “green revolution” of southern China. Heading date (Fig. 3e), yield per plant (Fig. 3f), grain number per panicle (Fig. 3g) and grain length width ratio (Fig. 3h) fluctuated, declining in Ind I and rising in Ind II. Panicle number (Fig. 3i), thousand grain weight (Fig. 3j) and grain width (Fig. 3k) rose in Ind I but decreased in Ind II. The increase in thousand grain weight, grain length and grain width reflects the selection of high yield rice lines with large grain size, whereas with the breeding and application of high-quality “Simiao rice” with small and slender grains, Guangdong indica rice showed a decrease in these traits and an increase in grain length width ratio.

Fig. 3 Phenotype comparison of eleven main agronomic traits for three main subpopulations of Guangdong indica rice. Subpopulations were ordered by LD decay distance from short to long, and average trait values for each subpopulation were noted above boxplots

Frequency of Deleterious or Beneficial Allele During Genetic Improvement

The numbers of deleterious variations that encode adverse amino acid changes were predicted in the three main subpopulations, and the numbers in accessions from landrace subpopulations Ind I and Ind IV were compared with those in cultivar subpopulation Ind II. Firstly, deleterious variations identified by SIFT software showed that the total count of deleterious variations was lower in Ind I (median 3255.0) and Ind II (median 3287.5) than in Ind IV (median 3472.0), implying that these variations were lost during modern cultivar improvement under artificial selection pressure (Fig. 4A and Additional file 1: Table S5). Secondly, a total of 319 quantitative trait nucleotides (QTNs) of the 212 vital rice genes in the RiceNavi database were used to annotate accessions of the three subpopulations (Additional file 1: Table S6). For all genes, the average inferior allele counts of accessions in Ind IV, Ind I and Ind II were 52.19, 52.96 and 50.32, respectively. Inferior allele counts were 17.38, 16.34 and 15.72 for yield related genes and 29.83, 30.91 and 30.88 for stress related genes in Ind IV, Ind I and Ind II, respectively. These results suggest that modern cultivars of subpopulation Ind II accumulated favorable alleles during improvement, especially for yield related genes (Fig. 4B and C). However, modern cultivars lost some favorable alleles of stress responsive genes (Fig. 4D).

Fig. 4 Deleterious variations in accessions of three main subpopulations. A Number of deleterious variations predicted by SIFT (sorting intolerant from tolerant) software. Number of inferior alleles of all genes (B), yield related genes (C) and stress related genes (D) annotated using RiceNavi database. Subpopulations were ordered by LD decay distance from short to long

QTLs and Breeding Signatures of Guangdong Indica Rice

Breeding signatures of modern cultivar subpopulation Ind II were identified from significantly distorted patterns in allele frequency using the XP-CLR method. Modern cultivar subpopulation Ind II was compared separately with landrace subpopulations Ind IV and Ind I. For the comparison of Ind IV and Ind II, a total of 150 genomic segments spanning 15.10 Mb were identified as genomic regions potentially selected by modern cultivar breeding (Additional file 1: Table S7). For the comparison of Ind I and Ind II, a total of 146 genomic segments with a genomic length of 14.59 Mb were identified (Additional file 1: Table S8).

Genome-wide association studies (GWAS) were conducted for eleven yield and yield-related traits, and the effects of candidate genes were identified. For instance, Ghd7.1/DTH7 explained 9.50% of heading date variance; sd1 explained 15.07% of plant height variance; GS3 explained 4.57%, 9.84% and 6.46% of phenotype variance for thousand grain weight, grain length and grain length width ratio; GSE5 explained 11.40%, 22.85% and 11.49% of phenotype variance for thousand grain weight, grain width and grain length width ratio; and GS5 explained 40.70% and 4.37% of phenotype variance for grain width and grain length width ratio (Additional file 1: Table S9, Additional file 2: Fig. S4). The effects of allele combinations were analyzed for plant height and grain size genes. The average plant height of accessions with the combination of sd1-Hap1 and Oshox4-Hap2 was 116.49 cm, which was significantly lower than the 151.70 cm of accessions with sd1-Hap1 and Oshox4-Hap1 (Additional file 1: Table S10). A total of 25 major allele combinations of grain size genes GS3, GSE5 and GS5 were detected. Thousand grain weight ranged from 19.56 to 24.22 g, grain length from 7.62 to 9.43 mm, grain width from 2.30 to 2.96 mm, and grain length width ratio from 2.66 to 3.99 for the accessions with the 25 allele combinations. For instance, the high quality “Simiao” rice Meixiangzhan2hao has the combination of GS3-Hap3, GSE5-Hap2 and GS5-Hap6, with a thousand grain weight of 20.88 g and a grain length width ratio of 3.99 (Additional file 1: Table S11). The phenotype effects of known genes genotyped by RiceNavi were also evaluated, and several QTNs have potential effects on Guangdong indica rice phenotype. For instance, two variations of the Hd1 gene (9338004 and 9338220 on Chr6) show an effect of promoting heading date by about 9 days (Additional file 1: Table S12). Four genes regulating eating quality were also genotyped. Ten accessions were genotyped as carrying the fragrance allele of the Badh2 gene, 9 of which are Ind II accessions. The only differing site between the two elite cultivars Huanghuazhan and Meixiangzhan2hao was the presence or absence of the fragrance allele of the Badh2 gene (Additional file 1: Table S13).

XP-CLR analysis was conducted by comparing Ind II with Ind IV (Fig. 5A) and Ind I (Fig. 5B), and the QTLs were further used to annotate the regions of breeding signatures (Fig. 5C). A total of 24 and 23 intersections between agronomic QTLs and selected genomic regions were found for subpopulations Ind IV and Ind I, respectively. Known vital yield and yield-related genes under selection pressure were detected. For instance, sd1, Oshox4 and OsGA2ox5 were found under selection for plant height, OsWDR5 and TAC3 were selected for the improvement of grain yield per plant, and favorable alleles of GS3, Osmyb3 and FLO13 were selected for grain length. Interestingly, different selected genes were detected when Ind II was compared with Ind IV and with Ind I. Plant height genes sd1 and Oshox4 were selected in Ind IV, and the selected genes in Ind I were OsGA2ox5 and Oshox4. For grain yield per plant, TAC3 and OsWDR5 were under selection pressure in Ind I but not in Ind IV. GS3 was selected for grain length in Ind IV but not in Ind I. GSE5 and OsDER1 were selected for grain width and grain length width ratio in subpopulation Ind I, but OsABCG18 was selected in Ind IV (Fig. 5).

Fig. 5 Selection sweeps for modern cultivar Ind II and the annotation of breeding signatures by QTLs of eleven agronomic traits. Distribution of XP-CLR scores between modern cultivar Ind II and local landrace subpopulations of Ind IV (A) and Ind I (B). C Annotation of breeding signatures using QTLs identified by GWAS

Allele Fixation During the Breeding Process of Guangdong Indica Rice

Gene haplotype analysis was conducted to illustrate evolutionary relationships and identify alleles selected during modern breeding and improvement of subpopulation Ind II. A total of 6 potentially favorable alleles were fixed in Ind II for key yield related genes: Oshox4 and OsGA2ox5 for plant height, TAC3 for tiller angle and yield, and GS3, Osmyb3 and FLO13 for grain size and weight. Four main haplotypes were identified for plant height gene Oshox4, and haplotype network analysis revealed that Oshox4-Hap3 (the Ind II fixed haplotype) was derived from Oshox4-Hap2, following the variations of Oshox4-Hap1 (Ind IV and Ind I) and Oshox4-Hap4 (GJ-tmp). Of the two main haplotypes of plant height gene OsGA2ox5, most accessions of Ind II have OsGA2ox5-Hap2, while landraces of subpopulations Ind I and Ind IV have OsGA2ox5-Hap1. Tiller angle and yield related gene TAC3 has four main haplotypes in these accessions, and TAC3-Hap3 is a fixed haplotype of Ind II modern cultivars. Grain length and weight gene GS3 has five main haplotypes, and GS3-Hap3 was fixed during the breeding and improvement process for Ind II. Hap3 of grain length gene Osmyb3 was mainly selected for Ind II during modern rice breeding, differing by one and two variations from Osmyb3-Hap4 and Osmyb3-Hap2. Of the four haplotypes of grain weight gene FLO13, Hap3 is the predominant allele in Ind II, and it was selected from FLO13-Hap4 with one missense variation (Fig. 6). In those alleles fixed in modern cultivars, six Ind II specific variations were identified. Oshox4-Hap3 has one specific intron variation between exon 1 and exon 2, TAC3-Hap3 has three cultivar specific variations, GS3-Hap3 has one stop codon gained variation, and FLO13-Hap3 has one specific missense variation in Ind II (Fig. 7).

Fig. 6 Haplotype networks of six key yield related genes in selected regions. A Oshox4 related to plant height, B OsGA2ox5 related to plant height, C TAC3 related to tiller angle, D GS3 related to grain length and weight, E Osmyb3 related to grain length, and F FLO13 related to grain weight. Circle size is proportional to accession number of one haplotype. Number of ticks in network edges represents the variation counts between haplotypes

Fig. 7 Gene structure and haplotype specific variations of Oshox4 (A), TAC3 (B), GS3 (C) and FLO13 (D) for subpopulation Ind II. Background colors of bases represent annotation and effect level of the variation. Red stars mark the specific variations for the haplotype of Ind II

Discussion

The major breakthrough of the “green revolution” led to quantum leaps in rice productivity (Cheng et al. 2020), and intensive breeding efforts and artificial selection have facilitated the significant improvement of indica rice yield in Guangdong, where the “green revolution” started in China. Unlike previously reported selection analyses of rice accessions from multiple geographic positions (Li et al. 2020a, b; Lv et al. 2020; Xie et al. 2015; Xu et al. 2016; Ye et al. 2022), we focus on the locally adaptive selection of rice in Guangdong by comparing the genomic variations and agronomic traits of landraces locally cultivated by farmers before the “green revolution” with those of modern improved cultivars. For example, modern breeding for rice quality, which started at the end of the last century in Guangdong, favors a slender grain type for regional appetite in southern Asia, which results in an increase of grain length width ratio and a decrease of grain weight in the subpopulation of modern cultivars.

The artificial replacement of deleterious variations with favorable alleles during breeding and improvement is meaningful for meeting social demands for crops and food. By integrating GWAS for agronomic traits and breeding signatures for selected regions, regionally selected key genes influencing yield and yield related traits in Guangdong indica rice were identified. Oshox4, which plays a negative role in gibberellin responses and influences plant height and tiller number (Dai et al. 2008; Zhou et al. 2015), was identified as a selected gene for plant height in our accessions. An Ind II specific haplotype (Oshox4-Hap3) was identified for cultivars. Another regulator of rice growth and architecture, OsGA2ox5, was also identified as a selected gene for plant height in cultivar subpopulation Ind II when compared with Ind I (Lo et al. 2008), and OsGA2ox5-Hap2 was a haplotype selected in cultivars from Ind II. Tiller angle gene TAC3 was selected for yield per plant in Ind II when compared with Ind I, and TAC3-Hap3 was selected in cultivars from Ind II (Dong et al. 2016). GS3 and Osmyb3 were identified as selected in Ind II when compared with Ind IV, and Hap3 of these two genes was selected for cultivars (Li et al. 2020a, b; Fan et al. 2006; Mao et al. 2010). Starch biosynthesis and grain weight gene FLO13 was selected in Ind II compared with Ind I, and FLO13-Hap3, with a unique missense variant in cultivars, was a potential selected haplotype (Hu et al. 2018). These selected favorable haplotypes are promising functional alleles regulating the yield improvement of Guangdong modern cultivars.

Conclusions

In summary, large-scale genomic and yield assessment of Guangdong landraces and modern cultivars promotes analysis of their diversity, classification and phylogenetic relationship. We revealed a lower number of deleterious variations in modern cultivars than in landraces. Selected genomic regions were also identified and annotated using GWAS of vital agronomic traits, which led to the identification of key genes selected during Guangdong rice breeding and improvement. These results shed light on the regional breeding trajectory and artificial selection, and provide valuable resources for the rational design of molecular breeding.

Materials and Methods

Field Experimental Design and Phenotyping of Agronomic Traits

Agronomic traits of 479 accessions of Guangdong rice core germplasm were investigated in the experimental field of the Rice Research Institute, Guangdong Academy of Agricultural Sciences, for two seasons (2019 and 2021) under conventional field management. All accessions were planted in 1437 blocks under a randomized complete block design with three replications. A total of eleven agronomic traits were investigated and calculated. Plant height (PH) was measured as the length from the ground to the highest point of the plant; heading date (HD) was recorded when half of the plants in a block had reached the heading stage, and the days from sowing to heading were calculated; yield per plant (YPP) was the total weight of filled grains per plant; panicle number (PN) was the number of effective panicles per mature plant; grain number per panicle (GNPP) was the mean total number of grains per panicle on a single plant; seed setting (SS) was calculated as the percentage of filled grains to total grains per plant; grain length (GL), grain width (GW) and panicle length (PL) were measured when seeds were mature (Yu et al. 2020). Thousand grain weight (TGW) was calculated by dividing total weight by total filled grain number, and grain length width ratio (GLWR) was GL divided by GW.

Genomic Resequencing and Variation Calling

Young leaves of 358 landraces and 121 improved cultivars of indica rice accessions from Guangdong province of southern China were collected to construct sequencing libraries according to the manufacturer’s instructions, and qualified libraries were sequenced using the Illumina HiSeq platform. A total of 38 Guangdong improved cultivars were collected from the NCBI SRA database with accession numbers PRJNA321462, PRJNA522896 and PRJNA656900 (Additional file 1: Table S1). The quality of raw sequencing data was assessed using FastQC (v0.11.9) software (Andrews 2010), and low-quality data were trimmed using TrimGalore (version 0.6.6) to generate clean sequencing data. Clean data were mapped onto the reference genome (MSU7) using BWA (0.7.17-r1188) software with default parameters (Li and Durbin 2009). MarkDuplicates in Picard (2.12.1) was used to eliminate PCR duplicates and sort BAM files, and genomeCoverageBed of bedtools (v2.27.1) was used to calculate genome coverage ratios. SNPs (single nucleotide polymorphisms) and InDels (insertions and deletions) were then called using HaplotypeCaller of the Genome Analysis Toolkit (GATK, version 4.2.2.0) pipeline (McKenna et al. 2010), and annotated using SnpEff (4.3s) with the GFF3 file of the MSU7 reference genome (Cingolani et al. 2012).

Population Structure Analysis

Principal components analysis (PCA), phylogenetic analysis and admixture analysis were employed to classify subpopulations (Yu et al. 2021). For population structure analysis, SNP variations were filtered using VCFtools (0.1.16) software with parameters “--max-missing 0.95” and “--maf 0.05”. The PCA method with the kmeans clustering algorithm in CropGBM software was used to reduce the dimensions of genotypic data (Yan et al. 2021), and eigenvalues were calculated by plink software. The phylogenetic relationship was constructed using VCF2Dis software (https://github.com/BGI-shenzhen/VCF2Dis) and illustrated using FigTree (v1.4.3) software (https://github.com/rambaut/figtree). ADMIXTURE (version 1.3.0) software (Alexander and Lange 2011) was used to analyze population structure with k values ranging from 2 to 12. The ancestry distributions of individuals were visualized using an R script. Genetic diversity (pi and Tajima’s D) and differentiation (fst) analyses were conducted using VCFtools (0.1.16) software with 100 kb sliding windows. Linkage disequilibrium (LD) decay for each subpopulation was estimated and plotted using PopLDdecay (Zhang et al. 2019). National indica rice sequencing data from the RiceVarMap database (Zhao et al. 2015), variations of 3024 3KRG accessions from the SNP-seek database (Locedie et al. 2017) and variations of 998 wild rice lines from our recent research (Zhang et al. 2022), which were used to conduct population structure and phylogenetic analysis in this study, were subjected to the same data processing pipeline. A recently released and refined indica 9311 reference genome was also used to check the results of population structure analysis (Wang et al. 2022).

Estimation of Variation Effects and Deleterious Mutation Prediction

Functional alteration of genomic variations was predicted using Sorting Intolerant From Tolerant 4G (SIFT 4G) software (Vaser et al. 2016). Variations with SIFT scores smaller than 0.05 were considered putatively deleterious variations. Allele function for known genes with a vital role in rice was annotated using the RiceNavi database (Wei et al. 2021). Advantageous and inferior alleles were determined by manually checking the allele functional alteration and its corresponding trait.

[...] were filtered out, as such short blocks are unlikely to have been selected during the short history of modern rice improvement breeding. Long blocks with the top 1% of XP-CLR scores were finally considered as selected regions.

Genome-Wide Association Analysis

For genome-wide association analysis (GWAS), the multi-sample VCF file of genomic variations was converted into a plink file and variations were screened with the parameters “--geno 0.1 --mind 0.4 --maf 0.05”. PCA analysis was conducted by plink software with five major components (Purcell et al. 2007). Kinship analysis and GWAS were conducted using GEMMA software with the filtered genotypes and eleven agronomic traits (Zhou and Stephens 2012).

Gene Haplotype Reconstruction and Network Analysis

Beagle software (version 5.2) was used to impute missing genetic variations in the GATK output (Browning et al. 2021). Genomic variations of selected genes were extracted based on their positions using BCFTools (Li 2011). Haplotype networks of these genes were constructed by our previously described method (Yu et al. 2021) and illustrated using Popart software (Leigh and Bryant 2015).

Abbreviations

SNP: Single-nucleotide polymorphisms
PCA: Principal component analysis
XP-CLR: The cross-population composite likelihood ratio test
QTLs: Quantitative trait locus
GWAS: Genome-wide association studies
LD: Linkage disequilibrium
QTNs: Quantitative trait nucleotides
3KRG: 3000 Rice genome project

Supplementary Information

The online version contains supplementary material available at https://doi.org/10.1186/s12284-023-00642-3.
Number of inferior alleles for accessions from each subpopulation were counted and Additional file 1: Table S1. Information for accessions used in this study. Table S2. Quality assessment of genome sequencing data. Table S3. plotted using boxplot or violin plot in R. Genome mapping quality and genome coverage of genome sequencing data. Table S4. Genomic variations for 517 indica rice accessions against the MSU reference genome. Table S5. Number of deleterious variations Identification of Breeding Signatures Using XP‑CLR in subpopulation of landrace and cultivar. Table S6. Genotyping results Breeding signatures for artificial selection were identified 319 quantitative trait nucleotides (QTNs) of the 212 vital gene in rice from by using the cross-population composite likelihood ratio RiceNavi database. Table S7. Genomic segments that identified as breed- ing signatures between Ind IV and Ind II. Table S8. Genomic segments test (XP-CLR) method (Chen et al. 2010) and its updated that identified as breeding signatures between Ind I and Ind II. Table S9. version of python module (https:// github. com/ hardi ngnj/ QTLs and known genes that identified by GWAS of eleven agronomic xpclr). XP-CLR was conducted between subpopulations traits for Guangdong indica rice. Table S10. Allele combinations of plant height (PH) genes of Guangdong indica rice germplasm. Table S11. Allele of landrace and cultivar with 10  kb sliding windows. combinations of thousand grain weight ( TGW ), grain length (GL), grain Genomic segments with XP-CLR values above the 80th width (GW ) and grain length width ratio (GLWR) genes of Guangdong percentile was considered as putatively selected regions. indica rice germplasm. Table S12. Phenotypic effect assessment of known QTNs that genotyped by RiceNavi. Table S13. Genotyping of 4 eating Adjacent segments within 20  kb distance were then quality genes. merged into longer blocks, and blocks shorter than 40 kb Hang  et al. 
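The XP-CLR post-processing described in Materials and Methods (keep 10 kb windows scoring above the 80th percentile, merge windows lying within 20 kb of each other, discard blocks shorter than 40 kb) can be sketched roughly as follows. This is an illustration only, not the authors' script: the tuple layout of `windows`, the `Block` class and the helper names are assumptions made here, and the final "top 1%" ranking is left out because the text does not say how a block's score is summarized (the sketch merely records each block's maximum window score).

```python
from dataclasses import dataclass


@dataclass
class Block:
    """A merged candidate region (hypothetical container for this sketch)."""
    chrom: str
    start: int        # bp, start of the first merged window
    end: int          # bp, end of the last merged window
    max_score: float  # highest XP-CLR window score inside the block


def percentile(values, q):
    """Linear-interpolation percentile (q in [0, 100]) of a non-empty list."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0
    lo = int(rank)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (rank - lo)


def call_selected_blocks(windows, score_pct=80.0, max_gap=20_000, min_len=40_000):
    """windows: iterable of (chrom, start, end, xpclr_score) tuples."""
    windows = sorted(windows)  # order by chromosome, then position
    cutoff = percentile([w[3] for w in windows], score_pct)
    blocks = []
    for chrom, start, end, score in windows:
        if score <= cutoff:
            continue  # only windows strictly above the percentile cutoff pass
        last = blocks[-1] if blocks else None
        if last and last.chrom == chrom and start - last.end <= max_gap:
            # Close enough to the previous passing window: extend that block.
            last.end = max(last.end, end)
            last.max_score = max(last.max_score, score)
        else:
            blocks.append(Block(chrom, start, end, score))
    # Drop short blocks, which the Methods treat as unlikely breeding signatures.
    return [b for b in blocks if b.end - b.start >= min_len]
```

Merging before the length filter matters: two isolated high-scoring windows 20 kb apart each fail the 40 kb minimum on their own, but survive once merged into a single block.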
Additional file 2: Fig. S1. Admixture analysis when subpopulation number (k) was set to two (a) and three (b). Numbers of cultivar (Cul) and landrace (Lan) were noted in parentheses for each subgroup. Fig. S2. Population structure analysis of Guangdong indica rice with accessions from RiceVarMap2 and 3KRG database. Fig. S3. Population structure analysis of Guangdong indica rice accessions using indica rice 9311 as reference genome. Fig. S4. Manhattan plots for genome-wide association analysis of eleven agronomic traits.

Acknowledgements

The authors are grateful to all lab members for their assistance in field experiments, and appreciate the tremendous dedication of rice scientists to germplasm collection and breeding of Guangdong rice.

Author contributions

CL and HY conceived and designed the experiments, and wrote the paper. HY, YL, BRS, QL, XXM, LQJ, SWL, JZ, PLC, DJP, WFC, ZLF performed the experiments and analyzed the data. CL and HY contributed reagents/materials/analysis tools. All authors read and approved the final manuscript.

Funding

This work was supported by Natural Science Foundation of Guangdong Province (2022A1515011741), Special Funds for Scientific Innovation Strategy-Construction of High Level Academy of Agriculture Science (R2021YJYB3017), Guangzhou Science and Technology Plan Project (2023A04J0144), Key Field Research and Development Project of Guangdong Province (2022B0202110003), Seed Industry Revitalization Project of Special Fund for Rural Revitalization Strategy in Guangdong Province (2022NJS00004, 2022NPY00011) and The Project of Collaborative Innovation Center of GDAAS (XTXM202203).

Availability of Data and Materials

The raw reads of whole-genome resequencing are available at the NCBI Sequence Read Archive under accession ID PRJNA934413. The sequences and annotations of the reference genome MSU7 are available from http://rice.plantbiology.msu.edu/.

Declarations

Ethics Approval and Consent to Participate

Not applicable.

Consent for Publication

Not applicable.

Competing Interests

The authors have declared that no competing interests exist.

Received: 14 February 2023 Accepted: 14 May 2023

References

Alexander DH, Lange K (2011) Enhancements to the ADMIXTURE algorithm for individual ancestry estimation. BMC Bioinf 12:246
Andrews S (2010) FastQC: a quality control tool for high throughput sequence data [Online]. Available online at: http://www.bioinformatics.babraham.ac.uk/projects/fastqc/
Browning BL, Tian X, Zhou Y, Browning SR (2021) Fast two-stage phasing of large-scale sequence data. Am J Hum Genet 108:1880–1890
Chen H, Patterson N, Reich D (2010) Population differentiation as a test for selective sweeps. Genome Res 20:393–402
Cheng F, Quan X, Zhengjin X, Wenfu C (2020) Effect of rice breeding process on improvement of yield and quality in China. Rice Sci 27:363–367
Cingolani P, Platts A, Wang LL, Coon M, Nguyen T, Wang L, Land SJ, Lu X, Ruden DM (2012) A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff: SNPs in the genome of Drosophila melanogaster strain w1118; iso-2; iso-3. Fly 6:80–92
Cui D, Lu H, Tang C, Li J, Yu T, Ma X, Zhang E, Wang Y, Cao G, Xu F, Qiao Y, Dai L, Li R, Tian S, Koh HJ, Han L (2019) Genomic analyses reveal selection footprints in rice landraces grown under on-farm conservation conditions during a short-term period of domestication. Evolut Appl 13:290–302
Dai M, Hu Y, Ma Q, Zhao Y, Zhou D (2008) Functional analysis of rice HOMEOBOX4 (Oshox4) gene reveals a negative function in gibberellin responses. Plant Mol Biol 66:289–301
Dong H, Zhao H, Xie W, Han Z, Li G, Yao W, Bai X, Hu Y, Guo Z, Lu K, Yang L, Xing Y (2016) A novel tiller angle gene, TAC3, together with TAC1 and D2 largely determine the natural variation of tiller angle in rice cultivars. PLoS Genet 12:e1006412
Fan C, Xing Y, Mao H, Lu T, Han B, Xu C, Li X, Zhang Q (2006) GS3, a major QTL for grain length and weight and minor QTL for grain width and thickness in rice, encodes a putative transmembrane protein. Theor Appl Genet 112:1164–1171
Fuentes RR, Chebotarov D, Duitama J, Smith S, De la Hoz JF, Mohiyuddin M, Wing RA, McNally KL, Tatarinova T, Grigoriev A, Mauleon R, Alexandrov N (2019) Structural variants in 3000 rice genomes. Genome Res 29:870–880
Higgins J, Santos B, Khanh TD, Trung KH, Duong TD, Doai NTP, Khoa NT, Ha DTT, Diep NT, Dung KT, Phi CN, Thuy TT, Tuan NT, Tran HD, Trung NT, Giang HT, Nhung TK, Tran CD, Lang SV, Nghia LT, Van Giang N, Xuan TD, Hall A, Dyer S, Ham LH, Caccamo M, De Vega JJ (2021) Resequencing of 672 native rice accessions to explore genetic diversity and trait associations in Vietnam. Rice 14:1–16
Hour A, Hsieh W, Chang S, Wu Y, Chin H, Lin Y (2020) Genetic diversity of landraces and improved varieties of rice (Oryza sativa L.) in Taiwan. Rice 13:1–12
Hu T, Tian Y, Zhu J, Wang Y, Jing R, Lei J, Sun Y, Yu Y, Li J, Chen X, Zhu X, Hao Y, Liu L, Wang Y, Wan J (2018) OsNDUFA9 encoding a mitochondrial complex I subunit is essential for embryo development and starch synthesis in rice. Plant Cell Rep 37:1667–1679
Huang X, Wei X, Sang T, Zhao Q, Feng Q, Zhao Y, Li C, Zhu C, Lu T, Zhang Z, Li M, Fan D, Guo Y, Wang A, Wang L, Deng L, Li W, Lu Y, Weng Q, Liu K, Huang T, Zhou T, Jing Y, Li W, Lin Z, Buckler ES, Qian Q, Zhang Q, Li J, Han B (2010) Genome-wide association studies of 14 agronomic traits in rice landraces. Nat Genet 42:961–967
Huang X, Kurata N, Wei X, Wang Z, Wang A, Zhao Q, Zhao Y, Liu K, Lu H, Li W, Guo Y, Lu Y, Zhou C, Fan D, Weng Q, Zhu C, Huang T, Zhang L, Wang Y, Feng L, Furuumi H, Kubo T, Miyabayashi T, Yuan X, Xu Q, Dong G, Zhan Q, Li C, Fujiyama A, Toyoda A, Lu T, Feng Q, Qian Q, Li J, Han B (2012) A map of rice genome variation reveals the origin of cultivated rice. Nature 490:497–501
Huang X, Yang S, Gong J, Zhao Q, Feng Q, Zhan Q, Zhao Y, Li W, Cheng B, Xia J, Chen N, Huang T, Zhang L, Fan D, Chen J, Zhou C, Lu Y, Weng Q, Han B (2016) Genomic architecture of heterosis for yield traits in rice. Nature 537:629–633
Kumar A, Daware A, Kumar A, Kumar V, Gopala KS, Mondal S, Patra BC, Singh AK, Tyagi AK, Parida SK, Thakur JK (2020) Genome-wide analysis of polymorphisms identified domestication-associated long low-diversity region carrying important rice grain size/weight quantitative trait loci. Plant J 103:1525–1547
Leigh JW, Bryant D (2015) POPART: full-feature software for haplotype network construction. Methods Ecol Evol 6:1110–1116
Li H (2011) A statistical framework for SNP calling, mutation discovery, association mapping and population genetical parameter estimation from sequencing data. Bioinformatics 27:2987–2993
Li H, Durbin R (2009) Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 25:1754–1760
Li Q, Lu L, Liu H, Bai X, Zhou X, Wu B, Yuan M, Yang L, Xing Y (2020a) A minor QTL, SG3, encoding an R2R3-MYB protein, negatively controls grain length in rice. Theor Appl Genet 133:2387–2399
Li X, Chen Z, Zhang G, Lu H, Qin P, Qi M, Yu Y, Jiao B, Zhao X, Gao Q, Wang H, Wu Y, Ma J, Zhang L, Wang Y, Deng L, Yao S, Cheng Z, Yu D, Zhu L, Xue Y, Chu C, Li A, Li S, Liang C (2020b) Analysis of genetic architecture and favorable allele usage of agronomic traits in a large collection of Chinese rice accessions. Sci China Life Sci 63:1688–1702
Liu C, Peng P, Li W, Ye C, Zhang S, Wang R, Li D, Guan S, Zhang L, Huang X, Guo Z, Guo J, Long Y, Li L, Pan G, Tian B, Xiao J (2021) Deciphering variation of 239 elite japonica rice genomes for whole genome sequences-enabled breeding. Genomics 113:3083–3091
Lo S, Yang S, Chen K, Hsing Y, Zeevaart JAD, Chen L, Yu S (2008) A novel class of gibberellin 2-oxidases control semidwarfism, tillering, and root development in rice. Plant Cell 20:2603–2618
Locedie M, Roven RF, Frances NB, Jeffery D, Juan MA, Dmytro C, Millicent S, Kevin P, Dario C, Alexandre P, Inna D, Victor S, Rod AW, Ruaraidh SH, Ramil M, Kenneth LM, Nickolai A (2017) Rice SNP-seek database update: new SNPs, indels, and queries. Nucleic Acids Res 45(D1):D1075–D1081
Lv Q, Li W, Sun Z, Ouyang N, Jing X, He Q, Wu J, Zheng J, Zheng J, Tang S, Zhu R, Tian Y, Duan M, Tan Y, Yu D, Sheng X, Sun X, Jia G, Gao H, Zeng Q, Li Y, Tang L, Xu Q, Zhao B, Huang Z, Lu H, Li N, Zhao J, Zhu L, Li D, Yuan L, Yuan D (2020) Resequencing of 1,143 indica rice accessions reveals important genetic variations and different heterosis patterns. Nature Commun 11:4778
Mao H, Sun S, Yao J, Wang C, Yu S, Xu C, Li X, Zhang Q (2010) Linking differential domain functions of the GS3 protein to natural variation of grain size in rice. Proc Natl Acad Sci 107:19579–19584
McKenna A, Hanna M, Banks E, Sivachenko A, Cibulskis K, Kernytsky A, Garimella K, Altshuler D, Gabriel S, Daly M, DePristo MA (2010) The genome analysis toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res 20:1297–1303
Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MAR, Bender D, Maller J, Sklar P, de Bakker PIW, Daly MJ, Sham PC (2007) PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet 81:559–575
Vaser R, Adusumalli S, Leng SN, Sikic M, Ng PC (2016) SIFT missense predictions for genomes. Nat Protoc 11:1–9
Wang W, Mauleon R, Hu Z, Chebotarov D, Tai S, Wu Z, Li M, Zheng T, Fuentes RR, Zhang F, Mansueto L, Copetti D, Sanciangco M, Palis KC, Xu J, Sun C, Fu B, Zhang H, Gao Y, Zhao X, Shen F, Cui X, Yu H, Li Z, Chen M, Detras J, Zhou Y, Zhang X, Zhao Y, Kudrna D, Wang C, Li R, Jia B, Lu J, He X, Dong Z, Xu J, Li Y, Wang M, Shi J, Li J, Zhang D, Lee S, Hu W, Poliakov A, Dubchak I, Ulat VJ, Borja FN, Mendoza JR, Ali J, Li J, Gao Q, Niu Y, Yue Z, Naredo MEB, Talag J, Wang X, Li J, Fang X, Yin Y, Glaszmann J, Zhang J, Li J, Hamilton RS, Wing RA, Ruan J, Zhang G, Wei C, Alexandrov N, McNally KL, Li Z, Leung H (2018) Genomic variation in 3,010 diverse accessions of Asian cultivated rice. Nature 557:43–49
Wang S, Gao S, Nie J, Tan X, Xie J, Bi X, Sun Y, Luo S, Zhu Q, Geng J, Liu W, Lin Q, Cui P, Hu S, Wu S (2022) Improved 93-11 genome and time-course transcriptome expand resources for rice genomics. Front Plant Sci 12:769700
Wei X, Qiu J, Yong K, Fan J, Zhang Q, Hua H, Liu J, Wang Q, Olsen KM, Han B, Huang X (2021) A quantitative genomics map of rice provides genetic insights and guides breeding. Nat Genet 53:243–253
Xiao N, Pan C, Li Y, Wu Y, Cai Y, Lu Y, Wang R, Yu L, Shi W, Kang H, Zhu Z, Huang N, Zhang X, Chen Z, Liu J, Yang Z, Ning Y, Li A (2021) Genomic insight into balancing high yield, good quality, and blast resistance of japonica rice. Genome Biol 22:1–22
Xie W, Wang G, Yuan M, Yao W, Lyu K, Zhao H, Yang M, Li P, Zhang X, Yuan J, Wang Q, Liu F, Dong H, Zhang L, Li X, Meng X, Zhang W, Xiong L, He Y, Wang S, Yu S, Xu C, Luo J, Li X, Xiao J, Lian X, Zhang Q (2015) Breeding signatures of rice improvement revealed by a genomic variation map from a large germplasm collection. Proc Natl Acad Sci 112:E5411–E5419
Xu X, Liu X, Ge S, Jensen JD, Hu F, Li X, Dong Y, Gutenkunst RN, Fang L, Huang L, Li J, He W, Zhang G, Zheng X, Zhang F, Li Y, Yu C, Kristiansen K, Zhang X, Wang J, Wright M, McCouch S, Nielsen R, Wang J, Wang W (2012) Resequencing 50 accessions of cultivated and wild rice yields markers for identifying agronomically important genes. Nat Biotechnol 30:105–111
Xu Q, Yuan X, Wang S, Feng Y, Yu H, Wang Y, Yang Y, Wei X, Li X (2016) The genetic diversity and structure of Indica rice in China as detected by single nucleotide polymorphism analysis. BMC Genet 17:1–8
Yan J, Xu Y, Cheng Q, Jiang S, Wang Q, Xiao Y, Ma C, Yan J, Wang X (2021) LightGBM: accelerated genomically designed crop breeding through ensemble learning. Genome Biol 22:1–24
Ye J, Zhang M, Yuan X, Hu D, Zhang Y, Xu S, Li Z, Li R, Liu J, Sun Y, Wang S, Feng Y, Xu Q, Yang Y, Wei X (2022) Genomic insight into genetic changes and shaping of major inbred rice cultivars in China. New Phytol 236:2311
Yu H, Shahid MQ, Li Q, Li Y, Li C, Lu Z, Wu J, Zhang Z, Liu X (2020) Production assessment and genome comparison revealed high yield potential and novel specific alleles associated with fertility and yield in neo-tetraploid rice. Rice 13:32
Yu H, Li Q, Li Y, Yang H, Lu Z, Wu J, Zhang Z, Shahid MQ, Liu X (2021) Genomics analyses reveal unique classification, population structure and novel allele of neo-tetraploid rice. Rice 14:16
Zhang C, Dong S, Xu J, He W, Yang T (2019) PopLDdecay: a fast and effective tool for linkage disequilibrium decay analysis based on variant call format files. Bioinformatics 35:1786–1788
Zhang J, Pan D, Fan Z, Yu H, Jiang L, Lv S, Sun B, Chen W, Mao X, Liu Q, Li C (2022) Genetic diversity of wild rice accessions (Oryza rufipogon Griff.) in Guangdong and Hainan provinces, China, and construction of a wild rice core collection. Front Plant Sci 13:999454
Zhao H, Yao W, Ouyang Y, Yang W, Wang G, Lian X, Xing Y, Chen L, Xie W (2015) RiceVarMap: a comprehensive database of rice genomic variations. Nucleic Acids Res 43(D1):D1018–D1022
Zhou X, Stephens M (2012) Genome-wide efficient mixed-model analysis for association studies. Nat Genet 44:821–824
Zhou W, Malabanan PB, Abrigo E (2015) OsHox4 regulates GA signaling by interacting with DELLA-like genes and GA oxidase genes in rice. Euphytica 201:97–107

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rice Springer Journals

Genetic Diversity and Breeding Signatures for Regional Indica Rice Improvement in Guangdong of Southern China

Publisher
Springer Journals
Copyright
Copyright © The Author(s) 2023
ISSN
1939-8425
eISSN
1939-8433
DOI
10.1186/s12284-023-00642-3

Abstract

As the pioneer of the Green Revolution in China, Guangdong province witnessed the improvement and spread of semi-dwarf Xian/Indica rice cultivars and possesses diverse rice germplasm of landraces and cultivars. A total of 517 accessions, containing a core germplasm of 479 newly sequenced landraces and modern cultivars, were used to reveal breeding signatures and key variations for regional genetic improvement of indica rice from Guangdong. Four subpopulations were identified in the collection, including Ind IV, a novel subpopulation not covered by previously released accessions. Modern cultivars of subpopulation Ind II were inferred to have fewer deleterious variations, especially in yield related genes. About 15 Mb of genomic segments were identified as potential breeding signatures by the cross-population composite likelihood ratio (XP-CLR) method between modern cultivars and landraces. The selected regions spanned multiple yield related QTLs (quantitative trait loci) identified by GWAS (genome-wide association studies) of the same population, and specific variations fixed in modern cultivars of Ind II were characterized. This study highlights genetic differences between traditional landraces and modern cultivars, revealing the potential molecular basis of regional genetic improvement of Guangdong indica rice from southern China.

Keywords: Rice, Yield improvement, Resequencing, Breeding signature, GWAS

*Correspondence: Li Chen, lichen@gdaas.cn
Rice Research Institute, Guangdong Academy of Agricultural Sciences, Guangzhou 510640, China
Key Laboratory of Genetics and Breeding of High Quality Rice in Southern China (Co-construction by Ministry and Province), Ministry of Agriculture and Rural Affairs, Guangzhou 510640, China
Guangdong Key Laboratory of New Technology in Rice Breeding, Guangzhou 510640, China
Guangdong Rice Engineering Laboratory, Guangzhou 510640, China

© The Author(s) 2023. Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Hang et al. Rice (2023) 16:25

Background

Rice (Oryza sativa) feeds more than half of the world’s population, and rice yield is vital for world food security. Rice genetic improvement in China has facilitated the increase of its production over the past several decades. Guangdong of southern China witnessed the breeding, introduction and spread of semi-dwarf indica rice accessions. Since then, rice yield has increased about three-fold with the breakthrough of high-yield rice cultivars. However, with the spread of modern cultivars, rice landraces that were grown by local farmers are gradually disappearing. To increase production further, the use of the genetic diversity of this valuable germplasm needs to be enhanced in breeding programs. Tremendous efforts have been made by germplasm scientists to collect and conserve landraces and locally improved traditional cultivars of southern China. These landraces and cultivars represent the rice genetic diversity of southern China before and after the rice “Green Revolution”, and could be used to reveal the genetic trajectory of regional indica rice breeding and phenotype enhancement. Revealing the functional variations related to the success of rice breeding in southern China will promote the utilization of genetic resources for future rice breeding. Moreover, characterization of the genome sequences, genetic diversity and functional variations of these germplasm collections is becoming critical for the next potential breakthrough in rice production.

The advancement of sequencing technologies enabled the analysis of genetic diversity in large germplasm collections, which promoted the discovery of domesticated loci and accelerated the identification of functional genes. Genotypes of a large collection of 517 rice landraces were identified with onefold-coverage sequencing and an accurate imputation method, and population structure, genome-wide association and haplotype analyses were conducted using about 3.6 million nonredundant SNPs (Huang et al. 2010). Thereafter, higher-depth sequencing with more than 15-fold coverage was conducted on 40 cultivated and 10 wild rice accessions to identify selection signatures during domestication using nucleotide polymorphisms (Xu et al. 2012), and a larger collection of 446 diverse wild rice accessions and 1083 cultivated varieties was also genotyped by sequencing and used for the identification of 55 selective sweeps during domestication (Huang et al. 2012). Further, 10,074 F2 lines from 17 representative hybrid rice combinations were genotyped, and heterosis related loci were identified (Huang et al. 2016). The release of sequencing data from the “3000 Rice Genomes Project” (3KRGP) largely facilitated the identification of untapped variations and novel genes (Fuentes et al. 2019; Wang et al. 2018). Joint analysis of 3KRGP data and Indian long and short grain germplasm identified a long low-diversity region harboring a key gene regulating grain weight (Kumar et al. 2020). Recently, the sequencing of local germplasm and improved varieties illustrated regional genetic diversity in detail. Sequencing of 239 elite japonica rice accessions from China, Japan and Korea identified 1131 novel genes and artificial selection signals (Liu et al. 2021). Analysis of 672 Vietnamese rice genomes described their classification and identified 21 unique QTLs from 19 traits (Higgins et al. 2021). Genotyping and systematic phenotyping of 200 japonica rice varieties grown in central China over the past 30 years revealed the genetic factors regulating the balance of yield, quality and blast resistance (Xiao et al. 2021).

Artificial selection and breeding signatures during the succession of rice varieties provide deep insight into their genetic improvement. Low-coverage sequencing of 1479 landraces and modern cultivars from 73 countries revealed that 200 regions were differentially selected between two major indica subpopulations, and that yield was correlated with the number of signatures (Xie et al. 2015). Local selective sweeps also showed pressure from artificial selection and farmer cultivation. Genotyping by sequencing of 108 core on-farm conserved rice landraces from Yunnan revealed 186 and 183 potential selective sweeps between different collection dates (Cui et al. 2019). Different selection signatures during breeding were indicated by genetic differentiation between early and late cultivars of indica and japonica in Taiwan (Hour et al. 2020). As the pioneer of the “Green Revolution” for indica rice in China, Guangdong province has a rich diversity of germplasm. However, large-scale population genomics of rice landraces and improved varieties for the study of genetic diversity and the identification of regional breeding signatures is still lacking.

In this study, a core germplasm of locally planted rice accessions, comprising landraces and cultivars from Guangdong of southern China from before and after the rice “Green Revolution”, was collected. Agronomic traits were systematically investigated in field experiments, and the accessions were genotyped by 10-fold depth genome resequencing. Genetic diversity was analyzed, and breeding signatures for the modern cultivar subpopulation were identified and annotated with QTLs of eleven agronomic traits. Specific genetic variations and favorable alleles fixed in the subpopulation of modern cultivars were identified that can be used for molecular marker-assisted breeding.

Results

Genetic Diversity and Population Structure Analysis

A total of 517 accessions consisting mainly of indica rice germplasm, including 358 landraces and 159 artificially improved cultivars from Guangdong, China, were used to identify genomic variations (Additional file 1: Table S1). The 479 newly sequenced accessions generated 24.23 million 100 bp paired-end sequencing reads for each accession. Quality assessment of these sequencing reads showed that the average percentage of Q30 bases (99.9% base call accuracy) was 99.65% (Additional file 1: Table S2). The average mapping depth against the MSU7 reference genome was 12.02 with a 91.68% coverage ratio (Additional file 1: Table S3). On average, 2.04 million SNPs, 82.34 thousand insertions and 127.99 thousand deletions were identified for the 517 indica rice accessions (Additional file 1: Table S4).

Population structure analysis was conducted by principal components analysis (PCA), phylogenetic and admixture analysis. In the admixture analysis, when the subpopulation number (k) was 2, group 1 contained 211 accessions (201 landraces and 10 cultivars) and group 2 contained 306 accessions (157 landraces and 149 cultivars). When k = 3, three groups could be identified, namely group 1 (20 landraces and 116 cultivars), group 2 (189 landraces and 11 cultivars) and group 3 (149 landraces and 32 cultivars) (Additional file 2: Fig. S1). By integrating PCA (Fig. 1A), the phylogenetic tree (Fig. 1B) and admixture analysis (Fig. 1C), a total of 4 subpopulations were finally determined. With reference to 182 accessions from the RiceVarMap database (with three subpopulations named Ind I, Ind II and Ind III), and to the phylogenetic relationship with accessions from 3KRG, these 4 subpopulations were named Ind I (landrace), Ind II (cultivar), Ind IV (landrace) and GJ-tmp in this study (Fig. 2A, Additional file 2: Fig. S2). GJ-tmp diverged from the other subpopulations, and most accessions from GJ-tmp were glutinous rice landraces. Subpopulation Ind I contained a total of 181 accessions, of which 151 were landraces and 30 were cultivars bred before the 1980s. Subpopulation Ind II contained 126 accessions, comprising 117 cultivars and 9 landraces. Subpopulation Ind IV had 189 landraces and 11 cultivars bred before the 1980s (Additional file 1: Table S1). A recently released and refined indica reference genome (9311) was also used to call genetic variations and conduct population structure analysis, which gave similar genetic clustering for all those accessions (Additional file 2: Fig. S3).

The length of linkage disequilibrium (LD) decay for subpopulations Ind I, Ind II and Ind IV, excluding the glutinous rice subpopulation GJ-tmp, was estimated using the square of the correlation coefficient (r²) between variations. The LD decay distances for Ind IV, Ind I and Ind II were 61.0 kb, 110.1 kb and 219.8 kb, respectively. The extended LD decay distance of Ind II indicated that the cultivar subpopulation Ind II underwent artificial selection pressure during the process of genetic improvement. Interestingly, the landrace subpopulation Ind I was probably subject to selection by regional farmer breeders, as its LD decay distance is longer than that of the landrace subpopulation Ind IV (Fig. 2B). Genetic diversity (pi and Tajima’s D) and differentiation (fst) analyses were conducted for the three main subpopulations of Guangdong indica rice. The pi values for Ind IV, Ind I and Ind II were 0.0031, 0.0029 and 0.0029, respectively; Ind IV had the highest pi value, while Ind I and Ind II had similar values. Tajima’s D was positive for Ind IV but negative for Ind I and Ind II, implying a potential selection effect in Ind I and Ind II. The genetic divergence (fst) between Ind IV and Ind I was smaller than that between Ind IV and Ind II, which indicates that Ind II diverged further from Ind IV than Ind I did (Fig. 2C). The phylogenetic tree with 998 common wild rice (Oryza rufipogon Griff.) lines also indicates that the degree of differentiation from wild rice populations, from high to low, was Ind IV, Ind I and Ind II (Fig. 2D). Taken together, we deduced that the genetic differences of these regionally cultivated rice lines were attributable to the cultivation period, as modern cultivars may have high speed and flexible distance in their seed dispersal.

Fig. 1 Population structure analysis of Guangdong indica rice accessions. PCA plot (A), phylogenetic analyses (B) and population structure (C) showing genetic diversity and clustering of all accessions based on whole genome SNP variations. For the PCA plot, PC1 (principal component 1) and PC2 (principal component 2) are shown on the horizontal and vertical axes, and percentages of variance explained are noted in parentheses

Fig. 2 Genetic diversity and phylogenetic relationship of subpopulations for Guangdong indica rice. A PCA plot depicting the comparison of genetic clusters of 479 Guangdong and 220 indica accessions from RiceVarMap database. B Linkage disequilibrium (LD) decay analysis of three main subpopulations. C Genetic diversity and differentiation of three subpopulations for Guangdong indica rice. The size of the circles represents the level of genetic diversity (pi) of the subpopulations, and the length of lines represents fst values between subpopulations. D Phylogenetic tree for Guangdong indica rice and common wild rice (Oryza rufipogon Griff.)

Phenotypic Comparison for Subpopulations

The selection pressure applied by local breeders during rice improvement over the past half century largely changed the agronomic traits between traditional and modern rice accessions. Genetic diversity and LD decay analysis indicate potential selection pressure in Ind I, and an even stronger selection effect in the modern cultivar subpopulation Ind II. The alteration of agronomic traits in these subpopulations recorded the trajectory of this selection. A total of eleven important agronomic traits, including plant height (PH), heading date (HD), yield per plant (YPP), panicle number (PN), grain number per panicle (GNPP), seed setting (SS), thousand grain weight (TGW), panicle length (PL), grain length (GL), grain width (GW) and grain length width ratio (GLWR), were investigated and analyzed.

During the improvement progress from Ind IV through Ind I to Ind II, values of the eleven agronomic traits showed four different types of changing trends. Seed setting rate (Fig. 3a) and grain length (Fig. 3b) increased during the modern breeding process, as shown by the comparison of Ind IV, Ind I and Ind II. Plant height (Fig. 3c) and panicle length (Fig. 3d) decreased during this process, which represents the main phenotypic alteration of the semi-dwarf rice cultivars released during the rice “green revolution” of southern China. Heading date (Fig. 3e), yield per plant (Fig. 3f), grain number per panicle (Fig. 3g) and grain length width ratio (Fig. 3h) fluctuated, declining in Ind I and rising in Ind II. Panicle number (Fig. 3i), thousand grain weight (Fig. 3j) and grain width (Fig. 3k) rose in Ind I but decreased in Ind II. The increase of thousand grain weight, grain length and grain width reflects the selection of high-yield rice lines with large grain size, whereas with the breeding and application of high-quality “Simiao rice” with small and slender grains, Guangdong indica rice showed a decrease of these traits and an increase of grain length width ratio.

Fig. 3 Phenotype comparison of eleven main agronomic traits for three main subpopulations of Guangdong indica rice. Subpopulations were ordered by LD decay distance from short to long, and average trait values for each subpopulation were noted above boxplots

Frequency of Deleterious or Beneficial Allele During Genetic Improvement

[...] II, a total of 146 genomic segments with a total genomic length of 14.59 Mb were identified (Additional file 1: Table S8). Genome-wide association studies (GWAS) were conducted for eleven yield and yield-related traits, and the effects of candidate genes were identified. For instance, Ghd7.1/DTH7 explained 9.50% of the heading date variance; sd1 explained 15.07% of the plant height variance; GS3 explained 4.57%, 9.84% and 6.46% of the phenotypic variance for thousand grain weight, grain length and grain length width ratio; GSE5 explained 11.40%, 22.85% and 11.49% of the phenotypic variance for thousand grain weight, grain width and grain length width ratio; and GS5 explained 40.70% and 4.37% of the phenotypic variance for grain width and grain length width ratio (Additional file 1: Table S9, Additional file 2: Fig. S4). The effects of allele combinations were analyzed for plant height and grain size genes.
Average plant height of accessions with Number of deleterious variations that encode adverse Hap1 Hap2 combination of sd1 and Oshox4 was 116.49  cm, amino acid were predicted in three main subpopulations, Hap1 which significantly lower than 151.70  cm of sd1 and and the number of accessions from landrace subpopula- Hap1 Oshox4 (Additional file  1: Table  S10). A total of 25 tion Ind I and Ind IV were compared cultivar subpopu- major allele combinations of grain size genes GS3, GSE5 lation Ind II. Firstly, deleterious variations identified by and GS5 were detected. Thousand grain weight ranged SIFT software showed the total count of deleterious vari- from 19.56 to 24.22  g, grain length ranged from 7.62 to ations were stepwise decreasing in Ind I (median num- 9.43 mm, grain width ranged from 2.30 to 2.96 mm, and ber was 3255.0) and Ind II (median number was 3287.5) grain length width ratio ranged from 2.66 to 3.99 for the compared with Ind IV (median number was 3472.0), accessions with the 25 allele combinations. For instance, which implying these variations were lost during modern the high quality “Simiao” rice Meixiangzhan2hao have cultivars improvement under artificial selection pressure Hap3 Hap2 Hap6 the combination of GS3 , GSE5 and GS5 with (Fig. 4A and Additional file  1: Table S5). Secondly, a total thousand grain weight of 20.88 g and grain length width of 319 quantitative trait nucleotides (QTNs) of the 212 ratio of 3.99 (Additional file  1: Table S11). The phenotype vital gene in rice of RiceNavi database were used to anno- effects of known genes that genotyped by RiceNavi were tate accessions of three subpopulations (Additional file  1: also evaluated, and several QTNs have potential effects Table  S6). For all genes, the average inferior allele count on Guangdong indica rice phenotype. For instance, two of accessions in Ind IV, Ind I and Ind II were 52.19, 52.96 variations of Hd1 gene (9338004 and 9338220 on Chr6) and 50.32, respectively. 
Inferior allele counts were 17.38, shows effect to promoting heading date by about 9  days 16.34 and 15.72 for yield related genes and 29.83, 30.91 (Additional file  1: Table  S12). Four genes that regulating and 30.88 for Ind IV, Ind I and Ind II, respectively. These eating quality were also genotyped. Ten accessions were results suggesting that modern cultivars of subpopulation genotyped to have fragrance allele of Badh2 gene, and Ind II accumulated favorable alleles, especially for yield 9 of which are Ind II accessions. The only different site related genes during improvement (Fig. 4B and C). How- of two elite cultivars Huanghuazhan and Meixiangzhan- ever, modern cultivars lost some favorable allele of stress 2hao was the presence and absence of the fragrance allele responsive genes (Fig. 4D). of Badh2 gene (Additional file 1: Table S13). XP-CLR analysis were conducted by comparing Ind II with Ind IV (Fig. 5A) and Ind I (Fig. 5B), and the QTLs QTLs and Breeding Signatures of Guangdong Indica Rice were further used to annotate the regions of breed- Breeding signatures of modern cultivar subpopulation ing signatures (Fig.  5C). A total of 24 and 23 intersec- Ind II were identified using significant distorted patterns tions between agronomic QTLs and selected genomic in allele frequency of XP-CLR method. Modern cultivar regions for subpopulation Ind IV and Ind I, respec- subpopulation Ind II was respectively compared with lan- tively. Known vital yield and yield-related genes under drace subpopulation Ind IV and Ind I. For the compari- selection pressure were detected. For instance, sd1, son of Ind IV and Ind II, a total of 150 genomic segments Oshox4 and OsGA2ox5 were found under selection of spanning 15.10  Mb potentially selected genomic regions plant height, OsWDR5, and TAC3 were selected for the by modern cultivar breeding were identified (Additional improvement of grain yield per plant, and favorable file  1: Table S7). 
And for the comparison of Ind I and Ind Hang  et al. Rice (2023) 16:25 Page 7 of 14 Fig. 4 Deleterious variations in accessions of three main subpopulations. A Number of deleterious variations that predicted by SIFT (sorting intolerant from tolerant) software. Number of inferior alleles of all genes (B), yield related genes (C) and stress related genes (D) that annotated using RiceNavi database. Subpopulations were ordered by LD decay distance from short to long alleles of GS3, Osmyb3 and FLO13 were selected for and the selected genes in Ind I were OsGA2ox5 and grain length. Interestingly, different selected genes were Oshox4. For grain yield per plant, TAC3 and OsWDR5 detected for subpopulation Ind IV and Ind II. Plant were under selection pressure in Ind I but not in Ind IV. height genes sd1 and Oshox4 were selected in Ind IV, GS3 was selected for grain length in Ind IV but not in Ind I. GSE5 and OsDER1 were selected for grain width Hang et al. Rice (2023) 16:25 Page 8 of 14 Fig. 5 Selection sweeps for modern cultivar Ind II and the annotation of breeding signatures by QTLs of eleven agronomic traits. Distribution of XP-CLR scores between modern cultivar Ind II and local landrace subpopulations of Ind IV (A) and Ind I (B). C Annotation of breeding signatures using QTLs identified by GWAS and grain length width ratio in subpopulation Ind I, but main haplotypes, GS3-Hap3 was fixed during breed - OsABCG18 was selected in Ind IV (Fig. 5). ing and improvement process for Ind II. Hap3 of grain length gene Osmyb3 was mainly selected for Ind II dur- Allele Fixation During the Breeding Process of Guangdong ing modern rice breeding by one and two variations from Indica Rice Osmyb3-Hap4 and Osmyb3-Hap2. 
For the four haplo- Gene haplotype analysis were conducted to illustrate types of grain weight gene FLO13, Hap3 is a predominant evolutionary relationship and identify selected alleles allele in Ind II, which was selected from FLO13-Hap4 during modern breeding and improvement of subpopula- with one missense variation (Fig.  6). In those modern tion Ind II. A total of 6 potentially favorable alleles were cultivar fixed alleles, six Ind II specific variations were fixed in Ind II for key yield related genes, those genes identified. Oshox4-Hap3 have one specific intron varia - were Oshox4 and OsGA2ox5 for plant height, TAC3 for tion between exon 1 and exon 2, TAC3-Hap3 have three tiller angle and yield, GS3, Osmyb3 and FLO3 for grain cultivar specific variations, GS3-Hap3 have one stop size and weight. Four main haplotypes were identified codon gained variation, and FLO13-Hap3 have one spe- for plant height gene Oshox4, haplotype network analy- cific missense variation in Ind II (Fig. 7). sis revealed that Oshox4-Hap3 (Ind II fixed haplotype) was derived from Oshox4-Hap2, following the variations Discussion of Oshox4-Hap1 (Ind IV and Ind I) and Oshox4-Hap4 The major breakthrough of “green revolution” leads (GJ-tmp). For two main haplotypes of plant height gene quantum leaps of rice productivity (Cheng et  al. 2020), OsGA2ox5, most accessions of Ind II have OsGA2ox5- intensive breeding efforts and artificial selection have Hap2, while landraces of subpopulation Ind I and Ind IV facilitated the significant improvement of indica rice have OsGA2ox5-Hap1. Tiller angle and yield related gene yield in Guangdong, where the “green revolution” started TAC3 have four main haplotypes in these accessions, in China. Unlike previously reported selection analy- and TAC3-Hap3 are a fixed haplotype of Ind II modern sis of rice accession from multiple geographic positions cultivars. Grain length and weight gene GS3 have five (Li et al. 2020a, b; Lv et al. 2020; Xie et al. 
2015; Xu et al. Hang  et al. Rice (2023) 16:25 Page 9 of 14 Oshox4-Hap1 10 samples (n=287) B 1 samples Ind I Ind II GJ-tmp Ind IV Oshox4-Hap4 (n=8) OsGA2ox5-Hap2 (n=87) Oshox4-Hap2 (n=183) OsGA2ox5-Hap1 (n=412) Oshox4-Hap3 (n=15) GS3-Hap1 GS3-Hap2 (n=222) D (n=161) C TAC3-Hap4 (n=19) TAC3-Hap3 (n=62) TAC3-Hap1 TAC3-Hap2 GS3-Hap4 GS3-Hap5 (n=258) (n=140) (n=18) (n=23) GS3-Hap3 (n=73) Osmyb3-Hap1 (n=238) FLO13-Hap1 (n=229) Osmyb3-Hap2 (n=125) FLO13-Hap2 FLO13-Hap4 (n=178) (n=13) Osmyb3-Hap4 FLO13-Hap3 (n=78) (n=21) Osmyb3-Hap5 (n=17) Osmyb3-Hap3 (n=86) Fig. 6 Haplotype networks of six key yield related genes in selected regions. A Oshox4 related to plant height, B OsGA2ox5 related to plant height, C TAC3 related to tiller angle, D GS3 related to grain length and weight, E Osmyb3 related to grain length, and F FLO13 related to grain weight. Circle size is proportional to accession number of one haplotype. Number of ticks in network edges represents the variation counts between haplotypes Hang et al. Rice (2023) 16:25 Page 10 of 14 Variaton effect type Chr9:17903164 17904164 17905164 intron_variant synonymous_variant 3_prime_UTR_variant missense_variant downstream_gene_variant Oshox4-Hap1 GC GG TC A splice_region_variant&intron_variant Oshox4-Hap2 GT CA CT A disruptive_inframe_deletion stop_gained Oshox4-Hap3 AT CA CT A Unique variation for Ind II Oshox4-Hap4 GC GG CC G group Chr3:29583117 29584117 CDS five_prime_UTR three_prime_UTR TAC3-Hap1 AT TTTTT TG A GG TAC3-Hap2 CT AC TT GT GAGA GG G TAC3-Hap3 ACAC TT GC GAGA GA G TAC3-Hap4 AT TC GTAGAA GT GAGA GG T Chr3:16729501 16730501 16731501 16732501 16733501 16734501 GS3-Hap1 CA G GS3-Hap2 GA G GS3-Hap3 GAAGGT GS3-Hap4 GAAGGG GS3-Hap5 CAAGGG Chr2:35013566 35014566 35015566 35016566 35017566 FLO13-Hap1 TC A FLO13-Hap2 GC T FLO13-Hap3 GG A FLO13-Hap4 G C A Fig. 7 Gene structure and haplotype specific variations of Oshox4 (A), TAC3 (B), GS3 (C) and FLO13 (D) for subpopulation Ind II. 
Background colors of bases represent annotation and effect level of the variation. Red stars mark the specific variations for the haplotype of Ind II 2016; Ye et  al. 2022), we focusing on the locally adapta- modern improved cultivars. For example, modern breed- tive selection of rice in Guangdong by comparing the ing of rice quality started from the end of last century in genomic variations and agronomic traits of locally culti- Guangdong favors slender grain type for regional appe- vated landraces by farmers before “green revolution” and tite in southern Aisa, which makes the increase of grain Hang  et al. Rice (2023) 16:25 Page 11 of 14 length and width ratio, and decrease of grain weight in Agricultural Sciences for two seasons of 2019 and 2021 subpopulation of modern cultivars. under conventional field management. All accessions The artificial replacement of deleterious variations with were planted in 1437 blocks under randomized complete favorable alleles during breeding and improvement are block design with three replications. A total of eleven meaningful to meet social demands for crop and food. agronomic traits were investigated and calculated. Plant By integrating GWAS for agronomic traits and breeding height (PH) was measured as length from the ground to signatures for selected regions, regionally selected key the highest point of the plant, heading date (HD) was genes that influencing yield and yield related traits for recorded when half of the plants in a block have reached Guangdong indica rice were identified. Oshox4 was iden- the heading stage and days from sowing to heading is tified as a selected gene for plant height in our accessions, calculated, yield per plant (YPP) was the total weight of which plays negative function in gibberellin responses filled grains per plant, panicle number (PN) was the num - and influencing plant height and tiller number (Dai et al. ber of effective panicles per mature plant, grain number 2008; Zhou et  al. 2015). 
An Ind II specific haplotype per panicle (GNPP) was the mean total number of grains (Oshox4-Hap3) were identified for cultivars. Another per panicle on a single plant, seed setting (SS) was calcu- regulator for rice growth and architecture, OsGA2ox5, lated as the percentage of filled grains to total grains per were also identified as selected gene of plant height for plant, grain length (GL), grain width (GW) and panicle cultivar subpopulation Ind II when compared with Ind I length (PL) were measured when seeds are mature (Yu (Lo et al. 2008), and OsGA2ox5-Hap2 was a selected hap- et  al. 2020). Thousand grain weight (TGW) were calcu - lotype by cultivars from Ind II. Tiller angle gene TAC3 lated by the division of total weight to total filled grain were selected for yield per plant of Ind II when compared number, and grain length width ratio (GLWR) was the with Ind I, and TAC3-Hap3 were selected by cultivars division of GL to GW. from Ind II (Dong et  al. 2016). GS3 and Osmyb3 were identified to be selected for Ind II when compared with Genomic Resequencing and Variation Calling Ind IV, and the Hap3 of these two genes were selected Young leaves of 358 landrace and 121 improved culti- for cultivars (Li et al. 2020a, b; Fan et al. 2006; Mao et al. vars of indica rice accessions from Guangdong province 2010). Starch biosynthesis and grain weight gene FLO13 of southern China were collected to construct sequenc- was selected for Ind II compared with Ind I, and FLO13- ing libraries according to the manufacturer’s instructions, Hap3 with a unique missense variant in cultivars was and qualified libraries were sequenced using Illumina a potential selected haplotype (Hu et  al. 2018). These HiSeq platform. A total of 38 Guangdong improved cul- selected favorable haplotypes are promising functional tivars were collected from the NCBI SRA database with alleles regulating the yield improvement of Guangdong accession numbers of PRJNA321462, PRJNA522896 modern cultivars. 
and PRJNA656900 (Additional file  1: Table  S1). Qual- ity of raw sequencing data were accessed using FastQC Conclusions (v0.11.9) software (Andrews 2010), and low-quality data In summary, large-scale genomic and yield assessment were trimmed using TrimGalore (version 0.6.6) to gener- of Guangdong landrace and modern cultivars promotes ate clean sequencing data. Clean data were mapped onto analysis of their diversity, classification and phyloge - reference genome (MSU7) using BWA (0.7.17-r1188) netic relationship. We revealed less deleterious varia- software with default parameter (Li and Durbin 2009). tions number in modern cultivars than landrace. Selected MarkDuplicates in Picard (2.12.1) was used to eliminate genomic regions were also identified and annotated using PCR duplication and sorting BAM files, and genome - GWAS of vital agronomic traits, which leads the iden- CoverageBed of bedtools (v2.27.1) was used to calculate tification of selected key genes during Guangdong rice genome coverage ratios. SNPs (single nucleotide poly- breeding and improvement. These results shed light on morphisms) and InDels (insertions and deletions) were regionally breeding trajectory and artificial selection, and then called using HaplotypeCaller of Genome Analysis provides valuable resources for rational design of molec- Toolkit (GATK, version 4.2.2.0) pipeline (McKenna et al. ular breeding. 2010), and annotated using SnpEff (4.3 s) with the GFF3 file of MSU7 reference genome (Cingolani et al. 2012). Materials and Methods Field Experimental Design and Phenotyping of Agronomic Population Structure Analysis Traits Principal components analysis (PCA), phylogenetic and Agronomic traits of 479 accessions of Guangdong rice admixture analysis were employed to classify subpopu- core germplasm was investigated on experimental field lations (Yu et  al. 2021). 
For population structure analy- of Rice Research Institute, Guangdong Academy of sis, SNP variations were filtered using VCFtools (0.1.16) Hang et al. Rice (2023) 16:25 Page 12 of 14 software with parameter “–max-missing 0.95” and “–maf were filtered out as such short blocks unlikely selected 0.05”. PCA method with kmeans clustering algorithm in during the short history of modern rice improvement CropGBM software was used to reducing dimensions of breeding. Long blocks with top 1% values of XP-CLR genotypic data (Yan et  al. 2021), and eigenvalues were scores were finally considered as selected regions. calculated by plink software. Phylogenetic relationship was constructed using VCF2Dis software (https:// github. GenomeW ‑ ide Association Analysis com/ BGI- shenz hen/ VCF2D is) and illustrated by using For genome-wide association analysis (GWAS), multi- FigTree (v1.4.3) software (https:// github. com/ ramba ut/ sample VCF file of genomic variations was converted into figtr ee). ADMIXTURE (version 1.3.0) software (Alex- plink file and variations were screened with parameters ander and Lange 2011) was used to analyze population of “–geno 0.1 –mind 0.4 –maf 0.05”. PCA analysis were structure with k values ranged from 2 to 12. The ancestry conducted by plink software with five major components distributions of individuals were visualized using R script. (Purcell et  al. 2007). Kinship analysis and GWAS were Genetic diversity (pi and Tajima’s D) and differentiation conducted using GEMMA software using filtered geno - (fst) analysis were conducted using VCFtools (0.1.16) types and eleven agronomic traits (Zhou and Stephens software with 100  kb sliding windows. Linkage disequi- 2012). librium (LD) decay for each subpopulation was estimated and plotted using PopLDdecay (Zhang et al. 2019). National indica rice sequencing data from RiceVarMap Gene Haplotype Reconstruction and Network Analysis database (Zhao et  al. 
2015), variations of 3024 3KRG Software beagle (version 5.2) was used to impute miss- accession from SNP-seek database (Locedie et  al. 2017) ing genetic variations that generated by GATK (Brown- and variations of 998 wild rice lines from our recently ing et  al. 2021). Genomic variations of selected genes research (Zhang et  al. 2022) that used to conduct popu- were extracted based on the positions by using BCFTools lation structure and phylogenetic analysis in this study (Li 2011). Haplotype network of these genes were con- were subjected to the same data processing pipeline. structed by our previously described method (Yu et  al. A recently released and refined indica 9311 reference 2021). Haplotype network was constructed and illus- genome was also used to check the results of population trated by Popart software (Leigh and Bryant 2015). structure analysis (Wang et al. 2022). Abbreviations Estimation of Variation Eec ff ts and Deleterious Mutation SNP Single-nucleotide polymorphisms Prediction PCA Principal component analysis XP-CLR The cross-population composite likelihood ratio test Functional alteration of genomic variations was predicted QTLs Quantitative trait locus using Sorting Intolerant From Tolerant 4G (SIFT 4G) GWAS Genome-wide association studies software (Vaser et  al. 2016). Variations with SIFT scores LD Linkage disequilibrium QTNs Quantitative trait nucleotides smaller than 0.05 was considered as putatively deleteri- KRG 3000 Rice genome project ous variations. Allele function for known genes with vital role in rice were annotated using RiceNavi database (Wei Supplementary Information et  al. 2021). Advantage and inferior allele were deter- The online version contains supplementary material available at https:// doi. mined by manually check of allele functional alteration org/ 10. 1186/ s12284- 023- 00642-3. and its corresponding trait. 
Number of inferior alleles for accessions from each subpopulation were counted and Additional file 1: Table S1. Information for accessions used in this study. Table S2. Quality assessment of genome sequencing data. Table S3. plotted using boxplot or violin plot in R. Genome mapping quality and genome coverage of genome sequencing data. Table S4. Genomic variations for 517 indica rice accessions against the MSU reference genome. Table S5. Number of deleterious variations Identification of Breeding Signatures Using XP‑CLR in subpopulation of landrace and cultivar. Table S6. Genotyping results Breeding signatures for artificial selection were identified 319 quantitative trait nucleotides (QTNs) of the 212 vital gene in rice from by using the cross-population composite likelihood ratio RiceNavi database. Table S7. Genomic segments that identified as breed- ing signatures between Ind IV and Ind II. Table S8. Genomic segments test (XP-CLR) method (Chen et al. 2010) and its updated that identified as breeding signatures between Ind I and Ind II. Table S9. version of python module (https:// github. com/ hardi ngnj/ QTLs and known genes that identified by GWAS of eleven agronomic xpclr). XP-CLR was conducted between subpopulations traits for Guangdong indica rice. Table S10. Allele combinations of plant height (PH) genes of Guangdong indica rice germplasm. Table S11. Allele of landrace and cultivar with 10  kb sliding windows. combinations of thousand grain weight ( TGW ), grain length (GL), grain Genomic segments with XP-CLR values above the 80th width (GW ) and grain length width ratio (GLWR) genes of Guangdong percentile was considered as putatively selected regions. indica rice germplasm. Table S12. Phenotypic effect assessment of known QTNs that genotyped by RiceNavi. Table S13. Genotyping of 4 eating Adjacent segments within 20  kb distance were then quality genes. merged into longer blocks, and blocks shorter than 40 kb Hang  et al. 
Additional file 2: Fig. S1. Admixture analysis when subpopulation number (k) was set to two (a) and three (b). Numbers of cultivars (Cul) and landraces (Lan) are noted in parentheses for each subgroup. Fig. S2. Population structure analysis of Guangdong indica rice with accessions from the RiceVarMap2 and 3KRG databases. Fig. S3. Population structure analysis of Guangdong indica rice accessions using indica rice 9311 as the reference genome. Fig. S4. Manhattan plots for genome-wide association analysis of eleven agronomic traits.

Acknowledgements

The authors are grateful to all lab members for their assistance in field experiments, and appreciate the tremendous dedication of rice scientists to germplasm collection and breeding of Guangdong rice.

Author contributions

CL and HY conceived and designed the experiments, and wrote the paper. HY, YL, BRS, QL, XXM, LQJ, SWL, JZ, PLC, DJP, WFC, ZLF performed the experiments and analyzed the data. CL and HY contributed reagents/materials/analysis tools. All authors read and approved the final manuscript.

Funding

This work was supported by Natural Science Foundation of Guangdong Province (2022A1515011741), Special Funds for Scientific Innovation Strategy-Construction of High Level Academy of Agriculture Science (R2021YJYB3017), Guangzhou Science and Technology Plan Project (2023A04J0144), Key Field Research and Development Project of Guangdong Province (2022B0202110003), Seed Industry Revitalization Project of Special Fund for Rural Revitalization Strategy in Guangdong Province (2022NJS00004, 2022NPY00011) and The Project of Collaborative Innovation Center of GDAAS (XTXM202203).

Availability of Data and Materials

The raw reads of whole-genome resequencing are available at the NCBI Sequence Read Archive with accession ID PRJNA934413. The sequences and annotations of reference genome MSU7 are available from the website http://rice.plantbiology.msu.edu/.

Declarations

Ethics Approval and Consent to Participate

Not applicable.

Consent for Publication

Not applicable.

Competing Interests

The authors have declared that no competing interests exist.

Received: 14 February 2023 Accepted: 14 May 2023

References

Alexander DH, Lange K (2011) Enhancements to the ADMIXTURE algorithm for individual ancestry estimation. BMC Bioinf 12:246
Andrews S (2010) FastQC: a quality control tool for high throughput sequence data [Online]. Available online at: http://www.bioinformatics.babraham.ac.uk/projects/fastqc/
Browning BL, Tian X, Zhou Y, Browning SR (2021) Fast two-stage phasing of large-scale sequence data. Am J Hum Genet 108:1880–1890
Chen H, Patterson N, Reich D (2010) Population differentiation as a test for selective sweeps. Genome Res 20:393–402
Cheng F, Quan X, Zhengjin X, Wenfu C (2020) Effect of rice breeding process on improvement of yield and quality in China. Rice Sci 27:363–367
Cingolani P, Platts A, Wang LL, Coon M, Nguyen T, Wang L, Land SJ, Lu X, Ruden DM (2012) A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff: SNPs in the genome of Drosophila melanogaster strain w1118; iso-2; iso-3. Fly 6:80–92
Cui D, Lu H, Tang C, Li J, Yu T, Ma X, Zhang E, Wang Y, Cao G, Xu F, Qiao Y, Dai L, Li R, Tian S, Koh HJ, Han L (2019) Genomic analyses reveal selection footprints in rice landraces grown under on-farm conservation conditions during a short-term period of domestication. Evolut Appl 13:290–302
Dai M, Hu Y, Ma Q, Zhao Y, Zhou D (2008) Functional analysis of rice HOMEOBOX4 (Oshox4) gene reveals a negative function in gibberellin responses. Plant Mol Biol 66:289–301
Dong H, Zhao H, Xie W, Han Z, Li G, Yao W, Bai X, Hu Y, Guo Z, Lu K, Yang L, Xing Y (2016) A novel tiller angle gene, TAC3, together with TAC1 and D2 largely determine the natural variation of tiller angle in rice cultivars. PLoS Genet 12:e1006412
Fan C, Xing Y, Mao H, Lu T, Han B, Xu C, Li X, Zhang Q (2006) GS3, a major QTL for grain length and weight and minor QTL for grain width and thickness in rice, encodes a putative transmembrane protein. Theor Appl Genet 112:1164–1171
Fuentes RR, Chebotarov D, Duitama J, Smith S, De la Hoz JF, Mohiyuddin M, Wing RA, McNally KL, Tatarinova T, Grigoriev A, Mauleon R, Alexandrov N (2019) Structural variants in 3000 rice genomes. Genome Res 29:870–880
Higgins J, Santos B, Khanh TD, Trung KH, Duong TD, Doai NTP, Khoa NT, Ha DTT, Diep NT, Dung KT, Phi CN, Thuy TT, Tuan NT, Tran HD, Trung NT, Giang HT, Nhung TK, Tran CD, Lang SV, Nghia LT, Van Giang N, Xuan TD, Hall A, Dyer S, Ham LH, Caccamo M, De Vega JJ (2021) Resequencing of 672 native rice accessions to explore genetic diversity and trait associations in Vietnam. Rice 14:1–16
Hour A, Hsieh W, Chang S, Wu Y, Chin H, Lin Y (2020) Genetic diversity of landraces and improved varieties of rice (Oryza sativa L.) in Taiwan. Rice 13:1–12
Hu T, Tian Y, Zhu J, Wang Y, Jing R, Lei J, Sun Y, Yu Y, Li J, Chen X, Zhu X, Hao Y, Liu L, Wang Y, Wan J (2018) OsNDUFA9 encoding a mitochondrial complex I subunit is essential for embryo development and starch synthesis in rice. Plant Cell Rep 37:1667–1679
Huang X, Wei X, Sang T, Zhao Q, Feng Q, Zhao Y, Li C, Zhu C, Lu T, Zhang Z, Li M, Fan D, Guo Y, Wang A, Wang L, Deng L, Li W, Lu Y, Weng Q, Liu K, Huang T, Zhou T, Jing Y, Li W, Lin Z, Buckler ES, Qian Q, Zhang Q, Li J, Han B (2010) Genome-wide association studies of 14 agronomic traits in rice landraces. Nat Genet 42:961–967
Huang X, Kurata N, Wei X, Wang Z, Wang A, Zhao Q, Zhao Y, Liu K, Lu H, Li W, Guo Y, Lu Y, Zhou C, Fan D, Weng Q, Zhu C, Huang T, Zhang L, Wang Y, Feng L, Furuumi H, Kubo T, Miyabayashi T, Yuan X, Xu Q, Dong G, Zhan Q, Li C, Fujiyama A, Toyoda A, Lu T, Feng Q, Qian Q, Li J, Han B (2012) A map of rice genome variation reveals the origin of cultivated rice. Nature 490:497–501
Huang X, Yang S, Gong J, Zhao Q, Feng Q, Zhan Q, Zhao Y, Li W, Cheng B, Xia J, Chen N, Huang T, Zhang L, Fan D, Chen J, Zhou C, Lu Y, Weng Q, Han B (2016) Genomic architecture of heterosis for yield traits in rice. Nature 537:629–633
Kumar A, Daware A, Kumar A, Kumar V, Gopala KS, Mondal S, Patra BC, Singh AK, Tyagi AK, Parida SK, Thakur JK (2020) Genome-wide analysis of polymorphisms identified domestication-associated long low-diversity region carrying important rice grain size/weight quantitative trait loci. Plant J 103:1525–1547
Leigh JW, Bryant D (2015) POPART: full-feature software for haplotype network construction. Methods Ecol Evol 6:1110–1116
Li H (2011) A statistical framework for SNP calling, mutation discovery, association mapping and population genetical parameter estimation from sequencing data. Bioinformatics 27:2987–2993
Li H, Durbin R (2009) Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 25:1754–1760
Li Q, Lu L, Liu H, Bai X, Zhou X, Wu B, Yuan M, Yang L, Xing Y (2020a) A minor QTL, SG3, encoding an R2R3-MYB protein, negatively controls grain length in rice. Theor Appl Genet 133:2387–2399
Li X, Chen Z, Zhang G, Lu H, Qin P, Qi M, Yu Y, Jiao B, Zhao X, Gao Q, Wang H, Wu Y, Ma J, Zhang L, Wang Y, Deng L, Yao S, Cheng Z, Yu D, Zhu L, Xue Y, Chu C, Li A, Li S, Liang C (2020b) Analysis of genetic architecture and favorable allele usage of agronomic traits in a large collection of Chinese rice accessions. Sci China Life Sci 63:1688–1702
Liu C, Peng P, Li W, Ye C, Zhang S, Wang R, Li D, Guan S, Zhang L, Huang X, Guo Z, Guo J, Long Y, Li L, Pan G, Tian B, Xiao J (2021) Deciphering variation of
Rice (2023) 16:25 Page 14 of 14 239 elite japonica rice genomes for whole genome sequences-enabled Yu H, Li Q, Li Y, Yang H, Lu Z, Wu J, Zhang Z, Shahid MQ, Liu X (2021) Genomics breeding. Genomics 113:3083–3091 analyses reveal unique classification, population structure and novel Lo S, Yang S, Chen K, Hsing Y, Zeevaart JAD, Chen L, Yu S (2008) A novel class of allele of neo-tetraploid rice. Rice 14:16 gibberellin 2-oxidases control semidwarfism, tillering, and root develop - Zhang C, Dong S, Xu J, He W, Yang T (2019) PopLDdecay: a fast and effective ment in rice. Plant Cell 20:2603–2618 tool for linkage disequilibrium decay analysis based on variant call format Locedie M, Roven RF, Frances NB, Jeffery D, Juan MA, Dmytro C, Millicent S, files. Bioinformatics 35:1786–1788 Kevin P, Dario C, Alexandre P, Inna D, Victor S, Rod AW, Ruaraidh SH, Ramil Zhang J, Pan D, Fan Z, Yu H, Jiang L, Lv S, Sun B, Chen W, Mao X, Liu Q, Li C M, Kenneth LM, Nickolai A (2017) Rice SNP-seek database update: new (2022) Genetic diversity of wild rice accessions (Oryza rufipogon Griff.) in SNPs, indels, and queries. Nucleic Acids Res 45(D1):D1075–D1081 Guangdong and Hainan provinces, China, and construction of a wild rice Lv Q, Li W, Sun Z, Ouyang N, Jing X, He Q, Wu J, Zheng J, Zheng J, Tang S, Zhu core collection. Front Plant Sci 13:999454 R, Tian Y, Duan M, Tan Y, Yu D, Sheng X, Sun X, Jia G, Gao H, Zeng Q, Li Y, Zhao H, Yao W, Ouyang Y, Yang W, Wang G, Lian X, Xing Y, Chen L, Xie W (2015) Tang L, Xu Q, Zhao B, Huang Z, Lu H, Li N, Zhao J, Zhu L, Li D, Yuan L, Yuan RiceVarMap: a comprehensive database of rice genomic variations. D (2020) Resequencing of 1,143 indica rice accessions reveals important Nucleic Acids Res 43(D1):D1018–D1022 genetic variations and different heterosis patterns. Nature Commun Zhou X, Stephens M (2012) Genome-wide efficient mixed-model analysis for 11:4778 association studies. 
Nat Genet 44:821–824 Mao H, Sun S, Yao J, Wang C, Yu S, Xu C, Li X, Zhang Q (2010) Linking differen- Zhou W, Malabanan PB, Abrigo E (2015) OsHox4 regulates GA signaling tial domain functions of the GS3 protein to natural variation of grain size by interacting with DELLA-like genes and GA oxidase genes in rice. in rice. Proc Natl Acad Sci 107:19579–19584 Euphytica 201:97–107 McKenna A, Hanna M, Banks E, Sivachenko A, Cibulskis K, Kernytsky A, Gari- mella K, Altshuler D, Gabriel S, Daly M, DePristo MA (2010) The genome Publisher’s Note analysis toolkit: a MapReduce framework for analyzing next-generation Springer Nature remains neutral with regard to jurisdictional claims in pub- DNA sequencing data. Genome Res 20:1297–1303 lished maps and institutional affiliations. Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MAR, Bender D, Maller J, Sklar P, de Bakker PIW, Daly MJ, Sham PC (2007) PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet 81:559–575 Vaser R, Adusumalli S, Leng SN, Sikic M, Ng PC (2016) SIFT missense predictions for genomes. Nat Protoc 11:1–9 Wang W, Mauleon R, Hu Z, Chebotarov D, Tai S, Wu Z, Li M, Zheng T, Fuentes RR, Zhang F, Mansueto L, Copetti D, Sanciangco M, Palis KC, Xu J, Sun C, Fu B, Zhang H, Gao Y, Zhao X, Shen F, Cui X, Yu H, Li Z, Chen M, Detras J, Zhou Y, Zhang X, Zhao Y, Kudrna D, Wang C, Li R, Jia B, Lu J, He X, Dong Z, Xu J, Li Y, Wang M, Shi J, Li J, Zhang D, Lee S, Hu W, Poliakov A, Dubchak I, Ulat VJ, Borja FN, Mendoza JR, Ali J, Li J, Gao Q, Niu Y, Yue Z, Naredo MEB, Talag J, Wang X, Li J, Fang X, Yin Y, Glaszmann J, Zhang J, Li J, Hamilton RS, Wing RA, Ruan J, Zhang G, Wei C, Alexandrov N, McNally KL, Li Z, Leung H (2018) Genomic variation in 3,010 diverse accessions of Asian cultivated rice. 
Nature 557:43–49 Wang S, Gao S, Nie J, Tan X, Xie J, Bi X, Sun Y, Luo S, Zhu Q, Geng J, Liu W, Lin Q, Cui P, Hu S, Wu S (2022) Improved 93–11 genome and time-course tran- scriptome expand resources for rice genomics. Front Plant Sci 12:769700 Wei X, Qiu J, Yong K, Fan J, Zhang Q, Hua H, Liu J, Wang Q, Olsen KM, Han B, Huang X (2021) A quantitative genomics map of rice provides genetic insights and guides breeding. Nat Genet 53:243–253 Xiao N, Pan C, Li Y, Wu Y, Cai Y, Lu Y, Wang R, Yu L, Shi W, Kang H, Zhu Z, Huang N, Zhang X, Chen Z, Liu J, Yang Z, Ning Y, Li A (2021) Genomic insight into balancing high yield, good quality, and blast resistance of japonica rice. Genome Biol 22:1–22 Xie W, Wang G, Yuan M, Yao W, Lyu K, Zhao H, Yang M, Li P, Zhang X, Yuan J, Wang Q, Liu F, Dong H, Zhang L, Li X, Meng X, Zhang W, Xiong L, He Y, Wang S, Yu S, Xu C, Luo J, Li X, Xiao J, Lian X, Zhang Q (2015) Breeding sig- natures of rice improvement revealed by a genomic variation map from a large germplasm collection. Proc Natl Acad Sci 112:E5411–E5419 Xu X, Liu X, Ge S, Jensen JD, Hu F, Li X, Dong Y, Gutenkunst RN, Fang L, Huang L, Li J, He W, Zhang G, Zheng X, Zhang F, Li Y, Yu C, Kristiansen K, Zhang X, Wang J, Wright M, McCouch S, Nielsen R, Wang J, Wang W (2012) Resequencing 50 accessions of cultivated and wild rice yields markers for identifying agronomically important genes. Nat Biotechnol 30:105–111 Xu Q, Yuan X, Wang S, Feng Y, Yu H, Wang Y, Yang Y, Wei X, Li X (2016) The genetic diversity and structure of Indica rice in China as detected by single nucleotide polymorphism analysis. BMC Genet 17:1–8 Yan J, Xu Y, Cheng Q, Jiang S, Wang Q, Xiao Y, Ma C, Yan J, Wang X (2021) LightGBM: accelerated genomically designed crop breeding through ensemble learning. Genome Biol 22:1–24 Ye J, Zhang M, Yuan X, Hu D, Zhang Y, Xu S, Li Z, Li R, Liu J, Sun Y, Wang S, Feng Y, Xu Q, Yang Y, Wei X (2022) Genomic insight into genetic changes and shaping of major inbred rice cultivars in China. 
New Phytol 236:2311 Yu H, Shahid MQ, Li Q, Li Y, Li C, Lu Z, Wu J, Zhang Z, Liu X (2020) Production assessment and genome comparison revealed high yield potential and novel specific alleles associated with fertility and yield in neo-tetraploid rice. Rice 13:32

Journal

Rice (Springer Journals)

Published: Dec 1, 2023

Keywords: Rice; Yield improvement; Resequencing; Breeding signature; GWAS

References