Background
Assessing the performance of elite lines in target environments is essential for breeding programs to select the most relevant genotypes. One of the main complexities in this task resides in accounting for the genotype by environment interactions. Genomic prediction models that integrate information from multi-environment trials and environmental covariates can be efficient tools in this context. The objective of this study was to assess the predictive ability of different genomic prediction models to optimize the use of multi-environment information. We used 111 elite breeding lines representing the diversity of the International Rice Research Institute (IRRI) breeding program for irrigated ecosystems. The lines were evaluated for three traits (days to flowering, plant height, and grain yield) in 15 environments in Asia and Africa and genotyped with 882 SNP markers. We evaluated the efficiency of genomic prediction to predict untested environments using seven multi-environment models and three cross-validation scenarios.

Results
The elite lines were found to belong to the indica group and more specifically the indica-1B subgroup, which gathered improved material originating from the Green Revolution. Phenotypic correlations between environments were high for days to flowering and plant height (33% and 54% of pairwise correlations greater than 0.5) but low for grain yield (lower than 0.2 in most cases). Clustering analyses based on environmental covariates separated Asia’s and Africa’s environments into different clusters or subclusters. The predictive abilities ranged from 0.06 to 0.79 for days to flowering, from 0.25 to 0.88 for plant height, and from −0.29 to 0.62 for grain yield. We found that models integrating genotype-by-environment interaction effects did not perform significantly better than models integrating only main effects (genotypes and environment or environmental covariates).
The different cross-validation scenarios showed that, in most cases, the use of all available environments gave better results than a subset.

Conclusion
Multi-environment genomic prediction models with main effects were sufficient for accurate phenotypic prediction of elite lines in targeted environments. These results will help refine the testing strategy to update the genomic prediction models to improve predictive ability.

Keywords
Rice, Oryza sativa, Elite lines, Genomic prediction, Genotype by environment interactions, Environmental covariates, Multi-environment genomic prediction models

*Correspondence: Jérôme Bartholomé, email@example.com. Full list of author information is available at the end of the article.

© The Author(s) 2023. Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Nguyen et al. Rice (2023) 16:7

Introduction
Rice (Oryza sativa L.) is one of the most important food crops in the world and in Asia in particular. About 3.5 billion people depend on rice as their main food source. As the world’s population increases, the demand for rice will be under pressure, as an estimated 116 million additional tons of rice will be needed to meet demand by 2035 (Seck et al. 2012). In this context, genetic improvement for yield potential is considered to be one of the most effective strategies to meet this growing demand and also to address the growing impact of climate change on rice production (Saito et al. 2021). Rice breeders, therefore, must increase yield potential at a greater pace (Cobb et al. 2019). However, the use of conventional breeding methods is time-consuming and can take up to ten years to develop and evaluate new elite varieties (Collard and Mackill 2008). To some extent, the advances of marker-assisted selection (MAS) enable faster development of new varieties but are limited to the introgression of favorable alleles of major genes or quantitative trait loci (QTLs) with large effects, mainly related to abiotic (e.g. submergence, salinity) or biotic (e.g. blast, bacterial leaf blight) stress tolerance, into elite backgrounds (Gregorio et al. 2013; Jena and Mackill 2008). MAS is not tailored to enhance the effectiveness of breeding strategies for quantitative traits like grain yield, which are governed by a large number of genes or QTLs with small effects (Jena and Mackill 2008).

With the reduction in genotyping costs, genomic selection (GS) has arisen as a more efficient option for breeding program optimization (Ahmadi et al. 2020; Heffner et al. 2009). GS can accelerate the rate of genetic gain without significantly increasing the size of the breeding program by reducing the length of the breeding cycle (Cobb et al. 2019). GS uses genome-wide markers (mainly SNP markers) to predict the genomic estimated breeding values (GEBV) of selection candidates based on statistical models trained on a reference population that is both genotyped and phenotyped (Ahmadi et al. 2020; Jannink et al. 2010; Meuwissen et al. 2001). Since 2010, many GS studies have been published on small grain crops such as wheat, barley, oats, or rice, indicating that GS has been successfully applied in cereal breeding programs to increase the rate of genetic gain (Crossa et al. 2017). More recently, genomic prediction models integrating multi-environment data have emerged in the plant breeding community in order to increase accuracy by modeling the genotype-by-environment interactions (G × E) rather than ignoring them (Burgueño et al. 2012; Heslot et al. 2013; Jarquín et al. 2014; Lopez-Cruz et al. 2015). The G × E interactions in plant breeding are usually evaluated through multi-environment trials and refer to changes in the ranking of genotypes between environments (Freeman 1973). The G × E analysis also plays a key role in evaluating the stability of genotypes across environments (Cooper et al. 1993; Elias et al. 2016). Crossa et al. (2022) have recently reviewed the evolution of genomic prediction models that consider G × E interactions. Burgueño et al. (2012) and Schulz-Streeck et al. (2013) proposed the first multi-environment prediction models. These models were subsequently enhanced by using different statistical regressions and kernel methods (Crossa et al. 2019; Cuevas et al. 2016, 2019; Lopez-Cruz et al. 2015; Montesinos et al. 2016, 2018), or by using crop growth models (Cooper 2015; Heslot et al. 2014; Messina et al. 2017; Rincent et al. 2017) and recently by using reaction-norm models integrating the information of environmental covariates, such as weather and soil information of the experimental trials, for prediction in the context of G × E (Costa-Neto et al. 2020, 2021; de los Campos et al. 2020; Jarquín et al. 2014; Ly et al. 2018; Millet et al. 2019; Morais Júnior et al. 2017). In this latter approach, G × E is accounted for by using the interaction between markers and environmental covariates (ECs) and has been shown to increase the accuracy of genomic prediction in plant breeding. For example, Jarquín et al. (2014), using wheat data, reported an increase in the accuracy of the reaction-norm model integrating ECs compared to models with main effects alone. The effectiveness of the use of ECs in GS is also discussed in the literature (Costa-Neto et al. 2020; Heslot et al. 2014; Millet et al. 2019; Monteverde et al. 2019; Morais Júnior et al. 2017).

In rice, a large number of GS studies have been published since 2014, when the first empirically based study was published (see a review by Bartholomé et al. 2022). Through these studies, we gained a better understanding of the benefits and limitations of GS in the context of rice breeding. The impact of trait architecture, population structure, training set size and composition, as well as marker density, has been well covered. However, the impact of G × E has received somewhat less attention. Indeed, only a few studies using breeding material have used multi-environment models including G × E (Ben Hassen et al. 2018; Monteverde et al. 2018, 2019; Morais Júnior et al. 2017). The conclusions arising from these works, based on a relatively small number of environments, are that multi-environment models tend to give higher prediction accuracies.

This study aimed to assess the efficiency of multi-environment genomic prediction models in the context of an applied breeding program. We used an elite core panel that represents the elite diversity managed by the irrigated rice breeding program at the International Rice Research Institute (IRRI). This panel was phenotyped in 15 environments in Asia and Africa from 2018 to 2020. This information from multi-environment trials (phenotypic data and environmental covariates) was used to characterize the level of G × E interaction and to cluster the environments. We then compared seven genomic prediction models to evaluate the impact of modeling G × E and environmental covariates on predictive abilities when new environments were predicted.
Rice (2023) 16:7 Page 3 of 17 the function dudi.pca within the R package ade4 (Dray Materials and Methods and Dufour 2007), the PCs were then visualized using Plant Material and Genotypic Characterization the R package ggplot2 (Wickham 2016). An unweighted The plant material consisted of 111 elite lines from the neighbor-joining tree between ECP and 3 K-RG’s sub- IRRI breeding program for irrigated systems (Addi- groups was constructed using TASSEL 5 software tional file 1: Table S1), hereafter referred to as the elite (Bradbury et al. 2007). core panel (ECP). The ECP represents the elite diversity of the parental lines used in IRRI’s breeding program for irrigated systems and is derived mostly from the breed- ing efforts that were conducted at IRRI since the 1960s Multi‑environment Evaluation of the Elite Breeding Lines (Juma et al. 2021). The population included recent varie - Within the IRRI breeding program framework, the ECP ties such as IRRI 154, IRRI 156, IRRI 174, IRRI 180, IRRI was evaluated in multi-environment trials at 12 differ - 186, and IRRI 193 as well as current parental lines. ent locations including IRRI headquarter (Los Baños, the The ECP was genotyped using the 1K Rice Custom Philippines) and research stations from partners in Asia Amplicon assay (1K-RiCA, Arbelaez et al. 2019). Leaf tis- and Africa. The information regarding the locations of sues of single plants of each line of the ECP were collected the 15 field experiments is available in Table 1 and Addi- and freeze-dried. Genomic DNA (gDNA) was extracted tional file 1: Table S2. The experiments were carried out using the CTAB method (Cetyl Trimethyl Ammonium in both the dry (DS) and wet seasons (WS) from 2018 Bromide), as described by Murray and Thompson (1980). to 2020. Different experimental designs were used to The quality of gDNA was visually checked on 1% aga - accommodate partners’ capacities: alpha-lattice, rand- rose gel. 
The quantity of gDNA was then evaluated using omized complete block, row-column, partially replicated, PicoGreen (https:// www. biotek. com) fluorometric kits or systematic arrangement designs with either one or two and adjusted to obtain a concentration close to 10 ng/ replicates for each. Due to limited seed availability, not all µl gDNA for the library preparation. Illumina ’s TruSeq the elite lines were evaluated in all 15 experiments result- Custom Amplicon chemistry was used to create the ing in sparse testing evaluation. The number of lines libraries and the sequencing was performed using the evaluated in each location ranged from 39 to 111 lines, MiSeq Sequencing-by-Synthesis Technology System. A as detailed in Table 1 and Additional file 2: Fig. S2. Most custom SNP-calling pipeline was used to align sequence of the experiments were carried out by transplanting, data on the Nipponbare rice genome MSU7 (Kawahara except for one experiment conducted with direct seeding et al. 2013). The sequences with non-alignment and mul - (at Maputo–Mozambique). Standard management prac- tiple positions were then removed. SNP data was saved tices were applied in all trials with basal fertilizer applica- in a HapMap format (Gibbs and et al. 2003). The raw tions along with chemical and/or manual pest and weed SNP data was then filtered with TASSEL 5 (Bradbury control. et al. 2007). The SNPs with more than 20% of missing Three agronomic traits were measured on each elite data, a minor allele frequency (MAF) lower than 5%, and line: days to flowering (DTF), plant height (HT), and a percentage of heterozygous calls greater than 10 were grain yield (YLD). DTF (days) were calculated as the removed. Consequently, four out of 111 ECP lines have number of days from seeding to the time of 50% of the been removed from the list. A final set of 107 lines and plants flowering within a plot. 
The plant height (cm) was 882 SNP markers distributed along the rice genome was measured from the ground level to the tip of the highest used for the analyses (Additional file 2: Fig. S1). The gen - panicle (awns excluded) at the maturity of five randomly otypic information for the ECP is available in HapMap selected plants for each elite line. For grain yield (tons/ format (Additional file 3). ha), each plot was harvested excluding border rows. The genotypic characterization of the ECP in rela - From this sample, grain moisture content was measured tion to O. sativa subgroups was performed by combin- using a moisture meter. Then, plot-level grain yield was ing the genotypic data of the ECP from the 1K-RiCA computed as the grain weight in kilograms from each assay above with the 3000 rice genomes (3 K-RG) data plot, normalized at 14% of moisture, and adjusted by the (Wang et al. 2018). The physical positions of the 882 harvested areas to obtain the yield in tons per hectare. SNPs were used to extract a dataset of filtered SNPs for the entire 3 K-RG using the rice SNP-seek database Phenotypic Data Analysis (Mansueto et al. 2017). As a result, a total of 837 SNPs For the statistical analysis of the trials, two linear mixed in common in both data sets were used for downstream models were used to take account of the diversity of analysis. The SNPs were then encoded from nucleotide the experimental design. The general form of the base alleles into numeric genotypes as 0, 0.5, and 1. A prin- models was: cipal component analysis (PCA) was carried out using Nguyen et al. Rice (2023) 16:7 Page 4 of 17 Table 1 Information of fifteen yield trials conducted on the elite core panel (ECP) Country Location Year & Season Study name No. lines Experimental Replication No. 
checks Seeding date Harvest date design level Bangladesh Gazipur 2019‑ Wet BD‑ GZ‑ 19W 93 (90) P‑ REP 27% lines 9 2019‑ 07‑ 08 2019‑ 11‑ 05 Bangladesh Nizmawna 2019‑ Wet BD‑ NM‑ 19W 93 (90) Systematic 1 6 2019‑ 07‑ 11 2019‑ 11‑ 10 arrangement India Hyderabad 2018‑ Wet IN‑ HY‑ 18W 39 (37) RCBD 2 4 2018‑ 07‑ 17 2018‑ 11‑ 27 India Cuttack 2019‑ Wet IN‑ CU‑ 19W 40 (38) RCBD 2 11 2019‑ 07‑ 10 2019‑ 11‑ 19 India Hyderabad 2019‑ Dry IN‑ HY‑ 19D 39 (37) RCBD 2 4 2019‑ 01‑ 17 2019‑ 05‑ 19 India Hyderabad 2019‑ Wet IN‑ HY‑ 19W 40 (38) Augmented 3% lines 5 2019‑ 07‑ 02 2019‑ 12‑ 07 RCBD India Maruteru 2019‑ Wet IN‑ MA‑ 19W 40 (38) P‑ REP 43% lines 4 2019‑ 07‑ 06 2019‑ 11‑ 17 India Raipur 2019‑ Wet IN‑ RP‑ 19W 40 (38) P‑ REP 43% lines 5 2019‑ 07‑ 15 2019‑ 12‑ 03 Kenya Ahero 2019‑ Dry KE‑ AH‑ 19D 92 (89) RCBD 2 5 2019‑ 08‑ 21 2020‑ 01‑ 04 Kenya Mwea 2020‑ Wet KE‑ MW‑ 20W 92 (89) RCBD 2 5 2020‑ 02‑ 24 2020‑ 07‑ 19 Mozambique Chokwe 2020‑ Wet MZ‑ CK‑ 20W 93 (90) Row‑ Column 2 3 2019‑ 11‑ 05 2020‑ 04‑ 09 Mozambique Maputo 2020‑ Wet MZ‑ MP‑ 20W 93 (90) Row‑ Column 2 3 2019‑ 11‑ 21 2020‑ 04‑ 11 Philippines Los Baños 2019‑ Dry PH‑ LB‑ 19D 111 (107) Alpha Lattice 2 5 2019‑ 01‑ 15 2019‑ 05‑ 13 Philippines Los Baños 2019‑ Wet PH‑ LB‑ 19W 111 (107) Alpha Lattice 2 5 2019‑ 06‑ 20 2019‑ 10‑ 17 Tanzania Dakawa 2020‑ Wet TZ‑ DK‑ 20W 91 (88) Augmented 5% lines 6 2020‑ 03‑ 16 2020‑ 08‑ 01 RCBD At the No. lines column, the numbers contained within the brackets show the numbers of lines having SNP data from the 1K-RiCA dataset where σ is the genotypic variance obtained from y = Xb + Zu + e the experimental data and σ is the residual variance where y is the vector of phenotypes, b is the vector of obtained from the model. H and the associated standard fixed effects, and X is the associated design matrix, u error were estimated with the function predict(). 
The best is the vector of random effects and Z is the associated linear unbiased predictors (BLUPs) for all genotypes were design matrix, and e is the vector of residuals. For tri- extracted for each trial and each trait and were used as als with a rectangular field layout, a model with first- adjusted phenotypes for further analysis. For the two tri- order autoregressive spatial structure (AR1 ⊗ AR1) was als without the replications (Nizmawna-Bangladesh and used (Gilmour et al. 1997). For these models, all vectors Hyderabad-India), we used the phenotypic data directly. and incidence matrices are the same as the base model The phenotypic information of the ECP is available in above, it only differs in the structure of variance residu - Additional file 4. als. The matrices of variance residuals are defined as We considered an environment as the combination 2 2 R = σ Σc(pc) ⊗ Σr(pr), where σ is the variance compo- of location, year, and season. Analysis of correlation e e nents of residual, Σc(pc) and Σr(pr) are the correlation between the three traits within single environments, matrices of the first-order autoregressive, pc and pr are and between environments was performed using the the autocorrelation parameters for the spatial coordi- Pearson correlation method within the ggpairs() func- nates, columns, and rows of plots respectively, ⊗ is the tion in the GGally R package (Schloerke and et al. Kronecker product from the auto-regressive process in 2020). The hierarchical clustering analysis of the envi - columns and rows, respectively. The factors of fixed, ran - ronments was carried out using the pvclust package in dom, and residual effects for statistical models of each R (Suzuki and Shimodaira 2006). trial were described in detail in Additional file 1: Table S3. 
A simple analysis of variance (ANOVA) for genotype The analyses were performed using the asreml() func- by environment interaction (G × E) upon the pheno- tion of the R package asreml (version 184.108.40.206) (Butler typic performance of ECP was also carried out using et al. 2017). Broad-sense heritability (H ) was estimated the metan R package (Olivoto and Lúcio 2020). for each trait using the following formula: Weather Data and Environmental Covariates 2 2 2 2 H = σ / σ + σ g g e The weather data of each environment were obtained from the NASA POWER database (https:// power. larc. Nguy en et al. Rice (2023) 16:7 Page 5 of 17 nasa. gov/). This database was queried using the R pack - G = X * X /p, in which X is the n × p matrix of centered age nasapower (Sparks 2018) via the R packages EnvR- and standardized markers, n is the number of genotypes type (Costa-Neto et al. 2021). The get_weather function and p is the number of markers and ε is the residual was used to retrieve daily weather data based on the geo- effects denoted as ε ∼ N 0, σ . The G model was used graphical coordinates (N latitude and E longitude) of each as a baseline model to construct the remaining six mod- trial. The following daily weather variable from the trans - els by adding the main effect of the environments (E), the planting date to the harvesting date was obtained for all environmental covariates (W), or the interaction effects the trials: the total precipitation (PP, mm), the dew-point with G (G × E and GxW) into the model (1). Ultimately, −1 temperature at two meters (DPT, °C d ), the minimum, four models included only the main effects and three maximum and mean temperature at two meters (TMIN, models also included interaction terms based on the −1 TMAX and TM, °C d ), the relative humidity at two approach of reaction norm models developed by (Jarquín meters (RH, %), the all-sky surface photosynthetically et al. 
2014): active radiation total (APAR, W m ) and the clear sky Model GE : y = µ + g + e + ε ij i j ij (2) surface photosynthetically active radiation total (CPAR, W m ). The processWTH function from the R package EnvRtype was then used to compute the temperature Model GW : y = µ + g + w + ε ij i ij ij (3) range (TR, °C), the potential evapotranspiration (PET, −1 −1 mm d ) and the vapor pressure deficit (VPD, kPa d ) Model GEW : y = µ + g + e + w + ε ij i j ij ij (4) (Costa-Neto et al. 2021). Finally, eight environmental covariates (ECs) were selected for further analysis: PP, Model GE − G × E : y = µ + g + e + ge + ε ij i j ij ij DPT, PET, VPD, TM, TR, APAR, and CPAR (Additional (5) file 1: Table S4). To assess the effects of ECs through different devel - Model GW − G × W : y = µ + g + w + gw + ε ij i ij ij ij opmental phases of the ECP on the genomic predictive (6) ability, the phenology of the crop was identified for each Model GEW − G × E − G × W : y ij environment consisting of: the vegetative phase (from the (7) = µ + g + e + w + ge + gw + ε i j ij ij ij ij transplanting date to the earliest line); the reproductive phase (the interval between the earliest and latest date of where e is the effect of the j-th environment flowering); and the ripening phase (from the latest date of 2 2 which is denoted as e N(0, σ ), with σ representing e e flowering up to the latest harvest date) (Additional file 2: the variance component of the environments; ge is ij Fig. S3). The information on the 24-ECs is available in the interaction effects of the i-th genotypic within Additional file 5. The evaluation of the level of similar - the j-th environment which is modeled by the ity between environments (based on ECs) was performed T T Hadamard product of Z GZ and Z Z , denoted as g e g e by the hierarchical clustering analysis using the pvclust T T 2 ge ∼ N 0, Z GZ ◦ Z Z σ with Z as the inci- package of R (Suzuki and Shimodaira 2006). 
g e g e ge e dence matrix for the environmental effects that connect Genomic Prediction Analysis the phenotypes with environments; w is the effect of the ij Statistical Models for Genomic Prediction Analysis environmental covariates (ECs) in the ij-th genotype X Due to its stability of accuracy across different environ - environment combination which is denoted as ments and traits and its ease of implementation, GBLUP w ∼ N 0, �σ with Ω computed using ECs and propor- (genomic best linear unbiased prediction) is the most tional to WW’, where W is a matrix with centred and used method on rice (Bartholomé et al. 2022). In this standardised values of the ECs; gw the interaction effect ij study, we focused our effort on GBLUP, which is cur - of the genotypic and environmental covariates in the ij-th rently used routinely at IRRI. Seven genomic prediction genotype X environment combination which is modelled models were implemented to predict DTF, HT, and YLD. by the Hadamard product of Z GZ and , denoted as T 2 The first model was the standard GBLUP model with gw ∼ N 0, Z GZ ◦ �σ with Z as an incidence g gw g only the main effect of the genotypes (VanRaden 2008): matrix for the vector of additive genetic effects. The genomic heritability ( h ) of the studied traits was Model G : y = µ + g + ε i i i (1) estimated based on the seven models described above. where μ is the overall mean; g is the random effect of the i The different estimates of h were obtained with the fol- i-th genotype, denoted as g ∼ N 0, σ G with the lowing formula (de los Campos et al. 2015): genomic relationship matrix (G) estimated as Nguyen et al. Rice (2023) 16:7 Page 6 of 17 2 CrossV ‑ alidation Experiment: Assessing Predictive Abilities h = for Untested Lines 2 2 σ + σ g ε For this CV experiment, we used the leave-one-out method for predicting the untested lines. 
The models where σ is the additive genetic variance obtained with were trained using all the environment and all the lines the genomic relationship matrix (G) and σ is the resid- except one. The remaining line was predicted across all ual error variance as defined previously. environments. We repeated this process for all 33 lines The analyses of genomic prediction were performed evaluated in all fifteen environments. In this CV experi - in R (R Core Team 2022) using the R statistical package ment, the PAs (Pearson correlation coefficient) were BGLR (Pérez and de los Campos 2014). The hyperparam - measured in two ways: at the line level (correlation eters for prior specification and the number of iterations between the predicted values and the adjusted pheno- for the Markov Chain Monte Carlo (MCMC) algorithm types across the fifteen environments for a given line) were set up with 25,000 iterations, with a burn-in of 5000 and at the environment level (correlation between the and a thinning of 10. predicted values and the adjusted phenotypes in given environments across all the 33 lines). The R scripts for the different CV experiments are pro - CrossV ‑ alidation Experiments: Assessing Predictive Abilities vided in the Additional file 6. for Untested Environments Three different cross-validations (CV) experiments were Results designed to assess the predictive abilities (PA) in untested Characterization of Genetic Structure for the ECP environments. In the first CV experiment (CV-RAN), the The results showed that the Japonica (GJ), circum-Bas - target environment (validation set) was predicted using mati (cB), circum-Aus (cA), and Indica (XI) subgroups four environments selected randomly among the 14 from 3 K-RG were clearly separated and confirmed the remaining environments from the training set. Random clustering of the ECP into the XI subgroups (Fig. 1A). sampling was repeated 50 times. 
The predictive ability When only the Indica (XI) subgroups were used, the ECP was computed for each of the 50 replicates and then aver- was found to be close to the XI-1B subgroup (Fig. 1B). aged. An ANOVA and Tukey’s tests were then carried XI-1B is known to include essentially modern varie- out at the significance level of 5% based on z-transformed ties largely generated by the IRRI’s breeding program values (Z = 0.5 [ln(1 + r) − ln(1 − r)]), to identify the sig- in Southeast Asia. Similar results were found with the nificant differences in predictive ability (r) among the neighbor-joining tree between ECP with the whole models in each environment. Analyses were performed 3 K-RG samples and with only the XI subgroups (Addi- separately for each trait. After the confidence limits and tional file 2: Fig. S4). means for Z were estimated, these were transformed back to r values. Phenotypic Variation of the ECP Across Environments For the second CV experiment (CV-SEL), the target A large phenotypic variability was found for the three environment was predicted using four environments traits across all the environments (Table 2, Fig. 2). For specifically selected among the remaining fourteen envi - DTF, the average value per trial ranged from 86 (Los ronments to form the training set. The selection of envi - Baños, wet season) to 118 days (Chokwe) with most of the ronments for the training set was based on Euclidean trials displaying a vegetative phase of about 90 days. The distance in terms of ECs. The closest environments were duration of flowering (calculated as the difference between then identified (Additional file 1: Table S5). The predic - the earliest and latest in a given environment) ranged tion was performed once for each target environment. from 17 days (Hyderabad-dry season) to 56 days (Maputo) For the third CV experiment, we used the “leave-one- with an average value of 32 days. Trials at Ahero, Maputo, environment-out" (CV-LOEO) method. 
The target environment was predicted using the remaining fourteen environments as a training set. Each environment was predicted using the model trained based on the information (genotypic and phenotypic data as well as ECs) of the remaining fourteen environments.

For the three CV experiments, the PAs were measured as the Pearson correlation coefficient between the predicted values and the adjusted phenotypes in the validation set (target environment).

and Los Baños (dry and wet seasons) had longer flowering times compared to the others. For HT, a continuous gradient in the average value per trial was found with values ranging from 78.6 (Maputo) to 130.2 cm (Maruteru) (Table 2). As expected, a similar trend was observed for YLD with an average value per trial ranging from 3.76 (Gazipur) to 6.46 t/ha (Los Baños-dry season).

Broad-sense heritability (H²) was rather high for all three traits. The H² ranged from 0.52 (Ahero) to 0.96 (Cuttack) for DTF, from 0.27 (Chokwe) to 1.0 (Cuttack) for HT, and from 0.19 (Gazipur) to 0.89 (Cuttack and Dakawa) for YLD (Table 2).

For most of the environments, the phenotypic correlations between traits (DTF, HT and YLD) were low to medium (−0.31–0.53). No clear trend was identified for all environments, although HT was significantly correlated with days to flowering in nine of the environments and flowering was significantly correlated with yield in only five environments (Additional file 2: Fig. S5).

Fig. 1 The principal component analysis between the elite core panel (ECP) and 3000 rice genomes (3K-RG) accessions. (A) The ECP with all subgroups of 3K-RG; (B) the ECP with only indica (XI) subgroups. The analysis is based on 837 common SNPs. The ECP lines are denoted with black dots. The subgroups from 3K-RG included Admix, circum-Basmati, circum-Aus, indica (1A, 1B, 2, 3, admix), and japonica (admix, subtropical, temperate, tropical)

Characterization of G × E Interactions upon the Phenotypic Performance of ECP

The environment, genotypes, and their interaction effects were found to be significant for the three traits (Additional file 1: Table S6). The heritabilities based on the combined analysis of all trials (h²) confirmed the strong effect of the environments (Additional file 1: Table S7). For the models including the G × E interactions, h² ranged from 0.52 to 0.57 for DTF, from 0.41 to 0.44 for HT, and from 0.10 to 0.12 for YLD. The correlations between environments corroborated these differences between traits. For DTF, correlations with values ranging from 0.07 to 0.82 were found with 33% of the pairwise correlations greater than 0.5 (Additional file 2: Fig. S6A). A similar trend was observed for HT with correlations ranging from 0.04 to 0.77 and 54% of the correlations being greater than 0.5 (Additional file 2: Fig. S6B). On the contrary, only 18% of the correlations were significant in the case of YLD and most of the correlations were below 0.2 (Additional file 2: Fig. S6C). Three environments had significant correlations with most of the other environments for the three traits considered: Hyderabad-India (2018-WS), Mwea-Kenya and Los Baños (2019-DS).

In order to better identify similar environments based on phenotypic performances, a clustering analysis was conducted for the three traits (Fig. 3). For DTF, two main clusters were identified: one comprising five locations from Bangladesh and India (except Hyderabad) and the other including ten locations from Africa, the Philippines and Hyderabad (Fig. 3A). For HT, two main clusters were identified (Fig. 3B). The first cluster had only two environments (Hyderabad-wet season 2019 and Los Baños-wet season 2019) that presented lower correlations with other environments. The second cluster gathered all other environments. However, similarly to DTF, two subclusters tended to separate the environments in Bangladesh and India from the rest (Africa and the Philippines). For YLD, since the level of correlation between environments was lower, the environments were spread in more clusters. Indeed, four clusters were identified with no clear structuration by regions or by seasons (Fig. 3C). However, the environments from the same location (Hyderabad or Los Baños) clustered together.

Characterization of Environments Based on Environmental Covariates

The clustering analysis for the four periods showed different patterns (Fig. 4A–D). For the whole growing season, two main clusters were found. The first cluster grouped India's and Bangladesh's environments and Los Baños in the wet season. The second cluster included a subcluster of Hyderabad's environments and Los Baños in the dry season, and a subcluster with all of Africa's environments. Similar results were found for the reproductive phase with two main clusters. These clusters were also similar to the clustering of environments for DTF but very different from those of HT and YLD. For the vegetative and ripening phases, environments from Asia tended to cluster with environments from Africa with no clear separation between the two regions.
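The two summary statistics used in this part of the paper, the share of pairwise correlations between environments above a threshold and the predictive ability (PA) under a leave-one-environment-out split, can be illustrated with a minimal sketch. It is not the authors' code (their R scripts are in Additional file 6). The function names, the `pheno` data layout and the toy values are hypothetical, and the predictor is a deliberately simple placeholder (the mean of each line over the training environments) standing in for the genomic prediction models.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def share_of_correlations_above(pheno, threshold=0.5):
    """Fraction of environment pairs whose correlation exceeds `threshold`.

    `pheno` maps environment -> {line: adjusted phenotype} (assumed layout);
    each pair is compared on the lines the two environments share.
    """
    hits, pairs = 0, 0
    for env_a, env_b in combinations(sorted(pheno), 2):
        common = sorted(set(pheno[env_a]) & set(pheno[env_b]))
        r = pearson([pheno[env_a][line] for line in common],
                    [pheno[env_b][line] for line in common])
        pairs += 1
        hits += r > threshold
    return hits / pairs


def loeo_predictive_ability(pheno, target):
    """Leave-one-environment-out PA for the `target` environment.

    Placeholder predictor (NOT one of the paper's models): each line is
    predicted by its mean phenotype over the training environments.
    PA is the Pearson correlation between predictions and observations.
    """
    train = [env for env in pheno if env != target]
    predicted, observed = [], []
    for line, value in sorted(pheno[target].items()):
        seen = [pheno[env][line] for env in train if line in pheno[env]]
        if seen:  # skip lines never observed in the training environments
            predicted.append(sum(seen) / len(seen))
            observed.append(value)
    return pearson(predicted, observed)


# Toy data: three hypothetical environments and four hypothetical lines.
toy = {
    "env1": {"L1": 90, "L2": 95, "L3": 100, "L4": 105},
    "env2": {"L1": 88, "L2": 94, "L3": 101, "L4": 103},
    "env3": {"L1": 104, "L2": 99, "L3": 96, "L4": 92},
}
```

In this toy example only one of the three environment pairs is positively correlated, so an environment that ranks the lines in the opposite order (here `env3`) is poorly predicted from the other two. This is the kind of situation in which the choice of training environments matters.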
Table 2 Phenotypic values and broad-sense heritability (H²) for the three traits across environments

Country | Location | Study name | DTF range (days) | DTF mean (days) | HT range (cm) | HT mean (cm) | YLD range (t/ha) | YLD mean (t/ha) | H² DTF (SE) | H² HT (SE) | H² YLD (SE)
Bangladesh | Gazipur | BD-GZ-19W | 82–102 | 90 | 111.6–137.2 | 122.6 | 3.28–4.09 | 3.76 | 0.69 (1.02) | 0.55 (2) | 0.19 (0.15)
Bangladesh | Nizmawna | BD-NM-19W | 86–105 | 92 | 100.2–146.8 | 120.4 | 3.75–6.06 | 5.01 | – | – | –
India | Hyderabad | IN-HY-18W | 96–116 | 105 | 74.1–101.6 | 85.9 | 4.93–6.87 | 5.94 | 0.85 (0.14) | 0.74 (0.71) | 0.67 (0.08)
India | Cuttack | IN-CU-19W | 89–106 | 97 | 94.3–135.1 | 116.0 | 3.34–6.71 | 4.96 | 0.96 (0.02) | 1 (0.0003) | 0.89 (0.03)
India | Hyderabad | IN-HY-19D | 86–97 | 90 | 74.8–103.7 | 87.1 | 3.90–7.58 | 6.01 | 0.64 (0.37) | 0.72 (0.7) | 0.7 (0.09)
India | Hyderabad | IN-HY-19W | 99–118 | 109 | 75.0–133.3 | 104.5 | 1.97–8.34 | 5.47 | – | – | –
India | Maruteru | IN-MA-19W | 89–101 | 95 | 109.9–145.4 | 130.2 | 3.09–5.91 | 4.03 | 0.73 (0.41) | 0.93 (0.15) | 0.87 (0.05)
India | Raipur | IN-RP-19W | 99–117 | 107 | 98.7–134.1 | 114.7 | 4.15–6.36 | 5.25 | 0.92 (0.08) | 0.91 (0.29) | 0.53 (0.19)
Kenya | Ahero | KE-AH-19D | 95–104 | 98 | 86.3–109.9 | 98.1 | 3.98–5.45 | 4.83 | 0.52 (0.55) | 0.51 (1.08) | 0.28 (0.1)
Kenya | Mwea | KE-MW-20W | 95–112 | 103 | 73.2–103.8 | 86.6 | 3.12–5.52 | 4.28 | 0.6 (0.65) | 0.69 (0.84) | 0.59 (0.09)
Mozambique | Chokwe | MZ-CK-20W | 106–128 | 118 | 65.1–92.9 | 80.0 | 4.44–6.54 | 6.01 | 0.91 (0.04) | 0.27 (6.81) | 0.28 (0.1)
Mozambique | Maputo | MZ-MP-20W | 92–124 | 108 | 66.0–102.1 | 78.6 | 3.74–5.40 | 4.70 | 0.78 (0.5) | 0.7 (0.77) | 0.45 (0.1)
Philippines | Los Baños | PH-LB-19D | 78–99 | 88 | 90.5–124.3 | 107.2 | 4.72–7.94 | 6.46 | 0.87 (0.08) | 0.73 (0.66) | 0.75 (0.04)
Philippines | Los Baños | PH-LB-19W | 72–98 | 86 | 100.8–138.4 | 119.3 | 3.56–6.68 | 5.33 | 0.82 (0.2) | 0.5 (1.93) | 0.71 (0.04)
Tanzania | Dakawa | TZ-DK-20W | 78–101 | 88 | 88.4–119.7 | 104.9 | 3.36–5.72 | 4.87 | 0.93 (0.09) | 0.74 (1.4) | 0.89 (0.05)

DTF days to flowering; HT plant height; YLD grain yield. In the two environments of Bangladesh-Nizmawna and India-Hyderabad, the broad-sense heritability was not calculated due to the experimental design without replications

Fig.
2 Distribution of phenotypic values of elite lines for the three traits evaluated across the 15 environments. DTF days to flowering; HT plant height; YLD grain yield. The boxes with orange colors indicated the trials conducted in the dry season, and the boxes with the blue color indicated the trials in the wet season

Fig. 3 Hierarchical clustering of environments for the three traits. DTF days to flowering (panel A); HT plant height (panel B); YLD grain yield (panel C). The different colors present different clusters among environments. The names of environments in the clusters are formatted by ordering country name, location, year, and season (see Table 1)

Fig. 4 Hierarchical clustering of environments upon ECs throughout (A) the whole growing season, each developmental phase (B–D). Different colors show the different clustering between environments. The names of environments in the clusters are formatted by ordering country name, location, year, and season (see Table 1)

Genomic Prediction for Untested Environments

Impact of the Prediction Models

We evaluated the efficiency of the different models to predict untested environments with the CV-RAN scenario. The predictive abilities (PAs) ranged from 0.19 (Ahero) to 0.67 (Hyderabad-2018) for DTF, from 0.28 (Los Baños-wet season) to 0.83 (Hyderabad-2018) for HT and from −0.06 (Nizmawna) to 0.45 (Hyderabad-2018) for YLD (Additional file 1: Table S8). As expected, DTF and HT presented higher PAs than YLD. Considering the models, the integration of the main effect of the environment (E) or the environmental covariates (W) significantly increased the PA for DTF (12 environments out of 15) and HT (all environments) compared to the baseline model G (Fig. 5). However, for YLD, the GE, GW, and GEW models did not perform significantly better than the G model, except in one case (Chokwe). Interestingly, in most of the cases, no significant increase in PA was found between models including the interaction term (G × E or/and G × W) and the GE, GW or GEW model. Indeed, for DTF, the models with interactions were significantly better than models with main effects in only three environments. For HT and YLD, the models with interactions (more specifically with G × W) showed a significant decrease in PA in five and six environments, respectively.

Fig. 5 Predictive abilities for untested environments using the CV-RAN scenario. Seven different models are compared (see material and methods section). The letters at the top of each bar represent the results of Tukey's HSD comparison between models in each environment. The means between two groups are significantly different (p-value < 0.05) if there is no letter in common. The error bars are presented by PA mean ± SE where SE is the standard error of PA values from 50 replicates. DTF days to flowering; HT plant height; YLD grain yield

Impact of Training Set Composition

We compared three cross-validation scenarios (CV-RAN, CV-SEL, and CV-LOEO) with only the GE and GE-G × E models, to evaluate the effect of training set composition on PA. For DTF, the PA ranged from 0.19 (Ahero) to 0.67 (Hyderabad 2018) for CV-RAN, from 0.2 (Ahero) to 0.79 (Hyderabad 2018) for CV-SEL, and from 0.18 (Ahero) to 0.77 (Dakawa) for CV-LOEO. For HT, PA varied from 0.36 (Hyderabad 2019 wet season) to 0.83 (Hyderabad 2018), from 0.34 (Hyderabad 2019 wet season) to 0.81 (Hyderabad 2018) and from 0.38 (Hyderabad 2019 wet season) to 0.87 (Hyderabad 2018) for CV-RAN, CV-SEL and CV-LOEO, respectively. For YLD, PA ranged from −0.06 (Nizmawna) to 0.32 (Los Baños 2019 dry season), from −0.05 (Dakawa) to 0.62 (Hyderabad 2018) and from −0.1 (Nizmawna) to 0.48 (Hyderabad 2019 wet season) for CV-RAN, CV-SEL and CV-LOEO, respectively (Fig. 6). The CV-LOEO scenario presented the highest PAs in five to fourteen environments depending on the trait. The CV-SEL scenario was the second in terms of PA with higher PA in two to six environments. Similar results were found when comparing CV scenarios using the GW and GW-G × W models, or the GEW and GEW-G × E-G × W models (Additional file 2: Fig. S7A–B). To see the impact of the experimental design, we calculated PAs with CV-SEL and CV-LOEO using a subset of 33 common lines in all environments. The results revealed similar trends in PA between balanced and unbalanced datasets: no major gain in PA was observed when the interactions were included in the models (Additional file 1: Table S9–S10).

Fig. 6 Comparison of the predictive abilities between the three cross-validation scenarios: CV-RAN (random), CV-SEL (selected environments) and CV-LOEO (leave one environment out). Two models are presented: the first one with only the main effects of genotypes and environments (GE) and the second one with the main effect and the interaction (GE-G × E). The error bars are presented by PA mean ± SE where SE is the standard error of PA values from 50 replicates. DTF days to flowering; HT plant height; YLD grain yield

Genomic Prediction for Untested Lines

The performances of untested lines across the fifteen environments were predicted with high accuracy with all the models including environmental effects or environmental covariates (GE, GW, GEW, GE-GxE, GW-GxW and GEW-GxE-GxW models, Table 3). The PAs were close to 0.95 on average for DTF and HT, and close to 0.81 on average for YLD. No differences were found between these models. As expected, the model with only the main effect of the genotypes (G) displayed PA close to 0 on average. When we looked at the environment level, we found lower PAs and large differences between environments. The PA ranged from −0.03 to 0.56 for DTF, −0.05 to 0.53 for HT and −0.37 to 0.52 for YLD. A similar trend to that of the untested environment prediction was found: the models with the main effects of the environment (GE, GW, and GEW) tended to present higher PA than the other models. However, a large variability was found between environments and traits (Additional file 1: Table S11).

Table 3 Predictive abilities of untested lines

Model | DTF range | DTF mean | DTF SE | HT range | HT mean | HT SE | YLD range | YLD mean | YLD SE
G | −0.59 to 0.53 | −0.10 | 0.056 | −0.48 to 0.47 | −0.01 | 0.046 | −0.46 to 0.37 | −0.05 | 0.039
GE | 0.82 to 0.99 | 0.94 | 0.007 | 0.8 to 0.99 | 0.95 | 0.008 | 0.4 to 0.94 | 0.79 | 0.021
GW | 0.82 to 0.99 | 0.94 | 0.007 | 0.8 to 0.99 | 0.95 | 0.008 | 0.39 to 0.94 | 0.79 | 0.021
GEW | 0.82 to 0.99 | 0.94 | 0.007 | 0.8 to 0.99 | 0.95 | 0.008 | 0.4 to 0.94 | 0.79 | 0.021
GE-GxE | 0.8 to 0.99 | 0.95 | 0.007 | 0.79 to 0.99 | 0.94 | 0.009 | 0.57 to 0.95 | 0.81 | 0.018
GW-GxW | 0.79 to 0.99 | 0.95 | 0.008 | 0.8 to 0.99 | 0.94 | 0.008 | 0.43 to 0.95 | 0.79 | 0.019
GEW-GxE-GxW | 0.79 to 0.99 | 0.95 | 0.008 | 0.79 to 0.99 | 0.94 | 0.009 | 0.58 to 0.95 | 0.81 | 0.018

DTF days to flowering; HT plant height; YLD grain yield; SE standard error values. The average values are computed from predictive ability across 33 lines in each model

Discussion

Performance of Elite Breeding Lines

The characterization at both genetic and phenotypic levels of elite lines is a key aspect of breeding programs. This information allows the breeder to drive the breeding population in the desired direction while making efficient use of the available genetic diversity. In the framework of IRRI breeding programs for irrigated systems, a panel representing the elite diversity of the program was constituted in 2018 and then enriched with recent parental lines (Juma et al. 2021). In this study, we took advantage of the ECP and evaluated it in 15 environments in Asia and Africa. Although the trials were conducted with the standard practices for irrigated systems, important differences in the average performances were found between environments for the three traits measured (DTF, HT, YLD). For example, a difference of 32 days was found for DTF between the two extreme environments. For YLD, the productivity was on average 2.7 t/ha higher in the most productive environment compared to the least productive. In addition to these differences, our results showed medium G × E for DTF and HT and a strong G × E for YLD. These levels of G × E are slightly higher than the ones generally found in similar studies on rice (Monteverde et al. 2019; Morais Júnior et al. 2017; Spindel et al. 2015). These results can partly be explained by the wide distribution of the trials and the associated environmental variations. Indeed, the clustering analysis based on eight ECs and four different phases (whole growing season, vegetative, reproductive, and ripening phases) showed similarity to the ones based on phenotypic performances. However, the clustering structure did not clearly separate Asian and African environments. This information will be used for a better definition of the target population of environments in the future (Atlin et al. 2000).

Prediction Accuracies of Multi-environment Models

In rice, a wide variety of populations (diversity panels, breeding populations, biparental crosses, …) have been used in GS studies depending on the context and the objective (Bartholomé et al. 2022). In the present study, we focused our efforts on a set of elite breeding materials phenotyped by the partners of the program. The number of environments available enabled us to assess the impact of G × E modeling on PA for untested environments and untested lines using seven genomic prediction models. For untested environments, the approach resulted in high PAs for different combinations of trait/environment with values as high as 0.77 for DTF, 0.88 for HT, and 0.62 for YLD. However, YLD was poorly predicted in nearly half of the environments with values close to zero. This difference between more heritable traits (e.g. DTF and HT) and less heritable traits (e.g. YLD) has already been reported in the literature on rice (Ben Hassen et al. 2018; Monteverde et al. 2018; Morais Júnior et al. 2017). For untested lines, the PAs were very high (0.80–0.90), highlighting the complexity of predicting performance in new environments versus predicting new lines in known environments. We also found that, in most cases, the integration of environments (E), environmental covariates (W), and interaction effects (G × E or/and G × W components) increased PA when compared to the baseline G model. Interestingly, the integration of the interaction effects did not result in better PAs for all environments and in some cases even decreased the PA, especially for HT and YLD. We found a similar trend with a smaller but balanced data set, suggesting that the poor estimation of the G × E was related to other factors such as the use of reaction norms to model the interactions (Cuevas et al. 2016). In rice, two studies reported the use of multi-environment models to predict the performances of genotypes in untested environments and obtained similar results (Monteverde et al. 2019; Morais Júnior et al. 2017). Morais Júnior et al. (2017) used historical data from three cycles of a breeding program with a total of 10 environments to assess the predictive ability of a single-step reaction norm model. They obtained high accuracies for the prediction of untested environments for the three traits evaluated: DTF (0.5–0.9), HT (0.25–0.7), and YLD (0.15–0.65). Morais Júnior et al. (2017) also evaluated the effect of G × E modeling in the context of the prediction of untested lines but did not find important differences with the models including only the main effects. Using two breeding populations (indica and japonica), Monteverde et al. (2019) found that modeling the interaction effects with the G × W component (G + W + G × W) did not give better results than modeling the main effects of genotypes and ECs (G + W). Similarly to our results, the integration of the interactions (G × W) even decreased the PA in some cases compared to the simple GBLUP model (G model). These results contrast with previous studies on barley and wheat where the use of ECs to model the environmental effects has resulted in higher prediction accuracies for untested environments (Jarquín et al. 2014; Malosetti et al. 2016). Previous studies on rice also showed that the modelling of G × E interactions tends to increase PA (Baertschi et al. 2021; Ben Hassen et al. 2018; Bhandari et al. 2019; Monteverde et al. 2018). However, most of these studies predict the performance of untested lines in known environments using two common cross-validation approaches to evaluate the PA of multi-environment models: CV1 and CV2 (Burgueño et al. 2012). For example, Ben Hassen et al. (2018) reported a better prediction performance of multi-environment models than single-environment models using a diversity panel phenotyped under alternate wetting and drying and continuous flooding conditions. The gain in accuracy of multi-environment models over single-environment models was 30% under CV2. Similar results were also reported by Monteverde et al. (2018) and Baertschi et al. (2021) but with contrasted gains depending on the traits.

Impact of Training Set Composition

To achieve a higher level of PA for untested environments, the selection of environments to compose the training set can play an important role (Jarquín et al. 2014). In this study, the PAs were found to be higher for both CV-LOEO (all environments) and CV-SEL (four correlated environments) compared to the CV-RAN (four random environments), confirming that using a training set with only correlated environments can be a good strategy. Indeed, several studies have shown that correlations between environments are a key factor in achieving good prediction accuracy, and the use of training data derived from correlated environments can improve prediction accuracy (Spindel and McCouch 2016). For example, Rogers and Holland (2022), using empirical data on maize, found a sharp decrease in predictive ability for the scenario "leave out related environments" compared to the scenario "leave out related hybrids". They concluded that environmental similarity is an important driver of prediction accuracy compared to genetic similarities for environment-specific predictions. In a study on rice, Spindel et al. (2016) found that one of the major differences in prediction accuracies was associated with the level of correlation between environments, in which the prediction accuracies were generally higher when the training data used were from well-correlated environments. In this context, the use of environmental covariates is central to guide the choice of phenotyping sites and potentially reduce phenotyping efforts while maintaining a high level of precision. Therefore, the topic of multi-environmental prediction models and integration of ECs has gradually developed over the past decade in the plant breeding community (Crossa et al. 2022). In contrast to optimizing the composition of the training set (genotypes), optimizing the environmental information to be used for training the models has received less attention (Isidro et al. 2015; Rio et al. 2021).

Implications for the Breeding Strategy at IRRI

Much of the complexity of plant breeding programs arises from G × E. For traits with a large proportion of G × E, such as yield, breeders have different options for evaluating them in their breeding programs. Since the costs of phenotyping are usually a major limitation, a small number of promising genotypes are evaluated in multi-environment trials to quantify the level of G × E and select the genotypes with the best performance (Comstock 1977). This can be a limitation if the goal is to exploit G × E interactions rather than minimize them. For this reason, the concept of a target population of environments (TPE) was defined. This is a set of environments that are homogeneous in terms of phenotypic performances in which future varieties will be grown (Crespo-Herrera et al. 2021). However, it can be difficult to sample the TPE efficiently, especially in small public breeding programs. Being able to predict the performance in untested environments using multi-environment models and ECs can be very useful for a breeding program that operates in different countries like the IRRI program for irrigated systems. Recently, the program was redesigned to integrate genomic selection with enhanced multi-environment evaluations (first-stage yield trials) with the partners. The objective was to shorten the breeding cycle while optimizing multi-environment evaluations (Bartholomé et al. 2022). The findings of the present study support the idea of using all the phenotypic information from correlated environments to make the prediction. Currently, the predictions are made by region but the results from the CV-SEL and CV-LOEO showed that information from other environments can be borrowed to increase the PA. In practice, the use of ECs can help to consider more carefully the correlations between the different environments and therefore restructure the genomic prediction pipeline. In addition, the program is currently implementing a sparse-testing approach that aims to increase the number of lines evaluated while keeping the number of plots to a manageable size (Atanda et al. 2021; Jarquin et al. 2020). In that context, the genotypes are not fully replicated across environments, making the estimation of G × E interactions more difficult. It is, therefore, necessary to move towards the estimation of marker by environment or marker by EC interactions to maintain the level of accuracy.

Conclusion

Understanding the level of G × E in a given population and a given set of environments or locations is essential to better guide the testing strategy of a breeding program. However, the number of environments that can be evaluated by a program is often limited. Genomic prediction can be useful in this respect. In this study, we showed that multi-environment models can predict untested lines with high accuracy. However, the prediction of an untested environment presents some challenges. We showed that models with only the main effects (G + E or G + W) were sufficient to obtain a good level of accuracy and that modelling the genotype by environment interaction (G × E or G × W) did not increase the accuracy. These results will allow more efficient use of the information generated by the IRRI breeding program and optimization of the testing strategy for updating the GS models.

Additional file 2: Fig. S1. Physical positions of the 882 SNPs on the 12 chromosomes of rice using Nipponbare reference genome. Fig. S2. Graphical representation of the allocation of ECP across 15 environments. Blue vertical lines represent the ECP-environment combinations that were observed, and the white lines (blanks) correspond to unobserved combinations. Fig. S3. Schematic representation of the developmental stages of ECP in each environment. Fig. S4. Unweighted neighbor-joining trees between the ECP and Oryza sativa groups from 3K-RG (A), or the indica subgroups (B). Fig. S5. The scatter plot, histogram, and correlation between the DTF, HT, and YLD traits in the single environments. Fig. S6. The scatter plot, histogram, and correlation between environments for the DTF trait (A), HT trait (B) and YLD trait (C). Fig. S7. Comparison of predictive abilities between the three cross-validation scenarios. (A) models with genotypes and environmental covariates' main effects (GW) and their interactions (GW-GxW); (B) models combined all main effects (GEW) and their interactions (GEW-GxE-GxW).
Additional file 3: The genotypic data of elite-breeding lines (882 SNPs).
Additional file 4: The phenotypic data (BLUPs values) of three traits of elite-breeding lines across 15 environments.
Additional file 5: Environmental covariates (ECs) of each environment throughout the whole growing season for the different phases: vegetative (VE), reproductive (RE), and ripening (RI).
Additional file 6: R scripts to perform the genomic prediction analysis.

Acknowledgements
We thank all the staff from the Rice Breeding Team at IRRI in the Philippines and in the regions: South and Southeast Asia (Parth Sarothi Saha, Pranesh K J), East and Southern Africa (Oliver Nyongesa, Simon Njau Kariuki, Arlindo Matsinhe, Rehema Kwayu) for their support in the conduct, management, and collection of phenotypic data. We also thank Juan David Arbelaez and the members of the IRRI Genotyping Service Lab who helped with the genotyping. Many thanks to Diego Jarquin and Eliana Monteverde for sharing references for genomic prediction analysis.

Author Contributions
JB and JNC conceived the study. RIZM, VL, HV, RM, AN, SKK, MRI, and RUJ conducted, managed, and gathered data from the multi-environment trials (METs). VHN and JB checked and analyzed the data. VHN and JB wrote the manuscript. JCG, HFG, and JNC revised and edited the manuscript. All authors approved the final manuscript for submission.
Abbreviations ECP Elit e core panel Funding GS Genomic selection This study was funded by the Bill and Melinda Gates Foundation through GBLUP Genomic best linear unbiased prediction the Accelerated Genetic Gain in Rice (AGGRi) Alliance project (Grant no. PA Predictive ability OPP1194925). Agropolis Foundation (http:// www. agrop olis‑ fonda tion. fr/) and CV Cross‑validation SEARCA (http:// www. searca. org/) funded the PhD fellowship of Nguyen Van Hieu via the DivBreed project, Grant no. 1803‑007. Supplementary Information Availability of Data and Materials The online version contains supplementary material available at https:// doi. The datasets analyzed during the current study are included in this published org/ 10. 1186/ s12284‑ 023‑ 00623‑6. article and its additional files. Additional file 1: Table S1. Composition of the elite core panel (ECP) Declarations used in this study. Table S2. Information on each field trial’s locations and technical parameters. Table S3. Description of the statistical models used Ethics Approval and Consent to Participate for phenotypic data analysis. Table S4. Environmental covariates (ECs) of Not applicable. each environment throughout the whole growing season for the different phases. vegetative ( VE), reproductive (RE), and ripening (RI). Table S5. List Consent for Publication of environments selected for the CV‑SEL scenario. Table S6. Analysis of Not applicable. variance (ANOVA) for the three traits. Table S7. Genomic heritability (h²g) and the associated standard error (SD) for the three traits: days to flower ‑ Competing Interests ing (DTF), plant height (PH) and grain yield (YLD). Table S8. The predictive The authors declare that they have no conflict of interest. abilities for the DTF, HT, and YLD traits for the CV‑RAN scenario. Table S9. The predictive abilities for the DTF, HT, and YLD traits for the CV‑SEL Author details scenario. Table S10. 
The predictive abilities for the DTF, HT, and YLD traits 1 2 CIRAD, UMR AGAP Institut, 34398 Montpellier, France. UMR AGAP Institut, for the CV‑LOEO scenario. Table S11. Predictive abilities for each environ‑ Univ Montpellier, CIRAD, INRAE, Institut Agro, Montpellier, France. R ice Breed‑ ment, based on the predictions of untested lines. ing Innovation Platform, International Rice Research Institute, DAPO, Box7777, Nguy en et al. Rice (2023) 16:7 Page 15 of 17 Metro Manila, Philippines. Institute of Crop Science, College of Agriculture agriculture. Australian plant breeding conference gold coast, Queens‑ and Food Science, University of the Philippines, Los Baños, Laguna, Philippines. land, pp 116–131 5 6 RiceTec. Inc, PO Box 1305, Alvin, TX 77512, USA. CIRAD, UMR AGAP Institut, Costa‑Neto G, Fritsche ‑Neto R, Crossa J (2020) Nonlinear kernels, dominance, Cali, Colombia. Alliance Bioversity‑ CIAT, Cali, Colombia. and envirotyping data increase the accuracy of genome‑based predic‑ tion in multi‑ environment trials. Heredity 126(1):92–106. https:// doi. org/ Received: 4 October 2022 Accepted: 31 January 2023 10. 1038/ s41437‑ 020‑ 00353‑1 Costa‑Neto G, Galli G, Carvalho HF, Crossa J, Fritsche ‑Neto R (2021) EnvR‑ type: a software to interplay enviromics and quantitative genomics in agriculture. G3 Genes Genomes Genet. https:// doi. org/ 10. 1093/ g3jou rnal/ jkab0 40 References Crespo‑Herrera L, Crossa J, Huerta‑Espino J, Mondal S, Velu G, Juliana P, Vargas Ahmadi N et al (2020) Genomic selection in rice: empirical results and implica‑ M, Pérez‑Rodríguez P, Joshi A, Braun H, Singh R (2021) Target population tions for breeding. Quant Genet Genomics Plant Breed. https:// doi. org/ of environments for wheat breeding in India: definition, prediction and 10. 1079/ 97817 89240 214. 0243 genetic gains. Front Plant Sci 12:638520. https:// doi. org/ 10. 3389/ fpls. Arbelaez JD, Dwiyanti MS, Tandayu E, Llantada K, Jarana A, Ignacio JC, Platten 2021. 
638520 JD, Cobb J, Rutkoski JE, Thomson MJ, Kretzschmar T (2019) 1k‑RiCA Crossa J, Pérez‑Rodríguez P, Cuevas J, Montesinos‑López O, Jarquín D, de los (1K‑rice custom amplicon) a novel genotyping amplicon‑based SNP Campos G, Burgueño J, González‑ Camacho JM, Pérez‑Elizalde S, Beyene assay for genetics and breeding applications in rice. Rice 12(1):55. https:// Y, Dreisigacker S, Singh R, Zhang X, Gowda M, Roorkiwal M, Rutkoski J, doi. org/ 10. 1186/ s12284‑ 019‑ 0311‑0 Varshney RK (2017) Genomic selection in plant breeding: methods, mod‑ Atanda SA, Olsen M, Crossa J, Burgueño J, Rincent R, Dzidzienyo D, Beyene Y, els, and perspectives. Trends Plant Sci 22(11):961–975. https:// doi. org/ 10. Gowda M, Dreher K, Boddupalli PM, Tongoona P, Danquah EY, Olaoye G, 1016/j. tplan ts. 2017. 08. 011 Robbins KR (2021) Scalable sparse testing genomic selection strategy for Crossa J, Martini J, Gianola D, Pérez‑Rodríguez P, Jarquin D, Juliana P, early yield testing stage. Front Plant Sci. https:// doi. org/ 10. 3389/ fpls. 2021. Montesinos‑López O, Cuevas J (2019) Deep Kernel and deep learning for 658978 genome‑based prediction of single traits in multienvironment breeding Atlin G, Baker R, McRae K, Lu X (2000) Selection response in subdivided target trials. Front Genet 10:1–13. https:// doi. org/ 10. 3389/ fgene. 2019. 01168 regions. Crop Sci CROP SC I:40. https:// doi. org/ 10. 2135/ crops ci2000. 4017 Crossa J, Montesinos‑López O, Pérez‑Rodríguez P, Costa Neto G, Fritsche ‑Neto Baertschi C, Cao T‑ V, Bartholomé J, Ospina Y, Quintero C, Frouin J, Bouvet J‑M, R, Ortiz R, Martini J, Lillemo M, Montesinos A, Jarquin D, Breseghello F, Grenier C (2021) Impact of early genomic prediction for recurrent selec‑ Cuevas J, Rincent R (2022) Genome and environment based predic‑ tion in an upland rice synthetic population. G3 Genes Genomes Genet tion models and methods of complex traits incorporating genotype × 11(12):320. https:// doi. org/ 10. 
Rice – Springer Journals
Published: Dec 1, 2023
Keywords: Rice; Oryza sativa; Elite lines; Genomic prediction; Genotype by environment interactions; Environmental covariates; Multi-environment genomic prediction models