close
close

Identification of quantitative trait nucleotides for grain quality in bread wheat under heat stress

Phenotypic data analysis

Extensive phenotypic variation was observed among 126 bread wheat genotypes while evaluating the GPC, GAC, TSS content, grain Fe, and Zn content. These genotypes were cultivated under both normal and late sown conditions over two growing seasons “2020–2021 & 2021–2022”, at three different locations of India thus covering twelve different environments for evaluation of phenotypic data (Tables S2-a,b, S3-a,b and S4-a,b). To further understand the distribution of the data in different environments and stress conditions in relation to various studied traits, same is presented through box plot (Figs. 1a–e). Under normal sown conditions the best linear unbiased estimate (BLUE) values of genotypes for GPC, GAC, TSS, grain Fe and Zn content was 12% (± 0.57), 20% (± 1.0), 3% (± 0.52), 44 ppm (± 6.57) and 31 ppm (± 2.76) respectively. Contrarily, the BLUE values for the same attributes under late sown conditions were 13% (± 0.69), 18% (± 1.26), 4% (± 0.75), 48 ppm (± 6.21) and 37 ppm (± 3.08) for GPC, GAC, TSS, grain Fe and Zn content respectively (Table S5). Moreover, the BLUE values of all wheat genotypes for quality attributes was also calculated under different environments (Tables S6-a,b and Fig. S1). The overall impact of HS on the qualitative attributes of wheat under study is summarized in the Table 1.

Fig. 1

Box plot of five biochemical traits under timely and late sown conditions: (a) GPC, (b) GAC, (c) TSS, (d) Grain Fe content and (e) grain Zn content. Each box in the plot represents the average performance under particular environmental condition. JBTS Jabalpur timely sown, JBLS Jabalpur late sown, PBTS Punjab timely sown, PBLS Punjab late sown, SMTS Samastipur timely sown, SMLS Samastipur late sown.

Table 1 Summary of effect of heat stress on quality parameters of wheat with respect to normal sown conditions; + (increases), − (decreases).

The analysis of variance (ANOVA), conducted using individual evaluation data from twelve different environments, revealed significant differences (P 2). Furthermore, the broad-sense heritability values for GPC, GAC, TSS, grain Fe, and zinc content were 0.85, 0.83, 0.92, 0.95, and 0.90, respectively, suggesting that the majority of the phenotypic variation was due to genetic factors. The heritability has been calculated using line-mean basis by following the approach of Glimour et al.40. In addition to that, correlation analysis among quality traits was conducted for both normal and late sown conditions. It was observed that overall correlations under normal sown conditions were weak. Results suggests that for normal sown condition, there is a positive correlation between grain Fe and Zn content (0.17, P P P S2). Under HS circumstances, there was a significant negative correlation between GPC and GAC (− 0.56, P P P P S3).

Table 2  ANOVA of seed biochemical (quality) traits evaluated at three locations under timely and late sown conditions.

Marker distribution and population structure analysis

A 35 K Axiom SNP array was used to genotype 126 wheat genotypes for the AM panel. The genotyping resulted in a total of 35,143 SNPs, which were further filtered for various quality parameters. Minor allele frequency (MAF S7). For better visualization, the distribution of markers in the current studied panel is presented through marker density map (Fig. 2). The average number of polymorphic SNP markers per chromosome was 752.6, and varied from 287 (Chr4D) to 1140 (Chr2B). Maximum SNPs on sub-genome A were mapped on 2A (940), followed by 1A (823). Similarly, the sub-genome B has the most SNPs on Chr2B (1140), followed by Chr5B (1107) and Sub-genome D’s Chr2D had the most SNPs (917), followed by Chr7D (694) (Table S8-a–c).

Fig. 2
figure 2

Marker density map showing distribution of SNP markers within 1 Mb window size across twenty one chromosomes of T. aestivum.

To determine the population structure of AM panel, plot was generated to represent the relationship between ΔK value and the varying number of hypothetical subgroups K. The highest ΔK value was recorded at K = 2, indicating the maximum membership probability (Fig. 3a). Consequently, the panel consisting of 126 wheat genotypes was classified into two distinct clusters i.e. C1 and C2 (Fig. 3b). Among the 126 genotypes, 61 clustered to cluster 1 (C1- red), and 65 genotypes to cluster 2 (C2-green). In addition, 83% of the genotypes were pure line (similarity > 80%), of which 16% and 84% were EC (exotic collection—genotypes from foreign countries) and IC (indigenous collection—genotypes within India) respectively (Table S1), while remaining 17% of the genotypes were categorized as admixtures in C1. Furthermore, 75% of the genotypes from C2 were pure, with 5% of EC and 95% of IC collections, while remaining 25% were categorized as admixtures. Presence of two clusters among the genotypic panel was also confirmed from the phylogeny analysis (Fig. S4).

Fig. 3
figure 3

Structure and kinship analysis. (a) Estimated ∆K values for a given K, the ∆K showed steep declined with a clear peak value at K = 2 (b) Bar plot of the AM panel showing the details of two clusters (C1 and C2) in different colors, viz. C1 (red, 61 genotypes), C2 (green, 65 genotypes).

Linkage disequilibrium analysis

A total of 15,805 polymorphic SNPs were used to assess the level of LD within the AM panel. The extent of LD between marker pairs was estimated on a chromosome-by-chromosome basis. The sub-genome B comprised the most number of polymorphic SNPs (6113, 38.67%), followed by the sub-genome A (5096, 32.26%), and the sub-genome D (4596, 29.07%) (Table S8-a–c). Further, to examine the LD decay, the LOESS (locally weighted linear regression) curve was fitted for each sub-genomes and entire genome. Additionally, a threshold was determined using the confluence of the LD decay curve at r2 = 0.17 based on the fitted model. In sub-genome A, LD decayed the fastest, followed by sub-genomes D and B. LD declined at 3.16 Mb in sub-genome A, as compared to 3.18 Mb in sub-genome D and 4.81 Mb in sub-genome B (Fig. 4a–d). Furthermore, LD decayed for the entire genome at 3.75 Mb.

Fig. 4
figure 4

LD decay plot with LOESS curve in red at r2 = 0.17. (a) LD decay plot of sub-genome A, (b) LD plot of sub-genome B, (c) LD decay plot of sub-genome D and (d) LD decay plot of whole genome.

Association mapping

A significant marker-trait association analysis using six different ML-GWAS models (FASTmrMLM, mrMLM, FASTmrEMMA, ISIS EM-BLASSO, pLARmEB, and pKWmEB) revealed a total of 67 QTNs for five quality traits, with a LOD score of ≥ 3. These QTNs were identified based on the consensus results obtained from at least two ML-GWAS models (Table S9). Among these, 37 QTNs were found to be consistent across three or more models. Out of these, 21 QTNs belongs to the multi-model (≥ 3) and single environment category (Table S10), while 16 QTNs (Fig. 5) were classified under the stable category, representing the multi-model (≥ 3) and multi-environments (≥ 3) (Table 3). The range of r2 values, spanning from 3 to 44.5%, indicates that the wheat quality traits are influenced by multiple loci with varying magnitudes of impact, ranging from small to moderate effects. Furthermore, QTNs were recorded from all twelve environments under study (Tables S11-a–d, S12-a–d, and S133-a–d).

Fig. 5
figure 5

Physical map for 16 highly stable QTNs. Different colors indicate different quality attributes, i.e., green: grain Zn content, red: grain Fe content, blue: GAC, pink: TSS and fast green: GPC.

Table 3 Information of 16 stable QTNs detected for various quality parameters.

In addition, the different ML-GWAS models identified a varying number of significant QTNs. Furthermore, FASTmrEMMA, mrMLM, FASTmrMLM, ISIS EM-BLASSO, and pLARmEB models detected 13, 4, 30, 15, and 2 significant QTNs, respectively (Fig. S5). However, the pKWmEB model did not identify any QTN above the threshold. Among the quality traits, grain Fe content exhibited the highest number of reliable QTNs (15) across three or more models, followed by TSS (10), Zn (9), GAC (2), and GPC (2). The identified QTNs were distributed throughout the genome, except for chromosomes 7A and 7B. The highest number of QTNs was mapped to Chromosome 2B (10), followed by 2A (6), 5D (5), 1B (4), 3A (4), 3B (4), 3D (4), 1D (3), 2D (3), 6A (3), 6B (3), 6D (3), 7D (3), 1A (2), 4A (2), 4D (2), 4B (1), 5A (1), and 5B (1).

Stable QTNs for GPC

A total of 10 reliable QTNs associated with GPC were identified within the AM panel on 8 different chromosomal loci viz. 1B, 1D, 2A, 2B, 3A, 3D, 5D and 6D. These QTNs were detected by at least two GWAS models and accounted for 3–28% of the total phenotypic variation (Table S9). Among these QTNs, Qql.iari-2B.2_pro, linked to the SNP AX-95110999 on chromosome 2B, emerged as the most stable QTN for GPC (Fig. 6a). It exhibited the highest LOD score ranging from 3.47 to 6.09 and was consistently identified across four GWAS models (FastMrEMMA, FASTMrMLM, pLARmEB, and ISIS EM-BLASSO) (Table 3). Additionally, Qql.iari-2B.2_pro showed consistency in four different HS environments.

Fig. 6
figure 6

Manhattan plot of stable QTNs for seed quality (a) GPC (b) GAC, (c) grain TSS, (d) grain Fe content and (e) grain Zn content.

Stable QTNs for GAC

In the AM panel, a total of 7 reliable QTNs were identified, which were linked to GAC. These QTNs were detected by at least two GWAS models, explaining a range of 3.9–44.5% of the overall phenotypic variation (Table S9 and Table 3). The most stable QTN identified for GAC was Qql.iari.1B.1_amy, associated with the SNP AX-94733833 on chromosome 1B (Fig. 6b). This QTN exhibited a highest LOD score ranging from 3.52 to 10.57 for different models and consistently appeared in five GWAS models (MrMLM, FastMrEMMA, FASTMrMLM, pLARmEB, and ISIS EM-BLASSO) (Table 3). Moreover, Qql.iari-1B.1_amy consistently displayed its association with GAC across the AM panel tested in five distinct HS environments.

Stable QTNs for grain TSS content

The association analysis revealed the discovery of a total of 16 reliable QTNs (located on 11 chromosomes, namely 1A, 2A, 2B, 2D, 3B, 3D, 5A, 5B, 6B, 6D and 7D) associated with grain TSS content. These QTNs were identified by at least two GWAS models and accounted for a considerable range of phenotypic variation, ranging from 3 to 27% (Table S9). Notably, two stable QTNs of these, namely Qql.iari-2B.3_TSS and Qql.iari-2A.2_TSS, were linked to grain TSS content (Table 3 and Fig. 6c). These QTNs were further associated with specific SNPs (AX-94537892 and AX-94810283) located on chromosomes 2B and 2A, respectively. Furthermore, both Qql.iari-2B.3_TSS and Qql.iari-2A.2_TSS QTNs demonstrated the highest LOD score, ranging from 3.55 to 6.55, and consistently appeared in four out of six GWAS models (FastMrEMMA, FASTMrMLM, pLARmEB, and ISIS EM-BLASSO) (Table 3). Additionally, Qql.iari-2B.3_TSS and Qql.iari-2A.2_TSS consistently showed their association with grain TSS content across the tested association panel in 5 and 4different HS environments respectively.

Stable QTNs for grain Fe content

Among the total identified QTNs, a set of 20 reliable QTNs consistently appeared in at least two GWAS models for grain Fe content (Table S9). These QTNs were distributed across twelve different chromosomal regions, namely 1B, 1D, 2A, 2B, 3A, 3D, 4A, 4B, 4D, 5D, 6A, and 6B. They accounted for a phenotypic variation ranging from 3 to 39% (Table S9). Notably, most of the identified QTNs were located on chromosome 5D and explained 3.05–27.84% of the phenotypic variance. Eight QTNs (Qql.iari-2A.1_Fe, Qql.iari-2B.1_Fe, Qql.iari-1D.1_Fe, Qql.iari-5D.1_Fe, Qql.iari-5D.2_Fe, Qql.iari-4A_Fe, Qql.iari-3A_Fe, and Qql.iari.1D.2_Fe) exhibited high consistency across four GWAS models (FastMrEMMA, FASTMrMLM, pLARmEB, and ISIS EM-BLASSO). These QTNs were associated with specific SNPs (AX-94461119, AX-94496990, AX-94981856, AX-94686732, AX-94707318, AX-94750301, AX-94861766, and AX-94981856) located on chromosomes 2A, 2B, 1D, 5D, 5D, 4A, 3A, and 1D, respectively (Table 3). Moreover, all eight QTNs consistently demonstrated their association with grain Fe content across the tested AM panel in at least three different HS environments. Notably, Qql.iari-5D.1_Fe and Qql.iari-3A_Fe displayed the highest LOD scores, ranging from 3.10 to 6.61 and 4.01–6.17, respectively (Fig. 6d).

Stable QTNs for grain zinc content

A total of 14 QTNs showed reliable associations with grain Zn content, as detected by at least two models. These QTNs were located on various chromosomal regions, including 1B, 2A, 2D, 3B, 4A, 4D, 5D, 6A, 6B, and 7D. They explained phenotypic variation ranging from 1.24 to 31.51% (Table S9). Among these QTNs, five (Qql.iari-6A_Zn, Qql.iari-2A.3_Zn, Qql.iari-7D_Zn, Qql.iari-1B.2_Zn, and Qql.iari-3B_Zn) consistently appeared in four or five GWAS models. These QTNs were associated with specific SNPs (AX-94637211, AX94671612, AX-95220192, AX-95234556, and AX-95235883) located on chromosomes 6A, 2A, 7D, 1B, and 3D, respectively (Table 3). Moreover, all five QTNs consistently demonstrated their association with grain Zn content in at least three different heat stress environments. Notably, the QTN Qql.iari-6A_Zn was detected by five GWAS models (MrMLM, FastMrMLM, FastMrEMMA, pLARmEB, and ISIS EM-BLASSO), indicating it as a highly reliable locus for grain Zn content under HS. It exhibited the highest LOD score ranging from 3.96 to 7.49, across the different GWAS models (Fig. 6e).

Annotation of putative candidate genes linked to QTNs

QTNs that demonstrated a high level of significance and consistency, were annotated due to their perceived reliability and ability to perform effectively in various environments. To identify potential candidate genes, SNPs (probe sequences) significantly linked to quality traits were cross-referenced with the Triticum aestivum genome assembly IWGSC-refseq version 1.0 using the online web resource Ensemble plants ( To further validate and enrich our results, we conducted an additional Gene Ontology (GO) annotation analysis using the DAVID online tool. This cross-validation approach has helped to reinforce our findings and provide a more comprehensive understanding of the candidate genes’ functions ( (Table S14). Sixteen stable QTNs (detected by ≥ 3 models at ≥ 3 locations) were subjected for annotation for their potential role/function, of which twelve markers were successfully annotated (Table 4).

Table.4. Annotation of putative candidate genes linked to quantitative trait nucleotides (QTNs) of grain quality parameters.

KASP validation of identified QTNs

KASP primers (Table S15) have been designed from 51 QTNs of which 33 found to be polymorphic on panel of 101 lines consisting of resistant and susceptible lines (Tables S16-a,b). The panel has been procured from BISA, Ludhiana. The two pivotal KASP markers, denoted as AX-94461119 and AX-95220192, for grain Fe and Zn content respectively, were found to be associated with respective QTNs in this panel (Fig. 7). These markers, situated on distinct chromosomal locations, namely 2A and 7D, respectively, manifested remarkable statistical significance in the context of identified QTN concerning HS. Notably, the alternative alleles of these markers exerted discernible and substantial impacts on the phenomenon of HS, signifying tolerance.

Fig. 7
figure 7

Systematic representation of the workflow used to validate identified QTNs associated with the grain quality traits under heat stress through KASP assay.

In the subsequent validation phase, the study meticulously identified specific allelic variants correlated with the attributes of heat resistance or tolerance. Allelic variant G of marker AX-95220192, exhibiting a p-value p-value 8).

Fig. 8
figure 8

Kruskal–Wallis test confirms significant Heat Susceptibility Index (HSI) differences between alleles of two KASP validated markers (A) AX-94461119, for grain iron content and (B) AX-95220192, for grain zinc content.