
Estimating the distribution of fitness effects of loss of heterozygosity (LOH) events using an engineered library of Saccharomyces cerevisiae

This is an uncorrected proof.
Abstract
Loss of heterozygosity (LOH) is a major contributor to genetic variation in natural populations or lab-evolved asexual diploids. However, its contribution to adaptation is uncertain, because the full spectrum of its fitness effects remains largely uncharacterized. To systematically investigate the distribution of fitness effects (DFE) of LOH, we engineered a diverse, barcoded library of heterozygous diploid Saccharomyces cerevisiae strains, each containing randomly induced LOH events. By employing competitive fitness assays and barcode sequencing (Bar-seq) across seven distinct environments, including various stressors from chemicals, temperature, and an in vivo host model, we quantified the fitness consequences of LOH events. Our results reveal that the DFE of LOH is predominantly neutral to deleterious, with a general trend of decreasing fitness correlated with larger cumulative lengths of LOH tracts. However, the fitness effects were variable and the fitness landscape was highly environment-dependent. While beneficial LOH events were rare and of small effect in standard rich media, they were common and conferred substantial fitness gains in novel or stressful conditions. A key finding was the prevalence of antagonistic pleiotropy, where over 75% of strains exhibited fitness trade-offs, gaining an advantage in one environment at the cost of fitness in others. The magnitude of these trade-offs was also found to correlate with longer LOH tracts. Overall, this work demonstrates that LOH plays a dual role in evolution. While it often imposes a genetic burden, it also provides a powerful mechanism for rapid, environment-specific adaptation, driving specialization by trading general robustness for niche-specific advantages.
Author summary
Loss of heterozygosity (LOH) is a common genetic process in diploid organisms, where one allele of a gene on one chromosome is replaced by a duplicate copy of the allele from the other chromosome. Although LOH occurs frequently as diploid cells divide, our understanding of its effects is limited, because most studies focus on events that have already been filtered by natural selection. To fully capture both the beneficial and detrimental effects LOH may generate, we engineered a mutant library of the baker’s yeast Saccharomyces cerevisiae containing randomly induced LOH events. Unlike traditional methods, our approach allowed us to measure the immediate consequences of these changes across the genome before they were lost to natural selection. By testing these strains across various environments, we found that while LOH is often deleterious or neutral, it can be a source of beneficial genetic variation depending on the environmental context. We observed a high prevalence of trade-offs associated with LOH, where strains gained an advantage in one environment at the cost of fitness in others. These findings suggest that LOH is capable of facilitating rapid adaptation as a form of short-sighted evolution and specialization.
Citation: Ke Y-H, Orozco-Quime M, Bauman J, Carvalho T, Landry GA, Ge S, et al. (2026) Estimating the distribution of fitness effects of loss of heterozygosity (LOH) events using an engineered library of Saccharomyces cerevisiae. PLoS Genet 22(3): e1012083. https://doi.org/10.1371/journal.pgen.1012083
Editor: Jun-Yi Leu, Academia Sinica, TAIWAN
Received: November 26, 2025; Accepted: March 5, 2026; Published: March 16, 2026
Copyright: © 2026 Ke et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: Whole genome sequencing and Bar-seq reads are available in BioProject PRJNA1347851 in GenBank. Scripts, raw tables, and raw plots are available at https://github.com/KeFungi/yeastLOH_pub.
Funding: This work was supported by the National Institutes of Health, National Institute of Allergy and Infectious Diseases (R21AI168571-01 to TYJ and AK) and by the Canadian Institute for Advanced Research (FS22-159 to TYJ). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
Introduction
Genetic variation serves as the fundamental substrate for evolutionary change, providing the raw material upon which natural selection can act to shape the diversity of life and facilitate adaptation to myriad environments [1–3]. Genetic variation arises through mutation and recombination, and the influence of both processes on evolution has been well-established. Diploid organisms, characterized by the presence of two sets of chromosomes, possess unique properties in the relationship between mutation and fitness. The dual copies of genes allow for the masking of recessive alleles by their dominant counterparts, potentially buffering against the deleterious effects of mutations [4–8], while simultaneously maintaining a reservoir of cryptic genetic variation [9]. Consequently, evolutionary dynamics in diploid species are profoundly influenced by mechanisms that alter the state of heterozygous sites, such as sexual reproduction, ploidy shifts, or mitotic recombination, which brings previously hidden genetic information to the fore [10–13].
Loss of heterozygosity (LOH) is increasingly appreciated as a potent source of genetic and phenotypic variation in clonal diploid organisms. LOH is a ubiquitous genetic process, leading to the irreversible conversion of heterozygous genomic regions to a homozygous state [12,14,15]. This phenomenon can arise through various DNA repair and recombination pathways, including mitotic recombination, gene conversion, chromosome loss followed by reduplication, or deletion events [16–20]. The rate of generating new genotypes by LOH can be as high as 10⁻² events per cell division, equivalent to 5 × 10⁻⁵ per site per generation [21–23], which vastly outpaces point mutations occurring at a rate of roughly 2 × 10⁻¹⁰ per base per division [24–26]. This is particularly significant for asexual diploid organisms, because, unlike sexual populations which can bring together beneficial recessive alleles through sexual reproduction, an asexual population would never otherwise generate homozygotes of beneficial recessive alleles. LOH is therefore relevant to the evolution of a large number of diploid species that are primarily or facultatively clonal, including lineages of fungi, protists, invertebrates, and plants [27]. In these species, LOH tracts have been observed in both natural and experimental populations, particularly in microbial eukaryotes [28–30]. LOH may effectively allow partially clonal diploid organisms to “skip” a waiting period for novel beneficial recessive alleles in the population by generating homozygous genotypes of such alleles [31,32]. Although the advantages of homozygous alleles are normally discussed for recessive beneficial alleles, underdominant variants may be affected in a similar manner [31,33].
In both wild populations and lab-evolved populations of fungi, LOH has been shown to generate phenotypic plasticity that facilitates adaptation to novel or stressful environments [34–36]. For instance, in Candida albicans, LOH was found in isolated samples from patients in key genes like ERG11 or TAC1, conferring resistance against antifungal drugs [37–41]. In many clonal diploid species, such as Candida albicans, the emerging amphibian pathogen Batrachochytrium dendrobatidis [42], and opportunistic pathogens such as Cryptococcus hybrids [43], LOH is widespread across the genome, but the clonal nature of these organisms makes it difficult to determine the fitness impact of LOH events.
The budding yeast, Saccharomyces cerevisiae, a genetically tractable model organism, has been instrumental in dissecting the molecular mechanisms and evolutionary consequences of LOH [44–46]. Population genetics surveys of natural S. cerevisiae isolates suggest that the populations experienced a history of occasional outcrossing followed by clonal proliferation with extensive LOH tracts, supporting the idea that the natural population of this species underwent LOH-mediated diversification [47,48], though the role of adaptation in promoting such variants is unclear. In a lab setting, clones adapting to different environments have accumulated LOH at loci implicated in adaptation, implying LOH was beneficial [29,49]. Other mutation accumulation or experimental evolution studies have demonstrated that LOH can generate substantial phenotypic variation, increasing fitness against challenges including chemical stresses and nutrient limitation [46,50–53]. On the other hand, in some experimental evolution studies on diploid yeast, the consequences and implications of LOH could not be determined [53,54].
Despite the prevalence of LOH, there is little consensus on its relevance for driving adaptive evolution relative to point mutation and sexual recombination. Although the distribution of fitness effects of point mutations in yeast is extensively studied [55–58], the fitness impact of LOH is multifaceted. Many investigations of adaptive LOH have focused on discovering the causal loci of particular selective pressures such as drug resistance or environmental stresses in accumulated mutations [29,37,50,59]. However, LOH events do not simply unmask a single major-effect allele, but introduce multiple genetic changes through the spread of homozygosity affecting many loci and various types of genetic variants [32,60–63]. This is supported by cases in which adaptive LOH incurs antagonistic pleiotropy [64–66], whereby clones gaining fitness under one stress frequently lose fitness under other environments [49,52]. Therefore, understanding the broader spectrum of the distribution of fitness effects (DFE) of LOH in different environments prior to selection will allow a better estimation of the fitness landscape of LOH events. Such systematic assessment is required to fully understand the extent to which LOH facilitates adaptation and its role in evolution [57,64].
To achieve the goal of systematically investigating the DFE of genome-wide LOH events, this study developed a high-throughput approach to generate random LOH in diploid yeast for which fitness could be easily measured. Our primary objectives were: (1) to generate a diverse, barcoded library of diploid S. cerevisiae strains each harboring distinct LOH events; (2) to quantitatively determine the DFE associated with these LOH events through barcode sequencing of the passaged library (Bar-Seq); (3) to determine the pleiotropic effects of LOH events by measuring fitness across environments. To construct such a library, we introduced double-strand DNA breaks (DSBs) in a BY4741 × SK1 hybrid using a rare restriction enzyme I-SceI to cut restriction sites inserted via random transposition, a method adapted from high-efficiency genome editing [67]. The subsequent repair of DSBs templated from the homologous chromosomes created randomized LOH events in the genome. The resultant strains comprise a library of diploid yeasts with a wide spectrum of LOH events, varying in length and genomic location and each labeled with a barcode sequence that could be used for tracking genotypes and pooled fitness assays.
Results
Properties of the constructed LOH library strains
We constructed a library of barcoded diploid S. cerevisiae strains exhibiting loss of heterozygosity (LOH) using our method of inducing site-specific double-strand breaks (DSBs) with randomly inserted I-SceI sites (Fig 1A). We performed whole genome sequencing on 195 FUdR-resistant strains for barcode calling, genotype calling, and LOH identification. After excluding strains with too much missing genotype (> 5%) (S1 Fig), haploid strains (homozygous rate > 95%) (S2 Fig), unidentifiable barcodes, and barcodes underrepresented in the inoculum (<0.01%) (S3 Fig), a total of 150 diploid LOH strains and 7 diploid control strains were used in the analyses (S1 Table).
(A) One strain (BY4741) was used for piggyBac transposon insertion with an I-SceI restriction site, the counterselectable TK-MX gene, and became the replacement strain. The donor strain (SK1) contained a 20 bp barcode and the inducible I-SceI restriction enzyme encoding gene. (B) After genome sequencing of LOH-induced strains, they were pooled and fitness estimated in vivo (Galleria mellonella) and in vitro using barcode sequencing (Bar-seq).
https://doi.org/10.1371/journal.pgen.1012083.g001
The LOH strain library displayed a diverse spectrum of LOH events, varying in length, location, and parental origin (Figs 2 and S4 and S1 Table). LOH resulted in homozygosity of alleles derived from either the SK1 or BY4741 parent, although the SK1-type was favored by design, because DSBs were induced on BY4741-derived chromosomes, while SK1-derived chromosomes acted as template for repair. Among detected LOH tracts, 82.6% were of the SK1-type (S2 Table), suggesting that the LOH genotypes differed from the expected 1:1 ratio in spontaneous LOH events and the SK1-type was prevalent (S5 Fig).
The cumulative length of LOH regions per strain also showed that the SK1-type was indeed dominant in the constructed strains (Fig 2A), with 88.7% of strains having SK1-type LOH longer than 1000 bp, and 74.0% of strains having no BY4741-type LOH. The LOH tracts covered the majority of regions of chromosomes (Fig 2D), where the SK1-type covered 85.5% of the genome, the BY4741-type tracts covered 32.0% of the genome, and 86.8% of the genome was covered by either. The genomic regions missing LOH coverage were concentrated on locations adjacent to the centromere regions.
(A) Histograms depicting the distribution of the cumulative length of LOH tracts in each strain. (B) Histograms depicting the distribution of length in individual LOH tracts. (C) Histograms depicting the distribution of the number of LOH tracts in each strain. In each histogram, LOH tracts are divided into homozygosity of alleles originating from the BY4741 alleles or the SK1 alleles, or pooled regardless of genotype (pooled). (D) Coverage of observed LOH tracts in the genome across all strains in the LOH mutant library. The position of the midpoint of LOH events, the centromere and SNPs between parental strains are marked below.
https://doi.org/10.1371/journal.pgen.1012083.g002
The disparity in cumulative length between the two LOH genotypes was largely driven by differences in the number of LOH events per strain instead of the length of individual LOH events. The length distribution of either LOH genotype was broadly similar, with a predominance of events spanning from 1 kb to 1 Mb (92.3%) with fewer instances of very short (<1 kb) or extremely long (>1 Mb) events (Figs 2B and S6). However, the majority of strains contained no BY4741-type LOH but 1 or 2 SK1-type LOH events, resulting in a dominance of SK1-type LOH events in the constructed strains.
Notably, although the BY4741-type LOH might have arisen spontaneously, 34.9% of BY4741-type tracts were found within 100,000 bp of SK1-type LOH events, suggesting that some of them were also induced by the DSBs on BY4741 chromosomes but underwent a more complicated repair process [68].
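The proximity statistic above (the share of BY4741-type tracts lying within 100,000 bp of an SK1-type tract) can be illustrated with a minimal sketch. The `Tract` record, the function names, and the rule that only tracts in the same strain and on the same chromosome are compared are assumptions made for illustration; the text does not state how the distance was computed.

```python
from typing import NamedTuple

class Tract(NamedTuple):
    """Hypothetical record for one LOH tract (field names are illustrative)."""
    strain: str
    chrom: str
    start: int      # first homozygous position (bp)
    end: int        # last homozygous position (bp)
    genotype: str   # "SK1" or "BY4741"

def gap(a: Tract, b: Tract) -> int:
    """Distance in bp between two tracts on one chromosome (0 if they overlap)."""
    return max(0, max(a.start, b.start) - min(a.end, b.end))

def fraction_near(tracts, focal="BY4741", other="SK1", max_gap=100_000):
    """Fraction of `focal` tracts with an `other` tract within `max_gap` bp.

    Assumption: only tracts from the same strain and chromosome are compared.
    """
    focal_tracts = [t for t in tracts if t.genotype == focal]
    if not focal_tracts:
        return float("nan")
    near = 0
    for f in focal_tracts:
        if any(
            o.genotype == other
            and o.strain == f.strain
            and o.chrom == f.chrom
            and gap(f, o) <= max_gap
            for o in tracts
        ):
            near += 1
    return near / len(focal_tracts)

# Made-up example data, not taken from the study.
example = [
    Tract("A", "chrIV", 100_000, 150_000, "SK1"),
    Tract("A", "chrIV", 200_000, 210_000, "BY4741"),  # 50 kb from the SK1 tract
    Tract("A", "chrV", 10_000, 20_000, "BY4741"),     # different chromosome
    Tract("B", "chrIV", 400_000, 410_000, "BY4741"),  # strain B has no SK1 tract
]
print(fraction_near(example))
```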
Fewer terminal LOH tracts (T-LOH) were observed than interstitial LOH tracts (I-LOH) (38% vs. 62%; S2 Table). This ratio was similar regardless of the parental genotype, where 38.4% of SK1-type LOH and 36.5% of BY4741-type LOH were T-LOH, respectively. T-LOH was prone to create longer LOH tracts than I-LOH (S6 Fig), whose median lengths were 200,241 bp and 10,478 bp, respectively. The length of chromosome was correlated to the T-LOH length (R = 0.37, p < 0.001), mostly through soft capping the maximum length of T-LOH events (S7 Fig), whereas I-LOH lengths were much shorter than the occurring chromosome and were uncorrelated with chromosome length (R = 0.062, p = 0.356). The number of LOH events on each chromosome was strongly correlated with the chromosome length (S8 Fig) (R = 0.92, p < 0.001), implying the DSBs were induced randomly across base pairs in the genome. The distributions of LOH events relative to the centromere and telomere between T-LOH and I-LOH or between SK1-type and BY4741-type were not significantly different (S9 Fig).
Distribution of fitness effects of the induced LOH events in various environments
The fitness of the LOH library strains and control strains was quantified by Bar-seq across a range of environmental conditions to determine their DFE (Fig 1B). The control strains were the diploid crosses from the same pool of SK1 strains with I-SceI and random barcodes, but crossed to the BY4741 line lacking transposition insertion and thus no I-SceI restriction site. All constructed strains were pooled by equal volume and inoculated into different treatments to constitute competitive fitness assays. The treatments tested included six diverse liquid growth environments that represent various sources and degrees of stress previously used in experimental evolution of yeasts [29,49,69,70], as well as an invertebrate host model using Galleria mellonella [71] (worm-20C).
We included synthetic complete medium (SC) and yeast extract peptone dextrose at 30°C (YPD-30C) as standard rich media in laboratory conditions to which the laboratory strains were adapted, acidic synthetic defined medium at pH 2.6 (Acidic) and YPD with 4.5 mM hydrogen peroxide (H2O2) as simple chemical stressors, YPD supplemented with 2 mg/ml caffeine (Caffeine) as a stressor of natural anti-microbial compounds, and worm-20C for representing biological interactions and pathogenic potential [71]. The abundance of strains in the inoculum and the competitive fitness assay was determined by the barcode frequency with a minimum of 1,432,531 sequenced reads per assay (S3 Table). Fitness of strains was calculated by the relative enrichments of abundance in the competitive assays in comparison to the control strains. Bar-seq of the inoculum pool indicated the initial frequency of strains varied from 0.14% to 1.07% in the populations, and the initial frequency did not systematically affect the inferred fitness (S10 Fig). Except in the worm-20C and H2O2 data, the control strains and LOH library strains had consistent frequency changes among replicates (S11 and S12 Figs), with more than 94.8% of variation explained by strains (S4 Table), providing a reliable estimation of fitness for each strain. In the worm-20C and H2O2 datasets, only 16.0% and 85.9% of observed variation could be explained by strains. The standard deviation of strain fitness measurement across replicates was also higher in these two treatments (S13 Fig), suggesting their fitness estimation was less accurate. The variability of fitness assays in the worm host environment was likely caused by low population size in this stressful environment due to bottleneck effects during the initial infection process [71]. The overall fitness landscape of the LOH library demonstrated the varied impact of LOH in the tested treatments, from negative to neutral (S5 Table and Fig 3).
In SC and Caffeine, the average fitness changes were nearly neutral (s = 0.001 and -0.013, respectively). About half of the strains had positive selection coefficients (48.5% and 52.2%, respectively), with standard deviations of 0.082 and 0.077, suggesting that LOH could provide genetic and phenotypic variation for adaptation in those environments. More substantial average fitness detriments were observed in the YPD-30C, Acidic, YPD-37C, worm-20C, and H2O2 growth conditions, where the average selection coefficients ranged from -0.069 to -0.185, and only 3.7% to 27.2% of LOH strains possessed positive fitness effects. When the analysis was restricted to strains containing a single LOH event, the DFE and the summary statistics of those strains and of the whole library were similar in shape (S14 Fig) and highly correlated (r > 0.98, R2 > 0.95, S15 Fig). Except in YPD-30C and SC, the other five treatments showed bimodal patterns in the distribution (Tables 1 and S6), with a nearly neutral peak and another low-fitness peak.
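As a rough illustration of how a selection coefficient and the summary statistics reported above (mean, standard deviation, positive rate) could be derived from barcode counts, the sketch below assumes a per-generation log-enrichment relative to the mean of the control strains. The exact formula, the number of generations, and all function names are assumptions for illustration; the text states only that fitness was calculated from relative enrichment in comparison to the control strains.

```python
import math
from statistics import mean, stdev

def selection_coefficients(counts_start, counts_end, controls, generations):
    """Per-generation log-enrichment of each barcode relative to the controls.

    Assumed form: s = [ln(f_end / f_start) - mean control ln(f_end / f_start)] / generations.
    Barcodes with zero reads would need separate handling (e.g. a pseudocount).
    """
    total_start = sum(counts_start.values())
    total_end = sum(counts_end.values())

    def log_enrichment(barcode):
        f_start = counts_start[barcode] / total_start
        f_end = counts_end[barcode] / total_end
        return math.log(f_end / f_start)

    control_mean = mean(log_enrichment(b) for b in controls)
    return {
        b: (log_enrichment(b) - control_mean) / generations
        for b in counts_start
        if b not in controls
    }

def summarize(s_values):
    """Mean, standard deviation, and fraction of strains with s > 0."""
    values = list(s_values)
    return {
        "mean": mean(values),
        "sd": stdev(values),
        "positive_rate": sum(v > 0 for v in values) / len(values),
    }

# Made-up read counts; 10 generations is a hypothetical value.
start = {"ctrl1": 1000, "ctrl2": 1000, "x": 1000, "y": 1000}
end = {"ctrl1": 1000, "ctrl2": 1000, "x": 2000, "y": 500}
s = selection_coefficients(start, end, controls={"ctrl1", "ctrl2"}, generations=10)
print(s, summarize(s.values()))
```

Normalizing against the controls makes the result independent of sequencing depth, since the total read count cancels in the difference of log-enrichments.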
Each panel displays a histogram for the distribution of selection coefficient (s) of all strains in the LOH mutant library containing at least 1 LOH tract. The selection coefficient (s) is calculated relative to control strains in specific growth environments: Acidic, Caffeine, H₂O₂ (hydrogen peroxide), SC (synthetic complete medium), YPD-30C (yeast extract peptone dextrose at 30°C), YPD-37C (YPD at 37°C), worm-20C (inside Galleria mellonella at 20°C), and the average across environments. Text annotations provide the calculated mean fitness, standard deviation, and the positive rate representing the fraction of strains exhibiting a positive selection coefficient.
https://doi.org/10.1371/journal.pgen.1012083.g003
https://doi.org/10.1371/journal.pgen.1012083.t001
We further inferred the underlying distribution for a single LOH tract with explicit modeling. Since strains with a single LOH tract were limited, we leveraged data from strains containing 1 to 3 LOH events (73.3% of strains), with the simplifying assumption that the fitness effects of multiple LOH tracts were purely additive, so that the DFE of those strains depends only on the underlying model and parameters of single LOH tracts. We applied a numerical maximum likelihood search to estimate the parameters of the single-LOH distribution using the observed pairs of LOH count and fitness in strains. The results showed a negatively-skewed Weibull in the Caffeine dataset and a light-tailed generalized normal distribution in SC (Tables 1 and S7 and S16 Fig), both possessing more concentrated peaks than a standard normal distribution. The bimodality of other treatments impeded fitting a simple model with unimodal distribution. However, we modeled the properties of the tail distribution from the 20 highest fitness strains against the generalized Pareto distribution and the nested exponential distribution that are predicted under the theory of extreme values [72]. In YPD media (YPD-30C and YPD-37C), the rich medium used for maintaining yeast cultures in the laboratory, the right tail was truncated, i.e., in the Weibull domain [73] (LRT p-value < 0.05 and shape ξ < 0; S8 Table and S15 Fig), with a low maximum s of less than 0.020. The flat tail (ξ ≈ -1) inferred from YPD-37C was likely caused by the bimodality in the tail, which does not imply a true simple distribution. The other 5 treatments had an exponential right tail (LRT p-value > 0.05) with a higher observed maximum s ranging from 0.061 to 0.193.
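The additivity assumption can be illustrated with a deliberately simplified sketch. Here the single-tract effect is assumed to be normally distributed, which is not the model reported above (a Weibull and a generalized normal were fitted by numerical search); the normal case is used only because the sum of k independent effects is then Normal(kμ, kσ²) and the maximum likelihood estimates have a closed form. Function names and data are hypothetical.

```python
import math

def fit_additive_normal(observations):
    """Closed-form MLE of (mu, sigma) for a single-tract effect.

    `observations` is a list of (k, s) pairs: number of LOH tracts in a strain
    and its selection coefficient. Under additivity with normal effects,
    s ~ Normal(k * mu, k * sigma**2).
    """
    n = len(observations)
    total_tracts = sum(k for k, _ in observations)
    mu = sum(s for _, s in observations) / total_tracts
    sigma2 = sum((s - k * mu) ** 2 / k for k, s in observations) / n
    return mu, math.sqrt(sigma2)

def log_likelihood(observations, mu, sigma):
    """Log-likelihood of the (k, s) pairs under the additive normal model."""
    total = 0.0
    for k, s in observations:
        var = k * sigma ** 2
        total += -0.5 * math.log(2 * math.pi * var) - (s - k * mu) ** 2 / (2 * var)
    return total

# Made-up (tract count, selection coefficient) pairs.
obs = [(1, -0.02), (1, 0.00), (2, -0.05), (3, -0.05)]
mu_hat, sigma_hat = fit_additive_normal(obs)
print(mu_hat, sigma_hat, log_likelihood(obs, mu_hat, sigma_hat))
```

For non-normal single-tract distributions the density of a sum of k effects has no simple form, which is presumably why a numerical search was needed in the analysis described above.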
The general negative effects of LOH events also manifested in the relationship between the cumulative length of LOH events and fitness consequences. A general trend emerged wherein an increasing cumulative length of LOH per strain was more often correlated with more negative fitness effects. The significant negative correlation was observed in 3 of 7 environments and in the average fitness across treatments (Fig 4). A similar pattern was observed in terms of the number of LOH events in the genome (S17 Fig), but the correlation was generally weaker and only significant regarding the number of T-LOH, indicating it was the generated runs of homozygosity that had negative effects on fitness. The cumulative LOH length and genome-wide homozygosity were highly correlated (S18 Fig) and a correlation between the heterozygosity in the genome and the fitness was also observed (S19 Fig). Since LOH containing alleles from either parent gave the same negative trend (S20 Fig), this effect also pointed to the detrimental effects of widespread homozygosity itself instead of alleles associated with higher fitness. If the effects had been caused by favorable parental alleles or genomes, the length of one genotype of LOH would be expected to have a positive effect on fitness, while the other had a negative effect. To further understand whether a few large-effect loci might drive the fitness effects, quantitative trait loci (QTL) mapping was performed to identify specific genomic regions associated with fitness variation (S21-S28 Figs). Multiple peaks in the Manhattan plot suggested that fitness in the investigated environments was mostly polygenic. Thus, the trend was likely from the integration of numerous small-effect alleles created by LOH. A single opposite trend between fitness and LOH was observed in SC media, where a higher number of LOH events, higher cumulative LOH length, and lower heterozygosity increased the fitness of strains.
Each panel shows a scatter plot of strain selection coefficient (s) against total LOH length within each strain (log-10 scale) in specific growth environments: Acidic, Caffeine, H₂O₂ (hydrogen peroxide), SC (synthetic complete medium), YPD-30C (yeast extract peptone dextrose at 30°C), YPD-37C (YPD at 37°C), worm-20C (inside Galleria mellonella at 20°C), and the average across environments. Each point represents an individual strain. A linear regression line is fitted to the data in each panel, with the corresponding equation and p-value for the slope displayed.
https://doi.org/10.1371/journal.pgen.1012083.g004
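The regression described in the caption above (selection coefficient against log-10 total LOH length) can be sketched with ordinary least squares. This is a minimal illustration with made-up data and a hypothetical function name; it returns the slope, intercept, and Pearson correlation but omits the slope p-value, which would require a t-distribution from a statistics library.

```python
import math

def regress_fitness_on_log_length(lengths_bp, s_values):
    """Least-squares fit of s = intercept + slope * log10(total LOH length)."""
    x = [math.log10(length) for length in lengths_bp]
    y = list(s_values)
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r

# Made-up data: fitness falls by 0.02 for every tenfold increase in LOH length.
lengths = [1e3, 1e4, 1e5, 1e6]
fitness = [0.00, -0.02, -0.04, -0.06]
slope, intercept, r = regress_fitness_on_log_length(lengths, fitness)
print(slope, intercept, r)
```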
Pleiotropic effects of LOH
While many LOH events conferred neutral or negative fitness effects, specific LOH strains exhibited significant fitness gains in particular environments, demonstrating their potential to drive adaptive evolution. In some treatments, the strains with the highest fitness had increased growth rate by up to 19.3% in H2O2, 18.7% in SC, 14.4% in Caffeine, and 8.65% in the acidic environment. The top-performing strains in these environments had longer LOH tracts in the genome in contrast to top-performing strains in YPD with minimal LOH tracts (Fig 5). The investigation of the top-performing strains in each treatment revealed that the majority of these strains displayed positive fitness across all replicates in their respective optimal environments, suggesting robust benefits in their fitness (Fig 6). We did not include the high fitness strains from the worm-20C dataset because their variation among replicates was too high, as mentioned above.
LOH tracts of a curated set of yeast strains, selected for being ranked among the top 5 performers in at least one of the six evaluated growth environments (Acidic, Caffeine, H₂O₂, SC, YPD-30C, or YPD-37C). The color blocks depict LOH tracts in regard to their genomic position and the origin of the present genotype.
https://doi.org/10.1371/journal.pgen.1012083.g005
Fitness of a curated set of yeast strains, selected for being ranked among the top 5 performers in at least one of the six evaluated growth environments (Acidic, Caffeine, H₂O₂, SC, YPD-30C, or YPD-37C). The six panels are each dedicated to their fitness in one of these treatments. Within each panel, the selection coefficient (s) of the same set of selected strains in the x-axis is displayed using box plots to represent the distribution of fitness across replicates. The color key indicates the specific treatment(s) in which each strain achieved its top-5 ranking. A combination of treatments and a different color are used if a strain reaches its top-5 performance in multiple treatments.
https://doi.org/10.1371/journal.pgen.1012083.g006
However, the fitness gain was environment-specific and came with a cost in other environments, suggesting the presence of antagonistic pleiotropy. For example, strains NFS-2A and P1-4C exhibited a notable consistent fitness gain in the Caffeine (s = 0.094 ± 0.017) and Acidic (s = 0.03 ± 0.017) environments, respectively, while performing significantly worse in the opposite environments (s = -0.139 ± 0.025 and s = -0.241 ± 0.05, respectively). The majority of strains (75.7%) demonstrated a trade-off where they had positive fitness in some environments and negative fitness in others (S29 Fig). Some strains (23.5%) exhibited consistent negative fitness in all tested environments, demonstrating the potential burden of runs of homozygosity. Only one strain (P1-5A) had positive fitness changes across all treatments. Some strains demonstrated similar responses to multiple environments. For example, top-performing strains in Acidic had worse fitness in Caffeine, top-performing strains in H2O2 had higher fitness in SC, and strains shared high fitness between the YPD-30C and YPD-37C growth conditions (Fig 6).
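The three categories described above (trade-off, negative everywhere, positive everywhere) amount to a sign-based classification, sketched below. The function name and the handling of exactly zero values are assumptions, and the text does not say whether a significance threshold was applied, so this version uses the raw sign of s only. The two-environment values for NFS-2A and P1-4C are the means quoted above; the third strain is made up.

```python
def classify_strain(s_by_env):
    """Classify a strain by the signs of its selection coefficients.

    Assumption: raw signs are used, with no significance filter.
    """
    values = list(s_by_env.values())
    has_gain = any(v > 0 for v in values)
    has_cost = any(v < 0 for v in values)
    if has_gain and has_cost:
        return "trade-off"
    if has_cost:
        return "negative in all environments"
    if has_gain:
        return "positive in all environments"
    return "neutral"

strains = {
    "NFS-2A": {"Caffeine": 0.094, "Acidic": -0.139},
    "P1-4C": {"Acidic": 0.03, "Caffeine": -0.241},
    "hypothetical-1": {"Acidic": -0.05, "Caffeine": -0.10},  # made-up strain
}
for name, s_by_env in strains.items():
    print(name, classify_strain(s_by_env))
```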
We observed that LOH events spanning larger genomic regions were more likely to result in substantial fitness trade-offs. To quantitatively explore these trade-offs, a metric was calculated to measure antagonistic pleiotropy. This metric was defined as the product of the maximum s observed across all environments and the mean fitness cost incurred in the remaining environments. The H2O2 and worm-20C growth environments were excluded since they had less accurate fitness measurement as mentioned above. Analysis of this trade-off metric against the cumulative LOH length per strain revealed a statistically significant positive correlation (Fig 7). The robustness of this finding was supported by the use of alternative pleiotropy metrics, including the standard deviation and the range of fitness across environments. The consistent positive correlation between LOH length and the magnitude of pleiotropy provided a direct mechanistic link to the consequences of runs of homozygosity. This suggests that although long LOH tracts are likely to be detrimental in most environments, the variation they create can lead to high relative fitness in a specific environment.
The scatter plot shows the relationship between total LOH length within a strain (log-10 scale) and fitness gain, fitness cost, or trade-off score among treatments. Maximum gain represents the highest selection coefficient across growth conditions for each strain (strains without positive selection coefficients in any treatment are ignored). Maximum cost represents the lowest selection coefficient across growth conditions for each strain (cost is shown as the negative part of the selection coefficient). To calculate the average cost, fitness costs in environments where the strain showed positive selection coefficients are treated as 0. The trade-off score of a strain is defined as the product of the maximum fitness gain observed across growth environments and the average fitness cost in the other environments (strains without antagonistic pleiotropy are ignored). Other pleiotropy metrics, the standard deviation and range of selection coefficients across treatments, were also evaluated. Each point represents an individual strain. A linear regression line is fitted to the data with the equation and associated p-value for the slope displayed.
https://doi.org/10.1371/journal.pgen.1012083.g007
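The trade-off score defined in the caption above can be written out as a short sketch: the maximum gain across environments multiplied by the average cost in the other environments, with costs counted as 0 wherever the strain has a positive selection coefficient. The function name, the return value of `None` for strains without antagonistic pleiotropy, and the example values are assumptions for illustration.

```python
def trade_off_score(s_by_env):
    """Maximum fitness gain times mean fitness cost in the other environments.

    Cost is the negative part of s, reported as a positive number; environments
    with s >= 0 contribute a cost of 0. Returns None when the strain shows no
    gain or no cost, i.e. no antagonistic pleiotropy.
    """
    best_env = max(s_by_env, key=s_by_env.get)
    max_gain = s_by_env[best_env]
    costs = [max(0.0, -s) for env, s in s_by_env.items() if env != best_env]
    if max_gain <= 0 or not costs or all(c == 0 for c in costs):
        return None
    return max_gain * sum(costs) / len(costs)

# Made-up selection coefficients for one strain.
example_strain = {"Caffeine": 0.10, "Acidic": -0.20, "SC": 0.05, "YPD-30C": -0.10}
print(trade_off_score(example_strain))  # 0.10 * mean(0.20, 0, 0.10), about 0.01
```

Because the score is a product, it is large only when a strain combines a sizeable gain with a sizeable average cost, so a strain that is mildly positive in one environment and mildly negative elsewhere scores close to zero.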
Discussion
Loss of heterozygosity (LOH) is prevalent in clonal diploid organisms, but the amount of phenotypic variation a random LOH event generates is unclear. The current known fitness effects of LOH have primarily been surveyed using either marker selection or experimental evolution, leaving the DFE of unselected LOH unexplored. In this study, we successfully developed a genetic engineering approach to address a different aspect of LOH that investigates the fitness landscape of random LOH in a diverse LOH library. By subjecting such an unconditioned library to competitive fitness assays across a range of environments, we estimated the DFE of LOH across environments and their pleiotropy.
The frequency of T-LOH relative to I-LOH found in our library is higher than in previous mutation accumulation studies, which reported that T-LOH accounted for 13.9% [21] or 4.6-14.6% [29] of LOH events of the two types. In contrast, 38.1% of LOH events in our engineered strains were T-LOH. These differences likely reflect underlying differences in the mechanisms of LOH, as our experiment mostly focused on DSB-induced recombination and used a different diploid genotype than other studies. Moreover, mutation accumulation studies are more subject to fitness impacts, because the LOH events are filtered through hundreds to thousands of generations wherein selection can weakly act.
By modeling the DFE of LOH using approximate maximum likelihood, we aligned our data with previous observations of other types of mutation. The observed bimodality in the DFE of LOH is commonly present in the DFE of point mutations in yeast and other organisms [74]. Gene deletion and mutagenesis studies in yeast [56,75,76] and Salmonella enterica [77] showed one peak at neutrality and another at lethality or large fitness decline. The same pattern is observed in viral [78] and mouse systems [79]. Bimodality can arise from a range of mechanisms, including essential versus redundant gene function [80], complex landscapes of protein folding stability [78], correlation between effect size and gene expression level whose distribution is non-unimodal [77], and topology in gene regulatory networks [81]. The inferred two unimodal distributions in Caffeine and SC both have a more concentrated peak than a standard normal distribution. This aligns with the mutation accumulation evidence showing that most mutations have small effects with fewer larger effects [82,83], giving a pattern consistent with a sharply peaked unimodal distribution rather than an exact Gaussian distribution. In the truncated Weibull shape fitted to Caffeine, constraints such as saturating enzyme kinetics could place a hard cap on the performance increase achievable through a single mutation, naturally leading to a DFE in the Weibull domain of attraction [84]. Alternatively, a concave (diminishing-returns) mapping from phenotype to fitness could compress the tail of the DFE, making it appear truncated even if the underlying phenotypic effects are not strictly bounded [84].
The shape of the tail influences the rate and mode of adaptation, including the likelihood of selective sweeps and clonal interference under various tail regimes [85–88]. The Weibull domain tail plus low maximum gain in YPD media fits the established model where organisms exist near a fitness optimum [89]. YPD, as a standard rich laboratory medium, represents an environment where the parental diploid strain is likely already highly adapted. The low maximum fitness gains (s < 0.02) we observed from growth in YPD are consistent with adaptation being constrained by a nearby fitness peak, a scenario predicted to produce truncated or Weibull-like DFEs [84,89]. The limited LOH tracts in the best-performing strains in YPD support the conclusion that LOH has limited potential for increasing fitness in YPD. On the other hand, in the more novel or stressful environments, the parental strain is likely maladapted, opening up a much broader range of possible beneficial mutations, including those of large effect. The absence of a detectable upper bound implies that adaptation is not limited by proximity to a local optimum, allowing LOH events to generate substantial fitness gains. This environmental dichotomy highlights that the DFE is not an intrinsic property of a mutational class but an emergent feature of the organism-environment interaction. Our primary finding supports that, like other types of mutation, most LOH events are neutral to deleterious, with the DFE of LOH skewed toward the negative side. This echoes the results of widespread negative effects in other mutation accumulation or mutagenesis experiments in yeast [55–57]. However, there are substantial differences between the two processes. Firstly, depending on the environment, a significant proportion of LOH events appear to be beneficial. Secondly, although the DFE of LOH may be bimodal, there is no large peak of highly deleterious alleles as seen with spontaneous mutation [56,75]. 
The less detrimental DFE of LOH is consistent with the fact that LOH shuffles alleles that have survived natural selection, where the genetic load is expected to be low due to the free-living haploid stage in yeast. Therefore, the DFE of LOH and point mutation are expected to be distinctive. While the fitness cost from point mutations likely stems from new suboptimal variants through altered protein products or regulatory changes [78], the cost of LOH likely stems from general reduction of allelic diversity. Possible mechanisms for this are through eliminating overdominant alleles [62], or exposing deleterious recessive alleles that are commonly found in yeast [90]. The direct competition between the conditionally beneficial alleles and the cost of homozygosity also explains why the frequency of antagonistic pleiotropy found in LOH is strikingly high. Another potential mechanism for the fitness changes through LOH events is the dosage effects of alleles from the more adapted or less adapted parental genome. Our results suggest such dosage effects are likely to have less weight on the fitness effects since the decrease in fitness seems to depend on the size of LOH but not the origin of alleles, suggesting the fitness cost stems from homozygosity itself. Despite the general fitness cost of increased homozygosity, our results meanwhile agree that LOH can be a powerful source for rapid, environment-specific adaptation. Our results confirm the finding from lab-evolved populations that LOH can provide sufficient heritable variation upon which natural selection can act [33,46,49,52,91]. Its contribution toward adaptation is two-fold. First, in every environment we tested, a fraction of the LOH strains exhibited significant fitness gains, with some showing robust positive selection coefficients across all replicates. As the fitness gain in each environment can be as high as 19.2%, those strains are likely to become dominant in the population within a few generations. 
Second, we demonstrate that the fitness changes through LOH can be more significant in evolution than simple directional selection. One significant outcome of our study is the prevalence of antagonistic pleiotropy, where the vast majority of strains (75.7%) gained fitness in at least one environment at a cost in others. When genotypes are specifically adapted to environments at the expense of generalist capabilities, this could speed local adaptation or niche partitioning, which leads to lineage diversification and speciation [92,93]. Clonal expansion followed by LOH-mediated diversification is the leading hypothesis for what drives the evolution of budding yeast, and essentially all wild and domesticated isolates of yeast have clear evidence of 1–60 LOH regions [47]. As the evidence is clear that LOH is a source of phenotypic variation and adaptation in this study and previous studies [33,46,49], LOH tracts in a lineage can be considered a scar of short-term adaptation. However, heterozygosity will be progressively depleted by repeated LOH events, and the genetic load associated with homozygosity will accumulate to reduce fitness, as demonstrated in our investigation. In the long run, occasional sex or outcrossing should increase the adaptive potential by restoring heterozygosity and breaking linkage disequilibrium [10,86], a framework that helps explain the rare outcrossing observed in wild yeast populations. Recent comparative and population-scale studies report substantial heterogeneity in the frequency of sexual reproduction and in LOH spectra, in both rates and fitness outcomes, among different Saccharomyces genetic backgrounds and hybrids [21,34,46,47,94]. These differences can profoundly alter the opportunities for beneficial LOH and the costs of increased homozygosity.
More extensive experimental work across diverse genetic backgrounds and species is required to quantify how baseline heterozygosity and genetic load govern the rate and magnitude of LOH-mediated adaptation and its interaction with sexual reproduction.
Conclusion
This systematic, genome-wide assessment reveals the remarkably multifaceted role of LOH in generating variation in fitness. We show the similarity and dissimilarity between LOH and other types of mutation in terms of DFE. Its fitness effects are not monolithic but instead a complicated interplay among genetic background, the size of the tract, and the environment. We characterized its most common role as a source of genetic burden, where increasing homozygosity imposes a significant fitness cost. However, we demonstrated that pleiotropic effects impart environment-specific fitness gains that can overcome their inherent costs, which may function as a driver of specialization by trading general robustness for niche-specific advantages. Overall, our work provides a tangible link between a complex genetic phenomenon and its impact on evolution. Future studies dissecting the causal genes and epistasis within these regions will further illuminate how these distinct functional aspects of LOH interact to shape the trajectory of adaptation.
Materials and methods
Construction of LOH strains
Loss of heterozygosity (LOH) strains were generated by inducing site-specific double-strand breaks (DSBs) using the endonuclease I-SceI in diploid S. cerevisiae, following a method adapted from Alexander et al. [67]. In this design, SK1-derived parental strains were transformed with a galactose-inducible I-SceI expression cassette at the TRP1 locus, while BY4741-derived parental strains were engineered to harbor I-SceI recognition sites randomly inserted throughout the genome via transposon mutagenesis.
Upon mating of the two parental strains, I-SceI expression induced DSBs at the transposon-inserted recognition sites, leading to LOH events generated during DSB repair in which the BY4741-derived parent acted as the replaced genome while the SK1-derived parent acted as the donor genome. By using a library of BY4741 strains with unique I-SceI insertion sites, we established a library of diploid strains, each harboring distinct LOH events. Unless otherwise specified, yeast strains were maintained on yeast extract peptone dextrose (YPD; 1% yeast extract, 2% peptone, and 2% glucose) agar or in liquid YPD at 30°C, plasmids were extracted using the QIAprep Spin Miniprep Kit (Qiagen), and gel purification was performed using the Monarch DNA Gel Extraction Kit (New England Biolabs).
Construction of BY4741-derived parental strain (replaced genome)
To transform the BY4741 parental strain (MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0) [95], we constructed a transposon plasmid, pJR487-TkMX, by inserting an I-SceI recognition site and thymidine kinase gene (TkMX) into the piggyBac transposon on plasmid pJR487 [96], which already encoded the transposon, transposase, and selectable marker URA3 outside of the insertion region. The pJR487 backbone was amplified by PCR using Phusion High-Fidelity DNA Polymerase (Thermo Scientific) with primers pJR487-backbone-F and pJR487-backbone-R (S9 Table). The I-SceI recognition site and TkMX cassette were amplified from plasmid pHERP1.0 [67] with primers I-SceI-TkMX-F and I-SceI-TkMX-R. Both PCR products were gel-purified and then assembled into the pJR487 background using the Gibson Assembly Master Mix (NEB). Correct plasmid assembly was confirmed by sequencing. The resulting pJR487-TkMX plasmid was transformed into BY4741 via the lithium acetate/PEG-4000 method [97], followed by selection on SC–Ura agar. Individual colonies were propagated in SC–Ura liquid medium at 30°C for two days to promote transposition.
Cultures were subsequently streaked onto SC agar supplemented with 5-fluoroorotic acid (5FOA; 1000 μg/mL) to select for descendants that had lost pJR487-TkMX plasmids. Successful transposon insertions were verified by colony PCR using GoTaq DNA Polymerase (Promega) and primers tk-gene-F and tk-gene-R targeting sequences within the transposon. Only one strain descended from each colony on SC–Ura agar was retained to ensure unique transposon insertions.
Construction of SK1-derived parental strains (donor genomes)
The SK1-derived strain (MATα ura3- FLO8::URA3) [98] was transformed with a PCR product comprising the GAL1 promoter, I-SceI endonuclease, an hph antibiotic resistance marker, and a 20-bp barcode, targeted to the TRP1 locus in the yeast genome. The fragment containing the GAL1 promoter and I-SceI, along with an overhang of a 20-bp 4-fold degenerate barcode and 80-bp homology to one side of TRP1 in the primer, was amplified by PCR from plasmid pHERP1.0 using primers TRP1-GAL1-SceI-1 and Nat-barcode-SceI-3 with the Q5 High-Fidelity DNA Polymerase (New England Biolabs). The fragment containing the hph cassette, along with an overhang of 80-bp homology to the other side of TRP1 in the primers, was amplified from plasmid pAG32 [99] with the primers barcode-Nat-1 and TRP1-NAT-4. Both fragments were gel-purified and then concatenated via overlap extension PCR (OE-PCR) with the primers TRP1-F-1 and TRP1-R-1. The final OE-PCR product was gel-purified, sequenced to confirm correctness, and transformed into SK1 using the same lithium acetate/PEG-4000 method. Transformants were selected on YPD supplemented with 200 μg/mL hygromycin B. Each colony was expected to carry a unique barcode due to the independent incorporation of different amplicon molecules during transformation.
Mating and induction of LOH events
Diploid strains were generated by mating the transformed donor strains and the replaced strains to create crosses each harboring a unique barcode and LOH events.
Equal volumes (5 μL each) of the respective saturated parental cultures were mixed in 2 mL YPD and incubated overnight at 30°C to allow mating to occur. Mated diploid strains were selected by complementary auxotrophy on SC-Ura-Trp agar. To induce I-SceI expression, diploids were first transferred to YP sucrose medium (1% yeast extract, 2% peptone, and 2% sucrose) to relieve glucose repression of the GAL1 promoter. After overnight incubation at 30°C, 50 μL of culture was transferred into 2 mL YP medium supplemented with galactose (1% yeast extract, 2% peptone, and 2% galactose) and incubated for 4 hours at 30°C to activate I-SceI expression and subsequent LOH events. To select for LOH, cultures were streaked on SC agar supplemented with the counter-selectable agent FUdR (600 μg/mL) to select for loss of TkMX flanking the cleavage site. One colony per cross was retained for inclusion in the LOH library for downstream whole-genome sequencing and fitness assays. In addition, 8 control strains, generated by crossing the barcoded SK1 strains with the BY4741 strain without transposon insertions and thus lacking I-SceI cleavage sites, were subjected to the same mating and induction process as the LOH library strains to serve as controls with the same genetic and physiological backgrounds but without LOH events.
Genome sequencing of LOH library strains
Whole genome resequencing on an Illumina platform was performed on individual LOH library strains and control strains to determine the regions of LOH and the barcodes. Genomic DNA was extracted from overnight YPD liquid culture using the Puregene Yeast/Bacterium Kit (Qiagen) and purified by AMPure XP Beads (Beckman Coulter) at a 0.8x ratio before performing Illumina library preparation using the Nextera XT DNA Library Preparation Kit (Illumina) combined with Illumina DNA/RNA UD Indexes (Illumina).
The sequencing service was provided by the Advanced Genomics Core at the University of Michigan on NovaSeq X Plus Flow Cells in 150 bp paired-end mode. The sequencing depth of strains ranged from 68.8x to 705x (S1 Table). The barcodes of each strain were identified by mapping Illumina reads to the expected knock-in sequence containing the barcode region. Gentle quality trim at score 20 and adaptor trim on Nextera adaptor (CTGTCTCTTATACACATCT) were applied to the raw reads using cutadapt [100]. Trimmed reads were aligned to the knock-in sequence (S1 File) using Bowtie2 [101] with relaxed mismatch penalties to accommodate alignment across the poly-N barcode region (--n-ceil L,0,0.2 and --np 0). Variants within the poly-N region were called using BCFtools [102] with the -M flag enabled to report ambiguous bases (Ns) present in the reference. Mapping results were manually inspected to ensure the absence of spurious or secondary barcode alignments. LOH regions were identified based on continuous stretches of homozygous genotypes inferred from whole-genome sequencing (WGS) data, following a modified approach from [49] for each LOH library strain. Trimmed reads (as described above) were mapped to the S. cerevisiae S288C reference genome (version R64.5.1) using BWA-MEM [103]. Variants were initially called per strain using the GATK HaplotypeCaller and subsequently jointly genotyped across all samples using GATK GenotypeGVCFs, both with default parameters. Only single nucleotide variants (SNPs) were utilized for analyses. The same parameters of site quality filter and genotype quality filter were applied according to [49]. The same variant calling and filtering procedure was applied to SK1 and BY4741 parental strains. Only verified polymorphic sites between the parents were retained for LOH detection. 
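Identifying LOH regions as continuous stretches of homozygous genotypes amounts to a single scan along each chromosome. The following is a minimal sketch of one possible reading of such a scan, not the pipeline used in the study. The function name `call_loh_tracts`, the `"hom"`/`"het"`/`None` encoding, and the interpretation that a tract is kept when it contains a run of at least `min_hom` homozygous sites and is closed by `max_het` consecutive heterozygous sites are all assumptions made here; the defaults of 5 and 20 follow the tract definition used in this study.

```python
def call_loh_tracts(genotypes, min_hom=5, max_het=20):
    """Scan one chromosome for LOH tracts.

    genotypes: list of (position, call) sorted by position, where call is
               "hom", "het", or None for a missing genotype (assumed encoding).
    Returns a list of (start_position, end_position) tuples.
    """
    # Missing genotypes are ignored, so they neither break nor extend a run.
    sites = [(pos, call) for pos, call in genotypes if call is not None]

    tracts = []
    start = end = None      # boundaries of the tract currently being built
    hom_run = 0             # current run of consecutive homozygous sites
    best_hom_run = 0        # longest homozygous run inside the open tract
    het_run = 0             # consecutive heterozygous sites since last hom

    def close_tract():
        nonlocal start, end, hom_run, best_hom_run, het_run
        if start is not None and best_hom_run >= min_hom:
            tracts.append((start, end))
        start = end = None
        hom_run = best_hom_run = het_run = 0

    for pos, call in sites:
        if call == "hom":
            if start is None:
                start = pos
            end = pos           # a tract always ends on a homozygous site
            het_run = 0
            hom_run += 1
            best_hom_run = max(best_hom_run, hom_run)
        else:
            hom_run = 0
            if start is not None:
                het_run += 1
                if het_run >= max_het:
                    close_tract()

    close_tract()               # flush a tract that reaches the chromosome end
    return tracts
```

Under this reading, short heterozygous interruptions (fewer than `max_het` sites) are absorbed into the surrounding tract, which guards against splitting one event because of a few miscalled sites.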
Additionally, to avoid dubious heterozygous SNP sites affecting LOH detection [104], sites with missing genotype rates of more than 5%, SNP sites with alternative allele frequency below 45% or above 55% across LOH library strains, and individuals with missing genotype rates greater than 5% across SNP sites were filtered out to ensure confident inference of genotypes. A single LOH region was defined as a stretch of at least 5 consecutive homozygous sites, which were expected to be heterozygous based on parental genotypes, before being interrupted by 20 consecutive heterozygous sites, with missing genotypes ignored in both cases. Since the choice of reference genome is known to bias LOH detection [104], variant calling was repeated with an alternative SK1 reference genome. The variants detected between SK1 and BY4741 were applied to the BY4741 reference genome (S288C) to generate the alternative reference genome. The same variant calling and LOH detection method was applied using the alternative reference genome. The LOH tracts present in the alternative reference genome analysis but not in the S288C reference genome analysis were added to the total detected LOH tracts. The LOH events were manually categorized as interstitial (I-LOH) or terminal (T-LOH) based on whether the LOH tract covered the last SNP at either end of a chromosome. The differences between types of LOH were tested by a one-way ANOVA and Tukey honest significant differences, or by a Wilcoxon rank sum test in the case of the number of LOH events, which did not form a unimodal distribution, using the lm(), anova(), TukeyHSD(), and wilcox.test() functions in base R.
Fitness and bar-seq assay
To determine the fitness of LOH library strains, we pooled the LOH library strains as a population and utilized bar-seq [105] to track the changes in abundance of each strain in multiple environments. Saturated cultures of all LOH and control strains were pooled at equal volumes to form the initial inoculum.
For the liquid culture assays, the initial inoculum was grown in five growth environments at 30°C (YPD, SC, YPD supplemented with caffeine (2 mg/ml), YPD with H2O2 (4.5 mM), and acidic SD at pH 2.6 [106]) and in an additional YPD environment at 37°C. All fitness assays were repeated 8 times using 1 μl of initial inoculum in a 1 ml volume in 96-well plates sealed with Adhesive Plate Seals (Thermo Scientific), shaken at 200 rpm for 24–72 hours until saturation, representing 10 generations of growth. The genomic DNA of the initial inoculum and of each grown population was extracted using the Puregene Yeast/Bacterium Kit (Qiagen). Fitness in an in vivo pathogenicity environment was assayed using waxworm infection. Inoculation was performed using 10⁷ yeast cells in 10 μl sterile water injected into the hemolymph of living Galleria mellonella in the third right proleg using a 10 μl Hamilton syringe. The larvae were then incubated individually in plastic containers at 20°C for 48 hours. DNA was extracted from G. mellonella larvae using a modified phenol–chloroform protocol optimized for yeast recovery. Briefly, each larva was ground in 100 µL ultrapure water, filtered through a 40 µm strainer, and the filtrate pelleted (13,000 rpm, 10 min). The pellet was digested in ATL buffer (Qiagen) with Proteinase K (56 °C, overnight), washed, and treated with zymolyase (37 °C, 4 h) to lyse yeast cells. DNA was then purified by two phenol:chloroform (1:1) extractions, precipitated with 2 volumes of 100% ethanol at −80 °C, washed with 70% ethanol, air-dried, and resuspended in ultrapure water. Barcode regions of the extracted DNA were amplified with primers incorporating Illumina adaptors and indexes (S10 Table) using the Q5 High-Fidelity DNA Polymerase in 25 μl reactions.
Each reaction comprised 12.5 μl 2x PCR Master Mix, 10 μM of each primer, and 2.5 ng DNA template, with the following thermal cycle: 98 °C for 30 sec, 30 cycles of {98 °C for 10 sec, 65 °C for 30 sec, 72 °C for 30 sec}, and a final extension at 72 °C for 120 sec. Successful reactions confirmed by electrophoresis were pooled in equal volumes and purified using AMPure XP beads at a 1.0x ratio. The purified amplicons were sequenced as an Illumina-ready library at the Advanced Genomics Core (University of Michigan) on NovaSeq X Plus Flow Cells in 150 bp paired-end mode. Each replicate of treatments was extracted and sequenced once. Each pooled inoculum was extracted and sequenced 8 times to generate 8 replicates of the initial frequencies. At least 1,000,000 reads were sequenced for each replicate of treatments and inoculum (S3 Table). Fitness was calculated from the relative barcode enrichments in comparison to the control strains. Barcodes in raw Illumina reads were extracted and counted using Bartender [107] with a match pattern “ATGTC [13–21]TAAAC” and a quality score threshold of 25 inside the barcode to extract 13–21 bp barcodes flanked by 5 bp upstream and 5 bp downstream of the designed barcode region, allowing 1 mismatch in the match pattern and clustering barcodes within 2 mismatches per the default setting. The strain frequency in each population was calculated as the number of times its barcode appeared divided by the total count of known barcodes. Since the inoculum was amplified and sequenced 8 times, the abundance of each strain in the inoculum was calculated as the average frequency among sequencing runs, excluding outliers (beyond 1.5 IQR). The enrichment of each strain in each population was calculated as its after-growth frequency divided by its inoculum frequency. A strain’s selection coefficient in a population was calculated from the enrichments of the strain and of the controls in the respective replicate and the number of generations.
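The chain from barcode counts to a selection coefficient can be sketched as follows. The frequency, outlier-trimmed mean, and enrichment steps follow the description above; the final formula, s = ln(E_strain / E_control) / g, is a common convention assumed here for illustration and is not confirmed by the text. All function names, the use of Tukey fences computed from inclusive quartiles, and the way control enrichment is passed in as a single value are likewise assumptions.

```python
import math
import statistics


def frequencies(counts):
    """Convert barcode counts {barcode: n} into frequencies among known barcodes."""
    total = sum(counts.values())
    return {barcode: n / total for barcode, n in counts.items()}


def mean_without_outliers(values):
    """Average after dropping values beyond 1.5 IQR from the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    kept = [v for v in values if low <= v <= high]
    return statistics.fmean(kept)


def enrichment(freq_after, freq_inoculum):
    """After-growth frequency divided by inoculum frequency."""
    return freq_after / freq_inoculum


def selection_coefficient(e_strain, e_control, generations=10):
    """ASSUMED form: per-generation log ratio of strain to control enrichment."""
    return math.log(e_strain / e_control) / generations


# Made-up numbers: inoculum frequencies from 8 sequencing runs of one strain.
inoculum_runs = [1.0e-3, 1.1e-3, 0.9e-3, 1.0e-3, 1.05e-3, 0.95e-3, 1.0e-3, 5.0e-3]
f0 = mean_without_outliers(inoculum_runs)        # the 5.0e-3 run is excluded
s = selection_coefficient(enrichment(2.0e-3, f0), e_control=1.0)
```

Dividing by the control enrichment removes any change in frequency shared by all strains in a replicate, so the coefficient reflects growth relative to strains without LOH events.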
We averaged across replicates (excluding outliers beyond 1.5 IQR) to obtain the fitness value for the strain in the environment. To assess variability among replicates, a one-way ANOVA was performed separately for each growth condition, treating strain as a factor, using the lm() and anova() functions in base R. For strains presenting antagonistic pleiotropy, i.e., strains that have both positive and negative selection coefficients depending on the environment, we defined the fitness gain as the highest selection coefficient across growth environments, i.e., gain = max(s_i), and the cost of LOH as the negative part of the selection coefficient in each of the other growth environments, i.e., cost = -min(0, s_i). We then calculated the trade-off as gain × cost, a value that is large only when both cost and gain are large. We also quantified the antagonistic pleiotropy of each strain as the standard deviation or range of its selection coefficients across environments.
Inferring distribution models
We modeled the DFE of LOH in each treatment. The distribution of selection coefficients for each treatment was tested for bimodality with Hartigan’s dip test using the diptest package in Python. Only unimodal distributions were carried forward for analysis. To incorporate more strains into the analysis, we conducted a maximum likelihood inference on strains with 1–3 LOH events, assuming that the fitness effects of individual LOH events were independent. This means the observed strain selection coefficient is the sum of effects drawn from an underlying single-event DFE, or $s_{\mathrm{strain}} = \sum_{k=1}^{n} s_k$, where $n$ is the number of LOH events in the strain and each $s_k$ is an independent draw from the single-event DFE. Since the probability density function (PDF) for this sum is not analytically tractable while completely dependent on the single-event models and parameters, we employed a numerical Maximum Likelihood Estimation (MLE) approach to optimize the model parameters.
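One way to evaluate a likelihood when the density of a sum has no convenient closed form is to approximate that density by simulation. The sketch below illustrates the idea under stated assumptions: a Normal single-event model, a histogram density estimate on a fixed grid, and a fixed random seed are all choices made here for illustration, and the names `sum_density` and `neg_log_likelihood` are hypothetical. It is not the study's implementation.

```python
import numpy as np


def sum_density(sampler, n_events, grid, n_draws=200_000, rng=None):
    """Histogram estimate of the PDF of a sum of n_events independent draws.

    sampler: callable (rng, size) -> array of single-event effects.
    grid:    increasing array of bin edges covering the plausible range.
    """
    rng = np.random.default_rng(rng)
    sums = sampler(rng, (n_draws, n_events)).sum(axis=1)
    density, edges = np.histogram(sums, bins=grid, density=True)
    return density, edges


def neg_log_likelihood(params, s_obs, n_events_obs, grid, seed=0):
    """Negative log-likelihood of observed strain coefficients.

    ASSUMPTION: single-event effects follow Normal(mu, sigma); any other
    candidate model could be swapped in by changing the sampler.
    """
    mu, sigma = params

    def sampler(rng, size):
        return rng.normal(mu, sigma, size)

    nll = 0.0
    for n in np.unique(n_events_obs):
        # Reusing the seed keeps the objective deterministic across calls.
        density, edges = sum_density(sampler, int(n), grid, rng=seed)
        s = s_obs[n_events_obs == n]
        # Locate the histogram bin of each observation.
        idx = np.clip(np.searchsorted(edges, s, side="right") - 1, 0, len(density) - 1)
        # Small constant avoids log(0) in empty bins.
        nll -= np.sum(np.log(density[idx] + 1e-12))
    return nll
```

A function of this shape can be handed to a bounded optimizer such as `scipy.optimize.minimize` with `method="L-BFGS-B"`. Holding the seed fixed is a design choice that stops simulation noise from changing the objective between evaluations, which gradient-based optimizers tolerate poorly.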
This involved a Monte Carlo function that estimated the PDF by drawing 1,000,000 random samples to create a numerical distribution, with the parameters optimized using the L-BFGS-B algorithm implemented in the minimize() function in SciPy. We tested 7 candidate models (Normal, Weibull, Frechet, Gamma, Cauchy, Skew-Normal, and Generalized Normal) using PDF functions in the scipy.stats module. To ensure convergence, each model was optimized three times. Model performance was compared using the Akaike Information Criterion (AIC). To specifically characterize the extreme beneficial tail of the DFE, we applied principles from Extreme Value Theory (EVT). For each treatment, this analysis focused on the 20 strains with the highest selection coefficients that have 1–3 LOH events. We compared two nested models commonly used in tail analyses, the exponential distribution and the Generalized Pareto Distribution (GPD) [73,84]. The additive model and MLE were retained, but the likelihood function for the sum of effects was calculated using trapezoidal integration for greater precision. Model parameters were optimized using the L-BFGS-B algorithm. To determine which model provided a better fit, we used a Likelihood Ratio Test (LRT), where significance (p < 0.05) indicates that the GPD explains the tail’s shape better than the simpler Exponential model.
Supporting information
S1 Fig. Distribution of missing genotype rates. Histogram showing the count of strains based on their missing genotype rate. The red vertical line indicates a threshold at 5% missing genotype rate. https://doi.org/10.1371/journal.pgen.1012083.s001 (PDF)
S2 Fig. Distribution of homozygous genotype rates. Histogram showing the count of strains based on their homozygous genotype rate across the expected heterozygous sites according to parental genotype. The red vertical line indicates a threshold at 95% homozygous rate. https://doi.org/10.1371/journal.pgen.1012083.s002 (PDF)
S3 Fig. Strain frequencies in inoculum. Boxplots displaying the frequency of each strain in the initial experimental pool of the liquid media experiment and the worm experiment. The red horizontal line indicates a threshold frequency of 0.01%. https://doi.org/10.1371/journal.pgen.1012083.s003 (PDF)
S4 Fig. Genomic position and the origin of the present genotype of LOH tracts in each strain. The color blocks depict LOH tracts in regard to their genomic position and the origin of the present genotype. https://doi.org/10.1371/journal.pgen.1012083.s004 (PDF)
S5 Fig. Boxplots of the number of different types of LOH events in each strain. (A) between SK1 and BY4741 genotype (B) between I-LOH and T-LOH. https://doi.org/10.1371/journal.pgen.1012083.s005 (PDF)
S6 Fig. Boxplots of LOH tract length regarding different types of LOH events. Letters show the significant groups by Tukey honest significant differences. https://doi.org/10.1371/journal.pgen.1012083.s006 (PDF)
S7 Fig. Scatter plot and correlation of the length of LOH events versus the occurring chromosome length. (A) The black diagonal line shows the maximum possible length of LOH tract (the length of the chromosome). (B) The correlation between LOH tracts and the length of the occurring chromosome. The correlation coefficient was determined by Pearson correlation. https://doi.org/10.1371/journal.pgen.1012083.s007 (PDF)
S8 Fig. Scatter plot and correlation of the number of LOH events versus the occurring chromosome length. The correlation coefficient was determined by Pearson correlation. https://doi.org/10.1371/journal.pgen.1012083.s008 (PDF)
S9 Fig. Boxplots of the position of LOH tracts on chromosomes regarding different LOH types. (A) absolute distance from the proximal end (closer to centromere) of LOH tract to centromere in base pair (B) relative distance from the proximal end of LOH tract to centromere in the proportion to the occurring chromosome arm length (C) absolute distance from the proximal end of LOH tract to telomere in base pair (D) relative distance from the proximal end of LOH tract to telomere in the proportion to the occurring chromosome arm length. https://doi.org/10.1371/journal.pgen.1012083.s009 (PDF)
S10 Fig. Scatter plots of the initial frequency of strains in inoculum versus the measured fitness. https://doi.org/10.1371/journal.pgen.1012083.s010 (PDF)
S11 Fig. Enrichment of control strains across treatments. Boxplot showing the log2 enrichment of control strains (x-axis) across treatments. Different colors represent strains (top panel) or replicates (bottom panel). https://doi.org/10.1371/journal.pgen.1012083.s011 (PDF)
S12 Fig. Strain fitness distributions across treatments. Box plots illustrating the distribution of selection coefficients for the LOH yeast library across different treatments. The red bars depict the means without outliers applied in the downstream analyses. The strains were reordered according to the mean in each panel. https://doi.org/10.1371/journal.pgen.1012083.s012 (PDF)
S13 Fig. Standard deviation of fitness within treatments. Box plots showing the distribution of standard deviation of fitness measurements for strains within each experimental treatment. Letters below box plots identify groups that are not significantly different by Tukey’s honest significance test. https://doi.org/10.1371/journal.pgen.1012083.s013 (PDF)
S14 Fig. Fitness distribution of strains containing single LOH across different treatments. The same information as Fig 3, but calculated from a subset of strains with a single LOH tract. https://doi.org/10.1371/journal.pgen.1012083.s014 (PDF)
S15 Fig. Comparison of summary statistics between DFE from whole library and the subset of strains with a single LOH tract. The correlation coefficient was determined by Pearson correlation. https://doi.org/10.1371/journal.pgen.1012083.s015 (PDF)
S16 Fig. Maximum Likelihood Estimation (MLE) of distribution of fitness effects (DFE) of a single LOH tract. The estimated DFE of a single LOH tract from strains in the mutant library based on the assumption of pure additivity of fitness effects. The whole distribution of treatments with bimodal patterns was skipped for estimation. The curves show the probability density function (PDF) of each model with the best-fitted parameters. The solid histogram shows the fitness of strains with a single LOH tract, which should follow the PDF curve. The hollow histograms show the selection coefficient of all strains used in the inference, with 1–3 LOH tracts. The whole distribution includes all strains in the library while the tail distribution includes only the 20 highest-fitness strains. The star marks in the legend indicate the best model based on Akaike information criterion (AIC) for whole distribution or the likelihood ratio test (LRT) for tail distribution. https://doi.org/10.1371/journal.pgen.1012083.s016 (PDF)
S17 Fig. Relationship between strain fitness and number of LOH across treatments. (A) the total number of LOH tracts in strains versus the fitness (B) the number of specific LOH type (I-LOH vs T-LOH) versus the fitness. https://doi.org/10.1371/journal.pgen.1012083.s017 (PDF)
S18 Fig. Scatter plot and regression between genomic heterozygosity and cumulative length of LOH in each strain. The correlation coefficient was determined by Pearson correlation. https://doi.org/10.1371/journal.pgen.1012083.s018 (PDF)
S19 Fig. Relationship between strain fitness and genomic heterozygosity across treatments. Each panel shows a scatter plot of strain selection coefficient (s) against 1 - genomic heterozygosity within each strain in specific growth environments: Acidic, Caffeine, H₂O₂ (hydrogen peroxide), SC (synthetic complete medium), YPD-30C (yeast extract peptone dextrose at 30°C), YPD-37C (YPD at 37°C), worm-20C (inside Galleria mellonella at 20°C), and the average across environments. Each point represents an individual strain. A linear regression line is fitted to the data in each panel, with the corresponding equation and p-value for the slope displayed. https://doi.org/10.1371/journal.pgen.1012083.s019 (PDF)
S20 Fig. Relationship between LOH length and fitness by LOH type. Scatter plots showing strain selection coefficient versus total LOH length in a strain (log-10 scale) across growth environments (panels). Data points are colored based on the LOH regions that became homozygous BY4741 alleles or SK1 alleles. Linear regression lines and corresponding equations and p-values are shown for each parental type within each env
