## Abstract

The skin is the largest human organ, functioning to serve as the protective barrier to the harsh, outside world. Recent studies have revealed that large numbers of somatic mutations accumulate in normal tissue, which can be used to infer skin cell dynamics^{1-5}. Here we present the first realistic, cell-genome mechanistic epidermal model that shows homeostasis imposes a characteristic log-linear subclone size distribution for both neutral and oncogenic driver mutations, where the largest skin subclones are the oldest subclones. Because homeostasis inherently limits proliferation and therefore clonal sweeps, selection for driver mutations (NOTCH1 and TP53) in normal epidermis is instead conferred by greater persistence, which leads to larger subclone sizes. These results highlight how the integration of mechanistic modeling with genomic data provide novel insights into the evolutionary cell dynamics of normal human homeostatic tissues.

Recent studies have documented that substantial numbers of mutations accumulate in normal human tissues, with evidence for selection of canonical, oncogenic driver mutations^{2-6}. Typically, selection for these driver mutations is modeled by increased cell proliferation; however, requirements of skin homeostasis imposes spatial constraints that inherently limit selective sweeps. Few, if any, studies touch on how a functionally advantaged cell may expand (i.e. be selected for) in a normal tissue without destroying the homeostatic architecture of the tissue.

Although substantial numbers of mutations accumulate with age, and a quarter of all cells carry driver mutations^{2}, skin thickness and mitotic activity do not significantly change. A controversy currently exists between our understanding of mutation selection and neutral evolution within tissues^{1,2,7,8}, in part because our understanding of how a mutant clone is able to expand within normal tissue is completely lacking. Furthermore, most mutations are passengers and skin subclone size/frequency distributions are consistent with neutral evolution or the absence of detectable selection. Varying exposures to sunlight also complicate mutation accumulation. Hence, the challenge is to develop a model that integrates selection, evolution, and sunlight. The reward would be a realistic model of lifelong human skin division and death.

In the scope of the epidermis, first principles dictate a constant cell number through equal loss and replacement of cells, a constant tissue height, and constant stem cell numbers. Using these principles, we built a three-dimensional mechanistic model of the human epidermis using the HAL framework^{9} to infer how both neutral and driver mutations accumulate and exhibit clonal expansions without disrupting homeostasis. We based our model on a simplified version of the vSkin model^{10} that is capable of wound repair, and uses EGF as a key diffusible stromal growth factor to define a stochastic stem niche within the basal layer and determine keratinocyte birth/death (Figure 1 and Figure S1). To assess clonal expansions and incorporate experimental data, we embed 72 specific genes with base pair resolution and gene specific mutation rates within each keratinocyte, accurately capturing mutation spectra consistent with UV exposure (Figure 1B). This allows us to fully represent the tissue architecture of a homeostatic epidermis and realistic mutation accumulation (Figure 1A and Supplemental Movie S1).

Simulations of variable biopsy sizes (0.75, 1.0, and 3.14*mm*^{2}), age matched to comparable patients, reconstructs a log-linear subclone size-age distribution with neutral mutations that easily captures the extent of clone size variability within patients (Figures 1C and 1D). We see that small clones dominate the highest frequencies while a heavy right tailed distribution is reflected by the clonal exponential size dependency consistent with patient data (Figures 1C and 1D) and substantiated by further work^{1,3,11}. We then perform pairwise Kolmogrov Smirnov tests on each patient and the simulated biopsy’s First Incomplete Moment distribution (FIM). We fail to reject the null hypothesis, that the distribution of clone sizes are from different distributions, for the vast majority of our simulation versus patient biopsy comparisons, but are able to reject the null for a select few comparisons (Figure 1D). However, for all patient biopsies at least one of the patient specific simulations fails to reject the null hypothesis. In a homeostatic epidermis, most subclones marked with neutral mutations are small, recent, and destined for extinction from random stem cell niche turnover (Figures 1C, 1D, and 2C). Persistent, older subclones are rarer and larger (Figure 2C). Hence, the observed log-linear subclone mutation size distributions (Figure 1C) are an emergent property of the homeostatic constraints of normal human skin. These results reveal that it is highly improbable that a mutation is capable of expanding larger than any other, given homeostasis. Age-related clonal size dependencies exist even for advantageous driver mutations in the homeostatic system (Figures 2D-E). This implicates the homeostatic tissue architecture as the primary factor in defining the age-related exponential size dependency due to tissue architecture constraints.

Previously, NOTCH1 and TP53 mutations were observed to have larger subclone sizes relative to neutral mutations^{2}, indicating positive selection. To better define this selection within the context of homeostasis, we simulated increases in the ability of NOTCH1 mutant subclones to displace neighboring cells, which leads to clonal distributions consistent with positive selection (Figure 2A and 2D). However, the requirement to maintain homeostasis indicates this mechanism comes with a cost (Figures 2 and 3). The stronger the advantage, the greater the loss of homeostasis, creating a catch-22 where a strong NOTCH1 mutant clone with too great of a selective advantage destroys the epidermis. Therefore, young, rapidly expanding large NOTCH1 subclones are never observed (Figures 2 and 4). This is consistent with the observation that patient NOTCH1 mutations are smaller but more frequent than observed TP53 mutations^{2,3}. Instead, similar to the neutral mutations, the largest NOTCH1 subclones are the oldest NOTCH1 mutations. Although driver mutations can only confer small selective advantages without disrupting homeostasis, even small persistence advantages expressed at earlier patient ages pays dividends and inherently leads to larger and more frequent NOTCH1 subclones in adult skin (relative to neutral mutations).

TP53 mutations protect against cell death and require sunlight for selection because TP53 subclones are smaller in non-sun exposed skin^{12}. Normal sun-exposed skin does not exist because the extent of sun exposure varies massively between patients. Here we incorporate this important environmental variability through “sun days”, where our homeostatic tissue takes a sun-filled vacation and a proportion of non-TP53 mutant cells are killed by UV irradiation (Figure 2B). This mechanism, unlike NOTCH1, renders young TP53 capable of rapid expansion as the skin is exposed to more sun days (Figure 2F). However, similar to NOTCH1, simulated TP53 subclonal expansions are also constrained by homeostasis, which is disrupted by too much UV exposure leading to a loss in tissue density.

When we shed light on these functional differences of NOTCH1 and TP53 existing within the same simulated biopsy, the observed patient data is further understood (Figures 3 and 4). The key variable in tissue homeostasis and clonal expansions is sun exposure. In the absence of sunlight, TP53 mutations are subject to drift, along with all other mutations. Consistent with patient data we see more frequent occurrences of NOTCH1 mutations at smaller sizes across a range of fitness values. In this same tissue TP53 mutations occur less often but are able to expand to a larger area upon sun exposure, even when they arise later, thereby shifting the age-dependent distribution to a more recent point in time. In rare occurrences we see passenger mutations carried to larger clonal areas, but they are largely relegated to the lowest frequencies (Figure 3). As with NOTCH1 and TP53 in isolation, we see that in a tissue where both exist, homeostasis constrains these clones from expanding. However, entirely unexpected, we see that with the presence of a NOTCH1 mutation(s) and TP53 mutation(s), sunlight can minimize the effects of a NOTCH1 mutations (i.e. reduce subclone size) while returning the tissue to a homeostatic state. This is because as a TP53 mutation expands with UV irradiation, the population size recovers over time (Figure 3). Even with the presence of sunlight and both mutations we observe an age-dependent log-linear relationship where only the earliest mutations expand to an appreciable size within the tissue (Figure 4).

Here we have presented the first *in silico* model able to capture homeostatic behavior of the epidermis with high-resolution base pair tracking of the human genome (summarized in figure 5). This novel approach reveals that mutational persistence is the key to observed subclone frequency-size distributions. Strikingly, both neutral and driver mutations exhibit similar frequency-size distributions consistent with neutral evolution, reflecting that random stem cell turnover is a feature of normal epithelial homeostasis. Although mitotic stem cells accumulate mutations, random stem cell death with replacement leads to mutation flushing such that most mutations are lost or are limited to small subclones. This work broadens our current understanding of selection and fitness acting in a homeostatic, normal tissue, where selection reflects persistence rather than selective sweeps, leading to larger subclone sizes in adult skin. The next time you take that selfie of your homeostatic epidermis on a tropical, sun filled vacation, think about the portrait of the mutations you would likely observe and the mutational landscape present just at the surface.

## Author Contributions

ROS, DS and ARAA conceived the question and model design. ROS wrote the model, performed analysis of model outputs, and wrote the manuscript with critical comments and inputs from SL, ARAA, and DS. EK and RRB assisted with editing the manuscript and model design. JW assisted with manuscript edits and select data visualizations. All authors have read and edited the final manuscript.

### Acknowledgements

ROS is supported by the Wellcome Trust (grant no. 108861/7/15/7) and the Wellcome Centre for Human Genetics (grant no. 203141/7/16/7). JW, EK and ARAA are supported by Physical Sciences Oncology Network (PSON) grant from the National Cancer Institute (grant no. U54CA193489). SL is supported by the Wellcome Trust (grant no. 206314/Z/17/Z). DS is supported by the Cancer Systems Biology Consortium grant from the National Cancer Institute (grant no. U54CA217376 and grant no. P01 CA196569). ARAA is also supported the Cancer Systems Biology Consortium grant from the National Cancer Institute (grant no. U01CA23238) and the Moffitt Cancer Center of Excellence for Evolutionary Therapy.

## Competing Interests

The authors declare no competing interests.

## Methods

Here we have constructed a homeostatic, three-dimensional, hybrid cellular automaton (3D-HCA) to mechanistically model the human epidermis using the HAL framework^{9}. We based our model on a simplified version of the vSkin model we previously developed^{10} to investigate the clonal dynamics of keratinocyte populations (Figure S1). In this simpler model, while we model keratinocyte dynamics, we do not explicitly consider melanocytes and stromal cells (not in the epidermis) but rather, a single diffusible growth factor acting to define the stem niche on the basal layer. We chose EGF, as a key growth factor, to determine keratinocyte birth/death (Figure S1 and Supplementary Information). We define our area to approximate different sized biopsies assuming a median cell size of approximately 15μ*m*.

### Microenvironmental factor

The microenvironmental factor, epidermis growth factor *(EGF)*, governing keratinocyte apoptosis and proliferation is defined by a system of partial differential equations that dictates how the diffusion moves throughout the system over time. Similar to Kim *et al.*^{10} & Basanta *et al.*^{13}, due to the interactions between discrete, on lattice cells, and the continuous *EGF* variable we describe the discretized version of the equations. Over time EGF diffuses from an imposed, unchanging source (*E*_{C}) along the basal layer (i.e. *E*(*x,y,0*) = *E*_{C}) with a diffusion rate (*ψ*, Table S1) representative of fibroblast production from the dermis, is consumed by keratinocytes (*K*) at a consumption rate (*γ*, Table S1), and decays naturally (*λ*, Table S1). The temporal dynamics of *EGF*([*E*(*η*, t)]), in its discretized form using a first order Euler method, at a three-dimensional lattice point *η* ≡ (*η*_{x}, *η*_{y}, *η*_{z}) from time *t* to time *t* + *δt* is given by:
where *δt* denotes the time step, *K*_{η} is set to be 1 if the lattice point *η* is occupied by a keratinocyte and is set to be 0 otherwise. The Δ_{3} denotes a forward time and central space finite difference approximation of the Laplacian operator in a three-dimensional space:
where *h* is the grid size. Boundary conditions on the top (*η*_{z} = *Z, Z*: height) is no-flux condition and on the bottom *(η*_{z} = 0) is a Dirichlet boundary condition *(E(*η_{x}, *η*_{y}, 0) = *E*_{c}), while periodic boundary conditions are present on the left & right and front & back *(E(*0,η_{y}, *η*_{z}*)=E(X, η*_{y}, *η*_{z}) & *E(η*_{x},*0*, *η*_{z}*)= E(η*_{x}, *Y, η*_{z}*)).* We solve equation (1) until steady state reached (<100 iterations). A steady state value of EGF is used to determine keratinocyte birth and death in the model

### Keratinocyte dynamics model

Cell death can occur one of two ways at each time step for each cell (cell time step is equal to 1 day). First, the cell is dependent upon EGF concentration to prevent a cell from undergoing apoptosis. This value, the level at which apoptosis can occur by chance *(α*, Table S1), is a model specific parameter and translates to a cell undergoing apoptosis when a randomly selected number *c*_{α} ∈*U(*0,1), U: uniform distribution (0,1), is less than (i.e., if , the keratinocyte *K*_{η} at a lattice point *η* dies).

If a cell survives in low EGF concentrations or is not in an area with sufficiently low EGF it is then subject to random death representative of an intrinsic probability of death *(P*_{Death_i} (*K*_{η}) = *θ*) where death will occur if a randomly selected number from (0,1), *c*_{θ} ∈*U* (0,1), is less than *θ* (i.e., if *c*_{θ} *<θ,c*_{θ} *∈U(*0,1), then death occurs).

Provided a cell does not die within a given time step it is capable of moving to an empty lattice position in its 6 von Neumann neighborhood (*N*_{(x,y,z)}^{v} = {(*x,y,z* + *h)*, (*x, y, z* - *h*), (*x* + *h, y, z*), (*x* - *h, y, z*), (*x, y* - *h, z*), (*x, y* + *h, z*)}). This location is determined by assessing if any empty lattice positions exist within the cell’s neighborhood. If multiple spaces are empty, one is chosen randomly. The model specific parameter dictating the probability of movement *(P*_{Move}(*K*_{η}) = *ρ* Table S1) governs the cells ability to move into that empty space so that when a randomly selected number *c*_{p} from (0,1) is greater than p, then the cell move into that empty position (i.e., if *c*_{ρ} (∈*U(*0,1)) ≥ *ρ*, the cell will move into that randomly selected empty neighborhood *(N*_{(x,y,z)}^{v}).

Cells divide based on the underlying EGF concentrations (*P*_{Birth}(*K*_{η})) = *ξ* [*E*]_{η}), where *ξ* is the proliferation scale factor (Table S1). This results in a stem-like population of cells operating dependently upon the underlying EGF concentration. A cell (*K*_{η} at *η*) may divide into two identical daughter cells (*D*_{1,2}) if a randomly selected number *c*_{d}*(c*_{d} *∈U* (0,1)) is less than *ξ* [*E*]_{η} regardless of there being an empty position. One daughter cell occupies the position of the parent cell (location of *D*_{1} = *η* _{x y z}) while the other daughter cell (*D*_{2}) occupies one of five locations, being above or on one of the four directly adjacent sides of the parent cell. This location is governed by *ω*), the division location probability, whereby the second daughter cell is placed in a position according to
where *η*_{xyz} is the coordinates of the parent cell’s lattice position, *h* is the grid size, and *c*_{ω} is a randomly selected number from (0,1) (i.e., *c*_{ω} ∈ *U*(0,1). Boundary conditions for cells are also periodic on the left & right and front & back to enable spatial competition, no-flux on the bottom (*η*_{z} = 0), and a Dirichlet (Cell (*η*_{x}, *η*_{y}, *Z*) = 0) on the top, which makes for the most realistic approach.

### Keratinocyte genome model

Each cell agent contains a base pair resolution of 72 genes with genomic positional information constructed from the reference hgl9 genome using ANNOVAR^{14}. Each gene has a gene specific mutation rate such that the mean mutation rate (*μ*_{g}) is normalized to ≈ 3.2 × 10^{-9} *bp*^{-1} *division*^{-1} (likely a conservative estimate, but normal somatic mutation rates can vary wildly^{15-18}). Mutations within the rest of the genome are tracked with less resolution. While a random base within any given position can be tracked, these mutations are not of particular interest here, but can be used in statistical analysis and model-data comparisons. However, due to substantial computational requirements to track such large amounts of information for each cell this method is employed sparingly and not in all model simulations. This method allows for a more realistic mutation accrual, since a genome wide mutation rate would be biologically unrealistic due to a large number of covariate factors (e.g. level of expression, location on chromosome, and GC content, etc.) leading to differences in mutation rates^{19}. A mutation rate applied uniformly across the genome leads to the longest genes, such as ERBB4, PTPRT and BAB accumulating the most mutations (in our model) while NOTCH1-4 would acquire almost no mutations. This is not observed in normal epidermis, where NOTCH 1 and a number of various other shorter genes are frequently mutated.

Upon initialization of the model we construct Poisson distributions for each gene given the mutation rate, *μ*_{g}, and the length of the gene, *L*_{g}, providing the expected number of mutations upon division. Where the number of mutations for each gene, *X*_{g}, is *X*_{g} ∼ *Poisson*(*μ*_{g} × *L*_{g}), and the specific base affected is determined by a multinomial distribution constructed for each A, C, T, and G allowing us to define and capture a specific mutational signature observed in the data. Keratinocyte (and melanocytes not considered here) contain, disproportionately more C>T transitions^{2,3}; this needs to be accounted for by adjusting the probability of a C>T mutation occurring versus all other mutation types. Here we set the probability of a cytosine mutation to 0.5 and split the remaining bases to an equal probability. Finally, we determine the position within the gene by applying a uniform probability to all of the same bases within the gene. All ancestral mutations are inherited by daughter cells. Within the model distributions are constructed using the Colt java library upon initialization.

### Model Parameterization

Parameter values for the spatial model were selected based on iterative redundancy analysis (RDA) using R^{20} (vegan package^{21}), where constraining variables were the emergent properties measured as a necessary component of homeostasis (see Supplementary Information):

Primary Constraining Variables:

Mean Tissue height

Mean Loss/replacement rate within the basal layer

Secondary Constraining Variables:

Mean Overall tissue turnover time

Mean cell age

Mean Basal cell density

During the first iteration of parameterization variables are set randomly choosing values between zero and one for 20,000 two-year simulations. EGF parameters (*ψ, γ, λ*), where values were examined between zero and 1/6, were not subject to this iterative process. For the diffusion coefficient (*ψ*), values greater than 1/6 (for 3D, 0.25 in 2D) results in numerically unstable solutions for a given lattice point because it fails to satisfy Courant-Friedrichs-Lewy condition that guarantees convergence of a forward-time central-space finite difference method . After each model run we collect the constraining variable data and in conjunction with the model parameters, we then perform RDA. Variance explained by each parameter are used to determine the influence of each parameter on the constraining variables. Refinement of the model parameters is then performed based on the variance explained and the linearity of that parameter on the constraining variables. In this way we are able to constrain the variable range iteratively by reducing the number of simulations to 5,000 (two-year simulations) while converging on parameter values. Iterations are repeated until target values are reached for each of the constraining variables of homeostasis where the combination of parameter values yields homeostasis. The secondary constraining variables served as a sanity check rather than a strict parameterization.

A pseudo-code for the parameterization process is as follows:

*Function 1: Execute Simulations given parameter ranges:*

For *n* simulations

Randomly select parameter values from provided ranges.

Run 2-year simulation of model.

Collect and calculate constraining variable values.

*Function 2: RDA Analysis*

While homeostasis is not reached:

Execute function 1 above (20,000 initially, 5,000 after).

Perform RDA and analyze parameter influence.

Further constrain ranges.

Repeat Step 1.

### Patient Data

Patient data was previously procured and exhaustively characterized^{2}. However, the data is filtered to remove patient biopsies present in multiple biopsies as this may prove problematic in its interpretation, as previously noted^{1}. The clonal area is taken as the product of the biopsy size(mm^{2}) and twice the variant allele frequency. This curated dataset provides the necessary information to perform direct comparisons with simulation data.

### Model Data

Once a simulation is complete the true variant allele frequency (*VAF*_{t}) is calculated as the quotient of the total number of cells carrying that mutation over the product of twice the total population size. This neglect copy number variation as this is not tracked within the model. However, high-throughput sequencing methodologies rarely yield the true variant allele frequency and results in imperfect information, such as that collected here. We employ and build upon a method to simulate sequence data yielding a simulated variant allele frequency (*VAF*_{S})^{22}. The authors determine *VAF*_{S} by drawing the number of variant reads, *f*_{t} for any given mutation *i* from a binomial distribution where the success probability is given by *VAF*_{t} and the number of trials is itself binomially distributed and yields the total number of reads *(D*_{t}*):*

However, here *f*_{i} utilizes a higher resolution approach given the higher quality sequence data that is used for comparisons. Martincorena *et al.* went to great lengths to generate high resolution data by sequencing up to an average depth of lOOOx (patient PD13634; while all others are approximately 500x) to ensure the capture of low frequency variants, approaching the error-rate of the sequencing method used. In order to incorporate the error and variance around the depths at each variant site that would be introduced by the sequencing methods we define *D*_{t} by drawing from a gamma distribution. The gamma distribution shape parameters, *k*_{p} and θ_{p}, are specific to each patient and determined by fitting each patient’s variant depth distributions using a maximum likelihood estimation (conducted in R^{20} using the MASS package ^{23}). *D*_{i} whose parameters are specific for each patient, for each variant becomes:

Thus, as described above, the final frequency of a variant for comparisons is given as,

### Assessing neutrality

Most recently introduced mutations dominate the lowest frequencies and thus are difficult to observe. By using a VAF cutoff of0.005, the lowest VAF observed in the patient data, a large fraction of mutations are filtered out and analysis is conducted on what is likely to be observable within high resolution patient sequence data.

Given a neutrally expanding keratinocyte population driven by progenitor proliferation with a basal progenitor loss/replacement rate given by γ λ, it follows that the clone, n, distribution can be calculated for any given patient of age *t:*

The cumulative probabilities of clone size distributions can then be used to assess neutrality, under an expectation of exponential size dependency given by the first incomplete moment^{1}:

Here, *N* represents the largest clone observed and ⟨(n(t)⟩ is the average mutant clone size. This assessment of neutrality allows for a straightforward method when applied to simulation data and patient biopsies. Clone size can be calculated by 2 * *VAF* * *B*, where *B* is either patient biopsy size or simulated biopsy area in *mm*^{2}.

### NOTCH1 Advantage

Upon induction of a mutation within NOTCH 1 a fitness advantage is given to that cell and its progeny, if any. The fitness advantage gained through a NOTCH1 mutation is non-proliferative and allows the cell to remain within the basal layer longer than it would have otherwise. This is accomplished utilizing a blocking probability, *f*_{0} which is independent for each cell regardless of clonal lineage. This indirectly impacts cell division at the basal layer, since successful division requires displacement of neighboring cells towards the corneal layer. We assessed a range of *f*_{0} values up to *f*_{0}= 1.0, where the tissue height is a quarter of its parameterized height.

### TP53 Advantages

Introducing functional heterogeneity by a TP53 advantaged clone does not involve a cell intrinsic parameter. Rather, UV damage kills cells on a given sun damage day, S, where the proportion of non-TP53 mutant cells is given by θ_{S} (Figure 2A). The TP53 mutant cells are subject to the same dynamics aside from this reprieve from death induced by UV damage. We determined the parameters to use for the sun damage day set, defined by the number of days and the spacing of those days throughout the year, and θ_{S} by approximating our 3D-HCA using an ODE that models keratinocyte population dynamics in a space limited condition.

Sun exposure varies within the lifetime of each individual. We assume that a baseline UV exposure is taken into account for the parameterization of the model, encompassing the empirically derived turnover (birth/death) of cells within the basal layer. Thus, it is necessary to have a method of UV damage and subsequent TP53 mutant advantaged clones without a direct increase in proliferation rates. The number of sun damage days within a given year defines the sun day set (S) and can take on a number of values depending on how those sun damage days are spaced throughout the year.

There are 365 possible sun damage days a year. It is unlikely that an individual would be subjected to all of these sun days, but there is a high degree of variability between individuals. In order to assess a range of values for the number of sun days within a given year, the spacing of those days, and the proportion of cells damaged by UV damage on each of those sun days we employ an ordinary differential equation (9). This approach takes a complex model, reduces its complexity, and spatial dynamics necessary to evaluate clonal dynamics in order to assess UV damage on the spatial model. We ignore the necessary information that can only be gained from a functionally heterogeneous spatial context in order to determine and implement three additional parameters to then incorporate into the spatial model (3D-HCA). The total population size over time for our homeostatic, spatial model was approximated by can be defined by an ordinary equation model that describes population growth dynamics over time in a space limited condition. The equation is
where *N* is the number of keratinocytes, *r* indicates a growth rate, *K* stands for a carrying capacity representing space limitation. Equation (9) allows us to approximate the same initializing conditions within the spatial model (3D-HCA) by initializing the model with the total number of cells in the simulated domain. Upon initialization the population reduces to the carrying capacity (*k*) over time. In order to incorporate death from UV-damage we introduce a new component:
where the proportion of cells, *θ*_{S}, at time, *t*, is killed by sun exposure. The indicator function defined as,

The set of sun days, *S =* {*t, t* ∈ ℤ, *t* ≤ 365} because *S* ⊂ *Y* (the total days in a year), where UV damage occurs depends on the number of days chosen to repeat each year and the spacing of those days. We see that at realistic values for *θ*_{S} the ODE model is able to approximate our spatial simulations well, while a larger set of sun days and higher *θ*_{S} result in complete, or nearly complete, loss of tissue (Figure S2). In addition, the spatial model (3D-HCA) follows a logistic growth after a delayed period. This delayed period is exacerbated by higher values of *θ*_{S} and/or by less spacing in the number of sun days. This is the result of cells settling upon large death, creating a sparse tissue (in the absence of a TP53 mutant), the cells settle, partially, towards the basal layer prior to re-establishing homeostasis (Figure S2).

Here we assess a range of values for *θ*_{S} and sun day sets, *S.* The number of sun days varies for each individual as does the number of spacings between sun exposure days. We attempt to address both of these variables with regard to homeostasis. We assessed a number of sun days throughout the year (7, 20, and 100) and the spacing of those sun days (1 day to evenly spaced throughout the year). We can see that the fewer the number of sun days the least amount of tissue is lost (Figures S2 and S3). Likewise, the more spaced out these sun days are the closer to homeostasis the tissue remains (Figure S3). This is true for both the ODE and spatial models.

### Departure from Homeostasis

In order to assess how the 3D-HCA behaves in response to functional heterogeneity being introduced through NOTCH1 and TP53 mutants we use a distance metric *d*_{H}. This distance metric summarizes the departure from homeostasis given a normal morphology such that the overall population size is constant throughout the simulation (e.g. all neutral simulations). It is defined as
where *N*_{k} and *N*_{s} are the population size at the given time, *t*_{i}, for the homeostatic and simulation in question, respectively. This provides a single, relative value to assess how far from homeostasis any simulation departs.

### Source Code Availability

The model was created using the HAL framework^{9} with a custom module (HAL framework available here: https://github.com/MathOnco/HAL.git).

## Footnotes

↵† These authors share senior authorship.