# Fluctuating fitness shapes the clone-size distribution of immune repertoires

See allHide authors and affiliations

Edited by José N. Onuchic, Rice University, Houston, TX, and approved November 11, 2015 (received for review July 2, 2015)

## Significance

Receptors on the surface of lymphocytes specifically recognize foreign pathogens. The diversity of these receptors sets the range of infections that can be detected and fought off. Recent experiments show that, despite the many differences between these receptors in different cell types and species, their distribution of diversity is a strikingly reproducible power law. By introducing effective models of repertoire dynamics that include environmental and antigenic fluctuations affecting lymphocyte growth or “fitness,” we show that a temporally fluctuating fitness is responsible for the observed heavy tail distribution. These models are general and describe the dynamics of various cell types in different species. They allow for the classification of the functionally relevant repertoire dynamics from the features of the experimental distributions.

## Abstract

The adaptive immune system relies on the diversity of receptors expressed on the surface of B- and T cells to protect the organism from a vast amount of pathogenic threats. The proliferation and degradation dynamics of different cell types (B cells, T cells, naive, memory) is governed by a variety of antigenic and environmental signals, yet the observed clone sizes follow a universal power-law distribution. Guided by this reproducibility we propose effective models of somatic evolution where cell fate depends on an effective fitness. This fitness is determined by growth factors acting either on clones of cells with the same receptor responding to specific antigens, or directly on single cells with no regard for clones. We identify fluctuations in the fitness acting specifically on clones as the essential ingredient leading to the observed distributions. Combining our models with experiments, we characterize the scale of fluctuations in antigenic environments and we provide tools to identify the relevant growth signals in different tissues and organisms. Our results generalize to any evolving population in a fluctuating environment.

Antigen-specific receptors expressed on the membrane of B- and T cells (B-cell receptors, BCRs and T-cell receptors, TCRs) recognize pathogens and initiate an adaptive immune response (1). An efficient response relies on the large diversity of receptors that is maintained from a source of newly generated cells, each expressing a unique receptor. These progenitor cells later divide or die, and their offspring make up clones of cells that share a common receptor. The sizes of clones vary, as they depend on the particular history of cell divisions and deaths in the clone. The clone-size distribution thus bears signatures of the challenges faced by the adaptive system. Understanding the form of the clone-size distribution in healthy individuals is an important step in characterizing the antigenic recognition process and the functioning of the adaptive immune system. It also presents an important starting point for describing statistical deviations seen in individuals with compromised immune responses.

High-throughput sequencing experiments in different cell types and species (2⇓⇓⇓⇓⇓⇓–9) have allowed for the quantification of clone sizes and their distributions (2, 9⇓–11). Previous population dynamics approaches to repertoire evolution have taken great care in precisely modeling these processes for each compartment of the population, through the various mechanisms by which cells grow, die, communicate, and change phenotype (12⇓⇓⇓⇓–17). However, one of the most striking properties of repertoire statistics revealed by high-throughput sequencing is the observation of power laws in clone-size distributions (Fig. 1 *A* and *B*), which holds true for various species (human, mice, zebrafish), cell type (B- and T cells), and subsets (naive and memory, CD4 and CD8), and seems to be insensitive to these context-dependent details. It remains unclear, however, what universal features of these dynamics lead to the observed power-law distributions. Here we identify the key biological parameters of the repertoire dynamics that govern its behavior.

The wide range and types of interactions that influence a B- or T-cell fate happen in a complex, dynamical environment with inhomogeneous spatial distributions. They are difficult to measure in vivo, making their quantitative characterization elusive. Motivated by the universality of the observed clone-size distribution, we describe the effective interaction between the immune cells and their environment as a stochastic process governed by only a few relevant parameters. All cells proliferate and die depending on the strength of antigenic and cytokine signals they receive from the environment, which together determine their net growth rate (Fig. 1*C*). This effective fitness that fluctuates in time is central to our description. We find that its general properties determine the form of the clone-size distribution. We distinguish two broad classes of models, according to whether these fitness fluctuations are clone-specific (mediated by their specific BCR or TCR) or cell-specific (mediated by phenotypic fluctuations such as the number of cytokine receptors). We identify the models that are compatible with the experimentally observed distributions of clone sizes. These distributions do not depend on the detailed mechanisms of cell signaling and growth, but rather emerge as a result of self-organization, with no need for fine-tuned interactions. Performing a series of validated approximations, we find a simple algebraic relationship constraining the different timescales of the problem by the experimentally observed exponent of the clone-size distribution. This result allows for testable predictions and estimates of the rates that govern the diversity of a clonal distribution.

## Results

### Clone Dynamics in a Fluctuating Antigenic Landscape.

The fate of the cells of the adaptive immune system depends on a variety of clone-specific stimulations. The recognition of pathogens triggers large events of fast clone proliferation followed by a relative decay, with some cells being stored as memory cells to fend off future infections. Naive cells, which have not yet recognized an antigen, do not usually undergo such extreme events of proliferation and death, but their survival relies on short binding events (called “tickling”) to antigens that are natural to the organism (self-proteins) (18, 19). Because receptors are conserved throughout the whole clone (with the exception of B-cell hypermutations), clones that are better at recognizing self-antigens and pathogens will on average grow to larger populations than bad binders. By analogy to Darwinian evolution, they are “fitter” in their local, time-varying environment.

We first present a general model for clonal dynamics that accounts for the characteristics common to all cell types, following previous work by de Boer, Perelson, and collaborators (14, 20, 21). We later explore the effect of specific features such as hypermutations, memory/naive compartmentalization, and thymic output decay on the clone-size distribution.

We denote by *j* as a function of time. We assume that after its introduction at a random time *i* interacts with antigen *j* (foreign or self) is encoded in the cross-reactivity function *i* and *j* do not interact, or a positive number drawn from a distribution to be specified, if they do. In general, interactions between lymphocytes and antigens effectively promote growth and suppress cell death, but for simplicity we can assume that the effect is restricted to the division rate. In a linear approximation, this influence is proportional to *j* for which clone *i* is specific. This leads to the following dynamics for the evolution of the size *i* (Fig. 1*C*):*ν* and *μ* are the basal division and death rates, respectively, and where *SI Appendix*, section A for details about birth–death noise).

New clones, with a small typical initial size *C*). For example, a number on the order of ^{−1} in homeostatic conditions (22). Because the probability of rearranging the exact same receptor independently is very low (**1** is *SI Appendix*, Fig. S2 and *SI Appendix*, section B).

We simulated the dynamics of a population of clones interacting with a large population of antigens. Each antigen interacts with each present clone with probability *E* (green curve), and shows how clone growth tracks the variations of the antigenic environment. When the stimulation is particularly strong, the model recapitulates the typical behavior experimentally observed at the population level following a pathogenic invasion (26, 27), as illustrated in Fig. 1*D*: The population of a clone explodes (red curve), driving the growth of the total population (blue curve), while taking over a large fraction of the carrying capacity of the system, and then decays back as the infection is cleared.

On average, the effects of division and death almost balance each other, with a slight bias toward death because of the turnover imposed by thymic or bone marrow output. However, at a given time, a clone that has high affinity for several present antigens will undergo a transient but rapid growth, whereas most other clones will decay slowly toward extinction. In other words, locally in time, the antigenic environment creates a unique “fitness” for each clone. Because growth is exponential in time, these differential fitnesses can lead to very large differences in clone sizes, even if variability in antigen concentrations or affinities is nominally small. We thus expect to observe large tails in the distribution of clone size. Fig. 2*A* shows the cumulative probability distribution function (CDF) of clone sizes obtained at steady state (blue curve) showing a clear power-law behavior for large clones, spanning several decades.

The exponent of the power law is independent of the introduction size of clones (Fig. 2*A*, *Inset*) and the specifics of the randomness in the environment (exponential decay, random number of partners, random interaction strength) as long as its first and second moment are kept fixed (*SI Appendix*, Fig. S3 and *SI Appendix*, section C).

### Simplified Models and the Origin of the Power Law.

To understand the power-law behavior observed in the simulations, and its robustness to various parameters and sources of stochasticity, we decompose the overall fitness of a clone at a given time (its instantaneous growth rate) into a constant, clone-independent part equal to its average **1** as

The function *i*. Because antigens can be recognized by several receptors, these fluctuations may be correlated between clones. Assuming that these correlations are weak, *SI Appendix*, section D). This is also the process of maximum entropy or caliber (28) with that autocorrelation function (*SI Appendix*, section E and ref. 29).

The effect of the birth–death noise *SI Appendix*, Fig. S5 and *SI Appendix*, section F). It can thus be ignored when looking at the tail of the distribution and its power-law exponent, but it will play an important role for defining the range over which the power law is satisfied.

The population dynamics described by Eqs. **2** and **3** can be reformulated in terms of a Fokker–Planck equation for the joint abundance *ρ* of clones of a given log size *f*:*A*, matches closely that of the full simulated population dynamics (in blue). The power-law behavior is apparent above a transition point that depends on the distribution of introduction sizes of new clones and the parameters of the model (see below). Intuitively, the microscopic details of the noise are not expected to matter when considering long timescales, as a consequence of the central limit theorem. However, the long tails of the distribution of clone sizes involve rare events and belong to the regime of large deviations, for which these microscopic details may be important. Therefore, the agreement between the process described by the overdamped spring and the exponentially decaying, Poisson-distributed antigens is not guaranteed, and in fact does not hold in all parameter regimes (*SI Appendix*, Fig. S8).

We can further simplify the properties of the noise by assuming that its autocorrelation time is small compared with other timescales. This leads to taking the limit *SI Appendix*, section F and *SI Appendix*, Fig. S4). The corresponding Fokker–Planck equation now reads*SI Appendix*, section F). The full solution, represented in Fig. 2*A* in red, captures well the long-tail behavior of the clone-size distribution despite ignoring the temporal correlations of the noise, and approaches the solution of the colored-noise model (Eq. **3**) as *A*).

The power-law behavior and its exponent depend on the noise intensity, but are otherwise insensitive to the precise details of the microscopic noise, including its temporal properties. Fat tails (small *α*) are expected when the average cell lifetime is long (small *σ* or *A*). The explicit expression for the exponent of the power law ^{−1}. We estimated *A* (2, 10) using canonical methods of power-law exponent extraction (30) (see *SI Appendix*, section G for details), and also found a similar value in human T cells (31). The resulting estimate, ^{−1}, is rather striking, as it implies that fluctuations in the net clone growth rate, *A*, are much larger than its average

Whereas the distribution always exhibits a power law for large clones, this behavior does not extend to clones of arbitrarily small sizes, where the details of the noise and how new clones are introduced matter. We define a power-law cutoff *B* and *C* we show how *A* with our simplified model using *λ* as an adjustable parameter, we obtain ^{−1} (*SI Appendix*, section G), which corresponds to a characteristic lifetime of antigens of around a week. Although this estimate must be taken with care, because of possible PCR amplification biases plaguing the small clone size end of the distribution, the procedure described here can be applied generally to any future repertoire sequencing dataset for which reliable sequence counts are available.

### A Model of Fluctuating Phenotypic Fitness.

So far, we have assumed that fitness fluctuations are identical for all members of a same clone. However, the division and death of lymphocytes do not only depend on signaling through their TCR or BCR. For example, cytokines are also growth inducers and homeostatic agents (32, 33), and the ability to bind to cytokines depends on single-cell properties such as the number of cytokine receptors on the membrane of a given cell, independent of their BCR or TCR. Other stochastic single-cell factors may affect cell division and death. These signals and factors are cell-specific, as opposed to the clone-specific properties related to BCR or TCR binding. Together, they define a global phenotypic state of the cell that determines its time-varying fitness, independent of the clone and its TCR or BCR. This does not mean that these phenotypic fitness fluctuations are independent across the cells belonging to the same clone. Cells within a clone share a common ancestry, and may have inherited some phenotypic properties of their common ancestors, making their fitnesses effectively correlated with each other. However, this phenotypic memory gets lost over time, unlike fitness effects mediated by antigen-specific receptors.

We account for these phenotypic fitness fluctuations by a function *c* differs from the average fitness **3**, it is important to note that here the fitness dynamics occurs at the level of the single cell (and its offspring) instead of the entire clone. The dynamics of the fitness *i* can be approximated from Eq. **7** by averaging the fitnesses *ν* and *μ* are the average birth and death rates, respectively, so that *SI Appendix*, section I). The difference with Eq. **3** is the **8** and **9** at the population level gives a Fokker–Planck equation with a source term accounting for the import of new clones. We verify the numerical steady-state Fokker–Planck solution against Gillespie simulations (*SI Appendix*, Fig. S6; see *SI Appendix*, section H for details).

Fig. 3 *A* and *B* shows the distribution of clone sizes for different values of the phenotypic relaxation rate *C* shows a strong dependency of this cutoff with the phenotypic memory

The model can be solved exactly at the two extremes of the heritability parameter *SI Appendix*, sections I and J), yielding an exact power law with exponential cutoff, *B* is close to this limit. Note that even with a negligible exponential cutoff, the predicted

## Discussion

The model introduced in this paper describes the stochastic nature of the immune dynamics with a minimal number of parameters, helping interpret the different regimes. These parameters are effective in the sense that they integrate different levels of signaling, pathways, and mechanisms, focusing on the long timescales of clone dynamics. We assumed that they are general enough that different cell types (B- and T cells) or subsets (naive or memory) can be described by the same dynamical equations despite their differences. How do refined models including these differences affect our results?

Naive and memory cells differ in their turnover rate, i.e., their death rate, memory cells being renewed at a pace 10 times faster than naive ones (34). In our model, this difference is reflected in a higher birth–death noise for memory cells. We have shown that this noise had no effect on the tail of the clone-size distribution for clone-specific fitness (*SI Appendix*, Fig. S5), whereas it was important for the case of a cell-specific fitness, where birth–death noise contributed to the distribution to the same extent as fitness fluctuations. However, some repertoire datasets mix both naive and memory sets, and one could wonder whether our results hold for such mixtures. To examine this question, we simulated a simple two-compartment model where naive cells get irreversibly converted into memory cells when their stimulation is above a certain threshold (see *SI Appendix*, section K for details). We found that when fitness was clone-specific, the clone-size distribution of the mixture and that of memory cells alone still follow a power law, whereas that of naive cells only does so when conversion to memory upon stimulation is partial (*SI Appendix*, Fig. S12). Repeating the same analysis for the cell-specific fitness model, we found that clone-size distributions for each phenotype differed according to their respective birth–death noises, with a longer tail for memory cells as expected from their higher turnover rate.

The main difference between B- and T cells ignored by our model is that BCRs accumulate hypermutations upon proliferation. We studied this effect by allowing proliferating clones to spawn new clones with slightly modified affinities to antigens (*SI Appendix*, section L). The resulting clone-size distribution still follows a power law (*SI Appendix*, Fig. S13), although with a slightly smaller exponent due to increased stochasticity.

Another simplifying assumption of our model is that the dynamics reaches a steady state. This may be challenged by the decay of the thymic output *SI Appendix*, section M). The clone-size distributions at different points in time, shown in *SI Appendix*, Fig. S14, still follow a power law. Interestingly, the exponent *α* is predicted to decrease with age, consistent with

We showed that the relevant sources of stochasticity for the shape of the clone-size distributions fall into two main categories, depending on how cell fate is affected by the environment. Either the stochastic elements of clone growth act in a clone-specific way, through their receptor (BCR or TCR), leading to power-law distributions with exponent

The application of our method to data from the first immune repertoire survey [BCRs in zebrafish (2)] suggests that clone-specific noise dominates in that case, allowing us to infer a relation between the dynamical parameters of the model from the observed power-law exponent

Thanks to its generality, our model is also relevant beyond its immunological context, and follows previous attempts to explain power laws in other fields (41⇓–43). The dynamics described here corresponds to a generalization of the neutral model of population genetics (44) where thymic or bone marrow outputs are now reinterpreted as new mutations or speciations, and where we have added a genotypic or phenotypic fitness noise (receptor or cell-specific noise, respectively). It was recently shown that such genotypic fitness noise strongly affects the fixation probability and time in a population of two alleles (45, 46). Note that, because new thymic or bone marrow clones are unrelated to existing clones, there are no lineage histories, in contrast with previous theoretical work on evolving populations in fluctuating fitness landscapes (47⇓–49). Our main result (Eq. **6**) shows how fitness noise can cause the clone-size distribution (called “frequency spectrum” in the context of population genetics) to follow a power law with an arbitrary exponent

## Acknowledgments

This work was supported in part by Grant ERCStG 306312.

## Footnotes

- ↵
^{1}To whom correspondence should be addressed. Email: tmora{at}lps.ens.fr.

Author contributions: J.D., T.M., and A.M.W. designed research, performed research, contributed new reagents/analytic tools, analyzed data, and wrote the paper.

The authors declare no conflict of interest.

This article is a PNAS Direct Submission.

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1512977112/-/DCSupplemental.

Freely available online through the PNAS open access option.

## References

- ↵.
- Janeway C

- ↵.
- Weinstein JA,
- Jiang N,
- White RA 3rd,
- Fisher DS,
- Quake SR

- ↵.
- Ndifon W, et al.

*β*gene segment usage. Proc Natl Acad Sci USA 109(39):15865–15870 - ↵.
- Thomas N, et al.

- ↵.
- Larimore K,
- McCormick MW,
- Robins HS,
- Greenberg PD

- ↵.
- Sherwood AM, et al.

- ↵.
- Robins HS, et al.

- ↵.
- Zvyagin IV, et al.

- ↵.
- Warren RL, et al.

- ↵.
- Mora T,
- Walczak AM,
- Bialek W,
- Callan CG Jr

- ↵
- ↵
- ↵
- ↵
- ↵
- ↵
- ↵
- ↵.
- Troy AE,
- Shen H

- ↵.
- Mak T,
- Saunders M

- ↵
- ↵
- ↵.
- Bains I,
- Antia R,
- Callard R,
- Yates AJ

- ↵.
- Murugan A,
- Mora T,
- Walczak AM,
- Callan CG Jr

- ↵.
- Kosmrlj A,
- Jha AK,
- Huseby ES,
- Kardar M,
- Chakraborty AK

- ↵
- ↵
- ↵
- ↵
- ↵
- ↵
- ↵
- ↵
- ↵.
- Tan JT, et al.

- ↵
- ↵.
- Seddon B,
- Zamoyska R

- ↵.
- Tanchot C,
- Lemonnier FA,
- Pérarnau B,
- Freitas AA,
- Rocha B

- ↵.
- Nesić D,
- Vukmanović S

- ↵.
- Best K,
- Oakes T,
- Heather JM,
- Taylor JS,
- Chain B

- ↵.
- Vollmers C,
- Sit RV,
- Weinstein JA,
- Dekker CL,
- Quake SR

- ↵
- ↵.
- Sornette D,
- Cont R

- ↵
- ↵
- ↵.
- Kimura M

- ↵.
- Cvijović I,
- Good BH,
- Jerison ER,
- Desai MM

*The*Fate*of a*Mutation*in a*F*luctuating*E*nvironment*, preprint - ↵
- ↵.
- Leibler S,
- Kussell E

- ↵.
- Mustonen V,
- Lässig M

- ↵.
- Rivoire O,
- Leibler S