Engineered bacterial sensors have potential applications in human health monitoring, environmental chemical detection, and materials biosynthesis. While such bacterial devices have long been engineered to differentiate between combinations of inputs, their potential to process signal timing and duration has been overlooked. In this work, we present a two‐input temporal logic gate that can sense and record the order of the inputs, the timing between inputs, and the duration of input pulses. Our temporal logic gate design relies on unidirectional DNA recombination mediated by bacteriophage integrases to detect and encode sequences of input events. For an E. coli strain engineered to contain our temporal logic gate, we compare predictions of Markov model simulations with laboratory measurements of final population distributions for both step and pulse inputs. Although single cells were engineered to have digital outputs, stochastic noise created heterogeneous single‐cell responses that translated into analog population responses. Furthermore, when single‐cell genetic states were aggregated into population‐level distributions, these distributions contained unique information not encoded in individual cells. Thus, final differentiated sub‐populations could be used to deduce order, timing, and duration of transient chemical events.
A synthetic temporal logic gate is combined with stochastic single‐cell digital response to achieve population‐level encoding of the order, timing, and duration of two chemical inputs.
Bacteriophage integrases can be used to create a synthetic temporal logic gate that records the timing and duration of two chemical inputs. Based on the sequence of events, single cells have four possible final genetic states.
While single cells have digital states, stochastic noise translates into population distributions that are analog with response to timing and duration of input signals.
A combination of mathematical model simulations and in vivo E. coli experiments was used to show that the final proportion of cells in each state provides detailed information about past events that cannot be deduced from single cell analysis alone.
The stochasticity and delays inherent in biological systems can and should be utilized to provide additional avenues of information and control in synthetic systems.
Engineered bacteria could one day be powerful self‐replicating biosensors with environmental, health, and industrial applications. Synthetic biology has made important strides in identifying and optimizing genetic components for building such devices. In particular, much work has focused on Boolean logic gates that detect the presence or absence of static chemical signals (Gardner et al, 2000; Anderson et al, 2007; Wang et al, 2011; Moon et al, 2013; Shis et al, 2014) and compute a digital response.
Temporal logic gates, which process time‐varying chemical signals, have been much less explored. Pioneering work by Friedland et al (2009) used serine integrase‐based recombination for the counting and detection of sequential pulses of inducers. But thus far, no work has studied the potential for temporal logic gates to provide information about the duration of a signal, or the time between two chemical events. Here, we present a temporal logic gate that allows us to infer analog signal timing and duration information about the sequential application of two inducer molecules to a population of bacterial cells.
Similar to previous temporal logic gates, our design takes advantage of the irreversibility of serine integrase recombination. While bistable switches have been successfully deployed as memory modules in genetic circuits (Kotula et al, 2014), such switches require constant protein production to maintain state and are sensitive to cell division rates and growth phase. The large serine integrases, on the other hand, reliably and irreversibly flip or excise unique fragments of DNA (Yuan et al, 2008). Thus, logic circuits built from integrases intrinsically include DNA‐level memory that requires virtually no cellular resources to maintain state, thus enabling permanent and low‐cost genetic differentiation of individual bacterial cells based on transient integrase induction. Further advantages of the serine integrases include the short length (40–50 bp) and directionality of their attachment sites. Serine integrases recognize flanking DNA binding domains (attB, attP) and subsequently digest, flip or excise, and re‐ligate the DNA between the attachment sites. Flipping or excision activity is determined by the relative orientation of the sites, which allows complex orientation‐dependent behavior to be programmed into integrase circuits. Well‐known serine integrases include Bxb1, TP901‐1, and ΦC31, all of which have been used to demonstrate static‐input logic gates (Bonnet et al, 2013; Siuti et al, 2013), and some have cofactors that can reverse directionality (Khaleel et al, 2011; Bonnet et al, 2012). Recently, an entirely new set of 11 orthogonal integrases was characterized, greatly expanding the set of circuits that can be built (Yang et al, 2014).
In contrast to previous studies of temporal logic gates, our work leverages the stochastic nature of single‐cell switching to create a robust population‐level response to a time‐varying chemical signal. The fundamental nature of living cells that makes them so attractive for engineering—their extremely low energy operation in the limit of using small numbers of molecules to represent information—is also inextricably linked to stochasticity and noise. By traditional engineering standards, synthetic circuits would ideally perform identically within every cell of a population. When this ideal is applied to biology, the stochastic nature of molecular processes, particularly at low‐copy numbers, presents a significant barrier to reliable outputs from engineered cells. Thus, while natural cellular dynamics and differentiation take advantage of noisy gene expression (Elowitz et al, 2002; Süel et al, 2007), synthetic circuits often require noise reduction for proper function (Dunlop et al, 2008). Recent work has taken a different direction, toward understanding of population‐level dynamics. This includes analysis of both stochastic cellular responses to inputs (Uhlendorf et al, 2012; Ruess et al, 2015) and changes in collective population‐level memory in response to stress (Mathis & Ackermann, 2016). Such efforts suggest that a deeper understanding of the inherent heterogeneity in biological systems might eventually lead to circuit designs that operate on distributions of cellular responses, rather than depending on homogeneous responses from all cells.
It is with this vision in mind that we designed a two‐input temporal logic gate using strategically interleaved and oriented integrase (Bxb1, TP901‐1) DNA recombination sites and used this gate to engineer an E. coli strain with four possible genetically differentiated end states. This strain contains single genomic copies of the temporal logic gate, ensuring digital‐yet‐stochastic responses from individual cells. We then utilized the heterogeneity of individual cellular responses to encode sequences of chemical inputs into the overall population response and used a stochastic model of single‐cell trajectories to predict the population response. By analyzing the distributions of final cell states, we can deduce the timing and pulse duration of transient chemical pulses and show that cumulative population‐level distributions contain additional event information not encoded in any single cell. Furthermore, because the states are genetically encoded, we can recover details of a chemical event long after its occurrence.
Design of a two‐integrase temporal logic gate
We have designed a two‐input temporal logic gate that differentiates between the start times of two chemical inputs and produces unique outputs accordingly (Fig 1A). The design relies on a system of two integrases with nested DNA attachment sites (Fig 1B). The use of integrases irreversibly inverts segments of DNA, resulting in a memory feature that can be maintained for multiple generations (Bonnet et al, 2012).
The design of the integrase temporal logic gate hinges on interleaving the attB attachment site of integrase B (intB) with the attP site of integrase A (intA), thus ensuring that the possible DNA flipping outcomes are mutually exclusive (Fig 1B). The serine integrases used in this design are TP901‐1 (intA) and Bxb1 (intB). The fluorescent proteins mKate2‐RFP (RFP) and superfolder‐GFP (GFP) are used both as placeholders for future downstream gene activation and as real‐time readouts of the logic gate. The design also features a terminator (Bba‐B0015) and a strong constitutive promoter (P7). In the case where there are no inputs, the terminator prevents expression of RFP from the constitutive promoter.
There are five possible basic events that could occur in a two‐input system (Fig 1C): no input, inducer a only (Ea), inducer b only (Eb), inducer a followed by b at a later time (Eab), and inducer b followed by a at a later time (Eba). Consequently, in a perfectly resolved temporal logic gate, there should be five unique DNA states corresponding to the five types of events: So (the initial state), Sa, Sb, Sab, and Sba. This design is limited to only four DNA states due to excision when Eb occurs (Sb = Sba). The two fluorescent outputs correspond to the two states that occur when inducer a is detected first—RFP is produced when the cell is in state Sa, and GFP is produced when the cell is in state Sab.
Fig 1D illustrates the sequence of recombination that occurs during an event Eab that results in DNA state Sab and the production of GFP. Upon addition of inducer a at time t1, TP901‐1 flips the DNA between its attachment sites, reversing the directionality of the terminator and the Bxb1 attB recognition site (state Sa). Then, when inducer b is added at some time t2 that is greater than t1, the directionality of the Bxb1 sites is such that the DNA is flipped to reverse the directionality of the P7 constitutive promoter (state Sab). If inducer b is added first (Fig 1E), the Bxb1 attachment sites are unidirectional—a configuration that results not in recombination, but in excision of the DNA between the sites (state Sb).
Once DNA recombination has occurred, it is irreversible. The unique attB and attP attachment sites are recombined into attL and attR sites, respectively, with which the integrases cannot bind to without additional cofactors (Ghosh et al, 2005). The nesting of the integrase attachment sites is the key design feature that produces the temporal a then b logic, and the irreversibility of the recombination records the event in DNA memory. The result is a genetic record that can both be sequenced later and immediately read via constitutive production of fluorescent outputs.
A Markov model for integrase recombination
The most compelling advantage of engineered biological systems over man‐made sensors lies in their inherent capabilities for replication and parallel sensing with minimal energy and resource requirements. Thus, deployment of synthetic bacterial devices would almost certainly involve populations of cells, never just a single cell. It is therefore important to understand how stochastic single‐cell responses affect overall population‐level distributions and outcomes. We created a Markov model of integrase‐mediated DNA flipping and then used a stochastic simulation algorithm (Gillespie, 1977) to simulate individual cell trajectories (Fig 2A). All of the four possible DNA states are represented in the model: the original state (So), the intB excision state (Sb), the intA single flip state (Sa), and the a then b double flip state (Sab). We have implemented the system experimentally by chromosomally integrating the target DNA into the genome of the E. coli cell. This allows us to assume that each cell only has one copy of the temporal logic gate (Haldimann & Wanner, 2001) and that each cell can be characterized by the tuple (DNA; IntA; IntB) (Fig 2B). The DNA terms are So, Sa, Sb, or Sab, and IntA and IntB are non‐negative integers representing the molecular copy number of each integrase. Once a DNA cassette has flipped into any of the states other than the original state So, there is no reverse process. The logic gate is designed such that if integrase B is expressed prior to integrase A, the DNA cassette is excised and the chain reaches the dead‐end Sb state. In order for a cell to successfully detect Eab, it first needs to switch into state Sa then transition into state Sab upon addition of inducer b.
Since each cell contains only a single copy of the temporal logic gate DNA, we can expect each cell to behave differently and to be highly susceptible to internal and external noise. This stochastic behavior will create a heterogeneous population response that can be analyzed for a more complex profile of event than if all the cells behaved uniformly. In order to capture the heterogeneity of cell population, we model the temporal logic gate using a stochastic model. Specifically, the stochastic transitions between the DNA states and the production/degradation of integrases are mathematically modeled by a continuous‐time Markov chain over the state space (DNA; IntA; IntB) as illustrated in Fig 2B. Definitions of transition rates can be found in Appendix Table S1.
In silico, the dynamics of a single cell translates to each stochastic simulation of the Markov model starting with (DNA = So; IntA = 0; IntB = 0) state. We define ;;; and as the probability that the DNA state of a single cell is So; Sa; Sb; and Sab at time t, respectively.
The temporal dynamics of the probability can be modeled by the following ordinary differential equation (ODE) (see equation (1)).
Where the notation stands for the conditional expected value at time t (Full derivation, Appendix Section 12.1).
Serine integrases are produced as monomers that form dimers, search for specific attB and attP sequences, and, once both attB and attP sites are occupied, form a tetramer (dimer of dimers) that digests, flips, and re‐ligates the DNA (Yuan et al, 2008; Rutherford et al, 2013). Though some cooperativity in ΦC31 binding to attB has been found (McEwan et al, 2009), cooperativity in Bxb1 or TP901‐1 integrase binding to attB and attP not been observed (Ghosh et al, 2005; Singh et al, 2014).
Rather than account for all individual DNA‐integrase interactions, we have created a minimal model of stochastic transitions where only the final DNA states (So, Sa, Sb, Sab) and the number of integrase monomer molecules (intA, intB) are tracked and all integrase activity is encompassed in the term. Since no cooperativity has been observed in Bxb1 or TP901‐1 DNA binding (i.e., occupation of attB does not increase the probability of attP binding), we represent the required tetramerization as a fraction where flipping efficiency is zero unless at least four molecules are present. Thus, the propensity functions for state transitions as a function of integrase concentration, , are defined in equation (2)
where is integrase concentration; is the dissociation constant; is the rate of flipping if the tetramer is formed; i = 1; 2; 3; and * = A;B (See Appendix Fig S1 for visualization of and Appendix Section 12.2 for full derivation).
We also define the time between the introduction of the first inducer () and the arrival of the second inducer () as the inducer separation time (), such that (3) as shown in Fig 2C.
In the following set of simulations and experiments, we will consider cases with step inputs (Fig 2C), where the inducers are either present or not present. Concentrations of the inducers when they are “on” will be held constant. Also, it is important to note that inducer a is still present during and after time Δt when inducer b is introduced.
Simulations of the Markov model were done with biologically plausible parameters in order to predict qualitative circuit behavior (Appendix Table S1). We limited the parameters to only the basic processes (integrase production, degradation, and DNA flipping), and parameter values were chosen to be within biological orders of magnitude. The single production rate constants, and , combine the transcription and translation rates of each integrase. When an integrase in the model is induced, its production rate, , is the sum of and any leaky transcriptional expression, (*= intA or intB). The integrase monomer disassociation constant, , was estimated from measured Bxb1 binding constants (Singh et al, 2013). Parameter values for preliminary simulations were , (2.3 h half‐life), , , and molecules.
Our analysis of initial numerical simulation results highlights the significant role that the inducer separation time, ∆t, plays in setting the final population distributions (Fig 2D). For each ∆t, individual cell trajectories were generated with the assumption that each cell only has one copy of the target DNA (N = 5,000 trajectories). Then, at every time point, the total number of cells in each DNA state is counted (Appendix Fig S2). Fig 2D shows the contrast between adding both inducers simultaneously (∆t = 0 h) and adding inducer b after a 5‐h delay (∆t = 5 h). Since both inducers are present by the end of simulation, all of the cells must have a final state that is either the Sab state or the Sb state. No cells remain in the original So configuration. Sa is a transient state that builds up prior to the addition of inducer b and begins to convert to Sab immediately after the introduction of b. These initial simulation results suggest that ∆t may be a way to reliably tune the final population fractions of Sab versus Sb state cells.
Population distributions reflect inducer order and separation time
We used the model to further investigate the effects of varying both inducer order and separation time on population distributions in our experimental system and to understand the possible outcomes. In Fig 3, we simulate in silico cell populations that have been exposed to a sequence of overlapping step functions (N = 5,000 trajectories).
In the case of an Eab event, the proportion of cells that successfully detect a then b and switch to state Sab is a function of the inducer separation time, ∆t (Fig 3A). High ∆t means increasing the time that cells spend in only inducer a, allowing for most of the population to transition from So → Sa before the addition of any inducer b. Exposing cells to the inverse sequence of events, Eba, results in a decrease of Sab cells proportional to increasing ∆t (Fig 3B). High ∆t in an Eba event means that So → Sb is the dominant reaction and cells that are partitioned into Sb will not respond to inducer a. If we plot the final number of Sab cells from both Eab and Eba as a function of ∆t (Fig 3C), we see that the two curves do not overlap. Sab fractions exposed to Eab increase monotonically with ∆t, while those exposed to Eba decrease monotonically with ∆t. Thus, measuring the fraction of Sab cells by itself is sufficient to determine both the order of events and the timing, ∆t, between them.
Additionally, we can define a detection limit, ∆t90, for which the inducer separation time results in ≥ 90% of the population switching into the Sab state (Fig 3C). This ∆t90 limit provides a way to capture the two response regimes of the population. If the inducer separation time is less than the detection limit (∆t < ∆t90), then the rate of population switching is fast enough such that the number of Sab cells will correspond uniquely to some ∆t value. If ∆t > ∆t90, then most cells have already switched to a final state, and the differences in Sab cell count are too small to uniquely determine ∆t.
The single‐cell limitations of the temporal logic gate circuit can be overcome by measuring the number of Sab cells as a fraction of total cells. Though the logic gate itself does not have a unique genetic Sba state and cannot distinguish between a b only event versus a b then a event, these simulation results suggest that population‐level fractional phenotypes can provide this additional information (Fig 3D). In the case of Eab, fractions of Sab will always be above 50%, while Sab fractions less than 50% indicate Eba. Additional figures showing how populations of Sa, Sb, and So cells change with ∆t can be found in Appendix Figure S3.
In vivo step induction data supported model predictions and showed that population fractions of Sab cells could be tuned using ∆t (Fig 4). DH5α‐Z1 cells were chromosomally integrated with one copy of the integrase target DNA and then transformed with a high‐copy plasmid containing Ptet‐Bxb1 and PBAD‐TP901‐1. When ∆t was varied from 0 to 8 h, we observed results qualitatively similar to model predictions. In Fig 4A, the cells have been exposed to an Eab event, where inducer a is present from time t = 0 h to tend, and b is present from t = ∆t h to tend. GFP expression during time course measurements is used as a proxy for Sab state cells, and flow cytometry was used to measure final populations. Comparisons of bulk fluorescence versus cytometry cell counts suggest that in single‐copy integrants, overall GFP fluorescence is a good approximation of population Sab levels (Appendix Fig S12).
Source Data for Figure 4 [msb156663-sup-0004-SDataFig4.xlsx]
In Fig 4A, the number of cells in the GFP‐expressing Sab state increases proportionally with increasing ∆t and continues to be responsive even when the two inducers are separated by 8 h. There is some expression of GFP in the presence of only inducer a (Ea), indicating some basal levels of intB. RFP expression, a proxy for the number of cells in state Sa, begins to increase at t = 0 h and drops at time t = ∆t when inducer b is added (Appendix Fig S4A). Aligning all of the GFP expression curves by ∆t (Appendix Fig S5) shows that lower values of ∆t not only have lower final GFP expression values, but also have slower rates of GFP production. This is consistent with modeling results because if we assume inducer b has an equal probability of entering any one cell, then in case of small ∆t (∆t90 ≤ 4 h), there are a much larger number of So cells and so the rate of Sa → Sab state conversion will be lower. In the case of ∆t > 4 h, the majority of cells in the population are already in the Sa state configuration, and so the rate of cell state conversion to Sab will be much higher. When cells are exposed to Eba, the number of Sab cells decreases monotonically with increasing ∆t (Fig 4B), and there is no RFP expression above background (Appendix Fig S4B). In both types of events, the cells maintained their state for up to 30 h in liquid culture and when re‐streaked as single colonies. (Additional data with a more distinct color scheme and OD curves for this set of experiments can be found in Appendix Figs S6 and S7. Single‐colony analysis in Appendix Fig S11.)
Final Sab (GFP) population fractions are sufficient to differentiate between populations that have been exposed to Eab versus Eba within 1 h of separation time between inducers (Fig 4C). Final populations after 30 h of growth were measured via flow cytometry and plotted against ∆t. As ∆t increases, so does the Sab sub‐population. The cells that encountered Eba have lower Sab fractions with high ∆t, and at ∆t = 6 h, the final Sab sub‐population is equal to the baseline expression of the b only population, indicating that the addition of inducer a after a 6‐h exposure to only inducer b has no effect at all. Based on where the GFP fraction exceeds 90% of the maximum Sab population fraction, the ∆t90 detection limit for the experimental system is ~4 h. These experimental results show that the Sab population fraction clearly diverges for Eab and Eba when ∆t 0 h, indicating that Sab fractions alone can be used to determine both event order and separation time.
Further analysis of population‐level data for all of the measurable fluorescent cell states can provide additional insights into differences in sub‐population growth rates and leaky integrase expression (Fig EV1, Appendix Figs S8–S10). In Fig EV1, experimental populations from the step input experiments have been gated into quadrants such that Sab, Sa, and So + Sb populations can be counted. Even with maximum induction at highest ∆t, the maximum population fraction that can be switched appears to be approximately 60% of the total population. We believe this is due to the non‐fluorescent cells (So, Sb) having a slight growth advantage over differentiated cells. Studies have shown that unnecessary protein production has inverse effects on cell growth (Tan et al, 2009; Scott et al, 2010), and even with single‐copy integrants, this would result in some overrepresentation of non‐fluorescent sub‐populations within the population. Single‐colony analysis of the final populations shows that So cells persist in the population even with 30–40 h of inducer exposure (Appendix Fig S11E).
Leaky expression of intA and intB can also be inferred from the no inducer, a only, and b only populations (Fig EV1A and B), and we can conclude that leaky expression is quite low, not exceeding ~0.5–3%. Even accounting for the overrepresentation of non‐fluorescent cells, the baseline population split when both a and b are added simultaneously (∆t = 0 h) is just under 50% of the total GFP population fraction. This suggests that the integrase flipping rates, and , may not be equal and that the basal expression rates, should be nonzero.
Varying model parameters for integrase activity and basal expression
Prior to proceeding with additional model‐driven experimental designs, model parameters were modified to better represent asymmetrical integrase activity. The parameters for integrase flipping and leaky basal expression were tuned to account for the asymmetrical population responses to Eab versus Eba events (Fig 4C). We hypothesized that this asymmetry arises from a combination of unequal integrase activity when searching for and flipping the DNA, as well as leaky background expression of the integrases (Fig 5).
To understand overall trends in model behavior, we varied and while holding the other parameters constant. When the relative flipping efficiency of intA () was varied from 0.2 to 0.5 h−1 (kflipB = 0.3 h−1), we observed a bias in the baseline population split when both inducers are introduced simultaneously, ∆t = 0 h (Fig 5A, N = 3,000). Previously in the preliminary model (Fig 3C), the two integrases were assigned equal flipping rates, and the population split was expected to be 50/50 for Sab/Sb. As the flipping rate of intA decreases relative to that of intB, that baseline shifts downwards to favor the more active integrase, intB. Varying the basal expression of intB () from 1% to 20% of the intB production rate () monotonically decreases the maximum Sab population fraction that can be reached in an Eab event (Fig 5B, N = 3,000). If there is a constant level of un‐induced intB, then there will always be a minimum population of Sb cells inhibiting the maximum fraction of Sab cells.
These simulation results showed that by varying and , we could tune the baseline shift at ∆t = 0 and the maximum Sab ceiling at high ∆t to better approximate our experimental system. However, experimental measurements of leaky integrase expression showed that background expression was actually quite low (1% for intA, 1–3% for intB) (Figs 4C and EV1, b only, a only). Given actual measurements for , we constrained those parameters and fit the model by varying .
In order to find the best pair of values for and , the flipping efficiency parameters for both integrases were varied from 0.1 to 0.6 h−1 in silico (N = 500 cell trajectories), creating a matrix of simulated Sab population fractions for each combination (Appendix Fig S13). Leaky basal expression of the integrases was held constant based on experimentally measured values ( = 1% of , = 2% of ), and experimental data were normalized to a 70% population maximum for fitting purposes. Mean squared error was found by comparing model fits with experimental data (Appendix Fig S13A), and the combination with the minimum MSE was chosen (Appendix Fig S13B).
Fig 5C shows ∆t versus Sab population simulation results for final revised parameters. The final parameters were set to be , , , and (Appendix Table S2). The introduction of leaky integrase expression into the model suggests that due to leaky expression of intB, around 2% of the population will “detect” Eab and be in state Sab even when no inducer a has been introduced. Additionally, preliminary simulation results suggest that the ∆t90 detection limit can be tuned by increasing or decreasing the overall production rate (*= A or B) (Appendix Fig S14), though this remains to be experimentally verified in future work.
In silico parameter space exploration shows that varying and parameters enables tuning of baseline ∆t = 0 h split for Eab/Eba and the maximum ceiling for Sab population fraction. Fold‐change variations in relative rates allowed us to understand overall trends in the final populations, and we adjusted the model to account for inequalities in integrase flipping and leaky basal expression. Since leaky expression was measured to be small, we primarily tuned flipping rates. This process led us to more relevant model‐informed predictions of experimental outcomes. With the refined model, we were interested to see whether distributions of the RFP‐expressing Sa state could provide information that measuring Sab fractions alone could not.
Deducing pulse width from Sa population fractions
Using the fraction of Sab (GFP) cells alone, we can determine ∆t values up to a ∆t90 limit for any given sequence of two step inputs. Now consider a pulse type of event, in which inducer a begins at time t = 0 h, remains constant throughout, and inducer b is introduced as a finite pulse at time t = ∆t h (Fig 6A). The start time of inducer a then becomes a reference for when the entire system is activated and ready to detect inducer b. Cell states are measured via flow cytometry at time tend, where tend > 24 h. Modeling results presented in this section are using the refined set of parameters defined in Fig 5C and Appendix Table S2.
If either of the two inducers is present in the media to some limit tend, we would expect all of the So cells to end up in one of two populations (Fig 6B). Cells that encounter inducer b first will be in the Sb state, while cells that encounter a first will either be in the Sa or Sab states. In the previous sections, once an inducer was added to the population, it was not removed, and the assumption was made that at times > 24 h, only a negligible number of So cells remained. This type of step function induction also meant that only the final number of Sab cells (GFP) was needed to uniquely determine the separation time ∆t because any and all cells that had switched to Sa would eventually become Sab.
However, in the case of a transient pulse, some cells that are in the Sa state (RFP) will not ever encounter inducer b. Assuming that is small, these cells will remain in the Sa state. Therefore, the population of a first cells equals Sa + Sab. We simulated a matrix of populations exposed to varying inducer separation times (∆t) and inducer b pulse widths (PWb) to measure the resolution of detectable events (Fig 6C). In simulation (Fig 6D), we can see that the two populations mirror each other to add up to 100% of the total cells (N = 3,000 cells, additional simulations in Appendix Fig S16).
Given that the step induction of b is equivalent to a pulse of infinite length (PWb = ∞) and our prior experimental evidence showed that virtually no cells remain in state Sa when PWb = ∞, we reasoned that the final number of Sa cells could be used to deduce information about the pulse width of b. This hypothesis was tested in silico by running a matrix of simulations with varying ∆t and PWb. In Fig 6E, we see that the fraction of Sa cells over the total number of cells decreases monotonically with increasing PWb, and the curves overlap regardless of ∆t. The overlap occurs despite nonzero leaky expression of intA and intB. The maximum number of Sa cells does not go to 1 at PWb = 0 h because of leaky intB expression ( = 0.02 ).
Analytically, we solved equation (1) for to ensure that the Sa population fraction is only dependent on PWb. If inducer a is used as a constant reference signal, all cells transition into one of Sa; Sb; or Sab states, thus . If we assume that the basal leaky expression of intB is zero ( = 0), holds for , since there is no intB to switch the DNA state into Sb or Sab. Thus, we can show that is dependent only on PWb, the duration of the pulse width of inducer B, for t > ∆t. This conclusion holds as long as is negligibly small compared to other kinetic constants (, , , , and ) (See Appendix Section 12.3 for full derivation).
If Sa population fractions can be modulated by changing PWb, then conversely, we should be able to use measured experimental RFP population fractions as a way to determine PWb. Once PWb is known, then the Sab fraction can be used to uniquely determine the time between inducers, ∆t (Fig 6F). Furthermore, the genetically encoded state means that these population fractions should be maintained and measurable at a time, tend, that is much later than the time of the events.
These conclusions can be extended in simulation to create a scatterplot of Sa cells versus Sab cells in a population (Fig 7A) over an 11 × 11 parameter matrix varying ∆t and PWb from 0 to 6 h in increments of 0.5 h (Additional plots in Appendix Fig S17). Each point on the chart in Fig 7A represents a simulated population (N = 3,000) exposed to a unique combination of ∆t and PWb values. Vertical lines represent the same PWb value, and points with the same shape and color have the same ∆t value. The simulation results suggest sufficient resolution of events as long as PWb and ∆t values are between 0 and 4 h. For any single value of PWb, we can follow the increasing ∆t values vertically and see that the population response saturates after 4.5 h resulting in overlapping between populations with 4.5 < ∆t < 6 h. We can trace any individual ∆t value horizontally from right to left and observe that the points begin to cluster and overlap when 4.5 < PWb < 6 h. These simulation data suggest that there should be some defined detection range of ∆t and PWb where every possible combination of the two is uniquely identifiable.
Source Data for Figure 7 [msb156663-sup-0005-SDataFig7BC.xlsx]
Experimentally, we tested a 7 × 7 matrix of varying ∆t and PWb (0–6 h, 1 h increments) on independent populations of the temporal logic gate E. coli strain (Fig 7B). All populations, except for the control, were exposed to inducer a (L‐ara 0.01%/vol) at time t0 to tend. Pulses of inducer b (aTc, 200 ng/ml) were achieved by sampling 5 μl of the population and diluting 1:100 into fresh media with only inducer a (M9CA + 0.01%/vol L‐ara). Populations were collected and measured via flow cytometry after 24 additional hours of growth in inducer a (~36 h after start of experiment) (Fig EV2). For all values of ∆t, the number of Sa cells (RFP) is highest when there is no exposure to inducer b (PWb = 0 h) and decreases monotonically as a function of PWb (Fig 7B, top). We see a more pronounced separation of the ∆t curves when we look at Sab (GFP) cell fractions (Fig 7B, bottom). The number of Sab cells is dependent on both ∆t and PWb and increases proportionally with both increasing b pulse duration and inducer separation time.
By counting population fractions of RFP versus GFP‐expressing cells, we can resolve the different populations that result from varying ∆t and PWb values (Fig 7C). As with Fig 7A, each point on the graph represents an independent population of cells (OD~0.7, ~106 cells counted per population). All of the populations exposed to either or both of the inducers occupy fractional coordinates that are unique from that of the no inducer controls (indicated by dotted circle). We see that if ∆t is constant and PWb increases (Fig 7C, right to left), then the Sa fraction decreases as Sab fractions increase. For constant PWb with increasing ∆t (Fig 7C, bottom to top), the Sa cell fraction remains mostly constant relative to increasing Sab. In the case where there is no b pulse (PWb = 0 h), we see maximum Sa (RFP) cell fractions of about 60% with minimal Sab populations that are about the same as no inducer Sab levels. Overall, populations with different PWb exposures are well separated by Sa (RFP) fraction up to 4 h. Even for PWb at 5 and 6 h, the populations have unique Sa/Sab coordinates, just not unique Sa fractional values.
This method of profiling is only valid if the fraction of Sa state cells can be used as a measure of PWb that is independent of ∆t. In previous experiments with step inputs (Fig EV1), there would be a significant population of cells with both GFP and RFP fluorescence, since they had transitioned to Sab but had not yet fully diluted out built up RFP protein levels from being in Sa for extended periods. If a significant percentage of the population remained in this transition state (Q2), that would make RFP an unreliable measure of Sa state cells. However, flow cytometry analysis of the pulse‐modulated populations (Fig EV2) showed that although there were some cells expressing both RFP and GFP (Q2), these cells were always < 3% of the total population. (Additional flow cytometry analysis can be found in Appendix Figs S18–S23.) Thus, RFP was measured to be a reliable determinant of Sa state cells, and subsequently, of PWb.
For any given PWb, we observed higher experimental Sa (RFP) population fractions with lower ∆t (Fig 7A top), resulting in a diagonal slant for each value of PWb (Fig 7C). Upon further investigation, we believe this is due to a slower transition than we anticipated. In our model, we assume is equal to , since both transitions are mediated by intB. However, the gradual decrease in Sa fractions with increasing ∆t for each value of PWb suggests that the α1 transition rate may be actually be slower than α2 or α3. Simulation results with adjusted transition rates (α1 < α2 = α3) recapitulated the slanting Sa population fractions (Appendix Fig S25). This inequality in transition rates could have arisen from differences in DNA sequence length or from differences in the DNA excision required for So transition to Sa instead of the recombination that occurs in the other transitions. Differences in DNA excision or recombination for a single integrase are important experimental parameters, but do not ultimately affect our conclusions about the overall system. Despite unequal intB transition rates, experimental implementation of the temporal logic gate still produces unique (Sa, Sab) fractional coordinates for each combination of ∆t, PWb, even though Sa values are not unique for higher PWb.
Model‐informed predictions on population fractions in response to pulses of inducer b led to experiments that could produce unique Sa and Sab coordinates for different combinations of ∆t and PWb. However, experimental data also revealed areas in which the model had been oversimplified. While it is important to have a model to understand overall properties and limitations of the experimental system, it is also impractical to design simulations that can account for all possible variations that might occur in the implementation of biological devices. Therefore, we believe that future workflows should also involve calibration protocols for specific applications of engineered biological populations.
Practical use and calibration of populations for event detection
Curve‐fitting methods were used to automatically convert experimentally measured RFP and GFP population fractions into PWb and ∆t values and to evaluate the resolution with which population ratios can be used to determine inducer separation time and pulse duration. Using the experimental data from Fig 7B and C, we generated fitting curves for PWb as a function of RFP population percentage (R) and for ∆t as a function of both GFP population percentage (G) and PWb (Appendix Figs S26 and S27, Appendix Table S3). We will denote these functions with PWb(R) and ∆t(G, PWb), respectively. The functions PWb(R) and ∆t(G, PWb) can then be used to generate a mesh of estimated PWb and ∆t values for any given normalized fluorescence values (Fig 8A, Appendix equations 8–11).
Source Data for Figure 8 [msb156663-sup-0006-SDataFig8.zip]
The estimated values were compared against the actual values to determine the approximate time window with which a specific PWb or ∆t value can be resolved. For each actual value of PWb and ∆t, we calculated the average and standard deviation for the set of estimated values. The standard deviation allows us to visualize the range for which the majority of predictions will fall for any given actual value. For instance, a PWb of 1 h can be detected ± 0.25 h, but as PWb increases, this prediction window widens and for PWb ≥ 3 h, the resolution of detection is closer to ± 1 h (Fig 8B). Similarly, predicted values of ∆t fall within ± 0.5 h for 0 < ∆t < 3 h and increase to ± 1 h when ∆t ≥ 3 h (Fig 8C). Using these fitting functions, we can also pre‐generate a reference table that converts RFP and GFP population fractions into predicted PWb and ∆t values (Appendix Table S4).
Engineered biological systems have inherent capabilities for replication, parallel processing, and energy efficiency. These advantages rely on the existence of bacteria not as single cells, but as populations. As the field moves forward with synthetic gene circuits, it is important to understand outcomes not just as single‐cell outputs but as overall population‐level distributions.
We have designed and implemented a temporal logic gate that takes advantage of the population dynamics to collectively sense and record sequences of transient chemical inputs. We show both that single cells independently sense and record events and that aggregate population fractions create unique outcomes that provide information not encoded in single cells. As with all engineered systems, proper calibration of these temporal logic gate populations will be required prior to deployment in the “field”. We envision a process similar to the one described in this report. First, experimental populations are exposed to a matrix of PWb and ∆t values. This will set the maximum and minimum RFP and GFP population fractions and provide necessary data for determining the ∆t90 limit and producing the fitting functions PWb(R) and ∆t(G, PWb). Once the fitting functions have been determined, values for PWb and ∆t for experimental samples can be estimated within ± 0.25 to 1 h of the actual values. A calibrated table could also be generated and used for as a reference for samples that have been exposed to unknown conditions.
The stochastic nature of molecular processes often presents a significant barrier to homogenous outputs from an engineered population of cells. This implementation of event detection via population fractions takes advantage of stochastic and heterogeneous individual responses to environmental conditions in order to map final population fractions back to unique sequences and durations of chemical events. The sensitivity of the system and the ∆t90 detection limit could potentially be modulated by increasing or decreasing protein production rates via tuning of plasmid copy numbers, signal concentration, or transcription/translation sequences. The use of digital cellular outputs combined with the analog population response creates event detection systems that are more robust to stochasticity and can be tuned more easily. We plan to explore these possibilities in future work.
As a proof‐of‐concept, we have used the common laboratory inducers L‐arabinose and aTc as inputs, but we anticipate that our temporal logic gate system could be modularly adapted to any pair of biosensors. In particular, we believe there are possibilities for detection of miRNAs and biofilm formation. Stable populations of microRNAs (miRNAs) circulating in the blood have generated a lot of interest as biomarkers for human health (Cortez et al, 2011). These short (~20–30nt) regulatory RNAs have been shown to have sequential tissue‐specific expression signatures that correlate with pregnancy, tumor formation, and other diseases (Gilad et al, 2008; Mitchell et al, 2008), and synthetic biology has developed many customizable RNA sensors (Friedland et al, 2009; Green et al, 2014). Detection of miRNAs would require implementation of the temporal logic gate in mammalian cells. Though recombinase‐based synthetic circuits have not been shown in mammalian cells, serine integrases have been used quite effectively in a wide variety of mammalian cell types, primarily for genome editing and integration (Keravala et al, 2006; Xu et al, 2013).
Another possible application would involve the detection of harmful biofilms. Biofilms are self‐assembling, highly structured, multi‐species consortia that develop in stages and have sophisticated networks of interaction and function (Stoodley et al, 2002; Flemming & Wingender, 2010; Elias & Banin, 2012). Unnatural biofilm development in environments such as industrial water sources or waste streams can be both harmful for both the natural environment and the industrial mechanisms. Detection of biomarkers for known strains of biofilm colonizers would provide early warning of changing ecosystems, and although we do not yet fully understand these networks, it is known that quorum sensing plays a critical role in the process. Quorum‐sensing molecules and receptors are available in the synthetic biology toolbox and so may provide an accessible way of detecting the sequential colonization of different microbes. Field deployment of engineered bacteria will likely involve transient signals, low‐nutrient environments, and possibly even other microbial competitors (i.e., soil, flowing rivers, the digestive tract). We used minimal media in this study to better approximate low‐nutrient environments, and anticipate further characterization in more customized “local” environments (i.e., gut model or air model or soil model) and with hardier microbial chassis.
Finally, this study focused on the population outputs as indicators of past events, but we believe that this temporal logic gate could be used to reliably differentiate a single strain into controlled sub‐populations via input pulse order, duration, and frequency. In recent years, it has been recognized that many natural systems modulate cellular behavior not only by changing the concentration of signaling molecules but also by regulating signal pulse frequency (Cai et al, 2008; Lin et al, 2015). If we consider the fluorescent proteins GFP and RFP in this circuit as simply placeholders for downstream genes, then this system could easily be applied as a top‐down population differentiator. By modulating the sequence of inputs, one could systematically predict and create mixed populations of genetically differentiated cells. This greatly expands our capability to design synthetic systems that have controllable distributions as outcomes, not just digital on/off phenotypes. Furthermore, we can then begin to develop frameworks for understanding the role of feedback and control theory in modulating these sub‐populations given different starting distributions or uneven growth rates due to resource limitations. As the scientific community turns toward further understanding of microbiomes and multi‐cellular consortia, engineered bacteria populations could be used not only as a tool for investigating the activities of natural communities but also as a way to build synthetic communities from the ground up.
Materials and Methods
Cell strains and plasmids
All plasmids used in this study were designed in Geneious 7.1 (Biomatters, Ltd.) and made using standard Gibson isothermal cloning techniques. Integrases Bxb1 and TP901‐1 are on a high‐copy plasmid (pVHed05, plasmid map in Appendix Fig S9) with a ColE1 origin of replication (original template from the dual recombinase controller (Bonnet et al, 2013), Addgene Plasmid 44456). Integrase A (Bxb1) is behind a Ptet promoter and integrase B (TP901‐1) is behind a PBAD promoter. The plasmid has been modified with an additional TetR gene. The temporal logic gate was integrated into the Phi80 site on the E. coli chromosome using CRIM integration (Haldimann & Wanner, 2001) and screened for single integrant colonies. The integration plasmid template and DH5α‐Z1 strain were generously provided by J. Bonnet and D. Endy and modified to contain the temporal logic gate (pVHed07, plasmid map in Appendix Fig S9). Additional DNA and oligonucleotides primers were ordered from Integrated DNA Technologies (IDT, Coralville, Iowa). A custom formulation of M9CA media was used for all experiments. The media contained 1× M9 salts (Teknova, M1906) augmented with 100 mM NH4CL, 2 mM MGSO4, 0.01% casamino acids, 0.15 μg/ml biotin, 1.5 μM thiamine, and 0.2% glycerol and then sterile‐filtered (0.2 μm).
The stochastic simulation algorithm by Gillespie (Gillespie, 1977) was implemented to generate the sample paths of individual cells using the Markov model (see Appendix Table S6 for the definitions of Markov transitions and transition rates). All simulation runs and their analyses were done with MATLAB (R2014b, The MathWorks, Inc.). Simulated populations were done with 3,000–5,000 individual cell trajectories. Source code for MATLAB simulations is available as Code EV1.
Prior to all experiments, cells were grown overnight from plate cultures in M9CA for 2 days, then diluted to OD 0.1 and recovered for 4–6 h at 37°C. L‐arabinose and anhydrous tetracycline (aTc) were used as inducers a and b, respectively. L‐ara was used a concentration of 0.01% by volume, and aTc was used a concentration of 200 ng/ml (450 nM). All media contained the antibiotics chloramphenicol (Sigma‐Aldrich, Inc (C0378); 50 μg/ml) and kanamycin (Sigma‐Aldrich, Inc (K1876); 30 μg/ml). All experiments were performed with the aid of timed liquid handling by a Hamilton STARlet Liquid Handling Robot (Hamilton Company).
For step function experiments, the cells were diluted to OD 0.06–0.1 into a 96‐well matriplate (Brooks Automation, Inc., MGB096‐1‐2‐LG‐L) with 500 μl total volume in M9CA. Cultures were incubated at 37°C in a BioTek Synergy H1F plate reader with linear shaking (1096 cycles per minute) (BioTek Instruments, Inc.), and inducers were added at appropriate time by the Hamilton robot. OD and fluorescence measurements (superfolder‐GFP ex488/em520, mKate2‐RFP ex580/em610) were taken by the BioTek every 10 min. Each experimental condition was done on the plate in triplicate.
For the pulse experiments, single 500‐μl cultures were grown at 37°C in the BioTek plate reader (linear shaking, 1096 cycles per minute) and inducers added at time ∆t by the Hamilton liquid handler. Pulses were achieved through dilution of the culture into fresh M9CA media containing 0.01% L‐arabinose. The Hamilton was programmed to sample 5 μl of the culture and dilute it into 500 μl of fresh M9CA + 0.01% L‐ara to achieve pulsatile exposure to aTc. This was done in three independent triplicates for each experimental condition. About 96‐well deep‐well plates containing the diluted cultures were then incubated at 37°C incubated for an additional 24 h (~36–40 h from start of experiment). Final end point populations were measured using the plate reader and also stored and further analyzed using flow cytometry.
Analysis of experimental data was done using custom MATLAB scripts. All depicted error bars are standard error of the mean. Fitting of curves was done in MATLAB.
Experimental cultures were spun down, washed, resuspended in sterile PBS with 15% glycerol, and stored at −80°C (Jahn et al, 2013). Cultures were then thawed on ice and diluted to 106 cells/ml in sterile PBS prior to running on the flow cytometer. Flow cytometry was done using a MACSQuant VYB (Miltenyi Biotec, Germany) at the Caltech flow cytometry core facility. Flow data analysis and gating was done with FlowJo version 10.0.8r1 (Flowjo, LLC, Ashland, OR). For inducer separation time experiments shown in Figs 4 and EV1, ~105 cells were measured per population. For pulse induction experiments shown in Figs 7 and EV2, ~106 cells were measured per population.
The authors would like to thank J. Bonnet and D. Endy for the initial plasmids used in this work, S. Sanchez for critical assistance in automation and liquid handling, D. Perez for flow cytometry assistance, and C. Hayes for discussions. V.H. is supported by the U.S. Department of Defense (DoD) through the National Defense Science & Engineering Graduate Fellowship (NDSEG) Program. Y.H. is supported by JSPS Fellowship for Research Abroad. Research supported in part by the Institute for Collaborative Biotechnologies through grant W911NF‐09‐0001 from the U.S. Army Research Office. The content of the information does not necessarily reflect the position or the policy of the Government, and no official endorsement should be inferred.
VH conceived of the circuit design, constructed the necessary experimental strains, performed experimental work and data analysis, ran model simulations, and wrote the manuscript. YH developed the stochastic model and derived the mathematical results. PWKR provided feedback and guidance on data analysis and interpretation. RMM provided feedback and guidance on overall project vision, circuit design, and interpretation of results.
Conflict of interest
The authors declare that they have no conflict of interest.
Expanded View Figures PDF [msb156663-sup-0002-EVFigs.pdf]
Code EV1 [msb156663-sup-0003-CodeEV1.zip]
Source Data for Expanded View and Appendix [msb156663-sup-0007-SDataEVAppendix.zip]
FundingU.S. Department of Defense (DoD) through the National Defense Science & Engineering Graduate Fellowship (NDSEG) Programhttp://dx.doi.org/10.13039/100000005
This is an open access article under the terms of the Creative Commons Attribution 4.0 License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.
- © 2016 The Authors. Published under the terms of the CC BY 4.0 license