The ErbB signaling pathways, which regulate diverse physiological responses such as cell survival, proliferation and motility, have been subjected to extensive molecular analysis. Nonetheless, it remains poorly understood how different ligands induce different responses and how this is affected by oncogenic mutations. To quantify signal flow through ErbB‐activated pathways we have constructed, trained and analyzed a mass action model of immediate‐early signaling involving ErbB1–4 receptors (EGFR, HER2/Neu2, ErbB3 and ErbB4), and the MAPK and PI3K/Akt cascades. We find that parameter sensitivity is strongly dependent on the feature (e.g. ERK or Akt activation) or condition (e.g. EGF or heregulin stimulation) under examination and that this context dependence is informative with respect to mechanisms of signal propagation. Modeling predicts log‐linear amplification so that significant ERK and Akt activation is observed at ligand concentrations far below the Kd for receptor binding. However, MAPK and Akt modules isolated from the ErbB model continue to exhibit switch‐like responses. Thus, key system‐wide features of ErbB signaling arise from nonlinear interaction among signaling elements, the properties of which appear quite different in context and in isolation.
The four transmembrane receptors of the ErbB family, of which the epidermal growth factor tyrosine kinase (EGFR or ErbB1) is the founding member, are widely expressed in human tissues and stimulate diverse cellular responses, such as proliferation, survival and motility. ErbB receptors have been studied extensively at a molecular level and several have been shown to play a central role in human cancer. ErbB2 (Her2), for example, is overexpressed in a subset of breast cancers and is the target of trastuzumab, an important anticancer drug. Signaling through ErbB receptors, and the downstream proteins they activate, is complex because ErbB1–4 combine to form both homo‐ and heterodimers having distinct affinities for 13 known ligands and intracellular adaptor proteins. Understanding and predicting how signaling varies with receptor mutation or overexpression is essentially impossible using informal reasoning and pictorial representations of signaling pathways.
To address this challenge, we have constructed, trained and analyzed a mass action model of immediate‐early signaling involving four ErbB receptors, as well as the downstream MAP kinase and PI3K–Akt signaling pathways involved in regulating cell proliferation (Figure 1). Our model aims to capture as much mechanistic information as possible, based on an extensive literature, while accurately representing the responses of cells to two major ligand subclasses (EGF and heregulin in multiple tumor cell types). In assembling such a model, we encounter a number of challenges, among which the processes of entraining models to experimental data are the most serious. Mass action models are governed by two kinds of parameters: the concentrations of individual protein species, and the rate constants governing protein–protein association and enzyme catalysis. Models of cell signaling that are realistic at a molecular level may inevitably contain many parameters that are unknown a priori. These parameters can be estimated either by direct measurement (usually in vitro, raising the question how values obtained in dilute solution should be translated to a crowded intracellular environment) or through an inverse process in which the model is fitted to data. The processes of training a kinetic model on experimental data constrain only a subset of parameters. It is assumed by some that the presence of parametric uncertainty in trained models makes them useless; in fact the aim of fitting is to determine only the subset of parameters that determine model performance with sufficient accuracy from which meaningful hypotheses can be drawn. Methods to accomplish this with cell signaling models are in their infancy, and most biochemical models are parameterized based on generic or theoretical assumptions; systematic or rigorous analysis of parameters variation and uncertainty has been restricted to small idealized models. In the current study, we use iterative fitting to generate families of model fits with similar biochemical connectivity, but different values for unconstrained parameters. We then attempt to draw well‐substantiated inferences from the families of models.
To ascertain which features of our ErbB model are conserved when parameterized by different sets of rate constants and concentrations, we calculated dynamic sensitivities of multiple observables; this serves to measure the ‘importance’ of each rate or species in affecting measurable outputs. Sensitivity analysis showed that despite degeneracy in parameter values among families of fits, many important features are conserved. However, the rank order of sensitive parameters is strongly influenced by the feature being examined. For example, the proteins and rate constants that influence ERK activation differed from those that influence Akt, and factors important at high ligand concentrations differ from those important at low concentrations. A picture of ErbB signaling emerges in which different physiological outputs can be mapped back to specific biochemical characteristics of signaling proteins under varying input conditions.
One striking aspect of this input–output behavior involves the relationship between ligand concentration (the input) and the activities of ERK or Akt (the output). A priori, we might assume little activation of the ErbB pathway when the concentration of ligand is below receptor Kd. However, experiments demonstrate significant Akt and ERK activation (∼20% maximal) and 100‐fold lower ligand. Moreover, as ligand levels rise we would expect the response to rise ∼9‐fold for every 81‐fold increase in ligand concentrations (representing standard binding thermodynamics). In contrast, we observe a log‐linear amplification relationship between ligand and output over a nearly 106 concentration range (Figure 7). To understand how this might arise, we have examined the performance of our parameterized ErbB model of and subsets of model comprising coupled reactions such as the MAP kinase cascade. We conclude that the unexpected log‐linear input–output behavior of the ErbB model arises from interactions between enzyme cascades that in isolation exhibit canonical Hill‐like behavior. Thus, we must now think carefully about the ways in which signaling modules interact so as to generate behavior that is quite different from what we would expect of the isolated molecules or small cascades.
In conclusion, we describe an approach in constructing and analyzing models of cell signaling pathways that can incorporate substantial molecular detail, while remaining sensitive to the inability of experiments to fully constrain parametric complexity. By focusing on well‐substantiated model features, we attempt to understand the origins of an unexpected and physiologically significant characteristic of the ErbB network with regard to dose–response behavior. Looking forward we anticipate building on the current model by including more accurate representations of more reactions, while developing a rigorous approach to generating model‐derived hypotheses, the degree of belief of which is determined by the underlying uncertainty in the model and the data. We will then be in a position to reliably understand cellular physiology in terms of molecular mechanism.
A complex mass‐action model of immediate‐early ErbB signaling in human cells yields new biochemical insight despite its non‐identifiability and resulting parametric uncertainty
Sensitivity analysis reveals a strong relationship between those parameters that are important for model dynamics and the specific physiological feature being analyzed: biochemical values that are highly significant for some responses are unimportant for others
Proteins downstream in the ErbB signaling cascade are activated to substantial levels at ligand concentrations several orders of magnitude below those that result in appreciable receptor‐ligand binding or receptor phosphorylation
The extreme sensitivity of ErbB receptors to low ligand concentrations arises from the ability of MAPK and PI3K‐Akt kinase cascades to amplify signals in a highly non‐linear manner; this behavior is lost when the cascades are isolated from the larger ErbB network, demonstrating context sensitivity in their operation
The ErbB1–4 receptor tyrosine kinases (RTKs) and the signaling pathways they activate, control cell division, motility and survival and are among the best studied of all signal‐transduction networks (Citri and Yarden, 2006). Hyperactive and aberrant ErbB signaling has been implicated in a wide variety of human cancers and is a frequent target of pharmaceutical intervention (Slamon, 1987; Rusch, 1993; Engelman, 2007). ErbB1–4 receptors bind as homo‐ and heterodimers to a family of 13 soluble and membrane‐bound ligands and activate multiple downstream signaling pathways, including the mitogenic Ras‐MAPK cascade and the pro‐survival PI3K/Akt pathway. Complexity arises in ErbB networks from hetero‐ and homo‐oligomerization among receptors having distinct ligand‐binding properties and different intracellular binding partners, and from the multiplicity of intracellular proteins that interact with these receptors. A wide variety of anti‐ErbB drugs are in use or development, including small‐molecule ATP competitors (e.g. gefitinib and erlotinib) and therapeutic antibodies (e.g. cetuximab and trastuzumab), but most drugs exhibit significant patient‐to‐patient variation in efficacy (Slamon, 2001; Paez, 2004). The causes of this variation are under active investigation (Ono, 2004; Carey, 2006; Mulloy, 2007) and appear to be multi‐factorial, even at the level of immediate‐early signaling (Jasper, in preparation). Quantitative, network‐level analysis is therefore needed to predict the consequences for signal transduction of changes in the activities and levels of multiple proteins. Toward this end, we have developed a computational model of the mammalian ErbB network that includes signaling from all four receptors and the ERK and Akt signal‐transduction cascades.
ErbB receptors are single‐pass type I transmembrane receptors with extracellular ligand‐binding domains, intracellular tyrosine kinase domains and cytoplasmic tails that act as signaling scaffolds. ErbB1 and ErbB4 are fully functional as ligand‐dependent tyrosine kinases, but ErbB2 does not bind any known ligand, functioning instead as a dimerization‐ready signal amplifier (Klapper, 1999). ErbB3 has a crippled kinase domain lacking catalytic activity (Guy, 1994) and therefore transduces signals only when phosphorylated by other ErbB receptors. The 13 known ErbB ligands can be divided into three groups: (i) those that bind specifically to ErbB1, such as epidermal growth factor (EGF), transforming growth factor alpha and amphiregulin, (ii) those that bind to both ErbB1 and ErbB4, including betacellulin, heparin‐binding EGF, epigen and epiregulin and (iii) the neuregulins (NRGs), which fall into a subgroup comprising NRG1 (also known as GGF2, SMDF or HRG) and NRG2 that binds to heterodimers containing ErbB3 or ErbB4 and a second subgroup comprising NRG3/NRG4 that binds ErbB4 homodimers (Figure 1; Supplementary Figure 1A). Concomitant with ligand binding, ErbB receptors dimerize and undergo trans‐phosphorylation on residues in their cytoplasmic tails thereby creating docking sites for SH2‐containing adapter molecules, such as Shc, Grb2, Sos and PI3K. ErbB1 has at least 20 sites of tyrosine phosphorylation on its cytoplasmic tail, 12 of which are thought to partner with SH2‐containing adapter proteins and enzymes (Schulze et al, 2005). Other ErbB receptors undergo equally extensive post‐translational modification and adapter binding. Receptor‐associated adapters activate Ras, ERK and Akt. Akt can also be activated in a Ras‐independent manner through the direct binding of PI3K–p85 to multiple sites on ErbB3. Key effectors of ERK and Akt include the Elk‐1, AP‐1 and NF‐κB transcription factors. In addition, ErbB receptors also activate protein kinase C, actin‐based morphogenic responses and Jak–STAT signaling, but we have not yet modeled these pathways. Finally, the ErbB response is silenced by internalization and degradation of receptors, dephosphorylation by phosphatases and negative feedback from active ERK and Akt.
The ‘hour‐glass’ shape of RTK signaling networks (Citri and Yarden, 2006), in which a broad array of ligands and receptors funnel into a limited number of core signaling cascades (MAPK, PI3K–Akt, Jak–STAT and so on) to control a diverse set of transcription factors, raises fundamental questions as to how signals are transduced from the cell surface to the nucleus without loss of information. This question is particularly acute in the case of ErbB networks, in which multiple ligands control a diversity of physiological responses in a cell‐specific manner, although nonetheless engaging a common set of signaling proteins. It seems probable that the ErbB network encodes information through changes in the stoichiometry and half‐lives of post‐translational modifications and the localization of interacting kinases, phosphatases and so on. To begin to understand these phenomena, we require a quantitative framework that encompasses the diversity of receptor hetero‐ and homo‐oligomerization and at least some of the complexity of upstream ligands and downstream signaling. We also require an effective means to incorporate experimental data into the model in a manner that reproduces ligand‐ and cell‐specific responses while taking into account uncertainty in the precise biochemistry of ErbB signaling and the inability of experiments to measure all relevant processes. Here, we present, as a substantial extension of our earlier work (Schoeberl et al, 2002), an ordinary differential equation (ODE) model of ERK and Akt regulation by two ErbB ligands and four ErbB receptors during the immediate‐early phase of ligand‐stimulated cell signaling. Relative to previously published models of ErbB signaling (Kholodenko et al, 1999; Hatakeyama, 2003; Resat et al, 2003; Hendriks, 2005; Sasagawa et al, 2005; Birtwistle et al, 2007), ours is more extensive in including all four receptors and two ligand classes, although nonetheless retaining the rigor of a mass action formulation based on elementary reactions. A recently published ErbB model by Birtwistle et al (2007) is also based on a kinetic representation of immediate‐early signaling (from 0 to 30 min rather than 0 to 120 min as in our model), but aggregates species to reduce complexity. In contrast, we rely on elementary reactions throughout, albeit at the cost of more species and parameters. We consider parametric uncertainty and model non‐identifiability explicitly and account for the fact that parameter sensitivity or robustness can only be interpreted in light of this uncertainty. Despite its non‐identifiability, our model predicts experimentally verifiable system‐wide features, such as variable amplification in receptor‐activated enzymes as the basis of a very broad range in dose responsiveness.
To construct a computational model of ErbB‐mediated signaling, we extended our previous model (Schoeberl et al, 2002), which contained only the ErbB1 receptor, using information from models of other mammalian signal‐transduction cascades and a large body of newly published data (Yarden and Sliwkowski, 2001; Citri and Yarden, 2006). Our primary goal was expanding from a single receptor to all four members of the ErbB family, thereby enabling analysis of receptor–receptor interaction and determinants of differential signaling by ligands such as EGF and HRG. The Immediate‐early ErbB Reaction Model (IERMv1.0) in this paper was implemented in a deterministic, continuum approximation as a network of ODEs with temporal coverage from 0 to 120 min after ligand addition.
Seven ErbB hetero and homodimers that have been described in the literature were included in our model: ErbB1/1, ErbB1/2, ErbB1/3, ErbB1/4, ErbB2/2, ErbB2/3 and ErbB2/4. The majority of these dimers are activated by ligand binding, but several arise through a process of ‘lateral signaling’ (or secondary dimerization) in which dimers phosphorylated in a ligand‐dependent manner dissociate into monomers that then reassociate with either phosphorylated or unphosphorylated monomers to create new homo‐ or heterodimers (Graus‐Porta et al, 1997); in this way, active ErbB2/2 can form even though it is does not bind ligand. Because it has no kinase activity and is presumably inactive (Guy, 1994), ErbB3/3 was omitted from the model. ErbB3/4 and ErbB4/4 were also omitted because the cell lines we studied express ErbB3 and ErbB4 at substantially lower concentrations than ErbB1 and ErbB2 (receptor numbers for A431, H1666 and H3255 lines are reported in Supplementary Figure 2B). Each receptor dimer spawns a cascade of downstream reactions and omitting ErbB3/3, ErbB3/4 and ErbB4/4 reduces the total number of species in the model by ∼250, making numerical integration and parameter estimation significantly easier.
Two critical decisions that arise during model design involve scope and level of detail. One point of view holds that models should include all known components and relevant phenomena so as to capture cellular biochemistry in as complete and realistic a manner as possible. However, as the number of proteins increases, so does the number of free parameters that govern their interaction, often in a nonlinear manner. A second point of view holds that models should be as simple as possible as long as they reproduce empirical findings; ideally, models should have fewer degrees of freedom than training data (Aldridge et al, 2006). However, strict application of this reasoning generates models in which biochemical processes are lumped together and the functions of individual proteins are difficult to discern. In common with previous studies (Kholodenko et al, 1999; Hatakeyama, 2003; Hendriks, 2005), we therefore chose to model ErbB signaling at an intermediate resolution in which key proteins were represented explicitly, but without accounting for all assembly intermediates. General cellular processes, such as protein degradation and endocytosis, were represented without molecular detail as simple first‐ or second‐order reactions, the rates of which were estimated by optimization (see below). With respect to downstream pathways, the model was restricted to ERK and Akt because, among the many cascades induced by ErbB ligands, both are known to be critical for mitogenesis and cell survival. Among the 13 known ErbB ligands, EGF and heregulin (HRG) were implemented as being representative of two major specificity classes: EGF binds only ErbB1‐containing dimers, and HRG binds only ErbB3‐ and ErbB4‐containing dimers (Berkers et al, 1991; Peles, 1992; Carraway et al, 1994). Modeling two ligands, seven receptor dimers and downstream ERK and Akt pathways involved 28 distinct proteins. Protein–protein interactions, post‐translational modifications and compartmentalization involving these 28 proteins generated a total of 471 additional species that participated in 828 reactions requiring 499 differential equations, 201 unique reaction rates and 28 non‐zero initial conditions (species, rates and reactions can be found in Supplementary Figure 2). Although variations in network topology were explored in the course of model assembly, only the final configuration is described below; it represents the most compact description of the full ErbB receptor layer. Future model refinement is expected to focus on ensemble approaches that attempt to resolve topological uncertainty in the Akt pathway (Kuepfer et al, 2007), inclusion of additional signaling cascades and more principled treatment of combinatorial complexity during receptor activation.
On the basis of the experiments in HeLa cells, it has been proposed that EGF‐induced expression of dual specificity phosphatases and other ErbB‐negative regulators plays an important role in shaping the dynamics of intracellular signaling starting ∼30 min after ligand addition (Amit, 2007). However, when A431 or H1666 cells, which are the focus of this study, were treated with cycloheximide (an inhibitor of protein synthesis) no significant difference in ERK and Akt dynamics was observed relative to cells not treated with cycloheximide (data not shown). Thus, we ignore the impact of ligand‐induced protein expression on the dynamics of immediate‐early signaling in the current work.
Mass action kinetics
Reactions in the ErbB model were formulated as elementary unimolecular or bimolecular processes according to the law of mass action, with r=kA for a unimolecular reaction involving protein A (where A indicates the total number of molecules of A per cell, r a rate and k a rate constant), r=kAB for a bimolecular reaction involving A and B. Hill functions and other higher order algebraic expressions were not used because they represent approximations to cascades of elementary reactions. Thus, cooperativity, nonlinear input–output behaviors and feedback arise in the model only from the interplay of simple reactions. Protein concentrations throughout the ErbB network were high (absolute protein numbers≫103 per cell), so deterministic approaches were used. We have not yet considered the possible involvement of slow reactions or small reaction compartments (<<100 molecules) for which stochastic simulation might be more suitable.
Biological and reaction compartments were implemented and both were assumed to be well mixed. The former included compartments for plasma and endosomal membranes, cytosol, nucleoplasm and lysosomal lumen. We also implemented clathrin‐mediated endocytosis as a second‐order reaction in clathrin and ErbB; this is obviously an extreme simplification of the actual biochemistry but does reflect the need for clathrin and the receptor to interact prior to vesicular uptake. Reaction compartments were implemented by representing a single‐gene product as multiple species each in its own well‐mixed pool and able to participate in its own set of reactions. This made it possible to model the actions of scaffolding and adapter proteins, the molecular details of which are unclear. Protein transport was modeled in a computationally tractable manner as movement of a species from one compartment to the next with first‐order kinetics (spatial gradients and partial differential equations were therefore avoided). In the current model, reaction compartments were used to encode cytosolic and membrane‐bound Ras and to represent protein phosphatase 2A (PP2A), an enzyme that dephosphorylates Raf, MEK and Akt in IERMv1.0 (Ugi et al, 2002). Although the model includes only one PP2A catalytic subunit, by implementing pools of PP2A in three separate compartments it was possible to account for differential phosphatase activity with respect to Raf, MEK and Akt. Differential activity presumably arises from the association of PP2A with different regulatory subunits (Silverstein et al, 2002), but we did not include any of these details.
Simplifying network structure
The process of multi‐site receptor phosphorylation and recruitment of SH2‐containing proteins to receptor tails generates a ‘combinatorial explosion’ in possible species and reactions (Hlavacek, 2003). If we assume that the 12 known SH2‐binding phospho‐sites on ErbB1 are independent of each other and exist in both phosphorylated and unphosphorylated states, then ErbB1 monomers have 212=4096 distinct states and dimers have (4096)2∼1.6 × 107 states (Schulze et al, 2005). Careful treatment of these phospho‐states is likely to be important, as ErbB1 binds ∼8 distinct SH2 proteins and in some cases, multiple SH2 domains compete for binding to a single tyrosine residue (Jones et al, 2006). However, systematic computational treatment of combinatorial complexity places severe demands on parameter optimization because repeated cycles of numerical integration are required (Blinov et al, 2004). For simplicity, we therefore used a two‐state approximation in which receptors were either fully dephosphorylated or fully phosphorylated. The two‐state assumption seems reasonable, given experimental data that five ErbB1 phosphotyrosine sites (Y845, Y992, Y1045, Y1068 and Y1173) for which assays could be established in A431 cells were phosphorylated and dephosphorylated with similar dynamics over a 120‐min period post‐EGF stimulation (data not shown). However, the assumption is likely to break down under some conditions (Schulze et al, 2005). Thus, our approach should be considered a first step toward a more complete treatment of receptor biochemistry as computational methods advance and additional experimental data are collected.
Determining parameter values
ODE models are composed of initial‐value differential equations involving two types of parameters that must be measured or estimated: initial species numbers (protein concentrations, expressed in molecules per cell) given by x0,i for the ith species, and rate constants (forward and reverse rate constants for complex formation and enzymatic rate constants for enzymes) given by kj for the jth rate. Prior to estimation, we specified parameter values for as many x0,i and kj as possible based on literature data (Table I). In addition, x0,i was measured for several key proteins in A431, H1666 and H3255 cell lines (ErbB1–4, Shc, MEK, ERK and Akt) by semiquantitative immunoblotting relative to recombinant standards; our measurements (e.g. ∼106 molecules of ErbB1 per A431 cell) were consistent with literature estimates when available. Receptor–ligand association constants for EGF and HRG were obtained from published cell surface‐binding assays or surface plasmon resonance experiments performed on purified receptor ectodomains (Berkers et al, 1991; Landgraf and Eisenberg, 2000; Stein et al, 2002). Rate constants for ErbB1/1 and ErbB1/2 dimerization were also obtained from the literature (Hendriks et al, 2003). Association rates for other receptor dimers were inferred from published data describing partnering preferences for ErbB receptors (Graus‐Porta et al, 1997). Experimentally determined rates of internalization were used for ErbB1 homodimers (Hendriks et al, 2003). Downregulation of ErbB3 and ErbB4 was not modeled explicitly, except for inclusion of a single tyrosine phosphatase of unspecified identity that acted on all four receptors. We observed that ErbB3 activation peaks and is substantially downregulated on the order of minutes (data not shown). Ubiquitin‐mediated protein degradation has been shown to regulate ErbB3 (Cao, 2007), but this process is on the order of several hours, which is too slow to account for the observed dynamics of ErbB3 phosphorylation. Further experimental analysis of ErbB3 and ErbB4 downregulation is therefore necessary to enable detailed modeling. The remaining class of rate constants covered the binding of cytoplasmic proteins to receptor tails and to partner proteins (e.g. PI3K to Gab1, MEK to ERK). Some of these rates have been measured experimentally for ErbB‐ or other RTK‐mediated signal‐transduction systems, including the forward and reverse rates for PI3K binding to the insulin receptor (Felder, 1993), and others have been estimated computationally under the assumption that translational and rotational diffusion‐limited kinetics prevail (Northrup and Erickson, 1992). On the basis of rates of related reactions and theoretical estimates, we assumed adapter binding to receptors to have nominal forward rates of 10−5/molecule/s and reverse rates of 10−3/s on a per cell basis, as first described by Kholodenko et al, (1999). However, Grb2–Sos dissociation rates were set at a value slightly lower than nominal adapter‐binding off‐rates to reflect the experimental observation that the two proteins are constitutively complexed (Sastry, 1995).
Parameters for which good experimental cell‐based estimates are available, including Kd values for EGF receptor and HRG receptor binding and the concentrations of experimentally measured species were set at their measured values. Other parameters were estimated by minimizing an objective function (see Materials and methods) comprising the normalized root mean square deviation (RMSD) between time course data and computed model trajectories. Training data comprised 120 measurements collected in triplicate for pErbB1, pAkt and pERK levels at 10 time points from 0 to 120 min following stimulation of A431, H1666 or H3255 cells with saturating (5 nM EGF or HRG) or near‐physiological (10 pM EGF or 100 pM HRG) ligand concentrations. Assays were performed with phospho‐specific antibodies for ErbB1‐Y1068, ERK1‐T202/Y204 (ERK2‐T185/Y187) and Akt‐S473 as the phosphorylation states of these kinases are known to be reasonable proxies for activity, as well as for Shc‐Y239/Y240, which is phosphorylated by ErbB1. A431 cells were chosen largely for their high expression of ErbB1, which ensures that receptor modification was easily measurable, and also because A431 cells have been the subject of many previous studies (Gadella and Jovin, 1995; Graus‐Porta et al, 1997; Jones et al, 2006). H1666 and H3255 cell lines, which carry wild‐type and mutant L858R ErbB1 receptors (Sordella et al, 2004; Tracy, 2004), respectively, were studied to ascertain whether the IERMv1.0 model could effectively capture differing signaling dynamics in other cell types.
On the basis of previous work with models of similar complexity (Singer et al, 2006), we expected IERMv1.0 to be non‐identifiable (Melke, 2006; Gonzalez, 2007) and the landscape of the objective function to contain multiple local minima. This precluded parameter optimization through local optimization methods such as Levenberg–Marquardt (Marquardt, 1963) or simple measures of convergence such as Fisher information (Rodriguez‐Fernandez et al, 2006) and instead required broad and repeated searches through parameter space for good fits. We therefore used simulated annealing (SA) to search across a region of parameter space spanning 2.5 log orders above and below a priori values (as described in Table I) (Kirkpatrick et al, 1983). By restricting ourselves to a subset of 75 rate constants and initial conditions with the greatest impact on the objective function (out of 229 total; as identified by sensitivity—see below), we substantially improved the convergence of parameter optimization. SA and similar methods have been applied widely to estimate parameters for biochemical reaction models (Gonzalez, 2007), but not, to our knowledge, with a set of ODEs as large as those in IERMv1.0. We tested the performance of SA using synthetic data comprising the same measurements and time points as real data but generated with a model fitted manually to A431 time course. Twenty rounds of SA were observed to return excellent fits of training data to simulation (R2>0.98; see Materials and methods and Supplementary Figure 3 for details), showing that SA is an effective means to optimize parameters in IERMv1.0. Even with synthetic data obtained from a model with known parameters, SA returned fits in different areas of parameter space that were associated with different values of the objective function. These differences arise when annealing does not converge rapidly, often because the landscape of the objective function is flat in multiple dimensions and lacks well‐defined minima (Brown and Sethna, 2003), but differences that are smaller than measurement error (which we estimate at ±10% in our experiments) are not of practical significance because they cannot be distinguished experimentally. We therefore considered all fits having an RMSD between simulated trajectories and experimental data of <0.1 as equally valid. Such fits could be recovered for actual data derived from multiple cell lines, despite substantial differences in pErbB1, pERK and pAkt dynamics (Figure 2A–C). However, SA with real data never returned the near‐perfect fits observed with synthetic data (Supplementary Figure 3), particularly in the case of low HRG concentrations, suggesting the presence of errors in model structure. We also found it difficult to train the model simultaneously to Akt dynamics at high and low EGF doses: experiments showed a 50‐fold increase in EGF concentration from 0.01 nM to 5 nM to have little effect on maximal Akt levels, but to induce different dynamics of downregulation (Figure 2A, A431). No fit has as yet reproduced this effect and further refinement of Akt biochemistry is probably necessary.
The number of identified good fits was constrained by computational demands; on average, finding one good fit required 100 annealing runs and 24 h on a 100‐node cluster computer. As the amount of training data increased, so did the number of constraints on the objective function, increasing the ruggedness of the landscape and the time needed to find minima. However, the challenge of parameter estimation using time course data is more fundamental than this. Sethna and others have shown the process to be ‘sloppy’ so that even ‘ideal and complete’ synthetic training data allow only a subset of the parameters in a complex biochemical model to be estimated (Brown and Sethna, 2003). Such parametric uncertainty is a reality for all complex biochemical models and remains an issue even when data are added and different fitting methods are implemented.
When different rounds of SA having similar final goodness of fit to data (that is, RMSD <0.1) were compared, some parameter values lay within ∼3‐fold of their mean values, whereas others took on the full range of values allowed in the search (Figure 2D; Supplementary Figure 4). This confirms our expectation that IERMv1.0 is non‐identifiable. Parameters with similar values across multiple fits are well constrained, whereas those with widely varying values are poorly constrained, but we have not yet recovered enough fits (at a cost of ∼2000 processor hours per fit) to compute uncertainties for individual parameters. Future implementation of refined sampling approaches such as Bayesian calibration and Markov chain Monte Carlo methods will be required (Vyshemirsky and Girolami, 2008).
Analyzing partially calibrated models
To ascertain whether biologically meaningful information can be derived from the IERMv1.0 model despite its non‐identifiability, we examined five good fits for features that seemed to be well constrained based on parameter sampling. For the sake of simplicity, we restrict this discussion to data obtained from A431 cells, but similar results were obtained for other cell lines. First, we sought to identify, by computing normalized sensitivities, those parameters having the greatest impact on biological outputs. The fully normalized sensitivity (sij(t)) of the ith observable ci(t) (such as pERK levels) with respect to a change in the jth rate constant (kj) or initial concentration (x0,j) is given by:
Sensitivities were converted into time‐independent quantities by integration:
where T is the final time point, and the absolute value of the integrand ensures that negative and positive sensitivities do not trivially cancel to zero under the integral. The quantity sij measures the fractional change in the ith observable upon a fractional change in the value of the jth parameter, normalized by 1/T so as to obtain a time‐averaged value. Outputs of interest (ci(t)) include the trajectories of phosphorylated ErbB1, ERK and Akt over time. These were, in general, aggregates of multiple model species. For example, total pERK levels are represented in IERMv1.0 by ∼30 differentially bound and localized species. When local sensitivity analysis for pErbB1, pERK and pAkt was performed on the five best fits to data from A431 cells, sij ranged from 0 to ∼0.8, depending on the parameter. By plotting all pairs of sij values for all pairs of fits, correlations of R=0.69 to 0.93 were observed (where R is the correlation coefficient; Figure 3). Correlation of less than 1.0 is expected, as sensitivity is a local property dependent on actual position in parameter space, which varies from fit to fit, but the mean value of R=0.84 implies significant similarity in sensitive parameters across fits. We explored this issue further using a more robust but less familiar measure of ‘regional’ sensitivity:
where 〈…〉AVG indicates an average over parameter values lying within 102.5‐fold above and below the parameter values of the fit. This method of calculating average local sensitivity has been discussed previously (Cukier et al, 1973; Bentele, 2004; Mahdavi, 2007) and has two advantages over simple sensitivity: (i) it accounts for the possibility that a fit does not precisely hit a nearby minimum due to problems with convergence during SA and (ii) it reduces the impact on sensitivity of small but steep irregularities in the landscape of the objective function. Because averaging over the entire region of parameter space was too costly computationally, we approximated the average by randomly sampling for parameter space in the vicinity of the fit. Sampling revealed a power‐law relationship that reached convergence at ∼1000 rounds (Figure 4A); we therefore performed regional sampling at 1000 points around the nominal fit. Regional sensitivity of five A431 model fits showed substantial commonality in the rank order: 8 of the 10 most sensitive parameters for EGF‐ or HRG‐stimulated pERK were shared across five fits as were 7 of 10 sensitive parameters for pAkt dynamics across three models (Figure 4). Thus, even a partially calibrated and non‐identifiable model yields meaningful information on parameter sensitivities, probably because the most sensitive parameters are among the best constrained. In the following, we concentrated on a single good fit to A431 data that had the lowest value of the objective function after calibration.
Time, stimulus and signal dependence of sensitivity analysis
Sensitivities are always calculated for a particular model feature (e.g. time‐integrated pERK or pAkt values) under a particular condition (e.g. stimulation with EGF or HRG). Correlation plots were used to determine how the identities and rank order of parameter sensitivities depended on the feature and condition (Figure 5). Most sij fell close to the origin, demonstrating that only a few parameters impacted each feature, but sensitive parameters exhibited significant differences from one feature to the next. For example, sensitive parameters for pERK activation by EGF or HRG stimulation were largely shared (Figure 5A). In contrast, when factors controlling Akt activation by EGF and HRG were compared, ∼50% of the sensitive parameters lay well off the diagonal, demonstrating independent control of Akt by the two ligands (Figure 5B). Clustering of sensitive parameters away from the diagonal was even more pronounced when pERK and pAkt were compared (Figure 5C and D). Finally, differences in the identities of highly sensitive parameters were observed for high and low ligand concentrations and early and late time points (Supplementary Figure 5). To confirm these findings experimentally, we selected a non‐obvious differential sensitivity for analysis. Modeling suggested that pAkt levels were more sensitive to changes in ErbB1 activity than those of pERK (Figure 5C). We therefore measured the dose‐dependent inhibition of pERK and pAkt, in EGF‐stimulated A431 cells by the ErbB1‐specific inhibitor gefitnib and the dual‐specific ErbB1–ErbB2 inhibitor lapatinib. When sensitivity to ErbB1 activity was defined as the concentration of drug that inhibited the downstream signal by 50% (IC50), we observed Akt phosphorylation to be significantly more sensitive to gefitinib (IC50∼5 × 10−8 M) than ERK phosphorylation (IC50∼5 × 10−7 M) (Figure 6A). A similar 10‐fold difference was observed with lapatinib (Figure 6B), confirming our prediction from sensitivity analysis. We therefore conclude that parameter sensitivity in the ErbB model is critically dependent on the choice of target observable ci(t).
The target and context dependence of parameter sensitivity is fully consistent with elementary dynamical systems theory, but our work shows that it is also true for physiologically important outputs and realistic parameter sets in biochemical models. This fact is often ignored in discussions of robustness, in which it is claimed that parametric insensitivity is inherently physiologically meaningful (Morohashi, 2002). Because the parameters sensitive for different observables or conditions overlap only partly, the fraction of parameters that are sensitive (42 out of 229) is significantly greater than what might be assumed from examining only one observable or condition (for which 6–10 sensitive parameters is typical). For example, although pAkt dynamics in EGF‐stimulated cells are insensitive to changes in the initial concentration of the GTPase Ras (Figure 5B), this does not mean that the ErbB pathway is robust to changes in Ras levels: pERK dynamics are critically dependent on this parameter (Figure 5A, yellow dots).
With respect to the utility of sensitivity analysis on a partially constrained model, we note that considerable biological insight can be obtained by comparing sensitive parameters for different ci(t). For pERK dynamics in EGF‐ or HRG‐stimulated cells, the three sets of sensitive parameters were Ras levels, on‐ and off‐rate constants for association of MEK and Raf, and for association of MAPK pathway phosphatases and their targets (Figure 5A). This makes sense biologically because both EGF and HRG activate ERK through Ras and Raf. In contrast, the differential sensitivity of pERK to ErbB1 initial concentrations in EGF‐stimulated cells reflects the key role played by ErbB1/1 dimers in EGF binding, whereas the sensitivity of pERK in HRG‐stimulated cells to ErbB2 and ErbB3 levels reflects their roles in HRG binding. The number of ligand‐specific sensitive parameters for pAkt dynamics was particularly high, revealing non‐overlapping mechanisms of Akt activation by EGF and HRG. For example, for HRG but not EGF stimulation, on‐rates of binding of PI3K–p85 to ErbB3 receptor were critical. Intriguingly, PIP2 levels, and rate constants for PIP3–Akt and PIP3–PTEN interaction appeared to play particularly important roles in pAkt dynamics regardless of ligand, consistent with frequent mutation of PTEN and regulators of inositol phosphates in human cancer.
The set of sensitive parameters common to both EGF‐mediated pERK and pAkt induction was limited to GAP and ErbB1 initial concentrations (Figure 5C). This is reasonable given that both ERK and Akt depend on ErbB1 and Ras for activation. In contrast, protein phosphatases appeared as significant off‐diagonal factors, implying differential control of the two kinases. The calibrated vmax (calculated using the expression vmax=kcatA, where A is the concentration of enzyme A) for the PP2A compartment that targets pRaf and pMEK was 1–3 orders of magnitude greater than for the compartment targeting pAkt (data not shown). Differential sensitivity extended to all three PP2A reaction compartments, supporting the hypothesis that PP2A functions quite differently on pERK and pAkt. As a computational test of this prediction, we implemented an alternative model having only one compartment for PP2A. In this model, pERK dynamics deviated substantially from experimental values under conditions of HRG stimulation (Supplementary Figure 6, dotted line). However, such a comparison ignores the possibility that a single‐compartment model with an optimized set of parameters might perform as well as the three‐compartment model. When we refitted the single‐compartment model using data from A431 cells and SA, we observed substantial improvement in goodness of fit. Nonetheless, significant differences between single‐ and three‐compartment models were obtained across multiple SA runs: IERMv1.0 correctly predicted that pERK levels would peak at 5 min after HRG stimulation before gradually declining (Supplementary Figure 6, solid line), whereas no single‐compartment model returned a fit with correct timing (typically, the pERK peak was delayed ∼15 min in simulation relative to experiment; Supplementary Figure 6, dashed line). The inability of SA to identify good fits to data for the single‐compartment model does not constitute proof that PP2A exists in multiple distinct states, but the negative result is consistent with the hypotheses that dephosphorylation of Raf, MEK and Akt occurs at different rates, presumably through different PP2A‐containing complexes. Moreover, our findings are consistent with previous analysis suggesting that phosphatases play a particularly critical role in shaping RTK signaling dynamics (Heinrich et al, 2002). From a methodological perspective, our findings also highlight the need to re‐optimize parameters when comparing models that differ in structure.
Network‐level predictions and their experimental confirmation
An important measure of the value of a kinetic model is its ability to generate testable predictions that provide new insight into complex biochemical processes. Experimental validation of pAkt sensitivity to ErbB1 inhibition relative to pERK is one example (Figure 6). The trained IERMv1.0 model also predicted correctly the dose–response of proteins such as pShc (Figure 7A), but this is not a stringent test of performance as pShc levels closely followed those of pErbB1. We therefore turned to predicting and understanding the dose–response behavior of the ErbB network, a fundamental measure of biological activity that has been examined in detail by others (Goldbeter and Koshland, 1984; Huang and Ferrell, 1996; Ferrell and Machleder, 1998;) Two measures of dose responsiveness are the ligand concentration for half maximal activation (the EC50) and the steepness of the transition from on to off, which can be described by an apparent Hill coefficient (Happ). Happ is calculated by a nonlinear least‐squares fit of dose–response data to the Hill equation:
where pSignal is the normalized magnitude of signal (e.g. ErbB1 phosphorylation), x is the concentration of input (e.g. EGF), the free parameters Happ, k are the apparent Hill coefficient and half‐maximal concentration of x. A sigmoidal dose–response curve for a second‐order ligand–receptor interaction has k=Kd, Happ=1.0 such that the levels of receptor–ligand complex rise from 10 to 90% of their maximal value over an 81‐fold change in ligand concentration. A third useful measure of a signaling cascade is the extent of amplification and attenuation at successive steps:
where pSignaln refers to the activity or state of modification of protein n in a signaling cascade. Because all signals are dynamic, EC50 and Happ are functions of time, which we set at t=5 min (the peak for transient ERK and Akt responses).
Simulation across multiple fits showed Happ for pERK and pAkt as a function of EGF (equation (4)) to be ∼0.30 and, unexpectedly, that both kinases would be activated to significant levels in A431 cells at ligand concentrations well below the lowest reported Kd for EGF–ErbB1 association (∼0.1 nM; Carraway et al, 1994). Experiments confirmed this prediction: although pErbB1 and pShc levels rose to measurable levels only above 10−9 M EGF, pERK and pAKT were ∼20% maximal at EGF concentrations as low as ∼10−12 M (Figure 7A). Thus, the relationship between ligand concentration and ERK and Akt activities was nearly log‐linear over a 106 range of ligand (between 10−14 and 10−8 M EGF; R2∼0.98 for a log‐linear fit). Modeling suggests that this linearity arises because the ErbB signaling cascade downstream of pShc acts as a dose‐dependent amplifier with a peak at ∼10−12 M EGF and Ψ (EGF, t=5 min) varying between 0 and 30 over the range 10−12–10−6 M EGF. The prediction of dose‐dependence in Ψ was confirmed experimentally (Figure 7B; Supplementary Figure 7B and C). These observations contrast with experimental and theoretical analysis of MAPK signaling in progesterone‐stimulated Xenopus oocyctes by Huang and Ferrell (1996) showing dose–response curves for MEK and ERK activation to be nearly switch‐like with Happ=1.7 for MEK and Happ=4.9 for ERK. In EGF signaling, we observe exactly the opposite behavior: an input–output curve that was much more gradual than expected for normal second‐order ligand–receptor binding.
To identify determinants of dose responsiveness, we examined the input–output behavior of a MAPK module isolated from IERMv1.0. Although the best IERMv1.0 fit predicted a shallow dose–response curve consistent with experimental data (Figure 8A), the behavior of the isolated MAPK module was consistent with results from Huang and Ferrell in exhibiting a steep dependence of pMEK and pERK on Raf activity (Happ=4.6 and 6.3, respectively) (Figure 8B). This difference can be rationalized by the fact that in the full model (and presumably also in cells) the protein cascade leading to ERK and MEK activation is driven by a Ras‐dependent input, the strength of which varies nonlinearly with time and ligand dose and the cascade is therefore not at equilibrium. In the isolated MAP kinase module, ERK and MEK are activated by a Ras input of constant magnitude. This observation underscores the importance of considering signaling modules in the context of the larger networks in which they are embedded. We therefore undertook a general search for variables that might control pERK dose responsiveness in IERMv1.0. Starting with an A431 model fit having Happ=2.9 (for pERK), we used Monte Carlo sampling to explore single moves, or a succession of ∼105 moves across a landscape of the 75 sensitive parameters optimized during model fitting. Among all possible single moves, none decreased Happ by greater than 10%, but searches involving a succession of changes were repeatedly successful in reducing Happ by 1.5‐ to 2‐fold. Remarkably, in any Monte Carlo sampling run involving 105 moves, changes of only a few parameters were responsible for almost all of the change in Happ. This implies initial diffusive movement across a region of parameter space having nearly constant Happ followed by a sudden transition to a new region of space having a substantially lower Happ value (Figure 8C). The amount of Ras (c26), Raf–MEK association rate (k44) and MEK phosphatase rate (c53) were the parameters, the alteration of which was most frequently responsible for transitions from high to low values of Happ (Figure 8D). Moreover, when the impact of changes in c26 in IERMv1.0 were examined in greater detail, high Ras concentrations (105 molecules per cell) resulted in a steeper dose–response curve (Happ∼1.77), whereas low Ras concentrations (103 molecules per cell) resulted in a flatter curve (Happ∼0.96) and a higher EC50 for EGF (Figure 8E). Importantly, when isolated MAPK modules having high and low c26 were compared, Happ of ERK was unchanged at ∼4 in both (Figure 8F). Thus, the impact of c26 on MAP signaling is restricted to settings in which the module is embedded in a larger signaling network. Similar results were obtained with Raf–MEK association rates (k44) (Figure 8G and H). From these data, we conclude that the input–output behavior of the MAPK cascade is strongly dependent on the context in which it is found: varying c26 and k44 in the context of the full IERMv1.0 changes EC50 and Happ, but varying the same parameters in an isolated cascade has virtually no effect.
Parametric sensitivity in input–output responses is something we might anticipate from the properties of electronic filters and amplifiers, but it has not previously been explored in cell signaling. The preceding analysis implies, however, that we should observe considerable diversity from one cell type to the next with respect to the EC50 and Happ for Akt and ERK. In four cancer cell lines (AU565, HBL100, MCF7 and T47D), pErbB1 activation exhibited a narrow range of EC50 values (10−8–10−9 M EGF) and Happ∼1.0. In contrast, Happ for pERK and pAkt was much more variable, ranging from 0.58 to 1.61 (Figure 9A) and EC50 values were spread over 10−9–10−12 M EGF (Figure 9B). These data are consistent with the biochemistry of the ErbB signaling network. Activation of ErbB1, which is most upstream and governed only by the kinetics and thermodynamics of EGF–receptor interaction and the biochemistry of the kinase domain, is expected to be similar across cell lines. In contrast, signals further down the cascade are modulated by many upstream proteins, many of whose concentrations and rate constants impact EC50 and Happ. We therefore conclude that the diversity of dose–responses that can be generated by simulation (using different sets of parameter values) also occurs in real cells. We speculate that this behavior confers great adaptability on ErbB signaling and helps to explain its importance in many aspects of cell physiology.
In this paper, we describe the construction of an ODE‐based model of immediate‐early signaling involving all four ErbB receptors, analysis of parameter sensitivity and uncertainty and exploration of factors controlling overall input–output behavior. The IERMv1.0 represents ErbB receptors and downstream pathways at an intermediate level of detail in which interactions among elementary species are encoded in a continuum mass action approximation by first‐ and second‐order elementary reactions. Our model has the merit of explicitly capturing interactions among multiple ErbB receptors and ligands, as well as downstream events such as Ras activation and MAPK and PI3K–Akt signaling, without use of complex functions, quasi‐static state Michaelis–Menten approximations and aggregation of biochemically distinct species. Complex behaviors arise in the model from interactions among simple elementary reactions, making it possible to explore (through sensitivity analysis) the functions of individual species on input–output behavior and to predict the effects of RNAi, small‐molecule drugs and mutations. We explore the first topic in this paper and the others in a forthcoming paper on the determinants of cellular sensitivity to anti‐ErbB1 drugs (Jasper, in preparation). For simplicity, the current model omits several signaling pathways activated by ErbB receptors and lacks a rigorous treatment of the web of reaction intermediates arising from protein assembly (e.g. SH2‐containing proteins). In future work, it should be possible to extend IERMv1.0 to include additional downstream signaling pathways and upstream ligands. However, thorough treatment of interactions among 13 ligands, 4 ErbB receptors and 8–12 adapters will require both conceptual and technical advances in modeling reaction networks (Hlavacek, 2003; Danos et al, 2007).
Our representation of immediate‐early ErbB signaling involves a large number of free parameters, but with respect to scope and level of detail IERMv1.0 is not atypical of other recent efforts to model cell signaling (Kholodenko et al, 1999; Hatakeyama, 2003; Hendriks, 2005). Sensitivity analysis shows that the behavior of IERMv1.0 is dependent on the values of many rate constants and initial protein concentrations that have not been measured directly. Moreover, even when in vitro biochemical data are available, their relevance to the crowded environment of a cell is unclear (Schnell and Turner, 2004). Thus, many significant parameters must be estimated by fitting; biochemical data on isolated proteins are most useful, in this context, as a means to initiate or constrain the search (e.g. Table I). Over a series of ∼2000 independent parameter optimization runs involving searches across ∼106 parameter sets, we found only ∼100 (0.01%) that had a good fit to data (RMSD<0.1), implying that the search landscape is very rugged and deep minima are infrequent, Moreover, among equally good fits (those that varied by less than experimental error), only a subset of parameter values were constant, whereas others varied over the entire allowable range. Thus, IERMv1.0 is non‐identifiable and we presume that large regions of parameter space are consistent with experimentally observed dynamics. One interpretation of this finding is that biological networks are so robust to changes in rate constants and protein concentrations (von Dassow et al, 2000) that parametric uncertainty is not a significant issue: key physiological behaviors, in this view, are determined primarily by pathway structure. The other extreme view is that complex models with unconstrained parameters have little practical value, as any behavior can be achieved with suitable adjustment of parameters. We have demonstrated that neither of these extremes is true: physiologically important aspects of ErbB signaling such as the dose responsiveness are indeed determined by the values of specific parameters, but useful and accurate predictions can be made using partially constrained models.
Model identifiability and parametric uncertainty
Model identifiability is a subtle issue for which a wide variety of analytical and numerical methods have been developed. Nonetheless, most kinetic models of biological pathways are presented as though they have a single parameter solution set. In the area of biochemical or pathway models, published identifiability analysis has focused on analytic approaches to small idealized models (Margaria et al, 2001; White, 2001). Parameter estimation for large models such as IERMv1.0 involves numerical and Monte Carlo approaches such as SA (van Riel, 2006; Gonzalez, 2007). In general, parameter estimation has four possible outcomes (White, 2001): (i) a single, unique, parameter solution set, making the model fully identifiable, (ii) a countable number of parameter sets, making the model ‘locally identifiable’, (iii) an infinite number of solution sets, making the model non‐identifiable and (iv) no solution sets, in which case the model probably has structural flaws. Parameter optimization with IERMv1.0 conforms to possibility (iii).
Non‐identifiability of our model arises in part because the quantity, quality and type of data are inadequate and relationships among parameters and experimental observables are inherently non‐unique (Hengl et al, 2007). Why then, do we not simply double or triple the amount of data used for model training? Unfortunately, it has been shown that even a complete set of time‐resolved measurements of all species in a complex reaction model is insufficient to fully constrain all parameters due to the ‘sloppy’ nature of the fitting procedure (Gutenkunst, 2007). It is likely that experiments can be designed to mitigate some of this sloppiness, but the necessary design and fitting methods have not yet been developed. Thus, non‐identifiability is likely to remain an issue for all complex biochemical models. Methods such as ensemble modeling and parameter sampling have been developed to cope with non‐identifiability in fields as diverse as climate and nuclear reaction modeling (Christie et al, 2005). Means to sample parameters have also been applied to models of Drosophila melanogaster segment polarity (Ingolia, 2004) and mTOR signaling (Kuepfer et al, 2007). However, the issue of identifiability is widely ignored in biological modeling. It would be valuable to know which parameters are well constrained, which co‐vary and which are truly unconstrained. The width of the distribution in parameter values across repeated rounds of SA serves as a proxy for this parametric uncertainty (Figure 5E), but the small number of good fits available for the ErbB model prevents our deriving reliable estimates for individual parameters. Nonetheless, it is clear that some parameters are much better constrained than others (e.g. phosphatase levels were tightly distributed across multiple parameter optimization runs but many receptor dimerization rates assumed a wide range of values). In principle, it is legitimate to draw conclusions based on well‐constrained parameters despite the presence of other unconstrained variables. We approximate this ideal by comparing features across multiple fits. We anticipate that future implementation of SA, genetic and hybrid algorithms that take advantage of modularity or branch‐and‐bound (Singer et al, 2006) will improve our understanding of parameter space and thus of parametric uncertainty. It will then be possible to generate predictions from models with quantifiable confidence or significance.
Parameter sensitivity and robustness
Sensitivity analysis is a straightforward and informative means to determine which features of a model have the greatest impact on a particular output or model feature. When we evaluated parameter sensitivity in regions of parameter space centered on independent fits (Cukier et al, 1973), we observed the rank order of the most sensitive parameters to be relatively constant and thus, relatively unaffected by parametric uncertainty. However, we also found the identities of the most sensitive parameters to be highly dependent on the output being evaluated. Thus, sensitive parameters for ERK versus Akt, or high versus low ligand concentration or HRG versus EGF are frequently distinct. The union of sensitive parameters across all features and conditions encompasses a much larger fraction of the total parameters in the IERMv1.0 model than sensitivity to a single feature. This conclusion contradicts the oft‐cited statement that biological pathways are insensitive to variation in parameter values, but is congruent with experience in other fields of dynamical systems theory. Moreover, it emphasizes the fallacy of simply equating ‘robustness’ of a network with parametric insensitivity. Response duration might be insensitive to changes in parameter P but dose–response relationships might be highly sensitive; as we cannot a priori claim that only the former is physiologically important, we cannot state that the network is robust to variation in P. Where it possible to enumerate all physiologically significant features of the ErbB pathway in development or normalcy versus disease, we speculate that the majority of parameters would prove sensitive under one condition or another. Thus, analysis of robustness is interpretable only with respect to specific physiological behaviors. Behaviors that are common to many cell types and to variation in initial conditions hint at a much more interesting type of robustness in ‘design’ that we have yet to understand.
Input–output responses of the ErbB network as revealed by modeling and experiment
To ascertain whether IERMv1.0 can predict significant network features not included in the training data, we examined input–output relationships for EGF and pERK or pAkt. Detailed analysis of progesterone‐mediated ERK activation in Xenopus oocytes (Huang and Ferrell, 1996) has revealed ultrasensitive behavior (Goldbeter and Koshland, 1984) and a high Hill coefficient arising from double phosphorylation of ERK by MEK and from positive feedback (Huang and Ferrell, 1996; Brown et al, 1997; Ferrell and Machleder, 1998). We were therefore surprised to find that, over a 106 range in EGF concentrations, our model predicted low Hill coefficients (Happ=0.32) and experiments confirmed this prediction. As a consequence, ERK was ∼20% as highly phosphorylated at 10−12 M EGF as at ∼10−7 M (a saturating level). Amplification of signal at low doses of EGF (as implied by Happ<1) may help to explain why many ErbB‐targeted drugs inhibit receptors at significantly lower drug concentrations than they inhibit downstream kinases: a small number of active receptors are sufficient to drive a substantial downstream signal.
The multivariate nature of control over Happ in IERMv1.0 makes it challenging to understand mechanistically, but it is clear that multiple parameter values, particularly those affecting the coupling of Ras, Raf and MEK play a major role in setting dose responsiveness of a receptor‐driven MAPK cascade. It should be noted, however, that dose responsiveness is more complex than we have implied hitherto, as it has an important temporal component; Raf activation dynamics, for example, are predicted to change dramatically as EGF concentrations increases (a prediction we have been unable to test due to poor antibody specificity; Supplementary Figure 7D). Further study of these phenomena will require the development of suitable descriptors of time‐varying transfer properties in cell signaling cascades. Moreover, it will undoubtedly prove interesting to compare our data‐driven approach with recent theoretical work on network biology (Saez‐Rodriguez et al, 2004; Del Vecchio et al, 2008). The clear suggestion is that the coupling of modules under particular conditions (high retroactivity or mismatched impedance at the connections) results in networks, the behavior of which cannot be easily predicted from analysis of the modules in isolation.
Summary and future prospects
Two aspirations motivate the construction and analysis of complex kinetic models: elucidating the roles of individual proteins at the level of specific biochemical reactions and determining how sets of proteins work together to create modules with discrete physiological functions. We have shown progress in both of these areas for a biochemically realistic model of immediate‐early ErbB signaling trained against dynamic data. The strong contextual dependency of parameter sensitivity emphasizes that protein activity is meaningful only with respect to a particular physiological function: many parameters that are insensitive under one set of physiologically reasonable conditions are highly sensitive under another. This is also true at the level of network modules. Although the Raf–MEK–Erk signaling cascade has a switch‐like input–output behavior in isolation, within the context of the ErbB pathway the apparent Hill coefficient is ∼10‐fold lower and the response is log‐linear. Thus, we cannot generalize from the behavior of a signaling module in isolation to its behavior in a biological network. These are valuable insights with respect to our initial aspirations, but emphasize the importance of developing methods to estimate parameter values and handle parametric uncertainty through optimal experimental design and Bayesian estimation. The challenge is now to implement these concepts in a practical manner with a complex biochemical model.
Materials and methods
The ErbB model was implemented in two software packages: MATLAB and Jacobian. In MATLAB (MathWorks, Natick, MA), the model was constructed using the Simbiology Tool Box, comprising 828 reactions, 499 species and 201 parameters. The model was also reimplemented in Jacobian, a reaction‐engineering program from Numerica (Cambridge, MA). The numerical integrator in Jacobian is optimized for sparse systems such as the ErbB model. The model is available in five formats, (i) SBML, (ii) MATLAB Simbiology object, (iii) a set of Jacobian scripts, (iv) a set of text files of reactions, species and parameters or (v) as a set of text files of ODE equations, initial conditions and parameters (http://www.cdpcenter.org/data/Chen_2008, model files). Model dynamics on MATLAB and Jacobian platforms were shown to be consistent.
Local sensitivity analysis was performed with Jacobian. Sensitivities were fully normalized as in equation (1). Regional sensitivity, defined as an ‘average’ sensitivity of a parameter around a fit in parameter space, involved repeated rounds of sensitivity calculation and averaging of the value. In each run, all parameters are randomly modulated around the nominal values. We calculated regional sensitivities only in the Jacobian package, because sampling is computationally intensive and Jacobian's numerical solver was able to calculate more samples. Sensitivity averaging and parameter modulation was implemented in Jacobian's scripting environment. For parameter modulation, each nominal parameter was multiplied by a different, random factor of 10x where x is a random number evenly distributed between −2.5 and 2.5. This gives rise to a model with new parameter values that deviate from nominal ones. The sensitivity of a parameter is calculated for this new model, and entered into the average. This process is repeated a thousand times, and the resulting average sensitivity across randomly parameterized models gives the regional sensitivity.
SA was implemented in the scripting environment of Jacobian. The configuration space is given by the vector of parameter values. The ‘energy’ function is given by the RMSD function that assigns a score to each configuration is the objective function, defined as follows:
where x is the number of activated molecules of some species in the model, xe is the number of activated molecules of the corresponding species in experiment, Nobs is the number of measurables or observable species, Nexpt is the number of experimental conditions and Nt is the number of time points in the experiment. The measured phosphorylation of a species was normalized for each cell line such that the maximum signal for a species among all stimulation conditions was set to 1 (in Figure 2, the signals are further scaled to reach 100). The simulated dynamics for pERK and pAkt were normalized by dividing by the maximal possible activated signal (the initial number of molecules of the unphosphorylated form). For simulated pErbB1, the maximal possible activated signal was assumed to be 70% of the measured number of ErbB1 molecules, as we observed that with higher ligand concentrations, more ErbB1 activation could be observed (data not shown). The normalization conditions for ERK and Akt reflect the assumption that the maximal observed signal is a ‘saturating’ one, i.e. all species have been phosphorylated. The objective function measures how well a model with a particular configuration of parameters is able to produce dynamics‐matching experiment. Of the 201 rate constants and 28 initial protein species numbers, 75 moves are possible (see main text). The move at each iteration of SA in parameter space consists of a multiplicative change in a selected parameter, where the multiplicative factor (or move size) is 10x where x is a random increment of 0.25. Acceptance of a move is given by the Metropolis criterion, i.e. if the move results in a decrease in the objective function, the move is accepted; if the move results in an increase in the objective function, the probability that a move is accepted is given by e−Δ E/T, where ΔE is the change in objection function value after the move. We set bounds on the moves, so parameters did not deviate more than 102.5‐ or 10−2.5‐fold from the starting point of annealing. These bounds put limits on the size of the explored space. Cooling of the system adhered to an exponential schedule T=T0exp(−step/step0), where T0 is the starting temperature, step is the step number in the 4000‐step annealing run and step0 is the time it takes for the temperature to be reduced by a factor 1/e (i.e. analogous to the half‐life). Through trial and error, we chose T0 to be 1, and step0 to be 1700, values that gave the highest number of near optimal fits.
A431, H1666 and H3255 cell lines were grown in ACL‐4 medium, HBL100 and MCF7 were grown in DMEM medium, and T47D and AU565 were grown in RPMI medium. Each medium was supplemented with 10% fetal bovine serum (FBS). Cells were trypsinized and subcultured into 96‐well plates (cat. no. 165305; Nalgene Nunc, Rochester, NY, for in‐cell western blot assays and cat. no. 3072; Becton Dickinson, Lincoln Park, NJ, for xMAP assays) at varying densities, grown for 24 h and subsequently serum starved for 16 h in the appropriate medium containing 0.1% FBS. At the time of treatment, the confluency of all cell lines was approximately 75%.
Time course measurements by in‐cell western blot assay
Time course measurements were performed by in‐cell western blot assay. Cells were stimulated by adding ligand for the indicated length of time. At the time of observation, ligand was removed and cells were fixed (4% formaldehyde in 1 × PBS, 20 min at 25°C), permeabilized (0.1% Triton in 1 × PBS, four washes for 5 min each at 25°C with rotation) and blocked (0.5 × Odyssey Blocking Buffer (OBB) (Licor Biosciences, Lincoln, NE), 1 h at 25°C with rotation). Primary antibodies were diluted 1:100 in 0.5 × OBB (50 μl per well) and allowed to incubate overnight at 4°C. Cells were then washed 3 × with 1 × PBS‐Tween (0.1%) for 5 min at 25°C with rotation, followed by the addition of secondary antibody (diluted 1:800 in 0.5 × OBB) and TO‐PRO‐3 (DNA stain, diluted 1:2500 in 0.5 × OBB) and incubation for 1 h at 25°C. Following three additional wash cycles, wells were aspirated dry and scanned using the Licor Odyssey Scanner (Licor Biosciences). Data were normalized on a ‘per cell’ basis by dividing fluorescence on the 800 channel (pErbB1, pERK or pAkt) by the 700 channel (DNA stain).
IC50 and EC50 measurements by xMAP assay
EC50 and IC50 measurements were performed by bead‐based immunoassay. For IC50 measurements, cells were pretreated for 1 h with the indicated concentrations of Iressa or Lapatinib (LC Laboratories, Woburn, MA). EGF was added at 5 nM for the IC50 measurements or at the indicated concentrations for the EC50 measurements. After 5‐min incubations, the cells were washed and lysed using the Bio‐Plex cell lysis kit (Bio‐Rad, Hercules, CA). pAkt (S473) and pERK (Y185/Y187) were measured using xMAP bead kits (LHO0101 and LHO0241; Invitrogen, Carlsbad, CA) and pY‐ErbB1 was measured using Luminex bead kit (71935‐3; EMD Chemicals Inc., San Diego, CA). Assays were performed using the Bio‐Plex Phosphoprotein Detection Reagent kit (Bio‐Rad) and the Bio‐Plex 200 platform (Bio‐Rad).
Primary monoclonal antibodies recognizing pErbB1 (Y1068) (cat. no. 2234), pERK 1 (T202/Y204) or pERK2 (T185/Y187) (cat. no. 4377), pAkt (S473) (cat. no. 4058) and MEK (S221) (cat. no. 2338) were obtained from Cell Signaling Technology (Danvers, MA). Primary monoclonal antibody recognizing Shc‐Y239/Y240 (cat. no. 44–830) was obtained from Biosource (now Invitrogen). Primary polyclonal antibodies recognizing ErbB1 phosphotyrosine sites Y845 (cat. no. 2237), Y992 (cat. no. 2234), Y1045 (cat. no. 2235), Y1068 (cat. no. 4404) and Y1173 cat. no. 2231 were obtained from Cell Signaling Technology. IRDye 800‐conjugated goat‐anti‐rabbit secondary antibodies were obtained from Rockland Immunochemicals, Gilbertsville, PA, and TO‐PRO‐3 DNA dye was obtained from Invitrogen.
We thank Laura Sontag‐Kleiman, Julio Saez‐Rodriguez and Carlos Lopez for discussion and editing and Taeshin Park for assistance with Jacobian. This study was financially supported by NIH center grants GM68762 and CA112967.
Conflict of Interest
The authors declare that they have no conflict of interest.
Supplementary Materials TOC and Supplementary Figures 1‐7 [msb200874-sup-0001.pdf]
Supplementary Data 1 [msb200874-sup-0002.zip]
This is an open access article under the terms of the Creative Commons Attribution‐NonCommercial‐NoDerivs License, which permits use and distribution in any medium, provided the original work is properly cited, the use is non‐commercial and no modifications or adaptations are made.
- Copyright © 2009 EMBO and Nature Publishing Group