During evolution, enzyme‐coding genes are acquired and/or replaced through lateral gene transfer and compiled into metabolic pathways. Gene regulatory networks evolve to fine tune biochemical fluxes through such metabolic pathways, enabling organisms to acclimate to nutrient fluctuations in a competitive environment. Here, we demonstrate that a single TrmB family transcription factor in Halobacterium salinarum NRC‐1 globally coordinates functionally linked enzymes of diverse phylogeny in response to changes in carbon source availability. Specifically, during nutritional limitation, TrmB binds a cis‐regulatory element to activate or repress 113 promoters of genes encoding enzymes in diverse metabolic pathways. By this mechanism, TrmB coordinates the expression of glycolysis, TCA cycle, and amino‐acid biosynthesis pathways with the biosynthesis of their cognate cofactors (e.g. purine and thiamine). Notably, the TrmB‐regulated metabolic network includes enzyme‐coding genes that are uniquely archaeal as well as those that are conserved across all three domains of life. Simultaneous analysis of metabolic and gene regulatory network architectures suggests an ongoing process of co‐evolution in which TrmB integrates the expression of metabolic enzyme‐coding genes of diverse origins.
Several lateral gene transfers and homologous gene replacement events are speculated to have an important function in the evolution of archaeal metabolic networks (Galperin and Koonin, 1999; Siebers and Schonheit, 2005). If so, then this raises important questions regarding the evolution and the architecture of gene regulatory networks (GRNs) that integrate and coordinate these enzymes in the face of unique environmental challenges. Metabolism of sugars in Halobacterium salinarum NRC‐1 represents one such central process in which several enzymes and entire segments seem to have been acquired through lateral gene transfers. Taking a systems approach, we have characterized global regulation of these core processes by a single regulator, TrmB. Specifically, we integrated data from classical physiology and genetics experiments with orthogonal sources of genome‐wide evidence, including (i) protein–DNA interactions measured globally with ChIP‐chip; (ii) transcriptional responses of genetically and environmentally perturbed strains using microarray analysis; (iii) genome‐wide distribution of a conserved TF‐binding motif signature; and (iv) a reconstructed metabolic network (Figure 6).
Deletion of this regulator resulted in severe growth defects in diverse environments, including nutrient‐replete and nutrient‐limited conditions, metal excess or limitation, and oxidative stress. This generalized growth phenotype was likely due to a reduced NAD+/NADH ratio relative to the parent strain and was readily complemented by glucose or glycerol. Global transcription analysis of the ΔtrmB strain revealed perturbed regulation of 182 genes (16 down‐ and 166 upregulated) with significant overrepresentation of carbohydrate metabolism genes. Surprisingly, the deletion of TrmB also resulted in defective regulation of amino acid, cofactor, vitamin, and purine biosynthesis genes. These observations were further refined through ChIP‐chip experiments, which showed that TrmB was physically associated with the promoters of many of these genes in a glucose‐ or glycerol‐dependent manner (P=5.5 × 10−11). We have verified that a conserved cis‐regulatory motif within many of these promoters is essential for TrmB function. Together these results support the hypothesis that, depending on the carbon source (glucose or glycerol), TrmB acts as both a transcriptional activator and a repressor to directly coordinate enzymes of central metabolism with associated pathways.
Upon integration of the gene regulatory network with a reconstructed metabolic network, we observed several instances of direct TrmB‐mediated transcriptional control of metabolic enzymes with the biosynthesis of their cognate cofactors. For example, TrmB directly controls over 10 enzymes that require adenosine phosphates (AXP; Figure 6, e.g. reactions 3, 7, 10, 11, 27, 33, 63) and six genes that encode the biosynthesis of these cofactors (Figure 6, reactions 43, 44, 45, 46, 47, 52). Our data suggest that TrmB represses the semi‐phosphorylative Entner–Doudoroff (E–D) glycolytic pathway (Danson et al, 2007; Kanai et al, 2007; van der Oost and Siebers, 2007; Pfeiffer et al, 2008) (e.g. gap, pykA, VNG0442G) (Figure 6), and induces gluconeogenesis (e.g. ppsA; Figure 6). Thus, a deletion in trmB would lead to an inability to generate energy through gluconeogenesis in the absence of glucose. Furthermore, ΔtrmB cultures grown in the absence of glucose upregulate enzymes that reduce NAD(P)+ to NAD(P)H, potentially forcing the cell toward an oxidized state (e.g. Figure 6, reactions 19, 31). If so, then this would lead to a shortage of reducing equivalents. The hypersensitivity to oxidative stress and reduced NAD+/H ratio observed in ΔtrmB mutant cells are consistent with this hypothesis. We conclude that TrmB acts to maintain redox and energy balance in response to nutrient availability in H. salinarum.
The TrmB‐specified regulatory network coordinates the transcription of enzymes of mixed evolutionary lineage (Figure 6). For example, in the shikimate biosynthesis pathway, only one gene encoding shikimate kinase is of archaeal origin, whereas all other genes are conserved throughout evolution (orange shaded area, Figure 6). Strikingly, all genes of this pathway except shikimate kinase are direct TrmB targets (Figure 6), suggesting that shikimate kinase may have been acquired by homologous gene replacement (Galperin and Koonin, 1999). In this regard, the TrmB regulatory network might be a specific example of an active evolutionary process, because several lateral gene transfer or homologous gene replacement events are thought to have occurred in the evolutionary compilation of metabolic networks (Galperin and Koonin, 1999). In summary, this study provides insight into how the architecture of a large metabolic network and an associated GRN may have co‐evolved using components of diverse origins, and how this assembly may be conserved across the archaeal lineage.
We have discovered an evolutionarily conserved gene regulatory network (GRN) specified by a single transcription regulator TrmB that coordinates over 100 enzymes of diverse ancestry.
Depending on the carbon source, TrmB functions either as an activator or repressor to coordinate enzymes of core metabolism with pathways for synthesis of their co‐factors.
Given that many TrmB targets are NAD(P)+‐dependent enzymes, disruption of its activity alters the redox and energy balance to result in a generalized growth defect under diverse environmental conditions.
This study provides insight into the co‐evolution of a GRN and a large metabolic network that has assembled from components of diverse origins.
Archaeal genomes encode unusual metabolic enzymes with homologs in either eukarya or bacteria (Siebers and Schonheit, 2005). Several homologous gene replacement events are speculated to have an important function in evolution to integrate these enzymes into archaeal metabolic networks that are otherwise comprised of enzymes conserved across two or more domains of life (Galperin and Koonin, 1999; Siebers and Schonheit, 2005). If so, then this raises important questions regarding the evolution and the architecture(s) of gene regulatory networks (GRNs) that integrate and coordinate enzyme‐coding genes within archaeal metabolic networks in the face of unique environmental challenges.
GRNs evolve by internalizing environmental factor changes to coordinate the efficient uptake and usage of limited nutritional resources (Tagkopoulos et al, 2008). Not surprisingly, the activity of many transcription factors (TFs) in these GRNs reflects cellular adaptations to environmental niches. For example, more than half of all TFs in bacteria are thought to bind small molecules to monitor changes in environmental and cellular status (Madan Babu and Teichmann, 2003). Likewise, at least 50 eukaryotic TFs coordinate central metabolic pathways in multiple cellular compartments (Herrgard et al, 2006; Reece et al, 2006). Although limited information exists on archaeal GRNs, it is known that the pre‐initiation complex (PIC) is made up of orthologs of the eukaryotic general transcription factors (GTFs): transcription factor II B (TFB), a TATA‐binding protein (TBP), and a eukaryotic RNA‐Pol II‐like polymerase (Geiduschek and Ouhammouch, 2005). In contrast, many of their sequence‐specific repressors and activators of transcription share ancestry with bacterial transcription regulators (Bell, 2005). However, only 10 of these regulators have been characterized to date (Bell, 2005), and even fewer have a known function in vivo (Lie et al, 2005; Muller and DasSarma, 2005; Lee et al, 2008). Those that have a known function in vivo include transcription regulators for glycolytic/gluconeogenic, nitrogen, and lysine usage pathways (Brinkman et al, 2002; Lie et al, 2005; Kanai et al, 2007). For example, the TrmB transcription factor (thermococcus regulator of maltose binding) acts as a repressor for genes encoding glycolytic enzymes and as an activator for genes encoding gluconeogenic enzymes (Kanai et al, 2007). In these systems, TrmB also binds to glucose, maltose, trehalose, maltodextrins, and sucrose molecules to differentially regulate the genes encoding corresponding sugar uptake systems in a sequence‐specific manner (van de Werken et al, 2006; Lee et al, 2008).
However, the regulation of all other metabolic pathways in archaea is currently unknown.
The limited information on regulation of metabolism in archaea is a significant handicap in comparative analysis for understanding evolutionary similarities and differences in the architecture(s) of GRNs. Here we have characterized the TrmB regulatory network in the halophilic archaeon Halobacterium salinarum NRC‐1 by integrating three disparate sources of evidence (protein–DNA interactions measured globally with ChIP‐chip, transcriptional responses of genetically and environmentally perturbed strains using microarray analysis, and genome‐wide distribution of a conserved TF‐binding motif signature) with a metabolic reconstruction. These results demonstrate that the haloarchaeal TrmB ortholog (VNG1451C) coordinates the transcription of more than 100 central metabolic enzyme‐coding genes with genes involved in de novo synthesis of their cognate cofactors. We hypothesize that this balanced regulation allows the cell to modulate redox and energy status. More importantly, we show that the TrmB‐dependent metabolic network integrates the transcription of enzyme‐coding genes that are uniquely archaeal with those that are conserved across all three domains of life. In sum, this study provides insight into how the architecture of a large metabolic network and an associated GRN may have co‐evolved using components of diverse origins, and how this assembly may be conserved across the archaeal lineage.
We used a combination of classical genetics, genome‐wide experimental, and computational approaches to identify the TrmB ortholog and characterize the architecture of the network it specifies to control central metabolism in the archaeon H. salinarum. These approaches included (i) sequence analysis to identify a putative TrmB homolog; (ii) phenotypic characterization of a ΔtrmB deletion strain; (iii) transcriptomic analysis of the ΔtrmB strain under defined growth conditions associated with the defective phenotypes; (iv) ChIP‐chip (genome‐wide in vivo localization of TrmB binding); (v) genome‐wide distribution of a conserved motif signature discovered de novo within experimentally mapped TrmB‐binding sites; (vi) promoter: reporter fusion assays to validate TrmB targets identified by the high‐throughput methods; (vii) computational integration of the results of these experiments and data from earlier studies to construct transcriptional and metabolic networks governed by TrmB. We conclude from the results of these experiments that TrmB is a bifunctional regulator that governs the transcription of genes in central metabolic pathways of diverse ancestry to manage cellular redox and energy status. Results of these experiments are described in detail below.
Sequence analysis suggests that VNG1451C encodes a putative sugar‐binding transcription regulator
Given the central nature of sugar metabolism in cellular physiology, we searched for putative TFs that may control this process. At least seven proteins in the H. salinarum NRC‐1 proteome (http://baliga.systemsbiology.net) have significant matches to protein family signatures and sequences of known sugar metabolism regulators. Among these candidate regulators, the VNG1451C amino‐acid sequence (Figure 1A) significantly matches (e‐value=2 × 10−8) the 50aa TrmB family signature (PF01978, http://pfam.sanger.ac.uk/) with 21% identity to the consensus sequence (Figure 1B). According to ClustalW analysis, VNG1451C possesses at least three active site residues known to be critical for sugar binding in the characterized TrmB orthologs (Krug et al, 2006; Kanai et al, 2007; Lee et al, 2008). Interestingly, although the TrmB signature is conserved across 175 bacterial and archaeal species, no TrmB orthologs have been identified to date in bacteria (Lee et al, 2008). TFs of the TrmB family have been implicated in the regulation of maltose and glucose usage in thermophilic archaea (van de Werken et al, 2006; Kanai et al, 2007; Lee et al, 2008). In these archaea, the genetic loci encoding TrmB also harbor genes coding for the maltose and/or trehalose ABC transporters. Notably, these genes are absent in chromosomal vicinity of VNG1451C in the H. salinarum genome (Figure 1A). This combined evidence suggests that VNG1451C encodes a widely conserved regulator with a putative function related to sugar metabolism.
Phenotypic analysis suggests that TrmB is involved in sugar metabolism and maintenance of redox balance
We investigated the phenotypic consequence of deleting trmB in diverse environments. This revealed a severe growth defect in the mutant under nearly every condition tested, including standard growth in rich media, nutrient starvation in defined media, metal depletion and excess, and oxidative stress (Figure 2A; Supplementary Tables 1 and 2; Materials and methods). In addition, the NAD+/NADH ratio in mid‐logarithmic phase ΔtrmB cultures was, on average, significantly lower than in the parent strain in the absence of glucose (Figure 2B). Wild‐type growth rates were recovered by functional complementation in trans with a plasmid‐borne copy of trmB in the ΔtrmB background, ruling out polar effects of the gene deletion on surrounding genes (Supplementary Table 2). The addition of glucose to the growth media also complemented both the growth defect and NAD+/NADH ratio imbalance (Figures 2A and B). Partial complementation of the mutant growth phenotype was observed in the presence of glycerol. However, the inability of sucrose, galactose, raffinose, maltose, and pyruvate to remedy these defects (P<0.001) indicated some nutrient specificity in the function of this regulator (Figure 2A; Supplementary Table 2). Together these observations suggest that the function of TrmB is associated with glucose or glycerol metabolism and potentially linked to the maintenance of cellular redox balance.
Transcriptome analysis reveals that TrmB might regulate functions in diverse metabolic pathways
To characterize the TrmB regulon, genome‐wide transcriptional changes were analyzed in the ΔtrmB deletion background during growth in the presence and absence of glucose (Figure 2C; Supplementary Table 3). On culturing in the absence of glucose, 182 genes showed perturbed expression in the ΔtrmB background compared with the isogenic parent (16 down‐ and 166 upregulated; Figure 2C; Supplementary Table 3). These genes were grouped into 11 functional categories according to GO and KEGG databases (Ashburner et al, 2000; Kanehisa and Goto, 2000). As expected from the aforementioned experiments, genes whose products function in carbohydrate metabolism were significantly overrepresented in these categories (P∼5 × 10−6; e.g. ppsA, PEP synthase; pykA, pyruvate kinase; Figure 2C; Supplementary Table 3). Surprisingly, genes encoding additional metabolic pathways were also significantly perturbed in the mutant, including amino acid, cofactor, vitamin, and purine biosynthetic pathways (Figure 2C). Consistent with the phenotypic data, growth phase‐specific regulation of these genes was restored in ΔtrmB on the addition of glucose (Figure 2C). Two possible molecular mechanisms could lead to this result: (i) direct TrmB binding to affected promoters, including those of other TFs; and (ii) an indirect consequence of perturbing sugar metabolism.
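The overrepresentation of a functional category among perturbed genes can be illustrated with a short calculation. The text does not state which statistical test produced the reported P‐value; the sketch below assumes a one‐sided hypergeometric test, a common choice for this kind of enrichment analysis. The function name and its arguments are hypothetical, and no gene counts from this study are encoded in it.

```python
from math import comb


def hypergeom_enrichment_p(genome_size, category_size, sample_size, overlap):
    """Probability of seeing at least `overlap` category genes in a sample.

    Assumes `sample_size` genes are drawn without replacement from
    `genome_size` genes, of which `category_size` belong to the category.
    (Illustrative helper; not the authors' published procedure.)
    """
    max_overlap = min(category_size, sample_size)
    total = comb(genome_size, sample_size)
    # Sum the upper tail of the hypergeometric distribution.
    tail = sum(
        comb(category_size, k)
        * comb(genome_size - category_size, sample_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total
```

A call such as `hypergeom_enrichment_p(genome_size, category_size, 182, overlap)` would then give the chance of finding at least `overlap` category members among 182 perturbed genes, once the genome and category sizes are supplied from the annotation in use.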
TrmB binds target promoters that function in diverse metabolic pathways in the absence of glucose or glycerol
TF‐binding location analysis with ChIP‐chip.
To differentiate between direct and indirect regulatory influences of TrmB, its transcription‐factor‐binding sites (TFBS) were localized throughout the genome using ChIP‐chip. This procedure localizes DNA fragments within transcription factor complexes enriched with chromatin immunoprecipitation (ChIP) using whole genome tiling arrays (chip). We mapped TFBSs in the presence and absence of varying concentrations of glucose or glycerol (Materials and methods). TrmB bound to 113 sites throughout the chromosome in the absence of glucose or glycerol (Figure 3A). Interestingly, no TrmB binding was observed across the genome in the presence of high glucose or glycerol (Figures 3B and C; Supplementary Figures 1 and 2). This finding is consistent with the observation that both glucose and glycerol can complement the physiological consequences of deleting TrmB (Figure 2) and with earlier reports on TrmB function in other systems (Lee et al, 2008).
The 113 binding sites were ranked according to statistical confidence (Table I). The top seven high‐confidence (P<10−6) hits reside in intergenic regions upstream of genes encoding functions in glycolysis and gluconeogenesis (functional enrichment P=1.11E−02) (Table I; Figures 3A–C). These genes also exhibited the most significant enrichment in immunoprecipitated TrmB–DNA complexes (gpm, ppsA, pgk, pykA, VNG0683C; Figure 3A). In addition, one gene, ppsA, remained bound under low glucose concentrations, suggesting a high‐affinity interaction with TrmB at this site (Figure 3B).
Other high‐confidence targets in the list (10−3>P>10−6) also showed a strong transcriptional perturbation in the ΔtrmB background (Table I; Supplementary Table 4). Consistent with the transcriptome data, genes coding for biosynthesis of purine nucleotides, cobalamin, thiamin, and amino acids were significantly overrepresented across all 113 targets (Table I; Supplementary Table 4). We also observed binding to the intergenic region upstream of five TFs, including VNG0156C, VNG0247C, VNG0878G, putative regulators; VNG1179C, a regulator of copper homeostasis (Kaur et al, 2006); and TrmB itself (Supplementary Table 4). This could explain the differential regulation of a large number of genes whose promoters are not directly bound by TrmB. Together these data suggest that TrmB directly controls the expression of genes functioning in diverse metabolic pathways.
Although TF binding in intergenic regions is generally considered as evidence for direct regulation of downstream genes, we found that ∼40% (45 of 113) of TrmB‐binding sites were inside coding sequences. However, given that the H. salinarum genome is ∼85% coding, our sample of binding sites is actually somewhat enriched in intergenic regions (P=0.18), with 60% of these binding sites falling within 250 bp of an experimentally determined transcription start site (Koide et al, 2009). ChIP‐chip studies for other bacterial TFs have found up to 70% of targets in intergenic regions (Shimada et al, 2008). Combined, these results suggest that these binding events might be functional. However, further investigation is required to elucidate the physiological function of these unusual TrmB targets, at least two of which are near loci that encode newly discovered putative noncoding RNAs (Koide et al, 2009).
Identification of a cis‐regulatory sequence motif.
To further define the TrmB‐binding site, we searched for a conserved cis‐regulatory sequence motif within 250 bp of its genomic binding locations identified by ChIP‐chip. Locations of transcription start sites (Koide et al, 2009), translation start sites (http://baliga.systemsbiology.net) (Ng et al, 2000), and putative GTF‐binding sites (Facciotti et al, 2007) were used to constrain the sequence search space (Materials and methods). We identified a conserved cis‐element [TACT‐N(7‐8)‐GAGTA (P<2 × 10−5)] (Figure 4A) within 250 bp of 115 (P=8.7 × 10−50) of all genes near TrmB‐binding sites identified by ChIP‐chip (Figure 4B). Matches to this signature were also detected in the vicinity of other sites, albeit farther from the ChIP‐chip location (>250 bp) (Supplementary Table 4). This motif is divergent from other characterized TrmB‐binding sites (e.g. Thermococcales glycolytic motif (TGM), TATCAC‐N5‐GTGATA) (van de Werken et al, 2006). However, consensus sequences for different HTH‐domain containing TFs can be quite divergent (Rigali et al, 2004). A genome‐wide pattern searching algorithm identified a total of 317 matches to this motif signature (Supplementary Table 5; Materials and methods), which are near 396 genes in operons, suggesting that TrmB may bind to additional loci across the genome. This motif signature is enriched in intergenic regions (P∼0.002). However, given that functional promoter binding is a product of combinatorial interactions of TFs, GTFs, cofactors, and RNA polymerase, further studies will determine which of these additional putative TrmB‐binding sites are indeed functional.
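A genome‐wide scan for the consensus element can be sketched as follows. This is a minimal exact‐match search, assuming the motif is treated as the literal string pattern TACT‐N(7‐8)‐GAGTA on either strand; the pattern searching algorithm actually used is described in Materials and methods and may well score degenerate matches rather than require exact ones. The function name is hypothetical.

```python
import re

# Forward-strand pattern taken from the text: TACT-N(7-8)-GAGTA.
FORWARD = "TACT[ACGT]{7,8}GAGTA"
# Reverse complement of the same pattern, for sites on the opposite strand.
REVERSE = "TACTC[ACGT]{7,8}AGTA"


def find_motif_sites(sequence):
    """Return sorted (start, end, strand) tuples for exact motif matches.

    Coordinates are 0-based and half-open on the supplied sequence.
    A lookahead is used so that overlapping matches are not skipped; at
    most one match is reported per start position and strand (the longer
    spacer is tried first).
    """
    seq = sequence.upper()
    sites = []
    for strand, pattern in (("+", FORWARD), ("-", REVERSE)):
        for match in re.finditer(f"(?=({pattern}))", seq):
            sites.append((match.start(1), match.end(1), strand))
    return sorted(sites)
```

Restricting the reported sites to those within 250 bp of a ChIP‐chip peak, as in the text, would be a simple filter on the returned coordinates.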
In vivo validation of key TrmB‐binding sites using promoter: reporter fusion assays.
To ensure that the TrmB‐binding motif represents a physiologically relevant regulatory region, the ppsA promoter was fused to the GFP reporter and assayed in the wild type and ΔtrmB backgrounds (Figure 4C). As expected from the microarray and ChIP‐chip data, FACS assays validated that transcription from the ppsA promoter is activated three‐fold in the absence of glucose in the parent background, whereas ΔtrmB cells are impaired for induction (Figure 4C). Strikingly, when the distal TrmB‐binding site is removed, the glucose responsiveness is reduced in the parent strain (Figures 4C and D). Activity from this shorter promoter remains at background levels in the ΔtrmB mutant regardless of condition (Figure 4C). Together these data verify that the cis‐regulatory TrmB‐binding site identified by high‐throughput methods is physiologically relevant and requires TrmB for regulation in the absence of glucose. Further, both the promoter proximal and distal binding sites contribute to TrmB‐mediated response to glucose, with both binding sites required for full activation, suggesting binding site synergy (Figure 4D).
TrmB governs an integrated transcriptional and metabolic network to balance the expression of evolutionarily diverse cofactor and enzyme‐coding genes
TrmB is a bifunctional regulator that activates some targets and represses others.
To construct the H. salinarum TrmB‐dependent transcriptional network, we calculated the significance of the overlap between the integrated ChIP‐chip, transcriptome, and motif location data generated here. This was further integrated with genome‐wide transcription start site data (Koide et al, 2009) and ChIP‐chip data for seven GTFs (Facciotti et al, 2007). This enabled identification of a significant group of 37 genes (organized among 20 operons) in the overlap between integrated system‐wide datasets (Figures 4A and B). Functional annotations (Materials and methods) revealed that these 37 genes encode enzymes of (i) glycolytic and gluconeogenic pathways, (ii) purine biosynthesis, (iii) cobalamin biosynthesis, (iv) TCA cycle, and (v) glutamate dehydrogenase (Figure 4A).
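The integration step can be pictured as a set intersection across evidence types. The sketch below is illustrative only: it assumes each dataset has already been reduced to a list of gene identifiers, it does not compute the significance of the overlap, and the function names and the gene labels in the usage note are hypothetical placeholders, not data from this study.

```python
def consensus_targets(chip_bound, differentially_expressed, motif_hits):
    """Genes supported by all three evidence types."""
    return set(chip_bound) & set(differentially_expressed) & set(motif_hits)


def evidence_table(chip_bound, differentially_expressed, motif_hits):
    """Map each gene to the sorted list of evidence types that support it."""
    sources = {
        "chip": set(chip_bound),
        "expression": set(differentially_expressed),
        "motif": set(motif_hits),
    }
    table = {}
    for name, genes in sources.items():
        for gene in genes:
            table.setdefault(gene, []).append(name)
    return {gene: sorted(names) for gene, names in table.items()}
```

For placeholder inputs `["g1", "g2", "g3"]`, `["g2", "g3", "g4"]`, and `["g3", "g5"]`, only `g3` is supported by all three evidence types, while `evidence_table` records that `g2` is supported by ChIP‐chip and expression data only.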
The mechanism by which TrmB activates and/or represses these genes was investigated further by analyzing the locations of its binding sites relative to those of seven GTFs (Facciotti et al, 2007) (Figure 5A). TrmB‐binding sites were located upstream of GTF‐binding sites in promoters of genes it activates (e.g. ppsA; Figure 5B; P<0.03). In contrast, no GTF‐binding locations were detected within promoters of genes that are repressed by TrmB (e.g. gap; Figures 5A and C). Alternatively, GTF‐binding sites were located upstream of repressed genes (e.g. VNG0303G, VNG0382G, VNG1128G, Figure 5A). Thus, our system‐wide ChIP‐chip data support a model in which TrmB binds to its motif downstream of the PIC to occlude transcription. In contrast, TrmB‐binding upstream of the PIC facilitates transcription at weak promoters (Figure 5D). Although this model was suggested earlier (Kanai et al, 2007; Lee et al, 2008), this study presents the first in vivo experimental evidence of these interactions and places them in a global context. Similar analysis of the remaining TrmB‐bound promoters revealed a more complex distribution of GTF and TrmB‐binding sites, which may indicate the involvement of additional mechanisms in the regulation of those genes (Facciotti et al, 2007). Nevertheless, this integrated systems analysis has provided both a global perspective and valuable mechanistic insight into combinatorial regulation by GTFs and a sequence‐specific transcription regulator (Bell et al, 1999; Kanai et al, 2007).
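The positional model described above can be written as a simple rule. The sketch below is a deliberate simplification of that model, assuming single point coordinates for the TrmB site, the GTF site, and the transcription start site; the function name, arguments, and return labels are hypothetical, and the more complex promoter architectures mentioned in the text are not covered.

```python
def predict_trmb_effect(trmb_site, gtf_site, tss, strand):
    """Classify a promoter by the position of the TrmB site relative to the GTF site.

    Positions are genomic coordinates; `strand` is "+" or "-".
    Returns "activation" when TrmB binds upstream of the GTF site,
    "repression" when it binds downstream, and "undetermined" when no
    GTF site was detected or the two positions coincide.
    """
    if strand not in ("+", "-"):
        raise ValueError("strand must be '+' or '-'")
    if gtf_site is None:
        return "undetermined"
    direction = 1 if strand == "+" else -1
    # Express both sites relative to the TSS along the direction of
    # transcription, so that more negative values are further upstream.
    trmb_rel = (trmb_site - tss) * direction
    gtf_rel = (gtf_site - tss) * direction
    if trmb_rel < gtf_rel:
        return "activation"
    if trmb_rel > gtf_rel:
        return "repression"
    return "undetermined"
```

Working in coordinates relative to the transcription start site keeps the rule identical for genes on either strand.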
The combined evidence presented thus far strongly suggests that TrmB acts as both a transcriptional activator and a repressor in response to carbon source availability. Further, these results suggest that TrmB directly and coordinately controls genes significantly overrepresented for functions in central metabolism and its associated pathways in the metabolic network.
Metabolic network reconstruction analysis suggests that TrmB coordinates the expression of evolutionarily diverse enzymes.
To gain a systems‐scale perspective on the role of TrmB in metabolism, we reconstructed the metabolic network of H. salinarum in the context of the TrmB transcription regulatory network and known archaeal reactions reported in the literature (Figure 6; Supplementary Table 6). Many TrmB‐regulated enzymes catalyze reactions at critical regulatory branch points but not the end stages of several metabolic pathways, including amino acid, purine, thiamine, and cobalamin biosynthesis (Figure 6; Falb et al, 2008; Gonzalez et al, 2008). This is illustrated by TrmB‐mediated transcriptional control of genes encoding branch points between purine and thiamine metabolism (purK, purE, and purM; reactions 46 and 47, Figure 6) and between carbon and nitrogen metabolism (gdhB, korAB, and gdhA1; reactions 31, 32, and 39, respectively, Figure 6).
We also observed several instances of direct TrmB‐mediated transcriptional control of metabolic enzymes with the biosynthesis of their cognate cofactors. Three examples support this assertion. (i) TrmB directly controls over 10 enzymes that require adenosine phosphates (AXP; Figure 6, e.g. reactions 3, 7, 10, 11, 27, 33, 63) and six genes that encode the biosynthesis of these cofactors (Figure 6, reactions 43, 44, 45, 46, 47, 52). (ii) TrmB target genes encoding functions such as 5‐phosphoribosyl‐N‐formylglycinamidine (FGAM) synthase (purL2; Figure 6, reaction 45) and asparagine synthase (asnA, reaction 63) use glutamate or glutamine as co‐reactants and ATP as a cofactor. Concomitantly, TrmB regulates GOGAT pathway genes (reactions 31, 39). (iii) TrmB coordinately controls thiamin (TPP) synthesis (thiC) with thiamine‐requiring enzymes (porAB and menD, reactions 12 and 25, respectively) (Rodionov et al, 2003).
In addition, among direct TrmB targets were genes encoding enzymes of uniquely archaeal lineage (Figure 6; Supplementary Table 6). Of the enzyme‐coding genes under TrmB control included in the metabolic pathways shown in Figure 6, 13 are unique to archaea (Supplementary Table 6) and 35 are conserved across species from all three domains of life. Integrated analysis of the metabolic and gene regulatory network architecture reveals two opposing scenarios. (i) In the shikimate biosynthesis pathway, only one gene encoding shikimate kinase (VNG1245C, reaction 20) is of archaeal origin, whereas all other genes are conserved throughout evolution (orange shaded area, Figure 6; Supplementary Table 6). Strikingly, all genes of this pathway except shikimate kinase are direct TrmB targets (Figure 6), suggesting that shikimate kinase may have been acquired by homologous gene replacement (Galperin and Koonin, 1999). This finding is especially surprising given that enzymes in bacterial metabolic pathways without branch points tend to be co‐regulated (Seshasayee et al, 2008). (ii) The latter half of the TCA cycle from 2‐oxoglutarate to fumarate is directly TrmB‐dependent (purple shaded area, Figure 6), including the genes encoding 2‐oxoglutarate oxidoreductase, the only uniquely archaeal enzyme in the pathway (Supplementary Table 6). These findings suggest that TrmB governs the transcription of a metabolic network with hybrid evolutionary origin.
TrmB coordinately regulates metabolic enzyme‐coding genes with cofactor genes
From the evidence presented in this study, we conclude that TrmB governs a sugar‐responsive global metabolic regulatory network to coordinate the expression of genes with diverse evolutionary ancestry. Remarkably, TrmB coordinates the transcription of enzyme‐coding genes involved in the synthesis of cofactors required for the function of these metabolic enzymes. We hypothesize that this TrmB‐directed coordination may enable redox and energy balance. Specifically, our data suggest that TrmB represses the semi‐phosphorylative Entner–Doudoroff (E–D) glycolytic pathway (Kanai et al, 2007; Danson et al, 2007; van der Oost and Siebers, 2007; Pfeiffer et al, 2008) (e.g. gap, pykA, VNG0442G) (Figures 5 and 6), and induces gluconeogenesis (e.g. ppsA; Figure 2). Thus, a deletion in trmB would lead to an inability to generate energy through gluconeogenesis in the absence of glucose. Secondly, ΔtrmB cultures grown in the absence of glucose overexpress genes coding for enzymes that convert NAD(P)+ to NAD(P)H, which may force the cell toward an oxidized state (e.g. Figure 6, reactions 19, 31). If so, then this would lead to a shortage of reducing equivalents. The hypersensitivity to oxidative stress and reduced NAD+/NADH ratio observed in ΔtrmB mutant cells are consistent with this hypothesis (Supplementary Table 2; Figure 2B). We conclude that TrmB acts to maintain redox and energy balance in response to nutrient availability in H. salinarum.
Interestingly, our evidence points to additional regulatory mechanisms that may cooperate with the TrmB transcription network. First, TrmB does not seem to regulate the end stages of several pathways (e.g. amino acid, purine, thiamine, cobalamin biosynthesis). Second, TrmB is not only autoregulated, but also seems to directly control four other TFs (Supplementary Table 4). Third, the transcription of some genes with a direct TrmB–promoter interaction remains unchanged in the ΔtrmB strain (e.g. amino‐acid biosynthesis gene asnA). Finally, in several instances TrmB regulates an entire pathway except for one gene (e.g. enolase, shikimate kinase). Together, this evidence suggests the cooperation of other regulatory mechanisms with the TrmB transcription network. This interpretation is in line with earlier metabolic regulatory network analyses in Escherichia coli, which found that the majority of central metabolic pathways are controlled by multiple TFs (Seshasayee et al, 2008).
Evolutionary context for the TrmB transcription‐metabolism network
Among known global regulators of central metabolic genes in prokaryotes, no single transcription factor has been shown to directly control both metabolic enzyme and cognate cofactor biosynthesis genes (Grainger et al, 2005; Supplementary Table 4). For example, CRP, a global regulator of carbon and nitrogen metabolic pathways in enteric bacteria, is required for the condition‐specific transcriptional induction of cobalamin biosynthesis genes (cob). However, a direct CRP–cob promoter interaction has not been established (Ailion et al, 1993; Grainger et al, 2005). Instead, the cob‐specific transcription factor PocR may be a more likely candidate for direct regulation (Ailion et al, 1993). Similarly, in Bacillus subtilis, CcpA controls global targets in carbon metabolism (Sonenshein, 2007), whereas the PurR repressor is specific for purine biosynthetic genes (Saxild et al, 2001).
In contrast, in other archaeal species, it is possible that TrmB or TrmB‐type global transcriptional control of metabolism is operative. For example, in species in which TrmB is known to control glycolysis and gluconeogenic pathways (e.g. Pyrococcus furiosus), an unknown transcription factor upregulates genes in other metabolic pathways such as the TCA cycle and chorismate synthesis in response to maltose (Schut et al, 2003). In addition, divergent TrmB‐binding sites can be detected in the vicinity of the transcription start site for some of these genes in this and other thermophilic archaeal species, although they have not been experimentally validated (van de Werken et al, 2006). It will be interesting to confirm these preliminary findings, because they suggest that the network motif of integrated control of cofactor and enzyme genes could be widespread in archaea. Therefore, the metabolic network model presented here will be useful as a structural framework for other archaeal systems and as a starting point for evolutionary comparisons with understudied representatives in other domains of life.
In light of this evolutionary context, it was striking to observe that genes encoding enzymes of uniquely archaeal lineage were included among the direct TrmB targets in H. salinarum (Figure 6; Supplementary Table 6). Combined with the observation that the TrmB regulon genes encode enzymes of diverse ancestry, these results are consistent with the hypothesis that several lateral gene transfer or homologous gene replacement events occurred in the evolutionary compilation of the TrmB network (Galperin and Koonin, 1999). However, additional information is required to determine whether the TrmB regulatory network architecture (e.g. promoter elements, nutrient responsiveness) was in place before or after the acquisition of these new genes, because our data support both scenarios (e.g. compare the opposing shikimate pathway and TCA cycle examples; see shaded areas in Figure 6). Nevertheless, the TrmB regulatory network may represent a unique window into an active evolutionary process.
In summary, this study reveals that TrmB regulatory control is restricted primarily to central metabolism and branch points, suggesting combinatorial control between TrmB and pathway‐specific regulatory mechanisms. In addition, TrmB seems to balance the expression of genes coding for metabolic enzymes with those of their cognate cofactors in a sequence‐specific manner. Finally, the TrmB metabolic regulatory network is an evolutionary mosaic, controlling genes coding for uniquely archaeal enzymes with those that are more widely distributed, even within the same metabolic pathway.
Materials and methods
Strains, media, plasmids, and growth curve assays
H. salinarum NRC‐1 (ATCC700922) was grown routinely in complex medium (CM; 250 g/l NaCl, 20 g/l MgSO4·7H2O, 3 g/l sodium citrate, 2 g/l KCl, 10 g/l peptone) or a complete defined medium (CDM) containing 19 amino acids (Supplementary Table 1). For growth in CDM, starter cultures were grown in CM to mid‐logarithmic phase, washed three times in basal salts buffer (CM lacking peptone), and resuspended in CDM at OD600 ∼0.1. Subsequent growth was conducted for all media conditions at 37°C with 225 r.p.m. shaking in the presence of low light intensity (24.6 μmol photons/m2 s from fluorescent lamps).
For routine culturing and growth assays of the Δura3 parent and ΔVNG1451C mutant strains, CM or CDM was supplemented with 0.05 mg/ml uracil. For growth curve assays, NAD+/H assays and growth complementation experiments in the Δura3 parent and ΔVNG1451C mutant strains, CM or CDM was supplemented with various sugars (Figure 2) at 7% w/v except for glycerol at 0.08% v/v. Samples were grown in 200 μl cultures for 6 days under continuous ∼225 r.p.m. shaking in a Bioscreen C (Growth Curves USA, Piscataway, NJ), set to measure optical density (OD) at 600 nm automatically every 30 min for 200 culture samples simultaneously. The average and standard deviation of the doubling time for Δura3 and Δura3ΔVNG1451C during the logarithmic phase are shown in Figure 2 and are based on at least 9 and at most 20 biological replicate samples for each strain under each condition. Two sets of non‐parametric paired t‐tests were performed comparing (i) wild type versus mutant growth (asterisks in the figure represent this set of tests); and (ii) mutant growth without carbon source versus mutant growth in each of the carbon sources shown in Figure 2. By the latter measure, mutant growth rates in the absence of carbon source were significantly different from growth in glycerol and glucose but not from growth in any other source tested. For TFBS location array experiments with this strain, glucose was either omitted or added at 0.01% or 7% w/v, or glycerol was added at 0.08% v/v, where indicated in the text. Strain constructions and NAD+/H assays are described in detail in Supplementary information.
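As a minimal sketch of how a doubling time could be estimated from the automated OD600 readings described above, the following Python snippet fits a least‐squares line to log2(OD) against time. The function name, the assumption that all supplied readings fall within the logarithmic phase, and the synthetic readings are illustrative only; this is not the analysis code used in this study.

```python
import math

def doubling_time_hours(times_h, od_values):
    """Estimate doubling time (h) from a least-squares fit of log2(OD) vs time.

    Assumes every reading passed in lies within the logarithmic phase;
    choosing that window is left to the caller (hypothetical workflow).
    """
    if len(times_h) != len(od_values) or len(times_h) < 2:
        raise ValueError("need at least two paired readings")
    logs = [math.log2(od) for od in od_values]
    n = len(times_h)
    mean_t = sum(times_h) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times_h)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_h, logs))
    slope = sxy / sxx  # doublings per hour
    if slope <= 0:
        raise ValueError("no net growth in the selected window")
    return 1.0 / slope

# Synthetic readings every 30 min for a culture that doubles every 6 h.
times = [0.5 * i for i in range(13)]
ods = [0.1 * 2 ** (t / 6.0) for t in times]
example_doubling_time = doubling_time_hours(times, ods)
```

Fitting on the log scale uses every reading in the window rather than only its endpoints, which makes the estimate less sensitive to noise in any single measurement.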
Gene expression arrays
10 ml of H. salinarum NRC‐1 Δura3ΔVNG1451C and Δura3 parent strain sample cultures grown in CM or CDM in the presence or absence of glucose were collected at three time points throughout the growth curve (OD600∼0.2, 0.6, and 1.2). Cells were immediately pelleted by room temperature centrifugation at 8820 g for 8 min at 4°C and snap‐frozen on a dry‐ice ethanol bath. Sample pellets were stored overnight at −80°C, followed by RNA preparation using the Absolutely‐RNA kit (Stratagene, La Jolla, CA) according to the manufacturer's instructions. RNA quality was checked using the Bioanalyzer 2100 (Agilent Technologies, Santa Clara, CA) and freedom from DNA contamination was ensured by PCR amplification of 200 ng of RNA sample. 5 μg of each quality‐checked RNA sample was hybridized against the H. salinarum NRC‐1 reference RNA prepared under standard conditions (mid‐logarithmic phase batch cultures grown at 37°C in CM). This common reference RNA has been used across all ∼950 microarray experiments in the H. salinarum NRC‐1 microarray data repository (Bonneau et al, 2007). Samples were hybridized to a 70‐mer oligonucleotide array containing the 2400 nonredundant open reading frames (ORFs) of the H. salinarum NRC‐1 genome as described in Baliga et al (2004). Each ORF was spotted on each array in quadruplicate and dye flipping was conducted (to rule out bias in dye incorporation) for all samples, yielding eight technical replicates per gene per time point. At least two independent biological replicates exist for all experimental conditions for a total of 16 replicates per gene per condition. Direct RNA or DNA (TFBS location arrays, see below) labeling, slide hybridization, and washing protocols were performed as described earlier (Facciotti et al, 2007; Schmid et al, 2007). 
Raw intensity signals from each slide were processed by the SBEAMS‐microarray pipeline (Marzolf et al, 2006) (www.SBEAMS.org/microarray), in which resultant data were median normalized and subjected to significance analysis of microarrays (SAM) and variability and error estimates (VERA) analysis. Each data point was assigned a significance statistic, λ, using maximum likelihood (Ideker et al, 2000).
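For readers unfamiliar with median normalization, the sketch below shows one common form of it in Python: log2 sample‐to‐reference ratios are shifted so that their median is zero. The function name and the intensity values are hypothetical, and the SBEAMS pipeline may implement the step differently.

```python
import math
import statistics

def median_normalize(sample_intensities, reference_intensities):
    """Return log2(sample/reference) ratios centered so that their median is zero."""
    ratios = [
        math.log2(s / r)
        for s, r in zip(sample_intensities, reference_intensities)
    ]
    center = statistics.median(ratios)
    return [x - center for x in ratios]

# Hypothetical spot intensities for five features on one slide.
sample = [200.0, 400.0, 800.0, 1600.0, 100.0]
reference = [100.0, 100.0, 100.0, 100.0, 100.0]
normalized = median_normalize(sample, reference)
```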
Microarray data were analyzed using the TM4 MultiExperiment Viewer (MeV) application (http://www.tm4.org/) within the Gaggle data analysis environment (Shannon et al, 2006). Specifically, all 2400 genes across mutant and wild‐type microarray experiments, described above, were subjected to three independent analyses: significance analysis of microarrays (SAM), KMEANS clustering, and hierarchical clustering. Resultant clusters of genes in the union of all three analyses that displayed significant differential transcription between the parent and mutant strains in the presence and/or absence of glucose were considered to be VNG1451C dependent. Biological replicates were considered independently to ensure statistical rigor.
TFBS location array analysis
ChIP of VNG1451C‐cmyc‐tagged constructs was performed as described (Ren et al, 2000; Facciotti et al, 2007) in cultures grown in the presence or absence of glucose or glycerol. Resultant TFBS location data were analyzed for statistically significant enrichment of features in the ChIP‐chip sample versus the unenriched sample using MeDiChI, a regression‐based deconvolution algorithm (Reiss et al, 2008). Enrichment lists from each of the five independent MeDiChI runs were combined into a density algorithm (Koide et al, 2009) to find TFBS locations overrepresented in the data. To be considered as part of the final TFBS enrichment list (Supplementary Table 4), we required that each enrichment peak from the density output be composed of at least two biological replicate peaks with a combined MeDiChI P‐value <0.001 (product of replicate P‐values). A peak was considered to be ‘intergenic’ if it fell within 250 bp of a transcription start site or termination site (a conservative estimate of the resolution of the data from MeDiChI‐derived binding sites; Reiss et al, 2008). Subsequently, the resultant binding sites from the combined dataset were compared with orthogonal datasets (i.e. genome‐wide mRNA expression and binding motif searches, details below).
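The filtering criteria above (at least two replicate peaks, a product of replicate P‐values below 0.001, and a 250 bp window around start or termination sites) can be sketched in Python as follows. The function names, peak positions, P‐values, and site coordinates are all hypothetical; the sketch does not reproduce MeDiChI or the density algorithm.

```python
import math

def passes_filter(replicate_p_values, max_combined_p=0.001, min_replicates=2):
    """Require >= min_replicates peaks and a product of P-values < max_combined_p."""
    if len(replicate_p_values) < min_replicates:
        return False
    return math.prod(replicate_p_values) < max_combined_p

def near_gene_boundary(peak_pos, boundary_positions, window_bp=250):
    """True if the peak lies within window_bp of any start or termination site."""
    return any(abs(peak_pos - b) <= window_bp for b in boundary_positions)

# Hypothetical peaks: genomic position -> P-values of the replicate peaks.
peaks = {
    10_500: [0.01, 0.02],  # two replicates, product 2e-4: kept
    52_300: [0.0005],      # only one replicate: rejected
    88_000: [0.05, 0.04],  # product 2e-3: rejected
}
boundaries = [10_400, 60_000]  # hypothetical start/termination coordinates

kept = sorted(pos for pos, ps in peaks.items() if passes_filter(ps))
intergenic = [pos for pos in kept if near_gene_boundary(pos, boundaries)]
```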
To analyze the TrmB TFBS location data in the context of the GTF TFBS data (Facciotti et al, 2007), the distance of the genomic position for each high‐resolution GTF‐binding site (Reiss et al, 2008) to that of TrmB (GTF coordinate − TrmB coordinate = relative position) at each target promoter was calculated. The Pearson correlation between this distance and the mRNA expression data in ΔtrmB for the gene of interest was then calculated. The P‐values for these correlations reported in the text were calculated based on 100 000‐fold resamplings of the data.
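A resampling‐based P‐value for a Pearson correlation can be sketched as below. The text does not specify the resampling scheme, so this example assumes a permutation test in which the expression values are shuffled relative to the positions; the function names and the data are hypothetical.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient; inputs must not be constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def permutation_p_value(xs, ys, n_resamples=100_000, seed=0):
    """Two-sided P-value: fraction of shuffles with |r| >= the observed |r|."""
    rng = random.Random(seed)
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(n_resamples):
        rng.shuffle(shuffled)
        if abs(pearson_r(xs, shuffled)) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_resamples + 1)

# Hypothetical relative positions (bp) and log expression ratios.
relative_positions = [-40, -25, -10, 0, 15, 30, 45, 60]
expression_ratios = [-1.8, -1.1, -0.7, 0.1, 0.4, 1.2, 1.5, 2.3]
example_r = pearson_r(relative_positions, expression_ratios)
# 2000 shuffles keep the example fast; the text reports 100 000 resamplings.
example_p = permutation_p_value(relative_positions, expression_ratios, n_resamples=2000)
```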
DNA‐binding motif searching
To find the consensus binding motif for VNG1451C, the sequence search space was limited for each putative promoter region enriched in the TFBS location data through several constraints: (i) sequence ±250 bp from the center of each MeDiChI‐based peak; (ii) sequence ±20 bp from the putative transcription start site (Koide et al, 2009); (iii) annotated translation start site location (Ng et al, 2000). Resultant sequences were used as input for three independent motif‐finding algorithms: (i) Bioprospector (http://ai.stanford.edu/~xsliu/BioProspector/), which finds gapped motifs in query sequences (Liu et al, 2001); (ii) MEME/MAST (http://meme.sdsc.edu/meme/) (Bailey and Gribskov, 1998; Bailey et al, 2006); and (iii) RSAT pattern finding (http://rsat.ulb.ac.be/rsat/). Only motifs represented in the intersection of all three algorithm outputs were considered in further analysis. To generate the P‐value for each motif, the three algorithms were re‐run on randomized query sequences. Results were compared with algorithm outputs from original sequences using the Wilcoxon test (Frith et al, 2008). One motif had a statistically significant P‐value (P=2.0 × 10^−5), and the resultant consensus motif described in the text was generated using weblogo.berkeley.edu/logo.cgi. To scan the remainder of the genome for the resultant motif, we used the pattern‐finding program at rsat.ulb.ac.be/rsat/ with the following parameters: (i) no more than 1 bp away from the consensus motif; (ii) unbiased for genomic position (i.e. coding and noncoding sequences were searched); (iii) containing a 7 or 8 bp gap within the motif; (iv) located on the chromosome (as no TrmB hits were found on either pNRC100 or pNRC200).
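The genome scan parameters listed above (at most 1 bp away from the consensus, with a 7 or 8 bp gap) can be illustrated with a short Python sketch. The half‐site strings are placeholders chosen for illustration, the function names are hypothetical, only the forward strand is scanned, and the RSAT program itself is not reproduced.

```python
def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(1 for x, y in zip(a, b) if x != y)

def scan_gapped_motif(sequence, left, right, gaps=(7, 8), max_mismatches=1):
    """Return sorted (start, gap, mismatches) for windows matching left-N(gap)-right.

    Only the forward strand is scanned; mismatches are counted across both
    half-sites combined, and the gap positions are unconstrained.
    """
    sequence = sequence.upper()
    hits = []
    for gap in gaps:
        span = len(left) + gap + len(right)
        for start in range(len(sequence) - span + 1):
            left_part = sequence[start:start + len(left)]
            right_start = start + len(left) + gap
            right_part = sequence[right_start:right_start + len(right)]
            mismatches = hamming(left_part, left) + hamming(right_part, right)
            if mismatches <= max_mismatches:
                hits.append((start, gap, mismatches))
    return sorted(hits)

# Placeholder half-sites (illustrative assumption, not taken from this section).
LEFT, RIGHT = "TACT", "GAGTA"
# Toy sequence: one exact site with a 7 bp gap, one 1-mismatch site with an 8 bp gap.
toy = "GGC" + "TACT" + "A" * 7 + "GAGTA" + "CCG" + "TACA" + "C" * 8 + "GAGTA" + "GG"
toy_hits = scan_gapped_motif(toy, LEFT, RIGHT)
```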
GFP promoter: reporter fusion validation experiments
The ppsA p1+p2 construct contains both putative TrmB‐binding sites in a 115‐bp fragment upstream of the translation start site of the ppsA (VNG0330G) gene (Figure 4D). The ppsA p2 construct is an 88‐bp truncated version of the p1+p2 construct and lacks the promoter‐distal TrmB‐binding site (Figure 4D). These promoter fragments were fused to a red‐shifted GFP variant optimized to function in haloarchaea, which was adapted from Reuter and Maupin‐Furlow (2004). See Supplementary information for details on strain construction. Constructs were transformed into the Δura3 parent strain and the ΔtrmB deletion mutant. Resultant strains were grown in CM in the presence or absence of 7% glucose to mid‐logarithmic phase (OD600∼0.4–0.8). Cultures were diluted to OD600=0.2, fixed in 0.25% formaldehyde dissolved in basal salts (CM lacking peptone) for 10 min at 4°C, and subsequently washed in basal salts to remove fixative. Fluorescence of fixed cells was measured by flow cytometry on a FACS‐Calibur instrument (Becton Dickinson, San Jose, CA) in the presence of 1 μm fluorescent beads (Polysciences, Warrington, PA) spiked in at a concentration of 4 × 10^8 beads/ml. A negative control strain carrying the empty GFP vector with no promoter insert was treated identically to gauge background fluorescence levels (black bar in Figure 4C). Resultant data were analyzed using Flow‐Jo software (Tree Star, Inc., Ashland, OR). The average absolute fluorescent cell counts normalized to bead counts from three biological replicate experiments ± s.d. are shown in the graph (Figure 4C).
Data integration analysis
To assess the extent of agreement between the three system‐wide datasets presented in this study (gene expression, ChIP‐chip, and motif search data), hypergeometric distribution P‐values were calculated, which reflect the likelihood that the intersection of any two of these three datasets is due to chance. Specifically, we calculated the significance of (i) the number of genes within 250 bp of both the ChIP‐chip hits and binding motif sequences (one extra bp in motif degeneracy was allowed for a few of the genes in Figure 4A that were near ChIP‐chip hits and showed a highly significant change in the microarray data); (ii) the number of genes whose expression changed in the trmB mutant, which were also within 250 bp of a ChIP‐chip hit; and (iii) the number of genes changing in the transcriptome data, which contained a motif within 250 bp of their transcription start site (Figure 4B).
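The hypergeometric calculation described above can be sketched in Python as the upper tail probability of observing at least a given overlap between two gene sets drawn from the same genome. The function name and the set sizes in the example are hypothetical; only the genome size of 2400 ORFs is taken from the text.

```python
from math import comb

def hypergeom_overlap_p(population, size_a, size_b, overlap):
    """P(X >= overlap) for the overlap of two gene sets under random sampling.

    population: total number of genes considered
    size_a, size_b: sizes of the two gene sets
    overlap: observed number of genes shared by both sets
    """
    upper = min(size_a, size_b)
    total = comb(population, size_b)
    tail = sum(
        comb(size_a, k) * comb(population - size_a, size_b - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total

# Hypothetical counts: 120 genes in one dataset, 150 in another, 40 shared,
# out of the 2400 ORFs represented on the array.
example_overlap_p = hypergeom_overlap_p(2400, 120, 150, 40)
```

With these hypothetical counts, roughly 7.5 shared genes would be expected by chance (120 × 150 / 2400), so an overlap of 40 gives a very small tail probability.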
Detailed annotation analysis of the 37 genes in the intersection of the three high‐throughput datasets (Figure 4B) was conducted using protein functional data from online databases (http://baliga.systemsbiology.net/halobacterium; KEGG, GO, STRING, HaloLex) (Bonneau et al, 2004; Bare et al, 2007; Pfeiffer et al, 2008; Jensen et al, 2009). To build the metabolic network governed by TrmB, a four‐step bioinformatic process was conducted according to the flowchart shown in Supplementary Figure 3.
All ChIP‐chip and gene expression array data presented in this study are available at the National Center for Biotechnology Information Gene Expression Omnibus (NCBI GEO) under the accessions GSE13531, GSE13529, and GSE13498.
We are indebted to Ludmila Chistoserdova and Monica Orellana for their critical reading of the paper, Kenia Whitehead for useful discussions, Christopher Bare for software support, and Lee Pang and Noel Blake for assistance with the FACS analysis. This work was supported by grants from NIH (P50GM076547 and 1R01GM077398‐01A2), DoE (MAGGIE: DE‐FG02‐07ER64327), NSF (DBI‐0640950) to NSB, and from NIH (5F32GM078980‐02) to AKS.
Conflict of interest
The authors declare that they have no conflict of interest.
This file contains supplementary results, methods, and figures S1–3 [msb200940-sup-0001.pdf]
Supplementary Table 1. Halobacterium salinarum NRC‐1 complete defined synthetic medium (CDM) [msb200940-sup-0002.xls]
Supplementary Table 2. High throughput growth data for VNG1451C (trmB) deletion mutant compared to parent strain(s) [msb200940-sup-0003.xls]
Supplementary Table 3. Microarray gene expression data. [msb200940-sup-0004.xls]
Supplementary Table 4. Direct targets of TrmB: all 113 ChIP‐chip hits. [msb200940-sup-0005.xls]
Supplementary Table 5. TrmB binding motifs located throughout the H. salinarum NRC‐1 genome. [msb200940-sup-0006.xls]
Supplementary Table 6. Details regarding metabolic reactions depicted in Figure 5. [msb200940-sup-0007.xls]
This is an open access article under the terms of the Creative Commons Attribution‐NonCommercial‐NoDerivs License, which permits use and distribution in any medium, provided the original work is properly cited, the use is non‐commercial and no modifications or adaptations are made.
Copyright © 2009 EMBO and Nature Publishing Group