The EGFR‐driven cell‐cycle pathway has been extensively studied due to its pivotal role in breast cancer proliferation and pathogenesis. Although several studies reported regulation of individual pathway components by microRNAs (miRNAs), little is known about how miRNAs coordinate the EGFR protein network on a global miRNA (miRNome) level. Here, we combined a large‐scale miRNA screening approach with a high‐throughput proteomic readout and network‐based data analysis to identify which miRNAs are involved, and to uncover potential regulatory patterns. Our results indicated that the regulation of proteins by miRNAs is dominated by the nucleotide matching mechanism between seed sequences of the miRNAs and 3′‐UTR of target genes. Furthermore, the novel network‐analysis methodology we developed implied the existence of consistent intrinsic regulatory patterns where miRNAs simultaneously co‐regulate several proteins acting in the same functional module. Finally, our approach led us to identify and validate three miRNAs (miR‐124, miR‐147 and miR‐193a‐3p) as novel tumor suppressors that co‐target EGFR‐driven cell‐cycle network proteins and inhibit cell‐cycle progression and proliferation in breast cancer.
A genome‐wide microRNA (miRNome) screen coupled with high‐throughput monitoring of protein levels reveals complex, modular miRNA regulation of the EGFR‐driven cell‐cycle network, and identifies new miRNAs that can suppress breast cancer cell proliferation.
We interrogated, for the first time, a mammalian oncogenic signaling network with the miRNome and report the outputs at the protein level.
Whole‐genome microRNA (miRNA) effects on a given protein are generally mild, supporting a fine‐tuning role for miRNAs, and these effects are dominated by sequence‐matching mechanisms.
We developed a novel network‐analysis methodology with a bipartite graph model to identify proteins co‐regulated by miRNAs. Besides the sequence‐based mechanism, our results demonstrated that miRNAs simultaneously regulate several proteins belonging to the same functional module.
We identified three miRNAs, miR‐124, miR‐147 and miR‐193a‐3p, as novel tumor suppressors that co‐regulate EGFR‐driven cell‐cycle network proteins, and inhibit cell‐cycle progression and proliferation in breast cancer.
Our results demonstrate the potential to steer miRNA research toward the network level, underlining the need for systematic approaches before positioning miRNAs as drugs or drug targets.
Signaling pathways are prime candidates for regulation by microRNAs (miRNAs) due to their dose‐sensitive nature and the fine‐tuning role of miRNAs (Inui et al, 2010). Indeed, EGFR signaling and cell‐cycle components are known to be regulated by miRNAs in the cancer context (Bueno et al, 2008; Barker et al, 2010). EGFR itself is regulated by miR‐7, which can induce cell‐cycle arrest and cell death in cancer cell lines (Webster et al, 2009). In addition, let‐7 regulates the oncogene Ras, which is the activator of the MAPK pathway upon EGFR activation (Johnson et al, 2005). Similarly, several cell‐cycle genes have been identified as direct targets of miRNAs, including the miR‐16 family and miR‐17/20 that target Cyclin D1 and have important roles in regulating G1 arrest and S‐phase entry (Liu et al, 2008; Yu et al, 2008). A systematic analysis, however, of miRNAs regulating the EGFR pathway and downstream cell‐cycle proteins has not been performed.
The identification of targets of individual miRNAs is commonly achieved by combining transient transfection of miRNA mimics or inhibitors, target prediction algorithms and gene expression profiling (Wellner et al, 2009; Uhlmann et al, 2010). This approach has two main drawbacks: first, it only captures regulatory effects at the transcriptional level, which might deviate from the outputs at the protein level; second, it cannot distinguish direct and indirect effects induced by miRNAs. A more direct, biochemical approach to identify targets of miRNAs is RISC‐IP, where miRNA and target mRNAs are co‐immunoprecipitated with Ago2 (component of RISC) and then analyzed either by array‐based hybridization (Karginov et al, 2007) or by deep sequencing (Hanina et al, 2010). This approach has identified several genes that are regulated both by mRNA degradation and by translational repression (Thomson et al, 2011), and some rare cases of 5′‐UTR targeting (Grey et al, 2010).
Even with the knowledge of direct targeting, measurements of miRNA effects directly at the protein level are needed to fully comprehend the complexity of their activities. Recent advances in high‐throughput proteomics have allowed researchers to begin addressing this issue. For example, two reports used quantitative mass spectrometry to assess changes in protein levels after overexpressing or knocking down selected miRNAs (Baek et al, 2008; Selbach et al, 2008). Both reached a similar conclusion: the miRNAs they studied affected the expression of hundreds of proteins, but in a rather mild manner, fine‐tuning protein synthesis. Another study used a combination of a luciferase‐based 3′‐UTR reporter assay and label‐free shotgun proteomics to identify targets of a liver‐specific miRNA, miR‐122, revealing a network of miR‐122‐regulated genes that was enriched for proliferation, cell cycle and apoptosis, all of which are relevant to liver metabolism, liver diseases and cancer‐related processes (Boutz et al, 2011). Researchers have also employed protein lysate arrays, which allow higher throughput compared with mass spectrometry methods, in one case screening >300 miRNAs for regulatory effects on the expression of the estrogen receptor alpha (ERα) protein (Leivonen et al, 2009).
Despite these advances, understanding how miRNAs regulate an oncogenic signaling system requires that we simultaneously investigate the interactions between a body of miRNAs and a network of proteins. In this work, we aimed to systematically identify miRNAs that regulate the EGFR pathway on the protein level, and to unravel principles of these regulations. To this end, we screened a genome‐wide library of miRNA mimics for effects on the expression of 26 proteins in the EGFR‐driven cell‐cycle pathway using reverse phase protein arrays (RPPAs). This approach identified a series of miRNAs that downregulate one or more proteins of the EGFR‐driven cell‐cycle pathway. For most of the proteins with reduced expression levels, sequence‐based prediction of targeting miRNAs can explain a significant portion of the regulation. In order to uncover the organizational principles of a mild, yet potentially very dense regulatory network, we developed a novel network‐based analysis approach to identify those miRNAs that regulate multiple proteins in the studied system. Network analysis implies that the proteins controlling EGFR‐driven G1/S transition are co‐regulated by several miRNAs. This principle is supported by the discovery of three novel tumor‐suppressor miRNAs that regulate the EGFR‐driven cell‐cycle pathway.
miRNome screen to unravel the miRNA–protein interaction network of the EGFR pathway
To elucidate how the EGFR‐driven cell‐cycle protein network is regulated at the genome‐wide miRNA level, we performed a gain‐of‐function screen using a mimic library containing 810 mature miRNAs (miRBase version 10.0). As readout, we measured and quantified the abundances of 26 proteins in this pathway. We employed the MDA‐MB‐231 breast cancer cell line as our experimental system, where a large body of known human miRNAs is expressed (429 detected with microarray, and 598 by next‐generation sequencing data, see Supplementary information, Supplementary Tables S1 and S2 and Supplementary Figure S1 for details). Cells were transfected using an automated robotics system with individual miRNA mimics, and total protein lysates in miRNA overexpressing cells were spotted on RPPAs. Next, we incubated the protein arrays with thoroughly validated antibodies in an automated pipeline (Figure 1A).
We chose the proteins to be examined from the EGFR signaling/cell‐cycle network (Table I) based on the two criteria. First, the expression of the gene must be detectable in the given cell line. We analyzed published RNA sequencing data (Sun et al, 2011) and chose those genes with at least one transcript detectable in the MDA‐MB‐231 cell line (Supplementary Table S3). Second, a validated antibody for the RPPAs must be available. To determine the specificity and sensitivity of the antibodies, we validated each antibody using the RNAi‐based antibody validation method that we have previously published (Mannsperger et al, 2010). Knockdowns with siRNAs resulted in strong reductions of targeted proteins, confirming the antibody specificity/sensitivity for all proteins analyzed (Figure 1B, raw data provided in Supplementary Table S4).
We performed the miRNome screen with two biological replicates, which showed high correlations with an average Pearson's correlation coefficient of 0.78 (Figure 1C; Supplementary Figure S2). We included miR‐Control (scramble), known EGFR‐targeting microRNA miR‐7 (Webster et al, 2009), si‐Control (scramble) and si‐EGFR in all 34 screening runs (each run involved transfection of 24 different miRNA mimics). These quality control measures assured the robustness of the screen (Supplementary Figure S3). We present the global effect of whole‐genome miRNAs on the EGFR/cell‐cycle network proteins, summarized as a matrix of 810 × 26=21 060 data points, as a heatmap in Figure 1D (the screen results are provided in Supplementary Table S5). Overall, this experimental platform combining a robust whole‐genome miRNA screen with quantitative proteomics allowed us to identify miRNA–protein regulations in the MDA‐MB‐231 breast cancer cell system.
Fine‐tuning of EGFR pathway proteins by miRNAs and the resulting miRNA–protein interaction network
To build an miRNome–protein interaction network, we first identified miRNA–protein pairs where the regulation was statistically significant. For each protein, we used the z‐score method to quantify the effects of all miRNAs on its expression (see Materials and methods). While a positive z‐score suggests upregulation of protein expression, a negative value indicated a downregulation upon miRNA expression. For most proteins analyzed, the global effect of miRNAs followed a normal distribution with relatively short tails. This pattern suggests that the effects of miRNAs are rather mild, thereby fine‐tuning the protein expression (in Figure 2A, PLCG1 is shown as an example. Histograms for all other proteins are given in Supplementary Figure S4).
To better characterize the properties of the underlying miRNA–protein regulation network, we tested different significance thresholds and chose two commonly used ones, P<0.05 and P<0.001 (equivalent to ∣z∣>1.96 and ∣z∣>3.29, see Materials and methods for details). We observed that networks of different significance thresholds show distinct patterns with respect to the number of remaining edges (Figure 2B). The resulting miRNA–protein regulation network is very dense at the lower threshold (∣z∣>1.96), as can be seen in Figure 2C. This high density is not entirely unexpected, since (1) it has been computationally predicted that one miRNA can regulate several genes, (2) one gene can be regulated by dozens of miRNAs and (3) indirect regulations are possible. Our data are well in line with these hypotheses, and provide the first large‐scale experimental evidence, to the best of our knowledge, at the miRNome level. The number of links decreased sharply with increasing stringency (Figure 2B) and we observed a relatively sparse graph with a low number of regulations when applying a high significance threshold (∣z∣>3.29) (Figure 2D). These observations reveal potential challenges in understanding the complex network: if the significance threshold is set too low, regulatory relationships are complicated and are noisy; on the other hand, if it is set too high, too few edges will remain to reflect structural features of the network.
To overcome this dilemma, one possibility is to combine the low‐stringency network with prior knowledge of miRNA‐direct target predictions. In order to construct such a network, one can use pre‐compiled miRNA‐mRNA sequence mappings provided by the microRNA.org database (http://www.microrna.org/microrna/getDownloads.do, Release August 2010) by choosing different stringencies of the sequence‐matching requirements (seed sequence mapping). We set the stringency filter so that all target predictions having a 6‐mer or more seed sequence pairing are included. To construct the subnetwork, we filtered the whole network by keeping miRNA–protein edges that (1) has an absolute z‐score larger than 1.96 and (2) the miRNA is mapped to gene's 3′‐UTR with at least one 6‐mer match or more. This approach led to 241 miRNAs and 25 genes (PIK3CB was not a predicted miRNA target), with 355 edges connecting them (Supplementary Figure S5; Supplementary Table S6). We could further reduce the complexity of the resulting network by using an evolutionary conserved sequence‐matching filter. To this end, we compared the low‐stringency network obtained from experimental data with miRNA‐direct target predictions made by TargetScan (Friedman et al, 2009). We only kept those miRNA–protein links where (1) the miRNA significantly downregulates the protein with ∣z∣>1.96 and (2) the miRNA is predicted to directly target the 3′‐UTR of the gene in an evolutionarily conserved manner. The resulting network, supported by both computational predictions and experimental evidence, ended up with 120 potential miRNA/protein target interactions (Figure 3; Supplementary Table S7).
In order to find out how many interactions in this network were validated by previously published independent studies, we manually searched the miRWalk database (http://www.ma.uni‐heidelberg.de/apps/zmf/mirwalk/) for predicted and validated miRNA targets. Indeed, some of the miRNA direct targeting interactions had been validated in other studies (16 validated interactions), e.g., the miR‐143/KRAS (Chen et al, 2009), miR‐200b/c/429/PLCG1 (Uhlmann et al, 2010) and miR‐7/EGFR (Webster et al, 2009). However, our results identified many more new potential direct regulations. These results suggest that our experimental platform, aided by computational predictions to filter results, can indeed identify miRNA–protein interactions, even if their effects are merely moderate.
Seed/3′‐UTR matches dominate the negative regulation of protein expression by miRNAs
To test whether the regulation by miRNAs is dominated by the nucleotide matching mechanism between the miRNA seed sequences and 3′‐UTRs of target genes, we assessed whether protein expression differs in the presence of predicted targeting miRNAs compared with non‐targeting ones for each protein in the network. For this purpose, we used two distinct target prediction algorithms (TargetScan, Friedman et al, 2009; and MicroCosm, Griffiths‐Jones et al, 2008; see Bartel, 2009 for detailed discussions on the prediction algorithms) to compare predicted versus non‐predicted miRNAs for 26 proteins in our network. We applied the Kolmogorov–Smirnov (K–S) test, a non‐parametric, threshold‐free statistical method (Smirnov, 1948), to assess whether distributions of z‐scores of the two groups differ significantly (or three groups, in case of TargetScan with conserved and non‐conserved predicted targets). For most of the proteins with predicted targeting miRNAs, we observed a significant negative aberration of expression distributions (18 proteins out of 25, with two‐sided P<0.05). This suggests that miRNAs, which are predicted to be direct targeting, are indeed more likely to negatively regulate the protein expression level. A primary motif to mediate this reduction was the match between the miRNA seed sequence and the 3′‐UTR of the target gene (Figure 4).
This hypothesis has been supported by other studies in cases of single miRNAs at both mRNA and protein level (Baek et al, 2008; Selbach et al, 2008; Boutz et al, 2011). However, here we report the systematic expression shift caused by predicted miRNAs in dozens of proteins from the EGFR pathway at the miRNome level. We note that for some of the proteins, the shift caused by conserved seeds was more profound compared with the non‐conserved ones. This potentially highlights the role of species conservation in miRNA targeting. Interestingly, the expression aberration was not significant for the rest of the proteins. We assume this could partially be caused by a lack of evolutionary conservation of seed‐matching sequences, and consequent lack of accurate predictions. Overall, changes in the protein expression levels upon miRNA overexpression suggest that computationally predicted miRNAs are more likely to reduce expression of their targets, when compared with non‐predicted miRNAs.
MiRNA co‐regulation of proteins in the same functional modules
Having an integrated miRNA–protein regulation network is helpful to identify new miRNA targets. However, it does not address whether the regulation of protein expression by miRNAs follows any other rules besides the seed‐matching mechanism. Moreover, challenges emerge in maximizing the signal‐to‐noise ratio for discoveries from the regulatory network, calling for new strategies to identify patterns of miRNAs regulating the EGFR‐driven cell‐cycle pathway.
In order to explicitly address these questions, we developed a novel framework in the context of graph theory and complex network analysis. First, we evaluated whether any two nodes (here: proteins) on one side of a given bipartite network would share a significant number of neighbors (here: miRNAs) on the other side. Based on the z‐scores from the RPPA data, a bipartite graph containing all proteins and their significantly regulating miRNAs was built. This bipartite graph allowed for discovering statistically significant relationship patterns. Our focus was to find pairs of proteins that were either consistently co‐upregulated or co‐downregulated by a group of miRNAs. Furthermore, we were interested in two proteins where one was consistently upregulated and the other was downregulated by one or more miRNAs. We define any of these three co‐regulation patterns between proteins A and B to be intrinsic, if the observed pattern occurred statistically significantly more often than in a random model of the same bipartite graph (the more‐than‐random model). In essence, our method employed a bootstrapping approach in which the data were permutated by maintaining some of the properties (Figure 5A, see Materials and methods for details).
To this end and in order to detect the principles of miRNA regulations as well as potential co‐regulation patterns, we built a consensus graph. This graph combined the information from different bipartite graphs by varying parameters in the network model (Figure 5B, see Materials and methods for details). Two proteins are connected if and only if they were significantly co‐regulated by one or more miRNAs under all of the chosen parameter values. The upper bound FDR of the consensus graph was q<0.132 determined with the Benjamini‐Hochberg method (see Supplementary information and Supplementary Figure S6 for details).
We observed that most edges between cell‐cycle proteins were present in the consensus graph, indicating that their potential co‐regulation by given miRNAs is robust. Furthermore, the miRNAs co‐regulating the cell‐cycle proteins showed an interesting pattern that is biologically relevant: they upregulated the expression of the CDK inhibitor protein p27/Kip1, while downregulating the Cyclin‐dependent kinase‐4 (CDK4) and RB1 which would, in turn, inhibit cell‐cycle progression (Figure 5B). Of note, PIK3CA, AKT2 and PLCG1 were co‐downregulated with retinoblastoma protein RB1 (for PIK3CA and AKT2) and CDK4 (for PLCG1) by dozens of miRNAs. This observation might be linked to the key roles these proteins likely have in the regulation of the G1/S transition in the MDA‐MB‐231 cell line model. Overall, these results indicate that proteins, which fall into the same known functional modules (here: EGFR‐driven cell‐cycle module), are co‐regulated by miRNAs.
MiRNAs 124, 147 and 193a‐3p inhibit G1/S transition and cell proliferation by targeting EGFR signaling and/or cell‐cycle proteins
The network‐analysis approach has the power to predict the outcome of overexpressing individual miRNAs in the context of the EGFR‐driven cell cycle. For example, miR‐124, 147 and 193a‐3p were among the miRNAs with highest degrees in the consensus graph (Figure 6A, source data in Supplementary Table S8). Furthermore, the expression of several key proteins controlling G1/S transition was regulated in a tightly coordinated manner by these miRNAs (Figure 6B). Therefore, our analysis predicted that they should, when overexpressed, alter cell‐cycle progression patterns by inhibiting the G1/S transition. We performed molecular and cellular assays to confirm the validity of these predictions.
First, we validated all regulations shown in Figure 6B for all three miRNAs with western blots (Figure 6C). We also validated the overexpression of miR‐124, miR‐147 and miR‐193a‐3p after transfection with miRNA mimics by qRT–PCR (Supplementary Figure S7). All three miRNAs co‐regulated cell‐cycle components either by reducing RB1 level or by increasing p27/Kip1 level or affecting both (Figure 6B). For miR‐124, we observed a core regulation of seven EGFR protein network components. The PTEN tumor suppressor had the highest number of connections with other proteins and was co‐upregulated with p27/Kip1. In case of miR‐147, AKT2 was co‐downregulated with CDK4 and RB1, indicating the potential impact of AKT2 on cell‐cycle progression. Finally, for miR‐193a‐3p, PIK3CA was co‐downregulated with CDK4 and Cyclin D1 as well as with MIG‐6.
We then tested the prediction that all three miRNAs inhibit EGFR‐driven G1/S transition by targeting several components simultaneously at the phenotypic level. By staining cells with BrdU (for S‐phase fraction) and 7‐AAD (for total DNA content), we could show that all three miRNAs induced a very strong G1 arrest in the breast cancer cell line MDA‐MB‐231 (Figure 7A). Furthermore, viability of the cells was reduced after 72 and 96 h upon individual overexpression of these miRNAs (Figure 7B).
In order to test whether or not the regulation of proteins by miRNAs resulted from direct targeting, we cloned the 3′‐UTRs of the downregulated genes into a luciferase reporter vector, and performed luciferase reporter assays. For genes with long and difficult to clone 3′‐UTRs, we fragmented the 3′‐UTR into two or three parts (denoted as P1, P2 and P3). AKT2, p38 MAPK and STAT3 were predicted to be conserved targets of miR‐124 (Supplementary Figure S8A), and were also identified as potential direct targets in our study (Figure 7C). Although CDK4, CDK2 and SHC1 were poorly conserved predicted targets of miR‐124 (Supplementary Figure S8A), they were not identified as direct targets. RB1 was neither predicted nor found to be a direct target. These data demonstrate that the inhibitory effect of miR‐124 on cell‐cycle progression could be, at least in part, via targeting EGFR core proteins that were analyzed in this study. In the case of miR‐147, AKT2 and Cyclin D1 were predicted to be poorly conserved direct targets (Supplementary Figure S8B), and identified here as direct targets. These data suggest that targeting of AKT2 and Cyclin D1 by miR‐147 contributes to its strong inhibitory effect on G1/S transition (Figure 7C). Finally, JNK1 was identified as a direct target of miR‐193a‐3p, and could potentially contribute to the strong inhibitory effect miR‐193a‐3p had on cell‐cycle progression (Figure 7C).
We further tested the effects of miRNAs on mRNA expression of their targets to examine if the targeting was mediated via mRNA cleavage or translational repression (Figure 7D). We found that three of six miRNA/target interactions (miR‐124/AKT2, miR‐147/AKT2 and miR‐193a‐3p/JNK1) reduced protein expression of their target genes without significant changes at the mRNA level. These data consolidate the concept that the analysis of miRNA effects at the protein level is more informative and comprehensive than that only at the mRNA level.
Finally, in order to rule out that the observed results are cell line specific, but rather can be generalized in breast and other cancer types, we validated our results in another breast cancer cell line, MCF‐7, as well as in the U87 glioblastoma cell line. The latter represents another tumor type where EGFR signaling has a critical role in the pathogenesis (Hatanpaa et al, 2010). Similarly to the results we had obtained in the MDA‐MB‐231 cell line, all three miRNAs (miR‐124, miR‐147 and miR‐193a‐3p) led to a reduction of the cell population in S‐phase and also in reduced cell viability (Figure 7E and F). Furthermore, we recapitulated all the direct targeting relationships in these two cell lines (Supplementary Figure S9A). The only difference which we observed was that targeting of AKT2 by miR‐124 or miR‐147 was via translational repression in MDA‐MB‐231 cell line while it was via mRNA cleavage in MCF‐7 (in case of miR‐147) and in U87 (in case of both miR‐124 and miR‐147) (Supplementary Figure S9B).
Overall, our results indicate that the three miRNAs predicted to interfere with the cell‐cycle subnetwork indeed block the G1/S transition of cancer cell lines by regulating multiple proteins of the EGFR/cell‐cycle network simultaneously. Therefore, we propose these miRNAs as potential tumor suppressors in breast cancer. Furthermore, their discovery underscores the value of the platform, which combines high‐throughput proteomics and a network‐analysis framework, in identifying regulatory patterns in a complex biological network.
In this first proteomics screen investigating a mammalian oncogenic signaling network at the miRNome level, we systematically identified miRNAs that regulate the expression of 26 proteins in the EGFR‐driven cell‐cycle pathway, and developed a novel framework to unravel principles of these regulations. Our results indicate that a robust combination of miRNA screening with a proteomic readout and network‐analysis tools helps to identify miRNAs regulating a given protein network and to discover (co‐)regulation patterns of the proteins.
Our study led us to three key conclusions: (1) whole‐genome miRNA effects on a given protein are generally mild supporting a fine‐tuning role for miRNAs, and these effects are dominated by sequence‐matching mechanisms. (2) A new network‐analysis methodology identified co‐regulated protein targets. Our results suggest that miRNAs simultaneously regulate several proteins belonging to the same functional module. (3) Three miRNAs (miR‐124, miR‐147 and miR‐193a‐3p) act as novel tumor suppressors by co‐regulating EGFR‐driven cell‐cycle network proteins, thus qualifying as potential targets for breast cancer therapy.
As in‐silico predictions for miRNA targets rely on Watson–Crick base pairing between the miRNA seed region and the 3′‐UTR of target genes, we tested this hypothesis at the genome‐wide miRNA level for 25 of the 26 proteins analyzed in this study. PIK3CB had to be excluded from this analysis as its 3′‐UTR was not represented in the target prediction databases used. Our data showed a systematic expression shift caused by predicted miRNAs in dozens of proteins from the EGFR pathway. These observations are very much in line with two previous studies which examined the effects of a few miRNAs on the proteome (up to 5000 proteins) using mass spectrometry‐based approaches (Baek et al, 2008; Selbach et al, 2008).
Compared with these two studies, we scaled up the tested miRNAs to the whole‐genome level while scaling down the number of proteins to a single pathway. This trade‐off between the number of miRNAs and proteins tested was indispensable due to the methodologies chosen and the biological questions addressed. Protein arrays have the capacity of examining thousands of samples, but are limited by antibody availability (Wilson et al, 2010). On the other hand, mass spectrometry can analyze only a limited number of samples, but has the capacity of analyzing thousands of proteins (Choudhary and Mann, 2010). Undoubtedly, our approach complements these two studies and suggests that the 3′‐UTR‐seed match contributes to the targeting of not only a few miRNAs, but for the whole‐genome set of miRNAs. Further development in high‐throughput technologies will eventually generate a global picture by analyzing miRNome versus proteome in the future. For certain proteins, we did not observe a significant difference between expression changes caused by predicted miRNAs compared with those by non‐predicted ones. Besides a possible lack of evolutionary conservation of seed‐matching sequences other factors, e.g., shortened 3′‐UTRs due to alternative polyadenylation (Fu et al, 2011), could contribute to this observation in the analyzed cell system. This raises the necessity for further investigations.
We next asked whether miRNA targets could also be identified by combining expression correlation analysis with computational target predictions and without performing a large‐scale screen. Therefore, we tested if the miRNA–protein interaction network we obtained (Figure 3) could as well be simply inferred by performing computational target prediction based on expression correlation analyses. For this purpose, we employed two publicly available NCI60 cancer cell line panel data sets at miRNA‐mRNA (Liu et al, 2010) and miRNA–protein levels (Shankavaram et al, 2007). In‐silico prediction was indeed able to infer the potential regulatory relationships for a very small proportion of miRNA–gene or miRNA–protein pairs that we had identified with the proteomics screen approach (see Supplementary Table S9, Supplementary Figure S10 and Supplementary information for details). However, this method missed the major part of the miRNA–protein network identified in the screen. Therefore, we believe that correlation analysis combined with predictions might be useful as filters to prioritize miRNA–gene pairs for experimental validation. However, due to the high false‐positive and false‐negative rates, these computational methods cannot compensate for wet‐lab proteomics screening.
Loss‐of‐function studies of miRNA targets (e.g., by siRNAs) partially mimics, in most cases, the molecular and phenotypical observations induced by miRNA expression (Uhlmann et al, 2010). This is likely to be caused by a very complex regulation of several targets by a single miRNA or simultaneous targeting of a protein by dozens of miRNAs. Furthermore, identifying targets of miRNAs with transcript‐based array profiling is a useful approach that, however, has considerable drawbacks. As miRNAs might lead to translational inhibition without affecting the mRNA level, transcript‐level approaches can miss certain targets. This latter assumption was proven in our study as three out of six examined miRNA/target pairs downregulated their targets only at the protein level without affecting the respective mRNA levels. Thus, it would have not been possible to identify these targets with an mRNA‐based approach.
To circumvent these two problems, i.e., complexity of miRNA regulation and limitations of mRNA level study, and to get more insights into the miRNA biology, we devised a protein network level study. To this end, we combined a proteomic approach with a novel network‐analysis framework to identify those proteins that are co‐regulated by miRNAs in an intrinsic way. This approach helped us to filter noise from screen data while maintaining the ability to detect structural properties, i.e., co‐regulated proteins, from a network containing mild and noisy relationships. In this way, we identified proteins that were significantly co‐regulated by miRNAs. Several cell‐cycle components that regulate the G1/S transition check point were found to be co‐regulated. For example, while proteins driving the G1/S transition (Cyclin D1 and CDK4) were co‐downregulated with RB1 by dozens of miRNAs (35 and 31 miRNAs, respectively), Cyclin D1 and CDK4 were co‐upregulated by 19 miRNAs. Furthermore, RB1 and CDK4 were inversely co‐regulated with p27/Kip1 CDK inhibitor (30 and 19 miRNAs, respectively). These data suggest that miRNAs might rather have tumor‐suppressor than oncogenic functions. We also observed that AKT1 and ERK2 were co‐downregulated by 30 miRNAs. These miRNAs may potentially lead to the simultaneous inhibition of two key pathways regulating cell survival and proliferation, highlighting their potential to be used as drugs themselves.
Based on the analysis framework, we selected three miRNAs, miR‐124, miR‐147 and miR‐193a‐3p, for their roles in the G1/S transition. Inhibitory effects of these three miRNAs on cell proliferation were potentially mediated by targeting EGFR network proteins (e.g., AKT2, STAT3, p38 and JNK1) as well as Cyclin D1, a key G1/S transition regulating protein. Supporting the tumor‐suppressor function of miR‐124, it was reported that its expression is reduced in glioblastoma and its ectopic expression inhibits tumor migration and invasion (Fowler et al, 2011). Furthermore, miR‐124 was found to be epigenetically silenced in acute lymphoblastic leukemia (ALL; Agirre et al, 2009) and to regulate the CDK6/RB pathway by targeting CDK6 post‐transcriptionally in ALL as well as in medullablastoma (Pierson et al, 2008). Similarly, miR‐193 was epigenetically silenced in acute myeloid leukemia (AML) and targets the c‐Kit oncogene leading to apoptosis (Gao et al, 2011). The role of miR‐147 was reported to be in feedback regulation of inflammation where toll‐like receptors (TLRs) induce miR‐147 expression, which involves binding of NF‐κB and STAT1α to the miR‐147 promoter. This induction, in turn, inhibits cytokine expression and prevents excessive inflammatory responses (Liu et al, 2009).
To demonstrate that the effects of miR‐124, miR‐147 and miR‐193a‐3p on EGFR signaling have a more general nature, we tested the miRNA‐target interactions and tumor‐suppressor functions in another breast cancer as well as in a glioblastoma cell line. The results potentially support the principal tumor‐suppressor function of these miRNAs by targeting key components of the EGFR/cell‐cycle pathway. To further confirm the miRNA‐target interactions and tumor‐suppressor roles of these miRNAs, large breast cancer or other cancer type collections have to be analyzed using miRNA/mRNA profiling, deep sequencing and/or immunohistochemistry (IHC) or protein array‐based methods. The published data together with our results for breast cancer and glioblastoma cell lines indicate that these three miRNAs might have tumor‐suppressor functions by regulating different processes, e.g., cell‐cycle, invasion, and inflammation related to cancer, and could potentially be targets for cancer therapy.
In summary, we identified miRNAs that regulate the EGFR‐driven cell‐cycle pathway. Starting from experimental results and by combining computational analysis and a network‐based analytical framework, we could show a parsimonious picture of how miRNAs regulate the protein network: by sequence matching, and by intrinsic co‐regulation of proteins belonging to same functional module. Our results provide the research community with a unique resource to identify novel miRNA‐target interactions. This paves the way toward studying miRNA–protein interactions at a network level to elucidate the complex regulation of biological processes by miRNAs, which assure further detailed studies in future.
Materials and methods
The human breast cancer cell lines MDA‐MB‐231 and MCF‐7 and the glioblastoma cell line U87 were obtained from ATCC (Manassas, VA, USA). Culturing media and supplements for the MDA‐MB‐231 and MCF‐7 cells were described previously (Sahin et al, 2009). U87 was cultured in DMEM (Invitrogen, CA, USA) supplemented with 10% FBS and 1% Pen/Strep. Contamination tests of the cell lines were performed by the DKFZ Core Facility.
Screening with the miRNA mimic library
Cells were seeded in 6‐well plates and were transfected with the miRIDIAN miRNA mimic library (Dharmacon, CO, USA), which comprises 810 miRNA mimics, using Lipofectamine 2000 (Invitrogen, CA, USA) with help of a Biomek FX pipetting robot (Beckman Coulter, CA, USA). Each miRNA mimic was transfected at a final concentration of 25 nM in two biological replicates. The miRNome screen was accomplished in 34 screening runs and each run included 24 miRNA mimics, two positive controls (miR‐7 and si‐EGFR; Dharmacon), three negative controls (miR‐Control‐1 and miR‐Control‐2; Dharmacon) and siRNA control siAllstar (Qiagen, Hilden, Germany). Forty‐eight hours after the transfection, the cells were harvested and further used for RPPAs.
Reverse phase protein arrays
Lysates were prepared as previously described (Sahin et al, 2009). Two biological replicates and three technical replicates were spotted for each miRNA mimic on nitrocellulose‐coated glass slides (Grace Biolabs, OR, USA) using an Aushon 2470 contact printer (Aushon, MA, USA). Target protein‐specific primary antibodies (Supplementary Table S10) were validated using siRNAs (siRNA sequences are given in Supplementary Table S11) as previously described (Mannsperger et al, 2010) and incubated at a dilution of 1:300. After incubation with Alexa680‐labeled secondary antibodies (Invitrogen, Karlsruhe, Germany), signals were visualized using an Odyssey scanner (LI‐COR, NE, USA). Signal intensities were quantified using Gene Pix Pro 5.0 (Molecular Device, CA, USA) and background corrected. Data pre‐processing and quality control was performed with the custom R package caUtils.
Preparation of protein lysates and western blotting was previously described (Sahin et al, 2009). In all, 10 μg of protein was separated by SDS–PAGE and exposed to primary antibodies (Supplementary Table S10). HRP‐conjugated secondary antibodies were purchased from Santa Cruz Biotechnology (Santa Cruz, CA, USA).
Plasmid constructs and luciferase reporter assay
The 3′‐UTRs of potential miRNA target genes were amplified by PCR using human genomic or complementary DNA (primer sequences were given in Supplementary Table S12) of MDA‐MB‐231 cell line, and cloned into the psiCHECK2 vector downstream of the luciferase ORF using the SgfI or, XhoI and NotI sites depending on the gene. For the luciferase reporter assays, cells were co‐transfected with 25 nM of miRNAs and 15 ng of the luciferase vector. Firefly and Renilla luciferase activities were measured after 48 h using a luminometer (Tecan, Männedorf, Switzerland), according to the manufacturer's instructions. Relative luciferase activity was determined by the ratio of Renilla luciferase signal intensity to that of firefly for normalization. The average and standard deviation of the ratio were estimated by the Delta‐method (Bioconductor ratioAssay).
Cell‐cycle analysis and viability assay
MDA‐MB‐231, MCF‐7 or U87 cells were transfected with miRNA mimics (miR‐124, miR‐147 and miR‐193a‐3p) and control miRNA. Forty‐eight hours after transfection, 7‐AAD/BrdU cell‐cycle assay was done according to the manufacturer's protocol (BD Biosciences, CA, USA). Stained cells were measured by flow cytometry (FACSCalibur, BD Bioscience, Heidelberg, Germany) using Cell Quest Pro software (BD Bioscience, Heidelberg, Germany). Cell viability was measured using Cell Titer Glo Luminescent Cell Viability assay (Promega, WI, USA) following the manufacturer's instructions. Transfections were carried out in a 96‐well plate (8 × 103 cells per well) in a final miRNA mimic concentration of 25 nM per well with five replicates.
Quantitative RT–PCR of mRNAs
Total RNA was isolated using the RNeasy Mini Kit (Qiagen). cDNA was synthesized from 10 ng of RNA using the Revert Aid H Minus First Strand cDNA Synthesis Kit (Fermentas, St Leon‐Rot, Germany) according to the manufacturer's instructions. For reverse transcriptase PCR reactions, the TaqMan Abgene universal mix (Thermo Scientific, Rockford, IL, USA) and probes from the Universal Probe Library (Roche, Penzberg, Germany) were used. Oligonucleotide primers were synthesized by Sigma‐Aldrich (MO, USA). Sequences of primers and the respective UPL probe numbers are given (Supplementary Table S13). Data were analyzed according to the ΔΔCt method (Bioconductor ddCt).
Images of RPPAs were captured by the GenePix software. Local background values were removed with the default option. The normalization against total protein concentration was performed with an additive linear model. Relative protein abundance of each miRNA was summarized as z‐score (Malo et al, 2006). Two replicates were summarized by the mean of square sum.
The miRNA target predictions were queried in two target prediction databases: TargetScan (http://www.targetscan.org) and MicroCosm (http://www.ebi.ac.uk/enright‐srv/microcosm/cgi‐bin/targets/v5/search.pl) with default parameters. Kolmogorov–Smirnov tests were performed with R. Cell‐based assay results were analyzed with R packages in the Bioconductor (Gentleman et al, 2004) community. All the other statistical analysis, if not explicitly stated, was performed using R scripts.
Network analysis of the miRNA–protein network
Construction of bipartite graph.
The experimental data can be represented in an array A of dimensions (number of proteins)=26 times (number of miRNAs)=810. Let P be the set of proteins and M be the set of miRNAs. Given a threshold t1, the bipartite graph G(t1)=(MUP, E) contains an edge between protein A and miRNA X if the expression level of A upon transfection with X has a z‐score whose absolute value is at least as large as t1. This edge is unweighted, but we differentiate it as upregulating or downregulating edge, which is represented by different colors.
The number of edges adjacent to a protein or miRNA is defined as its degree. For a given bipartite graph, we compute a sample of randomized bipartite graphs in which each node has the same number of edges of both regulation types, i.e., we maintain the so‐called degree sequence of the two sides (Zweig, 2011). The method to compute these samples is explained in the following subsection. By this randomization, we accomplish a bootstrapping method that helps us to identify those co‐regulations that are not merely caused by the structural constraints (degrees) of the system. For these samples and all of the three co‐regulation patterns (co‐upregulation, co‐downregulation and antagonistic co‐regulation), a P‐value is then assigned to each pair of proteins that describes how likely it is to see these two proteins at least as often in the given pattern just by random fluctuations. If these values are low, we claim that the pattern is more‐than‐random. The P‐value then needs to be thresholded by a second threshold t2 to result in the final co‐regulation network between the proteins.
Definition of the more‐than‐random model.
Given the bipartite graph G(t1) the degree of each protein and each miRNA is fixed, but the connections between them are perturbed by the following procedure: In each step, pick two edges (A,X) and (B,Y) uniformly at random (here: A and B are two proteins, and X and Y are two miRNAs). If both edges are of the same regulation type, try a so‐called edge swap: if (A,Y) and (B,X) are not yet connected, remove edges (A,X) and (B,Y) and add edges (A,Y) and (B,X). If at least one of the edges (A,Y) or (B,X) already exists, do nothing. The edge swap constitutes a random walk on a Markov chain (Brualdi, 2006). It is thus assured that, if the number of attempted and conducted edge swaps is large enough, the resulting graph is a random sample from all bipartite graphs with this fixed degree sequence. In our experiments, we computed 10 000 samples after q log q time steps, where q is the number of edges in the respective bipartite graph. The first random walk starts at the observed bipartite graph, all following ones start from the previously obtained bipartite graph.
Assessing the P‐value of a given relationship between two proteins.
For any two proteins A, B and any of the four relationships (I (co‐upregulation), II (co‐downregulation), and the two antagonistic patterns: IIIa and IIIb) the number of miRNAs constituting this relationship is computed in the data and denoted by obsI(A,B), obsII(A,B), obsIIIa(A,B) and obsIIIb(A,B). The observed value is compared with the distribution of these values in the samples. We are interested in the P‐value, i.e., the percentage of samples in which the value is at least as large as the observed one. For each pair of proteins and each observed relationship that occurs at least once, this P‐value is then computed for the given number of samples.
Computing consistent co‐regulation relationships between proteins.
After all P‐values are computed, a second threshold t2 is chosen and two proteins are connected if their P‐value is at most as large as t2. The resulting graph is denoted by G(t1, t2). To determine a set of suitable t2 thresholds, we computed the so‐called clustering coefficient for a fine‐grained set of possible t2 thresholds (Watts and Strogatz, 1998). The rational is that if there is a substantial connectedness or even different levels of connectedness in the graph, then this curve will show one or more maxima or plateaus in the clustering coefficient. This will then enable us to identify structural changes between the graphs obtained with different thresholds t2. In our case, there were little jumps in which the average clustering coefficient increased (Supplementary Figure S11). A jump means a difference between the clustering coefficients for two consecutive P‐values. We selected the five largest differences occurring at a still reasonable significance level. We chose the 0.01<t2<0.2 domain and imposed that the difference between two consecutive chosen t2 values should be >0.01.
Note that A and B can, in principle, be connected by all three kinds of edges but, biologically, we expect that A and B are (a) in only one relationship, or (b) co‐upregulated by some miRNAs (pattern I) and co‐downregulated by others (pattern II) or (c) that they are always antagonistically co‐regulated (pattern IIIa/b). Inconsistent mixtures of these patterns have not occurred with any combinations of t1 and t2 that we have tested so far. The edges can be assigned with the miRNAs that induce the relationship between the two proteins
The consensus graph.
As we reasoned in the Results section, the choice of t1 influences the resulting bipartite graph. It will also influence the identification of those co‐regulations that are statistically significant as edges are added to the system. To keep only those co‐regulations that are assigned statistical significance under different t1 thresholds, we built the consensus graph. For this purpose, we chose three relatively relaxed thresholds for t1, namely 1.28 (P<0.2), 1.64 (P<0.1) and 1.96 (P<0.05). For those, we computed the P‐values of all relationships and the resulting protein co‐regulation graphs for a series of five t2 values. The t2 thresholds were chosen based on the change of the average clustering in function of the P‐values. These graphs vary in their size and the number of edges. We concentrated on those edges that all of the three graphs (corresponding to one strictness level t2) find statistically significant. This results in a consensus graph, in which each edge only exists if it is contained in all three graphs at the same t2 value.
We thank Frederik Bethke, Maike Wosch and Corinna Becki for the excellent technical help. This work was supported within the National Genome Research Network (grant 01GS0864) and the Medical Systems Biology program (grant 0315396B) of the German Federal Ministry of Education and Research (BMBF) and Wilhelm Sanderstiftung (grant 2009.051.1). JDZ was supported by the DKFZ International PhD Program. EAH and KZ are supported by the Heidelberg Graduate School of Mathematical and Computational Methods for the Sciences, University of Heidelberg, Germany, which is funded by the German Excellence Initiative (GSC 220).
Author contributions: ÖS conceived, coordinated and supervised the study. SU, HM, UK and ÖS designed the experiments. SU, HM, CS, MK, FH and AW performed the experiments. JDZ, EAH and KZ analyzed the data. JDZ, UT, UK and SW contributed reagents/materials/analysis tools. SU, JDZ, EAH, KZ, SW and ÖS wrote the paper.
Conflict of Interest
The authors declare that they have no conflict of interest.
Supplementary Methods, Supplementary Figures and Supplementary Tables with less than 50 rows. [msb2011100-sup-0001.doc]
List of miRNAs expressed in MDA‐MB‐231 cell line determined by an NGS platform. [msb2011100-sup-0002.xls]
List of miRNAs expressed in the MDA‐MB‐231 cell line determined by array‐based hybridization with an Illumina platform. [msb2011100-sup-0003.xls]
Results of RNAi‐based antibody validation with RPPA output. [msb2011100-sup-0004.xls]
Results (z‐scores) of the miRNome screen with RPPA output. [msb2011100-sup-0005.xls]
List of miRNAs/targets in the "intermediate" network shown in Supplementary Figure S5. [msb2011100-sup-0006.xls]
List of miRNAs/targets identified in the regulatory network shown in Figure 3. [msb2011100-sup-0007.xls]
Rank of the miRNAs based on their frequency in the consensus graphs. [msb2011100-sup-0008.xls]
Sequences of siRNAs to validate sensitivity/specificity of the primary antibodies used in RPPAs. [msb2011100-sup-0009.xls]
This is an open‐access article distributed under the terms of the Creative Commons Attribution License, which permits distribution, and reproduction in any medium, provided the original author and source are credited. This license does not permit commercial exploitation without specific permission.
- Copyright © 2012 EMBO and Macmillan Publishers Limited