Hematopoietic stem cells (HSCs) are capable of giving rise to all blood cell lineages throughout adulthood, and the generation of engraftable HSCs from human pluripotent stem cells is a major goal for regenerative medicine. Here, we describe a functional genome‐wide RNAi screen to identify genes required for the differentiation of embryonic stem cells (ESCs) into hematopoietic stem/progenitor cells (HSPCs) in vitro. We report the discovery of novel genes important for the endothelial‐to‐hematopoietic transition and subsequently for HSPC specification. High‐throughput sequencing and bioinformatic analyses identified twelve groups of genes, including a set of 351 novel genes required for HSPC specification. As in vivo proof of concept, four of these genes, Ap2a1, Mettl22, Lrsam1, and Hal, were selected for validation and confirmed to be essential for HSPC development in zebrafish and for maintenance of human HSCs. Taken together, our results not only identify a number of novel regulatory genes and pathways essential for HSPC development but also serve as a valuable resource for directed differentiation of therapy‐grade HSPCs from human pluripotent stem cells.
In this study, the authors use a genome‐wide RNAi silencing approach to find stage‐specific gene networks required for HSPC development. A novel conserved group of genes is shown to be required for HSPC specification in zebrafish and for the maintenance of human HSCs.
A genome‐wide RNAi screen reveals genes and networks that regulate the development of hematopoietic stem/progenitor cells (HSPCs).
Specific groups of genes important for distinct stages of HSPC differentiation are identified.
The results are validated in zebrafish and in human hematopoietic stem cells (HSCs) using four conserved genes (Ap2a1, Mettl22, Lrsam1, and Hal).
Hematopoietic stem cells (HSCs) are a rare population of self‐renewing cells capable of replenishing all blood cell types throughout adulthood. HSC transplantation has become a standard treatment for hematologic and lymphoid cancers and is also used to treat other malignant and nonmalignant disorders. However, the rarity and limited sources of HSCs [bone marrow (BM), peripheral blood, and umbilical cord blood] hamper their clinical application. Thus, a major goal for regenerative medicine is to find a method capable of generating engraftable hematopoietic stem/progenitor cells (HSPCs) from human pluripotent stem cells (hPSCs). Directed differentiation and transcription factor‐mediated conversion to HSCs provide new strategies to engineer therapy‐grade HSCs. Directed differentiation has been applied to differentiate mouse and human ESCs into HSC lineages. Somatic cell reprogramming to induced pluripotent stem cells (iPSCs) offers a route to autologous, patient‐specific HSCs, provided that the obstacles to differentiating ESCs into HSCs are overcome. Recent work has shown that direct reprogramming of fibroblasts into hematopoietic lineage cells can be achieved by using transcription factors. For example, infection of mouse fibroblasts with four transcription factors, Gata2, Gfi1b, cFos, and Etv6, induces hemogenic endothelial‐like cells that are able to differentiate into hematopoietic cells. Similarly, transduction of hPSC‐derived hematopoietic progenitors with the transcription factors HOXA9, ERG, RORA, SOX4, and MYB induces their development into engraftable hematopoietic cells. Despite the remarkable progress made in the past few decades, the generation of transplantable hPSC‐derived HSCs has remained largely elusive, due in part to insufficient long‐term self‐renewal capacity and an inadequate understanding of HSC ontogeny.
In the mouse, blood cells are thought to originate from the ventral floor of the dorsal aorta in the aorta‐gonad‐mesonephros region at embryonic day 10.5, and hematopoietic progenitors then sequentially colonize the fetal liver, spleen, and BM, where definitive hematopoiesis occurs. Although multipotent hematopoietic progenitors are widely accepted to arise from hemogenic endothelial cells, the molecular mechanisms regulating the endothelial–hematopoietic transition and the specification of definitive HSPCs from hematopoietic precursors remain poorly understood. Several studies have been performed with the goal of identifying the global regulatory networks controlling HSC hierarchy and differentiation, the HSC niche, and HSC ontogeny. In addition, 159 long non‐coding RNAs (lncRNAs) were found to be enriched in HSCs, and knockdown of two HSC‐specific lncRNAs resulted in distinct effects on HSC self‐renewal and lineage commitment, highlighting another layer of RNA‐mediated control of HSC biology. Nevertheless, a genome‐wide functional study of HSC ontogeny is still lacking, making it difficult to generate a global picture of the gene networks governing definitive stage‐specific HSC development. Here, we report the results of a genome‐wide functional RNAi screen that identifies essential genes and networks regulating HSPC differentiation from ESCs in vitro.
A genome‐wide functional genomics screen to identify regulators of HSPC development
The inducible ESC line iHoxB4 was infected with a pGIPZ lentiviral vector containing a mouse shRNA library of 57,600 shRNAs targeting 15,570 mouse genes, and the cells were then cultured on an OP9 stromal cell layer to promote differentiation toward the hematopoietic lineage, as described previously (Figs 1A and EV1). The pGIPZ vector also encodes GFP, allowing successfully transduced (GFP+) cells to be identified (Fig 1B and C). To dissect the development of HSPCs in vitro, we isolated five cell populations on days 6 and 20 of ESC differentiation: SSEA1−Flk1+CXCR4− (designated D6F) and SSEA1−Flk1−CXCR4+ (D6C) cells on day 6; and Lin−Sca1+c‐Kit− (D20LS), Lin−Sca1−c‐Kit+ (D20LK), and Lin−Sca1+c‐Kit+ (D20LSK) cells on day 20. mRNA microarray analysis and DNA‐seq were performed on ESCs and isolated cell populations from three independent experiments (Appendix Fig S1).
SSEA1 is an embryonic stem cell marker and was used here to exclude ESCs from the differentiated populations, and CXCR4 is a marker for both primitive and definitive endoderm. Flk1 is expressed by hemogenic mesodermal cells, which can differentiate into endothelial and hematopoietic cells in vitro and in vivo, cardiomyocytic and mural cells in vitro, and functional blood vessels in vivo. Expression of CXCR4 by the D6C population identified these cells as endodermal. D6F cells expressed the highest levels of three endothelial marker genes, Cdh5 (vascular endothelial [VE]‐cadherin), CD31 (Pecam‐1), and Tie2 (Tek), and the lowest level of the epithelial marker E‐cadherin (Cdh1) (Fig 2A), thus identifying the D6F fraction as hemogenic mesodermal/endothelial (HM/E) cells. Consistent with this, the hematopoietic markers Scl (Tal1) and CD41 (Itga2b) were expressed in D6F cells and in all day 20 cell populations, suggesting that the hematopoietic lineage was already established by day 6. The Lin−Sca1+c‐Kit+ (D20LSK) population has been defined as HSPCs in mouse BM and among mouse ESC‐derived cells in vitro, whereas D20LK cells derived from ESCs in vitro and BM in vivo include hematopoietic, common myeloid, granulocyte/macrophage, and megakaryocyte/erythrocyte progenitors. Thus, in this study, the D20LK population included the precursors and derivatives of HSPCs, whereas the D20LS population was heterogeneous and contained non‐hematopoietic and differentiated cells. The identity of the sorted day 20 populations was confirmed by qRT–PCR analysis of c‐Kit, Sca1, Gata1, Gata2, Lyz2, and Scl expression (Fig 2B and C). Finally, a large proportion of cells isolated on day 20 were confirmed by FACS analysis to be hematopoietic by their positive expression of CD41 and the pan‐hematopoietic antigen CD45 (Fig 2D).
Principal component analysis (PCA) of the mRNA microarray results showed clustering of the ESC, D6C, and D6F samples from three independent experiments, while the D20LS, D20LK, and D20LSK samples were more dispersed (Fig EV2). This was not surprising because long‐term cultures are expected to consist of a more heterogeneous cell population than short‐term cultures, and the three surface markers used for selection are unlikely to be sufficient to distinguish cells with distinct gene expression patterns. Nevertheless, the D20LS, D20LK, and D20LSK cell populations may still have similar functions, as previously demonstrated.
Identification of genes involved in transitional stages of HSC development
We next performed DNA‐seq analysis to identify shRNA sequences that were enriched or depleted in the cell populations isolated on days 6 or 20 compared with the starting population of iHoxB4 ESCs. We reasoned that shRNAs targeted to genes that are essential for either cell survival or differentiation would cause either cell death or differentiation arrest and thus be underrepresented (depleted) in the cell populations isolated on day 6 or day 20 compared with ESCs. Conversely, shRNAs targeted to genes that are either not essential for cell survival or act as barriers to differentiation would be enriched in the day 6 and/or day 20 populations compared with ESCs. Thus, depletion of shRNAs in the five cell populations would identify genes that are essential for survival or differentiation. We conducted a clustering analysis using a k‐means algorithm based on log10 fold changes in shRNA reads in the D6C (endoderm), D6F (HM/E), D20LS, D20LK, and D20LSK (HSPC) populations compared with ESCs. For further analysis, shRNAs were assigned to ten groups based on the function of their target genes (Fig 3A).
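The clustering step described above can be illustrated with a minimal sketch. It assumes each shRNA is represented by a five‐value profile of log10 fold changes (D6C, D6F, D20LS, D20LK, D20LSK, each relative to ESCs). The profile values and shRNA names are hypothetical, and the deterministic farthest‐point initialization is a simplification chosen for reproducibility; it is not necessarily how Cluster 3.0 initializes k‐means.

```python
# Toy log10 fold-change profiles (hypothetical values, not data from the screen).
# Column order: D6C, D6F, D20LS, D20LK, D20LSK, each relative to ESCs.
profiles = {
    "shRNA_a1": [0.0, 0.0, -0.1, -0.1, -1.5],   # depleted only in D20LSK
    "shRNA_a2": [0.1, 0.0, -0.2, -0.1, -1.4],
    "shRNA_b1": [0.0, 0.1, -1.2, -1.3, -1.4],   # depleted in all day 20 populations
    "shRNA_b2": [-0.1, 0.0, -1.3, -1.2, -1.5],
    "shRNA_c1": [0.0, 0.0, 0.1, 0.0, 0.0],      # essentially unchanged
    "shRNA_c2": [0.1, -0.1, 0.0, 0.1, -0.1],
}


def squared_distance(a, b):
    """Squared Euclidean distance between two profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_point_init(points, k):
    """Pick k starting centroids: the first point, then repeatedly the point
    farthest from all centroids chosen so far (an assumption of this sketch)."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(squared_distance(p, c) for c in centroids))
        )
    return centroids


def kmeans(points, k, max_iterations=100):
    """Lloyd's k-means; returns one cluster index per input point."""
    centroids = farthest_point_init(points, k)
    labels = None
    for _ in range(max_iterations):
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


names = list(profiles)
labels = kmeans([profiles[n] for n in names], k=3)
clusters = {}
for name, label in zip(names, labels):
    clusters.setdefault(label, []).append(name)
```

With these toy profiles the three depletion patterns separate into three clusters; the screen itself used ten groups, so `k` here is only illustrative.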
Interestingly, only Group I shRNAs were significantly enriched in the day 20 populations, and those in Group VIII were essentially unchanged in all five populations compared with ESCs. Thus, the corresponding target genes of Groups I and VIII shRNAs are unlikely to play essential roles in HSPC development (Fig 3A). In contrast, shRNAs in the remaining groups were depleted to different extents in the three populations isolated at day 20. Groups V, VI, and X target genes were specifically depleted in the D20LK, D20LS, and D20LSK populations, respectively. Group II target genes appear to be essential for the development of the D20LK and D20LSK populations, whereas Group III genes were potentially required for differentiation to the D20LS and D20LSK stages. Although Group VII genes were not required for D20LSK differentiation, they appear to be critical for the development of both the D20LK and D20LS populations. To visualize the differentially enriched or depleted shRNAs in the D6C (endoderm) and D6F (HM/E) cell samples compared with ESCs, we performed clustering analysis using log2 fold changes in shRNA reads. We identified two groups of shRNAs that were specifically depleted in D6C cells and D6F cells, which we designated Groups XI and XII, respectively (Fig 3B).
We next performed gene ontology (GO) analysis to identify biological processes and pathways most highly represented by target genes in Groups XII, VI, V, and X, reflecting their requirement for the development of D6F (HM/E), D20LS, D20LK, and D20LSK (HSPC) populations, respectively (Fig 3C and Appendix Fig S2). We found that biological processes relevant to mesodermal development and endothelial specification, such as “cellular component organization or biogenesis”, “single‐organism cellular process”, and “single‐organism developmental process”, were highly enriched among the HM/E‐specific target genes, as expected (Fig 3C). The most enriched biological process networks among the D20LS‐specific target genes were “cell activation involved in immune response” and “leukocyte activation involved in immune response”. Several key activities related to phosphorylation of STAT protein were also enriched. Among the prominent D20LK‐specific processes were “negative regulation of neurological system process” and “neurological system process involved in regulation of systemic arterial blood pressure”. These processes are hallmarks of hematopoietic lineage precursors, suggesting that such cells were contained within the D20LK population. “Positive regulation of hematopoietic stem cell migration” was significantly enriched among the Group X target genes, confirming the HSPC identity of the D20LSK population. Interestingly, genes within the “phospholipase C‐activating angiotensin‐activated signaling pathway” were also enriched in D20LSK cells. In support of this finding, AGTR1 (angiotensin II receptor, type 1) has been shown to positively regulate differentiation of BM granulocyte/macrophage progenitors (Lin−Sca‐1−c‐Kit+CD34+CD16/32+) and peripheral blood monocytes from HSCs, and blockade of AGTR1 leads to reduced number of BM and peripheral blood, as well as CD11b+ monocytes in bone marrow.
The enrichment of target genes associated with “neuron projection morphogenesis” in this group also indicates that HSPC specification may involve novel pathways.
To identify pathways involved at all stages of differentiation, we examined target genes that were enriched in the five day 6 and day 20 cell populations (Fig 3D). The top four pathways were Wnt signaling, cell adhesion, cell cycle, and reproduction/differentiation. Wnt signaling is recognized to play crucial roles in the metabolism and differentiation of many cell lineages and is known to be a critical regulatory pathway for HSPC development. Wnt signaling is commonly divided into the canonical (Wnt/β‐catenin) and non‐canonical pathways. High Wnt signals in the canonical pathway can lead to loss of stemness and enhance HSC differentiation. In addition, a switch from canonical to non‐canonical Wnt signaling could play important roles in HSC aging. As expected, pathways involved in cell adhesion and cell cycle regulation were also represented in all cell populations. Hedgehog signaling was also a prominent pathway in our GO analysis. Although several studies have shown that Hedgehog signaling is dispensable for HSC self‐renewal and function, it is involved in the embryonic endothelial–hematopoietic transition. These results demonstrate that our experimental system faithfully captures the dynamic transitional stages of HSC development. To confirm our GO analysis findings, we analyzed reads of shRNAs targeted to > 50 cell cycle‐related genes, including many Cdc and Cdk genes. Indeed, most of the shRNAs targeting these genes were significantly decreased in D20LS, D20LK, and D20LSK cells, consistent with the findings of the GO analysis (Fig 3E).
Functional validation of genes required for differentiation of ESCs toward HSPCs
To validate the genes identified in our primary RNAi screen, we selected 3–4 genes each from Groups XI, XII, IV, and X and individually analyzed their roles in the differentiation of mESCs toward HSPCs in vitro. Group XI genes (Atp5g3, Bub1b, Eci1, and Dis3l) and Group XII genes (Esrra, S100a8, and Hcfc2) were selected to examine their roles in endoderm and mesoderm/endothelium specification, respectively. Silencing of Esrra, S100a8, and Hcfc2 in mESCs (Fig EV3A) significantly reduced the number of Flk1+Cxcr4− cells in embryoid bodies (EBs) at day 6 (Fig 4A and B), confirming that these genes are required for the generation of Flk1+Cxcr4− cells during differentiation. In contrast, knockdown of the Group XI genes Atp5g3, Bub1b, Eci1, and Dis3l (Fig EV3B) clearly diminished Cxcr4+Flk1− cells (Fig 4C and D), supporting the conclusion that Group XI genes are essential for endoderm cell fate determination. Given that shRNAs in Group IV were largely decreased or depleted in the day 20 samples, the genes targeted by these shRNAs are most likely required for generation of the D20LK, D20LS, and D20LSK populations. As expected, knockdown of the Group IV genes Wbp5, Rbm26, Gdpd4, and Nrxn1 (Fig EV3C) reduced the numbers of D20LK, D20LS, and D20LSK cells (Fig 4E and F). Interestingly, knockdown of these genes increased Cxcr4+ cells and decreased Flk1+ cells in day 6 EBs (Fig EV3D–F), suggesting that their role in HSPC development may be exerted at an earlier stage. Next, we examined whether genes identified in Group X are required for generation of the D20LSK population. We selected four genes, Ap2a1, Mettl22, Lrsam1, and Hal, which are conserved across Danio rerio, Mus musculus, and Homo sapiens. As expected, knockdown of these genes (Fig EV3G) markedly diminished the D20LSK population (Fig 4G and H) without affecting endodermal or mesodermal/endothelial cells on day 6 (Fig EV3H–J). Taken together, these data demonstrate that our primary screen was able to identify genes required for specific cell populations during differentiation.
Identification and functional validation of genes critical for HSPC specification
The largest group of shRNAs identified was Group IV, representing 7,222 genes. These shRNAs were significantly depleted in the day 20 populations, suggesting that the corresponding genes may be required for the HM/E–hematopoietic transition. The GO analysis of process networks revealed prominent enrichment of immune response and inflammation signaling pathways among the Group IV target genes, including IL‐6 signaling, amphoterin signaling, and IL‐2 signaling (Fig 5A). Notch signaling, progesterone signaling, ESR2 pathway, Wnt signaling, and platelet–endothelium–leukocyte interaction pathways were also significantly enriched among the Group IV genes, supporting the relevance of this group to the endothelial–hematopoietic transition. Interestingly, this group was enriched in genes associated with “male sex differentiation” and “spermatogenesis, motility, and copulation”, suggesting that these pathways may have previously unknown roles in hematopoiesis.
In contrast to Group IV, Group I shRNAs were significantly enriched in the populations isolated on day 20 relative to day 6, indicating that their target genes are potential barriers to the transition from HM/E to hematopoietic precursors. Interestingly, the most enriched pathway in Group I was related to cardiac development (P = 3.3 × 10⁻³). Although the Wnt signaling pathway plays a critical role in the HM/E–hematopoietic transition, Wnt antagonists induce cardiac specification from mesoderm and promote cardiomyocyte differentiation. Consistent with this, GO analysis showed high enrichment of the Wnt pathway among Group IV genes (P = 3.6 × 10⁻¹²) but not Group I genes (Figs 5A and EV4A). We also found that shRNAs targeting most of the well‐known HSC regulators, such as Hoxa9, Prdm16, Gata2, Runx1, Lmo2, and Gfi1b, were identified in Group IV and were significantly decreased in day 20 populations (Fig 5B). In contrast, some of the transcription factors represented among the enriched Group I target genes are known to be involved in cardiac lineage development (Fig EV4B).
Although many known HSC regulators were found in Group IV, Group X target genes appeared to be critical for HSPC specification, as shRNAs in this group were specifically diminished in the D20LSK population. To identify key genes critical to the specification of HSPCs, we examined Group X shRNAs in more detail. These shRNAs were slightly depleted in D20LS and D20LK cells and markedly depleted in the D20LSK population (Figs 3A and 5C), suggesting that Group X target genes may play essential roles in LSK specification. To test this, we examined the in vivo functional importance in HSC development of four of the top‐ranked Group X target genes: Ap2a1, Lrsam1, Hal, and Mettl22 (Fig 5C). All four genes are conserved between the zebrafish Danio rerio and Homo sapiens, allowing us to validate their function in HSPC development in zebrafish. Ap2a1 is a subunit of the adaptor‐related protein complex 2 (AP2), which is important for cell polarization and is known to be involved in HSPC migration. Lrsam1 is an E3 ubiquitin ligase and may be involved in a novel regulatory pathway for HSC specification. Mettl22 (methyltransferase like 22) is a protein methyltransferase that acts on various chaperone complexes; for example, it trimethylates K135 of KIN/Kin17, K315 of VCP/p97, and K561 of HSPA8/Hsc70, as well as lysine residues of other Hsp70 isoforms. To date, Mettl22 has not been reported to function in hematopoiesis. Similarly, while deficiency of the cytosolic catabolic enzyme HAL (histidine ammonia‐lyase) causes histidinemia, the enzyme has not previously been implicated in HSC ontogeny. HAL catalyzes the deamination of L‐histidine to urocanic acid, which is involved in UV radiation‐induced immunosuppression. Although such a role has not yet been reported, we hypothesized that histidine deamination may also play a role in hematopoiesis. Knockdown of the four genes was achieved by injecting splice‐blocking morpholino oligomers (MOs) into zebrafish embryos at the one‐cell stage.
For each gene, MO‐induced abnormal splicing was confirmed by RT–PCR using three different primer sets, and overall mRNA knockdown efficiency was calculated by qRT–PCR (Appendix Fig S3A–E). To assess the effects of MOs on expression of HSC differentiation‐associated genes, embryos were collected and fixed for in situ hybridization at approximately 26 h post‐fertilization (hpf; Runx1), 30 hpf (c‐Myb), and 54 hpf (c‐Myb). Knockdown of Prdm16 served as a positive control because it has previously been demonstrated to be required for HSC development in zebrafish. Compared with negative control MO injections, knockdown of Lrsam1, Ap2a1, Mettl22, Hal, and Prdm16 dramatically reduced staining of both c‐Myb and Runx1 in the aorta‐gonad‐mesonephros region at each of the three time points (Fig 5D and Appendix Fig S3F), demonstrating a requirement for these genes during HSC specification. Interestingly, knockdown of these genes had a similar or greater effect than knockdown of the previously described Prdm16 gene. These results validate the RNAi‐based identification of Group X genes as essential for mammalian HSPC development and further testify to the high discovery rate of functionally relevant genes in our genome‐wide analysis.
Group X genes, AP2A1, METTL22, LRSAM1, and HAL, are crucial for maintaining human CD34+ HSCs
Given the conservation of Group X genes in fish, mouse, and human, we sought to determine whether the genes we identified play important roles in human HSC biology. Many genes required for HSC specification are also essential for HSC self‐renewal. Because Group X genes were found to be required for HSPC specification, we reasoned that these genes may also be involved in maintaining the self‐renewal of adult HSCs. To address this question, we tested the role of four Group X genes, AP2A1, METTL22, LRSAM1, and HAL, in human CD34+ cells. Knockdown of AP2A1, LRSAM1, HAL, and METTL22 in human cord blood CD34+ cells (Fig EV3K) caused a significant decrease in CD34+ cells 5 days post‐transduction (Fig 6A and B). This phenotype was much more dramatic in long‐term HSCs, as measured by the two human HSC markers CD34 and CD90, with nearly 90% loss of CD34+CD90+ cells (Fig 6C and D). These results demonstrate that the Group X genes AP2A1, METTL22, LRSAM1, and HAL are critical for maintaining human HSPC self‐renewal, especially for long‐term HSCs. Overall, our data confirm that these genes are crucial for both HSPC specification and HSC self‐renewal. Notably, whereas most genes known to be important for HSC development are transcriptional regulators, we show here that proteins with diverse functions, including ligase, methyltransferase, transporter, and lyase activities, also play crucial roles in specification of HSPCs and that these functions are conserved from zebrafish to humans.
Our analysis of the population‐specific gene groups suggests the existence of a pre‐HSC stage prior to maturation of definitive HSPCs, as proposed in a developmental stage map of ESC to HSPC differentiation (Fig 7). To identify functional stage‐specific gene networks, we compared the target gene sets associated with the five differentiating cell populations (Fig 3A and B). For Stage I, which represents the differentiation of ESCs to HM/Es, we compared target genes in Group XII between HM/E (D6F) and endoderm (D6C) populations. For the HM/E to “pre‐HSC” transition (Stage II), we compared day 20 populations (D20LS, D20LK, and D20LSK) with HM/E (D6F) in Group IV target genes, which are required for the differentiation of all day 20 populations. For Stage III, representing the specification of HSPCs, a network was built by comparing genes present in Group X between D20LSK and D20LK populations. STRING 10 was used to build networks for the top 20 genes identified in each stage (Fig EV5 and Datasets EV1, EV2 and EV3). We found that the Stage I network is connected by Ppara‐Set‐Nup98‐Ddx39b gene sets; the Stage II network is mainly composed of Cct‐Cps1‐Cad‐Dpysl gene sets; and the Ap2a network is dominant at Stage III. At the HM/E transition (Stage I), the network is more broadly connected, as expected, whereas the networks at the pre‐HSPC transition and definitive HSPC specification stages (Stages II and III) become progressively more restricted to a few connected gene sets (Fig EV5).
Collectively, these analyses identify stage‐specific gene networks that are critical to the differentiation of definitive HSPCs. Furthermore, we identified 351 genes in Group X that function in the definitive HSPC specification step that confers multipotency. Because RNAi functional screens can identify genes that are not detected by mRNA expression‐based datasets, we were able to discover new genes and pathways required for HSPC development. For example, comparison of our Group X target genes with those previously reported to play roles in HSPC differentiation revealed that only 67 of the 351 genes in Group X were previously identified. The four candidate genes selected from Group X were successfully validated in zebrafish and human CD34+ cells (Figs 5D and 6), demonstrating that our RNAi screen was highly effective in identifying functionally essential and universally conserved genes for HSC differentiation.
Our proposal that a pre‐HSPC stage lies between the HM/E and definitive HSPC stages is based on the observation that most of the known HSC regulators were represented in Group IV, whereas most of the D20LSK‐specific Group X genes have not been previously reported. However, this hypothesis does not exclude the possibility that Group IV genes are required throughout HSPC specification. If our hypothesis is correct, the pre‐HSPC stage may serve as a checkpoint for cells entering the hematopoietic lineage rather than other HM/E‐related lineages such as cardiomyocytes and mural cells. Because we did not examine additional cell populations between days 6 and 20, further experiments will be required to determine whether the putative pre‐HSPC population exists in vitro or in vivo and, if so, to establish whether it is multipotent.
Materials and Methods
Mouse embryonic stem cell culture
The mouse ESC lines iHoxB4 (kindly provided by Dr. Michael Kyba) and E14tg2a (a gift from Dr. Chuan He) were used in this study. Mouse ESCs were cultured on irradiated mouse embryonic fibroblasts (MEFs) in DMEM‐based medium supplemented with 15% ES‐screened fetal bovine serum (FBS; Hyclone Laboratories), 0.1 mM nonessential amino acids (NEAA; Gibco), 2 mM glutamine (Gibco), 50 μg/ml penicillin and 75 μg/ml streptomycin (Gibco), 0.00086% 1‐thioglycerol (Sigma), and 1,000 U/ml leukemia inhibitory factor (LIF; Life Technologies). The medium was changed every other day, and the ES cells were normally passaged every 3 days.
In vitro differentiation of hematopoietic stem/progenitor cells
In vitro differentiation of HSPCs from mouse ESCs was performed as described previously. In brief, on day 0, ESCs were trypsinized and resuspended in EBD medium [IMDM containing 15% FBS, 2 mM glutamine, 50 μg/ml penicillin, 75 μg/ml streptomycin, 0.45 mM 1‐thioglycerol, 200 μg/ml iron‐saturated transferrin (Sigma), and 50 μg/ml ascorbic acid (Sigma)]. The cells were then cultured in hanging drops at 150 cells/15 μl in inverted 150 × 15 mm petri dishes. On day 2, the EBs were collected and placed in 10 ml EBD medium in a 100 mm ultra‐low attachment dish. The medium was exchanged on day 4. On day 6, EBs were dissociated by incubating for 20–30 min at 37°C in DMEM containing 10 mg/ml collagenase IV (Invitrogen, cat. no. 17104‐019), 20 mg/ml hyaluronidase (Sigma, cat. no. H2126), and 800 U/ml DNase (Sigma, cat. no. D4527), with occasional shaking. The dissociated cells were resuspended in IMDM differentiation medium (supplemented with 10% FBS, 2 mM glutamine, 50 μg/ml penicillin, 75 μg/ml streptomycin, 100 ng/ml stem cell factor, 40 ng/ml VEGF, 40 ng/ml thrombopoietin, and 100 ng/ml Flt‐3 ligand) and plated at 1 × 10⁵ cells/ml on an OP9 stromal layer plated 1 day prior. Doxycycline was added to the medium at a final concentration of 1 μg/ml to induce HoxB4 expression in iHoxB4 cells. For E14tg2a cells, a retrovirus expressing HoxB4 was used to infect day 6 EB cells to overexpress HoxB4 in differentiating cells. The cells were passaged and replated onto fresh OP9 cells when they became ~80% confluent.
Human CD34+ cell culture
Human cord blood CD34+ cells were cultured in StemSpan SFEM (STEMCELL Technologies) containing 1× antibiotics and 100 ng/ml of recombinant human cytokines: stem cell factor (hSCF), Flt3 ligand (Flt3L), thrombopoietin (hTPO), and interleukin‐6 (hIL6) (Peprotech), supplemented with 1 μM StemRegenin 1 (SR1). The cells were plated at 5 × 10⁵ cells/ml as needed.
Lentivirus generation and transduction
The mouse pGIPZ‐shRNA library lentiviruses and pLKO shRNA lentiviruses were prepared as previously described. Briefly, the lentiviral vectors were transfected into 293FT cells together with pPAXs and pMD2 vectors using Lipofectamine® LTX with Plus™ Reagent (ThermoFisher Scientific, #15338100). The viral supernatant was collected 48 h post‐transfection and concentrated by ultracentrifugation. The concentrated viruses were divided into aliquots and snap frozen in liquid nitrogen before storage at −80°C. Transduction of iHoxB4 cells was carried out as previously described. Transduction of human CD34+ cells was performed using RetroNectin reagent (TaKaRa).
FACS analysis and whole‐genome RNAi screening
Cells were collected and stained with fluorochrome‐conjugated antibodies in staining medium (PBS supplemented with 0.5% BSA and 2 mM EDTA) according to the manufacturers' instructions. UltraComp eBeads (eBioscience) were used for single‐stained compensation samples. The flow cytometry FCS data were further analyzed in FlowJo software. Differentiated mesodermal cells (Ssea1−Flk1+Cxcr4−), endodermal cells (Ssea1−Flk1−Cxcr4+), D20LS (Lin−Sca1+c‐Kit−), D20LK (Lin−Sca1−c‐Kit+), and D20LSK (Lin−Sca1+c‐Kit+) cells were purified by FACS. Transduced iHoxB4 cells were analyzed as undifferentiated control cells. Detailed information on the antibodies used in this study is provided in Table EV1. Genomic DNA and total RNA were extracted from the sorted cells for Illumina Hi‐seq and mRNA microarray analysis, respectively. For DNA‐seq, the sequences corresponding to each shRNA were PCR‐amplified using universal primers, digested with XhoI and EcoRI restriction enzymes, and gel‐purified. The purified fragments were then used for preparation of a sequencing library according to the Illumina Hi‐seq kit instructions. For mRNA microarrays, total RNA was extracted with TRIzol and used directly for microarray analysis using the Illumina MouseWG‐6 v2.0 Expression BeadChip kit.
Total RNA was extracted from sorted cells or zebrafish using TRIzol (Life Technologies) and treated with TURBO DNA‐free DNase (Ambion) to remove residual genomic DNA. Samples of 400 ng DNase‐treated RNA served as templates for reverse transcription using oligo(dT)15 primers (Promega) and SuperScript III reverse transcriptase (Life Technologies). Diluted cDNA samples were mixed with 2× SYBR Green master mix (Bio‐Rad) and analyzed on a Roche Light‐Cycler 480. All qRT–PCR primers used in this study are listed in Table EV2.
FASTQ files of DNA‐seq data were mapped to reference shRNA libraries, and perfectly matched shRNAs were further analyzed. Clustering analysis of sorted populations was performed using Cluster 3.0 and visualized in Java TreeView. GO and enrichment analyses were carried out using the Metacore platform. Functional interactions of overrepresented genes were analyzed using the String database.
For DNA‐seq data analysis, a dataset of shRNA library sequences was generated from 57,600 22‐bp hairpin stem sequences. Next, we removed the low‐quality reads and converted the FASTQ file format to FASTA format. Blast analysis of the sequenced reads (100 bp) against the shRNA library was carried out, and perfectly matched shRNAs were used for normalization.
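The read‐assignment step can be illustrated with a minimal Python sketch. It is not the pipeline used in this study: the Blast search is replaced here by an exact substring lookup, which captures only the "perfectly matched" criterion. The function names, the dictionary layout of the library, and the choice of denominator for normalization are assumptions made for illustration.

```python
from collections import Counter


def count_perfect_matches(reads, library, stem_len=22):
    """Count reads that contain an exact copy of a library stem sequence.

    reads: iterable of read sequences (strings).
    library: dict mapping stem sequence -> shRNA id (hypothetical layout).
    """
    counts = Counter()
    for read in reads:
        # Slide a stem-length window along the read; stop at the first hit.
        for start in range(len(read) - stem_len + 1):
            shrna_id = library.get(read[start:start + stem_len])
            if shrna_id is not None:
                counts[shrna_id] += 1
                break
    return counts


def reads_per_million(counts, total_reads):
    """Scale raw counts to reads per million of the supplied read total."""
    return {shrna_id: n * 1_000_000 / total_reads for shrna_id, n in counts.items()}
```

Reads shorter than the stem length, or without an exact match, are simply not counted in this sketch.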
Reads of shRNAs targeting the same gene were summed in each experiment, and the average read numbers from the three independent experiments were used for further analysis. Analysis of variance (ANOVA) and post hoc analysis were performed in Partek Genomics Suite 6.6 to calculate the fold change of shRNA reads in the D6F, D6C, D20LS, D20LK, and D20LSK populations compared with ESCs, and the corresponding P‐values. A threshold of 10 reads per million total reads in ESCs was set as the cutoff value for shRNA reads. For analysis of shRNA reads in the five sorted populations compared with ESCs, the log10‐transformed fold change was employed for clustering analysis with a k‐means algorithm using Cluster 3.0. The clustered data were visualized as heat maps using Java TreeView. To better visualize the difference between D6F and D6C, clustering analysis was performed using the log2‐transformed fold change.
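The aggregation and fold‐change arithmetic can be sketched in Python as follows. This is an illustration only: the statistics in this study were computed in Partek Genomics Suite, and the function names, the data layout, the application of the ESC cutoff at the gene level, and the skipping of zero‐read genes are all assumptions.

```python
import math


def gene_level_mean(replicates, shrna_to_gene):
    """Sum shRNA reads per gene in each replicate, then average across replicates.

    replicates: list of dicts mapping shRNA id -> reads per million (hypothetical).
    shrna_to_gene: dict mapping shRNA id -> target gene symbol.
    """
    totals = {gene: 0.0 for gene in set(shrna_to_gene.values())}
    for replicate in replicates:
        for shrna_id, rpm in replicate.items():
            totals[shrna_to_gene[shrna_id]] += rpm
    # The mean of the per-replicate sums equals the grand total over the replicate count.
    return {gene: total / len(replicates) for gene, total in totals.items()}


def log_fold_change(sample, esc, esc_cutoff=10.0, log=math.log10):
    """Log-transformed fold change of a sorted population relative to ESCs."""
    result = {}
    for gene, esc_rpm in esc.items():
        sample_rpm = sample.get(gene, 0.0)
        # Skip genes below the ESC cutoff and zero-read genes (log undefined).
        if esc_rpm < esc_cutoff or sample_rpm <= 0.0:
            continue
        result[gene] = log(sample_rpm / esc_rpm)
    return result
```

Passing `log=math.log2` gives the log2 scale used for the D6F versus D6C comparison.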
For GO and enrichment analysis, the ANOVA file was uploaded to the Metacore server. The cutoff values for the fold change and P‐value were set as 2 and 0.05, respectively. The enriched genes, networks, and pathways were identified using the Metacore platform. To analyze the enriched genes and networks in Stage I, the shRNAs from Group XII were used to calculate the fold change in D6F compared with D6C populations. For Stage II, the shRNAs from Group IV were used to calculate the fold change in the day 20 populations (average of shRNAs in D20LS, D20LK, and D20LSK) compared with the D6F population. For Stage III, Group X shRNAs were used to calculate the fold change in D20LSK compared with D20LK populations. To exclude variation resulting from zero‐read shRNAs, 1 was added to all reads of < 1 per million total reads. For genetic network construction, the top 20 target genes of depleted shRNAs at Stages I (Dataset EV1), II (Dataset EV2), and III (Dataset EV3) were selected for String analysis with action view.
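The zero‐read adjustment and the cutoffs described above can be illustrated with a short Python sketch. It makes several assumptions that are not stated in the text: "depleted" is taken to mean a fold change of at most 1/2, the P‐values are supplied by an upstream test, and all function and parameter names are hypothetical.

```python
def adjust_low_reads(rpm, threshold=1.0, pseudocount=1.0):
    """Add a pseudocount to reads below the threshold (reads per million)."""
    return rpm + pseudocount if rpm < threshold else rpm


def stage_fold_change(numerator, denominator):
    """Fold change between two populations after the low-read adjustment.

    Both arguments are dicts mapping gene -> reads per million (hypothetical layout).
    """
    return {
        gene: adjust_low_reads(numerator.get(gene, 0.0)) / adjust_low_reads(rpm)
        for gene, rpm in denominator.items()
    }


def select_depleted(fold_changes, p_values, fc_cutoff=2.0, p_cutoff=0.05, top_n=20):
    """Return up to top_n genes depleted at least fc_cutoff-fold with P < p_cutoff."""
    hits = [
        gene
        for gene, fc in fold_changes.items()
        if fc <= 1.0 / fc_cutoff and p_values.get(gene, 1.0) < p_cutoff
    ]
    # Most strongly depleted genes first.
    return sorted(hits, key=lambda gene: fold_changes[gene])[:top_n]
```

For Stage II, the numerator would be the average of the three day 20 populations, computed before calling `stage_fold_change`.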
For microarray analysis, the sample probe files of all samples were collected using Illumina's GenomeStudio software after calculation of expression intensities and quality control, which was based on a detection P‐value cutoff of < 0.05 for each gene probed on the array. Then, quantile normalization, log transformation, and statistical analysis were performed using Agilent's GeneSpring GX 11.5 software. PCA (Fig EV2A) and hierarchical clustering with the “average” linkage algorithm were performed in Partek Genomics Suite 6.6 (Fig EV2B).
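For readers unfamiliar with quantile normalization, the following Python sketch shows the general idea on a small matrix. It is a textbook version that ignores ties and is not the GeneSpring implementation used here; the function names, the list‐of‐columns layout, and the use of log2 for the log transformation are assumptions.

```python
import math


def quantile_normalize(columns):
    """Give every sample the same intensity distribution.

    columns: list of equal-length lists, one list of probe intensities per sample.
    """
    n_probes = len(columns[0])
    # For each sample, the probe indices ordered from lowest to highest intensity.
    orders = [sorted(range(n_probes), key=col.__getitem__) for col in columns]
    # Mean intensity across samples at each rank.
    rank_means = [
        sum(col[order[rank]] for col, order in zip(columns, orders)) / len(columns)
        for rank in range(n_probes)
    ]
    normalized = []
    for order in orders:
        out = [0.0] * n_probes
        for rank, probe_index in enumerate(order):
            out[probe_index] = rank_means[rank]
        normalized.append(out)
    return normalized


def log2_transform(columns):
    """Apply a log2 transformation to every (positive) intensity."""
    return [[math.log2(value) for value in col] for col in columns]
```

After normalization, each sample contains the same set of values, assigned according to the rank of each probe within that sample.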
General maintenance, collection, and staging were performed as previously described . All animal work was approved by the Institutional Review Board at the University of California, San Diego, and was performed in accordance with Institutional Animal Care and Use Committee guidelines. To prepare the Runx1 antisense RNA probe, a fragment containing the full‐length cDNA and the T7 promoter sequence was PCR‐amplified from the pCS2‐Runx1 vector using Runx1‐probe‐F (5′‐CATACCAAATGGTTTTTCTTTGGGACGCC‐3′) and T7‐Runx1‐R (5′‐TAATACGACTCACTATAGGTTCTAGACTGTCCCT‐3′) primers. For the c‐Myb antisense RNA probe, c‐Myb‐probe‐F (5′‐GCAACACAAACAGCCCAAATA‐3′) and T7‐c‐Myb‐R (5′‐CGACGGCCAGTGAATTGTAATA‐3′) primers were used to amplify a partial cDNA sequence of c‐Myb and the T7 promoter sequence from the pBK‐c‐Myb vector. The PCR products were digested with DpnI and purified, and the fragments were used as templates for in vitro transcription using DIG RNA Labeling Mix (T7) (Roche Diagnostics). The RNA probes were divided into aliquots, snap frozen, and stored at −80°C.
Splice‐blocking morpholinos (MOs) targeting Prdm16, Ap2a1, Lrsam1, Mettl22, and Hal were designed and supplied by Gene Tools, LLC. The MOs, including a standard control oligo CCTCTTACCTCAGTTACAATTTATA, were injected into embryos at the one‐cell stage and larvae were collected for analysis of Runx1 (26 hpf) and c‐Myb (30 hpf and 54 hpf) expression, respectively. Three primer sets were designed (Table EV3) to confirm splicing alterations of MO‐targeted exons using zebrafish RNA (Appendix Fig S3A–E). The MO knockdown efficiency was quantified by qRT–PCR using primer set c for Prdm16, Lrsam1, Ap2a1, Mettl22, and Hal as previously described , , ,  (Appendix Fig S3A–E). For whole‐mount in situ hybridization, the fish were fixed with 4% paraformaldehyde in PBS at 4°C overnight, and in situ hybridization was performed as previously described , , , .
The GEO accession numbers for the DNA‐seq and RNA microarray data are GSE86898 and GSE86853, respectively.
TH designed and performed the experiments, analyzed the data, and wrote the manuscript. C‐SY and K‐YC analyzed the data. DZ performed the zebrafish experiments. FBI designed the zebrafish functional studies and analyzed the data. TMR designed the overall study and experiments, analyzed the data, wrote the manuscript, and provided financial support.
Conflict of interest
The authors declare that they have no conflict of interest.
Expanded View Figures PDF [embr201642395-sup-0002-EVFigs.pdf]
Dataset EV1 [embr201642395-sup-0003-DatasetEV1.xlsx]
Dataset EV2 [embr201642395-sup-0004-DatasetEV2.xlsx]
Dataset EV3 [embr201642395-sup-0005-DatasetEV3.xlsx]
Table EV1 [embr201642395-sup-0006-TableEV1.xlsx]
Table EV2 [embr201642395-sup-0007-TableEV2.xlsx]
Table EV3 [embr201642395-sup-0008-TableEV3.xlsx]
We thank Dr. Michael Kyba (University of Minnesota) for kindly providing the iHoxB4 cell line and vectors and for valuable suggestions on hematopoietic differentiation of mouse ESCs. We also thank Dr. David Traver (UCSD) for kindly providing the pBK‐c‐Myb and pCS2‐Runx1 probe vectors and Runx1 comparison probe for zebrafish in situ staining; Dr. Chuan He (University of Chicago) for kindly providing the E14tg2a ES cell line; and Dr. Giulio Cattarossi for his help with the animal experiments. We gratefully acknowledge the staff of the core facilities at the Sanford Burnham Prebys Medical Discovery Institute, UCSD, and The Scripps Research Institute: at SBP, Yoav Altman and Amy Cortez of the Flow Cytometry Core, Alexey Eroshkin and Stacy Huang of the Bioinformatics and Data Management Core, and the staff of the Animal Facility; at UCSD, Neal Sekiy and Tara Rambaldo of the Flow Cytometry Research Core facility (funded by CFAR at the VA Hospital, UCSD), and staff of the Animal Care Program; and at TSRI, Steve Head and the staff of the Next Generation Sequencing Core Facility for help with the HT‐seq. We thank Jennifer Klabis and Jason Dang for their help with preparation of the artwork and figures, and members of the Rana laboratory for helpful discussions and advice, especially Drs. Zhonghan Li and Nianwe Lin. Funded by the National Institute of Allergy and Infectious Diseases (NIAID) (AI43198, AI41404) and the National Institute on Drug Abuse (NIDA) (DA039562).
Funding: National Institute of Allergy and Infectious Diseases (NIAID), http://dx.doi.org/10.13039/100000060, grants AI43198 and AI41404
This is an open access article under the terms of the Creative Commons Attribution‐NonCommercial‐NoDerivs 4.0 License, which permits use and distribution in any medium, provided the original work is properly cited, the use is non‐commercial and no modifications or adaptations are made.
© 2016 The Authors. Published under the terms of the CC BY NC ND 4.0 license