The paradox of R-loops: guardians of the genome or drivers of disease?
Abstract
R-loops, chromatin structures containing DNA–RNA hybrids with displaced single-stranded DNA, play crucial roles in various cellular processes. Their formation is influenced by factors such as DNA topology, RNA stability, and the presence of GC-rich regions. However, excessive or uncontrolled R-loop accumulation can threaten genomic stability, leading to DNA damage, particularly double-strand breaks. To preserve genome integrity, cells have developed mechanisms to regulate R-loop formation and resolution. Dysregulation of these processes is linked to several diseases, including cancer and neurodegenerative disorders. In this review, we will explore the dynamics of R-loop formation and resolution and how they are detected, their roles in DNA damage and repair, and how their dysregulation may lead to immune responses and disease pathogenesis.
R-loops are abundant noncanonical nucleic acid structures consisting of a DNA–RNA hybrid and a displaced single-stranded DNA (ssDNA) (Thomas et al. 1976). The ssDNA can further fold into G-quadruplex secondary structures that help stabilize the R-loop (Duquette et al. 2004). R-loops typically form in cis during the process of transcription as the nascent RNA strand exits RNA polymerase II (Pol II) and hybridizes to the template DNA strand (Ginno et al. 2012; Sanz et al. 2016). In addition, recent studies showed that some long noncoding RNAs (lncRNAs) can associate with distant genomic regions through the formation of R-loops in trans (Cloutier et al. 2016; Ariel et al. 2020; Luo et al. 2022). Whether these lncRNAs scan the genome and find their targets based on sequence complementarity or whether they are targeted to specific regions through proximity in three-dimensional space or through their protein interactions is unclear.
When not properly regulated, R-loops can become a significant source of genomic instability and can activate inflammatory immune responses (Brickner et al. 2022). However, R-loops also play roles in cellular processes, including transcription, DNA replication, DNA repair, and chromosome segregation (Ohle et al. 2016; Kabeche et al. 2018; Niehrs and Luke 2020; Petermann et al. 2022; Wulfridge and Sarma 2024). The first examples of R-loops with potential regulatory roles in eukaryotes came from studies on immunoglobulin class-switching recombination of stimulated B cells (Yu et al. 2003). These R-loops, which form within the G-rich switch regions, may contribute to recombination-based deletions and drive antibody class diversity. Similarly, R-loops play a regulatory role in gene expression in plants, in which their stabilization over the COOLAIR promoter in Arabidopsis was shown to repress transcription (Sun et al. 2013). Traditionally seen as harmful byproducts, R-loops are now increasingly recognized for their essential roles in maintaining genome integrity, particularly through their involvement in DNA repair processes (Fig. 1). These structures play a pivotal role in the repair of double-strand breaks (DSBs), among the most severe forms of DNA damage. This duality, in which R-loops contribute to both genome stability and instability, highlights their complex nature. Here, we will review the advances in methods to detect R-loops and examine the complex dynamics of R-loop formation and resolution, their involvement in DNA damage and repair, and how dysregulation of these structures can contribute to various pathologies.
Dual roles of R-loops in DNA damage and repair pathways. The single-strand DNA (ssDNA) in R-loops is susceptible to DNA damage, which can lead to the formation of DNA double-strand breaks (DSBs). Conversely, R-loops also play crucial roles in homologous recombination (HR) repair by facilitating DNA end resection and the recruitment of key repair factors.
Advances in R-loop detection
Thomas and colleagues (1976) first visualized R-loop formation in vitro through electron microscopy. Current R-loop detection methods use the S9.6 antibody or a catalytically inactive form of the RNase H enzyme, both of which recognize DNA–RNA hybrids. The development of the S9.6 antibody paved the way for examining the intracellular locations and dynamics of R-loops (Boguslawski et al. 1986). Structural studies showed that the S9.6 antibody recognizes and interacts with the minor groove of the DNA–RNA hybrid (Bou-Nader et al. 2022). However, a thorough characterization of the S9.6 antibody in immunofluorescence studies showed that although R-loops are detected by S9.6 in vivo, this antibody also shows significant cross-reactivity with double-stranded RNA (Hartono et al. 2018; Smolka et al. 2021). Although this limits its suitability for imaging applications, the development of DNA–RNA immunoprecipitation (DRIP) using the S9.6 antibody enabled the application of next-generation genomics to the study of R-loop biology (Ginno et al. 2012).
In DRIP-seq (Fig. 2A), fragmented genomic DNA is incubated with the S9.6 antibody, which results in enrichment of genomic fragments containing DNA–RNA hybrids followed by high-throughput sequencing. DRIP-seq experiments revealed enrichment of R-loops at transcription start and end sites, the tendency for R-loop-forming regions to be GC-rich, and the histone modification and DNA methylation signatures associated with regions containing R-loops (Ginno et al. 2012; Sanz et al. 2016). Several variations of the DRIP-seq technique exist (Nadel et al. 2015; Wahba et al. 2016; Dumelie and Jaffrey 2017; Xu et al. 2017), which aim to either improve resolution or sequence the RNA strand of R-loops. In bis-DRIP-seq (Dumelie and Jaffrey 2017), nondenaturing bisulfite treatment selectively deaminates ssDNA regions to convert cytosine residues into uracil, and these modified nucleotides are identified by sequencing (Fig. 2B). Bis-DRIP-seq profiles R-loops at near base pair resolution. Most DRIP-based methods sequence the DNA strand of R-loops. In DRIPc-seq (Fig. 2C), the RNA strand in the DNA–RNA hybrid is converted to cDNA and sequenced (Sanz et al. 2016). However, based on the reported cross-reactivity of the S9.6 antibody (Hartono et al. 2018; Smolka et al. 2021), DRIPc data may identify a combination of RNAs within R-loop structures and dsRNA.
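The readout principle behind bisulfite-based mapping can be illustrated with a minimal sketch. Because uracil is read as thymine after amplification, a converted cytosine appears as a C-to-T mismatch against the reference. The function names and toy sequences below are hypothetical and are not taken from any published pipeline; real analyses work on aligned reads and must also track G-to-A changes arising from the opposite strand.

```python
from typing import List


def c_to_t_positions(reference: str, read: str) -> List[int]:
    """Return 0-based positions where a reference C is read as T.

    Assumes the read is already aligned to the reference without gaps
    (a simplification made for illustration only).
    """
    if len(reference) != len(read):
        raise ValueError("reference and read must have the same length")
    ref, qry = reference.upper(), read.upper()
    return [i for i, (r, q) in enumerate(zip(ref, qry)) if r == "C" and q == "T"]


def conversion_fraction(reference: str, read: str) -> float:
    """Fraction of reference cytosines that appear converted in the read."""
    cytosines = reference.upper().count("C")
    if cytosines == 0:
        return 0.0
    return len(c_to_t_positions(reference, read)) / cytosines


if __name__ == "__main__":
    ref = "ACGTCCGA"
    read = "ATGTTCGA"  # toy read with two converted cytosines
    print(c_to_t_positions(ref, read))
    print(conversion_fraction(ref, read))
```

In this toy model, a stretch of reads with a high conversion fraction on one strand would be consistent with that strand having been single-stranded during the nondenaturing bisulfite treatment.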
Schematic overviews of key methodologies used for mapping R-loops. (A) In DRIP, fragmented genomic DNA is incubated with the S9.6 antibody to immunoprecipitate DNA–RNA hybrids for sequencing. (B) bisDRIP combines DRIP with nondenaturing bisulfite treatment to selectively deaminate ssDNA regions and convert cytosine residues into uracil. Detection of cytosine conversion events in sequenced libraries enables the identification of the ssDNA and DNA–RNA hybrid components of the R-loops in a strand-specific fashion. (C) DRIPc combines immunoprecipitation with the S9.6 antibody with DNase I treatment to selectively degrade the DNA strands, leaving the RNA component of DNA–RNA hybrids. Strand-specific cDNA library preparation allows identification of R-loop RNA. (D) In DRIVE-seq, catalytically deficient RNase H (RHΔ) is used to isolate DNA–RNA hybrids from fragmented genomic DNA prior to sequencing. (E) In R-ChIP, RHΔ is expressed in vivo with an epitope tag, and genomic sequences bound by RHΔ are recovered via chromatin immunoprecipitation. (F) MapR uses a purified RHΔ–MNase fusion protein to bind and cleave chromatin fragments containing R-loops, which are then purified and sequenced.
To provide an orthogonal antibody-independent approach to detect R-loops, methods taking advantage of the evolutionarily conserved specificity of RNase H to DNA–RNA hybrids were developed. Based on the observation that catalytically inactive RNase H1 (RHΔ) can recognize DNA–RNA hybrids (Wu et al. 2001), the DNA–RNA in vitro enrichment (DRIVE) assay was developed to isolate DNA–RNA hybrids in vitro from fragmented genomic DNA prior to sequencing (Fig. 2D; Ginno et al. 2012). Although DRIVE-seq identified R-loops, it lacked efficiency and sensitivity compared with DRIP-seq. However, it showed that RNase H could be used as a tool to detect R-loops in genomic studies. R-ChIP was developed to capture in vivo R-loops (Fig. 2E; Chen et al. 2017). In R-ChIP, RHΔ is expressed in vivo with an epitope tag. Genomic sequences bound by RHΔ are then recovered through standard chromatin immunoprecipitation. R-ChIP discovered genomic sequences with the G/C skew that is typically seen in R-loops and an enrichment of signal at transcription pause sites. Interestingly, unlike DRIP-seq, R-ChIP did not show R-loop presence at transcription termination sites, suggesting that S9.6 and RNase H may selectively recognize distinct features within R-loops.
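The G/C skew mentioned above is commonly quantified as (G − C)/(G + C) over a window of one strand, with positive values indicating an excess of guanines. The following is a minimal sketch under that definition; the function names, window size, and step are illustrative assumptions rather than parameters from the cited studies.

```python
from typing import List, Tuple


def gc_skew(seq: str) -> float:
    """Return (G - C) / (G + C) for a sequence, or 0.0 if it has no G or C."""
    s = seq.upper()
    g, c = s.count("G"), s.count("C")
    if g + c == 0:
        return 0.0
    return (g - c) / (g + c)


def windowed_gc_skew(seq: str, window: int = 100, step: int = 50) -> List[Tuple[int, float]]:
    """Compute GC skew in sliding windows; returns (start, skew) pairs.

    Sequences shorter than the window are scored as a single window.
    """
    last_start = max(len(seq) - window, 0)
    return [
        (start, gc_skew(seq[start:start + window]))
        for start in range(0, last_start + 1, step)
    ]


if __name__ == "__main__":
    # Toy sequence: a G-rich half followed by a C-rich half.
    for start, skew in windowed_gc_skew("GGGGCCCC", window=4, step=4):
        print(start, skew)
```

Skew computed this way depends on which strand is scanned, so the sign flips on the reverse complement; any comparison with transcription direction has to account for strand.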
Cleavage under targets and release using nuclease (CUT&RUN) (Skene and Henikoff 2017; Skene et al. 2018) is an alternative to chromatin immunoprecipitation for identifying regions of the genome bound by specific proteins of interest. Its advantages include low background, speed, and suitability for small cell numbers. MapR is an approach developed on the principles of CUT&RUN, but it uses a purified fusion protein containing RHΔ and micrococcal nuclease (RHΔ–MNase) instead of an antibody (Fig. 2F; Yan et al. 2019). In MapR, permeabilized cells are incubated with RHΔ–MNase protein, which can recognize DNA–RNA hybrids in cells through the RHΔ moiety. The MNase module is activated by addition of calcium, which then results in cleavage around the binding sites of the RHΔ–MNase protein. R-loop-containing chromatin fragments are released, purified, and sequenced. The major advantage of MapR is that it can be used in any cell type without the need to generate a stable cell line. In an improvement to MapR, incorporation of nondenaturing bisulfite treatment and directional library preparation enabled strand-specific detection of R-loops as well as improved resolution (Wulfridge and Sarma 2021). CUT&Tag is another alternative to ChIP and CUT&RUN that uses Tn5 transposase to insert adapters directly into specific genomic sites (Kaya-Okur et al. 2019). This streamlines sample preparation considerably prior to high-throughput sequencing. In R-loop CUT&Tag, the hybrid binding domain of RNase H was expressed in cells and processed for CUT&Tag (Wang et al. 2021), which confers the advantages of CUT&Tag on an RNase H–based mapping protocol, including speed, low background, and adaptability to small cell numbers.
R-loops can also be detected by approaches that target their single-stranded component, rather than the DNA–RNA hybrid. The ssDNA component of the R-loop can fold into a G4 structure. The BG4 antibody recognizes G4 structures (Biffi et al. 2013) and was used in conjunction with CUT&RUN or CUT&Tag approaches to profile G4s genome-wide (Hänsel-Hertsch et al. 2018). Similarly, kethoxal, a chemical that reacts with the N1 and N2 positions of guanine residues in ssDNA and ssRNA, was used to develop kethoxal-assisted ssDNA sequencing (KAS-seq) to map single-stranded guanine-rich regions of the genome (Wu et al. 2022). An azide-tagged kethoxal used in this method can be easily modified with a biotin group after it reacts with guanine residues. These biotin-labeled guanines are enriched to recover G-rich single-stranded regions of the genome, which can occur on the template strands of R-loops. Although BG4 and N3-kethoxal are not direct R-loop detection tools, the co-occurrence of G4s and R-loops at many genomic loci suggests that they may be used as a proxy for R-loop detection or used together with other DNA–RNA hybrid detection strategies for a more comprehensive view of R-loop distribution in different settings.
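Independently of the experimental methods above, candidate G4-forming sequences are often flagged in silico with a simple sequence heuristic: four runs of at least three guanines separated by loops of one to seven nucleotides. The sketch below implements that heuristic with a regular expression. It is an illustrative approximation only; it is not how BG4 or KAS-seq data are analyzed, and the function name is hypothetical.

```python
import re
from typing import List, Tuple

# Four G-runs (>= 3 G each) separated by loops of 1-7 nucleotides.
G4_PATTERN = re.compile(
    r"G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}"
)


def find_g4_motifs(seq: str) -> List[Tuple[int, int, str]]:
    """Return (start, end, motif) for non-overlapping matches on the given strand."""
    return [(m.start(), m.end(), m.group()) for m in G4_PATTERN.finditer(seq.upper())]


if __name__ == "__main__":
    # Toy sequence containing four GGG runs separated by TTA loops.
    for hit in find_g4_motifs("AAGGGTTAGGGTTAGGGTTAGGGAA"):
        print(hit)
```

The scan covers only the strand supplied; to find motifs on the opposite strand, the reverse complement would need to be scanned as well. A sequence match indicates potential, not evidence that a G4 or an R-loop forms in cells.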
In addition to molecular methods (Table 1), several computational tools exist for R-loop prediction. R-loopDB was developed by splitting a potential R-loop-forming sequence (RLFS) into sections that require guanine nucleotide clusters linked to G-rich regions to identify regions within genes with the potential to form R-loops (Wongsurawat et al. 2012). Updates to R-loopDB now support detection of RLFS in multiple organisms (Jenjaroenpun et al. 2015, 2017). More recently, deep learning prediction models were trained on an integrated cohort of numerous experimental data sets to identify R-loops (Hu et al. 2024). Continued development of machine learning and artificial intelligence in research may shed further light on sequence predictors of R-loop formation under different biological conditions.
Comparison of some R-loop profiling methods
Dynamics of R-loop formation and resolution in the genome
Molecular drivers of R-loop formation
R-loops typically occur at GC-rich genomic regions, where G-rich RNA sequences form highly stable hybrids with complementary C-rich DNA strands (Roy and Lieber 2009). DNA topology, particularly negative DNA superhelicity, plays a crucial role in R-loop formation and stability (Drolet et al. 1994; Massé et al. 1997; Massé and Drolet 1999; El Hage et al. 2010). Studies in Escherichia coli showed that negative supercoiling lowers the energy barrier to form the DNA–RNA hybrid, thereby enhancing R-loop stability (Massé et al. 1997; Massé and Drolet 1999). Advancing RNA Pol II generates negative supercoils behind it, creating an underwound DNA state that facilitates the reannealing of the nascent transcript and promotes R-loop formation while simultaneously alleviating superhelical stress during transcription (Liu and Wang 1987; Kouzine et al. 2013). DNA topoisomerases are a family of nuclear enzymes that regulate DNA topology. In the absence of DNA topoisomerase I (TOP1), negative supercoils accumulate behind elongating RNA Pol II, promote DNA unwinding, and facilitate nascent RNA annealing to form R-loops. Topoisomerases transiently cut DNA strands to relax superhelicity generated by the unwinding of the DNA template behind RNA Pol II. Topoisomerase deficiency in bacteria and yeast results in growth defects and R-loop accumulation, which can be rescued by the overexpression of RNase HI (Drolet et al. 1995). In human cells, TOP1 deficiency causes stalled and collapsed replication forks (RFs), as well as chromosome breaks during the S phase in highly transcribed regions, driven by excessive R-loop formation (Tuduri et al. 2009), highlighting the significant contribution of DNA topology to R-loop dynamics. In contrast, little is known about protein factors that actively promote R-loop formation. Replication protein A (RPA) was found to contribute to R-loop formation in vitro (Mazina et al. 2020), but whether it functions similarly in vivo remains to be tested.
Key players in R-loop resolution
Cellular R-loop levels are controlled through both the inhibition of their formation and the active resolution of formed structures. Numerous RNA binding and processing factors bind nascent RNA and block its hybridization with the DNA strand, thereby preventing R-loop formation. THO/TREX is a conserved eukaryotic protein complex that functions in transcription and mRNA metabolism (Huertas and Aguilera 2003; Pühringer et al. 2020). THO/TREX binds mRNA and recruits NXF1 and NXT1 mRNA transport proteins. Binding of NXF1 to TREX alters the structure of NXF1 to facilitate RNA binding and promote transfer of mRNA out of the nucleus (Viphakone et al. 2012). Mutations in a component of the THO/TREX complex result in R-loop accumulation in yeast (Huertas and Aguilera 2003). Recently, THO/TREX was reported to function in the maintenance of telomere stability by binding TERRA lncRNA to prevent the formation of R-loops at telomeres (Fernandes and Lingner 2023). Members of the serine/arginine (SR)-rich protein family are well-characterized regulators of mRNA metabolism. Notably, SRSF1, initially identified as a splicing factor, regulates various aspects of the mRNA life cycle, including alternative splicing, transcription, nuclear export, and translation. Consistent with the role of RNA-binding proteins in R-loop prevention, in vivo depletion of SRSF1 leads to genomic instability through R-loop formation (Li and Manley 2005). Similarly, mutations in other splicing factors, such as SF3B1 and SRSF2, also result in R-loop accumulation and can impact genome stability (Chen et al. 2018; Singh et al. 2020). Finally, loss of ATRX, a chromatin remodeler that can also bind RNA (Sarma et al. 2014), results in increased R-loops at telomeres (Nguyen et al. 2017). In vitro studies show that ATRX–RNA binding can prevent R-loop formation (Yan et al. 2022). These examples highlight that RNA-binding proteins with functions in mRNA and chromatin regulation can also have critical roles in R-loop prevention.
The most well-known factors for resolving R-loops are RNase H enzymes. Eukaryotes possess two specialized RNases, RNase H1 and RNase H2, which degrade the RNA in DNA–RNA hybrids but differ in their substrate specificities. RNase H1 consists of a hybrid binding domain and an endonuclease domain requiring interaction with ribose molecules for optimal activity (Cerritelli and Crouch 1995). In contrast, RNase H2 is a protein complex composed of three subunits: RNase H2A, H2B, and H2C. Like RNase H1, RNase H2 is able to degrade the RNA component of R-loops; however, it also has a unique capability to remove ribonucleotides that are erroneously incorporated into DNA during replication (Williams et al. 2016). Interestingly, in yeast, the expression of the catalytic subunit of RNase H2 was increased after DNA replication, further supporting a crucial role for RNase H2 in removing misincorporated ribonucleotides in DNA (Lockhart et al. 2019). DICER, an endo-RNase of the RNase III family with established roles in small RNA biogenesis, was recently shown to counteract R-loop accumulation by specifically cleaving the RNA strand within R-loops (Camino et al. 2023). Characterization of the RNA cleavage activity of DICER in the context of R-loops and DNA–RNA hybrids showed that DICER appears to target the RNA moiety only when present in R-loops and not in DNA–RNA hybrid substrates. This specificity raises intriguing questions about how DICER distinguishes R-loops from other DNA–RNA hybrids, warranting further investigation into its mechanism of action and regulation.
Helicases are a large family of enzymes that are also implicated in R-loop regulation and function by unwinding DNA–RNA hybrids or G4s. Senataxin (human SETX/yeast Sen1) is a well-characterized R-loop-resolving helicase that can unwind the RNA from the DNA–RNA hybrid to resolve R-loops (Skourti-Stathaki et al. 2011; Hasanova et al. 2023). Senataxin has important roles in transcription regulation and genome integrity. In yeast, mutations in sen1 lead to the accumulation of R-loops, particularly at transcription-replication conflicts (TRCs), resulting in transcription-dependent hyperrecombination (Mischo et al. 2011; Alzu et al. 2012). Most helicases involved in R-loop resolution function by unwinding hybrids, as seen in the cases of DDX5 and DHX9 (Chakraborty and Grosse 2011; Mersaoui et al. 2019). Only a few helicases have been examined for their ability to resolve G4s. PIF1 and BLM, which play important roles in the maintenance of telomere integrity and DNA repair, are able to unwind both hybrids and G4s (Sun et al. 1998; Boule and Zakian 2007; Popuri et al. 2008; Sanders 2010). Further studies on whether these hybrid and G4 helicases function together in cells and on how they are regulated will yield valuable information on how their mutations contribute to various diseases.
The dual role of R-loops in DNA damage induction and repair mechanisms
Although R-loops have been implicated in genome instability through their ability to cause DNA damage, it has also become abundantly clear that these structures also play important roles in promoting DNA repair. Because the mechanisms through which R-loops contribute to damage and repair have been discussed in detail in several recent reviews (Marnef and Legube 2021; Brickner et al. 2022; Petermann et al. 2022; Gómez-González and Aguilera 2023; Wulfridge and Sarma 2024), we will briefly summarize recent advances in this field.
R-loops as catalysts of DNA damage
Disruptions in pathways involved in R-loop resolution can lead to DNA damage, particularly DSBs (Huertas and Aguilera 2003; Li and Manley 2005; Paulsen et al. 2009). The ssDNA in R-loops can be sensed as DNA damage and recognized by endonucleases, XPG and XPF, in the transcription-coupled nucleotide excision repair (TC-NER) pathway (Sollier et al. 2014). Indeed, reducing the levels of either XPG or XPF alleviates R-loop-associated DNA damage. Recent studies also identified R-loop-dependent interactions between XPG and XPF endonucleases with the XAB2 splicing factor, providing an explanation for how defective splicing can induce DNA damage through R-loop accumulation (Goulielmaki et al. 2021).
TRCs are the major source of DSBs in cells (Gómez-González and Aguilera 2019). In the S phase of the cell cycle, RFs encountering R-loops or paused RNA Pol II can lead to fork stalling and collapse (Fig. 1). These TRCs are particularly detrimental when RFs collide with transcription machinery in a head-on orientation compared with codirectional conflict (Mirkin and Mirkin 2005; Prado and Aguilera 2005). Notably, head-on conflicts are more likely to result in R-loop formation, turning RNA Pol II into a stronger obstacle for replication (Hamperl et al. 2017). Interestingly, in vitro studies suggest that replisomes can bypass such barriers without RF collapse, and a reconstituted yeast replication system demonstrated that RFs can bypass naked R-loops (Kumar et al. 2021). Consistent with these findings, in bacterial systems, RNA polymerase bound to R-loops temporally stalls RFs more effectively than R-loops alone (Brüning and Marians 2020). When elongating RNA polymerase encounters an obstacle, it moves backward along the DNA template (Nudler 2012). Such backtracking destabilizes the transcription complex and arrests RNA polymerase (Cheung and Cramer 2011). Bacterial RNA polymerase backtracking can increase TRCs, and such collisions can potentially lead to DSBs in an R-loop-dependent manner. TFIIS, through its RNA cleavage activity, can attenuate TRC-induced DSBs by releasing backtracked Pol II through 3′-RNA cleavage (Zatreanu et al. 2019; Duardo et al. 2024).
R-loops in DNA repair: mediators or barriers?
Accurate repair of DNA damage is essential for maintaining genome integrity, as any failure to properly repair damaged DNA can lead to mutations, chromosomal instability, and cellular dysfunction. Interestingly, several studies have uncovered that DNA–RNA hybrid formation plays a critical role in the early steps of the DNA damage response (DDR). Organisms rely on two major pathways to repair DSBs: nonhomologous end-joining (NHEJ) and homologous recombination (HR). NHEJ is error-prone and often introduces insertions, deletions, and substitutions at break sites (Lieber 2010; Chapman et al. 2012). In contrast, HR, which is active during the S and G2 phases, uses homologous DNA templates for error-free repair. R-loops are predominantly implicated in HR repair of DSBs (Fig. 1). A critical step in HR is DNA-end resection, initiated by the MRE11–RAD50–NBS1 (MRN) complex localizing to DSBs and associating with ATM (previously known as ATM serine/threonine kinase), which phosphorylates the histone variant H2AX. The MRN complex, together with RB binding protein 8, endonuclease (RBBP8; also known as CtIP), promotes short-range 5′-end DNA resection, generating 3′-ssDNA overhangs (Sartori et al. 2007). Resection can be further processed by the Exo1 and/or Dna2 nucleases to create long DNA resections (Zhu et al. 2008). The resected DNA with 3′-ssDNA overhangs is coated with RPA, facilitating DDR signaling and enabling RAD51 loading to drive HR. Importantly, the HR pathway relies on hybrid formation at DNA break sites, which is essential for the recruitment of key mediators such as BRCA1, BRCA2, and RAD51 (D'Alessandro et al. 2018). DNA–RNA hybrids were shown to accumulate at DSBs that occur at active genes, and these result in the recruitment of senataxin helicase and other HR repair factors (Cohen et al. 2018). Interestingly, when DSBs occur at regions that are not transcriptionally active, RNA molecules may be synthesized de novo at the break site (Sharma et al. 2021; Lim et al. 2023). The DNA–RNA hybrids that form as a consequence facilitate the assembly of the repair machinery (Hatchi et al. 2015; D'Alessandro et al. 2018; Wulfridge and Sarma 2024).
R-loops are also crucial for HR repair mechanisms that utilize RNA as a template. In transcription-associated HR repair (TA-HRR), RAD52 is recruited to R-loop-associated DNA–RNA hybrids near DSBs (Yasuhara et al. 2018). The endonuclease XPG processes these hybrids and helps recruit BRCA1, which competes with 53BP1 to favor HR over NHEJ. In transcription-coupled HR (TC-HR), RNA molecules can serve as templates for precise repair of DSBs, enabling accurate repair even in G0/G1 cells that lack sister chromatids. Cockayne syndrome B (CSB) is important in TC-NER, in which it helps remove stalled RNA Pol II. CSB recognizes DNA–RNA hybrids at DSB sites and recruits HR factors such as RAD52 and RAD51 (Wei et al. 2015; Teng et al. 2018). These findings highlight the critical role of R-loops in safeguarding the genome from deleterious mutations that can arise from DNA damage.
Pathogenesis through dysregulation of beneficial R-loop functions
As discussed above, DNA damage and genome instability are the most well-known mechanisms by which R-loops are thought to contribute to pathogenesis and disease progression. For this reason, early R-loop research treated the structures as transcriptional byproducts whose prompt removal must be carefully managed by the cell. However, it is now widely understood through the work of many groups that R-loops play several key roles in normal cellular function, including regulation of gene expression and genome organization (Niehrs and Luke 2020; Wulfridge and Sarma 2024). These discoveries have revealed R-loops as essential components of cell function rather than merely harmful structures, but importantly, they also point to pathways other than DNA damage by which abnormal R-loop levels can lead to dysfunction and disease.
R-loops are enriched at both the promoter and terminator regions of genes, where they play multiple roles in regulating transcription. At the 3′-ends of RNA Pol II–transcribed genes, cotranscriptional formation of R-loops over G-rich pause sites, and their subsequent resolution by senataxin, facilitates transcription termination through the recruitment of XRN2 (Skourti-Stathaki et al. 2011). Interestingly, both aberrant accumulation and excessive removal of R-loops result in increased levels of readthrough RNA. Therefore, dysregulation of R-loops at gene ends could result in termination defects that lead to pathogenic misexpression of genes. For example, readthrough transcripts have been detected in cancer, in which they may promote oncogenic expression programs and are associated with poor prognosis (Grosso et al. 2015; Abe et al. 2024). Notably, defective termination could also imbalance the distribution of transcript isoforms. Isoform diversity is crucial to neurodevelopment (Patowary et al. 2024), and termination defects that reduce isoform diversity have been linked to neurodegeneration (LaForce et al. 2022). Thus, transcription termination defects could represent an alternative pathway to DNA damage through which R-loops contribute to the progression of cancer and neurological diseases.
R-loops can also affect gene expression through epigenetic regulation. At the CpG islands of active gene promoters, R-loop formation helps establish an unmethylated state both by blocking the activity of DNA methyltransferase 1 (Grunseich et al. 2018) and by recruiting the demethylating enzyme TET1 through GADD45A (Arab et al. 2019). As with transcriptional regulation, perturbing R-loop levels in either direction can result in a corresponding change in methylation levels and, in turn, altered gene expression. As an example, mutations in RNase H2 are observed in Aicardi–Goutières syndrome (AGS), a rare neurodevelopmental disease characterized by autoimmune inflammation (Crow et al. 2006). In AGS fibroblasts, R-loop accumulation across intergenic and intronic regions corresponds with hypomethylation across those same regions (Lim et al. 2015). Importantly, depletion of RNase H2 in non-AGS cells also results in the loss of DNA methylation, suggesting that R-loop accumulation in this disease is a cause rather than a consequence of hypomethylation (Lim et al. 2015). Conversely, the abnormal loss of R-loops can result in hypermethylation. The neurodegenerative disease amyotrophic lateral sclerosis type 4 (ALS4) is caused by mutations in the DNA–RNA hybrid helicase senataxin (Chen et al. 2004). The ALS4-mutated form of senataxin results in a reduction of R-loop levels, including over the promoter of BAMBI, a negative regulator of TGFB1 (Grunseich et al. 2018). In the absence of R-loop formation, DNMT1 binds and establishes methylation over the promoter, leading to reduced BAMBI expression and consequent activation of TGFB signaling that may contribute to disease progression. In these diseases, DNA methylation may represent a mechanism through which R-loop dysregulation can induce long-term cellular changes, given that it is a particularly stable epigenetic mark. 
Interestingly, however, the link between R-loops and methylation may also be exploitable in the therapeutic setting. For example, IGF2BP proteins were found to recognize m6A-modified R-loops in prostate cancer cells (Ying et al. 2024). In these cells, overexpression of IGF2BP proteins results in R-loop accumulation that inhibits methylation at the promoter for the tumor suppressor SEMA3F, leading to high expression of SEMA3F and sensitization to chemotherapy.
Recent findings have revealed roles for R-loops in the regulation of three-dimensional genome organization. R-loops frequently colocalize with binding sites for the architectural protein CTCF, where they can strengthen the binding of CTCF to its consensus motifs in conjunction with G4 structures that form on the single-stranded component (Luo et al. 2022; Wulfridge et al. 2023). The association between CTCF and G4s is particularly enriched at cis-regulatory elements marked by H3K4me3 and H3K27ac, suggesting they may facilitate promoter–enhancer contacts (Zhang et al. 2024). Given this role in maintaining chromatin architecture, it follows that R-loop dysregulation may lead to pathogenic rewiring of the genome. This has been observed in acute myeloid leukemia, in which the highly expressed lncRNA HOTTIP forms disease-specific R-loops to regulate CTCF binding and TAD formation (Luo et al. 2022). Interestingly, CTCF binding sites are also prone to DNA breakage and genome instability. G4s may mediate this fragility by acting as a cleavage site for DNA topoisomerase II at strong CTCF sites (Raimer Young et al. 2024), thus connecting genome organization to DNA damage. Identifying similar R-loop-driven mechanisms by which 3D contacts, and the architectural proteins that maintain them, may be deregulated in other cancers and diseases will provide valuable insight into pathogenesis and identify potential avenues of therapy.
Pathogenesis through R-loop-mediated activation of inflammatory immune responses
R-loops, including those forming at sites of DNA damage, must be resolved to maintain genome integrity. Recent studies now suggest that certain byproducts of R-loop resolution may also impact cell health in the form of DNA–RNA hybrids that are excised from the genome. DNA–RNA hybrids were found to activate the cGAS–STING pathway in vitro (Mankan et al. 2014), suggesting they could play a role in the innate immune response in vivo. Following this line of inquiry, Crossley et al. (2023) recently discovered the presence of cytoplasmic DNA–RNA hybrids that occur when genomic R-loops are cleaved by XPG and XPF and exported from the nucleus. These hybrids can trigger activation of the IRF3 immune signaling pathway through the cGAS and TLR3 receptors, resulting in a proinflammatory TNF response and apoptosis (Fig. 3). In this study, genomic sequencing revealed that cytoplasmic DNA–RNA hybrids exist at low levels even in wild-type cells but increase greatly upon loss of R-loop resolvers such as senataxin. Interestingly, only a small subset of genomic R-loops is susceptible to this processing and cytoplasmic release. These regions contain R-loops with particularly high stability and correspond with sites of R-loop accumulation upon senataxin loss. Together, these results point to a model in which difficult-to-remove R-loops are cleaved into cytoplasmic hybrids as a last resort when other resolution pathways fail. These hybrids could thus act as a sensor for potential genome instability, triggering apoptosis through IRF3 signaling when R-loop accumulation in the cell becomes catastrophic.
Figure 3. Mechanisms of R-loop-mediated inflammatory immune responses. Genomic R-loops that cannot be resolved by helicases such as senataxin may be cleaved by XPG and XPF endonucleases. Cleaved DNA–RNA hybrids can be exported into the cytoplasm or form micronuclei. Cytosolic hybrids or hybrids released from ruptured micronuclei can bind to cGAS or TLR3 receptors in the cytosol or endolysosome, respectively. Activation of cGAS triggers innate immune signaling and apoptosis through the cGAS/STING/TBK1/IRF3 axis.
The roles of R-loops in DNA damage may be further linked to the innate immune response in the context of cytosolic micronuclei, which can form in an R-loop-dependent manner and are associated with genomic instability (De Magis et al. 2019). cGAS accumulates at micronuclei upon disruption of the micronuclear envelope, triggering a response similar to that prompted by cytosolic DNA or DNA–RNA hybrids (Mackenzie et al. 2017). In cancer cells, senataxin loss was observed to induce micronuclei formation from R-loops processed by EXO1, in turn triggering cGAS-mediated inflammation (Zannini et al. 2024). Senataxin loss further impaired autophagy, which is normally one way these micronuclei are removed. Thus, similar to cytoplasmic hybrid detection, activation of immune pathways by micronuclei could represent a safeguard to manage cells with excessive R-loop-related DNA damage. One interesting question is what factors determine whether an R-loop is exported as a cytoplasmic hybrid versus retained in micronuclei, as well as how this may affect the intensity of the consequent immune response.
The propensity of DNA–RNA hybrids to generate a proinflammatory immune response means that R-loops could accelerate disease progression in some contexts, while acting as vulnerabilities in others. Neuroinflammation driven by the cGAS–STING pathway in microglia may contribute to progression of neurodegenerative diseases (Talbot et al. 2024). A similar mechanism, in which telomeric R-loop accumulation upon transcriptional stress causes release of telomeric DNA into the cytoplasm and a subsequent immune response, may contribute to cellular senescence and aging (Siametis et al. 2024). Conversely, in cancers, loss of ARID1A results in R-loop accumulation and increased levels of cytosolic DNA–RNA hybrids (Maxwell et al. 2024). These ARID1A-deficient tumors are sensitized to treatment by immune checkpoint blockade, which is driven by STING-dependent IFN signaling upon the detection of accumulated cytosolic hybrids. Similarly, G4-stabilizing ligands such as pyridostatin have been shown to generate micronuclei in cancer cells through R-loop formation (De Magis et al. 2019). It is therefore possible that these G4 ligands, in addition to inducing genome instability, could also enhance the immune response against tumor cells. Future studies on how management of R-loop accumulation can control the innate immune response will therefore be invaluable for developing preventative measures in certain diseases, as well as targeted therapeutic strategies in others.
Conclusions
R-loops present a fascinating duality in genome maintenance. They act as both genomic safeguards and threats, depending on their regulation and context. On one hand, R-loops contribute to DNA damage if they accumulate excessively or persist unresolved, leading to replication stress, mutations, and chromosomal instability. On the other hand, they play essential roles in DNA repair, particularly in HR and DDR, by facilitating the recruitment of key repair proteins and even serving as templates for error-free repair. This paradox underscores the importance of finely tuned regulatory mechanisms that balance R-loop formation and resolution. By carefully managing their presence, cells leverage the protective functions of R-loops while minimizing their potential to harm, highlighting their indispensable role in maintaining genomic integrity.
We highlight the importance of examining R-loop function and dysfunction in contexts beyond genome instability. Although the participation of R-loops in DNA damage remains an incontrovertible component of their role in disease, their roles in key facets of cellular homeostasis indicate that their dysregulation may contribute to pathogenesis via many other mechanisms. A crucial implication here is that any R-loop imbalance, not just accumulation, can lead to harm. Therefore, therapeutic strategies may need to be more complex than simply suppressing R-loop formation, which could have unforeseen and deleterious effects. A thorough investigation of cellular pathways that are regulated by R-loops, as well as a comprehensive understanding of mechanisms of how R-loops regulate various processes, will aid in advancing strategies to modulate these structures for therapies.
Competing interest statement
The authors declare no competing interests.
Acknowledgments
This work was supported by funding from the National Institutes of Health (National Institute of General Medical Sciences: R01GM143229; National Institute of Neurological Disorders and Stroke: R01NS135217 and R01NS127828) to K.S.
Footnotes
Article published online before print. Article and publication date are at https://www.genome.org/cgi/doi/10.1101/gr.278992.124.
This article is distributed exclusively by Cold Spring Harbor Laboratory Press for the first six months after the full-issue publication date (see https://genome.cshlp.org/site/misc/terms.xhtml). After six months, it is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), as described at http://creativecommons.org/licenses/by-nc/4.0/.














