Systems and methods for photoregulated oligomerization and phase separation of folded domains and RNA granule-associated protein domains

Document No.: 1821181 Publication date: 2021-11-09

Reading note: This technology, Systems and methods for photoregulated oligomerization and phase separation of folded domains and RNA granule-associated protein domains, was designed by C. P. Brangwynne, D. Bracha, V. Drake, and D. W. Sanders on 2020-02-19. Its main content is as follows: Methods and systems for phase separation of folded domains, and more particularly for inducing clustering of folded domains as part of drug-based screening applications, are disclosed. The systems and methods utilize one or more first fusion proteins (100, 101), each first fusion protein comprising a first region (110) fused to a second region (120), the first region (110) comprising at least one light-sensitive protein (115) or a cognate partner of a light-sensitive protein (116), and the second region (120) comprising one or more folded RNA Binding Domains (RBDs), disordered RBDs, folded non-RBD domains, or a combination thereof (125).

1. A protein system, comprising:

one or more photoproteins (100, 101), each photoprotein comprising a first region (110) fused to a second region (120), the first region (110) comprising at least one light-sensitive protein (115) or a cognate partner of a light-sensitive protein (116), the second region (120) comprising one or more folded RNA Binding Domains (RBDs), disordered RBDs, folded non-RBD domains, or a combination thereof (125).

2. The protein system according to claim 1,

wherein the first region (110) of the photoprotein (101) comprises a first cognate (116) of a first light-sensitive protein, and

wherein the protein system (250) further comprises:

a core protein (200) comprising a first region (210) fused to a second region (220), the first region (210) of the core protein (200) comprising the first light-sensitive protein (215) and the second region (220) of the core protein (200) comprising one or more folded RNA Binding Domains (RBDs), disordered RBDs, folded non-RBD domains, or a combination thereof (225), the second region (220) of the core protein (200) being adapted for self-assembly.

3. The protein system according to claim 2,

wherein the protein system (350) further comprises:

a fixed linker protein (300) comprising a first region (310) fused to a second region (320), the first region (310) and the second region (320) of the fixed linker protein (300) each comprising one or more folded RNA Binding Domains (RBDs), disordered RBDs, folded non-RBD domains, or combinations thereof (315, 325), and the first region (310) and the second region (320) of the fixed linker protein (300) each being adapted to interact with the second region (120) of the photoprotein (100).

4. The protein system according to claim 2,

wherein the protein system (450) further comprises:

an optional photoprotein (400) comprising a first region (410) fused to a second region (420), the first region (410) of the optional photoprotein (400) comprising a second cognate (415) of a second light-sensitive protein, the second region (420) of the optional photoprotein (400) comprising one or more folded RNA Binding Domains (RBDs), disordered RBDs, folded non-RBD domains, or a combination thereof (425), the second region (420) of the optional photoprotein (400) being adapted to interact with the second region (120) of the photoprotein (100); and

an optional core protein (500) comprising a first region (510) fused to a second region (520), the first region (510) of the optional core protein (500) comprising the second light-sensitive protein (515), and the second region (520) of the optional core protein (500) comprising one or more folded RNA Binding Domains (RBDs), disordered RBDs, folded non-RBD domains, or combinations thereof (525), the second region (520) of the optional core protein (500) being adapted for self-assembly.

5. The protein system according to claim 1,

wherein the system (650) comprises at least two photoproteins (100, 101), wherein one of the at least two photoproteins (100) comprises a first light-sensitive protein (115), and wherein another of the at least two photoproteins (101) comprises a cognate partner (116) of the first light-sensitive protein, and

wherein the system further comprises two or more PPI core proteins (600), each PPI core protein (600) comprising a first region (610) fused to a second region (620), the first region (610) of the PPI core protein (600) comprising one or more folded RNA Binding Domains (RBDs), disordered RBDs, folded non-RBD domains, or combinations thereof (615), the second region (620) of the PPI core protein (600) comprising one or more folded RNA Binding Domains (RBDs), disordered RBDs, folded non-RBD domains, or combinations thereof (625),

wherein the first region (610) of each PPI core protein (600) is adapted to interact with the second region (120) of the photoprotein (100, 101), and wherein the second region (620) of each PPI core protein (600) is adapted to self-assemble.

6. The protein system according to claim 1,

wherein the first region of the photoprotein comprises a first light sensitive protein (115), and

wherein the protein system further comprises:

an additional photoprotein (700), wherein the first region (710) comprises the first light-sensitive protein (715) and the second region (720) comprises one or more folded RNA Binding Domains (RBDs), disordered RBDs, folded non-RBD domains, or a combination thereof (725),

wherein the second region (720) of the additional photoprotein (700) is adapted to interact with the second region (120) of the photoprotein (100), and

wherein the first region (110) of the photoprotein (100) and the first region (710) of the additional photoprotein (700) are adapted to self-assemble into an oligomer of at least 2 in response to light.

7. The protein system of claim 1, wherein the light-sensitive protein is fused to a folded RBD, and the folded RBD is an RNA Recognition Motif (RRM), a K Homology (KH) domain, a Pumilio (PUM) domain, a zinc finger domain, a DEAD box helicase domain, a double-stranded RNA binding domain (dsRBD), an m6A RNA binding domain (YTH domain), or a Cold Shock Domain (CSD).

8. The protein system of claim 1, wherein the light-sensitive protein is fused to a disordered RBD, and the disordered RBD is an arginine-glycine (RG) domain, an arginine-glycine-glycine (RGG) domain, a serine-arginine (SR) domain, or a basic-acidic dipeptide (BAD) domain.

9. The protein system of claim 1, wherein the light-sensitive protein is fused to one or more folded non-RBD domains.

10. The protein system of claim 1, wherein the first region comprises ferritin.

11. The protein system of claim 1, wherein the at least one light sensitive protein is an engineered protein.

12. The protein system of claim 11, wherein the engineered protein is LOV2-ssrA.

13. The protein system of claim 1, wherein the first region comprises two LOV2-ssrA proteins.

14. The protein system of claim 1, wherein at least one fusion protein comprises a fluorescent tag.

15. A cell line or stem cell derived cell expressing the protein system of claim 1.

16. The cell of claim 15, wherein one or more genes configured to express the protein system are delivered into the cell using lentiviruses, adeno-associated viruses (AAV), Bacterial Artificial Chromosomes (BACs), transient transfection (e.g., liposomes or proprietary formulations for DNA plasmid introduction), microinjection, electroporation, or CRISPR/Cas9-based methods.

17. The cell of claim 15, wherein the cell is a human cell, a yeast cell, a cultured neuron, or a cell of a worm, Drosophila, rodent, or primate model.

18. An expression vector system comprising at least one expression vector configured to transfect a cell with one or more genes configured to express the protein system of claim 1.

19. The expression vector system of claim 18, wherein the expression vector system comprises a first plasmid comprising a gene capable of expressing the first fusion protein.

20. A method for measuring the phase behavior of a natural or engineered multicomponent condensate, comprising the steps of:

a. providing a protein system according to claim 1;

b. oligomerizing the folded RNA Binding Domain (RBD), disordered RBD, or folded non-RBD domain by exposing the light sensitive protein to at least one wavelength of light; and

c. measuring phase behavior by mapping phase diagrams, determining whether phase separation, condensation, or aggregation has occurred, measuring condensate material properties, protein concentration, valency, and combinations thereof.

21. The method of claim 20, wherein the protein system is located within a living cell.

22. The method of claim 20, wherein the protein system is located outside of living or dead cells.

23. The method of claim 20, wherein oligomerization drives gelation of cytoplasmic Ribonucleoprotein (RNP) granules.

24. The method of claim 20, wherein the protein system is in a well of a multi-well array/plate.

25. The method of claim 24, further comprising providing one or more chemical reagents to the well.

26. The method of claim 20, further comprising using a genetic screen based on gene knockdown or gene upregulation.

27. The method of claim 20, further comprising:

determining the effect of genetic screening based on gene knockdown, genetic screening based on upregulation, addition of one or more chemical agents to the well, or a combination thereof, based on the measured phase behavior.

Technical Field

The present disclosure relates generally to phase separation of folded domains, and more particularly, to inducing clusters of folded domains as part of drug-based screening applications.

Background

Cells compartmentalize the diverse biochemical processes required for life into reaction centers called organelles. Organelles are generally described as membrane-enclosed compartments separated from a homogeneous cytosolic solution. However, cells also use membraneless organelles to organize their contents. Such compartments are particularly abundant in the nucleus of eukaryotic cells and include the nucleolus, where ribosomes are produced, as well as RNA-protein bodies of unknown function (e.g., Cajal bodies, speckles) (Zhu and Brangwynne, 2015). In the cytoplasm, membraneless compartments are usually context-specific, forming in response to polysome disassembly (i.e., stress granules) (Ivanov et al, 2018; Protter and Parker, 2016) or the detection of specific extracellular signals (e.g., signalosomes, inflammasomes) (Gammons and Bienz, 2018; Wu and Fuxreiter, 2016). In many cases, it is not clear whether such macroscopic assemblies enhance a particular biochemical reaction or exist passively as inert sequestration centers (Shin and Brangwynne, 2017).

Recent studies have shown that the physics of liquid-liquid phase separation (LLPS) accounts for the assembly of these structures, which are increasingly known as condensates (Banani et al, 2017; Brangwynne et al, 2009; Shin and Brangwynne, 2017). Intracellular LLPS occurs at saturating protein/nucleic acid concentrations because preferential self-association minimizes free energy (Brangwynne et al, 2015). Although weak interactions between proteins containing low-complexity/intrinsically disordered regions (IDRs) or short "sticky" motifs can mediate intracellular LLPS in certain systems (Elbaum-Garfinkle et al, 2015; Frey et al, 2006; Kato et al, 2012; Molliex et al, 2015; Murakami et al, 2015; Nott et al, 2015; Patel et al, 2015), it is unclear whether such "multivalent" motif repeats are necessary for the formation of common macroscopic biological condensates such as nucleoli and stress granules. In such RNP bodies, proteins containing low-specificity RNA binding domains may be critical for LLPS due to their weak interactions with RNA-based cross-links (Chong et al, 2018; Feric et al, 2016; Lee et al, 2016; Mitrea et al, 2018; Nott et al, 2015; Vernon et al, 2018).

Despite extensive research into the effects of weak multivalent interactions, less attention has been paid to the specific interactions that allow selective phase separation of multi-component cellular condensates and the recruitment of certain substrates to the exclusion of others. The essential condensate-nucleating proteins typically exhibit a common modular architecture comprising oligomerization domains, IDRs, and substrate-binding moieties, the most common class being RBDs (Aoki et al, 2018; Hebert and Matera, 2000; Kedersha et al, 2016; Matsuki et al, 2013; Mitrea et al, 2018; 2016; 2014; Tourriere et al, 2003). Many of these RBDs are characterized by a fully folded RNA Recognition Motif (RRM) that binds with high affinity to specific RNA motifs, and a terminal RGG region that binds with low affinity to bulky RNA and dissociated ribosomes (Chong et al, 2018; Mitrea et al, 2016; Thandapani et al, 2013). For example, G3BP (stress granules), PGL (P granules), and NPM1 (nucleolus) each have an oligomerization domain at the N-terminus and a bipartite RBD (folded RRM, disordered RGG) at the C-terminus (Aoki et al, 2018; Kedersha et al, 2016; Matsuki et al, 2013; Mitrea et al, 2014; Tourriere et al, 2003). While such oligomerization domains and RBDs are considered important, we lack a quantitative understanding of their relative contributions to the phase separation of condensates with defined material properties.

Stress granules (SGs) are a particularly interesting cytoplasmic condensate that has become an important model for elucidating the general principles of intracellular phase separation (Ivanov et al, 2018; Kedersha et al, 1999; Protter and Parker, 2016). SGs are micron-scale, liquid-like RNA-protein assemblies that form in mammalian cells in response to translational arrest and subsequent polysome disassembly (Kedersha et al, 1999; 2016; 2002; Kroschwald et al, 2015; Molliex et al, 2015; Wheeler et al, 2016; Wippich et al, 2013). SG assembly involves a network of interacting RNA-binding proteins (RBPs), ribosomal subunits, and RNAs (Bounedjah et al, 2014; Kedersha et al, 2016; Markmiller et al, 2018; Youn et al, 2018). Despite this rich interaction network, previous work has highlighted the importance of a single protein, G3BP, which appears to be essential for SG assembly (Kedersha et al, 2016; Matsuki et al, 2013) and has the typical modular architecture described above (Tourriere et al, 2003). However, the mechanisms by which G3BP regulates SG biogenesis and the biophysical role of its modular architecture remain poorly understood (Kedersha et al, 2016; Matsuki et al, 2013; Panas et al, 2015; Schulte et al, 2016; Solomon et al, 2007; Wu et al, 2016).

A key obstacle that has hampered previous work on SGs and other condensates is the lack of tools for quantitatively probing, in vivo, the relative roles of the distinct interaction modules of specific condensate-nucleating proteins.

Disclosure of Invention

Phase separation/condensation generally requires the formation of a connected network of interacting biomolecules. The disclosed systems and methods allow one of skill in the art to design constructs that undergo phase separation upon light activation, but only when the underlying protein-protein or protein-RNA interactions occur. This in turn allows one to screen for conditions that disrupt an interaction by finding conditions under which phase separation/condensation fails to occur because the connected interaction network is lost.
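The screening logic described above can be illustrated with a minimal Python sketch. It assumes that each well has already been scored for the fraction of light-activated cells that formed condensates; the function name, the well labels, the example values, and the 50% cutoff are hypothetical and are not taken from this disclosure.

```python
def find_disrupting_conditions(well_fractions, control_fraction, max_relative=0.5):
    """Return wells whose condensate-positive fraction fell well below control.

    well_fractions: dict mapping well ID -> fraction of light-activated cells
                    showing condensates (0.0 to 1.0).
    control_fraction: the same fraction measured in untreated control wells.
    max_relative: hypothetical cutoff; a well is flagged when its fraction is
                  below max_relative * control_fraction.
    """
    cutoff = max_relative * control_fraction
    return sorted(well for well, frac in well_fractions.items() if frac < cutoff)


# Hypothetical example data, not measurements from the disclosure.
wells = {"A1": 0.82, "A2": 0.10, "A3": 0.79, "A4": 0.35}
hits = find_disrupting_conditions(wells, control_fraction=0.80)
print(hits)  # ['A2', 'A4']
```

Wells flagged in this way would be candidates for conditions that break the interaction network; a real screen would also need replicates and a statistical criterion.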

Systems and methods are disclosed that provide optogenetic tools for quantitatively examining biomolecular interactions, such as oligomerization, protein-protein interactions, and RNA binding, using intracellular condensation as a readout. Hubs of weakly cross-linked complexes or folded domains can be used to cross the phase boundary for liquid-liquid phase separation. These hubs may be disrupted by, for example, molecules from small-molecule libraries or physiological proteins/substrates (e.g., USP10), which reduces the valency of the complex and thereby eliminates its ability to mediate phase separation of the associated protein-RNA network.

A first aspect of the present disclosure relates to a protein system that can be used as part of a drug-based screening application. The protein system comprises one or more first fusion proteins, wherein each first fusion protein comprises a first region fused to a second region. The first region comprises at least one light-sensitive protein or a cognate partner of a light-sensitive protein, and the second region comprises one or more folded RNA Binding Domains (RBDs), disordered RBDs, folded non-RBD domains, or a combination thereof.

Optionally, when the first region of the first fusion protein comprises a first cognate of the first light-sensitive protein, the system can comprise a second fusion protein that also includes a first region fused to a second region. The first region of the second fusion protein comprises the first light-sensitive protein (which, under appropriate light conditions, allows the first fusion protein to be linked to the second fusion protein). The second region of the second fusion protein comprises one or more folded RNA binding domains, disordered RBDs, folded non-RBD domains, or a combination thereof, wherein the second region of the second fusion protein is capable of self-assembly (e.g., through dimer, trimer, pentamer, or n-mer interactions, including homo- and hetero-interactions) when in proximity to other second fusion proteins.

Furthermore, such systems may further comprise third and fourth fusion proteins, wherein the second and fourth fusion proteins self-assemble into a core structure, and the first and third fusion proteins are configured to interact with each other and may be optogenetically attached to the second and fourth fusion proteins, respectively. The third fusion protein includes a first region fused to a second region. The first region of the third fusion protein comprises a cognate partner of the second light-sensitive protein and the second region of the third fusion protein comprises one or more folded RNA Binding Domains (RBDs), disordered RBDs, folded non-RBD domains, or a combination thereof, wherein the second region of the third fusion protein is adapted to interact with the second region of the first fusion protein. The fourth fusion protein comprises a first region fused to a second region, the first region of the fourth fusion protein comprising a second light-sensitive protein, and the second region of the fourth fusion protein comprising one or more folded RNA Binding Domains (RBDs), disordered RBDs, folded non-RBD domains, or a combination thereof, wherein the second region of the fourth fusion protein is capable of self-assembling with other fourth fusion proteins or with other second fusion proteins.

Alternatively, the system can include a third fusion protein having two regions fused together, each region comprising one or more folded RNA Binding Domains (RBDs), disordered RBDs, folded non-RBD domains, or a combination thereof, and each of the two regions being adapted to interact with the second region of the first fusion protein.

Optionally, when the first region of the first fusion protein comprises a first light-sensitive protein, the system may further comprise a second fusion protein and two or more third fusion proteins. The second fusion protein includes a first region fused to a second region, wherein the first region of the second fusion protein comprises a cognate partner of the first light-sensitive protein and the second region of the second fusion protein is identical to the second region of the first fusion protein. That is, the first and second fusion proteins are nearly identical, except that one has the light-sensitive protein and the other has a cognate partner of the light-sensitive protein. Each of the two or more third fusion proteins comprises a first region fused to a second region, each of the two regions comprising one or more folded RNA Binding Domains (RBDs), disordered RBDs, folded non-RBD domains, or combinations thereof, wherein the first region of each third fusion protein is adapted to interact with the second region of the first fusion protein or the second region of the second fusion protein, and the second region of each third fusion protein is adapted to self-assemble.

Optionally, when the first region of the first fusion protein comprises a first light-sensitive protein, the system may utilize a second fusion protein comprising a first region fused to a second region. The first region of the second fusion protein comprises the first light-sensitive protein and the second region of the second fusion protein comprises one or more folded RNA Binding Domains (RBDs), disordered RBDs, folded non-RBD domains, or a combination thereof, wherein the second region of each second fusion protein is adapted to interact with the second region of the first fusion protein, and wherein the first region of the first fusion protein and the first region of the second fusion protein are adapted to self-assemble into an oligomer of at least 2 in response to light.

In some disclosed systems, the light-sensitive protein is fused to a folded RBD, and the folded RBD is an RNA Recognition Motif (RRM), a K Homology (KH) domain, a Pumilio (PUM) domain, a zinc finger domain, a DEAD box helicase domain, a double-stranded RNA binding domain (dsRBD), an m6A RNA binding domain (YTH domain), or a Cold Shock Domain (CSD). In some disclosed systems, the light-sensitive protein is fused to a disordered RBD, and the disordered RBD is an arginine-glycine (RG) domain, an arginine-glycine-glycine (RGG) domain, a serine-arginine (SR) domain, or a basic-acidic dipeptide (BAD) domain (e.g., RD, RE). In some disclosed systems, the light-sensitive protein is fused to one or more folded non-RBD domains.

In some disclosed systems, the first region comprises ferritin.

Optionally, the at least one light-sensitive protein is an engineered protein, such as LOV2-ssrA. Optionally, the at least one light-sensitive protein comprises a first LOV2-ssrA fused to a second LOV2-ssrA.

Optionally, one of the fusion proteins in the system, e.g., the first fusion protein, comprises a fluorescent tag.

A second aspect of the present disclosure relates to a cell line or stem cell derived cell expressing the protein system described above. Optionally, one or more genes configured to express the protein system are delivered into the cell using lentiviruses, adeno-associated viruses (AAV), Bacterial Artificial Chromosomes (BACs), transient transfection (e.g., liposomes or proprietary formulations for DNA plasmid introduction), microinjection, electroporation, or CRISPR/Cas9-based methods. Optionally, the cell is a human cell, a yeast cell, a cultured neuron, or a cell of a worm, Drosophila, rodent, or primate model.

A third aspect of the present disclosure relates to an expression vector system comprising at least one expression vector configured to transfect a cell with one or more genes configured to express the protein system of claim 1. Optionally, the expression vector system comprises a first plasmid comprising a gene capable of expressing the first fusion protein.

A fourth aspect of the present disclosure relates to methods for measuring the phase behavior (e.g., concentration-dependent phase diagrams, including saturation concentrations, full binodal phase boundaries, etc.) of natural or engineered multi-component membraneless organelles/condensates. The method comprises providing a protein system as described above, oligomerizing the folded RNA Binding Domain (RBD), disordered RBD, or folded non-RBD domain by exposing the light-sensitive protein to at least one wavelength of light, and measuring phase behavior by mapping a phase diagram, determining whether phase separation, condensation, or aggregation has occurred, measuring condensate material properties, protein concentration, valency, and combinations thereof.
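As a rough illustration of how a saturation concentration might be estimated from single-cell data, the Python sketch below picks the concentration cutoff that best separates cells with condensates from cells without. This misclassification-minimizing rule is an assumption made for illustration only; the disclosure does not specify how phase thresholds are calculated, and the function name and all values are hypothetical.

```python
def estimate_threshold(concentrations, has_condensates):
    """Pick the cutoff that misclassifies the fewest cells.

    concentrations: per-cell protein concentrations (arbitrary units).
    has_condensates: per-cell booleans, True if condensates were observed.
    Candidate cutoffs are the midpoints between neighbouring sorted values;
    a cell is predicted to phase separate when its concentration exceeds the cutoff.
    """
    pairs = sorted(zip(concentrations, has_condensates))
    values = [c for c, _ in pairs]
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]

    def errors(cutoff):
        # Count cells whose prediction disagrees with the observation.
        return sum((c > cutoff) != observed for c, observed in pairs)

    return min(candidates, key=errors)


# Hypothetical example data, not measurements from the disclosure.
conc = [0.1, 0.2, 0.3, 0.6, 0.8, 1.0]
observed = [False, False, False, True, True, True]
threshold = estimate_threshold(conc, observed)
print(round(threshold, 3))
```

Repeating such an estimate at several valency levels would give one point of the phase boundary per level, which is one simple way a two-dimensional phase diagram could be traced.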

Optionally, the method may be performed while the protein system is located inside a living cell, or outside a living (or dead) cell. Optionally, the protein system is in a well of a multi-well array (or plate). Optionally, oligomerization drives gelation of cytoplasmic Ribonucleoprotein (RNP) granules.

Optionally, the method further comprises providing one or more chemical reagents to the well.

Optionally, the method further comprises utilizing genetic screens based on gene knockdown (e.g., CRISPR KO, CRISPRi, siRNA, shRNA, or antisense oligonucleotides) or gene upregulation (e.g., CRISPRa or DNA plasmid-based overexpression).

Optionally, the method further comprises determining the effect of gene knock-down based genetic screening, up-regulation based genetic screening, addition of one or more chemical agents to the well, or a combination thereof, based on the measured phase behavior.

Drawings

FIG. 1A is a simplified embodiment of a first fusion protein according to the present disclosure, highlighting the first and second regions of the fusion protein.

FIG. 1B is a simplified alternative embodiment of a first fusion protein according to the present disclosure, highlighting the first and second regions of the fusion protein.

FIG. 1C is a simplified diagram illustrating an embodiment in which a single type of first fusion protein can self-assemble.

Fig. 2 is a simplified embodiment of a second fusion protein according to the present disclosure.

Fig. 3 is a simplified diagram illustrating an embodiment in which the second fusion protein self-assembles, and the first fusion protein can be attached to the self-assembled core under light of a particular wavelength.

Fig. 4 is a simplified embodiment of a fifth fusion protein according to the present disclosure.

Fig. 5 is a simplified diagram illustrating an embodiment in which a second fusion protein self-assembles, a first fusion protein can attach to the second fusion protein under light of a particular wavelength, and a third fusion protein interacts with the second fusion protein.

Fig. 6 is a simplified embodiment of a third fusion protein according to the present disclosure.

Fig. 7 is a simplified embodiment of a fourth fusion protein according to the present disclosure.

Fig. 8 is a simplified diagram illustrating an embodiment in which the second and fourth fusion proteins self-assemble, the first fusion protein can be attached to the second fusion protein under light of a particular wavelength, the third fusion protein interacts with the second fusion protein, and the third fusion protein can be attached to the fourth fusion protein under light of a particular wavelength.

Fig. 9 is a simplified embodiment of a sixth fusion protein according to the present disclosure.

Fig. 10 is a simplified diagram illustrating an embodiment in which the sixth fusion protein self-assembles, one type of first fusion protein can interact with the sixth fusion protein, a second type of first fusion protein can be attached to the first type of first fusion protein under light of a particular wavelength, and the second type of first fusion protein can interact with the sixth fusion protein.

FIG. 11 is a simplified diagram illustrating an embodiment in which different types of first fusion proteins can self-assemble.

FIG. 12 is a diagram showing how concepts from graph theory, which form the basis of the present application, inform a mechanistic framework for network-based cellular condensation. The "valency" (v) describes the number of interaction sites (shown as 1-6) associated with a "particle", which is a single protein or protein complex in a cell ("cap", v = 1; "bridge", v = 2; "node", v >> 2), and linkages between individual particles yield an assembled complex in which each particle has its own valency (as shown). Particles that lack interactions with the larger complex are "bystanders" (v = 0). If viewed in isolation (e.g., an RBP complex without RNA), the complex has a total valency that reflects the number of sites available for additional linkages (e.g., RNA binding domains) (here, v = 6). In the case of G3BP-mediated SGs (bottom), the amount of exposed mRNA available for G3BP binding is typically low in unstressed cells (high ribosome occupancy); following arsenite stress-induced polysome disassembly, mRNA is exposed and mediates network condensation through the high RBD valency of the G3BP node.
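The valency vocabulary of FIG. 12 can be expressed as a small graph computation. The Python sketch below counts the linkages of each particle and assigns the labels used above; treating every particle with three or more linkages as a "node" is a simplifying assumption (the figure describes nodes as v >> 2), and the particle names and function names are hypothetical.

```python
def count_valencies(particles, links):
    """Return a dict of particle -> number of linkages it takes part in."""
    valency = {p: 0 for p in particles}
    for a, b in links:
        valency[a] += 1
        valency[b] += 1
    return valency


def classify(v):
    """Map a valency to the labels used in FIG. 12."""
    if v == 0:
        return "bystander"
    if v == 1:
        return "cap"
    if v == 2:
        return "bridge"
    return "node"  # assumption: any v >= 3 is treated as a node


# Hypothetical network: one hub linked to three partners, one of which
# links onward to a further particle; "free" takes part in no linkage.
particles = ["hub", "p1", "p2", "p3", "p4", "free"]
links = [("hub", "p1"), ("hub", "p2"), ("hub", "p3"), ("p2", "p4")]

valency = count_valencies(particles, links)
roles = {p: classify(v) for p, v in valency.items()}
print(roles)
```

In this picture, a condition that blocks the hub's interaction sites lowers its valency; once it drops to a bridge or cap, the network can no longer branch, which is the intuition behind using loss of condensation as a readout.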

Fig. 13 is a western blot from a co-immunoprecipitation study of GFP-tagged G3BP domain deletions, validating endogenous protein interaction partners predicted by the technique using the folded domain of G3BP (NTF2). The legend shows the domains of G3BP (1300), which are similarly present in G3BP1, G3BP2A, and G3BP2B (1301, 1302, 1303): an oligomerization domain (NTF2 (dimerized): 1-141 (1310)), two IDR domains (IDR1 (acidic) (142-224) (1320) and IDR2 (P-rich) (225-334) (1330)), and two RBD domains (RRM domain (334-409) (1340) and RGG domain (410-466) (1350)), recognizing that different isoforms have the same domain organization but different amino acid numbering. Deletion of the NTF2 domain (1305) eliminates the stress-independent, high-affinity binding (RNase treatment, RIPA wash of beads) of GFP-G3BP to USP10, CAPRIN1, and UBAP2L in G3BP KO cells. The blot is representative of three independent experiments.

FIG. 14A is a simplified depiction (1400) of the five domains of interest in G3BP: an oligomerization domain (NTF2 (dimerized): 1-141 (1401)), two IDR domains (IDR1 (acidic) (142-224) (1402) and IDR2 (P-rich) (225-334) (1403)), and two RBD domains (RRM domain (334-409) (1404) and RGG domain (410-466) (1405)).

Fig. 14B is a simplified depiction (1450) of sspB-ΔNTF2, in which the NTF2 domain (1401) of G3BP (1400) is replaced by sspB (1451), but which otherwise remains unchanged.

Fig. 14C is a simplified depiction (1460) of a protein used to screen dimerization domain (NTF2) interactions and proteins that modulate their condensation, comprising four domains: an oligomerization domain (NTF2 (dimerized): 1-141 (1401)), two IDR domains (IDR1 (acidic) (142-224) (1402) and IDR2 (P-rich) (225-334) (1403)), and sspB (1451).
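The domain layouts of Figs. 14A-14C can be written down as a small data structure. The Python sketch below uses the residue ranges stated above (the RRM range overlaps IDR2 at residue 334, as given in the source); the helper names are hypothetical, and placing sspB last in the Fig. 14C layout is an assumption based only on the order in which the domains are listed.

```python
# Domain map of G3BP as listed for Fig. 14A: (name, first residue, last residue).
G3BP_DOMAINS = [
    ("NTF2", 1, 141),
    ("IDR1", 142, 224),
    ("IDR2", 225, 334),
    ("RRM", 334, 409),
    ("RGG", 410, 466),
]


def domain_names(domains):
    """Return the domain names in N- to C-terminal order."""
    return [name for name, _, _ in domains]


def sspb_delta_ntf2(domains):
    """Fig. 14B layout: the NTF2 domain is replaced by sspB."""
    return ["sspB" if name == "NTF2" else name for name in domain_names(domains)]


def ntf2_idr_sspb(domains):
    """Fig. 14C layout: the two RBDs (RRM, RGG) are dropped and sspB is added."""
    kept = [name for name in domain_names(domains) if name not in ("RRM", "RGG")]
    return kept + ["sspB"]


print(sspb_delta_ntf2(G3BP_DOMAINS))  # ['sspB', 'IDR1', 'IDR2', 'RRM', 'RGG']
print(ntf2_idr_sspb(G3BP_DOMAINS))    # ['NTF2', 'IDR1', 'IDR2', 'sspB']
```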

Fig. 15A is an intracellular phase diagram showing the interplay between core valency, core concentration, and substrate (RNA) availability in an untreated system, with the calculated optimal phase threshold shown; the system uses the same sspB construct as Fig. 14B, and the experiments were performed in human U2OS cells.

Figure 15B is an intracellular phase diagram showing the interplay between core valency, core concentration, and substrate (RNA) availability in an arsenite-treated system (increased available RNA), with the calculated optimal phase threshold shown; the system uses the same sspB construct as Figure 14B, and the experiments were performed in human U2OS cells.

Figure 15C is an intracellular phase diagram showing the interplay between core valency, core concentration, and substrate availability in a system treated with arsenite and cycloheximide (which blocks the arsenite-induced RNA increase), with the calculated optimal phase threshold shown; the system uses the same sspB construct as Figure 14B, and the experiments were performed in human U2OS cells. The calculated optimal phase threshold was almost the same as that of untreated cells (Fig. 15A).

Fig. 15D is an intracellular phase diagram of a system using the same sspB construct as Fig. 14B, treated with actinomycin D (which reduces available RNA by blocking RNA transcription), indicating that the addition of actinomycin D disrupts SG formation; the experiments were performed in human U2OS cells.

FIG. 16A is an intracellular phase diagram showing the interplay between core valency and core concentration, with the calculated optimal phase threshold shown. The system uses the same sspB construct as FIG. 14C (i.e., with the NTF2 protein-protein interaction domain but lacking RBDs), and the experiments were performed in human U2OS cells.

FIG. 16B is an intracellular phase diagram showing the interplay between core valency, core concentration, and overexpression of a control NTF2-interacting RNA-binding protein (CAPRIN1-miRFP670), which retains its network of RNA-binding interactions, with the calculated optimal phase threshold shown. The system uses the same sspB construct as FIG. 14C, and the experiments were performed in human U2OS cells. The calculated optimal phase threshold was similar to that of cells not expressing the fluorescent protein (FIG. 16A).

FIG. 16C is an intracellular phase diagram showing the interplay between core valency, core concentration, and overexpression of an NTF2-interacting protein (USP10-miRFP670), which unravels the network of RNA-binding protein interactions and inhibits phase separation, with the calculated optimal phase threshold shown. The system uses the same sspB construct as FIG. 14C, and the experiments were performed in human U2OS cells.

FIG. 17 is a graphical representation of some of the compositionally overlapping protein components of stress granules and P-bodies revealed by the techniques in this disclosure, including a graphical representation (bottom) of how network nodes cause protein complexes and RNA to form stress granules with attached P-bodies.

FIG. 18 is a fluorescence correlation spectroscopy (FCS) calibration curve used to estimate GFP and mCherry cytoplasmic concentrations in U2OS cells in order to determine the fusion protein concentrations, valencies, and phase boundaries for the techniques recited in this application. iLID-GFP and mCherry-sspB were used for calibration because they lack expected endogenous binding partners, are predicted to be monomeric, and are in general use as tags (note that, as used herein, LOV2-SsrA may sometimes be referred to as iLID).

FIG. 19 is a flow chart describing an embodiment of a screening method.

Detailed description of the preferred embodiments

The present disclosure relates to systems and methods for photoregulated oligomerization and phase separation of folding domains and RNA particle-associated protein domains, particularly for drug-based screening applications.

The system may involve various types of fusion proteins, any or all of which may contain a fluorescent protein. These fusion proteins are configured to function together to co-oligomerize and network when illuminated with light of certain wavelengths. This oligomerization and networking (or lack thereof) results in phase-specific behavior that can be monitored under various conditions and environments (e.g., when various chemical or biological agents are added) to determine under what conditions or environments the behavior changes.

The system can be seen in its simplest form in fig. 1A-1C. Referring to fig. 1A and 1B, the system requires one or more first fusion proteins (100, 101), sometimes referred to as "photoproteins", and typically a plurality of these first fusion proteins. Each first fusion protein (100, 101) comprises a first region (110) fused to a second region (120).

The first region (110) comprises at least one light sensitive protein (115) or a cognate partner (116) of a light sensitive protein. These light sensitive proteins (115) or cognate partners (116) may be any light sensitive protein or cognate partner known to those skilled in the art, including natural or engineered proteins such as BLUF domains (e.g., bPAC), phytochromes (e.g., Phy-PIF or BphP1-PpsR2), cryptochromes (e.g., LARIAT, LITE, OPTOTIM, cryptochrome 2, and CIB1), LOV domains (e.g., BACCS, LAD, LITEZ, iLID [LOV2-SsrA]/SspB, pDawn, and pDusk), fluorescent protein domains (e.g., Dronpa-based systems and PhoCl), and UVR8 domains (e.g., UVR8). In a preferred embodiment, the first fusion protein (100) uses a single LOV2-SsrA protein. In another embodiment, the first fusion protein (100) uses two LOV2-SsrA proteins.

As can be seen in the simplified figure, the cognate partner (116) is adapted to bind to the light sensitive protein when the light sensitive protein is illuminated with light of at least one wavelength. For example, using the iLID system, when LOV2-SsrA (the light sensitive protein) and SspB (its cognate partner) are mobile in the cell (i.e., positioned so that they can interact with each other), LOV2-SsrA will attach to SspB only when illuminated with light at a wavelength of about 450 nm, and will subsequently detach when no longer illuminated with such light.

The first region (110) may optionally be a region configured to self-assemble. For example, in some embodiments, the region comprises one or more proteins known to promote self-assembly via dimer, trimer, pentamer, n-mer interactions, including homotypic and heterotypic interactions. In some embodiments, the region comprises ferritin, a family of proteins known to self-assemble into hollow cage-like structures, each structure having 24 identical subunits.

The second region (120) comprises one or more folded RNA Binding Domains (RBDs), disordered RBDs, folded non-RBD domains, or a combination thereof (125). Folded RBDs can include, but are not limited to, an RNA Recognition Motif (RRM), a K Homology (KH) domain, a Pumilio (PUM) domain, a zinc finger domain, a DEAD box helicase domain, a double-stranded RNA binding domain (dsRBD), an m6A RNA binding domain (YTH domain), or a Cold Shock Domain (CSD). Disordered RBDs can include, but are not limited to, an arginine-glycine (RG) domain, an arginine-glycine-glycine (RGG) domain, a serine-arginine (SR) domain, or a basic-acidic dipeptide (BAD) domain (e.g., RD, RE). The folded non-RBD domain can be, but is not limited to, a dimerization or oligomerization domain (e.g., G3BP NTF2, the NPM1 oligomerization domain, the HSF1 trimerization domain, the DCP1A trimerization domain, etc.); such domains are generally essential for the formation of physiological biological aggregates (e.g., stress granules, nucleoli, nuclear stress bodies, P-bodies, etc.). Full-length proteins can be used without prior knowledge of the oligomerization or substrate-binding (e.g., RNA-binding) domains.

If the first fusion protein (100, 101) comprises a fluorescent protein, it may be present in the first region (110) or the second region (120), although preferably it is present in the first region.

Referring to FIG. 1C, it can be seen that this basic form of the system (150) can self-assemble when illuminated with light of a predetermined wavelength (based on the particular light sensitive protein involved) due to the interaction between the first regions (110) of the plurality of first fusion proteins (100, 101). The system in FIG. 1C can be seen to have a heterogeneous cluster of first fusion proteins, shown here as including a first fusion protein of a first type (100) and an optional type (101). In other embodiments, the system may form a homogenous cluster.

When the first fusion protein (100, 101) does not self-assemble when illuminated with light of the appropriate wavelength, a more complex system is required.

Referring to FIGS. 2 and 3, the first option is to introduce a second fusion protein (200), sometimes referred to as the "core protein". The second fusion protein (200) also includes a first region (210) fused to a second region (220). The first region (210) comprises a light sensitive protein (215). The second region (220) comprises one or more folding RNA Binding Domains (RBDs), disordered RBDs, folding non-RBD domains, or a combination thereof (225). The second region of the second fusion protein is adapted for self-assembly by dimer, trimer, pentamer, or n-mer interactions, including homotypic and heterotypic interactions. In some embodiments, the region comprises ferritin, a family of proteins known to self-assemble into hollow cage-like structures, each structure having 24 identical subunits.

FIG. 3 illustrates a system (250) with a first (101) and a second (200) fusion protein. When a plurality of second fusion proteins/core proteins (200) are in a system that allows the fusion proteins to interact, the light sensitive protein (215) of the second fusion protein/core protein (200) may be attached to a cognate partner of the light sensitive protein (116) present on the first fusion protein/photoprotein (101).

Referring to FIGS. 4-5, the second option builds on the first option by introducing a third fusion protein (300). The third fusion protein is sometimes referred to as a "fixed linker". It may connect systems of the form shown in FIG. 3, allowing for more interactions and larger networks. The third fusion protein (300) comprises at least two regions, a first and a second region (310, 320), fused together. For the third fusion protein, each region comprises one or more folding RNA Binding Domains (RBDs), disordered RBDs, folding non-RBD domains, or combinations thereof (315, 325), and the first and second regions (310, 320) of the third fusion protein are each adapted to interact with the second region of the first fusion protein.

Referring to FIG. 5, in a preferred embodiment of the present system, the domain is a dimerization or higher-order oligomerization domain that requires an additional endogenous protein adapted to interact with it in order for phase separation to occur. For example, in the G3BP NTF2 (dimerization domain) example, this endogenous protein is UBAP2L, which allows further networking and aggregation between G3BP dimers. Removing this protein from the cell by knockout, or overexpressing the protein USP10, which competes with it for the same binding pocket of the NTF2 domain, prevents phase separation. This can be seen by comparing FIGS. 16A-16C. FIG. 16C illustrates that USP10 can prevent aggregate formation by G3BP NTF2. Similar competitive protein-protein interactions may play a role in the formation of other biological aggregates, such as DDX6-dependent P-bodies. Small molecules that disrupt these types of interactions would have a similar effect, i.e., preventing or in some cases enhancing phase separation.

Referring to FIG. 5, it can be seen that this system (350) is similar to that depicted in FIG. 3, but with the addition of the third fusion protein (300), showing that the second region (125) of the first fusion protein (101) interacts with the first region (315) of the third fusion protein (300). For ease of understanding, these interacting regions are shown graphically as being able to fit together and are indicated with a "+" or "-" symbol. It will be appreciated that the third fusion protein is expected to link at least two first fusion proteins, thereby allowing the system to link various self-assembled cores and facilitating large-scale phase separation/aggregation. The extended linkage requires five fusion proteins: a second fusion protein (200) is linked to a first fusion protein (101), which is linked to the fixed linker (300), which is linked to another first fusion protein (101), which in turn is linked to another second fusion protein (200). In these systems, the order of linkage is relatively unimportant, although the linkages and interactions between the various fusion proteins may not occur simultaneously.

Referring to FIGS. 6-8, a third option builds on the first option by including a third and a fourth fusion protein (400, 500), introducing what may be referred to as a "protein-protein interaction (PPI) linker". By incorporating the "fixed linker" concept into a variant of the first fusion protein, this shortens the extended linkage seen in the second option from five fusion proteins to four, but the system (450) now requires the introduction of two different fusion proteins rather than just one.

Referring to fig. 6 and 8, the third fusion protein (400) may be considered a subset of the first fusion protein (101), and is therefore sometimes referred to as an "optional photoprotein". The third fusion protein (400) comprises a first region (410) fused to a second region (420), the first region (410) of the third fusion protein (400) comprising a second cognate of a second light sensitive protein (415). The second light sensitive protein may or may not be the same light sensitive protein (115) as the first fusion protein/photoprotein. That is, the third fusion protein (400) may not be intended to bind to the same light sensitive protein to which the first fusion protein (101) binds.

The second region (420) of the third fusion protein (400) comprises one or more folding RNA Binding Domains (RBDs), disordered RBDs, folding non-RBD domains, or a combination thereof (425). The second region (420) of the third fusion protein (400) is adapted to interact with the second region (120) of the first fusion protein (100). As shown in fig. 8, the one or more folding RNA Binding Domains (RBDs), disordered RBDs, folding non-RBD domains, or combinations thereof (425) of the third fusion protein interacts with the one or more folding RNA Binding Domains (RBDs), disordered RBDs, folding non-RBD domains, or combinations thereof (125) of the first fusion protein.

Referring to fig. 7 and 8, the fourth fusion protein (500) may be considered a subset of the second fusion protein (200), and is therefore sometimes referred to as the "optional core protein". The fourth fusion protein (500) comprises a first region (510) fused to a second region (520). The first region (510) of the fourth fusion protein (500) comprises the second light sensitive protein (515) which will bind to the cognate partner (415) present in the third fusion protein (400) or "optional photoprotein" (see, e.g., fig. 8). The second region (520) of the fourth fusion protein (500) comprises one or more folding RNA Binding Domains (RBDs), disordered RBDs, folding non-RBD domains, or combinations thereof (525), and is adapted for self-assembly, similar to the second fusion protein (200). As shown in fig. 8, the second (200) and fourth (500) fusion proteins can self-assemble. The self-assembly appears heterogeneous (both the second and fourth fusion proteins interact and assemble together), but homogenous self-assembly may also occur.

Another alternative can be seen in FIGS. 9 and 10. This option uses two forms of the first fusion protein to form what may be referred to as an "optolinker", which can then bind to a modified core protein ("PPI core") via protein-protein interactions. Thus, the protein system (650) comprises at least two first fusion proteins (100, 101). One first fusion protein (100) in the optolinker comprises a light sensitive protein (115) fused to one or more folding RNA Binding Domains (RBDs), disordered RBDs, folding non-RBD domains, or combinations thereof (125). The other first fusion protein (101) in the optolinker comprises a cognate partner (116) of the light sensitive protein fused to the same one or more folded RNA Binding Domains (RBDs), disordered RBDs, folded non-RBD domains, or combinations thereof (125). When illuminated with light of the correct wavelength, the two fusion proteins will bind, forming a link that can connect two modified core proteins (600). The system typically has two or more modified core proteins (600).

The modified core protein (600) is a second fusion protein comprising a first region (610) fused to a second region (620). The first region (610) of the second fusion protein (600) comprises one or more folding RNA Binding Domains (RBDs), disordered RBDs, folding non-RBD domains, or combinations thereof (615). The first region (610) is adapted to interact with a second region (125) of the first fusion protein. The second region (620) of the second fusion protein (600) comprises one or more folding RNA Binding Domains (RBDs), disordered RBDs, folding non-RBD domains, or a combination thereof (625). The second region (620) is adapted for self-assembly.

Yet another alternative may be seen in fig. 11. It can be seen that the system (750) comprises two variants of the first fusion protein. A variant (100) comprises a first region (110) comprising a first light sensitive protein fused to a second region comprising one or more folded RNA Binding Domains (RBDs), disordered RBDs, folded non-RBD domains, or a combination thereof (125). The second variant (700) or "photoprotein variant" comprises a first region (710) fused to a second region (720, not shown) comprising one or more folding RNA Binding Domains (RBDs), disordered RBDs, folding non-RBD domains, or a combination thereof (725). The second region comprising one or more folding RNA Binding Domains (RBDs), disordered RBDs, folding non-RBD domains or combinations thereof (725) is adapted to interact with the second region (120) of the first fusion protein (100) comprising one or more folding RNA Binding Domains (RBDs), disordered RBDs, folding non-RBD domains or combinations thereof (125). The first region (710) of the second variant (700) comprises the same first light sensitive protein, and the first regions of both variants (100, 700) are adapted to self-assemble into oligomers of at least 2 in response to light. Thus, it can be envisaged how there may be multiple self-assembled cores connected by protein-protein interactions between the two variants.

A second aspect of the present disclosure relates to a cell line or stem cell-derived cell expressing one of the above-mentioned protein systems. In some embodiments, the cell can be a human cell, a yeast cell, a cultured neuron, or a cell of a worm, Drosophila, rodent, or primate model. In some embodiments, one or more genes configured to express the protein system are delivered into a cell using lentiviruses, adeno-associated viruses (AAV), Bacterial Artificial Chromosomes (BAC), transient transfection (e.g., liposomes or proprietary formulations for DNA plasmid introduction), microinjection, electroporation, or CRISPR/Cas9-based methods.

A third aspect of the present disclosure relates to an expression vector system comprising at least one expression vector configured to transfect a cell with one or more genes configured to express the protein system described above. In some embodiments, the expression vector system comprises a first plasmid comprising a gene capable of expressing the first fusion protein.

A fourth aspect of the present disclosure relates to a method for measuring the phase behavior of natural or engineered multi-component coacervates.

The method first requires the provision of a protein system as described above. The system may be present within living cells, or within or outside dead cells. In some embodiments, the system is present in a well of a multi-well array (plate).

The method then entails oligomerizing the folded RNA Binding Domains (RBDs), disordered RBDs, or folded non-RBD domains of the fusion protein in the protein system by exposing the light sensitive protein to at least one wavelength of light. For example, when LOV2-SsrA is fused to FTH1, a 24-mer ferritin "core" coated with LOV2-SsrA molecules spontaneously self-assembles. When SspB (its cognate partner) is also present and both are mobile in the cell (i.e., positioned so that they can interact with each other), LOV2-SsrA will attach to SspB only when illuminated with light having a wavelength of about 450 nm, and will detach when no longer illuminated with such light. By varying the relative concentrations of the two components (see FIG. 18 for a calibration method for determining the concentration of fluorescent protein in a cell), the oligomerization state (valency) can be varied (0 to 24) and the intracellular phase diagram can be quantified, which is suitable for compound-based or genetics-based screening applications. Phase separation/coacervation generally requires the formation of a linked network of interacting biomolecules. The disclosed systems and methods allow one of skill in the art to engineer constructs that phase separate upon light activation, but only when the underlying protein-protein or protein-RNA interactions occur. This in turn allows one to screen for conditions that disrupt the interaction, by finding conditions under which phase separation/condensation fails because the linked interaction network is lost. This can be seen in FIG. 14C and FIGS. 16A-16C.

The system here uses so-called "NTF2 Corelets". The core consists of a 24-mer ferritin complex coated with iLID molecules (i.e., it is analogous to the second fusion protein or "core protein" described above) and serves as an oligomerization platform mediated by blue-light-stimulated sspB-iLID interactions, where sspB is fused to the IDR regions and the folded NTF2 domain of G3BP (i.e., this construct generally corresponds to the first fusion protein or "photoprotein" described above). By varying the relative concentrations of the two components, the oligomerization state can be varied between 0 and 24.
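As a rough illustration of how the oligomerization state follows from the relative concentrations, the following Python sketch estimates a mean valency as the ratio of sspB-tagged protein to 24-mer core, capped at 24 binding sites. The function name, the micromolar units, and the simple ratio model (every sspB-tagged protein is assumed bound while sites remain) are assumptions made for illustration and are not taken from the disclosure.

```python
# Illustrative sketch only: the simple ratio model below is an assumption.
CORE_SITES = 24  # iLID sites per core, from the 24-mer ferritin complex


def estimate_valency(sspb_conc_um: float, core_conc_um: float) -> float:
    """Mean number of sspB-tagged proteins per core, capped at 24.

    sspb_conc_um: concentration of the sspB-tagged fusion protein (uM)
    core_conc_um: concentration of assembled 24-mer cores (uM)
    """
    if sspb_conc_um < 0:
        raise ValueError("sspB concentration cannot be negative")
    if core_conc_um <= 0:
        raise ValueError("core concentration must be positive")
    return min(sspb_conc_um / core_conc_um, float(CORE_SITES))


if __name__ == "__main__":
    # Hypothetical concentrations, chosen only to exercise the function.
    print(estimate_valency(2.0, 0.25))
    print(estimate_valency(10.0, 0.25))
```

Under this assumed model, raising the sspB-tagged component relative to the core raises the estimated valency until all 24 sites are occupied.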

The system in FIGS. 16A-16C utilized U2OS cells expressing NTF2 Corelets. A control can be seen in FIG. 16A, where the NTF2 Corelet is not co-expressed with any other protein of interest. The phase boundaries were similar when the NTF2 Corelet was co-expressed with CAPRIN1-miRFP670 (see FIG. 16B), an NTF2-interacting protein that retains its protein- and RNA-interaction network. However, no aggregates were formed when the NTF2 Corelet was co-expressed with USP10-miRFP670 (see FIG. 16C), a protein that lacks additional protein-protein interactions and RNA-binding capacity and thus unravels the interaction network. That is, unlike CAPRIN1, USP10 prevents phase separation of the NTF2 Corelet.

In some embodiments, oligomerization drives gelation of cytoplasmic Ribonucleoprotein (RNP) particles.

The method then requires measuring the phase behavior. This may be done by mapping a phase diagram (which may consist of mapped phase boundaries), by determining whether phase separation, condensation, or aggregation has occurred, or by measuring aggregate material properties, protein concentration, valency, or a combination thereof.
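One way to picture the mapping of a phase boundary is sketched below in Python. For each core-concentration bin, the sketch reports the lowest valency at which condensation was observed. The per-cell tuple layout, the bin edges, and this crude "lowest observed valency" estimator are all assumptions for illustration; the disclosure does not specify how the calculated phase thresholds are obtained.

```python
from bisect import bisect_right


def map_phase_boundary(cells, bin_edges_um):
    """Return {bin_index: lowest valency with observed condensation}.

    cells: iterable of (core_conc_um, valency, condensed) tuples, one per cell
    bin_edges_um: sorted concentration bin edges in uM, e.g. [0.0, 0.1, 0.2]
    Bins in which no condensation was observed are omitted from the result.
    """
    boundary = {}
    n_bins = len(bin_edges_um) - 1
    for core_conc_um, valency, condensed in cells:
        if not condensed:
            continue
        # Index of the bin whose lower edge is <= the concentration.
        idx = bisect_right(bin_edges_um, core_conc_um) - 1
        if idx < 0 or idx >= n_bins:
            continue  # concentration falls outside the binned range
        if idx not in boundary or valency < boundary[idx]:
            boundary[idx] = valency
    return boundary


if __name__ == "__main__":
    # Placeholder observations, not measured data.
    demo_cells = [
        (0.05, 24, False),
        (0.15, 8, False),
        (0.15, 12, True),
        (0.15, 20, True),
        (0.30, 6, True),
        (0.30, 4, False),
    ]
    print(map_phase_boundary(demo_cells, [0.0, 0.1, 0.2, 0.4]))
```

A shift of these per-bin thresholds between conditions (for example, with and without a screening component) is the kind of change in phase behavior that the method looks for.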

In some embodiments, the method may further involve providing one or more chemical or biological agents to the well. In some embodiments, the methods can also involve the use of genetic screens based on gene knockdown (e.g., CRISPR KO, CRISPRi, siRNA, shRNA, or antisense oligonucleotides) or gene upregulation (e.g., CRISPRa or DNA plasmid-based overexpression).

In preferred embodiments, the method further comprises determining, based on the measured phase behavior, the effect of a gene knockdown-based genetic screen, a gene upregulation-based genetic screen, the addition of one or more chemical agents to the well, or a combination thereof. That is, using known screening techniques, the effect is determined from changes in phase behavior. For example, expression of USP10 prevents light-induced aggregation of G3BP NTF2 oligomers by unraveling the essential protein-protein interaction network; compounds targeting this binding pocket would have a similar effect. Similar compound-based approaches to disrupting homo- and hetero-oligomerization of similar coacervate-related proteins (e.g., NPM1, DCP1A, HSF1, etc.) are possible. Alternatively, compounds that block the RBD-RNA interactions essential for aggregation can be identified (see FIG. 15D).

An embodiment of the method can be seen in FIG. 19. Here, the method (1900) begins by providing appropriate cells (1910). These cells are then transfected with plasmids, lentiviruses, etc. encoding the appropriate fusion proteins for the desired system (1920), resulting in a stable cell line (1930).

Cells from stable cell lines can be introduced into multi-well plates (e.g., 96 or 384 well plates) (1940).
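For bookkeeping during such a screen, a minimal Python sketch of the plate layout is shown below. It generates the conventional row-letter/column-number well labels for 96-well (8 x 12) and 384-well (16 x 24) plates and pairs them with a list of screening components. The helper names and the row-by-row assignment order are hypothetical choices, not part of the disclosed method.

```python
import string

# Standard plate geometries: number of wells -> (rows, columns).
PLATE_FORMATS = {96: (8, 12), 384: (16, 24)}


def well_labels(n_wells):
    """Return labels such as 'A1' ... 'H12' in row-major order."""
    rows, cols = PLATE_FORMATS[n_wells]
    return [
        f"{string.ascii_uppercase[r]}{c}"
        for r in range(rows)
        for c in range(1, cols + 1)
    ]


def assign_to_wells(components, n_wells=96):
    """Map each screening component to one well, filling row by row."""
    labels = well_labels(n_wells)
    if len(components) > len(labels):
        raise ValueError("more components than wells")
    return dict(zip(labels, components))


if __name__ == "__main__":
    # Hypothetical component names used only for demonstration.
    print(assign_to_wells(["vehicle", "compound_1", "compound_2"]))
```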

At this point, one or more screening components may be introduced (1950) into one or more wells. The screening component can be a chemical screening component (e.g., a compound from a library of small molecules), a genetic screening component (e.g., knockdown, knock-out, overexpression, etc.), or some combination thereof. Typically, at this point, there will be some delay in allowing the screening component to interact with the cells in the well.

Then, based on the specific light sensitive proteins utilized in the system, one or more of the wells are activated by irradiating them with light of the appropriate wavelength. This activation may be for any length of time, but preferably occurs for 30 minutes or less, more preferably 20 minutes or less, and still more preferably 10 minutes or less. The illumination may be provided using a laser, an array of Light Emitting Diodes (LEDs), an LED lamp, or any such method for generating light of an appropriate wavelength.

If desired, the user can fix the cells after activation (1970) using any suitable known fixation technique (e.g., paraformaldehyde, methanol, ethanol, etc.).

With live or dead cells, the user can then capture images of the cells by known microscopic techniques (e.g., confocal, wide field, super-resolution, etc.) (1980). In some embodiments, these images may be captured simultaneously with sorting of live cells, for example using commercially available equipment known to those skilled in the art that combines Fluorescence Activated Cell Sorting (FACS) with rapid microscopy imaging.

Once the image is captured, the image may be analyzed (1990). This may involve, for example, determining the degree of aggregation, determining the concentration (e.g., by comparing the measured intensity to a calibration curve, see fig. 18), or determining the phase boundary.
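The concentration step can be sketched in Python as follows, under the assumption that the calibration curve is a straight line relating fluorescence intensity to concentration. The linear model, the function names, and the placeholder numbers in the demonstration are illustrative assumptions; the actual form of the FIG. 18 calibration is not reproduced here.

```python
def fit_line(concentrations, intensities):
    """Ordinary least-squares fit of intensity = slope * conc + intercept."""
    n = len(concentrations)
    if n < 2 or n != len(intensities):
        raise ValueError("need at least two paired calibration points")
    mean_x = sum(concentrations) / n
    mean_y = sum(intensities) / n
    sxx = sum((x - mean_x) ** 2 for x in concentrations)
    if sxx == 0:
        raise ValueError("calibration concentrations must not all be equal")
    sxy = sum(
        (x - mean_x) * (y - mean_y)
        for x, y in zip(concentrations, intensities)
    )
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def intensity_to_concentration(intensity, slope, intercept):
    """Invert the fitted line to estimate concentration from intensity."""
    return (intensity - intercept) / slope


if __name__ == "__main__":
    # Placeholder calibration points (uM, arbitrary intensity units).
    slope, intercept = fit_line([0.0, 0.5, 1.0, 2.0], [50.0, 100.0, 150.0, 250.0])
    print(intensity_to_concentration(200.0, slope, intercept))
```

In practice, separate calibrations would be needed for each fluorescent tag (e.g., GFP and mCherry), since each channel has its own intensity scale.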

From theory and experiments on patchy colloids (Bianchi et al., 2011), it is known that for a system of interacting particles to phase separate into a dynamically linked network, each particle must have a sufficient number of sites for binding to other particles, which defines its valency v (fig. 1M); here, a "particle" (or "vertex" in graph theory) may represent a single protein or RNA molecule, or a stable complex of biomolecules. In general, v > 2 is required for phase separation, with higher valencies more likely to drive phase separation. In the case of the synthetically fused G3BP dimer complex, there are only two possible interaction interfaces, which are conferred by the two exposed RBDs, so the overall valency of the synthetic G3BP dimer is 2 (i.e., 2 RBD-RNA interfaces); particles with v = 2 are called "bridges", which can connect different parts of the network but cannot by themselves form a space-spanning interaction network (see FIG. 12).

Given that the NTF2 domain of G3BP cannot be replaced by a generic dimerization domain, it can be concluded that the G3BP dimer must somehow embody a particle with v ≥ 3, rather than a bridge (v = 2). Objects with v ≥ 3 are called "nodes" (see FIG. 12). In the case of the native G3BP dimer, this valency would be achieved by at least one heterotypic protein-protein interaction (PPI) with the NTF2 domain, in addition to the two RBDs. If so, NTF2 may serve as an interaction platform that links to additional nodes, amplifying the valency as required for SG aggregation.
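The bridge/node distinction can be summarized in a short Python sketch. It takes nodes to be particles with a valency of three or more, in line with the statement that v > 2 is needed for phase separation, and it treats total valency as a simple sum of RBD-RNA interfaces and PPI interfaces. That additive counting, the function names, and the "neither" label for v < 2 are simplifying assumptions for illustration.

```python
def total_valency(n_rbd_interfaces, n_ppi_interfaces=0):
    """Total interaction sites, assuming interfaces simply add."""
    return n_rbd_interfaces + n_ppi_interfaces


def classify_particle(valency):
    """Label a particle by valency: v >= 3 is a node, v == 2 is a bridge."""
    if valency >= 3:
        return "node"
    if valency == 2:
        return "bridge"
    return "neither"  # v < 2: cannot link two other particles


if __name__ == "__main__":
    # Synthetic dimer: two exposed RBDs and no NTF2-mediated PPI.
    print(classify_particle(total_valency(2)))
    # Native dimer: two RBDs plus at least one heterotypic PPI via NTF2.
    print(classify_particle(total_valency(2, 1)))
```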

To screen for SG proteins that may increase the overall valency of the G3BP-based complex, the dimerization capacity of NTF2 can be exploited by coupling it to the Corelet system. It is hypothesized that NTF2 dimers will generate stable homotypic bridges (cross-links) between the cores, and that heterotypic NTF2-interacting bridges/nodes will partition into the resulting assemblies and promote their growth by amplifying valency, allowing such proteins to be identified by microscopic analysis.

Of a group of GFP-tagged SG proteins (N = 20) and P-body proteins (N = 3), only 8 SG proteins (USP10, UBAP2L, CAPRIN1, FMR1, FXR1, NUFIP2, G3BP1, G3BP2A) were strongly localized to the light-induced G3BP ΔRBD Corelets. The recruitment of these proteins is specific to NTF2 interactions, as aggregates formed by a self-associating IDR from another protein (FUS) only recruited FUS ΔNLS.

Consistent with the results of Corelet NTF2 interaction analysis, G3 BP-mediated co-immunoprecipitation (co-IP) of USP10, CAPRIN1 and UBAP2L all required its NTF2 domain, and the interaction was retained after RNase treatment and stringent washing steps (fig. 13).

Given that all of these proteins except USP10 have RBDs, the identified proteins can act as bridges or nodes for G3BP interactions, contributing to the overall valency required for SG condensation. Knockouts of USP10, CAPRIN1, NUFIP2, FXR1/FXR2/FMR1 (3KO), or FXR1/FXR2/FMR1/NUFIP2 (4KO) had no effect on SG formation, indicating that these components play a limited role in total valency amplification. Furthermore, the arsenite-triggered phase threshold was not significantly altered by endogenous levels of USP10 or CAPRIN1, as triple KO cells (G3BP1/G3BP2/USP10 or G3BP1/G3BP2/CAPRIN1) did not require significantly different amounts of G3BP for rescue relative to G3BP1/2 double KO cells. In contrast, UBAP2/2L double KO cells showed SGs of reduced size, which formed in only a few cells; these data raise the possibility that UBAP2L acts as an additional critical node. Support for this came from the unexpected finding that a missense mutation in G3BP (S38F) abolishes its ability to rescue stress granule formation in G3BP KO cells, even when its expression level is more than 10 times higher than the WT G3BP rescue threshold. G3BP S38F retained homodimerization and USP10-binding capacity and strongly partitioned into stress granules formed by WT G3BP. However, it failed to form a high-affinity complex with CAPRIN1 and UBAP2L, suggesting that the mutation changes G3BP from a v ≥ 3 node to a v = 2 bridge that can no longer draw additional valency from UBAP2L. In contrast, the previously described G3BP F33W variant was unable to bind USP10 or CAPRIN1 but rescued SG formation, retaining its node properties through association with UBAP2L and showing a rescue threshold concentration similar to that of WT. Together with the above findings, these data provide strong evidence that the G3BP dimer must serve as a node that binds UBAP2L in order to fulfill its essential role in SG condensation.

Higher-valency G3BP RBD nodes are sufficient for the formation of stress granules with attached P-bodies

The data indicate that stress granule formation may require an RBD complex of sufficiently high valency, but rigorously testing this model and determining the minimum valency for condensation requires a synthetic intracellular reconstitution approach. To quantitatively investigate the relationship between RBD valency and the threshold protein complex concentration for stress granule formation, the previously described Corelet system can be used to map valency- and concentration-dependent phase diagrams. In this system, the core consists of the 24-mer ferritin complex coated with iLID molecules and serves as an oligomerization platform mediated by blue-light-stimulated sspB-iLID interactions, with sspB fused to the IDR regions and the RBDs of G3BP.

FIG. 14A shows the five domains of interest in G3BP (1400). When the valency-amplifying dimerization domain of G3BP (NTF2) was replaced with the synthetic valency-amplifying sspB node, non-stressed cells were found to require a very high degree of oligomerization (valency ~24 at 0.15 μM core) to drive aggregation (see FIG. 15A). However, after arsenite treatment, aggregation occurred at much lower concentrations and valencies (valency ~8 at 0.15 μM core) (see FIG. 15B), and the granules produced were significantly larger than those in non-stressed cells. This valency-dependent phase separation occurs rapidly (within a few seconds) and is fully reversible regardless of activation time (5 vs. 60 min), indicating that multivalent RNA-binding contacts are necessary for both the formation and the maintenance of stress granules. In addition, these aggregates were found to mimic the properties of endogenous SGs, including dependence on an influx of exposed RNA, recruitment of canonical SG proteins and polyadenylated mRNAs, attachment of P-bodies, and liquid-like fusion with dynamic rearrangement of internal components. These structures are called opto-SGs (optogenetic stress granules).

The arsenite-induced shift in the phase threshold of ΔNTF2 Corelet opto-SGs was clearly visible in cells exposed to continuous 5-minute cycles of blue-light activation and deactivation. Arsenite treatment triggers de novo assembly of opto-SGs in a time- and valency-dependent manner, with the assembled opto-SGs becoming progressively larger on a time scale consistent with SG assembly in WT cells. Arsenite-driven shifts in the opto-SG phase diagram were offset by pre-treatment with cycloheximide, which prevents polysome disassembly after translation has ceased (see Fig. 15C). Furthermore, prolonged inhibition of RNA production with actinomycin D prevented the formation of opto-SGs (see Fig. 15D). These drug-dependent changes in opto-SG assembly are not artifacts of the Corelet system, as similar shifts in phase threshold are not seen with a control self-associating IDR (FUS IDR) that lacks the ability to bind RNA.

These data indicate that polysome breakdown, which is expected to flood the surrounding cytoplasm with RNA, greatly enhances the formation of light-triggered opto-SGs; this RNA serves as a nucleic acid-based node (i.e., a binding site or bridge for RNA-binding proteins) with a very high valency. To check which G3BP domains are necessary for opto-SG condensation, iterative truncations of G3BP ΔNTF2 were fused to sspB and their response to light in the Corelet system was examined. Consistent with their lack of partitioning into SGs, the disordered linkers of G3BP are unlikely to participate in significant self-interactions, as G3BP IDR1, IDR2 and IDR1/2 Corelets never led to phase separation regardless of drug treatment. In contrast, polyA+ opto-SGs containing all tested SG markers were assembled by Corelets comprising IDR2-RBD (RRM and RGG) or the RBD only. Furthermore, the Corelet system recapitulates key features associated with the expression of GFP-tagged truncated variants.

First, ΔNTF2/ΔIDR2 (i.e., an analog of GFP-G3BP1 ΔIDR2 that effectively lacks RNA-binding capacity due to local electrostatic repulsion) failed to form particles under all conditions tested. Second, like GFP-ΔIDR1, ΔNTF2/ΔIDR1 formed more irregular particles. Third, the phase threshold of the RBD-only Corelet was right-shifted relative to ΔNTF2 (i.e., containing IDR1/2), consistent with the higher concentrations of GFP-tagged ΔIDR1/2 expression required for rescue. Fourth, all ΔNTF2/ΔIDR1 Corelets recruited SG proteins and polyA+ RNA similarly to ΔNTF2 Corelets, and exhibited enhanced and reversible phase separation upon successive light-dark cycles following arsenite treatment. Finally, as with endogenous stress particles, all G3BP opto-SGs formed multiphase structures with DDX6-positive P-bodies; importantly, this indicates that in each case the high valency of G3BP Corelets confers sufficiently unfavorable interactions with the P-body interaction network to produce phase immiscibility. However, unlike the GFP-tagged G3BP variants, opto-SG formation requires both the RRM and RGG fragments of the RBD, which may reflect steric hindrance from the closely juxtaposed core.

Stress particles with attached P-bodies are the default aggregates associated with high valency RBD nodes

The higher valence G3BP RBD node (but not the dimeric bridge) is sufficient to induce stress particle formation after polysome breakdown, but it is not clear whether this is a unique feature of G3BP or is common to the RNA-binding nodes that interact with G3BP NTF2. We reasoned that if such NTF2-associated RBPs contribute the necessary additional RNA-binding valency to the multi-protein complex, synthetic high-valency nodes attached to the individual RBDs would nucleate isolated opto-SGs (i.e., mimic G3BP RBD Corelets). To test this, the RBDs of CAPRIN1 and UBAP2L were oligomerized using the Corelet system, and phase diagrams were mapped in untreated and arsenite-exposed cells. Surprisingly, although each contained only a single RGG region, its phase threshold was shifted to lower concentrations and Corelet valencies relative to the G3BP RBD (1 RRM and 1 RGG), indicating an enhanced propensity to drive mRNP aggregation; these results are not artifacts of the Corelet system, since SG rescue experiments using GFP-tagged chimeric G3BP proteins showed that such RBDs can replace the G3BP RBD with a similarly enhanced propensity. In addition, in each case, arsenite-induced polysome breakdown resulted in a shift in phase threshold and growth of reversible polyA+ opto-SGs, which were positive for a panel of SG 15 markers. Again, each of these RBD Corelet-mediated opto-SGs was attached to P-bodies, suggesting that the CAPRIN1 and UBAP2L RBDs alone were sufficient to confer immiscibility with the P-body phase. It should be noted that the arsenite-induced shift in phase threshold is relatively small compared to the G3BP RBD, which may indicate that stress has a different effect on the ability of different RBDs to bind the disassembled polysome substrate relative to intact polysomes, or that particular RBDs may have an intrinsic self-interaction that leads to phase separation.
For the aromatic-rich CAPRIN1 RGG, the latter possibility was excluded, as RNA depletion (actinomycin D) eliminated phase separation. Furthermore, RGG-mediated aggregation is not simply due to the net positive charge imparted by the high abundance of arginine residues, as the disordered CAPRIN1 RGG region prevents phase separation. The RBDs of FXR1 (2 KH and 1 RGG), a dimeric RBP stably associated with UBAP2L, are also capable of assembling stress-regulated, reversible polyA+ opto-SGs with the expected SG markers and attached P-bodies. Based on the large set of RBDs placed in the Corelet system, synthetic nodes with high RBD valency are sufficient to nucleate polyA+ SGs, whether the RBDs are associated with SG or P-body proteins or are linked to the G3BP IDR. Nevertheless, different Corelets can be inserted into alternative aggregate-forming interaction networks: Corelets of the full-length P-body protein DCP1A, which retain the ability to participate in PPI bridges with P-body proteins, recruit an additional set of P-body markers rather than SG proteins. Furthermore, the aggregate protein composition, P-body attachment and relative phase threshold do not depend on the type or number of RNA-binding motifs. Thus, the "default" outcome for multivalent RBD nodes is polyA+, SG-like aggregates coupled to immiscible P-bodies, and aggregate specificity may depend on the network connectivity of a particular protein-protein interaction node. Competition between protein-protein interaction nodes encodes heterogeneous aggregation.

Since synthetic high-valency forms of their associated RBDs are sufficient to produce SGs, the UBAP2L/FXR or CAPRIN1/FXR complexes may compensate for G3BP KO if a single protein can mimic the G3BP node (e.g., in network centrality) and is expressed at a similar level. Unlike G3BP, CAPRIN1 did not rescue, even at relatively high levels, suggesting that it primarily acts as an RNA-binding bridge in the SG network. In contrast, in the absence of G3BP, slight overexpression (<1 μM) of UBAP2L or FXR1 was sufficient for the formation of polyA+ SGs, suggesting that both proteins serve as SG nodes providing sufficient RBD network valency for SG condensation. Inspired by the case of G3BP, it was assumed that a self-associating domain would confer the valency (v > 3) necessary for node identity. Although previous studies showed that such a domain (dimerization) is present in FXR1, a self-association interface for UBAP2L has not been established.

A fragment-scanning Corelet screen for UBAP2L regions with FUS IDR-like (weak self-association) or NTF2-like (dimerization) properties was performed, using fragments from CAPRIN1 (a predicted bridge) as internal controls. Of the 13 fragments tested, UBAP2L 781-1087 was unique in forming stress-independent, polyA-negative droplets, and these properties were retained upon further truncation of this aromatic-containing FUS-like region. The predictive power of the Corelet domain screening method is evident because the identified C-terminal region is essential for the role of UBAP2L in G3BP-independent SG formation (i.e., its deletion converts UBAP2L from a node to a bridge). UBAP2L does not form a high-affinity complex with its ortholog UBAP2, but the protein is highly conserved, including the identified self-associating domain. Based on the UBAP2/2L double-KO phenotype, it is presumed that weak self-associations form between UBAP2/2L proteins in separate high-affinity complexes (e.g., FXR1/UBAP2L, UBAP2L/G3BP), thereby acting as a valency multiplier necessary for SG aggregation. Since high-affinity UBAP2L complexes containing both FXR1 and G3BP are rare or non-existent, we hypothesized that the G3BP and FXR nodes compete for a limited number of UBAP2L connecting nodes, and that relative stoichiometry is critical for their mixed distribution in endogenous SGs. In line with this, ectopic overexpression of FXR1 leads to heterogeneous SGs, which may also play a role in endogenous systems, since STED microscopy analysis of G3BP1 in live SGs shows micro-heterogeneity not visible by conventional microscopy techniques. In contrast, UBAP2L forms single-phase SGs with G3BP.
Co-expression of all three proteins indicates that, at constant levels of UBAP2L, the stoichiometry between the FXR1 and G3BP nodes is critical in determining whether the outcome is a monophasic or multiphasic compartment: relatively high FXR1 resulted in demixing of G3BP, with UBAP2L present in both phases; in contrast, high G3BP resulted in a single phase containing all three proteins. Thus, competition between non-adjacent nodes (G3BP, FXR1) for a limited supply of connecting nodes (UBAP2L) appears to determine the degree to which the networks intertwine to form a single miscible phase. Unexpectedly, we observed that both FXR1 and UBAP2L nucleated small, stress-independent particles in G3BP KO cells that fused and grew after stress.

Since these cells lack the SG node with the greatest centrality, it is hypothesized that these G3BP-independent aggregates assemble from PPI networks that differ from those of endogenous SGs. In fact, the UBAP2L aggregates contained both SG and P-body proteins, probably due to the high-affinity association of UBAP2L with the essential (Ayache et al, 2015; Ohn et al, 2008) P-body node DDX6. Interestingly, DDX6 was weakly recruited to SGs, while EDC3 and DCP1A were excluded, reflecting the relative preference for one of the two immiscible networks.

These studies and past work support a continuum of nodes (e.g., G3BP and FXR1 vs. UBAP2L) that overlap in connectivity (see Fig. 17, a graphical summary of the interaction network supported by this series of experiments). We hypothesize that this node-based competition would lead to a reorganization of the global P-body and SG networks, which would be observable at the level of miscibility.

To test this, central SG and P-body nodes were expressed in pairs in G3BP KO cells to examine the resulting heterogeneous patterns after stress. Depending on the network distance between the nodes, the two proteins were observed to be miscible, to form heterogeneous (multiphase) aggregates, or to be present in separate aggregates. In contrast to neighboring nodes, which form a single phase (e.g., G3BP and UBAP2L, EDC3 and DCP1A), upregulation of remote nodes (e.g., G3BP and DCP1A) decouples SGs from P-bodies. Although these studies indicate that competing PPI networks are sufficient to encode different heterogeneous aggregates, it is not clear to what extent network-substrate interactions play an ancillary role. If PPI networks with unfavorable interactions are the primary mediators of SG/PB demixing, we predict that networks sharing the same substrate preference are immiscible even if disconnected. An ideal model to test this is the synthetic G3BP/UBAP2L RBD node, which encodes an aggregate with the same properties as endogenous SGs but lacks PPI connectivity through deletion of the NTF2 domain.

As a proof of principle, G3BP RBD opto-SGs formed on the surface of stress-induced UBAP2L aggregates and maintained heterogeneous properties throughout the maturation process. Once deactivated, the opto-SGs dissolved and the surface tension disappeared resulting in the UBAP2L phase dispersing into a single point.

Heterogeneous aggregates were similarly observed in a set of co-expression pairs of the G3BP/UBAP2L-related RBD nodes and their FL counterparts; of particular note, UBAP2L RBD cores form markedly heterogeneous aggregates with FL UBAP2L, although this heterogeneous behavior is not evident on the diffraction-limited scale for G3BP RBD cores co-expressed with FL G3BP1. Since the same experiments with NTF2 Corelets generally resulted in a single-phase structure, it can be concluded that shorter network distances (i.e., direct protein-protein interactions) promote miscibility, while longer network distances (i.e., binding via an RNA intermediate) promote heterogeneous behavior.

Details of the method

Plasmid construction

Unless otherwise indicated (e.g., pHR lentiviral vector, SFFV promoter), all lentiviral DNA plasmids used the FM5 lentiviral vector with the ubiquitin C promoter. DNA fragments encoding the proteins of interest were amplified by PCR using a high-fidelity DNA polymerase (NEB). Oligonucleotides for PCR were synthesized by IDT. The In-Fusion HD cloning kit (Clontech) was used to insert PCR-amplified fragments into the required linearized vector (with standardized linkers and overlaps to allow high-throughput cloning). Cloned products were confirmed by GENEWIZ sequencing (from both ends of the insert). For all sspB-mCherry-labeled DNA constructs, an independent investigator re-confirmed that the sequence was correct. Two different fully sequenced DNA constructs (FM5-mGFP-G3BP1 S38F and pcDNA4 t/o-GFP-G3BP1 S38F), generated by two independent laboratories, were used to confirm the stress particle (SG) rescue defect associated with the G3BP S38F mutation.

Cell culture

Cells were cultured in DMEM (GIBCO) with 10% FBS (Atlanta Biologicals), supplemented with 1% streptomycin and penicillin, and maintained at 37 ℃ and 5% CO2 in a humidified incubator. All cell lines tested negative for mycoplasma. HEK293 and HEK293T cells were a gift from the Marc Diamond laboratory (UT Southwestern). HeLa cells were obtained from ATCC. U2OS cells and U2OS G3BP1/2 KO cells have been previously described (Kedersha et al, 2016). This knockout cell line is well characterized in the cited paper, and multiple independent laboratories have validated its resistance to stress particle formation (personal communication). The G3BP1/2 KO (hereinafter referred to as G3BP KO) was confirmed in-house by Western blotting.

Lentiviral production and lentivirus transduction

All live cell imaging experiments were performed using cells stably transduced with lentivirus, except for the light-induced sspB-/iLID-ΔNTF2 dimer-mediated rescue of the G3BP knockout (see Transient transfection). Lentiviruses containing the desired construct were prepared by using Lipofectamine™ 3000 (Invitrogen) to transfect the plasmid into HEK293T cells together with the helper plasmids VSVG and PSP (from the Marc Diamond laboratory, UT Southwestern). Viruses were harvested 2-3 days post transfection and used to infect WT U2OS, G3BP KO U2OS, or WT HEK293 cells. Lentivirus transduction was performed in 96-well plates. Three days after lentivirus was applied to low-confluency cells, the cells were passaged either for stable maintenance or directly onto 96-well fibronectin-coated glass-bottom plates for live cell microscopy. For non-Corelet experiments, stable cells were passaged at least 3 times over 8 or more days before use in live cell imaging experiments, to eliminate cells expressing lethal levels of the fusion protein of interest. In all experiments, more than 90% of the cells expressed the protein of interest within a defined concentration range (usually <5 μM; estimated concentrations are indicated where relevant). This particular protocol is designed to avoid fusion protein concentrations that are prone to artifacts, which can occur in matrix-based transient transfections.

Transient transfection

Unlike the other experiments (see above), transient transfection was used for the light-induced ΔNTF2 dimer-mediated rescue of the G3BP knockout. Preliminary attempts to rescue the defect using lentivirus-based transduction (data not shown) were unsuccessful because high concentrations of the individual fusion proteins (i.e., > 5 μM of mCherry-sspB-G3BP ΔNTF2 and mGFP-iLID-G3BP ΔNTF2) could not be achieved. Therefore, Lipofectamine™ 3000 (Invitrogen) was used according to the manufacturer's recommendations to transfect individual wells of a 96-well plate containing G3BP1/2 KO U2OS cells with both mCherry-sspB-G3BP ΔNTF2 and mGFP-iLID-G3BP ΔNTF2. After 24 hours, both fusion proteins were observed to be broadly expressed throughout the cytoplasm. Arsenite was added to a final concentration of 400 μM. After 1 hour, the cells were imaged. Three biological replicates were performed. In rare cells where both component concentrations were very high (> 10 μM for each component), stress particles were observed regardless of the duration of blue light activation. The light-independent nature of the dimer-based rescue at these concentrations is consistent with the Kd of 4.3 μM measured in vitro for iLID-sspB in the dark state (Guntas et al, 2015). At such concentrations, iLID and sspB are expected to associate strongly in the dark. The in vitro light-state Kd of iLID-sspB is 0.2 μM (or ~10 nM for "core" measurements, see Phase diagram data collection), which sets a lower limit for the assay.
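
By way of a non-limiting illustration, the dark-state association argument above can be checked with a simple 1:1 binding equilibrium. The sketch below assumes that iLID and sspB form a 1:1 complex, that the quoted dissociation constants (4.3 in the dark, 0.2 in the light) are read as micromolar values, and that both components are present at equal total concentration; the function names are hypothetical and are not part of the described method.

```python
import math

# Dissociation constants quoted in the text, read here as micromolar (assumption).
KD_DARK_UM = 4.3
KD_LIGHT_UM = 0.2


def bound_complex(a_total, b_total, kd):
    """Equilibrium concentration of a 1:1 complex AB (same units as the inputs).

    Solves kd = (a_total - ab) * (b_total - ab) / ab for the physical root.
    """
    s = a_total + b_total + kd
    return (s - math.sqrt(s * s - 4.0 * a_total * b_total)) / 2.0


def fraction_bound(a_total, b_total, kd):
    """Fraction of the limiting component that is found in the complex."""
    limiting = min(a_total, b_total)
    if limiting <= 0:
        return 0.0
    return bound_complex(a_total, b_total, kd) / limiting


if __name__ == "__main__":
    # Equal concentrations of the sspB- and iLID-tagged components (hypothetical values).
    for conc_um in (1.0, 5.0, 10.0, 20.0):
        dark = fraction_bound(conc_um, conc_um, KD_DARK_UM)
        light = fraction_bound(conc_um, conc_um, KD_LIGHT_UM)
        print(f"{conc_um:5.1f} uM each: dark {dark:.2f}, light {light:.2f}")
```

Under these assumptions, roughly half of each component is already in complex in the dark when both are present at 10 μM, which is in line with the observation that rescue at such concentrations does not depend on blue light.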

Confocal microscopic analysis of living cells

Cells were imaged on fibronectin-coated 96-well glass-bottom dishes (Cellvis). Confocal images were taken on a Nikon A1 laser scanning confocal microscope using a 60x oil immersion lens with a numerical aperture of 1.4. The microscope stage was equipped with an incubator to maintain the cells at 37 ℃ and 5% CO2. mCherry, mGFP (GFP), EYFP and miRFP670 (iRFP) labelled proteins were imaged with 560, 488 and 640 nm lasers, respectively. The details described above apply to all imaging data herein except for the STED super-resolution and wide-field microscopy images; see below for details.

Stimulated emission depletion (STED) super-resolution microscopy

For single-channel STED images, successive image sets were taken at increasing STED power using the "Custom Axis" option available in the imscope (each line was imaged both with and without the STED laser to rule out artifacts). For two-channel STED images, two sets of consecutive images were taken using per-line imaging of mGFP (+/-STED) and miRFP (+/-STED), with the mGFP STED power set to 0% for the first set to avoid bleaching of the miRFP image, which occurs during the second image (again using the "Custom Axis" option available in the imager).

Wide field microscopic analysis

For some images, G3BP KO or UBAP2L KO U2OS cells stably expressing GFP-UBAP2L were grown on glass coverslips, stressed with 400 μM arsenite for the indicated time, and fixed using 4% paraformaldehyde in PBS for 15 minutes, followed by post-fixation/permeabilization in ice-cold methanol for 5 minutes. Cells were blocked in 5% horse serum/PBS, and primary and secondary antibody incubations were each performed in blocking buffer with shaking for 1 hour. After washing with PBS, cells were mounted in polyethylene mounting medium and observed. Images were captured using a Nikon Eclipse E800 microscope equipped with a 63x Plan Apo objective (NA 1.4), a mercury lamp and standard filters for DAPI (UV-2A 360/40; 420/LP), Cy2 (FITC HQ 480/40; 535/50), Cy3 (Cy3 HQ 545/30; 610/75) and Cy5 (Cy5 HQ 620/60; 700/75). Images were captured using a SPOT Pursuit digital camera (Diagnostics Instruments) with the manufacturer's software, and the original TIF files were imported into Adobe Photoshop CS3. Within a given experiment, the same brightness and contrast adjustments were applied to all images.

Corelet activation

The mCherry (560) channel was used to capture pre- and post-activation images of G3BP KO cells stably expressing the indicated fusion protein, so that only the sspB component was visualized without triggering light-induced dimerization with the iLID-mGFP-labeled ferritin core. Cells were activated with the 488 laser at 1% laser power to cause dimerization of iLID and sspB. Activation was achieved by simultaneously imaging the mCherry and mGFP channels over a 120 x 120 μm² (1024 x 1024 pixels) region at Nyquist zoom, using a frame interval of 6 seconds. See also Phase diagram data collection.

Fluorescence Recovery After Photobleaching (FRAP)

G3BP KO cells stably expressing the indicated fusion protein were first globally activated (i.e., iLID-sspB dimerization) by continuous exposure to the 488 laser for 5 minutes. A light-activated condensate was then bleached in a 1 μm² region with the 560 laser at high power to quench most of the mCherry-sspB component of the condensate. Fluorescence recovery was monitored while imaging the mCherry and mGFP channels at frame intervals of 6 seconds. Fluorescence was normalized to non-FRAP droplets to control for bleaching, and fluorescence intensity was expressed relative to the initial image for plotting purposes.

Cells were treated with arsenite to dissociate polysomes. Sodium arsenite was added to the cell culture medium at a concentration of 400 μM, which exceeds the saturating concentration for polysome breakdown (Kedersha et al, 2016). Unless a light-dark cycling experiment (see below) was performed, images were captured 50 minutes to 2 hours (typically 1 hour) after arsenite treatment. Between 60 minutes and 120 minutes, no difference was observed in rescue, phase threshold shift, SG inhibition, etc. SG number/size usually peaks at 45 minutes, and the 1 to 2 hour time window was chosen so that the drug had achieved its maximum effect. Cells typically begin to die about 6 hours after treatment, so the specified 1 to 2 hour time window was used to avoid confounding by toxicity/lethality.

To inhibit polysome disassembly, G3BP KO cells expressing the indicated fluorescent fusion protein were pre-treated with cycloheximide at a final concentration of 100 μg/mL. After incubation for 30 minutes, arsenite was added at a concentration of 400 μM. After 1 hour, cells were evaluated for stress particle formation (GFP-G3BP rescue) or subjected to activation cycling (Corelets).
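
As a non-limiting illustration of the FRAP normalization described earlier in this section (normalization to a non-bleached droplet and to the initial image), the following sketch applies a conventional double normalization. It assumes that each trace is a list of mean intensities, one per frame, with the first entry taken before bleaching; the function name and the example values are hypothetical.

```python
def normalize_frap(bleached, reference):
    """Double-normalize a FRAP trace.

    bleached:  mean intensity of the bleached condensate per frame
               (index 0 is the pre-bleach frame).
    reference: mean intensity of a non-bleached droplet in the same frames,
               used to correct for acquisition photobleaching.
    Returns intensities relative to the pre-bleach frame, corrected for the
    fractional loss seen in the reference droplet.
    """
    if not bleached or len(bleached) != len(reference):
        raise ValueError("traces must be non-empty and of equal length")
    b0, r0 = bleached[0], reference[0]
    if b0 <= 0 or r0 <= 0 or min(reference) <= 0:
        raise ValueError("intensities must be positive")
    return [(b / b0) / (r / r0) for b, r in zip(bleached, reference)]


if __name__ == "__main__":
    frame_interval_s = 6  # frame interval stated in the text
    bleached = [100.0, 20.0, 45.0, 72.0]       # hypothetical values
    reference = [200.0, 190.0, 180.0, 180.0]   # hypothetical values
    for i, value in enumerate(normalize_frap(bleached, reference)):
        print(f"t = {i * frame_interval_s:3d} s  normalized = {value:.2f}")
```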

Treatment of cells with actinomycin D to inhibit transcription

Actinomycin D dissolved in DMSO was used to treat G3BP KO cells expressing the indicated Corelets at a concentration of 5 μg/mL. Images were taken 12-18 hours after actinomycin D treatment, an interval during which nucleoli were no longer evident by bright-field observation and most mRNA was expected to have degraded. The final concentration of DMSO was 0.5%. For the actinomycin D plus arsenite experiment, 400 μM arsenite was added after ~12 hours of actinomycin D treatment, and cells were imaged after 1-2 hours. Qualitative observations indicate that actinomycin D administered at the indicated concentration is lethal after ~30-36 hours of treatment. The time points were chosen to maximize the time since treatment (i.e., to minimize cellular RNA) without extensive drug-induced lethality.

Phase diagram data collection

To determine the precise phase threshold boundaries of the intracellular phase diagram, the cells analyzed must have a high degree of variability in sspB-mCherry and iLID-mGFP stoichiometry to sample sufficient core concentrations and valencies. To achieve a wide concentration range for both components, G3BP KO cells were transduced in 96-well plates (Cellvis) using an arrayed lentivirus approach. In this scheme, rows varied from 2-60 μL of iLID-GFP-Fe lentivirus and columns varied from 2-60 μL of mCherry-sspB-open reading frame (ORF)/ORF-mCherry-sspB lentivirus. G3BP KO cells were seeded directly into the arrayed lentiviruses to achieve ~25% confluency upon subsequent attachment to the plastic substrate. After 72 hours, at confluence, all 16 wells associated with a single Corelet condition were washed with PBS, trypsinized, quenched with fresh medium and pooled, ensuring a diverse cell population with a highly variable iLID to sspB ratio. Cells were seeded at a dilution of 1:8 onto fibronectin-coated glass-bottom 96-well plates (Cellvis) and imaged at 60-90% confluence after 48 hours.
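
The arrayed transduction layout described above can be sketched, by way of example only, as a 4 x 4 grid of virus volumes. The text specifies only the 2-60 range (read here as microliters) and the total of 16 wells; the geometric spacing of the intermediate volumes and the function names below are assumptions made for illustration.

```python
def titration_series(low, high, steps):
    """Geometrically spaced volumes from low to high (spacing is an assumption)."""
    ratio = (high / low) ** (1.0 / (steps - 1))
    return [round(low * ratio ** i, 1) for i in range(steps)]


def arrayed_plate(low_ul=2.0, high_ul=60.0, steps=4):
    """Grid of (iLID virus volume, sspB virus volume) pairs, one pair per well.

    Rows vary the iLID-GFP-Fe lentivirus volume; columns vary the
    mCherry-sspB construct lentivirus volume, as described in the text.
    """
    volumes = titration_series(low_ul, high_ul, steps)
    return [[(ilid, sspb) for sspb in volumes] for ilid in volumes]


if __name__ == "__main__":
    for row in arrayed_plate():
        print("  ".join(f"iLID {i:4.1f} / sspB {s:4.1f}" for i, s in row))
```

Pooling the 16 wells after transduction, as described above, is what mixes these combinations into a single population with a broad range of iLID to sspB ratios.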

For all data collected for generating phase diagrams, a standardized imaging protocol was used to avoid confounding effects related to changes in microscope settings. The same imaging setup was used as for the Fluorescence Correlation Spectroscopy (FCS)-based calibration of fluorescence to absolute concentration (see quantitative and statistical analysis). Specifically, images were collected using a 0.5 frame/second scan rate, 1024 x 1024 pixel frames, and 1.75x Nyquist zoom (63x oil immersion lens). The laser power (1% 488 and 100% 546), intensity and gain remained unchanged. The length of all time-lapse acquisitions was 5 minutes, with a 6 second interval between frame acquisitions. After the last frame, the laser intensity was reduced for an additional 4 frames, followed by the acquisition of 4 final images at higher relative laser intensities.

This scheme was chosen to achieve a wide dynamic range (e.g., to achieve sufficient resolution for lower-concentration cells, which have a higher signal-to-noise ratio) and to avoid pixel saturation in the case of Corelets, which produce dense, exceptionally bright puncta. Using this standardized protocol, each 5 minute acquisition added (on average) 10 data points to the phase diagram. Therefore, an average phase diagram used in this study required collection of 20-30 fields, or ~2 hours of data acquisition time. Typically, a single phase diagram was compiled from data collected over the course of 3-5 experiments (i.e., different lentivirus transductions on different days). However, some phase diagrams include data from significantly more experiments (e.g., G3BP ΔNTF2 Corelets, which were used as a positive control for the effect of drug treatment throughout the study to ensure quality control).

There was no evidence of a systematic change in drug response, drug efficacy, or fluorescence intensity measurements over the duration of the study. When selecting cells for analysis, only fully activated cells (i.e., cells lying entirely within the field of view) were considered, to avoid potential artifacts associated with local activation and diffusive capture (Bracha et al, 2018). The average mCherry and mGFP fluorescence intensity of each cell was determined using manual image segmentation of the first frame (i.e., prior to blue-light-mediated dimerization of the iLID on the core with the sspB-labeled protein of interest) and a 4.5 x 4.5 μm square region of interest (ROI) in a cytoplasmic region with homogeneous fluorescence (i.e., a region with a low density of membrane-bound organelles such as the endoplasmic reticulum). mCherry and mGFP concentrations were then determined using the FCS calibration curve described above (Fig. 18). The concentration of mGFP was divided by 24 (the number of subunits per ferritin complex, or "core") to determine the core concentration. The valency of a single cell was determined by dividing the mCherry concentration by the core concentration.
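
The two divisions described above can be illustrated with a short sketch. It assumes that the mGFP and mCherry concentrations have already been obtained from the FCS calibration and are expressed in micromolar; the function name and the example values are hypothetical.

```python
SUBUNITS_PER_CORE = 24  # iLID-mGFP-ferritin subunits per 24-mer core


def core_and_valency(mgfp_um, mcherry_um):
    """Return (core concentration in uM, valency) for a single cell.

    mgfp_um:    calibrated mGFP (iLID-ferritin subunit) concentration, uM.
    mcherry_um: calibrated mCherry (sspB-tagged protein) concentration, uM.
    Valency is the mean number of sspB-tagged proteins available per core.
    """
    if mgfp_um <= 0:
        raise ValueError("mGFP concentration must be positive")
    core_um = mgfp_um / SUBUNITS_PER_CORE
    valency = mcherry_um / core_um
    return core_um, valency


if __name__ == "__main__":
    # Hypothetical cell: 3.6 uM mGFP subunits and 1.2 uM mCherry-sspB fusion.
    core_um, valency = core_and_valency(3.6, 1.2)
    print(f"core = {core_um:.2f} uM, valency = {valency:.1f}")
```

In this hypothetical example, the cell would lie at a core concentration of 0.15 μM and a valency of about 8 on the phase diagram.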

This is a highly accurate measure based on the lever rule: in "one-component" systems (i.e., FUS IDR Corelets, which have minimal interactions with endogenous proteins and nucleic acids), consistency of valency between the initial, dilute and concentrated phases can be reliably observed. Binary decisions regarding Corelet-mediated phase separation in the cells of interest were made manually. The data sets were coded and sent to an independent individual for subsequent automated generation of the phase diagrams.

Cycling test after drug treatment

Cycling experiments were performed as described in Phase diagram data collection, with slight variations. Image acquisition was started immediately after treatment of G3BP KO cells expressing the indicated sspB/iLID Corelets with arsenite (or the indicated drug). For most experiments, each cycle consisted of a 5 minute activation interval followed by a 5 minute deactivation interval. From studies of diverse proteins in the Corelet system, we have determined that this deactivation time far exceeds the time required for complete reversibility (typically 30-60 seconds). The specified cycle was repeated 6-8 times. In some experiments, a 10 minute activation interval was instead followed by a 5 minute deactivation interval, and this process was repeated 4 times. The frame interval was kept constant at 6 seconds. Representative cells/fields were selected for data analysis based on a standard core concentration (0.25 μM) and the desired valency.
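
By way of example only, the cycling protocol above can be written out as a per-frame schedule. The sketch uses the stated 6 second frame interval and the stated activation and deactivation durations; representing the protocol as a list of (time, blue light on) pairs, and the function name, are assumptions made for illustration and do not describe any microscope control software.

```python
def cycle_schedule(on_min=5, off_min=5, cycles=6, frame_interval_s=6):
    """Return a list of (time_s, blue_light_on) tuples, one per acquired frame."""
    frames = []
    t = 0
    for _ in range(cycles):
        for duration_min, blue_on in ((on_min, True), (off_min, False)):
            for _ in range(duration_min * 60 // frame_interval_s):
                frames.append((t, blue_on))
                t += frame_interval_s
    return frames


if __name__ == "__main__":
    standard = cycle_schedule()                                 # 5 min on / 5 min off, 6 cycles
    extended = cycle_schedule(on_min=10, off_min=5, cycles=4)   # alternative protocol
    for name, schedule in (("standard", standard), ("extended", extended)):
        total_min = (schedule[-1][0] + 6) / 60
        print(f"{name}: {len(schedule)} frames over {total_min:.0f} minutes")
```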

G3BP rescue competition and stress particle inhibition experiment

For the G3BP rescue competition experiments, the same arrayed lentivirus approach as described in Phase diagram data collection was used (i.e., 2-60 μL of G3BP-mCherry and 2-60 μL of mGFP-ORF, 16 wells of a 96-well plate in a 4 x 4 well arrangement). G3BP KO cells were seeded into the lentiviruses and then pooled and passaged 72 hours later.

Confocal microscopy of live cells was performed on day 5 post transduction. For each condition (GFP-tagged protein of interest), 4 independent experiments (1 well per arsenite treatment) were performed on three different days, with multiple technical replicates (fields of view). Confocal imaging of live cells was performed 1-2 hours after arsenite treatment. mCherry and mGFP concentrations were determined as for the phase diagrams, and cells were scored manually for the presence or absence of stress particles. Rescue without competition was assessed using a similar protocol.

For stress particle inhibition experiments, WT U2OS cells stably expressing YBX1-mCherry (SG marker) were seeded at 25% confluence in 96-well plates and transduced in an arrayed format with 2-60 μL of lentivirus encoding the indicated mGFP-tagged protein. Three days later, cells were washed, trypsinized, pooled and passaged. Three days after this, cells were passaged onto fibronectin-coated 96-well plates. Confocal imaging of live cells was performed 2 days later (i.e., 8 days after lentiviral transduction), when the cells were 60-80% confluent. Images were taken within 1-2 hours after arsenite treatment. On two different days, 3-4 independent experiments were performed for each condition, with multiple technical replicates (i.e., fields of view) per experiment. The concentration of the mGFP-tagged protein was determined, SG formation was assessed in a binary fashion, and all data were coded and then sent to an independent individual for quantitative analysis.

Stress particle distribution

For stress particle partitioning experiments, WT U2OS cells stably expressing mcfary-CAPRIN1 or mCherry-CAPRIN1 were seeded at 25% confluence in 96-well plates and transduced with 30 μL of the indicated mCherry-labeled lentivirus (mcfary-CAPRIN1) or mCherry-CAPRIN1 cells. Three days later, cells were washed, trypsinized, pooled and passaged onto fibronectin-coated 96-well plates. Imaging was performed 2 days later, when cell confluence reached 60-80%. Images were taken 1-2 hours after arsenite treatment. Three independent experiments were performed for each condition.

The co-localization Corelet study followed a protocol similar to that in Phase diagram data collection, but was performed by co-transducing two lentiviruses (using non-fluorescent iLID-Fe instead of the typical GFP-tagged form) into G3BP KO cells stably expressing the indicated GFP-tagged protein. At 72 hours post infection, cells were passaged onto fibronectin-coated glass-bottom 96-well plates (Cellvis) at a 1:8 dilution. After 48 hours, cells were treated with arsenite (400 μM). After one hour, the plate was removed from the humidified incubator and placed on a blue LED illuminator (Invitrogen Safe Imager 2.0) for 10 minutes to activate the Corelets. Cells were immediately fixed with 4% PFA for 10 minutes, washed twice with PBS, and permeabilized with ice-cold 70% methanol for 10 minutes. Cells were washed twice more with PBS and then left overnight at 4 ℃. Fixed-cell confocal microscopy was performed the next day to examine co-localization of opto-SGs with the indicated GFP-tagged SG/PB protein.

RNA fluorescence in situ hybridization (FISH)

The indicated cells were fixed with 4% PFA for 10 minutes, washed twice with PBS, permeabilized with ice-cold 70% ethanol, and left overnight at ~4 ℃. The ethanol was replaced with Wash Buffer A (Stellaris) and the cells were incubated for 5 minutes at room temperature. The buffer was then replaced with Hybridization Buffer (Stellaris) containing 5 μM 5'-Cy5-oligo d(T)20 (Gene Link), and the cells were incubated for 16 hours in the dark to probe polyadenylated mRNA. Cells were transferred to Wash Buffer A for 30 minutes at 37 ℃, which was then replaced with Wash Buffer B for an additional 5 minutes at room temperature. Cells were washed 3 times with PBS and imaged.

Western blot to assess G3BP1/2 levels and knockout in human cells

U2OS WT, U2OS G3BP1/2 KO, HEK293 or HeLa cells from 6-well plates were washed, trypsinized, quenched with media and centrifuged at 500 × g for 5 minutes. The cell pellet was washed with PBS and flash frozen. Immediately prior to lysis, cells were thawed on ice, resuspended in 150 μL of 2× LDS sample buffer/reducing agent, sonicated, and boiled at 100 ℃ for 5 minutes. As positive controls for the cell lysates, 50 ng of the following recombinant proteins were used: G3BP1 (Novus, NBP1-50925-50UG), G3BP2 (Novus, NBP1-78843-100UG).

Samples were run on Novex 10% Bis-Tris gels according to the manufacturer's protocol and transferred to pre-cut PVDF blotting membranes. Membranes were blocked overnight at 4 ℃ with shaking in 5% NFDM in TBST (5 mM Tris-HCl, pH 7.5, 15 mM NaCl, 1% Tween-20). The membranes were probed overnight at 4 ℃ with shaking in blocking solution containing the following primary antibodies: G3BP1 (mouse monoclonal antibody, Abcam ab86135, 1:300), G3BP2 (rabbit polyclonal antibody, Abcam ab86135, 1:5000), beta-actin (rabbit polyclonal antibody, Abcam ab8227, 1:10,000). The following day, membranes were washed multiple times and then incubated for 30 minutes at room temperature with shaking in blocking solution containing the following secondary antibodies: Peroxidase-AffiniPure goat anti-mouse IgG (H+L) (Jackson, 115-. Membranes were then washed multiple times before being developed with SuperSignal™ West Pico PLUS chemiluminescent substrate as instructed by the manufacturer.

Immunoprecipitation

Near-confluent cells in 150 mm dishes were treated as indicated, washed with cold Hank's balanced salt solution, and scraped at 4 ℃ into lysis buffer (20 mM Tris-HCl, pH 7.4, 150 mM NaCl, 5 mM MgCl2, 1 mM DTT, 0.5% NP-40, 10% glycerol) containing 1 mM DTT, protease inhibitors (Roche, EDTA-free), HALT phosphatase inhibitors (Pierce) and 20 μg/mL RNase A. Lysates were rotated at 4 ℃ for 30 minutes and clarified by centrifugation (5,000 rpm for 5 minutes), and the supernatant was removed and incubated with Chromotek-GFP-Beads (Allele Biotech) at 4 ℃ for 2 hours with continuous rotation. The beads were washed 5 times and then either eluted directly into SDS lysis buffer or extracted in RIPA buffer (50 mM Tris, 150 mM NaCl, 1.0% NP-40, 0.5% DOC, 0.05% SDS) at 4 ℃ for 1 hour with rotation. The material released by RIPA buffer was recovered and precipitated with 60% acetone. The RIPA-extracted beads retain the "high affinity" bound material, which was released by heating in reducing SDS-PAGE lysis buffer. Proteins were resolved on 4-20% Mini-PROTEAN TGX precast gels (Bio-Rad), transferred to nitrocellulose membranes using the Trans-Blot Turbo Transfer System (Bio-Rad), and blotted using the standard procedure described above. Chemiluminescence was detected using SuperSignal West Pico substrate (Thermo Scientific).

Cas9 deletion cell lines

With the exception of UBAP2, each target sequence was purchased from IDT as a paired DNA oligo (sense/antisense pair), annealed and ligated into pCas-Guide (OriGene). The plasmid insert was verified by sequencing and co-transfected into cells with pDonor-D09 (OriGene), which encodes puromycin resistance. After transfection, cells were briefly (24 hours) selected in puromycin (2 μg/mL) and allowed to recover for 2 days or more before evaluation with the indicated antibodies by immunofluorescence. Cells were cloned by limiting dilution, and clones were verified by immunostaining and western blotting. For the single KO lines, the parental cell line was U2OS expressing the tet repressor (Kedersha et al, 2016). CAPRIN1 and USP10 were individually knocked out in previously characterized double (G3BP1/G3BP2) KO cells (Kedersha et al, 2016).

To establish the U2OS ΔFFF cell line, FXR2 was first knocked out, clones were selected, and FXR2 protein expression was assessed by immunofluorescence and western blot. Clone 6 was then co-transfected with guide RNAs targeting FXR1 and FMR1. Clones were selected and screened in a similar manner, and a triple null line was finally obtained. All loci were sequenced to confirm the deletions at the DNA level.

For the UBAP2/UBAP2L double KO, validated UBAP2L single KO cells were seeded in 96-well plates into 200 μL of pCRISPRv2-UBAP2 gRNA lentivirus (pooled, 6 gRNAs) or 200 μL of pCRISPRv2-NonTarget gRNA lentivirus (Shalem et al, 2014). After 72 hours, confluent cells were washed, trypsinized and passaged into new wells containing 200 μL of the same lentivirus.

Cells were passaged three times and checked for successful KO with two antibodies against UBAP2, which indicated that ~30% of cells had very low or undetectable levels of UBAP2 (in the NonTarget control, 100% of cells showed UBAP2 staining). Cells were expanded by serial passage from 96 wells to 24 wells to 96 wells over a 1-week period. After reaching confluence in 96 wells, cells were passaged at limiting dilution into 3 separate 96-well plates, giving each well an approximately 50% chance of receiving a cell. After 10 days, ~20-30% of the wells showed distinct colonies. For the NonTarget control, six wells were harvested and passaged; for the candidate UBAP2/2L double KOs, 50 separate lines were harvested and passaged.

After two additional weeks of passaging and growth, candidate KO lines (and the NonTarget control) were seeded onto fibronectin-coated glass (96-well plates). After 24 hours, the cells reached ~60-80% confluence. Cells were fixed with 4% PFA, permeabilized with ice-cold methanol for 5 minutes, and immunostained (anti-UBAP2, anti-G3BP1). In the NonTarget control, most cells had G3BP-positive stress granules, but these were slightly smaller than in control conditions (i.e., without UBAP2L KO), a result that was validated between laboratories (data not shown). Four candidate UBAP2/2L double KO lines had no UBAP2 detectable by immunofluorescence. In these lines, G3BP-positive SGs were present in only ~30% of the cells and were much smaller than in WT or UBAP2L single KOs. Double knockout of UBAP2 and UBAP2L was verified by western blotting.

Genotyping of Cas9 mutant cell lines

To identify Cas9-induced mutations in the coding sequence for all KO cell lines, genomic amplification was performed using nested primer sets surrounding the region targeted by each guide sequence. Genomic DNA PCR was performed using AccuPrime GC-Rich DNA polymerase (buffer A) from Invitrogen. The DNA was initially denatured at 95 ℃ for 3 minutes, followed by 30 cycles of denaturation at 95 ℃ for 30 seconds, annealing at 60 ℃ for 30 seconds, and extension at 72 ℃ for 1 minute. The final extension was carried out at 72 ℃ for 10 minutes. The PCR amplicons were sequenced directly. If multiple sequences (i.e., multiple alleles) were evident, PCR products were adenylated using Taq polymerase and cloned into the pGEM-T Easy vector (Promega); individual clones were then obtained and sequenced.

Double-positive U2OS stable cell lines

A clonal cell line constitutively expressing mCherry-G3BP1 was prepared by transfecting mCherry-G3BP1-C1 into (G3BP1/G3BP2) KO cells containing the tet repressor, followed by selection and cloning using G418 (500 μg/mL). This cell line was used to prepare double-positive cells expressing tet-inducible GFP-tagged proteins (G3BP1 WT, G3BP1 S38F, G3BP1 F33W and UBAP2L WT) from the pcDNA4/TO vector (Invitrogen), selected using Zeocin (Invitrogen, 250 μg/mL).

Quantitative and statistical analysis

Fluorescence correlation spectroscopy

GFP and mCherry fluorescence values were converted to absolute concentrations using Fluorescence Correlation Spectroscopy (FCS), performed as previously described (Bracha et al, 2018), with minor modifications. Data specifying the diffusion and concentration of the fluorescent fusion protein were obtained using a 30 second FCS measurement time.

Measurements were made on populations of U2OS G3BP1/2 double KO cells expressing iLID-mGFP or mCherry-sspB alone; these fusion proteins were chosen on the assumption that, being non-native, they would be monomeric and lack major endogenous binding partners. Images were taken using a Nikon A1 laser scanning confocal microscope with an oil immersion objective (Plan Apo 60X/1.4 numerical aperture, Nikon). All measurements and data analyses were performed using SymPhoTime software (PicoQuant).

The autocorrelation function for simple diffusion is:

$$G(\tau) = G(0)\left(1 + \frac{\tau}{\tau_D}\right)^{-1}\left(1 + \frac{\tau}{\kappa^{2}\tau_D}\right)^{-1/2}$$

The variables in the above equation are defined as follows: $G(0)$ is the amplitude at short timescales; $\tau$ is the lag time; $\tau_D$ is the half-decay time; and $\kappa$ is the ratio of the axial to the radial dimension of the measurement volume, $\kappa = \omega_z/\omega_{xy}$. Here, $\kappa = 5.1$, as determined with the fluorescent dye Alexa 488 in water. The parameters $\tau_D$ and $G(0)$ are optimized in the fit and are used to determine the diffusion coefficient, $D = \omega_{xy}^{2}/(4\tau_D)$, and the molecular concentration, $C = \left(\pi^{3/2}\,\omega_{xy}^{2}\,\omega_z\,G(0)\right)^{-1}$.
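As an illustration of how fitted FCS parameters translate into a diffusion coefficient and a concentration, the following minimal Python sketch assumes the standard free-diffusion model for a three-dimensional Gaussian measurement volume. The function names, the radial dimension `omega_xy_um` and the unit conventions are assumptions made for this example; they are not taken from the analysis software used in the study.

```python
import math

AVOGADRO = 6.02214076e23  # molecules per mole


def autocorrelation(tau, g0, tau_d, kappa):
    """Free 3D diffusion model: G(tau) = G(0) / ((1 + tau/tau_D) * sqrt(1 + tau/(kappa^2 tau_D)))."""
    return g0 / ((1.0 + tau / tau_d) * math.sqrt(1.0 + tau / (kappa ** 2 * tau_d)))


def diffusion_coefficient(omega_xy_um, tau_d_s):
    """D = omega_xy^2 / (4 tau_D), returned in um^2/s (omega_xy_um is an assumed input)."""
    return omega_xy_um ** 2 / (4.0 * tau_d_s)


def concentration_nM(g0, omega_xy_um, kappa):
    """C = 1 / (G(0) * V_eff * N_A) with V_eff = pi^(3/2) * omega_xy^2 * omega_z."""
    omega_z_um = kappa * omega_xy_um          # axial dimension from the aspect ratio
    v_eff_um3 = math.pi ** 1.5 * omega_xy_um ** 2 * omega_z_um
    v_eff_litres = v_eff_um3 * 1e-15          # 1 um^3 = 1e-15 L
    molar = 1.0 / (g0 * v_eff_litres * AVOGADRO)
    return molar * 1e9                        # mol/L -> nM
```

In this model a smaller amplitude G(0) corresponds to more molecules in the measurement volume, so halving G(0) doubles the estimated concentration.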

The fluorescence-concentration calibration curve shown in figure 18 was used in all experiments to quantitatively estimate the concentrations of mCherry- and mGFP-labeled fusion proteins in WT and G3BP KO U2OS cells.

Several findings obtained with this FCS calibration curve support the accuracy of these estimates:

(A) Independently performed mCherry FCS experiments produced concentration estimates that differed by < 5% (Bracha et al, 2018). In addition, that study co-expressed mGFP and mCherry at an equimolar ratio using a self-cleaving P2A system, and GFP concentrations were extrapolated from the FCS calibration curve determined for mCherry. This indirectly extrapolated calibration curve predicted GFP concentrations that differed by < 20% from the independently obtained calibration and estimates used in this study.

(B) The slope of the stoichiometry between USP10 and G3BP required for rescue of the G3BP KO defect is remarkably close to 1 (~0.98), the value predicted for a competitive inhibitor present at concentrations much greater than its Kd, and this is further supported by the nearly equivalent slopes obtained for different strong inhibitors (USP10 NIMx1 and CAPRIN1 1-386);

(C) The concentration of G3BP1/2 in U2OS cells was extrapolated by adding the concentration of G3BP required for rescue (620 nM; values determined in separate experiments were within 50 nM of this value) to the concentration of USP10 required for SG inhibition (1560 nM), giving ~2180 nM G3BP in the cytoplasm of U2OS cells. This value is approximately equal to the mass spectrometry value independently obtained in HeLa cells (Hein et al, 2015), and western blots confirm similar levels in the two cell lines;

(D) mGFP-G3BP1 and G3BP1-mCherry have very similar SG rescue concentration thresholds (i.e., within 50 nM of each other).

Image analysis

All images were analyzed using a combination of manual image segmentation (ImageJ), custom semi-automated workflows in ImageJ, and MATLAB 2018b. In all experiments, regions of interest were selected in ImageJ and the mean cytoplasmic intensity was converted to a concentration using the FCS calibration described above. Except in cycling experiments, the presence of stress granules was determined by manual scoring based on co-localization with stress granule markers that are diffusely distributed in the cytoplasm in the absence of stress.

Manual image segmentation

The mean mCherry and GFP fluorescence intensities of cells were used to approximate the concentration of the fusion protein of interest. These were determined by manual image segmentation, drawing 4.5 × 4.5 μm square ROIs in cytoplasmic regions with a uniform fluorescence distribution (i.e., regions with a low density of membrane-bound organelles such as the endoplasmic reticulum). The concentration of the protein was then determined using the FCS calibration curve described above. For experiments not involving Corelets, the presence or absence of stress granules was manually annotated. For phase diagrams, phase separation was manually annotated by assessing whether macroscopic puncta formed during a 5-minute activation time course (6-second intervals). Only fully activated cells were considered, to avoid confounding effects associated with diffusive capture (Bracha et al, 2018).

Light-dark cycling experiments

A single region of interest that remained in the field of view over the entire time course was manually selected. For each frame, the standard deviation of the measured mCherry intensity was calculated and normalized by the standard deviation of the first frame.
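The normalization described here can be sketched in a few lines of Python. This is a minimal illustration that assumes each frame is supplied as a flat list of ROI pixel intensities and that the population standard deviation is used; the function name and the data layout are hypothetical, not the authors' code.

```python
import statistics


def normalized_std_trace(frames):
    """Return the per-frame intensity standard deviation divided by that of the first frame.

    frames: list of lists, each holding the pixel intensities of the ROI at one time point
    (an assumed data layout for this sketch).
    """
    stds = [statistics.pstdev(pixels) for pixels in frames]
    if stds[0] == 0:
        raise ValueError("first frame has zero standard deviation; cannot normalize")
    return [s / stds[0] for s in stds]
```

A rise in this normalized value over time indicates that the mCherry signal is becoming less uniform within the region, as would be expected when condensates form; a return toward 1 indicates redistribution.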

G3BP rescue competition data analysis in G3BP KO U2OS cells

The concentrations in each cell were determined by manual image segmentation as described above, and the absence or presence of stress granules was annotated. To determine the boundary from the data, a support vector machine (SVM) was trained using the fitcsvm() function of the MATLAB Statistics and Machine Learning Toolbox with the default solver, taking the concentrations of the two components as explanatory variables and the classified stress granule state as the response variable. In short, under the assumption that the data are linearly separable, the support vector machine constructs a linear decision surface based on boundary points ("support vectors"). In this two-dimensional case, the slope and intercept of the boundary were extracted to calculate the minimum G3BP concentration for stress granule formation and the stoichiometry of the interaction with the protein of interest.
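To make the last step concrete, the sketch below shows how a slope, an intercept and a threshold can be read off a two-dimensional linear decision function of the form f(x, y) = w_x·x + w_y·y + b, whose boundary is the line f = 0. It is written in Python for illustration only. The axis assignment (x as the G3BP concentration, y as the concentration of the protein of interest), the function name and the example weights are assumptions, and the weights would in practice come from a trained linear SVM.

```python
def boundary_parameters(w_x, w_y, bias):
    """Convert a linear decision function w_x*x + w_y*y + bias = 0 into line parameters.

    Returns (slope, y_intercept, x_threshold), where the boundary is
    y = slope * x + y_intercept and x_threshold is the x value at y = 0.
    """
    if w_x == 0 or w_y == 0:
        raise ValueError("both weights must be non-zero for a sloped boundary")
    slope = -w_x / w_y          # change in y per unit x along the boundary (stoichiometry)
    y_intercept = -bias / w_y   # boundary position at x = 0
    x_threshold = -bias / w_x   # minimum x needed when y = 0
    return slope, y_intercept, x_threshold


# Hypothetical weights describing the rule "granules form when x - y > 620 (nM)".
slope, y_intercept, x_threshold = boundary_parameters(1.0, -1.0, -620.0)
```

With these illustrative weights the boundary has a slope of 1 and crosses the x axis at 620, i.e. a 1:1 stoichiometry with a minimum x concentration of 620 in the absence of the second component.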

Phase diagrams and calculation of critical valence

For each phase diagram, the average concentrations of the iLID-GFP-Fe core and of the mCherry-sspB-tagged protein were calculated for each cell, and the cell was assigned to the class with or without stress granules. To determine the phase threshold boundary in an automated and unbiased manner, an SVM was again used, with the core concentration and the log2-transformed valence as explanatory variables and the presence of a phase-separated structure as the categorical response variable. However, since the data are not linearly separable, a polynomial kernel of order 2 was used to account for the curvature of the phase threshold. To calculate the decision surface, the SVM score was computed at all points of a 50 × 50 grid in the phase diagram, and the contour connecting points with a score of 0, which represents the phase threshold, was drawn using the contour() function of MATLAB. The critical valence at a specified core concentration was then calculated by linearly interpolating the zero-score contour.
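The final interpolation step can be illustrated with a short Python sketch. It assumes the zero-score contour is available as a list of (core concentration, log2 valence) points that is single-valued in the core concentration, and it interpolates in log2 space before converting back to valence; these choices, the function name and the example numbers are assumptions for illustration rather than the authors' implementation.

```python
def critical_valence(contour, core_conc):
    """Linearly interpolate a zero-score contour at a given core concentration.

    contour: iterable of (core_concentration, log2_valence) points (assumed layout).
    Returns the critical valence (not log2-transformed).
    """
    points = sorted(contour)
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 <= core_conc <= x1:
            if x1 == x0:
                return 2.0 ** y0
            t = (core_conc - x0) / (x1 - x0)      # fractional position within the segment
            return 2.0 ** (y0 + t * (y1 - y0))    # back-transform from log2 valence
    raise ValueError("core concentration lies outside the range of the contour")


# Hypothetical contour points, for illustration only.
example_contour = [(100.0, 4.0), (200.0, 3.0), (400.0, 2.0)]
valence_at_200 = critical_valence(example_contour, 200.0)
```

Raising an error outside the contour range avoids silently extrapolating the phase threshold beyond the sampled concentrations.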

Approximation of the critical concentration for inhibition of stress granule assembly in WT U2OS cells

For each experiment, the concentration of the protein of interest was determined in each cell, and the cell was classified for the presence of stress granules. The critical concentration for inhibition or rescue is defined as the concentration of the protein of interest at which a cell has a 50% probability of having stress granules. Specifically, the probability density was calculated by binning the concentration distribution using the square root rule. Within each bin, the probability of having stress granules was calculated as the number of cells with stress granules divided by the total number of cells in that bin. This results in a monotonic function; the concentration at which the probability equals 0.5 was then interpolated to determine the threshold concentration for inhibition or rescue. This process was repeated for each replicate, and the standard error of the mean between replicates was used to determine the error bars.
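The procedure above can be sketched as follows in Python. This is a minimal illustration under stated assumptions: equal-width bins whose number is the rounded-up square root of the cell count, bin centers used as the concentration of each bin, empty bins skipped, and linear interpolation between the first pair of adjacent bins that bracket a probability of 0.5. The function name and these details are hypothetical and are not taken from the original analysis code.

```python
import math


def critical_concentration(concs, has_sg, level=0.5):
    """Estimate the concentration at which the fraction of granule-positive cells equals `level`."""
    n_bins = max(1, math.ceil(math.sqrt(len(concs))))   # square root rule
    lo, hi = min(concs), max(concs)
    if hi == lo:
        raise ValueError("all concentrations are identical; cannot bin")
    width = (hi - lo) / n_bins
    totals = [0] * n_bins
    positives = [0] * n_bins
    for c, flag in zip(concs, has_sg):
        idx = min(int((c - lo) / width), n_bins - 1)     # the maximum falls in the last bin
        totals[idx] += 1
        positives[idx] += int(bool(flag))
    # Probability of granules per non-empty bin, located at the bin center.
    pairs = [(lo + (i + 0.5) * width, positives[i] / totals[i])
             for i in range(n_bins) if totals[i] > 0]
    for (x0, p0), (x1, p1) in zip(pairs, pairs[1:]):
        if (p0 - level) * (p1 - level) <= 0 and p0 != p1:
            return x0 + (level - p0) * (x1 - x0) / (p1 - p0)
    return None  # the probability never crosses `level` in the sampled range
```

Applying the function to each replicate separately yields one threshold per replicate, from which a mean and a standard error of the mean can be computed.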
