HHS Public Access; Author Manuscript; accepted for publication in a peer-reviewed journal.
Science. Author manuscript; available in PMC 2016 Nov 29.
Published in final edited form as:
PMCID: PMC5127784
NIHMSID: NIHMS826641
PMID: 27256883

C2c2 is a single-component programmable RNA-guided RNA-targeting CRISPR effector

Abstract

The CRISPR-Cas adaptive immune system defends microbes against foreign genetic elements via DNA or RNA-DNA interference. We characterize the Class 2 type VI-A CRISPR-Cas effector C2c2 and demonstrate its RNA-guided RNase function. C2c2 from the bacterium Leptotrichia shahii provides interference against RNA phage. In vitro biochemical analysis shows that C2c2 is guided by a single crRNA and can be programmed to cleave ssRNA targets carrying complementary protospacers. In bacteria, C2c2 can be programmed to knock down specific mRNAs. Cleavage is mediated by catalytic residues in the two conserved HEPN domains, mutations in which generate catalytically inactive RNA-binding proteins. These results broaden our understanding of CRISPR-Cas systems and suggest that C2c2 can be used to develop new RNA-targeting tools.

Almost all archaea and about half of bacteria possess Clustered Regularly Interspaced Short Palindromic Repeats and CRISPR-associated genes (CRISPR-Cas) adaptive immune systems (1, 2), which protect microbes from viruses and other invading DNA through three steps: (i) adaptation, i.e., insertion of foreign nucleic acid segments (spacers) into the CRISPR array in between pairs of direct repeats (DRs), (ii) transcription and processing of the CRISPR array to produce mature CRISPR RNAs (crRNAs), and (iii) interference, whereby Cas enzymes are guided by the crRNAs to target and cleave cognate sequences in the respective invader genomes (3–5). All CRISPR-Cas systems characterized to date follow these three steps, although the mechanistic implementation and proteins involved in these processes display extensive diversity.

The CRISPR-Cas systems are broadly divided into two classes on the basis of the architecture of the interference module: Class 1 systems rely on multi-subunit protein complexes whereas Class 2 systems utilize single effector proteins (1). Within these two classes, types and subtypes are delineated according to the presence of distinct signature genes, protein sequence conservation, and organization of the respective genomic loci. Class 1 systems include type I, where interference is achieved through assembly of multiple Cas proteins into the Cascade complex, and type III systems, which rely on either the Csm (type III-A/D) or Cmr (type III-B/C) effector complexes, which are distantly related to Cascade (1, 6–11).

Class 2 CRISPR systems comprise type II, characterized by the single-component effector protein Cas9 (12–17), which contains RuvC and HNH nuclease domains, and type V systems, which utilize single RuvC domain-containing effectors such as Cpf1 (18), C2c1, and C2c3 (19). All functionally characterized systems, to date, have been reported to target DNA, and only the multi-component type III-A and III-B systems additionally target RNA (7, 20–25). However, the putative Class 2 type VI system is characterized by the presence of the single effector protein C2c2, which lacks homology to any known DNA nuclease domain but contains two Higher Eukaryotes and Prokaryotes Nucleotide-binding (HEPN) domains (19). Given that all functionally characterized HEPN domains are RNases (26), there is a possibility that C2c2 functions solely as an RNA-guided RNA-targeting CRISPR effector.

HEPN domains are also found in other Cas proteins. Csm6, a component of type III-A systems, and the homologous protein Csx1, in type III-B systems, each contain a single HEPN domain and have been biochemically characterized as ssRNA-specific endoribonucleases (21, 27, 28). In addition, type III systems contain complexes of other Cas enzymes that bind and cleave ssRNA through acidic residues associated with RNA-recognition motif (RRM) domains. These complexes (Cas10-Csm in type III-A and Cmr in type III-B) carry out RNA-guided co-transcriptional cleavage of mRNA in concert with DNA target cleavage (22, 29, 30). In contrast, the roles of Csm6 and Csx1, which cleave their targets with little specificity, are less clear, although in some cases, RNA cleavage by Csm6 apparently serves as a second line of defense when DNA targeting fails (21). Additionally, Csm6 and Csx1 have to dimerize to form a composite active site (27, 28, 31), whereas C2c2 contains two HEPN domains, suggesting that it functions as a monomeric endoribonuclease.

As is common with Class 2 systems, type VI systems are simply organized. In particular, the type VI locus in Leptotrichia shahii contains Cas1, Cas2, C2c2 and a CRISPR array, which is expressed and processed into mature crRNAs (19). In all CRISPR-Cas systems characterized to date, Cas1 and Cas2 are exclusively involved in spacer acquisition (32–37), suggesting that C2c2 is the sole effector protein which utilizes a crRNA guide to achieve interference, likely targeting RNA.

Reconstitution of the L. shahii C2c2 locus in Escherichia coli confers RNA-guided immunity

We explored whether LshC2c2 could confer immunity to MS2 (25), a lytic single-stranded (ss) RNA phage that infects E. coli and has no DNA intermediates in its life cycle. We constructed a low-copy plasmid carrying the entire LshC2c2 locus (pLshC2c2) to allow for heterologous reconstitution in E. coli (fig. S1A). Because expressed mature crRNAs from the LshC2c2 locus have a maximum spacer length of 28 nt (fig. S1A) (19), we tiled all possible 28-nt target sites in the MS2 phage genome (Fig. 1A). This resulted in a library of 3,473 spacer sequences (along with 490 non-targeting guides designed to have a Levenshtein distance of ≥8 with respect to the MS2 and E. coli genomes), which we inserted between pLshC2c2 direct repeats (DRs). After transformation of this construct into E. coli, we infected cells with varying dilutions of MS2 (10⁻¹, 10⁻³, and 10⁻⁵) and analyzed surviving cells to determine the spacer sequences carried by cells that survived the infection. Cells carrying spacers that confer robust interference against MS2 are expected to proliferate faster than those that lack such sequences. Following growth for 16 hours, we identified a number of spacers that were consistently enriched across three independent infection replicates in both the 10⁻¹ and 10⁻³ dilution conditions, suggesting that they enabled interference against MS2. Specifically, 147 and 150 spacers showed >1.25 log2-fold enrichment in all three replicates for the 10⁻¹ and 10⁻³ phage dilutions, respectively; of these two groups of top enriched spacers, 84 are shared (Figs. 1B, S2A–G, table S1). Additionally, no non-targeting guides were found to be consistently enriched among the three 10⁻¹, 10⁻³, or 10⁻⁵ phage replicates (fig. S2D, G). We also analyzed the flanking regions of protospacers on the MS2 genome corresponding to the enriched spacers and found that spacers with a G immediately flanking the 3’ end of the protospacer were less fit relative to all other nucleotides at this position (i.e., A, U, or C), suggesting that the 3′ protospacer flanking site (PFS) affects the efficacy of C2c2-mediated targeting (Figs. 1C, S2E–F, S3). Although the PFS is adjacent to the protospacer target, we chose not to use the commonly used protospacer adjacent motif (PAM) nomenclature as it has come to connote a sequence used in self vs. non-self differentiation (38), which is irrelevant in an RNA-targeting system. It is worth noting that the avoidance of G by C2c2 echoes the absence of PAMs applicable to other RNA-targeting CRISPR systems and effector proteins (20, 22, 24, 25, 39, 40).
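The tiling step described above can be illustrated with a minimal Python sketch. It assumes the target genome is supplied as a plain string, takes each spacer to be the reverse complement of its protospacer (because the crRNA pairs with the target RNA), and records the base immediately 3’ of each window as the PFS. The function names are hypothetical and the toy sequence is not MS2; any filtering that produced the actual library size is not reproduced here.

```python
# Illustrative sketch only: tile an RNA genome into fixed-length protospacers,
# derive the complementary spacer, and note the 3' protospacer flanking site (PFS).

RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement_rna(seq):
    """Return the reverse complement of an RNA sequence (A/C/G/U only)."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(seq.upper()))


def tile_protospacers(genome, length=28):
    """Yield (start, protospacer, pfs) for every window of the given length.

    The PFS is the single base immediately 3' of the protospacer, or None
    when the window ends at the last base of the genome.
    """
    rna = genome.upper().replace("T", "U")  # accept DNA-style input as well
    for start in range(len(rna) - length + 1):
        end = start + length
        pfs = rna[end] if end < len(rna) else None
        yield start, rna[start:end], pfs


if __name__ == "__main__":
    toy_genome = "AUGGCUAACG"  # hypothetical sequence, for illustration only
    for start, protospacer, pfs in tile_protospacers(toy_genome, length=4):
        spacer = reverse_complement_rna(protospacer)
        print(start, protospacer, spacer, pfs)
```

Keeping the PFS alongside each window makes it straightforward to group spacers by flanking base afterwards, which is the comparison the screen analysis relies on.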

Heterologous expression of the Leptotrichia shahii C2c2 locus mediates robust interference of RNA phage in Escherichia coli

A) Schematic for the MS2 bacteriophage interference screen. A library consisting of spacers targeting all possible sequences in the MS2 RNA genome was cloned into the LshC2c2 CRISPR array. Cells transformed with the MS2-targeting spacer library were then treated with phage and plated, and surviving cells were harvested. The frequency of spacers was compared to an untreated control (no phage), and enriched spacers from the phage-treated condition were used for the generation of PFS preference logos.

B) Box plot showing the distribution of normalized crRNA frequencies for the phage-treated conditions and control screen (no phage) biological replicates (n = 3). The box extends from the first to third quartile with whiskers denoting 1.5 times the interquartile range. The mean is indicated by the red horizontal bar. The 10⁻¹ and 10⁻³ phage dilution distributions are significantly different from each of the control replicates (****, p < 0.0001 by ANOVA with multiple hypothesis correction).

C) Sequence logo generated from sequences flanking the 3’ end of protospacers corresponding to enriched spacers in the 10⁻¹ phage dilution condition, revealing the presence of a 3’ H PFS (not G).

D) Plaque assay used to validate the functional significance of the H PFS in MS2 interference. All protospacers flanked by non-G PFSs exhibited robust phage interference. Spacers were designed to target the MS2 mat gene and their sequences are shown above the plaque images; the spacer used in the non-targeting control is not complementary to any sequence in either the E. coli or MS2 genome. Phage spots were applied as a series of half-log dilutions.

E) Quantitation of MS2 plaque assay validating the H (non-G) PFS preference. Four MS2-targeting spacers were designed for each PFS. Each point on the scatter plot represents the average of three biological replicates and corresponds to a single spacer. Bars indicate the mean of four spacers for each PFS and the standard error (s.e.m.).

The fact that only ~5% of crRNAs are enriched may reflect other factors influencing interference activity, such as accessibility of the target site that might be affected by RNA binding proteins or secondary structure. In agreement with this hypothesis, the enriched spacers tend to cluster into regions of strong interference where they are closer to each other than one would expect by random chance (fig. S3F–G).
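The enrichment criterion used in the screen (more than 1.25 log2-fold enrichment in all three replicates) can be written out as a short sketch. The normalization to total read counts and the pseudocount are assumptions made here for illustration; the text states only the threshold and the replicate requirement. Function names are hypothetical.

```python
import math


def log2_enrichment(treated_count, control_count, treated_total, control_total,
                    pseudocount=1.0):
    """log2 ratio of a spacer's read frequency in treated vs. control samples.

    Counts are normalized to library size; the pseudocount (an assumption of
    this sketch) avoids division by zero for spacers absent from a sample.
    """
    treated_freq = (treated_count + pseudocount) / treated_total
    control_freq = (control_count + pseudocount) / control_total
    return math.log2(treated_freq / control_freq)


def consistently_enriched(scores_per_replicate, threshold=1.25):
    """True only if every replicate exceeds the log2-fold threshold."""
    return all(score > threshold for score in scores_per_replicate)


if __name__ == "__main__":
    # Hypothetical read counts for one spacer across three replicates.
    replicates = [(39, 9), (50, 12), (61, 15)]
    scores = [log2_enrichment(t, c, 1000, 1000) for t, c in replicates]
    print(scores, consistently_enriched(scores))
```

Requiring the threshold in every replicate, rather than on the mean, guards against a single outlier replicate driving the call.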

To validate the interference activity of the enriched spacers, we individually cloned four top-enriched spacers into pLshC2c2 CRISPR arrays and observed a 3- to 4-log10 reduction in plaque formation, consistent with the level of enrichment observed in the screen (Figs. 1B, S4). We also cloned sixteen guides targeting distinct regions of the MS2 mat gene (4 guides per possible single-nucleotide PFS). All 16 crRNAs mediated MS2 interference, although higher levels of resistance were observed for the C, A, and U PFS-targeting guides (Figs. 1D, 1E, S5), indicating that C2c2 can be effectively retargeted in a crRNA-dependent fashion to sites within the MS2 genome.

To further validate the observed PFS preference with an alternate approach, we designed a protospacer site in the pUC19 plasmid at the 5’ end of the β-lactamase mRNA, which encodes ampicillin resistance in E. coli, flanked by five randomized nucleotides at the 3’ end. Significant depletion and enrichment were observed for the LshC2c2 locus (****, p<0.0001) compared to the pACYC184 controls (fig. S6A). Analysis of the depleted PFS sequences confirmed the presence of a PFS preference of H (fig. S6B).
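One simple way to look for a motif among depleted flanking sequences is to tabulate base frequencies at each of the randomized positions. The sketch below does this for equal-length flanks and also encodes the IUPAC meaning of H (A, C, or U/T, i.e. not G). It is an illustration under assumed inputs, not the analysis pipeline used for fig. S6; the example sequences and function names are hypothetical.

```python
from collections import Counter


def position_frequencies(flanks):
    """Per-position base frequencies for a list of equal-length flanks."""
    if not flanks:
        raise ValueError("at least one flanking sequence is required")
    normalized = [seq.upper().replace("T", "U") for seq in flanks]
    length = len(normalized[0])
    if any(len(seq) != length for seq in normalized):
        raise ValueError("all flanking sequences must have the same length")
    table = []
    for i in range(length):
        counts = Counter(seq[i] for seq in normalized)
        total = sum(counts.values())
        table.append({base: counts.get(base, 0) / total for base in "ACGU"})
    return table


def matches_h_pfs(base):
    """IUPAC code H: any base except G (A, C, or U; T accepted for DNA input)."""
    return base.upper() in {"A", "C", "U", "T"}


if __name__ == "__main__":
    depleted = ["AGCUA", "CGGUA", "UACUA", "ACCGA"]  # hypothetical 5-nt flanks
    for position, freqs in enumerate(position_frequencies(depleted), start=1):
        print(position, freqs)
```

A position where G is strongly under-represented among depleted sequences relative to the others would be consistent with an H preference at that position.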

C2c2 is a single-effector endoRNase mediating ssRNA cleavage with a single crRNA guide

We purified the LshC2c2 protein (fig. S7) and assayed its ability to cleave an in vitro transcribed 173-nt ssRNA target (Figs. 2A, S8) containing a C PFS (ssRNA target 1 with protospacer 14). Mature LshC2c2 crRNAs contain a 28-nt direct repeat (DR) and a 28-nt spacer (fig. S1A) (19). We therefore generated an in vitro transcribed crRNA with a 28-nt spacer complementary to protospacer 14 on ssRNA target 1. LshC2c2 efficiently cleaved ssRNA in a Mg²⁺- and crRNA-dependent manner (Figs. 2B, S9). We then annealed complementary RNA oligos to regions flanking the crRNA target site. This partially double-stranded RNA substrate was not cleaved by LshC2c2, suggesting it is specific for ssRNA (figs. S10A–B).

LshC2c2 and crRNA mediate RNA-guided ssRNA cleavage

A) Schematic of the ssRNA substrate being targeted by the crRNA. The protospacer region is highlighted in blue and the PFS is indicated by the magenta bar.

B) A denaturing gel demonstrating crRNA-mediated ssRNA cleavage by LshC2c2 after 1 hour of incubation. The ssRNA target is either 5’ labeled with IRDye 800 or 3’ labeled with Cy5. Cleavage requires the presence of the crRNA and is abolished by addition of EDTA. Four cleavage sites are observed. Reported band lengths are matched from RNA sequencing.

C) A denaturing gel demonstrating the requirement for an H PFS (not G) after 3 hours of incubation. Four ssRNA substrates that are identical except for the PFS (indicated by the magenta X in the schematic) were used for the in vitro cleavage reactions. ssRNA cleavage activity is dependent on the nucleotide immediately 3’ of the target site. Reported band lengths are matched from RNA sequencing.

D) Schematic showing five protospacers for each PFS on the ssRNA target (top). Denaturing gel showing crRNA-guided ssRNA cleavage activity after 1 hour of incubation. crRNAs correspond to protospacer numbering. Reported band lengths are matched from RNA sequencing.

We tested the sequence constraints of RNA cleavage by LshC2c2 with additional crRNAs complementary to ssRNA target 1 where protospacer 14 is preceded by each PFS variant. The results of this experiment confirmed the preference for C, A, and U PFSs, with little cleavage activity detected for the G PFS target (Fig. 2C). Additionally, we designed 5 crRNAs for each possible PFS (20 total) across the ssRNA target 1 and evaluated cleavage activity for LshC2c2 paired with each of these crRNAs. As expected, we observed less cleavage activity for G PFS-targeting crRNAs compared to other crRNAs tested (Fig. 2D).

We then generated a dsDNA plasmid library with protospacer 14 flanked by 7 random nucleotides to account for any PFS preference. When incubated with LshC2c2 protein and a crRNA complementary to protospacer 14, no cleavage of the dsDNA plasmid library was observed (fig. S10C). We also did not observe cleavage when targeting a ssDNA version of ssRNA target 1 (fig. S10D). To rule out co-transcriptional DNA cleavage, which has been observed in type III CRISPR-Cas systems (22), we recapitulated the E. coli RNA polymerase co-transcriptional cleavage assay (22) (fig. S11A) expressing ssRNA target 1 from a DNA substrate. This assay of purified LshC2c2 and crRNA targeting ssRNA target 1 did not show any DNA cleavage (fig. S11B). Together, these results indicate that C2c2 cleaves specific ssRNA sites directed by the target complementarity encoded in the crRNA, with an H PFS preference.

C2c2 cleavage depends on local target sequence and secondary structure

Given that C2c2 did not efficiently cleave dsRNA substrates and that ssRNA can form complex secondary structures, we reasoned that cleavage by C2c2 might be affected by secondary structure of the ssRNA target. Indeed, after tiling ssRNA target 1 with different crRNAs (Fig. 2D), we observed the same cleavage pattern regardless of the crRNA position along the target RNA. This observation suggests that the crRNA-dependent cleavage pattern was determined by features of the target sequence rather than the distance from the binding site. We hypothesized that the LshC2c2-crRNA complex binds the target and cleaves exposed regions of ssRNA within the secondary structure elements, with potential preference for certain nucleotides.

In agreement with this hypothesis, cleavage of three ssRNA targets with different sequences flanking identical 28-nt protospacers resulted in three distinct patterns of cleavage (Fig. 3A). RNA-sequencing of the cleavage products for the three targets revealed that cleavage sites mainly localized to uracil-rich regions of ssRNA or ssRNA-dsRNA junctions within the in silico predicted co-folds of the target sequence with the crRNA (Figs. 3B–C, S12A–D). To test whether the LshC2c2-crRNA complex prefers cleavage at uracils, we analyzed the cleavage efficiencies of homopolymeric RNA targets (a 28-nt protospacer extended with 120 As or Us regularly interspaced by single bases of G or C to enable oligo synthesis) and found that LshC2c2 preferentially cleaved the uracil target compared to adenine (figs. S12E, S12F). We then tested cleavage of a modified version of ssRNA 4 which had its main site of cleavage, a loop, replaced with each of the four possible homopolymers and found that cleavage only occurred at the uracil homopolymer loop (fig. S12G). To further test whether cleavage was occurring at uracil residues, we mutated single uracil residues in ssRNA 1 that showed cleavage in the RNA-sequencing (Fig. 3B) to adenines. This experiment showed that, by mutating each uracil residue, we could modulate the presence of a single cleavage band, consistent with LshC2c2 cleaving at uracil residues in ssRNA regions (Fig. 3D).
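The single-uracil mutagenesis described above amounts to generating, for each chosen uracil, a variant of the target in which only that base is changed to adenine. A minimal sketch follows; the function name and the toy sequence are hypothetical, and the positions actually mutated in ssRNA 1 are not reproduced here.

```python
def single_u_to_a_variants(rna, positions=None):
    """Return {position: variant} with one U replaced by A per variant.

    Positions are 0-based. If no positions are given, every U in the
    sequence is mutated in turn.
    """
    rna = rna.upper()
    if positions is None:
        positions = [i for i, base in enumerate(rna) if base == "U"]
    variants = {}
    for i in positions:
        if rna[i] != "U":
            raise ValueError(f"position {i} is {rna[i]}, not U")
        variants[i] = rna[:i] + "A" + rna[i + 1:]
    return variants


if __name__ == "__main__":
    toy_target = "GUCUA"  # hypothetical sequence, for illustration only
    for position, variant in single_u_to_a_variants(toy_target).items():
        print(position, variant)
```

Each variant differs from the parent at exactly one base, so the loss of a single cleavage band can be attributed to that uracil.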

C2c2 cleavage sites are determined by secondary structure and sequence of the target RNA

A) Denaturing gel showing C2c2-crRNA-mediated cleavage after 3 hours of incubation of three non-homopolymeric ssRNA targets (1, 4, 5; black, blue and green, respectively, in Figs. 3B–C and S12A–D) that share the same protospacer but are flanked by different sequences. Despite identical protospacers, different flanking sequences resulted in different cleavage patterns. Reported band lengths are matched from RNA sequencing.

B) The cleavage sites of non-homopolymer ssRNA target 1 were mapped with RNA-sequencing of the cleavage products. The frequency of cleavage at each base is colored according to the z-score and shown on the predicted crRNA-ssRNA co-fold secondary structure. Fragments used to generate the frequency analysis contained the complete 5’ end. The 5’ and 3’ ends of the ssRNA target are indicated by blue and red outlines, respectively, on the ssRNA and the secondary structure. The 5’ and 3’ ends of the spacer (outlined in yellow) are indicated by the highlighted blue and orange residues, respectively. The crRNA nucleotides are highlighted in orange.

C) Plot of the frequencies of cleavage sites for each position of ssRNA target 1 for all reads that begin at the 5’ end. The protospacer is indicated by the blue highlighted region.

D) Schematic of a modified ssRNA 1 target showing sites (red) of single U to A flips (left). Denaturing gel showing C2c2-crRNA mediated cleavage of each of these single nucleotide variants after 3 hours of incubation (right). Reported band lengths are matched from RNA sequencing.

The HEPN domains of C2c2 mediate RNA-guided ssRNA-cleavage

Bioinformatic analysis of C2c2 has suggested that the HEPN domains are likely to be responsible for the observed catalytic activity (19). Each of the two HEPN domains of C2c2 contains a dyad of conserved arginine and histidine residues (Fig. 4A), in agreement with the catalytic mechanism of the HEPN endoRNase (26–28). We mutated each of these putative catalytic residues separately to alanine (R597A, H602A, R1278A, H1283A) in the LshC2c2 locus plasmids and assayed for MS2 interference. None of the four mutant plasmids were able to protect E. coli from phage infection (Figs. 4B, S13).

The two HEPN domains of C2c2 are necessary for crRNA-guided ssRNA cleavage but not for binding

A) Schematic of the LshC2c2 locus and the domain organization of the LshC2c2 protein, showing conserved residues in HEPN domains (dark blue).

B) Quantification of MS2 plaque assay with HEPN catalytic residue mutants. For each mutant, the same crRNA targeting protospacer 35 was used. (n=3 biological replicates, ****, p < 0.0001 compared to pACYC184 by t-test. Bars represent mean ± s.e.m.)

C) Denaturing gel showing conserved residues of the HEPN motif, indicated as catalytic residues in panel A, are necessary for crRNA-guided ssRNA target 1 cleavage after 3 hours of incubation. Reported band lengths are matched from RNA sequencing.

D) Electrophoretic mobility shift assay (EMSA) evaluating affinity of the wild type LshC2c2-crRNA complex against a targeted (left) and a non-targeted (right) ssRNA substrate. The non-targeted ssRNA substrate is the reverse complement of the targeted ssRNA 10. EDTA was added to the reaction to reduce any cleavage activity.

E) Electrophoretic mobility shift assay with LshC2c2(R1278A)-crRNA complex against on-target ssRNA 10 and non-targeting ssRNA (same substrate sequences as in D).

We purified the four single-point mutant proteins and assayed their ability to cleave 5’-end-labeled ssRNA target 1 (Fig. 4C). In agreement with our in vivo results, all four mutations abolished cleavage activity. The inability of either of the two wild-type HEPN domains to compensate for inactivation of the other implies cooperation between the two domains. These results agree with observations that several bacterial and eukaryotic single-HEPN proteins function as dimers (27, 28, 41).

Catalytically inactive variants of Cas9 retain target DNA binding, allowing for the creation of programmable DNA-binding proteins (12, 13). Electrophoretic mobility shift assays (EMSA) on both the wild-type (Fig. 4D) and R1278A mutant LshC2c2 (Fig. 4E) in complex with crRNA showed the wild-type LshC2c2 complex binding strongly (KD ~ 46 nM, fig. S14A) and specifically to 5’-end-labeled ssRNA target 10 but not to the 5’-end-labeled non-target ssRNA (the reverse complement of ssRNA target 10). The R1278A mutant C2c2 complex showed even stronger (KD ~ 7 nM, fig. S14B) specific binding, indicating that this HEPN mutation results in a catalytically inactive, RNA-programmable RNA-binding protein. The LshC2c2 protein or crRNA alone showed reduced levels of target affinity, as expected (fig. S14C–E). Additionally, no specific binding of LshC2c2-crRNA complex to ssDNA was observed (fig. S15).
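To give a sense of what the two dissociation constants mean in practice, the sketch below evaluates the fraction of target RNA bound under a simple one-site equilibrium model, fraction bound = [P] / ([P] + KD), with the protein-crRNA complex in excess over labeled RNA. This model is an assumption chosen for illustration; the text does not state how the KD values were fitted.

```python
def fraction_bound(complex_nM, kd_nM):
    """Fraction of target RNA bound under a one-site equilibrium model.

    Assumes the C2c2-crRNA complex is in large excess over the labeled RNA,
    so the free complex concentration approximates the total concentration.
    """
    if complex_nM < 0 or kd_nM <= 0:
        raise ValueError("concentration must be >= 0 and KD must be > 0")
    return complex_nM / (complex_nM + kd_nM)


if __name__ == "__main__":
    # KD values reported in the text: ~46 nM (wild type), ~7 nM (R1278A).
    for label, kd in (("wild type", 46.0), ("R1278A", 7.0)):
        for concentration in (7.0, 46.0, 200.0):
            print(label, concentration, round(fraction_bound(concentration, kd), 3))
```

Under this model, half of the target is bound when the complex concentration equals the KD, so the lower KD of the R1278A mutant corresponds to half-maximal binding at a lower complex concentration.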

These results demonstrate that C2c2 cleaves RNA via a catalytic mechanism distinct from other known CRISPR-associated RNases. In particular, the type III Csm and Cmr multiprotein complexes rely on acidic residues of RRM domains for catalysis, whereas C2c2 achieves RNA cleavage through the conserved basic residues of its two HEPN domains.

Sequence and structural requirements of C2c2 crRNA

Similar to the type V-A (Cpf1) systems (18), the LshC2c2 crRNA contains a single stem loop in the direct repeat (DR), suggesting that the secondary structure of the crRNA could facilitate interaction with LshC2c2. We thus investigated the length requirements of the spacer sequence for ssRNA cleavage and found that LshC2c2 requires spacers of at least 22 nt in length to efficiently cleave ssRNA target 1 (fig. S16A). The stem-loop structure of the crRNA is also critical for ssRNA cleavage, because DR truncations that disturbed the stem loop abrogated target cleavage (fig. S16B). Thus, a DR longer than 24 nt is required to maintain the stem loop necessary for LshC2c2 to mediate ssRNA cleavage.
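The two length requirements reported above (a spacer of at least 22 nt and a DR longer than 24 nt) can be captured in a small design check. This is a sketch with a hypothetical function name; it encodes only the thresholds stated in the text for ssRNA target 1 and says nothing about sequence or stem-loop structure.

```python
def check_crrna_lengths(dr_length, spacer_length, min_spacer=22, dr_must_exceed=24):
    """Return a list of length problems for a candidate crRNA (empty if none).

    Thresholds follow the truncation results reported for LshC2c2:
    spacer >= 22 nt and DR > 24 nt.
    """
    problems = []
    if spacer_length < min_spacer:
        problems.append(
            f"spacer is {spacer_length} nt; at least {min_spacer} nt is required"
        )
    if dr_length <= dr_must_exceed:
        problems.append(
            f"DR is {dr_length} nt; more than {dr_must_exceed} nt is required"
        )
    return problems


if __name__ == "__main__":
    print(check_crrna_lengths(dr_length=28, spacer_length=28))  # mature crRNA
    print(check_crrna_lengths(dr_length=24, spacer_length=20))  # truncated design
```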

Single base pair inversions in the stem that preserved the stem structure did not affect the activity of the LshC2c2 complex. In contrast, inverting all four G-C pairs in the stem eliminated the cleavage despite maintaining the duplex structure (fig. S17A). Other perturbations, such as those that introduced kinks and reduced or increased base-pairing in the stem, also eliminated or drastically suppressed cleavage. This suggests that the crRNA stem length is important for complex formation and activity (fig. S17A). We also found that loop deletions eliminated cleavage, whereas insertions and substitutions mostly maintained some level of cleavage activity (fig. S17B). In contrast, nearly all substitutions or deletions in the region 3’ to the DR prevented cleavage by LshC2c2 (fig. S18). Together, these results demonstrate that LshC2c2 recognizes structural characteristics of its cognate crRNA but is amenable to loop insertions and most tested base substitutions outside of the 3’ DR region. These results have implications for the future application of C2c2-based tools that require guide engineering for recruitment of effectors or modulation of activity (42–44).

C2c2 cleavage is sensitive to double mismatches in the crRNA-target duplex

We tested the sensitivity of the LshC2c2 system to single mismatches between the crRNA guide and target RNA by mutating single bases across the spacer to the respective complementary bases (e.g., A to U). We then quantified plaque formation with these mismatched spacers in the MS2 infection assay and found that C2c2 was fully tolerant to single mismatches across the spacer, as such mismatched spacers interfered with phage propagation with similar efficiency as fully matched spacers (figs. S19A, S20). However, when we introduced consecutive double substitutions in the spacer, we found a ~3 log10-fold reduction in the protection for mismatches in the center, but not at the 5’- or 3’-end, of the crRNA (figs. S19B, S20). This observation suggests the presence of a mismatch-sensitive “seed region” in the center of the crRNA-target duplex.

We generated a set of in vitro transcribed crRNAs with mismatches similarly positioned across the spacer region. When incubated with LshC2c2 protein, all single-mismatch crRNAs supported cleavage (fig. S19C), in agreement with our in vivo findings. When tested with a set of consecutive and non-consecutive double mutant crRNAs, LshC2c2 was unable to cleave the target RNA if the mismatches were positioned in the center, but not at the 5’- or 3’-end of the crRNA (figs. S19D, S21A), further supporting the existence of a central seed region. Additionally, no cleavage activity was observed with crRNAs containing consecutive triple mismatches in the seed region (fig. S21B).
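The mismatch design described above, in which spacer bases are changed to their complements (e.g., A to U), can be sketched as follows. The helper names are hypothetical, the example uses a toy 4-nt spacer rather than a real 28-nt one, and the sketch enumerates every position, whereas the experiments tested a selected set of positions.

```python
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def mismatch_spacer(spacer, positions):
    """Replace the bases at the given 0-based positions with their complements."""
    bases = list(spacer.upper())
    for position in positions:
        bases[position] = RNA_COMPLEMENT[bases[position]]
    return "".join(bases)


def single_mismatches(spacer):
    """One variant per position, each carrying a single mismatch."""
    return [mismatch_spacer(spacer, [i]) for i in range(len(spacer))]


def consecutive_double_mismatches(spacer):
    """One variant per adjacent pair of positions, each carrying two mismatches."""
    return [mismatch_spacer(spacer, [i, i + 1]) for i in range(len(spacer) - 1)]


if __name__ == "__main__":
    toy_spacer = "AUGC"  # hypothetical; real LshC2c2 spacers are up to 28 nt
    print(single_mismatches(toy_spacer))
    print(consecutive_double_mismatches(toy_spacer))
```

Substituting the complement guarantees that the altered position can no longer form a Watson-Crick pair with the original target base.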

C2c2 can be reprogrammed to mediate specific mRNA knockdown in vivo

Given the ability of C2c2 to cleave target ssRNA in a crRNA sequence-specific manner, we tested whether LshC2c2 could be reprogrammed to degrade selected non-phage ssRNA targets, and particularly mRNAs, in vivo. We co-transformed E. coli with a plasmid encoding LshC2c2 and a crRNA targeting the mRNA of red fluorescent protein (RFP) as well as a compatible plasmid expressing RFP (Fig. 5A). For OD-matched samples, we observed an approximately 20% to 92% decrease in RFP positive cells for crRNAs targeting protospacers flanked by C, A, or U PFSs (Fig. 5B, C). As a control, we tested crRNAs containing reverse complements (targeting the dsDNA plasmid) of the top performing RFP mRNA-targeting spacers. As expected, we observed no decrease in RFP fluorescence by these crRNAs (Fig. 5B). We also confirmed that mutation of the catalytic arginine residues in either HEPN domain to alanine precluded RFP knockdown (fig. S22). Thus, C2c2 is capable of general retargeting to arbitrary ssRNA substrates, governed exclusively by predictable nucleic-acid interactions.

RFP mRNA knockdown by retargeting LshC2c2

A) Schematic showing crRNA-guided knockdown of RFP in E. coli heterologously expressing the LshC2c2 locus. Three RFP-targeting spacers were selected for each non-G PFS and each protospacer on the RFP mRNA is numbered.

B) RFP mRNA-targeting spacers effected RFP knockdown whereas DNA-targeting spacers (targeting the non-coding strand of the RFP gene on the expression plasmid, indicated as “rc” spacers) did not affect RFP expression. (n=3 biological replicates, ****, p < 0.0001 compared to non-targeting guide by ANOVA with multiple hypothesis correction. Bars represent mean ± s.e.m.)

C) Quantification of RFP knockdown in E. coli. Three spacers each targeting C, U, or A PFS-flanking protospacers (9 spacers, numbered 5–13 as indicated in panel (A)) in the RFP mRNA were introduced and RFP expression was measured by flow cytometry. Each point on the scatter plot represents the average of three biological replicates and corresponds to a single spacer. Bars indicate the mean of 3 spacers for each PFS and error bars are shown as the s.e.m.

D) Timeline of E. coli growth assay.

E) Effect of RFP mRNA targeting on the growth rate of E. coli transformed with an inducible RFP expression plasmid as well as the LshC2c2 locus with non-targeting, RNA targeting (spacer complementary to the RFP mRNA or RFP gene coding strand), and pACYC control plasmid at different anhydrotetracycline (aTc) concentrations.

When we examined the growth of cells carrying the RFP-targeting spacer with the greatest level of RFP knockdown, we noted that the growth rate of these bacteria was substantially reduced (Fig. 5A, spacer 7). We investigated whether the effect on growth was mediated by the RFP mRNA-targeting activity of LshC2c2 by introducing an inducible-RFP plasmid and an RFP-targeting LshC2c2 locus into E. coli. Upon induction of RFP transcription, cells with RFP knockdown showed substantial growth suppression, not observed in non-targeting controls (Fig. 5D, E). This growth restriction was dependent on the level of the RFP mRNA, as controlled by the concentration of the inducer anhydrotetracycline. In contrast, in the absence of RFP transcription, we did not observe any growth restriction nor did we observe any transcription-dependent DNA targeting in our biochemical experiment (fig. S11). These results indicate that RNA-targeting is likely the primary driver of this growth restriction phenotype. We therefore surmised that, in addition to the cleavage of the target RNA, C2c2 CRISPR systems might prevent virus reproduction also via non-specific cleavage of cellular mRNAs, causing programmed cell death (PCD) or dormancy (45, 46).

C2c2 cleaves collateral RNA in addition to crRNA-targeted ssRNA

Cas9 and Cpf1 cleave DNA within the crRNA-target heteroduplex at defined positions, reverting to an inactive state after cleavage. In contrast, C2c2 cleaves the target RNA outside of the crRNA binding site at varying distances depending on the flanking sequence, presumably within exposed ssRNA loop regions (Figs. 3B, 3C, S12A–D). This observed flexibility with respect to the cleavage distance led us to test whether cleavage of other, non-target ssRNAs also occurs upon C2c2 target binding and activation. Under this model, the C2c2-crRNA complex, once activated by binding to its target RNA, cleaves the target RNA as well as other RNAs non-specifically. We carried out in vitro cleavage reactions that included, in addition to LshC2c2 protein, crRNA and its target RNA, one of four unrelated RNA molecules without any complementarity to the crRNA guide (Fig. 6A). These experiments showed that, whereas the LshC2c2-crRNA complex did not mediate cleavage of any of the four collateral RNAs in the absence of the target RNA, all four were efficiently degraded in the presence of the target RNA (Figs. 6B, S23A). Furthermore, R597A and R1278A HEPN mutants were unable to cleave collateral RNA (Fig. S23B).

crRNA-guided ssRNA cleavage activates non-specific RNase activity of LshC2c2

A) Schematic of the biochemical assay used to detect crRNA-binding-activated non-specific RNase activity on non-crRNA-targeted collateral RNA molecules. The reaction consists of C2c2 protein, unlabeled crRNA, unlabeled target ssRNA, and a second ssRNA with 3’ fluorescent labeling and is incubated for 3 hours. C2c2-crRNA mediates cleavage of the unlabeled target ssRNA as well as the 3’-end-labeled collateral RNA which has no complementarity to the crRNA.

B) Denaturing gel showing non-specific RNase activity against non-targeted ssRNA substrates in the presence of target RNA after 3 hours of incubation. The non-targeted ssRNA substrate is not cleaved in the absence of the crRNA-targeted ssRNA substrate.

To further investigate the collateral cleavage and growth restriction in vivo, we hypothesized that if a PFS preference screen for LshC2c2 was performed in a transcribed region on the transformed plasmid, then we should be able to detect the PFS preference due to growth restriction induced by RNA targeting. We designed a protospacer site flanked by five randomized nucleotides at the 3’ end in either a non-transcribed region or in a region transcribed from the lac promoter (fig. S24A). The analysis of the depleted and enriched PFS sequences identified an H PFS only in the assay with the transcribed sequence but no discernible motif in the non-transcribed sequence (fig. S24B–C).

These results suggest a HEPN-dependent mechanism whereby C2c2 in a complex with crRNA is activated upon binding to target RNA and subsequently cleaves other available ssRNAs non-specifically. Such promiscuous RNA cleavage could cause cellular toxicity, resulting in the observed growth rate inhibition. These findings imply that, in addition to their likely role in direct suppression of RNA viruses, type VI CRISPR-Cas systems could function as mediators of a distinct variety of PCD or dormancy induction that is specifically triggered by cognate invader genomes (Fig. 7). Under this scenario, dormancy would slow the infection and supply additional time for adaptive immunity. Such a mechanism agrees with the previously proposed coupling of adaptive immunity and PCD during the CRISPR-Cas defensive response (47).

C2c2 as a putative RNA-targeting prokaryotic immune system

The C2c2-crRNA complex recognizes target RNA via base pairing with the cognate protospacer and cleaves the target RNA. In addition, binding of the target RNA by C2c2-crRNA activates a non-specific RNase activity which may lead to promiscuous cleavage of RNAs without complementarity to the crRNA guide sequence. Through this non-specific RNase activity, C2c2 may also cause abortive infection via programmed cell death or dormancy induction.

Conclusions

In summary, the Class 2 type VI effector protein C2c2 is an RNA-guided RNase that can be efficiently programmed to degrade any ssRNA by specifying a 28-nt sequence on the crRNA (Fig. 10). C2c2 cleaves RNA through conserved basic residues within its two HEPN domains, in contrast to the catalytic mechanisms of other known RNases found in CRISPR-Cas systems (25, 48). Alanine substitution of any of the four predicted HEPN domain catalytic residues converted C2c2 into an inactive programmable RNA-binding protein (dC2c2, analogous to dCas9). Many different spacer sequences work well in our assays although further screening will likely define properties and rules governing optimal function.
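As an illustration of what specifying a 28-nt sequence involves, the sketch below lists candidate spacers for a target ssRNA. It is a hedged example, not a design tool from this study: the function names and the toy target are hypothetical, the direct repeat portion of the crRNA is omitted, and the filter on the 3’ flanking base simply encodes the H (non-G) PFS preference described above.

```python
COMPLEMENT = {"A": "U", "C": "G", "G": "C", "U": "A"}


def reverse_complement(rna):
    """Reverse complement of an RNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def candidate_spacers(target_rna, spacer_len=28, allowed_pfs="ACU"):
    """Return (start, protospacer, pfs, spacer) for each usable window.

    A window is kept only if the base immediately 3' of the protospacer
    (the PFS) is in allowed_pfs. The spacer is the reverse complement of
    the protospacer, so that it can base pair with the target.
    """
    target_rna = target_rna.upper().replace("T", "U")
    results = []
    # Stop one base early so that every window has a 3' flanking base.
    for start in range(len(target_rna) - spacer_len):
        protospacer = target_rna[start:start + spacer_len]
        pfs = target_rna[start + spacer_len]
        if pfs in allowed_pfs:
            results.append((start, protospacer, pfs, reverse_complement(protospacer)))
    return results


# Hypothetical 30-nt target: two windows have a 3' flanking base,
# flanked by G (rejected) and by A (kept).
toy_target = "ACGU" * 7 + "GA"
candidates = candidate_spacers(toy_target)
for start, protospacer, pfs, spacer in candidates:
    print(start, pfs, spacer)
```

Real spacer selection would likely also need to account for target accessibility and other properties that, as noted above, remain to be defined by further screening.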

These results suggest a broad range of biotechnology applications and research questions (49–51). For example, the ability of dC2c2 to bind to specified sequences could be used to (i) bring effector modules to specific transcripts to modulate their function or translation, which could be used for large-scale screening, construction of synthetic regulatory circuits and other purposes; (ii) fluorescently tag specific RNAs to visualize their trafficking and/or localization; (iii) alter RNA localization through domains with affinity for specific subcellular compartments; and (iv) capture specific transcripts (through direct pull down of dC2c2) to enrich for proximal molecular partners, including RNAs and proteins.

Active C2c2 also has many potential applications such as targeting a specific transcript for destruction, as performed here with RFP. In addition, C2c2, once primed by the cognate target, can cleave other (non-complementary) RNA molecules in vitro and inhibit cell growth in vivo. Biologically, this promiscuous RNase activity might reflect a PCD/dormancy-based protection mechanism of the type VI CRISPR-Cas systems (Fig. 7). Technologically, it might be used to trigger PCD or dormancy in specific cells such as cancer cells expressing a particular transcript, neurons of a given class, or cells infected by a specific pathogen.

Further experimental study is required to elucidate the mechanisms by which the C2c2 system acquires spacers and the classes of pathogens against which it protects bacteria. The presence of the conserved CRISPR adaptation module consisting of typical Cas1 and Cas2 proteins in the LshC2c2 locus suggests that it is capable of spacer acquisition. Although C2c2 systems lack reverse transcriptases, which mediate acquisition of RNA spacers in some type III systems (52), it is possible that additional host or viral factors could support RNA spacer acquisition. Additionally or alternatively, type VI systems could acquire DNA spacers similar to other CRISPR-Cas variants but then target transcripts of the respective DNA genomes, eliciting PCD and abortive infection (Fig. 7).

The CRISPR-C2c2 system represents a distinct evolutionary path among Class 2 CRISPR-Cas systems. It is likely that other, broadly analogous Class 2 RNA-targeting immune systems exist, and further characterization of the diverse members of Class 2 systems will deepen our understanding of bacterial immunity and provide a rich starting point for the development of programmable molecular tools for in vivo RNA manipulation.

Materials and Methods

Expanded materials and methods, including computational analysis, can be found in supplementary materials and methods.

Bacterial phage interference

The C2c2 CRISPR locus was amplified from DNA from Leptotrichia shahii DSM 19757 (ATCC, Manassas, VA) and cloned for heterologous expression in E. coli. For screens, a library of all possible spacers targeting the MS2 genome was cloned into the spacer array; for individual spacers, single specific spacers were cloned into the array. Interference screens were performed in liquid culture and plated; surviving colonies were harvested for DNA and spacer representation was determined by next-generation sequencing. Individual spacers were tested by spotting on top agar.
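The construction of a tiled spacer library and the measurement of spacer representation can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the analysis code used here: the function names, the 32-nt stand-in sequence (not the MS2 genome), the toy reads, and the exact-match read assignment are all illustrative choices.

```python
from collections import Counter

# Translation table for complementing RNA bases.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def tile_spacers(genome_rna, spacer_len=28):
    """Map each spacer (reverse complement of a genome window) to its start sites."""
    library = {}
    for start in range(len(genome_rna) - spacer_len + 1):
        window = genome_rna[start:start + spacer_len]
        spacer = window.translate(COMPLEMENT)[::-1]
        library.setdefault(spacer, []).append(start)
    return library


def spacer_representation(reads, library):
    """Fraction of library-matching reads assigned to each spacer (exact match)."""
    counts = Counter(read for read in reads if read in library)
    total = sum(counts.values())
    return {spacer: n / total for spacer, n in counts.items()} if total else {}


# Hypothetical 32-nt stand-in for a phage genome; it yields five 28-nt windows.
toy_genome = "AUGGCUAGCUUACGGAUCCGUAAGCUUGACCA"
library = tile_spacers(toy_genome)
spacers = list(library)

# Toy reads (invented): three copies of one spacer, one of another, one unmatched read.
reads = [spacers[0]] * 3 + [spacers[1]] + ["N" * 28]
fractions = spacer_representation(reads, library)
print(len(library), fractions)
```

Comparing such fractions between the screened and control conditions would indicate which spacers are enriched among surviving colonies.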

β-lactamase and transcribed/non-transcribed PFS preference screens

Sequences with randomized nucleotides adjacent to protospacer 1 were cloned into pUC19 in corresponding regions. Libraries were screened by co-transformation with LshC2c2 locus plasmid or pACYC184 plasmid control, harvesting of the surviving colonies, and next-generation sequencing of the resulting regions.

RFP targeting assay

Cells containing an RFP expressing plasmid were transformed with an LshC2c2 locus plasmid with corresponding spacers, grown overnight, and analyzed for RFP fluorescence by flow cytometry. The growth effects of LshC2c2 activity were quantified by titrating inducible RFP levels with dilutions of anhydrotetracycline inducer and then measuring OD600.

in vitro nuclease and electrophoretic mobility shift assays

LshC2c2 protein and HEPN mutants were purified for use in in vitro reactions; RNAs were synthesized via in vitro transcription. For nuclease assays, protein was co-incubated with crRNA and either 3’- or 5’-labeled targets and analyzed via denaturing gel electrophoresis and imaging or by next-generation sequencing. For electrophoretic mobility shift assays, protein and nucleic acid were co-incubated and then resolved by gel electrophoresis and imaging.

Acknowledgments

We would like to thank P. Boutz, J. Doench, P. Sharp, and B. Zetsche for helpful discussions and insights; R. Belliveau for overall research support; J. Francis and D. O’Connell for generous MiSeq instrument access; D. Daniels and C. Garvie for providing bacterial incubation space for protein purification; R. Macrae for critical reading of the manuscript; and the entire Zhang laboratory for support and advice. We would like to thank N. Ranu for generously providing pRFP and D. Daniels for providing 6-His-MBP-TEV. O.A.A. is supported by a Paul and Daisy Soros Fellowship, a Friends of the McGovern Institute Fellowship, and the Poitras Center for Affective Disorders.

J.S.G. is supported by a D.O.E. Computational Science Graduate Fellowship. S.S. is supported by the graduate program of Skoltech Data-Intensive Biomedicine and Biotechnology Center for Research, Education, and Innovation. I.S. is supported by the Simons Center for the Social Brain. D.B.T.C. is supported by award number T32GM007753 from the National Institute of General Medical Sciences. K.S.M., E.V.K. and, in part, S.S. are supported by the intramural program of the US Department of Health and Human Services (to the National Library of Medicine). K.S. is supported by an NIH grant GM10407, Russian Science Foundation grant 14-14-00988, and Skoltech. F.Z. is a New York Stem Cell Foundation-Robertson Investigator. F.Z. is supported by the NIH through NIMH (5DP1-MH100706 and 1R01-MH110049), NSF, the New York Stem Cell, Simons, Paul G. Allen Family, and Vallee Foundations; and James and Patricia Poitras, Robert Metcalfe, and David Cheng. O.A.A., J.S.G., J.J., E.V.K., S.K., E.S.L., K.S.M., L.M., E.S., K.S., S.S., Y.W., and F.Z. are inventors on provisional patent application 62/181,675 applied for by the Broad Institute, MIT, Harvard, NIH, Skoltech, and Rutgers that covers the C2c2 proteins described in this paper. Deep sequencing data are available at Sequence Read Archive under BioProject accession number PRJNA318890. The authors plan to make the reagents widely available to the academic community through Addgene and to provide software tools via the Zhang lab website (www.genome-engineering.org) and GitHub (github.com/fengzhanglab).

References and Notes

1. Makarova KS, et al. An updated evolutionary classification of CRISPR-Cas systems. Nat Rev Microbiol. 2015;13:722–736.
2. Makarova KS, et al. Evolution and classification of the CRISPR-Cas systems. Nat Rev Microbiol. 2011;9:467–477.
3. Wright AV, Nunez JK, Doudna JA. Biology and Applications of CRISPR Systems: Harnessing Nature's Toolbox for Genome Engineering. Cell. 2016;164:29–44.
4. Marraffini LA. CRISPR-Cas immunity in prokaryotes. Nature. 2015;526:55–61.
5. van der Oost J, Jore MM, Westra ER, Lundgren M, Brouns SJ. CRISPR-based adaptive and heritable immunity in prokaryotes. Trends Biochem Sci. 2009;34:401–407.
6. Brouns SJ, et al. Small CRISPR RNAs guide antiviral defense in prokaryotes. Science. 2008;321:960–964.
7. Hale CR, et al. RNA-guided RNA cleavage by a CRISPR RNA-Cas protein complex. Cell. 2009;139:945–956.
8. Jackson RN, et al. Structural biology. Crystal structure of the CRISPR RNA-guided surveillance complex from Escherichia coli. Science. 2014;345:1473–1479.
9. Marraffini LA, Sontheimer EJ. CRISPR interference limits horizontal gene transfer in staphylococci by targeting DNA. Science. 2008;322:1843–1845.
10. Mulepati S, Heroux A, Bailey S. Structural biology. Crystal structure of a CRISPR RNA-guided surveillance complex bound to a ssDNA target. Science. 2014;345:1479–1484.
11. Sinkunas T, et al. In vitro reconstitution of Cascade-mediated CRISPR immunity in Streptococcus thermophilus. EMBO J. 2013;32:385–394.
12. Gasiunas G, Barrangou R, Horvath P, Siksnys V. Cas9-crRNA ribonucleoprotein complex mediates specific DNA cleavage for adaptive immunity in bacteria. Proc Natl Acad Sci U S A. 2012;109:E2579–2586.
13. Jinek M, et al. A programmable dual-RNA-guided DNA endonuclease in adaptive bacterial immunity. Science. 2012;337:816–821.
14. Deltcheva E, et al. CRISPR RNA maturation by trans-encoded small RNA and host factor RNase III. Nature. 2011;471:602–607.
15. Sapranauskas R, et al. The Streptococcus thermophilus CRISPR/Cas system provides immunity in Escherichia coli. Nucleic Acids Res. 2011;39:9275–9282.
16. Garneau JE, et al. The CRISPR/Cas bacterial immune system cleaves bacteriophage and plasmid DNA. Nature. 2010;468:67–71.
17. Barrangou R, et al. CRISPR provides acquired resistance against viruses in prokaryotes. Science. 2007;315:1709–1712.
18. Zetsche B, et al. Cpf1 is a single RNA-guided endonuclease of a class 2 CRISPR-Cas system. Cell. 2015;163:759–771.
19. Shmakov S, et al. Discovery and Functional Characterization of Diverse Class 2 CRISPR-Cas Systems. Mol Cell. 2015;60:385–397.
20. Hale CR, et al. Essential features and rational design of CRISPR RNAs that function with the Cas RAMP module complex to cleave RNAs. Mol Cell. 2012;45:292–302.
21. Jiang W, Samai P, Marraffini LA. Degradation of Phage Transcripts by CRISPR-Associated RNases Enables Type III CRISPR-Cas Immunity. Cell. 2016;164:710–721.
22. Samai P, et al. Co-transcriptional DNA and RNA Cleavage during Type III CRISPR-Cas Immunity. Cell. 2015;161:1164–1174.
23. Staals RH, et al. Structure and activity of the RNA-targeting Type III-B CRISPR-Cas complex of Thermus thermophilus. Mol Cell. 2013;52:135–145.
24. Staals RH, et al. RNA targeting by the type III-A CRISPR-Cas Csm complex of Thermus thermophilus. Mol Cell. 2014;56:518–530.
25. Tamulaitis G, et al. Programmable RNA shredding by the type III-A CRISPR-Cas system of Streptococcus thermophilus. Mol Cell. 2014;56:506–517.
26. Anantharaman V, Makarova KS, Burroughs AM, Koonin EV, Aravind L. Comprehensive analysis of the HEPN superfamily: identification of novel roles in intra-genomic conflicts, defense, pathogenesis and RNA processing. Biol Direct. 2013;8:15.
27. Niewoehner O, Jinek M. Structural basis for the endoribonuclease activity of the type III-A CRISPR-associated protein Csm6. RNA. 2016;22:318–329.
28. Sheppard NF, Glover CV, 3rd, Terns RM, Terns MP. The CRISPR-associated Csx1 protein of Pyrococcus furiosus is an adenosine-specific endoribonuclease. RNA. 2016;22:216–224.
29. Goldberg GW, Jiang W, Bikard D, Marraffini LA. Conditional tolerance of temperate phages via transcription-dependent CRISPR-Cas targeting. Nature. 2014;514:633–637.
30. Deng L, Garrett RA, Shah SA, Peng X, She Q. A novel interference mechanism by a type IIIB CRISPR-Cmr module in Sulfolobus. Mol Microbiol. 2013;87:1088–1099.
31. Kim YK, Kim YG, Oh BH. Crystal structure and nucleic acid-binding activity of the CRISPR-associated protein Csx1 of Pyrococcus furiosus. Proteins. 2013;81:261–270.
32. Nunez JK, Lee AS, Engelman A, Doudna JA. Integrase-mediated spacer acquisition during CRISPR-Cas adaptive immunity. Nature. 2015
33. Heler R, et al. Cas9 specifies functional viral targets during CRISPR-Cas adaptation. Nature. 2015
34. Nunez JK, et al. Cas1-Cas2 complex formation mediates spacer acquisition during CRISPR-Cas adaptive immunity. Nat Struct Mol Biol. 2014;21:528–534.
35. Diez-Villasenor C, Guzman NM, Almendros C, Garcia-Martinez J, Mojica FJ. CRISPR-spacer integration reporter plasmids reveal distinct genuine acquisition specificities among CRISPR-Cas I-E variants of Escherichia coli. RNA Biol. 2013;10:792–802.
36. Yosef I, Goren MG, Qimron U. Proteins and DNA elements essential for the CRISPR adaptation process in Escherichia coli. Nucleic Acids Res. 2012;40:5569–5576.
37. Datsenko KA, et al. Molecular memory of prior infections activates the CRISPR/Cas adaptive bacterial immunity system. Nat Commun. 2012;3:945.
38. Marraffini LA, Sontheimer EJ. Self versus non-self discrimination during CRISPR RNA-directed immunity. Nature. 2010;463:568–571.
39. Hale CR, Cocozaki A, Li H, Terns RM, Terns MP. Target RNA capture and cleavage by the Cmr type III-B CRISPR-Cas effector complex. Genes Dev. 2014;28:2432–2443.
40. Zhang J, et al. Structure and mechanism of the CMR complex for CRISPR-mediated antiviral immunity. Mol Cell. 2012;45:303–313.
41. Kozlov G, et al. Structural Basis of Defects in the Sacsin HEPN Domain Responsible for Autosomal Recessive Spastic Ataxia of Charlevoix-Saguenay (ARSACS). J Biol Chem. 2011;286:20407–20412.
42. Kiani S, et al. Cas9 gRNA engineering for genome editing, activation and repression. Nat Methods. 2015;12:1051–1054.
43. Konermann S, et al. Genome-scale transcriptional activation by an engineered CRISPR-Cas9 complex. Nature. 2015;517:583–588.
44. Dahlman JE, et al. Orthogonal gene knockout and activation with a catalytically active Cas9 nuclease. Nat Biotechnol. 2015;33:1159–1161.
45. Makarova KS, Wolf YI, Koonin EV. Comprehensive comparative-genomic analysis of type 2 toxin-antitoxin systems and related mobile stress response systems in prokaryotes. Biol Direct. 2009;4:19.
46. Hayes F, Van Melderen L. Toxins-antitoxins: diversity, evolution and function. Crit Rev Biochem Mol Biol. 2011;46:386–408.
47. Makarova KS, Anantharaman V, Aravind L, Koonin EV. Live virus-free or die: coupling of antivirus immunity and programmed suicide or dormancy in prokaryotes. Biol Direct. 2012;7:40.
48. Benda C, et al. Structural model of a CRISPR RNA-silencing complex reveals the RNA-target cleavage activity in Cmr4. Mol Cell. 2014;56:43–54.
49. Abil Z, Zhao H. Engineering reprogrammable RNA-binding proteins for study and manipulation of the transcriptome. Mol Biosyst. 2015;11:2658–2665.
50. Mackay JP, Font J, Segal DJ. The prospects for designer single-stranded RNA-binding proteins. Nat Struct Mol Biol. 2011;18:256–261.
51. Filipovska A, Rackham O. Designer RNA-binding proteins: New tools for manipulating the transcriptome. RNA Biol. 2011;8:978–983.
52. Silas S, et al. Direct CRISPR spacer acquisition from RNA by a natural reverse transcriptase-Cas1 fusion protein. Science. 2016;351:aad4234.
53. Li H, Durbin R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics. 2009;25:1754–1760.
54. Crooks GE, Hon G, Chandonia JM, Brenner SE. WebLogo: a sequence logo generator. Genome Res. 2004;14:1188–1190.
55. Liberzon A, et al. The Molecular Signatures Database (MSigDB) hallmark gene set collection. Cell Syst. 2015;1:417–425.
