Title:
ENGINEERED NUCLEIC ACID-TARGETING NUCLEIC ACIDS
Document Type and Number:
WIPO Patent Application WO/2019/173248
Kind Code:
A1
Abstract:
The present disclosure provides engineered Class 2 Type II CRISPR-Cas9-associated discontinuous-helical triplex nucleic acid-targeting nucleic acids, nucleoprotein complexes comprising these nucleic acids and a Cas9 protein, and compositions thereof. Nucleic acid sequences encoding the Class 2 Type II CRISPR-Cas9-associated discontinuous-helical triplex nucleic acid-targeting nucleic acids, as well as expression cassettes, vectors, and cells comprising such nucleic acid sequences, are described. Also, methods are disclosed for making and using the Class 2 Type II CRISPR-Cas9-associated discontinuous-helical triplex nucleic acid-targeting nucleic acids, nucleoprotein complexes comprising such Class 2 Type II CRISPR-Cas9-associated discontinuous-helical triplex nucleic acid-targeting nucleic acids and a Cas9 protein, and compositions thereof.

Inventors:
DONOHOUE PAUL (US)
Application Number:
PCT/US2019/020624
Publication Date:
September 12, 2019
Filing Date:
March 04, 2019
Assignee:
CARIBOU BIOSCIENCES INC (US)
International Classes:
C12N15/11
Domestic Patent References:
WO2017027423A1 2017-02-16
Foreign References:
US20170145425A1 2017-05-25
US9816093B1 2017-11-14
US9650617B2 2017-05-16
US9580701B2 2017-02-28
US9688972B2 2017-06-27
US9771601B2 2017-09-26
US9868962B2 2018-01-16
US9580727B1 2017-02-28
US9745600B2 2017-08-29
US9677090B2 2017-06-13
US9745562B2 2017-08-29
US9816081B1 2017-11-14
US9260752B1 2016-02-16
US6156303A 2000-12-05
Other References:
YAMADA MARI ET AL: "Crystal Structure of the Minimal Cas9 from Campylobacter jejuni Reveals the Molecular Diversity in the CRISPR-Cas9 Systems", MOLECULAR CELL, vol. 65, no. 6, 16 March 2017 (2017-03-16), pages 1109, XP029959270, ISSN: 1097-2765, DOI: 10.1016/J.MOLCEL.2017.02.007
BARRANGOU, R. ET AL., SCIENCE, vol. 315, 2007, pages 1709 - 1712
MAKAROVA, K. S. ET AL., NATURE REVIEWS MICROBIOLOGY, vol. 9, 2011, pages 467 - 477
GARNEAU, J. E. ET AL., NATURE, vol. 468, 2010, pages 67 - 71
SAPRANAUSKAS, R. ET AL., NUCLEIC ACIDS RESEARCH, vol. 39, 2011, pages 9275 - 9282
KOONIN, E.V. ET AL., CURR. OPIN. MICROBIOL., vol. 37, 2017, pages 67 - 78
SHMAKOV, S. ET AL., NAT. REV. MICROBIOL., vol. 15, no. 3, 2017, pages 169 - 182
MAKAROVA, K. S. ET AL., NAT. REV. MICROBIOL., vol. 13, 2015, pages 722 - 736
SHMAKOV, S. ET AL., NAT. REV. MICROBIOL., vol. 15, 2017, pages 169 - 182
ABUDAYYEH, O. O. ET AL., SCIENCE, vol. 353, 2016, pages 1 - 17
KOONIN, E. V. ET AL., CURR OPIN MICROBIOL., vol. 37, 2017, pages 67 - 78
ZETSCHE, B., CELL, vol. 163, 2015, pages 1 - 13
FONFARA, I. ET AL., NATURE, vol. 532, no. 7600, 2016, pages 517 - 521
SWARTS, D. C. ET AL., MOL. CELL, vol. 66, 2017, pages 221 - 233
YAMANO, T., CELL, vol. 165, no. 4, 2016, pages 949 - 962
SHMAKOV, S. ET AL., MOLECULAR CELL, vol. 60, no. 3, 2015, pages 385 - 397
ABUDAYYEH, O., SCIENCE, vol. 353, no. 6299, 2016, pages aaf5573
EAST-SELETSKY, A. ET AL., NATURE, vol. 538, no. 7624, 2016, pages 270 - 273
FONFARA, I., NUCLEIC ACIDS RESEARCH, vol. 42, no. 4, 2014, pages 2577 - 2590
CHYLINSKI K., NUCLEIC ACIDS RESEARCH, vol. 42, no. 10, 2014, pages 6091 - 6105
JINEK, M., SCIENCE, vol. 337, 2012, pages 816 - 821
RAN, F.A. ET AL., NATURE, vol. 520, no. 7546, 2015, pages 186 - 191
FONFARA, I. ET AL., NUCLEIC ACIDS RESEARCH, vol. 42, no. 4, 2014, pages 2577 - 2590
ZUKER, M., NUCLEIC ACIDS RESEARCH, vol. 31, 2003, pages 3406 - 3415
BERNHART, S.H. ET AL., ALGORITHMS FOR MOLECULAR BIOLOGY, vol. 1, no. 1, 2006, pages 3
HOFACKER, I.L. ET AL., JOURNAL OF MOLECULAR BIOLOGY, vol. 319, 2002, pages 1059 - 1066
DARTY, K. ET AL., BIOINFORMATICS, vol. 25, 2009, pages 1974 - 1975
FAGERLUND, R., GENOME BIOLOGY, vol. 16, 2015, pages 251
JINEK M., SCIENCE, vol. 337, 2012, pages 816 - 821
JINEK M. ET AL., ELIFE, vol. 2, 2013, pages e00471
MAKAROVA, K. S. ET AL., CELL, vol. 168, 2017, pages 946
A. K. ABBAS ET AL.: "Cellular and Molecular Immunology", 2017, ELSEVIER
L. H. BUTTERFIELD ET AL.: "Cancer Immunotherapy Principles and Practice", 2017, DEMOS MEDICAL
KENNETH MURPHY: "Janeway's Immunobiology", 2016, GARLAND SCIENCE
C. DORRESTEYN STEVENS ET AL.: "Clinical Immunology and Serology: A Laboratory Perspective", 2016, F.A. DAVIS COMPANY
E.A. GREENFIELD: "Antibodies: A Laboratory Manual", 2014, COLD SPRING HARBOR LABORATORY PRESS
R.I. FRESHNEY: "Culture of Animal Cells: A Manual of Basic Technique and Specialized Applications", 2016, WILEY-BLACKWELL
"Transgenic Animal Technology", 2014, A LABORATORY HANDBOOK, C.A. PINKERT, ELSEVIER
H. HEDRICH: "The Laboratory Mouse", 2012, ACADEMIC PRESS
R. BEHRINGER ET AL.: "Manipulating the Mouse Embryo: A Laboratory Manual", 2013, COLD SPRING HARBOR LABORATORY PRESS
M.J. MCPHERSON ET AL.: "PCR 2: A Practical Approach", 1995, IRL PRESS
J.M. WALKER: "Methods in Molecular Biology (Series", HUMANA PRESS
D.C. RIO ET AL.: "RNA: A Laboratory Manual", 2010, COLD SPRING HARBOR LABORATORY PRESS
"Methods in Enzymology (Series", ACADEMIC PRESS
M.R. GREEN ET AL.: "Molecular Cloning: A Laboratory Manual", 2012, COLD SPRING HARBOR LABORATORY PRESS
G.T. HERMANSON: "Bioconjugate Techniques", 2013, ACADEMIC PRESS
W.V. DASHEK: "Methods in Plant Biochemistry and Molecular Biology", 1997, CRC PRESS
V.M. LOYOLA-VARGAS ET AL.: "Plant Cell Culture Protocols (Methods in Molecular Biology", 2012, HUMANA PRESS
C.N. STEWART ET AL.: "Plant Transformation Technologies", 2011, WILEY-BLACKWELL
C. CUNNINGHAM ET AL.: "Recombinant Proteins from Plants (Methods in Biotechnology", 2010, HUMANA PRESS
W. BUSCH: "Plant Genomics: Methods and Protocols (Methods in Molecular Biology", 2017, HUMANA PRESS
R. KESHAVACHANDRAN ET AL.: "Plant Biotechnology: Methods in Tissue Culture and Gene Transfer", 2008, ORIENT BLACKSWAN
BARRANGOU, R ET AL., SCIENCE, vol. 315, 2007, pages 1709 - 1712
KIM, E. ET AL., NAT. COMMUN., vol. 8, 2017, pages 14500
YAMADA, M. ET AL., MOL. CELL, vol. 65, no. 6, 2017, pages 1109 - 1121
BRINER, A. ET AL., MOL. CELL, vol. 56, no. 2, 2014, pages 333 - 339
NOWAK, C. ET AL., NUCLEIC ACIDS RES, vol. 44, no. 20, 2016, pages 9555 - 9564
WRIGHT, A. ET AL., PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, vol. 112, no. 10, 2015, pages 2984 - 2989
R. F. GESTELAND: "The RNA World", 2005, COLD SPRING HARBOR LABORATORY PRESS
R. F. GESTELAND: "The RNA World", 1999, COLD SPRING HARBOR LABORATORY PRESS
R. F. GESTELAND: "The RNA World (Cold Spring Harbor Monograph Series", 1993, COLD SPRING HARBOR LABORATORY PRESS
I. TINOCO: "Appendix 1: Structures of Base Pairs Involving at Least Two Hydrogen Bonds"
W. SAENGER: "Principles of Nucleic Acid Structure", 1988, SPRINGER INTERNATIONAL PUBLISHING AG
S. NEIDLE: "Principles of Nucleic Acid Structure", 2007, ACADEMIC PRESS
MANIATIS, T. ET AL.: "Molecular Cloning: A Laboratory Manual", 1982, COLD SPRING HARBOR LABORATORY PRESS
CASEY, J. ET AL., NUCLEIC ACIDS RESEARCH, vol. 4, 1977, pages 1539 - 1552
BODKIN, D.K., JOURNAL OF VIROLOGICAL METHODS, vol. 10, no. 1, 1985, pages 45 - 52
WALLACE, R.B., NUCLEIC ACIDS RESEARCH, vol. 9, no. 4, 1981, pages 879 - 894
RAN, F., NATURE PROTOCOLS, vol. 8, no. 11, 2013, pages 2281 - 2308
SMITHIES, O. ET AL., NATURE, vol. 317, 1985, pages 230 - 234
THOMAS, K. ET AL., CELL, vol. 44, 1986, pages 419 - 428
WU, S. ET AL., NATURE PROTOCOLS, vol. 3, 2008, pages 1056 - 1076
SINGER, B. ET AL., CELL, vol. 31, 1982, pages 25 - 33
SHEN, P. ET AL., GENETICS, vol. 112, 1986, pages 441 - 457
WATT, V., PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, vol. 82, 1985, pages 4768 - 4772
SUGAWARA, N., JOURNAL OF MOLECULAR CELL BIOLOGY, vol. 12, no. 2, 1992, pages 563 - 575
RUBNITZ, J. ET AL., JOURNAL OF MOLECULAR CELL BIOLOGY, vol. 4, no. 11, 1984, pages 2253 - 2258
AYARES, D. ET AL., PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, vol. 83, no. 14, 1986, pages 5199 - 5203
LISKAY, R, GENETICS, vol. 115, no. 1, 1987, pages 161 - 167
SFEIR, A. ET AL., TRENDS IN BIOCHEMICAL SCIENCES, vol. 40, 2015, pages 701 - 714
BOSHART, M. ET AL., CELL, vol. 41, 1985, pages 521 - 530
LUPO, A., CURRENT GENOMICS, vol. 14, no. 4, 2013, pages 268 - 278
MARGOLIN, J., PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, vol. 91, 1994, pages 4509 - 4513
WITZGALL, R., PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, vol. 91, 1994, pages 4514 - 4518
FRIEDMAN J.R. ET AL., GENES & DEVELOPMENT, vol. 10, 1996, pages 2067 - 2078
RAN, F.A., NATURE, vol. 520, no. 7546, 2015, pages 186 - 191
ZUKER, M.: "Mfold web server for nucleic acid sequence folding and hybridization prediction", NUCLEIC ACIDS RESEARCH, vol. 31, 2003, pages 3406 - 3415
GRUBER A.R. ET AL.: "The Vienna RNA Websuite", NUCLEIC ACIDS RESEARCH, vol. 36, no. 2, 2008, pages W70 - W74
LORENZ, R.: "ViennaRNA Package 2.0", ALGORITHMS FOR MOLECULAR BIOLOGY, vol. 16, 2011, pages 26
LOW J.T., METHODS, vol. 52, no. 2, 2010, pages 150 - 158
MCGOOKIN, R., METHODS MOLECULAR BIOLOGY, vol. 2, 1985, pages 93 - 100
GARNER, M. ET AL., NUCLEIC ACIDS RESEARCH, vol. 9, no. 13, 1981, pages 3047 - 3060
FRIED, M., NUCLEIC ACIDS RESEARCH, vol. 9, no. 23, 1981, pages 6505 - 6525
FRIED, M., ELECTROPHORESIS, vol. 10, 1989, pages 366 - 376
GAGNON, K., METHODS MOLECULAR BIOLOGY, vol. 703, 2011, pages 275 - 291
FILLEBEEN, C. ET AL., JOURNAL OF VISUALIZED EXPERIMENTS, vol. 3, no. 94, 2014
VALTON, J., JOURNAL OF BIOLOGICAL CHEMISTRY, vol. 287, no. 46, 2012, pages 38427 - 38432
FONT, J., METHODS MOLECULAR BIOLOGY, vol. 649, 2010, pages 479 - 491
ISALAN, M., NATURE BIOTECHNOLOGY, vol. 19, no. 7, 2001, pages 656 - 660
SHAN S. WONG; DAVID M. JAMESON: "Methods of Chemistry of Protein and Nucleic Acid Cross-Linking and Conjugation", 2011, CRC PRESS
GREG T. HERMANSON: "Bioconjugate Techniques", 2013, ACADEMIC PRESS
"Chemistry of Bioconjugates - Synthesis, Characterization, and Biomedical Applications", 2014, WILEY
"Bioconjugation Protocols - Strategies and Methods (Series: Methods in Molecular Biology (Book 751", 2011, HUMANA PRESS
"Crosslinking Technical Handbook", 2009, THERMO FISHER SCIENTIFIC
C. COSTAS ET AL., NUCLEIC ACIDS RESEARCH, vol. 28, no. 9, 2000, pages 1849 - 1858
GAUR R.K., METHODS MOLECULAR BIOLOGY, vol. 488, 2008, pages 167 - 180
LIVNAH, O., ET AL., PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, vol. 90, no. 11, 1993, pages 5076 - 5080
AIRENNE, K.J., BIOMOLECULAR ENGINEERING, vol. 16, no. 1-4, 1999, pages 87 - 92
HIRAO, I., NATURE METHODS, vol. 3, no. 9, 2006, pages 729 - 735
ZUO, J., PLANT JOURNAL, vol. 24, no. 2, 2000, pages 265 - 273
ZETSCHE, B., NATURE BIOTECHNOLOGY, vol. 33, 2015, pages 139 - 142
CHIU M.I., PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, vol. 91, no. 26, 1994, pages 12574 - 12578
NAVANI, N.K., BIOSENSORS AND BIODETECTION (METHODS IN MOLECULAR BIOLOGY, vol. 504, 2009, pages 399 - 415
A. V. KULBACHINSKIY, BIOCHEMISTRY (MOSCOW, vol. 72, no. 13, 2007, pages 1505 - 1518
LEE, L.Y., PLANT PHYSIOLOGY, vol. 146, no. 2, 2008, pages 325 - 332
DAYA, S. ET AL.: "Gene Therapy Using Adeno-Associated Virus Vectors", CLINICAL MICROBIOLOGY REVIEWS, vol. 21, no. 4, 2008, pages 583 - 593, XP055087083, DOI: doi:10.1128/CMR.00008-08
COORAY, S. ET AL., METHODS IN ENZYMOLOGY, vol. 507, 2012, pages 29 - 57
NARUSAKA, Y., TRANSGENIC PLANTS - ADVANCES AND LIMITATIONS, 2012
CHO, A. ET AL.: "Generation of Transgenic Mice", CURRENT PROTOCOLS IN CELL BIOLOGY, 2009
DUGAR, G. ET AL., MOLECULAR CELL, vol. 69, no. 5, 2018, pages 893 - 905
HELLMAN L.M., NATURE PROTOCOLS, vol. 2, no. 8, 2007, pages 1849 - 1861
CHYLINSKI, K. ET AL., RNA BIOLOGY, vol. 10, no. 5, 2013, pages 726 - 737
BRINER, A. ET AL., MOLECULAR CELL, vol. 56, no. 2, 2014, pages 333 - 339
Attorney, Agent or Firm:
MCCLUNG, Barbara et al. (US)
Claims:

1. A Class 2 Type II CRISPR-Cas9-associated discontinuous-helical triplex nucleic acid-targeting nucleic acid (dht-NATNA) composition, comprising:

a first Class 2 Type II CRISPR-Cas9-associated discontinuous-helical triplex single-stranded polynucleotide (dht-casPN1) comprising a spacer sequence, a second Class 2 Type II CRISPR-Cas9-associated discontinuous-helical single-stranded polynucleotide (dht-casPN2) comprising a nexus, and a third Class 2 Type II CRISPR-Cas9-associated discontinuous-helical triplex single-stranded polynucleotide (dht-casPN3), each having a 5' end and 3' end, wherein each of the dht-casPN1, the dht-casPN2, and the dht-casPN3 is a continuous series of covalently connected nucleotides, and the 3' end of the dht-casPN2 and the 5' end of the dht-casPN3 are located 3' of the nexus; and

wherein the dht-casPN1 connects with the dht-casPN2 through hydrogen base-pair bonds, and the dht-casPN2 connects with the dht-casPN3 through hydrogen base-pair bonds to form the dht-NATNA; the dht-NATNA is capable of forming a nucleoprotein complex with a Cas9 protein; and the nucleoprotein complex is capable of binding or binding/cleaving a target nucleic acid sequence complementary to a target nucleic acid binding sequence of the dht-NATNA.

2. The dht-NATNA composition of claim 1,

wherein the dht-casPN1 further comprises:

a first segment comprising a spacer sequence and a first stem repeat nucleotide sequence I, and

wherein the dht-casPN2 and the dht-casPN3 together further comprise:

a second segment comprising a first stem repeat nucleotide sequence II, a joining nucleotide sequence, a nexus stem nucleotide sequence I, and a first linker element nucleotide sequence I,

a third segment comprising first linker nucleotide sequence II and a nexus stem nucleotide sequence II,

a second connecting nucleotide sequence covalently connecting the 3' end of the second segment to the 5' end of the third segment,

a fourth segment comprising a third stem nucleotide sequence I,

a third connecting nucleotide sequence covalently connecting the 3' end of the third segment to the 5' end of the fourth segment,

a fifth segment comprising a third stem nucleotide sequence II,

a fourth connecting nucleotide sequence covalently connecting the 3' end of the fourth segment to the 5' end of the fifth segment,

a sixth segment comprising a helical-triplex forming nucleotide sequence, and

a fifth connecting nucleotide sequence covalently connecting the 3' end of the fifth segment to the 5' end of the sixth segment, and

the 3' end of the dht-casPN2 and the 5' end of the dht-casPN3, are between the 5' end of the fourth segment and the 3' end of the sixth segment;

wherein the first stem repeat nucleotide sequence I is connected through hydrogen base-pair bonds to the first stem repeat nucleotide sequence II and forms a first stem, the nexus stem nucleotide sequence I is connected through hydrogen base-pair bonds to the nexus stem nucleotide sequence II and forms a nexus, the third stem nucleotide sequence I is connected through hydrogen base-pair bonds to the third stem nucleotide sequence II and forms a 3' stem loop, and the triplex-forming nucleotide sequence connects with the nexus through hydrogen base-pair bonding and forms a helical triplex; and

wherein the dht-NATNA is capable of forming a nucleoprotein complex with a Cas9 protein, and the nucleoprotein complex is capable of binding or binding/cleaving a target nucleic acid sequence complementary to a target nucleic acid binding sequence of the dht-NATNA.

3. The dht-NATNA composition of claim 2, wherein the dht-casPN2 comprises at least the second segment and the third segment; and the dht-casPN3 comprises at least the fourth segment, the fifth segment, and the sixth segment.

4. The dht-NATNA composition of claim 2, wherein the dht-casPN2 comprises at least the second segment, the third segment, and the fourth segment; and the dht-casPN3 comprises at least the fifth segment, and the sixth segment.

5. The dht-NATNA composition of claim 2, wherein the dht-casPN2 comprises at least the second segment, the third segment, and a first portion of the fourth segment; and the dht-casPN3 comprises at least a second portion of the fourth segment, the fifth segment, and the sixth segment.

6. A Class 2 Type II CRISPR-Cas9-associated discontinuous-helical triplex nucleic acid-targeting nucleic acid (dht-NATNA) composition, comprising:

a first Class 2 Type II CRISPR-Cas9-associated discontinuous-helical triplex single-stranded polynucleotide (dht-casPN1) comprising a spacer and a nexus, and a second Class 2 Type II CRISPR-Cas9-associated discontinuous-helical triplex single-stranded polynucleotide (dht-casPN2), each having a 5' end and 3' end, wherein the dht-casPN1 and the dht-casPN2 are each a continuous series of covalently connected nucleotides, and the 3' end of the dht-casPN1 and the 5' end of the dht-casPN2 are located 3' of the nexus; and

wherein the dht-casPN1 connects with the dht-casPN2 through hydrogen base-pair bonds to form the dht-NATNA, the dht-NATNA is capable of forming a nucleoprotein complex with a Cas9 protein, and the nucleoprotein complex is capable of binding or binding/cleaving a target nucleic acid sequence complementary to a target nucleic acid binding sequence of the dht-NATNA.

7. The dht-NATNA composition of claim 6,

wherein the dht-casPN1 and dht-casPN2 together further comprise:

a first segment comprising a spacer sequence and a first stem repeat nucleotide sequence I,

a second segment comprising a first stem repeat nucleotide sequence II, a joining nucleotide sequence, a nexus stem nucleotide sequence I, and a first linker element nucleotide sequence I,

a first connecting nucleotide sequence covalently connecting the 3' end of the first segment to the 5' end of the second segment,

a third segment comprising first linker nucleotide sequence II and a nexus stem nucleotide sequence II,

a second connecting nucleotide sequence covalently connecting the 3' end of the second segment to the 5' end of the third segment,

a fourth segment comprising a third stem nucleotide sequence I,

a third connecting nucleotide sequence covalently connecting the 3' end of the third segment to the 5' end of the fourth segment,

a fifth segment comprising a third stem nucleotide sequence II,

a fourth connecting nucleotide sequence covalently connecting the 3' end of the fourth segment to the 5' end of the fifth segment,

a sixth segment comprising a helical-triplex forming nucleotide sequence, and

a fifth connecting nucleotide sequence covalently connecting the 3' end of the fifth segment to the 5' end of the sixth segment, and

the 3' end of the dht-casPN1 and the 5' end of the dht-casPN2, are between the 5' end of the fourth segment and the 3' end of the sixth segment;

wherein the first stem repeat nucleotide sequence I is connected through hydrogen base-pair bonds to the first stem repeat nucleotide sequence II and forms a first stem, the nexus stem nucleotide sequence I is connected through hydrogen base-pair bonds to the nexus stem nucleotide sequence II and forms a nexus, the third stem nucleotide sequence I is connected through hydrogen base-pair bonds to the third stem nucleotide sequence II and forms a 3' stem loop, and the triplex-forming nucleotide sequence connects with the nexus through hydrogen base-pair bonding and forms a helical triplex; and

wherein the dht-NATNA is capable of forming a nucleoprotein complex with a Cas9 protein, and the nucleoprotein complex is capable of binding or binding/cleaving a target nucleic acid sequence complementary to a target nucleic acid binding sequence of the dht-NATNA.

8. The dht-NATNA composition of claim 7, wherein the dht-casPN1 comprises at least the first segment, the second segment and the third segment; and the dht-casPN2 comprises at least the fourth segment, the fifth segment, and the sixth segment.

9. The dht-NATNA composition of claim 7, wherein the dht-casPN1 comprises at least the first segment, the second segment, the third segment, and the fourth segment; and the dht-casPN2 comprises at least the fifth segment, and the sixth segment.

10. The dht-NATNA composition of claim 7, wherein the dht-casPN1 comprises at least the first segment, the second segment, the third segment, and a first portion of the fourth segment; and the dht-casPN2 comprises at least a second portion of the fourth segment, the fifth segment, and the sixth segment.

11. A Class 2 Type II CRISPR-Cas9-associated discontinuous-helical triplex nucleic acid-targeting nucleic acid (dht-NATNA) composition, comprising:

a first Class 2 Type II CRISPR-Cas9-associated discontinuous-helical triplex single-stranded polynucleotide (dht-casPN1) comprising,

a first segment comprising a spacer sequence and a first stem repeat nucleotide sequence I,

a second segment comprising a first stem repeat nucleotide sequence II, a joining nucleotide sequence, a nexus stem nucleotide sequence I, and a first linker element nucleotide sequence I,

a first connecting nucleotide sequence covalently connecting the 3' end of the first segment to the 5' end of the second segment,

a third segment comprising first linker nucleotide sequence II and a nexus stem nucleotide sequence II,

a second connecting nucleotide sequence covalently connecting the 3' end of the second segment to the 5' end of the third segment,

a fourth segment comprising a third stem nucleotide sequence I, and

a second Class 2 Type II CRISPR-Cas9-associated discontinuous-helical triplex single-stranded polynucleotide (dht-casPN2) comprising,

a fifth segment comprising a third stem nucleotide sequence II,

a sixth segment comprising a helical-triplex forming nucleotide sequence, and

a fifth connecting nucleotide sequence covalently connecting the 3' end of the fifth segment to the 5' end of the sixth segment;

wherein the first stem repeat nucleotide sequence I is connected through hydrogen base-pair bonds to the first stem repeat nucleotide sequence II and forms a first stem, the nexus stem nucleotide sequence I is connected through hydrogen base-pair bonds to the nexus stem nucleotide sequence II and forms a nexus, the third stem nucleotide sequence I is connected through hydrogen base-pair bonds to the third stem nucleotide sequence II and forms a 3' stem loop, and the triplex-forming nucleotide sequence connects with the nexus through hydrogen base-pair bonding and forms a helical triplex; and

wherein the dht-NATNA is capable of forming a nucleoprotein complex with a Cas9 protein, and the nucleoprotein complex is capable of binding or binding/cleaving a target nucleic acid sequence complementary to a target nucleic acid binding sequence of the dht-NATNA.

12. The dht-NATNA composition of claim 11, wherein the third stem nucleotide sequence I further comprises an additional 3' nucleic acid sequence and the third stem nucleotide sequence II further comprises an additional 5' nucleic acid sequence, and wherein the additional 3' nucleic acid sequence is connected through hydrogen base-pair bonds to the additional 5' nucleic acid sequence.

13. The dht-NATNA composition of any one of claims 1 to 5, wherein dht-casPN1, dht-casPN2, or dht-casPN3 comprise DNA, RNA, or DNA and RNA.

14. The dht-NATNA composition of any one of claims 1 to 5, wherein at least two of dht-casPN1, dht-casPN2, and dht-casPN3 comprise DNA, RNA, or DNA and RNA.

15. The dht-NATNA composition of any one of claims 1 to 5, wherein dht-casPN1, dht-casPN2, and dht-casPN3 comprise DNA, RNA, or DNA and RNA.

16. The dht-NATNA composition of any one of claims 6 to 12, wherein dht-casPN1, dht-casPN2, or dht-casPN1 and dht-casPN2, comprise DNA, RNA, or DNA and RNA.

17. A cell, comprising:

the dht-NATNA composition of any preceding claim.

18. A nucleoprotein composition, comprising:

the dht-NATNA composition of any one of claims 1 to 16; and

a Cas9 protein.

19. The nucleoprotein composition of claim 18, wherein the dht-NATNA composition is in a complex with the Cas9 protein.

20. The nucleoprotein composition of claim 19, wherein the Cas9 protein is enzymatically inactive.

21. A cell comprising the nucleoprotein composition of any one of claims 18 to 20.

22. One or more nucleic acid sequences encoding one or more of dht-casPN1, dht-casPN2, and dht-casPN3 of the dht-NATNA composition of any one of claims 1 to 5.

23. One or more nucleic acid sequences encoding one or more of dht-casPN1 and dht-casPN2 of the dht-NATNA composition of any one of claims 6 to 12.

24. An expression cassette comprising the one or more nucleic acid sequences of claim 22 or 23.

25. A vector comprising the expression cassette of claim 24.

26. A method of binding a nucleic acid sequence comprising:

providing the nucleoprotein composition of any one of claims 18 to 20 for introduction into a cell or biochemical reaction; and

introducing the nucleoprotein composition into the cell or the biochemical reaction, thereby facilitating contact of a target nucleic acid sequence in the nucleic acid sequence with the nucleoprotein composition resulting in binding of the nucleoprotein composition to the target nucleic acid sequence in the nucleic acid sequence.

27. The method of claim 26, wherein genomic DNA comprises the nucleic acid sequence.

28. A method of cutting a nucleic acid sequence comprising:

providing the nucleoprotein complex of any one of claims 18 to 20 for introduction into a cell or biochemical reaction; and

introducing the nucleoprotein complex into the cell or the biochemical reaction, thereby facilitating contact of a nucleic acid target sequence in the nucleic acid sequence with the nucleoprotein complex resulting in binding of the nucleoprotein complex to the nucleic acid target sequence and cutting of the nucleic acid target sequence.

29. The method of claim 28, wherein genomic DNA comprises the nucleic acid sequence.

30. A kit, comprising:

the dht-NATNA composition of any one of claims 1 to 16; and

a buffer.

31. A kit comprising:

the one or more nucleic acid sequences of claim 22 or 23;

and a buffer.

32. The kit of claim 30 or 31, further comprising a Cas9 protein or a nucleotide sequence encoding a Cas9 protein.

Description:
ENGINEERED NUCLEIC ACID-TARGETING NUCLEIC ACIDS

Cross-Reference to Related Applications

[0001] This application claims the benefit of U.S. Provisional Patent Application Serial No. 62/640,004, filed 7 March 2018, now pending, which application is herein incorporated by reference in its entirety.

Statement Regarding Federally Sponsored Research or Development

[0002] Not applicable.

Sequence Listing

[0003] The present application contains a Sequence Listing that has been submitted electronically in ASCII format and is hereby incorporated by reference in its entirety. The ASCII copy, created on 4 March 2019, is named CBI029-30_ST25.txt and is 4 KB in size.

Technical Field

[0004] The present disclosure relates generally to engineered nucleic acid-targeting nucleic acids and nucleoprotein complexes comprising such engineered nucleic acid-targeting nucleic acids and one or more Cas proteins. The disclosure also relates to compositions and methods for making and using the engineered nucleic acid-targeting nucleic acids and nucleoprotein complexes of the present invention.

Background

[0005] The Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) and CRISPR-associated (Cas) protein systems provide adaptive immunity against foreign polynucleotides in bacteria and archaea (see, e.g., Barrangou, R., et al., Science 315:1709-1712 (2007); Makarova, K. S., et al., Nature Reviews Microbiology 9:467-477 (2011); Garneau, J. E., et al., Nature 468:67-71 (2010); Sapranauskas, R., et al., Nucleic Acids Research 39:9275-9282 (2011); Koonin, E.V., et al., Curr. Opin. Microbiol. 37:67-78 (2017) - dx.doi.org/10.1016/j.mib.2017.05.008; Shmakov, S., et al., Nat. Rev. Microbiol. 15(3):169-182 (2017) - dx.doi.org/10.1038/nrmicro.2016.184). Various CRISPR-Cas systems in their native hosts are capable of DNA targeting (Class 1 Type I; Class 2 Type II and Type V), RNA targeting (Class 2 Type VI), and joint DNA and RNA targeting (Class 1 Type III) (Makarova, K. S., et al., Nat. Rev. Microbiol. 13:722-736 (2015); Shmakov, S., et al., Nat. Rev. Microbiol. 15:169-182 (2017); Abudayyeh, O. O., et al., Science 353:1-17 (2016)).

[0006] The classification of CRISPR-Cas systems has had many iterations. Koonin, E. V., et al. (Curr. Opin. Microbiol. 37:67-78 (2017)) proposed a classification system that takes into consideration the signature cas genes specific for individual types and subtypes of CRISPR-Cas systems. The classification also considered sequence similarity between multiple shared Cas proteins, the phylogeny of the best conserved Cas protein, gene organization, and the structure of the CRISPR array. This approach provided a classification scheme that divides CRISPR-Cas systems into two distinct classes: Class 1 comprising a multiprotein effector complex (Type I (Cascade effector complex), Type III (Cmr/Csm effector complex), and Type IV), and Class 2 comprising a single effector protein (Type II (Cas9), Type V (Cas12a, previously referred to as Cpf1), and Type VI (Cas13, previously referred to as C2c2)). In the Class 1 systems, Type I is the most common and diverse, Type III is more common in archaea than bacteria, and Type IV is least common.
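The two-class scheme described above can be sketched as a small lookup table. This is a minimal illustration in Python, assuming only the classes, types, and effectors named in the preceding paragraph; the names CRISPR_CLASSES and class_of_type are hypothetical and do not come from the disclosure.

```python
# Hypothetical lookup table: class -> type -> effector named in the text.
CRISPR_CLASSES = {
    "Class 1": {  # multiprotein effector complexes
        "Type I": "Cascade effector complex",
        "Type III": "Cmr/Csm effector complex",
        "Type IV": None,  # the text names no effector complex for Type IV
    },
    "Class 2": {  # single effector protein
        "Type II": "Cas9",
        "Type V": "Cas12a (Cpf1)",
        "Type VI": "Cas13 (C2c2)",
    },
}


def class_of_type(crispr_type):
    """Return the class ("Class 1" or "Class 2") that contains the given type."""
    for class_name, types in CRISPR_CLASSES.items():
        if crispr_type in types:
            return class_name
    raise KeyError(f"unknown CRISPR-Cas type: {crispr_type}")
```

For instance, class_of_type("Type II") returns "Class 2", which matches the Class 2 Type II designation used throughout the claims.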

[0007] Class 2 systems encode a single-subunit protein (e.g., Cas9, Cas12a (Cpf1), Cas12b (C2c1), C2c4, C2c5, Cas13a (C2c2), Cas13b (C2c6), Cas13c (C2c7) proteins) that complexes with guide RNA to form an effector complex.

[0008] Type II systems comprise cas1, cas2, cas9, cas4 and csn2 genes. The cas9 gene encodes a single, multi-domain nuclease that combines with a guide RNA to bind and cleave a DNA target. Type II systems are further divided into three subtypes, subtypes II-A, II-B, and II-C. Subtype II-A contains an additional gene, csn2. Examples of organisms with a subtype II-A system include, but are not limited to, Streptococcus pyogenes, Streptococcus thermophilus, and Staphylococcus aureus. Subtype II-B lacks the csn2 gene but contains the cas4 gene. An example of an organism with a subtype II-B system is Legionella pneumophila. Subtype II-C is the most common Type II subtype found in bacteria and has only three genes, cas1, cas2, and cas9. An example of an organism with a subtype II-C system is Neisseria lactamica.
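The gene-content rules in the preceding paragraph can be expressed as a short decision function. The sketch below is a simplification in Python that assumes a locus is described only by a set of gene names; the function name type_ii_subtype is hypothetical, and actual subtype assignment also draws on phylogeny and locus organization, which this sketch ignores.

```python
def type_ii_subtype(genes):
    """Assign a Type II subtype from gene names, following the text's rules.

    II-A: csn2 present; II-B: csn2 absent, cas4 present;
    II-C: neither csn2 nor cas4 (only cas1, cas2, and cas9).
    """
    present = {g.lower() for g in genes}
    if "cas9" not in present:
        raise ValueError("not a Type II locus: cas9 is missing")
    if "csn2" in present:
        return "II-A"
    if "cas4" in present:
        return "II-B"
    return "II-C"
```

Checking csn2 before cas4 mirrors the order in which the paragraph defines the subtypes.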

[0009] Type V systems have cas12a, cas1, and cas2 genes (see Zetsche, B., et al., Cell 163:1-13 (2015); Koonin, E.V., et al., Curr. Opin. Microbiol. 37:67-78 (2017) - dx.doi.org/10.1016/j.mib.2017.05.008). The cas12a gene encodes a protein, Cas12a, that has a RuvC-like nuclease domain that is homologous to the respective domain of Cas9 proteins, but lacks the HNH nuclease domain that is present in Cas9 proteins. Type V systems have been identified in several bacteria including, but not limited to, Parcubacteria bacterium, Lachnospiraceae bacterium, Butyrivibrio proteoclasticus, Peregrinibacteria bacterium, Acidaminococcus spp., Porphyromonas macacae, Porphyromonas crevioricanis, Prevotella disiens, Moraxella bovoculi, Smithella spp., Leptospira inadai, Francisella tularensis, Francisella novicida, Candidatus Methanoplasma termitum, and Eubacterium eligens. Recently it has been demonstrated that Cas12a protein also has RNase activity and is responsible for pre-crRNA processing (see Fonfara, I., et al., Nature 532(7600):517-521 (2016)).

[0010] In Class 2 systems, the crRNA is associated with a single protein, and this protein-RNA complex achieves interference by combining nuclease activity with RNA-binding domains and base-pair formation between the crRNA and a nucleic acid target sequence.
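The base-pair formation between a crRNA and a nucleic acid target sequence can be illustrated with a short complementarity check. This is a minimal sketch in Python, assuming canonical Watson-Crick pairing only, an RNA spacer, and a DNA target strand, both written 5' to 3'; the function names and the example sequences are hypothetical and are not taken from the Sequence Listing.

```python
# Canonical pairing of an RNA base with its DNA partner (assumption: no wobble pairs).
RNA_TO_DNA_PAIR = {"A": "T", "U": "A", "G": "C", "C": "G"}


def target_strand_for(spacer_rna):
    """Return the DNA strand (5'->3') fully complementary to an RNA spacer (5'->3')."""
    # Strands pair antiparallel, so the spacer is read in reverse.
    return "".join(RNA_TO_DNA_PAIR[base] for base in reversed(spacer_rna.upper()))


def spacer_pairs_with(spacer_rna, dna_strand):
    """True if the DNA strand is the exact complement of the spacer."""
    return target_strand_for(spacer_rna) == dna_strand.upper()
```

For example, spacer_pairs_with("AAGU", "ACTT") is True, because reading the spacer in reverse (U, G, A, A) pairs with A, C, T, T.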

[0011] In Type II systems, nucleic acid target sequence binding and cleavage involve a Cas9 protein, a crRNA, and a trans-activating CRISPR RNA (tracrRNA). In Type II systems, the RuvC-like nuclease (RNase H fold) domain and the HNH (McrA-like) nuclease domain of the Cas9 protein each cleave one of the strands of the double-stranded nucleic acid target sequence. The Cas9 protein cleavage activity of Type II systems also requires hybridization of the crRNA to the tracrRNA to form a duplex that facilitates the crRNA and nucleic acid target sequence binding by the Cas9 protein.

[0012] In Type V systems, nucleic acid target sequence binding and cleavage involve a Cas12a protein and a crRNA. In Type V systems, the RuvC-like nuclease domain of the Cas12a protein cleaves both strands of the nucleic acid target sequence in a sequential fashion (Swarts, D.C., et al., Mol. Cell 66:221-233.e4 (2017)), producing 5’ overhangs, which contrasts with the blunt ends generated by Cas9 protein cleavage.

[0013] The Cas12a protein cleavage activity of Type V systems does not require hybridization of a crRNA to a tracrRNA to form a duplex; rather, Type V systems use a single crRNA that has a stem-loop structure forming an internal duplex. The Cas12a protein binds the crRNA in a sequence- and structure-specific manner by recognizing the stem loop and sequences adjacent to the stem loop, most notably nucleotides 5’ of the spacer sequence, which hybridizes to the nucleic acid target sequence. This stem-loop structure is typically in the range of 15 to 19 nucleotides in length. Substitutions that disrupt this stem-loop duplex abolish cleavage activity, whereas other substitutions that do not disrupt the stem-loop duplex do not abolish cleavage activity. Nucleotides 5’ of the stem loop adopt a pseudo-knot structure further stabilizing the stem-loop structure with non-canonical Watson-Crick base pairing, triplex interaction, and reverse Hoogsteen base pairing (see Yamano, T., et al., Cell 165(4):949-962 (2016)).

[0014] Other proteins associated with Type V crRNA and nucleic acid target sequence binding and cleavage include Cas12b (also known as Class 2 candidate 1 protein, or C2c1) and Cas12c (also known as Class 2 candidate 3 protein, or C2c3). Cas12b and Cas12c proteins are similar in length to Cas9 and Cas12a proteins, ranging from approximately 1,100 amino acids to approximately 1,500 amino acids. Cas12b and Cas12c proteins also contain RuvC-like nuclease domains and have an architecture similar to Cas12a protein. Cas12b proteins are similar to Cas9 proteins in requiring a crRNA and a tracrRNA for nucleic acid target sequence binding and cleavage but have an optimal cleavage temperature of 50°C. Cas12b proteins target an AT-rich protospacer adjacent motif (PAM), similar to the PAM of Cas12a protein, which is 5’ of the nucleic acid target sequence (see, e.g., Shmakov, S., et al., Molecular Cell 60(3):385-397 (2015)).

[0015] The Cas13a protein (also known as Class 2 candidate 2 protein, or C2c2) does not share sequence similarity with other CRISPR effector proteins and was recently identified as a Type VI system (see Abudayyeh, O., et al., Science 353(6299):aaf5573 (2016)). Cas13a proteins have two HEPN domains and possess single-stranded RNA cleavage activity. Cas13a proteins are similar to Cas12a proteins in requiring a crRNA for nucleic acid target sequence binding and cleavage, and in not requiring tracrRNA. Also, similar to Cas12a protein, the crRNA for Cas13a proteins forms a stable hairpin, or stem-loop structure, that aids in association with the Cas13a protein. Type VI systems have a single polypeptide RNA endonuclease that utilizes a single crRNA to direct RNA cleavage in a target-dependent fashion. Additionally, after hybridizing to the target RNA complementary to the spacer, Cas13a protein becomes a promiscuous RNA endonuclease exhibiting non-specific endonuclease activity toward any single-stranded RNA in a sequence-independent manner (see East-Seletsky, A., et al., Nature 538(7624):270-273 (2016)).

[0016] Regarding Class 2 Type II CRISPR-Cas systems, many Cas9 protein orthologs are known in the art as well as their associated polynucleotide components (tracrRNA and crRNA) (see, e.g., Fonfara, I., et al., Nucleic Acids Research 42(4):2577-2590 (2014), including all Supplemental Data; Chylinski, K., et al., Nucleic Acids Research 42(10):6091-6105 (2014), including all Supplemental Data). In addition, Cas9-like synthetic proteins are known in the art (see U.S. Published Patent Application No. 2014-0315985, published 23 October 2014).

[0017] Cas9 is an exemplary Type II CRISPR Cas protein. Cas9 is an endonuclease that can be programmed by tracrRNA/crRNA to cleave, in a site-specific manner, a target DNA sequence using two distinct endonuclease domains (HNH and RuvC/RNase H-like domains) (see, e.g., U.S. Published Patent Application No. 2014-0068797, published 6 March 2014; see also Jinek, M., et al., Science 337:816-821 (2012)).

[0018] Typically, each wild-type CRISPR-Cas9 system includes a crRNA and a tracrRNA. The crRNA has a region of complementarity to a potential target DNA sequence and a second region that forms base-pair hydrogen bonds with the tracrRNA to form a secondary structure, typically to form at least one stem structure. The region of complementarity to the target DNA sequence is the spacer. The tracrRNA and the crRNA interact through a number of base-pair hydrogen bonds to form secondary RNA structures. Complex formation between tracrRNA/crRNA and a Cas9 protein results in conformational change of the Cas9 protein that facilitates binding to DNA, endonuclease activities of the Cas9 protein, and crRNA-guided site-specific DNA cleavage by the Cas9 endonuclease. For a Cas9 protein/tracrRNA/crRNA complex to cleave a double-stranded target DNA sequence, the target DNA sequence is adjacent to a cognate PAM. By engineering a crRNA to have an appropriate spacer sequence, the complex can be targeted to cleave at a locus of interest, e.g., a locus at which sequence modification is desired.

[0019] A variety of Type II CRISPR-Cas system crRNA and tracrRNA sequences, as well as predicted secondary structures, are known in the art (see, e.g., Ran, F.A., et al., Nature 520(7546):186-191 (2015), including all Supplemental Data, in particular Extended Data Figure 1; Fonfara, I., et al., Nucleic Acids Research 42(4):2577-2590 (2014), including all Supplemental Data, in particular Supplemental Figure S11). Predicted tracrRNA secondary structures were based on the Constraint Generation RNA folding model (Zuker, M., Nucleic Acids Research 31:3406-3415 (2003)). RNA duplex secondary structures were predicted using RNAcofold of the Vienna RNA package (Bernhart, S.H., et al., Algorithms for Molecular Biology 1(1):3 (2006); Hofacker, I.L., et al., Journal of Molecular Biology 319:1059-1066 (2002)) and RNAhybrid (bibiserv.techfak.uni-bielefeld.de/rnahybrid/). The structure predictions were visualized using VARNA (Darty, K., et al., Bioinformatics 25:1974-1975 (2009)). Fonfara, I., et al., show that the crRNA/tracrRNA complex for Campylobacter jejuni does not have the bulge region; however, the complex retains a stem structure located 3’ of the spacer that is followed in the 3’ direction with another stem structure.

[0020] The spacer of Class 2 CRISPR-Cas systems can hybridize to a nucleic acid target sequence that is located 5’ or 3’ of a PAM, depending upon the Cas protein to be used. A PAM can vary depending upon the Cas protein to be used. For example, if a Cas9 protein from S. pyogenes is used, the PAM can be a sequence in the nucleic acid target sequence that comprises the sequence 5’-NRR-3’, wherein R can be either A or G, N is any nucleotide, and N is immediately 3’ of the nucleic acid target sequence targeted by the nucleic acid target binding sequence. A Cas protein may be modified such that a PAM may be different compared with a PAM for an unmodified Cas protein. For example, if a Cas9 protein from S. pyogenes is used, the Cas9 protein may be modified such that the PAM no longer comprises the sequence 5’-NRR-3’, but instead comprises the sequence 5’-NNR-3’, wherein R can be either A or G, N is any nucleotide, and N is immediately 3’ of the nucleic acid target sequence targeted by the nucleic acid target binding sequence.
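As an illustration of the PAM requirement described above, the following sketch scans one strand of a DNA sequence for target sites that are immediately followed, in the 3’ direction, by a PAM written with the degenerate codes N and R. The function name find_targets, the 20-nucleotide default target length, and the single-strand scan are assumptions chosen for illustration and are not taken from the present disclosure.

```python
import re

# Degenerate nucleotide codes used in the text: R = A or G, N = any nucleotide.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "[AG]", "N": "[ACGT]"}


def find_targets(dna, pam="NRR", target_len=20):
    """Return (start, target, pam_seq) tuples for the given strand.

    A hit is a window of target_len nucleotides whose 3' neighbor is a
    sequence matching the PAM pattern. target_len=20 is an assumed default.
    """
    dna = dna.upper()
    pam_re = re.compile("".join(IUPAC[base] for base in pam.upper()))
    hits = []
    for pam_start in range(target_len, len(dna) - len(pam) + 1):
        pam_seq = dna[pam_start:pam_start + len(pam)]
        if pam_re.fullmatch(pam_seq):
            start = pam_start - target_len
            hits.append((start, dna[start:pam_start], pam_seq))
    return hits


if __name__ == "__main__":
    example = "ACGT" * 5 + "TGG"
    for hit in find_targets(example, pam="NRR"):
        print(hit)
```

Passing pam="NNR" models the modified PAM described in the paragraph; a Cas protein whose PAM lies 5’ of the target sequence would need the window placed on the other side of the match.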

[0021] Other Cas proteins recognize other PAMs, and one of skill in the art can determine the PAM for any particular Cas protein. For example, Cas12a protein has a thymine-rich PAM site that targets, for example, a TTTN sequence (see Fagerlund, R., et al., Genome Biology 16:251 (2015)).

[0022] The RNA-guided Cas9 endonuclease has been widely used for programmable genome editing in a variety of organisms and model systems (see, e.g., Jinek, M., et al., Science 337:816-821 (2012); Jinek, M., et al., eLife 2:e00471 - dx.doi.org/10.7554/eLife.00471 (2013); U.S. Published Patent Application No. 2014-0068797, published 6 March 2014).

[0023] Makarova, K.S., et al. (Cell 168:946 (2017) - dx.doi.org/10.1016/j.cell.2017.02.018) provide a summary of genes, homologs, effector protein domain organization, RNA components, effector complexes, and mechanisms of action for Class 2 CRISPR-Cas systems.

[0024] Genome engineering includes altering the genome by deleting, inserting, mutating, or substituting specific nucleic acid sequences. The alteration can be gene- or location-specific. Genome engineering can use site-directed nucleases, such as Cas proteins and their cognate polynucleotides, to cut DNA, thereby generating a site for alteration. In certain cases, the cleavage can introduce a double-strand break (DSB) in the target DNA sequence. DSBs can be repaired, e.g., by non-homologous end joining (NHEJ), microhomology-mediated end joining (MMEJ), or homology-directed repair (HDR). HDR relies on the presence of a template for repair. In some examples of genome engineering, a donor polynucleotide or portion thereof can be inserted into the break.

Summary of the Invention

[0025] The present invention generally relates to Class 2 Type II CRISPR-Cas9-associated discontinuous-helical triplex nucleic acid-targeting nucleic acids (dht-NATNAs).

[0026] In a first aspect, the present invention relates to Class 2 Type II CRISPR-Cas9-associated discontinuous-helical triplex nucleic acid-targeting nucleic acid (dht-NATNA) compositions. In one embodiment, a dht-NATNA composition comprises a first Class 2 Type II CRISPR-Cas9-associated discontinuous-helical triplex single-stranded polynucleotide (dht-casPN1) comprising a spacer sequence, a second Class 2 Type II CRISPR-Cas9-associated discontinuous-helical single-stranded polynucleotide (dht-casPN2) comprising a nexus, and a third Class 2 Type II CRISPR-Cas9-associated discontinuous-helical triplex single-stranded polynucleotide (dht-casPN3). Each of the dht-casPN1, the dht-casPN2, and the dht-casPN3 has a 5' end and a 3' end. Each of the dht-casPN1, the dht-casPN2, and the dht-casPN3 is a continuous series of covalently connected nucleotides, and the 3' end of the dht-casPN2 and the 5' end of the dht-casPN3 are located 3' of the nexus. In this embodiment, the dht-casPN1 connects with the dht-casPN2 through hydrogen base-pair bonds, and the dht-casPN2 connects with the dht-casPN3 through hydrogen base-pair bonds to form the dht-NATNA. The dht-NATNA is capable of forming a nucleoprotein complex with a Cas9 protein, and the nucleoprotein complex is capable of binding or binding/cleaving a target nucleic acid sequence complementary to a target nucleic acid binding sequence of the dht-NATNA.

[0027] In another embodiment, a dht-NATNA composition comprises a first Class 2 Type II CRISPR-Cas9-associated discontinuous-helical triplex single-stranded polynucleotide (dht-casPN1) comprising a spacer and a nexus, and a second Class 2 Type II CRISPR-Cas9-associated discontinuous-helical triplex single-stranded polynucleotide (dht-casPN2), each having a 5' end and a 3' end. The dht-casPN1 and the dht-casPN2 are each a continuous series of covalently connected nucleotides, and the 3' end of the dht-casPN1 and the 5' end of the dht-casPN2 are located 3' of the nexus. In this embodiment, the dht-casPN1 connects with the dht-casPN2 through hydrogen base-pair bonds to form the dht-NATNA. The dht-NATNA is capable of forming a nucleoprotein complex with a Cas9 protein, and the nucleoprotein complex is capable of binding or binding/cleaving a target nucleic acid sequence complementary to a target nucleic acid binding sequence of the dht-NATNA.

[0028] In a further embodiment, a dht-NATNA composition comprises a first Class 2 Type II CRISPR-Cas9-associated discontinuous-helical triplex single-stranded polynucleotide (dht-casPN1) and a second Class 2 Type II CRISPR-Cas9-associated discontinuous-helical triplex single-stranded polynucleotide (dht-casPN2). The dht-casPN1 comprises: a first segment comprising a spacer sequence and a first stem repeat nucleotide sequence I; a second segment comprising a first stem repeat nucleotide sequence II, a joining nucleotide sequence, a nexus stem nucleotide sequence I, and a first linker element nucleotide sequence I; a first connecting nucleotide sequence covalently connecting the 3' end of the first segment to the 5' end of the second segment; a third segment comprising a first linker element nucleotide sequence II and a nexus stem nucleotide sequence II; a second connecting nucleotide sequence covalently connecting the 3' end of the second segment to the 5' end of the third segment; and a fourth segment comprising a third stem nucleotide sequence I. The dht-casPN2 comprises: a fifth segment comprising a third stem nucleotide sequence II; a sixth segment comprising a helical-triplex forming nucleotide sequence; and a fifth connecting nucleotide sequence covalently connecting the 3' end of the fifth segment to the 5' end of the sixth segment. In this embodiment, the first stem repeat nucleotide sequence I is connected through hydrogen base-pair bonds to the first stem repeat nucleotide sequence II and forms a first stem, the nexus stem nucleotide sequence I is connected through hydrogen base-pair bonds to the nexus stem nucleotide sequence II and forms a nexus, the third stem nucleotide sequence I is connected through hydrogen base-pair bonds to the third stem nucleotide sequence II and forms a 3' stem loop, and the helical-triplex forming nucleotide sequence connects with the nexus through hydrogen base-pair bonding and forms a helical triplex. The dht-NATNA is capable of forming a nucleoprotein complex with a Cas9 protein, and the nucleoprotein complex is capable of binding or binding/cleaving a target nucleic acid sequence complementary to a target nucleic acid binding sequence of the dht-NATNA.

[0029] Additional exemplary dht-NATNA compositions are described herein.

[0030] In another aspect, the present invention includes a nucleoprotein composition comprising a dht-NATNA composition and a Cas9 protein. In some embodiments, the Cas9 protein is a Campylobacter jejuni Cas9 protein. In further embodiments of the nucleoprotein composition, the dht-NATNA composition is in a complex with the Cas9 protein. Embodiments of the present invention include an enzymatically inactive Cas9 protein.

[0031] In a further aspect, the present invention relates to kits comprising one or more components of a dht-NATNA composition and an excipient (e.g., a buffer). In some embodiments, the dht-NATNA composition comprises one or more dht Cas polynucleotides (dht-casPNs), or one or more nucleic acid sequences encoding the dht-casPNs, and a buffer. Kits can further comprise one or more Cas9 proteins or one or more nucleic acid sequences encoding the one or more Cas9 proteins. In further embodiments, a kit can comprise nucleoprotein complexes comprising a dht-NATNA composition and a Cas9 protein, as well as an excipient.

[0032] In another aspect, the present invention relates to one or more nucleic acid sequences encoding one or more dht-casPNs of a dht-NATNA. In some embodiments, an expression cassette comprises the one or more nucleic acid sequences encoding one or more dht-casPNs of a dht-NATNA.

[0033] In an additional aspect, the present invention relates to an expression vector comprising one or more nucleic acid sequences encoding one or more dht-casPNs of a dht-NATNA composition.

[0034] In yet another aspect, the present invention relates to a recombinant cell comprising one or more nucleic acid sequences encoding one or more dht-casPNs of a dht-NATNA composition.

[0035] In another aspect, the present invention relates to a recombinant cell comprising one or more dht-NATNA compositions. In some embodiments, a recombinant cell comprises a nucleoprotein composition comprising a dht-NATNA composition and a Cas9 protein.

[0036] In a further aspect, the present invention relates to a recombinant cell modified by one or more Cas9 and dht-NATNA compositions.

[0037] Further aspects of the present invention include methods of using a dht-NATNA composition including, but not limited to, a method of binding DNA or RNA. This method comprises contacting a target DNA or RNA sequence in a DNA or RNA polynucleotide with a nucleoprotein complex comprising a dht-NATNA composition and a Cas9 protein, thereby facilitating binding of the nucleoprotein complex to the target DNA or RNA sequence in the DNA or RNA polynucleotide.

[0038] Another method of the present invention is a method of cutting DNA or RNA. The method comprises contacting a target DNA or RNA sequence in a DNA or RNA polynucleotide with a nucleoprotein complex comprising a dht-NATNA composition and a Cas9 protein, thereby facilitating binding of the nucleoprotein complex to the target DNA or RNA sequence. Such binding results in cutting of the target DNA or RNA sequence.

[0039] These aspects and other embodiments of the present invention using the dht-NATNA compositions and nucleoprotein complexes comprising the dht-NATNA compositions of the present invention will be readily apparent to those of ordinary skill in the art in view of the disclosure herein.

Brief Description of the Figures

[0040] The figures are not proportionally rendered, nor are they to scale. The locations of indicators are approximate.

[0041] FIG. 1A, FIG. 1B, FIG. 1C, and FIG. 1D present illustrative examples of dual-guide Class 2 Type II CRISPR-Cas9-associated RNAs comprising a helical triplex (FIG. 1A), a single-guide Class 2 Type II CRISPR-Cas9-associated RNA comprising a helical triplex (FIG. 1B), and two different graphical views of the 3' helical triplex region of a Class 2 Type II CRISPR-Cas9-associated RNA (FIG. 1C and FIG. 1D). Guide RNAs derived from Campylobacter jejuni and Campylobacter lari have this type of secondary structure.

[0042] FIG. 2 presents another illustrative example of a dual-guide Class 2 Type II CRISPR-Cas9-associated RNA comprising a helical triplex.

[0043] FIG. 3 presents another illustrative example of a single-guide Class 2 Type II CRISPR-Cas9-associated RNA comprising a helical triplex.

[0044] FIG. 4A and FIG. 4B illustrate a split-nexus modification of a single-guide Class 2 Type II CRISPR-Cas9-associated RNA comprising a helical triplex.

[0045] FIG. 5A, FIG. 5B, FIG. 5C, and FIG. 5D illustrate embodiments of Class 2 Type II CRISPR-Cas9-associated discontinuous-helical triplex nucleic acid-targeting nucleic acid (dht-NATNA) compositions of the present invention.

[0046] FIG. 6A, FIG. 6B, FIG. 6C, FIG. 6D, FIG. 6E, FIG. 6F, FIG. 6G, FIG. 6H, FIG. 6I, FIG. 6J, and FIG. 6K illustrate additional embodiments of Class 2 Type II CRISPR-Cas9-associated discontinuous-helical-triplex nucleic acid-targeting nucleic acid (dht-NATNA) compositions of the present invention.

[0047] FIG. 7A and FIG. 7B illustrate an example of a Class 2 Type II CRISPR-Cas9-associated discontinuous-helical-triplex nucleic acid-targeting nucleic acid (dht-NATNA)/Cas9 nucleoprotein complex of the present invention forming, binding to, and cleaving a double-stranded DNA comprising a target DNA sequence.

Incorporation by Reference

[0048] All patents, publications, and patent applications cited in this specification are herein incorporated by reference as if each individual patent, publication, or patent application was specifically and individually indicated to be incorporated by reference in its entirety for all purposes.

Detailed Description of the Invention

[0049] It is to be understood that the terminology used herein is for the purpose of describing particular embodiments only, and is not intended to be limiting. As used in this specification and the appended claims, the singular forms “a,” “an,” and “the” include plural referents unless the context clearly dictates otherwise. Thus, for example, reference to “a polynucleotide” includes one or more polynucleotides, and reference to “a vector” includes one or more vectors.

[0050] Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which the invention pertains. Although other methods and materials similar, or equivalent, to those described herein can be useful in the present invention, preferred materials and methods are described herein.

[0051] In view of the teachings of the present specification, one of ordinary skill in the art can apply conventional techniques of immunology, biochemistry, chemistry, molecular biology, microbiology, cell biology, genomics, and recombinant polynucleotides, as taught, for example, by the following standard texts: Cellular and Molecular Immunology, 9th Edition, A.K. Abbas, et al., Elsevier (2017), ISBN 978-0323479783; Cancer Immunotherapy Principles and Practice, 1st Edition, L.H. Butterfield, et al., Demos Medical (2017), ISBN 978-1620700976; Janeway's Immunobiology, 9th Edition, Kenneth Murphy, Garland Science (2016), ISBN 978-0815345053; Clinical Immunology and Serology: A Laboratory Perspective, 4th Edition, C. Dorresteyn Stevens, et al., F.A. Davis Company (2016), ISBN 978-0803644663; Antibodies: A Laboratory Manual, Second Edition, E.A. Greenfield, Cold Spring Harbor Laboratory Press (2014), ISBN 978-1-936113-81-1; Culture of Animal Cells: A Manual of Basic Technique and Specialized Applications, 7th Edition, R.I. Freshney, Wiley-Blackwell (2016), ISBN 978-1118873656; Transgenic Animal Technology, Third Edition: A Laboratory Handbook, C.A. Pinkert, Elsevier (2014), ISBN 978-0124104907; The Laboratory Mouse, Second Edition, H. Hedrich, Academic Press (2012), ISBN 978-0123820082; Manipulating the Mouse Embryo: A Laboratory Manual, Fourth Edition, R. Behringer, et al., Cold Spring Harbor Laboratory Press (2013), ISBN 978-1936113019; PCR 2: A Practical Approach, M.J. McPherson, et al., IRL Press (1995), ISBN 978-0199634248; Methods in Molecular Biology (Series), J.M. Walker, ISSN 1064-3745, Humana Press; RNA: A Laboratory Manual, D.C. Rio, et al., Cold Spring Harbor Laboratory Press (2010), ISBN 978-0879698911; Methods in Enzymology (Series), Academic Press; Molecular Cloning: A Laboratory Manual (Fourth Edition), M.R. Green, et al., Cold Spring Harbor Laboratory Press (2012), ISBN 978-1605500560; Bioconjugate Techniques, Third Edition, G.T. Hermanson, Academic Press (2013), ISBN 978-0123822390; Methods in Plant Biochemistry and Molecular Biology, W.V. Dashek, CRC Press (1997), ISBN 978-0849394805; Plant Cell Culture Protocols (Methods in Molecular Biology), V.M. Loyola-Vargas, et al., Humana Press (2012), ISBN 978-1617798177; Plant Transformation Technologies, C.N. Stewart, et al., Wiley-Blackwell (2011), ISBN 978-0813821955; Recombinant Proteins from Plants (Methods in Biotechnology), C. Cunningham, et al., Humana Press (2010), ISBN 978-1617370212; Plant Genomics: Methods and Protocols (Methods in Molecular Biology), W. Busch, Humana Press (2017), ISBN 978-1493970018; Plant Biotechnology: Methods in Tissue Culture and Gene Transfer, R. Keshavachandran, et al., Orient Blackswan (2008), ISBN 978-8173716164.

[0052] Clustered regularly interspaced short palindromic repeats (CRISPR) and related CRISPR-associated proteins (Cas proteins) constitute CRISPR-Cas systems (see, e.g., Barrangou, R., et al., Science 315:1709-1712 (2007)).

[0053] As used herein, “Cas protein” and “CRISPR-Cas protein” refer to Cas proteins including, but not limited to, Class 1 Type I Cas proteins, Class 1 Type III Cas proteins, Class 1 Type IV Cas proteins, Class 2 Type II Cas proteins, Class 2 Type V Cas proteins, and Class 2 Type VI Cas proteins. Class 2 Cas proteins include Cas9 proteins, Cas9-like proteins encoded by Cas9 orthologs, Cas9-like synthetic proteins, Cpf1 proteins, proteins encoded by Cpf1 orthologs, Cpf1-like synthetic proteins, C2c1 proteins, C2c2 proteins, C2c3 proteins, and variants and modifications thereof. In some embodiments, Cas proteins are Class 2 Cas proteins, for example one or more Class 2 Type II Cas proteins, such as Cas9, one or more Class 2 Type V Cas proteins, such as Cpf1, or one or more Class 2 Type VI Cas proteins, such as C2c2. In preferred embodiments, Cas proteins are one or more Class 2 Type II Cas proteins, such as Cas9, and one or more Class 2 Type V Cas proteins, such as Cpf1. Typically, for use in aspects of the present invention, a Cas protein is capable of interacting with one or more cognate polynucleotides (most typically, RNA) to form a nucleoprotein complex (most typically, a ribonucleoprotein complex).

[0054] “Cas9 protein,” as used herein, refers to Cas9 wild-type proteins derived from Class 2 Type II CRISPR-Cas9 systems, modifications of Cas9 proteins, variants of Cas9 proteins, Cas9 orthologs, and combinations thereof. Cas9 proteins include, but are not limited to, Cas9 from Streptococcus pyogenes (UniProtKB - Q99ZW2 (CAS9_STRP1)), Streptococcus thermophilus (UniProtKB - G3ECR1 (CAS9_STRTR)), Staphylococcus aureus (UniProtKB - J7RUA5 (CAS9_STAAU)), Campylobacter jejuni (UniProtKB - Q0P897 (CAS9_CAMJE)), Campylobacter lari (UniProtKB - A0A0A8HTA3 (A0A0A8HTA3_CAMLA)), Helicobacter canadensis (UniProtKB - C5ZYI3 (C5ZYI3_9HELI)), Campylobacter sp. RM1670 (UniProtKB - A0A0A8GXC3 (A0A0A8GXC3_9PROT)), Campylobacter subantarcticus (UniProtKB - A0A0A8H849 (A0A0A8H849_9PROT)), Campylobacter peloridis (UniProtKB - A0A0A8H849 (A0A0A8H849_9PROT)), and Helicobacter cinaedi (UniProtKB - I7GTK8 (I7GTK8_9HELI)). Cas9 homologs can be identified using sequence similarity search methods known to one skilled in the art.

[0055] “dCas9,” as used herein, refers to variants of a Cas9 protein that are nuclease-deactivated Cas9 proteins, also termed “catalytically inactive Cas9 protein,” “enzymatically inactive Cas9,” “catalytically dead Cas9,” or “dead Cas9.” Such molecules lack all or a portion of endonuclease activity and can therefore be used to regulate genes in an RNA-guided manner (see Jinek, M., et al., Science 337:816-821 (2012)). This is accomplished by introducing mutations to catalytic residues, such as D10A in the RuvC-1 domain and H840A in the HNH domain (numbered relative to S. pyogenes Cas9), that inactivate Cas9 nuclease function. Mutation of only one of these nuclease domains, e.g., a D10A mutation in the RuvC-1 domain, results in a Cas9 that is only capable of nicking one strand of the target DNA. It is understood that mutations of other catalytic residues to reduce activity of either or both of the nuclease domains can also be carried out by one skilled in the art. The resultant dCas9 is unable to cleave double-stranded DNA but retains the ability to complex with a guide nucleic acid and bind a target DNA sequence. The Cas9 double mutant with changes at amino acid positions D10A and H840A (numbered relative to S. pyogenes Cas9) inactivates both the nuclease and nickase activities of the Cas9 protein. Targeting specificity is determined by Cas9 protein binding to the PAM sequence and by complementary base pairing of guide RNA (typically, a single-guide RNA) to the genomic locus.

[0056] “Nucleic acid-targeting nucleic acid” (NATNA), as used herein, refers to one or more polynucleotides that guide a protein, such as a Cas protein (preferably Cas9), to preferentially bind a nucleic acid target sequence in a polynucleotide (relative to a polynucleotide that does not comprise the nucleic acid target sequence). NATNAs can comprise ribonucleotide bases (e.g., RNA), deoxyribonucleotide bases (e.g., DNA), combinations of ribonucleotide bases and deoxyribonucleotide bases (e.g., RNA/DNA; see, e.g., U.S. Patent No. 9,650,617, issued 16 May 2017; U.S. Patent No. 9,580,701, issued 28 February 2017; U.S. Patent No. 9,688,972, issued 27 June 2017; U.S. Patent No. 9,771,601, issued 26 September 2017; U.S. Patent No. 9,868,962, issued 16 January 2018), nucleotides, nucleotide analogs, modified nucleotides, and the like, as well as synthetic, naturally occurring, and non-naturally occurring modified backbone residues or linkages, for example, as described herein. Examples of NATNAs include, but are not limited to, Cas9-dual-guide RNAs, Cas9-single-guide RNAs, and the Class 2 Type II CRISPR-Cas9-associated discontinuous-helical triplex NATNAs of the present invention. Variations on Cas9 guides are known in the art, including but not limited to those disclosed in U.S. Patent No. 9,580,727, issued 28 February 2017; U.S. Patent No. 9,745,600, issued 29 August 2017; U.S. Patent No. 9,677,090, issued 13 June 2017; U.S. Patent No. 9,745,562, issued 29 August 2017; U.S. Patent No. 9,816,081, issued 14 November 2017.

[0057] As used herein, a "stem element" or "stem structure" refers to two strands of nucleic acids that are known or predicted to form a double-stranded region (the “stem element”). A "stem-loop element" or “stem-loop structure” refers to a stem structure wherein 3'-end sequences of one strand are covalently bonded to 5'-end sequences of the second strand by a nucleotide sequence of typically single-stranded nucleotides (“a stem-loop element nucleotide sequence”). In some embodiments, the loop element comprises a loop element nucleotide sequence of between about 3 and about 20 nucleotides in length, preferably between about 4 and about 10 nucleotides in length. In preferred embodiments, a loop element nucleotide sequence is a single-stranded nucleotide sequence of unpaired nucleic acid bases that do not interact through hydrogen bond formation to create a stem element within the loop element nucleotide sequence. The term “hairpin element” is also used herein to refer to stem-loop structures. Such structures are well known in the art. The base pairing may be exact; however, as is known in the art, a stem element does not require exact base pairing. Thus, the stem element may include one or more base mismatches or non-paired bases.

[0058] A “linker element nucleotide sequence,” “linker nucleotide sequence,” and "linker sequence" are used interchangeably herein and refer to a single-stranded nucleic acid sequence of one or more nucleotides covalently attached to a first nucleic acid sequence (5'-linker nucleotide sequence-first nucleic acid sequence-3'). In some embodiments, a linker nucleotide sequence connects two separate nucleic acid sequences to form a single polynucleotide (e.g., 5'-first nucleic acid sequence-linker nucleotide sequence-second nucleic acid sequence-3'). Other examples of linker sequences include, but are not limited to, 5'-first nucleic acid sequence-linker nucleotide sequence-3', and 5'-linker nucleotide sequence-first nucleic acid sequence-linker nucleotide sequence-3'. In some embodiments, the linker element nucleotide sequence can be a single-stranded nucleotide sequence of unpaired nucleic acid bases that do not interact with each other through hydrogen bond formation to create a secondary structure (e.g., a stem-loop structure) within the linker element nucleotide sequence. In some embodiments, two linker element nucleotide sequences can interact with each other through hydrogen bonding between the two linker element nucleotide sequences. In some embodiments, a linker element nucleotide sequence can be between about 1 and about 50 nucleotides in length, preferably between about 1 and about 15 nucleotides in length. As used herein, a "connecting sequence" or a "connecting nucleotide sequence" refers to a single-stranded linker sequence that covalently connects a first nucleic acid sequence and a second nucleic acid sequence. As used herein, a "joining nucleotide sequence" is a short single-stranded linker sequence typically between about 1 nucleotide and about 5 nucleotides in length.
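The stem and stem-loop definitions above can be made concrete with a small sketch that takes an RNA hairpin written 5' to 3', splits it into a 5' arm, a loop, and a 3' arm, and counts how many antiparallel positions of the two arms can pair. The function name stem_loop_summary, the treatment of G-U wobble pairs as paired, and the hard cutoffs of 4 and 10 nucleotides (the text says "about") are assumptions made for illustration; this is not a structure-prediction method.

```python
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}  # assumption: counted as paired when allowed


def stem_loop_summary(rna, stem_len, allow_wobble=True):
    """Summarize a hairpin given as 5' arm + loop + 3' arm (arms of stem_len nt)."""
    rna = rna.upper()
    loop_len = len(rna) - 2 * stem_len
    if stem_len < 0 or loop_len < 0:
        raise ValueError("sequence is too short for the requested stem length")
    five_arm = rna[:stem_len]
    three_arm = rna[len(rna) - stem_len:]
    allowed = WATSON_CRICK | WOBBLE if allow_wobble else WATSON_CRICK
    # Strands pair antiparallel: first base of the 5' arm faces the last base
    # of the 3' arm, and so on toward the loop.
    paired = sum((a, b) in allowed for a, b in zip(five_arm, reversed(three_arm)))
    return {
        "loop_len": loop_len,
        "paired": paired,
        "mismatches": stem_len - paired,  # a stem may tolerate some mismatches
        "loop_in_preferred_range": 4 <= loop_len <= 10,
    }


if __name__ == "__main__":
    print(stem_loop_summary("GGGCGAAAGCCC", stem_len=4))
```

A nonzero mismatch count does not by itself rule out a stem, consistent with the statement that a stem element does not require exact base pairing.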

[0059] As used herein, “dual-guide NATNA” and “Cas9-dual-guide NATNA” refer to a two-component NATNA capable of associating with a cognate Class 2 Type II CRISPR-Cas9 protein, as further described herein. FIG. 1A presents an illustrative example of a Campylobacter jejuni dual-guide Class 2 Type II CRISPR-Cas9-associated RNA comprising a crRNA (FIG. 1A, 101) and a tracrRNA (FIG. 1A, 102) comprising a helical triplex (see, e.g., Kim, E., et al., Nat. Commun. 8:14500 (2017) - dx.doi.org/10.1038/ncomms14500; Yamada, M., et al., Mol. Cell. 65(6):1109-1121 (2017) - dx.doi.org/10.1016/j.molcel.2017.02.007). FIG. 1A presents an overview of and nomenclature for secondary structural elements of the crRNA and tracrRNA of C. jejuni, including the following: a spacer element (FIG. 1A, 103) comprising a spacer sequence (also referred to herein as a spacer or nucleic acid target binding sequence); a first stem structure (FIG. 1A, 104); a nexus (FIG. 1A, 105; also referred to herein as a nexus element or nexus stem-loop element; see, e.g., Briner, A., et al., Mol. Cell. 23;56(2):333-339 (2014) - dx.doi.org/10.1016/j.molcel.2014.09.019; Nowak, C., et al., Nucleic Acids Res. 44(20):9555-9564 (2016) - dx.doi.org/10.1093/nar/gkw908; Wright, A., et al., Proceedings of the National Academy of Sciences of the United States of America 112(10):2984-2989 (2015) - dx.doi.org/10.1073/pnas.1501698112; U.S. Patent No. 9,580,727, issued 28 February 2017); a 3’ hairpin (FIG. 1A, 106); a helical-triplex forming nucleotide sequence (FIG. 1A, 107); and a helical triplex (FIG. 1A, 108). Two different graphical views of the helical triplex of C. jejuni Class 2 Type II CRISPR-Cas9-associated NATNAs are set forth in FIG. 1C and FIG. 1D. The view in FIG. 1D is rotated 45 degrees clockwise relative to the view in FIG. 1C. Examples of hydrogen bonds forming the helical triplex are illustrated with dashed lines.

[0060] A dual-guide Class 2 Type II CRISPR-Cas9-associated NATNA is capable of forming a nucleoprotein complex with a cognate Cas9 protein, wherein the complex is capable of targeting a nucleic acid target sequence complementary to the spacer sequence. Modifications of dual-guide Class 2 Type II CRISPR-Cas9-associated NATNAs are known in the art (see, e.g., U.S. Patent No. 9,260,752, issued 16 February 2016; U.S. Patent No. 9,580,727, issued 28 February 2017; and U.S. Patent No. 9,816,093, issued 14 November 2017).

[0061] FIG. 2 illustrates a dual-guide Class 2 Type II CRISPR-Cas9-associated NATNA comprising a helical triplex (indicators for hydrogen bond interactions omitted). The series of indicators and corresponding nucleotide sequences are set forth in Table 1.

[0062]


[0063] As used herein, “single-guide NATNA,” "sgNATNA" (e.g., a sgRNA), and “Cas9-single-guide NATNA” refer to a one-component NATNA capable of associating with a cognate Class 2 Type II CRISPR-Cas9 protein to form a nucleoprotein complex, as further described herein. FIG. 1B shows an example of a C. jejuni Class 2 Type II CRISPR-Cas9-associated sgRNA and illustrates a sgRNA wherein the crRNA (FIG. 1A, 101) is covalently joined to the tracrRNA (FIG. 1A, 102) by a connective sequence to form a first stem-loop structure (FIG. 1B, 109). In some embodiments, the connective sequence is a tetraloop (see, e.g., Kim, E., et al., Nat. Commun. 8:14500 (2017) - dx.doi.org/10.1038/ncomms14500; Yamada, M., et al., Mol. Cell. 65(6):1109-1121 (2017) - dx.doi.org/10.1016/j.molcel.2017.02.007). A nucleoprotein complex of a Cas9-single-guide NATNA and Cas9 protein is capable of targeting and binding to a nucleic acid sequence complementary to the spacer sequence of the sgNATNA. Modifications of single-guide Cas9-associated NATNAs are known in the art (see, e.g., U.S. Patent No. 9,260,752, issued 16 February 2016; U.S. Patent No. 9,580,727, issued 28 February 2017; and U.S. Patent No. 9,816,093, issued 14 November 2017).

[0064] FIG. 3 illustrates a single-guide Class 2 Type II CRISPR-Cas9-associated NATNA comprising a helical triplex (indicators for hydrogen bond interactions omitted). A series of indicators and corresponding nucleotide sequences are set forth in Table 2.

[0065]

[0066] As used herein, the term “cognate” typically refers to a Cas protein (e.g., Cas9) and one or more Cas polynucleotides (e.g., Class 2 Type II CRISPR-Cas9-associated NATNAs) that are capable of forming a nucleoprotein complex capable of site-directed binding to a nucleic acid target sequence complementary to the nucleic acid target binding sequence present in one of the one or more Cas polynucleotides.

[0067] The terms “wild-type,” “naturally occurring,” and “unmodified” are used herein to mean the typical (or most common) form, appearance, phenotype, or strain existing in nature; for example, the typical form of cells, organisms, polynucleotides, proteins, macromolecular complexes, genes, RNAs, DNAs, or genomes as they occur in, and can be isolated from, a source in nature. The wild-type form, appearance, phenotype, or strain serves as the original parent before an intentional modification. Thus, mutant, variant, engineered, recombinant, and modified forms are not wild-type forms.

[0068] As used herein, the terms “engineered,” “genetically engineered,” “recombinant,” “modified,” “non-naturally occurring,” “non-natural,” and “non-native” are interchangeable and indicate intentional human manipulation.

[0069] As used herein, “interrupted,” “broken,” and “discontinuous” are used interchangeably to mean made into one or more pieces from a whole, e.g., a single polynucleotide gives rise to a first polynucleotide and a second polynucleotide by introducing a break in the covalent bonds of the backbone of the single polynucleotide. In this example, the single polynucleotide has a 5' terminus and a 3' terminus, and the resulting first polynucleotide and the second polynucleotide each have a 5' terminus and a 3' terminus (5' terminus-first polynucleotide-3' terminus and 5' terminus-second polynucleotide-3' terminus, respectively) after introduction of the break. In an RNA or DNA molecule, a break in the polynucleotide backbone can be a broken phosphodiester bond.

[0070] Examples of termini include, but are not limited to, termini wherein the 5' terminus of a DNA or RNA molecule is the fifth carbon in the sugar ring and the 3' terminus is the hydroxyl group on the third carbon in the sugar ring. Two polynucleotides, each having a 5' terminus and a 3' terminus, are formed when the backbone of a single polynucleotide is broken at one site. A 5’ and/or 3’ terminus can be covalently modified, for example, by addition of a moiety (e.g., a moiety providing resistance to the degradative effects of exonucleases).
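The effect of a single backbone break described in paragraphs [0069] and [0070] can be illustrated with a minimal sketch. It assumes a polynucleotide is modeled as a plain 5'-to-3' string and a break as a position between two adjacent nucleotides; the function name `break_backbone` and the example sequence are hypothetical and are not taken from the disclosure.

```python
def break_backbone(sequence: str, site: int) -> tuple:
    """Split a 5'->3' sequence at one backbone linkage.

    `site` is the number of nucleotides kept in the first (5') fragment,
    so the broken linkage lies between nucleotides `site` and `site + 1`
    (1-based). Each returned fragment has its own 5' and 3' terminus.
    """
    if not 0 < site < len(sequence):
        raise ValueError("the break must fall between two nucleotides")
    return sequence[:site], sequence[site:]


# Hypothetical sequence, used only to show the two resulting polynucleotides.
first, second = break_backbone("GGAUCCGAAAGGAUCC", 6)
print(f"5'-{first}-3'")
print(f"5'-{second}-3'")
```

The 3' end of the first fragment and the 5' end of the second fragment correspond to the non-native termini created by the break.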

[0071] The term "Class 2 Type II CRISPR-Cas9-associated split-nexus NATNA" refers to at least two Class 2 Type II CRISPR-Cas9-associated split-nexus polynucleotides wherein the backbone of the nexus stem-loop structure comprises an engineered break in the nucleic acid backbone resulting in at least one non-native 3’ terminus and one non-native 5’ terminus (relative to a wild-type Class 2 Type II CRISPR-Cas9-associated tracrRNA) in the nexus that converts the nexus stem-loop structure to a nexus stem structure (see, e.g., U.S. Patent No. 9,580,727, issued 28 February 2017; and U.S. Patent No. 9,745,600, issued 29 August 2017).

[0072] FIG. 4A illustrates an example of a C. jejuni Class 2 Type II CRISPR-Cas9-associated split-nexus NATNA. FIG. 4B illustrates a Class 2 Type II CRISPR-Cas9-associated split-nexus NATNA comprising a helical triplex. A series of indicators and corresponding nucleotide sequences are set forth in Table 3.

[0073]

[0074] As used herein, the terms "helical triplex" and "triple helix" are used interchangeably and refer to a triple-stranded polynucleotide (e.g., DNA or RNA) in which three oligonucleotide strands interact with one another and form a triple helix. Examples of helical triplex structures include, but are not limited to, RNA minor groove triplexes (a common RNA structural motif, typically providing optimal van der Waals contacts, extensive hydrogen bonding, and an energetically favorable interaction), RNA major groove triplexes, and triple-stranded DNA (typically formed by Hoogsteen or reversed Hoogsteen hydrogen bonds in the major groove of B-form DNA). FIG. 1C and FIG. 1D illustrate a helical triplex formed between a nexus and a triplex forming nucleotide sequence of a Class 2 Type II CRISPR-Cas9-associated NATNA. Such helical triplexes are present, for example, in Campylobacter jejuni guide RNAs (Yamada, M., et al., Mol. Cell. 65(6):1109-1121 (2017) - dx.doi.org/10.1016/j.molcel.2017.02.007).

[0075] As used herein, the term “Class 2 Type II CRISPR-Cas9-associated discontinuous-helical triplex NATNA” (dht-NATNA) composition refers to engineered Class 2 Type II Cas9-associated NATNAs of the present invention, wherein the polynucleotide 3' of the nexus (see, e.g., FIG. 2, 210-216; and FIG. 3, 310-316) comprises at least one polynucleotide strand that interacts with the nexus stem-loop structure to form a helical triplex, and further comprises at least one engineered break in the nucleic acid backbone resulting in at least one non-native 3’ terminus and one non-native 5’ terminus in the polynucleotide 3' of the nexus (relative to a wild-type Class 2 Type II CRISPR-Cas9-associated tracrRNA). In some embodiments, dht-NATNAs comprise DNA, RNA, or RNA and DNA. A dht-NATNA comprises two or more dht Cas polynucleotides (dht-casPNs); one polynucleotide component of a dht-NATNA is referred to as a dht-casPN.

[0076] In one embodiment, a dht-NATNA (or dht-NATNA composition) comprises a first Class 2 Type II CRISPR-Cas9-associated discontinuous-helical triplex single-stranded polynucleotide (dht-casPN1) comprising, in a 5’ to 3’ direction, a nucleic acid target binding sequence, a first stem-loop element, a nexus, a third stem nucleotide sequence I, and a non-native 3’ terminus; and a second Class 2 Type II CRISPR-Cas9-associated discontinuous-helical triplex single-stranded polynucleotide (dht-casPN2) comprising, in a 5’ to 3’ direction, a non-native 5' terminus, a third stem nucleotide sequence II, and a triplex forming nucleotide sequence (see, e.g., FIG. 5A, FIG. 5B). FIG. 5A illustrates a dht-NATNA comprising two polynucleotides (FIG. 5A, I, dht-casPN1; and FIG. 5A, II, dht-casPN2). FIG. 5B illustrates two polynucleotides of the dht-NATNA connected by hydrogen base-pair bonds.

[0077] In another embodiment, a dht-NATNA comprises a first Class 2 Type II CRISPR-Cas9-associated discontinuous-helical triplex single-stranded polynucleotide (dht-casPN1) comprising a spacer sequence, a second Class 2 Type II CRISPR-Cas9-associated discontinuous-helical triplex single-stranded polynucleotide (dht-casPN2) comprising a nexus, and a third Class 2 Type II CRISPR-Cas9-associated discontinuous-helical triplex single-stranded polynucleotide (dht-casPN3) (see, e.g., FIG. 5C, FIG. 5D). FIG. 5C illustrates a dht-NATNA comprising three polynucleotides (FIG. 5C, I, dht-casPN1; FIG. 5C, II, dht-casPN2; FIG. 5C, III, dht-casPN3). FIG. 5D illustrates three polynucleotides of a dht-NATNA connected by hydrogen base-pair bonds.

[0078] “Covalent bond,” “covalently attached,” “covalently bound,” “covalently linked,” “covalently connected,” and “molecular bond” are used interchangeably herein, and refer to a chemical bond that involves the sharing of electron pairs between atoms. Examples of covalent bonds include, but are not limited to, phosphodiester bonds and phosphorothioate bonds.

[0079] “Non-covalent bond,” “non-covalently attached,” “non-covalently bound,” “non-covalently linked,” “non-covalent interaction,” and “non-covalently connected” are used interchangeably herein, and refer to any relatively weak chemical bond that does not involve sharing of a pair of electrons. Multiple non-covalent bonds often stabilize the conformation of macromolecules and mediate specific interactions between molecules. Examples of non-covalent bonds include, but are not limited to, hydrogen bonding, ionic interactions (e.g., Na+ Cl-), van der Waals interactions, and hydrophobic bonds.

[0080] As used herein, “hydrogen bonding,” “hydrogen-base pairing,” and “hydrogen bonded” are used interchangeably and refer to canonical hydrogen bonding and non-canonical hydrogen bonding including, but not limited to, “Watson-Crick-hydrogen-bonded base pairs” (W-C-hydrogen-bonded base pairs or W-C hydrogen bonding); “Hoogsteen-hydrogen-bonded base pairs” (Hoogsteen hydrogen bonding); and “wobble-hydrogen-bonded base pairs” (wobble hydrogen bonding). W-C hydrogen bonding, including reverse W-C hydrogen bonding, refers to purine-pyrimidine base pairing, that is, adenine:thymine, guanine:cytosine, and uracil:adenine. Hoogsteen hydrogen bonding, including reverse Hoogsteen hydrogen bonding, refers to a variation of base pairing in nucleic acids wherein two nucleobases, one on each strand, are held together by hydrogen bonds in the major groove. This non-W-C hydrogen bonding can allow a third strand to wind around a duplex and form triple-stranded helices. Wobble hydrogen bonding, including reverse wobble hydrogen bonding, refers to a pairing between two nucleotides in RNA molecules that does not follow Watson-Crick base pair rules. There are four major wobble base pairs: guanine-uracil, inosine (hypoxanthine)-uracil, inosine-adenine, and inosine-cytosine. Rules for canonical hydrogen bonding and non-canonical hydrogen bonding are known to those of ordinary skill in the art (see, e.g., The RNA World, Third Edition (Cold Spring Harbor Monograph Series), R. F. Gesteland, Cold Spring Harbor Laboratory Press (2005), ISBN 978-0879697396; The RNA World, Second Edition (Cold Spring Harbor Monograph Series), R. F. Gesteland, et al., Cold Spring Harbor Laboratory Press (1999), ISBN 978-0879695613; The RNA World (Cold Spring Harbor Monograph Series), R. F. Gesteland, et al., Cold Spring Harbor Laboratory Press (1993), ISBN 978-0879694562 (see, e.g., Appendix 1: Structures of Base Pairs Involving at Least Two Hydrogen Bonds, I. Tinoco); Principles of Nucleic Acid Structure, W. Saenger, Springer International Publishing AG (1988), ISBN 978-0-387-90761-1; Principles of Nucleic Acid Structure, First Edition, S. Neidle, Academic Press (2007), ISBN 978-0123695079).
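The Watson-Crick and wobble pairings listed in the preceding paragraph can be expressed as a small lookup, sketched below. The pair sets come from that paragraph; the one-letter codes (with `I` for inosine), the set-based representation, and the function name `classify_pair` are assumptions made for illustration. Hoogsteen pairing is not modeled, because it depends on base geometry rather than on base identity alone.

```python
# Pair sets follow the definitions in the text; frozenset makes each pair
# order-independent, so ("G", "C") and ("C", "G") are treated alike.
WATSON_CRICK = {frozenset("AT"), frozenset("GC"), frozenset("AU")}
WOBBLE = {frozenset("GU"), frozenset("IU"), frozenset("IA"), frozenset("IC")}


def classify_pair(base1: str, base2: str) -> str:
    """Return 'watson-crick', 'wobble', or 'other' for two base letters."""
    pair = frozenset((base1.upper(), base2.upper()))
    if pair in WATSON_CRICK:
        return "watson-crick"
    if pair in WOBBLE:
        return "wobble"
    return "other"


for a, b in [("G", "C"), ("G", "U"), ("I", "A"), ("A", "G")]:
    print(f"{a}:{b} -> {classify_pair(a, b)}")
```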

[0081] “Connect,” “connected,” and “connecting” are used interchangeably herein, and refer to a covalent bond or a non-covalent bond between two macromolecules (e.g., polynucleotides, proteins, and the like).

[0082] As used herein, the terms “nucleic acid sequence,” “nucleotide sequence,” and “oligonucleotide” are interchangeable and refer to a polymeric form of nucleotides. As used herein, the term "polynucleotide" refers to a polymeric form of nucleotides that has one 5' end and one 3' end, and can comprise one or more nucleic acid sequences. A "circular polynucleotide" refers to a polynucleotide having a covalent bond between its 5’ end and 3’ end, thus forming the circular polynucleotide. The nucleotides may be deoxyribonucleotides (DNA), ribonucleotides (RNA), analogs thereof, or combinations thereof, and may be of any length. Polynucleotides may perform any function and may have various secondary and tertiary structures. The terms encompass known analogs of natural nucleotides and nucleotides that are modified in the base, sugar, and/or phosphate moieties. Analogs of a particular nucleotide have the same base-pairing specificity (e.g., an analog of A base pairs with T). A polynucleotide may comprise one modified nucleotide or multiple modified nucleotides. Examples of modified nucleotides include fluorinated nucleotides, methylated nucleotides, and nucleotide analogs. Nucleotide structure may be modified before or after a polymer is assembled. Following polymerization, polynucleotides may be additionally modified via, for example, conjugation with a labeling component or target binding component. A nucleotide sequence may incorporate non-nucleotide components. The terms also encompass nucleic acids comprising modified backbone residues or linkages that are synthetic, naturally occurring, and/or non-naturally occurring, and that have similar binding properties as a reference polynucleotide (e.g., DNA or RNA). Examples of such analogs include, but are not limited to, phosphorothioates, phosphoramidates, methyl phosphonates, chiral-methyl phosphonates, 2'-O-methyl ribonucleotides, peptide-nucleic acids (PNAs), Locked Nucleic Acid (LNA™) (Exiqon, Inc., Woburn, MA) nucleosides, glycol nucleic acid, bridged nucleic acids, and morpholino structures.

[0083] Peptide-nucleic acids (PNAs) are synthetic homologs of nucleic acids wherein the polynucleotide phosphate-sugar backbone is replaced by a flexible pseudo-peptide polymer. Nucleobases are linked to the polymer. PNAs have the capacity to hybridize with high affinity and specificity to complementary sequences of RNA and DNA.

[0084] In phosphorothioate nucleic acids, the phosphorothioate (PS) bond substitutes a sulfur atom for a non-bridging oxygen in the polynucleotide phosphate backbone. This modification makes the internucleotide linkage resistant to nuclease degradation. In some embodiments, phosphorothioate bonds are introduced between the last 3 to 5 nucleotides at the 5’-end or 3’-end sequences of a polynucleotide sequence to inhibit exonuclease degradation. Placement of phosphorothioate bonds throughout an entire oligonucleotide helps reduce degradation by endonucleases as well.

[0085] Threose nucleic acid (TNA) is an artificial genetic polymer. The backbone structure of TNA comprises repeating threose sugars linked by phosphodiester bonds. TNA polymers are resistant to nuclease degradation. TNA can self-assemble by base-pair hydrogen bonding into duplex structures.

[0086] Linkage inversions can be introduced into polynucleotides through use of “reversed phosphoramidites” (see, e.g., www.ucalgary.ca/dnalab/synthesis/-modifications/linkages). A 3’-3’ linkage at a terminus of a polynucleotide stabilizes the polynucleotide to exonuclease degradation by creating an oligonucleotide having two 5’-OH termini but lacking a 3’-OH terminus. Typically, such polynucleotides have phosphoramidite groups on the 5’-OH position and a dimethoxytrityl (DMT) protecting group on the 3’-OH position. Normally, the DMT protecting group is on the 5’-OH and the phosphoramidite is on the 3’-OH.

[0087] Polynucleotide sequences are displayed herein in the conventional 5’ to 3’ orientation unless otherwise indicated.

[0088] As used herein, “sequence identity” generally refers to the percent identity of nucleotide bases or amino acids comparing a first polynucleotide or polypeptide to a second polynucleotide or polypeptide using algorithms having various weighting parameters. Sequence identity between two polynucleotides or two polypeptides can be determined using sequence alignment by various methods and computer programs (e.g., BLAST, CS-BLAST, FASTA, HMMER, L-ALIGN, and the like) available through the worldwide web at sites including, but not limited to, GENBANK (www.ncbi.nlm.nih.gov/genbank/) and EMBL-EBI (www.ebi.ac.uk). Sequence identity between two polynucleotides or two polypeptide sequences is generally calculated using the standard default parameters of the various methods or computer programs. A high degree of sequence identity, as used herein, between two polynucleotides or two polypeptides is typically between about 90% identity and 100% identity, for example, about 90% identity or higher, preferably about 95% identity or higher, more preferably about 98% identity or higher. A moderate degree of sequence identity, as used herein, between two polynucleotides or two polypeptides is typically between about 80% identity and about 85% identity, for example, about 80% identity or higher, preferably about 85% identity. A low degree of sequence identity, as used herein, between two polynucleotides or two polypeptides is typically between about 50% identity and 75% identity, for example, about 50% identity, preferably about 60% identity, more preferably about 75% identity. For example, a Cas protein (e.g., a Cas9 comprising amino acid substitutions) can have a low degree of sequence identity, a moderate degree of sequence identity, or a high degree of sequence identity, over its length to a reference Cas protein (e.g., a wild-type Cas9). As another example, a NATNA can have a low degree of sequence identity, a moderate degree of sequence identity, or a high degree of sequence identity, over its length compared to a reference wild-type polynucleotide that complexes with the reference Cas protein (e.g., an sgRNA that forms a complex with Cas9).
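The high, moderate, and low ranges of paragraph [0088] can be applied to a computed percent identity as in the following sketch. It assumes the two sequences have already been aligned (for example, by one of the programs named above) and are supplied as equal-length strings with `-` marking gaps. Counting gap columns as mismatches and dividing by the alignment length is one convention among several; the function names are hypothetical, and the "about" boundaries of the text are treated here as exact cutoffs for simplicity.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns holding the same residue in both rows.

    Columns with a gap ('-') in either row count as mismatches.
    """
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("aligned sequences must be non-empty and equal in length")
    matches = sum(a == b and a != "-" for a, b in zip(aligned_a, aligned_b))
    return 100.0 * matches / len(aligned_a)


def identity_degree(pct: float) -> str:
    """Map a percent identity onto the approximate ranges of paragraph [0088]."""
    if pct >= 90:
        return "high"
    if 80 <= pct <= 85:
        return "moderate"
    if 50 <= pct <= 75:
        return "low"
    return "unclassified"  # outside or between the stated ranges


pct = percent_identity("ACGUACGUAC", "ACGUACGAAC")
print(pct, identity_degree(pct))
```

Values that fall between the stated ranges (for example, 87%) are reported as unclassified rather than forced into a category, since the text does not assign them.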

[0089] As used herein, “hybridization” or “hybridize” or “hybridizing” is the process of combining two complementary single-stranded DNA or RNA molecules so as to form a single double-stranded molecule (DNA/DNA, DNA/RNA, RNA/RNA) through hydrogen base pairing. Hybridization stringency is typically determined by the hybridization temperature and the salt concentration of the hybridization buffer; e.g., high temperature and low salt provide high stringency hybridization conditions. Examples of salt concentration ranges and temperature ranges for different hybridization conditions are as follows: high stringency, approximately 0.01M to approximately 0.05M salt, hybridization temperature 5°C to 10°C below Tm; moderate stringency, approximately 0.16M to approximately 0.33M salt, hybridization temperature 20°C to 29°C below Tm; and low stringency, approximately 0.33M to approximately 0.82M salt, hybridization temperature 40°C to 48°C below Tm. Tm of duplex nucleic acid sequences is calculated by standard methods well-known in the art (see, e.g., Maniatis, T., et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press: New York (1982); Casey, J., et al., Nucleic Acids Research 4:1539-1552 (1977); Bodkin, D.K., et al., Journal of Virological Methods 10(1):45-52 (1985); Wallace, R.B., et al., Nucleic Acids Research 9(4):879-894 (1981)). Algorithm prediction tools to estimate Tm are also widely available. High stringency conditions for hybridization typically refer to conditions under which a polynucleotide complementary to a target sequence predominantly hybridizes with the target sequence, and substantially does not hybridize to non-target sequences. Typically, hybridization conditions are of moderate stringency, preferably high stringency.

[0090] As used herein, “complementarity” refers to the ability of a nucleic acid sequence to form hydrogen bond(s) with another nucleic acid sequence (e.g., through canonical Watson-Crick base pairing). A percent complementarity indicates the percentage of residues in a nucleic acid sequence that can form hydrogen bonds with a second nucleic acid sequence. If two nucleic acid sequences have 100% complementarity, the two sequences are perfectly complementary, i.e., all of the contiguous residues of a first polynucleotide hydrogen bond with the same number of contiguous residues in a second polynucleotide.
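Percent complementarity as defined above can be sketched for the simplest case. The example assumes two sequences of equal length, both written 5' to 3', that anneal antiparallel without gaps or bulges, and it counts only canonical Watson-Crick pairs; the function name and these simplifications are assumptions for illustration, not a method stated in the text.

```python
WC_PAIRS = {("A", "T"), ("T", "A"), ("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def percent_complementarity(first: str, second: str) -> float:
    """Percent of residues in `first` that Watson-Crick pair with `second`.

    Both sequences are written 5'->3'. Because the strands anneal
    antiparallel, residue i of `first` faces residue i of the reversed
    `second`.
    """
    if len(first) != len(second) or not first:
        raise ValueError("sequences must be non-empty and equal in length")
    facing = zip(first.upper(), reversed(second.upper()))
    paired = sum((a, b) in WC_PAIRS for a, b in facing)
    return 100.0 * paired / len(first)


print(percent_complementarity("GGAUCC", "GGAUCC"))  # self-complementary sequence
print(percent_complementarity("ACGU", "ACGA"))      # one mismatched position
```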

[0091] As used herein, “binding” refers to a non-covalent interaction between macromolecules (e.g., between a protein and a polynucleotide, between a polynucleotide and a polynucleotide, or between a protein and a protein, and the like). Such non-covalent interaction is also referred to as “associating” or “interacting” (e.g., if a first macromolecule interacts with a second macromolecule, the first macromolecule binds to the second macromolecule in a non-covalent manner). Some portions of a binding interaction may be sequence-specific (the terms “sequence-specific binding,” “sequence-specifically bind,” “site-specific binding,” and “site-specifically binds” are used interchangeably herein). Sequence-specific binding, as used herein, typically refers to one or more NATNAs capable of forming a complex with a protein (e.g., Cas9) to cause the protein to bind a nucleic acid sequence (e.g., a DNA sequence) comprising a nucleic acid target sequence (e.g., a target DNA sequence) preferentially relative to a second nucleic acid sequence (e.g., a second DNA sequence) without the nucleic acid target binding sequence (e.g., the DNA target binding sequence). Not all components of a binding interaction need to be sequence-specific (e.g., contacts of a protein with phosphate residues in a DNA backbone). Binding interactions can be characterized by a dissociation constant (Kd). “Binding affinity” refers to the strength of the binding interaction. An increased binding affinity is correlated with a lower Kd.
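The inverse relationship between binding affinity and Kd can be made concrete with the standard single-site equilibrium expression, fraction bound = [L] / ([L] + Kd). This model is an assumption brought in for illustration (1:1 binding, with free ligand concentration approximated by total ligand concentration); it is not stated in the text, and the function name and the concentrations are hypothetical.

```python
def fraction_bound(ligand_conc: float, kd: float) -> float:
    """Equilibrium fraction of a macromolecule bound in a 1:1 interaction.

    `ligand_conc` and `kd` must be expressed in the same concentration unit.
    """
    if ligand_conc < 0 or kd <= 0:
        raise ValueError("concentration must be >= 0 and Kd must be > 0")
    return ligand_conc / (ligand_conc + kd)


# At a fixed ligand concentration (10, arbitrary units), a lower Kd gives a
# larger bound fraction, i.e., a higher binding affinity.
for kd in (1.0, 10.0, 100.0):
    print(f"Kd = {kd:6.1f}  fraction bound = {fraction_bound(10.0, kd):.3f}")
```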

[0092] As used herein, a Cas protein is said to “target” a polynucleotide if a Cas protein/NATNA nucleoprotein complex binds or cleaves a polynucleotide at the nucleic acid target sequence within the polynucleotide.

[0093] As used herein, “double-strand break” (DSB) refers to both strands of a double-stranded segment of DNA being severed. In some instances, if such a break occurs, one strand can be said to have a “sticky end” wherein nucleotides are exposed and not hydrogen bonded to nucleotides on the other strand. In other instances, a “blunt end” can occur wherein both strands remain fully base paired with each other.
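The distinction between sticky and blunt ends follows from where each strand is cut, as the sketch below shows. It assumes a coordinate convention introduced only for this example: positions count base pairs from the left end of the duplex, the top strand is written 5' to 3' from left to right, and each strand is cut immediately after the given position. The function name `describe_dsb` is hypothetical.

```python
def describe_dsb(top_cut: int, bottom_cut: int) -> str:
    """Classify the ends left by a double-strand break.

    Equal cut positions leave every base paired (blunt ends). Offset cuts
    leave unpaired nucleotides (sticky ends); the overhang is 5' when the
    top strand is cut to the left of the bottom strand, and 3' otherwise.
    """
    overhang = abs(top_cut - bottom_cut)
    if overhang == 0:
        return "blunt ends"
    polarity = "5'" if top_cut < bottom_cut else "3'"
    return f"sticky ends with {overhang}-nt {polarity} overhangs"


print(describe_dsb(3, 3))
print(describe_dsb(1, 5))  # offset cuts, as with EcoRI at GAATTC
```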

[0094] “Donor polynucleotide,” “donor oligonucleotide,” and “donor template” are used interchangeably herein and can be a double-stranded polynucleotide (e.g., DNA), a single-stranded polynucleotide (e.g., DNA or RNA), or a combination thereof. Donor polynucleotides can comprise homology arms flanking the insertion sequence (e.g., DSBs in the DNA). The homology arms on each side can vary in length. Parameters for the design and construction of donor polynucleotides are well-known in the art (see, e.g., Ran, F., et al., Nature Protocols 8(11):2281-2308 (2013); Smithies, O., et al., Nature 317:230-234 (1985); Thomas, K., et al., Cell 44:419-428 (1986); Wu, S., et al., Nature Protocols 3:1056-1076 (2008); Singer, B., et al., Cell 31:25-33 (1982); Shen, P., et al., Genetics 112:441-457 (1986); Watt, V., et al., Proceedings of the National Academy of Sciences of the United States of America 82:4768-4772 (1985); Sugawara, N., et al., Journal of Molecular Cell Biology 12(2):563-575 (1992); Rubnitz, J., et al., Journal of Molecular Cell Biology 4(11):2253-2258 (1984); Ayares, D., et al., Proceedings of the National Academy of Sciences of the United States of America 83(14):5199-5203 (1986); Liskay, R., et al., Genetics 115(1):161-167 (1987)).

[0095] As used herein, “homology-directed repair” (HDR) refers to DNA repair that takes place in cells, for example, during repair of a DSB in DNA. HDR requires nucleotide sequence homology and uses a donor polynucleotide to repair the sequence wherein the DSB (e.g., within a target DNA sequence) occurred. The donor polynucleotide generally has the requisite sequence homology with the sequence flanking the DSB so that the donor polynucleotide can serve as a suitable template for repair. HDR results in the transfer of genetic information from, for example, the donor polynucleotide to the target DNA sequence. HDR may result in alteration of the target DNA sequence (e.g., insertion, deletion, or mutation) if the donor polynucleotide sequence differs from the target DNA sequence and part or all of the donor polynucleotide is incorporated into the target DNA sequence. In some embodiments, an entire donor polynucleotide, a portion of the donor polynucleotide, or a copy of the donor polynucleotide is integrated at the site of the target DNA sequence. For example, a donor polynucleotide can be used for repair of the break in the target DNA sequence, wherein the repair results in the transfer of genetic information (i.e., polynucleotide sequences) from the donor polynucleotide at the site or in close proximity of the break in the DNA. Accordingly, new genetic information (i.e., polynucleotide sequences) may be inserted or copied at a target DNA sequence.

[0096] A“genomic region” is a segment of a chromosome in the genome of a host cell that is present on either side of the nucleic acid target sequence site or, alternatively, also includes a portion of the nucleic acid target sequence site. The homology arms of the donor polynucleotide have sufficient homology to undergo homologous recombination with the corresponding genomic regions. In some embodiments, the homology arms of the donor polynucleotide share significant sequence homology to the genomic region immediately flanking the nucleic acid target sequence site; it is recognized that the homology arms can be designed to have sufficient homology to genomic regions farther from the nucleic acid target sequence site.
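The role of homology arms in transferring donor sequence to a target can be shown with a deliberately simplified string model. It assumes exact, unique matches between each homology arm and the target, whereas real HDR tolerates imperfect homology and depends on cellular repair machinery; the function name `hdr_repair` and the sequences are hypothetical.

```python
def hdr_repair(target: str, left_arm: str, insert: str, right_arm: str) -> str:
    """Toy model of HDR: copy `insert` from a donor into `target`.

    The donor is left_arm + insert + right_arm. Each arm must match the
    target exactly once, with the left arm upstream of the right arm. The
    target sequence between the two arms is replaced by `insert`.
    """
    if target.count(left_arm) != 1 or target.count(right_arm) != 1:
        raise ValueError("each homology arm must match the target exactly once")
    left_end = target.index(left_arm) + len(left_arm)
    right_start = target.index(right_arm)
    if right_start < left_end:
        raise ValueError("the left arm must lie upstream of the right arm")
    return target[:left_end] + insert + target[right_start:]


# Insertion: the arms flank the break site and the donor carries new sequence.
print(hdr_repair("AAACCCGGGTTT", "AAACCC", "TAG", "GGGTTT"))
```

Passing an empty `insert` with arms that lie apart in the target models a deletion, and passing an `insert` that differs from the intervening target sequence models a substitution.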

[0097] As used herein, “non-homologous end joining” (NHEJ) refers to the repair of a DSB in DNA by direct ligation of one terminus of the break to the other terminus of the break without a requirement for a donor polynucleotide. NHEJ is a DNA repair pathway available to cells to repair DNA without the use of a repair template. NHEJ in the absence of a donor polynucleotide often results in nucleotides being randomly inserted or deleted at the site of the DSB.

[0098] “Microhomology-mediated end joining” (MMEJ) is a pathway for repairing a DSB in DNA. MMEJ involves deletions flanking a DSB and alignment of microhomologous sequences internal to the break site before joining. MMEJ is genetically defined and requires the activity of, for example, CtIP, Poly(ADP-Ribose) Polymerase 1 (PARP1), DNA polymerase theta (Pol θ), DNA Ligase 1 (Lig 1), or DNA Ligase 3 (Lig 3). Additional genetic components are known in the art (see, e.g., Sfeir, A., et al., Trends in Biochemical Sciences 40:701-714 (2015)).

[0099] As used herein, “DNA repair” encompasses any process whereby cellular machinery repairs damage to a DNA molecule contained in the cell. The damage repaired can include single-strand breaks or double-strand breaks (DSBs). At least three mechanisms exist to repair DSBs: HDR, NHEJ, and MMEJ. “DNA repair” is also used herein to refer to DNA repair resulting from human manipulation, wherein a target locus is modified, e.g., by inserting, deleting, or substituting nucleotides, all of which represent forms of genome editing.

[00100] As used herein, “recombination” refers to a process of exchange of genetic information between two polynucleotides.

[00101] As used herein, the terms “regulatory sequences,” “regulatory elements,” and “control elements” are interchangeable and refer to polynucleotide sequences that are upstream (5’ non-coding sequences), within, or downstream (3’ non-translated sequences) of a polynucleotide target to be expressed. Regulatory sequences influence, for example, the timing of transcription, amount or level of transcription, RNA processing or stability, and/or translation of the related structural nucleotide sequence. Regulatory sequences may include activator binding sequences, enhancers, introns, polyadenylation recognition sequences, promoters, transcription start sites, repressor binding sequences, stem-loop structures, translational initiation sequences, internal ribosome entry sites (IRES), translation leader sequences, transcription termination sequences (e.g., polyadenylation signals and poly-U sequences), translation termination sequences, primer binding sites, and the like.

[00102] Regulatory elements include those that direct constitutive, inducible, and repressible expression of a nucleotide sequence in many types of host cells and those that direct expression of the nucleotide sequence only in certain host cells (e.g., tissue-specific regulatory sequences). In some embodiments, a vector comprises one or more pol III promoters, one or more pol II promoters, one or more pol I promoters, or combinations thereof. Examples of pol III promoters include, but are not limited to, U6 and H1 promoters. Examples of pol II promoters include, but are not limited to, the retroviral Rous sarcoma virus (RSV) LTR promoter (optionally with the RSV enhancer), the cytomegalovirus (CMV) promoter (optionally with the CMV enhancer; see, e.g., Boshart, M., et al., Cell 41:521-530 (1985)), the SV40 promoter, the dihydrofolate reductase promoter, the β-actin promoter, the phosphoglycerate kinase (PGK) promoter, and the EF1α promoter. It will be appreciated by those skilled in the art that the design of an expression vector may depend on such factors as the choice of the host cell to be transformed, the level of expression desired, and the like. A vector can be introduced into host cells to thereby produce transcripts, proteins, or peptides, including fusion proteins or peptides, encoded by nucleic acid sequences as described herein.

[00103] “Gene,” as used herein, refers to a polynucleotide sequence comprising exon(s) and related regulatory sequences. A gene may further comprise intron(s) and/or untranslated region(s) (UTR(s)).

[00104] As used herein, the term “operably linked” refers to polynucleotide sequences or amino acid sequences placed into a functional relationship with one another. For example, regulatory sequences (e.g., a promoter or enhancer) are “operably linked” to a polynucleotide encoding a gene product if the regulatory sequences regulate or contribute to the modulation of the transcription of the polynucleotide. Operably linked regulatory elements are typically contiguous with the coding sequence. However, enhancers can function if separated from a promoter by up to several kilobases or more. Accordingly, some regulatory elements may be operably linked to a polynucleotide sequence but not contiguous with the polynucleotide sequence. Similarly, translational regulatory elements contribute to the modulation of protein expression from a polynucleotide.

[00105] As used herein, “expression” refers to transcription of a polynucleotide from a DNA template, resulting in, for example, a messenger RNA (mRNA) or other RNA transcript (e.g., non-coding, such as structural or scaffolding RNAs). The term further refers to the process through which transcribed mRNA is translated into peptides, polypeptides, or proteins. Transcripts and encoded polypeptides may be referred to collectively as “gene product(s).” Expression may include splicing the mRNA in a eukaryotic cell, if the polynucleotide is derived from genomic DNA.

[00106] As used herein, the term “modulate” refers to a change in the quantity, degree or amount of a function. For example, a dht-NATNA/Cas9 nucleoprotein complex, as disclosed herein, may modulate the activity of a promoter sequence by binding to a nucleic acid target sequence at or near the promoter. Depending on the action occurring after binding, the dht-NATNA/Cas9 nucleoprotein complex can induce, enhance, suppress, or inhibit transcription of a gene operatively linked to the promoter sequence. Thus, “modulation” of gene expression includes both gene activation and gene repression.

[00107] Modulation can be assayed by determining any characteristic directly or indirectly affected by the expression of the target gene. Such characteristics include, for example, changes in RNA or protein levels, protein activity, product levels, expression of the gene, or activity level of reporter genes. Accordingly, the terms “modulating expression,” “inhibiting expression,” and “activating expression” of a gene can refer to the ability of a dht-NATNA/Cas9 nucleoprotein complex to change, activate, or inhibit transcription of a gene.

[00108] “Vector” and “plasmid,” as used herein, refer to a polynucleotide vehicle to introduce genetic material into a cell. Vectors can be linear or circular. Vectors can contain a replication sequence capable of effecting replication of the vector in a suitable host cell (i.e., an origin of replication). Upon transformation of a suitable host, the vector can replicate and function independently of the host genome or integrate into the host genome. Vector design depends, among other things, on the intended use and host cell for the vector, and the design of a vector of the invention for a particular use and host cell is within the level of skill in the art. The four major types of vectors are plasmids, viral vectors, cosmids, and artificial chromosomes. Typically, vectors comprise an origin of replication, a multicloning site, and/or a selectable marker. An expression vector typically comprises an expression cassette.

[00109] As used herein, “expression cassette” refers to a polynucleotide construct generated using recombinant methods or by synthetic means and comprising regulatory sequences operably linked to a selected polynucleotide to facilitate expression of the selected polynucleotide in a host cell. For example, the regulatory sequences can facilitate transcription of the selected polynucleotide in a host cell, or transcription and translation of the selected polynucleotide in a host cell. An expression cassette can, for example, be integrated in the genome of a host cell or be present in a vector to form an expression vector.

[00110] As used herein, a “targeting vector” is a recombinant DNA construct typically comprising tailored DNA arms, homologous to genomic DNA, that flank elements of a target gene or nucleic acid target sequence (e.g., a DSB). A targeting vector comprises a donor polynucleotide. Elements of the target gene can be modified in a number of ways including deletions and/or insertions. A defective target gene can be replaced by a functional target gene, or in the alternative a functional gene can be knocked out. Optionally, the donor polynucleotide of a targeting vector comprises a selection cassette comprising a selectable marker that is introduced into the target gene. Targeting regions (i.e., nucleic acid target sequences) adjacent or within a target gene can be used to affect regulation of gene expression.

[00111] As used herein, the term "between" is inclusive of end values in a given range (e.g., between about 1 and about 50 nucleotides in length includes 1 nucleotide and 50 nucleotides).

[00112] As used herein, the term “amino acid” refers to natural and synthetic (unnatural) amino acids, including amino acid analogs, modified amino acids, peptidomimetics, glycine, and D or L optical isomers.

[00113] As used herein, the terms “peptide,” “polypeptide,” and “protein” are interchangeable and refer to polymers of amino acids. A polypeptide may be of any length. It may be branched or linear, it may be interrupted by non-amino acids, and it may comprise modified amino acids. The terms also refer to an amino acid polymer that has been modified through, for example, acetylation, disulfide bond formation, glycosylation, lipidation, phosphorylation, pegylation, biotinylation, cross-linking, and/or conjugation (e.g., with a labeling component or ligand). Polypeptide sequences are displayed herein in the conventional N-terminal to C-terminal orientation, unless otherwise indicated.

[00114] Polypeptides and polynucleotides can be made using routine techniques in the field of molecular biology (see, e.g., standard texts discussed above). Furthermore, essentially any polypeptide or polynucleotide is available from commercial sources.

[00115] The terms “fusion protein” and “chimeric protein,” as used herein, refer to a single protein created by joining two or more proteins, protein domains, or protein fragments that do not naturally occur together in a single protein. For example, a fusion protein can contain a first domain from a Cas9 protein and a second domain from a Csy4 protein. The modification to include such domains in fusion proteins may confer additional activity on the modified site-directed polypeptides. Such activities can include nuclease activity, methyltransferase activity, demethylase activity, DNA repair activity, DNA damage activity, deamination activity, dismutase activity, alkylation activity, depurination activity, oxidation activity, pyrimidine dimer forming activity, integrase activity, transposase activity, recombinase activity, polymerase activity, ligase activity, helicase activity, photolyase activity, glycosylase activity, acetyltransferase activity, deacetylase activity, kinase activity, phosphatase activity, ubiquitin ligase activity, deubiquitinating activity, adenylation activity, deadenylation activity, SUMOylating activity, deSUMOylating activity, ribosylation activity, deribosylation activity, and/or myristoylation activity or demyristoylation activity that modifies a polypeptide associated with a nucleic acid target sequence (e.g., a histone).

[00116] A fusion protein can also comprise epitope tags (e.g., histidine tags, FLAG® (Sigma Aldrich, St. Louis, MO) tags, Myc tags), reporter protein sequences (e.g., glutathione-S-transferase, beta-galactosidase, luciferase, green fluorescent protein, cyan fluorescent protein, yellow fluorescent protein), and/or nucleic acid sequence binding domains (e.g., a DNA binding domain or an RNA binding domain). A fusion protein can also comprise activator domains (e.g., heat shock transcription factors, NFκB activators) or repressor domains (e.g., a KRAB domain). As described by Lupo, A., et al., Current Genomics 14(4):268-278 (2013), the KRAB domain is a potent transcriptional repression module and is located in the amino-terminal sequence of most C2H2 zinc finger proteins (see, e.g., Margolin, J., et al., Proceedings of the National Academy of Sciences of the United States of America 91:4509-4513 (1994); Witzgall, R., et al., Proceedings of the National Academy of Sciences of the United States of America 91:4514-4518 (1994)). The KRAB domain typically binds to co-repressor proteins and/or transcription factors via protein-protein interactions, causing transcriptional repression of genes to which KRAB zinc finger proteins (KRAB-ZFPs) bind (see, e.g., Friedman J.R., et al., Genes & Development 10:2067-2078 (1996)). In some embodiments, linker nucleic acid sequences are used to join the two or more proteins, protein domains, or protein fragments.

[00117] A “moiety,” as used herein, refers to a portion of a molecule. A moiety can be a functional group or describe a portion of a molecule with multiple functional groups (e.g., that share common structural aspects). The terms “moiety” and “functional group” are typically used interchangeably; however, a “functional group” can more specifically refer to a portion of a molecule that comprises some common chemical behavior. “Moiety” is often used as a structural description. In some embodiments, a 5’ terminus, a 3’ terminus, or a 5’ terminus and a 3’ terminus (e.g., a non-native 5' terminus and/or a non-native 3' terminus in a first stem element) can comprise one or more moieties.

[00118] The term “affinity tag,” as used herein, typically refers to one or more moieties that increase the binding affinity of a dht-casPN to a Cas protein, for example, to facilitate formation of a dht-NATNA/Cas9 nucleoprotein complex. In some embodiments, an affinity tag can be used to increase the binding affinity of any dht-casPN of a dht-NATNA to a Cas protein (e.g., Cas9). In some embodiments, an affinity tag can be used to increase the binding affinity of a first dht-casPN for a second dht-casPN. Some embodiments of the present invention use an “affinity sequence,” which is a polynucleotide sequence comprising one or more affinity tags. Some embodiments of the present invention introduce one or more affinity tags to the N-terminal of a Cas protein sequence (e.g., a Cas9 protein sequence), to the C-terminal of a Cas protein sequence, to a position located between the N-terminal and C-terminal of a Cas protein sequence, or to combinations thereof. In some embodiments of the present invention, one or more dht-casPNs of a dht-NATNA comprises an affinity sequence wherein the affinity sequence is located, for example, at or within the 5’-end nucleic acid sequence, the 3’-end nucleic acid sequence, both the 5’-end and 3’-end nucleic acid sequences, or a position between the 5’-end nucleic acid sequence and 3’-end nucleic acid sequence of a dht-casPN, as well as combinations thereof. A wide variety of affinity tags are disclosed in U.S. Published Patent Application No. 2014-0315985, published 23 October 2014.

[00119] As used herein, a “cross-link” is a bond that links one polymer chain (e.g., a polynucleotide or polypeptide) to another. Such bonds can be covalent bonds or ionic bonds. In some embodiments, one polynucleotide can be bound to another polynucleotide by cross-linking the polynucleotides. In other embodiments, a polynucleotide can be cross-linked to a polypeptide. In additional embodiments, a polypeptide can be cross-linked to a polypeptide.

[00120] The term “cross-linking moiety,” as used herein, typically refers to a moiety suitable to provide cross-linking between a dht-casPN and a cognate Cas protein (e.g., Cas9). A cross-linking moiety is another example of an affinity tag.

[00121] The terms “ligand” and “ligand-binding moiety,” as used herein, refer to moieties that can facilitate the binding of a dht-casPN and a cognate Cas protein (e.g., Cas9). Ligands and ligand-binding moieties are paired affinity tags.

[00122] As used herein, a “host cell” generally refers to a biological cell. A cell is the basic structural, functional, and/or biological unit of an organism. A cell can originate from any organism having one or more cells. Examples of host cells include, but are not limited to, a prokaryotic cell, eukaryotic cell, a bacterial cell, an archaeal cell, a cell of a single-cell eukaryotic organism, a cell of a eukaryotic organism, a protozoal cell, a cell from a plant (e.g., cells from plant crops (such as soy, tomatoes, sugar beets, pumpkin, hay, cannabis, tobacco, plantains, yams, sweet potatoes, cassava, potatoes, wheat, sorghum, soybean, rice, corn, maize, oil-producing Brassica (e.g., oil-producing rapeseed and canola), cotton, sugar cane, sunflower, millet, and alfalfa), fruits, vegetables, grains, seeds, flowering plants, conifers, gymnosperms, ferns, clubmosses, hornworts, liverworts, mosses), an algal cell (e.g., Botryococcus braunii, Chlamydomonas reinhardtii, Nannochloropsis gaditana, Chlorella pyrenoidosa, Sargassum patens C. agardh, and the like), seaweeds (e.g., kelp), a fungal cell (e.g., a yeast cell or a cell from a mushroom), an animal cell, a cell from an invertebrate animal (e.g., fruit fly, cnidarian, echinoderm, nematode, and the like), a cell from a vertebrate animal (e.g., fish, amphibian, reptile, bird, or mammal), a cell from a mammal (e.g., a pig, a cow, a goat, a sheep, a rodent, a rat, a mouse, a non-human primate, a human, and the like). Furthermore, a cell can be a stem cell or a progenitor cell. In some embodiments, a host cell is a non-human cell. In some embodiments, a host cell is a human cell outside of a human body, wherein in particular embodiments the human cell is not introduced into a human body.

[00123] As used herein, “stem cell” refers to a cell that has the capacity for self-renewal, i.e., the ability to go through numerous cycles of cell division while maintaining the undifferentiated state. Stem cells can be totipotent, pluripotent, multipotent, oligopotent, or unipotent. Stem cells can be embryonic, fetal, amniotic, adult, or induced pluripotent stem cells.

[00124] As used herein, “induced pluripotent stem cells” refers to a type of pluripotent stem cell that is artificially derived from a non-pluripotent cell, typically a somatic cell. In some embodiments, the somatic cell is a human somatic cell. Examples of somatic cells include, but are not limited to, dermal fibroblasts, bone marrow-derived mesenchymal cells, cardiac muscle cells, keratinocytes, liver cells, stomach cells, neural stem cells, lung cells, kidney cells, spleen cells, and pancreatic cells. Additional examples of somatic cells include cells of the immune system, including but not limited to, B cells, dendritic cells, granulocytes, innate lymphoid cells, megakaryocytes, monocytes/macrophages, myeloid-derived suppressor cells, natural killer (NK) cells, T cells, thymocytes, and hematopoietic stem cells.

[00125] “Plant,” as used herein, refers to whole plants, plant organs, plant tissues, germplasm, seeds, plant cells, and progeny of the same. Plant cells include, without limitation, cells from seeds, suspension cultures, embryos, meristematic regions, callus tissue, leaves, roots, shoots, gametophytes, sporophytes, pollen and microspores. Plant parts include differentiated and undifferentiated tissues including, but not limited to roots, stems, shoots, leaves, pollens, seeds, tumor tissue and various forms of cells and culture (e.g., single cells, protoplasts, embryos, and callus tissue). The plant tissue may be in plant or in a plant organ, tissue or cell culture.“Plant organ” refers to plant tissue or a group of tissues that constitute a morphologically and functionally distinct part of a plant.

[00126] “Subject,” as used herein, refers to any member of the phylum Chordata, including, without limitation, humans and other primates, including non-human primates such as rhesus macaques, chimpanzees, and other monkey and ape species; farm animals, such as cattle, sheep, pigs, goats, and horses; domestic mammals, such as dogs and cats; laboratory animals, including rabbits, mice, rats, and guinea pigs; birds, including domestic, wild, and game birds, such as chickens, turkeys and other gallinaceous birds, ducks, and geese; and the like. The term does not denote a particular age or gender. Thus, the term includes adult, young, and newborn individuals as well as male and female. In some embodiments, a host cell is derived from a subject (e.g., stem cells, progenitor cells, or tissue-specific cells). In some embodiments, the subject is a non-human subject.

[00127] As used herein, “transgenic organism” refers to an organism whose genome is genetically modified. The term includes the progeny (any generation) of a transgenic organism, provided that the progeny has the genetic modification. In some embodiments, the transgenic organism is a non-human transgenic organism.

[00128] As used herein, “isolated” can refer to a molecule (e.g., a polynucleotide or a polypeptide) that, by human intervention, exists apart from its native environment and is therefore not a product of nature. An isolated polynucleotide or polypeptide can exist in a purified form and/or can exist in a non-native environment such as, for example, in a recombinant cell.

[00129] Aspects of the present invention relate to at least one engineered break in the polynucleotide backbone of a Class 2 Type II CRISPR-Cas9-associated NATNA comprising a helical triplex, wherein the engineered break is located 3' of the nexus. The engineered break can result in one or more non-native 3’ termini and one or more non-native 5’ termini (non-native relative to a wild-type Class 2 Type II CRISPR-Cas9-associated NATNA comprising a helical triplex). In one aspect, the present invention relates to a Class 2 Type II CRISPR-Cas9-associated NATNA composition comprising one or more non-native 3’ termini located 3' of the nucleotide sequence that forms the nexus stem element through hydrogen-bond interactions, and one or more non-native 5’ termini that result from the engineered break that generated the non-native 3' termini. In a preferred aspect, the composition is capable of forming a complex with a cognate Cas9 protein, and the complex preferentially binds a nucleic acid target sequence complementary to a nucleic acid target binding sequence in a polynucleotide relative to a polynucleotide that does not comprise the nucleic acid target sequence.

[00130] In a first aspect, the present invention relates to Class 2 Type II CRISPR-Cas9-associated discontinuous-helical triplex nucleic acid-targeting nucleic acid (dht-NATNA) compositions. A dht-NATNA composition comprises at least one engineered break in a Class 2 Type II CRISPR-Cas9-associated NATNA comprising a helical triplex, wherein (i) the break is located 3' of the nexus (e.g., FIG. 2, between 210 and 216; FIG. 3, between 310 and 316) resulting in one or more non-native 3’ termini and one or more non-native 5’ termini, (ii) the dht-NATNA composition is capable of forming a nucleoprotein complex with a cognate Cas9 protein, and (iii) the complex is capable of binding or binding/cleaving a nucleic acid target sequence complementary to the nucleic acid target binding sequence of the dht-NATNA. A dht-NATNA comprises two or more dht-casPNs. The dht-casPNs typically connect with each other through hydrogen bonds to form the dht-NATNA.

[00131] Types of hydrogen bonds are discussed above. Embodiments of the present invention include, but are not limited to, the following types of hydrogen bonds in pairs of hydrogen-bonded nucleotides: W-C hydrogen bonding, reverse W-C hydrogen bonding, Hoogsteen hydrogen bonding, reverse Hoogsteen hydrogen bonding, wobble hydrogen bonding, reverse wobble hydrogen bonding, or combinations thereof.
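
As a rough illustration of how pairing between two strands might be screened at the sequence level, the following Python sketch labels antiparallel nucleotide pairs as W-C or G·U wobble. It covers only those two pair types; Hoogsteen and the reverse geometries depend on three-dimensional structure and are not inferred here. The function name `classify_pairs` and the example sequences are hypothetical and are not taken from the disclosure.

```python
# Illustrative sketch only: sequence-level pairing check for two RNA strands.
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}


def classify_pairs(strand_5to3, partner_5to3):
    """Pair each position of one strand with the antiparallel partner and label it."""
    if len(strand_5to3) != len(partner_5to3):
        raise ValueError("strands must be the same length for this simple check")
    partner_3to5 = partner_5to3[::-1]  # read the partner 3' to 5' so the strands are antiparallel
    labels = []
    for a, b in zip(strand_5to3.upper(), partner_3to5.upper()):
        if (a, b) in WATSON_CRICK:
            labels.append("W-C")
        elif (a, b) in WOBBLE:
            labels.append("wobble")
        else:
            labels.append("mismatch")
    return labels


# Hypothetical 4-nt stem halves (not sequences from the disclosure).
print(classify_pairs("GGAU", "GUCC"))  # ['W-C', 'W-C', 'W-C', 'wobble']
```

A check of this kind only suggests where pairing is possible; the experimental and predictive methods described in the following paragraphs are what establish whether the hydrogen bonds actually form.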

[00132] One method to determine the presence of hydrogen bonds in pairs of hydrogen-bonded nucleotides is prediction of the secondary structure of each polynucleotide (see, e.g., Ran, F.A., et al., Nature 520(7546):186-191 (2015); Zuker, M., Mfold web server for nucleic acid sequence folding and hybridization prediction, Nucleic Acids Research 31:3406-3415 (2003)).

[00133] Methods are known to those of ordinary skill in the art to determine the presence of hydrogen bonds in pairs of hydrogen-bonded nucleotides. For example, experimental techniques include, but are not limited to, X-ray crystallography, nuclear magnetic resonance (NMR) spectroscopy, cryo-electron microscopy (Cryo-EM), chemical/enzymatic probing, thermal denaturation (melting studies), and mass spectrometry. Predictive techniques can be employed, such as computational structure prediction (see, e.g., Ran, F.A., et al., Nature 520(7546):186-191 (2015); Zuker, M., Mfold web server for nucleic acid sequence folding and hybridization prediction, Nucleic Acids Research 31:3406-3415 (2003); “RNAfold web server” (rna.tbi.univie.ac.at/cgi-bin/RNAfold.cgi); Gruber A.R., et al., The Vienna RNA Websuite, Nucleic Acids Research 36(supplement 2):W70-W74 (2008); Lorenz, R., et al., “ViennaRNA Package 2.0,” Algorithms for Molecular Biology 6:26 (2011)) for each dht-NATNA polynucleotide. A preferred method to evaluate RNA secondary structure is to use the combined experimental and computational SHAPE method (Low J.T., et al., Methods 52(2):150-158 (2010)).

[00134] An empirical method to determine whether there is secondary structure created by base-pair hydrogen bonding is analysis on non-denaturing gels (see, e.g., McGookin, R., Methods Molecular Biology 2:93-100 (1985)). In this method, dht-NATNA polynucleotides can be combined in equal molar concentrations in an annealing or hybridization buffer (e.g., 1.25 mM HEPES, 0.625 mM MgCl2, 9.375 mM KCl at pH 7.5; or 20 mM Tris-HCl pH 7.5, 100 mM KCl, 5 mM MgCl2), incubated above the melting temperature of the dht-NATNA polynucleotides and allowed to equilibrate at room temperature. This re-annealed mixture of polynucleotides can be a “combined” dht-NATNA. The same steps can be applied to the individual dht-NATNA polynucleotides. In separate reactions, the same equal molar concentrations of each individual dht-NATNA, as is used for the combined sample polynucleotides, can be processed. After re-annealing, the individual dht-NATNAs can be combined (“separate” dht-NATNAs). The combined and separate samples can be resolved side-by-side on non-denaturing gels. The banding patterns of the combined and separate samples can be compared. Formation of secondary structure is indicated by differences in the banding patterns between the combined and separate samples.

[00135] Two embodiments of the present invention are set forth in FIG. 5A, FIG. 5B, FIG. 5C, and FIG. 5D that illustrate examples of dht-NATNAs. The dht-NATNA illustrated in FIG. 5A and FIG. 5B comprises a first stem-loop whereas the dht-NATNA illustrated in FIG. 5C and FIG. 5D comprises a first stem (i.e., no polynucleotide connects the 3' and 5' ends of the first stem). FIG. 5A and FIG. 5B illustrate a dht-NATNA comprising two polynucleotides (I = dht-casPN1; II = dht-casPN2). FIG. 5C and FIG. 5D illustrate a dht-NATNA comprising three polynucleotides (I = dht-casPN1; II = dht-casPN2; III = dht-casPN3).

[00136] To illustrate additional exemplary embodiments of the present invention, FIG. 6A shows a schematic diagram of six segments (i.e., the circled Arabic numerals 1-6) of a Class 2 Type II CRISPR-Cas9-associated dht-NATNA. Table 4 presents a series of indicators and corresponding nucleotide sequences for FIG. 6A.

[00137]

[00138] The corresponding sequences of FIG. 6B to FIG. 6K can be identified by comparison to FIG. 6A. FIG. 6B to FIG. 6K show examples of dht-NATNA compositions. For convenience, the following naming convention is applied to the dht-NATNAs:

[00139] when segments are joined by a connective sequence, an interpoint (“·”) in the dht-casPN designation indicates a connective sequence between segments (e.g., FIG. 5A: dht-casPN1 is marked with "I" and its corresponding segment designation is dht1·2·3·4-casPN1; dht-casPN2 is marked with "II" and its corresponding segment designation is dht5·6-casPN2);

[00140] when a dht-casPN designation begins with an interpoint, then a connective sequence comes before the first segment (e.g., FIG. 6B: dht-casPN1 is marked with "I" and its corresponding segment designation is dht1·2·3-casPN1; dht-casPN2 is marked with "II" and its corresponding segment designation is dht·4·5·6-casPN2);

[00141] the sequence of segment numbers in the name corresponds to the relative order of the segments given in a 5’ to 3’ direction (e.g., FIG. 6B: dht1·2·3-casPN1, 5'-segment 1-connective sequence-segment 2-connective sequence-segment 3-3'; and dht·4·5·6-casPN2, 5'-connective sequence-segment 4-connective sequence-segment 5-connective sequence-segment 6-3');

[00142] when a segment number occurs at the end of a first dht-casPN designation and also occurs at the beginning of a second dht-casPN designation, then there is a break (i.e., the polynucleotide backbone is discontinuous) in that segment (e.g., FIG. 6D: I = dht1-casPN1; II = dht2·3·4-casPN2; III = dht4·5·6-casPN3); and

[00143] when the first and last numbers of a dht-casPN designation are the same, then the corresponding dht-casPN is circular (e.g., FIG. 6E: I = dht1·2-casPN1; II = dht3·4·5·6·3).

[00144] While not intending to limit the scope of the present invention, the naming convention is set forth to help facilitate understanding of embodiments of the present invention. The naming convention is not intended to be a complete representation of any dht-NATNA.
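
The naming rules above can be sketched as a small parser. The following Python example is illustrative only and assumes designations are written with the interpoint “·”, single-digit segment numbers, and an optional "-casPN" suffix; the function names `parse_designation` and `broken_segments` are hypothetical and are not part of the disclosure.

```python
# Illustrative sketch only: parse dht-casPN designations written with "·" as the interpoint.
import re


def parse_designation(name):
    """Return (segments, leading_connective, circular) for a name such as 'dht1·2·3-casPN1'."""
    match = re.fullmatch(r"dht(·?)(\d(?:·\d)*)(?:-casPN\d+)?", name)
    if match is None:
        raise ValueError(f"unrecognized designation: {name!r}")
    segments = [int(s) for s in match.group(2).split("·")]  # listed 5' to 3'
    leading_connective = match.group(1) == "·"  # connective sequence before the first segment
    circular = len(segments) > 1 and segments[0] == segments[-1]  # same first and last number
    return segments, leading_connective, circular


def broken_segments(names):
    """Segments whose number ends one designation and begins the next (a backbone break)."""
    parsed = [parse_designation(n)[0] for n in names]
    return [a[-1] for a, b in zip(parsed, parsed[1:]) if a[-1] == b[0]]


print(parse_designation("dht·4·5·6-casPN2"))  # ([4, 5, 6], True, False)
print(parse_designation("dht3·4·5·6·3"))      # ([3, 4, 5, 6, 3], False, True)
print(broken_segments(["dht1-casPN1", "dht2·3·4-casPN2", "dht4·5·6-casPN3"]))  # [4]
```

Like the naming convention itself, this sketch records only segment order, connective sequences, breaks, and circularity; it is not a complete representation of any dht-NATNA.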

[00145] FIG. 6E, FIG. 6G, FIG. 6H, FIG. 6I, FIG. 6J, and FIG. 6K illustrate examples of dht-NATNAs that comprise a split-nexus (i.e., a nexus stem structure as opposed to a nexus stem-loop structure). FIG. 6F, FIG. 6G, FIG. 6H, FIG. 6J, and FIG. 6K have a 3' stem (3' stem structure) that corresponds to a 3' stem structure (see, e.g., FIG. 5B) of a 3' stem loop (see, e.g., FIG. 3) without the third loop nucleotide sequence (e.g., compare FIG. 3 with FIG. 5B). Table 5 lists exemplary dht-NATNA structures, as illustrated in FIG. 6B to FIG. 6K. In Table 5, "LS" stands for linking sequence.

[00146]

[00147] Embodiments of the first aspect of the present invention include, but are not limited to, Class 2 Type II CRISPR-Cas9-associated discontinuous-helical triplex nucleic acid targeting nucleic acids (dht-NATNAs) comprising three individual polynucleotides that, when aligned in a 5' to 3' direction (see, e.g., FIG. 6C), comprise a nucleic acid target binding sequence, a first stem, a nexus, a 3' stem, and a helical-triplex forming nucleotide sequence. In one embodiment, a dht-NATNA comprises three Class 2 Type II CRISPR-Cas9-associated discontinuous-helical triplex Cas polynucleotides (dht-casPNs: dht-casPN1, dht-casPN2, and dht-casPN3), each having a 5' end and 3' end, wherein (i) each of the three dht-casPNs is an unbroken (continuous) series of covalently connected nucleotides, that is, each polynucleotide is a polymeric form of nucleotides, (ii) the 3' end of dht-casPN2 is a non-native terminus and the 5' end of dht-casPN3 is a non-native terminus, and (iii) the 3' end of dht-casPN2 and the 5' end of dht-casPN3 are located 3' of the nexus (relative to a Class 2 Type II CRISPR-Cas9-associated NATNA comprising a helical triplex). The dht-NATNA is capable of forming a nucleoprotein complex with a cognate Cas9 protein, and the nucleoprotein complex is capable of binding or binding/cleaving a nucleic acid target sequence complementary to the nucleic acid target binding sequence of the dht-NATNA. The dht-casPNs typically connect with each other through hydrogen bonds to form the dht-NATNA.

[00148] In this embodiment, the first Class 2 Type II CRISPR-Cas9-associated discontinuous-helical triplex polynucleotide (dht-casPN1), having a 5' end and a 3' end, comprises a first segment comprising a spacer sequence and a first stem repeat nucleotide sequence I, wherein the first polynucleotide is a continuous series of covalently connected nucleotides. A second Class 2 Type II CRISPR-Cas9-associated discontinuous-helical triplex polynucleotide (dht-casPN2), having a 5' end and a 3' end, and a third Class 2 Type II CRISPR-Cas polynucleotide (dht-casPN3), having a 5' end and a 3' end, together comprising:

a second segment comprising a first stem repeat nucleotide sequence II, a joining nucleotide sequence, a nexus stem nucleotide sequence I, and a first linker element nucleotide sequence I;

a third segment comprising first linker nucleotide sequence II and a nexus stem nucleotide sequence II;

a second connecting nucleotide sequence covalently connecting the 3' end of the second segment to the 5' end of the third segment;

a fourth segment comprising a third stem nucleotide sequence I;

a third connecting nucleotide sequence covalently connecting the 3' end of the third segment to the 5' end of the fourth segment;

a fifth segment comprising a third stem nucleotide sequence II;

a fourth connecting nucleotide sequence covalently connecting the 3' end of the fourth segment to the 5' end of the fifth segment;

a sixth segment comprising a helical-triplex forming nucleotide sequence; and

a fifth connecting nucleotide sequence covalently connecting the 3' end of the fifth segment to the 5' end of the sixth segment;

wherein the dht-casPN2 is a continuous series of covalently connected nucleotides and the dht-casPN3 is a continuous series of covalently connected nucleotides, and the 3' end of the dht-casPN2 and the 5' end of the dht-casPN3 are between the 5' end of the fourth segment and the 3' end of the sixth segment.

[00149] In preferred embodiments, the first stem repeat nucleotide sequence I is connected through hydrogen base-pair bonds to the first stem repeat nucleotide sequence II and forms a first stem, the nexus stem nucleotide sequence I is connected through hydrogen base-pair bonds to the nexus stem nucleotide sequence II and forms a nexus, the third stem nucleotide sequence I is connected through hydrogen base-pair bonds to the third stem nucleotide sequence II and forms a 3' stem loop, and the triplex-forming nucleotide sequence connects with the nexus through hydrogen base-pair bonding and forms a helical triplex.

[00150] In one embodiment, the dht-casPN2 comprises at least the second segment comprising the first stem repeat nucleotide sequence II, the joining nucleotide sequence, the nexus stem nucleotide sequence I, and the first linker element nucleotide sequence I; and the third segment comprising the first linker nucleotide sequence II and the nexus stem nucleotide sequence II; and the second connecting nucleotide sequence covalently connecting the 3' end of the second segment to the 5' end of the third segment (e.g., FIG. 6C). Furthermore, in this embodiment, the dht-casPN3 comprises at least the fourth segment comprising the third stem nucleotide sequence I, the fifth segment comprising the third stem nucleotide sequence II, and the sixth segment comprising the helical-triplex forming nucleotide sequence (e.g., FIG. 6C).

[00151] In another embodiment, the dht-casPN2 comprises at least the second segment comprising the first stem repeat nucleotide sequence II, the joining nucleotide sequence, the nexus stem nucleotide sequence I, and the first linker element nucleotide sequence I; the third segment comprising the first linker nucleotide sequence II and the nexus stem nucleotide sequence II, and the second connecting nucleotide sequence covalently connecting the 3' end of the second segment to the 5' end of the third segment; and the fourth segment comprising the third stem nucleotide sequence I (e.g., FIG. 5B, FIG. 6F). Furthermore, in this embodiment, the dht-casPN3 comprises at least the fifth segment comprising the third stem nucleotide sequence II, and the sixth segment comprising the helical-triplex forming nucleotide sequence (e.g., FIG. 5B, FIG. 6F).

[00152] In additional embodiments, the 3' end of the first segment is covalently attached to the 5' end of the second segment, thus forming a first stem-loop structure (FIG. 6B, FIG. 6E, FIG. 6F, FIG. 6H, FIG. 6K). In one such embodiment, a dht-NATNA comprises a dht-casPN1 and a dht-casPN2. The dht-casPN1 (e.g., FIG. 5A, I = dht-casPN1 = dht1*2*3*4-casPN1), having a 5' end and a 3' end, comprises in a 5' to 3' direction a spacer, a first stem-loop structure, a nexus (a second stem-loop structure), a connecting nucleotide sequence, and a third stem nucleotide sequence I, wherein the dht-casPN1 has a single, continuous, covalently connected backbone. The dht-casPN2 (e.g., FIG. 5A, II = dht-casPN2 = dht5*6-casPN2), having a 5' end and a 3' end, comprises in a 5' to 3' direction a spacer, a third stem nucleotide sequence II and a triplex-forming nucleotide sequence, wherein the dht-casPN2 has a single, continuous, covalently connected backbone.
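The labeling pattern used in the figure references (e.g., dht1*2*3*4-casPN1 paired with dht5*6-casPN2) can be illustrated with a minimal sketch. It assumes that the six segments are numbered 1 through 6 in the 5' to 3' direction, that each dht-casPN holds a contiguous run of whole segments, and that the separator character is `*`. The function names `name_dht_caspn` and `split_segments` are hypothetical and are not part of the specification.

```python
from typing import List


def name_dht_caspn(segments: List[int], index: int, sep: str = "*") -> str:
    """Build a label such as 'dht1*2*3*4-casPN1' from a run of segment numbers."""
    if not segments:
        raise ValueError("a dht-casPN must contain at least one segment")
    # Each polynucleotide has one continuous backbone, so its segments
    # must form a contiguous, ascending run (e.g., 1,2,3,4 or 5,6).
    if segments != list(range(segments[0], segments[-1] + 1)):
        raise ValueError("segments must be a contiguous, ascending run")
    return "dht" + sep.join(str(s) for s in segments) + f"-casPN{index}"


def split_segments(break_after: List[int], n_segments: int = 6) -> List[str]:
    """Label the polynucleotides produced by backbone breaks.

    break_after lists the segment numbers after which the backbone is
    discontinuous (assumption: breaks fall only between whole segments).
    """
    names = []
    start = 1
    for index, end in enumerate(sorted(break_after) + [n_segments], start=1):
        if not start <= end <= n_segments:
            raise ValueError(f"invalid break position: {end}")
        names.append(name_dht_caspn(list(range(start, end + 1)), index))
        start = end + 1
    return names


# A single break after segment 4 gives the two-polynucleotide layout of FIG. 5A.
print(split_segments([4]))
```

The sketch does not cover embodiments in which a break falls within a segment, so that two dht-casPNs each carry a portion of the same segment.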

[00153] In further embodiments, there is no third connecting nucleotide sequence covalently connecting the 3' end of the third segment to the 5' end of the fourth segment (FIG. 6B, FIG. 6C).

[00154] In other embodiments, there is no fourth connecting nucleotide sequence covalently connecting the 3' end of the fourth segment to the 5' end of the fifth segment; thus, instead of a 3' stem-loop structure, a 3' stem structure is formed (FIG. 6F, FIG. 6G, FIG. 6H, FIG. 6J, FIG. 6K).

[00155] In yet further embodiments, there is no fifth connecting nucleotide sequence covalently connecting the 3' end of the fifth segment to the 5' end of the sixth segment.

[00156] In some embodiments, a first dht-casPN comprises a first portion of a segment and a second dht-casPN comprises a second portion of the segment (e.g., in FIG. 6D, the dht2*3*4-casPN2 comprises a first portion of segment 4, and the dht4*5*6-casPN3 comprises a second portion of segment 4).

[00157] Additional embodiments can comprise, for example, a nucleic acid sequence covalently connected to a 5' end, 3' end, or 5' and 3' ends of a dht-casPN (e.g., FIG. 6J, an additional nucleic acid sequence on the 5' end of Segment 3, and an additional nucleic acid sequence on the 3' end of Segment 6; FIG. 6K, an additional nucleic acid sequence on the 5' end of Segment 5, and an additional nucleic acid sequence on the 3' end of Segment 4). In some embodiments, a nucleic acid sequence can be covalently connected to a non-native 3' terminus, a nucleic acid sequence can be covalently connected to the non-native 5' terminus, or a nucleic acid sequence can be covalently connected to the non-native 3' terminus and a nucleic acid sequence can be covalently connected to the non-native 5' terminus wherein the polynucleotide sequence covalently connected to the non-native 3' terminus and a polynucleotide sequence covalently connected to the non-native 5' terminus are also connected through hydrogen base pairing.

[00158] In view of the teachings of the present specification, one of ordinary skill in the art will readily understand how to engineer additional variations of dht-NATNA compositions.

[00159] Example 1 describes production of polynucleotide components of engineered Class 2 Type II CRISPR-Cas9-associated discontinuous-helical triplex nucleic acid-targeting nucleic acid (dht-NATNA) compositions, for example, as illustrated in FIG. 5A and FIG. 5B. Components of the dht-NATNA compositions were assembled by PCR using 3’ overlapping primers containing DNA sequences corresponding to each dht-casPN component. In vitro transcription of the DNA templates was carried out using a T7 promoter and a T7 RNA polymerase.

[00160] In addition to known Class 2 Type II systems comprising a Class 2 Type II CRISPR-Cas9-associated RNA comprising a helical triplex, crRNAs of species having such a Class 2 CRISPR system can be identified using the method set forth in Example 5. Example 6 describes a method by which tracrRNAs of species having such Class 2 Type II systems can be identified.

[00161] Example 8 describes a method to probe for sites tolerant of modification in the backbone of Class 2 Type II CRISPR guide RNAs comprising a 3' helical triplex (e.g., sgRNAs or dual-guide RNAs) in the regions 3' of the nexus (e.g., introduction of a break in the polynucleotide backbone to generate non-native termini).

[00162] In a second aspect, the present invention is directed to polynucleotide/protein (nucleoprotein) compositions comprising a dht-NATNA (e.g., comprising a dht-casPN1 and a dht-casPN2), and a Cas protein (e.g., Cas9) with which a dht-NATNA, comprising the dht-casPNs, is capable of forming a complex. In some embodiments, the Cas protein is catalytically inactive for one or more of its endonuclease activities.

[00163] In one embodiment of this second aspect of the present invention, a nucleoprotein composition comprises a dht-NATNA as described herein and a Cas9 protein. In another embodiment, the dht-NATNA is in a complex with the Cas9 protein (dht-NATNA/Cas9 nucleoprotein complex). The Cas9 protein can have combinations of the following endonuclease activities: both the RuvC-1 and HNH domains of the Cas9 protein can be catalytically active, both the RuvC-1 and HNH domains of the Cas9 protein can be catalytically inactive (dCas9), the RuvC-1 domain of the Cas9 protein can be catalytically inactive, or the HNH domain of the Cas9 protein can be catalytically inactive. Methods of making Cas9 proteins that are enzymatically inactive for RuvC-1-related nuclease activity, HNH-related nuclease activity, or both RuvC-1-related nuclease activity and HNH-related nuclease activity (dCas9) are known in the art.
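The four combinations of domain activity listed above can be enumerated with a short sketch. The function names are hypothetical. The strand count rests on an assumption that this paragraph does not state, namely that each catalytically active nuclease domain cuts one strand of a double-stranded target.

```python
from itertools import product


def describe_cas9(ruvc1_active: bool, hnh_active: bool) -> str:
    """Return a label for one combination of RuvC-1 and HNH activity."""
    if ruvc1_active and hnh_active:
        return "RuvC-1 active, HNH active"
    if not ruvc1_active and not hnh_active:
        return "RuvC-1 inactive, HNH inactive (dCas9)"
    if not ruvc1_active:
        return "RuvC-1 inactive, HNH active"
    return "RuvC-1 active, HNH inactive"


def strands_cut(ruvc1_active: bool, hnh_active: bool) -> int:
    """Assumption: each active domain cuts one strand of the target DNA."""
    return int(ruvc1_active) + int(hnh_active)


# Enumerate every combination of the two domain states.
ALL_VARIANTS = [
    (describe_cas9(r, h), strands_cut(r, h))
    for r, h in product((True, False), repeat=2)
]

for label, n in ALL_VARIANTS:
    print(f"{label}: cuts {n} strand(s)")
```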

[00164] The site-specific binding of and/or cutting by a nucleoprotein complex comprising a dht-NATNA and a Cas9 protein, as well as modifications thereof (e.g., introduction of an affinity tag) can be confirmed, if necessary, using an electrophoretic mobility shift assay (see, e.g., Garner, M., et al., Nucleic Acids Research 9(13):3047-3060 (1981); Fried, M., et al., Nucleic Acids Research 9(23):6505-6525 (1981); Fried, M., Electrophoresis 10:366-376 (1989); Gagnon, K., et al., Methods Molecular Biology 703:275-291 (2011); Fillebeen, C., et al., Journal of Visualized Experiments 3(94) (2014), dx.doi.org/10.3791/51959), or the Cas cleavage assay described in Example 3.

[00165] Example 3 describes the use of dht-NATNA/Cas9 nucleoprotein complexes for in vitro biochemical cleavage assays. Example 2 provides a method for production of double-stranded target DNA sequences for use in the in vitro Cas9 protein cleavage assays. The data obtained using the biochemical cleavage assay of Example 3 can be used to demonstrate that dht-NATNAs facilitate Cas9 protein mediated site-specific binding to, and subsequent cleavage of, double-stranded target DNA sequences.

[00166] To examine site-specific binding, and/or cutting in eukaryotic cells, deep sequencing analysis for detection of nucleic acid target sequence modifications (Example 4) and/or the T7E1 assay for detection of nucleic acid target sequence modifications (Example 7) can be employed.
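As an illustration of how a T7E1 readout might be converted into an estimate of target sequence modification, the sketch below applies a commonly used gel-densitometry formula, percent modification = 100 × (1 − √(1 − fraction cleaved)). This formula is not stated in this section, and the calculation used in Example 7 may differ. The function name and the band intensities are hypothetical.

```python
import math
from typing import Sequence


def t7e1_percent_modified(uncut: float, cut_fragments: Sequence[float]) -> float:
    """Estimate percent modification from T7E1 band intensities.

    uncut         -- intensity of the undigested parental band
    cut_fragments -- intensities of the cleavage-product bands
    """
    cut = sum(cut_fragments)
    total = uncut + cut
    if total <= 0:
        raise ValueError("band intensities must sum to a positive value")
    fraction_cleaved = cut / total
    # The square root reflects the assumption that cleavable heteroduplexes
    # form by random re-annealing of modified and unmodified strands.
    return 100.0 * (1.0 - math.sqrt(1.0 - fraction_cleaved))


# Hypothetical intensities: 25% of the signal is in the cleaved bands.
print(round(t7e1_percent_modified(75.0, [15.0, 10.0]), 1))
```

Deep sequencing (Example 4) counts modified reads directly and does not require this correction.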

[00167] Example 9 describes the use of dht-NATNAs to modify nucleic acid target sequences present in human genomic DNA and to measure the level of cleavage activity and specificity of cleavage at such sites. Measurement of the level of cleavage percentage and/or cleavage specificity at a particular site can provide options to identify nucleic acid target sequences having a desired cleavage percentage and/or specificity.

[00168] An embodiment of a dht-NATNA of the present invention is shown in FIG. 7A (compare FIG. 5A). In FIG. 7A, a dht-casPN1 (compare FIG. 5A, I) is represented as FIG. 7A, 700, and a dht-casPN2 (compare FIG. 5A, II) is represented as FIG. 7A, 701. A Cas9 protein is represented as FIG. 7A, 702. A double-stranded DNA, comprising a target DNA sequence, is represented as FIG. 7A, 703, and a PAM sequence is represented as FIG. 7A, 704. FIG. 7B illustrates a dht-NATNA (FIG. 7B, 705)/Cas9 nucleoprotein complex (FIG. 7B, 706) bound to a double-stranded DNA (FIG. 7B, 703) comprising a target DNA sequence, wherein the nucleoprotein complex has cut both strands of the double-stranded target DNA sequence. The location of the cut made by the Cas9 protein of the nucleoprotein complex is indicated as FIG. 7B, 707. The PAM (FIG. 7B, 704) in the double-stranded DNA is present in the 5’ to 3’ DNA strand.

[00169] In some embodiments of the present invention, affinity tags are introduced into one or more dht-casPNs of a dht-NATNA composition (e.g., FIG. 5A, I, dht-casPN1; and/or FIG. 5A, II, dht-casPN2) and a cognate Cas protein (e.g., into the dht-casPN1 and the cognate Cas protein, into the dht-casPN2 and the cognate Cas protein, or into the dht-casPN1, dht-casPN2, and the cognate Cas protein). For example, a nucleic acid sequence appended to the 3' end of dht-casPN1 can comprise an affinity sequence such as a MS2 binding sequence, a U1A binding sequence, a stem-loop sequence (e.g., a Cas6 protein binding sequence such as a Csy4 protein binding sequence), an eIF4A binding sequence, a Transcription Activator-Like Effector (TALE) binding sequence (see, e.g., Valton, J., et al., Journal of Biological Chemistry 287(46):38427-38432 (2012)), or a zinc finger (ZFN) domain binding sequence (see, e.g., Font, J., et al., Methods Molecular Biology 649:479-491 (2010); Isalan, M., et al., Nature Biotechnology 19(7):656-660 (2001)). In some embodiments, dht-casPN2 can be similarly modified, or both the dht-casPN1 and the dht-casPN2 can be modified. The Cas protein coding sequence can then be modified to comprise a corresponding affinity tag; for example, an MS2 coding sequence, a U1A coding sequence, a stem-loop binding protein coding sequence (e.g., an enzymatically (riboendonuclease) inactive Csy4 protein that binds the Csy4 protein sequence), an eIF4A coding sequence, a TALE coding sequence, or a ZFN domain coding sequence. Typically, enzymatically inactive nucleic acid binding proteins that retain sequence specific binding to a nucleic acid sequence are used (e.g., a riboendonuclease inactive Csy4 protein (dCsy4)); however, in some embodiments, enzymatically active nucleic acid binding proteins or nucleic acid binding proteins with altered enzymatic activity can be used. When both dht-casPN1 and dht-casPN2 are modified with an affinity sequence, the two affinity sequences preferably are not the same.
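The pairing of each affinity sequence on a dht-casPN with its corresponding tag on the Cas protein can be sketched as a lookup table. The table keys, the function name `plan_affinity_tags`, and the return structure are hypothetical; only the pairings themselves are taken from the paragraph above.

```python
from typing import Dict, Optional

# Affinity sequence on the polynucleotide -> corresponding coding sequence
# to add to the Cas protein, as paired in the text.
AFFINITY_PAIRS: Dict[str, str] = {
    "MS2 binding sequence": "MS2 coding sequence",
    "U1A binding sequence": "U1A coding sequence",
    "Csy4 protein binding sequence": "inactive Csy4 (dCsy4) coding sequence",
    "eIF4A binding sequence": "eIF4A coding sequence",
    "TALE binding sequence": "TALE coding sequence",
    "ZFN domain binding sequence": "ZFN domain coding sequence",
}


def plan_affinity_tags(caspn1: Optional[str] = None,
                       caspn2: Optional[str] = None) -> dict:
    """List the Cas protein tags needed for the chosen affinity sequences.

    'distinct' is False when both polynucleotides carry the same affinity
    sequence, which the text describes as not preferred.
    """
    chosen = [tag for tag in (caspn1, caspn2) if tag is not None]
    for tag in chosen:
        if tag not in AFFINITY_PAIRS:
            raise KeyError(f"unknown affinity sequence: {tag}")
    protein_tags = sorted({AFFINITY_PAIRS[tag] for tag in chosen})
    return {"protein_tags": protein_tags, "distinct": len(set(chosen)) == len(chosen)}


print(plan_affinity_tags("MS2 binding sequence", "Csy4 protein binding sequence"))
```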

[00170] A wide variety of cross-linking moieties for use with nucleic acid sequences and polypeptide sequences are commercially available, including, but not limited to, thiols (e.g., 5’ thiol C6, dithiol phosphoramidite (DTPA), and 3’ thiol C3) (e.g., Integrated DNA Technologies, Inc., Coralville, IA; Thermo Fisher Scientific, South San Francisco, CA; ProteoChem, Loves Park, IL; BroadPharm, San Diego, CA). Examples of targets for cross-linking moieties include, but are not limited to, amines (e.g., lysines or a protein N-terminus), sulfhydryls (e.g., cysteines), carbohydrates (e.g., oxidized sugars), and carboxyls (e.g., protein or peptide C-terminus, aspartic acid, or glutamic acid). Examples of chemical cross-linking moieties include, but are not limited to, carbodiimide, N-hydroxysuccinimide (NHS) esters, imidoesters, maleimides, haloacetyls, pyridyldisulfides, hydrazides, alkoxyamines, diazirines, aryl azides, and isocyanates.

[00171] Following the guidance of the present specification, one of ordinary skill in the art can modify one or more polynucleotides of a dht-NATNA as well as a cognate Cas protein with cross-linking moieties using established chemical methods (e.g., Methods of Chemistry of Protein and Nucleic Acid Cross-Linking and Conjugation, Second Edition, Shan S. Wong and David M. Jameson, CRC Press, ISBN-13 978-0849374913 (2011); Bioconjugate Techniques, Third Edition, Greg T. Hermanson, Academic Press, ISBN-13 978-0123822390 (2013); Chemistry of Bioconjugates - Synthesis, Characterization, and Biomedical Applications, First Edition, Ravin Narain (Editor), Wiley, ISBN-13 978-1118359143 (2014); Bioconjugation Protocols - Strategies and Methods (Series: Methods in Molecular Biology (Book 751)), Second Edition, Sonny S. Mark (Editor), Humana Press, ISBN-13 978-1617791505 (2011); Crosslinking Technical Handbook, Thermo Fisher Scientific, South San Francisco, CA (2009, 2012)).

[00172] In some embodiments, the Cas protein primary sequence is engineered to comprise an amino acid residue at a particular residue position in the Cas protein (e.g., substitution or insertion of a Cys amino acid at a position that is not a Cys amino acid in the corresponding wild-type Cas protein) useful for cross-linking to a cross-linking moiety present in one or more polynucleotides of a dht-NATNA.

[00173] A further application of a cross-linking moiety is to provide one or more photoactive nucleotides in one or more of the polynucleotides of a dht-NATNA, wherein the photoactive nucleotide is positioned to maximize contact between the one or more photoactive nucleotides and one or more photoreactive amino acids. Ultraviolet (UV) light can be used to induce cross-linking between the one or more photoactive nucleotides and the one or more photoreactive amino acids. In one embodiment, a cross-linking moiety is a cross-linkable polynucleotide comprising a contiguous run of uracil nucleotides (poly-U) or a run of uracil nucleotides alternating with other nucleotides. In another embodiment, a cross-linking moiety can be a cross-linkable polynucleotide comprising a contiguous run of thymidine nucleotides (poly-T) or a run of thymidine nucleotides alternating with other nucleotides. Such cross-linkable polynucleotides are, for example, positioned in one or more of the polynucleotides of a dht-NATNA to maximize contact with one or more photoreactive amino acids of a Cas protein.

[00174] A large number of photoreactive amino acids can be added photochemically (e.g., 254 nm) to uracil (see, e.g., Smith, K.C., et al., “DNA-Protein Crosslinks,” available at www.photobiology.info/Smith_Shetlar.html) including, but not limited to, glycine, serine, phenylalanine, tyrosine, tryptophan, cystine, cysteine, methionine, histidine, arginine and lysine. The most reactive amino acids are phenylalanine, tyrosine and cysteine. A number of photoreactive amino acids can be added photochemically to thymidine including, but not limited to, lysine, arginine, cysteine and cystine. Accordingly, regions of a Cas protein complex comprising one or more photoreactive amino acids can be evaluated for the ability to act as cross-linking moieties. Also, the Cas protein coding sequence can be modified to introduce a photoreactive amino acid (an affinity tag) in a position suitable to come into proximity of a photoactive nucleotide (an affinity tag) in an affinity sequence of one or more polynucleotides of a dht-NATNA.

[00175] Further examples of photoreactive cross-linking moieties include, but are not limited to, photoreactive amino acid analogs (L-photo-leucine, L-photo-methionine, p-benzoyl-L-phenylalanine) and photoactivatable ribonucleosides (halogenated and thione containing ribonucleoside analogues, such as 5-Bromo-dUTP, Azide-PEG4-aminoallyl-dUTP, 4-thiouridine, 6-thioguanosine; preferred reaction with tyrosines, phenylalanines and tryptophans). General photoreactive cross-linking moieties include, but are not limited to, aryl azides, azido-methyl-coumarins, benzophenones, anthraquinones, certain diazo compounds, diazirines, and psoralen derivatives.

[00176] There are a number of photoreactive cross-linking analogs that serve as substrates for RNA polymerases for introduction into RNA molecules including, but not limited to, 4-thio-UTP, 5-azido-UTP, 5-bromo-UTP and 8-azido-ATP, 5-APAS-UTP, 5-APAS-CTP, 8-APAS-ATP, and 8-N(3)AMP (see, e.g., Costas, C., et al., Nucleic Acids Research 28(9):1849-1858 (2000); Gaur R.K., Methods Molecular Biology 488:167-180 (2008)).

[00177] A variety of cross-linking methods and moieties are commercially available, for example, from TriLink Biotechnologies (San Diego, CA) including, for photocross-linking of RNA, 4-Thiouridine, 5-Bromouridine-5’-Triphosphate, 5-Iodouridine-5’-Triphosphate, and 4-Thiouridine-5’-Triphosphate, and, for photocross-linking of DNA, 6-Thio-dG and 4-Thiothymidine.

[00178] Examples of general cross-linking reagents include, but are not limited to, glutaraldehyde and formaldehyde. Furthermore, monofunctional (e.g., one-function cross-linking moieties, such as alkyl imidates), bifunctional (two cross-linking moieties, such as disuccinimidyl suberate (DSS)), or trifunctional cross-linking moieties can be used, as well as homobifunctional (DSS) and heterobifunctional (sulfosuccinimidyl-4-(N-maleimidomethyl) cyclohexane-1-carboxylate (Sulfo-SMCC)) cross-linking moieties. Additionally, cross-linking moieties can comprise different spacer lengths (C3, C6, PEG spacers, and others).

[00179] In another embodiment, a ligand-binding moiety is introduced into the Cas protein and one or more polynucleotides of a dht-NATNA are modified to contain the ligand. An exemplary ligand/ligand-binding moiety is avidin or streptavidin/biotin (see, e.g., Livnah, O., et al., Proceedings of the National Academy of Sciences of the United States of America 90(11):5076-5080 (1993); Airenne, K.J., et al., Biomolecular Engineering 16(1-4):87-92 (1999)). One example of a Cas protein with a ligand-binding moiety is a Cas protein fused to a ligand avidin or streptavidin designed to bind one or more polynucleotides of a dht-NATNA at a 5’ or 3’ terminus. Biotin is a high affinity and high specificity ligand for the avidin or streptavidin protein. By fusing an avidin or streptavidin polypeptide chain to a Cas protein, the Cas protein has a high affinity and specificity for one or more 5’ or 3’ biotinylated polynucleotides of a dht-NATNA.

[00180] The sequence of a dht-casPN and location of the biotin can be provided to commercial manufacturers for synthesis of the dht-casPN-biotin or can be added through the use of an artificial third-base pair (e.g., an unnatural base pair between 7-(2-thienyl)imidazo[4,5-b]pyridine (Ds) and pyrrole-2-carbaldehyde (Pa)) in an in vitro transcription reaction (see, e.g., Hirao, I., et al., Nature Methods 3(9):729-735 (2006)). dht-casPNs can be similarly modified at 5’-end sequences, 3’-end sequences, or positions between the 5’-end and the 3’-end sequences. Changes to cleavage percentage and specificity of the ligand-binding modified Cas protein-ligand-binding moiety/dht-NATNA-ligand moiety can be evaluated as described below in Example 3 and Example 4.

[00181] Examples of other ligand moieties and ligand-binding moieties (ligand/ligand binding pair) that can be similarly used include, but are not limited to: estradiol/estrogen receptor (see, e.g., Zuo, J., et al., Plant Journal 24(2):265-273 (2000)); and rapamycin/FKBP/FKBP12 and rapamycin/FK506/FKBP (see, e.g., Zetsche, B., et al., Nature Biotechnology 33:139-142 (2015); Chiu M.I., et al., Proceedings of the National Academy of Sciences of the United States of America 91(26):12574-12578 (1994), respectively).

[00182] Another example of a ligand moiety and ligand-binding moiety (ligand/ligand binding pair) is to provide one or more aptamers or modified aptamers in a polynucleotide sequence of one or more dht-casPNs of a dht-NATNA (e.g., a dht-casPN1 and/or a dht-casPN2) that has a high affinity and binding specificity for a selected region of a Cas protein. In one embodiment, a ligand-binding moiety can be a polynucleotide comprising an aptamer (see, e.g., Navani, N.K., et al., Biosensors and Biodetection (Methods in Molecular Biology) 504:399-415 (2009); A. V. Kulbachinskiy, Biochemistry (Moscow) 72(13):1505-1518 (2007)). Aptamers are single-stranded functional nucleic acid sequences (ligand-binding moieties) that possess recognition capability of a corresponding ligand moiety. In yet another embodiment, an established aptamer binding sequence/aptamer is used by introducing the aptamer-binding region into the Cas9 protein.

[00183] The creation of a high affinity binding site for a selected ligand on a Cas protein (e.g., Cas9) can be achieved using several protein engineering methods known to those of ordinary skill in the art in view of the guidance of the present specification. Examples of such protein engineering methods include, but are not limited to, rational protein design, directed evolution using different selection and screening methods for the library (e.g., phage display, ribosome display, yeast display, RNA display), DNA shuffling, computational methods (e.g., ROSETTA, www.rosettacommons.org/software), and introduction of a known high affinity ligand into a Cas protein. Libraries obtained by these methods can be screened to select for high affinity Cas protein binders using, for example, a phage display assay, a cell survival assay, or a binding assay.

[00184] In a third aspect, the present invention relates to nucleic acid sequences encoding one or more dht-casPNs, as well as expression cassettes, vectors, and recombinant cells comprising nucleic acid sequences encoding dht-casPNs. Some embodiments of the third aspect of the invention include a nucleic acid coding sequence for a cognate Cas9 protein with which a dht-NATNA, comprising dht-casPNs, is capable of forming a nucleoprotein complex. Such embodiments include, but are not limited to expression cassettes, vectors, and recombinant cells.

[00185] In one embodiment, the present invention relates to one or more expression cassettes comprising one or more nucleic acid sequences encoding one or more dht-casPNs, and optionally one or more nucleic acid sequences encoding a cognate Cas9 protein with which a dht-NATNA, comprising the dht-casPNs, is capable of forming a complex. Expression cassettes typically comprise a regulatory sequence involved in one or more of the following: regulation of transcription, post-transcriptional regulation, or regulation of translation. Expression cassettes can be introduced into a wide variety of organisms including, but not limited to, bacterial cells, yeast cells, plant cells, and mammalian cells (including human cells). Expression cassettes typically comprise functional regulatory sequences corresponding to the organism(s) into which they are being introduced.

[00186] A further embodiment of the present invention relates to vectors, including expression vectors, comprising one or more nucleic acid sequences encoding one or more dht-casPNs, and optionally one or more nucleic acid sequences encoding a cognate Cas9 protein with which a dht-NATNA, comprising the dht-casPNs, is capable of forming a complex. Vectors can also include sequences encoding selectable or screenable markers. Furthermore, nuclear targeting sequences can also be added, for example, to the Cas9 protein. Vectors can also include polynucleotides encoding protein tags (e.g., poly-His tags, hemagglutinin tags, fluorescent protein tags, and bioluminescent tags). The coding sequences for such protein tags can be fused to, for example, one or more nucleic acid sequences encoding a Cas9 protein.

[00187] General methods for construction of expression vectors are known in the art. Expression vectors for host cells are commercially available. There are several commercial software products designed to facilitate selection of appropriate vectors and construction thereof, such as insect cell vectors for insect cell transformation and gene expression in insect cells, bacterial plasmids for bacterial transformation and gene expression in bacterial cells, yeast plasmids for cell transformation and gene expression in yeast and other fungi, mammalian vectors for mammalian cell transformation and gene expression in mammalian cells or mammals, and viral vectors (including lentivirus, retrovirus, adenovirus, herpes simplex virus I or II, parvovirus, reticuloendotheliosis virus, and adeno-associated virus (AAV) vectors) for cell transformation and gene expression and methods to easily allow cloning of such polynucleotides. Illustrative plant transformation vectors include those derived from a Ti plasmid of Agrobacterium tumefaciens (Lee, L.Y., et al., Plant Physiology 146(2):325-332 (2008)). Also useful and known in the art are Agrobacterium rhizogenes plasmids. For example, SNAPGENE™ (GSL Biotech LLC, Chicago, IL; snapgene.com/resources/plasmid_files/your_time_is_valuable/) provides an extensive list of vectors, individual vector sequences, and vector maps, as well as commercial sources for many of the vectors.

[00188] Adenovirus is a member of the Adenoviridae family. Adenovirus vectors are derived from adenovirus. Adenovirus is a medium-sized, non-enveloped icosahedral virus. It is composed of a nucleocapsid and a double-stranded linear DNA genome that can be used as a cloning vector. The extensive knowledge and data on adenovirus transcription regulation favored the engineering of adenovirus vectors modified for expression of inserted genes. For this purpose, the early regions E1 and E3 were deleted, thus making the virus incapable of replication, requiring the host cell to provide this function in trans. An expression cassette comprising protein coding sequences (e.g., a Cas9 protein) is typically inserted to replace the deleted E1 region. In the cassette, a gene is placed under control of an additional major late promoter or under control of an exogenous promoter, such as a cytomegalovirus promoter or a selected regulatable promoter.

[00189] The genome of adenovirus can be manipulated in such a way that it encodes and expresses a gene product of interest while at the same time inactivating the adenovirus's ability to replicate via a normal lytic cycle. Some such adenoviral vectors include those derived from adenovirus strain Ad type 5 dl324 or other adenovirus strains (e.g., Ad2, Ad3, and Ad7). In certain circumstances, recombinant adenoviruses can be advantageous because they can infect non-dividing cells, and they can be used to infect epithelial cells and a variety of other cell types. In addition, the virus particle is relatively stable, and it is amenable to purification and concentration. The adenoviral genome’s carrying capacity for foreign DNA is up to approximately 8 kilobases, which is large compared with other gene delivery vectors. The large double-stranded DNA adenovirus does not integrate into the genome, limiting its use to transient, episomal expression. Because it is not integrated into the genome of a host cell (unlike retroviral DNA), potential problems such as insertional mutagenesis are avoided.

[00190] Adeno-associated virus (AAV), a single-stranded DNA member of the family Parvoviridae, is a naturally replication-deficient virus. Like adenovirus, it can infect non-dividing cells; however, it has the advantage of integration competence. AAV vectors are among the viral vectors most frequently used for gene therapy. Twelve human serotypes of AAV (AAV serotype 1 [AAV-1] to AAV-12) and more than 100 serotypes from non-human species are known. In one embodiment, AAV-6 is used as a vector (see, e.g., U.S. Patent No. 6,156,303, issued 5 December 2000). A number of factors have increased AAV's potential as a delivery vehicle for gene therapy applications, including the lack of pathogenicity of the virus, the persistence of the virus, and the many available serotypes. AAV is a small (25-nm), non-enveloped virus that comprises a linear single-stranded DNA genome. Productive infection by AAV typically occurs only in the presence of a helper virus, for example, adenovirus or herpesvirus. In the absence of helper virus, AAV (serotype 2) can become latent by integrating into human chromosome 19q13.4 (locus AAVS-1) (see, e.g., Daya, S., et al., (2008) "Gene Therapy Using Adeno-Associated Virus Vectors," Clinical Microbiology Reviews, 21(4):583-593).

[00191] Vaccinia virus is a member of the poxvirus family. Vaccinia vectors are derived from vaccinia virus. The vaccinia virus genome is comprised of a double-stranded DNA of nearly 200,000 bp. It replicates in the cytoplasm of the host cell. Cells infected with the vaccinia virus produce up to 5000 virus particles per cell, leading to high levels of expression for encoded gene products. The vaccinia system has been efficiently used in very large scale culture (1000 L) to produce several proteins, including HIV-1 rgp160 and human pro-thrombin.

[00192] Retrovirus is a member of the Retroviridae family. Retroviral vectors are derived from retrovirus. Retroviruses are RNA viruses that replicate via a double-stranded DNA intermediate. One advantage of using a retrovirus as a vector is that most retroviruses do not kill the host, but instead produce progeny virions over an indefinite period of time. Therefore, retroviral vectors (i) can be used to make stably transformed cell lines; (ii) provide viral gene expression driven by strong promoters, which can be subverted to control the expression of transgenes; and (iii) include those derived from retroviruses having a broad host range (e.g., amphotropic strains of murine leukaemia virus (MLV)), thus allowing the transfection of many cell types.

[00193] Exogenous gene-expression systems based on the retroviral vector are also a method for generating stable, high-expressing mammalian cell lines.

[00194] Lentiviral vectors are examples of vectors useful for introduction into mammalian cells of one or more nucleic acid sequences encoding one or more dht-casPNs, and optionally one or more nucleic acid sequences encoding a Cas9 protein with which a dht-NATNA, comprising the dht-casPNs, is capable of forming a complex. Lentivirus is a member of the Retroviridae family and is a single-stranded RNA virus, which can infect both dividing and non-dividing cells as well as provide stable expression through integration into the genome. To increase the safety of lentiviral vectors, components necessary to produce a viral vector are split across multiple plasmids. Transfer vectors are typically replication incompetent and may additionally contain a deletion in the 3’ LTR, which renders the virus self-inactivating after integration. Packaging and envelope plasmids are typically used in combination with a transfer vector. For example, a packaging plasmid can encode combinations of the Gag, Pol, Rev, and Tat genes. A transfer plasmid can comprise viral LTRs and the psi packaging signal. The envelope plasmid usually comprises an envelope protein (usually vesicular stomatitis virus glycoprotein, VSV-GP, because of its wide infectivity range).

[00195] Lentiviral vectors based on human immunodeficiency virus type-1 (HIV-1) have additional accessory proteins that facilitate integration in the absence of cell division. HIV-1 vectors have been designed to address a number of safety concerns, including separate expression of the viral genes in trans to prevent recombination events leading to the generation of replication-competent viruses. Furthermore, the development of self-inactivating vectors reduces the potential for transactivation of neighboring genes and allows for the incorporation of regulatory elements to target gene expression to particular cell types (see, e.g., Cooray, S., et al., Methods in Enzymology 507:29-57 (2012)).

[00196] Transformed host cells (or recombinant cells) or the progeny of cells that have been transformed or transfected using recombinant DNA techniques can comprise one or more nucleic acid sequences encoding one or more dht-casPNs, and optionally one or more nucleic acid sequences encoding a Cas9 protein with which a dht-NATNA, comprising the dht-casPNs, is capable of forming a complex. Methods of introducing polynucleotides (e.g., an expression vector) into host cells are known in the art and are typically selected based on the kind of host cell. Such methods include, for example, viral or bacteriophage infection, transfection, conjugation, electroporation, calcium phosphate precipitation, polyethyleneimine-mediated transfection, DEAE-dextran mediated transfection, protoplast fusion, lipofection, liposome-mediated transfection, particle gun technology, microprojectile bombardment, direct microinjection, and nanoparticle-mediated delivery.

[00197] As an alternative to expressing one or more nucleic acid sequences encoding one or more dht-casPNs (optionally one or more nucleic acid sequences encoding a Cas9 protein with which a dht-NATNA, comprising the dht-casPNs, is capable of forming a complex), a dht-NATNA, cognate Cas protein (e.g., Cas9), or a dht-NATNA/Cas protein complex can be directly introduced into a cell. Or one or more of these nucleic acid sequences can be expressed by a cell and the other component(s) of a dht-NATNA/Cas protein complex can be directly introduced. Methods to introduce the components into a cell include electroporation, lipofection, particle gun technology, and microprojectile bombardment.

[00198] A variety of exemplary host cells disclosed herein can be used to produce recombinant cells using a dht-NATNA/Cas protein complex. Such host cells include, but are not limited to, a plant cell, a yeast cell, a bacterial cell, an insect cell, an algal cell, and a mammalian cell.

[00199] Methods of introducing polynucleotides such as one or more polynucleotides encoding dht-casPNs (e.g., an expression vector comprising dht-casPN coding sequences) into host cells to produce recombinant cells are known in the art and are typically selected based on the kind of host cell. Such methods include, for example, viral or bacteriophage infection, transfection, conjugation, electroporation, calcium phosphate precipitation, polyethyleneimine-mediated transfection, DEAE-dextran mediated transfection, protoplast fusion, lipofection, liposome-mediated transfection, particle gun technology, direct microinjection, and nanoparticle-mediated delivery. For ease of discussion, “transfection” is used below to refer to any method of introducing polynucleotides into a host cell.

[00200] Preferred methods for introducing polynucleotides into plant cells include microprojectile bombardment and Agrobacterium-mediated transformation. Alternatively, other non-Agrobacterium species (e.g., Rhizobium) and other prokaryotic cells that are able to infect plant cells and introduce heterologous polynucleotides into the genome of the infected plant cell can be used. Other methods include electroporation, liposome-mediated transfection, transformation using pollen or viruses, and chemicals that increase free DNA uptake, or free DNA delivery using microprojectile bombardment (see, e.g., Narusaka, Y., et al., Chapter 9, in Transgenic Plants - Advances and Limitations, edited by Yelda, O., ISBN 978-953-51-0181-9 (2012)).

[00201] In some embodiments, a host cell is transiently or non-transiently transfected with nucleic acid sequences encoding one or more components of a dht-NATNA/Cas9 nucleoprotein complex. In some embodiments, a cell is transfected as it naturally occurs in a subject. In some embodiments, a cell that is transfected is first removed from a subject, e.g., a primary cell or progenitor cell. In some embodiments, the primary cell or progenitor cell is cultured and/or is returned after ex vivo transfection to the same subject or to a different subject.

[00202] The dht-NATNA/Cas9 nucleoprotein complexes described herein can be used to generate non-human transgenic organisms by site specifically introducing a selected polynucleotide sequence (e.g., a portion of a donor polynucleotide) at a DNA target locus in the genome to generate a modification of the genomic DNA. The transgenic organism can be an animal or a plant.

[00203] A transgenic animal is typically generated by introducing dht-NATNA/Cas9 nucleoprotein complexes (or nucleic acid coding sequences for components thereof) into a zygote cell. A basic technique, described with reference to making transgenic mice (see, e.g., Cho, A., et al., “Generation of Transgenic Mice,” Current Protocols in Cell Biology, CHAPTER.Unit-19.11 (2009)) involves five basic steps: first, preparation of a system, as described herein, including a suitable donor polynucleotide; second, harvesting of donor zygotes; third, microinjection of the system into the mouse zygote; fourth, implantation of microinjected zygotes into pseudo-pregnant recipient mice; and fifth, performing genotyping and analysis of the modification of the genomic DNA established in founder mice. The founder mice will pass the genetic modification to any progeny. The founder mice are typically heterozygous for the transgene. Mating between these mice will produce mice that are homozygous for the transgene 25% of the time.
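The 25% figure for a cross between two heterozygous founders follows from simple single-locus Mendelian segregation. The following is a minimal sketch of that arithmetic, assuming a single transgene insertion that segregates independently; the function name `cross` and the allele symbols "T" (transgene present) and "-" (transgene absent) are illustrative and do not appear in the disclosure.

```python
from collections import Counter
from fractions import Fraction
from itertools import product
from typing import Dict


def cross(parent_a: str, parent_b: str) -> Dict[str, Fraction]:
    """Return offspring genotype probabilities for a single-locus cross.

    Each parent is a two-character string of alleles, e.g. "T-" for a
    heterozygous founder ("T" = transgene present, "-" = absent).
    """
    # Enumerate every combination of one allele from each parent
    # (a Punnett square) and normalize "-T" to "T-".
    counts = Counter(
        "".join(sorted(a + b, reverse=True))
        for a, b in product(parent_a, parent_b)
    )
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


offspring = cross("T-", "T-")
print(offspring["TT"])  # 1/4, i.e. homozygous for the transgene
print(offspring["T-"])  # 1/2, i.e. heterozygous
```

Under the same assumptions, the remaining quarter of the progeny ("--") would be expected to lack the transgene entirely.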

[00204] Methods for generating transgenic plants are also well known and can be applied using dht-NATNA/Cas9 nucleoprotein complexes (or nucleic acid coding sequences for components thereof). A generated transgenic plant, for example using Agrobacterium-mediated transformation, typically contains one transgene inserted into one chromosome. It is possible to produce a transgenic plant that is homozygous with respect to a transgene by sexually mating (i.e., selfing) an independent segregant transgenic plant containing a single transgene to itself, for example an F0 plant, to produce an F1 seed. Plants formed by germinating F1 seeds can be tested for homozygosity. Typical zygosity assays include, but are not limited to, single nucleotide polymorphism assays and thermal amplification assays that distinguish between homozygotes and heterozygotes.
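As a rough planning aid for the zygosity testing described above, one can estimate how many F1 plants would need to be screened to find at least one homozygote. The sketch below assumes a single-copy insertion in the selfed F0 plant, so that each F1 seed is homozygous with probability 1/4, and assumes seeds are independent; these assumptions and the function names are illustrative and are not taken from the disclosure.

```python
import math


def prob_at_least_one_homozygote(n_plants: int, p_homozygous: float = 0.25) -> float:
    """Probability that at least one of n_plants F1 plants is homozygous."""
    if n_plants < 0:
        raise ValueError("n_plants must be non-negative")
    return 1.0 - (1.0 - p_homozygous) ** n_plants


def plants_needed(confidence: float, p_homozygous: float = 0.25) -> int:
    """Smallest number of F1 plants giving P(at least one homozygote) >= confidence."""
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    # Solve 1 - (1 - p)^n >= confidence for the smallest integer n.
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_homozygous))


print(plants_needed(0.95))  # 11
```

Multi-copy insertions or linked insertions would change the expected fraction, so the default of 0.25 should be treated as a starting assumption only.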

[00205] As an alternative to using a system described herein for the direct transformation of a plant, transgenic plants can be formed by crossing a first plant that has been transformed with a dht-NATNA/Cas9 nucleoprotein complex with a second plant that has never been exposed to the complex. For example, a first plant line containing a transgene can be crossed with a second plant line to introgress the transgene into the second plant line, thus forming a second transgenic plant line.

[00206] A fourth aspect of the present invention relates to methods of using dht-NATNA/Cas9 nucleoprotein complexes (or nucleic acid coding sequences for components thereof). Embodiments of dht-NATNA compositions are described herein, for example, in the preceding second aspect of the invention.

[00207] In one embodiment, the present invention includes a method of binding a nucleic acid sequence (e.g., single-stranded (ss) DNA or dsDNA) comprising providing a dht-NATNA/Cas9 nucleoprotein complex (or components thereof) for introduction into a cell or biochemical reaction, and/or introducing a dht-NATNA/Cas9 nucleoprotein complex (or components thereof) into a cell or biochemical reaction, thereby facilitating contact of a nucleic acid target sequence in the nucleic acid sequence (e.g., ssDNA or dsDNA) with the dht-NATNA/Cas9 nucleoprotein complex resulting in binding of the dht-NATNA/Cas9 nucleoprotein complex to the nucleic acid target sequence in the nucleic acid sequence. In some embodiments, the nucleic acid target sequence is dsDNA or genomic DNA. Such methods of binding a nucleic acid target sequence can be carried out in vitro (e.g., in a biochemical reaction or in cultured cells, in some embodiments the cultured cells are human cultured cells that remain in culture and are not introduced into a human), in vivo (e.g., in cells of a living organism, with the proviso that in some embodiments the organism is a non-human organism), or ex vivo (e.g., cells removed from a subject, with the proviso that in some embodiments the subject is a non-human subject).

[00208] In an additional embodiment, the method of the present invention is a method of binding RNA. The method comprises contacting a RNA target sequence in an RNA polynucleotide with a nucleoprotein complex comprising a dht-NATNA composition and a Cas9 protein, thereby facilitating binding of the nucleoprotein complex to the RNA target sequence. Dugar, G., et al. (Molecular Cell, 69(5):893-905.e7 (2018)) describe that a C. jejuni Cas9-guide RNA/Cas9 protein nucleoprotein complex can bind a ssRNA target complementary to the spacer sequence of the guide RNA.

[00209] A variety of methods are known in the art to evaluate and/or quantitate interactions between nucleic acid sequences and polypeptides including, but not limited to, the following: chromatin immunoprecipitation (ChIP) assays, DNA electrophoretic mobility shift assays (EMSA), DNA pull-down assays, and microplate capture and detection assays. Commercial kits, materials, and reagents are available to practice many of these methods and, for example, can be obtained from the following suppliers: Thermo Scientific (Wilmington, DE), Signosis (Santa Clara, CA), Bio-Rad (Hercules, CA), and Promega (Madison, WI). A common approach to detect interactions between a polypeptide and a nucleic acid sequence is EMSA (see, e.g., Hellman L.M., et al., Nature Protocols 2(8):1849-1861 (2007)).

[00210] In another embodiment, the present invention includes a method of cutting a nucleic acid sequence (e.g., ssDNA or dsDNA) comprising providing a dht-NATNA/Cas9 nucleoprotein complex (or components thereof) for introduction into a cell or biochemical reaction, and/or introducing a dht-NATNA/Cas9 nucleoprotein complex (or components thereof) into a cell or biochemical reaction, thereby facilitating contact of a nucleic acid target sequence in the nucleic acid sequence (e.g., ssDNA or dsDNA) with the dht-NATNA/Cas9 protein complex resulting in binding of the dht-NATNA/Cas9 nucleoprotein complex to the nucleic acid target sequence in the nucleic acid sequence. The bound dht-NATNA/Cas9 nucleoprotein complex results in cutting the nucleic acid target sequence. In some embodiments, the nucleic acid target sequence is dsDNA or genomic DNA. In some embodiments, the nucleic acid target sequence is double-stranded and one or both of the strands is cut. Such methods of cutting a nucleic acid target sequence can be carried out in vitro, in vivo, or ex vivo.

[00211] A method of cutting a nucleic acid target sequence using a dht-NATNA/Cas9 nucleoprotein complex is illustrated in FIG. 7A and FIG. 7B. A method of cutting a nucleic acid target sequence is set forth in Example 3.

[00212] In an additional embodiment, the present invention relates to cleaving RNA target sequences. The method comprises contacting a RNA target sequence in an RNA polynucleotide with a nucleoprotein complex comprising a dht-NATNA composition and a Cas9 protein, thereby facilitating binding of the nucleoprotein complex to the RNA target sequence and cleavage of the RNA target sequence. Dugar, G., et al. (Molecular Cell, 69(5):893-905.e7 (2018)) describe that a C. jejuni Cas9-guide RNA/Cas9 protein nucleoprotein complex can bind and cleave a ssRNA target complementary to the spacer sequence of the guide RNA. The Cas9 protein HNH domain is responsible for cleavage of the target RNA.

[00213] In yet another embodiment, the present invention includes a method of modifying a nucleic acid target sequence in a cell comprising providing a dht-NATNA/Cas9 nucleoprotein complex (or components thereof) for introduction into a cell or biochemical reaction, and/or introducing a dht-NATNA/Cas9 nucleoprotein complex (or components thereof) into a cell or biochemical reaction, thereby facilitating contact of a nucleic acid target sequence in the nucleic acid sequence (e.g., DNA) with the dht-NATNA/Cas9 nucleoprotein complex resulting in binding of the dht-NATNA/Cas9 nucleoprotein complex to the nucleic acid target sequence in the nucleic acid sequence. The dht-NATNA comprises a nucleic acid targeting sequence that is complementary to the nucleic acid target sequence. The dht-NATNA/Cas9 nucleoprotein composition cuts the nucleic acid target sequence. In some embodiments, the nucleic acid target sequence is DNA or genomic DNA. The cell will repair the cut site through cell repair mechanisms such as HDR, NHEJ, or MMEJ. Such methods of modifying a nucleic acid target sequence can be carried out in vitro, in vivo, or ex vivo. The contacting step may further comprise a donor polynucleotide being present, wherein at least a portion of the donor polynucleotide is incorporated into the DNA.
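The complementarity requirement between the nucleic acid targeting sequence and the nucleic acid target sequence can be checked computationally. The following is a minimal sketch assuming strict Watson-Crick pairing over the full length of both sequences, with both written 5' to 3'; the function names and the toy sequences are hypothetical and are not sequences from this disclosure. It does not model PAM requirements or mismatch tolerance.

```python
# Complement table for the DNA alphabet; RNA input is first mapped U -> T.
_DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def to_dna(seq: str) -> str:
    """Upper-case a sequence and express it in the DNA alphabet."""
    return seq.upper().replace("U", "T")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a sequence, in the DNA alphabet."""
    return to_dna(seq).translate(_DNA_COMPLEMENT)[::-1]


def is_complementary(targeting_seq: str, target_strand: str) -> bool:
    """True if the targeting sequence can base-pair along the whole target strand.

    Both arguments are written 5' to 3'; the targeting sequence may be RNA or DNA.
    """
    return to_dna(targeting_seq) == reverse_complement(target_strand)


# Toy 8-nt example (illustrative only).
print(is_complementary("GACGUUAC", "GTAACGTC"))  # True
print(is_complementary("GACGUUAA", "GTAACGTC"))  # False
```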

[00214] In yet another embodiment, the present invention includes methods of modulating in vitro or in vivo transcription, for example, transcription of a gene comprising regulatory element sequences. The method comprises providing a dht-NATNA/Cas9 nucleoprotein complex (or components thereof) for introduction into a cell or biochemical reaction, and/or introducing a dht-NATNA/Cas9 nucleoprotein complex (or components thereof) into a cell or biochemical reaction, thereby facilitating contact of a nucleic acid target sequence in the nucleic acid sequence (e.g., DNA) with the dht-NATNA/Cas9 nucleoprotein complex resulting in binding of the dht-NATNA/Cas9 nucleoprotein complex to the nucleic acid target sequence in the nucleic acid sequence. In some embodiments, the Cas9 protein is a catalytically inactive nuclease protein. In addition, the Cas9 protein can be a fusion protein, for example, dCas9 fused to a repressor or activator domain. The binding of the dht-NATNA/Cas protein complex to the nucleic acid target sequence modulates transcription of the gene.

[00215] Any of the components of the dht-NATNA compositions, as described above, can be incorporated into a kit. In some embodiments, a kit includes a package with one or more containers holding the kit elements, as one or more separate compositions or, optionally if the compatibility of the components allows, as an admixture. In some embodiments, kits also comprise one or more of the following excipients: a buffer, a buffering agent, a salt, a sterile aqueous solution, a preservative, and combinations thereof. Illustrative kits can comprise one or more dht-casPNs of a dht-NATNA, one or more excipients, and optionally a Cas9 protein; one or more nucleic acid sequences encoding one or more dht-casPNs of a dht-NATNA, and optionally a Cas9 protein; a dht-NATNA and a Cas9 protein; a dht-NATNA; a dht-NATNA/Cas9 nucleoprotein complex; or combinations thereof.

[00216] Kits can further comprise instructions for using components of the dht-NATNA/Cas9 nucleoprotein compositions. Instructions included in kits of the invention can be affixed to packaging material or can be included as a package insert. Although the instructions are typically written or printed materials, they are not limited to such. Any medium capable of storing such instructions and communicating them to an end user is contemplated by this invention.

[00217] Another aspect of the invention relates to methods of making or manufacturing one or more dht-casPNs of a dht-NATNA, a dht-NATNA/Cas9 nucleoprotein composition, or components thereof. In one embodiment, a method of making or manufacturing comprises chemically synthesizing one or more dht-casPN components of a dht-NATNA composition. In some embodiments, one or more polynucleotides of a dht-NATNA comprise RNA bases and can be generated from DNA templates using in vitro transcription.

[00218] A dht-NATNA/Cas9 nucleoprotein composition can further comprise a detectable label, such as a moiety that can provide a detectable signal. Examples of detectable labels include, but are not limited to, an enzyme, a radioisotope, a member of a specific binding pair, a fluorophore (e.g., FAM), a fluorescent protein (e.g., green fluorescent protein, red fluorescent protein, mCherry, tdTomato), a DNA or RNA aptamer together with a suitable fluorophore (enhanced GFP (EGFP), “Spinach”), a quantum dot, an antibody, and the like. A large number and variety of suitable detectable labels are well-known to one of ordinary skill in the art.

[00219] A dht-NATNA/Cas9 nucleoprotein composition, cells comprising a dht-NATNA/Cas9 nucleoprotein composition, cells modified through the use of a dht-NATNA/Cas9 nucleoprotein composition, or progeny of such cells can be used as pharmaceutical compositions formulated, for example, with a pharmaceutically acceptable excipient. Illustrative excipients include carriers, stabilizers, diluents, dispersing agents, suspending agents, thickening agents, and the like. The pharmaceutical compositions can facilitate administration of a dht-NATNA/Cas9 protein composition to a subject. Pharmaceutical compositions can be administered in therapeutically effective amounts by various forms and routes including, for example, intravenous, subcutaneous, intramuscular, oral, aerosol, parenteral, ophthalmic, and pulmonary administration.

[00220] The Class 2 Type II CRISPR-Cas9-associated discontinuous-helical triplex-associated nucleic acid-targeting nucleic acid compositions described herein (e.g., dht-NATNAs, dht-NATNA/Cas9 nucleoprotein compositions) provide a number of advantages including, but not limited to, the following:

Modified binding affinity of a dht-NATNA/Cas9 nucleoprotein complex for a nucleic acid target sequence;

Increased binding affinity of one or more polynucleotides of a dht-NATNA composition (e.g., a dht-casPN1 and/or a dht-casPN2) to a Cas9 protein using covalent cross linking or tethering of the one or more polynucleotides of a dht-NATNA composition to a Cas9 protein versus employing a dual-guide RNA or sgRNA NATNA charge-based interaction with a Cas9 protein;

Resistance to RNase degradation provided by modified thiol-linkages of one or more dht-casPNs of a dht-NATNA composition (e.g., a dht-casPN1 and/or a dht-casPN2);

Fast screening of dht-NATNA component compatibilities, e.g., for a dht-NATNA comprising a dht-casPN1 and a dht-casPN2, screens can be developed by creating a Csy4-dht-casPN2 library and pairing each dht-casPN2 of the library with the same dht-casPN1 and (dCsy4)-Cas protein for screening; and

Improved cell delivery, e.g., for a dht-NATNA comprising a dht-casPN1 and a dht-casPN2, of dht-casPN2s into cells expressing dht-casPN1s and Cas9 protein versus delivery of a similarly targeted crRNA into cells expressing tracrRNA and Cas protein, due to the smaller size of the dht-casPN2s.

Embodiments of the present invention include, but are not limited to, the following:

Embodiment 1. A Class 2 Type II CRISPR-Cas9-associated discontinuous-helical triplex nucleic acid-targeting nucleic acid (dht-NATNA) composition, comprising:

a first Class 2 Type II CRISPR-Cas9-associated discontinuous-helical triplex single-stranded polynucleotide (dht-casPN1) comprising a spacer sequence, a second Class 2 Type II CRISPR-Cas9-associated discontinuous-helical single-stranded polynucleotide (dht-casPN2) comprising a nexus, and a third Class 2 Type II CRISPR-Cas9-associated discontinuous-helical triplex single-stranded polynucleotide (dht-casPN3), each having a 5' end and 3' end, wherein each of the dht-casPN1, the dht-casPN2, and the dht-casPN3 is a continuous series of covalently connected nucleotides, and the 3' end of the dht-casPN2 and the 5' end of the dht-casPN3 are located 3' of the nexus; and

wherein the dht-casPN1 connects with the dht-casPN2 through hydrogen base-pair bonds, and the dht-casPN2 connects with the dht-casPN3 through hydrogen base-pair bonds to form the dht-NATNA; the dht-NATNA is capable of forming a nucleoprotein complex with a Cas9 protein; and the nucleoprotein complex is capable of binding or binding/cleaving a target nucleic acid sequence complementary to a target nucleic acid binding sequence of the dht-NATNA.

Embodiment 2. The dht-NATNA composition of embodiment 1,

wherein the dht-casPN1 further comprises:

a first segment comprising a spacer sequence and a first stem repeat nucleotide sequence I, and

wherein the dht-casPN2 and the dht-casPN3 together further comprise:

a second segment comprising a first stem repeat nucleotide sequence II, a joining nucleotide sequence, a nexus stem nucleotide sequence I, and a first linker element nucleotide sequence I,

a third segment comprising first linker nucleotide sequence II and a nexus stem nucleotide sequence II,

a second connecting nucleotide sequence covalently connecting the 3' end of the second segment to the 5' end of the third segment,

a fourth segment comprising a third stem nucleotide sequence I,

a third connecting nucleotide sequence covalently connecting the 3' end of the third segment to the 5' end of the fourth segment,

a fifth segment comprising a third stem nucleotide sequence II,

a fourth connecting nucleotide sequence covalently connecting the 3' end of the fourth segment to the 5' end of the fifth segment,

a sixth segment comprising a helical-triplex forming nucleotide sequence, and a fifth connecting nucleotide sequence covalently connecting the 3' end of the fifth segment to the 5' end of the sixth segment, and

the 3' end of the dht-casPN2 and the 5' end of the dht-casPN3 are between the 5' end of the fourth segment and the 3' end of the sixth segment;

wherein the first stem repeat nucleotide sequence I is connected through hydrogen base-pair bonds to the first stem repeat nucleotide sequence II and forms a first stem, the nexus stem nucleotide sequence I is connected through hydrogen base-pair bonds to the nexus stem nucleotide sequence II and forms a nexus, the third stem nucleotide sequence I is connected through hydrogen base-pair bonds to the third stem nucleotide sequence II and forms a 3' stem loop, and the triplex-forming nucleotide sequence connects with the nexus through hydrogen base-pair bonding and forms a helical triplex; and

wherein the dht-NATNA is capable of forming a nucleoprotein complex with a Cas9 protein, and the nucleoprotein complex is capable of binding or binding/cleaving a target nucleic acid sequence complementary to a target nucleic acid binding sequence of the dht-NATNA.

Embodiment 3. The dht-NATNA composition of embodiment 2, wherein the dht-casPN2 comprises at least the second segment and the third segment; and the dht-casPN3 comprises at least the fourth segment, the fifth segment, and the sixth segment.

Embodiment 4. The dht-NATNA composition of embodiment 2, wherein the dht-casPN2 comprises at least the second segment, the third segment, and the fourth segment; and the dht-casPN3 comprises at least the fifth segment and the sixth segment.

Embodiment 5. The dht-NATNA composition of embodiment 2, wherein the dht-casPN2 comprises at least the second segment, the third segment, and a first portion of the fourth segment; and the dht-casPN3 comprises at least a second portion of the fourth segment, the fifth segment, and the sixth segment.

Embodiment 6. A Class 2 Type II CRISPR-Cas9-associated discontinuous-helical triplex nucleic acid-targeting nucleic acid (dht-NATNA) composition, comprising:

a first Class 2 Type II CRISPR-Cas9-associated discontinuous-helical triplex single-stranded polynucleotide (dht-casPN1) comprising a spacer and a nexus, and a second Class 2 Type II CRISPR-Cas9-associated discontinuous-helical triplex single-stranded polynucleotide (dht-casPN2), each having a 5' end and 3' end, wherein the dht-casPN1 and the dht-casPN2 are each a continuous series of covalently connected nucleotides, and the 3' end of the dht-casPN1 and the 5' end of the dht-casPN2 are located 3' of the nexus; and

wherein the dht-casPN1 connects with the dht-casPN2 through hydrogen base-pair bonds to form the dht-NATNA, the dht-NATNA is capable of forming a nucleoprotein complex with a Cas9 protein, and the nucleoprotein complex is capable of binding or binding/cleaving a target nucleic acid sequence complementary to a target nucleic acid binding sequence of the dht-NATNA.

Embodiment 7. The dht-NATNA composition of embodiment 6,

wherein the dht-casPN1 and dht-casPN2 together further comprise:

a first segment comprising a spacer sequence and a first stem repeat nucleotide sequence I,

a second segment comprising a first stem repeat nucleotide sequence II, a joining nucleotide sequence, a nexus stem nucleotide sequence I, and a first linker element nucleotide sequence I,

a first connecting nucleotide sequence covalently connecting the 3' end of the first segment to the 5' end of the second segment,

a third segment comprising first linker nucleotide sequence II and a nexus stem nucleotide sequence II,

a second connecting nucleotide sequence covalently connecting the 3' end of the second segment to the 5' end of the third segment,

a fourth segment comprising a third stem nucleotide sequence I,

a third connecting nucleotide sequence covalently connecting the 3' end of the third segment to the 5' end of the fourth segment,

a fifth segment comprising a third stem nucleotide sequence II,

a fourth connecting nucleotide sequence covalently connecting the 3' end of the fourth segment to the 5' end of the fifth segment,

a sixth segment comprising a helical-triplex forming nucleotide sequence, and a fifth connecting nucleotide sequence covalently connecting the 3' end of the fifth segment to the 5' end of the sixth segment, and

the 3' end of the dht-casPN1 and the 5' end of the dht-casPN2 are between the 5' end of the fourth segment and the 3' end of the sixth segment;

wherein the first stem repeat nucleotide sequence I is connected through hydrogen base-pair bonds to the first stem repeat nucleotide sequence II and forms a first stem, the nexus stem nucleotide sequence I is connected through hydrogen base-pair bonds to the nexus stem nucleotide sequence II and forms a nexus, the third stem nucleotide sequence I is connected through hydrogen base-pair bonds to the third stem nucleotide sequence II and forms a 3' stem loop, and the triplex-forming nucleotide sequence connects with the nexus through hydrogen base-pair bonding and forms a helical triplex; and

wherein the dht-NATNA is capable of forming a nucleoprotein complex with a Cas9 protein, and the nucleoprotein complex is capable of binding or binding/cleaving a target nucleic acid sequence complementary to a target nucleic acid binding sequence of the dht-NATNA.

Embodiment 8. The dht-NATNA composition of embodiment 7, wherein the dht-casPN1 comprises at least the first segment, the second segment and the third segment; and the dht-casPN2 comprises at least the fourth segment, the fifth segment, and the sixth segment.

Embodiment 9. The dht-NATNA composition of embodiment 7, wherein the dht-casPN1 comprises at least the first segment, the second segment, the third segment, and the fourth segment; and the dht-casPN2 comprises at least the fifth segment and the sixth segment.

Embodiment 10. The dht-NATNA composition of embodiment 7, wherein the dht-casPN1 comprises at least the first segment, the second segment, the third segment, and a first portion of the fourth segment; and the dht-casPN2 comprises at least a second portion of the fourth segment, the fifth segment, and the sixth segment.

Embodiment 11. A Class 2 Type II CRISPR-Cas9-associated discontinuous-helical triplex nucleic acid-targeting nucleic acid (dht-NATNA) composition, comprising:

a first Class 2 Type II CRISPR-Cas9-associated discontinuous-helical triplex single-stranded polynucleotide (dht-casPN1) comprising,

a first segment comprising a spacer sequence and a first stem repeat nucleotide sequence I,

a second segment comprising a first stem repeat nucleotide sequence II, a joining nucleotide sequence, a nexus stem nucleotide sequence I, and a first linker element nucleotide sequence I,

a first connecting nucleotide sequence covalently connecting the 3' end of the first segment to the 5' end of the second segment,

a third segment comprising first linker nucleotide sequence II and a nexus stem nucleotide sequence II,

a second connecting nucleotide sequence covalently connecting the 3' end of the second segment to the 5' end of the third segment,

a fourth segment comprising a third stem nucleotide sequence I, and

a second Class 2 Type II CRISPR-Cas9-associated discontinuous-helical triplex single-stranded polynucleotide (dht-casPN2) comprising,

a fifth segment comprising a third stem nucleotide sequence II,

a sixth segment comprising a helical-triplex forming nucleotide sequence, and a fifth connecting nucleotide sequence covalently connecting the 3' end of the fifth segment to the 5' end of the sixth segment;

wherein the first stem repeat nucleotide sequence I is connected through hydrogen base-pair bonds to the first stem repeat nucleotide sequence II and forms a first stem, the nexus stem nucleotide sequence I is connected through hydrogen base-pair bonds to the nexus stem nucleotide sequence II and forms a nexus, the third stem nucleotide sequence I is connected through hydrogen base-pair bonds to the third stem nucleotide sequence II and forms a 3' stem loop, and the triplex-forming nucleotide sequence connects with the nexus through hydrogen base-pair bonding and forms a helical triplex; and

wherein the dht-NATNA is capable of forming a nucleoprotein complex with a Cas9 protein, and the nucleoprotein complex is capable of binding or binding/cleaving a target nucleic acid sequence complementary to a target nucleic acid binding sequence of the dht-NATNA.

Embodiment 12. The dht-NATNA composition of embodiment 11, wherein the third stem nucleotide sequence I further comprises an additional 3' nucleic acid sequence and the third stem nucleotide sequence II further comprises an additional 5' nucleic acid sequence, and wherein the additional 3' nucleic acid sequence is connected through hydrogen base-pair bonds to the additional 5' nucleic acid sequence.

Embodiment 13. The dht-NATNA composition of any one of embodiments 1 to 5, wherein dht-casPN1, dht-casPN2, or dht-casPN3 comprise DNA, RNA, or DNA and RNA.

Embodiment 14. The dht-NATNA composition of any one of embodiments 1 to 5, wherein at least two of dht-casPN1, dht-casPN2, and dht-casPN3 comprise DNA, RNA, or DNA and RNA.

Embodiment 15. The dht-NATNA composition of any one of embodiments 1 to 5, wherein dht-casPN1, dht-casPN2, and dht-casPN3 comprise DNA, RNA, or DNA and RNA.

Embodiment 16. The dht-NATNA composition of any one of embodiments 6 to 12, wherein dht-casPN1, dht-casPN2, or dht-casPN1 and dht-casPN2, comprise DNA, RNA, or DNA and RNA.

Embodiment 17. A cell, comprising: the dht-NATNA composition of any preceding embodiment.

Embodiment 18. A nucleoprotein composition, comprising: the dht-NATNA composition of any one of embodiments 1 to 16; and a Cas9 protein.

Embodiment 19. The nucleoprotein composition of embodiment 18, wherein the dht-NATNA composition is in a complex with the Cas9 protein.

Embodiment 20. The nucleoprotein composition of embodiment 19, wherein the Cas9 protein is enzymatically inactive.

Embodiment 21. A cell comprising the nucleoprotein composition of any one of embodiments 18 to 20.

Embodiment 22. One or more nucleic acid sequences encoding one or more of dht-casPN1, dht-casPN2, and dht-casPN3 of the dht-NATNA composition of any one of embodiments 1 to 5.

Embodiment 23. One or more nucleic acid sequences encoding one or more of dht-casPN1 and dht-casPN2 of the dht-NATNA composition of any one of embodiments 6 to 12.

Embodiment 24. An expression cassette comprising the one or more nucleic acid sequences of embodiment 22 or 23.

Embodiment 25. A vector comprising the expression cassette of embodiment 24.

Embodiment 26. A method of binding a nucleic acid sequence comprising:

providing the nucleoprotein composition of any one of embodiments 18 to 20 for introduction into a cell or biochemical reaction; and, introducing the nucleoprotein composition into the cell or the biochemical reaction, thereby facilitating contact of a target nucleic acid sequence in the nucleic acid sequence with the nucleoprotein composition resulting in binding of the nucleoprotein composition to the target nucleic acid sequence in the nucleic acid sequence.

Embodiment 27. The method of embodiment 26, wherein genomic DNA comprises the nucleic acid sequence.

Embodiment 28. A method of cutting a nucleic acid sequence comprising: providing the nucleoprotein complex of any one of embodiments 18 to 20 for introduction into a cell or biochemical reaction; and, introducing the nucleoprotein complex into the cell or the biochemical reaction, thereby facilitating contact of a nucleic acid target sequence in the nucleic acid sequence with the nucleoprotein complex resulting in binding of the nucleoprotein complex to the nucleic acid target sequence and cutting of the nucleic acid target sequence.

Embodiment 29. The method of embodiment 28, wherein genomic DNA comprises the nucleic acid sequence.

Embodiment 30. A kit, comprising: the dht-NATNA composition of any one of embodiments 1 to 16; and a buffer.

Embodiment 31. A kit comprising: the one or more nucleic acid sequences of embodiment 22 or 23; and a buffer.

Embodiment 32. The kit of embodiment 30 or 31, further comprising a Cas9 protein or a nucleotide sequence encoding a Cas9 protein.

Experimental

[00221] Aspects of the present invention are illustrated in the following Examples. Efforts have been made to ensure accuracy with respect to numbers used (e.g., amounts, concentrations, percent changes, and the like), but some experimental errors and deviations should be accounted for. Unless indicated otherwise, temperature is in degrees Centigrade and pressure is at or near atmospheric. It should be understood that these Examples are given by way of illustration only and are not intended to limit the scope of what the inventor regards as various aspects of the present invention. [00222]

Example 1

Component Production of Class 2 Type II CRISPR-Cas9-Associated Discontinuous-Helical Triplex Nucleic Acid-Targeting Nucleic Acids (dht-NATNAs)

[00223] This Example describes production of polynucleotide components of engineered Class 2 Type II CRISPR-Cas9-associated discontinuous-helical triplex (dht) nucleic acid-targeting nucleic acid (dht-NATNA) compositions comprising dht Cas9-associated polynucleotides (dht-casPNs). The dht-casPN components of the dht-NATNA compositions were either assembled by PCR using 3’ overlapping primers containing DNA sequences corresponding to each dht-casPN or ordered as synthetic RNA reagents. Following the guidance of the present specification, additional dht-NATNA compositions can be designed for production by one of ordinary skill in the art.

[00224] A. Production of sgRNA and dht1•2•3•4-casPN Components

[00225] Class 2 Type II CRISPR-Cas9-associated sgRNA (illustrated in FIG. 3) and dht-casPNs, for example, dht1•2•3•4-casPNs as illustrated in FIG. 5A(I), FIG. 6A, and FIG. 6F, were produced as set forth herein.

[00226] RNA dht-casPN (dht-casRNA) components were produced by in vitro transcription (e.g., T7 Quick High Yield RNA Synthesis Kit; New England Biolabs, Ipswich, MA) from a double-stranded DNA (dsDNA) template incorporating a T7 promoter at the 5’ end of the DNA sequence.

[00227] The sgRNA and dht1•2•3•4-casRNA were designed to comprise a DNA target binding sequence targeting the adeno-associated virus integration site 1 (AAVS1) from the human genome. The target DNA sequence selected for targeting is shown in Table 6.

[00228]

[00229] The dsDNA templates for sgRNA and the dht1•2•3•4-casRNA components were assembled by PCR using overlapping primers containing DNA sequences corresponding to the sgRNA or the dht1•2•3•4-casRNA component. The oligonucleotides used in the assembly are set forth in Table 7. [00230]

[00231] The DNA primers were present at a concentration of 2nM each. One DNA primer corresponded to the T7 promoter (SEQ ID NO:1) and the other to the 3’ terminus of the RNA sequences (SEQ ID NO:2 or SEQ ID NO:7). These two DNA primers were used at a concentration of 640nM to drive the amplification reaction. PCR reactions were performed using Q5 Hot Start High-Fidelity 2X Master Mix (New England Biolabs, Ipswich, MA) following the manufacturer’s instructions. PCR assembly reactions were carried out using the following thermal cycling conditions: 98°C for 2 minutes; 35 cycles of 15 seconds at 98°C, 15 seconds at 60°C, and 15 seconds at 72°C; and a final extension at 72°C for 2 minutes. DNA product quality was evaluated after the PCR reaction by agarose gel electrophoresis (1.5%, SYBR® Safe; Life Technologies, Grand Island, NY).

[00232] Between 0.25-0.5µg of the DNA template for the sgRNA or dht1•2•3•4-casRNA components was used as a template for transcription using the T7 High Yield RNA Synthesis Kit (New England Biolabs, Ipswich, MA) for approximately 16 hours at 37°C. Transcription reactions were treated with DNase I (New England Biolabs, Ipswich, MA) and purified using the GeneJet RNA Cleanup and Concentration Kit (Life Technologies, Grand Island, NY). RNA yield was quantified using the Nanodrop™ 2000 System (Thermo Scientific, Wilmington, DE). The quality of the transcribed RNA was checked by agarose gel electrophoresis (2%, SYBR® Safe; Life Technologies, Grand Island, NY). The sgRNA and dht1•2•3•4-casRNA sequences are shown in Table 8.

[00233]

* target sequence is underlined

[00234] This method for production of sgRNA and dht1•2•3•4-casRNA can be applied to the production of other dht-NATNA components described herein.

[00235] B. Production of dht5•6-casPN Components

[00236] The dht5•6-casPN (FIG. 5A(II); FIG. 6A and FIG. 6F), comprising RNA (dht5•6-casRNA), was produced by annealing two complementary DNA oligonucleotides, resulting in a double-stranded DNA (dsDNA) template comprising a T7 promoter 5’ of the dht5•6-casPN sequence. To form the dsDNA template, the two complementary oligonucleotides (SEQ ID NOs: 18 and 19) were ordered from a commercial manufacturer and mixed at an equimolar concentration of 1µM in 10mM Tris-Cl (pH 7.5) and 1mM MgCl2. The oligonucleotide mixture was incubated in a thermocycler at 95°C for 2 minutes, followed by cooling at -0.5°C/sec to 25°C. The dsDNA template was then used in an in vitro transcription reaction in a manner similar to the method described above. The sequence of the dht5•6-casRNA is shown in Table 9.

[00237]

[00238] Alternatively, dht5•6-casRNA can be ordered as synthesized RNA from a commercial manufacturer. These methods for the production of dht5•6-casRNA can be applied to the production of other dht-NATNA components described herein. [00239]

Example 2

Production of Target dsDNA Sequences for Use in Cas Protein Cleavage Assays

[00240] Target dsDNA sequences for use in in vitro Cas protein cleavage assays were produced using PCR amplification of selected nucleic acid target sequences from genomic human DNA.

[00241] Target dsDNA sequences from the human genomic AAVS1 locus for biochemical assays were amplified by PCR from genomic DNA (gDNA) prepared by phenol-chloroform extraction from the human cell line K562 (American Type Culture Collection (ATCC), Manassas, VA). PCR reactions were carried out with Q5 Hot Start High-Fidelity 2X Master Mix (New England Biolabs, Ipswich, MA), following the manufacturer’s instructions. 1ng/µL gDNA in a final volume of 25µL was used to amplify the selected nucleic acid target sequence under the following conditions: 98°C for 2 minutes; 29 cycles of 10 seconds at 98°C, 10 seconds at 58°C, and 20 seconds at 72°C; and a final extension at 72°C for 2 minutes. PCR products were purified using Spin Smart™ PCR purification tubes (Denville Scientific, South Plainfield, NJ) and were quantified using a Nanodrop™ 2000 UV-Vis spectrophotometer (Thermo Scientific, Wilmington, DE).

[00242] Forward and reverse primers for amplification of the selected target DNA sequences from gDNA are set forth in Table 10.

[00243]

[00244] The AAVS1 target DNA sequences were amplified using SEQ ID NO: 12 and SEQ ID NO: 13, yielding a 532bp target dsDNA sequence.

[00245] Other suitable target dsDNA sequences can be obtained using essentially the same method. For non-human nucleic acid target sequences, genomic DNA from the selected organism (e.g., plant, bacteria, yeast, algae, or mammalian) can be used instead of DNA derived from human cells. In addition, polynucleotide sources other than genomic DNA can be used (e.g., vectors and gel isolated DNA fragments). [00246]

Example 3

Cas Cleavage Assays

[00247] This Example illustrates the use of dht-NATNA/Cas9 nucleoprotein complexes in cleavage assays. sgRNA/Cas9 and dht1•2•3•4-casPN/dht5•6-casPN/Cas9 nucleoprotein complexes were used in in vitro Cas9 cleavage assays to evaluate and compare the percent cleavage, by selected dht1•2•3•4-casPN/dht5•6-casPN/Cas9 nucleoprotein complexes, of the selected target dsDNA sequences set forth in Example 2.

[00248] C. jejuni Cas9 was recombinantly expressed in Escherichia coli and purified for use in the in vitro biochemical cleavage assay.

[00249] sgRNA (SEQ ID NO:8; Example 1A) was diluted to a suitable working concentration and assembled in a single tube at a final concentration of 200nM. Corresponding pairs of dht1•2•3•4-casRNA (SEQ ID NO:9) and dht5•6-casRNA (SEQ ID NO:10) components, as produced in Example 1A (dht1•2•3•4-casRNA) and Example 1B (dht5•6-casRNA), were diluted to a suitable working concentration and assembled in a single tube at a final concentration of 200nM each. As a control, a dht1•2•3•4-casRNA (SEQ ID NO:9) was diluted to a suitable working concentration, and reaction components were assembled in a single tube at a final concentration of 200nM for the dht1•2•3•4-casRNA. All NATNA components were incubated in a thermocycler for 2 minutes at 95°C, removed from the thermocycler, and allowed to equilibrate to room temperature.

[00250] sgRNA, dht1•2•3•4-casRNA and dht5•6-casRNA pairs, and dht1•2•3•4-casRNA (control) comprised separate reaction mixes. Each reaction mix comprised a Cas9 protein diluted to a final concentration of 200nM in reaction buffer (20mM HEPES, 100mM KCl, 5mM MgCl2, and 5% glycerol at pH 7.4). An additional control reaction mix containing a Cas9 protein (but no NATNA) was also used. Each reaction mix was incubated at 37°C for 10 minutes. The cleavage reactions were initiated by the addition of the target DNA sequence to a final concentration of 7.5nM. Samples were mixed and centrifuged briefly before being incubated for 15 minutes at 37°C. Cleavage reactions were terminated by the addition of Proteinase K (Denville Scientific, South Plainfield, NJ) at a final concentration of 0.2µg/µL and 0.44mg/µL RNase A Solution (SigmaAldrich, St. Louis, MO). Samples were incubated for 25 minutes at 37°C and 25 minutes at 55°C. For each sample, 12µL of the total reaction was mixed with 3µL of gel loading dye (New England Biolabs, Ipswich, MA) and evaluated for cleavage activity by agarose gel electrophoresis (2%, SYBR® Gold; Life Technologies, Grand Island, NY). For the Cas9 cleavage of the AAVS1 target dsDNA sequence, the appearance of DNA bands at approximately 265bp and approximately 267bp indicated cleavage of the target DNA sequence. Cleavage percentages were calculated using area under the curve (AUC) values as calculated by FIJI (ImageJ; an open source Java image processing program) for each cleavage fragment and the parent target DNA sequence. The sum of the cleavage fragments was divided by the sum of both the cleavage fragments and the parent target DNA sequences to provide a percent cleavage.
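The percent cleavage calculation described in paragraph [00250] can be illustrated with a minimal Python sketch. The function name and the AUC values below are hypothetical and serve only to show the arithmetic; they are not data from Table 11.

```python
def percent_cleavage(fragment_aucs, parent_auc):
    """Percent cleavage from gel band intensities.

    fragment_aucs: AUC values of the cleavage fragment bands.
    parent_auc:    AUC value of the uncleaved parent target band.
    """
    cleaved = sum(fragment_aucs)
    total = cleaved + parent_auc
    if total <= 0:
        raise ValueError("total band intensity must be positive")
    # Sum of fragments divided by sum of fragments plus parent, as a percentage.
    return 100.0 * cleaved / total


# Hypothetical AUC values for the two cleavage fragments and the parent band.
example = percent_cleavage([1200.0, 1300.0], 2500.0)
print(f"{example:.1f}% cleavage")
```

A lane with no detectable fragment bands gives 0% under this formula, which corresponds to a result below the limit of detection.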

[00251] Table 11 presents the results of the Cas9 cleavage assays using AAVS1 target dsDNA sequences and dht-NATNA components.

[00252]

*L.O.D.: below limit of detection

[00253] The data presented in Table 11 demonstrate that the dht-NATNAs of the present invention facilitated Cas protein mediated site-specific cleavage of target dsDNA sequences.

[00254] Following the guidance of the present specification, the biochemical cleavage assay described in this Example can be practiced by one of ordinary skill in the art with other dht-NATNAs and their cognate Cas9 proteins.

[00255]

Example 4

Deep Sequencing Analysis for Detection of

Nucleic Acid Target Sequence Modifications in Eukaryotic Cells

[00256] This Example describes the use of deep sequencing analysis to evaluate and compare the percent cleavage of selected target dsDNA sequences in cells by dht-NATNA/Cas9 nucleoprotein complexes.

[00257] A. Formation of dht1•2•3•4-casRNA/dht5•6-casRNA/Cas9 Nucleoprotein Complexes

[00258] dht1•2•3•4-casRNA/dht5•6-casRNA targeting the human AAVS1 genomic DNA target can be produced as described in Example 1A and Example 1B. RNA sequences for exemplary dht1•2•3•4-casRNA and dht5•6-casRNA are shown in Table 8 and Table 9, respectively.

[00259] C. jejuni Cas9 can be tagged at the C-terminus with one nuclear localization sequence (NLS), recombinantly expressed in E. coli, and purified using chromatographic methods. Nucleoprotein complexes can be formed at a concentration of 40pmol Cas9 protein:120pmol dht1•2•3•4-casRNA/dht5•6-casRNA. Prior to assembly with Cas9, each of the dht1•2•3•4-casRNA and dht5•6-casRNA can be diluted to the desired total concentration (120pmol) in a final volume of 2µL, incubated for 2 minutes at 95°C, cooled at -0.5°C/sec to 25°C, and then removed from the thermocycler. Cas9 protein can be diluted to an appropriate concentration in binding buffer (20mM HEPES, 100mM KCl, 5mM MgCl2, and 5% glycerol at pH 7.4) to a final volume of 3µL and can be mixed with the 2µL of dht1•2•3•4-casRNA/dht5•6-casRNA, followed by incubation at 37°C for 10 minutes.

[00260] B. Cell Transfections Using dht1•2•3•4-casRNA/dht5•6-casRNA/Cas9

[00261] dht1•2•3•4-casRNA/dht5•6-casRNA/Cas9 nucleoprotein complexes can be transfected into HEK293 cells (ATCC, Manassas, VA) using the Nucleofector® 96-well Shuttle System (Lonza, Allendale, NJ) and the following protocol. The complexes can be dispensed in a 5µL final volume into individual wells of a 96-well plate. The cell culture medium can be removed from the HEK293 cell culture plate and the cells detached with TrypLE™ (Thermo Scientific, Wilmington, DE). Suspended HEK293 cells can be pelleted by centrifugation for 3 minutes at 200 x g, TrypLE reagents aspirated, and cells can be washed with calcium and magnesium-free phosphate buffered saline (PBS). Cells can be pelleted by centrifugation for 3 minutes at 200 x g, the PBS aspirated, and the cell pellet re-suspended in 10mL of calcium and magnesium-free PBS.

[00262] The cells can be counted using the Countess® II Automated Cell Counter (Life Technologies, Grand Island, NY). 2.2 x 10^7 cells can be transferred to a 1.5mL microfuge tube and pelleted. The PBS can be aspirated and the cells re-suspended in Nucleofector™ SF (Lonza, Allendale, NJ) solution to a density of 1 x 10^7 cells/mL. 20µL of the cell suspension can then be added to each individual well containing 5µL of nucleoprotein complexes, and the entire volume from each well can be transferred to a well of a 96-well Nucleocuvette™ Plate (Lonza, Allendale, NJ). The plate can be loaded onto the Nucleofector™ 96-well Shuttle™ (Lonza, Allendale, NJ) and cells nucleofected using the 96-CM-130 Nucleofector™ program (Lonza, Allendale, NJ). Post-nucleofection, 70µL Dulbecco’s Modified Eagle Medium (DMEM; Thermo Scientific, Wilmington, DE), supplemented with 10% Fetal Bovine Serum (FBS; Thermo Scientific, Wilmington, DE), penicillin, and streptomycin (Life Technologies, Grand Island, NY), can be added to each well, and 50µL of the cell suspension can be transferred to a 96-well cell culture plate containing 150µL pre-warmed DMEM complete culture medium. The plate can be transferred to a tissue culture incubator and maintained at 37°C in 5% CO2 for 48 hours.
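As a worked check of the cell numbers in the protocol above, the following Python sketch computes the resuspension volume, the number of cells delivered per well, and the number of 20µL aliquots available. It assumes the stated density is 1 x 10^7 cells per mL; the function and variable names are illustrative only.

```python
def nucleofection_plan(total_cells, density_cells_per_ul, aliquot_ul):
    """Return (resuspension volume in uL, cells per aliquot, number of aliquots).

    Integer arithmetic is used throughout to avoid floating-point rounding.
    """
    volume_ul = total_cells // density_cells_per_ul
    cells_per_aliquot = density_cells_per_ul * aliquot_ul
    n_aliquots = volume_ul // aliquot_ul
    return volume_ul, cells_per_aliquot, n_aliquots


# 2.2 x 10^7 cells at 1 x 10^7 cells/mL (= 10,000 cells/uL), 20 uL per well.
volume_ul, cells_per_well, wells = nucleofection_plan(22_000_000, 10_000, 20)
print(volume_ul, cells_per_well, wells)
```

Under these assumptions the suspension volume is 2200µL, each well receives 2 x 10^5 cells, and the suspension covers 110 wells, which leaves a margin over the 96 wells of a single plate.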

[00263] C. Target dsDNA Sequence Generation for Deep Sequencing

[00264] gDNA can be isolated from the HEK293 cells 48 hours after transfection with the dht-NATNA/Cas9 nucleoprotein complexes using 50µL QuickExtract™ DNA extraction solution (Epicentre, Madison, WI) per well, followed by incubation at 37°C for 10 minutes, 65°C for 6 minutes, and 95°C for 3 minutes to stop the reaction. The isolated gDNA can be diluted with 50µL sterile water and samples can be stored at -80°C.

[00265] Using the isolated gDNA, a first PCR can be performed using Q5 Hot Start High-Fidelity 2X Master Mix (New England Biolabs, Ipswich, MA) at 1x concentration, primers at 0.5µM each (SEQ ID NO:14 and SEQ ID NO:15), and 3.75µL of gDNA in a final volume of 10µL, with amplification at 98°C for 1 minute; 35 cycles of 10 seconds at 98°C, 20 seconds at 60°C, and 30 seconds at 72°C; and a final extension at 72°C for 2 minutes. Primers can be designed to amplify the region of the genome targeted for cleavage by the dht-NATNA/Cas9 nucleoprotein complex. The PCR reaction can be diluted 1:100 in water.

[00266] A unique set of index primers for a "barcoding" PCR can be used to facilitate multiplex sequencing for each sample. Exemplary primer pairs are shown in Table 12.

[00267]

[00268] Barcoding PCR can be performed using a reaction mix comprising Q5 Hot Start High-Fidelity 2X Master Mix (New England Biolabs, Ipswich, MA) at 1x concentration, primers at 0.5µM each (Table 12), and 1µL of 1:100 diluted first PCR in a final volume of 10µL. The reaction mix can be amplified as follows: 98°C for 1 minute; followed by 12 cycles of 10 seconds at 98°C, 20 seconds at 60°C, and 30 seconds at 72°C; with a final extension reaction at 72°C for 2 minutes.

[00269] D. SPRIselect Clean-up

[00270] The PCR reactions can be pooled and transferred into a single microfuge tube for SPRIselect (Beckman Coulter, Pasadena, CA) bead-based cleanup of amplicons for sequencing.

[00271] To the amplicon, 0.9x volumes of SPRIselect beads can be added, mixed, and incubated at room temperature for 10 minutes. The microfuge tube can be placed on a magnetic tube stand (Beckman Coulter, Pasadena, CA) until the solution clears. Supernatant can be removed and discarded, the residual beads can be washed with 1 volume of 85% ethanol, and the beads can be incubated at room temperature for 30 seconds. After incubation, ethanol can be aspirated and the beads air-dried at room temperature for 10 minutes. The microfuge tube can be removed from the magnetic stand and 0.25x volumes of Qiagen EB buffer (Qiagen, Venlo, Netherlands) added to the beads, mixed vigorously, and incubated for 2 minutes at room temperature. The microfuge tube can be returned to the magnet, incubated until the solution has cleared, and supernatant containing the purified amplicons dispensed into a clean microfuge tube. The purified amplicon can be quantified using the Nanodrop™ 2000 System (Thermo Scientific, Wilmington, DE) and library quality analyzed using the Fragment Analyzer™ System (Advanced Analytical Technologies, Ames, IA) and the DNF-910 dsDNA Reagent Kit (Advanced Analytical Technologies, Ames, IA).

[00272] E. Deep Sequencing Set-up

[00273] The pooled amplicons can be normalized to a 4nM concentration as calculated from the Nanodrop™ 2000 System values and the average size of the amplicons. The library can be analyzed on a MiSeq Sequencer (Illumina, San Diego, CA) with MiSeq Reagent Kit v2 (Illumina, San Diego, CA) for 300 cycles with two 151-cycle paired-end runs plus two 8-cycle index reads.
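The normalization to 4nM can be sketched in Python as follows. The conversion assumes an average mass of 660 g/mol per base pair of dsDNA, a commonly used approximation that is not stated in this specification; the function names and the example concentration and amplicon size are hypothetical.

```python
AVG_BP_MASS_G_PER_MOL = 660.0  # assumed average mass of one dsDNA base pair


def library_molarity_nM(conc_ng_per_ul, avg_size_bp):
    """Convert a mass concentration (ng/uL) to molarity (nM) for dsDNA."""
    return conc_ng_per_ul / (AVG_BP_MASS_G_PER_MOL * avg_size_bp) * 1e6


def dilution_volumes(stock_nM, target_nM, final_volume_ul):
    """Return (stock volume, diluent volume) in uL to reach target_nM."""
    if stock_nM < target_nM:
        raise ValueError("stock is more dilute than the target concentration")
    stock_ul = target_nM * final_volume_ul / stock_nM
    return stock_ul, final_volume_ul - stock_ul


# Hypothetical reading: 10 ng/uL with a 400 bp average amplicon size.
stock = library_molarity_nM(10.0, 400)
stock_ul, diluent_ul = dilution_volumes(stock, 4.0, 20.0)
print(f"{stock:.1f} nM stock; mix {stock_ul:.2f} uL library + {diluent_ul:.2f} uL diluent")
```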

[00274] F. Deep Sequencing Data Analysis

[00275] The identities of products in the sequencing data can be determined based on the index barcode sequences adapted onto the amplicons in the barcoding PCR. A computational script can be used to process the MiSeq data that executes, for example, the following tasks:

Reads can be aligned to the human genome (build GRCh38/38) using Bowtie (bowtie-bio.sourceforge.net/index.shtml) software;

Aligned reads can be compared to the expected wild-type AAVS1 locus sequence, and reads not aligning to any part of the AAVS1 locus discarded;

Reads matching wild-type AAVS1 sequence can be tallied;

Reads with indels (insertion or deletion of bases) can be categorized by indel type and tallied; and

Total indel reads can be divided by the sum of wild-type reads and indel reads to give percent-mutated reads.

[00276] Through the identification of indel sequences at regions targeted by the dht-NATNA/Cas9 nucleoprotein complexes, sequence-specific targeting in a human cell line can be determined.

[00277] Following the guidance of the present specification, in-cell editing of a genomic sequence can be practiced by one of ordinary skill in the art with other dht-NATNAs and their cognate Cas9 proteins.
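The tallying steps listed in paragraph [00275] can be sketched in Python as shown below. This is an illustrative simplification, not the script used by the inventor: it assumes each read aligned to the AAVS1 locus is represented by its CIGAR string from the alignment output, and it treats any read without an insertion (I) or deletion (D) operation as wild type, even though the M operation can also cover mismatches.

```python
import re
from collections import Counter

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def classify_read(cigar):
    """Classify one aligned read by the indel operations in its CIGAR string."""
    ops = {op for _, op in CIGAR_OP.findall(cigar)}
    has_ins, has_del = "I" in ops, "D" in ops
    if has_ins and has_del:
        return "insertion+deletion"
    if has_ins:
        return "insertion"
    if has_del:
        return "deletion"
    return "wild_type"


def percent_mutated(cigars):
    """Tally reads by category and return (tally, percent-mutated reads)."""
    tally = Counter(classify_read(c) for c in cigars)
    wild = tally["wild_type"]
    indel = sum(n for kind, n in tally.items() if kind != "wild_type")
    total = wild + indel
    pct = 100.0 * indel / total if total else 0.0
    return tally, pct


# Hypothetical CIGAR strings for four reads aligned to the target locus.
tally, pct = percent_mutated(["151M", "151M", "70M3D78M", "60M2I89M"])
print(dict(tally), f"{pct:.1f}% mutated")
```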

[00278]

Example 5

Identification and Screening of crRNAs

[00279] This Example describes a method to identify Class 2 Type II crRNAs in different bacterial species. The method set forth here is adapted from Chylinski, K., et al., RNA Biology 10(5):726-737 (2013). Not all of the following steps are required for screening, nor must the order of the steps be as presented.

[00280] A. Identify a Species Containing a Class 2 Type II CRISPR Locus

[00281] Using the Basic Local Alignment Search Tool (BLAST, blast.ncbi.nlm.nih.gov/Blast.cgi), a search of the genomes of various species can be conducted to identify Class 2 Type II CRISPR Cas nucleases (e.g., Cas9). Class 2 Type II CRISPR systems exhibit a high diversity in sequence across species; however, Class 2 Type II CRISPR nuclease orthologs have conserved domains, for example, an HNH endonuclease domain and/or a RuvC/RNase H domain. Primary BLAST results can be filtered for identified domains, incomplete or truncated sequences discarded, and species having Class 2 Type II CRISPR nuclease orthologs identified.

[00282] If a Class 2 Type II CRISPR nuclease ortholog is identified in a species, sequences adjacent to the Cas protein ortholog coding sequence (e.g., Cas9) can be probed for other Cas proteins and an associated repeat-spacer array to identify all sequences belonging to the CRISPR-Cas locus. This may be done by alignment to other known Class 2 Type II CRISPR loci such as that of C. jejuni, which comprises a dual-guide Class 2 Type II CRISPR-associated RNA comprising a helical triplex.

[00283] Once the sequence of the Class 2 Type II CRISPR locus for the nuclease ortholog is identified for the species, in silico predictive screening can be used to extract the crRNA sequence. The crRNA sequence is contained within the CRISPR repeat array and can be identified by its hallmark repeating sequences interspaced by foreign spacer sequences.

[00284] B. Preparation of RNA-seq Library

[00285] The putative CRISPR array containing the individual crRNA identified in silico can be further validated using RNA sequencing (RNA-seq).

[00286] Cells from species identified as comprising putative crRNA can be procured from a commercial repository (e.g., ATCC, Manassas, VA; German Collection of Microorganisms and Cell Cultures GmbH (DSMZ), Braunschweig, Germany).

[00287] Cells can be grown to mid-log phase and total RNA prepared using Trizol reagent (SigmaAldrich, St. Louis, MO) and treated with DNase I (Fermentas, Vilnius, Lithuania).

[00288] 10µg of the total RNA can be treated with Ribo-Zero rRNA Removal Kit (Illumina, San Diego, CA) and the remaining RNA purified using RNA Clean and Concentrators (Zymo Research, Irvine, CA).

[00289] A library can be prepared using a TruSeq Small RNA Library Preparation Kit (Illumina, San Diego, CA), following the manufacturer’s instructions. This will result in cDNAs having adapter sequences.

[00290] The resulting cDNA library can be sequenced using MiSeq Sequencer (Illumina, San Diego, CA).

[00291] C. Processing of Sequencing Data

[00292] Sequencing reads of the cDNA library can be processed, for example, using the following method.

[00293] Adapter sequences can be removed using cutadapt 1.1 (pypi.python.org/pypi/cutadapt/1.1) and about 15nt can be trimmed from the 3’ end of the read to improve read quality.

[00294] Reads can be aligned to the genome of the respective species (i.e., from which the putative crRNA is to be identified) using Bowtie 2 (www.bowtie-bio.sourceforge.net/bowtie2/index.shtml). The Sequence Alignment/Map (SAM) file, which is generated by Bowtie 2, can be converted into a Binary Alignment/Map (BAM) file using SAMTools (www.samtools.sourceforge.net/) for subsequent sequencing analysis steps.

[00295] Read coverage mapping to the CRISPR locus or loci can be calculated from the BAM file using BedTools (www.bedtools.readthedocs.org/en/latest/).

[00296] The BED file, as generated in the previous step, can be loaded into Integrative Genomics Viewer (IGV; www.broadinstitute.org/igv/) to visualize the sequencing read pileup. The read pileup can be used to identify the 5’ and 3’ termini of the transcribed putative crRNA sequence.

[00297] The RNA-seq data can be used to validate that a putative crRNA element is actively transcribed in vivo. Confirmed hits from comparison of the in silico and RNA-seq screens can be validated for functional ability to support Class 2 Type II CRISPR nuclease cleavage of dsDNA nucleic acid target sequences using the methods described herein (e.g., Examples 1, 2, and 3).
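The read-processing steps of this section (adapter removal, 3’ trimming, alignment, SAM-to-BAM conversion, and coverage calculation) can be laid out as a sequence of command lines. The Python sketch below only builds the commands and does not run them. The file names, the adapter sequence, and the index name are placeholders, and the flags shown are those of current releases of cutadapt, Bowtie 2, SAMtools, and bedtools, which may differ from the versions cited above.

```python
import shlex


def build_pipeline(reads_fastq, adapter, bowtie2_index, prefix):
    """Return the command lines for the read-processing steps, in order."""
    no_adapter = f"{prefix}.noadapter.fastq"
    trimmed = f"{prefix}.trimmed.fastq"
    sam = f"{prefix}.sam"
    bam = f"{prefix}.bam"
    sorted_bam = f"{prefix}.sorted.bam"
    return [
        # Remove the 3' adapter sequence from each read.
        ["cutadapt", "-a", adapter, "-o", no_adapter, reads_fastq],
        # Then trim 15 nt from the 3' end (negative length cuts from the end).
        ["cutadapt", "-u", "-15", "-o", trimmed, no_adapter],
        # Align unpaired reads to the genome index and write a SAM file.
        ["bowtie2", "-x", bowtie2_index, "-U", trimmed, "-S", sam],
        # Convert SAM to BAM, then sort by coordinate.
        ["samtools", "view", "-b", "-o", bam, sam],
        ["samtools", "sort", "-o", sorted_bam, bam],
        # Report per-interval coverage in BedGraph format on standard output.
        ["bedtools", "genomecov", "-ibam", sorted_bam, "-bg"],
    ]


# Placeholder inputs; the adapter string must be replaced with the kit's adapter.
for cmd in build_pipeline("reads.fastq", "ADAPTER_SEQUENCE", "genome_index", "sample"):
    print(shlex.join(cmd))
```

The last command writes to standard output, so in practice its output would be redirected to a file before loading into a genome viewer. Keeping the commands as argument lists makes it straightforward to pass each one to `subprocess.run` once the placeholders are filled in.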

[00298] Following the guidance of the present specification, the identification of novel crRNA sequences associated with Class 2 Type II CRISPR nucleases can be practiced by one of ordinary skill in the art.

[00299]

Example 6

Identification and Screening of tracrRNAs

[00300] This Example illustrates a method by which tracrRNAs of species having Class 2 Type II CRISPR systems can be identified. The method set forth here is adapted from Chylinski, K., et al., RNA Biology 10(5):726-737 (2013). Not all of the following steps are required for screening, nor must the order of the steps be as presented.

[00301] A. Identification of a Species Containing a Class 2 Type II CRISPR System

[00302] Using the Basic Local Alignment Search Tool (BLAST, www.blast.ncbi.nlm.nih.gov/Blast.cgi), a search of the genomes of various species can be conducted to identify Class 2 Type II CRISPR Cas nucleases (e.g., Cas9). Class 2 Type II CRISPR systems exhibit a high diversity in sequence across species; however, Class 2 Type II CRISPR Cas nuclease (e.g., Cas9) orthologs exhibit conserved domain architectures of a central HNH endonuclease domain and a split RuvC/RNase H domain. Primary BLAST results can be filtered for identified domains, incomplete or truncated sequences discarded, and Class 2 Type II CRISPR Cas orthologs identified.

[00303] If a Class 2 Type II CRISPR Cas ortholog is identified in a species, sequences adjacent to the Class 2 Type II CRISPR Cas ortholog-coding sequence can be probed for other Cas proteins and a Cas-associated repeat-spacer array to identify all sequences belonging to the CRISPR locus. This may be done by alignment to other known Class 2 Type II CRISPR loci such as that of C. jejuni, which comprises a dual-guide Class 2 Type II CRISPR-associated RNA comprising a helical triplex, with the knowledge that closely related species exhibit similar CRISPR locus architecture (e.g., Cas protein composition, size, orientation, location of array, location of tracrRNA, and the like). The tracrRNA element is typically contained within the Class 2 Type II CRISPR locus and can be readily identified by its sequence complementarity to the repeat elements in the repeat-spacer array. It should be noted that the tracrRNA sequences complementary to the repeat elements are called the tracrRNA "anti-repeat sequences."

[00304] Once the sequence of the CRISPR locus corresponding to the Class 2 Type II CRISPR Cas ortholog is identified for a species, in silico predictive screening can be used to extract the tracr anti-repeat sequence to identify the associated tracrRNA. Putative anti-repeats can be screened, for example, as follows.

[00305] If the repeat sequence is from a known species, the repeat sequence can be identified in, and retrieved from, the CRISPRdb database (www.crispr.u-psud.fr/crispr/). If the repeat sequence is not from a known species, the repeat sequence can be predicted employing CRISPRfinder software (www.crispr.u-psud.fr/Server/) using the Class 2 Type II CRISPR locus for the species, as described above.

[00306] The identified repeat sequence for the species can be used to probe the CRISPR locus for the anti-repeat sequence (e.g., using the BLASTn algorithm or the like). The search is typically restricted to intergenic regions of the CRISPR locus.

[00307] An identified tracr anti-repeat region can be validated for complementarity to the identified repeat sequence.

[00308] The regions 5’ and 3’ of a putative anti-repeat region can be analyzed for a Rho-independent transcriptional terminator (TransTerm HP, www.transterm.cbcb.umd.edu/).

[00309] By combining the identified sequence comprising the anti-repeat element and the Rho-independent transcriptional terminator, the sequence can be determined to be the putative tracrRNA of the given species.
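The complementarity check between a repeat and a candidate anti-repeat can be illustrated with a short Python sketch. It is a deliberately simple stand-in for an alignment tool: it scans one strand of a region for windows that match the reverse complement of the repeat with a limited number of mismatches. The sequences are made up, the function names and the mismatch threshold are arbitrary, and real anti-repeats are often only partially complementary, so an actual screen would rely on a local alignment search.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_anti_repeat_candidates(repeat, region, max_mismatches=2):
    """Scan `region` for windows complementary to `repeat`.

    Returns a list of (start, window, mismatches) tuples for every window
    that differs from the reverse complement of the repeat at no more than
    `max_mismatches` positions.
    """
    target = reverse_complement(repeat).upper()
    region = region.upper()
    hits = []
    for start in range(len(region) - len(target) + 1):
        window = region[start:start + len(target)]
        mismatches = sum(1 for a, b in zip(window, target) if a != b)
        if mismatches <= max_mismatches:
            hits.append((start, window, mismatches))
    return hits


# Made-up repeat and intergenic region containing its exact reverse complement.
repeat = "GTTTTAGTCCCTT"
region = "CCGG" + reverse_complement(repeat) + "TTGA"
for start, window, mismatches in find_anti_repeat_candidates(repeat, region):
    print(start, window, mismatches)
```

To cover both orientations, the same scan can be repeated on the reverse complement of the region.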

[00310] B. Preparation of RNA-seq Library

[00311] The in silico identified, putative tracrRNA can be further validated using RNA sequencing (RNA-seq).

[00312] Cells from species comprising the putative tracrRNA can be procured from a commercial repository (e.g., ATCC, Manassas VA; DSMZ, Braunschweig, Germany).

[00313] Cells can be grown to mid-log phase and total RNA prepared using Trizol reagent (SigmaAldrich, St. Louis, MO) and treated with DNase I (Fermentas, Vilnius, Lithuania).

[00314] 10µg of the total RNA can be treated using a Ribo-Zero rRNA Removal Kit (Illumina, San Diego, CA) and the remaining RNA purified using RNA Clean and Concentrators (Zymo Research, Irvine, CA).

[00315] A library can be prepared using a TruSeq Small RNA Library Preparation Kit (Illumina, San Diego, CA), following the manufacturer’s instructions. This will result in cDNAs having adapter sequences.

[00316] The resulting cDNA library can be sequenced using a MiSeq Sequencer (Illumina, San Diego, CA).

[00317] C. Processing of Sequencing Data

[00318] Sequencing reads of the cDNA library can be processed, for example, using the following method.

[00319] Adapter sequences can be removed using cutadapt 1.1 (www.pypi.python.org/pypi/cutadapt/1.1) and about 15nt can be trimmed from the 3’ end of the read to improve read quality.

[00320] Reads can be aligned to the genome of the respective species (i.e., from which the putative tracrRNA is to be identified) using Bowtie 2 (www.bowtie-bio.sourceforge.net/bowtie2/index.shtml). The Sequence Alignment/Map (SAM) file, generated by Bowtie 2, can be converted into a Binary Alignment/Map (BAM) file using SAMTools (www.samtools.sourceforge.net/) for subsequent sequencing analysis steps.

[00321] Read coverage mapping to the CRISPR locus or loci can be calculated from the BAM file using BedTools (www.bedtools.readthedocs.org/en/latest/).

[00322] The BED file, generated in the previous step, can be loaded into Integrative Genomics Viewer (IGV; www.broadinstitute.org/igv/) to visualize the sequencing read pileup. The read pileup can be used to identify the 5’ and 3’ termini of the transcribed putative tracrRNA sequence.

[00323] The RNA-seq data can be used to validate that a putative tracrRNA element is actively transcribed in vivo. Confirmed hits from the comparison of the in silico and RNA-seq screens can be validated for functional ability of the identified tracrRNA sequence and its cognate crRNA to support Class 2 Type II CRISPR Cas nuclease-mediated cleavage of a target dsDNA sequence using methods described herein (e.g., Examples 1, 2, and 3).

[00324] Following the guidance of the present specification and Examples, the identification of novel tracrRNA sequences related to Class 2 Type II CRISPR nucleases can be practiced by one of ordinary skill in the art.

[00325]

Example 7

T7E1 Assay for Detection of Nucleic Acid Target Sequence Modifications in Eukaryotic Cells

[00326] This Example illustrates the use of T7E1 assays to evaluate and compare the percent cleavage in vivo for dht-NATNA/Cas9 nucleoprotein complexes of selected target dsDNA sequences.

[00327] A. Cell Transfections Using Cas9 Polynucleotide Components

[00328] The dht-NATNAs can be transfected into HEK293 cells constitutively expressing C. jejuni Cas9 using the Nucleofector® 96-well Shuttle System (Lonza, Allendale, NJ) and the following protocol. dht1•2•3•4-casPN/dht5•6-casPN pairs can be diluted to an appropriate concentration (e.g., 120 pmol) and can be incubated for 2 minutes at 95°C, removed from a thermocycler, allowed to equilibrate to room temperature, and dispensed in a 5 µL final volume in a 96-well plate. Culture medium can be aspirated from HEK293-Cas9 cells, the cells washed once with calcium- and magnesium-free PBS, and trypsinized by the addition of TrypLE (Life Technologies, Grand Island, NY), followed by incubation at 37°C for 3-5 minutes. Trypsinized cells can be gently pipetted up and down to form a single-cell suspension and added to DMEM complete culture medium composed of DMEM culture medium (Life Technologies, Grand Island, NY) containing 10% FBS (Thermo Scientific, Wilmington, DE) and supplemented with penicillin and streptomycin (Life Technologies, Grand Island, NY).

[00329] The cells can then be pelleted by centrifugation for 3 minutes at 200 x g, the culture medium aspirated, and the cells re-suspended in PBS. The cells can be counted using the Countess® II Automated Cell Counter (Life Technologies, Grand Island, NY). 2.2 x 10^7 cells can be transferred to a 1.5 mL microfuge tube and pelleted. The PBS can be aspirated and the cells re-suspended in Nucleofector™ SF (Lonza, Allendale, NJ) solution to a density of 1 x 10^7 cells/mL. 20 µL of the cell suspension can be added to individual wells containing 5 µL of the dht1•2•3•4-casPN/dht5•6-casPN and the entire volume transferred to the wells of a 96-well Nucleocuvette™ Plate (Lonza, Allendale, NJ). The plate can be loaded onto the Nucleofector™ 96-well Shuttle™ (Lonza, Allendale, NJ) and cells nucleofected using the 96-CM-130 Nucleofector™ program (Lonza, Allendale, NJ). Post-nucleofection, 70 µL DMEM complete culture medium can be added to each well, and 50 µL of the cell suspension transferred to a collagen-coated 96-well cell culture plate containing 150 µL pre-warmed DMEM complete culture medium. The plate can be transferred to a tissue culture incubator and maintained at 37°C in 5% CO2 for 48 hours.

[00330] B. Target dsDNA Sequence Generation for T7E1 Assay

[00331] gDNA can be isolated from HEK293-Cas9 cells 48 hours after transfection of the dht1•2•3•4-casPN/dht5•6-casPN using 50 µL QuickExtract DNA Extraction solution (Epicentre, Madison, WI) per well, followed by incubation at 37°C for 10 minutes, 65°C for 6 minutes, and 95°C for 3 minutes to stop the reaction. gDNA can then be diluted with 150 µL water and samples stored at -80°C.

[00332] DNA for T7E1 can be generated by PCR amplification of target dsDNA sequences (e.g., AAVS1) from isolated gDNA. PCR reactions can be set up using 8 µL gDNA as template with 0.5 U of KAPA HiFi Hot Start polymerase, 1x reaction buffer, 0.4 mM dNTPs, and 300 nM forward and reverse primers directed to the target dsDNA sequence (e.g., Example 2; SEQ ID NO: 12 and SEQ ID NO: 13) in a total volume of 25 µL. The target DNA sequence can be amplified using the following conditions: 95°C for 5 minutes; 4 cycles of 20 seconds at 98°C, 20 seconds at 70°C (minus 2°C/cycle), and 30 seconds at 72°C; followed by 30 cycles of 15 seconds at 98°C, 20 seconds at 62°C, and 20 seconds at 72°C; and a final extension at 72°C for 1 minute.
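The cycling conditions above can be written out step by step, for example, with a short Python sketch. The function name `touchdown_program` is hypothetical, and the sketch assumes that the first touchdown cycle anneals at 70°C and that each later touchdown cycle anneals 2°C lower than the one before it.

```python
def touchdown_program():
    """Expand the cycling conditions into (step name, temperature in C, seconds)."""
    steps = [("initial denaturation", 95, 5 * 60)]

    # Touchdown phase: 4 cycles, annealing temperature lowered by 2 C per cycle.
    anneal = 70
    for _ in range(4):
        steps.append(("denature", 98, 20))
        steps.append(("anneal", anneal, 20))
        steps.append(("extend", 72, 30))
        anneal -= 2

    # Main phase: 30 cycles at a fixed annealing temperature.
    for _ in range(30):
        steps.append(("denature", 98, 15))
        steps.append(("anneal", 62, 20))
        steps.append(("extend", 72, 20))

    steps.append(("final extension", 72, 60))
    return steps
```

Under these assumptions the touchdown annealing temperatures are 70, 68, 66, and 64°C, so the fixed 62°C annealing step of the main phase continues the same 2°C decrement.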

[00333] C. T7E1 Assay

[00334] PCR-amplified target dsDNA sequences for T7E1 assays can be denatured at 95°C for 10 minutes and then allowed to re-anneal by cooling to 25°C at -0.5°C/s in a thermal cycler. The re-annealed DNA can be incubated with 0.5 µL T7 Endonuclease I in 1x NEBuffer 2 (New England Biolabs, Ipswich, MA) in a total volume of 15 µL for 25 minutes at 37°C. T7E1 reactions can be analyzed using the Fragment Analyzer™ System (Advanced Analytical Technologies, Ames, IA) and the DNF-910 dsDNA Reagent Kit (Advanced Analytical Technologies, Ames, IA). The Fragment Analyzer™ System will provide the concentration of each cleavage fragment and of the target dsDNA sequence that remains after cleavage.

[00335] Cleavage percentages of the target dsDNA sequences can be calculated from the concentration of each cleavage fragment and the target dsDNA sequence that remains after cleavage has taken place, using the following formula:

[00337] In Equation 1, frag1 and frag2 concentrations correspond to the concentration of Cas9 cleavage fragments of the target dsDNA sequence and parent corresponds to the target dsDNA sequence that remains after cleavage has taken place.

[00338] The T7E1 assay for detection of target sequence modifications in eukaryotic cells can provide data demonstrating that the dht1•2•3•4-casPN/dht5•6-casPN/Cas9 nucleoprotein complexes described herein facilitate Cas9-mediated site-specific in vivo cleavage of multiple target dsDNA sequences. sgRNA and/or crRNA/tracrRNA polynucleotides having the same DNA target binding sequence as the dht1•2•3•4-casPN/dht5•6-casPN can also be included in the assay to compare the Cas9-mediated site-specific cleavage percentages between the constructs.
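A cleavage percentage can be computed from the frag1, frag2, and parent concentrations, for example, as in the following Python sketch. The formula coded here is an assumption: it is a commonly used estimate for T7E1 data, in which the cleaved fraction is (frag1 + frag2) / (frag1 + frag2 + parent) and the percentage is 100 x (1 - sqrt(1 - cleaved fraction)), and it may differ from Equation 1 of this Example. The function name `percent_cleavage` is hypothetical.

```python
import math


def percent_cleavage(frag1, frag2, parent):
    """Estimate percent cleavage from fragment and parent concentrations.

    Assumed formula (not taken from the text above):
        100 * (1 - sqrt(1 - (frag1 + frag2) / (frag1 + frag2 + parent)))
    All three inputs must use the same concentration units.
    """
    if min(frag1, frag2, parent) < 0:
        raise ValueError("concentrations must be non-negative")
    total = frag1 + frag2 + parent
    if total <= 0:
        raise ValueError("at least one concentration must be positive")
    cut_fraction = (frag1 + frag2) / total
    # Guard against tiny negative values from floating-point rounding.
    return 100.0 * (1.0 - math.sqrt(max(0.0, 1.0 - cut_fraction)))
```

The square-root term reflects the assumption that re-annealing pairs strands at random, so that only duplexes formed from two unmodified strands escape T7E1 cleavage.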

[00339] Following the guidance of the present specification, the T7E1 assay described in this Example can be practiced by one of ordinary skill in the art with other Type II CRISPR Cas9 proteins and their cognate dht-NATNAs.

[00340]

Example 8

Probing for Sites Tolerant of Modification in Class 2 Type II Cas9 Guide RNA Backbones

[00341] This Example describes methods for the generation and testing of engineered dht-NATNAs.

[00342] Breaks can be introduced into the RNA backbone of Class 2 Type II CRISPR guide RNAs comprising a 3' helical triplex (e.g., sgRNAs or dual-guide RNAs) in the regions 3' of the nexus to identify locations for engineering non-native termini in these regions. The method described below is adapted from Briner, A., et al., Molecular Cell 56(2):333-339 (2014). Not all of the following steps are required for screening, nor must the order of the steps be as presented.

[00343] A guide RNA from a Class 2 Type II CRISPR system comprising a 3' helical triplex (e.g., a sgRNA or a tracrRNA) can be selected for engineering. The guide RNA sequence can be modified in silico to introduce breaks in the nucleic acid sequences 3’ of the nexus element. Furthermore, after introduction of a break into such nucleic acid sequences, bases can be serially deleted 5’ and/or 3’ of the break to determine the effects of removal of multiple bases. Breaks in the nucleic acid backbone can also be used to introduce bases that form novel hydrogen base-pair interactions within the guide RNA backbone. Additionally, bases can be added 5’ and/or 3’ of the break.

[00344] The introduction of a break into the nucleotide sequences 3' of the nexus in a Class 2 Type II CRISPR sgRNA comprising a helical triplex can result, for example, in a dht1•2•3•4-casRNA and a dht5•6-casRNA (see, e.g., FIG. 5B).
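The in silico introduction of breaks and serial deletions described above can be sketched, for example, in Python as follows. This is an illustrative sketch only: the function name `split_variants`, the parameter `nexus_end` (the 0-based index of the first nucleotide 3’ of the nexus), and the deletion limit are hypothetical, and any sequence passed in is supplied by the user rather than taken from this specification.

```python
def split_variants(guide, nexus_end, max_deletion=2):
    """Enumerate two-piece variants of a guide sequence broken 3' of the nexus.

    guide        : guide RNA sequence as a string
    nexus_end    : 0-based index of the first nucleotide 3' of the nexus
    max_deletion : largest number of bases removed on either side of the break

    Returns tuples of (break position, bases deleted 5' of the break,
    bases deleted 3' of the break, 5' piece, 3' piece).
    """
    variants = []
    for pos in range(nexus_end, len(guide)):
        for del5 in range(max_deletion + 1):
            # Never delete into the nexus itself.
            if pos - del5 < nexus_end:
                continue
            for del3 in range(max_deletion + 1):
                # The 3' piece must keep at least one nucleotide.
                if pos + del3 >= len(guide):
                    continue
                five_prime = guide[:pos - del5]
                three_prime = guide[pos + del3:]
                variants.append((pos, del5, del3, five_prime, three_prime))
    return variants
```

Each returned pair of pieces corresponds to one candidate two-component design that can be sent for synthesis and tested for cleavage activity. Base additions 5’ and/or 3’ of the break could be enumerated in the same way by appending or prepending nucleotides to the two pieces.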

[00345] In silico designed dht-NATNA sequences can be provided to a commercial manufacturer for synthesis.

[00346] Engineered dht-NATNAs can be evaluated for their ability to support cleavage of a target dsDNA sequence mediated by their cognate Cas9 protein. Amplification of target dsDNA sequences and biochemical cleavage assays can be carried out in a manner similar to those described in Example 2 and Example 3. dht-NATNAs that are capable of mediating cleavage of a target DNA sequence when in complex with their cognate Cas9 protein can be validated for activity in cells using the method described in Example 4.

[00347] Sites for the introduction of connective nucleotide sequences and linker sequences can be probed in a similar manner.

[00348] Following the guidance of the present specification, breaks can be introduced into the RNA backbone of Class 2 Type II CRISPR guide RNAs comprising a 3' helical triplex (e.g., introduction of breaks in the nucleotide sequences 3' of the nexus) to engineer dht-NATNAs.

[00349]

Example 9

Screening of dht-NATNAs Comprising Target DNA Binding Sequences

[00350] This Example illustrates the use of dht-NATNAs of the present invention to modify target DNA sequences present in human genomic DNA and to measure the level of cleavage activity at those sites.

[00351] Target sites can be first selected from genomic DNA. dht-NATNAs can be designed to target the selected sequences. Assays (e.g., as described in Example 3) can be performed to determine the level of target DNA sequence cleavage.

[00352] Not all of the following steps are required for every screening nor must the order of the steps be as presented, and the screening can be coupled to other experiments or can form part of a larger experiment.

[00353] A. Selecting Target DNA Sequences from Genomic DNA

[00354] PAM sequences (e.g., NNNNRYAC, wherein R is A or G, and Y is C or T) for a Cas9 protein (e.g., C. jejuni Cas9) can be identified within the selected genomic region.

[00355] One or more Cas9 target DNA sequences, 19-23 nucleotides in length, that are 5’ adjacent to a NNNNRYAC PAM sequence can be identified and selected.
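The identification of target DNA sequences 5’ adjacent to a NNNNRYAC PAM can be sketched, for example, with a regular expression in Python. The function names `find_targets` and `reverse_complement` are hypothetical, the default spacer length of 20 is one arbitrary choice within the 19-23 nucleotide range given above, and the sketch scans only the strand it is given; the opposite strand can be scanned by passing the reverse complement.

```python
import re

# NNNNRYAC with R = A or G and Y = C or T; the lookahead allows overlapping PAMs.
PAM_REGEX = re.compile(r"(?=([ACGT]{4}[AG][CT]AC))")

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_targets(seq, spacer_len=20):
    """List (target start, target sequence, PAM) for each PAM on this strand.

    The target is the spacer_len nucleotides immediately 5' of the PAM.
    PAMs too close to the 5' end to leave a full-length target are skipped.
    """
    seq = seq.upper()
    hits = []
    for match in PAM_REGEX.finditer(seq):
        pam_start = match.start()
        if pam_start < spacer_len:
            continue
        target = seq[pam_start - spacer_len:pam_start]
        hits.append((pam_start - spacer_len, target, match.group(1)))
    return hits
```

Calling `find_targets` once on a genomic region and once on its reverse complement, for each spacer length of interest, yields the candidate list to which the selection criteria can then be applied.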

[00356] Criteria for selection of nucleic acid target sequences can include, but are not limited to, the following: homology to other regions in the genome; percent G-C content; melting temperature; cleavage activity at varying spacer length; presence of homopolymers within the spacer; distance between the two sequences; and other criteria known to one skilled in the art.

[00357] A target DNA binding sequence that hybridizes to the Cas9 target DNA sequence can be incorporated into a dht-NATNA (e.g., a dht1•2•3•4-casPN/dht5•6-casPN). The nucleic acid sequence of a dht-NATNA construct is typically provided to and synthesized by a commercial manufacturer. Alternatively, the dht-NATNA construct can be produced as described in Example 1A by in vitro transcription.
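Two of the selection criteria listed above, percent G-C content and the presence of homopolymers within the spacer, can be computed, for example, with the following Python sketch. The function names and the threshold values (a 40-60% G-C window and a maximum homopolymer run of 4) are hypothetical placeholders chosen for illustration; the specification does not state thresholds.

```python
def gc_percent(seq):
    """Percent of G and C bases in a sequence."""
    if not seq:
        raise ValueError("sequence must not be empty")
    seq = seq.upper()
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def longest_homopolymer(seq):
    """Length of the longest run of a single repeated base."""
    longest = run = 0
    previous = None
    for base in seq.upper():
        run = run + 1 if base == previous else 1
        longest = max(longest, run)
        previous = base
    return longest


def passes_filters(seq, gc_range=(40.0, 60.0), max_homopolymer=4):
    """Apply the illustrative (hypothetical) G-C and homopolymer thresholds."""
    low, high = gc_range
    return low <= gc_percent(seq) <= high and longest_homopolymer(seq) <= max_homopolymer
```

Criteria such as homology to other genomic regions or melting temperature would require additional tools and are not covered by this sketch.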

[00358] A dht-NATNA, as described herein, can be used with a cognate Cas9 protein to form dht-NATNA/Cas9 nucleoprotein complexes.

[00359] B. Determination of Cleavage Percentages and Specificity

[00360] In vitro cleavage percentages and specificity (i.e., the amount of off-target binding) related to a dht-NATNA can be determined, for example, using the cleavage assays described in Example 3, and compared as follows:

[00361] (1) If only a single pair of target DNA sequences is identified or selected for a dht-NATNA, the cleavage percentage and specificity for each of the target DNA sequences can be determined. If so desired, cleavage percentage and/or specificity can be altered in further experiments using methods including, but not limited to, modifying the dht-NATNA, introducing effector proteins/effector protein-binding sequences to modify the dht-NATNA or the Cas9 protein, or adding ligand/ligand-binding moieties to modify the dht-NATNA or the Cas9 protein.

[00362] (2) If multiple pairs of target DNA sequences are identified or selected for a dht-NATNA, the percentage cleavage data and site-specificity data obtained from the cleavage assays can be compared between different DNAs comprising the target binding sequence to identify the target DNA sequences having the desired cleavage percentage and specificity. Cleavage percentage data and specificity data provide criteria on which to base choices for a variety of applications. For example, in some situations the activity of the dht-NATNA may be the most important factor. In other situations, the specificity of the cleavage site may be relatively more important than the cleavage percentage. If so desired, cleavage percentage and/or specificity can be altered in further experiments using methods including, but not limited to, modifying the dht-NATNA, introducing effector proteins/effector protein-binding sequences to modify the dht-NATNA or the Cas9 protein, or adding ligand/ligand-binding moieties to modify the dht-NATNA or the Cas9 protein.

[00363] Alternatively, or in addition to the in vitro analysis, in cell cleavage percentages and specificities of dht-NATNAs can be obtained using, for example, the method described in Example 4, and compared as follows:

[00364] (1) If only a single pair of target DNA sequences is identified or selected for a dht-NATNA, the cleavage percentage and specificity for each of the target DNA sequences can be determined. If so desired, cleavage percentage and/or specificity can be altered in further experiments using methods including, but not limited to, modifying the dht-NATNA, introducing effector proteins/effector protein-binding sequences to modify the dht-NATNA or the Cas9 protein, or adding ligand/ligand-binding moieties to modify the dht-NATNA or the Cas9 protein.

[00365] (2) If multiple pairs of target DNA sequences are identified or selected for a dht-NATNA, the percentage cleavage data and site-specificity data obtained from the cleavage assays can be compared between different DNAs comprising the target binding sequences to identify the target DNA sequences having the desired cleavage percentage and specificity. Cleavage percentage data and specificity data provide criteria on which to base choices for a variety of applications. For example, in some situations the activity of the dht-NATNA may be the most important factor. In other situations, the specificity of the cleavage site may be relatively more important than the cleavage percentage. If so desired, cleavage percentage and/or specificity can be altered in further experiments using methods including, but not limited to, modifying the dht-NATNA, introducing effector proteins/effector protein-binding sequences to modify the dht-NATNA or the Cas9 protein, or adding ligand/ligand-binding moieties to modify the dht-NATNA or the Cas9 protein.

[00366] Following the guidance of the present specification, the screening described in this Example can be practiced by one of ordinary skill in the art with other dht-NATNAs for use with cognate Cas9 proteins.