Title:
ULTRABRIGHT DNA NANOSTRUCTURES FOR BIOSENSING
Document Type and Number:
WIPO Patent Application WO/2023/070010
Kind Code:
A1
Abstract:
Aspects of this disclosure relate to luminescently labeled nucleic acid nanostructures, as well as methods and compositions for the detection and/or sequencing of one or more target molecules derived from a sample. The invention may utilize ultrabright DNA nanostructures to address the low signal-to-noise ratio and poor specificity of current single-molecule sensing technology.

Inventors:
SHI XINGHUA (US)
HUANG HAIDONG (US)
FAKIH HASSAN (CA)
SLEIMAN HANADI (CA)
Application Number:
PCT/US2022/078397
Publication Date:
April 27, 2023
Filing Date:
October 19, 2022
Assignee:
QUANTUM SI INC (US)
International Classes:
B82Y5/00; A61K31/7088; C07H19/00; C07H21/00; C12Q1/6834; C12Q1/6837; C12Q1/686; C12Q1/6876; G01N33/68
Domestic Patent References:
WO2015196146A2    2015-12-23
Other References:
KIKUCHI NANAMI: "Split Aptameric Turn-On Fluorescence Sensor for Detection of Sequence Specific Nucleic Acid at Ambient Temperature", PHD THESIS, UNIVERSITY OF CENTRAL FLORIDA, 1 January 2018 (2018-01-01), XP093064548, [retrieved on 20230717]
Attorney, Agent or Firm:
PRITZKER, Randy, J. et al. (US)
Claims:
CLAIMS

What is claimed is:

1. A luminescently labeled nucleotide-bound nucleic acid nanostructure, the nanostructure comprising:

(i) at least three structural strands,

(ii) at least one nucleotide complementary strand, wherein the nucleotide complementary strand is ligated to at least one structural strand; and

(iii) at least one dye-binding strand, wherein each of the at least one dye-binding strands is complementary to at least one structural strand, and wherein each of the at least one dye-binding strand comprises at least one luminescent label.

2. The nanostructure of claim 1, wherein the nanostructure has three structural strands, wherein the three structural strands comprise a first structural strand, a second structural strand, and a third structural strand.

3. The nanostructure of claim 2, wherein the 5’ end of the first structural strand is hybridized to the 3’ end of the second structural strand, the 5’ end of the second structural strand is hybridized to the 3’ end of the third structural strand, and the 5’ end of third structural strand is hybridized to the 3’ end of the first structural strand.

4. The nanostructure of claim 1, wherein the nanostructure has four structural strands, wherein the four structural strands comprise a first structural strand, a second structural strand, a third structural strand, and a fourth structural strand.

5. The nanostructure of claim 4, wherein the 5’ end of the first structural strand is hybridized to the 3’ end of the second structural strand, the 5’ end of the second structural strand is hybridized to the 3’ end of the third structural strand, the 5’ end of third structural strand is hybridized to the 3’ end of the fourth structural strand, and the 5’ end of the fourth structural strand is hybridized to the 3’ end of the first structural strand.

6. The nanostructure of any one of claims 1-5, wherein the nucleotide complementary strand is separated from the at least one structural strand to which it is ligated by a dithymine spacer sequence.

7. The nanostructure of any one of claims 1-6, wherein the nucleotide complementary strand comprises a guanine, thymine, cytosine, or adenosine nucleotide.

8. The nanostructure of any one of claims 1-7, wherein each of the structural strands comprises a middle sequence and a clip sequence, wherein each clip sequence is complementary to one other clip sequence.

9. The nanostructure of any one of claims 1-8, wherein the structural strands are hybridized by the clip sequence.

10. The nanostructure of any one of claims 1-9, wherein the nucleotide complementary strand is ligated to the 5’ end of the at least one structural strand.

11. The nanostructure of any one of claims 1-10, wherein the nanostructure comprises one nucleotide complementary strand, two nucleotide complementary strands, three nucleotide complementary strands, or four nucleotide complementary strands.

12. The nanostructure of any one of claims 1-11, wherein the nanostructure comprises one dye-binding strand, two dye-binding strands, three dye-binding strands, or four dye-binding strands.

13. The nanostructure of any one of claims 1-12, wherein the clip sequence is 10-20 nucleotides in length.

14. The nanostructure of any one of claims 1-13, wherein each of the at least one dye-binding strands comprises one, two, or three luminescent labels.

15. The nanostructure of any one of claims 1-14, wherein the strands are nucleic acid strands.

16. The nanostructure of any one of claims 1-15, wherein the nucleic acid strands are single-stranded.

17. The nanostructure of any one of claims 1-16, wherein the luminescent label or luminescent labels is/are conjugated to each of the at least one dye-binding strands after synthesis of the at least one dye-binding strands.

18. The nanostructure of any one of claims 1-17, wherein the luminescent label or luminescent labels is/are conjugated to each of the at least one dye-binding strands during synthesis of the at least one dye-binding strands.

19. The nanostructure of any one of claims 1-18, wherein the luminescent label is fluorescent.

20. A method of determining the sequence of a template nucleic acid, the method comprising:

(i) exposing a complex in a target volume, the complex comprising the template nucleic acid, a primer, and a polymerizing enzyme, to one or more different types of luminescently labeled nucleotides, wherein each type of luminescently labeled nucleotide comprises a luminescently labeled nanostructure according to any one of claims 1-19;

(ii) directing a series of pulses of one or more excitation energies towards a vicinity of the target volume;

(iii) detecting a plurality of emitted photons from luminescently labeled nucleotides during sequential incorporation into a nucleic acid comprising the primer; and

(iv) identifying the sequence of incorporated nucleotides by determining at least one of luminescent intensity and luminescent lifetime based on the emitted photons.

21. A method of determining the sequence of a template nucleic acid, the method comprising:

(i) immobilizing the template nucleic acid at a base of a well within a chip comprising a plurality of wells;

(ii) exposing the template nucleic acid to a primer, a polymerizing enzyme, and one or more different types of luminescently labeled nucleotides, wherein each type of luminescently labeled nucleotide comprises a luminescently labeled nanostructure according to any one of claims 1-19;

(iii) directing a series of pulses of one or more excitation energies towards a vicinity of the target volume;

(iv) detecting a plurality of emitted photons from luminescently labeled nucleotides during sequential incorporation into a nucleic acid comprising the primer; and

(v) identifying the sequence of incorporated nucleotides by determining at least one of luminescent intensity and luminescent lifetime based on the emitted photons.

22. The method of claim 21, wherein the method is automated or manual.

23. The method of claim 21 or 22, wherein the automated method occurs in a single instrument.

24. The method of any one of claims 21-23, wherein the plurality of wells is a number of wells selected from the group consisting of: 96 wells, 384 wells, 1,536 wells, or more wells.

25. A chip comprising a plurality of wells, a template nucleic acid immobilized to a base of at least a subset of the plurality of wells, and a luminescently labeled nanostructure according to any one of claims 1-19.

26. The chip of claim 25, wherein the plurality of wells is a number of wells selected from the group consisting of: 96 wells, 384 wells, 1,536 wells, or more wells.

27. The chip of claim 25 or 26, wherein the template nucleic acid is derived from a sample comprising a plurality of nucleic acids.

28. The chip of any one of claims 25-27, wherein the template nucleic acid is immobilized to the base of the well via a secondary complex.

Description:
ULTRABRIGHT DNA NANOSTRUCTURES FOR BIOSENSING

CROSS-REFERENCE TO RELATED APPLICATION

This application claims the benefit of U.S. Provisional Patent Application Serial No. 63/270,019, filed October 20, 2021, entitled “ULTRABRIGHT DNA NANOSTRUCTURES FOR BIOSENSING,” which is incorporated herein by reference in its entirety.

INCORPORATION BY REFERENCE

All publications, patents, and patent applications mentioned in this specification are herein incorporated by reference to the same extent as if each individual publication, patent, or patent application was specifically and individually indicated to be incorporated by reference. Absent any indication otherwise, publications, patents, and patent applications mentioned in this specification are incorporated herein by reference in their entireties.

REFERENCE TO AN ELECTRONIC SEQUENCE LISTING

The content of the electronic sequence listing (R070870144WO00-SEQ-RJP.xml; Size: 17,646 bytes; and Date of Creation: October 19, 2022) is herein incorporated by reference in its entirety.

TECHNICAL FIELD

The present disclosure relates to methods and compositions for detecting and sequencing a nucleic acid sequence using ultrabright DNA nanostructures.

BACKGROUND

In recent years, single molecule-based sensing platforms, such as fluorescence microscopy, nanopores, and mechanochemical instruments, have been revolutionizing sensing, providing unprecedented detection limits and more informative values than those of conventional sensors.

However, single-molecule sensing has suffered from limitations such as low signal-to-noise ratio and poor specificity. Thus, improved methods and compositions are needed.

SUMMARY OF THE INVENTION

Aspects of this disclosure relate to methods and compositions for the detection and/or sequencing of one or more target molecules derived from a sample. In some embodiments, a target molecule is a nucleic acid, a deoxyribonucleic acid (DNA) molecule, a ribonucleic acid (RNA) molecule, or derivative thereof. Through the use of the methods and/or the compositions of the instant disclosure, target molecules derived from a sample may, in some embodiments, be more readily detected and/or sequenced. The subject matter of the present invention involves, in some cases, interrelated products, alternative solutions to a particular problem, and/or a plurality of different uses of one or more systems and/or articles.

One aspect of the present disclosure relates to a luminescently labeled nucleotide-bound nucleic acid nanostructure, the nanostructure comprising: (i) at least three structural strands, (ii) at least one nucleotide complementary strand, wherein the nucleotide complementary strand is ligated to at least one structural strand; and (iii) at least one dye-binding strand, wherein each of the at least one dye-binding strands is complementary to at least one structural strand, and wherein each of the at least one dye-binding strand comprises at least one luminescent label. In some embodiments, the nanostructure has three structural strands, wherein the three structural strands comprise a first structural strand, a second structural strand, and a third structural strand. In some embodiments, the 5’ end of the first structural strand is hybridized to the 3’ end of the second structural strand, the 5’ end of the second structural strand is hybridized to the 3’ end of the third structural strand, and the 5’ end of third structural strand is hybridized to the 3’ end of the first structural strand. In some embodiments, the nanostructure has four structural strands, wherein the four structural strands comprise a first structural strand, a second structural strand, a third structural strand, and a fourth structural strand. In some embodiments, the 5’ end of the first structural strand is hybridized to the 3’ end of the second structural strand, the 5’ end of the second structural strand is hybridized to the 3’ end of the third structural strand, the 5’ end of third structural strand is hybridized to the 3’ end of the fourth structural strand, and the 5’ end of the fourth structural strand is hybridized to the 3’ end of the first structural strand.

In some embodiments, the nucleotide complementary strand is separated from the at least one structural strand to which it is ligated by a dithymine spacer sequence. In some embodiments, the nucleotide complementary strand comprises a guanine, thymine, cytosine, or adenosine nucleotide. In some embodiments, each of the structural strands comprises a middle sequence and a clip sequence, wherein each clip sequence is complementary to one other clip sequence. In some embodiments, the structural strands are hybridized by the clip sequence. In some embodiments, the nucleotide complementary strand is ligated to the 5’ end of the at least one structural strand.

In some embodiments, the nanostructure comprises one nucleotide complementary strand, two nucleotide complementary strands, three nucleotide complementary strands, or four nucleotide complementary strands. In some embodiments, the nanostructure comprises one dye-binding strand, two dye-binding strands, three dye-binding strands, or four dye-binding strands. In some embodiments, the clip sequence is 10-20 nucleotides in length. In some embodiments, each of the at least one dye-binding strands comprises one, two, or three luminescent labels. In some embodiments, the strands are nucleic acid strands. In some embodiments, the nucleic acid strands are single-stranded.

In some embodiments, the luminescent label or luminescent labels is/are conjugated to each of the at least one dye-binding strands after synthesis of the at least one dye-binding strands. In some embodiments, the luminescent label or luminescent labels is/are conjugated to each of the at least one dye-binding strands during synthesis of the at least one dye-binding strands. In some embodiments, the luminescent label is fluorescent.

Another aspect of the present disclosure relates to a method of determining the sequence of a template nucleic acid, the method comprising: (i) exposing a complex in a target volume, the complex comprising the template nucleic acid, a primer, and a polymerizing enzyme, to one or more different types of luminescently labeled nucleotides, wherein each type of luminescently labeled nucleotide comprises a luminescently labeled nanostructure described herein; (ii) directing a series of pulses of one or more excitation energies towards a vicinity of the target volume; (iii) detecting a plurality of emitted photons from luminescently labeled nucleotides during sequential incorporation into a nucleic acid comprising the primer; and (iv) identifying the sequence of incorporated nucleotides by determining at least one of luminescent intensity and luminescent lifetime based on the emitted photons.

Another aspect of the present disclosure relates to a method of determining the sequence of a template nucleic acid, the method comprising: (i) immobilizing the template nucleic acid at a base of a well within a chip comprising a plurality of wells; (ii) exposing the template nucleic acid to a primer, a polymerizing enzyme, and one or more different types of luminescently labeled nucleotides, wherein each type of luminescently labeled nucleotide comprises a luminescently labeled nanostructure described herein; (iii) directing a series of pulses of one or more excitation energies towards a vicinity of the target volume; (iv) detecting a plurality of emitted photons from luminescently labeled nucleotides during sequential incorporation into a nucleic acid comprising the primer; and (v) identifying the sequence of incorporated nucleotides by determining at least one of luminescent intensity and luminescent lifetime based on the emitted photons. In some embodiments, the method is automated or manual. In some embodiments, the automated method occurs in a single instrument. In some embodiments, the plurality of wells is a number of wells selected from the group consisting of: 96 wells, 384 wells, 1,536 wells, or more wells.

Another aspect of the present disclosure relates to a chip comprising a plurality of wells, a template nucleic acid immobilized to a base of at least a subset of the plurality of wells, and a luminescently labeled nanostructure described herein. In some embodiments, the plurality of wells is a number of wells selected from the group consisting of: 96 wells, 384 wells, 1,536 wells, or more wells. In some embodiments, the template nucleic acid is derived from a sample comprising a plurality of nucleic acids. In some embodiments, the template nucleic acid is immobilized to the base of the well via a secondary complex.

The details of one or more embodiments of the invention are set forth in the description below. Other features or advantages of the present invention will be apparent from the following drawings and detailed description of several embodiments, and also from the appended claims.

BRIEF DESCRIPTION OF THE DRAWINGS

Non-limiting embodiments of the present invention are described by way of example with reference to the accompanying figures, which are schematic and are not intended to be drawn to scale unless otherwise indicated. In some embodiments of the figures, each identical or nearly identical component illustrated is typically represented by a single numeral. For purposes of clarity, not every component is labeled in every figure, nor is every component of each embodiment of the invention shown where illustration is not necessary to allow those of ordinary skill in the art to understand the invention.

FIG. 1 shows a schematic of the DNA nanostructures developed to incorporate fluorescently labeled strands and nucleotide carrying strands (left) and a sensing system being used for DNA sequencing (right).

FIG. 2 shows the design of the triangle nanostructure (left) and the square nanostructure (right), each carrying luminescent labels.

FIG. 3 shows a 6% polyacrylamide native gel of the sequential assembly of a triangle nanostructure and its loading with D strands at 1 pM under different conditions (RT = room temperature; annealed; following 3-day storage at 4 degrees after annealing). Samples were run for 1.5 hours at 130 V in 1x TAMg.

FIG. 4 shows the sequential loading of a triangle nanostructure with “nuc” strands and a 6% polyacrylamide native gel showing the sequential addition of nuc strands to the triangle nanostructure at 1 pM. Samples were annealed for 2 hours and run for 1.5 hours at 130 V in 1x TAMg.

FIG. 5 shows probe strands carrying three fluorescent luminescent labels. The luminescent labels are denoted by an “X.”

FIG. 6 shows a 6% polyacrylamide native gel showing the assembly of the triangle with fluorescently labeled probe strands Q3 and Q4 under various conditions (left) and the fluorescent signal increase between a triangle containing 1 labeled strand vs 3 labeled strands, measured on a fluorescence plate reader following 25-fold dilution of 1 pM samples (n=2) (right).

FIG. 7 shows two nucleotide carrying strands.

FIG. 8 shows testing the assembly of fluorescently labeled triangle (Q4) with nucleotide functionalized strands (dC) and a 6% polyacrylamide native gel showing the mobility shift of various samples listed in the figure. Samples were annealed for 2 hours at 1 pM and run for 1.5 hours at 130 V in 1x TAMg.

FIG. 9 shows testing the assembly of the square S1 design with dye and nucleotide strands and a 6% polyacrylamide native gel showing the mobility shift of various samples listed in the figure. Samples were annealed for 2 hours at 1.5 pM and run for 1.5 hours at 130 V in 1x TAMg.

FIG. 10 shows testing the ability to clean up the square S1 assembly sample by centrifugation filtration by checking assembly on a 6% native polyacrylamide gel and a 10% native polyacrylamide gel to check removal of excess strands. Samples were annealed for 2 hours at 1.5 pM and run for 1.5 hours at 130 V in 1x TAMg. Filtered samples were spun at 21xg for 8 mins in 50 kDa cut-off Amicon filtration tubes. Samples indicated with “ ' ” were resuspended to their original volume and concentration using 1x TAMg buffer.

FIG. 11 shows design differences between the Square 1 (S1), Square 2 (S2), and Square 3 (S3) nanostructure assemblies.

FIG. 12 shows a gel from a gel electrophoresis analysis for testing the assembly of the Square 2 design.

FIG. 13 shows each sequence used to generate each nanostructure assembly. Sequences with matching colors within a nanostructure assembly are complementary to each other.

DETAILED DESCRIPTION

In some aspects, the disclosure provides luminescently labeled nucleic acid nanostructures. The nucleic acid nanostructures may comprise at least three structural strands (e.g., three structural strands, four structural strands). In some cases, one or more portions of each structural strand may be hybridized to one or more portions of one or more other structural strands to form a suitable shape (e.g., a triangle, a square). The nucleic acid nanostructures may further comprise at least one dye-binding strand configured to bind to at least one luminescent label (e.g., at least one luminescent label, at least two luminescent labels, at least three luminescent labels). In addition, the nucleic acid nanostructures may further comprise at least one nucleotide complementary strand ligated to at least one structural strand. In some cases, the nucleic acid nanostructures may facilitate detection and/or sequencing of one or more target nucleic acids. For example, in some embodiments, methods and compositions described herein facilitate the single-molecule sequencing of target nucleic acids by linking certain nucleotides with ultrabright nucleic acid nanostructures, which may exhibit a higher intensity luminescent signal during incorporation into a growing nucleic acid strand. These embodiments may provide advantages for the detection and sequencing of target nucleic acids.

DNA nanotechnology, a field that exploits the specificity of base pairing to develop nanostructures with unprecedented specificity and control, provides an attractive strategy to organize matter. The well-defined geometries of nucleic acid nanostructures, as well as the various shapes and sizes, make them excellent scaffolds to organize matter for various applications, such as nanomachines, molecular computing, and targeted drug delivery. Using DNA nanotechnology to organize probes is an attractive strategy to enhance single-molecule sensing and imaging. Due to the ability of certain nucleic acid nanostructures to multiply signal output (e.g., by binding multiple dyes) and improve organization and spatial positioning, detection and specificity may be enhanced.

In some embodiments, luminescently labeled nucleic acid nanostructures described herein provide ultrabright, compact probes. In some cases, the nanostructures may be controlled in size to ensure compactness and compatibility with nucleic acid sequencing platforms. As described herein, structures built from nucleic acids (e.g., DNA) can bind and organize multiple luminescent labels, and the whole nanostructure may be used as a single probe entity. Such designs may multiply the intensity observed for a certain event due to the multivalency of nucleic acid nanostructures. All of this can be achieved while maintaining a compact size appropriate for such biosensing applications.

The present disclosure describes a process of expanding the multivalency of nucleic acid nanostructures by producing a triangle nanostructure and/or a square nanostructure. In some embodiments, a nanostructure comprises at least three structural strands, wherein each structural strand is a nucleic acid strand (e.g., a single-stranded nucleic acid strand). In certain embodiments, each structural strand comprises a middle sequence and a clip sequence. In certain instances, the clip sequence may be 10-20 nucleotides in length. In some instances, at least one clip sequence of each structural strand is complementary to at least one other clip sequence of another structural strand. In some embodiments, the structural strands may be hybridized by the clip sequence.

In some embodiments, the nanostructure comprises three structural strands (i.e., a first structural strand, a second structural strand, a third structural strand). In some embodiments, each structural strand is a nucleic acid strand (e.g., a single-stranded nucleic acid strand). In some instances, the 5’ end of the first structural strand is hybridized to the 3’ end of the second structural strand, the 5’ end of the second structural strand is hybridized to the 3’ end of the third structural strand, and the 5’ end of the third structural strand is hybridized to the 3’ end of the first structural strand. In some such instances, the nanostructure forms a triangle. In certain instances, the nanostructure comprises four structural strands (i.e., a first structural strand, a second structural strand, a third structural strand, a fourth structural strand). In some instances, the 5’ end of the first structural strand is hybridized to the 3’ end of the second structural strand, the 5’ end of the second structural strand is hybridized to the 3’ end of the third structural strand, the 5’ end of the third structural strand is hybridized to the 3’ end of the fourth structural strand, and the 5’ end of the fourth structural strand is hybridized to the 3’ end of the first structural strand. In some such instances, the nanostructure forms a rectangle (e.g., a square).
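
The cyclic hybridization pattern described above can be illustrated with a minimal Python sketch. The clip and middle sequences below are hypothetical placeholders (they are not the sequences of FIG. 13), and the strand layout assumed here (a 5’ clip, a middle sequence, and a 3’ clip on each strand) is an illustrative assumption rather than the disclosed design. The function names `build_ring` and `closes_into_ring` are likewise hypothetical.

```python
# Illustrative model only: sequences and strand layout are assumptions.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def build_ring(five_prime_clips, middles):
    """Build n strands (5'->3') whose clip sequences close into a ring.

    The 3' clip of each strand is the reverse complement of the 5' clip of
    the preceding strand, so the 5' end of strand i pairs with the 3' end of
    strand i+1, and the 5' end of the last strand pairs with the 3' end of
    the first strand.
    """
    n = len(five_prime_clips)
    strands = []
    for j in range(n):
        three_prime_clip = reverse_complement(five_prime_clips[(j - 1) % n])
        strands.append(five_prime_clips[j] + middles[j] + three_prime_clip)
    return strands


def closes_into_ring(strands, clip_len):
    """Check that the 5' clip of each strand pairs with the 3' clip of the next."""
    n = len(strands)
    for i in range(n):
        five_clip = strands[i][:clip_len]
        next_three_clip = strands[(i + 1) % n][-clip_len:]
        if reverse_complement(five_clip) != next_three_clip:
            return False
    return True


# Hypothetical 10-nucleotide clips (within the 10-20 nucleotide range above).
CLIPS = ["ACGTTGCAAG", "GGATCCTTAC", "TTGACCGTAA", "CAGTCAGGTT"]
MIDDLES = ["T" * 20] * 4

triangle = build_ring(CLIPS[:3], MIDDLES[:3])  # three structural strands
square = build_ring(CLIPS, MIDDLES)            # four structural strands
print(closes_into_ring(triangle, 10), closes_into_ring(square, 10))
```

In this model, swapping the order of two strands breaks the check, because clip complementarity fixes each strand's neighbors in the ring.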

In some embodiments, a nanostructure comprises at least one dye-binding strand. In some embodiments, the nanostructure comprises one, two, three, or four dye-binding strands. In some cases, each dye-binding strand is complementary to at least one structural strand of the nanostructure. In some cases, each dye-binding strand comprises at least one luminescent label. In certain embodiments, each dye-binding strand independently comprises one, two, or three luminescent labels. In some instances, a nanostructure comprises one, two, three, four, five, six, seven, eight, nine, ten, eleven, or twelve luminescent labels. The one or more luminescent labels may be conjugated to the one or more dye-binding strands during or after synthesis of the one or more dye-binding strands.
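
As a simple check on the arithmetic above, the following Python sketch enumerates the total label counts reachable with one to four dye-binding strands, each carrying one, two, or three luminescent labels. The function names are hypothetical and the sketch only restates the ranges given in this paragraph.

```python
from itertools import product


def total_labels(labels_per_strand):
    """Sum the labels on a nanostructure, given the label count of each dye-binding strand."""
    if not 1 <= len(labels_per_strand) <= 4:
        raise ValueError("expected one to four dye-binding strands")
    if any(n not in (1, 2, 3) for n in labels_per_strand):
        raise ValueError("each dye-binding strand carries one, two, or three labels")
    return sum(labels_per_strand)


def possible_totals():
    """Enumerate every total label count reachable under the ranges above."""
    totals = set()
    for n_strands in range(1, 5):
        for combo in product((1, 2, 3), repeat=n_strands):
            totals.add(total_labels(combo))
    return sorted(totals)


print(possible_totals())
```

The enumeration covers every integer from one to twelve, matching the range of label counts listed above.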

In some embodiments, a nanostructure comprises at least one nucleotide complementary strand. In some cases, each nucleotide complementary strand may be ligated to at least one structural strand (e.g., the 5’ end of at least one structural strand). In certain embodiments, each nucleotide complementary strand is separated from the at least one structural strand to which it is ligated by a dithymine spacer sequence. In some cases, each nucleotide complementary strand is configured to be complementary to at least one nucleotide or nucleic acid sequence. In some embodiments, each nucleotide complementary strand comprises a guanine, thymine, cytosine, or adenosine nucleotide. In some embodiments, a nucleic acid nanostructure comprises one, two, three, or four nucleotide complementary strands.

Sequencing

Some aspects of the application are useful for sequencing biological polymers, such as nucleic acids. In some embodiments, methods, compositions, and devices described in the application can be used to identify a series of nucleotide or amino acid monomers that are incorporated into a nucleic acid (e.g., by detecting a time-course of incorporation of a series of labeled nucleotides). In some embodiments, methods, compositions, and devices described in the application can be used to identify a series of nucleotides that are incorporated into a template-dependent nucleic acid sequencing reaction product synthesized by a polymerase enzyme. In certain embodiments, the template-dependent nucleic acid sequencing reaction is carried out by naturally occurring nucleic acid polymerases. In some embodiments, the polymerase is a mutant or modified variant of a naturally occurring polymerase. In some embodiments, the template-dependent nucleic acid sequencing reaction product will comprise one or more nucleotide segments complementary to the template nucleic acid strand. In one aspect, the application provides a method of determining the sequence of a template (or target) nucleic acid strand by determining the sequence of its complementary nucleic acid strand.

In another aspect, the application provides methods of sequencing target nucleic acids by sequencing a plurality of nucleic acid fragments, wherein the target nucleic acid comprises the fragments. In certain embodiments, the method comprises combining a plurality of fragment sequences to provide a sequence or partial sequence for the parent target nucleic acid. In some embodiments, the step of combining is performed by computer hardware and software. The methods described herein may allow for a set of related target nucleic acids, such as an entire chromosome or genome to be sequenced.
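
The step of combining fragment sequences can be sketched in Python as a greedy overlap merge. This is a textbook heuristic chosen for illustration, not the combining method of the disclosure; the function names, the minimum overlap of three nucleotides, and the example fragments are all assumptions, and the sketch ignores sequencing errors and reverse-strand reads.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b (0 if shorter than min_len)."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(fragments, min_len=3):
    """Repeatedly merge the pair of fragments with the largest overlap.

    Returns a list of contigs; a single entry means all fragments were merged.
    """
    frags = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    # Drop fragments fully contained in another fragment.
    frags = [f for f in frags if not any(f != g and f in g for g in frags)]
    while len(frags) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # no remaining overlaps; leave the contigs separate
        merged = frags[best_i] + frags[best_j][best_k:]
        frags = [f for idx, f in enumerate(frags) if idx not in (best_i, best_j)]
        frags.append(merged)
    return frags


# Hypothetical overlapping fragments of a short parent sequence.
contigs = greedy_assemble(["ATGGCGTA", "GCGTACGTT", "ACGTTAGC"])
print(contigs)
```

Fragments that share no overlap of at least `min_len` nucleotides are returned as separate contigs, which corresponds to a partial sequence for the parent target nucleic acid.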

During sequencing, a polymerizing enzyme may couple (e.g., attach) to a priming location of a target nucleic acid molecule. The priming location can be a primer that is complementary to a portion of the target nucleic acid molecule. As an alternative the priming location is a gap or nick that is provided within a double stranded segment of the target nucleic acid molecule. A gap or nick can be from 0 to at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, or 40 nucleotides in length. A nick can provide a break in one strand of a double stranded sequence, which can provide a priming location for a polymerizing enzyme, such as, for example, a strand displacing polymerase enzyme.

In some cases, a sequencing primer can be annealed to a target nucleic acid molecule that may or may not be immobilized to a solid support. A solid support can comprise, for example, a sample well (e.g., a nanoaperture, a reaction chamber) on a chip used for nucleic acid sequencing. In some embodiments, a sequencing primer may be immobilized to a solid support and hybridization of the target nucleic acid molecule also immobilizes the target nucleic acid molecule to the solid support. In some embodiments, a polymerase is immobilized to a solid support and soluble primer and target nucleic acid are contacted to the polymerase. However, in some embodiments a complex comprising a polymerase, a target nucleic acid and a primer is formed in solution and the complex is immobilized to a solid support (e.g., via immobilization of the polymerase, primer, and/or target nucleic acid). In some embodiments, none of the components in a sample well (e.g., a nanoaperture, a reaction chamber) are immobilized to a solid support. For example, in some embodiments, a complex comprising a polymerase, a target nucleic acid, and a primer is formed in solution and the complex is not immobilized to a solid support.

Under appropriate conditions, a polymerase enzyme that is contacted to an annealed primer/target nucleic acid can add or incorporate one or more nucleotides onto the primer, and nucleotides can be added to the primer in a 5’ to 3’, template-dependent fashion. Such incorporation of nucleotides onto a primer (e.g., via the action of a polymerase) can generally be referred to as a primer extension reaction. Each nucleotide can be associated with a detectable tag that can be detected and identified (e.g., based on its luminescent lifetime and/or other characteristics) during the nucleic acid extension reaction and used to determine each nucleotide incorporated into the extended primer and, thus, a sequence of the newly synthesized nucleic acid molecule. Via sequence complementarity of the newly synthesized nucleic acid molecule, the sequence of the target nucleic acid molecule can also be determined. In some cases, annealing of a sequencing primer to a target nucleic acid molecule and incorporation of nucleotides to the sequencing primer can occur at similar reaction conditions (e.g., the same or similar reaction temperature) or at differing reaction conditions (e.g., different reaction temperatures). In some embodiments, sequencing by synthesis methods can include the presence of a population of target nucleic acid molecules (e.g., copies of a target nucleic acid) and/or a step of amplification of the target nucleic acid to achieve a population of target nucleic acids. However, in some embodiments sequencing by synthesis is used to determine the sequence of a single molecule in each reaction that is being evaluated (and nucleic acid amplification is not required to prepare the target template for sequencing). In some embodiments, a plurality of single molecule sequencing reactions are performed in parallel (e.g., on a single chip) according to aspects of the present application. 
For example, in some embodiments, a plurality of single molecule sequencing reactions are each performed in separate reaction chambers (e.g., nanoapertures, sample wells) on a single chip.

Embodiments are capable of sequencing single nucleic acid molecules with high accuracy and long read lengths, such as an accuracy of at least about 50%, 60%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, 99.9%, 99.99%, 99.999%, or 99.9999%, and/or read lengths greater than or equal to about 10 base pairs (bp), 50 bp, 100 bp, 200 bp, 300 bp, 400 bp, 500 bp, 1000 bp, 10,000 bp, 20,000 bp, 30,000 bp, 40,000 bp, 50,000 bp, or 100,000 bp. In some embodiments, the target nucleic acid molecule used in single molecule sequencing is a single-stranded target nucleic acid (e.g., deoxyribonucleic acid (DNA), DNA derivatives, ribonucleic acid (RNA), RNA derivatives) template that is added or immobilized to a sample well (e.g., nanoaperture) containing at least one additional component of a sequencing reaction (e.g., a polymerase, such as a DNA polymerase, or a sequencing primer) immobilized or attached to a solid support such as the bottom or side walls of the sample well. The target nucleic acid molecule or the polymerase can be attached to a wall of the sample well, such as the bottom or a side wall, directly or through a linker. The sample well (e.g., nanoaperture) also can contain any other reagents needed for nucleic acid synthesis via a primer extension reaction, such as, for example, suitable buffers, co-factors, enzymes (e.g., a polymerase) and deoxyribonucleoside polyphosphates, such as, e.g., deoxyribonucleoside triphosphates, including deoxyadenosine triphosphate (dATP), deoxycytidine triphosphate (dCTP), deoxyguanosine triphosphate (dGTP), deoxyuridine triphosphate (dUTP) and deoxythymidine triphosphate (dTTP) dNTPs that include luminescent tags, such as fluorophores. 
In some embodiments, each class of dNTPs (e.g., adenine-containing dNTPs (e.g., dATP), cytosine-containing dNTPs (e.g., dCTP), guanine-containing dNTPs (e.g., dGTP), uracil-containing dNTPs (e.g., dUTPs) and thymine-containing dNTPs (e.g., dTTP)) is conjugated to a distinct luminescent tag such that detection of light emitted from the tag indicates the identity of the dNTP that was incorporated into the newly synthesized nucleic acid. Emitted light from the luminescent tag can be detected and attributed to its appropriate luminescent tag (and, thus, associated dNTP) via any suitable device and/or method, including such devices and methods for detection described elsewhere herein. The luminescent tag may be conjugated to the dNTP at any position such that the presence of the luminescent tag does not inhibit the incorporation of the dNTP into the newly synthesized nucleic acid strand or the activity of the polymerase. In some embodiments, the luminescent tag is conjugated to the terminal phosphate (e.g., the gamma phosphate) of the dNTP.

In some embodiments, the single-stranded target nucleic acid template can be contacted with a sequencing primer, dNTPs, polymerase and other reagents necessary for nucleic acid synthesis. In some embodiments, all appropriate dNTPs can be contacted with the single-stranded target nucleic acid template simultaneously (e.g., all dNTPs are simultaneously present) such that incorporation of dNTPs can occur continuously. In other embodiments, the dNTPs can be contacted with the single-stranded target nucleic acid template sequentially, where the single-stranded target nucleic acid template is contacted with each appropriate dNTP separately, with washing steps in between contact of the single-stranded target nucleic acid template with differing dNTPs. Such a cycle of contacting the single-stranded target nucleic acid template with each dNTP separately followed by washing can be repeated for each successive base position of the single-stranded target nucleic acid template to be identified.

In some embodiments, the sequencing primer anneals to the single-stranded target nucleic acid template and the polymerase consecutively incorporates the dNTPs (or other deoxyribonucleoside polyphosphates) to the primer based on the single-stranded target nucleic acid template. The unique luminescent tag associated with each incorporated dNTP can be excited with the appropriate excitation light during or after incorporation of the dNTP to the primer, and its emission can be subsequently detected using any suitable device(s) and/or method(s), including devices and methods for detection described elsewhere herein. Detection of a particular emission of light (e.g., having a particular emission lifetime, intensity, spectrum and/or combination thereof) can be attributed to a particular dNTP incorporated. The sequence obtained from the collection of detected luminescent tags can then be used to determine the sequence of the single-stranded target nucleic acid template via sequence complementarity.
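As a non-limiting illustration of the attribution and complementarity steps described above, the following Python sketch assigns each detected emission to the dNTP whose tag lifetime is closest to the measured lifetime, and then infers the template sequence as the reverse complement of the newly synthesized strand. The four lifetime values are taken from an example given elsewhere in this disclosure; the pairing of a given lifetime with a given base, the nearest-lifetime rule, and all function names are assumptions made only for illustration.

```python
# Hypothetical lookup table: the lifetimes (ns) are example values from this
# disclosure; which base carries which tag is an assumption for illustration.
TAG_LIFETIMES_NS = {"A": 0.25, "C": 0.5, "G": 1.0, "T": 1.5}
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def call_base(measured_lifetime_ns):
    """Return the base whose tag lifetime is closest to the measured value."""
    return min(
        TAG_LIFETIMES_NS,
        key=lambda base: abs(TAG_LIFETIMES_NS[base] - measured_lifetime_ns),
    )


def template_from_lifetimes(lifetimes_ns):
    """Call each incorporated base in order, then infer the template.

    The synthesized strand is read 5' to 3' in order of incorporation; the
    template (also written 5' to 3') is its reverse complement.
    """
    synthesized = "".join(call_base(t) for t in lifetimes_ns)
    template = "".join(COMPLEMENT[b] for b in reversed(synthesized))
    return synthesized, template


if __name__ == "__main__":
    synthesized, template = template_from_lifetimes([0.27, 1.4, 0.9, 0.55])
    print(synthesized, template)
```

A practical base caller would typically combine lifetime with intensity, spectrum, and pulse duration, and would handle ambiguous or missing events; none of that is modeled here.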

While the present disclosure makes reference to dNTPs, devices, systems and methods provided herein may be used with various types of nucleotides, such as ribonucleotides and deoxyribonucleotides (e.g., deoxyribonucleoside polyphosphates with at least 4, 5, 6, 7, 8, 9, or 10 phosphate groups). Such ribonucleotides and deoxyribonucleotides can include various types of tags (or markers) and linkers.

Properties of luminescent labels

As described herein, a luminescent molecule is a molecule that absorbs one or more photons and may subsequently emit one or more photons after one or more time durations. The luminescence of the molecule is described by several parameters, including but not limited to luminescent lifetime, absorption spectra, emission spectra, luminescent quantum yield, and luminescent intensity. The terms absorption and excitation are used interchangeably throughout the application. A typical luminescent molecule may absorb, or undergo excitation by, light at multiple wavelengths. Excitation at certain wavelengths or within certain spectral ranges may relax by a luminescent emission event, while excitation at certain other wavelengths or spectral ranges may not relax by a luminescent emission event. In some embodiments, a luminescent molecule is only suitably excited for luminescence at a single wavelength or within a single spectral range. In some embodiments, a luminescent molecule is suitably excited for luminescence at two or more wavelengths or within two or more spectral ranges. In some embodiments, a molecule is identified by measuring the wavelength of the excitation photon or the absorption spectrum.

The photon emitted in a luminescent emission event has a wavelength within a spectral range of possible wavelengths. Typically, the emitted photon has a longer wavelength (i.e., has less energy or is red-shifted) compared to the wavelength of the excitation photon. In certain embodiments, a molecule is identified by measuring the wavelength of an emitted photon. In certain embodiments, a molecule is identified by measuring the wavelengths of a plurality of emitted photons. In certain embodiments, a molecule is identified by measuring the emission spectrum.

Luminescent lifetime refers to the time duration between an excitation event and an emission event. In some embodiments, luminescent lifetime is expressed as the constant in an equation of exponential decay. In some embodiments, wherein there are one or more pulse events delivering excitation energy, the time duration is the time between the pulse and the subsequent emission event.

“Determining a luminescent lifetime” of a molecule can be performed using any suitable method (e.g., by measuring the lifetime using a suitable technique or by determining time-dependent characteristics of emission). In some embodiments, determining the luminescent lifetime of a molecule comprises determining the lifetime relative to one or more molecules (e.g., different luminescently labeled nucleotides in a sequencing reaction). In some embodiments, determining the luminescent lifetime of a molecule comprises determining the lifetime relative to a reference. In some embodiments, determining the luminescent lifetime of a molecule comprises measuring the lifetime (e.g., fluorescence lifetime). In some embodiments, determining the luminescent lifetime of a molecule comprises determining one or more temporal characteristics that are indicative of lifetime. In some embodiments, the luminescent lifetime of a molecule can be determined based on a distribution of a plurality of emission events (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 30, 40, 50, 60, 70, 80, 90, 100, or more emission events) occurring across one or more time-gated windows relative to an excitation pulse. For example, a luminescent lifetime of a single molecule can be distinguished from a plurality of molecules having different luminescent lifetimes based on the distribution of photon arrival times measured with respect to an excitation pulse.
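By way of a hedged example of the lifetime determination discussed above, the following Python sketch estimates a lifetime from photon arrival times measured relative to the excitation pulse, and separately counts photons falling in time-gated windows. It assumes an ideal single-exponential decay, for which the maximum-likelihood estimate of the decay constant is simply the mean arrival time; instrument response, background counts, and truncation by the pulse period are ignored. The function names and the simulated data are illustrative assumptions and are not part of the described devices.

```python
import random


def estimate_lifetime_ns(arrival_times_ns):
    """Maximum-likelihood lifetime for a single-exponential decay.

    For an ideal exponential distribution this is the mean arrival time.
    """
    if not arrival_times_ns:
        raise ValueError("need at least one emission event")
    return sum(arrival_times_ns) / len(arrival_times_ns)


def gate_counts(arrival_times_ns, gate_edges_ns):
    """Count photons in consecutive windows [edge_i, edge_{i+1})."""
    counts = [0] * (len(gate_edges_ns) - 1)
    for t in arrival_times_ns:
        for i in range(len(counts)):
            if gate_edges_ns[i] <= t < gate_edges_ns[i + 1]:
                counts[i] += 1
                break
    return counts


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the sketch is repeatable
    true_lifetime_ns = 1.0  # assumed value, used only to simulate data
    times = [rng.expovariate(1.0 / true_lifetime_ns) for _ in range(1000)]
    print("estimated lifetime (ns):", estimate_lifetime_ns(times))
    print("photons per gate:", gate_counts(times, [0.0, 0.5, 1.0, 2.0, 5.0]))
```

With only two gates, the ratio of counts between the early and late windows already carries lifetime information, which is one reason time-gated detection can be useful even without recording individual arrival times.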

It should be appreciated that a luminescent lifetime of a single molecule is indicative of the timing of photons emitted after the single molecule reaches an excited state and the single molecule can be distinguished by information indicative of the timing of the photons. Some embodiments may include distinguishing a molecule from a plurality of molecules based on the molecule’s luminescent lifetime by measuring times associated with photons emitted by the molecule. The distribution of times may provide an indication of the luminescent lifetime which may be determined from the distribution. In some embodiments, the single molecule is distinguishable from the plurality of molecules based on the distribution of times, such as by comparing the distribution of times to a reference distribution corresponding to a known molecule. In some embodiments, a value for the luminescent lifetime is determined from the distribution of times.

Luminescent quantum yield refers to the fraction of excitation events at a given wavelength or within a given spectral range that lead to an emission event, and is typically less than 1. In some embodiments, the luminescent quantum yield of a molecule described herein is between 0 and about 0.001, between about 0.001 and about 0.01, between about 0.01 and about 0.1, between about 0.1 and about 0.5, between about 0.5 and 0.9, or between about 0.9 and 1. In some embodiments, a molecule is identified by determining or estimating the luminescent quantum yield.

As used herein for single molecules, luminescent intensity refers to the number of emitted photons per unit time that are emitted by a molecule which is being excited by delivery of a pulsed excitation energy. In some embodiments, the luminescent intensity refers to the detected number of emitted photons per unit time that are emitted by a molecule which is being excited by delivery of a pulsed excitation energy, and are detected by a particular sensor or set of sensors.

The luminescent lifetime, luminescent quantum yield, and luminescent intensity may each vary for a given molecule under different conditions. In some embodiments, a single molecule will have a different observed luminescent lifetime, luminescent quantum yield, or luminescent intensity than for an ensemble of the molecules. In some embodiments, a molecule confined in a sample well (e.g., a nanoaperture) will have a different observed luminescent lifetime, luminescent quantum yield, or luminescent intensity than for molecules not confined in a sample well. In some embodiments, a luminescent label or luminescent molecule attached to another molecule will have a different luminescent lifetime, luminescent quantum yield, or luminescent intensity than the luminescent label or luminescent molecule not attached to another molecule. In some embodiments, a molecule interacting with a macromolecular complex (e.g., protein complex (e.g., nucleic acid polymerase)) will have different luminescent lifetime, luminescent quantum yield, or luminescent intensity than a molecule not interacting with a macromolecular complex.

In certain embodiments, a luminescent molecule described in the application absorbs one photon and emits one photon after a time duration. In some embodiments, the luminescent lifetime of a molecule can be determined or estimated by measuring the time duration. In some embodiments, the luminescent lifetime of a molecule can be determined or estimated by measuring a plurality of time durations for multiple pulse events and emission events. In some embodiments, the luminescent lifetime of a molecule can be differentiated amongst the luminescent lifetimes of a plurality of types of molecules by measuring the time duration. In some embodiments, the luminescent lifetime of a molecule can be differentiated amongst the luminescent lifetimes of a plurality of types of molecules by measuring a plurality of time durations for multiple pulse events and emission events. In certain embodiments, a molecule is identified or differentiated amongst a plurality of types of molecules by determining or estimating the luminescent lifetime of the molecule. In certain embodiments, a molecule is identified or differentiated amongst a plurality of types of molecules by differentiating the luminescent lifetime of the molecule amongst a plurality of the luminescent lifetimes of a plurality of types of molecules.
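One possible way to differentiate a molecule amongst a plurality of types using a plurality of measured time durations is sketched below in Python. The sketch compares the measured durations against a set of reference single-exponential decays and selects the reference with the highest log-likelihood. The single-exponential model, the candidate lifetimes, and the function names are assumptions made for illustration; other comparison methods may equally be used.

```python
import math


def log_likelihood(durations_ns, lifetime_ns):
    """Log-likelihood of the durations under an exponential decay model."""
    return sum(-math.log(lifetime_ns) - t / lifetime_ns for t in durations_ns)


def most_likely_lifetime(durations_ns, candidate_lifetimes_ns):
    """Return the candidate lifetime that best explains the durations."""
    return max(
        candidate_lifetimes_ns,
        key=lambda tau: log_likelihood(durations_ns, tau),
    )


if __name__ == "__main__":
    # Candidate lifetimes (ns) reuse example values from this disclosure.
    candidates = [0.25, 0.5, 1.0, 1.5]
    measured = [0.2, 0.3, 0.25, 0.15, 0.35]  # hypothetical durations (ns)
    print(most_likely_lifetime(measured, candidates))
```

Because the likelihood depends on every measured duration, the confidence of the assignment generally improves as more pulse and emission event pairs are collected.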

In certain embodiments, the luminescent emission event is a fluorescence. In certain embodiments, the luminescent emission event is a phosphorescence. As used herein, the term luminescence encompasses all luminescent events including both fluorescence and phosphorescence.

In one aspect, the application provides a method of determining the luminescent lifetime of a single luminescent molecule comprising: providing the luminescent molecule in a target volume; delivering a plurality of pulses of an excitation energy to a vicinity of the target volume; and detecting a plurality of luminescences from the luminescent molecule. In some embodiments, the method further comprises evaluating the distribution of the plurality of time durations between each pair of pulses and luminescences. In some embodiments, the method further comprises immobilizing the single luminescent molecule in the target volume.

In another aspect, the application provides a method of determining the luminescent lifetime of a plurality of molecules comprising: providing a plurality of luminescent molecules in a target volume; delivering a plurality of pulses of an excitation energy to a vicinity of the target volume; and detecting a plurality of luminescences from the luminescent molecules. In some embodiments, the method further comprises evaluating the distribution of the plurality of time durations between each pair of pulses and luminescences. In some embodiments, the method further comprises immobilizing the luminescent molecules in the target volume. In some embodiments, the plurality consists of between 2 and about 10 molecules, between about 10 and about 100 molecules, or between about 100 and about 1000 molecules. In some embodiments, the plurality consists of between about 1000 and about 10^6 molecules, between about 10^6 and about 10^9 molecules, between about 10^9 and about 10^12 molecules, between about 10^12 and about 10^15 molecules, or between about 10^15 and about 10^18 molecules. In some embodiments, all molecules of the plurality are the same type of molecule.

Excitation energy

In one aspect of methods described herein, one or more excitation energies are used to excite the luminescent labels of the molecules to be identified or distinguished. In some embodiments, an excitation energy is in the visible spectrum. In some embodiments, an excitation energy is in the ultraviolet spectrum. In some embodiments, an excitation energy is in the infrared spectrum. In some embodiments, one excitation energy is used to excite the luminescently labeled molecules. In some embodiments, two excitation energies are used to excite the luminescently labeled molecules. In some embodiments, three or more excitation energies are used to excite the luminescently labeled molecules. In some embodiments, each luminescently labeled molecule is excited by only one of the delivered excitation energies. In some embodiments, a luminescently labeled molecule is excited by two or more of the delivered excitation energies. In certain embodiments, an excitation energy may be monochromatic or confined to a spectral range. In some embodiments, a spectral range has a width of between about 0.1 nm and about 1 nm, between about 1 nm and about 2 nm, or between about 2 nm and about 5 nm. In some embodiments, a spectral range has a width of between about 5 nm and about 10 nm, between about 10 nm and about 50 nm, or between about 50 nm and about 100 nm.

In certain embodiments, excitation energy is delivered as a pulse of light. In certain embodiments, excitation energy is delivered as a plurality of pulses of light. In certain embodiments, two or more excitation energies are used to excite the luminescently labeled molecules. In some embodiments, each excitation energy is delivered at the same time (e.g., in each pulse). In some embodiments, each excitation energy is delivered at different times (e.g., in separate pulses of each energy). The different excitation energies may be delivered in any pattern sufficient to allow detection of luminescence from the target molecules. In some embodiments, two excitation energies are delivered in each pulse. In some embodiments, a first excitation energy and a second excitation energy are delivered in alternating pulses. In some embodiments, a first excitation energy is delivered in a series of sequential pulses, and a second excitation energy is delivered in a subsequent series of sequential pulses, or an alternating pattern of such series.

In certain embodiments, the frequency of pulses of light is selected based on the luminescent properties of the luminescently labeled molecule. In certain embodiments, the frequency of pulses of light is selected based on the luminescent properties of a plurality of luminescently labeled nucleotides. In certain embodiments, the frequency of pulses of light is selected based on the luminescent lifetime of a plurality of luminescently labeled nucleotides. In some embodiments, the frequency is selected so that the gap between pulses is longer than the luminescent lifetimes of one or more luminescently labeled nucleotides. In some embodiments, the frequency is selected based on the longest luminescent lifetime of the plurality of luminescently labeled nucleotides. For example, if the luminescent lifetimes of the four luminescently labeled nucleotides are 0.25, 0.5, 1.0, and 1.5 ns, the frequency of pulses of light may be selected so that the gap between pulses exceeds 1.5 ns. In some embodiments, the gap is between about two times and about ten times, between about ten times and about 100 times, or between about 100 times and about 1000 times longer than the luminescent lifetime of one or more luminescently labeled molecules being excited. In some embodiments, the gap is about 10 times longer than the luminescent lifetime of one or more luminescently labeled molecules being excited. In some embodiments, the gap is between about 0.01 ns and about 0.1 ns, between about 1 ns and about 5 ns, between about 5 ns and about 15 ns, between about 15 ns and about 25 ns, or between about 25 ns and about 50 ns. In some embodiments, the gap is selected such that there is a 50%, 75%, 90%, 95%, or 99% probability that the molecules excited by the pulse will luminescently decay or that the excited state will relax by another mechanism.
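The relationship between the gap, the luminescent lifetime, and the probability of decay can be illustrated with the following Python sketch. It assumes a single-exponential decay, so that the probability that an excited molecule has relaxed within a gap g is 1 - exp(-g / lifetime); the function names are hypothetical, and the pulse width is neglected when converting a gap to a pulse frequency.

```python
import math


def decay_probability(gap_ns, lifetime_ns):
    """Probability that an excited molecule has relaxed within gap_ns."""
    return 1.0 - math.exp(-gap_ns / lifetime_ns)


def gap_for_decay_probability(lifetime_ns, probability):
    """Smallest gap (ns) giving at least the requested decay probability."""
    if not 0.0 < probability < 1.0:
        raise ValueError("probability must be strictly between 0 and 1")
    return -lifetime_ns * math.log(1.0 - probability)


def pulse_frequency_mhz(gap_ns):
    """Pulse frequency for a given gap, neglecting the pulse width."""
    return 1000.0 / gap_ns


if __name__ == "__main__":
    lifetimes_ns = [0.25, 0.5, 1.0, 1.5]  # example lifetimes from the text
    longest_ns = max(lifetimes_ns)
    gap_ns = gap_for_decay_probability(longest_ns, 0.99)
    print("gap for 99% decay (ns):", round(gap_ns, 2))
    print("pulse frequency (MHz):", round(pulse_frequency_mhz(gap_ns), 1))
    print("decay probability at 10x lifetime:",
          decay_probability(10 * longest_ns, longest_ns))
```

Under this model, a gap of about 4.6 lifetimes corresponds to a 99% decay probability, and a gap of 10 lifetimes leaves well under 0.01% of excited molecules unrelaxed, which is consistent with selecting the gap from the longest lifetime in the set.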

In certain embodiments, wherein there are multiple excitation energies, the frequency of the pulses for each excitation energy is the same. In certain embodiments, wherein there are multiple excitation energies, the frequencies of the pulses for each excitation energy are different. For example, if a red laser is used to excite luminescent molecules with lifetimes of 0.2 and 0.5 ns, and a green laser is used to excite luminescent molecules with lifetimes of 5 ns and 7 ns, the gap after each red laser pulse may be shorter (e.g., 5 ns) than the gap after each green laser pulse (e.g., 20 ns).

In certain embodiments, the frequency of pulsed excitation energies is selected based on the chemical process being monitored. For a sequencing reaction, the frequency may be selected such that enough pulses are delivered to allow a sufficient number of emitted photons to be detected. A sufficient number, in the context of detected photons, refers to a number of photons necessary to identify or distinguish the luminescently labeled nucleotide from the plurality of luminescently labeled nucleotides. For example, a DNA polymerase may incorporate an additional nucleotide once every 20 milliseconds on average. The time that a luminescently labeled nucleotide interacts with the complex may be about 10 milliseconds, and the time between when the luminescent marker is cleaved and the next luminescently labeled nucleotide begins to interact may be about 10 milliseconds. The frequency of the pulsed excitation energy could then be selected to deliver sufficient pulses over 10 milliseconds such that a sufficient number of emitted photons are detected during the 10 milliseconds when the luminescently labeled nucleotide is being incorporated. For example, at a frequency of 100 MHz, there will be 1 million pulses in 10 milliseconds (the approximate length of the incorporation event). If 0.1% of these pulses lead to a detected photon, there will be 1,000 luminescent data points that can be analyzed to determine the identity of the luminescently labeled nucleotide being incorporated. All of the above values are non-limiting. In some embodiments, incorporation events may take between 1 ms and 20 ms, between 20 ms and 100 ms, or between 100 ms and 500 ms. In some embodiments in which multiple excitation energies are delivered in separately timed pulses, the luminescently labeled nucleotide may only be excited by a portion of the pulses. 
In some embodiments, the frequency and pattern of the pulses of multiple excitation energies is selected such that the number of pulses is sufficient to excite any one of the plurality of luminescently labeled nucleotides to allow for a sufficient number of emitted photons to be detected.
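The photon budget in the example above can be reproduced with the short Python sketch below. The function and parameter names are hypothetical; the `excite_fraction` parameter is an added assumption that models the case in which only a portion of the pulses (e.g., every other pulse when two excitation energies alternate) can excite a given luminescently labeled nucleotide.

```python
def expected_detected_photons(pulse_rate_hz, window_s, detect_fraction,
                              excite_fraction=1.0):
    """Expected number of detected photons during one incorporation event.

    pulse_rate_hz   -- pulse frequency of the excitation source
    window_s        -- duration of the incorporation event
    detect_fraction -- fraction of exciting pulses that yield a detected photon
    excite_fraction -- fraction of pulses able to excite this nucleotide
    """
    pulses = pulse_rate_hz * window_s
    return pulses * excite_fraction * detect_fraction


def required_pulse_rate_hz(photons_needed, window_s, detect_fraction,
                           excite_fraction=1.0):
    """Pulse frequency needed to collect photons_needed within window_s."""
    return photons_needed / (window_s * detect_fraction * excite_fraction)


if __name__ == "__main__":
    # Values from the example: 100 MHz, 10 ms event, 0.1% detection.
    print(expected_detected_photons(100e6, 10e-3, 0.001))
    # Same source, but only alternate pulses excite this nucleotide.
    print(expected_detected_photons(100e6, 10e-3, 0.001, excite_fraction=0.5))
```

Inverting the same relationship, as in `required_pulse_rate_hz`, gives a rough lower bound on the pulse frequency once a sufficient number of photons for identification has been chosen.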

In some embodiments, the frequency of pulses is between about 1 MHz and about 10 MHz. In some embodiments, the frequency of pulses is between about 10 MHz and about 100 MHz. In some embodiments, the frequency of pulses is between about 100 MHz and about 1 GHz. In some embodiments, the frequency of pulses is between about 50 MHz and about 200 MHz. In some embodiments, the frequency of pulses is about 100 MHz. In some embodiments, the frequency is stochastic.

In certain embodiments, the excitation energy is between about 500 nm and about 700 nm. In some embodiments, the excitation energy is between about 500 nm and about 600 nm, or about 600 nm and about 700 nm. In some embodiments, the excitation energy is between about 500 nm and about 550 nm, between about 550 nm and about 600 nm, between about 600 nm and about 650 nm, or between about 650 nm and about 700 nm.

In certain embodiments, a method described herein comprises delivery of two excitation energies. In some embodiments, the two excitation energies are separated by between about 5 nm and about 20 nm, between about 20 nm and about 40 nm, between about 40 nm and about 60 nm, between about 60 nm and about 80 nm, between about 80 nm and about 100 nm, between about 100 nm and about 150 nm, between about 150 nm and about 200 nm, between about 200 nm and about 400 nm, or by at least about 400 nm. In some embodiments, the two excitation energies are separated by between about 20 nm and about 80 nm, or between about 80 nm and about 160 nm.

When an excitation energy is referred to as being in a specific range, the excitation energy may comprise a single wavelength, such that the wavelength is between or at the endpoints of the range, or the excitation energy may comprise a spectrum of wavelengths with a maximum intensity, such that the maximum intensity is between or at the endpoints of the range.

In certain embodiments, the first excitation energy is in the range of 450 nm to 500 nm and the second excitation energy is in the range of 500 nm to 550 nm, 550 nm to 600 nm, 600 nm to 650 nm, or 650 nm to 700 nm. In certain embodiments, the first excitation energy is in the range of 500 nm to 550 nm and the second excitation energy is in the range of 450 nm to 500 nm, 550 nm to 600 nm, 600 nm to 650 nm, or 650 nm to 700 nm. In certain embodiments, the first excitation energy is in the range of 550 nm to 600 nm and the second excitation energy is in the range of 450 nm to 500 nm, 500 nm to 550 nm, 600 nm to 650 nm, or 650 nm to 700 nm. In certain embodiments, the first excitation energy is in the range of 600 nm to 650 nm and the second excitation energy is in the range of 450 nm to 500 nm, 500 nm to 550 nm, 550 nm to 600 nm, or 650 nm to 700 nm. In certain embodiments, the first excitation energy is in the range of 650 nm to 700 nm and the second excitation energy is in the range of 450 nm to 500 nm, 500 nm to 550 nm, 550 nm to 600 nm, or 600 nm to 650 nm.

In certain embodiments, the first excitation energy is in the range of 450 nm to 500 nm and the second excitation energy is in the range of 500 nm to 550 nm. In certain embodiments, the first excitation energy is in the range of 450 nm to 500 nm and the second excitation energy is in the range of 550 nm to 600 nm. In certain embodiments, the first excitation energy is in the range of 450 nm to 500 nm and the second excitation energy is in the range of 600 nm to 670 nm. In certain embodiments, the first excitation energy is in the range of 500 nm to 550 nm and the second excitation energy is in the range of 550 nm to 600 nm. In certain embodiments, the first excitation energy is in the range of 500 nm to 550 nm and the second excitation energy is in the range of 600 nm to 670 nm. In certain embodiments, the first excitation energy is in the range of 550 nm to 600 nm and the second excitation energy is in the range of 600 nm to 670 nm. In certain embodiments, the first excitation energy is in the range of 470 nm to 510 nm and the second excitation energy is in the range of 510 nm to 550 nm. In certain embodiments, the first excitation energy is in the range of 470 nm to 510 nm and the second excitation energy is in the range of 550 nm to 580 nm. In certain embodiments, the first excitation energy is in the range of 470 nm to 510 nm and the second excitation energy is in the range of 580 nm to 620 nm. In certain embodiments, the first excitation energy is in the range of 470 nm to 510 nm and the second excitation energy is in the range of 620 nm to 670 nm. In certain embodiments, the first excitation energy is in the range of 510 nm to 550 nm and the second excitation energy is in the range of 550 nm to 580 nm. In certain embodiments, the first excitation energy is in the range of 510 nm to 550 nm and the second excitation energy is in the range of 580 nm to 620 nm. 
In certain embodiments, the first excitation energy is in the range of 510 nm to 550 nm and the second excitation energy is in the range of 620 nm to 670 nm. In certain embodiments, the first excitation energy is in the range of 550 nm to 580 nm and the second excitation energy is in the range of 580 nm to 620 nm. In certain embodiments, the first excitation energy is in the range of 550 nm to 580 nm and the second excitation energy is in the range of 620 nm to 670 nm. In certain embodiments, the first excitation energy is in the range of 580 nm to 620 nm and the second excitation energy is in the range of 620 nm to 670 nm.

Certain embodiments of excitation energy sources and devices for delivery of excitation energy pulses to a target volume are described elsewhere herein.

Luminescently labeled nucleotides

In one aspect, methods and compositions described herein comprise one or more luminescently labeled nucleotides. In certain embodiments, one or more nucleotides comprise deoxyribose nucleosides. In some embodiments, all nucleotides comprise deoxyribose nucleosides. In certain embodiments, one or more nucleotides comprise ribose nucleosides. In some embodiments, all nucleotides comprise ribose nucleosides. In some embodiments, one or more nucleotides comprise a modified ribose sugar or ribose analog (e.g., a locked nucleic acid). In some embodiments, one or more nucleotides comprise naturally occurring bases (e.g., cytosine, guanine, adenine, thymine, uracil). In some embodiments, one or more nucleotides comprise derivatives or analogs of cytosine, guanine, adenine, thymine, or uracil.

In certain embodiments, a method comprises the step of exposing a polymerase complex to a plurality of luminescently labeled nucleotides. In certain embodiments, a composition or device comprises a reaction mixture comprising a plurality of luminescently labeled nucleotides. In some embodiments, the plurality of nucleotides comprises four types of nucleotides. In some embodiments, the four types of nucleotides each comprise one of cytosine, guanine, adenine, and thymine. In some embodiments, the four types of nucleotides each comprise one of cytosine, guanine, adenine, and uracil.

In certain embodiments, the concentration of each type of luminescently labeled nucleotide in the reaction mixture is between about 50 nM and about 200 nM, about 200 nM and about 500 nM, about 500 nM and about 1 µM, about 1 µM and about 50 µM, or about 50 µM and 250 µM. In some embodiments, the concentration of each type of luminescently labeled nucleotide in the reaction mixture is between about 250 nM and about 2 µM. In some embodiments, the concentration of each type of luminescently labeled nucleotide in the reaction mixture is about 1 µM.

In certain embodiments, the reaction mixture contains additional reagents of use for sequencing reactions. In some embodiments, the reaction mixture comprises a buffer. In some embodiments, a buffer comprises 3-(N-morpholino)propanesulfonic acid (MOPS). In some embodiments, a buffer is present in a concentration of between about 1 mM and about 100 mM. In some embodiments, the concentration of MOPS is about 50 mM. In some embodiments, the reaction mixture comprises one or more salts. In some embodiments, a salt comprises potassium acetate. In some embodiments, the concentration of potassium acetate is about 140 mM. In some embodiments, a salt is present in a concentration of between about 1 mM and about 200 mM. In some embodiments, the reaction mixture comprises a magnesium salt (e.g., magnesium acetate). In some embodiments, the concentration of magnesium acetate is about 20 mM. In some embodiments, a magnesium salt is present in a concentration of between about 1 mM and about 50 mM. In some embodiments, the reaction mixture comprises a reducing agent. In some embodiments, a reducing agent is dithiothreitol (DTT). In some embodiments, a reducing agent is present in a concentration of between about 1 mM and about 50 mM. In some embodiments, the concentration of DTT is about 5 mM. In some embodiments, the reaction mixture comprises one or more photostabilizers. In some embodiments, the reaction mixture comprises an anti-oxidant, oxygen scavenger, or triplet state quencher. In some embodiments, a photostabilizer comprises protocatechuic acid (PCA). In some embodiments, a photostabilizer comprises 4-nitrobenzyl alcohol (NBA). In some embodiments, a photostabilizer is present in a concentration of between about 0.1 mM and about 20 mM. In some embodiments, the concentration of PCA is about 3 mM. In some embodiments, the concentration of NBA is about 3 mM. A mixture with a photostabilizer (e.g., PCA) may also comprise an enzyme to regenerate the photostabilizer (e.g., protocatechuic acid dioxygenase (PCD)). In some embodiments, the concentration of PCD is about 0.3 mM.
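
As an illustration only, the reagent concentrations recited above can be collected into a single configuration and checked against the recited ranges. The following Python sketch is not part of the disclosure: the names `EXAMPLE_MIXTURE_MM`, `RANGES_MM`, and `out_of_range` are hypothetical, and the values are simply the "about" figures and ranges stated in the preceding paragraph.

```python
# Illustrative only: "about" concentrations from the paragraph above, in mM.
EXAMPLE_MIXTURE_MM = {
    "MOPS": 50.0,
    "potassium acetate": 140.0,
    "magnesium acetate": 20.0,
    "DTT": 5.0,
    "PCA": 3.0,
    "NBA": 3.0,
}

# Recited ranges (low, high) in mM for each reagent class.
RANGES_MM = {
    "MOPS": (1.0, 100.0),               # buffer
    "potassium acetate": (1.0, 200.0),  # salt
    "magnesium acetate": (1.0, 50.0),   # magnesium salt
    "DTT": (1.0, 50.0),                 # reducing agent
    "PCA": (0.1, 20.0),                 # photostabilizer
    "NBA": (0.1, 20.0),                 # photostabilizer
}


def out_of_range(mixture, ranges):
    """Return the names of reagents whose concentration falls outside its range."""
    return [
        name
        for name, conc in mixture.items()
        if not (ranges[name][0] <= conc <= ranges[name][1])
    ]
```

With the example values, `out_of_range(EXAMPLE_MIXTURE_MM, RANGES_MM)` returns an empty list, since each "about" value lies within its recited range.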

The application contemplates different methods for differentiating nucleotides amongst a plurality of nucleotides. In certain embodiments, each of the luminescently labeled nucleotides has a different luminescent lifetime. In certain embodiments, two or more of the luminescently labeled nucleotides have the same luminescent lifetimes or substantially the same luminescent lifetimes (e.g., lifetimes that cannot be distinguished by the method or device).

In certain embodiments, each of the luminescently labeled nucleotides absorbs excitation energy in a different spectral range. In certain embodiments, two of the luminescently labeled nucleotides absorb excitation energy in the same spectral range. In certain embodiments, three of the luminescently labeled nucleotides absorb excitation energy in the same spectral range. In certain embodiments, four or more of the luminescently labeled nucleotides absorb excitation energy in the same spectral range. In certain embodiments, two of the luminescently labeled nucleotides absorb excitation energy in a different spectral range. In certain embodiments, three of the luminescently labeled nucleotides absorb excitation energy in a different spectral range. In certain embodiments, four or more of the luminescently labeled nucleotides absorb excitation energy in a different spectral range.

In certain embodiments, each of the luminescently labeled nucleotides emits photons in a different spectral range. In certain embodiments, two of the luminescently labeled nucleotides emit photons in the same spectral range. In certain embodiments, three of the luminescently labeled nucleotides emit photons in the same spectral range. In certain embodiments, four or more of the luminescently labeled nucleotides emit photons in the same spectral range. In certain embodiments, two of the luminescently labeled nucleotides emit photons in a different spectral range. In certain embodiments, three of the luminescently labeled nucleotides emit photons in a different spectral range. In certain embodiments, four or more of the luminescently labeled nucleotides emit photons in a different spectral range.

In certain embodiments, each of four luminescently labeled nucleotides has a different luminescent lifetime. In certain embodiments, two or more luminescently labeled nucleotides have different luminescent lifetimes and absorb and/or emit photons in a first spectral range, and one or more luminescently labeled nucleotides absorb and/or emit photons in a second spectral range. In some embodiments, each of three luminescently labeled nucleotides has a different luminescent lifetime and emits luminescence in a first spectral range, and a fourth luminescently labeled nucleotide absorbs and/or emits photons in a second spectral range. In some embodiments, each of two luminescently labeled nucleotides has a different luminescent lifetime and emits luminescence in a first spectral range, and a third and fourth luminescently labeled nucleotide each have different luminescent lifetimes and emit luminescence in a second spectral range.

In certain embodiments, each of four luminescently labeled nucleotides has a different luminescent intensity. In certain embodiments, two or more luminescently labeled nucleotides have different luminescent intensity and emit luminescence in a first spectral range, and one or more luminescently labeled nucleotides absorb and/or emit photons in a second spectral range. In some embodiments, each of three luminescently labeled nucleotides has a different luminescent intensity and emits luminescence in a first spectral range, and a fourth luminescently labeled nucleotide absorbs and/or emits photons in a second spectral range. In some embodiments, each of two luminescently labeled nucleotides has a different luminescent intensity and emits luminescence in a first spectral range, and a third and fourth luminescently labeled nucleotide each have different luminescent intensity and emit luminescence in a second spectral range.

In certain embodiments, each of four luminescently labeled nucleotides has a different luminescent lifetime or luminescent intensity. In certain embodiments, two or more luminescently labeled nucleotides have different luminescent lifetime or luminescent intensity and emit luminescence in a first spectral range, and one or more luminescently labeled nucleotides absorb and/or emit photons in a second spectral range. In some embodiments, each of three luminescently labeled nucleotides has a different luminescent lifetime or luminescent intensity and emits luminescence in a first spectral range, and a fourth luminescently labeled nucleotide absorbs and/or emits photons in a second spectral range. In some embodiments, each of two luminescently labeled nucleotides has a different luminescent lifetime or luminescent intensity and emits luminescence in a first spectral range, and a third and fourth luminescently labeled nucleotide each have different luminescent lifetime or luminescent intensity and emit luminescence in a second spectral range.

In certain embodiments, two or more luminescently labeled nucleotides have different luminescent lifetimes and absorb excitation energy in a first spectral range, and one or more luminescently labeled nucleotides absorb excitation energy in a second spectral range. In some embodiments, each of three luminescently labeled nucleotides has a different luminescent lifetime and absorbs excitation energy in a first spectral range, and a fourth luminescently labeled nucleotide absorbs excitation energy in a second spectral range. In some embodiments, each of two luminescently labeled nucleotides has a different luminescent lifetime and absorbs excitation energy in a first spectral range, and a third and fourth luminescently labeled nucleotide each have different luminescent lifetimes and absorb excitation energy in a second spectral range.

In certain embodiments, two or more luminescently labeled nucleotides have different luminescent lifetime or luminescent intensity and absorb excitation energy in a first spectral range, and one or more luminescently labeled nucleotides absorb excitation energy in a second spectral range. In some embodiments, each of three luminescently labeled nucleotides has a different luminescent lifetime or luminescent intensity and absorbs excitation energy in a first spectral range, and a fourth luminescently labeled nucleotide absorbs excitation energy in a second spectral range. In some embodiments, each of two luminescently labeled nucleotides has a different luminescent lifetime or luminescent intensity and absorbs excitation energy in a first spectral range, and a third and fourth luminescently labeled nucleotide each have different luminescent lifetime or luminescent intensity and absorb excitation energy in a second spectral range.

During sequencing, the method of identifying a nucleotide may vary between various base pairs in the sequence. In certain embodiments, two types of nucleotides may be labeled to absorb at a first excitation energy, and those two types of nucleotides (e.g., A, G) are distinguished based on different luminescent intensity, whereas two additional types of nucleotides (e.g., C, T) may be labeled to absorb at a second excitation energy, and those two additional types of nucleotides are distinguished based on different luminescent lifetime. For such an embodiment, during sequencing certain segments of the sequence may be determined only based on luminescent intensity (e.g., segments incorporating only A and G), whereas other segments of the sequence may be determined only based on luminescent lifetime (e.g., segments incorporating only C and T). In some embodiments, between 2 and 4 luminescently labeled nucleotides are differentiated based on luminescent lifetime. In some embodiments, between 2 and 4 luminescently labeled nucleotides are differentiated based on luminescent intensity. In some embodiments, between 2 and 4 luminescently labeled nucleotides are differentiated based on luminescent lifetime and luminescent intensity.
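
The mixed scheme described above (A and G distinguished by intensity under a first excitation energy, C and T distinguished by lifetime under a second) can be sketched as a simple decision rule. The Python below is a hypothetical illustration, not the disclosed method: the function name `call_base`, both cutoff values, and the choice of which label in each pair is brighter or longer-lived are all assumptions made for the example.

```python
def call_base(excitation, intensity, lifetime_ns,
              intensity_cutoff=500.0, lifetime_cutoff_ns=1.0):
    """Assign a base from burst properties under a hypothetical A/G/C/T scheme.

    excitation: 1 for the first excitation energy (A, G),
                2 for the second excitation energy (C, T).
    intensity: detected luminescences per unit time (arbitrary units).
    lifetime_ns: estimated luminescent lifetime in nanoseconds.
    The cutoffs, and which label is brighter or longer-lived, are assumptions.
    """
    if excitation == 1:
        # A and G share an excitation energy; only intensity separates them.
        return "A" if intensity >= intensity_cutoff else "G"
    if excitation == 2:
        # C and T share an excitation energy; only lifetime separates them.
        return "C" if lifetime_ns >= lifetime_cutoff_ns else "T"
    raise ValueError("excitation must be 1 or 2")
```

Under this rule, a segment containing only A and G is called from intensity alone and a segment containing only C and T from lifetime alone, which mirrors the segment-wise behavior described in the paragraph.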

Luminescence detection

In one aspect of methods described herein, an emitted photon (a luminescence) or a plurality of emitted photons is detected by one or more sensors. For a plurality of luminescently labeled molecules or nucleotides, each of the molecules may emit photons in a single spectral range, or a portion of the molecules may emit photons in a first spectral range and another portion of molecules may emit photons in a second spectral range. In certain embodiments, the emitted photons are detected by a single sensor. In certain embodiments, the emitted photons are detected by multiple sensors. In some embodiments, the photons emitted in a first spectral range are detected by a first sensor, and the photons emitted in a second spectral range are detected by a second sensor. In some embodiments, the photons emitted in each of a plurality of spectral ranges are detected by a different sensor.

In certain embodiments, each sensor is configured to assign a time bin to an emitted photon based on the time duration between the excitation energy and the emitted photon. In some embodiments, photons emitted after a shorter time duration will be assigned an earlier time bin, and photons emitted after a longer duration will be assigned a later time bin.
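
A minimal sketch of time-bin assignment follows, assuming the delay between the excitation pulse and the detected photon is already available as a number of nanoseconds. The bin edges in `BIN_EDGES_NS` and the function name `assign_time_bin` are hypothetical; the disclosure does not specify bin widths or count.

```python
from bisect import bisect_right

# Hypothetical bin edges in nanoseconds after the excitation pulse.
BIN_EDGES_NS = [0.0, 1.0, 2.0, 4.0, 8.0]


def assign_time_bin(delay_ns, edges=BIN_EDGES_NS):
    """Return the index of the time bin containing delay_ns.

    Bin i covers edges[i] <= delay < edges[i + 1]. Delays outside the
    covered window return None. Shorter delays map to earlier bins.
    """
    if delay_ns < edges[0] or delay_ns >= edges[-1]:
        return None
    return bisect_right(edges, delay_ns) - 1
```

With these edges, a photon arriving 0.5 ns after the pulse falls in bin 0 and a photon arriving 3.0 ns after the pulse falls in bin 2, so longer delays are assigned later bins as described above.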

In some embodiments, a plurality of pulses of excitation energy is delivered to the vicinity of a target volume and a plurality of photons, which may include photon emission events, are detected. In some embodiments, the plurality of luminescences (e.g., photon emission events) correspond to incorporation of a luminescently labeled nucleotide into a nucleic acid product. In some embodiments, the incorporation of a luminescently labeled nucleotide lasts for between about 1 ms and about 5 ms, between about 5 ms and about 20 ms, between about 20 ms and about 100 ms, or between about 100 ms and about 500 ms. In some embodiments, between about 10 and about 100, between about 100 and about 1000, between about 1000 and about 10000, or between about 10000 and about 100000 luminescences are detected during incorporation of a luminescently labeled nucleotide.

In certain embodiments, there are no luminescences detected if a luminescently labeled nucleotide is not being incorporated. In some embodiments, there is a luminescence background. In some embodiments, spurious luminescences are detected when no luminescently labeled nucleotide is being incorporated. Such spurious luminescences may occur if one or more luminescently labeled nucleotides is in the target volume (e.g., diffuses into the target volume, or interacts with polymerase but is not incorporated) during a pulse of excitation energy, but is not being incorporated by the sequencing reaction. In some embodiments, the plurality of luminescences detected from a luminescently labeled nucleotide that is in the target volume but is not being incorporated is smaller (e.g., ten times, 100 times, 1000 times, 10000 times smaller) than the plurality of luminescences detected from a luminescently labeled nucleotide that is being incorporated.

In some embodiments, for each plurality of detected luminescences corresponding to incorporation of a luminescently labeled nucleotide the luminescences are assigned a time bin based on the time duration between the pulse and the emitted photon. This plurality for an incorporation event is referred to herein as a “burst”. In some embodiments, a burst refers to a series of signals (e.g., measurements) above a baseline (e.g., noise threshold value), wherein the signals correspond to a plurality of emission events that occur when the luminescently labeled nucleotide is within the excitation region. In some embodiments, a burst is separated from a preceding and/or subsequent burst by a time interval of signals representative of the baseline. In some embodiments, the burst is analyzed by determining the luminescent lifetime based on the plurality of time durations. In some embodiments, the burst is analyzed by determining the luminescent intensity based on the number of detected luminescences per a unit of time. In some embodiments, the burst is analyzed by determining the spectral range of the detected luminescences. In some embodiments, analyzing the burst data will allow assignment of the identity of the incorporated luminescently labeled nucleotide, or allow one or more luminescently labeled nucleotides to be differentiated from amongst a plurality of luminescently labeled nucleotides. The assignment or differentiation may rely on any one of luminescent lifetime, luminescent intensity, spectral range of the emitted photons, or any combination thereof.
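
The burst analysis described above can be sketched in a few lines, under simplifying assumptions. In the hypothetical Python below, `counts` is a list of detected luminescences per fixed time frame, a burst is any run of frames above a baseline value, intensity is luminescences per unit time within the burst, and the lifetime is estimated as the mean pulse-to-photon delay (the maximum-likelihood estimate for a single-exponential decay with no background and no truncation). None of these names or modeling choices come from the disclosure.

```python
def find_bursts(counts, baseline, min_len=2):
    """Return (start, end) index pairs of runs where counts exceed baseline.

    end is exclusive. Runs shorter than min_len frames are ignored.
    """
    bursts, start = [], None
    for i, c in enumerate(counts):
        if c > baseline and start is None:
            start = i                      # a burst begins
        elif c <= baseline and start is not None:
            if i - start >= min_len:       # a burst ends at the baseline
                bursts.append((start, i))
            start = None
    if start is not None and len(counts) - start >= min_len:
        bursts.append((start, len(counts)))
    return bursts


def mean_intensity(counts, burst, frame_time_s):
    """Detected luminescences per second within a burst."""
    start, end = burst
    return sum(counts[start:end]) / ((end - start) * frame_time_s)


def estimate_lifetime_ns(delays_ns):
    """Estimate the luminescent lifetime as the mean pulse-to-photon delay.

    Assumes a single-exponential decay with no background and no
    truncation of the detection window; real data would need more care.
    """
    return sum(delays_ns) / len(delays_ns)
```

The per-burst intensity and lifetime produced this way are the kind of quantities that could then be compared across labels to assign or differentiate the incorporated nucleotide.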

Luminescent labels

The terms luminescent tag, luminescent label and luminescent marker are used interchangeably throughout, and relate to molecules comprising one or more luminescent molecules. In certain embodiments, the incorporated molecule is a luminescent molecule, e.g., without attachment of a distinct luminescent label. Typical nucleotides and amino acids are not luminescent, or do not luminesce within suitable ranges of excitation and emission energies. In certain embodiments, the incorporated molecule comprises a luminescent label. In certain embodiments, the incorporated molecule is a luminescently labeled nucleotide. In certain embodiments, the incorporated molecule is a luminescently labeled amino acid or luminescently labeled tRNA. In some embodiments, a luminescently labeled nucleotide comprises a nucleotide and a luminescent label. In some embodiments, a luminescently labeled nucleotide comprises a nucleotide, a luminescent label, and a linker. In some embodiments, the luminescent label is a fluorophore. In certain embodiments, the luminescent label, and optionally the linker, remain attached to the incorporated molecule. In certain embodiments, the luminescent label, and optionally the linker, are cleaved from the molecule during or after the process of incorporation.

In certain embodiments, the luminescent label is a cyanine dye, or an analog thereof. In some embodiments, the cyanine dye is of formula: or a salt, stereoisomer, or tautomer thereof, wherein:

A 1 and A 2 are joined to form an optionally substituted, aromatic or non-aromatic, monocyclic or polycyclic, heterocyclic ring;

B 1 and B 2 are joined to form an optionally substituted, aromatic or non-aromatic, monocyclic or polycyclic, heterocyclic ring; each of R 1 and R 2 is independently hydrogen, optionally substituted alkyl; and each of L 1 and L 2 is independently hydrogen, optionally substituted alkyl, or L 1 and L 2 are joined to form an optionally substituted, aromatic or non-aromatic, monocyclic or polycyclic, carbocyclic ring.

In certain embodiments, the luminescent label is a rhodamine dye, or an analog thereof.

In some embodiments, the rhodamine dye is of formula: or a salt, stereoisomer, or tautomer thereof, wherein: each of A 1 and A 2 is independently hydrogen, optionally substituted alkyl, optionally substituted aromatic or non-aromatic heterocyclyl, optionally substituted aromatic or non-aromatic carbocyclyl, or optionally substituted carbonyl, or A 1 and A 2 are joined to form an optionally substituted, aromatic or non-aromatic, monocyclic or polycyclic, heterocyclic ring; each of B 1 and B 2 is independently hydrogen, optionally substituted alkyl, optionally substituted, aromatic or non-aromatic heterocyclyl, optionally substituted, aromatic or non-aromatic carbocyclyl, or optionally substituted carbonyl, or B 1 and B 2 are joined to form an optionally substituted, aromatic or non-aromatic, monocyclic or polycyclic, heterocyclic ring; each of R 2 and R 3 is independently hydrogen, optionally substituted alkyl, optionally substituted aryl, or optionally substituted acyl; and

R 4 is hydrogen, optionally substituted alkyl, optionally substituted aromatic or non-aromatic heterocyclyl, optionally substituted aromatic or non-aromatic carbocyclyl, or optionally substituted carbonyl.

In some embodiments, R 4 is optionally substituted phenyl. In some embodiments, R 4 is optionally substituted phenyl, wherein at least one substituent is optionally substituted carbonyl. In some embodiments, R 4 is optionally substituted phenyl, wherein at least one substituent is optionally substituted sulfonyl.

Typically, the luminescent label comprises an aromatic or heteroaromatic compound and can be a pyrene, anthracene, naphthalene, acridine, stilbene, indole, benzindole, oxazole, carbazole, thiazole, benzothiazole, phenanthridine, phenoxazine, porphyrin, quinoline, ethidium, benzamide, cyanine, carbocyanine, salicylate, anthranilate, coumarin, fluorescein, rhodamine or other like compound. Exemplary dyes include xanthene dyes, such as fluorescein or rhodamine dyes, including 5-carboxyfluorescein (FAM), 2'7'-dimethoxy-4'5'-dichloro-6-carboxyfluorescein (JOE), tetrachlorofluorescein (TET), 6-carboxyrhodamine (R6G), N,N,N',N'-tetramethyl-6-carboxyrhodamine (TAMRA), and 6-carboxy-X-rhodamine (ROX). Exemplary dyes also include naphthylamine dyes that have an amino group in the alpha or beta position. For example, naphthylamino compounds include 1-dimethylaminonaphthyl-5-sulfonate, 1-anilino-8-naphthalene sulfonate, 2-p-toluidinyl-6-naphthalene sulfonate, and 5-(2'-aminoethyl)aminonaphthalene-1-sulfonic acid (EDANS). Other exemplary dyes include coumarins, such as 3-phenyl-7-isocyanatocoumarin; acridines, such as 9-isothiocyanatoacridine and acridine orange; N-(p-(2-benzoxazolyl)phenyl)maleimide; cyanines, such as indodicarbocyanine 3 (Cy®3), (2Z)-2-[(E)-3-[3-(5-carboxypentyl)-1,1-dimethyl-6,8-disulfobenzo[e]indol-3-ium-2-yl]prop-2-enylidene]-3-ethyl-1,1-dimethyl-8-(trioxidanylsulfanyl)benzo[e]indole-6-sulfonate (Cy®3.5), 2-{2-[(2,5-dioxopyrrolidin-1-yl)oxy]-2-oxoethyl]-16,16,18,18-tetramethyl-6,7,7a,8a,9,10,16,18-octahydrobenzo[2",3"]indolizino[8",7":5',6']pyrano[3',2':3,4]pyrido[1,2-a]indol-5-ium-14-sulfonate (Cy®3B), indodicarbocyanine 5 (Cy®5), indodicarbocyanine 5.5 (Cy®5.5), 3-(- carboxy-pentyl)-3'-ethyl-5,5'-dimethyloxacarbocyanine (CyA); 1H,5H,11H,15H-Xantheno[2,3,4-ij:5,6,7-i'j']diquinolizin-18-ium, 9-[2(or 4)-[[[6-[2,5-dioxo-1-pyrrolidinyl)oxy]-6-oxohexyl]amino]sulfonyl]-4(or 2)-sulfophenyl]-2,3,6,7,12,13,16,17-octahydro-inner salt (TR or Texas Red®); BODIPY® dyes; benzoxazoles; stilbenes; pyrenes; and the like.

For nucleotide sequencing, certain combinations of luminescently labeled nucleotides may be preferred. In some embodiments, at least one of the luminescently labeled nucleotides comprises a cyanine dye, or analog thereof. In some embodiments, at least one of the luminescently labeled nucleotides comprises a rhodamine dye, or analog thereof. In some embodiments, at least two luminescently labeled nucleotides each comprise a cyanine dye, or analog thereof. In some embodiments, at least two luminescently labeled nucleotides each comprise a rhodamine dye, or analog thereof. In some embodiments, at least three luminescently labeled nucleotides each comprise a cyanine dye, or analog thereof. In some embodiments, at least three luminescently labeled nucleotides each comprise a rhodamine dye, or analog thereof. In some embodiments, at least four luminescently labeled nucleotides each comprise a cyanine dye, or analog thereof. 
In some embodiments, at least four luminescently labeled nucleotides each comprise a rhodamine dye, or analog thereof. In some embodiments, three luminescently labeled nucleotides comprise a cyanine dye, or analog thereof, and a fourth luminescently labeled nucleotide comprises a rhodamine dye, or analog thereof. In some embodiments, two luminescently labeled nucleotides comprise a cyanine dye, or analog thereof, and a third, and optionally a fourth, luminescently labeled nucleotide comprises a rhodamine dye, or analog thereof. In some embodiments, three luminescently labeled nucleotides comprise a rhodamine dye, or analog thereof, and a third, and optionally a fourth, luminescently labeled nucleotide comprises a cyanine dye, or analog thereof.

In some embodiments, at least one labeled nucleotide is linked to two or more dyes (e.g., two or more copies of the same dye and/or two or more different dyes).

In some embodiments, at least two luminescently labeled nucleotides absorb a first excitation energy, wherein at least one of the luminescently labeled nucleotides comprises a cyanine dye, or analog thereof, and at least one of the luminescently labeled nucleotides comprises a rhodamine dye, or an analog thereof. In some embodiments, at least two luminescently labeled nucleotides absorb a second excitation energy, wherein at least one of the luminescently labeled nucleotides comprises a cyanine dye, or analog thereof, and at least one of the luminescently labeled nucleotides comprises a rhodamine dye, or an analog thereof. In some embodiments, at least two luminescently labeled nucleotides absorb a first excitation energy, wherein at least one of the luminescently labeled nucleotides comprises a cyanine dye, or analog thereof, and at least one of the luminescently labeled nucleotides comprises a rhodamine dye, or an analog thereof, and at least two additional luminescently labeled nucleotides absorb a second excitation energy, wherein at least one of the luminescently labeled nucleotides comprises a cyanine dye, or analog thereof, and at least one of the luminescently labeled nucleotides comprises a rhodamine dye, or an analog thereof. In some embodiments, at least two luminescently labeled nucleotides absorb a first excitation energy, wherein at least one of the luminescently labeled nucleotides has a luminescent lifetime of less than about 1 ns, and at least one of the luminescently labeled nucleotides has a luminescent lifetime of greater than 1 ns. In some embodiments, at least two luminescently labeled nucleotides absorb a second excitation energy, wherein at least one of the luminescently labeled nucleotides has a luminescent lifetime of less than about 1 ns, and at least one of the luminescently labeled nucleotides has a luminescent lifetime of greater than 1 ns. 
In some embodiments, at least two luminescently labeled nucleotides absorb a first excitation energy, wherein at least one of the luminescently labeled nucleotides has a luminescent lifetime of less than about 1 ns, and at least one of the luminescently labeled nucleotides has a luminescent lifetime of greater than 1 ns, and at least two additional luminescently labeled nucleotides absorb a second excitation energy, wherein at least one of the luminescently labeled nucleotides has a luminescent lifetime of less than about 1 ns, and at least one of the luminescently labeled nucleotides has a luminescent lifetime of greater than 1 ns.

In certain embodiments, the luminescent label is a dye selected from Table 1. The dyes listed in Table 1 are non-limiting, and the luminescent labels of the application may include dyes not listed in Table 1. In certain embodiments, the luminescent labels of one or more luminescently labeled nucleotides are selected from Table 1. In certain embodiments, the luminescent labels of four or more luminescently labeled nucleotides are selected from Table 1.

Table 1. Exemplary fluorophores.

Dyes may also be classified based on the wavelength of maximum absorbance or emitted luminescence. Table 2 provides exemplary fluorophores grouped into columns according to approximate wavelength of maximum absorbance. The dyes listed in Table 2 are non-limiting, and the luminescent labels of the application may include dyes not listed in Table 2. The exact maximum absorbance or emission wavelength may not correspond to the indicated spectral ranges. In certain embodiments, the luminescent labels of one or more luminescently labeled nucleotides are selected from the “Red” group listed in Table 2. In certain embodiments, the luminescent labels of one or more luminescently labeled nucleotides are selected from the “Green” group listed in Table 2. In certain embodiments, the luminescent labels of one or more luminescently labeled nucleotides are selected from the “Yellow/Orange” group listed in Table 2. In certain embodiments, the luminescent labels of four nucleotides are selected such that all are selected from one of the “Red”, “Yellow/Orange”, or “Green” groups listed in Table 2. In certain embodiments, the luminescent labels of four nucleotides are selected such that three are selected from a first group of the “Red”, “Yellow/Orange”, and “Green” groups listed in Table 2, and the fourth is selected from a second group of the “Red”, “Yellow/Orange”, and “Green” groups listed in Table 2. In certain embodiments, the luminescent labels of four nucleotides are selected such that two are selected from a first group of the “Red”, “Yellow/Orange”, and “Green” groups listed in Table 2, and the third and fourth are selected from a second group of the “Red”, “Yellow/Orange”, and “Green” groups listed in Table 2. In certain embodiments, the luminescent labels of four nucleotides are selected such that two are selected from a first group of the “Red”, “Yellow/Orange”, and “Green” groups listed in Table 2, a third is selected from a second group of the “Red”, “Yellow/Orange”, and “Green” groups listed in Table 2, and a fourth is selected from a third group of the “Red”, “Yellow/Orange”, and “Green” groups listed in Table 2.

Table 2. Exemplary fluorophores by spectral range.

In certain embodiments, the luminescent label may be (Dye 101), (Dye 102), (Dye 103), (Dye 104), (Dye 105), or (Dye 106), of formulae (in NHS ester form): optionally protonated. In some embodiments, the dyes above are attached to the linker or nucleotide by formation of an amide bond at the indicated point of attachment.

In certain embodiments, the luminescent label may comprise a first and second chromophore. In some embodiments, an excited state of the first chromophore is capable of relaxation via an energy transfer to the second chromophore. In some embodiments, the energy transfer is a Förster resonance energy transfer (FRET). Such a FRET pair may be useful for providing a luminescent label with properties that make the label easier to differentiate from amongst a plurality of luminescent labels. In certain embodiments, the FRET pair may absorb excitation energy in a first spectral range and emit luminescence in a second spectral range.
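
For context, the efficiency of Förster resonance energy transfer between two chromophores is conventionally modeled as E = 1 / (1 + (r / R0)^6), where r is the distance between the chromophores and R0 is the Förster radius of the pair (the distance at which E is 0.5). This relation is standard background rather than something recited in the disclosure, and the function name and arguments in the Python sketch below are hypothetical.

```python
def fret_efficiency(distance_nm, forster_radius_nm):
    """Förster energy-transfer efficiency E = 1 / (1 + (r / R0)**6).

    distance_nm: separation r between first and second chromophore.
    forster_radius_nm: Förster radius R0 of the chromophore pair.
    """
    if distance_nm < 0 or forster_radius_nm <= 0:
        raise ValueError("distance must be >= 0 and Förster radius > 0")
    return 1.0 / (1.0 + (distance_nm / forster_radius_nm) ** 6)
```

Because of the sixth-power dependence, transfer is efficient only when the two chromophores sit within roughly one Förster radius of each other, which is why the spacing of the chromophores within a label matters for a FRET pair.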

For a set of luminescently labeled molecules (e.g., luminescently labeled nucleotides), the properties of a luminescently labeled FRET pair may allow for selection of a plurality of distinguishable molecules (e.g., nucleotides). In some embodiments, the second chromophore of a FRET pair has a luminescent lifetime distinct from a plurality of other luminescently labeled molecules. In some embodiments, the second chromophore of a FRET pair has a luminescent intensity distinct from a plurality of other luminescently labeled molecules. In some embodiments, the second chromophore of a FRET pair has a luminescent lifetime and luminescent intensity distinct from a plurality of other luminescently labeled molecules. In some embodiments, the second chromophore of a FRET pair emits photons in a spectral range distinct from a plurality of other luminescently labeled molecules. In some embodiments, the first chromophore of a FRET pair has a luminescent lifetime distinct from a plurality of luminescently labeled molecules. In certain embodiments, the FRET pair may absorb excitation energy in a spectral range distinct from a plurality of other luminescently labeled molecules. In certain embodiments, the FRET pair may absorb excitation energy in the same spectral range as one or more of a plurality of other luminescently labeled molecules.

In some embodiments, two or more nucleotides can be connected to a luminescent label, wherein the nucleotides are connected to distinct locations on the luminescent label. A nonlimiting example could include a luminescent molecule that contains two independent reactive chemical moieties (e.g., azido group, acetylene group, carboxyl group, amino group) that are compatible with a reactive moiety on a nucleotide analog. In such an embodiment, a luminescent label could be connected to two nucleotide molecules via independent linkages. In some embodiments, a luminescent label can comprise two or more independent connections to two or more nucleotides.

In some embodiments, two or more nucleotides can be connected to a luminescent dye via a linker (e.g., a branched linker or a linker with two or more reactive sites onto which nucleotides and/or dyes can be attached). Accordingly, in some embodiments, two or more nucleotides (e.g., of the same type) can be linked to two or more dyes (e.g., of the same type).

In some embodiments, a luminescent label can comprise a quantum dot with luminescent properties. In some embodiments, one or more nucleotides are connected to a quantum dot. In some embodiments, one or more nucleotides are connected to a quantum dot via connections to distinct sites of the quantum dot. In some embodiments, the surface of a quantum dot is coated with nucleotide molecules. In certain embodiments, a quantum dot is covalently connected to one or more nucleotides (e.g., via reactive moieties on each component). In certain embodiments, a quantum dot is non-covalently connected to one or more nucleotides (e.g., via compatible non-covalent binding partners on each component). In some embodiments, the surface of a quantum dot comprises one or more streptavidin molecules that are non-covalently bound to one or more biotinylated nucleotides.

In some embodiments, a luminescent label can comprise a protein with luminescent properties. In some embodiments, one or more nucleotides are connected to a luminescent protein. In some embodiments, one or more nucleotides are connected to a luminescent protein via connections to distinct sites of the protein. In certain embodiments, the luminescent labels of four nucleotides are selected such that one nucleotide is labeled with a fluorescent protein while the remaining three nucleotides are labeled with fluorescent dyes (e.g., the non-limiting examples in Tables 1 and 2). In certain embodiments, the luminescent labels of four nucleotides are selected such that two nucleotides are labeled with fluorescent proteins while the remaining two nucleotides are labeled with fluorescent dyes (e.g., the non-limiting examples in Tables 1 and 2). In certain embodiments, the luminescent labels of four nucleotides are selected such that three nucleotides are labeled with fluorescent proteins while the remaining nucleotide is labeled with a fluorescent dye (e.g., the non-limiting examples in Tables 1 and 2). In some embodiments, the luminescent labels of four nucleotides are selected such that all four nucleotides are labeled with fluorescent proteins.

According to some aspects of the application, luminescent labels (e.g., dyes, for example fluorophores) can damage polymerases in a sequencing reaction that is exposed to excitation light. In some aspects, this damage occurs during the incorporation of a luminescently labeled nucleotide, when the luminescent molecule is held in close proximity to the polymerase enzyme. Non-limiting examples of damaging reactions include the formation of a covalent bond between the polymerase and the luminescent molecule and the transfer of energy, by radiative or non-radiative decay, from the luminescent molecule to the enzyme. This can shorten the effective lifetime of the polymerase and reduce the length of a sequencing run.

In some embodiments, a nucleotide and a luminescent label are connected by a relatively long linker or linker configuration to keep the luminescent label away from the polymerase during incorporation of the labeled nucleotide. The term “linker configuration” is used herein to refer to the entire structure connecting the luminescent molecule(s) to the nucleotide(s) and does not encompass the luminescent molecule(s) or the nucleotide(s).

In some embodiments, a single linker connects a luminescent molecule to a nucleotide. In some embodiments, a linker contains one or more points of divergence so that two or more (e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10, or more) nucleotides are connected to each luminescent molecule, two or more (e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10, or more) luminescent molecules are connected to each nucleotide, or two or more (e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10, or more) nucleotides are connected to two or more (e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10, or more) luminescent molecules.

In some embodiments, the linker configuration determines the distance between the luminescent label and the nucleotide. In some embodiments, the distance is about 1 nm or 2 nm to about 20 nm, for example, more than 2 nm, more than 5 nm, 5-10 nm, more than 10 nm, 10-15 nm, more than 15 nm, 15-20 nm, or more than 20 nm. However, the distance between the luminescent label and the nucleotide cannot be too long since the luminescent label needs to be within the illumination volume to be excited when the nucleotide is held within the active site of the enzyme. Accordingly, in some embodiments, the overall linker length is less than 30 nm, less than 25 nm, around 20 nm, or less than 20 nm.

In some embodiments, a protecting molecule is included within a linker configuration. A protecting molecule can be a protein, protein homodimer, protein heterodimer, protein oligomer, a polymer, or other molecule that can protect the polymerase from the damaging reactions that can occur between the enzyme and the luminescent label. Non-limiting examples of protecting molecules include proteins (e.g., avidin, streptavidin, Traptavidin, NeutrAvidin, ubiquitin), protein complexes (e.g., Trypsin:BPTI, barnase:barstar, colicin E9 nuclease:Im9 immunity protein), nucleic acids (e.g., deoxyribonucleic acid, ribonucleic acid), polysaccharides, lipids, and carbon nanotubes. In some embodiments, the protecting molecule is an oligonucleotide (e.g., a DNA oligonucleotide, an RNA oligonucleotide, or a variant thereof).

In some embodiments, a protecting molecule is connected to one or more luminescent molecules and to one or more nucleotide molecules. In some embodiments, the luminescent molecule(s) are not adjacent to the nucleotide(s). For example, one or more luminescent molecules can be connected on a first side of the protecting molecule and one or more nucleotides can be connected to a second side of the protecting molecule, wherein the first and second sides of the protecting molecule are distant from each other. In some embodiments, they are on approximately opposite sides of the protecting molecule.

The distance between the point at which a protecting molecule is connected to a luminescent label and the point at which the protecting molecule is connected to a nucleotide can be a linear measurement through space or a non-linear measurement across the surface of the protecting molecule. The distance between the luminescent label and nucleotide connection points on a protecting molecule can be measured by modeling the three-dimensional structure of the protecting molecule. In some embodiments, this distance can be 2, 4, 6, 8, 10, 12, 14, 16, 18, 20 nm or more. Alternatively, the relative positions of the luminescent label and nucleotide on a protecting molecule can be described by treating the structure of the protecting molecule as a quadratic surface (e.g., ellipsoid, elliptic cylinder). In some embodiments, the luminescent label and the nucleotide are separated by a distance that is at least one eighth of the distance around an ellipsoidal shape representing the protecting molecule. In some embodiments, the luminescent label and the nucleotide are separated by a distance that is at least one quarter of the distance around an ellipsoidal shape representing the protecting molecule. In some embodiments, the luminescent label and the nucleotide are separated by a distance that is at least one third of the distance around an ellipsoidal shape representing the protecting molecule. In some embodiments, the luminescent label and the nucleotide are separated by a distance that is one half of the distance around an ellipsoidal shape representing the protecting molecule.
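The ellipsoid-based description above can be turned into a rough number. The following is a minimal sketch, assuming that "the distance around an ellipsoidal shape" is read as the perimeter of an elliptical cross-section through the two attachment points and that Ramanujan's approximation is accurate enough; the function names and the semi-axis values in the usage note are hypothetical and do not come from this disclosure.

```python
import math


def ellipse_perimeter(a_nm: float, b_nm: float) -> float:
    """Approximate perimeter (nm) of an ellipse with semi-axes a_nm and b_nm.

    Uses Ramanujan's first approximation, which is exact for a circle.
    """
    if a_nm <= 0 or b_nm <= 0:
        raise ValueError("semi-axes must be positive")
    return math.pi * (
        3 * (a_nm + b_nm) - math.sqrt((3 * a_nm + b_nm) * (a_nm + 3 * b_nm))
    )


def min_separation(a_nm: float, b_nm: float, fraction: float) -> float:
    """Separation (nm) corresponding to a fraction of the way around the ellipse.

    The fraction is capped at one half because two points on a closed curve
    are never farther apart than half the perimeter along the shorter path.
    """
    if not 0 < fraction <= 0.5:
        raise ValueError("fraction must be in (0, 0.5]")
    return fraction * ellipse_perimeter(a_nm, b_nm)
```

For a hypothetical protecting molecule modeled with semi-axes of 3 nm and 2 nm, `min_separation(3.0, 2.0, 0.25)` gives the separation for the one-quarter case, and `min_separation(3.0, 2.0, 0.5)` gives the opposite-sides case.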

Sample well

In certain embodiments, a method of detecting one or more luminescently labeled molecules is performed with the molecules confined in a target volume (e.g., a reaction volume). In some embodiments, the target volume is a region within a sample well (e.g., a nanoaperture). Embodiments of sample wells (e.g., nanoapertures) and the fabrication of sample wells (e.g., nanoapertures) are described elsewhere herein. In certain embodiments, the sample well (e.g., nanoaperture) comprises a bottom surface comprising a first material and sidewalls formed by a plurality of metal or metal oxide layers. In some embodiments, the first material is a transparent material or glass. In some embodiments, the bottom surface is flat. In some embodiments, the bottom surface is a curved well. In some embodiments, the bottom surface includes a portion of the sidewalls below the sidewalls formed by a plurality of metal or metal oxide layers. In some embodiments, the first material is fused silica or silicon dioxide. In some embodiments, the plurality of layers each comprise a metal (e.g., Al, Ti) or metal oxide (e.g., Al2O3, TiO2, TiN).

Polymerases

The term “polymerase,” as used herein, generally refers to any enzyme (or polymerizing enzyme) capable of catalyzing a polymerization reaction. Examples of polymerases include, without limitation, a nucleic acid polymerase, a transcriptase or a ligase. A polymerase can be a polymerization enzyme.

Embodiments directed towards single molecule nucleic acid extension (e.g., for nucleic acid sequencing) may use any polymerase that is capable of synthesizing a nucleic acid complementary to a target nucleic acid molecule. In some embodiments, a polymerase may be a DNA polymerase, an RNA polymerase, a reverse transcriptase, and/or a mutant or altered form of one or more thereof.

Examples of polymerases include, but are not limited to, a DNA polymerase, an RNA polymerase, a thermostable polymerase, a wild-type polymerase, a modified polymerase, E. coli DNA polymerase I, T7 DNA polymerase, bacteriophage T4 DNA polymerase, φ29 (phi29) DNA polymerase, Taq polymerase, Tth polymerase, Tli polymerase, Pfu polymerase, Pwo polymerase, VENT polymerase, DEEPVENT polymerase, EX-Taq polymerase, LA-Taq polymerase, Sso polymerase, Poc polymerase, Pab polymerase, Mth polymerase, ES4 polymerase, Tru polymerase, Tac polymerase, Tne polymerase, Tma polymerase, Tca polymerase, Tih polymerase, Tfi polymerase, Platinum Taq polymerases, Tbr polymerase, Tfl polymerase, Tth polymerase, Pfuturbo polymerase, Pyrobest polymerase, Pwo polymerase, KOD polymerase, Bst polymerase, Sac polymerase, Klenow fragment, polymerase with 3’ to 5’ exonuclease activity, and variants, modified products and derivatives thereof. In some embodiments, the polymerase is a single subunit polymerase. Non-limiting examples of DNA polymerases and their properties are described in detail in, among other places, DNA Replication 2nd edition, Kornberg and Baker, W. H. Freeman, New York, N.Y. (1991).

Upon base pairing between a nucleobase of a target nucleic acid and the complementary dNTP, the polymerase incorporates the dNTP into the newly synthesized nucleic acid strand by forming a phosphodiester bond between the 3’ hydroxyl end of the newly synthesized strand and the alpha phosphate of the dNTP. In examples in which the luminescent tag conjugated to the dNTP is a fluorophore, its presence is signaled by excitation and a pulse of emission is detected during or after the step of incorporation. For detection labels that are conjugated to the terminal (gamma) phosphate of the dNTP, incorporation of the dNTP into the newly synthesized strand results in release of the beta and gamma phosphates and the detection label, which is free to diffuse in the sample well, resulting in a decrease in emission detected from the fluorophore.

In some embodiments, the polymerase is a polymerase with high processivity. However, in some embodiments, the polymerase is a polymerase with reduced processivity. Polymerase processivity generally refers to the capability of a polymerase to consecutively incorporate dNTPs into a nucleic acid template without releasing the nucleic acid template.

In some embodiments, the polymerase is a polymerase with low 5'-3' exonuclease activity and/or 3'-5' exonuclease activity. In some embodiments, the polymerase is modified (e.g., by amino acid substitution) to have reduced 5'-3' exonuclease activity and/or 3'-5' exonuclease activity relative to a corresponding wild-type polymerase. Further non-limiting examples of DNA polymerases include 9°Nm™ DNA polymerase (New England Biolabs), and a P680G mutant of the Klenow exo- polymerase (Tuske et al. (2000) JBC 275(31):23759-23768). In some embodiments, a polymerase having reduced processivity provides increased accuracy for sequencing templates containing one or more stretches of nucleotide repeats (e.g., two or more sequential bases of the same type).

In some embodiments, the polymerase is a polymerase that has a higher affinity for a labeled nucleotide than for a non-labeled nucleic acid.

Embodiments directed toward single molecule RNA extension (e.g., for RNA sequencing) may use any reverse transcriptase that is capable of synthesizing complementary DNA (cDNA) from an RNA template. In such embodiments, a reverse transcriptase can function in a manner similar to polymerase in that cDNA can be synthesized from an RNA template via the incorporation of dNTPs to a reverse transcription primer annealed to an RNA template. The cDNA can then participate in a sequencing reaction and its sequence determined as described above and elsewhere herein. The determined sequence of the cDNA can then be used, via sequence complementarity, to determine the sequence of the original RNA template. Examples of reverse transcriptases include Moloney Murine Leukemia Virus reverse transcriptase (M-MLV), avian myeloblastosis virus (AMV) reverse transcriptase, human immunodeficiency virus reverse transcriptase (HIV-1) and telomerase reverse transcriptase.
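The complementarity step described above can be sketched in a few lines. This is an illustrative sketch only, assuming the cDNA read is an error-free string over A, C, G, and T written 5' to 3'; the helper name `rna_template_from_cdna` is hypothetical.

```python
# Hypothetical helper: infer the RNA template from a sequenced cDNA strand.
# Each DNA base maps to the RNA base it pairs with (A pairs with U in RNA).
DNA_TO_RNA_COMPLEMENT = {"A": "U", "T": "A", "G": "C", "C": "G"}


def rna_template_from_cdna(cdna_5_to_3: str) -> str:
    """Return the RNA template (5'->3') complementary to a cDNA read (5'->3').

    The strands are antiparallel, so the cDNA is read in reverse while
    each base is replaced by its RNA complement.
    """
    seq = cdna_5_to_3.upper()
    try:
        return "".join(DNA_TO_RNA_COMPLEMENT[base] for base in reversed(seq))
    except KeyError as exc:
        raise ValueError(f"unexpected base in cDNA: {exc.args[0]!r}") from None
```

For example, a cDNA read of `ATGC` corresponds to an RNA template of `GCAU`.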

The processivity, exonuclease activity, relative affinity for different types of nucleic acid, or other property of a nucleic acid polymerase can be increased or decreased by one of skill in the art by mutation or other modification relative to a corresponding wild-type polymerase.

Templates

The present disclosure provides devices, systems and methods for detecting biomolecules or subunits thereof, such as nucleic acid molecules. Such detection can include sequencing. A biomolecule may be extracted from a biological sample obtained from a subject (e.g., a human or other subject). In some embodiments, the subject may be a patient. In some embodiments, a target nucleic acid may be detected and/or sequenced for diagnostic, prognostic, and/or therapeutic purposes. In some embodiments, information for a sequencing assay may be useful to assist in the diagnosis, prognosis, and/or treatment of a disease or condition. In some embodiments, the subject may be suspected of having a health condition, such as a disease (e.g., cancer). In some embodiments, the subject may be undergoing treatment for a disease.

In some embodiments, a biological sample may be extracted from a bodily fluid or tissue of a subject, such as breath, saliva, urine, blood (e.g., whole blood or plasma), stool, or other bodily fluid or biopsy sample. In some examples, one or more nucleic acid molecules are extracted from the bodily fluid or tissue of the subject. The one or more nucleic acids may be extracted from one or more cells obtained from the subject, such as part of a tissue of the subject, or obtained from a cell-free bodily fluid of the subject, such as whole blood.

A biological sample may be processed in preparation for detection (e.g., sequencing). Such processing can include isolation and/or purification of the biomolecule (e.g., nucleic acid molecule) from the biological sample, and generation of more copies of the biomolecule. In some examples, one or more nucleic acid molecules are isolated and purified from a bodily fluid or tissue of the subject, and amplified through nucleic acid amplification, such as polymerase chain reaction (PCR). Then, the one or more nucleic acid molecules or subunits thereof can be identified, such as through sequencing. However, in some embodiments nucleic acid samples can be evaluated (e.g., sequenced) as described in this application without requiring amplification.

As described in this application, sequencing can include the determination of individual subunits of a template biomolecule (e.g., nucleic acid molecule) by synthesizing another biomolecule that is complementary or analogous to the template, such as by synthesizing a nucleic acid molecule that is complementary to a template nucleic acid molecule and identifying the incorporation of nucleotides with time (e.g., sequencing by synthesis). As an alternative, sequencing can include the direct identification of individual subunits of the biomolecule.

During sequencing, signals indicative of individual subunits of a biomolecule may be collected in memory and processed in real time or at a later point in time to determine a sequence of the biomolecule. Such processing can include a comparison of the signals to reference signals that enable the identification of the individual subunits, which in some cases yields reads. Reads may be sequences of sufficient length (e.g., at least about 30, 50, 100 base pairs (bp) or more) that can be used to identify a larger sequence or region, e.g., that can be aligned to a location on a chromosome or genomic region or gene.
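One simple way to picture the comparison to reference signals is a nearest-reference assignment. The sketch below is an assumption-laden illustration, not the method of this disclosure: it assumes each signal is summarized as a pair of numbers (for instance, an intensity and a second luminescent property), and the reference values, the function names, and the use of Euclidean distance are all hypothetical placeholders.

```python
import math

# Placeholder reference signatures in arbitrary units. These are NOT values
# from the source; they only make the example runnable.
REFERENCE_SIGNALS = {
    "A": (1.0, 1.0),
    "C": (1.0, 3.0),
    "G": (3.0, 1.0),
    "T": (3.0, 3.0),
}


def call_subunit(signal, references=REFERENCE_SIGNALS):
    """Return the subunit whose reference signature is closest to the signal."""
    return min(references, key=lambda base: math.dist(signal, references[base]))


def call_read(signals, references=REFERENCE_SIGNALS):
    """Assign each signal in order to its nearest reference and join into a read."""
    return "".join(call_subunit(s, references) for s in signals)
```

With these placeholder references, `call_read([(0.9, 1.2), (2.8, 3.1), (1.1, 2.7)])` yields the read `ATC`.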

Sequence reads can be used to reconstruct a longer region of a genome of a subject (e.g., by alignment). Reads can be used to reconstruct chromosomal regions, whole chromosomes, or the whole genome. Sequence reads or a larger sequence generated from such reads can be used to analyze a genome of a subject, such as to identify variants or polymorphisms. Examples of variants include, but are not limited to, single nucleotide polymorphisms (SNPs) including tandem SNPs, small-scale multi-base deletions or insertions, also referred to as indels or deletion insertion polymorphisms (DIPs), Multi-Nucleotide Polymorphisms (MNPs), Short Tandem Repeats (STRs), deletions, including microdeletions, insertions, including microinsertions, structural variations, including duplications, inversions, translocations, multiplications, complex multi-site variants, copy number variations (CNV). Genomic sequences can comprise combinations of variants. For example, genomic sequences can encompass the combination of one or more SNPs and one or more CNVs.

The term “genome” generally refers to an entirety of an organism’s hereditary information. A genome can be encoded either in DNA or in RNA. A genome can comprise coding regions that code for proteins as well as non-coding regions. A genome can include the sequence of all chromosomes together in an organism. For example, the human genome has a total of 46 chromosomes. The sequence of all of these together constitutes the human genome. In some embodiments, the sequence of an entire genome is determined. However, in some embodiments, sequence information for a subset of a genome (e.g., one or a few chromosomes, or regions thereof) or for one or a few genes (or fragments thereof) is sufficient for diagnostic, prognostic, and/or therapeutic applications.

Nucleic acid sequencing of a plurality of single-stranded target nucleic acid templates may be completed where multiple sample wells (e.g., nanoapertures) are available, as is the case in devices described elsewhere herein. Each sample well can be provided with a single-stranded target nucleic acid template and a sequencing reaction can be completed in each sample well. Each of the sample wells may be contacted with the appropriate reagents (e.g., dNTPs, sequencing primers, polymerase, co-factors, appropriate buffers, etc.) necessary for nucleic acid synthesis during a primer extension reaction and the sequencing reaction can proceed in each sample well. In some embodiments, the multiple sample wells are contacted with all appropriate dNTPs simultaneously. In other embodiments, the multiple sample wells are contacted with each appropriate dNTP separately and each washed in between contact with different dNTPs. Incorporated dNTPs can be detected in each sample well and a sequence determined for the single-stranded target nucleic acid in each sample well as is described elsewhere herein.

While some embodiments may be directed to diagnostic testing by detecting single molecules in a specimen, the inventors have also recognized that the single molecule detection capabilities of the present disclosure may be used to perform nucleic acid (e.g., DNA, RNA) sequencing of one or more nucleic acid segments of, for example, genes.

In some aspects, methods described herein can be performed using one or more devices or apparatuses described in more detail below.

EXAMPLES

Example 1. Design and optimization of ultrabright DNA nanostructures

This Example relates to fabricating 2-dimensional-DNA minimal nanostructures (wireframe triangles and squares) that are cost-effective, compact in size, and ultrabright due to their ability to carry multiple fluorescently-labeled strands. The structures are based on a polygon strategy that allows the structures to possess multiple single-stranded spots. These spots are used to functionalize the structures with fluorescently labeled probe strands, as well as nucleotide-carrying strands. The nucleotide strands are incorporated into a growing DNA chain being synthesized in the pores where sequencing-by-synthesis is being carried out. When a base is incorporated, the unique fluorescent signature of that base is signaled by the fluorescent DNA structure connected to it. Design, assembly, optimization, and validation of the structures as monodispersed and bright probes for nucleotide incorporation are described herein.

Design

To produce triangles and squares from modular strands, a design was developed in which strands were imagined as sides/arms to the polygons (FIG. 1). These strands had 5’ and 3’ regions that were 15 bp long and clipped onto other strands with complementary sequences on their extremities. On the 5’ end of each strand, there was a dithymine (TT) spacer followed by a complementary sequence that hybridized to a nucleotide-carrying strand (nuc). The middle part of these strands was encoded with the sequence that hybridized to the strand that carried multiple luminescent labels (e.g., 3 cyanine dyes). Three arms coming together yielded a triangle nanostructure with 3 nuc and 9 dyes called T, and 4 arms coming together yielded a square nanostructure with 4 nuc and 12 dyes called S1 (FIG. 2).
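The payload counts quoted above follow directly from the modular design: each arm binds one nucleotide-carrying strand and one dye strand. A minimal sketch of that bookkeeping, assuming 3 dyes per dye strand as in the example given; the function name `polygon_payload` is hypothetical.

```python
def polygon_payload(n_arms: int, dyes_per_dye_strand: int = 3) -> dict:
    """Count nucleotide-carrying strands and dyes on a wireframe polygon.

    Assumes every arm binds exactly one nucleotide-carrying strand and one
    dye strand, as in the triangle and square designs described in the text.
    """
    if n_arms < 3:
        raise ValueError("a closed polygon needs at least 3 arms")
    return {"nuc_strands": n_arms, "dyes": n_arms * dyes_per_dye_strand}


triangle = polygon_payload(3)  # 3 nuc strands, 9 dyes
square = polygon_payload(4)    # 4 nuc strands, 12 dyes
```

The same function shows how the dye count would scale if a larger polygon or a different number of dyes per strand were used, although only the triangle and square are described here.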

Assembly optimization and sequencing with triangle

First, the assembly of the triangle was tested and the conditions for assembly were optimized. The triangle that binds to 3 strands that simulate the dye strands but lack the dye, called “D” strands, was assembled first. To test the assembly, a sequential ladder assembly was conducted in which one strand is added after the other to form the structure as previously described. This was also tested in a gel to determine whether a fully assembled structure can be produced without annealing compared to a standard 2-hour anneal, as well as checking the intactness of the structures following 3-day storage post annealing (FIG. 1). The gel showed that the assembly of the triangle was more efficient with annealing (FIG. 1) and was not impacted following a 3-day storage. In addition, since all 3 triangle strands shared the same sequence that binds D, when 1 equivalent of D (lane 4) was added, a range of 1-bound to 3-bound was achieved, and 3 equivalents formed the final fully double stranded triangle product (lane 5).

This result indicated that the desired triangular structure was formed, better assembled when annealed, and the cavity can be loaded with the target complementary sequence in all 3 binding sites.

The addition of Nuc strands, which represent strands that would hold the nucleotide but lack it, was tested next. The data showed that the triangle decreased in mobility with increasing equivalents of the Nuc strand, reaching the final product of 3 Nuc strands on the triangle in lane 4 (FIG. 3).

Following this, the assembly was tested with strands containing the luminescent labels. Two strands with two different labels were used: Q3 had 3 C530 dyes that were external dyes conjugated post synthesis to the strand (external, not part of the backbone), and Q4 had 3 Cy3 dyes that were conjugated during synthesis as internal dyes/amidites (FIG. 4). In this assembly test, 3 equivalents of either dye strand were added to the triangle, either at room temperature or annealed with the triangle strands (lanes 1, 2, 3 for Q3 and lanes 4, 5, 6 for Q4) and showed that annealing the dye strand with the triangle was more efficient in incorporation. Assembly with Q3 was qualitatively slightly better than that with Q4, which may have been due to the dyes not being part of the sequence itself and not hindering hybridization (FIG. 5, left). The annealing method was used in all additional experiments.

The triangles with 3 equivalents of the internal strands were also assembled but were varied in the number of fluorophores present. Triangles with 1 equivalent Q3/Q4 and 2 equivalents D (3 dyes) (lanes 8, 9) were also assembled, and their fluorescence signal was compared to that with 3 equivalents of Q3 (9 dyes) triangles (lane 3, 6). In both triangles loaded with 3 equivalents of the probe strand, around 2.4x increase in fluorescent output was observed, indicating that as part of the triangle compact structure increased fluorescent signal was achieved with a small amount of quenching as expected due to proximity (FIG. 5, right).
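As a rough reading of these numbers, tripling the dye count (3 to 9 dyes) while observing about a 2.4x signal increase implies that each dye in the fully loaded triangle emits at roughly 80% of its output in the singly loaded triangle. The short sketch below only restates that arithmetic; it assumes signal scales linearly with dye count in the absence of quenching, and the function name is hypothetical.

```python
def relative_per_dye_output(fold_signal: float, fold_dyes: float) -> float:
    """Per-dye output of the brighter construct relative to the reference.

    A value of 1.0 would mean no additional quenching; values below 1.0
    indicate an apparent per-dye loss, assuming linear scaling otherwise.
    """
    if fold_signal <= 0 or fold_dyes <= 0:
        raise ValueError("fold changes must be positive")
    return fold_signal / fold_dyes


# 9 dyes vs 3 dyes is a 3x increase in dye count; observed signal was ~2.4x.
per_dye = relative_per_dye_output(fold_signal=2.4, fold_dyes=9 / 3)
apparent_loss = 1.0 - per_dye  # roughly 0.2, i.e. about 20% apparent quenching
```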

Functionalizing the fluorescently labeled structures with the nucleotide carrying strands was addressed next. dC carried a cytosine nucleotide and dG carried a guanosine nucleotide (FIG. 6). These strands were added at room temperature to avoid any potential degradation that can result from annealing.

The assembly test below shows the decreasing mobility of the fluorescently labeled triangle as the dC strand was added to it. However, excess copies of the dC strand were needed to push the assembly of the triangle from carrying 1 or 2 dC strands to a 3 dC carrying triangle. The amount sufficient to fully assemble the triangle with 3 nucleotide carrying strands was in between a 2:1 and 3:1 strand:triangle ratio (after adding 9 copies, not much change was observed on the gel, lane 5). Without wishing to be bound by a particular theory, this could have been due to steric hindrance around the triangle with these relatively large moieties, and adding excess of the nucleotide carrying strands may require purification of the sample from any excess to avoid interference with the sequencing data. Having leftover nucleotides in the sample that are not coupled to a fluorescent probe to signal their incorporation may lead to an incorporation that is not reported and hence shows up as a deletion in the sequencing run.

This is what was observed. There was successful incorporation of the structure and roughly a 3x increase in fluorescent intensity in comparison to a control that carries 2 fluorescent dyes. However, there was also a slightly increased deletion rate in the incorporation of dC when tagged to a Q3 labeled triangle, and of dG when tagged to a Q4 labeled triangle.

From this study, it was concluded that annealing the dye labeled strands with the triangle assembly strands and excess nucleotide carrying strands at room temperature yielded a high yield monodispersed triangle. Any excess non-assembled strands should be removed as well to eliminate their interference with the sequencing process.

Square 1 (S1) assembly

Many of the assembly conditions developed from the triangle studies, where the dye strands were annealed with the assembly strands, and the nucleotide carrying strands were added in excess (a little more than 2:1 ratio), were applied to the square. To test this, the following tests on assembling the square by itself, with dye strands, and with nucleotide strands were performed (FIG. 8). Excess dye strands (more than 4 copies) were beneficial for the square, as 1:1 dye strands:square (lanes 2, 8) showed a secondary faster mobility band right below the fully assembled product obtained when excess dye strands were added (lanes 3, 9). As for the nucleotide strand addition to the square, a secondary band was noticed below the final desired product band (lanes 4, 9), which got fainter with excess nucleotide strands (lanes 5, 6, 10, 11) (FIG. 8). Annealing the nucleotide strand with the structure was also tested and did not have a significant impact on the assembly (lanes 7 and 12) (FIG. 8). This indicates that even for a larger structure (i.e., a square compared to a triangle), it was still beneficial to have an excess of the functionalizing strands due to the steric hindrance arising from the probes they carry (nucleotide or dye). Nonetheless, this structure added 12 dyes, 3 more than the triangle, which gave an even higher intensity in sequencing.

Eliminating excess strands by centrifugation filtration

To address the issue of leftover strands that are a result of adding excess amounts to push the assembly to completion, Amicon centrifugation filtration columns with a 50 kDa cut-off size were used. This retained the S1 structure, while the smaller molecular weight strands, such as the nucleotide and dye-labeled strands, passed through. In addition, this method allowed the samples to be concentrated. For this, the samples were loaded not only in a 6% native polyacrylamide gel, but also in a 10% gel, to be able to visualize the shorter strands and see if they were retained or removed following this purification strategy.

As evidenced by the data (FIG. 9, left), the square S1 exhibited improved assembly when there were excess dye strands (lanes 5, 10) compared to equimolar dye strands (lane 2, 7), in both cases of using Q3 and Q4 labeled strands. Following one round of centrifugation, the squares retained their structure as there was no evident deformation of the band even when they were 5 times more concentrated (lanes 3, 6, and 12 in 6% native gel, FIG. 9, left). The 10% native gel (FIG. 9, right) showed that the excess strands indicated by the arrows (blue, orange, and yellow) disappeared following filtration (lanes 3, 3’, 6, 12, 12’). Only in the case of square with Q4 following re-dilution (6’) was there an appearance of the labeled strand as excess (red arrow), which could indicate that it was losing fluorescent strands after re-dilution as lane 6 did not show any excess.

When these samples were tested for sequencing, the filtration procedure decreased the deletion rate observed, confirming the hypothesis that the release of the nucleotide containing strand from the labeled structure was causing the high deletion rate due to incorporation without signaling. Nonetheless, there were still considerable deletion events compared to the control probes.

Design of Square 2 (S2) and Square 3 (S3) for lower deletion rates

To try to mitigate the deletion issue, modifications to the square design were introduced. The deletion rate was likely high due to the need for excess nucleotide carrying strands to fully functionalize the square. To address that, two new squares were designed: S2 and S3. S2 had only 1 arm that bound a nucleotide carrying strand, compared to 4 in design S1. S3 had the nucleotide strand covalently bound to one of the arms that form the structure (i.e., structural strands). The S2 strategy may eliminate the need for excess nucleotide carrying strands as there is no longer crowding from 4 strands. As for S3, that may eliminate the chance of any separation between the probe and nucleotide strand as they are now covalently linked together.

The assembly of S2 was then tested under optimal conditions (6x dye strand annealed, 2x nucleotide strand at room temperature) and filtered by centrifugation filtration. When this sample was submitted to sequencing, it had a lower deletion rate than S1.

Experimental methods related to Example 1

Oligonucleotide synthesis, purification, and assembly

Oligonucleotide synthesis was performed on a 1 µmol scale, starting from a universal 1000 Å LCAA-CPG solid-support, on a MerMade 6/12 synthesizer. Strands were deprotected and cleaved with a 28% aqueous ammonium hydroxide solution for 20 hours at 60°C, followed by drying under vacuum at 60°C, and resuspended in Millipore H2O. Sequences were purified on polyacrylamide/8M urea gel electrophoresis at a constant current of 30 mA for 1.5 hours (30 min at 250 V followed by 1 hour at 500 V), using 1x TBE buffer (90 mM Tris, 90 mM boric acid, 2 mM EDTA, and the pH was adjusted to 8.0). Following electrophoresis, the plates were wrapped in plastic and placed on a fluorescent TLC plate and illuminated with a UV lamp (254 nm). The bands were quickly excised, and the gel pieces were crushed and incubated in 12 mL of sterile water at 60°C for 12-16 hours. Samples were then dried to 1.0 mL, desalted using size exclusion chromatography (Sephadex G-25), and carefully quantified using UV-Vis spectroscopy (260 nm). Samples were loaded on a 12% polyacrylamide gel in 1x TBE buffer (30 min, 250 V, 1 hour 500 V) and stained with GelRed™. Some strands were ordered from IDT without purification, followed by purification as mentioned above.

To assemble structures, samples were mixed at the desired concentration in Tris-acetic acid-magnesium buffer (1x TAMg buffer, 45 mM Tris, 20 mM acetic acid and 7.6 mM MgCl2, pH was adjusted to about 8.0 using glacial acetic acid), diluted with milliQ water, and annealed from 95 to 5 °C over 2 hours using a Bio-Rad T100™ thermal cycler.
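For orientation, the annealing step spans 90 °C in 120 minutes. The sketch below assumes a linear ramp, which the text does not state (a thermal cycler program could equally use discrete steps); the function names are hypothetical.

```python
def ramp_rate_c_per_min(start_c: float = 95.0, end_c: float = 5.0,
                        duration_min: float = 120.0) -> float:
    """Average ramp rate in degrees C per minute (negative when cooling)."""
    if duration_min <= 0:
        raise ValueError("duration must be positive")
    return (end_c - start_c) / duration_min


def temperature_at(t_min: float, start_c: float = 95.0, end_c: float = 5.0,
                   duration_min: float = 120.0) -> float:
    """Block temperature t_min minutes into the anneal, ASSUMING a linear ramp."""
    if not 0 <= t_min <= duration_min:
        raise ValueError("time is outside the annealing window")
    return start_c + ramp_rate_c_per_min(start_c, end_c, duration_min) * t_min


rate = ramp_rate_c_per_min()      # -0.75 degrees C per minute on average
midpoint = temperature_at(60.0)   # 50.0 degrees C halfway through a linear ramp
```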

Concentration of samples and centrifugation filtration were done with Amicon® Ultra Centrifugal Filter Units (MilliporeSigma), where samples were centrifuged at 21G for 8 min, always washing with 1x TAMg.

Nucleic acid sequences used in this Example, including sequences of the structural strands of triangle T1, square S1, and square S2, and sequences of the D (dye-binding) and Nuc functionalizing strands, are shown in FIG. 13.

Characterization of nanostructures by gel electrophoresis

Assembled structures were loaded on native polyacrylamide gel (PAGE) (6-10%) in 1x TAMg at a constant voltage of 130 V for 1.5 hours. Roughly 2 pmoles of strands were run on the gel, mixed with loading dye and buffer.

Fluorescence spectroscopy

Fluorescence measurements were performed on a SpectraMax i3x Multi-Mode plate reader. Samples were diluted 25-fold from the 1 µM assembly in 1x TAMg buffer to a final volume of 50 µL. Measurements were performed at 25 °C (room temperature) in a 384-well plate.
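The dilution above works out to 1 part assembled sample in 25 parts total. A minimal sketch of the volumes, assuming only the 25-fold factor and the final volume of 50 stated above; volumes and concentrations are kept unit-agnostic (results are in the same units as the inputs), and the function name is hypothetical.

```python
def dilution_plan(final_volume: float, fold: float, stock_conc: float = 1.0) -> dict:
    """Volumes for a simple fold dilution.

    Returns the sample and buffer volumes (same unit as final_volume) and the
    final concentration (same unit as stock_conc).
    """
    if final_volume <= 0 or fold < 1:
        raise ValueError("final_volume must be positive and fold at least 1")
    sample = final_volume / fold
    return {
        "sample_volume": sample,
        "buffer_volume": final_volume - sample,
        "final_conc": stock_conc / fold,
    }


plan = dilution_plan(final_volume=50.0, fold=25.0)
# 2 parts sample + 48 parts buffer; final concentration is 1/25 of the stock.
```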

EQUIVALENTS

Various aspects of the present application may be used alone, in combination, or in a variety of arrangements not specifically discussed in the embodiments described in the foregoing and are therefore not limited in their application to the details and arrangement of components set forth in the foregoing description or illustrated in the drawings. For example, aspects described in one embodiment may be combined in any manner with aspects described in other embodiments.

Also, the invention may be embodied as a method, of which at least one example has been provided. The acts performed as part of the method may be ordered in any suitable way. Accordingly, embodiments may be constructed in which acts are performed in an order different than illustrated, which may include performing some acts simultaneously, even though shown as sequential acts in illustrative embodiments.

Use of ordinal terms such as “first,” “second,” “third,” etc., in the claims to modify a claim element does not by itself connote any priority, precedence, or order of one claim element over another or the temporal order in which acts of a method are performed, but are used merely as labels to distinguish one claim element having a certain name from another element having a same name (but for use of the ordinal term) to distinguish the claim elements.

Also, the phraseology and terminology used herein is for the purpose of description and should not be regarded as limiting. The use of "including," "comprising," or "having," “containing,” “involving,” and variations thereof herein, is meant to encompass the items listed thereafter and equivalents thereof as well as additional items.