Title:
SYSTEMS AND METHODS FOR NANOPORE-BASED ANALYSIS OF NUCLEIC ACIDS
Document Type and Number:
WIPO Patent Application WO/2015/051378
Kind Code:
A1
Abstract:
The present disclosure generally relates to the field of nanopore-based analysis of polymers, such as RNA and DNA polymers. In one aspect, the disclosure provides methods of analyzing a nucleic acid in a nanopore-based system that include application of nonconstant electrical potentials as the nucleic acid translocates through the nanopore. The methods facilitate the realization of informative continuous current profiles to ascertain characteristics of the nucleic acid analyte. In another aspect, the disclosure provides systems and methods for reducing the Brownian motion of nucleic acid analytes in nanopores to provide for enhanced resolution of nanopore-based analyses.

Inventors:
GUNDLACH JENS (US)
DERRINGTON IAN M (US)
MANRAO ELIZABETH A (US)
LANGFORD KYLE W (US)
LASZLO ANDREW (US)
Application Number:
PCT/US2014/059360
Publication Date:
April 09, 2015
Filing Date:
October 06, 2014
Assignee:
UNIV WASHINGTON CT COMMERCIALI (US)
International Classes:
G01N27/26
Domestic Patent References:
WO2010086603A1 2010-08-05
WO2013016486A1 2013-01-31
Foreign References:
US20130146457A1 2013-06-13
US7625706B2 2009-12-01
US20120055792A1 2012-03-08
Attorney, Agent or Firm:
NOWAK, Thomas Stasiu (1201 Third Avenue Suite 360, Seattle WA, US)
Claims:
CLAIMS

The embodiments of the invention in which an exclusive property or privilege is claimed are defined as follows:

1. A method for analyzing a nucleic acid, comprising:

translocating the nucleic acid through a nanopore from a first conductive liquid medium to a second conductive liquid medium, wherein the nanopore comprises a constriction zone, and wherein the nanopore is disposed in a membrane and provides liquid communication between the first conductive liquid medium and the second conductive liquid medium;

applying a non-constant electrical potential between the first conductive liquid medium and the second conductive liquid medium as the nucleic acid translocates through the nanopore;

measuring a plurality of current signals between the first conductive liquid medium and the second conductive liquid medium as the nucleic acid translocates through the nanopore;

identifying a subset of the plurality of measured current signals associated with a single translocation step of the nucleic acid;

deriving a current-potential curve from the subset of current signals to provide a current pattern; and

determining a characteristic of the nucleic acid based on the current pattern.

2. The method of Claim 1, wherein the characteristic of the nucleic acid is the identity of one or more nucleotide residues of the nucleic acid.

3. The method of Claim 2, wherein the current-potential curve is indicative of the identity of the one or more nucleotide residues in the nucleic acid.

4. The method of Claim 1, further comprising applying a compensatory electrical potential to substantially offset any current resulting from capacitance of the membrane, nucleic acid, or nanopore.

5. The method of Claim 1, further comprising correcting the plurality of measured current signals, or the subset thereof, for any current not associated with passage of ions through the nanopore.

6. The method of Claim 5, wherein the current not associated with the passage of ions through the nanopore is current associated with the capacitance of the membrane, pore, analyte, and the like.

7. The method of Claim 1, further comprising converting the current-potential curve into a conductance-voltage relationship at a given voltage.

8. The method of Claim 7, further comprising correcting the measured current signals in the subset for any non-linear ion current-voltage relationship.

9. The method of Claim 1, further comprising converting the current-potential curve into a current-nucleic acid distance curve to provide a current pattern corresponding to a segment of the nucleic acid residing in the constriction zone of the nanopore associated with the single translocation step.

10. The method of Claim 9, wherein the converting step comprises modeling the nucleic acid as a spring with a linear restoring force.

11. The method of Claim 9, wherein the current-nucleic acid distance curve of the subset is indicative of the identity of one or more nucleotide residues in the segment.

12. The method of Claim 9, further comprising repeating the steps recited in Claims 1 and 9, and optionally in any of Claims 4, 5 and 7, for one or more additional subsets associated with one or more additional sequential single translocation steps to provide an aggregation of multiple current patterns representing the current through the nanopore as the nucleic acid translocates continuously through the nanopore.

13. The method of Claim 12, wherein the aggregation of multiple current patterns is indicative of the identity of one or more nucleotide residues in multiple overlapping segments of the nucleic acid.

14. The method of Claim 13, further comprising comparing one or more of the multiple current patterns to current patterns from reference nucleic acids with known correlations between current patterns and sequence.

15. The method of Claim 14, further comprising determining the sequence of a portion of the nucleic acid corresponding to the overlapping segments.

16. The method of Claim 12, further comprising detecting one or more discontinuities in the aggregation of multiple current patterns.

17. The method of Claim 16, further comprising identifying the one or more discontinuities as a forward skip or backstep associated with an aberrant translocation movement of the nucleic acid.

18. The method of Claim 16, further comprising correcting the discontinuity.

19. The method of Claim 1, wherein translocating the nucleic acid through the nanopore from the first conductive liquid medium to the second conductive liquid medium comprises using a molecular brake to regulate a rate of translocation for the nucleic acid in one or more discrete steps.

20. The method of Claim 19, wherein the molecular brake is a molecular motor.

21. The method of Claim 19, wherein the molecular brake is a translocase, a polymerase, a helicase, an exonuclease, or a topoisomerase.

22. The method of Claim 21, wherein the polymerase is phi29 DNA polymerase, Klenow fragment, or a variant or homolog thereof.

23. The method of Claim 21, wherein the helicase is a Hel308 helicase, a RecD helicase, a TraI helicase, a TraI subgroup helicase, an XPD helicase, or a variant or homolog thereof.

24. The method of Claim 21, wherein the exonuclease is exonuclease I, exonuclease III, lambda exonuclease, or a variant or homolog thereof.

25. The method of Claim 21, wherein the topoisomerase is a gyrase or a variant or homolog thereof.

26. The method of Claim 1, wherein the non-constant electrical potential has a periodic function or a random non-periodic function.

27. The method of Claim 26, wherein the non-constant electrical potential is superimposed onto a constant potential that is between 10 mV and 1 V.

28. The method of Claim 26, wherein the non-constant electrical potential is superimposed over a constant electrical potential that is between 10 mV and 300 mV.

29. The method of Claim 26, wherein the periodic function of the non-constant electrical potential has a minimum to maximum difference of less than 250 mV.

30. The method of Claim 26, wherein the periodic function of the non-constant electrical potential has a period of between 0.1 ms and 1 s.

31. The method of Claim 26, wherein the periodic function is triangular, sawtooth, square, sinusoidal, a custom function, or linear combinations thereof.

32. The method of Claim 26, wherein the periodic function is optimized to promote a substantially constant rate of nucleic acid movement within the nanopore during a period of increasing or decreasing potential in the periodic function.

33. The method of Claim 1, wherein the nanopore is a solid-state nanopore, protein nanopore, a hybrid solid state-protein nanopore, a biologically adapted solid-state nanopore, or a DNA origami nanopore.

34. The method of Claim 33, wherein the protein nanopore is alpha-hemolysin, leukocidin, Mycobacterium smegmatis porin A (MspA), outer membrane porin F (OmpF), outer membrane porin G (OmpG), outer membrane phospholipase A, Neisseria autotransporter lipoprotein (NalP), WZA, lysenin or a homolog or variant thereof.

35. The method of Claim 34, wherein the protein nanopore sequence is modified to contain at least one amino acid substitution, deletion, or addition.

36. The method of Claim 35, wherein the at least one amino acid substitution, deletion, or addition results in a net charge change in the nanopore.

37. The method of Claim 33, wherein the protein nanopore has a constriction zone with a non-negative charge.

38. A nanopore system, comprising:

a membrane separating a first conductive liquid medium and a second conductive liquid medium;

a nanopore comprising a constriction zone and a vestibule that together define a tunnel;

wherein the nanopore is disposed in the membrane to provide liquid communication between the first conductive liquid medium and the second conductive liquid medium through the tunnel;

wherein the constriction zone is more proximate to the first conductive liquid medium and the vestibule is more proximate to the second conductive liquid medium; wherein the system is operative to translocate a nucleic acid from the first conductive liquid medium to the second conductive liquid medium.

39. The system of Claim 38, wherein the membrane is a lipid bilayer.

40. The system of Claim 38, wherein the membrane comprises a block copolymer.

41. The system of Claim 39, wherein the nanopore is a solid-state nanopore, protein nanopore, a hybrid solid state-protein nanopore, a biologically adapted solid-state nanopore, a graphene nanopore, or a DNA origami nanopore.

42. The system of Claim 41, wherein the protein nanopore is an alpha-hemolysin, leukocidin, Mycobacterium smegmatis porin A (MspA), outer membrane porin F (OmpF), outer membrane porin G (OmpG), outer membrane phospholipase A, Neisseria autotransporter lipoprotein (NalP), WZA, lysenin or a homolog or variant thereof.

43. The system of Claim 41, wherein the protein nanopore sequence is modified to contain at least one amino acid substitution, deletion, or addition.

44. The system of Claim 43, wherein the at least one amino acid substitution, deletion, or addition results in a net charge change in the nanopore.

45. The system of Claim 41, wherein the protein nanopore has a constriction zone with a non-negative charge.

46. A method for analyzing a nucleic acid in the nanopore system of Claim 38, comprising:

translocating the nucleic acid through the nanopore from the first conductive liquid medium to the second conductive liquid medium;

applying an electrical potential between the first conductive liquid medium and the second conductive liquid medium as the nucleic acid translocates through the nanopore;

measuring a plurality of current signals between the first conductive liquid medium and the second conductive liquid medium as the nucleic acid is in the nanopore; and

determining a characteristic of the nucleic acid based on the measured current signals;

wherein the nucleic acid is associated with a molecular brake in the first conductive liquid medium that regulates the translocation velocity of the nucleic acid through the nanopore, and wherein the molecular brake has a diameter that exceeds a diameter of the nanopore.

47. The method of Claim 46, wherein the molecular brake is disposed within 4 nm of the constriction zone of the nanopore during translocation of the nucleic acid.

48. The method of Claim 46, wherein the molecular brake is an enzyme capable of associating with the nucleic acid.

49. The method of Claim 48, wherein the enzyme is a translocase, a polymerase, a helicase, an exonuclease, or a topoisomerase.

50. The method of Claim 49, wherein the polymerase is phi29 DNA polymerase, Klenow fragment, or a variant or homolog thereof.

51. The method of Claim 48, wherein the helicase is a Hel308 helicase, a RecD helicase, a TraI helicase, a TraI subgroup helicase, an XPD helicase, or a variant or homolog thereof.

52. The method of Claim 50, wherein the exonuclease is exonuclease I, exonuclease III, lambda exonuclease, or a variant or homolog thereof.

53. The method of Claim 49, wherein the topoisomerase is a gyrase or a variant or homolog thereof.

54. The method of Claim 46, wherein the electrical potential is a non-constant electrical potential.

55. The method of Claim 46, wherein the characteristic is an identity of one or more nucleotide residues of the nucleic acid.

Description:
SYSTEMS AND METHODS FOR NANOPORE-BASED ANALYSIS OF NUCLEIC ACIDS

CROSS-REFERENCE TO RELATED APPLICATIONS

This application claims the benefit of Provisional Application No. 61/887,236, filed October 4, 2013, and Provisional Application No. 61/941,900, filed February 19, 2014, both of which are expressly incorporated herein by reference in their entirety.

STATEMENT REGARDING SEQUENCE LISTING

The sequence listing associated with this application is provided in text format in lieu of a paper copy and is hereby incorporated by reference into the specification. The name of the text file containing the sequence listing is 52613_SEQ_LISTING_ST25.txt. The text file is 9 KB; was created on October 6, 2014; and is being submitted via EFS-Web with the filing of the specification.

STATEMENT OF GOVERNMENT LICENSE RIGHTS

This invention was made with Government support under R01HG005115 awarded by the National Institutes of Health. The Government has certain rights in the invention.

BACKGROUND

The rapid, reliable, and cost-effective analysis of polymer molecules, such as sequencing of nucleic acids and polypeptides, is a major goal of researchers and medical practitioners. The ability to determine the sequence of polymers, such as the sequence of DNA, RNA or polypeptides, has additional importance in identifying genetic mutations and polymorphisms. Established DNA sequencing technologies have considerably improved in the past decade but still require substantial amounts of DNA and several lengthy steps, and struggle to yield long contiguous read lengths. Obtained information must be assembled "shotgun" style, an effort that depends non-linearly on the size of the genome and on the length of the fragments from which the full genome is constructed. Furthermore, considering the equipment and reagents involved, these approaches are expensive and time-consuming, especially when sequencing mammalian genomes.

Nanopore-based analysis methods have been investigated as an alternative to traditional polymer analysis approaches. These methods involve passing a polymeric molecule, for example single-stranded DNA ("ssDNA"), through a nanoscopic opening while monitoring a signal, such as an electrical signal, that is influenced by the physical properties of the polymer subunits as the polymer analyte passes through the nanopore opening. The nanopore optimally has a size or three-dimensional configuration that allows the polymer to pass only in a sequential, single file order. Under theoretically optimal conditions, the polymer molecule passes through the nanopore at a rate such that the passage of each discrete monomeric subunit of the polymer can be correlated with the monitored signal. Differences in the chemical and physical properties of each monomeric subunit that makes up the polymer, for example, the nucleotides that compose an ssDNA, result in characteristic electrical signals that can identify each monomeric subunit as it passes through the nanopore. Nanopores, such as solid state nanopores and protein nanopores held within lipid bilayer membranes, have been heretofore used for analysis of DNA, RNA, and polypeptides and, thus, provide the potential advantage of robust analysis of polymers even at low copy number.

However, many challenges remain for the full realization of such benefits. For example, in ideal sequencing conditions, the passage of each potential monomeric subunit-type through the nanopore would cause a distinct detectable signal that can be readily differentiated from detectable signals caused by the passage of any other monomeric subunit-types. However, depending on the structural characteristics of the nanopore and the particular polymer analyte, multiple distinct monomeric subunit types can often affect the current similarly and, thus, contribute to detectable signals that are difficult to distinguish. For example, when monitoring the ion current flowing through Mycobacterium smegmatis porin A (MspA) during nucleic acid analysis, it has been found that the nucleotide residue adenine (A) results in the largest detectable current, whereas the residue thymine (T) results in the lowest detectable current. While the A and T residues can be readily distinguished, the nucleotide residues cytosine (C) and guanine (G) cause current levels that are intermediate between the current levels caused by A and T residues. Accordingly, C and G residues are often difficult to distinguish from each other. In another example, analysis of ssDNA in the protein pore alpha-hemolysin results in signals that are even more compressed, with signal overlap for all four nucleotide residue types, which makes base-calling uncertain.

Furthermore, nanopores that have been heretofore used for analysis of DNA and RNA, for example, protein nanopores held within lipid bilayer membranes and solid state nanopores, have generally not been capable of reading a sequence at a single-nucleotide resolution. Instead, nanopores such as MspA have a constriction zone, which is the portion of the internal tunnel with the smallest diameter. A series of consecutive nucleotides in the nucleic acid (i.e., a "k-mer" subsequence, where k = the number of nucleotides) residing in the constriction zone at any given time will influence the current through the nanopore. With the sequential passage of each individual nucleotide through the pore, the k-mer subsequence of the nucleic acid segment residing in the constriction zone shifts by a single nucleotide. Thus, the series of consecutive measured current signals is determined by a corresponding series of nucleic acid segments that have overlapping nucleotide subsequences. Accordingly, the monitored signals must undergo a deconvolution analysis to deduce a correlation between the observed signal and the subsequence of the multiple nucleotides residing in the constriction zone that determined the particular signal. This deconvolution from signal to sequence requires consideration that multiple k-mer subsequences often result in similar signals and typically relies on identifying the progressing cascade of possible k-mer subsequences that can result in a continuous sequence.
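The one-nucleotide shift of the k-mer window described above can be illustrated with a minimal Python sketch. The function name `kmers`, the example sequence, and the choice k = 4 are illustrative assumptions and are not taken from the disclosure.

```python
def kmers(sequence, k):
    """Return the successive k-mer windows of a sequence.

    Each window is shifted by one nucleotide relative to the previous one,
    mimicking the segment residing in the constriction zone after each
    single-nucleotide translocation step.
    """
    if k <= 0 or k > len(sequence):
        return []
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


# Hypothetical example: a 6-nucleotide strand read through a 4-nucleotide window.
windows = kmers("ACGTAC", 4)

# Adjacent windows share k - 1 nucleotides, which is why consecutive
# current signals are determined by overlapping subsequences.
overlaps = [a[1:] == b[:-1] for a, b in zip(windows, windows[1:])]
```

Because neighbouring windows share k - 1 nucleotides, a candidate k-mer for one step constrains the candidates for the next step, which is the "progressing cascade" the deconvolution relies on.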

Finally, a technical hurdle to the use of many nanopores is the speed at which the nucleic acid analytes translocate. Often the translocation rate is too fast for current hardware to observe meaningful modulations in the measured current. Accordingly, much emphasis has been placed on modifying the nanopore systems to control the translocation rate. Some strategies have included alterations to the physical shape or charge of the nanopores themselves. Other strategies include use of molecular brake constructs, such as molecular motors, that regulate the translocation rate of the nucleic acid. Most molecular motors facilitate the controlled translocation of the nucleic acid in discrete steps, often in single nucleotide increments. This enables the system to monitor and record current signals (called "levels") for each discrete translocation step. However, such data provides only an average current sampling for each discrete step, and thus is inherently limited in the resolution of data that it can provide. Accordingly, a need remains for enhancing the sensitivity and resolution of nanopore-based polymer analysis. The methods and systems of the present disclosure address this and related needs of the art.

SUMMARY

This summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This summary is not intended to identify key features of the claimed subject matter, nor is it intended to be used as an aid in determining the scope of the claimed subject matter.

In one aspect, the disclosure provides a method for analyzing a nucleic acid. The method comprises translocating the nucleic acid through a nanopore from a first conductive liquid medium to a second conductive liquid medium, wherein the nanopore comprises a constriction zone, and wherein the nanopore is disposed in a membrane and provides liquid communication between the first conductive liquid medium and the second conductive liquid medium. The method also comprises applying a non-constant electrical potential between the first conductive liquid medium and the second conductive liquid medium as the nucleic acid translocates through the nanopore. The method also comprises measuring a plurality of current signals between the first conductive liquid medium and the second conductive liquid medium as the nucleic acid translocates through the nanopore. The method also comprises identifying a subset of the plurality of measured current signals associated with a single translocation step of the nucleic acid. Finally, the method also comprises deriving a current-potential curve from the subset of current signals to provide a current pattern. In some embodiments, the method also comprises determining a characteristic of the nucleic acid based on the current pattern.
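As a minimal sketch of the deriving step, the following Python fragment averages the current samples of one translocation step within voltage bins to obtain a current-potential curve. It assumes the subset of samples for the step has already been identified; the function name, the units (mV, pA), and the 10 mV bin width are illustrative assumptions rather than details given in the disclosure.

```python
from collections import defaultdict
from statistics import mean


def current_potential_curve(voltages_mv, currents_pa, bin_width_mv=10.0):
    """Average the current samples that fall into each voltage bin.

    Inputs are the applied potential and measured current for the samples
    belonging to a single translocation step. Returns a list of
    (bin_center_mV, mean_current_pA) pairs sorted by voltage.
    """
    if len(voltages_mv) != len(currents_pa):
        raise ValueError("voltage and current series must have equal length")
    bins = defaultdict(list)
    for v, i in zip(voltages_mv, currents_pa):
        bins[int(v // bin_width_mv)].append(i)
    return [((index + 0.5) * bin_width_mv, mean(samples))
            for index, samples in sorted(bins.items())]


# Hypothetical samples from one step, covering two sweeps of the potential.
curve = current_potential_curve(
    [100.0, 105.0, 110.0, 115.0, 100.0, 105.0],
    [10.0, 12.0, 14.0, 16.0, 12.0, 14.0],
)
```

Averaging over repeated sweeps within the same step is one simple way to reduce noise in the curve; other estimators (for example a median per bin) could be substituted.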

In some embodiments, the characteristic of the nucleic acid is the identity of one or more nucleotide residues of the nucleic acid. In some embodiments, the current-potential curve is indicative of the identity of the one or more nucleotide residues in the nucleic acid.

In some embodiments, the method further comprises applying a compensatory electrical potential to substantially offset any current resulting from capacitance of the membrane, nucleic acid, or nanopore. In some embodiments, the method further comprises correcting the plurality of measured current signals, or the subset thereof, for any current not associated with passage of ions through the nanopore. In some embodiments, the current not associated with the passage of ions through the nanopore is current associated with the capacitance of the membrane, pore, analyte, and the like.

In some embodiments, the method further comprises converting the current-potential curve into a conductance-voltage relationship at a given voltage. In some embodiments, the method further comprises correcting the measured current signals in the subset for any non-linear ion current-voltage relationship.
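One way to picture the conductance-voltage conversion is the chord conductance G = I / V evaluated at each point of the curve. The sketch below assumes that definition (a slope-based dI/dV definition would be an equally plausible reading), and the function name and units are hypothetical.

```python
def conductance_curve(curve):
    """Convert (voltage_mV, current_pA) points to (voltage_mV, conductance_nS).

    Uses the chord conductance G = I / V; pA divided by mV gives nS.
    Points at 0 mV are skipped because G is undefined there.
    """
    result = []
    for v_mv, i_pa in curve:
        if v_mv == 0:
            continue
        result.append((v_mv, i_pa / v_mv))
    return result


# Hypothetical current-potential points for one translocation step.
g_curve = conductance_curve([(100.0, 50.0), (200.0, 120.0), (0.0, 0.0)])
```

For an ohmic system the conductance would be the same at every voltage; a conductance that varies with voltage, as in this made-up example, is the kind of non-linearity the correction step addresses.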

In some embodiments, the method further comprises converting the current-potential curve into a current-nucleic acid distance curve to provide a current pattern corresponding to a segment of the nucleic acid residing in the constriction zone of the nanopore associated with the single translocation step. In some embodiments, the converting step comprises modeling the nucleic acid as a spring with a linear restoring force. In some embodiments, the current-nucleic acid distance curve of the subset is indicative of the identity of one or more nucleotide residues in the segment.
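The spring model can be sketched under strong simplifying assumptions: the force on the nucleic acid is taken to be proportional to the applied potential, and a linear restoring force (Hooke's law) then gives a displacement proportional to the change in potential. The reference potential and the compliance value below are placeholders chosen for illustration, not values from the disclosure.

```python
def voltage_to_displacement(voltage_mv, reference_mv, compliance_nt_per_mv):
    """Displacement (in nucleotides) relative to the position at reference_mv.

    Assumes force proportional to voltage and a linear spring, so that
    displacement = compliance * (V - V_ref). The compliance lumps together
    the effective charge and the spring constant.
    """
    return compliance_nt_per_mv * (voltage_mv - reference_mv)


def current_distance_curve(curve, reference_mv=180.0, compliance_nt_per_mv=0.005):
    """Map (voltage_mV, current_pA) points to (displacement_nt, current_pA)."""
    return [
        (voltage_to_displacement(v, reference_mv, compliance_nt_per_mv), i)
        for v, i in curve
    ]


# Hypothetical curve: a 100 mV swing corresponds to a 1 nt shift here.
distance_curve = current_distance_curve(
    [(100.0, 1.0), (200.0, 2.0)],
    reference_mv=100.0,
    compliance_nt_per_mv=0.01,
)
```

In practice the compliance would have to be calibrated, for example from the voltage-dependent shift of a known nucleotide substitution through the constriction zone.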

In some embodiments, the method comprises repeating one or more of the described steps for one or more additional subsets associated with one or more additional sequential single translocation steps to provide an aggregation of multiple current patterns representing the current through the nanopore as the nucleic acid translocates continuously through the nanopore. In some embodiments, the aggregation of multiple current patterns is indicative of the identity of one or more nucleotide residues in multiple overlapping segments of the nucleic acid. In some embodiments, the method further comprises comparing one or more of the multiple current patterns to current patterns from reference nucleic acids with known correlations between current patterns and sequence. In some embodiments, the method further comprises determining the sequence of a portion of the nucleic acid corresponding to the overlapping segments. In some embodiments, the method further comprises detecting one or more discontinuities in the aggregation of multiple current patterns. In some embodiments, the method comprises identifying the one or more discontinuities as a forward skip or backstep associated with an aberrant translocation movement of the nucleic acid. In some embodiments, the method further comprises correcting the discontinuity.

In some embodiments, the step of translocating the nucleic acid through the nanopore from the first conductive liquid medium to the second conductive liquid medium comprises using a molecular brake to regulate a rate of translocation for the nucleic acid in one or more discrete steps. In some embodiments, the molecular brake is a molecular motor, such as a translocase, a polymerase, a helicase, an exonuclease, or a topoisomerase. In some embodiments, the polymerase is phi29 DNA polymerase, Klenow fragment, or a variant or homolog thereof. In some embodiments, the helicase is a Hel308 helicase, a RecD helicase, a TraI helicase, a TraI subgroup helicase, an XPD helicase, or a variant or homolog thereof. In some embodiments, the exonuclease is exonuclease I, exonuclease III, lambda exonuclease, or a variant or homolog thereof. In some embodiments, the topoisomerase is a gyrase or a variant or homolog thereof.

In some embodiments, the non-constant electrical potential has a periodic function or a random non-periodic function. In some embodiments, the non-constant electrical potential is superimposed onto a constant potential that is between 10 mV and 1 V. In some embodiments, the non-constant electrical potential is superimposed over a constant electrical potential that is between 10 mV and 300 mV. In some embodiments, the periodic function of the non-constant electrical potential has a minimum to maximum difference of less than 250 mV. In some embodiments, the periodic function of the non-constant electrical potential has a period of between 0.1 ms and 1 s. In some embodiments, the periodic function is triangular, sawtooth, square, sinusoidal, a custom function, or linear combinations thereof. In some embodiments, the periodic function is optimized to promote a substantially constant rate of nucleic acid movement within the nanopore during a period of increasing or decreasing potential in the periodic function.
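A triangular periodic potential superimposed on a constant potential can be sketched as follows. The default values (180 mV constant potential, 100 mV minimum to maximum difference, 5 ms period) are illustrative choices that fall inside the ranges recited above; they are assumptions, not settings taken from the disclosure.

```python
def triangle_wave(t_s, period_s):
    """Unit triangle wave in [-1, 1] that starts at -1 when t = 0."""
    phase = (t_s / period_s) % 1.0
    if phase < 0.5:
        return 4.0 * phase - 1.0   # rising half of the period
    return 3.0 - 4.0 * phase       # falling half of the period


def applied_potential_mv(t_s, dc_mv=180.0, peak_to_peak_mv=100.0, period_s=0.005):
    """Constant potential plus a triangular component, in mV."""
    return dc_mv + 0.5 * peak_to_peak_mv * triangle_wave(t_s, period_s)


# Hypothetical sampling: one period with a 1 s period for easy arithmetic.
samples = [applied_potential_mv(t, period_s=1.0) for t in (0.0, 0.25, 0.5, 0.75)]
```

With these defaults the potential sweeps between 130 mV and 230 mV. Other shapes named above (sawtooth, square, sinusoidal) could be obtained by swapping out `triangle_wave`.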

In some embodiments, the nanopore is a solid-state nanopore, protein nanopore, a hybrid solid state-protein nanopore, a biologically adapted solid-state nanopore, or a DNA origami nanopore. In some embodiments, the protein nanopore is alpha-hemolysin, leukocidin, Mycobacterium smegmatis porin A (MspA), outer membrane porin F (OmpF), outer membrane porin G (OmpG), outer membrane phospholipase A, Neisseria autotransporter lipoprotein (NalP), WZA, lysenin or a homolog or variant thereof. In some embodiments, the protein nanopore sequence is modified to contain at least one amino acid substitution, deletion, or addition. In some embodiments, the at least one amino acid substitution, deletion, or addition results in a net charge change in the nanopore. In some embodiments, the protein nanopore has a constriction zone with a non-negative charge.

In another aspect, the disclosure provides a nanopore system. The nanopore system provides a membrane separating a first conductive liquid medium and a second conductive liquid medium;

a nanopore comprising a constriction zone and a vestibule that together define a tunnel;

wherein the nanopore is disposed in the membrane to provide liquid communication between the first conductive liquid medium and the second conductive liquid medium through the tunnel;

wherein the constriction zone is more proximate to the first conductive liquid medium and the vestibule is more proximate to the second conductive liquid medium; wherein the system is operative to translocate a nucleic acid from the first conductive liquid medium to the second conductive liquid medium.

In some embodiments, the membrane is a lipid bilayer. In some embodiments, the membrane comprises a block copolymer. In some embodiments, the nanopore is a solid-state nanopore, protein nanopore, a hybrid solid state-protein nanopore, a biologically adapted solid-state nanopore, a graphene nanopore, or a DNA origami nanopore. In some embodiments, the protein nanopore is an alpha-hemolysin, leukocidin, Mycobacterium smegmatis porin A (MspA), outer membrane porin F (OmpF), outer membrane porin G (OmpG), outer membrane phospholipase A, Neisseria autotransporter lipoprotein (NalP), WZA, lysenin or a homolog or variant thereof. In some embodiments, the protein nanopore sequence is modified to contain at least one amino acid substitution, deletion, or addition. In some embodiments, the at least one amino acid substitution, deletion, or addition results in a net charge change in the nanopore.

In another aspect, the disclosure provides a method for analyzing a nucleic acid in the nanopore system, as described herein. The method comprises:

translocating the nucleic acid through the nanopore from the first conductive liquid medium to the second conductive liquid medium;

applying an electrical potential between the first conductive liquid medium and the second conductive liquid medium as the nucleic acid translocates through the nanopore;

measuring a plurality of current signals between the first conductive liquid medium and the second conductive liquid medium as the nucleic acid is in the nanopore; and

determining a characteristic of the nucleic acid based on the measured current signals;

wherein the nucleic acid is associated with a molecular brake in the first conductive liquid medium that regulates the translocation velocity of the nucleic acid through the nanopore, and wherein the molecular brake has a diameter that exceeds a diameter of the nanopore.

In some embodiments, the molecular brake is disposed within 4 nm of the constriction zone of the nanopore during translocation of the nucleic acid.

In some embodiments, the molecular brake is an enzyme capable of associating with the nucleic acid, for example, a translocase, a polymerase, a helicase, an exonuclease, or a topoisomerase.

DESCRIPTION OF THE DRAWINGS

The foregoing aspects and many of the attendant advantages of this invention will become more readily appreciated as the same become better understood by reference to the following detailed description, when taken in conjunction with the accompanying drawings, wherein:

FIGURE 1 is a cartoon representation of single stranded DNA (ssDNA) in an MspA nanopore embedded in a lipid bilayer membrane, according to an embodiment of the present disclosure. The MspA is illustrated with a single stranded nucleic acid molecule extending between the cis compartment and the trans compartment through the MspA.

FIGURE 2 is a cartoon representation of the constriction zone of MspA with a nucleic acid positioned therein. The relative intensity of the shading in the constriction zone represents the increasing influence of any nucleic acid nucleotide at that position on the measured current through the nanopore, and reflects the fact that Brownian motion moves the nucleic acid in and out of the constriction zone. Thus, a plurality of nucleotides differentially influence the measured current signal as they pass through the nanopore during translocation of the nucleic acid through the nanopore.

FIGURE 3 is a cartoon representation of MspA embedded within a lipid bilayer in an electrolyte solution (not shown). Single stranded DNA (ssDNA) is attached to a bulky NeutrAvidin molecule with a biotin linker. A specific nucleotide is designated by the position, X, measured from the biotin-NeutrAvidin "brake" where the first nucleotide is X=1. Under an applied voltage (trans +), the ssDNA threads through MspA from the cis side and remains immobilized until the applied voltage is reversed (trans -). The nucleotides residing in the narrowest part of MspA, the constriction, modulate the ionic current through the pore.

FIGURE 4 graphically illustrates voltage-dependent resistance of the DNA-pore system, indicating that the nanopore system is non-ohmic. The voltage dependent resistance of the DNA-pore system is shown for homopolymer adenine ('poly-dA', circles), cytosine ('poly-dC', squares), and thymine ('poly-dT', triangles) for both 3'-leading (solid) and 5'-leading (open) orientations. The resistance of MspA without DNA present is also shown ("Open Pore", x's).

FIGURE 5 graphically illustrates the fluctuations of inter-event resistance for nanopore-analyte complexes. Average standard deviations of the DNA-pore resistance within an event are shown for homopolymer DNA strands. For all strands, the fluctuation of ionic current within an event decreases with higher voltage.

FIGURE 6 graphically illustrates the effect of single nucleotide substitutions at various locations in the nucleic acid on the resistance of the nucleic acid. The figure shows a comparison of resistance for DNA with a single cytosine (dC) substitution at position X in an otherwise poly-dA strand (3' leading). The deviation of resistance from that of poly-dA is shown. The top of each plot corresponds to the resistance for a poly-dC strand. The vertical dashed lines indicate the peak of the fitted Gaussian curves.

FIGURE 7 graphically illustrates the differential positioning of ssDNA within MspA with varying voltage. The nucleotide positions centrally residing in MspA's constriction zone are shown at each voltage for the dA strand (circles), dT strand (squares), and SNP strand (triangles). The upper panel illustrates that the central nucleotide is taken from the peak of the fitted Gaussian curve as shown in FIGURE 6. The lower panel illustrates the FWHM of the Gaussian curve as shown in FIGURE 6. The error bars indicate the standard errors in the Gaussian fit for each parameter. Voltage dependent stretching of the DNA within MspA is observed by the voltage dependent shifting of the nucleotides in MspA's constriction zone.

FIGURE 8 graphically illustrates experimental results for single nucleotide substitutions (SNP) in poly-dT strands that are immobilized in MspA. The panels provide a comparison of resistance for ssDNA with a single adenine (dA) substitution at position X in an otherwise poly-dT strand at different applied voltages. The deviation of the resistance from that of poly-dT (horizontal line at 0) is shown. The top of each plot corresponds to the resistance for a poly-dA strand. The central nucleotide in the constriction zone (vertical dashed line) is taken as the peak of the fitted Gaussian curve and the width of the recognition site is taken as the full width at half max (FWHM).

FIGURE 9 graphically illustrates experimental results for single nucleotide polymorphism strands. The panels provide a comparison of the voltage dependent resistance of the DNA-pore system for experiments with a single adenine (dA) or cytosine (dC) substitution at nucleotide position X from the NeutrAvidin anchor in a heteromeric strand. The difference in resistance for the dA and dC substitutions is shown for each nucleotide position and voltage. The top of each graph corresponds to the difference in resistance expected for poly-dC and poly-dA. The central nucleotide in the constriction (vertical red dashed line) is taken as the peak of the fitted Gaussian curve and the width of the constriction is taken as the full width at half max (FWHM).

FIGURE 10 graphically illustrates calculations of force applied to ssDNA within MspA by the application of different voltages. The force applied to DNA in MspA's constriction is calculated at each applied voltage using the freely jointed chain (FJC) model and force relationship (Eqns 2 - 3).

FIGURE 11 graphically illustrates the elasticity of ssDNA. The spring constant is determined from the FJC model (Eqn 3) using experimentally derived contour length of the ssDNA, L, as described in more detail below.

FIGURE 12 graphically illustrates the width of the MspA recognition site compared to Brownian motion calculations of DNA immobilized in MspA. The recognition sites determined from experiment (circles) are the FWHM from the Gaussian curve fitted to experimental results assuming an interphosphate distance of 0.56 nm (FIGURE 6). The Brownian motion values (triangles) are determined as described in the text. Plots are shown for single dC substitution in poly-dA strand (top panel), single dA substitution in poly-dT strand (middle panel), SNP strand (bottom panel).

FIGURE 13 graphically illustrates the force-extension curves for immobilized ssDNA modeled as a Freely Jointed Chain (FJC). Force extension curves for a FJC (Eqn 3) are plotted for each voltage using contour lengths, L, found experimentally. The curve for an applied voltage of 180 mV (dashed) is compared to the curves for each lower voltage (solid). The star indicates the position where the end-to-end distances of the strand, x, are equal and the forces are related by Eqn 2.

FIGURES 14A-14F illustrate the current observed when DNA is moved through a nanopore. FIGURE 14A is a cartoon diagram of the constriction zone of MspA with ssDNA (represented with balls and sticks) threaded and influencing the resistance, primarily due to the nucleotides within the sensing zone (shaded area). FIGURE 14B graphically illustrates the expected average current as DNA is moved adiabatically, continuously or in small steps. This representation is referred to as a "continuous current profile". FIGURE 14C graphically illustrates the corresponding average current expected to be observed when DNA is moved in single nucleotide steps by a polymerase such as phi29 DNA polymerase. The arrows between FIGURE 14B and FIGURE 14C demonstrate where along the curve in FIGURE 14B the current level was "sampled" for FIGURE 14C. FIGURE 14D is a cartoon diagram illustrating that an increased force on the ssDNA repositions the ssDNA within the nanopore by an amount, δ, referred to as the "registration shift". FIGURE 14E graphically illustrates the shift of the continuous current profile that results from the change in the nucleotides residing in the sensing zone. FIGURE 14F graphically illustrates the corresponding average levels when DNA is moved in single nucleotide steps at the increased force applied in FIGURE 14D.

FIGURE 15 graphically illustrates a comparison of current profiles generated from the same nucleic acid polymer, but with separately applied potentials of 160 mV and 180 mV. FIGURE 15 (top panel) graphically illustrates the 180 mV data (circles) overlaid with 160 mV data (triangles), scaled for comparison. The inset shows a representative region and demonstrates the offsetting that occurred. FIGURE 15 (bottom panel) graphically illustrates that after correction for the offsetting, the 160 mV data falls very neatly on the 180 mV line.

FIGURES 16A-16C graphically illustrate the registration shift, δ, resulting from the application of two different electrical potentials, 160 mV and 180 mV. FIGURE 16A is a graphic representation of an overlay of the step-wise sampling of current from the same nucleic acid analyte with 160 mV and 180 mV. FIGURE 16B illustrates the voltage shift from 160 mV compared to the phase at 160 mV as calculated from data with Eqn 5 (dashed, horizontal line). The fit (solid, curved line) is described by Eqn 7 as adapted for the nanopore system from a modified Freely Jointed Chain (FJC). Error bars are taken from averaging δ(V) from at least 10 events and at least two experiments. FIGURE 16C is a cartoon diagram of the nanopore indicating the parameters corresponding to the variables used in Eqn 7.

FIGURE 17 is a schematic representation of one embodiment of the present disclosure for obtaining sequence information from a nucleic acid analyte in a nanopore system using a non-constant electrical potential. Optional steps are indicated in dashed lines.

FIGURES 18A-18C are illustrative examples of raw current signals obtained during the translocation of an ssDNA analyte through an MspA nanopore with the application of a non-constant electrical potential. The series from 18A-18C illustrates the raw current signals with increasing resolution (i.e., over reduced time scales). In FIGURE 18C, the arrow indicates the transition to a new level, indicating a translocation step event by the ssDNA.

FIGURE 19 is an illustrative depiction of raw current signals during the translocation of an ssDNA through MspA during the application of a non-constant electrical potential. The arrow indicates the transition to a new level, indicating a translocation step event by the ssDNA.

FIGURE 20 is an illustrative example of the current-potential curves corresponding to multiple enzyme-assisted translocation steps identified in the raw current signals. Each curve is represented on its own current to voltage scale.

FIGURE 21 graphically illustrates the corrected current to voltage relationship, indicating that current signals collected across a range of electrical potentials can be converted to remove voltage dependency because the illustrated curves lack slope or other structure dependent on voltage.

FIGURES 22A and 22B are illustrative examples of a continuous current profile generated from the current-potential curves illustrated in FIGURE 20. The continuous current profiles are generated by converting the voltage variable to nucleic acid position using a Freely Jointed Chain (FJC) model. FIGURE 22A illustrates the continuous current profile as current vs. distance in nucleotides. FIGURE 22B illustrates the continuous current profile as current vs. distance in nm.

FIGURE 23 is a cartoon illustration comparing a nanopore system with an MspA nanopore in the forward orientation (left diagram) with a nanopore system with an MspA nanopore in the reverse orientation (right diagram).

FIGURE 24 graphically illustrates the contrasting current-to-voltage characteristics of MspA nanopores in forward and reverse orientations, in the absence of analyte. The dashed lines represent the observed characteristics when the poles and applied voltages were reversed for the nanopores in the forward orientation.

FIGURES 25A-25B graphically illustrate current observed in a reverse-oriented MspA nanopore without and with analyte, respectively. FIGURE 25A illustrates gating events initially observed when applying an electrical potential to an MspA nanopore in the reverse orientation in the absence of any analyte. FIGURE 25B illustrates the various current levels observed during the translocation of a nucleic acid analyte, indicating that the reverse nanopore is functional for analyzing a translocating nucleic acid.

FIGURES 26A-26C graphically illustrate comparisons of current signal resolution provided by MspA nanopores in the forward and reverse orientations. FIGURE 26A illustrates the auto-correlation (i.e., correlation of a current signal with itself and current signals at adjacent positions). As the lag increases, the correlation of observed current is reduced at a greater rate in a reverse pore than in a forward pore. FIGURE 26B illustrates the nucleotide-specific response of the forward (solid line) and reverse (dashed) MspA nanopores. The reverse pore response is narrower and more specific than that of the forward pore. FIGURE 26C illustrates the distribution of currents observed in forward (solid) and reverse (dashed) MspA nanopores using the same ssDNA analyte.
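
By way of a non-limiting illustration, an auto-correlation of the kind referred to for FIGURE 26A could be computed along the following lines. This is a minimal sketch only: the function name, the normalization by the lag-zero variance, and the sample values are assumptions made for illustration and are not taken from the analysis actually performed.

```python
def autocorrelation(levels, lag):
    """Normalized autocorrelation of a sequence of current levels at a given lag.

    `levels` is a hypothetical list of mean current values, one per
    translocation step; the result is 1.0 at lag 0 by construction.
    """
    n = len(levels)
    if not 0 <= lag < n:
        raise ValueError("lag must satisfy 0 <= lag < len(levels)")
    mu = sum(levels) / n
    var = sum((x - mu) ** 2 for x in levels)
    if var == 0:
        raise ValueError("levels must not be constant")
    # Covariance between the sequence and a copy of itself shifted by `lag`.
    cov = sum((levels[t] - mu) * (levels[t + lag] - mu) for t in range(n - lag))
    return cov / var


if __name__ == "__main__":
    example_levels = [42.0, 55.5, 48.1, 61.3, 39.8, 50.2, 57.7, 44.4]  # made-up values, pA
    for k in range(4):
        print(k, autocorrelation(example_levels, k))
```

Under this sketch, a pore whose signal depends on fewer neighboring nucleotides would be expected to show a correlation that falls off faster as the lag increases.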

DETAILED DESCRIPTION

The present disclosure generally relates to the field of nanopore-based analysis of polymers, such as RNA and DNA polymers. The present disclosure is based in part on a preliminary force spectroscopy investigation on single stranded DNA (ssDNA) within a nanopore. As described in more detail below, the inventors found that an anchored DNA analyte stretches within the constriction zone of MspA with increasing force, as applied with an increased electric potential in the nanopore system. By varying electric potential in the nanopore system and simultaneously monitoring the resulting current, the stretching of the DNA within the nanopore was characterized at angstrom-level precision. Using a freely jointed chain model, the relative positions of the nucleotides were characterized during the stretch events. Furthermore, the contribution of Brownian motion to the sensitivity of the nanopore system to multiple nucleotides was established.
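
By way of a non-limiting illustration, the freely jointed chain behavior referred to above can be sketched in a few lines of Python. The sketch uses the standard textbook FJC relation, x = L[coth(Fb/kBT) - kBT/(Fb)], which is assumed here for illustration and is not necessarily the exact form of the equations used in this disclosure. The function name, the Kuhn length b, and all numerical values are hypothetical.

```python
import math

K_B_T_PN_NM = 4.11  # thermal energy k_B*T near 298 K, in pN*nm


def fjc_extension(force_pn, contour_length_nm, kuhn_length_nm):
    """End-to-end extension (nm) of a freely jointed chain under a stretching force.

    Textbook FJC form (an assumption for this sketch):
        x = L * (coth(a) - 1/a),  with a = F * b / (k_B * T)
    """
    if force_pn <= 0:
        return 0.0
    a = force_pn * kuhn_length_nm / K_B_T_PN_NM
    if a < 1e-4:
        # Small-force limit of the Langevin function, avoids cancellation error.
        return contour_length_nm * a / 3.0
    return contour_length_nm * (1.0 / math.tanh(a) - 1.0 / a)


if __name__ == "__main__":
    contour_nm = 10.0  # hypothetical contour length L
    kuhn_nm = 1.5      # hypothetical Kuhn length b
    for force in (1.0, 5.0, 10.0, 20.0, 40.0):
        print(force, fjc_extension(force, contour_nm, kuhn_nm))
```

In such a sketch the extension rises monotonically with force and approaches, but never reaches, the contour length, which is the qualitative behavior expected of a polymer that stretches as the applied potential increases.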

The revelations that DNA stretches in the constriction zone when under non-constant electrical potential and that the positions of the nucleotides of the DNA can be modeled were applied to substantially enhance the sensitivities of nanopore systems. As described above, many nanopore systems use molecular brakes, such as phi29 DNA polymerase, to move nucleic acids through the nanopore in discrete steps. This step-wise translocation functions to slow the translocation process enough to allow the recordation of current blockades at each translocation step. However, considering that the nucleic acid is held substantially static during each translocation step, the measured blockade events amount to incomplete samplings of data relative to the dynamic fluctuation of current through the pore that would occur if the DNA had translocated continuously. As described in more detail below, the present inventors applied non-constant electrical potentials to cause the DNA to stretch during each stepwise translocation event caused by a molecular brake. Because of the insight from the initial spring modeling analysis, the measured currents were associated specifically with the applied non-constant electrical potential to provide informative current-potential curves for each translocation step. Furthermore, the positions of the nucleotides were then calculated at any point during the DNA stretching and were correlated with the current levels measured throughout each step of translocation. By performing such an analysis for multiple steps in the stepwise translocation, the inventors were able to generate a continuous current profile to reflect the passage of the DNA as if it passed smoothly and continuously through the nanopore. Accordingly, the inventors were able to modify the extant nanopore-based approaches to take advantage of the slowing effect of a molecular brake while avoiding the limitations of blockade measurements in discrete locations, resulting in a much more sensitive analysis of the nucleic acid.

The present disclosure provides numerous practical advantages. At the most basic level, the current-potential curve associated with a single translocation event, e.g., the passage of a single nucleotide residue, contains much more information than mere blockade measurements regarding the k-mer subsequence of the nucleic acid affecting the current during the translocation event. This makes prediction of the k-mer sequence much more robust and simple because there is much less likelihood that any other possible k-mer sequence will result in a current-potential curve profile with the same characteristics. Simple modifications can be implemented to provide for enhanced accuracy of the derived current-potential curve. For example, a non-constant current can be implemented in a cycle that repeats multiple times during each enzyme-assisted translocation step. This results in the repetitive stretching and retracting of the DNA to provide multiple data points for each stretch position. This results in a higher statistical certainty of the observed signal and reduces the influence of aberrant measurements. Furthermore, detection of phosphate backbone structure is now possible because the resolution of this approach allows characterization of movements of less than a single nucleotide. This fine resolution enables the approach to accurately characterize homopolymers. Moreover, the generation of a continuous current profile allows for the simple detection and even correction of translocation errors committed by the molecular brake. For example, the conversion of numerous current-potential curves to an aggregate profile of current versus nucleotide position can reveal discontinuities in the continuous current profile. Such discontinuities can be indicative of forward skips or backsteps committed by the molecular brake. The mere knowledge of such aberrant translocation instances can advance the analysis of the signals. However, missing or extraneous data due to such aberrations can be readily inferred or corrected in view of the remainder of the continuous current profile. Finally, considering the dynamic potentials applied, a complete and empirical function that predicts the response of every k-mer at every electrical potential can be readily generated.
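
By way of a non-limiting illustration, the pooling of data points from repeated cycles within a single translocation step can be sketched as follows. The sketch assumes that simultaneous voltage and current samples are available for the step and simply averages the current within fixed-width voltage bins; the function name, the bin width, and the sample values are hypothetical and do not describe the processing actually used.

```python
from collections import defaultdict
from statistics import mean


def current_potential_curve(voltages_mv, currents_pa, bin_width_mv=5.0):
    """Average current samples that fall in the same voltage bin.

    Samples from every repeat of the cycle are pooled, so each bin gains
    more data points as the cycle repeats within the step.
    Returns a list of (bin_center_mv, mean_current_pa, n_samples),
    sorted by voltage.
    """
    if len(voltages_mv) != len(currents_pa):
        raise ValueError("voltage and current traces must have equal length")
    bins = defaultdict(list)
    for v, i in zip(voltages_mv, currents_pa):
        bins[int(v // bin_width_mv)].append(i)
    return [((k + 0.5) * bin_width_mv, mean(samples), len(samples))
            for k, samples in sorted(bins.items())]


if __name__ == "__main__":
    # Two made-up sweeps over the same step, sampled at three potentials.
    volts = [100.0, 140.0, 180.0, 100.0, 140.0, 180.0]
    amps = [20.1, 31.0, 43.2, 19.7, 30.4, 42.6]
    for row in current_potential_curve(volts, amps):
        print(row)
```

The sample count returned for each bin gives a rough indication of how well each stretch position has been sampled, and an outlying sample has less influence on the bin mean as more cycles are pooled.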

In accordance with the foregoing, in one aspect, a method is provided for analyzing a nucleic acid. The term "nucleic acid" refers to any polymer molecule that comprises multiple nucleotide subunits (i.e., a polynucleotide). Nucleic acids encompassed by the present disclosure can include deoxyribonucleotide polymer (DNA), ribonucleotide polymer (RNA), cDNA or a synthetic nucleic acid known in the art, such as peptide nucleic acid (PNA), glycerol nucleic acid (GNA), threose nucleic acid (TNA), locked nucleic acid (LNA) or other synthetic polymers with nucleotide side chains, or any combination thereof. The nucleic acids can be in either single- or double-stranded form, or comprise both single and double stranded portions. Typically cDNA, RNA, GNA, TNA or LNA are single stranded. DNA can be either double stranded (dsDNA) or single stranded (ssDNA).

The present disclosure addresses characterization of nucleic acids, including characterization or identification of one or more nucleotide subunits of the nucleic acid molecule. Nucleotide subunits of the nucleic acids can be naturally occurring or artificial or modified. A nucleotide typically contains a nucleobase, a sugar and at least one phosphate group. The nucleobase is typically heterocyclic. Suitable nucleobases include purines and pyrimidines and more specifically adenine (A), guanine (G), thymine (T) (or typically in RNA, uracil (U) instead of thymine (T)), and cytosine (C). The sugar is typically a pentose sugar. Suitable sugars include, but are not limited to, ribose and deoxyribose. The nucleotide is typically a ribonucleotide or deoxyribonucleotide. The nucleotide typically contains a monophosphate, diphosphate or triphosphate. These are generally referred to herein as nucleotides or nucleotide residues to indicate the subunit. Without specific identification, the term nucleotides, nucleotide residues, and the like, is not intended to imply any specific structure or identity.

Unless otherwise indicated, the nucleotides addressed herein can also be synthetic, modified (such as by epigenetic modifications), or damaged. The nucleotide can be labeled or modified to act as a marker with a distinct signal. Furthermore, modifications can be applied to the nucleic acid before analysis that selectively affect the structure of a limited nucleotide-type to enhance the differentiation of the resulting signal. For example, see International Application No. PCT/US2014/53754, incorporated herein by reference in its entirety. The disclosure also encompasses uses to identify the absence of a base, for example, an abasic unit or spacer in the polynucleotide. In accordance with all aspects of this disclosure, polymer analytes such as nucleic acids can be assessed using a nanopore system.

A "nanopore" specifically refers to a pore typically having a size of the order of nanometres that allows the passage of analyte polymers, such as nucleic acids, therethrough. Typically, nanopores encompassed by the present disclosure have an opening with a diameter at its most narrow point of about 0.3 nm to about 2 nm. Nanopores useful in the present disclosure include any pore capable of permitting the linear translocation of the analyte polymer from one side to the other at a velocity amenable to monitoring techniques, such as techniques to detect current fluctuations.

Nanopores can be biological pores or solid state pores.

In some embodiments, the nanopore comprises a protein, such as alpha-hemolysin, anthrax toxin and leukocidins, and outer membrane proteins/porins of bacteria such as Mycobacterium smegmatis porins (Msp), including MspA, outer membrane porins such as OmpF, OmpG, OmpATb, and the like, outer membrane phospholipase A and Neisseria autotransporter lipoprotein (NalP), and lysenin, as described in U.S. Pub. No. US2012/0055792, International PCT Pub. Nos. WO2011/106459, WO2011/106456, WO2013/153359, and Manrao et al., "Reading DNA at single-nucleotide resolution with a mutant MspA nanopore and phi29 DNA polymerase," Nat. Biotechnol. 30:349-353 (2012), each of which is incorporated herein by reference in its entirety. Nanopores can also include alpha-helix bundle pores, which comprise a barrel or channel that is formed from α-helices. Suitable α-helix bundle pores include, but are not limited to, inner membrane proteins and outer membrane proteins, such as WZA and ClyA toxin. The nanopore can also be a homolog or derivative of any nanopore illustrated above. A "homolog," as used herein, is a gene or protein from another species that has a similar structure and evolutionary origin. By way of an example, homologs of wild-type MspA, such as MppA, PorM1, PorM2, and Mmcs4296, can serve as the nanopore in the present invention. Protein nanopores have the advantage that, as biomolecules, they self-assemble and are essentially identical to one another. In addition, it is possible to genetically engineer protein nanopores, thus creating a "derivative" of a nanopore, such as those illustrated above, that possesses various attributes. Such derivatives can result from substituting amino acid residues for amino acids with different charges, or from the creation of a fusion protein (e.g., an enzyme+alpha-hemolysin). Thus, the protein nanopores can be wild-type or can be modified to contain at least one amino acid substitution, deletion, or addition. In some embodiments the at least one amino acid substitution, deletion, or addition results in a different net charge of the nanopore. In some embodiments, the change in net charge increases the difference in net charge relative to the first charged moiety of the polymer analyte. For example, if the first charged moiety has a net negative charge, the at least one amino acid substitution, deletion, or addition results in a nanopore that is less negatively charged. In some cases, the resulting net charge is negative (but less so), is neutral (where it was previously negative), is positive (where it was previously negative or neutral), or is more positive (where it was previously positive but less so).

In some embodiments, the nanopores can include or comprise DNA-based structures, such as generated by DNA origami techniques. For descriptions of DNA origami-based nanopores for analyte detection, see PCT Pub. No. WO2013/083983, incorporated herein by reference.

In some embodiments, the nanopore is an MspA or homolog or derivative thereof. MspA is formed from multiple monomers. The pore may be homomonomeric or heteromonomeric, where one or more of the monomers contains a modification or difference from the others in the assembled nanopore. Descriptions of modifications to MspA nanopores have been described, see U.S. Pub. No. 2012/0055792, incorporated herein by reference in its entirety. Briefly described, MspA nanopores can be modified with amino acid substitutions to result in a MspA mutant with a mutation at position 93, a mutation at position 90, position 91, or both positions 90 and 91, and optionally one or more mutations at any of the following amino acid positions: 88, 105, 108, 118, 134, or 139, with reference to the wild type amino acid sequence. In one specific embodiment, the MspA contains the mutations D90N/D91N/D93N, with reference to the wild type sequence positions (referred to therein as "M1 MspA" or "M1-NNN"). In another embodiment, the MspA contains the mutations D90N/D91N/D93N/D118R/D134R/E139K, with reference to the wild type sequence positions (referred to therein as "M2MspA"). See U.S. Pub. No. 2012/0055792. Such mutations can result in a MspA nanopore that comprises a vestibule having a length from about 2 to about 9 nm and a diameter from about 2 to about 6 nm, and a constriction zone having a length from about 0.3 to about 3 nm and a diameter from about 0.3 to about 3 nm, wherein the vestibule and constriction zone together define a tunnel. Furthermore, the amino acid substitutions described in these examples provide a greater net positive charge in the vestibule of the nanopore, further enhancing the energetic favorability of interacting with a negatively charged polymer analyte end.

Some nanopores, such as MspA protein nanopores, can comprise a variably shaped tunnel component through which the polymer analyte moves. For example, FIGURE 1 illustrates an exemplary embodiment where MspA (30) is disposed in a lipid bilayer membrane (20). The MspA nanopore (30) comprises an outer entrance rim region (40), a vestibule (50), and a constriction zone (60). The vestibule (50) and the constriction zone (60) together form a tunnel. A "vestibule" refers to the cone-shaped portion of the interior of the nanopore whose diameter generally decreases from one end to the other along a central axis, where the narrowest portion of the vestibule is connected to the constriction zone. Stated otherwise, the vestibule may generally be visualized as "goblet-shaped." Because the vestibule is goblet-shaped, the diameter changes along the path of a central axis, where the diameter is larger at one end than the opposite end. The diameter may range from about 2 nm to about 6 nm. Optionally, the diameter is about, at least about, or at most about 2, 2.1, 2.2, 2.3, 2.4, 2.5, 2.6, 2.7, 2.8, 2.9, 3.0, 3.1, 3.2, 3.3, 3.4, 3.5, 3.6, 3.7, 3.8, 3.9, 4.0, 4.1, 4.2, 4.3, 4.4, 4.5, 4.6, 4.7, 4.8, 4.9, 5.0, 5.1, 5.2, 5.3, 5.4, 5.5, 5.6, 5.7, 5.8, 5.9, or 6.0 nm, or any range derivable therein. The length of the central axis may range from about 2 nm to about 6 nm. Optionally, the length is about, at least about, or at most about 2, 2.1, 2.2, 2.3, 2.4, 2.5, 2.6, 2.7, 2.8, 2.9, 3.0, 3.1, 3.2, 3.3, 3.4, 3.5, 3.6, 3.7, 3.8, 3.9, 4.0, 4.1, 4.2, 4.3, 4.4, 4.5, 4.6, 4.7, 4.8, 4.9, 5.0, 5.1, 5.2, 5.3, 5.4, 5.5, 5.6, 5.7, 5.8, 5.9, or 6.0 nm, or any range derivable therein. When referring to "diameter" herein, one can determine a diameter by measuring center-to-center distances or atomic surface-to-surface distances.

The term "constriction zone" generally refers to the narrowest portion of the tunnel of the nanopore, in terms of diameter, that is connected to the vestibule. The length of the constriction zone can range, for example, from about 0.3 nm to about 20 nm. Optionally, the length is about, at most about, or at least about 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 2, or 3 nm, or any range derivable therein. The diameter of the constriction zone can range from about 0.3 nm to about 2 nm. Optionally, the diameter is about, at most about, or at least about 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 2, or 3 nm, or any range derivable therein. In other embodiments, such as those incorporating solid state pores, the range of dimension (length or diameter) can extend up to about 20 nm. For example, the constriction zone of a solid state nanopore is about, at most about, or at least about 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 nm, or any range derivable therein. Larger dimensions in such nanopores can be preferable depending on the analyte. As described in more detail below, the constriction zone is generally the part of the nanopore structure where the presence of a polymer analyte, such as a nucleic acid, can influence the ionic current from one side of the pore to the other side of the nanopore. FIGURE 2 provides an illustrative diagram of a constriction zone (60) that is sensitive to a subsequence of several nucleotides of a polymer (represented by a ball and stick chain). In this example, a specific position within the constriction zone (60) has the highest sensitivity for determining the current through the nanopore. Thus, the nucleotide residing in that position at any time will provide the greatest influence on the current signal and the neighboring nucleotides in the constriction zone have diminished influence on the signal. Accordingly, the dimensions of the nanopore's constriction zone can influence the resolution of the current signal as it relates to the structure (and sequence identity) of the analyte polymer residing therein. In some instances, the term "constriction zone" is used in a functional context based on the obtained resolution of the nanopore and, thus, the term is not necessarily limited by any specific parameter of physical dimension. Thus, a nanopore's functional constriction zone can be optimized by modifying aspects of the nanopore system but without providing for any physical modification to the nanopore itself.
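
By way of a non-limiting illustration, the position-dependent sensitivity described above can be sketched as a weighted average in which the central nucleotide carries the largest weight and its neighbors carry progressively less. The Gaussian weighting is an assumption chosen for illustration because Gaussian curves are fitted in the figures; the function names, the per-nucleotide contribution values, and the chosen width are hypothetical.

```python
import math


def gaussian_weights(n_positions, center, fwhm):
    """Normalized Gaussian sensitivity weights over nucleotide positions.

    `center` is the most sensitive position and `fwhm` is the full width
    at half maximum of the sensitivity, both in units of nucleotides.
    """
    if fwhm <= 0:
        raise ValueError("fwhm must be positive")
    # Convert FWHM to the Gaussian standard deviation.
    sigma = fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    raw = [math.exp(-((k - center) ** 2) / (2.0 * sigma ** 2))
           for k in range(n_positions)]
    total = sum(raw)
    return [w / total for w in raw]


def weighted_signal(contributions, center, fwhm):
    """Sensitivity-weighted average of hypothetical per-nucleotide contributions."""
    weights = gaussian_weights(len(contributions), center, fwhm)
    return sum(w * c for w, c in zip(weights, contributions))


if __name__ == "__main__":
    # Made-up contributions: a single differing nucleotide at position 5.
    strand = [1.0] * 11
    strand[5] = 2.0
    for fwhm in (1.0, 2.0, 4.0):
        print(fwhm, weighted_signal(strand, center=5, fwhm=fwhm))
```

In this sketch a narrower width concentrates the weight on the central position, so a single differing nucleotide shifts the weighted value more, which mirrors the relationship between the functional constriction zone and signal resolution described above.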

In some embodiments, the nanopore can be a solid state nanopore. A solid-state layer is not of biological origin. In other words, a solid state layer is not derived from or isolated from a biological environment such as an organism or cell, or a synthetically manufactured version of a biologically available structure. Solid state nanopores can be produced as described in U.S. Patent Nos. 7,258,838 and 7,504,058, incorporated herein by reference in their entireties. Briefly, solid state layers can be formed from both organic and inorganic materials including, but not limited to, microelectronic materials, insulating materials such as Si3N4, Al2O3, and SiO, organic and inorganic polymers such as polyamide, plastics such as Teflon® or elastomers such as two-component addition-cure silicone rubber, and glasses. The solid state layer may be formed from graphene. Suitable graphene layers are disclosed in WO 2009/035647 and WO 2011/046706. Solid state nanopores have the advantage that they are more robust and stable. Furthermore, solid state nanopores can in some cases be multiplexed and batch fabricated in an efficient and cost-effective manner. Finally, they might be combined with microelectronic fabrication technology. In some embodiments, the nanopore comprises a hybrid protein/solid state nanopore in which a nanopore protein is incorporated into a solid state nanopore. In some embodiments, the nanopore is a biologically adapted solid-state pore.

In some cases, the nanopore is disposed within a membrane, thin film, layer, or bilayer. For example, biological (e.g., proteinaceous) nanopores can be inserted into an amphiphilic layer such as a biological membrane, for example a lipid bilayer. An amphiphilic layer is a layer formed from amphiphilic molecules, such as phospholipids, which have both hydrophilic and lipophilic properties. The amphiphilic layer may be a monolayer or a bilayer. The amphiphilic layer may be a co-block polymer. Alternatively, a biological pore may be inserted into a solid state layer.

The membrane, thin film, layer, or bilayer typically separates a first conductive liquid medium and a second conductive liquid medium to provide a nonconductive barrier between the first conductive liquid medium and the second conductive liquid medium. The nanopore, thus, provides liquid communication between the first and second conductive liquid media. In some embodiments, the pore provides the only liquid communication between the first and second conductive liquid media. The liquid media typically comprise electrolytes or ions that can flow from the first conductive liquid medium to the second conductive liquid medium through the interior of the nanopore. Liquids employable in methods described herein are well-known in the art. Descriptions and examples of such media, including conductive liquid media, are provided in U.S. Patent No. 7,189,503, for example, which is incorporated herein by reference in its entirety. The first and second liquid media may be the same or different, and either one or both may comprise one or more of a salt, a detergent, or a buffer. Indeed, any liquid media described herein may comprise one or more of a salt, a detergent, or a buffer. Additionally, any liquid medium described herein may comprise a viscosity-altering substance or a velocity-altering substance.

The polymer analyte (e.g., nucleic acid) serving as the target or focus of an analysis is capable of interacting with the nanopore and translocating, preferably in a linear fashion, through the nanopore to the other side. As used herein, the terms "interact" or "interacting," indicate that the analyte moves into at least an interior portion of the nanopore and, optionally, moves through the nanopore. As used herein, the terms "through the nanopore" or "translocate" are used to convey that at least some portion (i.e., at least one subunit) of the polymer analyte enters one side of the nanopore and moves to and out of the other side of the nanopore. In some cases, the first and second conductive liquid media located on either side of the nanopore are referred to as being on the cis and trans regions, where the polymer analyte to be measured generally translocates from the cis region to the trans region through the nanopore. This can be represented with reference to FIGURE 1, wherein an ssDNA (10) is illustrated as residing along the entire internal tunnel of the MspA nanopore (30). The area above the lipid bilayer (20) is the cis compartment that contains the negative pole and the area below the lipid bilayer (20) is the trans compartment that contains the positive pole. Downward movement of the ssDNA (10) through the nanopore would be a translocation of the ssDNA from the cis to the trans regions. However, it will be appreciated that in some embodiments, the polymer analyte to be analyzed can translocate from the trans region to the cis region through the nanopore. In some cases, the entire length of the polymer does not pass through the pore, but portions or segments of the polymer pass through the nanopore for analysis. The directionality and rate of translocation can be regulated using various mechanisms such as applied current, additional molecular brakes or motors, or the incorporation of a nanopore in the reverse orientation, as described in more detail below.

Nanopore systems also incorporate structural elements to measure and/or apply an electrical potential across the nanopore-bearing membrane or film. For example, the system can include a pair of drive electrodes that drive current through the nanopores. Typically, the negative pole is disposed in the cis region and the positive pole is disposed in the trans region. Additionally, the system can include one or more measurement electrodes that measure the current through the nanopore. These can include, for example, a patch-clamp amplifier or a data acquisition device. For example, nanopore systems can include an Axopatch-1B patch-clamp amplifier (Axon Instruments, Union City, CA) to apply voltage across the bilayer and measure the ionic current flowing through the nanopore. For example, in some embodiments, the applied electrical field includes a direct or constant current that is between about 10 mV and about 1 V. In some embodiments that include protein-based nanopores embedded in lipid membranes, the applied current includes a direct or constant current that is between about 10 mV and 300 mV, such as about 10 mV, 20 mV, 30 mV, 40 mV, 50 mV, 60 mV, 70 mV, 80 mV, 90 mV, 100 mV, 110 mV, 120 mV, 130 mV, 140 mV, 150 mV, 160 mV, 170 mV, 180 mV, 190 mV, 200 mV, 210 mV, 220 mV, 230 mV, 240 mV, 250 mV, 260 mV, 270 mV, 280 mV, 290 mV, 300 mV, or any voltage therein. In some embodiments, the applied electrical field is between about 40 mV and about 200 mV. In some embodiments, the applied electrical field includes a direct or constant current that is between about 100 mV and about 200 mV. In some embodiments, the applied electrical direct or constant current field is about 180 mV. In other embodiments where solid state nanopores are used, the applied direct or constant current electrical field can be in a similar range as described, up to as high as 1 V.
As will be understood, the voltage range that can be used can depend on the type of nanopore system being used and the desired effect.

As will be described in more detail below, the present disclosure addresses the application of a non-constant current to the nanopore system as the nucleic acid is translocating. Thus, illustrative parameters and approaches are provided in more detail below.

In some instances, the electrical potential applied can be sufficient to translocate a polymer analyte (e.g., nucleic acid) through the nanopore. However, as described above, in many nanopore systems, additional components are incorporated that facilitate or regulate the translocation of the polymer through the nanopore. In some embodiments, the component is a molecular brake, which is a moiety that regulates the rate of translocation of the polymer analyte through the nanopore. Molecular brakes can be any protein, enzyme, or molecule that can associate with the polymer analyte (e.g., nucleic acid analyte). Often, the molecular brake has a physical size or shape that prevents its own translocation through the nanopore or is or comprises a tether to an immobile object. For example, FIGURE 3 illustrates a molecular brake (30) that is linked to a ssDNA molecule (90) by a biotin linker (80). As described in more detail below, an exemplary system proved useful in the present studies to assess the elasticity parameters exhibited by ssDNA in the nanopore under variable applied forces.

In some embodiments, the molecular brake facilitates movement of a nucleic acid polymer through the nanopore. The molecular brake can be active, i.e., using energy such as ATP to move the nucleic acid. Such molecular brakes can encompass moieties that can move the DNA against the force direction applied by the voltage across the nanopore. Alternatively, the molecular brake can be passive, i.e., not using energy to move the nucleic acid. Molecular brakes can be any known molecular motor that associates with nucleic acids. Typically, such molecular motors are enzymes that are known to associate with and manipulate nucleic acids. In many instances, molecular motors will allow or regulate the translocation of the nucleic acid in a stepwise fashion where the nucleic acid progresses in discrete movements of a relatively consistent length, akin to a ratchet or queuing motion. For example, some molecular motors facilitate translocation of the nucleic acid in single-nucleotide steps. However, it will be appreciated that other molecular motors are useful for translocating the nucleic acid in steps that are less than a single nucleotide in length. Yet other molecular motors are useful for translocating the nucleic acid in steps that are more than a single nucleotide in length. As described above, in many nanopore systems, a nucleic acid will pass through the nanopore too quickly to obtain informative current signals. Thus, with many molecular brakes the translocation velocity, or an average translocation velocity, is less than the translocation velocity that would occur without the molecular brake. While the molecular brakes are discussed above with general reference to regulating, and often slowing, the translocation rate, persons familiar with the art will appreciate that some molecular brakes, such as molecular motors, can be used to speed up or maintain constant translocation rates.
Accordingly, the term molecular brake as used herein can encompass moieties that can move the polymer analyte with or against the force applied by an electrical current or flow across the pore.

In some embodiments, the molecular brake is an enzyme or derived from an enzyme. In some embodiments, the molecular brake is modified to remove a particular function from the enzyme, but preserves the ability of the brake to associate with the polymer analyte (e.g., nucleic acid) and facilitate its translocation. In some embodiments, the enzyme is or is derived from a member of any of the Enzyme Classification (EC) groups 3.1.11, 3.1.13, 3.1.14, 3.1.15, 3.1.16, 3.1.21, 3.1.22, 3.1.25, 3.1.26, 3.1.27, 3.1.30 and 3.1.31. The enzyme may be any of those disclosed in International Publication No. WO 2010/086603, incorporated herein by reference in its entirety.

In some embodiments, the enzyme is a translocase, a polymerase, a helicase, an exonuclease, or topoisomerase, and the like.

Exemplary helicases useful for this application are generally described in WO 2010/086603, incorporated herein by reference in its entirety. Other examples are exonucleases, which can include exonuclease I, exonuclease III, lambda exonuclease, or a variant or homolog thereof. For any aspect herein, homologs, derivatives, and other variant proteins, as described herein, can preferably be at least 50% homologous to the reference protein based on amino acid sequence identity. More preferably, the variant polypeptide may be at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90% and more preferably at least 95%, 97% or 99% homologous based on amino acid identity to the reference protein. Homology can be determined by any method accepted in the art. It is noted that homologs or variants can possess sequence and structural modifications so long as they retain the proper functionality for translocation. While exonucleases often contain enzymatic functions for excising portions of the nucleic acids, such enzymes can be modified to ablate such nuclease function while preserving the ability to bind and translocate the nucleic acid polymer.

Exemplary helicases useful for this application are generally described in WO 2014/013260 and WO 2013/057495, each reference incorporated herein by reference in its entirety, and can include a Hel308 helicase, a RecD helicase, a TraI helicase, a TraI subgroup helicase, an XPD helicase, or a variant or homolog thereof.

Exemplary polymerases useful for this application include DNA polymerases such as phi29 DNA polymerase (sometimes referred to as phi29 DNAP), Klenow fragment, or a variant or homolog thereof. For example, as described in more detail below, a DNA polymerase such as phi29 can be used to facilitate movement of the DNA polymer through the nanopore. See, e.g., Cherf, G.M., et al., "Automated forward and reverse ratcheting of DNA in a nanopore at 5-Å precision," Nat. Biotechnol. 30:344-348 (2012), and Manrao et al., "Reading DNA at single-nucleotide resolution with a mutant MspA nanopore and phi29 DNA polymerase," Nat. Biotechnol. 30:349-353 (2012), each of which is incorporated herein by reference in its entirety.

Exemplary topoisomerases can include a gyrase, or a variant or homolog thereof.

The method according to this aspect comprises translocating the nucleic acid through a nanopore from the first conductive liquid medium to the second conductive liquid medium, applying a non-constant electrical potential between the first conductive liquid medium and the second conductive liquid medium as the nucleic acid translocates through the nanopore, and measuring a plurality of current signals between the first conductive liquid medium and the second conductive liquid medium as the nucleic acid translocates through the nanopore. Additionally, the method comprises identifying a subset of the plurality of measured current signals associated with a single translocation step of the nucleic acid, and deriving a current-potential curve from the subset of current signals to provide a current pattern. In some embodiments, the method also comprises determining a characteristic of the nucleic acid based on the current pattern.

Non-constant electrical potential is applied across the nanopore during the translocation of the nucleic acid. As described above, this is because the inventors have observed the stretch of DNA in the nanopore during the increasing force applied by increasing voltage. The inventors were able to model the stretching phenomenon and characterize the movement of reference nucleic acids within the nanopore constriction zone at different forces (i.e., voltages). Thus, the inventors determined that a shift in the DNA could be imposed, resulting in a corresponding shift in observed current through the pore. With the application of the non-constant electrical potential during translocation the DNA is stretched, and in some embodiments contracted again, once or multiple times, to provide data that ultimately permits simulation of the continuous passage of the nucleic acid through the nanopore.

The non-constant electrical potential can have a periodic function or can be random (non-periodic). However, whatever the pattern, the user should know the non-constant potential and be able to correlate it to the measured current signals at any given time. In some embodiments, the non-constant electrical potential can be applied across the nanopore by superimposing a non-constant (varying) electrical potential over a constant, direct potential. The general parameters of the constant, direct potential are described above. In some embodiments, the non-constant electrical potential is superimposed onto a constant potential that is between 10 mV and 1 V. In some embodiments, such as embodiments incorporating a biological, or proteinaceous, nanopore, the non-constant electrical potential is imposed over a constant electrical potential that is between 10 mV and 300 mV.

In some embodiments, the electrical potential has a periodic function. The periodic function can be asymmetric, symmetric, regular or irregular. For example, in some embodiments, the potential can be applied continuously for a period of time, i.e. for a partial period of the period cycle, with a transition between those different potentials, for example a square wave or stepped wave. The transitions between the voltage levels can be sharp or can be ramped over a period of time. In other embodiments, the voltage level can vary continuously, for example a triangular, sawtooth, or sinusoidal wave. The inventors observed that the stretch behavior of the DNA is not linearly correlated with increasing potential. Thus, the periodic function can be customized to achieve a substantially linear rate DNA movement during a period of increasing or decreasing potential in the periodic function. For example, observing that the rate of stretching of DNA is reduced at the highest applied voltages, a triangular periodic function might be modified to enhance the time spent at the lower voltages by imposing a low rate of voltage increase at the beginning of the cycle, but followed by an increasing slope until the maximum voltage. The descending portion of the cycle can mirror the increase, such that the time spent at the higher voltages is minimized and the time spent at lower voltages is maximized.
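As a purely illustrative sketch of such a periodic function, the following Python snippet generates a symmetric triangular potential between a minimum and a maximum voltage. The function name, the 5 ms period, the 100 to 200 mV range, and the `shape` exponent are assumptions chosen for illustration and are not parameters prescribed by this disclosure. A `shape` value above 1 slows the voltage change near the minimum and steepens it near the maximum, which is one possible way to spend more time at the lower voltages.

```python
def triangle_potential(t_ms, period_ms=5.0, v_min_mv=100.0, v_max_mv=200.0, shape=1.0):
    """Return the applied potential (mV) at time t_ms.

    Hypothetical helper: a symmetric triangular wave that rises from
    v_min_mv to v_max_mv during the first half period and mirrors the
    rise during the second half. shape > 1 gives a low rate of voltage
    increase near v_min_mv and a steeper slope near v_max_mv.
    """
    phase = (t_ms % period_ms) / period_ms          # position in cycle, 0..1
    frac = 2.0 * phase if phase < 0.5 else 2.0 * (1.0 - phase)
    return v_min_mv + (v_max_mv - v_min_mv) * frac ** shape


# Sample one period at 0.25 ms intervals (assumed sampling interval).
samples = [triangle_potential(0.25 * k) for k in range(20)]
```

Because the potential never drops below the chosen minimum, this waveform does not reverse the polarity of the system.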

The period or cycle of the periodic function is typically shorter than the time required for each translocation step of the nucleic acid. Often, it is preferred to have multiple (and even numerous) periodic function periods occur during each single translocation step as it permits the acquisition of multiple data points at each potential level applied. In some embodiments, the period of the periodic function is between about 0.1 ms and about 1 s (inclusive). In a further embodiment, the period is between about 1.0 ms and about 0.5 s, for example between about 1.0 ms and about 0.1 s, between about 1.0 ms and about 0.01 s, between about 5.0 ms and about 0.01 s, between about 10 ms and about 0.01 ms, between about 5.0 ms and about 0.001 s, between about 10 ms and about 0.001 s, and the like. In some embodiments, the difference between the minimum and maximum potential in the periodic function is less than 250 mV. This indicates that in these embodiments, the periodic function does not fluctuate by more than 250 mV, 240 mV, 230 mV, 220 mV, 210 mV, 200 mV, 190 mV, 180 mV, 170 mV, 160 mV, 150 mV, 140 mV, 130 mV, 120 mV, 110 mV, 100 mV, or less.

Generally, the imposition of the non-constant potential does not reverse the polarity of the nanopore system.

Raw current can be continuously or periodically monitored during the application of the non-constant potential, as described above, to produce the plurality of current signals. Preferably multiple current signals are obtained during the translocation of each nucleotide. FIGURES 18A-18C illustrate raw current signals obtained during the translocation of a ssDNA through MspA with increasing resolution (i.e., with decreasing time scales). As illustrated, hundreds of current signals can be recorded per second. Based on the plurality of current signals, a subset of current signals associated with a single translocation step is identified. The translocation step can merely be the passage of nucleic acid by one nucleotide subunit. The translocation step can be driven or facilitated by a molecular brake, many of which drive the translocation in single, discrete steps. The subset of current signals can be identified according to any known technique. For example, as described in more detail below, algorithms can be applied to the data to ascertain pattern shifts in the current signals indicating the passage of a nucleotide length, and thus a new k-mer subsequence in the nanopore constriction zone. Such algorithms can be referred to as level finders, with the term level referring to the current signals (often taking the form of a "block" or "level") corresponding to a k-mer with a specific sequence within the constriction zone. The arrows in FIGURES 18C and 19 indicate the transition between levels.
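By way of a simplified, hypothetical illustration only, a level finder for data acquired under a periodic potential might first average the current over each full voltage cycle and then flag a new level whenever a cycle average departs from the running average of the present level. The function names and the threshold value below are assumptions for illustration, and this sketch is not the level finder used in the studies described herein.

```python
def cycle_means(currents_pa, samples_per_cycle):
    """Average the raw current over each complete voltage cycle."""
    n_cycles = len(currents_pa) // samples_per_cycle
    return [
        sum(currents_pa[c * samples_per_cycle:(c + 1) * samples_per_cycle])
        / samples_per_cycle
        for c in range(n_cycles)
    ]


def find_level_starts(means, threshold_pa):
    """Return the cycle indices at which a new level begins.

    A new level starts when a cycle mean differs from the running mean
    of the present level by more than threshold_pa (assumed value).
    Limitation: a transition that falls in the middle of a cycle yields
    an intermediate mean and may be reported as a short extra level.
    """
    starts = [0]
    level_sum, level_n = means[0], 1
    for idx in range(1, len(means)):
        if abs(means[idx] - level_sum / level_n) > threshold_pa:
            starts.append(idx)
            level_sum, level_n = means[idx], 1
        else:
            level_sum += means[idx]
            level_n += 1
    return starts


# Synthetic trace: three cycles at one level, then two at a higher level.
trace = [40, 50, 60, 50] * 3 + [70, 80, 90, 80] * 2
level_starts = find_level_starts(cycle_means(trace, 4), threshold_pa=10.0)
```

Averaging over whole cycles removes the voltage-driven oscillation from the comparison, so that only shifts attributable to a new k-mer in the constriction zone trigger a transition.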

Once the subset of current signals associated with a single translocation step is identified, a current-potential curve is derived from the subset of current signals. Briefly, the measured current is associated with the particular electrical potential, usually indicated as a specific voltage, which had been applied to the nanopore system as the current was observed. Because the method employs a non-constant electrical potential, current signals will be associated with at least two distinct potentials to provide the current-potential curve. FIGURE 20 provides an illustrative example of a series of 16 distinct current-potential curves, each corresponding to sequential translocation steps. Each of the illustrated curves contains hundreds of current to voltage (I-V) plots over a range of 100 to 200 mV.
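The pairing of each current signal with the potential applied at the time of measurement can be sketched as follows. This is a minimal illustration that assumes the applied voltage is recorded alongside each current sample; the function name and the 10 mV bin width are hypothetical.

```python
from collections import defaultdict


def current_potential_curve(voltages_mv, currents_pa, bin_mv=10.0):
    """Group the current samples of one translocation step by applied voltage.

    Returns a list of (voltage, mean current) pairs sorted by voltage.
    bin_mv is an assumed bin width used to pool samples taken at nearly
    the same potential.
    """
    bins = defaultdict(list)
    for v, i in zip(voltages_mv, currents_pa):
        bins[round(v / bin_mv) * bin_mv].append(i)
    return sorted((v, sum(vals) / len(vals)) for v, vals in bins.items())


# Samples from one level recorded while the potential cycled 100 -> 200 -> 100 mV.
curve = current_potential_curve(
    [100, 150, 200, 150, 100, 150],
    [20, 30, 40, 32, 22, 31],
)
```

Running several cycles within a single translocation step places multiple samples in each bin, which is why multiple periods per step are preferred.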

As is apparent from FIGURE 20, each current-potential curve contains a vast amount of information regarding the corresponding sequence of the k-mer residing in the constriction zone of the nanopore at the time the corresponding subset of raw current signals was recorded. For instance, the various current-potential curves illustrated contain different lengths, slopes, curvatures, minimum and maximum currents, and the like, all of which can be used to uniquely reflect the structure (e.g., sequence) of the k-mer residing in the constriction zone. Thus, in some embodiments, the current-potential curve is the current pattern and, thus, the characteristics of the nucleic acids can be identified based on the current-potential curve. Use of the term "curve" notwithstanding, it will be appreciated that there is no implied requirement that the relationship of the data creating the curve be displayed, such as in FIGURE 20, for the relationship (i.e., curve) to be used according to the disclosed methods of determining a characteristic of the nucleic acid.

As will be described in more detail below, in other embodiments, corrections, conversions, or modifications can be applied to the current data, including the current-potential curve, to provide the current pattern. Such corrections, conversions, or modifications can be advantageous in various circumstances to correct the signal for various perceived "non-linearities" or other aberrations in the system that can otherwise obfuscate the relationship between the structure of the k-mer in the constriction zone and the observed current.

As will be appreciated by persons of ordinary skill in the art, the current observed in nanopore systems may not be entirely attributable to the flow of ions through the nanopore. Accordingly, in some embodiments, the method also comprises a step to correct for measured current not associated with the flow of ions through the nanopore, and thus reduce the influence of this current on the measured current signal. Ultimately, the goal is to obtain observations of the ion current through the pore at any given time without influence or obfuscation by other types of measured currents. For example, the measured current can be affected by the capacitance of various components of the nanopore system. For example, components that can exhibit capacitance are the bilayer membrane, the electronics, or the nanopore setup. Thus, the method can include a step of correcting for any capacitance of the system to reduce the influence on the measured current signal. This correction can be implemented by hardware, by providing compensatory electrical potential (e.g., injecting currents as appropriate) to substantially offset any current resulting from charging of capacitances of the nanopore system. Alternatively or in addition, the measured current signal itself can be corrected to compensate for any capacitive currents within the nanopore system. This can be accomplished according to any known approach (see below description), and usually involves adjustment of the overall measured data.
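One possible form of the data-side correction can be sketched under the assumption that the system behaves as a single lumped capacitance C, so that the capacitive current is C multiplied by the rate of change of the applied voltage. The function name and the capacitance value are hypothetical, and real systems may require a more detailed model. With capacitance in pF and voltage slope in mV/ms, the product is in pA.

```python
def subtract_capacitive_current(times_ms, voltages_mv, currents_pa, capacitance_pf):
    """Remove the estimated capacitive current C * dV/dt from each sample.

    Assumes a single lumped capacitance (hypothetical simplification).
    dV/dt is estimated by a central difference, falling back to a
    one-sided difference at the two ends. Requires at least two samples.
    Units: pF * (mV / ms) = pA.
    """
    last = len(currents_pa) - 1
    corrected = []
    for k in range(len(currents_pa)):
        lo, hi = max(k - 1, 0), min(k + 1, last)
        dv_dt = (voltages_mv[hi] - voltages_mv[lo]) / (times_ms[hi] - times_ms[lo])
        corrected.append(currents_pa[k] - capacitance_pf * dv_dt)
    return corrected


# A 10 mV/ms ramp across an assumed 2 pF capacitance adds 20 pA to every sample.
ionic = subtract_capacitive_current(
    [0, 1, 2, 3], [100, 110, 120, 130], [70, 75, 80, 85], capacitance_pf=2.0
)
```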

In another embodiment, the method further comprises converting the current-potential curve into a conductance-voltage relationship at a given voltage. This entails removing the voltage dependency of the data, which heretofore reflects the different observed currents at multiple voltages (as applied as a non-constant potential). This conversion can be performed according to any known methods, such as described in more detail below. Generally speaking, once the relationship between current and voltage across any given subset is known (corresponding to a particular k-mer), the current for each current data point can be adjusted to an expected current at a single, constant reference voltage of choice. This current transformation emulates the current that would be found if the ssDNA nucleotides were positioned at the given location while a constant potential was applied. It is noted that the exemplary figures illustrating the data conversions, such as FIGURES 22A and 22B, illustrate current relative to another parameter. It is noted that conductance is current divided by voltage. As all of the conductance values plotted are multiplied by a constant voltage (typically 180 mV), the conductance value is converted to current for purposes of these illustrations. Furthermore, as described in more detail below, it was observed that the MspA nanopore system was non-ohmic, i.e., had a non-linear resistance to voltage relationship. See, e.g., FIGURE 4. Thus, the conversion can include correcting for the nonlinearity of the current to voltage relationship. An exemplary correction is described in more detail below, where the non-linear relationships were fit to a quadratic function. Linear fits were made to each function. As illustrated in FIGURE 21, the plot of the corrected current is constant for the various homopolymers tested and, thus, no longer reflects variation in the applied voltage, confirming the robustness of the conversion.
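The simplest part of this conversion, dividing each current by its applied voltage to obtain a conductance and multiplying by a constant reference voltage, can be sketched as follows. The function name is hypothetical, and the sketch deliberately omits the non-ohmic (quadratic) correction, so it is exact only for an ohmic current to voltage relationship.

```python
def to_reference_current(voltages_mv, currents_pa, v_ref_mv=180.0):
    """Rescale each current sample to the reference voltage v_ref_mv.

    conductance = current / voltage; the conductance is then multiplied
    by the constant reference voltage so the result is again a current.
    No correction for non-ohmic behavior is applied in this sketch.
    """
    return [i / v * v_ref_mv for v, i in zip(voltages_mv, currents_pa)]


# For an ohmic pore, samples taken at different voltages collapse to one value.
rescaled = to_reference_current([100.0, 200.0], [25.0, 50.0])
```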
In another embodiment, the method further comprises converting the current-potential curve into a current-nucleic acid distance curve to provide a current pattern corresponding to a segment of the nucleic acid residing in the constriction zone of the nanopore associated with the single translocation step. This conversion is the result of modeling work described below wherein the elasticity of ssDNA within the nanopore constriction zone was modeled as a spring, thus enabling the characterization of the specific positions of the reference nucleotides during the stretch as increasing force was applied. Using such a model, the fluctuations in applied potential are predictive of the spatial relationship of the nucleotides in the k-mer relative to the nanopore constriction zone. Thus, the current-potential curves can be converted to show the current-spatial relationships of the k-mer. This effectively describes the current-nucleotide distance (within the nanopore) relationship in the subset of current signals. In some embodiments, the conversion of the current-potential curve into a current-nucleic acid distance curve is accomplished by application of a spring-based model. In some embodiments, the model is a model of a spring with a linear restoring force. In some embodiments, the model has a non-linear restoring force, as in a freely jointed chain (FJC) model or modified freely jointed chain (FJC) model, as described in more detail below. Other appropriate models can be applied according to the skill in the art.

Unless otherwise noted, the conversion and correction steps described can be applied optionally and independently to the observed current signals and/or current-potential curves.

As described above, the inventors have discovered that the data derived from applying a non-constant electric potential during translocation, after a specific conversion process, unexpectedly revealed a great quantity of high-resolution data regarding the passage of each nucleotide through the nanopore. Thus, as above, the current-potential curves, or any conversions thereof, such as the current-nucleic acid distance curves, provide a high order of detailed information that corresponds to the specific k-mer structure (e.g., sequence) passing through the nanopore constriction zone at any given time, even when the translocation is a ratcheted, stepwise process facilitated by a molecular brake. The data obtained for the subset can be useful for predicting the sequence of the corresponding k-mer in the nucleic acid. In some embodiments, the method is repeated as described above, including any optional conversion step, for one or more additional subsets of data to provide a plurality of current patterns. Preferably, the subsets correspond to sequential translocation steps, e.g., the passage of a contiguous sequence of nucleotide subunits through the nanopore. Accordingly, information (i.e., current patterns as described above) can be obtained for each step. When aggregated, these current patterns provide even further analytic advantages. In one respect, the prediction of a series of k-mer sequences, even if there is redundancy in curves reflecting various k-mers, can be useful to predict with high probability the sequences of the k-mers. This is based on the fact that with the passage of a single nucleotide, the initial k-mer will overlap with the subsequent k-mer by all but one nucleotide. Thus, knowledge of the sequence (or potential sequences) of a first k-mer is useful to deduce the potential sequence(s) of the subsequent k-mers.
Such analyses can be performed with the application of known algorithms, to determine which of the multiple possible multi-subunit polymer sequences is correct. For example, Hidden-Markov Models (HMM) are particularly suited for recovering sequence information for this system. See, e.g., Timp W, et al., 2012, Biophys. J. 102:L37-L39, incorporated herein by reference in its entirety. Other descriptions of similar deconvolutions are provided in WO 2013/041878, also incorporated herein by reference in its entirety. For example, as a general illustration of an embodiment, all potential k-mer nucleic acid sequences for a degenerate sequence are stored. A first output signal is measured to provide a first current pattern, as described above. A second, subsequent output signal is assessed to provide one or more potential k-mer nucleic acid sequences for the second current pattern. All potential k-mer nucleic acid sequences for the first current pattern that are incompatible with the potential k-mer nucleic acid sequences for the second current pattern are discarded. This process can be repeated for more k-mer current patterns. Thus, even if the first measurement yielded many possible k-mer nucleic acid sequences, it is likely that after several measurements there will only be one or a few possible sequences that are consistent with all the measurements.
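The discarding of incompatible candidates can be illustrated with the following simplified sketch, which is not a Hidden-Markov Model. It assumes that each translocation step advances the nucleic acid by exactly one nucleotide, so that the last k-1 nucleotides of a k-mer must equal the first k-1 nucleotides of the next k-mer; the reading direction, the function name, and the candidate sets are assumptions for illustration. As an extension of the illustration above, the sketch also discards candidates for the later pattern that have no compatible predecessor, and repeats until no further candidates are removed.

```python
def prune_candidates(candidates_per_step):
    """Discard k-mer candidates that cannot overlap with a neighboring step.

    candidates_per_step: list of sets of equal-length k-mer strings, one
    set per sequential translocation step. A candidate a at one step is
    compatible with a candidate b at the next step when a[1:] == b[:-1]
    (assumed single-nucleotide steps in one direction).
    """
    pruned = [set(c) for c in candidates_per_step]
    changed = True
    while changed:
        changed = False
        for s in range(len(pruned) - 1):
            first, second = pruned[s], pruned[s + 1]
            keep_first = {a for a in first if any(a[1:] == b[:-1] for b in second)}
            keep_second = {b for b in second if any(a[1:] == b[:-1] for a in keep_first)}
            if keep_first != first or keep_second != second:
                pruned[s], pruned[s + 1] = keep_first, keep_second
                changed = True
    return pruned


# Three sequential current patterns, each consistent with several 4-mers.
remaining = prune_candidates([
    {"ACGT", "ACGA", "TTTT"},
    {"CGTA", "CGAC"},
    {"GTAC"},
])
```

In this example a single candidate survives at each step, consistent with the observation that several measurements together can resolve an ambiguity that a single measurement cannot.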

Thus, the method can also comprise the step of comparing one or more of the multiple current patterns to current patterns from reference nucleic acids with known correlations between current patterns and sequence, for example, in the form of a look-up table. Such tables can be generated according to ordinary skill in the art. Furthermore, the aggregation of current patterns for sequential translocation steps can provide even further advantages, such as the ability to detect discontinuities in the translocation process, and thus deficiencies in the data. As illustrated in FIGURES 22A and 22B, the aggregation of current patterns along a distance axis, such as nucleotide or nm, respectively, provides a continuous current profile that simulates the dynamic fluctuation of current that would be caused by a nucleic acid polymer during a continuous (not ratcheted, stepwise) translocation through the nanopore. However, the generated continuous current profile can reveal discontinuities, or aberrations from the continuous pattern. In some embodiments, the discontinuity reflects a forward skip in translocation. This could occur, for instance, when a molecular brake that normally permits translocation in single nucleotide steps permits two or more nucleotides to translocate without an intervening pause. In other embodiments, the discontinuity reflects a backstep during translocation. This can occur when the molecular brake pulls the nucleic acid backward (e.g., from trans to cis) for a length of the nucleic acid, and then resumes normal forward translocation, effectively allowing the same subsequence to translocate multiple times. Such discontinuities can be readily observed in the generated continuous current profile and disregarded or otherwise accounted for. For skips, often the missing data can be reasonably inferred by imposing a reasonable connection between disconnected portions of the profile. Thus, in some embodiments, the discontinuities are corrected.

An illustrative, non-limiting embodiment of the present method is represented schematically in FIGURE 17. Briefly, DNA is translocated through a nanopore while applying a non-constant potential, and current signals are acquired (110). Exemplary illustrations of the raw signal data are provided in FIGURES 18A-19. Optionally, the capacitance of the system is corrected by injecting compensatory potential (150) and/or by data correction of the acquired current signals (160). Subsets of data corresponding to individual enzyme-assisted translocation steps are identified (120) by, for example, the use of a level finder algorithm. Such level transitions between the subsets are illustrated in FIGURES 18C and 19. The current-potential (voltage) curves are derived for one or more subsets (130), as illustrated in FIGURE 20. The obtained current-potential (voltage) curves can be used to estimate the sequence of one or more nucleotide subunits of the analyte nucleic acid (140). Optionally, the current-potential (voltage) curves can further be subjected to adjustments, including conversion to remove the voltage dependency (whether linear or non-linear) of the current (170) and/or conversion to a current-nucleic acid position relationship to provide a continuous current profile based on spring modeling (180). See, e.g., FIGURES 22A and 22B. With the generation of a continuous current profile, discontinuities in the translocation process, and resulting data, can be identified and addressed (190).

It will be appreciated that a characteristic of the nucleic acid polymer that is determined in this method can be the identity of one or more of the nucleic acid subunits therein. In other embodiments, the characteristic is the presence of one or more modified nucleotide subunits. Furthermore, actual identity does not need to be determined. In other embodiments, for example, the method can be used to characterize patterns within the structure (independent of specific sequences) such as a "fingerprint" that can be used to distinguish the nucleic acid relative to another nucleic acid. In some embodiments, the sequence identity is determined for one, two, or more (including all) of the nucleotide subunits in the nucleic acid.

Additionally, it will be appreciated that given the ability of some nucleic acids to form double stranded conformations with complementary sequences, the present disclosure can be applied to both strands (termed positive and negative, or sense and antisense) to ascertain the sequence of one or both strands. The additional information provided allows distinction between a larger number of underlying structures. See, for example, International Application No. PCT/US2014/53754, incorporated herein by reference in its entirety.

As described herein, in some embodiments the measurable signal obtained from nanopore analysis of the nucleic acid is compared against a known signal or a signal obtained from a known nucleic acid. The term "known nucleic acid" is used in reference to a nucleic acid for which the status with respect to a particular characteristic, such as nucleotide sequence or fingerprint pattern, is known. In some embodiments, the known signal is obtained from the known nucleic acid under the same or similar analytical conditions.

As described above, the inventors modeled the stretching of DNA in the MspA nanopore and identified the influence of Brownian motion on the resolution (or the functional size of the constriction zone). As nucleotides move in and out of the constriction zone area, they contribute to the modification of the passing ionic current and, thus, lower the resolution of the pore. The inventors sought to reduce the influence of Brownian motion on the resolution of signals obtained from the nanopore by reducing the distance between the molecular brake and the constriction zone of the nanopore. The inventors hypothesized that a reduced distance would reduce elongational tendencies of the nucleic acid and reduce the Brownian motion. While this could theoretically be accomplished by reversing the orientation of the MspA nanopore in the nanopore system (see FIGURE 23), previous results indicated that this was an untenable system. As described in more detail below, MspA nanopores in the reverse orientation were known to exhibit significant gating when relevant potentials were applied to test for insertion of the pore. Gating refers to spontaneous reductions in current in the absence of an analyte. See, e.g., FIGURE 25A. This gating behavior would be difficult to distinguish from any real signal caused by an analyte. Furthermore, the MspA nanopores tend to insert into the bilayer membranes in the forward orientation when using standard procedures.

In spite of the above challenges, the inventors examined the feasibility of MspA in the reverse orientation for nanopore analysis to determine whether it would reduce issues with Brownian motion. As described in more detail below, the inventors surprisingly found that when analyte (associated with a molecular brake) is inserted into the reverse nanopore from the cis side (see FIGURE 23, right diagram) the pore generated accurate current signals reflecting analyte structure with minimized gating events (see FIGURE 25B). Furthermore, the inventors demonstrated that the current signals obtained from a reverse oriented MspA nanopore exhibited better resolution as compared to the current signals obtained from a forward MspA nanopore with the same analyte. See FIGURES 26A and 26B. Furthermore, the inventors demonstrated that the MspA nanopore in the reverse orientation produced current levels that span a wider range of currents, with a much higher frequency of currents in the high and low end of the measured currents. See FIGURE 26C. These data indicate that the reduced distance between the molecular brake and the constriction zone reduced the Brownian motion of the nucleic acid in the constriction zone. This, in turn, resulted in a functionally shorter constriction zone, and an increased resolution of the nanopore system.

Accordingly, in another aspect, a system for nanopore-based analysis of a nucleic acid is provided. The system is generally described above. The system comprises a membrane separating a first conductive liquid medium and a second conductive liquid medium and a nanopore embedded in the membrane that comprises a constriction zone and a vestibule that together define a tunnel. The constriction zone is more proximate to the first conductive liquid medium and the vestibule is more proximate to the second conductive liquid medium. The system is operative to translocate a nucleic acid from the first conductive liquid medium to the second conductive liquid medium. The first conductive liquid medium is typically present in the cis side of the membrane, as illustrated in FIGURE 23. In some embodiments, the cis side of the membrane contains the negative pole of the nanopore system.

The membrane is generally described above. In some embodiments, the membrane is a lipid bilayer. In some embodiments, the membrane comprises a block copolymer.

The nanopore can be any nanopore as described herein, which is asymmetrical along the axis of the tunnel and contains a constriction zone. The constriction zone is oriented in the membrane such that the constriction zone is more proximate to the first conductive liquid medium than the second conductive liquid medium.

In some embodiments, the nucleic acid is associated with a molecular brake. In some embodiments, the molecular brake is disposed within the first conductive liquid medium. In some embodiments, the molecular brake is a moiety that has a diameter equal to or larger than the internal diameter of the nanopore constriction zone. In some embodiments, the molecular brake is or includes a tether linkage to an immobile object.

In a related aspect, the disclosure provides a method for analyzing a nucleic acid in a nanopore system. The nanopore system can be the system described herein where the nanopore is embedded in the membrane in a "reverse" orientation with the constriction zone more proximate to the first conductive liquid medium. The method includes translocating the nucleic acid through the nanopore from the first conductive liquid medium to the second conductive liquid medium, applying an electrical potential between the first conductive liquid medium and the second conductive liquid medium as the nucleic acid translocates through the nanopore, measuring a plurality of current signals between the first conductive liquid medium and the second conductive liquid medium as the nucleic acid is in the nanopore, and determining a characteristic of the nucleic acid based on the measured current signals. As part of the method, the nucleic acid is associated with a molecular brake in the first conductive liquid medium that regulates the translocation velocity of the nucleic acid through the nanopore, wherein the molecular brake has a diameter that exceeds a diameter of the nanopore. In this aspect, during translocation of the nucleic acid, the molecular motor does not pass through the nanopore. In some embodiments, the molecular motor abuts against the opening of the nanopore (on the cis side) to provide resistance to the translocation rate of the nucleic acid. In some embodiments, the distance between the molecular motor and the constriction zone of the nanopore is less than about 4 nm, such as less than about 3 nm, less than about 2 nm, or less than about 1 nm.

The method steps can generally be performed as described above. The molecular brake and other components of the nanopore system are described in more detail above.

The use of the term "or" in the claims is used to mean "and/or" unless explicitly indicated to refer to alternatives only or the alternatives are mutually exclusive, although the disclosure supports a definition that refers to only alternatives and "and/or."

Following long-standing patent law, the words "a" and "an," when used in conjunction with the word "comprising" in the claims or specification, denote one or more, unless specifically noted.

Unless the context clearly requires otherwise, throughout the description and the claims, the words "comprise," "comprising," and the like are to be construed in an inclusive sense as opposed to an exclusive or exhaustive sense; that is to say, in the sense of "including, but not limited to." Words using the singular or plural number also include the plural and singular number, respectively. Additionally, the words "herein," "above," and "below," and words of similar import, when used in this application, shall refer to this application as a whole and not to any particular portions of the application. Words such as "about" and "approximately" imply minor variation around the stated value, usually within a standard margin of error, such as within 10% or 5% of the stated value.

Disclosed are materials, compositions, and components that can be used for, can be used in conjunction with, can be used in preparation for, or are products of the disclosed methods and compositions. It is understood that, when combinations, subsets, interactions, groups, etc., of these materials are disclosed, each of various individual and collective combinations is specifically contemplated, even though specific reference to each and every single combination and permutation of these compounds may not be explicitly disclosed. This concept applies to all aspects of this disclosure including, but not limited to, steps in the described methods. Thus, specific elements of any foregoing embodiments can be combined or substituted for elements in other embodiments. For example, if there are a variety of additional steps that can be performed, it is understood that each of these additional steps can be performed with any specific method steps or combination of method steps of the disclosed methods, and that each such combination or subset of combinations is specifically contemplated and should be considered disclosed. Additionally, it is understood that the embodiments described herein can be implemented using any suitable material such as those described elsewhere herein or as known in the art.

Publications cited herein and the subject matter for which they are cited are hereby specifically incorporated by reference in their entireties.

The following is a description of a study that varied the applied electric potential to alter the force on nucleic acid polymers, resulting in differential stretching of the nucleic acid within an MspA nanopore. The study further addresses the analysis of the resulting current signals to characterize and model the stretching of the nucleic acid.

Overview

Stretching of single-stranded DNA within a biological nanopore system was investigated. Mycobacterium smegmatis porin A (MspA), and mutants thereof, has been previously described as part of a nanopore DNA sequencing system with high signal-to-noise resolution of individual nucleotides. In the prior study, the pore was inserted into a lipid bilayer in an electrolyte solution and voltage was applied across the bilayer. DNA nucleotides positioned into the pore's constriction zone were found to modulate the resistance of the DNA-pore system. In the present study, MspA was used to stretch single-stranded DNA as it was immobilized within the nanopore. The DNA was attached to a large NeutrAvidin molecule and driven into MspA until the NeutrAvidin came to rest on the rim of the pore, prohibiting further translocation. Approximately 15 nucleotides spanned the distance from the NeutrAvidin to the MspA constriction zone. By varying the voltage across the pore, DNA stretching was detected with nucleotide precision. Using a freely jointed chain model, the force applied to the DNA strand was estimated, as well as its charge density and elasticity. The Brownian motion of the strand was also estimated and found to cause multiple nucleotides to control the ionic current through the nanopore. These results provide insight into the behavior of DNA confined within a nanopore and are important for the further refinement of a nanopore DNA sequencing system.

Introduction

When a voltage is applied across a small pore in an electrolyte solution, an ionic current can be measured. A molecule residing in the pore affects the ionic current and permits study of the molecule's properties. Biological nanopores can be used as a single molecule tool to detect individual DNA nucleotides as well as to study protein folding and peptide structure. This simple technique is currently being developed for use as an inexpensive, high-resolution DNA sequencing system.

Recently, a DNA sequencing system was developed using Mycobacterium smegmatis porin A (MspA) and a motor enzyme (see Manrao, E.A., et al., "Reading DNA at single-nucleotide resolution with a mutant MspA nanopore and phi29 DNA polymerase," Nature Biotechnology 30:349-353 (2012), incorporated herein by reference in its entirety). MspA is a particularly significant biological nanopore because it provides the highest signal-to-noise resolution of DNA nucleotides to date (Butler, T.Z., et al., "Single-molecule DNA detection with an engineered MspA protein nanopore," Proceedings of the National Academy of Sciences of the United States of America 105:20647-20652 (2008), incorporated herein by reference in its entirety). Initially, negative charges near the constriction of wild type MspA inhibited DNA translocation, but after introduction of mutations in which three negatively charged aspartic acids were replaced with asparagines, DNA translocation was observed. This modified porin, referred to hereafter as simply "MspA", demonstrated sensitivity to all four DNA bases (see Manrao, E.A., et al., "Reading DNA at single-nucleotide resolution with a mutant MspA nanopore and phi29 DNA polymerase," Nature Biotechnology 30:349-353 (2012) and Derrington, I.M., et al., "Nanopore DNA sequencing with MspA," Proc. Natl. Acad. Sci. USA 107:16060-16065 (2010), each incorporated herein in its entirety) as well as detection of epigenetic modifications (Manrao, E.A., "Nucleotide Discrimination with DNA Immobilized in the MspA Nanopore," PLoS ONE 6:e25723 (2011), incorporated herein in its entirety). In order to pass DNA through the MspA constriction zone in nucleotide-by-nucleotide steps, a phi29 DNA polymerase was used as a molecular motor.
The ionic current through the pore demonstrated distinct levels associated with each single nucleotide movement (see Manrao, E.A., et al., "Reading DNA at single-nucleotide resolution with a mutant MspA nanopore and phi29 DNA polymerase," Nature Biotechnology 30:349-353 (2012), incorporated herein by reference in its entirety). However, the correlation between the ionic current values and the identity of the nucleotides within MspA's constriction was found to be complex. Approximately four nucleotides contribute to the ionic current, and it was evident that the precise positioning of the nucleotides within MspA's constriction is an important factor in understanding these observed current levels. In order for nanopore sequencing using MspA to become a viable technique, an understanding of how the nucleotides residing in MspA's constriction affect the ionic current through the pore must be developed. In this work, individual single stranded DNA (ssDNA) molecules immobilized within MspA were studied in order to gain insight into the configuration and behavior of DNA held within a nanopore.

Using a method previously described in the context of α-hemolysin (Nakane, J., et al., "A nanosensor for transmembrane capture and identification of single nucleic acid molecules," Biophys J 87:615-621 (2004), incorporated herein by reference in its entirety), an ssDNA polymer was held within MspA using a bulky molecule as an anchor. Biotinylated ssDNA was bound to a NeutrAvidin molecule and drawn into MspA by an applied voltage of 180 mV. The DNA threaded through the nanopore's constriction zone until the NeutrAvidin, which has a diameter larger than the entrance of MspA, prevented further translocation (see FIGURE 3). Once the DNA was held in the pore, the applied voltage was decreased in 20 mV steps from 180 mV to 80 mV, with each voltage applied for 250 ms. The resulting current was recorded.

Altering the applied voltage across the nanopore modulated the electric field in the pore constriction. The single narrow and short constriction zone of MspA focused the electric field onto a few nucleotides. The force, F, on DNA across a potential, V, is $F = \int \sigma \, dV$, where σ is the effective charge density of DNA nucleotides in the nanopore. At physiological pH (pH ~7), DNA bases are practically uncharged but the phosphate groups of the DNA backbone, due to their pKa of about 0, are completely ionized and have a negative charge of ~1 e⁻ per phosphate. However, at high salt conditions the effective charge is reduced due to counter ions surrounding the phosphates and shielding the charge. By observing the ionic resistance of the DNA-pore system at various voltages, the force on the DNA within the pore constriction and the effective charge of DNA within the MspA nanopore can be estimated.

Materials and Methods

Experiments

Pores were established with previously described methods (Butler, T.Z., et al., "Single-molecule DNA detection with an engineered MspA protein nanopore," Proceedings of the National Academy of Sciences of the United States of America 105:20647-20652 (2008) and Manrao, E.A., "Nucleotide Discrimination with DNA Immobilized in the MspA Nanopore," PLoS ONE 6:e25723 (2011), each incorporated herein in its entirety). Briefly, lipid bilayers were formed across a horizontal ~20 μm diameter aperture in Teflon from 1,2-diphytanoyl-sn-glycero-3-phosphocholine, 1,2-diphytanoyl-sn-glycero-3-phosphate (Avanti Polar Lipids) or equal mixtures thereof. Compartments on both sides of the bilayer contained experimental buffer of 1.0 M KCl, 10 mM HEPES/KOH buffered at pH 8.0 +/- 0.05. An Axopatch-200B, 1B or 1C patch clamp amplifier (Axon Instruments) was used to apply a voltage across the bilayer and measure the ionic current. MspA was added to the grounded cis compartment at a concentration of ~2.5 ng/ml. Once a single protein was inserted in the bilayer, the cis compartment was flushed with experimental buffer in order to avoid further insertions. All experiments were performed at 23 ± 1°C. The analog ion current signal was low-pass filtered at 20 kHz with a 4-pole Bessel filter and digitized at 100 kHz. Data acquisition was controlled with custom software written in LabWindows/CVI (National Instruments). The pore was held at 180 mV until there was a spontaneous reduction in current to less than 200 pA lasting longer than 100 ms, signifying that DNA was held threaded through the pore. The voltage was changed in steps of 20 mV from 180 mV down to 80 mV, each step lasting 250 ms. Then -100 mV was applied for 250 ms to eject the DNA-NeutrAvidin complex out of the pore back into the cis compartment. The voltage was then returned to 180 mV to attract another DNA-NeutrAvidin complex.
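The stepping protocol described above can be written out as a short sketch. It only enumerates the voltage segments and the number of digitized samples in each at the stated 100 kHz rate. The function names are hypothetical, and the variable-length hold at 180 mV while waiting for capture is not modeled.

```python
# Illustrative sketch of one voltage-stepping cycle after DNA capture.
# Function and parameter names are hypothetical, not from the source.


def step_protocol(start_mv=180, stop_mv=80, step_mv=20,
                  step_ms=250, eject_mv=-100, eject_ms=250):
    """Return a list of (voltage_mV, duration_ms) segments for one cycle."""
    segments = []
    v = start_mv
    while v >= stop_mv:
        segments.append((v, step_ms))
        v -= step_mv
    # Reverse the polarity to eject the complex back into the cis compartment.
    segments.append((eject_mv, eject_ms))
    return segments


def samples_per_segment(duration_ms, sample_rate_hz=100_000):
    """Number of digitized samples in a segment at the given sampling rate."""
    return duration_ms * sample_rate_hz // 1000


for voltage, duration in step_protocol():
    print(voltage, "mV for", duration, "ms:",
          samples_per_segment(duration), "samples")
```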

Biological Materials

The MspA protein was purified from Mycobacterium smegmatis as previously described (Butler, T.Z., et al., "Single-molecule DNA detection with an engineered MspA protein nanopore," Proceedings of the National Academy of Sciences of the United States of America 105:20647-20652 (2008), incorporated herein by reference in its entirety). DNA with biotin on either the 5' or 3' end was synthesized by Integrated DNA Technologies with standard desalting and no additional purification. NeutrAvidin was obtained from Invitrogen. DNA was mixed in equal molar concentrations with NeutrAvidin to create ssDNA-NeutrAvidin complexes, which were stored at -20°C until immediately before use. For experiments, 5 μM of the ssDNA-NeutrAvidin complex was added to the cis chamber.

Data Analysis

Data was analyzed with custom software written in Matlab (The Mathworks). Translocation of DNA was first identified using current thresholds (Butler, T.Z., "Determination of RNA orientation during translocation through a biological nanopore," Biophysical Journal 90:190-199 (2006), incorporated herein by reference in its entirety). To avoid signals due to voltage stepping, 5 ms at the beginning and end of each voltage step were neglected. The residual ionic current and error for each voltage step were found from the mean and standard deviation, respectively. To convert current to resistance, the applied voltage was divided by the measured ionic current. Events with a standard deviation in resistance of more than twice the average deviation for that experiment were discarded, as they likely had poor NeutrAvidin seating on MspA. Minor variations in open-pore current levels were seen across a number of experiments and are likely due to minor variations in conductivity. Fluctuations between experiments were minimized by normalizing the DNA-pore resistance for each translocation by the unblocked, pore-only, resistance for that experiment. To report values in resistance, the normalized values were rescaled by the average open-pore resistance for all experiments, 0.56 +/- 0.01 GΩ (mean +/- std dev). Subsequent experiments were often conducted with DNA strands of different compositions sequentially added without perfusing. After each new DNA addition, a corresponding peak appeared in the histogram of residual currents. For each DNA strand type, data was taken on several pores (see Table 1), for at least one of which the ionic current was clearly resolved from other strand types. A Gaussian curve was fit to the histogram of mean DNA-pore resistances for this subset of experiments. For experiments containing strands with overlapping histograms of mean residual current, events were resolved by overlaying the normalized model Gaussian curves described above and dividing events in the overlapping regions accordingly. The mean values and errors reported herein were determined by the mean and standard deviation of DNA-pore resistances for all experiments.
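The per-step reduction described above (trim the edges of each voltage step, take the mean and standard deviation, convert to resistance, normalize to the open pore) can be sketched as follows. This is not the Matlab software used in the study. The function names and the example numbers are hypothetical; the 5 ms trim and the 0.56 GΩ average open-pore resistance are taken from the text.

```python
import statistics


def step_resistance(currents_pa, voltage_mv, sample_rate_hz=100_000, trim_ms=5):
    """Reduce one voltage step to (mean_pA, std_pA, resistance_GOhm).

    The first and last `trim_ms` of the step are dropped to avoid
    transients caused by the voltage change.
    """
    trim = trim_ms * sample_rate_hz // 1000
    core = currents_pa[trim:len(currents_pa) - trim]
    if len(core) < 2:
        raise ValueError("step too short after trimming")
    mean_i = statistics.fmean(core)
    std_i = statistics.stdev(core)
    # mV / pA = 1e-3 V / 1e-12 A = 1e9 ohm, i.e. the quotient is in GOhm.
    return mean_i, std_i, voltage_mv / mean_i


def rescaled_resistance(r_dna_gohm, r_open_gohm, r_open_avg_gohm=0.56):
    """Normalize by this experiment's open-pore resistance, then rescale
    by the average open-pore resistance so the result is again in GOhm."""
    return r_dna_gohm / r_open_gohm * r_open_avg_gohm


# Hypothetical step sampled at 1 kHz: 5 transient samples at each edge.
trace = [999.0] * 5 + [99.0, 101.0] * 10 + [999.0] * 5
mean_i, std_i, r = step_resistance(trace, voltage_mv=180, sample_rate_hz=1000)
print(mean_i, r, rescaled_resistance(r, r_open_gohm=0.57))
```

The discard rule for poorly seated events and the Gaussian separation of overlapping strand populations are left out of this sketch.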

Table 1: DNA sequences and number of pores tested in force spectroscopy assays

TTT CG

5'-/biotin/ TTT TTT TTT TTT TTT ATT TTT TTT TTT TTT TTT TTT TTT TTT TTT TTT TTT CTG TCT CCC TGC CG

Single nucleotide polymorphism (SNP)

dC 13 | 5'-/biotin/ GCT GGA GAA AGG CAT GTG CAA ATT AAA AAA AAA AAA AAA AAA AAA AAA AAA AAA AAA AAA AAA AA

dA 13 | 5'-/biotin/ GCT GGA GAA AGG AAT GTG CAA ATT AAA AAA AAA AAA AAA AAA AAA AAA AAA AAA AAA AAA AAA AA

dC 14 | 5'-/biotin/ TGC TGG AGA AAG GCA TGT GCA AAT TAA AAA AAA AAA AAA AAA AAA AAA AAA AAA AAA AAA AAA AA

dA 14 | 5'-/biotin/ TGC TGG AGA AAG GAA TGT GCA AAT TAA AAA AAA AAA AAA AAA AAA AAA AAA AAA AAA AAA AAA AA

dC 15 | 5'-/biotin/ CTG CTG GAG AAA GGC ATG TGC AAA TTA AGA AAA AAA AAA AAA AAA AAA AAA AAA AAA AAA AAA AA

dA 15 | 5'-/biotin/ CTG CTG GAG AAA GGA ATG TGC AAA TTA AGA AAA AAA AAA AAA AAA AAA AAA AAA AAA AAA AAA AA

dC 16 | 5'-/biotin/ CCT GCT GGA GAA AGG CAT GTG CAA ATT AAG AAA AAA AAA AAA AAA AAA AAA AAA AAA AAA AAA AA

dA 16 | 5'-/biotin/ CCT GCT GGA GAA AGG AAT GTG CAA ATT AAG AAA AAA AAA AAA AAA AAA AAA AAA AAA AAA AAA AA

Results

The "Open Pore" resistance of MspA at each applied voltage without DNA present was found to be ~0.56 +/- 0.01 GΩ, independent of voltage (FIGURE 4, x's), in agreement with previously reported results.

First, the homopolymeric adenine (poly-dA), cytosine (poly-dC), and thymine (poly-dT) nucleic acids were examined while held within the pore in either orientation (3' or 5' end threaded through the pore). Based on geometry, approximately 20 nucleotides (nt) adjacent to the biotin anchor were expected to reside in the MspA pore, with the remainder of the strand on the trans side. All strands were 50-65 nt long so that the regions of the nucleic acids within MspA's constriction zone were located in the center of the strand. Homopolymer guanine was not available due to G-tetrad structure formation. The voltage-dependent resistance for each DNA-pore complex is shown in FIGURE 4. Homopolymer cytosine in the 3'-leading orientation and homopolymer thymine in the 5'-leading orientation produced the highest resistances. Both orientations of homopolymer adenine produced similar resistances, which were found to be the lowest of all strands. Homopolymer cytosine in the 5'-leading orientation had a resistance similar to poly-dA. The resistance of homopolymer cytosine was strongly orientation dependent, with the 3'-leading direction having a much higher resistance than the 5'-leading direction. The orientation dependence of the DNA-pore resistance has been previously observed and was reasoned to be due to differences in the tilt of the DNA bases while confined in the nanopore. It was observed that, in contrast to the open pore resistance, the resistance for a DNA-pore system does not have ohmic behavior (i.e., the resistance varies with the applied voltage). Changes in the conformation of the DNA strand at higher voltages reduce the resistance of the DNA-pore system. Because of the more widely separated resistance values, the 3'-leading homopolymer DNA was used for all subsequent experiments.

The standard deviation of the DNA-pore resistance for the homopolymer DNA strands is shown in FIGURE 5. At lower voltages, the fluctuation of the resistance within an event was larger than at higher voltages, indicating that the strand is more constrained at higher voltages.

Next, the stretching of ssDNA was examined using homopolymer adenine strands containing a single dC at various positions (X) from the biotin anchor. The voltage-dependent resistance of each strand was compared to that of pure poly-dA (FIGURE 6). The histograms show the change in the homopolymer resistance due to a single dC substitution at various positions, X. The response to this single nucleotide substitution is characterized by the position of the central nucleotide and the full width at half max (FWHM). At lower voltages, the determined nucleotide position affecting the current the most is at X~16. Nucleotides located before the 14th position had little to no effect on the resistance, as they were located above MspA's recognition site (i.e., constriction zone). At higher voltages, the nucleotide most affecting the current changed to X~14. In addition, the range over which the single dC affects the current is slightly smaller at higher voltages (see FIGURE 7). Next, the single nucleotide substitution experiments were repeated with a single dA nucleotide substituted at positions X=13-16 in an otherwise poly-dT strand (see FIGURE 8, discussed in more detail in the Supporting Materials (below)). A similar trend was observed where the central nucleotide in the constriction varied from X~14 for 180 mV to X~16 at 80 mV (see FIGURE 7).
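As a rough illustration of how a central position and FWHM can be read off a substitution profile, the sketch below uses weighted moments and the Gaussian relation FWHM = 2·sqrt(2 ln 2)·(standard deviation). This is a moment-based estimate under a Gaussian assumption, not the curve fit used in the study, and the profile values are hypothetical.

```python
import math


def centre_and_fwhm(positions, weights):
    """Moment-based centre and FWHM of a profile, assuming a Gaussian shape.

    `weights` are non-negative responses (e.g. change in resistance)
    measured with the substitution at each entry of `positions`.
    """
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    centre = sum(p * w for p, w in zip(positions, weights)) / total
    variance = sum(w * (p - centre) ** 2
                   for p, w in zip(positions, weights)) / total
    fwhm = 2.0 * math.sqrt(2.0 * math.log(2.0)) * math.sqrt(variance)
    return centre, fwhm


# Hypothetical resistance changes for substitutions at X = 12..18.
positions = [12, 13, 14, 15, 16, 17, 18]
weights = [0.0, 0.1, 0.4, 0.8, 1.0, 0.7, 0.2]
centre, fwhm = centre_and_fwhm(positions, weights)
print(round(centre, 2), round(fwhm, 2))
```

With few sampled positions, a moment estimate is sensitive to truncation of the tails, which is one reason a least-squares Gaussian fit is usually preferred.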

Finally, a short heteromeric DNA sequence containing a single nucleotide polymorphism (SNP) known to be associated with cancer was studied. This DNA sequence had a particular nucleotide that was either a dC or a dA (see Table 1 for sequence). The SNP sequence chosen had no more than 2 consecutive base pairings so that the effects of secondary structure within MspA emulate those of genomic DNA. By incrementally adding nucleotides at the NeutrAvidin end, the SNP was positioned at X=13-16. As with the strands described above, the central nucleotide shifted from X~16 at low voltages to X~14 at higher voltages (FIGURE 9, discussed in more detail in the Supporting Material). FIGURE 7 compares the nucleotides within MspA's recognition site for the three different DNA strands.

Analysis

The present results indicate that the voltage influences longitudinal positioning of DNA held within MspA. At 180 mV, the nucleotide that exhibits the largest effect on the DNA-pore resistance is located at X~14 from the biotin anchor. As the voltage decreases, the DNA nucleotides shift upward so that X~16 is centered in the recognition site and the number of nucleotides involved in controlling the ionic current slightly increases (FIGURE 7).

The observed shifting of nucleotides in MspA's recognition site relative to the DNA with varying voltages is due to either elongation of the DNA strand or changes in positioning of the biotin anchor with respect to MspA's constriction. Elongation of the ssDNA under force may be caused by base unstacking or a reduction in the curvature of the strand. Alternately, changes in the positioning of the NeutrAvidin on the pore or deformation of the rim of MspA may cause the entire strand to shift. The fluctuations observed in the DNA-pore resistance within a single event show strong voltage dependence (i.e., force dependence) where the strand is more tightly constrained at higher forces (FIGURE 5). This suggests that the conformation of the DNA strand is changing under increased force. The force on the DNA within MspA can be estimated as:

$F = \sigma V$ [Eqn 1]

where V is the applied voltage and σ is the effective charge density of DNA in the pore.

The number of nucleotides in MspA's recognition site, n, is taken as the width of the fitted Gaussian curve as shown in FIGURE 7 (lower panel). Assuming that the number of nucleotides in the MspA recognition site (constriction zone) is proportional to the charge density of DNA in the pore, σ, the ratio of the forces at two voltages, $V_1$ and $V_2$, can be estimated by:

$F_1 / F_2 = (n_1 / n_2)(V_1 / V_2)$ [Eqn 2]

For forces of tens of picoNewtons, the extension, x, of ssDNA under an applied force, F, can be modeled as a freely jointed chain (FJC):

$x = L \left[ \coth\left( \frac{F b}{k_B T} \right) - \frac{k_B T}{F b} \right]$ [Eqn 3]

where x is the end-to-end distance of the strand, L is the contour length, b is the Kuhn length, $k_B$ is the Boltzmann constant, and T is the temperature. At a 1 M salt concentration, the Kuhn length, b, of ssDNA is ~3 nm (see Murphy, M.C., et al., "Probing single-stranded DNA conformational flexibility using fluorescence spectroscopy," Biophysical Journal 86:2530-2537 (2004), incorporated herein by reference in its entirety). The contour length, L, is extracted from experiment as the length of DNA between the NeutrAvidin anchor and the top of the recognition site, assuming an interphosphate distance of 0.56 nm. At a given voltage all three strands had the same nucleotide position, X, centered within MspA's recognition site and therefore it is assumed that the interphosphate distance and Kuhn length are sequence independent.

The end-to-end distance of the DNA within the MspA vestibule, x, was assumed to be constant whereas the contour length of the strand, L, within this region changed with varying voltage. The applied force and elongation of DNA, x, was estimated with the FJC model by comparing the force-extension curves (Eqn 3) at two voltages using the force relationship described above (Eqn 2). See Supporting Material, below, for more details. The estimated forces for each DNA strand are shown in FIGURE 10 and range from 6 to 14 pN for 80 and 180 mV, respectively.
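For orientation, the freely jointed chain relation can be evaluated directly at the two ends of the reported force range. The sketch below computes only the relative extension x/L from the FJC formula with the ~3 nm Kuhn length and the 23 °C temperature stated above. It does not reproduce the two-voltage comparison used to estimate the forces, which is described in the Supporting Material. The function name is hypothetical.

```python
import math

# k_B * T at 23 C (296.15 K), converted from joules to pN*nm (1 pN*nm = 1e-21 J).
KB_T_PN_NM = 1.380649e-23 * 296.15 * 1e21


def fjc_relative_extension(force_pn, kuhn_nm=3.0, kbt_pn_nm=KB_T_PN_NM):
    """Return x/L for a freely jointed chain: coth(F*b/kT) - kT/(F*b)."""
    if force_pn <= 0:
        raise ValueError("force must be positive")
    a = force_pn * kuhn_nm / kbt_pn_nm
    return 1.0 / math.tanh(a) - 1.0 / a


for force in (6.0, 14.0):  # ends of the force range reported in the text, in pN
    print(force, "pN -> x/L =", round(fjc_relative_extension(force), 3))
```

In this model the strand is already more than three quarters extended at the lowest force, so the additional elongation available between 80 and 180 mV is a modest fraction of the contour length.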

The charge density of DNA within the pore was then determined from Eqn 1, $\sigma = F/V$. To obtain the estimated charge per nucleotide, the curvature of DNA in the pore, x/L, is considered. The linear charge of DNA was taken as $\lambda = (x/L)\,\sigma$, where x and L are the extension and contour length determined from the FJC model above. See Supporting Material for more details. The linear charge density of the dA strand was ~0.29 e⁻/nt and the charge densities of the dT strand and the SNP strand were both ~0.22 e⁻/nt. The larger charge density of the dA strand was found from the larger forces observed in FIGURE 10 and indicates that the dA strand is more affected by voltage changes than the other strands.
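The unit handling behind the charge estimate can be made explicit with a short sketch. It converts F/V into elementary charges per nanometre and then into charge per nucleotide. The conversion to a per-nucleotide value by multiplying with the 0.56 nm interphosphate distance is an assumption made here for illustration; the exact procedure is in the Supporting Material, which is not reproduced in this section. The function names and the example x/L value are hypothetical.

```python
E_CHARGE_C = 1.602176634e-19  # elementary charge in coulombs


def charge_density_e_per_nm(force_pn, voltage_mv):
    """sigma = F/V, expressed in elementary charges per nm of pore axis."""
    sigma_c_per_m = (force_pn * 1e-12) / (voltage_mv * 1e-3)
    return sigma_c_per_m / E_CHARGE_C * 1e-9


def charge_per_nucleotide(force_pn, voltage_mv, x_over_l,
                          interphosphate_nm=0.56):
    """lambda = (x/L) * sigma, scaled to one nucleotide of contour length.

    Assumption: one nucleotide spans `interphosphate_nm` of contour length.
    """
    sigma = charge_density_e_per_nm(force_pn, voltage_mv)
    return x_over_l * sigma * interphosphate_nm


# Upper end of the reported force range, with an illustrative x/L of 0.90.
print(round(charge_density_e_per_nm(14.0, 180.0), 3), "e/nm")
print(round(charge_per_nucleotide(14.0, 180.0, 0.90), 3), "e/nt")
```

With these inputs the result falls in the same range as the values quoted above, but it should be read as a consistency check on units rather than a reproduction of the authors' calculation.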

The FWHM of a Gaussian curve fitted to experimental results was used to determine how many nucleotides resided in MspA's recognition site. The width of this Gaussian curve had contributions from both the size of the MspA constriction zone and the Brownian motion of nucleotides. It was assumed that the applied voltage dropped entirely over the MspA constriction zone so that only nucleotides within this region experienced a force. However, it is believed that Brownian motion caused nucleotides around the constriction to move in and out of the electric field, making the number of nucleotides affecting the DNA-pore resistance larger than expected for a stationary system. A DNA strand immobilized in MspA can be modeled as a spring with a linear restoring force such that $dF = \kappa \, \delta x$, where dF is the difference of applied force, δx is the resulting displacement, and κ is the spring constant. Brownian motion fluctuations caused the DNA to move around its equilibrium position. From the equipartition theorem, the variance in the displacement from equilibrium can be related to the spring constant, κ, of the DNA by $\frac{1}{2} k_B T = \frac{1}{2} \kappa \langle \delta x^2 \rangle$, where $\langle \delta x^2 \rangle$ is the variance in the displacement from equilibrium, $k_B$ is the Boltzmann constant, and T is temperature. The variance of the DNA due to Brownian motion is then $\langle \delta x^2 \rangle = k_B T / \kappa$. The spring constant, κ, is found from the force extension relation (Eqn 3) and is plotted for each strand in FIGURE 11 (see the Supporting Material for more detail).
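The spring-constant and equipartition steps can be sketched numerically. The code below differentiates the freely jointed chain relation analytically to obtain a local stiffness κ = dF/dx and then applies the equipartition result to get an RMS displacement. The contour length used in the example (14 nucleotides at 0.56 nm each) is an illustrative assumption, as are the function names.

```python
import math

# k_B * T at 23 C (296.15 K), in pN*nm.
KB_T = 1.380649e-23 * 296.15 * 1e21


def fjc_extension(force_pn, contour_nm, kuhn_nm=3.0):
    """End-to-end extension x (nm) of a freely jointed chain under force F."""
    a = force_pn * kuhn_nm / KB_T
    return contour_nm * (1.0 / math.tanh(a) - 1.0 / a)


def fjc_spring_constant(force_pn, contour_nm, kuhn_nm=3.0):
    """Local stiffness kappa = dF/dx (pN/nm) from the FJC relation.

    d/da [coth(a) - 1/a] = 1/a**2 - 1/sinh(a)**2, with a = F*b/kT.
    """
    a = force_pn * kuhn_nm / KB_T
    dx_df = contour_nm * (kuhn_nm / KB_T) * (1.0 / a ** 2
                                             - 1.0 / math.sinh(a) ** 2)
    return 1.0 / dx_df


def brownian_rms(kappa_pn_per_nm):
    """RMS displacement (nm) from equipartition: <dx^2> = k_B*T / kappa."""
    return math.sqrt(KB_T / kappa_pn_per_nm)


contour = 14 * 0.56  # nm; illustrative assumption, not a value from the source
for force in (6.0, 14.0):
    kappa = fjc_spring_constant(force, contour)
    print(force, "pN: kappa =", round(kappa, 2), "pN/nm, rms =",
          round(brownian_rms(kappa), 2), "nm")
```

The stiffness rises with force, so the RMS excursion shrinks at higher voltage, which matches the qualitative observation that the strand is more tightly constrained at higher forces.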

The Brownian motion contribution can be compared to the width of the recognition site determined from experiment. See Supporting Material, below, for more details. The results for experiments using the dA, dT and SNP strands are shown in FIGURE 12 (described in more detail in the Supporting Material, below). For all strands, the contribution from Brownian motion makes up the majority of the width of the recognition site. After subtracting the purely Brownian motion contribution to the width, the width of MspA's constriction is ~0.8 +/- 0.2 nm for the dA strand. This agrees favorably with the width expected from geometry. For the dT and SNP strands under low voltage, the recognition site was smaller than the expected contribution from Brownian motion. While this may be a result of imperfections in the experiment, it may also suggest that the nucleotides interact with the pore constriction zone, thus constraining their motion.

Discussion

The over-stretching transition for homopolymer adenine occurs at an applied force of 23 pN corresponding to the removal of secondary structure. By the present study it was deduced that the experimental forces applied with the MspA nanopore are well below this threshold, suggesting that secondary structure of the DNA strand was intact during these assays. The number of nucleotides within the recognition site was similar for all three strands examined (FIGURE 7) indicating that an increase in the DNA-pore resistance was not a result of simply more nucleotides within the constriction. Instead, secondary structure, the identity of the nucleotides, and DNA-pore interactions must be considered.

Comparing the present values for the effective charge within MspA to previous studies on DNA translocation through the α-hemolysin nanopore, which demonstrated an effective charge of ~0.1 e⁻ per nucleotide (1 M salt), the present study reveals a larger charge density, ~0.2-0.3 e⁻/nt. This larger effective charge for MspA may be due to geometrical differences between the MspA and α-hemolysin nanopores, reducing the charge shielding in MspA. The recognition site of α-hemolysin is less localized due to its long β-barrel stem, resulting in a lower electric field when compared to MspA. Additionally, the diameter of the β-barrel in α-hemolysin is larger than MspA's constriction zone. This increased width is likely responsible for the greater charge shielding in α-hemolysin as water molecules can more easily surround the DNA strand.

Using the force estimations and experimental results, it was found that the number of bases affecting the ionic current through MspA is dominated by Brownian motion of the nucleotides and not the length of MspA's constriction. Because the thermodynamic energy of the system dictates the width of the recognition site, further mutations to shorten MspA's constriction will not significantly improve its sensitivity to nucleotides. To decrease the number of nucleotides affecting the ionic current signal, Brownian motion must be reduced. The DNA strand may be immobilized in MspA's constriction with the addition of DNA binding sites. This strategy would likely require additional modifications to the MspA pore. Alternatively, the effects due to the elasticity of the DNA strand could be minimized by shortening the distance between the anchor point and the constriction. This could be accomplished by holding the DNA in the MspA pore under reverse voltage and with the speed-regulating enzyme (e.g., molecular motor) on the trans side. Alternatively, the system could incorporate the MspA in a reverse orientation, with the constriction zone closer to the cis side and the vestibule closer to the trans side.

By varying the applied voltage on a DNA molecule held in an MspA nanopore, a force-dependent shift in the recognition site of MspA was measured. This system provided the unique opportunity to explore DNA properties while spatially confined within a nanopore. The ssDNA was observed to stretch by 1-2 nt under varying applied voltages, and different bases may stretch to different degrees. These results suggest that sampling the DNA-pore resistance at various voltages will provide additional sequence information since the strand can be shifted within the pore constriction. These results provide insight into how a nanopore-based device may be further refined to sense nucleotides passing through it. With this new understanding of DNA's behavior in MspA, a nanopore-based sequencing system can have increased sensitivity and resolution.

Supporting Material

Force Calculation Details (Homopolymer Adenine Strand)

The change in DNA-pore resistances due to single nucleotide substitutions was observed at various positions along the strand. The DNA-pore resistance and error for a given strand were determined from the mean and standard deviation, respectively, of the resistances found from multiple experiments. By fitting a Gaussian curve to the results at each applied voltage, it was determined which nucleotides comprised the recognition site of MspA from the peak and full width at half max (FWHM) of the fit (FIGURES 6 and 7). The width of the recognition site was comprised of two elements: MspA's constriction zone size and Brownian motion of nucleotides. The constriction zone was the narrowest part of the MspA pore. It was assumed that the applied voltage dropped entirely over the constriction so that only nucleotides within this region experienced a force. However, Brownian motion caused the nucleotides to move in and out of the pore constriction, making the total recognition site of the DNA larger than the physical size of MspA's constriction. For calculations of the force on the DNA strand, the effects of Brownian motion moving nucleotides into the constriction were avoided by excluding the region of DNA within the recognition site.

To determine the force on a DNA strand immobilized in MspA, the strand between the biotin linker and recognition site was modeled as a freely jointed chain (FJC). The DNA strand within the vestibule, then, had one end fixed by the biotin linker. The force from nucleotides in the induced electric field acted on the free end of the strand and corresponded to the top of MspA's recognition site.

The force on the FJC is related to the charge density of DNA, σ, and applied voltage across the pore, V, by:

F = σV [Eqn 1]

To compare forces at different voltages, it was estimated that the FWHM value from experiments, n, was proportional to the charge density, σ. The forces at two voltages, then, were related by:

F1/F2 = (n1/n2)*(V1/V2) [Eqn 2].

The extension of a FJC under a force, F, was estimated as:

x = L * [coth(F*b/(kbT)) - kbT/(F*b)] * (1 + F/S) [Eqn 3]

(see Smith, S.B., et al, "Overstretching B-DNA: The elastic response of individual double-stranded and single-stranded DNA molecules," Science 271:795-799 (1996), incorporated herein by reference in its entirety) where x is the end-to-end distance of the strand, L is the contour length, kb is the Boltzmann constant, T is the temperature, S is the stretch modulus, and b is the Kuhn length. At room temperature, the product kbT is 4.1 pN-nm. The Kuhn length, b, was estimated as 3.0 nm in 1 M KCl (Murphy, M.C., et al, "Probing single-stranded DNA conformational flexibility using fluorescence spectroscopy," Biophys. J. 86:2530-2537 (2004), incorporated herein by reference in its entirety) and the stretch modulus, S, was taken as 800 pN (Smith, S.B., et al, "Overstretching B-DNA: The elastic response of individual double-stranded and single-stranded DNA molecules," Science 271:795-799 (1996), incorporated herein by reference in its entirety). The contour length, L, was obtained from experiment as the length of DNA between the biotin and top of the recognition site, assuming an interphosphate distance of 0.56 nm. The extension or end-to-end distance, x, was the length of the vestibule between the biotin and top of the recognition site and was taken as unknown and voltage-independent. The applied force and end-to-end distance of the vestibule, x, were found with the FJC model by comparing the force-extension curves (Eqn 3) at two voltages using the force relationship found above (Eqn 2). Because 180 mV was the baseline voltage, the extension at each lower voltage was compared to that at 180 mV.
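The two-voltage comparison described above can be sketched numerically. The following is a minimal illustration, not the procedure actually used in the study: it evaluates the extensible FJC relation (Eqn 3) with the parameter values given in the text and uses a simple bisection search (an assumed solver, with an assumed bracket of 1-100 pN) to find the force at 180 mV for which the extension at a lower voltage matches the extension at 180 mV. The function names are hypothetical. The inputs are the 120 mV and 180 mV contour lengths (7.60 nm and 7.28 nm) and the force ratio (0.71) listed in Table 2.

```python
import math

KBT = 4.1          # pN*nm, thermal energy at room temperature (from the text)
KUHN_B = 3.0       # nm, Kuhn length (from the text)
STRETCH_S = 800.0  # pN, stretch modulus (from the text)


def fjc_extension(force_pn, contour_nm, b=KUHN_B, s=STRETCH_S, kbt=KBT):
    """Eqn 3: end-to-end distance (nm) of an extensible FJC under a force (pN)."""
    a = force_pn * b / kbt
    langevin = 1.0 / math.tanh(a) - 1.0 / a  # coth(a) - 1/a
    return contour_nm * langevin * (1.0 + force_pn / s)


def solve_baseline_force(contour_low_v, contour_base, force_ratio,
                         lo=1.0, hi=100.0, tol=1e-9):
    """Find F(180 mV) such that both voltages give the same extension x.

    force_ratio is F(X mV)/F(180 mV) from Eqn 2. The bisection bracket
    [lo, hi] is an assumption chosen for illustration.
    """
    def mismatch(f_base):
        return (fjc_extension(f_base, contour_base)
                - fjc_extension(force_ratio * f_base, contour_low_v))

    if mismatch(lo) * mismatch(hi) > 0:
        raise ValueError("bracket does not contain a sign change")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mismatch(lo) * mismatch(mid) <= 0:
            hi = mid
        else:
            lo = mid
    f_base = 0.5 * (lo + hi)
    return f_base, fjc_extension(f_base, contour_base)


# 120 mV row of Table 2 compared against the 180 mV baseline.
force_180, extension = solve_baseline_force(7.60, 7.28, 0.71)
print(f"F(180 mV) ~ {force_180:.1f} pN, x ~ {extension:.2f} nm")
```

With these inputs the sketch lands close to the 16.80 pN and 6.83 nm entries listed for the 120 mV comparison in Table 2.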

The force extension curves for a single dC substitution in poly-dA are shown in FIGURE 13. The average force for an applied voltage of 180 mV found from the FJC comparisons described above was 15.54 +/- 1.81 pN. Forces for each lower voltage were determined using this value and the force relationship from Eqn 2. The extension of DNA, x, corresponded to the distance between the biotin linker and top of the recognition site and was found to be 6.76 +/- 0.10 nm. See Table 2. It is noted that for Tables 2-4, the central nucleotide and number of nucleotides in the recognition site are taken from a Gaussian fit to experimental data (FIGURE 6). The errors are the standard errors of the Gaussian fit parameters. The contour length of DNA is measured from the biotin anchor to the top of the recognition site assuming an interphosphate distance of 0.56 nm. The force relationship F(X mV)/F(180 mV) is calculated from Eqn 2 and is used in the FJC model to relate the force extension curves for two voltages. The force at 180 mV and the extension are determined from the FJC model and correspond to the red stars in Figure S2. The force at each lower voltage is determined using the mean value of the force at 180 mV and the force relationship F(X mV)/F(180 mV). The charge density is determined using Eqn 1 and is σ = F/V. Finally, the DNA charge is determined using the curvature of DNA in the pore as described in the text.

The charge density of ssDNA in MspA, determined from Eqn 1, is σ = F/V. Considering the curvature of ssDNA within the pore, the linear charge of DNA can be taken as λ = (x/L)σ, where x and L are the extension and contour length from the FJC model, respectively. For the single dC substitution in poly-dA, we found that the linear charge density of ssDNA was 0.29 +/- 0.03 e⁻/nt, assuming an interphosphate distance of 0.56 nm.
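As a worked illustration of the unit conversion behind these values, the sketch below converts a force in pN and a voltage in mV into a charge density in e⁻/nm (Eqn 1) and then into a charge per nucleotide. Multiplying λ = (x/L)σ by the 0.56 nm interphosphate distance to obtain a per-nucleotide value is an assumption inferred from the units of the tabulated results; the function names are hypothetical. The inputs are the 180 mV poly-dA values reported above (15.54 pN, x = 6.76 nm) together with the 180 mV contour length of 7.28 nm from Table 2.

```python
E_CHARGE = 1.602176634e-19   # C, elementary charge
INTERPHOSPHATE_NM = 0.56     # nm per nucleotide (from the text)


def charge_density_e_per_nm(force_pn, voltage_mv):
    """Eqn 1, sigma = F/V, converted from N/V (= C/m) to e-/nm."""
    sigma_c_per_m = (force_pn * 1e-12) / (voltage_mv * 1e-3)
    return sigma_c_per_m / E_CHARGE * 1e-9


def charge_per_nt(sigma_e_per_nm, extension_nm, contour_nm,
                  spacing_nm=INTERPHOSPHATE_NM):
    """lambda = (x/L) * sigma, expressed per nucleotide (assumed conversion)."""
    return (extension_nm / contour_nm) * sigma_e_per_nm * spacing_nm


sigma = charge_density_e_per_nm(15.54, 180.0)
lam = charge_per_nt(sigma, 6.76, 7.28)
print(f"sigma ~ {sigma:.2f} e-/nm, lambda ~ {lam:.2f} e-/nt")
```

Under these assumptions the result is about 0.54 e⁻/nm and 0.28 e⁻/nt, in line with the 180 mV entries of Table 2.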

Homopolymer Thymine Strand

Experiments performed on a poly-dT strand, which had a single dA substitution at X=13-16, produced a similar trend to that found with single nucleotide substitutions in poly-dA (FIGURES 6 and 8). Using the same procedure described above, the forces were determined to be lower than those found above for a poly-dA strand (FIGURE 10) and the effective charge was 0.22 +/- 0.01 e⁻/nt. See Table 3.

Single Nucleotide Polymorphism (SNP) Strand

Single nucleotide polymorphism (SNP) experiments were performed on a heteromeric strand with a single nucleotide that could be either a dA or dC. The location of this variable nucleotide was shifted to be at X=13-16 and the difference in resistance for the dA and dC substitutions was found (FIGURE 9). The peak and FWHM values were found from the fitted Gaussian curve. Using the same procedure described above, the forces were determined to be similar to those of poly-dT and the effective charge was 0.22 +/- 0.01 e⁻/nt. See Table 4.

Brownian Motion Calculation Details

As previously discussed, a Gaussian curve fitted to experimental results (FIGURE 6) was used to determine which nucleotides most affected the DNA-pore resistance. The recognition site of the DNA strand was defined by the FWHM of the Gaussian fit. The recognition site width contained contributions from the width of the MspA constriction zone as well as Brownian motion of nucleotides moving in and out of the constriction. It was estimated that the applied voltage dropped entirely over the constriction so that only nucleotides within this region experienced a force. Brownian motion, however, caused nucleotides around the constriction to move in and out of the electric field making the number of nucleotides affecting the DNA-pore resistance to be larger than expected for a stationary system.

A DNA strand immobilized in MspA can be modeled as a spring with a linear restoring force. Brownian motion fluctuations resulting from changes in thermal kinetic energy caused the DNA to move around its equilibrium position. Thermal kinetic energy is characterized by the value kbT, where kb is the Boltzmann constant and T is temperature. From the equipartition theorem, thermal kinetic energy is equal to the potential energy of a harmonic trap, ½ kbT = ½ κ <x²>, where κ is the spring constant of the trap and <x²> is the variance in the displacement from equilibrium. The variance of the DNA due to Brownian motion was therefore found by:

<x²> = kbT / κ [Eqn 4]

where kbT is 4.11 pN-nm at room temperature. The spring constant, κ, is the ratio of an applied force to the resulting displacement of a particle, κ = F/x, and was found from the derivative of the force extension relation (Eqn 3). See FIGURE 11. To compare these results with experiment, the variance was converted into the FWHM of a corresponding Gaussian curve by <x²> = FWHM² / (8 ln 2).

The width of the Gaussian curve from experimental data was related to the curves from MspA's constriction and Brownian motion by:

FWHM_experiment² = FWHM_constriction² + FWHM_brownian².

The width of MspA's constriction, then, was determined from FWHM_constriction² = FWHM_experiment² - 8 * ln 2 * kbT / (F/x).
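The quadrature subtraction can be illustrated with a short sketch. The spring constant and experimental width used below are hypothetical numbers chosen only to exercise the formulas; they are not values from the study, and the function names are likewise illustrative. All widths are assumed to be expressed in nm and the spring constant in pN/nm.

```python
import math

KBT = 4.11  # pN*nm at room temperature (from the text)


def brownian_fwhm(kappa_pn_per_nm):
    """FWHM (nm) of the Gaussian whose variance is kbT/kappa (Eqn 4)."""
    variance = KBT / kappa_pn_per_nm
    return math.sqrt(8.0 * math.log(2.0) * variance)


def constriction_fwhm(fwhm_experiment_nm, kappa_pn_per_nm):
    """Width left after removing the Brownian contribution in quadrature.

    Returns None when the Brownian term alone exceeds the measured width,
    the situation reported for some strands at low applied voltage.
    """
    diff = fwhm_experiment_nm ** 2 - brownian_fwhm(kappa_pn_per_nm) ** 2
    return math.sqrt(diff) if diff >= 0 else None


# Hypothetical inputs: kappa = 5 pN/nm, measured FWHM = 3.0 nm.
fwhm_b = brownian_fwhm(5.0)
fwhm_c = constriction_fwhm(3.0, 5.0)
print(f"Brownian FWHM ~ {fwhm_b:.2f} nm, constriction FWHM ~ {fwhm_c:.2f} nm")
```

Returning None rather than raising an error keeps the case where the measured width is smaller than the Brownian prediction visible to the caller.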

Comparisons of the Brownian motion calculations and the widths of the recognition sites for experiments using poly-dA, poly-dT and SNP are shown in FIGURE 12. The contributions from Brownian motion account for the majority of the width of the recognition site. For the poly-dA strand, it was found that the mean width of MspA's constriction was ~0.79 +/- 0.23 nm. For poly-dT and SNP strands held with a low applied voltage, the recognition site was smaller than the expected contribution from Brownian motion. This suggests that either unstacked DNA strands cannot be modeled as a spring or the nucleotides interacted with the pore constriction, constraining their motion.

Table 2: FJC Calculations for Single Nucleotide Substitutions in Poly-dA

Voltage (mV) | Central Nucleotide (X) | # nt in Recognition Site | Contour Length L (nm) | F(X mV)/F(180 mV) | FJC Solution: F(180 mV) (pN) | FJC Solution: Extension (nm) | Force (pN) | Charge Density (e⁻/nm) | DNA Charge (e⁻/nt)

80 | 15.91 +/- [illegible] | 3.60 +/- 0.58 | 7.90 +/- 0.21 | 0.58 | 15.30 | 6.75 | 8.95 +/- 1.89 | [illegible] | [illegible]

100 | 15.38 +/- 0.16 | 3.2 +/- 0.39 | 7.71 +/- 0.14 | 0.64 | 17.20 | 6.85 | 9.99 +/- 1.83 | 0.62 +/- 0.11 | 0.31 +/- 0.06

120 | 15.06 +/- 0.11 | 2.96 +/- 0.26 | 7.60 +/- 0.10 | 0.71 | 16.80 | 6.83 | 11.03 +/- 1.79 | 0.57 +/- 0.09 | 0.29 +/- 0.05

140 | 14.84 +/- 0.07 | 2.65 +/- 0.16 | 7.57 +/- 0.06 | 0.74 | 15.80 | 6.78 | 11.52 +/- 1.71 | 0.51 +/- 0.08 | 0.26 +/- [illegible]

160 | 14.68 +/- 0.08 | 2.73 +/- 0.13 | 7.43 +/- 0.07 | 0.87 | 12.60 | 6.59 | 13.55 +/- 2.06 | 0.53 +/- 0.08 | 0.27 +/- 0.04

180 | 14.39 +/- 0.08 | [illegible] +/- 0.20 | 7.28 +/- 0.07 | 1.00 | - | - | 15.54 +/- 1.81 | 0.54 +/- 0.06 | 0.28 +/- 0.03

Mean | - | - | - | - | 15.54 | 6.76 | - | 0.55 | [illegible]

Std Dev | - | - | - | - | 1.81 | 0.10 | - | 0.07 | 0.03

Table 3: FJC Calculations for Single Nucleotide Substitutions in Poly-dT

Single dA Substitution in poly-dT

Table 4: FJC Calculations for SNP Strands

SNP Strand

Voltage (mV) | Central Nucleotide (X) | # nt in Recognition Site | Contour Length L (nm) | F(X mV)/F(180 mV) | FJC Solution: F(180 mV) (pN) | FJC Solution: Extension (nm) | Force (pN) | Charge Density (e⁻/nm) | DNA Charge (e⁻/nt)

80 | 16.06 +/- 0.06 | 3.73 +/- [illegible] | 7.95 +/- 0.04 | 0.54 | 11.40 | 6.23 | 6.52 +/- 0.52 | 0.51 +/- 0.04 | 0.23 +/- 0.02

100 | 15.28 +/- 0.01 | 3.59 +/- 0.03 | [illegible] +/- 0.01 | 0.65 | 11.80 | 6.26 | 7.84 +/- 0.60 | 0.49 +/- 0.04 | 0.23 +/- 0.02

120 | 14.84 +/- 0.08 | 3.33 +/- 0.25 | 7.38 +/- 0.08 | 0.73 | 11.40 | 6.23 | 8.72 +/- 0.93 | 0.45 +/- 0.05 | 0.22 +/- 0.02

140 | 14.49 +/- 0.08 | 3.19 +/- [illegible] | 7.22 +/- 0.09 | 0.81 | 11.90 | 6.29 | 9.75 +/- 1.10 | 0.44 +/- 0.05 | 0.21 +/- 0.02

160 | 14.18 +/- 0.08 | 3.11 +/- 0.23 | 7.07 +/- 0.08 | 0.91 | 13.40 | 6.37 | 10.85 +/- 1.16 | 0.42 +/- 0.05 | 0.21 +/- 0.02

180 | 13.99 +/- 0.03 | 3.05 +/- 0.10 | 6.98 +/- 0.03 | 1.00 | - | - | 11.98 +/- 0.83 | 0.42 +/- 0.03 | 0.21 +/- 0.01

Mean | - | - | - | - | 11.98 | 6.28 | - | 0.45 | 0.22

Std Dev | - | - | - | - | 0.83 | 0.06 | - | 0.04 | 0.01

The following is a description of a study demonstrating that the application of non-constant electrical potential to a nanopore system during the translocation of a nucleic acid analyte provides a much higher level of resolution, permitting the characterization of sub-nucleotide variations of the nucleic acid to permit a much more sensitive and informative analysis of the translocating nucleic acid.

Overview

As described herein, nanopores such as the protein nanopore Mycobacterium smegmatis porin A (MspA) can be used to examine nucleic acids with sub-angstrom resolution. A controlled voltage can be applied to the system, and the ionic current passing around ssDNA within the nanopore is measured to provide detectable signals that correspond to characteristics of the ssDNA. In MspA, this current is influenced by multiple bases in MspA's narrow constriction zone. Translocation of single stranded DNA (ssDNA) through MspA can be facilitated by a DNA polymerase (e.g., phi29) which draws the ssDNA through the nanopore in single nucleotide steps. As described above, varying the voltage stretches the elastic ssDNA bases within MspA's constriction. In the present study, a varying voltage was applied to an MspA nanopore system where ssDNA is translocated in single nucleotide steps by phi29. With the varying voltage, changes in the current response were observed, allowing measurement of position changes of less than 100 pm. This position sensitivity is confirmed by comparing the present results to a modified freely jointed chain (FJC) model, an experimentally verified model that describes DNA elasticity (described above). By stretching the DNA, a scan of the DNA is constructed as if it were moved through the pore in steps much smaller than 1 nt (i.e., near continuous passage). Compared to existing strategies, this yields a significant increase in data regarding the corresponding structure of the ssDNA translocating through the nanopore and, thus, a significant increase in the resolution of the nanopore system for nucleic acid analysis.

Introduction

Leveraging the stretching property of DNA to enhance sensitivity of nanopores

Single stranded DNA is an inherently elastic molecule. The end-to-end length of DNA can be altered by applying forces of less than a piconewton to the ends. Building upon techniques established in Manrao, E.A., "Reading DNA at single-nucleotide resolution with a mutant MspA nanopore and phi29 DNA polymerase," Nature Biotechnology 30:349-353 (2012) and Cherf, G.M., et al, "Automated forward and reverse ratcheting of DNA in a nanopore at 5-Å precision," Nat. Biotechnol. 30:344-348 (2012), each incorporated herein by reference in its entirety, the stretching of ssDNA was observed within MspA upon the application of variable electric potential. phi29 DNA polymerase was used to move an ssDNA template through MspA in single nt steps. At each nt step actuated by phi29 DNA polymerase, a corresponding step in the sequence-specific current was observed. As described herein, ssDNA stretching was observed by changing the force on the ssDNA held by phi29 DNA polymerase. An increased voltage increased the force on the DNA and repositioned the DNA within the MspA constriction zone. Because of MspA's sensitivity to the nucleotide sequence in the constriction zone, sequence-specific current levels at different voltages were used to measure the changing position of ssDNA within the MspA. The change in ssDNA position was compared to the standard model of DNA stretching.

Results

Overview of the effect of shifting DNA within the MspA constriction zone

Enzyme-assisted translocation of a nucleic acid through a nanopore can be advantageous to regulate the translocation rate. This can be essential to sufficiently slow the translocation of the nucleic acid to facilitate the detection of current signal fluctuations that are reflective of the specific sequence of the nucleic acid. Enzymes will move DNA through a nanopore in discrete steps to produce blockade events in the measured current that correspond to each discrete step movement. In contrast, the hypothetically slow but continuous movement of DNA through the nanopore can produce a continuous stream of measurable current corresponding to the passage of DNA at a much finer scale. This measurable current will provide a much higher scale of information regarding the structure of the DNA within the pore.

To illustrate, it should first be noted that nanopores such as MspA have a constriction zone, which is the narrowest portion of the pore. The segment of DNA in the constriction zone at any given time is the portion of the DNA that most influences the current of ions also passing through the pore. FIGURE 14A illustrates the constriction zone (i.e., "DNA sensing zone") of a nanopore such as MspA, where the influence of the nucleotide position within the zone is indicated by the intensity of the shading. The current expected for DNA moving continuously through a nanopore would likely resemble a continuous curve, referred to as a continuous current profile (FIGURE 14B), and would reflect the continuously changing character (i.e., sequence) of the DNA segment within the constriction zone. In contrast, a molecular motor, such as phi29 DNA polymerase, moves the DNA in single nucleotide steps, effectively "sampling" the current profile in 1 nt steps (FIGURE 14C). Any mechanism that repositions the DNA analyte within the nanopore constriction zone (FIGURE 14D) will result in a shift of the measured signal. In a continuous (or near continuous) monitoring of the current signal, the continuous current profile is shifted (FIGURE 14E). The "sampling" of the current profile facilitated by the discrete step movement of a molecular motor will then occur at different locations along the continuous current profile (FIGURE 14F). This change in DNA position is called a "registration shift" and is labeled δ. Such a registration shift is seen directly as an increase (or decrease) in signal on one side of a signal peak, with a corresponding decrease (or increase, respectively) on the other side. An object of the present study was to determine whether DNA experienced registration shifts with the application of different electric forces during the enzyme-assisted translocation of the DNA through the nanopore.

Current observed at constant force

A DNA sequence, "seq1", was moved by phi29 DNA polymerase through MspA at different voltages, resulting in different forces acting on the DNA as it translocated. FIGURE 15A illustrates a plot of the observed current levels associated with seq1 for voltages of 160 mV and 180 mV. Comparing the sequence-specific currents, a change in current pattern is observed. The change in the current pattern is explained by a change in the position of ssDNA relative to MspA's constriction zone. Taking the ssDNA inter-phosphate distance to be Lss = 6.9 Å (Murphy, M.C., et al, "Probing Single-Stranded DNA Conformational Flexibility Using Fluorescence Spectroscopy," Biophysical Journal 86(4):2530-2537 (2004) and Bosco et al, "Elastic properties and secondary structure formation of single-stranded DNA at monovalent and divalent salt conditions," Nucleic Acids Research 42(3):2064-2074 (2014), each reference incorporated herein by reference in its entirety), each nucleotide was mapped to a distance moved within the pore. Adding a registration shift, δ = 1.1 Å, to the positions observed at 180 mV put the observations in line with the current levels at 160 mV (FIGURE 15B). The same registration shift was observed for a different DNA sequence (not shown).

Current observed at continuously varying voltage

The current of DNA in MspA was examined while applying a triangle wave 50 mV-250 mV peak-to-peak voltage at 200 Hz. Discrete changes were examined in the I-V response of the system, indicating phi29 DNA polymerase-controlled DNA movement. To identify stepping due to phi29 DNA polymerase-controlled motion of the ssDNA, a custom "level" finder was employed to find changes in the I-V response. In this regard, a "level" refers to a discrete subset of current signals that are associated with a single translocation step of the ssDNA through the pore and, thus, the sequence of a discrete segment of the nucleic acid residing in the constriction zone at the time of the level. The step in this case is facilitated by the molecular motor action of phi29 DNA polymerase. This level finder is a modification of the level finder described in Laszlo et al., "Decoding long nanopore sequencing reads of natural DNA," Nature Biotechnology 32:829-833 (2014), incorporated herein by reference in its entirety. See Methods section below for additional detail. For each one of these levels, the current observed under the continuously varied voltage within 1 mV of 180 mV and 160 mV was compared to the current observed at the corresponding constant voltage. The currents observed for a continuously varying voltage are not distinguishable from currents observed at the different, constant voltages (not shown).
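The comparison "within 1 mV" of a target voltage can be pictured with a small sketch. Everything below is an illustrative assumption rather than the software used in the study: the sweep is read as running from 50 mV to 250 mV, the 50 kHz sampling rate is hypothetical, the current is a made-up ohmic response, and the function names are invented for this example.

```python
def triangle_wave_mv(t_s, v_min=50.0, v_max=250.0, freq_hz=200.0):
    """Voltage (mV) of a symmetric triangle wave at time t_s (seconds)."""
    phase = (t_s * freq_hz) % 1.0
    frac = 2.0 * phase if phase < 0.5 else 2.0 * (1.0 - phase)
    return v_min + (v_max - v_min) * frac


def mean_current_near(voltages_mv, currents_pa, target_mv, tol_mv=1.0):
    """Average the current samples whose voltage lies within tol_mv of target_mv."""
    picked = [i for v, i in zip(voltages_mv, currents_pa)
              if abs(v - target_mv) <= tol_mv]
    if not picked:
        return None, 0
    return sum(picked) / len(picked), len(picked)


SAMPLE_RATE_HZ = 50_000.0  # hypothetical acquisition rate

# One full sweep period (5 ms at 200 Hz) of synthetic data.
times = [n / SAMPLE_RATE_HZ for n in range(250)]
voltages = [triangle_wave_mv(t) for t in times]
currents = [0.5 * v for v in voltages]  # hypothetical ohmic response, pA

mean_180, count_180 = mean_current_near(voltages, currents, 180.0)
print(f"{count_180} samples near 180 mV, mean current {mean_180:.1f} pA")
```

In practice the selection would be applied separately within each level, so that each level yields one current estimate per voltage of interest.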

Measuring the DNA registration shift

The sequence-specific currents observed at different voltages were used to measure the registration shift, such as a shift represented in FIGURE 16A. First, a small DNA registration shift Δδ(V1, ΔV = V2 - V1) was estimated for a small change in voltage, ΔV, from an initial voltage V1 to a final voltage V2 (see Methods). The total DNA registration shift observed when increasing the voltage from V0 to V is obtained by adding small position shifts of the form Δδ(V1, ΔV) from the lowest observed voltage (V0 = 100 mV) to a final voltage V. This gives the phase shift as:

δ(V) = Σ Δδ(V1, ΔV), summed over V1 from 100 mV up to V in steps of ΔV [Eqn 5]

Choosing ΔV = 1 mV, the resulting δ(V) shows δ increasing with voltage, though by gradually smaller amounts at higher voltages (FIGURE 16B). Different choices of ΔV do not change the results (not shown).

DNA registration shift is caused by DNA elongation

The observed voltage-induced DNA registration shift was compared to the prediction that ssDNA stretches under the electric forces within the MspA. In bulk, ssDNA elasticity is well understood. The end-to-end distance of ssDNA, labeled x, can be well modeled by a freely jointed chain (FJC) modified to allow Kuhn segments that can stretch under force (Smith, S.B., et al, "Overstretching B-DNA: The Elastic Response of Individual Double-Stranded and Single-Stranded DNA Molecules," Science 271(5250):795-799 (1996), incorporated herein by reference in its entirety). For forces below ~20 pN, the relation between x and an applied force, F, can be approximated as:

x = L * [1 - (kbT / (F * b))] [Eqn 6]

Here, L is the ssDNA contour length, b is the Kuhn length, kb is the Boltzmann constant, and T is the temperature. The present system, illustrated in FIGURE 16C with the appropriate variables identified, is different in that x is fixed to be the distance between the polymerase and the high force region of the constriction, while the number of nucleotides changes with voltage, namely, N = N(V) (see FIGURE 16B). In the calculation, x is denoted as a, and L = N*Lss, where Lss is the inter-phosphate distance. The applied force is estimated as:

F = V * Nc * q / h [Eqn 7] where V is the applied voltage, Nc is the number of nucleotides in the constriction, and q is the effective charge per nucleotide. The curvature of DNA is considered constant throughout the pore, giving:

Nc(V)/N(V) = h/a [Eqn 8].

Eqns 6-8 are used to solve for N(V), giving N(V) = a/Lss + kbT/(q*V*b/a). A phase shift between two voltages, V1 and V2, is represented as a change in N, or:

ΔN(V1,V2) = (kbT/Q)*(1/V1 - 1/V2) [Eqn 9]

where Q = q*b/a is an effective charge.
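The algebra connecting Eqns 6-8 to Eqn 9 is compact enough to write out. The following derivation is a reader's reconstruction using only the symbols defined above (with x = a and L = N L_ss); it is offered as a consistency check rather than as the derivation given in the original work.

```latex
\begin{align}
a &= N L_{ss}\left[1 - \frac{k_b T}{F\,b}\right]
  && \text{(Eqn 6 with } x = a,\ L = N L_{ss}\text{)} \\
F &= \frac{V N_c\, q}{h} = \frac{V N q}{a}
  && \text{(Eqn 7 with } N_c/N = h/a \text{ from Eqn 8)} \\
a &= N L_{ss} - \frac{L_{ss}\, k_b T\, a}{q\, V\, b}
  && \text{(substituting } F \text{ into the first line)} \\
N(V) &= \frac{a}{L_{ss}} + \frac{k_b T}{q\, V\, b / a} \\
\Delta N(V_1, V_2) &= N(V_1) - N(V_2)
  = \frac{k_b T}{Q}\left(\frac{1}{V_1} - \frac{1}{V_2}\right),
  \qquad Q = \frac{q\, b}{a}
\end{align}
```

The voltage-independent term a/L_ss cancels in the difference, which is why the fit depends on the single parameter Q.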

As shown in FIGURE 16B, the observed δ(V) was fit to the single-parameter equation [Eqn 9], demonstrating that the observations are well-described by the modified FJC model. This indicates that the registration shift induced by changing the electric force is caused by elongation of the ssDNA. The resulting fit parameter in FIGURE 16B gives the effective charge as Q = 0.12 +/- 0.01 e. While the modified FJC was derived to model ssDNA stretching in non-confined environments, it accurately describes ssDNA stretching in the confined environment of MspA. This suggests that any nonspecific interactions of the DNA with MspA do not considerably impact the ssDNA elasticity in the force ranges that were examined.
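A short numerical sketch shows how Eqn 9 behaves with the reported effective charge. It assumes that kbT is evaluated at 23 °C, that voltages are expressed in volts so that kbT/(Q*V) is dimensionless, and that a shift in nucleotides is converted to a distance with the 6.9 Å inter-phosphate distance quoted earlier. The closed-form least-squares fit and the function names are illustrative choices, not the fitting procedure used in the study, and the fit is exercised only on noiseless synthetic data.

```python
K_B = 1.380649e-23           # J/K, Boltzmann constant
E_CHARGE = 1.602176634e-19   # C, elementary charge
LSS_ANGSTROM = 6.9           # inter-phosphate distance quoted in the text


def thermal_voltage(temp_c=23.0):
    """kbT divided by the elementary charge, in volts."""
    return K_B * (temp_c + 273.15) / E_CHARGE


def delta_n(v1, v2, q_eff, temp_c=23.0):
    """Eqn 9: change in nucleotide count between voltages v1 and v2 (volts)."""
    return thermal_voltage(temp_c) / q_eff * (1.0 / v1 - 1.0 / v2)


def fit_q_eff(v_ref, voltages, shifts_nt, temp_c=23.0):
    """One-parameter least-squares fit of Eqn 9, which is linear in 1/Q."""
    u = [1.0 / v_ref - 1.0 / v for v in voltages]
    slope = sum(x * y for x, y in zip(u, shifts_nt)) / sum(x * x for x in u)
    return thermal_voltage(temp_c) / slope


# Shift predicted between 160 mV and 180 mV for Q = 0.12 e.
shift_nt = delta_n(0.160, 0.180, 0.12)
shift_angstrom = shift_nt * LSS_ANGSTROM
print(f"predicted shift ~ {shift_angstrom:.2f} Angstrom")

# Recover Q from noiseless synthetic shifts measured relative to 100 mV.
volts = [0.12, 0.14, 0.16, 0.18, 0.20]
synthetic = [delta_n(0.100, v, 0.12) for v in volts]
q_fit = fit_q_eff(0.100, volts, synthetic)
```

Under these assumptions the predicted shift between 160 mV and 180 mV is about 1.0 Å, comparable to the δ = 1.1 Å registration shift reported for the constant-voltage comparison.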

Because the present results fit well to Eqn 9, it is also noted that these registration shifts are unlikely to be influenced by direct or indirect voltage-induced deformation of phi29 DNA polymerase or MspA. However, to ensure that the results were not system-dependent, two additional studies were performed that support the conclusion that DNA elongation is the primary cause of the registration shift. First, a different DNA sequence was examined with motion controlled by the phi29 DNA polymerase, and the same voltage-dependent registration shifts were found. Next, MspA was used to examine different DNA sequences anchored to a bulky NeutrAvidin-biotin molecule and stretched at different voltages (previously presented). The elongation in these other systems was found to be indistinguishable from the results presented above.

Creating a sub-nucleotide-scale current profile for DNA moving through MspA

The voltage-induced sub-Å registration shifts were exploited to construct a "continuous current profile" of DNA going through MspA, akin to the diagram shown in FIGURE 14B. To create this profile, an empirical transformation was used to estimate the current at a constant voltage, referred to as the effective current. Next, for each level a physical distance was calculated to map the applied voltage to a registration shift. For each level, a distance of 5.5 Å, due to phi29 DNA polymerase stepping, was added to the calculated registration shifts, giving a "scanning distance". The scanning distance was plotted versus the estimated current, yielding current profiles. See FIGURE 22B for an exemplary series of such current profiles representing sub-nucleotide scans of DNA.

Discussion

The sub-Angstrom position of nucleotides held within MspA's constriction was modified using variable force. Distance changes on the order of 30 pm were detected, and a voltage-distance curve was obtained with 25 pm precision. This distance curve was fit to a model of ssDNA stretching, the FJC, with a single parameter. The accuracy with which this simple model described the observations confirms DNA stretching. It was observed that a change in position of the DNA sequence within the MspA constriction changes the current values considerably: shifts of 30 pm yield observable current differences on the order of 0.6 pA, for a current-distance sensitivity ratio of 2 pA/Å or higher. In these experiments, an increased force was used to change the DNA position within the constriction.

Implications for nanopore sequencing

Other mechanisms might also cause a DNA registration shift, such as DNA secondary structure, DNA-pore interaction, DNA-enzyme interactions and other experimental conditions like temperature and buffer viscosity or salinity. To increase nanopore sequencing accuracy, any registration shifts must be minimized or accounted for. The present method enables a seemingly continuous "scan" of DNA as it moves through MspA. The recreation of such a continuous scan can facilitate detection of any registration shifts, such as forward skips or backward toggles. Once detected, they can be addressed to infer any missing data.

Implications for nanopore sensing techniques

Position changes of DNA can be detected within MspA that are on the order of 30 pm. In principle, this expanded sensitivity of nanopores, such as MspA, can be directly used as a high throughput single-molecule tool to detect sub-Angstrom characteristics of nucleic acid analytes.

Methods

System

Nanopore/translocation techniques were performed as generally described above and in Manrao, E.A., "Reading DNA at single-nucleotide resolution with a mutant MspA nanopore and phi29 DNA polymerase," Nature Biotechnology 30:349-353 (2012), incorporated herein by reference in its entirety. In short, a single MspA pore was formed within an unsupported phospholipid bilayer and phi29 DNA polymerase was used as a molecular motor to control the motion of DNA.

Room temperature buffer (23 ± 1 °C) containing 150 mM KCl in the cis well and 500 mM KCl in the trans well, or 300 mM KCl, with 10 mM HEPES buffered at pH 8.00 ± 0.05, was used. Currents were recorded on an Axopatch 200B amplifier with custom LabVIEW software (National Instruments) at varying voltages as described in the text. DNA material was acquired from PAN Laboratories (Stanford University, Stanford, CA).

Analysis

To examine the data, custom-developed software implemented in Matlab (The Mathworks) and in Java was used. From the recorded current traces, blockages where DNA motion was controlled by the phi29 DNA polymerase were first identified. Capacitive currents not removed by hardware-implemented compensatory potentials were removed from the observed current traces. Step detection software that identified significant changes in I-V response due to polymerase-controlled DNA motion was employed to identify the step-wise movements of the ssDNA associated with the phi29 DNA polymerase, as reflected in the distinct levels of the current signal patterns. See Laszlo et al., "Decoding long nanopore sequencing reads of natural DNA," Nature Biotechnology 32:829-833 (2014), incorporated herein by reference in its entirety. The algorithm in Laszlo et al. (2014) was modified with the addition of a log-probability of similarity (see Laszlo et al. (2014) supplement for definition of log-probability) determined at different voltages. The added information from different voltages allowed robust determination of DNA steps (or current "levels") attributed to phi29-associated movements. Other techniques to determine these level changes in the I-V response can readily be devised by those skilled in the art. Levels were filtered based on (1) the duration of the step and (2) similarity between sequential steps, and (3) anomalously noisy levels, usually caused by pore gating, were removed. Once the levels, or ssDNA steps, were identified, additional analysis was performed.
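The three level-filtering criteria listed above can be illustrated with a minimal Python sketch. This is not the software used in the study. The `Level` record, the function name `filter_levels`, all three threshold values, and the choice to merge (rather than drop) a level that is too similar to its predecessor are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass
class Level:
    """One candidate current level (hypothetical record layout)."""
    currents: list    # current samples belonging to this level, in pA
    duration: float   # time spent in this level, in seconds


def filter_levels(levels, min_duration=0.001, min_delta=0.5, max_noise=3.0):
    """Filter levels by duration, similarity to the previous level, and noise.

    All thresholds are illustrative placeholders, not values from the study.
    """
    kept = []
    for lvl in levels:
        # (1) discard levels that are too short to be a motor-controlled step
        if lvl.duration < min_duration:
            continue
        # (3) discard anomalously noisy levels (e.g., caused by pore gating)
        if pstdev(lvl.currents) > max_noise:
            continue
        # (2) a level too similar to the previously kept level is merged into it
        if kept and abs(mean(lvl.currents) - mean(kept[-1].currents)) < min_delta:
            prev = kept[-1]
            kept[-1] = Level(prev.currents + lvl.currents,
                             prev.duration + lvl.duration)
            continue
        kept.append(lvl)
    return kept
```

In practice the thresholds would have to be tuned to the sampling rate and noise of the recording, and a statistical similarity measure (such as the log-probability mentioned above) could replace the simple difference of means used here.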

Using the sequence-specific I-V responses, the DNA registration shift Δδ was estimated. This was done in several steps. Initially, the current at a given voltage was converted to the anticipated current at 180 mV as described below. Next, for each step, the average current in 1 mV bins was determined for each sequential voltage bin. Finally, the registration shift that would minimize the difference between a spline interpolant of the currents at one voltage and the observed currents at the next (higher) voltage was found.
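The binning step described above (averaging the current of one level within 1 mV voltage bins) can be sketched as follows. The function name `bin_current_by_voltage` and the use of mV and pA as units are assumptions made for illustration only.

```python
from collections import defaultdict


def bin_current_by_voltage(voltages_mV, currents_pA, bin_width_mV=1.0):
    """Average the current samples of one level within fixed-width voltage bins.

    Returns a dict mapping the lower edge of each occupied bin (in mV)
    to the mean current observed in that bin.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for v, i in zip(voltages_mV, currents_pA):
        b = int(v // bin_width_mV)   # index of the bin containing v
        sums[b] += i
        counts[b] += 1
    return {b * bin_width_mV: sums[b] / counts[b] for b in sorted(sums)}
```

Applied to every level in a read, this gives one averaged I-V response per level, from which a current trace at any single binned voltage can be read off across levels.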

Estimating registration shifts

The registration shift, Δδ(V1, ΔV = V2 - V1), caused by a change in voltage from V1 to V2 was estimated using the I-V responses for each level. At each voltage, binned in units of 1 mV, the average current (within these 1 mV bins) was found, giving voltage-dependent traces. Example traces at 160 mV and 180 mV are shown in FIGURE 15A and in FIGURE 16A. A third-order spline was fit to each of these voltage-dependent current traces. To find the registration shift between two voltages, V1 and V2, the shift that would minimize the difference between the two spline interpolants for the current traces at V1 and V2 was found (FIGURE 15B).
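The shift-finding step can be illustrated with a short, dependency-free Python sketch. Several simplifications are assumed here and are not taken from the study: linear interpolation stands in for the third-order spline, the minimum is located by a brute-force grid search, the difference is measured as a mean squared error, and the sign convention (the trace at V2 at level n is compared with the trace at V1 evaluated at n + shift) is chosen arbitrarily. The function names are hypothetical.

```python
def interp_linear(ys, x):
    """Linearly interpolate ys, sampled at positions 0..len(ys)-1, at x >= 0."""
    lo = int(x)
    if lo >= len(ys) - 1:
        return ys[-1]
    frac = x - lo
    return ys[lo] * (1.0 - frac) + ys[lo + 1] * frac


def estimate_shift(trace_v1, trace_v2, max_shift=0.5, step=0.01):
    """Grid-search the shift (in units of levels, i.e., nt) that best maps
    the trace at V1 onto the trace at V2 in the least-squares sense."""
    n_levels = len(trace_v1)
    n_steps = int(round(max_shift / step))
    best_shift, best_cost = 0.0, float("inf")
    for k in range(-n_steps, n_steps + 1):
        shift = k * step
        sq_sum, count = 0.0, 0
        for n in range(n_levels):
            x = n + shift
            if 0 <= x <= n_levels - 1:   # ignore points shifted off the trace
                sq_sum += (interp_linear(trace_v1, x) - trace_v2[n]) ** 2
                count += 1
        if count == 0:
            continue
        cost = sq_sum / count
        if cost < best_cost:
            best_shift, best_cost = shift, cost
    return best_shift
```

A spline interpolant and a continuous optimizer would give a smoother cost function and finer resolution than this grid search; the sketch only shows the structure of the minimization.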

Sub-nucleotide current profile

To create the sub-nucleotide current-profile of DNA translocating through MspA, two empirically determined steps were employed. First, the voltage of each step was converted to a physical distance, using the observations shown in FIGURE 7 and FIGURE 16B, their fits to Eqn 7, and the knowledge that each level corresponded to motor-enabled steps of 1 nt. Second, for each level the current at a given voltage was converted to the anticipated current at 180mV.

Constant voltage current transformation

The non-ohmic resistance of the system (see, e.g., FIGURE 4, described above) was removed to transform an observed current at a given voltage to the expected current at a constant voltage. This current transformation emulates the current that would be found if the DNA nucleotides were positioned at the given location while at a constant 180 mV. The transformation was derived using I-V measurements of homopolymer DNA (homopolyA, homopolyT, and homopolyC) conjugated to a bulky NeutrAvidin-biotin. Computing R = V/I provided three curves, R_A(V), R_C(V), and R_T(V). Each of these curves was fit to a quadratic function, R(V) = X0 + X1*V + X2*V^2, resulting in a set of parameters for each homopolymer: {X_A}, {X_C}, {X_T}. The coefficient values {X_A}, {X_C}, {X_T} were found to scale with the observed resistance at 180 mV, R_A(180 mV), R_C(180 mV), and R_T(180 mV). Linear fits were made to each of these, parameterizing the coefficients of this class of curves by a resistance-valued parameter, r. It is noted that different salt conditions alter the transformation function. For [KCl] = 300 mM on cis and trans, it was found:

X0(r) = 3.691*r - 5.171

X1(r) = (-2.123*r + 4.200)/100

X2(r) = (3.488*r - 7.375)/10^5 [Eqns 10]

With the components described, a transformation was created that converted a current-voltage observation (I_obs, V_obs) into an estimated current I_180 for a 180 mV driving voltage. To do so, the following equation was solved for r, using Eqns 10:

V_obs/I_obs = X0(r) + X1(r)*V_obs + X2(r)*V_obs^2 [Eqn 11]

With the r determined in Eqn 11, the following equation was solved for I_180:

180/I_180 = X0(r) + X1(r)*180 + X2(r)*180^2 [Eqn 12]
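Because each coefficient in Eqns 10 is linear in r, Eqn 11 is itself linear in r and can be solved in closed form before substituting into Eqn 12. The following Python sketch illustrates this, using the coefficients stated above for 300 mM KCl on cis and trans. The function names are hypothetical, and the units (mV for voltage, pA for current, so that resistance is in GΩ) are an assumption, since the text does not state the units of the fitted resistance.

```python
# Eqns 10 written as X_k(r) = (a_k * r + b_k) / scale_k for k = 0, 1, 2
# (values for [KCl] = 300 mM on cis and trans).
EQNS_10 = (
    (3.691, -5.171, 1.0),
    (-2.123, 4.200, 100.0),
    (3.488, -7.375, 1.0e5),
)


def resistance_terms(v_mV):
    """Return (A, B) such that R(V) = X0(r) + X1(r)*V + X2(r)*V^2 = A*r + B."""
    A = sum(a / s * v_mV ** k for k, (a, b, s) in enumerate(EQNS_10))
    B = sum(b / s * v_mV ** k for k, (a, b, s) in enumerate(EQNS_10))
    return A, B


def current_at_180mV(i_obs_pA, v_obs_mV):
    """Convert an observed (current, voltage) pair to the estimated current
    at a 180 mV driving voltage (assumed units: pA and mV)."""
    A_obs, B_obs = resistance_terms(v_obs_mV)
    r = (v_obs_mV / i_obs_pA - B_obs) / A_obs    # Eqn 11 solved for r
    A_180, B_180 = resistance_terms(180.0)
    return 180.0 / (A_180 * r + B_180)           # Eqn 12 solved for I_180
```

As a consistency check on the stated coefficients, `resistance_terms(180.0)` evaluates to approximately (1.000, -0.001), so the parameter r is, to a good approximation, the resistance at 180 mV itself, and an observation already taken at 180 mV is returned unchanged.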

The resulting function I_180(I_obs, V_obs) was found to linearize the observed R_A(V), R_C(V), and R_T(V) (FIGURE 21). Converting the constant voltage to distance, as described, yielded a sub-nucleotide current profile. See FIGURES 22A and 22B. It is noted that the constant voltage of 180 mV was chosen out of convenience, and that any voltage choice may yield a similar transformation. It will be appreciated by skilled artisans that other current transformations can be employed to provide a method to convert a current and voltage pair into a current at a specified constant voltage.

The following is a description of a study using an MspA nanopore in a reverse orientation to reduce the Brownian motion of the nucleic acid within the pore during translocation.

Overview

As described above, analysis of the ssDNA within the nanopore using the application of varying electric potentials revealed that Brownian motion of the ssDNA strand contributed to the extent to which multiple nucleotides affected the ionic current through the nanopore at any given time. Specifically, Brownian motion caused nucleotides around the constriction to move in and out of the electric field, making the number of nucleotides affecting the DNA-pore resistance larger than expected for a purely stationary system. Accordingly, as described herein, a study was performed by reversing the orientation of the MspA nanopore within the system with the intent to reduce the Brownian motion of ssDNA analytes within the nanopore constriction zone. By reducing the Brownian motion, the length of ssDNA (i.e., the number of nucleotide positions) affecting the current signal was minimized, improving the resolution of the system. Thus, the resulting current signals measured at any given time represented a shorter combination of sequential nucleotide residues. Furthermore, it was demonstrated that measured signals from homopolymers of different nucleotides were significantly more distinguishable, facilitating a simplified deconvolution process to associate any given current measurement with a specific nucleotide sequence.

Results/Discussion

As described herein, MspA is a proteinaceous nanopore that has been successfully applied to systems for analysis of nucleic acid analytes. As illustrated in FIGURE 1, MspA (30) generally has a conical shape with the entrance rim (40) creating the widest diameter of the internal tunnel of the nanopore. The internal "vestibule" (50) possesses a conical shape where the diameter generally decreases along the axis when proceeding from the entrance rim towards the constriction zone (60). The vestibule (50) is connected to the constriction zone (60) on the opposite end of the nanopore from the entrance rim (40), the constriction zone (60) generally providing the narrowest portion of the internal tunnel of the MspA nanopore (30). It is the segment of the ssDNA residing in the constriction zone (60) at any given moment that influences the ionic current passing through the pore, and thus determines the measurable signal in the system.

Initial experiments to characterize MspA as a potential nanopore for nucleic acid-related applications indicated a preferred orientation of MspA, referred to herein as "forward" orientation, wherein the entrance rim was in contact with the cis compartment and the rim proximate to the constriction zone was in contact with the trans compartment (see FIGURE 23, left diagram). Thus, from cis to trans, the MspA nanopore tunnel in the "forward" orientation is comprised of the entrance rim, the vestibule, and the constriction zone portions. First, the forward orientation for MspA was previously preferred because the typical methodologies for forming and embedding MspA in a lipid bilayer membrane resulted in a vast majority (around 80%) of MspA nanopores inserting into the membrane in the forward orientation. Briefly, such methodologies generally followed the methodology described in Butler, T.Z., et al., Proc. Natl. Acad. Sci. USA 105:20647-20652 (2008), where lipid bilayers were formed across a horizontal ~20 μm diameter Teflon aperture, with the ~60 μl compartments on both sides of the bilayer containing experimental buffer. An Axopatch 200B integrating patch clamp amplifier (Axon Instruments) applied a 180 mV voltage across the bilayer (trans side positive) and measured any current. The MspA monomers were added to the grounded cis compartment to yield a concentration of ~2.5 ng/ml. Once a single pore inserted, the compartment was flushed with experimental buffer to avoid further insertions.

Second, the forward orientation for MspA was previously preferred because when MspA nanopores inserted into the bilayer membrane in the "reverse" orientation, i.e., with the constriction zone more proximate to the cis compartment and the vestibule and entrance rim more proximate to the trans compartment (see FIGURE 23, right diagram), the nanopore exhibited significant gating, or spontaneous changes of conductance that would be difficult to distinguish from current events associated with an analyte. Gating typically increases in proteinaceous nanopores with increasing voltages and results in a decrease in the conductivity of the nanopore. This trend was observed to be especially pronounced for MspA in the reverse orientation. As the voltage applied to an MspA embedded in the reverse orientation increased (without nucleic acid analytes present), the frequency of non-analyte gating increased. The most drastic transition was observed between applications of 80 mV and 100 mV, although the frequency and severity of the gating events continued to increase past an applied potential of 140 mV (not shown). Such a tendency to gate is typically a disfavored characteristic for such nanopores because it possibly reflects aberrations in the structure of the nanopore and suggests inconsistent passage of analytes or ions through the pore, decreasing its utility for sequencing. Furthermore, as described, the gating signals are often difficult to distinguish from true conductance events associated with a nucleic acid analyte. Accordingly, MspA nanopores inserting into the bilayer membranes in the reverse orientation were typically discarded for purposes of ssDNA analysis.
However, the present study surprisingly found that MspA nanopores in the reverse orientation 1) could be stably formed in a bilayer membrane with increased frequency and 2) could function thereafter without the gating problems previously observed and provide an additional benefit of reduced Brownian motion in the nucleic acid, leading to a more clearly resolved current signal.

Briefly, the above protocol for forming MspA nanopores in bilayer membranes was modified by pre-mixing the MspA monomers with the bilayer membrane components prior to forming the bilayer membrane across the Teflon aperture. Thus, the MspA monomers assembled into an MspA nanopore within the bilayer as the membrane itself assembled between the cis and trans compartments. With this approach, an MspA nanopore was roughly equally likely to be in the forward or the reverse orientation. The reverse orientation could be readily confirmed by preliminary characterization of the current to voltage relationship of the pore. FIGURE 24 illustrates the current (I) to voltage (V) relationship for a backward (i.e., reverse orientation) pore and two forward orientation pores. Both of the forward pores had a similar current to voltage relationship and the backward pore had a distinct current to voltage relationship, thus enabling the identification of the pore orientation within a membrane. Furthermore, reversing the electrode poles and applied voltage on the forward pores revealed that the resulting current to voltage relationship was nearly indistinguishable from the relationship previously observed for the backward pore. See FIGURE 24, dashed lines. Thus, reverse orientation of MspA can be confirmed by the observed current to voltage relationship resulting therefrom.

The present study also surprisingly revealed that MspA nanopores in the reverse orientation (see FIGURE 23, right diagram) did not exhibit the prior problems with gating events when analyte was present within the pore. Briefly, ssDNA analyte was introduced into the cis compartment of the nanopore system. The ssDNA was associated with phi29 DNA polymerase to provide a molecular brake to the ssDNA and regulate its translocation rate through the nanopore in single nucleotide steps. FIGURES 25A and 25B illustrate a comparison between the non-analyte gating of the reverse nanopore (FIGURE 25A) and nucleic acid-associated current events observed with the same reverse nanopore (FIGURE 25B). Without being bound by any particular theory, the MspA in a reverse orientation permits a much reduced distance between the constriction zone of the MspA and the molecular brake that regulates the translocation of the linear nucleic acid through the pore. This reduced distance (see, e.g., the comparison in FIGURE 23) is believed to result in a reduced Brownian motion of the nucleic acid within the constriction zone region. The reduction of Brownian motion minimizes the effective nucleic acid segment length that influences the observed current flow and, thus, simplifies the association between the observed current levels and the structure of the corresponding segment of the nucleic acid. This is supported by various analyses. In a first "autocorrelation" approach, the current signals observed from a reverse nanopore and a forward nanopore are each analyzed for the similarity in adjacent current readings. In this regard, in MspA, where a current signal is influenced by a segment of nucleic acid > 1 nucleotide long (in the constriction zone), immediately adjacent current readings are expected to be similar to each other.
However, as the distance between the current readings (i.e., the "lag" between the reference current and the compared current) increases, the similarity of the currents is expected to decrease. FIGURE 26A demonstrates that the similarity, or "autocorrelation", between nearby currents in a reverse pore declines more steeply than between nearby currents in a forward pore. Thus, the averaging effect in a reverse pore results from a shorter sequence of nucleotides of the nucleic acid, and fewer nucleotides determine the measured current signal. In a second, "mutual information" approach described in Ross, B.C., "Mutual Information between Discrete and Continuous Data Sets," PLoS ONE 9(2):e87357 (2014), incorporated herein by reference in its entirety, the width of the effective constriction zone is modeled using the observed levels and aligning to the actual sequence of the nucleic acid, as illustrated in FIGURE 26B. See also Laszlo et al., "Decoding long nanopore sequencing reads of natural DNA," Nature Biotechnology 32:829-833 (2014), incorporated herein by reference in its entirety. As illustrated, the effective width (i.e., phase) of the reverse nanopore is narrower than in the forward pore, indicating that fewer nucleotides contribute to the observed current signals when the MspA is in the reverse orientation. This supports the hypothesis that a reduced nucleic acid length between the molecular brake (e.g., phi29 DNA polymerase) and the constriction zone reduces Brownian motion of the nucleic acid. FIGURE 26C illustrates a comparison of the frequency of different current readings observed for the same nucleic acid using MspA nanopores in the reverse and forward orientations. As illustrated, the MspA nanopore in reverse orientation produces current levels that span a wider range of currents, with a much higher frequency of currents in the high and low end of the measured currents.
This indicates that distinct observed signals correlating to distinct nucleic acid sequences, as provided by a reverse MspA nanopore, are likely to be much more easily distinguished from each other because there is less likelihood that the signals will overlap substantially. In contrast, the forward pore exhibits a more typical bell-curve distribution of recorded currents, such that currents corresponding to distinct nucleic acid subsequences are more likely to overlap in the range of current produced.
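The "autocorrelation" comparison described above (FIGURE 26A) can be illustrated with a minimal Python sketch that computes the normalized autocorrelation of a sequence of level currents as a function of lag, measured in levels. The function name and the particular normalization (by the lag-0 sum of squared deviations) are assumptions made for illustration and are not taken from the study.

```python
from statistics import mean


def autocorrelation(level_currents, max_lag):
    """Normalized autocorrelation of a sequence of level currents.

    Returns a list [rho(0), rho(1), ..., rho(max_lag)], where rho(0) is 1
    and rho(lag) measures the similarity of levels that are `lag` steps apart.
    Requires max_lag < len(level_currents) and a non-constant sequence.
    """
    mu = mean(level_currents)
    dev = [x - mu for x in level_currents]
    total = sum(d * d for d in dev)
    return [
        sum(dev[i] * dev[i + lag] for i in range(len(dev) - lag)) / total
        for lag in range(max_lag + 1)
    ]
```

Under this measure, a pore in which fewer nucleotides contribute to each level would be expected to show a faster fall-off of the autocorrelation with increasing lag, which is the behavior reported for the reverse orientation.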

Conclusion

Accordingly, it is demonstrated that an MspA nanopore can be inserted in a bilayer membrane of the system in a reverse orientation and still provide a useful current signal without an increase in gating frequency. Furthermore, the nanopore system with the MspA nanopore in the reverse orientation minimizes the influence of Brownian motion of the ssDNA analyte, thus generating a signal with higher resolution that reflects a shorter sequence of nucleotides in the constriction zone and provides an enhanced signal differentiation between the distinct nucleotide species in the ssDNA.

While illustrative embodiments have been illustrated and described, it will be appreciated that various changes can be made therein without departing from the spirit and scope of the invention.