Title:
SEQUENCING METHOD
Document Type and Number:
WIPO Patent Application WO/2022/207714
Kind Code:
A1
Abstract:
A method of determining whether a test nucleotide comprises a base complementary to the next base of a template strand immediately downstream of a primer in a primed template nucleic acid molecule is disclosed. A primed template nucleic acid molecule is contacted with a reaction mixture containing a polymerase and a nucleotide labelled with a luminescent marker. If the labelled nucleotide comprises a base complementary to the next base of the template strand then it is incorporated into the primed strand. Incorporation of the labelled nucleotide can be detected by exciting the luminescent marker by multi-photon excitation.

Inventors:
HUMPHRIES MARTIN (GB)
ROBERTS MATTHEW (GB)
Application Number:
PCT/EP2022/058430
Publication Date:
October 06, 2022
Filing Date:
March 30, 2022
Assignee:
CAMBRIDGE DISPLAY TECH LTD (GB)
SUMITOMO CHEMICAL CO (JP)
International Classes:
C12Q1/6869
Domestic Patent References:
WO2020058440A12020-03-26
WO2018060722A12018-04-05
Foreign References:
US20110200989A12011-08-18
US20200263084A12020-08-20
Attorney, Agent or Firm:
GILANI, Anwar (GB)
Claims:
CLAIMS

1. A method of determining whether a test nucleotide comprises a base complementary to the next base of a template strand immediately downstream of a primer in a primed template nucleic acid molecule, the method comprising: providing a primed template nucleic acid molecule; providing a labelled nucleotide comprising the test nucleotide labelled with a luminescent marker comprising a luminescent material; contacting the primed template nucleic acid molecule with a reaction mixture that comprises a polymerase and the test nucleotide, to thereby incorporate the test nucleotide into the primed strand only if the test nucleotide comprises a base complementary to the next base of the template strand; exciting the luminescent marker with an excitation light; and detecting light emitted by the luminescent marker, wherein the detection of emitted light identifies the incorporation of the test nucleotide into the primed strand, and thereby indicates that the test nucleotide comprises a base complementary to the next base of the template strand, and wherein the luminescent marker is excited by multi-photon excitation.

2. The method according to claim 1 wherein the luminescent material is an organic luminescent material.

3. The method according to claim 1 wherein the organic luminescent material is a conjugated polymer.

4. The method according to claim 1 or 2 wherein the luminescent marker comprises a host material wherein the host material is excited by the multi-photon excitation and transfers energy to the luminescent material.

5. The method according to claim 4 wherein the host material is a polymer.

6. The method according to any one of the preceding claims wherein the excitation light has a wavelength of more than 500 nm.

7. The method according to any one of the preceding claims wherein the light emitted by the luminescent marker has a peak wavelength in the range of 400-500 nm.

8. The method according to any one of the preceding claims wherein the luminescent marker is a particulate luminescent marker.

Description:
SEQUENCING METHOD

BACKGROUND

One of the technologies that have improved the study of nucleic acids is the development of fabricated arrays of immobilised nucleic acids. These arrays consist typically of a high-density matrix of polynucleotides immobilised onto a solid support material. Using such arrays, current sequencing methods allow for the parallel processing of millions or even billions of cloned nucleic acids or nucleic acid fragments in a single sequencing run. These high-throughput approaches to nucleic acid analysis are often referred to as massively parallel sequencing, or next generation sequencing (NGS) methods. NGS technologies differ in precise methodology and sequencing chemistry but share the feature of the parallel analysis of clonally amplified nucleic acid template clusters that are spatially separated and immobilised within a flow cell.

One way of determining the nucleotide sequence of a nucleic acid bound to an array is called "sequencing by synthesis" or "SBS". This technique requires the incorporation of the correct nucleotide complementary to that of the nucleic acid being sequenced. Thus, each nucleotide residue is identified as it is incorporated into the growing nucleic acid strand. The incorporated nucleotide is read using an appropriate label attached thereto before removal of the label moiety and the subsequent next round of sequencing. Detection of the label can be carried out using various methods, including luminescence spectroscopy or by other optical means. Generally, the preferred label is a fluorophore, which, after absorption of energy, emits radiation at a defined wavelength. Nevertheless, luminescence-based sequencing instrumentation is typically large and expensive, and improvements in increased throughput capacity and reduced cost are required.

SUMMARY

A summary of aspects of certain embodiments disclosed herein is set forth below. It should be understood that these aspects are presented merely to provide the reader with a brief summary of these certain embodiments and that these aspects are not intended to limit the scope of this disclosure. Indeed, this disclosure may encompass a variety of aspects and/or a combination of aspects that may not be set forth.

In some embodiments, the present disclosure provides a method of determining whether a test nucleotide comprises a base complementary to the next base of a template strand immediately downstream of a primer in a primed template nucleic acid molecule.

The method comprises: providing a primed template nucleic acid molecule; providing a labelled nucleotide comprising the test nucleotide labelled with a luminescent marker comprising a luminescent material; contacting the primed template nucleic acid molecule with a reaction mixture that comprises a polymerase and the test nucleotide, to thereby incorporate the test nucleotide into the primed strand only if the test nucleotide comprises a base complementary to the next base of the template strand; exciting the luminescent marker with an excitation light; and detecting light emitted by the luminescent marker, wherein the detection of emitted light identifies the incorporation of the test nucleotide into the primed strand, and thereby indicates that the test nucleotide comprises a base complementary to the next base of the template strand, and wherein the luminescent marker is excited by multi-photon excitation.

Optionally, the luminescent material is an organic luminescent material.

Optionally, the organic luminescent material is a conjugated polymer.

Optionally, the luminescent marker comprises a host material wherein the host material is excited by the multi-photon excitation and transfers energy to the luminescent material.

Optionally, the host material is a polymer.

Optionally, the excitation light has a wavelength of more than 500 nm.

Optionally, the light emitted by the luminescent marker has a peak wavelength in the range of 400-500 nm.

Optionally, the luminescent marker is a particulate luminescent marker.

DESCRIPTION OF THE DRAWINGS

The invention will now be described in more detail with reference to the drawings wherein:

Figure 1 is a graph of fluorescence produced by 2-photon excitation of an aqueous dispersion of fluorescent nanoparticles with a 425-455 nm bandpass emission filter.

DETAILED DESCRIPTION

Unless the context clearly requires otherwise, throughout the description and the claims, the words "comprise," "comprising," and the like are to be construed in an inclusive sense, as opposed to an exclusive or exhaustive sense; that is to say, in the sense of "including, but not limited to." As used herein, the terms "connected," "coupled," or any variant thereof means any connection or coupling, either direct or indirect, between two or more elements. Additionally, the words "herein," "above," "below," and words of similar import, when used in this application, refer to this application as a whole and not to any particular portions of this application. Where the context permits, words in the Detailed Description using the singular or plural number may also include the plural or singular number respectively. The word "or," in reference to a list of two or more items, covers all of the following interpretations of the word: any of the items in the list, all of the items in the list, and any combination of the items in the list.

The teachings of the technology provided herein can be applied to other methods, not necessarily the methods described below. The elements and acts of the various examples described below can be combined to provide further implementations of the technology. Some alternative implementations of the technology may include not only additional elements to those implementations noted below, but also may include fewer elements.

These and other changes can be made to the technology in light of the following detailed description. While the description describes certain examples of the technology, and describes the best mode contemplated, no matter how detailed the description appears, the technology can be practiced in many ways. Details of the disclosed methods and systems may vary considerably in their specific implementation, while still being encompassed by the technology disclosed herein. As noted above, particular terminology used when describing certain features or aspects of the technology should not be taken to imply that the terminology is being redefined herein to be restricted to any specific characteristics, features, or aspects of the technology with which that terminology is associated. In general, the terms used in the following claims should not be construed to limit the technology to the specific examples disclosed in the specification, unless the Detailed Description section explicitly defines such terms. Accordingly, the actual scope of the technology encompasses not only the disclosed examples, but also all equivalent ways of practicing or implementing the technology under the claims.

To reduce the number of claims, certain aspects of the technology are presented below in certain claim forms, but the applicant contemplates the various aspects of the technology in any number of claim forms.

In the following description, for the purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of implementations of the disclosed technology. It will be apparent, however, to one skilled in the art that embodiments of the disclosed technology may be practiced without some of these specific details.

The disclosed methods may comprise sequencing template fragments derived from a target nucleic acid.

Unless defined otherwise, all technical and scientific terms used herein have the same meaning as is commonly understood by one of ordinary skill in the art. For clarity, the following specific terms have the specified meanings.

The term “nucleic acid” can refer to at least two nucleotide monomers linked together. Examples include, but are not limited to, DNA, such as genomic or cDNA; RNA, such as mRNA, sRNA or rRNA; or a hybrid of DNA and RNA. Thus, a “nucleic acid” is a polynucleotide, such as DNA, RNA, or any combination thereof, that can be acted upon by a polymerizing enzyme during nucleic acid synthesis. The term “nucleic acid” includes single-, double-, or multiple-stranded DNA, RNA and analogs (derivatives) thereof. As apparent from the disclosure below and elsewhere herein, a nucleic acid can have a naturally occurring nucleic acid structure or a non-naturally occurring nucleic acid analog structure. A nucleic acid can contain phosphodiester bonds; however, in some embodiments, nucleic acids may have other types of backbones, comprising, for example, phosphoramide, phosphorothioate, phosphorodithioate, O-methylphosphoroamidite and peptide nucleic acid backbones and linkages. Nucleic acids can have positive backbones, non-ionic backbones, and non-ribose based backbones. Nucleic acids may also contain one or more carbocyclic sugars. The nucleic acids used in methods or compositions herein may be single stranded or, alternatively, double stranded, as specified. In some embodiments a nucleic acid can contain portions of both double stranded and single stranded sequence, for example, as demonstrated by forked adapters. A nucleic acid can contain any combination of deoxyribo- and ribo-nucleotides, and any combination of bases, including uracil, adenine, thymine, cytosine, guanine, inosine, xanthine, hypoxanthine, isocytosine, isoguanine, and base analogs such as nitropyrrole (including 3-nitropyrrole) and nitroindole (including 5-nitroindole), etc.

A “template nucleic acid” is a nucleic acid to be detected or sequenced using any sequencing method disclosed herein. As used herein, a “primed template nucleic acid” (or alternatively, “primed template nucleic acid molecule”) is a template nucleic acid primed with (i.e., hybridized to) a primer, wherein the primer is an oligonucleotide having a 3’-end with a sequence complementary to a portion of the template nucleic acid. The primer can optionally have a free 5’-end (e.g., a portion of the primer being non-hybridized with the template), be fully hybridized to the template or can be continuous with the template (e.g., via a hairpin structure). The primed template nucleic acid includes the complementary primer and the template nucleic acid to which it is bound. Unless explicitly stated, a primed template nucleic acid can have either a 3’-end that is extendible by a polymerase, or a 3’-end that is blocked from extension. In preferred embodiments, genomic DNA fragments, or amplified copies thereof, are used as the target nucleic acid. In other preferred embodiments, mitochondrial or chloroplast DNA is used. Other embodiments are targeted to RNA or derivatives thereof such as mRNA or cDNA.

The term “nucleotide sequence” is intended to refer to the order and type of nucleotide monomers in a nucleic acid polymer. A nucleotide sequence is a characteristic of a nucleic acid molecule and can be represented in any of a variety of formats including, for example, a depiction, image, electronic medium, series of symbols, series of numbers, series of letters, series of colors, etc. A series of “A,” “T,” “G,” and “C” letters is a well-known sequence representation for DNA that can be correlated, at single nucleotide resolution, with the actual sequence of a DNA molecule. A similar representation is used for RNA except that “T” is replaced with “U” in the series.

A “nucleotide” is a molecule that includes a nitrogenous base, a five-carbon sugar (ribose or deoxyribose), and at least one phosphate group. The term embraces, but is not limited to, ribonucleotides, deoxyribonucleotides, nucleotides modified to include exogenous labels or reversible terminators, and nucleotide analogs. The test nucleotide is preferably a native nucleotide. A “native” nucleotide refers to a naturally occurring nucleotide that does not include an exogenous label (e.g., a luminescent dye, or other label) or chemical modification such as may characterize a nucleotide analog. The term “dNTP” refers to any deoxyribonucleotide triphosphate, and a dNTP for use in the disclosed method may comprise a native nucleotide. Examples of native nucleotides that may be used as a test nucleotide in the disclosed methods include: dATP (2’-deoxyadenosine-5’-triphosphate); dGTP (2’-deoxyguanosine-5’-triphosphate); dCTP (2’-deoxycytidine-5’-triphosphate); dTTP (2’-deoxythymidine-5’-triphosphate); and dUTP (2’-deoxyuridine-5’-triphosphate). The test nucleotide may be a nucleotide analog.
A “nucleotide analog” has one or more modifications, such as chemical moieties, which replace, remove and/or modify any of the components (e.g., nitrogenous base, five-carbon sugar, or phosphate group(s)) of a native nucleotide. Nucleotide analogs may be either incorporable or non-incorporable by a polymerase in a nucleic acid polymerization reaction. Optionally, the 3’-OH group of a nucleotide analog is modified with a moiety. The moiety may be a 3’ reversible or irreversible terminator of polymerase extension. The base of a nucleotide may be any of adenine, cytosine, guanine, thymine, or uracil, or analogs thereof. Optionally, a nucleotide has an inosine, xanthine, hypoxanthine, isocytosine, isoguanine, nitropyrrole (including 3-nitropyrrole) or nitroindole (including 5-nitroindole) base. Nucleotides may include, but are not limited to, ATP, UTP, CTP, GTP, ADP, UDP, CDP, GDP, AMP, UMP, CMP, GMP, dATP, dTTP, dUTP, dCTP, dGTP, dADP, dTDP, dCDP, dGDP, dAMP, dTMP, dCMP, and dGMP. Nucleotides may also contain terminating inhibitors of DNA polymerase, dideoxynucleotides or 2’, 3’ dideoxynucleotides, which are abbreviated as ddNTPs (ddGTP, ddATP, ddTTP, ddUTP and ddCTP).

The “next correct nucleotide” (also referred to as the “cognate” nucleotide) refers to the nucleotide type that will bind and/or incorporate at the 3’ end of a primer to complement a base in a template strand to which the primer is hybridized. The base in the template strand is referred to as the “next template nucleotide” and is immediately 5’ of the base in the template that is hybridized to the 3’ end of the primer. The next correct nucleotide can be, but need not necessarily be, capable of being incorporated at the 3’ end of the primer or 3’ end of the nascent growing strand. A nucleotide having a base that is not complementary to the next template base is referred to as an “incorrect” (or “non cognate”) nucleotide.

A “marked nucleotide” refers to a nucleotide conjugated to any marker (e.g. a fluorophore) by a linker, wherein the nucleotide may or may not be incorporated into the primed strand.
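The definition of the "next template nucleotide" and the cognate nucleotide given above can be illustrated with a minimal sketch. The function names, the restriction to the four DNA bases, and the example sequences are assumptions made for illustration only and do not appear in the application; the primer is assumed to be fully hybridized to the 3' end of the template.

```python
# Illustrative sketch only: names and example sequences are hypothetical.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def next_template_base(template_5to3: str, primer_5to3: str) -> str:
    """Return the template base immediately 5' of the base that is
    paired with the 3' end of the primer.

    Assumes the primer is fully hybridized to the 3' end of the template.
    """
    n = len(primer_5to3)
    if n >= len(template_5to3):
        raise ValueError("no unpaired template base remains")
    # The primer pairs antiparallel with the last n template bases.
    paired = template_5to3[-n:]
    expected_primer = "".join(COMPLEMENT[b] for b in reversed(paired))
    if expected_primer != primer_5to3:
        raise ValueError("primer is not complementary to the template 3' end")
    # The primer 3' end pairs with template[-n]; the next base is one step 5'.
    return template_5to3[-n - 1]


def is_cognate(test_base: str, template_5to3: str, primer_5to3: str) -> bool:
    """True if the test nucleotide's base complements the next template base."""
    return COMPLEMENT[next_template_base(template_5to3, primer_5to3)] == test_base


template = "GATTACAGGC"   # hypothetical template, written 5'->3'
primer = "GCCT"           # reverse complement of the last four template bases
print(next_template_base(template, primer))  # C
print(is_cognate("G", template, primer))     # True
```

In this toy example the primer covers the last four template bases, so the next template base is the C at the sixth position and only a test nucleotide carrying G is cognate.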

The terms “label” and “marker” may be used interchangeably to refer to any group or moiety that may be used to identify, detect, and/or distinguish between nucleotides. A label may be a luminescent label, in which case, it may be referred to as an “emitter”.

A “polymerase” refers to any nucleic acid synthesizing enzyme, including but not limited to, DNA polymerase, RNA polymerase, reverse transcriptase, primase and transferase. Typically, the polymerase includes one or more active sites at which nucleotide binding and/or catalysis of nucleotide polymerization may occur. The polymerase may catalyze the polymerization of nucleotides to the 3’-end of a primer bound to its complementary nucleic acid strand. For example, a polymerase can catalyze the addition of a next correct nucleotide to the 3’ oxygen of the primer via a phosphodiester bond, thereby chemically incorporating the nucleotide into the primer.

The term “providing”, as used, for example, in relation to a test nucleotide, a marked nucleotide, a template, a primer, or a primed template nucleic acid, refers to the preparation and delivery of one or many of the relevant reagents, for example to a reaction mixture or reaction chamber.

The term “contacting” refers to the mixing together of reagents (e.g., mixing a primed template nucleic acid molecule with a reaction mixture that comprises a polymerase and the test nucleotide) so that a physical binding reaction or a chemical reaction may take place.

The term “incorporating” or “chemically incorporating” refers to the inclusion of the cognate nucleotide, for example, by correct base pairing with the corresponding base in the template strand, or by attachment to the primer by formation of a phosphodiester bond. Accordingly, the term “incorporating” refers to the process of joining a nucleotide to the 3’-end of a primer or nascent strand by formation of a phosphodiester bond. Thus, the incorporation of a nucleotide at the 3’ end of the primer or nascent strand leads to extension of the primer or nascent strand. The incorporated nucleotide thereby provides the 3’ end of the primer or nascent strand in the subsequent sequencing cycle. The 3’ end of the primer or nascent strand thereby advances by one position along the template strand in each sequencing cycle.

The terms “primer” and “nascent strand” may be used interchangeably to refer to an oligonucleotide having a 3’-end with a sequence complementary to a portion of the template nucleic acid.

As used herein, “extension” refers to the process in which a polymerase enzyme catalyzes addition of one or more nucleotides at the 3’-end of the primer or nascent strand, thereby leading to extension of the primer or nascent strand.

The present inventors have found that luminescent markers used in a sequencing method may be excited to emit light by a multi-photon absorption process, preferably a 2-photon absorption process. For simplicity, the following description of multi-photon absorption describes a 2-photon process; however, it will be understood that this process may be applied to a multi-photon absorption process of 3 or more photon absorptions.

In 2-photon absorption, a luminescent marker is irradiated with a wavelength of light from a light source having a sufficient intensity to cause two photons to be absorbed by a luminescent material of the luminescent marker, the absorbed energy being the sum of the energies of the individual photons. If this sum of energies exceeds the excitation energy of the luminescent material, e.g. an energy as determined from a peak absorption wavelength of the luminescent material, then the luminescent material may emit light.
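The energy condition described above can be checked numerically using the photon energy relation E = hc/λ. The following is a minimal sketch; the function names and the example wavelengths (800 nm excitation, 420 nm peak absorption) are illustrative assumptions and are not values taken from the application. Intensity requirements and the 2-photon cross-section of the material are not modelled.

```python
# Illustrative sketch only: function names and example wavelengths are hypothetical.
PLANCK_J_S = 6.62607015e-34        # Planck constant, J*s
LIGHT_SPEED_M_S = 299_792_458.0    # speed of light in vacuum, m/s
JOULES_PER_EV = 1.602176634e-19    # elementary charge, J per eV


def photon_energy_ev(wavelength_nm: float) -> float:
    """Energy of one photon of the given wavelength, in electronvolts."""
    return PLANCK_J_S * LIGHT_SPEED_M_S / (wavelength_nm * 1e-9) / JOULES_PER_EV


def two_photon_energy_sufficient(excitation_nm: float,
                                 peak_absorption_nm: float) -> bool:
    """True if the summed energy of two excitation photons exceeds the
    excitation energy estimated from the peak absorption wavelength."""
    return 2 * photon_energy_ev(excitation_nm) > photon_energy_ev(peak_absorption_nm)


# Assumed example: near-infrared excitation, blue-absorbing luminescent material.
print(round(photon_energy_ev(800.0), 3))            # about 1.55 eV per photon
print(two_photon_energy_sufficient(800.0, 420.0))   # True  (about 3.10 eV vs 2.95 eV)
print(two_photon_energy_sufficient(1000.0, 420.0))  # False (about 2.48 eV vs 2.95 eV)
```

Equivalently, the condition holds when half the excitation wavelength is shorter than the peak absorption wavelength.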

Preferably, the light source is a laser.

Preferably, the light source produces light having a wavelength of more than 500 nm, optionally more than 600 nm or more than 700 nm.

Use of 2-photon absorption to produce emission having a shorter wavelength than the wavelength of light used to excite the luminescent material may improve the signal-to-noise ratio in a sequencing method as compared to single photon absorption by, for example, reducing autofluorescence of nucleotides and/or the substrate, and/or reducing “false” identification of emission from the excitation source as emission from a luminescent marker due to scattered excitation light reaching a photodetector.

Use of 2-photon absorption may reduce damage to nucleotides as compared to irradiation with shorter wavelengths for single photon absorption.

The present inventors have found that organic luminescent materials having a high degree of conjugation have a high 2-photon cross-section and as such may be particularly advantageous in absorbing excitation light and improving the signal-to-noise ratio. Conjugated materials as described herein may be small molecules or polymers, preferably a conjugated polymer.

In some embodiments, the sequencing method may comprise a sequencing-by-synthesis (SBS) method. In some embodiments, an SBS method may comprise four steps:

1. library preparation;

2. cluster generation;

3. sequencing; and

4. data analysis.

1. Library Preparation

Library preparation is a molecular biology protocol that converts a nucleic acid template, such as a genomic DNA sample, or cDNA sample, into a sequencing library, which can then be sequenced, for example, using a Next Generation Sequencing (NGS) instrument.

A target nucleic acid sample can, in some embodiments, be processed prior to performing other modifications. For example, a target nucleic acid sample can be amplified prior to attaching to a bead, or prior to attaching to the surface of a solid support.

Amplification is particularly useful when samples are in low abundance or when small amounts of a target nucleic acid are provided. Methods that amplify the vast majority of sequences in a genome are referred to as “whole genome amplification” methods. Examples of such methods include multiple displacement amplification (MDA), strand displacement amplification (SDA), or hyperbranched strand displacement amplification, each of which can be carried out using degenerate primers. Particularly useful methods are those that are used during sample preparation methods recommended by commercial providers of whole genome sequencing platforms (e.g. Illumina Inc., San Diego and Life Technologies Inc., Carlsbad).

The sequencing library may be prepared by random fragmentation of the nucleic acid sample. The term “fragment,” when used in reference to a first nucleic acid, is intended to mean a second nucleic acid consisting of a part or portion of the sequence of the first nucleic acid.

In some embodiments, fragmentation inherently results from amplification, for example, in cases where the portion of the template that occurs between sites where flanking primers hybridize is selectively copied. In other embodiments, fragmentation may be achieved using chemical, enzymatic or physical techniques known in the art.

Fragments in a desired size range can be obtained using separation methods known in the art such as gel electrophoresis. Fragmentation can be carried out to obtain template nucleic acid fragments that have a minimum size of at least about 0.1 kb, 0.5 kb, 1 kb, 2 kb, 3 kb, 4 kb, 5 kb, 10 kb or longer in length.

Adapters, which may be referred to as “library adapters” may be ligated to the template fragments, such as, for example, ligation of 5' and 3' adapters to each DNA fragment. “Tagmentation” may be used to combine the fragmentation and ligation reactions into a single step that may increase the efficiency of the library preparation process. Adapter-ligated fragments may be amplified and purified by any suitable method currently used in the art. For example, adapter-ligated fragments may be PCR amplified and gel purified. The fragments that are produced from one or more nucleic acid templates can be captured randomly at locations on a solid support surface.

Solid supports can be two- or three-dimensional and can be a planar surface (e.g., a glass slide) or can be shaped. Useful materials include glass (e.g., controlled pore glass (CPG)), quartz, plastic (such as polystyrene (low cross-linked and high cross-linked polystyrene), polycarbonate, polypropylene and poly(methylmethacrylate)), acrylic copolymer, polyamide, silicon, metal (e.g., alkanethiolate-derivatized gold), cellulose, nylon, latex, dextran, gel matrix (e.g., silica gel), polyacrolein, or composites. Suitable three-dimensional solid supports include, for example, spheres, microparticles, beads, membranes, slides, plates, micromachined chips, tubes (e.g., capillary tubes), microwells, microfluidic devices, channels, filters, or any other structure suitable for anchoring a nucleic acid. Solid supports can include planar microarrays or matrices capable of having regions that include populations of nucleic acids or primers. Examples include nucleoside-derivatized CPG and polystyrene slides; derivatized magnetic slides; polystyrene grafted with polyethylene glycol, and the like.

A solid support to which nucleic acids may be attached in the sequencing method may have a continuous or monolithic surface. Thus, fragments can attach at spatially random locations wherein the distance between nearest neighbor fragments (or nearest neighbor clusters derived from the fragments) may be variable. The resulting arrays may have a variable or random spatial pattern of features.

Different template fragments that are at different sites of an array can be differentiated from each other according to the locations of the sites in the array. An individual site of an array can include one or more molecules of a particular type. For example, a site can include a single target nucleic acid molecule having a particular sequence or a site can include several nucleic acid molecules having the same sequence (and/or complementary sequence, thereof). The sites of an array can be different features or locations on the same substrate. Exemplary sites include, for example, wells in a substrate, beads (or other particles) in or on a substrate, projections from a substrate, ridges on a substrate or channels in a substrate. The sites of an array can be separate substrates each bearing a different molecule. Exemplary arrays in which separate substrates are located on a surface include, for example, those having beads in wells. The disclosed methods can advantageously use arrays having a high density of features such as, for example, at least about 10 features/cm², 100 features/cm², 500 features/cm², 1,000 features/cm², 5,000 features/cm², 10,000 features/cm², 50,000 features/cm², 100,000 features/cm², 1,000,000 features/cm², 5,000,000 features/cm², 10⁷ features/cm², 5×10⁷ features/cm², 10⁸ features/cm², 5×10⁸ features/cm², 10⁹ features/cm², 5×10⁹ features/cm², or higher.
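To give a sense of scale for the feature densities listed above, a density can be converted into a centre-to-centre spacing. The sketch below assumes an idealised regular square grid, which is an assumption made here for illustration; the application also contemplates random feature patterns, for which this value is only a rough average. The function name is hypothetical.

```python
import math

# Illustrative sketch only: assumes features sit on a regular square grid.


def mean_pitch_um(features_per_cm2: float) -> float:
    """Centre-to-centre spacing, in micrometres, of a square grid
    holding the given number of features per square centimetre."""
    area_per_feature_cm2 = 1.0 / features_per_cm2
    pitch_cm = math.sqrt(area_per_feature_cm2)
    return pitch_cm * 1e4  # 1 cm = 10,000 micrometres


for density in (1e6, 1e8, 5e9):
    print(f"{density:.0e} features/cm2 -> {mean_pitch_um(density):.2f} um pitch")
```

On this idealised grid, a density of 10⁸ features/cm² corresponds to a pitch of about 1 µm, and 10⁶ features/cm² to about 10 µm.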

Flow cells provide a convenient format for housing an array of nucleic acid fragments for use in the disclosed methods. As used herein, the term “flow cell” is intended to mean a chamber having a surface across which one or more fluid reagents can be flowed. Generally, a flow cell will have an ingress opening and an egress opening to facilitate flow of fluid. Flow cells provide a convenient format for use in the disclosed method that involves repeated delivery of reagents in cycles. For example, to initiate a first SBS cycle, one or more dNTPs, DNA polymerase, etc., can be flowed into/through a flow cell that houses an array of nucleic acid fragments. Washes can easily be carried out in the flow cell between the various delivery steps. The cycle can be repeated n times to extend the primer by n nucleotides, thereby detecting a sequence of length n.
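The repeated cycle described above can be illustrated with a toy simulation in which n cycles yield a read of length n. This is a minimal sketch under stated assumptions: the four labelled nucleotides are assumed to be offered one at a time, incorporation is assumed to occur only for the cognate base, detection is assumed to be error-free, and the label is assumed to be removed before the next cycle. The function name and example sequences are hypothetical and are not taken from the application.

```python
# Illustrative sketch only: an idealised, error-free model of repeated SBS cycles.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def sequence_by_synthesis(template_5to3: str, primer_length: int, n_cycles: int) -> str:
    """Return the bases read over up to n_cycles cycles.

    The primer is assumed to be hybridized to the last `primer_length`
    bases of the template, so extension proceeds towards the template 5' end.
    """
    position = len(template_5to3) - primer_length - 1  # index of next template base
    read = []
    for _ in range(n_cycles):
        if position < 0:
            break  # template fully copied
        for test_base in "ACGT":  # offer each labelled nucleotide in turn
            incorporated = COMPLEMENT[template_5to3[position]] == test_base
            if incorporated:
                # In the method, emitted light would be detected at this point.
                read.append(test_base)
                break
            # Otherwise: wash away the unincorporated nucleotide and try the next.
        position -= 1  # the 3' end advances one position along the template
    return "".join(read)


print(sequence_by_synthesis("GATTACAGGC", primer_length=4, n_cycles=3))   # GTA
print(sequence_by_synthesis("GATTACAGGC", primer_length=4, n_cycles=10))  # GTAATC
```

The second call requests more cycles than there are unpaired template bases, so the read stops at six bases, the reverse complement of the unpaired region GATTAC.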

For cluster generation, the library of adapter-ligated template fragments may be loaded into a flow cell where fragments are captured on a lawn of surface-bound binding molecules, such as oligonucleotides complementary to the library adapters.

DNA nanoballs can also be used in the disclosed methods and methods for preparing and using DNA nanoballs for genomic sequencing are known in the art. Briefly, following genomic DNA fragmentation, consecutive rounds of adaptor ligation, amplification and digestion result in head to tail concatamers of multiple copies of the circular genomic DNA template/adaptor sequences which are circularized into single stranded DNA by ligation with a circle ligase and rolling circle amplified. The adaptor structure of the concatamers promotes coiling of the single stranded DNA thereby creating compact DNA nanoballs. The DNA nanoballs can be captured on substrates, preferably to create an ordered or patterned array such that distance between each nanoball is maintained thereby allowing sequencing of the separate DNA nanoballs.

Sequencing utilizing the methods and compositions described herein can also be performed in a microtiter plate, for example in high density reaction plates or slides. For example, genomic targets can be prepared by emPCR technologies. Reaction plates or slides can be created from fiber optic material capable of capturing and recording light generated from a reaction, for example from a luminescent reaction. The core material can be etched to provide discrete reaction wells capable of holding at least one emPCR reaction bead. Such slides/plates can contain over 1.6 million wells. The created slides/plates can be loaded with the target sequencing reaction emPCR beads and mounted to an instrument where the sequencing reagents are provided and sequencing occurs.

2. Cluster Generation

Cluster generation is a process of clonal amplification of target nucleic acid templates which may be used or required, for example, for imaging systems which cannot detect single luminescence events. Various suitable methods for clonally amplifying nucleic acid template molecules to produce clusters of cloned template will be known to the skilled person. Any suitable method may be used, and typically, these methods comprise polymerase chain reaction (PCR)-based techniques. The cluster generation procedure is relatively complicated and time-consuming, and may introduce errors into the cloned template nucleic acids. In the cluster generation step, each template fragment is clonally amplified into distinct clusters. The result is a clonal grouping of identical template fragments bound to the surface of the flow cell. Each cluster on the flow cell produces a single sequencing read. For example, 10,000 clusters on the flow cell would produce 10,000 sequence reads. Where paired-end reads are implemented, a second read of each sequence would also be performed.

Each cluster is seeded by a single template nucleic acid fragment and is clonally amplified, for example, using a PCR-based approach, such as involving the use of forward and reverse primers that are attached to the support within the flow cell. In current sequencing methods, a typical cluster has in the order of 1000 copies. To enable the generation of defined clusters, the template fragments may be captured onto surfaces that are patterned, for example, with embedded beads (typically 1-2 µm in diameter) or wells (typically 200-600 nm in diameter). Each bead or well only captures a single template fragment and the size of the bead or well defines the maximum size of the cluster. The structured organization provided by the patterned surface of the flow cell provides improved, regular spacing of template clusters, and increased cluster density, which provides advantages over non-patterned clusters, such as in relation to signal detection.

Bridge amplification may be used to generate clusters. Bridge amplification may occur on the surface of the flow cell. For example, in currently used methods, the surface of the flow cell is coated with a “lawn” of oligonucleotides. In the first step of bridge amplification, a single-stranded sequencing library (with complementary adapter ends) is loaded into the flow cell. Individual molecules in the library bind to complementary oligos as they “flow” across the oligo lawn. Priming occurs as the opposite end of a ligated fragment bends over and “bridges” to another complementary oligo on the surface. Repeated denaturation and extension cycles (similar to PCR) result in localized amplification of single molecules into millions of unique, clonal clusters across the flow cell.

Other suitable amplification methods known in the art can also be used to produce immobilized amplicons from immobilized nucleic acid fragments. For example, one or more clusters can be formed via solid-phase PCR, solid-phase MDA, solid-phase RCA, etc., whether one or both primers of each pair of amplification primers are immobilized.

3. Sequencing

In some embodiments, the disclosed method comprises the detection of single nucleotides as they are incorporated at the 3′ end of the primer or 3′ end of the nascent growing strand.

In some embodiments, nucleotides are added to a nucleic acid primer thereby extending the primer in a template-dependent manner. Detection of the order and type of nucleotides added to the primer can be used to determine the sequence of the template. At least some of the nucleotides are labelled with a luminescent marker which may be used to detect and identify the nucleotide.

During each sequencing cycle, a nucleotide which is the next complementary nucleotide (i.e. comprises a base complementary to the next base of the template strand), is incorporated into the nucleic acid chain in a template-dependent manner due to complementary hydrogen bonding with the corresponding nucleotide in the template fragment.

A polymerase enzyme may subsequently catalyze the chemical addition of the next complementary nucleotide into the nucleic acid chain. In each sequencing cycle, the next complementary nucleotide is identified by irradiation of the nucleotide and detection of any luminescence attributable to a luminescent marker that the nucleotide is labelled with.

A plurality of different species of test nucleotide may be included in the reaction mixture in a single sequencing cycle. Thus, labels associated with each different nucleotide species may be distinguishable and identifiable, for example on the basis of different luminescence emission spectra. In these embodiments, the luminescence emission that is detected is used to determine which of the different species of test nucleotides is incorporated into the primed strand and thus represents the next complementary nucleotide. In some embodiments, one or more of the nucleotides may be pre-labelled with the relevant marker prior to inclusion in the reaction mixture and/or prior to incorporation into the primer or nascent strand of the nucleic acid that is being sequenced. In other embodiments, one or more of the nucleotides may be labelled after correctly base pairing with the relevant nucleotide in the template strand, and/or after incorporation into the growing strand. For example, one or more of the nucleotides may comprise a moiety, such as a linker, to which a detectable label may be directly or indirectly attached to thereby detect and identify the nucleotide.

In some sequencing methods, each nucleotide is associated with a different luminescent label that may be used to detect and identify the nucleotide. Thus, the number of possible different nucleotides to be detected is equal to the number of different labels used, and the number of detection channels required.

In some sequencing methods, the number of possible different nucleotides is greater than the number of different labels used and/or the number of detection channels required. In some embodiments, one of the nucleotides is not labelled and all other nucleotides are labelled. The presence of the unlabeled nucleotide may be determined if no luminescence arising from a luminescent marker is detected.
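The unlabelled-nucleotide scheme described above can be sketched as a simple lookup. This is a minimal illustration only: the label names (`label_1` to `label_3`) and the choice of G as the unlabelled nucleotide are hypothetical assumptions and are not taken from the disclosure.

```python
from typing import Optional

# Hypothetical assignment: three nucleotides carry distinguishable labels,
# and the fourth (here G, chosen arbitrarily) is left unlabelled.
LABEL_TO_BASE = {"label_1": "A", "label_2": "C", "label_3": "T"}
UNLABELLED_BASE = "G"


def decode_base(detected_label: Optional[str]) -> str:
    """Return the incorporated base for one sequencing cycle.

    `detected_label` is the label whose luminescence was detected, or
    None when no luminescence from any marker was detected.
    """
    if detected_label is None:
        # No luminescence detected: infer the unlabelled nucleotide.
        return UNLABELLED_BASE
    return LABEL_TO_BASE[detected_label]
```

With this assignment, three labels are sufficient to distinguish four nucleotides, because the absence of luminescence is itself informative.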

In some sequencing methods, emissions from at least two different fluorescent markers are detectable by the same detector. In some such methods, a single nucleotide is labelled with a first and second fluorescent marker, wherein the first and second fluorescent markers can be excited by different laser frequencies to produce emissions having first and second peak wavelengths, wherein the first and second peak wavelengths are detectable by the same detector. In any event, at the point of detection, such as for example, when electromagnetic radiation at an excitation wavelength is applied to an emitter, the nucleotide that is being detected comprises the label, in the sense that the label is associated with the nucleotide in such a way that it may be used to specifically detect and identify the nucleotide.

The nucleotide may comprise a “terminator” which may be a “reversible terminator”. A nucleotide having a terminator or reversible terminator moiety can be used such that subsequent extension cannot occur until a deblocking agent is delivered to remove the terminator moiety. Thus, after nucleotide incorporation and identification, the terminator may be removed, such as by enzymatic cleavage, to allow the next sequencing cycle to commence. The linker attaching the label to the nucleotide in the disclosed method may be cleavable or otherwise arranged to allow dissociation of the label from the nucleotide. Thus, after incorporation and identification of the test nucleotide, the emitter or other label may be removed to allow the next sequencing cycle to commence and the next complementary nucleotide to be identified. The result is base-by-base sequencing of the template fragment nucleic acids.
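The base-by-base cycle described above can be illustrated with an idealised simulation. This is a sketch under stated assumptions: the template is given in the order in which the polymerase reads it, every cycle incorporates exactly one reversibly terminated nucleotide, and detection and deblocking are error-free. The function name `sequence_by_synthesis` is hypothetical.

```python
# Watson-Crick complements for DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def sequence_by_synthesis(template: str, cycles: int) -> str:
    """Simulate idealised reversible-terminator sequencing cycles.

    Returns the bases incorporated into the growing strand, in order.
    """
    read = []
    for position in range(min(cycles, len(template))):
        # Incorporation: only the nucleotide complementary to the next
        # template base is added; its terminator blocks further extension.
        incorporated = COMPLEMENT[template[position]]
        # Detection: the label identifies the incorporated nucleotide.
        read.append(incorporated)
        # Deblocking: the terminator and label are removed so that the
        # next cycle can start (nothing to model in this idealised case).
    return "".join(read)
```

For example, `sequence_by_synthesis("ATGC", 4)` returns the four complementary bases in order of incorporation, one per cycle.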

The term “linker” is intended to mean a chemical bond or moiety that bridges two moieties, for example by covalent linkage or the formation of a stable complex. A linker can be, for example, the sugar-phosphate backbone that connects nucleotides in a nucleic acid moiety. In the disclosed method, a linker may be used to conjugate a test nucleotide to a label such as a fluorophore or other emitter. The linker can include, for example, one or more of a nucleotide moiety, a nucleic acid moiety, a non-nucleotide chemical moiety, a nucleotide analogue moiety, amino acid moiety, polypeptide moiety, or protein moiety. The terms “cleave”, “cleavage site”, and similar terms, refer to a moiety in a molecule, such as a linker, that can be modified or removed to physically separate two other moieties of the molecule.

In some embodiments of the disclosed method, a linker may be used to associate the test nucleotide to a terminator. The linker may be a cleavable linker, such that at the appropriate point in the sequencing cycle the linker is cleaved, thereby removing the terminator from the nucleotide. In some embodiments of the disclosed method, the linker that is used to associate the test nucleotide and terminator comprises the same type of cleavable linkage that is present in the linker conjugating the test nucleotide to the fluorophore or other detectable label. Thus, at the appropriate point in the sequencing cycle, a single agent, such as a single type of cleaving enzyme, may be used to remove both the terminator and label from the test nucleotide. In some embodiments of the disclosed method, a single linker may be arranged to conjugate both the terminator and label to the test nucleotide. In some embodiments of the disclosed method, the linker that is used to associate the test nucleotide and terminator may comprise a different type of cleavable linkage to that present in the linker conjugating the test nucleotide to the label.

In some embodiments, a plurality of different nucleic acid fragments can be sequenced simultaneously under conditions where events occurring for different templates can be distinguished, for example due to being present at different locations in an array.

4. Data Analysis

During data analysis, the newly identified sequence reads of the template fragment are aligned, and the target nucleic acid sequence may thus be determined.

Following alignment, many variations of analysis are possible, including single nucleotide polymorphism (SNP), insertion-deletion (indel) identification, and read counting for RNA methods, phylogenetic or metagenomic analysis.

The skilled person will appreciate that the disclosed methods are not limited to sequencing-by-synthesis methods, and may be applied to various other methods involving the use of luminescence to identify a nucleotide. Such methods include, for example, methods for the analysis of short tandem repeat (STR) markers, single nucleotide polymorphisms (SNPs), methylation patterns, ChIP analysis, and RNA transcription. Also included are methods of analysis of any nucleic acid template, including, for example, DNA from any organism or mixed population such as a microbiome, whole genomes, RNA transcripts for expression analysis, cancer samples (such as methods of analysing somatic variants and/or tumour subclones), and mitochondrial DNA.

Signal States

The disclosed methods may include, for each imaging event, correlating one or more nucleotide species to a dark state, correlating one or more nucleotide species to a signal state, correlating one or more nucleotide species to a grey state, and/or correlating one or more nucleotide species to a change in state between two imaging events (such as first and second imaging events), between a dark state, a grey state or a signal state.

A “signal state,” when used in reference to a detection event, means a condition in which a specific signal is produced in the detection event. For example, a nucleotide can be in a signal state and detectable when attached to a luminescent label that is excited at a specific excitation wavelength and detected by emission in an emission detection step in a sequencing method, using a specific detection filter.

The term “dark state,” when used in reference to a detection event, means a condition in which a specific signal is not produced in the detection event. For example, a nucleotide can be in a dark state when the nucleotide does not emit above a threshold level in an emission detection step in a sequencing method, using a single detection filter. For example, a nucleotide may lack a luminescent label, and/or may be attached to a luminescent label that is excited at a specific excitation wavelength that is different to the excitation wavelength used in relation to that particular detection event.

Dark state detection may also include any background emission which may be present in the absence of a luminescent label that is excited at the specific excitation wavelength being used in relation to that particular detection event. For example, some reaction components, which may include luminescent labels that are excited at a different excitation wavelength to the specific excitation wavelength being used, may demonstrate minimal emission in response to the excitation wavelength being used. As such, there may be background emission from such components. Further, background emission may be due to light scatter, for example from adjacent sequencing reactions, which may be detected by a detector. In addition, “dark state” can include background emission produced when an emissive moiety is not specifically included, such as when a nucleotide lacking a luminescent label is used. However, such background emission is distinguishable from a signal state or a grey state and, as such, nucleotide incorporation of an unlabelled nucleotide (or “dark” nucleotide) is still discernible. Likewise, nucleotide incorporation of a nucleotide that is attached to a luminescent label that is excited at a specific excitation wavelength that is different to the excitation wavelength used in relation to that particular detection event can also be distinguished from a signal state or a grey state.

The term “grey state,” when used in reference to a detection event, means a condition in which an attenuated signal is produced in the detection event. For example, a population of nucleotides of a particular type can be in a grey state when a first subpopulation of the nucleotides is attached to a first luminescent label that is detected in a luminescence detection step of a sequencing method, while a second subpopulation of the nucleotides lacks the first luminescent label and does not give emission that is specifically detected in the luminescence detection step when excited at the excitation wavelength of the first luminescent label. The second subpopulation of the nucleotides may, for example, comprise a second luminescent label that may be detected when excited at an excitation wavelength that is different to the excitation wavelength of the first luminescent label, such that the first and second luminescent labels may be detected in the same detector but are distinguishable on the basis that they are excited using different excitation wavelengths.
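The three states defined above can be illustrated as a classification of a measured emission intensity. This is a minimal sketch: the two thresholds and the function name `classify_state` are assumptions made for illustration, since the disclosure refers to a threshold level without specifying values.

```python
def classify_state(intensity: float,
                   dark_threshold: float,
                   signal_threshold: float) -> str:
    """Classify one detection event as 'dark', 'grey' or 'signal'.

    Both thresholds are hypothetical, instrument-specific calibration
    values; `dark_threshold` must not exceed `signal_threshold`.
    """
    if dark_threshold > signal_threshold:
        raise ValueError("dark_threshold must not exceed signal_threshold")
    if intensity <= dark_threshold:
        # Emission not above the threshold level (background only).
        return "dark"
    if intensity < signal_threshold:
        # Attenuated emission, e.g. only a subpopulation carries the label.
        return "grey"
    return "signal"
```

In practice the thresholds would have to sit above the background emission discussed earlier, so that an unlabelled ("dark") nucleotide remains distinguishable from grey and signal states.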

Typically, a reaction cycle will be carried out by delivering all four nucleotide types to a nucleic acid sample in the presence of a polymerase, for example a DNA or RNA polymerase, during a primer extension reaction. The presence of four nucleotide types provides an advantage of increasing polymerase fidelity compared to the use of fewer than four nucleotide types.

In some embodiments, fewer than four different types of nucleotides can be present during a polymerase extension reaction. The disclosed methods are not limited to nucleic acid sequencing and may also be used in other applications where detection of more than one analyte (i.e., nucleotide, protein, or fragments thereof) in a sample is desired.

It will be understood that reference to the use of a luminescent label includes the use of a plurality of different luminescent labels having the same or similar excitation and emission properties.

In preferred embodiments, the disclosed method is performed on a substrate (reaction surface), such as a glass, plastic, semiconductor chip or composite derived substrate, for example, within a flow cell.

Sequencing may be in a multiplex format, wherein multiple nucleic acid targets are detected and sequenced in parallel, for example in a flowcell or array type of format. The disclosed method is particularly advantageous when practicing parallel sequencing or massive parallel sequencing.

For purposes of illustration, and without limiting the embodiments, a general sequencing cycle can be described by a sequence of steps. The following example is based on a sequencing-by-synthesis reaction; however, the disclosed methods are not limited to any particular sequencing reaction methodology.

The four nucleotide types A, C, T and G, typically modified nucleotides designed for sequencing reactions such as reversibly blocked (rb) nucleotides (e.g., rbA, rbT, rbC, rbG) wherein three of the four types of nucleotide are luminescently labelled, are simultaneously added, along with other reaction components, to the reaction surface (e.g., flowcell, chip, slide, etc.).

Following incorporation of a nucleotide into a growing sequence nucleic acid chain based on the target sequence, a first specific excitation wavelength, which is capable of selectively exciting a first emitter that is associated with a first of the four possible nucleotides is provided to the reaction surface. Any resulting emission is recorded using a specific detection filter; this constitutes a first imaging event and a first luminescence detection pattern. Following the first imaging event, a second specific excitation wavelength, which is capable of selectively exciting a second emitter that is associated with a second of the four possible nucleotides is provided to the reaction surface. Any resulting emission is recorded using a specific detection filter, and this constitutes a second imaging event and a second luminescence detection pattern. In the same way, third and fourth imaging events and luminescence detection patterns are subsequently produced in respect of the third and fourth of the possible nucleotides.

The sequence of the target nucleic acid, for that particular cycle (i.e. the identity of the next cognate residue) is determined by identifying the nucleotide that has been incorporated from the four luminescence detection patterns. Specifically, one of the luminescence detection patterns will comprise a signal state and three of the luminescence detection patterns will comprise a dark state.
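The identification step can be sketched as follows: given one recorded intensity per imaging event, the incorporated nucleotide is the one whose event shows a signal state while the other three are dark. The single threshold and the function name `call_base` are illustrative assumptions, not part of the disclosure.

```python
from typing import Dict, Optional


def call_base(event_intensities: Dict[str, float],
              threshold: float) -> Optional[str]:
    """Call the incorporated base from four imaging events.

    `event_intensities` maps each nucleotide (e.g. "A", "C", "G", "T")
    to the emission recorded in the imaging event that selectively
    excites that nucleotide's emitter. Returns the base if exactly one
    event is in a signal state, otherwise None (no call).
    """
    signal_bases = [base for base, intensity in event_intensities.items()
                    if intensity > threshold]
    if len(signal_bases) == 1:
        return signal_bases[0]
    # Zero or several signal states: this cycle cannot be called
    # unambiguously under the one-signal, three-dark pattern.
    return None
```

Returning `None` rather than guessing keeps ambiguous cycles visible to downstream analysis; how such cycles are handled is a design choice outside this sketch.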

Various reagents present after the fourth imaging event are washed away in preparation for the next sequencing cycle. Exemplary chemical reagents that may be removed include, but are not limited to, blocking agents, luminescent labels, quenchers, cleavage reagents, or any other reagents that may directly or indirectly cause an identifiable and measurable change in luminescence, or which may inhibit the incorporation of the next correct nucleotide in the sequence.

Luminescent markers

The luminescent markers for use in any of the disclosed methods may be selected from any luminescent materials known to the skilled person including non-polymeric and polymeric luminescent materials. In use, one or more luminescent markers may be dissolved or dispersed in the reaction mixture.

In some embodiments, one or more of the luminescent markers is a particulate luminescent marker comprising a luminescent material.

The luminescent marker may emit light having a peak wavelength in the range of 350-1000 nm.

A blue luminescent marker as described herein may have a photoluminescence spectrum with a peak of no more than 500 nm, preferably in the range of 400-500 nm, optionally 400-490 nm.

A green luminescent marker as described herein may have a photoluminescence spectrum with a peak of more than 500 nm up to 580 nm, optionally more than 500 nm up to 540 nm.

A red luminescent marker as described herein may have a photoluminescence spectrum with a peak of more than 580 nm up to 950 nm, optionally up to 630 nm, optionally 585 nm up to 625 nm. The luminescent marker may have a shift between excitation and emission maxima in the range of 20-400 nm.
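The colour definitions above amount to a classification by photoluminescence peak wavelength, which can be sketched as below. The function name `marker_colour` and the "unclassified" fallback for peaks above 950 nm are illustrative assumptions; the boundaries are the broadest ranges stated in the text (blue no more than 500 nm, green more than 500 nm up to 580 nm, red more than 580 nm up to 950 nm).

```python
def marker_colour(peak_nm: float) -> str:
    """Classify a luminescent marker by its emission peak in nanometres."""
    if peak_nm <= 500:
        return "blue"
    if peak_nm <= 580:
        return "green"
    if peak_nm <= 950:
        return "red"
    # Peaks above 950 nm fall outside the three colour definitions.
    return "unclassified"
```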

Photoluminescence spectra of luminescent markers as described herein may be measured in methanol solution or suspension using a Jobin Yvon Horiba Fluoromax-3.

UV/vis absorption spectra of luminescent markers as described herein may be as measured in methanol solution or suspension using a Cary 5000 UV-vis-IR spectrometer.

Examples of non-polymeric light-emitting materials include, but are not limited to, fluorescein and fluorescein derivatives such as carboxyfluorescein, tetrachlorofluorescein, hexachlorofluorescein, carboxynaphthofluorescein, fluorescein isothiocyanate, NHS-fluorescein, iodoacetamidofluorescein, fluorescein maleimide, SAMSA-fluorescein, fluorescein thiosemicarbazide, carbohydrazinomethylthioacetyl-amino fluorescein, rhodamine and rhodamine derivatives such as TRITC, TMR, lissamine rhodamine, Texas Red, rhodamine B, rhodamine 6G, rhodamine 10, NHS-rhodamine, TMR-iodoacetamide, lissamine rhodamine B sulfonyl chloride, lissamine rhodamine B sulfonyl hydrazine, Texas Red sulfonyl chloride, Texas Red hydrazide, coumarin and coumarin derivatives such as AMCA, AMCA-NHS, AMCA-sulfo-NHS, AMCA-HPDP, DCIA, AMCA-hydrazide, BODIPY and derivatives such as BODIPY FL C3-SE, BODIPY 530/550 C3, BODIPY 530/550 C3-SE, BODIPY 530/550 C3 hydrazide, BODIPY 493/503 C3 hydrazide, BODIPY FL C3 hydrazide, BODIPY FL IA, BODIPY 530/551 IA, Br-BODIPY 493/503, Cascade Blue and derivatives such as Cascade Blue acetyl azide, Cascade Blue cadaverine, Cascade Blue ethylenediamine, Cascade Blue hydrazide, CoralHue mk02, DAPI, DiA, DiD, DiI, DiO, DiR, DRAQ5, DsRED, dTomato, DyeCycle dyes, EB, ECFP, EGFP, Emerald dyes, Eosin, EYFP, Fluo-dyes, Fura dyes, FVS dyes, Hoechst33258, Indo dyes, JC-1, Kusabira-Orange, Lucifer Yellow and derivatives such as Lucifer Yellow iodoacetamide, Lucifer Yellow CH, Magnesium Green, Marina Blue, mBanana, mCherry, mOrange, mPlum, mRaspberry, mStrawberry, mTangerine, methyl Coumarin, Mitotracker Red, Na-Green, Nile Red, Oregon Green, Pacific Blue, Pacific Orange, PE dyes, PerCP dyes, Picogreen, PI, QDot dyes, R718, Rho dyes, Rhodamine Red, Riboflavin, SNARF dyes, SYBR Green, SYTOX dyes, Texas Red, TO-Pro dyes, TOTO dyes, V450, V500, Via-probe dyes, YO-Pro dyes, YOYO dyes, ZsGreen, cyanine and derivatives such as indolium based cyanine dyes, benzo-indolium based cyanine dyes, pyridinium based cyanine dyes, thiazolium based cyanine dyes, quinolinium based cyanine dyes, imidazolium based cyanine dyes, Cy3, Cy5, lanthanide chelates and derivatives such as BCPDA, TBP, TMT, BHHCT, BCOT, Europium chelates, Terbium chelates, Alexa Fluor dyes, DyLight dyes, Atto dyes, LightCycler Red dyes, CAL Fluor dyes, JOE and derivatives thereof, Oregon Green dyes, WellRED dyes, IRD dyes, phycoerythrin and phycobilin dyes, Malachite green, stilbene, DEG dyes, NR dyes, CF dyes, near-infrared dyes and others known in the art.

A luminescent polymer as described herein may be a fluorescent or phosphorescent luminescent polymer. In use, the luminescent polymer may be dissolved in the reaction mixture or a particulate luminescent marker comprising or consisting of the luminescent polymer may be dispersed in the reaction mixture.

Examples of polymeric luminescent materials that may be suitable for use in the disclosed methods include, but are not limited to, Horizon Brilliant dyes by Becton Dickinson, Super Bright Dyes by ThermoFisher, StarBright dyes by Bio-Rad, and KIRAVIA Dyes by Sony. Other suitable polymeric light emitting materials are discussed below.

The luminescent polymer may be a homopolymer or may be a copolymer comprising two or more different repeat units.

The luminescent polymer may comprise luminescent groups in the polymer backbone, pendant from the polymer backbone or as end groups of the polymer backbone. In the case of a phosphorescent polymer, a phosphorescent metal complex, preferably a phosphorescent iridium complex, may be provided in the polymer backbone, pendant from the polymer backbone or as an end group of the polymer backbone.

The luminescent polymer may have a non-conjugated backbone or may be a conjugated polymer. By “conjugated polymer” is meant a polymer comprising repeat units in the polymer backbone that are directly conjugated to adjacent repeat units. Conjugated luminescent polymers include, without limitation, polymers comprising one or more of arylene, heteroarylene and vinylene groups conjugated to one another along the polymer backbone.

The conjugation of a conjugated polymer may extend along the whole length of the polymer backbone, or the conjugation may be broken by a conjugation-breaking repeat unit which does not provide a conjugation path between repeat units linked to the conjugation-breaking repeat unit.

The luminescent polymer may comprise host repeat units and emissive repeat units. In use, the host repeat units may absorb excitation energy and transfer it to the emissive repeat units. In some embodiments, first and second luminescent polymers may be provided in which the first and second luminescent polymers have the same or similar emissive repeat units and different host repeat units, or a different composition of host repeat units, such that emission from the first and second polymers is similar but absorption peaks are at different wavelengths. In some embodiments, the first polymer contains a first host repeat unit and an emissive repeat unit and the second polymer contains the first host repeat unit, the emissive repeat unit and an intermediate repeat unit having a band gap between that of the first host repeat unit and the emissive repeat unit.

The luminescent polymer may have a linear, branched or crosslinked backbone. The luminescent polymer may comprise one or more repeat units in the backbone of the polymer substituted with one or more substituents selected from non-polar and polar substituents.

Preferably, the luminescent polymer comprises at least one polar substituent. The one or more polar substituents may be the only substituents of said repeat units, or said repeat units may be further substituted with one or more non-polar substituents, optionally one or more C₁₋₄₀ hydrocarbyl groups. The repeat unit or repeat units substituted with one or more polar substituents may be the only repeat units of the polymer or the polymer may comprise one or more further co-repeat units wherein the or each co-repeat unit is unsubstituted or is substituted with non-polar substituents, optionally one or more C₁₋₄₀ hydrocarbyl substituents.

C₁₋₄₀ hydrocarbyl substituents as described herein include, without limitation, C₁₋₂₀ alkyl, unsubstituted phenyl and phenyl substituted with one or more C₁₋₂₀ alkyl groups.

As used herein a “polar substituent” may refer to a substituent, alone or in combination with one or more further polar substituents, which gives the luminescent polymer a solubility of at least 0.01 mg/ml in an alcoholic solvent, optionally in the range of 0.01-10 mg/ml. Optionally, solubility is at least 0.1 or 1 mg/ml. The solubility is measured at 25°C. Preferably, the alcoholic solvent is a C₁₋₁₀ alcohol, more preferably methanol.

Polar substituents are preferably substituents capable of forming hydrogen bonds or ionic groups. In some embodiments, the luminescent polymer comprises polar substituents of formula -O(R³O)t-R⁴ wherein R³ in each occurrence is a C₁₋₁₀ alkylene group, optionally a C₁₋₅ alkylene group, wherein one or more non-adjacent, non-terminal C atoms of the alkylene group may be replaced with O, R⁴ is H or C₁₋₅ alkyl, and t is at least 1, optionally 1-10. Preferably, t is at least 2. More preferably, t is 2 to 5. The value of t may be the same in all the polar groups of formula -O(R³O)t-R⁴. The value of t may differ between polar groups of the same polymer.

By “C₁₋₅ alkylene group” as used herein with respect to R³ is meant a group of formula -(CH₂)f- wherein f is from 1-5. Preferably, the luminescent polymer comprises polar substituents of formula -O(CH₂CH₂O)t-R⁴ wherein t is at least 1, optionally 1-10 and R⁴ is a C₁₋₅ alkyl group, preferably methyl. Preferably, t is at least 2. More preferably, t is 2 to 5, most preferably t is 3.

In some embodiments, the luminescent polymer comprises polar substituents of formula -N(R⁵)₂, wherein R⁵ is H or C₁₋₁₂ hydrocarbyl. Preferably, each R⁵ is a C₁₋₁₂ hydrocarbyl.

In some embodiments, the luminescent polymer comprises polar substituents which are ionic groups which may be anionic, cationic or zwitterionic. Preferably the ionic group is an anionic group.

Exemplary anionic groups are -COO⁻; a sulfonate group; hydroxide; sulfate; phosphate; phosphinate; or phosphonate.

An exemplary cationic group is -N(R⁵)₃⁺ wherein R⁵ in each occurrence is H or C₁₋₁₂ hydrocarbyl. Preferably, each R⁵ is a C₁₋₁₂ hydrocarbyl.

A luminescent polymer comprising cationic or anionic groups comprises counterions to balance the charge of these ionic groups. An anionic or cationic group and counterion may have the same valency, with a counterion balancing the charge of each anionic or cationic group.

The anionic or cationic group may be monovalent or polyvalent. Preferably, the anionic and cationic groups are monovalent.

The luminescent polymer may comprise a plurality of anionic or cationic polar substituents wherein the charge of two or more anionic or cationic groups is balanced by a single counterion. Optionally, the polar substituents comprise anionic or cationic groups comprising di- or trivalent counterions.

The counterion is optionally a cation, optionally a metal cation, optionally Li⁺, Na⁺, K⁺, Cs⁺, preferably Cs⁺, or an organic cation, optionally ammonium, such as tetraalkylammonium, ethylmethyl imidazolium or pyridinium. The counterion is optionally an anion, optionally a halide; a sulfonate group, optionally mesylate or tosylate; hydroxide; carboxylate; sulfate; phosphate; phosphinate; phosphonate; or borate.

In some embodiments, the luminescent polymer comprises polar substituents selected from groups of formula -O(R³O)t-R⁴, groups of formula -N(R⁵)₂, groups of formula -OR⁴ and/or ionic groups. Preferably, the luminescent polymer comprises polar substituents selected from groups of formula -O(CH₂CH₂O)tR⁴, groups of formula -N(R⁵)₂, and/or anionic groups of formula -COO⁻. Preferably, the polar substituents are selected from the group consisting of groups of formula -O(R³O)t-R⁴, groups of formula -N(R⁵)₂, and/or ionic groups. Preferably, the polar substituents are selected from the group consisting of polyethylene glycol (PEG) groups of formula -O(CH₂CH₂O)tR⁴, groups of formula -N(R⁵)₂, and/or anionic groups of formula -COO⁻. R³, R⁴, R⁵, and t are as described above.

Optionally, the backbone of the luminescent polymer is a conjugated polymer. Optionally, the backbone of the conjugated luminescent polymer comprises repeat units of formula (III): wherein Ar¹ is an arylene group or heteroarylene group; Sp is a spacer group; m is 0 or 1; R¹ independently in each occurrence is a polar substituent; n is 1 if m is 0 and n is at least 1, optionally 1, 2, 3 or 4, if m is 1; R² independently in each occurrence is a non-polar substituent; p is 0 or a positive integer, optionally 1, 2, 3 or 4; q is 0 or a positive integer, optionally 1, 2, 3 or 4; and wherein Sp, R¹ and R² may independently in each occurrence be the same or different.

Preferably, m is 1 and n is 2-4, more preferably 4. Preferably p is 0. Ar¹ of formula (III) is optionally a C₆₋₂₀ arylene group or a 5-20 membered heteroarylene group. Ar¹ is preferably a C₆₋₂₀ arylene group, optionally phenylene, fluorene, benzofluorene, phenanthrene, naphthalene or anthracene, more preferably fluorene or phenylene, most preferably fluorene.

Sp-(R¹)n may be a branched group, optionally a dendritic group, substituted with polar groups, optionally -NH₂ or -OH groups, for example polyethyleneimine.

Preferably, Sp is selected from:

Ci -20 alkylene or phenylene-Ci-20 alkylene wherein one or more non- adjacent C atoms may be replaced with O, S, N or C=0; a C6-20 arylene or 5-20 membered heteroarylene, more preferably phenylene, which, in addition to the one or more substituents R 1 , may be unsubstituted or substituted with one or more non-polar substituents, optionally one or more Ci -20 alkyl groups. “alkylene” as used herein means a branched or linear divalent alkyl chain.

“Non-terminal C atom” of an alkyl group as used herein means a C atom other than the methyl group at the end of an n-alkyl group or the methyl groups at the ends of a branched alkyl chain.

More preferably, Sp is selected from:

- C1-20 alkylene wherein one or more non-adjacent C atoms may be replaced with O, S or CO; and
- a C6-20 arylene or a 5-20 membered heteroarylene, even more preferably phenylene, which may be unsubstituted or substituted with one or more non-polar substituents.

R 1 may be a polar substituent as described anywhere herein. Preferably, R 1 is:
- a polyethylene glycol (PEG) group of formula -O(CH 2 CH 2 O) t R 4 wherein t is at least 1, optionally 1-10, and R 4 is a C1-5 alkyl group, preferably methyl;
- a group of formula -N(R 5 ) 2 , wherein R 5 is H or C1-12 hydrocarbyl; or
- an anionic group of formula -COO⁻.

In the case where n is at least two, each R 1 may independently in each occurrence be the same or different. Preferably, each R 1 attached to a given Sp group is different.

In the case where p is a positive integer, optionally 1, 2, 3 or 4, the group R 2 may be selected from:
- alkyl, optionally C1-20 alkyl;
- aryl and heteroaryl groups that may be unsubstituted or substituted with one or more substituents, preferably phenyl substituted with one or more C1-20 alkyl groups;
- a linear or branched chain of aryl or heteroaryl groups, each of which groups may independently be substituted, for example a group of formula -(Ar 3 ) s wherein each Ar 3 is independently an aryl or heteroaryl group and s is at least 2, preferably a branched or linear chain of phenyl groups each of which may be unsubstituted or substituted with one or more C1-20 alkyl groups; and
- a crosslinkable group, for example a group comprising a double bond such as a vinyl or acrylate group, or a benzocyclobutane group.

Preferably, each R 2 , where present, is independently selected from C1-40 hydrocarbyl, and is more preferably selected from C1-20 alkyl; unsubstituted phenyl; phenyl substituted with one or more C1-20 alkyl groups; and a linear or branched chain of phenyl groups, wherein each phenyl may be unsubstituted or substituted with one or more substituents.

A polymer as described herein may comprise or consist of only one form of the repeating unit of formula (III) or may comprise or consist of two or more different repeat units of formula (III).

Optionally, the polymer comprising one or more repeat units of formula (III) is a copolymer comprising one or more co-repeat units. If co-repeat units are present then the repeat units of formula (III) may form between 0.1-99 mol % of the repeat units of the polymer, optionally 50-99 mol % or 80-99 mol %. Preferably, the repeat units of formula (I) form at least 50 mol % of the repeat units of the polymer, more preferably at least 60, 70, 80, 90, 95, 98 or 99 mol %. Most preferably the repeat units of the polymer consist of one or more repeat units of formula (I). The or each repeat unit of the polymer may be selected to produce a desired colour of emission of the polymer.

Arylene repeat units of the polymer include, without limitation, fluorene, preferably a 2,7-linked fluorene; phenylene, preferably a 1,4-linked phenylene; naphthalene, anthracene, indenofluorene, phenanthrene and dihydrophenanthrene repeat units.

The polystyrene-equivalent number-average molecular weight (Mn) measured by gel permeation chromatography of the luminescent polymers described herein may be in the range of about 1×10³ to 1×10⁸, and preferably 1×10⁴ to 5×10⁶. The polystyrene-equivalent weight-average molecular weight (Mw) of the polymers described herein may be 1×10³ to 1×10⁸, and preferably 1×10⁴ to 1×10⁷.
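By way of illustration only, the standard definitions of the two averages referred to above, Mn = Σ(Ni·Mi)/Σ(Ni) and Mw = Σ(Ni·Mi²)/Σ(Ni·Mi), can be sketched as follows. The chain counts and molar masses are hypothetical values chosen for the arithmetic, not data from this application, and the function names are likewise illustrative. The range check uses the preferred Mn and Mw ranges stated above.

```python
from typing import Sequence


def number_average_mw(counts: Sequence[float], masses: Sequence[float]) -> float:
    """Mn = sum(N_i * M_i) / sum(N_i)."""
    total_chains = sum(counts)
    return sum(n * m for n, m in zip(counts, masses)) / total_chains


def weight_average_mw(counts: Sequence[float], masses: Sequence[float]) -> float:
    """Mw = sum(N_i * M_i^2) / sum(N_i * M_i)."""
    total_mass = sum(n * m for n, m in zip(counts, masses))
    return sum(n * m * m for n, m in zip(counts, masses)) / total_mass


def in_range(value: float, low: float, high: float) -> bool:
    """True if value lies within the closed interval [low, high]."""
    return low <= value <= high


# Hypothetical distribution: number of chains at each molar mass (g/mol).
counts = [100, 300, 100]
masses = [1.0e4, 5.0e4, 1.0e5]

mn = number_average_mw(counts, masses)
mw = weight_average_mw(counts, masses)
dispersity = mw / mn  # always >= 1; equals 1 only if all chains are identical

# Preferred ranges as stated in the text above.
mn_preferred = in_range(mn, 1e4, 5e6)
mw_preferred = in_range(mw, 1e4, 1e7)

print(f"Mn = {mn:.3g}, Mw = {mw:.3g}, dispersity = {dispersity:.2f}")
print(f"Mn in preferred range: {mn_preferred}; Mw in preferred range: {mw_preferred}")
```

Because Mw weights each chain by its mass, it is never smaller than Mn, which is consistent with the preferred Mw range above extending to a higher upper limit than the preferred Mn range.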

Polymers as described herein are suitably amorphous polymers.

A particulate luminescent marker as described herein may be, without limitation, a micro- or nano-particulate luminescent marker.

In some embodiments, the particulate light emitting marker comprises or consists of a quantum dot. Exemplary luminescent quantum dot materials include, without limitation, metal chalcogenides. Quantum dots include, without limitation, core, core-shell and alloyed quantum dots. In some embodiments, the particulate luminescent marker is a collapsed luminescent polymer.

In some embodiments, the luminescent particles of the particulate luminescent marker comprise a luminescent material and a matrix. The luminescent material may be a fluorescent or phosphorescent luminescent material. The luminescent material may be selected from polymeric or non-polymeric luminescent materials as described anywhere herein.

Preferably, the luminescent particles as described herein have a number average diameter of no more than 500 nm or 400 nm in methanol as measured by dynamic light scattering (DLS) using a Malvern Zetasizer Nano ZS (details of measurement in the Examples). Preferably the particles have a number average diameter of 5-500 nm, optionally 10-200 nm, preferably 20-150 nm, as measured by a Malvern Zetasizer Nano ZS. The inventors have found that light emitting particles having an average diameter of less than 50 nm, such as 20-40 nm, are preferred. In addition, the inventors have produced light emitting particles having a diameter of 30 nm which have an extinction coefficient that is at least two orders of magnitude higher than that of a typical small molecule dye. Particles of this size are also ideally suited for use in current sequencing methods, for example, using substrate beads having diameters in the range of 1-2 µm, or nanowells in the range of 200-600 nm.

The matrix of a luminescent particle comprising a matrix and a luminescent material may at least partially isolate the luminescent material from the surrounding environment. This may limit any effect that the external environment may have on the lifetime of the luminescent material. In some embodiments, the particle comprises the luminescent material homogeneously distributed through the matrix.

In some embodiments, the particle may have a particulate core and, optionally, a shell wherein at least one of the core and shell contains the luminescent material.

In the case where the luminescent material is a polymer, polymer chains of the luminescent polymer may extend across some or all of the thickness of the core and / or shell. Polymer chains may be contained within the core and / or shell or may protrude through the surface of the core and / or shell.

In some embodiments, the particle comprises a core comprising or consisting of the luminescent polymer and a shell comprising or consisting of the matrix. In some embodiments, the particle core consists of the matrix and the luminescent material. In some embodiments, the particle core comprises at least one further material, for example a host material configured to absorb excitation energy from an energy source, e.g. a light source, and transfer energy to the luminescent material. In a preferred embodiment, the host material is a conjugated polymer and the luminescent material is a non-polymeric material.

In the case where the luminescent marker comprises a host material and a luminescent material, it will be understood that the host may be excited by a 2-photon excitation and may transfer energy to the luminescent material. In some embodiments, the host material has a larger 2-photon cross-section than the luminescent material.

Exemplary polymeric matrix materials include, without limitation, polystyrene and homopolymers or copolymers of (alkyl)acrylic acids. A polymeric matrix material may be crosslinked, e.g. a crosslinked chitosan-polyacrylic acid polymer. The polymer matrix may be a self-assembled micelle or vesicle comprising lipid or polymer surfactants. The polymer matrix is preferably an inorganic oxide, optionally silica, alumina or titanium dioxide. The polymer matrix is more preferably silica.

In some embodiments the luminescent material may be covalently bound, directly or indirectly, to the matrix material. In some embodiments, the luminescent material may be mixed with (i.e. not covalently bound to) a matrix material. Preferably, the matrix is not covalently bound to the luminescent material, in which case there is no need for the matrix material and / or the luminescent material to be substituted with reactive groups for forming such covalent bonds, e.g. during formation of the particles.

In some embodiments, a silica matrix as described herein may be formed by polymerisation of a silica monomer in the presence of the luminescent material, for example as described in WO 2018/060722, the contents of which are incorporated herein by reference.

In some embodiments, the polymerisation comprises bringing a solution of silica monomer into contact with an acid or a base. The acid or base may be in solution. The luminescent material may be in solution with the acid or base and / or the silica monomer before the solutions are mixed. Optionally, the solvents of the solutions are selected from water, one or more C1-8 alcohols or a combination thereof.

Polymerising a matrix monomer in the presence of a luminescent polymer may result in one or more chains of the polymer encapsulated within the particle and / or one or more chains of the polymer extending through a particle.

The particles may be formed in a one-step polymerisation process.

Optionally, the silica monomer is an alkoxysilane, preferably a trialkoxy or tetra-alkoxysilane, optionally a C1-12 trialkoxy or tetra-alkoxysilane, for example tetraethyl orthosilicate. The silica monomer may be substituted only with alkoxy groups or may be substituted with one or more groups.

Optionally, at least 0.1 wt% of the total weight of the particle core consists of the luminescent material. Preferably at least 1, 10 or 25 wt% of the total weight of the particle core consists of the luminescent material. Optionally at least 50 wt% of the total weight of the particle core consists of the matrix. Preferably at least 60, 70, 80, 90, 95, 98, 99, 99.5 or 99.9 wt% of the total weight of the particle core consists of the matrix.

EXAMPLES

Example 1

A 532 nm laser was used to excite an aqueous dispersion of fluorescent nanoparticles by a two-photon process. Emission was detected using a detector having a 425-455 nm bandpass filter.

The nanoparticles were prepared by forming silica nanoparticles in the presence of Polymer 1 as disclosed in WO 2018/060722, the contents of which are incorporated herein by reference.

Polymer 1

The same process was carried out using a blank containing water only in place of the aforementioned aqueous dispersion of fluorescent nanoparticles.

The 532 nm frequency-doubled output of a pulsed Nd:YAG laser (Crylas FDSS 532) was used to excite the fluorescence. The laser had a 1 ns pulse width, a pulse energy of 300 µJ and a repetition frequency of 20 Hz. Laser light was focused through a 5X objective lens (numerical aperture of 0.13) into the centre of a quartz cuvette with a 10 mm pathlength. Fluorescence was collected at right angles to the excitation beam through another objective lens and passed through a bandpass filter with a 438 nm centre wavelength and 28 nm width. The light was then directed into an Ocean Optics USB2000+ spectrometer via a fibreoptic cable and collimating lens. The spectrometer was used with a long integration time to integrate the signal from multiple laser pulses.

With reference to Figure 1, emission from the fluorescent nanoparticles was detected but no emission was detected from the water-only control.
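The energy bookkeeping behind this example can be checked with a short calculation, given here as an illustrative sketch rather than as part of the experimental procedure. It uses the 532 nm excitation wavelength and the 438 nm / 28 nm bandpass filter described above, and assumes that the 28 nm figure is the full width of the passband. The function names are illustrative only. Every photon passing the filter carries more energy than a single 532 nm photon but less than two combined, which is consistent with excitation by a two-photon process.

```python
# h*c expressed in eV*nm, so photon energy (eV) = HC_EV_NM / wavelength (nm).
HC_EV_NM = 1239.841984


def photon_energy_ev(wavelength_nm: float) -> float:
    """Energy of a single photon of the given wavelength, in eV."""
    return HC_EV_NM / wavelength_nm


def bandpass_edges_nm(centre_nm: float, width_nm: float) -> tuple[float, float]:
    """Lower and upper passband edges, assuming width_nm is the full width."""
    half = width_nm / 2.0
    return centre_nm - half, centre_nm + half


EXCITATION_NM = 532.0      # excitation wavelength from the example
FILTER_CENTRE_NM = 438.0   # bandpass filter centre wavelength from the example
FILTER_WIDTH_NM = 28.0     # bandpass filter width (assumed to be full width)

one_photon_ev = photon_energy_ev(EXCITATION_NM)
two_photon_ev = 2.0 * one_photon_ev
# Wavelength of a single photon carrying the combined two-photon energy.
equivalent_one_photon_nm = EXCITATION_NM / 2.0

low_nm, high_nm = bandpass_edges_nm(FILTER_CENTRE_NM, FILTER_WIDTH_NM)
# The longest wavelength passed by the filter has the lowest photon energy.
min_detected_ev = photon_energy_ev(high_nm)
max_detected_ev = photon_energy_ev(low_nm)

print(f"One 532 nm photon: {one_photon_ev:.2f} eV")
print(f"Two 532 nm photons: {two_photon_ev:.2f} eV "
      f"(equivalent to one {equivalent_one_photon_nm:.0f} nm photon)")
print(f"Filter passband: {low_nm:.0f}-{high_nm:.0f} nm "
      f"({min_detected_ev:.2f}-{max_detected_ev:.2f} eV)")
```

The computed passband roughly matches the detection window quoted at the start of the example. Since the detected emission is at a shorter wavelength than the excitation light, the bandpass filter also rejects scattered 532 nm laser light.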