Title:
SINGLE-MOLECULE APTAMER FRET FOR PROTEIN IDENTIFICATION AND STRUCTURAL ANALYSIS
Document Type and Number:
WIPO Patent Application WO/2024/049290
Kind Code:
A1
Abstract:
The invention provides a method for characterization of a structure of a protein using a first probe and a second probe, wherein the method comprises: an exposure stage comprising: (i) exposing the protein to the second probe, (ii) providing radiation to the protein, wherein the radiation has a wavelength selected from a donor excitation radiation range, and (iii) measuring emission in a donor emission radiation range and an acceptor emission radiation range to provide an emission signal; wherein the exposure stage is protein degradation-free; and wherein: the protein comprises a first binding site and a second binding site; the first probe is: (i) covalently bound to the protein at the first binding site; or (ii) configured to transiently bind the protein at the first binding site, wherein the first probe comprises a first chromophore; the second probe is configured to transiently bind the protein at the second binding site with an off-rate selected from the range of 0.01 – 10 s⁻¹, wherein the second probe comprises a second chromophore, wherein the second probe comprises an affinity-based probe selected from the group comprising an aptamer, an antibody, a nanobody, and a small-molecule moiety; the first chromophore and the second chromophore are selected from FRET donor-acceptor pair chromophores, wherein the FRET donor-acceptor pair chromophores have the donor excitation radiation range, the donor emission radiation range and the acceptor emission radiation range, wherein a donor of the FRET donor-acceptor pair chromophores is excitable by donor excitation radiation in the donor excitation radiation range, wherein an acceptor of the FRET donor-acceptor pair chromophores is configured to provide acceptor emission radiation in the acceptor emission radiation range upon excitation with donor excitation radiation of the donor when the first chromophore and the second chromophore are configured within a FRET distance selected from the range of 0.1 – 10 nm.

Inventors:
JOO CHIRLMIN (NL)
VAN WEE RAMAN GAUTAM (NL)
FILIUS MIKE (NL)
Application Number:
PCT/NL2023/050437
Publication Date:
March 07, 2024
Filing Date:
August 28, 2023
Assignee:
UNIV DELFT TECH (NL)
International Classes:
G01N33/542
Domestic Patent References:
WO2021049940A1 2021-03-18
WO2022096677A1 2022-05-12
WO2016050813A1 2016-04-07
WO2005059509A2 2005-06-30
WO2018102759A1 2018-06-07
Other References:
H. DACRES ET AL, ANALYTICAL CHEMISTRY, vol. 82, no. 1, 2010, pages 432 - 435, XP055030784
CORZO JAVIER: "Time, the forgotten dimension of ligand binding teaching", BIOCHEMISTRY AND MOLECULAR BIOLOGY EDUCATION, vol. 34, no. 6, 2006, pages 413 - 416, XP093101526
ZHANG QIANLI ET AL: "Engineered fast-dissociating antibody fragments for multiplexed super-resolution microscopy", CELL REPORTS METHODS, vol. 2, no. 10, October 2022 (2022-10-01), pages 100301, XP093101931
Attorney, Agent or Firm:
EDP PATENT ATTORNEYS B.V. (NL)
Claims:
CLAIMS:

1. A method for characterization of a structure (13) of a protein (10) using a first probe (31) and a second probe (32), wherein the method comprises: an exposure stage comprising: (i) exposing the protein (10) to the second probe (32), (ii) providing radiation (50) to the protein (10), wherein the radiation has a wavelength selected from a donor excitation radiation range, and (iii) measuring emission (60) in a donor emission radiation range and an acceptor emission radiation range to provide an emission signal, wherein the exposure stage is protein degradation-free; and wherein: the protein (10) comprises a first binding site (11) and a second binding site (12); the first probe (31) is: (i) covalently bound to the protein (10) at the first binding site (11); or (ii) configured to transiently bind the protein (10) at the first binding site (11), wherein the first probe (31) comprises a first chromophore (21); the second probe (32) is configured to transiently bind the protein (10) at the second binding site (12) with an off-rate selected from the range of 0.01 - 10 s⁻¹, wherein the second probe (32) comprises a second chromophore (22), wherein the second probe (32) comprises an affinity-based probe (35) selected from the group comprising an aptamer (36), an antibody, a nanobody, and a small-molecule moiety (37); the first chromophore (21) and the second chromophore (22) are selected from FRET donor-acceptor pair chromophores (20), wherein the FRET donor-acceptor pair chromophores (20) have the donor excitation radiation range, the donor emission radiation range and the acceptor emission radiation range, wherein a donor (23) of the FRET donor-acceptor pair chromophores (20) is excitable by donor excitation radiation (53) in the donor excitation radiation range, wherein an acceptor (24) of the FRET donor-acceptor pair chromophores (20) is configured to provide acceptor emission radiation (64) in the acceptor emission radiation range upon excitation with donor excitation radiation (53) of the donor (23) when the first chromophore (21) and the second chromophore (22) are configured within a FRET distance selected from the range of 0.1 - 10 nm.

2. The method according to claim 1, wherein the method comprises: a distance estimation stage comprising estimating a distance (d1) between the first binding site (11) and the second binding site (12) based on the emission signal; and wherein the emission signal comprises the ratio between the emission (63) from the donor emission radiation range and the emission (64) from the acceptor emission radiation range.

3. The method according to claim 2, wherein the method comprises: a structure prediction stage comprising predicting the structure (13) of the protein (10) based on the estimated distance (d1); wherein the structure (13) is selected from the group comprising a secondary protein structure, a tertiary protein structure, and a quaternary protein structure.

4. The method according to any of the preceding claims, wherein the second probe (32) comprises the aptamer (36), wherein the aptamer (36) is configured to transiently bind the protein (10) at the second binding site (12) with an off-rate selected from the range of 0.05-2 s⁻¹.

5. The method according to claim 4, wherein the aptamer (36) comprises a DNA aptamer, wherein the DNA aptamer comprises 10-80 nucleotides.

6. The method according to any one of claims 1-3, wherein the second probe (32) comprises the small-molecule moiety (37), wherein the small-molecule moiety (37) is configured to transiently bind the protein (10) at the second binding site (12) with an off-rate selected from the range of 0.05-2 s⁻¹.

7. The method according to any one of the preceding claims, wherein the method comprises: a binding stage comprising covalently binding the first probe (31) to the first binding site (11), wherein the binding stage precedes the exposure stage.

8. The method according to claim 7, wherein the first binding site (11) is selected from the group comprising an N-terminal end (19) of the protein (10) and a C-terminal end (18) of the protein (10).

9. The method according to any one of the preceding claims 1-6, wherein the first probe (31) is configured to transiently bind the protein (10) at the first binding site (11), wherein the first binding site (11) is different from the second binding site (12), and wherein the first probe (31) comprises the affinity-based probe selected from the group comprising the aptamer (36), the antibody, the nanobody, and the small-molecule moiety (37), and wherein the first probe (31) is configured to transiently bind the protein (10) at the first binding site (11) with an off-rate selected from the range of 0.05-2 s⁻¹.

10. The method according to claim 9, wherein the first chromophore (21) comprises the donor (23), and wherein the second chromophore (22) comprises the acceptor (24), and wherein the off-rate of the first probe (31) binding the protein (10) at the first binding site (11) is higher than the off-rate of the second probe (32) binding the protein (10) at the second binding site (12); wherein the on-rate of the first probe (31) binding the protein (10) at the first binding site (11) is higher than the on-rate of the second probe (32) binding the protein (10) at the second binding site (12); and wherein the molar ratio between the second probe (32) and the first probe (31) is selected from the range of 2:1 - 20:1, and wherein the exposure stage comprises exposing the protein (10) to the first probe (31).

11. The method according to any one of the preceding claims, wherein the exposure stage comprises sequentially exposing the protein (10) to different second probes (32).

12. The method according to any one of the preceding claims, wherein the method further comprises: a fingerprint provision stage comprising providing a protein fingerprint based on the emission signal; a protein identification stage comprising identifying the protein (10) by comparing the protein fingerprint to protein-related information in reference data.

13. A system (200) for characterization of a structure (13) of a protein (10), wherein the system (200) comprises an analytical space (210), a probe supply (230), a radiation source (250), a single-molecule fluorescence microscope (240), and a control system (300), wherein the analytical space (210) is configured to host the protein (10), wherein the probe supply (230) is configured to provide probes to the analytical space (210), wherein the radiation source (250) is configured to provide radiation (50) to the analytical space (210), wherein the single-molecule fluorescence microscope (240) is configured to measure emission (60) in a donor emission radiation range and in an acceptor emission radiation range in the analytical space (210) and to provide an emission signal to the control system (300), and wherein in an operational mode the control system (300) is configured to execute the method according to any one of the preceding claims 1-12.

14. The system (200) according to claim 13, wherein the control system (300) is configured to estimate a protein fingerprint based on the emission signal, and wherein the control system (300) is configured to identify the protein (10) by comparing the protein fingerprint to protein-related information in reference data.

15. The system (200) according to any one of the preceding claims 13-14, wherein the control system (300) is configured to estimate a distance (d1) based on the emission signal, wherein the control system (300) is configured to predict the structure (13) of the protein (10) based on the estimated distance (d1).

16. A data carrier (400) having stored thereon program instructions, which when executed by the system (200) according to any one of preceding claims 13-15 causes the system (200) to execute the method according to any one of preceding claims 1-12.

Description:
Single-molecule aptamer FRET for protein identification and structural analysis

FIELD OF THE INVENTION

The invention relates to a method for characterization of a structure of a protein using a first probe and a second probe. The invention further relates to a system for characterization of a structure of a protein. The invention further relates to a data carrier.

BACKGROUND OF THE INVENTION

Protein identification and structural analysis using affinity probes are known in the art. For instance, WO2018102759A1 relates to methods and systems for identifying a protein within a sample. A panel of antibodies are acquired, none of which are specific for a single protein or family of proteins. Additionally, the binding properties of the antibodies in the panel are determined. Further, the protein is iteratively exposed to a panel of antibodies. Additionally, a set of antibodies which bind the protein are determined. The identity of the protein is determined using one or more deconvolution methods based on the known binding properties of the antibodies to match the set of antibodies to a sequence of a protein.

WO2021049940A1 describes an analysis method for characterization of a tagged protein using FRET donor-acceptor pair chromophores, wherein the FRET donor-acceptor pair chromophores comprise a first chromophore and a second chromophore, wherein the FRET donor-acceptor pair chromophores have a donor excitation radiation range, a donor emission radiation range and an acceptor emission radiation range, wherein one of the FRET donor-acceptor pair chromophores is excitable by donor excitation radiation in the donor excitation radiation range, wherein the other of the FRET donor-acceptor pair chromophores is configured to provide acceptor emission radiation in the acceptor emission radiation range upon excitation with donor excitation radiation in the donor excitation radiation range of the one of the FRET donor-acceptor pair chromophores when the first chromophore and the second chromophore are configured within a predetermined distance, wherein the tagged protein comprises a first amino acid tagged with a first tag and a second amino acid tagged with a second tag, wherein the first tag comprises the first chromophore or is associated to the first chromophore, wherein the second tag comprises an oligonucleotide, and wherein the analysis method comprises: a barcode exposure stage comprising: (i) exposing the tagged protein to a barcode, wherein the barcode is configured to hybridize with the second tag, and wherein the barcode comprises the second chromophore, (ii) providing radiation having a wavelength selected from the donor excitation radiation range to the tagged protein, and (iii) measuring emission in the donor emission radiation range and the acceptor emission radiation range to provide an emission signal.

WO2022096677A1 describes a method for obtaining quantitative information on average donor-acceptor distance changes within a molecule or in between molecules using ensemble Förster resonance energy transfer (eFRET), and a measurement system comprising a controller adapted for performing the same, wherein the method comprises the steps of performing an eFRET measurement for preferably a donor-, an acceptor-, and at least a donor-acceptor-labelled sample, wherein each sample comprises respective labelled molecule copies and wherein each eFRET measurement is performed using multiple respective labelled molecule copies, under a first and a second condition, correcting the obtained results for fluorophore-specific, condition-specific and inter-condition effects, determining condition-specific eFRET efficiencies based on the corrected results, and determining quantitative information on donor-acceptor distance changes within the molecule between the first and the second condition based on the respective condition-specific eFRET efficiency.

WO2016050813A1 describes a method for detecting a spatial proximity of a first and a second epitope of a protein or of a first and a second protein of a protein complex in a sample of a subject. The method comprises binding a first binding member having a first oligonucleotide conjugated thereto to the first epitope, binding a second binding member having a second oligonucleotide conjugated thereto to the second epitope, and determining whether a Fluorescence Resonance Energy Transfer (FRET) effect is present between a donor fluorophore and an acceptor fluorophore, which are associated with the first oligonucleotide and the second oligonucleotide, wherein the presence of the FRET effect indicates a spatial proximity of the first and the second oligonucleotide and, thus, the spatial proximity of the first and the second epitope.

SUMMARY OF THE INVENTION

Proteins are biochemical workhorses in all living cells. The many thousands of different proteins sustain the functions of the cell, from copying DNA and catalyzing basic metabolism to producing cellular motion and more. For understanding of biological processes and their (dys)regulation, including diseases, it may be critical to identify and monitor the protein composition of cells by sequencing (i.e. determination of the amino acid sequence of proteins) and/or structural analysis (i.e. determination of the three-dimensional structure of proteins). However, assigning the function to proteins remains one of the biggest challenges in fundamental and biomedical research. This is partly because it is not known how large the human proteome is. Currently there are reports suggesting that the human proteome can be as small as 20,000 proteins to as large as several millions. Recently, a term “proteoform” has been defined. It refers to each individual molecular form of a protein that is derived from a single protein-encoding gene. There are several layers of cellular processes that can result in a large number of proteoforms transcribed from a single gene.

The number of protein-encoding genes in the human genome may be estimated to be about 20,000. If all protein-encoding genes result in a single protein, then the number of different proteins in the human proteome may also be about 20,000. However, a process known as alternative splicing may increase the number of transcripts to approximately 80,000. The complexity of the proteome further increases due to post-translational modifications (PTMs). The PTMs may result in many hundreds of thousands of additional protein variants. The large variety and number of PTMs may affect the protein function. Subtle differences in highly similar proteoforms may have profound effects on health. Moreover, the concentration range at which different proteoforms exist in the cell spans several orders of magnitude, with some of them being present with just a few copies per cell. To address these challenges, a method that can detect and discriminate proteoforms with single-molecule sensitivity may be needed. However, protein sequencing and structural analysis remain a challenge, especially when only a small sample of the protein is available.

Modern protein analysis commonly utilizes mass spectrometry-based identification techniques, which determine the precise mass of a protein. Such current methods may suffer from limitations. First, they can analyze only fragments of proteins. Information on those fragments is then used to reconstruct the full-length amino acid sequence, but this often fails due to the combinatorial complexity. Second, they often fail to recognize minor protein species among highly abundant protein species, since sequence prediction is made through analysis of complex spectral peaks. As many important cellular proteins such as signaling proteins exist in very low abundance, it can be difficult to obtain comprehensive proteomic information. Early detection of diseases may also rely on detection of low concentrations of protein biomarkers and thereby creates a demand for protein sequencing and structural analysis techniques capable of working at the single-molecule scale. In addition, it may be particularly challenging to analyze proteoforms of a protein using mass spectrometry approaches.

Single-molecule techniques are cutting-edge detection tools to study biological processes at the nanoscale and may be suited for samples containing target molecules (such as proteins) with low copy numbers. Potential single-molecule protein sequencing approaches have recently been explored. For example, nanopores based on α-hemolysin have been used to distinguish between non-phosphorylated and phosphorylated proteins. Further, a comparable biological nanopore in combination with a motor protein complex has been used to control protein translocation through a nanopore. Edman degradation has been used for fluorescence detection of single peptides and single-molecule fluorescence fingerprinting of peptides and cellular proteins has been demonstrated using a motor protein complex.

Although protein features including phosphorylation status and structural domains could be detected, it may remain challenging to identify the amino acid sequence of a protein using single-molecule techniques. Whereas DNA and RNA consist of 4 nucleotide building blocks, proteins may typically comprise up to 20 different amino acid building blocks. Independent of the readout method of choice, full protein sequencing may require the detection of 20 distinguishable signals, which has so far not been demonstrated.

Prior art methods for protein characterization and identification may require linearization and fragmentation of the protein, hence losing the folded conformation and structure of the whole protein. Further, such methods may require the use of enzymes for translocation, linearization, and fragmentation of the protein, all of which may alter the protein structure via enzymatic activity.

Prior art methods may require (covalently) attaching linkers to specific amino acid residues of a protein. Thereby, such methods may be limited to analyses only of those amino acid residues for which there is linker chemistry available. Further, the attaching of such linkers to amino acid residues may be labor-intensive, wasteful with regards to reagents (and thus costly), and may result in incomplete or off-target labelling. In addition, the attachment of such linkers may complicate the procedure, may hamper flexibly analyzing the protein, and may result in changes to the structure of the protein, such as to the secondary, tertiary and/or quaternary structure of the protein.

Hence, it is an aspect of the invention to provide an alternative method for characterization of a structure of a protein, which preferably further at least partly obviates one or more of above-described drawbacks. The present invention may have as object to overcome or ameliorate at least one of the disadvantages of the prior art, or to provide a useful alternative.

In a first aspect, the invention provides a method for characterization of a structure of a protein using a first probe and a second probe. In embodiments, the method may comprise an exposure stage. The exposure stage may comprise exposing the protein to the second probe. The exposure stage may further comprise providing excitation radiation (or “radiation”) to the protein. The radiation (or: “light”) may especially have a wavelength selected from a donor excitation radiation range. The exposure stage may additionally comprise measuring emission to provide an emission signal. The emission may especially be measured in a donor emission radiation range and an acceptor emission radiation range. Moreover, the exposure stage may be protein degradation-free. In embodiments, the protein may comprise a first binding site and a second binding site. The first probe may in embodiments be covalently bound to the protein at the first binding site. The first probe may in other embodiments be configured to transiently bind the protein at the first binding site. The first probe may especially comprise a first chromophore. In embodiments, the second probe may be configured to transiently bind the protein at a second binding site. The second probe may comprise a second chromophore. Moreover, the second probe may comprise a second affinity-based probe, especially a second affinity-based probe selected from the group comprising an aptamer, an antibody, a nanobody, and a small-molecule moiety. In embodiments, the first chromophore and the second chromophore may be selected from Förster Resonance Energy Transfer (also: “Fluorescence Resonance Energy Transfer” or “FRET”) donor-acceptor pair chromophores. The FRET donor-acceptor pair chromophores may have the donor excitation radiation range, the donor emission radiation range, and the acceptor emission radiation range. Especially, a donor chromophore (or: “donor”) of the FRET donor-acceptor pair chromophores may be excitable by donor excitation radiation in the donor excitation radiation range. Further, an acceptor chromophore (or: “acceptor”) of the FRET donor-acceptor pair chromophores may be configured to provide acceptor emission radiation in the acceptor emission radiation range. This may in embodiments especially happen upon excitation with donor excitation radiation in the donor excitation radiation range of the donor of the FRET donor-acceptor pair chromophores when the first chromophore and the second chromophore are configured within a FRET distance, especially wherein the FRET distance is selected from the range of 0.1 - 10 nm.

Hence, in embodiments the invention provides a method for characterization of a structure of a protein using a first probe and a second probe, wherein the method comprises: an exposure stage comprising: (i) exposing the protein to the second probe, (ii) providing radiation to the protein, wherein the radiation has a wavelength selected from a donor excitation radiation range, and (iii) measuring emission in a donor emission radiation range and an acceptor emission radiation range to provide an emission signal; wherein the exposure stage is protein degradation-free; and wherein: the protein comprises a first binding site and a second binding site; the first probe is: (i) covalently bound to the protein at the first binding site; or (ii) configured to transiently bind the protein at the first binding site, wherein the first probe comprises a first chromophore; the second probe is configured to transiently bind the protein at the second binding site with an off-rate selected from the range of 0.01 - 10 s⁻¹, wherein the second probe comprises a second chromophore, wherein the second probe comprises an affinity-based probe selected from the group comprising an aptamer, an antibody, a nanobody, and a small-molecule moiety; the first chromophore and the second chromophore are selected from FRET donor-acceptor pair chromophores, wherein the FRET donor-acceptor pair chromophores have the donor excitation radiation range, the donor emission radiation range and the acceptor emission radiation range, wherein a donor of the FRET donor-acceptor pair chromophores is excitable by donor excitation radiation in the donor excitation radiation range, wherein an acceptor of the FRET donor-acceptor pair chromophores is configured to provide acceptor emission radiation in the acceptor emission radiation range upon excitation with donor excitation radiation of the donor when the first chromophore and the second chromophore are configured within a FRET distance selected from the range of 0.1 - 10 nm.
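
By way of illustration only, the off-rate range recited above may be related to how long a transiently binding probe remains on its binding site. The following Python sketch assumes simple single-exponential dissociation kinetics, which is an assumption made here for illustration and is not stated in the application; the function names are hypothetical.

```python
import math


def mean_bound_time(k_off: float) -> float:
    """Mean dwell time (s) of a bound probe.

    Assumes single-exponential dissociation, so the mean dwell time
    is simply the reciprocal of the off-rate (in 1/s).
    """
    if k_off <= 0:
        raise ValueError("k_off must be positive")
    return 1.0 / k_off


def fraction_still_bound(k_off: float, t: float) -> float:
    """Probability that a probe bound at t = 0 is still bound after t seconds."""
    return math.exp(-k_off * t)


# Boundaries of the off-rate ranges recited in the claims (0.01-10 and 0.05-2 1/s).
for k_off in (0.01, 0.05, 2.0, 10.0):
    print(f"k_off = {k_off:g} 1/s -> mean bound time = {mean_bound_time(k_off):g} s")
```

Under this assumption, the range of 0.01 - 10 s⁻¹ would correspond to mean bound times between roughly 100 s and 0.1 s, and the narrower range of 0.05 - 2 s⁻¹ to roughly 20 s and 0.5 s.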

The method of the invention may thus comprise exposing a protein to a set of probes comprising FRET chromophores, wherein each probe associates with, especially (transiently) binds, to the protein at a respective binding site. As a FRET donor may, upon excitation by excitation radiation, transfer energy (in a non-radiative fashion) to a matching FRET acceptor in a highly distance-dependent manner (see below), and as both the FRET donor and the FRET acceptor may provide emission radiation following excitation, the distance between the binding sites may be observed by providing excitation radiation to the protein and by observing the emission radiation provided by the donor and the acceptor. In particular, by measuring a plurality of emission signals, and especially by determining a plurality of (pairwise) distances, such as between a first binding site and a plurality of second binding sites, or such as between (pairwise) combinations of first and second binding sites, a protein fingerprint may be obtained, and especially a protein structure may be determined.
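
By way of illustration only, the distance dependence referred to above may be sketched with the standard Förster relation E = 1 / (1 + (r/R0)^6). The Python sketch below is not taken from the application: the correction factor `gamma`, the Förster radius of 5.0 nm, and the function names are illustrative assumptions, and a real measurement would typically also require background and crosstalk corrections.

```python
def fret_efficiency(donor_intensity: float, acceptor_intensity: float,
                    gamma: float = 1.0) -> float:
    """Apparent FRET efficiency E = I_A / (I_A + gamma * I_D).

    gamma is an assumed detection-correction factor (1.0 = no correction).
    """
    total = acceptor_intensity + gamma * donor_intensity
    if total <= 0:
        raise ValueError("total corrected intensity must be positive")
    return acceptor_intensity / total


def distance_from_efficiency(efficiency: float, r0_nm: float) -> float:
    """Invert E = 1 / (1 + (r / R0)**6) to obtain the distance r in nm."""
    if not 0.0 < efficiency < 1.0:
        raise ValueError("efficiency must lie strictly between 0 and 1")
    return r0_nm * (1.0 / efficiency - 1.0) ** (1.0 / 6.0)


# Hypothetical example: equal donor and acceptor emission gives E = 0.5,
# which by definition corresponds to a distance equal to R0.
R0_NM = 5.0  # illustrative Förster radius, not a value from the application
e = fret_efficiency(donor_intensity=100.0, acceptor_intensity=100.0)
print(e, distance_from_efficiency(e, R0_NM))
```

Because the efficiency falls off with the sixth power of the distance, the estimate is most sensitive for distances close to R0, which is consistent with the FRET distance range of a few nanometers discussed above.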

With the present invention, the method for characterization of a structure of a protein using a first probe and a second probe may provide a single-molecule protein profiling platform that may facilitate the identification of proteins based on a unique “protein fingerprint” by combining a single-molecule FRET technique with affinity-based probes. Such a protein fingerprint may be obtained by probing the target proteins with a large number of fluorescently labeled probes that provide a signal on the distance of their binding epitopes to a reference point. Hence, this platform provides a sequencing technique that may examine the protein structural profile and may consequently allow identification of the protein sequence. This method of single-molecule FRET detection, hereafter also called “protein fingerprinting” or “single-molecule aptamer FRET (SMAF) spectroscopy”, may also facilitate determining and/or predicting protein structures.

Unlike mass spectrometry-based methods, protein fingerprinting may directly probe full-length proteins, thereby improving the accuracy of protein identification. By identifying single molecules, protein fingerprinting may provide a sensitivity of one molecule and may hence facilitate sequencing of proteins from sample amounts 3-5 orders of magnitude smaller than those needed for mass spectrometry. Protein fingerprinting may further facilitate technologies such as single-cell proteomics and real-time screening for on-site medical diagnostics and early-stage disease diagnosis.

Further, the protein fingerprinting method may, in embodiments, not require chemical conjugation of labels to target amino acid residues and may thus not be hampered by low labeling efficiencies. Therefore, the throughput and sensitivity of the protein fingerprinting method may be significantly higher than those of currently used methods, and the method may thus be more readily applicable. Additionally, identifying proteins through fluorescently labeled, transiently binding probes may circumvent labor-intensive, expensive, inefficient and complicated covalent attachment of reporter molecules (e.g. fluorescent dyes or DNA handles) to the target proteins. Hence, the invention may provide the benefit that the need for covalently binding linkers to a protein in a preprocessing step can be substantially reduced, such as by limiting covalent binding to binding sites at which it is (relatively) convenient and efficient, or even eliminated. Further, the method of the invention may facilitate targeting 3D-structures with the affinity probes, thereby providing a larger flexibility in the binding sites that can be evaluated in comparison to methods based on covalent binding. In particular, the reduction in preprocessing may provide for a higher analysis throughput and reduced costs as (covalent) protein labelling may be at least partially avoided.

The method of the invention may further specifically involve the use of promiscuous affinity probes that may transiently bind a large variety of proteins, i.e., where affinity probes may typically be designed to be highly specific, the method of the invention may beneficially use affinity probes binding a multitude of (similar) binding sites.

In particular, the method may provide a FRET efficiency pattern based on one or more emission signals, especially based on a plurality of emission signals. The FRET efficiency pattern may be characteristic of the protein. Especially, the FRET efficiency pattern may comprise a protein fingerprint.
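
By way of illustration only, comparing a measured FRET efficiency pattern to reference data may be sketched as a nearest-match lookup. The Python sketch below is an assumption made for illustration: the application does not prescribe a comparison metric, and the reference entries, protein names, and Euclidean distance measure used here are hypothetical.

```python
import math

# Hypothetical reference data: one FRET efficiency per second probe,
# listed in the same probe order for every protein.
REFERENCE = {
    "protein_A": [0.82, 0.35, 0.10],
    "protein_B": [0.20, 0.75, 0.55],
    "protein_C": [0.50, 0.50, 0.90],
}


def fingerprint_distance(a, b):
    """Euclidean distance between two FRET efficiency patterns."""
    if len(a) != len(b):
        raise ValueError("fingerprints must use the same set of probes")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def identify(measured, reference):
    """Return the name of the reference entry closest to the measured pattern."""
    return min(reference,
               key=lambda name: fingerprint_distance(measured, reference[name]))


# A noisy measurement lying closest to the hypothetical protein_A entry.
print(identify([0.78, 0.40, 0.15], REFERENCE))
```

In practice, a threshold on the smallest distance could be added so that a pattern matching no reference entry well is reported as unidentified rather than forced onto the nearest entry.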

Hence, the invention may provide a method for characterization of a protein. The method may involve analyzing a protein to determine a protein characteristic. In particular, the term “characterization of a protein” may herein refer to one or more of identifying the protein, especially via sequencing, or determining (at least part of) the 3-D protein structure of the protein.

As indicated, the invention may provide a method for characterization of a structure of a protein using a first probe and a second probe. Proteins may have up to four distinct levels of structural organization. The primary protein structure relates to the linear sequence of covalently bound amino acids that comprise the protein, beginning on one end of the amino acid sequence at an amino-terminal (or: “N-terminal”) amino acid and ending on the other end of the amino acid sequence at a carboxyl-terminal (or: “C-terminal”) amino acid. Hence, the primary protein structure begins at an N-terminal end and ends at a C-terminal end. The secondary protein structure relates to the highly regular structural formations localized to the backbone chain formed by the amino acid sequence. The tertiary protein structure relates to the total three-dimensional structure created by the single amino acid sequence of a single protein. The tertiary protein structure of a single protein may be subdivided into parts called protein domains. The quaternary protein structure relates to the total three-dimensional structure created by the aggregation of two or more amino acid sequences in multiprotein complexes. Hence, single proteins consisting of a single amino acid sequence may not comprise a quaternary protein structure. The protein fingerprinting method may provide for the characterization of one or more protein structure levels.

Hence, in embodiments, the method may be for characterization of a primary protein structure. In further embodiments, the method may be for characterization of a secondary protein structure. In further embodiments, the method may be for characterization of a tertiary protein structure. In further embodiments, the method may be for characterization of a quaternary protein structure.

In embodiments, the protein fingerprinting method employs a first probe and a second probe. The first probe and second probe may interact with a localized protein substructure, such as a protein domain, protein sequence, or amino acid, optionally in a highly specific manner. Thus their binding may be a reliable indicator for the presence of a particular localized protein substructure. Examining the binding of a different first probe and second probe to a protein or protein substrate in relation to each other may provide information about the presence and locations of the localized protein substructure that may be present in the protein. This information may facilitate characterizing, such as identifying, the protein.

Herein, the protein may comprise a first binding site and a second binding site. The first binding site may be the localized protein substructure at which the first probe is targeted. In specific embodiments, the first probe may be covalently bound to the protein at the first binding site, and hence may be (irreversibly) bound to the protein during the fingerprinting method. In such embodiments, the method may comprise a binding stage comprising covalently binding the first probe to the protein at the first binding site. The binding stage may in embodiments precede the exposure stage, which may comprise binding the second probe to the protein at the second binding site for the fingerprinting method. In other embodiments, the first probe may be configured to transiently bind the protein at the first binding site, and hence may be reversibly bound to the protein during the fingerprinting method. The first binding site may especially be a predetermined localized protein substructure on the target protein with a known localization within the protein structure.

The first binding site may be selected from the group comprising an N-terminal end of the protein and a C-terminal end of the protein, which may typically be present on every protein as the two ends of the amino acid sequence. Further, binding sites from this group may be distinct from other binding sites, such as from other amino acids, in the protein sequence. The first binding site may especially comprise the N-terminal end of the protein. In further embodiments, the first binding site may especially comprise the C-terminal end of the protein.

In embodiments, the protein may especially be immobilized at the first binding site. For instance, the first probe may be (covalently) bound to the protein at the first binding site, and may be immobilized on a structure, such as via a binding to an analytical surface. In such embodiments, the binding stage may comprise binding the protein to the first probe during the binding stage, wherein the first probe is immobilized (on a structure), or wherein the binding stage further comprises immobilizing the first probe on a structure. In other embodiments, the protein may be immobilized at a different binding site, such as a localized protein substructure, than the first binding site. In certain embodiments, the protein may be immobilized (at either the first binding site or at a different binding site) prior to the first probe binding to the first binding site.

The immobilization may be achieved using covalent chemistry. The protein may be immobilized to a surface, such as a glass surface or quartz surface, by using biotin-streptavidin binding or by using a covalent chemical approach (such as amine-NHS, 2PCA or thiol chemistry). Biotin or a biotinylated DNA linker may be attached to the protein for surface immobilization. Especially, NHS chemistry, 2PCA chemistry or an alkyne-ketene reaction may be used for N-terminal immobilization of the protein. In other embodiments, a decarboxylative alkylation reaction may be used for C-terminal immobilization of the protein. Alternatively, the protein may be immobilized via different approaches such as via binding to capture molecules, for example an aptamer, a small-molecule moiety (e.g. inhibitors or other ligands), an antibody, or a nanobody.

In general, the first probe may comprise a single molecule, i.e., a single molecule may be configured to bind the protein (at the first binding site) and may comprise the first chromophore. However, in embodiments, the first probe may comprise two or more different molecules (or “entities”), especially two molecules comprising complementary nucleotide sequences. For instance, in embodiments, the first probe may comprise a protein binding molecule and a chromophore molecule. In such embodiments, the protein binding molecule may bind the protein at the first binding site, especially covalently bind the protein at the first binding site, and the chromophore molecule may (transiently) associate with the protein binding molecule. In particular, in such embodiments, the protein binding molecule and the chromophore molecule may comprise complementary nucleotide sequences, such as complementary DNA sequences. In further embodiments, the protein binding molecule may further comprise an immobilization part configured to immobilize the protein (on a structure). For example, a protein binding molecule comprising single-stranded DNA and a biotin modification may be immobilized (on a structure) using the biotin modification, and a chromophore molecule comprising a first chromophore, such as an acceptor chromophore, may be configured to bind to the single-stranded DNA via a complementary nucleotide sequence (attached to the first chromophore). Such embodiments may facilitate arranging a first chromophore particularly close to the first binding site, which may ease determining the distance between the first binding site and the second binding site (see below).

Hence, in embodiments, the first probe may comprise a protein binding molecule and a chromophore molecule, wherein the protein binding molecule and the chromophore molecule comprise complementary nucleotide sequences (to one another). In such embodiments, the protein binding molecule may especially be configured to covalently bind to the protein. Hence, in further embodiments, the method may comprise covalently binding the protein binding molecule to the protein, especially to the first binding site. In further embodiments, the exposure stage may comprise exposing the protein to the chromophore molecule.

Similarly, in general, the second probe may comprise a single molecule, i.e., a single (second) molecule may be configured to bind the protein (at the second binding site) and may comprise the second chromophore.

The second binding site may be the localized protein substructure at which the second probe is targeted. In embodiments, the second probe may be configured to transiently bind the protein at a second binding site. Especially, the second probe may comprise a second affinity-based probe, and hence may reversibly bind to the second binding site during the method, especially during the exposure stage. The second affinity-based probe may especially be selected from the group comprising an aptamer, an antibody, a nanobody, and a small-molecule moiety. Probes selected from this group of organic compounds may typically be able to target localized protein substructures with high affinity. This may provide an advantage as the fingerprinting method may not require conjugation of fluorophores or other biomolecules to the protein of interest. This is an advantage because protein labeling is typically (relatively) inefficient. Hence, reducing, especially avoiding (see below), protein labeling may make the method simpler in execution and cheaper, and may shorten sample preparation.

Aptamers are single-stranded DNA, RNA or peptide nanostructures, typically of tens of base pairs in length. Aptamers may, in embodiments, be selected from the group comprising DNA aptamers (including synthetic DNA aptamers, such as L-DNA aptamers), RNA aptamers, short peptide aptamers, peptide nucleic acid (PNA) aptamers, locked nucleic acid (LNA) aptamers, and slow off-rate modified aptamers (SOMAmers). Aptamers may be partially self-complementary and may fold into secondary structures and may then be able to transiently bind a specific region (or epitope) in a target molecule. Currently, antibodies or nanobodies may be the standard probes for protein staining, but aptamers may have several advantages. For example, aptamers may be chemically synthesized, which may make their production faster and cheaper, and the product purer, than for antibodies. Furthermore, the increased stability of aptamers may translate into a longer shelf-life and a greater resistance against degrading conditions (such as high temperatures, high or low pH, etc.) than antibodies have. Also, unlike the amino acids that are the building blocks of antibodies, the nucleotides that may comprise the DNA or RNA aptamer backbone can easily be sequenced for identification, enabling highly parallel assays. Different aptamers may have different binding kinetics.

Hence, in embodiments, the second probe may comprise an aptamer. The aptamer may comprise 2-80 aptamer monomers. Further, the aptamer may comprise 5-80 aptamer monomers, such as 5-70 aptamer monomers. The aptamer may in embodiments comprise 10-70 aptamer monomers, especially 10-60 aptamer monomers. Moreover, the aptamer may comprise 20-60 aptamer monomers, such as 20-50 aptamer monomers.

In embodiments, the aptamer may especially comprise a DNA aptamer. The DNA aptamer may comprise 10-80 aptamer monomers, i.e., 10-80 nucleotides. Further, the DNA aptamer may comprise 12-80 aptamer monomers, such as 12-70 aptamer monomers. The DNA aptamer may in embodiments comprise 15-70 aptamer monomers, especially 15-60 aptamer monomers. Moreover, the DNA aptamer may comprise 20-60 aptamer monomers, such as 20-50 aptamer monomers.

In certain embodiments, the second probe may comprise an antibody (or: “immunoglobulin”). Antibodies are glycoproteins that may be naturally produced by the immune system of vertebrate animals and may be configured to selectively bind specific molecules or molecule domains, such as protein domains. The target molecule or molecule domain of an antibody is called an epitope. Through the recombination of so-called “variable domains” comprised by antibodies, the immune system of vertebrate animals may be able to produce many varieties of antibodies with different epitopes, thereby facilitating the ability of the vertebrate animal immune system to recognize and neutralize foreign compounds. Antibodies with specific epitopes may be generated for use in biotechnological applications. Hence, antibodies may be generated for use as a second probe in the current invention. In specific embodiments, the invention may provide a method to investigate the epitope of an antibody on a protein.

In specific embodiments, the second probe may comprise a nanobody. Nanobodies are the recombinant variable domains isolated from antibodies, retaining the specific binding abilities for an epitope but losing other parts of the antibody glycoprotein. Nanobodies may hence have a smaller molecular weight and better solubility than antibodies. Hence, nanobodies may be generated for use as a second probe in the current invention.

In other embodiments, the second probe may comprise a small-molecule moiety. A small-molecule moiety may comprise an organic compound with one or more of (i) a molecular weight of < 2000 Da, such as < 1000 Da, especially < 500 Da, and (ii) a molecular size from the range of 0.01 - 50 nm, such as from the range of 0.02 - 20 nm, especially from the range of 0.05 - 10 nm. Such small-molecule moieties may be efficiently synthesized and may transiently bind a localized protein substructure. The small-molecule moiety may be selected from the group comprising thiols and reducing agents. Especially, the small-molecule moiety may be selected from the group comprising tris(2-carboxyethyl)phosphine (TCEP), 2-mercaptoethanol (β-mercaptoethanol), and iodoacetamide.

The unit Dalton (Da) herein refers to the atomic mass constant, which may typically be used to indicate the mass of proteins, as well as of small molecules. 1 Dalton may correspond to one twelfth of the mass of a free carbon-12 atom at rest. As will be known to the person skilled in the art, 1 Dalton corresponds to (approximately) 1.660 × 10^-27 kg.

The second binding site may be a localized protein substructure with a known localization within the protein structure, especially in such embodiments employing the method with the goal of identifying the presence or absence of a known protein in a (biological) sample. The second binding site may also be a localized protein substructure with an unknown localization within the protein structure, especially in such embodiments employing the (fingerprinting) method with the goal of investigating the structure of a known or unknown protein in a (biological) sample.

The second probe may be configured to transiently bind the protein at the second binding site. Such a transient binding may be described using binding kinetics that comprise (i) an on-rate (also: “association rate constant”, “association rate coefficient”, k_on), being the second-order rate constant for the binding of the second probe to the second binding site, and (ii) an off-rate (also: “dissociation rate constant”, “dissociation rate coefficient”, k_off), being the first-order rate constant for the dissociation of the second probe from the second binding site. It will be clear to the skilled person that different probes may have different on-rates and off-rates for the protein, especially for binding sites in the protein. In particular, probes may be designed, for instance via computational modeling and/or via binding assays, with specific binding affinities for one or more specific binding sites.
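By way of a non-limiting illustration, the relation between the on-rate, the off-rate and the observable dwell times may be sketched as follows. The sketch assumes simple one-step binding with the probe in excess over the protein; the function names and the probe concentration of 10 nM are hypothetical and are not taken from the description above.

```python
# Minimal sketch of one-step binding kinetics (assumption: probe in excess,
# single binding site, no cooperativity). All names are illustrative only.

def dissociation_constant(k_on: float, k_off: float) -> float:
    """Equilibrium dissociation constant K_d in M.

    k_on  : on-rate in M^-1 s^-1 (second-order rate constant)
    k_off : off-rate in s^-1 (first-order rate constant)
    """
    return k_off / k_on


def mean_bound_time(k_off: float) -> float:
    """Mean duration (s) of a single binding event, i.e. 1 / k_off."""
    return 1.0 / k_off


def mean_unbound_time(k_on: float, probe_conc: float) -> float:
    """Mean waiting time (s) between binding events at a probe
    concentration given in M, i.e. 1 / (k_on * concentration)."""
    return 1.0 / (k_on * probe_conc)


if __name__ == "__main__":
    k_on = 0.5e6        # M^-1 s^-1, inside the 0.4e6 - 0.6e6 range
    k_off = 0.2         # s^-1, inside the 0.2 - 0.5 range
    probe_conc = 10e-9  # M, hypothetical working concentration

    print("K_d (M):", dissociation_constant(k_on, k_off))
    print("mean bound time (s):", mean_bound_time(k_off))
    print("mean unbound time (s):", mean_unbound_time(k_on, probe_conc))
```

With these example values the dissociation constant is 0.4 µM and a binding event lasts 5 s on average, so that a lower off-rate lengthens each event while a higher on-rate or concentration shortens the wait between events.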

Especially, the second probe may be configured to transiently bind the protein at the second binding site with an on-rate selected from the range of 0.02 × 10^6 - 5 × 10^6 M^-1 s^-1 (or 0.02E6 - 5E6 M^-1 s^-1). Further, the second probe may bind the protein at the second binding site with an on-rate selected from the range of 0.05 × 10^6 - 5 × 10^6 M^-1 s^-1, such as 0.05 × 10^6 - 2 × 10^6 M^-1 s^-1. In embodiments, the second probe may bind the protein at the second binding site with an on-rate selected from the range of 0.1 × 10^6 - 2 × 10^6 M^-1 s^-1, such as 0.1 × 10^6 - 1 × 10^6 M^-1 s^-1. Moreover, the second probe may bind the protein at the second binding site with an on-rate selected from the range of 0.2 × 10^6 - 1 × 10^6 M^-1 s^-1, such as 0.2 × 10^6 - 0.8 × 10^6 M^-1 s^-1. Specifically, the second probe may bind the protein at the second binding site with an on-rate selected from the range of 0.4 × 10^6 - 0.8 × 10^6 M^-1 s^-1, such as 0.4 × 10^6 - 0.6 × 10^6 M^-1 s^-1.

The second probe may bind the protein at the second binding site with an off-rate selected from the range of 0.01 - 10 s^-1. Further, the second probe may bind the protein at the second binding site with an off-rate selected from the range of 0.02 - 10 s^-1, such as 0.02 - 5 s^-1. In embodiments, the second probe may bind the protein at the second binding site with an off-rate selected from the range of 0.05 - 5 s^-1, such as 0.05 - 2 s^-1. Moreover, the second probe may bind the protein at the second binding site with an off-rate selected from the range of 0.1 - 2 s^-1, such as 0.1 - 1 s^-1. Specifically, the second probe may bind the protein at the second binding site with an off-rate selected from the range of 0.2 - 1 s^-1, such as 0.2 - 0.5 s^-1. The binding kinetics of the second probe may be controlled. This may be done by, for example, altering the ionic strength of the solution, which may affect aptamer folding. Through adjusting specific ion concentrations, the binding kinetics of aptamers may be sped up (which may result in a faster read-out) or slowed down (which may result in a higher signal-to-noise ratio). Further, the binding kinetics of the second probe may be controlled by labeling (or “binding”) the second chromophore to a different position on the second probe. Especially, in embodiments where the second probe comprises an aptamer, the second chromophore may be labeled to a different position in the aptamer sequence, which may affect the binding kinetics of the second probe. Hence, the method, especially the exposure stage, may comprise controlling the off-rate (or the on-rate) of the second probe at the second binding site in the range of 0.02 - 5 s^-1 (or in the range of 0.01 × 10^6 - 10 × 10^6 M^-1 s^-1), and especially in the other ranges mentioned hereabove.

In specific embodiments, the first probe may be configured to transiently bind the protein at the first binding site. The first binding site may especially be different from the second binding site. To facilitate such transient binding, the first probe may in such embodiments comprise a first affinity-based probe selected from the group comprising an aptamer, an antibody, a nanobody, and a small-molecule moiety, as described above. In such embodiments, the exposure stage may comprise exposing the protein to the first probe. Especially, the protein may be exposed to the first probe at the same time as the protein is exposed to the second probe. This may facilitate the first probe binding to the first binding site while the second probe binds to the second binding site, hence allowing FRET to occur.

In such embodiments, the first probe may have binding kinetics comprising an on-rate and an off-rate.

Moreover, in such embodiments, the first probe may be configured to transiently bind the protein at the first binding site with an on-rate selected from the range of 0.02 × 10^6 - 5 × 10^6 M^-1 s^-1. Further, the first probe may bind the protein at the first binding site with an on-rate selected from the range of 0.05 × 10^6 - 5 × 10^6 M^-1 s^-1, such as 0.05 × 10^6 - 2 × 10^6 M^-1 s^-1. In embodiments, the first probe may bind the protein at the first binding site with an on-rate selected from the range of 0.1 × 10^6 - 2 × 10^6 M^-1 s^-1, such as 0.1 × 10^6 - 1 × 10^6 M^-1 s^-1. Moreover, the first probe may bind the protein at the first binding site with an on-rate selected from the range of 0.2 × 10^6 - 1 × 10^6 M^-1 s^-1, such as 0.2 × 10^6 - 0.8 × 10^6 M^-1 s^-1. Specifically, the first probe may bind the protein at the first binding site with an on-rate selected from the range of 0.4 × 10^6 - 0.8 × 10^6 M^-1 s^-1, such as 0.4 × 10^6 - 0.6 × 10^6 M^-1 s^-1. The first probe may be configured to transiently bind the protein at the first binding site with an off-rate selected from the range of 0.01 - 10 s^-1. Further, the first probe may bind the protein at the first binding site with an off-rate selected from the range of 0.02 - 10 s^-1, such as 0.02 - 5 s^-1. In embodiments, the first probe may bind the protein at the first binding site with an off-rate selected from the range of 0.05 - 5 s^-1, such as 0.05 - 2 s^-1. Moreover, the first probe may bind the protein at the first binding site with an off-rate selected from the range of 0.1 - 2 s^-1, such as 0.1 - 1 s^-1. Specifically, the first probe may bind the protein at the first binding site with an off-rate selected from the range of 0.2 - 1 s^-1, such as 0.2 - 0.5 s^-1.

Similarly to the second probe, in embodiments, the method, especially the exposure stage, may comprise controlling the off-rate (or the on-rate) of the first probe at the first binding site in the range of 0.02 - 5 s^-1 (or in the range of 0.01 × 10^6 - 10 × 10^6 M^-1 s^-1), and especially in the other ranges mentioned hereabove.

Especially, in such embodiments, the binding kinetics of the first probe and the second probe may be selected relative to each other. For instance, in embodiments wherein the first probe, especially the first chromophore, comprises the donor chromophore, the off-rate of the first probe binding the protein at the first binding site may be higher than the off-rate of the second probe binding the protein at the second binding site. Further, in such embodiments, the on-rate of the first probe binding the protein at the first binding site may be lower than the on-rate of the second probe binding the protein at the second binding site. With such binding kinetics, the second probe, comprising the acceptor chromophore, may transiently bind the protein at the second binding site longer and more often than the first probe transiently binds the protein at the first binding site, i.e., the second probe may be present at the second binding site for a larger proportion of the time than the first probe is present at the first binding site. This may increase the likelihood of FRET occurring upon binding of the first probe to the first binding site, i.e., a larger proportion of binding events of the first probe (comprising the donor chromophore) may lead to FRET when the second probe (comprising the acceptor chromophore) is present at the second binding site during a larger proportion of the time. In such embodiments where the first probe comprises the donor chromophore, the exposure stage may comprise exposing the donor chromophore to donor excitation radiation, which may over time lead to photobleaching of the donor chromophore. On the other hand, photobleaching of the acceptor chromophore is less of a concern, as the acceptor chromophore is excited only when it is within FRET distance of an (excited) donor chromophore. Hence, it may be preferable in such embodiments to increase the likelihood of FRET upon binding of the first probe to the first binding site, both in view of data acquisition efficiency and in view of managing photobleaching.
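As a hedged illustration of the above reasoning, the proportion of the time that a binding site is occupied may be estimated from the on-rate, the off-rate and the probe concentration. The sketch below assumes independent one-step binding at each site with the probes in excess; under that assumption the fraction of donor binding events that find the acceptor probe in place is approximately the acceptor occupancy. The function name and the concentrations are hypothetical and are not taken from the description.

```python
# Sketch: steady-state occupancy of a binding site by a transiently binding
# probe (assumptions: probe in excess, sites independent, one-step binding).

def fractional_occupancy(k_on: float, k_off: float, probe_conc: float) -> float:
    """Fraction of the time a site is occupied.

    k_on in M^-1 s^-1, k_off in s^-1, probe_conc in M.
    occupancy = k_on * c / (k_on * c + k_off)
    """
    binding_rate = k_on * probe_conc  # pseudo-first-order binding rate, s^-1
    return binding_rate / (binding_rate + k_off)


if __name__ == "__main__":
    # Donor-carrying first probe: lower on-rate, higher off-rate.
    donor_occ = fractional_occupancy(k_on=0.4e6, k_off=0.5, probe_conc=10e-9)
    # Acceptor-carrying second probe: higher on-rate, lower off-rate,
    # and (hypothetically) ten times more abundant.
    acceptor_occ = fractional_occupancy(k_on=0.8e6, k_off=0.2, probe_conc=100e-9)

    print("donor site occupancy:", donor_occ)
    print("acceptor site occupancy:", acceptor_occ)
    # Under the independence assumption, acceptor_occ approximates the
    # fraction of donor binding events that can show FRET.
```

In this example the acceptor probe occupies its site for roughly 29% of the time, against under 1% for the donor probe, so that each donor binding event, and hence each period of donor excitation, has a comparatively high chance of yielding a FRET signal.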

Similarly, in further embodiments, the second probe, especially the second chromophore, may comprise the donor chromophore. In such embodiments, the off-rate of the first probe binding the protein at the first binding site may be lower than the off-rate of the second probe binding the protein at the second binding site. Further, in such embodiments, the on-rate of the first probe binding the protein at the first binding site may be higher than the on-rate of the second probe binding the protein at the second binding site.

Further, in embodiments, the molar ratio between the first probe and the second probe may be selected from the range of 50:1 - 1:50, such as from the range of 10:1 - 1:10.

In particular, it may be preferable that the probe comprising the acceptor chromophore is relatively more abundant than the probe comprising the donor chromophore, as photobleaching may affect the donor chromophore to a greater extent than the acceptor chromophore (as described above). Hence, in embodiments wherein the first probe, especially the first chromophore, comprises the donor chromophore, it may be preferable to increase the likelihood of FRET upon binding of the first probe to the first binding site. Hence, in embodiments, the second probe, especially the second chromophore, may comprise the acceptor chromophore, and the molar ratio between the second probe and the first probe may be selected from the range of 3:1 - 50:1, such as from the range of 3:1 - 30:1. In further embodiments, the molar ratio between the second probe and the first probe may be selected from the range of 2:1 - 30:1, especially from the range of 2:1 - 20:1, such as from the range of 3:2 - 20:1, especially from the range of 3:2 - 10:1. In further embodiments, the molar ratio between the second probe and the first probe may be selected from the range of 1:1 - 10:1, such as 1:1 - 5:1.

Similarly, in embodiments wherein the first probe, especially the first chromophore, comprises the acceptor chromophore, the molar ratio between the first probe and the second probe may be selected from the range of 3:1 - 50:1, such as from the range of 3:1 - 30:1. In further embodiments, the molar ratio between the first probe and the second probe may be selected from the range of 2:1 - 30:1, especially from the range of 2:1 - 20:1, such as from the range of 3:1 - 20:1, especially from the range of 3:1 - 10:1. In further embodiments, the molar ratio between the first probe and the second probe may be selected from the range of 1:1 - 10:1, such as 1:1 - 5:1.

In embodiments wherein both the first probe and the second probe comprise a nucleotide aptamer, especially a single-stranded nucleotide aptamer, the sequence of the first probe may not be complementary to the sequence of the second probe. For instance, in embodiments, the first probe and the second probe may comprise at least 3 non-complementary nucleotides, such as at least 5 non-complementary nucleotides, especially at least 10 non-complementary nucleotides. Aptamers with complementary sequences may be able to transiently bind each other instead of their respective first binding site or second binding site, which may result in the donor and acceptor being within FRET distance and thus providing acceptor emission radiation. Hence, the use of aptamers with complementary sequences may interfere with the FRET method due to providing an emission signal based on probe-to-probe binding.
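One possible way to screen two DNA aptamer sequences for unwanted probe-to-probe pairing is sketched below. The description does not prescribe a screening criterion; the sketch merely assumes, for illustration, that the longest contiguous stretch of antiparallel Watson-Crick pairs is a useful indicator, and it ignores mismatches, bulges and non-canonical pairs. The function names and example sequences are hypothetical.

```python
# Sketch: longest contiguous antiparallel Watson-Crick stretch between two
# DNA sequences (assumption: uppercase/lowercase A, C, G, T only).

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def longest_complementary_stretch(seq_a: str, seq_b: str) -> int:
    """Length of the longest run of consecutive bases in seq_a that can
    pair with seq_b in antiparallel orientation.

    A stretch of seq_a pairs with seq_b exactly when it also occurs in the
    reverse complement of seq_b, so this is a longest-common-substring
    search between seq_a and reverse_complement(seq_b).
    """
    target = reverse_complement(seq_b)
    best = 0
    prev = [0] * (len(target) + 1)
    for base_a in seq_a.upper():
        curr = [0] * (len(target) + 1)
        for j, base_t in enumerate(target, start=1):
            if base_a == base_t:
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    first_probe = "GGGAAACCC"   # hypothetical sequence
    second_probe = "TTT"        # hypothetical sequence
    print(longest_complementary_stretch(first_probe, second_probe))
```

A probe pair with a long complementary stretch could then be flagged for redesign before use in the exposure stage; the threshold for flagging is left to the practitioner.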

Summarizing, the first probe may bind (covalently or transiently) to the first binding site, optionally with a known localization within the protein structure, whereas the second probe may transiently bind to the second binding site with a (known or unknown) localization within the protein structure. Hence, in embodiments, the second probe provides information about the second binding site in the protein structure in relation to the (predetermined or known) localization of the first binding site as a reference point.

The first probe may comprise a first chromophore, and the second probe may comprise a second chromophore. The first chromophore and the second chromophore may absorb exposure radiation within respective wavelength ranges and may emit emission radiation in a different respective wavelength range. The emission of radiation in the respective emission wavelength range by a chromophore may provide information on exposure of that chromophore to the respective exposure radiation and/or to FRET energy transfer (see below). In embodiments, the first chromophore and/or the second chromophore may be fluorophores.

Especially, the first chromophore and the second chromophore may be selected from FRET donor-acceptor pair chromophores. FRET is a method based on the transfer of the energy of a donor fluorophore to an acceptor fluorophore. This energy transfer may only occur when the two fluorophores are placed within several nanometers of each other. As the transfer efficiency is sensitive to sub-nanometer distance changes, the intensity of the radiation emitted by the acceptor fluorophore after excitation of the donor fluorophore depends on the distance between the two fluorophores. FRET may hence be used as a spectroscopic ruler for probing biological systems.

The method may especially relate to the use of FRET donor-acceptor pair chromophores. The term “FRET” (“Förster Resonance Energy Transfer” or “Fluorescence Resonance Energy Transfer”) may herein refer to the transfer of the energy of a donor chromophore to an acceptor chromophore, which may occur when the donor-acceptor pair chromophores are within a FRET distance, such as within several nanometers. Hence, the FRET donor-acceptor pair chromophores may comprise a first chromophore and a second chromophore, wherein the FRET donor-acceptor pair chromophores have a donor excitation radiation range, an acceptor excitation radiation range, a donor emission radiation range and an acceptor emission radiation range, wherein one of the FRET donor-acceptor pair chromophores is excitable by donor excitation radiation in the donor excitation radiation range, wherein the other of the FRET donor-acceptor pair chromophores is configured to provide acceptor emission in the FRET acceptor emission radiation range upon excitation with donor excitation radiation in the donor excitation radiation range of the one of the FRET donor-acceptor pair chromophores when the first chromophore and the second chromophore are configured within a FRET distance. Hence, if the donor chromophore and the acceptor chromophore are arranged within a FRET distance, which may vary for different FRET donor-acceptor pairs, the donor chromophore may upon excitation with donor excitation radiation transfer energy to the acceptor chromophore, whereupon the acceptor chromophore may emit acceptor emission radiation. This energy transfer may occur with a specific FRET efficiency depending on the (exact) distance between the donor chromophore and the acceptor chromophore. Hence, by measuring the FRET (transfer) efficiency, information regarding the distance between the donor chromophore and the acceptor chromophore is obtained.
In particular, the FRET transfer efficiency may be sensitive to sub-nanometer distance changes, which may make FRET an outstanding spectroscopic ruler for probing, for example, biological systems. The FRET efficiency (E) may be defined as:

$$E = \frac{I_a}{I_a + I_d}$$

wherein I_a is the intensity of the acceptor emission, and wherein I_d is the intensity of the donor emission. The distance between the donor chromophore and the acceptor chromophore may then be estimated by comparing the measured value of E (equation above) to an estimated value of the FRET efficiency E_e as a function of the distance r:

$$E_e = \frac{1}{1 + (r/R)^6}$$

wherein R is the Förster radius, which may be specific for the donor-acceptor pair. The distance between the first binding site and the second binding site may then be estimated based on the estimated distance between the donor chromophore and the acceptor chromophore, as well as, for example, based on the (length of) the probes.
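A minimal sketch of how the measured intensities may be converted into a distance estimate is given below. It assumes the standard relations E = I_a / (I_a + I_d) and E_e = 1 / (1 + (r/R)^6), and it ignores corrections that a practical analysis may need (background, donor leakage into the acceptor channel, direct acceptor excitation, detection-efficiency differences). The function names, the intensities and the Förster radius of 5.4 nm are illustrative assumptions only.

```python
# Sketch: FRET efficiency from donor/acceptor intensities and the
# corresponding donor-acceptor distance (uncorrected, illustrative only).

def fret_efficiency(i_acceptor: float, i_donor: float) -> float:
    """Apparent FRET efficiency E = I_a / (I_a + I_d)."""
    total = i_acceptor + i_donor
    if total <= 0:
        raise ValueError("total intensity must be positive")
    return i_acceptor / total


def expected_efficiency(distance: float, forster_radius: float) -> float:
    """Expected efficiency E_e = 1 / (1 + (r / R)^6); same length unit for
    both arguments."""
    return 1.0 / (1.0 + (distance / forster_radius) ** 6)


def distance_from_efficiency(efficiency: float, forster_radius: float) -> float:
    """Invert E_e(r): r = R * (1/E - 1)^(1/6), valid for 0 < E < 1."""
    if not 0.0 < efficiency < 1.0:
        raise ValueError("efficiency must lie strictly between 0 and 1")
    return forster_radius * (1.0 / efficiency - 1.0) ** (1.0 / 6.0)


if __name__ == "__main__":
    forster_radius_nm = 5.4                 # hypothetical value for the pair
    e = fret_efficiency(i_acceptor=300.0, i_donor=100.0)  # E = 0.75
    r = distance_from_efficiency(e, forster_radius_nm)
    print("E =", e, " r (nm) =", r)
```

Because of the sixth-power dependence, the estimate is most sensitive for distances near the Förster radius, where E is close to 0.5, and becomes unreliable as E approaches 0 or 1.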

Hence, the FRET donor-acceptor pair chromophores may have an excitation radiation range and an emission radiation range for both the donor and the acceptor. The donor of the FRET donor-acceptor pair chromophores may be excitable by exposure to donor excitation radiation with a wavelength range in the donor excitation radiation range. When the donor has thus been excited, it may re-emit donor emission radiation with a wavelength range in the donor emission radiation range. The acceptor of the FRET donor-acceptor pair chromophores may be excitable by exposure to acceptor excitation radiation with a wavelength range in the acceptor excitation radiation range. When the acceptor has thus been excited, it may re-emit acceptor emission radiation with a wavelength range in the acceptor emission radiation range.

In embodiments, the FRET donor-acceptor pair chromophores may be selected such that the donor emission radiation range substantially overlaps with the acceptor excitation radiation range. As a result, when the donor becomes excited upon exposure to donor excitation radiation, it may emit donor emission radiation that falls within the acceptor excitation radiation range. If the donor and acceptor are in sufficient proximity for FRET coupling and hence exposure of the acceptor to donor emission radiation, the acceptor may then become excited and emit acceptor emission radiation. Hence, the FRET donor-acceptor pair chromophores may herein be configured to provide acceptor emission radiation in the acceptor emission radiation range upon excitation with donor excitation radiation of the donor.

The FRET excitation and emission ranges may, for example, comprise wavelengths in the ultraviolet (UV) range, the visible light range, and/or the (near)-infrared ((N)IR) range. Hence, the FRET excitation and emission ranges may, for example, comprise a (sub)range selected from within the range of 200 - 1500 nm, especially from within the range of 400 - 800 nm. In embodiments, the donor excitation radiation range may comprise a (sub)range selected from the range of 200 - 1500 nm, especially from the range of 400 - 800 nm. In embodiments, the donor emission radiation range may comprise a (sub)range selected from the range of 200 - 1500 nm, especially from the range of 400 - 800 nm. In embodiments, the acceptor excitation radiation range may comprise a (sub)range selected from the range of 200 - 1500 nm, especially from the range of 400 - 800 nm. In embodiments, the acceptor emission radiation range may comprise a (sub)range selected from the range of 200 - 1500 nm, especially from the range of 400 - 800 nm. The FRET excitation and emission ranges will in general depend on the used FRET pairs.

For example, in embodiments, the FRET donor-acceptor chromophore pair may comprise Atto488 and Cy3, wherein Atto488 (the donor chromophore) and Cy3 (the acceptor chromophore) may be excited maximally at about 488 nm and 552 nm respectively, and wherein Atto488 and Cy3 may provide emission radiation at about 521 nm and 568 nm respectively. In further embodiments, the donor-acceptor chromophore pair may comprise Atto488 and Cy5, which may respectively be maximally excited at about 488 nm and 650 nm, and may provide emission radiation at about 521 nm and 666 nm. In further embodiments, the donor-acceptor chromophore pair may comprise Cy3 and Cy5, which may respectively be maximally excited at about 552 nm and 650 nm, and may provide emission radiation at about 568 nm and 666 nm. In further embodiments, the donor-acceptor chromophore pair may comprise Cy3 and Cy7, which may respectively be maximally excited at about 552 nm and 750 nm, and may provide emission radiation at about 568 nm and 788 nm. In further embodiments, the donor-acceptor chromophore pair may comprise Cy5 and Cy7, which may respectively be maximally excited at about 650 nm and 750 nm, and may provide emission radiation at about 666 nm and 788 nm.

The FRET donor-acceptor pair chromophores may especially comprise a chromophore pair selected from the group comprising Atto488/Cy3, Atto488/Cy3b, Atto488/Cy5, Atto488/Atto647n, Cy3/Cy5, Cy3b/Cy5, Cy3/Cy7, Cy3b/Cy7, and Cy5/Cy7. The term “predetermined distance range” and similar terms may herein especially refer to a distance range wherein FRET energy transfer can occur for the FRET donor-acceptor pair chromophores, which may vary for different sets of FRET donor-acceptor pair chromophores.

The term “chromophore” may herein especially refer to a fluorescent chemical molecule that upon excitation with light (e.g. radiation from a laser), emits light of a different wavelength. The FRET donor-acceptor pair chromophores (also: “donor-acceptor pair chromophores”) may especially comprise fluorescent molecules and/or phosphorescent molecules. The term “FRET donor-acceptor pair chromophores” may herein especially refer to two chromophores capable of FRET energy transfer, i.e., energy transfer in a non-radiative distance-dependent fashion, especially through dipole-dipole coupling of the donor chromophore and the acceptor chromophore.

In embodiments, the FRET donor-acceptor pair chromophores may comprise one or more pairs selected from the group comprising the Cyanine family, the Alexa family, the Atto family, the Dy family, and the Rhodamine family. Different chromophore pairs may be sensitive at different distances, i.e., may provide a high effect regarding FRET efficiency for subnanometer distance changes (a high resolution). For example, the Cyanine family pair Cy3:Cy5 may be most sensitive at around a distance of 5 nm, such as at distances selected from the range of 3-7 nm. Similarly, the Cyanine family pair Cy3:Cy7 may be most sensitive at around a distance of 3 nm. The Cyanine family pair Cy2:Cy3 may be most sensitive at around a distance of 7 nm.

Different combinations of FRET pairs may be used for probing different regions of the protein. The most commonly used FRET pair, Cy3-Cy5, may be most sensitive at a distance of ~5 nm, and may hence be used for probing a second binding site 3-7 nm away from the first binding site. For probing a second binding site less than ~4 nm away from the first binding site, a dye pair such as Cy3-Cy7 may be used that may be most sensitive at a distance of ~3 nm. For a second binding site more than ~6 nm away from the first binding site, a dye pair such as Cy2-Cy3 may be used that may be most sensitive at a distance of ~7 nm. The end-to-end distance of a FRET pair may also be altered by placing a chromophore at a different position in the probe, such as at a different part of the sequence of an aptamer. Hence the FRET donor-acceptor pair may be optimized by altering the chromophore pair. The FRET donor-acceptor pairing may be (essentially) unaffected by being bound to the first probe or second probe and may remain valid if switched, i.e. Cy3 may be bound to the first probe and Cy5 may be bound to the second probe and vice versa.
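By way of non-limiting illustration, the distance sensitivity of different FRET pairs may be sketched with the standard Förster relation E = 1 / (1 + (r / R0)^6), in which the efficiency changes most steeply for distances near the Förster radius R0. In the sketch below, the function name `fret_efficiency` and the dictionary `ASSUMED_R0_NM` are hypothetical, and the R0 values are assumptions taken to equal the "most sensitive" distances mentioned above; they are not measured Förster radii.

```python
def fret_efficiency(r_nm: float, r0_nm: float) -> float:
    """Foerster relation: E = 1 / (1 + (r / R0)^6)."""
    if r_nm < 0 or r0_nm <= 0:
        raise ValueError("distance must be >= 0 and R0 must be > 0")
    return 1.0 / (1.0 + (r_nm / r0_nm) ** 6)


# ASSUMPTION: R0 is set equal to the "most sensitive" distance named in the
# text for each pair; real Foerster radii depend on the dyes and conditions.
ASSUMED_R0_NM = {"Cy3/Cy7": 3.0, "Cy3/Cy5": 5.0, "Cy2/Cy3": 7.0}

for pair, r0 in ASSUMED_R0_NM.items():
    # Efficiency at three example separations of the two binding sites.
    row = [round(fret_efficiency(r, r0), 3) for r in (3.0, 5.0, 7.0)]
    print(pair, row)
```

Under these assumptions, a pair reports an efficiency of 0.5 when the separation equals its R0, and the efficiency saturates towards 1 or 0 for much shorter or much longer separations, which is why a pair may be chosen to match the expected distance between the binding sites.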

As described above, when the second probe binds to the second binding site, a FRET pairing may momentarily form between the chromophore comprised by the second probe and the chromophore comprised by the first probe which is bound to the first binding site. FRET can then occur, reporting on the distance between the second binding site and the first binding site. Furthermore, as FRET requires co-localization of 2 chromophores, the number of false positives may be reduced compared to assays which use the increase in signal from a single fluorophore. Both the occurrence of binding as well as the associated FRET efficiency may be informative for a protein’s identity.

In particular, the FRET pairing may occur when the first chromophore and the second chromophore are configured within a FRET distance. The FRET distance may be selected from the range of 0.05 - 20 nm. In further embodiments, the FRET distance may yet be selected from the range of 0.1 - 10 nm. In further embodiments, the FRET distance may be selected from the range of 1 - 9 nm, such as from the range of 2 - 8 nm, especially from the range of 3 - 7 nm.

The protein fingerprinting method may comprise an exposure stage. The exposure stage may especially comprise (i) exposing the protein to the second probe, (ii) providing radiation having a wavelength selected from the donor excitation radiation range to the protein, and (iii) measuring emission in a donor emission radiation range and an acceptor emission radiation range to provide an emission signal. Hence, it is during the exposure stage that the fluorescent resonance energy transfer in FRET may be utilized to provide the information (in the form of the emission signal). In particular, the emission signal may comprise a protein fingerprint.

In embodiments, during the exposure stage the first probe may be bound to a predetermined localized protein substructure of the protein. The protein may during the exposure stage be exposed to the second probe, which may result in the second probe transiently binding to a (different) localized protein substructure of the protein. Thus the first chromophore and the second chromophore may localize outside of or within the FRET distance through the first probe and the second probe binding to the protein, depending on the distance between the two localized protein substructures on the protein.

During the exposure stage, radiation having a wavelength selected from the donor excitation radiation range may then be provided to the protein. This may excite the donor chromophore and may consequently result in the donor providing donor emission radiation. If the separation between the donor chromophore and the acceptor chromophore is within the FRET distance, FRET from the donor to the acceptor may take place. Hence, the acceptor may become excited and consequently provide acceptor emission radiation. The efficiency of FRET may depend on the precise distance between the donor and acceptor, down to sub-nanometer distance differences. If the separation between the donor chromophore and the acceptor chromophore is not within the FRET distance, FRET may (essentially) not occur. Hence, the presence or absence of acceptor emission radiation, and the FRET efficiency associated with such acceptor emission radiation, may provide information about the distance between the donor and acceptor.

The emission of radiation may then be measured during the exposure stage to provide the emission signal. Especially, the emission of radiation having a wavelength range within the donor emission radiation range and the acceptor emission radiation range may be measured. The emission of donor emission radiation may in embodiments be used as confirmation that (i) the donor is bound to the protein and (ii) that the donor has become excited due to exposure to donor excitation radiation. As such, the presence of donor emission radiation may function as a positive control for the FRET method, and the absence of donor emission radiation may be interpreted as a signal that at least one or both of the aforementioned conditions has not occurred productively. Further, the presence of donor emission radiation and absence, or relatively low levels, of acceptor emission radiation may indicate that the protein does not comprise a second binding site within FRET distance from the first binding site.

In certain embodiments, the method may further comprise a distance estimation stage comprising estimating a distance (di) between the first binding site and the second binding site based on the emission signal, especially based on a FRET efficiency. Herein, the emission signal may further comprise the ratio between the emission from the donor emission radiation range and the emission from the acceptor emission radiation range. This ratio comprises information on the transfer efficiency of the FRET that has occurred between the acceptor and the donor. Hence, the emission signal may allow for a sub-nanometer distance di estimation between the first binding site and the second binding site.
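As a minimal sketch of such a distance estimation, the apparent FRET efficiency may be computed from the donor and acceptor emission intensities and then inverted through the Förster relation. The function names, the correction factor `gamma` (assumed to be 1 here), and the Förster radius of 5 nm are assumptions for illustration only and are not specified in this description.

```python
def apparent_efficiency(i_donor: float, i_acceptor: float, gamma: float = 1.0) -> float:
    """Apparent FRET efficiency E = I_A / (I_A + gamma * I_D).

    gamma is a detection-correction factor; gamma = 1 is an assumption.
    """
    total = i_acceptor + gamma * i_donor
    if total <= 0:
        raise ValueError("no emission detected")
    return i_acceptor / total


def distance_from_efficiency(e: float, r0_nm: float) -> float:
    """Invert E = 1 / (1 + (r / R0)^6) to obtain r = R0 * (1/E - 1)^(1/6)."""
    if not 0.0 < e < 1.0:
        raise ValueError("efficiency must lie strictly between 0 and 1")
    return r0_nm * (1.0 / e - 1.0) ** (1.0 / 6.0)


# Hypothetical background-corrected intensities (arbitrary units).
e = apparent_efficiency(i_donor=100.0, i_acceptor=100.0)
d_i = distance_from_efficiency(e, r0_nm=5.0)  # R0 = 5 nm is an assumption
print(f"E = {e:.2f}, estimated distance = {d_i:.2f} nm")
```

In practice, corrections such as background subtraction and spectral cross-talk would likely be applied before this step; they are omitted here for brevity.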

In embodiments, the method may further comprise a structure prediction stage. Herein, the structure prediction stage may comprise predicting the structure of the protein based on the estimated distance di. The structure may in embodiments be a localized protein substructure and may further be selected from the group comprising a primary protein structure, a secondary protein structure, a tertiary protein structure, and a quaternary protein structure. Especially, the structure may be selected from the group comprising a secondary protein structure, a tertiary protein structure, and a quaternary protein structure, such as particularly a tertiary protein structure. Hence, by predicting the structure of the protein the FRET method facilitates the characterization and identification of the protein.

In embodiments, the exposure stage may be protein degradation-free. The term “protein degradation” may herein refer to any process that alters the primary protein structure, i.e. the amino acid sequence. Such processes may comprise for example Edman’s degradation and protein cleavage. Edman’s degradation is a process wherein the N-terminal amino acid of the amino acid sequence is labelled with a reagent and subsequently removed from the amino acid sequence. Protein cleavage is a process wherein the peptide bonds between amino acids in the amino acid sequence of the protein are hydrolyzed to produce protein degradation products, such as protein fragments, polypeptides, peptides, or amino acids. Protein cleavage may be facilitated using a variety of reagents, such as, for example, enzymes (especially proteases and proteinases), peptides, organic acids, and mineral acids. Protein cleavage may also be induced through heat, such as at temperatures above 200 °C, especially above 220 °C, or further above 250 °C. The exposure stage may thus be performed in the absence of reagents and conditions that may lead to protein degradation.

Hence, using a protein degradation-free exposure stage, the current invention provides a method to determine the structure of (essentially) the whole protein. In particular, the method of the invention may facilitate determining at least part of the structure of the whole protein, i.e., of an unfragmented and non-degraded protein. In contrast, currently applied methods may commonly analyze protein degradation products rather than the whole protein. However, the structure of protein degradation products may be different from the structure of the whole protein. The current invention therefore may provide an approach allowing the characterization and identification of the structure of the whole protein. Hence, in embodiments, an initial protein may be subjected to the exposure stage resulting in an exposed protein, wherein at least 95% of the amino acid sequence of the initial protein is present in the exposed protein, such as at least 97%, especially at least 98%, such as at least 99%, including 100%.

Especially, in embodiments, the exposure stage may not comprise processes or methods that may (substantially) affect the structure, especially the primary structure, of the protein. Such processes or methods may comprise linearization and fragmentation of the protein. The exposure stage may further not comprise the use of enzymes for translocation, linearization, and fragmentation of the protein. Thus, the protein may comprise a “folded” protein, i.e. a protein in its three-dimensional structural conformation. Further, the protein may comprise at least 90% of the amino acid sequence of the protein as obtained from a biological sample, such as at least 95% of the amino acid sequence of the protein as obtained from the biological sample, such as at least 99% of the amino acid sequence of the protein as obtained from the biological sample. Hence, the folded conformation and structure of the (entire) protein may be preserved and characterized with minimal, such as essentially no, enzymatic alteration during the exposure stage.

This may be especially relevant for the characterization and identification of different protein variants caused by alternative splicing. Alternative splicing may relate to molecular mechanisms resulting in the transcription of many different messenger RNAs (mRNAs) from a single gene, with different mRNAs translating into different proteins. One such molecular mechanism is a process called RNA splicing, which facilitates removal of gene segments (termed “introns”) from a primary gene transcript. Alternative splicing is a process in which a single primary transcript results in a variety of mature RNAs, leading to the production of a large number of proteoforms that may have different functions. Especially, alternative splicing may be one of the major sources of the diversity of the proteome. Moreover, detection of alternatively spliced proteins may be of high relevance in disease detection and/or monitoring, as several diseases, such as cystic fibrosis, cancer and Parkinson’s disease, have been associated with mutations that lead to alternative splicing and abnormal protein production. However, current methods that employ protein degradation to characterize and identify proteins may not be able to differentiate between different proteoforms with largely overlapping amino acid sequences. Through protein fingerprinting of the whole protein, the method of the invention may be able to differentiate between such proteoforms with high accuracy.

In further embodiments, the method may be a non-medical method. In further embodiments, the method may be a non-diagnostic method.

In embodiments, the method may, prior to the exposure stage, comprise providing the protein from a protein source comprising the protein. Especially, the protein source may be a biological sample, such as from a bacterial sample, or such as from an archaeal sample, or such as from a protozoan sample, or such as from a fungal sample, or such as from a mammalian sample, or such as from a plant sample. The protein source may be a clinical sample obtained from a patient, or a research sample obtained from a cell culture, or a research sample obtained in situ.

The method of the invention may be particularly suitable for characterizing proteins available in relatively low abundance, such as with a low concentration in a small sample. In particular, the method of the invention may be suitable to characterize a single protein. Hence, in embodiments, the protein source may comprise less than 10 pM of the protein, further less than 5 pM of the protein, moreover less than 1 pM of the protein, especially less than 0.5 pM of the protein, or even less than 0.1 pM of the protein. In further embodiments, the protein source may comprise less than 100 pL, further less than 50 pL, moreover less than 10 pL, especially less than 5 pL, or even less than 1 pL.

In embodiments, the exposure stage may comprise exposing the protein to different second probes. In certain embodiments, the exposure stage may comprise sequentially exposing the protein to different second probes. In such embodiments, at least two of the different second probes may be configured to transiently bind to the protein at different respective second binding sites. The transient binding of the second probe may allow probing two or more second binding sites. Further, the transient binding of the second probes may ensure that the probing of different second binding sites on the protein may be (temporally or spatially) separated.

The temporal separation may be achieved via successive exposure of the protein to the at least two different second probes. In embodiments with successive exposure, the exposure stage may comprise a washing step. The washing step may follow after exposure of the protein to one second probe, and may comprise removing the one second probe (such that the protein is no longer exposed to the one second probe). The washing step may then be followed by exposure of the protein to another second probe. Hence, the protein may be exposed to the at least two second probes separately, i.e., not simultaneously. The emission signal may be measured throughout the successive exposure of the protein to the at least two different second probes.

The spatial separation may be achieved via simultaneous addition of the protein to two or more different second probe containers (or: “containers”). Each container may comprise a different second probe. The protein may then be exposed to a different second probe in each container. Hence, a different emission signal may be provided by each container, and these emission signals may be measured simultaneously. Such embodiments may be particularly suitable for analyzing a sample comprising a plurality of same proteins.

In further embodiments, the exposure stage may comprise flushing the protein (with first probe bound thereto) over a well-plate with different second probes arranged in different wells.

In other embodiments, the exposure stage may comprise concurrently exposing the protein to different second probes within the same container. In such embodiments, at least two of the different second probes may be configured to transiently bind to the protein at respective second binding sites. The different second probes may especially have different second chromophores. The different second chromophores may be selected such that (i) they may each form a FRET donor-acceptor pair with the first chromophore, and (ii) they may provide a distinguishable emission signal upon FRET with the first chromophore. In embodiments, the first probe may comprise a donor chromophore and the different second chromophores may comprise different acceptor chromophores, and the different second chromophores may be distinguished based on different acceptor emission radiation upon excitation by donor emission radiation. In other embodiments, the first probe may comprise an acceptor chromophore and the different second chromophores may comprise different donor chromophores, and the different second chromophores may be distinguished based on different donor emission radiation upon binding to the second binding site. In embodiments, the different second probes may further be distinguished based on different binding kinetics. Hence, the protein may be exposed to at least two or more different second probes concurrently and within the same container, but through at least two or more distinguishable emission signals this configuration may allow for the accurate probing of their different second binding sites.
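For illustration only, the following sketch shows one simple way in which a binding event could be attributed to one of several concurrently present second probes when each carries a different acceptor chromophore: the event is assigned to the probe whose acceptor emission channel dominates. The probe labels, the photon counts, and the threshold `min_counts` are hypothetical; the nominal emission wavelengths in the labels follow the Cy5 and Cy7 values given above.

```python
from typing import Dict, Optional


def assign_probe(acceptor_counts: Dict[str, int], min_counts: int = 50) -> Optional[str]:
    """Return the probe whose acceptor channel has the highest photon count.

    Returns None when no channel reaches min_counts (an assumed threshold),
    i.e. when no acceptor emission is clearly detected for the event.
    """
    if not acceptor_counts:
        return None
    probe, counts = max(acceptor_counts.items(), key=lambda item: item[1])
    return probe if counts >= min_counts else None


# Hypothetical photon counts per acceptor emission channel for one event,
# with a shared donor (e.g. Cy3) on the first probe.
event = {"probe_A_Cy5_666nm": 420, "probe_B_Cy7_788nm": 35}
print(assign_probe(event))
```

A real analysis would probably also account for spectral cross-talk between channels and could combine this assignment with the binding kinetics mentioned above.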

Summarizing, the exposure stage may comprise exposing the protein to different second probes with different second binding sites. In embodiments, the method may comprise exposing the protein to the different second probes sequentially, one second probe before a different second probe. In further embodiments, the method may comprise exposing the protein to the different second probes in different containers, each providing an individual emission signal. In further embodiments, the method may comprise exposing the protein to the different second probes simultaneously within the same container, one second probe comprising a different chromophore than the chromophore comprised by a different second probe. In certain embodiments, at least two of these three approaches may be combined, wherein it may be ensured that the protein is not exposed to (different) second probes comprising the same chromophore simultaneously within the same container. In specific embodiments, all three of these three approaches may be combined. Hence, the transient binding of the different second probes and availability of different chromophores may facilitate flexible, rapid and accurate probing of the structure of the protein (also see above).

The transient binding of the second probe may further circumvent photobleaching issues. In particular, a probe comprising a photobleached chromophore may dissociate from the protein, and a (same) probe comprising a fresh chromophore may again associate to the protein at the same binding site.

Hence, the protein may be probed multiple times using the transiently binding second probes. The transient interactions may be monitored in real time using a fluorescence microscope, such as especially a single-molecule fluorescence microscope. This repetitive probing may allow for the FRET efficiency to be determined with less than 0.1% error, which may provide a smaller than 0.1 nm resolution. The emission signals from the FRET pairings may be repetitively recorded for the protein until a FRET “fingerprint” sufficient for profiling of the structure of the protein is obtained. Thus, from recording the emission signal per second probe, a FRET histogram may be provided. The FRET histogram may comprise mean FRET efficiency for each observed FRET event. The FRET histogram may be fit using a Gaussian function. The center of the peak may be determined with an accuracy (standard error) of up to a maximum of 2%, such as up to a maximum of 1%, like up to a maximum of 0.5%, especially up to a maximum of 0.1%, moreover up to a maximum of 0.05%, when at least 100 binding events are recorded. The distribution of the center of the peaks from all measurements may be called a ‘FRET fingerprint’ that is unique for the structure of the protein. The term “protein fingerprint” may herein refer to a protein-specific (unique) signal, especially wherein the protein fingerprint is suitable for identification of the protein. Herein, the protein fingerprint may especially refer to one or more of an array of FRET efficiency values; an array of estimated distances; and/or raw data, especially one or more emission signals, obtained according to the method of the invention.
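The effect of recording many binding events on the accuracy of the peak center may be sketched as follows. As a simplification, the sketch uses the sample mean as the estimate of the Gaussian center (which is the maximum-likelihood estimate for a single Gaussian) rather than fitting a binned histogram, and it uses synthetic, randomly generated efficiencies; the function name `peak_center` and all numerical values are assumptions.

```python
import math
import random
import statistics
from typing import Sequence, Tuple


def peak_center(efficiencies: Sequence[float]) -> Tuple[float, float]:
    """Return (center, standard error) for a set of per-event FRET efficiencies.

    The center is the sample mean; the standard error is s / sqrt(N), so it
    shrinks as more binding events are recorded.
    """
    n = len(efficiencies)
    if n < 2:
        raise ValueError("at least two binding events are required")
    center = statistics.fmean(efficiencies)
    std_err = statistics.stdev(efficiencies) / math.sqrt(n)
    return center, std_err


# SYNTHETIC data: 100 simulated binding events around E = 0.60 (assumed).
rng = random.Random(0)
events = [rng.gauss(0.60, 0.05) for _ in range(100)]
center, std_err = peak_center(events)
print(f"center = {center:.3f}, standard error = {std_err:.4f}")
```

Repeating this per second probe would yield one peak center per probe, and the collection of these centers corresponds to the FRET fingerprint described above.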

Thus, in embodiments, the method may further comprise a fingerprint provision stage. Herein, the fingerprint provision stage may comprise providing a protein fingerprint based on the emission signal. The protein fingerprint may in embodiments be provided (in real time) during the exposure stage. In other embodiments, the protein fingerprint may be provided following the exposure stage. The fingerprint provision stage may in embodiments be followed by a protein identification stage. Herein, the protein identification stage may comprise identifying the protein by comparing the protein fingerprint to protein-related information in reference data. Such reference data may comprise previously acquired protein fingerprints that may allow for the identification of the presence of the same protein in a (biological) sample.
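A minimal sketch of such a comparison is given below, assuming that a protein fingerprint is represented as an array of FRET efficiency values (one per second probe) and that the reference data maps protein names to previously acquired fingerprints. The nearest-neighbour matching, the function name `identify`, the tolerance `max_distance`, and the reference entries are all hypothetical choices made for illustration.

```python
import math
from typing import Dict, Optional, Sequence


def identify(fingerprint: Sequence[float],
             reference: Dict[str, Sequence[float]],
             max_distance: float = 0.1) -> Optional[str]:
    """Return the reference protein closest to the fingerprint, or None.

    Closeness is the Euclidean distance between arrays of FRET efficiencies;
    a match is accepted only within max_distance (an assumed tolerance).
    """
    best_name, best_dist = None, math.inf
    for name, ref in reference.items():
        if len(ref) != len(fingerprint):
            continue  # fingerprints acquired with a different probe set
        dist = math.dist(fingerprint, ref)
        if dist < best_dist:
            best_name, best_dist = name, dist
    return best_name if best_dist <= max_distance else None


# HYPOTHETICAL reference data: one FRET efficiency per second probe.
reference_data = {
    "protein_A": [0.30, 0.70, 0.55],
    "protein_B": [0.60, 0.40, 0.20],
}
print(identify([0.31, 0.72, 0.55], reference_data))
```

Returning no match when the closest reference lies outside the tolerance avoids forcing an identification for a protein that is absent from the reference data.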

In another aspect, the invention may provide a system for characterization of a structure of a protein. The system may comprise (i) an analytical space, (ii) a probe supply, (iii) a radiation source, (iv) a fluorescence microscope, especially a single-molecule fluorescence microscope, and (v) a control system. Herein, the analytical space may especially comprise an analytical surface and may be configured to host the protein. The probe supply may be configured to provide probes to the analytical space. The radiation source may be configured to provide radiation, especially donor excitation radiation, to the analytical space. The fluorescence microscope may especially comprise a single-molecule fluorescence microscope and be configured to measure emission radiation having a wavelength in a donor emission radiation range and in an acceptor emission radiation range in the analytical space. The fluorescence microscope may be configured to (then) provide an emission signal to the control system. The control system may in some embodiments comprise an analysis system, and in other embodiments, the control system may be functionally coupled to an analysis system. In an operational mode (of the system), the system, especially the control system, may be configured to execute the method of the invention. Hence, a system may be provided that performs the method as described above for characterization of a structure of a protein. Especially, the control system may be configured to have the system execute the various stages of the method as described above.

In specific embodiments, the invention may provide a system for characterization of a structure of a protein, wherein the system comprises an analytical space, a probe supply, a radiation source, a single-molecule fluorescence microscope, and a control system, wherein the analytical space is configured to host the protein, wherein the probe supply is configured to provide probes to the analytical space, wherein the radiation source is configured to provide radiation to the analytical space, wherein the single-molecule fluorescence microscope is configured to measure emission in a donor emission radiation range and in an acceptor emission radiation range in the analytical space and to provide an emission signal to the control system, and wherein in an operational mode the control system is configured to execute the method of the invention.

The term “controlling” and similar terms especially refer at least to determining the behavior or supervising the running of an element. Hence, herein “controlling” and similar terms may e.g. refer to imposing behavior on the element (determining the behavior or supervising the running of an element), etc., such as e.g. measuring, displaying, actuating, opening, shifting, changing temperature, etc. Beyond that, the term “controlling” and similar terms may additionally include monitoring. Hence, the term “controlling” and similar terms may include imposing behavior on an element and also imposing behavior on an element and monitoring the element. The controlling of the element can be done with a control system, which may also be indicated as “controller”. The control system and the element may thus at least temporarily, or permanently, functionally be coupled. The element may comprise the control system. In embodiments, the control system and element may not be physically coupled. Control can be done via wired and/or wireless control. The term “control system” may also refer to a plurality of different control systems, which especially are functionally coupled, and of which e.g. one control system may be a master control system and one or more others may be slave control systems. A control system may comprise or may be functionally coupled to a user interface.

The system, or apparatus, or device may execute an action in a “mode” or “operation mode” or “mode of operation” or “operational mode”. The term “operational mode” may also be indicated as “controlling mode”. Likewise, in a method an action or stage, or step may be executed in a “mode” or “operation mode” or “mode of operation” or “operational mode”. This does not exclude that the system, or apparatus, or device may also be adapted for providing another controlling mode, or a plurality of other controlling modes. Likewise, this may not exclude that before executing the mode and/or after executing the mode one or more other modes may be executed.

The control system, or especially the analysis system, may be configured to estimate a protein fingerprint based on the emission signal. The control system, or especially the analysis system, may further be configured to identify the protein by comparing the protein fingerprint to protein-related information in reference data. Hence, the control system may be configured to estimate a protein fingerprint and to identify the protein based on reference data. Such embodiments may be especially suitable for the identification of a protein that has been characterized previously and may provide information about the (biological) sample, e.g. identifying protein biomarkers of disease in clinical samples.

The control system, especially the analysis system, may be configured to estimate a distance di between the first binding site and the second binding site based on the emission signal. The control system, especially the analysis system, may further be configured to predict a structure of the protein based on the estimated distance di. Hence, the control system may in embodiments be configured to provide an estimated distance di and to predict a structure of the protein. Such embodiments may be especially suitable for the characterization of a protein that has not been characterized previously, e.g. the investigation of a protein being developed as a novel drug candidate.

In embodiments, the control system may be configured to predict a protein structure of the protein (using a computational process), especially based on the estimated distance, or especially based on the FRET efficiency. The computational process may especially comprise a computational algorithm.

In further embodiments, the computational process may comprise a refinement of a model protein structure based on the estimated distance. In further embodiments, the computational process may comprise a de novo protein structure prediction. In such embodiments, the estimated distance may especially comprise a plurality of estimated distances.
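As a hedged illustration of how a plurality of estimated distances might inform such a computational process, the sketch below scores candidate model structures by how well their inter-site distances agree with the FRET-estimated distances and selects the best-agreeing model. The scoring by root-mean-square deviation, the function name `restraint_rmsd`, the site labels, the coordinates, and the distances are all hypothetical; an actual refinement or de novo prediction would be considerably more involved.

```python
import math
from typing import Dict, Tuple

Coords = Dict[str, Tuple[float, float, float]]       # site label -> (x, y, z) in nm
Restraints = Dict[Tuple[str, str], float]            # (site, site) -> distance in nm


def restraint_rmsd(coords: Coords, restraints: Restraints) -> float:
    """Root-mean-square deviation between model and FRET-estimated distances."""
    if not restraints:
        raise ValueError("at least one distance restraint is required")
    squared = [
        (math.dist(coords[a], coords[b]) - d_est) ** 2
        for (a, b), d_est in restraints.items()
    ]
    return math.sqrt(sum(squared) / len(squared))


# HYPOTHETICAL estimated distances between the first and second binding sites.
estimated = {("site1", "site2"): 5.2, ("site1", "site3"): 3.1}

# HYPOTHETICAL candidate model structures (binding-site coordinates only).
models = {
    "model_1": {"site1": (0.0, 0.0, 0.0), "site2": (5.0, 0.0, 0.0), "site3": (0.0, 3.0, 0.0)},
    "model_2": {"site1": (0.0, 0.0, 0.0), "site2": (8.0, 0.0, 0.0), "site3": (0.0, 6.0, 0.0)},
}

best = min(models, key=lambda name: restraint_rmsd(models[name], estimated))
print(best, round(restraint_rmsd(models[best], estimated), 3))
```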

In embodiments, the control system may be configured to execute in a controlling mode the analysis method according to the invention. The control system may especially receive program instructions from a data carrier such that the control system executes the method according to the invention.

In a further aspect, the invention may provide a data carrier having stored thereon program instructions. Such program instructions when executed by the system described above may cause the system to execute the method described above. Herein, the data carrier may facilitate the execution of pre-programmed operational modes of the system. This may increase user convenience and adherence to standard use of the system. Further, the data carrier may comprise the reference data that may be used in the protein identification stage.

The term “stage” and similar terms used herein may refer to a (time) period (also “phase”) of a method and/or an operational mode. The different stages may (partially) overlap (in time). For example, the exposure stage may, in general, be initiated prior to the fingerprint provision stage, but may partially overlap in time therewith. However, for example, the binding stage may typically be completed prior to the exposure stage. It will be clear to the person skilled in the art how the stages may be beneficially arranged in time.

The method and/or system may be applied in or may be part of analysis methods/systems of biological samples, such as protein samples, particularly in relation to protein sequencing, protein structure elucidation, and/or protein interactomics.

BRIEF DESCRIPTION OF THE DRAWINGS

Embodiments of the invention will now be described, by way of example only, with reference to the accompanying schematic drawings in which corresponding reference symbols indicate corresponding parts, and in which: Fig. 1A-B schematically depict embodiments of the method for characterization of a structure of a protein using a first probe and a second probe. Fig. 2 schematically depicts embodiments of the system for characterization of a structure of a protein. Fig. 3A-B depict further aspects of embodiments of the invention. Fig. 4 schematically depicts results of fingerprinting experiments. Fig. 5A-C schematically depict results of further fingerprinting experiments. Fig. 6A-B schematically depict results from binding kinetics experiments. Fig. 7A-C schematically depict results from further binding kinetics experiments. The schematic drawings are not necessarily to scale.

DETAILED DESCRIPTION OF THE EMBODIMENTS

Fig. 1A schematically depicts embodiments of the method for characterization of a structure 13 of a protein 10 using a first probe 31 and a second probe 32. The method comprises an exposure stage comprising (i) exposing the protein 10 to the second probe 32, (ii) providing excitation radiation 50 having a wavelength selected from a donor excitation radiation range to the protein 10, and (iii) measuring emission radiation 60 in a donor emission radiation range and an acceptor emission radiation range to provide an emission signal. The exposure stage is protein degradation-free. Further, the protein 10 comprises a first binding site 11 and a second binding site 12. In embodiments, (i) the first probe 31 is covalently bound to the protein 10 at the first binding site 11 or (ii) the first probe 31 is configured to transiently bind the protein 10 at the first binding site 11, wherein the first probe 31 comprises a first chromophore 21. Moreover, the second probe 32 is configured to transiently bind the protein 10 at the second binding site 12 and comprises a second chromophore 22. The second probe 32 comprises an affinity-based probe 35 selected from the group comprising an aptamer, an antibody, a nanobody, and a small-molecule moiety. In embodiments, the first chromophore 21 and the second chromophore 22 are selected from FRET donor-acceptor pair chromophores 23,24. The FRET donor-acceptor pair chromophores 23,24 have a donor excitation radiation range, the donor emission radiation range 63 and the acceptor emission radiation range 64. A donor chromophore 23 of the FRET donor-acceptor pair chromophores 20 is excitable by donor excitation radiation 53 in the donor excitation radiation 53 range. Further, an acceptor chromophore 24 of the FRET donor-acceptor pair chromophores 23,24 is configured to provide acceptor emission radiation 64 (also see Fig. 3B) in the acceptor emission radiation 64 range upon excitation with donor excitation radiation 53 in the donor excitation radiation range of the donor 23 of the FRET donor-acceptor pair chromophores 23,24 when the first chromophore 21 and the second chromophore 22 are configured within a FRET distance selected from the range of 0.1 - 10 nm.

Fig. 1A further schematically depicts embodiments wherein the exposure stage comprises sequentially exposing the protein 10 to different second probes 32, wherein at least two of the different second probes 32, 32a, 32b are configured to transiently bind to the protein 10 at respective second binding sites 12, 12a, 12b. In such embodiments, the first binding site 11 may be selected from the group comprising an N-terminal end 19 of the protein 10 and a C-terminal end 18 of the protein 10, especially the N-terminal end 19, or especially the C-terminal end 18.

Fig. 1B schematically depicts a closer view of the method for characterization of a structure 13 of a protein 10 using a first probe 31 and a second probe 32. Herein, the exposure stage is depicted wherein the first probe 31 and the second probe 32 are transiently bound to their respective binding sites 11,12. Fig. 1B in particular depicts an embodiment wherein the first probe 31 comprises an aptamer 36 and the second probe 32 comprises a small-molecule moiety 37. The FRET donor-acceptor pair chromophores 20 are, when bound to the respective binding sites 11,12, within a FRET distance di from each other. Hence, FRET may occur between the donor 23 and the acceptor 24, as schematically depicted by the arrow from the donor 23 to the acceptor 24.
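By way of illustration only, the dependence of FRET on the donor-acceptor distance may be sketched with the standard Förster relation E = 1 / (1 + (r/R0)^6). The relation itself is textbook knowledge and is not recited in this description. The Förster radius default of 5.4 nm is an assumed, illustrative value of the order often quoted for Cy3/Cy5-type pairs, and the function names are hypothetical.

```python
def fret_efficiency(r_nm: float, r0_nm: float = 5.4) -> float:
    """Foerster relation E = 1 / (1 + (r / R0)^6).

    r_nm  : donor-acceptor distance in nm
    r0_nm : Foerster radius in nm (assumed illustrative default, not from the source)
    """
    if r_nm <= 0 or r0_nm <= 0:
        raise ValueError("distances must be positive")
    return 1.0 / (1.0 + (r_nm / r0_nm) ** 6)


def within_fret_distance(r_nm: float, lo_nm: float = 0.1, hi_nm: float = 10.0) -> bool:
    """Check a distance against the 0.1 - 10 nm FRET distance range of the description."""
    return lo_nm <= r_nm <= hi_nm


if __name__ == "__main__":
    for r in (2.0, 5.4, 8.0, 12.0):
        print(r, round(fret_efficiency(r), 3), within_fret_distance(r))
```

At r equal to the Förster radius the efficiency is 0.5 by construction; the steep sixth-power dependence is what may make the FRET efficiency a sensitive reporter of the relative position of the two binding sites.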

Fig. 2 schematically depicts an embodiment of the system 200 according to the invention. In particular, it depicts a system 200 for characterization of a structure 13 of a protein 10, wherein the system 200 comprises an analytical space 210, a probe supply 230, a radiation source 250, a single-molecule fluorescence microscope 240, and a control system 300, wherein the analytical space 210 is configured to host the protein 10, wherein the probe supply 230 is configured to provide probes 31,32 to the analytical space 210, wherein the radiation source 250 is configured to provide excitation radiation 50, especially donor excitation radiation 53, to the analytical space 210, wherein the single-molecule fluorescence microscope 240 is configured to measure emission radiation 60 in a donor emission radiation 63 range and in an acceptor emission radiation 64 range in the analytical space 210 and to provide an emission signal to the control system 300, and wherein in an operational mode the control system 300 is configured to execute the method as described above.

In the depicted embodiment, the system further comprises a probe outlet 225 configured for the removal of probes from the analytical space 210.

In the depicted embodiment, the radiation source 250 comprises a plurality of radiation sources 250, especially configured to provide different wavelengths of radiation. In particular, the radiation source 250 may be suitable to provide radiation in the donor excitation radiation range corresponding to different FRET donor-acceptor chromophore pairs.

In the depicted embodiment, the single-molecule fluorescence microscope 240 comprises or is functionally coupled to a plurality of optical elements configured to separate the radiation emitted from the analytical space 210 (by the donor chromophore 23 and/or the acceptor chromophore 24) into the donor emission radiation 63 and acceptor emission radiation 64. The single-molecule fluorescence microscope 240 may comprise an EMCCD camera 241 to measure the donor emission radiation 63 and the acceptor emission radiation 64. It will be clear to the person skilled in the art that many variations of the single-molecule fluorescence microscope 240 and/or the optical elements may be possible without deviating from the scope of the invention as described herein.

In embodiments, the control system 300 may be configured to estimate a protein fingerprint based on the FRET efficiency pattern. In further embodiments, the control system 300 may be configured to identify the tagged protein 10 by comparing the protein fingerprint to protein-related information in reference data.
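As a minimal, non-limiting sketch of how a protein fingerprint might be compared to reference data, the example below bins FRET efficiency values into a normalized histogram and selects the reference entry with the smallest squared distance. The histogram representation, the distance measure, and all names (`fret_histogram`, `best_match`, the reference keys) are assumptions made for illustration; the description does not prescribe a particular comparison.

```python
def fret_histogram(efficiencies, n_bins=10):
    """Normalized histogram of FRET efficiencies over [0, 1] (hypothetical fingerprint)."""
    counts = [0] * n_bins
    for e in efficiencies:
        if 0.0 <= e <= 1.0:
            # E == 1.0 is placed in the last bin
            counts[min(int(e * n_bins), n_bins - 1)] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else counts


def best_match(fingerprint, reference):
    """Return the key of the reference fingerprint closest in squared distance."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(reference, key=lambda name: sq_dist(fingerprint, reference[name]))


if __name__ == "__main__":
    # Made-up reference data for two hypothetical proteins
    reference = {
        "protein_A": fret_histogram([0.20, 0.25, 0.22]),
        "protein_B": fret_histogram([0.80, 0.85, 0.82]),
    }
    measured = fret_histogram([0.21, 0.24])
    print(best_match(measured, reference))
```

In practice the reference data could hold one such histogram per second probe, so that a protein is matched on the combined pattern rather than on a single probe.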

Fig. 2 further schematically depicts a data carrier 400 having stored thereon program instructions, which when executed by the system 200 according to the invention, especially by the control system 300, causes the system 200 to execute the method 100 according to the invention. In further embodiments, the control system 300 may comprise the data carrier 400.

Fig. 3A-B depict further embodiments for the immobilization of a protein 10 that may be used to investigate a structure 13 of the protein 10. Fig. 3A depicts the immobilization of a protein 10 using an antibody 38. A first probe 31 may comprise an affinity-based probe 35 comprising an acceptor chromophore 24 targeting the first binding site 11. A second probe 32 may comprise an aptamer 36 and a donor chromophore 23 targeting the second binding site 12. Fig. 3B depicts the immobilization of a protein 10 with a covalently bound aptamer 36. This may provide for the probing of a structure 13 of the protein 10 by an affinity-based probe 35 comprising an aptamer 36 and a donor chromophore 23.

Experiments

Unless described otherwise, the experiments described herein are performed using the materials and methods described hereinafter.

Imaging buffer: the experiments were performed in imaging buffer comprising 100 mM NaCl and 10 mM Na2HPO4/NaH2PO4 pH 7.4 or 100 mM KCl and 10 mM K2HPO4/KH2PO4. The ions in this buffer may promote aptamer 36 folding, in particular the stabilization of the G-quadruplex and duplex, which are structures that some DNA/RNA aptamers 36 may adopt. In this imaging buffer the aptamers 36 have off-rates of 0.05 - 0.2 s⁻¹. The binding affinity of an aptamer 36 can be fine-tuned by altering the position of the fluorescent dye, which resulted in up to a 20-fold difference in binding affinity.

Imaging: the emission 60 was collected using an ssFRET TIRF microscope 240. TIR excitation with excitation radiation 50 and detection of the FRET pair 20 emission 60 were performed using prism-type TIRF. Immobilized molecules were exposed to donor excitation radiation 53 by TIR using a green and/or red laser. Fluorescence emission 60 was collected by an objective, and a slit created images of half the size of the EM-CCD camera. The fluorescence emission 60 signal was split into a donor signal and an acceptor signal by a dichroic mirror, and the two signals were imaged side by side on the EM-CCD. Collected movies were processed and analyzed using custom-written software.

Four aptamers were synthesized with a short thymine stretch at their 5' end to increase the distance between the aptamer-protein binding site and the chromophore (see SEQ ID NO: 1-4).

Fig. 4 schematically depicts results from fingerprinting experiments using the method as described herein. A thrombin protein 10 corresponding to SEQ ID NO: 5 was immobilized to the analytical space 210 at the N-terminal end 19 via a first probe 31 using biotin-streptavidin interaction. The first probe 31 comprised a DNA docking strand labeled with a Cy acceptor chromophore 24. The thrombin protein 10 was probed with two second probes 32a, 32b each comprising a Cy3 donor chromophore 23. The first second probe 32a was an affinity-based probe 35 HD1 corresponding to SEQ ID NO:1 and targeting thrombin exosite 1. The second second probe 32b was an affinity-based probe 35 HD22 corresponding to SEQ ID NO: 2 and targeting thrombin exosite 2.

Depicted in Fig. 4 on the top time-trace graph are the emission intensities I (in a.u.) of the donor emission radiation 63 and the acceptor emission radiation 64 over time T in seconds of a representative single molecule. The occurrence of active FRET efficiency E over time T in seconds for this representative single molecule is depicted on the bottom time-trace graph. Further depicted are the kymographs of protein fingerprinting for the first second probe 32a and the second second probe 32b, with the FRET efficiency E over time T in seconds. Note that the difference in counts may be explained by the shorter imaging time and the reduced binding frequency of the second second probe 32b at thrombin exosite 2.
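As a hedged illustration of how a FRET efficiency trace may be derived from the two intensity traces, the sketch below uses the commonly used ratiometric estimate E = I_A / (I_D + I_A) per frame. This uncorrected estimate (no background, leakage, or detection-efficiency correction) is an assumption; the description does not state which estimator the custom software used, and the function name and `min_total` cutoff are hypothetical.

```python
def apparent_fret(donor, acceptor, min_total=1e-9):
    """Per-frame apparent FRET efficiency E = I_A / (I_D + I_A).

    Frames whose total intensity does not exceed min_total (e.g. no probe
    bound) yield None instead of an ill-defined ratio.
    """
    if len(donor) != len(acceptor):
        raise ValueError("donor and acceptor traces must have equal length")
    trace = []
    for i_d, i_a in zip(donor, acceptor):
        total = i_d + i_a
        trace.append(i_a / total if total > min_total else None)
    return trace


if __name__ == "__main__":
    # Made-up intensities in arbitrary units
    donor = [100.0, 300.0, 0.0]
    acceptor = [300.0, 100.0, 0.0]
    print(apparent_fret(donor, acceptor))
```

Returning None for dark frames keeps unbound periods out of any subsequent FRET efficiency histogram or kymograph.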

Fig. 5A-C schematically depict results from fingerprinting experiments using the method as described herein. A thrombin protein 10 corresponding to SEQ ID NO: 5 was immobilized to the analytical space 210 using a small-molecule moiety 37 (CAS 142036-63-3) bound to the analytical space 210 using biotin-streptavidin interaction. The first probe 31 comprised an affinity-based probe 35 HD22 corresponding to SEQ ID NO:2 labeled with a Cy5 acceptor chromophore 24 and targeting thrombin exosite 2. The thrombin protein 10 was probed with three second probes 32a,32c,32d each comprising an aptamer 36 and a Cy3 donor chromophore 23. The first second probe 32a was an affinity-based probe 35 HD1 corresponding to SEQ ID NO: 1 and targeting thrombin exosite 1. The third second probe 32c was an affinity-based probe 35 NU172 corresponding to SEQ ID NO: 3 and targeting thrombin exosite 1. The fourth second probe 32d was an affinity-based probe 35 RE31 corresponding to SEQ ID NO:4 and targeting thrombin exosite 1.

Depicted in each of Fig. 5A-C on the top time-trace graph are the emission intensity I (in a.u.) of the donor emission radiation 63 and the acceptor emission radiation 64 over time T in seconds of a representative single molecule. The occurrence of FRET efficiency E over time T in seconds for this representative single molecule is depicted on the bottom time-trace graph. Further depicted are the kymographs of protein fingerprinting of a representative single molecule for the second probe 32, with the FRET efficiency E over time T in seconds. Fig. 5A depicts these results for the first second probe 32a, Fig. 5B depicts these results for the third second probe 32c, and Fig. 5C depicts these results for the fourth second probe 32d. Hence, repetitive, transient binding is observed for all three transiently binding probes 32a,32c,32d. Hence, a single protein 10 may be probed with a plurality of (different) second probes 32, resulting in corresponding FRET measurements, which may facilitate characterizing, such as identifying, ((binding sites 11 of) the structure 13 of) the protein 10.

Fig. 6A-B schematically depict results from experiments investigating the binding kinetics of aptamers 36 with acceptor chromophores 24 labeled at different positions of the aptamer 36 sequence. A thrombin protein 10 was immobilized to the analytical space 210 at the N-terminal end 19 using biotin-streptavidin interaction. The emission 60 was measured to provide a time-trace graph with the emission intensity I (in a.u.) over time T in seconds. These results were analyzed to provide the dwell time D (in seconds) for each binding event B. Fig. 6A depicts the results from probing the structure 13 of the protein 10 using the second second probe 32b HD22 corresponding to SEQ ID NO:2 and targeting thrombin exosite 2, the affinity-based probe 35 comprising a Cy5 acceptor chromophore 24 arranged at the 5’-end. Fig. 6B depicts the results from probing the structure 13 of the protein 10 using a fifth second probe 32e HD22 corresponding to SEQ ID NO:2 and targeting thrombin exosite 2, the probe comprising a Cy5 acceptor chromophore 24 arranged at the thymine at the 17th position of SEQ ID NO:2 counted from the 5’-end. Hence, the position at which the chromophore 23,24 is attached to the aptamer 36 sequence may affect the binding kinetics of the aptamer 36 and may be used to fine-tune the binding kinetics in embodiments.
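The analysis of a time trace into dwell times may, purely as an illustrative sketch, be performed by thresholding the emission intensity and measuring the length of each contiguous bound period. The fixed threshold, the frame time, and the function name are assumptions for illustration; the custom software used in the experiments is not disclosed here and may well use a different approach (for example, hidden Markov modelling).

```python
def dwell_times(intensity, threshold, frame_time_s):
    """Dwell time (s) of each complete binding event in an intensity trace.

    A frame counts as 'bound' when its intensity exceeds the threshold.
    Events already in progress at the first frame, or still ongoing at the
    last frame, are truncated by the observation window and are skipped.
    """
    dwells = []
    run_start = None
    for i, value in enumerate(intensity):
        bound = value > threshold
        if bound and run_start is None:
            run_start = i                      # binding event begins
        elif not bound and run_start is not None:
            if run_start > 0:                  # ignore event cut off at trace start
                dwells.append((i - run_start) * frame_time_s)
            run_start = None                   # probe dissociated
    return dwells


if __name__ == "__main__":
    # Made-up trace: two complete events and one cut off at the end
    trace = [0, 5, 5, 0, 0, 5, 5, 5, 0, 5]
    print(dwell_times(trace, threshold=2, frame_time_s=0.5))
```

Skipping truncated events avoids biasing the dwell-time distribution towards short values, at the cost of discarding some data.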

Fig. 7A-C schematically depict results from experiments investigating the binding kinetics of three second probes 32a,32c,32d each comprising an aptamer 36 and a Cy3 donor chromophore 23. A thrombin protein 10 corresponding to SEQ ID NO: 5 was immobilized to the analytical space 210 at the N-terminal end 19 using biotin-streptavidin interaction. The emission 60 was measured to provide a time-trace graph (on the left) with the emission intensity I (in a.u.) over time T in seconds. For each of Fig. 7A-C, the bar chart on the right side indicates the number of binding events B observed for a specific dwell time D, i.e., the graph schematically depicts a binning of the observed dwell times. A mean lifetime τ of the aptamers 36 was calculated from the dwell time D of all measured binding events. Further, the off-rate of the second probes 32 equals 1/τ, i.e., the off-rate of the second probes 32 is the reciprocal of the mean lifetime τ. Fig. 7A depicts results from the first second probe 32a, an affinity-based probe 35 HD1 corresponding to SEQ ID NO:1 and targeting thrombin exosite 1. Fig. 7B depicts results from the fourth second probe 32d, an affinity-based probe 35 RE31 corresponding to SEQ ID NO:4 and targeting thrombin exosite 1. Fig. 7C depicts results from the third second probe 32c, an affinity-based probe 35 NU172 corresponding to SEQ ID NO:3 and targeting thrombin exosite 1. As in Fig. 5A-C, repetitive, transient binding is observed for all three transiently binding probes 32a,32c,32d, thereby demonstrating that a single protein may be probed with a plurality of (different) second probes 32. Moreover, the three transiently binding probes 32a,32c,32d each show a different dwell time (and therefore, a different off-rate), demonstrating that the combination of the binding site 11 on the protein 10 and the aptamer 36 may affect the binding kinetics of the aptamers 36. The binding kinetics of the aptamers 36 may therefore be fine-tuned in embodiments.
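The relation between dwell times and off-rate described above may be sketched as follows: the mean lifetime is taken as the arithmetic mean of all measured dwell times, and the off-rate as its reciprocal. The function names and the example dwell times are hypothetical; the range check uses the off-rate range of 0.01 - 10 s-1 recited for the second probe.

```python
from statistics import mean


def off_rate(dwell_times_s):
    """Off-rate (1/s) as the reciprocal of the mean lifetime of all binding events."""
    if not dwell_times_s:
        raise ValueError("at least one binding event is required")
    mean_lifetime_s = mean(dwell_times_s)
    return 1.0 / mean_lifetime_s


def in_recited_range(k_off, lo=0.01, hi=10.0):
    """Check an off-rate against the recited 0.01 - 10 s-1 range."""
    return lo <= k_off <= hi


if __name__ == "__main__":
    dwells = [5.0, 10.0, 15.0]   # made-up dwell times in seconds
    k_off = off_rate(dwells)     # mean lifetime 10 s, so 0.1 per second
    print(k_off, in_recited_range(k_off))
```

For dwell times that follow a single-exponential distribution, the arithmetic mean is the maximum-likelihood estimate of the lifetime, which is one reason this simple estimator is a reasonable starting point.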

The term “plurality” refers to two or more. Furthermore, the terms “a plurality of” and “a number of” may be used interchangeably. The terms “substantially” or “essentially” herein, and similar terms, will be understood by the person skilled in the art. The terms “substantially” or “essentially” may also include embodiments with “entirely”, “completely”, “all”, etc. Hence, in embodiments the adjective substantially or essentially may also be removed. Where applicable, the term “substantially” or the term “essentially” may also relate to 90% or higher, such as 95% or higher, especially 99% or higher, even more especially 99.5% or higher, including 100%. Moreover, the terms “about” and “approximately” may also relate to 90% or higher, such as 95% or higher, especially 99% or higher, even more especially 99.5% or higher, including 100%. For numerical values it is to be understood that the terms “substantially”, “essentially”, “about”, and “approximately” may also relate to the range of 90% - 110%, such as 95%-105%, especially 99%-101% of the value(s) it refers to.

The term “comprise” also includes embodiments wherein the term “comprises” means “consists of”.

The term “and/or” especially relates to one or more of the items mentioned before and after “and/or”. For instance, a phrase “item 1 and/or item 2” and similar phrases may relate to one or more of item 1 and item 2. The term "comprising" may in an embodiment refer to "consisting of" but may in another embodiment also refer to "containing at least the defined species and optionally one or more other species".

Furthermore, the terms first, second, third and the like in the description and in the claims, are used for distinguishing between similar elements and not necessarily for describing a sequential or chronological order. It is to be understood that the terms so used are interchangeable under appropriate circumstances and that the embodiments of the invention described herein are capable of operation in other sequences than described or illustrated herein.

The devices, apparatus, or systems may herein amongst others be described during operation. As will be clear to the person skilled in the art, the invention is not limited to methods of operation, or devices, apparatus, or systems in operation.

The term “further embodiment” and similar terms may refer to an embodiment comprising the features of the previously discussed embodiment, but may also refer to an alternative embodiment.

It should be noted that the above-mentioned embodiments illustrate rather than limit the invention, and that those skilled in the art will be able to design many alternative embodiments without departing from the scope of the appended claims.

In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. Use of the verb "to comprise" and its conjugations does not exclude the presence of elements or steps other than those stated in a claim. Unless the context clearly requires otherwise, throughout the description and the claims, the words “comprise”, “comprising”, “include”, “including”, “contain”, “containing” and the like are to be construed in an inclusive sense as opposed to an exclusive or exhaustive sense; that is to say, in the sense of “including, but not limited to”.

The article "a" or "an" preceding an element does not exclude the presence of a plurality of such elements.

The invention may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In a device claim, or an apparatus claim, or a system claim, enumerating several means, several of these means may be embodied by one and the same item of hardware. The mere fact that certain measures are recited in mutually different dependent claims does not indicate that a combination of these measures cannot be used to advantage.

The invention also provides a control system that may control the device, apparatus, or system, or that may execute the herein described method or process. Yet further, the invention also provides a computer program product which, when running on a computer that is functionally coupled to or comprised by the device, apparatus, or system, controls one or more controllable elements of such device, apparatus, or system.

The invention further applies to a device, apparatus, or system comprising one or more of the characterizing features described in the description and/or shown in the attached drawings. The invention further pertains to a method or process comprising one or more of the characterizing features described in the description and/or shown in the attached drawings. Moreover, if a method or an embodiment of the method is described being executed in a device, apparatus, or system, it will be understood that the device, apparatus, or system is suitable for or configured for (executing) the method or the embodiment of the method, respectively.

The various aspects discussed in this patent can be combined in order to provide additional advantages. Further, the person skilled in the art will understand that embodiments can be combined, and that also more than two embodiments can be combined. Furthermore, some of the features can form the basis for one or more divisional applications.