Title:
METHODS AND COMPOSITIONS RELATED TO SELECTING VARIANT PROTEASES
Document Type and Number:
WIPO Patent Application WO/2019/112567
Kind Code:
A1
Abstract:
This invention provides protease activity screening matrices that contain a solid support conjugated to a substrate moiety that can be specifically cleaved by a specific protease. The invention also provides methods for utilizing the protease activity screening matrices described herein for identifying a variant protease that recognizes a desired substrate cleavage site. The invention additionally provides specific variant trypsin enzymes or enzymatic fragments that have citrulline-dependent proteolytic activities.

Inventors:
PAEGEL BRIAN (US)
TRAN DUC (US)
CAVETT VALERIE (US)
Application Number:
PCT/US2017/064697
Publication Date:
June 13, 2019
Filing Date:
December 05, 2017
Assignee:
SCRIPPS RESEARCH INST (US)
International Classes:
C07H17/00; C07H19/04; C07H19/20; C07H21/02
Foreign References:
US9051612B2, 2015-06-09
US20100075407A1, 2010-03-25
US20050014160A1, 2005-01-20
Other References:
TRAN ET AL.: "Evolution of a mass spectrometry-grade protease with PTM-directed specificity", PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES, vol. 113, no. 51, 20 December 2016 (2016-12-20), pages 14686 - 14691, XP055615228, DOI: 10.1073/pnas.1609925113
HUANG ET AL.: "Linking genotype to phenotype on beads: high throughput selection of peptides with biological function", SCIENTIFIC REPORTS, vol. 3, 3030, 23 October 2013 (2013-10-23), pages 1 - 9, XP055159037, DOI: 10.1038/srep03030
Attorney, Agent or Firm:
FITTING, Thomas et al. (US)
Claims:
WE CLAIM:

1. A protease activity assay matrix, comprising a solid support conjugated to a substrate moiety that can be specifically cleaved by a protease, wherein the substrate moiety comprises (1) a probe moiety capable of detecting cleavage of the substrate moiety by the protease and (2) a nucleotide moiety capable of priming replication of a polynucleotide sequence encoding the protease.

2. The protease activity assay matrix of claim 1, wherein the substrate moiety further comprises a linker moiety for connecting both the probe moiety and the nucleotide moiety to the solid support.

3. The protease activity assay matrix of claim 1, wherein the nucleotide moiety is a PCR oligonucleotide primer.

4. The protease activity assay matrix of claim 1, wherein the solid support is a magnetic bead.

5. The protease activity assay matrix of claim 1, wherein the probe moiety comprises a cleavage site of the protease and a detectable label.

6. The protease activity assay matrix of claim 5, wherein the detectable label is a fluorescent label.

7. The protease activity assay matrix of claim 6, wherein the fluorescent label is a fluorogenic molecule.

8. The protease activity assay matrix of claim 1, wherein the protease is an endopeptidase.

9. The protease activity assay matrix of claim 8, wherein the endopeptidase is a trypsin variant.

10. The protease activity assay matrix of claim 1, further comprising a polynucleotide sequence encoding the protease, wherein the polynucleotide sequence is linked to the assay matrix via the nucleotide moiety.

11. The protease activity assay matrix of claim 10, wherein the nucleotide moiety is a PCR oligonucleotide primer.

12. A method for identifying a variant protease that recognizes a desired substrate cleavage site, comprising (1) conjugating to a solid support a substrate moiety that comprises (a) a probe moiety capable of detecting cleavage of the substrate moiety at the desired cleavage site and (b) a nucleotide moiety capable of priming replication of a polynucleotide sequence encoding the variant protease, to generate a protease activity assay matrix, (2) emulsifying the protease activity assay matrix with a library of polynucleotide sequences encoding a population of candidate proteases, to generate a library of matrix emulsified polynucleotide sequences, (3) performing emulsion PCR (emPCR) and emulsion in vitro transcription/translation (emIVTT) with the library of matrix emulsified polynucleotide sequences, and (4) detecting cleavage of the substrate moiety in one or more matrix emulsified polynucleotide sequences; thereby identifying a variant protease recognizing the desired substrate cleavage site.

13. The method of claim 12, wherein the substrate moiety further comprises a linker moiety for connecting both the probe moiety and the nucleotide moiety to the solid support.

14. The method of claim 12, wherein the nucleotide moiety is a PCR oligonucleotide primer.

15. The method of claim 12, wherein the solid support is a magnetic bead.

16. The method of claim 12, wherein the emulsifying is performed with an emulsion formulation that comprises a continuous phase and an aqueous phase.

17. The method of claim 16, wherein the emPCR is performed in the presence of a stabilizer in the aqueous phase and a stabilizer in the continuous phase.

18. The method of claim 16, wherein the emIVTT is performed in the presence of a stabilizer in the continuous phase.

19. The method of claim 12, wherein the probe moiety comprises a cleavage site of the protease and a detectable label.

20. The method of claim 19, wherein the detectable label is a fluorescent label.

21. The method of claim 20, wherein the fluorescent label is a fluorogenic molecule.

22. The method of claim 12, wherein the variant protease is an endopeptidase.

23. The method of claim 22, wherein the endopeptidase is a trypsin variant.

24. A variant trypsin enzyme or enzymatic fragment thereof that has citrulline-dependent proteolytic activity, comprising (a) a sequence that is at least 90% identical to wild type trypsin, and (b) mutation D189S and one or more additional mutations selected from the group consisting of L7P, E185K, and K188A, or conservative substitutions thereof.

25. The variant trypsin enzyme or enzymatic fragment thereof according to claim 24, comprising amino acid substitutions L7P, E185K, K188A and D189S, or conservative substitutions thereof.

Description:
METHODS AND COMPOSITIONS RELATED TO SELECTING VARIANT PROTEASES

STATEMENT OF GOVERNMENT SUPPORT

[0001] This invention was made with government support under DP2OD008535 awarded by the National Institutes of Health. The government has certain rights in the invention.

BACKGROUND OF THE INVENTION

[0001] Mass spectrometry is an exquisitely general and high-resolution platform for protein sequencing, identification, and mapping. Proteins are first enzymatically digested into smaller peptides that are suitable for chromatographic separation and ionization. A tandem mass analyzer then isolates individual peptide ions in a first analyzer, induces regular fragmentation along the peptide backbone, and scans the product ion series in the second analyzer. Mass differences between consecutive product ion peaks in these series reveal the sequential identities of amino acids, and the type and location of any post-translational modifications (PTMs).

[0002] Proteolytic digestion is the first step in preparing a sample for mass spectrometric analysis. Regions of protein sequence with sparse proteolytic cleavage sites are not detected and result in sequence coverage gaps. Trypsin is used almost exclusively because it is highly specific and efficient, and tryptic cleavage sites are common in many proteins. A few other proteases are used much less frequently because they are either too inefficient (AspN, GluC) or too promiscuous (chymotrypsin) to be generally useful. The arsenal of viable proteases for sample preparation is shockingly small, and the absence of tryptic cleavage sites alone can render swathes of protein sequence completely invisible to the instrument. These sequence coverage gaps are a disabling obstacle because they can conceal key regulatory modifications, thwarting discovery and mapping efforts.

[0003] There is a strong need in the art for proteases with novel substrate specificities that are useful in mass spectrometry and related technical fields. The present invention is directed to this and other unmet needs.

SUMMARY OF THE INVENTION

[0004] In one aspect, the present invention provides protease activity assay matrices. The matrices contain a solid support conjugated to a substrate moiety that can be specifically cleaved by a protease. Typically, the substrate moiety of the matrices is comprised of (1) a probe moiety capable of detecting cleavage of the substrate moiety by the protease and (2) a nucleotide moiety capable of priming replication of a polynucleotide sequence encoding the protease. In some embodiments, the substrate moiety can additionally contain a linker moiety for connecting both the probe moiety and the nucleotide moiety to the solid support.

[0005] In some protease activity assay matrices of the invention, the nucleotide moiety is a PCR oligonucleotide primer. In some embodiments, the solid support is a magnetic bead. In some embodiments, the probe moiety contains a cleavage site of the protease and a detectable label. In some of these embodiments, the detectable label is a fluorescent label, e.g., a fluorogenic molecule. Some of the protease activity assay matrices of the invention are intended for endopeptidases, e.g., trypsin or its variants. In some embodiments, the protease activity assay matrices can further include a polynucleotide sequence encoding the protease, e.g., a candidate mutant or variant of a target protease of interest. In some of these embodiments, the polynucleotide sequence can be linked to the assay matrix via the nucleotide moiety, e.g., through hybridization. In some of these embodiments, the nucleotide moiety is a PCR oligonucleotide primer.

[0006] In another aspect, the invention provides methods for identifying a variant protease that recognizes a desired substrate cleavage site. These methods entail (1) conjugating to a solid support a substrate moiety that contains (a) a probe moiety capable of detecting cleavage of the substrate moiety at the desired cleavage site and (b) a nucleotide moiety capable of priming replication of a polynucleotide sequence encoding a candidate variant protease, to generate a protease activity assay matrix, (2) emulsifying the protease activity assay matrix with a library of polynucleotide sequences encoding a population of candidate proteases, to generate a library of matrix emulsified polynucleotide sequences, (3) performing emulsion PCR (emPCR) and emulsion in vitro transcription/translation (emIVTT) with the library of matrix emulsified polynucleotide sequences, and (4) detecting cleavage of the substrate moiety in one or more matrix emulsified polynucleotide sequences. These steps lead to identification of one or more variant proteases recognizing the desired substrate cleavage site.

[0007] In some methods of the invention, the substrate moiety can further include a linker moiety for connecting both the probe moiety and the nucleotide moiety to the solid support. In some embodiments, the nucleotide moiety is a PCR oligonucleotide primer. In some embodiments, the employed solid support is a magnetic bead. In some methods, the emulsifying is performed with an emulsion formulation that comprises a continuous phase and an aqueous phase. In some of these methods, the emPCR is performed in the presence of a stabilizer in the aqueous phase and a stabilizer in the continuous phase. In some of these methods, the emIVTT is performed in the presence of a stabilizer in the continuous phase.

In some methods of the invention, the employed probe moiety contains a cleavage site of the protease and a detectable label. In some of these methods, the detectable label is a fluorescent label, e.g., a fluorogenic molecule. Some methods of the invention are directed to identifying variants of an endopeptidase, e.g., trypsin variants.

[0008] In another aspect, the invention provides variant trypsin enzymes or enzymatic fragments thereof that have citrulline-dependent proteolytic activity. These enzymes typically contain an amino acid sequence that is at least 90% identical to the sequence of wild type trypsin. Further, relative to the wild type trypsin, the trypsin variants of the invention contain mutation D189S and one or more additional mutations selected from the group consisting of L7P, E185K, and K188A, or conservative substitutions thereof. In some embodiments, the variant trypsin enzymes of the invention contain amino acid substitutions L7P, E185K, K188A and D189S, or conservative substitutions thereof.
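The mutation pattern recited above (D189S together with at least one of L7P, E185K and K188A) can be expressed as a simple set check. The following Python sketch is illustrative only: the function names are hypothetical, and it deliberately ignores the 90% sequence identity requirement and the "conservative substitutions thereof" alternatives, neither of which can be evaluated from mutation labels alone.

```python
import re

# Mutation labels taken from the text; everything else here is an assumption.
REQUIRED_MUTATION = "D189S"
ADDITIONAL_MUTATIONS = {"L7P", "E185K", "K188A"}

_MUTATION_RE = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_mutation(label: str) -> tuple:
    """Split a label such as 'D189S' into (wild-type residue, position, new residue)."""
    match = _MUTATION_RE.match(label)
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    wild_type, position, replacement = match.groups()
    return wild_type, int(position), replacement


def has_recited_mutation_pattern(mutations) -> bool:
    """True if D189S is present along with at least one of L7P, E185K, K188A."""
    observed = set(mutations)
    for label in observed:
        parse_mutation(label)  # validate the label format
    return REQUIRED_MUTATION in observed and bool(observed & ADDITIONAL_MUTATIONS)


if __name__ == "__main__":
    print(parse_mutation("D189S"))
    print(has_recited_mutation_pattern(["L7P", "E185K", "K188A", "D189S"]))
    print(has_recited_mutation_pattern(["D189S"]))
```

A full assessment of a candidate sequence would additionally require a sequence alignment against wild type trypsin, which is outside the scope of this sketch.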

[0009] A further understanding of the nature and advantages of the present invention may be realized by reference to the remaining portions of the specification and claims.

DESCRIPTION OF THE DRAWINGS

[0010] Figure 1 shows activity assay bead synthesis, function, and implementation in compartmentalized protease evolution. The protease activity assay bead (top left) is prepared from 2.8-µm-diameter, amino-functionalized magnetic resin. The resin is elaborated with a bifunctional linker, displaying an alkyne for copper-catalyzed cycloaddition to an azido oligonucleotide PCR primer and a primary amine for conjugation with a carboxylic acid-terminated bisamide R110 probe (black hexagon).

The bisamide probe displays two symmetric tripeptide arms. Cleavage only at the P1 amide bond (dashed line) mediated by an active mutant protease (white pac-man) yields a fluorescent bead displaying the R110 fluorophore (white). The beads are used in an in vitro compartmentalized protease evolution workflow (bottom), including preparation of a protease mutant gene library, emulsion PCR (emPCR) of the mutant gene library in the presence of the protease activity assay beads to yield a bead library, each clonally displaying ~10,000 copies of a progenitor mutant. The bead library is washed and re-emulsified with in vitro transcription/translation reaction mixture (emIVTT). Droplets housing beads that display an inactive protease gene (gray DNA) result in translation of inactive protease (black pac-man) and no proteolytic activity directed toward the bead (black hex bead). Droplets housing beads that display an active protease gene (white DNA) result in translation of the active protease, which then transforms bead-bound substrates to fluorescent bead-bound product (white hex bead). Fluorescence-activated cell sorting (FACS) isolates beads harboring active protease genes for subsequent single-bead PCR and activity assay.

DETAILED DESCRIPTION

I. Overview

[0011] The invention is predicated in part on the development by the present inventors of a compartmentalized in vitro evolution platform to discover new proteases for mass spectrometry-based proteomics. The platform is capable of generating and screening over a million protease mutants, identifying mutants exhibiting desired proteolytic activity and rejecting those that exhibit off-target activity. As exemplification, protease activity assay beads are prepared by functionalizing 2.8-µm magnetic amino beads with a bifunctional linker that displays both a protease activity probe and a DNA oligonucleotide primer for PCR (Fig. 1, top). A protease activity-based probe such as the rhodamine 110-derived probe can be prepared with any type of side chain target for which mutant proteases that cleave C-terminal to that side chain are desired. A mutant protease that can catalyze cleavage of that amide bond reveals the rhodamine 110 core fluorophore, generating a ~~I Ed- fold enhancement of fluorescence quantum yield on the bead. Mutant library screening proceeds through a complex workflow of PCR mutagenesis, two emulsified biochemical reactions, and flow cytometry (Fig. 1, bottom).

[0012] To screen for mutant protease with the desired activity, a mutant library is constructed by PCR. The mutant library is diluted and emulsified together with the protease activity assay beads in a PCR mix containing the opposite primer required to amplify the mutant protease gene. Genes and beads are diluted such that on average each droplet contains ~0.3 gene molecules and ~1 bead. The emulsion PCR (emPCR) is then thermally cycled. In droplets that house both bead and gene, the bead becomes clonally populated with -I k copies of the progenitor mutant gene. All beads are harvested from emPCR and washed. This bead library is emulsified with in vitro transcription/translation (IVTT) reagent. During emulsion IVTT (emIVTT), the bead-bound genes are transcribed to RNA, which is then translated to protease. If the mutant protease is active (Fig. 1), it transforms the quenched bead-bound probe to the highly fluorescent bead-bound R110 product. If the mutant protease is inactive (Fig. 1), the probe remains quenched and the bead non-fluorescent. The beads are harvested from the emIVTT, washed, and analyzed by fluorescence-activated cell sorting (FACS), with each hit bead deposited in a separate microtiter plate well containing PCR reagents. Subsequent thermal cycling amplifies the bead-bound gene for activity assay and sequencing.
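The droplet loading described above (on average ~0.3 gene molecules and ~1 bead per droplet) can be explored with a short calculation. The Python sketch below assumes that gene and bead counts per droplet follow independent Poisson distributions; that statistical model is an assumption made for illustration and is not stated in the text, and the function names are hypothetical.

```python
import math


def poisson_pmf(k: int, mean: float) -> float:
    """Probability of exactly k items in a droplet under Poisson loading."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def droplet_statistics(gene_mean: float = 0.3, bead_mean: float = 1.0) -> dict:
    """Summarize droplet occupancy for the given average loadings.

    Assumption: gene and bead counts are independent Poisson variables.
    """
    p_no_gene = poisson_pmf(0, gene_mean)
    p_one_gene = poisson_pmf(1, gene_mean)
    p_any_gene = 1.0 - p_no_gene
    p_any_bead = 1.0 - poisson_pmf(0, bead_mean)
    return {
        "droplets_with_gene": p_any_gene,
        "droplets_with_bead": p_any_bead,
        "droplets_with_gene_and_bead": p_any_gene * p_any_bead,
        # Fraction of gene-containing droplets that hold exactly one gene,
        # i.e. droplets in which amplification would be clonal.
        "single_gene_given_any_gene": p_one_gene / p_any_gene,
    }


if __name__ == "__main__":
    for name, value in droplet_statistics().items():
        print(f"{name}: {value:.3f}")
```

Under these assumptions, about 26% of droplets receive at least one gene, and about 86% of those receive exactly one, which illustrates why a low average gene loading favors clonal beads.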

[0013] In accordance with these studies, the invention provides protease activity screening matrices that contain a solid support conjugated to a substrate moiety that can be specifically cleaved by a specific protease. Also provided in the invention are methods for utilizing the protease activity screening matrices described herein for identifying a variant protease that recognizes a desired substrate cleavage site. The invention additionally provides specific variant trypsin enzymes or enzymatic fragments that have citrulline-dependent proteolytic activities.

II. Definitions

[0014] Unless otherwise indicated, the present invention can be practiced in accordance with the techniques exemplified herein and other standard procedures well known and routinely practiced in the art. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by those of ordinary skill in the art to which this invention pertains. The following references provide one of skill with a general definition of many of the terms used in this invention: Singleton et al., DICTIONARY OF MICROBIOLOGY AND MOLECULAR BIOLOGY (2d ed. 1994); THE CAMBRIDGE DICTIONARY OF SCIENCE AND TECHNOLOGY (Walker ed., 1988); and Hale & Marham, THE HARPER COLLINS DICTIONARY OF BIOLOGY (1991). In the event that there are a plurality of definitions for terms herein, those in this section prevail. Where reference is made to a URL or other such identifier or address, it is understood that such identifiers can change and particular information on the internet can come and go, but equivalent information can be found by searching the internet. Reference thereto evidences the availability and public dissemination of such information.

[0015] As used herein, the singular forms "a," "an" and "the" include plural referents unless the context clearly dictates otherwise. Thus, for example, reference to a compound comprising "an extracellular domain" includes compounds with one or a plurality of extracellular domains.

[0016] As used herein, emulsion PCR (emPCR) refers to a method for template amplification employed in multiple next-generation sequencing platforms. EmPCR is based on compartmentalization of DNA fragments in minute water droplets/vesicles in a water-in-oil emulsion to a degree of dilution where there is only a single or a few template molecules per droplet. Ideally, each vesicle/droplet contains one sphere, one single-stranded template molecule, one of the primers bound to the sphere, and all other reagents necessary for the PCR reaction; the second primer remains in the solution to screen out molecules bound to the same adapters. Thus, every droplet functions as an isolated PCR micro-reactor leading to generation of numerous copies of bound templates, facilitating signal amplification and detection during IVTT and FACS.

[0017] The organic compound citrulline is an α-amino acid. It has the formula H2NC(O)NH(CH2)3CH(NH2)CO2H. It is a key intermediate in the urea cycle, the pathway by which mammals excrete ammonia by converting it into urea. Several proteins contain citrulline as a result of a posttranslational modification (PTM). These citrulline residues are generated by a family of enzymes called peptidylarginine deiminases (PADs), which convert arginine into citrulline in a process called citrullination or deimination. Proteins that normally contain citrulline residues include myelin basic protein (MBP), filaggrin, and several histone proteins, whereas other proteins, such as fibrin and vimentin, are susceptible to citrullination during cell death and tissue inflammation.

[0018] As used herein, a library of protease variants or a combinatorial library of protease variants refers to a collection of protease mutants or variants having distinct and diverse amino acid mutations in their sequences with respect to the sequence of a starting template or wild type protease. The mutations represented in the collection can be across the sequence of the starting protease or can be in a specified region or regions of the starting protease. The mutations can be made randomly or can be targeted mutations designed empirically or rationally based on structural or functional information.

[0019] As used herein, a "template protease" refers to a protease having a sequence of amino acids that is used for mutagenesis thereof. A template protease can be the sequence of a wild-type protease, or a catalytically active portion thereof, or it can be the sequence of a variant protease, or catalytically active portion thereof, for which additional mutations are made. For example, a specific variant protease identified in the selection methods herein can be used as a starting template for further mutagenesis to be used in subsequent rounds of selection.

[0020] As used herein, random mutation refers to the introduction of one or more amino acid changes across the sequence of a polypeptide without regard or bias as to the mutation. Random mutagenesis can be facilitated by a variety of techniques known to one of skill in the art including, for example, UV irradiation, chemical methods, and PCR methods (e.g., error-prone PCR).

[0021] As used herein, a focused mutation refers to one or more amino acid changes in a specified region (or regions) or a specified position (or positions) of a polypeptide. For example, targeted mutation of the amino acids in the specificity binding pocket of a protease can be made. Focused mutagenesis can be performed, for example, by site directed mutagenesis or multi-site directed mutagenesis using standard recombinant techniques known in the art.

[0022] As used herein, desired specificity with reference to substrate specificity refers to cleavage specificity for a predetermined or preselected or otherwise targeted substrate.

[0023] As used herein, "protease" refers to any peptide, polypeptide or peptide or polypeptide-containing substance that catalyzes the hydrolysis of a protein or peptide. The protease may be natural or non-naturally occurring and may be isolated from a natural source, may be recombinant or synthetic and is not required to be in any particular form. Proteases include, for example, serine proteases, cysteine proteases, aspartic proteases, threonine and metallo-proteases depending on the catalytic activity of their active site and mechanism of cleaving peptide bonds of a target substrate. Examples of well-known proteases include trypsin, chymotrypsin, bromelain, cathepsin B, cathepsin D, cathepsin G, clostripain, collagenase, dispase, endoproteinase Arg-C, endoproteinase Asp-N, endoproteinase Glu-C, endoproteinase Lys-C, factor Xa, kallikrein, papain, pepsin, plasmin, proteinase K, subtilisin, thermolysin, thrombin, acylamino-acid-releasing enzyme, aminopeptidase M, carboxypeptidase A, carboxypeptidase B, carboxypeptidase P, carboxypeptidase Y, cathepsin C, leucine aminopeptidase, and pyroglutamate aminopeptidase.

[0024] As used herein, a "substrate" is a molecule that binds to the active site of a protease and is cleaved by the protease. After cleavage, part of the substrate may remain bound to the protease.

[0025] As used herein, a zymogen refers to a protease that is activated by proteolytic cleavage, including maturation cleavage, such as activation cleavage, and/or complex formation with other protein(s) and/or cofactor(s). A zymogen is an inactive precursor of a proteolytic enzyme. Such precursors are generally, although not necessarily, larger than the active form. With reference to serine proteases, zymogens are converted to active enzymes by specific cleavage, including catalytic and autocatalytic cleavage, or by binding of an activating co-factor, which generates an active enzyme. A zymogen, thus, is an enzymatically inactive protein that is converted to a proteolytic enzyme by the action of an activator. Cleavage can be effected autocatalytically. Zymogens, generally, are inactive and can be converted to mature active polypeptides by catalytic or autocatalytic cleavage of the proregion from the zymogen.

[0026] As used herein, a protease domain is the catalytically active portion of a protease. Reference to a protease domain of a protease includes the single, two- and multi-chain forms of any of these proteins. A protease domain of a protein contains all of the requisite properties of that protein required for its proteolytic activity, such as, for example, its catalytic center.

[0027] As used herein, a catalytically active portion or proteolytically active portion of a protease refers to the protease domain, or any fragment or portion thereof that retains protease activity. Significantly, at least in vitro, the single chain forms of the proteases and catalytic domains or proteolytically active portions thereof (typically C-terminal truncations) exhibit protease activity.

[0028] As used herein, a "nucleic acid encoding a protease domain or catalytically active portion of a protease" refers to a nucleic acid encoding only the recited single chain protease domain or active portion thereof, and not the other contiguous portions of the protease as a continuous sequence.

[0029] As used herein, active site of a protease refers to the substrate binding site where catalysis of the substrate occurs. The structure and chemical properties of the active site allow the recognition and binding of the substrate and subsequent hydrolysis and cleavage of the scissile bond in the substrate. The active site of a protease contains amino acids that contribute to the catalytic mechanism of peptide cleavage as well as amino acids that contribute to substrate sequence recognition, such as amino acids that contribute to extended substrate binding specificity.

[0030] As used herein, the "substrate recognition sequence" or "cleavage site" refers to the sequence that is recognized by the active site of a protease and cleaved by a protease. Typically, for example, for a serine protease, a cleavage sequence is made up of the P1-P4 and P1'-P4' amino acids in a substrate, where cleavage occurs after the P1 position. Typically, a cleavage site for a serine protease is six residues in length to match the extended substrate specificity of many proteases, but can be longer or shorter depending upon the protease. The probe moiety or probe peptide described herein typically contains a desired substrate recognition sequence or cleavage site.
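The P4-P1 and P1'-P4' numbering described above can be illustrated with a small helper that labels the residues flanking a scissile bond. This Python sketch is for illustration only: the function name, its parameters, and the example peptide are hypothetical and do not come from the specification.

```python
def label_cleavage_site(peptide: str, residues_before_bond: int, span: int = 4) -> dict:
    """Label residues around a scissile bond as P4..P1 and P1'..P4'.

    residues_before_bond is the number of residues on the N-terminal side of
    the scissile bond, so cleavage occurs after peptide[residues_before_bond - 1],
    which is the P1 residue. Positions that fall outside the peptide are omitted.
    """
    if not 0 < residues_before_bond < len(peptide):
        raise ValueError("the scissile bond must lie inside the peptide")
    labels = {}
    for offset in range(1, span + 1):
        unprimed_index = residues_before_bond - offset    # N-terminal side
        primed_index = residues_before_bond + offset - 1  # C-terminal side
        if unprimed_index >= 0:
            labels[f"P{offset}"] = peptide[unprimed_index]
        if primed_index < len(peptide):
            labels[f"P{offset}'"] = peptide[primed_index]
    return labels


if __name__ == "__main__":
    # Hypothetical eight-residue peptide cleaved between R and S.
    for position, residue in label_cleavage_site("GAVRSLEK", 4).items():
        print(position, residue)
```

In this example the bond after the fourth residue is treated as scissile, so R occupies P1 and S occupies P1'.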

[0031] As used herein, target substrate refers to a substrate that is specifically cleaved at its substrate recognition site by a protease. Minimally, a target substrate includes the amino acids that make up the cleavage sequence. Optionally, a target substrate includes a peptide containing the cleavage sequence and any other amino acids. A full-length protein, allelic variant, isoform, or any portion thereof, containing a cleavage sequence recognized by a protease, is a target substrate for that protease. Additionally, a target substrate includes a peptide or protein containing an additional moiety that does not affect cleavage of the substrate by a protease. For example, a target substrate can include a four amino acid peptide or a full-length protein chemically linked to a fluorogenic moiety.

[0032] As used herein, altered specificity refers to a change in substrate specificity of a modified or selected protease compared to a starting wild-type or template protease. Generally, the change in specificity is a reflection of the change in preference of a modified protease for a target substrate compared to a wild type substrate of the template protease (herein referred to as a non-target substrate). Typically, modified proteases or selected proteases provided herein exhibit increased substrate specificity for any one or more predetermined or desired cleavage sequences of a target protein compared to the substrate specificity of a template protease. For example, a modified protease or selected protease that has a substrate specificity ratio of 100 for a target substrate versus a non-target substrate exhibits a 10-fold increased specificity compared to a scaffold protease with a substrate specificity ratio of 10. In another example, a modified protease that has a substrate specificity ratio of 1 compared to a ratio of 0.1 exhibits a 10-fold increase in substrate specificity. To exhibit increased specificity compared to a template protease, a modified protease has a 1.5-fold, 2-fold, 5-fold, 10-fold, 50-fold, 100-fold, 200-fold, 300-fold, 400-fold, 500-fold or more greater substrate specificity for any one or more of the predetermined target substrates.
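The arithmetic behind the two examples above can be written out as a short sketch. The Python below is illustrative only: the function names are hypothetical, and the specification does not prescribe how the underlying activities on target and non-target substrates are measured.

```python
import math


def specificity_ratio(target_activity: float, non_target_activity: float) -> float:
    """Ratio of activity on the target substrate to activity on the non-target substrate."""
    if non_target_activity <= 0:
        raise ValueError("non-target activity must be positive")
    return target_activity / non_target_activity


def fold_change_in_specificity(modified_ratio: float, template_ratio: float) -> float:
    """Fold change in specificity of a modified protease relative to its template."""
    if template_ratio <= 0:
        raise ValueError("template ratio must be positive")
    return modified_ratio / template_ratio


if __name__ == "__main__":
    # The two worked examples from the text: 100 versus 10, and 1 versus 0.1.
    print(fold_change_in_specificity(100, 10))
    print(fold_change_in_specificity(1, 0.1))
    assert math.isclose(fold_change_in_specificity(1, 0.1), 10.0)
```

Both examples give a 10-fold increase, because the fold change depends only on the quotient of the two ratios and not on their absolute magnitudes.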

[0033] As used herein, "primer" refers to a nucleic acid molecule that can act as a point of initiation of template-directed DNA synthesis under appropriate conditions (e.g., in the presence of four different nucleoside triphosphates and a polymerization agent, such as DNA polymerase, RNA polymerase or reverse transcriptase) in an appropriate buffer and at a suitable temperature. It will be appreciated that certain nucleic acid molecules can serve as a "probe" and as a "primer." A primer, however, has a 3' hydroxyl group for extension. A primer can be used in a variety of methods, including, for example, polymerase chain reaction (PCR), reverse-transcriptase (RT)-PCR, RNA PCR, LCR, multiplex PCR, panhandle PCR, capture PCR, expression PCR, 3' and 5' RACE, in situ PCR, ligation-mediated PCR and other amplification protocols.

[0034] As used herein, "primer pair" refers to a set of primers that includes a 5' (upstream) primer that hybridizes with the 5' end of a sequence to be amplified (e.g., by PCR) and a 3' (downstream) primer that hybridizes with the complement of the 3' end of the sequence to be amplified.

III. Assay matrix for examining protease activities

[0035] The invention provides assay matrices ("assay media" or "substrate matrices") and related methods that can be used to screen for modified or variant proteases that have an altered or desired substrate specificity relative to a known or wild type protease (reference or starting protease). In various embodiments, the assay matrix contains a solid support onto which is immobilized a substrate moiety. Typically, the substrate moiety contains a probe moiety that harbors a designed peptide cleavage site that is desired to be specifically cleaved by a variant (or modified) protease of interest. The probe moiety can additionally contain a detectable label that will generate a detectable signal upon cleavage of the probe by a variant protease of interest. In addition to the probe moiety, the substrate moiety also contains a nucleotide moiety that is capable of priming replication of a polynucleotide sequence encoding a candidate variant protease.

[0036] Various solid support or matrix materials can be used to prepare the solid assay matrix of the invention. It can be any porous or non-porous material or matrix suitable for attaching macromolecules such as proteins, peptides, nucleic acids and the like. This includes, e.g., nylon, nitrocellulose, diazonitrocellulose, glass, silicon, polystyrene, polyvinyl chloride, polypropylene, polyethylene, dextran, sepharose, agar, starch, or any other material that allows for the immobilization of biomolecules. The material can be formed in filters, membranes, flat surfaces, tubes, channels, wells, sheets, beads, microspheres, columns, fibers (e.g., optical fibers) and the like. The solid support can also be multiwell tubes (such as microtiter plates) such as 12-well, 24-well, 48-well, 96-well, 384-well, and 1536-well plates. In some embodiments, the solid support is a particle or bead. Preferred beads are made of glass, latex, or a magnetic material (magnetic, paramagnetic, or supermagnetic beads). In some embodiments, the solid support can be a set of color coded microspheres such as those manufactured and sold by Luminex Corporation (Austin, Tex.).

[0037] When the solid support is particulate in this fashion, it is convenient to use individually addressable particulate structures that permit identification and separation of the structures. Individually addressable structures are known in the art and include magnetic beads, radio-frequency tagged particles, fluorescently labeled microspheres and the like. In such cases the presence of bound detection complex (comprising the inhibitor bound to the protease) can be detected by virtue of the presence of the label present on the inhibitor and the addressable moiety present on the structure. For example, a fluorescently labeled bead can be detected using flow cytometry, as discussed in more detail below. In some preferred embodiments, the solid support employed in the invention is a magnetic bead.

[0038] The substrate moiety can be bound covalently to the solid support by any technique or combination of techniques well known in the art. In some embodiments, the substrate moiety is conjugated to the solid support matrix via a linker moiety. The linker moiety can connect one or both of the probe moiety and the nucleotide moiety to the solid support. In various embodiments, the linker moiety is a chemical compound or small molecule group that reacts with both the substrate moiety and the solid support. For example, the linker can be a homobifunctional or heterobifunctional chemical group that connects the substrate moiety and the solid support. In some embodiments, the substrate moiety (e.g., the probe) is first functionalized or reacted with the linker moiety before reacting with the solid support. In these embodiments, the linker moiety can be considered part of the substrate moiety.

[0039] Any suitable compounds, e.g., peptides or organic compounds, can be used as the linker moiety in the assay matrices of the invention. In some embodiments, the linker moiety may be prepared from organic compounds such as alkyl chains, phenyl compounds, ethylene glycol, amides, esters and the like. In some embodiments, the linker moiety can typically comprise a chain length from 1 to about 100 atoms, more preferably, from 5 to about 30 atoms. Other examples of suitable linker moieties include, but are not limited to, ethanolamine, ethylene glycol, polyethylene with a chain length of 6 carbon atoms, polyethylene glycol with 3 to 6 repeating units, phenoxyethanol, propanolamide, butylene glycol, butyleneglycolamide, propyl phenyl chains, and ethyl, propyl, hexyl, steryl, cetyl, and palmitoyl alkyl chains.

[0040] The probe moiety in the substrate assay matrix of the invention typically contains a cleavage site that is designed or desired for a variant protease of interest to cleave. Thus, in various embodiments, the cleavage site can be a specific peptide sequence (probe peptide) that is recognized by a variant protease that one wishes to identify or select from a library of candidate variant proteases. For example, the cleavage site can be a peptide sequence that contains post-translational modifications (PTMs). As exemplified herein for identifying variant trypsin that exhibits citrullination-dependent protease activity, the probe moiety can include a peptide sequence that contains a deiminated arginine (citrulline). In addition to citrullination, the cleavage site can also contain other PTMs, e.g., arginine methylation, lysine acetylation or methylation.

[0041] To facilitate detection of cleavage of the probe moiety, the probe can additionally include one or more detection labels (label moieties) which can generate a detectable signal upon cleavage of the cleavage site. The detectable label can either itself be detected or can produce a detectable signal upon reacting with another molecule under appropriate conditions. It can be any molecule, functional group or chemical moiety that displays or provides a signal that can be readily detected and/or measured. These include, for example, radioactive isotopes, fluorescent labels, chemiluminescent labels, bioluminescent labels, and enzyme labels. The label moieties can also be haptens that are recognized by secondary reagents such as antibodies, peptides, direct chemical interactions, and other methods that are well known in the art. The label moiety may also be an oligonucleotide or nucleic acid that can be detected by hybridization, polymerization, ligation and/or amplification by methods well known in the art. The label moiety may also comprise two chromophores bound in close proximity to utilize a phenomenon called fluorescence resonance energy transfer (FRET).

[0042] In various embodiments, the label in the probe moieties of the invention can be a fluorescent molecule, a chemiluminescent molecule (e.g., chemiluminescent substrates), a phosphorescent molecule, a radioisotope, an enzyme substrate, an affinity molecule, a ligand, an antigen, a hapten, an antibody, an antibody fragment, a chromogenic substrate, a contrast agent, an MRI contrast agent, a positron emission tomography (PET) label (e.g., Technetium-99m and fludeoxyglucose), a phosphorescent label, and the like. However, the detectable label is preferably a small moiety such as a detectable atom (e.g., a radioactive isotope), a small organic molecule, or a small reactive chemical moiety or functional group, as opposed to bigger molecules such as enzymes or other polypeptides.

[0043] In various embodiments, the employed label moieties can be fluorophores, rhodamine moieties, and coumarin moieties (e.g., such as 7-amino-4-carbamoylcoumarin, 7-amino-3-carbamoylmethyl-4-methylcoumarin, or 7-amino-4-methylcoumarin). Examples of fluorophores that can be used in the invention include, e.g., fluorescein, fluorescein analogs, BODIPY-fluorescein, arginine, rhodamine 110, rhodamine-B, rhodamine-A, rhodamine derivatives, and the like. For further information on fluorescent label moieties and fluorescence techniques, see, e.g., Handbook of Fluorescent Probes and Research Chemicals, by Richard P. Haugland, Sixth Edition, Molecular Probes Inc. (1996). In some embodiments, the detectable label is a long wavelength fluorophore such as fluorescent dyes in the Alexa Fluor family (Thermo Fisher Scientific Inc.). In some exemplary embodiments, a fluorescein derivative such as 5-FAM-X-SE (6-(fluorescein-5-carboxamido)hexanoic acid, succinimidyl ester) can be used to label the probe moieties of the invention. This will result in the probe peptide being conjugated to a fluorophore, 6-(fluorescein-5-carboxamido)hexanoic acid (5-FAM-X).

[0044] In some preferred embodiments, the detectable label in the probe moieties can be a fluorogenic molecule as exemplified herein. Fluorogenic molecules are fluorophores in which fluorescence is activated by enzymatic activity, light, or environmental changes. In some embodiments, conjugation of the label moiety to the probe peptide results in quenching of the detectable signal (e.g., fluorescence) from the label moiety. For example, as exemplified herein, the cleavage site peptide sequence can be conjugated to rhodamine 110 (R110). As exemplified herein, these fluorogenic substrates contain an amino acid or peptide covalently linked to each of R110's amino groups. Upon enzymatic cleavage, the quenched nonfluorescent probe moiety is converted to R110, which displays a detectable fluorescence signal. In some other embodiments, the detectable label in the substrate moiety is a fluorophore that generates a fluorescent signal upon, e.g., light excitation.

[0045] In general, the detectable label or label moiety can be attached to the cleavage site (probe peptide) at any position. In some embodiments, the detectable label is linked to a side chain of the probe peptide. In some embodiments, the detectable label is attached to the N-terminal residue of the probe peptide. In some other embodiments, the detectable label is attached to the C-terminal residue of the probe peptide. As exemplified herein, some probe moieties of the invention have a detectable label that is coupled to the C-terminus of the probe peptide. Preparation of fluorophore-labeled peptide probes can be readily performed via protocols exemplified herein and/or methods well known in the art. See, e.g., Weder et al., J. Chromatogr. 698, 181, 1995; Cavrois et al., Nat. Biotechnol. 20, 1151-1154, 2002; and Marme et al., Angew. Chem. Int. Ed. Engl. 43, 3798, 2004. In general, labeled peptide probes can be prepared by either modifying isolated peptides or by incorporating the label during solid-phase synthesis. For example, fluorophores can be conjugated to the N- or C-terminus of a resin-bound peptide before other protecting groups are removed and the labeled peptide is released from the resin. Labeling of the peptide probes of the invention may also be achieved indirectly by using a biotinylated amino acid. In some other embodiments, the label moieties in the probe moieties of the invention are electroactive species for electrochemical detection or chemiluminescent moieties for chemiluminescent detection. UV absorption is also an optional detection method, for which UV absorbers are optionally used. Phosphorescent, colorimetric (e.g., dyes), and radioactive labels can also be optionally attached to the probe peptides.

[0046] The substrate moiety of the assay matrices of the invention typically further contains a nucleotide moiety that can mediate or prime the replication or amplification of a polynucleotide sequence encoding a library of candidate variant proteases. As detailed below, the assay matrices of the invention are useful for screening a library of candidate proteases for a variant protease of interest that is capable of cleaving the probe peptide of the substrate moiety. In these applications, polynucleotides encoding a library of candidate proteases are reacted with an assay matrix to generate a heterogeneous population of assay matrices that can then express the candidate variant proteases via emulsion in vitro transcription/translation (IVTT). A variant protease of interest that can specifically cleave the probe peptide of the substrate moiety can then be identified. The nucleotide moiety in the substrate moiety can be any compound that can direct amplification of the library of enzyme-encoding polynucleotide sequences. In some embodiments, the nucleotide moiety is an oligonucleotide PCR primer sequence as exemplified herein.

IV. Screening variant proteases for desired catalytic specificity

[0047] Utilizing the assay matrices described herein, the invention provides methods for screening a library of candidate variant proteases to identify a variant protease with desired catalytic specificity. The methods involve first constructing an assay matrix as described herein that is intended for screening variants of a specific protease (e.g., trypsin) to identify a variant of interest with desired substrate specificity. Thus, the substrate moiety on the assay matrix should contain a probe moiety or probe peptide having the desired cleavage site. In addition, the nucleotide moiety of the substrate moiety should be able to prime amplification of polynucleotide sequences encoding a library of variants of the chosen or target protease. The methods also require polynucleotide sequences encoding a library of candidate variant proteases from which a desired variant protease of interest is to be identified. Such polynucleotide sequences can be obtained by introducing mutations to the active site of the chosen protease, e.g., via PCR. As exemplified herein for identifying variant trypsin that recognizes citrullination-modified cleavage sites, variants are generated by randomizing two 4-amino acid solvent-exposed loops adjacent to the enzyme's active site. Each of the 2 loop mutant libraries (each 20^4 = 160,000 members) can be used to generate solid support libraries displaying a citrulline-bearing probe moiety.
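The library size stated above is a straightforward count: four fully randomized positions, each drawn from the 20 standard amino acids. A minimal Python sketch of that arithmetic follows; the function name and its parameters are illustrative and do not come from the specification.

```python
def loop_library_size(randomized_positions: int, alphabet_size: int = 20) -> int:
    """Count the distinct amino acid sequences of a fully randomized loop.

    Assumes every position can independently hold any residue of the alphabet.
    """
    if randomized_positions < 0 or alphabet_size < 1:
        raise ValueError("positions must be >= 0 and alphabet size >= 1")
    return alphabet_size ** randomized_positions


members_per_loop_library = loop_library_size(4)  # 20**4
print(members_per_loop_library)  # 160000
```

The count is at the amino acid level; the number of distinct DNA sequences encoding such a library depends on the codon scheme used, which this sketch does not model.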

[0048] Employing an appropriate library of variant proteases and also a desired cleavage site (or probe peptide) in the construction of the assay matrix, the methods of the invention can be applied toward various proteases to identify variants with any cleavage site specificity. The desired cleavage site can be any side chain in the substrate sequence. For example, it can be a site or side chain containing a PTM. The desired cleavage site can also be any one or a subset of the 20 biogenic amino acids. For example, the cleavage site can be at an amino acid residue with a charged side chain (acidic or basic amino acid residue), with an uncharged polar side chain, with a nonpolar side chain, with a beta-branched side chain or an aromatic side chain. In various embodiments, the substrate sequence contains (or is capable of sensing cleavage at) any of such desired amino acid side chains.

[0049] Proteases that can be employed in the methods of the invention can be any known class of protease capable of peptide bond hydrolysis. Candidate proteases for screening typically are wild type or modified or variant forms of a wild type candidate protease, or catalytically active portion thereof, including allelic variants and isoforms of any one protein. These include proteinases (endopeptidases) and peptidases (exopeptidases). In some preferred embodiments, the methods of the invention are employed for screening variants of a proteinase. Suitable proteinases include, e.g., the serine-, cysteine-, aspartic-, threonine- and metallo-type endopeptidases.

[0050] The library of variant proteases can be generated via various means. Combinatorial libraries can be prepared in accordance with methods routinely practiced in the art or the specific techniques exemplified herein. See generally, Combinatorial Libraries: Synthesis, Screening and Application Potential (Cortese Ed.) Walter de Gruyter, Inc., 1995; Tietze and Lieb, Curr. Opin. Chem. Biol., 2(3):363-71 (1998); Lam, Anticancer Drug Des., 12(3):145-67 (1997); Blaney and Martin, Curr. Opin. Chem. Biol., 1(1):54-9 (1997); and Schultz and Schultz, Biotechnol. Prog., 12(6):729-43 (1996). Methods and strategies for generating diverse libraries, including protease or enzyme libraries, including positional scanning synthetic combinatorial libraries (PSSCL), have been developed using molecular biology methods and/or simultaneous chemical synthesis methodologies. See, e.g., Georgiou, et al. (1997) Nat. Biotechnol. 15:29-34; Kim et al. (2000) Appl Environ Microbiol 66:788-793; MacBeath, G. P., et al. (1998) Science 279:1958-1961; Soumillion, P. L. et al. (1994) Appl. Biochem. Biotechnol. 47:175-189; Wang, I. et al. (1996) Methods Enzymol. 267:52-68; U.S. Pat. Nos. 6,867,010, 6,168,919, and U.S. Patent Application Publication No. 2006-0024289.

[0051] In some embodiments, the library of variant proteases can be generated via mutagenesis of a template or wild type enzyme. These include, e.g., random mutagenesis and focused mutagenesis. Random mutagenesis methods include, for example, use of E. coli XL1-red, UV irradiation, chemical modification such as by deamination, alkylation, or base analog mutagens, or PCR methods such as DNA shuffling, cassette mutagenesis, site-directed random mutagenesis, or error prone PCR (see, e.g., U.S. Application No. 2006-0115874). Such examples include, but are not limited to, chemical modification by hydroxylamine (Ruan, H., et al. (1997) Gene 188:35-39), the use of dNTP analogs (Zaccolo, M., et al. (1996) J. Mol. Biol. 255:589-603), or the use of commercially available random mutagenesis kits such as, for example, GeneMorph PCR-based random mutagenesis kits (Stratagene) or Diversify random mutagenesis kits (Clontech). Focused mutation can be achieved by making one or more mutations in a pre-determined region of a gene sequence, for example, in regions of the protease domain that mediate catalytic activity and/or substrate binding. For example, one or more amino acid residues in such regions of a protease can be mutated using any standard single or multiple site-directed mutagenesis kit such as, for example, QuikChange (Stratagene). In some embodiments, one or more amino acid residues of a protease can be mutated by saturation mutagenesis (see, e.g., Zheng et al., Nucl. Acids Res. 32:115, 2004). In some embodiments, the chosen residues for mutagenesis are outside the active site of the enzyme, as exemplified herein for trypsin. In some other embodiments, the mutation may be made to residues in the active site of the enzyme. For example, residues that form the S1-S4 pocket of a protease (where the protease is in contact with the P1-P4 residues of the peptide substrate) and/or that have been shown to be important determinants of specificity may be mutated to every possible amino acid, either alone or in combination. Generally, substrate specificity and active site of a protease can be determined by molecular modeling based on three-dimensional structures of the complex of a protease and substrate. See, for example, Wang et al., Biochemistry 40(34):10038, 2001; Hopfner et al., Structure Fold Des. 7(8):989, 1999; Friedrich et al., J Biol Chem 277(3):2160, 2002; and Waugh et al., Nat Struct Biol. 7(9):762, 2000.

[0052] The methods for detecting or monitoring a protease catalytic activity, as well as the methods for screening variant proteases with a defined substrate specificity, can all be carried out in accordance with the steps exemplified herein or protocols well known in the art. As exemplified herein, the screening methods of the invention can typically be performed in a high throughput format. Any of the conventional techniques and equipment known in the art for screening a large number of compounds (e.g., automated library screening) can be employed in this screening assay of the present invention. In some embodiments, the methods are directed to screening for variant enzymes that have PTM-dependent protease activities, as exemplified herein for identifying citrulline-dependent enzymes. To screen for a variant protease of interest, the library of enzyme-encoding polynucleotides is contacted with the assay matrix. The mixtures are then subjected sequentially to emulsion PCR (emPCR) and emulsion in vitro transcription/translation (emIVTT).

[0053] In some embodiments, the variant protease sequences and the assay matrix are diluted such that on average each droplet contains around 0.3 polynucleotide per assay matrix (e.g., per magnetic bead). In some embodiments, the emulsion PCR (emPCR) is thermally cycled to generate assay matrix that becomes clonally populated with about 1 k copies of the enzyme-encoding polynucleotide sequence. Typically, after emPCR, the assay matrix is harvested and washed before being emulsified with in vitro transcription/translation (IVTT) reagents. Upon expression, candidate variant proteases that are able to cleave the probe peptide in the substrate moiety will generate a detectable signal (e.g., a fluorescent signal) that can be readily detected, e.g., via FACS. The identified assay matrix can then be subjected to further thermal cycling to amplify the bound polynucleotide for activity assay and sequencing.
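The reason for diluting to roughly 0.3 polynucleotide per bead can be illustrated numerically. The sketch below assumes that template loading per bead-containing droplet follows a Poisson distribution, which is a common model for limiting dilution but is an assumption here, not a statement from the specification. The function names are hypothetical.

```python
import math


def poisson_pmf(k: int, mean: float) -> float:
    """Probability of observing exactly k templates when the average is `mean`."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def loading_fractions(mean_templates_per_bead: float) -> dict:
    """Fractions of beads with 0, 1, or more than 1 template (Poisson assumption)."""
    empty = poisson_pmf(0, mean_templates_per_bead)
    single = poisson_pmf(1, mean_templates_per_bead)
    multiple = 1.0 - empty - single
    return {
        "empty": empty,
        "single": single,
        "multiple": multiple,
        # Among beads that received any template, the share that is clonal.
        "clonal_among_loaded": single / (1.0 - empty),
    }


fractions = loading_fractions(0.3)
for name, value in fractions.items():
    print(f"{name}: {value:.3f}")
# At a mean of 0.3, roughly 74% of beads are empty, 22% carry one template,
# and about 4% carry more than one.
```

Under this model most loaded beads start from a single template, which is what allows emPCR to populate each bead clonally; the cost is that most beads remain empty.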

[0054] In some methods of the invention, emulsion of the enzyme-encoding polynucleotide sequences and the assay matrix is performed with an emulsion formulation, e.g., a water-in-oil emulsion. The emulsion formulation typically contains a continuous phase and a dispersed phase. In some embodiments, the continuous phase is an oil or silicone fluid. In some embodiments, the dispersed phase is aqueous. For example, the emulsion formulation can contain an aqueous phase containing the appropriate biochemical reaction mixture, e.g., reagents for the PCR and IVTT reactions. In various embodiments, the emulsion formulation contains a continuous phase-solubilized stabilizer (e.g., silicone hydrophobe surfactant or hydrocarbon hydrophobe surfactant) and an aqueous phase-soluble stabilizer. In some embodiments, the continuous/oil phase stabilizer is present during both the emPCR and the emIVTT reactions, while the dispersed/aqueous phase stabilizer is used only in the emPCR reaction to enable generation of thermally stable emulsions. As specific exemplification, the continuous/oil phase stabilizer employed in some of these embodiments can be KF-6038, and the employed dispersed/aqueous phase stabilizer can be KF-6012.

[0055] The methods disclosed herein have significant advantages that are applicable to any type of PTM for which a protease can be evolved to recognize for proteolysis. Using variant trypsin enzymes described herein as example, digestion of purified protein with the PTM-dependent protease and low-resolution mass analysis can provide some immediate insight, particularly if the PTM removes a tryptic cleavage site (e.g., arginine deimination or methylation, lysine acetylation or methylation, etc.). Comparative digestion of a purified protein with either mutant protease or trypsin yields mass spectral data that are less dense and therefore amenable to manual analysis. For higher throughput proteomic approaches, such as LC-MS/MS, cleavage at the PTM site potentially generates unique signals in the MS2 data that flag the parent peptide for further scrutiny. Similar to the ubiquitous and uniform neutral loss of phosphate (80 or 98 Da) observed upon collision-activated dissociation (CAD) and MS2 analysis of certain phosphopeptides, PTM-dependent proteolysis generates peptides that should in theory yield a consistent y1 ion derived specifically from the PTM that the protease recognizes (e.g., the 176.103 ion exemplified herein corresponding to a citrulline y1 ion). This signature dramatically simplifies data searching in otherwise highly complex and dense mass spectrometric proteomic data sets.
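The 176.103 value can be reproduced from elemental compositions. A singly protonated y1 ion consists of the C-terminal residue plus water plus one proton, which equals the free amino acid mass plus a proton. The sketch below uses standard monoisotopic atomic masses; the constant and function names are illustrative choices, not part of the specification.

```python
# Monoisotopic atomic masses (Da) and the proton mass.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}
PROTON = 1.00727647

# Elemental compositions of the free amino acids.
CITRULLINE = {"C": 6, "H": 13, "N": 3, "O": 3}
ARGININE = {"C": 6, "H": 14, "N": 4, "O": 2}


def monoisotopic_mass(composition: dict) -> float:
    """Sum the monoisotopic masses of all atoms in a composition."""
    return sum(MONOISOTOPIC[element] * count for element, count in composition.items())


def y1_mz(free_amino_acid: dict) -> float:
    """m/z of a singly charged y1 ion: C-terminal residue + H2O + H+."""
    return monoisotopic_mass(free_amino_acid) + PROTON


print(round(y1_mz(CITRULLINE), 3))  # 176.103
print(round(y1_mz(ARGININE), 3))    # 175.119
```

A peptide ending in citrulline therefore yields a y1 ion about 0.984 m/z units above the familiar arginine y1 ion of tryptic peptides.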

V. Variant trypsin enzymes with desired recognition specificity

[0056] Trypsin is a serine protease from the PA clan superfamily, found in the digestive system of many vertebrates where it hydrolyses proteins. Trypsin is formed in the small intestine when its proenzyme form, the trypsinogen produced by the pancreas, is activated. Trypsin cleaves peptide chains mainly at the carboxyl side of the amino acids lysine or arginine, except when either is followed by proline. It is used for numerous biotechnological processes. The process is commonly referred to as trypsin proteolysis or trypsinization, and proteins that have been digested/treated with trypsin are said to have been trypsinized. The aspartate residue (D189) located in the catalytic pocket (S1) of trypsin is responsible for attracting and stabilizing positively charged lysine and/or arginine, and is, thus, responsible for the specificity of the enzyme.
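The canonical trypsin cleavage rule described above (cut after lysine or arginine unless the next residue is proline) is simple to express as an in silico digest. The following Python sketch is illustrative only: the function name and the example sequence are hypothetical, and it models complete digestion with no missed cleavages.

```python
from typing import List


def tryptic_digest(sequence: str) -> List[str]:
    """Split a one-letter peptide sequence after K or R, except before P."""
    if not sequence:
        return []
    peptides = []
    start = 0
    for i, residue in enumerate(sequence):
        is_last = i == len(sequence) - 1
        # Cleave on the carboxyl side of K/R unless proline follows.
        if residue in "KR" and not is_last and sequence[i + 1] != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    peptides.append(sequence[start:])  # remaining C-terminal peptide
    return peptides


# Hypothetical sequence chosen to show the proline exception.
print(tryptic_digest("AKRPGRTK"))  # ['AK', 'RPGR', 'TK']
```

In the example the arginine at position 3 is followed by proline and is therefore not cleaved, which is why "RPGR" appears as a single peptide.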

[0057] The invention provides specific trypsin variants that are capable of cleaving citrulline-modified peptide sequences. Relative to wild type trypsin, the trypsin variants of the invention contain amino acid mutations D189S and one or more additional mutations L7P, E185K and K188A, or conservative substitutions thereof. In some embodiments, the trypsin variant of the invention can include amino acid mutations D189S and L7P, or conservative substitutions thereof. Some of these trypsin variants can additionally contain mutation E185K or K188A, or conservative substitution thereof. In some preferred embodiments, the trypsin variant of the invention can include amino acid mutations D189S, E185K, K188A and L7P, or conservative substitutions thereof. In addition to these amino acid residue substitutions, the trypsin variants can have an amino acid sequence that is at least 75%, 80%, 85%, 90%, 95% or 99% identical to the wild type trypsin sequence. Some trypsin variants of the invention are trypsin fragments that contain one or more of the noted amino acid substitutions at the active site (D189S, E185K and K188A).

[0058] The variant trypsin enzymes of the invention exhibit strong citrulline-dependent protease activities. As exemplified herein, these variant trypsin molecules were identified by emulsifying a library of variant trypsin encoding sequences with a protease assay matrix bearing a citrulline activity probe. The library was then challenged in emIVTT, and analyzed by FACS as described above. Highly fluorescent beads (53 beads) were individually sorted, amplified and the PCR amplicons assayed for citrulline-dependent proteolytic activity using the same IVTT reaction mixture and free-solution R110 protease activity probe of the emIVTT challenge. Once cloned, sequenced, and recombinantly expressed, the identified variant trypsin molecules were subjected to formal kinetic analysis. Using both citrulline- and arginine-containing fluorogenic substrates, it was found that one trypsin variant bearing L7P, E185K, K188A, D189S ("B06") has a catalytic efficiency of kcat/KM = 7 x 10^[…] M^-1 s^-1, which compares favorably with the wild type enzyme's catalytic efficiency of kcat/KM ~ 1.5 x 10^[…] M^-1 s^-1 for the corresponding arginine-containing substrate. The mutant enzyme's citrulline-dependent proteolysis is 7 x 10^[…]-fold higher than that of trypsin.

[0059] The variant trypsin proteases of the invention have advantageous utilities that were not previously available. As exemplified herein in MS/MS analysis experiments, PAD4, one of the enzymes responsible for arginine deimination, can also deiminate its own arginines ("autodeimination") under conditions of high […]. Digestion of the autodeiminated PAD4 (adPAD4) with the B06 protease followed by LC-MS/MS analysis of the digest revealed a series of peptides that unambiguously displayed characteristics of citrullination. Extracted ion chromatograms contained peaks corresponding to both the deiminated and unmodified peptide, with appropriate relative retention times (citrullinated at later retention time relative to the unmodified peptide). Significantly, the citrulline-dependent proteolytic cleavage that the B06 protease catalyzes results in unique peptide fragments that bear a C-terminal citrulline at modified sites. Fragmentation of these peptides in MS2 analysis results in a y1 ion at m/z = 176.103 Da, corresponding to citrulline, a constant signature. Using this signature, a single LC-MS/MS analysis of the adPAD4 B06 digest yielded 66 unique peptides covering 10 of 12 previously disclosed sites of citrullination. XIC analysis above confirmed the presence of the unmodified form of each 176.103-containing peptide.

[0060] The citrulline-dependent proteases of the invention greatly simplify the detection of a protein's sites of citrullination. While there are abundant tools for isolating citrullinated proteins (e.g., anti-citrulline peptide antibodies and citrulline-specific chemical probes), these tools only facilitate pull-down of the target. Identification of the specific citrullinated residue requires sequencing. Typically, citrulline-containing peptides can be identified as tryptic off-target cleavages at citrulline, but the intensity is very low because trypsin's citrulline-specific activity is almost immeasurable. They can also be detected directly as citrullinated peptides, but the mass difference between citrulline and arginine is only 0.984 Da, and chemical decomposition products can disguise citrulline-containing peptides as tryptic peptides. Other methods include variants on citrulline-specific modification using phenylglyoxal's citrulline-specific reactivity that include mass tagging, but the resulting fragmentation patterns tend to be extremely complicated and do not lend themselves to unbiased data mining. Instead, the citrulline-specific enzymes described herein can directly cleave at citrullinated residues, resulting in fragments that terminate in citrulline. In addition, the residual tryptic activity of the variant enzymes also permits them to cleave at arginine. Though the two peptides are 0.984 Da different, they are chromatographically distinct, and upon resolution in LC, the two peptides elute at different retention times and exhibit distinct isotopic distributions. These sorts of data dramatically increase confidence in PTM identification.
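The 0.984 Da figure follows from the elemental change on deimination: converting arginine (C6H14N4O2) to citrulline (C6H13N3O3) adds one oxygen and removes one nitrogen and one hydrogen. The Python sketch below works through that arithmetic and shows how the shift shrinks on the m/z axis for multiply charged peptides. Names are illustrative, and the charge-state calculation is an added illustration rather than something stated in the specification.

```python
# Monoisotopic atomic masses (Da).
MASS_O = 15.9949146
MASS_N = 14.0030740
MASS_H = 1.00782503


def deimination_mass_shift() -> float:
    """Mass gained when arginine becomes citrulline: +O, -N, -H."""
    return MASS_O - MASS_N - MASS_H


def mz_shift(charge: int) -> float:
    """Observed m/z difference between modified and unmodified peptide ions."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return deimination_mass_shift() / charge


print(round(deimination_mass_shift(), 3))  # 0.984
for z in (1, 2, 3):
    print(z, round(mz_shift(z), 3))
```

At charge 2 the separation is already under 0.5 m/z units, which gives a sense of why identification by mass alone is demanding and why a cleavage-based signature is attractive.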

EXAMPLES

[0061] The following examples are offered to illustrate, but not to limit, the present invention.

Example 1 Compartmentalized Evolution of a Citrulline-Dependent Protease

[0062] We devised a compartmentalized microbead display strategy for high-throughput mutant protease expression and activity-based screening. Activity assay "probe-primer" beads (Fig. 1, Upper) display an oligonucleotide PCR primer and a bisamide rhodamine 110 (R110) protease activity assay probe (Leytus et al., Biochem. J. 215(2):253-260, 1983). Probe-primer beads and a trypsinogen mutant gene library were used in emulsion PCR (emPCR) to generate beads clonally populated with ~10,000 copies of a gene library member (probe-gene beads) followed by emulsion in vitro transcription/translation (emIVTT) for highly parallel, low-volume protein expression and activity assay. FACS analysis isolated single beads encoding active proteases that dequenched bead-bound R110 probes (Fig. 1, Lower).

[0063] Control tryptic probe-primer beads displaying (acGPR)2R110 that were digested with MS-grade or translated and enterokinase (EK)-activated trypsin and analyzed by flow cytometry exhibited good separation from untreated and unfunctionalized beads. With feasibility established, citrulline-dependent activity assay probe-primer beads displaying (acIPcit)2R110 were elaborated with an EGGK(185-188)XXXX/D189S (SEQ ID NO:9) trypsinogen mutant library. This library explored the active site-proximal specificity-determining surface-exposed loop 1 in the context of a fixed D189S, another key residue in the base of the active site that was found to confer basal citrulline-dependent cleavage activity. emIVTT and FACS screening of 1.3 x 10^6 (acIPcit)2R110 probe-gene library beads yielded 53 events, 21 successful sorts to individual quantitative PCRs (qPCRs), and 15 clean amplifications. One sample exhibited enhanced citrulline-dependent proteolysis in a real-time IVTT activity assay. The screening lead contained the citrulline-active loop 1 mutant EGGK(185-188)KGGA (i.e., EGGK185-188 (SEQ ID NO:19) to KGGA (SEQ ID NO:15) mutation) and one additional serendipitous mutation, L7P, in the posttranslationally cleaved N-terminal propeptide. The L7P mutation conferred no detectable citrulline-dependent cleavage activity to trypsin by real-time IVTT assay, although L7P increased real-time IVTT reaction progress of trypsin using a tryptic substrate. Screening of either loop 2 QKNK(221-224)XXXX (i.e., QKNK221-224 (SEQ ID NO:20) to XXXX (SEQ ID NO:16) substitutions) or rescreening loop 1 with fixed L7P did not yield more compelling leads, though L7P modestly increased the hit rate. The confirmed mutant L7P + EGGK(185-188)KGGA/D189S (SEQ ID NO:10) was declared trypsin+cit (referred to as "B06" above) to reflect its trypsin ancestry and citrulline-dependent activity.
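The screening numbers above can be put in context with a rough oversampling estimate. The Python sketch below assumes that each screened bead carries one library member drawn uniformly at random from the 160,000 amino acid level variants of loop 1. That uniformity is an assumption made only for illustration: the example does not describe how evenly the library was represented, and codon bias would lower real coverage. Function names are hypothetical.

```python
import math


def oversampling(library_size: int, beads_screened: float) -> float:
    """Average number of beads screened per library member."""
    return beads_screened / library_size


def expected_coverage(library_size: int, beads_screened: float) -> float:
    """Expected fraction of members seen at least once (uniform sampling)."""
    return 1.0 - math.exp(-beads_screened / library_size)


LIBRARY_SIZE = 20 ** 4   # 160,000 loop 1 variants
BEADS_SCREENED = 1.3e6   # probe-gene beads analyzed by FACS
SORT_EVENTS = 53         # events reported from the screen

print(oversampling(LIBRARY_SIZE, BEADS_SCREENED))                  # 8.125
print(round(expected_coverage(LIBRARY_SIZE, BEADS_SCREENED), 4))   # 0.9997
print(f"{SORT_EVENTS / BEADS_SCREENED:.1e}")                       # 4.1e-05 of beads sorted
```

Under these idealized assumptions the screen sampled the loop 1 library about eightfold, so nearly every variant would be expected to appear at least once.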

Example 2 Expression and Characterization of Trypsin+cit

[0064] Overexpression of trypsin+cit in Escherichia coli BL21 was readily observed in the insoluble fraction by SDS/PAGE analysis. Solubilized, purified, and refolded trypsin+cit was observed as the singly charged molecular ion [M + H]⁺ = 26,888 and doubly charged molecular ion [M + 2H]²⁺ = 13,432 in MALDI-TOF MS analysis before EK activation (full-length zymogen), and as [M + H]⁺ = 23,337 and [M + 2H]²⁺ = 11,638 after activation (active enzyme). Overexpression of wild-type trypsin was unsuccessful under otherwise identical conditions. Installation of L7P alone onto trypsin was sufficient to generate overexpression product by SDS/PAGE, and the P7L trypsin+cit revertant abolished overexpression product.

[0065] Digestion of fluorogenic peptide probes, black hole quencher (BHQ)-IPcitAA-fluorescein (FAM) and BHQ-IPRAA-tetramethylrhodamine (TMR), with trypsin, trypsin+cit, or an activator control (EK) and fluorescence monitoring confirmed specific enzyme activities on more conventional substrates: after 25 min incubation, only the trypsin+cit digest of BHQ-IPcitAA-FAM produced FAM signal; only the tryptic digest of BHQ-IPRAA-TMR produced TMR signal; and the EK control produced no signal. LC-MS/MS analysis confirmed each enzyme's site-specific cleavage activity. Only the trypsin+cit digest of BHQ-IPcitAA-FAM contained extracted ion chromatogram (XIC) features corresponding to new N- and C-terminal products of cleavage at citrulline (1,031.58⁺ and 908.08⁺ m/z, respectively). Tryptic and control digest XIC only contained the full-length probe (9[...] and EK control digests of the BHQ-IPRAA-TMR probe yielded only full-length probe (987.75²⁺ m/z) in the respective XICs, whereas the tryptic digest XIC contained new N- and C-terminal products of cleavage at Arg (1,030.58⁺ and 962.33⁺ m/z, respectively); the C-terminal fragment exists as three regioisomeric TMR coupling products, which are chromatographically distinct but otherwise indistinguishable by MS. Trypsin+cit exhibited greatly enhanced citrulline-dependent activity in steady-state kinetic analyses. The catalytic efficiency (kcat/KM) of BHQ-IPcitAA-FAM cleavage by trypsin+cit was 6.9 x 10^[...] M⁻¹ s⁻¹, and that of BHQ-IPRAA-TMR cleavage was 63 x [...] (12-fold substrate selectivity). The catalytic efficiency of BHQ-IPRAA-TMR cleavage by trypsin was 3.9 x 10^[...] M⁻¹ s⁻¹. Tryptic cleavage of BHQ-IPcitAA-FAM was undetectable.

Example 3 Trypsin+cit Digestion Introduces Diagnostic Mass Signatures of Citrullination for PTM Site Mapping

[0066] Trypsin+cit digestion of protein arginine deiminase 4 (PAD4) in its native and autodeiminated state (adPAD4, Ca²⁺-treated PAD4) produced notably different MALDI-TOF MS spectra, which were also dramatically different from feature-poor spectra of tryptic digests. Qualitatively, the tryptic digest spectra of PAD4 and adPAD4 are nearly identical. However, trypsin+cit digestion of adPAD4 produced numerous new features (428 vs. 173 with S/N > 3.5, intensity > 5% base peak). A magnified view displays two example features unique to the PAD4 and adPAD4 trypsin+cit digest. The 1,387 m/z ion (534-TLREHNSFVER-544) (SEQ ID NO:17) was prevalent in both PAD4 and adPAD4 trypsin+cit digests, yet undetectable in the tryptic digest. The peptide's sequence yielded its molecular formula and thereby the theoretical abundance of each isotopic peak based on the binomial distribution and natural abundances of each element. The theoretical and observed isotopic envelopes agreed for both PAD4 and adPAD4 versions of the 1,387 m/z ion. The isotopic envelope of the 1,400 m/z ion (363-TEPVVFDSPRNR-374) (SEQ ID NO:8), another new trypsin+cit feature, agreed with the expected theoretical envelope. The same feature in the adPAD4 spectrum contained a low-intensity molecular ion and high-intensity first isotope, inconsistent with the expected theoretical envelope for a peptide of this mass, where the highest intensity peak should be the molecular ion; instead, it agreed more closely with the overlapping envelopes of two peptides whose molecular ions are shifted by ~1 Da, suggesting the presence of an unmodified and citrullinated (Δm = 0.984 Da) form of the same peptide.

[0067] LC-MS/MS investigation of the adPAD4 trypsin+cit digest confirmed citrullination of the 1,400 m/z peptide and revealed other diagnostic signatures of citrullination.
Database matching returned S3 unique peptides from the tryptic adPAD4 digest (24 peptides contained C-terminal Arg) and 91 unique peptides from the trypsin+cit digest (34 peptides contained C-terminal Arg). The 1,387 m/z ion (534-TLREHNSFVER-544) appeared in both tryptic and trypsin+cit digest analyses, the XIC contained only one chromatographic peak (463.5687³⁺ m/z), and the MS1 isotopic envelope agreed with the unmodified mass. In contrast, the XIC of the 1,400 m/z ion contained two chromatographic peaks. The two MS1 ions (467.5934³⁺ and 467.9206³⁺ m/z) differed by 0.9816 Da, in close agreement with the mass shift of citrullination. Another peptide ion unique to the adPAD4 trypsin+cit digest (211-VRVFQATR-218) also yielded two distinct XIC peaks (488.7870²⁺ and 489.2799²⁺ m/z), the second peak's MS1 again reflecting the mass shift of citrullination. The MS2 spectrum of this peptide contained 176.1030 m/z, the y1 citrulline fragment, y1(cit), and its ammonia neutral loss, 159.0764 m/z. A search for all MS1 precursors generating the 176.1030/159.0765 MS2 citrullination signatures yielded 66 unique MS1 ions. Of these 66, 28 produced two-peak XIC, the earlier eluting peak corresponding to the unmodified peptide and the later peak corresponding to the C-terminal citrullinated peptide. The remaining 38 peptides contained no arginine pair peaks, generated uninterpretable XIC, or yielded no sequence data. Of the 28 citrullinated species, 22 were unique peptides corresponding to 12 sites of modification: 205, 212, 218, 372, 374, 383, 394, 427, 441, 488, 495, and 639. The XIC for R639cit contained no detectable peak corresponding to the unmodified peptide.
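The mass-shift arithmetic in this paragraph can be checked directly. The sketch below is illustrative only: the charge states (3+ and 2+) are read from the garbled superscripts in the source and are consistent with the nominal peptide masses, and the monoisotopic mass of ammonia is a standard value not given in the text.

```python
# Mass shift of deimination (Arg -> citrulline) as stated in the text.
CITRULLINATION_SHIFT_DA = 0.984
# Monoisotopic mass of NH3 (standard value, not from the source text).
AMMONIA_DA = 17.026549

def neutral_mass_shift(mz_unmodified, mz_modified, charge):
    """Convert an m/z difference between two ions of equal charge to a mass difference."""
    return (mz_modified - mz_unmodified) * charge

# (unmodified m/z, modified m/z, assumed charge) for the two XIC peak pairs.
PEAK_PAIRS = {
    "363-TEPVVFDSPRNR-374": (467.5934, 467.9206, 3),
    "211-VRVFQATR-218": (488.7870, 489.2799, 2),
}

shifts = {
    peptide: neutral_mass_shift(low, high, z)
    for peptide, (low, high, z) in PEAK_PAIRS.items()
}

for peptide, shift in shifts.items():
    error = shift - CITRULLINATION_SHIFT_DA
    print(f"{peptide}: shift = {shift:.4f} Da (deviation {error:+.4f} Da)")

# y1(cit) fragment and its ammonia neutral loss.
Y1_CIT_MZ = 176.1030
y1_cit_minus_ammonia = Y1_CIT_MZ - AMMONIA_DA
print(f"y1(cit) - NH3 = {y1_cit_minus_ammonia:.4f} m/z")
```

With these inputs the first pair reproduces the 0.9816 Da difference quoted above, and the neutral-loss ion lands on the reported 159.076 m/z signature.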

Example 4 Trypsin+cit Digestion Identifies Citrullination in PAD4-Treated Fibrinogen and Increases Sequence Coverage

[0068] Trypsin+cit digestion of native fibrinogen alpha, beta, and gamma chains (FGA, FGB, FGG) and PAD4-citrullinated fibrinogen (cFGA, cFGB, cFGG) produced analogous citrullination-dependent mass spectral signatures. The novel trypsin+cit cleavages increased sequence coverage compared with the tryptic digest for the heavily modified cFGA chain (68% vs. 52% in the tryptic digest), as well as for cFGB and cFGG chains (88% vs. 77% and 78% vs. 75%, respectively). Mascot and X!Tandem database searches of the two LC-MS/MS analyses (trypsin or trypsin+cit digestion of cFG) generated two lists of peptides for filtering via direct assignment of Arg modification, y1(cit) detection, or both. The MS2 fragmentation product ion y1(cit) is a unique feature of all C-terminally citrullinated peptides, though not all sites of citrullination were assigned as a result of a C-terminal citrulline; some sites were internal in the trypsin+cit sample. All citrullinated sites were observed internal in the tryptic digest. The larger sequence space of cFG also allowed an analysis of trypsin+cit cleavage junction P1 and P1' residues. In addition to citrulline, Arg and Lys were the most common P1 residues. Cleavage products also frequently contained C-terminal Tyr and Gln. The P1' residue profiles of trypsin+cit and trypsin are fairly similar; P1' Pro and Trp appeared to be particularly disfavored. XIC analysis of the cFG trypsin+cit digest peptides (modified and unmodified forms) that appeared in both database searches (Mascot and X!Tandem) as citrullinated usually resulted in only a single chromatographic peak corresponding to the modified form. XIC analysis of FG trypsin+cit digest peptides yielded peaks corresponding to the unmodified form of the peptide for most sites. The cFG trypsin+cit digest yielded 49 unique peptides that cross-validated between two database searches as containing a deiminated Arg. Among the 18 FGA sites, 12 contained y1(cit) and 1 was previously unreported (R621). Among the 6 FGB sites, 1 contained y1(cit) and 1 was previously unreported (R376). For the 1 FGG site, y1(cit) was not detected. Previously identified citrullination sites not sequenced by trypsin+cit included seven in FGA (287, 308, 353, 367, 394, 404, 425), four in FGB (121, 124, 285, 410), and two in FGG (40, 282). The cFG tryptic digest yielded 35 unique peptides (none containing C-terminal citrulline) mapping to P FGA sites, 6 FGB sites, and 1 FGG site. Previously identified citrullination sites not sequenced by trypsin included eight FGA sites (308, 353, 367, 394, 404, 425, 426, 510) and four FGB sites (121, 124, 285, 410).
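The two lists of previously identified but unsequenced sites can be compared with simple set operations. This is a sketch for orientation only; it treats a site that is absent from a digest's "not sequenced" list as not missed by that digest, which is an interpretation of the text rather than a stated result.

```python
# Previously identified citrullination sites NOT sequenced by each digest,
# transcribed from the paragraph above.
MISSED_BY_TRYPSIN_CIT = {
    "FGA": {287, 308, 353, 367, 394, 404, 425},
    "FGB": {121, 124, 285, 410},
    "FGG": {40, 282},
}
MISSED_BY_TRYPSIN = {
    "FGA": {308, 353, 367, 394, 404, 425, 426, 510},
    "FGB": {121, 124, 285, 410},
    "FGG": set(),  # no FGG sites are listed for the tryptic digest
}

def compare_missed_sites(missed_a, missed_b):
    """Per chain: sites missed only by A, only by B, and by both digests."""
    report = {}
    for chain in sorted(set(missed_a) | set(missed_b)):
        a = missed_a.get(chain, set())
        b = missed_b.get(chain, set())
        report[chain] = {
            "only_a": a - b,
            "only_b": b - a,
            "both": a & b,
        }
    return report

report = compare_missed_sites(MISSED_BY_TRYPSIN_CIT, MISSED_BY_TRYPSIN)
for chain, sets in report.items():
    print(
        f"{chain}: missed only by trypsin+cit {sorted(sets['only_a'])}, "
        f"missed only by trypsin {sorted(sets['only_b'])}, "
        f"missed by both {sorted(sets['both'])}"
    )
```

Read this way, the two digests miss largely the same FGA and FGB sites, and the lists differ at FGA 287, 426, and 510 and at FGG 40 and 282.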

Example 5 Materials and Methods

[0069] Emulsification. PCRs contained PCR buffer (1x), dNTPs (200 µM each), PEG-8000 (4.8 mM), butanediol (5% [vol/vol]), KF-6012 (0.02% [vol/vol]; ShinEtsu), Taq (0.3 U/µL), oligonucleotide primer 5'-TGGGTCGGGCGTAGAGGATC-3' (SEQ ID NO:11), EGGK(185-188)XXXX/D189S (SEQ ID NO:9) library (800 pg), and (acIPcit)2R110 probe-primer beads (5 x 10⁶). IVTT reactions (New England BioLabs) contained beads from emPCR, solution A (74.7 µL), solution B (56 µL), disulfide enhancers 1 and 2 (14.9 µL each), and EK (1/200 dilution, 4.1 µL). The reaction (150 µL) was combined with ice-cold oil (76% [wt/vol] DMF-A-6cs; 20% [wt/vol] mineral oil; 4% [wt/vol] KF-6038; 600 µL) and a stainless steel bead (6 mm diameter). Sample was loaded into a homogenizer (TissueLyser; QIAGEN), emulsified (10 s, 15 Hz; 10 s, 17 Hz), and placed on ice. Aliquots of emPCR (50 µL) were transferred to PCR tubes for thermal cycling ([95 °C, 20 s; 68 °C, 90 s] x 30 cycles; 68 °C, 600 s). The emIVTT samples were incubated (67 h, 37 °C) and protected from light.
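As a quick planning aid, the emulsion and cycling parameters above translate into a phase ratio, an aliquot count, and a minimum thermal cycler run time. The sketch assumes the volume units are microlitres and ignores ramp times and any emulsion lost during handling; the variable names are illustrative.

```python
# Parameters from the emulsification paragraph (volumes in microlitres).
AQUEOUS_UL = 150.0
OIL_UL = 600.0
ALIQUOT_UL = 50.0

# Thermal cycling: [95 C, 20 s; 68 C, 90 s] x 30 cycles; 68 C, 600 s.
CYCLES = 30
STEP_SECONDS = (20, 90)
FINAL_EXTENSION_SECONDS = 600

def oil_to_aqueous_ratio(oil_ul, aqueous_ul):
    """Volume ratio of oil phase to aqueous phase."""
    return oil_ul / aqueous_ul

def max_aliquots(total_ul, aliquot_ul):
    """Largest whole number of aliquots, assuming no handling loss."""
    return int(total_ul // aliquot_ul)

def hold_time_seconds(cycles, step_seconds, final_seconds):
    """Sum of programmed hold times, excluding temperature ramps."""
    return cycles * sum(step_seconds) + final_seconds

ratio = oil_to_aqueous_ratio(OIL_UL, AQUEOUS_UL)
aliquots = max_aliquots(AQUEOUS_UL + OIL_UL, ALIQUOT_UL)
run_seconds = hold_time_seconds(CYCLES, STEP_SECONDS, FINAL_EXTENSION_SECONDS)

print(f"oil:aqueous = {ratio:.0f}:1")
print(f"up to {aliquots} aliquots of {ALIQUOT_UL:.0f} uL")
print(f"programmed hold time: {run_seconds} s ({run_seconds / 60:.0f} min)")
```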

[0070] Real-Time IVTT Assay. Purified PCR product from single-bead amplifications (5 ug) or plasmid template (22 ng) was added to IVTT reactions (5 µL, above) and either (acIPcit)2R110 (3 mM) or (cbzIPR)2R110 (3 mM). Reactions were incubated (37 °C, 1,000 min) with fluorescence monitoring (channel 1, CFX96; Bio-Rad).

[0071] Materials Sources. All reagents were obtained from Sigma Aldrich unless otherwise specified. Fmoc-Rink-Amide MBHA Resin (AnaSpec), Fmoc-Gly Wang resin 100-200 mesh (EMD Millipore), 1-[(1-(cyano-2-ethoxy-2-oxoethylideneaminooxy) dimethylaminomorpholino)] uronium hexafluorophosphate (COMU; Acros Organics), N,N'-diisopropylcarbodiimide (DIC; Acros Organics), 1-hydroxy-7-azabenzotriazole (HOAt), N,N-diisopropylethylamine (DIEA; Thermo Fisher Scientific), piperidine, Fmoc-PEG2-OH (AnaSpec), 5(6)-carboxyrhodamine 110 (R110; Exciton Inc.), 5(6)-carboxyfluorescein (FAM), 5(6)-carboxytetramethylrhodamine (TMR), Black Hole Quencher carboxylic acid (BHQ1; Biosearch, Inc.), Fmoc-Trp(Boc)-OH (CEM), Fmoc-Glu(OtBu)-OH (CEM), Fmoc-Leu-OH (CEM), Fmoc-Pro-OH (AnaSpec), Fmoc-Thr(tBu)-OH (CEM), Fmoc-Gly-OH (AnaSpec), Fmoc-Ile-OH (CEM), Fmoc-Ala-OH (CEM), Fmoc-Cys(Trt)-OH (CEM), Fmoc-Arg(Pbf)-OH (AnaSpec), Fmoc-citrulline-OH (Thermo Fisher Scientific), Fmoc-propargylglycine (Fmoc-Pra-OH; Bachem), dimethylformamide (DMF; Thermo Fisher Scientific), dichloromethane (DCM; Thermo Fisher Scientific), acetic anhydride, trifluoroacetic acid (TFA), triisopropylsilane (TIPS), DMSO, α-cyano-4-hydroxycinnamic acid (HCCA), formic acid, phenol, acetonitrile (HPLC grade; Thermo Fisher Scientific), H2O (HPLC grade; Thermo Fisher Scientific), 5-azidopentanoic acid, N-hydroxysuccinimide (NHS), 1-ethyl-3-(3-dimethylaminopropyl) carbodiimide (EDC), 2',4',6'-trihydroxyacetophenone monohydrate (THAP), Dynabeads M-270 Amine (Life Technologies), triethylammonium acetate (TEAA, 2 M; Life Technologies), L(+)-ascorbic acid (Acros Organics), copper (II) sulfate (CuSO4), EDTA, ammonium citrate dibasic, sodium hydroxide, ethanol, sodium citrate dibasic, Taq DNA polymerase (Taq; New England BioLabs), 2'-deoxyribonucleoside triphosphates (dNTP, set of dATP, dTTP, dGTP, dCTP; New England BioLabs), agarose, agarose SeaPlaque GTG 200 bp-25 Kb (Lonza), dimethyl silicone fluid (DMF-A-6cs) (ShinEtsu), KF-6012 (ShinEtsu), KF-6038 (ShinEtsu), mineral oil, Triton X-100, (±)-1,3-butanediol, PEG-8000 (molecular biology grade; VWR), PureExpress in vitro protein synthesis kit (IVTT; New England BioLabs), PureExpress disulfide bond enhancer (New England BioLabs), enterokinase light chain (EK; New England BioLabs), benzamidine hydrochloride, LB Broth with agar, LB broth, ampicillin, NcoI-HF (20,000 units/mL; New England BioLabs), BlpI (10,000 units/mL; New England BioLabs), T4 DNA ligase (New England BioLabs), pET-45b plasmid (EMD Millipore), One Shot TOP10 chemically competent E. coli (Invitrogen, Thermo Fisher Scientific), SOC medium, SHuffle T7 Express competent E. coli (New England BioLabs), IPTG, lysozyme (Thermo Fisher Scientific), DNase I (New England BioLabs), RNase A (QIAGEN), urea (Bio-Rad), oxidized glutathione (GSSG), reduced glutathione (GSH; Thermo Fisher Scientific), sodium phosphate dibasic, sodium phosphate monobasic, 5 Prime PerfectPro Ni-NTA agarose (Thermo Fisher Scientific), imidazole, NuView Tris-Glycine precast gel (4-20%, NuSep; VWR), SDS, DL-1,4-DTT (Thermo Fisher Scientific), ammonium bicarbonate (AMBIC), sequencing grade trypsin (Promega Corp.), ActivX Desthiobiotin-FP Serine Hydrolase probe (Thermo Fisher Scientific), (Z-Ile-Pro-Arg)2R110 HCl salt (CPC Scientific), and iodoacetamide were used as provided. Solvents used in solid-phase synthesis were dried over molecular sieves (3 Å, 3.2-mm pellets). Tris[(1-benzyl-1H-1,2,3-triazol-4-yl)methyl]amine (TBTA) was recrystallized three times in t-BuOH/H2O (1:1). Oligonucleotides (Integrated DNA Technologies, Inc.) were obtained as desalted lyophilate and used without additional purification. qPCR 5' exonuclease assay probes were HPLC purified at the manufacturer.

All synthesis incubation steps requiring constant rotation were performed in a temperature-controlled rotisserie-style incubator (Incubator Genie, Scientific Industries, Inc., Bohemia, NY). All recombinant protein expression, purification, and analysis steps requiring constant shaking were performed in a temperature-controlled incubator for bacterial cell culture (Multitron, Infors USA, Annapolis, MD).

[0072] Buffers and Oil. Bead buffer (10 mM Tris, pH 8.3, 100 mM NaCl, 0.5 mM EDTA, 0.05% KF-6012), breaking buffer (10 mM Tris, pH 7.5, 100 mM NaCl, 1 mM EDTA, 1% [wt/vol] SDS, 1% [vol/vol] Triton X-100), binding and washing buffer with Tween-20 (2x, BWBT, 10 mM Tris, pH 7.5, 2 M NaCl, 1 mM EDTA, 0.1% Tween-20), saline sodium citrate buffer (20x, SSC, 300 mM sodium citrate, pH 7.0, 0.5% [wt/vol] SDS, 3 M NaCl), lysis buffer (50 mM Tris, pH 7.4, 2% [vol/vol] Triton X-100), lysate washing buffer (100 mM Tris, pH 8.0, 1% [wt/vol] Triton X-100, 3 M urea), solubilization buffer (8 M urea, 0.3 mM GSSG, 3 mM GSH), renaturing buffer (300 mM phosphate, pH 8.6, 250 mM NaCl, 0.3 mM GSSG, 3 mM GSH, 0.5 M urea), Ni²⁺-NTA washing buffer (0.2 M phosphate, pH 6.0, 0.5 M NaCl), elution buffer (50 mM phosphate, pH 8.0, 300 mM NaCl, 250 mM imidazole), storage buffer (50 mM AMBIC, pH 8.3, 0.3 mM GSSG, 3 mM GSH), Hepes buffer (100 mM Hepes, pH 7.8, 2 mM DTT), and denaturing buffer (100 mM Hepes, pH 7.2, 6 M urea, 1 mM EDTA) were prepared in deionized water (DI H2O) and used directly except where noted otherwise. Lysis buffer, lysate washing buffer, solubilization buffer, and renaturing buffer were sterilized (0.22-µm syringe filter; EMD Millipore). PCR buffer (10x, 100 mM Tris, pH 8.3, 500 mM KCl, 15 mM MgCl2; New England BioLabs), CutSmart Buffer (New England BioLabs), T4 DNA ligase buffer (New England BioLabs), and reducing Laemmli buffer (6x; Thermo Fisher Scientific) were used as provided. Oil for PCR and IVTT emulsification (76% [wt/vol] DMF-A-6cs; 20% [wt/vol] mineral oil; 4% [wt/vol] KF-6038) was prepared by combining materials and mixing with gentle rotation (1 h, room temperature [RT], 14 rpm).
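Several of these buffers are supplied as concentrates, so working concentrations follow from a simple dilution. The sketch below applies C1·V1 = C2·V2 to the 10x PCR buffer composition listed above; the 150 µL reaction volume is taken from the emulsification paragraph, and the function names are illustrative rather than part of any described procedure.

```python
# 10x PCR buffer composition from the buffer list (concentrations in mM).
PCR_BUFFER_10X_MM = {"Tris": 100.0, "KCl": 500.0, "MgCl2": 15.0}
STOCK_FOLD = 10.0

def working_concentrations(stock_mm, stock_fold, final_fold=1.0):
    """Component concentrations after diluting a concentrate to `final_fold`."""
    dilution = stock_fold / final_fold
    return {name: conc / dilution for name, conc in stock_mm.items()}

def stock_volume_ul(final_volume_ul, stock_fold, final_fold=1.0):
    """Volume of concentrate needed for a given final volume (C1*V1 = C2*V2)."""
    return final_volume_ul * final_fold / stock_fold

pcr_1x = working_concentrations(PCR_BUFFER_10X_MM, STOCK_FOLD)
buffer_ul = stock_volume_ul(150.0, STOCK_FOLD)

for name, conc in pcr_1x.items():
    print(f"{name}: {conc:g} mM at 1x")
print(f"10x buffer needed for a 150 uL reaction: {buffer_ul:g} uL")
```

The same two helpers cover the 2x BWBT, 20x SSC, and 6x Laemmli concentrates by changing `stock_fold`.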

[0073] 5'-Azido Oligonucleotide Synthesis. Aliquots (100 nmol) of 5'-aminohexyl-modified oligonucleotide were dissolved in phosphate buffer (1 M, pH 8.0, 80 µL). 5-Azidopentanoic acid NHS ester was prepared by dissolving NHS (9.6 µmol), EDC (9.6 µmol), and 5-azidopentanoic acid (7.2 µmol) in DMF (26.4 µL) and incubating (30 min, 60 °C). The 5-azidopentanoic acid NHS ester solution was added to the solution of 5'-aminohexyl-modified oligonucleotide and incubated (2 h, RT). A fresh aliquot of 5-azidopentanoic acid NHS ester was prepared as described above, added to the acylation reaction, and the reaction incubated (1 h, RT). The reaction was quenched (3 M Tris, pH 7.6, 100 µL) and incubated (5 min, 60 °C). 5'-Azido oligonucleotides were precipitated twice in ethanol. The pellet was dried (N2), resuspended (HPLC grade H2O, 200 µL), and purified by reversed-phase HPLC (XTerra MS C18 column, 4.6 x 50 mm, 130 Å, 2.5 µm; Waters) with gradient elution (mobile phase A: 5% [vol/vol] ACN in 100 mM TEAA, pH 6; mobile phase B: 30% [vol/vol] ACN in 100 mM TEAA, pH 6; 0-100% B, 20 min). A product fraction aliquot (1 µL) was spotted to a MALDI-TOF MS target plate, dried, covered with THAP matrix solution (18 mg/mL THAP, 7 mg/mL ammonium citrate dibasic in 1:1 ACN:H2O), and mass analyzed via MALDI-TOF MS (Microflex; Bruker Daltonics, Inc.). Product-containing fractions were pooled and evaporated to dryness in vacuo. The 5'-azido oligonucleotide S1 was resuspended (DI H2O) and quantitated by A260.

[0074] Fluorogenic Rhodamine Probe Core Solid-Phase Synthesis. Fmoc-Gly Wang polystyrene synthesis resin (160 µm, 0.65 mmol/g, 152.5 mg; Rapp-Polymere) was transferred to a fritted syringe (6 mL; Torviq) and swelled in DMF (2 h, RT). Linker synthesis proceeded via iterative cycles of solid-phase peptide synthesis. Each cycle included (i) Fmoc deprotection (20% [vol/vol] piperidine in DMF, 2 x 4 mL, 15 min each aliquot, RT, 8 rpm); (ii) N-Fmoc-amino acid (0.5 mmol in 1 mL DMF) activation with COMU (0.5 mmol in 0.5 mL DMF) and DIEA (1 mmol) and incubation (30 s, RT); and (iii) N-Fmoc-amino acid coupling to resin by transferring activated acid (1.7 mL) to resin and incubating with rotation (15 min, RT, 8 rpm). After each deprotection and monomer coupling, reactants were expelled and the resin washed (DMF, 2 x 4 mL; DCM, 2 x 4 mL; DMF, 2 x 4 mL). Coupling of Fmoc-PEG2-OH followed this protocol. The terminal Fmoc was removed and the resin washed as described above. 5(6)-Carboxyrhodamine 110 (0.5 mmol in 1 mL DMF) was combined with DIEA (1 mmol). COMU (0.5 mmol in 0.5 mL DMF) was subsequently added to the deprotonated dye, the reaction was incubated (30 s), the activated acid was added to the resin, and the resin incubated with rotation (20 min, RT, 8 rpm). The resin was washed with DMF until there was no visible trace of color in the DMF wash and washed a final time (DCM, 2 x 2 mL). DCM was expelled and the resin was dried in vacuo. Cleavage mixture (93% [vol/vol] TFA, 5% [vol/vol] H2O, 2% [vol/vol] TIPS, 2.5 mL) was added to an aliquot (10 mg) of dried resin and the resin incubated with rotation (1.5 h, RT, 8 rpm). The TFA solution was expelled into ice-cold diethyl ether (22 mL), incubated (-20 °C, 1 h), and centrifuged (5 min, 7,000 x g). The supernatant was decanted and the orange pellet dried in vacuo. Cleaved probe was resuspended (DMSO, 800 µL). An aliquot of the crude (100 µL) was purified by reversed-phase HPLC (XBridge BEH130 C18, 4.6 x 100 mm, 130 Å, 3.5 µm; Waters) with gradient elution (mobile phase A: 0.1% TFA in H2O; mobile phase B: ACN; 5% B 1 min, 20-40% B, 20 min). HPLC fractions were pooled based on A492, dried in vacuo, and resuspended in DMSO (to 10 mM). Purified fluorogenic rhodamine probe core was diluted (2 mM in 0.1% TFA) and directly infused (50 µL/min) to an electrospray ionization (ESI) source of a mass spectrometer (LTQ-XL; Thermo Fisher Scientific). Fluorogenic rhodamine probe core concentration in DMSO was quantitated by A492. Quantitated probe core S2 was stored protected from light in DMSO (-20 °C). The remaining probe core resin S3 was stored in DMF (-20 °C).
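The reagent excess used in each coupling follows from the resin loading. The sketch below computes the synthesis scale and the molar equivalents of amino acid, COMU, and DIEA for the probe core synthesis, assuming the nominal loading (0.65 mmol/g) applies to the full 152.5 mg of resin; the helper names are illustrative.

```python
# Resin and reagent amounts from the probe core synthesis paragraph.
RESIN_MG = 152.5
LOADING_MMOL_PER_G = 0.65
REAGENTS_MMOL = {"Fmoc-amino acid": 0.5, "COMU": 0.5, "DIEA": 1.0}

def synthesis_scale_mmol(resin_mg, loading_mmol_per_g):
    """Reactive sites on the resin, assuming the nominal loading is accurate."""
    return resin_mg / 1000.0 * loading_mmol_per_g

def equivalents(reagents_mmol, scale_mmol):
    """Molar equivalents of each reagent relative to resin-bound sites."""
    return {name: amount / scale_mmol for name, amount in reagents_mmol.items()}

scale = synthesis_scale_mmol(RESIN_MG, LOADING_MMOL_PER_G)
equiv = equivalents(REAGENTS_MMOL, scale)

print(f"synthesis scale: {scale * 1000:.1f} umol")
for name, value in equiv.items():
    print(f"{name}: {value:.1f} equiv")
```

On these numbers each coupling uses roughly a fivefold excess of activated amino acid and a tenfold excess of base.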

[0075] Fluorogenic Rhodamine 110 Probe (acIPcit)2R110 Synthesis and Characterization. Probe core resin S3 (30.5 mg) was transferred to a fritted syringe (2.5 mL; Torviq) and the DMF drained. Fmoc-citrulline-OH (1 mmol in 1 mL DMF) was combined with DIEA (2 mmol) and COMU (1 mmol in 1 mL DMF) and incubated (30 s, RT); the activated acid was then added to resin, and the resin incubated with rotation (1.5 h, 50 °C, 8 rpm). The resin was washed with DMF until there was no visible trace of color in the DMF wash. Probe construction proceeded via iterative cycles of solid-phase peptide synthesis. Each cycle included (i) Fmoc deprotection (20% [vol/vol] piperidine in DMF, 2 x 1 mL, 15 min each aliquot, RT, 8 rpm); (ii) N-α-Fmoc-amino acid (0.2 mmol in 0.5 mL DMF) activation with COMU (0.2 mmol in 0.5 mL DMF) and DIEA (0.4 mmol) and incubation (30 s, RT); and (iii) N-α-Fmoc-amino acid coupling to resin by transferring activated acid (1 mL) to resin and incubating with rotation (15 min, RT, 8 rpm). After each deprotection and monomer coupling, reactants were expelled and the resin washed (DMF, 2 x 2 mL; DCM, 2 x 2 mL; DMF, 2 x 2 mL). The following Fmoc-protected amino acid couplings were performed in order: (i) Fmoc-Pro-OH and (ii) Fmoc-Ile-OH. The terminal Fmoc was removed and the resin washed as described above; acetic anhydride (20% [vol/vol] in DMF, 1 mL) was added to the resin, and the resin incubated with rotation (20 min, RT, 8 rpm) and washed as described with an additional wash (DCM, 2 x 2 mL). DCM was expelled and the resin was dried in vacuo. Cleavage mixture (93% [vol/vol] TFA, 5% [vol/vol] H2O, 2% [vol/vol] TIPS, 2.5 mL) was added to the dried resin and the resin incubated with rotation (1.5 h, RT, 8 rpm). The TFA solution was expelled into ice-cold diethyl ether (22 mL), incubated (-20 °C, 1 h), and centrifuged (5 min, 7,000 x g). The supernatant was decanted and the orange pellet dried in vacuo. Cleaved probe was resuspended (DMSO, 800 µL). An aliquot of the crude (100 µL) was purified by reversed-phase HPLC (XBridge BEH130 C18, 4.6 x 100 mm, 130 Å, 3.5 µm; Waters) with gradient elution (mobile phase A: 0.1% TFA in H2O; mobile phase B: ACN; 5% B 1 min, 20-40% B, 20 min; 1.25 mL/min). HPLC fraction aliquots (1 µL) were spotted to a MALDI-TOF MS target plate, dried, covered with HCCA matrix solution (1.5 mg/mL HCCA in 1:2 ACN:0.1% TFA in H2O), and analyzed via MALDI-TOF MS (Microflex; Bruker Daltonics, Inc.). Product-containing fractions were pooled and lyophilized protected from light. The light-pink lyophilate was dissolved (DMSO) and quantitated by acid hydrolysis. Quantitated (acIPcit)2R110 probe S4 was stored protected from light in DMSO (-20 °C).

[0076] Fluorogenic Rhodamine Probe (acGPR)2R110 Synthesis and Characterization. Probe core resin S3 (30.5 mg) was transferred to a fritted syringe (2.5 mL; Torviq) and the DMF drained. Fmoc-Arg(Pbf)-OH (1 mmol in 1 mL DMF) was combined with DIEA (2 mmol) and COMU (1 mmol in 1 mL DMF) and incubated (30 s, RT); the activated acid was added to resin and the resin incubated with rotation (1.5 h, 50 °C, 8 rpm). The resin was washed with DMF until there was no visible trace of color in the DMF wash. Probe construction proceeded via iterative cycles of solid-phase peptide synthesis (see above). The following Fmoc-protected amino acid couplings were performed in order: (i) Fmoc-Pro-OH and (ii) Fmoc-Gly-OH. The terminal Fmoc was removed and the resin washed as described above; acetic anhydride (20% [vol/vol] in DMF, 1 mL) was added to the resin, and the resin incubated with rotation (20 min, RT, 8 rpm) and washed as described with an additional wash (DCM, 2 x 2 mL). DCM was expelled and the resin was dried in vacuo. Cleavage mixture (93% [vol/vol] TFA, 5% [vol/vol] H2O, 2% [vol/vol] TIPS, 2.5 mL) was added to the dried resin and the resin incubated with rotation (1.5 h, RT, 8 rpm). The TFA solution was expelled into ice-cold diethyl ether (22 mL), incubated (-20 °C, 1 h), and centrifuged (5 min, 7,000 x g). The supernatant was decanted and the orange pellet dried in vacuo. Cleaved probe was resuspended (DMSO, 800 µL). An aliquot of the crude (100 µL) was purified by reversed-phase HPLC (XBridge BEH130 C18, 4.6 x 100 mm, 130 Å, 3.5 µm; Waters) with gradient elution (mobile phase A: 0.1% TFA in H2O; mobile phase B: ACN; 5% B 1 min, 5-85% B, 40 min; 1.25 mL/min). HPLC fraction aliquots (1 µL) were spotted to a MALDI-TOF MS target plate, dried, covered with HCCA matrix solution (see above), and analyzed via MALDI-TOF MS (Microflex; Bruker Daltonics, Inc.). Product-containing fractions were pooled and lyophilized protected from light. The light-pink lyophilate was dissolved (DMSO) and quantitated by acid hydrolysis (see above). Quantitated (acGPR)2R110 probe S5 was stored protected from light in DMSO (-20 °C).

[0077] Fluorogenic Probe BHQ1-IPcitAA-FAM Synthesis and Characterization.

Rink Amide resin (160 µm, 0.44 mmol/g, 22.7 mg; Rapp-Polymere) was transferred to a fritted syringe (6 mL; Torviq), swelled in DMF (1 h, RT, 14 rpm), and washed (DMF, 2 x 2 mL; DCM, 2 x 2 mL; DMF, 2 x 2 mL). Iterative cycles of peptide synthesis included (i) Fmoc deprotection (20% [vol/vol] piperidine in DMF, 2 x 1 mL, 5 min each aliquot, RT, 8 rpm); (ii) N-α-Fmoc-amino acid (30 µmol in 0.5 mL DMF) activation with COMU (30 µmol in 0.5 mL DMF) and DIEA (60 µmol) and incubation (30 s, RT); (iii) N-α-Fmoc-amino acid coupling to resin by transferring activated acid (1 mL) to resin and incubating with rotation (5 min, RT, 8 rpm); and (iv) repeat steps ii and iii once. After each deprotection and monomer coupling, reactants were expelled and the resin washed (DMF, 2 x 2 mL; DCM, 2 x 2 mL; DMF, 2 x 2 mL). The following Fmoc-protected amino acid couplings were performed in order: (i) Fmoc-Cys(Trt)-OH; (ii) Fmoc-Lys(Mtt)-OH; (iii) Fmoc-PEG2-OH; (iv) Fmoc-Ala-OH; (v) Fmoc-Ala-OH; (vi) Fmoc-citrulline-OH; (vii) Fmoc-Pro-OH; (viii) Fmoc-Ile-OH; and (ix) Fmoc-PEG2-OH. An aliquot (4.5 mg) of the resin was removed to a new fritted syringe (6 mL; Torviq), Fmoc was removed (20% [vol/vol] piperidine, 2 x 2 mL, 15 min each aliquot, RT, 14 rpm), and the resin washed (DMF, 3 x 2 mL; DCM, 3 x 2 mL; DMF, 3 x 2 mL). BHQ1-COOH (10 µmol in 0.25 mL DMF) was combined with DIEA (100 µmol), added to resin, and the resin incubated (overnight, 50 °C, 15 rpm) and washed (DMF, 6 x 2 mL; DCM, 6 x 2 mL; DMF, 6 x 2 mL). Mtt was removed by washing the resin with DCM (3 x 2 mL) and then deprotection mixture (1% TFA, 5% TIPS, 94% dry DCM, 4 x 1.5 mL), and finally incubating the resin in deprotection mixture (3 x 2 mL, 5 min first two aliquots, 15 min third aliquot). The deprotected resin was washed (DMF, 4 x 2 mL; DCM, 4 x 2 mL; DMF, 4 x 2 mL). 5(6)-Carboxyfluorescein (50 µmol in 1 mL DMF) was combined with DIC (50 µmol), the activated acid added to resin, and the resin incubated with rotation (1 h, 37 °C, 14 rpm). Resin was washed (DMF, 3 x 2 mL; DCM, 3 x 2 mL; DMF, 3 x 2 mL; DCM, 3 x 2 mL). Cleavage mixture (88% [vol/vol] TFA, 5% [wt/vol] phenol, 5% [vol/vol] DI H2O, 2% [vol/vol] TIPS, 2 mL) was added to the dried resin and the resin incubated (2 h, RT, 14 rpm). The solution was expelled into ice-cold diethyl ether (10 mL) and centrifuged (5 min, 7,000 x g). The pellet was dried in vacuo overnight and dissolved (DMSO, 800 µL). Aliquots of the crude (100 µL) were purified by reversed-phase HPLC (ZORBAX Eclipse XDB C18 Semi-Prep 9.4 x 250 mm, 5 µm; Agilent) with gradient elution (mobile phase A: 0.1% TFA in H2O; mobile phase B: ACN; 5% B 1 min, 5-60% B, 2 min, 60% B 1 min, 60-65% B, 10 min; 3.8 mL/min). Product-containing fractions were pooled and lyophilized protected from light. Lyophilate was dissolved (DMSO) and quantitated by A534. Quantitated BHQ1-IPcitAA-FAM probe S6 was stored protected from light in DMSO (-20 °C).

[0078] Fluorogenic Probe BHQ1-IPRAA-TMR Synthesis and Characterization. Rink Amide resin (160 µm, 0.44 mmol/g, 22.7 mg; Rapp-Polymere) was transferred to a fritted syringe (6 mL; Torviq), swelled in DMF (1 h, RT, 14 rpm), and washed (DMF, 2 x 2 mL; DCM, 2 x 2 mL; DMF, 2 x 2 mL). Iterative cycles of peptide synthesis included (i) Fmoc deprotection (20% [vol/vol] piperidine in DMF, 2 x 1 mL, 5 min each aliquot, RT, 8 rpm); (ii) N-α-Fmoc-amino acid (30 µmol in 0.5 mL DMF) activation with COMU (30 µmol in 0.5 mL DMF) and DIEA (60 µmol) and incubation (30 s, RT); (iii) N-α-Fmoc-amino acid coupling to resin by transferring activated acid (1 mL) to resin and incubating with rotation (5 min, RT, 8 rpm); and (iv) repeat steps ii and iii once. After each deprotection and monomer coupling, reactants were expelled and the resin washed (DMF, 2 x 2 mL; DCM, 2 x 2 mL; DMF, 2 x 2 mL). The following Fmoc-protected amino acid couplings were performed in order: (i) Fmoc-Cys(Trt)-OH; (ii) Fmoc-Lys(Mtt)-OH; (iii) Fmoc-PEG2-OH; (iv) Fmoc-Ala-OH; (v) Fmoc-Ala-OH; (vi) Fmoc-Arg(Pbf)-OH; (vii) Fmoc-Pro-OH; (viii) Fmoc-Ile-OH; and (ix) Fmoc-PEG2-OH. Mtt was removed by washing the resin with deprotection mixture (1% TFA, 5% TIPS, 94% dry DCM, 4 x 1.5 mL), then incubating the resin in deprotection mixture (2 x 2 mL, 10 min first aliquot, 5 min second aliquot). The deprotected resin was washed (DMF, 4 x 2 mL; DCM, 4 x 2 mL; DMF, 4 x 2 mL). 5(6)-Carboxytetramethylrhodamine (50 µmol in 1 mL DMF) was combined with DIEA (100 µmol), COMU (50 µmol) was subsequently added to the deprotonated dye, the activated acid was added to resin, and the resin incubated with rotation (2 h, RT, 14 rpm). Resin was washed (DMF, 4 x 2 mL; DCM, 4 x 2 mL; DMF, 4 x 2 mL; DCM, 3 x 2 mL). Fmoc was removed (20% [vol/vol] piperidine, 2 x 2 mL, 15 min each aliquot, RT, 14 rpm) and the resin washed (DMF, 10 x 2 mL; DCM, 6 x 2 mL; DMF, 6 x 2 mL). BHQ1-SE (12.5 µmol in 1 mL DMF) was combined with DIEA (250 µmol), added to resin, and the resin incubated (overnight, 50 °C, 15 rpm). The resin was washed (DMF, 10 x 2 mL; DCM, 10 x 2 mL; DMF, 10 x 2 mL; DCM, 3 x 2 mL) and dried in vacuo. Cleavage mixture (85% [vol/vol] TFA, 5% [wt/vol] phenol, 5% [vol/vol] DI H2O, 2% [vol/vol] TIPS, 2 mL) was added to the dried resin and the resin incubated (2 h, RT, 14 rpm). The TFA solution was expelled into ice-cold diethyl ether (10 mL) and centrifuged (5 min, 7,000 x g). The pellet was dried in vacuo overnight and dissolved (DMSO, 800 µL). Aliquots of the crude (100 µL) were purified by reversed-phase HPLC (XBridge BEH130 C18, 10 x 15 mm, 130 Å, 5.0 µm; Waters) with gradient elution (mobile phase A: 0.1% TFA in H2O; mobile phase B: ACN; 5% B 1 min, 5-50% B in 2 min, 50% B 1 min, 50-55% B, 10 min; 3.8 mL/min). Product-containing fractions were pooled and lyophilized protected from light. Lyophilate was dissolved (DMSO) and quantitated by A534. Quantitated BHQ1-IPRAA-TMR probe S7 was stored protected from light in DMSO (-20 °C).

[0079] Protease Activity Assay Probe-Primer Bead Preparation. An aliquot of amine-functionalized magnetic beads (M-270 Amine Dynabeads; 5 mg; Life Technologies, Inc.) was washed (DI H2O, 1 x 200 µL; DMF, 2 x 1 mL), resuspended in DMF (1 mL), incubated (90 min, RT), and washed (DMF, 1 mL). Linker synthesis proceeded via iterative cycles of solid-phase peptide synthesis. Each cycle included (i) Fmoc deprotection (20% [vol/vol] piperidine in DMF, 2 x 50 µL, 15 min each aliquot, RT, 8 rpm); (ii) N-α-Fmoc-amino acid (7.5 µmol in 50 µL DMF) activation with COMU (7.5 µmol in 50 µL DMF) and DIEA (30 µmol) and incubation (30 s, RT); (iii) N-α-Fmoc-amino acid coupling to resin by transferring activated acid (100 µL) to resin and incubating with rotation (5 min, RT, 8 rpm); and (iv) repeat steps ii and iii once. After each deprotection and monomer coupling, beads were immobilized on magnet, reactants removed, and the beads washed (DMF, 3 x 1 mL; DCM, 3 x 1 mL; DMF, 3 x 1 mL). The following Fmoc-protected amino acid couplings were performed in order: (i) Fmoc-PEG2-OH; (ii) Fmoc-PEG2-OH; (iii) Fmoc-PEG2-OH; and (iv) Fmoc-Pra-OH. After coupling of Fmoc-Pra-OH, beads were deprotected and washed (see above). Beads were exchanged to water (DI H2O, 1 x 200 µL; 1% Tween-20, 500 µL) and incubated (1 h, RT). Azide mix was prepared by combining DMSO (13.5 µL), 5'-azido oligonucleotide (100 µM, 1 µL), Tween-20 (10% [wt/vol], 1 µL), TEAA (2 M, 3 µL), ascorbic acid (100 mM, 0.2 µL), and DI H2O (0.3 µL). Catalyst mix was prepared by combining CuSO4 (100 mM, 0.5 µL), TBTA (10 mM, 6 µL), and ascorbic acid (300 mM, 2.5 µL). The beads were immobilized on magnet, the supernatant removed, azide mix and beads combined, catalyst mix added, and the reaction incubated (90 min, 50 °C, 14 rpm). Reactants were removed and the beads washed (2x BWBT, 1 x 0.5 mL), resuspended in BWBT (1 mL), and incubated (90 min, 50 °C, 14 rpm). The oligonucleotide-functionalized beads were washed (BWBT, 2 x 0.5 mL; bead buffer, 2 x 0.5 mL). Oligonucleotide-functionalized beads were exchanged to solvent (DMF, 2 x 0.5 mL, 1 x 1 mL), incubated (90 min, RT), and washed (DMF, 1 mL). Bead aliquots (0.5 mg) were further elaborated with (acIPcit)2R110 probe S4, (acGPR)2R110 probe S5, or rhodamine probe core S3 (5 nmol probe in 20 µL DMF). Probes were combined with DIC (350 nmol in 5.5 µL DMF) and HOAt (250 nmol in 1 µL DMF), and the reaction was incubated (2 min, RT). Activated probe was added to the oligonucleotide-functionalized bead aliquot and incubated (30 min, 50 °C, 14 rpm). The activity assay probe-primer beads were washed (DMF, 3 x 1 mL; DCM, 3 x 1 mL; DMF, 3 x 1 mL; bead buffer, 2 x 1 mL) and resuspended in bead buffer (1E5 beads/µL).

[0080] Trypsinogen ORF (SEQ ID NO:4):

ATGGCACATCACCACCACCATCACGTGGGTACCATGAATACCTTTGTTCTGCT

GGCACTGCTGGGTGCAGCAGTGGCATTCCCGACCGATGATGATGATAAAATTG

TTGGCGGTTATACGTGCGCAGCGAATAGCGTGCCGTACCAGGTTTCTCTG A A CT C TG G T A G TC A T TT TTGC G GC G G T A G C G TG A T C A A TT C TC A G T G G G TT

GTTAGTGCCGGACATTGTTATAAAAGCCGTATTCAGGTTCGCCTGGGCGAA

CACAACATCGAXGTGCTGGAAGGCAATGAACAGTTTATTAACGCGGCCAA

AATTATCACGCACCCGAAGTTCAATGGCAACACGCTGGATAATGATATTAT

GCTGATCAAACTGAGCTCTCCGGCAACCCTGAAGAGCCGTG TGGCAACGG ' r

TA GTCTGCCGCGTAGCTGCGCAGCAG CAGG CACCG AATGTCTG ATCAG CG

GCTGGGGTA ATACG AA AAGTAGCGGTTCTAGTTATCCG TCTCTGCTG CAG T

GCCTG AA AG CACCGGTTCTG AG CGATAGCTCTTGTA A A AGTAGCTATCCG GGCCA GATTACCGGTAACATGATCTGTGTGGGCTTCCTGG 1AGGCGG2¾A4Q ATTCTTGCCAGGGTGATAGTGGCGGTCCGGTGGTTTGTAATGGCCAGCTGC AGGGTATTGTGAGCTGGGGCTACGG1TGCGCGG4GffffAAAC4ffAGCGGGTGT

GTATACCAAAGTTTGTAATTACGTGAAGTGGATTCAGCAGACGATCGCAG

CGAACTAA (underlined, italicized, double-underlined, and underlined/italicized residues indicate propeptide, loop 1, D189, and loop 2, respectively)
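By way of illustration only, the 5' end of the ORF above can be translated to locate the hexahistidine tag and the Asp-Asp-Asp-Asp-Lys motif recognized by enterokinase (EK), which precedes the Ile-Val-Gly-Gly start of the mature enzyme. The sketch below uses only the first two sequence lines as printed above; the names `CODON_TABLE`, `ORF_5PRIME`, and `translate` are illustrative and do not appear in the disclosure.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order
# at each position (TTT, TTC, TTA, TTG, TCT, ...).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# First two printed lines of the trypsinogen ORF (SEQ ID NO:4), 106 nt.
ORF_5PRIME = (
    "ATGGCACATCACCACCACCATCACGTGGGTACCATGAATACCTTTGTTCTGCT"
    "GGCACTGCTGGGTGCAGCAGTGGCATTCCCGACCGATGATGATGATAAAATTG"
)


def translate(dna: str) -> str:
    """Translate complete codons only; a trailing partial codon is ignored."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


leader = translate(ORF_5PRIME)
his_tag_start = leader.find("HHHHHH")  # 0-based index of the His6 tag
ek_site_start = leader.find("DDDDK")   # 0-based index of the EK motif

print(leader)
print("His6 tag begins at residue", his_tag_start + 1)
print("EK motif begins at residue", ek_site_start + 1)
```

The translated leader ends with the EK motif followed by isoleucine, which is consistent with EK being added in the assays below to activate the expressed zymogen.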

[0081] Trypsinogen+cit ORF (SEQ ID NO:5):

A TG GC A C ATG A CC ACC A CC ATCAC G TGGGTA CCATGAATACCTTTGTTCTGC I

CIGGCACTGCTGGGTGCAGCAGTGGCA ITCCCGACCGATGATGATGATAAA A TT

GTTGGCGGTTATACGTGCGCAGCGAATAGCGTGCCGTACCAGGTTTCTCTG

AACTCTGGTAGTCATTTTTGCGGCGGTAGCCTGATCAATTCTCAGTGGGTT GTTAGTGCCGCACATTGTTATAAAAGCCGTATTCAGGTTCGCCTGGGCGAA C A C A A C A T C G A T G T G C TGG A A Q G C A A T G A A C A GTTT A TT A A C GC G G C C A A AATTATCACeCACCCGAACTTCAATGGCAAeACGCTGGATAATGATATTATG C T G A TC A A A C T G A G C TC TC C G GC A A C C C TG A A C A G C C G T GTG G C A A C C G T T A G T C T G G C G C G T A G C T G C G C A G C A G C A G G C A€ C G A A T G T C T G A T C A G C GG CTGGGGTAATACGAAAAG TAGCXJGTTCTAGTTATCCGTCTCTGCTGCA GT GCCTGA A AGC ACC GGTTCTG A G C G ATAG C TCTT G T AAA A GT AG CTATCC GG GC€AGATTACCGGTAACATGATCTGTGTGGGCTTCC;iXMAGGGC?GGGGCCTC TTCTTGC CAGG GTGATAGTGG CGGTCCCJGTG GTTTGTAATG QCCAGCTGC

AGGGT ATT GTG AGCTGG GCT ACGGTTGCGCGCAGAAAAACAAACCGGGTGT

GTATACCAAAGTTTGTAATTACGTGAACTGGATTCAGCAGACGATCGCAGCGAACTAA (underlined, italicized, double-underlined, and underlined/italicized residues indicate propeptide, loop 1, D189, and loop 2, respectively. Bracketed residue indicates the propeptide point mutation).

[0082] Trypsinogen Expression Cassette Preparation. The porcine pancreatic trypsinogen ORF (GenBank accession no. NM_001162891) was synthesized (GenScript) with codon use optimized for bacterial expression. The trypsinogen ORF was inserted between the NcoI and BlpI sites of the pET45b vector (EMD Millipore) with the 5'-ATG-3' of the NcoI site corresponding to the site of translation initiation. Plasmid (1 ng) was combined with PCR buffer (1×), dNTPs (200 µM each dNTP), Taq (0.05 U/µL), and oligonucleotide primers 5'-TGCGTCCGGCGTAGAGGATC-3' (SEQ ID NO:11) and 5'-AGACCGAGATAGGGTTGAGTGTTG-3' (SEQ ID NO:12) (0.2 µM each). The reaction (50 µL) was thermally cycled (95 °C, 150 s; [95 °C, 20 s; 68 °C, 90 s] × 30 cycles; C1000; Bio-Rad), the products confirmed on agarose gel, purified (QIAquick PCR Purification Kit; Qiagen, Inc.), and quantitated by A260.

[0083] Generation of EGGK(185-188)XXXX/D189S (SEQ ID NO:9) Site-Saturation Mutagenesis Libraries and Mutants. Amplification reactions (50 µL) were prepared with trypsinogen expression cassette template (1 ng), PCR buffer (1×), dNTPs (200 µM each), and either oligonucleotide primers 5'-TGATGCCGGCCACGATG-3' (SEQ ID NO:1) and 5'-CAGGAAGCCGACACAGATCATGTT-3' (SEQ ID NO:2) (0.2 µM each) or 5'-AACATGATCTGTGTGGGCTTCCTGNNSNNSNNSNNSTCTTCTTGCCAGGGTGATAGTGGC-3' (SEQ ID NO:3) (S degeneracy indicates G or C) and 5'-AGACCGAGATAGGGTTGAGTGTTG-3' (SEQ ID NO:12) (0.2 µM each) and Taq (0.05 U/µL). The reactions were thermally cycled ([95 °C, 15 s; 62 °C, 20 s; 68 °C, 60 s] × 20 cycles; C1000; Bio-Rad) and the products confirmed on agarose gel. Products were purified (MinElute; QIAGEN) and quantitated by A260. Purified products (5 ng each) were added to an assembly reaction (50 µL) containing PCR buffer (1×), dNTPs (200 µM each), Taq (0.05 U/µL), and oligonucleotide primers 5'-TGATGCCGGCCACGATG-3' (SEQ ID NO:6) and 5'-AGACCGAGATAGGGTTGAGTGTTG-3' (SEQ ID NO:12) (0.2 µM each). The assembly reaction was thermally cycled ([95 °C, 15 s; 55 °C, 15 s; 68 °C, 90 s] × 20 cycles; C1000; Bio-Rad). Products were resolved on SeaPlaque (1.2%) low-melting-temperature agarose gel, excised, purified (QIAEX II; QIAGEN), and quantitated by A260. The trypsinogen+cit gene loop 1 region was also mutagenized using these methods, but starting with the trypsinogen+cit expression cassette template (1 ng) as input. L7P mutants of trypsin and trypsin+cit were prepared in analogous assembly PCRs.
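By way of illustration only, the theoretical diversity of a library built from four consecutive NNS codons, as in the mutagenic primer of SEQ ID NO:3, can be enumerated directly. The following sketch lists every NNS codon, translates it with the standard genetic code, and counts DNA-level and protein-level variants; the function and variable names are illustrative and are not part of the disclosure.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def nns_codons():
    """All codons matching NNS (N = A, C, G, or T; S = G or C)."""
    return ["".join(c) for c in product("ACGT", "ACGT", "GC")]


codons = nns_codons()
amino_acids = sorted({CODON_TABLE[c] for c in codons} - {"*"})
stop_codons = [c for c in codons if CODON_TABLE[c] == "*"]

n_positions = 4  # residues 185-188 are randomized
dna_variants = len(codons) ** n_positions
protein_variants = len(amino_acids) ** n_positions
# Fraction of library members with no stop codon in the randomized window.
frac_no_stop = (1 - len(stop_codons) / len(codons)) ** n_positions

print(len(codons), "NNS codons encoding", len(amino_acids), "amino acids")
print("stop codons:", stop_codons)
print("DNA variants:", dna_variants)
print("protein variants:", protein_variants)
print("fraction without a stop codon:", round(frac_no_stop, 3))
```

NNS retains all 20 amino acids while admitting a single stop codon, which is why the stop-free fraction stays high even across four randomized positions.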

[0084] Tryptic Activity Assay Probe-Primer Bead Characterization. Protease activity assay probe-primer beads functionalized with tryptic control probe (acGPR)2R110 were washed (2× SSC, 1 × 50 µL), combined with fluorescein-labeled oligonucleotide hybridization probe (2× SSC, 1 µM 5'-/56-FAM/CACTCAACCCTATCTC-3') (SEQ ID NO:7), and incubated (10 min, RT, 15 rpm). The beads were washed (2× SSC, 3 × 20 µL), combined with NaOH (0.1 N, 2 × 20 µL, on ice, 5 min), washed (DI H2O, 1 × 20 µL), and the fluorescein-labeled oligonucleotide concentration in the alkaline eluate quantitated (CFX96; Bio-Rad). The particle density was quantitated using a hemocytometer. Tryptic probe-primer beads (1E6 beads) were combined with trypsin (1 µg) and buffer (200 mM AMBIC, pH 8.3, 25 µL), incubated (37 °C, 3 1 h), digestion solution removed, and beads resuspended (0.5 mL PBS). A separate tryptic probe-primer bead aliquot (3E6 beads) was transferred to an amplification reaction (150 µL) containing trypsinogen expression cassette template DNA (56 pg ~ 41 molecules/bead), PCR buffer (1×), dNTPs (200 µM each), PEG-8000 (4.8 mM), butanediol (5% [vol/vol]), KF-6012 (0.02% [vol/vol]), Taq (0.3 U/µL), and oligonucleotide primer 5'-TGCGTCCGGCGTAGAGGATC-3' (SEQ ID NO:11) (1 µM). Aliquots of this mixture (20 µL) were thermally cycled ([95 °C, 20 s; 68 °C, 90 s] × 30 cycles; C1000; Bio-Rad), washed (bead buffer, 2 × 20 µL; breaking buffer, 2 × 200 µL; bead buffer, 1 × 200 µL), and resuspended (bead buffer, 100 µL). Aliquots of these beads (200 beads) were combined with qPCR matrix (20 µL) containing PCR buffer (1×), dNTPs (200 µM each), Taq (0.05 U/µL), oligonucleotide primers 5'-TGCGTCCGGCGTAGAGGATC-3' (SEQ ID NO:11) and 5'-AGACCGAGATAGGGTTGAGTGTTG-3' (SEQ ID NO:12) (0.5 µM each), and 5'-/56-FAM/ATG AAT ACC (SEQ ID NO:13) /ZEN/TTT GTT CTG CTG GCA CTG CTG (SEQ ID NO:14) /3IABkFQ/-3' 5' exonuclease assay probe (250 nM). Template standard solutions (1E5-1E-1 fg/µL in log-scale dilutions) were prepared and added to separate amplification reactions (20 µL each). Standards and samples were thermally cycled ([95 °C, 20 s; 68 °C, 90 s] × 30 cycles; C1000; Bio-Rad) with fluorescence monitoring (channel 1, CFX96; Bio-Rad). Samples were quantitated (CFX Manager, version 3.1; Bio-Rad, baseline subtraction). The number of trypsinogen templates per bead was calculated by dividing the qPCR result by the number of beads. Trypsinogen-templated probe-primer beads (1.5E6 beads, probe-gene beads) were combined with IVTT components solution A (10 µL), solution B (7.5 µL), disulfide bond enhancer (1 µL each enhancer), and EK (1/200 dilution, 0.5 µL). Negative control beads (1.5E6 beads) were combined with IVTT components (described above), but without disulfide bond enhancers or EK. Samples were incubated with rotation (21 h, 37 °C, 15 rpm), washed (bead buffer and 10 mM benzamidine HCl, 2 × 100 µL), and resuspended (0.5 mL). All digested beads and controls were analyzed by flow cytometry (λex = 488 nm, dichroic 525 SP, optical filter 550 SP, Gallios; Beckman Coulter).

[0085] Emulsion PCR Bead Library Preparation. An amplification reaction (150 µL) was prepared containing PCR buffer (1×), dNTPs (200 µM each), PEG-8000 (4.8 mM), butanediol (5% [vol/vol]), KF-6012 (0.02% [vol/vol]), Taq (0.3 U/µL), oligonucleotide primer 5'-TGCGTCCGGCGTAGAGGATC-3' (SEQ ID NO:11), EGGK(185-188)XXXX/D189S (SEQ ID NO:9) site-saturation mutagenesis library (800 pg), and (acIPcit)2R110 activity probe beads (5E6). PCR mix was combined with ice-cold emulsification oil mix (600 µL) and a stainless steel bead (6 mm diameter; Thomas Scientific). Sample was loaded into a homogenizer (TissueLyser; QIAGEN) and emulsified (10 s, 15 Hz; 10 s, 17 Hz). The emulsion was placed on ice immediately and aliquots (50 µL) were transferred to PCR tubes for thermal cycling ([95 °C, 20 s; 68 °C, 90 s] × 30 cycles; 68 °C, 600 s; C1000; Bio-Rad). Emulsion samples were pooled, combined with breaking buffer (750 µL), and mixed. The beads were collected by centrifugation (5 min, 3,000 × g) and immobilized on a magnet. Supernatant was removed, beads were washed (breaking buffer, 1 mL) and resuspended in breaking buffer (100 µL) and then transferred to a new tube. The beads were washed (breaking buffer, 2 × 1 mL), resuspended in breaking buffer (100 µL), and transferred to a new tube. Washing and tube transfer were repeated once more using bead buffer. Bead particle density was quantitated using a hemocytometer and adjusted (84 beads/µL, final). Amplification reaction mixture (2 mL) was prepared containing PCR buffer (1×), dNTPs (200 µM each), Taq (0.05 U/µL), oligonucleotide primers 5'-TGCGTCCGGCGTAGAGGATC-3' (SEQ ID NO:11) and 5'-AGACCGAGATAGGGTTGAGTGTTG-3' (SEQ ID NO:12) (0.5 µM each), and 5'-/56-FAM/ATG AAT ACC (SEQ ID NO:13) /ZEN/TTT GTT CTG CTG GCA CTG CTG (SEQ ID NO:14) /3IABkFQ/-3' 5' exonuclease assay probe (250 nM). An aliquot of amplification reaction mixture (320 µL) was reserved for template standards (20 µL each, prepared in log dilutions as described). To the remaining mixture emPCR bead product (84 beads) was added, the sample vortexed, and aliquots (20 µL) quickly distributed to wells of a 96-well plate. The plate was thermally cycled ([95 °C, 15 s; 68 °C, 80 s] × 40 cycles; C1000; Bio-Rad) with fluorescence monitoring (channel 1, CFX96; Bio-Rad) and quantitated (CFX Manager, version 3.1; Bio-Rad, Cq method), assigning Cq to each well. Wells with Cq within the standard curve were used to calculate the number of templates per bead.
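By way of illustration only, the templates-per-bead calculation can be sketched as a linear standard curve of Cq against log10 template quantity, with wells outside the curve excluded. The 1,680 µL of mixture remaining after the standards are reserved fills 84 wells of 20 µL, so 84 beads correspond to an average of one bead per well. The standards and Cq values below are hypothetical, the curve is expressed in template copies rather than fg/µL, and the function names are illustrative rather than taken from the instrument software.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def templates_per_bead(cq, slope, intercept, cq_range, beads_in_well=1):
    """Interpolate copies from Cq; return None if Cq is off the curve."""
    lowest, highest = cq_range
    if not lowest <= cq <= highest:
        return None
    copies = 10 ** ((cq - intercept) / slope)
    return copies / beads_in_well


# Hypothetical standards: template copies per well and measured Cq.
standard_copies = [1e1, 1e2, 1e3, 1e4, 1e5]
standard_cq = [33.0, 29.7, 26.4, 23.1, 19.8]

slope, intercept = fit_line([math.log10(c) for c in standard_copies], standard_cq)
cq_range = (min(standard_cq), max(standard_cq))
efficiency = 10 ** (-1 / slope) - 1  # amplification efficiency per cycle

in_range = templates_per_bead(26.4, slope, intercept, cq_range)
off_curve = templates_per_bead(35.0, slope, intercept, cq_range)
print(slope, intercept, efficiency)
print(in_range, off_curve)
```

Returning `None` for off-curve wells mirrors the restriction to wells with Cq within the standard curve; wells that received no bead, or more than one, would need to be handled separately.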

[0086] Emulsion IVTT. Beads from emPCR were washed (bead buffer, 3 × 100 µL), resuspended in bead buffer (100 µL), and transferred to a 2-mL safe-lock tube (Thermo Fisher Scientific). IVTT reaction components contained solution A (74.7 µL), solution B (56 µL), disulfide enhancers 1 and 2 (14.9 µL each), and EK (1/200 dilution, 4.1 µL). An aliquot of IVTT reaction (150 µL) was combined with ice-cold emulsification oil mix (600 µL) and emulsified as described above. The emulsion was incubated (67 h, 37 °C) and protected from light. The emulsion was broken and beads washed and harvested as described previously.

[0087] Fluorescence-Activated Cell Sorting. Beads from emIVTT were resuspended in PBS (1 mL, ~5E6 beads/mL). Control (acIPcit)2R110 activity probe-primer beads were similarly prepared. Samples were analyzed by FACS (FACSDiva 8.0.1; BD Biosciences). The control sample was analyzed (13,000 events) to establish a negative gate. The positive gate was defined to yield a hit rate of 0.004%. Hit beads were sorted into individual wells of a 96-well microtiter plate. Amplification reaction mixture for qPCR was prepared as described above and added to each well (20 µL); the plate was then thermally cycled and quantitated as above. PCR products were confirmed on agarose gel. Wells exhibiting the correct PCR product size were purified (MinElute; QIAGEN) and quantitated by A260.
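By way of illustration only, the number of sorted beads implied by the stated hit rate can be estimated as follows. The sketch assumes that the entire resuspended sample (about 5E6 beads) passes through the sorter, which the description does not state; `expected_hits` is an illustrative name.

```python
def expected_hits(beads_analyzed: float, hit_rate_percent: float) -> float:
    """Expected number of events falling inside a gate with the given rate."""
    return beads_analyzed * hit_rate_percent / 100.0


# Assumption: all ~5E6 beads in the 1 mL sample are analyzed.
hits = expected_hits(5e6, 0.004)
print(round(hits), "hit beads expected")
```

Under that assumption the gate yields on the order of a few hundred beads, a number that can be handled by single-bead qPCR in 96-well plates.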

[0088] Real-Time IVTT Activity Assay. Purified PCR product from single-bead amplifications (5 ng) or plasmid encoding the positive control trypsinogen gene (22 ng) was added to individual IVTT reactions (5 µL, see above) containing solution A, solution B, disulfide bond enhancers, EK, and either (acIPcit)2R110 (3 µM) or (cbzIPR)2R110 (3 µM). The reactions were incubated (37 °C, 1,000 min; C1000; Bio-Rad) with fluorescence monitoring (channel 1, CFX96; Bio-Rad). Activity assays of mutants were conducted similarly, except sequence-verified pET45b plasmids (50 ng) were used as templates for IVTT.

[0089] Cloning of Amplicons. PCR products from single-bead amplicons of screening hits and mutants for activity assays were digested with NcoI-HF and BlpI. The pET45b destination vector was similarly digested. Digested PCR products were ligated to the digested vector using T4 DNA ligase, and the resulting ligation products were used to transform chemically competent cells (One Shot TOP10; Life Technologies, Inc.) according to the manufacturer's protocol. Selective LB agar plates (50 µg/mL ampicillin) were streaked with the transformed cells; colonies were isolated and grown in selective LB minicultures (50 µg/mL ampicillin, 2 mL); the plasmid DNA harvested (QIAprep Spin Miniprep; QIAGEN); and the resulting plasmids bidirectionally sequenced.

[0090] Protein Expression. Purified plasmid DNA was used to transform aliquots of chemically competent E. coli cells (SHuffle T7 Express; New England BioLabs) according to the manufacturer's protocol to generate colonies on selective LB agar plates (50 µg/mL ampicillin). Single colonies were picked to selective LB minicultures (50 µg/mL ampicillin in LB, 2 mL), which were incubated with shaking overnight (30 °C, 250 rpm). The overnight cultures were used to inoculate induction cultures (50 µg/mL ampicillin in LB, 200 mL). The induction cultures were incubated with shaking (30 °C, 250 rpm) until OD600 = 0.6, at which point IPTG (100 mM, 0.8 mL) was added. The induced culture was incubated with shaking (30 °C, 4 h, 250 rpm). Aliquots (50 mL) were centrifuged, the supernatant discarded, and the cell pellet stored (-20 °C). One pellet of cells was thawed on ice (3 h), resuspended in lysis buffer (3 mL), and incubated (30 min, 4 °C). Hen egg lysozyme (0.1% [wt/vol]) was added and the cells incubated (35 min, 4 °C). DNase I (2 U) and RNase A (6 µg) were added and the cells incubated (30 min, 4 °C). The inclusion bodies were pelleted by centrifugation (30 min, 18,514 × g), the supernatant discarded, and the pellets washed with lysate washing buffer (9 mL, overnight, 4 °C). The washed pellet was centrifuged (30 min, 18,514 × g) and washed again with lysate washing buffer (9 mL, 1 h, 4 °C). After centrifugation (30 min, 4 °C, 18,514 × g), pellets were combined with solubilization buffer (9 mL) and incubated with rotation (6 h, 4 °C, 15 rpm). Dissolved proteins were combined with renaturing buffer (30 mL) and incubated (72 h, 4 °C). Renatured proteins were bound to Ni2+-NTA agarose resin (2 mL, overnight, 4 °C), washed with renaturing buffer (2 × 4 mL), and eluted with elution buffer (4 mL). Purified proteins were dialyzed (3.5K MWCO dialysis cartridge; Thermo Fisher Scientific) against renaturing buffer (2 × 500 mL, overnight) and storage buffer (1 × 500 mL, overnight). Purified proteins were confirmed by SDS/PAGE and MALDI-TOF MS. Protein concentration was quantitated by A280 and the sample acidified with HCl (pH ~2) before storage (4 °C). Active site titration was performed on aliquots of purified and activated enzyme stock (2.5 µL) by combining with buffer (PBS, 20.5 µL) and inhibitor (ActivX Desthiobiotin-FP Serine Hydrolase probe, 0, 5, 10, 20, 25, 50, or 100 µM; 1 µL) and incubating (2 h, 37 °C). Aliquots of (acIPcit)2R110 (100 µM, 1 µL) were added to the equilibrated enzyme mixtures and fluorescence intensity monitored (Gemini XPS; Molecular Devices; λex = 500 nm, λem = 525 nm, automatic cutoff = 515 nm). Active-site concentration was estimated by linear extrapolation of observed slope (relative fluorescence units [RFUs]/s) plotted against inhibitor concentration.
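By way of illustration only, the linear extrapolation used for active-site titration can be sketched as a least-squares line through the residual rates, with the x-intercept taken as the inhibitor concentration that would abolish activity. The data below are hypothetical, the function names are illustrative, and only points in the linearly decreasing region are assumed to be included in the fit.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def active_site_concentration(inhibitor_conc, rates):
    """X-intercept of rate versus inhibitor, in the inhibitor's own units."""
    slope, intercept = fit_line(inhibitor_conc, rates)
    if slope >= 0:
        raise ValueError("rate must decrease with inhibitor concentration")
    return -intercept / slope


# Hypothetical titration: inhibitor concentration and residual rate (RFU/s).
inhibitor = [0.0, 5.0, 10.0, 20.0]
rates_rfu_per_s = [12.0, 9.0, 6.0, 0.0]

titer = active_site_concentration(inhibitor, rates_rfu_per_s)
print(titer)
```

The result is expressed in the same units and on the same dilution basis as the inhibitor values supplied, so any dilution of inhibitor or enzyme into the assay volume has to be applied consistently before comparing the two.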

[0091] Michaelis-Menten Analysis of Mutant Citrulline-Dependent Trypsin. 80 µL) was combined with EK (7.6 fmol) and incubated (30 min, RT). Activated enzyme (2.5 µL) was combined with PBS (20 µL) and incubated (1.5 h, 37 °C) before adding (acIPcit)2R110 probe S4 ([probe] = 0, 0.5, 1, 2, 4, 8, 16, 32, 64, 128, 256, or 512 µM). After probe addition, the microplate was analyzed (Gemini XPS; Molecular Devices, λex = 494 nm, λem = 525 nm), monitoring fluorescence every 60 s. A similar analysis was performed using the BHQ1-IPcitAA-FAM probe S6 as substrate and using sequencing-grade trypsin (6 µM) in combination with either (cbz-IPR)2R110 probe or BHQ1-IPRAA-TMR probe S7 as substrates. For kinetic analysis of trypsin with probe S4, sequencing-grade trypsin ([trypsin] = 22, 11, or 5.5 µM in 50 mM acetic acid, 12.5 µL) was combined with buffer (100 mM AMBIC, pH 8.3, 10 µL), incubated (30 min, 37 °C), and combined with substrate (S4, 1.9 mM in DMSO, 2.5 µL). After substrate addition, the microplate was analyzed (Gemini XPS; Molecular Devices, λex = 494 nm, λem = 525 nm), monitoring fluorescence every 10 s for 400 s. KM and kcat were determined using nonlinear regression with rate given in RFUs/s and concentrations in µM. RFU was converted to concentration (µM) using the following empirical formulas: [FAM] = (RFU_FAM - 15.1)/2,680 and [TMR] = (RFU_TMR - 39.8)/1,600.
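By way of illustration only, the RFU-to-concentration conversion and a Michaelis-Menten fit can be sketched as follows. The disclosure does not specify the regression software; the sketch substitutes a simple scheme in which, for each trial KM, the best Vmax is computed in closed form and KM is then located by a golden-section search over log10(KM). The rate data are synthetic, and all function names are illustrative.

```python
import math


def rfu_to_um_fam(rfu):
    """Empirical conversion from the text: [FAM] = (RFU - 15.1) / 2,680."""
    return (rfu - 15.1) / 2680.0


def rfu_to_um_tmr(rfu):
    """Empirical conversion from the text: [TMR] = (RFU - 39.8) / 1,600."""
    return (rfu - 39.8) / 1600.0


def michaelis_menten(s, vmax, km):
    return vmax * s / (km + s)


def _vmax_and_sse(km, s_values, rates):
    """Best Vmax for a fixed Km (linear least squares) and its squared error."""
    x = [s / (km + s) for s in s_values]
    vmax = sum(xi * v for xi, v in zip(x, rates)) / sum(xi * xi for xi in x)
    sse = sum((v - vmax * xi) ** 2 for xi, v in zip(x, rates))
    return vmax, sse


def fit_michaelis_menten(s_values, rates, km_lo=1e-2, km_hi=1e4, iters=100):
    """Golden-section search over log10(Km); assumes a single minimum."""
    a, b = math.log10(km_lo), math.log10(km_hi)
    phi = (math.sqrt(5) - 1) / 2
    for _ in range(iters):
        c = b - phi * (b - a)
        d = a + phi * (b - a)
        if _vmax_and_sse(10 ** c, s_values, rates)[1] < _vmax_and_sse(10 ** d, s_values, rates)[1]:
            b = d
        else:
            a = c
    km = 10 ** ((a + b) / 2)
    vmax, _ = _vmax_and_sse(km, s_values, rates)
    return vmax, km


# Synthetic example: substrate series from the text (µM), made-up rates (µM/s).
substrate_um = [0.5, 1, 2, 4, 8, 16, 32, 64, 128, 256, 512]
true_vmax, true_km = 2.0, 50.0
rates = [michaelis_menten(s, true_vmax, true_km) for s in substrate_um]

vmax_fit, km_fit = fit_michaelis_menten(substrate_um, rates)
print(vmax_fit, km_fit)
```

Because the offsets 15.1 and 39.8 cancel when a slope is taken, a rate in RFU/s converts to µM/s by dividing by 2,680 (FAM) or 1,600 (TMR) alone; kcat then follows as Vmax divided by the active-site concentration.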

[0092] Digestion and LC-MS Analysis of Model Fluorogenic Peptides BHQ1-IPcitAA-FAM and BHQ1-IPRAA-TMR. Sequencing-grade trypsin and trypsin+cit stock solutions (6 µM, 50 µL) were prepared. Trypsin+cit was combined with EK (22 fmol) and incubated (30 min, RT). Negative control enzyme solution (50 µL) was prepared by combining EK (22 fmol) and buffer (50 mM AMBIC, pH 8.3, 50 µL). Digestion reactions were prepared in flat-bottom black 384-well microplates (Corning, Inc.) by combining buffer (50 mM AMBIC, pH 8.3, 14 µL), protease stock (trypsin, EK-activated trypsin+cit, or negative control, 3 µL), and either BHQ1-IPcitAA-FAM S6 (132 µM in DMSO, 1 µL) or BHQ1-IPRAA-TMR S7 (273 µM in DMSO, 3 µL). The microplates were analyzed (Gemini XPS; Molecular Devices, λex = 494 and 555 nm, λem = 525 and 587 nm), monitoring fluorescence every 60 s for 1,400 s. Digestion reactions were quenched with TFA (0.1%, 82 µL). An aliquot of quenched digestion (8 µL) was separated by reversed-phase HPLC (Eclipse XDB C18, 4.6 × 15 mm, 3.5 µm; Agilent) with gradient elution (mobile phase A: 0.1% TFA in H2O; mobile phase B: ACN; 5% B in 2 min, 5-80% B in 13 min; 1.25 mL/min). Column eluate was directly infused into the ESI source of a tandem mass spectrometer (LTQ XL MS; Thermo Fisher Scientific). In the first 7 min of the LC-MS/MS analysis, the instrument was set to perform one MS1 scan (300-2,000 m/z) followed by one data-dependent MS2 and one data-dependent MS3 scan (most intense ion with minimum intensity of 2,500 counts, charge state of +2, isolation width window ± 2 m/z, and normalized CID collision energy 35%). In the last 15 min of the LC-MS/MS analysis, the instrument was set to perform MS1 only (300-2,000 m/z).

[0093] Autodeimination, Filter-Aided Digestion, and MS Analysis of PAD4. Purified, recombinant human PAD4 (35 µM, 24 µL) was combined with calcium-containing buffer (100 mM Hepes, pH 7.6, 50 mM NaCl, 10 mM CaCl2, 20 mM DTT, 156 µL) and incubated with shaking (37 °C, 1 h, 400 rpm). A control sample was prepared by adding enzyme (24 µL) to Hepes buffer (156 µL). Autodeiminated PAD4 (adPAD4) and control PAD4 (PAD4) were quenched by diluting aliquots of each (60 µL) in denaturing buffer (500 µL) in a molecular weight cutoff filter (10,000 MWCO; EMD Millipore) and centrifuged (20 min, RT, 14,000 × g). Reducing agent (200 mM DTT, 7 µL) was added to the filter, the sample incubated (2 h, 37 °C, 400 rpm), and the filter centrifuged (20 min, RT, 14,000 × g). Iodoacetamide (500 mM, 14 µL) was added to the filter and the sample incubated protected from light (1 h, RT). Denaturing buffer (200 µL) was added and the filter centrifuged (30 min, 14,000 × g). Citrulline-dependent mutant protease (14 µM, 30 µL) was combined with buffer (100 mM AMBIC, pH 8.3, 10 µL) and then combined with EK (30 fmol). Control enzyme solution was prepared by combining buffer (50 mM AMBIC, pH 8.3, 40 µL) and EK (30 fmol). Trypsin was prepared from stock (22 µM, 18 µL) by dilution in buffer (100 mM AMBIC, pH 8.3, 22 µL) and EK (1/10 dilution, 4 µL) just before digestion. Buffer (50 mM AMBIC, pH 8.3, 200 µL) was applied to the filter and the filter centrifuged (20 min, 14,000 × g). Buffer (50 mM AMBIC, pH 8.3, 100 µL) was applied to the filter together with an aliquot (6 µL) of mutant protease, trypsin, or control enzyme solution. The filters were incubated with shaking (15 min, 37 °C, 750 rpm) and then incubated (37 °C, overnight). Digestion volumes were adjusted to 100 µL with buffer (50 mM AMBIC, pH 8.3); an aliquot (50 µL) was removed for electrophoretic analysis; the filter centrifuged (20 min, 14,000 × g); and the filtrate collected. The filtrate volume was adjusted to 80 µL (0.3% TFA in DI H2O), reduced in volume to 30 µL (Speed-Vac; Thermo Fisher Scientific), and an aliquot (20 µL) was desalted (ZipTip; EMD Millipore). The eluted product (20 µL in 1:1 0.1% TFA in H2O:ACN) was divided to provide an aliquot (0.5 µL) for MALDI-TOF analysis (4700 Voyager; Sciex); the aliquot was spotted to a MALDI-TOF MS target plate, covered in HCCA matrix solution (see above), and mass analyzed. The remainder was evaporatively dried to remove excess ACN. adPAD4 tryptic digest (~100 fmol) and adPAD4 mutant protease digest (~3 pmol) were analyzed by LC-MS/MS. Peptides were concentrated and desalted on an RP precolumn (Acclaim PepMap 100 nanoViper, 0.075 × 20 mm; Thermo Fisher Scientific) and resolved using reversed-phase HPLC (Acclaim PepMap RSLC nanoViper, 0.075 × 150 mm; Thermo Fisher Scientific), with gradient elution (300 nL/min, mobile phase A: 0.1% [vol/vol] formic acid in H2O; mobile phase B: 0.1% [vol/vol] formic acid and 80% [vol/vol] ACN in H2O; 5% B 3 min, 5-40% B 60 min, 40-80% B 2 min). Eluate was directly infused into the ESI source of the tandem mass spectrometer (Q Exactive; Thermo Fisher Scientific). Data-dependent MS/MS selected the 10 most intense precursors from m/z = 380-1,600 performed at 70,000 resolution. Precursors were collisionally fragmented (HCD, normalized collision energy = 27%).

[0094] Analysis of Citrullinated Fibrinogen. Fibrinogen (FBG) and PAD4-citrullinated fibrinogen (cFBG; Cayman Chemical) were analyzed without further purification. cFBG and FBG (2.5 mg/mL, 12 µL) were combined with denaturing buffer (300 µL) and reducing agent (1 M DTT, 6 µL) in a molecular weight cutoff filter (10,000 MWCO; EMD Millipore) and incubated (2 h, 37 °C, 750 rpm). The filter was centrifuged (20 min, 14,000 × g). Iodoacetamide (500 mM, 24 µL) was added to the filter, the sample incubated protected from light (1 h, RT), and the reactants removed by centrifugation (20 min, 14,000 × g). Buffer (50 mM AMBIC, pH 8.3, 230 µL) was applied to the filter and the filter centrifuged (30 min, 14,000 × g). Mutant protease (14 µM, 30 µL) was combined with buffer (100 mM AMBIC, pH 8.3, 10 µL) and then activated by adding EK (30 fmol). Control enzyme solution was prepared by combining buffer (50 mM AMBIC, pH 8.3, 40 µL) and EK (30 fmol). Trypsin was prepared from stock (22 µM, 18 µL) by dilution in buffer (100 mM AMBIC, pH 8.3, 22 µL) and EK (1/10 dilution, 4 µL) just before digestion. Buffer (50 mM AMBIC, pH 8.3, 100 µL) was applied to the filter together with an aliquot (6 µL) of mutant enzyme, trypsin, or control enzyme solution, and the filters incubated (37 °C, overnight). The filters were centrifuged (30 min, 14,000 × g) and the filtrate collected. Denaturing buffer (50 µL) was applied to the filter, the filter incubated (1 h, 37 °C; 5 min, 95 °C; 5 min, 37 °C), inverted, and centrifuged into a new collection tube (5 min, 14,000 × g). Samples retained on the filter (50 µL) were combined with 6× Laemmli sample buffer (10 µL), incubated (5 min, 95 °C), and analyzed using SDS/PAGE (4-20% Mini-PROTEAN Tris-Glycine, 120 V, 60 min; Bio-Rad). Peptide solution volumes were adjusted to 90 µL with DI H2O, evaporatively dried to 25 µL (Speed-Vac), and acidified with TFA (5%, 5 µL). An aliquot of peptide mixture (20 µL) was desalted (ZipTip C18; EMD Millipore). The eluted product (20 µL in 1:1 0.1% TFA in H2O:ACN) was divided to provide an aliquot (0.5 µL) for MALDI-TOF MS analysis as described above (AB Sciex; Voyager). The remainder was evaporatively dried (Speed-Vac) to ~1 µL, diluted (0.1% formic acid, 12 µL), and an aliquot (cFBG and FBG mutant protease digest, 8 µL) was directly subjected to LC-MS/MS analysis. Trypsin digest was diluted to 10 fmol/µL (with 0.1% formic acid) and subjected to LC-MS/MS analysis. The samples were resolved using reversed-phase HPLC (0.075 × 150 mm Acclaim PepMap RSLC nanoViper, EASY-nLC 1000; Thermo Fisher Scientific) with gradient elution (300 nL/min, mobile phase A: 0.1% [vol/vol] formic acid in H2O; mobile phase B: 0.1% [vol/vol] formic acid, 80% [vol/vol] ACN in H2O; 5% B for 15 min; 5-40% B for 45 min, 40-80% B for 10 s, 80% B for 5 min), and eluent was directly infused to the ESI source of a tandem mass spectrometer (Orbitrap Fusion; Thermo Fisher Scientific). Mass spectra were acquired in data-dependent MS2 mode using Top Speed precursor selection in a survey scan from 380 to 1,400 m/z with Orbitrap detection. Data-dependent MS2 was performed with HCD fragmentation (normalized collision energy 30%) and detection in the Orbitrap. Protein identification was carried out using Mascot and X! Tandem (34, 35). Variable modifications NQR (deamidated) and M (oxidation) were permitted, carbamidomethylation of Cys was a fixed modification, one missed cleavage of a nonspecific enzyme was permitted, and mass tolerance of 5 ppm and 0.02 Da was set for precursor and fragment ions, respectively. MS/MS raw files were searched against a customized database containing the amino acid sequences of the proteins of interest (UniProtKB accession nos. P02671, P02671_2, P02675, P02679, and Q9UM07).
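By way of illustration only, the search tolerances above can be related to the mass change of deimination. Conversion of arginine to citrulline replaces an NH group with O, giving the same monoisotopic shift as deamidation, which is presumably why a deamidated variable modification on R is used to register citrulline. The sketch below computes that shift and a 5 ppm precursor check; the peptide mass is hypothetical and the function names are illustrative.

```python
# Monoisotopic atomic masses in Da.
MASS_O = 15.99491462
MASS_N = 14.00307400
MASS_H = 1.00782503
PROTON = 1.00727647

# Net change for arginine -> citrulline (or N/Q deamidation): +O, -N, -H.
DEIMINATION_SHIFT = MASS_O - MASS_N - MASS_H


def mz(neutral_mass: float, charge: int) -> float:
    """m/z of a positively charged ion formed by adding `charge` protons."""
    return (neutral_mass + charge * PROTON) / charge


def ppm_error(observed: float, theoretical: float) -> float:
    return (observed - theoretical) / theoretical * 1e6


def within_ppm(observed: float, theoretical: float, tol_ppm: float = 5.0) -> bool:
    return abs(ppm_error(observed, theoretical)) <= tol_ppm


# Hypothetical peptide of neutral mass 1500.0 Da observed as a 2+ ion.
unmodified = mz(1500.0, 2)
citrullinated = mz(1500.0 + DEIMINATION_SHIFT, 2)
window_da = unmodified * 5.0 / 1e6  # half-width of a 5 ppm window in m/z

print(round(DEIMINATION_SHIFT, 6))
print(round(citrullinated - unmodified, 6), round(window_da, 6))
print(within_ppm(citrullinated, unmodified))
```

At this mass and charge the deimination shift in m/z is more than a hundred times the 5 ppm window, so a citrullinated precursor cannot be matched to its unmodified counterpart within the stated tolerance.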

[0095] It is understood that the examples and embodiments described herein are for illustrative purposes only and that various modifications or changes in light thereof will be suggested to persons skilled in the art and are to be included within the spirit and purview of this application and scope of the appended claims. Although any methods and materials similar or equivalent to those described herein can be used in the practice or testing of the present invention, the preferred methods and materials are described.

[0096] All patents, patent applications, published applications and publications, GenBank sequences, databases, ATCC deposits, websites and other published materials referred to throughout the entire disclosure herein, unless noted otherwise, are incorporated by reference in their entirety and for all purposes as if each is individually so denoted.