
Title:
COMPUTER REPRESENTATIONS OF PEPTIDES FOR EFFICIENT DESIGN OF DRUG CANDIDATES
Document Type and Number:
WIPO Patent Application WO/2023/200866
Kind Code:
A1
Abstract:
In some aspects, the present disclosure describes a method for obtaining a latent representation. In some cases, the method comprises providing a peptide graph comprising a plurality of nodes and a plurality of edges. In some cases, the plurality of edges encodes bonded interactions and nonbonded interactions between the plurality of nodes. In some cases, the method comprises generating the latent representation for a node in the plurality of nodes. In some cases, the generating is based at least partially on the peptide graph. In some cases, the latent representation encodes short-range interactions and long-range interactions between the node and at least a subset of nodes in the plurality of nodes.

Inventors:
NYSTROM NICHOLAS A (US)
SRINIVASAN SRILOK (US)
STECKBECK JONATHAN D (US)
Application Number:
PCT/US2023/018335
Publication Date:
October 19, 2023
Filing Date:
April 12, 2023
Assignee:
PEPTILOGICS INC (US)
International Classes:
G16C20/50; G06N3/02; G16B15/30; G16B20/30; G16B40/00; G16C20/70; G16C10/00; G16C60/00
Domestic Patent References:
WO2021243106A12021-12-02
Foreign References:
US20210375393A12021-12-02
Other References:
RASCHKA ET AL.: "Machine learning and AI-based approaches for bioactive ligand discovery and GPCR-ligand recognition", METHODS, vol. 180, 6 July 2020 (2020-07-06), pages 89 - 110, XP086298548, Retrieved from the Internet [retrieved on 20230802], DOI: 10.1016/j.ymeth.2020.06.016
JIANG MINGJIAN, LI ZHEN, ZHANG SHUGANG, WANG SHUANG, WANG XIAOFENG, YUAN QING, WEI ZHIQIANG: "Drug–target affinity prediction using graph neural network and contact maps", RSC ADVANCES, ROYAL SOCIETY OF CHEMISTRY, GB, vol. 10, no. 35, 1 June 2020 (2020-06-01), GB , pages 20701 - 20712, XP055806161, ISSN: 2046-2069, DOI: 10.1039/D0RA02297G
KARINA ZADOROZHNY; LADA NUZHNA: "Deep Generative Models for Drug Design and Response", ARXIV.ORG, Cornell University Library, Ithaca, NY, 14 September 2021 (2021-09-14), XP091053806
SENGUPTA ET AL.: "Role of long- and short-range hydrophobic, hydrophilic and charged residues contact network in protein's structural organization", BMC BIOINFORMATICS, vol. 13, 21 June 2012 (2012-06-21), XP021115025, Retrieved from the Internet [retrieved on 20230802], DOI: 10.1186/1471-2105-13-142
GIULINI MARCO, RIGOLI MARTA, MATTIOTTI GIOVANNI, MENICHETTI ROBERTO, TARENZI THOMAS, FIORENTINI RAFFAELE, POTESTIO RAFFAELLO: "From System Modeling to System Analysis: The Impact of Resolution Level and Resolution Distribution in the Computer-Aided Investigation of Biomolecules", FRONTIERS IN MOLECULAR BIOSCIENCES, vol. 8, XP093102482, DOI: 10.3389/fmolb.2021.676976
Attorney, Agent or Firm:
GIL, Phwey (US)
Claims:
CLAIMS

What is claimed is:

1. A method for training neural networks using short-range and long-range interactions of a peptide, comprising:

(a) constructing a peptide graph based at least in part on a peptide structure of the peptide, wherein the peptide graph comprises a plurality of nodes and a plurality of edges, and wherein the plurality of edges represents bonded and nonbonded interactions between the plurality of nodes of the peptide graph;

(b) constructing a short-range subgraph of the peptide graph, wherein the short-range subgraph comprises (i) a first central node and (ii) a set of short-range nodes within a first range from the first central node;

(c) constructing a long-range subgraph of the peptide graph, wherein the long-range subgraph comprises a set of long-range nodes within a second range from a second central node, wherein the set of long-range nodes comprises at least one anchor node, wherein the first range and the second range overlap such that, when the first central node and the second central node are the same, the set of short-range nodes and the set of long-range nodes comprise the at least one anchor node;

(d) constructing a first neural network based at least in part on the short-range subgraph and a second neural network based at least in part on the long-range subgraph;

(e) processing feature values of the short-range subgraph using the first neural network to generate a short-range latent representation, and processing feature values of the long-range subgraph using the second neural network to generate a long-range latent representation; and

(f) updating parameters of the first neural network and the second neural network based at least in part on the short-range latent representation and the long-range latent representation.
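For orientation only, the following is a minimal sketch of one way steps (a)-(c) of claim 1 could be realized in software. It assumes NetworkX, uses hop counts for the first and second ranges (bounded by K0, K1, and K2 as in dependent claims 13 and 14), and every function name and default value is hypothetical rather than taken from the application.

# Illustrative sketch, not the claimed implementation.
import networkx as nx

def build_peptide_graph(nodes, bonded_edges, nonbonded_edges):
    """Peptide graph whose edges carry a 'bonded' flag (claim 1, step (a))."""
    g = nx.Graph()
    g.add_nodes_from(nodes)
    g.add_edges_from(bonded_edges, bonded=True)
    g.add_edges_from(nonbonded_edges, bonded=False)
    return g

def short_range_subgraph(g, center, k0=2):
    """Nodes within K0 edges of the central node (step (b); claim 13)."""
    dist = nx.single_source_shortest_path_length(g, center, cutoff=k0)
    return g.subgraph(dist.keys()).copy()

def long_range_subgraph(g, center, k1=2, k2=4):
    """Nodes whose hop distance lies in [K1, K2] (step (c); claim 14).
    With K1 <= K0, nodes at the shared distance fall in both sets and can
    serve as the anchor nodes when the two central nodes coincide."""
    dist = nx.single_source_shortest_path_length(g, center, cutoff=k2)
    keep = [n for n, d in dist.items() if k1 <= d <= k2]
    return g.subgraph(keep).copy()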

2. The method of claim 1, wherein (f) further comprises updating parameters of the first neural network and the second neural network such that (i) the short-range latent representation and the long-range latent representation are similar when the first central node and the second central node are the same, and (ii) the short-range latent representation and the long-range latent representation are dissimilar when the first central node and the second central node are not the same.

3. The method of claim 1 or 2, wherein the peptide structure is obtained at least in part by a molecular simulation.

4. The method of claim 3, wherein the molecular simulation is based at least in part on an electronic structure calculation, a force-field based calculation, molecular dynamics, a Monte Carlo simulation, or any combination thereof.

5. The method of any one of claims 1-4, wherein the peptide structure is obtained using a machine learning algorithm.

6. The method of any one of claims 1-5, wherein the peptide structure is obtained from an experiment or a structure database.

7. The method of any one of claims 1-6, wherein the plurality of nodes comprises atoms in the peptide structure.

8. The method of any one of claims 1-7, wherein the plurality of nodes comprises functional groups in the peptide structure.

9. The method of any one of claims 1-8, wherein the plurality of nodes comprises amino acids in the peptide structure.

10. The method of any one of claims 1-9, wherein the plurality of nodes comprises secondary structures in the peptide structure.

11. The method of any one of claims 1-10, wherein the plurality of nodes comprises tertiary structures in the peptide structure.

12. The method of any one of claims 1-11, wherein the plurality of nodes comprises quaternary structures in the peptide structure.

13. The method of any one of claims 1-12, wherein the first range from the first central node is less than or equal to K0 number of edges from the first central node, wherein K0 is at least 1.

14. The method of any one of claims 1-13, wherein the second range from the second central node is greater than or equal to K1 number of edges from the second central node and less than or equal to K2 number of edges from the second central node, wherein K1 is at least 2, and wherein K2 is at least 3.

15. The method of claim 13 or 14, wherein K0 is at least 2, 3, 4, 5, 6, 7, or 8.

16. The method of claim 14 or 15, wherein K1 is at least 3, 4, 5, 6, 7, 8, or 9.

17. The method of any one of claims 14-16, wherein K2 is at least 4, 5, 6, 7, 8, 9, or 10.

18. The method of any one of claims 1-17, wherein the long-range subgraph comprises at least one connected graph.

19. The method of any one of claims 1-18, wherein the long-range subgraph comprises at least two disconnected graphs.

20. The method of any one of claims 1-19, wherein the at least one anchor node comprises at least 2, 3, 4, 5, 6, 7, 8, 9, or 10 nodes.

21. The method of any one of claims 1-20, wherein the first neural network comprises a graph neural network architecture.

22. The method of any one of claims 13-21, wherein the first neural network comprises at least K0 layers.

23. The method of any one of claims 1-22, wherein the second neural network comprises a graph neural network architecture.

24. The method of any one of claims 13-23, wherein the second neural network comprises at least K0 layers.

25. The method of any one of claims 17-24, wherein the second neural network comprises at most K2 layers.

26. The method of any one of claims 22-25, wherein a layer comprises an attention mechanism, a generalized message-passing graph neural network, or both.

27. The method of claim 26, wherein the generalized message-passing graph neural network comprises a graph convolutional neural network.

28. The method of any one of claims 1-27, wherein the short-range latent representation comprises a latent representation of the first central node.

29. The method of any one of claims 1-28, wherein the long-range latent representation comprises a pooled latent representation of the at least one anchor node.
30. The method of any one of claims 1-29, wherein the feature values of the short-range subgraph, the feature values of the long-range subgraph, or both comprise an identifier of an atom, a functional group, an amino acid, a secondary structure, a tertiary structure, a quaternary structure, or a motif.

31. The method of any one of claims 1-30, wherein the feature values of the short-range subgraph, the feature values of the long-range subgraph, or both comprise a physical property of an atom, a functional group, an amino acid, a secondary structure, a tertiary structure, a quaternary structure, or a motif.

32. The method of claim 30 or 31, wherein the motif represents a collection of atoms.

33. The method of any one of claims 1-32, wherein the feature values of the short-range subgraph, the feature values of the long-range subgraph, or both comprise a geometric descriptor of an atom, an amino acid, a secondary structure, a tertiary structure, a quaternary structure, or a motif.

34. The method of any one of claims 1-33, wherein the feature values of the short-range subgraph, the feature values of the long-range subgraph, or both comprise an electronic configuration, a charge, a size, a bond angle, a dihedral angle, or any combination thereof.

35. The method of any one of claims 30-34, wherein the functional group comprises a BRICS motif.

36. The method of any one of claims 1-35, wherein (f) further comprises evaluating similarity or dissimilarity between the first neural network and the second neural network based at least in part on contrastive learning.

37. The method of claim 36, wherein the contrastive learning further comprises optimizing a contrastive loss function based at least in part on a dot product between the short-range latent representation and the long-range latent representation.

38. The method of claim 37, wherein the optimizing is performed such that a sigmoid of the dot product is equal to one when the first central node and the second central node are the same, and the sigmoid of the dot product is equal to zero when the first central node and the second central node are not the same.

39. The method of any one of claims 1-38, wherein the method is implemented using one or more computer processors.

40. The method of claim 39, further comprising providing access to the first neural network and the second neural network to one or more users via a terminal, a web browser, an application, or a server infrastructure.

41. The method of claim 40, wherein the terminal, the web browser, or the application comprises a graphical user interface for (i) receiving instructions for generating a latent representation of a given peptide from the one or more users, and (ii) providing the latent representation of the given peptide to the one or more users based at least in part on the instructions.

42. The method of claim 40 or 41, wherein the server infrastructure is configured to transmit information to a second computer, wherein the information comprises a latent representation of a given peptide.

43. The method of any one of claims 40-42, wherein the server infrastructure is configured to transmit information to a second computer, wherein the information comprises a peptide chemical structure for a given latent representation.

44. A computer program product comprising a computer-readable medium having computer-executable code encoded therein, the computer-executable code adapted to be executed to implement any one of the methods of claims 1-43.
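As a companion to claims 36-38 above, the sketch below shows one plausible form of the contrastive objective: the sigmoid of the dot product between the short-range and long-range latent representations is driven toward one for matching central nodes and toward zero otherwise. The encoder modules, batching, and optimizer are assumptions; only the sigmoid-of-dot-product criterion comes from the claims.

# Illustrative sketch, not the claimed implementation.
import torch
import torch.nn.functional as F

def contrastive_loss(z_short, z_long, same_center):
    """z_short, z_long: (batch, dim) outputs of the first and second neural
    networks; same_center: (batch,) targets equal to 1.0 when the two central
    nodes are the same and 0.0 otherwise (claims 37-38)."""
    logits = (z_short * z_long).sum(dim=-1)  # dot product per pair
    return F.binary_cross_entropy_with_logits(logits, same_center)

# Step (f) of claim 1 then becomes an ordinary gradient update, e.g.:
#   loss = contrastive_loss(first_nn(short_batch), second_nn(long_batch), y)
#   loss.backward(); optimizer.step()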
45. The computer program product of claim 44, wherein the computer-executable code comprises graphical flows between a plurality of tensor operations that define the neural network.

46. A non-transitory computer-readable storage media encoded with a computer program including instructions executable by one or more processors to perform any one of the methods or computer-implemented methods of claims 1-43.

47. The non-transitory computer-readable storage media of claim 46, comprised in a distributed computing system, a cloud-based computing system, or a supercomputer comprising a plurality of graphical processing units.

48. The non-transitory computer-readable storage media of claim 46 or 47, wherein the computer program is encoded in a human non-readable format.

49. A computer-implemented system comprising: a digital processing device comprising: at least one processor, an operating system configured to perform executable instructions, a memory, and a computer program including instructions executable by the digital processing device to perform any one of the methods or computer-implemented methods of claims 1-43.

50. A method for obtaining a latent representation, comprising:

(a) providing a peptide graph comprising a plurality of nodes and a plurality of edges, wherein the plurality of edges encodes bonded interactions and nonbonded interactions between the plurality of nodes; and (b) generating the latent representation for a node in the plurality of nodes, wherein the generating is based at least in part on the peptide graph, and wherein the latent representation encodes short-range interactions and long-range interactions between the node and at least a subset of nodes in the plurality of nodes. The method of claim 50, wherein the generating further comprises:

(a) constructing a short-range subgraph of the peptide graph, wherein the short-range subgraph comprises (i) the node and (ii) a set of short-range nodes within a first range from the node;

(b) constructing a first neural network based at least in part on the short-range subgraph; and

(c) processing feature values of the short-range subgraph into the first neural network to generate the latent representation for the node. The method of claim 50 or 51, wherein the generating further comprises:

(a) constructing a long-range subgraph of the peptide graph, wherein the long-range subgraph comprises (i) the node and (ii) a set of long-range nodes within a second range from the node, wherein the set of long-range nodes comprises one or more anchor nodes;

(b) constructing a second neural network based at least in part on the long-range subgraph;

(c) processing feature values of the long-range subgraph into the second neural network to generate one or more latent representations for the one or more anchor nodes; and

(d) combining the one or more latent representation to generate the latent representation for the node. The method of any one of claims 50-52, wherein the latent representation further encodes one or more features of atoms, functional groups, amino acids, secondary structures, tertiary structures, quaternary structures, or any combination thereof. The method of any one of claims 50-53, wherein the plurality of nodes comprises atoms in the peptide structure. The method of any one of claims 50-54, wherein the plurality of nodes comprises functional groups in the peptide structure. The method of any one of claims 50-55, wherein the plurality of nodes comprises amino acids in the peptide structure. The method of any one of claims 50-56, wherein the plurality of nodes comprises secondary structures in the peptide structure. The method of any one of claims 50-57, wherein the plurality of nodes comprises tertiary structures in the peptide structure. The method of any one of claims 50-58, wherein the plurality of nodes comprises quaternary structures in the peptide structure. The method of any one of claims 51-59, wherein the first range from the node is less than or equal to K0 number of edges from the node, wherein K0 is at least 1. The method of any one of claims 52-60, wherein the second range from the node is greater than or equal to KI number of edges from the node and less than or equal to K2 number of edges from the node, wherein KI is at least 2, and wherein K2 is at least 3. The method of claim 60 or 61, wherein KO is at least 2, 3, 4, 5, 6, 7, or 8. The method of claim 61 or 62, wherein KI is at least 3, 4, 5, 6, 7, 8, or 9. The method of any one of claims 61-63, wherein K2 is at least 4, 5, 6, 7, 8, 9, or 10. The method of any one of claims 51-64, wherein the first neural network comprises a graph neural network architecture. The method of any one of claims 51-65, wherein the first neural network comprises at least KO layers. The method of any one of claims 52-66, wherein the second neural network comprises a graph neural network architecture. The method of any one of claims 52-67, wherein the second neural network comprises at least KO layers. The method of any one of claims 52-68, wherein the second neural network comprises at most K2 layers. The method of any one of claims 66-69, wherein a layer comprises an attention mechanism, a generalized message-passing graph neural network, or both. The method of claim 70, wherein the generalized message-passing graph neural network comprises a graph convolutional neural network. The method of any one of claims 52-71, wherein the feature values of the short-range subgraph, the feature values of the long-range subgraph, or both comprise an identifier of an atom, a functional group, an amino acid, a secondary structure, a tertiary structure, or a motif. The method of any one of claims 52-72, wherein the feature values of the short-range subgraph, the feature values of the long-range subgraph, or both comprise a physical property of an atom, a functional group an amino acid, a secondary structure, a tertiary structure, or a motif. The method of claim 72 or 73, wherein the motif represents a collection of atoms. The method of any one of claims 52-74, wherein the feature values of the short-range subgraph, the feature values of the long-range subgraph, or both comprise geometric descriptor of an atom, an amino acid, a secondary structure, or a tertiary structure. 
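A brief sketch of how the inference path of claims 50-52 above might look, reusing the hypothetical subgraph helpers shown after claim 1: the node's latent representation combines its short-range encoding with a pooling of the anchor-node encodings from the long-range subgraph. The mean pooling and the concatenation are illustrative assumptions.

# Illustrative sketch, not the claimed implementation.
import torch

def node_latent_representation(node, graph, first_nn, second_nn):
    sr = short_range_subgraph(graph, node)      # claim 51(a)
    lr = long_range_subgraph(graph, node)       # claim 52(a)
    z_short = first_nn(sr)                      # claim 51(c)
    z_anchors = second_nn(lr)                   # (num_anchors, dim); claim 52(c)
    z_long = z_anchors.mean(dim=0)              # claim 52(d): combine anchor outputs
    return torch.cat([z_short, z_long], dim=-1)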
The method of any one of claims 52-75, wherein the feature values of the short-range subgraph, the feature values of the long-range subgraph, or both comprise an electronic configuration, a charge, a size, a bond angle, a dihedral angle, or any combination thereof. The method of any one of claims 72-76, wherein the functional group comprises a BRICS motif. The method of any one of claims 50-77, wherein latent representation is embedded in a latent space, wherein the latent space is organized based at least in part on nonbonded interactions, long-range interactions, or both. The method of any one of claims 50-78, wherein latent representation is embedded in a latent space, wherein the latent space is organized based at least in part on bonded interactions, short- range interactions, or both. The method of any one of claims 50-79, further comprising generating a peptide latent representation at least in part by aggregating a plurality of latent representations over each node in the peptide graph. The method of any one of claims 50-80, further comprising synthesizing a peptide comprising a structure defined by the peptide graph. The method of any one of claims 50-80, wherein the method is implemented using one or more computer processors. The method of claim 82, further comprising providing access to the latent representation generated in (b) to one or more users via a terminal, a web browser, an application, or a server infrastructure. The method of claim 83, wherein the terminal, the web browser, or the application comprises a graphical user interface for (i) receiving instructions for generating a query latent representation of a query peptide graph from the one or more users, and (ii) providing the query latent representation of the query peptide graph to the one or more users based at least in part on the instructions. The method of claim 83 or 84, wherein the server infrastructure is configured to transmit information to a second computer, wherein the information comprises a query latent representation of a query peptide. The method of any one of claims 83-85, wherein the server infrastructure is configured to transmit information to a second computer, wherein the information comprises a peptide chemical structure associated with a query latent representation. A computer program product comprising a computer-readable medium having computerexecutable code encoded therein, the computer-executable code adapted to be executed to implement any one of the computer-implemented methods of claims 83-86. The computer program product of claim 87, wherein the computer-executable code comprises graphical flows between a plurality of tensor operations that define the neural network. A non-transitory computer-readable storage media encoded with a computer program including instructions executable by one or more processors to perform any one of the computer- implemented methods of claims 83-87. The non-transitory computer-readable storage media of claim 89, comprised in a distributed computing system, a cloud-based computing system, or a supercomputer comprising a plurality of graphical processing units. The non-transitory computer-readable storage media of claim 89 or 90, wherein the non- transitory computer-readable storage media is comprised in a laptop computer, or a personal desktop computer. The non-transitory computer-readable storage media of any one of claims 89-91, wherein the computer program is encoded in a human non-readable format. 
A computer-implemented system comprising: a digital processing device comprising: at least one processor, an operating system configured to perform executable instructions, a memory, and a computer program including instructions executable by the digital processing device to perform any one of the computer-implemented methods of claims 83-92. A method for training a domain-specific neural network, comprising:

(a) providing a trained neural network configured to process at least a subgraph of a peptide graph and generate a latent representation that encodes (i) short-range interactions between a plurality of nodes of the subgraph, and (ii) long-range interactions between (1) at least one node of the subgraph and (2) at least one node in the peptide graph but not in the subgraph;

(b) providing a domain-specific dataset comprising a set of peptide graphs and a set of labels; and

(c) retraining the trained neural network using the domain-specific dataset. The method of claim 94, wherein retraining the trained neural network further comprises, for a given peptide graph in the set of peptide graphs and a given label in the set of labels:

(a) generating a plurality of subgraphs based at least in part on the given peptide graph;

(b) processing the plurality of subgraphs using the trained neural network to generate a plurality of latent representations;

(c) combining the latent representations to generate a prediction value; and

(d) updating at least one parameter of the trained neural network based at least in part on the prediction value. The method of claim 95, wherein the updating at least one parameter of the trained neural network is based at least in part on a difference between the prediction value and the given label. The method of claim 95 or 96, wherein the short-range interactions comprise bonded interactions, nonbonded interactions, or both. The method of claim any one of claims 95-97, wherein the long-range interactions comprise nonbonded interactions, bonded interactions, or both. The method of any one of claims 94-98, wherein the combining further comprises pooling the plurality of latent representations to generate the prediction value. . The method of any one of claims 94-99, wherein the combining further comprises using a machine learning algorithm to generate the prediction value based at least in part on the plurality of latent representations. . The method of any one of claims 94-100, wherein the set of labels comprises (i) a physicochemical property, (ii) an absorption, distribution, metabolism, excretion, or toxicity (ADMET) property, (iii) a chemical reaction, (iv) a synthesizability, (v) a solubility, (vi) a chemical stability, (vii) a protease stability, (viii) a plasma protein binding property, (ix) a peptide-protein or protein-protein binding interaction, (x) an activity, (xi) a selectivity, (xii) a potency, (xiii) a pharmacokinetic property, (xiv) a pharmacodynamic property, (xv) an in vivo safety property, (xvi) a property reflecting a relationship to a biomarker, (xvii) a formulation property, or (xviii) any combination thereof. . The method of any one of claims 94-101, wherein the trained neural network is sufficiently retrained when the set of peptide graphs comprises at most about 1000, 100, 10, 5, 2, or 1 peptide graphs. . The method of any one of claims 94-102, wherein the trained neural network is sufficiently retrained when the set of labels comprises at most about 1000, 100, 10, 5, 2, or 1 labels. . The method of any one of claims 94-103, wherein retraining the trained neural network further comprises: freezing at least a subset of parameters of the neural network during the retraining, fixing a fraction of the weights during the retraining, using a regularization function during the retraining, one or more dropout layers during the retraining, or any combination thereof. . The method of any one of claims 94-104, wherein the method is implemented using one or more computer processors. . The method of claim 105, further comprising providing access to the computer- implemented method to one or more users via a terminal, a web browser, an application, or a server infrastructure. . The method of claim 106, wherein the terminal, the web browser, or the application comprises a graphical user interface for (i) receiving instructions for obtaining a query property prediction of a query peptide graph from the one or more users, and (ii) providing the query property prediction of the query peptide graph to the one or more users based at least in part on the instructions. . The method of any one of claims 105-107, wherein the server infrastructure is configured to transmit information to a second computer, wherein the information comprises a query latent representation of a query peptide. . 
The method of any one of claims 105-108, wherein the server infrastructure is configured to transmit information to a second computer, wherein the information comprises a peptide chemical structure associated with a query latent representation. . A computer program product comprising a computer-readable medium having computerexecutable code encoded therein, the computer-executable code adapted to be executed to implement any one of the computer-implemented methods of claims 105-109. . The computer program product of claim 110, wherein the computer-executable code comprises graphical flows between a plurality of tensor operations that define the neural network.

. A non-transitory computer-readable storage media encoded with a computer program including instructions executable by one or more processors to perform any one of the computer- implemented methods of claims 105-110. . The non-transitory computer-readable storage media of claim 112, comprised in a distributed computing system, a cloud-based computing system, or a supercomputer comprising a plurality of central processing units, graphical processing units, or tensor processing units. . The non-transitory computer-readable storage media of claim 112 or 113, comprised in a laptop computer or a personal desktop computer. . The non-transitory computer-readable storage media of any one of claims 112-114, wherein the computer program is encoded in a human non-readable format. . A computer-implemented system comprising: a digital processing device comprising: at least one processor, an operating system configured to perform executable instructions, a memory, and a computer program including instructions executable by the digital processing device to perform any one of the computer-implemented methods of claims 105-110. . A method of predicting a peptide property of a peptide, comprising:

(a) providing a peptide graph comprising a plurality of nodes and a plurality of edges, wherein the plurality of edges encodes bonded interactions and nonbonded interactions; and

(b) generating a domain-specific latent representation based at least in part on the peptide graph, wherein the domain-specific latent representation encodes domain knowledge for a given domain, and short-range interactions and long-range interactions between the plurality of nodes in the peptide graph; and

(c) generating a prediction of the peptide property based at least in part on the domainspecific latent representation. . The method of claim 117, wherein the short-range interactions comprise the bonded interactions, the nonbonded interactions, or both. . The method of claim 117 or 118, wherein the long-range interactions comprise the nonbonded interactions, the bonded interactions, or both. . The method of any one of claims 117-119, further comprising synthesizing a peptide comprising a structure defined by the peptide graph when the prediction of the peptide property is above or below a predetermined threshold value. . A method for performing a domain-specific prediction using a peptide graph, comprising: (a) providing a machine learning architecture comprising: i. an encoder configured to receive at least a subgraph of the peptide graph and generate a latent representation that encodes (i) short-range interactions between nodes of the subgraph, (ii) long-range interactions between (1) at least one node of the subgraph and (2) at least one node in the peptide graph but not in the subgraph, and (iii) domain-specific information for at least one domain; and ii. a predictor configured to receive the latent representation and generate a prediction value; and

(b) processing the peptide graph using the machine learning architecture and generating the prediction value. . The method of claim 121, wherein the domain-specific information comprises information for at least 2, 3, 4, 5, 10, or 100 domains. . The method of claim 121 or 122, wherein the processing further comprises:

(a) generating a plurality of subgraphs based at least partially on the peptide graph;

(b) processing the plurality of subgraphs to the neural network to generate a plurality of latent representations; and

(c) combining the plurality of latent representations to generate the prediction value. . The method of any one of claims 121-123, further comprising synthesizing a peptide comprising a structure defined by the peptide graph based at least in part on the prediction value. . The method of claim 124, wherein the prediction of the peptide property is above or below a predetermined threshold value. . The method of any one of claims 117-125, wherein the method is implemented using one or more computer processors. . The method of claim 126, further comprising providing access to the computer- implemented method to one or more users via a terminal, a web browser, an application, or a server infrastructure. . The method of claim 127, wherein the terminal, the web browser, or the application comprises a graphical user interface for (i) receiving instructions for obtaining a query property prediction of a query peptide graph from the one or more users, and (ii) providing the query property prediction of the query peptide graph to the one or more users based at least in part on the instructions. . The computer-implemented method of claim 127 or 128, wherein the server infrastructure is configured to transmit information to a second computer, wherein the information comprises a query latent representation of a query peptide. . The computer-implemented method of any one of claims 127-129, wherein the server infrastructure is configured to transmit information to a second computer, wherein the information comprises a peptide chemical structure associated with a query latent representation. . A computer program product comprising a computer-readable medium having computerexecutable code encoded therein, the computer-executable code adapted to be executed to implement any one of the computer-implemented methods of claims 127-130. . The computer program product of claim 131, wherein the computer-executable code comprises graphical flows between a plurality of tensor operations that define the neural network.
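Claims 121-123 describe a two-part architecture: an encoder that produces domain-aware latent representations of peptide subgraphs and a predictor that maps them to a prediction value. Below is a minimal sketch under assumed module interfaces, with mean pooling standing in for the combining step of claim 123.

# Illustrative sketch, not the claimed implementation.
import torch
import torch.nn as nn

class DomainSpecificPredictor(nn.Module):
    def __init__(self, encoder, predictor):
        super().__init__()
        self.encoder = encoder          # claim 121(a)(i)
        self.predictor = predictor      # claim 121(a)(ii)

    def forward(self, subgraphs):
        # Claim 123: encode each subgraph, combine the latent representations,
        # then generate the prediction value.
        latents = torch.stack([self.encoder(sg) for sg in subgraphs])
        return self.predictor(latents.mean(dim=0))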

. A non-transitory computer-readable storage media encoded with a computer program including instructions executable by one or more processors to perform any one of the computer- implemented methods of claims 127-131. . The non-transitory computer-readable storage media of claim 133, comprised in a distributed computing system, a cloud-based computing system, or a supercomputer comprising a plurality of central processing units, graphical processing units, or tensor processing units. . A non-transitory computer-readable storage media encoded with a computer program including instructions executable by one or more processors to perform any one of the computer- implemented methods of claims 127-131. . The non-transitory computer-readable storage media of claim 135, comprised in a laptop computer or a personal desktop computer. . The non-transitory computer-readable storage media of any one of claims 133-135, wherein the computer program is encoded in a human non-readable format. . A computer-implemented system comprising: a digital processing device comprising: at least one processor, an operating system configured to perform executable instructions, a memory, and a computer program including instructions executable by the digital processing device to perform any one of the computer-implemented methods of claims 126-131. . A method for training a neural network to learn hierarchical representations of peptides, comprising:

(a) providing the neural network comprising: i. an encoder configured to receive a peptide representation of a peptide and generate a latent representation based at least in part on the peptide representation, wherein the latent representation encodes short-range interactions, long-range interactions, bonded interactions, non-bonded interactions, or any combination thereof between elements in the peptide representation; and ii. a mapper configured to receive the latent representation and generate a coarse-grained peptide representation of the peptide based at least partially on the latent representation, wherein the coarse-grained peptide representation comprises (i) a set of coarse-grained elements, and (ii) a set of identifiers for the set of coarse-grained elements, wherein the coarse-grained representation comprises a lower resolution than the peptide representation;

(b) providing a training dataset comprising a sample peptide representation and a set of sample labels for the peptide representation; and

(c) training the neural network using the training dataset. . The method of claim 139, wherein training the neural network using the training dataset further comprises: (a) processing the sample peptide representation of the training dataset by the neural network;

(b) generating a sample coarse-grained peptide representation based at least in part on the sample peptide representation, wherein the sample coarse-grained peptide representation comprises (1) a set of sample coarse-grained elements, and (2) a set of sample identifiers for the set of sample coarse-grained elements; and

(c) updating at least one parameter of the neural network based at least in part on the set of sample identifiers. . The method of claim 140, wherein the updating at least one parameter of the neural network is based on a loss function computed based at least partially on the set of sample identifiers and the set of sample labels. . The method of claim 139 or 140, wherein the neural network further comprises: i. a second encoder configured to receive the coarse-grained peptide representation and generate a second latent representation based at least partially on the coarsegrained peptide representation, wherein the second latent representation encodes short-range interactions, long-range interactions, bonded interactions, nonbonded interactions, or any combination thereof between the set of coarsegrained elements in the coarse-grained peptide representation. . The method of claim 142, wherein the neural network further comprises: i. a second mapper configured to receive the second latent representation and generate a second coarse-grained peptide representation based at least partially on the second latent representation, wherein the second coarse-grained peptide representation comprises (i) a second set of coarse-grained elements, and (ii) a second set of identifiers for the second set of coarse-grained elements, wherein the second coarse-grained representation comprises a lower resolution than the coarse-grained representation. . The method of claim 143, wherein the neural network further comprises a third encoder configured to generate a third latent representation based at least in part on the second coarsegrained representation. . The method of claim 144, wherein the neural network further comprises a third mapper configured to generate a third coarse-grained representation based at least in part on the third latent representation, wherein the third coarse-grained representation comprises a lower resolution that the second coarse-grained representation. . The method of any one of claims 139-145, wherein the neural network comprises H number of encoders and mappers, such that the neural network is configured to generate a hierarchical representation comprising H number of resolutions and/or levels of encoding, wherein H is at least 2, 3, 4, 5, 6, 7, 8, 9, or 10.
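The encoder/mapper pairs of claims 139 and 142-147 amount to an alternating stack that emits progressively coarser representations. The sketch below captures only that alternation; the module interfaces and the choice to return every level are assumptions.

# Illustrative sketch, not the claimed implementation.
import torch.nn as nn

class HierarchicalPeptideModel(nn.Module):
    def __init__(self, encoders, mappers):
        super().__init__()
        assert len(encoders) == len(mappers)      # H levels of resolution (claim 146)
        self.encoders = nn.ModuleList(encoders)   # fine-to-coarse order
        self.mappers = nn.ModuleList(mappers)

    def forward(self, peptide_repr):
        levels = []
        current = peptide_repr
        for encoder, mapper in zip(self.encoders, self.mappers):
            latent = encoder(current)    # latent representation at this level
            current = mapper(latent)     # lower-resolution coarse-grained representation
            levels.append((latent, current))
        return levels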

. The method of claim 146, wherein the neural network comprises encoders and mappers in alternating order. . The method of any one of claims 139-147, wherein the mapper comprises a bipartite graph that maps the latent representation to the coarse-grained representation. . The method of any one of claims 143-148, wherein the second mapper comprises a second bipartite graph that maps the second latent representation to the second coarse-grained representation. . The method of any one of claims 145-149, wherein the third mapper comprises a third bipartite graph that maps the third latent representation to the third coarse-grained representation. . The method of any one of claims 139-150, wherein the mapper is configured to combine elements of the latent representation to create the coarse-grained representation. . The method of claim 151, wherein the mapper is configured to combine by masking over elements of the latent representation. . The method of any one of claims 143-152, wherein the second mapper is configured to combine elements of the second latent representation to create the second coarse-grained representation. . The method of claim 153, wherein the second mapper is configured to combine by masking over elements of the second latent representation. . The method of any one of claims 145-154, wherein the third mapper is configured to combine elements of the third latent representation to create the third coarse-grained representation. . The method of claim 155, wherein the third mapper is configured to combine by masking over elements of the third latent representation. . The method of any one of claims 139-156, wherein the peptide representation comprises a plurality of nodes and a plurality of edges. . The method of any one of claims 139-157, wherein the latent representation comprises a plurality of nodes and a plurality of edges. . The method of any one of claims 139-158, wherein the coarse-grained representation comprises a plurality of nodes and a plurality of edges. . The method of any one of claims 143-159, wherein the second latent representation comprises a plurality of nodes and a plurality of edges. . The method of any one of claims 143-160, wherein the second coarse-grained representation comprises a plurality of nodes and a plurality of edges. . The method of any one of claims 145-161, wherein the third latent representation comprises a plurality of nodes and a plurality of edges. . The method of any one of claims 145-162, wherein the third coarse-grained representation comprises a plurality of nodes and a plurality of edges.

. The method of any one of claims 157-163, wherein the plurality of nodes represent an atom, a functional group, a primary structure, a secondary structure, a tertiary structure, a quaternary, a motif, or any combination thereof. . The method of any one of claims 157-164, wherein the plurality of edges represent a bonded interaction, a nonbonded interaction, a short-range interaction, a long-range interaction, or any combination thereof. . The method of any one of claims 139-165, wherein the encoder comprises one or more attention mechanisms configured to exchange a first set of attention coefficients between at least a first subset of elements in the peptide representation and at least a second subset of elements in the latent representation. . The method of any one of claim 143-166, wherein the second encoder comprises one or more attention mechanisms configured to exchange a first set of attention coefficients between at least a first subset of elements in the coarse-grained representation and at least a second subset of elements in the second latent representation. . The method of any one of claim 145-167, wherein the third encoder comprises one or more attention mechanisms configured to exchange a first set of attention coefficients between at least a first subset of elements in the second coarse-grained representation and at least a second subset of elements in the third latent representation. . The method of any one of claims 139-168, wherein the encoder is further configured to receive a structural feature, a physicochemical feature, or both associated with the peptide representation, such that the latent representation is based at least partially on the physicochemical feature, the structural feature, or both. . The method of any one of claims 139-169, wherein the second encoder is further configured to receive a structural feature, a physicochemical feature, or both associated with the coarse-grained representation, such that the second latent representation is based at least partially on the structural feature, a physicochemical feature, or both. . The method of any one of claims 139-170, wherein the third encoder is further configured to receive a structural feature, a physicochemical feature, or both associated with the second coarse-grained representation, such that the third latent representation is based at least partially on the structural feature, a physicochemical feature, or both. . The method of any one of claims 169-171, wherein the physicochemical feature comprises: (i) a physicochemical property, (ii) an absorption, distribution, metabolism, excretion, or toxicity (ADMET) property, (iii) a chemical reaction, (iv) a synthesizability, (v) a solubility, (vi) a chemical stability, (vii) a protease stability, (viii) a plasma protein binding property, (vix) a peptide-protein or protein-protein binding interaction, (x) an activity, (xi) a selectivity, (xii) a potency, (xiii) a pharmacokinetic property, (xiv) a pharmacodynamic property, (xv) an in vivo safety property, (xvi) a property reflecting a relationship to a biomarker, (xvii) a formulation property, or (xviii) any combination thereof.

. The method of any one of claims 169-172, wherein the structural feature comprises: a three-dimensional structure of an atom, a motif, a functional group, a residue, a secondary structure, a tertiary structure, a quaternary structure, or a molecular complex of the peptide.. The method of any one of claims 169-173, wherein the structural feature comprises: a structural annotation of an atom, a motif, a functional group, a residue, a secondary structure, a tertiary structure, a quaternary structure, or a molecular complex of the peptide. . The method of any one of claims 139-174, further comprising:

(a) providing a machine learning algorithm configured to (i) receive a hierarchical representation, and (ii) generate at least one predictive value associated with the hierarchical representation, wherein the hierarchical representation comprises the latent representation, the second latent representation, the third latent representation, the coarsegrained representation, the second coarse-grained representation, or the third coarsegrained representation, or any combination thereof;

(b) providing a second training dataset comprising (i) one or more hierarchical representations encoded by the neural network, and (ii) one or more known values associated with the one or more hierarchical representations; and

(c) training the machine learning algorithm at least in part by (i) processing the one or more hierarchical representations to the machine learning algorithm, (ii) generating one or more predictive values based at least in part on the one or more hierarchical representations, and (iii) updating at least one parameter of the machine learning algorithm based at least in part on the one or more predictive values. . The method of claim 175, wherein the training reduces a difference between the one or more predictive values and the one or more known values. . The method of any one of claims 139-176, further comprising:

(a) processing to the machine learning algorithm one or more sample hierarchical representations to generate one or more sample predictive values; and

(b) selecting at least one sample hierarchical representation in the one or more sample hierarchical representations as a candidate representation based at least in part on the one or more sample predictive values. . The method of any one of claims 175-177, wherein the machine learning algorithm is sufficiently trained when the second training dataset comprises at most about 1000, 100, 10, or 1 known values. . The method of any one of claims 175-178, wherein the one or more known values comprise (i) a physicochemical property, (ii) an absorption, distribution, metabolism, excretion, or toxicity (ADMET) property, (iii) a chemical reaction, (iv) a synthesizability, (v) a solubility, (vi) a chemical stability, (vii) a protease stability, (viii) a plasma protein binding property, (vix) a peptide-protein or protein-protein binding interaction, (x) an activity, (xi) a selectivity, (xii) a potency, (xiii) a pharmacokinetic property, (xiv) a pharmacodynamic property, (xv) an in vivo safety property, (xvi) a property reflecting a relationship to a biomarker, (xvi) a formulation property, or (xviii) any combination thereof. . The method of any one of claims 139-179, wherein the training dataset comprises (i) a plurality of sample peptide representations comprising the sample peptide representation, and (ii) a plurality of sets of sample labels comprising the set of sample labels. . The method of any one of claims 139-180, wherein the set of sample labels comprise an identity for an element in the sample peptide representation, a property associated with an element in the sample peptide representation, or both. . The method of any one of claims 139-181, wherein the set of samples are assigned by masking over each element in the sample peptide representation. . The method of any one of claims 139-182, wherein the masking comprises obscuring a given element in the sample peptide representation and processing neighboring elements of the given element to a neural network, and generating a label for the given element based at least partially on the neighboring elements. . The method of any one of claims 139-183, wherein the training the neural network comprises self-supervised learning. . The method of any one of claims 139-184, wherein training the neural network further comprises, for each given sample peptide representation in the plurality of sample peptide representations and each given set of sample label in the plurality of sets of sample labels: i. processing the given sample peptide representation of the training dataset by the neural network; ii. generating a given sample coarse-grained peptide representation based at least partially on the given sample peptide representation, wherein the given sample coarse-grained peptide representation comprises (1) a given set of sample coarsegrained elements, and (2) a given set of sample identifiers for the given set of sample coarse-grained elements; and iii. updating at least one parameter of the neural network based at least in part on the given set of sample identifiers. . The method of claim 185, wherein the updating is based on a loss function computed based at least partially on the given set of sample identifiers and the given set of sample labels. . The method of any one of claims 157-186, wherein the set of sample labels comprises a label for each node in the plurality of nodes. . The method of any one of claims 157-187, wherein the set of sample labels comprises a label for each edge in the plurality of edges. . 
The method of any one of claims 139-188, wherein the loss function comprises a probability-based loss function. The method of any one of claims 139-189, wherein the loss function is configured to generate a maximum likelihood value.

. The method of any one of claims 139-190, wherein the loss function is configured to generate a Kullback-Leibler value. . The method of any one of claims 139-191, wherein the loss function comprises a crossentropy loss function. . The method of any one of claims 139-192, wherein the method is implemented using one or more computer processors. . The method of claim 193, further comprising providing access to the computer- implemented method to one or more users via a terminal, a web browser, an application, or a server infrastructure. . The method of claim 194, wherein the terminal, the web browser, or the application comprises a graphical user interface for (i) receiving instructions for obtaining a query hierarchical representation of a query peptide from the one or more users, and (ii) providing the query hierarchical representation of the query peptide to the one or more users based at least in part on the instructions. . The computer-implemented method of claim 194 or 195, wherein the server infrastructure is configured to transmit information to a second computer, wherein the information comprises a query hierarchical representation of a query peptide. . The computer-implemented method of any one of claims 194-196, wherein the server infrastructure is configured to transmit information to a second computer, wherein the information comprises a peptide chemical structure associated with a query latent representation. . A computer program product comprising a computer-readable medium having computerexecutable code encoded therein, the computer-executable code adapted to be executed to implement any one of the methods of claims 193-197. . The computer program product of claim 198, wherein the computer-executable code comprises graphical flows between a plurality of tensor operations that define the neural network. . A non-transitory computer-readable storage media encoded with a computer program including instructions executable by one or more processors to perform any one of the computer- implemented methods of claims 193-198. . The non-transitory computer-readable storage media of claim 200, comprised in a distributed computing system, a cloud-based computing system, or a supercomputer comprising a plurality of central processing units, graphical processing units, or tensor processing units. . The non-transitory computer-readable storage media of claim 200 or 201, wherein the non-transitory computer-readable storage media is comprised in a laptop computer, or a personal desktop computer. . The non-transitory computer-readable storage media of any one of claims 200-202, wherein the computer program is encoded in a human non-readable format. . A computer-implemented system comprising: a digital processing device comprising: at least one processor, an operating system configured to perform executable instructions, a memory, and a computer program including instructions executable by the digital processing device to perform any one of the computer-implemented methods of claims 193-197. . A method of generating a representation of peptides, comprising:

(a) providing a first representation of a peptide, the first representation comprising nodes and edges, wherein the edges encode bonded and nonbonded interactions; and

(b) generating a second representation of the peptide based at least in part on the first latent representation, wherein the second latent representation comprises a lower resolution than the first representation. . The method of claim 205, further comprising generating a third representation based at least in part on the second representation, wherein the third representation comprises a lower resolution than the second representation. . The method of claim 206, further comprising generating an N-th representation based at least in part on the third representation, wherein the N-th representation comprises a lower resolution than an (N-l)-th representation. . The method of any one of claims 205-207, wherein the first representation comprises a plurality of nodes and a plurality of edges. . The method of any one of claims 205-208, wherein the second representation comprises a plurality of nodes and a plurality of edges. . The method of any one of claims 205-209, wherein the third representation comprises a plurality of nodes and a plurality of edges. . The method of any one of claims 205-210, wherein the N-th representation comprises a plurality of nodes and a plurality of edges. . The method of any one of claims 208-211, wherein the plurality of nodes represent an atom, a functional group, a primary structure, a secondary structure, a tertiary structure, a quaternary, a motif, or any combination thereof. . The method of any one of claims 208-212, wherein the plurality of edges represent a bonded interaction, a nonbonded interaction, a short-range interaction, a long-range interaction, or any combination thereof. . The method of any one of claims 208-213, wherein the generating the second representation is based at least in part on a structural feature, a physicochemical feature, or both. . The method of any one of claims 208-214, wherein the generating the third representation is based at least in part on a structural feature, a physicochemical feature, or both. . The method of any one of claims 208-215, wherein the generating the N-th representation is based at least in part on a structural feature, a physicochemical feature, or both. . The method of any one of claims 208-216, wherein the physicochemical feature comprises: (i) a physicochemical property, (ii) an absorption, distribution, metabolism, excretion, or toxicity (ADMET) property, (iii) a chemical reaction, (iv) a synthesizability, (v) a solubility, (vi) a chemical stability, (vii) a protease stability, (viii) a plasma protein binding property, (vix) a peptide-protein or protein-protein binding interaction, (x) an activity, (xi) a selectivity, (xii) a potency, (xiii) a pharmacokinetic property, (xiv) a pharmacodynamic property, (xv) an in vivo safety property, (xvi) a property reflecting a relationship to a biomarker, (xvii) a formulation property, or (xviii) any combination thereof. . The method of any one of claims 208-217, wherein the structural feature comprises: a three-dimensional structure of an atom, a motif, a functional group, a residue, a secondary structure, a tertiary structure, a quaternary structure, or a molecular complex. . The method of any one of claims 208-218, wherein the first representation is a latent representation. . The method of any one of claims 208-219, wherein the second representation is a latent representation. . The method of any one of claims 208-220, wherein the third representation is a latent representation. . 
The method of any one of claims 208-221, wherein the N-th representation is a latent representation. . The method of any one of claims 208-222, wherein the generating is performed at least in part by using a neural network. . The method of any one of claims 208-223, wherein the generating is performed at least in part by using a mapper. . The method of any one of claims 205-224, further comprising synthesizing a peptide comprising a structure associated with the first latent representation, the second latent representation, or both. . The method of any one of claims 205-224, wherein the method is implemented using one or more computer processors. . The method of claim 226, further comprising providing access to the method to one or more users via a terminal, a web browser, an application, or a server infrastructure. . The method of claim 227, wherein the terminal, the web browser, or the application comprises a graphical user interface for (i) receiving instructions for obtaining a query hierarchical representation of a query peptide from the one or more users, and (ii) providing the query hierarchical representation of the query peptide to the one or more users based at least in part on the instructions. . The method of claim 227 or 228, wherein the server infrastructure is configured to transmit information to a second computer, wherein the information comprises a query hierarchical representation of a query peptide. . The method of any one of claims 227-229, wherein the server infrastructure is configured to transmit information to a second computer, wherein the information comprises a peptide chemical structure associated with a query hierarchical representation.

231. A computer program product comprising a computer-readable medium having computer-executable code encoded therein, the computer-executable code adapted to be executed to implement any one of the computer-implemented methods of claims 227-230.
232. The computer program product of claim 231, wherein the computer-executable code comprises graphical flows between a plurality of tensor operations that define the neural network.
233. A non-transitory computer-readable storage media encoded with a computer program including instructions executable by one or more processors to perform any one of the computer-implemented methods of claims 227-230.
234. The non-transitory computer-readable storage media of claim 233, comprised in a distributed computing system, a cloud-based computing system, or a supercomputer comprising a plurality of central processing units, graphical processing units, or tensor processing units.
235. The non-transitory computer-readable storage media of claim 233, wherein the non-transitory computer-readable storage media is comprised in a laptop computer, or a personal desktop computer.
236. The non-transitory computer-readable storage media of any one of claims 233-235, wherein the computer program is encoded in a human non-readable format.
237. A computer-implemented system comprising: a digital processing device comprising: at least one processor, an operating system configured to perform executable instructions, a memory, and a computer program including instructions executable by the digital processing device to perform any one of the computer-implemented methods of claims 227-236.
238. A method for training a deep generative neural network, comprising:

(a) providing a neural network comprising: i. an encoder configured to receive a representation of a target peptide and generate a latent representation of the target peptide; and ii. a transformer configured to receive the latent representation of the target peptide and generate a predicted peptide ligand representation of a ligand that is predicted to bind to the target peptide;

(b) providing a plurality of target peptide representations and a plurality of ligand representations; and

(c) training the neural network at least in part by: i. processing the plurality of target peptide representations by the neural network; ii. generating a plurality of predicted ligand representations based at least partially on the plurality of target peptide representations; and iii. updating at least one parameter of the neural network based at least in part on the plurality of predicted peptide ligand representations.

239. The method of claim 238, wherein the updating is based on a loss function computed between the plurality of predicted peptide ligand representations and the plurality of ligand peptide representations.
240. The method of claim 238 or 239, wherein the representation of the target comprises: a primary structure of at least a portion of the target peptide, a secondary structure of at least a portion of the target peptide, a tertiary structure of at least a portion of the target peptide, a physical property of at least a portion of the target peptide, a target site of the target peptide, a relative accessible surface area of at least a portion of the target peptide, or any combination thereof.
241. The method of claim 240, wherein the primary structure of at least the portion of the target is tokenized.
242. The method of claim 240 or 241, wherein the secondary structure of at least the portion of the target is tokenized.
243. The method of any one of claims 240-242, wherein the tertiary structure of at least a portion of the target peptide, a physical property of at least a portion of the target peptide, a target site of the target peptide, a relative accessible surface area of at least a portion of the target peptide, or any combination thereof is encoded using a neural network.
244. The method of any one of claims 240-243, wherein the representation of the target comprises an embedding of: a primary structure of at least a portion of the target peptide, a secondary structure of at least a portion of the target peptide, a tertiary structure of at least a portion of the target peptide, a physical property of at least a portion of the target peptide, a target site of the target peptide, a relative accessible surface area of at least a portion of the target peptide, or any combination thereof.
245. The method of any one of claims 240-244, wherein the ligand representation comprises: (i) an atomistic structure of the ligand, (ii) a primary structure of the ligand, (iii) a tertiary structure of the ligand, or (iv) any combination thereof.
246. The method of any one of claims 240-245, wherein the representation of the target comprises a plurality of nodes and a plurality of edges.
247. The method of any one of claims 240-246, wherein the representation of the target comprises a hierarchical latent representation of the target.
248. The method of any one of claims 240-247, wherein the representation encodes short-range interactions, long-range interactions, bonded interactions, and non-bonded interactions within the target peptide.
249. The method of any one of claims 238-248, wherein the method is implemented using one or more computer processors.
250. The method of claim 249, further comprising providing access to the method to one or more users via a terminal, a web browser, an application, or a server infrastructure.

251. The method of claim 250, wherein the terminal, the web browser, or the application comprises a graphical user interface for (i) receiving instructions for obtaining a query ligand representation for a query target peptide from the one or more users, and (ii) providing the query ligand representation to the one or more users based at least in part on the instructions.
252. The method of claim 250 or 251, wherein the server infrastructure is configured to transmit information to a second computer, wherein the information comprises a query ligand representation of a query target peptide.
253. The method of any one of claims 250-252, wherein the server infrastructure is configured to transmit information to a second computer, wherein the information comprises a peptide chemical structure associated with a query ligand representation.
254. A computer program product comprising a computer-readable medium having computer-executable code encoded therein, the computer-executable code adapted to be executed to implement any one of the computer-implemented methods of claims 250-253.
255. The computer program product of claim 254, wherein the computer-executable code comprises graphical flows between a plurality of tensor operations that define the neural network.
256. A non-transitory computer-readable storage media encoded with a computer program including instructions executable by one or more processors to perform any one of the computer-implemented methods of claims 250-255.
257. The non-transitory computer-readable storage media of claim 256, comprised in a distributed computing system, a cloud-based computing system, or a supercomputer comprising a plurality of central processing units, graphical processing units, or tensor processing units.
258. The non-transitory computer-readable storage media of claim 256, comprised in a laptop computer or a personal desktop computer.
259. The non-transitory computer-readable storage media of any one of claims 256-258, wherein the computer program is encoded in a human non-readable format.
260. A computer-implemented system comprising: a digital processing device comprising: at least one processor, an operating system configured to perform executable instructions, a memory, and a computer program including instructions executable by the digital processing device to perform any one of the computer-implemented methods of claims 250-254.
261. A method for generating a ligand representation, comprising:

(a) providing a latent representation of a target peptide, wherein the latent representation comprises an embedding for at least one of: a primary structure of at least a portion of the target peptide, a secondary structure of at least a portion of the target peptide, a tertiary structure of at least a portion of the target peptide, a physical property of at least a portion of the target peptide, a target site of the target peptide, a relative accessible surface area of at least a portion of the target peptide, or any combination thereof;
(b) generating the ligand representation of a ligand based at least in part on the latent representation, wherein the ligand is predicted to bind to at least a portion of the target peptide.
262. The method of claim 261, further comprising providing a second latent representation, wherein the second latent representation is provided by a trained machine learning model, wherein the trained machine learning model comprises a language model, a graph model, flow models, generative adversarial networks, variational autoencoders, autoregressive models, an autoencoder, diffusion models, or any combination thereof.
263. The method of claim 261 or 262, wherein the generating comprises adding noise to the latent representation.
264. The method of any one of claims 261-263, wherein the generating comprises generating using a decoder neural network of a transformer neural network.
265. The method of claim 264, wherein the generating comprises using a greedy-sampling algorithm, a Monte Carlo tree search, a beam search, a genetic algorithm, or any combination thereof.
266. A method for generating a ligand representation, comprising:

(a) providing a peptide graph of a target peptide, the peptide graph comprising a plurality of nodes and a plurality of edges, wherein the plurality of edges encodes short-range interactions, long-range interactions, bonded interactions, and nonbonded interactions; and

(b) generating the ligand representation of a ligand based at least in part on the target peptide graph, wherein the ligand is predicted to bind to at least a portion of the target peptide.
267. A method for generating a ligand representation, comprising:

(a) providing a hierarchical representation of a peptide, wherein the hierarchical representation comprises at least one high-resolution representation and at least one low- resolution representation; and

(b) generating the ligand representation of a ligand based at least in part on the hierarchical representation, wherein the ligand is predicted to bind to at least a portion of the peptide.
268. The method of any one of claims 261-267, wherein the generating is performed at least partially by using a transformer neural network.
269. The method of any one of claims 261-268, wherein the generating is performed at least partially by using a message-passing neural network.
270. The method of any one of claims 261-269, further comprising performing a binding experiment between the target peptide and the ligand.
271. The method of any one of claims 261-269, wherein the method is implemented using one or more computer processors.
272. The method of claim 271, further comprising providing access to the method to one or more users via a terminal, a web browser, an application, or a server infrastructure.

273. The method of claim 272, wherein the terminal, the web browser, or the application comprises a graphical user interface for (i) receiving instructions for obtaining a query ligand representation for a query target peptide from the one or more users and (ii) providing the query ligand representation to the one or more users based at least in part on the instructions.
274. The method of claim 272 or 273, wherein the server infrastructure is configured to transmit information to a second computer, wherein the information comprises a query ligand representation of a query target peptide.
275. The method of any one of claims 272-274, wherein the server infrastructure is configured to transmit information to a second computer, wherein the information comprises a peptide chemical structure associated with a query ligand representation.
276. A computer program product comprising a computer-readable medium having computer-executable code encoded therein, the computer-executable code adapted to be executed to implement any one of the computer-implemented methods of claims 272-275.
277. The computer program product of claim 276, wherein the computer-executable code comprises graphical flows between a plurality of tensor operations that define the neural network.
278. A non-transitory computer-readable storage media encoded with a computer program including instructions executable by one or more processors to perform any one of the computer-implemented methods of claims 272-275.
279. The non-transitory computer-readable storage media of claim 278, comprised in a distributed computing system, a cloud-based computing system, or a supercomputer comprising a plurality of central processing units, graphical processing units, or tensor processing units.
280. The non-transitory computer-readable storage media of claim 278, comprised in a laptop computer, or a personal desktop computer.
281. The non-transitory computer-readable storage media of any one of claims 278-280, wherein the computer program is encoded in a human non-readable format.
282. A computer-implemented system comprising: a digital processing device comprising: at least one processor, an operating system configured to perform executable instructions, a memory, and a computer program including instructions executable by the digital processing device to perform any one of the computer-implemented methods of claims 272-275.
283. A non-transitory computer-readable storage medium, comprising:

(a) a peptide graph comprising a plurality of nodes and a plurality of edges, wherein the plurality of edges encodes bonded interactions and nonbonded interactions between the plurality of nodes; and

(b) a latent representation for a node in a plurality of nodes, wherein the latent representation encodes at least short-range interactions and long-range interactions between the node and at least a subset of nodes in the plurality of nodes.
284. A non-transitory computer-readable storage medium, comprising:
(a) a representation of a peptide comprising a plurality of nodes and a plurality of edges, wherein the plurality of edges encode bonded and nonbonded interactions between the plurality of nodes; and

(b) an encoding based at least partially on the first representation, wherein the encoding comprises a lower resolution than the first representation.
285. A method comprising:

(a) providing a machine learning algorithm configured to process a hierarchical representation of a peptide to generate a predictive value associated with the hierarchical representation of the peptide;

(b) processing a plurality of hierarchical representations of a plurality of peptides using the machine learning algorithm to generate a plurality of predictive values associated with the plurality of hierarchical representations of the plurality of peptides;

(c) selecting at least a subset of hierarchical representations in the plurality of hierarchical representations when each hierarchical representation in the at least the subset of hierarchical representations is associated with one or more predictive values in the plurality of predictive values, wherein the predictive values are indicative of a quantitative metric of the plurality of hierarchical representations; and

(d) generating one or more peptide structures associated with the subset of hierarchical representations.
286. The method of claim 285, wherein the optimized property comprises at least one of: (i) a physicochemical property, (ii) an absorption, distribution, metabolism, excretion, or toxicity (ADMET) property, (iii) a chemical reaction, (iv) a synthesizability, (v) a solubility, (vi) a chemical stability, (vii) a protease stability, (viii) a plasma protein binding property, (ix) a pharmacokinetic property, (x) a pharmacodynamic property, (xi) an in vivo safety property, (xii) a property related to a biomarker, (xiii) a formulation property, or (xiv) any combination thereof.
287. The method of claim 286, wherein the optimized property is a pharmacological property, and the pharmacological property comprises a targeting ability or an anti-targeting ability for one or more targets.
288. The method of claim 287, wherein the one or more targets comprise biomolecules found in a human body, an animal, or a plant.
289. The method of claim 288, wherein the biomolecules comprise proteins, nucleic acids, lipids, or any combination thereof.
290. The method of claim 289, wherein the subset of hierarchical representations is selected when the subset of hierarchical representations is associated with a plurality of optimized properties.
291. The method of any one of claims 285-290, wherein the library of peptides comprises peptides associated with at least one of: i. a plurality of targets associated with human medical indications; ii. a plurality of human peptides; iii. a plurality of murine peptides; iv. a plurality of canine peptides; v. a plurality of primate peptides; vi. a plurality of animal peptides; vii. a plurality of plant peptides; viii. a plurality of bacterial peptides; ix. a plurality of viral peptides; x. a plurality of fungal peptides; and xi. a plurality of proteins.
292. The method of any one of claims 285-290, wherein the library of peptides comprises peptides associated with at least one of: i. a plurality of targets associated with human medical indications; ii. a plurality of known human proteins; iii. a plurality of known murine proteins; iv. a plurality of known canine proteins; v. a plurality of known primate proteins; vi. a plurality of known animal proteins; vii. a plurality of known plant proteins; viii. a plurality of known bacterial proteins; ix. a plurality of known viral proteins; x. a plurality of known fungal proteins; and xi. a plurality of known proteins.
293. The method of claim 285, further comprising:

(a) synthesizing a peptide associated with a hierarchical representation in the subset of hierarchical representations;

(b) measuring the optimized property of the peptide; and

(c) updating the peptide library to incorporate a measurement based at least partially on the measuring in (b).
294. An artificial intelligence system for generating peptide structures, comprising:

(a) a peptide library comprising peptide structures and peptide properties;

(b) a representation learning system configured to receive at least the peptide structures from the peptide library and output hierarchical latent representations of the peptide structures; and

(c) a transfer learning system configured to receive at least the peptide properties from the peptide library and at least the hierarchical latent representations from the representation learning system, and output domain-specific hierarchical latent representations based at least in part on the peptide structures and the peptide properties.

295. The artificial intelligence system of claim 294, further comprising a prediction system configured to receive at least the domain-specific hierarchical latent representations and output at least one property prediction for a peptide of interest.
296. The artificial intelligence system of claim 294 or 295, further comprising a peptide structure generating system configured to receive at least one domain-specific hierarchical latent representation in the domain-specific hierarchical latent representations and output a peptide structure for a peptide of interest.
297. The artificial intelligence system of any one of claims 294-296, further comprising a transfer learning system configured to receive two or more peptide libraries to output a refined peptide library, wherein the two or more peptide libraries comprise (i) a first peptide library comprising a first set of ligands for a first target, and (ii) a second peptide library comprising a second set of ligands for a second target, and wherein the refined peptide library comprises a refined set of ligands that (1) targets the first target but not the second target, or (2) targets the first target and the second target, or (3) targets neither the first target nor the second target.
298. An in silico drug discovery method comprising:

(a) performing a first set of docking simulations between a first set of peptides and a molecular target to generate a training dataset;

(b) training a generative neural network using the training dataset;

(c) generating, using the generative neural network, a plurality of peptide sequences;

(d) clustering the plurality of peptide sequences into one or more clusters;

(e) selecting a subset of peptide sequences in the plurality of sequences, wherein the subset of peptide sequences are substantially distributed among the one or more clusters; and

(f) performing a second set of docking simulations between a second set of peptides and the molecular target, wherein the second set of peptides comprises the subset of peptide sequences.
299. An in silico drug discovery method comprising:

(a) performing a first set of docking simulations between a first set of peptides and a molecular target to generate a training dataset;

(b) training a prediction neural network using the training dataset;

(c) generating a plurality of peptide sequences;

(d) filtering, using the prediction neural network, a subset of peptide sequences from the plurality of peptide sequences; and

(e) performing a second set of docking simulations between a second set of peptides and the molecular target, wherein the second set of peptides comprises the subset of peptide sequences, and wherein at least one peptide in the second set of peptides binds more favorably to the molecular target than each peptide in the first set of peptides.
300. An in silico drug discovery method comprising:
(a) generating, using a generative neural network, a plurality of peptide sequences for binding a molecular target, wherein the generating is based at least partially on a docking simulation dataset;

(b) screening, using a predictive neural network, a subset of peptide sequences from the plurality of peptide sequences; and

(c) performing a set of docking simulations between a set of peptides and the molecular target, wherein the set of peptides comprises the subset of peptide sequences.

Description:
COMPUTER REPRESENTATIONS OF PEPTIDES FOR EFFICIENT DESIGN OF DRUG CANDIDATES

CROSS-REFERENCE

[0001] This application claims the benefit of U.S. Provisional Application No. 63/330,557, filed April 13, 2022, and U.S. Provisional Application No. 63/330,563, filed April 13, 2022, each of which is incorporated herein by reference in its entirety.

BACKGROUND

[0002] Computer resources can be leveraged by pharmaceutical researchers developing and advancing new molecules into clinical trials.

SUMMARY

[0003] In some aspects, the present disclosure provides a method for training neural networks using short-range and long-range interactions of a peptide, comprising: (a) constructing a peptide graph based at least in part on a peptide structure of the peptide, wherein the peptide graph comprises a plurality of nodes and a plurality of edges, and wherein the plurality of edges represents bonded and nonbonded interactions between the plurality of nodes of the peptide graph; (b) constructing a short-range subgraph of the peptide graph, wherein the short-range subgraph comprises (i) a first central node and (ii) a set of short-range nodes within a first range from the first central node; (c) constructing a long-range subgraph of the peptide graph, wherein the long-range subgraph comprises a set of long-range nodes within a second range from a second central node, wherein the set of long-range nodes comprises at least one anchor node, wherein the first range and the second range overlap such that, when the first central node and the second central node are the same, the set of short-range nodes and the set of long-range nodes comprise the at least one anchor node; (d) constructing a first neural network based at least in part on the short-range subgraph and a second neural network based at least in part on the long-range subgraph; (e) processing feature values of the short-range subgraph using the first neural network to generate a short-range latent representation, and processing feature values of the long-range subgraph using the second neural network to generate a long-range latent representation; and (f) updating parameters of the first neural network and the second neural network based at least in part on the short-range latent representation and the long-range latent representation.
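By way of a non-limiting illustration only (not a statement of the claimed method), the subgraph construction of steps (b) and (c) can be sketched with the networkx library, assuming the peptide graph is available as an undirected nx.Graph whose nodes already carry feature values; the function and variable names below are illustrative assumptions.

```python
# Hedged sketch: short-range and long-range subgraphs around a central node.
import networkx as nx

def short_range_subgraph(G: nx.Graph, center, k0: int) -> nx.Graph:
    """Nodes within k0 edges of the central node (step (b))."""
    return nx.ego_graph(G, center, radius=k0)

def long_range_subgraph(G: nx.Graph, center, k1: int, k2: int) -> nx.Graph:
    """Nodes whose graph distance from the central node lies in [k1, k2] (step (c))."""
    dist = nx.single_source_shortest_path_length(G, center, cutoff=k2)
    shell = [n for n, d in dist.items() if k1 <= d <= k2]
    return G.subgraph(shell).copy()

def anchor_nodes(short_g: nx.Graph, long_g: nx.Graph) -> set:
    """Overlap of the two ranges: nodes that appear in both subgraphs."""
    return set(short_g.nodes) & set(long_g.nodes)
```

When the two central nodes coincide and the ranges overlap, the intersection returned by anchor_nodes is non-empty, which matches the anchor-node condition recited above.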

[0004] In some embodiments, (f) further comprises updating parameters of the first neural network and the second neural network such that (i) the short-range latent representation and the long-range latent representation are similar when the first central node and the second central node are the same, and (ii) the short-range latent representation and the long-range latent representation are dissimilar when the first central node and the second central node are not the same.

[0005] In some embodiments, the peptide structure is obtained at least in part by a molecular simulation. In some embodiments, the molecular simulation is based at least in part on an electronic structure calculation, a force-field based calculation, molecular dynamics, a Monte Carlo simulation, or any combination thereof. In some embodiments, the peptide structure is obtained using a machine learning algorithm. In some embodiments, the peptide structure is obtained from an experiment or a structure database.

[0006] In some embodiments, the plurality of nodes comprises atoms in the peptide structure. In some embodiments, the plurality of nodes comprises functional groups in the peptide structure. In some embodiments, the plurality of nodes comprises amino acids in the peptide structure. In some embodiments, the plurality of nodes comprises secondary structures in the peptide structure. In some embodiments, the plurality of nodes comprises tertiary structures in the peptide structure. In some embodiments, the plurality of nodes comprises quaternary structures in the peptide structure.

[0007] In some embodiments, the first range from the first central node is less than or equal to K0 number of edges from the first central node, wherein K0 is at least 1. In some embodiments, the second range from the second central node is greater than or equal to K1 number of edges from the second central node and less than or equal to K2 number of edges from the second central node, wherein K1 is at least 2, and wherein K2 is at least 3. In some embodiments, K0 is at least 2, 3, 4, 5, 6, 7, or 8. In some embodiments, K1 is at least 3, 4, 5, 6, 7, 8, or 9. In some embodiments, K2 is at least 4, 5, 6, 7, 8, 9, or 10. In some embodiments, the long-range subgraph comprises at least one connected graph. In some embodiments, the long-range subgraph comprises at least two disconnected graphs. In some embodiments, the at least one anchor node comprises at least 2, 3, 4, 5, 6, 7, 8, 9, or 10 nodes. In some embodiments, the first neural network comprises a graph neural network architecture. In some embodiments, the first neural network comprises at least K0 layers. In some embodiments, the second neural network comprises a graph neural network architecture. In some embodiments, the second neural network comprises at least K0 layers. In some embodiments, the second neural network comprises at most K2 layers. In some embodiments, a layer comprises an attention mechanism, a generalized message-passing graph neural network, or both. In some embodiments, the generalized message-passing graph neural network comprises a graph convolutional neural network. In some embodiments, the short-range latent representation comprises a latent representation of the first central node. In some embodiments, the long-range latent representation comprises a pooled latent representation of the at least one anchor node. In some embodiments, the feature values of the short-range subgraph, the feature values of the long-range subgraph, or both comprise an identifier of an atom, a functional group, an amino acid, a secondary structure, a tertiary structure, a quaternary structure, or a motif.
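Purely as an illustrative configuration, and assuming example values rather than any preferred embodiment (the numbers below are assumptions chosen only to satisfy the stated minimums), the ranges and layer counts might be set as follows.

```python
# Illustrative parameter choice only; K0 >= 1, K1 >= 2, K2 >= 3 per the text above.
K0, K1, K2 = 2, 3, 5   # short-range radius, inner and outer long-range radii

# In message-passing graph neural networks one layer propagates information one
# edge hop, so a network reading the short-range subgraph would use at least K0
# layers and the long-range network at most K2 layers, matching the layer counts
# recited in this paragraph.
num_layers_short = K0
num_layers_long = K2
```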

[0008] In some embodiments, the feature values of the short-range subgraph, the feature values of the long-range subgraph, or both comprise a physical property of an atom, a functional group, an amino acid, a secondary structure, a tertiary structure, a quaternary structure, or a motif. In some embodiments, the motif represents a collection of atoms. In some embodiments, the feature values of the short-range subgraph, the feature values of the long-range subgraph, or both comprise a geometric descriptor of an atom, an amino acid, a secondary structure, a tertiary structure, a quaternary structure, or a motif. In some embodiments, the feature values of the short-range subgraph, the feature values of the long-range subgraph, or both comprise an electronic configuration, a charge, a size, a bond angle, a dihedral angle, or any combination thereof. In some embodiments, the functional group comprises a BRICS motif.

[0009] In some embodiments, (f) further comprises evaluating similarity or dissimilarity between the first neural network and the second neural network based at least in part on contrastive learning. In some embodiments, the contrastive learning further comprises optimizing a contrastive loss function based at least in part on a dot product between the short-range latent representation and the long-range latent representation.

[0010] In some embodiments, the optimizing is performed such that a sigmoid of the dot product is equal to one when the first central node and the second central node are the same, and the sigmoid of the dot product is equal to zero when the first central node and the second central node are not the same. In some embodiments, the method is implemented using one or more computer processors.
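The contrastive objective described in this paragraph can be written compactly. The following non-limiting sketch assumes batched short-range and long-range latent vectors and uses the binary cross-entropy-with-logits form, which drives the sigmoid of the dot product toward one for matching central nodes and toward zero otherwise; all names are illustrative.

```python
# Hedged sketch of the contrastive objective: sigmoid(dot(z_short, z_long)) -> 1
# when the central nodes match, -> 0 otherwise.
import torch
import torch.nn.functional as F

def contrastive_loss(z_short: torch.Tensor,      # (batch, dim) short-range latents
                     z_long: torch.Tensor,       # (batch, dim) long-range latents
                     same_center: torch.Tensor   # (batch,) 1.0 if same central node else 0.0
                     ) -> torch.Tensor:
    logits = (z_short * z_long).sum(dim=-1)      # dot product per pair
    return F.binary_cross_entropy_with_logits(logits, same_center)
```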

[0011] In some embodiments, the method further comprises providing access to the first neural network and the second neural network to one or more users via a terminal, a web browser, an application, or a server infrastructure. In some embodiments, the terminal, the web browser, or the application comprises a graphical user interface for (i) receiving instructions for generating a latent representation of a given peptide from the one or more users, and (ii) providing the latent representation of the given peptide to the one or more users based at least in part on the instructions. In some embodiments, the server infrastructure is configured to transmit information to a second computer, wherein the information comprises a latent representation of a given peptide. In some embodiments, the server infrastructure is configured to transmit information to a second computer, wherein the information comprises a peptide chemical structure for a given latent representation.

[0012] In some aspects, the present disclosure provides a computer program product comprising a computer-readable medium having computer-executable code encoded therein, the computer-executable code adapted to be executed to implement any one of the methods disclosed herein.

[0013] In some embodiments, the computer-executable code comprises graphical flows between a plurality of tensor operations that define the neural network.

[0014] In some aspects, the present disclosure provides a non-transitory computer-readable storage media encoded with a computer program including instructions executable by one or more processors to perform any one of the methods or computer-implemented methods disclosed herein.

[0015] In some embodiments, the non-transitory computer-readable storage media is comprised in a distributed computing system, a cloud-based computing system, or a supercomputer comprising a plurality of graphical processing units.

[0016] In some embodiments, the computer program is encoded in a human non-readable format.

[0017] In some aspects, the present disclosure provides a computer-implemented system comprising: a digital processing device comprising: at least one processor, an operating system configured to perform executable instructions, a memory, and a computer program including instructions executable by the digital processing device to perform any one of the methods or computer-implemented methods disclosed herein. [0018] In some aspects, the present disclosure provides a method for obtaining a latent representation, comprising: (a) providing a peptide graph comprising a plurality of nodes and a plurality of edges, wherein the plurality of edges encodes bonded interactions and nonbonded interactions between the plurality of nodes; and (b) generating the latent representation for a node in the plurality of nodes, wherein the generating is based at least in part on the peptide graph, and wherein the latent representation encodes short-range interactions and long-range interactions between the node and at least a subset of nodes in the plurality of nodes.

[0019] In some embodiments, the generating further comprises: (a) constructing a short-range subgraph of the peptide graph, wherein the short-range subgraph comprises (i) the node and (ii) a set of short-range nodes within a first range from the node; (b) constructing a first neural network based at least in part on the short-range subgraph; and (c) processing feature values of the short-range subgraph using the first neural network to generate the latent representation for the node.

[0020] In some embodiments, the generating further comprises: (a) constructing a long-range subgraph of the peptide graph, wherein the long-range subgraph comprises (i) the node and (ii) a set of long-range nodes within a second range from the node, wherein the set of long-range nodes comprises one or more anchor nodes; (b) constructing a second neural network based at least in part on the long-range subgraph; (c) processing feature values of the long-range subgraph using the second neural network to generate one or more latent representations for the one or more anchor nodes; and (d) combining the one or more latent representations to generate the latent representation for the node.
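As a non-limiting sketch of step (d) only, one plausible way to combine the anchor-node latent representations is mean pooling, consistent with the pooled long-range latent representation mentioned earlier; the disclosure does not limit the combination to this choice, and the names below are illustrative.

```python
# Hedged sketch: pool the anchor-node latents into a single latent for the node.
import torch

def combine_anchor_latents(anchor_latents: torch.Tensor) -> torch.Tensor:
    """anchor_latents: (num_anchor_nodes, dim) -> (dim,) pooled latent representation."""
    return anchor_latents.mean(dim=0)
```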

[0021] In some embodiments, the latent representation further encodes one or more features of atoms, functional groups, amino acids, secondary structures, tertiary structures, quaternary structures, or any combination thereof.

[0022] In some embodiments, the plurality of nodes comprises atoms in the peptide structure. In some embodiments, the plurality of nodes comprises functional groups in the peptide structure. In some embodiments, the plurality of nodes comprises amino acids in the peptide structure. In some embodiments, the plurality of nodes comprises secondary structures in the peptide structure. In some embodiments, the plurality of nodes comprises tertiary structures in the peptide structure. In some embodiments, the plurality of nodes comprises quaternary structures in the peptide structure. In some embodiments, the first range from the node is less than or equal to K0 number of edges from the node, wherein K0 is at least 1. In some embodiments, the second range from the node is greater than or equal to K1 number of edges from the node and less than or equal to K2 number of edges from the node, wherein K1 is at least 2, and wherein K2 is at least 3. In some embodiments, K0 is at least 2, 3, 4, 5, 6, 7, or 8.

[0023] In some embodiments, K1 is at least 3, 4, 5, 6, 7, 8, or 9. In some embodiments, K2 is at least 4, 5, 6, 7, 8, 9, or 10. In some embodiments, the first neural network comprises a graph neural network architecture. In some embodiments, the first neural network comprises at least K0 layers. In some embodiments, the second neural network comprises a graph neural network architecture. In some embodiments, the second neural network comprises at least K0 layers. In some embodiments, the second neural network comprises at most K2 layers. In some embodiments, a layer comprises an attention mechanism, a generalized message-passing graph neural network, or both. In some embodiments, the generalized message-passing graph neural network comprises a graph convolutional neural network. [0024] In some embodiments, the feature values of the short-range subgraph, the feature values of the long-range subgraph, or both comprise an identifier of an atom, a functional group, an amino acid, a secondary structure, a tertiary structure, or a motif. In some embodiments, the feature values of the short-range subgraph, the feature values of the long-range subgraph, or both comprise a physical property of an atom, a functional group, an amino acid, a secondary structure, a tertiary structure, or a motif. In some embodiments, the motif represents a collection of atoms. In some embodiments, the feature values of the short-range subgraph, the feature values of the long-range subgraph, or both comprise a geometric descriptor of an atom, an amino acid, a secondary structure, or a tertiary structure. In some embodiments, the feature values of the short-range subgraph, the feature values of the long-range subgraph, or both comprise an electronic configuration, a charge, a size, a bond angle, a dihedral angle, or any combination thereof. In some embodiments, the functional group comprises a BRICS motif.

[0025] In some embodiments, the latent representation is embedded in a latent space, wherein the latent space is organized based at least in part on nonbonded interactions, long-range interactions, or both. [0026] In some embodiments, the latent representation is embedded in a latent space, wherein the latent space is organized based at least in part on bonded interactions, short-range interactions, or both.

[0027] In some embodiments, the method further comprises generating a peptide latent representation at least in part by aggregating a plurality of latent representations over each node in the peptide graph. In some embodiments, the method further comprises synthesizing a peptide comprising a structure defined by the peptide graph.

[0028] In some embodiments, the method is implemented using one or more computer processors. In some embodiments, the method further comprises providing access to the latent representation generated in (b) to one or more users via a terminal, a web browser, an application, or a server infrastructure. In some embodiments, the terminal, the web browser, or the application comprises a graphical user interface for (i) receiving instructions for generating a query latent representation of a query peptide graph from the one or more users, and (ii) providing the query latent representation of the query peptide graph to the one or more users based at least in part on the instructions. In some embodiments, the server infrastructure is configured to transmit information to a second computer, wherein the information comprises a query latent representation of a query peptide. In some embodiments, the server infrastructure is configured to transmit information to a second computer, wherein the information comprises a peptide chemical structure associated with a query latent representation.

[0029] In some aspects, the present disclosure provides a computer program product comprising a computer-readable medium having computer-executable code encoded therein, the computer-executable code adapted to be executed to implement any one of the computer-implemented methods disclosed herein. In some embodiments, the computer-executable code comprises graphical flows between a plurality of tensor operations that define the neural network. [0030] In some aspects, the present disclosure provides a non-transitory computer-readable storage media encoded with a computer program including instructions executable by one or more processors to perform any one of the computer-implemented methods disclosed herein. In some embodiments, the non-transitory computer-readable storage media is comprised in a distributed computing system, a cloud-based computing system, or a supercomputer comprising a plurality of graphical processing units. In some embodiments, the non-transitory computer-readable storage media is comprised in a laptop computer, or a personal desktop computer. In some embodiments, the computer program is encoded in a human non-readable format.

[0031] In some aspects, the present disclosure provides a computer-implemented system comprising: a digital processing device comprising: at least one processor, an operating system configured to perform executable instructions, a memory, and a computer program including instructions executable by the digital processing device to perform any one of the computer-implemented methods disclosed herein. [0032] In some aspects, the present disclosure provides a method for training a domain-specific neural network, comprising: (a) providing a trained neural network configured to process at least a subgraph of a peptide graph and generate a latent representation that encodes (i) short-range interactions between a plurality of nodes of the subgraph, and (ii) long-range interactions between (1) at least one node of the subgraph and (2) at least one node in the peptide graph but not in the subgraph; (b) providing a domain-specific dataset comprising a set of peptide graphs and a set of labels; and (c) retraining the trained neural network using the domain-specific dataset.

[0033] In some embodiments, retraining the trained neural network further comprises, for a given peptide graph in the set of peptide graphs and a given label in the set of labels: (a) generating a plurality of subgraphs based at least in part on the given peptide graph; (b) processing the plurality of subgraphs using the trained neural network to generate a plurality of latent representations; (c) combining the latent representations to generate a prediction value; and (d) updating at least one parameter of the trained neural network based at least in part on the prediction value.
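The retraining loop of this paragraph can be illustrated with a minimal PyTorch sketch; it assumes the trained network maps a subgraph to a latent vector and that a small prediction head and a make_subgraphs helper exist, neither of which is specified by the disclosure (both are hypothetical names used only for this example).

```python
# Hedged sketch of one retraining step: subgraphs -> latents -> pooled prediction -> update.
import torch

def finetune_step(trained_net, head, optimizer, peptide_graph, label, make_subgraphs):
    subgraphs = make_subgraphs(peptide_graph)                     # (a) plurality of subgraphs
    latents = torch.stack([trained_net(sg) for sg in subgraphs])  # (b) latent per subgraph
    pooled = latents.mean(dim=0)                                  # (c) combine (here: pooling)
    prediction = head(pooled)
    loss = torch.nn.functional.mse_loss(prediction, label)        # (d) difference from the label
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()                                              # updates at least one parameter
    return loss.item()
```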

[0034] In some embodiments, the updating at least one parameter of the trained neural network is based at least in part on a difference between the prediction value and the given label. In some embodiments, the short-range interactions comprise bonded interactions, nonbonded interactions, or both. In some embodiments, the long-range interactions comprise nonbonded interactions, bonded interactions, or both. In some embodiments, the combining further comprises pooling the plurality of latent representations to generate the prediction value. In some embodiments, the combining further comprises using a machine learning algorithm to generate the prediction value based at least in part on the plurality of latent representations.

[0035] In some embodiments, the set of labels comprises (i) a physicochemical property, (ii) an absorption, distribution, metabolism, excretion, or toxicity (ADMET) property, (iii) a chemical reaction, (iv) a synthesizability, (v) a solubility, (vi) a chemical stability, (vii) a protease stability, (viii) a plasma protein binding property, (ix) a peptide-protein or protein-protein binding interaction, (x) an activity, (xi) a selectivity, (xii) a potency, (xiii) a pharmacokinetic property, (xiv) a pharmacodynamic property, (xv) an in vivo safety property, (xvi) a property reflecting a relationship to a biomarker, (xvii) a formulation property, or (xviii) any combination thereof.

[0036] In some embodiments, the trained neural network is sufficiently retrained when the set of peptide graphs comprises at most about 1000, 100, 10, 5, 2, or 1 peptide graphs. In some embodiments, the trained neural network is sufficiently retrained when the set of labels comprises at most about 1000, 100, 10, 5, 2, or 1 labels. In some embodiments, retraining the trained neural network further comprises: freezing at least a subset of parameters of the neural network during the retraining, fixing a fraction of the weights during the retraining, using a regularization function during the retraining, using one or more dropout layers during the retraining, or any combination thereof. In some embodiments, the method is implemented using one or more computer processors.
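One non-limiting way to realize the freezing and regularization options listed above in PyTorch is sketched below; the encoder_layers attribute and the particular optimizer settings are assumptions made for illustration, not part of the described method.

```python
# Hedged sketch: freeze most pretrained parameters; weight decay plays the role of
# the regularization function mentioned above.
import torch

def freeze_all_but_last(trained_net: torch.nn.Module, n_trainable_layers: int = 1):
    layers = list(trained_net.encoder_layers)        # assumed attribute, for illustration only
    for layer in layers[:-n_trainable_layers]:
        for p in layer.parameters():
            p.requires_grad = False                   # frozen subset of parameters

def make_optimizer(trained_net: torch.nn.Module) -> torch.optim.Optimizer:
    trainable = (p for p in trained_net.parameters() if p.requires_grad)
    return torch.optim.Adam(trainable, lr=1e-4, weight_decay=1e-5)  # illustrative values
```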

[0037] In some embodiments, the method further comprises providing access to the computer-implemented method to one or more users via a terminal, a web browser, an application, or a server infrastructure. In some embodiments, the terminal, the web browser, or the application comprises a graphical user interface for (i) receiving instructions for obtaining a query property prediction of a query peptide graph from the one or more users, and (ii) providing the query property prediction of the query peptide graph to the one or more users based at least in part on the instructions. In some embodiments, the server infrastructure is configured to transmit information to a second computer, wherein the information comprises a query latent representation of a query peptide. In some embodiments, the server infrastructure is configured to transmit information to a second computer, wherein the information comprises a peptide chemical structure associated with a query latent representation.

[0038] In some aspects, the present disclosure provides a computer program product comprising a computer-readable medium having computer-executable code encoded therein, the computer-executable code adapted to be executed to implement any one of the computer-implemented methods disclosed herein. In some embodiments, the computer-executable code comprises graphical flows between a plurality of tensor operations that define the neural network.

[0039] In some aspects, the present disclosure provides a non-transitory computer-readable storage media encoded with a computer program including instructions executable by one or more processors to perform any one of the computer-implemented methods disclosed herein. In some embodiments, the non-transitory computer-readable storage media is comprised in a distributed computing system, a cloud-based computing system, or a supercomputer comprising a plurality of central processing units, graphical processing units, or tensor processing units. In some embodiments, the non-transitory computer-readable storage media is comprised in a laptop computer or a personal desktop computer. In some embodiments, the computer program is encoded in a human non-readable format.

[0040] In some aspects, the present disclosure provides a computer-implemented system comprising: a digital processing device comprising: at least one processor, an operating system configured to perform executable instructions, a memory, and a computer program including instructions executable by the digital processing device to perform any one of the computer-implemented methods disclosed herein. [0041] In some aspects, the present disclosure provides a method of predicting a peptide property of a peptide, comprising: (a) providing a peptide graph comprising a plurality of nodes and a plurality of edges, wherein the plurality of edges encodes bonded interactions and nonbonded interactions; (b) generating a domain-specific latent representation based at least in part on the peptide graph, wherein the domain-specific latent representation encodes domain knowledge for a given domain, and short-range interactions and long-range interactions between the plurality of nodes in the peptide graph; and (c) generating a prediction of the peptide property based at least in part on the domain-specific latent representation.

[0042] In some embodiments, the short-range interactions comprise the bonded interactions, the nonbonded interactions, or both. In some embodiments, the long-range interactions comprise the nonbonded interactions, the bonded interactions, or both.

[0043] In some embodiments, the method further comprises synthesizing a peptide comprising a structure defined by the peptide graph when the prediction of the peptide property is above or below a predetermined threshold value.

[0044] In some aspects, the present disclosure provides a method for performing a domain-specific prediction using a peptide graph, comprising: (a) providing a machine learning architecture comprising: i. an encoder configured to receive at least a subgraph of the peptide graph and generate a latent representation that encodes (i) short-range interactions between nodes of the subgraph, (ii) long-range interactions between (1) at least one node of the subgraph and (2) at least one node in the peptide graph but not in the subgraph, and (iii) domain-specific information for at least one domain; and ii. a predictor configured to receive the latent representation and generate a prediction value; and (b) processing the peptide graph using the machine learning architecture and generating the prediction value.

[0045] In some embodiments, the domain-specific information comprises information for at least 2, 3, 4, 5, 10, or 100 domains. In some embodiments, the processing further comprises: (a) generating a plurality of subgraphs based at least partially on the peptide graph; (b) processing the plurality of subgraphs using the neural network to generate a plurality of latent representations; and (c) combining the plurality of latent representations to generate the prediction value.

[0046] In some embodiments, the method further comprises synthesizing a peptide comprising a structure defined by the peptide graph based at least in part on the prediction value. In some embodiments, the prediction of the peptide property is above or below a predetermined threshold value. [0047] In some embodiments, the method is implemented using one or more computer processors. In some embodiments, the method further comprises providing access to the computer-implemented method to one or more users via a terminal, a web browser, an application, or a server infrastructure. In some embodiments, the terminal, the web browser, or the application comprises a graphical user interface for (i) receiving instructions for obtaining a query property prediction of a query peptide graph from the one or more users, and (ii) providing the query property prediction of the query peptide graph to the one or more users based at least in part on the instructions. In some embodiments, the server infrastructure is configured to transmit information to a second computer, wherein the information comprises a query latent representation of a query peptide. In some embodiments, the server infrastructure is configured to transmit information to a second computer, wherein the information comprises a peptide chemical structure associated with a query latent representation.

[0048] In some aspects, the present disclosure provides a computer program product comprising a computer-readable medium having computer-executable code encoded therein, the computer-executable code adapted to be executed to implement any one of the computer-implemented methods disclosed herein. In some embodiments, the computer-executable code comprises graphical flows between a plurality of tensor operations that define the neural network.

[0049] In some aspects, the present disclosure provides a non-transitory computer-readable storage media encoded with a computer program including instructions executable by one or more processors to perform any one of the computer-implemented methods disclosed herein. In some embodiments, the non-transitory computer-readable storage media is comprised in a distributed computing system, a cloud-based computing system, or a supercomputer comprising a plurality of central processing units, graphical processing units, or tensor processing units. In some embodiments, the non-transitory computer-readable storage media is comprised in a laptop computer or a personal desktop computer. In some embodiments, the computer program is encoded in a human non-readable format.

[0050] In some aspects, the present disclosure provides a computer-implemented system comprising: a digital processing device comprising: at least one processor, an operating system configured to perform executable instructions, a memory, and a computer program including instructions executable by the digital processing device to perform any one of the computer-implemented methods disclosed herein. [0051] In some aspects, the present disclosure provides a method for training a neural network to learn hierarchical representations of peptides, comprising: (a) providing the neural network comprising: i. an encoder configured to receive a peptide representation of a peptide and generate a latent representation based at least in part on the peptide representation, wherein the latent representation encodes short-range interactions, long-range interactions, bonded interactions, non-bonded interactions, or any combination thereof between elements in the peptide representation; and ii. a mapper configured to receive the latent representation and generate a coarse-grained peptide representation of the peptide based at least partially on the latent representation, wherein the coarse-grained peptide representation comprises (i) a set of coarse-grained elements, and (ii) a set of identifiers for the set of coarse-grained elements, wherein the coarse-grained representation comprises a lower resolution than the peptide representation; (b) providing a training dataset comprising a sample peptide representation and a set of sample labels for the peptide representation; and (c) training the neural network using the training dataset.
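As a non-limiting structural sketch of the alternating encoder/mapper stack described in this paragraph, assuming generic PyTorch modules for the encoders and mappers (the class and attribute names are illustrative assumptions, not the claimed architecture):

```python
# Hedged sketch: each level encodes the current representation, then maps it to a
# coarser, lower-resolution one; the collected outputs form the hierarchy.
import torch.nn as nn

class HierarchicalPeptideModel(nn.Module):
    def __init__(self, encoders, mappers):
        super().__init__()
        assert len(encoders) == len(mappers)     # H levels of encoding/resolution
        self.encoders = nn.ModuleList(encoders)
        self.mappers = nn.ModuleList(mappers)

    def forward(self, representation):
        levels = []
        for encode, coarsen in zip(self.encoders, self.mappers):
            latent = encode(representation)      # latent at the current resolution
            representation = coarsen(latent)     # coarse-grained, lower-resolution output
            levels.append((latent, representation))
        return levels                            # hierarchical representation, H levels
```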

[0052] In some embodiments, training the neural network using the training dataset further comprises: (a) processing the sample peptide representation of the training dataset by the neural network; (b) generating a sample coarse-grained peptide representation based at least in part on the sample peptide representation, wherein the sample coarse-grained peptide representation comprises (1) a set of sample coarse-grained elements, and (2) a set of sample identifiers for the set of sample coarse-grained elements; and (c) updating at least one parameter of the neural network based at least in part on the set of sample identifiers. In some embodiments, the updating at least one parameter of the neural network is based on a loss function computed based at least partially on the set of sample identifiers and the set of sample labels.

[0053] In some embodiments, the neural network further comprises: a second encoder configured to receive the coarse-grained peptide representation and generate a second latent representation based at least partially on the coarse-grained peptide representation, wherein the second latent representation encodes short-range interactions, long-range interactions, bonded interactions, non-bonded interactions, or any combination thereof between the set of coarse-grained elements in the coarse-grained peptide representation. In some embodiments, the neural network further comprises: a second mapper configured to receive the second latent representation and generate a second coarse-grained peptide representation based at least partially on the second latent representation, wherein the second coarse-grained peptide representation comprises (i) a second set of coarse-grained elements, and (ii) a second set of identifiers for the second set of coarse-grained elements, wherein the second coarse-grained representation comprises a lower resolution than the coarse-grained representation. In some embodiments, the neural network further comprises a third encoder configured to generate a third latent representation based at least in part on the second coarse-grained representation. In some embodiments, the neural network further comprises a third mapper configured to generate a third coarse-grained representation based at least in part on the third latent representation, wherein the third coarse-grained representation comprises a lower resolution than the second coarse-grained representation. In some embodiments, the neural network comprises H number of encoders and mappers, such that the neural network is configured to generate a hierarchical representation comprising H number of resolutions and/or levels of encoding, wherein H is at least 2, 3, 4, 5, 6, 7, 8, 9, or 10. In some embodiments, the neural network comprises encoders and mappers in alternating order. In some embodiments, the mapper comprises a bipartite graph that maps the latent representation to the coarse-grained representation. In some embodiments, the second mapper comprises a second bipartite graph that maps the second latent representation to the second coarse-grained representation. In some embodiments, the third mapper comprises a third bipartite graph that maps the third latent representation to the third coarse-grained representation. In some embodiments, the mapper is configured to combine elements of the latent representation to create the coarse-grained representation. In some embodiments, the mapper is configured to combine by masking over elements of the latent representation. In some embodiments, the second mapper is configured to combine elements of the second latent representation to create the second coarse-grained representation. In some embodiments, the second mapper is configured to combine by masking over elements of the second latent representation. In some embodiments, the third mapper is configured to combine elements of the third latent representation to create the third coarse-grained representation. In some embodiments, the third mapper is configured to combine by masking over elements of the third latent representation.
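One illustrative, non-limiting way to realize a mapper that combines elements of a latent representation into coarse-grained elements through a soft bipartite assignment (a form of masking over elements) is sketched below; the dimensions and the softmax assignment are assumptions for this example only, not the only mapping contemplated.

```python
# Hedged sketch: a learned (num_fine x num_coarse) assignment acts as the bipartite
# graph/mask that combines fine elements into coarse-grained elements.
import torch
import torch.nn as nn

class AssignmentMapper(nn.Module):
    def __init__(self, dim: int, num_coarse: int):
        super().__init__()
        self.assign = nn.Linear(dim, num_coarse)   # edge weights of the bipartite graph

    def forward(self, fine_latent: torch.Tensor) -> torch.Tensor:
        # fine_latent: (num_fine_elements, dim)
        S = torch.softmax(self.assign(fine_latent), dim=-1)   # (num_fine, num_coarse) soft mask
        return S.t() @ fine_latent                            # (num_coarse, dim) coarse elements
```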

[0054] In some embodiments, the peptide representation comprises a plurality of nodes and a plurality of edges. In some embodiments, the latent representation comprises a plurality of nodes and a plurality of edges. In some embodiments, the coarse-grained representation comprises a plurality of nodes and a plurality of edges. In some embodiments, the second latent representation comprises a plurality of nodes and a plurality of edges. In some embodiments, the second coarse-grained representation comprises a plurality of nodes and a plurality of edges. In some embodiments, the third latent representation comprises a plurality of nodes and a plurality of edges. In some embodiments, the third coarse-grained representation comprises a plurality of nodes and a plurality of edges. In some embodiments, the plurality of nodes represent an atom, a functional group, a primary structure, a secondary structure, a tertiary structure, a quaternary structure, a motif, or any combination thereof. In some embodiments, the plurality of edges represent a bonded interaction, a nonbonded interaction, a short-range interaction, a long-range interaction, or any combination thereof.

[0055] In some embodiments, the encoder comprises one or more attention mechanisms configured to exchange a first set of attention coefficients between at least a first subset of elements in the peptide representation and at least a second subset of elements in the latent representation. In some embodiments, the second encoder comprises one or more attention mechanisms configured to exchange a first set of attention coefficients between at least a first subset of elements in the coarse-grained representation and at least a second subset of elements in the second latent representation. In some embodiments, the third encoder comprises one or more attention mechanisms configured to exchange a first set of attention coefficients between at least a first subset of elements in the second coarse-grained representation and at least a second subset of elements in the third latent representation. In some embodiments, the encoder is further configured to receive a structural feature, a physicochemical feature, or both associated with the peptide representation, such that the latent representation is based at least partially on the physicochemical feature, the structural feature, or both. In some embodiments, the second encoder is further configured to receive a structural feature, a physicochemical feature, or both associated with the coarse-grained representation, such that the second latent representation is based at least partially on the structural feature, the physicochemical feature, or both. In some embodiments, the third encoder is further configured to receive a structural feature, a physicochemical feature, or both associated with the second coarse-grained representation, such that the third latent representation is based at least partially on the structural feature, the physicochemical feature, or both. In some embodiments, the physicochemical feature comprises: (i) a physicochemical property, (ii) an absorption, distribution, metabolism, excretion, or toxicity (ADMET) property, (iii) a chemical reaction, (iv) a synthesizability, (v) a solubility, (vi) a chemical stability, (vii) a protease stability, (viii) a plasma protein binding property, (ix) a peptide-protein or protein-protein binding interaction, (x) an activity, (xi) a selectivity, (xii) a potency, (xiii) a pharmacokinetic property, (xiv) a pharmacodynamic property, (xv) an in vivo safety property, (xvi) a property reflecting a relationship to a biomarker, (xvii) a formulation property, or (xviii) any combination thereof. In some embodiments, the structural feature comprises: a three-dimensional structure of an atom, a motif, a functional group, a residue, a secondary structure, a tertiary structure, a quaternary structure, or a molecular complex of the peptide. In some embodiments, the structural feature comprises: a structural annotation of an atom, a motif, a functional group, a residue, a secondary structure, a tertiary structure, a quaternary structure, or a molecular complex of the peptide.
[0056] In some embodiments, the method further comprises: (a) providing a machine learning algorithm configured to (i) receive a hierarchical representation, and (ii) generate at least one predictive value associated with the hierarchical representation, wherein the hierarchical representation comprises the latent representation, the second latent representation, the third latent representation, the coarse-grained representation, the second coarse-grained representation, or the third coarse-grained representation, or any combination thereof; (b) providing a second training dataset comprising (i) one or more hierarchical representations encoded by the neural network, and (ii) one or more known values associated with the one or more hierarchical representations; and (c) training the machine learning algorithm at least in part by (i) processing the one or more hierarchical representations by the machine learning algorithm, (ii) generating one or more predictive values based at least in part on the one or more hierarchical representations, and (iii) updating at least one parameter of the machine learning algorithm based at least in part on the one or more predictive values. In some embodiments, the training reduces a difference between the one or more predictive values and the one or more known values.
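A minimal sketch of step (c) follows, assuming the hierarchical representations have already been pooled to fixed-length vectors; the regression head, optimizer settings, and random data are placeholders, not the disclosed machine learning algorithm.

```python
# Sketch: fitting a predictive model on precomputed hierarchical representations (hypothetical data/model).
import torch
import torch.nn as nn

# Pretend each peptide's hierarchical representation was pooled to a fixed-length vector.
reps = torch.randn(32, 16)              # 32 peptides, 16-dim pooled representations
known_values = torch.randn(32, 1)       # known property values from the second training dataset

head = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 1))
opt = torch.optim.Adam(head.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

for epoch in range(200):
    opt.zero_grad()
    predicted = head(reps)                       # (i)-(ii): generate predictive values
    loss = loss_fn(predicted, known_values)      # difference from the known values
    loss.backward()
    opt.step()                                   # (iii): update parameters to reduce the difference
```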

[0057] In some embodiments, the method further comprises: (a) processing one or more sample hierarchical representations by the machine learning algorithm to generate one or more sample predictive values; and (b) selecting at least one sample hierarchical representation in the one or more sample hierarchical representations as a candidate representation based at least in part on the one or more sample predictive values. In some embodiments, the machine learning algorithm is sufficiently trained when the second training dataset comprises at most about 1000, 100, 10, or 1 known values.

[0058] In some embodiments, the one or more known values comprise (i) a physicochemical property, (ii) an absorption, distribution, metabolism, excretion, or toxicity (ADMET) property, (iii) a chemical reaction, (iv) a synthesizability, (v) a solubility, (vi) a chemical stability, (vii) a protease stability, (viii) a plasma protein binding property, (ix) a peptide-protein or protein-protein binding interaction, (x) an activity, (xi) a selectivity, (xii) a potency, (xiii) a pharmacokinetic property, (xiv) a pharmacodynamic property, (xv) an in vivo safety property, (xvi) a property reflecting a relationship to a biomarker, (xvii) a formulation property, or (xviii) any combination thereof.

[0059] In some embodiments, the training dataset comprises (i) a plurality of sample peptide representations comprising the sample peptide representation, and (ii) a plurality of sets of sample labels comprising the set of sample labels. In some embodiments, the set of sample labels comprise an identity for an element in the sample peptide representation, a property associated with an element in the sample peptide representation, or both. In some embodiments, the set of sample labels is assigned by masking over each element in the sample peptide representation. In some embodiments, the masking comprises obscuring a given element in the sample peptide representation, processing neighboring elements of the given element by a neural network, and generating a label for the given element based at least partially on the neighboring elements. In some embodiments, the training of the neural network comprises self-supervised learning.

[0060] In some embodiments, training the neural network further comprises, for each given sample peptide representation in the plurality of sample peptide representations and each given set of sample labels in the plurality of sets of sample labels: i. processing the given sample peptide representation of the training dataset by the neural network; ii. generating a given sample coarse-grained peptide representation based at least partially on the given sample peptide representation, wherein the given sample coarse-grained peptide representation comprises (1) a given set of sample coarse-grained elements, and (2) a given set of sample identifiers for the given set of sample coarse-grained elements; and iii. updating at least one parameter of the neural network based at least in part on the given set of sample identifiers.

[0061] In some embodiments, the updating is based on a loss function computed based at least partially on the given set of sample identifiers and the given set of sample labels. In some embodiments, the set of sample labels comprises a label for each node in the plurality of nodes. In some embodiments, the set of sample labels comprises a label for each edge in the plurality of edges. In some embodiments, the loss function comprises a probability-based loss function. In some embodiments, the loss function is configured to generate a maximum likelihood value. In some embodiments, the loss function is configured to generate a Kullback-Leibler divergence value. In some embodiments, the loss function comprises a cross-entropy loss function.
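The masking objective above can be illustrated with a small sketch in which each node's identity is predicted from its neighbors and scored with a cross-entropy loss; the chain adjacency, feature sizes, and linear predictor are illustrative assumptions only, not the disclosed training procedure.

```python
# Sketch: self-supervised masking objective — predict each node's identity (label) from its neighbors.
import torch
import torch.nn as nn

num_types, dim = 20, 16                 # e.g., 20 residue types used as node labels
node_feats = torch.randn(10, dim)       # one sample peptide representation with 10 nodes
labels = torch.randint(0, num_types, (10,))

# Toy adjacency: a simple chain, so the neighbors of node i are i-1 and i+1.
adj = torch.zeros(10, 10)
for i in range(9):
    adj[i, i + 1] = adj[i + 1, i] = 1.0

predictor = nn.Linear(dim, num_types)   # maps aggregated neighbor features to identifiers
opt = torch.optim.Adam(predictor.parameters(), lr=1e-2)
loss_fn = nn.CrossEntropyLoss()         # probability-based loss over the sample labels

for step in range(100):
    # The adjacency has no self-loops, so each node's own features are effectively
    # obscured: only neighboring elements contribute to its prediction.
    neighbor_mean = (adj @ node_feats) / adj.sum(dim=1, keepdim=True)
    logits = predictor(neighbor_mean)   # set of sample identifiers (one distribution per node)
    loss = loss_fn(logits, labels)      # compared against the set of sample labels
    opt.zero_grad(); loss.backward(); opt.step()
```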

[0062] In some embodiments, the method is implemented using one or more computer processors. In some embodiments, the method further comprises providing access to the computer-implemented method to one or more users via a terminal, a web browser, an application, or a server infrastructure. In some embodiments, the terminal, the web browser, or the application comprises a graphical user interface for (i) receiving instructions for obtaining a query hierarchical representation of a query peptide from the one or more users, and (ii) providing the query hierarchical representation of the query peptide to the one or more users based at least in part on the instructions. In some embodiments, the server infrastructure is configured to transmit information to a second computer, wherein the information comprises a query hierarchical representation of a query peptide. In some embodiments, the server infrastructure is configured to transmit information to a second computer, wherein the information comprises a peptide chemical structure associated with a query latent representation.

[0063] In some aspects, the present disclosure provides a computer program product comprising a computer-readable medium having computer-executable code encoded therein, the computer-executable code adapted to be executed to implement any one of the methods disclosed herein. In some embodiments, the computer-executable code comprises graphical flows between a plurality of tensor operations that define the neural network.

[0064] In some aspects, the present disclosure provides a non-transitory computer-readable storage media encoded with a computer program including instructions executable by one or more processors to perform any one of the computer-implemented methods disclosed herein. In some embodiments, the non-transitory computer-readable storage media is comprised in a distributed computing system, a cloud-based computing system, or a supercomputer comprising a plurality of central processing units, graphical processing units, or tensor processing units. In some embodiments, the non-transitory computer-readable storage media is comprised in a laptop computer, or a personal desktop computer. In some embodiments, the computer program is encoded in a human non-readable format.

[0065] In some aspects, the present disclosure provides a computer-implemented system comprising: a digital processing device comprising: at least one processor, an operating system configured to perform executable instructions, a memory, and a computer program including instructions executable by the digital processing device to perform any one of the computer-implemented methods disclosed herein.

[0066] In some aspects, the present disclosure provides a method of generating a representation of peptides, comprising: (a) providing a first representation of a peptide, the first representation comprising nodes and edges, wherein the edges encode bonded and nonbonded interactions; and (b) generating a second representation of the peptide based at least in part on the first representation, wherein the second representation comprises a lower resolution than the first representation. In some embodiments, the method further comprises generating a third representation based at least in part on the second representation, wherein the third representation comprises a lower resolution than the second representation. In some embodiments, the method further comprises generating an N-th representation based at least in part on the third representation, wherein the N-th representation comprises a lower resolution than an (N-1)-th representation.

[0067] In some embodiments, the first representation comprises a plurality of nodes and a plurality of edges. In some embodiments, the second representation comprises a plurality of nodes and a plurality of edges. In some embodiments, the third representation comprises a plurality of nodes and a plurality of edges. In some embodiments, the N-th representation comprises a plurality of nodes and a plurality of edges. In some embodiments, the plurality of nodes represent an atom, a functional group, a primary structure, a secondary structure, a tertiary structure, a quaternary structure, a motif, or any combination thereof. In some embodiments, the plurality of edges represent a bonded interaction, a nonbonded interaction, a short-range interaction, a long-range interaction, or any combination thereof. In some embodiments, the generating the second representation is based at least in part on a structural feature, a physicochemical feature, or both. In some embodiments, the generating the third representation is based at least in part on a structural feature, a physicochemical feature, or both. In some embodiments, the generating the N-th representation is based at least in part on a structural feature, a physicochemical feature, or both.

[0068] In some embodiments, the physicochemical feature comprises: (i) a physicochemical property, (ii) an absorption, distribution, metabolism, excretion, or toxicity (ADMET) property, (iii) a chemical reaction, (iv) a synthesizability, (v) a solubility, (vi) a chemical stability, (vii) a protease stability, (viii) a plasma protein binding property, (ix) a peptide-protein or protein-protein binding interaction, (x) an activity, (xi) a selectivity, (xii) a potency, (xiii) a pharmacokinetic property, (xiv) a pharmacodynamic property, (xv) an in vivo safety property, (xvi) a property reflecting a relationship to a biomarker, (xvii) a formulation property, or (xviii) any combination thereof.

[0069] In some embodiments, the structural feature comprises: a three-dimensional structure of an atom, a motif, a functional group, a residue, a secondary structure, a tertiary structure, a quaternary structure, or a molecular complex. In some embodiments, the first representation is a latent representation. In some embodiments, the second representation is a latent representation. In some embodiments, the third representation is a latent representation. In some embodiments, the N-th representation is a latent representation. In some embodiments, the generating is performed at least in part by using a neural network. In some embodiments, the generating is performed at least in part by using a mapper.

[0070] In some embodiments, the method further comprises synthesizing a peptide comprising a structure associated with the first latent representation, the second latent representation, or both.

[0071] In some embodiments, the method is implemented using one or more computer processors. In some embodiments, the method further comprises providing access to the method to one or more users via a terminal, a web browser, an application, or a server infrastructure. In some embodiments, the terminal, the web browser, or the application comprises a graphical user interface for (i) receiving instructions for obtaining a query hierarchical representation of a query peptide from the one or more users, and (ii) providing the query hierarchical representation of the query peptide to the one or more users based at least in part on the instructions. In some embodiments, the server infrastructure is configured to transmit information to a second computer, wherein the information comprises a query hierarchical representation of a query peptide. In some embodiments, the server infrastructure is configured to transmit information to a second computer, wherein the information comprises a peptide chemical structure associated with a query hierarchical representation.

[0072] In some aspects, the present disclosure provides a computer program product comprising a computer-readable medium having computer-executable code encoded therein, the computer-executable code adapted to be executed to implement any one of the computer-implemented methods disclosed herein. In some embodiments, the computer-executable code comprises graphical flows between a plurality of tensor operations that define the neural network.

[0073] In some aspects, the present disclosure provides a non-transitory computer-readable storage media encoded with a computer program including instructions executable by one or more processors to perform any one of the computer-implemented methods disclosed herein. In some embodiments, the non-transitory computer-readable storage media is comprised in a distributed computing system, a cloud-based computing system, or a supercomputer comprising a plurality of central processing units, graphical processing units, or tensor processing units. In some embodiments, the non-transitory computer-readable storage media is comprised in a laptop computer, or a personal desktop computer. In some embodiments, the computer program is encoded in a human non-readable format.

[0074] In some aspects, the present disclosure provides a computer-implemented system comprising: a digital processing device comprising: at least one processor, an operating system configured to perform executable instructions, a memory, and a computer program including instructions executable by the digital processing device to perform any one of the computer-implemented methods disclosed herein.

[0075] In some aspects, the present disclosure provides a method for training a deep generative neural network, comprising: (a) providing a neural network comprising: i. an encoder configured to receive a representation of a target peptide and generate a latent representation of the target peptide; and ii. a transformer configured to receive the latent representation of the target peptide and generate a predicted peptide ligand representation of a ligand that is predicted to bind to the target peptide; (b) providing a plurality of target peptide representations and a plurality of ligand representations; and (c) training the neural network at least in part by: i. processing the plurality of target peptide representations by the neural network; ii. generating a plurality of predicted ligand representations based at least partially on the plurality of target peptide representations; and iii. updating at least one parameter of the neural network based at least in part on the plurality of predicted peptide ligand representations.

[0076] In some embodiments, the updating is based on a loss function computed between the plurality of predicted peptide ligand representations and the plurality of ligand representations. In some embodiments, the representation of the target comprises: a primary structure of at least a portion of the target peptide, a secondary structure of at least a portion of the target peptide, a tertiary structure of at least a portion of the target peptide, a physical property of at least a portion of the target peptide, a target site of the target peptide, a relative accessible surface area of at least a portion of the target peptide, or any combination thereof. In some embodiments, the primary structure of at least the portion of the target is tokenized. In some embodiments, the secondary structure of at least the portion of the target is tokenized. In some embodiments, the tertiary structure of at least a portion of the target peptide, a physical property of at least a portion of the target peptide, a target site of the target peptide, a relative accessible surface area of at least a portion of the target peptide, or any combination thereof is encoded using a neural network. In some embodiments, the representation of the target comprises an embedding of: a primary structure of at least a portion of the target peptide, a secondary structure of at least a portion of the target peptide, a tertiary structure of at least a portion of the target peptide, a physical property of at least a portion of the target peptide, a target site of the target peptide, a relative accessible surface area of at least a portion of the target peptide, or any combination thereof. In some embodiments, the ligand representation comprises: (i) an atomistic structure of the ligand, (ii) a primary structure of the ligand, (iii) a tertiary structure of the ligand, or (iv) any combination thereof.

[0077] In some embodiments, the representation of the target comprises a plurality of nodes and a plurality of edges. In some embodiments, the representation of the target comprises a hierarchical latent representation of the target. In some embodiments, the representation encodes short-range interactions, long-range interactions, bonded interactions, and non-bonded interactions within the target peptide.

[0078] In some embodiments, the method is implemented using one or more computer processors. In some embodiments, the method further comprises providing access to the method to one or more users via a terminal, a web browser, an application, or a server infrastructure. In some embodiments, the terminal, the web browser, or the application comprises a graphical user interface for (i) receiving instructions for obtaining a query ligand representation for a query target peptide from the one or more users, and (ii) providing the query ligand representation to the one or more users based at least in part on the instructions. In some embodiments, the server infrastructure is configured to transmit information to a second computer, wherein the information comprises a query ligand representation of a query target peptide. In some embodiments, the server infrastructure is configured to transmit information to a second computer, wherein the information comprises a peptide chemical structure associated with a query ligand representation.

[0079] In some aspects, the present disclosure provides a computer program product comprising a computer-readable medium having computer-executable code encoded therein, the computer-executable code adapted to be executed to implement any one of the computer-implemented methods disclosed herein. In some embodiments, the computer-executable code comprises graphical flows between a plurality of tensor operations that define the neural network.

[0080] In some aspects, the present disclosure provides a non-transitory computer-readable storage media encoded with a computer program including instructions executable by one or more processors to perform any one of the computer-implemented methods disclosed herein. In some embodiments, the non-transitory computer-readable storage media is comprised in a distributed computing system, a cloud-based computing system, or a supercomputer comprising a plurality of central processing units, graphical processing units, or tensor processing units. In some embodiments, the non-transitory computer-readable storage media is comprised in a laptop computer or a personal desktop computer. In some embodiments, the computer program is encoded in a human non-readable format.

[0081] In some aspects, the present disclosure provides a computer-implemented system comprising: a digital processing device comprising: at least one processor, an operating system configured to perform executable instructions, a memory, and a computer program including instructions executable by the digital processing device to perform any one of the computer-implemented methods disclosed herein.

[0082] In some aspects, the present disclosure provides a method for generating a ligand representation, comprising: (a) providing a latent representation of a target peptide, wherein the latent representation comprises an embedding for at least one of: a primary structure of at least a portion of the target peptide, a secondary structure of at least a portion of the target peptide, a tertiary structure of at least a portion of the target peptide, a physical property of at least a portion of the target peptide, a target site of the target peptide, a relative accessible surface area of at least a portion of the target peptide, or any combination thereof; and (b) generating the ligand representation of a ligand based at least in part on the latent representation, wherein the ligand is predicted to bind to at least a portion of the target peptide.

[0083] In some embodiments, the method further comprises providing a second latent representation, wherein the second latent representation is provided by a trained machine learning model, wherein the trained machine learning model comprises a language model, a graph model, a flow model, a generative adversarial network, a variational autoencoder, an autoregressive model, an autoencoder, a diffusion model, or any combination thereof.

[0084] In some embodiments, the generating comprises adding noise to the latent representation. In some embodiments, the generating comprises generating using a decoder neural network of a transformer neural network. In some embodiments, the generating comprises using a greedy-sampling algorithm, a Monte Carlo tree search, a beam search, a genetic algorithm, or any combination thereof.
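For illustration, the sketch below runs a greedy-sampling loop over a stand-in decoder conditioned on a target latent vector; the ToyDecoder class, token vocabulary, and stopping rule are hypothetical assumptions, and a beam search, Monte Carlo tree search, or genetic algorithm could replace the argmax step.

```python
# Sketch: greedy decoding of a ligand sequence from a target latent representation (hypothetical model).
import torch
import torch.nn as nn

vocab = ["<start>", "<end>"] + list("ACDEFGHIKLMNPQRSTVWY")  # specials + 20 natural residues

class ToyDecoder(nn.Module):
    """Stand-in for a decoder neural network conditioned on the target's latent representation."""
    def __init__(self, dim, vocab_size):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.out = nn.Linear(dim, vocab_size)

    def forward(self, latent, prefix_ids):            # latent: (dim,), prefix_ids: (t,)
        h = self.embed(prefix_ids).mean(dim=0) + latent
        return self.out(h)                             # next-token logits

decoder = ToyDecoder(dim=32, vocab_size=len(vocab))
latent = torch.randn(32)                               # latent representation of the target peptide

ids = [vocab.index("<start>")]
for _ in range(30):                                    # greedy sampling: take the argmax token each step
    logits = decoder(latent, torch.tensor(ids))
    next_id = int(torch.argmax(logits))
    ids.append(next_id)
    if vocab[next_id] == "<end>":
        break
print("".join(vocab[i] for i in ids[1:] if vocab[i] not in ("<start>", "<end>")))
```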

[0085] In some aspects, the present disclosure provides a method for generating a ligand representation, comprising: (a) providing a peptide graph of a target peptide, the peptide graph comprising a plurality of nodes and a plurality of edges, wherein the plurality of edges encodes short-range interactions, long-range interactions, bonded interactions, and nonbonded interactions; and (b) generating the ligand representation of a ligand based at least in part on the target peptide graph, wherein the ligand is predicted to bind to at least a portion of the target peptide.

[0086] In some aspects, the present disclosure provides a method for generating a ligand representation, comprising: (a) providing a hierarchical representation of a peptide, wherein the hierarchical representation comprises at least one high-resolution representation and at least one low-resolution representation; and (b) generating the ligand representation of a ligand based at least in part on the hierarchical representation, wherein the ligand is predicted to bind to at least a portion of the peptide.

[0087] In some embodiments, the generating is performed at least partially by using a transformer neural network. In some embodiments, the generating is performed at least partially by using a message-passing neural network. In some embodiments, the method further comprises performing a binding experiment between the target peptide and the ligand.

[0088] In some embodiments, the method is implemented using one or more computer processors. In some embodiments, the method further comprises providing access to the method to one or more users via a terminal, a web browser, an application, or a server infrastructure. In some embodiments, the terminal, the web browser, or the application comprises a graphical user interface for (i) receiving instructions for obtaining a query ligand representation for a query target peptide from the one or more users and (ii) providing the query ligand representation to the one or more users based at least in part on the instructions. In some embodiments, the server infrastructure is configured to transmit information to a second computer, wherein the information comprises a query ligand representation of a query target peptide. In some embodiments, the server infrastructure is configured to transmit information to a second computer, wherein the information comprises a peptide chemical structure associated with a query ligand representation.

[0089] In some aspects, the present disclosure provides a computer program product comprising a computer-readable medium having computer-executable code encoded therein, the computer-executable code adapted to be executed to implement any one of the computer-implemented methods disclosed herein. In some embodiments, the computer-executable code comprises graphical flows between a plurality of tensor operations that define the neural network.

[0090] In some aspects, the present disclosure provides a non-transitory computer-readable storage media encoded with a computer program including instructions executable by one or more processors to perform any one of the computer-implemented methods disclosed herein. In some embodiments, the non-transitory computer-readable storage media is comprised in a distributed computing system, a cloud-based computing system, or a supercomputer comprising a plurality of central processing units, graphical processing units, or tensor processing units. In some embodiments, the non-transitory computer-readable storage media is comprised in a laptop computer, or a personal desktop computer. In some embodiments, the computer program is encoded in a human non-readable format.

[0091] In some aspects, the present disclosure provides a computer-implemented system comprising: a digital processing device comprising: at least one processor, an operating system configured to perform executable instructions, a memory, and a computer program including instructions executable by the digital processing device to perform any one of the computer-implemented methods disclosed herein.

[0092] In some aspects, the present disclosure provides a non-transitory computer-readable storage medium, comprising: (a) a peptide graph comprising a plurality of nodes and a plurality of edges, wherein the plurality of edges encodes bonded interactions and nonbonded interactions between the plurality of nodes; and (b) a latent representation for a node in the plurality of nodes, wherein the latent representation encodes at least short-range interactions and long-range interactions between the node and at least a subset of nodes in the plurality of nodes.

[0093] In some aspects, the present disclosure provides a non-transitory computer-readable storage medium, comprising: (a) a first representation of a peptide comprising a plurality of nodes and a plurality of edges, wherein the plurality of edges encode bonded and nonbonded interactions between the plurality of nodes; and (b) an encoding based at least partially on the first representation, wherein the encoding comprises a lower resolution than the first representation.

[0094] In some aspects, the present disclosure provides a method comprising: (a) providing a machine learning algorithm configured to process a hierarchical representation of a peptide to generate a predictive value associated with the hierarchical representation of the peptide; (b) processing a plurality of hierarchical representations of a plurality of peptides using the machine learning algorithm to generate a plurality of predictive values associated with the plurality of hierarchical representations of the plurality of peptides; (c) selecting at least a subset of hierarchical representations in the plurality of hierarchical representations when each hierarchical representation in the at least the subset of hierarchical representations is associated with one or more predictive values in the plurality of predictive values, wherein the predictive values are indicative of a quantitative metric of the plurality of hierarchical representations; and (d) generating one or more peptide structures associated with the subset of hierarchical representations.

[0095] In some embodiments, the optimized property comprises at least one of: (i) a physicochemical property, (ii) an absorption, distribution, metabolism, excretion, or toxicity (ADMET) property, (iii) a chemical reaction, (iv) a synthesizability, (v) a solubility, (vi) a chemical stability, (vii) a protease stability, (viii) a plasma protein binding property, (ix) a pharmacokinetic property, (x) a pharmacodynamic property, (xi) an in vivo safety property, (xii) a property related to a biomarker, (xiii) a formulation property, or (xiv) any combination thereof. In some embodiments, the optimized property is a pharmacological property, and the pharmacological property comprises a targeting ability or an antitargeting ability for one or more targets.

[0096] In some embodiments, the one or more targets comprise biomolecules found in a human body, an animal, or a plant. In some embodiments, the biomolecules comprise proteins, nucleic acids, lipids, or any combination thereof.

[0097] In some embodiments, the subset of hierarchical representations is selected when the subset of hierarchical representations is associated with a plurality of optimized properties.

[0098] In some embodiments, the library of peptides comprises peptides associated with at least one of: i. a plurality of targets associated with human medical indications; ii. a plurality of human peptides; iii. a plurality of murine peptides; iv. a plurality of canine peptides; v. a plurality of primate peptides; vi. a plurality of animal peptides; vii. a plurality of plant peptides; viii. a plurality of bacterial peptides; ix. a plurality of viral peptides; x. a plurality of fungal peptides; and xi. a plurality of proteins.

[0099] In some embodiments, the library of peptides comprises peptides associated with at least one of: i. a plurality of targets associated with human medical indications; ii. a plurality of known human proteins; iii. a plurality of known murine proteins; iv. a plurality of known canine proteins; v. a plurality of known primate proteins; vi. a plurality of known animal proteins; vii. a plurality of known plant proteins; viii. a plurality of known bacterial proteins; ix. a plurality of known viral proteins; x. a plurality of known fungal proteins; and xi. a plurality of known proteins.

[0100] In some embodiments, the method further comprises: (a) synthesizing a peptide associated with a hierarchical representation in the subset of hierarchical representations; (b) measuring the optimized property of the peptide; and (c) updating the peptide library to incorporate a measurement based at least partially on the measuring in (b).

[0101] In some aspects, the present disclosure provides an artificial intelligence system for generating peptide structures, comprising: (a) a peptide library comprising peptide structures and peptide properties; (b) a representation learning system configured to receive at least the peptide structures from the peptide library and output hierarchical latent representations of the peptide structures; and (c) a transfer learning system configured to receive at least the peptide properties from the peptide library and at least the hierarchical latent representations from the representation learning system, and output domain-specific hierarchical latent representations based at least in part on the peptide structures and the peptide properties.

[0102] In some embodiments, the system further comprises a prediction system configured to receive at least the domain-specific hierarchical latent representations and output at least one property prediction for a peptide of interest. In some embodiments, the system further comprises a peptide structure generating system configured to receive at least one domain-specific hierarchical latent representation in the domain-specific hierarchical latent representations and output a peptide structure for a peptide of interest. In some embodiments, the system further comprises a transfer learning system configured to receive two or more peptide libraries to output a refined peptide library, wherein the two or more peptide libraries comprise (i) a first peptide library comprising a first set of ligands for a first target, and (ii) a second peptide library comprising a second set of ligands for a second target, and wherein the refined peptide library comprises a refined set of ligands that (1) targets the first target but not the second target, or (2) targets the first target and the second target, or (3) targets neither the first target nor the second target.

[0103] In some aspects, the present disclosure provides an in silico drug discovery method comprising: (a) performing a first set of docking simulations between a first set of peptides and a molecular target to generate a training dataset; (b) training a generative neural network using the training dataset; (c) generating, using the generative neural network, a plurality of peptide sequences; (d) clustering the plurality of peptide sequences into one or more clusters; (e) selecting a subset of peptide sequences in the plurality of sequences, wherein the subset of peptide sequences are substantially distributed among the one or more clusters; and (f) performing a second set of docking simulations between a second set of peptides and the molecular target, wherein the second set of peptides comprises the subset of peptide sequences.
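One possible, simplified realization of steps (d)-(e) is sketched below: generated sequences are embedded with a toy amino-acid-composition featurization, clustered with k-means, and one representative is taken per cluster so the selected subset is distributed among the clusters. The featurization, cluster count, and example sequences are placeholders rather than the disclosed workflow.

```python
# Sketch: cluster generated peptide sequences and pick a subset spread across clusters (illustrative only).
import numpy as np
from sklearn.cluster import KMeans

AA = "ACDEFGHIKLMNPQRSTVWY"

def composition(seq):
    """Toy featurization: normalized amino-acid composition vector."""
    counts = np.array([seq.count(a) for a in AA], dtype=float)
    return counts / max(len(seq), 1)

sequences = ["ACDKLM", "ACDKLL", "WWYYFF", "WWYYFL", "GGSGGS", "GGSGGT"]  # placeholder generator output
X = np.stack([composition(s) for s in sequences])

kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
subset = []
for c in range(3):
    members = np.where(kmeans.labels_ == c)[0]
    # One representative per cluster keeps the selection distributed among the clusters.
    best = members[np.argmin(np.linalg.norm(X[members] - kmeans.cluster_centers_[c], axis=1))]
    subset.append(sequences[best])
print(subset)   # candidates for the second set of docking simulations
```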

[0104] In some aspects, the present disclosure provides an in silico drug discovery method comprising: (a) performing a first set of docking simulations between a first set of peptides and a molecular target to generate a training dataset; (b) training a prediction neural network using the training dataset; (c) generating a plurality of peptide sequences; (d) filtering, using the prediction neural network, a subset of peptide sequences from the plurality of peptide sequences; and (e) performing a second set of docking simulations between a second set of peptides and the molecular target, wherein the second set of peptides comprises the subset of peptide sequences, and wherein at least one peptide in the second set of peptides binds more favorably to the molecular target than each peptide in the first set of peptides.

[0105] In some aspects, the present disclosure provides an in silico drug discovery method comprising: (a) generating, using a generative neural network, a plurality of peptide sequences for binding a molecular target, wherein the generating is based at least partially on a docking simulation dataset; (b) screening, using a predictive neural network, a subset of peptide sequences from the plurality of peptide sequences; and (c) performing a set of docking simulations between a set of peptides and the molecular target, wherein the set of peptides comprises the subset of peptide sequences.

[0106] Another aspect of the present disclosure provides a non-transitory computer readable medium comprising machine executable code that, upon execution by one or more computer processors, implements any of the methods above or elsewhere herein.

[0107] Another aspect of the present disclosure provides a system comprising one or more computer processors and computer memory coupled thereto. The computer memory can comprise machine executable code that, upon execution by the one or more computer processors, implements any of the methods above or elsewhere herein.

[0108] Additional aspects and advantages of the present disclosure will become readily apparent to those skilled in this art from the following detailed description, wherein only illustrative embodiments of the present disclosure are shown and described. As will be realized, the present disclosure is capable of other and different embodiments, and its several details are capable of modifications in various obvious respects, all without departing from the disclosure. Accordingly, the drawings and description are to be regarded as illustrative in nature, and not as restrictive.

INCORPORATION BY REFERENCE

[0109] All publications, patents, and patent applications mentioned in this specification are herein incorporated by reference to the same extent as if each individual publication, patent, or patent application was specifically and individually indicated to be incorporated by reference. To the extent publications and patents or patent applications incorporated by reference contradict the disclosure contained in the specification, the specification is intended to supersede and/or take precedence over any such contradictory material.

BRIEF DESCRIPTION OF THE DRAWINGS

[0110] The novel features of the disclosure are set forth with particularity in the appended claims. A better understanding of the features and advantages of the present disclosure will be obtained by reference to the following detailed description that sets forth illustrative embodiments, in which the principles of the disclosure are utilized, and the accompanying drawings of which:

[0111] FIG. 1 shows a 3D structure of a peptide, an atomistic graph representation of a peptide, and a residue graph representation of a peptide.

[0112] FIG. 2 shows a 3D residue representation of a peptide.

[0113] FIG. 3 shows a 3D atomistic representation of a peptide.

[0114] FIG. 4 shows a residue graph representation of a peptide.

[0115] FIG. 5 shows a neighborhood graph representation of a peptide.

[0116] FIG. 6 shows a context graph representation of a peptide.

[0117] FIG. 7 shows a bipartite graph connecting an atomistic graph of a peptide and the residue graph of the peptide.

[0118] FIG. 8 shows an architecture of a self-supervised representation learning algorithm.

[0119] FIG. 9A shows a graph attention layer.

[0120] FIG. 9B shows a message passing layer.

[0121] FIG. 10 shows an architecture of a generative model.

[0122] FIG. 11 shows a computer system programmed to implement systems and methods of the present disclosure.

[0123] FIG. 12 shows a schematic for a drug discovery process.

[0124] FIG. 13 shows a schematic for a drug discovery process.

[0125] FIG. 14 shows a schematic for a drug discovery process.

[0126] FIG. 15 shows a graphical user interface (GUI) for the Nautilus™ Peptide Library Browser, a web browser-based tool for browsing and visualizing multiparameter-optimized peptide libraries. The opening screen can allow selection of a target protein from a list, constrained by user-supplied text in the input box.

[0127] FIG. 16 shows production of a summary for a selected target, e.g., PDC10_HUMAN, corresponding to programmed cell death protein 10. The GUI can be used to produce a summary of the target followed by a visualization of properties that are important for peptide drug discovery and user interface controls to limit the displayed peptides according to desired values of those properties. The multiparameter-optimized peptide library for PDC10_HUMAN is shown to contain 6,325 peptides. In this example, the properties rendered are “molecular_weight” (molecular weight, expressed in Daltons), “solubility” (probability of the peptide being soluble in water), “lipophilicity” (lipophilicity; parameterized, with higher being more lipophilic), “cell_penetrating” (probability that a peptide is capable of crossing cell membranes), “bbb_penetrating” (probability that a peptide is capable of crossing the blood-brain barrier), and “toxicity” (probability that a peptide is cytotoxic). The top numbers at the left and right ends of each control indicate the minimum and maximum values for each of the properties, and the lower numbers indicate the currently selected values. The clusters in the visualization illustrate peptides that are similar as measured in the six-dimensional space (for this example; the actual dimensionality is typically higher because more properties are included), where the reduction from six dimensions to two dimensions for visualization is produced using uniform manifold approximation and projection (UMAP), a general, nonlinear dimensionality reduction algorithm that facilitates visualization of high-dimensional data and is well suited for machine learning applications.
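For orientation only, the following sketch reduces a placeholder six-property table to two dimensions with the umap-learn package, analogous to the rendering described above; the random property matrix is an assumption standing in for an actual optimized library.

```python
# Sketch: reduce a six-property peptide table to two dimensions with UMAP for plotting (placeholder data).
import numpy as np
import umap   # pip install umap-learn

rng = np.random.default_rng(0)
# Columns: molecular_weight, solubility, lipophilicity, cell_penetrating, bbb_penetrating, toxicity
properties = rng.random((500, 6))

embedding = umap.UMAP(n_components=2, random_state=0).fit_transform(properties)
print(embedding.shape)   # (500, 2) points that can be plotted to reveal clusters of similar peptides
```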

[0128] FIG. 17 shows an illustration of selecting peptides having properties in specific ranges - the user has selected peptides predicted to be soluble, cell-penetrating (able to cross cell membranes), have low toxicity, and have moderate to high lipophilicity. This selection allows the number of peptides selected to be reduced from the total of 6,325 to only 2. The selection can be viewed on-screen or downloaded for offline analysis.

[0129] FIG. 18 shows a comma-separated values (CSV) file that is downloaded through the user’s browser. In this example, the downloaded file is displayed using spreadsheet software.

[0130] FIG. 19 shows a reduced-dimension rendering of six properties of the property-optimized peptide library for the glucocorticoid receptor, a human protein with UniProtKB ID GCR_HUMAN, illustrating the distribution of six properties that are important for peptide drug discovery. The properties shown include “molecular_weight” (molecular weight, expressed in Daltons), “solubility” (probability of the peptide being soluble in water), “lipophilicity” (lipophilicity; parameterized, with higher being more lipophilic), “cell_penetrating” (probability that a peptide is capable of crossing cell membranes), “bbb_penetrating” (probability that a peptide is capable of crossing the blood-brain barrier), and “toxicity” (probability that a peptide is cytotoxic). The reduced-dimension rendering is produced using uniform manifold approximation and projection (UMAP), a nonlinear dimensionality reduction algorithm that facilitates visualization of high-dimensional data and is well suited for machine learning applications.

[0131] FIG. 20 shows a reduced-dimension rendering of six properties of the property-optimized peptide library for mitogen-activated protein kinase 9, a human protein with UniProtKB ID MK09_HUMAN.

[0132] FIG. 21 shows a reduced-dimension rendering of six properties of the property-optimized peptide library for tumor susceptibility gene 101 protein, a human protein with UniProtKB ID TS101_HUMAN.

[0133] FIG. 22 shows a reduced-dimension rendering of six properties of the property-optimized peptide library for programmed cell death protein 10, a human protein with UniProtKB ID PDC10_HUMAN.

DETAILED DESCRIPTION

[0134] While various embodiments of the invention have been shown and described herein, it will be obvious to those skilled in the art that such embodiments are provided by way of example only. Numerous variations, changes, and substitutions may occur to those skilled in the art without departing from the invention. It should be understood that various alternatives to the embodiments of the invention described herein may be employed.

[0135] As used in the specification and claims, the singular form “a”, “an”, and “the” include plural references unless the context clearly dictates otherwise. For example, the term “a peptide” includes a plurality of peptides, including mixtures thereof.

[0136] As used herein, the terms “representation” and “descriptor” generally refer to a description of a molecular entity. For example, commonly used descriptors include SMILES strings, primary sequences, or nucleic acid sequences. In some cases, a representation may be used as feature values for a neural network. In some cases, a representation may describe a peptide.

[0137] As used herein, the terms “peptide” and “protein” generally refer to any molecule comprising two or more amino acids.

[0138] A peptide may form large macromolecular structures, whose geometry may be dependent on both short-range interactions between atoms and residues (e.g., covalent bonds), and long-range interactions (e.g., electrostatic interactions, disulfide bridges, or hydrogen bonding between distant residues in a primary structure). It may be desirable to encode both short-range and long-range interactions in a molecular structure using a neural network in a latent space, such that the organized latent space can be analyzed to seek useful candidate peptides for synthesis and experiments.

[0139] One challenge in encoding peptide structures into a latent space is scalability. Since the size of peptides may vary greatly (e.g., from 2 amino acids to a virtually unlimited number of amino acids), a useful neural network may be designed to be “scalable”, for example, to be applicable to peptides across many different sizes. In some aspects, the present disclosure provides systems and methods for training and using neural networks to produce latent representations over node representations of a peptide (e.g., a node may be an atom) in a “sliding window” fashion. For example, a trained neural network may be applied on each node in the node representation of a peptide without being limited to a certain size range of peptides.

[0140] Another challenge in encoding peptide structures into a latent space is preserving both short-range and long-range interactions in a peptide structure or an assembly comprising a peptide. Short-range interactions may include covalent bonds, and long-range interactions may include electrostatic forces or interactions that exist between distant amino acids in an amino acid sequence. In some aspects, the present disclosure provides systems and methods for training and using neural networks to produce latent representations that incorporate both short-range and long-range interactions. In one example, two neural networks may be trained simultaneously, one for short-range and one for long-range, where the two are coupled in the loss gradients such that one learns a latent space similar to the other.

[0141] Another challenge with encoding peptide structures is in exploring the chemical space of peptides. Peptides have enormous diversity in the different structures that may be achievable. Even when it is assumed that a unique sequence corresponds to exactly one conformation (which may not be true, since mechanisms and modifications such as post-translational modifications and chaperone folding lead to even more diversity in peptide conformations), and even when only 20 natural amino acids in the human body are considered, the number of possibilities scales as 20^N, wherein N is the number of amino acids. High-throughput library screening, alone, can be insufficient because the number of compounds that can be produced and assessed is only a small fraction of the possible compounds, thereby reducing the chance of finding a good drug candidate. For example, a high-throughput screen may produce 10^12 compounds, whereas the space of small molecules may be approximately 10^60, and the space of peptides of length 50 including nonstandard amino acids and modifications is approximately 10^181. Thus, the odds of finding a drug candidate in a random high-throughput screen can be as low as 1 in 10^48 or 1 in 10^169. In some aspects, the present disclosure provides systems and methods for performing transfer learning on trained neural networks (e.g., tuning a trained neural network on a domain-specific dataset) such that much less of the chemical space needs to be considered in chemical space exploration.
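The figures quoted above can be checked with simple arithmetic, as in the sketch below; the 20^50 calculation (roughly 10^65 sequences for length-50 peptides built from the 20 natural amino acids) is a derived illustration rather than a number stated in this disclosure.

```python
# Back-of-the-envelope check of the chemical-space numbers quoted above.
from math import log10

natural_space = 20 ** 50                 # length-50 peptides from 20 natural amino acids (derived figure)
print(round(log10(natural_space)))       # ~65, i.e., about 10^65 sequences

screen = 10 ** 12                        # compounds assessable in a high-throughput screen
small_molecules = 10 ** 60
extended_peptides = 10 ** 181            # length 50 with nonstandard residues and modifications
print(round(log10(small_molecules / screen)))    # ~48 -> odds of roughly 1 in 10^48
print(round(log10(extended_peptides / screen)))  # ~169 -> odds of roughly 1 in 10^169
```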

[0142] In some aspects, the present disclosure provides a method for training a neural network to learn short-range and long-range interactions of a peptide. In some cases, the method may be used to encode physical and/or chemical information of the neighborhood of each element (e.g., a node) in a peptide representation (e.g., a graph representation) by embedding features of the element to a latent space based at least partially on the neighborhood context. The learned node-level information can be engineered to include the peptide- or protein-specific knowledge to create the appropriate latent representation.

[0143] In some cases, the method comprises constructing a peptide graph based at least partially on a peptide structure. For example, FIG. 1 illustrates peptide graphs constructed from a peptide structure. In some cases, the peptide structure (1001) may be used to generate an atomistic representation of a graph (1002). In some cases, the peptide structure may be used to generate a residue representation of a graph (1003). In some cases, the peptide graph comprises a plurality of nodes and a plurality of edges. In some cases, the plurality of edges represents at least bonded and nonbonded interactions between the plurality of nodes of the peptide graph.
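A minimal sketch of such a construction is given below, assuming placeholder residue coordinates and an illustrative distance cutoff: consecutive residues receive bonded edges, and spatially close but sequence-distant residue pairs receive nonbonded edges. The coordinates, cutoff, and graph library are assumptions for illustration, not the disclosed graph construction.

```python
# Sketch: build a residue-level peptide graph with bonded and nonbonded edges (placeholder coordinates).
import numpy as np
import networkx as nx

rng = np.random.default_rng(0)
coords = np.cumsum(rng.normal(scale=2.0, size=(30, 3)), axis=0)  # fake C-alpha trace, 30 residues

G = nx.Graph()
G.add_nodes_from(range(len(coords)))
for i in range(len(coords) - 1):
    G.add_edge(i, i + 1, kind="bonded")            # peptide-bond (sequential) edges

cutoff = 8.0                                        # illustrative contact cutoff
for i in range(len(coords)):
    for j in range(i + 2, len(coords)):
        if np.linalg.norm(coords[i] - coords[j]) < cutoff:
            G.add_edge(i, j, kind="nonbonded")      # spatial contacts, including long-range ones

print(G.number_of_nodes(), G.number_of_edges())
```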

[0144] In some cases, the plurality of nodes comprises atoms in the peptide structure. In some cases, the plurality of nodes comprises functional groups in the peptide structure. In some cases, the plurality of nodes comprises amino acids in the peptide structure. In some cases, the plurality of nodes comprises secondary structures in the peptide structure. In some cases, the plurality of nodes comprises tertiary structures in the peptide structure. In some cases, the plurality of nodes comprises quaternary structures in the peptide structure.

[0145] In some cases, the method further comprises constructing a long-range subgraph of the peptide graph. In some cases, the long-range subgraph comprises a set of long-range nodes within a second range from a second central node. In some cases, the set of long-range nodes comprises at least one anchor node. In some cases, the first range and the second range overlap such that, when the first central node and the second central node are the same, the set of short-range nodes and the set of long-range nodes comprise the at least one anchor node.

[0146] In some cases, the first range from the first central node is less than or equal to K0 number of edges from the first central node, wherein K0 is 1 or more. For example, a short-range graph of a central node may be defined as the subgraph of nodes that are K0 hops from the central node. For example, in FIG. 5, node L8 may be selected as the central node. Then, the short-range graph of the central node, when K0 = 2, comprises the nodes that are at most two edges from the central node.

[0147] In some cases, the second range from the second central node is greater than or equal to K1 number of edges from the second central node and less than or equal to K2 number of edges from the second central node, wherein K1 is 2 or more and K2 is 3 or more. For example, the long-range graph of the central node may be defined as the subgraph of nodes that are at least K1 edges from the central node and at most K2 edges from the central node. For example, in FIG. 6, node L8 is selected as the central node. The K1 = 2 and K2 = 3 long-range graph comprises nodes that are at least 2 edges away and at most 3 edges away from the central node. In some cases, the long-range subgraph comprises at least one connected graph. In some cases, the long-range subgraph comprises at least two disconnected graphs.

[0148] The number of anchor nodes may vary based on the central node and the peptide graph. For example, in some cases, anchor nodes may be defined as nodes that are common to the neighborhood subgraph and the context subgraph. For example, in FIG. 5 and FIG. 6, node L8 (outlined in black) is selected as the node of interest, v. The context graph of v, given k = 2, r1 = 2, and r2 = 3, comprises nodes W1, N3, T5, K6, Q10, K11, Y13, I15, I16, D18, V24, G26, K27, G29, Q31, Q32, and Q34. In some cases, the at least one anchor node may comprise at least 2, 3, 4, 5, 6, 7, 8, 9, or 10 nodes.
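The sketch below illustrates, on a random stand-in graph, how a short-range (K0-hop) node set, a long-range (K1-to-K2-hop) node set, and their shared anchor nodes could be extracted; the graph, hop counts, and central node are illustrative assumptions rather than the peptide graphs of FIG. 5 and FIG. 6.

```python
# Sketch: short-range / long-range subgraphs and their shared anchor nodes (illustrative hop counts).
import networkx as nx

G = nx.random_geometric_graph(40, radius=0.3, seed=1)   # stand-in for a peptide graph
center, K0, K1, K2 = 0, 2, 2, 3

dist = nx.single_source_shortest_path_length(G, center, cutoff=K2)  # hop distances up to K2
short_nodes = {n for n, d in dist.items() if d <= K0}               # within K0 hops of the central node
long_nodes = {n for n, d in dist.items() if K1 <= d <= K2}          # between K1 and K2 hops

anchors = short_nodes & long_nodes        # nodes common to both subgraphs (possible when K1 <= K0)
short_subgraph = G.subgraph(short_nodes)
long_subgraph = G.subgraph(long_nodes)    # may split into disconnected components
print(len(short_nodes), len(long_nodes), sorted(anchors))
```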

[0149] Generally, K0, K1, and K2 may comprise various integers. For example, in some cases, K0 may be 2, 3, 4, 5, 6, 7, 8, or more. In some cases, K1 may be 3, 4, 5, 6, 7, 8, 9, or more. In some cases, K2 may be 4, 5, 6, 7, 8, 9, 10, or more.

[0150] In some cases, the method further comprises constructing a first neural network based at least partially on the short-range subgraph. In some cases, the method comprises processing feature values of the short-range subgraph into the first neural network to generate a short-range latent representation. For example, FIG. 8 shows a first neural network (8001) configured to receive the short-range subgraph. In some cases, the first neural network comprises a graph convolution network (GCN), a graph attention network (GAT), a message passing neural network (MPNN), or any other GNN that performs permutation-invariant pooling and aggregation. Different GNNs may treat attention coefficients across the neighboring nodes differently, resulting in different accuracy, convergence, and learning properties that may be optimized using neural architecture search. In some cases, the first neural network comprises at least K0 layers. In some cases, the short-range latent representation comprises a latent representation of the first central node.

[0151] In some cases, the method comprises constructing a second neural network based at least partially on the long-range subgraph. In some cases, the method comprises processing feature values of the long-range subgraph into the second neural network to generate a long-range latent representation. For example, FIG. 8 shows a second neural network (8002) configured to receive the long-range subgraph. In some cases, the second neural network comprises a graph neural network architecture. In some cases, the second neural network comprises a graph convolution network (GCN), a graph attention network (GAT), a message passing neural network (MPNN), or any other GNN that performs permutation-invariant pooling and aggregation. Different GNNs may treat attention coefficients across the neighboring nodes differently, resulting in different accuracy, convergence, and learning properties that may be optimized using neural architecture search. In some cases, the second neural network comprises at least K0 layers. In some cases, the second neural network comprises at most K2 layers. In some cases, the long-range latent representation comprises a pooled latent representation of the at least one anchor node.
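
For illustration, the following is a minimal sketch of the two encoders in plain PyTorch, using a simple mean-aggregation graph convolution over a dense adjacency matrix; the class name, dimensions, and the choice of K0 = 2 and K2 = 3 layers are assumptions, and any of the GNN layer types named above could be substituted.

import torch
import torch.nn as nn

class SimpleGCN(nn.Module):
    """A small graph convolution stack; one latent vector is produced per node."""
    def __init__(self, in_dim: int, hidden_dim: int, num_layers: int):
        super().__init__()
        dims = [in_dim] + [hidden_dim] * num_layers
        self.layers = nn.ModuleList(nn.Linear(dims[i], dims[i + 1]) for i in range(num_layers))

    def forward(self, x: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # Row-normalized adjacency with self-loops: each node averages over itself and
        # its neighbors, a permutation-invariant aggregation.
        a_hat = adj + torch.eye(adj.size(0))
        a_hat = a_hat / a_hat.sum(dim=1, keepdim=True)
        for layer in self.layers:
            x = torch.relu(layer(a_hat @ x))
        return x

k0, k2, feat_dim, latent_dim = 2, 3, 16, 32
short_range_encoder = SimpleGCN(feat_dim, latent_dim, num_layers=k0)   # network 8001
long_range_encoder = SimpleGCN(feat_dim, latent_dim, num_layers=k2)    # network 8002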

[0152] In some cases, the method further comprises updating parameters of the short-range neural network and the long-range neural network such that the short-range latent representation and the long-range latent representation are similar when the first central node and the second central node are the same. In some cases, the method further comprises updating parameters of the short-range neural network and the long-range neural network such that the short-range latent representation and the long-range latent representation are dissimilar when the first central node and the second central node are not the same. For example, the training task may be posed as predicting whether a given long-range latent representation belongs to a particular short-range latent representation (or vice versa). In some cases, the training task may induce the GNN to map nodes with similar structural context to nearby points in the high-dimensional representation. In some cases, to compare the short-range latent representation and the long-range latent representation, the optimizing may comprise average pooling over the anchor nodes for the long-range latent representation.

[0153] For example, FIG. 8 shows a contrastive learning loss function (8003) configured to compare generated latent representations from the first neural network (8001) and the second neural network (8002). In some cases, the updating comprises evaluating similarity or dissimilarity based at least partially on contrastive learning. In some cases, the contrastive learning comprises optimizing a contrastive loss function based at least partially on a dot product between the short-range latent representation and the long-range latent representation. In some cases, the optimizing is performed such that a sigmoid of the dot product is equal to one when the first central node and the second central node are the same, and the sigmoid of the dot product is equal to zero when the first central node and the second central node are not the same.

[0154] In some cases, positive samples for the contrastive learning may be directly sampled from the graph representations of peptides and proteins. In some cases, negative samples for the contrastive learning may be added through random sampling. In some cases, positive samples for the contrastive learning may comprise pairs of nodes in a short-range graph and a long-range graph where both are sampled from the same peptide graph and the same central node. In some cases, negative samples may comprise pairs of nodes in a short-range graph and a long-range graph where the two are sampled from different peptide graphs, or from the same peptide graph but with different central nodes.

[0155] Various loss functions may be used for contrastive learning. Various measures of similarity or dissimilarity may be used for contrastive learning. In some cases, the output of the model may be the sigmoid of the dot product of the node representation vector and the context representation vector:

[0156] $\hat{y}^{(i)} = \sigma\left( h_v^{(i)} \cdot c_v^{(i)} \right)$

[0157] In some cases, a similarity function may be a cosine similarity, a Tanimoto similarity, a Jaccard coefficient, a Dice coefficient, a Tversky similarity, an Avalon fingerprint, a Morgan fingerprint, a pharmacophore fingerprint, a molecular access system fingerprint (MACCS), an extended-connectivity fingerprint (ECFP4), a 2D similarity, a 3D similarity, a local similarity, or any combination thereof.

[0158] In some cases, training of the model may be performed by minimizing the cross-entropy loss:

[0159] $J(\theta) = -\sum_{i=1}^{n} \left[ y^{(i)} \log \hat{y}^{(i)} + \left(1 - y^{(i)}\right) \log\left(1 - \hat{y}^{(i)}\right) \right]$
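
For illustration, the following is a minimal sketch of one contrastive training step consistent with the objective above: the long-range latent is average-pooled over the anchor nodes, the model output is the sigmoid of its dot product with the short-range latent, and binary cross-entropy is minimized; tensor names and shapes are assumptions.

import torch
import torch.nn.functional as F

def contrastive_loss(z_node: torch.Tensor, z_anchor_nodes: torch.Tensor, label: float) -> torch.Tensor:
    """label = 1.0 for a positive pair (same central node), 0.0 for a negative pair."""
    # Average pooling over the anchor nodes yields a single context vector.
    z_context = z_anchor_nodes.mean(dim=0)
    logit = torch.dot(z_node, z_context)            # sigmoid(logit) is the model output
    target = torch.tensor(label)
    # Binary cross-entropy on the sigmoid of the dot product.
    return F.binary_cross_entropy_with_logits(logit, target)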

[0160] Various loss functions may be used. In some cases, a loss function may comprise regression loss, a probabilistic loss, or both. In some cases, a loss function may comprise a mean squared error loss, a logarithmic loss, a likelihood loss, a quadratic loss, a Hinge loss, a squared hinge loss, a Kullback-Leibler Divergence loss, a mean absolute percentage error loss, or any combination thereof.

[0161] In some cases, the context prediction task may be used to learn the representation at lower levels of representation with individual atoms represented as nodes, and the chemical interactions such as covalent bonds and van der Waals interactions represented as edges.

[0162] In some cases, the node embeddings learned from the low-level graph representations (e.g., atomic-level graph representations) may be used to learn embeddings at higher levels of graph representation (e.g., residue-level graph representations) using hierarchical learning.

[0163] In some cases, the low-level embeddings may be processed using a GNN operating on a bipartite graph connecting the higher-level nodes (e.g., residue nodes) with their nodes in the low-level representation (e.g., atoms within a particular residue), as shown in FIG. 7.
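
For illustration, the following is a minimal sketch of lifting atom-level embeddings to residue-level embeddings through a bipartite atom-to-residue mapping; the use of mean pooling and the function name are assumptions.

import torch

def pool_atoms_to_residues(atom_latents: torch.Tensor, atom_to_residue: torch.Tensor, num_residues: int) -> torch.Tensor:
    """atom_latents: (num_atoms, d); atom_to_residue: (num_atoms,) long tensor of residue indices."""
    d = atom_latents.size(1)
    sums = torch.zeros(num_residues, d).index_add_(0, atom_to_residue, atom_latents)
    counts = torch.zeros(num_residues).index_add_(0, atom_to_residue, torch.ones(atom_to_residue.size(0)))
    # Mean of the embeddings of the atoms belonging to each residue.
    return sums / counts.clamp(min=1).unsqueeze(1)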

[0164] In some cases, the learning of higher-level node embeddings may be performed by posing the problem as a masked attribute prediction task, where random residues are masked and the outputs of the GNNs are used to predict the value of the masked residue.
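
For illustration, the following is a minimal sketch of the masked attribute prediction task: one residue's features are obscured, the residue-level GNN is run, and a small classifier predicts the masked residue's identity; the encoder interface, the 20-class amino acid vocabulary, and the zero-masking scheme are assumptions.

import torch
import torch.nn.functional as F

def masked_residue_loss(encoder, classifier, node_feats, adj, true_residue_ids):
    """classifier: e.g. torch.nn.Linear(latent_dim, 20) mapping latents to amino acid logits."""
    n = node_feats.size(0)
    masked = torch.randint(0, n, (1,)).item()       # pick one residue to mask at random
    x = node_feats.clone()
    x[masked] = 0.0                                  # obscure the masked residue's attributes
    latents = encoder(x, adj)                        # GNN output for every residue
    logits = classifier(latents[masked])             # predict the identity of the masked residue
    return F.cross_entropy(logits.unsqueeze(0), true_residue_ids[masked].unsqueeze(0))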

[0165] In some aspects, the present disclosure provides a method for obtaining a latent representation. In some cases, a trained neural network may be used to obtain the latent representation. In some cases, the latent representation may be obtained for a new peptide or peptide representation not available in a training data set. In some cases, the method may comprise providing a peptide graph comprising a plurality of nodes and a plurality of edges. In some cases, the plurality of edges encodes bonded interactions and nonbonded interactions between the plurality of nodes. In some cases, the method may comprise generating the latent representation for a node in the plurality of nodes. In some cases, the generating is based at least partially on the peptide graph. In some cases, the latent representation encodes short-range interactions and long-range interactions between the node and at least a subset of nodes in the plurality of nodes. In some cases, the latent representation is embedded in a latent space, wherein the latent space is organized based at least partially on nonbonded interactions, long-range interactions, or both. In some cases, the latent representation is embedded in a latent space, wherein the latent space is organized based at least partially on bonded interactions, short-range interactions, or both.

[0166] In some cases, the generating comprises constructing a short-range subgraph of the peptide graph. In some cases, the short-range subgraph comprises (i) the node and (ii) a set of short-range nodes within a first range from the node. In some cases, the generating comprises constructing a first neural network based at least partially on the short-range subgraph. In some cases, the generating comprises processing feature values of the short-range subgraph into the first neural network to generate the latent representation for the node. For example, FIG. 8 shows a first neural network (8001) that is configured to generate the latent representation based at least partially on a short-range subgraph. In some cases, the first neural network may be operated independently from a second neural network (8002) during inference. In some cases, the first neural network comprises a graph neural network architecture. In some cases, the first range from the node is less than or equal to K0 number of edges from the node, wherein K0 is 1 or more. In some cases, K0 is 2, 3, 4, 5, 6, 7, 8, or more. In some cases, the first neural network comprises at least K0 layers. In some cases, the neural network may be trained using a self-supervised representation learning algorithm. In some cases, the first neural network comprises a graph neural network. In some cases, the second neural network comprises a graph neural network. In some cases, layers of the neighborhood and context GNNs may be graph convolution networks (GCNs), graph attention networks (GATs), message passing neural networks (MPNNs), or any other GNNs that perform permutation-invariant pooling and aggregation. In some cases, different GNNs may treat attention coefficients across the neighboring nodes differently, resulting in different accuracy, convergence, and learning properties that may be optimized using neural architecture search. FIG. 9A illustrates operation of a graph attention network (GAT) layer that may be used. FIG. 9B illustrates operation of a message-passing neural network (MPNN) layer that may be used.

[0167] In some cases, the generating comprises constructing a long-range subgraph of the peptide graph. In some cases, the long-range subgraph comprises (i) the node and (ii) a set of long-range nodes within a second range from the node. In some cases, the set of long-range nodes comprises one or more anchor nodes. In some cases, the generating comprises constructing a second neural network based at least partially on the long-range subgraph. In some cases, the generating comprises processing feature values of the long-range subgraph into the second neural network to generate one or more latent representations for the one or more anchor nodes. In some cases, the generating comprises combining the one or more latent representations to generate the latent representation for the node. For example, FIG. 8 shows a second neural network (8002) that is configured to generate the latent representation based at least partially on a long-range subgraph. In some cases, the second neural network may be operated independently from a first neural network (8001) during inference. In some cases, the second neural network comprises a graph neural network architecture. In some cases, the second range from the node is greater than or equal to K1 number of edges from the node and less than or equal to K2 number of edges from the node, wherein K1 is 2 or more and K2 is 3 or more. In some cases, K1 is 3, 4, 5, 6, 7, 8, 9, or more. In some cases, K2 is 4, 5, 6, 7, 8, 9, 10, or more. In some cases, the second neural network comprises at least K0 layers. In some cases, the second neural network comprises at most K2 layers. In some cases, the method comprises generating a peptide latent representation by aggregating a plurality of latent representations over each node in the peptide graph.
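
For illustration, the following is a minimal sketch of inference-time aggregation: a node latent may combine the short-range output with the anchor-pooled long-range output, and a peptide-level latent may be obtained by permutation-invariant pooling over all node latents; the concatenation scheme and mean pooling are assumptions.

import torch

def node_latent(z_short: torch.Tensor, z_long_anchors: torch.Tensor) -> torch.Tensor:
    # Concatenate the short-range latent with the anchor-pooled long-range latent.
    return torch.cat([z_short, z_long_anchors.mean(dim=0)], dim=-1)

def peptide_latent(per_node_latents: torch.Tensor) -> torch.Tensor:
    """(num_nodes, d) -> (d,) peptide-level representation by mean pooling."""
    return per_node_latents.mean(dim=0)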

[0168] Various representations may be used during inference. In some cases, the plurality of nodes comprises atoms in the peptide structure. In some cases, the plurality of nodes comprises functional groups in the peptide structure. In some cases, the plurality of nodes comprises amino acids in the peptide structure. In some cases, the plurality of nodes comprises secondary structures in the peptide structure. In some cases, the plurality of nodes comprises tertiary structures in the peptide structure. In some cases, the plurality of nodes comprises quaternary structures in the peptide structure. In some cases, the latent representation encodes one or more features of atoms, functional groups, amino acids, secondary structures, tertiary structures, quaternary structures, or any combination thereof. In some cases, the feature values of the short-range subgraph, the feature values of the long-range subgraph, or both comprise an identifier of an atom, a functional group, an amino acid, a secondary structure, a tertiary structure, or a motif. In some cases, the feature values of the short-range subgraph, the feature values of the long-range subgraph, or both comprise a physical property of an atom, a functional group, an amino acid, a secondary structure, a tertiary structure, or a motif. In some cases, the motif represents a collection of atoms. In some cases, the feature values of the short-range subgraph, the feature values of the long-range subgraph, or both comprise a geometric descriptor of an atom, an amino acid, a secondary structure, or a tertiary structure. In some cases, the feature values of the short-range subgraph, the feature values of the long-range subgraph, or both comprise an electronic configuration, a charge, a size, a bond angle, a dihedral angle, or any combination thereof. In some cases, the functional group comprises a BRICS motif. In some cases, the method comprises generating a peptide latent representation by aggregating a plurality of latent representations over each node in the peptide graph.

[0169] In some aspects, the present disclosure provides a method for training a domain-specific neural network. In some cases, a trained neural network of the present disclosure may be trained for a domain-specific task. In some cases, the domain-specific training may utilize fewer data points compared to training a neural network from scratch.

[0170] In some cases, the method further comprises providing a trained neural network configured to receive at least a subgraph of a peptide graph and generate a latent representation that encodes (i) short-range interactions between a plurality of nodes of the subgraph, and (ii) long-range interactions between (1) at least one node of the subgraph and (2) at least one node in the peptide graph but not in the subgraph.

[0171] In some cases, the method further comprises providing a domain-specific dataset comprising a set of peptide graphs and a set of labels. In some cases, the method comprises retraining the trained neural network. In some cases, the retraining comprises, for a given peptide graph in the set of peptide graphs and a given label in the set of labels, generating a plurality of subgraphs based at least partially on the given peptide graph. In some cases, the retraining comprises processing the plurality of subgraphs to the trained neural network to generate a plurality of latent representations. In some cases, the retraining comprises combining the latent representations to generate a prediction value. In some cases, the retraining comprises updating at least one parameter of the trained neural network based at least partially on a difference between the prediction value and the given label. In some cases, the short-range interactions comprise bonded interactions, nonbonded interactions, or both. In some cases, the long-range interactions comprise nonbonded interactions, bonded interactions, or both.

[0172] In some cases, the combining comprises pooling the plurality of latent representations to generate the prediction value. In some cases, the combining comprises using a machine learning algorithm to generate the prediction value based at least partially on the plurality of latent representations.

[0173] In some cases, domain-specific training may be performed using a few-shot learning protocol. In some cases, the trained neural network is sufficiently retrained when the set of peptide graphs comprises at most about 1, 2, 5, 10, 100, or 1000 peptide graphs. In some cases, the trained neural network is sufficiently retrained when the set of labels comprises at most about 1, 2, 5, 10, 100, or 1000 labels. In some cases, the trained neural network is sufficiently retrained when the set of peptide graphs comprises at least about 1, 2, 5, 10, 100, or 1000 peptide graphs. In some cases, the trained neural network is sufficiently retrained when the set of labels comprises at least about 1, 2, 5, 10, 100, or 1000 labels. In some cases, domain-specific training may adjust the latent space and embeddings therein. In some cases, domain-specific training may adjust the latent space and embeddings to linearize the latent space along a domain-specific direction.

[0174] Various transfer learning techniques may be employed during the retraining. In some cases, retraining the trained neural network may comprise freezing at least a subset of parameters of the neural network during the retraining, fixing a fraction of the weights during the retraining, using a regularization function during the retraining, using one or more dropout layers during the retraining, or any combination thereof.
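
For illustration, the following is a minimal sketch of one such transfer-learning setup in PyTorch: parameters whose names match a prefix are frozen, and the remaining parameters are retrained with weight decay as a regularization function; the prefix, learning rate, and optimizer choice are assumptions.

import torch

def prepare_for_finetuning(model: torch.nn.Module, freeze_prefix: str = "layers.0"):
    """Freeze parameters whose names start with freeze_prefix; retrain the rest."""
    for name, param in model.named_parameters():
        param.requires_grad = not name.startswith(freeze_prefix)
    trainable = [p for p in model.parameters() if p.requires_grad]
    # Weight decay acts as an L2 regularizer during domain-specific retraining.
    return torch.optim.Adam(trainable, lr=1e-4, weight_decay=1e-5)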

[0175] Once trained, a domain-specific neural network may be used to predict a property associated with an input peptide representation. In some aspects, the present disclosure provides a method of predicting a peptide property. In some cases, the method comprises generating a peptide property prediction based at least partially on a domain-specific latent representation. In some cases, the method comprises providing a peptide graph comprising a plurality of nodes and a plurality of edges. In some cases, the plurality of edges encodes bonded interactions and nonbonded interactions. In some cases, the method comprises generating the domain-specific latent representation based at least in part on the peptide graph. In some cases, the domain-specific latent representation encodes domain knowledge for a given domain. In some cases, the domain-specific latent representation encodes the short-range interactions and long-range interactions between the plurality of nodes in the peptide graph. In some cases, the short-range interactions comprise the bonded interactions, the nonbonded interactions, or both. In some cases, the long-range interactions comprise the nonbonded interactions, the bonded interactions, or both. It is considered that any one of the trained neural networks in this disclosure may be trained to be domain-specific.

[0176] For example, the short-range neural network (8001), the long-range neural network (8002), or both, in FIG. 8, may be retrained on one or more domain-specific tasks to create a domain-specific neural network. A domain-specific neural network may then be used to generate modified latent representations or property predictions that are tuned for the domain. In some aspects, the present disclosure provides a method for a domain-specific prediction task based on a peptide graph. In some cases, the method comprises providing a machine learning architecture.

[0177] In some cases, the machine learning architecture comprises an encoder. In some cases, the encoder is configured to receive at least a subgraph of the peptide graph and generate a latent representation. In some cases, the encoder encodes (i) short-range interactions between nodes of the subgraph, (ii) long-range interactions between (1) at least one node of the subgraph and (2) at least one node in the peptide graph but not in the subgraph, (iii) domain-specific information for at least one domain, or any combination thereof. In some cases, the machine learning architecture comprises a predictor configured to receive the latent representation and generate a prediction value. In some cases, the method comprises processing the peptide graph to the machine learning architecture and generating the prediction value.

[0178] A neural network may be trained for any number of domain-specific tasks, simultaneously or individually. In some cases, multi-task training (e.g., training on two or more domain-specific tasks) may enhance the accuracy, reduce noise, or both for any one of the domain-specific tasks. Without being bound to a particular theory, it may be understood that information useful for performing one domain-specific task may be correlated to information useful for performing another domain-specific task. Therefore, there may be synergistic effects in multi-task training. In some cases, the domain-specific information comprises information for at least 2, 3, 4, 5, 10, or 100 domains.

[0179] Various neural networks, including the ones disclosed herein, may be used. For example, for neural networks operating on peptide graphs, such as the ones shown in FIG. 8, in some cases, the processing comprises generating a plurality of subgraphs based at least partially on the peptide graph. In some cases, the processing comprises processing the plurality of subgraphs to the neural network to generate a plurality of latent representations. In some cases, the processing comprises combining the plurality of latent representations to generate the prediction value.

[0180] In some aspects, the present disclosure provides a method for training a neural network to learn hierarchical representations of peptides. Any one of the neural networks disclosed herein may be used in conjunction with or be incorporated into a larger neural network for learning hierarchical representations of peptides. In some cases, the method comprises providing the neural network.

[0181] In some cases, the neural network comprises an encoder. In some cases, the encoder is configured to receive a peptide representation and generate a latent representation based at least partially on the peptide representation. In some cases, the latent representation encodes short-range interactions, long-range interactions, bonded interactions, non-bonded interactions, or any combination thereof between elements in the peptide representation.

[0182] In some cases, the method may encode physical, chemical, and structural features of peptides and proteins by training on a geometric graph representation of the three-dimensional structure of peptides and proteins using a self-supervised learning algorithm. In some cases, for an atom-level graph, the three-dimensional peptide or protein structure may be converted into a graph, wherein each node v represents an atom and each edge e_uv represents an interatomic interaction between nodes u and v. In some cases, an interatomic interaction may include any of the following: a covalent bond, a hydrogen bond, a disulfide bond, or a van der Waals interaction based on a distance threshold. In some cases, for a residue-level graph, each three-dimensional peptide or protein structure is converted into a graph, where each node v represents a residue and each edge e_uv represents a chemical interaction between nodes u and v. In some cases, a chemical interaction can be any of the following: a peptide bond, a disulfide bond, a hydrogen bond, an aromatic-aromatic interaction, an aromatic-pi interaction, an aromatic-cation interaction, a van der Waals interaction based on a distance threshold, or a hydrophobic interaction. In some cases, for a molecule-level graph, each three-dimensional peptide or protein structure complex may be converted into a graph, where each node v represents a molecule and each edge e_uv represents an intermolecular interaction between nodes u and v. In some cases, intermolecular interactions can be any of the following: a covalently bound interaction, a non-specific binding interaction, or a steric interaction.
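
For illustration, the following is a minimal sketch of building a residue-level graph from a three-dimensional structure, with peptide-bond edges along the backbone and distance-threshold edges for nonbonded contacts; the use of one representative coordinate per residue (e.g., the alpha carbon) and the 8.0 angstrom cutoff are assumptions.

import numpy as np
import networkx as nx

def residue_graph(residue_names, coords, cutoff: float = 8.0) -> nx.Graph:
    """residue_names: list of residue labels; coords: (n_residues, 3) array of positions."""
    g = nx.Graph()
    for i, name in enumerate(residue_names):
        g.add_node(i, residue=name)
        if i > 0:
            g.add_edge(i - 1, i, kind="peptide_bond")         # bonded backbone edge
    dists = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    for i in range(len(residue_names)):
        for j in range(i + 2, len(residue_names)):
            if dists[i, j] < cutoff:
                g.add_edge(i, j, kind="nonbonded_contact")    # distance-based nonbonded edge
    return g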

[0183] Effective high-dimensional latent representations capturing the structural features, chemical interactions with other residues, and the surrounding environment may be learned at the node level, where each node represents an atom, functional group, amino acid residue, or molecule, for respective levels of the hierarchy. In some cases, a self-supervised learning algorithm does not require labels or a priori knowledge of downstream tasks. In some cases, a neural network trained with a self-supervised learning algorithm, which may be agnostic of downstream tasks, may learn a generalized latent representation of the atoms, functional groups, amino acid residues, molecules, bonds, and interactions. In some cases, such a trained neural network may benefit from positive transfer during fine-tuning, producing domain-specific latent embeddings for learned representations.

[0184] In some cases, the neural network comprises a mapper configured to receive the latent representation and generate a coarse-grained peptide representation of the peptide based at least partially on the latent representation. In some cases, the coarse-grained peptide representation comprises (i) a set of coarse-grained elements, and (ii) a set of identifiers for the set of coarse-grained elements, wherein the coarse-grained representation comprises a lower resolution than the peptide representation. For example, the mapper may be a bipartite graph that maps between a high-resolution peptide representation and a low-resolution peptide representation. FIG. 7 illustrates a bipartite graph connecting atom- and residue-level graphs for a nine-residue peptide.

[0185] In some cases, the method further comprises providing a training dataset comprising a sample peptide representation and a set of sample labels for the sample peptide representation. In some cases, the neural network is trained by: processing the sample peptide representation of the training dataset into the neural network. In some cases, the neural network generates a sample coarse-grained peptide representation based at least partially on the sample peptide representation. In some cases, the sample coarse-grained peptide representation comprises (1) a set of sample coarse-grained elements, and (2) a set of sample identifiers for the set of sample coarse-grained elements. In some cases, the training comprises updating at least one parameter of the neural network based at least partially on a loss function computed based at least partially on the set of sample identifiers and the set of sample labels.

[0186] In some cases, the neural network further comprises a second encoder configured to receive the coarse-grained peptide representation and generate a second latent representation based at least partially on the coarse-grained peptide representation. In some cases, the second latent representation encodes short-range interactions, long-range interactions, bonded interactions, non-bonded interactions, or any combination thereof between the set of coarse-grained elements in the coarse-grained peptide representation.

[0187] In some cases, the neural network further comprises a second mapper configured to receive the second latent representation and generate a second coarse-grained peptide representation based at least partially on the second latent representation. In some cases, the second coarse-grained peptide representation comprises (i) a second set of coarse-grained elements, and (ii) a second set of identifiers for the second set of coarse-grained elements. In some cases, the second coarse-grained representation comprises a lower resolution than the coarse-grained representation.

[0188] In some cases, the neural network further comprises a third encoder configured to generate a third latent representation based at least partially on the second coarse-grained representation. In some cases, the neural network further comprises a third mapper configured to generate a third coarse-grained representation based at least partially on the third latent representation. In some cases, the third coarse-grained representation comprises a lower resolution than the second coarse-grained representation.

[0189] A neural network for learning hierarchical representations may comprise numerous hierarchies (e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10, or however many may be appropriate for a machine learning task). In some cases, the neural network comprises H number of encoders and mappers, such that the neural network is configured to generate a hierarchical representation comprising H number of resolutions and/or levels of encoding, wherein H is at least 2, 3, 4, 5, 6, 7, 8, 9, or 10. In some cases, the neural network comprises encoders and mappers in alternating order. In some cases, the mapper, the second mapper, the third mapper, or any mapper may comprise a bipartite graph that maps a representation in one hierarchy to the next level of hierarchy. In some cases, the mapper, the second mapper, the third mapper, or any mapper is configured to combine by masking over elements of a given representation to create a representation in the next level of hierarchy.
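
For illustration, the following is a minimal sketch of an alternating encoder/mapper hierarchy with H levels; Encoder and Mapper stand in for a node-level GNN and a bipartite coarse-graining module, and the interface is an assumption.

import torch.nn as nn

class HierarchicalNetwork(nn.Module):
    def __init__(self, encoders, mappers):
        super().__init__()
        assert len(encoders) == len(mappers)            # H encoder/mapper pairs, in alternating order
        self.encoders = nn.ModuleList(encoders)
        self.mappers = nn.ModuleList(mappers)

    def forward(self, representation):
        levels = []
        for encoder, mapper in zip(self.encoders, self.mappers):
            latent = encoder(representation)            # latent at the current resolution
            representation = mapper(latent)             # coarse-grained, lower-resolution representation
            levels.append((latent, representation))
        return levels                                   # one (latent, coarse-grained) pair per level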

[0190] Various representations of peptides may be used in the hierarchical neural network. In some cases, the peptide representation comprises a plurality of nodes and a plurality of edges. In some cases, the latent representation comprises a plurality of nodes and a plurality of edges. In some cases, the coarse-grained representation comprises a plurality of nodes and a plurality of edges. In some cases, the second latent representation comprises a plurality of nodes and a plurality of edges. In some cases, the second coarse-grained representation comprises a plurality of nodes and a plurality of edges. In some cases, the third latent representation comprises a plurality of nodes and a plurality of edges. In some cases, the third coarse-grained representation comprises a plurality of nodes and a plurality of edges. In some cases, the plurality of nodes represents an atom, a functional group, a primary structure, a secondary structure, a tertiary structure, a quaternary structure, a motif, or any combination thereof. In some cases, the plurality of edges represents a bonded interaction, a nonbonded interaction, a short-range interaction, a long-range interaction, or any combination thereof. In some cases, representations other than nodes and edges may be utilized.

[0191] Attention mechanisms and/or a transformer architecture may be used in the hierarchical neural networks. In some cases, the encoder comprises one or more attention mechanisms configured to exchange a first set of attention coefficients between at least a first subset of elements in the peptide representation and at least a second subset of elements in the latent representation. In some cases, the second encoder comprises one or more attention mechanisms configured to exchange a first set of attention coefficients between at least a first subset of elements in the coarse-grained representation and at least a second subset of elements in the second latent representation. In some cases, the third encoder comprises one or more attention mechanisms configured to exchange a first set of attention coefficients between at least a first subset of elements in the second coarse-grained representation and at least a second subset of elements in the third latent representation.

[0192] In some cases, the encoder is further configured to receive a structural feature, a physicochemical feature, or both associated with the peptide representation, such that the latent representation is based at least partially on the physicochemical feature, the structural feature, or both. In some cases, the second encoder is further configured to receive a structural feature, a physicochemical feature, or both associated with the coarse-grained representation, such that the second latent representation is based at least partially on the structural feature, a physicochemical feature, or both. In some cases, the third encoder is further configured to receive a structural feature, a physicochemical feature, or both associated with the second coarse-grained representation, such that the third latent representation is based at least partially on the structural feature, a physicochemical feature, or both.

[0193] Training data may comprise various information useful for training the neural network. In some cases, the training dataset comprises (i) a plurality of sample peptide representations comprising the sample peptide representation, and (ii) a plurality of sets of sample labels comprising the set of sample labels. In some cases, the set of sample labels comprises an identity for an element in the sample peptide representation, a property associated with an element in the sample peptide representation, or both. In some cases, the set of sample labels is assigned by masking over each element in the sample peptide representation. In some cases, the masking comprises obscuring a given element in the sample peptide representation, processing neighboring elements of the given element to a neural network, and generating a label for the given element based at least partially on the neighboring elements. In some cases, training the neural network comprises self-supervised learning. In some cases, training the neural network comprises, for each given sample peptide representation in the plurality of sample peptide representations and each given set of sample labels in the plurality of sets of sample labels: processing the given sample peptide representation of the training dataset into the neural network; generating a given sample coarse-grained peptide representation based at least partially on the given sample peptide representation, wherein the given sample coarse-grained peptide representation comprises (1) a given set of sample coarse-grained elements, and (2) a given set of sample identifiers for the given set of sample coarse-grained elements; and updating at least one parameter of the neural network based at least partially on a loss function computed based at least partially on the given set of sample identifiers and the given set of sample labels.

[0194] In some cases, the set of sample labels comprises a label for each node in the plurality of nodes. In some cases, the set of sample labels comprises a label for each edge in the plurality of edges. In some cases, the loss function comprises a probability-based loss function. In some cases, the loss function is configured to generate a maximum likelihood value. In some cases, the loss function is configured to generate a Kullback-Leibler value. In some cases, the loss function comprises a cross-entropy loss function.

[0195] Learned hierarchical representations may be used for predictive or generative machine learning tasks. In some cases, a representation from one level in the learned hierarchical representations may provide the most salient information for a particular task, compared to representations from other levels. For example, dynamical properties of a peptide as a whole (e.g., diffusivity) or in ensemble (e.g., viscosity) may be related more to the larger peptide structure (e.g., coarse features) than to properties of individual functional groups or atoms (e.g., fine features). Therefore, there may be distinct advantages to providing a subset of hierarchical representation(s) for some machine learning tasks. In some cases, a subset of hierarchical representations (e.g., one, two, or more from the hierarchy) may be provided to a predictive or a generative machine learning algorithm.

[0196] In some cases, the method comprises providing a machine learning algorithm configured to (i) receive a hierarchical representation, and (ii) generate at least one predictive value associated with the hierarchical representation. In some cases, the hierarchical representation comprises the latent representation, the second latent representation, the third latent representation, the coarse-grained representation, the second coarse-grained representation, the third coarse-grained representation, or any combination thereof.

[0197] The hierarchical representations, as discussed before, may have the most pertinent information for a machine learning task at particular levels in the hierarchy. The hierarchical latent representations may be further improved for the machine learning task via domain-specific training, which is discussed in detail elsewhere in this disclosure. In some cases, the method comprises providing a training dataset comprising (i) one or more hierarchical representations encoded by the neural network, and (ii) one or more known values associated with the one or more hierarchical representations. In some cases, the method comprises training the machine learning algorithm by (i) processing the one or more hierarchical representations to the machine learning algorithm, (ii) generating one or more predictive values based at least partially on the one or more hierarchical representations, and (iii) updating at least one parameter of the machine learning algorithm such that a difference between the one or more predictive values and the one or more known values is reduced. In some cases, the machine learning algorithm is sufficiently trained when the second training dataset comprises at most about 1, 10, 100, or 1,000 known values. In some cases, the machine learning algorithm is sufficiently trained when the second training dataset comprises at least about 1, 10, 100, or 1,000 known values.

[0198] Any one of the neural networks disclosed herein may be used to provide information useful for selecting a peptide for experimental trials. In some cases, the method comprises processing to the machine learning algorithm one or more sample hierarchical representations to generate one or more sample predictive values. In some cases, the method comprises selecting at least one sample hierarchical representation in the one or more sample hierarchical representations as a candidate representation based at least in part on the one or more sample predictive values.

[0199] Hierarchical representations of the present disclosure may be useful products for analyzing a chemical space of peptides. In some cases, learned latent representations are organized such that related entities are nearby in latent space, while unrelated entities are distant in latent space. For example, the more two entities are related, the closer they may be in latent space, and the more two entities differ from each other, the further apart they may be in latent space. Latent representations may be useful as “a map” or “an atlas” of chemical spaces. Hierarchical representations may provide such maps relevant to a given hierarchy. For example, a high-resolution representation may provide a map for helping to explore physical properties of peptides that are closely linked to electronic or atomistic features of a peptide. Meanwhile, a low-resolution representation may provide a map for helping to explore physical properties of peptides that are closely linked to macromolecular interactions.

[0200] In some aspects, the present disclosure provides a method of generating latent representations of peptides. In some cases, the method comprises providing a first representation comprising nodes and edges. In some cases, the edges encode bonded and nonbonded interactions. In some cases, the method comprises generating a second representation based at least partially on the first representation. In some cases, the second representation comprises lower resolution than the first representation. In some cases, the method comprises generating a third representation based at least partially on the second representation. In some cases, the third representation comprises lower resolution than the second representation.

[0201] In some cases, the method comprises generating an N-th representation based at least partially on the third representation. In some cases, the N-th representation comprises lower resolution than an (N-1)-th representation. In some cases, the N-th representation comprises a plurality of nodes and a plurality of edges. In some cases, the generating the second representation is based at least partially on a structural feature, a physicochemical feature, or both. In some cases, the generating the third representation is based at least partially on a structural feature, a physicochemical feature, or both. In some cases, the generating the N-th representation is based at least partially on a structural feature, a physicochemical feature, or both. In some cases, the first representation is a latent representation. In some cases, the second representation is a latent representation. In some cases, the third representation is a latent representation. In some cases, the N-th representation is a latent representation. In some cases, the generating is performed at least in part by using a neural network. In some cases, the generating is performed at least in part by using a mapper.

[0202] Various aspects of the present disclosure may be used to generate peptide structures and/or peptide representations. Deep generative learning can refer to a method of training a machine learning model to approximate a distribution of a given dataset, while enabling coherent samples to be generated from the data distribution. Machine learning models trained in this way may be referred to as deep generative models. Deep generative learning can be applied to chemical datasets to create deep generative models for chemistry that allow generation of coherent molecular structures that are not present in a given training dataset of chemical structures.

[0203] In some aspects, the present disclosure provides a method for training a deep generative neural network. In some cases, the method comprises providing a neural network. In some cases, the neural network comprises an encoder configured to receive a representation of a target peptide and generate a latent representation of the target peptide. In some cases, the neural network comprises a transformer configured to receive the latent representation of the target peptide and generate a predicted peptide ligand representation of a ligand that is predicted to bind to the target peptide. In some cases, the method comprises providing a plurality of target peptide representations and a plurality of ligand representations.

[0204] In some cases, the method further comprises training the neural network. In some cases, the training the neural network comprises: processing the plurality of target peptide representations with the neural network; generating a plurality of predicted ligand representations based at least partially on the plurality of target peptide representations; and updating at least one parameter of the neural network based at least partially on a loss function computed between the plurality of predicted ligand representations and the plurality of ligand representations.

[0205] For example, FIG. 10 shows a generative neural network architecture. In some cases, the neural network receives an input (10001) comprising an amino acid sequence, a secondary structure annotation, an accessible surface area, and/or other labels that define a physical property, a chemical property, a geometrical property, and the like as disclosed herein, for a predetermined target. In some cases, the neural network generates various embeddings (10002), or tokens, based at least partially on the input. In some cases, the neural network may concatenate (10003) the various embeddings and feed the concatenated information into a transformer (10004), which can generate the ligands of a training dataset (10005) based at least partially on the concatenated information. The neural network, including the transformer, may be trained to model the data distribution of the training data, so that the neural network generates new possible ligands while following the statistical qualities of the training data, even though the new ligands may not be explicitly in the dataset.
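
For illustration, the following is a minimal sketch of the architecture in FIG. 10 in PyTorch: per-residue target inputs are embedded, concatenated, and passed to a transformer that decodes candidate ligand tokens; the vocabulary sizes, dimensions, and use of torch.nn.Transformer are assumptions.

import torch
import torch.nn as nn

class LigandGenerator(nn.Module):
    def __init__(self, aa_vocab=21, ss_vocab=4, d_model=128, ligand_vocab=21):
        super().__init__()
        self.aa_embed = nn.Embedding(aa_vocab, d_model // 2)        # amino acid tokens (10002)
        self.ss_embed = nn.Embedding(ss_vocab, d_model // 4)        # secondary structure tokens
        self.asa_proj = nn.Linear(1, d_model // 4)                  # accessible surface area
        self.transformer = nn.Transformer(d_model=d_model, batch_first=True)  # (10004)
        self.ligand_embed = nn.Embedding(ligand_vocab, d_model)
        self.out = nn.Linear(d_model, ligand_vocab)                 # ligand token logits (10005)

    def forward(self, aa_ids, ss_ids, asa, ligand_ids):
        # aa_ids, ss_ids: (batch, target_len) long tensors; asa: (batch, target_len, 1) float tensor.
        # Concatenate (10003) the per-residue embeddings of the target.
        target = torch.cat([self.aa_embed(aa_ids), self.ss_embed(ss_ids), self.asa_proj(asa)], dim=-1)
        decoded = self.transformer(src=target, tgt=self.ligand_embed(ligand_ids))
        return self.out(decoded)    # trained with cross-entropy against the known ligands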

[0206] Various inputs may be provided to the generative machine learning model. In some cases, the representation of the target comprises: a primary structure of at least a portion of the target peptide, a secondary structure of at least a portion of the target peptide, a tertiary structure of at least a portion of the target peptide, a physical property of at least a portion of the target peptide, a target site of the target peptide, a relative accessible surface area of at least a portion of the target peptide, or any combination thereof. In some cases, the primary structure of at least the portion of the target is tokenized. In some cases, the secondary structure of at least the portion of the target is tokenized. In some cases, the tertiary structure of at least a portion of the target peptide, a physical property of at least a portion of the target peptide, a target site of the target peptide, a relative accessible surface area of at least a portion of the target peptide, or any combination thereof is encoded using a neural network.

[0207] In some cases, the representation of the target comprises an embedding of: a primary structure of at least a portion of the target peptide, a secondary structure of at least a portion of the target peptide, a tertiary structure of at least a portion of the target peptide, a physical property of at least a portion of the target peptide, a target site of the target peptide, a relative accessible surface area of at least a portion of the target peptide, or any combination thereof.

[0208] In some cases, the ligand representation comprises: (i) an atomistic structure of the ligand, (ii) a primary structure of the ligand, (iii) a tertiary structure of the ligand, or (iv) any combination thereof. In some cases, the representation of the target comprises a plurality of nodes and a plurality of edges. In some cases, the representation of the target comprises a hierarchical latent representation of the target. In some cases, the representation encodes short-range interactions, long-range interactions, bonded interactions, and non-bonded interactions between elements in the target peptide.

[0209] Once a generative model is trained, it may be used to generate new samples. In some aspects, the present disclosure provides a method for generating a ligand representation. In some cases, the method comprises providing a latent representation of a target peptide. In some cases, the latent representation comprises an embedding for at least one of: a primary structure of at least a portion of the target peptide, a secondary structure of at least a portion of the target peptide, a tertiary structure of at least a portion of the target peptide, a physical property of at least a portion of the target peptide, a target site of the target peptide, a relative accessible surface area of at least a portion of the target peptide, or any combination thereof.

[0210] For example, FIG. 10 shows a generative neural network architecture. In some cases, the neural network receives an input (10001) comprising an amino acid sequence, a secondary structure annotation, an accessible surface area, and other labels that define a physical property, a chemical property, a geometrical property, and the like as disclosed herein, for a predetermined target. In some cases, the neural network generates various embeddings (10002), or tokens, based at least partially on the input. In some cases, the neural network may concatenate (10003) the various embeddings and feed the concatenated information into a transformer (10004), which may generate candidate ligands (10005) based at least partially on the concatenated information. In some cases, neighborhood information may be encoded in the learned representation by posing a training task as a context prediction problem. In some cases, the initial node features of the residues may contain the following information: a one-hot encoded vector of the amino acids, a one-hot encoded vector of secondary structure, a B-factor or Debye-Waller factor, a Meiler embedding, ProtScale (Protein Identification and Analysis Tools on the ExPASy Server), solvent-accessible surface area, Ramachandran angles, amino acid indices, or other amino acid embeddings, fingerprints, or properties, or any other feature disclosed herein.

[0211] Various types of representations of peptides may be generated. Various types of representations of peptides may be used to generate other types of representations. In some cases, the method comprises generating the ligand representation of a ligand based at least in part on the latent representation. In some cases, the ligand is predicted to bind to at least a portion of the target peptide. In some cases, the method comprises providing a second latent representation, wherein the second latent representation is provided by a trained machine learning model.

[0212] Various machine learning models may be used to generate candidate ligands. In some cases, the trained machine learning model comprises a language model, a graph model, a flow model, a generative adversarial network, a variational autoencoder, an autoregressive model, an autoencoder, a diffusion model, or any combination thereof. In some cases, the generating comprises adding noise to the latent representation. In some cases, the generating comprises generating using a decoder neural network of a transformer neural network. In some cases, the generating comprises using a greedy-sampling algorithm, a Monte Carlo tree search, a beam search, a genetic algorithm, or any combination thereof. The specific neural network design or search algorithm may influence the distribution of candidates that may be generated.

[0213] In some aspects, the present disclosure provides a method for generating a ligand representation. In some cases, the method comprises providing a peptide graph of a target peptide comprising a plurality of nodes and a plurality of edges, wherein the plurality of edges encodes short-range interactions, long-range interactions, bonded interactions, nonbonded interactions, or any combination thereof. In some cases, the method comprises generating the ligand representation of a ligand based at least in part on the target peptide graph. In some cases, the ligand is predicted to bind to at least a portion of the target peptide.

[0214] In some aspects, the present disclosure provides a method for generating a ligand representation. In some cases, the method comprises providing a hierarchical representation of a peptide. In some cases, the hierarchical representation comprises at least one high-resolution representation and at least one low-resolution representation. In some cases, the method comprises generating the ligand representation of a ligand based at least in part on the hierarchical representation. In some cases, the ligand is predicted to bind to at least a portion of the peptide. In some cases, the generating is performed at least partially by using a transformer neural network. In some cases, the generating is performed at least partially by using a message-passing neural network.

[0215] In some cases, a “representation”, a “descriptor”, and the like, may refer to a description of a molecular entity. For example, commonly used descriptors include SMILES, primary sequences, or nucleic acid sequences. In some cases, a representation may be used as feature values to a neural network. In some cases, a representation may describe a peptide.

[0216] In some cases, the feature values may comprise an identifier of an atom, a functional group, an amino acid, a secondary structure, a tertiary structure, a quaternary structure, or a motif. In some cases, the feature values may comprise an electronic configuration, a charge, a size, a bond angle, a dihedral angle, or any combination thereof. In some cases, the motif may comprise a collection of atoms. In some cases, a functional group may comprise a BRICS motif. In some cases, the feature values may comprise a geometric descriptor of an atom, an amino acid, a secondary structure, a tertiary structure, a quaternary structure, or a motif. In some cases, the feature values may comprise a physical property.

[0217] In some cases, a representation may reflect multiple spatial and physical scales, such as an atomic structure, a secondary structure, and a tertiary structure for peptides and proteins. In some cases, a representation may also reflect scalar and vector values, fields, surfaces, and volumes associated with the multiscale structure, including quantities such as charges, isopotential surfaces, and multipole moments. In some cases, a representation may be expressed using mathematical structures appropriate for non-Euclidean operations, such as graphs (where a graph G consists of a set of n_v vertices v_i, for i ∈ [1, n_v], connected by a set of n_e edges e_j, for j ∈ [1, n_e]), grids (multidimensional arrays of scalar, vector, or tensor values), and manifolds (informally, multidimensional curved surfaces that are locally Euclidean, with associated scalar, vector, or tensor values at each point).

[0218] In some cases, a geometric arrangement of atoms, electrons, and/or motifs in a peptide may be described in different manners and different levels of precision. Various formats and precisions may impart unique values in terms of training machine learning algorithms.

[0219] In some cases, an atomistic representation of a peptide may comprise a MOL2 format. In some cases, an atomistic representation of a peptide may comprise a PDB format. In some cases, an atomistic representation of a peptide may comprise a MOL format. In some cases, an atomistic representation of a peptide may comprise a PDBQ or PDBQT format. In some cases, an atomistic representation of a peptide may comprise an SDF format. In some cases, an atomistic representation of a peptide may comprise a SMILES format. In some cases, an atomistic representation of a peptide may comprise a SELFIES format. In some cases, an atomistic representation of a peptide may comprise an InChI format. In some cases, an atomistic representation of a peptide may comprise a ChemDraw (CDX, CDXML) format. In some cases, an atomistic representation of a peptide may comprise a CIF format. In some cases, an atomistic representation of a peptide may comprise a CML format. In some cases, an atomistic representation of a peptide may comprise an XML format. In some cases, an atomistic representation of a peptide may comprise an ASN1 format. In some cases, an atomistic representation of a peptide may comprise a PARM topology format. In some cases, an atomistic representation of a peptide may comprise a CRD or TRJ format.

[0220] In some cases, a representation of a peptide may comprise one or more electron configurations. In some cases, an electron configuration may comprise one or more atomic orbitals, one or more molecular orbitals, or both. In some cases, an electron configuration may comprise valence electrons of an atom. In some cases, an electron configuration may comprise a character of an electron (e.g., s, p, d, f, and any mixtures thereof). In some cases, an electron configuration may comprise an electron spin. In some cases, an electron configuration may comprise electron density. In some cases, an electron configuration may be represented in various basis functions, including but not limited to, atomic orbitals, molecular orbitals, or plane waves.

[0221] Various chemoinformatic formats can encode different information. In some cases, an atomistic representation of a peptide may comprise the relative cartesian coordinates of atoms to each other. In some cases, an atomistic representation of a peptide may comprise the relative cartesian coordinates of atoms to an arbitrary point. In some cases, an atomistic representation of a peptide may comprise thermodynamic estimations of values such as solvation energy, potential energy of bond lengths, bond angles, dihedral angles, 1-4 intramolecular interaction energies, intramolecular energies among adjacent bond angles, hydrogen bonding energies, and non-bonded interaction energies. In some cases, an atomistic representation of a peptide may comprise atom type definitions and generalizations. In some cases, an atomistic representation of a peptide may comprise polarizability parameters. In some cases, an atomistic representation of a peptide may comprise Lennard-Jones van der Waals parameters. In some cases, an atomistic representation of a peptide may comprise electrostatic charge parameters. In some cases, an atomistic representation of a peptide may comprise bond length, bond angle, and dihedral force constants. In some cases, an atomistic representation of a peptide may comprise bond length, bond angle, and dihedral equilibrium values. In some cases, an atomistic representation of a peptide may comprise dihedral phase and periodicity force constants.

[0222] Peptide molecules may have highly polar hydrogen bond donor and acceptor moieties, enabling strong interactions. Combined with the chemical diversity of the natural and non-natural amino acids, many different organizations of peptide moieties are possible. In some cases, the organization level can be based on a linear sequence (primary), local electrostatic interaction and/or hydrogen bonding (secondary), formation of a superstructure (tertiary), and combinations of multiple peptides in complex assemblies (quaternary). Some examples of secondary structures include alpha helices and beta sheets of peptides, where each can be formed when hydrogen bonding between residues of a peptide is stabilized. Some examples of a tertiary structure can include larger geometrical features within a peptide such as pockets, hairpins, concatenations, loop regions, globular regions, etc. Examples of quaternary structure can involve organization of multiple individual peptides into higher order structures. In some cases, the precise organization of a peptide may be dependent on the interplay of each level of peptide organization.

[0223] The secondary structure of proteins may include the hydrogen bonding networks of subsections in a primary sequence. Alpha helices and beta sheets can be discerned, for example, within a Ramachandran plot of a peptide due to the arrangement of hydrogen bonding among the backbone amide groups. Tertiary structure may be seen as more interaction types contribute to the conformational landscape of a peptide. Tertiary structures may depend on non-polar/hydrophobic van der Waals interactions, electrostatic interactions, and other various interactions described herein.

[0224] FIG. 2 illustrates a ribbon diagram of a peptide. A ribbon diagram is a common depiction of peptides that conveys structural features salient to biochemistry. In this example, the peptide (SATPdb ID 23792, where SATPdb is a public database of structurally annotated proteins), which has amino acid sequence WYNQTKDLQQKFYEIIMDIEQNNVQGKKGIQQLQK (referred to henceforth in this disclosure as SATPdb_23792), has antiviral and antimicrobial properties. This peptide comprises 35 standard amino acids, comprises two alpha helices, and comprises a hairpin turn between the two alpha helices caused by functional group interactions, hydrogen bonds, or van der Waals interactions, such that the two alpha helices are substantially parallel. The ribbon is labeled by amino acid type (i.e., alanine (A); arginine (R); lysine (K); asparagine (N); glutamine (Q); aspartate (D); glutamate (E); cysteine (C); methionine (M); glycine (G); histidine (H); isoleucine (I); leucine (L); valine (V); phenylalanine (F); tyrosine (Y); proline (P); serine (S); threonine (T); tryptophan (W)).

[0225] FIG. 3 illustrates an atomistic structure of peptide SATPdb_23792 (showing atoms as balls and covalent bonds as sticks). FIG. 2 and FIG. 3 depict equivalent peptides, where FIG. 2 shows the primary structure (amino acid sequence), secondary structure (in this case, alpha helices), and tertiary structure (in this case, the hairpin turn), and FIG. 3 shows the atomistic detail underlying the primary, secondary, and tertiary structure of peptide SATPdb_23792. In some cases, the atomic structure forms an atomistic graph representation, where nodes vᵢ can represent atoms, and edges eᵢⱼ can represent covalent bonds between atoms.

[0226] FIG. 4 illustrates a residue graph representation of peptide SATPdb_23792. In some cases, in the residue graph representation, nodes vᵢ may represent amino acids, and edges eᵢⱼ may represent the interactions between amino acids, which can include covalent bonds, hydrogen bonds, van der Waals interactions, disulfide bonds, aromatic-aromatic interactions, aromatic-pi interactions, aromatic-cation interactions, hydrophobic interactions, peptide bonds, protein-peptide interactions, protein-protein interactions, and interactions with other kinds of molecules.
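As a minimal illustration of the residue graph concept described above, the following sketch builds a small graph with the NetworkX package, with amino acids as nodes and typed interactions as edges; the short sequence and the single long-range contact are hypothetical placeholders, not data for SATPdb_23792.

```python
# Minimal sketch of a residue graph: nodes are amino acids, edges carry an
# interaction type (peptide bond, hydrogen bond, etc.). Sequence and contacts
# are illustrative only.
import networkx as nx

residues = ["TRP", "TYR", "ASN", "GLN", "THR", "LYS"]

G = nx.Graph()
for i, res in enumerate(residues):
    G.add_node(i, residue=res)

# Peptide (covalent backbone) bonds between sequential residues.
for i in range(len(residues) - 1):
    G.add_edge(i, i + 1, interaction="peptide_bond")

# A hypothetical long-range contact, e.g., a hydrogen bond detected from structure.
G.add_edge(0, 5, interaction="hydrogen_bond")

print(G.number_of_nodes(), G.number_of_edges())
```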

[0227] In some cases, a peptide or a peptide assembly may be represented as a graph. Graph representations of peptides can be a powerful representation of information. In some cases, a graph representation may be provided as an adjacency matrix, a linked list, or in a coordinate system.

[0228] In some aspects, the present disclosure provides systems and methods for generating latent representations of peptides that can enable more efficient discovery, design, and/or development of useful peptides. The systems and methods disclosed herein may provide improved data efficiency such that less data (or, more fundamentally, information) may be used to generate useful peptides. It is recognized herein that in some domains (e.g., peptide drug discovery) the amount of available data may be low and/or sparse compared to the chemical space of peptides and the complexity of relevant functions over that chemical space (e.g., drug activity of peptides for a given target).
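For illustration, the sketch below stores one small graph both as a dense adjacency matrix and as a sparse coordinate (COO) list, two of the storage schemes mentioned in the graph-representation discussion above; the node count and edge list are arbitrary placeholders.

```python
# Minimal sketch: the same graph stored as a dense adjacency matrix and as a
# sparse coordinate (COO) list of (row, column, value) triples.
import numpy as np
from scipy.sparse import coo_matrix

n_nodes = 6
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (0, 5)]

# Dense adjacency matrix (symmetric for an undirected graph).
adj = np.zeros((n_nodes, n_nodes), dtype=np.int8)
for i, j in edges:
    adj[i, j] = adj[j, i] = 1

# Equivalent sparse COO form, storing each undirected edge once.
rows, cols = zip(*edges)
adj_coo = coo_matrix((np.ones(len(edges)), (rows, cols)), shape=(n_nodes, n_nodes))

# Symmetrizing the sparse form reproduces the dense adjacency matrix.
assert (adj_coo.toarray() + adj_coo.toarray().T == adj).all()
```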

[0229] Latent representations of peptides, as will be described in further detail below, may encode structural features of peptides over a hierarchy of spatial scales. For example, a given order in the hierarchy of spatial scales may comprise atomic structures (e.g., carbon, oxygen, nitrogen, hydrogen, etc., and bonds formed between them), functional groups (e.g., carboxyl, amide, aromatic, etc., and various other annotations), primary structure (e.g., comprising amino acid sequences which may also include chemical modifications), secondary structure (alpha helices, beta sheets, etc.), tertiary structure (e.g., three-dimensional structure of folded peptides), quaternary structure (e.g., the three-dimensional structure of the interacting subunits of an assembly comprising multiple peptide units), and other molecules (e.g., a small molecule), or any combination thereof. The latent representations may also encode physicochemical interactions (e.g., charged interactions, dipole-dipole interactions, van der Waals interactions, hydrogen bonding, various entropic effects, etc.) across these spatial scales and/or physicochemical interactions with a surrounding environment (e.g., other peptides, solvent environments, or other molecules).

[0230] In some cases, the learned representations may be used for downstream tasks which can include generating sequences or structures (sometimes including modifications) of drug candidates, predicting physicochemical or druglike properties (e.g., pharmacokinetic, pharmacodynamic, cell penetrating, or blood brain barrier penetrating properties, and various other pharmacologically relevant parameters), optimizing drug leads, and/or developing peptide-drug conjugates (e.g., peptide-antibody conjugates, peptide-antisense oligonucleotide (peptide-ASO) conjugates, peptide-radionuclide conjugates, and peptide-small molecule drug conjugates).

[0231] Latent representations may be calculated from a specific set of training data to form specialized encodings that serve a specific application. Therefore, multiple sets of latent representations may be constructed, each set corresponding to a particular use.

[0232] In some cases, latent representations may be attributed to a node in a peptide representation. In some cases, a latent representation may be attributed, for each node in a plurality of nodes, of an atomistic representation of a peptide. In some cases, a latent representation may be attributed, for each node in a plurality of nodes, of a coarse representation of a peptide. In some cases, a latent representation may be attributed, for each node in a plurality of nodes, of a residue representation of a peptide. A latent representation that is attributed to a node may provide learned features for the node that comprise information such as short-range interactions with other nodes (e.g., nearest neighbors in a graph), long-range interactions with other nodes (e.g., neighbors further away than the nearest neighbors), physical properties associated with nodes, and various other features which have been described at length elsewhere in the application.

[0233] In some cases, a method of the present disclosure may comprise training or using a machine learning algorithm. In some cases, a machine learning algorithm may be trained or be used to generate candidate peptide structures for a task.

[0234] A machine learning model can comprise one or more of various machine learning models. In some cases, the machine learning model can comprise one machine learning model. In some cases, the machine learning model can comprise a plurality of machine learning models. In some cases, the machine learning model can comprise a neural network model. In some cases, the machine learning model can comprise a random forest model. In some cases, the machine learning model can comprise a manifold learning model. In some cases, the machine learning model can comprise a hyperparameter learning model. In some cases, the machine learning model can comprise an active learning model.

[0235] A graph, graph model, and graphical model can refer to a method of conceptualizing or organizing information into a graphical representation comprising nodes and edges. In some cases, a graph can refer to the principle of conceptualizing or organizing data, wherein the data may be stored in various and alternative forms such as linked lists, dictionaries, spreadsheets, arrays, in permanent storage, in transient storage, and so on, and is not limited to specific cases disclosed herein. In some cases, the machine learning model can comprise a graph model.

[0236] The machine learning model can comprise a neural network comprising various architectures, loss functions, optimization algorithms, priors, and various other neural network design choices. In some cases, the machine learning model can comprise a neural network. In some cases, the machine learning model can comprise an autoencoder. In some cases, the machine learning model can comprise a generative model. In some cases, the machine learning model can comprise a variational autoencoder. In some cases, the machine learning model can comprise a generative adversarial network. In some cases, the machine learning model can comprise a flow model. In some cases, the machine learning model can comprise an autoregressive model. In some cases, the machine learning model can comprise a diffusion model. In some cases, the machine learning model can comprise a neural network with one or more layers. In some cases, the machine learning model can comprise a neural network with one or more fully connected layers. In some cases, the machine learning model can comprise a neural network with one or more convolutional layers. In some cases, the machine learning model can comprise a neural network with one or more message-passing layers. In some cases, the machine learning model can comprise a neural network with a bottleneck layer. In some cases, a layer may comprise an attention mechanism, a generalized message-passing graph neural network, or both. In some cases, a generalized message-passing graph neural network comprises a graph convolutional neural network.
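A minimal sketch of one message-passing layer over a graph of node features is shown below, written in plain PyTorch; the layer sizes, mean aggregation, and ReLU non-linearities are illustrative assumptions and do not represent a specific architecture of the present disclosure.

```python
# Minimal sketch of a single message-passing layer over a peptide graph.
import torch
import torch.nn as nn

class MessagePassingLayer(nn.Module):
    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        # Builds a message from each (target, source) feature pair.
        self.message_fn = nn.Linear(2 * in_dim, out_dim)
        # Updates a node from its current features and its aggregated messages.
        self.update_fn = nn.Linear(in_dim + out_dim, out_dim)

    def forward(self, x: torch.Tensor, edge_index: torch.Tensor) -> torch.Tensor:
        # x: [num_nodes, in_dim]; edge_index: [2, num_edges] of (source, target) indices.
        src, dst = edge_index
        messages = torch.relu(self.message_fn(torch.cat([x[dst], x[src]], dim=-1)))
        # Mean-aggregate the messages arriving at each target node.
        agg = torch.zeros(x.size(0), messages.size(-1))
        agg.index_add_(0, dst, messages)
        counts = torch.zeros(x.size(0))
        counts.index_add_(0, dst, torch.ones(dst.size(0)))
        agg = agg / counts.clamp(min=1).unsqueeze(-1)
        return torch.relu(self.update_fn(torch.cat([x, agg], dim=-1)))

# Toy usage: 6 residue nodes with 8 features each and 5 backbone edges.
x = torch.randn(6, 8)
edge_index = torch.tensor([[0, 1, 2, 3, 4], [1, 2, 3, 4, 5]])
layer = MessagePassingLayer(8, 16)
out = layer(x, edge_index)  # [6, 16] per-node latent representations
```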

[0237] In some cases, the machine learning model can comprise a neural network with residual blocks. In some cases, the machine learning model can comprise a neural network with attention. In some cases, the machine learning model can comprise a neural network with one or more non-linearities. In some cases, the machine learning model can comprise a neural network with one or more dropout layers. In some cases, the machine learning model can comprise a neural network with one or more batch normalization layers. In some cases, the machine learning model can comprise a regression loss function. In some cases, the machine learning model can comprise a logistic loss function. In some cases, the machine learning model can comprise a variational loss. In some cases, the machine learning model can comprise a prior. In some cases, the machine learning model can comprise a Gaussian prior. In some cases, the machine learning model can comprise a non-Gaussian prior. In some cases, the machine learning model can comprise an adversarial loss. In some cases, the machine learning model can comprise a reconstruction loss. In some cases, the machine learning model is trained with the Adam optimizer. In some cases, the machine learning model is trained with the stochastic gradient descent optimizer. In some cases, the machine learning model hyperparameters are optimized with Gaussian processes. In some cases, the machine learning model is trained with train/validation/test data splits. In some cases, the machine learning model is trained with k-fold data splits, with any positive integer for k.
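The sketch below illustrates a few of these training choices (a regression loss, dropout, the Adam optimizer, and a train/validation split) on synthetic data; all sizes and hyperparameters are placeholders.

```python
# Minimal sketch of a regression training loop with Adam and a train/validation split.
import torch
import torch.nn as nn

torch.manual_seed(0)
features = torch.randn(200, 32)   # e.g., pooled latent peptide representations
labels = torch.randn(200, 1)      # e.g., a measured property to regress

# Simple random train/validation split.
perm = torch.randperm(200)
train_idx, val_idx = perm[:160], perm[160:]

model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Dropout(0.1), nn.Linear(64, 1))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()            # a regression loss function

for epoch in range(50):
    model.train()
    optimizer.zero_grad()
    loss = loss_fn(model(features[train_idx]), labels[train_idx])
    loss.backward()
    optimizer.step()

model.eval()
with torch.no_grad():
    val_loss = loss_fn(model(features[val_idx]), labels[val_idx])
print(float(val_loss))
```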

[0238] The machine learning model can comprise a variety of manifold learning algorithms. In some cases, the machine learning model can comprise a manifold learning algorithm. In some cases, the manifold learning algorithm comprises principal component analysis. In some cases, the manifold learning algorithm comprises a uniform manifold approximation algorithm. In some cases, the manifold learning algorithm comprises an isomap algorithm. In some cases, the manifold learning algorithm comprises a locally linear embedding algorithm. In some cases, the manifold learning algorithm comprises a modified locally linear embedding algorithm. In some cases, the manifold learning algorithm comprises a Hessian eigenmapping algorithm. In some cases, the manifold learning algorithm comprises a spectral embedding algorithm. In some cases, the manifold learning algorithm comprises a local tangent space alignment algorithm. In some cases, the manifold learning algorithm comprises a multi-dimensional scaling algorithm. In some cases, the manifold learning algorithm comprises a t-distributed stochastic neighbor embedding algorithm (t-SNE). In some cases, the manifold learning algorithm comprises a Barnes-Hut t-SNE algorithm.
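For illustration, the sketch below projects synthetic latent vectors to two dimensions with a few of the manifold learning algorithms listed above, using scikit-learn; the data and parameter choices are placeholders.

```python
# Minimal sketch: projecting high-dimensional latent vectors to two dimensions
# with PCA, t-SNE, and Isomap.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE, Isomap

latents = np.random.rand(100, 64)  # 100 peptides, 64-dimensional latent vectors

pca_2d = PCA(n_components=2).fit_transform(latents)
tsne_2d = TSNE(n_components=2, perplexity=10, random_state=0).fit_transform(latents)
isomap_2d = Isomap(n_components=2, n_neighbors=10).fit_transform(latents)

print(pca_2d.shape, tsne_2d.shape, isomap_2d.shape)  # each (100, 2)
```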

[0239] In some cases, latent representations of the present disclosure may be tuned for specific applications, for example, for predicting or generating ligands for binding to specific molecular targets, ability to penetrate cell membranes or the blood brain barrier, or physicochemical aspects of efficient chemical synthesis or manufacturing. In some cases, a hierarchical representation may be tuned. In some cases, the tuning may be for a domain-specific task. In some cases, the tuned latent representations may be tuned for the domain-specific task. Domain-specific representations may be tuned for one or a variety of applications, either successively (e.g., by sequential retraining), or simultaneously (e.g., by multitask training). The tuning process may include various computational techniques that reduce the amount of data needed to train effective models for drug design while increasing the accuracy of the models produced. For analogy, large language models (e.g., GPT-3, GPT-4, or BERT large language models or the PaLM model) may be tuned with a few examples (few-shot) for certain tasks. Similarly, a representation neural network of the present disclosure may be tuned for a specific application (e.g., binding to the GPCR class of molecular targets). In some ways, this may be analogous to GPT-3 or GPT-4 being retrained on a corpus of specialized text such as medical literature (e.g., PubMed).
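A minimal sketch of this kind of domain-specific tuning is shown below: a stand-in pretrained encoder is frozen and only a small task head is trained on a limited labeled set. The network shapes, the binding-score task, and the data are hypothetical assumptions used only to illustrate the few-shot tuning pattern.

```python
# Minimal sketch of domain-specific tuning: freeze a pretrained representation
# network (stand-in backbone) and train only a small task head.
import torch
import torch.nn as nn

backbone = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 32))  # stands in for a pretrained encoder
head = nn.Linear(32, 1)  # new task-specific head, e.g., a hypothetical GPCR-binding score

# Freeze the general-purpose representation; train only the head.
for p in backbone.parameters():
    p.requires_grad = False

optimizer = torch.optim.Adam(head.parameters(), lr=1e-3)
loss_fn = nn.BCEWithLogitsLoss()

# A few dozen labeled examples stand in for scarce domain-specific data.
x = torch.randn(40, 64)
y = torch.randint(0, 2, (40, 1)).float()

for _ in range(100):
    optimizer.zero_grad()
    loss = loss_fn(head(backbone(x)), y)
    loss.backward()
    optimizer.step()
```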

[0240] In some cases, a domain-specific hierarchical representation may be designed to exhibit greater predictive accuracy for the domain of specialization while retaining or increasing predictive accuracy for general (non-specialized) predictions. In some cases, the domain-specific hierarchical representation may facilitate designing and optimizing drug candidates for specific physicochemical, biological, pharmacokinetic, or other properties, predicting those or other properties for specific peptide or protein sequences or structures, or optimizing leads using those or other properties. In some cases, hierarchical peptide and protein representations may also facilitate designing and optimizing peptides for purposes other than drug design (for example, veterinary, agricultural, industrial uses) or optimizing chemical synthesis or manufacturing processes.

[0241] In some cases, to achieve a given level of predictive accuracy for a certain set of tasks, training a neural network to generate a domain-specific hierarchical representation may require substantially less specialized data than would be required for training a general hierarchical molecular representation learning algorithm from scratch. For example, given a hierarchical molecular representation that has already been trained on a large corpus, typically requiring millions of data points, of peptide, protein, peptide-protein interaction, and protein-protein interaction data to encode environments local to atoms, bonds, functional groups, amino acids, alpha helices, beta sheets, etc., fine-tuning a domain-specific hierarchical molecular representation may require substantially less data, for example, only tens, hundreds, or thousands of data points (e.g., few-shot learning).

[0242] In some cases, a domain-specific hierarchical representation derived from another representation (e.g., from a hierarchical representation learning algorithm) may be tuned using distinct proprietary data or machine learning model(s). The domain-specific hierarchical representation may be calculated, stored, maintained, and used to isolate the proprietary data or machine learning models. In some cases, a domain-specific hierarchical representation may be securely stored and managed on a non-transitory memory or storage. In some cases, the domain-specific hierarchical representation may be accessible by verified users.

[0243] In some cases, a machine learning algorithm may use hierarchical representations, latent representations, or any other engineered representation disclosed herein for performing various tasks. In some cases, a generative machine learning algorithm, a predictive machine learning algorithm, or both may be used independently or dependently (e.g., in a layered or connected approach) to search high-dimensional peptide design space(s) to generate candidate peptides for a task (e.g., drug efficacy) in an efficient manner. In some cases, a machine learning algorithm may also be used to generate peptide and protein molecules for veterinary, agricultural, industrial, and other applications, as well as to optimize chemical synthesis and manufacturing. In some cases, systems and methods of the present disclosure may enable the generation of superior candidate drugs that have desirable characteristics (e.g., potency, efficacy, safety, pharmacokinetics, formulation, manufacturability, increased chance of clinical success, and others) compared to traditional drug development approaches. In some cases, the disclosed architecture may reduce the time and cost to develop candidate drugs by reducing the number of compounds which must be synthesized and assayed, and it may reduce the computational resources (e.g., processing, memory, time, energy, storage) required to carry out the drug development.

[0244] In some cases, the disclosed embodiments may enable rational discovery and design of drug compounds for a larger design space and at higher accuracy, higher efficiency, or larger scale than conventional techniques. In some cases, various engineered representations disclosed herein may be used with various machine learning models to discover, generate, design, develop, formulate, classify, or test candidate drug compounds. Each of the various machine learning models may perform certain operations relevant to biology, biochemistry, chemistry, pharmacology, pharmaceuticals, reactions, synthesis, in vitro testing, in vivo testing, or clinical testing. The types of machine learning models may include deep learning neural network architectures (e.g., convolutional neural networks (CNNs), recursive neural networks (RNNs), long short-term memories (LSTMs), transformers, graph convolution networks (GCNs), graph attention networks (GATs), message-passing neural networks (MPNNs), generative adversarial networks (GANs), other neural network architectures, or any combination thereof) and traditional machine learning (e.g., linear and logistic regression, decision trees, gradient boosting, support vector machines, kNN, and other predictive, classification, and clustering algorithms). Such networks may additionally employ methods for interpretability or explainability, such as incorporating causal inference, including counterfactuals, during discovery, design, and optimization of peptide or protein molecules and in the presentation of results.

[0245] In some cases, the methods of the disclosure further comprise reducing one or more peptide representations using a machine learning model. The terms “reducing”, “dimensionality reduction”, “projection”, “component analysis”, “feature space reduction”, “latent space engineering”, “feature space engineering”, “representation engineering”, or “latent space embedding”, as used herein, generally refer to a method of transforming a given input data with an initial number of dimensions to another form of data that has fewer dimensions than the initial number of dimensions. In some cases, the terms can refer to the principle of reducing a set of input dimensions to a smaller set of output dimensions. In some cases, the terms can refer to the principle of reducing a set of input dimensions to a set of output dimensions of a same or larger size.

[0246] The term “normalizing”, as used herein, generally refers to a collection of methods for adjusting a dataset to align the dataset to a common scale. In some cases, a normalizing method can comprise multiplying a portion or the entirety of a dataset by a factor. In some cases, a normalizing method can comprise adding or subtracting a constant from a portion or the entirety of a dataset. In some cases, a normalizing method can comprise adjusting a portion or the entirety of a dataset to a known statistical distribution. In some cases, a normalizing method can comprise adjusting a portion or the entirety of a dataset to a normal distribution. In some cases, a normalizing method can comprise adjusting the dataset so that the signal strength of a portion or the entirety of a dataset is about the same.
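For illustration, the sketch below applies two of the normalization choices described above, scaling by a factor and standardizing toward zero mean and unit variance, to a synthetic feature matrix.

```python
# Minimal sketch of two normalization choices: scaling by a factor and
# standardizing each column to roughly zero mean and unit variance.
import numpy as np

features = np.random.rand(50, 4) * np.array([1.0, 10.0, 100.0, 1000.0])

# Scale every column by a constant factor (here, its maximum).
scaled = features / features.max(axis=0)

# Standardize each column toward a standard normal distribution.
standardized = (features - features.mean(axis=0)) / features.std(axis=0)

print(standardized.mean(axis=0).round(6), standardized.std(axis=0).round(6))
```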

[0247] Converting can comprise one or more steps of various conversions of data. In some cases, converting can comprise normalizing data. In some cases, converting can comprise performing a mathematical operation that computes a score based on a distance between two points in the data. In some cases, the points in the data can comprise a peptide representation. In some cases, the distance can comprise a distance between two edges in a graph. In some cases, the distance can comprise a distance between two nodes in a graph. In some cases, the distance can comprise a distance between a node and an edge in a graph. In some cases, the distance can comprise a Euclidean distance. In some cases, the distance can comprise a non-Euclidean distance. In some cases, the distance can be computed in a frequency space. In some cases, the distance can be computed in Fourier space. In some cases, the distance can be computed in Laplacian space. In some cases, the distance can be computed in spectral space. In some cases, the mathematical operation can be a monotonic function based on the distance. In some cases, the mathematical operation can be a non-monotonic function based on the distance. In some cases, the mathematical operation can be an exponential decay function. In some cases, the mathematical operation can be a learned function.
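The sketch below illustrates one such conversion: pairwise Euclidean distances between node coordinates are mapped to edge scores by an exponential decay, one of the monotonic functions mentioned above; the coordinates and decay length are illustrative placeholders.

```python
# Minimal sketch: converting pairwise node distances into edge scores with an
# exponential decay, so close node pairs score near 1 and distant pairs near 0.
import numpy as np

coords = np.random.rand(6, 3) * 10.0   # e.g., residue centroid coordinates in angstroms
decay_length = 5.0                     # illustrative decay constant

# Euclidean distance matrix between all node pairs.
diffs = coords[:, None, :] - coords[None, :, :]
distances = np.linalg.norm(diffs, axis=-1)

# Exponentially decaying score for every pair of nodes.
edge_scores = np.exp(-distances / decay_length)
print(edge_scores.round(3))
```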

[0248] In some cases, converting can comprise transforming data in one representation to another representation. In some cases, converting can comprise transforming data into another form of data with less dimensions. In some cases, converting can comprise linearizing one or more curved paths in the data. In some cases, converting can be performed on data comprising data in Euclidean space. In some cases, converting can be performed on data comprising data in graph space. In some cases, converting can be performed on data in a discrete space. In some cases, converting can be performed on data comprising data in frequency space. In some cases, converting can transform data in discrete space to continuous space, continuous space to discrete space, graph space to continuous space, continuous space to graph space, graph space to discrete space, discrete space to graph space, or any combination thereof. In some cases, converting can comprise transforming data in discrete space into a frequency domain. In some cases, converting can comprise transforming data in continuous space into a frequency domain. In some cases, converting can comprise transforming data in graph space into a frequency domain.

[0249] In some cases, reducing can comprise transforming a given input data with any initial number of dimensions to another form of data that has any number of dimensions fewer than the initial number of dimensions. In some cases, reducing can comprise transforming input data into another form of data with fewer dimensions. In some cases, reducing can comprise linearizing one or more curved paths in the input data to the output data. In some cases, reducing can be performed on data comprising data in Euclidean space. In some cases, reducing can be performed on data comprising data in graph space. In some cases, reducing can be performed on data in a discrete space. In some cases, reducing can transform data in discrete space to continuous space, continuous space to discrete space, graph space to continuous space, continuous space to graph space, graph space to discrete space, discrete space to graph space, or any combination thereof.

[0250] The terms “clustering”, “cluster analysis”, or “generating modules”, as used herein, generally refer to a method of grouping samples in a dataset by some measure of similarity. Samples can be grouped in a set space, for example, element ‘a’ is in set ‘A’. Samples can be grouped in a continuous space, for example, element ‘a’ is a point in Euclidean space at some distance from the centroid of elements comprising cluster ‘A’. Samples can be grouped in a graph space, for example, element ‘a’ is highly connected to elements comprising cluster ‘A’. These terms can refer to the principle of organizing a plurality of elements into groups in some mathematical space based on some measure of similarity.

[0251] In some cases, the method further comprises clustering a cohort of peptide representations to determine one or more groups of peptide representations with similar structures, properties, or functions. Clustering can comprise grouping any number of samples in a dataset by any quantitative measure of similarity. In some cases, clustering can comprise K-means clustering. In some cases, clustering can comprise hierarchical clustering. In some cases, clustering can comprise using random forest models. In some cases, clustering can comprise boosted tree models. In some cases, clustering can comprise using support vector machines. In some cases, clustering can comprise calculating one or more N-1 dimensional surfaces in N-dimensional space that partition a dataset into clusters. In some cases, clustering can comprise distribution-based clustering. In some cases, clustering can comprise fitting a plurality of prior distributions over the data distributed in N-dimensional space. In some cases, clustering can comprise using density-based clustering. In some cases, clustering can comprise using fuzzy clustering. In some cases, clustering can comprise computing probability values of a data point belonging to a cluster. In some cases, clustering can comprise using constraints. In some cases, clustering can comprise using supervised learning. In some cases, clustering can comprise using unsupervised learning.

[0252] In some cases, clustering can comprise grouping samples based on similarity. In some cases, clustering can comprise grouping samples based on quantitative similarity. In some cases, clustering can comprise grouping samples based on one or more features of each sample. In some cases, clustering can comprise grouping samples based on one or more labels of each sample. In some cases, clustering can comprise grouping samples based on Euclidean coordinates. In some cases, clustering can comprise grouping samples based on the features of the nodes and edges of each sample.
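As a minimal illustration, the sketch below groups synthetic peptide latent vectors with K-means and with a density-based alternative (DBSCAN) using scikit-learn; the vectors, cluster count, and DBSCAN parameters are placeholders.

```python
# Minimal sketch: grouping latent vectors with K-means and DBSCAN.
import numpy as np
from sklearn.cluster import KMeans, DBSCAN

latents = np.random.rand(120, 16)  # one synthetic latent vector per peptide

kmeans_labels = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(latents)
dbscan_labels = DBSCAN(eps=0.6, min_samples=5).fit_predict(latents)

print(np.bincount(kmeans_labels))  # cluster sizes from K-means
print(set(dbscan_labels))          # -1 marks points DBSCAN treats as noise
```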

[0253] In some cases, comparing can comprise comparing between a first group and a different second group. In some cases, a first or a second group can each independently be a cluster. In some cases, a first or a second group can each independently be a group of clusters. In some cases, comparing can comprise comparing one cluster with a group of clusters. In some cases, comparing can comprise comparing a first group of clusters with a second group of clusters different from the first group. In some cases, one group can be one sample. In some cases, one group can be a group of samples. In some cases, comparing can comprise comparing one sample versus a group of samples. In some cases, comparing can comprise comparing a group of samples versus another group of samples.

[0254] Comparing can comprise a variety of analytical methods carried out by a computer or a human. In some cases, a statistical test can be carried out to identify one or more peptide representations that are the most different in one group versus a comparison group. In some cases, clustering can be carried out on differences in peptide representations, which can lead to the identification of a set of peptide representations that show a high confidence for satisfying a given machine learning task.

[0255] In some cases, an in silico screening method may comprise predicting solubility of a peptide. In some cases, predicted solubility of a peptide may be used to include or exclude candidate peptides. In some cases, an in silico screening method may comprise predicting desolvation energies of a peptide. In some cases, predicted desolvation energies may be used to prioritize or exclude candidate peptides. In some cases, an in silico screening method may comprise predicting polar or non-polar surface area of a peptide. In some cases, predicted polar or non-polar surface area of a peptide may be used to include or exclude candidate peptides. In some cases, an in silico screening method may comprise predicting the strength of binding interactions with a molecular target to prioritize candidate peptides. In some cases, the predicted strength of a binding interaction with a molecular target may be used to include or exclude candidate peptides. In some cases, an in silico screening method may comprise predicting absorption, distribution, metabolism, excretion, or toxicity (ADMET) properties of a peptide. In some cases, predicted ADMET properties may be used to include or exclude candidate peptides. In some cases, an in silico screening method may comprise thermodynamic integration of a molecular binding event. In some cases, thermodynamic integration may be used to include or exclude candidate peptides.
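A minimal sketch of such an inclusion/exclusion step is shown below; the candidate entries, property names, and thresholds are hypothetical stand-ins for predictions that would come from the models described herein.

```python
# Minimal sketch of an in silico screening filter: keep candidates whose
# predicted solubility and predicted binding strength pass illustrative thresholds.
candidates = [
    {"sequence": "WYNQTK", "pred_solubility_mg_ml": 2.5, "pred_binding_kcal_mol": -9.1},
    {"sequence": "GIQQLQ", "pred_solubility_mg_ml": 0.2, "pred_binding_kcal_mol": -10.4},
    {"sequence": "FYEIIM", "pred_solubility_mg_ml": 4.0, "pred_binding_kcal_mol": -6.3},
]

MIN_SOLUBILITY = 1.0        # mg/mL, hypothetical inclusion threshold
MAX_BINDING_ENERGY = -8.0   # kcal/mol, more negative means stronger predicted binding

kept = [
    c for c in candidates
    if c["pred_solubility_mg_ml"] >= MIN_SOLUBILITY
    and c["pred_binding_kcal_mol"] <= MAX_BINDING_ENERGY
]
print([c["sequence"] for c in kept])  # candidates passing both filters
```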

[0256] In some cases, the peptide structure is obtained using at least a molecular simulation. In some cases, the molecular simulation is based at least partially on an electronic structure calculation, a forcefield-based calculation, molecular dynamics, a Monte Carlo simulation, or any combination thereof.

[0257] Various molecular modeling techniques may be used to train and/or develop a useful machine learning model. The usefulness of the machine learning model may be dependent on the quality, quantity, and diversity of the molecular properties that serve as inputs for a machine learning algorithm to be trained on. Various experimental and computational methods may be used to generate useful inputs for machine learning, including quantum mechanical calculations, experimental characterizations, and molecular dynamics. In some cases, quantum mechanical calculations and/or approximations can yield accurate descriptors of key molecular properties such as charge distribution, optimal molecular conformations, transition energies between conformations, and macroscopic properties (e.g., solvation free energies). Quantum mechanical approximations can range from Hartree-Fock and density functional theory (DFT) to highly expensive calculations such as coupled cluster. Molecular dynamics may be used as a major input source for useful training data for machine learning. Techniques such as thermodynamic integration can provide free energies of solvation, complexing between a peptide and a ligand, etc.

[0258] In some aspects, the present disclosure provides an in silico drug discovery method. FIG. 12 shows a drug discovery method. In silico simulation or calculation methods (e.g., docking simulations, as well as other methods disclosed herein), neural networks, and high performance computing can be integrated into an autonomous system. The system can generate new sequences for computational screening, perform computational methods to narrow down a number of candidate sequences, suggest candidate sequences for experimental assays, drive an autonomous laboratory to synthesize the candidate sequences, perform experimental assays on candidate sequences, manage and/or build a peptide library, or any combination thereof. In some cases, the method can comprise performing a first set of docking simulations between a first set of peptides and a molecular target to generate a training dataset. In some cases, the method can comprise training a generative neural network using the training dataset. In some cases, the method can comprise generating, using the generative neural network, a plurality of peptide sequences. In some cases, the second set of peptides comprises the subset of peptide sequences. In some cases, the method can comprise training a prediction neural network using the training dataset. In some cases, the method can comprise filtering, using the prediction neural network, a subset of peptide sequences from the plurality of peptide sequences. In some cases, the method can comprise clustering the plurality of peptide sequences into one or more clusters. In some cases, the method can comprise selecting a subset of peptide sequences in the plurality of sequences. In some cases, the subset of peptide sequences are substantially distributed among the one or more clusters. In some cases, the method can comprise performing a second set of docking simulations between a second set of peptides and the molecular target. In some cases, the method can comprise performing a second set of docking simulations between a second set of peptides and the molecular target, wherein the second set of peptides comprises the subset of peptide sequences, and wherein at least one peptide in the second set of peptides binds more favorably to the molecular target than each peptide in the first set of peptides. In some cases, the method can comprise generating, using a generative neural network, a plurality of peptide sequences for binding a molecular target, wherein the generating is based at least partially on a docking simulation dataset. In some cases, the method can comprise screening, using a predictive neural network, a subset of peptide sequences from the plurality of peptide sequences.
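The control flow of such an iterative loop can be sketched as below, with each stage (docking, model training, generation, prediction, clustering, and selection) passed in as a callable; the function names, arguments, and score cutoff are hypothetical stand-ins rather than the disclosed implementation.

```python
# Minimal sketch of one round of the iterative discovery loop described above.
# Each stage is injected as a callable so the control flow is explicit.
def run_discovery_round(
    initial_peptides,
    target,
    dock,                    # (peptides, target) -> training data, e.g., docking scores
    train_generator,         # training data -> object with .sample(n)
    train_predictor,         # training data -> object with .score(peptide)
    cluster_and_select,      # (peptides, n) -> diverse subset spread across clusters
    n_generate=1000,
    n_select=50,
    score_cutoff=0.5,
):
    training_data = dock(initial_peptides, target)      # first set of docking simulations
    generator = train_generator(training_data)          # generative neural network
    predictor = train_predictor(training_data)          # prediction neural network

    generated = generator.sample(n_generate)            # propose candidate sequences
    filtered = [p for p in generated if predictor.score(p) > score_cutoff]
    selected = cluster_and_select(filtered, n_select)   # diversity-preserving selection

    return dock(selected, target)                       # second set of docking simulations
```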

[0259] In some cases, a peptide structure may be obtained from at least an experiment or a structure database. In some cases, the set of labels comprises (i) a physicochemical property, (ii) an absorption, distribution, metabolism, excretion, or toxicity (ADMET) property, (iii) a chemical reaction, (iv) a synthesizability, (v) a solubility, (vi) a chemical stability, (vii) a protease stability, (viii) a plasma protein binding property, (ix) a peptide-protein or protein-protein binding interaction, (x) an activity, (xi) a selectivity, (xii) a potency, (xiii) a pharmacokinetic property, (xiv) a pharmacodynamic property, (xv) an in vivo safety property, (xvi) a property reflecting a relationship to a biomarker, (xvii) a formulation property, or (xviii) any combination thereof.

[0260] In some cases, a peptide structure may comprise one or more structural features. In some cases, a structural feature comprises: a three-dimensional structure of an atom, a motif, a functional group, a residue, a secondary structure, a tertiary structure, a quaternary structure, or a molecular complex of the peptide.

[0261] In some cases, a hierarchical representation and domain-specific representations may be obtained using self-supervised representation learning algorithms. In some cases, the self-supervised representation learning algorithms may use large-scale protein databases such as UniProt and the Protein Data Bank (PDB), as well as other public and proprietary peptide and protein data for training. In some cases, self-supervised representation learning may produce representations that sample distributions that reflect the high diversity of standard amino acids, nonstandard amino acids, and other modifications present in biologically occurring proteins and protein-protein interactions, including peptides as small proteins. In some cases, machine learned hierarchical representation and domain-specific representations may embody high generality as well as high specificity.

[0262] Drug discovery is also challenging because of the difficulty of advancing a “hit” to a “lead” and to an “optimized lead” status. A “hit” may refer to a molecule that binds to a biological target. Hits can be identified through high-throughput screens, human design, or machine learning design. A “lead” can refer to a molecule that has drug-like properties such as absorption, distribution, metabolism, and excretion (ADME) that are aligned with the target-product profile and low toxicity. An “optimized lead” can refer to a molecule that has additional properties such as favorable pharmacokinetics, pharmacodynamics, and compatibility with the planned formulation. Traditional drug discovery may proceed in sequential steps of “hit ID” to produce hits, “hit-to-lead” (H2L) to produce leads, and “lead optimization” to produce optimized leads. If any of the steps fails, the discovery process may revert to the prior step. For example, if hit-to-lead fails to produce a lead because the hits do not exhibit low toxicity, then the hit identification step may have to be repeated or the discovery program abandoned. This process of sequentially assessing success criteria may substantially increase the time and cost for drug discovery.

[0263] Multiparameter-optimized peptide libraries can enable compressing many aspects of the traditionally sequential processes of hit ID, hit-to-lead, and lead optimization into one step, thereby increasing the efficiency of peptide drug design. A multiparameter-optimized peptide library can comprise a plurality of lists of peptides optimized for specific biological targets, functions, or medical indications. Each list of peptides can comprise a plurality of peptides with predicted properties for each peptide in the list.

[0264] Generative artificial intelligence models can predict peptides while conditioned on binding to one or more specific protein targets (e.g., V2R, GLP-1, PCSK9), exhibiting specific functions (e.g., antibacterial, antiviral, antifungal, anti-inflammatory), and/or addressing certain indications (e.g., Alzheimer’s disease, cancer, multiple sclerosis). Generative AI models may implement autoregressive algorithms including Transformers and related algorithms, protein folding-based algorithms, reinforcement learning algorithms, active learning algorithms, and diffusion algorithms including denoising diffusion probabilistic models. Protein targets on which generative models operate may be from any species, such as human proteins to develop treatments for disease, bacterial or viral targets to combat emerging infectious diseases, or animal targets to develop veterinary treatments.

[0265] Predictive models can predict properties for each of the peptides, where the properties can include physicochemical properties (e.g., molecular weight, charge, isoelectric point, solubility, lipophilicity), ADME properties (e.g., partition coefficient, distribution coefficient), toxicity, broad medical properties (e.g., anti-inflammatory, anticancer), and properties associated with infectious diseases (e.g., antibacterial, antiviral, antifungal). Predictive models can implement regression and classification algorithms including deep neural networks, boosted decision trees, random forests, other machine learning algorithms, simulations, and calculations. Property predictions can be expressed in physically relevant units, as probabilities, or categorically.

[0266] Multiparameter-optimized peptide libraries can accelerate drug discovery programs by allowing drug developers to identify and select for all relevant properties from the outset. Replacing the sequential process of hit ID, hit-to-lead, and lead optimization, where any of the steps may fail, with a unified process wherein all relevant properties are addressed at each step may substantially reduce the number of steps that fail, the number of molecules synthesized and assessed, and the cost and time required for a drug discovery program.

[0267] Multiparameter-optimized peptide libraries, which can inherently be high-dimensional, can be visualized using dimensionality reduction algorithms such as uniform manifold approximation and projection (UMAP) and t-distributed stochastic neighbor embedding (t-SNE). The visualizations may convey the existence of clusters of peptides that are similar to each other, as measured in the high-dimensional space of predicted properties. The visualizations may be coded by different properties to illustrate the distributions of those properties across the clusters. Certain aspects of the visualizations such as the UMAP embeddings may be precomputed, or computed at first use and then stored for rapid lookup, to improve interactivity. Visualizing multiple properties concurrently may allow drug developers to identify peptides meeting the requirements of the discovery program’s target-product profile (TPP) and select those peptides for further analysis.
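For illustration, the sketch below embeds a synthetic property matrix with UMAP and colors points by one predicted property, assuming the umap-learn and matplotlib packages are available; the data and the property chosen for coloring are placeholders.

```python
# Minimal sketch: UMAP embedding of a high-dimensional property matrix, with
# points colored by one predicted property to inspect clusters and property
# distributions together.
import numpy as np
import umap
import matplotlib.pyplot as plt

property_matrix = np.random.rand(500, 20)     # 500 peptides x 20 predicted properties
predicted_solubility = property_matrix[:, 0]  # one property chosen for coloring

embedding = umap.UMAP(n_components=2, random_state=0).fit_transform(property_matrix)

plt.scatter(embedding[:, 0], embedding[:, 1], c=predicted_solubility, s=8, cmap="viridis")
plt.colorbar(label="predicted solubility (arbitrary units)")
plt.xlabel("UMAP 1")
plt.ylabel("UMAP 2")
plt.savefig("library_umap.png", dpi=150)
```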

[0268] A graphical user interface may provide interactive access to visualizations of multiparameter- optimized peptide libraries including capabilities to query the libraries, restrict selection to combined ranges of properties, and access the resulting selections for further analysis.

[0269] Application programming interfaces (APIs), modules, libraries, packages, and executable programs may provide additional means of accessing, selecting from, combining, analyzing, and otherwise using multiparameter-optimized peptide libraries. These methods may be implemented for on-premises execution or as “software as a service” (SaaS) on a public or private cloud service provider.

[0270] Software analysis tools may include analyses of multiparameter-optimized peptide libraries produced for different protein targets, functions, or indications, enabling identification of salient features of peptides across target families, between targets and functions, and between targets and indications, further increasing the efficiency with which therapeutics can be developed.

[0271] In some aspects, the present disclosure provides systems and methods for building, updating, and/or leveraging in silico libraries of peptides (which can include nonstandard amino acids and their modifications) that are candidates for addressing molecular targets of disease, human proteins, and/or non-human targets and proteins. FIG. 13 and FIG. 14 show examples of drug discovery processes that can leverage an in silico library of peptides. In some cases, a library of peptides can be optimized for drug development, where compounds in the library are optimized for properties relevant for drug development. Such properties can include results of binding assays, activity, function, physicochemical properties, ADMET, and/or synthesizability. The properties can comprise representations, such as those disclosed in the present application. The library can be integrated with various algorithms disclosed herein. Various libraries, each with different objectives, can be generated to compare libraries for targets, human proteins, and non-human targets and proteins. For example, the libraries can be used to address areas where animal and human models differ and/or for veterinary, agricultural, and industrial applications. In some cases, a library may be generated on-demand (e.g., autonomously) for novel (newly discovered or hypothesized) targets.

[0272] In some cases, in silico libraries of peptides can be predicted to have favorable binding, activity, physicochemical, ADMET, synthesizability, and other properties for specific targets and proteins. Such libraries may surpass traditional compound libraries by representing a larger collection of properties necessary for lead optimization, thereby reducing time, cost, and risk associated with drug design. The libraries can be created using learned, hierarchical representations of peptides and proteins and graph attention networks, as disclosed herein. In some cases, peptides in the libraries may contain non-standard amino acids and modifications. Peptide libraries generated for known targets, other human proteins, and (in some cases) non-human targets and proteins can enable rapid prosecution of validated and novel targets as well as analyses of drug candidates that can address multiple targets. In some cases, libraries may contain approximately 10^4 to 10^6 diverse peptides for about 3,000 targets, about 20,000 human proteins, (potentially) other targets (e.g., mediating protein-protein interactions), and (potentially) non-human targets and proteins.

[0273] In some cases, libraries can be generated by efficient, multitask models that process binding assay results, activity, physicochemical properties, pharmacological properties, synthesizability, and other properties, which can substantially improve the quality of the candidate hits. Increased quality of candidate hits can reduce the effort and cost for lead optimization, and can increase the probability of successful drug design. Higher-quality libraries can require relatively fewer compounds (e.g., 10^6) for screening. Furthermore, analysis of libraries for different targets can reveal significant opportunities for designing drugs to address multiple targets as well as increasing specificity and safety by reducing off-target effects.

[0274] In some aspects, the present disclosure provides a library comprising a plurality of libraries. In some cases, a user or an entity may be granted access to a subset of libraries in the plurality of libraries. The plurality of libraries can comprise in silico libraries of peptides (including nonstandard amino acids and modifications) having potential as drug candidates addressing molecular targets of disease, human proteins, and non-human targets and proteins.

[0275] In some cases, a user may access the library using a web-browser, a terminal, or an autonomous protocol. For example, FIG. 15 shows a GUI for accessing the library using a web browser. Using the browser, a user can select a target protein of interest.

[0276] When a target is selected, a summary report can be produced or retrieved. The summary report can contain a visualization of properties that are important for peptide drug delivery. The visualization can be a visualization of a representation, e.g., a latent representation. Examples of visualizations are shown in FIGS. 19-22. The summary report can contain a list of peptide candidates that bind or could bind to the target protein. The summary report can contain interactive elements that a user can use to narrow down the number of peptide candidates. An interactive element can be a checkbox, a slider, a field for entering numerical values, etc. The user can download the list of peptide candidates, along with any relevant associated properties such as sequence or toxicity, or any other property disclosed herein. An example of a downloaded list is shown in FIG. 18.

[0277] The present disclosure provides computer systems that are programmed to implement methods of the disclosure. FIG. 11 shows a computer system 11001 that is programmed or otherwise configured to implement methods of the present disclosure, such as, for example, training or using a trained machine learning algorithm of the present disclosure.

[0278] The computer system 11001 may regulate various aspects of analysis, calculation, and generation of the present disclosure, such as, for example, during training or using a trained machine learning algorithm of the present disclosure. The computer system 11001 may be an electronic device of a user or a computer system that is remotely located with respect to the electronic device. The electronic device may be a mobile electronic device.

[0279] The computer system 11001 includes a central processing unit (CPU, also “processor” and “computer processor” herein) 11005, which may be a single core or multi core processor, or a plurality of processors for parallel processing. The computer system 11001 also includes memory or memory location 11010 (e.g., random-access memory, read-only memory, flash memory), electronic storage unit 11015 (e.g., hard disk), communication interface 11020 (e.g., network adapter) for communicating with one or more other systems, and peripheral devices 11025, such as cache, other memory, data storage and/or electronic display adapters. The memory 11010, storage unit 11015, interface 11020 and peripheral devices 11025 are in communication with the CPU 11005 through a communication bus (solid lines), such as a motherboard. The storage unit 11015 may be a data storage unit (or data repository) for storing data. The computer system 11001 may be operatively coupled to a computer network (“network”) 11030 with the aid of the communication interface 11020. The network 11030 may be the Internet, an internet and/or extranet, or an intranet and/or extranet that is in communication with the Internet.

[0280] The network 11030 in some cases is a telecommunication and/or data network. The network 11030 may include one or more computer servers, which may enable distributed computing, such as cloud computing. For example, one or more computer servers may enable cloud computing over the network 11030 (“the cloud”) to perform various aspects of analysis, calculation, and generation of the present disclosure, such as, for example, training or using a trained machine learning algorithm of the present disclosure. Such cloud computing may be provided by cloud computing platforms such as, for example, Amazon Web Services (AWS), Microsoft Azure, Google Cloud Platform, and IBM cloud. The network 11030, in some cases with the aid of the computer system 11001, may implement a peer-to-peer network, which may enable devices coupled to the computer system 11001 to behave as a client or a server.

[0281] The CPU 11005 may comprise one or more computer processors and/or one or more graphics processing units (GPUs). The CPU 11005 may execute a sequence of machine-readable instructions, which may be embodied in a program or software. The instructions may be stored in a memory location, such as the memory 11010. The instructions may be directed to the CPU 11005, which may subsequently program or otherwise configure the CPU 11005 to implement methods of the present disclosure. Examples of operations performed by the CPU 11005 may include fetch, decode, execute, and writeback.

[0282] The CPU 11005 may be part of a circuit, such as an integrated circuit. One or more other components of the system 11001 may be included in the circuit. In some cases, the circuit is an application specific integrated circuit (ASIC).

[0283] The storage unit 11015 may store files, such as drivers, libraries and saved programs. The storage unit 11015 may store user data, e.g., user preferences and user programs. The computer system 11001 in some cases may include one or more additional data storage units that are external to the computer system 11001, such as located on a remote server that is in communication with the computer system 11001 through an intranet or the Internet.

[0284] The computer system 11001 may communicate with one or more remote computer systems through the network 11030. For instance, the computer system 11001 may communicate with a remote computer system of a user. Examples of remote computer systems include personal computers (e.g., portable PC), slate or tablet PCs (e.g., Apple® iPad, Samsung® Galaxy Tab), telephones, Smart phones (e.g., Apple® iPhone, Android-enabled device, Blackberry®), or personal digital assistants. The user may access the computer system 11001 via the network 11030.

[0285] Methods as described herein may be implemented by way of machine (e.g., computer processor) executable code stored on an electronic storage location of the computer system 11001, such as, for example, on the memory 11010 or electronic storage unit 11015. The machine executable or machine readable code may be provided in the form of software. During use, the code may be executed by the processor 11005. In some cases, the code may be retrieved from the storage unit 11015 and stored on the memory 11010 for ready access by the processor 11005. In some situations, the electronic storage unit 11015 may be precluded, and machine-executable instructions are stored on memory 11010.

[0286] The code may be pre-compiled and configured for use with a machine having a processor adapted to execute the code, or may be compiled during runtime. The code may be supplied in a programming language that may be selected to enable the code to execute in a pre-compiled or as-compiled fashion.

[0287] Aspects of the systems and methods provided herein, such as the computer system 11001, may be embodied in programming. Various aspects of the technology may be thought of as “products” or “articles of manufacture” typically in the form of machine (or processor) executable code and/or associated data that is carried on or embodied in a type of machine readable medium. Machine-executable code may be stored on an electronic storage unit, such as memory (e.g., read-only memory, random-access memory, flash memory) or a hard disk. “Storage” type media may include any or all of the tangible memory of the computers, processors or the like, or associated modules thereof, such as various semiconductor memories, tape drives, disk drives and the like, which may provide non-transitory storage at any time for the software programming. All or portions of the software may at times be communicated through the Internet or various other telecommunication networks. Such communications, for example, may enable loading of the software from one computer or processor into another, for example, from a management server or host computer into the computer platform of an application server. Thus, another type of media that may bear the software elements includes optical, electrical and electromagnetic waves, such as used across physical interfaces between local devices, through wired and optical landline networks and over various air-links. The physical elements that carry such waves, such as wired or wireless links, optical links or the like, also may be considered as media bearing the software. As used herein, unless restricted to non-transitory, tangible “storage” media, terms such as computer or machine “readable medium” refer to any medium that participates in providing instructions to a processor for execution.

[0288] Hence, a machine readable medium, such as computer-executable code, may take many forms, including but not limited to, a tangible storage medium, a carrier wave medium or physical transmission medium. Non-volatile storage media include, for example, optical or magnetic disks, such as any of the storage devices in any computer(s) or the like, such as may be used to implement the databases, etc. shown in the drawings. Volatile storage media include dynamic memory, such as main memory of such a computer platform. Tangible transmission media include coaxial cables; copper wire and fiber optics, including the wires that comprise a bus within a computer system. Carrier-wave transmission media may take the form of electric or electromagnetic signals, or acoustic or light waves such as those generated during radio frequency (RF) and infrared (IR) data communications. Common forms of computer-readable media therefore include for example: a floppy disk, a flexible disk, hard disk, magnetic tape, any other magnetic medium, a CD-ROM, DVD or DVD-ROM, any other optical medium, punch cards, paper tape, any other physical storage medium with patterns of holes, a RAM, a ROM, a PROM and EPROM, a FLASH-EPROM, any other memory chip or cartridge, a carrier wave transporting data or instructions, cables or links transporting such a carrier wave, or any other medium from which a computer may read programming code and/or data. Many of these forms of computer readable media may be involved in carrying one or more sequences of one or more instructions to a processor for execution.

[0289] The computer system 11001 may include or be in communication with an electronic display 11035 that comprises a user interface (UI) 11040 for providing, for example, access to a trained machine learning algorithm of the present disclosure. In some aspects, the present disclosure provides a computer-implemented method for providing access to a trained neural network to one or more users via a terminal, a web browser, an application, or a server infrastructure.

[0290] In some aspects, the present disclosure provides a computer program product comprising a computer-readable medium having computer-executable code encoded therein, the computer-executable code adapted to be executed to implement any one of the methods or computer-implemented methods of the claims disclosed herein, wherein the computer-executable code comprises graphical flows between a plurality of tensor operations that define the neural network.

[0291] In some aspects, the present disclosure provides a non-transitory computer-readable storage media encoded with a computer program including instructions executable by one or more processors to perform any one of the methods or computer-implemented methods disclosed herein, wherein the non-transitory computer-readable storage media is comprised in a distributed computing system, a cloud-based computing system, or a supercomputer comprising a plurality of graphical processing units. In some cases, the computer program is encoded in a human non-readable format.

[0292] In some aspects, the present disclosure provides a non-transitory computer-readable storage media encoded with a computer program including instructions executable by one or more processors to perform any one of the computer-implemented methods disclosed herein, wherein the non-transitory computer-readable storage media is comprised in a distributed computing system, a cloud-based computing system, or a supercomputer comprising a plurality of central processing units, graphical processing units, or tensor processing units.

[0293] In some aspects, the present disclosure provides a computer-implemented system comprising: a digital processing device comprising: at least one processor, an operating system configured to perform executable instructions, a memory, and a computer program including instructions executable by the digital processing device to perform any one of the methods or computer-implemented methods disclosed herein.

[0294] In some aspects, the present disclosure provides a non-transitory computer-readable storage media encoded with a computer program including instructions executable by one or more processors to perform any one of the computer-implemented methods disclosed herein, wherein the non-transitory computer-readable storage media is comprised in a distributed computing system, a cloud-based computing system, or a supercomputer comprising a plurality of graphical processing units.

[0295] In some aspects, the present disclosure provides a non-transitory computer-readable storage media encoded with a computer program including instructions executable by one or more processors to perform any one of the computer-implemented methods disclosed herein, wherein the non-transitory computer-readable storage media is comprised in a laptop computer or a personal desktop computer.

[0296] In some aspects, the present disclosure provides a non-transitory computer-readable storage media encoded with a computer program including instructions executable by one or more processors to perform any one of the computer-implemented methods of the claims disclosed herein, wherein the non-transitory computer-readable storage media is comprised in a laptop computer or a personal desktop computer.

[0297] In some cases, the terminal, the web browser, or the application comprises a graphical user interface for receiving instructions for generating a latent representation of a given peptide from the one or more users and providing the latent representation of the given peptide to the one or more users via the graphical user interface based at least partially on the instructions. In some cases, the terminal, the web browser, or the application comprises a graphical user interface for receiving instructions for obtaining a query hierarchical representation of a query peptide from the one or more users and providing the query hierarchical representation of the query peptide to the one or more users via the graphical user interface based at least partially on the instructions. In some cases, the server infrastructure is configured to transmit information to a second computer, wherein the information comprises a latent representation of a given peptide. In some cases, the server infrastructure is configured to exchange information with a second computer, wherein the information comprises a peptide chemical structure for a given latent representation. In some cases, the terminal, the web browser, or the application comprises a graphical user interface for receiving instructions for generating a query latent representation of a query peptide graph from the one or more users and providing the query latent representation of the query peptide graph to the one or more users via the graphical user interface based at least partially on the instructions. In some cases, the server infrastructure is configured to transmit information to a second computer, wherein the information comprises a query latent representation of a query peptide. In some cases, the server infrastructure is configured to exchange information with a second computer, wherein the information comprises a peptide chemical structure associated with a query latent representation. In some cases, the information comprises a query hierarchical representation of a query peptide.

[0298] In some cases, the server infrastructure is configured to transmit information to a second computer, wherein the information comprises a query hierarchical representation of a query peptide. In some cases, the server infrastructure is configured to exchange information with a second computer, wherein the information comprises a peptide chemical structure associated with a query hierarchical representation. In some cases, the terminal, the web browser, or the application comprises a graphical user interface for receiving instructions for obtaining a query ligand representation for a query target peptide from the one or more users and providing the query ligand representation to the one or more users via the graphical user interface based at least partially on the instructions. In some cases, the server infrastructure is configured to transmit information to a second computer, wherein the information comprises a query ligand representation of a query target peptide. In some cases, the server infrastructure is configured to exchange information with a second computer, wherein the information comprises a peptide chemical structure associated with a query ligand representation.

[0299] Examples of UIs include, without limitation, a graphical user interface (GUI) and a web-based user interface.

[0300] Methods and systems of the present disclosure may be implemented by way of one or more algorithms. An algorithm may be implemented by way of software upon execution by the central processing unit 11005. The algorithm can, for example, train or use a trained machine learning algorithm of the present disclosure.

[0301] In some aspects, the present disclosure provides a non-transitory computer-readable storage medium. In some cases, the non-transitory computer-readable storage medium comprises a peptide graph comprising a plurality of nodes and a plurality of edges, wherein the plurality of edges encodes bonded interactions and nonbonded interactions between the plurality of nodes. In some cases, the non-transitory computer-readable storage medium comprises a latent representation for a node in a plurality of nodes, wherein the latent representation encodes at least short-range interactions and long-range interactions between the node and at least a subset of nodes in the plurality of nodes.

[0302] In some cases, the non-transitory computer-readable storage medium comprises a first representation of a peptide comprising a plurality of nodes and a plurality of edges, wherein the plurality of edges encodes bonded and nonbonded interactions between the plurality of nodes. In some cases, the non-transitory computer-readable storage medium comprises an encoding based at least partially on the first representation, wherein the encoding comprises a lower resolution than the first representation.

EXAMPLES

[0303] The following examples are provided to further illustrate some embodiments of the present disclosure, but are not intended to limit the scope of the disclosure; it will be understood, given their exemplary nature, that other procedures, methodologies, or techniques known to those skilled in the art may alternatively be used.

Example 1: Neural network for learning embeddings that incorporate short-range and long-range information in a peptide graph

[0304] In some cases, the method comprises constructing a peptide graph based at least partially on a peptide structure. For example, FIG. 1 illustrates peptide graphs constructed from a peptide structure. In some cases, the peptide structure (1001) may be used to generate an atomistic representation of a graph (1002). In some cases, the peptide structure may be used to generate a residue representation of a graph (1003). In some cases, the peptide graph comprises a plurality of nodes and a plurality of edges. In some cases, the plurality of edges represents at least bonded and nonbonded interactions between the plurality of nodes of the peptide graph.
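
The following is a minimal, illustrative sketch of how a residue-level peptide graph of this kind might be assembled, assuming the peptide structure is available as a list of residue identities and Cα coordinates. The 8.0 Å contact cutoff and the helper name build_residue_graph are assumptions introduced for illustration only, not values or interfaces prescribed by the present disclosure.

```python
# Illustrative sketch: residue-level peptide graph with bonded and nonbonded edges.
import networkx as nx
import numpy as np

def build_residue_graph(residues, ca_coords, contact_cutoff=8.0):
    """residues: list of one-letter codes; ca_coords: (N, 3) array of C-alpha coordinates."""
    ca_coords = np.asarray(ca_coords, dtype=float)
    G = nx.Graph()
    for i, aa in enumerate(residues):
        G.add_node(i, residue=aa)
    # Bonded interactions: peptide bonds between sequential residues.
    for i in range(len(residues) - 1):
        G.add_edge(i, i + 1, kind="bonded")
    # Nonbonded interactions: spatial contacts within the (assumed) distance cutoff.
    for i in range(len(residues)):
        for j in range(i + 2, len(residues)):
            if np.linalg.norm(ca_coords[i] - ca_coords[j]) < contact_cutoff:
                G.add_edge(i, j, kind="nonbonded")
    return G
```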

[0305] In some cases, a neighborhood graph of a node v may be defined as the subgraph of nodes that are within k hops of v. For example, in FIG. 5, node L8 may be selected as the node of interest, v. Then, the neighborhood graph of v, when k = 2, comprises the nodes that are at most two edges from v. In this example, the nodes Y2, D7, Q9, F12, E14, I30, and L33 are comprised in the neighborhood graph but not the context graph.

[0306] In some cases, the context graph of node v may be defined as the subgraph of nodes that are at least r₁ edges from v and at most r₂ edges from v, where r₁ ≥ k. For example, in FIG. 6, node L8 is selected as the node of interest, v. The r₁ = 2 and r₂ = 3 context graph comprises nodes that are at least 2 edges away and at most 3 edges away from node v. The nodes M17, I19, E20, Q21, N22, N23, Q25, K28, and K35 are comprised in the context graph but not the neighborhood graph.

[0307] In some cases, anchor nodes may be defined as nodes that are common to the neighborhood subgraph and the context subgraph. For example, in FIG. 5 and FIG. 6, node L8 (outlined in black) is selected as the node of interest, v. Given k = 2, r₁ = 2, and r₂ = 3, the anchor nodes of v are W1, N3, T5, K6, Q10, K11, Y13, I15, I16, D18, V24, G26, K27, G29, Q31, Q32, and Q34.
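
The neighborhood graph, context graph, and anchor nodes described in paragraphs [0305]–[0307] can be extracted with a simple breadth-first distance computation, as sketched below. This is illustrative only; it reuses the residue graph sketched above and the k = 2, r₁ = 2, r₂ = 3 settings of the example, and the helper name neighborhood_context_anchors is an assumption.

```python
# Illustrative sketch: extract neighborhood, context, and anchor nodes for a central node v.
import networkx as nx

def neighborhood_context_anchors(G, v, k=2, r1=2, r2=3):
    """Return the k-hop neighborhood subgraph, the [r1, r2]-hop context subgraph,
    and the anchor nodes shared by both, for central node v."""
    dist = nx.single_source_shortest_path_length(G, v, cutoff=r2)
    neighborhood = {u for u, d in dist.items() if 0 < d <= k}
    context = {u for u, d in dist.items() if r1 <= d <= r2}
    anchors = neighborhood & context  # nodes common to both subgraphs
    return G.subgraph(neighborhood | {v}), G.subgraph(context), anchors
```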

[0308] A neighborhood graph, context graph, and anchor nodes may be defined for each node vᵢ in a graph G.

[0309] The neighborhood information may be encoded in the learned representation by posing a training task as a context prediction problem. The initial node features of the residues may contain the following information: a one-hot encoded vector of the amino acids, a one-hot encoded vector of secondary structure, B-factor or Debye-Waller factor, Meiler embedding, ProtScale (Protein Identification and Analysis Tools on the ExPASy Server), solvent-accessible surface area, Ramachandran angles, amino acid indices, or other amino acid embeddings, fingerprints, or properties, or any other feature disclosed herein.
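
A hedged sketch of how such an initial feature vector might be assembled from a subset of these features is shown below. The twenty-letter amino-acid alphabet, the three-state secondary-structure encoding, and the sine/cosine encoding of the Ramachandran angles are illustrative choices, not requirements of the present disclosure.

```python
# Illustrative sketch: assembling initial residue node features (assumed feature subset).
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
SS_STATES = "HEC"  # helix, strand, coil (assumed three-state encoding)

def one_hot(symbol, alphabet):
    vec = np.zeros(len(alphabet))
    vec[alphabet.index(symbol)] = 1.0
    return vec

def residue_features(aa, ss, b_factor, sasa, phi, psi):
    """Concatenate a subset of the listed features into a single node feature vector."""
    return np.concatenate([
        one_hot(aa, AMINO_ACIDS),           # amino-acid identity
        one_hot(ss, SS_STATES),             # secondary structure
        [b_factor],                         # B-factor / Debye-Waller factor
        [sasa],                             # solvent-accessible surface area
        [np.sin(phi), np.cos(phi),          # Ramachandran angles encoded
         np.sin(psi), np.cos(psi)],         # as sine/cosine pairs
    ])
```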

[0310] In some cases, to learn the node-level representation, the initial node features of the neighborhood subgraph and the context subgraph may be passed through two separate graph neural network (GNN) stacks. FIG. 8 schematically illustrates an architecture of the self-supervised learning algorithm. The GNN layers for the neighborhood and context GNNs may be graph convolution networks (GCNs), graph attention networks (GATs), message passing neural networks (MPNNs), or any other GNNs that perform permutation-invariant pooling and aggregation. Different GNNs may treat attention coefficients across the neighboring nodes differently, resulting in different accuracy, convergence, and learning properties that may be optimized using neural architecture search.
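
The two-stack arrangement can be illustrated with a minimal permutation-invariant message-passing layer, as sketched below. The mean-aggregation layer, layer count, and dimensions are placeholders standing in for the GCN, GAT, or MPNN variants named above; the input dimension of 29 merely matches the feature sketch above and is not prescribed.

```python
# Illustrative sketch: separate GNN stacks for the neighborhood and context subgraphs.
import torch
import torch.nn as nn

class MeanConv(nn.Module):
    """One permutation-invariant message-passing layer (dense adjacency matrix)."""
    def __init__(self, dim_in, dim_out):
        super().__init__()
        self.lin = nn.Linear(2 * dim_in, dim_out)

    def forward(self, x, adj):
        # Mean-aggregate neighbor features, then mix with the node's own features.
        deg = adj.sum(dim=-1, keepdim=True).clamp(min=1.0)
        neighbor_mean = (adj @ x) / deg
        return torch.relu(self.lin(torch.cat([x, neighbor_mean], dim=-1)))

class GNNStack(nn.Module):
    """A small stack of message-passing layers."""
    def __init__(self, dim_in, dim_hidden, n_layers=3):
        super().__init__()
        dims = [dim_in] + [dim_hidden] * n_layers
        self.layers = nn.ModuleList(
            [MeanConv(a, b) for a, b in zip(dims[:-1], dims[1:])])

    def forward(self, x, adj):
        for layer in self.layers:
            x = layer(x, adj)
        return x

# Separate stacks encode the neighborhood subgraph and the context subgraph.
neighborhood_gnn = GNNStack(dim_in=29, dim_hidden=64)
context_gnn = GNNStack(dim_in=29, dim_hidden=64)
```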

[0311] The context representation, cᵥ, may be constructed by average pooling over the context anchor nodes. The training task may be posed as predicting whether a given context representation belongs to a particular node-level representation nᵥ. In some cases, the training task may induce the GNN to map the nodes with similar structural context to nearby points in the high-dimensional representation.

[0312] In some cases, positive samples may be directly sampled from the graph representations of peptides and proteins. In some cases, negative samples may be added through random sampling. In some cases, positive samples may comprise pairs of nodes in a neighborhood graph and a context graph where both are sampled from the same peptide graph and the same central node. In some cases, negative samples may comprise pairs of nodes in a neighborhood graph and a context graph where both are sampled from different peptide graphs, or from the same peptide graph but with a different central node.
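
A sketch of this sampling scheme is given below. It assumes the neighborhood_context_anchors helper from the earlier sketch and an assumed list, peptide_graphs, of residue graphs; the function name sample_pairs is likewise illustrative.

```python
# Illustrative sketch: positive / negative pair construction for context prediction.
import random

def sample_pairs(peptide_graphs, n_pairs, k=2, r1=2, r2=3):
    """Build positive and negative (neighborhood, context, anchors) pairs."""
    positives, negatives = [], []
    for _ in range(n_pairs):
        # Positive: neighborhood and context from the same graph and the same central node.
        G = random.choice(peptide_graphs)
        v = random.choice(list(G.nodes))
        nbh, ctx, anchors = neighborhood_context_anchors(G, v, k, r1, r2)
        positives.append((nbh, ctx, anchors))

        # Negative: context from a different graph, or from the same graph
        # but a different central node.
        G2 = random.choice(peptide_graphs)
        v2 = random.choice([u for u in G2.nodes if G2 is not G or u != v])
        _, ctx_neg, anchors_neg = neighborhood_context_anchors(G2, v2, k, r1, r2)
        negatives.append((nbh, ctx_neg, anchors_neg))
    return positives, negatives
```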

[0313] In some cases, the output of the model may be the sigmoid of the dot product of the node representation vector and the context representation vector, i.e., σ(nᵥ · cᵥ).
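
The scoring and training objective might then be sketched as follows, with the context representation cᵥ obtained by average pooling over the anchor nodes and positive/negative labels supplied by the sampling above. The tensor shapes and the binary cross-entropy pairing are assumptions introduced for illustration.

```python
# Illustrative sketch: context-prediction score (sigmoid of a dot product) and loss.
import torch
import torch.nn.functional as F

def context_prediction_score(n_v, context_embeddings, anchor_idx):
    """n_v: (D,) node embedding from the neighborhood GNN.
    context_embeddings: (N_ctx, D) outputs of the context GNN.
    anchor_idx: long tensor of anchor-node positions within the context subgraph."""
    c_v = context_embeddings[anchor_idx].mean(dim=0)  # average pooling over anchor nodes
    return torch.sigmoid(torch.dot(n_v, c_v))         # sigmoid of the dot product

def context_prediction_loss(scores, labels):
    """Binary cross-entropy over positive (label 1) and negative (label 0) pairs."""
    return F.binary_cross_entropy(scores, labels)
```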

[0317] In some cases, once the model is trained on the context prediction task, the context GNN may be removed, and only the neighborhood GNN may be used to generate the node embeddings nᵥ for new graph representations of peptides that the network has not seen previously. In some cases, once the model is trained on the context prediction task, the neighborhood GNN may be removed, and only the context GNN may be used to generate the node embeddings for new graph representations of peptides that the network has not seen previously.

[0318] In some cases, the context prediction task may be used to learn the representation at lower levels of representation with individual atoms represented as nodes, and the chemical interactions such as covalent bonds and van der Waals interactions represented as edges.

[0319] In some cases, the node embeddings learned from the low level graph representations (e.g., atomic level graph representations) may be used to learn embeddings at higher levels of graph representation (e.g., residue level graph representations) using hierarchical learning.

[0320] In some cases, the low-level embeddings may be processed using a GNN operating on a bipartite graph connecting the higher-level nodes (e.g., residue nodes) with their corresponding nodes in the low-level representation (e.g., the atoms within a particular residue), as shown in FIG. 7.
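
One possible realization of the atom-to-residue pooling over such a bipartite graph is a simple scatter-mean, sketched below in place of a full bipartite GNN layer. The function name and tensor layout are illustrative assumptions, not elements of the disclosed architecture.

```python
# Illustrative sketch: pooling learned atom-level embeddings up to residue nodes.
import torch

def pool_atoms_to_residues(atom_embeddings, atom_to_residue, n_residues):
    """Scatter-mean atom embeddings onto their parent residues.

    atom_embeddings: (N_atoms, D) learned atom-level embeddings.
    atom_to_residue: (N_atoms,) long tensor mapping each atom to its residue index.
    """
    D = atom_embeddings.size(1)
    sums = torch.zeros(n_residues, D).index_add_(0, atom_to_residue, atom_embeddings)
    counts = torch.zeros(n_residues).index_add_(
        0, atom_to_residue, torch.ones(atom_to_residue.numel()))
    return sums / counts.clamp(min=1.0).unsqueeze(1)
```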

[0321] In some cases, the learning of higher-level node embeddings may be performed by posing the problem as a masked attribute prediction task, where random residues are masked (e.g., with ‘X’) and the output of the GNNs is used to predict the value of the masked residue.

[0322] While preferred embodiments of the present disclosure have been shown and described herein, it will be obvious to those skilled in the art that such embodiments are provided by way of example only. Numerous variations, changes, and substitutions will now occur to those skilled in the art without departing from the disclosure. It should be understood that various alternatives to the embodiments of the present disclosure may be employed in practicing the present disclosure. It is intended that the following claims define the scope of the present disclosure and that methods and structures within the scope of these claims and their equivalents be covered thereby.