Title:
INTERACTION PROPERTY PREDICTION SYSTEM AND METHOD
Document Type and Number:
WIPO Patent Application WO/2019/202292
Kind Code:
A1
Abstract:
A computer system for predicting an interaction property between a target object and a plurality of test objects is disclosed. The system comprises a data store comprising executable instructions for a neural network, and a processor coupled to the data store and configured to execute the stored instructions to operate the neural network. The system is configured to obtain target object data comprising data indicative of a three-dimensional representation of at least one property of the target object. For each test object of the plurality of test objects, the system is configured to: obtain test object data comprising data indicative of a three-dimensional representation of at least one property of the test object; and operate the neural network on combined object data to determine an indication of an interaction property between the target object and the test object. The combined object data is based on both the target object data and the test object data. The system is configured to output to a resource an indication of a selected subset of test objects determined to be above a selected threshold of likelihood of fitting with the target object, and said subset is selected based on the determined interaction properties.

Inventors:
REDDY AKHILESH (GB)
Application Number:
PCT/GB2019/050887
Publication Date:
October 24, 2019
Filing Date:
March 28, 2019
Assignee:
DRUGAI LTD (GB)
International Classes:
G16B15/30; G06F17/50; G06T17/00; G06T19/00
Domestic Patent References:
WO2018213767A1 (2018-11-22)
Foreign References:
US20160300127A1 (2016-10-13)
US20170329892A1 (2017-11-16)
US20150169822A1 (2015-06-18)
US20180089888A1 (2018-03-29)
Other References:
RAGOZA M ET AL: "Protein-Ligand Scoring with Convolutional Neural Networks", JOURNAL OF CHEMICAL INFORMATION AND MODELING, vol. 57, no. 4, 11 April 2017 (2017-04-11), US, pages 942 - 957, XP055597752, ISSN: 1549-9596, DOI: 10.1021/acs.jcim.6b00740
ZHANG C ET AL: "Neural networks: Efficient implementations and applications", 2017 IEEE 12TH INTERNATIONAL CONFERENCE ON ASIC (ASICON), IEEE, 25 October 2017 (2017-10-25), pages 1029 - 1032, XP033295105, DOI: 10.1109/ASICON.2017.8252654
SABOUR S ET AL: "Dynamic Routing Between Capsules", ARXIV.ORG, CORNELL UNIVERSITY LIBRARY, 201 OLIN LIBRARY CORNELL UNIVERSITY ITHACA, NY 14853, 26 October 2017 (2017-10-26), XP081283827
Attorney, Agent or Firm:
WHITE, Andrew (GB)
Claims:
Claims

1. A computer system for predicting an interaction property between a target object and a plurality of test objects, the system comprising:

a data store comprising executable instructions for a neural network, wherein said neural network comprises at least one of a deep residual network, highway network, densely connected network and a capsule network; and

a processor coupled to the data store and configured to execute the stored instructions to operate the neural network;

wherein the system is configured to obtain target object data comprising data indicative of a three-dimensional representation of at least one property of the target object; wherein, for each test object of the plurality of test objects, the system is configured to:

obtain test object data comprising data indicative of a three-dimensional representation of at least one property of the test object; and

operate the neural network on combined object data to determine an indication of an interaction property between the target object and the test object, wherein the combined object data is based on both the target object data and the test object data;

wherein the system is configured to output to a resource an indication of a selected subset of test objects determined to be above a selected threshold of likelihood of fitting with the target object, wherein said subset is selected based on the determined interaction properties.

2. The computer system of claim 1, wherein the combined object data is based on: (i) data indicative of the target object in a first target pose and (ii) data indicative of the test object in a first test pose.

3. The computer system of claim 2, wherein the system is operable to operate the neural network on combined object data to predict an interaction property between the target object and the test object, wherein said predicted interaction property comprises a predicted interaction property between the target object and the test object with at least one of the objects in a different pose to the pose of that object in the combined object data on which the prediction is made.

4. The computer system of claim 2, or any claim dependent thereon, wherein the system is operable to operate the neural network on the combined object data to determine a likelihood of fit between the two objects, wherein the likelihood of fit is for any of a plurality of combinations of poses for the two objects, wherein said combination of poses includes poses not represented in the combined object data on which the neural network is operated.

5. The computer system of any preceding claim, wherein the at least one property of the target object comprises at least one of: (i) geometric surface data, (ii) a biochemical property, and (iii) an electromagnetic property; and

wherein the at least one property of the test object comprises at least one of: (i) geometric surface data, (ii) a biochemical property, and (iii) an electromagnetic property.

6. The computer system of any preceding claim, wherein for each test object, the system is configured to combine the target object data with the test object data to provide the combined object data.

7. The computer system of claim 6, wherein combining the target object data with the test object data to provide combined object data comprises concatenating the target object data and the test object data.

8. The computer system of claim 7, wherein the target object data comprises a voxel array, wherein said voxel array provides the three-dimensional representation of the at least one property of the target object;

wherein the test object data comprises a voxel array, wherein said voxel array provides the three-dimensional representation of the at least one property of the test object; and

wherein concatenating the target object data and the test object data comprises providing a combined voxel array in which the two voxel arrays are located adjacent to one another.

9. The computer system of claim 8, wherein providing combined object data comprises in the event that the target object voxel array and the test object voxel array are not a selected size, processing at least one of the target object voxel array and the test object voxel array so that they are the selected size, for example so that they are the same size.

10. The computer system of claim 9, wherein the selected size target object array and test object array are both cubic in shape.

11. The computer system of any preceding claim, wherein the interaction property provides an indication of a binding affinity between the target object and the test object.

12. The computer system of any preceding claim, wherein the computer system is configured to further analyse interaction properties between the target object and the test object in a plurality of different poses to determine a best fit.

13. The computer system of claim 12, as dependent on claim 11, wherein further analysis of interaction properties comprises performing a force-field analysis between the target object and the test object in the plurality of different poses to identify a pose with the highest binding affinity between the target object and the test object.

14. The computer system of either of claims 12 or 13, as dependent on claim 2, wherein in the event that an indication of a test object is output to the resource, the computer system is configured to obtain data indicative of a plurality of different poses for the target and/or test object, wherein said poses are different to the first target pose and the first test pose in the combined object data; and

wherein the computer system is configured to analyse the fit between the target and the test object for each of said poses.

15. The computer system of any preceding claim, wherein the plurality of test objects form part of a larger group of candidate test objects;

wherein, based on the selected subset of test objects and known degrees of similarity between test objects in the plurality of test objects and other candidate test objects not in the plurality of test objects, the computer system is configured to identify additional candidate test objects from the larger group of candidate test objects not in the plurality of test objects which also have an increased likelihood of fitting with the target object; and

wherein the computer system is configured to output to a resource an indication of the identified candidate test objects.

16. The computer system of any preceding claim, wherein the target object comprises a medicament, for example a drug molecule, and the test object comprises a possible carrier for the medicament, for example a drug carrier.

17. A method of predicting an interaction property between a target object and a plurality of test objects, the method comprising:

obtaining target object data comprising data indicative of a three-dimensional representation of at least one property of the target object; wherein, for each test object of the plurality of test objects, the method comprises: obtaining test object data comprising data indicative of a three-dimensional representation of at least one property of the test object; and

operating a neural network on combined object data to determine an indication of an interaction property between the target object and the test object, wherein the combined object data is based on both the target object data and the test object data, and wherein said neural network comprises at least one of a deep residual network, highway network, densely connected network and a capsule network;

outputting to a resource an indication of a selected subset of test objects determined to be above a selected threshold of likelihood of fitting with the target object, wherein said subset is selected based on the determined interaction properties.

18. The method of claim 17, wherein the combined object data is based on: (i) data indicative of the target object in a first target pose and (ii) data indicative of the test object in a first test pose; and

wherein the method comprises at least one of:

operating the neural network on the combined object data to predict an interaction property between the target object and the test object, wherein said predicted interaction property comprises a predicted interaction property between the target object and the test object with at least one of the objects in a different pose to the pose of that object in the combined object data on which the prediction is made; and

operating the neural network on the combined object data to determine a likelihood of fit between the two objects, wherein the likelihood of fit is for any of a plurality of combinations of poses for the two objects, wherein said combination of poses includes poses not represented in the combined object data on which the neural network is operated.

19. The method of any of claims 17 or 18, wherein the method comprises, for each test object, combining the target object data with the test object data to provide the combined object data.

20. The method of claim 19, wherein combining the target object data with the test object data to provide combined object data comprises concatenating the target object data and the test object data.

21. The method of claim 20, wherein the target object data comprises a voxel array, wherein said voxel array provides the three-dimensional representation of the at least one property of the target object;

wherein the test object data comprises a voxel array, wherein said voxel array provides the three-dimensional representation of the at least one property of the test object; and

wherein concatenating the target object data and the test object data comprises providing a combined voxel array in which the two voxel arrays are located adjacent to one another.

22. The method of claim 21, wherein providing combined object data comprises in the event that the target object voxel array and the test object voxel array are not a selected size, processing at least one of the target object voxel array and the test object voxel array so that they are the selected size, for example so that they are the same size.

23. The method of any of claims 17 to 22, wherein the method comprises further analysing interaction properties between the target object and the test object in a plurality of different poses to determine a best fit.

24. The method of claim 23, wherein further analysing interaction properties comprises performing a force-field analysis between the target object and the test object in the plurality of different poses to identify a pose with a highest binding affinity between the target object and the test object.

25. The method of either of claims 23 or 24, as dependent on claim 18, wherein in the event that an indication of a test object is output to the resource, the method comprises:

obtaining data indicative of a plurality of different poses for the target and/or test object, wherein said poses are different to the first target pose and the first test pose in the item of combined object data; and

analysing the fit between the target and the test object for each of said poses.

26. The method of any of claims 17 to 25, wherein the plurality of test objects form part of a larger group of candidate test objects and wherein the method comprises:

identifying, based on the selected subset of test objects and known degrees of similarity between test objects in the plurality of test objects and other candidate test objects not in the plurality of test objects, additional candidate test objects from the larger group of candidate test objects not in the plurality of test objects which also have an increased likelihood of fitting with the target object; and

outputting to a resource an indication of the identified candidate test objects.

27. A computer readable non-transitory storage medium comprising a program for a computer configured to cause a processor to perform the method of any of claims 17 to 26.

Description:
Interaction property prediction system and method

Field of the invention

The present disclosure relates to a system and method for predicting an interaction property between a target object and a plurality of objects.

Background

For a given test object (such as a drug molecule) and target object (such as a protein molecule), predicting how their two surfaces might dock with each other can be a difficult task. The test object may be docked with the target object in a plurality of different poses in order to assess its capacity to bind the target. There may also be some uncertainty and degree of freedom in this assessment owing to the nature of chemical bonds between atoms in a molecule (e.g. in the test object) that allow relative rotation between different parts of the molecule. This gives rise to potentially thousands of conformations that the test object may take on. In addition, it may be desirable to try to find the ideal orientation in three dimensions which allows the best docking between target and test objects. This can add more potential poses for assessment during a docking procedure.

Such assessments may be performed by computer systems that use various models to provide an indication of fit between the test object and the target object. Performing such calculations can take large amounts of time, especially when performing calculations for a single target object screened against thousands or millions of test objects (e.g. candidate drug molecules). There is therefore a need to develop more rapid ways to screen such test objects, which could hugely impact on, for example, drug discovery.

Summary

Aspects of the invention are as set out in the independent claims and optional features are set out in the dependent claims. Aspects of the invention may be provided in conjunction with each other and features of one aspect may be applied to other aspects.

The present disclosure relates to a system for use in a process of docking a test object with a target object. This process may involve detailed analysis (e.g. running simulations) of docking between a test object and a target object. This detailed analysis may involve analysing docking with both the test object and the target object in a large number of different poses. In addition, this detailed analysis may need to be performed on a large number of different test objects before a test object, and an orientation of the target and test object, is found which provides a satisfactory level of docking with the target object. This can be a highly time-consuming process.

Embodiments of the present claims may address this technical problem by providing a system which may act as a pre-filter for identifying a selection of potentially suitable test objects, from a larger group of test objects, for docking with a target object. This may enable detailed analysis of docking between the target and test objects to be focused only on selected test objects which are determined to be more likely to provide satisfactory docking. Selection of suitable test objects using embodiments of the present claims may reduce the number of test objects which are to be subjected to a detailed docking analysis for the target object. Consequently, this may provide a reduction in the time taken to perform the technical process of identifying a satisfactory test and orientation for docking with a target object.

In an aspect, there is provided a computer system for predicting an interaction property between a target object and a plurality of test objects. The system comprises: (i) a data store comprising executable instructions for a neural network, wherein said neural network comprises at least one of a Deep Residual Network (ResNet), a Highway Network, a Densely Connected Network (DenseNet) and a Capsule Network; and (ii) a processor coupled to the data store and configured to execute the stored instructions to operate the neural network. The system is configured to obtain target object data comprising data indicative of a three-dimensional representation of at least one property of the target object. For each test object of the plurality of test objects, the system is configured to: (a) obtain test object data comprising data indicative of a three-dimensional representation of at least one property of the test object; and (b) operate the neural network on combined object data to determine an indication of an interaction property between the target object and the test object, wherein the combined object data is based on both the target object data and the test object data. The system is configured to output to a resource an indication of a selected subset of test objects determined to be more likely to fit with the target object, wherein said subset is selected based on the determined interaction properties.

This may provide improvements in speed for selecting suitable test objects for docking with a target object. The neural network may be trained to determine that some of the test objects are more suitable for docking with the target object than others. Then, a detailed analysis of target-test object pairs can be limited to test objects determined to be suitable by the system. Thus, the system can filter out test objects for which no detailed analysis is to be performed.

Determination of suitable test objects may be made without multiple different analyses of the target and test object. For example, based on combined object data with the target object in a first target pose and the test object in a first test pose, the system may be operable to determine a likelihood of the target and test object docking satisfactorily. This likelihood may include a likelihood of them docking in different poses to those in the combined object data. For example, irrespective of the target and test pose in the combined object data, the system may be operable to recognise three-dimensional relationships between the target and test object without the need to re-analyse with different poses being used. The system may save time, as repeated analysis of the two objects in different poses is not needed, and unsuitable test objects can be filtered out before being subject to a more detailed analysis of the docking.

Drawings

Embodiments of the disclosure will now be described, by way of example only, with reference to the accompanying drawings, in which:

Fig. 1 shows a schematic illustration of an example computer system.

Fig. 2 shows a flowchart illustrating an example method for selecting test objects to output to a resource.

Fig. 3 shows a flowchart illustrating an example method of performing a detailed analysis of docking between a target object and a test object.

Fig. 4 shows a schematic diagram illustrating a structure of an example neural network.

Fig. 5 shows a graph comparing prediction accuracy when using convolution layers only, and when using a deep residual network.

Fig. 1 shows a schematic diagram of a computer system 100. The computer system 100 comprises a data store 110 and a processor 120. There is also a resource 130. The data store 110 is coupled to the processor 120, and the computer system 100 is coupled to the resource 130.

The data store 110 stores executable instructions for a neural network. The neural network may be at least one of a Deep Residual Network (ResNet), a Highway Network, a Densely Connected Network (DenseNet) and a Capsule Network, and is described in more detail below. The data store 110 is coupled to the processor 120 so that the processor 120 may read data from, and write data to, the data store 110. The data store 110, and the instructions for the neural network, may be stored in a suitable format such that the instructions are updatable by the processor 120. This may allow the processor 120 to update the instructions based on test-runs of the computer system 100 (hereinafter referred to as 'training the network'). The data store 110 may also store object data, such as test object data for a plurality of test objects.

The processor 120 is coupled to the data store 110 so that it can execute the stored instructions for the neural network. The processor 120 may therefore operate the neural network. If the data store 110 stores object data, the processor 120 may be coupled to the data store 110 so that it can retrieve the object data, for example so that it can operate the neural network on such object data and/or so that the object data may be communicated to the resource 130.

The resource 130 is coupled to the computer system 100 so that it may receive an indication from the computer system 100 of at least one test object for a more detailed analysis. For example, the resource 130 may be configured to further analyse interaction properties between target and test objects with the two in a plurality of different poses. The resource 130 may comprise a computer or other suitable device for providing a more detailed analysis of interaction properties between a target object and a test object.

Operation of the computer system 100 will be described in more detail with reference to Fig. 2 and operation of the resource 130 will be described in more detail with reference to Fig. 3.

Whilst the resource 130 is shown as being a separate component to the computer system 100, it is to be appreciated that this may not be the case. For example, the resource 130 may be part of the computer system 100. Although not shown, the computer system 100 may also include a communication system. For example, the communication system may be arranged to communicate with other devices, such as over a network. The communication system may be used when communicating between the computer system 100 and the resource 130. For example, the computer system 100 may be provided as a server configured to receive data from an external device and to analyse this data before sending it back to the device.

Fig. 2 shows a flow chart illustrating an example method, for example a method of operating the apparatus shown in Fig. 1.

At step 210 of the method, target object data is obtained for the target object.

The target object may be a three-dimensional object and the target object data may comprise data which is indicative of a three-dimensional representation of the target object; it may be representative of the target object in a first target pose. The target object data may comprise data representative of one or more properties of the three-dimensional object. The target object data may be representative of the shape of the object and its geometric surface, e.g. its contours and profile. The target object data may be representative of a biochemical property and/or an electromagnetic property. The target object data may be representative of the shape of the target object with additional data (such as electromagnetic/biochemical properties) included at a plurality of locations of the target object. The target object data may therefore provide an indication of the surface of the target object with e.g. an indication of electrostatic charge at each of a plurality of locations on the surface and/or within the target object itself.

The target object data may comprise data in the form of a three-dimensional array of voxels. Voxels in the array may comprise at least one value which provides an indication of at least one property at the location of that voxel. This value may be a single number, or it may be a vector, or other suitable data format. Each voxel may represent a volume, for example so that spacing between each neighbouring voxel is representative of a distance of separation in the actual target object, e.g. this distance may be 0.1, 0.2, 0.5, 1, 2 or 4 Angstroms. The voxel may therefore represent a value for a property at its respective volume of the target object. Voxels may be assigned a value where they represent the surface of the target object, and so if a voxel has a value, it will be considered part of the surface, and that value may represent a said property of the target object at that part of the surface of the target object. Although all voxels in the three-dimensional array have been described as being assigned a number, it is to be appreciated that only voxels which represent a surface of the object may be attributed a value. Obtaining the target object data may comprise receiving the target object data already in a suitable format (e.g. in a three-dimensional voxel array). It may comprise receiving data in a format which may be processed to be in a selected format for operation of the neural network on it. For example, the method may comprise constructing a three-dimensional surface in a three-dimensional voxel array. A midpoint of the surface may lie in the centre of the array. The three-dimensional array may be cube-shaped, e.g. in each dimension there are the same number of voxels.
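By way of a non-limiting illustration, a minimal NumPy sketch of placing surface points and their property values into such a cubic, centred voxel array might look as follows; the function name, grid size and resolution are illustrative assumptions rather than part of the claimed method:

```python
import numpy as np

def voxelise_surface(points, values, n=32, resolution=1.0):
    """Place surface points (coordinates in Angstroms) into an n x n x n voxel array.

    points: (P, 3) surface coordinates; values: (P,) property values (e.g. charge).
    The object is centred so that its midpoint lies at the centre of the array;
    voxels not on the surface keep a value of zero. Illustrative sketch only.
    """
    grid = np.zeros((n, n, n), dtype=np.float32)
    centred = points - points.mean(axis=0)                 # centre the object
    idx = np.round(centred / resolution).astype(int) + n // 2
    inside = np.all((idx >= 0) & (idx < n), axis=1)        # drop points outside the array
    for (i, j, k), v in zip(idx[inside], values[inside]):
        grid[i, j, k] = v                                  # property value at that surface voxel
    return grid
```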

At step 220 of the method, test object data is obtained for the test object.

As with the target object, the test object may be a three-dimensional object and the test object data may comprise data which is indicative of a three-dimensional representation of the test object; it may be representative of the test object in a first test pose. The test object data may be in the same format as the target object data. Obtaining the test object data may comprise obtaining data in the relevant format, or it may comprise converting data into the format described above (e.g. the voxel array). Test object data may be obtained from the data store 110, or it may be received from an external device. The test object may comprise a form of medicament, such as a drug molecule, ligand, biopolymer or other such substance.

At step 230 of the method, combined object data is obtained.

Combined object data is based on both the test object data and the target object data. The combined object data may be in the form of a three-dimensional voxel array. This combined object voxel array may comprise the target object voxel array and the test object voxel array. The target object voxel array may be located adjacent to the test object voxel array in the combined object voxel array. The combined object voxel array may comprise a three-dimensional representation of the target object in a first target pose and a three-dimensional representation of the test object in a first test pose. The combined object data may comprise a voxel array which is of a normalised size and shape for the neural network to process.

The method may comprise the step of combining the target object data and the test object data to obtain combined object data in the format described above. Combining object data may also comprise scaling at least one of the target object data and the test object data so that they are in a suitable format, e.g. they comprise a three-dimensional voxel array of a suitable size for use with the neural network. Combining the object data may comprise concatenating the target object voxel array and the test object voxel array. Concatenation of the two voxel arrays may comprise adding them together to form a larger voxel array which houses both arrays. In the event that each array is cube-shaped with a size of N x N x N, the combined object array will have a size of 2N x N x N, with the target object voxel array located next to the test object voxel array.
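As a non-limiting sketch, the concatenation of two cubic N x N x N voxel arrays into a single 2N x N x N combined array could be expressed as follows (the array sizes and the use of NumPy are illustrative assumptions):

```python
import numpy as np

# Two cubic N x N x N voxel arrays (channels omitted for simplicity).
target = np.random.rand(32, 32, 32).astype(np.float32)   # hypothetical target object array
test = np.random.rand(32, 32, 32).astype(np.float32)     # hypothetical test object array

# Concatenating along the first axis places the arrays adjacent to one
# another, giving a combined 2N x N x N array (here 64 x 32 x 32).
combined = np.concatenate([target, test], axis=0)
assert combined.shape == (64, 32, 32)
```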

Obtaining the combined object data may comprise processing at least one of the target object data and the test object data so that they conform to a selected size for use with the neural network. The target object and the test object may be of different sizes, and they may therefore be represented by voxel arrays of different sizes, as the scale of the voxel array may be the same for both (e.g. the distance of separation between voxels has a set length in Angstroms). Processing of the voxel arrays may comprise increasing the size of a smaller array by using a padding function. For example, use of a padding function may comprise increasing the size of the array by adding in voxels with no value or with zero value, so that the size of the voxel array increases but the size of the object represented by the voxel array remains the same. When a smaller object is located in a larger array, the object may be located centrally within the larger array, e.g. so that its centre of mass is at the centre of the array. Processing of the voxel arrays may comprise decreasing the size of a larger array by cropping the array. Cropping may comprise removal of voxels from a region (e.g. a periphery) of the array. The target object and test object voxel arrays may be processed so that each is of a selected size, e.g. so that they are both the same size, e.g. so they are both cubic.
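A minimal sketch of such padding and cropping, assuming zero-valued padding voxels and roughly central placement of the object, might be:

```python
import numpy as np

def to_selected_size(voxels, size):
    """Pad (with zero-valued voxels) or crop each axis so the array matches
    the selected size, keeping the represented object roughly central. Sketch only."""
    out = voxels
    for axis in range(3):
        diff = size - out.shape[axis]
        if diff > 0:                                      # smaller array: pad symmetrically
            pad = [(0, 0)] * 3
            pad[axis] = (diff // 2, diff - diff // 2)
            out = np.pad(out, pad, mode="constant")
        elif diff < 0:                                    # larger array: crop the periphery
            start = (-diff) // 2
            out = np.take(out, np.arange(start, start + size), axis=axis)
    return out
```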

For example, where the test object represents a drug ligand to be docked with a drug carrier, the drug carrier may be substantially larger than the drug ligand, and so there may be a large number of different surfaces or angles on which the ligand could bind to the carrier. If the combined object data is scaled so that all the surfaces of the drug carrier and all the surfaces of the drug ligand are represented in one voxel array, operation of the neural network may predict an interaction property between the two based on all these possible combinations, despite the two objects only being represented in one pose in the combined object data.

Although description has been made of actively combining the target object data with the test object data to provide combined object data, it is to be appreciated that the method may instead comprise a step of receiving combined data in the selected format, e.g. from an external device.

At step 240 of the method, the neural network is operated on the combined object data to determine an indication of an interaction property between the target object and the test object.

The structure, operation and training of the neural network are discussed in more detail below. In short, the neural network is configured (e.g. trained) to process an input (e.g. a voxel array) and to determine, based on the input, an indication of an interaction property between the target object and the test object. The network may be trained to learn and identify features of target objects and test objects known to (or not to) dock together. The presence of these features (or corresponding/analogous features) in other items of combined object data can then be identified with use of the network. It may operate on combined object data to provide an output indicative of a likelihood (e.g. a value from 0 to 1) of the target and test objects matching/being suitable for docking with each other. By training the network, it can learn to identify features or patterns which are indicative of the test object being a suitable match for the target object (or not). Operation of the neural network may enable test objects to be identified which are more likely to fit with the target object, e.g. based on an interaction property determined by the neural network. For example, in the event that the output of the neural network is a probability value indicative of the likelihood of two items fitting, items may be selected which have a probability value greater than a threshold value.

At step 250 of the method, the likelihood of fit between the target object and the test object is tested based on the output from the neural network.

The method may act as a filter for screening a selection of potentially suitable test objects from a plurality of known test objects. The cycle illustrated by steps 220 to 270 may be repeated iteratively for each test object in the plurality of known test objects. This may enable a selection to be made of test objects from the plurality of known test objects. These selected test objects may then be subject to a more detailed analysis to identify suitable test object(s) with a higher degree of precision. The plurality of known test objects may be selected as examples which are representative of an even larger group of candidate test objects. For example, objects in the plurality of known test objects may have properties in common with other objects in the larger group so that for a given test object in the plurality of known test objects, a positive match may also be indicative of a positive match with other test objects in the larger group. That way, the method may enable other test objects to be selected on the basis of a positive indication for one test object.
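The screening loop of steps 220 to 270 could be sketched, purely for illustration, as follows; `model`, the tensor shapes and the threshold of 0.5 are assumptions rather than part of the claimed method:

```python
import torch

def screen_test_objects(model, target_voxels, test_objects, threshold=0.5):
    """Screen a library of test objects against one target object.

    target_voxels: (1, N, N, N) tensor; test_objects: dict mapping a name to a
    (1, N, N, N) tensor. Returns the names of test objects whose predicted
    likelihood of fit exceeds the threshold. Names and shapes are illustrative.
    """
    selected = []
    model.eval()
    with torch.no_grad():
        for name, test_voxels in test_objects.items():
            # Concatenate along one spatial axis: (1, 2N, N, N) combined array.
            combined = torch.cat([target_voxels, test_voxels], dim=1)
            score = model(combined.unsqueeze(0)).item()   # likelihood of fit in [0, 1]
            if score > threshold:
                selected.append(name)                     # keep for detailed docking analysis
            # otherwise the test object is rejected (step 260)
    return selected
```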

If the outcome of step 250 is an indication that the test object in question is not likely to fit a target object, the method proceeds to step 260. At step 260, the test object is rejected, and the method returns to step 220 at which point another test object is selected for testing from the plurality of known test objects. In the event that all of the test objects in the plurality of test objects have been tested, no more test objects will be selected.

If the outcome of step 250 is an indication that the test object in question is likely to fit a target object, the method proceeds to step 270. At step 270 an indication of the test object is output to a resource. This step may be done after each occurrence of a test object that is determined to be likely to fit with the target object, or it may be done in batches (e.g. after all of the plurality of test objects have been tested). The step of outputting to a resource may comprise sending a message including an indication of the selected test object. As described above with reference to Fig. 1, the resource 130 may be a separate component or not, and the step of outputting to a resource 130 may comprise moving on to a next step in a method of selecting a suitable test object for docking with a target object and determining conditions (e.g. poses) for a best fit between the target object and the test object.

A method of determining conditions for a best fit between the target object and the test object will now be described with reference to Fig. 3.

At step 310 of the method, an indication of a test object is received.

This may comprise receiving the output from step 270 of the method of Fig. 2. The indication of the test object may comprise an identifier, based on which data for the test object may be retrieved, and/or it may comprise receiving data for the test object (e.g. receiving the test data voxel array). This step may also comprise receiving an indication of the target object and/or data for the target object.

At step 320 of the method, poses are selected for analysis.

During the screening method shown in Fig. 2, the combined object data may be based only on one pose for the target and test object, and the neural network is operable to provide an indication of whether or not the target object and test object are likely to fit. This indication of whether or not they are likely to fit may be representative of a likelihood of fit which is irrespective of the poses of the target and test object in the combined data, e.g. it may also encompass a likelihood of fit between the target and test object with either and/or both objects in other poses to the pose in the combined object data.

The method of Fig. 2 may provide a screening process for identifying potentially suitable test objects for docking with a target object. In order for a best fit to be identified between the target object and test object, multiple different poses may be analysed, and their fit simulated. This process may therefore be iterative, wherein for each iteration the fit between target and test object is analysed with an incremental change to the pose of either or both objects. At step 320, a pose for the target object is selected and a pose for the test object is selected. This step may comprise identifying a previous pose which has been analysed and modifying that pose accordingly.

At step 330, a force-field analysis is performed for the target object and test object in their selected poses.

A force-field analysis may comprise analysing the potential energy functions of the target and test objects in their selected poses. These energy functions may be known from stored data, e.g. from experimentation or derived using suitable scientific models (e.g. based on quantum physics). On the basis of these energy functions it may be possible to determine how well docking between the target and test object would work. It is to be appreciated that there may be a clear incompatibility for some poses and so no force-field analysis is applied between the target object and the test object for those poses, and the method returns to step 320.
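Purely as an illustrative sketch, and not as the specific force field used, a simplified pairwise interaction energy (a Coulomb-like term plus a Lennard-Jones-like term) could be used to score candidate poses; all parameter values below are assumptions:

```python
import numpy as np

def interaction_energy(target_xyz, target_q, test_xyz, test_q, epsilon=0.2, sigma=3.5):
    """Very simplified pairwise interaction energy between two rigid objects.
    Parameter values are purely illustrative; a real force field is far more detailed."""
    diff = target_xyz[:, None, :] - test_xyz[None, :, :]
    r = np.linalg.norm(diff, axis=-1) + 1e-6                       # pairwise distances
    coulomb = (target_q[:, None] * test_q[None, :]) / r            # charge interaction
    lj = 4 * epsilon * ((sigma / r) ** 12 - (sigma / r) ** 6)      # shape/contact term
    return float((coulomb + lj).sum())

def best_pose(target_xyz, target_q, test_xyz, test_q, poses):
    """Score each candidate pose (a rotation matrix and a translation vector)
    and return the lowest-energy pose, i.e. the best predicted fit."""
    scored = []
    for rotation, translation in poses:
        moved = test_xyz @ rotation.T + translation                # apply the pose
        energy = interaction_energy(target_xyz, target_q, moved, test_q)
        scored.append((energy, (rotation, translation)))
    return min(scored, key=lambda item: item[0])
```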

At step 340, the results from the force-field test are analysed to determine how well the target and test object fit. The output of the method may be to find different poses in which there is a fit of a sufficiently high level between the target and test object. The output of the method may be to identify the best possible fit between the target and test object. The output may be both. Depending on the type of output, step 340 may vary. For example, if the type of output is to identify the best possible fit, then at step 340 the iterative process may continue irrespective of the results of the force-field analysis until all of the selected poses have been analysed. At this stage, the method may then comprise selecting the pose for the target object and the pose for the test object which provide the best fit. In this case, the method would then proceed to step 350 by providing an indication of the pose of the target object and the pose of the test object which provide the best fit. In the case of providing an indication of all fits above a sufficiently high level, at step 340, each time a fit is above a threshold value the method may provide an output based on said fit, before returning to step 320 for analysis of another pose.

The method may comprise repeating steps 320 to 340 until all of the selected poses have been analysed. It is to be appreciated that theoretically, there may be an infinite number of possible orientations/conformations of the target and test object, and so only a finite number of poses will be selected. The selection of these poses may be chosen so that they are representative of as many of the possible poses as can practicably be achieved, e.g. the difference in orientation/conformation between subsequent poses may be selected based on both providing a sufficiently high number of analyses of different poses, and maintaining an achievable computational time for performing the force-field analysis.

At step 350, the method finishes once all of the relevant poses have been tested.

The neural network

The neural network comprises at least one of a deep residual network and/or a highway network and/or a densely connected network and/or a capsule network.

For any such type of network, the network comprises a plurality of different neurons, which are organised into different layers. Each neuron is configured to receive input data, process this input data and provide output data. Each neuron may be configured to perform a specific operation on its input, e.g. this may involve mathematically processing the input data. The input data for each neuron may comprise an output from a plurality of other preceding neurons. As part of a neuron’s operation on input data, each stream of input data (e.g. one stream of input data for each preceding neuron which provides its output to the neuron) is assigned a weighting. That way, processing of input data by a neuron comprises applying weightings to the different streams of input data so that different items of input data will contribute more or less to the overall output of a neuron. Adjustments to the value of the inputs for a neuron, e.g. as a consequence of the input weightings changing, may result in a change to the value of the output for that neuron. The output data from each neuron may be sent to a plurality of subsequent neurons.

The neurons are organised in layers. Each layer comprises a plurality of neurons which operate on data provided to them from the output of neurons in preceding layers. Within each layer there may be a large number of different neurons, each of which applies a different weighting to its input data and performs a different operation on its input data. The input data for all of the neurons in a layer may be the same, and the output from the neurons will be passed to neurons in subsequent layers.

The exact routing between neurons in different layers forms a major difference between capsule networks and deep residual networks (including variants such as highway networks and densely connected networks).

For a residual network, layers may be organised into blocks, such that the network comprises a plurality of blocks, each of which comprises at least one layer. For a residual network, output data from one layer of neurons may follow more than one different path. For conventional neural networks (e.g. convolutional neural networks), output data from one layer is passed into the next layer, and this continues until the end of the network so that each layer receives input from the layer immediately preceding it, and provides output to the layer immediately after it. However, for a residual network, a different routing between layers may occur. For example, the output from one layer may be passed on to multiple different subsequent layers, and the input for one layer may be received from multiple different preceding layers.

In a residual network, layers of neurons may be organised into different blocks, wherein each block comprises at least one layer of neurons. Blocks may be arranged with layers stacked together so that the output of a preceding layer (or layers) feeds into the input of the next block of layers. The structure of the residual network may be such that the output from one block (or layer) is passed into both the block (or layer) immediately after it and at least one other later subsequent block (or layer). Shortcuts may be introduced into the neural network which pass data from one layer (or block) to another whilst bypassing other layers (or blocks) in between the two. This may enable more efficient training of the network, e.g. when dealing with very deep networks, as it may enable problems associated with degradation to be addressed when training the network (which is discussed in more detail below). The arrangement of a residual neural network may enable branches to occur such that the same input provided to one layer, or block of layers, is provided to at least one other layer, or block of layers (e.g. so that the other layer may operate on both the input data and the output data from the one layer, or block of layers). This arrangement may enable a deeper penetration into the network when using back propagation algorithms to train the network. For example, this is because during learning, layers, or blocks of layers, may be able to take as an input, the input of a previous layer/block and the output of the previous layer/block, and shortcuts may be used to provide deeper penetration when updating weightings for the network.
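A minimal PyTorch sketch of such a residual block, with a shortcut that bypasses two convolutional layers, is shown below; the channel count and layer choices are illustrative assumptions, not the claimed architecture:

```python
import torch.nn as nn

class ResidualBlock3D(nn.Module):
    """A minimal 3D residual block: the block's input is added back to its
    output via a shortcut, so later layers see both the input and the
    processed data, which aids training of very deep networks."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv3d(channels, channels, kernel_size=3, padding=1)
        self.bn1 = nn.BatchNorm3d(channels)
        self.conv2 = nn.Conv3d(channels, channels, kernel_size=3, padding=1)
        self.bn2 = nn.BatchNorm3d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        identity = x                             # the shortcut path
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return self.relu(out + identity)         # shortcut bypasses the two conv layers
```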

For a capsule network, layers may be nested inside of other layers to provide 'capsules'. Different capsules may be adapted so that they are more proficient at performing different tasks than other capsules. A capsule network may provide dynamic routing between capsules so that for a given task, the task is allocated to the most competent capsule for processing that task. For example, a capsule network may avoid routing the output from every neuron in a layer to every neuron in the next layer. A lower level capsule is configured to send its input to a higher level (subsequent) capsule which is determined to be the most likely capsule to deal with that input. Capsules may predict the activity of higher layer capsules. For example, a capsule may output a vector, for which the orientation represents properties of an object in question. In response, each subsequent capsule may provide, as an output, a probability that the object that capsule is trained to identify is present in the input data. This information (e.g. the probabilities) can be fed back to the capsule, which can then dynamically determine routing weights, and forward the input data to the subsequent capsule most likely to be the relevant capsule for processing that data.
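For illustration, a minimal sketch of the routing-by-agreement procedure described by Sabour et al. (cited above) is given below; the tensor shapes and number of routing iterations are illustrative assumptions:

```python
import torch

def squash(s, dim=-1, eps=1e-8):
    # Non-linear "squash": keeps the vector's orientation, maps its length into (0, 1).
    norm_sq = (s ** 2).sum(dim=dim, keepdim=True)
    return (norm_sq / (1.0 + norm_sq)) * s / torch.sqrt(norm_sq + eps)

def dynamic_routing(u_hat, num_iters=3):
    """u_hat: predictions from lower capsules for each higher capsule,
    shape (batch, n_lower, n_higher, dim_higher)."""
    b = torch.zeros(u_hat.shape[:3], device=u_hat.device)     # routing logits
    for _ in range(num_iters):
        c = torch.softmax(b, dim=2)                           # coupling coefficients per lower capsule
        s = (c.unsqueeze(-1) * u_hat).sum(dim=1)              # weighted sum over lower capsules
        v = squash(s)                                         # higher-capsule output vectors
        b = b + (u_hat * v.unsqueeze(1)).sum(dim=-1)          # increase weights where prediction agrees
    return v
```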

For either type of neural network, there may be included a plurality of different layers which have different functions. The neural network may include at least one convolutional layer configured to convolve input data across its height and width. The neural network may also have a plurality of filtering layers, each of which comprises a plurality of neurons configured to focus on and apply filters to different portions of the input data. Other layers may be included for processing the input data such as pooling layers (to introduce non-linearity) such as maximum pooling and global average pooling, Rectified Linear Units layer (ReLU) and loss layers, e.g. some of which may include regularization functions. The final block of layers may receive input from the last output layer (or more layers if there are branches present). The final block may comprise at least one fully connected layer.

The final output layer may comprise a classifier, such as a softmax, sigmoid or tanh classifier. Different classifiers may be suitable for different types of output; for example, a sigmoid classifier may be suitable where the output is a binary classification. The neural network of the present disclosure may be configured to predict binding affinities between the target and test object, in which case the output may be a prediction of the value for the equilibrium dissociation constant. The output of the neural network may provide an indication of a probability that the target and test object fit. It may provide as an output an indication of whether or not a more detailed analysis of the fit between the target and test object is warranted, such as in a binary form, wherein a first output indicates 'yes' and a second output indicates 'no', in which case the network may act as a screen for pulling out a smaller group of compounds for which a more detailed examination is required.
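A small, non-limiting sketch of a 3D network of this general shape, ending in a fully connected layer and a sigmoid classifier that outputs a fit likelihood in [0, 1], might be assembled as follows; channel counts and depth are illustrative assumptions, and `ResidualBlock3D` is the block sketched earlier:

```python
import torch.nn as nn

# Convolution, pooling and ReLU layers, followed by global average pooling,
# a fully connected block and a sigmoid classifier. Illustrative only.
model = nn.Sequential(
    nn.Conv3d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
    nn.MaxPool3d(2),
    ResidualBlock3D(16),                      # residual block from the earlier sketch
    nn.Conv3d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool3d(1),                  # global average pooling
    nn.Flatten(),
    nn.Linear(32, 1),                         # fully connected final block
    nn.Sigmoid(),                             # binary "fit / no fit" likelihood
)
```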

Training the network

The neural network is configured to take in as an input an array, e.g. a three-dimensional (3D) voxel array. The array values may be vectorised and/or encoded, such as by using one-hot encoding to provide a binary format. This input is then fed into a set of 3D layers in the neural network. There are several features of this network which may be varied as training of the network proceeds. For each neuron, there may be a plurality of weightings, each of which is applied to a respective input stream for output data from neurons in preceding layers. These weightings are variables which can be modified to provide a change to the output of the neural network. These weightings may be modified in response to training so that they provide more accurate data. In response to having trained these weightings, the modified weightings are referred to as having been 'learned'. Additionally, the size and connectivity of the layers may be dependent upon the typical input data for the network, although these too may be variables which may be modified and learned during training, including the reinforcement of connections.
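As an illustrative sketch of such encoding, voxel property values could be binned and one-hot encoded into binary channels as follows; the bin edges and channel count are assumptions:

```python
import numpy as np

def one_hot_voxels(voxels, bin_edges):
    """One-hot encode an (N, N, N) voxel array of property values into a
    binary (channels, N, N, N) array, one channel per value bin."""
    n_channels = len(bin_edges) - 1
    idx = np.clip(np.digitize(voxels, bin_edges) - 1, 0, n_channels - 1)   # bin index per voxel
    one_hot = np.eye(n_channels, dtype=np.float32)[idx]                    # (N, N, N, C)
    return np.moveaxis(one_hot, -1, 0)                                     # channels-first (C, N, N, N)

# e.g. notional surface charges between -5 and +5 split into 10 binary channels:
# encoded = one_hot_voxels(voxel_array, np.linspace(-5, 5, 11))
```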

To train the network, e.g. to learn values for the weightings, these weightings are assigned an initial value. These initial values may essentially be random; however, to improve training of the network, a suitable initialisation for the values may be applied such as a Xavier/Glorot initialisation. Such initialisations may inhibit situations from occurring in which initial random weightings are too great or too small, and the neural network can never properly be trained to overcome these initial prejudices. This type of initialisation may comprise assigning weightings using a distribution having a zero mean but a fixed variance.
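In PyTorch, for example, such an initialisation might be applied as in the following sketch; the choice of xavier_normal_ and the module types are illustrative:

```python
import torch.nn as nn

def init_weights(module):
    # Xavier/Glorot initialisation: zero-mean weights whose variance is scaled by
    # the layer's fan-in/fan-out, avoiding starting weightings that are too great
    # or too small for the network to be trained past.
    if isinstance(module, (nn.Conv3d, nn.Linear)):
        nn.init.xavier_normal_(module.weight)
        if module.bias is not None:
            nn.init.zeros_(module.bias)

# model.apply(init_weights)   # applies the initialisation to every matching layer
```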

Once the weightings have been assigned, training object data may be fed into the neural network. This may comprise operating the neural network on known pairs of target object data and test object data. These pairs may comprise pairs which are known to definitely fit, and pairs which are known to definitely not fit. For example, artificial shapes may be created to be used as training object data. Fitting pairs may be assigned a first value (e.g. 1) and non-fitting pairs may be assigned a different value (e.g. 0). The neural network may be operated on the pairs to provide an output value. The output value from the neural network may then be compared with the known, or expected, value. For example, the output value from the neural network should be 1 if the pairs are expected to have a perfect match, and it should be zero if they are not. Known pairs may be modified so that a value for their fit is somewhere between 0 and 1 (and is known), and these modified pairs may also be used to train the network to identify different values for how good a fit is.

Based on the determined value for a pair, and the expected value for that pair, a backpropagation optimisation method, for example using gradient descent (e.g. stochastic gradient descent) and loss functions is performed on the network. Algorithms such as mini-batch gradient descent, RMSprop, Adam, Adadelta and Nesterov may be used during this process. This may enable an identification of how much each different point (neuron) or path (between neurons in subsequent layers) in the network is contributing to determining an incorrect score. The weightings may then be adjusted according to the error calculated. For example, to minimise or remove the contribution from neurons which contribute, or contribute the most, to an incorrect determination.
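A minimal sketch of one such training step, using a binary cross-entropy loss and the Adam optimiser as one of the algorithms mentioned above, might look as follows; the learning rate and weight decay values are illustrative assumptions:

```python
import torch
import torch.nn as nn

def train_step(model, optimiser, criterion, combined_voxels, labels):
    """One optimisation step on a mini-batch of combined voxel arrays.

    labels: tensor of shape (batch, 1); 1.0 for pairs known to fit, 0.0 for pairs
    known not to fit (or intermediate values for modified pairs, as above).
    """
    optimiser.zero_grad()
    predictions = model(combined_voxels)    # forward pass, values in [0, 1]
    loss = criterion(predictions, labels)   # compare output with the expected value
    loss.backward()                         # backpropagate the error
    optimiser.step()                        # adjust the weightings according to the error
    return loss.item()

# Example wiring (illustrative hyperparameters):
# criterion = nn.BCELoss()   # assumes the network ends in a sigmoid
# optimiser = torch.optim.Adam(model.parameters(), lr=1e-4, weight_decay=1e-5)
# loss_value = train_step(model, optimiser, criterion, batch_voxels, batch_labels)
```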

After an iteration of training the network with a different target-test pair, these weightings may be updated, and this process may be repeated a large number of times. To inhibit the likelihood of overtraining the network, training variables such as learning rate and momentum may be varied and/or controlled to be at a selected value. Additionally, regularisation techniques such as L2 or dropout may be used which reduce the likelihood of different layers becoming over-trained to be too specific for the training data, without being as generally applicable to other, similar data. Likewise, batch normalisation may be used to aid training and improve accuracy. In general, the weightings are adjusted so that the network would, if operated on the same training example again, produce the expected outcome. However, the extent to which this is true will be dependent on training variables such as learning rate.

It is to be appreciated that increasing the depth of neural networks may cause problems when training, e.g. due to vanishing gradient problems, and it may also provide slower networks. However, the present disclosure may enable the provision of a network having increased depth and accuracy without sacrificing the ability to adequately train the network.

Example

An example of a neural network will now be described with reference to Fig. 4, and an example of training and operating such a neural network will be described with reference to the results from said process, as illustrated in Fig. 5.

Fig. 4 shows a neural network 400. The network 400 is a deep residual network made up of a plurality of layers 440, each of which is illustrated as a dot with a rectangular box. Some of the layers 440 are organised into blocks 450, which comprise a plurality of different layers 440. The network 400 includes a plurality of branches 460 which provide a path from the output of one block to the output of a later block whilst bypassing the layers of an intermediate block. Some of the branches 460 form blocks 450, which include layers of their own, whilst still bypassing the layers of another block. Also shown in Fig. 4 is inset A, which provides a zoomed-in view of the structure of an individual block 450. The block 450 shown in inset A has a plurality of layers 440 for processing information (shown on the left-hand side). Additionally, there is shown a branch 460, which includes a couple of layers 440 for processing information. This inset shows one example, as it is clear from the larger network structure of Fig. 4 that some blocks will include branches having no layers in them.

The depth of the network may be selected to provide a balance between accuracy and the time taken to provide an output. Increasing the depth of the network may provide increased accuracy, although it may also increase the time taken to provide an output. Use of a branched structure (as opposed to that of a purely convolutional neural network) may enable sufficient training of the network to occur as the depth of the network increases, which in turn provides for an increased accuracy of the network.

A simulation was performed which is considered to be comparable to a typical situation for drug discovery, e.g. for determining a suitable target object (drug carrier) for docking with a test object (drug ligand). Different items of target object data were generated, each comprising a three-dimensional surface for a target object whose coordinates and surface values were generated randomly. The surface values were varied between -5 and +5 to represent notional surface charge at the surface of the target object, e.g. as could be typical on a protein molecule. Each surface was then used as a template to make a test object that would precisely match the target object.

To represent a typical drug ligand, the test object was trimmed (between 10-50% of the test surface size), its surface values inverted (e.g. -5 became +5), and its shape (3D coordinates) mutated such that it no longer perfectly matched the template surface. In addition, random noise was introduced into the surface values to ensure that the test object was not an exact inverted copy of the target object. Finally, the test object was rotated in all three dimensions by a random amount (from 0-359 degrees) and translated off centre (again by a random amount in all three dimensions) so that it did not register precisely with the surface of the target object. These were deemed to be "matched" surfaces since they should dock together (similar shape, opposite charges on surfaces). Non-matching surfaces were also generated in a corresponding, yet opposite manner, e.g. to construct surfaces that were "unmatched" in their shape and/or charge.
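A simplified sketch of generating such matched and unmatched surface pairs is given below; the point count, noise levels, single-axis rotation and the way an unmatched pair is produced are illustrative simplifications of the procedure described above:

```python
import numpy as np

rng = np.random.default_rng(0)

def make_pair(n_points=200, matched=True):
    """Generate one synthetic target/test surface pair. All numerical choices are illustrative."""
    target_xyz = rng.uniform(-10, 10, size=(n_points, 3))
    target_q = rng.uniform(-5, 5, size=n_points)               # notional surface charge

    keep = rng.uniform(0.5, 0.9)                               # trim 10-50% of the surface
    idx = rng.choice(n_points, int(keep * n_points), replace=False)
    test_xyz = target_xyz[idx] + rng.normal(0, 0.3, size=(len(idx), 3))   # mutate the shape
    test_q = -target_q[idx] + rng.normal(0, 0.5, size=len(idx))           # invert charge, add noise

    if not matched:
        test_q = rng.permutation(test_q)                       # scramble so shape and charge no longer correspond

    # Random rotation (about one axis, for brevity) and translation so the test
    # object does not register exactly with the target surface.
    angle = rng.uniform(0, 2 * np.pi)
    rot = np.array([[np.cos(angle), -np.sin(angle), 0],
                    [np.sin(angle),  np.cos(angle), 0],
                    [0, 0, 1]])
    test_xyz = test_xyz @ rot.T + rng.uniform(-3, 3, size=3)
    label = 1.0 if matched else 0.0
    return (target_xyz, target_q), (test_xyz, test_q), label
```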

Each target object and test object was placed into a separate 32 x 32 x 32 voxel array to represent their properties and structure. The two arrays were concatenated to form a 64 x 32 x 32 array which was then fed into a deep residual network of the type described herein and a convolution-only neural network. Matched surfaces were labelled '1' and unmatched surfaces were labelled '0'. The network was trained with 500,000 different target-test surface pairs.

As shown in Fig. 5, results from the deep residual network achieved over 90% accuracy at predicting matched versus unmatched target-test pairs. In comparison, a similarly deep network composed only of convolution layers (without the residual component) failed to learn effectively, reaching a peak accuracy of 70%. Networks composed of only convolutional layers (and not deep residual layers) were not able to accurately learn to differentiate between the target-test pairs. With regard to the network 400 of Fig. 4, the addition of an extra block of layers provided an accuracy of 96%, whilst calculations took approximately 25% more time. Training took approximately 1 week on a single Nvidia Tesla K80 graphics card.

Any processors used in the computer system 100 (and any of the activities and apparatus outlined herein) may be implemented with fixed logic such as assemblies of logic gates, or programmable logic such as software and/or computer program instructions executed by a processor. The computer system 100 may comprise a central processing unit (CPU) and associated memory, connected to a graphics processing unit (GPU) and its associated memory. Other kinds of programmable logic include programmable processors, programmable digital logic (e.g., a field programmable gate array (FPGA), a tensor processing unit (TPU), an erasable programmable read only memory (EPROM), an electrically erasable programmable read only memory (EEPROM), or an application specific integrated circuit (ASIC)), or any other kind of digital logic, software, code, electronic instructions, flash memory, optical disks, CD-ROMs, DVD ROMs, magnetic or optical cards, other types of machine-readable media suitable for storing electronic instructions, or any suitable combination thereof. Such data storage media may also provide the data store 110 of the computer system 100 (and any of the apparatus outlined herein).

It will be appreciated from the discussion above that the embodiments shown in the Figures are merely exemplary, and include features which may be generalised, removed or replaced as described herein and as set out in the claims. With reference to the drawings in general, it will be appreciated that schematic functional block diagrams are used to indicate functionality of systems and apparatus described herein. For example, the functionality provided by the data store 110 may in whole or in part be provided by a processor 120 having one or more data values stored on-chip. In addition, the processing functionality may also be provided by devices which are supported by an electronic device. It will be appreciated, however, that the functionality need not be divided in this way, and should not be taken to imply any particular structure of hardware other than that described and claimed below. The function of one or more of the elements shown in the drawings may be further subdivided, and/or distributed throughout apparatus of the disclosure. In some embodiments, the function of one or more elements shown in the drawings may be integrated into a single functional unit.

The above embodiments are to be understood as illustrative examples. Further embodiments are envisaged. It is to be understood that any feature described in relation to any one embodiment may be used alone, or in combination with other features described, and may also be used in combination with one or more features of any other of the embodiments, or any combination of any other of the embodiments. Furthermore, equivalents and modifications not described above may also be employed without departing from the scope of the invention, which is defined in the accompanying claims.

Whilst the above disclosure has been described with reference to systems and methods for drug discovery selectivity optimisation (e.g. finding suitable matches for docking with drugs), it is to be appreciated that this is not limiting. The present disclosure may be applicable to any situation which may benefit from target and test object analysis. For example, the present disclosure could also be applicable for use with agrochemicals such as pesticides. Additionally, the present disclosure may be applicable for drug re-purposing, toxicity prediction (e.g. studying on-target/off-target effects), providing personalised medicine, and in materials science.

In some examples, one or more memory elements can store data and/or program instructions used to implement the operations described herein. Embodiments of the disclosure provide tangible, non-transitory storage media comprising program instructions operable to program a processor to perform any one or more of the methods described and/or claimed herein and/or to provide data processing apparatus as described and/or claimed herein.

Certain features of the methods described herein may be implemented in hardware, and one or more functions of the apparatus may be implemented in method steps. It will also be appreciated in the context of the present disclosure that the methods described herein need not be performed in the order in which they are described, nor necessarily in the order in which they are depicted in the drawings. Accordingly, aspects of the disclosure which are described with reference to products or apparatus are also intended to be implemented as methods and vice versa. The methods described herein may be implemented in computer programs, or in hardware or in any combination thereof. Computer programs include software, middleware, firmware, and any combination thereof. Such programs may be provided as signals or network messages and may be recorded on computer readable media such as tangible computer readable media which may store the computer programs in non-transitory form. Hardware includes computers, handheld devices, programmable processors, general purpose processors, application specific integrated circuits (ASICs), field programmable gate arrays (FPGAs), and arrays of logic gates.

Other examples and variations of the disclosure will be apparent to the skilled addressee in the context of the present disclosure.