Title:
CRYPTOCURRENCIES MALWARE BASED DETECTION
Document Type and Number:
WIPO Patent Application WO/2017/167547
Kind Code:
A1
Abstract:
A computer implemented method to identify a computer security threat based on communication of a network connected device via a computer network, the method comprising: receiving a plurality of blocks of network traffic from the device, each block including a sequence of network traffic data items being identifiable by a position in the sequence of the block; identifying a subset of positions occurring in every block for which a degree of variability of values of data items in each position of the subset meets a predetermined threshold; generating executable code for performing a plurality of processing operations based on the identified subset of positions, the executable code consuming a determinate quantity of computing resources when executed for the received network traffic, wherein the executable code is suitable for detecting a subsequent network communication as a block of network traffic having a sequence of data items for which the identified subset of positions fails to exhibit a degree of variability meeting the predetermined threshold, the detection being based on a comparison of a measure of resources consumed by a computer system executing the executable code and the determinate quantity of computing resources, and the detection corresponding to the identification of a computer security threat.

Inventors:
SMITH KARL (GB)
EL-MOUSSA FADI (GB)
Application Number:
PCT/EP2017/055090
Publication Date:
October 05, 2017
Filing Date:
March 03, 2017
Assignee:
BRITISH TELECOMM (GB)
International Classes:
G06F21/50; G06F21/55; G06F21/56; H04L29/06
Domestic Patent References:
WO2015128612A1 2015-09-03
Other References:
ANONYMOUS: "Who will protect users from ethereum based malware? : ethereum", 28 March 2016 (2016-03-28), XP055306678, Retrieved from the Internet [retrieved on 20160929]
ALEX BIRYUKOV ET AL: "University of Luxembourg", 19 January 2016 (2016-01-19), Luxemburg, XP055306767, Retrieved from the Internet [retrieved on 20160929]
DANIEL PLOHMANN ET AL: "Case study of the Miner Botnet", CYBER CONFLICT (CYCON), 2012 4TH INTERNATIONAL CONFERENCE ON, IEEE, 5 June 2012 (2012-06-05), pages 1 - 16, XP032204318, ISBN: 978-1-4673-1270-7
SOOD ADITYA K ET AL: "An Empirical Study of HTTP-based Financial Botnets", IEEE TRANSACTIONS ON DEPENDABLE AND SECURE COMPUTING, IEEE SERVICE CENTER, NEW YORK, NY, US, vol. 13, no. 2, 1 March 2016 (2016-03-01), pages 236 - 251, XP011602943, ISSN: 1545-5971, [retrieved on 20160310], DOI: 10.1109/TDSC.2014.2382590
Attorney, Agent or Firm:
ROBERTS, Scott (GB)
Claims:
CLAIMS

1. A computer implemented method to identify a computer security threat based on communication of a network connected device via a computer network, the method comprising:

receiving a plurality of blocks of network traffic from the device, each block including a sequence of network traffic data items being identifiable by a position in the sequence of the block;

identifying a subset of positions occurring in every block for which a degree of variability of values of data items in each position of the subset meets a predetermined threshold;

generating executable code for performing a plurality of processing operations based on the identified subset of positions, the executable code consuming a determinate quantity of computing resources when executed for the received network traffic,

wherein the executable code is suitable for detecting a subsequent network communication as a block of network traffic having a sequence of data items for which the identified subset of positions fails to exhibit a degree of variability meeting the predetermined threshold, the detection being based on a comparison of a measure of resources consumed by a computer system executing the executable code and the determinate quantity of computing resources, and the detection corresponding to the identification of a computer security threat.

2. The method of claim 1 wherein the executable code is Ethereum code.

3. The method of any preceding claim wherein the device is an internet of things device.

4. The method of any preceding claim wherein the device has associated a unique identifier and the executable code has associated the unique identifier.

5. The method of any preceding claim wherein the predetermined threshold is defined to identify an absence of variability of values of data items in each position of the subset.

6. The method of any preceding claim wherein the step of identifying a subset of positions includes using a machine learning algorithm to identify positions in every block at which data items exhibit at least a predetermined degree of consistency.

7. The method of claim 6 wherein the machine learning algorithm is an unsupervised algorithm such as an autoencoder.

8. The method of any of claim 6 or claim 7 wherein the machine learning algorithm is a restricted Boltzmann machine.

9. A computer system including a processor and memory storing computer program code for performing the steps of any preceding claim.

10. A computer program element comprising computer program code to, when loaded into a computer system and executed thereon, cause the computer to perform the steps of a method as claimed in any of claims 1 to 8.

Description:
CRYPTOCURRENCIES MALWARE BASED DETECTION

The present invention relates to the identification of threats in network communication between network connected devices.

Devices are increasingly becoming network connected by connection to computer networks for communication with clients, servers, each other, publication of information or other purposes. This trend has been described as developing an "internet of things" (IoT) in which devices of many potentially disparate kinds and purposes are network connected, including, inter alia: domestic appliances and equipment; utility supply and control apparatus such as energy supply and control; commercial machinery and plant; vehicles; sensors and detectors; lighting; heating; media devices including audio and video; medical devices; learning aids; timepieces; data storage devices; food preparation and storage devices; agricultural apparatus; human and animal monitoring devices; personal possessions; articles of fashion including clothing and footwear; roadside apparatus such as traffic monitors; street furniture; and many other devices and apparatus as will be apparent to those skilled in the art. The motivation for network connection of such devices can be varied including, for example: a desire to share information about a state, configuration, presence, environment, locality or arrangement of a device; communication of events, alerts, states or state changes relating to a device; for multiple devices to collaborate, coexist, cooperate, communicate or the like; to generate sensory output for subsequent consumption, recording or the like; for control of devices such as by network configuration, control, installation, modification, operation and the like; and many other purposes as will be apparent to those skilled in the art.

Each network connected device presents a potential vulnerability to a network and other devices connected thereto which malicious agents or entities might seek to exploit for malicious purposes. For example, network connected devices can be subject to spoofing, unauthorised access, unauthorised modification and/or unauthorised use. Such network connected devices can be furnished with little processing resource (so as to reduce manufacturing and operating costs, for example) and traditional security mechanisms such as intrusion detection services, antimalware services, firewalls and antivirus services may be difficult to accommodate for or by the device without unduly impacting the limited resource of the device or other operation of the device or may simply be too costly in view of the value or cost of the device.

Responsibility for monitoring for network threats can be deferred to and discharged by network components such as routers, switches, proxies or dedicated network security or service apparatus shared or protected by potentially numerous network connected devices. However, a particular challenge with IoT network connected devices, in view of a potentially wide distribution of such devices across networks and in view of potentially many different versions of such devices occurring variously throughout the networks, is a need to consistently apply threat detection for similar devices across the entire network. For example, a first version or release of a network connected domestic appliance may be susceptible to a first security threat. A second version or release of the same domestic appliance may not be susceptible to the first threat but may be susceptible to a second threat. Thus deployments across multiple interconnected networks of mixtures of both versions of the domestic appliance need to accommodate identification of and/or protection against both threats sensitive to the differences between versions. This problem is particularly acute in view of the growing trend to employ software or firmware for IoT devices as a mechanism for updating, reviewing, renewing or refreshing devices such that two identical IoT devices can execute different software or firmware versions and be exposed to correspondingly different threats. Indeed, the very network connected nature of such IoT devices leads to the propensity for their updating by software and/or firmware.

Thus there is a need to address the aforementioned challenges.

The present invention accordingly provides, in a first aspect, a computer implemented method to identify a computer security threat based on communication of a network connected device via a computer network, the method comprising: receiving a plurality of blocks of network traffic from the device, each block including a sequence of network traffic data items being identifiable by a position in the sequence of the block; identifying a subset of positions occurring in every block for which a degree of variability of values of data items in each position of the subset meets a predetermined threshold; generating executable code for performing a plurality of processing operations based on the identified subset of positions, the executable code consuming a determinate quantity of computing resources when executed for the received network traffic, wherein the executable code is suitable for detecting a subsequent network communication as a block of network traffic having a sequence of data items for which the identified subset of positions fails to exhibit a degree of variability meeting the predetermined threshold, the detection being based on a comparison of a measure of resources consumed by a computer system executing the executable code and the determinate quantity of computing resources, and the detection corresponding to the identification of a computer security threat.

Preferably the executable code is Ethereum code.

Preferably the device is an internet of things device. Preferably the device has associated a unique identifier and the executable code has associated the unique identifier.

Preferably the predetermined threshold is defined to identify an absence of variability of values of data items in each position of the subset. Preferably the step of identifying a subset of positions includes using a machine learning algorithm to identify positions in every block at which data items exhibit at least a predetermined degree of consistency.

Preferably the machine learning algorithm is an unsupervised algorithm such as an autoencoder. Preferably the machine learning algorithm is a restricted Boltzmann machine.

The present invention accordingly provides, in a second aspect, a computer system including a processor and memory storing computer program code for performing the steps described above.

The present invention accordingly provides, in a third aspect, a computer program element comprising computer program code to, when loaded into a computer system and executed thereon, cause the computer to perform the steps of the method set out above.

Embodiments of the present invention will now be described, by way of example only, with reference to the accompanying drawings, in which:

Figure 1 is a block diagram of a computer system suitable for the operation of embodiments of the present invention;

Figure 2 is a component diagram of a system to identify computer security threats based on communication of a network connected device via a computer network in accordance with an embodiment of the present invention;

Figure 3 is a flowchart of a method for identifying computer security threats based on communication of a network connected device via a computer network in accordance with an embodiment of the present invention; and

Figure 4 is a component diagram of an arrangement of a distributed embodiment of the present invention.

Figure 1 is a block diagram of a computer system suitable for the operation of embodiments of the present invention. A central processor unit (CPU) 102 is communicatively connected to a storage 104 and an input/output (I/O) interface 106 via a data bus 108. The storage 104 can be any read/write storage device such as a random access memory (RAM) or a non-volatile storage device. An example of a non-volatile storage device includes a disk or tape storage device. The I/O interface 106 is an interface to devices for the input or output of data, or for both input and output of data. Examples of I/O devices connectable to I/O interface 106 include a keyboard, a mouse, a display (such as a monitor) and a network connection.

Figure 2 is a component diagram of a system to identify computer security threats based on communication of a network connected device 206 via a computer network 200 in accordance with an embodiment of the present invention. The network connected device 206 is conceivably any network connected device such as devices hereinbefore described including IoT devices. The network device 206 communicates via a computer network 200 such as a wired or wireless communications network employing one or more network protocols. The device 206 is operable in communication with a network component 208 such as a network appliance, network connected server or the like. For example, network component 208 is a router, switch, hub, proxy, server or other network connected device. In one embodiment the network component 208 is an internet access point such as a home router, wired or wireless access point, or a Home Hub provided by BT. The network component 208 receives communications from the device 206 as blocks of network traffic 202 (one illustrated) including a plurality of network traffic data items 204a to 204n. For example, a block 202 of network traffic can include a packet, message, frame, datagram, or other transmission unit or part thereof. The data items 204a to 204n inside block 202 are sequenced such that each data item is identifiable by a position in the sequence. Thus data item 204a can be said to have a first (or zeroth) position in the block, and so on. Data items 204a to 204n can be individual fields within the block 202, fixed length data items such as a fixed number of bytes, a single byte or even a single bit.

The network component 208 includes a validator generator 212 for generating a validator software routine for identifying a computer security threat as described below. The validator generator 212 is a hardware, software, firmware or combination component. The validator generator 212 initially identifies the network connected device 206 based on the network traffic 202 such as by extracting an identifier, name, model, version, revision or other identification of the device 206. The validator generator 212 subsequently measures a degree of variability of values for each of the data items 204a to 204n across a plurality of blocks of network traffic 202. Thus the validator generator 212 accesses multiple blocks 202, each including a plurality of sequenced data items 204a to 204n, and for each of the data items a degree of variability of values of the data item is identified. The degree of variability measured by the validator generator 212 is most preferably determined by a machine learning technique such as an autoencoder to receive and process a sequence of data items 204a to 204n for each block of network traffic 202 and autoencode the data items to identify positions of data items in the sequence of all blocks that exhibit a degree of variability of values that meets a predetermined threshold. For example, identifying positions of data items in all blocks for which the variability is zero or very low can be desirable. Such identified positions therefore constitute a subset of positions occurring in every block for which a degree of variability of values of data items in each position in the subset meets the predetermined threshold. An example of the application of autoencoding machine learning techniques to network traffic is described in "The Applications of Deep Learning on Traffic Identification" (Zhanyi Wang, 2015). 
In a preferred embodiment the autoencoding process is undertaken by use of a restricted Boltzmann machine so as to provide efficient autoencoding such as is described in the paper "An Introduction to Restricted Boltzmann Machines" (Asja Fischer and Christian Igel, 2012, in "Progress in Pattern Recognition, Image Analysis, Computer Vision and Applications" Volume 7441 of the series Lecture Notes in Computer Science pp14-36).
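By way of illustration only, the identification of a subset of low variability positions can be sketched as follows. This sketch deliberately substitutes a simple count of distinct byte values per position for the autoencoder or restricted Boltzmann machine preferred above; the function name `low_variability_positions`, the `max_distinct` threshold parameter and the sample blocks are hypothetical and do not appear in the application.

```python
from typing import List, Sequence


def low_variability_positions(blocks: Sequence[bytes], max_distinct: int = 1) -> List[int]:
    """Return positions occurring in every block whose values vary little.

    Assumption: variability is measured as the number of distinct byte
    values seen at a position, and the threshold is met when that number
    does not exceed max_distinct (1 means no variability at all).
    """
    if not blocks:
        return []
    # Only positions present in every block can belong to the subset.
    common_length = min(len(block) for block in blocks)
    subset = []
    for position in range(common_length):
        distinct_values = {block[position] for block in blocks}
        if len(distinct_values) <= max_distinct:
            subset.append(position)
    return subset


# Hypothetical blocks of traffic from one device: bytes 0, 1 and 3 never change.
sample_blocks = [
    bytes([0x45, 0x00, 0x10, 0xAA, 0x01]),
    bytes([0x45, 0x00, 0x22, 0xAA, 0x02]),
    bytes([0x45, 0x00, 0x37, 0xAA, 0x03]),
]
subset_of_positions = low_variability_positions(sample_blocks)
print(subset_of_positions)
```

A learned model would be expected to tolerate noise and richer field structure than this per-byte count, which is offered only to make the notion of a subset of positions concrete.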

Thus the validator generator 212 identifies a subset of positions in blocks of network traffic 202 having low or no variability according to a threshold and such positions therefore serve to characterise the blocks of network traffic 202 for the device 206. Accordingly, the subset of positions in the blocks of network traffic 202 and one or more examples of blocks of network traffic 202 itself can be used for subsequent network communication for the device 206 or devices identical to the device 206 to identify subsequent blocks of network traffic having sequences of data items that are inconsistent with the learned low variability positions. The examples of the network traffic 202 used to identify the subset of positions, which can also be used to confirm conformance with the characteristics of the network traffic 202 based on the subset of positions, are stored as exemplar network traffic for comparison with subsequent network traffic. Such identified subsequent blocks of network traffic can be flagged as potentially problematic communication as they are inconsistent with expected communication for the network connected device 206. For example, such identified inconsistent blocks of network traffic can be discarded, prevented from ongoing communication, flagged for investigation of the source device, or cause the source device to be scanned, reviewed or otherwise processed by a security service or software such as malware detection or intrusion detection facilities and the like.

Preferably the validator generator 212 associates the identifier of the device 206 with the identified subset of positions in order to ensure the subset of positions are applied only to monitor subsequent traffic originating from an identical (or determined to be compatible) network connected device 206.

The mechanism for checking subsequent network traffic based on the identified subset of positions of low variability data items is deployed using executable code that can be communicated to, and executed by, a validator executor 214 at the network component or network components elsewhere in a computer network or in a different computer network for monitoring blocks of network traffic arising from identical or compatible network connected devices 206 arising elsewhere in the network or in the different network. Thus the validator generator 212 is further adapted to generate executable code for, when executed by the validator executor 214, performing a plurality of processing operations based on the identified subset of positions. A key characteristic of the executable code is that it is adapted to consume a determinate quantity of computing resources when executed for a block of network traffic having data items with low variability from the identified exemplar network traffic 202 at positions identified in the subset of positions. That is to say that a deviation of network traffic from the exemplar network traffic 202 at the positions in the subset of positions is identified by a deviation in the computing resources consumed by the executable code from the determinate quantity of resource. Thus the executable code is provided so as to involve the determinate quantity of resource for consistent network traffic and in all other circumstances to involve a quantity of resource that deviates from the determinate quantity. This can be achieved, for example, by performing a comparison between the exemplar block of network traffic and a subsequent block at each data item position in the subset of positions for a degree of variability within the predetermined threshold so causing a path of computing logic that necessarily involves a determinate consumption of computing resource. 
Any data items that fail to exhibit the requisite degree (or range of degrees) of variability will fail to follow such path and the consumption of resource will deviate from the determinate consumption. Thus the validator executor 214 is a component of the network component 208 that is adapted to execute the executable code generated by the validator generator 212.
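The principle of detecting deviation through resource consumption can be illustrated with a minimal sketch. It assumes an explicit counter of abstract cost units in place of real metering, exact equality with the exemplar as the variability threshold, and arbitrary per-operation costs; the names `build_validator`, `is_threat` and the cost constants are hypothetical.

```python
from typing import Callable, Iterable, Tuple

# Hypothetical cost units charged for each kind of operation.
COST_COMPARE = 3        # charged for every comparison performed
COST_MATCH_PATH = 5     # charged only when a data item matches the exemplar
COST_MISMATCH_PATH = 1  # charged only when a data item deviates


def build_validator(exemplar: bytes, positions: Iterable[int]) -> Tuple[Callable[[bytes], int], int]:
    """Generate a validator and the determinate cost for consistent traffic."""
    expected = {position: exemplar[position] for position in positions}
    determinate_cost = len(expected) * (COST_COMPARE + COST_MATCH_PATH)

    def run(block: bytes) -> int:
        """Process a block and return the cost units consumed."""
        used = 0
        for position, value in expected.items():
            used += COST_COMPARE
            if position < len(block) and block[position] == value:
                used += COST_MATCH_PATH
            else:
                used += COST_MISMATCH_PATH
        return used

    return run, determinate_cost


def is_threat(run: Callable[[bytes], int], determinate_cost: int, block: bytes) -> bool:
    """Flag a block when consumption deviates from the determinate quantity."""
    return run(block) != determinate_cost


exemplar_block = bytes([0x45, 0x00, 0x10, 0xAA, 0x01])
run_validator, expected_cost = build_validator(exemplar_block, [0, 1, 3])

consistent_block = bytes([0x45, 0x00, 0x99, 0xAA, 0x07])  # differs only at free positions
deviating_block = bytes([0x45, 0x01, 0x99, 0xAA, 0x07])   # differs at position 1

print(is_threat(run_validator, expected_cost, consistent_block))
print(is_threat(run_validator, expected_cost, deviating_block))
```

Because the matching and deviating paths are charged differently, any deviation at a position in the subset changes the total, so the party checking the result needs only the consumed quantity and not the traffic itself.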

In one embodiment the executable code is provided as Ethereum code such as an Ethereum account for execution by one or more network components 208 as an Ethereum miner as described in detail in "Ethereum: A Secure Decentralised Generalised Transaction Ledger" (Dr. Gavin Wood, 2015) and "A Next-Generation Smart Contract and Decentralized Application Platform" (Ethereum White Paper, 2016, github.com/ethereum/wiki/wiki/White-Paper). In such an embodiment the validator executor 214 is a component of an Ethereum network, blockchain or system, such as an Ethereum miner. Ethereum code is beneficial because resource consumption by Ethereum miners is consistently the same for the same code and is charged by way of the virtual "ether" currency. Accordingly, identity in measures of resource consumed by even disparate computer systems executing the Ethereum code can be confirmed by recognising the same extent of expenditure of "ether" (or "gas" as described in the Ethereum papers) for execution of the code. Ethereum accounts or contracts can encode the executable code and further store the exemplar block of network traffic or at least the data items at each position in the subset of positions for reference at runtime when processing a subsequent block of network traffic. Yet further, an association of an identifier of the network connected device 206 with the executable code provides for assurance at execution time that the comparisons of data items at the subset of positions are appropriate for a device as origin of a block of network traffic. Thus, in this way, embodiments of the present invention provide identification of computer security threats by way of deviations from expected resource consumption by executable code performing a plurality of processing operations based on an identified subset of positions of low variability data items in network traffic. The subset of positions can be determined based on an unsupervised machine learning approach such as autoencoding so avoiding a need for user provided definitions. 
An identifier of a network connected device 206 can be employed in association with the subset of positions to ensure validator executors 214 located potentially remotely or in disparate arrangements or having many disparate or differing versions of network connected devices 206 can identify appropriate blocks of network traffic for processing with reference to a particular definition of a subset of positions. Further, the validator executor 214 provides for the consistent execution of executable code generated based on the identified subset of positions so that a comparison of computing resource consumed by execution serves as an indicator of deviation from an expected network traffic to identify potential threats communicated via the network 200.

Figure 3 is a flowchart of a method for identifying computer security threats based on communication of a network connected device via a computer network in accordance with an embodiment of the present invention. Initially, at step 302, a plurality of blocks of network traffic are received from the device 206. Each block includes a sequence of network traffic data items being identifiable by a position in the sequence of the block. At step 304 a subset of positions occurring in every block for which a degree of variability of values of data items in each position of the subset meets a predetermined threshold is identified. At step 306 executable code is generated for performing a plurality of processing operations based on the identified subset of positions. The executable code consumes a determinate quantity of computing resources when executed for the received network traffic and is therefore suitable for detecting a subsequent network communication that fails to exhibit a degree of variability meeting a predetermined threshold. The detection is based on a comparison of a measure of resources consumed by a computer system executing the executable code and the determinate quantity of computing resources, and the detection corresponds to the identification of a computer security threat.

Figure 4 is a component diagram of an arrangement of a distributed embodiment of the present invention. In the arrangement of Figure 4 multiple network components 408a, 408b are provided, each being in network communication with a plurality of network connected devices. Network component 408a performs the method of Figure 3 to generate executable code based on identified subset of positions of data items as Ethereum code. The Ethereum code is communicated to a server 410 as a central authority for the exchange of executable code for the monitoring of network traffic. The Ethereum code has associated an identification of a particular network device, type of network device, compatible network device, release or version of network device or the like. The server 410 subsequently propagates the Ethereum code to the second network component 408b which employs the Ethereum code for the monitoring of network traffic from network devices identified in association with the Ethereum code. Thus, in this way the network components 408a, 408b devise and share threat detection code along with an identification of network connected devices for which such code is appropriate. In the event that either network component 408a, 408b identifies a deviation in resource consumption by the executable Ethereum code for network traffic from a compatible device, the network component 408a, 408b can take remedial, protective or corrective action such as by discarding the network traffic, intervening with/disconnecting an originating network device, or flagging the potential or actual threat to a user. Additionally, such identified actual or potential threats can be communicated to the server 410 along with a block of network traffic for which the threat was identified for further or more detailed analysis. Additionally, the server 410 can communicate threat identifications between network components to elevate security levels consistently across the network(s).
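The distributed arrangement of Figure 4 can be sketched, under stated assumptions, as a central registry holding validators keyed by device identification. The sketch models propagation as a lookup by the second network component rather than a push from the server, and uses a plain cost counter rather than Ethereum code; the class names `ValidatorRecord`, `Server` and `NetworkComponent`, the identifier "appliance-v1" and the unit costs are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class ValidatorRecord:
    """Hypothetical stand-in for executable code tied to a device identification."""
    device_id: str
    positions: Tuple[int, ...]
    exemplar: bytes
    expected_cost: int


def execute(record: ValidatorRecord, block: bytes) -> int:
    """Charge 1 unit per comparison and 1 further unit per matching data item."""
    cost = 0
    for position in record.positions:
        cost += 1
        if position < len(block) and block[position] == record.exemplar[position]:
            cost += 1
    return cost


class Server:
    """Central authority (server 410) exchanging validators and threat reports."""

    def __init__(self) -> None:
        self._records: Dict[str, ValidatorRecord] = {}
        self.reports: List[Tuple[str, bytes]] = []

    def publish(self, record: ValidatorRecord) -> None:
        self._records[record.device_id] = record

    def fetch(self, device_id: str) -> Optional[ValidatorRecord]:
        return self._records.get(device_id)

    def report(self, device_id: str, block: bytes) -> None:
        self.reports.append((device_id, block))


class NetworkComponent:
    """Network component (408a or 408b) monitoring traffic from its devices."""

    def __init__(self, server: Server) -> None:
        self.server = server

    def handle(self, device_id: str, block: bytes) -> str:
        record = self.server.fetch(device_id)
        if record is None:
            return "unmonitored"  # no validator exists for this device
        if execute(record, block) != record.expected_cost:
            self.server.report(device_id, block)  # forward for detailed analysis
            return "discarded"
        return "forwarded"


server = Server()
exemplar = bytes([0x45, 0x00, 0x10, 0xAA, 0x01])
# The first component publishes; expected cost is 2 units for each of 3 positions.
server.publish(ValidatorRecord("appliance-v1", (0, 1, 3), exemplar, expected_cost=6))

second_component = NetworkComponent(server)
print(second_component.handle("appliance-v1", bytes([0x45, 0x00, 0x77, 0xAA, 0x09])))
print(second_component.handle("appliance-v1", bytes([0x13, 0x00, 0x77, 0xAA, 0x09])))
print(second_component.handle("appliance-v2", exemplar))
```

Keying the registry on the device identification reflects the requirement that a validator be applied only to traffic from an identical or compatible device; a different version of the same appliance is simply left unmonitored here until a validator is published for it.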

Insofar as embodiments of the invention described are implementable, at least in part, using a software-controlled programmable processing device, such as a microprocessor, digital signal processor or other processing device, data processing apparatus or system, it will be appreciated that a computer program for configuring a programmable device, apparatus or system to implement the foregoing described methods is envisaged as an aspect of the present invention. The computer program may be embodied as source code or undergo compilation for implementation on a processing device, apparatus or system or may be embodied as object code, for example.

Suitably, the computer program is stored on a carrier medium in machine or device readable form, for example in solid-state memory, magnetic memory such as disk or tape, optically or magneto-optically readable memory such as compact disk or digital versatile disk etc., and the processing device utilises the program or a part thereof to configure it for operation. The computer program may be supplied from a remote source embodied in a communications medium such as an electronic signal, radio frequency carrier wave or optical carrier wave. Such carrier media are also envisaged as aspects of the present invention.

It will be understood by those skilled in the art that, although the present invention has been described in relation to the above described example embodiments, the invention is not limited thereto and that there are many possible variations and modifications which fall within the scope of the invention. The scope of the present invention includes any novel features or combination of features disclosed herein. The applicant hereby gives notice that new claims may be formulated to such features or combination of features during prosecution of this application or of any such further applications derived therefrom. In particular, with reference to the appended claims, features from dependent claims may be combined with those of the independent claims and features from respective independent claims may be combined in any appropriate manner and not merely in the specific combinations enumerated in the claims.