

Title:
PROVABLY SECURE VIRUS DETECTION
Document Type and Number:
WIPO Patent Application WO/2016/049225
Kind Code:
A1
Abstract:
We present the first provably secure defense against software viruses. We hide a secret in the program code in a way that ensures that, as long as the system does not leak information about it, any injection of malware will destroy the secret with very high probability. Once the secret is destroyed, its destruction and therefore also the injection of malware will be quickly detected.

Inventors:
LIPTON RICHARD J (US)
OSTROVSKY RAFAIL (US)
ZIKAS VASSILIS (US)
Application Number:
PCT/US2015/051779
Publication Date:
March 31, 2016
Filing Date:
September 23, 2015
Assignee:
UNIV CALIFORNIA (US)
GEORGIA TECH RES INST (US)
International Classes:
G06F21/56
Foreign References:
US7624444B22009-11-24
US20080134321A12008-06-05
US20140108807A12014-04-17
US20080114981A12008-05-15
US20090222910A12009-09-03
Attorney, Agent or Firm:
YANG, Frank et al. (Silicon Valley Center, 801 California Street, Mountain View CA, US)
Claims:
WHAT IS CLAIMED IS:

1. A computer-implemented method for compiling an original program into a modified program that is resistant to virus injection, the original program stored as words in a computer memory, the method comprising:

inserting a plurality of key shares for a key set at memory locations between the

original program words, wherein the key set comprises one or more secret keys, and the key set is lost if any of the key shares are modified; and modifying the original program so that execution of the modified program produces a same result as execution of the original program;

wherein virus injection into a contiguous block of words will modify at least one key share with high probability, and executing a challenge-response protocol based on the key set will verify whether any of the key shares has been modified.

2. The computer-implemented method of claim 1 wherein the modified program can be proven to detect any virus injection into a contiguous block of N or more words, where N is a preselected integer greater than or equal to 3.

3. The computer-implemented method of claim 2, wherein N=3.

4. The computer-implemented method of claim 1 wherein inserting the plurality of key shares comprises inserting key shares between original program words so that not more than N original program words will be contiguous in the modified program, where N is a preselected integer.

5. The computer-implemented method of claim 4 wherein inserting the plurality of key shares comprises inserting key shares between original program words so that no original program words are contiguous in the modified program.

6. The computer-implemented method of claim 1 wherein inserting the plurality of key shares comprises inserting key shares at random memory locations between original program words.

7. The computer-implemented method of claim 1 wherein inserting the plurality of key shares comprises:

spreading out the original program words to create unused memory locations between the original program words; and

inserting key shares at the unused memory locations.

8. The computer-implemented method of claim 1 wherein inserting the plurality of key shares comprises inserting key shares so that not more than M key shares are inserted between original program words, where M is a preselected integer.

9. The computer-implemented method of claim 8 wherein inserting the plurality of key shares comprises inserting key shares so that not more than two key shares are inserted between original program words.

10. The computer-implemented method of claim 8 wherein inserting the plurality of key shares comprises inserting key shares so that not more than one key share is inserted between original program words.

11. The computer-implemented method of claim 1 wherein modifying the original program comprises inserting instructions to jump over memory locations storing key shares.

12. The computer-implemented method of claim 11 wherein inserting instructions to jump over memory locations storing key shares comprises inserting jump instructions immediately after original program words.

13. The computer-implemented method of claim 1 wherein modifying the original program comprises modifying addresses for jump instructions in the original program to account for changes in addresses resulting from insertion of the key shares.

14. The computer-implemented method of claim 1 wherein the key set comprises at least two secret keys, and the challenge-response protocol prevents simultaneous existence of all secret keys.

15. The computer-implemented method of claim 1 further comprising:

inserting message authentication codes (MACs) at memory locations between the original program words, each MAC authenticating associated program words using associated key shares.

16. The computer-implemented method of claim 15 wherein, upon execution of the modified program, a fetch cycle of the program execution will fetch one or more program words, the associated MAC and the key shares associated with the MAC, wherein the fetched program words can be authenticated using the fetched MAC and key shares.

17. The computer-implemented method of claim 16 wherein the fetch cycle fetches a predefined number of words from memory, the predefined number of words including a fixed number of program words, a fixed number of words of key shares, and the MAC.

18. The computer-implemented method of claim 15 wherein the MAC is a hash of a combination of the associated program words with a hash of the associated key shares.

19. The computer-implemented method of claim 1 wherein the original program is binary code, inserting the plurality of key shares comprises inserting the plurality of key shares between words of the binary code, and modifying the original program comprises modifying the binary code.

20. The computer-implemented method of claim 1 wherein the program includes instructions and data.

21. A computer-implemented method for determining whether a virus has been injected into a modified program, the modified program stored as words in a computer memory, the method comprising:

issuing a challenge for the modified program, wherein the modified program

comprises a plurality of key shares for a key set inserted at memory locations between original program words, the key set comprises one or more secret keys, the key set is lost if any of the key shares are modified, and virus injection into a contiguous block of words will modify at least one key share with high probability;

receiving a response for the challenge; and

verifying from the response whether any of the key shares has been modified.

22. The computer-implemented method of claim 21 as further modified by any of the dependent limitations of claims 1-20.

23. A computer-implemented method for responding to a request to determine whether a virus has been injected into a modified program, the modified program stored as words in a computer memory, the method comprising:

receiving a challenge for the modified program, wherein the modified program

comprises a plurality of key shares for a key set inserted at memory locations between original program words, the key set comprises one or more secret keys, the key set is lost if any of the key shares are modified, and virus injection into a contiguous block of words will modify at least one key share with high probability; and

issuing a response for the challenge, wherein the response can be used to verify whether any of the key shares has been modified.

24. The computer-implemented method of claim 23 as further modified by any of the dependent limitations of claims 1-20.

25. A computer-implemented method for compiling an original program into a modified program that can be authenticated, the original program stored as words in a computer memory, the method comprising:

inserting a plurality of key shares for a key set at memory locations between the original program words, wherein the key set comprises one or more secret keys, the key shares associated with program words; inserting message authentication codes (MACs) at memory locations between the original program words, the MACs associated with program words, each MAC authenticating associated program words using associated key shares; and

modifying the original program so that execution of the modified program produces a same result as execution of the original program.

26. The computer-implemented method of claim 25 wherein, upon execution of the modified program, a fetch cycle of the program execution will fetch one or more program words, the associated MAC and the associated key shares, wherein the fetched program words can be authenticated using the fetched MAC and key shares.

27. The computer-implemented method of claim 26 wherein the fetch cycle fetches a predefined number of words from memory, the predefined number of words including a fixed number of program words, a fixed number of words of key shares, and the MAC.

28. The computer-implemented method of claim 25 wherein the MAC is a hash of a combination of the associated program words with a hash of the associated key shares.

29. The computer-implemented method of claim 25 wherein inserting the plurality of MACs comprises inserting MACs between original program words so that not more than N original program words are associated with each MAC, where N is a preselected integer.

30. The computer-implemented method of claim 29 wherein N is not more than a number of words fetched during a fetch cycle of program execution.

31. The computer-implemented method of claim 29 wherein N=1.

32. The computer-implemented method of claim 25 wherein inserting the plurality of MACs comprises:

spreading out the original program words to create unused memory locations between the original program words; and

inserting MACs at the unused memory locations.

33. The computer-implemented method of claim 25 wherein modifying the original program comprises inserting instructions to jump over memory locations storing MACs.

34. The computer-implemented method of claim 33 wherein inserting instructions to jump over memory locations storing MACs comprises inserting jump instructions immediately after original program words.

35. The computer-implemented method of claim 25 wherein modifying the original program comprises modifying addresses for jump instructions in the original program to account for changes in addresses resulting from insertion of the MACs.

36. The computer-implemented method of claim 25 wherein the original program is binary code, inserting the plurality of MACs comprises inserting the plurality of MACs between words of the binary code, and modifying the original program comprises modifying the binary code.

37. The computer-implemented method of claim 25 wherein the program includes instructions and data.

38. A computer-implemented method for authenticating a modified program, the modified program including key shares and message authentication codes (MACs) inserted at memory locations between original program words, each MAC authenticating associated program words using associated key shares, the method comprising:

fetching original program word(s), associated key share(s) and the associated MAC; and

using the fetched MAC to authenticate the fetched program word(s) using the fetched key share(s).

39. The computer-implemented method of claim 38 wherein fetching the original program word(s), associated key share(s) and the associated MAC all occur in a single fetch cycle of program execution.

40. A non-transitory computer readable medium containing instructions that, when executed by a computer, cause the computer to implement any of the methods of claims 1-39.

Description:
PROVABLY SECURE VIRUS DETECTION

CROSS-REFERENCE TO RELATED APPLICATION(S)

[0001] This application claims priority under 35 U.S.C. § 119(e) to U.S. Provisional Patent Application Serial No. 62/054,160, "Provably Secure Virus Detection," filed Sep. 23, 2014. The subject matter of all of the foregoing is incorporated herein by reference in its entirety.

BACKGROUND

1. Field of the Invention

[0002] This disclosure relates generally to virus detection.

2. Description of Related Art

[0003] It is desirable to make computer systems safe from the insertion of malware, commonly referred to as viruses. The problem is that almost all programs can be attacked by adversaries who install their own code into a program (malware) and thereby take over the system. This allows them to do anything: steal credit card information, get passwords to websites, get health information, destroy industrial controllers, and in general cause untold damage and havoc. And this is only getting worse, as attackers become more sophisticated and as computers are used everywhere, in almost everything. As a result, there have been many efforts to address this problem.

[0004] Perhaps the most common defense mechanism in practice is to block the path the attacker uses to inject and execute its malware. The respective solutions typically aim to protect specific memory vulnerabilities, such as buffer overflow attacks. Most of the attacks of this type aim to overwrite control-data (i.e., data that influence the program stack/counter) to modify the program flow and execute some malicious injected code.

[0005] Other defense approaches monitor the system calls that the program makes (i.e., its interfaces to the operating system) to detect abnormal behavior and attempt to ensure control-flow integrity by checking that the software execution path is within a precompiled control-flow tree. Yet another line of suggested countermeasures includes software-fault isolation (the program is restricted to only access a specific memory range which is thoroughly verified and safeguarded) and data-protection mechanisms (e.g., by means of watermarking), some of which offer even active protection, i.e., might even repair the compromised data.

[0006] Most recently, software diversity has been thought of as a promising weapon in the arsenal against malware injection. Inspired by biological systems, where diversification is a prominent venue for resiliency, software diversification uses randomization to ensure that each system executes a unique (or even several unique) copies of the software, so that an attack which succeeds against some systems will most likely fail against other systems (or even when launched twice against the same system). A typical example is address obfuscation (aka address space randomization), which, roughly, compiles the software to randomize the locations of program data and instructions so that the attacker is most likely to follow an invalid computation path and thus provoke a crash or fault with good probability. Such a randomization might be done to the program source at compilation time, or in some cases to the binary (machine code) at loading time.

[0007] Another example of software diversification is instruction set randomization (ISR), which randomizes the instruction set. Informally, ISR uses a secret key K to compile the set of (machine code) instructions that the software is to execute into a completely random set of instructions. This is typically done by one-time-pad encrypting the instructions with a pseudo-random key generated by a pseudo-random generator seeded by K; more recent suggestions use AES encryption for this purpose. When the program is loaded to the memory, an emulator that is given the key K decrypts the randomized instructions on-the-fly and executes them. The expectation is that if a virus is injected (before the emulation) then the decryption is most likely to map it to an incoherent sequence that will most likely provoke a crash or some other type of fault.

[0008] This technique has several limitations. First, it is restricted with respect to the points in the execution path where the virus injection might occur, e.g., the virus cannot affect the emulator. Second, suggested ISR-based solutions typically protect only the code and not the program data. Third, depending on the actual software, on-the-fly compilation might incur a considerable slowdown on the program execution. Finally, ISR solutions (especially the one-time-pad-based ones) make the system susceptible to code-reuse attacks (aka return-oriented programming (ROP)) where the attacker (re)uses parts of the actual code to perform its attack.

[0009] More recently, the idea of booby trapping software has been suggested. At a high level, some booby trap code is planted in the program executable which, when executed, initiates some fault reporting mechanism. However, to date, no concrete description, implementation, or formal security statement for a booby trapping mechanism has been provided.

[0010] The above approaches are usually heuristics and are not formally proven to be either efficient or effective. Rather, their performance and accuracy are typically experimentally validated. In fact, many of these security countermeasure experiments indicate a potentially undesirable tradeoff between security and accuracy. Finally, many of them are ad hoc patches targeted to a single or a small class of vulnerabilities caused by malware injections.

[0011] On the more theoretical side of the security literature, cryptography offers security solutions which are backed by rigorous mathematical proofs instead of experimental benchmarks. Examples here include message authentication codes (MACs) and digital signatures for protecting data integrity, secure computation and homomorphic encryption for protecting data confidentiality, and many others. It is fair to say, though, that the cryptographic literature has mostly overlooked the problem of malicious software injection, in particular its practical aspects. One notable exception is the recent work which relies on a new class of codes called non-malleable codes which, roughly, are resilient to oblivious tampering. Nonetheless, in contrast to the solutions mentioned in the previous section, most of the cryptographic protocols are theoretical and/or adopt a too abstract model of computation, which makes them to some extent incompatible with real-world threats.

[0012] An example of a primitive which recently attracted the spotlight within the cryptographic community is program obfuscation. A program can be rewritten in a way that one cannot get any information about it other than the input/output behavior of the function it computes. Both feasibility solutions and impossibility results of great theoretical interest were proved. However, existing solutions are impractical and even the biggest optimists do not see how this primitive can be brought to practice any time soon. Indeed, the existing techniques turn short, simple programs into huge and slow programs. Furthermore, their security relies on new computational hardness assumptions that have not yet been thoroughly attacked and evaluated.

SUMMARY

[0013] The present disclosure overcomes the limitations of the prior art by hiding a secret in the program code in a way that ensures that, as long as the system does not leak information about it, any injection of malware will destroy the secret with very high probability. Once the secret is destroyed, its destruction and therefore also the injection of malware will be quickly detected.

[0014] In one aspect, let W be a program that we wish to protect from malware. We recompile W to W'. The idea is that W and W' must compute the same thing, run in about the same time, and yet W' must be able to defend itself against the insertion of malware. The protected program W' operates normally most of the time. Periodically it is challenged by another machine to prove that it "knows" a secret key K. This key is known only to W' and not to the attacker. If W' has not been attacked, then it simply uses the key K to answer the challenge and thus proves that it is still operating properly. We also arrange W' so that no matter how the injection of malware has happened, the key K is lost. The attacker's very injection will have changed the state of W' so that it is now in a state that no longer knows the key K.

[0015] In one approach, we distribute the key K all through memory by using a secret sharing. We break the key K into many pieces, called key shares, which are placed throughout all of the system's memory and are interleaved with words of W. Our secret sharing allows K to be reconstructed only if all the pieces are left untouched. If any are changed in any way, then it is impossible to reconstruct the key. Obviously, if there has been no attack, then in normal operation W' can reconstruct the key from the pieces and answer the challenges. However, if the pieces are not all intact, then W' will be unable to answer the next challenge.
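The m-out-of-m sharing described in this paragraph can be sketched in a few lines of Python; the function names below (`share_key`, `reconstruct_key`) are ours for illustration, not the patent's:

```python
import secrets

def share_key(key: bytes, m: int) -> list[bytes]:
    """Split `key` into m XOR shares; all m are needed to recover it."""
    shares = [secrets.token_bytes(len(key)) for _ in range(m - 1)]
    last = key
    for s in shares:                      # last = key XOR s_1 XOR ... XOR s_{m-1}
        last = bytes(a ^ b for a, b in zip(last, s))
    return shares + [last]

def reconstruct_key(shares: list[bytes]) -> bytes:
    """XOR all shares together; any modified share changes the result."""
    out = bytes(len(shares[0]))
    for s in shares:
        out = bytes(a ^ b for a, b in zip(out, s))
    return out

key = secrets.token_bytes(16)
shares = share_key(key, 8)
assert reconstruct_key(shares) == key     # all pieces intact: key recovered
```

Because every bit of every share enters the XOR, flipping even a single bit of a single share yields a different (and unpredictable) reconstructed key, which is exactly why an injected virus that touches any share makes the next challenge unanswerable.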

[0016] In another aspect, we arrange that W' can be attacked at any time, including when the system computing the response has collected all the key shares and reconstructed the key K. This is a very dangerous time, since if W' is attacked at this moment the fact that key shares are destroyed does not matter. The attacker can simply use the reconstructed key K. Such time-of-check-time-of-use (TOCTOU) attacks can be avoided by using two keys K1 and K2 in tandem. Both are stored via secret sharing as before, but now one must have both in order to answer the challenges. We arrange that W' only reconstructs one key at a time and this means that an attacker injecting a long-enough virus must destroy either or both of the keys.
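The two-key discipline can be sketched as follows; the particular scheduling below (fold K1 into the answer, erase it, then use K2) and the use of HMAC are our illustrative choices, since the patent only requires that the two full keys never coexist in memory:

```python
import hmac, hashlib

def xor_reconstruct(shares: list[bytes]) -> bytes:
    """Recombine an XOR-sharing into the key."""
    key = bytes(len(shares[0]))
    for s in shares:
        key = bytes(a ^ b for a, b in zip(key, s))
    return key

def answer_with_two_keys(shares1, shares2, challenge: bytes) -> bytes:
    """Answer a challenge using K1 and K2 without ever holding both at once."""
    k1 = xor_reconstruct(shares1)
    partial = hmac.new(k1, challenge, hashlib.sha256).digest()
    k1 = None                   # erase K1 before reconstructing K2
    k2 = xor_reconstruct(shares2)
    return hmac.new(k2, partial, hashlib.sha256).digest()
```

A virus injected while K1 is reconstructed can at best learn K1, but the injection itself destroys shares of K2 (or of K1's sharing for future rounds), so the next response still fails.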

[0017] In yet another aspect, we treat the case of very small viruses (i.e., ones that consist of a single bit or just a few bits) and viruses that know (or guess) the exact memory locations of key shares in advance and therefore might not need to overwrite them. We armor our system to defend even against such tiny/informed viruses by employing an additional lightweight cryptographic primitive, namely a message authentication code (in short, MAC). A MAC authenticates a value using a random key, so that an attacker without access to the key cannot change the value without being detected. We use the MACs in a straightforward and efficient manner. We authenticate each word in W' using adjacent shares of the keys K1 and K2 (also) as MAC keys, and require that for any word that is loaded from the random access memory, the CPU verifies the corresponding MAC before further processing the word. Thus, if the MAC does not check out we detect it (and immediately destroy the key), and if the injection changes the MAC keys, the attacker loses one of the shares of one of the two secrets, and we detect that as well.

[0018] We enforce the above checks by assuming a modified CPU architecture (with only a very small amount of additional computation, so the CPU power consumption would only be minimally affected). More specifically, to get the most out of the MAC functionality, when the CPU executes the program, it verifies the authenticity of the words it loads from the memory. That is, the load operation of the CPU loads a block of several words, which is already the case in most modern CPUs. The block includes the program word, the MAC tag and the corresponding keys. The CPU checks with every load instruction that the MAC verifies before further processing the word. Importantly, our MAC uses just one field addition and one field multiplication per authentication/verification. The circuits required to perform these operations are actually trivial compared to modern CPU complexity and are part of many existing CPU specifications.
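A MAC with exactly one field multiplication and one field addition per word can be sketched as a polynomial-evaluation MAC over a prime field; the specific prime and the key pair (a, b) below are our assumptions for illustration, not the patent's exact construction:

```python
# Illustrative one-multiplication, one-addition MAC over GF(P).
P = 2**61 - 1  # a Mersenne prime, large enough to embed 32/64-bit memory words

def mac_tag(word: int, a: int, b: int) -> int:
    """Tag one memory word: a single field multiplication and addition."""
    return (a * word + b) % P

def mac_verify(word: int, tag: int, a: int, b: int) -> bool:
    """What a modified CPU would run on every load before using the word."""
    return mac_tag(word, a, b) == tag
```

In the scheme of paragraphs [0017] and [0019], the per-word keys a and b would be derived from (hashes of) the adjacent shares of K1 and K2, so forging a tag for a changed word requires either knowing the keys or overwriting the shares, and both are detected.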

[0019] In addition, instead of using the actual key shares in the generation/verification of the MAC tag, we use their hash values. This ensures that the virus cannot manipulate the keys and forge a consistent MAC unless it overwrites a large part of them.

BRIEF DESCRIPTION OF THE DRAWINGS

[0020] Embodiments of the disclosure have other advantages and features which will be more readily apparent from the following detailed description and the appended claims, when taken in conjunction with the accompanying drawings, in which:

[0021] Figure 1 A is a block diagram illustrating a compilation stage for a virus detection scheme.

[0022] Figure IB is a block diagram illustrating a challenge-response stage for a virus detection scheme.

[0023] Figure 2A is a diagram of the virus attack game G_VDS.

[0024] Figure 2B is a diagram of the repeated detection virus attack game.

[0025] Figure 2C is a diagram of the virus attack game without self-responsiveness.

[0026] Figure 3 illustrates a transformation from W to W'.

[0027] Figure 4A is a diagram of the procedure Spread for insertion of key shares.

[0028] Figure 4B is a diagram of Compile1 for VDS V1, which is secure without self-responsiveness.

[0029] Figure 4C is a diagram of Response1 for VDS V1.

[0030] Figure 4D is a diagram of the inverse procedure Spread^-1.

[0031] Figure 5A is a diagram of Compile2 for VDS V2, which is secure in the repeated detection game.

[0032] Figure 5B is a diagram of Challenge2 for VDS V2.

[0033] Figure 5C is a diagram of Response2 for VDS V2.

[0034] Figure 6A is a diagram of the instruction read_auth.

[0035] Figure 6B is a diagram of the instruction write_auth.

[0036] The figures depict various embodiments for purposes of illustration only. One skilled in the art will readily recognize from the following discussion that alternative embodiments of the structures and methods illustrated herein may be employed without departing from the principles described herein.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

[0037] The figures and the following description relate to preferred embodiments by way of illustration only. It should be noted that from the following discussion, alternative embodiments of the structures and methods disclosed herein will be readily recognized as viable alternatives that may be employed without departing from the principles of what is claimed.

[0038] The following description is organized as follows:

1 . Overview

2. Preliminaries and Notation

3. Modelling Computation

3.1 Random Access Machine

3.2 Software Execution

3.3 Virus Injection

4. Virus Detection Schemes

4.1 VDS Model

4.2 VDS Security Properties

4.3 Modelling Virus Vulnerability

5. Some VDS Examples

5.1 A VDS without Self-Responsiveness

5.2 A VDS with Self-Responsiveness

5.3 A VDS with Standard Security

5.4 A VDS Secure against Short Viruses

6. Further Extensions

1. OVERVIEW

[0039] The following disclosure describes various virus detection schemes (VDS) for protecting software against arbitrary malicious code injection. In one aspect, the approach uses the fact that an attacker who injects some malware into the system, no matter how clever, no matter when or how he inserts his malware, changes the state of the system he is attacking. The approach is to compile the system to such a state so that any such change will be caught with arbitrarily high probability, preferably in a manner that can be proven to be secure and without sacrificing performance.

[0040] Figure 1 is a block diagram of such a VDS. The VDS uses a compiler 110, which compiles any given program W (including its data) into a modified program W'. The modified program W' performs the same computation as the original program W but allows us to detect viruses injected in the memory at any point of the execution path. As shown in Figure 1B, the detection is done via a challenge-response mechanism which is implemented between the machine 150 executing the modified program W' and a verifying external device 160.

[0041] In the example of Figure 1, the original program W is stored as words in a computer memory. The compiler 110 then performs two functions. It inserts 112 key shares at memory locations between the original program words. The original program words are spread out in memory to make space for the key shares. The key shares are for a key set (i.e., one or more secret keys), which is used in the challenge-response protocol. If the key shares are modified, then the key set is lost, which will cause a failure in the challenge-response protocol. In addition, the compiler 110 modifies 114 the original program so that the original program words still execute in the same order. For example, if the original program words are spread out in memory, then the compiler 110 may insert jump instructions, either implicit or explicit, to account for the new memory locations of the original program words. Note that the input W to the compiler 110 can be binary code, so that source code is not required for this.
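The two compiler functions just described can be sketched abstractly; the layout below (exactly one share after every program word, so the word at old address i moves to address 2i) is one illustrative choice among the placements contemplated in the claims:

```python
def spread(program: list[str], shares: list[str]) -> list[str]:
    """Interleave one key share after each original program word."""
    assert len(shares) >= len(program)
    out: list[str] = []
    for word, share in zip(program, shares):
        out += [word, share]
    return out

def patch_address(old_addr: int) -> int:
    """With one share per word, the word at address i moves to address 2*i,
    so jump targets in the original program are rewritten accordingly."""
    return 2 * old_addr
```

Other layouts (shares at random positions, up to M shares per gap, or N contiguous program words between shares) change only these two functions; the challenge-response side is unaffected.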

[0042] The key shares are interspersed with the original program words so that injection of a virus into a contiguous block will modify a key share with high probability. The modification will then be detected by the challenge-response mechanism. In Figure 1B, the verifying device 160 issues a challenge 172 based on the key set. The executing device 150 makes a response 174 based on the words contained in memory. The verifying device 160 can then verify 176 from the received response whether any key shares have been modified.
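One way to realize this challenge-response exchange is a standard nonce-plus-PRF handshake; the use of HMAC-SHA256 here is our illustrative stand-in, as the text does not fix a particular primitive:

```python
import hmac, hashlib, secrets

def issue_challenge() -> bytes:
    """Verifying device 160: send a fresh random nonce (challenge 172)."""
    return secrets.token_bytes(16)

def respond(shares_in_memory: list[bytes], challenge: bytes) -> bytes:
    """Executing device 150: reconstruct K from the in-memory shares
    and answer the challenge (response 174)."""
    key = bytes(len(shares_in_memory[0]))
    for s in shares_in_memory:
        key = bytes(a ^ b for a, b in zip(key, s))
    return hmac.new(key, challenge, hashlib.sha256).digest()

def verify(known_key: bytes, challenge: bytes, response: bytes) -> bool:
    """Verifying device 160: check (176) the response against the real K."""
    expected = hmac.new(known_key, challenge, hashlib.sha256).digest()
    return hmac.compare_digest(expected, response)
```

If a virus overwrote any share, `respond` reconstructs the wrong key and `verify` fails, which is the detection event.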

2. PRELIMINARIES AND NOTATION

[0043] Throughout this description, we assume an (often implicit) security parameter denoted as k. For a finite set S we denote by s <-$ S the operation of choosing s from S uniformly at random. For a randomized algorithm B we denote by B(x; r) the output of B on input x and random coins r. To avoid always explicitly writing the coins r, we shall denote by y <-$ B(x) the operation of running B on input x (and uniformly random coins) and storing the output in variable y. When B is deterministic then we write y <- B(x) to denote the above operation. We use the standard definitions of "negligible" and "overwhelming" (e.g., see Oded Goldreich, Foundations of Cryptography: Basic Tools, volume 1, Cambridge University Press, Cambridge, UK, 2001).

[0044] For a number n ∈ ℕ, we denote by [n] the set [n] = {1, ..., n}. Furthermore, for a string x ∈ {0,1}^k we denote by Decimal(x) the decimal representation of x. Inversely, for a number x ∈ Z_p with log p ≤ k, we denote by Bin_k(x) the k-bit string (binary) representation of x. Finally, for strings x and y we denote their concatenation by x||y.

[0045] Secret Sharing. A perfect m-out-of-m secret sharing scheme allows a dealer to split a secret s into m pieces s_1, ..., s_m, called the shares, so that someone who holds all m shares can recover the secret (correctness), but any m-1 shares give no information on the secret (privacy). In this work we use m-out-of-m sharings of strings s ∈ S = {0,1}^n, where n ∈ ℕ. A simple construction of such a scheme has the dealer choose all s_i's uniformly at random with the restriction that s = s_1 ⊕ ... ⊕ s_m. We refer to the above sharing as the XOR-sharing. An XOR-sharing of s is denoted as ⟨s⟩ = (s_1, ..., s_m).

3. MODELLING COMPUTATION

[0046] In this section we provide an abstract specification of our model of computation. Since the goal of our work is to provide virus detection mechanisms that are applicable to modern computers, we tune our model to be an accurate abstraction of the way a modern computer executes a program. We use as basis the well-known Random Access Machine (RAM) model but slightly adapt it to be closer to an abstract version of a modern computer following the von Neumann architecture. In a nutshell, this modification consists of assuming that both the program and the data are written into the RAM's random access memory, which is polynomially bounded. In the literature, a RAM with this modification is usually called a Random Access Stored-Program machine (RASP). Along the way we also specify some terminology which demonstrates how such a RAM corresponds to modern computers.

3.1. Random Access Machine

[0047] A RAM R includes two components: a Random Access Memory (in short RMEM) and a Central Processing Unit (in short CPU). The memory RMEM is modeled as a vector of words, i.e., bit-strings of appropriate (fixed) length, where each component has some pre-defined address. In an actual computer, each of these words corresponds to a register in the random access memory which can be accessed (read and written) by the CPU according to some pre-defined addressing mechanism. The CPU includes a much smaller number of registers, as well as an instruction set I that defines which operations can be performed on the registers, and how data are loaded to/output from the CPU.

[0048] The RAM and the CPU communicate in fetch-and-execute cycles, aka CPU cycles, typically through a CPU bus. At times we refer to a CPU cycle as a round in the RAM's execution. In each such cycle, the CPU accesses the RMEM to read (load) a word to some of its registers, performs some basic operation from I on its local registers, e.g., adding two registers and storing their output on a third register, and (potentially) writes some word (the contents of some register) to a location in the memory RMEM. The location in RMEM from where the next word will be read is stored in a special integer-valued register denoted as pc (usually called the program counter) which, unless there is a jump instruction (see below), is incremented by one in each CPU cycle. In the following we provide some details on the above components.

[0049] The Random Access Memory (RMEM). RMEM includes m = poly(k) registers, each of which can store an L-bit string (word), where we assume that L is linear in the security parameter k. A typical modern RMEM has L = 32 or L = 64. We represent the memory as a 1 × m array MEM of words, where for i ∈ {0, 1, ..., m − 1} we denote by MEM[i] the i-th word, i.e., the contents of the i-th register. By convention, the address of register MEM[i] is i. We shall denote the size (i.e., number of registers) of a given MEM as |MEM|. Each word in MEM might include an instruction from the set I, or an address in the RMEM, or program data, or combinations of these. For example, in modern memory where the words are 64 bits, a word might contain more than one instruction or a combination of an instruction and an address.

[0050] The Central Processing Unit (CPU). The CPU includes a set of registers, each of which is also assigned a unique address and can store an L-bit word. Similarly to the notation used for the random access memory MEM, we model these registers as an array REG of words, where we denote by REG[i] the content of the i-th register. Note that unlike the RMEM, which might include thousands or even millions of registers, a typical CPU has no more than 60 registers. We assume that the total amount of storage of the CPU is linear in the security parameter k.

[0051] Some CPU registers have a special purpose whereas others are used for arbitrary operations. Examples of special registers include the input and output registers. The input register is used by the CPU in order to receive input from external devices (e.g., the keyboard) and the output register is used to send messages to external devices, e.g., the monitor. For simplicity, we assume that the RAM might only read from its input register, i.e., cannot overwrite it (only the user can write on it via an input device), and that both the input and the output register are of size polynomial in the security parameter k. Another example of a special register which we already saw is the program counter pc, which stores the location in the memory that will be read in the next CPU cycle. As already mentioned, unless pc is changed by the CPU via a so-called jump instruction, it is incremented by 1 at the end of each CPU cycle. Finally, the total number of CPU cycles executed from the beginning of the computation is stored in another special register which we refer to as the clock-counter. The state of the CPU at any point is described by a vector including the contents of all CPU registers at that point.

[0052] The CPU also includes certain components which allow it to perform operations on its registers and communicate with the RMEM as well as peripheral devices such as the keyboard and the monitor. A detailed description of these components is not necessary at our level of abstraction. The set of all possible operations that a CPU might perform is called the instruction set I. Informally, I might include any efficiently computable transformation on the state of the CPU along with two special commands that allow the CPU to load messages to and from the memory MEM. For the purpose of this description, each instruction is represented as a tuple of the form (opcode, a_1, ..., a_t), where opcode is an identifier for the operation/transformation to be executed, and each a_i is either an address of a CPU register, or an address of an RMEM location, or a signed integer.

[0053] The following are some examples of typical CPU instructions:

- (read, i, j) which copies the contents of the i-th RMEM location to the j-th CPU register, i.e., sets REG[j] := MEM[i].

- (write, j, i) which writes the contents of the j-th CPU register into the i-th memory location.

- (no_op) is the empty operation which instructs the CPU to simply proceed to the next round.

- (copy, i, j) which copies the contents of the i-th CPU register to the j-th CPU register.

- (erase, i) which writes 0's on the i-th CPU register.

- (add, i, j) which adds the two registers (and stores the result in the second), i.e., it sets REG[j] := REG[i] + REG[j].

- (mult, i, j) which multiplies the two registers (and stores the result in the second), i.e., it sets REG[j] := REG[i] · REG[j].

- (jump, i) which changes the contents of the program counter pc to equal REG[i].

- (jumpif, P, i, j) which, for a given Boolean predicate P : {0,1}^L → {0,1}, checks whether P(REG[i]) = 1 and if so changes the contents of the program counter pc to equal REG[j] (i.e., implements a conditional jump).
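The fetch-and-execute cycle over such an instruction set can be made concrete with a tiny interpreter. The following Python sketch is illustrative only: encoding instructions as tuples stored directly in memory cells, and using an explicit "halt" opcode, are our simplifications of the word encoding and the halting register described elsewhere in this document.

```python
def run(mem, regs, max_cycles=1000):
    """Minimal fetch-execute loop: each cycle fetches MEM[pc], executes it,
    and increments pc unless a jump instruction rewrote it."""
    pc = 0
    for _ in range(max_cycles):
        op, *args = mem[pc]
        jumped = False
        if op == "read":        # (read, i, j): REG[j] := MEM[i]
            i, j = args
            regs[j] = mem[i]
        elif op == "write":     # (write, j, i): MEM[i] := REG[j]
            j, i = args
            mem[i] = regs[j]
        elif op == "add":       # (add, i, j): REG[j] := REG[i] + REG[j]
            i, j = args
            regs[j] = regs[i] + regs[j]
        elif op == "jump":      # (jump, i): pc := REG[i]
            pc = regs[args[0]]
            jumped = True
        elif op == "halt":      # our stand-in for setting the halting register
            return regs
        # (no_op,) falls through: nothing to do but advance the program counter
        if not jumped:
            pc += 1
    return regs
```

For example, a program that reads two data words from memory, adds them, and writes the sum back occupies the first few memory cells, with its data stored after the code.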

[0054] We next establish some useful notation and terminology. A CPU is defined as a pair C = (REG, I) of the vector REG of registers and an instruction set I. We denote a RAM R with CPU C and RMEM MEM as the pair R = (C, MEM). The state of a RAM R at any point in the protocol execution is the vector (REG, MEM) including the current contents of all its CPU and RMEM registers. To allow for asymptotic security definitions, where the size of the CPU (i.e., the number of its registers) and the size of the memory are polynomial in the security parameter, we often consider a family of RAMs R = {R_k}_{k∈ℕ} = (C, {MEM_k}_{k∈ℕ}), where for each k ∈ ℕ: |MEM_k| = poly(k). Whenever clear from the context, we refer to such a family as a RAM. Note that all members of a RAM family use a fixed-size CPU.

[0055] In the following we define the notion of a complete CPU. A CPU is complete (also referred to as universal) if, given sufficient (but polynomial) random access memory, it can perform any efficient deterministic computation.

[0056] Definition 1 (complete CPU). A CPU C = (REG, I) is complete if for any deterministic polynomial algorithm B and for any polynomial-size input x for B, there exists a RAM R = (C, MEM), where |MEM| = poly(|x|), such that R outputs y ← B(x) in a polynomial number of rounds.

[0057] We at times refer to a RAM (family) with a complete CPU as a complete RAM (family). An example of a complete CPU is one which has three registers and can, in addition to communicating with its RMEM, compute any Boolean circuit on any two of these registers and store the result in the third. Note that emulating the execution of modern software on such a CPU would be very slow; in fact, as mentioned above, modern CPUs have many more registers and might perform several complicated instructions in a single round.

3.2. Software Execution

[0058] A program to be executed on a RAM R = (C, MEM) is described as a vector W = (w_0, ..., w_{n−1}) ∈ ({0,1}^L)^n of words that might be instructions, addresses, or program data. To avoid confusion, we refer to such a vector including the (binary of a) software and its corresponding data as a programming for R.

[0059] The execution of a program proceeds as follows. The vector W including the program and its data is loaded into the memory MEM. Unless stated otherwise, we assume that W is loaded sequentially on the first n = |W| locations of MEM, i.e., for each j ∈ {0, ..., n − 1}: w_j is written on register MEM[j]. Every location j of MEM with j ≥ |W| is filled with (no_op) operations. The user might give input(s) to R by writing them on its input register in a sequential (aka write-once) manner, i.e., a new input is written (appended) next to the last previous input. Without loss of generality, we assume that all CPU registers are filled with zeros at the onset of the computation (and before inputs are given). Once the data are loaded (and, potentially, input has been written on the input register) the RAM starts its execution by fetching the word of the RMEM which the program counter points to, i.e., MEM[pc]. Unless stated otherwise, at the beginning of the computation pc is initialized to 0 (i.e., points to the first memory location). Following that, it executes the program in CPU cycles as described above. Note that we make no "private state" assumption about the CPU and, in particular, at the beginning of the computation all the CPU registers are set to the all-zero word 0^L and its first action is to read from the random access memory.

[0060] The computation of a RAM typically stops when it reaches a halting state, which is specified by setting the value of a special CPU register which we call the halting register. For simplicity, we assume that the halting register stores only a bit which is set to 1 when the RAM reaches a halting state. However, as we are interested in capturing even reactive computation which might generate output and receive inputs several times during the execution, we make the following assumption. If a RAM R has reached a halting state and some new input is written (appended) in its input register, then R resumes the computation, i.e., increases the program counter and proceeds to the next CPU cycle. The state of the RAM at the beginning of this new round is identical to its halting state with the only difference that the halting register is reset to zero and the program counter is increased by one.

[0061] A property of programmings W which will be useful in our constructions is that they do not try to read or write from a memory location which is "out of the W boundaries".

[0062] Definition 2. Let R = (C, MEM) be a RAM and W = (w_1, ..., w_n) be a programming which is written on memory locations MEM[i_1], ..., MEM[i_n]. We say that W is self-restricted for R if for any input sequence, an execution of R initiated with W as above never writes or reads from a location j ∉ {i_1, ..., i_n}.

[0063] The definition of self-restricted programming naturally extends to a RAM family R = {R_k}_{k∈ℕ}.

[0064] Definition 3. Let R = {R_k}_{k∈ℕ} = {(C, MEM_k)}_{k∈ℕ} be a RAM family. A programming W is a self-restricted programming for the RAM family R if for some k ∈ ℕ with k = poly(|W|), W is a self-restricted programming for R_k.

3.3. Virus Injection

[0065] A virus attacks a RAM by injecting its code on selected locations of the memory RMEM. For simplicity, we assume that the virus can only inject sequences of full words, but our constructions can be extended to apply also to viruses whose length (in bits) is not a multiple of the word length L. More formally, an ℓ-word virus is modeled as a tuple virus = (a, W) = ((α_0, ..., α_{ℓ−1}), (w_0, ..., w_{ℓ−1})), where each α_i ∈ a is a location (address) in the memory and each w_i ∈ W is a word. The effect of injecting a virus virus into a RAM R = (C, MEM) is to have, for each α_i ∈ a, the register MEM[α_i] overwritten with w_i. We say that the virus is valid for R if the following properties hold: (1) α_i ≠ α_j for all distinct α_i, α_j ∈ a, and (2) α_i ∈ {0, ..., |MEM| − 1} for every α_i ∈ a. Furthermore, we say that virus is non-empty if |a| > 0.
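In this model, injecting a valid continuous virus amounts to overwriting a block of contiguous memory words. A minimal Python sketch (names are ours, not from the specification):

```python
def inject(mem: list, start: int, virus_words: list) -> list:
    """Inject a continuous virus: overwrite len(virus_words) contiguous
    memory registers beginning at address `start`. The assertion enforces
    the validity condition that all targeted addresses lie inside MEM."""
    assert 0 <= start <= len(mem) - len(virus_words), "virus must be valid for this RAM"
    for i, w in enumerate(virus_words):
        mem[start + i] = w
    return mem
```

The detection schemes below are designed so that any such overwrite of a sufficiently long block necessarily destroys at least one hidden key share.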

[0066] Note that we do not allow the virus to inject itself on the CPU registers. This is justified by the fact that during the software execution, the CPU communicates only with the RAM and the input/output devices. Importantly, we make no security or privacy assumption, such as tamper-resilience, on the CPU. For example, an "intelligent" virus is allowed to take full control of the CPU, i.e., overwrite all its registers, while it is being executed.

[0067] The viruses that we consider in this work are continuous, i.e., the virus injects itself into contiguous locations in the RAM's memory. In this case, the definition of a valid virus can be simplified to a pair virus = (a, W), where W is as above and a ∈ {1, ..., |MEM| − |W|} indicates the position in RMEM where the first word of the virus is injected. Nonetheless, our definitions are more liberal and are stated for arbitrary (not necessarily continuous) viruses. In the following we denote by |virus| the size (also referred to as length) of the virus, namely its number of words (for non-contiguous viruses |virus| = |a|).

4. VIRUS DETECTION SCHEMES

[0068] In this section we formalize our notion of provably secure virus detection. More concretely, we introduce the notion of a virus detection scheme (in short VDS) which demonstrates how to compile a given program (and its data) into a modified program which allows us to detect a virus which is injected, as long as it modifies the programming of the RAM. The detection is done via a provably secure challenge-response mechanism. Importantly, the execution of the compiled program is only a small factor slower than the execution of the original program (typically a small constant factor).

4.1. VDS Model

[0069] Definition 4. A virus detection scheme (VDS) V includes four algorithms, i.e., V = (Compile, Challenge, Response, Verify), defined as follows:

* Compile is a randomized algorithm which takes as input the description R = (C, MEM) of a RAM, a programming W for R, and a key K ← 𝒦, where 𝒦 ⊆ {0,1}^O(k) is the key-space, and outputs a secure programming W̄ for a RAM R̄ = (C, M̄EM) with the same CPU C and with memory size |M̄EM| = poly(max{k, |MEM|}); i.e., W̄ ← Compile(R, W, K). Note that K might be a symmetric key or a public-key/symmetric-key pair.

* Challenge is a randomized algorithm that on input of a key K ∈ 𝒦 and a string z ∈ Inp_Chal ⊆ {0,1}^poly(k) outputs a challenge string c ∈ Out_Chal, where Out_Chal ⊆ {0,1}^poly(k) denotes the output domain of algorithm Challenge; i.e., c ← Challenge(z, K).

* Response is a (potentially) randomized algorithm which on input of a string c ∈ Out_Chal and an |M̄EM|-long word-vector W̄ outputs a string y ∈ Out_Resp, where Out_Resp ⊆ {0,1}^poly(k) denotes the output domain of algorithm Response; i.e., y ← Response(c, W̄).

* Verify is a (potentially) randomized algorithm which on input of K ∈ 𝒦, a message z ∈ Inp_Chal, a challenge c ∈ Out_Chal and a response y ∈ Out_Resp outputs a bit b ∈ {0,1}; i.e., b ← Verify(K, z, c, y). We say that Verify accepts if b = 1.

4.2. VDS Security Properties

[0070] In the following we specify security properties which a VDS preferably should satisfy. The first property is verification correctness which, intuitively, guarantees that if the RAM has not been attacked, then the reply to the challenge is accepting (except with negligible probability).

[0071] Definition 5 (verification correctness). A virus detection scheme V = (Compile, Challenge, Response, Verify) for a RAM family R is verification-correct iff for every programming W of R the following holds for some negligible function μ:

Pr[K ← 𝒦 ∧ W̄ ← Compile(R, W, K) ∧ z ← Inp_Chal ∧ c ← Challenge(z, K) ∧ y ← Response(c, W̄) ∧ Verify(K, z, c, y) ≠ 1] ≤ μ(k).

[0072] The probability in the above definition is taken over the random coins of all involved algorithms and the coins for sampling the key.

[0073] The second desired property of a VDS is compilation correctness, which intuitively ensures that the compiled code W̄ encodes the same program as the original code W. This means that on the same input(s) from the user the following properties hold: (1) R̄ produces the same output sequence as R, and (2) there exists an efficient transformation which maps executions of R̄ programmed with W̄ onto executions of R programmed with W, such that the contents of the memory M̄EM at any point of the execution of R̄ with programming W̄ are efficiently mapped to contents of MEM at a corresponding point in the execution of R with programming W. This latter property will be useful in reality, where the computation also modifies program data which need to be returned to the hard drive to be used by another application.

[0074] Note that an execution of W̄ might terminate (and/or generate outputs) after a different number of CPU cycles than an execution of W. The reason is that the compiler will typically spread out the execution of each W-instruction over several CPU rounds. Notwithstanding, we require that these extra rounds are polynomial in the number of rounds that the original program W would need to produce outputs and/or terminate. In the following, for a RAM R = ((REG, I), MEM) and for a p ∈ ℕ, we denote by X_MEM(W, p, x = (x_1, ..., x_ℓ)) the state of MEM (i.e., the contents of MEM) at the end of the p-th CPU cycle when R is executed on inputs x_1, ..., x_ℓ and programming W; we also denote the set of all possible memory states of R by Σ_R. Moreover, we denote by X_out(W, p, x = (x_1, ..., x_ℓ)) the first output generated by R (i.e., the first string written on the CPU's output register) after round p.

[0075] Definition 6 (Q-bounded emulation). Let Q : ℕ → ℕ be an efficiently computable monotonically non-decreasing function, and for a RAM R = (C, MEM) let W be a programming for R. We say that a programming W̄ for a RAM R̄ = (C, M̄EM) with |M̄EM| = poly(|MEM|) is a Q-bounded sound emulation of W if there exists an efficiently computable function T : Σ_R̄ → Σ_R such that for every p ∈ ℕ and every sequence x of inputs the following two properties hold except with probability negligible in k:

X̄_out(W̄, Q(p), x) = X_out(W, p, x), and

T(X̄_MEM(W̄, Q(p), x)) = X_MEM(W, p, x).

[0076] Definition 7 (compilation correctness). A VDS V = (Compile, Challenge, Response, Verify) for the RAM R is compilation correct if for any programming W of R there is a known polynomially computable function Q : ℕ → ℕ such that the compiled programming W̄ ← Compile(R, W, K) is a Q-bounded emulation of W except with negligible probability. As before, the probability is taken over the random coins for choosing the key K and for executing algorithm Compile.

[0077] In an application of a VDS, the user (who executes a compiled program on his computer) should only be expected to input the challenge (e.g., as a special input) at a point of his choice and check that the reply verifies according to the predicate Verify. Thus, another desired property of a VDS is self-responsiveness, which requires that the secured programming W̄ includes the code which can compute the response algorithm Response. On a high level, the requirement here is that upon receiving a special input message x' = (check, c) the next output of the RAM equals the output of Response on input c with overwhelming probability.

[0078] Definition 8 (self-responsiveness). A VDS V = (Compile, Challenge, Response, Verify) for a RAM R is self-responsive if there exists an efficiently computable monotonically non-decreasing function Q : ℕ → ℕ such that for every programming W of R and every sequence of inputs x with (check, c) ∈ x and c ∈ Inp_Chal, assuming the input (check, c) is written on R̄'s input register at round p, the following property holds for some negligible function μ: the first output generated by R̄ after round p equals Response(c, W̄) except with probability μ(k).

[0079] The aforementioned properties specify the behavior of software protected by a VDS when it is not infected by a virus. Last but not least, we also desire detection accuracy, which states that if some non-empty malware is injected onto the RAM, then it is detected with high probability. This is one of the most challenging properties to ensure and is the heart of any VDS. In the following we specify this property by means of a security game between an adversary Adv who aims to inject a virus on a RAM, and a challenger Ch who aims to detect it. Informally, the security definition requires that Adv cannot inject a virus without being detected, except with sufficiently low probability.

4.3. Modelling Virus Vulnerability

[0080] At a high level, the security game, denoted by G_VDS^{Adv,W}, proceeds as follows. The challenger Ch picks a uniformly random key K and compiles a programming W of a RAM R into a new programming W̄ by invocation of algorithm Compile. Ch then emulates an execution of W̄ on R̄, i.e., emulates its CPU cycles and stores its entire state at any given point. The adversary is allowed to inject a virus of his choice on any location in the memory M̄EM. Eventually, the challenger executes the (virus) detection procedure. It computes a challenge c by invocation of algorithm Challenge(z, K), and then feeds input (check, c) to the emulated RAM and lets it compute the response y.

[0081] To capture worst-case attack scenarios, we allow the adversary to inject his virus at any point during the RAM emulation and make no assumption as to how many rounds the RAM executes after the virus has been injected. Although this might seem redundant given the "closed-system" assumption on the RAM, i.e., that it only loads data on M̄EM from external devices at the beginning of the execution, real-world software might indeed load more data while it executes. More concretely, we allow the adversary to specify the number of rounds to be executed before the detection process kicks in, and the index of the round in which Adv wants the virus to be injected on the memory. Furthermore, we make no assumptions on how much information the adversary holds on the original programming W or on the inputs/outputs of R, by assuming that Adv knows W and can decide/see the entire sequence of inputs/outputs.

[0082] The formal description of the security game G_VDS^{Adv,W} is shown in Figure 2A. Both Adv and Ch know the specification R of the RAM under attack as well as the VDS V and the programming W. For simplicity we describe the game for the case where the adversary injects a virus only once, but our treatment can be extended to repeated injections. We present the game for self-responsive VDSs. The simpler (but less secure) notion of non-self-responsive VDSs can be obtained by modifying Step 4b in Figure 2A so that the challenger evaluates Response himself, i.e., without invoking the RAM. We return to this relaxation in Section 5.1 below, as building such a weaker VDS turns out to be useful for our construction of a self-responsive VDS.

[0083] Definition 9A. We say that a virus detection scheme V = (Compile, Challenge, Response, Verify) is secure for a RAM family R if it satisfies the following properties for any programming W of R:

1. V is verification correct, compilation correct, and self-responsive,

2. For any polynomial adversary Adv in the game G_VDS^{Adv,W} who injects a valid non-empty virus:

Pr[b = 1] ≤ μ(k),

where μ is some negligible function, and the probability is taken over the random coins of Adv and Ch. Note that the random coins of Ch include the coins used by the executed algorithms of the VDS.

[0084] In our constructions we restrict our statements to a certain class of programmings that satisfy some desirable properties making the compilation easier. We will then say that the corresponding VDS is secure for the given class of programmings.

[0085] The Repeated Detection Game. The above security definition requires that the adversary is caught even when he injects his virus while the RAM is executing the Response algorithm. Indeed, this is captured by the check in Step 4b. In the following we describe a relaxed security game which also provides a useful guarantee for practical purposes. In this game the virus detection (challenge/response) procedure is executed multiple times periodically (on the same compiled programming). The requirement is that if the adversary injects his virus into the RAM at round p, then he will be caught by the first invocation of the virus detection procedure which starts after round p. Note that all executions use the same compiled RAM program and therefore the same key K. To obtain a worst-case security definition we allow the adversary to even define the exact rounds in which the detection procedure will be invoked. In most practical applications, however, it will be the user who specifies these intervals. The corresponding security game Γ_VDS, which involves τ executions of the detection procedure, is described in Figure 2B. The security definition is similar to Definition 9A but requires that any virus that is injected into the RAM will be caught in the first virus detection attempt performed after the injection.

[0086] In the following, we refer to the i-th execution of Step 3b in the repeated detection attack game Γ_VDS (Figure 2B) as the i-th virus detection attempt. The corresponding security definitions state that any virus that is injected into the RAM will be caught in the next virus detection attempt.

[0087] Definition 9B (Security in the Repeated Detection Game). We say that a virus detection scheme V is secure for a RAM family R in the repeated detection model if it satisfies the following properties for any programming W of R:

1. V is verification correct, compilation correct, and self-responsive,

2. For any τ ∈ ℕ+ and any polynomial adversary Adv in the repeated detection attack game Γ_VDS who injects a valid non-empty virus before the i-th virus detection attempt,

Pr[b_i = 1] ≤ μ(k), where μ is some negligible function.

5. SOME VDS EXAMPLES

[0088] In this section we provide some examples of VDSs. We begin with examples which are secure assuming the virus is not too short, i.e., assuming its length is linear in the security parameter or, stated differently, assuming that the virus has a constant number of words. Recall that in all our constructions, we also assume that the virus is continuous. This includes most malware in reality, as viruses are usually several words long. Nonetheless, in Section 5.4, we even show how to get rid of this length restriction.

[0089] Our construction of VDSs secure against non-short viruses proceeds in three steps. In a first step we show how to construct a VDS V1 which achieves a weaker notion of security that, roughly, does not have self-responsiveness. In a second step, we show how to transform V1 into a VDS V2 which is secure in the repeated detection game. Finally, in a third step we show how to transform V2 into a VDS V3 which is secure in the standard detection game. VDSs V1 and V2 are not described only as steps towards building V3, but may also be useful in applications where their corresponding (weaker) security guarantees are satisfactory. In a final extension, we extend the VDS to one that is secure against arbitrarily short viruses.

5.1. A VDS without Self-Responsiveness

[0090] In this section we describe the VDS V1 = (Compile1, Challenge1, Response1, Verify1) which has security without self-responsiveness. More concretely, the corresponding attack game is derived from the standard attack game G_VDS^{Adv,W} by modifying the detection procedure so that in Step 4b of the standard attack game (Figure 2A), instead of invoking the RAM R̄ on the compiled programming W̄ and input (check, c) to compute y = Response(c, W̄), the challenger evaluates y ← Response(c, W̄) himself. For completeness, we have included the detailed game description as Figure 2C.

[0091] Definition 10 (Security without self-responsiveness). We say that a virus detection scheme V is secure without self-responsiveness for a RAM family R if it satisfies the following properties for any programming W of R:

1. V is verification correct and compilation correct

2. For any polynomial adversary Adv in the virus detection attack game without self-responsiveness (Figure 2C) who injects a valid non-empty virus,

Pr[b = 1] ≤ μ(k),

where μ is some negligible function.

[0092] At a high level, the idea of our construction is as follows. The algorithm Compile1 chooses a key K uniformly at random, computes an additive sharing ⟨K⟩ of K and interleaves a different share of ⟨K⟩ between every two words in the original programming. More precisely, Compile1 compiles a programming W for a RAM (family) R into a new programming W̄ for R̄ constructed as follows: between any two consecutive words w_i and w_{i+1} of W the compiler interleaves a uniformly chosen k-bit string K^{i,i+1} = K_1^{i,i+1} || ... || K_{k/L}^{i,i+1}, where each K_j^{i,i+1} ∈ {0,1}^L. In the last k bits of the compiled programming (i.e., after the last word w_{|W|−1}) the string K_last = K ⊕ (⊕_{i=0}^{|W|−2} K^{i,i+1}) is written.
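Setting aside the jump rewriting described next, the share-interleaving step can be sketched as follows. For simplicity this Python sketch models each share as a single memory word of the key's length (rather than k/L separate L-bit words); the function name is ours:

```python
import os
from functools import reduce

def interleave_shares(program_words: list, key: bytes) -> list:
    """Insert a uniformly random share between every two consecutive program
    words, then append a final share chosen so that the XOR of all inserted
    shares equals the key K (an additive/XOR sharing of K)."""
    xor = lambda a, b: bytes(x ^ y for x, y in zip(a, b))
    shares = [os.urandom(len(key)) for _ in range(len(program_words) - 1)]
    out = []
    for w, s in zip(program_words, shares):
        out += [w, s]
    out.append(program_words[-1])
    out.append(reduce(xor, shares, key))  # K_last = K xor (xor of all shares)
    return out
```

Because the sharing is m-out-of-m, overwriting any single interleaved word with virus code destroys the key: the XOR of the surviving shares is then uniformly random and unrelated to K.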

[0093] To ensure that the compiled programming W̄ executes the same computation as W we need to ensure that while being executed it "jumps over" the locations where key shares are stored (as the key shares are only for use in the detection procedure). For this purpose, we do the following modification. After each word w_j of the original program we put a (jump, a) instruction, where a is the position in the compiled programming that w_{j+1} has been moved to. The transformation from W to W̄ is shown in Figure 3. Similarly, we modify any jump instructions of the original programming W to point to the correct locations in W̄. Depending on the instruction set and the actual programming, this could be done in different ways. For simplicity, here we will assume that the programming W we are compiling has the following properties:

1. W is self-restricted for the given RAM (family) R.

2. The only type of instructions in W which change the program flow (i.e., modify the program counter) are of the form (jumpto, z), (jumpto_if, P, i, z), (jumpby, z), and (jumpby_if, P, i, z), where z ∈ ℤ is a fixed integer with 0 ≤ z ≤ |W| − 1. The (conditional) jump commands have the following effect:

* (jumpby, z) updates the program counter as pc = pc + z. Note that this means that in the next cycle the word at position pc + z + 1 of RMEM will be fetched, as at the beginning of each round p > 1 the program counter is always increased by one.

* (jumpby_if, P, i, z) executes (jumpby, z) if P(REG[i]) = 1, where P : {0,1}^L → {0,1} is a Boolean predicate and i ∈ {0, ..., |REG| − 1}. Note that the predicate makes jumps conditioned only on words written on CPU registers and not on the random access memory.

3. For any input sequence, an execution of W in RAM R does not write any (new) instructions to the memory MEM and does not write on locations where instructions other than no_op are written. We point out that W is allowed to overwrite program data but is not allowed to insert new instructions into the memory. This will be useful for ensuring that our compiled code jumps to the correct memory locations.

4. The only instructions that access the random access memory RMEM are (read, i,j), (write, j, i) as described above, and the built-in load command that is executed at the beginning of each CPU cycle and loads the contents of MEM[pc] to a special CPU register. All other instructions define operations on values of the CPU registers.

[0094] We refer to a programming W which satisfies the above conditions as a non-self-modifying structured programming for R. Without loss of generality, we shall allow that any considered RAM has the above instructions as part of its CPU instruction set I. Observe that the above specification is sufficient for any structured program, i.e., a program which does not use "goto" commands but might use while and for loops. In fact, writing structured programs is considered good practice and most programmers avoid writing self-modifying code as it can be a source of bugs. Furthermore, classical results in programming languages imply that any program can be compiled to a structured program with a low complexity overhead.

[0095] In the following we include a formal description of our compiler. For completeness, we first describe a process, called Spread (see Figure 4A), which spreads out a structured programming to allow for enough space between the words to fit the key shares, and adds the extra "jump" instructions to preserve the right program flow. We then describe our compiler Compile1 (see Figure 4B) that uses the spread-out programming and replaces the no_op's with appropriate key shares. Spread is only one example of moving things around to create space for the key shares. More sophisticated techniques using ideas from the security and programming-languages literature can also be used.

[0096] Referring to Figure 4A, for any given n ∈ ℕ, the process Spread compiles a structured programming W into one which leaves n empty memory locations (filled with no_op instructions) between any two words of W and implements the same computation as W. Informally, Spread compiles W into W′ so that a RAM initiated with W′ always executes two rounds for emulating each round of a RAM initiated with W, i.e., one round for the corresponding operation of W and one extra round for the jump to the position of the next operation of W. To ensure that this one-to-two-round mapping is preserved throughout the execution, Spread writes jump commands only on words w′_i with i ≡ 1 (mod n + 2). Thus, if the i-th word of the programming W we are compiling was already a jump operation, then Spread replaces it with a (jump, 0) instruction (whose effect is the same as a no_op instruction) followed by the appropriately modified jump operation which points to the position of the next W word.
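As an illustration, here is a minimal Python sketch of the spreading step (word encodings and the jump format are invented for illustration; the real Spread, as described above, also rewrites pre-existing jump targets):

```python
NO_OP = "no_op"

def spread(program, n):
    """Place each original word at position i*(n+2), follow it with a jump
    to the next original word, and fill the remaining n slots with no_ops
    (later replaced by key shares by the compiler)."""
    out = []
    for i, word in enumerate(program):
        out.append(word)                          # slot 0 mod (n+2): original word
        out.append(("jump", (i + 1) * (n + 2)))   # slot 1 mod (n+2): flow-preserving jump
        out.extend([NO_OP] * n)                   # n empty slots for key shares
    return out
```

With n = 2, a two-word program occupies eight slots, and execution alternates between a program word and its jump, matching the two-rounds-per-round emulation stated in Lemma 1.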

[0097] The following lemma states the properties we desire from Spread.

[0098] Lemma 1. Let W be a self-restricted, non-self-modifying structured programming for a complete RAM R = (C, MEM). Then for any n ∈ ℕ, the algorithm Spread(W, n) outputs a self-restricted, non-self-modifying structured programming W′ = (w′_1, ..., w′_m′) for a RAM R′ = (C, MEM′) with |MEM′| = (n + 2)|MEM| satisfying the following properties:

1. W′ is a Q-bounded emulation of W, where Q(p) = 2p.

2. An execution of W′ on RAM R′ never accesses (reads or writes) memory locations MEM′[i] with i (mod n + 2) ∉ {0, 1}.

[0099] Given the process Spread we can now provide the detailed description of this example compiler Compile1 (see Figure 4B). The compiler translates a self-restricted, non-self-modifying structured programming W for a RAM R = (C, MEM) into a programming W′ for a RAM R′ = (C, MEM′), with |MEM′| = (k + 2)|MEM|. Recall that we assume that k is a multiple of L.

[00100] To complete the description of VDS V1, we describe the remaining three algorithms Challenge1, Response1, and Verify1. The algorithm Challenge1 chooses a random string x ∈ {0,1}^k and computes the challenge as an encryption of x with key K, i.e., c ← Challenge1(x, K) = Enc_K(x). For achieving security without self-responsiveness, we can take Enc to be the one-time pad, i.e., Enc_K(x) = x ⊕ K. The corresponding algorithm Response1 works as follows. On input the challenge c = Enc_K(x) and the compiled programming W′, it reconstructs the sharing of K by summing up all its shares as retrieved from W′ and outputs a decryption of the challenge under the reconstructed key, i.e., outputs y = c ⊕ K (see Figure 4C). The corresponding verification algorithm Verify1 simply verifies that y = x.
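A toy sketch of the V1 algorithms follows (byte strings stand in for bit strings, and the share count is chosen arbitrarily; Enc is the one-time pad, as in the text):

```python
import secrets

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def share(key, n):
    """Additive (XOR) sharing of `key` into n shares, as interleaved by Compile1."""
    shares = [secrets.token_bytes(len(key)) for _ in range(n - 1)]
    acc = key
    for s in shares:
        acc = xor(acc, s)
    return shares + [acc]          # the XOR of all shares equals `key`

def challenge1(x, key):            # Challenge1: c = Enc_K(x) = x XOR K
    return xor(x, key)

def response1(c, shares):          # Response1: reconstruct K from shares, output y = c XOR K
    k = bytes(len(c))
    for s in shares:
        k = xor(k, s)
    return xor(c, k)

def verify1(x, y):                 # Verify1: accept iff y = x
    return y == x
```

An intact memory passes the check, while a virus that overwrites even part of one share makes reconstruction, and hence verification, fail.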

[00101] Intuitively, the security of the above scheme follows from the fact that an adversary who injects a linear virus as a single block on the RAM's memory will have to overwrite a linear number of bits of some key share, which will make it infeasible to correctly decrypt the challenge and thus pass the VDS check. Note that here we use a non-standard property of the additive sharing, which states that an adversary who erases bits of any share has probability at most 2^{-Ω(k)} of recovering the secret, even knowing all the remaining sharing information.
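The blinding property can be checked directly: overwriting part of any single XOR share shifts the reconstructed secret by exactly the destroyed share bits, which the adversary cannot know without first reading them. A toy demonstration with bytes (share layout illustrative):

```python
import secrets

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

# a 3-out-of-3 additive sharing of a secret key
secret = secrets.token_bytes(8)
s1, s2 = secrets.token_bytes(8), secrets.token_bytes(8)
s3 = xor(xor(secret, s1), s2)          # s1 XOR s2 XOR s3 == secret

# a "virus" overwrites the first four bytes of share s3 with zeros
tampered = bytes(4) + s3[4:]

# the reconstruction is now the secret blinded by the destroyed share bytes
reconstructed = xor(xor(s1, s2), tampered)
```
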

[00102] Theorem 1. Assuming R is a complete RAM family, if the adversary injects a virus virus with |virus| ≥ 3 on consecutive locations of the random access memory, then the VDS V1 is secure for R without self-responsiveness with respect to the class of all non-self-modifying structured programmings.

[00103] Proof. We prove each of the properties of Definition 10 separately, namely, verification correctness, compilation correctness, and security in the attack game without self-responsiveness (Figure 2C).

[00104] Compilation correctness: This property follows immediately from Lemma 1, which ensures that Definition 7 is satisfied for Q(p) = 2p and for the function T computed by Spread^{-1} (see Figure 4D).

[00105] Verification correctness: Lemma 1 ensures that the execution of the compiled programming W′ never accesses the key shares. Thus the reconstruction algorithm will succeed in reconstructing the correct key; hence verification correctness of V1 follows from the correctness of decryption of the one-time pad cipher.

[00106] Security in the attack game without self-responsiveness: Let c be the challenge as generated by algorithm Challenge1. Any adversary Adv that passes the verification test with probability p can be used by an adversary Adv′ to break the security of one-time pad encryption. More precisely, consider the following game capturing one-time pad encryption with key leakage between adversary Adv′ and a challenger Ch′:

Ch′ generates a uniform one-time pad key K = K_1 || ... || K_n, where each K_j ← {0,1}^L, chooses a uniformly random plaintext x ← {0,1}^k, computes y = x ⊕ K, and sends it to Adv′.

Adv′ sends Ch′ a non-empty proper subset I ⊊ {1, ..., n} of indices and for each i ∈ I receives K_i from Ch′.

Adv′ outputs a string x′ and wins iff x = x′.

[00107] It follows from the perfect privacy of the one-time pad that the probability that the adversary wins in the above game is at most p = 2^{-L}, which is negligible, since at least one key word remains hidden and L is linear in k. In the following we show that if there exists an adversary Adv that wins in the attack game with noticeable probability, then there exists an adversary Adv′ which wins the above one-time pad with leakage game also with noticeable probability, which yields a contradiction.

[00108] Adversary Adv′ works as follows. It receives from its challenger Ch′ the challenge ciphertext y and initiates with Adv an execution of the attack game in which it plays the role of the challenger Ch, as follows. Upon receiving from Adv the virus virus = (a, (w′_1, ..., w′_v)) (recall that we consider a continuous virus which needs only to give the location a where the first word will be injected), Adv′ uses it to find the location of a key share which would be overwritten by virus in an execution of the attack game. In particular, let i* = min{i ∈ {a, ..., a + v} : i ∉ {0,1} (mod n)} and let i_n := i* mod n. Adv′ sends its challenger Ch′ the set I = {1, ..., n} \ {i_n} and receives the key words K_1, ..., K_{i_n−1}, K_{i_n+1}, ..., K_n. Adv′ sets K_{i_n} = 0^L and sets K = K_1 || ... || K_{i_n−1} || K_{i_n} || K_{i_n+1} || ... || K_n. Adv′ emulates the execution of Ch with the inputs received from Adv, computes the response y′ from Step 4c, and outputs x′ = y ⊕ y′.

[00109] We next argue that if Adv succeeds in his attack with probability p, then Adv′ also succeeds with probability at least p. To this direction, consider the hybrid setting where the i_n-th key word is the all-zero word, i.e., K is exactly the key assembled above. It is easy to verify that the success probability of Adv in this hybrid experiment is identical to the corresponding probability in our original experiment described above. Indeed, until the virus is injected, the key shares are not accessed. Thus Adv overwrites the value of at least one share of K_{i_n} obliviously of the actual value of K_{i_n}. However, as the sharing is additive, this (overwritten) share acts as a perfect blinding for K_{i_n}, and therefore after the injection the distribution of the memory MEM in the two hybrids is identical. Hence, the success probability of Adv is also the same in the two experiments. Thus, if Adv′ succeeds in outputting x′ such that x′ = y ⊕ K in the hybrid with probability p, he will also succeed in the original experiment with the same probability, and therefore in this case Adv′ wins with probability p.

5.2 A VDS with Self-Responsiveness

[00110] Our next step is to modify V1 so that it is secure (with self-responsiveness) in the repeated detection game for a RAM R, where R has a complete CPU C. Note that the algorithm Response1 above is deterministic and thus, by definition, can be simulated on such a RAM R. Towards this direction, prior to applying the above compilation strategy, we extend the given sound programming W for RAM R = (C, MEM) to also implement a corresponding response algorithm. In particular, given any (deterministic) response algorithm Response, we can embed into W a programming W_Resp with the following property: W_Resp is a self-restricted programming for R = (C, MEM) which computes Response (i.e., on input some (W, c), W_Resp computes y = Response(W, c) and writes it on its output register).

[00111] Embedding to W. The above embedding is done by appending W_Resp at the end of W. As long as both W and W_Resp are self-restricted, we can be sure that the execution of the one does not modify the part in RMEM where the other is stored. However, to achieve a self-responsive VDS we use a "threading" mechanism which enforces that, upon receiving a (check, c) input, the RAM interrupts its execution of W and jumps to an execution of W_Resp. As soon as the execution of W_Resp completes, the RAM should continue executing W from where it left off. We point out that modern architectures have several methods for implementing such a multi-thread computation. For completeness, in the following we describe a process that implements the above embedding in a single-processor CPU. The process adds a piece of code executing the following computation between any two instructions in W (or, alternatively, adds this code at a fixed location and puts a jump pointing to it between any two words of W):

[00112] The code checks whether the last string written on the input register is of the form (check, c) for some c ∈ {0,1}^k. If this is not the case it does nothing; otherwise:

1. It stores the entire state of the CPU on the first non-used portion of the memory MEM.

2. It jumps to the part of the memory where W_Resp is stored (hence the next step is to execute the response procedure).

3. Once R finishes executing W_Resp, it restores the CPU (except input and output registers) to its state before the execution of W_Resp by copying back the state as saved to MEM in Step 1 above.

[00113] As the CPU C is complete, there exists a programming W_check implementing the above embedding procedure, which does the conditional thread-switching between W and W_Resp. In the following we denote by Emb(W, W_Resp) the programming for R = (C, MEM) which computes W with W_Resp embedded into it using W_check as described above.

[00114] It might seem like the new programming Emb(W, W_Resp) takes care of the self-responsiveness issue, but this is not the case, as applying the compiler Compile1 from the previous section to Emb(W, W_Resp) does not yield a secure VDS in the repeated detection model. Informally, the reason is that the resulting scheme only detects attacks (virus injections) that occur outside the execution of the Response code W_Resp. In more detail, a possible adversarial strategy is to inject the virus while the RAM is executing the Response algorithm, and in particular as soon as the key K has been reconstructed on the CPU and is about to be used to decrypt the challenge. By attacking at that exact point, the adversary might be able to inject an "intelligent" virus onto MEM while the key is still in the CPU, restore the key back to the memory, and use it to pass the current as well as all future virus detection attempts.

[00115] To protect against the above attack we use the following technical trick. Instead of using a single key K of length k, we use a 2k-bit key K which is parsed as two k-bit keys K_od and K_ev via the following transformation. Let K = x_1 || ... || x_{2q}, where each x_i is a word (i.e., an L-bit string). WLOG we assume that k = Lq for some q. Then K_od is the concatenation of the odd-indexed words, i.e., the x_i's with i ≡ 1 mod 2, and K_ev is the concatenation of the even-indexed words. Now the challenge algorithm outputs a double encryption of z with keys K_od and K_ev, i.e., c = Enc_{K_od}(Enc_{K_ev}(z)).

[00116] In order to decrypt, the response algorithm does the following. First it reconstructs K_od by XOR-ing the appropriate shares and uses it to decrypt c, thus computing Enc_{K_ev}(z). Subsequently, it erases K_od from the CPU register (the standard way of doing so is by filling the register where K_od is stored with 0's) and, after the erasure completes, it starts reconstructing K_ev and uses it (as above) to decrypt Enc_{K_ev}(z) and output y = z.
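A sketch of the key parsing and the two-stage decryption order follows. XOR stands in for Enc purely to show the data flow (as the text goes on to explain, the actual scheme must use a leakage-resilient cipher, not a one-time pad); word size and key lengths are illustrative:

```python
def parse_key(K):
    """Parse K = x_1 || ... || x_2q (one byte per word, illustratively) into
    K_od (odd-indexed words) and K_ev (even-indexed words)."""
    K_od = bytes(K[i] for i in range(0, len(K), 2))   # x_1, x_3, ...
    K_ev = bytes(K[i] for i in range(1, len(K), 2))   # x_2, x_4, ...
    return K_od, K_ev

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def response2_sketch(c, K_od, K_ev):
    """Decrypt the outer layer with K_od, erase K_od, then decrypt the inner
    layer with K_ev -- the two keys are never held simultaneously."""
    inner = xor(c, K_od)      # Dec_{K_od}(c) = Enc_{K_ev}(z)
    K_od = bytes(len(K_od))   # erase K_od (fill its register with 0's)
    return xor(inner, K_ev)   # Dec_{K_ev}(inner) = z
```
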

[00117] Intuitively, the reason why the above idea protects against an adversary attacking even during a detection-procedure execution in the repeated detection game is that, in order to correctly answer the challenge, the virus needs both K_od and K_ev. However, the keys are never simultaneously written in the CPU. Thus if the adversary injects the virus before K_od is erased and overwrites ℓ bits of K, then he will overwrite ℓ/2 bits from a share of K_ev (which at that point exists only in the memory). Thus he will not be able to decrypt the challenge. Otherwise, if the adversary injects the virus after K_od has been erased from the CPU, then he will overwrite ℓ/2 bits from a share of K_od. In this case he will successfully pass this detection attempt, but will fail the next detection attempt. The detailed description of the compiler Compile2 is given in Figure 5A.

[00118] It might seem as if we are done, but this is again not the case, because the above argument cannot work if we instantiate Enc(·) with one-time-pad encryption as in the previous section. To see why, assume that we use c = z ⊕ K_od ⊕ K_ev. Because the input register is read-only (for the RAM), once c is given it cannot be erased. Furthermore, from c and y = z (which is the output of Response) one can recover K_od ⊕ K_ev = c ⊕ y. Now if the adversary injects its virus after y is computed but while K_ev is still on the CPU's register, he can easily recover both K_od and K_ev and answer all future challenges in an acceptable manner. Thus we cannot use the one-time pad encryption as in the VDS V1. In fact, we cannot even use a standard CCA2-secure cipher for Enc(·), as the virus will have access to a big part of the private key, and standard CCA2 encryption does not account for that (recall that we only require the virus to overwrite a portion of a key share). Therefore we make use of a leakage-resilient encryption scheme which is secure as long as the adversary's probability of guessing the key, even given the leakage, is negligible. A (public-key) encryption scheme satisfying the above property (with CPA security) was suggested in Yevgeniy Dodis, Shafi Goldwasser, Yael Tauman Kalai, Chris Peikert, and Vinod Vaikuntanathan. Public-key encryption schemes with auxiliary inputs. In Daniele Micciancio, editor, TCC 2010, volume 5978 of LNCS, pages 361-381, Zurich, Switzerland, February 9-11, 2010. Springer, Berlin, Germany [Dodis], which is incorporated herein by reference. We point out that the compiler Compile2 will only use the secret key (i.e., the secret key plays the role of K in Figure 5A) and ignores the public key. We refer to the corresponding challenge and response algorithms instantiated with such a leakage-resilient double encryption scheme as Challenge2 and Response2, respectively.

[00119] For completeness, we describe Challenge2 in Figure 5B. We need, however, to be careful in the design of Response2. It is important for the security of Response2 that (1) it does not modify memory (RMEM) locations where key shares are written (as otherwise it will not be possible to answer any future detection attempt) and (2) it does not copy key shares to other memory locations (as this ensures that the key shares that the virus overwrites are not recoverable). Therefore we suggest an explicit construction of Response2 satisfying the above properties (see Figure 5C), but remark that there are several (potentially more efficient) ways to achieve this task.

[00120] An important point is to ensure that, while compiling W_Resp, the compiler Compile2 still allows it to read the keys. Recall that by default, Spread makes all read commands read locations where words of the compiled programming are stored and not key positions. We can resolve this by a simple technical trick. We assume that W_Resp uses a special read command (read_key, i, j, ℓ) for accessing the key shares, which reads the j-th word of the i-th key share and stores it in register REG[ℓ]. This extra (read_key, i, j, ℓ) instruction does not have to be part of the CPU instructions and can be easily implemented in machine code by a simple combination of read and jump commands. In the following we denote by V2 the VDS (Compile2, Challenge2, Response2, Verify2).

[00121] Theorem 2. Let R be a complete RAM and, for any given programming W, let Emb(W, W_Resp2) be the programming, as above, which is derived by embedding into W the programming W_Resp2 implementing Response2. Assuming that the adversary injects a virus virus with |virus| ≥ 3 on consecutive locations of the random access memory, and that the encryption scheme used in Compile2 is CPA secure even against an adversary who learns all but ℓ(k) = ω(log k) bits of the secret key, the VDS V2 is secure for R in the repeated detection model with respect to the class of all non-self-modifying structured programmings.

[00122] Proof. As in Theorem 1, we prove the properties separately.

[00123] Compilation correctness/Self-responsiveness: These properties follow immediately from Lemma 1. Indeed, for compilation correctness we note that, given that Emb(W, W_Resp) is self-restricted, Lemma 1 ensures that Definition 7 is satisfied for Q(p) = 2p and for the function T computed by Spread^{-1}. Furthermore, since Emb(W, W_Resp) is a programming which upon input (check, c) executes the algorithm Response2, self-responsiveness follows easily from compilation correctness.

[00124] Verification correctness: Lemma 1 ensures that the execution of the compiled programming W′ never accesses the key shares. Furthermore, any execution of the verification algorithm does not write anything over memory positions where key shares are stored. Thus verification correctness follows from the fact that, upon R receiving (check, c), the programming Emb(W, W_Resp) executes algorithm Response2, which correctly decrypts the double encryption of the random seed x. Thus the reconstruction process will succeed in reconstructing the correct keys. Hence verification correctness of V2 follows from the correctness of decryption of the encryption scheme used by Challenge2 and Response2.

[00125] Security in the repeated detection game: Similarly to Theorem 1, we prove that an adversary Adv that passes the verification test with probability p can be used by an adversary Adv′ to break the CPA security with length-bounded leakage (in the terminology of [Dodis], "ℓ(k)-LB CPA" for some linear ℓ) of the encryption scheme. We recall that the ℓ(k)-LB CPA security game ensures that the scheme is secure even when the adversary learns a linear number of bits of the secret key.

[00126] Adversary Adv′ interacts with its "ℓ(k)-LB CPA" challenger Ch′ as follows. Adv′ initiates with Adv an execution of the repeated detection game in which it plays the role of the challenger Ch as follows. Upon receiving from Adv the virus virus = (a, (w′_1, ..., w′_v)) (recall that we consider a continuous virus which needs only to give the memory location a where the first word will be injected), Adv′ uses it to find the location of a key share which would be overwritten by virus in an execution of the game. In particular, let i* = min{i ∈ {a, ..., a + v} : i ∉ {0,1} (mod 2n)} and let i_n := i* mod n, and without loss of generality assume that i* is odd (the case where i* is even can be handled symmetrically).

Adv′ instructs its "ℓ(k)-LB CPA" challenger Ch′ to generate the secret/public key pair (SK, PK). Let SK = SK_1 || ... || SK_{n/2}, where each SK_i ∈ {0,1}^L, for the ℓ(k)-LB CPA security game. (Recall that n = k/L.)

Adv′ sets SK_od,i_n = 0^L and SK_od,i = SK_i for every i ≠ i_n (these words can be obtained from Ch′ via the leakage query). Denote the odd key as SK_od = SK_od,1 || ... || SK_od,n/2.

Adv′ runs the key generation algorithm for the "ℓ(k)-LB CPA" scheme to generate another pair (SK_ev, PK_ev) of secret/public keys. Parse SK_ev as SK_ev = SK_ev,1 || ... || SK_ev,n/2, where each SK_ev,i ∈ {0,1}^L.

Adv′ sets SK = SK_od,1 || SK_ev,1 || ... || SK_od,n/2 || SK_ev,n/2.

[00127] Given the above key SK, Adv′ emulates the execution of Ch with the inputs (virus) received from Adv as follows. Up to round p_pre, Adv′ runs exactly the program of Ch, with the only difference that for each d-th iteration of the detection procedure, Adv′ chooses the uniformly random message x and computes the challenge as the (double) encryption c = Enc_{PK_od}(Enc_{PK_ev}(x)), writes it on the input tape, and writes x to the output tape. Since W_Resp2 is given and the VDS is compilation-correct, Adv′ can keep track of the number of rounds that Response2 needs for replying. At round p_att, Adv′ injects the virus it received from Adv onto the simulated RAM exactly as Ch would, and continues its emulation until the first invocation of the detection procedure that starts after round p_att. In that invocation, Adv′ chooses uniformly random x_0 and x_1, computes m_0 = Enc_{PK_ev}(x_0) and m_1 = Enc_{PK_ev}(x_1), and submits m_0, m_1 to its left-or-right oracle for the "ℓ(k)-LB CPA" game. Adv′ receives from the oracle c_b = Enc_{PK_od}(m_b) and is supposed to guess b. For this, Adv′ does the following. It continues the emulation of the RAM execution where the challenge is c = c_b and receives y. If y = x_{b′} for some b′ ∈ {0, 1}, then Adv′ outputs b′; otherwise it aborts.

[00128] We next argue that if Adv succeeds in his attack with probability p, then Adv′ also succeeds in the "ℓ(k)-LB CPA" game with probability at least p. To this direction, consider the hybrid setting where the i_n-th word of SK_od is the all-zero word, i.e., SK_od is exactly the key assembled above. The success probability of Adv in this hybrid experiment is identical to the corresponding probability in the original experiment. Indeed, until the virus is injected, the key shares are not changed, and all the challenges and responses are consistent. (This is where we use the fact that the scheme is public-key, as Adv′ has an encryption oracle, which is not the case in definitions of symmetric encryption with auxiliary input.) Thus Adv overwrites the value of at least one share of SK_od,i_n obliviously of its actual value, which means that it makes no difference for Adv's output whether in that position there was an actual share of SK_od,i_n or the all-zero word. Now, we observe that when Adv succeeds in correctly decrypting the challenge, Adv′ also succeeds in correctly guessing the bit b. Hence, if the VDS is insecure, i.e., Adv succeeds with noticeable probability, then Adv′ also succeeds with noticeable probability, which contradicts the "ℓ(k)-LB CPA" security of the encryption scheme.

5.3 A VDS with Standard Security

[00129] The next and final step in our construction is to compile the VDS V2 from the previous section, which is secure in the repeated detection model, into a corresponding VDS which is secure in the standard model. In fact, the transformation is to a large extent generic and can be applied to most self-responsive VDSs, as long as the algorithm Response can be described as an invertible RAM program, i.e., a program which at the end of its execution leaves the memory in the exact same state it was in at the beginning. Note that, given sufficient memory, this is the case for the Response algorithms described here, as they only need to compute an exclusive-OR of the key shares and then use it to decrypt the challenge. Modern CPUs can perform these operations without ever changing the contents of the memory.

[00130] Definition 11 (Invertible Programming). We say that a programming is invertible for a RAM R = (C, MEM) if the state of MEM at the end of its execution is identical to the state of MEM at the beginning of its execution.

[00131] Our transformation assumes a hash function h(·) which can be efficiently computed by an invertible programming on the RAM R. This is the case for most known hash functions in contemporary CPUs. It further assumes that Response2 is also implemented by an invertible programming. Both these assumptions are clearly true assuming sufficient memory. Under the above assumptions, we convert the VDS V2 = (Compile2, Challenge2, Response2, Verify2), which is secure in the repeated detection model, into a VDS V3 = (Compile3, Challenge3, Response3, Verify3), secure in the single detection model, as follows:

[00132] 1. The algorithm Response3 works as follows on input a challenge c and a programming W:

(a) It executes the invertible programming for h(·), which computes (on the CPU) a complete hash y_h = h(W||c) of the memory RMEM concatenated with the challenge ciphertext, outputs y_h to the user, and erases it from the CPU. This will ensure that, even if one uses this scheme for repeated detections, the hash will be different in each invocation of the detection process.

(b) It executes the invertible programming for Response2 and outputs its output y = Response2(c, W).

(c) It executes again the invertible programming for h(·), computes again a complete hash y′_h = h(W′||c), where W′ denotes the current memory RMEM, and outputs y′_h.

[00133] 2. The algorithm Compile3 is the same as Compile2 but uses Response3 instead of Response2, i.e., it compiles the programming Emb(W, W_Resp3).

[00134] 3. Challenge3 is the same as Challenge2.

[00135] 4. Verify3 is modified as follows: Let K and z denote the inputs of Challenge3 in the standard attack game (Figure 2A), and let y_h, y, and y′_h denote the outputs of the detection procedure. Then Verify3(K, z, (y_h, y, y′_h)) outputs 1 if y = z and y_h = y′_h; otherwise it outputs 0.
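The steps above can be sketched as follows (SHA-256 stands in for the random-oracle hash h, and `response2` is assumed to be an invertible routine that leaves `mem` unchanged; names are illustrative):

```python
import hashlib

def h(data: bytes) -> bytes:          # stand-in for the random-oracle hash h(.)
    return hashlib.sha256(data).digest()

def response3(c, mem, response2):
    y_h = h(bytes(mem) + c)           # (a) hash memory || challenge, then erase from CPU
    y = response2(c, mem)             # (b) run the invertible Response2
    y_h2 = h(bytes(mem) + c)          # (c) hash the (restored) memory again
    return y_h, y, y_h2

def verify3(z, outputs):
    y_h, y, y_h2 = outputs
    return y == z and y_h == y_h2     # accept iff response correct and hashes agree
```

If the response routine is not invertible (i.e., it leaves the memory changed), the two hashes disagree and verification fails.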

[00136] Theorem 3. Let R be a complete RAM. If the adversary injects a virus virus with |virus| ≥ 3 on consecutive locations of the random access memory, and the encryption scheme used in Compile3 is CPA secure even against an adversary who learns all but εk bits of the secret key for any constant ε < 1, then the VDS V3 is secure for R in the random oracle model with respect to the class of all non-self-modifying structured programmings.

[00137] Proof (sketch). The verification correctness, the compilation correctness, and the self-responsiveness of V3 are argued analogously to Theorem 2. In the following we argue that the scheme is secure. If the adversary plants the virus before the first evaluation of the hash function has been erased (i.e., before Step 1b of Response3 starts), then the security of V2 in the repeated detection model ensures that the adversary will be caught with overwhelming probability (since Step 1b invokes Response2, which guarantees to catch viruses injected before it starts). Otherwise, if the virus injects itself after y_h has been erased, then in order to produce a valid reply, the virus will need to compute y_h. Because h(·) behaves as a random oracle, and, by assumption, the adversary overwrites at least L = Ω(k) bits of a key share which he cannot recover, the probability that the virus outputs y_h is negligible in k.

5.4 A VDS Secure against Short Viruses

[00138] The constructions from the previous sections are secure against sufficiently long viruses (e.g., more than two words in the examples given). Nonetheless, as most modern CPUs do not use more than 256 different instructions, a single byte is sufficient for encoding all instructions. Thus we can optimize the size and/or security of the scheme in several ways. For example, we can pack more than one compiled instruction in a single word (e.g., put both the original word w_i and the corresponding jump instruction), thus allowing our scheme to detect any virus of length more than a single word. Or we can use the key sharing trick, where the last L − 8 bits of each word are also used as key shares. This would provide 2^{−(L−8)} security even against viruses that are a single word long, as such a virus would have to overwrite L − 8 bits of the secret key.

[00139] In this section we describe a construction (in the continuous injection model) which achieves security 2^{−Ω(k)} for any desirable value of the security parameter k, independent of the virus length, and in particular even when the virus affects only part of a single word (i.e., is a few bits long). To capture such viruses that affect only part of a word, we slightly modify the virus injection model to allow the virus to leave certain positions untouched. More precisely, a virus in the bit-by-bit injection model is, as before, a tuple virus = (a, W) = ((a_1, w_1), ..., (a_v, w_v)), where each a_i ∈ a is a location (address) in the memory, but each w_i ∈ W is a string w_i ∈ {0, 1, ⊥}^L, where ⊥ is a special symbol which does not modify the bit in the position where it is injected. I.e., when the word w_i = (w_{i,1}, ..., w_{i,L}) is injected at location MEM[a_i], if the word at location MEM[a_i] was w′_i = (w′_{i,1}, ..., w′_{i,L}) prior to the injection, then after the injection MEM[a_i] = w″_i = (w″_{i,1}, ..., w″_{i,L}), where w″_{i,j} = w_{i,j} if w_{i,j} ≠ ⊥ and otherwise w″_{i,j} = w′_{i,j}. The continuous injection assumption can also be expressed for the bit-by-bit case: a virus virus = (a, W) = (w_1, ..., w_v) is continuous in the bit-by-bit injection model if w_1 || ... || w_v ∈ {⊥}*||{0,1}*||{⊥}*. By convention, we refer to the number of non-⊥ elements in a virus virus as the number of bits of the virus. At times we call this quantity the effective size of virus.
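The bit-by-bit injection rule can be written out directly (⊥ is modeled as None; word length and bit encoding are illustrative):

```python
BOT = None   # the special symbol ⊥: leaves the underlying bit unchanged

def inject_word(old_word, virus_word):
    """Apply one virus word: positions carrying 0/1 are overwritten,
    positions carrying ⊥ keep the pre-injection bit."""
    return [v if v is not BOT else o for o, v in zip(old_word, virus_word)]

def effective_size(virus_words):
    """Number of non-⊥ elements: the effective size of the virus."""
    return sum(b is not BOT for w in virus_words for b in w)
```
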

[00140] Before formally describing our construction, let us discuss the difficulty in designing a VDS which tolerates a virus of arbitrarily small effective size. Depending on the actual programming we compile, an "intelligent" short virus that is injected at the position of the first operation executed might cause the RAM to jump to a location that allows the virus to take over the entire computation. This attack is admittedly unlikely, thus one might suggest that we look away from the above issue and hope that there are no intelligent viruses which are that small. This is not the approach we take here. Instead, to prevent such an attack and ensure that even such viruses will be caught, we use the following idea. We use a compiler similar to Compile3, but we include, for each program word and pair of key shares, a pair of message authentication codes (MACs) which we check every time we read a program word. For completeness, before giving the details of the compiler we introduce the MAC we are using.

[00141] A message authentication code (MAC) includes a pair of algorithms (Mac, Ver), where Mac : M × K → T is the authentication algorithm (i.e., for a message m ∈ M and a key vk ∈ K, t = Mac(m, vk) ∈ T is the corresponding authentication tag), and Ver : M × T × K → {0,1} is the verification algorithm (i.e., for a message m ∈ M, an authentication tag t ∈ T, and a key vk ∈ K, Ver(m, t, vk) = 1 if and only if t is a valid authentication tag for m with key vk). We let M include all strings of length at most k, K = {0,1}^{2k}, and T = {0,1}^k. Such a MAC can be constructed as follows. Let GF(2^k) be the field of characteristic two and order 2^k. (Every x ∈ GF(2^k) can be represented as a k-bit string and vice-versa.) Let also vk = vk_1 || vk_2 ∈ {0,1}^{2k}, where vk_i ∈ {0,1}^k for i ∈ {1,2}. Then Mac(m, vk) = vk_1 · m + vk_2, where + and · denote the field addition and multiplication in GF(2^k), respectively. (If |m| < k, then pad it with appropriately many zeros.) The verification algorithm simply checks that the message, tag, and key satisfy the above condition. An important property of the above MAC is that, for a given key vk, every m ∈ M has a unique authentication tag that passes the verification test. Furthermore, the probability of any (even computationally unbounded) adversary learning a MAC tag to guess the corresponding key is at most 2^{−k}.
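A toy instance of this MAC over GF(2^8) instead of GF(2^k) (the reduction polynomial is the one used by AES, chosen here only for illustration):

```python
POLY = 0x11B  # x^8 + x^4 + x^3 + x + 1, an irreducible polynomial for GF(2^8)

def gf_mul(a, b):
    """Carry-less ("Russian peasant") multiplication in GF(2^8), reducing by POLY."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= POLY
        b >>= 1
    return r

def mac(m, vk1, vk2):
    """Mac(m, vk) = vk1 * m + vk2 in GF(2^8); field addition is XOR."""
    return gf_mul(vk1, m) ^ vk2

def ver(m, t, vk1, vk2):
    return mac(m, vk1, vk2) == t
```

Because multiplication by a nonzero vk1 is a bijection of the field, each message has a unique valid tag under a given key, matching the uniqueness property stated above.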

[00142] The compiler Compile4 is similar to compiler Compile3 from the previous section. (Recall that Compile3 maps each w_i ∈ Emb(W, W_Resp) to a pair of commands w_i, w_{i+1}, where w_{i+1} is the corresponding jump command.) More concretely, Compile4 compiles the original programming W with the response algorithm Response4 (see below) embedded in it as in the previous section (i.e., it compiles Emb(W, W_Resp4) into a new programming W' which interleaves additive shares K_od^{i,i+1} and K_ev^{i,i+1} of two encryption keys K_od and K_ev between any two program words w_i and w_{i+1} of Emb(W, W_Resp4)) and adds the appropriate jump command to ensure correct program flow. The difference is that Compile4 also adds MAC tags t_od^i = Mac(w_i || "jump", K_od^{i,i+1}) and t_ev^i = Mac(w_i || "jump", K_ev^{i,i+1}). To visualize it, Compile4 expands every word w_i ∈ W as w_i || "jump" || K_od^{i,i+1} || K_ev^{i,i+1} || Mac(w_i || "jump", K_od^{i,i+1}) || Mac(w_i || "jump", K_ev^{i,i+1}). Note that since each key share is of size 2k and each tag is of size k, in order for Spread to leave sufficient space for the above keys and MACs it will need to be executed with a correspondingly larger parameter n.
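The expansion performed by Compile4 can be modeled with the following sketch. This is an illustrative toy model, not the claimed compiler: HMAC-SHA256 stands in for the information-theoretic MAC of paragraph [00141], the jump command is abbreviated as the literal b"jump", and the 32-byte share size is an assumption.

```python
import hashlib
import hmac
import secrets

def toy_mac(msg: bytes, key: bytes) -> bytes:
    # Stand-in for the information-theoretic MAC of paragraph [00141].
    return hmac.new(key, msg, hashlib.sha256).digest()

def compile4_layout(program):
    """Expand each program word w_i into the sequence
    w_i || "jump" || K_od share || K_ev share || t_od || t_ev."""
    memory = []
    for word in program:
        payload = word + b"jump"
        k_od = secrets.token_bytes(32)  # fresh additive share of K_od
        k_ev = secrets.token_bytes(32)  # fresh additive share of K_ev
        memory.append([payload, k_od, k_ev,
                       toy_mac(payload, k_od), toy_mac(payload, k_ev)])
    return memory
```

Any contiguous overwrite in this layout must either cross a full key share or break the binding between a word and its two tags, which is the case analysis of the security proof below.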

[00143] As we prove, if the programming is compiled as above, then any virus, no matter how small, which is written consecutively over any positions of the above sequence, even in the bit-by-bit injection setting, will either have to overwrite a large number of bits of some key K_i, or will create an inconsistency in at least one of the MACs with overwhelming probability. However, to exploit the power of the MACs we need to make sure that during the program execution, before loading any word to the CPU, we first verify its MACs. To this end, the CPU loads values from the memory RMEM to its registers via a special load instruction read_auth, which first verifies both MACs of the word in the given memory location with the corresponding keys, and only if the MACs verify does it keep the word on the CPU. If the MAC verification fails, then read_auth deletes at least one of the key shares from the memory, thus introducing an inconsistency that will be caught by the detection procedure.

[00144 ( A description of instruction (read_auth, is given in Figure 6A. Note that (read„auth, makes no tamper-resiliency assumption on the memory or the CPU, It only needs for the architecture to provide us with this simple piece of microcode. For example, the new architecture that Intel recently announced promises to allow for support of such a microcode in its next generation microchips. Moreover, we do not require that the set of instructions excludes the standard read.

[00145] Note that since the original programming f might include read instructions, the compiler replaces them by (read_auth, i,j) instructions. Furthermore, to ensure that the compiled program will not introduce inconsistencies, we replace in W every (write, /, i) instruction (which writes the contents of register j in RMEM location i) with microcode which, when writing a word in the memory it also updates the corresponding MAC tags. We denote this special instruction as write_auth and describe it in detail in Figure 6B.

100146] A formal description of the compiler Compile,* is derived along the lines of Compile;, with the above modifications. We next turn to the other three algorithms of the VDS. Challenge^ and Verify 4 are identical to the corresponding algorithms from the previous section with the only difference that the keys used in Challenge 4 are each of size 2k instead of k. Response 4 works as Response^ with the following difference. After the first evaluation of the hash function, and before doing anything else, Response 4 scans the entire memory to ensure that ail MAC tags are correct. After this check, Response 4 continues as Responses would. Observe that for all (read, · , *) instructions in W Resp the compiler Compile 4 will replace then with corresponding (read_auth, * , * ) instructions. However, it will not touch the (readjkey, i,j,l) instructions used for reading the keys. Therefore, the keys need not be authenticated. In the following we denote the

VDS 7 4 = (Compile 4 , Challenge 4) Response 4 , Verify 4 ).

[00147] Theorem 4. Let 31 be a complete RAM. If the adversary injec ts a virus virus on consecutive locations of the random access memory and the encryption scheme used in Compile ? , is CPA secure even against an ad v ersar who learns any £k bits of the secret key for any constant e< 1, then the VDS V 4 = (Compile 4 , Chaiienge 4 , Response 4 , Verify 4 ).is secure for 31 with respect to the class of all non-self-modifying structured programmings.

[00148] Proof (sketch). The verification correctness, the compilation correctness, and the self- responsiveness of 1 7 4 are argued analogously to Theorem 3. In the following we argue that the scheme is secure. We consider the following cases about the bit-length γ (i.e., the number of bi ts) of virus, where for simplicity and in slight abuse of notation, in the following we write W; to denote the concatenation of the word and its corresponding "jump" instruction in the compiled program, (Recall that we assume that the virus writes itself on continuous locations, but might cover parts of different, consequent, word, e.g., might start at the end of some word and continue to the beginning of the following word.)

[00149] Case 1 : The virus injection affects at least two (consecutive) program words w,- and w i+1 . In this case, the virus will overwrite all the key material between H¾ and w l+1 . This makes it information theoretically impossible for the attacker to recover the key thus, similarly to the proof of Theorem 3, the attacker will fail in the detection procedure.

[00150} Case 2: The virus affects exactly one program word w / . First we observe that if the virus extends only to the left of w;, or only writes itself on top of w¾ i.e., does not affect the keys used to compute i-v MACs, then the security of the MAC scheme ensures that with overwhelming probability, the MAC will fail. Hence as soon as w„ is read, the virus will be exposed. Note that at the latest during the detection process w¾ will be read as Response, first scans the entire memory for MAC failures and then invokes Response;,.

[00151] For the remainder of this proof assume that the virus extends to the right of w,-. Using (he notation from our description, the words to the right of n¾ are K^ , Kl'^ 1 , t o l A = Mac(w £ , fC j ' d " 5 )' a«d t e l y = Mac(w/j, K^ +1 ). Since the virus modifies w t , if it leaves fCe'v* 3 and tg V untouched, it will fail the MAC test. Thus the virus needs to at least modify Kg'v +1 - However, by the continuous injection assumption, this implies that the virus needs to completely overwrite K^ 1 (as it is written in-between m and K^ + 1 ). But, in this case it will be information theoretically impossible for the attacker to recover this key share as the only value in the memory that is related to this key share is the MACs Mac(K¾ Κ^' ' 1 ). Thus, similarly to the proof of Theorem 3, the attacker will fail in the detection procedure.

[00152] Case 3: The virus injection does not modify any program word (i.e., it only overwrites keys and/or MAC tags). By the continuous injection assumption, this implies that the virus is injected to the space between two program words wi, and WM. Using our notation, let i< 'i +1 , t o l d = Mac(n¾ and t e l v = Mac(n^, K^ +1 ) denote the memory contents between w¾ and wm. We consider the following cases: (A) the virus overwrites part of

or ' l , and (B) the vims overwrites only MACs. In case (A) we point out that the virus is never executed since by construction the program counter will never point to key locations for executing an instruction. Since the key shares are part of an additive sharing of the decryption keys, any change to any one of the shares also changes the decryption key thus, by the correctness of decryption of the underlying scheme, with high-probability the response algorithm will return a false decryption and the virus will be caught. In case (B) things are even more simple, since the injective nature of our MAC ensures that the tags are uniquely defined by the word w, and the keys and K^, +1 . Thus any modification to the MAC tag will be detected once w ; is read.

6. FURTHER EXTENSIONS

[00153 [ Although the detailed description contains many specifics, these should not be construed as limiting the scope of the invention but merely as illustrating different examples and aspects of the invention. It should be appreciated that the scope of the invention includes other embodiments, optimizations and extensions not discussed in detail above, some of which are discussed below.

[00154] Key share insertion. The example solutions described above inserted one or two key shares between every two words of original programming. Key shares can also be interleaved with original program words in other ways. For example, the original programming might be spread so thai up to N original program words can still be continguous, where N is some preselected integer. N=l in the above examples, but it could also be greater than 1. Conversely, key share insertion may be limited so that not more than M key shares are inserted between any two adjacent original program words. M=l or 2 in the above examples, but could take on other values. In another approach, key shares can be inserted in a random fashion into the original programming, especially if the insertion is such that an injection of malware will disturb the key shares with high probability. The key shares also do not all have to be from a single secret key; they can be key shares from a set of secret keys.

[00155] Allowing injection on CPU registers. The example solutions described above protect against injection of malware on the memory RMEM but not on the CPU . Nonetheless, the general VDS V 3 which protects against arbitrarily small and non-continuous injection can be adapted to protect also against injection directly on the CPU registers. One way for this would be as follows: apply the compilation to the entire memory of the system, including the CPU registers. The reconstruction will then require the machine to read also the key shares written on these registers.

[00156] Compiling new software from the hard drive. As described, our virus injection mechanisms treat the RAM as a closed system, i.e., values are loaded to the memory RMEM once and no new data is loaded from the hard drive during the computation. This might seem somewhat restrictive, and we believe that it can circumvented by securing the contents of the hard drive using standard cryptographic mechanisms, i.e., encryption and message authentication. The executed software would then verify autheniicity of data loaded from the hard drive. This will cause some slowdown on loading more data from the hard drive, but need only be applied to critical data.

[00157] Siiblinear word size. Our results assume that the word length L is linear in the security parameter. This is realistic in modem computer architectures where the word size is 64 bits, as values for security parameter which is useful in practice are 128-256 bits. We point out however that our results can be easily extended to architectures with smaller word size. In such a scenario, a "not too short" virus (in the terminology of Section 5) would be one that is Z,w(log &)~bits long. Indeed, such a virus would have to overwrite at least w(\og k) bits from different key shares which would make the guessing probability of an adversary negligible in k.

[00158] Non-continuous injection. The VDS's described here are secure assuming the virus injects itself on consecutive memory locations but with appropriate adaptations our techniques can be extended to tolerate non-continuous injection. As an example, by randomizing the positions on which the code words are written between the key shares, we can achieve that the adversary injecting a moderate virus will overwrite a big part of a key share or, in case of the last MAC construction, will create a MAC inconsistency.

[00159] independent MAC Usage. As another variant, the concepts describe in Section 5.4 regarding the use of MACs can be implemented regardless of whether the virus detection based on interspersed key shares is also implemented. The reverse is also true. Although implementing only one or the other will result in less securit than implementing both, for many applications this level of security may be sufficient or other security measures may be additionally used to provide further security.

[00160] Various other modifications, changes and variations which will be apparent to those skilled in the art may be made in the arrangement, operation and details of the method and apparatus of the present invention disclosed herein without departing from the spirit and scope of the invention as defined in the appended claims. Therefore, the scope of the invention should be determined by the appended claims and their legal equivalents.

[00161] In alternate embodiments, the invention is implemented in computer hardware, firmware, software, and/or combinations thereof. Apparatus of the invention can be implemented in a computer program product tangibly embodied in a machine-readable storage dev ice for execution by a programmable processor; and method steps of the invention can be performed by a programmable processor executing a program of instructions to perform functions of the invention by operating on input data and generating output. The invention can be implemented advantageously in one or more computer programs that are executable on a programmable system including at least one programmable processor coupled to receive data and instructions from, and to transmit data and instructions to, a data storage system, at least one input device, and at least one output device. Each computer program can be implemented in a high-level procedural or object-oriented programming language, or in assembly or machine langitage if desired; and in any case, the language can be a compiled or interpreted language. Suitable processors include, by way of example, both general and special purpose microprocessors. Generally, a processor will receive instructions and data from a read-only memory and/or a random access memory. Generally, a computer will include one or more mass storage devices for storing data files; such devices include magnetic disks, such as internal hard disks and removable disks; magneto-optical disks; and optical disks. Storage devices suitable for tangibly embodying computer program instructions and data include ail forms of non- volatile memory, including by way of example

semiconductor memory devices, such as EPROM, EEPROM, and flash memory devices: magnetic disks such as internal hard disks and removable disks; magneto-optical disks; and CD-ROM disks. Any of the foregoing can be supplemented by, or incorporated in, ASICs (application-specific integrated circuits) and other forms of hardware.