Title:
METHOD AND A SYSTEM FOR SIDE-CHANNEL INFORMATION DEFENCE
Document Type and Number:
WIPO Patent Application WO/2024/062211
Kind Code:
A1
Abstract:
Methods, apparatus, and systems are provided for calculating a lower bound for a conditional minimum entropy per data value associated with a source for a pre-determined confidence value ε, such that the probability of the conditional minimum entropy per data value being greater than the lower bound is at least 1 - ε, wherein the source produces data values from a set of possible values, the method comprising: obtaining a first probability distribution function corresponding to each data value from the set produced by the source, wherein the first probability distribution function indicates the probability of measuring one or more specific side-channel values, when that data value is produced, wherein each specific side-channel value is associated with a different side-channel; obtaining a second probability distribution function, wherein the second probability distribution function indicates the probability that the source produces each data value from the set; calculating a conditional minimum entropy per data value associated with the one or more side-channels using each of the first probability distribution functions and the second probability distribution function, wherein the conditional minimum entropy per data value associated with the one or more side-channels represents the uncertainty of inferring a data value produced by the source given a specific side-channel measurement associated with each of the one or more side-channels; dividing a range of possible side-channel measurements of each of the one or more side-channels into intervals or subsets, wherein the number of intervals or subsets is equal to a discretization parameter; calculating a minimum of the conditional minimum entropy per data value associated with the one or more side-channels for each interval or subset; and calculating the lower bound for the conditional minimum entropy per data value associated with the source using the local minimum of the minimum entropy per data value associated with the side-channel for each interval or subset.

Inventors:
SHIU DANIEL (GB)
Application Number:
PCT/GB2023/052185
Publication Date:
March 28, 2024
Filing Date:
August 22, 2023
Assignee:
ARQIT LTD (GB)
International Classes:
H04L9/00; G06F7/58; H04L9/08
Other References:
GEHRING, Tobias et al.: "Ultra-fast real-time quantum random number generator with correlated measurement outcomes and rigorous security certification", arXiv.org, 13 December 2018 (2018-12-13), XP081634520
PETER, Matthias et al.: "A Proposal for Functionality Classes for Random Number Generators, Version 2.35 - DRAFT", 2 September 2022 (2022-09-02), pages 1-239, XP093026284, retrieved from the Internet [retrieved on 2023-02-22]
Attorney, Agent or Firm:
HILL, Justin John et al. (GB)
Claims:
CLAIMS

1. A computer-implemented method for calculating a lower bound for a conditional minimum entropy per data value associated with a source for a pre-determined confidence value ε, such that the probability of the conditional minimum entropy per data value being greater than the lower bound is at least 1 - ε, wherein the source produces data values from a set of possible values, the method comprising: obtaining a first probability distribution function corresponding to each data value from the set produced by the source, wherein the first probability distribution function indicates the probability of measuring one or more specific side-channel values, when that data value is produced, wherein each specific side-channel value is associated with a different side-channel; obtaining a second probability distribution function, wherein the second probability distribution function indicates the probability that the source produces each data value from the set; calculating a conditional minimum entropy per data value associated with the one or more side-channels using each of the first probability distribution functions and the second probability distribution function, wherein the conditional minimum entropy per data value associated with the one or more side-channels represents the uncertainty of inferring a data value produced by the source given a specific side-channel measurement associated with each of the one or more side-channels; dividing a range of possible side-channel measurements of each of the one or more side-channels into intervals or subsets, wherein the number of intervals or subsets is equal to a discretization parameter; calculating a minimum of the conditional minimum entropy per data value associated with the one or more side-channels for each interval or subset; and calculating the lower bound for the conditional minimum entropy per data value associated with the source using the local minimum of the minimum entropy per data value associated with the side-channel for each interval or subset.

2. The method of claim 1, wherein calculating the lower bound for the conditional minimum entropy per data value associated with the source comprises calculating a conditional probability distribution function, wherein the conditional probability distribution function indicates the probability of the source producing each data value from the set and inducing a side-channel measurement in a particular interval or subset for each of the one or more side-channels.

3. The method of claim 2, wherein calculating the lower bound further comprises calculating an optimizing parameter.

4. The method of claim 3, wherein the optimizing parameter is calculated using a divergence parameter.

5. The method of claim 4, wherein the divergence parameter is dependent on the total number of data values produced by the source, the pre-determined confidence value, the number of distinct possible types of data values produced by the source, and the discretization parameter.

6. The method of any preceding claim, wherein the lower bound is provided to a privacy amplifier able to condense the data provided by the source to produce condensed data, wherein the privacy amplifier compares the lower bound to a predetermined value of conditional minimum entropy per data value, and, based on the result of the comparison, uses the lower bound to control an amount of condensing of the data produced by the source used to produce the condensed data, such that a conditional minimum entropy per data value of the condensed data is closer to the predetermined value.

7. The method of claim 6, wherein the privacy amplifier uses universal hashing to control the amount of condensing of the data produced by the source.

8. The method of any of claims 6 to 7, wherein the predetermined value is equal to unity.

9. The method of any preceding claim, wherein obtaining the first probability distribution function corresponding to each data value comprises measuring a plurality of side-channel values corresponding to each of the one or more side-channels.

10. The method of claim 9, wherein measuring a plurality of side-channel values comprises measuring a plurality of side-channel values corresponding to each of the one or more side-channels for each data value from the set.

11. The method of claim 9 or claim 10, wherein obtaining the first probability distribution function further comprises inferring the first probability distribution function from the plurality of measured side-channel values corresponding to each of the one or more side-channels.

12. A computer-implemented method for calculating a confidence value ε of a lower bound for a conditional minimum entropy per data value associated with a source, such that the probability of the conditional minimum entropy per data value being greater than the lower bound is at least 1 - ε, wherein the source produces data values from a set of possible values, the method comprising: obtaining a first probability distribution function corresponding to each data value produced by the source, wherein the first probability distribution function indicates the probability of measuring one or more specific side-channel values associated with one or more side-channels, when that data value is produced, wherein each specific side-channel value is associated with a different side-channel; obtaining a second probability distribution function, wherein the second probability distribution function indicates the probability that the source produces each data value from the set; calculating a conditional minimum entropy per data value associated with the one or more side-channels using each of the first probability distribution functions and the second probability distribution function, wherein the conditional minimum entropy per data value associated with the one or more side-channels represents the average uncertainty of inferring a data value produced by the source given a specific side-channel measurement associated with each of the one or more side-channels; dividing a range of possible side-channel measurements of each of the one or more side-channels into intervals or subsets, wherein the number of intervals or subsets is equal to a discretization parameter; calculating a minimum of the conditional minimum entropy per data value associated with the one or more side-channels for each interval or subset; providing the lower bound for the conditional minimum entropy per data value associated with the source; and calculating the confidence value ε for the lower bound using the lower bound and the local minimum of the conditional minimum entropy per data value associated with the side-channel for each interval or subset.

13. The method of claim 12, wherein calculating the confidence value ε comprises calculating an optimizing parameter using the lower bound and the local minimum of the conditional minimum entropy per data value associated with the one or more side-channels for each interval or subset.

14. The method of claim 13, wherein calculating the confidence value ε further comprises calculating a divergence parameter using the optimizing parameter.

15. The method of claim 14, wherein the divergence parameter is dependent on the total number of data values produced by the source, the confidence value ε, the number of distinct possible types of data values produced by the source, and the discretization parameter.

16. The method of any of claims 12 to 15, wherein the lower bound is provided to a privacy amplifier able to condense the data provided by the source to produce condensed data, wherein the privacy amplifier compares the lower bound to a predetermined value of conditional minimum entropy per data value, and, based on the result of the comparison, uses the lower bound to control the amount of condensing of the data produced by the source used to produce the condensed data, such that a conditional minimum entropy per data value of the condensed data is closer to the predetermined value.

17. The method of claim 16, wherein the privacy amplifier uses universal hashing to control the amount of condensing of the data produced by the source.

18. The method of any of claims 16 to 17, wherein the predetermined value is equal to unity.

19. The method of any of claims 12 to 18, wherein obtaining the first probability distribution function corresponding to each data value comprises measuring a plurality of side-channel values corresponding to each of the one or more side-channels.

20. The method of claim 19, wherein measuring a plurality of side-channel values comprises measuring a plurality of side-channel values corresponding to each of the one or more side-channels for each data value from the set.

21. The method of claim 6 or claim 16, the method further comprising performing cryptography using the condensed data as a one-time pad.

22. The method of claim 6 or claim 16, the method further comprising using the condensed data as a fully secure cryptographic key for an encryption or authentication algorithm.

23. The method of claim 6 or claim 16, the method further comprising using the condensed data as a fully secure seed to a pseudo-random number generator.

24. The method of claim 6 or claim 16, the method further comprising using the condensed data as a fully secure blinding value for sensitive data.

25. The method of claim 6 or claim 16, the method further comprising using the condensed data as a fully secure, confidential commitment value.

26. A system for privacy-amplifying data, the system comprising a source for producing data; one or more sensors for measuring one or more side-channels associated with the source; a processor configured to perform the operations according to any one of claims 1 to 11 for calculating a lower bound on the probability of measuring a sequence of side-channel measurements; or configured to perform the operations according to any one of claims 12 to 20 for calculating a confidence value ε for a lower bound associated with making the sequence of side-channel measurements; and a privacy amplifier for condensing the data from the source to a smaller representation using the lower bound.

27. The system of claim 26, wherein the privacy amplifier is configured to use universal hashing to condense data produced by the source to a smaller representation.

28. A computer program comprising instructions which, when the program is executed by a processor, cause the processor to carry out the method of any of claims 1 to 20.

29. A computer-readable medium comprising computer code or instructions stored thereon, which when executed on a processor, causes the processor to perform the method according to any of claims 1 to 20.

Description:
METHOD AND A SYSTEM FOR SIDE-CHANNEL INFORMATION DEFENCE

[001] The present invention relates to a method for calculating a lower bound for a conditional minimum entropy per data value, a method for calculating a confidence value of a lower bound for a conditional minimum entropy per data value, and a system for privacy-amplifying data.

BACKGROUND

[002] Cryptographic processes typically require that some information belonging to a user or collection of users remain confidential and private from adversaries.

[003] This is achieved using various cryptographic algorithms that protect confidential information when data is transmitted over a conventional communications channel. These algorithms require auxiliary data about which an adversary has no knowledge. Often data sources produce non-uniform data and/or data which is not sufficiently confidential, and so their output can be subject to a "whitening" or "privacy-amplifying" process that shortens that data into a smaller representation that is both confidential and uniformly random to mitigate or negate the information available to an adversary. In addition to the source data being insufficiently confidential, confidentiality can be lost via unintended side-channels, and this can also be compensated for using whitening/privacy amplification. Accordingly, it is useful for confidentiality to be quantified, and it is also useful to quantify the information available to an adversary from side-channels.

[004] Existing strategies for measuring and subsequently mitigating side-channel information do so by calculating the mutual information between the confidential information and the side-channel. However, mutual information represents the average information loss, and more information loss is possible in cases that are more adverse than the typical case.

[005] The present invention is intended to provide a measure for the minimum entropy loss due to side-channel information as well as a confident upper bound for this loss that applies even in extremely adverse cases.

SUMMARY

[006] This Summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to determine the scope of the claimed subject matter; variants and alternative features which facilitate the working of the invention and/or serve to achieve a substantially similar technical effect should be considered as falling into the scope of the invention disclosed herein.

[007] In a first aspect, the present disclosure provides a method for calculating a lower bound for a conditional minimum entropy per data value associated with a source for a pre-determined confidence value ε, such that the probability of the conditional minimum entropy per data value being greater than the lower bound is at least 1 - ε, wherein the source produces data values from a set of possible values, the method comprising: obtaining a first probability distribution function corresponding to each data value from the set produced by the source, wherein the first probability distribution function indicates the probability of measuring one or more specific side-channel values, when that data value is produced, wherein each specific side-channel value is associated with a different side-channel; obtaining a second probability distribution function, wherein the second probability distribution function indicates the probability that the source produces each data value from the set; calculating a conditional minimum entropy per data value associated with the one or more side-channels using each of the first probability distribution functions and the second probability distribution function, wherein the conditional minimum entropy per data value associated with the one or more side-channels represents the uncertainty of inferring a data value produced by the source given a specific side-channel measurement associated with each of the one or more side-channels; dividing a range of possible side-channel measurements of each of the one or more side-channels into intervals or subsets, wherein the number of intervals or subsets is equal to a discretization parameter; calculating a minimum of the conditional minimum entropy per data value associated with the one or more side-channels for each interval or subset; and calculating the lower bound for the conditional minimum entropy per data value associated with the source using the local minimum of the minimum entropy per data value associated with the side-channel for each interval or subset.

[008] Preferably, calculating the lower bound for the conditional minimum entropy per data value associated with the source comprises calculating a conditional probability distribution function, wherein the conditional probability distribution function indicates the probability of the source producing each data value from the set and inducing a side-channel measurement in a particular interval or subset for each of the one or more side channels.

[009] Preferably, calculating the lower bound further comprises calculating an optimizing parameter.

[010] Preferably, the optimizing parameter is calculated using a divergence parameter.

[011] Preferably, the divergence parameter is dependent on the total number of data values produced by the source, the pre-determined confidence value, the number of distinct possible types of data values produced by the source, and the discretization parameter.

[012] Preferably, the lower bound is provided to a privacy amplifier able to condense the data provided by the source to produce condensed data, wherein the privacy amplifier compares the lower bound to a predetermined value of conditional minimum entropy per data value, and, based on the result of the comparison, uses the lower bound to control an amount of condensing of the data produced by the source used to produce the condensed data, such that a conditional minimum entropy per data value of the condensed data is closer to the predetermined value.

[013] Preferably, the privacy amplifier uses universal hashing to control the amount of condensing of the data produced by the source.

[014] Preferably, the predetermined value is equal to unity.

[015] Preferably, obtaining the first probability distribution function corresponding to each data value comprises measuring a plurality of side-channel values corresponding to each of the one or more side-channels.

[016] Preferably, measuring a plurality of side-channel values comprises measuring a plurality of side-channel values corresponding to each of the one or more side-channels for each data value from the set.

[017] Preferably, obtaining the first probability distribution function further comprises inferring the first probability distribution function from the plurality of measured side-channel values corresponding to each of the one or more side-channels.

[018] In a second aspect, the present disclosure provides a method for calculating a confidence value ε of a lower bound for a conditional minimum entropy per data value associated with a source, such that the probability of the conditional minimum entropy per data value being greater than the lower bound is at least 1 - ε, wherein the source produces data values from a set of possible values, the method comprising: obtaining a first probability distribution function corresponding to each data value produced by the source, wherein the first probability distribution function indicates the probability of measuring one or more specific side-channel values associated with one or more side-channels, when that data value is produced, wherein each specific side-channel value is associated with a different side-channel; obtaining a second probability distribution function, wherein the second probability distribution function indicates the probability that the source produces each data value from the set; calculating a conditional minimum entropy per data value associated with the one or more side-channels using each of the first probability distribution functions and the second probability distribution function, wherein the conditional minimum entropy per data value associated with the one or more side-channels represents the average uncertainty of inferring a data value produced by the source given a specific side-channel measurement associated with each of the one or more side-channels; dividing a range of possible side-channel measurements of each of the one or more side-channels into intervals or subsets, wherein the number of intervals or subsets is equal to a discretization parameter; calculating a minimum of the conditional minimum entropy per data value associated with the one or more side-channels for each interval or subset; providing the lower bound for the conditional minimum entropy per data value associated with the source; and calculating the confidence value ε for the lower bound using the lower bound and the local minimum of the conditional minimum entropy per data value associated with the side-channel for each interval or subset.

[019] Preferably, calculating the confidence value ε comprises calculating an optimizing parameter using the lower bound and the local minimum of the conditional minimum entropy per data value associated with the one or more side-channels for each interval or subset.

[020] Preferably, calculating the confidence value ε further comprises calculating a divergence parameter using the optimizing parameter.

[021] Preferably, the divergence parameter is dependent on the total number of data values produced by the source, the confidence value ε, the number of distinct possible types of data values produced by the source, and the discretization parameter.

[022] Preferably, the lower bound is provided to a privacy amplifier able to condense the data provided by the source to produce condensed data, wherein the privacy amplifier compares the lower bound to a predetermined value of conditional minimum entropy per data value, and, based on the result of the comparison, uses the lower bound to control the amount of condensing of the data produced by the source used to produce the condensed data, such that a conditional minimum entropy per data value of the condensed data is closer to the predetermined value.

[023] Preferably, the privacy amplifier uses universal hashing to control the amount of condensing of the data produced by the source.

[024] Preferably, the predetermined value is equal to unity.

[025] Preferably, obtaining the first probability distribution function corresponding to each data value comprises measuring a plurality of side-channel values corresponding to each of the one or more side-channels.

[026] Preferably, measuring a plurality of side-channel values comprises measuring a plurality of side-channel values corresponding to each of the one or more side-channels for each data value from the set.

[027] Preferably, the method further comprises performing cryptography using the condensed data as a one-time pad.

[028] Preferably, the method further comprises using the condensed data as a fully secure cryptographic key for an encryption or authentication algorithm.

[029] Preferably, the method further comprises using the condensed data as a fully secure seed to a pseudo-random number generator.

[030] Preferably, the method further comprises using the condensed data as a fully secure blinding value for sensitive data.

[031] Preferably, the method further comprises using the condensed data as a fully secure, confidential commitment value.

[032] In a third aspect, there is provided a system for privacy-amplifying data, the system comprising a source for producing data; one or more sensors for measuring one or more side-channels associated with the source; a processor configured to perform the operations according to the method of the first aspect for calculating a lower bound on the probability of measuring a sequence of side-channel measurements; or configured to perform the operations according to the method of the second aspect for calculating a confidence value ε for a lower bound associated with making the sequence of side-channel measurements; and a privacy amplifier for condensing the data from the source to a smaller representation using the lower bound.

[033] Preferably, the privacy amplifier is configured to use universal hashing to condense data produced by the source to a smaller representation.

[034] In a fourth aspect, there is provided a computer program comprising instructions which, when the program is executed by a processor, cause the processor to carry out the method of any of the first and second aspects.

[035] In a fifth aspect, there is provided a computer-readable medium comprising computer code or instructions stored thereon, which when executed on a processor, causes the processor to perform the method of any of the first and second aspects.

[036] The methods described herein may be performed by software in machine readable form on a tangible storage medium e.g. in the form of a computer program comprising computer program code means adapted to perform all the steps of any of the methods described herein when the program is run on a computer and where the computer program may be embodied on a computer readable medium. Examples of tangible (or non-transitory) storage media include disks, thumb drives, memory cards etc. and do not include propagated signals. The software can be suitable for execution on a parallel processor or a serial processor such that the method steps may be carried out in any suitable order, or simultaneously.

[037] This application acknowledges that firmware and software can be valuable, separately tradable commodities. It is intended to encompass software, which runs on or controls "dumb" or standard hardware, to carry out the desired functions. It is also intended to encompass software which "describes" or defines the configuration of hardware, such as HDL (hardware description language) software, as is used for designing silicon chips, or for configuring universal programmable chips, to carry out desired functions.

[038] The features and embodiments discussed above may be combined as appropriate, as would be apparent to a person skilled in the art, and may be combined with any of the aspects of the invention except where it is expressly provided that such a combination is not possible or the person skilled in the art would understand that such a combination is self-evidently not possible.

BRIEF DESCRIPTION OF THE DRAWINGS

[039] Some embodiments of the invention will be described, by way of example, with reference to the following drawings, in which:

[040] Figure 1 is a schematic of a system for side-channel characterization and privacy amplification, in accordance with some embodiments of the present invention;

[041] Figure 2 is a flowchart illustrating a method for calculating a lower bound on a probability of measuring a sequence of side-channel measurements; and

[042] Figure 3 is a flowchart illustrating a method for calculating a confidence value for a lower bound associated with making the sequence of side-channel measurements.

DETAILED DESCRIPTION

[043] The invention will be understood from the following detailed description of embodiments, which are meant to be descriptive, by way of example only, and not limiting. For the sake of brevity, some features, methods, systems, procedures, components and circuits that will be well known to those skilled in the art are not described in detail.

[044] As mentioned above, many cryptographic systems depend on the creation of confidential, uniformly random data. Often data sources produce non-uniform, non-confidential data, or insufficiently uniform and/or confidential data, and so their output undergoes a whitening or privacy-amplifying process that shortens that data into a smaller representation that is both confidential and uniformly random. The amount of shortening that is required can be determined from information theory and is based upon the minimum entropy of the source data. In addition to the source data, confidentiality can be lost via unintended side-channels, and again this can be accounted for with whitening/privacy amplification.

[045] An overview of an aspect of the present disclosure is that it provides a process to calculate with high confidence a lower bound to the minimum entropy of source data, taking into account entropy loss due to side-channel information.

[046] The source is studied in its standard mode of operation, producing data that, rather than remaining confidential, is used to model the conditional distribution of side-channel information measured by a sensor. This calibration can be performed as part of the initial set-up, but can also be regularly checked to defend against changes in the behaviour of the source over time. Sufficient measurements should be taken to confidently describe the conditional distribution of the side-channel with respect to the data produced by the source.

[047] The conditional probability distribution functions associated with the different values of confidential data produced by the source (e.g. 0-bits or 1-bits) can then be used to create a function that determines the minimum entropy of the secret data value associated with a given set of side-channel measurements.

[048] The range of possible side-channel measurements can then be discretised into subsets or intervals for each conditional distribution, with a confident lower bound on the minimum entropy of the secret data value associated with a set of side-channel measurements lying in the given subset or interval. This discretised lower bound can then be used in conjunction with Sanov's theorem to provide a concentration bound on the probability of a sequence of side-channel measurements guaranteeing minimum entropy less than a given level to the secret data corresponding to the measurements. By setting the concentration bound so that the probability of an exceptional sequence of measurements lies outside the confident estimation, one can be confident of a lower bound on the minimum entropy of the secret data even if an adversary has accessed the side-channel measurements. This bound can be provided to a privacy amplification unit to condense the secret data to a shorter representation commensurate with the minimum entropy that is guaranteed, and consequently fully private from the adversary.

[049] Figure 1 shows a schematic overview of a system 100 for privacy-amplifying data produced from a source based on side-channel information associated with the source.

[050] In particular, the system 100 comprises a source 110 configured to produce confidential data 120 (e.g. bits with values of 0 and 1 in sequence). The source 110 could be, for example, a random number generator or a quantum communications system. The source 110 produces confidential data 120 in operation, and also generates changes in physical parameters which can be measured to provide side-channel information 130.

[051] In the illustrated embodiment, the system 100 operates periodically to characterize the side-channel information 130 produced by the source 110. In some examples, during characterization, the source 110 is configured to produce side-channel information 130 and confidential data 120 that is of the same characteristics as during operational deployment (i.e. in normal use). This may include “in the field” characterization of the system 100 in order to ensure consistent secure output over time. In other examples, the source 110 may be arranged to operate while being subjected to environmental changes controllable by an active adversary and/or with changes to the source 110 behaviour that can be effected by an active adversary, in order to test the response of the source 110 to possible adversary attacks.

[052] The system 100 further comprises at least one side-channel sensor 140 configured to detect side-channel information 130 (which may be referred to simply as a side-channel) originating from the source 110. In some examples, data from multiple sensors 140 may be used and combined. The side-channel information 130 may be used to determine different side-channels based on the measurement of one or more different parameters: for example, the time taken for the source 110 to produce different data values, such as 0 or 1, the power consumption of the source 110 when producing different data values, or the electromagnetic (EM) radiation emitted from the source 110 when producing such different data values. Rather than treating the confidential data 120 produced by the source as confidential and using it for its usual purpose, in the illustrated embodiment the confidential data 120 is used to characterize side-channel information 130 measurements induced by creation of different data values of the confidential data 120.

[053] Side-channel information 130 associated with the side-channel is then processed by the sensor 140 (as described below) and then communicated (as data 150) to a processor 160. In some examples, the processor 160 may be part of a larger computational unit (not shown in full). Using the data 150 provided by the sensor 140, the processor 160 is configured to use the side-channel information 130 and the confidential data 120 to calculate a lower bound B, which represents a bound with a specified confidence on the conditional minimum entropy per datum when the source 110 is producing confidential data 120. In some examples, the source 110 may provide the confidential data 120 directly to the processor 160. In other examples, the sensor 140 may gather the confidential data 120 together with the side-channel data 130, and provide both the confidential data 120 and the side-channel data 130 to the processor 160. In such examples, the sensor 140 may, for example, multiplex the confidential data 120 and the side-channel data 130 provided to the processor 160, or tag the side-channel data 130 to indicate the corresponding confidential information 120. Other approaches may also be used.

[054] The processor 160 is configured to provide the lower bound B to a privacy amplifier unit 170. The privacy amplifier unit 170 is configured to use this lower bound B to privacy-amplify (or whiten) the data 120 received from the source 110 to produce whitened or condensed data 180. This minimizes the risk of an adversary being able to infer the whitened data 180 by taking measurements of the side-channel information 130 associated with the generation of confidential data 120. It will be understood that the availability of more data allows more confidence in the calculated lower bound B.

[055] In one embodiment, the condensed data may be used for performing cryptography. For example, the condensed data may be used as a fully secure cryptographic key for an encryption or authentication algorithm; as a fully secure seed to a pseudo-random number generator; as a fully secure blinding value for sensitive data; or as a fully secure, confidential commitment value.

[056] In an alternative embodiment, the processor 160, using data 150 provided by the sensor 140, is configured to calculate a confidence level on the lower bound B. This allows evaluation of the risk that an adversary, by making a specific sequence of side-channel measurements, is able to infer the confidential data 120 produced by the source 110.

[057] Figure 2 shows a flowchart 200 of the operation of the system 100 shown in figure 1 according to an embodiment of the present invention. Referring also to figure 1, at operation 202, the system 100 (more particularly, the source 110) produces both the confidential data 120 and the side-channel information 130 with the same characteristics as during operational deployment. As mentioned previously, this can include “in the field” characterization of the system 100 to ensure consistent secure output over time. This may also include carrying out the method while the source 110 is subject to environmental changes controllable by an active adversary and/or changes to the source 110 behaviour that can be effected by an active adversary.

[058] At operation 204, the side-channel sensor 140 detects and quantifies the one or more side-channels 130 of the source 110 to produce a first probability distribution function f_v(x) for each value v of the confidential data 120 produced by the source 110. In some examples, there are multiple side-channels associated with the source 110, and so multiple sensors 140 are used to measure the different side-channel values each time a data value is produced by the source 110; in this case, f_v is a conditional joint distribution function produced by combining data from the multiple sensors 140. In particular, for each data value of the confidential data 120 produced by the source 110 (e.g. every time the source produces a 0 or 1 of the confidential data 120) one or more side-channel measurements are taken by the sensor 140. In the present application, it is assumed that the quantification of side-channel information 130 for a particular measured parameter or parameters can be confidently modelled with a probability distribution function of the form f_v(x) (also called a first probability distribution function in the present application), conditional on v ∈ V, where v represents the type of data value produced by the source (e.g. bit 0 or 1) from a set V of possible data values. It is also assumed that side-channel measurements corresponding to each v are identically (subject to dependence on the confidential information) and independently distributed. This means that the first probability distribution functions f_v(x) for each v are not dependent on one another, and are identical for a specific side-channel. These assumptions are generally the case for side-channels of devices producing confidential data. The distribution of side-channel information 130 may be summarised as histogram data or samples from standard probability distributions, for example normal (Gaussian), exponential, or Poisson distributions, albeit having different parameters for each v. This list of possible distributions is not intended to be exhaustive, and other distributions may be used.
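As an illustration of this characterisation step, the sketch below fits a per-value Gaussian model f_v(x) to labelled side-channel samples; the normal model, function names and data layout are illustrative assumptions rather than part of the method as specified.

```python
import numpy as np

def fit_conditional_models(values, measurements):
    """Fit a Gaussian model f_v(x) for each data value v.

    values       : data values v (e.g. 0/1 bits) emitted by the source
    measurements : simultaneous side-channel readings x (e.g. power draw)
    Returns ({v: (mean, std)}, {v: empirical p(v)}).
    """
    values = np.asarray(values)
    measurements = np.asarray(measurements)
    models, p = {}, {}
    for v in np.unique(values):
        x_v = measurements[values == v]
        # Standard point estimates of the normal parameters: the sample
        # mean, and the square root of the unbiased (ddof=1) variance.
        models[v] = (x_v.mean(), x_v.std(ddof=1))
        p[v] = len(x_v) / len(values)
    return models, p
```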

[059] Thus, for each type of data value v produced by the source 110 from set V, the sensor 140 is able to quantify the data by producing a corresponding probability distribution function f_v(x) conditional on v. For example, one or more side-channel measurements x (e.g. x may represent the amount of power consumed) are taken by the sensor 140 each time the source 110 produces a data value, to produce f_v(x). Such a probability distribution function can be, for example, a normal (Gaussian), exponential, or Poisson distribution. Because each type of data value v produces a side-channel of a different characteristic, each type of data value v will result in a probability distribution function f_v(x) with different characteristics (e.g. side-channel measurements of bit values 1 and 0 both produce a distribution, but with different means and standard deviations).

[060] At operation 206, the sensor 140 provides the one or more first probability distribution functions f_v(x) associated with each type of data value v to the processor 160. In addition, the processor 160 also receives a probability distribution function p(v) (also called a second probability distribution function in the present application), or a distribution of data values, indicating the probability that the source 110 produces a data value of type v. The function p(v) can be provided either directly from the source 110 to the processor 160, or from the source 110 to the sensor 140 and then to the processor 160.

[061] At operation 208, given the first and second probability distribution functions (f_v(x) and p(v)), the processor 160 summarizes these functions as either histogram data or samples from f_v(x) and p(v). The underlying distribution that can produce such samples can then be parameterised with confidence by the processor 160 using standard methods of statistical estimation (for example, the mean and variance of an underlying normal distribution can be estimated without bias using the sample mean and variance, and confidently bounded with Cramér-Rao methods). As mentioned previously, collections of side-channels, dependent or otherwise in their conditional values (e.g. power and EM emission, measured using separate sensors 140), can be modelled with joint distribution functions. In some examples, p(v) may be a separately validated input to privacy amplification. In other examples, p(v) may be validated by the processor 160 or directly modelled, removing the need for additional processes.
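For instance, a confidence interval for the mean of a fitted normal model follows from the standard error of the sample mean; a minimal sketch (the factor z = 1.96, giving a two-sided 95% interval, is an illustrative choice):

```python
import numpy as np

def mean_confidence_interval(samples, z=1.96):
    """Two-sided confidence interval for the mean of normally
    distributed samples; z = 1.96 corresponds to ~95% confidence."""
    samples = np.asarray(samples)
    half_width = z * samples.std(ddof=1) / np.sqrt(len(samples))
    return samples.mean() - half_width, samples.mean() + half_width
```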

[062] At operation 210, the processor 160 uses the first and second probability distribution functions f_v(x) and p(v) for the side-channel information 130 values and the confidential data values 120 to compute a minimum entropy E_min for a data value v of the confidential values 120 for a given set of possible side-channel measurements x. The minimum entropy is a measure of the uncertainty associated with an unknown data value. For example, if E_min(x) = 0 there is only one type of confidential data value (e.g. bit 0 or bit 1) that could have produced the side-channel value x, and one can be certain that the side-channel value x is associated with that type of confidential data value (e.g. value x is definitely due to the source 110 producing confidential data bit 1). If E_min(x) = log_2 |V|, then all confidential data values are equally likely to have produced the observed side-channel value x and the uncertainty is maximised. The minimum entropy for a given set of possible side-channel measurements x is given by the following formula:

E_min(x) = -log_2 max_{v ∈ V} [ p(v) f_v(x) / Σ_{w ∈ V} p(w) f_w(x) ]    (1)

[063] where, for a given secret value v of the confidential data 120 from a set V of possible values, p(v) is the probability that the source produces the datum with value v and f_v(x) is the probability distribution function of the set of side-channel measurements conditional on v. In the worked example the logarithms are measured in base 2, which is convenient in the example where the data values are bits (having 2 possible values). In other examples logarithms in a different base may be used as appropriate. Note that E_min always takes values in the range 0 to 1 when entropy is measured in the base equal to the number of possible types of data values in set V (e.g. base 2 for data values 0 and 1).
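A direct transcription of equation (1) into Python, assuming Gaussian conditional models f_v(x) as in the worked example below (the function and variable names are illustrative):

```python
import numpy as np
from scipy.stats import norm

def e_min(x, models, p):
    """Conditional minimum entropy (in bits) of the data value given a
    side-channel measurement x, per equation (1): -log2 of the largest
    posterior probability p(v | x).

    models : {v: (mean, std)} Gaussian side-channel model for each value v
    p      : {v: probability that the source produces value v}
    """
    joint = {v: p[v] * norm.pdf(x, mu, sd) for v, (mu, sd) in models.items()}
    total = sum(joint.values())
    return -np.log2(max(joint.values()) / total)

# The two-Gaussian model of the worked example below:
models = {0: (0.0, 1.0), 1: (0.5, 1.2)}
p = {0: 0.5, 1: 0.5}
```

Note that the worked example later in this description (equation (7) as reconstructed there) weights the two exponential kernels equally rather than carrying the 1/σ normalisation of the densities; the numerical values quoted in that example follow that convention.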

[064] As mentioned previously, mutual information (which is a measure of the most likely value associated with an unknown datum) corresponds to the loss of Shannon entropy of the confidential data, which is related to the average number of guesses required to recover the confidential value. For security purposes, the more relevant measure may be the minimum entropy, which represents the most likely guess as to the confidential value and the work per success to recover a confidential value, and which is necessary to block an adversary aiming to minimise work per success.

[065] It can be appreciated that the probability of the source 110 producing data value types (e.g. 0 and 1) can take any value between 0 and 1. For example, in the instance that the data source equiprobably produces a 0 or 1 bit (i.e. p(v) = 0.5 for v = 0 or 1), the minimum entropy associated with a given set of possible side-channel measurements x is given by:

E_min(x) = -log_2 [ max( f_0(x), f_1(x) ) / ( f_0(x) + f_1(x) ) ]    (2)

[066] Given the expression for the minimum entropy E_min, Sanov's theorem is used to bound empirical deviation from the expected conditional minimum entropy of a sequence of multiple data values x with high confidence. In particular, if a confidence of 1 - ε is required for statements/proofs of security, Sanov's theorem is able to provide a suitable bound B on the deviation of the minimum entropy associated with a sequence of side-channel measurements x. For example, if the source 110 produces a consecutive sequence of n data values, it will also produce an associated sequence of side-channel measurements x_1, ..., x_n, where each side-channel value x_i is associated with one specific side-channel 130. In the case where there is more than one side-channel 130 associated with the source 110, x_i represents multiple values associated with different side-channels 130. The minimum entropy associated with this sequence of side-channel measurements x_1, ..., x_n is then given by E_min(x_1) + ... + E_min(x_n). The latter indicates the uncertainty of guessing the sequence of n confidential data values from the sequence of side-channel measurements x_1, ..., x_n. Sanov's theorem can then be used to provide a bound B, such that the probability P of the minimum entropy associated with a sequence of n side-channel measurements being above nB is at least a chosen level of confidence 1 - ε (i.e. such that P(E_min(x_1) + ... + E_min(x_n) > nB) > 1 - ε).

[067] Thus, the bound can be used in combination with the leftover hash lemma to compute a universal hash output size and produce a shorter private value (aka condensed or whitened data value) of that length from a larger private value, in such a way as to minimise the information about the new shorter value available to an unauthorised entity.
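For context, the leftover hash lemma allows roughly nB - 2·log2(1/ε') nearly uniform bits to be extracted when the min-entropy is at least nB; a minimal sketch, where the hashing security parameter ε' is an assumption separate from the confidence value ε above:

```python
import math

def extractable_bits(n, B, eps_hash=2**-64):
    """Leftover-hash-lemma output size: m <= n*B - 2*log2(1/eps_hash),
    where n*B is the confident lower bound on the min-entropy."""
    return max(0, math.floor(n * B - 2 * math.log2(1 / eps_hash)))

print(extractable_bits(2_500_000, 0.7520))  # 1879872
```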

[068] In order to calculate the bound, at operation 212, the processor 160 divides the range of possible side-channel measurements x into C intervals (or subsets), where C is the discretisation parameter. For example, if the side-channel measurements x range over the full real line (i.e. all values between -∞ and ∞), the processor 160 might divide the real line into C intervals, where the integral of f_v(x) is 1/C on each interval. Note that in such an example, the side-channel measurement is equally likely to fall into any of these intervals. In the present application, it is assumed that confident lower and upper bounds of the first probability distribution functions f_v(x) can effectively be computed on intervals of measurement of the side-channel information 130.

[069] At operation 214, for each of the C intervals or subsets the processor 160 computes a lower bound L for the minimum entropy E_min on that interval or subset. In particular, apart from a fixed number of local minima, the minimum on an interval will be attained at an endpoint. For example, considering the range of possible values of E_min(x) when x lies in the interval 2 ≤ x ≤ 5, the smallest value is typically either E_min(2) or E_min(5). Provided that the first probability distribution function f_v(x) allows confident calculation of interval endpoints/subset division for a range of values, the range of values may be taken broadly to make the lower bound apply universally with high confidence. Let h_v(x) be the function that returns this minimum for the interval S containing x, and h_v(S) the minimum for the interval S.
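The sketch below carries out operations 212 and 214 for a one-dimensional side-channel with a Gaussian model: equiprobable intervals are obtained from the inverse CDF, and the per-interval minimum h_v(S) is taken over the finite endpoints plus any interior local minima (for the two-Gaussian worked example there is a single such point). The e_min function sketched earlier is assumed.

```python
import numpy as np
from scipy.stats import norm

def equiprobable_intervals(mu, sd, C):
    """Operation 212: split the real line into C intervals, each holding
    probability 1/C under the N(mu, sd^2) side-channel model."""
    edges = norm.ppf(np.linspace(0.0, 1.0, C + 1), loc=mu, scale=sd)
    return list(zip(edges[:-1], edges[1:]))  # first/last edges are -inf/+inf

def interval_minima(intervals, e_min_fn, local_minima=()):
    """Operation 214: lower bound h_v(S) of the min-entropy on each interval S."""
    h = []
    for lo, hi in intervals:
        if np.isinf(lo) or np.isinf(hi):
            h.append(0.0)  # E_min tends to 0 in the tails of this example
            continue
        pts = [lo, hi] + [m for m in local_minima if lo < m < hi]
        h.append(min(e_min_fn(pt) for pt in pts))
    return h

# e.g. the value-0 model of the worked example, with C = 100:
intervals0 = equiprobable_intervals(0.0, 1.0, 100)
```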

[070] At operation 216, the processor 160 computes a divergence parameter D for a given (pre-determined) level of confidence ε, a given number of produced data values n (e.g. the total number of bits of confidential information 120 produced by the source 110), the number of distinct possible data values k (equivalent to |V|), and the discretisation parameter C, given by:

D = -log_2( ε · (n+1)^{-kC} ) / n    (3)
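With the worked-example numbers used later (n = 2,500,000 data values, k = 2 possible values, C = 100 intervals, ε = 10^-11), equation (3) can be evaluated directly; a minimal sketch:

```python
import math

def divergence(n, k, C, eps):
    """Equation (3): D = -log2(eps * (n+1)**(-k*C)) / n."""
    return (k * C * math.log2(n + 1) - math.log2(eps)) / n

print(divergence(2_500_000, 2, 100, 1e-11))  # ~0.001715
```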

[071] Given a target divergence value D, at operation 218 the processor 160 proceeds to find an optimising parameter λ that will allow subsequent derivation of a bound B for the minimum entropy E_min of the confidential information associated with the observed sequence of side-channel values. Specifically, for v ranging over possible data values and S ranging over the intervals of the discretisation associated with f_v(x), the processor 160, using standard numerical solving methods such as binary search, finds the λ that satisfies:

Σ_{v,S} Q_λ(v,S) · log_2( Q_λ(v,S) / P(v,S) ) = D    (4)

[072] where P(v,S) is the conditional probability distribution function which indicates the probability of a data value of type v being produced and inducing a side-channel measurement in the interval S, and

Q_λ(v,S) = P(v,S) · 2^{λ·h_v(S)} / Σ_{w,T} P(w,T) · 2^{λ·h_w(T)}    (5)

[073] Given the optimisation parameter λ, at operation 220 the processor 160 calculates B, the confident lower bound on the minimum entropy per bit, according to:

B = Σ_{v,S} Q_λ(v,S) · h_v(S)    (6)
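Assuming the reconstructed relations (4) to (6) above, the role of D can be stated explicitly. The number of empirical types over the kC cells (v, S) is at most (n+1)^{kC}, and Sanov's theorem bounds the probability of any type whose divergence from P(v, S) exceeds D by 2^{-nD}, so that

P( (1/n) (E_min(x_1) + ... + E_min(x_n)) < B ) ≤ (n+1)^{kC} · 2^{-nD} = ε,

the final equality following from equation (3). This is a sketch of the standard argument, not a reproduction of the filed proof.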

[074] Optionally, in some examples, the bound B can be computed for a range of different choices of C, and the highest bound is still a lower bound for the conditional minimum entropy. Therefore the processor 160 can further optimise over this choice if desired. The lower bound B applies in the vast majority of cases (with quantification of the probability of an exceptional case) rather than applying as "expected behaviour". In other words, there is a possibility that the sequence of side-channel measurements leaks more minimum entropy than nB; however, this possibility has small probability, in particular less than ε. If instead a value nB' representing the expected leakage were calculated, in practice more information than nB' would be leaked roughly half of the time.

[075] At operation 222, the processor 160 provides the value B to the privacy amplifier unit 170. The privacy amplifier unit 170 is able to condense the data provided by the source to produce condensed data. The privacy amplifier 170 compares the lower bound B to a predetermined value of conditional minimum entropy per data value. Based on the result of the comparison, the privacy amplifier 170 uses the lower bound B (which is always less than or equal to unity) to control an amount of condensing of the data produced by the source used to produce the condensed data, such that a conditional minimum entropy per data value of the condensed data is closer to the predetermined value. In some examples, the whitening process produces a smaller, transformed quantity of data whose E_min value is as close as possible to a desired value, within the limits of the chosen confidence value ε. In one example, the desired value for E_min is equal to unity.

[076] Privacy amplification can be defined as a measure of the average amount of minimum entropy that must be eliminated, given information from a related value, in order to achieve perfect confidentiality. If necessary, the privacy amplifier unit 170 then uses privacy amplification techniques, such as universal hashing, to convert the confidential data 120 from the source 110 to a shorter form more uniformly distributed across possible values, to produce whitened data 180 having this required minimum entropy per bit, and for which an adversary has no net knowledge. It will be understood that if the value B determined by the processor 160 is satisfactory (i.e., indicates a minimum entropy per bit at or above a required value), no privacy amplification by the privacy amplification unit 170 is required, so that this privacy amplification may be omitted. In other examples, different privacy amplification techniques may be used in addition to, or alternatively to, universal hashing.
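As one concrete realisation of the universal hashing mentioned here, a random Toeplitz matrix over GF(2) is a commonly used family for privacy amplification; the description does not name a specific family, so the following is an illustrative sketch only:

```python
import numpy as np

def toeplitz_hash(bits, m, rng):
    """Condense len(bits) input bits to m output bits with a random
    Toeplitz matrix over GF(2), a standard universal hash family."""
    n = len(bits)
    d = rng.integers(0, 2, n + m - 1)  # freshly chosen public descriptor
    # Toeplitz structure: entry (i, j) equals d[i - j + n - 1],
    # so the matrix is constant along each diagonal.
    rows = np.stack([d[i:i + n][::-1] for i in range(m)])
    return (rows @ np.asarray(bits)) % 2

rng = np.random.default_rng()
raw = rng.integers(0, 2, 1000)                  # raw bits from the source
condensed = toeplitz_hash(raw, m=752, rng=rng)  # keep ~B bits per raw bit
```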

[077] An example of the system 100 of figure 1 using data 120 produced from the source 110 to produce whitened data 180, is set out below for a situation that might arise in an operational instance. For these calculations, the following assumptions are made:

• The confidential data source 110 emits binary values 0 and 1 independently and equiprobably;

• The first probability distribution functions f_v(x) of the side-channel (for example, power consumption of the source 110) follow a normal distribution. This means the units of measurement can be normalized such that f_0(x), associated with the bit 0 value, has mean 0 and standard deviation 1. It is also assumed that f_1(x), associated with the bit 1 value, has a mean of 0.5 and standard deviation 1.2;

• A confidence of ε = 1×10^-11 is required in a lower bound for the minimum entropy of 2.5 million outputs (i.e. n in equation (3) above is equal to 2.5 million);

• Numerical values are quoted to 4 significant figures.

[078] Then, using the above values and assumptions in equation (1), the minimum entropy function is given by:

E_min(x) = -log_2 [ max( e^{-x^2/2}, e^{-(x-0.5)^2/2.88} ) / ( e^{-x^2/2} + e^{-(x-0.5)^2/2.88} ) ]    (7)

which has a single local minimum at x = -1.1364, where it is 0.8096.
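As a check on the reconstructed form of equation (7) (the original equation image does not survive in this text, so the form above is inferred from the quoted numerical values): away from the tails, the minimising x is the stationary point of the log-ratio of the two kernels,

d/dx [ -x^2/2 + (x - 0.5)^2/2.88 ] = -x + (x - 0.5)/1.44 = 0, giving x = 0.5/(1 - 1.44) ≈ -1.1364,

and substituting back gives e^{-x^2/2} ≈ 0.5243 and e^{-(x-0.5)^2/2.88} ≈ 0.3946, so that E_min(-1.1364) = -log_2(0.5243/0.9189) ≈ 0.8096, matching the quoted value.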

[079] Taking C = 100, two copies of the real line (one each for f_0(x) and f_1(x)) are divided into 100 intervals each. The intervals represent a range of values that are taken 1% of the time by the side-channel when the relevant bit is produced. In particular, this makes all of the P(v,S) terms equal to 1/200.

[080] For the value 0 the division is set as

(-∞, -2.3268], [-2.3268, -2.0542], ..., [2.0542, 2.3268], [2.3268, ∞)

and for the value 1 the division is set as

(-∞, -2.292], [-2.292, -1.965], ..., [2.965, 3.292], [3.292, +∞).

We compute the minimum entropy function at interval endpoints and local minima to obtain the minimum of the minimum entropy on each of these intervals. For the value 0 these h_0 minima are 0.0000, 0.8923, ..., 0.2781, 0.0000 and for the value 1 these h_1 minima are 0.0000, 0.8765, ..., 0.09274, 0.0000.

[081] Then the divergence parameter is calculated as:

D = -log_2( 2500001^{-200} × 10^{-11} ) / 2500000 = 0.001715    (8)

[082] Using binary search to solve equations (4) and (5) gives the optimal value:

λ = -0.3154    (9)

[084] from which we compute the optimal B as:

[085] B = 0.7520    (10)

[086] which means that one can confidently say that the minimum entropy of the confidential 2,500,000 bits is at least 1,880,000 bits.

[087] Searching over other values of C with equiprobable discretisations gives a best choice of C = 226, which gives B = 0.7567 and a conditional minimum entropy of at least 1,891,000 bits.

[088] In an alternative embodiment, the processor 160 can reverse the order of some of the operations to quantify the risk of a certain threshold of leaked information being exceeded, as shown in the flowchart 300 of figure 3. In this embodiment, operations 302-314 are identical to operations 202-214 of figure 2. However, instead of calculating the divergence parameter D, the processor 160 initially calculates the optimizing parameter λ given a lower bound B, using equation (6) (operation 316). At operation 318, the processor 160 uses the optimizing parameter λ to calculate the divergence parameter D using equation (4). At operation 320, the processor 160 calculates a confidence level ε using equation (3).

[089] Thus, the technique described in figure 3 gives an indication of the level of risk (ε) that the minimum entropy per bit is below a given bound B. If this risk is deemed too high, then at operation 322 the processor 160 passes the risk to the privacy amplifier unit 170. As before, the privacy amplifier unit 170 uses techniques such as universal hashing to convert the confidential data 120 output by the source 110 to a shorter form more uniformly distributed across possible values, to produce whitened data 180 for which an adversary has no net knowledge. Alternatively, if the risk is deemed acceptable, the confidential data 120 is output by the source 110 without increasing the amount of whitening by the privacy amplifier unit 170.

[090] In the described embodiments functions are carried out by the side-channel sensor 140 and the processor 160. It will be understood that in other examples some of these functions could be carried out by a different one of the side-channel sensor 140 and the processor 160. In general, functionality may be exchanged between these elements.

[091] In summary, the system 100 is designed to achieve the following goals:

• Confident bounding of the security risk represented by the existence of a side-channel;

• Privacy amplification that negates extremal-case minimum entropy loss due to side-channels;

• Enable accurate treatment of side-channel information within information theoretic proof frameworks;

• Improve the secure key rate of quantum key distribution methods;

• Improve the secure output rate of physical random number generators; and

• Permit cryptographic devices to operate securely under more unfavourable side-channel conditions.

[092] The term 'computational unit' is used herein to refer to any device or group of devices with processing capability such that it/they can execute instructions. Those skilled in the art will realise that such processing capabilities are incorporated into many different devices, and therefore the term 'computational unit' as used herein may include PCs, servers and many other devices.

[093] The components described herein are not necessarily physically separated from each other unless otherwise stated, and the functionality of components illustrated in the figures may be distributed or shared between different or the same physical devices. For example, some of the functions of a communication system may be performed by a computing system and vice versa.

[094] It will be understood that the benefits and advantages described above may relate to one embodiment or may relate to several embodiments. The embodiments are not limited to those that solve any or all of the stated problems or those that have any or all of the stated benefits and advantages.

[095] Any reference to 'an' item refers to one or more of those items. The term 'comprising' is used herein to mean including the method steps or elements identified, but such steps or elements do not comprise an exclusive list, and a method or apparatus may contain additional steps or elements.

[096] As used herein, the terms "component" and "system" may encompass computer-readable data storage that is configured with computer-executable instructions that cause certain functionality to be performed when executed by a processor. The computer-executable instructions may include a routine, a function, or the like. It is also to be understood that a component or system may be localized on a single device or distributed across several devices.

[097] Further, to the extent that the term "includes" is used in either the detailed description or the claims, such term is intended to be inclusive in a manner similar to the term "comprising" as "comprising" is interpreted when employed as a transitional word in a claim.

[098] The figures illustrate exemplary methods. While the methods are shown and described as being a series of acts that are performed in a particular sequence, it is to be understood and appreciated that the methods are not limited by the order of the sequence unless otherwise stated. For example, some acts can occur in a different order than what is described herein. In addition, an act can occur concurrently with another act. Further, in some instances, not all acts may be required to implement a method described herein.

[099] It will be understood that the above description of embodiments is given by way of example only and that various modifications may be made by those skilled in the art. What has been described above includes examples of one or more embodiments. It is, of course, not possible to describe every conceivable modification and alteration of the above devices or methods for purposes of describing the aforementioned aspects, but one of ordinary skill in the art can recognize that many further modifications and permutations of various aspects are possible. Accordingly, the described aspects are intended to embrace all such alterations, modifications, and variations that fall within the scope of the appended claims.