

Title:
METHODS AND APPARATUS TO DISCRIMINATE AUTHENTIC WIRELESS INTERNET-OF-THINGS DEVICES
Document Type and Number:
WIPO Patent Application WO/2020/208426
Kind Code:
A1
Abstract:
Methods and apparatus automatically discriminate authentic wireless Internet-of-Things (IoT) devices using a trained machine-learning module. In a training phase, the machine-learning module is trained to identify authentic IoT devices based on data in frame headers of wireless data emitted by the IoT devices. The trained machine-learning module may identify authentic IoT devices without analysing data from the payload of the frames to which the frame headers belong, and thus the privacy of data in the payload of the frame is not compromised and encryption of the payload data does not adversely affect performance of the trained machine-learning module in a subsequent production phase. Each training data sample may consist of header data from a sequence of successive frames of wireless data from authentic wireless IoT devices and, to enhance accuracy, may exclude address data.

Inventors:
ZHENG TAO (CN)
WANG XIAOYU (CN)
WANG XIN (CN)
Application Number:
PCT/IB2020/000356
Publication Date:
October 15, 2020
Filing Date:
April 09, 2020
Assignee:
ORANGE (FR)
International Classes:
G06F21/44; H04L29/06; H04W12/12
Foreign References:
US20070025265A1, 2007-02-01
US9536072B2, 2017-01-03
Other References:
FRANKLIN J ET AL: "Passive Data Link Layer 802.11 Wireless Device Driver Fingerprinting", PROCEEDINGS OF THE USENIX SECURITY SYMPOSIUM, XX, no. 15TH, 1 August 2006 (2006-08-01), pages 167 - 178, XP002669055
MARKUS MIETTINEN ET AL: "IoT Sentinel: Automated Device-Type Identification for Security Enforcement in IoT", ARXIV.ORG, CORNELL UNIVERSITY LIBRARY, 201 OLIN LIBRARY CORNELL UNIVERSITY ITHACA, NY 14853, 15 November 2016 (2016-11-15), XP081361966, DOI: 10.1109/ICDCS.2017.283
Attorney, Agent or Firm:
CABINET BEAU DE LOMENIE (FR)
Claims:
CLAIMS:

1. Apparatus (1) to discriminate authentic wireless-IoT-devices, the discrimination apparatus comprising:

a receiver (2) to receive wireless data from IoT devices;

a trained machine-learning module (10) to receive and analyse data received by the receiver; and

an interface (12) to output the result of analysis of data by the trained machine-learning module (10) as an indication of identification of an authentic or inauthentic wireless IoT device;

wherein the trained machine-learning module (10) is arranged to analyse data from frame headers of frames of wireless data received from wireless IoT devices.

2. Apparatus (1) to discriminate authentic wireless-IoT-devices, wherein the machine-learning module (10) is arranged to analyse data from said frame headers without analysing data from the payload of the frames to which said frame headers belong.

3. Discrimination apparatus (1) according to claim 1 or 2, wherein the trained machine-learning module (10) is arranged to analyse frame header data that excludes address data representing the address of the device providing the wireless data.

4. Discrimination apparatus (1) according to any one of claims 1 to 3, wherein the trained machine-learning module (10) is arranged to analyse data samples, each data sample comprising a set of data items extracted from each one of a sequence of frame headers.

5. Discrimination apparatus (1) according to claim 4, wherein the set of data items comprises data items extracted from at least three frame headers.

6. Discrimination apparatus (1) according to any previous claim, wherein the trained machine-learning module (10) comprises long short-term memory units.

7. A computer-implemented method to discriminate authentic wireless IoT devices, comprising:

receiving (P1) wireless data from IoT devices;

analysing (P3) received wireless data, by a trained machine-learning module; and

outputting (P4) the result of the analysis by the trained machine-learning module to indicate identification of an authentic or inauthentic wireless IoT device;

wherein the analysing by the trained machine-learning module comprises analysing data from frame headers of frames of wireless data received from wireless IoT devices.

8. A computer-implemented discrimination method according to claim 7, wherein the analysing by the trained machine-learning module comprises analysing data from said frame headers without analysing data from the payload of the frames to which said frame headers belong.

9. A computer-implemented discrimination method according to claim 7 or 8, wherein the analysing by the trained machine-learning module comprises analysing frame header data that excludes address data representing the address of the device providing the wireless data.

10. A computer-implemented discrimination method according to any one of claims 7 to 9, wherein the analysing by the trained machine-learning module comprises analysing data samples, each data sample comprising a set of data items extracted from each one of a sequence of frame headers.

11. A computer-implemented discrimination method according to any one of claims 7 to 10, and comprising:

a training phase wherein the trained machine-learning module is trained (T3), using training data, to discriminate between authentic and inauthentic wireless IoT devices,

wherein said training data consist of data from frame headers, but not data from the payload, of frames of wireless data received from authentic wireless IoT devices.

12. A computer-implemented discrimination method according to claim 11, wherein said training data consists of frame header data that excludes address data representing the address of the device providing the wireless data.

13. A computer-implemented discrimination method according to claim 11 or 12, wherein each training data sample comprises a set of data items extracted from each one of a sequence of frame headers.

14. A computer program comprising instructions which, when the program is executed by a processor, cause the processor to carry out the method according to any one of claims 7 to 13.

15. A computer-readable medium comprising instructions which, when executed by a processor, cause the processor to carry out the method according to any one of claims 7 to 13.

Description:
METHODS AND APPARATUS TO DISCRIMINATE AUTHENTIC WIRELESS INTERNET-OF-THINGS DEVICES

The present invention relates to the field of the Internet of Things (IoT). More particularly, the invention relates to apparatus and methods to discriminate authentic wireless IoT devices automatically.

The Internet of Things is becoming progressively better established, and increasing numbers of IoT devices are being brought into operation. IoT devices and terminals face many serious security threats including, but not limited to, unauthorized modifications, faking of devices, and so on.

US 9536072 proposes the use of machine learning to analyse the behaviour of electronic devices and their users, including the behaviour of IoT devices. A trained machine-learning algorithm is used to determine when an electronic device has been stolen or is behaving in an unusual manner (which may indicate that malicious software is operating on the device). During the training phase, this system establishes one or more local user profiles that represent observed user-specific behaviours according to a centroid sequence. The local user profile may be classified into a baseline profile model that represents aggregate behaviours associated with various users over time. During the production phase, the system may generate a current user profile model comprising a centroid sequence re-expressing user-specific behaviours observed over a particular time interval, and the current user profile model may be compared to plural baseline profile models to identify the baseline profile model closest to the current user profile model. A change in operator may be detected where the baseline profile model closest to the current user profile model differs from the baseline profile model.

US 9536072 is based on analysis of device behaviour and so, if an inauthentic device copies the usual behaviour of an authentic device, the system will not detect a problem.

Moreover, a large number of services make use of IoT devices which communicate wirelessly and often these are more easily attacked. As a result, it is becoming increasingly important to assure the security of wireless IoT devices. Notably, there is a need for methods and systems to automatically discriminate authentic wireless IoT devices, so that they can be differentiated from inauthentic wireless devices (i.e. illicitly-modified devices, fake devices, unauthorized devices, and so on).

The present invention has been made in the light of these issues.

Embodiments of the present invention provide a wireless-IoT-device discrimination apparatus, comprising:

a receiver to receive wireless data from IoT devices;

a trained machine-learning module to receive and analyse data received by the receiver; and

an interface to output the result of analysis of data by the trained machine-learning module as an indication of identification of an authentic or inauthentic wireless IoT device;

wherein the trained machine-learning module is arranged to analyse data from frame headers of frames of wireless data received from wireless IoT devices.

Embodiments of the invention further provide a corresponding computer-implemented method to authenticate wireless IoT devices, as specified in appended claim 7.

Embodiments of the invention still further provide a computer program comprising instructions which, when the program is executed by a processor, cause the processor to carry out the method according to any one of appended claims 7 to 13.

Embodiments of the invention still further provide a computer-readable medium comprising instructions which, when executed by a processor, cause the processor to carry out the method according to any one of appended claims 7 to 13.

The above-mentioned wireless-IoT-device discrimination apparatus, wireless-IoT-device discrimination method, computer program and computer-readable medium may enable the automatic discrimination of authentic wireless IoT devices. In this document the expression “discrimination” is used in a general sense to cover cases (apparatus, methods) where the output indicates that an authentic device has been identified, cases where the output indicates that an inauthentic device has been identified, and cases where the output can indicate whether a device is judged to be authentic or inauthentic. Furthermore, the expression “wireless IoT device” simply designates an IoT device which emits wireless data (irrespective of whether that device may also emit data in a wired fashion).

The above-mentioned wireless-IoT-device discrimination device, wireless-IoT-device discrimination method, computer program and computer-readable medium perform the discrimination of wireless IoT devices using machine learning applied to data from frame headers of wireless data emitted by IoT devices. It has been found that this approach can provide good accuracy in identification of authentic devices.

The trained machine-learning module may be arranged to analyse data from the frame headers without analysing data taken from the payload of the frames to which said frame headers belong. Often the data in the payload of frames of wireless data emitted by IoT devices is encrypted. By excluding frame payload data from the analysis, the trained machine-learning device is able to perform its analysis in a generic manner irrespective of whether the data received from wireless IoT devices is encrypted or unencrypted. Furthermore, by omitting frame payload data from the analysis, the apparatus and methods according to the invention preserve the privacy of the payload data.

The trained machine-learning module may be arranged to analyse frame header data that excludes address data representing the address of the device providing the wireless data. It is common for hackers to fake IoT devices’ addresses (e.g., MAC address, IP address, etc.). By excluding address data from the analysis performed by the machine-learning module, the discrimination accuracy may be improved.

The trained machine-learning module may be arranged to analyse data samples, each data sample comprising a set of data items extracted from each one of a sequence of successive frame headers. The set of data items may comprise data items extracted from at least three successive frame headers. Experiments have shown that discrimination results of high accuracy may be obtained in a case where the trained machine-learning module is trained using data samples consisting of this type of time-series data.

Different technologies may be used to implement machine-learning modules. Certain embodiments of the invention use LSTM (Long Short-Term Memory) units to analyse patterns in the above-mentioned time series data.

Further features and advantages of embodiments of the present invention will become apparent from the following description of said embodiments, which is given by way of illustration and not limitation, illustrated by the accompanying drawings, in which:

Fig.1 is a block diagram schematically illustrating components of a discrimination apparatus according to an embodiment of the invention;

Fig.2 illustrates processes involved in training and using a machine-learning module in the apparatus of Fig.1, in which:

Fig.2A is a flow diagram illustrating processes in a training phase, and Fig.2B is a flow diagram illustrating processes in a production phase (use of the trained machine-learning module);

Fig.3 is a graph illustrating how, in tests, the accuracy of the discrimination performance of the discrimination apparatus was found to vary as a function of the number of epochs;

Fig.4 is a graph illustrating how, in tests, the loss of the discrimination apparatus was found to vary as a function of the number of epochs; and

Fig.5 schematically illustrates process flows between apparatus components during the training phase and production phase, in which:

Fig.5A illustrates process flows during the training phase, and Fig.5B illustrates process flows in the production phase.

Certain embodiments of the invention will now be described for the purposes of illustration, not limitation.

Fig.1 illustrates a discrimination apparatus 1 according to a first embodiment of the invention. As can be seen from Fig.1, the main components of the apparatus 1 are a receiver (RX) 2 to receive wireless data from IoT devices, a machine-learning module (MACH LRN) 10 to analyse received data and an output interface (REP) 12 to inform a user of the result of the analysis performed by the machine-learning module 10. The discrimination apparatus may also include a wireless information storage and search unit 4, or it may communicate with an external storage/search unit (not shown) over a wired or wireless communication channel.

The receiver (RX) 2 receives data output wirelessly by IoT devices. The receiver 2 may be arranged to collect and/or monitor the data. The receiver 2 may format and shape the received data, for example so as to extract, from the received signals, the data that will be used by the machine-learning module 10 for discrimination purposes. The receiver 2 may store the received data, or the formatted/shaped data, in a storage medium that is internal or external to the discrimination apparatus 1, for example a storage medium (MEM) 6 in a wireless information storage and search unit 4.

The machine-learning module 10 analyses data received from wireless IoT devices to perform discrimination of IoT devices. The machine-learning module 10 requires training in order to be able to perform discrimination. Typically, the machine-learning module 10 is operated in a training phase and in a production phase. In the training phase the machine-learning module 10 is trained, using training data, to discriminate wireless IoT devices: for example, to be able to recognize authentic devices, to be able to recognize inauthentic devices and/or to be able to generate output indicative of whether a subject device is authentic or inauthentic.

Discrimination apparatus 1 embodying the invention may be supplied with the machine-learning module 10 already pre-trained, or the apparatus 1 may be supplied with the machine-learning module 10 untrained (so that the user can select the training data set to be used for training the machine-learning module 10). Even in a case where the machine-learning module is already pre-trained at the time of supply of the discrimination apparatus 1, additional training may be performed using extra data collected during operation of the discrimination apparatus 1.

In the production phase, data received from a subject IoT device (e.g. a newly-discovered IoT device) is input to the trained machine-learning module 10 and the output of the trained machine-learning module 10 then indicates whether the subject IoT device is authentic or inauthentic. The output interface 12 is arranged to produce an output indicating the result of the analysis performed by the machine-learning module 10.

The output interface 12 may be designed in different manners. For example, in certain embodiments of the invention the output interface 12 is configured to produce an alert in the case where the output of the machine-learning module 10 indicates that an inauthentic IoT device has been detected. The alert may take any convenient form including but not limited to visual (e.g. lighting an indicator lamp, displaying a message on a screen, producing a printed message or report, transmission of an SMS, etc.) and audible forms (generation of a tone, spoken message, and so on). In certain embodiments of the invention the output interface 12 is arranged to output a report on the result of the analysis performed by the machine-learning module 10 irrespective of the nature of that result (i.e. irrespective of whether the result demonstrates detection of an authentic or inauthentic IoT device). Of course, if desired the output interface 12 may be configured to produce an alert in the case where the output of the machine-learning module 10 indicates that an authentic IoT device has been detected.

The output interface 12 may be constructed using different technologies depending on the application, on the type of output (alert or report) that is to be produced, and on the intended recipient of the output (e.g. a local or remote user, a network operator, a data-collecting module, etc.). Various non-limiting examples include configuring the output interface 12 as a graphical user interface, as a wireless communications interface, as a wired interface, and so on.

The discrimination apparatus 1 may be implemented in various ways on hardware and/or software. For example, the discrimination apparatus may be implemented on a general-purpose computer by suitable programming of the computer (in which case, it will be appreciated that the various components illustrated in Fig.1 represent various functions implemented by the computer). The recited functionality may be defined by instructions in a computer program, and execution of the instructions by a processor can implement this functionality. The present invention provides such computer programs, as well as computer-readable media (discs, tapes, USB keys, etc.) storing such instructions.

Typically, the discrimination apparatus 1 may be applied in a networked environment. The discrimination apparatus 1 may be integrated into a network component (e.g. an access node) or may be a server or other standalone device. The discrimination apparatus 1 may capture wireless data frames emitted by IoT devices in a variety of ways. For example, the discrimination apparatus 1 may work in different ways according to the monitoring protocol. In cases where the WiFi protocol is used, the discrimination apparatus 1 may work in the monitoring mode, and may poll all channels or just focus on specific channels. In cases where the Bluetooth protocol is used, the discrimination apparatus 1 may chase specific channels according to the protocol, and so on.

Fig.2 illustrates processes involved in training and using the machine-learning module 10 of the discrimination apparatus of Fig.1.

Fig.2A is a flow diagram illustrating processes in the training phase, and Fig.2B is a flow diagram illustrating processes in the production phase (where the trained machine-learning module is exploited).

During the training phase, data from wireless IoT devices is received (step T1). The data can be collected over a desired time interval. The data may come from wireless IoT devices that are known to be authentic or known to be inauthentic (in the case of supervised learning), or can come from IoT devices whose status - authentic or inauthentic - is not known (in the case of unsupervised learning). Depending on the application, the received data may be formatted or shaped to a greater or lesser extent (step T2) to produce training data samples that will be used to train the machine-learning module. Thus, for example, the wireless data received from a given IoT device i may be processed to extract, from the headers of a sequence of P successive frames, the values of a set of N parameters. The P x N data matrix for IoT device i then constitutes a single training data sample.
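The formatting step T2 can be illustrated with a minimal sketch. It assumes each frame header has already been parsed into a dictionary; the parameter names in `PARAMS` are hypothetical labels borrowed from the WiFi example given later in this description, and cutting a capture into non-overlapping windows is an assumption rather than something the description specifies.

```python
from typing import Dict, List, Sequence

# Hypothetical parameter names (N = 5); the description only requires "a set of N parameters".
PARAMS = ("duration", "seq_num", "signal_strength", "frame_length", "delta_time")


def build_sample(headers: Sequence[Dict[str, float]], p: int,
                 params: Sequence[str] = PARAMS) -> List[List[float]]:
    """Return a P x N matrix: one row per frame header, one column per parameter."""
    if len(headers) < p:
        raise ValueError(f"need at least {p} frame headers, got {len(headers)}")
    return [[float(h[name]) for name in params] for h in headers[:p]]


def build_samples(headers: Sequence[Dict[str, float]], p: int,
                  params: Sequence[str] = PARAMS) -> List[List[List[float]]]:
    """Cut one device's capture into consecutive, non-overlapping P-frame samples (assumed policy)."""
    return [build_sample(headers[i:i + p], p, params)
            for i in range(0, len(headers) - p + 1, p)]
```

Each matrix returned by `build_sample` corresponds to one training data sample for a given device; trailing frames that do not fill a complete window are simply dropped in this sketch.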

The machine learning module 10 is then trained (step T3) using the set of training data samples. The number of training data samples used in the training process can be set as desired and, in general, a greater number of training data samples results in improved discrimination performance during the production phase. Likewise, different algorithms can be used to implement the training process. When using certain learning algorithms, the whole data set is presented to the machine-learning module plural times (i.e. there are plural epochs).

After the machine-learning module 10 has undergone training using the training data samples, the machine-learning module 10 is considered to be “trained”. It can be considered that, during the training phase, the machine-learning module develops a model relating its inputs to an output indicative of a desired evaluation (e.g. a general evaluation such as: is the subject IoT device an authentic device, or is the subject IoT device an inauthentic device, or a more specific evaluation such as: is the subject IoT device a device which has undergone modification, is the subject IoT device a device produced by a specified manufacturer, and so on).

In general, it is useful to validate the model which has been developed by the machine-learning module during the training phase. Typically, this involves inputting, to the trained machine-learning module, some data samples which relate to IoT devices having known status (authentic/inauthentic) or known properties, and checking what percentage of these data samples are correctly classified by the trained machine-learning module. Assuming that the validation process shows that the trained machine-learning module has adequate performance, the trained machine-learning module 10 is then used, during the production phase, to process data samples obtained from target IoT devices in order to discriminate authentic/inauthentic devices.
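The validation check described above amounts to computing the percentage of held-out samples that are classified correctly. A minimal sketch follows, assuming the trained module is exposed as a callable `classify` that returns `True` for "authentic"; that interface is hypothetical.

```python
from typing import Callable, Sequence


def validation_accuracy(classify: Callable[[object], bool],
                        samples: Sequence[object],
                        labels: Sequence[bool]) -> float:
    """Percentage of validation samples whose predicted status matches the known status."""
    if not samples or len(samples) != len(labels):
        raise ValueError("samples and labels must be non-empty and of equal length")
    correct = sum(1 for sample, label in zip(samples, labels) if classify(sample) == label)
    return 100.0 * correct / len(samples)
```

Whether a given percentage counts as "adequate performance" is left to the application; the description does not fix a threshold.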

As illustrated in Fig.2B, during the production phase, data from wireless IoT devices is received (step P1). Typically, some wireless data is received from a subject IoT device and it is not known a priori whether this subject IoT device is authentic or inauthentic. Depending on the application, the received data may be formatted or shaped to a greater or lesser extent (step P2) to produce a data sample for analysis. In general, the data sample has the same format as the training data samples that were used in the training process T3 performed during the training phase. Thus, for example, if the machine-learning module was trained using training data samples consisting of the values of N parameters extracted from the headers of P successive frames then the data sample analysed in respect of the subject IoT device during the production phase likewise consists of the values of the N parameters extracted from the headers of P successive frames of wireless data received from the subject IoT device.

The data sample derived from the subject IoT device is then input to the trained machine-learning module for analysis thereby (step P3). The results output from the trained machine-learning module give rise to an output (step P4) which indicates, for example, that the subject IoT device is authentic or inauthentic.
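Steps P2 to P4 can be sketched as a short pipeline. The sketch assumes, purely for illustration, that the trained module returns a score between 0 and 1 and that a threshold of 0.5 separates authentic from inauthentic devices; neither the score range nor the threshold is taken from this description, and `prepare` and `model` are hypothetical callables.

```python
from typing import Callable, List, Sequence


def discriminate(frames: Sequence[dict],
                 prepare: Callable[[Sequence[dict]], List[List[float]]],
                 model: Callable[[List[List[float]]], float],
                 threshold: float = 0.5) -> str:
    """Run one subject device's frames through steps P2 (format), P3 (analyse), P4 (output)."""
    sample = prepare(frames)   # P2: same format as the training data samples
    score = model(sample)      # P3: analysis by the trained module (assumed score in [0, 1])
    return "authentic" if score >= threshold else "inauthentic"  # P4: reported result
```

The returned string stands in for whatever the output interface 12 produces, such as an alert or a report.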

The inventors have conducted experiments regarding how machine learning can enable authentic and/or inauthentic IoT devices to be discriminated. The experiments show that good discrimination performance can be obtained in the case where the data input to the machine-learning module 10 in the training phase and the production phase is data extracted from the headers of frames of wireless data output by IoT devices.

Furthermore, in some embodiments the machine-learning and subsequent identification is based on analysis of data from frame headers, but not data from the payload of the frames of wireless data. This avoids difficulties which otherwise can arise due to the fact that the payload of frames of wireless data emitted by IoT devices can often be encrypted. Furthermore, the payload portions of the frames may well contain personal data or data that is commercially sensitive. The privacy of the wireless data is preserved by excluding the payload data from the information that is input to the machine-learning module 10. The headers of wireless data frames generally contain address information and, unfortunately, hackers often fake such address information. Accordingly, certain embodiments of the invention exclude address information from the information that is input to the machine-learning module 10.
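A minimal sketch of this filtering is given below. It assumes a captured frame is represented as a dictionary with `"header"` and `"payload"` entries, and the address field names in `ADDRESS_FIELDS` are hypothetical; a real capture tool would use its own field names.

```python
from typing import Any, Dict, FrozenSet

# Hypothetical names for address-type header fields that are to be excluded.
ADDRESS_FIELDS: FrozenSet[str] = frozenset(
    {"src_mac", "dst_mac", "bssid", "src_ip", "dst_ip"})


def header_features(frame: Dict[str, Any],
                    address_fields: FrozenSet[str] = ADDRESS_FIELDS) -> Dict[str, Any]:
    """Keep only non-address header fields; the payload is never read."""
    header = frame.get("header", {})
    return {name: value for name, value in header.items()
            if name not in address_fields}
```

Because the function only looks at the header entry, it behaves identically whether the payload is encrypted or not.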

It has been found that good discrimination results are obtained in the case where each sample of data that is input to the machine-learning module consists of data extracted from each one of a sequence of plural successive frame headers. In this case the information from wireless IoT devices is treated as multiple time series; the time-series data is used to train the machine-learning module and is subsequently identified by the model which the machine-learning module develops as it is trained.

An example will now be given in the context of discriminating IoT devices that emit data according to the WiFi specification (e.g. IEEE 802.11a).

In this example, the data input to the machine-learning module 10 is time-series data taken from a sequence of P successive frames of WiFi data and, from each frame header, information is taken which corresponds to a particular selection of N parameters from among the parameters that are present in WiFi frame headers. In the specific example described here, data is taken from a sequence of 10 successive frames of WiFi data and the N parameters that are exploited by the machine-learning module 10 are: Duration, SN (Sequence Number), Signal Strength, Frame Length, and Delta Time (i.e. time between two contiguous frames).
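Of the five parameters, Delta Time is derived from frame arrival times rather than read directly from a header field. A minimal sketch of that derivation follows; assigning 0.0 to the first frame of a sequence is an assumption, since the description does not say how the first frame is handled.

```python
from typing import List, Sequence


def delta_times(timestamps: Sequence[float]) -> List[float]:
    """Time between each frame and the frame before it (first frame: 0.0 by assumption)."""
    if not timestamps:
        return []
    return [0.0] + [later - earlier
                    for earlier, later in zip(timestamps, timestamps[1:])]
```

For capture times of 1.0, 1.5 and 3.0 seconds, the function yields 0.0, 0.5 and 1.5.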

Accordingly, in this example the information input to the machine-learning module 10 has the structure of a matrix as shown below:
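The matrix itself is not reproduced in this text. A reconstruction consistent with the surrounding description would have one row per frame (P = 10) and one column per parameter (N = 5). The row/column orientation and the symbols d, sn, s, l and Δt (for Duration, Sequence Number, Signal Strength, Frame Length and Delta Time) are assumptions made for illustration.

```latex
X =
\begin{pmatrix}
d_{1}  & sn_{1}  & s_{1}  & l_{1}  & \Delta t_{1}  \\
d_{2}  & sn_{2}  & s_{2}  & l_{2}  & \Delta t_{2}  \\
\vdots & \vdots  & \vdots & \vdots & \vdots        \\
d_{10} & sn_{10} & s_{10} & l_{10} & \Delta t_{10}
\end{pmatrix}
\in \mathbb{R}^{P \times N}, \qquad P = 10,\; N = 5
```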

In this example, wherein the machine-learning module 10 handled time series data, the machine-learning module was implemented using LSTM (Long Short-Term Memory) to analyse the patterns in this kind of time series and to identify them.
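To make the role of the LSTM units concrete, the following is a minimal, untrained single-layer LSTM forward pass in plain Python. It uses the standard LSTM gate equations and is not the inventors' implementation: the weights are random, the hidden size is a free parameter, and input values are assumed to have been scaled to a small range beforehand.

```python
import math
import random
from typing import List, Sequence, Tuple


def _sigmoid(x: float) -> float:
    """Numerically safe logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


class LSTMCell:
    """One LSTM layer with randomly initialised (untrained) weights."""

    def __init__(self, n_inputs: int, n_hidden: int, seed: int = 0) -> None:
        rng = random.Random(seed)
        self.n_hidden = n_hidden
        width = n_inputs + n_hidden
        # One weight matrix and bias vector per gate:
        # i = input, f = forget, o = output, c = candidate cell state.
        self.w = {g: [[rng.uniform(-0.1, 0.1) for _ in range(width)]
                      for _ in range(n_hidden)] for g in "ifoc"}
        self.b = {g: [0.0] * n_hidden for g in "ifoc"}

    def _gate(self, g: str, z: Sequence[float], squash) -> List[float]:
        return [squash(sum(w * v for w, v in zip(row, z)) + bias)
                for row, bias in zip(self.w[g], self.b[g])]

    def step(self, x: Sequence[float], h: Sequence[float],
             c: Sequence[float]) -> Tuple[List[float], List[float]]:
        """Process one frame's N parameters, given the previous hidden and cell state."""
        z = list(x) + list(h)
        i = self._gate("i", z, _sigmoid)
        f = self._gate("f", z, _sigmoid)
        o = self._gate("o", z, _sigmoid)
        g = self._gate("c", z, math.tanh)
        c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
        h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
        return h_new, c_new

    def run(self, sample: Sequence[Sequence[float]]) -> List[float]:
        """Feed a P x N sample row by row and return the final hidden state."""
        h = [0.0] * self.n_hidden
        c = [0.0] * self.n_hidden
        for row in sample:
            h, c = self.step(row, h, c)
        return h
```

In a complete classifier the final hidden state would feed an output layer, and the weights would be learned during the training phase; both are omitted from this sketch.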

Figs. 3 and 4 illustrate results that were obtained in the case where the machine-learning module 10 used three LSTM layers and the size of each LSTM's hidden layer was 128. The number of epochs in the training process was 50, and each epoch used the same training data. The batch size of the training data set was 32. More precisely, Fig.3 illustrates how the accuracy of the output from the machine-learning module 10 varied as the number of epochs increased, during the training and production phases, while Fig.4 illustrates how the loss (summation of errors) varied as the number of epochs increased, during the training and production phases.

As can be seen from Figs.3 and 4, the validation result was very close to the training result. Furthermore, good accuracy (>97%) and small loss (<8%) were achieved.

Although time series data from 10 successive frames is used in the above example (i.e. P=10), the number of frames in the sequence can be changed. Experimental results show that it is beneficial to include at least three frames in the time series. Although no explicit upper limit on the length of the sequence has been identified so far, as the number of frames included in the time series increases there is an increase in the time required to train the machine-learning module and an increase in the amount of time taken by the trained machine-learning module to perform analysis. Furthermore, the time series may include data from headers of frames which, although in time order, are not successive to one another: for example, data may be taken from every other frame header in a time series of 2P frames. It will be understood that the manner of selecting frame headers to constitute the sequence used in the production phase should be the same as in the training phase.
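The selection of non-successive frame headers can be sketched with a stride parameter. The function name and the `stride` argument are hypothetical; a stride of 1 gives successive frames and a stride of 2 gives every other frame.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def select_headers(headers: Sequence[T], p: int, stride: int = 1) -> List[T]:
    """Pick P headers in time order, taking every `stride`-th frame."""
    if p < 1 or stride < 1:
        raise ValueError("p and stride must be positive")
    needed = (p - 1) * stride + 1  # frames that must be available
    if len(headers) < needed:
        raise ValueError(f"need at least {needed} frames, got {len(headers)}")
    return list(headers[0:needed:stride])
```

Using the same `p` and `stride` values in the training phase and the production phase keeps the two sample formats identical, as the paragraph above requires.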

Although the above example makes use of data relating to five parameters in the headers of WiFi frames (i.e. N=5), the number of parameters that are used can be adjusted depending on the amount of difference there is between authentic and inauthentic IoT devices. For example, in some cases there are significant differences between the frame-header data output by authentic and inauthentic IoT devices and it is permissible to train the machine-learning module 10 simply using two parameters, e.g. SN and delta time. However, if the difference between authentic and inauthentic wireless IoT devices is very slight then it may be necessary to use 5 parameters (e.g. all 5 of the parameters in the above example) to enable devices to be discriminated. The number N of parameters that are used by the machine-learning module 10 affects the accuracy of identification, with a larger value of N being associated with a greater degree of identification accuracy. The design of the machine-learning module itself is not affected by the specific parameters that are selected, except insofar as the input of the machine-learning algorithm needs to be adapted to the number of parameters.

Although the above example is given in the context of discriminating IoT devices that emit data according to the WiFi specification, the invention may be applied to discriminate IoT devices that emit data frames that correspond to other wireless standards, for example, Bluetooth, Zigbee, Cellular network specifications, and so on. In the case of embodiments handling these other technical standards, various parameters can be extracted from the headers of the wireless data and input to the machine-learning module, for example: the size of wireless data, delta time of sequence data, the session/sequence/paragraph number, signal strength, the existing/unexpired time, and the transmission speed (in some cases where variable bitrate transmission is involved).

Although the above example concerns a case in which the machine-learning module 10 used LSTM units to analyse time series data, other machine-learning architectures could be used, for example support vector machines (SVMs), other forms of recurrent neural networks (RNNs), Hidden Markov models, gated recurrent units (GRUs), and so on. Moreover, the number of layers in the machine-learning architecture is not limited to three as in the example above. Better results were obtained using an LSTM architecture compared to using an SVM architecture.

As noted above, the discrimination apparatus 1 may include a wireless information storage and search unit 4. This wireless information storage and search unit 4 may include a storage medium or memory 6 in which data provided by the receiver 2 is stored. The stored data may include data formatted and/or shaped by the receiver 2 and, if desired, raw data received from wireless IoT devices. The wireless information storage and search unit 4 may also include a querying module (SRCH) 8 to allow queries to be performed on the data held in the storage medium/memory 6.

A description will now be given of Figs. 5A and 5B, which illustrate examples of process flows which may take place in a discrimination apparatus 1 according to Fig. 1 during the training phase and production phase, respectively. Figs. 5A and 5B relate to an example in which the discrimination apparatus 1 includes a wireless information storage and search unit 4. In Fig. 5B, “(T) MACH LRD” represents the trained machine-learning module.

a) Training phase

The receiver 2 collects wireless data (INF W|D) emitted by IoT devices. The receiver 2 preferably performs formatting and shaping (PREP) of the received data at this stage, before uploading it (UPL) to the wireless information storage and search unit 4. The machine-learning module 10 sends a query (QU(TRNG)) to the wireless information storage and search unit 4 requesting a training data set. Typically, the query is formulated by the system designer. The query defines the nature of the requested training data, for example, indicating a time slot to be covered by the sequence of frames, and the parameters from the frame headers that are to be used in the training method (e.g. SN, delta time, and so on). For instance, the query could request supply of a training data set from time 1 to time 2 including specified data items. The wireless information storage and search unit 4 sends a training data set (DAT(TRNG)) to the machine-learning module 10. The machine-learning module 10 performs a training process (TRNG) to establish an internal model to relate its inputs to a desired evaluation output.

b) Production phase (discrimination/identification phase)

The receiver 2 collects wireless data (INF W|D) emitted by IoT devices. The receiver 2 preferably performs formatting and shaping (PREP) of the received data, before uploading it (UPL) to the wireless information storage and search unit 4. The machine-learning module 10 sends a query (QU(NWST)) to the wireless information storage and search unit 4 requesting a new data sample, for example a data sample relating to the most recently uploaded IoT data, this new data sample relating to a subject IoT device for which it is desired to determine whether or not it is an authentic device. The wireless information storage and search unit 4 sends the requested new data sample (DAT(NWST)) to the trained machine-learning module 10. The trained machine-learning module 10 applies the received data sample at its inputs and produces an output DET which indicates the evaluation result. The output interface 12 produces a signal RES indicating the result produced by the trained machine-learning module.
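The two process flows described above may be sketched, purely as an illustration, in the following Python example. The class `StorageAndSearchUnit`, its methods, the `device_id` field and the function `placeholder_trained_module` are hypothetical names introduced for this sketch. In particular, the placeholder is a fixed rule standing in for the trained machine-learning module 10 and is not a learned model.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Sequence


@dataclass
class Record:
    timestamp: float          # time at which the frame was received
    device_id: str            # hypothetical identifier of the emitting device
    header: Dict[str, float]  # prepared (PREP) frame-header parameters


@dataclass
class StorageAndSearchUnit:
    """Hypothetical stand-in for the storage and search unit 4."""
    records: List[Record] = field(default_factory=list)

    def upload(self, record: Record) -> None:  # UPL
        self.records.append(record)

    def query_training(self, t1: float, t2: float,
                       params: Sequence[str]) -> List[List[float]]:
        """QU(TRNG): frames received from time t1 to time t2, selected parameters only."""
        rows = sorted((r for r in self.records if t1 <= r.timestamp <= t2),
                      key=lambda r: r.timestamp)
        return [[r.header[p] for p in params] for r in rows]

    def query_newest(self, device_id: str, params: Sequence[str],
                     length: int) -> List[List[float]]:
        """QU(NWST): the most recently uploaded frames of the subject device."""
        rows = sorted((r for r in self.records if r.device_id == device_id),
                      key=lambda r: r.timestamp)[-length:]
        return [[r.header[p] for p in params] for r in rows]


def placeholder_trained_module(sample: List[List[float]]) -> float:
    """Stand-in producing DET; a fixed rule on SN increments, NOT a trained model."""
    steps = [b[0] - a[0] for a, b in zip(sample, sample[1:])]
    return 1.0 if steps and all(s == 1 for s in steps) else 0.0


def output_interface(det: float, threshold: float = 0.5) -> str:
    """Turn the evaluation result DET into the signal RES."""
    return "authentic" if det >= threshold else "not authentic"


unit = StorageAndSearchUnit()
for t, sn in [(0.00, 10), (0.02, 11), (0.04, 12), (0.06, 13)]:
    unit.upload(Record(timestamp=t, device_id="dev-A",
                       header={"sn": sn, "delta_time": 0.02}))

# Training phase: data set from time 1 to time 2 with the specified data items.
training_set = unit.query_training(0.0, 0.04, ["sn", "delta_time"])

# Production phase: newest sample for the subject device, then DET and RES.
new_sample = unit.query_newest("dev-A", ["sn", "delta_time"], length=3)
res = output_interface(placeholder_trained_module(new_sample))
```

The same storage and search unit serves both phases in this sketch; only the query differs, which mirrors the use of QU(TRNG) and QU(NWST) in Figs. 5A and 5B.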

During the production phase (identification phase) the output from the trained machine-learning algorithm can be said to “identify” the subject device which has provided the data sample undergoing analysis. In other words, the output may be a value which indicates that the subject IoT device is an original/legal/not modified/authentic device or a modified/fake/illegal device. If the training data set focuses on data relating to IoT devices of a particular type (e.g. having certain functions), the trained machine-learning module may produce an output which indicates whether or not a subject IoT device which produces a data sample undergoing analysis is a device of this particular type. Likewise, if the training data set focuses on data relating to a specific IoT device (e.g. device A belonging to user B), the trained machine-learning module may produce an output which indicates whether or not an analysed data sample was produced by the specific IoT device A.

Although the invention has been described above with reference to certain specific embodiments, it is to be understood that various modifications and adaptations may be made within the scope of the appended claims.