Title:
DEVICE AND METHOD FOR LEARNING FROM DATA WITH MISSING VALUES
Document Type and Number:
WIPO Patent Application WO/2023/147857
Kind Code:
A1
Abstract:
A device (112, 203) for a communication network comprising multiple input nodes (101, 102, 103, 201, 202) each configured to process respective first data (105, 106, 107, 208, 209) relating to an entity (104, 207) and output respective second data, the device configured to implement multiple neural networks (109, 110, 111, 206a, 206b) and to: receive (701) a respective signal from at least one of the nodes, the respective signal being determined in dependence on an assessment of quality (116, 117, 211) of one or more of the respective first data and the respective second data; and in dependence on the or each respective signal, select (702) one of the neural networks to process respective second data received by the device from one or more of the nodes. This may allow the device to infer a property of an entity when reliable data is not available from all input nodes.

Inventors:
ZAIDI ABDELLATIF (DE)
MOLDOVEANU MATEI CATALIN (DE)
Application Number:
PCT/EP2022/052529
Publication Date:
August 10, 2023
Filing Date:
February 03, 2022
Assignee:
HUAWEI TECH CO LTD (CN)
ZAIDI ABDELLATIF (DE)
International Classes:
G01S5/02; G06N3/04; G06N3/08
Other References:
ZHANG QIANQIAN ET AL: "Semi-Supervised Learning for Channel Charting-Aided IoT Localization in Millimeter Wave Networks", 2021 IEEE GLOBAL COMMUNICATIONS CONFERENCE (GLOBECOM), IEEE, 7 December 2021 (2021-12-07), pages 1 - 6, XP034073441, DOI: 10.1109/GLOBECOM46510.2021.9685865
GUO JIAJIA ET AL: "Deep Learning-based CSI Feedback and Cooperative Recovery in Massive MIMO", 14 December 2020 (2020-12-14), XP055964554, Retrieved from the Internet [retrieved on 20220926]
NIRMAL ISURA ET AL: "Deep Learning for Radio-Based Human Sensing: Recent Advances and Future Directions", IEEE COMMUNICATIONS SURVEYS & TUTORIALS, IEEE, USA, vol. 23, no. 2, 12 February 2021 (2021-02-12), pages 995 - 1019, XP011856297, DOI: 10.1109/COMST.2021.3058333
Attorney, Agent or Firm:
KREUZ, Georg M. (DE)
Claims:
CLAIMS

1. A device (112, 203) for a communication network, the communication network comprising multiple input nodes (101, 102, 103, 201, 202) each configured to process respective first data (105, 106, 107, 208, 209) relating to an entity (104, 207) and output respective second data, the device (112, 203) being configured to implement multiple neural networks (109, 110, 111, 206a, 206b) and to perform steps comprising: receiving (701) a respective signal from at least one of the input nodes (102, 103, 202), the respective signal being determined in dependence on an assessment of a quality (116, 117, 211) of one or more of the respective first data (105, 106, 107, 208, 209) processed at the respective input node (102, 103, 202) and the respective second data output by the respective input node (102, 103, 202); and in dependence on the or each respective signal from the at least one of the input nodes (102, 103, 202), selecting (702) one of the multiple neural networks to process respective second data received by the device from one or more of the input nodes (101, 102, 103, 201, 202).

2. The device (112, 203) as claimed in claim 1, wherein the respective second data comprises an output of a respective neural network (109, 110, 111, 204, 205) implemented by the respective input node (101, 102, 103, 201, 202).

3. The device (112, 203) as claimed in claim 2, wherein the respective second data comprises an activation vector of a last layer of the respective neural network (109, 110, 111, 204, 205) implemented by the respective input node (101, 102, 103, 201, 202).

4. The device (112, 203) as claimed in claim 3, wherein the device (112, 203) is configured to process respective second data received by the device (112, 203) from two or more of the input nodes (101, 102, 103, 201, 202), and wherein the device (112, 203) is configured to concatenate the activation vectors of the last layers of the respective neural networks (109, 110, 111, 204, 205) implemented by the respective two or more of the input nodes (101, 102, 103, 201, 202).

5. The device (112, 203) as claimed in any preceding claim, wherein the device (112, 203) is configured to process respective second data received by the device (112, 203) from two or more of the input nodes (101, 102, 103, 201, 202) and wherein a sum of dimensions of the second data received by the device (112, 203) from the two or more of the input nodes (101, 102, 103, 201, 202) is equal to a size of the first layer of the neural network (113, 114, 115, 206a, 206b) selected by the device (112, 203).

6. The device (112, 203) as claimed in any preceding claim, wherein the communication network comprises K input nodes and wherein the device (112, 203) is configured to implement K neural networks.

7. The device (112, 203) as claimed in any preceding claim, wherein the assessment of the quality of one or more of the respective first data (106, 107, 209) processed at the respective input node (102, 103, 202) and the respective second data output by the respective input node (102, 103, 202) comprises determining a quality indicator (116, 117, 211) in dependence on one or more of the respective first data (106, 107, 209) processed at the respective input node (102, 103, 202) and the respective second data output by the respective input node (102, 103, 202).

8. The device (112, 203) as claimed in claim 7, wherein the respective signal indicates whether the quality indicator (116, 117, 211) is above a threshold.

9. The device (112, 203) as claimed in claim 8, wherein the device (112, 203) is configured to process respective second data received from the one or more of the input nodes (102, 103, 202) if the quality indicator (116, 117, 211) is above the threshold.

10. The device (112, 203) as claimed in claim 9 as dependent on claim 3, wherein the device (112, 203) is configured to receive the activation vector of the last layer of the respective neural network (110, 111, 205) implemented by the respective one or more of the input nodes (102, 103, 202) if the quality indicator (116, 117, 211) is above the threshold.

11. The device (112, 203) as claimed in any of claims 8 to 10, wherein the device (112, 203) is configured to not receive data from an input node (102, 103, 202) if its quality indicator (116, 117, 211) is not above the threshold.

12. The device (112, 203) as claimed in any preceding claim, wherein the signal is a 1-bit flag.

13. The device (112, 203) as claimed in any preceding claim, wherein the multiple input nodes (101, 102, 103, 201, 202) in the communication network comprise a primary node (101, 201) and at least one secondary node (102, 103, 202), wherein the one or more of the input nodes (101, 102, 103, 201, 202) from which second data is processed by the device (112, 203) includes the primary node (101, 201).

14. The device (112, 203) as claimed in claim 13, wherein the primary node (101, 201) is a predetermined input node considered to process data having a highest quality indicator of the respective first data processed by the respective input nodes (101, 102, 103, 201, 202).

15. The device (112, 203) as claimed in claim 13 or claim 14, wherein all of the at least one of the nodes (102, 103, 202) from which a respective signal is received by the device are nodes other than the primary node (101, 201).

16. The device (112, 203) as claimed in any preceding claim, wherein the communication network is a wireless network.

17. The device (112, 203) as claimed in any preceding claim, wherein the device (112, 203) and each of the input nodes (101, 102, 103, 201, 202) are base stations.

18. The device (112, 203) as claimed in any preceding claim, wherein the entity (104, 207) is a vehicle or a mobile device.

19. The device (112, 203) as claimed in any preceding claim, wherein the device (112, 203) is configured to determine a location of the entity (104, 207).

20. The device (112, 203) as claimed in any preceding claim, wherein the respective first data (105, 106, 107, 208, 209) processed at each input node (101, 102, 103, 201, 202) comprises channel state information.

21. A method (700) for implementation at a device in a communication network, the communication network comprising multiple input nodes (101, 102, 103, 201, 202) each configured to process respective first data (105, 106, 107, 208, 209) relating to an entity (104, 207) and output respective second data, the device (112, 203) being configured to implement multiple neural networks (109, 110, 111, 206a, 206b), the method comprising: receiving (701) a respective signal from at least one of the input nodes (102, 103, 202), the respective signal being determined in dependence on an assessment of a quality (116, 117, 211) of one or more of the respective first data (105, 106, 107, 208, 209) processed at the respective input node (102, 103, 202) and the respective second data output by the respective input node (102, 103, 202); and in dependence on the or each respective signal from the at least one of the input nodes (102, 103, 202), selecting (702) one of the multiple neural networks (109, 110, 111, 206a, 206b) to process respective second data received from one or more of the input nodes (101, 102, 103, 201, 202).

22. A network node (102, 103, 202) configured to communicate with a device (112, 203) in a communications network and to process respective first data (106, 107, 209) relating to an entity (104, 207) and output respective second data, the network node (102, 103, 202) being configured to: perform (801) an assessment of a quality (116, 117, 211) of one or more of the first data (106, 107, 209) and the second data; and in dependence on the assessment of the quality (116, 117, 211) of one or more of the first data (106, 107, 209) and the second data, send (802) at least one of a signal and the second data to the device (112, 203).

23. The network node (102, 103, 202) as claimed in claim 22, wherein the network node (102, 103, 202) is configured to implement a neural network (110, 111, 205) for processing the first data (106, 107, 209) relating to the entity (104, 207).

24. The network node (102, 103, 202) as claimed in claim 23, wherein the second data comprises the output of the neural network (110, 111, 205) implemented by the network node (102, 103, 202).

25. The network node (102, 103, 202) as claimed in claim 24, wherein the second data comprises an activation vector of a last layer of the neural network (110, 111, 205) implemented by the network node (102, 103, 202).

26. The network node (102, 103, 202) as claimed in any of claims 22 to 25, wherein the assessment of the quality (116, 117, 211) of one or more of the first data (106, 107, 209) and the second data comprises determining a quality indicator (116, 117, 211) in dependence on one or more of the first data (106, 107, 209) and the second data.

27. The network node (102, 103, 202) as claimed in claim 26, wherein the signal indicates whether the quality indicator (116, 117, 211) is above a threshold.

28. The network node (102, 103, 202) as claimed in claim 27, wherein the network node (102, 103, 202) is configured to send the second data to the device (112, 203) if the quality indicator (116, 117, 211) is above the threshold.

29. The network node (102, 103, 202) as claimed in claim 28 as dependent on claim 25, wherein the network node (102, 103, 202) is configured to send the activation vector of the last layer of the neural network (110, 111, 205) implemented by the network node (102, 103, 202) to the device if the quality indicator (116, 117, 211) is above the threshold.

30. The network node (102, 103, 202) as claimed in any of claims 27 to 29, wherein the network node (102, 103, 202) is configured to not send the second data to the device if its quality indicator (116, 117, 211) is not above the threshold.

31. The network node (102, 103, 202) as claimed in any of claims 22 to 30, wherein the signal is a 1-bit flag.

32. The network node (102, 103, 202) as claimed in any of claims 22 to 31, wherein the first data (106, 107, 209) comprises channel state information.

33. A method (800) for implementation at a network node (102, 103, 202) in a communications network, the network node being configured to communicate with a device (112, 203) and to process respective first data (106, 107, 209) relating to an entity (104, 207) and output respective second data, the method comprising: performing (801) an assessment of a quality (116, 117, 211) of one or more of the first data (106, 107, 209) and the second data; and in dependence on the assessment of the quality (116, 117, 211) of one or more of the first data (106, 107, 209) and the second data, sending (802) at least one of a signal and the second data to the device (112, 203).

Description:
DEVICE AND METHOD FOR LEARNING FROM DATA WITH MISSING VALUES

FIELD OF THE INVENTION

This invention relates to the processing of data in a network, for example to distributed learning and inference in such a network.

BACKGROUND

There are many techniques in which multiple nodes work together to perform distributed learning. A practical example is multiple base stations (BSs) communicating over a wireless network to locate a user. These solutions usually offer better performance than a single node. However, they introduce a reliability problem: some nodes may not always have reliable data. This issue can arise due to a fault in the data acquisition process, interference, or simply the absence of data to be measured by the node.

Currently, there are three main ways of attempting to solve this problem.

One method, which can be referred to as data imputation, replaces the missing data with artificially generated data. This can be done either by generating the artificial data based on prior behaviour, or by using dummy information. This technique does not reflect realistic channel variations and can fool the decision maker, thus reducing its performance. Additionally, the fact that the data is missing can itself be useful information for inference, and this information is lost when using data imputation.

Another method, which can be referred to as decoupled inference, performs inference on the information received from each node individually and then combines the resulting predictions. In the previous example, the user's location would first be estimated based on each BS individually and the results would then be averaged to give a final result. This technique can be highly suboptimal, as information about the correlation between the sources is lost.

A further method would require training the same model on a dataset formed of a mixture of missing and complete data. The choice of statistics for the resulting dataset needs to match the real-world scenario, which might not always be known. Additionally, implementing this approach for some classes of learning techniques, such as neural networks (NNs), can prove complicated.

It is desirable to develop a method which overcomes such problems.

SUMMARY

According to one aspect there is provided a device for a communication network, the communication network comprising multiple input nodes each configured to process respective first data relating to an entity and output respective second data, the device being configured to implement multiple neural networks and to perform steps comprising: receiving a respective signal from at least one of the input nodes, the respective signal being determined in dependence on an assessment of a quality of one or more of the respective first data processed at the respective input node and the respective second data output by the respective input node; and in dependence on the or each respective signal from the at least one of the input nodes, selecting one of the multiple neural networks to process respective second data received by the device from one or more of the input nodes.

This may allow for a device that can effectively infer a property of an entity using a neural network when reliable data may not be available from some of the input nodes in the network.

The respective second data may comprise an output of a respective neural network implemented by the respective input node. The use of a neural network may allow the respective first data to be processed at the respective input node and allow the output to be provided to the device.

The respective second data may comprise an activation vector of a last layer of the respective neural network implemented by the respective input node. This may allow the device to receive the second data from input nodes and allow data from multiple input nodes to be combined to form a combined input to the neural network implemented by the device.

The device may be configured to process respective second data received by the device from two or more of the input nodes. The device may be configured to concatenate the activation vectors of the last layers of the respective neural networks implemented by the respective two or more of the input nodes. This may allow data from multiple input nodes to be combined to form a combined input to the neural network implemented by the device.

The device may be configured to process respective second data received by the device from two or more of the input nodes, and a sum of dimensions of the second data received by the device from the two or more of the input nodes may be equal to a size of the first layer of the neural network selected by the device. This may allow the output of one or more neural networks at one or more of the input nodes to be input to the neural network implemented by the device.
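The concatenation and dimension-matching described above can be sketched as follows. This is an illustrative sketch only, not an implementation from the application: the function name `fuse_activations` and the vector sizes are assumptions chosen for the example.

```python
# Hypothetical sketch: concatenating the last-layer activation vectors
# received from the participating input nodes to form the input of the
# neural network selected at the device (the fusion node).

def fuse_activations(activation_vectors, expected_input_size):
    """Concatenate per-node activation vectors into one fused input.

    The sum of the dimensions of the received second data must equal the
    size of the first layer of the selected neural network.
    """
    fused = [x for vec in activation_vectors for x in vec]
    if len(fused) != expected_input_size:
        raise ValueError(
            f"fused input has {len(fused)} dimensions, but the selected "
            f"network expects {expected_input_size}")
    return fused

# Example: two nodes each output a 3-dimensional activation vector, so the
# selected network's first layer must have size 6.
fused = fuse_activations([[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]], 6)
```

Under this sketch, a different subset of participating nodes yields a different fused dimension, which is one reason a separate network can be kept per subset of reporting nodes.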

The communication network may comprise K input nodes. The device may be configured to implement K neural networks. This may allow a neural network to be trained and subsequently implemented depending on which of the K input nodes process reliable data from the entity.

The assessment of the quality of one or more of the respective first data processed at the respective input node and the respective second data output by the respective input node may comprise determining a quality indicator in dependence on one or more of the respective first data processed at the respective input node and the respective second data output by the respective input node. This may be a convenient way of determining whether the data processed at an input node is reliable.

The respective signal may indicate whether the quality indicator is above a threshold. This may allow the input nodes to signal to the device whether or not they process reliable data.
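The quality indicator and threshold comparison could be sketched as below. The application does not fix a particular indicator, so the mean-absolute-value score used here is purely an assumption for illustration, as are the function names.

```python
# Illustrative only: the application leaves the quality indicator open.
# Here we assume, as an example, that a node scores its first data (e.g.
# CSI measurements) by mean absolute value and signals a 1-bit flag.

def quality_indicator(first_data):
    """A hypothetical quality indicator: mean absolute value of the data."""
    return sum(abs(x) for x in first_data) / len(first_data)

def quality_flag(first_data, threshold):
    """1-bit flag sent to the device: 1 if the indicator is above the
    threshold (data considered reliable), 0 otherwise."""
    return 1 if quality_indicator(first_data) > threshold else 0

flag = quality_flag([0.9, -1.1, 1.0], threshold=0.5)  # -> 1
```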

The device may be configured to process respective second data received from the one or more of the input nodes if the quality indicator is above the threshold. This may allow input nodes which process reliable data to subsequently send the outputs of their respective neural networks to the device.

The device may be configured to receive the activation vector of the last layer of the respective neural network implemented by the respective one or more of the input nodes if the quality indicator is above the threshold. This may allow input nodes which process reliable data to subsequently send the activation vector of the last layer of their respective neural networks to the device to be used for the inference.

The device may be configured to not receive data from an input node if its quality indicator is not above the threshold. The device may be configured to not receive data from an input node if the quality indicator indicates the absence of reliable data in the respective first data processed by that input node. This may allow the device to only receive data from input nodes that are considered to process the most reliable data.

The signal may be a 1-bit flag. This may be a convenient way of signalling to the device that an input node has reliable data to process.

The multiple input nodes in the communication network may comprise a primary node and at least one secondary node. The one or more of the input nodes from which second data is processed by the device may include the primary node. The primary node may be a predetermined input node considered to process data having a highest quality indicator of the respective first data processed by the respective input nodes. The primary node may be predetermined by ordering the input nodes in order of reliability. The primary node may be the most reliable node. The device may receive the output of the neural network implemented by the primary node.
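The selection step at the device, driven by the 1-bit flags from the secondary nodes with the primary node always participating, might be sketched as follows. The mapping from the participating set to a trained network, the node identifiers, and the function name `select_network` are illustrative assumptions, not details taken from the application.

```python
# Hypothetical sketch of the selection step at the fusion device. Each
# secondary node sends a 1-bit flag indicating whether its quality
# indicator is above the threshold; the primary node (id 1 here, by
# assumption) always participates.

def select_network(secondary_flags, networks_by_participants):
    """Choose the neural network trained for the set of reporting nodes.

    secondary_flags: dict mapping secondary-node id -> 1-bit flag (0 or 1).
    networks_by_participants: dict mapping a frozenset of participating
    node ids (always including the primary node) to a network identifier.
    """
    participants = frozenset(
        [1] + [node for node, flag in secondary_flags.items() if flag == 1])
    return networks_by_participants[participants]

# Example with a primary node (1) and two secondary nodes (2, 3); the
# network names are placeholders.
networks = {
    frozenset({1}): "NN-primary-only",
    frozenset({1, 2}): "NN-1-2",
    frozenset({1, 3}): "NN-1-3",
    frozenset({1, 2, 3}): "NN-all",
}
selected = select_network({2: 1, 3: 0}, networks)  # -> "NN-1-2"
```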

All of the at least one of the nodes from which a respective signal is received by the device may be nodes other than the primary node. In this case, the device does not receive a signal from the primary node.

The communication network may be a wireless network. This may be a convenient implementation that may allow the device to be used in a variety of networking applications.

The device and each of the input nodes may be base stations. This may allow the device to be used in telecommunications networks.

The entity may be a vehicle or a mobile device. This may allow the device to determine a property of a variety of entities that is of use in real world scenarios.

The device may be configured to determine a location of the entity. This may allow the device to be used in applications requiring accurate location determination.

The respective first data processed at each input node may comprise channel state information (CSI). This may allow the device to utilise CSI signals sent from the entity to each of the input nodes.

According to a second aspect there is provided a method for implementation at a device in a communication network, the communication network comprising multiple input nodes each configured to process respective first data relating to an entity and output respective second data, the device being configured to implement multiple neural networks, the method comprising: receiving a respective signal from at least one of the input nodes, the respective signal being determined in dependence on an assessment of a quality of one or more of the respective first data processed at the respective input node and the respective second data output by the respective input node; and in dependence on the or each respective signal from the at least one of the input nodes, selecting one of the multiple neural networks to process respective second data received from one or more of the input nodes.

This method may allow for effective inference of a property of an entity when reliable data may not be available from some of the input nodes in the network.

According to a third aspect there is provided a network node configured to communicate with a device in a communications network and to process respective first data relating to an entity and output respective second data, the network node being configured to: perform an assessment of a quality of one or more of the first data and the second data; and in dependence on the assessment of the quality of one or more of the first data and the second data, send at least one of a signal and the second data to the device.

This may allow for a network node that can effectively determine whether data it has processed is reliable enough to be used for inference at the device and signal this to the device, along with its processed data in the case that it is deemed reliable.

The network node may be configured to implement a neural network for processing the first data relating to the entity. This may allow a node in a network to individually process its respective data and provide the output to the device.

The second data may comprise the output of the neural network implemented by the network node. The second data may comprise an activation vector of a last layer of the neural network implemented by the network node.

The assessment of the quality of one or more of the first data and the second data may comprise determining a quality indicator in dependence on one or more of the first data and the second data. This may be a convenient way of determining whether the data processed at an input node is reliable.

The signal may indicate whether the quality indicator is above a threshold. This may allow the network node to signal to the device whether or not it processes reliable data.

The network node may be configured to send the second data to the device if the quality indicator is above the threshold. This may allow a network node which processes reliable data to subsequently send the output of its neural network to the device.

The network node may be configured to send the activation vector of the last layer of the neural network implemented by the network node to the device if the quality indicator is above the threshold. This may allow network nodes which process reliable data to subsequently send the activation vector of the last layer of their respective neural networks to the device.

The network node may be configured to not send the second data to the device if its quality indicator is not above the threshold. This may allow the device to only receive data from network nodes that are considered to process the most reliable data.

The signal may be a 1-bit flag. This may be a convenient way of signalling to the device that an input node has reliable data to process.

The first data may comprise channel state information. This may allow the device to utilise CSI signals sent from the entity.

According to a further aspect there is provided a method for implementation at a network node in a communications network, the network node being configured to communicate with a device and to process respective first data relating to an entity and output respective second data, the method comprising: performing an assessment of a quality of one or more of the first data and the second data; and in dependence on the assessment of the quality of one or more of the first data and the second data, sending at least one of a signal and the second data to the device.

This method may allow for a network node that can effectively determine whether data it has processed is reliable enough to be used for inference at the device and signal this to the device, along with its processed data in the case that it is deemed reliable.

BRIEF DESCRIPTION OF THE FIGURES

The present invention will now be described by way of example with reference to the accompanying drawings.

In the drawings:

Figure 1 schematically illustrates an example of an inference diagram for the architecture described herein in the case in which there are K nodes that observe measurements from an entity.

Figure 2 schematically illustrates an exemplary application for the architecture described herein.

Figure 3 schematically illustrates an example of the first step in the training process where NN1 is trained together with NN3-Part1.

Figure 4 schematically illustrates an example of the second step in the training process where NN2 and NN3-Part2 are trained together with the fixed NN1.

Figure 5 schematically illustrates an example of the forward and backward passes between BS1, BS2 and BS3 when training using data from both BS1 and BS2.

Figure 6 schematically illustrates an exemplary communication protocol between three base stations, BS1, BS2 and BS3, during inference.

Figure 7 shows an example of a method for implementation at a device in a communication network in accordance with embodiments of the present invention.

Figure 8 shows an example of a method for implementation at a network node in a communication network in accordance with embodiments of the present invention.

DETAILED DESCRIPTION

Described herein is an architecture for distributed learning and inference that can effectively operate in the presence of missing data and/or unreliable data.

In embodiments of the present invention, a network comprises devices including multiple input nodes and a device for processing data received from one or more of the input nodes. The latter device is herein referred to as a fusion node. The device and each of the input nodes may comprise a processor and a memory. The memory stores, in a non-transient way, code that is executable by the processor to implement the device or respective node in the manner described herein. The device and each of the input nodes also comprise a transceiver for transmitting and receiving data. The network may be a wireless network or a wired network.

There can be an arbitrary number of information sources that provide information about an entity that are observed or measured or received each at a distinct input node in the network. These input nodes can encode their information and send it to the fusion node, which performs the inference. The inference stage can infer a property of the entity, such as its location.

As discussed above, networks using multiple sources of data to perform inference usually offer better performance than a single node. However, some nodes may not always process reliable data, or may not process any data at all. Embodiments of the present invention can account for such eventualities.

Figure 1 schematically illustrates an inference diagram for an exemplary architecture for embodiments of the present invention in the case where there are K input nodes that observe measurements from an entity. Node 1 is shown at 101, node 2 at 102 and node K at 103. The input nodes are each configured to process data relating to an entity 104, herein referred to as respective first data. This first data may comprise channel state information (CSI). In this example, respective first data is received from the entity by the respective node. The respective first data received from the entity by node 1 is indicated at 105, the respective first data received from the entity by node 2 is indicated at 106 and the respective first data received from the entity by node K is indicated at 107. The first data may be transmitted from the entity to the nodes via the internet, illustrated at 108.

In this example, the input nodes 101, 102, 103 each receive a data signal from the entity 104 comprising the respective first data. Alternatively, the input nodes may make measurements of a property or characteristic of the entity and optionally record the first data measurements at the respective input node. In a further example, an input node may receive a data signal from a camera sensor that records images, or video, of the entity. In the above examples, the first data therefore relates to the entity. The input nodes are each configured to process their respective first data.

The network therefore comprises multiple input nodes each configured to process respective first data relating to the entity. The first data may be sent to the input node from the entity. Preferably, each of the input nodes observes a different measurement relating to the entity. For example, for a problem of localization, to determine the location of an object, one sensor providing first data to be processed at an input node may be an accelerometer located at the entity (for example, inside a mobile phone) and another may record a GPS location of the entity. In another example, one sensor providing first data to be processed at an input node may record video and another sensor may be an accelerometer located at the entity (for example, inside a mobile phone). In another example, base stations acting as input nodes may measure signal strength or channel state estimation and process this data.

As illustrated in Figure 1, each node 101, 102, 103 is configured to implement a neural network (NN) to process its respective first data. The NNs implemented by node 1, node 2 and node K are indicated at 109, 110 and 111 respectively. The NNs implemented by the input nodes output respective second data. In the preferred implementation, the output of an input node comprises an activation vector of the last layer (i.e. the output layer) of the respective neural network implemented by that input node. The input nodes are each configured to output their respective second data.
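The per-node processing described above can be illustrated with a minimal sketch. The single dense layer, its weights, and the function names are assumptions made for the example; a real input node's NN would be larger and trained as described later in this document.

```python
# Illustrative sketch of an input node's processing: a single dense layer
# with ReLU activation (placeholder weights) maps the node's first data to
# its second data, i.e. the activation vector of the node's last layer.

def relu(x):
    return [max(0.0, v) for v in x]

def node_forward(first_data, weights, biases):
    """One dense layer followed by ReLU; the returned vector plays the role
    of the 'second data' the node would send to the fusion node."""
    out = []
    for w_row, b in zip(weights, biases):
        out.append(sum(w * x for w, x in zip(w_row, first_data)) + b)
    return relu(out)

# Example: 3-dimensional first data mapped to a 2-dimensional activation
# vector (the second data).
second_data = node_forward([1.0, 0.5, -0.2],
                           [[0.2, 0.1, 0.0], [0.0, -0.3, 0.4]],
                           [0.05, 0.0])
```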

Some of the input nodes may process data that may be considered to be more reliable than the data processed by other input nodes. The sources of information measured by each input node can be placed in order of reliability. This can be done prior to the training of the NNs implemented by the input nodes and the fusion node 112. In this example, node 1 101 is the most reliable node, i.e. node 1 processes the most reliable data from the entity. Node K 103 is the least reliable node.

This node 101 is referred to herein as the primary node. The primary node is a predetermined node which is considered to process the most reliable data of the data processed by the input nodes. The other, less reliable, nodes in the network 102 and 103 can be referred to as secondary nodes.

For example, node 101 may be the spatially closest node to the entity and may therefore be considered to receive the most reliable data from the entity because the distance over which data signals are wirelessly transmitted is the smallest. The least reliable input node may be located furthest from the entity (i.e. data may be more unreliable in proportion to the distance of the input node from the entity). In another example, one input node may process data from a sensor that is more expensive and/or sophisticated than another and as a consequence may be more reliable. In a further example, one sensor may collect data of a type that is easier to process and therefore may be more reliable. For example, the algorithm used for location based on GPS data can be more accurate than localization based on images, for example from video. In another example, some sensors may be easier to access and maintain and thus would be easier to fix and keep working and therefore may be more reliable. The most reliable node and the order of reliability of the secondary nodes is preferably predetermined before training the neural networks implemented by the device.

Data may also be unreliable due to a fault in the data acquisition process, interference or the absence of data to be measured or received by the input node (i.e. an absence of respective first data at a respective input node).

The input node may determine whether its data is reliable, which can allow the input nodes to be ranked in order of reliability, for example, ranked from the primary node (node 1 101 in Figure 1) to the least reliable secondary node (node K 103 in Figure 1). As shown in Figure 1 , the fusion node 112 is configured to implement multiple neural networks 113, 114, 115. At least some of the multiple NNs are different. Two NNs may be considered to be different if they have the same architecture, but different weights. The fusion node may select one of its possible neural networks 113, 114, 115 to perform the inference.

Thus, in this example, there are K input nodes in the network that each observe data. Each input node implements its own NN. In the preferred example, the fusion node 112 (node (K+1)) has a NN formed of K separate parts.

During the inference phase, the input nodes that are not the primary node (the primary node being considered to be the most reliable node) can assess the quality of the data they have received, measured or processed relating to the entity. In a preferred example, the assessment of the quality of the first and/or second data at an input node is performed by determining a Quality Indicator (QI). In the example shown in Figure 1, QIs are determined for the respective first data processed by nodes 102 and 103, shown at 116 and 117 respectively.

The QI may be received at the input node alongside the first data. For example, the QI may be (or may be determined in dependence on) a received signal strength indicator (RSSI). Alternatively, the NN implemented by a respective input node may output a confidence level on its prediction and the QI may be determined in dependence on the confidence level. In another example, the QI may be measured directly from the first data, for example based on the signal-to-noise ratio (SNR) of the first data.

Data at an input node may be considered to be reliable if the determined QI for that input node is above a threshold. The threshold may be predetermined. The threshold may be a chosen value above which the signal is still considered to be useful or reliable. Data at an input node may be considered to be unreliable if the QI for that input node is not above the threshold. The threshold is greater than zero. If the input node has not received any data from the entity (i.e. the first data comprises no data), the QI may, for example, be zero for that input node (i.e. below the threshold).
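The threshold decision above can be sketched as follows. This is an illustrative sketch only: the function names, the default threshold value and the QI scale are assumptions, since the document only requires a positive threshold above which data is considered reliable.

```python
# Illustrative sketch of the QI threshold check described above.
# The threshold value 0.5 is an assumption; the document only
# requires that the threshold is greater than zero.
def is_reliable(qi: float, threshold: float = 0.5) -> bool:
    """Data at an input node is reliable iff its QI is above the threshold."""
    return qi > threshold


def qi_for_missing_data() -> float:
    """An input node that received no first data may report a QI of zero."""
    return 0.0
```

With these assumptions, a node with no data (QI of zero) is always classified as unreliable, because the threshold is strictly positive.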

The primary node may have the highest QI of the nodes in the network, as determined prior to training of the NNs. This node can then be selected as the primary node for the subsequent training of the NNs and the inference.

The NN implemented by a respective input node may output both a quality assessment (for example, a QI) and the second data. In another example, the NN may output a quality assessment and some additional data that together form the second data sent to the fusion node. The second data itself (i.e. the data sent to the fusion node if a node is considered to be reliable) may also be used to assess the quality of data at a respective input node. This may allow for detection of a problem that has occurred during processing of first data by the NN implemented by a particular input node that may result in the second data output by the input node being unreliable. This can allow the fusion node to not use the second data output by that input node during inference. In some examples, an input node may assess the quality of both its respective first data and second data.

The quality assessment may be used to determine the signal (as will be described in more detail below) and, in dependence on the signal, the second data can be sent to the fusion node.

If the first data processed at a secondary input node and/or the second data output at the secondary node is considered to be reliable, the respective second data output from the respective neural network implemented at that input node is input to a neural network implemented by the fusion node 112. The neural network implemented by the fusion node 112 is selected in dependence on which of the input nodes are considered to process reliable data (for example, which of the input nodes have a QI that is above a threshold). As will be described in more detail below, this can be done in dependence on signals received from input nodes that are not the primary node.

As the primary node 101 is considered to process reliable data, its second data (the output of its NN) is sent to the fusion node 112 during inference.

The input nodes that are not the primary node (i.e. the one or more secondary nodes) can each send a signal to the fusion node indicating whether they have processed reliable data. The signal is determined in dependence on the assessment of the quality of the first data received or measured by the node, and/or the data output by the neural network implemented by the node (i.e. in dependence on the respective second data). The signal sent to the fusion node by an input node may be determined in dependence on the QI of that input node. Therefore, the input nodes at which a quality assessment has been carried out (nodes 102 and 103 in the example shown in Figure 1) each send a signal to the fusion node.

The signal is preferably a 1-bit flag. For example, the signal '1' may indicate that the input node processes reliable data (or no data) and the signal '0' may indicate that the input node does not process reliable data. The outputs of the respective neural networks may also be encoded by relative entropy coders, shown at 118, 119 and 120 for nodes 101, 102 and 103 respectively.

Depending on which input nodes have indicated via their respective signals that they have reliable data, the fusion node selects the appropriate decoder NN to be used for the inference.

Therefore, in the example of Figure 1 , fusion node 112 receives a respective signal from the input nodes 102 and 103. The signal from an input node notifies the fusion node whether the data at that input node is reliable or not. In the preferred example, the device receives signals from all nodes except the primary node 101 , which is considered to have the most reliable data. Thus, preferably, the primary node 101 does not send such a signal (i.e. a signal determined in dependence on an assessment of the quality of the first data processed by the primary node and/or the second data output by the primary node) to the fusion node.

In dependence on the one or more signals received from the at least one input node(s) (i.e. all input nodes except for the primary node, which is a predetermined node considered to process reliable data), the fusion node selects one of the neural networks to process the second data received by the device from one or more of the input nodes.

Therefore, the second data output from the primary node 101 is used by the fusion node for the inference, along with respective second data output from any other input nodes which are considered to be reliable. The output of the inference, shown at 121 , may be a property of the entity, such as its location.

The device therefore receives processed data (i.e. respective second data) from the input node designated as the primary node. In addition, the device can receive second data from other input nodes that are considered to process reliable data (for example, if the QI for that node is above a threshold).

The at least one secondary node(s) in the network can therefore output a signal, which is received by the fusion node. Depending on the signal(s) from the at least one input node (i.e. the secondary node(s)), the fusion node is configured to perform one of the operations below.

If the signal(s) from at least one secondary node indicate that the data at those secondary node(s) is reliable, the fusion node is configured to perform the following operations:

• obtain outputs of the NNs implemented by the primary node and the at least one secondary node;

• combine the obtained outputs to generate a combined output;

• provide the combined output as an input to the NN implemented by the fusion node; and

• run the NN implemented by the fusion node based on the input.

The combined output may be formed by concatenating activation vectors of the last layers (i.e. output layers) of the NNs implemented by the primary node and one or more secondary nodes that have reliable data. In this case, the size of the input layer of the NN implemented by the fusion node is equal to the sum of sizes of output layers of the NNs implemented by the primary node and the one or more secondary nodes (the secondary nodes which have reliable data).

If the signal(s) from the at least one secondary node indicate that the data at the secondary node(s) is not reliable, the fusion node is configured to perform the following operations:

• obtain the output of the NN implemented by the primary node;

• provide the obtained output as an input to the NN implemented by the fusion node; and

• run the NN implemented by the fusion node based on the input.

For input nodes with reliable data, the fusion node may receive the reliability indicator signal and the respective second data from a reliable node at the same time; for example, in one transmission packet. For input nodes with unreliable data, the fusion node only receives the respective signal from each of these nodes: it does not receive any second data from these nodes.
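The fusion node's input assembly under the two cases above can be sketched as below. The function name and the representation of a secondary node as a (flag, vector) pair are illustrative assumptions; the point is that the primary node's output is always included, while a secondary node's output is concatenated only when its 1-bit flag indicates reliable data.

```python
# Hedged sketch of how the fusion node could form the combined input:
# the primary node's activation vector is always used; each secondary
# node contributes its vector only if its 1-bit flag is 1. When the
# flag is 0, no second data is received from that node at all.
def assemble_fusion_input(primary_out, secondary):
    """primary_out: activation vector of the primary node's NN.
    secondary: (flag, vector) pairs in order of node reliability;
    vector is None when the flag is 0 (no data is sent)."""
    combined = list(primary_out)
    for flag, vector in secondary:
        if flag == 1:              # reliable data: concatenate this output
            combined.extend(vector)
    return combined
```

The length of the result matches the input-layer size of the selected decoder NN: the sum of the output-layer sizes of the contributing encoders.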

The assessment of data quality performed at the input nodes (for example, the determination of the QIs) can therefore assist the fusion node in deciding which decoder NN to use and can allow the learning architecture to deal with missing or unreliable data during inference.

The training of the NNs in the network will now be generally described. The training method for this technique is such that each node is only trained once. For a network with K input nodes, the input nodes are ordered by their reliability from node 1 (the most reliable) to node K (the least reliable node). The process starts by training the most reliable node (node 1) with its own decoder at the fusion node. Afterwards, the encoder of node 1 is fixed and the fusion node saves the decoder.

The NN at node 1 (NN1) is trained together with part 1 of the NN at the fusion node (NN(K+1)-Part1).

During the forward pass of the training phase, the primary node 1 uses its available data to perform a forward pass through its NN1. It then sends the output of the last layer of NN1 to the fusion node, which uses the received vector as input into its own NN(K+1)-Part1. The fusion node runs backpropagation on its own NN(K+1)-Part1 and then sends the error vector of its input layer to node 1 for it to continue the backpropagation on NN1. This process repeats until convergence. Once it finishes, the NNs (NN1 and NN(K+1)-Part1) are fixed. In the next step of the training process, the NN at node 2 (NN2) is trained together with the fixed NN1 and a new decoder NN(K+1)-Part2 at the fusion node. Node 1 and node 2 do a forward pass on their NNs simultaneously and send the activation values of their last layers to the fusion node. There the received vectors are concatenated and used as input into NN(K+1)-Part2.

During the backward pass, the fusion node completes the propagation on its own NN and then vertically splits the error vector at its input layer. It then only sends the part corresponding to the input from node 2 to node 2. It does not send anything to node 1. The process repeats until convergence.
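The vertical split of the error vector can be sketched as below. The function name is illustrative; it shows the mechanical step only: the gradient at the fusion node's input layer is cut into per-node slices matching the output sizes of the contributing encoders, and only the slices for encoders still being trained are forwarded.

```python
# Hedged sketch of the error-vector split in the backward pass: the
# fusion node slices the gradient at its input layer by the output
# sizes of the contributing encoder NNs. During stage 2, only the
# slice corresponding to node 2 would be sent onwards; node 1's slice
# is discarded because NN1 is already fixed.
def split_error_vector(error, sizes):
    """Split a flat error vector into per-node slices of the given sizes."""
    parts, start = [], 0
    for size in sizes:
        parts.append(error[start:start + size])
        start += size
    return parts
```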

The next step is to train the NN implemented by node 2 (the second most reliable node) together with the fixed NN for node 1 and a different decoder.

After training, the encoder at node 2 is fixed and the fusion node saves the obtained decoder.

NN2 and NN(K+1)-Part2 are again fixed and the process repeats by incrementally adding the NN at each of the remaining nodes and training them, until convergence, together with the previously fixed nodes and a new part of the NN(K+1).

The process continues until the NNs of all K nodes are trained and the fusion node has K decoders that it can use depending on the subset of nodes that have reliable data.
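The staged schedule described above can be summarised in code. This is a sketch under the naming convention used in this document (NN1..NNK for the encoders, NN(K+1)-Part-m for the fusion node's decoders); the function itself is hypothetical.

```python
# Illustrative sketch of the staged training schedule: at stage m,
# only the encoder of node m and decoder Part-m are trainable; all
# previously trained encoders are frozen but still run forward passes.
def training_schedule(k):
    """For K input nodes, return (stage, trainable, frozen) triples."""
    stages = []
    for m in range(1, k + 1):
        trainable = [f"NN{m}", f"NN{k + 1}-Part{m}"]
        frozen = [f"NN{j}" for j in range(1, m)]  # earlier encoders fixed
        stages.append((m, trainable, frozen))
    return stages
```

Because each encoder is trained exactly once and then frozen, adding a (K+1)-th input node later only appends one more stage; nothing already trained is revisited.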

As discussed above, during inference, each of the nodes assesses the quality of its data and sends a signal (for example, a 1-bit signal) to the fusion node indicating the presence or absence of reliable data. The fusion node receives data from the subset of nodes that have reliable data and for which it has a trained NN. The reliable nodes are, in the preferred implementation, a consecutively numbered subset of nodes in the order of reliability of the nodes, including the primary node and consecutively numbered secondary nodes, up to the last numbered node which is considered to process reliable data. This makes training of the NNs more straightforward. The fusion node then uses the received information from those nodes with the appropriate part of its NN to make the inference.
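Decoder selection under this preferred implementation reduces to finding the length of the consecutive reliable prefix. The sketch below is an assumption about one way to implement that selection; the primary node (node 1) always counts as reliable and sends no flag.

```python
# Hedged sketch of decoder selection when reliable nodes form a
# consecutive prefix in reliability order: the fusion node uses
# NN(K+1)-Part-m, where nodes 1..m are the reliable prefix.
def select_decoder_part(secondary_flags):
    """secondary_flags: 1-bit signals from nodes 2..K in reliability
    order. Returns m, the decoder part index to use."""
    m = 1  # the primary node always contributes
    for flag in secondary_flags:
        if flag != 1:
            break                  # prefix ends at first unreliable node
        m += 1
    return m
```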

An exemplary application of the device to a wireless network is depicted in Figure 2.

The goal of the inference in this example is to estimate a property of a user; in this case, the position and/or speed of a car 207. In other examples, the user may be an object, mobile device or person/pedestrian. The position and/or speed of the car is estimated distributively based on Channel State Information (CSI) signals received from the car at two base stations (BS) of the wireless network. In this example, the two base stations, BS1 shown at 201 and BS2 shown at 202, act as input nodes.

The inference, which determines the position and/or speed of the user, is performed at some other distant/remote base station (BS3, shown at 203), which acts as the fusion node device in this example.

The three BSs BS1 201 , BS2 202 and BS3 203 that are involved in this example are each equipped with neural networks (NNs).

In the example shown in Figure 2, BS1 201 and BS2 202 communicate with BS3 203. The NN at BS1 is designated NN1 204, the NN at BS2 is designated NN2 205, and the NN at BS3, 206, is formed of two parts designated NN3-Part1 and NN3-Part2, shown in more detail in Figure 3.

In this example, it is assumed that, using an OFDM-based transmission scheme, there are K13 subcarriers for communication between BS1 and BS3 and K23 subcarriers for communication between BS2 and BS3.

The three BSs are equipped with NNs which are such that:

• NN1 at BS1 has an input layer of size dim(X1) and an output layer of size K13

• NN2 at BS2 has an input layer of size dim(X2) and an output layer of size K23

• NN3-Part1 at BS3 has an input layer of size K13

• NN3-Part2 at BS3 has an input layer of size K13 + K23

Figure 3 illustrates the first step in the training process. Here, the neural network implemented by BS1 201 , NN1 204, is trained together with the neural network NN3-Part1 206a, which can be implemented by BS3 203. NN2 205 and NN3-Part2 206b play no role in this first stage of the training.

BS1 receives signal X1 from the user, as shown at 208, and BS2 receives signal X2, as shown at 209, from the user. Thus the first data processed by BS1 201 is shown at 208 and the first data processed by BS2 202 is shown at 209. The output of the inference is shown at 210.

In this example, the following procedure is used. During the forward pass, BS1 201 computes a forward pass on its data 208. BS1 then sends the activation vector of the last layer of NN1 204 to BS3 203.

During the backward pass, BS3 203 first computes a backward pass by backpropagating the error vectors between the layers of its NN 206a, starting from the last one. The output of the first layer of the NN 206a has dimension K13. This output vector is sent to BS1 201. BS1 continues the backward pass by backpropagating the error vector among the layers of its NN 204, starting from the last layer. This process repeats until the network converges. Once that happens, the resulting weights and biases for NN1 204 are fixed.

Figure 4 shows the next step in the training process where NN2 and NN3-Part2 are trained together with the fixed NN1. NN2 205 and NN3-Part2 206b are influenced by the already trained NN1 204. However, NN1 204 is not influenced by NN2 205 or NN3-Part2 206b at this stage. The parameters of NN1 204 do not update during this process.

Figure 5 schematically illustrates the forward and backward pass between BS1 201 , BS2 202 and BS3 203 when training using data from both BS1 and BS2. As shown by the arrows in Figure 5, while the communication between BS2 and BS3 goes both ways, the communication between BS1 and BS3 is unidirectional (from BS1 to BS3).

During the forward pass, each of BS1 and BS2 simultaneously computes a forward pass on its data. BS1 then sends the activation vector of the last layer of its NN (which is of size K13) to BS3. BS2 sends the activation vector of the last layer of its NN (which is of size K23) to BS3. BS3 first concatenates the inputs it gets from BS1 and BS2 and then computes a forward pass on the obtained activation vector, which is of size (K13 + K23), as shown in Figure 5.

During the backward pass, BS3 first computes a backward pass by backpropagating the error vectors between the layers of its NN, starting from the last one. The output of the first layer of its NN has dimension (K13 + K23). This output vector is split into sub-vectors of dimensions K13 and K23 respectively, as shown in Figure 5. Only the vector of size K23 is sent to BS2; nothing is sent to BS1. BS2 then continues the backpropagation until the first layer of its NN.

Figure 6 shows an example of the communication protocol between BS1 201, BS2 202 and BS3 203 during inference. The Quality Indicator (QI) at BS2, shown at 211, provides the signal determined in dependence on an assessment of the quality of the data processed at BS2, which informs BS3 203 which part, 206a or 206b, of its NN it should use. The signal is preferably in the form of a 1-bit flag.

During the inference phase, BS1 201 computes a forward pass on its data using the learnt NN 204 and sends the activation vector of the last layer of its NN (which in this example is of size K13) to BS3 203.

BS2 202 checks the quality of its data and sends a signal to BS3 to indicate if it has reliable data or not. The signal may be formed in dependence on QI 211 . If QI 211 is above a threshold, the signal can indicate that BS2 202 is considered to process reliable data.

If the signal indicates that BS2 does not process reliable data (for example, because the QI 211 is not above a threshold), BS3 uses NN3-Part1 to process data from BS1 only. If the signal indicates that BS2 does process reliable data (for example, because the QI 211 is above a threshold), BS3 uses NN3-Part2 to process data from both BS1 and BS2.

If BS2 has access to reliable data, it computes a forward pass on its data using the learnt NN simultaneously with BS1 and sends the activation vector of the last layer of its NN (which in this example has size K23) to BS3. If BS2 does not have access to reliable data, it does not send any data (apart from the 1-bit flag) to BS3. Therefore, if BS3 receives a signal from BS2 indicating the absence of reliable data, BS3 computes a forward pass on the obtained activation vector from NN1 through NN3-Part1. If BS3 receives a signal from BS2 indicating the presence of reliable data, BS3 uses activation values from both BS1 and BS2 and then computes a forward pass on the obtained activation vector, which is of size (K13 + K23), through NN3-Part2 206b.
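BS3's inference branch for this two-base-station example can be sketched as follows. The function name is illustrative, and the two decoder parts are passed in as placeholder callables standing in for NN3-Part1 and NN3-Part2.

```python
# Hedged sketch of BS3's branch during inference: the 1-bit flag from
# BS2 selects NN3-Part1 (input from BS1 only, size K13) or NN3-Part2
# (concatenated input from BS1 and BS2, size K13 + K23). The decoder
# parts are placeholder callables, not real neural networks.
def bs3_infer(bs2_flag, v1, v2, nn3_part1, nn3_part2):
    """v1: activation vector from BS1; v2: activation vector from BS2
    (ignored when the flag is 0, since BS2 sends no data then)."""
    if bs2_flag == 1:                  # BS2 reports reliable data
        return nn3_part2(v1 + v2)      # concatenated input, size K13 + K23
    return nn3_part1(v1)               # fall back to BS1's output alone
```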

Figure 7 shows an example of a method 700 for implementation at a device in a communication network in accordance with embodiments of the present invention. As described in the above examples, the communication network comprises multiple input nodes each configured to process respective first data relating to an entity and output respective second data and the device is configured to implement multiple neural networks. At step 701 , the method comprises receiving a respective signal from at least one of the input nodes, the respective signal being determined in dependence on an assessment of a quality of one or more of the respective first data processed at the respective input node and the respective second data output by the respective input node. At step 702, the method comprises, in dependence on the or each respective signal from the at least one of the input nodes, selecting one of the multiple neural networks to process respective second data received from one or more of the input nodes.

Figure 8 shows an example of a method 800 for implementation at a network node in a communications network in accordance with embodiments of the present invention. As described in the examples above, the network node is configured to communicate with a device and to process respective first data relating to an entity and output respective second data. At step 801, the method comprises performing an assessment of a quality of one or more of the first data and the second data. At step 802, the method comprises, in dependence on the assessment of the quality of one or more of the first data and the second data, sending at least one of a signal and the second data to the device.

The solution can be extended to an arbitrary number of input nodes working together to infer the position, or other property, of the entity.

Where an input node is considered to process reliable data, the respective signal and the respective second data may be sent by the input node and received by the device in the same transmission (e.g. in the same data packet).

Another use of the network described above may be, for example, to form an estimated diagnosis of a disease or condition in a patient. In this case, the multiple input nodes may each measure a property of the patient (for example, blood pressure, breathing rate, heart rate etc) and process the data using their respective neural networks. The inference at the fusion node device may form an estimated diagnosis for the patient.

The approach described herein allows for improved performance and better reliability compared to previous methods when dealing with missing or unreliable data during inference.

One of the advantages of the described architecture is that when an additional node is added, the previously trained NNs do not need to be retrained. This can significantly reduce the cost of introducing new nodes. Thus there is reduced training complexity when increasing the number of nodes, as well as reduced bandwidth requirements and fault-aware inference.

The applicant hereby discloses in isolation each individual feature described herein and any combination of two or more such features, to the extent that such features or combinations are capable of being carried out based on the present specification as a whole in the light of the common general knowledge of a person skilled in the art, irrespective of whether such features or combinations of features solve any problems disclosed herein, and without limitation to the scope of the claims. The applicant indicates that aspects of the present invention may consist of any such individual feature or combination of features. In view of the foregoing description it will be evident to a person skilled in the art that various modifications may be made within the scope of the invention.




 