

Title:
FIRST WIRELESS NODE, OPERATOR NODE AND METHODS IN A WIRELESS COMMUNICATION NETWORK
Document Type and Number:
WIPO Patent Application WO/2023/208474
Kind Code:
A1
Abstract:
A method performed by a first wireless node for determining one or more preferred precoders in a wireless communications network is provided. The one or more precoders are maximizing a Signal-to-Noise Ratio, SNR, received by a second wireless node. The first wireless node obtains (601) a training model. The training model has been trained to provide an output comprising any one or more out of: the one or more preferred precoders maximizing the SNR received by the second wireless node, one or more reconstructed estimated channels, and one or more reconstructed estimated channel features. The first wireless node obtains (602) a first compressed channel feature codeword from the second wireless node. The first wireless node determines (603), based on the obtained first compressed channel feature codeword and the obtained training model, the one or more preferred precoders maximizing the SNR received by the second wireless node.

Inventors:
RINGH EMIL (SE)
TIMO ROY (SE)
AXNÄS JOHAN (SE)
ZHANG XINLIN (SE)
FRENNE MATTIAS (SE)
Application Number:
PCT/EP2023/056957
Publication Date:
November 02, 2023
Filing Date:
March 17, 2023
Assignee:
ERICSSON TELEFON AB L M (SE)
International Classes:
H04B7/0417; H04B7/0456; H04B7/06
Domestic Patent References:
WO2021217519A1, 2021-11-04
Other References:
Liu, Wendong et al.: "EVCsiNet: Eigenvector-Based CSI Feedback Under 3GPP Link-Level Channels", IEEE Wireless Communications Letters, vol. 10, no. 12, 15 September 2021, pages 2688-2692, ISSN: 2162-2337, DOI: 10.1109/LWC.2021.3112747, XP011892287
Chen, Muhan et al.: "Deep Learning-based Implicit CSI Feedback in Massive MIMO", arXiv.org, Cornell University Library, 21 May 2021, XP081967393
Ericsson: "Discussions on AI-CSI", 3GPP RAN WG1, online meeting, 16-27 May 2022, submitted 29 April 2022, XP052152910
3GPP TS 38.214: "Physical layer procedures for data (Release 16)"
Lu, Zhilin; Zhang, Xudong; He, Hongyi; Wang, Jintao; Song, Jian: "Binarized Aggregated Network with Quantization: Flexible Deep Learning Deployment for CSI Feedback in Massive MIMO System", arXiv:2105.00354, vol. 1, May 2021
Duchi, J.; Hazan, E.; Singer, Y.: "Adaptive subgradient methods for online learning and stochastic optimization", Journal of Machine Learning Research, 2010
Kingma, D.; Ba, J.: "Adam: A method for stochastic optimization", arXiv:1412.6980, December 2014
Costa, Nelson; Haykin, Simon: "Multiple-Input Multiple-Output Channel Models: Theory and Practice", John Wiley & Sons, 2010
Ioffe, S.; Szegedy, C.: "Batch normalization: Accelerating deep network training by reducing internal covariate shift", arXiv:1502.03167, March 2015
Trabelsi, C.; Bilaniuk, O.; Zhang, Y.; Serdyuk, D.; Subramanian, S.; Santos, J. F.; Mehri, S.; Rostamzadeh, N.; Bengio, Y.; Pal, C. J.: "Deep complex networks", arXiv:1705.09792, 2018
Scharnhorst, K.: "Angles in Complex Vector Spaces", arXiv:math/9904077, 1999
Chen, Muhan; Guo, Jiajia; Wen, Chao-Kai; Jin, Shi; Li, Geoffrey Ye; Yang, Ang: "Deep Learning-based Implicit CSI Feedback in Massive MIMO", arXiv:2105.10100, 2021
Pratik, Kumar; Amjad, Rana Ali; Behboodi, Arash; Soriaga, Joseph B.; Welling, Max: "Neural Augmentation of Kalman Filter with Hypernetwork for Channel Tracking", arXiv:2109.12561, 2021
Madadi, Pranav; Jeon, Jeongho; Cho, Joonyoung; Lo, Caleb; Lee, Juho; Zhang, Jianzhong: "PolarDenseNet: A Deep Learning Model for CSI Feedback in MIMO Systems", arXiv:2202.01246, 2022
Lu, Zhilin; Wang, Jintao; Song, Jian: "Multi-resolution CSI Feedback with Deep Learning in Massive MIMO System", arXiv:1910.14322, 2019
Galántai, A.; Hegedűs, Cs. J.: "Jordan's principal angles in complex vector spaces", Numerical Linear Algebra with Applications, 2006
Attorney, Agent or Firm:
BOU FAICAL, Roger (SE)
Claims:
CLAIMS

1. A method performed by a first wireless node (110) for determining one or more preferred precoders in a wireless communications network (100), wherein the one or more precoders are maximizing a Signal-to-Noise Ratio, SNR, received by a second wireless node (120), the method comprising: obtaining (601) a training model, wherein the training model has been trained, by minimizing a loss function indicative of a reconstruction loss of one or more reconstructed second channels and/or one or more reconstructed second channel features, to provide an output comprising any one or more out of:

- the one or more preferred precoders maximizing the SNR received by the second wireless node (120),

- one or more reconstructed estimated channels, and

- one or more reconstructed estimated channel features, obtaining (602), from the second wireless node (120), a first compressed channel feature codeword indicative of one or more first channels and/or one or more first channel features estimated by the second wireless node (120), and determining (603), based on the obtained first compressed channel feature codeword and the obtained training model, the one or more preferred precoders maximizing the SNR received by the second wireless node (120).

2. The method according to claim 1, wherein determining (603) the one or more preferred precoders comprises: providing the first compressed channel feature codeword as input to the obtained training model, which training model has been trained by minimizing the loss function, receiving any one or more out of: the one or more preferred precoders maximizing the SNR received by the second wireless node (120), one or more reconstructed estimated first channels and one or more reconstructed estimated first channel features, as output from the training model, and determining the one or more preferred precoders based on the output from the training model.

3. The method according to any of claims 1-2, wherein obtaining (601) the training model comprises: obtaining one or more second compressed channel feature codewords, indicative of one or more estimated second channels and/or one or more estimated second channel features, reconstructing the one or more estimated second channels and/or one or more estimated second channel features by using the training model, calculating, based on the reconstructed one or more estimated second channels and/or reconstructed one or more estimated second channel features and/or an output of the training model, a reconstruction loss using the loss function, and training the model to provide any one or more out of: the one or more preferred precoders maximizing the SNR received by the second wireless node (120), one or more reconstructed estimated channels and one or more reconstructed estimated channel features, by using machine learning based on the calculated reconstruction loss, to minimize the loss function.

4. The method according to claim 3, wherein the reconstruction loss is any one or more out of:

- a sum of a plurality of reconstruction losses calculated based on a plurality of subbands,

- a sum of reconstruction losses calculated based on a plurality of transmission layers,

- a weighted sum of reconstruction losses calculated based on inner products between precoders and eigenvectors,

- a weighted sum being calculated by calculating a set of values based on similarities between multiple precoders and multiple eigenvectors, wherein the weights are based on norms of precoders and/or eigenvalues associated to the multiple eigenvectors.

5. The method according to claim 4, wherein when the reconstruction loss is the weighted sum of reconstruction losses calculated based on inner products between precoders and eigenvectors, the weights are any one or more out of:

- the square of eigenvalues of a covariance matrix related to the one or more estimated channels, and

- the norms of the plurality of precoders and/or eigenvalues associated to the multiple eigenvectors.

6. The method according to any of claims 1-5, wherein, when more than one preferred precoders are determined, the more than one preferred precoders are approximately orthogonal.

7. The method according to claim 6, wherein determining the more than one orthogonal preferred precoders comprises any one or more out of:

- performing computational post-processing on the output of the training model to orthogonalize more than one preferred precoders, and

- adding a penalizing term to the loss function used for training the training model.

8. The method according to any of claims 1-7, wherein the loss function used for training the training model is a strictly increasing or non-decreasing function of a second loss function.

9. The method according to claim 8, wherein the loss function is the logarithm of the sum of 1 and the second loss function.

10. A computer program comprising instructions, which when executed by a processor, causes the processor to perform actions according to any of the claims 1-9.

11. A carrier comprising the computer program of claim 10, wherein the carrier is one of an electronic signal, an optical signal, an electromagnetic signal, a magnetic signal, an electric signal, a radio signal, a microwave signal, or a computer-readable storage medium.

12. A method performed by an operator node (130) for training a training model to provide any one or more out of: one or more preferred precoders maximizing a Signal-to-Noise Ratio, SNR, received by a second wireless node (120), one or more reconstructed estimated channels and one or more reconstructed estimated channel features, the method comprising: obtaining (701) one or more third compressed channel feature codewords, indicative of one or more third estimated channels and/or third estimated channel features, reconstructing (702) the one or more third estimated channels and/or one or more third estimated channel features by using the training model, calculating (703), based on the reconstructed one or more third channels and/or reconstructed one or more third estimated channel features and/or an output of the training model, a reconstruction loss using a loss function, and training (704) the training model to provide any one or more out of: the one or more preferred precoders maximizing the SNR received by the second wireless node (120), the one or more reconstructed estimated channels and the one or more reconstructed estimated channel features, based on the calculated reconstruction loss, to minimize the loss function.

13. The method according to claim 12, wherein the reconstruction loss is any one or more out of:

- a sum of a plurality of reconstruction losses calculated based on a plurality of subbands,

- a sum of reconstruction losses calculated based on a plurality of transmission layers,

- a weighted sum of reconstruction losses calculated based on inner products between precoders and eigenvectors,

- a weighted sum being calculated by calculating a set of values based on similarities between multiple precoders and multiple eigenvectors, wherein the weights are based on norms of precoders and/or eigenvalues associated to the multiple eigenvectors.

14. The method according to claim 13, wherein when the reconstruction loss is the weighted sum of reconstruction losses calculated based on inner products between precoders and eigenvectors, the weights are any one or more out of:

- the square of eigenvalues of a covariance matrix related to the one or more estimated channels, and

- the norms of the plurality of precoders and/or eigenvalues associated to the multiple eigenvectors.

15. The method according to any of claims 12-14, wherein, when more than one preferred precoders are determined, the more than one preferred precoders are approximately orthogonal.

16. The method according to claim 15, wherein determining the more than one orthogonal preferred precoders comprises any one or more out of:

- performing computational post-processing on the output of the training model to orthogonalize more than one preferred precoders, and

- adding a penalizing term to the loss function used for training the training model.

17. The method according to any of claims 12-16, wherein the loss function used for training the training model is a strictly increasing or non-decreasing function of a second loss function.

18. The method according to claim 17, wherein the loss function is the logarithm of the sum of 1 and the second loss function.

19. A computer program comprising instructions, which when executed by a processor, causes the processor to perform actions according to any of the claims 12-18.

20. A carrier comprising the computer program of claim 19, wherein the carrier is one of an electronic signal, an optical signal, an electromagnetic signal, a magnetic signal, an electric signal, a radio signal, a microwave signal, or a computer-readable storage medium.

21. A first wireless node (110) configured to determine one or more preferred precoders in a wireless communications network (100), wherein the one or more precoders are maximizing a Signal-to-Noise Ratio, SNR, received by a second wireless node (120), the first wireless node (110) further being configured to: obtain a training model, wherein the training model is adapted to have been trained, by minimizing a loss function indicative of a reconstruction loss of one or more reconstructed second channels and/or one or more reconstructed second channel features, to provide an output comprising any one or more out of:

- the one or more preferred precoders maximizing the SNR received by the second wireless node (120),

- one or more reconstructed estimated channels, and

- one or more reconstructed estimated channel features, obtain, from the second wireless node (120), a first compressed channel feature codeword indicative of one or more first channels and/or one or more first channel features estimated by the second wireless node (120), and determine, based on the obtained first compressed channel feature codeword and the obtained training model, the one or more preferred precoders maximizing the SNR received by the second wireless node (120).

22. The first wireless node (110) according to claim 21, wherein the first wireless node (110) is configured to determine the one or more preferred precoders by further being configured to: provide the first compressed channel feature codeword as input to the obtained training model, which training model is adapted to have been trained by minimizing the loss function, receive any one or more out of: the one or more preferred precoders maximizing the SNR received by the second wireless node (120), one or more reconstructed estimated first channels and one or more reconstructed estimated first channel features, as output from the training model, and determine the one or more preferred precoders based on the output from the training model.

23. The first wireless node (110) according to any of claims 21-22, wherein the first wireless node (110) is further configured to obtain the training model by further being configured to: obtain one or more second compressed channel feature codewords, indicative of one or more estimated second channels and/or one or more estimated second channel features, reconstruct the one or more estimated second channels and/or one or more estimated second channel features by using the training model, calculate, based on the reconstructed one or more estimated second channels and/or reconstructed one or more estimated second channel features and/or an output of the training model, a reconstruction loss using the loss function, and train the model to provide any one or more out of: the one or more preferred precoders maximizing the SNR received by the second wireless node (120), one or more reconstructed estimated channels and/or one or more reconstructed estimated channel features, by using machine learning based on the calculated reconstruction loss, to minimize the loss function.

24. The first wireless node (110) according to claim 23, wherein the reconstruction loss is adapted to be any one or more out of:

- a sum of a plurality of reconstruction losses calculated based on a plurality of subbands,

- a sum of reconstruction losses calculated based on a plurality of transmission layers,

- a weighted sum of reconstruction losses calculated based on inner products between precoders and eigenvectors,

- a weighted sum being calculated by calculating a set of values based on similarities between multiple precoders and multiple eigenvectors, wherein the weights are based on norms of precoders and/or eigenvalues associated to the multiple eigenvectors.

25. The first wireless node (110) according to claim 24, wherein when the reconstruction loss is the weighted sum of reconstruction losses calculated based on inner products between precoders and eigenvectors, the weights are adapted to be any one or more out of:

- the square of eigenvalues of a covariance matrix related to the one or more estimated channels, and

- the norms of the plurality of precoders and/or eigenvalues associated to the multiple eigenvectors.

26. The first wireless node (110) according to any of claims 21-25, wherein, when more than one preferred precoders are determined, the more than one preferred precoders are adapted to be approximately orthogonal.

27. The first wireless node (110) according to claim 26, wherein the first wireless node (110) is configured to determine the more than one orthogonal preferred precoders by further being configured to any one or more out of:

- perform computational post-processing on the output of the training model to orthogonalize more than one preferred precoders, and

- add a penalizing term to the loss function used for training the training model.

28. The first wireless node (110) according to any of claims 21-27, wherein the loss function used for training the training model is adapted to be a strictly increasing or non-decreasing function of a second loss function.

29. The first wireless node (110) according to claim 28, wherein the loss function is adapted to be the logarithm of the sum of 1 and the second loss function.

30. An operator node (130) configured to train a training model to provide any one or more out of: one or more preferred precoders maximizing a Signal-to-Noise Ratio, SNR, received by a second wireless node (120), one or more reconstructed estimated channels and one or more reconstructed estimated channel features, the operator node (130) further being configured to: obtain one or more third compressed channel feature codewords, indicative of one or more third estimated channels and/or one or more third estimated channel features, reconstruct the one or more third estimated channels and/or one or more third estimated channel features by using the training model, calculate, based on the reconstructed one or more third channels and/or reconstructed one or more third channel features and/or an output of the training model, a reconstruction loss using a loss function, and train the training model to provide any one or more out of: the one or more preferred precoders maximizing the SNR received by the second wireless node (120), the one or more reconstructed estimated channels and the one or more reconstructed estimated channel features, by using machine learning, based on the calculated reconstruction loss, to minimize the loss function.

31. The operator node (130) according to claim 30, wherein the reconstruction loss is adapted to be any one or more out of:

- a sum of a plurality of reconstruction losses calculated based on a plurality of subbands,

- a sum of reconstruction losses calculated based on a plurality of transmission layers,

- a weighted sum of reconstruction losses calculated based on inner products between precoders and eigenvectors, - a weighted sum being calculated by calculating a set of values based on similarities between multiple precoders and multiple eigenvectors, wherein the weights are based on norms of precoders and/or eigenvalues associated to the multiple eigenvectors.

32. The operator node (130) according to claim 31, wherein when the reconstruction loss is adapted to be the weighted sum of reconstruction losses calculated based on inner products between precoders and eigenvectors, the weights are adapted to be any one or more out of:

- the square of eigenvalues of a covariance matrix related to the one or more estimated channels, and

- the norms of the plurality of precoders and/or eigenvalues associated to the multiple eigenvectors.

33. The operator node (130) according to any of claims 30-32, wherein, when more than one preferred precoders are determined, the more than one preferred precoders are adapted to be approximately orthogonal.

34. The operator node (130) according to claim 33, wherein the operator node (130) is configured to determine the more than one orthogonal preferred precoders by further being configured to any one or more out of:

- perform computational post-processing on the output of the training model to orthogonalize more than one preferred precoders, and

- add a penalizing term to the loss function used for training the training model.

35. The operator node (130) according to any of claims 30-34, wherein the loss function used for training the training model is adapted to be a strictly increasing or non-decreasing function of a second loss function.

36. The operator node (130) according to claim 35, wherein the loss function is adapted to be the logarithm of the sum of 1 and the second loss function.

Description:
FIRST WIRELESS NODE, OPERATOR NODE AND METHODS IN A WIRELESS

COMMUNICATION NETWORK

TECHNICAL FIELD

Embodiments herein relate to a first wireless node, an operator node and methods therein. In some aspects, they relate to determining one or more precoders and/or training a training model for providing one or more out of one or more preferred precoders maximizing a Signal-to-Noise Ratio received by a second wireless node, one or more reconstructed estimated channels and one or more reconstructed estimated channel features.

BACKGROUND

In a typical wireless communication network, wireless devices, also known as wireless communication devices, mobile stations, stations (STA) and/or User Equipments (UEs), communicate via a Wide Area Network or a Local Area Network, such as a Wi-Fi network or a cellular network comprising a Radio Access Network (RAN) part and a Core Network (CN) part. The RAN covers a geographical area which is divided into service areas or cell areas, which may also be referred to as a beam or a beam group, with each service area or cell area being served by a radio network node, such as a radio access node, e.g. a Wi-Fi access point or a radio base station (RBS), which in some networks may also be denoted, for example, NodeB, eNodeB (eNB), or gNB as denoted in Fifth Generation (5G) telecommunications. A service area or cell area is a geographical area where radio coverage is provided by the radio network node. The radio network node communicates over an air interface operating on radio frequencies with the wireless devices within range of the radio network node.

3GPP is the standardization body specifying the standards for cellular system evolution, e.g. including 3G, 4G, 5G and future evolutions. Specifications for the Evolved Packet System (EPS), also called a Fourth Generation (4G) network, have been completed within the 3rd Generation Partnership Project (3GPP). As a continued network evolution, new releases of 3GPP specify a 5G network, also referred to as 5G New Radio (NR).

Frequency bands for 5G NR are separated into two different frequency ranges, Frequency Range 1 (FR1) and Frequency Range 2 (FR2). FR1 comprises sub-6 GHz frequency bands. Some of these bands are bands traditionally used by legacy standards, but they have been extended to cover potential new spectrum offerings from 410 MHz to 7125 MHz. FR2 comprises frequency bands from 24.25 GHz to 52.6 GHz. Bands in this millimeter wave range have shorter range but higher available bandwidth than bands in FR1.

Multi-antenna techniques may significantly increase the data rates and reliability of a wireless communication system. For a wireless connection between a single user, such as a UE, and a base station, the performance is in particular improved if both the transmitter and the receiver are equipped with multiple antennas, which results in a Multiple-Input Multiple-Output (MIMO) communication channel. This may be referred to as Single-User (SU)-MIMO. In the scenario where MIMO techniques are used for the wireless connection between multiple users and the base station, MIMO enables the users to communicate with the base station simultaneously using the same time-frequency resources by spatially separating the users, which further increases the cell capacity. This may be referred to as Multi-User (MU)-MIMO. Note that MU-MIMO may be beneficial even when each UE has only one antenna. Such systems and/or related techniques are commonly referred to as MIMO.

The 5th generation mobile wireless communication system (NR) uses OFDM with configurable bandwidths and subcarrier spacings to efficiently support a diverse set of use cases and deployment scenarios. Compared with the 4th generation system (LTE), NR improves deployment flexibility, user throughputs, latency, and reliability. The throughput performance gains are enabled, in part, by enhanced support for Multi-User MIMO (MU-MIMO) transmission strategies, where two or more UEs receive data on the same OFDM time-frequency resources, i.e., spatially separated transmissions.

The MU-MIMO transmission strategy is illustrated in Figure 1. In this illustration, a multi-antenna base station with $N_{TX}$ antenna ports transmits information to several UEs, with sequence $S^{(j)}$ intended for the j-th UE. Before modulation and transmission, precoding $W^{(j)}$ is applied to $S^{(j)}$, $j = 1, 2, \ldots, J$, to mitigate multiplexing interference; the transmissions are spatially separated. Note that the order of modulation and precoding, or demodulation and combining respectively, may differ depending on the implementation of the MU-MIMO transmission. Each UE demodulates its received signal and combines its receive antenna signals to obtain an estimate $\hat{S}^{(j)}$ of the corresponding transmitted sequence. This estimate $\hat{S}^{(j)}$ for the j-th UE can be expressed as

$$\hat{S}^{(j)} = G^{(j)} H^{(j)} W^{(j)} S^{(j)} + G^{(j)} H^{(j)} \sum_{i \neq j} W^{(i)} S^{(i)} + G^{(j)} N^{(j)},$$

where $H^{(j)}$ denotes the downlink channel observed by UE j, $G^{(j)}$ its receive combining, and $N^{(j)}$ noise. The second term represents the spatial multiplexing interference (due to MU-MIMO transmission) seen by UE j, and the third term represents other interference and noise sources. The goal is to construct the set of precoders to meet a given design target, resulting in a $W^{(j)}$ that correlates well with the channel $H^{(j)}$ observed by UE j, whereas it correlates poorly with the channels $H^{(i)}$, $i \neq j$, observed by the other UEs.

To construct precoders $W^{(i)}$, $i = 1, 2, \ldots, J$, that enable efficient MU-MIMO transmissions, the radio access network needs to obtain detailed information about all the users' downlink channels $H^{(i)}$, $i = 1, 2, \ldots, J$.

Channel state information (CSI) reporting in NR

In deployments where full channel reciprocity holds, detailed channel information can be obtained from uplink sounding reference signals (SRS) that are transmitted periodically, or on demand, by active UEs. The radio access network can estimate the uplink channel from SRS and, by reciprocity, obtain the downlink channel $H^{(j)}$.

Full channel reciprocity can be obtained in time division duplex (TDD) deployments for UEs with the same number of transmitters (TX chains) as receive branches (RX chains). However, the typical scenario is that UEs have fewer TX chains than RX chains, so the radio access network might only be able to estimate part of the uplink channel using SRS (in which case only certain columns of a precoding matrix can be estimated using SRS). This situation is known as partial channel knowledge.

In frequency division duplex (FDD) deployments, full channel reciprocity cannot be expected since the uplink and downlink channels use different carriers and, therefore, the uplink channel might not provide enough information about the downlink channel to enable MU-MIMO precoding. In such deployments, active UEs need to feed back channel state information (CSI) to the radio access network over uplink control or data channels. In LTE and NR, this feedback is achieved by a signalling protocol that can be outlined as follows:

• The radio access network configures a UE to report CSI in a certain way.

• The radio access network node transmits CSI reference signals (CSI-RS).

• The UE estimates the downlink channel (or important features thereof) from the transmitted CSI-RS.

• The UE reports CSI over an uplink control and/or data channel.

• The radio access network uses the UE’s feedback for downlink user scheduling and precoding.

In the above, important features of the channel may refer to a Gram matrix of the channel, one or more eigenvectors that correspond to the largest eigenvalues of an estimated channel covariance matrix, approximations of such eigenvectors, one or more DFT base vectors or orthogonal vectors from any other suitable and defined vector space that best correlate with an estimated channel matrix or an estimated channel covariance matrix, or the channel delay profile.
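As an illustration of extracting such channel features, the dominant eigenvectors of an estimated channel covariance matrix may, e.g., be computed as in the following minimal NumPy sketch; the function and variable names are hypothetical and not part of the disclosure:

    import numpy as np

    def dominant_eigenvectors(H, num_vectors=2):
        # H: estimated channel matrix of shape (n_rx, n_tx), complex-valued.
        # Tx-side covariance matrix derived from the estimated channel.
        R_tx = H.conj().T @ H
        # eigh handles Hermitian matrices; eigenvalues come in ascending order.
        eigvals, eigvecs = np.linalg.eigh(R_tx)
        # Keep the eigenvectors belonging to the largest eigenvalues.
        idx = np.argsort(eigvals)[::-1][:num_vectors]
        return eigvals[idx], eigvecs[:, idx]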

In NR, a UE can be configured to report CSI Type I and CSI Type II, where the CSI Type II reporting protocol has been specifically designed to enable MU-MIMO operations from uplink UE reports. The CSI Type II can be configured in a normal reporting mode or in a port selection reporting mode.

The CSI Type II normal reporting mode is based on the specification of sets of Discrete Fourier Transform (DFT) basis functions in a precoder codebook. The UE selects and reports the L DFT vectors from the codebook that best match its channel conditions (like the classical codebook precoding matrix indicator (PMI) from earlier 3GPP releases). The number of DFT vectors L is typically 2 or 4, and it is configurable by the NW. In addition, the UE reports how the L DFT vectors should be combined in terms of relative amplitude scaling and co-phasing.

Algorithms to select L, the L DFT vectors, and the co-phasing coefficients are outside the specification scope and left to UE and NW implementation. Put another way, the Rel-16 specification only defines signalling protocols to enable the above message exchanges.

In the following, we will use "DFT beams" interchangeably with "DFT vectors". This slight abuse of terminology is appropriate whenever the base station has a uniform planar array with antenna elements separated by half of the carrier wavelength.

The CSI Type II normal reporting mode is illustrated in Figure 2; see also technical specification [1]. The selection and reporting of the L DFT vectors $b_n$ and their relative amplitudes $a_n$ is done in a wideband manner; that is, the same beams are used for both polarizations over the entire transmission band. The selection and reporting of the DFT vector co-phasing coefficients are done in a subband manner; that is, DFT vector co-phasing parameters are determined for each of multiple subsets of contiguous subcarriers. The co-phasing parameters are quantized such that $e^{i\theta_n}$ is taken from either a QPSK or 8PSK signal constellation.

If the NW cannot accurately estimate the full downlink channel from uplink transmissions, then active UEs need to report channel information to the NW over the uplink control or data channels. In LTE and NR, this feedback is achieved by the following signaling protocol:

• The NW transmits Channel State Information reference signals (CSI-RS) over the downlink using N ports.

• The UE estimates the downlink channel (or important features thereof) for each of the N ports from the transmitted CSI-RS.

• The UE reports CSI (e.g., channel quality index (CQI), precoding matrix indicator (PMI), rank indicator (RI)) to the NW over an uplink control and/or data channel.

• The NW uses the UE's feedback for downlink user scheduling and MIMO precoding.


With k denoting a subband index, the precoder $W_v[k]$ reported by the UE to the NW can be expressed, over the two polarizations, as follows:

$$W_v[k] = \frac{1}{\gamma[k]} \begin{bmatrix} \sum_{n=1}^{L} a_n e^{i\theta_n[k]} b_n \\ \sum_{n=1}^{L} a_{n+L} e^{i\theta_{n+L}[k]} b_n \end{bmatrix},$$

where $\gamma[k]$ is a normalization factor.

The Type II CSI report can be used by the NW to co-schedule multiple UEs on the same OFDM time-frequency resources. For example, the NW can select UEs that have reported different sets of DFT vectors with weak correlations. The CSI Type II report enables the UE to report a precoder hypothesis that trades CSI resolution against uplink transmission overhead.
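For illustration, the reported quantities may be assembled into the per-subband precoder above as in the following Python/NumPy sketch; all names are illustrative assumptions, not part of the specification:

    import numpy as np

    def type2_precoder(B, a, theta, k):
        # B: (n_ports_per_polarization, L) matrix of selected DFT beams b_n.
        # a: (2L,) relative amplitudes; theta: (2L, n_subbands) co-phasing angles.
        # k: subband index.
        L = B.shape[1]
        c = a * np.exp(1j * theta[:, k])   # per-subband combining coefficients
        w = np.concatenate([B @ c[:L],     # first polarization
                            B @ c[L:]])    # second polarization
        return w / np.linalg.norm(w)       # power normalization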

NR 3GPP Release 15 supports Type II CSI feedback using port selection mode, in addition to the above normal reporting mode. In this case,

• The base station transmits a CSI-RS port in each one of the beam directions.

• The UE does not use a codebook to select a DFT vector (a beam); instead, the UE selects one or multiple antenna ports from the CSI-RS resource of multiple ports.

Type II CSI feedback using port selection gives the base station some flexibility to use non-standardized precoders that are transparent to the UE. For the port-selection codebook, the precoder reported by the UE can be described on the same form as above, with the DFT vectors replaced by selection vectors:

$$W_v[k] = \frac{1}{\gamma[k]} \begin{bmatrix} \sum_{n=1}^{L} a_n e^{i\theta_n[k]} e_{p_n} \\ \sum_{n=1}^{L} a_{n+L} e^{i\theta_{n+L}[k]} e_{p_n} \end{bmatrix}.$$

Here, the vector $e_{p_n}$ is a unit vector with only one non-zero feature, also referred to as element, which can be viewed as a selection vector that selects a port from the set of ports in the measured CSI-RS resource. The UE thus feeds back which ports it has selected, the amplitude factors, and the co-phasing factors.

Different CSI-reporting frameworks

The CSI Type II reporting as described above falls into the category of CSI-reporting framework that can be called precoding-vector feedback. In this framework the UE reports suggested precoding vectors to the NW in different ways and with different frequency granularity.

Another category of CSI reporting that can be considered, especially with the development of new powerful compression algorithms (e.g., based on AEs as described below), is full-channel feedback. In this framework the UE reports a compression or representation of the whole observed/estimated channel, and possibly also noise covariance estimates, in the feedback.

AE based CSI reporting

Neural network (NN) based autoencoders (AEs) have recently gained considerable interest in the wireless communications research community for their capability of compressing and decompressing (reconstructing) MIMO radio channels accurately, even at high compression ratios. The AE is used here in the context of CSI compression, where a UE provides CSI feedback to a radio access network node by sending a CSI report that includes a compressed and encoded version of the estimated downlink channel, or of important features thereof. A summary of recent academic work on this topic can be found in [3]. Furthermore, 3GPP decided to start a study item for Rel-18 that includes the use case of AI-based CSI reporting, in which AEs will play a central part of the study [2], [4].

An AE is a neural network, i.e., a type of machine learning algorithm, that has been partitioned into one encoder and one decoder. This partitioning is illustrated in Figure 3 by considering a simple NN example with fully connected layers (a.k.a. a dense NN). The encoder and decoder are separated by a bottleneck layer that holds a compressed representation, Y in Figure 3, of the input data X. The variable Y is sometimes called the latent representation of the input X. The size of the bottleneck (latent representation) Y is significantly smaller than the size of the input data X. The AE encoder thus compresses the input features X to Y.

The decoder part of the AE tries to invert the encoder’s compression and reconstruct X with minimal error, according to some predefined loss function.
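A minimal sketch of such an encoder-bottleneck-decoder structure, e.g., in PyTorch, might look as follows; the layer sizes are illustrative assumptions only and not the disclosed model:

    import torch.nn as nn

    class DenseAutoencoder(nn.Module):
        def __init__(self, input_dim, bottleneck_dim):
            super().__init__()
            # Encoder: compresses the input features X to the latent Y.
            self.encoder = nn.Sequential(
                nn.Linear(input_dim, 256), nn.ReLU(),
                nn.Linear(256, bottleneck_dim))
            # Decoder: tries to reconstruct X from Y with minimal error.
            self.decoder = nn.Sequential(
                nn.Linear(bottleneck_dim, 256), nn.ReLU(),
                nn.Linear(256, input_dim))

        def forward(self, x):
            y = self.encoder(x)     # latent (bottleneck) representation Y
            return self.decoder(y)  # reconstruction of the input X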

The terms latent representation, latent vector, and output are used interchangeably. Analogously, the terms latent space and output space are used interchangeably and refer to the space of all possible latent vectors for a given architecture. Similarly, the input space is the space of all possible inputs for a given architecture. The word space can be understood as, e.g., a linear vector space in the mathematical sense.

AEs can have different architectures. For example, AEs can be based on dense NNs (like Figure 3), multi-dimensional convolutional NNs, recurrent NNs, transformer NNs, or any combination thereof. However, all AE architectures possess an encoder-bottleneck-decoder structure. A characteristic of AEs is that they can be used to compress and decompress data in an unsupervised manner.

Figure 3. An illustration of a fully connected autoencoder (AE).

Figure 4 illustrates how an AE might be used for AI-enhanced CSI reporting in NR. In summary:

• The UE estimates the downlink channel (or important features thereof) from downlink reference signal(s), e.g., CSI-RS. For example, the UE estimates the downlink channel as a 3D complex-valued tensor, with dimensions defined by the radio access network node's CSI-RS antenna ports, the UE's Rx antenna ports, and frequency (the granularity of which is configurable, e.g., subcarrier or subband).

• The UE uses a trained AE encoder to compress the estimated channel [features] down to a binary codeword. The binary codeword is reported to the radio access network over an uplink control and/or data channel. In practice, this codeword will likely form one part of a channel state information (CSI) report that might also include rank, channel quality, and interference information.

• The radio access network node uses a trained AE decoder to reconstruct the estimated channel [features]. The decompressed output of the AE decoder is used by the radio access network in, for example, MIMO precoding, scheduling, and link adaptation.

Figure 4. Using an Autoencoder (AE) for CSI compression (inference phase).

The architecture of an AE (e.g., structure, number of layers, nodes per layer, activation functions, etc.) will need to be tailored for each particular use case. For example, properties of the data (e.g., CSI-RS channel estimates), the channel size, the uplink feedback rate, and hardware limitations of the encoder and decoder all need to be considered when designing the AE's architecture.

After the AE’s architecture is fixed, it needs to be trained on one or more datasets. To achieve good performance during live operation (the so-called inference phase), the training datasets need to be representative of the actual data the AE will encounter during live operation.

The training process involves numerically tuning the AE's trainable parameters (e.g., the weights and biases of the underlying NN) to minimize a loss function on the training datasets. The loss function could be, for example, the MSE loss calculated as the average of the squared error between the UE's downlink channel estimate $H$ and the NN's reconstruction $\hat{H}$, i.e., $\|H - \hat{H}\|^2$. The purpose of the loss function is to meaningfully quantify the reconstruction error for the particular use case at hand.

The training process is typically based on some variant of the mini-batch gradient descent algorithm, which, at its core, has three components: a feedforward step, a back propagation step, and a parameter optimization step.

Feedforward: A batch of training data, such as a mini-batch (e.g., several downlink channel estimates), is pushed through the AE, from the input to the output. The loss function is used to compute the reconstruction loss for all training samples in the batch. The reconstruction loss may refer to an average reconstruction loss over all training samples in the batch.

Back propagation (BP): The gradients (partial derivatives of the loss function, L, with respect to each trainable parameter in the AE) are computed. The back propagation algorithm sequentially works backwards from the AE output, layer by layer, back through the AE to the input. The back propagation algorithm is built around the chain rule for differentiation: when computing the gradients for layer n in the AE, it uses the gradients for layer n+1.

Parameter optimization: The gradients computed in the back propagation step are used to update the AE's trainable parameters using a gradient descent method with a learning rate hyperparameter that scales the gradients. The core idea is to make small adjustments to each parameter so that the average loss over the training batch decreases. It is commonplace to use special optimizers to update the AE's trainable parameters using gradient information. The following optimizers are widely used to reduce training time and improve overall performance: adaptive sub-gradient methods (AdaGrad) [5], RMSProp, and adaptive moment estimation (ADAM) [6].

The above process (feedforward pass, back propagation, parameter optimization) is repeated many times until an acceptable level of performance is achieved on the training dataset. An acceptable level of performance may refer to the AE achieving a pre-defined average reconstruction error over the training dataset (e.g., the normalized MSE of the reconstruction error over the training dataset is less than, say, 0.1). Alternatively, it may refer to the AE achieving a pre-defined user data throughput gain with respect to a baseline CSI reporting method (e.g., a MIMO precoding method is selected, and user throughputs are estimated separately for the baseline and the AE CSI reporting methods).
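The three steps may, for example, be realized as in the following illustrative PyTorch training loop, assuming the DenseAutoencoder sketch above; num_epochs, train_loader, and the dimensions are assumed placeholders, not disclosed values:

    import torch

    model = DenseAutoencoder(input_dim=2048, bottleneck_dim=64)
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # ADAM [6]
    loss_fn = torch.nn.MSELoss()

    for epoch in range(num_epochs):
        for h_batch in train_loader:        # mini-batch of channel estimates
            h_hat = model(h_batch)          # feedforward
            loss = loss_fn(h_hat, h_batch)  # reconstruction loss
            optimizer.zero_grad()
            loss.backward()                 # back propagation
            optimizer.step()                # parameter optimization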

Optimal precoder and receiver weights (via IRC receiver)

The AE decoder in the gNB should output a precoder that maximizes the received SNR. The SNR should be maximized with respect to the UE's receiver weights $w$ and the precoder $p$ that is output by the AE, namely, $\max_{w,p} \mathrm{SNR}$.

The SNR can be expressed as a function of the channel $H$ and the noise covariance matrix $R$ as follows:

$$\mathrm{SNR}(H, R) = \max_{w,p} \frac{|w^H H p|^2}{w^H R w}.$$

The right-hand side of the above equation can be rewritten as

$$\mathrm{SNR}(H, R) = \max_{w,p} \frac{w^H H p \, p^H H^H w}{w^H R w}.$$

The maximization of $\mathrm{SNR}(H, R)$ is with respect to both $w$ and $p$. However, in the latter formulation the maximizer with respect to $w$ can be formulated as a function of $p$. Namely, for a given $p$, the maximum is attained by an eigenvector corresponding to the largest eigenvalue of the generalized eigenvalue problem

$$H p \, p^H H^H w = \lambda R w,$$

and is then a function of $p$. The left-hand side is a rank-1 matrix, and by using that $p^H H^H w$ is a scalar and that the normalization is not unique for an eigenvector, we can conclude that the optimal UE receiver weights are

$$w = R^{-1} H p,$$

up to some normalization.

Remark: This is the same expression as for the normal IRC receiver.

Now the SNR can be maximized with respect to $p$, and thus implicitly with respect to $w$. Inserting the expression for $w$ into the definition of the SNR and maximizing over $p$ gives

$$\max_{w,p} \mathrm{SNR} = \max_{p: \|p\| = 1} p^H H^H R^{-1} H p,$$

where we have used that the noise covariance $R$ is Hermitian. The maximizer is given by an eigenvector corresponding to the largest eigenvalue of the eigenvalue problem

$$H^H R^{-1} H \, p = \rho \, p,$$

where $\rho$ is a complex constant, e.g., an eigenvalue.

Remark: For intuition, consider the case R = I. Then $p$ is the right singular vector corresponding to the largest singular value of $H$, in other words an eigenvector of the Tx-Tx covariance seen by the UE. Then also $w$ is the corresponding left singular vector. Moreover, $\rho$ is then the square of the singular value.

Remark: The eigenvalue problem is equivalent to a singular value decomposition of $L_R^{-1} H$, where $L_R$ is a Cholesky factor of $R$, i.e., $R = L_R L_R^H$, with $L_R$ being lower triangular and having real and positive entries on the main diagonal.
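For illustration, the derivation above may be checked numerically; a minimal NumPy sketch (function and variable names are hypothetical, not part of the disclosure):

    import numpy as np

    def optimal_precoder_and_weights(H, R):
        # H: (n_rx, n_tx) channel; R: (n_rx, n_rx) Hermitian noise covariance.
        R_inv = np.linalg.inv(R)
        A = H.conj().T @ R_inv @ H          # Hermitian, positive semidefinite
        eigvals, eigvecs = np.linalg.eigh(A)
        p = eigvecs[:, -1]                  # eigenvector of the largest eigenvalue
        w = R_inv @ H @ p                   # IRC-type receiver weights
        return p, w / np.linalg.norm(w)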

Angles in complex vector spaces and principal angles between subspaces

Cosine similarity is based on the angles between vectors. This is well defined in the real-valued case, which is the most common case in the Artificial Intelligence/Machine Learning (AI/ML) literature. For complex-valued vector spaces, there are several different angles that can be measured. A summary of those is available in [14]. We consider the generalized cosine similarity

$$\mathrm{GCS}_A(p, v) = \frac{|\langle p, v \rangle_A|}{\sqrt{\langle p, p \rangle_A \, \langle v, v \rangle_A}},$$

where $\langle p, v \rangle_A := p^H A v$ is known as an inner product. The (classical) cosine similarity is defined by using $A = I$, i.e., the identity matrix. In other words, $\mathrm{CS}(p, v) = \mathrm{GCS}_I(p, v)$.
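An illustrative NumPy implementation of this generalized cosine similarity (the function name is hypothetical) might look as follows:

    import numpy as np

    def generalized_cosine_similarity(p, v, A=None):
        # <x, y>_A = x^H A y; A = None (identity) gives classical cosine similarity.
        if A is None:
            A = np.eye(len(p))
        inner = lambda x, y: x.conj().T @ A @ y
        norm = np.sqrt(np.real(inner(p, p)) * np.real(inner(v, v)))
        return np.abs(inner(p, v)) / norm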

Based on this concept of angles in vector spaces, one can then define principal angles between subspaces. This is treated in [20] (specifically, see Definition 4). These angles can be computed using a Singular Value Decomposition (SVD) of an appropriately defined matrix; see Theorem 13 in [20]. The discussion following that theorem also outlines that this can be applied to "generalized angles" as defined by the generalized cosine similarity above.
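Following the SVD approach referenced above, the principal angles may, e.g., be computed from the singular values of $P^H V$, where the columns of P and V hold orthonormal bases of the two subspaces; a minimal NumPy sketch (names illustrative):

    import numpy as np

    def principal_angles(P, V):
        # Columns of P and V are orthonormal bases of the two subspaces.
        # The singular values of P^H V are the cosines of the principal angles.
        s = np.linalg.svd(P.conj().T @ V, compute_uv=False)
        s = np.clip(s, 0.0, 1.0)  # guard against numerical rounding
        return np.arccos(s)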

In many papers the authors use the mean-square error (MSE), the normalized mean-square error (NMSE), the cosine similarity, or the generalized cosine similarity as loss function and measure of performance. See, e.g., [15], [16], [17], [18], and [19].

SUMMARY

An object of embodiments herein is to improve the performance of a wireless communications network by using loss functions when training AI/ML models for CSI feedback.

According to an aspect of embodiments herein, the object is achieved by a method performed by a first wireless node for determining one or more preferred precoders in a wireless communications network. The one or more precoders are maximizing a Signal- to-Noise Ratio, SNR, received by a second wireless node.

The first wireless node obtains a training model. The training model has been trained, by minimizing a loss function indicative of a reconstruction loss of one or more reconstructed second channels and/or one or more reconstructed second channel features, to provide an output comprising any one or more out of:

- the one or more preferred precoders maximizing the SNR received by the second wireless node,

- one or more reconstructed estimated channels, and

- one or more reconstructed estimated channel features.

The first wireless node obtains a first compressed channel feature codeword from the second wireless node. The first compressed channel feature codeword is indicative of one or more first channels and/or one or more first channel features estimated by the second wireless node.

The first wireless node determines, based on the obtained first compressed channel feature codeword and the obtained training model, the one or more preferred precoders maximizing the SNR received by the second wireless node.

According to another aspect of embodiments herein, the object is achieved by a method performed by an operator node for training a training model to provide any one or more out of: One or more preferred precoders maximizing a Signal-to-Noise Ratio, SNR, received by a second wireless node, one or more reconstructed estimated channels, and one or more reconstructed estimated channel features.

The operator node obtains one or more third compressed channel feature codewords. The third channel codewords are indicative of one or more third estimated channels and/or third estimated channel features. The operator node reconstructs the one or more third estimated channels and/or one or more third estimated channel features by using the training model.

The operator node calculates, based on the reconstructed one or more third channels and/or reconstructed one or more third estimated channel features and/or an output of the training model, a reconstruction loss using the loss function.

The operator node trains the training model to provide any one or more out of: The one or more preferred precoders maximizing the SNR received by the second wireless node, the one or more reconstructed estimated channels, and the one or more reconstructed estimated channel features, based on the calculated reconstruction loss, to minimize the loss function.

According to another aspect of embodiments herein, the object is achieved by a first wireless node configured to determine one or more preferred precoders in a wireless communications network. The one or more precoders are maximizing a Signal-to-Noise Ratio, SNR, received by a second wireless node. The first wireless node is further configured to:

Obtain a training model, wherein the training model is adapted to have been trained, by minimizing a loss function indicative of a reconstruction loss of one or more reconstructed second channels and/or one or more reconstructed second channel features, to provide an output comprising any one or more out of:

- The one or more preferred precoders maximizing the SNR received by the second wireless node,

- one or more reconstructed estimated channels, and

- one or more reconstructed estimated channel features, obtain, from the second wireless node, a first compressed channel feature codeword indicative of one or more first channels and/or one or more first channel features estimated by the second wireless node, and determine, based on the obtained first compressed channel feature codeword and the obtained training model, the one or more preferred precoders maximizing the SNR received by the second wireless node.

According to another aspect of embodiments herein, the object is achieved by an operator node configured to train a training model to provide any one or more out of: One or more preferred precoders maximizing a Signal-to-Noise Ratio, SNR, received by a second wireless node, one or more reconstructed estimated channels and one or more reconstructed estimated channel features. The operator node is further configured to:

Obtain one or more third compressed channel feature codeword, indicative of one or more third estimated channels and/or one or more third estimated channel features, reconstruct the one or more third estimated channels and/or one or more third estimated channel features by using the training model, calculate, based on the reconstructed one or more third channels and/or reconstructed one or more third channel features and/or an output of the training model, a reconstruction loss using the loss function, and train the training model to provide any one or more out of: the one or more preferred precoders maximizing the SNR received by the second wireless node, the one or more reconstructed estimated channels and the one or more reconstructed estimated channel features, by using machine learning, based on the calculated reconstruction loss, to minimize the loss function.

Embodiments herein target to determine one or more precoders, e.g., preferred precoders, that maximize an SNR received by a second wireless node. Upon obtaining a training model and obtaining a first compressed channel codeword from a second wireless node, the first wireless node determines the one or more preferred precoders that maximize the SNR received by the second wireless node. This is based on the first channel codeword and the training model.

Embodiments bring the advantage of an efficient mechanism improving the performance of the wireless communications network. This is achieved by using a training model and determining one or more preferred precoders that maximize the SNR received by a second wireless node based at least on the training model. This, e.g., leads to an increased downlink throughput and results in an improved performance of the wireless communications network.

BRIEF DESCRIPTION OF THE DRAWINGS

Examples of embodiments herein are described in more detail with reference to the attached drawings, in which:

Figure 1 illustrates an example of a MIMO transmission strategy according to prior art.

Figure 2 illustrates an example of CSI reporting according to prior art.

Figure 3 illustrates an example of an NN according to prior art.

Figure 4 illustrates an example of an AE for CSI reporting.

Figure 5 is a schematic block diagram illustrating embodiments of a wireless communications network.

Figure 6 is a flowchart depicting embodiments of a method in a first wireless node.

Figure 7 is a flowchart depicting embodiments of a method in an operator node.

Figures 8a and 8b are schematic block diagrams illustrating embodiments of a first wireless node.

Figures 9a and 9b are schematic block diagrams illustrating embodiments of an operator node.

Figure 10 schematically illustrates a telecommunication network connected via an intermediate network to a host computer.

Figure 11 is a generalized block diagram of a host computer communicating via a base station with a user equipment over a partially wireless connection.

Figures 12 to 15 are flowcharts illustrating methods implemented in a communication system including a host computer, a base station and a user equipment.

DETAILED DESCRIPTION

As a part of developing embodiments herein, the inventors identified a problem which will first be discussed.

It is not clear what is a good loss function to use when training AI/ML models for CSI feedback. Understanding this requires domain-specific knowledge.

There are inherent problems with using cosine similarity, and generalized cosine similarity, between precoding vectors extracted from the feedback and optimal precoding vectors. These problems occur regardless of whether the feedback is in terms of suggested precoding vectors or full-channel feedback from which approximations of the optimal precoders are extracted. Some of these problems are:

• Cosine similarity and generalized cosine similarity are not a direct proxy for DL throughput.

o Consider, for example, the case of a single-layer transmission where the best and second-best precoding vectors are orthogonal. Suppose that the received power of the best precoding vector at a UE is slightly larger than that of the second best. In this case, reconstructing/using the second-best precoding vector, or any vector that is close in the sense of cosine similarity to the second-best precoding vector, will give good performance in terms of DL throughput. However, when measuring the quality of the reconstruction with a cosine similarity to the optimal precoding vector, it will be evaluated as a poor reconstruction/choice.

• It is not clear how cosine similarity may be defined for higher ranks.

o One example is to measure each transmission rank individually. However, there is an inherent order of importance among the different layers. If the difference is large between the strongest and weakest suggested directions in a report, then capturing the components of the precoding vector corresponding to the strongest transmission direction is more important than the weaker ones, in order to achieve good DL throughput.

Using MSE or NMSE to evaluate the feedback against the full channel, regardless of whether full-channel feedback or precoding-vector feedback is used to construct approximations of the full channel, has the drawback that it might push the AI/ML model to try to reconstruct parts of the channel that are not relevant for the precoding. For example, instead of using bits to represent directions resulting in bad SNR and/or SINR at the UE, these bits might be better used to represent the strong directions from which precoding vectors will be chosen.

An object of embodiments herein is to improve the performance of a wireless communications network using loss functions when training AI/ML models for CSI feedback.

Some embodiments herein may provide loss functions that may be used for training a training model, such as an AI/ML model, for CSI feedback. The loss functions, which may also be referred to as custom loss functions, may be designed, such as calculated or generated, to serve as improved proxies for downlink throughput. The loss functions may be related to received SNR, received SINR and/or instantaneous per-layer, e.g., MIMO-layer, mutual information in the form of log(1+SNR) and/or log(1+SINR).

Embodiments herein may further provide methods for training a model, such as a training model which may also be referred to as an Al and/or ML model, e.g., using one or more of the provided loss functions.

Embodiments herein may further provide methods for determining one or more precoders, e.g., by using the trained training model.

Embodiments herein provide advantages such as e.g., loss functions that are better aligned with existing domain knowledge and the end-goal for what the AI/ML model output will be used for, e.g., maximizing downlink throughput.

Training AEs with one or more of the provided loss functions to compress and reconstruct the UE’s CSI-RS based downlink channel estimate may provide an improved performance e.g., in terms of downlink throughput.

The loss functions may also take higher-layer transmissions into account in a balanced way, something that is not possible with existing solutions.

Figure 5 is a schematic overview depicting a wireless communications network 100 wherein embodiments herein may be implemented. The wireless communications network 100 comprises one or more RANs and one or more CNs. The wireless communications network 100 may use 5G NR but may further use a number of other different technologies, such as Wi-Fi, Long Term Evolution (LTE), LTE-Advanced, Wideband Code Division Multiple Access (WCDMA), Global System for Mobile communications/enhanced Data rate for GSM Evolution (GSM/EDGE), or Ultra Mobile Broadband (UMB), just to mention a few possible implementations.

Network nodes, such as a first wireless node 110, 120, operate in the wireless communications network 100. The first wireless node 110, 120 may respectively e.g. provide a number of cells, and may use these cells for communicating with UEs, e.g. a second wireless node 110, 120. The first wireless node 110, 120 may respectively be a transmission and reception point, e.g. a radio access network node such as a base station, e.g. a radio base station such as a NodeB, an evolved Node B (eNB, eNodeB, eNode B), an NR Node B (gNB), a base transceiver station, a radio remote unit, an Access Point Base Station, a base station router, a transmission arrangement of a radio base station, a stand-alone access point, a Wireless Local Area Network (WLAN) access point, an Access Point Station (AP STA), an access controller, a UE acting as an access point or a peer in a Device to Device (D2D) communication, or any other suitable networking unit. The first wireless node 110, 120 may respectively e.g. further be able to communicate with each other via one or more CN nodes in the wireless communications network 100.

User Equipments, such as a second wireless node 110, 120, operate in the wireless communications network 100. The second wireless node 110, 120 may e.g. be an NR device, a mobile station, a wireless terminal, an NB-IoT device, an eMTC device, an NR RedCap device, a CAT-M device, a Wi-Fi device, an LTE device or a non-access point (non-AP) STA, a STA, that communicates via a base station such as e.g. the network node 110. It should be understood by those skilled in the art that UE is a non-limiting term which means any UE, terminal, wireless communication terminal, user equipment, Device to Device (D2D) terminal, or node, e.g. smart phone, laptop, mobile phone, sensor, relay, or mobile tablet, or even a small base station communicating within a cell.

The wireless communications network 100 further comprises an operator node 130. The operator node 130 may e.g., provide a training model to the first wireless node 110, 120 and/or train the training model.

Methods herein may be performed by the first wireless node 110 and the operator node 130. As an alternative, a Distributed Node (DN) and functionality, e.g. comprised in a cloud 140 as shown in Figure 5, may be used for performing or partly performing the methods of embodiments herein.

Embodiments herein may provide a set of loss functions that align with domain knowledge, e.g., in terms of received SNR and/or SINR at the UE and resulting instantaneous per-layer mutual information.

Figure 6 shows an example method performed by the first wireless node 110. The method is e.g., for determining one or more preferred precoders in the wireless communications network 100. The one or more preferred precoders may e.g., maximize an SNR received by the second wireless node 120. The first wireless node 110 may e.g. be a network node 110 or a UE 110. The second wireless node 120 may e.g. be a network node 120 or a UE 120. When the first wireless node 110 is a network node 110, the second wireless node 120 is a UE 120. When the first wireless node 110 is a UE 110, the second wireless node 120 is a network node 120.

The method may comprise any one or more out of the following actions. The actions may be executed in any suitable order.

Action 601

The first wireless node 110 obtains a training model. The training model has been trained to provide, e.g., an output comprising, any one or more out of: the one or more preferred precoders, e.g., maximizing the SNR received by the second wireless node 120, one or more reconstructed estimated channels, and one or more reconstructed estimated channel features. The training model has been trained by minimizing a loss function indicative of a reconstruction loss of one or more reconstructed estimated second channels and/or reconstructed estimated second channel features. The loss function may e.g., be defined according to any of the Some First to Twelfth Embodiments mentioned below.

Obtaining the training model may e.g., comprise obtaining, such as receiving, the training model from another node in the wireless communications network 100. The other node may e.g., be the operator node 130. The training model may e.g., be an AE as described above.

The training model may be pre-trained by e.g., the operator node 130 or the cloud 140; alternatively, the first wireless node 110 trains the training model.

In some embodiments, obtaining the training model comprises that the first wireless node 110 obtains one or more second compressed channel feature codewords. The one or more second compressed channel feature codewords are indicative of one or more estimated second channels and/or one or more estimated second channel features. The first wireless node 110 reconstructs the one or more estimated second channels and/or one or more estimated second channel features, e.g., by using the training model, and calculates a reconstruction loss using the loss function. The calculation may e.g., be based on the reconstructed one or more estimated second channels and/or reconstructed one or more estimated second channel features and/or an output of the training model. The first wireless node 110 trains the training model to provide any one or more out of: the one or more preferred precoders, e.g., maximizing the SNR received by the second wireless node 120, one or more reconstructed estimated channels and one or more reconstructed estimated channel features. The training is performed by using machine learning, based on the calculated reconstruction loss, to minimize the loss function. Minimizing the loss function may comprise that an acceptable performance is achieved. This may mean that the above explained example is performed iteratively, such as several times, until e.g., the calculated reconstruction loss is below a threshold.

Alternatively, the above explained example is performed iteratively until a pre-defined data throughput gain is achieved, e.g., with respect to a baseline method. The baseline method may be any other method used for similar purposes. As mentioned above, the loss function may e.g., be defined according to any of the Some First to Twelfth Embodiments mentioned below. The one or more second compressed channel feature codewords may e.g., be a training dataset, such as estimates of the one or more second channels and/or channel features. The estimates of the one or more second channels and/or channel features may be estimates obtained, or measured, in a live network, or they may be constructed, or generated, to simulate the conditions in a live network.
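For illustration only, the iterative training described above may be sketched as follows. This is a minimal sketch assuming a gradient-based framework (PyTorch here); the names encoder, decoder, loss_fn and the stopping threshold are hypothetical placeholders and not mandated by the embodiments.

import torch

def train_csi_autoencoder(encoder, decoder, loss_fn, batches, lr=1e-3, threshold=1e-3):
    # Minimize the custom reconstruction loss iteratively until e.g., the
    # calculated reconstruction loss is below a threshold (one possible
    # "acceptable performance" criterion).
    params = list(encoder.parameters()) + list(decoder.parameters())
    opt = torch.optim.Adam(params, lr=lr)
    for channels in batches:                # estimated second channels/features
        codeword = encoder(channels)        # second compressed channel feature codeword
        reconstruction = decoder(codeword)  # reconstructed channels/features
        loss = loss_fn(channels, reconstruction)
        opt.zero_grad()
        loss.backward()
        opt.step()
        if loss.item() < threshold:
            break
    return encoder, decoder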

In some embodiments, the reconstruction loss may be any one or more out of: A sum of a plurality of reconstruction losses calculated based on a plurality of sub-bands, a sum of reconstruction losses calculated based on a plurality of transmission layers, a weighted sum of reconstruction losses calculated based on inner products between precoders and eigenvectors, a weighted sum being calculated by calculating a set of values based on similarities between multiple precoders and multiple eigenvectors, wherein the weights are based on norms of precoders and/or eigenvalues associated to the multiple eigenvectors.

When the reconstruction loss is the weighted sum of reconstruction losses calculated based on inner products between precoders and eigenvectors, the weights may be any one or more out of: The square of eigenvalues of a covariance matrix related to the one or more estimated channels, and the norms of the plurality of precoders and/or eigenvalues associated to the multiple eigenvectors.

In some embodiments, the loss function used for training the training model is e.g., a strictly increasing function of a second loss function. Alternatively, the function is a non-decreasing function. The loss function may e.g., be the logarithm of the sum of 1 and the second loss function. According to this example, the second loss function may e.g., be defined according to any of the Some First to Twelfth Embodiments mentioned below. In some embodiments, the loss function comprises a penalizing term, e.g., as defined in the Some First Embodiments below, in order to provide orthogonal preferred precoders. The penalizing term may be added by the first wireless node 110 when training the training model.

Action 602

The first wireless node 110 obtains a first compressed channel feature codeword from the second wireless node 120. The first compressed channel feature codeword is indicative of one or more first channels and/or channel features estimated by the second wireless node 120. In other words, the first compressed channel feature codeword is indicative of the full channel, such as all estimated aspects of the channel, or only certain features of the channel, such as e.g., one or more eigenvectors related to the Tx-Tx covariance matrix, wherein the eigenvectors can be the true eigenvectors or approximations thereof and computed in a subband manner, in a wideband manner, or a combination thereof; one or more suggested precoding vectors, wherein the precoding vectors are computed in a subband manner, in a wideband manner, or a combination thereof; or updates (deltas) of the aforementioned full channel, eigenvectors, and/or precoding vectors in relation to previously transmitted information. The first compressed channel feature codeword may be related to a single-layer transmission or a multi-layer transmission.

This may e.g., depend on the configuration of the second wireless node 120, which may, e.g., be done by the first wireless node 110.

Action 603

The first wireless node 110 determines the one or more preferred precoders, e.g., maximizing the SNR received by the second wireless node 120. The one or more preferred precoders are determined based on the obtained first compressed channel feature codeword and the obtained training model. In other words, the first wireless node 110 may determine the one or more preferred precoders e.g., based on feeding the training model with an input comprising the obtained first compressed channel feature codeword and receiving an output from the training model comprising any one or more out of the one or more preferred precoders and the one or more estimated first channels and/or channel features. The first wireless node 110 may determine the one or more preferred precoders by selecting one or more precoders from the received output. Alternatively, the first wireless node 110 may calculate, such as e.g., generate or estimate, the one or more preferred precoders from the output, e.g., when the output comprises the one or more estimated first channels and/or channel features.

In some embodiments, to determine the one or more precoders, the first wireless node 110 provides the first compressed channel feature codeword as input to the obtained training model. The training model has been trained by minimizing the loss function. The first wireless node 110 may further receive any one or more out of: the one or more precoders, e.g., maximizing the SNR received by the second wireless node 120, one or more reconstructed estimated first channels and one or more reconstructed estimated first channel features, as output from the training model. The first wireless node 110 determines the one or more preferred precoders based on the output from the training model.

In some embodiments, when more than one preferred precoders are determined, the more than one preferred precoders are orthogonal. Alternatively, the more than one preferred precoders may be e.g., approximatively orthogonal.

Orthogonal, when used herein, may mean having an inner product equal to zero.

In some embodiments, the first wireless node 110 performs computational post-processing on the output of the training model to orthogonalize the more than one preferred precoders.
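As a small illustrative sketch of Actions 602-603: the obtained codeword is fed to the trained model, the suggested precoders are read from the output, and an optional QR-based post-processing orthogonalizes them. The decoder name and the output layout are assumptions made only for this example.

import numpy as np

def determine_precoders(decoder, codeword, num_layers):
    # Feed the first compressed channel feature codeword to the trained model.
    output = decoder(codeword)              # precoders, channels or channel features
    precoders = output[:, :num_layers]      # e.g., select the suggested per-layer precoders
    if num_layers > 1:
        # Optional computational post-processing to orthogonalize the precoders.
        precoders, _ = np.linalg.qr(precoders)
    return precoders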

Figure 7 shows an example method performed by the operator node 130. The method is e.g., for training a training model to provide any one or more out of: one or more preferred precoders, e.g., maximizing a Signal-to-Noise Ratio, SNR, received by the second wireless node 120, one or more reconstructed estimated channels and one or more reconstructed estimated channel features in the wireless communications network 100. The one or more preferred precoders may e.g., maximize an SNR received by the second wireless node 120.

The second wireless node 120 may e.g., be a network node 120 or a UE 120.

The method may comprise any one or more out of the following actions. The actions may be executed in any suitable order.

Action 701

The operator node 130 obtains one or more third compressed channel feature codewords, indicative of one or more third estimated channels and/or one or more third estimated channel features. In other words, the third compressed channel feature codeword is indicative of the full channel, such as all estimated aspects of the channel, or only certain features of the channel, such as e.g., one or more eigenvectors related to the Tx-Tx covariance matrix, wherein the eigenvectors can be the true eigenvectors or approximations thereof and computed in a subband manner, in a wideband manner, or a combination thereof; one or more suggested precoding vectors, wherein the precoding vectors are computed in a subband manner, in a wideband manner, or a combination thereof; or updates (deltas) of the aforementioned full channel, eigenvectors, and/or precoding vectors in relation to previously transmitted information. The third compressed channel feature codeword may be related to a single-layer transmission or a multi-layer transmission. The one or more third compressed channel feature codewords may e.g., be a training dataset, such as estimates of the one or more third channels and/or channel features. The estimates of the one or more third channels and/or channel features may be estimates obtained, or measured, in a live network, or they may be constructed, or generated, to simulate the conditions in a live network.

Action 702

The operator node 130 reconstructs the one or more third estimated channels and/or one or more third estimated channel features, e.g., by using the training model.

By reconstructing the one or more third estimated channels and/or channel features, an estimation of the one or more third estimated channels and/or channel features is obtained. The reconstructing may e.g., comprise decompressing the one or more third compressed channel feature codewords.

Action 703

The operator node 130 calculates, e.g., based on the reconstructed one or more third channels and/or one or more third estimated channel features and/or an output of the training model, a reconstruction loss using the loss function. As mentioned above, the loss function may e.g., be defined according to any or the Some First to Twelfth embodiments mentioned below.

Action 704

The operator node 130 trains the training model to provide any one or more out of: the one or more preferred precoders, e.g., maximizing the SNR received by the second wireless node 120, one or more reconstructed estimated channels and/or reconstructed estimated channel features. The training is performed by using machine learning, based on the calculated reconstruction loss, to minimize the loss function. Minimizing the loss function may comprise that an acceptable performance is achieved. This may mean that the above explained example is performed iteratively, such as several times, until e.g., the calculated reconstruction loss is below a threshold. Alternatively, the above explained example is performed iteratively until a pre-defined data throughput gain is achieved, e.g., with respect to a baseline method. The baseline method may be any other method used for similar purposes. As mentioned above, the loss function may e.g., be defined according to any of the Some First to Twelfth Embodiments mentioned below.

In some embodiments, the reconstruction loss may be any one or more out of: A sum of a plurality of reconstruction losses calculated based on a plurality of sub-bands, a sum of reconstruction losses calculated based on a plurality of transmission layers, a weighted sum of reconstruction losses calculated based on inner products between precoders and eigenvectors, a weighted sum being calculated by calculating a set of values based on similarities between multiple precoders and multiple eigenvectors, wherein the weights are based on norms of precoders and/or eigenvalues associated to the multiple eigenvectors.

When the reconstruction loss is the weighted sum of reconstruction losses calculated based on inner products between precoders and eigenvectors, the weights may be any one or more out of: The square of eigenvalues of a covariance matrix related to the one or more estimated channels, and the norms of the plurality of precoders and/or eigenvalues associated to the multiple eigenvectors.

In some embodiments, the loss function used for training the training model is e.g., a strictly increasing function of a second loss function. Alternatively, the function is a non-decreasing function. The loss function may e.g., be the logarithm of the sum of 1 and the second loss function. According to this example, the second loss function may e.g., be defined according to any of the Some First to Twelfth Embodiments mentioned below.

In some embodiments, the loss function comprises a penalizing term, e.g., as defined in the Some First Embodiments below, in order to provide orthogonal preferred precoders. The penalizing term may be added by the operator node 130 when training the training model.

In this way, by using the above methods, it may, e.g., be possible to train and use training models for CSI feedback which have better performance than comparative models not using the above methods, e.g., resulting in increased user downlink throughput. The methods will now be further explained and exemplified in the below embodiments. The embodiments below may be combined in any suitable manner with the embodiments described above. In the below embodiments, the training model may also be referred to as the AE.

In the below embodiments, the loss functions are described as per-sample loss functions. When training is done on a batch of training data, the loss function observed over the batch is a function of the per-sample losses, e.g., a sum or an average of all the per-sample losses computed over the samples of data.

In the below embodiments, an output precoding vector p and an output L-tuple of precoding vectors (p_1, p_2, ..., p_L) may be computed in a wideband or subband manner. In the latter case, the presented loss functions will result in a different loss for each subband. The per-sample loss function can then be defined as a function of the per-subband losses, e.g., a sum or an average of all the per-subband losses computed over all subbands.
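As a brief illustration of this aggregation (an assumption-level sketch, not a prescribed implementation), the per-batch training loss may be formed by averaging the per-subband losses over all subbands and then over all samples:

import numpy as np

def batch_loss(losses):
    # losses: array of shape (num_samples, num_subbands) holding the
    # per-sample, per-subband loss values.
    losses = np.asarray(losses, dtype=float)
    per_sample = losses.mean(axis=1)   # aggregate over subbands
    return per_sample.mean()           # aggregate over the batch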

Some First Embodiments

It is desired that the outputted precoding vectors for different layers are orthogonal to each other, on a per subband basis. The discussion on the below Some Second to Twelfth Embodiments assumes that this is, e.g., approximatively, the case. The orthogonality may be achieved in different ways.

Full-channel feedback

For full-channel feedback this comes naturally if computing the precoding vectors based on an SVD.

Precoding-vector feedback

For precoding-vector feedback there are different ways to, e.g., approximately, achieve this, e.g., any one out of:

- By computational post-processing that orthogonalizes the output across the layers on a per-subband basis. This may be done by stacking the decoded suggested precoding vectors for different layers into a matrix A, on a per-subband basis, and then by e.g., any one out of:

- Computing a Singular Value Decomposition (SVD) of A and taking the left singular vectors corresponding to non-zero singular values.

- Orthogonalizing by using a version of the Gram-Schmidt method, e.g., the single Gram-Schmidt, the double Gram-Schmidt, or the modified Gram-Schmidt, keeping only vectors whose residual has a norm larger than some threshold.

- Computing a QR-factorization of A, i.e., A = QR, and taking the columns of Q corresponding to diagonal elements in R whose absolute value is larger than some threshold. This holds regardless of whether the computation is done using the Gram-Schmidt method, Householder transformations, or Givens rotations.

- By adding the following penalization term to the loss function:

\sum_{\ell=1}^{L} \sum_{k \neq \ell} \lambda_{\ell,k} \frac{|p_\ell^H p_k|^2}{\|p_\ell\|^2 \|p_k\|^2}

where \lambda_{\ell,k} is a parameter that determines the penalization (or cost), i.e., the increase in the value of the loss function, for the two vectors p_\ell and p_k not being orthogonal. The penalization ranges between approximately zero, when the two vectors are approximately orthogonal, and up to \lambda_{\ell,k}. The parameter \lambda_{\ell,k} can depend on both \ell and k, on one of them, or it can be constant for all \ell and k. The parameter is real-valued, non-negative, and should, e.g., be at least the same size as the largest singular value. This makes it dependent on the training batch and the subband processed for training, both of which can be avoided by considering the maximum over all the training data, possibly with some extra margin.
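The alternatives above may be illustrated with the following numpy sketch, where the tolerance and the lambda parameter are hypothetical choices: a QR-based orthogonalizing post-processing that keeps only sufficiently independent columns, and the orthogonality penalization term as reconstructed above.

import numpy as np

def orthogonalize_qr(A, tol=1e-6):
    # A stacks the decoded per-layer precoding vectors as columns, per subband.
    # Keep columns of Q whose corresponding diagonal element of R is large enough.
    Q, R = np.linalg.qr(A)
    keep = np.abs(np.diag(R)) > tol
    return Q[:, keep]

def orthogonality_penalty(P, lam=1.0):
    # Sum over pairs l != k of lam * |p_l^H p_k|^2 / (||p_l||^2 ||p_k||^2):
    # approximately zero for orthogonal vectors, up to lam for parallel ones.
    total = 0.0
    L = P.shape[1]
    for l in range(L):
        for k in range(L):
            if l != k:
                num = np.abs(np.vdot(P[:, l], P[:, k])) ** 2
                den = (np.linalg.norm(P[:, l]) * np.linalg.norm(P[:, k])) ** 2
                total += lam * num / den
    return total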

Some Second Embodiments

For a single-layer transmission, minimizing the following loss function, taking an input channel H and output p, and assuming the noise covariance matrix R is fixed and known, may train the training model to output a precoding vector p that maximizes the SNR at the UE:

\mathcal{L}(H, p) = -\frac{|p^H H^H R^{-1} H p|}{\|p\|^2}

The precoding vector may be computed in a wideband or subband manner. In the latter case, the above formula will result in a different loss for each subband. These subband losses may, e.g., simply be summed together to obtain a single loss for training the training model.

The absolute value in the above numerator has been left for consistency; however, it may be removed since the numerator is always non-negative. To see why it may be removed, a Cholesky decomposition of the noise covariance matrix R may be considered, i.e.,

R = LL^H

Taking the inverse yields:

R^{-1} = L^{-H}L^{-1}

Thus, the numerator is equal to the squared Euclidean norm of L^{-1}Hp, i.e., \|L^{-1}Hp\|^2, which is real and always non-negative.

The loss function may be rewritten as:

\mathcal{L}(H, p) = -\sum_{i=1}^{N} \sigma_i^2 \frac{|v_i^H p|^2}{\|p\|^2}

where \sigma_i^2 is the i-th eigenvalue of H^H R^{-1} H (the squared i-th singular value of H in the case when R = I), and v_i is a corresponding i-th eigenvector. In these embodiments it is assumed that the eigenvectors v_i have norm equal to 1. N is the number of CSI-RS ports at the first wireless node 110.
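A minimal numpy sketch of this loss, assuming an estimated channel H, a known noise covariance R and a candidate precoding vector p (an illustration only, not a normative implementation):

import numpy as np

def single_layer_snr_loss(H, R, p):
    # -(p^H H^H R^{-1} H p) / ||p||^2: minimizing this drives p towards the
    # SNR-maximizing precoding vector. The numerator equals ||L^{-1} H p||^2
    # with R = L L^H, so it is real and non-negative.
    Hp = H @ p
    num = np.real(np.vdot(Hp, np.linalg.solve(R, Hp)))
    return -num / (np.linalg.norm(p) ** 2)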

Some Third Embodiments

For a multi-layer transmission, the training model may be trained to output an L-tuple of precoding vectors (p_1, p_2, ..., p_L) that maximizes the receive power of the second wireless node 120, e.g., summed over the L precoding vectors, subject to an orthogonality constraint, e.g., that the precoding vectors form an orthogonal basis. The following loss function may achieve this aim:

\mathcal{L}(H, p_1, ..., p_L) = -\sum_{\ell=1}^{L} \frac{|p_\ell^H H^H R^{-1} H p_\ell|}{\|p_\ell\|^2}

The absolute value in the above numerator has been left for consistency, however, it may be removed, analogous to the reasoning in Some Second Embodiments.

The loss function may be rewritten as:

\mathcal{L}(H, p_1, ..., p_L) = -\sum_{\ell=1}^{L} \sum_{i=1}^{N} \sigma_i^2 \frac{|v_i^H p_\ell|^2}{\|p_\ell\|^2}

Some Fourth Embodiments

This is a version of the Some Third Embodiments, but with a log(1+SNR) in the summation over layers:

\mathcal{L}(H, p_1, ..., p_L) = -\sum_{\ell=1}^{L} \log\left(1 + \frac{|p_\ell^H H^H R^{-1} H p_\ell|}{\|p_\ell\|^2}\right)
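A self-contained numpy sketch of the Some Third and Some Fourth losses (illustrative only): each per-layer term is the received-SNR proxy, summed directly or wrapped in log(1+SNR).

import numpy as np

def multi_layer_loss(H, R, P, use_log=False):
    # P holds the L precoding vectors as columns. Each per-layer term is
    # (p^H H^H R^{-1} H p) / ||p||^2; the Some Fourth variant uses log(1+SNR).
    total = 0.0
    for l in range(P.shape[1]):
        p = P[:, l]
        Hp = H @ p
        snr = np.real(np.vdot(Hp, np.linalg.solve(R, Hp))) / (np.linalg.norm(p) ** 2)
        total += np.log1p(snr) if use_log else snr
    return -total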

Some Fifth Embodiments

For a multi-layer transmission, the training model may be trained to output an L-tuple of precoding vectors (p_1, p_2, ..., p_L) which maximizes the receive power of the second wireless node 120, e.g., summed over the L precoding vectors, subject to an orthogonality constraint, e.g., that the vectors form an orthogonal basis. Given the knowledge that the eigenvectors v_1, v_2, ..., v_L of H^H R^{-1} H achieve this, the cosine of the angle between the subspaces spanned by p_\ell and v_\ell is to be 1, or as close to 1 as possible. It is further acknowledged that if a trade-off in accuracy is needed, it is more important to capture eigenvectors v_\ell corresponding to large eigenvalues \sigma_\ell^2. Considering a weighting that is some function, g(\cdot), of the corresponding eigenvalue \sigma_\ell^2, the following loss function may be found:

\mathcal{L} = -\sum_{\ell=1}^{L} g(\sigma_\ell^2) \frac{|v_\ell^H p_\ell|}{\|p_\ell\|}

One specific example, with g(\cdot) being the square-root, gives the following loss function:

\mathcal{L} = -\sum_{\ell=1}^{L} \sigma_\ell \frac{|v_\ell^H p_\ell|}{\|p_\ell\|}

In another example, for a layer-1 transmission, i.e., L = 1, and with a general g(\cdot), the loss function reduces to g(\sigma_1^2) times the cosine similarity between the output precoding vector and an eigenvector corresponding to the largest eigenvalue, i.e.,

\mathcal{L} = -g(\sigma_1^2) \frac{|v_1^H p|}{\|p\|}

Some Sixth Embodiments

A version of the Some Fifth Embodiments, with g(\cdot) being the square-root. The term-wise squared sum may also be used as a loss function:

\mathcal{L} = -\sum_{\ell=1}^{L} \sigma_\ell^2 \frac{|v_\ell^H p_\ell|^2}{\|p_\ell\|^2}

This loss function may be rewritten as:

\mathcal{L} = -\sum_{\ell=1}^{L} \sum_{i=1}^{L} \sigma_i^2 \frac{|v_i^H p_\ell|^2}{\|p_\ell\|^2}

which reconnects to the Some Second Embodiments, but with the smaller singular values truncated (not considered).
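A numpy sketch of the eigenvalue-weighted cosine-similarity losses of the Some Fifth and Some Sixth Embodiments. Here the eigenpairs of H^H R^{-1} H are computed explicitly and the weighting g is passed in as a function, e.g., np.sqrt; this is an illustration under those assumptions only.

import numpy as np

def weighted_cosine_loss(H, R, P, g=np.sqrt, squared=False):
    # Eigen-decomposition of A = H^H R^{-1} H, strongest directions first.
    A = H.conj().T @ np.linalg.solve(R, H)
    w, V = np.linalg.eigh(A)
    order = np.argsort(w)[::-1]
    w, V = w[order], V[:, order]
    loss = 0.0
    for l in range(P.shape[1]):
        cs = np.abs(np.vdot(V[:, l], P[:, l])) / np.linalg.norm(P[:, l])
        term = g(w[l]) * cs                       # e.g., sigma_l * cosine similarity
        loss -= term ** 2 if squared else term    # squared: Some Sixth variant
    return loss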

Some Seventh Embodiments

For a multi-layer transmission, the training model may be trained to output an L-tuple of precoding vectors (p_1, p_2, ..., p_L) which maximizes the receive power of the second wireless node 120, e.g., summed over the L precoding vectors, subject to an orthogonality constraint, e.g., that the vectors form an orthogonal basis. The Some Seventh Embodiments consider a closed-loop, e.g., water-filling, computation with a total transmit-power constraint. The loss function may be determined by first computing:

• s_\ell = \frac{|p_\ell^H H^H R^{-1} H p_\ell|}{\|p_\ell\|^2}, where the value s_\ell is indicative of the SNR received by the second wireless node 120.

• And the power allocation is chosen such that \sum_{\ell=1}^{L} U_\ell \leq U for a maximum allocated power U, which is a fixed parameter.

The loss may then be given by:

\mathcal{L} = -\sum_{\ell=1}^{L} U_\ell s_\ell
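An illustrative water-filling sketch for the total-power constraint; the bisection over the water level is one common way to satisfy the constraint and is an assumption here, not something prescribed by the embodiments.

import numpy as np

def waterfill(s, U_total, iters=50):
    # Classic water filling: U_l = max(0, mu - 1/s_l), with the water level mu
    # found by bisection so that sum(U_l) is approximately U_total.
    s = np.asarray(s, dtype=float)
    lo, hi = 0.0, U_total + np.max(1.0 / s)
    for _ in range(iters):
        mu = 0.5 * (lo + hi)
        if np.maximum(0.0, mu - 1.0 / s).sum() > U_total:
            hi = mu
        else:
            lo = mu
    return np.maximum(0.0, lo - 1.0 / s)

def seventh_loss(s, U_total):
    # s: per-layer SNR values s_l; the loss rewards the allocated received SNR.
    s = np.asarray(s, dtype=float)
    U = waterfill(s, U_total)
    return -(U * s).sum()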

Some Eighth Embodiments

A version of the Some Seventh Embodiments, but with a log(1+SNR) in the summation over layers. The loss function may be determined by first computing:

• s_\ell as in the Some Seventh Embodiments, and

• the power allocation chosen such that \sum_{\ell=1}^{L} U_\ell \leq U for a maximum allocated power U (which is a fixed parameter).

The loss may then be given by:

\mathcal{L} = -\sum_{\ell=1}^{L} \log(1 + U_\ell s_\ell)

Some Ninth Embodiments

For a multi-layer transmission, the training model may be trained to output an L-tuple of precoding vectors (p_1, p_2, ..., p_L) which maximizes the receive power of the second wireless node 120, e.g., summed over the L precoding vectors, subject to an orthogonality constraint, e.g., that the vectors form an orthogonal basis. The Some Ninth Embodiments consider a closed-loop, e.g., water-filling, computation with a per-antenna-port transmit-power constraint. The loss function may be determined by first computing:

• s_\ell as in the Some Seventh Embodiments, and

• the power allocation chosen such that \sum_{\ell=1}^{L} U_\ell |p_\ell| \leq U for a maximum allocated power U, which is a fixed parameter. Here |p_\ell| is an element-wise absolute value of the vector p_\ell, and the inequality is to be understood as an element-wise inequality applied to all elements in the resulting left-hand-side vector.

The loss may then be given by:

\mathcal{L} = -\sum_{\ell=1}^{L} U_\ell s_\ell

Some Tenth Embodiments

A version of the Some Ninth Embodiments, but with a log(1+SNR) in the summation over layers. The loss function may be determined by first computing:

• s_\ell as in the Some Seventh Embodiments, and

• the power allocation chosen such that \sum_{\ell=1}^{L} U_\ell |p_\ell| \leq U for a maximum allocated power U (which is a fixed parameter). Here |p_\ell| is an element-wise absolute value of the vector p_\ell, and the inequality is to be understood as an element-wise inequality applied to all elements in the resulting left-hand-side vector.

The loss function may then be given by:

\mathcal{L} = -\sum_{\ell=1}^{L} \log(1 + U_\ell s_\ell)

Some Eleventh Embodiments

Using generalized cosine similarity with the application-relevant matrix A = H^H R^{-1} H, i.e., the cosine similarity defined with respect to the inner product \langle x, y \rangle_A = x^H A y, the loss functions above may be rewritten in terms of CS, where CS is short-hand notation for the cosine similarity.

Some Twelfth Embodiments

Any monotonously increasing function f of the loss function \mathcal{L}(p) will have the same optimum p as the loss function itself for a single sample, and may hence for an individual sample be used as a proxy or surrogate for the per-sample loss function. However, if the loss function is averaged over multiple samples to form an overall average loss, the function f may affect the results if f is not linear. During training, an average over the losses for many samples is effectively taken, either explicitly, e.g., in the case of batch processing, or implicitly, e.g., in the case of stochastic gradient descent (SGD). It may hence in principle matter whether one uses \mathcal{L}(p) or \mathcal{L}'(p) = f(\mathcal{L}(p)) as the loss function, for a monotonously increasing, or non-decreasing, function f.

The used loss function \mathcal{L}'(p_1, p_2, ..., p_L) is a strictly increasing, or non-decreasing, function f of any of the loss functions in the Some Embodiments presented above, i.e.,

\mathcal{L}'(p_1, p_2, ..., p_L) = f(\mathcal{L}(p_1, p_2, ..., p_L))

For example, f may be chosen to be the logarithmic function, meaning the averaging takes place in the logarithmic domain, such as dB scale instead of linear SNR scale. More generally, the choice of f may be used to control how the neural network during training balances further reduced loss for samples with already small loss vs samples with fairly large loss.
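A minimal sketch of such a transformed loss, assuming non-negative per-sample losses (e.g., a loss measured relative to the optimal precoder) so that a logarithmic f is well defined; np.log1p is used here as one hypothetical choice of f.

import numpy as np

def transformed_batch_loss(per_sample_losses, f=np.log1p):
    # L'(p) = f(L(p)): the same per-sample optimum for a monotone f, but a
    # different balance across samples once the batch average is taken.
    x = np.asarray(per_sample_losses, dtype=float)
    return f(x).mean()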

In one variant of the Some Twelfth Embodiments, the function f takes one or more additional arguments, reflecting e.g. the ground-truth optimal precoder v_\ell or some other property of the precoders or the channel for the sample. For example, an additional argument may be the overall magnitude of the channel/precoder coefficients/elements (the ground truth v_\ell and/or the estimate p_\ell). The function f may e.g., be designed to balance optimization towards high-quality or low-quality channels. In some variants, the loss function reflects the loss in link capacity relative to the link capacity of the sample with the optimal precoder.

Figures 8a and 8b show an example of an arrangement in the first wireless node 110.

The first wireless node 110 may comprise an input and output interface configured to communicate with other networking entities in the wireless communications network 100, e.g. the second wireless node 120 and the operator node 130. The input and output interface may comprise a receiver, e.g. wired and/or wireless (not shown), and a transmitter, e.g. wired and/or wireless (not shown).

The first wireless node 110 may comprise any one or more out of: An obtaining unit, and a determining unit to perform the method actions as described herein.

The embodiments herein may be implemented through a processor or one or more processors, such as at least one processor of a processing circuitry in the first wireless node 110 depicted in Figure 8a, together with computer program code for performing the functions and actions of the embodiments herein. The program code mentioned above may also be provided as a computer program product, for instance in the form of a data carrier carrying computer program code for performing the embodiments herein when being loaded into the first wireless node 110. One such carrier may be in the form of a CD ROM disc. It is however feasible with other data carriers such as a memory stick. The computer program code may furthermore be provided as pure program code on a server and downloaded to the first wireless node 110.

The first wireless node 110 may further comprise a respective memory comprising one or more memory units. The memory comprises instructions executable by the processor in the first wireless node 110. The memory is arranged to be used to store instructions, data, configurations, and applications to perform the methods herein when being executed in the first wireless node 110.

In some embodiments, a computer program comprises instructions, which when executed by the at least one processor, cause the at least one processor of the first wireless node 110 to perform the actions above. In some embodiments, a respective carrier comprises the respective computer program, wherein the carrier is one of an electronic signal, an optical signal, an electromagnetic signal, a magnetic signal, an electric signal, a radio signal, a microwave signal, or a computer-readable storage medium.

Those skilled in the art will also appreciate that the functional modules in the first wireless node 110, described above, may refer to a combination of analog and digital circuits, and/or one or more processors configured with software and/or firmware, e.g. stored in the first wireless node 110, that when executed by the respective one or more processors such as the at least one processor described above cause the respective at least one processor to perform actions according to any of the actions above. One or more of these processors, as well as the other digital hardware, may be included in a single Application-Specific Integrated Circuit (ASIC), or several processors and various digital hardware may be distributed among several separate components, whether individually packaged or assembled into a system-on-a-chip (SoC).

Figures 9a and 9b show an example of an arrangement in the operator node 130.

The operator node 130 may comprise an input and output interface configured to communicate with other networking entities in the wireless communications network 100, e.g. the first wireless node 110 and the second wireless node 120. The input and output interface may comprise a receiver, e.g. wired and/or wireless, (not shown) and a transmitter, e.g. wired and/or wireless, (not shown).

The operator node 130 may comprise any one or more out of: an obtaining unit, a reconstructing unit, a calculating unit and a training unit to perform the method actions as described herein.

The embodiments herein may be implemented through a processor or one or more processors, such as at least one processor of a processing circuitry in the operator node 130 depicted in Figure 9a, together with computer program code for performing the functions and actions of the embodiments herein. The program code mentioned above may also be provided as a computer program product, for instance in the form of a data carrier carrying computer program code for performing the embodiments herein when being loaded into the operator node 130. One such carrier may be in the form of a CD ROM disc. It is however feasible with other data carriers such as a memory stick. The computer program code may furthermore be provided as pure program code on a server and downloaded to the operator node 130.

The operator node 130 may further comprise a respective memory comprising one or more memory units. The memory comprises instructions executable by the processor in the operator node 130. The memory is arranged to be used to store instructions, data, configurations, and applications to perform the methods herein when being executed in the operator node 130.

In some embodiments, a computer program comprises instructions, which when executed by the at least one processor, cause the at least one processor of the operator node 130 to perform the actions above.

In some embodiments, a respective carrier comprises the respective computer program, wherein the carrier is one of an electronic signal, an optical signal, an electromagnetic signal, a magnetic signal, an electric signal, a radio signal, a microwave signal, or a computer-readable storage medium.

Those skilled in the art will also appreciate that the functional modules in the operator node 130, described above, may refer to a combination of analog and digital circuits, and/or one or more processors configured with software and/or firmware, e.g. stored in the operator node 130, that when executed by the respective one or more processors such as the at least one processor described above cause the respective at least one processor to perform actions according to any of the actions above. One or more of these processors, as well as the other digital hardware, may be included in a single Application-Specific Integrated Circuit (ASIC), or several processors and various digital hardware may be distributed among several separate components, whether individually packaged or assembled into a system-on-a-chip (SoC).

When using the word "comprise" or "comprising" it shall be interpreted as non-limiting, i.e. meaning "consist at least of".

The embodiments herein are not limited to the preferred embodiments described herein. Various alternatives, modifications and equivalents may be used. It should be noted that the order of the First-Twelfth Embodiments mentioned above is not related to the order of the embodiments 1-36 below. Each of the First-Twelfth Embodiments may relate to and be combined with any suitable embodiment out of the embodiments 1-36 below.

Embodiments

Below, some example Embodiments 1-36 are shortly described. See e.g. Figures 5, 6, 7, 8a, 8b, 9a and 9b.

Embodiment 1. A method performed by a first wireless node (110) e.g., for determining one or more preferred precoders in a wireless communications network (100), wherein the one or more precoders e.g., are maximizing a Signal-to-Noise Ratio, SNR, received by a second wireless node (120), the method comprising any one or more out of: obtaining (601) a training model, wherein the training model has been trained, by minimizing a loss function indicative of a reconstruction loss of one or more reconstructed second channels and/or one or more reconstructed second channel features, to provide, e.g., an output comprising, any one or more out of:

- the one or more preferred precoders, e.g., maximizing the SNR received by the second wireless node (120),

- one or more reconstructed estimated channels, and

- one or more reconstructed estimated channel features, obtaining (602), from the second wireless node (120), a first compressed channel feature codeword indicative of one or more first channels and/or one or more first channel features estimated by the second wireless node (120), and determining (603), based on the obtained first compressed channel feature codeword and the obtained training model, the one or more preferred precoders e.g., maximizing the SNR received by the second wireless node (120).

Embodiment 2. The method according to Embodiment 1, wherein determining (603) the one or more preferred precoders comprises: providing the first compressed channel feature codeword as input to the obtained training model, which training model has been trained by minimizing the loss function, receiving any one or more out of: the one or more preferred precoders e.g., maximizing the SNR received by the second wireless node (120), one or more reconstructed estimated first channels and one or more reconstructed estimated first channel features, as output from the training model, and determining the one or more preferred precoders based on the output from the training model.

Embodiment 3. The method according to any of Embodiments 1-2, wherein obtaining (601) the training model comprises: obtaining one or more second compressed channel feature codewords, indicative of one or more estimated second channels and/or one or more estimated second channel features, reconstructing the one or more estimated second channels and/or one or more estimated second channel features, e.g., by using the training model, calculating, e.g., based on the reconstructed one or more estimated second channels and/or reconstructed one or more estimated second channel features and/or an output of the training model, a reconstruction loss using the loss function, and training the model to provide any one or more out of: the one or more preferred precoders, e.g., maximizing the SNR received by the second wireless node (120), one or more reconstructed estimated channels and one or more reconstructed estimated channel features, by using machine learning, e.g., based on the calculated reconstruction loss, to minimize the loss function.

Embodiment 4. The method according to Embodiment 3, wherein the reconstruction loss is any one or more out of:

- a sum of a plurality of reconstruction losses calculated based on a plurality of subbands,

- a sum of reconstruction losses calculated based on a plurality of transmission layers,

- a weighted sum of reconstruction losses calculated based on inner products between precoders and eigenvectors,

- a weighted sum being calculated by calculating a set of values based on similarities between multiple precoders and multiple eigenvectors, wherein the weights are based on norms of precoders and/or eigenvalues associated to the multiple eigenvectors.

Embodiment 5. The method according to Embodiment 4, wherein when the reconstruction loss is the weighted sum of reconstruction losses calculated based on inner products between precoders and eigenvectors, the weights are any one or more out of:

- the square of eigenvalues of a covariance matrix related to the one or more estimated channels, and

- the norms of the plurality of precoders and/or eigenvalues associated to the multiple eigenvectors.

Embodiment 6. The method according to any of Embodiments 1-5, wherein, when more than one preferred precoders are determined, the more than one preferred precoders are, e.g., approximatively, orthogonal.

Embodiment 7. The method according to Embodiment 6, wherein determining the more than one orthogonal preferred precoders comprises any one or more out of:

- performing computational post-processing on the output of the training model to orthogonalize more than one preferred precoders, and

- adding a penalizing term to the loss function used for training the training model.

Embodiment 8. The method according to any of Embodiments 1-7, wherein the loss function used for training the training model is e.g., a strictly increasing or non-decreasing function of a second loss function.

Embodiment 9. The method according to Embodiment 8, wherein the loss function is the logarithm of the sum of 1 and the second loss function.

Embodiment 10. A computer program comprising instructions, which when executed by a processor, causes the processor to perform actions according to any of the Embodiments 1-9.

Embodiment 11. A carrier comprising the computer program of Embodiment 10, wherein the carrier is one of an electronic signal, an optical signal, an electromagnetic signal, a magnetic signal, an electric signal, a radio signal, a microwave signal, or a computer-readable storage medium.

Embodiment 12. A method performed by an operator node (130) e.g., for training a training model to provide any one or more out of: One or more preferred precoders, e.g., maximizing a Signal-to-Noise Ratio, SNR, received by a second wireless node (120), one or more reconstructed estimated channels and one or more reconstructed estimated channel features, the method comprising any one or more out of: obtaining (701) one or more third compressed channel feature codewords, indicative of one or more third estimated channels and/or third estimated channel features, reconstructing (702) the one or more third estimated channels and/or one or more third estimated channel features, e.g., by using the training model, calculating (703), e.g., based on the reconstructed one or more third channels and/or reconstructed one or more third estimated channel features and/or an output of the training model, a reconstruction loss using the loss function, and training (704) the training model to provide any one or more out of: the one or more preferred precoders, e.g., maximizing the SNR received by the second wireless node (120), the one or more reconstructed estimated channels and the one or more reconstructed estimated channel features, based on the calculated reconstruction loss, to minimize the loss function.

Embodiment 13. The method according to Embodiment 12, wherein the reconstruction loss is any one or more out of:

- a sum of a plurality of reconstruction losses calculated based on a plurality of subbands,

- a sum of reconstruction losses calculated based on a plurality of transmission layers,

- a weighted sum of reconstruction losses calculated based on inner products between precoders and eigenvectors,

- a weighted sum being calculated by calculating a set of values based on similarities between multiple precoders and multiple eigenvectors, wherein the weights are based on norms of precoders and/or eigenvalues associated to the multiple eigenvectors.

Embodiment 14. The method according to Embodiment 13, wherein when the reconstruction loss is the weighted sum of reconstruction losses calculated based on inner products between precoders and eigenvectors, the weights are any one or more out of:

- the square of eigenvalues of a covariance matrix related to the one or more estimated channels, and

- the norms of the plurality of precoders and/or eigenvalues associated to the multiple eigenvectors.

Embodiment 15. The method according to any of Embodiments 12-14, wherein, when more than one preferred precoders are determined, the more than one preferred precoders are, e.g., approximatively, orthogonal.

Embodiment 16. The method according to Embodiment 15, wherein determining the more than one orthogonal preferred precoders comprises any one or more out of:

- performing computational post-processing on the output of the training model to orthogonalize more than one preferred precoders, and

- adding a penalizing term to the loss function used for training the training model.

Embodiment 17. The method according to any of Embodiments 12-16, wherein the loss function used for training the training model is e.g., a strictly increasing or non-decreasing function of a second loss function.

Embodiment 18. The method according to Embodiment 17, wherein the loss function is the logarithm of the sum of 1 and the second loss function.

Embodiment 19. A computer program comprising instructions, which when executed by a processor, causes the processor to perform actions according to any of the Embodiments 12-18.

Embodiment 20. A carrier comprising the computer program of Embodiment 19, wherein the carrier is one of an electronic signal, an optical signal, an electromagnetic signal, a magnetic signal, an electric signal, a radio signal, a microwave signal, or a computer-readable storage medium.

Embodiment 21. A first wireless node (110) e.g., configured to determine one or more preferred precoders in a wireless communications network (100), wherein the one or more precoders e.g., are maximizing a Signal-to-Noise Ratio, SNR, received by a second wireless node (120), the first wireless node (110) further being configured to any one or more out of: obtain a training model, wherein the training model is adapted to have been trained, by minimizing a loss function indicative of a reconstruction loss of one or more reconstructed second channels and/or one or more reconstructed second channel features, to provide, e.g., an output comprising, any one or more out of:

- the one or more preferred precoders, e.g., maximizing the SNR received by the second wireless node (120),

- one or more reconstructed estimated channels, and

- one or more reconstructed estimated channel features, obtain, from the second wireless node (120), a first compressed channel feature codeword indicative of one or more first channels and/or one or more first channel features estimated by the second wireless node (120), and determine, based on the obtained first compressed channel feature codeword and the obtained training model, the one or more preferred precoders e.g., maximizing the SNR received by the second wireless node (120).

Embodiment 22. The first wireless node (110) according to Embodiment 21, wherein the first wireless node (110) is configured to determine the one or more preferred precoders by further being configured to: provide the first compressed channel feature codeword as input to the obtained training model, which training model is adapted to have been trained by minimizing the loss function, receive any one or more out of: the one or more preferred precoders e.g., maximizing the SNR received by the second wireless node (120), one or more reconstructed estimated first channels and one or more reconstructed estimated first channel features, as output from the training model, and determine the one or more preferred precoders based on the output from the training model.

Embodiment 23. The first wireless node (110) according to any of Embodiments 21-22, wherein the first wireless node (110) is further configured to obtain the training model by further being configured to: obtain one or more second compressed channel feature codewords, indicative of one or more estimated second channels and/or one or more estimated second channel features, reconstruct the one or more estimated second channels and/or one or more estimated second channel features, e.g., by using the training model, calculate, e.g., based on the reconstructed one or more estimated second channels and/or reconstructed one or more estimated second channel features and/or an output of the training model, a reconstruction loss using the loss function, and train the model to provide any one or more out of: the one or more preferred precoders, e.g., maximizing the SNR received by the second wireless node (120), one or more reconstructed estimated channels and/or one or more reconstructed estimated channel features, by using machine learning, e.g., based on the calculated reconstruction loss, to minimize the loss function.

Embodiment 24. The first wireless node (110) according to Embodiment 23, wherein the reconstruction loss is adapted to be any one or more out of:

- a sum of a plurality of reconstruction losses calculated based on a plurality of subbands,

- a sum of reconstruction losses calculated based on a plurality of transmission layers,

- a weighted sum of reconstruction losses calculated based on inner products between precoders and eigenvectors,

- a weighted sum being calculated by calculating a set of values based on similarities between multiple precoders and multiple eigenvectors, wherein the weights are based on norms of precoders and/or eigenvalues associated to the multiple eigenvectors.

Embodiment 25. The first wireless node (110) according to Embodiment 24, wherein when the reconstruction loss is the weighted sum of reconstruction losses calculated based on inner products between precoders and eigenvectors, the weights are adapted to be any one or more out of:

- the square of eigenvalues of a covariance matrix related to the one or more estimated channels, and

- the norms of the plurality of precoders and/or eigenvalues associated to the multiple eigenvectors.

Embodiment 26. The first wireless node (110) according to any of Embodiments 21-25, wherein, when more than one preferred precoders are determined, the more than one preferred precoders are adapted to be, e.g., approximatively, orthogonal.

Embodiment 27. The first wireless node (110) according to Embodiment 26, wherein the first wireless node (110) is configured to determine the more than one orthogonal preferred precoders by further being configured to any one or more out of:

- perform computational post-processing on the output of the training model to orthogonalize more than one preferred precoders, and

- add a penalizing term to the loss function used for training the training model.

Embodiment 28. The first wireless node (110) according to any of Embodiments 21-27, wherein the loss function used for training the training model is adapted to be e.g., a strictly increasing or non-decreasing function of a second loss function.

Embodiment 29. The first wireless node (110) according to Embodiment 28, wherein the loss function is adapted to be the logarithm of the sum of 1 and the second loss function.

Embodiment 30. An operator node (130) e.g., configured to train a training model to provide any one or more out of: One or more preferred precoders, e.g., maximizing a Signal-to-Noise Ratio, SNR, received by a second wireless node (120), one or more reconstructed estimated channels and one or more reconstructed estimated channel features, the operator node (130) further being configured to any one or more out of: obtain one or more third compressed channel feature codewords, indicative of one or more third estimated channels and/or one or more third estimated channel features, reconstruct the one or more third estimated channels and/or one or more third estimated channel features, e.g., by using the training model, calculate, e.g., based on the reconstructed one or more third channels and/or reconstructed one or more third channel features and/or an output of the training model, a reconstruction loss using the loss function, and train the training model to provide any one or more out of: the one or more preferred precoders, e.g., maximizing the SNR received by the second wireless node (120), the one or more reconstructed estimated channels and the one or more reconstructed estimated channel features, by using machine learning, based on the calculated reconstruction loss, to minimize the loss function.

Embodiment 31. The operator node (130) according to Embodiment 30, wherein the reconstruction loss is adapted to be any one or more out of:

- a sum of a plurality of reconstruction losses calculated based on a plurality of subbands,

- a sum of reconstruction losses calculated based on a plurality of transmission layers,

- a weighted sum of reconstruction losses calculated based on inner products between precoders and eigenvectors,

- a weighted sum being calculated by calculating a set of values based on similarities between multiple precoders and multiple eigenvectors, wherein the weights are based on norms of precoders and/or eigenvalues associated to the multiple eigenvectors.

Embodiment 32. The operator node (130) according to Embodiment 31, wherein when the reconstruction loss is adapted to be the weighted sum of reconstruction losses calculated based on inner products between precoders and eigenvectors, the weights are adapted to be any one or more out of:

- the square of eigenvalues of a covariance matrix related to the one or more estimated channels, and

- the norms of the plurality of precoders and/or eigenvalues associated to the multiple eigenvectors.

Embodiment 33. The operator node (130) according to any of Embodiments 30-32, wherein, when more than one preferred precoders are determined, the more than one preferred precoders are adapted to be, e.g., approximatively, orthogonal.

Embodiment 34. The operator node (130) according to Embodiment 33, wherein the operator node (130) is configured to determine the more than one orthogonal preferred precoders by further being configured to any one or more out of:

- perform computational post-processing on the output of the training model to orthogonalize more than one preferred precoders, and

- add a penalizing term to the loss function used for training the training model.

Embodiment 35. The operator node (130) according to any of Embodiments 30-34, wherein the loss function used for training the training model is adapted to be e.g., a strictly increasing or non-decreasing function of a second loss function.

Embodiment 36. The operator node (130) according to Embodiment 35, wherein the loss function is adapted to be the logarithm of the sum of 1 and the second loss function.

Further Extensions and Variations

With reference to Figure 10, in accordance with an embodiment, a communication system includes a telecommunication network 3210, such as the wireless communications network 100, e.g., an IoT network, a WLAN, or a 3GPP-type cellular network, which comprises an access network 3211, such as a radio access network, and a core network 3214. The access network 3211 comprises a plurality of base stations 3212a, 3212b, 3212c, such as the network node 110, access nodes, AP STAs, NBs, eNBs, gNBs or other types of wireless access points, each defining a corresponding coverage area 3213a, 3213b, 3213c. Each base station 3212a, 3212b, 3212c is connectable to the core network 3214 over a wired or wireless connection 3215. A first user equipment (UE), e.g., the UE 120, such as a Non-AP STA 3291, located in coverage area 3213c is configured to wirelessly connect to, or be paged by, the corresponding base station 3212c. A second UE 3292, e.g., the UE 120, such as a Non-AP STA, in coverage area 3213a is wirelessly connectable to the corresponding base station 3212a. While a plurality of UEs 3291, 3292 are illustrated in this example, the disclosed embodiments are equally applicable to a situation where a sole UE is in the coverage area or where a sole UE is connecting to the corresponding base station 3212.

The telecommunication network 3210 is itself connected to a host computer 3230, which may be embodied in the hardware and/or software of a standalone server, a cloud-implemented server, a distributed server or as processing resources in a server farm. The host computer 3230 may be under the ownership or control of a service provider, or may be operated by the service provider or on behalf of the service provider. The connections 3221, 3222 between the telecommunication network 3210 and the host computer 3230 may extend directly from the core network 3214 to the host computer 3230 or may go via an optional intermediate network 3220. The intermediate network 3220 may be one of, or a combination of more than one of, a public, private or hosted network; the intermediate network 3220, if any, may be a backbone network or the Internet; in particular, the intermediate network 3220 may comprise two or more sub-networks (not shown).

The communication system of Figure 10 as a whole enables connectivity between one of the connected UEs 3291, 3292 and the host computer 3230. The connectivity may be described as an over-the-top (OTT) connection 3250. The host computer 3230 and the connected UEs 3291, 3292 are configured to communicate data and/or signaling via the OTT connection 3250, using the access network 3211, the core network 3214, any intermediate network 3220 and possible further infrastructure (not shown) as intermediaries. The OTT connection 3250 may be transparent in the sense that the participating communication devices through which the OTT connection 3250 passes are unaware of the routing of uplink and downlink communications. For example, a base station 3212 need not be informed about the past routing of an incoming downlink communication with data originating from a host computer 3230 to be forwarded, e.g., handed over, to a connected UE 3291. Similarly, the base station 3212 need not be aware of the future routing of an outgoing uplink communication originating from the UE 3291 towards the host computer 3230.

Example implementations, in accordance with an embodiment, of the UE, base station and host computer discussed in the preceding paragraphs will now be described with reference to Figure 11. In a communication system 3300, a host computer 3310 comprises hardware 3315 including a communication interface 3316 configured to set up and maintain a wired or wireless connection with an interface of a different communication device of the communication system 3300. The host computer 3310 further comprises processing circuitry 3318, which may have storage and/or processing capabilities. In particular, the processing circuitry 3318 may comprise one or more programmable processors, application-specific integrated circuits, field programmable gate arrays or combinations of these (not shown) adapted to execute instructions. The host computer 3310 further comprises software 3311, which is stored in or accessible by the host computer 3310 and executable by the processing circuitry 3318. The software 3311 includes a host application 3312. The host application 3312 may be operable to provide a service to a remote user, such as a UE 3330 connecting via an OTT connection 3350 terminating at the UE 3330 and the host computer 3310. In providing the service to the remote user, the host application 3312 may provide user data which is transmitted using the OTT connection 3350.

The communication system 3300 further includes a base station 3320 provided in a telecommunication system and comprising hardware 3325 enabling it to communicate with the host computer 3310 and with the UE 3330. The hardware 3325 may include a communication interface 3326 for setting up and maintaining a wired or wireless connection with an interface of a different communication device of the communication system 3300, as well as a radio interface 3327 for setting up and maintaining at least a wireless connection 3370 with a UE 3330 located in a coverage area (not shown) served by the base station 3320. The communication interface 3326 may be configured to facilitate a connection 3360 to the host computer 3310. The connection 3360 may be direct or it may pass through a core network (not shown in Figure 11) of the telecommunication system and/or through one or more intermediate networks outside the telecommunication system. In the embodiment shown, the hardware 3325 of the base station 3320 further includes processing circuitry 3328, which may comprise one or more programmable processors, application-specific integrated circuits, field programmable gate arrays or combinations of these (not shown) adapted to execute instructions. The base station 3320 further has software 3321 stored internally or accessible via an external connection.

The communication system 3300 further includes the UE 3330 already referred to. Its hardware 3335 may include a radio interface 3337 configured to set up and maintain a wireless connection 3370 with a base station serving a coverage area in which the UE 3330 is currently located. The hardware 3335 of the UE 3330 further includes processing circuitry 3338, which may comprise one or more programmable processors, application-specific integrated circuits, field programmable gate arrays or combinations of these (not shown) adapted to execute instructions. The UE 3330 further comprises software 3331, which is stored in or accessible by the UE 3330 and executable by the processing circuitry 3338. The software 3331 includes a client application 3332. The client application 3332 may be operable to provide a service to a human or non-human user via the UE 3330, with the support of the host computer 3310. In the host computer 3310, an executing host application 3312 may communicate with the executing client application 3332 via the OTT connection 3350 terminating at the UE 3330 and the host computer 3310. In providing the service to the user, the client application 3332 may receive request data from the host application 3312 and provide user data in response to the request data. The OTT connection 3350 may transfer both the request data and the user data. The client application 3332 may interact with the user to generate the user data that it provides.

It is noted that the host computer 3310, base station 3320 and UE 3330 illustrated in Figure 11 may be identical to the host computer 3230, one of the base stations 3212a, 3212b, 3212c and one of the UEs 3291, 3292 of Figure 10, respectively. This is to say, the inner workings of these entities may be as shown in Figure 11 and independently, the surrounding network topology may be that of Figure 10.

In Figure 11, the OTT connection 3350 has been drawn abstractly to illustrate the communication between the host computer 3310 and the user equipment 3330 via the base station 3320, without explicit reference to any intermediary devices and the precise routing of messages via these devices. Network infrastructure may determine the routing, which it may be configured to hide from the UE 3330 or from the service provider operating the host computer 3310, or both. While the OTT connection 3350 is active, the network infrastructure may further take decisions by which it dynamically changes the routing, e.g., on the basis of load balancing considerations or reconfiguration of the network.

The wireless connection 3370 between the UE 3330 and the base station 3320 is in accordance with the teachings of the embodiments described throughout this disclosure. One or more of the various embodiments improve the performance of OTT services provided to the UE 3330 using the OTT connection 3350, in which the wireless connection 3370 forms the last segment. More precisely, the teachings of these embodiments may improve the data rate, latency and power consumption of the RAN, and thereby provide benefits for the OTT service such as reduced user waiting time, relaxed restrictions on file size, better responsiveness and extended battery lifetime.

A measurement procedure may be provided for the purpose of monitoring data rate, latency and other factors on which the one or more embodiments improve. There may further be an optional network functionality for reconfiguring the OTT connection 3350 between the host computer 3310 and UE 3330, in response to variations in the measurement results. The measurement procedure and/or the network functionality for reconfiguring the OTT connection 3350 may be implemented in the software 3311 of the host computer 3310 or in the software 3331 of the UE 3330, or both. In embodiments, sensors (not shown) may be deployed in or in association with communication devices through which the OTT connection 3350 passes; the sensors may participate in the measurement procedure by supplying values of the monitored quantities exemplified above, or supplying values of other physical quantities from which the software 3311, 3331 may compute or estimate the monitored quantities. The reconfiguring of the OTT connection 3350 may include changes to message format, retransmission settings, preferred routing etc.; the reconfiguring need not affect the base station 3320, and it may be unknown or imperceptible to the base station 3320. Such procedures and functionalities may be known and practiced in the art. In certain embodiments, measurements may involve proprietary UE signaling facilitating the measurements by the host computer 3310 of throughput, propagation times, latency and the like. The measurements may be implemented in that the software 3311, 3331 causes messages to be transmitted, in particular empty or ‘dummy’ messages, using the OTT connection 3350 while it monitors propagation times, errors etc.
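Purely as an illustration of such a ‘dummy’-message measurement, the monitoring software could time short probe messages against an echo-style endpoint, as in the following sketch; the endpoint, names and payload are hypothetical assumptions and not part of this disclosure.

    import socket
    import time

    def probe_rtt(host, port, n_probes=10):
        """Hypothetical latency probe: send short 'dummy' payloads over a TCP
        connection and time the echoed replies; returns the mean round-trip time."""
        rtts = []
        with socket.create_connection((host, port), timeout=5.0) as sock:
            for _ in range(n_probes):
                t0 = time.monotonic()
                sock.sendall(b"probe")  # dummy message
                sock.recv(16)           # wait for the echo
                rtts.append(time.monotonic() - t0)
        return sum(rtts) / len(rtts)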

Figure 12 is a flowchart illustrating a method implemented in a communication system, in accordance with one embodiment. The communication system includes a host computer, a base station such as the network node 110, and a UE such as the UE 120, which may be those described with reference to Figure 10 and Figure 11. For simplicity of the present disclosure, only drawing references to Figure 12 will be included in this section. In a first action 3410 of the method, the host computer provides user data. In an optional sub action 3411 of the first action 3410, the host computer provides the user data by executing a host application. In a second action 3420, the host computer initiates a transmission carrying the user data to the UE. In an optional third action 3430, the base station transmits to the UE the user data which was carried in the transmission that the host computer initiated, in accordance with the teachings of the embodiments described throughout this disclosure. In an optional fourth action 3440, the UE executes a client application associated with the host application executed by the host computer.

Figure 13 is a flowchart illustrating a method implemented in a communication system, in accordance with one embodiment. The communication system includes a host computer, a base station such as an AP STA, and a UE such as a Non-AP STA which may be those described with reference to Figure 10 and Figure 11. For simplicity of the present disclosure, only drawing references to Figure 13 will be included in this section. In a first action 3510 of the method, the host computer provides user data. In an optional sub action (not shown) the host computer provides the user data by executing a host application. In a second action 3520, the host computer initiates a transmission carrying the user data to the UE. The transmission may pass via the base station, in accordance with the teachings of the embodiments described throughout this disclosure. In an optional third action 3530, the UE receives the user data carried in the transmission.

Figure 14 is a flowchart illustrating a method implemented in a communication system, in accordance with one embodiment. The communication system includes a host computer, a base station such as an AP STA, and a UE such as a Non-AP STA which may be those described with reference to Figure 10 and Figure 11. For simplicity of the present disclosure, only drawing references to Figure 14 will be included in this section. In an optional first action 3610 of the method, the UE receives input data provided by the host computer. Additionally or alternatively, in an optional second action 3620, the UE provides user data. In an optional sub action 3621 of the second action 3620, the UE provides the user data by executing a client application. In a further optional sub action 3611 of the first action 3610, the UE executes a client application which provides the user data in reaction to the received input data provided by the host computer. In providing the user data, the executed client application may further consider user input received from the user. Regardless of the specific manner in which the user data was provided, the UE initiates, in an optional third action 3630, transmission of the user data to the host computer. In a fourth action 3640 of the method, the host computer receives the user data transmitted from the UE, in accordance with the teachings of the embodiments described throughout this disclosure.

Figure 15 is a flowchart illustrating a method implemented in a communication system, in accordance with one embodiment. The communication system includes a host computer, a base station such as an AP STA, and a UE such as a Non-AP STA which may be those described with reference to Figure 10 and Figure 11. For simplicity of the present disclosure, only drawing references to Figure 15 will be included in this section. In an optional first action 3710 of the method, in accordance with the teachings of the embodiments described throughout this disclosure, the base station receives user data from the UE. In an optional second action 3720, the base station initiates transmission of the received user data to the host computer. In a third action 3730, the host computer receives the user data carried in the transmission initiated by the base station.

Definitions of some abbreviations and acronyms used herein.

Abbreviation   Explanation
AE             Autoencoder (see Figure 3)
AI             Artificial Intelligence
BS             Base station
CSI            Channel State Information
DFT            Discrete Fourier Transform
MIMO           Multiple-input multiple-output (channel)
ML             Machine learning
MU-MIMO        Multi-user MIMO
NN             Neural network
NW             Network
PMI            Precoding Matrix Indicator
RS             Reference signal
Rx             Receiver
SVD            Singular Value Decomposition
Tx             Transmitter
UE             User equipment
