ANTENNA CALIBRATION CONTROL FOR ADAPTING ANTENNA CALIBRATION INTERVALS

Document Type and Number:

WIPO Patent Application WO/2022/260562

Kind Code:

Abstract:

An antenna calibration control unit (110) and method therein for regulating antenna calibration intervals of an antenna unit (120) are disclosed. The antenna calibration control unit (110) obtains system performance parameters related to needs of calibration of the antenna unit. The system performance parameters comprise information on the previous antenna calibrations, information obtained from an antenna supervision unit (130) and information obtained from an antenna calibration process unit (140) and/or a baseband processing unit (150). The antenna calibration control unit (110) evaluates the system performance parameters and determines whether to trigger an antenna calibrating process or not based on the evaluating result, thereby regulating the antenna calibration intervals between antenna calibration measurements.

Inventors:

WIDEBRANT ANDERS (SE)
KALYANAM SARAT (SE)
ZHANG HAO (CN)
LEVIN GEORGY (CA)

Application Number:

PCT/SE2021/050545

Publication Date:

December 15, 2022

Filing Date:

June 07, 2021

Assignee:

ERICSSON TELEFON AB L M (SE)

International Classes:

H04B17/12; G06N20/00; H01Q3/26; H04B7/0413; H04B17/21; H04L7/00

Domestic Patent References:

WO2020244783A1	2020-12-10
WO2018145534A1	2018-08-16

Foreign References:

US20080297402A1	2008-12-04
EP2203987A2	2010-07-07

Other References:

C. SHAN ET AL.: "Diagnosis of Calibration State for Massive Antenna Array via Deep Learning", IEEE WIRELESS COMMUNICATIONS LETTERS, vol. 8, 2019, pages 1431, XP011750153, DOI: 10.1109/LWC.2019.2920818
DOMINIK BAUMANN; JIA-JIE ZHU; GEORG MARTIUS; SEBASTIAN TRIMPE: "Deep Reinforcement Learning for Event-Triggered Control", ARXIV.ORG, CORNELL UNIVERSITY LIBRARY, 201 OLIN LIBRARY CORNELL UNIVERSITY ITHACA, NY 14853, 13 September 2018 (2018-09-13), XP080917153
CHONGWEN HUANG; GEORGE C. ALEXANDROPOULOS; ALESSIO ZAPPONE; CHAU YUEN; MÉROUANE DEBBAH: "Deep Learning for UL/DL Channel Calibration in Generic Massive MIMO Systems", ARXIV.ORG, CORNELL UNIVERSITY LIBRARY, 201 OLIN LIBRARY CORNELL UNIVERSITY ITHACA, NY 14853, 7 March 2019 (2019-03-07), XP081130552
WEI XIZIXIANG; JIANG YI; WANG XIN: "Calibration of Phase Shifter Network for Hybrid Beamforming in mmWave Massive MIMO Systems", ICC 2019 - 2019 IEEE INTERNATIONAL CONFERENCE ON COMMUNICATIONS (ICC), IEEE, 20 May 2019 (2019-05-20), pages 1 - 6, XP033582312, DOI: 10.1109/ICC.2019.8761829
F. ZARDI ET AL.: "Artificial Intelligence for Adaptive and Reconfigurable Antenna Arrays: A Review", IEEE ANTENNAS AND PROPAGATION MAGAZINE, vol. 63, 3 December 2020 (2020-12-03), pages 28, XP011858428, DOI: 10.1109/MAP.2020.3036097

Attorney, Agent or Firm:

SJÖBERG, Mats (SE)

Claims:

CLAIMS

1. An antenna calibration control method for regulating antenna calibration intervals of an antenna unit (120) comprising one or more antenna elements in a wireless communication system, the method comprising: obtaining (210) system performance parameters related to needs of calibration of the antenna unit (120), wherein the system performance parameters comprise information on the previous antenna calibrations, information obtained from an antenna supervision unit (130) and information obtained from an antenna calibration process unit (140) and/or a baseband processing unit (150); evaluating (220) the system performance parameters; determining (230) whether to trigger an antenna calibrating process or not based on the evaluation result, thereby regulating the antenna calibration intervals between antenna calibration measurements.

2. The antenna calibration control method according to claim 1 , wherein the system performance parameters comprise any one or a combination of: a) time elapsed since last antenna calibration measurement; b) average uplink and downlink throughput per active user; c) number of unique beams used by an antenna element over time; d) average uplink transmission power of terminals; e) average beam quality reported by terminals; f) measured peak-to-null beamforming ratio of an antenna element; g) amount of phase rotation in calculated compensation factors for an antenna element compared to the previous antenna calibration measurement; h) time of day and day of year; i) antenna calibration signal transmitting power and antenna calibration signal receiver gain; j) number of connected terminals to a network node; k) external ambient temperature outside the antenna unit;

l) internal hardware component temperatures; m) antenna unit age; n) signal to interference and noise ratio between a captured antenna calibration signal and the surrounding radio frequency environment; o) number of Multi-User Multiple Input Multiple Output, MIMO, Single-User MIMO and number of beams.

3. The antenna calibration control method according to any one of claims 1 -2, wherein evaluating the system performance parameters is performed by executing a machine learning process taking the system performance parameters as learning inputs to the machine learning process.

4. The antenna calibration control method according to claim 3, wherein the machine learning process is a standard reinforcement learning process, and wherein the reinforcement learning process comprises an interpreter (320) and an agent (330), and wherein the interpreter (320) is configured to: receive the system performance parameters as learning inputs; divide the learning inputs into a first group and a second group, wherein the first group is for calculating a reward, and the second group is a set of states representing the current states of the system; calculate a reward based on the first group of the learning inputs; and input the reward and the second group of the learning inputs to the agent; and wherein the agent (330) is configured to: receive the set of states and the reward from the interpreter; select an action from a predefined set of actions based on the set of states and the reward, wherein the predefined set of actions comprises triggering of a full-band antenna calibration measurement, triggering of a sub-band antenna calibration measurement, no triggering of an antenna calibration measurement; and trigger an antenna calibration measurement unit (160) to perform the selected action.

5. The antenna calibration control method according to claim 4, wherein the reward is calculated based on a combination of weighted system performance parameters and the reward represents a short-time feedback of the system performance after an action is performed and the state of the system is changed from the current state to a next state, and wherein a total reward is calculated based on the short-time reward.

6. The antenna calibration control method according to claim 5, wherein the target of the machine learning process is to select best actions that maximize the total reward.

7. The antenna calibration control method according to any one of claims 4-6, wherein the reinforcement learning process is implemented by a Q-learning network, and the reinforcement learning process comprises the following steps: initiating (510) the Q-learning network with multiple Q-value functions by setting initial weights for the Q-value functions; acquiring (520) the system performance parameters as learning inputs to the Q-learning network; setting (530) a set of states of the system according to the learning inputs; selecting (540) an action from the predefined set of actions; triggering (550) an antenna calibration measurement unit to perform the selected action; obtaining (560) a reward from the feedback of the system performance parameters after the selected action is performed; calculating (570) a total reward with the Q-value functions based on the reward; updating (580) weights of the Q-value functions; and continuing to perform steps 510-570 to achieve an optimized Q-value thereby setting the antenna calibration intervals between antenna calibration measurements adaptively.

8. The antenna calibration control method according to any one of claims 4-7, wherein triggering of a sub band antenna calibration measurement comprises: triggering an antenna calibration measurement unit (160) to send antenna calibration signals covering a part of a carrier frequency or a few subcarriers; evaluating the system performance parameters comprises evaluating phase drifts of the antenna calibration signals between the antenna calibration signals measurements; and determining to trigger a full-band antenna calibration measurement or not based on the evaluation result.

9. The antenna calibration control method according to any one of claims 1 -8 is implemented in a network node (1000).

10. The antenna calibration control method according to any one of claims 1-8 is implemented in a cloud.

11. The antenna calibration control method according to any one of claims 1-8 is implemented in an Open Radio Area Network, O-RAN, or Open Radio Unit, O-RU.

12. An antenna calibration control unit (110) configured to perform the method according to any one of claims 1-8.

13. A network node (1000) comprising an antenna calibration control unit (110) according to claim 12.

14. A computer program product (1060) comprising instructions which when the program is executed by a computer, cause the computer to carry out the antenna calibration control method according to any one of the claims 1-8.

15. A data processing system comprising a processor configured to perform the antenna calibration control method according to any one of the claims 1-8.

Description:

ANTENNA CALIBRATION CONTROL FOR ADAPTING ANTENNA CALIBRATION INTERVALS

TECHNICAL FIELD

Embodiments herein relate to an antenna calibration control unit and method. In particular, they relate to controlling antenna calibrations, e.g., for a network node, in a wireless communication system.

BACKGROUND

Radio units, such as transceivers etc. in a wireless communication system usually use multiple antennas to perform beamforming for spatial data multiplexing. Multiple antennas can significantly increase the data rates and reliability of a wireless communication system. The performance is improved if both the transmitter and the receiver are equipped with multiple antennas, which results in a Multiple-Input Multiple-Output (MIMO) communication channel.

Accurate beamforming relies on the multiple antennas having a closely synchronized amplitude and phase response. To achieve this, the multiple antennas must be continuously calibrated while the radio unit is in use.

This runtime antenna calibration can be performed by periodically injecting an antenna calibration signal into the multiple antennas and capturing it after it has passed through each antenna path. This gives a measurement of each antenna path, which can then be used to compensate for differences in amplitude and phase response between the multiple antennas.

The antenna calibration signal injected into the multiple antennas consumes time and frequency resources in the ongoing radio transmission, which could otherwise have been used to transmit traffic data between a network node or base station and terminals in a cell of the wireless communication system. This reduces cellular data traffic throughput and can cause unnecessary handovers.

The problem becomes worse the more often the antennas require calibration. This has particularly negative effect on millimeter-wave radios, which due to their high radio frequency exhibit faster phase drift than radios in the lower frequency spectrum.

In addition, calculating appropriate compensation factors for the different antennas based on the measurement requires significant signal processing resources, leading to higher energy usage and increased hardware costs.

SUMMARY

It is therefore an object of embodiments herein to provide an improved method for antenna calibration in a wireless communication system.

According to one aspect of embodiments herein, the object is achieved by an antenna calibration control unit and method therein for regulating antenna calibration intervals of an antenna unit comprising one or more antenna elements in a wireless communication system.

The antenna calibration control unit obtains system performance parameters related to needs of calibration of the antenna unit. The system performance parameters comprise information on the previous antenna calibrations, information obtained from an antenna supervision unit and information obtained from an antenna calibration process unit and/or a baseband processing unit.

The antenna calibration control unit evaluates the system performance parameters and determines whether to trigger an antenna calibrating process or not based on the evaluating result, thereby regulating the antenna calibration intervals between antenna calibration measurements.

The system performance parameters may be evaluated by executing a machine learning process taking the system performance parameters as learning inputs to the machine learning process.

According to some embodiments herein, the machine learning process is configured to select an action from a predefined set of actions comprising triggering of a full-band antenna calibration measurement, triggering of a sub-band antenna calibration measurement, and no triggering of an antenna calibration measurement.

In other words, the embodiments herein make use of a machine learning process, such as reinforcement learning, to form a picture of multiple environmental factors that can affect antenna calibration performance. The embodiments herein make use of reinforcement learning to evaluate and weigh multiple traffic performance factors that affect the need for antenna recalibration.

According to the embodiments herein, the interval between the antenna calibration measurements may be regulated to maximize the wireless communication system performance. For example, higher data throughput and lower energy consumption may be achieved from better beamforming when needed, and from antenna calibration avoidance when beamforming requirements are low.

According to the embodiments herein, the interval between the antenna calibration measurements may also be regulated to minimize traffic impact. For example, a longer interval between the antenna calibration measurements reduces the impact on the ongoing traffic, since the antenna calibration signals then occupy fewer time and frequency resources than with a shorter interval.

The embodiments herein use a special sub-band antenna calibration measurement method which allows evaluation of antenna calibration performance without the need to fully calibrate the antennas. The sub-band antenna calibration measurement is here used to detect antenna calibration performance, not to fully calibrate the antennas. That is, the sub-band antenna calibration measurement does not apply compensation factors or adjust the phase and amplitude of the beams from the antenna unit.

The embodiments herein increase traffic throughput and decrease handover probability by reducing the amount of antenna calibration interruptions over a given period.

The embodiments herein save energy and radio processing capacity by increasing the interval between antenna calibrations, since each calibration requires significant data processing to analyze the captured signal and derive the appropriate compensation factors.

By considering both radio metrics and baseband processing metrics, the embodiments herein can optimize the intervals of antenna calibration measurements to maintain a desired accuracy of an antenna calibration compensation. The embodiments herein can also opportunistically reduce the required accuracy of the antenna calibration compensation at times when the wireless communication system is not required to provide high-precision beamforming, such as that needed for a high number of Multi-User MIMO (MU-MIMO) users and layers. They can also opportunistically increase the required accuracy of antenna calibration at times when the wireless communication system is required to provide high-precision beamforming.

Therefore, embodiments herein provide an improved method for antenna calibration.

BRIEF DESCRIPTION OF THE DRAWINGS

Examples of embodiments herein are described in more detail with reference to attached drawings in which:

Figure 1 is an overview of an antenna calibration system according to embodiments herein;

Figure 2 is a signal flowchart illustrating an antenna calibration sequence according to embodiments herein;

Figure 3 is a schematic view of a machine learning process according to embodiments herein;

Figure 4 is a schematic view of a Q-network function;

Figure 5 is a flow chart of a reinforcement learning process used in an antenna calibration control method according to embodiments herein;

Figure 6 shows a model of antenna calibration according to embodiments herein;

Figure 7 shows an example of pilot antenna calibration signals for sub-band antenna calibration according to embodiments herein;

Figure 8 shows an example cloud implementation for antenna calibration control according to embodiments herein;

Figure 9 shows an example O-RAN/O-RU implementation for antenna calibration control according to embodiments herein; and

Figure 10 is a block diagram illustrating a network node in which an antenna calibration control unit and method according to embodiments herein may be implemented.

DETAILED DESCRIPTION

Figure 1 shows a schematic overview of an antenna calibration system 100. The antenna calibration system 100 comprises an antenna calibration control unit 110 and an antenna unit 120 that needs calibration. The antenna unit 120 may comprise one or more antenna elements. The antenna calibration system 100 further comprises an antenna supervision unit 130 configured to capture runtime information about the antenna unit 120, an antenna calibration process unit 140 configured to produce compensation factors from antenna calibration measurement results, a baseband processing unit 150 configured to process data received from the antenna unit 120 and data to be sent by the antenna unit 120, and an antenna calibration measurement unit 160 configured to inject antenna calibration signals to the antenna unit 120 and capture or measure the antenna calibration signals. The arrows in Figure 1 are used only to indicate information or data flow between different units.

The antenna calibration control unit 110 is configured to perform an antenna calibration control method for regulating antenna calibration intervals of the antenna unit 120. The antenna calibration control unit 110 is configured to trigger antenna calibration measurements when needed, using inputs from the antenna supervision unit 130, the antenna calibration process unit 140 and/or the baseband processing unit 150. The antenna calibration control method for regulating antenna calibration intervals of the antenna unit 120 will be described in detail with reference to Figure 2. The method comprises the following actions.

Action 210

The antenna calibration control unit 110 obtains system performance parameters related to needs of calibration of the antenna unit 120. The system performance parameters comprise information on the previous antenna calibrations, information obtained from the antenna supervision unit 130 and information obtained from the antenna calibration process unit 140 and/or the baseband processing unit 150.

In other words, when the antenna unit 120 is in active use, the antenna calibration control unit 110 collects inputs from the baseband processing unit 150, the antenna supervision unit 130, and the antenna calibration process unit 140.

The system performance parameters may comprise any one or a combination of: a) time elapsed since last antenna calibration measurement; b) average uplink and downlink throughput per active user; c) number of unique beams used by an antenna element over time; d) average uplink transmission power of terminals; e) average beam quality reported by terminals; f) measured peak-to-null beamforming ratio of an antenna element; g) amount of phase rotation in calculated compensation factors for an antenna element compared to the previous antenna calibration measurement; h) time of day and day of year; i) antenna calibration signal transmitting power and antenna calibration signal receiver gain; j) number of connected terminals to a network node; k) external ambient temperature outside the antenna unit; l) internal hardware component temperatures; m) antenna unit age; n) signal to interference and noise ratio between a captured antenna calibration signal and the surrounding radio frequency environment; o) number of Multi-User MIMO, Single-User MIMO and number of beams.

From which unit the above system performance parameters may be collected will be described in a later section.

Action 220

The antenna calibration control unit 110 evaluates the system performance parameters.

According to some embodiments herein, the antenna calibration control unit 110 evaluates the system performance parameters by executing a machine learning process taking the system performance parameters as learning inputs to the machine learning process. The machine learning process will be described in detail in a later section.

Action 230

The antenna calibration control unit 110 determines whether to trigger an antenna calibrating process or not based on the evaluating result, thereby regulating the antenna calibration intervals between antenna calibration measurements.

If an antenna calibration measurement is needed, the antenna calibration control unit 110 triggers an antenna calibrating process via the antenna calibration measurement unit 160. During the antenna calibrating process, the antenna calibration measurement unit 160 sends or injects an antenna calibration signal to the antenna unit 120, measures the antenna calibration signal with an antenna calibration signal receiver, and then reports the measurements to the antenna calibration process unit 140. If no measurement is needed, the antenna calibration control unit 110 continues to collect learning inputs and evaluate them using e.g. the machine learning process.
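Actions 210 to 230 can be read as one pass of a control loop. The following is a minimal Python sketch of that reading, not the disclosed implementation. The function names, the parameter key `seconds_since_last_calibration` and the fixed 600-second threshold are all hypothetical; the threshold rule merely stands in for the machine learning evaluation described later.

```python
from typing import Callable, Dict


def calibration_control_step(
    obtain: Callable[[], Dict[str, float]],
    evaluate: Callable[[Dict[str, float]], bool],
    trigger: Callable[[], None],
) -> bool:
    """Run one pass of Actions 210-230; return True if calibration was triggered."""
    params = obtain()          # Action 210: obtain system performance parameters
    needed = evaluate(params)  # Action 220: evaluate the parameters
    if needed:                 # Action 230: decide whether to trigger
        trigger()
    return needed


def example_evaluate(params: Dict[str, float]) -> bool:
    """Hypothetical placeholder rule: calibrate if more than 600 s have elapsed."""
    return params["seconds_since_last_calibration"] > 600.0
```

Calling `calibration_control_step` repeatedly during runtime would then yield calibration intervals that follow from the evaluation rather than from a fixed timer.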

Figure 3 is a block diagram showing an example of a machine learning process according to embodiments herein. The purpose of the machine learning process is to either trigger an Antenna Calibration (AC) measurement or not, based on the present and past system state. This decision is taken repeatedly throughout the active runtime of the antenna unit 120, thereby regulating the interval between AC measurements.

The machine learning process shown in Figure 3 is a standard reinforcement learning process. It collects the system state from a set of learning inputs. The learning inputs to the algorithm are different measured aspects of the system state in and around a network node or base station comprising the antenna unit 120. As shown in Figure 3, the reinforcement learning process comprises an Environment 310, an Interpreter 320, and an Agent 330. The Environment 310 represents all sources of learning inputs listed in the system performance parameters above. The Interpreter 320 and Agent 330 form the learning algorithm, which calculates a reward from selected learning inputs and takes a decision on triggering AC or not.

The Interpreter 320 is configured to receive the system performance parameters as learning inputs and divide the learning inputs into a first and a second group. The first group is for calculating a reward, for example, the system performance parameters b), e), f) etc. listed above. The second group is a set of states that represents the current states of the system, for example, the system performance parameters a), c), d), g), h), i), j), k), l), m), n) etc. listed above.

The Interpreter 320 is further configured to calculate a reward based on the first group of the learning inputs and input the reward and the second group of the learning inputs to the Agent 330.

The Agent 330 is configured to receive the set of states that represents the current states of the system and the reward from the Interpreter 320.

The Agent 330 is further configured to select an action from a predefined set of actions based on the set of states and the reward. The predefined set of actions may comprise triggering of a full-band antenna calibration measurement, triggering of a sub-band antenna calibration measurement, and no triggering of an antenna calibration measurement. The sub-band AC measurement is here used to detect AC performance, not to fully calibrate the antennas. That is, the sub-band antenna calibration measurement does not apply compensation factors or adjust the phase and amplitude for the antenna unit 120, which is done in the full-band antenna calibration.

The Agent 330 is further configured to trigger the antenna calibration measurement unit 160 to perform the selected action.

The reward represents a short-time feedback of the system performance after an action is performed and the state of the system is changed from the current state to a next state. The reward may be calculated based on a combination of weighted system performance parameters. Then a total reward may be calculated based on the short-time reward as will be described in the following section. The target of the machine learning process is to select best actions that maximize the total reward.
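The division of learning inputs by the Interpreter and the predefined set of actions available to the Agent can be sketched as follows. This is an illustrative Python sketch only. The names `Action`, `REWARD_KEYS` and `split_inputs` are hypothetical, and keying the inputs by the letters a) to o) is an assumption made for brevity; the reward group b), e), f) follows the example given in the text.

```python
from enum import Enum
from typing import Dict, Tuple


class Action(Enum):
    """The predefined set of actions described in the text."""
    FULL_BAND = "trigger full-band AC measurement"
    SUB_BAND = "trigger sub-band AC measurement"
    NO_TRIGGER = "no AC measurement"


# Example grouping from the text: b), e), f) feed the reward (assumed keys).
REWARD_KEYS = ("b", "e", "f")


def split_inputs(
    inputs: Dict[str, float],
) -> Tuple[Dict[str, float], Dict[str, float]]:
    """Divide learning inputs into a reward group and a state group."""
    reward_group = {k: v for k, v in inputs.items() if k in REWARD_KEYS}
    state_group = {k: v for k, v in inputs.items() if k not in REWARD_KEYS}
    return reward_group, state_group
```

The reward group would then be reduced to a single reward value, and the state group passed to the Agent together with that reward.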

In the following some examples are given on how the system performance parameters are used by the machine learning process to optimize the intervals of antenna calibration measurements.

Some inputs reflect system performance and may be influenced by the learning algorithm’s decisions on how frequently antenna calibration measurements are triggered. These will be combined into the reward calculated by the Interpreter and used by the Agent, as outlined above. These inputs and their influence on the reward include, for example:

• Parameters from the antenna calibration control unit 110: a) Time elapsed since last measurement. This parameter is the information on the previous antenna calibrations. Long intervals are better for system performance, since the AC signal can disturb traffic data. The learning algorithm will be rewarded for maximizing this input.

• Parameters from the baseband processing unit 150:

b) Average uplink and downlink throughput per active user. The learning algorithm will be rewarded for maximizing this input.

c) Number of unique beams used over time. Higher quality antenna calibration may be associated with the antenna element being able to use more unique beams for uplink and downlink transmissions. The learning algorithm will be rewarded for maximizing this input.

d) Average uplink transmission power of terminals. Better antenna calibration may lead to higher uplink sensitivity, which corresponds to on average lower required uplink transmission power in connected terminals. The learning algorithm will be rewarded for minimizing this input.

e) Average beam quality reported by terminals via Channel State Information (CSI) reports. Accurate antenna calibration should improve the average quality of beams. Candidate measurements in CSI reports include Channel Quality Indicator (CQI), Rank Indicator (RI), Precoding Matrix Indicator (PMI) and Layer 1 Reference Signal Received Power (L1-RSRP). The learning algorithm will be rewarded for maximizing this input.

• Parameters from the antenna supervision unit 130: f) Measured peak-to-null beamforming ratio. Accurate antenna calibration will increase the power ratio between a peak beam and a null beam. The learning algorithm will be rewarded for maximizing this input.

• Parameters from the antenna calibration process unit 140: g) Amount of phase rotation in the calculated compensation factors compared to the previous AC measurement. Higher levels of phase rotation to correct for AC errors indicate that the system's beamforming accuracy has decreased further. This input can be collected by full-band or sub-band AC measurements, as will be described in the following section. The learning algorithm will be rewarded for minimizing this input.
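As an illustration of input g), the amount of phase rotation between two sets of compensation factors could be quantified as sketched below. The text does not specify how this input is computed, so this is an assumed metric: it treats each compensation factor as one complex number per antenna path and reports the mean absolute phase change in degrees. The function name `phase_rotation_deg` is hypothetical.

```python
import cmath
import math
from typing import Sequence


def phase_rotation_deg(
    previous: Sequence[complex], current: Sequence[complex]
) -> float:
    """Mean absolute phase change, in degrees, between two sets of
    complex compensation factors (one factor per antenna path)."""
    if len(previous) != len(current) or not previous:
        raise ValueError("factor sets must be non-empty and of equal length")
    total = 0.0
    for c_prev, c_cur in zip(previous, current):
        # Phase of c_cur relative to c_prev, wrapped to (-pi, pi].
        total += abs(cmath.phase(c_cur * c_prev.conjugate()))
    return math.degrees(total / len(previous))
```

A larger value would suggest more drift since the previous AC measurement, which is the quantity the learning algorithm is rewarded for minimizing.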

Other inputs are not influenced by the algorithm’s decisions and are listed below because they affect how often antenna calibration measurements need to be taken. These inputs include:

• Parameters from the antenna calibration control unit 110:

h) Time of day and day of year. The calibration quality may be less important depending on recurring events over time, and the meaning of other learning inputs may also vary periodically over time.

i) Antenna calibration signal transmitting power and antenna calibration signal receiver gain. These may affect the quality of the AC measurement but may also be associated with higher interference in the radio environment.

• Parameters from the baseband processing unit 150: j) Number of connected terminals to a network node comprising the antenna unit 120. This may influence the average throughput, number of unique beams, transmission power, and beam quality inputs.

• Parameters from the antenna supervision unit 130: k) External ambient temperature outside the antenna unit 120. This may affect the rate of phase drift and aging in the antenna unit 120.

l) Internal hardware component temperatures. This can affect the phase drift in the radio processing chain comprising e.g. amplifiers, filters, switches, mixers, analog-to-digital converter (ADC), digital-to-analog converter (DAC) etc.

m) Age of the antenna unit 120. This may affect the rate of phase drift in the radio processing chain.

• Parameters from the antenna calibration process unit 140: n) Signal to interference and noise ratio (SINR) between a captured AC signal and the surrounding radio frequency (RF) environment. A low ratio may make the antenna calibration process unit 140 produce less accurate compensation factors, which may lead to a need for more frequent calibrations to maintain compensation accuracy. On the other hand, high noise and interference may also reduce the ability of the system to make use of precise beamforming, leading to lower requirements for AC compensation accuracy.

• Other parameters: o) Number of Multi-User MIMO, Single-User MIMO and number of beams. Different numbers of MU-MIMO, SU-MIMO and beams place different requirements on AC compensation accuracy. A higher number of MU-MIMO and beams requires a better AC compensation accuracy, while SU-MIMO has a lower requirement on AC compensation accuracy. Hence the algorithm can adjust the AC interval to provide different AC compensation accuracy according to the requirements of the numbers of MU-MIMO, SU-MIMO and beams. The solution here therefore provides on-demand AC according to the wireless communication system requirements.

The system performance may be represented by, for example, a formula:

$$p_t = u_1 x_1 + u_2 x_2 + u_3 x_3 + \cdots$$

where $p_t$ is the weighted system performance at time $t$, formed from a set of selected learning inputs $X = [x_1, x_2, x_3, \ldots]$, and $u_1, u_2, u_3, \ldots$ are the weights for the selected learning inputs $x_1, x_2, x_3, \ldots$ respectively. The selected learning inputs may be e.g.:

$x_1$: average uplink and downlink throughput per active user;

$x_2$: average beam quality reported by terminals;

$x_3$: measured peak-to-null beamforming ratio of an antenna;

$r$ is the reward representing the feedback of performance from the system, and may be calculated as e.g.:

$r = 1$, if $p_t > p_{t-1}$ (improvement on performance)

$r = 0$, if $p_t = p_{t-1}$ (no improvement on performance)

$r = -1$, if $p_t < p_{t-1}$ (degradation on performance)

where $p_{t-1}$ is the system performance from last time.
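The weighted performance and the three-valued reward above can be written directly in code. The following Python sketch is illustrative only; the function names and the example weights and input values are hypothetical and not taken from the disclosure.

```python
from typing import Sequence


def weighted_performance(
    inputs: Sequence[float], weights: Sequence[float]
) -> float:
    """Weighted system performance: sum of weight * selected learning input."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    return sum(u * x for u, x in zip(weights, inputs))


def reward(p_t: float, p_prev: float) -> int:
    """Reward of +1, 0 or -1 for improved, unchanged or degraded performance."""
    if p_t > p_prev:
        return 1
    if p_t < p_prev:
        return -1
    return 0


# Hypothetical example: throughput, beam quality, peak-to-null ratio.
p_now = weighted_performance([2.0, 4.0, 10.0], [0.5, 0.25, 0.1])
```

In practice the inputs would likely need normalizing to comparable scales before weighting, since throughput, beam quality and power ratios have different units; that step is an assumption and is not described in the text.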

In the following, the machine learning process will be described in detail with reference to Figures 4 and 5. Figure 4 shows a Q-learning network defining the reinforcement learning process. Q-learning is an off-policy reinforcement learning algorithm that seeks to find the best action to take given the current state. It’s considered off-policy because the q-learning function learns from actions that are outside the current policy, like taking random actions, and therefore a policy isn’t needed. More specifically, q-learning seeks to learn a policy that maximizes the total reward. The ‘q’ in q-learning stands for quality. Quality in this case represents how useful a given action is in gaining some future reward.

The learning process is defined as below, where $\beta$ is the learning rate or step size, which determines to what extent the new value is accepted versus the old value. Eventually an optimized $Q(s, a, w)$ can be found after the learning process and it can be used to set the AC interval adaptively.

$$Q(s, a, w) \leftarrow (1 - \beta)\, Q(s, a, w) + \beta\, r + \beta \gamma \max_{a'} Q(s', a', w)$$

where $r$ is the reward that represents the one-time or short-term feedback of the system performance after an action is performed at a given state, $\gamma$ is the discount rate used to balance between the short-term or one-time reward and the long-term reward, $w$ is the weight, and $s'$ is the state reached after the action.

As can be seen from the equation, the new $Q(s, a, w)$ is the sum of three factors:

$(1 - \beta)\, Q(s, a, w)$: the current Q value weighted by one minus the learning rate. Values of the learning rate close to 1 make Q change faster.
$\beta r$: the reward obtained if an action is taken in state $s$, weighted by the learning rate.
$\beta \gamma \max_{a'} Q(s', a', w)$: the maximum reward that can be obtained from state $s'$, weighted by the learning rate and the discount rate.
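The three factors above combine into a single update step. The following Python sketch evaluates that sum for one transition; the function name `q_update` and the numeric values are assumptions chosen only to make the arithmetic easy to follow.

```python
def q_update(q_current: float, r: float, q_next_max: float,
             beta: float, gamma: float) -> float:
    """Return (1 - beta)*Q + beta*r + beta*gamma*max_a' Q(s', a')."""
    retained = (1.0 - beta) * q_current      # current Q value, scaled by (1 - beta)
    immediate = beta * r                     # one-time reward, scaled by beta
    future = beta * gamma * q_next_max       # best next-state value, scaled by beta*gamma
    return retained + immediate + future


# Illustrative values only: learning rate 0.5, discount rate 0.9.
q_new = q_update(q_current=2.0, r=1.0, q_next_max=3.0, beta=0.5, gamma=0.9)
# 0.5*2.0 + 0.5*1.0 + 0.5*0.9*3.0 = 1.0 + 0.5 + 1.35 = 2.85
```

With a learning rate of 0 the old value is returned unchanged, and with a learning rate of 1 the old value is discarded entirely, which matches the role of the step size described above.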

$Q(s, a)$ is the expected total reward over a series of actions and can be calculated with the Bellman equation:

$$Q(s, a) = E\left[r_{t+1} + \gamma r_{t+2} + \gamma^2 r_{t+3} + \cdots \mid s, a\right]$$

The Bellman equation is a recursive formula that forms the basis for dynamic programming. It computes the expected total reward of taking an action from a state in a Markov decision process by breaking it into the immediate reward and the total expected future reward, where E denotes the expectation.
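For a single observed sequence of rewards, the discounted sum inside the expectation can be computed directly. The sketch below shows that sum for one sample sequence; Q(s, a) is the expectation of this quantity over many such sequences. The function name and the reward values are assumptions for illustration.

```python
from typing import Sequence


def discounted_return(rewards: Sequence[float], gamma: float) -> float:
    """Return r_1 + gamma*r_2 + gamma^2*r_3 + ... for one reward sequence."""
    total = 0.0
    for k, r in enumerate(rewards):
        total += (gamma ** k) * r  # the k-th future reward is discounted k times
    return total


# Illustrative rewards from four consecutive decisions, discount rate 0.5.
g = discounted_return([1, 0, -1, 1], gamma=0.5)
# 1 + 0.5*0 + 0.25*(-1) + 0.125*1 = 0.875
```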

An optimal value function with Bellman Expectation equation is:

$$Q^*(s, a) = E\left[r + \gamma \max_{a'} Q^*(s', a') \mid s, a\right]$$

The optimal $Q^*(s, a)$ can be represented by a Q-network function $Q(s, a, w)$ with weights $w$; the weights $w$ are updated so that $Q(s, a, w)$ approximates $Q^*(s, a)$:

$$Q(s, a, w) \approx Q^*(s, a),$$

where $s$ is the state of the system, which can be represented by the value of the vector $X = [x_1, x_2, x_3, \ldots]$ of learning inputs described above; $s$ may have more inputs in some cases or applications. $a$ is taken from a predefined set of actions, which may comprise:

a1: trigger a full-band antenna calibration measurement. In a full-band antenna calibration measurement, the antenna calibration process unit 140 and the antenna calibration measurement unit 160 are triggered to start an antenna calibration process to produce compensation factors and adjust the phase and amplitude for the antenna unit 120.
a2: trigger a sub-band antenna calibration measurement.
a3: do not trigger an antenna calibration measurement.

Reward $r$ is the one-time or short-term feedback of performance. The target of the learning process is to maximize the long-term reward $Q(s, a)$.

For example, the one-time reward might be low when the AC interval is short, because a short interval causes more traffic interruption. However, the long-term reward might be high in scenarios that require very high calibration accuracy.

As shown in Figure 5, the reinforcement learning process comprises the following steps.

Step 510 Initiating the Q-learning network with multiple Q-value functions by setting initial weights for the Q-value functions.

Step 520 Acquiring the system performance parameters as learning inputs to the Q- learning network.

Step 530 Setting a set of states of the system according to the learning inputs.

Step 540 Selecting an action from the predefined set of actions.

Step 550 Triggering the antenna calibration measurement unit 160 to perform the selected action.

Step 560 Obtaining a reward from the feedback of the system performance parameters after the selected action is performed.

Step 570 Calculating a total reward with the Q-value functions based on the reward.

Step 580 Updating weights of the Q-value functions.

The reinforcement learning process is a loop process and continues to perform steps 520-580 to achieve an optimized Q-value thereby setting the antenna calibration intervals between antenna calibration measurements adaptively.
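Steps 510 to 580 can be sketched as a loop in Python. This is a simplified illustration under several assumptions that are not part of the embodiments: a linear function Q(s, a, w) = w[a] · s stands in for the Q-network, actions are chosen epsilon-greedily, and `toy_env_step` is a hypothetical simulator in which phase drift grows until a full-band calibration resets it. All names and numeric values are invented for the example.

```python
import random
from typing import Callable, List, Sequence, Tuple

ACTIONS = ("full_band", "sub_band", "none")  # a1, a2, a3


def q_value(w: List[List[float]], s: Sequence[float], a: int) -> float:
    """Linear stand-in for the Q-network: Q(s, a, w) = w[a] . s."""
    return sum(wi * si for wi, si in zip(w[a], s))


def select_action(w: List[List[float]], s: Sequence[float],
                  epsilon: float, rng: random.Random) -> int:
    """Step 540: explore with probability epsilon, otherwise act greedily."""
    if rng.random() < epsilon:
        return rng.randrange(len(ACTIONS))
    values = [q_value(w, s, a) for a in range(len(ACTIONS))]
    return values.index(max(values))


def toy_env_step(s: Sequence[float], a: int) -> Tuple[List[float], int]:
    """Hypothetical system model; state is [bias, normalized phase drift]."""
    drift = s[1]
    if a == 0:                       # full-band AC removes drift, interrupts traffic
        new_drift, cost = 0.0, 0.3
    elif a == 1:                     # sub-band AC only measures, small cost
        new_drift, cost = min(1.0, drift + 0.1), 0.05
    else:                            # no measurement, drift keeps growing
        new_drift, cost = min(1.0, drift + 0.1), 0.0
    p_prev, p_now = -drift, -new_drift - cost
    r = 1 if p_now > p_prev else (-1 if p_now < p_prev else 0)
    return [1.0, new_drift], r


def run(env_step: Callable[[Sequence[float], int], Tuple[List[float], int]],
        initial_state: Sequence[float], steps: int = 200, beta: float = 0.1,
        gamma: float = 0.9, epsilon: float = 0.1, seed: int = 0) -> List[List[float]]:
    rng = random.Random(seed)
    w = [[0.0] * len(initial_state) for _ in ACTIONS]   # Step 510: initial weights
    s = list(initial_state)                             # Steps 520-530: inputs -> state
    for _ in range(steps):
        a = select_action(w, s, epsilon, rng)           # Step 540
        s_next, r = env_step(s, a)                      # Steps 550-560
        best_next = max(q_value(w, s_next, b) for b in range(len(ACTIONS)))
        td_error = r + gamma * best_next - q_value(w, s, a)   # Step 570
        w[a] = [wi + beta * td_error * si for wi, si in zip(w[a], s)]  # Step 580
        s = s_next
    return w


weights = run(toy_env_step, initial_state=[1.0, 0.0])
```

The returned weights define one linear Q-value function per action; a deployed system would replace the simulator with the real feedback of the system performance parameters.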

The antenna calibration control unit 110 uses the machine learning process described above to process the learning inputs. The result is a decision on whether an antenna calibration measurement is needed. If a full-band antenna calibration measurement is needed, the antenna calibration process unit 140 and the antenna calibration measurement unit 160 are triggered to start an antenna calibration process to produce compensation factors and adjust the phase and amplitude for the antenna unit 120. If no antenna calibration measurement is needed, the antenna calibration control unit 110 continues to collect learning inputs and evaluate them using the machine learning process. The sub-band AC measurement is used here to detect AC performance, e.g. by detecting phase drifts of antenna calibration signals, not to fully calibrate the antennas. This allows the sub-band AC measurement to have a much lower impact on traffic data.

In the following section, the full-band and sub-band antenna calibration measurements will be described.

Figure 6 shows an example model of antenna calibration. This is a general model for an antenna calibration measurement. For a full-band calibration measurement, the antenna calibration signals $X_k$ are injected into an antenna unit 120 comprising multiple antenna elements or branches 611, 612, ..., such that $X_1$ goes to the first antenna element 611, etc. Signal $Y$ is received from an antenna calibration signal receiver Cal TRX 620. The antenna calibration signal receiver Cal TRX 620 is coupled to a calibration network 630, which is coupled to each of the antenna elements by a coupling element 641, 642, ...

where $N_{ANT}$ is the number of antenna branches in the antenna unit 610 and $N_{sc}$ is the number of subcarriers occupied by the calibration signals. For a typical application, $N_{sc}$ is set to a number that covers the full carrier frequency band. The channel state information of each antenna branch $H_k$ can be calculated as below:

Please note that the calibration signals $X_k$ can be separated by any division method, such as time or code division.
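As a minimal sketch of such a calculation, assume time-division separation so that only branch k transmits at a time, and assume the receiver observes Y_k[n] = H_k[n] · X_k[n] on each subcarrier n with negligible noise. Under those assumptions a per-subcarrier estimate is H_k[n] = Y_k[n] / X_k[n]. This signal model, the function name and the example values are assumptions made for illustration and are not necessarily the formula used in the embodiments.

```python
import cmath
from typing import List, Sequence


def estimate_csi(x: Sequence[complex], y: Sequence[complex]) -> List[complex]:
    """Per-subcarrier estimate H_k[n] = Y_k[n] / X_k[n] for one branch k.

    Assumes Y_k[n] = H_k[n] * X_k[n] with negligible noise and a known,
    non-zero calibration signal X_k[n] (hypothetical model).
    """
    if len(x) != len(y):
        raise ValueError("x and y must cover the same subcarriers")
    return [yn / xn for xn, yn in zip(x, y)]


# Illustrative values: two subcarriers (N_sc = 2) for a single branch.
x_k = [1 + 0j, 0 + 1j]                                    # known calibration signal
h_true = [cmath.rect(0.5, 0.2), cmath.rect(0.8, -0.1)]    # gain and phase per subcarrier
y_k = [h * x for h, x in zip(h_true, x_k)]                # what Cal TRX would receive
h_est = estimate_csi(x_k, y_k)
```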

The difference of channel state information $H_k$ between two or multiple AC cycles can be used as learning input as below. $\Phi_{ij}$ is the phase difference between two AC cycles $i$ and $j$. The phase difference information indicates the phase drift of radio chains.

Where the radio chains may comprise receiver chains, transmitter chains or transceiver chains and/or the filter units, antenna units, subarrays etc. The receiver chains may comprise low noise amplifier, switches, filters, ADC etc. The transmitter chains may comprise power amplifier, DAC, switches, filters, etc. The transceiver chains may comprise clocks, digital unit, field-programmable gate array (FPGA), application-specific integrated circuit (ASIC) etc.
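One way to obtain a phase difference between two AC cycles from per-subcarrier channel estimates is sketched below. The sign convention (phase of cycle i relative to cycle j), the averaging over subcarriers and the function names are assumptions introduced for illustration.

```python
import cmath
from typing import List, Sequence


def phase_difference(h_i: Sequence[complex], h_j: Sequence[complex]) -> List[float]:
    """Per-subcarrier phase of cycle i relative to cycle j, in [-pi, pi]."""
    if len(h_i) != len(h_j) or not h_i:
        raise ValueError("both cycles must cover the same non-empty subcarrier set")
    # Multiplying by the conjugate subtracts the phases and handles wrap-around.
    return [cmath.phase(a * b.conjugate()) for a, b in zip(h_i, h_j)]


def mean_phase_drift(h_i: Sequence[complex], h_j: Sequence[complex]) -> float:
    """Arithmetic mean of the per-subcarrier phase differences (radians).

    A plain mean is adequate only when the drifts are small compared with pi.
    """
    diffs = phase_difference(h_i, h_j)
    return sum(diffs) / len(diffs)


# Illustrative channel estimates for one branch on two subcarriers.
h_cycle_j = [cmath.rect(1.0, 0.05), cmath.rect(1.0, 0.10)]   # earlier cycle
h_cycle_i = [cmath.rect(1.0, 0.10), cmath.rect(1.0, 0.20)]   # later cycle
drift = mean_phase_drift(h_cycle_i, h_cycle_j)               # about 0.075 rad
```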

For a sub-band calibration measurement, a pilot AC signal covering only part of the carrier frequency or subcarriers is used to detect and measure the phase drift over time. As shown in Figure 7, only a few pilot AC signals are sent to perform the measurement. In this method, $N_{sc}$ is much smaller than in the full-band measurement; for example, as shown in Figure 7, the pilot AC signal covers only two subcarriers.

The phase drift between the measurements may be used by the learning algorithm to decide the scheduling of a full-band AC. For example, the algorithm may decide to have a bigger AC interval if the phase drift is small. In the sub-band measurement, the occupied resources and the computational load will be much less than the full-band AC measurement.

Therefore, according to some embodiments herein, the action of triggering a sub-band antenna calibration measurement may comprise the following actions: triggering the antenna calibration measurement unit 160 to send antenna calibration signals covering a part of a carrier frequency or a few subcarriers; evaluating phase drifts of the antenna calibration signals between the antenna calibration signal measurements; and determining whether or not to trigger a full-band antenna calibration measurement based on the evaluation result.
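The evaluation and determination actions can be illustrated with a fixed-threshold rule in Python. In the embodiments the decision is made by the learning algorithm; the threshold rule below is a deliberately simpler stand-in, and the function name, threshold, growth factor and interval limits are all hypothetical.

```python
from typing import Tuple


def evaluate_sub_band(drift_rad: float, interval_s: float,
                      threshold_rad: float = 0.1, grow: float = 2.0,
                      min_interval_s: float = 1.0,
                      max_interval_s: float = 3600.0) -> Tuple[bool, float]:
    """Return (trigger_full_band, next_interval_s) from a measured phase drift.

    Large drift: request a full-band AC and fall back to the shortest interval.
    Small drift: skip the full-band AC and lengthen the interval, up to a cap.
    """
    if abs(drift_rad) > threshold_rad:
        return True, min_interval_s
    return False, min(max_interval_s, interval_s * grow)


# Small drift: no full-band AC, interval doubles from 60 s to 120 s.
small = evaluate_sub_band(drift_rad=0.02, interval_s=60.0)
# Large drift: full-band AC is requested and the interval is reset.
large = evaluate_sub_band(drift_rad=0.5, interval_s=60.0)
```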

According to some embodiments herein, the antenna calibration control method described above may be implemented in a cloud. In the cloud implementation, the functions of any one or all of the baseband processing unit 150, the antenna calibration control unit 110 and the antenna calibration process unit 140 may be deployed in a virtualized environment, as shown in Figure 8. The signaling sequences between the functions do not change if some or all of them are deployed in the cloud.

According to some embodiments herein, the antenna calibration control unit 110 and method therein described above may be implemented in an Open Radio Access Network (O-RAN) or Open Radio Unit (O-RU) as shown in Figure 9. O-RAN is a concept based on interoperability and standardization of RAN elements, including a unified interconnection standard for white-box hardware and open source software elements from different vendors. The O-RAN architecture integrates a modular base station software stack on off-the-shelf hardware, which allows baseband and radio unit components from discrete suppliers to operate seamlessly together. As illustrated in Figure 9, the antenna calibration control unit 110, the antenna calibration process unit 140, the antenna calibration measurement unit 160, the antenna supervision unit 130, the antenna unit 120 and a part of the baseband processing unit 150 may be deployed in the O-RAN/O-RU. This solution may limit the amount of learning inputs that can be retrieved from the baseband processing unit 150, as in some embodiments only its lower layers may be deployed inside the O-RU.

According to some embodiments herein, the antenna calibration control unit 110 and method therein may be implemented in a network node. Figure 10 shows an example of a network node 1000, which may be a base station, for example, an eNB, gNB, eNodeB, gNodeB, a Home NodeB, a Home eNodeB or a Home gNodeB. The network node 1000 may comprise a receiving module 1010, a transmitting module 1020, a processing module 1030 and a memory 1040. The network node 1000 may further comprise the antenna calibration control unit 110, the antenna unit 120, the antenna supervision unit 130, the antenna calibration process unit 140 and the antenna calibration measurement unit 160 described above with reference to Figure 1. The antenna calibration control unit 110 and method therein may be implemented in the processing module 1030. The antenna calibration control unit 110 and method therein may also be implemented through one or more processors, such as the processor 1050 in the network node 1000, together with computer program code for performing the functions and actions of the embodiments herein. The program code mentioned above may also be provided as a computer program product, for instance in the form of a computer readable medium or a data carrier 1060 carrying computer program code 1070, as shown in Figure 10, for performing the embodiments herein when being loaded into the network node 1000. One such carrier may be in the form of a CD-ROM disc. Other data carriers, such as a memory stick, are however also feasible. The computer program code may furthermore be provided as pure program code on a server or a cloud and downloaded to the network node 1000.

According to some embodiments herein, the antenna calibration control method according to embodiments herein may be implemented in a computer program product. The computer program product comprises instructions which when the program is executed by a computer, cause the computer to carry out the antenna calibration control method described above.

According to some embodiments herein, the antenna calibration control method according to embodiments herein may be implemented in a data processing system comprising a processor configured to perform the antenna calibration control method described above.

To summarize, the antenna calibration control method according to embodiments herein is an adaptive control mechanism for antenna calibration, which opportunistically increases or decreases the period between antenna measurements and calibrations to maximize useful beamforming accuracy and minimize the frequency of calibration events. Future high frequency band radio products have much stricter requirements for short intervals between antenna calibration measurements, while the complexity of the antenna calibration algorithm increases at the same time. Using fixed short intervals for such products may become prohibitively expensive due to the processing capacity needed for executing the antenna calibration algorithm. The embodiments herein save energy and radio processing capacity by increasing the interval between calibrations, since each calibration requires significant data processing to analyze the captured signal and derive the appropriate compensation factors.

The embodiments herein may regulate the interval between antenna calibration measurements to maximize system performance.

The embodiments herein may regulate the interval between antenna calibration measurements to minimize traffic impact.

The antenna calibration control method according to embodiments herein includes a special sub-band antenna calibration measurement method where the calibration signal only occupies a part of the carrier spectrum, which allows evaluation of antenna performance without the need to fully calibrate the antennas.

The antenna calibration control method according to embodiments herein uses machine learning to adapt to changing conditions in the network and the physical environment. The embodiments herein make use of reinforcement learning to form a picture of multiple environmental factors that may affect antenna calibration performance. The embodiments herein make use of reinforcement learning to evaluate and weigh multiple traffic performance factors that may affect the need for antenna recalibration.

The embodiments herein increase traffic throughput and decrease handover probability by reducing the number of antenna calibration interruptions over a given period. By considering both radio metrics and baseband processing metrics, the embodiments herein can optimize the intervals of antenna calibration measurements to maintain a desired antenna calibration accuracy. The embodiments herein can also opportunistically reduce the required antenna compensation accuracy at times when the wireless communication system does not need to provide high precision beamforming, for example when the numbers of MU-MIMO, SU-MIMO and beams are low, the SINR is high, etc.

The word "comprise" or "comprising", when used herein, shall be interpreted as non-limiting, i.e. meaning "consist at least of". The embodiments herein are not limited to the above described preferred embodiments. Various alternatives, modifications and equivalents may be used. Therefore, the above embodiments should not be taken as limiting the scope of the invention, which is defined by the appended claims.
