Login| Sign Up| Help| Contact|

Patent Searching and Data


Title:
RECEIVING DEVICE AND METHODS THEREOF
Document Type and Number:
WIPO Patent Application WO/2018/082775
Kind Code:
A1
Abstract:
The invention relates to a receiving device. The receiving device (100) is configured to - receive a Multiple Input Multiple Output, MIMO, signal y over a MIMO channel, wherein the received MIMO signal y comprises transmitted data x arranged in k = 1,2,..., v transmission layers in a first order, - compute a channel estimation matrix H for the MIMO channel, - select N Br branches for QR decomposition M-algorithm, QRM, detection, wherein N Br is an integer larger than 1; for the m-th branch of the N Br branches: a) select one transmission layer of the v transmission layers as a first layer i and order the remaining transmission layers of the v transmission layers so as to obtain a second order, b) permute the channel estimation matrix H according to the second order using a permutation matrix Pm so as to obtain a permuted channel estimation matrix H pm arranged according to the second order, c) compute a QR decomposition of the permuted channel estimation matrix H pm so as to obtain a first decomposition matrix Q pm and a second decomposition matrix R pm , d) compute a transformed received MIMO signal zpm based on the received MIMO signal y and first decomposition matrix Q pm , e) compute an initial estimate (formula AA) of the transmitted data x for the first layer i based on the transformed received MIMO signal Z pm and the second decomposition matrix R pm , f) select a set of father nodes (formula BB ) for the first layer i based on the initial estimate (formula CC), g) perform QRM detection based on the set of father nodes (formula BB ), the transformed received MIMO signal Z pm and the second decomposition matrix R pm so as to obtain a set of accumulated metrics (AM m ) and a corresponding set of estimates (formula DD ) of the 20 transmitted data x for the m-th branch; - execute a) to g) for each of the N br branches so as to obtain a set of accumulated metrics and a corresponding set of estimates of the transmitted data x for each branch; - compute log likelihood ratios, LLRs, for the transmitted data x based on the sets of accumulated metrics and corresponding sets of estimates for the N br branches. Furthermore, the invention also relates to corresponding methods, a computer program, and 25 a computer program product.

Inventors:
CHEN JUNSHI (SE)
TUMULA CHAITANYA (SE)
Application Number:
PCT/EP2016/076575
Publication Date:
May 11, 2018
Filing Date:
November 03, 2016
Export Citation:
Click for automatic bibliography generation   Help
Assignee:
HUAWEI TECH CO LTD (CN)
CHEN JUNSHI (SE)
International Classes:
H04L25/03
Foreign References:
US20080095257A12008-04-24
US20110182336A12011-07-28
US8903027B12014-12-02
Other References:
MICHAEL WU ET AL: "Flexible N-Way MIMO Detector on GPU", SIGNAL PROCESSING SYSTEMS (SIPS), 2012 IEEE WORKSHOP ON, IEEE, 17 October 2012 (2012-10-17), pages 318 - 323, XP032329005, ISBN: 978-1-4673-2986-6, DOI: 10.1109/SIPS.2012.59
Attorney, Agent or Firm:
KREUZ, Georg (DE)
Download PDF:
Claims:
CLAIMS

1 . A receiving device (100) for a communication system (500), the receiving device (100) being configured to

- receive a Multiple Input Multiple Output, MIMO, signal y over a MIMO channel, wherein the received MIMO signal y comprises transmitted data x arranged in k = 1, 2, ... , v transmission layers in a first order,

- compute a channel estimation matrix H for the MIMO channel,

- select NBr branches for QR decomposition M-algorithm, QRM, detection, wherein NBr is an integer larger than 1 ;

for the m-th branch of the NBr branches:

a) select one transmission layer of the v transmission layers as a first layer i and order the remaining transmission layers of the v transmission layers so as to obtain a second order, b) permute the channel estimation matrix H according to the second order using a permutation matrix pm so as to obtain a permuted channel estimation matrix HPm arranged according to the second order,

c) compute a QR decomposition of the permuted channel estimation matrix HPm so as to obtain a first decomposition matrix QPm and a second decomposition matrix RPm,

d) compute a transformed received MIMO signal zPm based on the received MIMO signal y and first decomposition matrix QPm,

e) compute an initial estimate (xfm) of the transmitted data x for the first layer ί based on the transformed received MIMO signal zPm and the second decomposition matrix RPm, f) select a set of father nodes (xf™, - , xf™N ) for the first layer ί based on the initial estimate {xfm),

g) perform QRM detection based on the set of father nodes (xf™, - , xf™N ), the transformed received MIMO signal zPm and the second decomposition matrix RPm so as to obtain a set of accumulated metrics (AMm) and a corresponding set of estimates (x^m, ... , pNm) of the transmitted data x for the m-th branch;

- execute a) to g) for each of the NBr branches so as to obtain a set of accumulated metrics and a corresponding set of estimates of the transmitted data x for each branch;

- compute log likelihood ratios, LLRs, for the transmitted data x based on the sets of accumulated metrics and corresponding sets of estimates for the NBr branches.

2. The receiving device (100) according to claim 1 , configured to compute the LLR for the ;'-th bit of the /c-th layer of the transmitted data x based on a first accumulated metric and a second accumulated metric, wherein the first accumulated metric is the minimum accumulated metric among the NBr branches that have the ;'-th bit of the /c-th layer equal to 0 and wherein the second accumulated metric is the minimum accumulated metric among the NBr branches that have the ;'-th bit of the /c-th layer equal to 1 .

3. The receiving device (100) according to claim 2, configured to

compute the set of accumulated metrics AMm l for the Z-th father node for the m-th branch based on

where || ||F is the Frobenius norm and x m is the estimate of transmitted data x for the Z-th father node of the m-th branch.

4. The receiving device (100) according to claim 3, configured to

compute the corresponding estimate xf™ for the /c-th layer of the Z-th father node for the m-th branch based a corresponding minimum path metric for the /c-th layer of the Z-th father node for the m-th branch.

5. The receiving device (100) according to any of the preceding claims, configured to

order the remaining transmission layers using at least one of: simplified mean square error based channel ordering, correlation based channel ordering, and fixed channel ordering.

6. The receiving device (100) according to any of the preceding claims, configured to

execute a) to g) for the NBr branches in parallel, sequential, or in combination of parallel and sequential.

7. The receiving device (100) according to claim 6, configured to

execute a) to g) for the NBr branches sequential or in a combination of parallel and sequential, and

compute the initial estimate (xfm) based on the corresponding estimate of the transmitted data x from at least one previous branch.

8. The receiving device (100) according to any of the preceding claims, configured to

compute the initial estimate (xfm) using at least one of: zero forcing estimation, minimum mean square error estimation, and soft parallel interference cancelation estimation.

9. The receiving device (100) according to any of the preceding claims, configured to select the set of father nodes (xf™, - , xf™N ) from a set of constellation points which are closest in Euclidean distance to the initial estimate (xfm). 10. The receiving device (100) according to any of the preceding claims, configured to

select the set of father nodes (xf™, - , xfpN ) based on a function of the constellation point (~xfm) which is closest in Euclidean distance to the initial estimate (xfm).

1 1 . The receiving device (100) according to any of the preceding claims, configured to

select index i equal to index m.

12. The receiving device (100) according to any of the preceding claims, configured to

select the value of NBr equal to the value of v. 13. The receiving device (100) according to any of the preceding claims, configured to

decode the received MIMO signal y based on the LLRs for the transmitted data x.

14. A method for a receiving device (100), the method (200) comprising:

- receiving (202) a Multiple Input Multiple Output, MIMO, signal y over a MIMO channel, wherein the received MIMO signal y comprises transmitted data x arranged in k = 1, 2, ... , v transmission layers in a first order,

- computing (204) a channel estimation matrix H for the MIMO channel,

- selecting (206) NBr branches for QR decomposition M-algorithm, QRM, detection, wherein NBr is an integer larger than 1 ;

for the m-th branch of the NBr branches:

a) selecting (208a) one transmission layer of the v transmission layers as a first layer i and order the remaining transmission layers of the v transmission layers so as to obtain a second order,

b) permuting (208b) the channel estimation matrix H according to the second order using a permutation matrix pm so as to obtain a permuted channel estimation matrix HPm arranged according to the second order,

c) computing (208c) a QR decomposition of the permuted channel estimation matrix HPm so as to obtain a first decomposition matrix QPm and a second decomposition matrix RPm, d) computing (208d) a transformed received MIMO signal zPm based on the received MIMO signal y and first decomposition matrix QPm, e) computing (208e) an initial estimate (xfm) of the transmitted data x for the first layer i based on the transformed received MIMO signal zPm and the second decomposition matrix f) selecting (208f) a set of father nodes (xf™, - , xf™N ) for the first layer ί based on the initial estimate {xfm),

g) performing (208g) QRM detection based on the set of father nodes (xf™, ... , χ?™Ν ), the transformed received MIMO signal zPm and the second decomposition matrix RPm so as to obtain a set of accumulated metrics (AMm) and a corresponding set of estimates (x^m, ... , p m) of the transmitted data x for the m-th branch;

- executing (210) a) to g) for each of the NBr branches so as to obtain a set of accumulated metrics and a corresponding set of estimates of the transmitted data x for each branch;

- computing (212) log likelihood ratios, LLRs, for the transmitted data x based on the sets of accumulated metrics and corresponding sets of estimates for the NBr branches. 15. Computer program with a program code for performing a method according to claim 14 when the computer program runs on a computer.

Description:
RECEIVING DEVICE AND METHODS THEREOF Technical Field

The invention relates to a receiving device. Furthermore, the invention also relates to corresponding methods, a computer program, and a computer program product.

Background

Multiple input multiple output (MIMO) systems have attracted much attention as a key enabler for increased spectral efficiency. The 3GPP Long Term Evolution (LTE) standard for mobile broadband release 10, also known as LTE-Advanced, includes multi-antenna transmission modes that permit the design of algorithms to improve performance in a wide range of scenarios.

A new transmission mode introduced in LTE release 10 is the transmission mode 9 (TM9) which supports non-codebook based precoding for MIMO transmissions up to eight layers. Further, in LTE release 12, 256 QAM modulation is adopted for downlink. The combination of these two key features can greatly improve the system throughput, but on the other hand also poses a great challenge on MIMO detection design in the client device. A receiving algorithm with low detection complexity, good detection performance and easy hardware implementation is necessary.

The Maximum Likelihood Detection (MLD) detection algorithm is optimal in the sense of minimum error probability when all data are equally likely, and it fully exploits the available diversity. But MLD possesses an enormously high complexity that grows exponentially with the number of transmit antennas and symbol constellations, which prohibits it from being implemented in hardware.

Conventional simple multiuser detection algorithms, such as Zero Forcing (ZF) or Minimum Mean Square Error (MMSE) linear filters, can be applied straightforwardly to MIMO detection, but the performance degrades significantly in fading channels, especially in high correlation scenarios. An improvement to these simple algorithms is to employ interference cancellation technology, such as Ordered Successive Interference Cancellation (OSIC) and Parallel Interference Cancellation (PIC), which employ serial/parallel interference nulling, symbol-by- symbol detection, and interference cancellation techniques at the receiver. These detection algorithms offer lower complexity than the optimal MLD. However, the performance of these detection algorithms is much lower than the performance of the MLD. Another conventional solution is Sphere Decoder (SD). With near MLD performance, SD offers an alternative detection algorithm at a lower complexity to improve the efficiency of decoding. There are many variants of SD, e.g. the well-known depth-first sphere decoder employs Schnorr Euchner Enumeration (SEE-SD) to perform tree traversal and achieves maximum- likelihood (ML) performance. However, a major limitation of SEE-SD is the intrinsic variable throughput, which tends to drop off significantly with decreasing Signal-to-Noise Ratio (SNR). Some other suboptimal algorithms, such as K-best algorithm and Fixed Complexity Sphere Decoder (FSD), are proposed to obtain constant throughput and lower hardware-complexity, at an acceptable cost of performance loss. The K-best List Sphere Detector(LSD) is a breadth- first tree-search algorithm which keeps K number of nodes with the smallest accumulated Euclidean distances at each layer of the search tree. However, the complexity of all these algorithms is still too high for hardware implementation in some MIMO scenarios.

Summary

An objective of embodiments of the invention is to provide a solution which mitigates or solves the drawbacks and problems with conventional detection algorithms.

An "or" in this description and the corresponding claims is to be understood as a mathematical OR which covers "and" and "or", and is not to be understand as an XOR (exclusive OR).

The above and further objectives are solved by the subject matter of the independent claims. Further advantageous implementation forms of the invention can be found in the dependent claims. According to a first aspect of the invention, the above mentioned and other objectives are achieved with a receiving device for a communication system, the receiving device being configured to

- receive a Multiple Input Multiple Output, MIMO, signal y over a MIMO channel, wherein the received MIMO signal y comprises transmitted data x arranged in k = 1, 2, ... , v transmission layers in a first order,

- compute a channel estimation matrix H for the MIMO channel,

- select N Br branches for QR decomposition M-algorithm, QRM, detection, wherein N Br is an integer larger than 1 ;

for the m-th branch of the N Br branches:

a) select one transmission layer of the v transmission layers as a first layer i and order the remaining transmission layers of the v transmission layers so as to obtain a second order, b) permute the channel estimation matrix H according to the second order using a permutation matrix p m so as to obtain a permuted channel estimation matrix H Pm arranged according to the second order,

c) compute a QR decomposition of the permuted channel estimation matrix H Pm so as to obtain a first decomposition matrix Q Pm and a second decomposition matrix R Pm ,

d) compute a transformed received MIMO signal z Pm based on the received MIMO signal y and first decomposition matrix Q Pm ,

e) compute an initial estimate xf m of the transmitted data x for the first layer ί based on the transformed received MIMO signal z Pm and the second decomposition matrix R Pm ,

f) select a set of father nodes xf™, - , xf™ N for the first layer ί based on the initial estimate xf m ,

g) perform QRM detection based on the set of father nodes xf™, ... , χ?™ Ν , the transformed received MIMO signal z Pm and the second decomposition matrix R Pm so as to obtain a set of accumulated metrics AM m and a corresponding set of estimates x^ m , - , X£N of the transmitted data x for the m-th branch;

- execute a) to g) for each of the N Br branches so as to obtain a set of accumulated metrics and a corresponding set of estimates of the transmitted data x for each branch;

- compute log likelihood ratios, LLRs, for the transmitted data x based on the sets of accumulated metrics and corresponding sets of estimates for the N Br branches.

In one example, step g) is followed by step h) reverse the order of the corresponding set of estimates x^ m , - , X£N back to the first order. This step is performed for each branch. After having reversed the order back to the first order, all the corresponding estimates for the branches have the same ordering which facilitates the LLR computation so as to obtain the constellation points belong to one specific layer from all the corresponding sets of estimates for the N Br branches.

A father node is a constellation point, which belongs to a set of constellation points of the transmitted data, which is closest to the initial estimate xf m of the first layer i according to a specific search criteria, e.g. the Euclidean distance.

The receiving device according to the first aspect provides a number of advantages over conventional solutions. One such advantage is that the receiving device according to the first aspect has low computational complexity compared with maximum likelihood detection (MLD) whilst providing high detection performance close to MLD. Further, the receiving device according to the first aspect also provides a hardware friendly implementation architecture since only one next layer node for each layer detection of each father node is stored, thereby reducing manufacturing cost and manufacturing time. In a first possible implementation form of a receiving device according to the first aspect, the receiving device is further configured to

compute the LLR for the ; ' -th bit of the /c-th layer of the transmitted data x based on a first accumulated metric and a second accumulated metric, wherein the first accumulated metric is the minimum accumulated metric among the N Br branches that have the ; ' -th bit of the /c-th layer equal to 0 and wherein the second accumulated metric is the minimum accumulated metric among the N Br branches that have the ; ' -th bit of the /c-th layer equal to 1 .

The first implementation form exploits the diversity of different branches so as to obtain more accurate LLR values.

In a second possible implementation form of a receiving device according to the first implementation form of the first aspect, the receiving device is further configured to

compute the set of accumulated metrics AM m for the Z-th father node for the m-th branch based on

AM m>l = || zPm - RP^ m \\ 2 F , l = 1 FN m ,

where || || F is the Frobenius norm and x Pm is the estimate of transmitted data x for the Z-th father node of the m-th branch.

The second implementation form provides a low computational solution for computing the accumulated metrics.

In a third possible implementation form of a receiving device according to the second implementation form of the first aspect, the receiving device is further configured to

compute the corresponding estimate xf™ for the /c-th layer of the Z-th father node for the m-th branch based a corresponding minimum path metric for the /c-th layer of the Z-th father node for the m-th branch.

The third implementation form provides a low computational solution for obtaining the corresponding estimate for the /c-th layer of the Z-th father node for the m-th branch.

In a fourth possible implementation form of a receiving device according to any of the preceding implementation forms of the first aspect or to the first aspect as such, the receiving device is further configured to

order the remaining transmission layers using at least one of: simplified mean square error based channel ordering, correlation based channel ordering, and fixed channel ordering.

The fourth implementation form provides a plurality of low complex solutions with low detection performance losses for ordering the remaining layers.

In a fifth possible implementation form of a receiving device according to any of the preceding implementation forms of the first aspect or to the first aspect as such, the receiving device is further configured to

execute a) to g) for the N Br branches in parallel, sequential, or in combination of parallel and sequential. The fifth implementation form provides different implementation solutions for achieving different performance measures, such as computational time and detection performance. Further, this implementation form also provides flexibility when designing different implementation architectures of the present solution. In a sixth possible implementation form of a receiving device according to the fifth implementation form of the first aspect, the receiving device is further configured to

execute a) to g) for the N Br branches sequential or in a combination of parallel and sequential, and

compute the initial estimate xf m based on the corresponding estimate of the transmitted data x from at least one previous branch.

The sixth implementation form provides an improved initial estimate from at least one previous branch. Thereby, the detection performance of the current branch is improved which means improved overall detection performance.

In a seventh possible implementation form of a receiving device according to any of the preceding implementation forms of the first aspect or to the first aspect as such, the receiving device is further configured to

compute the initial estimate xf m using at least one of: zero forcing estimation, minimum mean square error estimation, and soft parallel interference cancelation estimation. The seventh implementation form provides different implementation solutions for achieving different performance measures, such as computational time and computational complexity. Further, this implementation form also provides flexibility when designing different implementation architectures of the present solution.

In an eighth possible implementation form of a receiving device according to any of the preceding implementation forms of the first aspect or to the first aspect as such, the receiving device is further configured to

select the set of father nodes xf™, - , xf™ N from a set of constellation points which are closest in Euclidean distance to the initial estimate xf m .

The eighth implementation form improves the probability of finding the (real) transmitted symbol x t of the transmitted data x. In a ninth possible implementation form of a receiving device according to any of the preceding implementation forms of the first aspect or to the first aspect as such, the receiving device is further configured to

select the set of father nodes xf™, ... , xf : Nm based on a function of the constellation point xf m which is closest in Euclidean distance to the initial estimate xf m .

The ninth implementation form increases the probability of finding the counter hypothesis bit for the LLR computation.

In a tenth possible implementation form of a receiving device according to any of the preceding implementation forms of the first aspect or to the first aspect as such, the receiving device is further configured to

select index i equal to index m.

The tenth implementation form is a convenient way of selecting index i.

In an eleventh possible implementation form of a receiving device according to any of the preceding implementation forms of the first aspect or to the first aspect as such, the receiving device is further configured to

select the value of N Br equal to the value of v. The eleventh implementation form combined with the tenth implementation form can iterate through all the layers as the first layer, and thereby increase the detection performance.

In a twelfth possible implementation form of a receiving device according to any of the preceding implementation forms of the first aspect or to the first aspect as such, the receiving device is further configured to

decode the received MIMO signal y based on the LLRs for the transmitted data x.

The twelfth implementation form provides high detection performance and low computational complexity due to the accurate LLRs for the transmitted data x.

According to a second aspect of the invention, the above mentioned and other objectives are achieved with a method for a receiving device, the method comprises

- receiving a Multiple Input Multiple Output, MIMO, signal y over a MIMO channel, wherein the received MIMO signal y comprises transmitted data x arranged in k = 1, 2, ... , v transmission layers in a first order,

- computing a channel estimation matrix H for the MIMO channel,

- selecting N Br branches for QR decomposition M-algorithm, QRM, detection, wherein N Br is an integer larger than 1 ;

for the m-th branch of the N Br branches:

a) selecting one transmission layer of the v transmission layers as a first layer i and order the remaining transmission layers of the v transmission layers so as to obtain a second order, b) permuting the channel estimation matrix H according to the second order using a permutation matrix p m so as to obtain a permuted channel estimation matrix H Pm arranged according to the second order,

c) computing a QR decomposition of the permuted channel estimation matrix H Pm so as to obtain a first decomposition matrix Q Pm and a second decomposition matrix R Pm ,

d) computing a transformed received MIMO signal z Pm based on the received MIMO signal y and first decomposition matrix Q Pm ,

e) computing an initial estimate xf m of the transmitted data x for the first layer i based on the transformed received MIMO signal z Pm and the second decomposition matrix R Pm , f) selecting a set of father nodes x ™, - , *^ for the first layer ί based on the initial estimate xf m ,

g) performing QRM detection based on the set of father nodes the transformed received MIMO signal z Pm and the second decomposition matrix R Pm so as to obtain a set of accumulated metrics AM m and a corresponding set of estimates x^ m , - , x^ m of the transmitted data x for the m-th branch;

- executing a) to g) for each of the N Br branches so as to obtain a set of accumulated metrics and a corresponding set of estimates of the transmitted data x for each branch;

- computing log likelihood ratios, LLRs, for the transmitted data x based on the sets of accumulated metrics and corresponding sets of estimates for the N Br branches.

In one example, step g) is followed by step h) reverse the order of the corresponding set of estimates x^ m , - , x^ m back to the first order. This is performed for each branch.

In a first possible implementation form of a method according to the second aspect, the method further comprises

computing the LLR for the ; ' -th bit of the /c-th layer of the transmitted data x based on a first accumulated metric and a second accumulated metric, wherein the first accumulated metric is the minimum accumulated metric among the N Br branches that have the ; ' -th bit of the /c-th layer equal to 0 and wherein the second accumulated metric is the minimum accumulated metric among the N Br branches that have the ; ' -th bit of the /c-th layer equal to 1 .

In a second possible implementation form of a method according to the first implementation form of the second aspect, the method further comprises

computing the set of accumulated metrics AM m for the Z-th father node for the m-th branch based on

AM m>l = || zPm - RP^ m \\ 2 F , l = 1 FN m ,

where || || F is the Frobenius norm and x Pm is the estimate of transmitted data x for the Z-th father node of the m-th branch.

In a third possible implementation form of a method according to the second implementation form of the second aspect, the method further comprises

computing the corresponding estimate xf™ for the /c-th layer of the Z-th father node for the m-th branch based a corresponding minimum path metric for the /c-th layer of the Z-th father node for the m-th branch.

In a fourth possible implementation form of a method according to any of the preceding implementation forms of the second aspect or to the second aspect as such, the method further comprises ordering the remaining transmission layers using at least one of: simplified mean square error based channel ordering, correlation based channel ordering, and fixed channel ordering.

In a fifth possible implementation form of a method according to any of the preceding implementation forms of the second aspect or to the second aspect as such, the method further comprises

executing a) to g) for the N Br branches in parallel, sequential, or in combination of parallel and sequential. In a sixth possible implementation form of a method according to the fifth implementation form of the second aspect, the method further comprises

executing a) to g) for the N Br branches sequential or in a combination of parallel and sequential, and

computing the initial estimate xf m based on the corresponding estimate of the transmitted data x from at least one previous branch.

In a seventh possible implementation form of a method according to any of the preceding implementation forms of the second aspect or to the second aspect as such, the method further comprises

computing the initial estimate xf m using at least one of: zero forcing estimation, minimum mean square error estimation, and soft parallel interference cancelation estimation.

In an eighth possible implementation form of a method according to any of the preceding implementation forms of the second aspect or to the second aspect as such, the method further comprises

selecting the set of father nodes xf™, ... , xf™ Nm from a set of constellation points which are closest in Euclidean distance to the initial estimate xf m .

In a ninth possible implementation form of a method according to any of the preceding implementation forms of the second aspect or to the second aspect as such, the method further comprises

selecting the set of father nodes xf™, ... , xf™ Nm based on a function of the constellation point ~ xf m which is closest in Euclidean distance to the initial estimate xf m . In a tenth possible implementation form of a method according to any of the preceding implementation forms of the second aspect or to the second aspect as such, the method further comprises

selecting index i equal to index m.

In an eleventh possible implementation form of a method according to any of the preceding implementation forms of the second aspect or to the second aspect as such, the method further comprises

selecting the value of N Br equal to the value of v.

In a twelfth possible implementation form of a method according to any of the preceding implementation forms of the second aspect or to the second aspect as such, the method further comprises

decoding the received MIMO signal y based on the LLRs for the transmitted data x.

The advantages of any method according to the second aspect are the same as those for the corresponding receiving device according to the first aspects.

The invention also relates to a computer program, characterized in code means, which when run by processing means causes said processing means to execute any method according to the present invention. Further, the invention also relates to a computer program product comprising a computer readable medium and said mentioned computer program, wherein said computer program is included in the computer readable medium, and comprises of one or more from the group: ROM (Read-Only Memory), PROM (Programmable ROM), EPROM (Erasable PROM), Flash memory, EEPROM (Electrically EPROM) and hard disk drive. Further applications and advantages of the present invention will be apparent from the following detailed description.

Brief Description of the Drawings

The appended drawings are intended to clarify and explain different embodiments of the present invention, in which:

- Fig. 1 shows a receiving device according to an embodiment of the invention.

- Fig. 2 shows a method for a receiving device according to an embodiment of the invention.

- Fig. 3 shows a communication system according to an embodiment of the invention. - Fig. 4 illustrates a QR decomposition M-algorithm (QRM) metric computation in a search tree structure. - Fig. 5 shows a parallel QRM (PQRM) solution.

- Fig. 6 shows a serial QRM (SQRM) solution.

- Fig. 7 shows a hybrid QRM (HQRM) solution.

- Fig. 8 shows a QRM structure for one branch only.

- Fig. 9 shows a further method according to an embodiment of the invention.

- Fig. 10 shows selected constellation points and corresponding path metrics.

- Fig. 11 shows a selection of father nodes using a Father Node Selection (FNS) solution.

- Fig. 12 shows a selection of father nodes using an Effective Father Node Selection (EFNS) solution.

- Fig. 13 shows the performance of PQRM with different channel ordering solutions.

- Fig. 14 shows the performance of different embodiments of the invention.

Detailed Description

In MIMO scenarios with a large number of transmit and/or receiver antennas and high order modulations, conventional detection solutions have low performance and/or high hardware implementation complexity. Hence, there is clearly a need to provide an improved detection solution for these MIMO scenarios, which preferably has both high performance and acceptable hardware implementation complexity. Therefore, embodiments of the invention relate to a receiving device and a corresponding method.

According to an embodiment of the invention an improved detection solution is implemented in a receiving device, such as the receiving device 100 shown in Fig. 1 . The receiving device 100 comprises a processor 102 coupled to a transceiver 104. The processor 102 and the transceiver 104 are coupled to each other by means of communication means 106 known in the art. The receiving device 100 can be configured for both wireless and wired communications over wireless and wired communication systems, respectively. The wireless communication capability is provided with an antenna 108 coupled to the transceiver 104. The wired communication capability is provided with a wired communication interface 110 coupled to the transceiver 110.

The receiving device 100 is configured to receive MIMO signal y over a MIMO channel. The received MIMO signal y comprises transmitted data x arranged in k = 1, 2, ... , v transmission layers in a first order. The receiving device 100 is further configured to compute a channel estimation matrix H for the MIMO channel and select N Br branches for QR decomposition M- algorithm (QRM) detection, where N Br is an integer larger than 1 . The receiving device 100 is further configured to for the m-th branch of the N Br branches: a) select one transmission layer of the v transmission layers as a first layer i and order the remaining transmission layers of the v transmission layers so as to obtain a second order, b) permute the channel estimation matrix H according to the second order using a permutation matrix p m so as to obtain a permuted channel estimation matrix H Pm arranged according to the second order,

c) compute a QR decomposition of the permuted channel estimation matrix H Pm so as to obtain a first decomposition matrix Q Pm and a second decomposition matrix R Pm ,

d) compute a transformed received MIMO signal z Pm based on the received MIMO signal y and first decomposition matrix Q Pm ,

e) compute an initial estimate xf m of the transmitted data x for the first layer ί based on the transformed received MIMO signal z Pm and the second decomposition matrix R Pm ,

f) select a set of father nodes ¾ m , - , i¾ m for the first layer ί based on the initial estimate xf m ,

g) perform QRM detection based on the set of father nodes xf™, - , xf™ N , the transformed received MIMO signal z Pm and the second decomposition matrix R Pm so as to obtain a set of accumulated metrics AM m and a corresponding set of estimates x^ m , - , X£N of the transmitted data x for the m-th branch. The receiving device 100 is further configured to: execute a) to g) for each of the N Br branches so as to obtain a set of accumulated metrics and a corresponding set of estimates of the transmitted data x for each branch ; and to compute log likelihood ratios (LLRs) for the transmitted data x based on the sets of accumulated metrics and corresponding sets of estimates for the N Br branches.

The received MIMO signal y may be a superposed signal. However, the MIMO signal y comprises at least transmitted data x addressed for the receiving device 100 or to a communication device associated with the receiving device 100. Transmitted data x herein is the data transmitted by a transmitting device and intended for the receiving device 100.

In one embodiment, the receiving device 100 is configured to h) reverse the order of the corresponding set of estimates x^ m , ... , Xp m back to the first order after having executed step g). This is performed for each branch.

The receiving device 100 is in one embodiment configured to decode the received MIMO signal y based on the computed LLRs for the transmitted data x. The computed LLRs may also be used in other suitable applications. Fig. 2 shows a flow chart of a corresponding method 200 which may be executed in a receiving device 100, such as the one shown in Fig. 1 . The method 200 comprises receiving 202 a MIMO signal y over a MIMO channel. The received MIMO signal y comprises transmitted data x arranged in k = 1, 2, ... , v transmission layers in a first order. The method 200 further comprises computing 204 a channel estimation matrix H for the MIMO channel and selecting 206 N Br branches for QR decomposition M-algorithm, QRM, detection, where N Br is an integer larger than 1 . The method 200 further comprises for the m-th branch of the N Br branches:

a) selecting 208a one transmission layer of the v transmission layers as a first layer i and order the remaining transmission layers of the v transmission layers so as to obtain a second order,

b) permuting 208b the channel estimation matrix H according to the second order using a permutation matrix p m so as to obtain a permuted channel estimation matrix H Pm arranged according to the second order,

c) computing 208c a QR decomposition of the permuted channel estimation matrix H Pm so as to obtain a first decomposition matrix Q Pm and a second decomposition matrix R Pm , d) computing 208d a transformed received MIMO signal z Pm based on the received MIMO signal y and first decomposition matrix Q Pm ,

e) computing 208e an initial estimate xf m of the transmitted data x for the first layer i based on the transformed received MIMO signal z Pm and the second decomposition matrix R Pm ,

f) selecting 208f a set of father nodes xf™, ... , xf™ Njn for the first layer i based on the initial estimate xf m ,

g) performing 208g QRM detection based on the set of father nodes xf™, ... , xf] N , the transformed received MIMO signal z Pm and the second decomposition matrix R Pm so as to obtain a set of accumulated metrics AM m and a corresponding set of estimates x^ m , - , X£N of the transmitted data x for the m-th branch. The method 200 further comprises: executing 210 a) to g) for each of the N Br branches so as to obtain a set of accumulated metrics and a corresponding set of estimates of the transmitted data x for each branch; and computing 212 LLRs for the transmitted data x based on the sets of accumulated metrics and corresponding sets of estimates for the N Br branches.

In one example, the method 200 comprises the step: h) reversing 208h the order of the corresponding set of estimates x^ m , - , X£N back to the first order after having executed step g). This is performed for each branch. Fig. 3 shows a communication system 500 according to an embodiment of the invention in which the receiving device 100 is comprised in a client device 300. The communication system 500 comprises in this example a network node 400 and the client device 300. The network node 400 is configured to transmit data x using a transmitting device (not shown in Fig. 3). A received MIMO signal y comprising the transmitted data x is obtained by the receiving device 100 in the client device 300. The signal y is detected by the receiving device 100 with the MIMO detection solution according to the invention. It should however be noted that the receiving device 100 may be part of another type of communication device than a client device 300. In one example, the receiving device 100 may be part of a network node 400.

The MIMO detection solution according to the invention will be described in more detail in the context of a system model presented below. Embodiments of the invention are however not limited thereto. It is assumed in the present system model that the transmitting device has N T number of transmit antennas and the receiving device 100 has N R number of receiving antennas. The transmitted data x is arranged in v number of transmission layers, wherein 1 < v≤ 8 in one particular example. Further, the transmitted data x has been precoded by a MIMO precoder w of size N T x v.

The received MIMO signal y can therefore be written as

y = H ch Wx + n = Hx + n r Eq. 1

where y is the received signal on N R receiving antennas, and n r is the complex domain circularly symmetric additive white Gaussian noise with E{n r n r H } = O 1 Nr , which means that the mean of the noise covariance is an identity matrix; H = H ch W is the effective channel seen by the receiving device 100 and is represented as a N R x v matrix. Further, the entries of the transmitted data x belong to a M-QAM signal constellation set Ω (other modulation schemes are possible), such as the gray mapped signal constellation defined in 3GPP standards. In the following, it is assumed that v = N T for simplicity even though the present solution is not limited thereto.

If it is assumed that all constellation points in the constellation set Ω are equally probable, the optimal decision metric can be written as

where || · ||| denotes the square Frobenius norm. ' -th bit of the /c-th transmission layer (k = 1, 2, ... , v) can be written as

denotes all possible transmit symbol vectors whose ; ' -th bit of the /c-th

However, since the complexity of evaluating Eq. 3 is very high, the summations in Eq. 3 are replaced by the max operator using the Jacobian logarithm identity, i.e.

log(exp(a) + exp(b)) = max(a, b) + log(l + exp(|a— b \))

= max(a, b) Eq. 4

This approximation is called Max-Log-MAP approximation and by using the Max-Log-MAP approximation Eq. 3 can be simplified to

— Hx\\ Eq. 5

However, computing the LLRs using the evaluation of Eq. 5 is also very complex. Hence, for large-layer MIMO detection with high order modulation, low-complexity sub-optimal algorithms are in practice considered.

The optimal MIMO detection metric given in Eq. 2 can equivalent^ be expressed as

x = arg min ||z— Eq. 6

xeci N T

where the transformed received signal is z = Q H y and H = QR, Q is an N R x N T semi-unitary matrix with Q H Q = I NT ; and R is an N T x N T upper triangle matrix which can be derived as Q H H = Q H QR = R, and represented as

Therefore, by using the upper triangular structure of matrix R, the decision metric can be represented as

By a close observation of the above expression it is noted that mentioned expression can be re-written as

| z l r l,N T x N T r l,N T -l x N T - l '1,1 Χ ΐ|

The first term on the left hand side of the equation above is denoted accumulated metric AM. The terms M 1 to Μ Ντ on the right hand side of the equation are denoted as path metrics. It is noted that the terms path metric M k , 1≤ k≤ N T only depend on the transmitted data x from layer k to N T , i.e. x k , x k+1 , ... , χ Ντ . Hence, these terms can be computed in a sequential manner using a tree-structure, as shown in Fig. 4. The first layer path metric Μ Ντ is computed as illustrated in Fig. 4, and some path metrics and corresponding estimates χ Ντ are kept according to a search criteria, e.g. breadth first, and depth first, then the second layer path metric Μ Ντ _ 1 is computed, and so on, until the last layer path metric M 1 is reached. The set of nodes selected for the tree-search procedure at the first layer (i.e. symbols corresponding to layer N T ) are called father nodes herein. Corresponding to each node of the tree and at any layer k in the tree, there can be |Ω| possible candidate leaf-nodes corresponding to symbol x Nr - k+1 , where |Ω| is the number of constellation point in the constellation point set Ω corresponding to x Nr - k+1 .

Many sub-optimal MIMO detection algorithms, like previously mentioned SD, K-best algorithm or QRM algorithms, use this kind of tree-search procedure for finding the most likely transmitted symbol of the transmitted data x or for computing the LLRs of the bits corresponding to the transmitted data x. To reduce the complexity, mentioned sub-optimal detection algorithms also retain only a subset of leaf-nodes corresponding to each father node at any given layer in the tree-search procedure. Different permutation of the columns of H will generate different R which will lead to different detection performance. One QRM detection with a specific permutation of the columns of H is denoted as one branch and each branch is given index m herein. If a high number of permutations of the columns of H and corresponding QRM detection are executed and the results of different branches are combined, the detection performance can be improved greatly.

Further, embodiments of the invention propose three QRM-based algorithms called Parallel QRM (PQRM) detection, Sequential QRM (SQRM) detection, and hybrid QRM (HQRM) detection for branch processing. In these QRM algorithms, a pre-defined number of branches with different permutation of the columns of H are run in parallel, sequential, or hybrid manner as shown in Fig. 5 for PQRM, Fig. 6 for SQRM, and Fig. 7 for HQRM, respectively. Herein, the number of branches is denoted N Br and different number of branches used in the MIMO detection will lead to different detection performance. Theoretically, the more branches used, the better is the detection performance. In one embodiment, the number of branches N Br is set equal to the number of layers, i.e. the value of N Br is set equal to the value of v.

As aforementioned, Fig. 5 illustrates PQRM in which the processing of the number of branches N Br are performed in parallel. In PQRM, there is no information exchange between the different branches in the algorithm. The information exchanged between different branches can e.g. be accumulated metric, transmitted signal estimate x achieved by each branch, etc., so that detection performance can be improved, or computational complexity can be reduced.

Fig. 6 illustrates SQRM in which the processing of the number of branches N Br are performed in sequential, i.e. in a sequence. In SQRM, for execution of each branch, input information may be obtained from the previous branch. Further, information obtained in the execution of the branch may be passed to the next branch.

Fig. 7 illustrates HQRM in which the processing of the number of branches N Br are performed in parallel and sequential. In HQRM, the first M branches may exchange information as in SQRM and the following M + 1 to N Br branches only get information from the previous branch.

By using the processing output of all N Br branches, it is possible to obtain the LLR estimate of the ; ' -th bit of the /c-th layer by finding the minimum accumulated metric among all the branches that have the ; ' -th bit of the /c-th layer equal to "0", and subtracting the minimum accumulated metric among all the branch that have the ; ' -th bit of the /c-th layer equal to "1 ", i.e.

xen N T-b k =o xen N T-b k j = i

For a given m-th branch, the ordering of layers in the transmitted data x are given by the permutation matrix p m . The equation y = Hx + n r can also be represented as the product of permuted H and permuted transmitted data as: y = H Pm x Pm + n r , where the permuted channel estimation matrix is H Pm = H(: , p m ). This means that the /c-th column of H Pm is the (p m (/c))-th column of H, and the permuted transmitted data x Pm = x(p m ) means that the /c-th element of x Pm is the (p m ( c))-th element of x. H Pm can also be decomposed into the product of a first decomposition matrix Q Pm and a second decomposition matrix R Pm as: H Pm = QPmRPm _ if (QPm)« j s multiplied with y another permuted transformed received signal will be obtained as z Pm = (Q Pm ) H y.

The tree-search procedure is used to compute the detection metric \\z Pm - R Pm x Pm \\ j by selecting a set of father nodes i™, ... , xf™ Nm and the corresponding candidates of the other layers at the tree-search procedure. Typically, in the m-th branch, the k = m-th transmission layer is selected as the first layer for the tree-search procedure. Each branch uses the given ordering of layers and retains one next layer's candidate at each layer as it traverses through the tree as shown in Fig. 8. The only difference between Fig. 4 and Fig. 8 is that in Fig. 8 only one next layer's candidate for each father node is kept.

A further method according to an embodiment of the invention is shown in Fig. 9. The method shown in Fig. 9 is in one embodiment implemented and executed in the receiving device 100 which therefore is configured accordingly.

In step I one transmission layer x m among the total number of transmission layers v is selected as the first layer i.

In step II the remaining transmission layers of the total number of transmission layers are ordered according to a channel ordering algorithm, such as correlation based ordering, fixed channel ordering, or low complexity channel ordering. These channel ordering algorithms are described below.

In step III the channel estimation matrix H is permuted using a permutation matrix p m according to the previous channel ordering in step I and II. The permuted channel estimation matrix is denoted as H Pm .

In step IV a QR decomposition of the permuted channel estimation matrix H Pm is computed so as to obtain a first decomposition matrix Q Pm and a second decomposition matrix R Pm .

In step V the initial estimate x ™ of the transmitted data for the first transmission layer i is computed by using an estimate method, such as ZF, MMSE, and SPIC. These estimate methods are described below. However, an estimate from at least one previous branch in HQRM or SQRM can also be used as previously described. In step VI a set of father nodes x^ iV ... , x ^, FNm is selected using any of Father Node Selection (FNS) solution or Effective Father Node Selection (EFNS) solution which are described below.

In step VII a QRM detection for the m-th is performed that keeps only one next layer's candidate for each layer.

In step VIII after all the selected branches have been processed, the LLR of the ; ' -th bit of the /c-th layer is obtained by finding the minimum accumulated metric among all the branches that have the ; ' -th bit of the /c-th layer equal to "0", and subtracting the minimum accumulated metric among all the branches that have the ; ' -th bit of the /c-th layer equal to "1 ", i.e. LLR k j = .

xeCi N T;b k =o xeCi N T;b k = i

The set of accumulated metrics AM m for the Z-th father node for the m-th branch can be computed based on the expression

where || || F is the Frobenius norm and x Pm is the estimate of transmitted data x for the Z-th father node of the m-th branch.

The channel ordering of the remaining layers will impact the performance of the MIMO detection since different channel ordering will have different error propagation. There is an optimal channel ordering which is called Mean Square Error (MSE) channel ordering herein.

However, this channel ordering is not feasible to implement due to its high complexity.

Therefore, in this disclosure other low complex channel ordering methods are proposed. These are correlation based channel ordering, fixed channel ordering, and simplified MSE based channel ordering.

In the MSE based channel ordering the remaining layers are ordered according to their mean- square error after MMSE detection. In the m-th PQRM branch the permutation of layers are according to p m = [-, -, -, m] T and p m (N T ) = m. The ordering of the remaining layers is given by the following algorithm:

1 . In the m-th branch, set p m (N T ) = m, i.e. set the m-th layer as the first layer i.

2. Remove the m-th column from H and denote it with TempH.

3. Repeat steps 4 to 7 below for k = (N T - 1): -1: 2

4. Compute c k = diag (l - TempH "(TempH * TempH H + o i) Tempti). 5. Find the temporary index temp_Index k corresponding to minimum of absolute values of c k and find its corresponding index Index k in H.

6. Set p m (/c) = Index k , wherein p m comprises the ordered layers for m-th branch.

7. Remove temp_Index k column from TempH.

8. Sort the columns of H according to p m .

As can be seen from the above algorithm N R x N R matrix inversions have to be performed in step 4 of the above algorithm. To reduce the complexity of the MSE based channel ordering, two simplified MSE based channel ordering algorithms are proposed in the present disclosure. These two algorithms are denoted as simplified MSE based channel ordering algorithm 1 and simplified MSE based channel ordering algorithm 2. The simplified MSE based channel ordering algorithm 1 is given by:

1 . In the m-th PQRM branch, set p m (N T ) = m, i.e. set the m-th layer as the first layer i.

2. Remove the m-th column from H and denote it with TempH.

3. Compute c m = diag(l - TempH" (TempH * TempH" + o i) Tempti)

4. Sort the values of entries of c m in descending order, denote it by temp_Sort_Ind, and find its corresponding index Sortjnd in H.

5. Assign p m (l: N T - 1) = Sortjnd, wherein p m contains the ordered layers for the m-th PQRM branch.

6. Sort the columns of H according to p m .

In the m-th PQRM branch, the MSE matrix c m is computed only once as opposed to repeatedly computing it as described in the simplified MSE based channel ordering algorithm.

To further reduce the complexity, the MSE computation for determining the ordering of transmission layers is performed only in N T - 3 branches. For the remaining 3 branches a simple permutation of ordering obtained in one of the branches is used.

The simplified MSE based channel ordering algorithm 2 is given by:

Compute c all = diag(l - H H (H * H H + σ^/) "1 //).

Sort the entries of c all in descending order, let the order be lndex all .

for i = 1: N T

Set i (^r) = i

if i e Index all (l: N T - 4 or N T ) Remove the i-th column from H and denote it with TempH.

Compute c t = diag (l - TempH "(TempH * TempH H + o i) Tempti)

Sort the values of entries of c t in descending order temp_Sort_Ind, and find its corresponding index Sortjnd in H.

Assign Pi (l: N T - 1) = Sortjnd.

else

Delete index i from the entries of Pmdex all (N T ) and denote it p temp .

Assign Ρί (1: Ν τ - 1) = p temp .

end

end

In correlation based channel ordering, the idea is to make the transmission layers which are detected towards the end of the tree-search less correlated with the transmission layers which are detected at the top of the tree-search. The correlation based channel ordering is given by the following algorithm:

1 . Compute H_H = H H H.

2. In the m-th branch, set p m (N T ) = m.

3. Remove the m-th entry of the m-th column of H_H.

4. Sort the absolute values of the remaining entries of the m-th column of H_H in ascending order temp_Sort_Ind, and find its corresponding index Sortjnd in H.

5. Assign p m (l-. N T - 1) = Sortjnd, wherein p m contains the ordered layers for m-th branch.

6. Sort the columns of H according to p m .

Finally, the simplest channel ordering is the fixed channel ordering of the remaining layers in the different branches. The fixed channel ordering does not depend on either instantaneous or statistical knowledge of channel coefficients. There is very low complexity associated with this channel ordering. For example, in case of 8 layers, the following ordering of layers is used:

p 1 = [8 7 6 5 4 3 2 1]

p 2 = [8 7 6 5 4 3 1 2]

p 3 = [8 7 6 5 2 1 4 3]

p 4 = [8 7 6 5 2 1 3 4]

Ps = [4 3 2 1 8 7 6 5]

p 6 = [4 3 2 1 8 7 5 6]

p 7 = [4 3 2 1 6 5 8 7]

Ps = [4 3 2 1 6 5 7 8] Another approach to further reduce the overhead associated with the channel ordering complexity is to use the same channel ordering for some continuous number of resource elements (REs) within a transmitted symbol of the transmitted data x. This solution can be applied to all the channel ordering methods described herein.

Different solutions to compute the initial estimate xf m will now be described. The simplest way to get the x ™ is using the ZF result of the first layer which is given by the expression

zPm

$Pm _ N T

Nj Pm

'N T ,N T

Another way is to use the MMSE or SPIC estimate.

In the MMSE solution the following expression is used

x MMSE = {W MMSE * y)./T

SPIC is an iterative estimate, so for the first iteration (i.e. Iteration = 0) MMSE detection is performed. Using the LLR values, the soft-symbol mean E{x k } and soft-symbol variance v k = E[\E{x k } - x k \ 2 ] are computed using known methods. Using the mean and variance, the interference cancellation per layer is performed to obtain y (fe) = y -∑m=i h m E{x m }. Thereafter

m≠k

layer-wise MMSE detection χξ ΡΙ = w SPIC k * y (fe) //? fe is performed using LMMSE filter WsPick = fr f c (HR XX H H + a r 2 / WR ) _ 1 and /? fe = w SPIC>k h k with R xx = diag^ ... ¾ T ).

In the first branch of HQRM or SQRM, these aforementioned methods can be used to get the initial estimate xf m . For the other branches of HQRM, the best detection result of all the previous sequential branches can be used as the initial estimate xf m . Similarly, for one specific branch of SQRM, the best detection result of all previous branches can be used as the initial estimate xf m .

For the first layer of a given branch m, the method of selecting the set of father nodes will impact the complexity and detection performance. For example, the detection will have the best performance if all constellation points belonging to the constellation set Ω are selected, but will also have the highest complexity. There are two main solutions to simplify the selection of the set of father nodes and reduce the number of father nodes in the father nodes set with marginal performance loss, i.e. FNS and EFNS. For both FNS and EFNS an initial estimate xf m of the transmitted data for each branch m is required.

In the FNS method, FN m number of father nodes from a set of constellation points which are closest to the initial estimate x^™ in terms of Euclidean distance are selected, as shown in Fig. 1 1 . The set of constellation points are denoted as Tne FN $ method comprises:

for x NTik e Ω

D (k) = — x Nr,fc||

end

Select FN m smallest value from D (k) and add corresponding constellation point x NTik to the set of father nodes. D (k) is the Euclidean distance of the constellation point x Nr>k to the initial estimate x ™. For example, if FN m is 7, the 7 constellation points that are closest to x™ are selected as the father node set. In Fig. 1 1 these 7 constellation points are marked with black squares. This means that for each layer ί the set of father nodes xf™, ... , xfj N can be selected from a set of constellation points which are closest in Euclidean distance to the initial estimate xf m . In the EFNS solution, the constellation point that is closest to the initial estimate x^™ is selected. Said constellation point is denoted x^™ and corresponds to a specific bit sequence b Nr 2 N , where N is the number of bits of one constellation point. The mapping of bits to a symbol can e.g. be gray-mapping as defined in 3GPP standards. For each bit b Nr j of said constellation point x ™, select the closest set of constellation points among all constellation points in the constellation set Ω that have the flipped bit of b Nj , . The EFNS method therefore comprises:

for j = 1 to N, where N is the bit number of one constellation point,

for x NTik with b Nr j≠ B NT

D (k) = II — x Nr,fc||

end

Select a set of D (k) that have the smallest values and add corresponding constellation points x Nr>k to the set of father nodes. If any of x Nr>k has been added in previous iteration, then x Nr>k is skipped, and the next closest constellation point is added to the set of father nodes. end

One example of the EFNS method is shown in Fig. 12. In this example FN m is still 7. The closest constellation point to the initial estimate x^™ is the constellation point labeled "001000" and is denoted as x^™. As the first bit of xJJ™ is 'Ο', a constellation point with a first bit set to Ί ' and which is also closest to needs to be found. In Fig. 12 this is the constellation point labeled "100010" marked with a black square. In the same way, 6 more constellation points are found, each with a one bit flipped of These constellation points are also marked with black squares in Fig. 12. This means that for each layer i the set of father nodes xf™, ... , xf™ Nm can be selected based on a function of the constellation point xf m which is closest in Euclidean distance to the initial estimate xf m . Therefore, in one example the function is the flipping of bits of the constellation point x m which is closest in Euclidean distance to the initial estimate xf m .

For providing an even deeper understanding of the receiving device 100 and the method 200 an exemplary implementation is hereby described in a 3x3 system. Such a system comprises three layers, i.e. x x , x 2 , and x 3 , and the received MIMO signal y is given by

which has the matrix form

y = Hx + n r

After QR decomposition of the channel estimation matrix // = QR and after multiplying the received signal y with the decomposition matrix Q H the transformed received signal z is given by

which has the matrix form

z = Rx + nl

To find the estimate x that corresponds to the minimum of ||z - Rx

and the accumulated metric is defined as follow:

wherein M 3 ,M 2 ,M 1 are the path metrics of different layers.

The goal is to find the combination of x 3 , x 2 , and x t that minimizes the accumulated metric AM. If a ML algorithm is used, all possible combinations of ¾, ¾ and χ ι needs to be tested. For example, if x 3 ,x 2 and x 1 have 256 QAM constellation, the total number of combinations for ¾, ¾ and x i is 256 * 256 * 256 = 2 24 which is not feasible to compute.

However, the MIMO detection can be performed in a suboptimal way by splitting the procedure of finding the minimum of the accumulated metric AM (related to step 208g) into three steps. The first step is to find a set of x 3 that have small path metrics M 3 .The second step is to find a set of x 2 that have small path metrics M 2 based on the found set of x 3 . And the third step is to find a set of x x that have small path metrics M 1 based on the found set of x 3 , x 2 .

It has been previously noted that the columns of the channel estimation matrix H can be permuted and the received MIMO signal y can be written in another form. If the third column is un-permutated (x 3 is here selected as the first layer), then the two possible permuted matrices are (first form):

and (second form):

y = H*x * + n r

For the first form, the QR decomposition and transformed expression are shown in Eq. 7. To find the suboptimal solution x needs to be searched in the order of x 3 , ¾, and ¾. For the second form, the QR decomposition of H* = Q*R* is performed and the received MIMO signal y is multiplied with Q* H . The transformed received signal z* is then given by

which has the matrix form

z* = R*x* + n r * To find the x* that corresponding to the minimum of \\z* - R*x* \\ 2 , i.e.

x* = arg min \\z* — R*x* and

AM*—

the suboptimal solution needs to be searched in the order of x* 3 , , and x* 2 .

The two permutations of the columns of the channel estimation matrix H can be seen as the result of two different channel ordering. These two channel orderings have different performance and therefore the channel ordering with the best performance should be selected. In this example, it is assumed that the first channel ordering complies with the optimal channel ordering.

The ZF estimate of x 3 is x 3 =— but this estimate may not be an exact constellation point in the symbol constellation. So, a set of constellation points subopt close to the ZF estimate x 3 , e.g. n subopt has N constellation points, ¾i,¾ 2 , -,¾ , - ancl corresponding path metrics is selected to minimize M 3 . Since it is a

suboptimal algorithm a set of x 3 estimates need to be kept instead of selecting only the closest constellation point.

For each member of the set of constellation points n subopt , e.g. x 3 1 to x 3i , there are theoretically 256 possible constellation points for x 2 , but only one estimate ¾ that minimize M 2 needs to be found. All corresponding path metrics are given by the expression

Λ ,2 Λ ,2 Λ ,2

| z 2 r 2,2-*2,l r 2,3 x 3,l| | z 2 r 2,2 x 2,2 r 2,3 x 3,2| ' ■■■ ' | z 2 r 2,2 x 2,W r 2,3 x 3,W|

For each pair of {χ 3 ,ι,χ2,ι},{ 3,2 < ¾,2} < { 3, 2, } there are theoretically 256 possible constellation points for x x , but only one estimate ¾ that minimize M 1 needs to be found. All the corresponding path metrics are given by the expression

When all the estimates {χ 3 ,ι, χ 2 ,ι, χι,ι}, {¾2, χ2,2, χι,2}ν , {Χ3,Ν-¾Ν-Χ Ι ,Ν} are obtained all the accumulated metrics AM 1( AM 2 , AM N can be calculated according to

AM 1 = M 3JL + M 2JI + Μ

AM 2 = M 3|2 + M 2|2 + M LJ2

AM N = M 3|N + M 2|N + M LJN

If all the selected constellation points and path metrics are connected together, the structure will look like a tree as shown in Fig. 10. The arrow lines represent path metrics and the ends of the arrows represent constellation points. χ to x 3iN are herein the father nodes.

The above processing is called branch detection. Another branch detection can be performed by changing the ordering of the channel estimation matrix H and the ordering of the transmitted data x. If in this example x 2 is set as the first layer, there are two kinds of channel ordering of the remaining layers, i.e.

and

In this example it is assumed that the first channel ordering has the desired ordering.

The received signal y ean be written in matrix form as

y = H'x'

QR decomposition of the channel estimation matrix H' = Q'R' is performed and the received signal y is multiplied with Q' H . The transformed received signal is given by

The matrix form is given by

z' = R'x' + n r ' To find the x' that corresponds to the minimum of \\z' - R'x'Wj, i.e

x' = arg min \\ z'— R'x'

ΧΈΩ. 3

and

AM — \z 3 Ϊ" 3,3^2 I + \ z 2 r 2,2 X 1 r 2,3 X 2 \ + \ z 1 r 1,1 X 3 r 1,2 X 1 r 1,3 X 2

AM', AMI ? AMI,

To find the suboptimal solution x' is searched in the order of x' 2 , x'i , and x' 3 .

For this branch, the sets of estimates {χ'2,ι, χ'ι,ι, χ' 3 ,ι}, - - - , {X'2,N'X' I ,N'X'3,N} CAN be obtained and all the accumulated metrics AM' 1( AM' 2 , AM' N can be computed in a similar manner as for the previous branch.

Using a similar method also x x is set as the first layer. The channel estimation matrix H and transmitted data x are permuted and then the accumulated metric AM" and the estimate x 7 * are obtained. If it is assumed that the ordering is for this branch, the sets of estimates

7? ι,ι, χ 7? 2 ,ι, χ 7, 3 ,ι}, { V2.^2,2.? 2}.-- { 7? I,N.X 7? 2,N.X 7? 3,N} can be obtained and all the accumulated metrics AM" 1( AM" 2 , ... , AM" N can be computed in a similar manner as for the previous branch.

After all the accumulated metrics and estimations of all the branches are obtained, LLR for all the bits can be computed. For example, the LLR of the first bit of x 1 will be

1

LLR- min AM 1 AM N ,AM[ AM N ', AM" AM%)

xe£l i :b 1 1 = 0

- min (AM AM N ,AM AM N ',AM' AM%)

xe£l i :b 1 1 = l To find the x 1 estimate among {x l , ... , x ljN , χ' , ... , x' ljN , χ" , ... , χ'Ί } whose first bit is equal to "0" and the corresponding accumulated metrics. The minimum of these accumulated metrics related to "0" is selected as a first minimum. Similarly, to find the x x estimate among {x l , ... , χι,Ν - χ'ι,ΐ' - < x'i,N< x"i,i' - whose first bit is equal to " and the corresponding accumulated metrics. The minimum of these accumulated metrics related to "1 " is selected as a second minimum. The first minimum is subtracted by the second minimum, and in this way, the LLRs for all the bits of the transmitted data x is obtained. According to LTE protocol, there is a one by one mapping between sequence of bits and constellation point x as shown in Table 1 . For example, for 16 QAM, bits 00000 is mapped to constellation point x =— + *— and so on.

^ sqrt(10) J sqrt(10)

Table 1 : Bit to 16QAM mapping in LTE

Fig. 13 and 14 show detection performance of embodiments of the invention. The x-axis shows the SNR in dB and the y-axis shows the Block Error Rate (BLER).

Fig. 13 shows the performance difference for different channel ordering algorithms. Herein the transmitting device and the receiving device 100 each has 8 antennas, and the modulation scheme is 256QAM. The channel is EPA3 Low correlation as defined in 3GPP standard. It can be seen in Fig. 13 that the MSE based channel ordering algorithm has the best performance followed by the simplified MSE based channel ordering algorithm 1 (labelled "MSE based simplification alg-1 "), and the simplified MSE based channel ordering algorithm 2 (labelled "MSE based simplification alg-2"). The correlation based channel ordering algorithm has the second worst performance, and the fixed channel ordering has the worst performance. However, the complexity of the different channel ordering solutions is reverse to the performance.

Fig. 14 shows the performance of different branch processing solutions (i.e. PQRM, SQRM, and HQRM) and different father node selection methods previously described. Label "MMSE- SPIC-PQRM" in Fig. 14 uses the MMSE estimate as the input to SPIC and the SPIC estimate as the initial estimate, EFNS for selection of father nodes, and the different branches are processed in parallel, i.e. PQRM. MMSE-SPIC-PQRM has the best detection performance. Label "MMSE-PQRM" in Fig. 14 uses the MMSE estimate as the initial estimate, EFNS for selection of father nodes, and the different branches are processed in parallel, i.e. PQRM. MMSE-PQRM has the second best detection performance. Label "SQRM's" in Fig. 14 uses in the first branch the ZF estimate as the initial estimate, and in branch 2 to branch 8 the best estimate of previous branches as the initial estimate, EFNS for selection of father nodes, and the different branches are processed in sequential, i.e. SQRM. SQRM has the third best performance. Label "HQRM's" in Fig. 14 uses in the first branch the ZF estimate as the initial estimate, in branch 2 to branch 4 the best estimate of previous branches as the initial estimate, and in branch 5 to branch 8 the best estimate of branch 1 to branch 4 as the initial estimate, EFNS for selection of father nodes, and the first four branches are processed in sequential and the remaining four branches are processed in parallel, i.e. hybrid processing HQRM. HQRM has the second worst detection performance. Label "PQRM" in Fig. 14 uses the ZF estimate as the initial estimate, EFNS for selection of father nodes, and the different branches are processed in parallel. PQRM has the worst performance.

The receiving device 100 may be a standalone device or part of a communication device having wired and/or wireless communication capabilities. One such type of communication device is a client device also denoted as a user device, a User Equipment (UE), a mobile station, an internet of things (loT) device, a sensor device, a wireless terminal and/or a mobile terminal, is enabled to communicate wirelessly in a wireless communication system, sometimes also referred to as a cellular radio system. The UEs may further be referred to as mobile telephones, cellular telephones, computer tablets or laptops with wireless capability. The UEs in the present context may be, for example, portable, pocket- storable, hand-held, computer-comprised, or vehicle-mounted mobile devices, enabled to communicate voice and/or data, via the radio access network, with another entity, such as another receiver or a server. The UE can be a Station (STA), which is any device that contains an IEEE 802.1 1 -conformant Media Access Control (MAC) and Physical Layer (PHY) interface to the Wireless Medium (WM). The client device 100 may also be configured for communication in 3GPP related LTE and LTE-Advanced, in WiMAX and its evolution, and in fifth generation wireless technologies, such as New Radio.

Another such type of communication device is a network node also denoted as a radio network node, an access network node, an access point, or a base station, e.g. a Radio Base Station (RBS), which in some networks may be referred to as transmitter, "eNB", "eNodeB", "NodeB" or "B node", depending on the technology and terminology used. The radio network nodes may be of different classes such as e.g. macro eNodeB, home eNodeB or pico base station, based on transmission power and thereby also cell size. The radio network node can be a Station (STA), which is any device that contains an IEEE 802.1 1 -conformant Media Access Control (MAC) and Physical Layer (PHY) interface to the Wireless Medium (WM). The network node may also be a base station corresponding to the fifth generation (5G) wireless systems.

Furthermore, any method according to embodiments of the invention may be implemented in a computer program, having code means, which when run by processing means causes the processing means to execute the steps of the method. The computer program is included in a computer readable medium of a computer program product. The computer readable medium may comprises essentially any memory, such as a ROM (Read-Only Memory), a PROM (Programmable Read-Only Memory), an EPROM (Erasable PROM), a Flash memory, an EEPROM (Electrically Erasable PROM), or a hard disk drive. Moreover, it is realized by the skilled person that embodiments of the receiving device 100 comprises the necessary communication capabilities in the form of e.g., functions, means, units, elements, etc., for performing the present solution. Examples of other such means, units, elements and functions are: processors, memory, buffers, control logic, encoders, decoders, rate matchers, de-rate matchers, mapping units, multipliers, decision units, selecting units, switches, interleavers, de-interleavers, modulators, demodulators, inputs, outputs, antennas, amplifiers, receiver units, transmitter units, DSPs, MSDs, TCM encoder, TCM decoder, power supply units, power feeders, communication interfaces, communication protocols, etc. which are suitably arranged together for performing the present solution. Especially, the processor 102 may comprise, e.g., one or more instances of a Central Processing Unit (CPU), a processing unit, a processing circuit, a processor, an Application Specific Integrated Circuit (ASIC), a microprocessor, or other processing logic that may interpret and execute instructions. The expression "processor" may thus represent a processing circuitry comprising a plurality of processing circuits, such as, e.g., any, some or all of the ones mentioned above. The processing circuitry may further perform data processing functions for inputting, outputting, and processing of data comprising data buffering and device control functions, such as call processing control, user interface control, or the like.

Finally, it should be understood that the invention is not limited to the embodiments described above, but also relates to and incorporates all embodiments within the scope of the appended independent claims.