Login| Sign Up| Help| Contact|

Patent Searching and Data


Title:
METHOD FOR A SYMBOL ENUMERATION ORDERING PROCEDURE AND AN APPARATUS USING THE METHOD
Document Type and Number:
WIPO Patent Application WO/2016/198462
Kind Code:
A2
Abstract:
The invention discloses a method for a symbol enumeration ordering procedure and an apparatus using the method. The objection of the invention to allow an accurate and efficient calculation of the metrics and a determination of the optimum search order of the constellation symbols with low complexity are solved by a method comprising the steps of a) computing quadrature components √Q of partial metrics (I) in (II) and (III); b) sorting the quadrature components in ascending order; c) selecting a global minimum as a first symbol (IV) in an enumeration sequence; d) expanding a maximum of two new candidates for the next symbols within a metrics matrix, one along (II) and one along (III); e) adding the newly determined enumeration candidate(s) or symbols to a stack and sorting them in ascending order of their partial metrics (V); f) selecting the next symbol to be explored within the enumeration ordering procedure; g) removing the selected symbol from the stack; h) repeating steps d) to g) for enumerating further symbols.

Inventors:
PEREZ ADEVA ESTHER (DE)
FETTWEIS GERHARD (DE)
Application Number:
PCT/EP2016/063054
Publication Date:
December 15, 2016
Filing Date:
June 08, 2016
Export Citation:
Click for automatic bibliography generation   Help
Assignee:
UNIV DRESDEN TECH (DE)
International Classes:
H04L25/03; H04L1/06; H04L25/06; H04L27/38
Other References:
None
Attorney, Agent or Firm:
ADLER, Peter (Stachow & PartnerKrenkelstraße 3, Dresden, DE)
Download PDF:
Claims:
Claims

Method for a symbol enumeration ordering procedure using a detector for receiving radio frequency signals and a channel decoder decoding the received radio frequency signals whereas the method comprising the following steps:

a) Computing quadrature components Q of partial metrics λ in R- and I-dimension according to

and A(Qnamely a real component and an imaginary

component ;

b) Sorting the quadrature components in ascending order;

(kRkl) c) Selecting a global minimum as a first symbol x ' in an enumeration sequence xfnu [p] = xfnu [0] = — and increasing an enumeration pointer p;

d) Expanding a maximum of two new symbols within a metrics matrix Ai one along R- and I-dimension departing from the previously selected symbol x ' and computing the artial metrics Ai[xi and Ai[xi ) according

e) Adding the newly determined enumeration symbol to a stack and sorting them in ascending order of their partial metrics λι;

f) Selecting the next symbol to be explored within the enumeration ordering procedure and increasing the enumeration pointer p; g) Removing the selected symbol from the stack;

h) Repeating steps d) to g) for enumerating further symbols until the metrics matrix contains the metric values of all symbols.

Method for a symbol enumeration ordering procedure according to claim 1, wherein a controlled-expansion mechanism is employed avoiding unnecessarily generated metric values for new symbols in step d.

Method for a symbol enumeration ordering procedure according to claims 1 or 2, wherein the real component is only computed if all the metrics values previously determined within a same column have been examined in the procedure.

Method for a symbol enumeration ordering procedure according to claims 1 or 2, wherein the imaginary component is only computed if all the metrics values previously determined within a same row have been examined in the procedure.

Method for a symbol enumeration ordering procedure according to claims 1 to 4, wherein the number of enumerated symbols stored in a stack is ≤ Q.

Apparatus for soft-input soft-output sphere detection using a multiple input multiple output (MIMO) sphere detector and a channel decoder wherein soft-information generated by the detector are exchangeable between the detector and the channel decoder.

Apparatus for soft-input soft output sphere detection according to claim 6, wherein the apparatus comprising a stack of size Q whereas Q is the number of symbols receivable by the detector.

8. Apparatus for soft-input soft output sphere detection according to claim 6, wherein the apparatus comprising functional units.

9. Apparatus for soft-input soft output sphere detection according to claim 8, wherein the functional units comprising a node enumeration unit, a metric

computation unit, a soft-output computation unit and a fast sorter unit.

10. Apparatus for soft-input soft output sphere detection according to claim 9, wherein the node enumeration unit performs the method steps c) to f) according to claim

1.

11. Apparatus for soft-input soft output sphere detection according to claim 9, wherein the metric computation unit comprises a total of 2*V<2 multipliers per

dimension .

12. Apparatus for soft-input soft output sphere detection according to claim 9, wherein the soft-output

computation unit comprises an adder for adding the quadrature components without any memory requirements.

13. Apparatus for soft-input soft output sphere detection according to claim 9, wherein the fast sorter unit sorts Q metric values in parallel within one clock cycle .

Description:
Method for a symbol enumeration ordering procedure and an apparatus using the method

FIELD OF THE INVENTION The invention discloses a method for a symbol enumeration ordering procedure using a detector for receiving radio frequency signals and a channel decoder decoding the

received radio frequency signals.

The invention also discloses an apparatus for soft-input soft-output sphere detection using a multiple input multiple output (MIMO) sphere detector and a channel decoder wherein soft-information generated by the detector are exchangeable between the detector and the channel decoder.

BACKGROUND OF THE INVENTION Modern wireless communications standards, such as IEEE

802.11 (WLAN) , IEEE 802.16 (WiMAX) , 3GPP LTE or LTE- Advanced, define numerous operating modes as well as sophisticated transmission techniques to be supported by receiver architectures while satisfying a vast variety of stringent, and most often conflicting, requirements. Multi antenna detection belongs to the most computationally intensive constituents of the receiver's baseband signal processing, especially in the context of spatial- multiplexing transmission. Designing adaptive, good

performing and cost-effective MIMO (multiple input multiple output) detectors represents a challenge, especially concerning high-order systems (i.e., ≥ 4x4 MIMO

configurations with ≥ 64-QAM modulations) in the context of iterative receivers.

The task of MIMO detection is the determination of the bits c most likely sent as well as of reliability information for these bits. This can be accomplished by calculating log- likelihood ratios (LLRs) : r 1 1 (1)

L=(c m, i yj«-— min {λ 0 }+— min {λ 0 }

N 0 c|c mJ =+l N 0 c|c mJ =-l with the 1-th bit of a symbol sent through the m-th antenna represented by c m ,i and ^( y , c ,L a ) denoting the distance metric for a set of received symbols y, a given bit stream c and the a- priori knowledge L a . y, c and L a are multidimensional values. Besides the most likely sent symbol arg min {λ 0 }

x(c)|cEC

(i.e. the detection hypothesis) and its corresponding metric λο (c ML ) , the detector has to determine the counter-hypotheses and their metrics for each bit in order to generate soft- information for the decoder. Soft-information is understood as a reliability metric whose value correlates with the probability or likelihood of the bit taking on a discrete value. The detection problem can be formulated as a tree search by performing a QR-decomposition (QRD) of the channel matrix H = QR, where Q is unitary and denotes the number of constellation symbols and R is an upper triangular matrix.

With modified received symbols y ' = Q H y, modified with a- priori information, λο can be recursively calculated as the accumulation of the layer i partial metrics

Metric from interference a-priori information

already reduced symbol

estimated

symbols = Euclidean

distance

N T -1

i = Vi - 2_, Ti J x J

j=i+i (3)

The contribution λ α {χ{) may increase or decrease the value of λί, depending on both a,j and the sign of L a (ci,j). This leads to exploration of unfavorable nodes during the tree search as well as to exclusion of favorable ones. In order to avoid this effect, monotonously increasing λι is considered by redefining X a as

L-1 L-1

^a(¾) = - ^(|£α(¾)| - Qjia(Qj)) = ~ 2 ^|^α(¾)| ! with C t ≠ sign(L a (Ci )) ( 4 )

7=0 7=0

In order to cope with the impractically high complexity entailed by an exhaustive search of all minima required for eq. (1) , the tuple-search (sphere) detector (TSD) constrains the candidates or symbols set to a collection of T most likely tree paths. The metrics λο of these constellation symbols are stored in a search tuple

T: = {λο (ci) , λο (C2) λο (CT) }, defining the sphere radius as the maximum metric in the tuple:

R = max {A 0(t } (5)

c t \c t £T Currently the aforementioned metric values A and the

corresponding symbol sequence can be determined in two different ways. Thus, the metric values and the search order of the constellation symbols can be either exactly

calculated or estimated generally.

According to the state of the art one can determine the metric values by:

- Exact calculation: The exact calculation of the metric values is done according to the eq. (2) . Three real

multiplications and real additions per analyzed

constellation symbol are required for the metric value A calculation. Already determined intermediate values cannot be reused, but must be recalculated every time; or by:

- Estimation: For estimation the metric values A certain constellation characteristics of the symbols y are used (for example, geometric distance r between the constellation points) . This simplification of the metric determination achieves a significant reduction of the computational complexity, which in turn leads to a more efficient hardware complexity.

Therefore, the computationally intensive operations required for the metric calculation (MC) in eq. (2) can be

considerably simplified by an estimation based on a sector- aided enumeration approach also called search-sequence determination approach (SSD) which divides the constellation space into geometrical decision regions (see figure 1) . The distances between the constellation symbols and the

(normalized) interference-reduced received signal can be replaced by predefined geometrical distances between the constellation symbols and fixed reference points z i (such as the geometric centers of the defined decision regions) :

= ri \\yi'" - ¾ ll 2 * r? - x = ¾') 2 (6)

It is additionally possible to pre-calculate as well as r^(A ) 2 , consequently simplifying the complex-value products in eq. (2) to a single real-value multiplication or even making these operations to vanish completely. The reduced computational complexity of this so-called metric estimation (ME) approach makes it seemingly attractive for practical hardware implementations, at the cost of degrading the detection error-rate performance (due to the loss of accuracy) , and increasing the memory required (to store the precomputed distances (Δ έ ) or the products ¾ (Δ έ ) ) .

According to the state of the art one can determine the search order of the constellation symbols y by:

- Exact calculation: The Schnorr-Euchner (SE) enumeration is a widely known strategy to determine the ideal node

ordering, consisting in a sequence of constellation symbols ...,χ sorted in ascending order of their partial metrics {λ { {χ°) < λ^χ}) < ■■■ < t (x-

The complexity of the SE method is outrageous since the partial metrics A of all Q constellation symbols y have to be repeatedly computed and sorted for each parent node in the tree. This represents an enormous waste of computational resources, provided that symbols whose metrics A violate the radius constraint > R) , see above, will not be explored and, consequently, enumerating them is unnecessary; or by:

- Estimation: The estimate of the constellation symbol order, as for the metric estimate, based on the

constellation properties (e.g. geometric distance between the constellation points) . For this only the metric values A of the required symbols y are determined. With the

estimation processes the sorting of the symbols y by the SE method can be simplified or even completely dissolved.

Disregarding the contribution of the a-priori information to the metrics A, the node or symbol ordering is uniquely depending on the Euclidean distances between the

(normalized) interference reduced received Signal y[" = y"/ r ii and the constellation symbols:

(Δ*) 2 = with k = {0, ... , Q - 1} (7) The enumeration can be directly visualized on the

constellation plane, where geometrical properties can be exploited in order to approximate the ideal SE ordering with a significantly lower computational cost. An example is the so-called search sequence determination (SSD) approach, already mentioned above, which divides the constellation space into geometrical decision regions (as illustrated in figure 1) , each associated to a predefined sequence of symbols which approximates the ideal SE enumeration. The node ordering is determined by a predefined succession of symbols associated to the decision region where y[" lays. In the presence of a-priori information, however, the metric values do not depend uniquely on the Euclidean distances but also on the contribution λ α (¾) and, consequently, a sorted sequence of symbols cannot be predicted by solely examiningA . In this case Euclidean-distance-based

enumeration strategies are thus suboptimal and may lead to considerable error-rate performance degradation. A min- search (MS ) and an adaptive hypothesis (AH) approaches respectively, correct the hypothesis by taking into account the a-priori contribution. For this purpose, the MS strategy computes additional metric values A in order to determine the global minimum whereas the AH method finds a local minimum among the already computed metrics. These techniques compensate the performance loss to some extent, whereas the detection accuracy is still suboptimal.

The known solutions to the exact calculation of the metric values and the constellation symbol order have the

disadvantage that a high computational complexity is

required, which in turn leads to an unacceptable high space requirement, energy usage and / or processing time of the resulting hardware implementation. By estimates of metrics and search order the expense of the calculations can indeed be significantly reduced, however, this simplification leads to a significant reduction in the detection accuracy and thus to a considerable deterioration of the error rate. In iterative systems where a-priori information is exchanged between the detector and the decoder, the loss is

particularly high through estimation method. The a-priori information results in that the geometric characteristics of the constellation for the estimate of the metric values are no longer sufficient to determine this with the necessary accuracy. In this case, the estimate of the metric results in an unacceptable loss in detection accuracy. The objection of the invention should allow on the one hand an accurate and efficient calculation of the metrics and on the other hand a determination of the optimum search order of the constellation symbols with low complexity. An

efficient hardware implementation of this invention should be proposed.

The object of the invention will be solved by a method according to claim 1, namely a method for a symbol

enumeration ordering procedure using a detector for

receiving radio frequency signals and a channel decoder decoding the received radio frequency signals whereas the method comprising the following steps:

a) Computing quadrature components Q of partial metrics λ in R- and I-dimension according to namely a real component K and an imaginary component I; the decomposition of the calculation of the metrics in their quadrature components enables the reuse of already determined metric components (i.e., real and imaginary sums) and thereby leads to a reduction of the computational complexity.

b) Sorting the quadrature components in ascending order; the sorting of the metrics based on the quadrature components enables low-effort and precise determination of the search order. It is no longer necessary to know the metrics of all constellation symbols in advance, which also leads to a reduction of the computational complexity. c) Selecting a global minimum as a first symbol x in an enumeration sequence xf num [p] = xf num [0] = xf V) = xf' 0) and increasing an enumeration pointer p;

d) Expanding a maximum of two new candidates for the next symbols within a metrics matrix λ±, one along R- and one along I-dimension departing from the previously selected symbol ) and

Af(x> according to A f ( x> = A¾ t J + A M (i) f J ; λ± is built by spanning metrix along the modulation constellation points ;

e) Adding the newly determined enumeration candidate (s) or symbol to a stack and sorting them in ascending order of their partial metrics λι , whereas a next symbol to be explored within the tree-search is always contained in the first position of the stack;

f) Selecting the next symbol to be explored within the enumeration ordering procedure or commonly named as tree- search and increasing the enumeration pointer p;

g) Removing the selected symbol from the stack; and last but not least

h) Repeating steps d) to g) for enumerating further symbols, also named as tree nodes, until the metrics matrix contains the metric values of all symbols, whereas the number of symbols is described by Q. For example in a 16-QAM

constellation's matrix 16 metric values of 16-QAM symbols have to be enumerated.

The proposed method is also referred to as smart-sorting enumeration approach. The smart-sorting enumeration approach determines the exact SE ordering in an efficient manner. Due to the effects of the a-priori information on the partial metrics and the resulting constellation symbol ordering, the introduction of sorting operations within the enumeration process is, to some extent, unavoidable. The intrinsic quadrature decomposition of the QAM constellation results in that all constellation symbols with the same real /

imaginary component share the same real / imaginary component of the Euclidean distance. Regarding the bit- specific a-priori information it is the same that all constellation symbols with the same binary value of a particular bit share the same a-priori information for this bit. The binary mapping of the constellation symbols is also decomposed in quadrature components, whereby all

constellation symbols with the same real / imaginary

component possess the same real / imaginary value. The calculation of the partial metric values can then be split into a real and an imaginary part namely in the two QAM quadrature components:

A¾(x 4 ) = R [r? (Δ^ χ ύ) 2 } + N 0 X*( Xi )

(8)

= I {r ucl {xd) 2 } + NMX with

with Ci ≠ sign(L a (Ci )) with ci ≠ sign(L a (Ci ))

The layer partial metric of any constellation symbol can be determined by simply adding the corresponding quadrature components A £ [x k ,k ^ = x k ^ + A ( Q [x k ^ (10)

To establish the optimum search sequence, it is advantageous to use the quadrature components for sorting the metrics, thereby sorting all of metrics is no longer necessary. After sorting, the symbol which is at the point (/c E ,/c n ) = (0,0) is the tree node with the lowest metric value and should therefore be examined first. The following symbols of the search order are determined in which one calculates

gradually always a symbol in real direction and a symbol in the imaginary direction (eq. (10)) . The determination of the symbols is continued until either all constellation symbols have been investigated or no further determination is necessary for the performed tree search.

In one preferred embodiment a controlled-expansion mechanism is employed, because in the case when better metric values have been already computed, the controlled-expansion

mechanism is employed to avoid generating new metric values unnecessarily. Specifically, the metric expanding along the real component is only computed if all the metrics

previously determined within the same column have been examined in the tree search. The analogous condition applies to the metric expanding along the imaginary component with regard to the existing values within the same row. It should be additionally noticed that, in order to determine if a better metric value has been already computed, no metric value comparison is required. Due to the sorting of the metric quadrature components in ascending order, examining the element's indexes (/c E ,/c n ) within the matrix is

sufficient .

Owing to this controlled-expansion mechanism, in a preferred embodiment the number of enumerated elements stored in a stack is ≤ Q. By these means, an overflow is prevented and the stack size is kept reasonably small. It should be additionally noticed that only the stack content is stored, whereas the metrics matrix is uniquely depicted here for the sake of clarity. The object of the invention will also be solved by an apparatus for soft-input soft-output sphere detection using a multiple input multiple output (MIMO) sphere detector and a channel decoder wherein soft-information generated by the detector are exchangeable between the detector and the channel decoder. Soft-Information is a reliability metric whose value correlates with the probability or likelihood of the bit taking on a discrete value.

In one embodiment the apparatus comprises a stack of size Q whereas Q is the number of symbols y receivable by the detector. A symbol consists of several bits. The stack is preferably used to store already calculated partial metric results (values of the quadrature components) in order to reuse these metrics. The size of the stack corresponds to Q symbols. In this case Q denotes the number of constellation symbols, for example for a 64-QAM constellation a memory size of 8 symbols is required. In order to avoid an overflow of the stack memory, a control mechanism is used. The control mechanism detects whether sufficient metrics are available and stops the calculation in positive case.

In one preferred embodiment the apparatus comprises

functional units in order to perform the proposed method.

The functional units comprise a node enumeration unit, a metric computation unit, a soft-output computation unit and a fast sorter unit. The node enumeration unit performs the method steps c) to f) . Only two parallel adders are required to compute the partial layer metrics of successive nodes. These values are added to the stack, which is subsequently sorted employing the fast sorter circuit described below. The next sibling symbol (which occupies the first position in the stack after the sorting process) is then selected and removed from the stack. Two additional parallel adders are required to add the metrics accumulated throughout the upper tree layers Xi+i . The metric computation unit incorporates the L parallel addition operations required to process the a-priori information from the channel decoder. In contrast to the SSD-ME (search sequence determination- metric

estimation) approach as known from the state of the art, the TSD-SSE-MC (Tuple-sphere-detector - smart-Sorting- enumeration - metric calculation) design uses the exact computation of the metrics according to eq. (8) and (10) instead of the estimation in eq. (6), which introduces two multipliers for each of the quadrature components (namely the squared operation \y["— Xj| 2 and the subsequent

multiplication with ra) . Thus the SSE-MC approach requires a total of 2 *V<2 multipliers per dimension. The soft-output computation unit used by the SSE-MC approach comprises an adder for adding the quadrature components without any memory requirements. In contrast to the SSD-ME approach, the a-priori information is already incorporated in the

quadrature components and its computation is consequently eliminated. Additionally, the memory requirements are reduced since no pre-computed distances need to be stored. The fast sorter unit sorts V<2 metric values in parallel within one clock cycle. For this purpose, each metric value is compared with all other V<2 — 1 values in the set,

generating the V<2 X Q matrix of binary flags. The sorting indexes are subsequently generated by V<2 banks of adders, each accumulating the flag values on each column. The computed indexes are lastly employed to configure a bank of multiplexers which generates the metrics ' sequence sorted in ascending order.

The main advantage of the invention, namely the smart- search-enumeration (SSE) procedure, compared to the SE method is that both the exact metric values and the optimum search sequence can be determined at a much lower

complexity. In detail, the invention compared to the SE- procedure provides the following significant benefits:

Optimal error rate: Because of the determination of the exact metric values and the optimal search order, this invention offers the same detection accuracy as the SE method .

Drastic reduction of computational complexity: The

decomposition of the metric calculations in quadrature components leads in the worst case to a reduction of the additions by 60% and the multiplications by 90% compared to the SE method. By sorting the quadrature components

comparison operations and additions can be saved up to 97% compared to the SE method.

Dramatically reduce implementation costs: A very efficient hardware implementation is possible by reducing the

computational complexity associated with low-demand memory.

Compared to the estimate based methods a much higher accuracy can be achieved by the proposed method. When viewed in detail the advantages of the invention over the estimate based methods are:

Number of tree nodes: The present invention as well as the SE method examine a higher number of tree nodes, as it is necessary for methods estimating the metric values and the search order. The advantage of the invention is that despite of the complexity of the search tree (namely the number of tree nodes that has to be examined) is in a similar range as estimation based methods. This is possible in conjunction with the radius clipping-process. To keep a radius within reasonable size, as its size directly leads to the number of candidates which need to be considered, radius clipping is a technique to reduce the computational load.

Computational complexity: The computational complexity of metric calculation, even if it is broken down into

quadrature components is significantly higher than the computational complexity of a metric estimate. According to the invention it is however achieved that the realization costs are similar to those of the estimate-based

implementation. Also the area consumption on an integrated chip required by the implementation of the invention is only 12% higher than the estimation procedure.

Error rate to SNR: In comparison to the estimate-based method, the invention provides a significant improvement of the error rate to- SNR relationship (approximately 0.4 dB to 0.7dB at the same bit error rate) .

The turbo principle allows exploiting the full potential of MIMO communications by exchanging soft-information between the detector and the channel decoder. This enables enhancing the communication's reliability drastically or,

alternatively, reducing the transmit energy substantially while guaranteeing a maximum target error rate. The latter represents a very convenient aspect within the scope of modern and future mobile communications systems, where the predicted ever-rising power consumption trend calls for a dramatic improvement of the energy efficiency. The benefit of the turbo principle comes however at the cost of

increasing the receiver's overall computing time, which raises as a consequence of the repeated detection and decoding processes and, in the particular case of depth- first tree-search detectors, the higher number of nodes examined .

BRIEF DESCRIPTION OF THE DRAWINGS Fig. 1 Constellation's space partitioning into five decision regions employed by the search sequence determination (SSD) approach;

Fig. 2 Sample partial metrics λι and stack content before and after sorting their quadratur components, corresponding to the symbols a 16-QAM constellation according to the invention;

Fig. 3 Smart-sorting enumeration procedure, applied to

partial metrics from the example in figure 2, the rightmost tables depicts the stack's content;

Fig. 4 Tuple-search (sphere) detector regularized flow

diagram;

Fig. 5 Tuple-search (sphere) detector's task scheduling and pipelining schemes: a) TSD-SSD-ME, b) TSD-SSE-MC and c) pipeline-interleaving scheme; Fig. 6 Architecture of the TSD' s node enumeration unit (NEU)

- a) SSD-ME NEU, b) SSE-MC NEU; Fig. 7 Architecture of the TSD's metric computation unit (MCU) - a) SSD-ME MCUU, b) SSE-MC MCU;

Fig. 8 Architecture of the TSD-SSE-MC's fast sorting

circuit .

DETAILED DESCRIPTION OF THE INVENTION

Figure 2 shows sample partial metrics λι before and after sorting their quadrature components, corresponding to the

symbols of a 16-QAM constellation. The gradient color scale varies according to the magnitude of the metric values to ease visualizing the metric' s ideal ordering. In order to obtain any of the Q partial metric values, only Q partial metric components have to be computed in each dimension. By sorting the metric quadrature components in ascending order (as illustrated in figure 2 (b) ) , the exact SE symbol

sequence can be determined gradually, enumerating the tree nodes on demand instead of computing and sorting all the Q partial metrics at once (as required by the conventional SE enumeration approach) .

Figure 3 visualizes the smart sorting enumeration (SSE) procedure, applied to partial metrics from the example in figure 2. The rightmost tables depict the stack's content. The values in shaded cells correspond to symbols which have been examined during the tree search. The values in black- bordered cells represent the new metrics added to the stack on each step. The symbols selected to be next explored is denoted by bold fonts.

Figure 4 shows a tuple-search (sphere) detector (TSD) regularized data flow diagram whereas the sphere detection process can be decomposed into several functional blocks ((a)... (n) ) . The depicted operations loop is executed

iteratively, examining one tree node in each iteration. The data path has been explicitly divided into two branches. The path comprised by light-grey shaded blocks encloses the operations required by the tree node extension process, whereas the dark-grey shaded blocks contain the operations required to generate the soft-output information. The tree search procedure has been regularized in order to simplify the control flow and ease implementing parallelization techniques, like e.g., single-instruction multiple-data (SIMD) . The regulari zation concept consists in executing the same computation pattern in each iteration, regardless of the value of runtime data such as e.g., the tree layer. For this purpose, "dummy" operations are introduced and

multiplexing logic is inferred in order to select only valid data for further processing. As shown in figure 4, after the initialization stage the first and the second nodes (x ,xf) to be examined within the tree search are determined by blocks (a) and (b) respectively, according to the selected enumeration strategy (search sequence determination (SSD) or smart-sorting enumeration (SSE)) .In case the SSE approach is applied, the metric 's quadrature components have to be previously computed, as denoted by block (a' ) . In case the SSD approach is applied, the corresponding search sequence has to be determined instead (block (b' ) ) . The partial metrics of the two first nodes (X± (x ) , X± (xf) ) are

subsequently computed, by means of the metric calculation (MC) or metric estimation (ME) approach, within the

respective blocks (c) and (e) . Analogously, the corresponding reduced-interference received signals ( £ ) are calculated according to eq. (3) in blocks (d) and (f) . The previously determined partial metric Xi( °) is subsequently considered in order to update the radius tuple (5) within block (g) . This operation is performed at each tree layer for the sake of regularization, whereas it only takes effect at layer i = 0. The next step consists in selecting the tree layer to be processed within the next loop (block (h) ) . In case the search proceeds on the same layer or traces back to upper tree layers, a new sibling node needs to be enumerated and its corresponding partial metric and inter-antenna interference computed. These operations, performed by blocks (i) , (j) and (k) , respectively, are rather unconditionally executed in order to keep the regularized control flow.

Likewise, the operations involved in the soft-output

generation, actually required only at layer i = 0, are nevertheless continuously executed regardless of the tree layer. In particular, the bit-specific metrics required to compute the LLR values, see eq. (1) , are determined by block (1) and the stored subset of candidates is thereafter updated by block (m) . Finally, the LLR values are computed in block (n) . As shown in figure 4, the only conditional operation required is the end-condition check, which

terminates the execution as soon as the root node is reached (i = N T - 1) .

The time budgets (in clock cycles) of each functional block have been defined according to the number of sequential add- equivalent operations, assuming a sufficient degree of internal bit-level parallelism (limited by dependencies among data) . The overall latency is reduced by partially overlapping the execution of the defined functional blocks (instruction-level parallelism) . The resulting scheduling scheme of the TSD-SSD-ME approach is illustrated in figure 5 a) . According to this scheme and for the defined time budgets, an overall delay of 10 clock cycles is identified. Since the data generated by blocks (m) , (n) and (f) are not required by the immediately subsequent detection loop(s), a new iteration can be triggered every 5 clock cycles. In the case of the TSD-SSE-MC detection approach, the node

enumeration as well as the metric computation modules have been modified in order to implement the SSE and MC

strategies, respectively. The different computational effort and data dependencies of these blocks with respect to the analogous modules in the TSD-SSD-ME design require a

modified task scheduling scheme (depicted in figure 5 b)). Particularly, the computation of the quadrature partial metrics is firstly performed in block (a' ) (with a time budget of 2 clock cycles) . The first and second nodes to be examined, as well as their corresponding partial metrics, are subsequently determined ( (a) + (b) + (c) + (e) ) (within a time budget of 1 clock cycle) . In parallel to this, successive sibling nodes are analogously enumerated

((i)+(j)). The radius and tree layer determination operation ( (g) and (h) ) are scheduled exactly as in the case of the TSD-SSD-ME approach, whereas the execution of the

interference-reduction units (d) and (k) , as well as (f) , has been delayed 1 and 2 clock cycles, respectively. In order to maintain the proposed 5-cycle architecture, the time budget for these units has been reduced to 2 clock cycles, as illustrated in figure 5 b) . But these modules do not lie within the critical path and, consequently, no decay of the maximum achievable clock frequency is expected. Lastly, the unit computing the bit-specific metrics and LLRs (1) , (m) , (n) ) keep the same timing structure as in the previous case. The newly proposed TSD-SSE-MC approach presents a fairly more compact and structured scheduling scheme than the precessing TSD-SSD-ME design, which allows grouping the execution (and thereby the hardware resources) of similar functional blocks easily. The pipeline- interleaving technique, consisting in pipelining several independent data paths and executing them in interleaved fashion, is applied in order to enhance the design's

throughput (see figure 5 c)). This technique has been applied to the proposed TSD-SSE-MC approach. The previously described task blocks have been clustered according to their execution times t, as illustrated in figures 5 a) and 5 b) . Resulting from this, 5 sets Dq E {Dq... Dq} of parallel operations are composed, each defining a pipeline stage. On this basis, the throughput can be easily enhanced by simply interleaving the pipelined execution of the 5 task-sets corresponding to different detection paths, as illustrated in figure 5 (c) . The resource utilization is maximized when the number of interleaved data paths equals the number of available pipeline stages P = 5, achieving an average throughput T :

LN T P , LN T ,

T = E[n]P + P - l fcLK → E\n\ fcLK [bps] ' with fc LK > f CLK representing the clock frequency reached by the pipelined circuit, ideally increasing by a factor of nearly P with respect to the non-pipelined architecture (fcLK ¾ P x †CLK) · Assuming that a new detection starts as soon as another has finished, no processing element (PE) is idle after filling in the pipe, leading to the sustained

throughput expression on the right-hand side of equation (11) . The memory address computation and access operations are embedded in the pipeline structure, in parallel to the detection algorithm's computations. By these means, the execution of a new detection path can be triggered as soon as another one is concluded, without stalling the pipeline's execution .

Figure 6 shows the node enumeration unit for the SSD-MC and the inventive SSE-MC approach. This functional block

enumerates the nodes to be examined within the tree search process in the appropriate order. In the case of the SSD-ME approach the pre-computed enumeration sequences are stored in a small (« 32 bytes) look-up-table (LUT) . The first node j is directly obtained by rounding the normalized reduced- interference received signal y[" to the closest constellation symbol, whereas subsequent symbols are directly selected from one of the enumeration series contained in the LUT, as illustrated in figure 6 (a) . The selected node sequence is previously identified by determining which of the quantized constellation regions contains the received signal y'" . For this purpose, comparisons of the real and the imaginary components of Δ Γ = y'"— x e ^ are required. In the case of the SSE approach the node enumeration unit implements the node enumeration procedure (method steps c) to f) ) according to claim 1. The first node in the enumeration sequence (method step c) ) is directly obtained after sorting the metric 's quadrature components (previously determined by the metric computation unit) . In order to enumerate further nodes, steps d) to f) are executed. As illustrated in figure 6 b) , only two parallel adders are required to compute the partial layer metrics of successive nodes. These values are added to the stack, which is subsequently sorted employing the fast sorter circuit described below. The next sibling symbol (which occupies the first position in the stack after the sorting process) is then selected and removed from the stack. Two additional parallel adders are required to add the metrics accumulated throughout the upper tree layers λί+ι . The complexity of this unit is comparable to that of the SSD, which requires a similar amount of adders and comparators, in addition to a LUT containing the pre-defined enumeration sequences.

Figure 7 shows the architecture of the TSD' s metric

computation unit (MCU) , on the one hand of the SSD-ME approach already known from the state of the art and on the other hand of the inventive SSE-MC approach. In the case of the SSD-ME approach the products r^(A ) 2 are pre-computed and eventually stored in a cache (65 bytes), consequently simplifying the metric computation in eq. (2) to a single addition operation, as illustrated in figure 7 a) . This unit was extended in order to incorporate the L parallel addition operations required to process the a-priori information from the channel decoder. The main differentiating aspect in the case of the TSD-SSE-MC design resides in the exact

computation of the metrics according to eq. (8) and (10) instead of the estimation in eq. (6), which introduces two multipliers for each of the quadrature components (namely the squared operation and the subsequent

multiplication with ra) . Thus, the SSE-MC approach requires a total of 2 *V<2 multipliers per dimension, thus increasing the computational complexity considerably with respect to the SSD-ME approach. The parallelism of the block computing the a-priori information has also increased, requiring parallel instances per dimension. The quadrature metric components are sorted by employing the fast sorter circuit described below (step b) ) , which also entails a complexity increase with respect to the SSD-ME approach. The memory requirement, on the contrary, is reduced since no pre- computed Euclidean distances need to be stored. This unit (corresponding to task block (a') in figure 4) computes only the metrics' quadrature components, whereas the final partial metric values (including the upper layer's metric) are subsequently obtained in the NEU, as previously

described .

Figure 8 depicts the fast sorter circuit which sorts V<2 metric values in parallel within one clock cycle. For this purpose, each metric value is compared with all other V<2 — 1 values in the set, generating the V<2 X Q matrix of binary flags depicted in the figure 8. The sorting indexes are subsequently generated by V<2 banks of adders, each

accumulating the flag values on each column. The computed indexes are lastly employed to configure a bank of

multiplexers which generates the metrics ' sequence sorted in ascending order.

The proposed smart-sorting enumeration (SSE) approach achieves the same error-rate performance as the exact

Schnorr-Euchner enumeration, but also the same complexity. The latter can be easily mitigated by simply applying an internal radius clipping mechanism. By these means, the accumulated number of nodes explored during the tree search is reduced in 20% on average (at bit-error-rate (BER) = 10 "5 ) with regard to the Schnorr-Euchner enumeration approach. The application of the internal radius clipping does not affect the error-rate performance.

The complexity of the tree search process is strongly influenced by the order in which the constellation symbols descending from a parent node are explored during the tree search. Tree paths which are advantageous for (1) (i.e.

presenting small metrics) should be explored firstly in order to avoid spending computational effort on subtrees which will be eventually pruned by the radius constraint and, consequently, tree nodes are preferably examined in ascending order of their partial metrics. To accomplish this, numerous enumeration methods have been proposed, which mainly differ in their complexity and accuracy.

Method for a symbol enumeration ordering procedure and an apparatus using the method

Reference signs stack

node enumersation unit (NEU)

metric calculation unit (MCU)

fast sorter unit

multiplexer