Title:
ANALOG CHANNEL ESTIMATION TECHNIQUES FOR BEAMFORMER DESIGN IN MASSIVE MIMO SYSTEMS
Document Type and Number:
WIPO Patent Application WO/2019/195426
Kind Code:
A1
Abstract:
Novel channel estimation techniques that can be performed solely in the analog domain are provided. The techniques use continuous analog channel estimation (CACE), periodic analog channel estimation (PACE), and multiantenna frequency shift reference (MAFSR). These techniques provide sufficient channel knowledge to enable analog beamforming at the receiver while significantly reducing the estimation overhead. These schemes involve transmission of a reference tone of a known frequency, either continuously along with the data (CACE, MAFSR) or in a separate phase from data (PACE). Analysis of the methods shows that sufficient receiver beamforming gain can be achieved in sparse channels by building the beamformer using just the amplitude and phase estimates of the reference tone from each receiver antenna.

Inventors:
RATNAM VISHNU VARDHAN (US)
MOLISCH ANDREAS F (US)
Application Number:
PCT/US2019/025587
Publication Date:
October 10, 2019
Filing Date:
April 03, 2019
Assignee:
UNIV SOUTHERN CALIFORNIA (US)
International Classes:
H04B7/0413; H04B1/16
Domestic Patent References:
WO2017004546A1 2017-01-05
Foreign References:
KR20130119788A 2013-11-01
US20080004078A1 2008-01-03
US20040121753A1 2004-06-24
Other References:
TADILO ENDESHAW BOGALE ET AL.: "Hybrid Analog-Digital Channel Estimation and Beamforming: Training-Throughput Tradeoff (Draft with more Results and Details)", ARXIV:1509.05091V2, 9 October 2015 (2015-10-09), XP055642514, Retrieved from the Internet [retrieved on 20190715]
Attorney, Agent or Firm:
PROSCIA, James W. et al. (US)
Claims:
WHAT IS CLAIMED IS:

1. A MIMO system that applies continuous analog channel estimation, the MIMO system comprising:

a transmitter (TX) that transmits a transmitted signal that includes a data signal combined with a predetermined reference signal,

a receiver (RX) including:

a plurality of antennas wherein each antenna receives the transmitted signal and outputs an associated received signal;

a baseband conversion processor that either includes an independent oscillator or recovers the predetermined reference signal including either or both of signal amplitude and phase, each associated received signal being multiplied with an independent oscillator signal or a recovered reference signal and with its quadrature component in the analog domain, resulting in processor output signals that are low-pass signals with at least partially compensated inter-antenna phase shift;

an optional amplitude and phase compensation processor that adjusts outputs from the baseband conversion processor via analog phase shifters; and

an analog adder that sums outputs signals from either the bandpass conversion processor or the optional amplitude and phase compensation processor as a summed signal output thereby emulating signal combining and beamforming.

2. The MIMO system of claim 1 wherein the amplitude and phase compensation processor applies a baseband reference signal in the outputs from the baseband conversion processor as control signals to the analog phase shifters to further compensate for inter-antenna phase-shifts in the outputs from the baseband conversion processor.

3. The MIMO system of claim 1 wherein the transmitter transmits the transmitted signal by orthogonal frequency division multiplexing (OFDM).

4. The MIMO system of claim 1 wherein the analog adder applies a weight sum when summing the processor output signals.

5. The MIMO system of claim 1 wherein the baseband conversion processor includes a local oscillator instead of a signal recovery circuit.

6. The MIMO system of claim 1 wherein the baseband conversion processor includes:

at least one reference signal recovery circuit;

a plurality of first mixers with each antenna having an associated first mixer; and

a plurality of second mixers with each antenna having a second mixer.

7. The MIMO system of claim 6 wherein each antenna has an associated signal recovery circuit.

8. The MIMO system of claim 6 wherein the reference signal recovery circuit includes a secondary phase locked loop array that receives signals from a subset of the plurality of antennas and wherein outputs from each secondary phase locked loop are summed and fed as control signals to a primary phase locked loop.

9. The MIMO system of claim 8 wherein the outputs from each phase locked loop are summed as a weighted sum.

10. The MIMO system of claim 6 wherein:

an associated reference signal recovery circuit of each antenna isolates and recovers the predetermined reference signal as an associated isolated reference signal from the associated received signal;

an associated first mixer of each antenna multiplies the associated isolated reference signal with the associated received signal to produce an associated in-phase-derived output signal; and

an associated second mixer of each antenna multiplies a quadrature component of the associated isolated reference signal with the associated received signal to produce a quadrature-derived output signal such that the associated in-phase-derived output signal and the quadrature-derived output signal are the processor output signals.

11. The MIMO system of claim 10 further comprising a plurality of low noise amplifiers positioned between each antenna and baseband conversion processor.

12. The MIMO system of claim 10 wherein each associated reference signal recovery circuit includes an injection locked oscillator, a phase locked loop, or a bandpass filter.

13. The MIMO system of claim 10 wherein sufficient frequency separation may be provided between the predetermined reference signal and data signals to enable each reference recovery circuit.

14. The MIMO system of claim 10 wherein a plurality of filters extract a baseband signal and filter out noise from the associated in-phase-derived output signal outputted by the first plurality of mixers and a second plurality of low pass filters extract associated quadrature-derived outputted by the second plurality of mixers.

15. The MIMO system of claim 14 further comprising a low pass filter and/or down converter that receives the summed signal output and outputs an analog baseband signal.

16. The MIMO system of claim 10 wherein a sparse nature of wireless channels is exploited to ensure a large beamforming gain after the summed signal output.

17. The MIMO system of claim 12 that can perform receive beamforming without digital channel estimation.

18. The MIMO system of claim 12 further comprising an analog to digital converter that converts an analog baseband signal to digital baseband signal and a demodulator that demodulates the digital baseband signal.

19. The MIMO system of claim 1 comprising the amplitude and phase compensation processor which includes associated phase shifters, each phase shifter receiving in-phase-derived control signal and a quadrature-derived control signal such that outputs from the baseband conversion processor are phase adjusted.

20. A MIMO system of claim 19, wherein the amplitude and phase compensation processor includes lowpass filters that filter the in-phase and quadrature output signals from a baseband processor to generate the in-phase-derived control and quadrature-derived control signal to be used for the associated phase shifters, the baseband conversion processor including:

at least one reference signal recovery circuit;

a plurality of first mixers with each antenna having an associated first mixer; and

a plurality of second mixers with each antenna having a second mixer.

21. The MIMO system of claim 1 wherein the predetermined reference signal is a sinusoidal reference signal.

22. The MIMO system of claim 1 wherein a plurality of signal processors receive a plurality of pilot signals and data streams.

23. The MIMO system in claim 1, that can suppress phase noise of transmit and receive oscillators.

24. A MIMO system having a periodic analog channel estimation (PACE) receiver, the MIMO system comprising:

a transmitter (TX) that transmits a transmitted signal that is a reference signal during a beamforming design phase and a data signal during a data transmission phase;

a receiver (RX) including:

a plurality of antennas wherein each antenna receives the transmitted signal and outputs an associated received signal;

a phase and amplitude estimator that recovers the reference signal during a beamforming design phase and then multiplies each associated received signal during a beamforming design phase with a recovered reference signal and a quadrature component of the reference signal in the analog domain to form a plurality of in-phase-derived baseband control signals and a plurality of quadrature-derived baseband control signals with each antenna having an associated in-phase-derived baseband control signal and an associated quadrature-derived baseband control signal;

a plurality of variable gain phase shifters with each antenna having an associated variable gain phase shifter wherein the associated variable gain phase shifter of each antenna receives the associated in-phase-derived baseband control signal and the associated quadrature-derived baseband control signal, and through which the data signal is processed during a data transmission phase; and

an analog adder that sums outputs from the plurality of the variable gain phase shifters as a summed signal output.

25. The MIMO system of claim 24 wherein the transmitter transmits the transmitted signal by orthogonal frequency division multiplexing (OFDM).

26. The MIMO system of claim 24, that does not require continuous transmission of the reference signal.

27. The MIMO system of claim 24 wherein the analog adder applies a weight sum when summing the variable gain phase shifters.

28. The MIMO system of claim 24 wherein the phase and amplitude estimator includes:

a reference signal recovery circuit that recovers and isolates the reference signal as an isolated reference signal from the associated received signal of one or a plurality of antennas during a beam former

29. The MIMO system of claim 28 wherein the reference signal recovery circuit includes:

a subset of the plurality of antennas;

a plurality of first mixers with each antenna from the subset of the plurality of antennas having an associated first mixer;

a primary voltage controlled oscillator with a nominal oscillation frequency that may be same or different than the reference signal frequency, each associated first mixer multiplying a received signal from the antenna with an output from the primary voltage controlled oscillator to derive an associated intermediate frequency output signal;

a plurality of first low pass filters, with each antenna of the subset of the plurality of antennas having an associated first low pass filter that filters the intermediate frequency output signal from the associated first mixer from noise and other high frequency components;

a plurality of secondary phase locked loops, with each antenna of the subset of the plurality of antennas having an associated secondary phase locked loop that locks to the associated intermediate frequency output signal associated with first low pass filter, each secondary phase locked loop including a secondary variable gain amplifier, a secondary mixer and a secondary voltage controlled oscillator with nominal oscillation frequency approximately equal to the associated intermediate frequency output signal, the secondary phase locked loops converting intermediate frequency signals at the associated first low pass filter outputs to baseband with at least partially compensated inter-antenna phase shift;

a plurality of variable gain amplifiers wherein outputs from the secondary phase locked loops are further weighted by an associated variable gain amplifier with each secondary voltage controlled oscillator having an associated variable gain amplifier; and

an adder that combines weighted base-band signals from the secondary phase locked loops to obtain a baseband combined control signal with enhanced signal to noise ratio wherein the baseband combined control signal is a control signal for the primary voltage controlled oscillator, the primary voltage controlled oscillator outputting a recovered transmitted reference signal with a lower accumulation of channel and phase noise.

30. The MIMO system of claim 29 wherein the reference signal recovery circuit further includes an output mixer that receives an output of the primary voltage controlled oscillator while outputting a frequency shifted version of the primary voltage controlled oscillator output to better match the reference signal frequency.

31. The MIMO system of claim 24 wherein the phase and amplitude estimator further includes:

a plurality of first mixers with each antenna having an associated first mixer;

a plurality of second mixers with each antenna having an associated second mixer;

a plurality of first sample and hold circuits with each antenna having an associated first sample and hold circuit; and

a plurality of second sample and hold circuits with each antenna having an associated second sample and hold circuit;

wherein:

an associated first mixer of each antenna multiplies an isolated reference signal with the associated received signal to produce an associated in-phase-derived output signal;

an associated second mixer of each antenna multiplies a quadrature component of the isolated reference signal with the associated received signal to produce an associated quadrature-derived output signal;

the associated first sample and hold circuit of each antenna receives the associated in-phase-derived output signal and outputs the associated in-phase-derived baseband control signal; and

the associated second sample and hold circuit of each antenna receives the associated quadrature-derived output signal and outputs the associated quadrature-derived baseband control signal.

32. The MIMO system of claim 31 wherein the associated first sample and hold circuit and the associated second sample and hold circuit are implemented by a low pass filter followed by a low sampling rate ADC.

33. The MIMO system of claim 31 wherein the associated first sample and hold circuit and the associated second sample and hold circuit are implemented by integrate and hold circuits.

34. The MIMO system of claim 31 further comprising a plurality of low noise amplifiers positioned between each antenna and phase and amplitude estimator.

35. The MIMO system of claim 31 wherein each associated reference signal recovery circuit is an injection locked oscillator, a phase locked loop, or a bandpass filter.

36. The MIMO system of claim 31 that exploits the sparse nature of wireless channels to ensure a large beamforming gain after the analog adder.

37. The MIMO system of claim 31 further comprising a low pass filter and/or down converter that receives the summed signal output and outputs an analog baseband signal.

38. The MIMO system of claim 37 further comprising an analog to digital converter that converts the analog baseband signal to digital baseband signal.

39. The MIMO system of claim 38 further comprising an OFDM demodulator that demodulates the digital baseband signal.

40. The MIMO system of claim 24 wherein the reference signal is a sinusoidal reference signal.

41. The MIMO system of claim 24 wherein a plurality of signal processors receive a plurality of pilot signals and data streams.

42. The MIMO system of claim 24 that can perform receive beamforming without digital channel estimation.

43. A MIMO system that applies a multiantenna frequency shift reference (MAFSR), the MIMO system comprising:

a transmitter (TX) that transmits a transmitted signal that includes a data signal combined with a predetermined reference signal; and

a receiver (RX) including:

a plurality of antennas wherein each antenna receives the transmitted signal and outputs an associated received signal;

a plurality of bandpass filters wherein each antenna is associated with a bandpass filter and each bandpass filter receives a corresponding associated received signal and outputs an associated filtered received signal;

a squaring circuit squares each associated filtered received signal to form associated squared received signals, wherein each antenna is associated with a squaring circuit; and

an analog adder that adds the associated squared received signals from all antennas to produce a summed signal.

44. The MIMO system of claim 43 wherein the receiver further includes a low pass filter that filters the summed signal to form a lowpass filtered signal.

45. The MIMO system of claim 44 wherein the receiver further includes an analog to digital converter that samples the lowpass filtered signal and a demodulator that demodulates the lowpass filtered signal.

46. The MIMO system of claim 43 wherein the predetermined reference signal is a sinusoidal reference signal.

47. The MIMO system of claim 43 wherein the transmitter transmits the transmitted signal by orthogonal frequency division multiplexing (OFDM).

48. The MIMO system of claim 43 where data signals occupy odd subcarriers and reference signal occupies an even sub-carrier or vice versa, to prevent inter carrier interference in the demodulated outputs.

49. The MIMO system in claim 43 wherein phase noise of a transmit oscillator is suppressed.

50. The MIMO system of claim 43 wherein the receiver does not include a local oscillator.

Description:
ANALOG CHANNEL ESTIMATION TECHNIQUES FOR BEAMFORMER DESIGN IN MASSIVE MIMO SYSTEMS

TECHNICAL FIELD

In at least one aspect, the present invention is related to transmission in multiple-input-multiple-output systems.

BACKGROUND

Massive multiple-input-multiple-output (MIMO) systems, where the transmitter (TX) and/or receiver (RX) are equipped with a large array of antenna elements, are considered a key enabler of 5G cellular technologies due to the massive beamforming and/or spatial multiplexing gains they offer. This technology is especially attractive at millimeter (mm) wave and terahertz (THz) frequencies, where the massive antenna arrays can be built with small form factors, and where the resulting beamforming gain can help compensate for the large channel attenuation. Despite the numerous benefits, full complexity massive MIMO transceivers, where each antenna has a dedicated up/down-conversion chain, are hard to implement in practice. This is due to the cost and power requirements of the up/down-conversion chains, which include expensive and power hungry circuit components such as the analog-to-digital converters (ADCs) and digital-to-analog converters. A key solution to reduce the implementation costs of massive MIMO while retaining many of its benefits is Hybrid Beamforming, wherein a massive antenna array is connected to a smaller number of up/down-conversion chains via the use of analog hardware, such as phase-shifters and switches. While being comparatively cost and power efficient, the analog hardware can focus the transmit/receive power into the dominant channel directions, thus minimizing the performance loss in comparison to full complexity transceivers. In this paper, we focus on a special case of hybrid beamforming with only one up/down-conversion chain (for the in-phase and quadrature-phase signal components each), referred to as analog beamforming.

[0003] A major challenge for analog beamforming (and also hybrid beamforming in general) is the acquisition of channel state information (CSI) required for beamforming, referred to henceforth as rCSI. Such rCSI may include, for example, average channel parameters or instantaneous parameters, and is commonly obtained by transmitting known signals (pilots) at the TX and performing channel estimation (CE) at the RX at least once per rCSI coherence time, which is the duration for which the rCSI remains approximately constant. Since one down-conversion chain has to be time multiplexed across the RX antennas for CE in analog beamforming, several pilot re-transmissions are required for rCSI acquisition. As an example, exhaustive CE approaches require $O(M_{tx} M_{rx})$ pilots per rCSI coherence time, where $M_{tx}$, $M_{rx}$ are the number of TX and RX antennas, respectively, and $O(\cdot)$ represents the scaling behavior in big-'oh' notation. Such a large pilot overhead may consume a significant portion of the time-frequency resources when the time for which rCSI remains constant is short, such as in vehicle-to-vehicle channels, in systems using narrow TX/RX beams, e.g., massive MIMO systems, or in channels with large carrier frequencies (high Doppler) and high blocking probabilities, e.g., at mm-wave, THz frequencies. The overhead also increases system latency and makes the initial access (IA) procedure cumbersome.

As a solution, several fast CE approaches have been proposed in literature, which are discussed below assuming $M_{tx} = 1$ for convenience. Side information aided CE approaches utilize spatial/temporal statistics of rCSI to reduce the pilot overhead. Compressed sensing based CE approaches exploit the sparse nature of the channel to reduce the number of pilots per coherence time up to $O[L \log(M_{rx}/L)]$, where $L$ is the channel sparsity level. Iterative angular domain CE uses progressively narrower search beams at the RX to reduce the required transmissions to $O(\log M_{rx})$. Approaches that utilize side information to improve iterative angular domain CE or perform angle domain tracking have also been considered. Sparse ruler based approaches exploit the possible Toeplitz structure of the spatial correlation matrix to reduce pilots to […]. Since the required pilots still scale with $M_{rx}$ in these approaches, they are only partially successful in reducing the CE overhead. Furthermore, since some of these approaches require side information and/or prior timing/frequency synchronization, they may not be applicable for the IA process. Some approaches also require a long rCSI coherence time that spans the pilot re-transmissions and/or are only applicable for certain antenna array configurations and channel models. Finally, to reduce the impact of the transient effects of analog hardware on CE, the multiple pilots may have to be temporally spaced apart, thus potentially increasing the overhead and latency. The main reason for the overhead is that conventional CE approaches require processing in the digital domain, while the RX has only one down-conversion chain.

Prior to the growth of digital hardware and digital processing capabilities, some legacy systems used an alternate RX beamforming approach in single path channels that does not require digital CE. In this approach, an analog phase locked loop (PLL) is used to recover the received signal carrier at each RX antenna, and the recovered carrier is then used for down-converting the received signal at that antenna to baseband. Since the carrier and data suffer the same inter-antenna phase shift (in single path channels), the down-conversion leads to compensation of this phase shift, enabling coherent combining of the signals from each antenna (i.e., beamforming). As this approach does not involve digital processing or pilots, it shows potential in solving the high CE overhead encountered with digital CE. Since carrier recovery can also be interpreted as estimation of the channel phase at the carrier frequency using analog hardware, we shall refer to this class of techniques as analog channel estimation (ACE). The delay domain counterpart of this approach was also explored for single antenna ultra-wideband systems, referred to as transmit reference schemes. However, such legacy ACE systems were mainly proposed for space communication and hence only supported single path channels. Additionally, recovering the carrier at the RX via a PLL is difficult at the low signal-to-noise ratios (SNRs) and high frequencies encountered in mm-wave/THz systems, and leads to a high RX phase-noise, i.e., random fluctuation in the instantaneous frequency of the recovered carrier that degrades system performance.
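The phase compensation obtained through per-antenna carrier recovery in a single path channel can be illustrated with a short numerical sketch. It is not taken from the application: it assumes a noise-free complex baseband model, a uniform linear array with half-wavelength spacing, and an idealised recovered carrier that carries exactly the same phase shift as the data. The function names and all numeric values are hypothetical.

```python
import cmath
import math


def steering_phases(num_antennas, angle_rad, spacing_wavelengths=0.5):
    """Inter-antenna phase shifts of one plane wave on a uniform linear array (assumed geometry)."""
    return [2 * math.pi * spacing_wavelengths * m * math.sin(angle_rad)
            for m in range(num_antennas)]


def combine(symbol, phases, compensate):
    """Sum the per-antenna baseband signals, optionally after per-antenna carrier recovery."""
    total = 0j
    for phi in phases:
        received = symbol * cmath.exp(1j * phi)   # data carries this antenna's phase shift
        if compensate:
            carrier = cmath.exp(1j * phi)         # idealised recovered carrier, same shift
            received *= carrier.conjugate()       # down-conversion cancels the shift
        total += received
    return total


phases = steering_phases(num_antennas=16, angle_rad=math.radians(40))
symbol = (1 + 1j) / math.sqrt(2)

coherent = combine(symbol, phases, compensate=True)
naive = combine(symbol, phases, compensate=False)
print("combined amplitude with carrier recovery:", abs(coherent))
print("combined amplitude without compensation:", abs(naive))
```

With compensation the antenna signals add in phase, so the combined amplitude grows with the number of antennas; without it the signals add with unrelated phases and largely cancel.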

[0006] Accordingly, there is a need for improved, cost effective ACE architectures.

SUMMARY

[0007] In at least one aspect, a MIMO system having a continuous analog channel estimation (CACE) receiver is provided. The MIMO system includes a transmitter (TX) that transmits a transmitted signal that includes a data signal combined with a predetermined reference signal. The system also includes a receiver (RX) that includes a plurality of antennas wherein each antenna receives the transmitted signal and outputs an associated received signal. The receiver further includes a baseband conversion processor that either contains an independent oscillator or recovers the transmitted reference signal including either or both of the signal amplitude and phase. Each associated received signal is then multiplied with the independent oscillator/recovered reference signal and with its quadrature component in the analog domain, resulting in processor output signals that are low-pass signals with at least partially compensated inter-antenna phase shift. The RX further may include an optional amplitude and phase compensation processor that adjusts outputs from the baseband conversion processor via analog phase shifters. The amplitude and phase compensation processor may utilize the baseband reference signal in the outputs from the baseband conversion processor as control signals to the phase-shifters, to further compensate the inter-antenna phase-shifts in the outputs from the baseband conversion processor. The RX also includes an analog adder that sums outputs signals from either the bandpass conversion processor or the optional amplitude and phase compensation processor as a summed signal output thereby emulating signal combining and beamforming without the RX applying explicit channel estimation.

In another aspect, a MIMO system having a periodic analog channel estimation (PACE) receiver is provided. The MIMO system includes a TX that transmits a transmitted signal that is a predetermined reference signal during a beamforming design phase and a data signal during a data transmission phase. The reference signal can be a reference tone with a predetermined frequency. The MIMO system also includes a RX that includes a plurality of antennas wherein each antenna receives the transmitted signal and outputs an associated received signal. The receiver also includes a phase and amplitude estimator circuit that recovers the reference signal during a beamforming design phase and then multiplies each associated received signal during a beamforming design phase with the recovered reference signal and a quadrature component of the reference signal in the analog domain to form a plurality of in-phase-derived control signals and a plurality of quadrature-derived control signals with each antenna having an associated in-phase-derived control signal and an associated quadrature-derived control signal. The receiver also includes a plurality of variable gain phase shifters with each antenna having an associated variable gain phase-shifter wherein the associated variable gain phase shifter of each antenna receives the associated in-phase-derived (baseband) control signal and the associated quadrature-derived (baseband) control signal through which the data signal is processed during a data transmission phase. An analog adder sums outputs from the plurality of the variable gain phase shifters as a summed signal output.

In another aspect, a non-coherent MIMO system that applies a multiantenna frequency shift reference (MAFSR) receiver is provided. The MIMO system includes a transmitter (TX) that transmits a transmitted signal that includes a data signal combined with a predetermined reference signal. The reference signal can be a reference tone having a predetermined frequency. The MIMO system also includes a receiver (RX). The receiver includes a plurality of antennas wherein each antenna receives the transmitted signal and outputs an associated received signal. The receiver also includes a plurality of bandpass filters wherein each antenna is associated with a bandpass filter and each bandpass filter receives a corresponding associated received signal and outputs an associated filtered received signal. A squaring circuit squares each associated filtered received signal to form associated squared received signals, wherein each antenna is associated with a squaring circuit. The squared outputs involve, among other signals, a product between the reference signal and data signals with the inter-antenna phase shift compensated. Finally, an analog adder adds the squared received signals from all antennas to produce a summed signal.

In another aspect, a novel transmission scheme (CACE) for low-complexity massive MIMO systems, that does not require phase-shifters or explicit CSI estimation at the RX is provided.

[0011] In another aspect, a novel transmission scheme (CACE) for low-complexity massive MIMO systems, that only requires baseband phase-shifters and does not require explicit CSI estimation at the RX is provided.
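The square-and-sum operation of the MAFSR receiver can be checked numerically. The sketch below is an illustration under stated assumptions rather than the circuit of the application: it uses a noise-free real passband model with one reference tone and one data tone per antenna, omits the bandpass filters, and stands in for the low pass filter and demodulator with a single-bin correlation at the difference frequency. The constants `N`, `REF_BIN`, `DATA_BIN` and the phase values are hypothetical.

```python
import cmath
import math

N = 1024         # samples per observation window (hypothetical)
REF_BIN = 100    # reference tone frequency in DFT bins (hypothetical)
DATA_BIN = 108   # data tone frequency in DFT bins (hypothetical)


def received(n, phase_shift, data_phase):
    """Real passband sample at one antenna: reference tone plus one data tone."""
    ref = math.cos(2 * math.pi * REF_BIN * n / N + phase_shift)
    data = math.cos(2 * math.pi * DATA_BIN * n / N + phase_shift + data_phase)
    return ref + data


def mafsr_output(phase_shifts, data_phase):
    """Square each antenna signal, sum over antennas, read out the difference-frequency term."""
    diff_bin = DATA_BIN - REF_BIN
    acc = 0j
    for n in range(N):
        summed = sum(received(n, p, data_phase) ** 2 for p in phase_shifts)
        acc += summed * cmath.exp(-2j * math.pi * diff_bin * n / N)
    return acc / N


phase_shifts = [2.0 * m for m in range(8)]   # arbitrary per-antenna phase shifts (rad)
data_phase = math.pi / 4                     # phase carried by the data tone
out = mafsr_output(phase_shifts, data_phase)
print("amplitude:", abs(out), "phase:", cmath.phase(out))
```

In this model the reference-data product at each antenna does not depend on that antenna's phase shift, so the contributions add coherently and the data phase is recovered without a receiver oscillator or any per-antenna phase estimate.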

[0012] In another aspect, a novel transmission scheme (CACE) for low-complexity massive MIMO systems that can mitigate the impact of oscillator phase noise is provided.

In another aspect, an RX architecture for the CACE scheme and characterization of the achievable throughput in a wide-band channel, for a single spatial data-stream, is provided.

In still another aspect, a near-optimal power allocation for data stream and an IA procedure for CACE is provided.
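One way to see why down-converting with a recovered reference can mitigate oscillator phase noise is the following sketch. It is a simplified illustration, not the analysis of the application: it assumes a random-walk (Wiener) phase-noise model, a single antenna, no additive noise, and a recovered reference that tracks the phase noise perfectly, which a practical recovery circuit would only approximate. All names and numeric values are hypothetical.

```python
import cmath
import math
import random


def wiener_phase_noise(num_samples, step_std, rng):
    """Random-walk phase noise, an assumed oscillator model."""
    phase, samples = 0.0, []
    for _ in range(num_samples):
        phase += rng.gauss(0.0, step_std)
        samples.append(phase)
    return samples


def downconvert(received, oscillator):
    """Multiply each received sample with the conjugate of the oscillator sample."""
    return [r * o.conjugate() for r, o in zip(received, oscillator)]


rng = random.Random(0)
symbols = [cmath.exp(1j * math.pi / 2 * rng.randrange(4)) for _ in range(200)]  # QPSK
psi = wiener_phase_noise(len(symbols), step_std=0.05, rng=rng)
channel_phase = 1.1   # hypothetical phase shift seen at this antenna

# Data and reference pass through the same oscillators and the same channel.
received = [x * cmath.exp(1j * (channel_phase + p)) for x, p in zip(symbols, psi)]
recovered_ref = [cmath.exp(1j * (channel_phase + p)) for p in psi]   # tracks the noise
clean_lo = [cmath.exp(1j * channel_phase)] * len(symbols)            # ignores the noise

err_ref = max(abs(y - x) for y, x in zip(downconvert(received, recovered_ref), symbols))
err_lo = max(abs(y - x) for y, x in zip(downconvert(received, clean_lo), symbols))
print("worst-case symbol error, recovered reference:", err_ref)
print("worst-case symbol error, noise-free local oscillator:", err_lo)
```

Because the phase noise is common to the reference and the data, it cancels in the product, whereas a local oscillator that does not follow the noise leaves the full phase error on the symbols.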

[0015] In another aspect, a novel beamforming scheme (e.g., PACE aided beamformer) that enables receive beamforming in massive MIMO systems with reduced hardware and energy cost is provided, which alleviates one or more problems of the prior art. In PACE, the TX transmits a reference signal, which may be a sinusoidal tone at a known frequency, during periodic beamformer design phases. A carrier recovery circuit, such as a phase-locked loop (PLL), is used to recover the reference signal at one or a plurality of antennas. This recovered reference signal, and its quadrature component, are then used to estimate the phase off-set and amplitude of the reference signal at each RX antenna, via a bank of 'filter, sample and hold' circuits (represented as integrators in Fig. 3). At each antenna, the phase and amplitude estimates are used to control a variable gain phase-shifter, thus updating the RX analog beam. During the data transmission phase, the received signals pass through these phase-shifters and are summed, down-converted, sampled and demodulated. As the phase and amplitude estimation is done in the analog domain, $O(1)$ pilots are sufficient to perform RX beamforming. Additionally, the power from multiple channel multi-path components (MPCs) may be accumulated, thereby increasing the system diversity against MPC blocking. Note that by providing an option for externally controlling the inputs to the phase-shifters, the proposed architecture can also support digital CE. Furthermore, the same variable gain phase-shifts can be used for transmit beamforming on the reverse link. Multiple implementations of such single-chain PACE receivers can receive multiple data streams if they use orthogonal reference signals. While the proposed architecture is also applicable in narrow-band scenarios, the detailed discussion below shall use as example a wide-band scenario, though this should not be interpreted as a restriction of the applicability of the patented methods.
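The two phases of PACE described above can be sketched in complex baseband as follows. This is a minimal illustration under assumptions not found in the application: a narrow-band two-path channel on a half-wavelength uniform linear array, an ideal recovered reference, additive Gaussian noise during the design phase only, and plain averaging in place of the filter, sample and hold circuits. The function names and numeric values are hypothetical.

```python
import cmath
import math
import random


def channel(num_antennas, paths):
    """Sparse channel: each path is (complex gain, angle of arrival in radians)."""
    return [sum(g * cmath.exp(1j * math.pi * m * math.sin(a)) for g, a in paths)
            for m in range(num_antennas)]


def estimate_controls(h, num_samples, noise_std, rng):
    """Design phase: average the I and Q mixer outputs of the reference tone per antenna."""
    controls = []
    for h_m in h:
        i_sum = q_sum = 0.0
        for _ in range(num_samples):
            sample = h_m + complex(rng.gauss(0, noise_std), rng.gauss(0, noise_std))
            i_sum += sample.real   # in-phase mixer output
            q_sum += sample.imag   # quadrature mixer output
        controls.append((i_sum / num_samples, q_sum / num_samples))
    return controls


def beamform(h, controls, symbol):
    """Data phase: weight each antenna signal by its (I, Q) controls and sum."""
    return sum(complex(i, -q) * h_m * symbol for (i, q), h_m in zip(controls, h))


h = channel(16, [(1.0, math.radians(20)), (0.5j, math.radians(-45))])
ideal = sum(abs(x) ** 2 for x in h)   # gain with perfect amplitude and phase knowledge
controls = estimate_controls(h, num_samples=64, noise_std=0.5, rng=random.Random(1))
symbol = 1j
y = beamform(h, controls, symbol)
print("achieved gain:", abs(y), "ideal gain:", ideal)
```

A single reference transmission sets the controls for every antenna at once, and because each weight follows the total channel at its antenna, the power of both paths contributes to the combined output.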
[0016] In another aspect, a novel PACE technique that enables RX beamforming with low CE overhead is provided.

[0017] In another aspect, a novel PACE technique that requires only one reference recovery circuit is provided.

[0018] In another aspect, a novel PACE technique with a reference recovery circuit that can extract the reference signal from multiple antennas is provided.

[0019] In yet another aspect, a novel PACE technique that does not require continuous transmission of the reference signal is provided.

[0020] In still another aspect, a receiver architecture that supports the PACE technique is provided, and the achievable system throughput in a wide-band channel is characterized.

[0021] In another aspect, a non-coherent MIMO MA-FSR receiver architecture that does not require an oscillator at the receiver and can perform beamforming with low CE overhead is provided.

[0022] In another aspect, a MIMO MA-FSR receiver architecture that suppresses transmit oscillator phase noise is provided.

[0023] In still another aspect, an MA-FSR scheme with a reference signal and data signal design is provided that ensures the product of the data signal with itself does not cause interference to the product between the reference signal and data signal at the outputs of the squaring circuits.

BRIEF DESCRIPTION OF THE DRAWINGS

[0024] FIGURE 1: Block diagram for a MIMO system with a multi-antenna CACE receiver.

[0025] FIGURE 2: Block diagram for a MIMO system with a multi-antenna CACE receiver.

[0026] FIGURE 3: Block diagram for a MIMO system with a multi-antenna PACE receiver.

[0027] FIGURE 4: Block diagram of weighted carrier arraying for reference tone recovery.

[0028] FIGURE 5: Block diagram for a MIMO system with a multi-antenna frequency shift reference (MA-FSR) receiver.

[0029] FIGURES 6A and 6B: An illustration of the transmit and receive signals with CACE aided beamforming. A) Transmit signal at TX antenna m; B) Baseband signal at RX antenna m.

[0030] FIGURE 7: Comparison of analytical (from Lemma 3.1) and simulated statistics of the nDFT coefficients of a sample RX phase-noise process with $T_s = 1\,\mu\mathrm{s}$, $K_1 = K_2 + 1 = 512$ and $\sigma_\theta^2 = 1/T_s$. Simulations averaged over $10^6$ realizations.

[0031] FIGURE 8: Comparison of analytical SERs (from (1.28) with/without Remark 3.1) to simulated results, for different sub-carriers of a CACE receiver with Quadrature Phase Shift Keying. Simulations consider $\sigma_\theta^2 = 1/T_s$ (−93 dBc at 10 MHz offset), mean RX oscillator frequencies of $f_c$ and $f_c + 5$ MHz, and are obtained by averaging over $10^6$ realizations.

[0032] FIGURE 9: Comparison of approximate capacity (from (29)) versus g, with the optimal choice of $E^{(r)}$ and with $E^{(r)}$ chosen via (32), respectively, for $\sigma_\theta^2 = 1/T_s$ and $\beta_{\max} E_s/(K N_0) = -3, 3$ dB.

[0033] FIGURES 10A and 10B: Throughput of ACE schemes (PACE, CACE, MA-FSR) and of digital CE with either perfect rCSI or nested array sampling versus SNR and L. For PACE, the arrayed PLL from [51] (V. V. Ratnam and A. F. Molisch, “Periodic analog channel estimation aided beamforming for massive MIMO systems,” IEEE Transactions on Wireless Communications, 2019) is used with identical parameters, and the RX beamformer design phase lasts 6 symbols. For nested array sampling, horizontal (4,4) and vertical (2,2) nested arrays are used. A) Sparse channels. B) Dense stochastic channels.

[0034] FIGURE 11: An illustrative transmission block structure for the PACE scheme.

[0035] FIGURE 12: Block diagram of the PLL at antenna 1 for reference recovery, and a sample illustration of its output.

[0036] FIGURES 13A and 13B: Accuracy of the analytical approximation for the filter, sample and hold outputs in (15) versus SNR for the one PLL and weighted arraying

architectures. In Fig. 13 A, we plot 1— IE j j .. — -— ei 11 for simulations and i— e tor analytic approximation. We assume A® ,'p/3 and the remaining parameters are from Table 1. [0037] FIGURES 14 A and 14B: Comparison of iSE for PACE based beamforming and other schemes versus SNR an number of MFCs. Here ~=

E cs K Vk £ X

and the FLU parameters are from Table 1. For Fig. 14B we use ----------- = 1

[0038] FIGURE 15: SER for data streams k = 50, 74, 94 of an MA-FSR RX with QAM modulation (K = 50, $E^{(r)} = E_s/2$, $E^{(d)} = E_s/(2|\mathcal{K}|)$, K = 50, g = 5).

[0039] FIGURE 16: Comparison of the iSE (without channel estimation overhead) of MA-FSR and analog beamforming. (a) For MA-FSR, $E^{(r)} = E_s/2$, K = 128 and g = 5. (b) For analog beamforming, we only use the subcarriers {0, ..., K − g − 1}.

[0040] FIGURES 17A and 17B: MA-FSR designs with improved performance. A) With noise suppression. B) With narrow band pass filter.

DETAILED DESCRIPTION

[0041] Reference will now be made in detail to presently preferred compositions, embodiments and methods of the present invention, which constitute the best modes of practicing the invention presently known to the inventors. The Figures are not necessarily to scale. However, it is to be understood that the disclosed embodiments are merely exemplary of the invention that may be embodied in various and alternative forms. Therefore, specific details disclosed herein are not to be interpreted as limiting, but merely as a representative basis for any aspect of the invention and/or as a representative basis for teaching one skilled in the art to variously employ the present invention. It is also to be understood that this invention is not limited to the specific embodiments and methods described below, as specific components and/or conditions may, of course, vary. Furthermore, the terminology used herein is used only for the purpose of describing particular embodiments of the present invention and is not intended to be limiting in any way. It must also be noted that, as used in the specification and the appended claims, the singular forms “a,” “an,” and “the” comprise plural referents unless the context clearly indicates otherwise. For example, reference to a component in the singular is intended to comprise a plurality of components.

[0044] The term “comprising” is synonymous with “including,” “having,” “containing,” or “characterized by.” These terms are inclusive and open-ended and do not exclude additional, unrecited elements or method steps.

[0045] The phrase “consisting of” excludes any element, step, or ingredient not specified in the claim. When this phrase appears in a clause of the body of a claim, rather than immediately following the preamble, it limits only the element set forth in that clause; other elements are not excluded from the claim as a whole.

[0046] The phrase “consisting essentially of” limits the scope of a claim to the specified materials or steps, plus those that do not materially affect the basic and novel characteristic(s) of the claimed subject matter.

[0047] With respect to the terms “comprising,” “consisting of,” and “consisting essentially of,” where one of these three terms is used herein, the presently disclosed and claimed subject matter can include the use of either of the other two terms.

[0048] Throughout this application, where publications are referenced, the disclosures of these publications in their entireties are hereby incorporated by reference into this application to more fully describe the state of the art to which this invention pertains.

[0049] Abbreviations:

[0050] “ACE” means analog channel estimation.

[0051] “aCSI” means average channel state information.

[0052] “ADC” means analog-to-digital converter.

[0053] “BS” means base-station.

[0054] “DAC” means digital-to-analog converter.

[0055] “IA” means initial access.

[0056] “MA-FSR” means multi-antenna frequency shift reference.

[0057] “MIMO” means multiple-input-multiple-output.

[0058] “MPC” means multi-path component.

[0059] “OFDM” means orthogonal frequency division multiplexing.

[0060] “CACE” means continuous analog channel estimation.

[0061] “PACE” means periodic analog channel estimation.

[0062] “rCSI” means required channel state information.

[0063] “RTAT” means reference tone aided transmission.

[0064] “SNR” means signal-to-noise ratio.

[0065] “TX” means transmitter.

[0066] “RX” means receiver.

[0067] “UE” means user equipment.

[0068] “aBD” means analog beamformer design phase.

[0069] With reference to Figures 1 and 2, a MIMO system, and in particular a CACE system implementing continuous analog channel estimation aided beamforming, is provided. MIMO system 10 or 10’ includes transmitter 12 that transmits a transmitted signal that includes a data signal and a continuously transmitted reference signal. Typically, the reference signal is, but is not restricted to be, a sinusoidal reference tone having a predetermined frequency. In a refinement, transmitter 12 transmits the transmitted signal by orthogonal frequency division multiplexing. Receiver 14 includes a plurality of antennas 18^i, where i is an integer from 1 to $M_{\mathrm{rx}}$, the number of antennas. Each antenna 18^i receives the transmitted signal and outputs an associated received signal $s_{\mathrm{rx},i}(t)$. The baseband conversion processor 20 either contains an independent oscillator or recovers the transmitted reference signal, including either or both of the signal amplitude and phase, and then multiplies each associated received signal with the independent oscillator output or the recovered reference signal, and with its quadrature component, in the analog domain, resulting in processor output signals. The baseband conversion processor 20 output signals are essentially lowpass signals with at least partially compensated inter-antenna phase shifts. The RX may also contain an optional amplitude and phase compensation processor 22 for phase and amplitude adjustment of the signals output from baseband conversion processor 20 via the use of analog phase shifters. The amplitude and phase compensation processor 22 may utilize the baseband reference signal in the outputs from the baseband conversion processor 20 as control signals to the phase-shifters, to further compensate the inter-antenna phase-shifts in the outputs from the baseband conversion processor. Analog adder 24 sums the output signals of the baseband conversion processor 20, or of the optional amplitude and phase compensation processor 22, as a summed signal output (i.e., the real and imaginary components of the combined baseband signal), thereby emulating signal combining and beamforming without the receiver applying explicit channel estimation. In a refinement, analog adder 24 applies a weighted sum when summing the processor output signals. Advantageously, the sparse nature of wireless channels is exploited to ensure a large beamforming gain after the summed signal output. Moreover, MIMO system 10 or 10’ can suppress the phase noise of transmit and receive oscillators.

Still referring to Figures 1 and 2, in a variation, MIMO system 10 or 10’ includes a plurality of low noise amplifiers 26, with each antenna 18^i having an associated low noise amplifier 26^i that outputs the associated received signal. The associated low noise amplifier 26^i of each antenna 18^i receives and amplifies the associated received signal to form an amplified associated received signal $s_{\mathrm{rx},i}(t)$. Baseband conversion processor 20 may either include one or more reference signal recovery circuits 28 (in Figure 1) or may include an RX local oscillator 28’ (in Figure 2). In a refinement illustrated in Figure 1, each antenna 18^i has an associated reference recovery circuit 28^i. The associated reference recovery circuit 28^i of each antenna 18^i isolates and recovers the received reference signal as an associated isolated reference signal from the associated amplified received signal. In a refinement illustrated in Figure 1, each associated reference recovery circuit 28^i includes a narrow band pass filter and/or an injection locked oscillator and/or a phase locked loop, followed by an optional variable gain amplifier. Sufficient frequency separation may be provided between the transmitted reference signal and data signals to enable the reference recovery. Baseband conversion processor 20 also includes a plurality of first mixers 30, with each antenna 18^i having an associated first mixer 30^i. The associated first mixer 30^i of each antenna 18^i multiplies the associated isolated reference signal (in Figure 1) or the output of local oscillator 28’ (in Figure 2) with the associated received signal to produce an associated in-phase-derived output signal. Baseband conversion processor 20 also includes a plurality of second mixers 32, with each antenna 18^i having an associated second mixer 32^i. The associated second mixer 32^i of each antenna multiplies a quadrature component of the associated isolated reference signal or local oscillator output with the associated received signal to produce a quadrature-derived output signal. In a refinement, the quadrature component of the associated isolated reference signal can be formed by 90° phase shifters, illustrated as 34^i in Figure 1 and 34 in Figure 2. Characteristically, the in-phase-derived output signal and the quadrature-derived output signal are the processor output signals.
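The per-antenna multiplication and summation of the Figure 1 variation can be sketched in complex-envelope form, where multiplying by the conjugated reference models the first (in-phase) and second (quadrature) mixers together. This is a minimal sketch under stated assumptions: reference recovery is idealized, the reference and data tones see the same hypothetical per-antenna gain in `channel`, and the names `cace_combine` and `dft_bin` are illustrative, not taken from the source.

```python
import cmath
import math


def cace_combine(channel, ref_bin, data_bin, data_symbol, n_samples=32):
    """Multiply each antenna signal by its conjugated reference and sum."""
    summed = []
    for n in range(n_samples):
        ref = cmath.exp(2j * math.pi * ref_bin * n / n_samples)
        data = data_symbol * cmath.exp(2j * math.pi * data_bin * n / n_samples)
        total = 0j
        for h in channel:
            received = h * (ref + data)   # reference and data share the channel
            recovered = h * ref           # idealized per-antenna reference recovery
            total += recovered.conjugate() * received
        summed.append(total)
    return summed


def dft_bin(samples, k):
    """Normalized DFT coefficient of `samples` at bin k."""
    n_samples = len(samples)
    return sum(s * cmath.exp(-2j * math.pi * k * n / n_samples)
               for n, s in enumerate(samples)) / n_samples


# Four antennas with arbitrary (hypothetical) phase shifts.
channel = [cmath.exp(1j * p) for p in (0.3, 1.9, -2.2, 0.8)]
summed = cace_combine(channel, ref_bin=2, data_bin=5, data_symbol=1 - 1j)
recovered_symbol = dft_bin(summed, 5 - 2) / len(channel)
```

In this idealized setting the data symbol appears at the difference between the data and reference frequencies, and the contributions of all antennas add in phase even though no explicit channel estimate was formed.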

[0070] In the variation depicted in Figure 1, MIMO system 10 includes a low pass filter and/or downconverter 40 that receives the in-phase-derived output signal and the quadrature-derived output signal and outputs a baseband signal (which is a low pass filtered summed signal output). The baseband signal is sampled by analog-to-digital converter 36 and then demodulated by OFDM demodulator 38 that demodulates the bandpass signal.

[0071] In the variation depicted in Figure 2, low pass filters 42^i filter outputs from mixers 30^i, while low pass filters 44^i filter outputs from mixers 32^i, prior to signals being provided to amplitude and phase compensation 22. Filters 42^i and filters 44^i extract the baseband signal and filter out noise from the associated in-phase-derived output signals and the associated quadrature-derived output signals at the outputs of mixers 30^i and 32^i. In particular, the amplitude and phase compensation processor includes lowpass filters to filter the in-phase and quadrature output signals from a baseband processor to generate the in-phase-derived and quadrature-derived control signals to be used for the associated phase shifters. Amplitude and phase compensation circuit 22 provides phase and amplitude adjustment of the plurality of in-phase-derived output signals and the plurality of quadrature-derived output signals from baseband conversion processor 20. The amplitude and phase compensation circuit 22 involves a low pass filter 50^i at each antenna i that receives an in-phase-derived output signal and outputs an in-phase-derived control signal. Similarly, low pass filter 52^i receives a quadrature-derived output signal and outputs a quadrature-derived control signal. These control signals correspond to the baseband reference signals in the baseband outputs from filters 42^i and 44^i. Amplitude and phase compensation 22 also includes a plurality of variable gain phase-shifters 56, with each antenna 18^i having an associated variable gain phase-shifter 56^i. The associated variable gain phase-shifter 56^i of each antenna 18^i receives from filters 50^i and 52^i the associated in-phase-derived control signal and the associated quadrature-derived control signal, and the data signal is processed through it during a data transmission phase. Analog adder 24 sums outputs from the plurality of the variable gain phase-shifters as a summed signal output (i.e., a baseband signal with real and imaginary components).

[0072] With reference to Figure 3, a schematic illustration of a MIMO system implementing periodic analog channel estimation (PACE) aided beamforming is provided. MIMO system 60 includes transmitter 62 that transmits a transmitted signal that includes a data signal during a data transmission phase and includes a reference signal during the beamformer design phase. Typically, the reference signal is, but is not restricted to be, a sinusoidal tone having a predetermined frequency. Moreover, this system does not require continuous transmission of the reference signal. In a refinement, transmitter 62 transmits the transmitted signal by orthogonal frequency division multiplexing. Receiver 64 includes a plurality of antennas 68^i, where i is an integer from 1 to $M_{\mathrm{rx}}$, the number of antennas. Each antenna 68^i receives the transmitted signal and outputs an associated received signal. Phase and amplitude estimator 70 recovers the reference signal during a beamformer design phase and then multiplies each associated received signal with the recovered reference tone and a quadrature component of the reference tone in the analog domain to form a plurality of in-phase-derived baseband control signals and a plurality of quadrature-derived baseband control signals, with each antenna having an associated in-phase-derived control signal and an associated quadrature-derived control signal. Receiver 64 also includes a plurality of variable gain phase-shifters 76, with each antenna 68^i having an associated variable gain phase-shifter 76^i. The associated variable gain phase-shifter 76^i of each antenna 68^i receives the associated in-phase-derived control signal and the associated quadrature-derived control signal, and the data signal is processed through it during a data transmission phase. An analog adder 80 sums outputs from the plurality of the variable gain phase-shifters as a summed signal output. In a refinement, analog adder 80 applies a weighted sum when summing the processor output signals. Advantageously, the sparse nature of wireless channels is exploited to ensure a large beamforming gain after the summed signal output.

[0073] Still referring to Figure 3, in a variation, MIMO system 60 includes a plurality of low noise amplifiers 78, with each antenna 68^i having an associated low noise amplifier 78^i. The associated low noise amplifier 78^i of each antenna 68^i receives and amplifies the associated received signal to form an amplified associated received signal.

[0074] Still referring to Figure 3, in a variation, the phase and amplitude estimator 70 includes a reference tone recovery circuit 82 that recovers and isolates the reference signal as an isolated reference signal from the associated received signal of a single antenna (e.g., the first antenna 68^1) during a beamformer design phase. In another variant, the reference recovery circuit 82 may recover the isolated reference signal from received signals from a plurality of antennas (see Figure 4). In a refinement, reference recovery circuit 82 includes a narrow band pass filter and/or an injection locked oscillator and/or a phase locked loop, followed by an optional variable gain amplifier. In the example depicted in Figure 3, the reference tone recovery circuit 82 is a phase locked loop. The phase and amplitude estimator 70 also includes a plurality of first mixers 84, with each antenna having an associated first mixer 84^i. The associated first mixer 84^i of each antenna multiplies the isolated reference tone with the associated received signal $s_{\mathrm{rx},i}(t)$ to produce an associated in-phase-derived output signal. The phase and amplitude estimator 70 also includes a plurality of second mixers 86, with each antenna having an associated second mixer 86^i.
The associated second mixer 86^i of each antenna 68^i multiplies a quadrature component of the isolated reference signal with the associated received signal $s_{\mathrm{rx},i}(t)$ to produce an associated quadrature-derived output signal. The phase and amplitude estimator 70 also includes a plurality of first ‘filter, sample and hold’ circuits 90, with each antenna having an associated first filter, sample and hold circuit 90^i. The associated first filter, sample and hold circuit 90^i of each antenna 68^i receives the associated in-phase-derived output signal and outputs the associated in-phase-derived control signal for phase shifter 76^i. The phase and amplitude estimator 70 also has a plurality of second ‘filter, sample and hold’ circuits, with each antenna having an associated second filter, sample and hold circuit 92^i. The associated second filter, sample and hold circuit 92^i of each antenna 68^i receives the associated quadrature-derived output signal and outputs the associated quadrature-derived control signal. The filter, sample and hold circuits low pass filter the outputs from mixers 84^i and 86^i, and sample the corresponding filtered outputs. In one variant, such filter, sample and hold circuits can be implemented by a low pass filter followed by a low sampling rate ADC. In another variant, the filter, sample and hold circuits can be implemented by an integrate and hold circuit. The data signals received at each antenna 68^i during the data transmission phase are phase shifted by the phase shifters 76^i, whose control signals are determined by the outputs of the filter, sample and hold circuits 90^i and 92^i from the preceding beamformer design phase. Advantageously, the system of Figure 3 can perform receive beamforming without digital channel estimation.

[0075] In a refinement, the reference signal can be recovered from one antenna or a plurality of antennas. Figure 4 depicts a weighted carrier array subsystem 110 for reference tone recovery from a subset of the plurality of antennas. Each antenna of the subset is designated by 68^j, where j is a label for the antennas in the subset. Subsystem 110 includes a plurality of first mixers 105, with each antenna 68^j connected to subsystem 110 having an associated first mixer 105^j. Subsystem 110 also includes a primary voltage controlled oscillator (VCO) 117, with a nominal oscillation frequency that may be the same as or different from the reference signal frequency. The first mixer 105^j at antenna 68^j multiplies the received signals from antenna 68^j with the output from the VCO 117 to derive an associated intermediate frequency signal. Subsystem 110 also includes a plurality of first low pass filters 106, with each antenna 68^j connected to subsystem 110 having an associated first low-pass filter 106^j that filters noise and other high frequency components out of the intermediate frequency output signal from mixer 105^j. Subsystem 110 also includes a plurality of secondary phase locked loops, with each antenna 68^j connected to subsystem 110 having an associated secondary phase locked loop 114^j that locks to the intermediate frequency output signal associated with first low pass filter 106^j. In one refinement depicted in Figure 4, each secondary phase locked loop 114^j includes a secondary variable gain amplifier, a secondary mixer and a secondary VCO with nominal oscillation frequency approximately equal to the intermediate frequency signal. The secondary PLLs convert the intermediate frequency output signals at the associated first low pass filter outputs to base-band with at least partially compensated inter-antenna phase shift. The outputs from the secondary PLLs are further weighted by a plurality of variable gain amplifiers, with each secondary VCO having an associated variable gain amplifier 115^j. The weighted base-band signals from the secondary PLLs are combined using an adder 116 to obtain a base-band combined control signal with enhanced signal to noise ratio.
The base-band control signal from the adder 116 is used as a control signal to control the oscillation frequency of the primary VCO 117. Due to the enhanced signal to noise ratio, the primary VCO output may recover the transmitted reference signal with a lower accumulation of channel noise and phase noise. Subsystem 110 may also include an optional second mixer 118 that takes as an input the output of the primary VCO 117 and provides as an output signal a frequency shifted version of the primary VCO output, to better match the reference signal frequency.

[0076] In a variation, MIMO system 60 includes a low pass filter and/or downconverter 100 that receives as input the summed signal output and outputs a baseband signal. The baseband signal is sampled by analog-to-digital converter 102 to form a digital baseband signal. The digital baseband signal is then demodulated by OFDM demodulator 104 that demodulates the bandpass signal.
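The signal to noise ratio enhancement attributed to the weighted carrier arraying of Figure 4 can be illustrated with a small calculation. The sketch below assumes independent noise at each antenna and compares equal weights with weights proportional to amplitude over noise variance (a maximal-ratio style choice, which is an assumption here; the source does not specify how the variable gain amplifier weights are chosen). All amplitudes and variances are hypothetical.

```python
def combined_snr(amplitudes, noise_vars, weights):
    """SNR of a weighted sum of per-antenna signals with independent noise."""
    signal = sum(w * a for w, a in zip(weights, amplitudes))
    noise = sum(w * w * v for w, v in zip(weights, noise_vars))
    return signal * signal / noise


amplitudes = [1.0, 0.5, 0.25, 1.0]   # hypothetical per-antenna signal amplitudes
noise_vars = [1.0, 1.0, 1.0, 1.0]    # hypothetical per-antenna noise variances

# SNR of the single best antenna, for reference.
best_single = max(a * a / v for a, v in zip(amplitudes, noise_vars))
# Equal-weight combining.
equal = combined_snr(amplitudes, noise_vars, [1.0] * 4)
# Weights proportional to amplitude / noise variance (assumed choice).
matched = combined_snr(amplitudes, noise_vars,
                       [a / v for a, v in zip(amplitudes, noise_vars)])
```

Under these assumptions both combining rules exceed the best single antenna, and the amplitude-matched weights give the larger SNR of the two.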

[0077] With reference to Figure 5, a non-coherent MIMO system using a multi-antenna frequency shift reference (MA-FSR) receiver is provided. MIMO system 120 includes transmitter 122 that transmits a transmitted signal that includes a data signal and a reference signal. Typically, the reference signal is, but is not restricted to be, a sinusoidal tone having a predetermined frequency. In a refinement, transmitter 122 transmits the transmitted signal by orthogonal frequency division multiplexing. Receiver 124 includes a plurality of antennas 128^i, where i is an integer from 1 to $M_{\mathrm{rx}}$, the number of antennas. Each antenna 128^i receives the transmitted signal and outputs an associated received signal, which passes through band pass filter 130 that leaves the associated received signal undistorted while suppressing out-of-band noise. Squaring circuit 132^i squares each associated received signal to form associated squared received signals. The squared outputs involve, among other signals, a product between the reference signal and data signals with the inter-antenna phase shift compensated. Analog adder 134 adds the squared received signals to produce a summed signal $r_{\mathrm{sq}}(t)$. Low pass filter 136 filters the summed signal to form a low pass filtered signal. The low pass filtered signal is sampled by ADC 140 and demodulated by OFDM demodulator 144 to extract the output signal corresponding to the product between the reference signal and data signals. Advantageously, the sparse nature of wireless channels is exploited to ensure a large beamforming gain after the summed signal output. The system of this embodiment can suppress the phase noise of the transmit oscillator. Advantageously, this embodiment does not require an oscillator at the receiver.
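A minimal numerical sketch of the square-and-sum operation of Figure 5 follows. It assumes real-valued tones, a single data tone, and a hypothetical inter-antenna phase shift `theta` that is common to the reference and data at each antenna; the correlation at the difference frequency stands in for the low pass filter and OFDM demodulator. The function name `ma_fsr_output` and all parameter values are illustrative.

```python
import cmath
import math


def ma_fsr_output(antenna_phases, ref_bin, data_bin, data_amp, data_phase, n_samples=256):
    """Square the real signal at each antenna, sum, and read the difference bin."""
    diff_bin = data_bin - ref_bin
    acc = 0j
    for n in range(n_samples):
        t = 2.0 * math.pi * n / n_samples
        summed = 0.0
        for theta in antenna_phases:
            ref = math.cos(ref_bin * t + theta)
            data = data_amp * math.cos(data_bin * t + theta + data_phase)
            summed += (ref + data) ** 2   # squaring circuit, then analog adder
        # Correlate with the difference frequency (stands in for LPF + OFDM demod).
        acc += summed * cmath.exp(-1j * diff_bin * t)
    return 2.0 * acc / n_samples


phases = [0.1, 2.4, -1.3, 3.0, 0.9, -2.7]   # hypothetical inter-antenna phase shifts
out = ma_fsr_output(phases, ref_bin=40, data_bin=43, data_amp=0.5, data_phase=0.6)
```

Because the reference-data product at each antenna depends only on the phase difference between the two tones, the unknown `theta` drops out and the six antenna contributions add coherently, without any oscillator in the modeled receiver.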

[0078] With reference to Figure 5, the reference signal and data signals are designed such that the product of the data signal with itself does not cause interference to the product between the reference signal and data signal at the outputs of the squaring circuits 132^i. In one variant, this is implemented by allowing the reference signal to occupy an even numbered OFDM sub-carrier while all data signals occupy odd numbered OFDM sub-carriers, or vice versa.
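The even/odd sub-carrier allocation can be checked with simple index arithmetic. The sketch below considers only the difference-frequency terms produced by squaring (the sum-frequency terms are assumed to be removed by the low pass filter), and the function name `difference_bins` and the sub-carrier indices are hypothetical.

```python
def difference_bins(ref_bin, data_bins):
    """Sub-carrier offsets produced by squaring (difference terms only)."""
    ref_times_data = {abs(d - ref_bin) for d in data_bins}
    data_times_data = {abs(a - b) for a in data_bins for b in data_bins}
    return ref_times_data, data_times_data


# Reference on an even sub-carrier, data on odd sub-carriers.
wanted, self_interference = difference_bins(ref_bin=0, data_bins=[1, 3, 5, 7, 9])
overlap = wanted & self_interference

# For contrast: contiguous data sub-carriers, where the two sets collide.
wanted_c, self_c = difference_bins(ref_bin=0, data_bins=[1, 2, 3])
overlap_contiguous = wanted_c & self_c
```

With the even/odd split, the reference-data products fall on odd offsets and the data-data products on even offsets, so `overlap` is empty; with contiguous data sub-carriers the two sets share offsets.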

[0079] With respect to the systems of Figures 1-5, the occurrence and potential suppression of phase noise should be considered. Phase noise is an oscillator impairment that causes the instantaneous frequency of an oscillator output to waver randomly. Therefore, in a conventional MIMO receiver, when the received signal is converted to base-band by an oscillator, phase noise distorts the resulting base-band signal, causing degradation in performance. In CACE and MA-FSR based receivers, while this phase-noise induced distortion is random, the same distortion is applied to both the data and reference signals, since the data and reference signals are transmitted simultaneously. Thus, if the base-band reference signal is used to control the phase shifters 56^i in the CACE system of Figure 2, the distortion caused by the phase noise is implicitly undone at the phase shifter outputs. In a similar way, the baseband conversion circuit 20 for the CACE receiver in Figure 1 and the squaring circuits 132^i in the MA-FSR receiver in Figure 5 multiply the received data signal with the reference signal prior to combining and demodulation. Since the phase noise distorts both the reference and data signals identically, their multiplication undoes the impact of the phase noise at the output of baseband conversion circuit 20 in Figure 1 and at the outputs of squaring circuits 132^i in Figure 5. Thus CACE and MA-FSR receivers provide resilience to oscillator phase noise.
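The cancellation argument above can be illustrated for a single antenna. This is a sketch under stated assumptions: the phase noise is modeled as a random walk (a Wiener process, assumed here for illustration), it multiplies the reference and data tones identically, and the tone indices and step size are hypothetical.

```python
import cmath
import math
import random


def phase_noise_walk(n_samples, step_std, seed=1):
    """Random-walk phase noise (a Wiener process is assumed for illustration)."""
    rng = random.Random(seed)
    theta, walk = 0.0, []
    for _ in range(n_samples):
        theta += rng.gauss(0.0, step_std)
        walk.append(theta)
    return walk


n_samples, ref_bin, data_bin = 64, 4, 9
theta = phase_noise_walk(n_samples, step_std=0.3)
worst_error_plain = 0.0
worst_error_ref_aided = 0.0
for n in range(n_samples):
    t = 2.0 * math.pi * n / n_samples
    noisy = cmath.exp(1j * theta[n])             # common oscillator phase noise
    ref = noisy * cmath.exp(1j * ref_bin * t)
    data = noisy * cmath.exp(1j * data_bin * t)
    ideal = cmath.exp(1j * (data_bin - ref_bin) * t)
    # Down-conversion with a clean tone leaves the phase noise in place.
    plain = data * cmath.exp(-1j * ref_bin * t)
    # Multiplying by the conjugated received reference cancels the common phase term.
    ref_aided = data * ref.conjugate()
    worst_error_plain = max(worst_error_plain, abs(plain - ideal))
    worst_error_ref_aided = max(worst_error_ref_aided, abs(ref_aided - ideal))
```

In this model the reference-aided product matches the ideal difference-frequency tone to numerical precision, while the output of the clean-tone down-conversion retains the full phase-noise distortion.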

[0080] The following examples illustrate the various embodiments of the present invention. Those skilled in the art will recognize many variations that are within the spirit of the present invention and scope of the claims.

[0081] 1. CONTINUOUS ANALOG CHANNEL ESTIMATION AIDED BEAMFORMING FOR MASSIVE MIMO SYSTEMS

[0082] I. Introduction

[0083] In this embodiment, a more generalized ACE approach for RX beamforming, called continuous ACE (CACE), is explored that does not require carrier recovery at the RX, mitigates oscillator phase noise and works in multi-path channels. The latter is accomplished by exploiting not only the phase of the carrier signal at the RX but also its amplitude. In CACE, a reference tone, i.e., a sinusoidal tone at a known frequency, is continuously transmitted along with the data by the TX, as illustrated in Fig. 2. At the RX, the received signal at each antenna is converted to baseband by a bank of mixers and a local oscillator that is tuned (approximately) to the reference frequency, as illustrated in Fig. 2. The in-phase (I) and quadrature-phase (Q) components of the resulting baseband signal at each antenna are low-pass filtered to extract the received signals corresponding to the reference, as illustrated in Fig. 1. These filtered outputs are then used to feed variable gain, baseband analog phase-shifters, which generate the RX analog beam. The unfiltered baseband received signals at each antenna are processed by these phase-shifters, added and fed to a single ADC for demodulation. As shall be shown, this process emulates using the received signal for the reference as a matched filter for the received data signals, and it achieves a large RX beamforming gain in sparse, wide-band massive MIMO channels. This is because, while the reference and data signals may have different frequencies and thus experience different channel responses, such channel responses exhibit a strong coupling across frequency. Furthermore, since oscillator phase-noise affects both the reference and data similarly, the matched filtering helps mitigate the phase-noise from the demodulation outputs. Unlike conventional analog beamforming, CACE aided beamforming also improves diversity against MPC blocking by combining the received signal power from multiple channel MPCs. Finally, no dedicated pilot symbols are required to update the RX analog beam, unlike with digital CE. The phase shifts from the receiver circuit can also be utilized for transmit beamforming on the reverse link. Furthermore, by providing an option for digitally controlling these phase-shifter inputs, the proposed architecture can also support conventional RX beamforming approaches when required. On the flip side, CACE may require additional analog hardware in comparison to conventional digital CE, including $2M_{\mathrm{rx}}$ mixers and low-pass filters. Additionally, the accumulation of power from multiple MPCs, while improving diversity, may cause performance degradation in frequency selective channels, as shall be shown. Finally, the proposed approach in its suggested form does not support reception of multiple spatial data streams and can only be used for beamforming at one end of a communication link. This architecture is therefore more suitable for use at the user equipment (UEs). The possible extensions to multiple spatial stream reception shall be explored in future work. A different ACE technique that does not require continuous transmission of the reference, called PACE, is also described herein. While PACE prevents wastage of transmit resources on the reference, it suffers from an exponential degradation of performance due to phase-noise, unlike CACE. A third type of ACE technique, called MA-FSR, that is non-coherent and uses square law components at the RX was explored by us in [52]. While being resilient to phase-noise like CACE and having a low hardware cost, MA-FSR is a non-coherent technique that suffers from a poor bandwidth efficiency of 50%. It should be emphasized that CACE, PACE and MA-FSR are three different ACE schemes to help reduce CE overhead in massive MIMO systems, each having its unique advantages and RX architecture, and each requiring separate performance analysis techniques. The detailed analysis presented in this paper for CACE, in combination with the analysis of PACE and MA-FSR (also described herein), shall aid in a detailed comparison of these schemes for a specific application. The contributions of the present embodiment include:

1. A novel transmission technique called CACE and a corresponding RX architecture are proposed that enable RX beamforming without dedicated pilot symbol transmissions.

2. The achievable system throughput with CACE aided beamforming in a wide-band channel is analytically characterized.

3. In the process, the impact of the system phase-noise and the ability of CACE to suppress it are also analyzed.

4. Simulations under practically relevant channel models are presented to support the analytical results.

[0084] Notation: scalars are represented by light-case letters; vectors by bold-case letters; and sets by calligraphic letters. Additionally, $j = \sqrt{-1}$, $a^*$ is the complex conjugate of a complex scalar $a$, $\|\mathbf{a}\|$ represents the $\ell_2$-norm of a vector $\mathbf{a}$, $\mathbf{A}^T$ is the transpose of a matrix $\mathbf{A}$ and $\mathbf{A}^\dagger$ is the conjugate transpose of a complex matrix $\mathbf{A}$. Finally, $\mathbb{I}_a$ is an $a \times a$ identity matrix, $\mathbb{O}_{a,b}$ is the $a \times b$ all zeros matrix, $\mathbb{E}\{\}$ represents the expectation operator, $\overset{d}{=}$ represents equality in distribution, $\mathrm{Re}\{\}$/$\mathrm{Im}\{\}$ refer to the real/imaginary component, respectively, and $\mathcal{CN}(\mathbf{a}, \mathbf{B})$ represents a circularly symmetric complex Gaussian vector with mean $\mathbf{a}$ and covariance matrix $\mathbf{B}$.

[0085] II. General Assumptions and System Model

[0086] We consider a single cell system in downlink, where an $M_{tx}$ antenna base-station (BS) transmits data to multiple UEs simultaneously via spatial multiplexing. Since we mainly focus on the downlink, we shall use the terms BS/TX and UE/RX interchangeably. Each UE is assumed to have a hybrid architecture, with $M_{rx}$ antennas and one down-conversion chain, and it performs CACE aided RX beamforming. On the other hand, the BS may have an arbitrary architecture and it transmits a single spatial data-stream to each scheduled UE. For convenience we consider the use of noise-less and perfectly linear antennas, filters, amplifiers and mixers at both the BS and the representative UE. We assume the downlink BS-UE communication to be divided into three stages: (i) Initial Access (IA), (ii) TX beamformer design, where the TX acquires rCSI for all the UEs and uses it to perform UE scheduling, TX beamforming and power allocation, and (iii) Data transmission, wherein the BS transmits data signals and the scheduled UEs use CACE to adapt the RX beams and receive the data. Through a major portion of this paper, we assume that the IA and TX beamformer design have been performed a priori and shall focus on the data transmission stage. However in Section V, we shall also discuss how CACE beamforming can help in stages (i) and (ii).

[0087] In stage (iii), we assume the BS to transmit spatially orthogonal signals to the scheduled UEs to mitigate inter-user interference. This can be achieved, for example, by careful UE scheduling and/or via avoiding transmission to common channel scatterers (A. Adhikary, E. Al Safadi, M. K. Samimi, R. Wang, G. Caire, T. S. Rappaport, and A. F. Molisch, "Joint spatial division and multiplexing for mm-wave channels," IEEE Journal on Selected Areas in Communications, vol. 32, pp. 1239-1255, June 2014). For this system model and for a given TX beamformer and power allocation, we shall restrict the analysis to a single representative UE without loss of generality.
The BS is assumed to transmit orthogonal frequency division multiplexing (OFDM) symbols to the representative UE, with $K = K_1 + K_2 + 1$ sub-carriers indexed as $\mathcal{K} = \{-K_1, \ldots, K_2\}$. The 0-th sub-carrier is used as a reference tone, i.e., a pure sinusoidal signal with a pre-determined frequency, while data is transmitted on the $K_1 - g$ lower and $K_2 - g$ higher sub-carriers, represented by the index set $\{-K_1, \ldots, -g-1, g+1, \ldots, K_2\}$. The $2g$ sub-carriers indexed as $\{-g, \ldots, -1, 1, \ldots, g\}$ are blanked to act as a guard band between the reference and data sub-carriers as illustrated in Fig. 6A, with $g$ being a design parameter. Since the BS can afford an accurate oscillator, by ignoring its phase-noise, the complex equivalent transmit signal $\mathbf{s}_{tx}(t)$ for the 0-th OFDM symbol of stage (iii) can then be expressed as in (1.1) for $-T_{cp} \le t < T_s$, where $\mathbf{t}$ is the $M_{tx} \times 1$ unit-norm TX beamforming vector for this UE (designed a priori in stage (ii)), $E^{(r)}$ is the energy-per-symbol allocated to the reference tone, $\mathcal{G} = \{-g, \ldots, g\}$ defines the non-data sub-carriers, $x_k$ is the data signal on the $k$-th OFDM sub-carrier, $f_c$ is the reference frequency, $f_k = k/T_s$ represents the frequency offset of the $k$-th sub-carrier and $T_s$, $T_{cp}$ are the symbol duration and the cyclic prefix duration, respectively. Here we define the complex equivalent signal such that the actual (real) transmit signal is given by $\mathrm{Re}\{\mathbf{s}_{tx}(t)\}$. For the data sub-carriers ($k \in \mathcal{K} \setminus \mathcal{G}$), we assume the use of independent data streams with equal power allocation, and circularly symmetric Gaussian signaling, i.e., $x_k \sim \mathcal{CN}(0, E^{(d)})$. The transmit power constraint is then given by $E^{(r)} + (K - |\mathcal{G}|) E^{(d)} \le E_s$, where $E_s$ is the total OFDM symbol energy (excluding the cyclic prefix). The channel to the representative UE is assumed to have $L$ MPCs, with the $M_{rx} \times M_{tx}$ channel impulse response matrix and its Fourier transform, respectively, given as in (1.2) (M. Akdeniz, Y. Liu, M. Samimi, S. Sun, S. Rangan, T. Rappaport, and E. Erkip, "Millimeter wave channel modeling and cellular capacity evaluation," IEEE Journal on Selected Areas in Communications, vol. 32, pp. 1164-1179, June 2014), where $\alpha_\ell$ is the complex amplitude and $\tau_\ell$ is the delay, and $\mathbf{a}_{tx}(\ell)$, $\mathbf{a}_{rx}(\ell)$ are the $M_{tx} \times 1$ TX and $M_{rx} \times 1$ RX array response vectors, respectively, of the $\ell$-th MPC.

As an illustration, the $\ell$-th RX array response vector for a uniform planar array with $M_H$ horizontal and $M_V$ vertical elements ($M_{rx} = M_H M_V$) is given by (1.3) for each horizontal index $h \in \{0, \ldots, M_H - 1\}$ and vertical index $v$, in terms of the azimuth and elevation angles of arrival for the $\ell$-th MPC, the horizontal and vertical antenna spacings $\Delta_H$, $\Delta_V$, and the carrier wavelength $\lambda$. Expressions for $\mathbf{a}_{tx}(\ell)$ can be obtained similarly. Note that in (1.2) we implicitly ignore the frequency variation of individual MPC amplitudes $\{\alpha_0, \ldots, \alpha_{L-1}\}$ and the beam squinting effects (S. K. Garakoui, E. A. M. Klumperink, B. Nauta, and F. E. van Vliet, "Phased-array antenna beam squinting related to frequency dependency of delay circuits," in European Microwave Conference, pp. 1304-1307, Oct 2011), which are reasonable assumptions for moderate system bandwidths. It is emphasized that the complete channel response (including all MPCs) however still experiences frequency selective fading. To prevent inter-symbol interference, we let the cyclic prefix be longer than the maximum channel delay: $T_{cp} > \tau_{L-1}$. We also consider a generic temporal variation model, where the time for which the MPC parameters $\{\alpha_\ell, \mathbf{a}_{tx}(\ell), \mathbf{a}_{rx}(\ell), \tau_\ell\}$ stay approximately constant is much larger than the symbol duration $T_s$. Finally, we do not assume any distribution prior or side information on $\{\alpha_\ell, \mathbf{a}_{tx}(\ell), \mathbf{a}_{rx}(\ell), \tau_\ell\}$.

Each RX antenna front-end is assumed to have a low noise amplifier (LNA) followed by a band-pass filter (BPF) that leaves the desired signal un-distorted but suppresses the out-of-band noise. The filtered signal is then converted to baseband using the in-phase and quadrature-phase components of an RX local oscillator, as depicted in Fig. 2. This oscillator is assumed to be independently generated at the RX (i.e., without locking to the received reference).
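As a minimal sketch of the sub-carrier layout described in this section, the following Python snippet builds the index sets for the reference tone, the guard band and the data sub-carriers, and splits the symbol energy equally over the data sub-carriers. The function names and the example values are hypothetical and chosen only for illustration; the energy split additionally assumes that the total symbol energy $E_s$ is fully used.

```python
from typing import List, Tuple


def subcarrier_layout(k1: int, k2: int, g: int) -> Tuple[List[int], List[int], List[int]]:
    """Return (all indices, non-data indices, data indices).

    k1, k2: number of sub-carriers below / above the reference tone.
    g: guard half-width, so 2*g sub-carriers next to the reference are blanked.
    """
    if not 0 <= g <= min(k1, k2):
        raise ValueError("g must satisfy 0 <= g <= min(k1, k2)")
    all_idx = list(range(-k1, k2 + 1))   # K = k1 + k2 + 1 sub-carriers
    non_data = list(range(-g, g + 1))    # reference tone (index 0) plus the guard band
    data = [k for k in all_idx if abs(k) > g]
    return all_idx, non_data, data


def data_energy(total_energy: float, ref_energy: float, num_data: int) -> float:
    """Per-sub-carrier data energy for an equal split of what the reference leaves over.

    Assumption: the total symbol energy is fully used (constraint met with equality).
    """
    if not 0.0 <= ref_energy <= total_energy:
        raise ValueError("reference energy must lie in [0, total_energy]")
    return (total_energy - ref_energy) / num_data


# Hypothetical example values, chosen only for illustration.
all_idx, non_data, data = subcarrier_layout(k1=255, k2=255, g=10)
e_d = data_energy(total_energy=1.0, ref_energy=0.1, num_data=len(data))
overhead = len(non_data) / len(all_idx)  # fraction of sub-carriers carrying no data
```

With these example values, 21 of the 511 sub-carriers carry no data, which is the kind of $|\mathcal{G}|/K$ ratio that determines the spectral cost of the continuous reference.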

While we model the RX oscillator to suffer from phase-noise, for ease of theoretical analysis we assume the mean RX oscillator frequency to be equal to the reference frequency $f_c$. This assumption shall be relaxed later in the simulation results in Section VI. Then, from (1.1)-(1.2), the received baseband signal $\mathbf{s}_{rx,BB}(t)$ for the 0-th OFDM symbol can be expressed as in (1.4) for $0 \le t < T_s$, where the Re/Im parts of $\mathbf{s}_{rx,BB}(t)$ are the outputs corresponding to the in-phase and quadrature-phase components of the RX oscillator, $\theta(t)$ is the phase-noise process of the RX oscillator and $\mathbf{w}(t)$ is the $M_{rx} \times 1$ complex equivalent, baseband, stationary, additive, vector Gaussian noise process, with individual entries being circularly symmetric, independent and identically distributed (i.i.d.), and having a power spectral density $S_w(f) = N_0$ for $-f_{K_1} \le f \le f_{K_2}$. Note that while (1.4) is obtained by assuming no TX phase-noise, the results can be generalized under some mild constraints by treating the TX phase-noise as a part of $\theta(t)$ (P. Mathecken, T. Riihonen, S. Werner, and R. Wichman, "Performance analysis of OFDM with Wiener phase noise and frequency selective fading channel," IEEE Transactions on Communications, vol. 59, pp. 1321-1331, May 2011). We model the RX phase-noise $\theta(t)$ as a Wiener process, which is representative of a free running oscillator (L. Piazzo and P. Mandarini, "Analysis of phase noise effects in OFDM modems," IEEE Transactions on Communications, vol. 50, pp. 1696-1705, Oct 2002; S. Wu, P. Liu, and Y. Bar-Ness, "Phase noise estimation and mitigation for OFDM systems," IEEE Transactions on Wireless Communications, vol. 5, pp. 3616-3625, December 2006; D. Petrovic, W. Rave, and G. Fettweis, "Effects of phase noise on OFDM systems with and without PLL: Characterization and compensation," IEEE Transactions on Communications, vol. 55, pp. 1607-1616, Aug 2007). In Appendix 1C, we also discuss how the results can be extended to phase-noise modeled as an Ornstein-Uhlenbeck (OU) process, which is representative of an oscillator either locked to the received reference, or synthesized from a stable low frequency source (D. Petrovic, W. Rave, and G. Fettweis, "Effects of phase noise on OFDM systems with and without PLL: Characterization and compensation," IEEE Transactions on Communications, vol. 55, pp. 1607-1616, Aug 2007; A. Mehrotra, "Noise analysis of phase-locked loops," IEEE Transactions on Circuits and Systems I: Fundamental Theory and Applications, vol. 49, pp. 1309-1316, Sep 2002).

For the Wiener model, $\theta(t)$ is a non-stationary Gaussian process which satisfies $\frac{d\theta(t)}{dt} = w_\theta(t)$, where $w_\theta(t)$ is a real white Gaussian process with variance $\sigma_\theta^2$. We assume the RX to have a priori knowledge of $\sigma_\theta$. As illustrated in Fig. 2, the baseband signal $s_{rx,BB,m}(t)$ at each RX antenna $m$ is passed through a low-pass filter to isolate the received reference signal. For convenience, this filter $\mathrm{LPF}_{\hat{g}}$ is assumed to be an ideal rectangular filter having a cut-off frequency of $f_{\hat{g}}$, where $\hat{g} \le g/2$. Neglecting the contribution of the data sub-carriers to the filtered outputs (which is accurate for low phase-noise, i.e., $\sigma_\theta^2 \ll g/T_s$), these outputs $\hat{\mathbf{s}}_{rx,BB}(t)$ can be expressed for $0 \le t < T_s$ as in (1.6), where we define $\hat{\Omega}(t) \triangleq \mathrm{LPF}_{\hat{g}}\{e^{-j\theta(t)}\}$ and $\hat{\mathbf{w}}(t)$ is the $M_{rx} \times 1$ filtered Gaussian noise with power spectral density $S_{\hat{w}}(f) = N_0$ for $-f_{\hat{g}} \le f \le f_{\hat{g}}$. An illustration of this filtering operation is provided in Fig. 6B. These filtered signals $\hat{\mathbf{s}}_{rx,BB}(t)$ are used as the control signals to a variable gain phase-shifter array, through which the baseband received signal vector $\mathbf{s}_{rx,BB}(t)$ is processed, as shown in Fig. 2. We assume the filter cut-off frequency $\hat{g}/T_s$ to be small enough to allow the phase-shifters to respond to the slowly time varying control signals $\hat{\mathbf{s}}_{rx,BB}(t)$. A more detailed discussion about $\hat{g}$ is considered in Sections IV and VI. The phase-shifted outputs are then summed up and fed to an ADC that samples at $K/T_s$ samples/sec. This sampled output for the 0-th OFDM symbol can be expressed as:

$y[n] = \hat{\mathbf{s}}_{rx,BB}(nT_s/K)^\dagger \, \mathbf{s}_{rx,BB}(nT_s/K)$ (1.7)

where we define $\hat{\mathbf{w}}[n] \triangleq \hat{\mathbf{w}}(nT_s/K)$ and $\mathbf{w}[n] \triangleq \mathbf{w}(nT_s/K)$.
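The combining step in (1.7) can be illustrated with a small, idealised Python sketch. It assumes a single MPC, no channel noise, one data sub-carrier, and a low-pass filter that returns the reference tone together with the full phase-noise rotation; none of these simplifications come from the text, and the function names are hypothetical. Under these assumptions the common phase-noise term cancels in the product of the conjugated reference and the received signal, and the antenna contributions add coherently.

```python
import cmath
import math
import random
from typing import List


def cace_combine(array_resp: List[complex], ref_amp: float, data_sym: complex,
                 k: int, K: int, phase_noise: List[float]) -> List[complex]:
    """Return y[n] = sum_m conj(ref_m[n]) * rx_m[n] for n = 0..K-1 (idealised)."""
    y = []
    for n in range(K):
        rot = cmath.exp(-1j * phase_noise[n])            # common RX oscillator rotation
        tones = ref_amp + data_sym * cmath.exp(2j * math.pi * k * n / K)
        rx = [a * rot * tones for a in array_resp]       # baseband signal per antenna
        ref = [a * rot * ref_amp for a in array_resp]    # idealised low-pass filter output
        y.append(sum(r.conjugate() * s for r, s in zip(ref, rx)))
    return y


def demodulate(y: List[complex], k: int) -> complex:
    """Sub-carrier k output using a DFT normalised by K."""
    K = len(y)
    return sum(y[n] * cmath.exp(-2j * math.pi * k * n / K) for n in range(K)) / K


rng = random.Random(0)
M, K, k = 8, 64, 5                                        # hypothetical sizes
array_resp = [cmath.exp(1j * 0.7 * m) for m in range(M)]  # unit-modulus responses
theta = [rng.gauss(0.0, 1.0) for _ in range(K)]           # arbitrary phase-noise samples
y = cace_combine(array_resp, 2.0, 1 - 1j, k, K, theta)
Y_k = demodulate(y, k)  # equals M * ref_amp * data_sym up to rounding
```

In this idealised setting the demodulated output does not depend on the phase-noise samples at all, and it grows linearly with the number of antennas. The analysis in the following sections quantifies how a realistic filter, channel noise and multiple MPCs degrade this behaviour.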

Conventional OFDM demodulation follows on $y[n]$ to obtain the different OFDM sub-carrier outputs, as analyzed in Section III.

[0089] III. Analysis of the demodulation outputs

[0090] Without loss of generality, we shall restrict the analysis to the representative 0-th OFDM symbol and thus, we only focus on $\{y[n] \mid 0 \le n < K\}$. Note that the sampled, band-limited additive noise $\mathbf{w}[n]$ and the sampled RX phase-noise $e^{-j\theta[n]}$ for $0 \le n < K$ can be expressed using their normalized Discrete Fourier Transform (nDFT) expansions as:

$\mathbf{w}[n] = \sum_{k \in \mathcal{K}} \mathbf{W}[k] e^{j2\pi kn/K}$ (1.8a)

$e^{-j\theta[n]} = \sum_{k \in \mathcal{K}} \Omega[k] e^{j2\pi kn/K}$ (1.8b)

where $\mathbf{W}[k] = \frac{1}{K}\sum_{n=0}^{K-1} \mathbf{w}[n] e^{-j2\pi kn/K}$ and $\Omega[k] = \frac{1}{K}\sum_{n=0}^{K-1} e^{-j\theta[n]} e^{-j2\pi kn/K}$ are the corresponding nDFT coefficients. Here nDFT is an unorthodox definition for the Discrete Fourier Transform, where the normalization by $K$ is performed while finding $\mathbf{W}[k]$, $\Omega[k]$ instead of in (1.8). These nDFT coefficients are periodic with period $K$ and satisfy the following lemma:

Lemma 3.1. The nDFT coefficients of the phase-noise satisfy (1.9a) and (1.9b) for arbitrary integers $k_1, k_2$, where $\delta^K_{a,b} = 1$ if $a = b \pmod K$ and $\delta^K_{a,b} = 0$ otherwise.

[0091] Proof. See Appendix 1A.
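The nDFT convention and the Wiener phase-noise model can be tried out numerically with the short Python sketch below. It is only an illustration: the helper names, the per-sample increment standard deviation `step_std` and the sizes are hypothetical, and the in-band power helper assumes the low-pass set contains the indices $-\hat{g}, \ldots, \hat{g}$ taken modulo $K$. Because $|e^{-j\theta[n]}| = 1$, the nDFT coefficients always carry unit total power, so the power captured inside the filter band can never exceed 1.

```python
import cmath
import math
import random
from typing import List


def ndft(x: List[complex]) -> List[complex]:
    """Normalised DFT: the 1/K factor is applied in the forward transform."""
    K = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / K) for n in range(K)) / K
            for k in range(K)]


def indft(X: List[complex]) -> List[complex]:
    """Inverse of ndft: a plain sum without any normalisation."""
    K = len(X)
    return [sum(X[k] * cmath.exp(2j * math.pi * k * n / K) for k in range(K))
            for n in range(K)]


def wiener_phase(K: int, step_std: float, rng: random.Random) -> List[float]:
    """Discrete-time Wiener process: cumulative sum of i.i.d. Gaussian increments."""
    theta, out = 0.0, []
    for _ in range(K):
        theta += rng.gauss(0.0, step_std)
        out.append(theta)
    return out


def inband_power(omega: List[complex], g_hat: int) -> float:
    """Sum of |Omega[k]|^2 over k = -g_hat..g_hat (indices taken modulo K)."""
    K = len(omega)
    return sum(abs(omega[k % K]) ** 2 for k in range(-g_hat, g_hat + 1))


rng = random.Random(1)
theta = wiener_phase(K=128, step_std=0.05, rng=rng)   # hypothetical parameters
samples = [cmath.exp(-1j * t) for t in theta]
omega = ndft(samples)
total_power = sum(abs(c) ** 2 for c in omega)          # equals 1 up to rounding
captured = inband_power(omega, g_hat=4)                # fraction kept by the filter band
```

Increasing `g_hat` captures more of the phase-noise power, which is the mechanism behind the filter-bandwidth trade-off discussed later in the text.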

[0092] To test the accuracy of the approximation in Lemma 3.1, the Monte-Carlo simulations of $\Delta_{k,k}$, $\Delta_{k,k+1}$ and $\Delta_{k,k+100}$ for a typical phase-noise process (-93 dBc/Hz at 10 MHz offset) are compared to (1.9b) in Fig. 7. As is evident from the results, (1.9b) is accurate for $k_1 = k_2$. Similarly, the simulated $\Delta_{k,k+1}$, $\Delta_{k,k+100}$ values are approximately 20 dB lower than $\Delta_{k,k}$ for all $k$, i.e., approximately 0 as in (1.9b). The analogous version of Lemma 3.1 for phase-noise modeled as an OU process is presented in Appendix 1C. In a similar way, for nDFT coefficients of the channel noise we have:

Lemma 3.2. The nDFT coefficients of $\mathbf{w}[n]$, i.e., $\{\mathbf{W}[k] \mid \forall k\}$, are jointly Gaussian and satisfy (1.10) for arbitrary integers $k_1, k_2$, where $\delta^K_{a,b} = 1$ if $a = b \pmod K$ and $\delta^K_{a,b} = 0$ otherwise.

Proof. See Appendix 1B.

Note that using these nDFT coefficients, the low-pass filtered versions of $\mathbf{w}[n]$ and $e^{-j\theta[n]}$ in (1.7) can be approximated as:

$\hat{\mathbf{w}}[n] \approx \sum_{k \in \hat{\mathcal{G}}} \mathbf{W}[k] e^{j2\pi kn/K}$ (1.11a)

$\hat{\Omega}[n] \approx \sum_{k \in \hat{\mathcal{G}}} \Omega[k] e^{j2\pi kn/K}$ (1.11b)

where $\hat{\mathcal{G}} = \{-\hat{g}, \ldots, \hat{g}\}$ and the approximations are obtained by replacing the linear convolution of $\mathbf{s}_{rx,BB}(t)$ and the filter response $\mathrm{LPF}_{\hat{g}}\{\}$ with a circular convolution. This is accurate when the filter response has a narrow support, i.e., for $\hat{g} \gg 1$. The remaining results in this paper are based on the approximations in (1.9)-(1.11) and on an additional approximation discussed later in Remark 3.1. While we still use the $\le$, $=$, $\ge$ operators in the following results for convenience of notation, it is emphasized that these equations are true in the strict sense only if the approximations in (1.9)-(1.11) and Remark 3.1 are met with equality. However simulation results are also used in Section VI to test the validity of these approximations. Substituting (1.8) and (1.11) into (1.7), the $k$-th OFDM demodulation output can be expressed as:

[Equation (1.12): expression for the $k$-th demodulation output $Y_k$]

[0095] We shall split $Y_k$ as $Y_k = S_k + I_k + Z_k$, where $S_k$, referred to as the signal component, involves the terms in (1.12) containing $x_k$ and not containing the channel noise; $I_k$, referred to as the interference component, involves the terms containing $E^{(r)}$, $\{x_{\bar{k}} \mid \bar{k} \in \mathcal{K} \setminus [\mathcal{G} \cup \{k\}]\}$ and not containing the channel noise; and $Z_k$, referred to as the noise component, contains the remaining terms. These signal, interference and noise components are analyzed in the following subsections.

A. Signal Component Analysis

From (1.12), the signal component for $k \in \mathcal{K} \setminus \mathcal{G}$ can be expressed as in (1.13), where we define $\beta_{k_1,k_2} \triangleq \mathbf{t}^\dagger \mathcal{H}^\dagger(f_{k_1}) \mathcal{H}(f_{k_2}) \mathbf{t} / M_{rx}$. Since $\hat{\mathcal{G}} \subset \mathcal{K}$, note that $\sum_{\bar{k} \in \hat{\mathcal{G}}} |\Omega[\bar{k}]|^2 \le 1$ from Lemma 3.1, i.e., the phase-noise causes some loss in power of the signal component. However this loss is much smaller than in PACE (V. V. Ratnam and A. F. Molisch, "Periodic analog channel estimation aided beamforming for massive MIMO systems," IEEE Transactions on Wireless Communications, 2019, accepted) or digital CE based beamforming, where only $\Omega[0]$ contributes to the signal component unless additional phase-noise compensation is used. As is evident from (1.13), CACE based beamforming utilizes the (filtered) received signal vector corresponding to the reference tone as weights to combine the received signal vector corresponding to the data sub-carriers, i.e., it emulates imperfect maximal ratio combining (MRC). The imperfection is because the reference tone and the $k$-th data stream pass through slightly different channels owing to the difference in their modulating frequencies. However the resulting loss in beamforming gain is expected to be small for sparse channels, i.e., for $L \ll M_{tx}, M_{rx}$, as shall be shown in Sections IV and VI. The second moment of the signal component, averaged over the phase-noise and data symbols, can be computed as in (1.14), where we define $\mu(a, \hat{g}) \triangleq \sum_{\bar{k} \in \hat{\mathcal{G}}} \Delta_{a+\bar{k}, a+\bar{k}}$ and (1.14) follows from Jensen's inequality and (1.9b).

[0098] B. Interference Component Analysis

[0099] From (1.12), the interference component for $k \in \mathcal{K} \setminus \mathcal{G}$ can be expressed as in (1.15). As is clear from (1.15), the demodulation output $Y_k$ for $k \in \mathcal{K} \setminus \mathcal{G}$ suffers inter-carrier interference (ICI) from other sub-carrier data streams due to the RX phase-noise. The first and second moments of $I_k$, averaged over the other sub-carrier data ($x_{\bar{k}}$ for $\bar{k} \in \mathcal{K} \setminus [\mathcal{G} \cup \{k\}]$) and the phase-noise, can be expressed as:

[Equations (1.16a)-(1.16b): first and second moments of the interference component $I_k$]

where steps (1) and (2) are obtained using the fact that $\{x_k \mid k \in \mathcal{K}\}$ have a zero-mean and are independently distributed; step (3) is obtained by defining $\beta_{max} \triangleq \max_{k \in \mathcal{K}} |\beta_{k,k}|$, observing $|\beta_{0,k}| \le \beta_{max}$, and using $\mathcal{K} \setminus [\mathcal{G} \cup \{k\}] \subseteq \mathcal{K} \setminus \{k\}$ in the first term and using the Cauchy-Schwarz inequality for the second term; step (4) follows by changing the summation order in the first term and by using (11) for the second term; step (5) follows by using $\Omega[k] = \Omega[k + K]$ and (11) for the first term; and step (6) follows by using (12) and Jensen's inequality. As shall be shown in Section VI, (21) may be a loose bound on the ICI for lower sub-carriers, i.e., $|k| \ll K$.

[0100] Remark 3.1. A tighter approximation for $\mathbb{E}\{|I_k|^2\}$ can be obtained by replacing $\mu(k, g)$ with a modified term $\hat{\mu}(k, g)$.

[0101] This heuristic is obtained by assuming the phase-noise nDFT coefficients appearing in step (2) of (21) to be independently distributed, but we skip the proof for brevity. As shall be verified in Section VI, Remark 3.1 offers a much better ICI approximation for all $k$, and hence we shall use $\hat{\mu}(k, g)$ instead of $\mu(k, g)$ in the forthcoming derivations in Section VI.

[0102] C. Noise Component Analysis

[0103] From (1.12), the noise component for $k \in \mathcal{K} \setminus \mathcal{G}$ can be expressed as the sum of three terms, $Z_k = Z_k^{(1)} + Z_k^{(2)} + Z_k^{(3)}$, as given in (1.17).

[0104] Note that the noise consists of both signal-noise and noise-noise cross product terms. From Lemma 3.2, it can readily be verified that $\mathbb{E}\{Z_k^{(i)} [Z_k^{(j)}]^*\} = 0$ for $i \ne j$, where the expectation is taken over the noise realizations. Thus the second moment of $Z_k$, averaged over the TX data, phase-noise and channel noise, can be expressed as $\mathbb{E}\{|Z_k|^2\} = \mathbb{E}\{|Z_k^{(1)}|^2\} + \mathbb{E}\{|Z_k^{(2)}|^2\} + \mathbb{E}\{|Z_k^{(3)}|^2\}$, where:

[Equation (1.18): second moments of $Z_k^{(1)}$, $Z_k^{(2)}$ and $Z_k^{(3)}$]

where $|\hat{\mathcal{G}}| = 2\hat{g} + 1$, and the marked steps follow, in turn, from Lemma 3.2; from (12); and from Lemma 3.2, (1.9b) and the result on the expectation of the product of four Gaussian random variables (W. Bär and F. Dittrich, "Useful formula for moment computation of normal random variables with nonzero means," IEEE Transactions on Automatic Control, vol. 16, pp. 263-265, Jun 1971). From (1.18), we can then upper-bound the noise power as:

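[Equation (1.19): upper bound on the noise power $\mathbb{E}\{|Z_k|^2\}$]

where we use the fact that $|\beta_{k,k}| \le \beta_{max}$ and that the relevant sums of $\Delta$ terms are at most 1 for $k \in \mathcal{K} \setminus \mathcal{G}$ (as $\hat{g} \le g/2$) from (1.9a).

[0105] IV. Performance Analysis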

[0106] From (1.12)-(1.17), the effective single-input-single-output (SISO) channel between the $k$-th sub-carrier input and corresponding output can be expressed as:

$Y_k = M_{rx} \beta_{0,k} \sqrt{E^{(r)}} \left[ \sum_{\bar{k} \in \hat{\mathcal{G}}} |\Omega[\bar{k}]|^2 \right] x_k + I_k + Z_k$, for $k \in \mathcal{K} \setminus \mathcal{G}$ (1.20)

where $I_k$ and $Z_k$ are analyzed in Sections III-B and III-C, respectively. As is evident from (1.20), the signal component suffers from two kinds of fading: (i) a frequency-selective and channel dependent slow fading component represented by $\beta_{0,k}$, and (ii) a frequency-flat and phase-noise dependent fast fading component, represented by $\sum_{\bar{k} \in \hat{\mathcal{G}}} |\Omega[\bar{k}]|^2$. The estimation of these fading coefficients is discussed later in this section. In this paper, we consider the simple demodulation approach where $x_k$ is estimated only from $Y_k$, and $I_k$, $Z_k$ are treated as noise. For this demodulation approach, a lower bound $\gamma_k^{LB}(\boldsymbol{\beta})$ to the signal-to-interference-plus-noise ratio (SINR) can be obtained from (1.14), (1.16b), Remark 3.1 and (1.19), as given in (1.21), where $\boldsymbol{\beta} \triangleq \{\beta_{k_1,k_2} \mid k_1, k_2 \in \mathcal{K}\}$ and we use the fact that $\mathbb{E}\{|I_k + Z_k|^2\} = \mathbb{E}\{|I_k|^2\} + \mathbb{E}\{|Z_k|^2\}$.

The orthogonality of array response vectors is approximately satisfied if the MPCs are well separated and $M_{rx} \gg L$ (O. El Ayach, R. Heath, S. Abu-Surra, S. Rajagopal, and Z. Pi, "The capacity optimality of beam steering in large millimeter wave MIMO systems," in IEEE International Workshop on Signal Processing Advances in Wireless Communications (SPAWC), pp. 100-104, June 2012). From Remark 4.1, note that even without explicit CE at the RX, $\gamma_k^{LB}(\boldsymbol{\beta})$ scales with $M_{rx}$ in the low SNR regime, which is a desired characteristic. While the ICI term also scales with $M_{rx}$, its contribution can be kept small in the desired SNR range by picking $\hat{g}$ such that $\mu(0, \hat{g}) \approx 1$. In a similar way, with perfect knowledge of the fading coefficients at the RX, an approximate lower bound to the ergodic capacity can be obtained as:

[Equation (1.22): approximate lower bound $C_{approx}(\boldsymbol{\beta})$ on the ergodic capacity]

where step (1) is obtained by assuming $I_k$, $Z_k$ to be Gaussian distributed and using the expression for ergodic capacity (A. Goldsmith and P. Varaiya, "Capacity of fading channels with channel side information," IEEE Transactions on Information Theory, vol. 43, no. 6, pp. 1986-1992, 1997), step (2) follows by sending the outer expectation into the $\log(\cdot)$ functions, and step (3) follows from (1.14), (1.16b) and (1.19). While step (2) is an approximation, it typically yields a lower bound (from (1.9a) and R. Bhatia and C. Davis, "A better bound on the variance," The American Mathematical Monthly, vol. 107, p. 353, Apr 2000).

Note that for demodulating $x_k$ and achieving the above SINR and capacity, the RX requires estimates of $N_0$ and the SISO channel fading coefficients $\boldsymbol{\beta}$ and $\sum_{\bar{k} \in \hat{\mathcal{G}}} |\Omega[\bar{k}]|^2$. Since the RX has a good beamforming gain (1.21), the channel parameters $\boldsymbol{\beta}$, $N_0$ can be tracked accurately at the RX with a low estimation overhead using pilot symbols and blanked symbols. These values, along with the phase-noise parameter $\sigma_\theta$, can further be fed back to the TX for rate and power allocation. Note that since these pilots are only used to estimate the SISO channel parameters and not the actual MIMO channel, the advantages of simplified CE are still applicable for a CACE based RX. On the other hand, the low variance albeit fast varying component $\sum_{\bar{k} \in \hat{\mathcal{G}}} |\Omega[\bar{k}]|^2$ can be estimated for every symbol using the 0-th sub-carrier output $Y_0$. It can be shown from (1.12) that $\mathbb{E}\{|I_0|^2\} \le \mathbb{E}\{|I_k|^2\}$ and $\mathbb{E}\{|Z_0|^2\} \le 2\,\mathbb{E}\{|Z_k|^2\}$ for any $k \in \mathcal{K} \setminus \mathcal{G}$. Thus $\sum_{\bar{k} \in \hat{\mathcal{G}}} |\Omega[\bar{k}]|^2$ can be estimated from $Y_0$ with an SINR that is usually a large value.

[0110] A. Optimizing system parameters

[0111] From (1.22), note that the approximate ergodic capacity $C_{approx}(\boldsymbol{\beta})$ is a decreasing function of $g$ for $g \ge 2\hat{g}$. Thus a $C_{approx}(\boldsymbol{\beta})$ maximizing choice of $g$ should satisfy $g = 2\hat{g}$. From (1.22) and (1.21), we can also lower bound $C_{approx}(\boldsymbol{\beta})$ as:

[Equations (1.23a)-(1.23b): lower bound on $C_{approx}(\boldsymbol{\beta})$]

where step (1) follows by taking the summation over $k$ in (1.22) into the denominator of the logarithm, and step (2) follows by collecting the resulting terms into a single ratio. It can be verified that the numerator of this ratio is a differentiable, strictly concave function of $E^{(r)}$, while the denominator is a positive affine function of $E^{(r)}$. Thus the ratio is a strictly pseudo-concave function of $E^{(r)}$ (S. Schaible, "Fractional programming," Zeitschrift für Operations Research, vol. 27, pp. 39-54, Dec 1983), and the $C_{approx}(\boldsymbol{\beta})$ maximizing power allocation can be obtained by setting its derivative with respect to $E^{(r)}$ to 0, as:

[Equation (1.24): capacity-maximizing reference energy $E^{(r)}$]

As evident from (1.23b), $\hat{g}$ offers a trade-off between the phase-noise induced ICI and channel noise accumulation. While finding a closed form expression for the (1.23b) maximizing $\hat{g}$ is intractable, it can be computed numerically by performing a simple line search over $\hat{g}$, with $E^{(r)}$ given by (1.24).

[0112] V. Initial Access, TX beamforming and uplink beamforming

[0113] In this section we briefly discuss stages (i) and (ii) of downlink transmission (see Section II), and uplink TX beamforming for CACE aided UEs. In the suggested IA protocol for stage (i), the BS performs beam sweeping along different angular directions, possibly with different beam widths, similar to the approach of 3GPP New Radio (NR). For each TX beam, the BS transmits primary (PSS) and secondary synchronization sequences (SSS) with the reference signal, in a form similar to (1.1). The UEs use CACE aided RX beamforming and initiate uplink random access to the BS upon successfully detecting a PSS/SSS. As shall be shown in Section VI, the SINR expression (1.21) is resilient to frequency mismatches between TX and RX oscillators, and thus is also applicable for the PSS/SSSs where frequency synchronization may not exist. Since angular beam-sweeping is only performed at the BS, the IA latency does not scale with $M_{rx}$ and yet the PSS/SSS symbols can exploit the RX beamforming gain, thus improving cell discovery radius and/or reducing IA overhead. This is in contrast to digital CE at the UE, which would require sweeping through many RX beam directions for each TX direction, necessitating several repetitions of the PSS/SSS for each TX beam. During downlink stage (ii), note that scheduling of UEs, designing the TX beamformer and allocation of power require knowledge of the TX-side channel parameters (including $\mathbf{a}_{tx}(\ell)$) for all the UEs. Such rCSI can be acquired at the BS either by downlink CE with CSI feedback from the UEs or by uplink CE. The protocol for downlink CE with feedback is similar to the IA protocol, with the BS transmitting pilot symbols instead of PSS and SSS. Uplink CE can be performed by transmitting orthogonal pilots from the UEs omni-directionally, and using any of the digital CE algorithms from Section I at the BS. Note that CACE cannot be used at the BS since the signals from multiple UEs need to be separated via digital processing. Note that the phase shifts used for RX beamforming at a CACE aided UE in downlink can also be used for transmit beamforming in the uplink. However since the reference signal is not available at the UE during uplink transmission in time division duplexing systems, a mechanism for locking these phase shift values from a previous downlink transmission stage is required (similar to V. V. Ratnam and A. F. Molisch, "Periodic analog channel estimation aided beamforming for massive MIMO systems," IEEE Transactions on Wireless Communications, 2019, accepted). In contrast, frequency division duplexing can avoid such a mechanism due to continuous availability of the downlink reference.

[0115] VI. Simulation Results

[0116] For the simulations, we consider a single cell scenario with a $\lambda/2$-spaced 32 x 8 ($M_{tx} = 256$) antenna BS and one representative UE with a $\lambda/2$-spaced 16 x 4 ($M_{rx} = 64$) antenna array, having perfect timing synchronization to the BS, one down-conversion chain, and using CACE aided beamforming. The BS has a priori rCSI and transmits one spatial OFDM data stream with $T_s$ …, … = 512 and $f_c = 30$ GHz, along the strongest MPC, i.e., $\mathbf{t} = \mathbf{a}_{tx}(\hat{\ell})$ for $\hat{\ell} = \arg\max_\ell |\alpha_\ell|$. The UE oscillator has Wiener phase-noise with variance $\sigma_\theta^2$ known both to the BS and UE. The UE also has perfect knowledge of $\boldsymbol{\beta}$, $N_0$ and $\sum_{\bar{k} \in \hat{\mathcal{G}}} |\Omega[\bar{k}]|^2$.

[0117] For testing the validity of the analytical results, we first consider a sample sparse channel matrix $\mathbf{H}(t)$ with $L = 3$, $\tau_\ell = \{0, 20, 40\}$ ns, azimuth angles of arrival $\{0, \pi/6, -\pi/6\}$ and amplitudes $\alpha_\ell = \{\sqrt{0.6}, \sqrt{0.3}, \sqrt{0.1}\}$. For this model, the symbol error rates (SERs) for the sub-carriers, obtained by Monte-Carlo simulations, are compared to the analytical SERs for a Gaussian channel with SINR given by (1.21) (with/without Remark 3.1) in Fig. 8. For the Monte-Carlo results, we use truncated sinc filters for $\mathrm{LPF}_{\hat{g}}$. As observed from the results and also mentioned in Section III-B, the use of Remark 3.1 in (1.21) provides a tight SINR bound even for small $|k|$. We also observe that the SER for $k = 22$ (close to $g$) is high due to the interference caused by the high power reference signal. However this interference diminishes very quickly with $k$, as evident from the SER for $k = 40$. While the mean RX oscillator frequency was assumed to be perfectly matched to the TX oscillator for the analytical results (see Section II), we also plot in Fig. 8 the case with a 5 MHz frequency mismatch. Results show a negligible degradation in performance, suggesting that the CACE design is resilient to frequency mismatches smaller than the filter bandwidth of $\mathrm{LPF}_{\hat{g}}$. Due to the accuracy of the bounds in Fig. 8 and for computational tractability, we shall henceforth use (1.21) and (1.22) to quantify performance of CACE in the rest of the section. We next plot $C_{approx}(\boldsymbol{\beta})$ from (1.24) as a function of $\hat{g}$ in Fig. 9 with (a) the $C_{approx}(\boldsymbol{\beta})$ maximizing $E^{(r)}$ (obtained by exhaustive search over $0 \le E^{(r)} \le E_s$) and (b) $E^{(r)}$ chosen from (1.24), respectively. As observed from the results, the curves are very close, suggesting the accuracy of the power allocation in (1.24). While the poor system performance at lower $\hat{g}$ is due to phase-noise induced ICI, the poor performance at high $\hat{g}$ is due to noise accumulation and spectral efficiency reduction. We also note that the optimal $\hat{g}$ increases with SNR.
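The sparse-channel argument relies on the RX array response vectors of different MPCs being nearly orthogonal when the array is large. The Python sketch below checks this for the azimuth angles of the sample channel, under a simplifying assumption that is not in the text: only a single half-wavelength-spaced row of the UE array is modeled as a uniform linear array, with a response convention (phase $\pi m \sin\phi$ at element $m$) chosen for illustration. The function names are hypothetical.

```python
import cmath
import math
from typing import List


def ula_response(num_antennas: int, azimuth: float) -> List[complex]:
    """Unit-norm response of a half-wavelength-spaced uniform linear array (assumed model)."""
    scale = 1.0 / math.sqrt(num_antennas)
    return [scale * cmath.exp(1j * math.pi * m * math.sin(azimuth))
            for m in range(num_antennas)]


def correlation(a: List[complex], b: List[complex]) -> float:
    """Magnitude of the inner product between two array responses."""
    return abs(sum(x.conjugate() * y for x, y in zip(a, b)))


angles = [0.0, math.pi / 6, -math.pi / 6]             # azimuth angles of the sample channel
small = [ula_response(3, phi) for phi in angles]      # hypothetical 3-element array
large = [ula_response(16, phi) for phi in angles]     # one 16-element row of the UE array

corr_small = correlation(small[0], small[1])          # about 1/3: clearly not orthogonal
corr_large = correlation(large[0], large[1])          # numerically zero for these angles
```

For these particular angles the 16-element responses happen to be exactly orthogonal under the assumed convention; for general angles the correlation is small but non-zero, decaying as the number of elements grows.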

[0118] Fig. 10A compares the achievable throughput for beamforming with digital CE and different ACE schemes: CACE, PACE (V. V. Ratnam and A. F. Molisch, "Periodic analog channel estimation aided beamforming for massive MIMO systems," IEEE Transactions on Wireless Communications, 2019, accepted) and MA-FSR (V. V. Ratnam and A. F. Molisch, "Multi-antenna FSR receivers: Low complexity, non-coherent, massive antenna receivers," in IEEE Global Communications Conference (GLOBECOM), Dec. 2018), respectively, for the sparse channel defined above. For digital CE, the RX beamformer is aligned with the largest eigenvector of the effective RX correlation matrix $\mathbf{R}_{rx}(\mathbf{t})$ (P. Sudarshan, N. Mehta, A. Molisch, and J. Zhang, "Channel statistics-based RF pre-processing with antenna selection," IEEE Transactions on Wireless Communications, vol. 5, pp. 3501-3511, December 2006), which in turn is either (a) known a priori at BS or (b) estimated by nested array based sampling (P. Pal and P. P. Vaidyanathan, "Nested arrays: A novel approach to array processing with enhanced degrees of freedom," IEEE Transactions on Signal Processing, vol. 58, pp. 4167-4181, Aug 2010). To decouple the loss in beamforming gain due to CE errors from the loss due to phase-noise, we assume $\sigma_\theta = 0$. As is evident from Fig. 10A, PACE and CACE suffer only a < 2 dB beamforming loss in comparison to digital CE in sparse channels and above a threshold SNR. While CACE performs marginally worse than PACE at high SNR due to continuous transmission of the reference, unlike PACE it does not suffer from carrier recovery losses at low SNR. While MA-FSR performs poorly due to low bandwidth efficiency, it requires much simpler hardware than all other schemes. To demonstrate the phase-noise suppressing capability of CACE (and MA-FSR), we also plot the throughput of CACE (with optimal $\hat{g}$) and digital CE, with non-zero $\sigma_\theta$ and without any additional phase-noise mitigation. As is evident from the results, both CACE and MA-FSR aid in mitigating oscillator phase-noise in addition to enabling RX beamforming. To study the impact of more realistic channels and number of MPCs, we also consider a rich scattering stochastic channel in Fig. 10B, having $L = 10$ resolvable MPCs and 10 sub-paths per resolvable MPC. All channel parameters are generated according to the 3GPP TR 38.900 Rel. 14 channel model (UMi NLoS scenario) (TR 38.900, "Study on channel model for frequency spectrum above 6 GHz (release 14)," Tech. Rep. V14.3.1, 3GPP, 2017), with the resolvable MPCs and sub-paths modeled as clusters and rays, respectively. However to model the sub-paths of each MPC as unresolvable, we use an intra-cluster delay spread of 1 ns and an intra-cluster angle spread of $\pi/50$ (for all elevation, azimuth, arrival and departure). As observed, the loss in beamforming gain with $L$ is only slightly higher for ACE schemes than digital CE.

[0119] Note that the throughputs in Fig. 10 do not include the CE overhead for PACE and digital CE. While nested array digital CE requires 21 dedicated pilot symbols (a number that scales with $M_{rx}$) for updating the RX beamformer, PACE requires 6 symbols ($O(1)$) and CACE, MA-FSR only require a continuous reference tone. The corresponding overhead reduction can be significant when downlink CE with CSI feedback is used for rCSI acquisition at the BS (see Section V). For example, with exhaustive beam-scanning (C. Jeong, J. Park, and H. Yu, "Random access in millimeter-wave beamforming cellular networks: issues and approaches," IEEE Communications Magazine, vol. 53, pp. 180-185, January 2015) at the BS and an rCSI coherence time of 10 ms, the BS rCSI acquisition overhead reduces from 40% for nested array digital CE to 11% for PACE and approximately $|\mathcal{G}|/K < 5\%$ for CACE.

VII. Conclusions

[0121] This paper proposes the use of a novel CE technique called CACE for designing the RX beamformer in massive MIMO systems. In CACE, a reference tone is transmitted along with the data signals. At each RX antenna, the received signal is converted to baseband, the reference component is isolated, and it is used to control the analog phase-shifter through which the data component is processed. The resulting baseband phase-shifted signals from all the antennas are then added and fed to the down-conversion chain. This emulates using the received signal for the reference as a matched filter for the data, and enables both RX beamforming and phase-noise cancellation. The performance analysis suggests that in sparse channels and for g ≫ 1, the SINR with CACE scales linearly with $M_{rx}$. The analysis and simulations also show that g yields a tradeoff between phase-noise induced ICI and noise accumulation. Simulations suggest that CACE suffers only a small degradation in beamforming gain in comparison to digital CE based beamforming in sparse channels, and is resilient to TX-RX oscillator frequency mismatch. In comparison to other ACE schemes, CACE performs marginally worse than PACE at high SNR but performs much better at lower SNR. It also performs much better than MA-FSR, albeit at a higher RX hardware complexity. Finally, CACE also provides phase-noise suppression, unlike most other CE schemes. The CE overhead reduction with CACE is significant, especially when downlink CE with feedback is required. The IA latency reduction with CACE aided beamforming is also discussed. While baseband phase shifters are sufficient for a CACE based RX, unlike in conventional analog beamforming, $2M_{rx}$ mixers may be required for the baseband conversion at the RX, thus adding to the hardware cost.

[0122] APPENDIX 1A

Proof of Lemma \ref{Lemma_PN_properties}. Note that from the definition of $\Omega[k]$, we have $e^{-j\theta[n]} \xleftrightarrow{\mathcal{F}} \Omega[k]$ and $e^{j\theta[n]} \xleftrightarrow{\mathcal{F}} \Omega^*[-k]$, where $\mathcal{F}$ represents the nDFT operation.
Then using convolution property of the nDFT, we have:

$\ldots \sum_{a} \Omega[a]\,\Omega^*[a+k]$, which proves property (1.9a). Property (1.9b) can be obtained as follows:

where $\overset{(1)}{=}$ follows by using the expression for the characteristic function of the Gaussian random variable $\theta[n]$; $\overset{(2)}{=}$ follows by defining $u = \dot{n} - n$; and $\overset{(3)}{\approx}$ follows by changing the inner

Proof of Lemma \ref{Lemma_N_properties}. Note that each component of $\tilde{\mathbf{w}}(t)$ is independent and identically distributed as a circularly symmetric Gaussian random process. Hence its nDFT coefficients, obtained as $W[k] = \frac{1}{K}\sum_{n=0}^{K-1} \tilde{w}(nT_s/K)\, e^{-j2\pi kn/K}$, are also jointly Gaussian and circularly symmetric. For these coefficients at RX antennas a, b we obtain:

$\ldots = \delta_{a,b}\,\delta_{k_1,k_2}\, N_0/T_s$, where we use the auto-correlation function of the channel noise at any RX antenna as: $R_w(t) = N_0 \sin(\pi K t/T_s)\exp\{-j\pi(K_1 - K_2)t/T_s\}/(\pi t)$.

[0126] APPENDIX 1C

[0127] Here we model the RX phase-noise $\theta(t)$ as a zero mean Ornstein-Uhlenbeck (OU) process (J. L. Doob, “The Brownian movement and stochastic equations,” The Annals of Mathematics, vol. 43, p. 351, Apr 1942), which is representative of the output of a type-1 phase-locked loop with a linear phase detector (A. Viterbi, Principles of coherent communication. McGraw-Hill series in systems science, McGraw-Hill, 1966; D. Petrovic, W. Rave, and G. Fettweis, “Effects of phase noise on OFDM systems with and without PLL: Characterization and compensation,” IEEE Transactions on Communications, vol. 55, pp. 1607-1616, Aug 2007; A. Mehrotra, “Noise analysis of phase-locked loops,” IEEE Transactions on Circuits and Systems I: Fundamental Theory and Applications, vol. 49, pp. 1309-1316, Sep 2002). For such a model, $\theta(t)$ satisfies:

where $w_\theta(t)$ is a standard real white Gaussian process, and $\lambda_\theta$, $\sigma_\theta$ are system parameters. From (6) it can be shown that $\theta(t)$ is a stationary Gaussian process (in steady state), with an auto-correlation function $R_\theta(\tau) = \mathbb{E}\{\theta(t)\theta(t+\tau)\}$ (D. Petrovic, W. Rave, and G. Fettweis, “Effects of phase noise on OFDM systems with and without PLL: Characterization and compensation,” IEEE Transactions on Communications, vol. 55, pp. 1607-1616, Aug 2007).

Lemma 10.1 For phase-noise modeled as an OU process we have:

for arbitrary integers $k_1, k_2$, where $\delta^{K}_{a,b} = 1$ if $a \equiv b \pmod{K}$ or $= 0$ otherwise, and

[0129] Proof of Lemma \ref{Lemma_OU_PN_properties}. Note that from the definition of $\Omega[k]$, we have $e^{-j\theta[n]} \xleftrightarrow{\mathcal{F}} \Omega[k]$ and $e^{j\theta[n]} \xleftrightarrow{\mathcal{F}} \Omega^*[-k]$, where $\mathcal{F}$ represents the nDFT operation. Then using the convolution property of the nDFT, we have:

$\ldots \sum_{a} \Omega[a]\,\Omega^*[a+k]$, which proves property (1.9a). Property (1.9b) can be obtained as follows:

where $\overset{(1)}{=}$ follows by using the expression for the characteristic function of the Gaussian random variable $\theta[\dot{n}] - \theta[n]$; $\overset{(2)}{=}$ follows by defining $u = \dot{n} - n$; and $\overset{(3)}{\approx}$ follows by using the fact that $R_\theta[u]$ has a limited support around $u = 0$ and hence $R_\theta[u] \approx R_\theta[u - K] \approx 0$ for $u > (K-1)/2$. Note that since $R_\theta[u]$ is an auto-correlation function, its nDFT is non-negative, thus ensuring that $\Delta_{k_1,k_1} \ge 0$ in (1.29).
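The OU phase-noise model of this appendix can be checked numerically. The following Python sketch is illustrative only: the parameterization $d\theta = -\lambda\theta\,dt + \sigma\,dW$, the names `lam` and `sigma`, and all numerical values are assumptions, since the defining equation is not legible in this text. Under that assumption the stationary auto-correlation is $\frac{\sigma^2}{2\lambda}e^{-\lambda|\tau|}$. The sketch draws a stationary path with an exact one-step update, compares the sample auto-correlation with this expression, and compares the sample mean of $e^{-j\theta}$ with the Gaussian characteristic-function value $e^{-\mathrm{Var}/2}$ used in the proof.

```python
import cmath
import math
import random


def simulate_ou(lam, sigma, dt, n, rng):
    """Sample a stationary OU path theta[0..n-1] with time step dt.

    Assumed model (not taken from the source text):
        d(theta) = -lam * theta * dt + sigma * dW,
    whose stationary variance is sigma**2 / (2 * lam).
    """
    var = sigma ** 2 / (2.0 * lam)
    rho = math.exp(-lam * dt)                     # one-step correlation
    step_std = math.sqrt(var * (1.0 - rho ** 2))  # exact discretisation
    theta = [rng.gauss(0.0, math.sqrt(var))]      # start in steady state
    for _ in range(n - 1):
        theta.append(rho * theta[-1] + step_std * rng.gauss(0.0, 1.0))
    return theta


def autocorr(x, lag):
    """Sample estimate of E{x[n] x[n+lag]} for a zero-mean sequence."""
    m = len(x) - lag
    return sum(x[i] * x[i + lag] for i in range(m)) / m


rng = random.Random(0)
lam, sigma, dt = 2.0, 1.0, 0.01          # hypothetical values
theta = simulate_ou(lam, sigma, dt, 200_000, rng)

var_theory = sigma ** 2 / (2.0 * lam)
var_sim = autocorr(theta, 0)

lag = 50                                  # tau = lag * dt = 0.5
r_theory = var_theory * math.exp(-lam * lag * dt)
r_sim = autocorr(theta, lag)

# Zero-mean Gaussian theta: E{exp(-j*theta)} = exp(-Var{theta} / 2)
mean_phasor = sum(cmath.exp(-1j * t) for t in theta) / len(theta)
phasor_theory = math.exp(-var_theory / 2.0)
```

With a long enough path, `var_sim`, `r_sim` and `mean_phasor` should sit close to their theoretical counterparts; the agreement degrades if the path covers only a few correlation times $1/\lambda$.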

[0130] 2. PERIODIC ANALOG CHANNEL ESTIMATION AIDED BEAMFORMING FOR MASSIVE MIMO SYSTEMS

[0131] I. Introduction

[0132] In the present embodiment, a novel ACE scheme, referred to as periodic ACE (PACE), is provided. In this embodiment, the reference is transmitted judiciously, and its amplitude and phase are explicitly estimated to drive an RX phase shifter array, in contrast to CACE. PACE requires one carrier recovery circuit and $M_{rx}$ phase shifters (see Fig. 3) and can support both homodyne and heterodyne reception. In PACE, the TX transmits a reference tone at a known frequency during each periodic RX beamformer update phase. One carrier recovery circuit, involving phase-locked loops (PLLs), is used to recover the reference tone from one or more antennas, as shown in Fig. 3. This recovered reference tone and its quadrature component are then used to estimate the phase offset and amplitude of the received reference tone at each RX antenna, via a bank of 'filter, sample and hold' circuits (represented as integrators in Fig. 1). As shall be shown, these estimates are proportional to the channel response at the reference frequency. These estimates are used to control an array of variable gain phase-shifters, which generate the RX analog beam. During the data transmission phase, the wide-band received data signals pass through these phase-shifters and are summed and processed as in conventional analog beamforming. As the phase and amplitude estimation is done in the analog domain, O(1) pilots are sufficient to update the RX beamformer. Additionally, the power from multiple channel MPCs is accumulated by this approach, increasing the system diversity against MPC blocking. Furthermore, the same variable gain phase-shifts can also be used for transmit beamforming on the reverse link. Finally, by providing an option for digitally controlling the inputs to the phase-shifters, the proposed architecture can also support conventional beamforming approaches.

On the flip side, PACE requires some additional analog hardware components, such as mixers and filters, in comparison to conventional digital CE. Additionally, the accumulation of power from multiple MPCs may cause frequency selective fading in a wide-band scenario, which can degrade performance. Finally, the proposed approach in its current suggested form does not support reception of multiple spatial data streams and can only be used for beamforming at one end of a communication link. This architecture is therefore more suitable for use at the user equipment (UEs). The possible extensions to multiple spatial stream reception shall be explored in future work. While the proposed architecture is also applicable in narrow-band scenarios, in this paper we shall focus on the analysis of a wide-band scenario where the repetition interval of PACE and beamformer update is of the order of the aCSI coherence time, i.e., the time over which the aCSI stays approximately constant (also called stationarity time in some literature). The contributions of the present embodiment include:

1. The development of a novel transmission technique, namely PACE, and a corresponding RX architecture that enable RX analog beamforming with low CE overhead.

2. To enable the RX operation, two novel reference recovery circuits are explored. These circuits are non-linear, making their analysis non-trivial. We provide an approximate analysis of their phase-noise and the resulting performance that is tight in the high SNR regime.

3. The achievable system throughput with PACE aided beamforming in a wide-band channel is analytically characterized.

4. Simulations with practically relevant channel models are used to support the analytical results and compare performance to existing schemes.

[0134] Notation: scalars are represented by light-case letters; vectors by bold-case letters; and sets by calligraphic letters. Additionally, $j = \sqrt{-1}$, $a^*$ is the complex conjugate of a complex scalar $a$, $|\mathbf{a}|$ represents the $\ell_2$-norm of a vector $\mathbf{a}$ and $\mathbf{A}^{\dagger}$ is the conjugate transpose of a complex matrix $\mathbf{A}$. Finally, $\mathbb{E}\{\cdot\}$ represents the expectation operator, $\otimes$ represents the Kronecker product, $\overset{d}{=}$ represents equality in distribution, Re{·} and Im{·} refer to the real and imaginary components, respectively, $\mathcal{CN}(\mathbf{a}, \mathbf{B})$ represents a circularly symmetric complex Gaussian vector with mean $\mathbf{a}$ and covariance matrix $\mathbf{B}$, Exp(a) represents an exponential distribution with mean a and Uni{a, b} represents a uniform distribution in the range [a, b].

II. PACE General Assumptions and System model

We consider the downlink of a single-cell MIMO system, wherein one base station (BS) with $M_{tx}$ antennas transmits to several UEs with $M_{rx}$ antennas each. Since the focus is on the downlink, we shall use the abbreviations BS & TX and UE & RX interchangeably. Each UE is assumed to have one up/down-conversion chain, while no assumptions are made regarding the BS architecture. Here we assume the communication between the BS and UEs to involve three important phases: (i) initial access (IA), where the BS and UEs find each other, timing and frequency synchronization is attained and spectral resources are allocated; (ii) analog beamformer design, where the BS and UEs obtain the required aCSI to update the analog precoding/combining beams; and (iii) data transmission. The relative time scales of these phases are illustrated in Fig. 11. Through most of this paper (Sections 2-4), we assume that the IA and beamformer design at the BS are already achieved, and we mainly focus on the beamformer design phase at the UE and the data transmission phase. Therefore we assume perfect timing and frequency synchronization between the BS and UE, and assume that the TX beamforming has been pre-designed based on aCSI at the BS. Later in Section V, we also briefly discuss how aCSI can be acquired at the BS, how IA can be performed and how the use of PACE can be advantageous in those phases. The BS transmits one spatial data-stream to each scheduled UE, and all such scheduled UEs are served simultaneously via spatial multiplexing. Furthermore, the data to the UEs is assumed to be transmitted via orthogonal precoding beams such that there is no inter-user interference. Under these assumptions and given transmit precoding beams and power allocation, we shall restrict the analysis to one representative UE without loss of generality. For convenience, we shall also assume the use of noise-less and perfectly linear antennas, filters, amplifiers and mixers at both the BS and UE. An analysis including the non-linear effects of these components is beyond the scope of this paper.

The BS transmits orthogonal frequency division multiplexing (OFDM) symbols with K sub-carriers, indexed as $\mathcal{K} = \{-K_1, \ldots, K_2 - 1, K_2\}$ with $K_1 + K_2 + 1 = K$, to this representative UE. The BS transmits two kinds of symbols: reference symbols and data symbols. In a reference symbol, only a reference tone, i.e., a sinusoidal signal with a predetermined frequency known both to the BS and UE, is transmitted on the 0-th sub-carrier, and the remaining sub-carriers are all empty. On the other hand, in a data symbol all the K sub-carriers are used for data transmission. The purpose of the reference symbols is to aid PACE and beamformer design at the RX, as shall be explained later. Since the BS can afford an accurate oscillator, we shall assume that the BS suffers negligible phase noise. The $M_{tx} \times 1$ complex equivalent transmit signal for the 0-th symbol, if it is a reference or data symbol, respectively, can then be expressed as:

for $-T_{cp} \le t \le T_s$, where $\mathbf{t}$ is the $M_{tx} \times 1$ unit-norm TX beamforming vector for this UE, $x_k$ is the data signal at the k-th OFDM sub-carrier, $j = \sqrt{-1}$, $f_c$ is the carrier/reference frequency, $f_k = k/T_s$ represents the frequency offset of the k-th sub-carrier, $T_{cs} = T_{cp} + T_s$, and $T_s$, $T_{cp}$ are the symbol duration and the cyclic prefix duration, respectively. Here we define the complex equivalent signal such that the actual (real) transmit signal is given by its real part. For the data symbols, we assume the use of Gaussian signaling with $E^{(k)} = \mathbb{E}\{|x_k|^2\}$, for each $k \in \mathcal{K}$. The total average transmit OFDM symbol energy (including cyclic prefix) allocated to the UE is defined as $E_{cs}$. For convenience we also assume that $f_c$ is a multiple of $1/T_{cs}$, which ensures that the reference tone has the same initial phase in consecutive reference symbols.
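As a rough illustration of the two symbol types described above, the following Python sketch builds baseband samples of a reference symbol (a single tone on the 0-th sub-carrier) and of a data symbol (Gaussian signaling on all K sub-carriers), each with a cyclic prefix. It is a sketch under stated assumptions, not the transmit expression of this section: the beamforming vector, the carrier term and the energy scaling are omitted, and the function names, the sampling grid $t = nT_s/K$ and the values of $K_1$, $K_2$ and the prefix length are hypothetical.

```python
import cmath
import math
import random


def ofdm_baseband(symbols, k1, k2, n_cp):
    """Samples of sum_k x_k * exp(j*2*pi*f_k*t) for -T_cp <= t < T_s.

    symbols maps a sub-carrier index k in {-k1, ..., k2} to x_k. With
    f_k = k/T_s and t = n*T_s/K (n = -n_cp, ..., K-1), T_s cancels out.
    """
    K = k1 + k2 + 1
    out = []
    for n in range(-n_cp, K):
        out.append(sum(x * cmath.exp(2j * math.pi * k * n / K)
                       for k, x in symbols.items()))
    return out


def reference_symbol(k1, k2, n_cp, energy):
    """Reference symbol: only the 0-th sub-carrier carries a tone."""
    return ofdm_baseband({0: math.sqrt(energy)}, k1, k2, n_cp)


def data_symbol(k1, k2, n_cp, energy_per_sc, rng):
    """Data symbol: circularly symmetric Gaussian x_k on every sub-carrier."""
    s = math.sqrt(energy_per_sc / 2.0)
    xs = {k: complex(rng.gauss(0.0, s), rng.gauss(0.0, s))
          for k in range(-k1, k2 + 1)}
    return ofdm_baseband(xs, k1, k2, n_cp), xs


K1, K2, N_CP = 8, 7, 4                   # hypothetical: K = 16, 4-sample prefix
ref = reference_symbol(K1, K2, N_CP, energy=1.0)
dat, xs = data_symbol(K1, K2, N_CP, energy_per_sc=1.0, rng=random.Random(1))
```

At baseband the reference symbol is a constant, which is why consecutive reference symbols join without a phase jump once $f_c$ is a multiple of $1/T_{cs}$; the first `N_CP` samples of the data symbol repeat its tail, as expected of a cyclic prefix.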

The channel to the representative UE is assumed to be sparse with L resolvable MPCs ($L \ll M_{tx}, M_{rx}$), and the corresponding $M_{rx} \times M_{tx}$ channel impulse response matrix is given as (M. Akdeniz, Y. Liu, M. Samimi, S. Sun, S. Rangan, T. Rappaport, and E. Erkip, “Millimeter wave channel modeling and cellular capacity evaluation,” IEEE Journal on Selected Areas in Communications, vol. 32, pp. 1164-1179, June 2014):

$$\mathbf{H}(t) = \sum_{\ell=0}^{L-1} \alpha_\ell\, \mathbf{a}_{rx}(\ell)\, \mathbf{a}_{tx}(\ell)^{\dagger}\, \delta(t - \tau_\ell), \qquad (2.2)$$

where $\alpha_\ell$ is the complex amplitude, $\tau_\ell$ is the delay and $\mathbf{a}_{tx}(\ell)$, $\mathbf{a}_{rx}(\ell)$ are the TX and RX array response vectors, respectively, of the $\ell$-th MPC. As an illustration, the $\ell$-th RX array response vector for a uniform planar array with $M_H$ horizontal and $M_V$ vertical elements ($M_{rx} = M_H M_V$) is given by:

Here $\psi^{rx}_{azi}(\ell)$, $\psi^{rx}_{ele}(\ell)$ are the azimuth and elevation angles of arrival for the $\ell$-th MPC, $\Delta_H$, $\Delta_V$ are the horizontal and vertical antenna spacings and $\lambda$ is the wavelength of the carrier signal. Expressions for $\mathbf{a}_{tx}(\ell)$ can be obtained similarly. Note that in (2.2) we implicitly assume frequency-flat MPC amplitudes $\{\alpha_0, \ldots, \alpha_{L-1}\}$ and ignore beam squinting effects (S. K. Garakoui, E. A. M. Klumperink, B. Nauta, and F. E. van Vliet, “Phased-array antenna beam squinting related to frequency dependency of delay circuits,” in European Microwave Conference, pp. 1304-1307, Oct 2011), which are reasonable assumptions for moderate system bandwidths. To prevent inter symbol interference, we also let the cyclic prefix be longer than the maximum channel delay: $T_{cp} > \tau_{L-1}$. To model a time varying channel, we treat $\{\alpha_\ell, \mathbf{a}_{tx}(\ell), \mathbf{a}_{rx}(\ell)\}$ as aCSI parameters that remain constant within an aCSI coherence time and may change arbitrarily afterwards. However, since the channel is more sensitive to delay variations, the MPC delays $\{\tau_0, \ldots, \tau_{L-1}\}$ are modeled as iCSI parameters that only remain constant within a shorter interval called the iCSI coherence time. Note that this time variation of delays is an equivalent representation of the Doppler spread experienced by the RX. Finally, we do not assume any distribution prior or side information on the channel parameters.

The RX front-end is assumed to have a low noise amplifier followed by a band-pass filter at each antenna element that leaves the desired signal undistorted but suppresses the out-of-band noise. The $M_{rx} \times 1$ filtered complex equivalent received waveform for the 0-th symbol can then be expressed as:

where $\tilde{\mathbf{w}}(t)$ is the $M_{rx} \times 1$ complex equivalent, baseband, stationary, additive, vector Gaussian noise process, with individual entries being circularly symmetric, independent and identically distributed (i.i.d.), and having a power spectral density $S_w(f) = N_0$ for $-f_{K_1} \le f \le f_{K_2}$. During the data transmission phase, the $M_{rx} \times 1$ received data waveform $\tilde{\mathbf{s}}^{(d)}_{rx}(t)$ is phase shifted by a bank of phase-shifters, whose outputs are summed and fed to a down-conversion chain for data demodulation, as in conventional analog beamforming. However, unlike conventional CE based analog beamforming, the control signals to the phase-shifters are obtained using the reference symbols $\tilde{\mathbf{s}}^{(r)}_{rx}(t)$ and PACE, as shall be discussed in the next section.

III. Analog beamformer design at the receiver

[0141] During each beamformer design phase, the BS transmits D consecutive reference symbols to facilitate PACE at the RX. This process involves two steps: locking a local RX oscillator to the received reference tone, and using this locked oscillator to estimate the amplitude and phase-offsets at each antenna. Here locking refers to ensuring that the phase difference between the oscillator and the received reference tone is approximately constant. The first $D_1$ reference symbols are used for the former step and the remaining $D_2 = D - D_1$ symbols are used for the latter step. Therefore D is independent of $M_{rx}$ and is mainly determined by the time required for oscillator locking (see Remark 3.1). The first step shall be referred to as recovery of the reference tone and is analyzed in Section 3.1, while the latter step is discussed in Section 3.2. As shall be shown, both steps are significantly impaired by channel noise. Therefore in Section 3.3, we propose an improved architecture for reference tone recovery that provides better noise performance, albeit with a slightly higher hardware complexity. For convenience, we shall assume that the MPC delays do not change within the beamformer design phase, and are represented as $\{\hat{\tau}_0, \ldots, \hat{\tau}_{L-1}\}$ (see also Remark 3.2). However, the delays may be different during the data transmission phase, as shall be considered in Section 4. Without loss of generality, assuming the first reference symbol to be the 0-th OFDM symbol, the complex equivalent RX signal for the D reference symbols at antenna m can be expressed as:

… amplitude of the reference tone at antenna m.

III-A. Recovery of the reference tone - using one PLL

[0143] For locking a local RX oscillator to the reference signal, we first consider the use of a type 2 analog PLL at RX antenna 1, as illustrated in Fig. 12. The PLL is a common carrier-recovery circuit - with a mixer, a loop low pass filter (LF), a variable loop gain (G) and a voltage controlled oscillator (VCO) arranged in a feedback mechanism - that can filter the noise from an input noisy sinusoidal signal (see S. C. Gupta, “Phase-locked loops,” Proceedings of the IEEE, vol. 63, pp. 291-306, Feb 1975; A. Viterbi, Principles of coherent communication, McGraw-Hill series in systems science, McGraw-Hill, 1966, for more details).

Here LF is assumed to be a first-order active low-pass filter with a transfer function $LF(s) = 1 + \epsilon/s$ and the loop gain G is assumed to adapt to the amplitude of the input such that $G|A^{(r)}_1|$ is constant. For convenience, we also ignore the VCO's internal noise (A. Mehrotra, “Noise analysis of phase-locked loops,” IEEE Transactions on Circuits and Systems I: Fundamental Theory and Applications, vol. 49, pp. 1309-1316, Sep 2002; D. Petrovic, W. Rave, and G. Fettweis, “Effects of phase noise on OFDM systems with and without PLL: Characterization and compensation,” IEEE Transactions on Communications, vol. 55, pp. 1607-1616, Aug 2007). Without loss of generality, let the output of the VCO (i.e., the recovered reference tone) be expressed as:

where $\theta(t)$ may be arbitrary and we define $\vartheta \in (-\pi, \pi]$ such that $A^{(r)}_1 e^{-j\vartheta} = -j|A^{(r)}_1|$. Then the stochastic differential equation governing (2.6) for $0 \le t \le DT_{cs} - T_{cp}$ is given by (A. Viterbi, Principles of coherent communication. McGraw-Hill series in systems science, McGraw-Hill, 1966):

where $f_{vco}$ is the free running frequency of the VCO with no input, we use (2.5) and assume $f_c$ is much larger than the bandwidth of LF. In this subsection, we are interested in finding the time required for locking ($D_1T_{cs}$), i.e., for $\theta(t)$ to (nearly) converge to a constant, and in characterizing the distribution of the PLL output $s_{PLL}(t)$, or equivalently $\theta(t)$, during the last $D_2$ reference symbols when the PLL is locked to the reference tone. The first part is answered by the following remark:

Remark 3.1 For the PLL considered, the lock acquisition time is

in the no noise scenario (S. C. Gupta, “Phase-locked loops,” Proceedings of the IEEE, vol. 63, pp. 291-306, Feb 1975; A. Viterbi, Principles of coherent communication. McGraw-Hill series in systems science, McGraw-Hill, 1966). Thus $\epsilon$ and $|A^{(r)}_1|G$ must be of the orders of $1/T_s$ and $2\pi|f_c - f_{vco}|$, respectively, to keep $D_1$ small.

[0146] Numerous techniques (D. Messerschmitt, “Frequency detectors for PLL acquisition in timing and carrier recovery,” IEEE Transactions on Communications, vol. 27, pp. 1288-1295, Sep 1979; Y. Venkataramayya and B. S. Sonde, “Acquisition time improvement of PLLs using some aiding functions,” Indian Institute of Science Journal, vol. 63, pp. 73-88, Mar. 1981) have been proposed to further reduce the lock acquisition time, which are not explored here for brevity. In the locked state, it can be shown that $\theta(t)$ suffers from random fluctuations due to the input noise, and that $\theta(t)$ (modulo 2π) is approximately a zero mean random process (S. C. Gupta, “Phase-locked loops,” Proceedings of the IEEE, vol. 63, pp. 291-306, Feb 1975; A. Viterbi, Principles of coherent communication. McGraw-Hill series in systems science, McGraw-Hill, 1966). This fluctuation manifests as phase noise of $s_{PLL}(t)$. While several attempts have been made to characterize the locked state $\theta(t)$ (see S. C. Gupta, “Phase-locked loops,” Proceedings of the IEEE, vol. 63, pp. 291-306, Feb 1975; A. Viterbi, Principles of coherent communication. McGraw-Hill series in systems science, McGraw-Hill, 1966, and references therein), closed form results are available only for a few simple scenarios that are not applicable here. Therefore, for analytical tractability, we linearize (2.7) using the following widely used approximations (A. Viterbi, Principles of coherent communication. McGraw-Hill series in systems science, McGraw-Hill, 1966):

1. We neglect cycle slips and assume that the deviations of $\theta(t)$ about its mean value are small, such that $e^{-j\theta(t)} \approx 1 - j\theta(t)$ in the locked state.

2. We assume that the distribution of the baseband noise process $\tilde{w}^{(r)}_1(t)$ is invariant to multiplication with $e^{-j\theta(t)}$, i.e., the product is also a Gaussian noise process with power spectral density $S_w(f)$.

Approximation 1 is accurate in the locked state and in the large SNR regime, while Approximation 2 is accurate when the noise bandwidth is much larger than the loop filter bandwidth (A. Viterbi, Principles of coherent communication. McGraw-Hill series in systems science, McGraw-Hill, 1966; A. J. Viterbi, “Phase-locked loop dynamics in the presence of noise by Fokker-Planck techniques,” Proceedings of the IEEE, vol. 51, pp. 1737-1753, Dec 1963). Using these approximations and the definition of $\vartheta$, we can linearize (2.7) as:

$\ldots\; 2\pi|f_c - f_{vco}|$, (2.8)

where we replace $\theta(t)$ by $\theta_L(t)$ to denote use of the linear approximation. Note that for sufficient ... (modulo 2π) during the last $D_2$ reference symbols. Assuming $\theta_L(0) = 0$ and the PLL input to be 0 for t < 0, and taking the Laplace transform on both sides of (2.8), we obtain:

where $\Theta_L(s)$ and $W^{(r)}_1(s)$ are the Laplace transforms of $\theta_L(t)$ and the corresponding noise process, respectively. It can be verified using the final value theorem that the contribution of the last term on the right hand side of (2.9) vanishes for $t \gg 0$ (i.e., in locked state). Therefore, ignoring this term in (2.9), we observe that $\theta_L(t)$ is a zero mean, stationary Gaussian process (A. Mehrotra, “Noise analysis of phase-locked loops,” IEEE Transactions on Circuits and Systems I: Fundamental Theory and Applications, vol. 49, pp. 1309-1316, Sep 2002) in the locked state. Furthermore, the locked state power spectral density, auto-correlation function and variance of $\theta_L(t)$ can then be computed, respectively, as:

¾ , (/) = m Q2nf \ 2

where $2a = G|A^{(r)}_1| + \sqrt{G^2|A^{(r)}_1|^2 - 4G|A^{(r)}_1|\epsilon}$, $2b = G|A^{(r)}_1| - \sqrt{G^2|A^{(r)}_1|^2 - 4G|A^{(r)}_1|\epsilon}$, (2.12)-(2.13) follow from finding the inverse Fourier transform via partial fraction expansion, and the final expressions follow by observing that $S_w(f) \le N_0$ for all f. Since $\theta_L(t)$ is stationary and Gaussian in locked state, note that its distribution is completely characterized by (2.10)-(2.13).
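The constants a and b above are the magnitudes of the two poles of the linearized loop, i.e., the roots of $s^2 + G|A^{(r)}_1|\,s + G|A^{(r)}_1|\,\epsilon = 0$. The Python sketch below computes them and also integrates a noise-free phase-error equation to illustrate locking. It is a minimal sketch under assumptions: the phase-error model $\dot\phi = \Delta\omega - G|A|\,(\sin\phi + \epsilon\int\sin\phi\,dt)$ is the textbook baseband model of a PLL with loop filter $1 + \epsilon/s$, written here because (2.7) and (2.8) are not legible; `gain_amp` stands for the product $G|A^{(r)}_1|$, `delta_omega` for $2\pi(f_c - f_{vco})$, and all numerical values are hypothetical.

```python
import cmath
import math


def loop_poles(gain_amp, eps):
    """Return (a, b) with 2a, 2b = G|A| +/- sqrt((G|A|)^2 - 4*G|A|*eps)."""
    disc = cmath.sqrt(gain_amp ** 2 - 4.0 * gain_amp * eps)
    return (gain_amp + disc) / 2.0, (gain_amp - disc) / 2.0


def simulate_phase_error(gain_amp, eps, delta_omega, t_end, dt=1e-4):
    """Euler integration of the assumed noise-free phase-error equation
        d(phi)/dt = delta_omega - gain_amp * (sin(phi) + eps * I(t)),
    where I(t) is the running integral of sin(phi).
    Returns the final phase error and the final integrator state.
    """
    phi, integ = 0.0, 0.0
    for _ in range(int(round(t_end / dt))):
        s = math.sin(phi)
        integ += s * dt
        phi += (delta_omega - gain_amp * (s + eps * integ)) * dt
    return phi, integ


GAIN_AMP, EPS, DELTA_OMEGA = 50.0, 5.0, 20.0   # hypothetical values
a, b = loop_poles(GAIN_AMP, EPS)
phi_end, integ_end = simulate_phase_error(GAIN_AMP, EPS, DELTA_OMEGA, t_end=3.0)
```

In this model the integrator in the loop filter absorbs the frequency offset, so the phase error settles to zero, and the slower pole b sets how quickly it does so. This matches the observation in Remark 3.1 that $\epsilon$ and $G|A^{(r)}_1|$ govern the lock acquisition time.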

III-B. Phase and amplitude offset estimation

This subsection analyzes the procedure for reference signal phase and amplitude offset estimation at each RX antenna. As illustrated in Fig. 3, the PLL signal from antenna 1 is fed to a π/2 phase shifter to obtain its quadrature component. From (2.7), the in-phase and quadrature-phase components of the PLL signal for $D_1T_{cs} - T_{cp} \le t \le DT_{cs} - T_{cp}$ can be expressed together as:

At each RX antenna, the received reference signal is multiplied by the in-phase and quadrature-phase components of the PLL signal, and the resulting outputs are fed to 'filter, sample and hold' circuits. This circuit involves a low pass filter with a bandwidth of ≈ $1/(D_2T_{cs})$, followed by a sample and hold circuit that samples the filtered output at the end of the D reference symbols. For convenience, in this paper we shall approximate this 'filter, sample and hold' by an integrate and hold operation as depicted in Fig. 3. Representing the 'filter, sample and hold' outputs corresponding to the in-phase and quadrature-phase components of the PLL output as the real and imaginary parts, respectively, the $M_{rx} \times 1$ complex sample and hold vector can be approximated as:

... k-th sub-carrier during the beamformer design phase and ... i.i.d. Gaussian noise process vector with power spectral density $S_w(f)$ (see Approximation 2).

Note that in locked state ($T_1 \le t \le T_2$), we have $\theta(t) \approx \theta_L(t)$ (modulo 2π), as per Approximations 1 and 2. Furthermore, from (2.11), the auto-correlation function of $\theta_L(t)$ decays exponentially with a time constant of $O(1/(G|A^{(r)}_1|))$. Therefore, for $G|A^{(r)}_1| \gg 1/(D_2T_{cs})$, $\mathbf{I}_{PACE}$ experiences enough independent realizations of $\theta(t)$. Therefore, replacing the integral in (2.14) with an expectation over VCO phase noise, we have:

where $\overset{(1)}{\approx}$ follows from the fact that $\theta(t) \approx \theta_L(t)$ (modulo 2π) in locked state, and $\overset{(2)}{=}$ follows by defining $\tilde{\mathbf{W}}^{(r)}$ as the integrated noise term and by using the characteristic function for the stationary Gaussian process $\theta_L(t)$. Since $\tilde{\mathbf{w}}^{(r)}(t)$ is i.i.d. Gaussian with a power spectral density $S_w(f)$, it can be verified ... From (2.15), note that the signal component of the sample and hold output $\mathbf{I}_{PACE}$ is directly proportional to the channel matrix at the reference frequency. The outputs are used as control signals to the RX phase-shifter array, to generate the RX analog beam to be used during the data transmission phase. From (2.15) and (2.12), note that either $D_2$ or $|A^{(r)}_1|$ can be increased to reduce the impact of noise on the analog beam. Since $|A^{(r)}_1|$ is a non-decreasing function of $E^{(r)}$ (see (2.5)), this implies that $E^{(r)}$ should be kept as large as possible while satisfying $E^{(r)} \le E_{cs}$ and meeting the spectral mask regulations.
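The estimation step of this subsection can be sketched in discrete time as follows. This Python example is illustrative only and rests on several assumptions not taken from the text: the recovered tone is noiseless with a fixed phase offset `phase_off` (no PLL phase noise), the $\sqrt{2}$ mixer scaling, the low tone frequency and the sample rate are chosen for convenience, the integration window spans an integer number of tone periods, and the per-antenna amplitudes `amps` come from a hypothetical single-path, half-wavelength linear array. Each antenna's real reference tone is multiplied by the recovered tone and by its π/2-shifted copy, and both products are averaged (integrate and hold) to form the real and imaginary parts of the estimate.

```python
import cmath
import math
import random


def integrate_and_hold(amps, f0, phase_off, t_int, fs, noise_std, rng):
    """Per-antenna complex estimates from I/Q correlation with a recovered tone.

    amps      : complex reference-tone amplitudes, one per antenna
    phase_off : fixed phase offset of the recovered tone (assumed constant)
    Returns a list whose m-th entry approximates amps[m] * exp(-j*phase_off).
    """
    n = int(round(t_int * fs))
    est = [0j] * len(amps)
    for i in range(n):
        t = i / fs
        arg = 2.0 * math.pi * f0 * t + phase_off
        lo_i = math.sqrt(2.0) * math.cos(arg)    # recovered (in-phase) tone
        lo_q = -math.sqrt(2.0) * math.sin(arg)   # pi/2-shifted (quadrature) copy
        for m, a in enumerate(amps):
            # Real pass-band reference tone at antenna m plus white noise
            rx = math.sqrt(2.0) * (a * cmath.exp(2j * math.pi * f0 * t)).real
            rx += noise_std * rng.gauss(0.0, 1.0)
            est[m] += complex(rx * lo_i, rx * lo_q) / n
    return est


rng = random.Random(7)
# Hypothetical 4-element array, single path arriving at 0.3 rad
amps = [0.8 * cmath.exp(1j * math.pi * m * math.sin(0.3)) for m in range(4)]
phase_off = 0.4
est = integrate_and_hold(amps, f0=10.0, phase_off=phase_off, t_int=1.0,
                         fs=1000.0, noise_std=0.1, rng=rng)

# Conjugate estimates as phase-shifter controls: coherent combining
weights = [e.conjugate() for e in est]
combined = abs(sum(w * a for w, a in zip(weights, amps)))
ideal = sum(abs(a) ** 2 for a in amps)
```

The common rotation $e^{-j\cdot\text{phase\_off}}$ is the same at every antenna, so it does not affect the beam: `combined` stays close to `ideal`, the coherent sum of the per-antenna powers. Lengthening the integration window (the role of $D_2$) or raising the tone amplitude reduces the noise in `est`, in line with the remark above.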

[0150] Note that the results in this section are based on several approximations, including the linear phase noise analysis in Section 3.1. To test the accuracy of these results, the numerical values of $\left|\int e^{-j\theta(t)}\,dt\right| / (D_2T_{cs})$, obtained by simulating realizations of $\theta(t)$ from (2.7), are compared to the analytic approximation $e^{-\mathrm{Var}\{\theta_L(t)\}/2}$ in Fig. 13. Note that this comparison reflects the accuracy of the approximation in (2.15). As is evident from Fig. 13, (2.15) is accurate above a certain SNR. Additionally, since $\mathbf{I}_{PACE}$ decays exponentially with $\mathrm{Var}\{\theta_L(t)\}$ (see (2.15)), we observe from Fig. 13A that the mean integrator output drops drastically below a certain threshold SNR. As shall be shown in Section 4, such a drop in the mean causes a sharp degradation in the system performance below this threshold SNR. Therefore in the next subsection we propose a better reference recovery circuit, called weighted carrier arraying, that reduces the SNR threshold.

[0151] Table 1: One PLL and weighted arraying simulation parameters.

[0152] Remark 3.2 The preceding derivations assumed that the MPC delays are identical for the D reference symbols. However, since the PLL continuously tracks the RX signal and the phase and amplitude estimation at each antenna is performed simultaneously, these results are valid even if the delays change slowly within the beamformer design phase.

[0153] Remark 3.3 The RX phase-shifter array or the down-conversion chain are not utilized during the D reference symbols of the beamformer design phase. Therefore, data reception is also possible during these D reference symbols in parallel, as long as a sufficient guard band between the data sub-carriers and the reference sub-carrier is provided (similar to (2.27)) to reduce the impact on the PLL performance.

[0154] Note that in a multi-cell scenario, use of the same reference tone in adjacent cells can cause reference tone contamination, i.e., the estimates may contain components corresponding to the channel from a neighboring BS. This is analogous to pilot contamination in conventional CE approaches (T. Marzetta, “Noncooperative cellular wireless with unlimited numbers of base station antennas,” IEEE Transactions on Wireless Communications, vol. 9, pp. 3590-3600, November 2010), and can be avoided by using different, well-separated reference frequencies in adjacent cells.

[0155] III-C. Recovery of the reference tone - using weighted carrier arraying

[0156] For reducing the PLL SNR threshold and improving performance, in this subsection we propose a new reference recovery technique called weighted carrier arraying, as illustrated in Fig. 4. Apart from a main primary PLL, weighted carrier arraying has secondary PLLs at a subset $\mathcal{M}$ of antennas, which compensate for the inter-antenna phase shift. The resulting phase compensated signals from the $|\mathcal{M}|$ antennas are weighted, combined and tracked by the primary PLL, which operates at a higher SNR and with a wider loop bandwidth than the secondary PLLs. Note that this architecture can be interpreted as a generalization of the carrier recovery process in (P. Thompson, “Adaptation by direct phase-shift adjustment in narrow-band adaptive antenna systems,” IEEE Transactions on Antennas and Propagation, vol. 24, pp. 756-760, Sep 1976; C. Golliday and R. Huff, “Phase-locked loop coherent combiners for phase array sensor systems,” IEEE Transactions on Communications, vol. 30, pp. 2329-2340, Oct 1982; J. H. Schrader, “Receiving system design for the arraying of independently steerable antennas,” IRE Transactions on Space Electronics and Telemetry, vol. SET-8, pp. 148-153, June 1962; J. H. Schrader, “A phase-lock receiver for the arraying of independently directed antennas,” IEEE Transactions on Antennas and Propagation, vol. 12, pp. 155-161, March 1964) that allows weighted combining. We shall next analyze the performance of this arrayed PLL in the locked state. However, an analysis of the transient behavior and lock acquisition time of this design is beyond the scope of this paper.

[0157] In Fig. 4, LPF/BPF refer to low-pass and band-pass filters with wide bandwidths, designed only to remove the unwanted side-band of the mixer outputs.
Without loss of generality, we express the outputs of the primary and secondary VCOs as:

... and $s_{vco,m}(t) = \sqrt{2}\cos[2\pi f_{IF}t + \vartheta_m + \phi_m(t)]$, $m \in \mathcal{M}$,

respectively, where $\theta(t)$, $\phi_m(t)$ are arbitrary, $f_{IF}$ is the common free running frequency of the secondary VCOs, and $\vartheta$, $\vartheta_m$ are such that ... Similar to Section 3.1, from (2.5) the differential equation governing the secondary PLL at antenna $m \in \mathcal{M}$ can be expressed as:

where we define ... and $G_m$ is the loop gain of the secondary VCO at antenna m. Similarly, for the primary VCO we have:

where $f_{\mathrm{VCO}}$ is the free running frequency of the primary VCO, $G_p$ is the loop gain and $\mathrm{LF}_p$ is an active low-pass filter with transfer function $\mathrm{LF}_p(s) = (1 + \epsilon_p/s)$. Similar to Section 3.1, to obtain the locked-state distribution of $\theta(t)$ we shall rely on the linear PLL analysis by using:

which is accurate in the high-SNR locked state, and a further approximation which is accurate for a wide noise bandwidth. Using these approximations in (2.16)-(2.17) with zero initial conditions and taking Laplace transforms, we obtain:

where wi/Ys), 0 L (s) and d4 n (s) are the Laplace transforms of linear approximation an linear approximation respectively. We assume tlrat the loop gains of the PLLs adapt to the amplitudes of the input such that

constant. Then solving the system of equations in (2.18), we obtain:

It can be verified using the final value theorem that the last term in (2.19) only contributes a constant phase shift for $t \gg 0$ (in the locked state), say $\theta_L$. Thus, using steps similar to Section III-A, we can obtain the locked-state power spectral density and variance of the time-varying part of

Comparing (2.21) to (2.12), note that the PLL phase noise is essentially reduced by the maximal ratio combining gain corresponding to the $\mathcal{M}$ antennas. As this variation in $\theta(t)$ manifests as phase noise of $s_{\mathrm{PLL}}(t)$ in Fig. 4, the 'filter, sample and hold' outputs with weighted carrier arraying can be obtained by using (2.21) in (2.15). The accuracy of the resulting approximation is studied via simulations in Fig. 13.

[0158] IV. Data transmission

[0159] During the data transmission phase, OFDM symbols of type (1b) are transmitted and the corresponding received signals are processed via the phase-shifter array with $\mathbf{I}_{\mathrm{PACE}}$ as the control signals. Without loss of generality, again assuming the 0-th OFDM symbol as a representative data symbol, the combined data signal at the RX for $0 < t < T_s$ can be expressed as:

where the $1/\sqrt{2}$ is a scaling constant for convenience and we assume that the MPC delays for this representative data symbol are $\{\tilde{\tau}_0, \ldots, \tilde{\tau}_{L-1}\}$. This phase-shifted and combined signal $R(t)$ is then converted to baseband by a separate RX oscillator, and any resulting phase noise is assumed to be mitigated via some digital phase noise compensation techniques (D. Petrovic, W. Rave, and G. Fettweis, "Effects of phase noise on OFDM systems with and without PLL: Characterization and compensation," IEEE Transactions on Communications, vol. 55, pp. 1607-1616, Aug 2007; P. Robertson and S. Kaiser, "Analysis of the effects of phase-noise in orthogonal frequency division multiplex (OFDM) systems," in IEEE International Conference on Communications (ICC), vol. 3, pp. 1652-1657, Jun 1995; S. Wu, P. Liu, and Y. Bar-Ness, "Phase noise estimation and mitigation for OFDM systems," IEEE Transactions on Wireless Communications, vol. 5, pp. 3616-3625, December 2006; S. Randel, S. Adhikari, and S. L. Jansen, "Analysis of RF-pilot-based phase noise compensation for coherent optical OFDM systems," IEEE Photonics Technology Letters, vol. 22, pp. 1288-1290, Sept 2010). Therefore, neglecting the down-conversion phase noise, the resulting baseband signal can be expressed as $R_{\mathrm{BB}}(t)$. This signal is then sampled and OFDM demodulation follows. The OFDM demodulation output for the $k$-th subcarrier ($k \in \mathcal{K}$) is then given by:

where $\mathcal{H}(f_k) = \sum_{\ell} \alpha_\ell \mathbf{a}_{rx}(\ell)\mathbf{a}_{tx}(\ell)^{\dagger} e^{-j2\pi(f_c+f_k)\tilde{\tau}_\ell}$ is the $M_{rx} \times M_{tx}$ frequency-domain channel matrix for the $k$-th data subcarrier, with the noise being independently distributed for each $k \in \mathcal{K}$ as $\mathbf{W}[k] \sim \mathcal{CN}(\mathbf{0}_{M_{rx} \times 1}, (N_0 K/T_s)\mathbf{I}_{M_{rx}})$. Note from (2.15) that $\mathbf{I}_{\mathrm{PACE}}$ is similar (with appropriate scaling), but not identical, to the MRC beamformer for the $k$-th sub-carrier. The mismatch is due to the beamforming noise $\mathbf{W}^{(r)}$ and because the reference symbols and the $k$-th sub-carrier data stream pass through slightly different channels, owing to the difference in sub-carrier frequencies and the MPC delays ($\tilde{\tau}_\ell \neq \tau_\ell$). Consequently, the beamformer only achieves imperfect MRC, leading to some loss in performance and causing the effective channel coefficients $\mathbf{I}_{\mathrm{PACE}}^{\dagger}\mathcal{H}(f_k)\mathbf{t}$ to vary with the sub-carrier index $k$, i.e., the system experiences frequency-selective fading. Furthermore, since the MPC delays $\{\tilde{\tau}_0, \ldots, \tilde{\tau}_{L-1}\}$ change after every iCSI coherence time, so may these channel coefficients. As depicted in Fig. 11, we assume that the TX transmits pilot symbols within each iCSI coherence time to facilitate estimation of these coefficients $\{\mathbf{I}_{\mathrm{PACE}}^{\dagger}\mathcal{H}(f_k)\mathbf{t} \mid k \in \mathcal{K}\}$ at the RX. Since these pilots are used only to estimate the effective single-input-single-output (SISO) channel and not the actual MIMO channel, the corresponding overhead is small and shall be neglected here. Assuming perfect estimates of these channel coefficients, from (2.22) the effective SNR for the $k$-th sub-carrier, and the instantaneous system spectral efficiency (iSE), respectively, can be expressed as:

where we neglect the cyclic prefix overhead in (2.24) for convenience. Note that the iSE-maximizing data power allocation $\{E^{(d)}_k \mid k \in \mathcal{K}\}$ can be obtained via water-filling across the sub-carriers. While the exact expressions for (2.23)-(2.24) are involved, their expectations with respect to $\mathbf{I}_{\mathrm{PACE}}$ can be bounded as stated by the following theorem.

Theorem 4.1: If the RX array response vectors for the channel MPCs are mutually orthogonal, i.e., $\mathbf{a}_{rx}(\ell)^{\dagger}\mathbf{a}_{rx}(i) = 0$ for $\ell \neq i$, the effective SNR and iSE, averaged over the beamformer noise $\mathbf{W}^{(r)}$, can be bounded as in (2.25):

where $\gtrsim$ represents a $\geq$ inequality at a high enough SNR such that the approximations in Section 3 are accurate.
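The water-filling allocation across sub-carriers mentioned before the theorem can be sketched as follows. This is a minimal illustration, not the allocation procedure of the present embodiment: the names `water_filling` and `spectral_efficiency`, and the per-sub-carrier gain-to-noise ratios passed in as `gains`, are hypothetical placeholders for the effective channel quantities that appear in (2.23)-(2.24).

```python
import math


def water_filling(gains, total_power):
    """Split total_power over sub-carriers to maximize sum(log2(1 + g * p)).

    gains: positive gain-to-noise ratios, one per sub-carrier (assumed known).
    Returns the per-sub-carrier powers in the same order as gains.
    """
    floors = sorted(1.0 / g for g in gains)  # "floor" height of each sub-carrier
    level = 0.0
    # Drop the worst sub-carriers until the water level clears every active floor.
    for active in range(len(floors), 0, -1):
        level = (total_power + sum(floors[:active])) / active
        if level > floors[active - 1]:
            break
    return [max(level - 1.0 / g, 0.0) for g in gains]


def spectral_efficiency(gains, powers):
    """Sum of log2(1 + g * p) over the sub-carriers."""
    return sum(math.log2(1.0 + g * p) for g, p in zip(gains, powers))


if __name__ == "__main__":
    gains = [4.0, 1.0, 0.1]
    powers = water_filling(gains, 1.0)
    print(powers, spectral_efficiency(gains, powers))
```

Strong sub-carriers receive more power and sufficiently weak ones receive none, which is why the allocation depends on the frequency-selective effective channel described above.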

[0161] Proof. Substituting (2.15) in (2.22), and by treating the received signal component corresponding to the beamformer noise, i.e., $[\mathbf{W}^{(r)}]^{\dagger}\mathcal{H}(f_k)\mathbf{t}x_k$, as noise, we can obtain a lower bound to the mean SNR as:

Y l ACE (B(t:})

denom. denom. = [t f 5f (f 0 ffM(f 0 )tE^e ~v ^ e ^ W

where the inequality follows from Jensen's inequality and the equality from the orthogonality of the array response vectors. Similarly, by treating $[\mathbf{W}^{(r)}]^{\dagger}\mathcal{H}(f_k)\mathbf{t}x_k$ as Gaussian noise independent of $x_k$, a lower bound on the mean iSE can be obtained as:

where we use similar steps to (2.26).

[0162] The array response orthogonality condition in Theorem 4.1 is satisfied if the scatterers corresponding to different MPCs are well separated and $M_{rx} \gg L$ (O. El Ayach, R. Heath, S. Abu-Surra, S. Rajagopal, and Z. Pi, "The capacity optimality of beam steering in large millimeter wave MIMO systems," in IEEE International Workshop on Signal Processing Advances in Wireless Communications (SPAWC), pp. 100-104, June 2012). Note that even though the RX does not explicitly estimate the array response vectors $\mathbf{a}_{rx}(\ell)$ for the MPCs, we still observe an RX beamforming gain of $M_{rx}$ in (2.25a). The impact of imperfect MRC combining and the resulting frequency-selective fading is quantified by $\beta(f_c, f_k)$, where note that $|\beta(f_c, f_k)| \leq |\beta(0, 0)|$. Another drawback of the fading is that it may cause a drastic drop in performance of the one PLL architecture in Section III-A if $|A_1|$, the reference signal strength at antenna 1, falls in a fading dip, as is evident from (2.12) and (2.25). Note, however, that the weighted arraying architecture in Section 3.3 enjoys diversity against such fading by recovering the reference tone from multiple antennas $\mathcal{M}$.
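The orthogonality condition can be checked numerically for a simple array. The sketch below assumes a half-wavelength-spaced uniform linear array with the common response convention $[\mathbf{a}(\psi)]_m = e^{j\pi m \sin\psi}$; this convention and the function names are illustrative assumptions (the simulations later in this section use a planar array).

```python
import cmath
import math


def ula_response(num_antennas, angle_rad):
    """Array response of a half-wavelength-spaced ULA (assumed convention)."""
    return [cmath.exp(1j * math.pi * m * math.sin(angle_rad))
            for m in range(num_antennas)]


def normalized_correlation(num_antennas, angle_a, angle_b):
    """|a(angle_a)^H a(angle_b)| / M: 1 for identical angles, near 0 if orthogonal."""
    a = ula_response(num_antennas, angle_a)
    b = ula_response(num_antennas, angle_b)
    inner = sum(x.conjugate() * y for x, y in zip(a, b))
    return abs(inner) / num_antennas


if __name__ == "__main__":
    for m in (8, 64, 256):
        print(m, normalized_correlation(m, 0.0, 0.5))
```

For two fixed, distinct angles the normalized correlation is bounded by $1/(M|\sin(\Delta/2)|)$ with $\Delta = \pi(\sin\psi_a - \sin\psi_b)$, so it shrinks as the array grows, which is the sense in which well-separated MPCs become approximately orthogonal.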

[0163] V. Initial access and aCSI estimation at the BS

[0164] In this section we suggest how aCSI can be acquired at the BS during the TX beamformer design phase and also propose a sample IA protocol that can utilize PACE. Note that power allocation, user scheduling and design of the TX beamformer $\mathbf{t}$ require knowledge of the TX array response vectors and amplitudes $\{|\alpha_\ell|, \mathbf{a}_{tx}(\ell)\}$ for the different UEs. Such aCSI can be acquired at the BS either via uplink CE, or by downlink CE with CSI feedback from the RX. Uplink CE can be performed by transmitting an orthogonal pilot from each UE omni-directionally, and using any of the digital CE algorithms from Section 1 at the BS. Note that PACE cannot be used at the BS since the pilots from multiple UEs need to be separated via digital processing. For downlink CE with feedback, the BS transmits reference signals sequentially along different transmit precoder beams (beam sweeping), with $D$ reference symbols for each beam. The UEs perform PACE for each TX beam, and provide the BS with uplink feedback about the corresponding link strength for data transmission.

[0165] The suggested IA protocol is somewhat similar to the downlink CE with feedback, where the BS performs beam sweeping along different angular directions, possibly with different beam widths. For each TX beam, the BS transmits $D$ reference symbols, followed by a sequence of primary (PSS) and secondary synchronization sequences (SSS). The RX performs PACE, and provides uplink feedback to the BS upon successfully detecting a PSS. However, due to the lack of prior timing synchronization during the IA phase, the 'filter, sample and hold' circuit in Section 3.2 cannot be used directly for the PACE. One alternative is to allow continuous transmission of the reference tone even during the PSS and SSS with the following suggested symbol structure: where $Q$ defines a guard band around the reference tone, to reduce the impact of the data sub-carriers on the PLL output. The amplitude and phase estimation can then be performed similar to Section 3.2, by multiplying the received signal at each antenna with the PLL output and then filtering with a low-pass filter with cut-off frequency $1/(D T_{cs})$. Due to the continuous availability of the reference tone, the filter outputs can be directly used to control the phase shifter at each antenna without the 'sample and hold' operation. Since $D = O(1)$, the IA latency does not scale with $M_{rx}$ and yet the PSS/SSS symbols can exploit the RX beamforming gain, thus improving cell discovery radius and/or reducing IA overhead.
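The per-antenna amplitude and phase estimation described above (multiply the received signal by the recovered reference, then low-pass filter) can be illustrated with a noise-free sketch. The assumptions here are loud ones: an ideal local reference stands in for the PLL output, a plain average over an integer number of tone periods stands in for the analog low-pass filter, and all function names and numeric values are hypothetical.

```python
import cmath
import math


def received_tone(amplitude, phase, tone_freq, sample_rate, num_samples):
    """Real reference tone sqrt(2) * A * cos(2*pi*f*t + phase) at one antenna."""
    return [math.sqrt(2) * amplitude
            * math.cos(2 * math.pi * tone_freq * i / sample_rate + phase)
            for i in range(num_samples)]


def estimate_tone(samples, tone_freq, sample_rate):
    """Mix with an ideal quadrature reference and average (crude low-pass).

    Returns a complex estimate close to A * exp(j * phase) when the window
    spans an integer number of tone periods.
    """
    acc = 0j
    for i, r in enumerate(samples):
        arg = 2 * math.pi * tone_freq * i / sample_rate
        acc += r * math.sqrt(2) * cmath.exp(-1j * arg)
    return acc / len(samples)


def combining_weights(estimates):
    """Unit-norm conjugate weights built only from the tone estimates."""
    norm = math.sqrt(sum(abs(e) ** 2 for e in estimates))
    return [e.conjugate() / norm for e in estimates]


if __name__ == "__main__":
    channel = [(1.0, 0.3), (0.5, -1.2), (2.0, 2.5)]  # (amplitude, phase) per antenna
    est = [estimate_tone(received_tone(a, p, 50.0, 1000.0, 1000), 50.0, 1000.0)
           for a, p in channel]
    print(est, combining_weights(est))
```

Applying the conjugate weights to the true per-antenna coefficients adds the contributions in phase, which is the beamforming behavior that the phase-shifter control signals are meant to approximate.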

[0166] VI. Simulation Results

[0167] For the simulation results, we consider a single cell scenario with a $\lambda/2$-spaced $32 \times 8$ ($M_{tx} = 256$) antenna BS and one representative UE with a $\lambda/2$-spaced $16 \times 4$ ($M_{rx} = 64$) antenna array, having one down-conversion chain and using PACE aided beamforming. The BS has perfect aCSI and transmits one spatial OFDM data stream to this UE with $K = 1024$ sub-carriers and the beamformer $\mathbf{t}$ aligned with the strongest channel MPC. The RX beamformer design phase is assumed to last $D = 6$ symbols with $D_2 = 2$, where the BS transmits reference symbols with power $E^{(r)} = 20 E_{cs}/K$ (to satisfy spectral mask regulations). The system parameters for the one PLL and weighted arraying cases, respectively, are as given in Table 1. For comparison to existing schemes, we include the performance of RTAT, the continuous ACE based beamforming scheme in (V. V. Ratnam and A. Molisch, "Reference tone aided transmission for massive MIMO: analog beamforming without CSI," in IEEE International Conference on Communications (ICC), (Kansas City, USA), May 2018), and of statistical RX analog beamforming (P. Sudarshan, N. Mehta, A. Molisch, and J. Zhang, "Channel statistics-based RF pre-processing with antenna selection," IEEE Transactions on Wireless Communications, vol. 5, pp. 3501-3511, December 2006), where the beamformer is the largest eigenvector of the RX spatial correlation matrix $\mathbf{R}_{rx}(t)$. For both these schemes we ignore the impact of phase noise. Additionally, for statistical beamforming we consider two cases: (a) perfect knowledge of $\mathbf{R}_{rx}(t)$ at the RX and (b) an estimate of $\mathbf{R}_{rx}(t)$ obtained using sparse-ruler sampling (P. Pal and P. P. Vaidyanathan, "Nested arrays: A novel approach to array processing with enhanced degrees of freedom," IEEE Transactions on Signal Processing, vol. 58, pp. 4167-4181, Aug 2010), a reduced complexity digital CE technique. Note that PACE uses 6 reference symbols per beamformer update phase, RTAT avoids reference symbols but requires continuous transmission of the reference, and sparse-ruler sampling requires 21 pilot symbols for $M_{rx} = 64$.

[0168] We first consider a sparse multi-path channel having $L = 3$ MPCs with delays $\tau_\ell$ =

$\{0, 20, 40\}$ ns, angles of arrival $\{0.45\pi, \pi/2, \pi/2\}$ and effective amplitudes $\{\sqrt{0.6}, -\sqrt{0.3}, \sqrt{0.1}\}$, respectively, during the RX beamformer design phase and $\tilde{\tau}_\ell = \tau_\ell + \{30, 25, 25\}$ ps for one snapshot of the data transmission phase. For this channel, the mean iSE of PACE aided beamforming, obtained using Monte-Carlo simulations with the non-linear PLL equations (2.7), (2.16), (2.17), is compared to the analytical approximation (2.25b) and to the performance of other schemes in Fig. 14A. Since the RX beamformer $\mathbf{I}_{\mathrm{PACE}}$ in (2.14) is random, the one-sigma interval of iSE is also depicted as a shaded region here. As is evident from the results, the beamforming gain with PACE aided beamforming is only 2 dB lower than that of statistical beamforming, above a certain SNR threshold. Below this threshold, however, PACE experiences an exponential decay in performance due to the oscillator phase noise, as also predicted by Theorem 4.1. As is expected, this SNR threshold is lower for weighted carrier arraying than for one PLL. Furthermore, the derived analytical approximations are also accurate above this SNR threshold. PACE also outperforms RTAT at high SNR due to the judicious transmission of the reference, while the deceptively better performance of RTAT at low SNR is due to neglect of phase noise. Note that these PACE results are obtained for an oscillator offset of 5 MHz (see Table 1). Better performance can be achieved if the PLL is optimized for more accurate local oscillators.

To study the impact of more realistic channels and the number of MPCs, we next model the channel as a rich scattering stochastic channel with $L$ resolvable MPCs, each with 10 unresolved sub-paths. Here the MPCs and sub-paths are generated identically to the clusters and rays, respectively, in the 3GPP TR 38.900 Rel 14 channel model (UMi NLoS scenario) (TR38.900, "Study on channel model for frequency spectrum above 6 GHz (release 14)," Tech. Rep. V14.3.1, 3GPP, 2017). The only difference from (TR38.900, "Study on channel model for frequency spectrum above 6 GHz (release 14)," Tech. Rep. V14.3.1,
3GPP, 2017) is that we use an intra-cluster delay spread of 1 ns and an intra-cluster angle spread of $\pi/50$ (for all elevation, azimuth, arrival and departure), to ensure that the sub-paths of each MPC are unresolvable. The channel SNR at each RX antenna (including the TX beamforming gain) is fixed at 0 dB, and the channel variation between the beamformer design phase and one snapshot of the data transmission phase is modeled by assuming that the RX moves a distance of $d = 2$ cm in a random azimuth direction without changing its orientation. Note that this channel can also be represented by our system model by replacing $L$ in (2.2) with $10L$. For this stochastic channel model, the mean iSE for PACE aided beamforming, averaged over channel realizations, is compared to RTAT and statistical beamforming in Fig. 14B. For computational tractability, we skip the non-linear PLL simulation and use the analytical expressions (2.15) and (2.24) to quantify the performance of PACE. These expressions are accurate at 0 dB SNR, as observed from Fig. 14A. As observed from the results, the loss in beamforming gain for PACE aided beamforming increases with $L$, and therefore PACE is mainly suitable for channels with $L < 10$ resolvable MPCs. It must be emphasized that such cases may frequently occur at mm-wave frequencies, where the number of resolvable MPCs/clusters with significant energy (within 20 dB of the strongest) is on the order of 3-10 (M. Akdeniz, Y. Liu, M. Samimi, S. Sun, S. Rangan, T. Rappaport, and E. Erkip, "Millimeter wave channel modeling and cellular capacity evaluation," IEEE Journal on Selected Areas in Communications, vol. 32, pp. 1164-1179, June 2014; TR38.900, "Study on channel model for frequency spectrum above 6 GHz (release 14)," Tech. Rep. V14.3.1, 3GPP, 2017). Note that for the iSE results in this section, we did not include the CE overhead. While digital approaches like sparse ruler sampling (S. Haghighatshoar and G.
Caire, "Massive MIMO channel subspace estimation from low-dimensional projections," IEEE Transactions on Signal Processing, vol. 65, pp. 303-318, Jan 2017; P. Pal and P. P. Vaidyanathan, "Nested arrays: A novel approach to array processing with enhanced degrees of freedom," IEEE Transactions on Signal Processing, vol. 58, pp. 4167-4181, Aug 2010) require 21 pilots (for $M_{rx} = 64$), PACE uses only $D = 6$ pilots. The corresponding overhead reduction is significant when downlink CE with feedback is used for aCSI acquisition at the BS, such as in frequency division duplexing systems. For example, with exhaustive beam scanning (C. Jeong, J. Park, and H. Yu, "Random access in millimeter-wave beamforming cellular networks: issues and approaches," IEEE Communications Magazine, vol. 53, pp. 180-185, January 2015) at the TX and an aCSI coherence time of 10 ms, the BS aCSI acquisition overhead reduces from 40% for sparse ruler techniques to 11% for PACE (see Section 5 for protocol). The overhead reduction is expected to be higher if the additional time required for beam switching and settling (O. S. Sands, "Beam-switch transient effects in the RF path of the ICAPA receive phased array antenna," tech. rep., NASA Technical Memorandum TM-2003-212588, Feb 2002; K. Venugopal, A. Alkhateeb, N. G. Prelcic, and R. W. Heath, "Channel estimation for hybrid architecture-based wideband millimeter wave systems," IEEE Journal on Selected Areas in Communications, vol. 35, pp. 1996-2009, Sep 2017) is also taken into account. Thus, PACE aided beamforming shows potential in solving the CE overhead issue of hybrid massive MIMO systems, with minimal degradation in performance.

VII. Conclusions

[0172] This paper proposes the use of PACE for designing the RX beamformer in massive MIMO systems. This process involves transmission of a reference sinusoidal tone during each beamformer design phase, and estimation of its received amplitude and phase at each RX antenna using analog hardware. A one PLL based carrier recovery circuit is proposed to enable the PACE receiver, and its analysis suggests that the quality of the obtained channel estimates decays exponentially with the inverse of the SNR at the PLL input. To remedy this and also to obtain diversity against fading, a multiple PLL based weighted carrier arraying architecture is also proposed. The performance analysis suggests that PACE aided beamforming can be interpreted as using the channel estimates on one sub-carrier to perform beamforming on other sub-carriers, with an additional loss factor corresponding to the circuit phase noise. Simulation results suggest that PACE aided beamforming suffers only a small beamforming loss in comparison to conventional analog beamforming in sparse channels, at sufficiently high SNR. This loss, however, increases with the number of channel MPCs $L$, and hence PACE is mostly suitable for sparse channels with few MPCs. The CE overhead reduction with PACE is significant when downlink CE with feedback is required. Benefits of PACE aided beamforming during the IA phase are also discussed, although a more detailed analysis will be a subject for future work. Similarly, the performance of PACE at very low SNR and with system mismatches/imperfections also requires more attention.

3. MULTI-ANTENNA FSR RECEIVERS: LOW COMPLEXITY, NON-COHERENT, MASSIVE ANTENNA RECEIVERS

[0174] I. Introduction

[0175] In the present embodiment, a novel multi-antenna frequency shift reference (MA-FSR) receiver is provided. The MA-FSR receiver (RX) uses only one down-conversion chain, supports wide-band transmission with non-coherent demodulation, and can perform receive beamforming without requiring phase-shifters, explicit channel estimation, or complicated signal processing, thus alleviating the drawbacks of the above mentioned schemes. Inspired by the frequency shift reference (FSR) schemes for single-input-single-output (SISO) ultra-wideband (UWB) systems, in this scheme the transmitter (TX) transmits a reference signal and several data signals on different frequency sub-carriers via orthogonal frequency division multiplexing (OFDM). At each RX antenna, the received waveform corresponding to the data sub-carriers is then correlated with the received waveform corresponding to the reference signal via a simple squaring operation. The outputs are then summed up and fed to a single down-conversion chain for data demodulation. As shall be shown later, this operation emulates maximal ratio combining (MRC) at the RX with imperfect channel estimates. Since the RX beamforming is enabled without channel estimation, MA-FSR is especially suitable for fast time-varying channels, such as in V2V or V2X networks. Furthermore, due to the non-coherent RX architecture, the phase noise of the transmit signal has negligible influence on the performance. The RX also exploits power from all the channel multi-path components (MPCs) and is therefore resilient to blocking of MPCs. Unlike conventional UWB FSR systems, there is no bandwidth spreading of the data signal involved. Therefore the noise enhancement due to the non-linear RX architecture is significantly smaller, making it practically viable. On the flip side, the proposed scheme only uses 50% of the frequency sub-carriers for data transmission, can only support a single spatial data stream, cannot suppress interference and can only be used for beamforming in the receive mode of a node. Therefore, MA-FSR is more suitable for scenarios with abundant spectrum and where beamforming at the TX is unnecessary or where beamforming at the TX is achieved using conventional channel estimation methods. Examples include device-to-device networks where beamforming at the RX provides sufficient link margin, or infrastructure based networks where down-link traffic is dominant. The contributions of the present embodiment include:

1. Development of an MA-FSR RX architecture for massive MIMO systems, that allows non-coherent transmission, lowers implementation cost and energy consumption at the cost of 50% bandwidth efficiency and that does not require phase-shifters or channel estimation at the RX

[0177] 2. Characterization of the achievable throughput for the proposed MA-FSR system, both analytically and via simulations, for the single-input-multiple-output (SIMO) scenario in a wide-band channel.

3. Presentation of a class of improved MA-FSR architectures that can further improve performance, albeit with a higher hardware complexity.

[0179] Notation: scalars are represented by light-case letters; vectors by bold-case letters; and sets by calligraphic letters. Additionally, $j = \sqrt{-1}$, $E\{\cdot\}$ represents the expectation operator, $c^*$ is the complex conjugate of a complex scalar $c$, $\mathbf{c}^{\dagger}$ is the Hermitian transpose of a complex vector $\mathbf{c}$, $\delta(t)$ represents the Dirac delta function, $\delta_{a,b}$ represents the Kronecker delta function and $\mathrm{Re}\{\cdot\}$/$\mathrm{Im}\{\cdot\}$ refer to the real/imaginary component, respectively. Furthermore, $\mathbf{a}^*$ and $\mathbf{a}^{\dagger}$ denote the complex conjugate and the conjugate transpose of a vector $\mathbf{a}$, respectively.

II. General Assumptions and System Model

We consider a SIMO link (which can be part of a larger system) where the TX has a single antenna and the RX has $M \gg 1$ antennas and one down-conversion chain. Note that this model also covers a MIMO link where the TX transmits a single spatial data stream, since the combination of the TX precoding vector and the propagation channel creates an effective SIMO link. The TX transmits OFDM symbols with $2K$ sub-carriers, indexed as $\{0, \ldots, 2K-1\}$. A reference signal is transmitted on the 0-th sub-carrier and $K - g$ data signals are transmitted on the sub-carrier set $\mathcal{K} = \{K, K+1, \ldots, 2K-g-1\}$. Here $g$ ensures that the transmit signal lies within the system bandwidth, and is usually small, determined by the TX phase noise. The remaining sub-carriers, i.e., $\{1, \ldots, K-1\} \cup \{2K-g, \ldots, 2K-1\}$, are unused. While it uses only $\approx 50\%$ of the sub-carriers for data transmission, this OFDM structure is necessary to prevent inter-stream interference, as shall be shown in Section 3. Then, the complex equivalent transmit signal for the 0-th symbol (for $-T_{cp} < t < T_s$) can be expressed as: where $E_r$ is the energy allocated to the reference signal, $x_k$ is the data signal for the $k$-th OFDM sub-carrier, $f_c$ is the carrier frequency, $f_k = k/T_s$ represents the frequency offset of the $k$-th sub-carrier from the reference signal, $\theta(t)$ represents the phase noise process at the TX and $T_s$, $T_{cp}$ are the symbol duration and the cyclic prefix duration, respectively. Here we define the complex equivalent signal such that the actual (real) transmit signal is given by $\mathrm{Re}\{s(t)\}$. We assume further that the data signals on the sub-carriers $\{x_k \mid k \in \mathcal{K}\}$ are mutually independently distributed with zero means. The total average transmit symbol energy is then given by $E_s = E_r + \sum_{k \in \mathcal{K}} E_k$, where $E_k$ is the energy allocated to the $k$-th sub-carrier.
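The reason this sub-carrier layout prevents inter-stream interference after squaring can be seen by listing the beat (difference) frequencies that a square-law device produces. The sketch below is a small index-level check under the allocation stated above; the function names are hypothetical and the values of `K` and `g` are arbitrary examples.

```python
def fsr_allocation(K, g):
    """Reference on sub-carrier 0, data on {K, ..., 2K - g - 1} (as in the text)."""
    return 0, list(range(K, 2 * K - g))


def beat_frequencies(K, g):
    """Sub-carrier offsets produced by squaring the received signal.

    ref_data: reference x data products (the wanted terms).
    data_data: data x data products (the self-interference terms).
    """
    ref, data = fsr_allocation(K, g)
    ref_data = {k - ref for k in data}
    data_data = {abs(a - b) for a in data for b in data if a != b}
    return ref_data, data_data


if __name__ == "__main__":
    wanted, unwanted = beat_frequencies(8, 1)
    print(sorted(wanted), sorted(unwanted), wanted.isdisjoint(unwanted))
```

Because the data sub-carriers span fewer than $K$ indices, every data-by-data product falls strictly below offset $K$, while every reference-by-data product falls at or above $K$, so the two sets never overlap.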

[0182] The channel is assumed to have $L \ll M$ scatterers with the $M \times 1$ channel impulse response vector given as (M. Akdeniz, Y. Liu, M. Samimi, S. Sun, S. Rangan, T. Rappaport, and E. Erkip, "Millimeter wave channel modeling and cellular capacity evaluation," IEEE Journal on Selected Areas in Communications, vol. 32, pp. 1164-1179, June 2014):

where $\alpha_\ell$ is the complex gain, $\tau_\ell$ is the delay and $\mathbf{a}_\ell$ is the RX array response vector, respectively, of the $\ell$-th MPC. As an illustration, the array response vector for a $\lambda/2$-spaced uniform linear array is given by $[\mathbf{a}_\ell]_m = e^{j\pi(m-1)\sin(\psi_\ell)}$, where $\lambda$ is the wavelength of the carrier signal and $\psi_\ell$ is the angle of arrival of the $\ell$-th MPC. Note that here we implicitly assume the system bandwidth is small enough to ignore beam squinting effects. For ease of analysis, we assume that the array response vectors for the scatterers are mutually orthogonal, i.e., $\mathbf{a}_\ell^{\dagger}\mathbf{a}_i = M\delta_{\ell,i}$. This assumption is reasonable if the scatterers are well separated and $M \gg L$ (O. El Ayach, R. Heath, S. Abu-Surra, S. Rajagopal, and Z. Pi, "The capacity optimality of beam steering in large millimeter wave MIMO systems," in IEEE International Workshop on Signal Processing Advances in Wireless Communications (SPAWC), pp. 100-104, June 2012). Later, in Section 5, we shall also study the system performance when the above assumption is relaxed. To prevent inter-symbol interference, we let the cyclic prefix be longer than the maximum channel delay: $T_{cp} > \max_\ell \tau_\ell$. To model a generic time-varying channel, we assume that the MPC parameters remain constant for at least a coherence time interval $T_{coh}$, and may or may not change afterwards.

[0183] The RX front-end is assumed to have a low noise amplifier followed by a band-pass filter (BPF) at each antenna element, as depicted in Fig. 5. The BPF has a cut-off frequency of $f_{2K}$ and leaves the transmitted signal undistorted but suppresses the out-of-band noise. We also assume the RX to have perfect timing and clock synchronization with the TX. The filtered complex equivalent received waveform for the 0-th symbol for $0 < t < T_s$ can then be expressed as: where $s_{BB}(t) \triangleq s(t)e^{-j2\pi f_c t}$ is the baseband transmit signal including the carrier phase noise and $\mathbf{n}(t)$ is the $M \times 1$ baseband equivalent, stationary, complex additive Gaussian noise process vector, with individual entries being circularly symmetric, independent and identically distributed (i.i.d.), and having a power spectral density $S_n(f) = 2N_0$ for $0 < f \leq f_{2K}$. The filtered signals at each antenna are then squared and summed up, as depicted in Fig. 5. Note that such a squaring operation can be performed using square law devices or multipliers with identical inputs. For the purpose of this paper we shall assume that the squaring operation, and also the filtering, mixing and adding operations at the RX, can be implemented exactly. Since it is the actual, real received signal which gets squared, the output for the 0-th symbol, after the squaring and summing, can be expressed as:

where $r_m(t) = [\mathbf{r}(t)]_m$ is the complex equivalent received signal at the $m$-th antenna. Note that both $r_m(t)^2$ and $r_m^*(t)^2$ are high-pass signals with a carrier frequency of $2f_c$. This summed signal $r_{sq}(t)$ is then low-pass filtered (with a cut-off frequency of $2K/T_s$) to get:

where we use the orthogonality assumption for the array response vectors. Finally, $r_{\mathrm{LPF}}(t)$ is sampled by an ADC at a sampling rate of $4K/T_s$ samples/sec and conventional OFDM demodulation follows. Note that since $r_{\mathrm{LPF}}(t)$ is a real signal with maximum frequency $2K/T_s$, the ADC sampling rate must be at least $4K/T_s$ samples/sec to prevent aliasing. However, it can be shown that the signal of interest, i.e., the product between the reference and data sub-carriers, only lies within the frequency range $K/T_s \leq |f| \leq (2K-g-1)/T_s$. Thus the same performance can also be obtained by replacing the low-pass filter by a band-pass filter whose pass-band starts at $K/T_s$, and using an ADC sampling rate of $2K/T_s$ samples/sec.

III. Analysis of the demodulation outputs

Inspired by our similar analysis for the UWB FSR RX in (V. V. Ratnam, A. F. Molisch, A. Alasaad, F. Alawwad, and H. Behairy, "Bit and power allocation in QAM capable multi-differential frequency-shifted reference UWB radios," in IEEE Global Communications Conference (GLOBECOM), pp. 1-7, Dec 2017), the current section analyzes the OFDM demodulation outputs. The OFDM demodulated output for the $k$-th subcarrier of the 0-th symbol can be expressed as ($-2K \leq k \leq 2K-1$):

[0186] We shall express each demodulation output as $Y_k = S_k + Z_k$, where $S_k$, referred to as the signal component, involves the terms in (3.6) not containing the channel noise and $Z_k$, referred to as the noise component, contains the remaining terms. It can be verified from (3.5) and the expression for $s_{BB}(t)$ that only the demodulation outputs $\{Y_k \mid K \leq |k| < (2K - g)\}$ involve signal components. We shall therefore consider a sub-optimal, albeit simple, demodulation approach where only the outputs $\{Y_k \mid k \in \mathcal{K}\}$ are utilized for demodulating the data, and the noise components are treated as noise.

A. Signal component analysis

[0188] From (3.5)-(3.6), the signal components of the OFDM demodulated outputs for $k \in \mathcal{K}$ can be expressed as:

where the result follows from the sub-carrier allocation, which ensures that, despite the non-linear RX architecture, only the cross-product between the reference signal and the data on the $k$-th sub-carrier contributes to $S_k$, for $k \geq K$. Essentially, the MA-FSR RX utilizes the receive vector corresponding to the reference tone as weights to combine the received signal corresponding to the data, thus emulating maximal ratio combining of the contributions from the different antennas with imperfect channel estimates. Since this combining takes place via squaring in the analog domain, the proposed RX enables beamforming without channel estimation or use of phase-shifters. However, as is evident from (3.7), the signals from the $L$ multi-path components do not add up in-phase at the demodulation outputs. This is due to the fact that the reference signal and the $k$-th data stream pass through slightly different channels owing to the difference in modulating frequency. This leads to some amount of frequency-selective fading, as shall be explained later in Section IV.
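The way squaring and summing emulates maximal ratio combining can be illustrated with a noise-free, single-path baseband sketch. Several simplifications are assumed and are not taken from the source: the low-pass filtered squared signal is modeled as the squared magnitude of the complex envelope, phase noise and the cyclic prefix are ignored, and the channel is flat with one coefficient per antenna. All names and numeric values are hypothetical.

```python
import cmath
import math


def fsr_symbol(ref_amp, data, num_samples):
    """Baseband samples of reference (sub-carrier 0) plus data sub-carriers.

    data: dict mapping sub-carrier index -> complex data symbol.
    """
    out = []
    for n in range(num_samples):
        s = complex(ref_amp)
        for k, x in data.items():
            s += x * cmath.exp(2j * math.pi * k * n / num_samples)
        out.append(s)
    return out


def square_and_sum(channel, symbol):
    """Per-sample sum over antennas of |h_m * s|^2 (square-law, then adder)."""
    return [sum(abs(h * s) ** 2 for h in channel) for s in symbol]


def dft_bin(signal, k):
    """Normalized DFT coefficient of the combined signal at sub-carrier k."""
    n = len(signal)
    return sum(v * cmath.exp(-2j * math.pi * k * i / n)
               for i, v in enumerate(signal)) / n


if __name__ == "__main__":
    K = 4
    data = {4: 1 + 1j, 5: -1 + 0j, 6: 0.5 - 0.5j}      # K - g data symbols, g = 1
    channel = [0.5 * cmath.exp(0.7j * m) for m in range(8)]  # unknown phases
    combined = square_and_sum(channel, fsr_symbol(2.0, data, 4 * K))
    for k, x in data.items():
        print(k, dft_bin(combined, k), x)
```

In this sketch each data bin equals the transmitted symbol scaled by the reference amplitude and by the sum of the squared channel magnitudes over the antennas, even though the per-antenna phases were never estimated. With more than one path the contributions no longer add fully in phase, as noted above.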

[0189] B. Noise component analysis

[0190] From (3.6), the noise component of the OFDM demodulation output $Y_k$, for $k \in \mathcal{K}$, can be expressed as:

Note that the noise consists of both noise-noise cross-product and data-noise cross-product terms. Given the transmit data vector $\mathbf{x}$ and channel impulse response $\mathbf{h}(t)$, the conditional mean of the noise components can be computed as $E\{Z_k \mid \mathbf{x}, \mathbf{h}\} = 0$, where we use the fact that the noise process $\mathbf{n}(t)$ is stationary, has a zero mean and that the residual term equals 0 for $k > 0$. Similarly, the conditional second order statistics of the noise components can be computed as (detailed steps are given in Appendix B):

Xa. ( ¾) = aZ }

~ MS a b [(2K - b) NS + h 0 b 0 0 E,.]

From (3.9a)-(3.9b) we see that the noise components at the OFDM outputs are mutually correlated and are further dependent on the data vector $\mathbf{x}$. To reduce computational complexity, we consider the sub-optimal approach where the data on each sub-carrier is decoded independently. Under this assumption, the noise variances averaged over the transmit data vector $\mathbf{x}$ reduce to:

¾ fe (h) = M[(2K - fc)N 2

¾, fe (h) = 0, (3.10b)

where $k \in \mathcal{K}$ and we use the fact that the data streams have a zero mean and are mutually independent (see Section II). We shall henceforth approximate the noise components at the OFDM outputs $\{Z_k \mid k \in \mathcal{K}\}$ to be jointly Gaussian distributed. While this allows finding a lower bound to the system capacity, the accuracy of this assumption is also justified via simulations in Section V.

[0191] IV. Performance Analysis

[0192] Using (3.7) and (3.10), the effective SISO channel between the transmit data and the $k$-th demodulation output ($k \in \mathcal{K}$) can be expressed as in (3.11), whose noise term has the variance given in (3.10). We assume that at regular intervals the TX transmits pilot symbols, using which the RX can estimate the "fading coefficients" $\{\beta_{k,0} \mid k \in \mathcal{K}\}$. Similarly, using blanked pilots, i.e., symbols where no signal is transmitted, the RX can also estimate $N_0$. Note that since $\beta_{k,0}$, $N_0$ and $\beta_{0,0}$ are average channel parameters, they change slowly and can be tracked accurately and with low overhead. Henceforth, we shall assume the RX has perfect knowledge of these channel parameters for each channel realization $\mathbf{h}(t)$. These values are further assumed to be fed back to the TX, via a feedback channel, for bit and power allocation. Note that since these pilots are used only to estimate the average channel parameters and not the actual SIMO channel, the advantages of simplified channel estimation are still applicable for the MA-FSR RX. From the Gaussian assumption for $\{Z_k \mid k \in \mathcal{K}\}$ and from (3.10), note that (3.11) represents a parallel Gaussian channel across the sub-carriers. The effective SNR for the $k$-th demodulation output ($k \in \mathcal{K}$) is then given by (3.12).

[0193] Note that even though the RX does not explicitly perform channel estimation, we still observe a beamforming gain of $M$ in $\gamma_k(\mathbf{h}(t))$. However, since the different MPCs do not add up in phase at the RX and the noise power varies with $k$, the system suffers from frequency selective fading, which causes some loss in performance. Similarly, we define the instantaneous sum rate (iSR) as in (3.13), where we neglect the cyclic prefix overhead for convenience.

[0194] A. Power Allocation

[0195] Since both the signal and noise variances in (3.11) are affected by the transmit powers in a non-linear way, finding the iSR-maximizing power allocation to the data and reference tone is difficult. We shall therefore rely on the following sub-optimal solution.

[0196] Lemma 1: For any feasible power allocation $\{E_r, E_d^{(k)} \mid k \in \mathcal{K}\}$, we have:

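$\gamma_k(\mathbf{h}(t)) < 2\bar{\gamma}_k(\mathbf{h}(t))$ for all $k \in \mathcal{K}$, where $\bar{\gamma}_k(\mathbf{h}(t))$ is the effective SNR with:

[0197] Proof. Case 1: Let $E_r > E_s/2$. Then for any power allocation $\{E_r, E_d^{(k)} \mid k \in \mathcal{K}\}$ and any $k \in \mathcal{K}$ we have: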

where the $\ge$ step follows from the AM-GM inequality, and the $<$ step follows by noting that the total data power divided by $(E_s - E_r)$ is bounded by 1, and hence the right-hand side is a non-increasing function of $E_r$. From (3.5), it is clear that $\gamma_k(\mathbf{h}(t)) < \bar{\gamma}_k(\mathbf{h}(t))$.

[0198] Case 2: If, on the other hand, $E_r < E_s/2$, then from (3.12), we can write for any $k \in \mathcal{K}$:

theorem follows

As a consequence of Lemma 1, using $E_r = E_s/2$ can at worst cause a 3 dB loss in the SNR of the data streams. Note that the SNR expression in (3.12) can be approximated by an expression which is obtained by replacing [...] by $(E_s - E_r)/(K - g)$. Now using this approximate SNR instead of $\gamma_k(\mathbf{h}(t))$ in (3.13), with $E_r = E_s/2$ and a total data power of $E_s/2$, a sub-optimal iSR-maximizing power allocation for $\{E_d^{(k)} \mid k \in \mathcal{K}\}$ can be obtained by the water-filling algorithm. In fact, it can be shown that this allocation is optimal as [...] $\to 0$.

[15] W. Bär and F. Dittrich, "Useful formula for moment computation of normal random variables with nonzero means," IEEE Transactions on Automatic Control, vol. 16, pp. 263-265, Jun. 1971.
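The water-filling step mentioned above can be sketched as follows. This is a generic water-filling routine for a parallel Gaussian channel, not the patent's exact procedure: `gains` is a hypothetical list of per-sub-carrier SNRs per unit data power (standing in for the approximate SNR expression, which is not reproduced here), and `total_power` stands in for the data power budget $E_s/2$.

```python
import math


def water_filling(gains, total_power):
    """Allocate total_power over parallel Gaussian channels.

    gains[k] is the SNR obtained per unit power on channel k (assumed > 0).
    Returns p[k] = max(0, mu - 1/gains[k]), where the water level mu is
    chosen so that the powers sum to total_power.
    """
    inv = sorted(1.0 / g for g in gains)
    mu = 0.0
    # Try keeping the n best channels active, starting with all of them,
    # until the water level lies above the worst active channel.
    for n in range(len(inv), 0, -1):
        mu = (total_power + sum(inv[:n])) / n
        if mu > inv[n - 1]:
            break
    return [max(0.0, mu - 1.0 / g) for g in gains]


def sum_rate(gains, powers):
    """Sum of log2(1 + SNR) over the parallel channels."""
    return sum(math.log2(1.0 + g * p) for g, p in zip(gains, powers))


gains = [2.0, 1.0, 0.1]  # hypothetical per-unit-power SNRs
budget = 1.0             # hypothetical data power budget
wf_powers = water_filling(gains, budget)
equal_powers = [budget / len(gains)] * len(gains)
```

For these toy gains the weakest sub-carrier receives no power and the remaining budget is split so that the two stronger sub-carriers share a common water level; `sum_rate(gains, wf_powers)` is then at least as large as the rate with `equal_powers`.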

V. Simulation Results

For simulations we consider a SIMO system, where the RX has a half-wavelength spaced uniform linear array ($M = 64$) with one down-conversion chain and is equipped with an MA-FSR RX. The TX transmits OFDM symbols [...] GHz. The phase noise at the TX is modeled as a Wiener process with [...]. We consider a sample channel impulse response $\mathbf{h}(t)$ with: $L = 3$, [...] ns, [...].

We assume perfect timing synchronization and perfect knowledge of $\{\beta_{k,0} \mid k \in \mathcal{K}\}$ at the RX. For this $\mathbf{h}(t)$, the symbol error rate (SER) for (3.6), obtained by Monte-Carlo simulations, is compared to the analytical SER for the effective channel (3.11) in Fig. 15. The perfect match between the analytical and simulation results validates our analysis and the effective channel model (3.11). Due to the frequency selective signal and noise powers, we also observe that the SER changes with L.

[0202] We next compare the analytical iSR of MA-FSR (3.13) to the iSR of a coherent RX with analog beamforming that only occupies half the bandwidth, i.e., $|f| < K/T_s$, in Fig. 16. For analog beamforming, we assume the use of statistical beamforming with perfect channel estimates, where the beamformer is [...], assume perfect phase noise cancellation, and do not include the channel estimation overhead. The results show that for $\beta_{0,0} E_s/N_0 \ge 10$ dB, MA-FSR suffers from an SNR loss of < 9 dB in comparison to analog beamforming. However, at lower values, the SNR loss increases significantly, as is also evident from (3.13). Note that $\beta_{0,0} E_s/N_0 = 10$ dB corresponds to a per sub-carrier SNR of around -10 dB without the RX beamforming gain, and thus indeed represents a scenario where the RX beamforming gain is essential. Furthermore, we observe that the performance of equal data power allocation is comparable to water-filling. However, these results depend on $L$. Larger values of $L$ intensify the frequency selective fading of MA-FSR, thereby possibly increasing the SNR loss in Fig. 16. Therefore, MA-FSR is better suited to sparse channels with very few MPCs.
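To put the role of the RX beamforming gain in this setup into numbers, the sketch below computes the array response of a half-wavelength spaced uniform linear array and the gain of a combiner matched to a single path. The function names and the 20 degree arrival angle are hypothetical; only $M = 64$ and the half-wavelength spacing are taken from the text above.

```python
import cmath
import math


def ula_response(num_antennas, angle_rad):
    """Array response of a half-wavelength spaced uniform linear array."""
    return [cmath.exp(1j * math.pi * m * math.sin(angle_rad))
            for m in range(num_antennas)]


def beamforming_gain(weights, response):
    """Power gain |w^H a|^2 / ||w||^2 of combiner w for array response a."""
    inner = sum(w.conjugate() * a for w, a in zip(weights, response))
    norm_sq = sum(abs(w) ** 2 for w in weights)
    return abs(inner) ** 2 / norm_sq


M = 64
a = ula_response(M, math.radians(20.0))  # hypothetical arrival angle
matched_gain = beamforming_gain(a, a)    # combiner matched to the path
# Using a single antenna only, for comparison.
single_gain = beamforming_gain([1.0 + 0j] + [0j] * (M - 1), a)
gain_db = 10.0 * math.log10(matched_gain / single_gain)
```

With $M = 64$ the matched combiner yields a power gain of $M$, roughly 18 dB over a single antenna, which is the order of gain needed to lift a per sub-carrier SNR of around -10 dB to a usable level.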

VI. Improved MA-FSR design

Note that the MA-FSR RX performance degrades significantly below a certain threshold SNR. This is mainly due to the noise enhancement resulting from the squaring operation, which leads to the large noise-noise cross term in (3.8). Since the transmit signal is mainly restricted to frequencies $|f| < \frac{1}{T_s}$ and $\frac{K}{T_s} < |f - f_c| < \frac{2K}{T_s}$ (ignoring phase noise), the impact of this noise enhancement can be significantly reduced by suppressing the noise at lower frequencies by a factor of $y_e$ using a filter, as illustrated in Fig. 17A. Using a similar analysis as presented here, it can be shown that the effective SNR on the $k$-th sub-carrier for this design (with no phase noise) is: [...]. While this design reduces the noise enhancement of

the RX, note that the RX still only has a 50% bandwidth efficiency. This efficiency can be boosted to ≈ 100% by using an alternate design where the squaring circuit is replaced by a multiplier, with one input being the received signal processed via a narrow bandpass filter that isolates only the reference sub-carrier, as illustrated in Fig. 17B. This design, called reference tone aided transmission, has been analyzed in detail in (V. V. Ratnam and A. F. Molisch, "Reference tone aided transmission for massive MIMO: Analog beamforming without CSI," in IEEE International Conference on Communications (ICC), pp. 1-7, May 2018). Note that while these designs show significant improvements in performance, implementing such sharp filters at the carrier frequency is difficult and may have to rely on carrier recovery techniques, thus increasing RX complexity.

VII. Conclusion

In this work, a novel non-coherent massive antenna RX is proposed that requires only a single down-conversion chain, can support high data rates, and can perform RX beamforming without phase shifters or channel estimation. The MA-FSR RX essentially uses the received signal for a reference tone to combine the received signal corresponding to the data, via a squaring operation at each antenna. The carefully designed sub-carrier allocation prevents inter-carrier interference. The analysis suggests that the effective channel between the sub-carrier inputs and the demodulated outputs behaves like a parallel Gaussian channel with frequency selective fading, where the frequency selectivity arises due to the modulating frequency mismatch between the reference and data sub-carriers and due to the varying noise levels. These varying noise levels arise due to the noise enhancement caused by the squaring operation at the RX. The simulation results show that MA-FSR suffers only ≈ 6 dB SNR loss in comparison to analog beamforming in sparse channels, as long as the mean received power is above a certain threshold. This threshold behavior is due to the noise enhancement caused by the squaring operation, and several improved FSR designs that can reduce its impact are also proposed.

Appendix 3.A

[0209] From (3.8), the conditional cross-covariance between the noise components at the $a$-th and $b$-th sub-carriers can further be computed as:

X a, a( x - h ) x MR n [ v— u]]]

where (1) follows from the fact that $n(t)$ is zero-mean Gaussian and therefore the odd moments of $n(t)$ are zero; (2) follows by using the identity [...] for any scalars $A$, $B$, and by ignoring the terms involving the pseudo-covariance of the circularly symmetric Gaussian noise; and (3) follows by defining [...] for any $1 \le \ell \le M$, using the results on the expectation of a product of four Gaussian random variables [15], and from the orthogonality of the array response vectors. Defining a new variable $w = v - u$ and using a change of variables, we can approximate the cross-covariance as:

_ _ j2n:{w~+m fc¾

+å%« &.¾,/¾/ « )I»M1

where (4) follows by assuming that the phase noise is constant within the support of the noise auto-correlation function $R_n[w]$, and (5) follows by changing the summation limits, since $R_n[w]$ has a very narrow support of around $O(1)$, and by defining [...]. Note that (4) is accurate as long as the system bandwidth is much larger than the phase noise bandwidth [16]. Now taking a summation over $u$, we obtain:

where we define $\mathcal{A}(a,b) \triangleq \{(k_1,k_2) \mid k_1, k_2 \in \mathcal{K},\ k_1 - k_2 = a - b\}$ and the remaining terms vanish since $|a - b| < K - 1 < k_1, k_2 \le 2K - g - 1$. Note that the sampled noise auto-correlation function can be expressed in terms of the power spectral density as: [...]. Substituting $R_n[w]$, we have:

(6 , y f g 1 - ¾

where (6) follows from the identity [...] and from using the fact that $S_n(f)$ is non-zero only in the range $0 < f < 2K/T_s$, and we define a restricted version of $\mathcal{A}(a,b)$ whose elements satisfy $k_2 > b$. Using a similar sequence of steps, the noise pseudo-covariance can be computed as:

where we define if,

[0210] While exemplary embodiments are described above, it is not intended that these embodiments describe all possible forms of the invention. Rather, the words used in the specification are words of description rather than limitation, and it is understood that various changes may be made without departing from the spirit and scope of the invention. Additionally, the features of various implementing embodiments may be combined to form further embodiments of the invention.