

Title:
ROOM CHARACTERIZATION AND CORRECTION FOR MULTI-CHANNEL AUDIO
Document Type and Number:
WIPO Patent Application WO/2012/154823
Kind Code:
A1
Abstract:
Devices and methods are adapted to characterize a multi-channel loudspeaker configuration, to correct loudspeaker/room delay, gain and frequency response or to configure sub-band domain correction filters. In an embodiment for characterizing a multi-channel loudspeaker configuration, a broadband probe signal is supplied to each audio output of an A/V preamplifier of which a plurality are coupled to loudspeakers in a multi-channel configuration in a listening environment. The loudspeakers convert the probe signal to acoustic responses that are transmitted in non-overlapping time slots separated by silent periods as sound waves into the listening environment. For each audio output that is probed, sound waves are received by a multi-microphone array that converts the acoustic responses to broadband electric response signals.

Inventors:
FEJZO ZORAN (US)
JOHNSTON JAMES D (US)
Application Number:
PCT/US2012/037081
Publication Date:
November 15, 2012
Filing Date:
May 09, 2012
Assignee:
DTS INC (US)
FEJZO ZORAN (US)
JOHNSTON JAMES D (US)
International Classes:
H04R5/02
Domestic Patent References:
WO2010036536A1   2010-04-01
WO1992010876A1   1992-06-25
Foreign References:
US20070121955A1   2007-05-31
US20050254662A1   2005-11-17
US20060083389A1   2006-04-20
US6760451B1   2004-07-06
US5757927A   1998-05-26
US7881482B2   2011-02-01
US20070025559A1   2007-02-01
US7630881B2   2009-12-08
US20060140418A1   2006-06-29
US20050053246A1   2005-03-10
US7158643B2   2007-01-02
Other References:
DE LA FUENTE ET AL.: "Time Varying Process Dynamics Study Based on Adaptive Multivariate AR Modelling.", HIGH TECHNICAL SCHOOL OF INDUSTRIAL ENGINEERING UNIVERSITY OF OVIEDO., 2010, XP008171572, Retrieved from the Internet [retrieved on 20120919]
TYAGI ET AL.: "On Variable Scale Piecewise Stationary Spectral Analysis of Speech Signals for ASR.", 11 September 2006 (2006-09-11), pages 1182 - 1191, XP005586244, Retrieved from the Internet [retrieved on 20120919]
See also references of EP 2708039A4
Attorney, Agent or Firm:
MOHINDRA, Gaurav, K. (Senior Counsel Intellectual Property,5220 Las Virgenes Rd, Calabasas CA, US)
Claims:
WE CLAIM:

1. A method for characterizing a multi-channel loudspeaker configuration, comprising:

producing a first probe signal;

supplying the first probe signal to a plurality of audio outputs coupled to respective electro-acoustic transducers positioned in a multi-channel configuration in a listening environment for converting the first probe signal to a first acoustic response and for sequentially transmitting the acoustic responses in non-overlapping time slots separated by silent periods as sound waves into the listening environment; and

for each said audio output,

receiving sound waves at a multi-microphone array comprising at least two non-coincident acousto-electric transducers, each converting the acoustic responses to first electric response signals;

deconvolving the first electric response signals with the first probe signal to determine a first room response for said electro-acoustic transducer at each said acousto-electric transducer;

computing and recording in memory a delay for said electro-acoustic transducer at each said acousto-electric transducer; and

recording the first room responses in memory for a specified period offset by the delay for said electro-acoustic transducer at each said acousto-electric transducer;

based on the delays to each said acousto-electro transducer, determining a distance and at least a first angle to each said electro-acousto transducer; and

using the distances and at least said first angles to the electro-acousto transducers, automatically selecting a particular multi-channel configuration and computing a position for each electro-acousto transducer in that multi-channel configuration within the listening environment.

2. The method of claim 1, wherein the step of computing the delay comprises:

processing each said first electric response signal and the first probe signal to generate a time sequence;

detecting an existence or absence of a pronounced peak in the time sequence as indicating whether the audio output is coupled to the electro-acoustic transducer; and

computing the position of the peak as the delay.

3. The method of claim 1, wherein the first electric response signal is partitioned into blocks and deconvolved with a partition of the first probe signal as the first electrical response is received at the acousto-electric transducers, and wherein the delay and first room response are computed and recorded to memory in the silent period prior to the transmission of the next probe signal.

4. The method of claim 3, wherein the step of deconvolving the partitioned first response signal with the partition of the first probe signal comprises:

pre-computing and storing a set of K partitioned N-point Fast Fourier Transforms (FFTs) of a time-reversed first probe signal of length K*N/2 for non-negative frequencies as a probe matrix;

computing an N-point FFT of successive overlapping blocks of N/2 samples of the first electrical response signal and storing the N/2+1 FFT coefficients for non-negative frequencies as a partition;

accumulating FFT partitions as a response matrix;

performing a fast convolution of the response matrix with the probe matrix to provide an N/2+1 point frequency response for the current block;

computing an N-point inverse FFT of the frequency response with conjugate symmetric extension to the negative frequencies to form a first candidate room response for the current block; and

appending the first candidate room responses for successive blocks to form the first room response.

5. The method of claim 4, wherein the step of estimating the delay comprises:

computing an N-point inverse FFT of the frequency response with the negative frequency values set to zero to produce a Hilbert Envelope (HE); and

tracking the maximum of the HE over successive blocks to update the computation of the delay.

6. The method of claim 5, further comprising:

supplying a second pre-emphasized probe signal to each of the plurality of audio outputs after the first probe signal to record second electrical response signals;

deconvolving overlapping blocks of the second response signals with the partition of the first probe signal to generate a sequence of second candidate room responses; and

using the delay for the first probe signal to append successive second candidate room responses to form the second room response.

7. The method of claim 1, wherein,

if said multi-microphone array comprises only two acousto-electric transducers, computing at least said first angle to electro-acoustic transducers located on a half-plane;

if said multi-microphone array comprises only three acousto-electric transducers, computing at least said first angle to electro-acoustic transducers located on a plane; and

if said multi-microphone array comprises four or more acousto-electric transducers, computing at least said first angle as an azimuth angle and an elevation angle to electro-acoustic transducers located in three dimensions.

8. A method for characterizing a listening environment for playback of multi-channel audio, comprising:

producing a first probe signal;

supplying the first probe signal to each of a plurality of electro-acoustic transducers positioned in a multi-channel configuration in a listening environment for converting the first probe signal to a first acoustic response and sequentially transmitting the acoustic responses in non-overlapping time slots as sound waves into the listening environment; and

for each said electro-acoustic transducer,

receiving the sound waves at a multi-microphone array comprising at least two non-coincident acousto-electric transducers each converting the acoustic responses to first electric response signals;

deconvolving the first electric response signals with the first probe signal to determine a room response for each electro-acoustic transducer;

for frequencies above a cut-off frequency, computing a first part of a room energy measure from the room responses as a function of sound pressure;

for frequencies below the cut-off frequency, computing a second part of the room energy measure from the room responses as a function of sound pressure and sound velocity;

blending the first and second parts of the energy measure to provide the room energy measure over the specified acoustic band; and

computing filter coefficients from the room energy measure.

9. The method of claim 8, wherein a processor computes the filter coefficients from the room energy measure.

10. The method of claim 9, further comprising the step of:

using the filter coefficients to configure a digital correction filter in a processor.

11. The method of claim 10, further comprising the steps of:

receiving a multi-channel audio signal;

decoding the multi-channel audio signal with a processor to form an audio signal for each said electro-acoustic transducer;

passing each of said audio signals through the corresponding digital correction filter to form a corrected audio signal; and

supplying each said corrected audio signal to the corresponding electro-acoustic transducer for converting the corrected audio signals to acoustic responses and transmitting the acoustic responses as sound waves into the listening environment.

12. The method of claim 8, further comprising:

progressively smoothing the room responses or the room energy measure so that greater smoothing is applied to higher frequencies.

13. The method of claim 12, wherein the step of progressively smoothing the room responses comprises applying a time varying filter to the room response in which the bandwidth of the low pass response of the filter becomes progressively smaller in time.

14. The method of claim 12, wherein the step of progressively smoothing the room energy measure comprises applying forward and backward frequency domain averaging with a variable forgetting factor.

15. The method of claim 8, wherein the second part of the energy measure is computed by,

computing a first energy component as a function of sound pressure from the room responses;

computing a pressure gradient from said room responses;

applying a frequency dependent weighting to the pressure gradient to calculate sound velocity components;

computing a second energy component from the sound velocity components; and

computing the second part of the energy measure as a function of the first and second energy components.

16. The method of claim 15, wherein the steps of computing the pressure gradient and applying the frequency dependent weighting to the pressure gradient to calculate the sound velocity components are integrally performed directly from the room responses.

17. The method of claim 15, wherein computing the first energy component comprises:

averaging the room responses for the at least two said acousto-electric transducers to compute an average frequency response; and

computing the first energy component from the average frequency response.

18. The method of claim 8, wherein said first probe signal is a broadband sequence characterized by a magnitude spectrum that is substantially constant over a specified acoustic band, further comprising:

producing a second probe signal, said second probe signal being a pre-emphasized sequence characterized by a pre-emphasis function with magnitude spectrum inversely proportional to frequency applied to a base-band sequence that provides an amplified magnitude spectrum over a low frequency portion of the specified acoustic band;

supplying the second probe signal to each of the electro-acoustic transducers for converting the second probe signals to second acoustic responses and transmitting the second acoustic responses in non-overlapping time slots as sound waves in the listening environment;

for each said electro-acoustic transducer,

receiving the sound waves at the multi-microphone array for said first and second probe signals with said at least two non-coincident acousto-electric transducers each converting the acoustic responses to first and second electric response signals as a measure of sound pressure;

deconvolving the first and second electric response signals with the first probe signal and the base-band sequence, respectively, to determine first and second room responses for each electro-acoustic transducer;

for frequencies above a cut-off frequency, computing a first part of a room energy measure from the first room responses as a function of sound pressure;

for frequencies below a cut-off frequency, computing a second part of the room energy measure from the second room responses as a function of sound pressure and sound velocity;

blending the first and second parts of the energy measure to provide the room energy measure over the specified acoustic band; and

computing filter coefficients from the room energy measure.

19. The method of claim 18, wherein the broadband sequence is the base-band sequence, said pre-emphasis function being applied to the base-band sequence to generate the pre-emphasized sequence.

20. The method of claim 19, wherein the broadband sequence comprises an all-pass sequence characterized by a magnitude spectrum that is substantially constant over the specified acoustic band and an autocorrelation sequence having a zero-lag value at least 30 dB above any non-zero lag value.

21. The method of claim 20, wherein the all-pass sequence is formed by,

generating a random number sequence between -π and +π;

applying overlapping high and low pass filters to smooth the random number sequence;

generating an all-pass probe signal in the frequency domain having unity magnitude and phase of the smoothed random number sequence;

performing an inverse FFT on the all-pass probe signal to form the all-pass sequence, and wherein the pre-emphasized sequence is formed by

applying the pre-emphasis function to the all-pass probe signal in the frequency domain to form a pre-emphasized probe signal in the frequency domain; and

performing an inverse FFT on the pre-emphasized probe signal to form the pre-emphasized sequence.

22. The method of claim 18, wherein the second part of the energy measure is computed by,

computing a first energy component as a function of sound pressure from the second room responses;

computing a pressure gradient from said second room responses;

computing sound velocity components from the pressure gradient;

computing a second energy component from the velocity components; and

computing the second part of the energy measure as a function of the first and second energy components.

23. The method of claim 22, wherein the first energy component is computed by,

computing an average pre-emphasized frequency response from the second room responses for the at least two said acousto-electric transducers;

applying a de-emphasis scaling to the pre-emphasized average frequency response; and

computing the first energy component from the average pre-emphasized frequency response.

24. The method of claim 22, wherein the steps of computing the pressure gradient and applying the frequency dependent weighting to the pressure gradient to calculate the sound velocity components are integrally performed directly from the room responses.

25. The method of claim 22, wherein the second part of the energy measure is the sum of the first and second energy components.

26. The method of claim 8, wherein the filter coefficients for each channel are computed by comparing the room energy measure to a channel target curve, further comprising applying frequency smoothing to the room energy measure to define the channel target curve.

27. The method of claim 26, further comprising:

averaging the channel target curves to form a common target curve; and

applying correction to each correction filter to compensate for difference between the channel and common target curves.

28. A method of generating correction filters for a multi-channel audio system, comprising:

providing a P-band oversampled analysis filter bank that downsamples an audio signal to base-band for P sub-bands and a P-band oversampled synthesis filter bank that upsamples the P sub-bands to reconstruct the audio signal where P is an integer;

providing a spectral measure for each channel;

combining each said spectral measure with a channel target curve to provide an aggregate spectral measure per channel;

for at least one channel,

extracting portions of the aggregate spectral measure that correspond to different sub-bands;

remapping the extracted portions of the spectral measure to base-band to mimic the downsampling of the analysis filter bank;

estimating an auto regressive (AR) model to the remapped spectral measure for each sub-band; and

mapping coefficients of each said AR model to coefficients of a minimum-phase all-zero sub-band correction filter; and

configuring P digital all-zero sub-band correction filters from the corresponding coefficients that frequency correct the P base band audio signals between the analysis and synthesis filter banks.

29. The method of claim 28, wherein the spectral measure comprises a room spectral measure.

30. The method of claim 28, wherein the P sub-bands are of uniform bandwidth and overlapping.

31. The method of claim 28, wherein the spectral measure has progressively less resolution at higher frequencies.

32. The method of claim 28, wherein the AR model is computed by,

computing an autocorrelation sequence as an inverse FFT of the remapped spectral measure; and

applying a Levinson-Durbin algorithm to the autocorrelation sequence to compute the AR model.

33. The method of claim 32, wherein the Levinson-Durbin algorithm produces residual power estimates for the sub-bands, further comprising:

selecting an order for the correction filter based on the residual power estimate for the sub-band.

34. The method of claim 28, wherein the channel target curve is a user selected target curve.

35. The method of claim 28, further comprising applying frequency smoothing to the channel room spectral response to define the channel target curve.

36. The method of claim 28, further comprising:

providing a common target curve for all said channels; and

applying correction to each correction filter to compensate for difference between the channel and common target curves.

37. The method of claim 36, further comprising averaging the channel target curves to form the common target curve.

38. A device for processing multi-channel audio, comprising:

a plurality of audio outputs for driving respective electro-acoustic transducers coupled thereto, said electro-acoustic transducers positioned in a multi-channel configuration in a listening environment;

one or more audio inputs for receiving first electric response signals from a plurality of acousto-electro transducers coupled thereto;

an input receiver coupled to the one or more audio inputs for receiving the plurality of first electric response signals;

device memory; and

one or more processors adapted to implement,

a probe generating and transmission scheduling module adapted to, produce a first probe signal and

supply the first probe signal to each of the plurality of audio outputs in non-overlapping time slots separated by silent periods;

a room analysis module adapted to,

for each said audio output, deconvolve the first electric response signals with the first probe signal to determine a first room response at each said acousto-electric transducer, compute and record in the device memory a delay at each said acousto-electric transducer and record the first room responses in the device memory for a specified period offset by the delay at each said acousto-electric transducer,

based on the delays at each said acousto-electro transducer for each said electro-acoustic transducer, determine a distance and at least a first angle to the electro-acousto transducer, and

using the distances and at least the first angles to the electro-acousto transducers, automatically select a particular multi-channel configuration and compute a position for each electro-acousto transducer in that multi-channel configuration within the listening environment.

39. The device of claim 38, wherein the room analysis module is adapted to partition the first electric response signal into overlapping blocks and deconvolve each block with a partition of the first probe signal as the first electrical response is received and to compute and record the delay and first room response in the silent period prior to the transmission of the next probe signal.

40. A device for processing multi-channel audio, comprising:

a plurality of audio outputs for driving respective electro-acoustic transducers coupled thereto;

one or more audio inputs for receiving first electric response signals from at least two non-coincident acousto-electro transducers coupled thereto;

an input receiver coupled to the one or more audio inputs for receiving the plurality of first electric response signals;

device memory; and

one or more processors adapted to implement,

a probe generating and transmission scheduling module adapted to, produce a first probe signal, and

supply the first probe signal to each of the plurality of audio outputs in non-overlapping time slots separated by silent periods;

a room analysis module adapted to, for each said electro-acoustic transducer,

deconvolve the first electric response signals with the first probe signal to determine a room response at each acousto-electric transducer for the electro-acoustic transducer;

for frequencies above a cut-off frequency, compute a first part of a room energy measure from the room responses as a function of sound pressure; for frequencies below the cut-off frequency, compute a second part of the room energy measure from the room responses as a function of sound pressure and sound velocity;

blend the first and second parts of the energy measure to provide the room energy measure over the specified acoustic band; and

compute filter coefficients from the room energy measure.

41. The device of claim 40, wherein said first probe signal is a broadband sequence characterized by a magnitude spectrum that is substantially constant over a specified acoustic band, and wherein the probe generating and transmission scheduling module is adapted to produce and supply a second probe signal to each of the electro-acoustic transducers, said second probe signal being a pre-emphasized sequence characterized by a pre-emphasis function with magnitude spectrum inversely proportional to frequency applied to a base-band sequence that provides an amplified magnitude spectrum over a low frequency portion of the specified acoustic band, and wherein the analysis module is adapted to convert acoustic responses for the second probe signals into second electric response signals and deconvolve those second electric response signals with the base-band sequence to determine second room responses at each acousto-electric transducer for the electro-acoustic transducer, and for frequencies above the cut-off frequency, compute a first part of the room energy measure from the first room responses as a function of sound pressure and for frequencies below the cut-off frequency, compute the second part of the room energy measure from the second room responses as a function of sound pressure and sound velocity, and blend the first and second parts of the energy measure to provide the room energy measure over the specified acoustic band.

42. The device of claim 41, wherein the analysis module is adapted to compute the second part of the energy measure by,

computing a first energy component as a function of sound pressure from the second room responses;

estimating a pressure gradient from said second room responses;

estimating sound velocity components from the pressure gradient;

computing a second energy component from the sound velocity components; and

computing the second part of the energy measure as a function of the first and second energy components.

43. A device for generating correction filters for a multi-channel audio system, comprising:

one or more processors adapted to implement, for at least one audio channel,

a playback module adapted to provide a P-band oversampled analysis filter bank that downsamples an audio signal to base-band for P sub-bands, P minimum-phase all-zero sub-band correction filters, and a P-band oversampled synthesis filter bank that upsamples the P sub-bands to reconstruct the audio signal where P is an integer, and

an analysis module adapted to combine a spectral measure with a channel target curve to provide an aggregate spectral measure, extract and remap portions of the aggregate spectral measure that correspond to different sub-bands to base-band to mimic the downsampling of the analysis filter bank, compute an auto regressive (AR) model to the remapped spectral measure for each sub-band, and map coefficients of each said AR model to the coefficients of the corresponding minimum-phase all-zero sub-band correction filter in the playback module.

44. The device of claim 43, wherein the analysis module computes the AR model by,

computing an autocorrelation sequence as an inverse FFT of the remapped spectral measure; and

applying a Levinson-Durbin algorithm to the autocorrelation sequence to compute the AR model.

45. A method of characterizing a listening environment, comprising:

producing a first probe signal, said first probe signal being a broadband sequence characterized by a magnitude spectrum that is substantially constant over a specified acoustic band and an autocorrelation sequence having a zero-lag value at least 30 dB above any non-zero lag value;

producing a second probe signal, said second probe signal being a pre-emphasized sequence characterized by a pre-emphasis function applied to a baseband sequence that provides an amplified magnitude spectrum over a specified target band that overlaps the specified acoustic band;

supplying the first and second probe signals to each of a plurality of electro-acoustic converters in a multi-channel audio system for converting the first and second probe signals to first and second acoustic responses and sequentially transmitting the acoustic responses in non-overlapping time slots as sound waves in a listening environment; and

for each said electro-acoustic converter,

receiving the sound waves at one or more acousto-electric transducers for converting the acoustic responses to first and second electric response signals;

deconvolving the first and second electric response signals to determine first and second room responses;

for frequencies outside the target band, computing a first spectral measure from the first room response;

for frequencies in the target band, computing a second spectral measure from the second response;

blending the first and second spectral measures to provide a spectral measure over the specified acoustic band.

46. The method of claim 45, wherein the first probe signal's broadband sequence provides the baseband sequence for the second probe signal.
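By way of illustration only, and not as part of the claims, the all-pass probe sequence recited in claims 20, 21 and 45 may be sketched as follows. The sketch assumes NumPy, omits the smoothing of the random phase by overlapping high and low pass filters, and measures the autocorrelation circularly (periodically); the function names, the sequence length of 4096 and the seed are hypothetical choices that do not appear in the specification.

```python
import numpy as np

def allpass_probe(n_samples, seed=0):
    """Return a real sequence with a flat (unity) magnitude spectrum."""
    rng = np.random.default_rng(seed)
    n_bins = n_samples // 2 + 1                 # non-negative frequency bins
    phase = rng.uniform(-np.pi, np.pi, n_bins)  # random phase between -pi and +pi
    phase[0] = 0.0                              # DC bin must be real
    if n_samples % 2 == 0:
        phase[-1] = 0.0                         # Nyquist bin must be real
    spectrum = np.exp(1j * phase)               # unity magnitude, random phase
    # irfft applies the conjugate-symmetric extension to negative frequencies
    return np.fft.irfft(spectrum, n=n_samples)

def peak_to_sidelobe_db(x):
    """Zero-lag value over the largest non-zero-lag value, circular autocorrelation."""
    power = np.abs(np.fft.rfft(x)) ** 2
    acf = np.fft.irfft(power, n=len(x))         # circular autocorrelation
    sidelobe = np.max(np.abs(acf[1:]))
    return 10.0 * np.log10(acf[0] / max(sidelobe, 1e-300))

probe = allpass_probe(4096)                     # hypothetical length
margin_db = peak_to_sidelobe_db(probe)
```

Because the magnitude spectrum is exactly flat, the circular autocorrelation of such a sequence is an impulse up to rounding error, so the zero-lag margin comfortably exceeds 30 dB. A linear (non-periodic) autocorrelation, or a sequence built from smoothed phase, would need to be checked separately.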

Description:
ROOM CHARACTERIZATION AND CORRECTION FOR MULTI-CHANNEL AUDIO

Field of the Invention

This invention is directed to a multi-channel audio playback device and method, and more particularly to a device and method adapted to characterize a multi-channel loudspeaker configuration and correct loudspeaker/room delay, gain and frequency response.

Description of the Related Art

Home entertainment systems have moved from simple stereo systems to multi-channel audio systems, such as surround sound systems and more recently 3D sound systems, and to systems with video displays. Although these home entertainment systems have improved, room acoustics still suffer from deficiencies such as sound distortion caused by reflections from surfaces in a room and/or non-uniform placement of loudspeakers in relation to a listener. Because home entertainment systems are widely used in homes, improvement of acoustics in a room is a concern for home entertainment system users to better enjoy their preferred listening environment.

"Surround sound" is a term used in audio engineering to refer to sound reproduction systems that use multiple channels and speakers to provide a listener positioned between the speakers with a simulated placement of sound sources. Sound can be reproduced with a different delay and at different intensities through one or more of the speakers to "surround" the listener with sound sources and thereby create a more interesting or realistic listening experience. A traditional surround sound system includes a two-dimensional configuration of speakers e.g. front, center, back and possibly side. The more recent 3D sound systems include a three-dimensional configuration of speakers. For example, the configuration may include high and low front, center, back or side speakers. As used herein a multi-channel speaker configuration encompasses stereo, surround sound and 3D sound systems.

Multi-channel surround sound is employed in movie theater and home theater applications. In one common configuration, the listener in a home theater is surrounded by five speakers instead of the two speakers used in a traditional home stereo system. Of the five speakers, three are placed in the front of the room, with the remaining two surround speakers located to the rear or sides (THX® dipolar) of the listening/viewing position. A new configuration is to use a "sound bar" that comprises multiple speakers that can simulate the surround sound experience. Among the various surround sound formats in use today, Dolby Surround® is the original surround format, developed in the early 1970's for movie theaters. Dolby Digital® made its debut in 1996. Dolby Digital® is a digital format with six discrete audio channels and overcomes certain limitations of Dolby Surround® that relies on a matrix system that combines four audio channels into two channels to be stored on the recording media. Dolby Digital® is also called a 5.1-channel format and was universally adopted several years ago for film-sound recording. Another format in use today is DTS Digital Surround™ that offers higher audio quality than Dolby Digital® (1,411,200 versus 384,000 bits per second) as well as many different speaker configurations e.g. 5.1, 6.1, 7.1, 11.2 etc. and variations thereof e.g. 7.1 Front Wide, Front Height, Center Overhead, Side Height or Center Height. For example, DTS-HD® supports seven different 7.1 channel configurations on Blu-Ray® discs.

The audio/video preamplifier (or A/V controller or A/V receiver) handles the job of decoding the two-channel Dolby Surround®, Dolby Digital®, or DTS Digital Surround™ or DTS-HD® signal into the respective separate channels. The A/V preamplifier output provides six line level signals for the left, center, right, left surround, right surround, and subwoofer channels, respectively. These separate outputs are fed to a multiple-channel power amplifier or, as is the case with an integrated receiver, are internally amplified, to drive the home-theater loudspeaker system.

Manually setting up and fine-tuning the A/V preamplifier for best performance can be demanding. After connecting a home-theater system according to the owners' manuals, the preamplifier or receiver has to be configured for the loudspeaker setup. For example, the A/V preamplifier must know the specific surround sound speaker configuration in use. In many cases the A/V preamplifier only supports a default output configuration; if the user cannot place the 5.1 or 7.1 speakers at those locations he or she is simply out of luck. A few high-end A/V preamplifiers support multiple 7.1 configurations and let the user select from a menu the appropriate configuration for the room. In addition, the loudness of each of the audio channels (the actual number of channels being determined by the specific surround sound format in use) should be individually set to provide an overall balance in the volume from the loudspeakers. This process begins by producing a "test signal" in the form of noise sequentially from each speaker and adjusting the volume of each speaker independently at the listening/viewing position. The recommended tool for this task is the Sound Pressure Level (SPL) meter. This provides compensation for different loudspeaker sensitivities, listening-room acoustics, and loudspeaker placements. Other factors, such as an asymmetric listening space and/or angled viewing area, windows, archways and sloped ceilings, can make calibration much more complicated. It would therefore be desirable to provide a system and process that automatically calibrates a multi-channel sound system by adjusting the frequency response, amplitude response and time response of each audio channel. It is moreover desirable that the process can be performed during the normal operation of the surround sound system without disturbing the listener.

U.S. patent no. 7,158,643 entitled "Auto-Calibrating Surround System" describes one approach that allows automatic and independent calibration and adjustment of the frequency, amplitude and time response of each channel of the surround sound system. The system generates a test signal that is played through the speaker and recorded by the microphone. The system processor correlates the received sound signal with the test signal and determines from the correlated signals a whitened response. U.S. patent publication no. 2007/0121955 entitled "Room Acoustics Correction Device" describes a similar approach.

The following is a summary of the invention in order to provide a basic understanding of some aspects of the invention. This summary is not intended to identify key or critical elements of the invention or to delineate the scope of the invention. Its sole purpose is to present some concepts of the invention in a simplified form as a prelude to the more detailed description and the defining claims that are presented later.

The present invention provides devices and methods adapted to characterize a multi-channel loudspeaker configuration, to correct loudspeaker/room delay, gain and frequency response or to configure sub-band domain correction filters.

In an embodiment for characterizing a multi-channel loudspeaker configuration, a broadband probe signal is supplied to each audio output of an A/V preamplifier of which a plurality are coupled to loudspeakers in a multi-channel configuration in a listening environment. The loudspeakers convert the probe signal to acoustic responses that are transmitted in non-overlapping time slots separated by silent periods as sound waves into the listening environment. For each audio output that is probed, sound waves are received by a multi-microphone array that converts the acoustic responses to broadband electric response signals. In the silent period prior to the transmission of the next probe signal, a processor(s) deconvolves the broadband electric response signal with the broadband probe signal to determine a broadband room response at each microphone for the loudspeaker, computes and records in memory a delay at each microphone for the loudspeaker, records the broadband response at each microphone in memory for a specified period offset by the delay for the loudspeaker and determines whether the audio output is coupled to a loudspeaker. The determination of whether the audio output is coupled may be deferred until the room responses for each channel are processed. The processor(s) may partition the broadband electrical response signal as it is received and process the partitioned signal using, for example, a partitioned FFT to form the broadband room response. The processor(s) may compute and continually update a Hilbert Envelope (HE) from the partitioned signal. A pronounced peak in the HE may be used to compute the delay and to determine whether the audio output is coupled to a loudspeaker.

Based on the computed delays, the processor(s) determine a distance and at least a first angle (e.g. azimuth) to the loudspeaker for each connected channel. If the multi-microphone array includes two microphones, the processor(s) can resolve angles to loudspeakers positioned in a half-plane either to the front, either side or to the rear. If the multi-microphone array includes three microphones, the processor(s) can resolve angles to loudspeakers positioned in the plane defined by the three microphones to the front, sides and to the rear. If the multi-microphone array includes four or more microphones in a 3D arrangement, the processor(s) can resolve both azimuth and elevation angles to loudspeakers positioned in three-dimensional space. Using these distances and angles to the coupled loudspeakers, the processor(s) automatically select a particular multi-channel configuration and calculate a position for each loudspeaker within the listening environment.

In an embodiment for correcting loudspeaker/room frequency response, a broadband probe signal, and possibly a pre-emphasized probe signal, is or are supplied to each audio output of an A/V preamplifier of which at least a plurality are coupled to loudspeakers in a multi-channel configuration in a listening environment. The loudspeakers convert the probe signal to acoustic responses that are transmitted in non-overlapping time slots separated by silent periods as sound waves into the listening environment. For each audio output that is probed, sound waves are received by a multi-microphone array that converts the acoustic responses to electric response signals. A processor(s) deconvolves the electric response signal with the broadband probe signal to determine a room response at each microphone for the loudspeaker.

The processor(s) compute a room energy measure from the room responses.

The processor(s) compute a first part of the room energy measure for frequencies above a cut-off frequency as a function of sound pressure and a second part of the room energy measure for frequencies below the cut-off frequency as a function of sound pressure and sound velocity. The sound velocity is obtained from a gradient of the sound pressure across the microphone array. If a dual-probe signal comprising both broadband and pre-emphasized probe signals is utilized, the high frequency portion of the energy measure based only on sound pressure is extracted from the broadband room response and the low frequency portion of the energy measure based on both sound pressure and sound velocity is extracted from the pre-emphasized room response. The dual-probe signal may be used to compute the room energy measure without the sound velocity component, in which case the pre-emphasized probe signal is used for noise shaping. The processor(s) blend the first and second parts of the energy measure to provide the room energy measure over the specified acoustic band.
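As a hedged illustration of how a pressure gradient can yield sound velocity, the relations below are standard textbook acoustics and are not taken from this document. The symbols are assumptions introduced here: \rho_0 is the air density, c the speed of sound, d the spacing between two microphones along the x direction, P_1 and P_2 the pressures they measure, and P and V are RMS amplitudes at angular frequency \omega.

```latex
% Linearized momentum (Euler) equation at angular frequency \omega,
% with the gradient approximated by a finite difference across two microphones:
\mathbf{V}(\omega) = -\frac{1}{j\omega\rho_0}\,\nabla P(\omega),
\qquad
\frac{\partial P}{\partial x} \approx \frac{P_2(\omega) - P_1(\omega)}{d}

% Textbook acoustic energy density combining pressure and velocity terms:
E(\omega) = \frac{|P(\omega)|^2}{2\rho_0 c^2} + \frac{\rho_0\,|\mathbf{V}(\omega)|^2}{2}
```

The 1/\omega factor shows that measurement noise in the pressure difference is magnified at low frequencies, which is plausibly one reason a probe with extra low-frequency energy helps the velocity term.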

To obtain a more perceptually appropriate measurement, the room responses or room energy measure may be progressively smoothed to capture substantially the entire time response at the lowest frequencies and essentially only the direct path plus a few milliseconds of the time response at the highest frequencies. The processor(s) computes filter coefficients from the room energy measure, which are used to configure digital correction filters within the processor(s). The processor(s) may compute the filter coefficients for a channel target curve, user defined or a smoothed version of the channel energy measure, and may then adjust the filter coefficients to a common target curve, which may be user defined or an average of the channel target curves. The processor(s) pass audio signals through the corresponding digital correction filters and to the loudspeaker for playback into the listening environment.

In an embodiment for generating sub-band correction filters for a multi-channel audio system, a P-band oversampled analysis filter bank that downsamples an audio signal to base-band for P sub-bands and a P-band oversampled synthesis filter bank that upsamples the P sub-bands to reconstruct the audio signal, where P is an integer, are provided in a processor(s) in the A/V preamplifier. A spectral measure is provided for each channel. The processor(s) combine each spectral measure with a channel target curve to provide an aggregate spectral measure per channel. For each channel, the processor(s) extract portions of the aggregate spectral measure that correspond to different sub-bands and remap the extracted portions of the spectral measure to base-band to mimic the downsampling of the analysis filter bank. The processor(s) compute an auto-regressive (AR) model to the remapped spectral measure for each sub-band and map coefficients of each AR model to coefficients of a minimum-phase all-zero sub-band correction filter. The processor(s) may compute the AR model by computing an autocorrelation sequence as an inverse FFT of the remapped spectral measure and applying a Levinson-Durbin algorithm to the autocorrelation sequence to compute the AR model. The Levinson-Durbin algorithm produces residual power estimates for the sub-bands that may be used to select the order of the correction filter. The processor(s) configures P digital all-zero sub-band correction filters from the corresponding coefficients that frequency correct the P base band audio signals between the analysis and synthesis filter banks. The processor(s) may compute the filter coefficients for a channel target curve, user defined or a smoothed version of the channel energy measure, and may then adjust the filter coefficients to a common target curve, which may be an average of the channel target curves.
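The autocorrelation and Levinson-Durbin steps described above can be sketched as follows. This is a minimal illustration and not the patented implementation. It assumes a real, symmetric toy spectral measure rather than a remapped complex sub-band measure, and it assumes that the all-zero correction filter simply reuses the AR polynomial A(z) as its taps. The function names and the toy measure are hypothetical.

```python
import cmath
import math


def autocorrelation(power, num_lags):
    """Inverse DFT of a real, symmetric power spectrum, evaluated for a few lags."""
    n = len(power)
    return [
        sum(p * math.cos(2.0 * math.pi * k * lag / n) for k, p in enumerate(power)) / n
        for lag in range(num_lags)
    ]


def levinson_durbin(r, order):
    """Solve the AR normal equations.

    Returns (a, err): a[0..order] with a[0] == 1 is the prediction-error
    polynomial A(z); err[m] is the residual power after order m.
    """
    a = [1.0] + [0.0] * order
    err = [r[0]]
    for m in range(1, order + 1):
        acc = r[m] + sum(a[i] * r[m - i] for i in range(1, m))
        k = -acc / err[-1]  # reflection coefficient
        prev = a[:]
        for i in range(1, m):
            a[i] = prev[i] + k * prev[m - i]
        a[m] = k
        err.append(err[-1] * (1.0 - k * k))
    return a, err


# Toy spectral measure (assumption): power response of 1 / (1 - 0.5 z^-1) on 256 bins.
N = 256
measure = [1.0 / abs(1.0 - 0.5 * cmath.exp(-2j * math.pi * k / N)) ** 2 for k in range(N)]

r = autocorrelation(measure, num_lags=3)
fir_taps, residual_power = levinson_durbin(r, order=2)

# Applying the all-zero filter A(z) to the measure should flatten it.
corrected = [
    m * abs(sum(t * cmath.exp(-2j * math.pi * k * i / N) for i, t in enumerate(fir_taps))) ** 2
    for k, m in enumerate(measure)
]
```

In this toy case the residual power stops decreasing after order 1, which illustrates how the residual estimates could guide the choice of filter order.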

These and other features and advantages of the invention will be apparent to those skilled in the art from the following detailed description of preferred embodiments, taken together with the accompanying drawings, in which:

BRIEF DESCRIPTION OF THE DRAWINGS

Figures 1a and 1b are a block diagram of an embodiment of a multi-channel audio playback system and listening environment in analysis mode and a diagram of an embodiment of a tetrahedral microphone, respectively;

Figure 2 is a block diagram of an embodiment of a multi-channel audio playback system and listening environment in playback mode;

Figure 3 is a block diagram of an embodiment of a sub-band filter bank in playback mode adapted to correct deviations of the loudspeaker/room frequency response determined in analysis mode;

Figure 4 is a flow diagram of an embodiment of the analysis mode;

Figures 5a through 5d are time, frequency and autocorrelation sequences for an all-pass probe signal;

Figures 6a and 6b are a time sequence and magnitude spectrum of a pre-emphasized probe signal;

Figure 7 is a flow diagram of an embodiment for generating an all-pass probe signal and a pre-emphasized probe signal from the same frequency domain signal;

Figure 8 is a diagram of an embodiment for scheduling the transmission of the probe signals for acquisition;

Figure 9 is a block diagram of an embodiment for real-time acquisition processing of the probe signals to provide a room response and delays;

Figure 10 is a flow diagram of an embodiment for post-processing of the room response to provide the correction filters;

Figure 11 is a diagram of an embodiment of a room spectral measure blended from the spectral measures of a broadband probe signal and a pre-emphasized probe signal;

Figure 12 is a flow diagram of an embodiment for computing the energy measure for different probe signal and microphone combinations;

Figure 13 is a flow diagram of an embodiment for processing the energy measure to calculate frequency correction filters; and

Figures 14a through 14c are diagrams illustrating an embodiment for the extraction and remapping of the energy measure to base-band to mimic the downsampling of the analysis filter bank.

The present invention provides devices and methods adapted to characterize a multi-channel loudspeaker configuration, to correct loudspeaker/room delay, gain and frequency response or to configure sub-band domain correction filters. Various devices and methods are adapted to automatically locate the loudspeakers in space to determine whether an audio channel is connected, select the particular multi-channel loudspeaker configuration and position each loudspeaker within the listening environment. Various devices and methods are adapted to extract a perceptually appropriate energy measure that captures both sound pressure and velocity at low frequencies and is accurate over a wide listening area. The energy measure is derived from the room responses gathered by using a closely spaced non-coincident multi-microphone array placed in a single location in the listening environment and used to configure digital correction filters. Various devices and methods are adapted to configure sub-band correction filters for correcting the frequency response of an input multi-channel audio signal for deviations from a target response caused by, for example, room response and loudspeaker response. A spectral measure (such as a room spectral/energy measure) is partitioned and remapped to base-band to mimic the downsampling of the analysis filter bank. AR models are independently computed for each sub-band and the models' coefficients are mapped to all-zero minimum phase filters. Of note, the shapes of the analysis filters are not included in the remapping. The sub-band filter implementation may be configured to balance MIPS, memory requirements and processing delay and can piggyback on the analysis/synthesis filter bank architecture should one already exist for other audio processing.

Multi-Channel Audio Analysis and Playback System

Referring now to the drawings, figures 1a-1b, 2 and 3 depict an embodiment of a multi-channel audio system 10 for probing and analyzing a multi-channel speaker configuration 12 in a listening environment 14 to automatically select the multi-channel speaker configuration and position the speakers in the room, to extract a perceptually appropriate spectral (e.g. energy) measure over a wide listening area and to configure frequency correction filters, and for playback of a multi-channel audio signal 16 with room correction (delay, gain and frequency). Multi-channel audio signal 16 may be provided via a cable or satellite feed or may be read off a storage medium such as a DVD or Blu-Ray™ disc. Audio signal 16 may be paired with a video signal that is supplied to a television 18. Alternatively, audio signal 16 may be a music signal with no video signal.

Multi-channel audio system 10 comprises an audio source 20 such as a cable or satellite receiver or DVD or Blu-Ray™ player for providing multi-channel audio signal 16, an A/V preamplifier 22 that decodes the multi-channel audio signal into separate audio channels at audio outputs 24 and a plurality of loudspeakers 26 (electro-acoustic transducers) coupled to respective audio outputs 24 that convert the electrical signals supplied by the A/V preamplifier to acoustic responses that are transmitted as sound waves 28 into listening environment 14. Audio outputs 24 may be terminals that are hardwired to loudspeakers or wireless outputs that are wirelessly coupled to the loudspeakers. If an audio output is coupled to a loudspeaker the corresponding audio channel is said to be connected. The loudspeakers may be individual speakers arranged in a discrete 2D or 3D layout or sound bars each comprising multiple speakers configured to emulate a surround sound experience. The system also comprises a microphone assembly that includes one or more microphones 30 and a microphone transmission box 32. The microphone(s) (acousto-electric transducers) receive sound waves associated with probe signals supplied to the loudspeakers and convert the acoustic response to electric signals. Transmission box 32 supplies the electric signals to one or more of the A/V preamplifier's audio inputs 34 through a wired or wireless connection.

A/V preamplifier 22 comprises one or more processors 36 such as general purpose Computer Processing Units (CPUs) or dedicated Digital Signal Processor (DSP) chips that are typically provided with their own processor memory, system memory 38 and a digital-to-analog converter and amplifier 41 connected to audio outputs 24. In some system configurations, the D/A converter and/or amplifier may be separate devices. For example, the A/V preamplifier could output corrected digital signals to a D/A converter that outputs analog signals to a power amplifier. To implement analysis and playback modes of operation, various "modules" of computer program instructions are stored in memory, processor or system, and executed by the one or more processors 36.

A/V preamplifier 22 also comprises an input receiver 42 connected to the one or more audio inputs 34 to receive input microphone signals and provide separate microphone channels to the processor(s) 36. Microphone transmission box 32 and input receiver 42 are a matched pair. For example, the transmission box 32 may comprise microphone analog preamplifiers, A/D converters and a TDM (time domain multiplexer) or A/D converters, a packer and a USB transmitter, and the matched input receiver 42 may comprise an analog preamplifier and A/D converters, an SPDIF receiver and TDM demultiplexer or a USB receiver and unpacker. The A/V preamplifier may include an audio input 34 for each microphone signal. Alternately, the multiple microphone signals may be multiplexed to a single signal and supplied to a single audio input 34.

To support the analysis mode of operation (presented in Figure 4), the A/V preamplifier is provided with a probe generation and transmission scheduling module 44 and a room analysis module 46. As detailed in figures 5a-5d, 6a-6b, 7 and 8, module 44 generates a broadband probe signal, and possibly a paired pre-emphasized probe signal, and transmits the probe signals via D/A converter and amplifier 41 to each audio output 24 in non-overlapping time slots separated by silent periods according to a schedule. Each audio output 24 is probed whether the output is coupled to a loudspeaker or not. Module 44 provides the probe signal or signals and the transmission schedule to room analysis module 46. As detailed in figures 9 through 14, module 46 processes the microphone and probe signals in accordance with the transmission schedule to automatically select the multi-channel speaker configuration and position the speakers in the room, to extract a perceptually appropriate spectral (energy) measure over a wide listening area and to configure frequency correction filters (such as sub-band frequency correction filters). Module 46 stores the loudspeaker configuration and speaker positions and filter coefficients in system memory 38.

The number and layout of microphones 30 affects the analysis module's ability to select the multi-channel loudspeaker configuration and position the loudspeakers and to extract a perceptually appropriate energy measure that is valid over a wide listening area. To support these functions, the microphone layout provides a certain amount of diversity to "localize" the loudspeakers in two or three dimensions and to compute sound velocity. In general, the microphones are non-coincident and have a fixed separation. For example, a single microphone supports estimating only the distance to the loudspeaker. A pair of microphones supports estimating the distance to the loudspeaker and an angle such as the azimuth angle in half a plane (front, back or either side) and estimating the sound velocity in a single direction. Three microphones support estimating the distance to the loudspeaker and the azimuth angle in the entire plane (front, back and both sides) and estimating the sound velocity in a three-dimensional space. Four or more microphones positioned on a three-dimensional ball support estimating the distance to the loudspeaker and the azimuth and elevation angles in a full three-dimensional space and estimating the sound velocity in a three-dimensional space.

An embodiment of a multi-microphone array 48 for the case of a tetrahedral microphone array and for a specially selected coordinate system is depicted in Figure 1b. Four microphones 30 are placed at the vertices of a tetrahedral object ("ball") 49. All microphones are assumed to be omnidirectional, i.e., the microphone signals represent the pressure measurements at different locations. Microphones 1, 2 and 3 lie in the x,y plane with microphone 1 at the origin of the coordinate system and microphones 2 and 3 equidistant from the x-axis. Microphone 4 lies out of the x,y plane. The distance between each of the microphones is equal and denoted by d. The direction of arrival (DOA) indicates the sound wave direction of arrival (to be used for the localization process in Appendix A). The separation of the microphones "d" represents a trade-off of needing a small separation to accurately compute sound velocity up to 500 Hz to 1 kHz and a large separation to accurately position the loudspeakers. A separation of approximately 8.5 to 9 cm satisfies both requirements.
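The geometry above can be turned into a small localization sketch. This is not the procedure of Appendix A, which is not reproduced here. It assumes a far-field plane wave, a speed of sound of 343 m/s, microphone 4 placed above the x,y plane, and delays measured from a common transmit time. All function names are hypothetical.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed room-temperature value


def tetra_positions(d):
    """Microphones 1..4 for edge length d, following the layout in the text."""
    s3 = math.sqrt(3.0)
    return [
        (0.0, 0.0, 0.0),                           # mic 1 at the origin
        (d * s3 / 2.0, d / 2.0, 0.0),              # mic 2, in the x,y plane
        (d * s3 / 2.0, -d / 2.0, 0.0),             # mic 3, mirrored about the x-axis
        (d / s3, 0.0, d * math.sqrt(2.0 / 3.0)),   # mic 4, assumed above the plane
    ]


def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def locate(delays, d, c=SPEED_OF_SOUND):
    """Return (distance_m, azimuth_deg, elevation_deg) from four arrival delays."""
    mics = tetra_positions(d)
    rows = [list(mics[i]) for i in (1, 2, 3)]
    # Plane wave: delay_i - delay_1 = -(p_i . u) / c, with u pointing at the source.
    rhs = [-c * (delays[i] - delays[0]) for i in (1, 2, 3)]
    base = det3(rows)
    u = []
    for col in range(3):  # Cramer's rule for the 3x3 system rows * u = rhs
        m = [row[:] for row in rows]
        for r in range(3):
            m[r][col] = rhs[r]
        u.append(det3(m) / base)
    norm = math.sqrt(sum(v * v for v in u))
    azimuth = math.degrees(math.atan2(u[1], u[0]))
    elevation = math.degrees(math.asin(u[2] / norm))
    return c * delays[0], azimuth, elevation


def simulate_delays(distance, az_deg, el_deg, d, c=SPEED_OF_SOUND):
    """Synthetic plane-wave arrival times used only to exercise locate()."""
    az, el = math.radians(az_deg), math.radians(el_deg)
    u = (math.cos(el) * math.cos(az), math.cos(el) * math.sin(az), math.sin(el))
    return [(distance - sum(p * q for p, q in zip(mic, u))) / c
            for mic in tetra_positions(d)]


distance, azimuth, elevation = locate(simulate_delays(3.0, 30.0, 20.0, d=0.09), d=0.09)
```

With a 9 cm spacing the delay differences between microphones are only a few hundred microseconds, which is why sub-sample delay accuracy matters for the angle estimates.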

To support the playback mode of operation, the A/V preamplifier is provided with an input receiver/decoder module 52 and an audio playback module 54. Input receiver/decoder module 52 decodes multi-channel audio signal 16 into separate audio channels. For example, the multi-channel audio signal 16 may be delivered in a standard two-channel format. Module 52 handles the job of decoding the two-channel Dolby Surround®, Dolby Digital®, or DTS Digital Surround™ or DTS-HD® signal into the respective separate audio channels. Module 54 processes each audio channel to perform generalized format conversion and loudspeaker/room calibration and correction. For example, module 54 may perform up or down-mixing, speaker remapping or virtualization, apply delay, gain or polarity compensation, perform bass management and perform room frequency correction. Module 54 may use the frequency correction parameters (e.g. delay and gain adjustments and filter coefficients) generated by the analysis mode and stored in system memory 38 to configure one or more digital frequency correction filters for each audio channel. The frequency correction filters may be implemented in time domain, frequency domain or sub-band domain. Each audio channel is passed through its frequency correction filter and converted to an analog audio signal that drives the loudspeaker to produce an acoustic response that is transmitted as sound waves into the listening environment.

An embodiment of a digital frequency correction filter 56 implemented in the sub-band domain is depicted in Figure 3. Filter 56 comprises a P-band complex non-critically sampled analysis filter bank 58, a room frequency correction filter 68 comprising P minimum phase FIR (Finite Impulse Response) filters 62 for the P sub-bands and a P-band complex non-critically sampled synthesis filter bank 64 where P is an integer. As shown, room frequency correction filter 68 has been added to an existing filter architecture such as DTS NEO-X™ that performs the generalized remapping/virtualization functions 66 in the sub-band domain. The majority of computations in sub-band based room frequency correction lies in the implementation of the analysis and synthesis filter banks. The incremental increase of processing requirements imposed by the addition of room correction to an existing sub-band architecture such as DTS NEO-X™ is minimal.

Frequency correction is performed in the sub-band domain by passing an audio signal (e.g. input PCM samples) first through oversampled analysis filter bank 58, then in each band independently applying a minimum-phase FIR correction filter 62, suitably of different lengths, and finally applying synthesis filter bank 64 to create a frequency corrected output PCM audio signal. Because the frequency correction filters are designed to be minimum-phase, the sub-band signals even after passing through different length filters are still time aligned between the bands. Consequently the delay introduced by this frequency correction approach is solely determined by the delay in the chain of analysis and synthesis filter banks. In a particular implementation with 64-band over-sampled complex filter-banks this delay is less than 20 milliseconds.

Acquisition, Room Response Processing and Filter Construction

A high-level flow diagram for an embodiment of the analysis mode of operation is depicted in figure 4. In general, the analysis modules generate the broadband probe signal, and possibly a pre-emphasized probe signal, transmit the probe signals in accordance with a schedule through the loudspeakers as sound waves into the listening environment and record the acoustic responses detected at the microphone array. The modules compute a delay and room response for each loudspeaker at each microphone and each probe signal. This processing may be done in "real time" prior to the transmission of the next probe signal or offline after all the probe signals have been transmitted and the microphone signals recorded. The modules process the room responses to calculate a spectral (e.g. energy) measure for each loudspeaker and, using the spectral measure, calculate frequency correction filters and gain adjustments. Again this processing may be done in the silent period prior to the transmission of the next probe signal or offline. Whether the acquisition and room response processing is done in real-time or offline is a tradeoff of computations measured in millions of instructions per second (MIPS), memory and overall acquisition time and depends on the resources and requirements of a particular A/V preamplifier. The modules use the computed delays to each loudspeaker to determine a distance and at least an azimuth angle to the loudspeaker for each connected channel, and use that information to automatically select the particular multi-channel configuration and calculate a position for each loudspeaker within the listening environment.

Analysis mode starts by initializing system parameters and analysis module parameters (step 70). System parameters may include the number of available channels (NumCh), the number of microphones (NumMics) and the output volume setting based on microphone sensitivity, output levels etc. Analysis module parameters include the probe signal or signals S (broadband) and PeS (pre-emphasized) and a schedule for transmitting the signal(s) to each of the available channels. The probe signal(s) may be stored in system memory or generated when analysis is initiated. The schedule may be stored in system memory or generated when analysis is initiated. The schedule supplies the one or more probe signals to the audio outputs so that each probe signal is transmitted as sound waves by a speaker into the listening environment in non-overlapping time slots separated by silent periods. The extent of the silent period will depend at least in part on whether any of the processing is being performed prior to transmission of the next probe signal.
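A transmission schedule of this kind might be laid out as in the following sketch. The function name, the ordering of probes within a channel and the durations are all assumptions made for illustration; the source only requires non-overlapping slots separated by silent periods.

```python
def build_schedule(num_channels, probes, silence_s):
    """Lay out probe transmissions in non-overlapping slots.

    probes is a list of (name, duration_s) pairs. Every channel is probed,
    whether or not a loudspeaker is attached. Returns a list of
    (channel, probe_name, start_s, end_s) tuples.
    """
    schedule = []
    t = 0.0
    for channel in range(num_channels):
        for name, duration in probes:
            schedule.append((channel, name, t, t + duration))
            t += duration + silence_s  # silent period before the next probe
    return schedule


# Hypothetical values: three channels, two half-second probes, 0.25 s of silence.
schedule = build_schedule(num_channels=3, probes=[("S", 0.5), ("PeS", 0.5)], silence_s=0.25)
for channel, name, start, end in schedule:
    print(f"ch {channel}: {name:>3} from {start:.2f} s to {end:.2f} s")
```

Lengthening silence_s leaves more time for processing between probes at the cost of a longer overall acquisition.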

The first probe signal S is a broadband sequence characterized by a magnitude spectrum that is substantially constant over a specified acoustic band. Deviations from a constant magnitude spectrum within the acoustic band sacrifice Signal-to-Noise Ratio (SNR), which affects the characterization of the room and correction filters. A system specification may prescribe a maximum dB deviation from constant over the acoustic band. A second probe signal PeS is a pre-emphasized sequence characterized by a pre-emphasis function applied to a base-band sequence that provides an amplified magnitude spectrum over a portion of the specified acoustic band. The pre-emphasized sequence may be derived from the broadband sequence. In general, the second probe signal may be useful for noise shaping or attenuation in a particular target band that may partially or fully overlap the specified acoustic band. In a particular application, the magnitude of the pre-emphasis function is inversely proportional to frequency within a target band that overlaps a low frequency region of the specified acoustic band. When used in combination with a multi-microphone array the dual-probe signal provides a sound velocity calculation that is more robust in the presence of noise.
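One way to obtain two such probes from a single frequency-domain signal is sketched below. It is an illustration under stated assumptions rather than the generation procedure of Figure 7: random phases are used, the spectrum is flat over all bins instead of a restricted acoustic band, the pre-emphasis gain is cutoff_bin / k below a hypothetical cutoff bin, and the DC bin is left at unity.

```python
import cmath
import math
import random


def dft(x, inverse=False):
    """Naive O(N^2) DFT; adequate for a short illustrative sequence."""
    n = len(x)
    sign = 1.0 if inverse else -1.0
    out = []
    for k in range(n):
        acc = sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n) for t in range(n))
        out.append(acc / n if inverse else acc)
    return out


def make_probes(n, cutoff_bin, seed=0):
    """Return (S, PeS): an all-pass sequence and a low-frequency pre-emphasized one.

    n must be even.
    """
    rng = random.Random(seed)
    half = n // 2
    base = [0j] * n
    base[0] = 1.0 + 0j      # DC and Nyquist bins must be real for a real sequence
    base[half] = 1.0 + 0j
    for k in range(1, half):
        base[k] = cmath.exp(1j * rng.uniform(-math.pi, math.pi))  # unit magnitude
        base[n - k] = base[k].conjugate()                          # Hermitian symmetry
    emphasized = list(base)
    for k in range(1, half):
        gain = max(1.0, cutoff_bin / k)  # proportional to 1/f below the cutoff, unity above
        emphasized[k] *= gain
        emphasized[n - k] *= gain
    s = [v.real for v in dft(base, inverse=True)]
    pes = [v.real for v in dft(emphasized, inverse=True)]
    return s, pes


S, PeS = make_probes(n=32, cutoff_bin=4)
S_mag = [abs(v) for v in dft(S)]      # flat: every bin has magnitude 1
PeS_mag = [abs(v) for v in dft(PeS)]  # boosted below bin 4, flat above
```

Because both sequences share the same phase spectrum, deconvolving a response to PeS with S would leave the pre-emphasis gain superimposed on the result.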

The preamplifier's probe generation and transmission scheduling module initiates transmission of the probe signal(s) and capture of the microphone signal(s) P and PeP according to the schedule (step 72). The probe signal(s) (S and PeS) and captured microphone signal(s) (P and PeP) are provided to the room analysis module to perform room response acquisition (step 74). This acquisition outputs a room response, either a time-domain room impulse response (RIR) or a frequency-domain room frequency response (RFR), and a delay at each captured microphone signal for each loudspeaker.

In general, the acquisition process involves a deconvolution of the microphone signal(s) with the probe signal to extract the room response. The broadband microphone signal is deconvolved with the broadband probe signal. The pre-emphasized microphone signal may be deconvolved with the pre-emphasized probe signal or its base-band sequence, which may be the broadband probe signal. Deconvolving the pre-emphasized microphone signal with its base-band sequence superimposes the pre-emphasis function onto the room response. The deconvolution may be performed by computing an FFT (Fast Fourier Transform) of the microphone signal, computing an FFT of the probe signal, and dividing the microphone frequency response by the probe frequency response to form the room frequency response (RFR). The RIR is provided by computing an inverse FFT of the RFR. Deconvolution may be performed "off-line" by recording the entire microphone signal and computing a single FFT on the entire microphone signal and probe signal. This may be done in the silent period between probe signals; however, the duration of the silent period may need to be increased to accommodate the calculation. Alternately, the microphone signals for all channels may be recorded and stored in memory before any processing commences. Deconvolution may be performed in "real-time" by partitioning the microphone signal into blocks as it is captured and computing the FFTs on the microphone and probe signals based on the partition (see figure 9). The "real-time" approach tends to reduce memory requirements but increases the acquisition time.
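The "off-line" frequency-domain division can be sketched in a few lines. This is a minimal illustration, not the partitioned scheme of figure 9. It assumes a short toy probe whose spectrum has no zeros, models the room with circular convolution so that the division is exact, and uses a naive DFT in place of an FFT. The helper names are hypothetical.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive O(N^2) DFT standing in for an FFT in this short demo."""
    n = len(x)
    sign = 1.0 if inverse else -1.0
    out = []
    for k in range(n):
        acc = sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n) for t in range(n))
        out.append(acc / n if inverse else acc)
    return out


def circular_convolve(a, b):
    n = len(a)
    return [sum(a[j] * b[(i - j) % n] for j in range(n)) for i in range(n)]


def deconvolve(mic, probe):
    """Return (RFR, RIR): the spectral ratio and its inverse transform."""
    rfr = [m / p for m, p in zip(dft(mic), dft(probe))]
    rir = [v.real for v in dft(rfr, inverse=True)]
    return rfr, rir


# Toy data (assumptions): an 8-sample probe and a room response delayed by 2 samples.
probe = [1.0, 0.5, -0.25, 0.0, 0.0, 0.0, 0.0, 0.0]
true_rir = [0.0, 0.0, 1.0, 0.5, 0.0, 0.25, 0.0, 0.0]
mic = circular_convolve(probe, true_rir)

rfr, rir = deconvolve(mic, probe)
```

A probe with a flat magnitude spectrum keeps the division well conditioned in every bin, which is consistent with the preference for a constant-magnitude probe.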

Acquisition also entails computing a delay at each of the captured microphone signals for each loudspeaker. The delay may be computed from the probe signal and microphone signal using many different techniques including cross-correlation of the signals, cross-spectral phase or an analytic envelope such as a Hilbert Envelope (HE). The delay, for example, may correspond to the position of a pronounced peak in the HE (e.g. the maximum peak that exceeds a defined threshold). Techniques such as the HE that produce a time-domain sequence may be interpolated around the peak to compute a new location of the peak on a finer time scale with a fraction of a sampling interval time accuracy. The sampling interval time is the interval at which the received microphone signals are sampled, and should be chosen to be less than or equal to one half of the inverse of the maximum frequency to be sampled, as is known in the art.
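Interpolation around the peak can be done in several ways; the sketch below uses three-point parabolic interpolation, which is an assumption since the source does not name a method. The envelope values and the 48 kHz sampling rate are hypothetical and chosen so the true peak lies at 5.3 samples.

```python
def refine_peak(envelope):
    """Return the peak position in fractional samples.

    Fits a parabola through the maximum sample and its two neighbours.
    """
    i = max(range(len(envelope)), key=lambda n: envelope[n])
    if i == 0 or i == len(envelope) - 1:
        return float(i)  # no neighbour on one side; keep the integer index
    y0, y1, y2 = envelope[i - 1], envelope[i], envelope[i + 1]
    denom = y0 - 2.0 * y1 + y2
    if denom == 0.0:
        return float(i)  # flat top; nothing to refine
    return i + 0.5 * (y0 - y2) / denom


SAMPLE_RATE_HZ = 48000.0  # assumed sampling rate

# Toy envelope (assumption): a parabolic bump whose true maximum is at 5.3 samples.
envelope = [max(0.0, 10.0 - (n - 5.3) ** 2) for n in range(12)]

peak_samples = refine_peak(envelope)
delay_seconds = peak_samples / SAMPLE_RATE_HZ
```

For a real envelope the parabola is only an approximation near the peak, so the refined delay is an estimate rather than an exact value.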

Acquisition also entails determining whether the audio output is in fact coupled to a loudspeaker. If the terminal is not coupled, the microphone will still pick up and record any ambient signals but the cross-correlation/cross-spectral phase/analytic envelope will not exhibit a pronounced peak indicative of loudspeaker connection. The acquisition module records the maximum peak and compares it to a threshold. If the peak exceeds the threshold, the SpeakerActivityMask[nch] is set to true and the audio channel is deemed connected. This determination can be made during the silent period or off-line.
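The threshold test might look like the following sketch. The envelope values and the threshold of 0.5 are made up for illustration, and the function name is hypothetical; only the comparison of the maximum peak against a threshold comes from the text.

```python
def speaker_activity_mask(envelopes, threshold):
    """One flag per channel: True when the envelope's maximum exceeds the threshold."""
    return [max(envelope) > threshold for envelope in envelopes]


# Hypothetical per-channel envelopes (e.g. Hilbert Envelope magnitudes).
envelopes = [
    [0.02, 0.03, 0.91, 0.40, 0.05],  # pronounced peak: loudspeaker present
    [0.03, 0.02, 0.04, 0.03, 0.02],  # ambient noise only: nothing connected
    [0.01, 0.75, 0.30, 0.04, 0.02],  # pronounced peak: loudspeaker present
]
mask = speaker_activity_mask(envelopes, threshold=0.5)
```

In practice the threshold would have to be set relative to the measured noise floor and output level, which this sketch does not attempt.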

For each connected audio channel, the analysis module processes the room response (either the RIR or RFR) and the delays from each loudspeaker at each microphone and outputs a room spectral measure for each loudspeaker (step 76). This room response processing may be performed during the silent period prior to transmission of the next probe signal or off-line after all the probing and acquisition is finished. At its simplest, the room spectral measure may comprise the RFR for a single microphone, possibly averaged over multiple microphones and possibly blended to use the broadband RFR at higher frequencies and the pre-emphasized RFR at lower frequencies. Further processing of the room response may yield a more perceptually appropriate spectral response and one that is valid over a wider listening area.

There are several acoustical issues with standard rooms (listening environments) that affect how one may measure, calculate, and apply room correction beyond the usual gain/distance issues. To understand these issues, one should consider the perceptual issues. In particular, the role of "first arrival", also known as "precedence effect", in human hearing plays a role in the actual perception of imaging and timbre. In any listening environment aside from an anechoic chamber, the "direct" timbre, meaning the actual perceived timbre of the sound source, is affected by the first arrival (direct from speaker/instrument) sound and the first few reflections. After this direct timbre is understood, the listener compares that timbre to that of the reflected, later sound in a room. This, among other things, helps with issues like front/back disambiguation, because the comparison of the Head Related Transfer Function (HRTF) influence to the direct vs. the full-space power response of the ear is something humans know, and learn to use. A consideration is that if the direct signal has more high frequencies than a weighted indirect signal, it is generally heard as "frontal", whereas a direct signal that lacks high frequencies will localize behind the listener. This effect is strongest from about 2 kHz upward. Due to the nature of the auditory system, signals from a low frequency cutoff to about 500 Hz are localized via one method, and signals above that by another method.

In addition to the effects of high frequency perception due to first arrival, physical acoustics plays a large part in room compensation. Most loudspeakers do not have an overall flat power radiation curve, even if they do come close to that ideal for the first arrival. This means that a listening environment will be driven by less energy at high frequencies than it will be at lower frequencies. This, alone, would mean that if one were to use a long-term energy average for compensation calculation, one would be applying an undesirable pre-emphasis to the direct signal. Unfortunately, the situation is worsened by the typical room acoustics, because typically, at higher frequencies, walls, furniture, people, etc., will absorb more energy, which reduces the energy storage (i.e. T60) of the room, causing a long-term measurement to have even more of a misleading relationship to direct timbre.

As a result, our approach makes measurements in the scope of the direct sound, as determined by the actual cochlear mechanics, with a long measurement period at lower frequencies (due to the longer impulse response of the cochlear filters), and a shorter measurement period at high frequencies. The transition from lower to higher frequency is smoothly varied. This time interval can be approximated by the rule of t = 2/ERB bandwidth, where ERB is the equivalent rectangular bandwidth, until t reaches a lower limit of several milliseconds, at which time other factors in the auditory system suggest that the time should not be further reduced. This "progressive smoothing" may be performed on the room impulse response or on the room spectral measure. The progressive smoothing may also be performed to promote perceptual listening. Perceptual listening encourages listeners to process audio signals at the two ears.
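As a rough illustration of how the measurement period shrinks with frequency, the sketch below reads the rule as t = 2/ERB (time inversely proportional to bandwidth) with a floor of a few milliseconds. The ERB expression used here is the widely cited Glasberg and Moore approximation, which is an assumption since the text does not name an ERB formula; the 3 ms floor and the function names are also hypothetical.

```python
import numpy as np

def erb_hz(f_hz):
    """Equivalent rectangular bandwidth in Hz.

    Assumption: Glasberg and Moore approximation, not taken from the source.
    """
    return 24.7 * (4.37 * np.asarray(f_hz, dtype=float) / 1000.0 + 1.0)

def measurement_time_s(f_hz, t_min_s=0.003):
    """Measurement period t = 2 / ERB, limited below by t_min_s.

    t_min_s stands in for the "several milliseconds" lower limit and is a
    hypothetical value.
    """
    return np.maximum(2.0 / erb_hz(f_hz), t_min_s)
```

With these assumptions the period is several tens of milliseconds near 100 Hz and reaches the floor in the upper audio band.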

At low frequencies, i.e. long wavelengths, sound energy varies little over different locations as compared to the sound pressure or any axis of velocity alone. Using the measurements from a non-coincident multi-microphone array, the modules compute, at low frequencies, a total energy measure that takes into consideration not just sound pressure but also the sound velocity, preferably in all directions. By doing so, the modules capture the actual stored energy at low frequencies in the room from one point. This conveniently allows the A/V preamplifier to avoid radiating energy into a room at a frequency where there is excess storage, even if the pressure at the measurement point does not reveal that storage, as the pressure zero will be coincident with the maximum of the volume velocity. When used in combination with a multi-microphone array the dual-probe signal provides a room response that is more robust in the presence of noise.

The analysis module uses the room spectral (e.g. energy) measure to calculate frequency correction filters and gain adjustment for each connected audio channel and store the parameters in the system memory (step 78). Many different architectures including time domain filters (e.g. FIR or IIR), frequency domain filters (e.g. FIR implemented by overlap-add, overlap-save) and sub-band domain filters can be used to provide the loudspeaker/room frequency correction. Room correction at very low frequencies requires a correction filter with an impulse response that can easily reach a duration of several hundred milliseconds. In terms of required operations per cycle the most efficient way of implementing these filters would be in the frequency domain using overlap-save or overlap-add methods. Due to the large size of the required FFT the inherent delay and memory requirements may be prohibitive for some consumer electronics applications. Delay can be reduced at the price of an increased number of operations per cycle if a partitioned FFT approach is used. However this method still has high memory requirements. When the processing is performed in the sub-band domain it is possible to fine-tune the compromise between the required number of operations per cycle, the memory requirements and the processing delay. Frequency correction in the sub-band domain can efficiently utilize filters of different order in different frequency regions, especially if filters in very few sub-bands (as in the case of room correction with very few low frequency bands) have much higher order than filters in all other sub-bands. If captured room responses are processed using long measurement periods at lower frequencies and progressively shorter measurement periods towards higher frequencies, the room correction filtering requires even lower order filters as the filtering progresses from low to high frequencies. In this case a sub-band based room frequency correction filtering approach offers similar computational complexity as fast convolution using overlap-save or overlap-add methods; however, a sub-band domain approach achieves this with much lower memory requirements as well as much lower processing delay.

Once all of the audio channels have been processed, the analysis module automatically selects a particular multi-channel configuration for the loudspeakers and computes a position for each loudspeaker within the listening environment (step ). The module uses the delays from each loudspeaker to each of the microphones to determine a distance and at least an azimuth angle, and preferably an elevation angle to the loudspeaker in a defined 3D coordinate system. The module's ability to resolve azimuth and elevation angles depends on the number of microphones and diversity of received signals. The module readjusts the delays to correspond to a delay from the loudspeaker to the origin of the coordinate system. Based on given system electronics propagation delay, the module computes an absolute delay corresponding to air propagation from loudspeaker to the origin. Based on this delay and a constant speed of sound, the module computes an absolute distance to each loudspeaker.
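The delay-to-distance step amounts to simple arithmetic, sketched below. The function name, the default speed of sound of 343 m/s and the example sampling rate are assumptions for illustration only.

```python
def loudspeaker_distance_m(delay_samples, fs_hz, electronics_delay_s=0.0, c_m_s=343.0):
    """Distance from loudspeaker to the coordinate origin.

    delay_samples       : measured delay referred to the origin, in samples
    electronics_delay_s : known propagation delay through the electronics
    c_m_s               : assumed constant speed of sound (hypothetical default)
    """
    air_delay_s = delay_samples / fs_hz - electronics_delay_s
    return c_m_s * air_delay_s
```

For example, a 480-sample delay at an assumed 48 kHz sampling rate with no electronics delay corresponds to 10 ms of air propagation.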

Using the distance and angles of each loudspeaker the module selects the closest multi-channel loudspeaker configuration. Either due to the physical characteristics of the room or user error or preference, the loudspeaker positions may not correspond exactly with a supported configuration. A table of predefined loudspeaker locations, suitably specified according to industry standards, is saved in memory. The standard surround sound speakers lie approximately in the horizontal plane, e.g. elevation angle of roughly zero, and specify the azimuth angle. Any height loudspeakers may have elevation angles between, for example, 30 and 60 degrees. Below is an example of such a table.

Current industry standards specify about nine different layouts from mono to 5.1. DTS-HD® currently specifies four 6.1 configurations and seven 7.1 configurations.

As the industry moves towards 3D, more industry standard and DTS-HD® layouts will be defined. Given the number of connected channels and the distances and angle(s) for those channels, the module identifies individual speaker locations from the table and selects the closest match to a specified multi-channel configuration. The "closest match" may be determined by an error metric or by logic. The error metric may, for example, count the number of correct matches to a particular configuration or compute a distance (e.g. sum of the squared error) to all of the speakers in a particular configuration. Logic could identify one or more candidate configurations with the largest number of speaker matches and then determine based on any mismatches which candidate configuration is the most likely.
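A minimal sketch of the sum-of-squared-error variant of the "closest match" follows. The speaker table, the layout names, the angle values and the sign convention (negative azimuth to the left) are all hypothetical placeholders, not the table referred to in the text.

```python
# Hypothetical table: label -> (azimuth_deg, elevation_deg)
SPEAKER_TABLE = {"C": (0, 0), "L": (-30, 0), "R": (30, 0), "Cs": (180, 0)}

# Hypothetical candidate configurations
LAYOUTS = {"stereo": ["L", "R"], "3.0": ["L", "R", "C"], "3.0-rear": ["L", "R", "Cs"]}

def ang_diff(a_deg, b_deg):
    """Signed angular difference wrapped to [-180, 180)."""
    return (a_deg - b_deg + 180.0) % 360.0 - 180.0

def layout_error(measured, layout):
    """Sum over measured speakers of the squared angular error to the
    nearest speaker position in the layout."""
    total = 0.0
    for az, el in measured:
        total += min(
            ang_diff(az, SPEAKER_TABLE[name][0]) ** 2 + (el - SPEAKER_TABLE[name][1]) ** 2
            for name in layout
        )
    return total

def closest_layout(measured):
    """Pick the layout with the same channel count and the smallest error."""
    candidates = {k: v for k, v in LAYOUTS.items() if len(v) == len(measured)}
    if not candidates:
        candidates = LAYOUTS  # fall back to all layouts if no count matches
    return min(candidates, key=lambda name: layout_error(measured, candidates[name]))
```

A match-counting metric or the logic-based selection mentioned in the text could replace `layout_error` without changing the surrounding structure.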

The analysis module stores the delay and gain adjustments and filter coefficients for each audio channel in system memory (step 82).

The probe signal(s) may be designed to allow for an efficient and accurate measurement of the room response and a calculation of an energy measure valid over a wide listening area. The first probe signal is a broadband sequence characterized by a magnitude spectrum that is substantially constant over a specified acoustic band. Deviations from "constant" over the specified acoustic band produce a loss of SNR at those frequencies. A design specification will typically specify a maximum deviation in the magnitude spectrum over the specified acoustic band.

Probe Signals and Acquisition

One version of the first probe signal S is an all-pass sequence 100 as shown in Figure 5a. As shown in figure 5b, the magnitude spectrum 102 of an all-pass sequence APP is approximately constant (i.e. 0 dB) over all frequencies. This probe signal has a very narrow peak autocorrelation sequence 104 as shown in figures 5c and 5d. The narrowness of the peak is inversely proportional to the bandwidth over which the magnitude spectrum is constant. The autocorrelation sequence's zero-lag value is far above any non-zero lag values and does not repeat. How much depends on the length of the sequence. A sequence of 1,024 (2^10) samples will have a zero-lag value at least 30 dB above any non-zero lag values while a sequence of 65,536 (2^16) samples will have a zero-lag value at least 60 dB above any non-zero lag values. The lower the non-zero lag values the greater the noise rejection and the more accurate the delay. The all-pass sequence is such that during the room response acquisition process the energy in the room will be building up for all frequencies at the same time. This allows for a shorter probe length when compared to sweeping sinusoidal probes. In addition, all-pass excitation exercises loudspeakers closer to their nominal mode of operation. At the same time this probe allows for accurate full bandwidth measurement of loudspeaker/room responses allowing for a very quick overall measurement process. A probe length of 2^16 samples allows for a frequency resolution of 0.73 Hz.

The second probe signal may be designed for noise shaping or attenuation in a particular target band that may partially or fully overlap the specified acoustic band of the first probe signal. The second probe signal is a pre-emphasized sequence characterized by a pre-emphasis function applied to a base-band sequence that provides an amplified magnitude spectrum over a portion of the specified acoustic band. Because the sequence has an amplified magnitude spectrum (> 0 dB) over a portion of the acoustic band it will exhibit an attenuated magnitude spectrum (< 0 dB) over other portions of the acoustic band for energy conservation, hence is not suitable for use as the first or primary probe signal. One version of the second probe signal PeS as shown in figure 6a is a pre-emphasized sequence in which the pre-emphasis function applied to the base-band sequence is inversely proportional to frequency (c/ωd), where c is the speed of sound and d is the separation of the microphones, over a low frequency region of the specified acoustic band. Note, radial frequency ω = 2πf where f is in Hz. As the two are represented by a constant scale factor, they are used interchangeably. Furthermore, the functional dependency on frequency may be omitted for simplicity. As shown in figure 6b the magnitude spectrum 112 is inversely proportional to frequency. For frequencies less than 500 Hz, the magnitude spectrum is > 0 dB. The amplification is clipped at 20 dB at the lowest frequencies. The use of the second probe signal to compute the room spectral measure at low frequencies has the advantage of attenuating low frequency noise in the case of a single microphone and of attenuating low frequency noise in the pressure component and improving the computation of the velocity component in the case of a multi-microphone array.

There are many different ways to construct the first broadband probe signal and the second pre-emphasized probe signal. The second pre-emphasized probe signal is generated from a base-band sequence, which may or may not be the broadband sequence of the first probe signal. An embodiment of a method for constructing an all-pass probe signal and a pre-emphasized probe signal is illustrated in figure 7. In accordance with one embodiment of the invention, the probe signals are preferably constructed in the frequency domain by generating a random number sequence between -π and +π having a length of a power of 2 (step 120). There are many known techniques to generate a random number sequence; the MATLAB (Matrix Laboratory) "rand" function based on the Mersenne Twister algorithm may suitably be used in the invention to generate a uniformly distributed pseudo-random sequence. Smoothing filters (e.g. a combination of overlapping high-pass and low-pass filters) are applied to the random number sequence (step 121). The random sequence is used as the phase (φ) of a frequency response assuming an all-pass magnitude to generate the all-pass probe sequence S(f) in the frequency domain (step 122). The all-pass magnitude is |S(f)| = 1, where S(f) is conjugate symmetric (i.e. the negative frequency part is set to be the complex conjugate of the positive part). The inverse FFT of S(f) is calculated (step 124) and normalized (step 126) to produce the first all-pass probe signal S(n) in the time domain where n is a sample index in time. The frequency dependent (c/ωd) pre-emphasis function Pe(f) is defined (step 130) and applied to the all-pass frequency domain signal S(f) to yield PeS(f) (step 131). PeS(f) may be bound or clipped at the lowest frequencies (step 132). The inverse FFT of PeS(f) is calculated (step 134), examined to ensure that there are no serious edge-effects and normalized to have high level while avoiding clipping (step 136) to produce the second pre-emphasized probe signal PeS(n) in the time domain. The probe signal(s) may be calculated offline and stored in memory.
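The frequency-domain construction of the two probes can be sketched as below. This is an illustrative sketch under stated assumptions, not the patented procedure: it uses NumPy's random generator rather than MATLAB "rand", omits the smoothing filters of step 121 and the edge-effect check, and the values for sampling rate, microphone separation `d`, speed of sound `c` and the 20 dB clip are placeholder parameters.

```python
import numpy as np

def make_probes(n_fft=1024, fs=48000.0, c=343.0, d=0.02, max_gain_db=20.0, seed=0):
    """Return (all_pass_probe, pre_emphasized_probe), each peak-normalized."""
    rng = np.random.default_rng(seed)
    # Random phase for the non-negative frequency bins only; irfft supplies
    # the conjugate-symmetric negative half, so the time signal is real.
    phase = rng.uniform(-np.pi, np.pi, n_fft // 2 + 1)
    phase[0] = 0.0    # DC bin must be real
    phase[-1] = 0.0   # Nyquist bin must be real
    s_f = np.exp(1j * phase)                 # unit magnitude, i.e. all-pass
    s_n = np.fft.irfft(s_f, n=n_fft)

    # Pre-emphasis Pe = c / (omega * d), clipped at the lowest frequencies.
    omega = 2.0 * np.pi * np.fft.rfftfreq(n_fft, d=1.0 / fs)
    with np.errstate(divide="ignore"):
        pe = c / (omega * d)                 # infinite at DC before clipping
    pe = np.minimum(pe, 10.0 ** (max_gain_db / 20.0))
    pes_n = np.fft.irfft(pe * s_f, n=n_fft)

    # Normalize each probe to unit peak to avoid clipping on playback.
    return s_n / np.max(np.abs(s_n)), pes_n / np.max(np.abs(pes_n))
```

Because only the phase is random, the first probe keeps a flat magnitude spectrum after normalization, while the second probe carries the low-frequency boost.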

As shown in figure 8, in an embodiment the A/V preamplifier supplies the one or more probe signals, all-pass probe (APP) and pre-emphasized probe (PES) of duration (length) "P", to the audio outputs in accordance with a transmission schedule 140 so that each probe signal is transmitted as sound waves by a loudspeaker into the listening environment in non-overlapping time slots separated by silent periods. The preamplifier sends one probe signal to one loudspeaker at a time. In the case of dual probing, the all-pass probe APP is sent first to a single loudspeaker and after a predetermined silent period the pre-emphasized probe signal PES is sent to the same loudspeaker.

A silent period S is inserted between the transmission of the 1st and 2nd probe signals to the same speaker. A silent period S_1,2 and S_k,k+1 is inserted between the transmission of the 1st and 2nd probe signals between the 1st and 2nd loudspeakers and the kth and (k+1)th loudspeakers, respectively, to enable robust yet fast acquisition. The minimum duration of the silent period S is the maximum RIR length to be acquired. The minimum duration of the silent period S_1,2 is the sum of the maximum RIR length and the maximum assumed delay through the system. The minimum duration of the silent period S_k,k+1 is imposed by the sum of (a) the maximum RIR length to be acquired, (b) twice the maximum assumed relative delay between the loudspeakers and (c) twice the room response processing block length. Silence between the probes to different loudspeakers may be increased if a processor is performing the acquisition processing or room response processing in the silent periods and requires more time to finish the calculations. The first channel is suitably probed twice, once at the beginning and once after all other loudspeakers, to check for consistency in the delays. The total system acquisition length Sys_Acq_Len = 2*P + S + S_1,2 + N_LoudSpkrs*(2*P + S + S_k,k+1). With a probe length of 65536 and dual-probe test of 6 loudspeakers the total acquisition time can be less than 31 seconds. The methodology for deconvolution of captured microphone signals based on very long FFTs, as described previously, is suitable for off-line processing scenarios. In this case it is assumed that the pre-amplifier has enough memory to store the entire captured microphone signal and only after the capturing process is completed to start the estimation of the propagation delay and room response.
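The total acquisition length is plain arithmetic once the silent periods are fixed. The sketch below reads the expression as 2*P + S + S_1,2 + N_LoudSpkrs*(2*P + S + S_k,k+1), with all lengths in samples; that reading of the garbled formula, the function name and the parameter names are assumptions.

```python
def total_acquisition_samples(p, s, s_12, s_kk1, n_loudspeakers):
    """Total dual-probe acquisition length in samples.

    p      : probe length P
    s      : silent period between the two probes to the same loudspeaker
    s_12   : silent period between the 1st and 2nd loudspeakers
    s_kk1  : silent period between the kth and (k+1)th loudspeakers
    """
    return 2 * p + s + s_12 + n_loudspeakers * (2 * p + s + s_kk1)
```

Dividing the result by the sampling rate gives the acquisition time in seconds; at an assumed 48 kHz a 65536-sample probe alone lasts about 1.37 s.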

In DSP implementations of room response acquisition, to minimize the required memory and required duration of the acquisition process, the A/V preamplifier suitably performs the de-convolution and delay estimation in real-time while capturing the microphone signals. The methodology for real-time estimation of delays and room responses can be tailored for different system requirements in terms of trade-off between memory, MIPS and acquisition time requirements:

* The deconvolution of captured microphone signals is performed via a matched filter whose impulse response is a time-reversed probe sequence (i.e., for a 65536-sample probe we have a 65536-tap FIR filter). For reduction of complexity the matched filtering is done in the frequency domain and for reduction in memory requirements and processing delay the partitioned FFT overlap and save method is used with 50% overlap.

* In each block this approach yields a candidate frequency response that corresponds to a specific time portion of a candidate room impulse response. For each block an inverse FFT is performed to obtain a new block of samples of a candidate room impulse response (RIR).

* Also from the same candidate frequency response, by zeroing its values for negative frequencies, applying an IFFT to the result, and taking the absolute value of the IFFT, a new block of samples of an analytic envelope (AE) of the candidate room impulse response is obtained. In an embodiment the AE is the Hilbert Envelope (HE).

* The global peak (over all blocks) of the AE is tracked and its location is recorded.

* The RIR and AE are recorded starting a predetermined number of samples prior to the AE global peak location; this allows for fine-tuning of the propagation delay during room response processing.

* In every new block, if a new global peak of the AE is found, the previously recorded candidate RIR and AE are reset and recording of a new candidate RIR and AE is started.

* To reduce false detection the AE global peak search space is limited to expected regions; these expected regions for each loudspeaker depend on the assumed maximum delay through the system and the maximum assumed relative delays between the loudspeakers.
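The core of the steps above (matched filtering, envelope by zeroing negative frequencies, peak search) can be sketched in a simplified off-line form. This sketch deliberately uses one long FFT instead of the partitioned overlap-and-save scheme, searches the whole capture instead of restricted regions, and doubles the positive-frequency bins (the usual analytic-signal scaling, which the text does not mention); the function name is hypothetical.

```python
import numpy as np

def estimate_delay(probe, mic):
    """Return (peak_index, candidate_rir, envelope) for one microphone capture."""
    n_fft = 1 << (len(probe) + len(mic) - 2).bit_length()  # >= linear length
    # Filtering with the time-reversed probe equals cross-correlation, i.e.
    # multiplication by the conjugate probe spectrum (lag 0 lands at index 0).
    spec = np.fft.fft(mic, n_fft) * np.conj(np.fft.fft(probe, n_fft))
    rir = np.fft.ifft(spec).real

    # Analytic envelope: keep DC and Nyquist, double positive bins,
    # zero the negative-frequency bins, then take the magnitude.
    gain = np.zeros(n_fft)
    gain[0] = 1.0
    gain[1:n_fft // 2] = 2.0
    gain[n_fft // 2] = 1.0
    envelope = np.abs(np.fft.ifft(spec * gain))

    peak = int(np.argmax(envelope[:len(mic)]))  # non-negative lags only
    return peak, rir, envelope
```

The returned peak index is the integer-sample delay estimate; a threshold test on `envelope[peak]` would correspond to the loudspeaker activity check.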

Referring now to Figure 9, in a specific embodiment each successive block of N/2 samples (with a 50% overlap) is processed to update the RIR. An N-point FFT is performed on each block for each microphone to output a frequency response of length N×1 (step 150). The current FFT partition for each microphone signal (non-negative frequencies only) is stored in a vector of length (N/2+1)×1 (step 152). These vectors are accumulated on a first-in first-out (FIFO) basis to create a matrix Input_FFT_Matrix of K FFT partitions of dimension (N/2+1)×K (step 154). A set of partitioned FFTs (non-negative frequencies only) of a time-reversed broadband probe signal of length K*N/2 samples is pre-calculated and stored as a matrix Filt_FFT of dimensions (N/2+1)×K (step 156). A fast convolution using an overlap and save method is performed on the Input_FFT_Matrix with the Filt_FFT matrix to provide an (N/2+1)-point candidate frequency response for the current block (step 158). The overlap and save method multiplies the value in each frequency bin of the Filt_FFT matrix by the corresponding value in the Input_FFT_Matrix and averages the values across the K columns of the matrix. For each block an N-point inverse FFT is performed with conjugate symmetry extension for negative frequencies to obtain a new block of N/2×1 samples of a candidate room impulse response (RIR) (step 160). Successive blocks of candidate RIRs are appended and stored up to a specified RIR length (RIR_Length) (step 162).

Also from the same candidate frequency response, by zeroing its values for negative frequencies, applying an IFFT to the result, and taking the absolute value of the IFFT, a new block of N/2×1 samples of the HE of the candidate room impulse response is obtained (step 164). The maximum (peak) of the HE over the incoming blocks of N/2 samples is tracked and updated to track a global peak over all blocks (step 166). M samples of the HE around its global peak are stored (step 168). If a new global peak is detected, a control signal is issued to flush the stored candidate RIR and restart. The DSP outputs the RIR, HE peak location, and the M samples of the HE around its peak.

In an embodiment in which a dual-probe approach is used, the pre-emphasized probe signal is processed in the same manner to generate a candidate RIR that is stored up to RIR_Length (step 170). The location of the global peak of the HE for the all-pass probe signal is used to start accumulation of the candidate RIR. The DSP outputs the RIR for the pre-emphasized probe signal.

Room Response Processing

Once the acquisition process is completed the room responses are processed by a cochlear mechanics inspired time-frequency processing, where a longer part of the room response is considered at lower frequencies and progressively shorter parts of the room response are considered at higher and higher frequencies. This variable resolution time-frequency processing may be performed either on the time-domain RIR or the frequency-domain spectral measure.

An embodiment of the method of room response processing is illustrated in Figure 10. The audio channel indicator nch is set to zero (step 200). If the SpeakerActivityMask[nch] is not true (i.e. no more loudspeakers coupled) (step 202) the loop processing terminates and skips to the final step of adjusting all correction filters to a common target curve. Otherwise the process optionally applies variable resolution time-frequency processing to the RIR (step 204). A time varying filter is applied to the RIR. The time varying filter is constructed so that the beginning of the RIR is not filtered at all, but as the filter progresses in time through the RIR a low pass filter is applied whose bandwidth becomes progressively smaller with time.

An exemplary process for constructing and applying the time varying filter to the RIR is as follows:

* Leave the first few milliseconds of RIR unaltered (all frequencies present)

* Few milliseconds into the RIR start applying a time-varying low pass filter to the RIR

* The time variation of low-pass filter may be done in stages:

o each stage corresponds to the particular time interval within the RIR

o this time interval may be increased by a factor of 2x when compared to the time interval in the previous stage

o time intervals between two consecutive stages may be overlapping by 50% (of the time interval corresponding to the earlier stage)

o at each new stage the low pass filter may reduce its bandwidth by 50%

* The time interval at initial stages shall be around a few milliseconds.

* Implementation of the time varying filter may be done in the FFT domain using overlap-add methodology; in particular:

o extract a portion of the RIR corresponding to the current block

o apply a window function to the extracted block of RIR,

o apply an FFT to the current block,

o multiply with corresponding frequency bins of the same size FFT of the current stage low-pass filter

o compute an inverse FFT of the result to generate an output,

o extract a current block output and add the saved output from the previous block

o save the remainder of the output for combining with the next block

o These steps are repeated as the "current block" of the RIR slides in time through the RIR with a 50% overlap with respect to the previous block.

o The length of the block may increase at each stage (matching the duration of the time interval associated with the stage), stop increasing at a certain stage or be uniform throughout
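The staged overlap-add procedure above can be sketched as follows. This is a simplified illustration under several assumptions: the block length is uniform throughout, the window is a periodic Hann (which sums to one at 50% overlap), the stage low-pass filter is an ideal brick-wall mask rather than a designed filter, and the 4 ms initial interval and function name are hypothetical.

```python
import numpy as np

def time_varying_lowpass(rir, fs, block=256, t0_ms=4.0):
    """Low-pass the RIR with a cutoff that halves each time the elapsed
    time doubles; the part before t0_ms is left unaltered."""
    hop = block // 2
    n_fft = 2 * block
    window = 0.5 - 0.5 * np.cos(2.0 * np.pi * np.arange(block) / block)
    # Pad so every RIR sample is covered by exactly two overlapping blocks.
    x = np.concatenate([np.zeros(hop), np.asarray(rir, dtype=float), np.zeros(block)])
    y = np.zeros(len(x) + n_fft)
    freqs = np.fft.rfftfreq(n_fft, 1.0 / fs)
    t0 = t0_ms * 1e-3 * fs                      # initial interval in samples
    for start in range(0, len(x) - block + 1, hop):
        start_in_rir = start - hop              # undo the front padding
        # Stage 0 before t0; afterwards a new stage each time the time doubles.
        stage = 0 if start_in_rir < t0 else int(np.log2(start_in_rir / t0)) + 1
        cutoff = 0.5 * fs / 2 ** stage          # bandwidth halves per stage
        spectrum = np.fft.rfft(x[start:start + block] * window, n_fft)
        spectrum[freqs > cutoff] = 0.0          # brick-wall low-pass (assumption)
        y[start:start + n_fft] += np.fft.irfft(spectrum, n_fft)  # overlap-add
    return y[hop:hop + len(rir)]
```

A practical implementation would likely replace the brick-wall mask with the FFT of a proper low-pass design to limit time-domain ringing.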

The room responses for different microphones are realigned (step 206). In the case of a single microphone no realignment is required. If the room responses are provided in the time domain as a RIR, they are realigned such that the relative delays between RIRs in each microphone are restored and an FFT is calculated to obtain an aligned RFR. If the room responses are provided in the frequency domain as a RFR, realignment is achieved by a phase shift corresponding to the relative delay between microphone signals. The frequency response for each frequency bin k for the all-pass probe signal is H_k and for the pre-emphasized probe signal is H_k,pe, where the functional dependency on frequency has been omitted. A spectral measure is constructed from the realigned RFRs for the current audio channel (step 208). In general the spectral measure may be calculated in any number of ways from the RFRs including but not limited to a magnitude spectrum and an energy measure. As shown in Figure 11, the spectral measure 210 may blend a spectral measure 212 calculated from the frequency response H_k,pe for the pre-emphasized probe signal for frequencies below a cut-off frequency bin k_t and a spectral measure 214 from the frequency response H_k for the broadband probe signal for frequencies above the cut-off frequency bin k_t. In the simplest case, the spectral measures are blended by appending the H_k above the cut-off to the H_k,pe below the cut-off. Alternately, the different spectral measures may be combined as a weighted average in a transition region 216 around the cut-off frequency bin if desired.
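Both blending options (hard switch at the cut-off bin, or a weighted average in a transition region) can be sketched with one helper. The linear cross-fade shape, the `half_width` parameter and the function name are assumptions; the text only says "weighted average".

```python
import numpy as np

def blend_spectral_measures(low_measure, high_measure, k_t, half_width=0):
    """Combine two spectral measures around cut-off bin k_t.

    low_measure  : measure from the pre-emphasized probe (used below k_t)
    high_measure : measure from the broadband probe (used above k_t)
    half_width   : half the transition region in bins; 0 means a hard switch
    """
    low_measure = np.asarray(low_measure, dtype=float)
    high_measure = np.asarray(high_measure, dtype=float)
    k = np.arange(len(low_measure))
    if half_width == 0:
        w = (k >= k_t).astype(float)
    else:
        # Linear cross-fade from 0 to 1 across [k_t - half_width, k_t + half_width]
        w = np.clip((k - (k_t - half_width)) / (2.0 * half_width), 0.0, 1.0)
    return (1.0 - w) * low_measure + w * high_measure
```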

If variable resolution time-frequency processing was not applied to the room responses in step 204, variable resolution time-frequency processing may be applied to the spectral measure (step 220). A smoothing filter is applied to the spectral measure. The smoothing filter is constructed so that the amount of smoothing increases with frequency.

An exemplary process for constructing and applying the smoothing filter to the spectral measure comprises using a single pole low pass filter difference equation and applying it to the frequency bins. Smoothing is performed in 9 frequency bands (expressed in Hz): Band 1: 0-93.8, Band 2: 93.8-187.5, Band 3: 187.5-375, Band 4: 375-750, Band 5: 750-1500, Band 6: 1500-3000, Band 7: 3000-6000, Band 8: 6000-12000 and Band 9: 12000-24000. Smoothing uses forward and backward frequency domain averaging with a variable exponential forgetting factor. The variability of the exponential forgetting factor is determined by the bandwidth of the frequency band (Band_BW), i.e. Lambda = 1 - C/Band_BW with C being a scaling constant. When transitioning from one band to the next the value of Lambda is obtained by linear interpolation between the values of Lambda in these two bands.
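A minimal sketch of this forward and backward smoothing is given below. The band edges come from the text; the scaling constant C = 20, the choice to interpolate Lambda between band centres, and the function names are assumptions made for illustration.

```python
import numpy as np

BAND_EDGES_HZ = np.array([0.0, 93.8, 187.5, 375.0, 750.0, 1500.0,
                          3000.0, 6000.0, 12000.0, 24000.0])

def forgetting_factors(freqs_hz, c=20.0):
    """Per-bin Lambda = 1 - C / Band_BW, linearly interpolated between
    band centres (the interpolation anchor points are an assumption)."""
    band_bw = np.diff(BAND_EDGES_HZ)
    band_lambda = 1.0 - c / band_bw
    centres = 0.5 * (BAND_EDGES_HZ[:-1] + BAND_EDGES_HZ[1:])
    return np.interp(freqs_hz, centres, band_lambda)

def smooth_forward_backward(measure, lam):
    """Single-pole low-pass run up and then down the frequency axis."""
    measure = np.asarray(measure, dtype=float)
    out = np.empty_like(measure)
    acc = measure[0]
    for k in range(len(measure)):                 # forward pass
        acc = lam[k] * acc + (1.0 - lam[k]) * measure[k]
        out[k] = acc
    acc = out[-1]
    for k in range(len(measure) - 1, -1, -1):     # backward pass
        acc = lam[k] * acc + (1.0 - lam[k]) * out[k]
        out[k] = acc
    return out
```

Wider bands give Lambda closer to one, so the amount of smoothing grows with frequency, and running the filter in both directions avoids a net shift along the frequency axis.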

Once the final spectral measure has been generated, the frequency correction filters can be calculated. To do so, the system is provided with a desired corrected frequency response or "target curve". This target curve is one of the main contributors to the characteristic sound of any room correction system. One approach is to use a single common target curve reflecting any user preferences for all audio channels. Another approach reflected in Figure 10 is to generate and save a unique channel target curve for each audio channel (step 222) and generate a common target curve for all channels (step 224).

For correct stereo or multichannel imaging, a room correction process should first of all achieve matching of the first arrival of sound (in time, amplitude and timbre) from each of the loudspeakers in the room. The room spectral measure is smoothed with a very coarse low pass filter such that only the trend of the measure is preserved. In other words the trend of the direct path of a loudspeaker response is preserved since all room contributions are excluded or smoothed out. These smoothed direct path loudspeaker responses are used as the channel target curves during the calculation of frequency correction filters for each loudspeaker separately (step 226). As a result only relatively small order correction filters are required since only peaks and dips around the target need to be corrected. The audio channel indicator nch is incremented by one (step 228) and tested against the total number of channels NumCh to determine if all possible audio channels have been processed (step 230). If not, the entire process repeats for the next audio channel. If yes, the process proceeds to make final adjustments to the correction filters for the common target curve.

In step 224, the common target curve is generated as an average of the channel target curves over all loudspeakers. Any user preferences or user selectable target curves may be superimposed on the common target curve. Any adjustments to the correction filters are made to compensate for differences in the channel target curves and the common target curve (step 223). Due to the relatively small variations between the per channel and common target curves and the highly smoothed curves, the requirements imposed by the common target curve can be implemented with very simple filters.
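The relation between channel and common target curves can be sketched as below. Averaging in decibels and expressing the per-channel adjustment as a dB offset are assumptions; the text does not state the averaging domain, and the function name is hypothetical.

```python
import numpy as np

def common_target_and_adjustments(channel_targets_db):
    """channel_targets_db: array of shape (n_channels, n_bins), in dB.

    Returns the common target (mean over channels) and, per channel, the
    dB offset that moves that channel's target onto the common one.
    """
    targets = np.asarray(channel_targets_db, dtype=float)
    common = targets.mean(axis=0)
    return common, common - targets
```

Since the channel targets are heavily smoothed, these offsets vary slowly with frequency, which is consistent with the remark that very simple filters suffice.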

As mentioned previously the spectral measure computed in step 208 may constitute an energy measure. An embodiment for computing energy measures for various combinations of a single microphone or a tetrahedral microphone and a single probe or a dual probe is illustrated in figure 12.

The analysis module determines whether there are 1 or 4 microphones (step 230) and then determines whether there is a single or dual-probe room response (step 232 for a single microphone and step 234 for a tetrahedral microphone). This embodiment is described for 4 microphones; more generally the method may be applied to any multi-microphone array.

For the case of a single microphone and single probe room response H_k, the analysis module constructs the energy measure E_k (functional dependency on frequency omitted) in each frequency bin k as E_k = H_k*conj(H_k) where conj(*) is the conjugate operator (step 236). Energy measure E_k corresponds to the sound pressure.

For the case of a single microphone and dual probe room responses H_k and H_k,pe, the analysis module constructs the energy measure E_k at low frequency bins k < k_t as E_k = De*H_k,pe*conj(De*H_k,pe) where De is the complementary de-emphasis function to the pre-emphasis function Pe (i.e. De*Pe = 1 for all frequency bins k) (step 238). For example, the pre-emphasis function Pe = c/ωd and the de-emphasis function De = ωd/c. At high frequency bins k > k_t, E_k = H_k*conj(H_k) (step 240). The effect of using the dual-probe is to attenuate low frequency noise in the energy measure.
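The single-microphone energy measure for both the single-probe and dual-probe cases can be sketched as follows, taking De = 1/Pe per bin. The function and argument names are hypothetical, and the bin exactly at the cut-off is assigned to the high-frequency branch as an arbitrary choice.

```python
import numpy as np

def single_mic_energy(h, h_pe=None, pe=None, k_t=0):
    """Energy measure per frequency bin for one microphone.

    h    : complex response H_k from the all-pass probe
    h_pe : optional complex response H_k,pe from the pre-emphasized probe
    pe   : pre-emphasis function Pe per bin (finite values), so De = 1 / Pe
    k_t  : cut-off bin; bins below it use the de-emphasized H_k,pe
    """
    h = np.asarray(h)
    energy = (h * np.conj(h)).real              # E_k = H_k * conj(H_k)
    if h_pe is not None:
        low = np.asarray(h_pe)[:k_t] / np.asarray(pe)[:k_t]   # De * H_k,pe
        energy[:k_t] = (low * np.conj(low)).real
    return energy
```

In the noise-free case both branches give the same value; the benefit of the dual probe appears only when low-frequency noise is present in the captured signal.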

For the tetrahedral microphone cases, the analysis module computes a pressure gradient across the microphone array from which sound velocity components may be extracted. As will be detailed, an energy measure based on both sound pressure and sound velocity for low frequencies is more robust across a wider listening area.

For the case of a tetrahedral microphone and a single probe response $H_k$, at each low frequency bin $k < k_t$ a first part of the energy measure includes a sound pressure component and a sound velocity component (step 242). The sound pressure component $P\_E_k$ may be computed by averaging the frequency response over all microphones, $AvH_k = 0.25 \cdot (H_k(m1) + H_k(m2) + H_k(m3) + H_k(m4))$, and computing $P\_E_k = AvH_k \cdot \mathrm{conj}(AvH_k)$ (step 244). The "average" may be computed as any variation of a weighted average. The sound velocity component $V\_E_k$ is computed by estimating a pressure gradient $\nabla P$ from the $H_k$ for all 4 microphones, applying a frequency dependent weighting $c/(wd)$ to $\nabla P$ to obtain velocity components $V_{k,x}$, $V_{k,y}$ and $V_{k,z}$ along the x, y and z coordinate axes, and computing $V\_E_k = V_{k,x}\,\mathrm{conj}(V_{k,x}) + V_{k,y}\,\mathrm{conj}(V_{k,y}) + V_{k,z}\,\mathrm{conj}(V_{k,z})$ (step 246). The application of frequency dependent weighting will have the effect of amplifying noise at low frequencies. The low frequency portion of the energy measure is $E_k = 0.5\,(P\_E_k + V\_E_k)$ (step 248), although any variation of a weighted average may be used. The second part of the energy measure at each high frequency bin $k > k_t$ is computed as the square of the sums, $E_k = |0.25\,(H_k(m1) + H_k(m2) + H_k(m3) + H_k(m4))|^2$, or the sum of the squares, $E_k = 0.25\,(|H_k(m1)|^2 + |H_k(m2)|^2 + |H_k(m3)|^2 + |H_k(m4)|^2)$, for example (step 250).

For the case of a tetrahedral microphone and a dual-probe response $H_k$ and $H_{k,pe}$, at each low frequency bin $k < k_t$ a first part of the energy measure includes a sound pressure component and a sound velocity component (step 260). The sound pressure component $P\_E_k$ may be computed by averaging the frequency response over all microphones, $AvH_{k,pe} = 0.25 \cdot (H_{k,pe}(m1) + H_{k,pe}(m2) + H_{k,pe}(m3) + H_{k,pe}(m4))$, applying de-emphasis scaling and computing $P\_E_k = De \cdot AvH_{k,pe} \cdot \mathrm{conj}(De \cdot AvH_{k,pe})$ (step 264). The "average" may be computed as any variation of a weighted average. The sound velocity component $V\_E_k$ is computed by estimating a pressure gradient $\nabla P$ from the $H_{k,pe}$ for all 4 microphones, estimating velocity components $V_{k,x}$, $V_{k,y}$ and $V_{k,z}$ along the x, y and z coordinate axes from $\nabla P$, and computing $V\_E_k = V_{k,x}\,\mathrm{conj}(V_{k,x}) + V_{k,y}\,\mathrm{conj}(V_{k,y}) + V_{k,z}\,\mathrm{conj}(V_{k,z})$ (step 266). The use of the pre-emphasized probe signal removes the step of applying frequency dependent weighting. The low frequency portion of the energy measure is $E_k = 0.5\,(P\_E_k + V\_E_k)$ (step 268) (or other weighted combination). The second part of the energy measure at each high frequency bin $k > k_t$ may be computed as the square of the sums, $E_k = |0.25\,(H_k(m1) + H_k(m2) + H_k(m3) + H_k(m4))|^2$, or the sum of the squares, $E_k = 0.25\,(|H_k(m1)|^2 + |H_k(m2)|^2 + |H_k(m3)|^2 + |H_k(m4)|^2)$, for example (step 270). The dual-probe, multi-microphone case combines both forming the energy measure from sound pressure and sound velocity components and using the pre-emphasized probe signal in order to avoid the frequency dependent scaling to extract the sound velocity components, hence providing a sound velocity that is more robust in the presence of noise.
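The low-frequency measure for the tetrahedral cases may be sketched as follows, for illustration only. The names `UNIT_TETRA`, `velocity_energy` and `tetra_low_freq_energy` are hypothetical, the array is assumed to be a regular tetrahedron of edge d, and the gradient is assumed to be computed on the unit-edge geometry so that the weighting $c/(wd)$ applies as stated above.

```python
import numpy as np

# Unit-edge regular tetrahedron (assumed geometry); real positions are d * UNIT_TETRA.
UNIT_TETRA = np.array([
    [0.0, 0.0, 0.0],
    [-np.sqrt(3) / 2, 0.5, 0.0],
    [-np.sqrt(3) / 2, -0.5, 0.0],
    [-np.sqrt(3) / 3, 0.0, np.sqrt(6) / 3],
])
PAIRS = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]

def velocity_energy(H, weight):
    """Sum of |V_x|^2, |V_y|^2, |V_z|^2 per bin from 4 responses H of shape (4, K)."""
    R = np.array([UNIT_TETRA[l] - UNIT_TETRA[k] for k, l in PAIRS])  # (6, 3)
    dH = np.array([H[l] - H[k] for k, l in PAIRS])                   # (6, K)
    grad = np.linalg.pinv(R) @ dH          # least-squares gradient, (3, K)
    V = weight * grad                      # frequency weighting (or 1.0)
    return np.sum(np.abs(V) ** 2, axis=0)

def tetra_low_freq_energy(H, w, d, c=340.0, pre_emphasized=False):
    """0.5 * (pressure energy + velocity energy) per low-frequency bin.

    H : (4, K) responses; measured with the pre-emphasized probe if
        pre_emphasized is True, otherwise with the ordinary probe.
    w : (K,) angular frequencies in rad/s; d : capsule spacing in metres.
    """
    H = np.asarray(H, dtype=complex)
    pe = c / (np.asarray(w, dtype=float) * d)   # pre-emphasis Pe = c/(w d)
    avg = H.mean(axis=0)
    if pre_emphasized:
        P_E = np.abs(avg / pe) ** 2             # de-emphasis De = 1/Pe
        V_E = velocity_energy(H, 1.0)           # no weighting needed
    else:
        P_E = np.abs(avg) ** 2
        V_E = velocity_energy(H, pe)            # weighting c/(w d)
    return 0.5 * (P_E + V_E)
```

For a noise-free plane wave at a frequency where the array is small compared with the wavelength, both branches return a value close to 1, which is a convenient sanity check of the scaling.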

A more rigorous development of the methodology for constructing the energy measure, and particularly the low frequency component of the energy measure, for the tetrahedral microphone array using either single or dual-probe techniques follows. This development illustrates both the benefits of the multi-microphone array and the use of the dual-probe signal.

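In an embodiment, at low frequencies, the spectral density of the acoustic energy density in the room is estimated. Instantaneous acoustic energy density, at the point, is given by:

$$e_D(\mathbf{r}, t) = \frac{p(\mathbf{r}, t)^2}{2\rho c^2} + \frac{\rho\,\|\mathbf{u}(\mathbf{r}, t)\|^2}{2} \qquad (1)$$

where all variables marked in bold represent vector variables, $p(\mathbf{r}, t)$ and $\mathbf{u}(\mathbf{r}, t)$ are instantaneous sound pressure and sound velocity vector, respectively, at location determined by position vector $\mathbf{r}$, c is the speed of sound, and ρ is the mean density of the air. $\|\mathbf{U}\|$ indicates the $l_2$ norm of vector $\mathbf{U}$. If the analysis is done in frequency domain, via the Fourier transform, then

$$E_D(\mathbf{r}, w) = \frac{|P(\mathbf{r}, w)|^2}{2\rho c^2} + \frac{\rho\,\|\mathbf{U}(\mathbf{r}, w)\|^2}{2} \qquad (2)$$

where $Z(\mathbf{r}, w) = \mathcal{F}(z(\mathbf{r}, t)) = \int_{-\infty}^{\infty} z(\mathbf{r}, t)\, e^{-jwt}\, dt$.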

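The sound velocity at location $\mathbf{r}$ is related to the pressure using the linear Euler's equation,

$$\rho\,\frac{\partial \mathbf{u}(\mathbf{r}, t)}{\partial t} = -\nabla p(\mathbf{r}, t) \qquad (3)$$

and in the frequency domain

$$jw\rho\,\mathbf{U}(\mathbf{r}, w) = -\nabla P(\mathbf{r}, w) \qquad (4)$$

The term $\nabla P(\mathbf{r}, w)$ is a Fourier transform of a pressure gradient along x, y and z coordinates at frequency w. Hereafter, all analysis will be conducted in the frequency domain and the functional dependency on w indicating the Fourier transform will be omitted as before. Similarly, functional dependency on location vector $\mathbf{r}$ will be omitted from notation.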

With this the expression for desired energy measure at each frequency in desired low frequency region can be written as

$$E = \frac{|P|^2}{2\rho c^2} + \frac{\|\nabla P\|^2}{2\rho w^2} \qquad (5)$$

The technique that uses the differences between the pressures at multiple microphone locations to compute the pressure gradient has been described in Thomas, D. C. (2008), Theory and Estimation of Acoustic Intensity and Energy Density, MSc. Thesis, Brigham Young University. This pressure gradient estimation technique for the case of tetrahedral microphone array and for specially selected coordinate system shown in Figure 1b is presented. All microphones are assumed to be omnidirectional, i.e., the microphone signals represent the pressure measurements at different locations.

A pressure gradient may be obtained from the assumption that the microphones are positioned such that the spatial variation in the pressure field is small over the volume occupied by the microphone array. This assumption places an upper bound on the frequency range at which this assumption may be used. In this case, the pressure gradient may be approximately related to the pressure difference between any microphone pair by $\mathbf{r}_{kl}^T \cdot \nabla P \approx P_{kl} = P_l - P_k$, where $P_k$ is a pressure component measured at microphone k, $\mathbf{r}_{kl}$ is the vector pointing from microphone k to microphone l, i.e., $\mathbf{r}_{kl} = \mathbf{r}_l - \mathbf{r}_k$, T denotes the matrix transpose operator and · denotes a vector dot product. For the particular microphone array and particular selection of the coordinate system the microphone position vectors are $\mathbf{r}_1 = [0\ \ 0\ \ 0]^T$, $\mathbf{r}_2 = d\,[-\tfrac{\sqrt{3}}{2}\ \ 0.5\ \ 0]^T$, $\mathbf{r}_3 = d\,[-\tfrac{\sqrt{3}}{2}\ \ {-0.5}\ \ 0]^T$ and $\mathbf{r}_4 = d\,[-\tfrac{\sqrt{3}}{3}\ \ 0\ \ \tfrac{\sqrt{6}}{3}]^T$. Considering all 6 possible microphone pairs in the tetrahedral array, an over-determined system of equations can be solved for unknown components (along x, y and z coordinates) of a pressure gradient by means of a least squares solution. In particular, if all equations are grouped in a matrix form the following matrix equation is obtained:

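$$\mathbf{R} \cdot \nabla P = \mathbf{P} + \Delta \qquad (6)$$

with $\mathbf{R} = [\mathbf{r}_{12}\ \mathbf{r}_{13}\ \mathbf{r}_{14}\ \mathbf{r}_{23}\ \mathbf{r}_{24}\ \mathbf{r}_{34}]^T$, $\mathbf{P} = [P_{12}\ P_{13}\ P_{14}\ P_{23}\ P_{24}\ P_{34}]^T$, and Δ an estimation error. The pressure gradient $\widehat{\nabla P}$ that minimizes the estimation error in a least squares sense is obtained as follows

$$\widehat{\nabla P} = (\mathbf{R}^T \mathbf{R})^{-1} \mathbf{R}^T \mathbf{P} \qquad (7)$$

where $(\mathbf{R}^T \mathbf{R})^{-1} \mathbf{R}^T$ is the left pseudo-inverse of matrix $\mathbf{R}$. The matrix $\mathbf{R}$ is only dependent on selected microphone array geometry and selected origin of a coordinate system. The existence of its pseudo-inverse is guaranteed as long as the number of microphones is greater than the number of dimensions. For estimation of the pressure gradient in 3D space (3 dimensions) at least 4 microphones are required.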
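The least squares solution described above may be sketched as follows, by way of example only. The helper names are hypothetical, and the microphone coordinates assume a regular tetrahedron of edge d with one capsule at the origin.

```python
import numpy as np

def tetra_positions(d):
    """Capsule positions for an assumed regular tetrahedron of edge d (metres)."""
    return d * np.array([
        [0.0, 0.0, 0.0],
        [-np.sqrt(3) / 2, 0.5, 0.0],
        [-np.sqrt(3) / 2, -0.5, 0.0],
        [-np.sqrt(3) / 3, 0.0, np.sqrt(6) / 3],
    ])

def pressure_gradient(P, mic_pos):
    """Least-squares pressure gradient from 4 pressures via the left pseudo-inverse.

    Rows of R are r_kl = r_l - r_k for the pairs 12, 13, 14, 23, 24, 34,
    and the right-hand side holds P_kl = P_l - P_k in the same order.
    """
    pairs = [(k, l) for k in range(4) for l in range(k + 1, 4)]
    R = np.array([mic_pos[l] - mic_pos[k] for k, l in pairs])   # (6, 3)
    dP = np.array([P[l] - P[k] for k, l in pairs])              # (6,) or (6, K)
    return np.linalg.pinv(R) @ dP
```

For a pressure field that varies linearly in space the six difference equations are exactly consistent, so the estimate reproduces the true gradient; errors appear only as the field curves over the array volume or as noise enters the differences.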

There are several issues that need to be considered when it comes to applicability of the above described method to the real life measurements of a pressure gradient and ultimately sound velocity:

• The method uses phase matched microphones, although the effect of slight phase mismatch, for constant frequency, decreases as the distance between the microphones increases.

• The maximum distance between the microphones is limited by the assumption that spatial variation in the pressure field is small over the volume occupied by the microphone array, implying that the distance between the microphones shall be much less than a wavelength λ of the highest frequency of interest. It has been suggested by Fahy, F. J. (1995), Sound Intensity, 2nd ed., London: E & FN Spon, that the microphone separation, in methods using finite difference approximation for estimation of a pressure gradient, should be less than 0.13λ to avoid errors in the pressure gradient greater than 5%.

• Considering that in real life measurements noise is always present in microphone signals, especially at low frequencies the gradient becomes very noisy. The difference in pressure due to a sound wave coming from a loudspeaker at different microphone locations becomes very small at low frequencies, for the same microphone separation. Considering that for velocity estimation the signal of interest is the difference between two microphones, at low frequencies the effective signal to noise ratio is reduced when compared to the original SNR in microphone signals. To make things even worse, during the calculation of velocity signals, these microphone difference signals are weighted by a function that is inversely proportional to the frequency, effectively causing noise amplification. This imposes a lower bound on a frequency region in which the methodology for velocity estimation, based on the pressure difference between the spaced microphones, can be applied.

• Room correction should be implemented in a variety of consumer AV equipment in which great phase matching between different microphones in a microphone array cannot be assumed. Consequently the microphone spacing should be as large as possible.

For room correction the interest is in obtaining pressure and velocity based energy measure in a frequency region between 20 Hz and 500 Hz where the room modes have dominating effect. Consequently spacing between the microphone capsules that does not exceed approximately 9 cm (0.13*340/500 m) is appropriate.
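The spacing figure quoted above follows from the 0.13λ guideline evaluated at the highest frequency of interest. A minimal worked example, with a hypothetical function name and the speed of sound taken as 340 m/s as in the text:

```python
def max_mic_spacing(f_max_hz, c=340.0, fraction=0.13):
    """Largest capsule spacing (metres) satisfying spacing < fraction * wavelength."""
    wavelength = c / f_max_hz      # shortest wavelength of interest
    return fraction * wavelength

spacing_m = max_mic_spacing(500.0)  # about 0.088 m, i.e. roughly 9 cm
```

Lowering the upper frequency limit relaxes the bound proportionally, which is one way the spacing could be enlarged to ease the phase-matching requirement.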

Consider a received signal at pressure microphone k and its Fourier transform $P_k(w)$. Consider a loudspeaker feed signal $S(w)$ (i.e., probe signal) and characterize transmission of a probe signal from a loudspeaker to microphone k with the room frequency response $H_k(w)$. Then $P_k(w) = S(w) H_k(w) + N_k(w)$, where $N_k(w)$ is a noise component at microphone k. For simplicity of notation in the following equations the dependency on w, i.e. $P_k(w)$, will simply be denoted as $P_k$, etc.

For the purpose of a room correction the goal is to find a representative room energy spectrum that can be used for the calculation of frequency correction filters. Ideally, if there is no noise in the system, the representative room energy spectrum (RmES) can be expressed as

In reality noise will always be present in the system and an estimate of RmES can be expressed as
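As a sketch of the form these two expressions take, suppose (as an assumption made here for illustration) that the energy measure is the squared pressure magnitude plus $(c/w)^2$ times the squared norm of the pressure gradient, normalized by the probe spectrum, with the pressure taken as the average over the four capsules and the gradient obtained through the left pseudo-inverse $\mathbf{R}^{+} = (\mathbf{R}^T\mathbf{R})^{-1}\mathbf{R}^T$. Writing $\Delta\mathbf{H} = [H_2 - H_1\ \ H_3 - H_1\ \ H_4 - H_1\ \ H_3 - H_2\ \ H_4 - H_2\ \ H_4 - H_3]^T$, the noise-free spectrum would read

$$RmES = \left|\tfrac{1}{4}\sum_{k=1}^{4} H_k\right|^2 + \left(\frac{c}{w}\right)^2 \left\|\mathbf{R}^{+}\,\Delta\mathbf{H}\right\|^2$$

and the estimate formed from noisy measurements would replace each $H_k$ by $P_k/S = H_k + N_k/S$:

$$\widehat{RmES} = \left|\tfrac{1}{4}\sum_{k=1}^{4} \frac{P_k}{S}\right|^2 + \left(\frac{c}{w}\right)^2 \left\|\mathbf{R}^{+}\,\frac{\Delta\mathbf{P}}{S}\right\|^2$$

Since the rows of $\mathbf{R}$ scale with the capsule spacing d, the factor $(c/w)\,\mathbf{R}^{+}$ carries an overall weighting of $c/(wd)$. The exact normalization constants are not fixed by this sketch.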

At very low frequencies the magnitude squared of the differences between frequency responses from a loudspeaker to closely spaced microphone capsules, i.e., $|H_k - H_l|^2$, is very small. On the other hand, the noise in different microphones may be considered uncorrelated and consequently $|N_k - N_l|^2 \approx |N_k|^2 + |N_l|^2$. This effectively reduces the desired signal to noise ratio and makes the pressure gradient noisy at low frequencies. Increasing the distance between the microphones will make the magnitude of desired signal $(H_k - H_l)$ larger and consequently improve the effective SNR.

The frequency weighting factor $\frac{c}{wd}$ for all frequencies of interest is greater than 1 and it effectively amplifies the noise with a scale that is inversely proportional to the frequency. This introduces upward tilt in RmES towards lower frequencies. To prevent this low frequency tilt in the estimated energy measure, the pre-emphasized probe signal is used for room probing at low frequencies. In particular, the pre-emphasized probe signal is $S_{pe} = \frac{c}{wd}\,S$. Furthermore, when extracting room responses from the microphone signals, de-convolution is performed not with the transmitted probe signal $S_{pe}$ but rather with the original probe signal S. The room responses extracted in that manner will have the following form: $H_{k,pe} = \frac{c}{wd}\,H_k + \frac{N_k}{S}$.

Consequently the modified form of the estimator for the energy measure

To observe its behavior regarding noise amplification the energy measure is written as

With this estimator, noise components entering the velocity estimate are not amplified by $\frac{c}{wd}$ and in addition the noise components entering the pressure estimate are attenuated by $\frac{wd}{c}$, hence improving the SNR of the pressure microphone. As stated before, this low frequency processing is applied in a frequency region from 20 Hz to around 500 Hz. Its goal is to obtain an energy measure that is representative of a wide listening area in the room. At higher frequencies the goal is to characterize the direct path and few early reflections from the loudspeaker to the listening area. These characteristics mostly depend on loudspeaker construction and its position within the room and consequently do not vary much between different locations within the listening area. Therefore at high frequencies an energy measure based on a simple average (or more complex weighted average) of tetrahedral microphone signals is used. The resulting overall room energy measure is written as in Equation (12).
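The noise behavior described here can be made explicit with a short derivation, offered as a sketch under stated assumptions: the measurement model is $P_k = S H_k + N_k$, the pre-emphasized probe is $S_{pe} = \frac{c}{wd} S$, the extracted responses are $\hat{H}_{k,pe} = \frac{c}{wd} H_k + \frac{N_k}{S}$, $\mathbf{R}^{+}$ is the left pseudo-inverse of the microphone difference matrix, and $\Delta$ stacks the six pairwise differences. The normalization is an assumption of this sketch. A dual-probe low-frequency estimator of the form

$$\widehat{RmES}_{pe} = \left|\frac{wd}{c}\cdot\tfrac{1}{4}\sum_{k=1}^{4}\hat{H}_{k,pe}\right|^2 + d^2\left\|\mathbf{R}^{+}\,\Delta\hat{\mathbf{H}}_{pe}\right\|^2$$

expands to

$$\widehat{RmES}_{pe} = \left|\tfrac{1}{4}\sum_{k=1}^{4}\left(H_k + \frac{wd}{c}\frac{N_k}{S}\right)\right|^2 + \left\|\frac{c}{w}\,\mathbf{R}^{+}\Delta\mathbf{H} + d\,\mathbf{R}^{+}\frac{\Delta\mathbf{N}}{S}\right\|^2$$

In the pressure term the noise is scaled by $wd/c$, which is less than 1 in the band of interest. In the velocity term the noise appears as $d\,\mathbf{R}^{+}\Delta\mathbf{N}/S$, whereas a single-probe estimator would carry $\frac{c}{w}\mathbf{R}^{+}\Delta\mathbf{N}/S$, which is larger by the factor $c/(wd)$.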

RmEn = … (12)

These equations relate directly to the cases for constructing the energy measures $E_k$ for the single-probe and dual-probe tetrahedral microphone configurations. In particular, equation 8 corresponds to step 242 for computing the low-frequency component of $E_k$. The 1st term in equation 8 is the magnitude squared of the average frequency response (step 244) and the 2nd term applies the frequency dependent weighting to the pressure gradient to estimate the velocity components and computes the magnitude squared (step 246). Equation 12 corresponds to steps 260 (low-frequency) and 270 (high-frequency). The 1st term in equation 12 is the magnitude squared of the de-emphasized average frequency response (step 264). The 2nd term is the magnitude squared of the velocity components estimated from the pressure gradient. For both the single and dual-probe cases, the sound velocity component of the low-frequency measure is computed directly from the measured room response $H_k$ or $H_{k,pe}$; the steps of estimating the pressure gradient and obtaining the velocity components are integrally performed.

Sub-Band Frequency Correction Filters

The construction of minimum-phase FIR sub-band correction filters is based on AR model estimation for each band independently using the previously described room spectral (energy) measure. Each band can be constructed independently because the analysis/synthesis filter banks are non-critically sampled.

Referring now to Figures 13 and 14a-14c, for each audio channel and loudspeaker a channel target curve is provided (step 300). As described previously, the channel target curve may be calculated by applying frequency smoothing to the room spectral measure, selecting a user defined target curve or by superimposing a user defined target curve onto the frequency smoothed room spectral measure. Additionally, the room spectral measure may be bounded to prevent extreme requirements on the correction filters (step 302). The per channel mid-band gain may be estimated as an average of the room spectral measure over the mid-band frequency region. Excursions of the room spectral measure are bounded between a maximum of the mid-band gain plus an upper bound (e.g. 20 dB) and a minimum of the mid-band gain minus a lower bound (e.g. 10 dB). The upper bound is typically larger than the lower bound to avoid pumping excessive energy into a frequency band where the room spectral measure has a deep null. The per channel target curve is combined with the bounded per channel room spectral measure to obtain an aggregate room spectral measure 303 (step 304). In each frequency bin, the room spectral measure is divided by the corresponding bin of the target curve to provide the aggregate room spectral measure. A sub-band counter sb is initialized to zero (step 306).
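Steps 302 and 304 may be sketched as follows, as one possible reading only. The function name, the representation of the measures as linear power values, the averaging of the mid-band gain in the power domain and the way the mid-band region is passed in are all assumptions of this example.

```python
import numpy as np

def aggregate_spectral_measure(room_power, target_power, mid_band,
                               upper_db=20.0, lower_db=10.0):
    """Bound the room spectral measure around its mid-band gain, then
    divide bin by bin by the channel target curve.

    room_power, target_power : linear power per frequency bin.
    mid_band : slice or index array selecting the mid-band bins.
    """
    room_power = np.asarray(room_power, dtype=float)
    mid_gain = room_power[mid_band].mean()          # per channel mid-band gain
    hi = mid_gain * 10.0 ** (upper_db / 10.0)       # mid-band gain + upper bound
    lo = mid_gain * 10.0 ** (-lower_db / 10.0)      # mid-band gain - lower bound
    bounded = np.clip(room_power, lo, hi)
    return bounded / np.asarray(target_power, dtype=float)
```

Since the correction filter follows the inverse of this measure, the tighter lower bound is what limits the boost applied at a deep null, while the looser upper bound still allows strong peaks to be cut.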

Portions of the aggregate spectral measure are extracted that correspond to different sub-bands and remapped to base-band to mimic the downsampling of the analysis filter bank (step 308). The aggregate room spectral measure 303 is partitioned into overlapping frequency regions 310a, 310b and so forth corresponding to each band in the oversampled filter bank. Each partition is mapped to the base-band according to decimation rules that apply for even and odd filter bank bands as shown in Figures 14c and 14b, respectively. Notice that the shapes of analysis filters are not included into the mapping. This is important because it is desirable to obtain correction filters that have as low order as possible. If the analysis filter bank filters are included, the mapped spectrum will have steep falling edges. Hence the correction filters would require high order to unnecessarily correct for a shape of analysis filters.

After mapping to base-band, the partitions corresponding to the odd or even bands will have parts of the spectrum shifted but some other parts also flipped. This may result in spectral discontinuity that would require a high order frequency correction filter. In order to prevent this unnecessary increase of correction filter order, the region of flipped spectrum is smoothed. This in return changes the fine detail of the spectrum in the smoothed region. However, it shall be noted that the flipped sections are always in the region where synthesis filters already have high attenuation and consequently the contribution of this part of the partition to the final spectrum is negligible.

An auto regressive (AR) model is estimated to the remapped aggregate room spectral measure (step 312). Each partition of room spectral measure, after being mapped to the base band, mimicking the effect of decimation, is interpreted as some equivalent spectrum. Hence its inverse Fourier transform will be a corresponding autocorrelation sequence. This autocorrelation sequence is used as the input to the Levinson-Durbin algorithm, which computes an AR model, of desired order, that best matches the given energy spectrum in a least squares sense. The denominator of this AR model (all-pole) filter is a minimum phase polynomial. The lengths of frequency correction filters in each sub-band are roughly determined by the length of room response, in the corresponding frequency region, that was considered during the creation of the overall room energy measure (length proportionally goes down as we move from low to high frequencies). However, the final lengths can either be fine tuned empirically or automatically by use of AR order selection algorithms that observe the residual power and stop when a desired resolution is reached.

The coefficients of the AR model are mapped to coefficients of a minimum-phase all-zero sub-band correction filter (step 314). This FIR filter will perform frequency correction according to the inverse of the spectrum obtained by the AR model. To match filters between different bands all of the correction filters are suitably normalized.
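Steps 312 and 314 may be sketched as follows, by way of example. The function names are hypothetical; the input is assumed to be a real, symmetric power spectrum sampled on a uniform grid around the full unit circle, and scaling the taps by the square root of the residual power is one possible normalization chosen for this sketch rather than the one referred to above.

```python
import numpy as np

def levinson_durbin(r, order):
    """Solve the Yule-Walker equations for autocorrelation r.

    Returns (a, err) where a[0] = 1 and A(z) = sum a[i] z^-i is the
    prediction-error (AR denominator) polynomial, err the residual power.
    """
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for m in range(1, order + 1):
        acc = r[m] + np.dot(a[1:m], r[m - 1:0:-1])
        k = -acc / err                         # reflection coefficient
        a_prev = a[:m].copy()
        a[1:m] = a_prev[1:] + k * a_prev[:0:-1]
        a[m] = k
        err *= (1.0 - k * k)
    return a, err

def ar_correction_fir(power_spectrum, order):
    """Minimum-phase FIR taps whose power response approximates 1 / spectrum."""
    r = np.fft.ifft(power_spectrum).real       # autocorrelation sequence
    a, err = levinson_durbin(r, order)
    return a / np.sqrt(err)                    # all-zero filter from AR denominator
```

The AR model spectrum is `err / |A|^2`, so using the denominator coefficients directly as FIR taps yields the inverse response. The residual power `err` returned at each order is also the quantity an automatic order selection rule could monitor.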

The sub-band counter sb is incremented (step 316) and compared to the number of sub-bands SB (step 318) to repeat the process for the next audio channel or to terminate the per channel construction of the correction filters. At this point, the channel FIR filter coefficients may be adjusted to a common target curve (step 320). The adjusted filter coefficients are stored in system memory and used to configure the one or more processors to implement the digital FIR sub-band correction filters for each audio channel shown in Figure 3 (step 322).

Loudspeaker Localization

For fully automated system calibration and set-up it is desirable to have knowledge of the exact location and number of loudspeakers present in the room. The distance can be computed based on estimated propagation delay from the loudspeaker to the microphone array. Assuming that the sound wave propagating along the direct path between loudspeaker and microphone array can be approximated by a plane wave, then the corresponding angle of arrival (AOA), elevation with respect to an origin of a coordinate system defined by the microphone array, can be estimated by observing the relationship between different microphone signals within the array. The loudspeaker azimuth and elevation are calculated from the estimated AOA.

It is possible to use frequency domain based AOA algorithms, in principle relying on the ratio between the phases in each bin of the frequency responses from a loudspeaker to each of the microphone capsules, to determine AOA. However, as shown in Cobos, M., López, J.J. and Martí, A. (2010), On the Effects of Room Reverberation in 3D DOA Estimation Using Tetrahedral Microphone Array, AES 128th Convention, London, UK, 2010 May 22-25, the presence of room reflections has a considerable effect on accuracy of estimated AOAs. Instead, a time domain approach to AOA estimation is used, relying on the accuracy of our direct path delay estimation, achieved by using the analytic envelope approach paired with the probe signal. Measuring the loudspeaker/room responses with the tetrahedral microphone array allows us to estimate direct path delays from each loudspeaker to each microphone capsule. By comparing these delays the loudspeakers can be localized in 3D space.

Referring to Figure 1b, an azimuth angle $\theta$ and an elevation angle $\phi$ are determined from an estimated angle of arrival (AOA) of a sound wave propagating from loudspeaker to the tetrahedral microphone array. The algorithm for estimation of the AOA is based on a property of vector dot product to characterize the angle between two vectors. In particular, with specifically selected origin of a coordinate system the following dot product equation can be written as

(13)

where $\mathbf{r}_{lk}$ indicates the vector connecting the microphone k to the microphone l, T indicates matrix/array transpose operation, $\mathbf{s}$ denotes a unary vector that is aligned with the direction of arrival of the plane sound wave, c indicates the speed of sound, Fs indicates the sampling frequency, $t_k$ indicates the time of arrival of a sound wave to the microphone k and $t_l$ indicates the time of arrival of a sound wave to the microphone l.

Collecting equations for all microphone pairs the following matrix equation is obtained,

(14)

This matrix equation represents an over-determined system of linear equations that can be solved by method of least squares, resulting in the following expression for direction of arrival vector

The azimuth and elevation angles are obtained from the estimated coordinates of the normalised vector $\mathbf{s} = \hat{\mathbf{s}} / \|\hat{\mathbf{s}}\|$ as $\theta = \arctan(s_y, s_x)$ and $\phi = \arcsin(s_z)$, where arctan() is a four quadrant inverse tangent function and arcsin() is an inverse sine function.
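The time domain AOA estimate may be sketched as follows, for illustration. The function name is hypothetical and the sign convention is an assumption of this example: the unit vector s is taken to point from the array toward the loudspeaker, so that for each microphone pair $\mathbf{r}_{lk}^T \mathbf{s} = \frac{c}{F_s}(t_k - t_l)$ with $\mathbf{r}_{lk} = \mathbf{r}_l - \mathbf{r}_k$ and arrival times in samples. The opposite convention flips the sign of the right-hand side.

```python
import numpy as np

def estimate_aoa(mic_pos, t_arrival, fs, c=340.0):
    """Least-squares direction of arrival from per-capsule arrival times.

    mic_pos   : (M, 3) capsule positions in metres.
    t_arrival : (M,) direct-path arrival times in samples (may be fractional).
    Returns (azimuth, elevation) in radians.
    """
    mic_pos = np.asarray(mic_pos, dtype=float)
    t = np.asarray(t_arrival, dtype=float)
    pairs = [(k, l) for k in range(len(t)) for l in range(k + 1, len(t))]
    R = np.array([mic_pos[l] - mic_pos[k] for k, l in pairs])    # rows r_lk
    tau = np.array([c / fs * (t[k] - t[l]) for k, l in pairs])   # path differences
    s, *_ = np.linalg.lstsq(R, tau, rcond=None)                  # over-determined solve
    s = s / np.linalg.norm(s)                                    # normalise
    azimuth = np.arctan2(s[1], s[0])                             # four-quadrant arctan
    elevation = np.arcsin(np.clip(s[2], -1.0, 1.0))
    return azimuth, elevation
```

The distance to the loudspeaker would follow separately from the absolute propagation delay multiplied by c/Fs; only the delay differences enter the angle estimate.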

The achievable angular accuracy of AOA algorithms using the time delay estimates is ultimately limited by the accuracy of delay estimates and the separation between the microphone capsules. Smaller separation between the capsules implies smaller achievable accuracy. The separation between the microphone capsules is limited from the top by requirements of velocity estimation as well as aesthetics of the end product. Consequently the desired angular accuracy is achieved by adjusting the delay estimation accuracy. If the required delay estimation accuracy becomes a fraction of the sampling interval, the analytic envelopes of the room responses are interpolated around their corresponding peaks. New peak locations, with a fraction of sample accuracy, represent new delay estimates used by the AOA algorithm.
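One simple way to obtain a fractional-sample peak location is sketched below. The interpolation method is not specified above; three-point parabolic interpolation is used here as an assumed stand-in, the function name is hypothetical, and the analytic envelope is taken as already computed.

```python
import numpy as np

def refine_peak(envelope):
    """Peak location of an envelope with sub-sample resolution.

    Fits a parabola through the largest sample and its two neighbours
    and returns the position of the parabola's vertex, in samples.
    """
    env = np.asarray(envelope, dtype=float)
    n = int(np.argmax(env))
    if n == 0 or n == len(env) - 1:
        return float(n)                      # no neighbours to interpolate with
    y0, y1, y2 = env[n - 1], env[n], env[n + 1]
    denom = y0 - 2.0 * y1 + y2
    if denom == 0.0:
        return float(n)                      # flat top, keep integer location
    return n + 0.5 * (y0 - y2) / denom       # vertex offset in (-0.5, 0.5)
```

A band-limited interpolator applied around the peak would be an alternative where the envelope is not well approximated by a parabola near its maximum.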

While several illustrative embodiments of the invention have been shown and described, numerous variations and alternate embodiments will occur to those skilled in the art. Such variations and alternate embodiments are contemplated, and can be made without departing from the spirit and scope of the invention as defined in the appended claims.