Title:
ACOUSTIC ECHO AND NOISE CANCELLATION
Document Type and Number:
WIPO Patent Application WO/2001/001571
Kind Code:
A1
Abstract:
Stereo echo cancellation is necessary to overcome the problems observed in, for example, teleconferencing and voice controlled video/audio apparatuses. To improve on existing filters, the invention provides an adaptive filter and a signal processing device which obtain the coefficient updates in the transformed domain, reducing the required computational complexity. Further, the filter comprises means to reduce the effect of the correlation between the input signals on the coefficient updates.

Inventors:
EGELMEERS GERARDUS P M
JANSE CORNELIS P
Application Number:
PCT/EP2000/005711
Publication Date:
January 04, 2001
Filing Date:
June 21, 2000
Assignee:
KONINKL PHILIPS ELECTRONICS NV (NL)
International Classes:
H03H21/00; H04B3/20; (IPC1-7): H03H21/00
Foreign References:
US4355368A1982-10-19
Other References:
YUAN-HWANG CHEN ET AL: "FREQUENCY-DOMAIN IMPLEMENTATION OF GRIFFITHS-JIM ADAPTIVE BEAMFORMER", JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA,US,AMERICAN INSTITUTE OF PHYSICS. NEW YORK, vol. 91, no. 6, 1 June 1992 (1992-06-01), pages 3354 - 3366, XP000290213, ISSN: 0001-4966
Attorney, Agent or Firm:
Schoenmaker, Maarten (Internationaal Octrooibureau B.V. Prof Holstlaan 6 AA Eindhoven, NL)
Claims:
CLAIMS:
1. Adaptive filter comprising at least two inputs for receiving at least two signals, and an output for supplying an output signal, characterized in that the coefficient updates are determined in a transformed domain and that the filter comprises means to reduce the effect of the correlation between the input signals on the coefficient updates.
2. Adaptive filter according to claim 1, characterized in that the transformed domain is the frequency domain.
3. Adaptive filter according to claim 2, characterized in that the filter comprises an update algorithm with transformed auto- and cross-correlation matrices.
4. Adaptive filter according to claim 2, characterized in that the reduction of the effect of the correlation is achieved by multiplying the frequency domain input signals with the inverse of the input channel's power matrix.
5. Adaptive filter according to claim 4, characterized in that the input channel's power matrix is determined by a first order recursive network, with the product of the frequency domain input signals and their conjugates as input, and further characterized in that at each iteration a certain positive value is added to all elements of the main diagonal.
6. Adaptive filter according to claim 4, characterized in that the algorithm comprises solving a linear set of equations with the input channel power matrix as one of the elements of the equations.
7. Adaptive filter according to claim 3, characterized in that the inverse of the input channel's matrix is estimated directly, using a recursive update algorithm, and further characterized in that a limit is imposed on the eigenvalues of the matrix.
8. Signal processing device comprising a filter according to claim 1.
9. Signal processing device according to claim 8, characterized in that the device further comprises a dynamic echo and noise suppressor as a postprocessing device coupled to an output of the filter.
10. Signal processing device according to claim 8, characterized in that the signal processing device comprises a programmable filter.
11. Teleconferencing system comprising at least one signal processing device according to claim 8.
12. Voice controlled electronic device comprising at least one signal processing device according to claim 8.
13. Noise cancellation system comprising at least one signal processing device according to claim 8.
14. Method for filtering at least two signals and for supplying an output signal characterized in that the method determines the coefficient updates in the frequency domain and that the method reduces the effect of correlation between the input signals on the coefficient updates.
Description:
"Acoustic echo and noise cancellation."

The invention relates to a filter as described in the preamble of Claim 1.

The invention further relates to a signal processing device comprising such a filter.

The invention further relates to a teleconferencing system.

Further the invention relates to a voice controlled electronic device.

The invention also relates to a noise cancellation system.

The invention further relates to a method as described in the preamble of claim 14.

Recent developments in audio and video systems require the use of multiple channel processing and reproduction with acoustic echo cancellers (AEC) and noise cancellers.

For example, in mini group video conferencing systems multiple channel transmission leads to a better "localization" of the diverse people in the room. This enhances the intelligibility and naturalness of the speech.

Further, multiple channel echo cancellation is needed in voice controlled stereo audio and video equipment such as television receivers, radio receivers, CD players, etc.

In general, a multiple channel AEC cannot be created by simply combining multiple single channel AECs.

From United States patent US-A 5,828,756 a method and apparatus are known for a stereophonic communication system, such as a teleconferencing system, which involve selectively reducing the correlation between the individual channel signals of the stereophonic system. Herein non-linearities are added to the input signals to reduce the correlation. However, adding these non-linearities introduces audible artifacts in the output signals. These artifacts can (sometimes) be accepted in teleconferencing systems but are certainly not acceptable in other applications, such as supplying music.

It is, inter alia, an object of the invention to provide a filter, which overcomes the objections of the prior art. To this end a first aspect of the invention provides a filter as claimed in claim 1.

Herewith the performance of the adaptive filters is improved without a huge increase in computational complexity.

A second aspect of the invention provides a signal-processing device as claimed in claim 8.

A third aspect of the invention provides a teleconferencing system as claimed in claim 11.

A fourth aspect of the invention provides a voice controlled electronic device as claimed in claim 12.

A fifth aspect of the invention provides a noise cancellation system as claimed in claim 13.

A sixth aspect of the invention provides a method as claimed in claim 14.

An embodiment of the invention comprises the features of claim 2.

The invention and additional features, which may optionally be used to implement the invention to advantage, will be apparent from and elucidated with reference to the examples described hereinafter and shown in the figures. Herein shows:
Figure 1 schematically the multiple input adaptive FIR filter according to the invention,
Figure 2 schematically the calculation of the output of the FIR filter according to the invention,
Figure 3 schematically the calculation of Y in the Multiple Input Partitioned Frequency Domain adaptive filter according to the invention, for the case with direct inverse power estimation,
Figure 4 schematically the calculation of the coefficient vectors w according to the invention,
Figure 5 a schematic example of stereo echo cancellation in a teleconferencing system according to the invention,
Figure 6 a more detailed schematic example of a teleconferencing system according to the invention,
Figure 7 a schematic example of a voice controlled device according to the invention, and
Figure 8 a schematic example of a noise canceller according to the invention.

In the description the equations, matrices, etc are shown as described below.

Signals are denoted by lower case characters, constants by upper case. Underlining is used for vectors, lower case for the time domain and upper case for the frequency domain. Matrices are denoted by bold face upper case, like $\mathbf{I}$. The dimension is put in superscript (e.g. the $B \times Q$ matrix $\mathbf{X}$ is given by $\mathbf{X}^{B,Q}$; for a square matrix the second dimension is omitted). Diagonal matrices are denoted by a double underline, like $\underline{\underline{P}}$, with the diagonal denoted as $\underline{P} = \mathrm{diag}\{\underline{\underline{P}}\}$. A subscript $i$, like $\underline{w}_i$, denotes the $i$'th version. The $k$'th element of $\underline{w}$ is given by $(\underline{w})_k$. Finally, appending $[k]$ denotes the time index, $(\cdot)^t$ denotes the transpose, $(\cdot)^*$ the complex conjugate and $(\cdot)^h$ the Hermitian transpose (complex conjugate transpose).

A general multiple input adaptive FIR filter, depicted in figure 1, uses the $S$ signals $x_0[k]$ to $x_{S-1}[k]$ to remove unwanted components correlated with these signals in the signal $e[k]$. The signals $x_0[k]$ to $x_{S-1}[k]$ are input to $S$ FIR filters $W_0$ to $W_{S-1}$, with outputs $\hat{e}_0[k]$ to $\hat{e}_{S-1}[k]$. The goal of the update algorithm is to adapt the coefficients of the FIR filters in such a way that the correlation between $r[k]$ and the input signals $x_0[k]$ to $x_{S-1}[k]$ is removed.

For $S > a \ge 0$ the FIR filter $W_a$ performs the convolution of the signal $x_a[k]$ and the coefficients $w_{a,0}[k] \ldots w_{a,N-1}[k]$ of that filter. The output signal $\hat{e}_a[k]$ of such a filter can be described, for $S > a \ge 0$, as

$$\hat{e}_a[k] = \sum_{i=0}^{N-1} x_a[k-i] \, w_{a,i}[k].$$

The output of the multiple input adaptive filter is given by

$$r[k] = e[k] - \sum_{a=0}^{S-1} \hat{e}_a[k].$$

These filter parts of the separate (adaptive) filters $W_0$ to $W_{S-1}$ can be implemented efficiently in the frequency domain with the help of partitioning, block processing and Discrete Fourier Transforms (DFTs). A reduction in computational complexity is obtained since the convolutions per sample in the time domain transform to elementwise multiplications per block in the frequency domain. We use block processing with block length $B$ and DFTs of length $M$, with $M \ge N + B - 1$. The transformation of the input signals can be described for $S > a \ge 0$ by

$$\underline{X}_a^M[lB] = F^M \cdot \underline{x}_a^M[lB],$$

where $F^M$ is the $M \times M$ Fourier matrix. The $(a, b)$'th element (for $0 \le a < M$, $0 \le b < M$) of the Fourier matrix is given by

$$(F^M)_{a,b} = e^{-j 2 \pi a b / M}.$$

The filter can then be computed in the frequency domain by [equation not reproduced]. Note that the frequency domain filter coefficients are related to the time domain coefficients; for all $S > a \ge 0$ this can be denoted by [equation not reproduced]. To obtain an efficient implementation, the block length $B$ must be chosen in the same order as the filter length $N$, which results in a large processing delay.
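The equivalence between the per-sample convolution and the per-block elementwise multiplication can be checked with a minimal sketch. It is not the implementation of the application: the helper names (`dft`, `fir_time`, `fir_block_freq`, `residual`) are hypothetical, the naive DFT stands in for an FFT, the coefficients are held fixed over a block, and the residual is assumed to be $e[k]$ minus the sum of the filter outputs.

```python
import cmath

def dft(x, inverse=False):
    """Naive length-M DFT built from the Fourier matrix element exp(-j*2*pi*a*b/M)."""
    M = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[b] * cmath.exp(sign * 2j * cmath.pi * a * b / M) for b in range(M))
           for a in range(M)]
    return [v / M for v in out] if inverse else out

def fir_time(x, w, k):
    """Time-domain FIR output at sample k: sum_i w[i] * x[k-i] (zero before time 0)."""
    return sum(w[i] * x[k - i] for i in range(len(w)) if k - i >= 0)

def fir_block_freq(x, w, start, B, M):
    """Overlap-save: B outputs for samples start..start+B-1 from one length-M DFT."""
    N = len(w)
    assert M >= N + B - 1, "DFT length must satisfy M >= N + B - 1"
    # The M most recent input samples ending at start+B-1, zero-padded before time 0.
    seg = [x[n] if n >= 0 else 0.0 for n in range(start + B - M, start + B)]
    W = dft(list(w) + [0.0] * (M - N))       # zero-padded coefficients
    X = dft(seg)
    y = dft([a * b for a, b in zip(X, W)], inverse=True)
    return [v.real for v in y[M - B:]]       # only the last B samples are free of wrap-around

def residual(e, xs, ws, k):
    """Assumed residual: e[k] minus the sum of the S filter outputs."""
    return e[k] - sum(fir_time(x, w, k) for x, w in zip(xs, ws))
```

With `B = 2`, `N = 3` and `M = 4`, each call to `fir_block_freq` returns the same two samples that `fir_time` produces one at a time, which is the property the frequency domain implementation relies on.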

To reduce the processing delay, the filter can be partitioned into smaller pieces of length $B$; with $g = \lceil N/B \rceil$ we get the implementation of figure 2, which can be described by [equation not reproduced].

For the update part of the filter one can use $S$ separate update algorithms. To improve convergence behavior, the input signals can be decorrelated separately in the time domain by using RLS-like algorithms, but this leads to a huge computational complexity.
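The idea behind the partitioning can be illustrated in the time domain: a length-$N$ filter split into $g = \lceil N/B \rceil$ pieces of length $B$ gives the same output when piece $i$ acts on the input delayed by $i \cdot B$ samples. The sketch below only checks this identity; the function names are hypothetical and it does not reproduce the frequency domain structure of figure 2.

```python
import math

def partition(w, B):
    """Split w into g = ceil(N/B) pieces of length B, zero-padding the last piece."""
    g = math.ceil(len(w) / B)
    padded = list(w) + [0.0] * (g * B - len(w))
    return [padded[i * B:(i + 1) * B] for i in range(g)]

def fir(x, w, k):
    """Direct FIR output at sample k (zero input assumed before time 0)."""
    return sum(w[i] * x[k - i] for i in range(len(w)) if k - i >= 0)

def fir_partitioned(x, w, k, B):
    """Sum of the piece outputs, piece i seeing the input delayed by i*B samples."""
    total = 0.0
    for i, piece in enumerate(partition(w, B)):
        for j, coeff in enumerate(piece):
            n = k - i * B - j
            if n >= 0:
                total += coeff * x[n]
    return total
```

Because each piece has length $B$ rather than $N$, a shorter transform and block length can be used per piece, which is what lowers the processing delay.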

Complexity reduction can be obtained by implementation in the frequency domain with (partitioned) Block Frequency Domain Adaptive Filters as described in G. P. M. Egelmeers, Real time realization of large adaptive filters, Ph. D. thesis, Eindhoven University of Technology, Eindhoven (The Netherlands), Nov. 1995. When there is correlation between the input signals of the filters this might still lead to very bad convergence behavior, due to the non-uniqueness problem.

In this application it is proposed to use a partitioned algorithm in the frequency domain that reduces the effect of the cross-correlation between the input signals on the algorithm's convergence behavior. To reduce complexity, block processing with block length $A$ is used to compute the sum of $A$ consecutive updates with each iteration. For $S > a \ge 0$ the coefficient vectors $\underline{w}_a^N[lA]$ are partitioned into $g_u$ parts of length $Z$, with $g_u = \lceil N/Z \rceil$ [partition equations not reproduced].

A Fourier transform length $L$ is used with $L \ge Z + A - 1$. We define the input signal Fourier transforms for $S > a \ge 0$ as

$$\underline{X}_a^L[lA] = F^L \cdot \underline{x}_a^L[lA].$$

The diagonal matrices $\underline{\underline{X}}_a^L[lA]$ contain the vector $\underline{X}_a^L[lA]$ as main diagonal, so for $S > a \ge 0$

$$\underline{X}_a^L[lA] = \mathrm{diag}\{\underline{\underline{X}}_a^L[lA]\}.$$

An overlap-save method is used to compute the correlation involved in the adaptation process in the frequency domain. The frequency domain transform of the residual signal vector equals [equation not reproduced]. The set of update equations for the filter coefficients in the MFDAF (Multiple Input Frequency Domain Adaptive Filter) algorithm can now be defined for $g_u > i \ge 0$ by [equations not reproduced], and the transformation matrix $G^{S \cdot Z, S \cdot L}$ is given by [equation not reproduced].

The input channel's power matrix $P^{S \cdot L}[lA]$ is defined by [equation not reproduced]. The expectation operator $\mathcal{E}\{\cdot\}$ of the above equation has to be replaced by an estimation routine.

The power matrix $P^{S \cdot L}[lA]$ can be estimated by [equation not reproduced]. To reduce the number of multiplications, the step size parameter $\alpha$ of the update equation is incorporated in the above power estimation routine by defining [equation not reproduced]. Estimation of the power matrix $P_\alpha^{S \cdot L}[lA]$ can then be done by [equation not reproduced].

Direct application of this algorithm leads to stability problems. When the input signal power in a certain frequency bin is very small, the power estimate in that bin will decrease to a (very) small value. The inverse of the matrix will then have large values and will be inaccurate (due to numerical and estimation errors).

In the ideal case, the eigenvalues of the inverse power matrix estimate cancel the eigenvalues of the input signal power matrix. Due to estimation errors this goal is only approximated, and the mismatch introduces a deviation from the ideal convergence behavior and might even lead to instability. Especially when some of the eigenvalues of the estimate of the inverse power matrix get large, and do not (exactly) cancel a (small) eigenvalue of the input signal power matrix, instability might occur. A lower limit on the eigenvalues of the estimate of the power matrix can solve this problem.

In the single channel case (or when we ignore the cross-terms) we can solve this problem by applying a lower limit to the power values. We can do this because the eigenvalues of a diagonal matrix are equal to the elements of the diagonal, so we actually limit the eigenvalues. In the multiple channel case we also have to limit the eigenvalues to assure stability, but these no longer equal the elements on the diagonal.
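The decay of the power estimate in a quiet frequency bin can be made concrete with a small sketch. The recursion below is an assumption based on claim 5 (a first order recursive network fed with products of the frequency domain inputs and their conjugates); the smoothing constant `gamma`, the placement of the conjugate and the function name are hypothetical and are not taken from the missing equations.

```python
def update_power(P, X, gamma):
    """One assumed first-order recursive update of the S x S power matrix of one bin.

    P[i][j] <- (1 - gamma) * P[i][j] + gamma * conj(X[i]) * X[j]
    X holds the S channel spectra at this frequency bin.
    """
    S = len(X)
    return [[(1 - gamma) * P[i][j] + gamma * X[i].conjugate() * X[j]
             for j in range(S)] for i in range(S)]

# Two channels, one bin: a single active block followed by silence.
P = [[0j, 0j], [0j, 0j]]
P = update_power(P, [1 + 1j, 2 + 0j], gamma=0.5)
P_after_signal = P
for _ in range(3):
    P = update_power(P, [0j, 0j], gamma=0.5)   # no input power in this bin
P_after_silence = P
```

After the silent blocks every element has shrunk by the factor $(1 - \gamma)^3$, so any inverse computed from this bin grows accordingly, which is the instability mechanism described above.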

We know, however, that all eigenvalues of the power matrix are positive. We can now create a lower limit on the eigenvalues by shifting them by the suggested minimum. We know that for all eigenvalues $\lambda$ of a matrix $A$, the determinant of $A - \lambda \cdot I$ must be zero. So for all $\lambda'$, eigenvalue of $A + P_{min} \cdot I$, there must be a $\lambda$, eigenvalue of $A$, such that

$$\lambda' = \lambda + P_{min}$$

(and the other way around). This means that by adding a constant $P_{min}$ to the main diagonal of a matrix, all eigenvalues of that matrix are shifted by $P_{min}$. So we define [equation not reproduced], which results in [equation not reproduced].

The effect of this shifting of the eigenvalues on the (theoretical ideal) convergence behaviour of the algorithm will be very small, and in practice the algorithm is much more stable.
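The eigenvalue shift can be verified numerically for a single frequency bin with two channels. This is only an illustrative sketch: the closed-form eigenvalues apply to a 2 x 2 Hermitian matrix, and the names `eig_hermitian_2x2`, `add_floor` and `p_min` are hypothetical.

```python
import math

def eig_hermitian_2x2(P):
    """Eigenvalues (ascending) of a 2 x 2 Hermitian matrix [[a, b], [conj(b), d]]."""
    a, d = P[0][0].real, P[1][1].real
    mean = (a + d) / 2
    radius = math.sqrt(((a - d) / 2) ** 2 + abs(P[0][1]) ** 2)
    return (mean - radius, mean + radius)

def add_floor(P, p_min):
    """Add the constant p_min to every element of the main diagonal."""
    n = len(P)
    return [[P[i][j] + (p_min if i == j else 0.0) for j in range(n)] for i in range(n)]

# Rank-one power matrix of two fully correlated channels: one eigenvalue is zero.
P = [[1 + 0j, 1 - 1j], [1 + 1j, 2 + 0j]]
before = eig_hermitian_2x2(P)
after = eig_hermitian_2x2(add_floor(P, 0.1))
```

The off-diagonal terms are untouched, yet both eigenvalues move up by exactly `p_min`, so the smallest eigenvalue is bounded away from zero even though the matrix is not diagonal.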

Although $P^{S \cdot L}[lA]$ is a sparse matrix, computing its inverse would still require $L$ inversions of $S \times S$ matrices, which takes in the order of $L \cdot S^3$ operations. However, as we do not need the inverse itself, but only its matrix-vector product with the input signals, we can also look at it as solving the system [equation not reproduced], which requires in the order of $L \cdot S^2$ operations. Another option is to estimate the inverse of $P^{S \cdot L}[lA]$ directly, which also results in a number of operations proportional to $L \cdot S^2$.
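Treating the product of the inverse power matrix with the input as a linear system can be sketched per frequency bin as follows. The solver below is plain Gaussian elimination with partial pivoting, chosen here only for illustration; it is not claimed to be the method of the application or to reach the operation count stated above, and the function name and the example values are hypothetical.

```python
def solve(P, x):
    """Solve P * y = x for one bin (P is S x S, x has S entries) without forming inv(P)."""
    n = len(x)
    A = [list(row) + [x[i]] for i, row in enumerate(P)]   # augmented matrix [P | x]
    for c in range(n):
        # Partial pivoting: bring the largest remaining entry of column c to the diagonal.
        p = max(range(c, n), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(c + 1, n):
            f = A[r][c] / A[c][c]
            for k in range(c, n + 1):
                A[r][k] -= f * A[c][k]
    y = [0j] * n
    for r in range(n - 1, -1, -1):                        # back substitution
        s = sum(A[r][k] * y[k] for k in range(r + 1, n))
        y[r] = (A[r][n] - s) / A[r][r]
    return y

# One bin of a two-channel power matrix (with a small diagonal floor) and an input vector.
P = [[1.1 + 0j, 1 - 1j], [1 + 1j, 2.1 + 0j]]
x = [1 + 0j, 2j]
y = solve(P, x)
```

Running this once per frequency bin yields the decorrelated input for all $L$ bins; the full $(S \cdot L) \times (S \cdot L)$ matrix never has to be inverted or even stored densely.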

However, also in this case we have to limit the eigenvalues to assure stability.

A simple algorithm is given by [equation not reproduced].

We can incorporate $\alpha$, which results in [equation not reproduced].

The above algorithm does not guarantee a matrix $(P_\alpha^{S \cdot L}[lA])^{-1}$ with positive eigenvalues, and therefore introduces a lot of stability problems. In the single channel case we are able to stabilize the algorithm by using a lower limit on the estimate, which automatically results in positive eigenvalues because the matrix is diagonal, but that is not possible in the multiple channel case.

Positive eigenvalues

An exact transformation of an algorithm for estimating $P_\alpha^{S \cdot L}[lA]$ with positive eigenvalues will lead to an estimation algorithm for the inverse with positive eigenvalues. This can be done by using the matrix inversion lemma. When there is a matrix $A$ such that

$$A = B + C \cdot D \cdot E \quad (15)$$

then the inverse matrix $(A)^{-1}$ of $A$ can be expressed by

$$(A)^{-1} = (B)^{-1} - (B)^{-1} \cdot C \cdot \left( (D)^{-1} + E \cdot (B)^{-1} \cdot C \right)^{-1} \cdot E \cdot (B)^{-1}. \quad (16)$$

By choosing [equations not reproduced] we obtain, by using equation (14), [equations not reproduced]. Algorithm (19) involves no matrix inversion, and only $L/2 + 1$ divisions, as the matrix of equation (20) is a real valued diagonal matrix.

Limits on the eigenvalues

An operation on the inverse power matrix that is equivalent to adding a constant to the diagonal of the (non-inverse) power matrix would solve the problem. Adding a full $(S \cdot L) \times (S \cdot L)$ identity matrix and trying to find an equivalent operation on the inverse power matrix with the matrix inversion lemma results in an algorithm that requires the matrix inversion we would like to avoid, so we try [equation not reproduced].

As the matrix $I^{S \cdot L}$ has rank $S \cdot L$ and the product matrices $(X_{lim}^{S \cdot L, L}[lA])^* \cdot (X_{lim}^{S \cdot L, L}[lA])^t$ and $(X^{S \cdot L, L}[lA])^* \cdot (X^{S \cdot L, L}[lA])^t$ both have a rank of (at most) $L$, this is not possible for $S > 1$. As we need the average of $(X^{S \cdot L, L}[lA])^* \cdot (X^{S \cdot L, L}[lA])^t$, we can find a solution by taking the average over $S$ consecutive updates. We will try to find $X_{lim}^{S \cdot L, L}[lA]$ such that [equation not reproduced], with $i = l \bmod S$.

For $S = 1$ we get [equation not reproduced]. For $S > 1$ there are an infinite number of solutions. If we try to keep the maximum distortion (the largest matrix element) as small as possible, we have to choose, for all $S > j \ge 0$ and $S > i \ge 0$, [equation not reproduced]. If there is a real symmetric matrix $U^L$ for $S = L$, then a real symmetric matrix $U^{2L}$ for $S = 2L$ is given by [equation not reproduced]. Using the above equation we can construct all $U^{2^i}$ with $i > 0$. If $S + 1$ is not a power of two, then we will use the matrix $U^{2^i}$ where $2^i > S + 1 > 2^{i-1}$, and use the last $S$ rows. In table 1 the power matrix estimation algorithm using a direct inverse estimation with limits on the eigenvalues is summarized.

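Initialization
1. $S_u = 2^{\lceil \log_2 (S+1) \rceil}$
2. for $i = 1$ to $\log_2(S_u)$ do begin [loop body not reproduced] end
3. Initialize power matrix.

Iteration
4. Calculate $(P_\alpha^{S \cdot L}[lA])^{-1}$ [update equations not reproduced]

Table 1: Direct inverse power update with limits.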
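The direct inverse update summarized in Table 1 rests on the matrix inversion lemma of equation (16). As a sanity check of that lemma, the sketch below applies it to the simplest case, assumed here for illustration only: $B$ diagonal, $C$ a column vector, $D$ a scalar and $E$ a row vector, so that the inner inverse reduces to one scalar division. The function names and the example values are hypothetical and do not correspond to equations (17) to (20).

```python
def lemma_inverse_rank_one(b_diag, c, d, e):
    """Inverse of A = B + c * d * e^t with B = diag(b_diag), using equation (16)."""
    n = len(b_diag)
    b_inv = [1.0 / b for b in b_diag]
    # Scalar term (D^-1 + E * B^-1 * C): the only "inversion" left is a division.
    s = 1.0 / d + sum(e[i] * b_inv[i] * c[i] for i in range(n))
    return [[(b_inv[i] if i == j else 0.0) - b_inv[i] * c[i] * e[j] * b_inv[j] / s
             for j in range(n)] for i in range(n)]

def matmul(A, B):
    """Plain matrix product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

b_diag, c, d, e = [2.0, 4.0], [1.0, 2.0], 0.5, [1.0, 2.0]
A = [[(b_diag[i] if i == j else 0.0) + c[i] * d * e[j] for j in range(2)] for i in range(2)]
A_inv = lemma_inverse_rank_one(b_diag, c, d, e)
product = matmul(A, A_inv)   # should be close to the identity matrix
```

Because the previous inverse plays the role of $(B)^{-1}$, an update of this form never inverts a full matrix, which is why a recursive estimate of the inverse can be cheaper than inverting the power matrix at every block.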

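Note that the inverse of $P^{S \cdot L}[lA]$ is also a sparse matrix with the same structure, and we define [equation not reproduced], where for $0 \le i < S$ and $0 \le j < S$ each block is a diagonal matrix whose main diagonal is given by a corresponding vector [equation not reproduced].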

Figure 5 shows schematically an example of a teleconferencing system TS5 using a stereo echo canceller SEC5 with adaptive filters AF5 (only one shown). The teleconferencing system comprises a far room FR5 and a near room NR5. The adaptive filter AF5 has to filter the stereo echo signals.

Figure 6 shows an example of a stereo echo canceller SEC6 used in a teleconferencing system TS6. The stereo echo canceling has to be performed between the near room NR6 and the far room FR6. In this example also programmable filters PF61 and PF62 are used to improve the performance of echo canceling. Programmable filters are described in US-A-4,903,247.

Further also the output of the programmable filters is supplied to a dynamic echo suppressor DES6, which is coupled with an output to the output of the stereo echo canceller. Dynamic echo suppressors are described in WO-97-45995.

A full stereo communication requires four stereo AECs, two on the near end side and two on the far end side. In figure 6 only one of those echo cancellers is depicted. Note that on each side we can combine the input signal delay-lines, the FFTs and the multiplication by the inverse power matrix of the two echo cancellers, which implies that the relative extra computational complexity for removing the cross-correlation is even further reduced. The performance of the AECs is further improved by adding Dynamic Echo Suppressors as shown.

Figure 7 shows another application wherein a stereo echo canceller SEC7 is used in a voice controlled audio (and video) system VCS7. To be able to recognize the local speaker by a voice recognition engine we have to cancel the sound emitted by the audio set through the loudspeakers. This is done by using the stereo echo canceller SEC7. To improve the stereo echo canceling, the programmable filters PF71 and PF72 and the Dynamic Echo Suppressor DES7 are also used in this example. The output of the Dynamic Echo Suppressor is coupled to a voice recognizer VR7 for handling the filtered signal.

Figure 8 shows an example of a noise canceller NC8 for canceling the noise received on microphones in a room R8 together with a speech signal sp1 from a person in the room. In this example the microphones supply signals to a beamformer BF8, which supplies signals to the noise canceller NC8 and to programmable filters PF81, PF82 and PF83. Further the noise canceller comprises a dynamic echo suppressor DES8. The output of the Dynamic Echo Suppressor is coupled to the output of the noise suppressor to supply an estimate of the received speech sp2.

Also in the multiple input noise canceller we can apply a DES (which is in fact not suppressing an echo, but is similar to the DES in the AEC case) and programmable filters to improve performance, as shown in figure 8. An extra problem is that the inputs of the filters may contain some elements of the desired signal ("signal leakage"), because the beamformer is not perfect. When the desired signal is a speech signal, a speech detector can be used to improve the behaviour of the MFDAF.

Above, some examples of applications of a stereo echo canceller and of a noise canceller are described. It should be noted that the invention can be used in different applications and is not restricted to the described applications.