SIGNAL PROCESSING APPARATUS AND METHOD

Title:

SIGNAL PROCESSING APPARATUS AND METHOD

Document Type and Number:

WIPO Patent Application WO/1992/015150

Kind Code:

A1

Abstract:

The invention provides a real-time (or non real-time) signal processing apparatus which enables the accurate and robust discrimination, separation and suppression (or detection) of one or more component random signals/variables which comprise an unknown proportion of a composite random signal (or statistical mixture/ensemble) contained within a finite bandwidth, wherein the component random signals/variables are characterised by having different (and not necessarily known) statistical attributes, e.g. cumulative probability distribution functions. The signal processing method and apparatus, which are suitable, for example, for audio noise reduction, divide a digital signal into successive sets of data and derive from each set a power spectrum. By sorting the power spectrum into rank order of magnitude, a point (Pd) can be found along the rank ordered data set (Pr) where a change of gradient, or rate of change of magnitude with element number, is detectable in the case where the signal is a composite signal containing both signal and noise; the elements having the lower gradient represent the noise component. The original signal can therefore be processed to suppress or zero those elements which have been identified as noise.

Inventors:

SUMMERS BRENT EDWIN (GB)

Application Number:

PCT/GB1992/000232

Publication Date:

September 03, 1992

Filing Date:

February 07, 1992

Assignee:

DSP CONSULTANTS LTD (GB)

International Classes:

H03H17/02; (IPC1-7): H03H17/02

Other References:

INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING vol. 3, 26 May 1989, GLASGOW, GB pages 1401 - 1404; ZAMPERONI: 'Variations on the rank-order filtering theme for grey-tone and binary image enhancement'
INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING vol. 2, 26 May 1989, GLASGOW, GB pages 1349 - 1352; BILGUTAY: 'Frequency diverse statistic filtering for clutter suppression'

Claims:

1.	Signal processing apparatus for use in discriminating between the component random signals, having different statistical distributions, in a composite signal consisting of a summation of two or more such random signals, comprising means for acquiring from a sequence of digital samples of said signal successive sets each containing a predetermined number of samples, and processing means for processing each successive set in turn, said processing means comprising calculating means for weighting and transforming the set of samples into a discrimination space and for computing a representation of the magnitude of the signal representation within the discrimination space, ordering means for sorting the discrete values or elements of the signal magnitude representation within the discrimination space to yield the data arranged in rank order, discriminating means for identifying a discrimination point or points within said rank ordered set, and thresholding means for dividing said rank ordered set in accordance with said discrimination point or points to form two or more subsets, the subsets substantially representing said component random signals of said composite signal, and means for identifying in and extracting from the original set of digital signal samples or the transformed set before rank ordering those elements which correspond to at least one of said component signals.

2.	Apparatus according to Claim 1, wherein the discriminating means is arranged to identify a discrimination point or points within said rank ordered set at which there is a significant change in gradient.

3.	Apparatus according to Claim 2, wherein the ordering means is arranged to sort the data set into ascending order of magnitude, and the discriminating means comprises means for calculating for each element in the sorted set an amplitude-independent discrimination sequence: $$g(d) = \frac{d \cdot P_{(r=d)}}{\sum_{r=0}^{d} P_{(r)}}, \qquad d = 0, 1, \ldots, N-1$$ wherein d is the index number of the magnitude data element in the rank ordered sequence, P is the magnitude and $\{P_{(r)}\}_{r=0}^{N-1}$ is the rank ordered data sequence, and means for identifying the subset $\{P_{(r)}\}_{r=0}^{d}$ representing those elements substantially from one of said component signals, such that, in the process of computing the said discrimination sequence g(d), a value of d can be determined such that g(d) is less than or equal to a predetermined value and g(d+1) is greater than said predetermined value.

4.	Apparatus according to Claim 3, wherein said predetermined value is in the range 1.9 to 2.4 for a magnitude representation.

5.	Apparatus according to Claim 2, 3 or 4, which uses the magnitude P_(r=d) of the element at the discrimination point d as a basis of a thresholding process.

6.	Apparatus according to any of Claims 2 to 5, comprising adjustment means arranged to adjust a discrimination point adaptively according to a statistical confidence factor derived from the angular difference or equivalent thereof in gradients of adjacent subsets.

7.	Apparatus according to any preceding claim, wherein the discrimination space is the frequency domain and the calculating means is arranged to perform a Discrete or Fast Fourier Transform on said set, whereby said representation is the magnitude or power spectrum.

8.	Apparatus according to any preceding claim, comprising means for suppressing those elements which correspond to a selected one of the subsets, and means for reconstructing the original composite signal with one, or at least one, of the component random signals substantially suppressed.

9.	Apparatus for increasing the signal-to-noise ratio of an analogue signal containing additive random noise, comprising means for receiving a stream of digital samples of said signal and signal processing apparatus according to Claim 8, the suppressing means being arranged to suppress those elements which correspond to the subset having the lower gradient (for ascending order ranking) substantially representing the random noise component of said analogue signal.

10.	Apparatus according to Claim 9, comprising means for the real-time design of an adaptive filter for suppressing the data elements within a selected subset representing an unwanted component signal of the composite signal and means for applying the designed filter only to the original set of data, appropriately delayed, on which the analysis was conducted.

11.	Apparatus according to Claim 10, comprising means for temporally monitoring the contents of each frequency band in the wanted signal subset and for suppressing any element which is not present within the same frequency band for a predetermined number C of successive sets of data.

12.	Apparatus according to Claim 11, wherein means are provided for selectively varying the value of C.

13.	Apparatus according to any of Claims 10 to 12, comprising means for monitoring the output of each frequency band exiting the thresholding process and for detecting a transition from noise to signal over successive sets of data, and means for controlling the adaptive filter amplitude for the said frequency band such that the filter amplitude of that frequency band in each of the preceding B sets of data before the detected transition is increased by a small increment such that the filter amplitude has reached unity value by the noise to signal transition point.

14.	Apparatus according to Claim 13, wherein means are provided for selectively varying the increment size and the value of B.

15.	Apparatus according to any of Claims 10 to 14, comprising means for monitoring the output of each frequency band exiting the thresholding process and for detecting a transition from signal to noise over successive sets of data, and means for reducing the adaptive filter amplitude for each of D data sets subsequent to the signal to noise transition from unity to some predetermined maximum suppression level.

16.	Apparatus according to Claim 15, wherein means are provided for selectively varying the filter amplitude decay rate and function.

17.	Apparatus according to Claim 15 or 16, wherein means are provided for selectively varying the maximum suppression level between 1 and 0.

18.	Apparatus according to Claim 15, 16 or 17, wherein means are provided for selectively varying the value of D.

19.	Apparatus according to any of Claims 8 to 18, comprising means for converting said analogue signal into said stream of digital samples.

20.	Apparatus according to any of Claims 1 to 6, comprising means for selecting those samples in the transformed set which correspond to a selected one of the subsets, and means for further processing of said selected samples.

21.	Apparatus for detecting a desired signal in a composite signal comprising the desired signal and random noise, comprising means for receiving a stream of digital samples of said composite signal, and signal processing apparatus according to Claim 20 for processing said samples, wherein said further processing means is adapted to carry out a detection decision based on an ordered statistics estimate of the noise component of said composite signal.

22.	Apparatus according to any preceding claim, wherein digital to analogue converter means are provided for converting the processed digital signal representation back into an analogue form.

23.	Apparatus according to any preceding claim, wherein the acquiring means is arranged such that the successive sets of samples overlap.

24.	Apparatus according to Claim 23, wherein the amount of overlap is at least 50%.

25.	A signal processing method for use in discriminating between the component random signals, having different statistical distributions, in a composite signal consisting of a summation of two or more such random signals, comprising: (a) acquiring from a sequence of digital samples of said signal a set containing a predetermined number of samples; (b) weighting and transforming the set of said signal samples into a discrimination space; (c) computing a representation of the magnitude of the signal representation within the discrimination space; (d) sorting discrete values of the signal magnitude representation within the discrimination space to yield the data arranged in rank order; (e) identifying a discrimination point or points within said rank ordered set; (f) dividing said rank ordered set in accordance with said discrimination point or points to form two or more subsets, the subsets substantially representing said component random signals of said composite signal; (g) identifying in or extracting from the original set of digital signal samples or the transformed set before rank ordering those elements which correspond to at least one of said component signals; and (h) repeating steps (a) to (g) with successive sets of samples.

26.	A method according to Claim 25, wherein the or each discrimination point is a point at which there is a significant change in gradient of the rank ordered set.

27.	A method according to Claim 26, comprising sorting the data set into ascending order of magnitude and calculating for each element in the sorted set an amplitude-independent discrimination sequence: $$g(d) = \frac{d \cdot P_{(r=d)}}{\sum_{r=0}^{d} P_{(r)}}, \qquad d = 0, 1, \ldots, N-1$$ wherein d is the index number of the magnitude data element in the rank ordered sequence, P is the magnitude and $\{P_{(r)}\}_{r=0}^{N-1}$ is the rank ordered data sequence, and identifying the subset $\{P_{(r)}\}_{r=0}^{d}$ representing those elements substantially from one of said component signals, such that, in the process of computing the said discrimination sequence g(d), a value of d can be determined such that g(d) is less than or equal to a predetermined value and g(d+1) is greater than said predetermined value.

28.	A method according to Claim 27, wherein said predetermined value is in the range 1.9 to 2.4 for a magnitude representation.

29.	A method according to Claim 26, 27 or 28, comprising using the magnitude P_(r=d) of the element of the discriminating point d as a basis of a thresholding process.

30.	A method according to any of Claims 26 to 29, wherein the discrimination space is the frequency domain and the transforming step is performed by means of a Discrete or Fast Fourier Transform, whereby said power representation is the magnitude or power spectrum.

31.	A method according to any of Claims 26 to 30, comprising adaptively adjusting the discrimination point according to a statistical confidence factor derived from the angular difference or equivalent thereof in gradients of adjacent subsets.

32.	A method according to any of Claims 26 to 31, comprising suppressing those elements which correspond to a selected one of the subsets, and reconstructing the original composite signal with one, or at least one, of the component random signals substantially suppressed.

33.	A method for increasing the signal-to-noise ratio of an analogue signal containing additive random noise, comprising receiving a stream of digital samples of said signal, and processing said samples according to the method specified in Claim 32, wherein the elements suppressed are those which correspond to the subset having the lower gradient (for ascending order ranking) substantially representing the random noise component of said analogue signal.

34.	A method according to Claim 33, comprising designing in real-time an adaptive filter for suppressing the data elements within a selected subset representing an unwanted component signal of the composite signal and applying the designed filter only to the original set of data, appropriately delayed, on which the analysis was conducted.

35.	A method according to Claim 34, comprising temporally monitoring the contents of each frequency band of each subset of processed samples corresponding to the desired signal, and suppressing any element in the subset which is not present within the same frequency band for a predetermined number of successive such subsets of data.

36.	A method according to Claim 35, wherein said predetermined number is selectively variable.

37.	A method according to Claim 34, 35 or 36, comprising monitoring the output of the thresholding process for each frequency band in each subset of processed samples and detecting in said frequency band a transition from noise to signal over successive such subsets of data, and adjusting the adaptive filter amplitudes for the frequency band such that the filter amplitude in the frequency band in each of the preceding B sets of data before the detected transition is increased by a small increment such that the filter amplitude has reached unity value by the noise to signal transition point.

38.	A method according to Claim 37, wherein the increment size and the value of B are selectively variable.

39.	A method according to any of Claims 34 to 38, comprising monitoring the output of the thresholding process for each frequency band in each subset of processed samples and detecting a transition from signal to noise over successive sets of data, and reducing the adaptive filter amplitude for each of D data sets subsequent to the signal to noise transition from unity to some predetermined maximum suppression level.

40.	A method according to Claim 39, comprising selectively varying the filter amplitude decay rate and function.

41.	A method according to Claim 39 or 40, comprising selectively varying the maximum suppression level between 1 and 0.

42.	A method according to Claim 39, 40 or 41, comprising selectively varying the value of D.

43.	A method according to any of Claims 33 to 38, comprising converting said analogue signal into said stream of digital samples.

44.	A method according to any of Claims 25 to 43, wherein the successive sets of samples are selected so as to overlap.

45.	A method according to Claim 44, wherein the amount of overlap is at least 50%.

46.	A method according to any of Claims 25 to 31, comprising selecting those samples in the transformed set which correspond to a selected one of the subsets, and further processing said selected samples.

47.	A method for detecting a desired signal in a composite signal comprising the desired signal and random noise, comprising receiving a stream of digital samples of said signal, and processing said samples according to the method specified in Claim 46, said further processing comprising carrying out a detection decision based on an ordered statistics estimate of the noise component of said composite signal.

48.	An audio recording, when processed by a method according to any of Claims 33 to 45.

Description:

This invention relates to a signal processing apparatus and method for use in discriminating between the component random signals, having different statistical distributions, in a composite signal consisting of a summation of two or more such random signals.

The invention enables the accurate and robust discrimination, and preferably separation and suppression (or detection), of one or more component random signals. One preferred embodiment of the apparatus and method is, for example, for improving the signal-to-noise ratio of a band-limited signal, such as audio.

For an audio noise reduction system, advantage may be taken of the inherent spectral redundancy associated with audio signals. The audio frequency spectrum is divided into spectrally adjacent frequency bands which are independently deactivated (or suppressed) according to the instantaneous spectral occupancy of the audio signal. By deactivating the frequency bands which do not contain signal, net bandwidth reduction occurs, resulting in an overall improvement in signal-to-noise ratio (S/N). The "quality" of such a system is essentially determined by the accuracy with which the noise can be discriminated from the signal. If frequency bands containing signal are mistaken for frequency bands containing noise, their subsequent rejection or suppression will result in audible distortion. Conversely, if frequency bands containing noise are mistaken for frequency bands containing signal, their inclusion will degrade the potentially available S/N improvement.

In general, both audio signals and noise may be considered as random signals. Therefore, the composite signal resulting from audio plus noise, i.e. noisy audio, may also be considered as a random signal consisting of the sum of two random signals from different and generally unknown (especially in the short term) statistical distributions. Thus, the primary mission of an audio noise reduction apparatus is to accurately separate the two random signal components from an unknown statistical mixture, and then remove or suppress the noise component to give a noise-reduced signal.

It has now been found that if the results of an N-point discrete transform of such a sampled composite signal or statistical mixture (e.g. the power spectra of a temporal section of a noisy audio signal) are considered as a set of random numbers and are subsequently arranged in magnitude (or rank) order, as illustrated in Figure 1 for example, there is a certain point (d) along the rank ordered data set where a noticeable change in the rate of change of magnitude difference between adjacent data elements (or gradient) is evident. For a signal composed of the summation of two random signals from different statistical distributions, it has been realised that this point in the data set represents the discrimination point between elements of a component signal substantially from one distribution and those from the other. For an ascending order ranking, data elements to the left of the point are largely from one component signal (e.g. noise), and data elements to the right of the point, which exhibit a higher gradient, are substantially from the other component signal, e.g. audio.

The invention provides a real-time (or non real-time) signal processing apparatus which enables the accurate and robust discrimination, separation and suppression (or detection) of one or more component random signals/variables which comprise an unknown proportion of a composite random signal (or statistical mixture/ensemble) contained within a finite bandwidth, wherein the component random signals/variables are characterised by having different and not necessarily known statistical attributes, e.g. cumulative probability distribution functions.

Preferably, the invention achieves the aforementioned by providing means for accepting an already digitised composite signal, or alternatively for digitising an incoming composite analogue signal, such that the samples are a 'unique representation' (in the sampling theorem sense, i.e. taken at a rate appropriate to the signal's bandwidth), and also represented by an appropriate level of quanta, means for collecting (overlapped) batches of N successive signal samples, means for storing the batch of N signal samples for future use, means for transforming each batch of the said signal samples into a 'discrimination space' (with appropriately controlled sidelobes) such as, for example, the 'frequency domain' by means of a weighted Discrete or Fast Fourier Transform or some other transform such as a Cosine, Hartley, Bracewell or Walsh, means for computing a representation of the magnitude of the signal representation within the discrimination space, e.g. the modulus (or power spectrum), means for sorting the discrete values of the signal magnitude representation within the discrimination space (e.g. the line spectra) into ascending or descending rank order, means for identifying a discrimination point (or points) within the said rank ordered data set, preferably by identifying a point or points at which there is a significant change in gradient, and means for separating the data elements to one side (or in between) of the identified discrimination point (or points) whereby the data elements to one side (or in between) of the discrimination point (or points) substantially represent elements comprising one of the component signals.

Means may also be provided for identifying and retaining or for identifying and suppressing those data elements to one side of, or in between, the discrimination point (or points) in the original un-sorted data set representing the elements of a component signal (or signals) within the composite signal represented in the discrimination space. Further means may output the data elements representing a component signal (in the case of a signal detection type system) or use the data as a basis for the real-time design of an adaptive filter which is subsequently applied to the original (but delayed) batch of N signal samples previously stored, yielding a component signal suppressed output (e.g. noise suppressed), and means may be provided for reconstructing the processed output samples back into analogue form or for outputting the processed data directly in digital form. The aforementioned method performed by the apparatus is repeated on successive blocks of data with preferably 50% or more block overlap, and means may be provided for removing data block end effects (by, for example, interpolation or incorporating an attack and decay function on the filter amplitudes block to block).
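The analysis front end described above (overlapped batches, weighting, transform, magnitude, rank ordering) might be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, a Hann weighting is assumed as one possible way of controlling sidelobes, and a naive discrete Fourier transform is used for brevity where a practical system would use a Fast Fourier Transform.

```python
import cmath
import math


def frames(samples, n, overlap=0.5):
    """Yield successive batches of n samples with the given fractional overlap."""
    hop = max(1, int(n * (1.0 - overlap)))
    for start in range(0, len(samples) - n + 1, hop):
        yield samples[start:start + n]


def hann(n):
    """Hann weighting (an assumed choice of window for sidelobe control)."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * k / n) for k in range(n)]


def magnitude_spectrum(batch, window):
    """Magnitude of the weighted DFT for bins 0..n/2 (naive O(n^2) transform)."""
    n = len(batch)
    weighted = [x * w for x, w in zip(batch, window)]
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(weighted[t] * cmath.exp(-2j * math.pi * k * t / n)
                  for t in range(n))
        mags.append(abs(acc))
    return mags


def rank_order(mags):
    """Return the magnitudes in ascending order and the original bin of each rank."""
    order = sorted(range(len(mags)), key=lambda k: mags[k])
    return [mags[k] for k in order], order
```

Keeping the `order` list alongside the sorted magnitudes is what later allows elements on one side of the discrimination point to be located again in the original un-sorted data set.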

Preferably, the method of the invention comprises accepting an already digitised composite signal, or alternatively sampling an incoming composite analogue signal such that the samples are a 'unique representation' (in the sampling theorem sense, i.e. taken at a rate appropriate to the signal's bandwidth), and also represented by an appropriate level of quanta, and then collecting (overlapped) batches of N successive signal samples, storing the batch of N signal samples for future use, performing a transform operation on each batch of the said signal samples such that the composite signal is represented in some new 'discrimination space' (with appropriately controlled sidelobes) such as, for example, the 'frequency domain' by means of a weighted Discrete or Fast Fourier Transform or some other transform such as a Cosine, Hartley, Bracewell or Walsh, calculating a representation of the magnitude of the signal representation within the discrimination space, e.g. the modulus (or power spectrum), sorting the discrete values of the signal magnitude representation within the discrimination space (e.g. the line spectra) into an ascending or descending rank order, identifying a discrimination point (or points) within the said rank ordered data set, preferably by identifying a point or points at which there is a significant change in gradient, separating the data elements to one side (or in between) of the identified discrimination point (or points) whereby the data elements to one side (or in between) of the discrimination point (or points) substantially represent the data elements comprising one of the component signals, identifying and retaining or identifying and suppressing those data elements to one side of the discrimination point (or points) in the original unsorted data set representing the elements of a component signal (or signals) within the composite signal represented in the discrimination space, outputting the data elements representing a component signal (in the case of a signal detection type system) or using the data as a basis for the design of an adaptive filter which is subsequently applied to the original (but delayed) batch of N signal samples previously stored, yielding a component signal suppressed output, e.g. noise suppressed, and reconstructing the processed output samples back into analogue form or outputting the processed data directly in digital form. The aforementioned method is repeated on successive blocks of data with preferably 50% (or more) block overlapping and may suppress data block end effects, for example by interpolation or incorporating an attack and decay function on the filter amplitudes block to block.
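The suppression and reconstruction steps might look like the sketch below. It assumes an ascending ranking in which ranks 0 to d are treated as noise, a simple two-level gain per frequency bin standing in for the adaptive filter, and plain overlap-add of 50% overlapped Hann-weighted blocks. All function names and the `floor` parameter are hypothetical, and the inverse transform between `apply_gains` and `overlap_add` is omitted.

```python
import math


def design_gains(order, d, floor=0.0):
    """Per-bin gain: `floor` for bins ranked at or below d (noise), 1.0 otherwise.

    `order[rank]` is the original bin index of the element at that rank.
    """
    gains = [1.0] * len(order)
    for rank, bin_index in enumerate(order):
        if rank <= d:
            gains[bin_index] = floor
    return gains


def apply_gains(spectrum, gains):
    """Scale each element of the un-sorted spectrum by its designed gain."""
    return [s * g for s, g in zip(spectrum, gains)]


def hann(n):
    """Hann weighting; copies shifted by n/2 sum to one."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * k / n) for k in range(n)]


def overlap_add(blocks, hop):
    """Sum processed time-domain blocks back into one stream, `hop` samples apart."""
    n = len(blocks[0])
    out = [0.0] * (hop * (len(blocks) - 1) + n)
    for i, block in enumerate(blocks):
        for t, value in enumerate(block):
            out[i * hop + t] += value
    return out
```

A `floor` between 0 and 1 would correspond to partial suppression rather than zeroing; varying it gradually from block to block is one way an attack and decay function could be realised.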

Preferably, the step of determining the discrimination point from the rank ordered sequence of the discrimination space representation is achieved by means of calculating an amplitude-independent 'discrimination sequence or function' such as, for example,

$$g(d) = \frac{d \cdot P_{(r=d)}}{\sum_{r=0}^{d} P_{(r)}}, \qquad d = 0, 1, \ldots, N-1 \tag{1}$$

where P(r) is the rank ordered data sequence.

Having computed (or in the process of computing) g(d), a value of d can be determined such that g(d) is equal to (within some specified bounds) a predetermined value and g(d+1) is greater than said predetermined value. For many circumstances, it has been found that, in the case of two component signals, a value of g(d) in the range of 1.9 to 2.4 yields an accurate estimate of the discrimination point (where the sorted data sequence are magnitude values and d is at least 20 to overcome any initial small data set problems). However, depending on the application, other values of g(d) may also be selected which may be more appropriate to the nature of the signals being analysed and/or the specific application to which the method is being applied.
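A minimal sketch of the search for the discrimination point is given below. It assumes that g(d) is the element at rank d multiplied by d and divided by the sum of the elements from rank 0 to rank d, i.e. approximately the ratio of that element to the running mean. That form is a reading of the damaged equation in the source and should be treated as an assumption, as should the function names and the default threshold of 2.0, which is simply a value inside the stated 1.9 to 2.4 range.

```python
def discrimination_sequence(ranked):
    """g(d) = d * ranked[d] / sum(ranked[0..d]) for an ascending sequence.

    The exact form of g(d) is an assumption; being a ratio, it does not
    change when every element of `ranked` is scaled by the same factor.
    """
    g = []
    running = 0.0
    for d, p in enumerate(ranked):
        running += p
        g.append(d * p / running if running > 0 else 0.0)
    return g


def find_discrimination_point(ranked, threshold=2.0, d_min=20):
    """Return the first d >= d_min with g(d) <= threshold < g(d+1), else None."""
    g = discrimination_sequence(ranked)
    for d in range(d_min, len(g) - 1):
        if g[d] <= threshold < g[d + 1]:
            return d
    return None
```

Starting the search at `d_min` reflects the remark that d should be at least 20 to avoid small data set problems; returning `None` covers the case in which no discrimination point is detected.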

For the majority of applications, the accuracy with which the method described hereinbefore determines the discrimination point will be sufficient. However, in order to obtain even greater separation accuracy it has also been realised that the angle θ (or some equivalent measure thereof) subtended between the two gradients either side of the discrimination point - see Figure 1 - is a direct measure of the 'difference' between the two statistical distribution functions from which the two component signals can be considered as originating. In other words, it is a direct measure of "statistical confidence" in the particular data set being analysed.
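One possible way of estimating the angle θ is sketched below. The source does not say how the two gradients are measured, so this sketch assumes a least-squares line fitted to the rank ordered elements on each side of d, with both axes normalised to the range 0 to 1 so that the angle does not depend on amplitude or on N. The function names and the normalisation are assumptions made purely for illustration.

```python
import math


def slope(xs, ys):
    """Least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        return 0.0
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx


def gradient_angle(ranked, d):
    """Angle in degrees between fitted gradients either side of rank d."""
    n = len(ranked)
    if d < 1 or d > n - 3:
        raise ValueError("need at least two elements on each side of d")
    peak = max(ranked) or 1.0
    xs = [i / (n - 1) for i in range(n)]   # normalised rank index
    ys = [p / peak for p in ranked]        # normalised magnitude
    left = slope(xs[:d + 1], ys[:d + 1])
    right = slope(xs[d + 1:], ys[d + 1:])
    return math.degrees(math.atan(right) - math.atan(left))
```

A value near 0° would then indicate little difference between the two subsets, and a value approaching 90° a large difference; dividing the angle by 90 gives one crude confidence figure between 0 and 1.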

Since ultimately a decision has to be made as to which data elements are to be declared as belonging to a particular component signal (or distribution), the statistical confidence factor can be used to bias or weight the decision made. Thus, if θ tended towards 90°, this would signify that the two data sets originate from very different statistical distribution functions and consequently the confidence in the decisions made about which elements belong to which component signal (or distribution) would be high, i.e. tending towards 100%. If, however, θ tended towards 0°, then this would signify that the two data sets originate from very similar statistical distribution functions and confidence in the subsequent decisions would be very low, i.e. tending towards 0%. Since in audio noise reduction systems, for example, it would be generally preferable to degrade the potentially available S/N improvement rather than cause audible distortion, the lower the confidence in a particular set, the more it might be desirable to bias the discrimination point towards the left, i.e. to discard fewer of the supposed noise samples. If no discrimination point has been detected, i.e. θ = 0°, then this implies that only one distribution is present, or that if two distributions are present, it is impossible to discriminate between them.

To explain this further, suppose a temporal section of a composite signal comprising signal plus noise and having a positive signal-to-noise ratio is sampled, weighted, Discrete Fourier transformed and modulus-squared in order to obtain the power spectrum. In addition, suppose that the amplitude of the composite signal is adjusted such that, regardless of the temporal section selected, the peak value of the composite power spectrum is never more than the value M. If the probability density function of the signal plus noise were plotted, then it would have, for example, a 'normal distribution' centred on M/x, as shown in Figure 2(a). That is to say, the magnitudes of the signal power spectra would be most likely to have a value near M/x, whereas the likelihood of their having a value of 0 or M is small by comparison. Since the composite signal has a positive signal-to-noise ratio, the mean noise magnitude will be located at some lower value, say M/n.x, where n is some positive constant. If the probability density function for the noise alone were plotted, it would have, for example, a Rayleigh distribution with mean M/n.x as also shown in Figure 2(a). It will be seen that magnitude P_(r=d) is the magnitude of the intersect point of the two distributions, and defines that magnitude of the composite power spectrum which is equally likely to belong to either distribution, and corresponds directly to the gradient transition magnitude in the rank ordered sequence hereinbefore described.

Similarly, Figure 2(b) illustrates the cumulative distribution functions (cdf) for the same scenario. It can be seen from this diagram that the gradient transition point in the rank ordered data sequence is also equivalent to the magnitude at which there is maximum difference between the cdf of noise alone and the cdf of signal plus noise. This diagram in particular is indicative of why the method herein described is so robust. If magnitude P_(r=d) is declared as the threshold value, then everything to the left of the line in the case of an ascending ranking is deemed to be noise and as such is either discarded or suppressed, leaving what is substantially signal to the right of the line. However, it can be seen from Figure 2 that in both sections of the density functions either side of the threshold value there is a residual component of the other distribution. This is an inevitable consequence of any thresholding system, as is well known.

For the case where the system is a noise rejection system, it is possible to define two statistical noise rejection parameters, i.e. the probability of detecting noise (Pdn), and the probability of falsely declaring signal as noise (pfs). Pdn is specified as a percentage, e.g. 90%, whereas pfs can be specified either as a percentage or as a ratio. Here we shall adopt the ratio notation e.g. 10-6, read as "one noise declaration in a million will be a falsely declared signal". With reference to Figure 2(a), Pdn and pfs are equivalent to the areas under the curves as indicated by the shadings (Note that the total area under each curve is normalised to unity) . In the same way that signal detection systems trade between the probability of detection and the probability of a false alarm (pfa), final selection of a threshold is a trade-off between Pdn and pfs. For example, it can be seen from Figure 2(a) that if the threshold line at magnitude P - _r- ₎ were moved towards the right of the illustration, pfs will increase more rapidly than Pdn. That

is to say, in order to increase the probability of correctly detecting noise, the probability of falsely declaring a signal as noise also increases, but at a much faster rate. It can be seen therefore that for optimum performance the correct and robust choice of the threshold point is critical.

As illustrated, Figure 2 implies that the probability density functions of both signal plus noise and noise alone are well-behaved. Such circumstances are only true of long-term statistics, i.e. where the observation time of the sample set is very large. For practical systems, it is necessary to process relatively small data sets. However, the problem with doing this is that short-term statistics are notoriously unreliable, in that the probability density functions distort in an unpredictable manner, due to insufficient data, making it sometimes impossible to discriminate between them. Consequently, this results in a situation wherein the accuracy of, and the confidence in, any decision made about the data set is very low.
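The trade-off between Pdn and pfs can be illustrated numerically. The following is a minimal sketch only: it assumes the example shapes mentioned in the text (a Rayleigh distribution for noise alone and a normal distribution for signal plus noise), and the function names and the numerical means and standard deviation are hypothetical values chosen for illustration, not figures taken from the specification.

```python
import math


def rayleigh_cdf(x, mean):
    """CDF of a Rayleigh distribution parameterised by its mean."""
    if x <= 0.0:
        return 0.0
    sigma = mean / math.sqrt(math.pi / 2.0)  # Rayleigh mean = sigma * sqrt(pi / 2)
    return 1.0 - math.exp(-(x * x) / (2.0 * sigma * sigma))


def normal_cdf(x, mean, std):
    """CDF of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def rejection_stats(threshold, noise_mean, signal_mean, signal_std):
    """Return (Pdn, pfs) for a given threshold magnitude.

    Pdn: area of the noise-alone density to the left of the threshold.
    pfs: area of the signal-plus-noise density to the left of the threshold.
    """
    pdn = rayleigh_cdf(threshold, noise_mean)
    pfs = normal_cdf(threshold, signal_mean, signal_std)
    return pdn, pfs


if __name__ == "__main__":
    # Hypothetical magnitudes: noise mean 1.0, signal-plus-noise mean 8.0.
    for threshold in (2.0, 3.0, 4.0, 5.0):
        pdn, pfs = rejection_stats(threshold, 1.0, 8.0, 1.0)
        print(f"threshold={threshold:.1f}  Pdn={pdn:.6f}  pfs={pfs:.3e}")
```

Under these assumed distributions, moving the threshold to the right raises Pdn only slightly while pfs grows by orders of magnitude, which is the behaviour described for Figure 2(a).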

The rank ordering discrimination process as hereinbefore described includes within the method of the invention a means of extracting an inherent and direct measure of confidence in the data set being analysed. This confidence factor can therefore be used, in the case of audio noise reduction for example, to bias the decision point adaptively in favour of reduced S/N improvement for situations of low confidence, such as poor S/N.

The number of samples processed in a particular batch (N) is preferably not less than 128 for reliable discrimination. It should also be noted that, since the number of points in a Transform process (as well as the window function selected) determines the process coherent gain, the system will work with input signals having negative signal-to-noise ratio, i.e. with more noise than signal.

The apparatus and method of the invention enable the accurate and robust discrimination, separation and suppression (or detection) of one or more component random signals/variables which comprise an unknown proportion of a composite random signal (or statistical mixture/ensemble) contained within a finite bandwidth; wherein the component random signals/variables are characterised by having different (and not necessarily known) statistical attributes, e.g. probability density functions. It has been found that the apparatus is particularly suitable for real-time implementation as the method exhibits robust performance hitherto unattainable by more well-known methods in terms of its separation accuracy under a wide range of input signal conditions. In addition, it has also been found that the method maintains its accuracy performance even with small (in statistical terms) sample sets. It is envisaged that this invention will find application in most types of signal rejection and signal detection systems, particularly when accuracy of the detection/rejection is of prime importance.

It will be appreciated that, since the apparatus and method of the invention process discrete temporal sets of digital samples selected from some arbitrary starting point, repeating the process with a different starting point would afford a further reduction in noise. Thus, a two or more stage system could also be envisaged, although the additional S/N improvement of each additional stage would be increasingly small, since each set would become increasingly correlated. Similarly, multi-channel versions can be envisaged by simple repeats of the process herein described.

Reference is now made to the drawings, in which:

Figure 1 illustrates the rank ordered power spectra of a sampled temporal section of a composite signal comprising two component random signals from different probability density functions such as audio and noise;

Figure 2(a) illustrates the relationship between the discrimination point and typical probability density functions of signal plus noise and noise alone such that there is a positive S/N;

Figure 2(b) illustrates the relationship between the discrimination point and typical cumulative distribution functions of signal plus noise and noise alone such that there is a positive S/N;

Figure 3 is a block diagrammatic representation of an apparatus according to one exemplary embodiment of the invention for use as a real-time audio noise reduction system, i.e. an adaptive spectral noise gate;

Figure 4 is a graph corresponding to that of Figure 1, but illustrating the effect of reducing, rather than completely suppressing, the noise content of the signal;

Figure 5a illustrates the instantaneous power spectrum of some hypothetical temporal section of an audio plus noise signal;

Figure 5b illustrates the frequency response of a filter designed for such a signal;

Figure 5c illustrates the result of multiplying the original signal power spectrum with the frequency response of the filter;

Figure 6 is a flow chart illustrating the method by which the threshold level may be determined;

Figure 7 is a graph illustrating the ideal nth filter amplitude profile; and

Figure 8 is a flow chart illustrating a method of adjusting the ideal filter levels.

Reference is now made to Figure 3, which illustrates the data-flow diagram of an apparatus according to one exemplary embodiment of the invention for use as a real-time audio noise reduction system. The apparatus illustrated in Figure 3 will now be explained in full detail with reference also to Figures 4 to 8.

N noisy audio signal samples are collected in a double buffered store 8 via an input selection switch 7 from either an analogue input 1 via a buffer amplifier 3, lowpass filter 4 and analogue to digital converter 5, or a digital audio input 2 comprising a digital audio interface 6 (e.g. AES/EBU or S/PDIF). During the time τ it takes to collect the next batch of noisy audio samples, where

$$\tau = N \cdot \alpha \cdot T \qquad (2)$$

and

N = number of samples (typically 128)

α = block overlap ratio, e.g. 0.5 for 50%

T = time interval between samples (must satisfy Nyquist), typically 1/(48 kHz) or 1/(44.1 kHz) for audio,

the current batch 'b' of data $\{\{x_{kT,b}\}_{k=0}^{N-1}\}_{b=0}^{\infty}$ (where 'k' is the sample index and 'b' is the batch index) is passed to a memory device 11 which stores the data for future use and a processing device 9 which weights the data with a window function 10 to produce the output sequence $\{\{y_{kT,b}\}_{k=0}^{N-1}\}_{b=0}^{\infty}$ where

$$\{\{y_{kT,b}\}_{k=0}^{N-1}\}_{b=0}^{\infty} = \{\{x_{kT,b} \cdot w^{(1)}_{kT}\}_{k=0}^{N-1}\}_{b=0}^{\infty} \qquad (3)$$

and $\{w^{(1)}_{kT}\}_{k=0}^{N-1}$ = pre-computed N-point window function such as a -90 dB Dolph-Chebyshev or similar. Note that the -90 dB sidelobe level is an arbitrary choice and, although typical, can be selected according to the signal quality processed. The weighted data sequence is then transformed at 12 into the discrimination space by means of, for example, a Fast Fourier Transform (or similar) to produce (in the case of the FFT/DFT) the complex-valued N-point data sequence $\{\{A_{n/NT,b} + jB_{n/NT,b}\}_{n=0}^{N-1}\}_{b=0}^{\infty}$ where (in the case of the FFT/DFT) for the b-th batch

The power spectrum $P_{n/NT,b}$ for the b-th batch is then obtained by computing the magnitude-squared at 13. That is

$$P_{n/NT,b} = A_{n/NT,b}^{2} + B_{n/NT,b}^{2} \qquad (5)$$

Alternatively, the magnitude sequence could be computed, but this is generally a more computationally intensive operation to perform for no additional benefit in this case. The power spectra of the b-th batch are then passed to a further computation stage 14 which sorts the sequence by amplitude into ascending rank-order. (Note that since the original data sequence was real-valued it is only necessary to operate on the first half (plus one) of the data array, since the second half is merely the image of the first and as such will reveal no further information. This is a highly significant observation for performing data sorts in real-time. However, if advantage of this fact is taken then it is important to note that the value $P_{n=0}$ is replaced with the value $P_{n=0}/4$ and similarly $P_{n=N/2}$ is replaced with the value $P_{n=N/2}/4$ just before sorting for proper scaling.)

Then, for the b-th batch

Note that since the values of the sequence elements have not altered through this process, but only their indices, the original index 'n' has been replaced by the index 'r' once sorted.
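The window, transform, magnitude-squared and rank-ordering steps described above can be sketched as follows. This is an illustrative sketch rather than the apparatus itself: it assumes an even number of samples N, uses an unnormalised direct DFT in place of an FFT, and substitutes a Hann window for the Dolph-Chebyshev window named in the text because the latter is not available in the Python standard library. The function names are hypothetical.

```python
import cmath
import math


def hann_window(n_points):
    """Stand-in window (assumption); the text suggests a Dolph-Chebyshev design."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * k / (n_points - 1))
            for k in range(n_points)]


def rank_ordered_power_spectrum(samples, window):
    """Return (power, ranked, order) for one batch of real-valued samples.

    power  : magnitude-squared values for bins 0..N/2 in frequency order
    ranked : the same values sorted into ascending rank order
    order  : original bin index n for each rank position r
    """
    n_points = len(samples)
    half = n_points // 2
    weighted = [x * w for x, w in zip(samples, window)]
    power = []
    # Only the first half (plus one) of the bins is needed for real input.
    for n in range(half + 1):
        acc = sum(weighted[k] * cmath.exp(-2j * math.pi * n * k / n_points)
                  for k in range(n_points))
        power.append(acc.real ** 2 + acc.imag ** 2)
    # Scale the DC and Nyquist bins by 1/4 before sorting, as the text notes.
    power[0] /= 4.0
    power[half] /= 4.0
    order = sorted(range(half + 1), key=lambda n: power[n])
    ranked = [power[n] for n in order]
    return power, ranked, order
```

Keeping the `order` list alongside the sorted values preserves the mapping from rank index 'r' back to frequency index 'n', which the later comparison stage needs.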

The sorted or rank ordered sequence is then passed to a further computational element 15 which finds an appropriate threshold. In order to explain this it is necessary to digress temporarily to consider the impact of equation (1) and the discrimination process herein described in this audio noise reduction application.

From equation (1) and Figure (1) it can be seen that at the discrimination point 'd'

(7) where Constant = 4 (typically, for magnitude-squared sorted data)

If data elements in the sorted array to the left of the discrimination point (for an ascending order ranking) are those substantially from noise alone, then the mean noise level (mnl) may be estimated as

$$mnl = \frac{1}{d+1}\sum_{r=0}^{d} P_{r,b} \qquad (8)$$

Consequently, the instantaneous signal-to-noise ratio (SNR) of the temporal section of audio may be estimated as

Note that in Equation (9) the average of the sum of the noise components is assumed to be present in the frequency bands containing signal, i.e. those to the right of the discrimination point.

From the point of view of audio it has been found that to discard all of the potentially available noise components every block is too severe since the noise content can change quite dramatically in accordance with the instantaneous spectral occupancy of the signal, resulting in what is commonly termed a 'noise pumping effect'.

Consequently, it has been found that a more pleasing audio effect is obtained if the user specifies a total noise power value '$P_T$' which they wish to discard or suppress. So long as this value is less than that implied by the discrimination point the noise components can be safely discarded with the assurance that no signal is being rejected.

In this way the noise-pumping effect can be avoided.

It has also been found that rather than suppress the declared noise components completely, for certain types of audio signal where the generally perceived view is that the noise is part of the 'expected' audio content, e.g. a live recording where a portion of noise adds to the atmosphere of the recording as compared to a studio recording, there are practical advantages in allowing the user to control the level of suppression.

Figure 4 illustrates the implied consequence on the sorted data sequence of performing this action. Also from Figure 4, it can be seen that

$$\mathrm{SNR}_{\text{after processing}} = 10\log(\ldots) \qquad (10)$$

where

$$0 \le u \le d \le \frac{N}{2}$$

MA = maximum attenuation level (user specified, e.g. 0.01)

u = user implied specified point from $P_T$, where u ≤ d

Note that the user specifies $P_T$ where

from which the index 'u' can be found and checked to ensure that it is less than or equal to 'd'. If not, then only the data elements up to 'd' are permitted to be suppressed by a maximum amount of MA.

From equations (9) and (10) it can be seen therefore that the signal-to-noise improvement of the process can be expressed as

$$\mathrm{SNR}_{\text{improvement}} = \mathrm{SNR}_{\text{after processing}} - \mathrm{SNR}_{\text{before processing}} \qquad (12)$$

It should be noted, however, that whilst in the scientific sense an SNR improvement can be calculated by an equation such as (12), in terms of its audio performance, a perceived further improvement to the listener is apparent due to what is commonly termed the "masking effect".

Figure 5(a) illustrates the instantaneous power spectrum of some hypothetical temporal section of audio plus noise. Figure 5(b) illustrates the frequency response of the filter we would wish to design for such a signal. Figure 5(c) illustrates the result of multiplying the original signal power spectrum with the frequency response of the filter.

It can be seen from Figure 5(c) that the inter-line frequency bands containing noise have been suppressed whereas the noise in close proximity to a signal component has not. Therefore, whilst there is an overall signal-to-noise improvement as described by equation (12), it is not in the mean-level sense. This fact is important since it has been shown by a number of workers in the audio field that a low-level signal (such as noise in this case) cannot be detected by the ear if it is in close proximity to a stronger tone. In other words the close-in noise is 'masked' by the presence of the audio signal component. Because of this, the "apparent" SNR improvement of such a process may be more accurately described by

Apparent SNR improvement - 1 dB ( 13)

Note that unlike equation (12), if u = d and MA = 0 then from equation (13) the 'effective' SNR improvement = ∞, i.e. no residual noise is audible at all.

Figure 6 illustrates one such example of the 'Find Threshold' process 16 in detailed flow-graph form.

It will be seen from Figure 6 that the find threshold process essentially consists of two parts. The first part finds the index value 'u' which corresponds to the user's input parameter '$P_T$' for total noise power to be removed in accordance with equation (11). (Note that $A_r = \sum_{x=0}^{r} P_x$, i.e. the sum of noise components up to and including the r-th.) The second part checks the determined value 'u' against the discrimination point 'd' and backs off the index if u > d or leaves it unmodified if u ≤ d. The threshold value is then equal to the amplitude pointed to by the index 'u' (or 'd' if u > d). In other words, the threshold value is equal to either $P_{r=u}$ or $P_{r=d}$ if u > d. In this exemplary embodiment, the pre-determined value = 4 (see Equation (1)).
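The two-part find threshold process can be sketched as below. This is a hedged illustration under stated assumptions: the index 'u' is taken to be the largest rank index whose running sum of sorted power values does not exceed the user-specified total noise power, which is one plausible reading of the text; the discrimination point 'd' is assumed to have been found already; and the function and argument names are hypothetical.

```python
def find_threshold(ranked_power, p_total, d):
    """Return (index, threshold) for one rank-ordered batch.

    ranked_power : power values sorted into ascending order
    p_total      : user-specified total noise power to remove
    d            : index of the discrimination point (found elsewhere)
    """
    # Part 1: find u, the last rank index whose running sum stays within p_total.
    running_sum = 0.0
    u = 0  # assumption: fall back to index 0 if even the first element exceeds p_total
    for r, value in enumerate(ranked_power):
        running_sum += value
        if running_sum > p_total:
            break
        u = r
    # Part 2: back the index off to d if the user's request goes past it.
    index = min(u, d)
    return index, ranked_power[index]


if __name__ == "__main__":
    ranked = [1.0, 2.0, 3.0, 10.0, 50.0]
    print(find_threshold(ranked, p_total=6.0, d=3))    # user request within d
    print(find_threshold(ranked, p_total=100.0, d=3))  # clamped to d
```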

It will be appreciated that the method described for determining a threshold value on the basis of the sorted data set and discrimination point is one exemplary embodiment of the invention which yields exceptionally good results in this audio case. Clearly, depending on the application there may be many more ways of using the invention, either directly or indirectly.

Having determined a threshold value it can be seen from Figure 3 that the unsorted data sequence of the b-th batch $\{P_{n/NT,b}\}$ is compared with the threshold value in comparison stage 17. If the nth data element is greater than the threshold value the output of the comparison stage 17 produces a '1'. Similarly if the nth data element is less than or equal to the threshold value the output of the comparison stage 17 produces a '0' so that the 1's and 0's vector $\{I_{n/NT,b}\}$ is produced. That is, for the b-th batch,

$$\{\text{IF } P_{n/NT,b} > \text{Threshold value THEN } I_{n/NT,b} = 1 \text{ ELSE } I_{n/NT,b} = 0\} \qquad (14)$$

Clearly, a value of '1' signifies that a signal component is present in that particular frequency band, whereas a '0' signifies that there is only noise. Therefore, the 1's and 0's vector can be considered as the 'ideal' sampled frequency response of the filter to be designed to pass signal but to suppress noise such that a '1' = a passband and '0' = a stopband.
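The comparison stage amounts to an element-wise test of the unsorted power spectrum against the threshold. A minimal sketch, with a hypothetical function name, might be:

```python
def ideal_filter_response(power, threshold):
    """Return the 1's and 0's vector for one batch.

    A '1' marks a band whose power exceeds the threshold (passband);
    a '0' marks a band at or below it (stopband).
    """
    return [1 if p > threshold else 0 for p in power]


if __name__ == "__main__":
    print(ideal_filter_response([0.5, 3.0, 10.0, 3.1], threshold=3.0))
```

Note that a band exactly equal to the threshold is treated as noise, matching the "less than or equal to" wording of the text.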

Given an ideal filter sampled frequency response it is a relatively straightforward matter to obtain the filter coefficients. However, in the case of audio as so far described there is a further complication resulting from batch-to-batch effects. This audible effect is termed the "gurgling effect" as the resultant sound resembles that of a trickling stream of water. It stems from the fact that batch-to-batch both the quantity and spectral disposition of the suppressed frequency bands are completely random. It is hypothesised that if the process herein described were to be performed on every incoming sample, i.e. a running block of N rather than overlapped block processing, the gurgling effect would sound like noise again since the rate of change of the residuals is higher than the highest frequency in the audio band. However, given that the algorithm can be performed on blocks overlapped by as little as 50%, the rate of change of the residuals is within the audio band and thus can be heard as an effect other than noise, i.e. a self-generated signal.

It has been found that in order to suppress this residual sound, a very effective solution is to introduce a filter 'attack' and 'decay' function block-to-block. The mechanism of the 'attack' and 'decay' function will be described in detail shortly. However, before doing so, it is appropriate at this stage to mention a further audible effect which can also be mitigated within the filter design process.

Suppose as a result of the previous analysis a noise spectral spike or short-term spectral transient is declared as 'a signal' which requires passing. The resultant audible effect, termed the "popping effect", sounds like a wave crashing on the rocks in that there is a short-term burst of narrow-band noise. As it is not in close proximity to a true signal component it is therefore not masked and is consequently quite audible. Although the probability of occurrence by virtue of the detection process is very low, it does nevertheless occur from time to time, since the probability of falsely declaring noise as signal (pfa = 1 - Pdn) is never zero.

In order to overcome this, a further 'temporal duration check' can be included at this point. That is to say, the probability of some audio signal component occurring for one block time only (1.3 ms if N=128, fs=48 kHz and 50% block overlap) is very low. In this sense, even though the process is being applied in blocks rather than every sample, the process is over-sampled. Consequently, a very effective mitigation technique is to simply check that the signal exists for a number (C) of consecutive blocks.

Typically, C=4 for N=128, fs=48 kHz and 50% block overlap, implying that a further acceptance test of the declared signal being truly signal is that it exists in the same spectral location for at least 5 ms or so. Whilst this may not sound like a very long time in audio terms, from a detection point of view this simple method dramatically decreases the probability of falsely declaring noise as signal because the pfa then becomes the cumulative probability of 'C' false declarations in succession. If, for example, the pfa was $10^{-5}$, i.e. one falsely declared noise component as signal every 100,000 decisions, then at 48,000 decisions per second it would be expected that the popping effect would be audible approximately every 2 seconds, which would clearly be unacceptable.
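The timing figures quoted above follow directly from the block period of equation (2). The short check below reproduces them; the function and variable names are hypothetical, and the decision rate of 48,000 per second is the figure used in the text.

```python
def block_period_s(n_points, overlap_ratio, sample_rate_hz):
    """Block period tau = N * alpha * T, with T = 1 / sample rate."""
    return n_points * overlap_ratio / sample_rate_hz


# One block time for N = 128, 50% overlap, fs = 48 kHz (about 1.3 ms).
tau = block_period_s(128, 0.5, 48_000.0)

# Minimum signal duration for C = 4 consecutive blocks (about 5 ms).
min_duration_s = 4 * tau

# Mean time between false declarations for pfa = 1e-5 at 48,000 decisions/s.
seconds_between_pops = 1.0 / (1e-5 * 48_000.0)

if __name__ == "__main__":
    print(f"block period     : {tau * 1e3:.2f} ms")
    print(f"minimum duration : {min_duration_s * 1e3:.2f} ms")
    print(f"popping interval : {seconds_between_pops:.2f} s")
```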

Given that there are $\frac{N}{2}$ unique frequency bands, then

$$\text{probability of falsely declaring noise as signal in one batch} = \frac{N}{2} \times pfa \qquad (15)$$

$$\text{probability of falsely declaring noise as signal in two successive batches in the same frequency band} = \frac{N}{2} \times pfa^{2} \qquad (16)$$

$$\text{probability of falsely declaring noise as a signal in 'C' successive batches in the same frequency band} = \frac{N}{2} \times pfa^{C} \qquad (17)$$

In the previous example if pfa = $10^{-5}$ and C=4 then from equation (17) it can be seen that it would be expected to occur once every $10^{20}$ decisions. At 48,000 per second, this is equivalent to $6.6 \times 10^{7}$ years. Clearly, the pfa has been decreased by this technique dramatically, so even though larger values of C could be used, in practice there would appear to be little point.
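The figure of roughly 66 million years can be checked with a few lines of arithmetic. The sketch follows the text in counting one false declaration per 1/pfa^C decisions; the year length of 365.25 days and the variable names are assumptions made for the illustration.

```python
PFA = 1e-5                    # single-decision false declaration probability
C_BLOCKS = 4                  # consecutive blocks required by the duration check
DECISIONS_PER_SECOND = 48_000.0
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # assumed year length

# Expected number of decisions between C successive false declarations.
decisions_between_events = 1.0 / (PFA ** C_BLOCKS)

# Convert to elapsed time at the stated decision rate.
seconds_between_events = decisions_between_events / DECISIONS_PER_SECOND
years_between_events = seconds_between_events / SECONDS_PER_YEAR

if __name__ == "__main__":
    print(f"decisions between events: {decisions_between_events:.1e}")
    print(f"years between events    : {years_between_events:.2e}")
```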

As an example of the mitigation technique for both the gurgling and popping effects, suppose that the observed sequence of 1's and 0's for a particular frequency band (exiting the thresholding process) was

B blocks C blocks time -> block no: b-B-C b-B-C+1 b-C b-C+1 ... b-1 b b+1 b+2 b+3 .. nth frequency band 0 1 1 1 1 0 0 0 threshold output <- current position in time

where 'b' is the current block, b-1 = previous block and so on.

It can be seen that as a result of the analysis performed on the current block 'b' it is found that this frequency band contains signal, as do the previous 'C' blocks, whereas, prior to this, for the previous 'B' blocks it only contained noise. In other words, the temporal duration check has been satisfied. In this circumstance, the desired nth frequency band filter now needs to be activated since a genuine signal component has been detected. To allow the signal to pass unaffected it is necessary to make the amplitude of the filter equal to unity. However, it has been found that, due to the sensitivity of the ear, a more pleasing audio effect is obtained if the amplitude of the filter builds up to a value of 1 over a number of 'B' blocks prior to the current and previous 'C' blocks. Similarly, the amplitude of the filter decays (if subsequent signal detections reveal no signal) over subsequent blocks by making the current amplitude equal to the previous amplitude multiplied by some fractional scaling factor 'DR'. Consequently, if a '1' is detected as exiting the thresholding process for the current and previous 'C' blocks, the previous B+C filter amplitudes for that particular frequency band have to be adjusted. For filter amplitudes in blocks (b-B-C+1) to (b-C-1) the amplitudes may be adjusted by a linear amount, for example ΔA, where

$$\Delta A = \frac{1 - Amp_{b-B-C}}{B} \qquad (18)$$

$Amp_{b}$ = current filter amplitude,

and

$$\{Amp_{b-B-C+i}\}_{i=1}^{B-1} = \{Amp_{b-B-C} + i \cdot \Delta A\}_{i=1}^{B-1} \qquad (19)$$

(Note that any function rather than linear could be used but linear is simple to implement and works perfectly well for the majority of applications)

Filter amplitudes in blocks (b-C) to (b) have filter amplitudes for the nth frequency band equal to unity.
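The attack and decay behaviour for a single frequency band can be sketched as follows. This is an illustration under explicit assumptions, not the flow chart of Figure 8: the attack is read as a linear ramp that starts from the amplitude stored for block (b-B-C) and reaches unity at block (b-C), the decay is a multiplication by DR per block, and flooring the decayed amplitude at the maximum attenuation level MA is an assumption of this sketch. All function and argument names are hypothetical.

```python
def apply_attack(amplitudes, attack_blocks, duration_blocks):
    """Adjust the stored amplitudes of one band once a signal is confirmed.

    amplitudes: the last B + C + 1 filter amplitudes, oldest first, so that
    index 0 is block (b-B-C) and the last index is the current block b.
    """
    if len(amplitudes) != attack_blocks + duration_blocks + 1:
        raise ValueError("expected B + C + 1 stored amplitudes")
    adjusted = list(amplitudes)
    # Linear increment that brings the amplitude up to unity over B blocks.
    step = (1.0 - adjusted[0]) / attack_blocks
    # Blocks (b-B-C+1) .. (b-C-1): ramp upwards.
    for i in range(1, attack_blocks):
        adjusted[i] = adjusted[0] + i * step
    # Blocks (b-C) .. b: pass the signal unaffected.
    for i in range(attack_blocks, attack_blocks + duration_blocks + 1):
        adjusted[i] = 1.0
    return adjusted


def apply_decay(previous_amplitude, decay_rate, max_attenuation):
    """Decay one block: previous amplitude times DR, floored at MA (assumption)."""
    return max(previous_amplitude * decay_rate, max_attenuation)


if __name__ == "__main__":
    history = [0.2, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  # B = 4, C = 2
    print(apply_attack(history, attack_blocks=4, duration_blocks=2))
    print(apply_decay(1.0, decay_rate=0.5, max_attenuation=0.01))
```

Because the adjustment rewrites amplitudes for blocks that have already been analysed, the filter for a block can only be finalised once the later blocks have been seen, which is why the data must be delayed.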

It can be seen, therefore, that the depth of the data delay block 11 as shown in Figure 3(a) has to be at least (B+C).N in order to be able to go back and adjust the ideal filter amplitudes of the previous (B+C) blocks on the basis of the current block's analysis. In order to be able to perform the analysis and adjust the ideal filter amplitudes of the previous (B+C) blocks it is necessary to store both the 1's and 0's vector and filter amplitudes of the previous (B+C) blocks in stores 18 and 26 as shown in Figure 3(b). The same process is applied to each of the $\left(\frac{N}{2}\right) - 1$ unique filter frequency bands for every block.

It has been found useful to make the parameters of Attack rate (B), Decay rate (DR), min signal duration (C) and maximum attenuation (MA) operator controls so that a user can adjust to suit the audio material being processed. This is illustrated by user controls 20, 24 and 25, 23 and 22 respectively in Figure 3(b).

Figure 7 illustrates how the filter/frequency band amplitudes are adjusted from the ideal (process 19 in Figure 3(b)) and Figure 8 illustrates the process in full detail by means of a flow chart.

Note also that in Figure 7 the parameters B, C and MA are constants whereas the nth filter can have its own decay rate. The purpose of this is that, for some audio material, it has been found that the higher frequency bands should have a different decay rate to the lower frequency bands. In the majority of cases if two decay rates are specified, i.e. $DR_{n=0}$ and $DR_{n=N/2}$ (as illustrated by 24 and 25 in Figure 3b), a linear increment $\Delta DR$ between them is sufficient. With reference to Figure 3 the computational element 21 computes the following equations given the upper and lower decay rate parameters:

$$\Delta DR = \frac{DR_{n=N/2} - DR_{n=0}}{N/2} \qquad (20)$$

and

$$\{DR_n\}_{n=0}^{N/2} = \{DR_{n=0} + n \cdot \Delta DR\}_{n=0}^{N/2} \qquad (21)$$

The sequence $\{DR_n\}_{n=0}^{N/2}$ is then output by 21 to the adjust ideal filter levels process 19 and used as illustrated in Figure 8 in the computation of the modified filter levels.
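Reading equations (20) and (21) as a linear interpolation between the two user-specified decay rates, the per-band decay rate sequence can be sketched as below. The assumption that the bands run from n = 0 to n = N/2 and the function name are both part of the illustration rather than confirmed details.

```python
def decay_rates(dr_low, dr_high, n_points):
    """Linearly interpolate a decay rate for each band n = 0 .. N/2.

    dr_low  : decay rate for the lowest band (n = 0)
    dr_high : decay rate for the highest band (n = N/2)
    """
    half = n_points // 2
    delta = (dr_high - dr_low) / half  # linear increment between adjacent bands
    return [dr_low + n * delta for n in range(half + 1)]


if __name__ == "__main__":
    print(decay_rates(0.5, 0.9, 8))
```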

The output in the b-th block of the 'adjust ideal filter levels' process 19 is the modified sampled frequency response for the (b-B-C+1)-th block as illustrated in Figure 8. Note that, because of the need to go back and adjust the filter amplitudes of the previous (B+C-1) blocks in this way, it necessarily introduces an overall process or pipeline delay of the same amount.

Given the adjusted ideal filter level sequence $\{I'_{n/NT,(b-B-C+1)}\}$, the process of obtaining the adaptive filter weights consists of firstly inverse transforming (27 in Figure 3b) the sequence to give the ideal impulse response of the filter. That is, for the case of the IDFT/IFFT and the (b-B-C+1)-th batch

ih_ _τ. _tb. _B c.ι _k:i ' (22) where

(23)

The number of outputs required from the filter process is

$$N(1-\alpha) \ \text{filtered output samples/batch} \qquad (24)$$

where

N is the number of points

α is the overlap ratio, e.g. 0.5 if 50%

Therefore the number of stages within the finite impulse response filter should ideally be the maximum possible such that when convolved with an N point data sequence it produces $N(1-\alpha)$ filtered outputs.

For the case of 50% overlap this is equivalent to $\frac{N}{2}+1$ stages implying $\left(\frac{N}{2}+1\right)$ filter coefficients. In this case, the filter coefficients are obtained from the following

{θ _fcr.( _b- _δ-c.ι,> ;ό ={ ⁵^- _fcτ.( _b- _β-c.)>_ _o (26)

and are computed by stage 29 in Figure 3b. Computing the filter coefficients as per equations (25) and (26) ensures that the coefficients are real-valued and symmetric which is a necessary and sufficient condition to ensure linear phase.
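One common way to obtain real, symmetric (linear-phase) coefficients from a sampled amplitude response is the frequency-sampling approach sketched below. It is offered as an illustration of the idea only and is not claimed to reproduce equations (25) and (26): it mirrors the N/2 + 1 amplitudes into a full real, symmetric spectrum, takes an inverse DFT, keeps the N/2 + 1 taps centred on zero lag and applies a symmetric window. A Hamming window is substituted for the Dolph-Chebyshev window named in the text, N/2 is assumed even, and all names are hypothetical.

```python
import cmath
import math


def hamming_window(n_taps):
    """Symmetric stand-in window (assumption); the text names Dolph-Chebyshev."""
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * k / (n_taps - 1))
            for k in range(n_taps)]


def design_fir(ideal_levels, window):
    """Design N/2 + 1 linear-phase taps from amplitudes for bins 0 .. N/2."""
    half = len(ideal_levels) - 1
    n_points = 2 * half
    # Mirror the half spectrum so the full spectrum is real and symmetric.
    full = list(ideal_levels) + [ideal_levels[n_points - n]
                                 for n in range(half + 1, n_points)]
    # Inverse DFT; the result is real and even because of the symmetry above.
    impulse = [
        sum(full[n] * cmath.exp(2j * math.pi * n * k / n_points)
            for n in range(n_points)).real / n_points
        for k in range(n_points)
    ]
    # Keep the taps centred on zero lag (circular indexing), then window them.
    centre = half // 2
    taps = [impulse[(k - centre) % n_points] for k in range(half + 1)]
    return [t * w for t, w in zip(taps, window)]


if __name__ == "__main__":
    levels = [0, 0, 1, 1, 1, 0, 0, 0, 0]  # N = 16, bins 0 .. 8
    print(design_fir(levels, hamming_window(len(levels))))
```

An all-pass response (every level equal to 1) yields a single unit tap at the centre, which is a convenient sanity check for the indexing.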

The pre-computed set of weights $\{w^{(2)}_{kT}\}_{k=0}^{N/2}$ (stored in window function store 28 (Figure 3b)) would typically be a -40 dB Dolph-Chebyshev (or similar).

The filter coefficient sequence $\{b_{kT,(b-B-C+1)}\}_{k=0}^{N/2}$ is then convolved with the delayed original input data previously held in delay store 11 to produce the filtered output sequence by the finite impulse response filter 30 as follows:

Notice that in equation (27) the designed filter coefficients are only valid for the (b-B-C+l)-th batch of data as they are completely re-computed for each subsequent batch. The main advantage of this technique is that the filter applied to the data has no a priori assumptions made about it but instead is designed strictly on the basis of the real-time analysis of a particular data batch and then only applied to that data. For the next batch the whole process is repeated.
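The relationship between block length, filter length and the number of outputs per batch can be illustrated with a direct-form convolution that keeps only the fully overlapped outputs. This sketch assumes that reading of the convolution; the function name is hypothetical.

```python
def fir_valid(block, taps):
    """Convolve a block with FIR taps, keeping only fully overlapped outputs.

    For an N-point block and N/2 + 1 taps this yields N/2 outputs,
    i.e. N * (1 - alpha) samples for a 50% overlap.
    """
    n_taps = len(taps)
    n_out = len(block) - n_taps + 1
    return [
        sum(taps[j] * block[i + n_taps - 1 - j] for j in range(n_taps))
        for i in range(n_out)
    ]


if __name__ == "__main__":
    block = [float(k) for k in range(128)]   # one batch, N = 128
    taps = [0.0] * 65                        # N/2 + 1 coefficients
    taps[32] = 1.0                           # centred unit tap: a pure delay
    filtered = fir_valid(block, taps)
    print(len(filtered), filtered[0], filtered[-1])
```

Because the coefficients are recomputed for each batch, a sketch like this would be called once per block with that block's own taps.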

The processed block of noise suppressed data is then re-assembled into a continuous digital data stream by a further double buffered memory 31 which stores one set whilst outputting the previous set.

Finally, two alternative output stages are provided, selectable by an output signal select switch 32. A digital audio signal output 38 is supplied direct with data from the double buffered memory 31 via a digital audio interface 37. Alternatively, data from store 31 is passed via switch 32 to a digital-to-analogue converter 33 whose output is low pass filtered at 34 in order to reconstruct the analogue signal before being passed to the analogue audio signal output 36 via a buffer amplifier 35.

It will be appreciated that whilst this exemplary embodiment of the invention has been described in terms of a time domain convolution, an alternative equally valid approach would be to perform a frequency domain multiplication, i.e. Fast Convolution, yielding a potentially more computationally efficient implementation.

Similarly, it will be appreciated that this exemplary embodiment of the invention has been described in terms of real-valued samples being taken of the original signal and where some appropriate advantage has been taken of this fact to reduce the computational requirements, i.e. halve the data arrays in certain circumstances. People skilled in the art will of course recognise that the method can be modified for the case of complex-valued samples, e.g. Phase and Quadrature sampled baseband outputs of a communications receiver, or further computational advantage may be gained by performing multi-channels or multi-blocks simultaneously. For example, it is well known that it is possible to perform two simultaneous real-valued FFTs/DFTs with one complex FFT. In this way the real-time computational requirements can be considerably reduced, or the size of the data batch (implying resolution) can be increased for a given computational capability, or the block overlap ratio increased.
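The well-known trick of obtaining two real-valued transforms from one complex transform can be sketched as follows. A direct DFT is used here purely for clarity (an FFT would be used in practice), and the function names are hypothetical.

```python
import cmath
import math


def dft(seq):
    """Direct (unnormalised) DFT of a real or complex sequence."""
    n = len(seq)
    return [sum(seq[k] * cmath.exp(-2j * math.pi * m * k / n) for k in range(n))
            for m in range(n)]


def two_real_dfts(x, y):
    """Return (DFT of x, DFT of y) for real x and y using one complex DFT."""
    n = len(x)
    # Pack the two real sequences into one complex sequence z = x + j*y.
    z = dft([complex(a, b) for a, b in zip(x, y)])
    spectrum_x, spectrum_y = [], []
    for m in range(n):
        mirror = z[(n - m) % n].conjugate()
        # The conjugate-symmetric part belongs to x, the anti-symmetric part to y.
        spectrum_x.append((z[m] + mirror) / 2)
        spectrum_y.append((z[m] - mirror) / 2j)
    return spectrum_x, spectrum_y


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0, 4.0, 0.0, -1.0, -2.0, -3.0]
    y = [0.5, 0.0, -0.5, 1.0, 1.0, 2.0, 0.0, 3.0]
    sx, sy = two_real_dfts(x, y)
    print(max(abs(a - b) for a, b in zip(sx, dft(x))))
    print(max(abs(a - b) for a, b in zip(sy, dft(y))))
```

In the context of the text, the two real sequences could be two channels or two successive blocks, so that one transform serves both.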

The invention has been particularly described in relation to audio noise reduction, as it is eminently suitable for use in the re-mastering of analogue recordings onto digital format or cleaning up poor quality recordings or restoring film sound tracks. However, it will be appreciated that the method and apparatus of the invention have much wider application in communications systems and in noise reduction in signals representing images, such as those found in Tomography, old movies etc, as well as signal detection systems more generally which require an accurate noise estimate to be established for subsequent use as a basis for a detection decision, adaptive actions or adaptive component signal extraction or suppression.
