


Title:
AN ADAPTIVE FILTER FOR SIGNAL PROCESSING AND METHOD THEREFOR
Document Type and Number:
WIPO Patent Application WO/1997/011538
Kind Code:
A1
Abstract:
This invention is a component of an adaptive filter signal processor whose purpose is to separate a mixture of signals received by each of a plurality of transducers. This invention estimates the relative propagation delays among the transducers for each source. First, it randomly generates a fixed number of sets of delay parameters (220), called a population (230). Each set is processed by an instantaneous-performance calculator (240), to generate an instantaneous-performance value which is added to a cumulative performance value (360). The set with the greatest cumulative performance value is transferred to the adaptive filter signal processor. New parameter sets are generated at random. Whenever a parameter set is found whose instantaneous performance value exceeds that of the set in the population with the least instantaneous performance value, it is incorporated into the population. The set with the least cumulative performance value is deleted from the population.

Inventors:
Application Number:
PCT/US1996/014682
Publication Date:
March 27, 1997
Filing Date:
September 13, 1996
Assignee:
INTERVAL RESEARCH CORP (US)
NGO JOHN THOMAS CALDERON (US)
BHADKAMKAR NEAL ASHOK (US)
International Classes:
G10L21/0208; G06K9/00; G10L21/0272; G10L21/028; H03H21/00; H04B1/10; H04B3/20; H04R3/00; H04S7/00; H04R25/00; (IPC1-7): H04B15/00
Other References:
NG, S.C. et al., "Fast Convergent Genetic Search for Adaptive IIR Filtering", IEEE, 1994, Vol. 6, pages III-105 to III-108.
Claims:
What Is Claimed Is:
1. A method of determining a set of a plurality of signal parameters for use in an adaptive filter signal processor to process a plurality of input signals to generate a plurality of processed signals, said method comprising: a) generating a fixed plurality of sets; b) storing said fixed plurality of sets; c) generating a plurality of cumulative performance values, with a cumulative performance value corresponding to each of said fixed plurality of sets; d) comparing said plurality of cumulative performance values; e) choosing one of said plurality of cumulative performance values, based upon said comparison; f) processing said plurality of input signals for a duration of time using one set, corresponding to said chosen one cumulative performance value, from said stored fixed plurality of sets to generate the plurality of processed signals; g) generating a new set of a plurality of signal parameters; h) generating a plurality of cumulative performance values for said new set and for each of said stored fixed plurality of sets; i) comparing said plurality of cumulative performance values generated in step (h); j) choosing one of said plurality of cumulative performance values generated in step (h), based upon the comparing step of (i); k) based upon the comparing step of (i), either: i) replacing one of said stored fixed plurality of sets by said new set; or ii) deleting said new set; l) processing said plurality of input signals for a duration of time using one set, corresponding to said chosen one cumulative performance value, from said stored fixed plurality of sets to generate the plurality of processed signals; and m) periodically reverting to steps (g) through (l).
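The population-maintenance loop of claim 1 and the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: `perf`, `pop_size`, and the uniform parameter range are hypothetical choices, and the replacement rule is simplified (the candidate is compared against, and replaces, the member with the least cumulative performance).

```python
import random

def evolve_population(perf, pop_size=8, n_params=2, n_steps=200, seed=0):
    """Sketch of the population-based parameter search of claim 1.

    `perf` is a caller-supplied instantaneous-performance function
    (hypothetical name); each population member pairs a parameter set
    with its cumulative performance value.
    """
    rng = random.Random(seed)

    def new_set():
        # Hypothetical parameter range; real delay parameters would be
        # bounded by the transducer geometry.
        return [rng.uniform(-1.0, 1.0) for _ in range(n_params)]

    # Population: [parameter set, cumulative performance] pairs.
    population = [[new_set(), 0.0] for _ in range(pop_size)]
    for _ in range(n_steps):
        # Add each member's instantaneous performance to its total.
        for member in population:
            member[1] += perf(member[0])
        # Generate a candidate at random; incorporate it only if its
        # instantaneous performance beats the weakest member's.
        cand = new_set()
        worst = min(population, key=lambda m: m[1])
        if perf(cand) > perf(worst[0]):
            population.remove(worst)
            population.append([cand, 0.0])
    # The set with the greatest cumulative performance is the one
    # handed to the adaptive filter signal processor.
    return max(population, key=lambda m: m[1])[0]
```

With a toy performance function such as `lambda p: -sum(v * v for v in p)`, the loop gradually concentrates cumulative credit on the better parameter sets.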
2. The method of claim 1, wherein said generating step (h) comprises arithmetically combining random values from a pseudorandom number generator, one set in said plurality of sets, and recent values in said plurality of input signals.
3. The method of claim 1 wherein: said comparing step (i) compares the cumulative performance value of said new set with the cumulative performance values of one of said stored fixed plurality of sets having a least performance value; and wherein said replacing step (k)(i) replaces one of said stored fixed plurality of sets having said least performance value.
4. The method of claim 1 wherein: said generating step (h) further comprises generating an initial cumulative performance value for said new set by subtracting a fixed value from a cumulative performance value of one of said stored fixed plurality of sets having the greatest cumulative performance value; and wherein said choosing step (j) chooses one of said plurality of cumulative performance values generated in step (h) having the largest cumulative performance value.
5. The method of claims 1-4 wherein said generating steps (c) and (h) further comprise: generating a plurality of instantaneous performance values, each at a different time, for each set; and combining said plurality of instantaneous performance values for each set to generate said plurality of cumulative performance values; and wherein said plurality of input signals is characterized as x1(t), x2(t), ..., xn(t), said plurality of processed signals is characterized as y1(t), y2(t), ..., yn(t), and each of said choosing steps (e) and (j) selects a set having associated instantaneous performance values that are highest for the sets that would generate processed signals yi(t) and yj(t) that are most statistically independent for different i and j.
6. The method of claim 5 wherein said plurality of input signals and said plurality of processed signals are assumed to have zero mean and to fluctuate in power over a few clock cycles, and said choosing steps (e) and (j) select a set for which the associated instantaneous performance values are highest for the sets that would generate processed signals yi(t) and yj(t) that most closely satisfy the relation E[yi(t)yj(t)] = E[yi(t)]E[yj(t)] for any different indices i and j, where the operation E[...] is a weighted average over a period of time of said values of x1(t), x2(t), ..., xn(t), y1(t), y2(t), ..., yn(t).
7. The method of claim 6 wherein said choosing steps (e) and (j) compute an instantaneous performance value in accordance with: C(t) = log( Σi≠j {E[yi(t)yj(t)]}² ), and wherein the logarithm computation is protected against numerical overflow.
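As a rough illustration of the claim-7 quantity, the sketch below approximates E[...] by a plain window average and floors the logarithm with a small constant (one simple form of the claimed overflow protection). Note that C(t) shrinks as the outputs decorrelate, so in practice its negative would serve as the "performance" that claims 5 and 6 maximize; summing over i < j rather than i ≠ j changes the result only by a constant log 2, since the terms are symmetric in i and j.

```python
import math

def instantaneous_performance(Y, eps=1e-300):
    """Window-average estimate of C(t) = log( sum over i<j of E[yi yj]^2 ).

    Y is a list of equal-length output-signal windows; `eps` guards the
    logarithm against log(0).  Smaller C means the outputs are closer
    to mutually uncorrelated.
    """
    n, T = len(Y), len(Y[0])
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            e = sum(Y[i][t] * Y[j][t] for t in range(T)) / T  # E[yi yj]
            total += e * e
    return math.log(total + eps)
```

For two identical windows the cross-moment is large and C(t) is near log(1) = 0; for orthogonal windows the cross-moment vanishes and C(t) falls to the floor set by `eps`.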
8. The method of claim 7 further comprising the step of: generating said plurality of input signals by a plurality of transducer means, based upon waves received by said plurality of transducer means from a plurality of sources.
9. The method of claim 8 wherein said plurality of input signals is generated by a plurality of transducer means, based upon acoustic waves received by said plurality of transducer means from a plurality of sources.
10. The method of claim 8 wherein said plurality of input signals is generated by a plurality of transducer means, based upon electromagnetic waves received by said plurality of transducer means from a plurality of sources.
11. The method of claim 8 wherein said transducer means are directional and positioned to minimize relative propagation delays.
12. The method of claim 11 wherein each set of said plurality of signal parameters represents values of relative gains in transduction of said waves from said waves by said plurality of transducer means.
13. The method of claim 12 wherein said choosing steps (e) and (j) compute said instantaneous performance value in accordance with: Y(t) = inv G * X(t), where X and Y are the vector representations of the signals xi(t) and yi(t), G is the matrix representation of said relative gains, and inv G is its inverse matrix.
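For the two-transducer, two-source case, the claim-13 relation Y(t) = inv G * X(t) reduces to a 2×2 matrix inversion. A minimal sketch, assuming G is invertible and leaving aside the directional-microphone details:

```python
def unmix_gains(G, x):
    """Recover two source samples from two mixed samples via Y = inv(G) X.

    G holds the relative transduction gains (2x2, assumed invertible);
    x is the pair of transducer samples at one instant.
    """
    (a, b), (c, d) = G
    det = a * d - b * c                      # must be nonzero
    inv = [[d / det, -b / det], [-c / det, a / det]]
    return [inv[0][0] * x[0] + inv[0][1] * x[1],
            inv[1][0] * x[0] + inv[1][1] * x[1]]
```

A round trip shows the idea: mixing sources s through G and applying `unmix_gains` returns s (up to floating-point error).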
14. The method of claim 13 wherein said choosing steps (e) and (j) comprise computing Y(t) explicitly from the currently available input signals X(t); and calculating said instantaneous performance value by operating on the values Y(t) for at least a duration of time equal to the averaging time of the expectation operator E[].
15. The method of claim 13 further comprising the step of storing said plurality of input signals X(t) for at least a duration of time equal to the averaging time of the expectation operator E[]; and wherein said choosing steps (e) and (j) compute Y(t) explicitly from said stored input signals X(t), and subsequently said instantaneous performance value.
16. The method of claim 13 wherein said choosing steps (e) and (j) comprise the steps of: precalculating, once per clock cycle, the quantity eij = E{xi(t) xj(t)} for every (i, j) pair; and computing said instantaneous performance value C(t), as required, from the quantities eij.
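The economy in claim 16 is that the moments eij = E{xi(t) xj(t)} are computed once per cycle, after which E[yi yj] for any candidate gain matrix A follows by bilinearity, without reprocessing the raw signals. A sketch under those assumptions (window-mean expectation; function names are hypothetical):

```python
def cross_moments(X):
    """Precompute eij = E[xi xj] over a window of input samples.

    X is a list of equal-length input-signal windows; the expectation
    is approximated by the window mean.
    """
    n, T = len(X), len(X[0])
    return [[sum(X[i][t] * X[j][t] for t in range(T)) / T
             for j in range(n)] for i in range(n)]

def output_cross_moment(A, e, i, j):
    """E[yi yj] for Y = A X, expanded through the stored moments:
    E[yi yj] = sum over k, l of A[i][k] * A[j][l] * e[k][l]."""
    n = len(e)
    return sum(A[i][k] * A[j][l] * e[k][l]
               for k in range(n) for l in range(n))
```

Each candidate parameter set in the population can thus be scored from the same small table of moments.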
17. The method of claim 8 wherein said plurality of signal parameters represent the coefficients of filters that model the transfer function of propagation of said waves from said plurality of sources to said plurality of transducer means; and said choosing steps (e) and (j) compute said instantaneous performance value in accordance with Y(t) = inv H * X(t), where X and Y are the vector representations of the signals xi(t) and yi(t), H is the matrix representation of said filters, and inv H is its inverse matrix.
18. The method of claim 17 wherein said choosing steps (e) and (j) comprise the steps of: computing Y(t) explicitly from the currently available input signals X(t); and calculating said instantaneous performance value by operating on the values Y(t) for at least a duration of time equal to the averaging time of the expectation operator E[] plus the longest duration of said filters.
19. The method of claim 17 further comprising the step of storing said plurality of input signals X(t) for at least a duration of time equal to the averaging time of the expectation operator E[] plus the longest duration of said filters; and wherein said choosing steps (e) and (j) compute Y(t) explicitly from said stored input signals X(t), and subsequently said instantaneous performance value.
20. The method of claim 17 wherein said choosing steps (e) and (j) comprise the steps of: precalculating, once per clock cycle, the quantity eij(ki, kj) = E{xi(t - kiΔt) xj(t - kjΔt)} for every (i, j) pair, and for each such pair, for every (ki, kj) pair; and computing C(t) from the quantities eij(ki, kj) every time the instantaneous performance value is required.
21. The method of claim 8 wherein said transducer means are spaced apart and omnidirectional; and wherein said plurality of signal parameters represent values of relative delays in propagation of said waves from said plurality of sources to said plurality of transducer means; and wherein said choosing steps (e) and (j) compute said instantaneous performance value using the definition Y(t) = adj D * X(t), where X and Y are the vector representations of the signals xi(t) and yi(t), D is the matrix representation of said relative delays, and adj D is its adjugate matrix.
22. The method of claim 21 wherein said choosing steps (e) and (j) comprise the steps of: computing Y(t) explicitly from the currently available input signals X(t); and calculating said instantaneous performance value by operating on the values Y(t) for at least a duration of time equal to the averaging time of the expectation operator E[] plus the maximum value of said relative delays.
23. The method of claim 21 further comprising the step of storing said plurality of input signals X(t) for at least a duration of time equal to the averaging time of the expectation operator E[] plus the maximum value of said relative delays; and wherein said choosing steps (e) and (j) compute Y(t) explicitly from said stored input signals X(t), and subsequently said instantaneous performance value.
24. The method of claim 22 wherein said choosing steps (e) and (j) comprise the steps of: precalculating, once per clock cycle, the quantity eij(ki, kj) = E{xi(t - kiΔt) xj(t - kjΔt)} for every (i, j) pair, and for each such pair, for every (ki, kj) pair; and computing C(t), as required, from the quantities eij(ki, kj) by implementing the necessary time delays using linear-phase, noncausal FIR filters with a truncated sinc-shaped impulse response.
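The delays in claim 24 can be realized by FIR filters whose taps sample a truncated sinc, the linear-phase, noncausal interpolator the claim names. A sketch (no smoothing window applied, and `half_width` is an arbitrary truncation choice, both assumptions of this illustration):

```python
import math

def sinc_delay_taps(delay, half_width=8):
    """Truncated-sinc FIR taps approximating a delay by `delay` samples.

    Tap k (k = -half_width .. half_width) weights the sample x(t - k),
    so the filter output approximates x(t - delay).  Noncausal and
    linear-phase; fractional delays spread weight over several taps.
    """
    def sinc(v):
        return 1.0 if v == 0.0 else math.sin(math.pi * v) / (math.pi * v)

    return [sinc(k - delay) for k in range(-half_width, half_width + 1)]
```

For an integer delay the taps collapse to a single unit impulse at the delayed position; for a fractional delay the sinc interpolates between samples.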
25. A method of processing waves from a plurality of sources, comprising: receiving said waves, including echoes and reverberations thereof, by a plurality of transducer means; converting said waves, including echoes and reverberations thereof from said plurality of sources, by each of said plurality of transducer means into a signal, thereby generating a plurality of signals; calculating a set of a plurality of signal parameters by: generating a fixed plurality of sets of a plurality of signal parameters; storing said fixed plurality of sets; generating a plurality of instantaneous performance values for each set, with each instantaneous performance value generated at a different time; combining said plurality of instantaneous performance values for each set to produce a plurality of cumulative performance values, with a cumulative performance value produced for each set; storing said plurality of cumulative performance values; periodically generating a new set of a plurality of signal parameters, generating a new cumulative performance value, based upon said new set; comparing said new cumulative performance value to said plurality of stored cumulative performance values; based upon said comparing step, either: i) replacing one of said stored plurality of cumulative performance values, by said new cumulative performance value, and the corresponding stored set by said new set; or ii) deleting said new cumulative value and said new set; comparing said stored plurality of cumulative performance values; choosing one of said stored plurality of cumulative performance values, based upon said comparison; choosing one set, corresponding to said chosen one cumulative performance value; supplying said chosen one set to a first processing means for operation thereon; first processing said plurality of signals, using said chosen one set, corresponding to said chosen one cumulative performance value, to generate a plurality of first processed signals, wherein each of said 
first processed signals represents waves from one source, and a reduced amount of waves from other sources; and then secondly processing said plurality of first processed signals to generate a plurality of second processed signals, wherein in the presence of echoes and reverberations of said waves from said plurality of sources, each of said second processed signals represents waves from only one different source.
26. The method of claim 25 wherein said transducer means are spaced apart omnidirectional microphones, and said chosen one set of plurality of signal parameters has a set of relative delay parameters associated therewith; said first processing step further comprises: delaying said plurality of signals using said set of relative delay parameters and generating a plurality of delayed signals in response thereto; and combining each one of said plurality of signals with at least one of said plurality of delayed signals to produce one of said first processed signals.
27. The method of claim 26 further comprising the step of: filtering each of said second processed signals to generate a plurality of third processed signals.
28. The method of claim 27 further comprising the step of: sampling and converting each one of said plurality of signals and supplying same to said plurality of delay means and to said plurality of combining means, as said signal.
29. The method of claim 25 wherein said second processing step further comprises: subtracting, by a plurality of combining means, one of said first processed signals received at said first input, and the sum of input signals received at a second input, to produce an output signal, said output signal being one of said plurality of second processed signals; generating a plurality of adaptive signals, with each of said adaptive signals being the output signal of one of said plurality of combining means; and supplying each of said plurality of adaptive signals to a second input of said plurality of combining means other than the associated one combining means.
30. The method of claim 25, wherein said step of generating a new set comprises arithmetically combining random values from a pseudorandom number generator, one set in said plurality of sets, and recent values of said plurality of input signals.
31. The method of claim 30 wherein: said comparing step compares the cumulative performance value of said new set with the cumulative performance value of one of said stored plurality of sets having a least cumulative performance value; and wherein said replacing step replaces one of said stored plurality of sets having said least cumulative performance value.
32. The method of claim 31 wherein: said replacing step further comprises: generating an initial cumulative performance value for said new set by subtracting a fixed value from a cumulative performance value of one of said stored plurality of sets having the greatest cumulative performance value; and wherein said comparing step chooses one of said stored plurality of sets having the greatest cumulative performance value, for processing by said processing step.
33. An adaptive filter for determining a set of a plurality of signal parameters for use in an adaptive filter signal processor to process a plurality of input signals to generate a plurality of processed signals, said filter comprising: means for generating a fixed plurality of sets; means for storing said fixed plurality of sets; means for generating a plurality of cumulative performance values, based upon said fixed plurality of sets, with a cumulative performance value generated for each set; means for evaluating said plurality of cumulative performance values, and choosing one of said plurality of cumulative performance values, based upon said evaluation; and means for processing said plurality of input signals for a duration of time using one set, corresponding to said chosen one cumulative performance value, from said fixed plurality of stored sets to generate the plurality of processed signals.
34. The filter of claim 33, further comprising: means for periodically generating a new set of a plurality of signal parameters; means for generating a new cumulative performance value, for each set of signal parameters, including said new set generated; means for comparing said new cumulative performance value corresponding to said new set generated to said cumulative performance values for each set of signal parameters; and means for either i) replacing one of said fixed plurality of sets by said new set; or ii) deleting said new set, in response to said comparing means.
35. The filter of claim 34 further comprising: means for generating a plurality of instantaneous performance values for each set, with each instantaneous performance value generated at a different time; and means for combining said plurality of instantaneous performance values for each set to produce a plurality of cumulative performance values, with a cumulative performance value produced for each set.
36. The filter of claim 35, wherein said means for generating a new cumulative performance value comprises means for arithmetically combining random values from a pseudorandom number generator, one set in said plurality of sets, and recent values in said plurality of input signals.
37. The filter of claim 36 wherein: said evaluating means compares the instantaneous performance value of said new set with the instantaneous performance value of one of said stored plurality of sets having a least instantaneous performance value; and wherein said replacing means replaces one of said stored plurality of sets having said least instantaneous performance value.
38. The filter of claim 37 wherein: said means for generating a new cumulative performance value further comprises means for generating an initial cumulative performance value for said new set and means for subtracting a fixed value from a cumulative performance value of one of said stored plurality of sets having the greatest cumulative performance value; and wherein said evaluating means comprises means for choosing one of said stored plurality of sets having the greatest cumulative performance value, for processing by said processing means.
39. A signal processing system for processing waves from a plurality of sources, said system comprising: a plurality of transducer means for receiving waves from said plurality of sources, including echoes and reverberations thereof and for generating a plurality of signals in response thereto, wherein each of said plurality of transducer means receives waves from said plurality of sources including echoes and reverberations thereof, and for generating one of said plurality of signals; means for calculating a set of a plurality of signal parameters, said calculating means comprising: means for generating a fixed plurality of sets of signal parameters; means for storing said fixed plurality of sets of signal parameters; means for generating a plurality of cumulative performance values, based upon said fixed plurality of sets of signal parameters, with a cumulative performance value generated for each set of signal parameters; means for evaluating said plurality of cumulative performance values, and choosing one of said plurality of cumulative performance values, based upon said evaluation, and one of said sets of signal parameters corresponding to said one of said plurality of cumulative performance values chosen; first processing means for receiving said plurality of signals, and said plurality of signal parameters of said set chosen for generating a plurality of first processed signals in response thereto, wherein each of said first processed signals represents waves from one source, and a reduced amount of waves from other sources; and second processing means for receiving said plurality of first processed signals and for generating a plurality of second processed signals in response thereto, wherein each of said second processed signals represents waves from only one source.
40. The system of claim 39, further comprising: means for generating a direction of arrival signal for said waves; wherein said first processing means generates said plurality of first processed signals in response to said direction of arrival signal.
41. The system of claim 39, wherein the number of transducer means is two, and the number of sources is two.
42. The system of claim 39, wherein said transducer means are spaced apart omnidirectional microphones and wherein said chosen one set of plurality of signal parameters has a set of relative delay parameters, and said first processing means comprises: a plurality of delay means, each for receiving one of said plurality of signals and using said set of relative delay parameters for generating a plurality of delayed signals in response thereto; and a plurality of combining means, each for receiving at least one delayed signal and one of said plurality of signals and for combining said received delayed signal and said signal to produce one of said first processed signals.
43. The system of claim 39 wherein said plurality of transducer means are co-located directional microphones and wherein said one of said set of signal parameters has a set of gain parameters associated therewith, and wherein said first processing means comprises: a plurality of multiplying means, each for receiving different ones of said plurality of signals and said set of gain parameters and for generating a scaled signal in response thereto; and a plurality of combining means, each for receiving at least one scaled signal and one of said plurality of signals and for combining said received scaled signal and said signal to produce one of said first processed signals.
44. The system of claim 39, 40, 41, 42, and 43 wherein said second processing means comprises: a plurality of combining means, each combining means having a first input, at least one other input, and an output; each of said combining means for receiving one of said first processed signals at said first input, an input signal at said other input, and for generating an output signal, at said output; said output signal being one of said plurality of second processed signals and is a difference between said first processed signal received at said first input and the sum of said input signal received at said other input; a plurality of adaptive filter means for generating a plurality of adaptive signals, each of said adaptive filter means for receiving said output signal from one of said plurality of combining means and for generating an adaptive signal in response thereto; and means for supplying each of said plurality of adaptive signals to one of said other input of said plurality of combining means other than the associated one combining means.
45. The system of claim 44 further comprising means for filtering each of said second processed signals to generate a plurality of third processed signals.
46. The system of claim 45 wherein said second processed signals are characterized by having a low frequency component and a high frequency component, and wherein said filtering means boosts the low frequency component relative to the high frequency component of said second processed signals.
47. A signal processing system for processing waves from a plurality of sources, said system comprising: a plurality of transducer means for receiving waves from said plurality of sources, including echoes and reverberations thereof and for generating a plurality of signals in response thereto, wherein each of said plurality of transducer means receives waves from said plurality of sources including echoes and reverberations thereof, and for generating one of said plurality of signals; an adaptive filter for generating a plurality of signal parameters, said filter comprising: means for generating a fixed plurality of sets of signal parameters; means for storing said fixed plurality of sets of signal parameters; means for generating a plurality of cumulative performance values, based upon said fixed plurality of sets of signal parameters, with a cumulative performance value generated for each set; means for evaluating said plurality of performance values, and choosing one of said plurality of performance values and its corresponding set of plurality of signal parameters, based upon said evaluation; first processing means for receiving said plurality of signals and said plurality of signal parameters and for generating a plurality of first processed signals in response thereto, wherein in the absence of echoes and reverberations of said waves from said plurality of sources, each of said first processed signals represents waves from only one different source; and second processing means for receiving said plurality of first processed signals and for generating a plurality of second processed signals in response thereto, wherein in the presence of echoes and reverberations of said waves from said plurality of sources, each of said second processed signals represents waves from only one source.
48. The system of claim 47 wherein said waves are acoustic waves, and said transducer means are microphones.
49. The system of claim 48 further comprising means for filtering each of said second processed signals to generate a plurality of third processed signals.
50. The system of claim 49 wherein said second processed signals are characterized by having a low frequency component and a high frequency component and wherein said filtering means boosts the low frequency component relative to the high frequency component of said second processed signals.
51. The system of claim 49 wherein said microphones are spaced apart omnidirectional microphones and wherein said corresponding set of signal parameters has a set of relative delay parameters associated therewith; and said first processing means comprises: a plurality of delay means, each for receiving one of said plurality of signals and said set of relative delay parameters and for generating a delayed signal in response thereto; and a plurality of combining means, each for receiving at least one delayed signal and one of said plurality of signals and for combining said received delayed signal and said signal to produce one of said first processed signals.
52. The system of claim 48 wherein said microphones are co-located directional microphones wherein said corresponding set of signal parameters has a set of gain parameters associated therewith; and said first processing means comprises: a plurality of multiplying means, each for receiving different ones of said plurality of signals and said set of gain parameters and for generating a scaled signal in response thereto; and a plurality of combining means, each for receiving at least one scaled signal and one of said plurality of signals and for combining said received scaled signal and said signal to produce one of said first processed signals.
53. The systems of claims 47, 48, 49, 50, 51 and 52, wherein said second processing means comprises: a plurality of combining means, each combining means having a first input, at least one other input, and an output; each of said combining means for receiving one of said first processed signals at said first input, an input signal at said other input, and for generating an output signal, at said output; said output signal being one of said plurality of second processed signals and is a difference between said first processed signal received at said first input and the sum of said input signal received at said other input; a plurality of adaptive filter means for generating a plurality of adaptive signals, each of said adaptive filter means for receiving said output signal from one of said plurality of combining means and for generating an adaptive signal in response thereto; and means for supplying each of said plurality of adaptive signals to one of said other input of said plurality of combining means other than the associated one combining means.
54. The system of claim 53 wherein each of said adaptive filter means comprises a tapped delay line.
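Claim 54's tapped delay line is the standard adaptive-FIR structure: recent samples of the reference input shift through a line, and one weight per tap forms the adaptive signal. The claims do not name an update rule, so the sketch below uses LMS purely for illustration; `n_taps` and `mu` are hypothetical choices.

```python
def lms_tap_delay_filter(x, d, n_taps=4, mu=0.05):
    """Tapped-delay-line adaptive filter trained with LMS (illustrative).

    x: reference input samples; d: desired signal samples.
    Returns the final tap weights and the error sequence
    e[t] = d[t] - sum_k w[k] * x[t - k].
    """
    w = [0.0] * n_taps
    line = [0.0] * n_taps          # tapped delay line of recent inputs
    errors = []
    for xt, dt in zip(x, d):
        line = [xt] + line[:-1]    # shift the new sample into the line
        y = sum(wi * li for wi, li in zip(w, line))
        e = dt - y
        # LMS update: step each weight along its input's direction.
        w = [wi + mu * e * li for wi, li in zip(w, line)]
        errors.append(e)
    return w, errors
```

When the desired signal is a scaled, delayed copy of the reference that fits within the line, the weights converge to that scale at the matching tap and the error decays toward zero.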
55. An adaptive filter signal processing system for processing waves from a plurality of sources, said system comprising: a plurality of transducer means for receiving waves from said plurality of sources, including echoes and reverberations thereof and for generating a plurality of signals in response thereto, wherein each of said plurality of transducer means receives waves from said plurality of sources including echoes and reverberations thereof, and for generating one of said plurality of signals; an adaptive filter for generating a plurality of signal parameters, said filter comprising: means for generating a fixed plurality of sets of signal parameters; means for storing said fixed plurality of sets of signal parameters; means for generating a plurality of performance values, based upon said fixed plurality of sets of signal parameters, with a performance value generated for each set; means for evaluating said plurality of performance values, and choosing one of said plurality of performance values, and its corresponding set of signal parameters, based upon said evaluation; first processing means comprising a beamformer for receiving said plurality of signals and said plurality of signal parameters, and for generating a plurality of first processed signals in response thereto, wherein each of said first processed signals represents waves from one source, and a reduced amount of waves from other sources; and second processing means for receiving said plurality of first processed signals and for generating a plurality of second processed signals in response thereto, wherein each of said second processed signals represents waves from only one source.
56. The system of claim 55, wherein said transducer means are spaced apart omnidirectional microphones and said corresponding set of signal parameters has a set of delay parameters associated therewith, and wherein said first processing means comprises: a plurality of delay means, each for receiving one of said plurality of signals and said set of delay parameters, and for generating a delayed signal in response thereto; and a plurality of combining means, each for receiving at least one delayed signal and one of said plurality of signals and for combining said received delayed signal and said signal to produce one of said first processed signals.
57. The system of claim 55 wherein said second processing means comprises: a plurality of combining means, each combining means having a first input, at least one other input, and an output; each of said combining means for receiving one of said first processed signals at said first input, an input signal at said other input, and for generating an output signal, at said output; said output signal being one of said plurality of second processed signals and being a difference between said first processed signal received at said first input and the sum of said input signals received at said other inputs; a plurality of adaptive filter means for generating a plurality of adaptive signals, each of said adaptive filter means for receiving said output signal from one of said plurality of combining means and for generating an adaptive signal in response thereto; and means for supplying each of said plurality of adaptive signals to one of said other input of said plurality of combining means other than the associated one combining means.
58. The system of claims 55, 56, and 57, wherein said first processing means comprises analog circuits.
59. The system of claims 55, 56, and 57, wherein said second processing means comprises analog circuits.
60. The system of claims 55, 56, and 57, wherein said first processing means are a part of a digital signal processor.
61. The system of claims 55, 56, and 57, wherein said second processing means are a part of a digital signal processor.
62. The system of claims 55, 56, and 57, wherein said first processing means are a part of a general purpose computer.
63. The system of claims 55, 56, and 57, wherein said second processing means are a part of a general purpose computer.
64. The system of claims 55, 56, and 57, wherein said first processing means are reconfigurable gate array circuits.
65. The system of claims 55, 56, and 57, wherein said second processing means are reconfigurable gate array circuits.
66. The system of claim 55, further comprising: a plurality of third processing means for receiving said plurality of second processed signals and for removing frequency coloration therefrom.
67. The system of claim 56 further comprising: a plurality of sampling digital converting means, each for receiving a different one of said plurality of signals and for generating a digital signal; said digital signal supplied to said plurality of delay means and to said plurality of combining means, as said signal.
68. The system of claims 55, 56, and 57 further comprises: means for periodically generating a new set of a plurality of signal parameters; means for generating a new cumulative performance value, for each of said sets, including said new set; means for comparing said new cumulative performance values; and switch means for either i) replacing one of said fixed plurality of sets, by said new set; or ii) deleting said new set, in response to said comparing means.
69. The system of claim 68 further comprises: means for generating a plurality of instantaneous performance values for each of said fixed plurality of sets, with each instantaneous performance value generated at a different time; means for combining said plurality of instantaneous performance values for each set to produce a plurality of cumulative performance values, with a cumulative performance value produced for each set.
70. The system of claim 69, wherein said means for generating a new cumulative performance value for each of said sets, including said new set, comprises means for arithmetically combining random values from a pseudorandom number generator, one set in said plurality of sets, and recent values in said plurality of input signals.
71. The system of claim 70 further comprises: means for generating a plurality of instantaneous performance values for each of said sets, including said new set, with each instantaneous performance value generated at a different time; and wherein: said comparing means compares instantaneous performance value of said new set with instantaneous performance value of one of said stored plurality of sets having a least instantaneous performance value; and wherein said replacing means replaces one of said stored plurality of sets having said least instantaneous performance value.
72. The system of claim 68 further comprises means for generating an initial cumulative performance value for said new set and means for subtracting a fixed value from a cumulative performance value of one of said stored plurality of sets having the greatest cumulative performance value; and wherein said comparing means comprises means for choosing one of said stored plurality of sets having the greatest cumulative performance value, for processing by said processing means.
Description:
Adaptive Filter for Signal Processing and Method Therefor

This application is submitted with a microfiche appendix containing copyrighted material.

Copyright 1994, Interval Research Corporation. The Appendix consists of one (1) microfiche with forty-six (46) frames. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure, as it appears in the Patent and Trademark Office patent file or records, but otherwise reserves all copyright rights whatsoever in the appendices.

Technical Field

This invention relates to the field of adaptive filter signal processing, and more particularly to an adaptive filter signal processor for use in a two-stage microphone-array signal processor for separating sounds acoustically mixed in the presence of echoes and reverberations. More particularly, the present invention relates to an improved direction-of-arrival estimator that estimates the optimal delay values for use in the direct signal separator, i.e., the first stage in the two-stage signal processor.

Background of the invention

It is well known that a human being can focus attention on a single source of sound even in an environment that contains many such sources. This phenomenon is often called the "cocktail-party effect." Considerable effort has been devoted in the prior art to solving the cocktail-party problem, both in physical devices and in computational simulations of such devices. In a co-pending application we have disclosed a two-stage microphone-array signal processor in which a first stage partially accomplishes the intended aim using information about the physical directions from which signals are arriving. However, that application discloses the use of a conventional direction-of-arrival estimator to estimate the various delays in the acoustic waves, used in the first stage to produce the signals. This application discloses an improved direction-of-arrival estimator.

Estimation of signal parameters is pivotal in such a processor, as well as in a variety of signal-processing applications such as source separation and source localization. Examples of signal parameters are the differences in propagation delays from a given source to the various sensors, and the direction from which a given source signal is arriving.

One prior art solution to the problem of signal-parameter estimation is to use directional sensors: either a collection of such sensors can be mounted so as to tessellate the set of directions of interest, or a single such sensor can be used to scan the possible directions. In either case, the strategy is to identify directions from which maximum signal power is detected. Another prior art approach is to scan with a directional sensor employing classical beamforming. Signals from a collection of sensors at different spatial locations are subjected to relative delays and then averaged together, such that waves arriving from a particular "look direction" are enhanced relative to waves arriving from other directions. Thus, the beamformer behaves as a directional sensor whose direction of maximum sensitivity can be changed without physically reconfiguring the sensors. Flanagan et al. (1985) described a system in which speakers in an auditorium are located by a beamformer that continuously scans the auditorium floor.

There are at least two fundamental shortcomings of this approach. First, the rate at which a new source can be located and tracked is limited by the rate at which the space of possible directions can be swept; that rate depends, in turn, on the length of time required to obtain a reasonably accurate estimate of the signal power being received from a particular direction. Second, resolution is fundamentally limited since the signal averaging done by a classical beamformer cannot provide relative attenuation of frequency components whose wavelengths are larger than the array. For example, a microphone array one foot wide cannot attenuate frequency components in a sound that are below approximately 1 kHz.

Other prior art to increase the speed of source localization is achieved by a strategy that involves, in essence, shifting signals from two or more sensors in time to achieve a best match. A simple version of this principle is applied in a commercial device called the Direction Finding Blast Gauge (DFBG), developed by G. Edwards of GD Associates (1994). Designed to determine the direction from which short explosive sounds arrive, the DFBG records the times at which a shock wave traverses four symmetrically placed pressure sensors. Direction of arrival is calculated from these times in conventional manner. The DFBG is designed for short explosive sounds that are relatively isolated.

The same principle used in the DFBG has been applied in a more refined way to other types of sound. Coker and Fischell (1986) patented a system that localizes signals from speech, which consists approximately of a train of energy bursts. Sugie et al. (1988), Kaneda (1993), and Bhadkamkar and Fowler (1993) implemented systems that exploit the tendency of naturally occurring sounds to have fairly well-defined onsets: associated with each microphone is a detector that produces a spike whenever an onset occurs. All of the systems mentioned in this paragraph operate by finding inter-microphone time delays that maximize the rate at which coincident spikes are detected. They compensate for noise in the individual time-delay estimates by generating joint estimates from multiple measurements. When many sources of sound are present, these statistical estimates are derived by histogram methods. Thus, the accuracy of the method depends on an assumption that over the time scale on which the individual time-delay estimates are made, the spike trains come predominantly from one source.
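The coincidence-maximization strategy described above can be illustrated with a small sketch (the names are hypothetical, and exact sample coincidence is assumed for brevity; the cited systems use a tolerance window and bin many such estimates into histograms):

```python
def best_lag(spikes_a, spikes_b, max_lag):
    """Pick the inter-microphone delay (in samples) that maximizes the
    number of coincident onset spikes, in the spirit of the systems
    cited above."""
    b = set(spikes_b)

    def coincidences(lag):
        # count spikes from mic A that, shifted by `lag`, hit a spike at mic B
        return sum(1 for t in spikes_a if t + lag in b)

    return max(range(-max_lag, max_lag + 1), key=coincidences)
```

As the text notes, the estimate is only as good as the assumption that most spikes in the measurement interval come from a single source; spurious spikes from other sources simply fail to accumulate coincidences at any one lag.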

In a similar vein, noisy time-delay estimates can be obtained from narrowband signals by comparing their complex phases, provided that the magnitudes of the delays are known in advance not to exceed the period of the center frequency. Joint estimates can be produced, as above, by histogram methods. The front end of a system described by Morita (1991) employs a scheme of this type.

The approaches to sound localization employed by Edwards (1994), by Coker and Fischell (1986), by Sugie et al. (1988), by Kaneda (1993), by Bhadkamkar and Fowler (1993) and by Morita (1991) accommodate multiple sources by exploiting an assumption that in most time intervals over which individual time-delay estimates are made, signal from only one source is present.

The techniques to which we now turn our attention, like the present invention, differ from these in two ways. First, they do not rely on the assumption that at most one source is present per measurement interval. Second, they are designed for a more general domain of signal-parameter estimation, in which time-delay estimation is only one example.

The so-called "signal-subspace" techniques, as exemplified by ESPRIT (Roy et al., 1988) and its predecessor, MUSIC (Schmidt, 1981), are specific for narrowband signals. The collection of n sensor signals is treated as a 2n-dimensional vector that varies in time. From prior knowledge of the array configuration, it is known what dimensions of this 2n-dimensional signal space would be accessible to the signal vector arising from a single source, as a function of the desired signal parameters. Moreover, by diagonalizing the covariance matrix of the observed signal vector and by inspecting the resulting eigenvalues and eigenvectors, it is known what dimensions of the signal space are actually spanned by the signal vector. Estimates for the desired signal parameters are extracted by comparing these two pieces of information. This prior art requires the source signals not be fully correlated; otherwise, the observed covariance matrix contains insufficient information to extract the signal parameters.

The shortcomings of this prior art relative to the present invention are (a) that it is computationally intensive; (b) that the sensor-array configuration must be partially known; and (c) that the number of sensors must far exceed the number of sources (otherwise, the signal vector can span all available dimensions). Another prior art approach is based on an assumption that source signals are statistically independent. The condition of statistical independence is much more stringent than that of decorrelation: whereas two signals x(t) and y(t) are decorrelated if E{x(t)y(t)} = E{x(t)}E{y(t)}, they are statistically independent only if E{f[x(t)]g[y(t)]} = E{f[x(t)]}E{g[y(t)]} for any scalar functions f and g. (The notation E{·} denotes an estimate of the expected value of the enclosed expression, as computed by a time average.) In return for requiring the more stringent conditions on the nature of the source signals, statistical-independence techniques do not always require prior knowledge of the array configuration. (For example, if the signal parameters to be estimated are the relative propagation delays from the sources to the sensors, nothing need be known about the array geometry; however, inter-sensor distances are obviously needed if the relative time delays are to be translated into physical directions of arrival.)
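The gap between decorrelation and statistical independence can be seen numerically. In this illustrative snippet (our construction, not from the disclosure), y(t) = x(t)^2 is completely determined by x(t), yet the two sequences are decorrelated; a nonlinear test function f exposes the dependence:

```python
def time_avg(vals):
    # E{.}: expected value estimated by a time average, as in the text
    return sum(vals) / len(vals)

# x(t) cycles through -1, 0, 1; y(t) = x(t)^2 is fully determined by x(t)
xs = [-1, 0, 1] * 100
ys = [x * x for x in xs]

# Decorrelation holds: E{x y} - E{x} E{y} vanishes...
corr_gap = time_avg([x * y for x, y in zip(xs, ys)]) - time_avg(xs) * time_avg(ys)

# ...but independence fails: with f(v) = v^2 and g the identity,
# E{f[x] g[y]} differs from E{f[x]} E{g[y]}
f = lambda v: v * v
indep_gap = (time_avg([f(x) * y for x, y in zip(xs, ys)])
             - time_avg([f(x) for x in xs]) * time_avg(ys))
```

For this sequence `corr_gap` is exactly zero while `indep_gap` works out to 2/9, confirming that decorrelation alone does not imply statistical independence.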

Moreover, these techniques accomplish the more difficult goal of source separation, i.e., recovery of the original source signals from the observed mixtures; signal-parameter estimation is normally a by-product.

Like signal-subspace techniques, some statistical-independence techniques begin by diagonalizing the covariance matrix because doing so decorrelates the components of the signal vector. However, they are distinct because they use the results of diagonalization in different ways. Critical to understanding the distinction is that neither parameter estimation nor source separation can be achieved merely by diagonalizing the covariance matrix (Cardoso, 1989; Lacoume and Ruiz, 1992; Moulines and Cardoso, 1992). Whereas the signal-subspace techniques fill in the missing information by employing knowledge of the sensor-array configuration, the statistical-independence techniques do so by applying additional conditions based on the sensor-signal statistics alone.

Statistical-independence techniques can be summarized as follows. The source signals are treated as a signal vector, as are the sensor signals. The processes by which the source signals propagate to the sensors are modeled as a transformation H from the source signal vector to the sensor signal vector. This transformation can consist of a combination of additions, multiplications, time delays, and convolutive filters. If the transformation H were known, then its inverse (or pseudoinverse) could be computed, and applied to the sensor signal vector to recover the original sources. However, since the source locations relative to the sensor array are not known, the inverse transformation must be estimated. This is done by finding a transformation S that, when applied to the sensor signal vector, produces an output signal vector whose components are statistically independent. Statistical-independence techniques are relevant in the present context inasmuch as signal parameters can often be estimated from S. Most existing statistical-independence techniques that are adaptive, i.e., able to alter S in response to changes in the mixing transformation H, operate by maintaining a single estimate of S.
As new data from the signal stream become available, they are incorporated by updating the estimate of S. These are referred to as "single-estimate" techniques because memory of past data is contained, at any given instant, in a single estimate of S. The chief goal of this invention is to remove a dilemma, common to single-estimate techniques, that arises from the presence of two opposing factors in the selection of time constants for adaptation. On one hand, the mixing transformation H can undergo large, abrupt, sporadic changes; the abruptness of these changes calls for short time constants so that the system can adapt quickly. On the other hand, the source signals can be such that they are statistically independent only over relatively long time scales: when averaging is performed over shorter time scales, statistical error and coincidental correlation imply deviations from true statistical independence. To obtain accurate estimates of S from such data, it would seem necessary to employ long time scales.

Previous adaptive techniques for parameter estimation based on statistical independence are now reviewed. In each case it is shown why the given technique can be considered a single-estimate procedure.

In one class of single-estimate techniques, the following steps are executed periodically. Joint statistics of the signal vector are averaged over a set of times defined relative to the current time, with inverse time constant β. A value of S, called S', is chosen such that it enforces certain conditions of statistical independence exactly over the averaging time. The estimate of S used to generate the separated outputs of the system may be S' itself, but more commonly it is a smoothed version of S', with smoothing parameter μ. A well-known technique in this class is Fourth Order Blind Identification (FOBI), in which S' is computed by simultaneously diagonalizing the covariance and quadricovariance matrices, thus satisfying conditions of the form E{x(t)y(t)} = 0 and E{x^3(t)y(t) + x(t)y^3(t)} = 0 (Cardoso, 1989; Lacoume and Ruiz, 1992). Since x(t) and y(t) are assumed to have zero mean, the two equations are necessary conditions for statistical independence.

Another class of techniques, whose genesis is attributed to Herault and Jutten (1986), is closely related to the LMS algorithm (Widrow, 1970). S is taken to be a matrix of simple gains; thus, it is assumed that each source signal arrives at every microphone simultaneously. The quantity ΔS is computed at every time step, based on the instantaneous value of the output vector, y=Sx:

ΔS_ij = f[y_i(t)] g[y_j(t)], where f and g are different odd nonlinear functions. The estimate S is updated at every time step, according to the rule S(new) = S(old) + μ ΔS. We will refer to the use of rules of this type as "weight-update" techniques. This technique works because when the components of the output-signal vector y are statistically independent, the statistical relation E{f[y_i(t)]g[y_j(t)]} = E{f[y_i(t)]}E{g[y_j(t)]} holds for any different i and j. Since the signals have zero mean and f and g are odd, E{f[y_i(t)]} and E{g[y_j(t)]} vanish; consequently, E{ΔS} = 0, and the transformation S fluctuates about a fixed point. (If f and g were both linear, then the Herault-Jutten technique would only decorrelate the outputs; and decorrelation is insufficient for statistical independence.)
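A single step of this weight-update rule can be sketched as follows. The function name, the sign convention on the off-diagonal update, and the particular odd nonlinearities f and g are illustrative choices; the cited variants differ in exactly these details:

```python
import math

def hj_step(S, x, mu, f=lambda v: v ** 3, g=math.tanh):
    """One weight-update step of the simplified separator y = S x:
    each off-diagonal gain is nudged by -mu * f[y_i(t)] g[y_j(t)],
    with f and g odd nonlinear functions."""
    n = len(x)
    # instantaneous output vector y = S x
    y = [sum(S[i][j] * x[j] for j in range(n)) for i in range(n)]
    # perturb the off-diagonal gains toward statistical independence
    for i in range(n):
        for j in range(n):
            if i != j:
                S[i][j] -= mu * f(y[i]) * g(y[j])
    return y
```

Because only the time-averaged update vanishes at the fixed point, S keeps fluctuating for any nonzero μ, which is the single-estimate dilemma this invention is designed to remove.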

A host of other weight-update techniques have been derived from Herault and Jutten (1986). Some variants employ a different criterion for separation in which y_i(t) is decorrelated with y_j(t-T_1) and with y_j(t-T_2), where T_1 and T_2 are different time delays, determined in advance (Tong et al., 1991; Abed-Meraim et al., 1994; Van Gerven and Van Compernolle, 1994; Molgedey and Schuster, 1994). Other variants of Herault-Jutten employ a transformation S whose elements are time delays (Platt and Faggin, 1992; Bar-Ness, 1993) or convolutive filters (Jutten et al., 1992; Van Compernolle and Van Gerven, 1992). By going beyond the simple gains employed by Herault and Jutten, these techniques come closer to separating signals that have been mixed in the presence of propagation delay and echoes. Certain difficulties in adaptation associated with the use of convolutive filters are addressed by a recent class of two-stage structures (Dinç and Bar-Ness, 1993; Najar et al., 1994) in which a second weight-update stage is placed before or after the first, to improve the quality of separation attained after convergence.

Single-estimate techniques for adaptive source separation all suffer to some extent from the dilemma described above: that the rate of adaptation cannot be increased without simultaneously increasing the size of fluctuations present even after convergence is reached. In some practical applications, there exist settings of μ for which both the rate of adaptation and the size of the fluctuations are acceptable; in others, there is a pressing need for techniques in which adaptation and fluctuation are not intrinsically co-dependent.

To accomplish this end, the present invention is founded on a break from the tradition of maintaining a single estimate of S and incorporating new data by changing the value of S. Rather, multiple estimates of S, called hypotheses, are maintained. Associated with each hypothesis is a number called the cumulative performance value. In contrast with single-estimate techniques, which incorporate new data by perturbing the estimate of S, the present technique incorporates new data by perturbing the cumulative performance values. Thus, each hypothesis remains unchanged from the time it is created to the time that it is destroyed. At any given instant, the hypothesis of greatest cumulative performance value is taken to be the correct one. Optionally, smoothing may be employed to avoid discontinuous changes in S; however, this smoothing is not an intrinsic part of the adaptive process.
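The contrast with weight-update techniques can be made concrete with a minimal sketch (hypothetical names and interfaces, and with the periodic replacement of weak hypotheses omitted). The hypotheses themselves never change; only their cumulative performance values absorb new data, and the system simply uses the hypothesis with the greatest accumulated score:

```python
def run_population(instant_perf, make_hypothesis, n_hyps, n_steps):
    """Maintain a population of frozen hypotheses; new data perturb
    only the cumulative performance values, never the hypotheses."""
    hyps = [make_hypothesis() for _ in range(n_hyps)]
    cumulative = [0.0] * n_hyps
    for t in range(n_steps):
        for k, h in enumerate(hyps):
            # each hypothesis is scored on the same data window
            cumulative[k] += instant_perf(h, t)
    # the hypothesis of greatest cumulative value is taken to be correct
    best = max(range(n_hyps), key=lambda k: cumulative[k])
    return hyps[best]
```

Because each hypothesis is frozen, instantaneous noise in the performance function averages out in the cumulative score rather than jittering the parameters themselves, which is how this structure decouples adaptation rate from post-convergence fluctuation.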

Until this point we have discussed related art for achieving sound separation, which is one particular application of adaptive filtering. However, the present invention is contemplated to have potential uses in other applications of adaptive filtering. We now therefore turn our attention to related art in the broader field of adaptive filtering. Compared to any given variant of the standard techniques, the present invention is superior in at least one of the following ways:

• There is essentially no restriction on the form of the performance function, although from a practical standpoint certain performance functions can be computationally intensive to handle.

• Once convergence is achieved, it does not necessarily exhibit fluctuations in proportion to the rate of adaptation.

• It does not necessarily fail when the performance function is underdetermined.

These attractive properties of the present invention will be explained in the context of the standard techniques.

We first describe stochastic-gradient (SG) techniques and explain the shortcoming of SG techniques that the present invention is intended to avoid. We describe how least squares (LS) techniques overcome this limitation, and state why LS techniques cannot always be used. We identify a source of difficulty in adaptive filtering that we call instantaneous underdetermination, and explain why the present invention is superior to SG and LS with respect to its handling of instantaneous underdetermination.

We then discuss a less standard, but published, technique based on the genetic algorithm (GA). We state why the GA might be considered prior art, then differentiate the present invention from it.

Problems in adaptive filtering can generally be described as follows. One or more time-varying input signals x_i(t) are fed into a filter structure H with adjustable coefficients h_k, i.e., H = H(h_1, h_2, ...). Output signals y_i(t) emerge from the filter structure. For example, an adaptive filter with one input x(t) and one output y(t) in which H is a linear, causal FIR filter might take the form:

y(t) = Σ_{k=0}^{N} h_k x(t - kΔt).
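For concreteness, the FIR form above (with Δt equal to one sample) evaluates as follows; the function name is ours:

```python
def fir_output(h, x, t):
    """y(t) = sum over k of h_k * x(t - k): the linear, causal FIR
    structure given above, truncated at the start of the record."""
    return sum(h[k] * x[t - k] for k in range(len(h)) if t - k >= 0)
```

Adapting such a filter means adjusting the entries of `h` over time, which is the subject of the techniques compared below.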

In general, H might be a much more complicated filter structure with feedback, cascades, parallel elements, nonlinear operations, and so forth. The object is to adjust the unknowns h k such that a performance function C(t) is maximized over time.

The performance function is usually a function of statistics computed over recent values of the outputs y_i(t); but in general, it may also depend on the input signals, the filter coefficients, or other quantities such as parameter settings supplied exogenously by a human user.

The most basic SG technique, Least Mean Square (LMS), can be used for instances of this problem in which the gradient of the performance function C(t) with respect to the filter coefficients h_i(t) can be expressed as a time average of some quantities G_i(t). The Herault-Jutten algorithm for sound separation, described previously, may be regarded as an example of LMS.

For example, consider the transmission-line echo-cancellation problem typical in the design of full-duplex modems:

y(t) = x_1(t) + Σ_k h_k x_2(t-kΔt)

C(t) = E[y^2(t)]. Differentiating C with respect to the filter coefficients h_i, we have:

∂C/∂h_i = E[2 y(t) x_2(t-iΔt)] = E[G_i], where

G_i(t) = 2 y(t) x_2(t-iΔt).

The LMS algorithm calls for updating the coefficients h_i as follows: h_i(new) = h_i(old) + μ G_i(t), where μ is a stepsize that is chosen in advance. It may be seen that when the coefficients h_i are near a local maximum of C, i.e., the gradient of C is zero and the Hessian matrix of C is negative definite, then E[G_i] = 0 and the coefficients fluctuate about that maximum.

Thus, unlike the present invention, SG techniques require that the performance function be differentiable, and that the gradient be numerically easy to compute. In addition, SG techniques suffer from a key problem: for any nonzero stepsize μ, the coefficients fluctuate. Stated more generally, two important performance criteria - the rate of adaptation and the steady-state error in the outputs - are at odds with each other in the setting of μ. Numerous solutions to this dilemma have been proposed:

• In block-based variants of LMS, updates to the filter coefficients are accumulated over blocks of time and added to the coefficients at the end of each block, instead of being added at every time step.

• In adaptive-stepsize LMS, the quantity μ is adjusted according to a predefined schedule ("gearshifting") or is itself adapted as a parameter of the system using an algorithm similar to LMS ("gradient adaptive step size").

These incremental modifications to the LMS algorithm mitigate, but do not eliminate, the problem of fluctuation. This problem cannot be eliminated fully without abandoning the principle of operation behind LMS and its variants: that only the time-averaged fluctuation E[μG_i] (not the individual update, μG_i) is zero when the filter is fully adapted.

The stochastic fluctuations just described are absent in the so-called Least Squares (LS) techniques, of which Recursive Least Squares (RLS) is the archetype. The operations performed by RLS are mathematically equivalent to maximizing C(t) exactly at every time step; thus, in RLS, the computed filter coefficients converge asymptotically to their correct values as fast as the statistics estimated in the computation of C(t) converge.

Maximizing C(t) exactly at every step would be computationally prohibitive if done by brute force at every time step, because C(t) is a functional that depends on integrals of the data from the beginning of time. In RLS, a practical level of efficiency is achieved by a mathematical trick that permits the exact maximum of C(t) for the current time step to be computed from the exact maximum computed at a previous time step and the intervening data.

An LS technique is often the method of choice when a problem can be formulated appropriately. However, because of reliance on "mathematical tricks" to maximize C(t) exactly and efficiently at every time step, LS techniques have been developed only for certain restricted classes of adaptive-filtering problems:

• H is an FIR filter and

C(t) = Σ_n a(n) b[y(t-nΔt) - d(t-nΔt)],

where d(t) is a desired signal, a(·) is one of a handful of averaging functionals that can be computed recursively, and b[·] is typically (·)^2;

• H is a lattice filter and C(t) is as above;

• H is a filter with a systolic architecture and C(t) is as above;

• H is an IIR filter and C(t) is in "equation-error" form.

In contrast, the present invention requires neither a closed-form solution for the maximum of C(t) nor an efficient update scheme for computing the current maximum of C(t) from a past maximum and the intervening data. Thus, the present invention may be considered an alternative to LS and the SG methods for situations in which the latter do not work well enough and the former do not exist.

In addition to the comparisons made so far, differences between LS, SG, and this invention arise in the handling of instantaneous underdetermination. For the present purposes, we use the term "instantaneously underdetermined" to describe a performance function for which global maximization at any particular instant does not necessarily produce the desired filter coefficients—the "correct" filter coefficients might not be the only ones that produce a maximal or numerically near-maximal value of the performance function. Such degeneracy can occur when the locus of global maxima of the performance function is delocalized, or when the basin of attraction is of sufficiently low curvature in one or more dimensions that noise and numerical imprecision can cause the apparent maximum to move far from the ideal location. When a performance function is instantaneously underdetermined, snapshots of the performance surface from many different instants in time may together contain enough information to determine a solution.

LS methods operate by attempting to find the maximum of C(t) at every instant in time, effectively by solving a system of linear equations. Underdetermination is manifested as ill-conditioning or rank deficiency of the coefficient matrix. The solution returned by such an ill-conditioned system of linear equations is very sensitive to noise.

The behavior of SG methods and of the present invention with respect to instantaneous underdetermination is more difficult to analyze since both systems involve the accumulation of small changes over time. To simplify the comparison, note that the present invention can use a standard SG technique or any other method for adaptive filtering as a source of guesses at the correct filter coefficients. Thus, such a combination can always find the correct (or nearly correct) filter coefficients if the SG technique alone can find them. The two methods differ in what happens after such coefficients are found. In the case of SG, fluctuations can eventually cause the coefficients h_i(t) to drift far from their correct values because, with instantaneous underdetermination, the update vectors μG_i(t) can point away from the correct values (Cohen, 1991). In the case of SG combined with the present invention, if a set of correct filter coefficients ever appears in the population of candidate parameter sets, the invention as a whole will use those coefficients without fluctuation.

We turn our attention in the remainder of this section to published work in adaptive filtering based on the genetic algorithm (GA):

R. Nambiar, C.K.K. Tang, and P. Mars, "Genetic and Learning Automata Algorithms for Adaptive Digital Filters," ICASSP '92, IV:41-44 (1992).

S.J. Flockton and M.S. White, "Pole-Zero System Identification Using Genetic Algorithms," Fifth International Conference on Genetic Algorithms, 531-535 (1993).

S.C. Ng, C.Y. Chung, S.H. Leung, and Andrew Luk, "Fast Convergent Genetic Search for Adaptive IIR Filtering," ICASSP '94, III:105-108 (1994).

These applications of the GA to adaptive filtering are similar in form to the present invention because the GA employs a population of candidate solutions.

The works of Calloway (1991), Etter and Dayton (1983), Etter et al. (1982), Eisenhammer et al. (1993), Flockton and White (1993), Michielssen et al. (1992), Suckley (1992), and Xu and Daley (1992) are the easiest to eliminate from consideration as prior art because the problems under consideration did not require adaptation: the object was to adjust unknown filter coefficients such that the frozen performance function C(t) is maximized at a single instant in time. Thus the design of these GAs does not take into account any of the difficulties associated with maximizing a time-varying performance function consistently over time.

The remaining known applications of the GA to signal processing are examples of adaptive filtering in the sense that they do accommodate time variation of the performance function. These are Nambiar et al. (1992), Nambiar and Mars (1992), Ng et al. (1994), Patiakin (1993), and Stearns et al. (1982). In every case, however, the only way that time variation is taken into account is via what might be called the "headstart" principle. Rather than attempt to find the optimum of C(t) independently for every time t, these methods use the population of solutions from a time in the recent past as an initial guess at the solution for the current time. In some cases, rounds of gradient descent are interleaved with short runs of the GA, but the principle is the same. All of these GAs operate with the goal of finding the global optimum of C(t) for every time t.

These GAs do not solve the fundamental problem that a snapshot of the performance function at any given time contains noise. When the performance function fluctuates, so do the optimized filter parameters. One way to make each snapshot more reliable, and therefore make the optimized filter parameters fluctuate less, is to lengthen the time windows of the averages used to compute the performance function. Another way is to smooth the optimized filter parameters in time. Both approaches make the system slow to react to changes. Moreover, the fluctuations and sensitivity to noise mentioned in this paragraph are exacerbated if the performance function is underdetermined. It is desirable to develop a method for adaptive filtering that applies even when the performance function cannot be maximized analytically or differentiated, that does not exhibit fluctuations in proportion to the rate of adaptation, and that copes well with noise and underdetermination.

Summary of The Invention

The present invention is a method and an apparatus for determining a set of a plurality of parameter values for use in an adaptive filter signal processor to process a plurality of signals to generate a plurality of processed signals.

The method stores a plurality of the sets, called a population. The population is initially generated at random. Each set is evaluated according to an instantaneous performance value. The instantaneous performance value of each set is added, at each clock cycle, to a cumulative performance value. At any given time, the set with the greatest cumulative performance value is used in the adaptive filter signal processor.

Periodically, a new parameter set is generated at random for possible incorporation into the population. If the instantaneous performance value of the new parameter set exceeds the least instantaneous performance value in the population, it is incorporated into the population. The set with the least cumulative performance value is deleted from the population, and the new parameter set is assigned an initial cumulative performance value that falls short of the greatest cumulative performance value by a fixed amount.
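The population mechanism just described can be sketched in a few lines of code. The following Python sketch is illustrative only: the names (`Population`, `make_candidate`, the user-supplied `perf_fn`) are hypothetical, and the instantaneous-performance calculator itself is assumed to be provided by the surrounding adaptive filter signal processor.

```python
import random

def make_candidate(num_params, lo=-1.0, hi=1.0):
    """Generate one random parameter set (e.g., relative delays)."""
    return [random.uniform(lo, hi) for _ in range(num_params)]

class Population:
    """Maintains candidate parameter sets with cumulative performance scores."""

    def __init__(self, size, num_params, perf_fn, handicap=1.0):
        self.perf_fn = perf_fn          # instantaneous-performance calculator
        self.handicap = handicap        # fixed offset for newly admitted sets
        self.sets = [make_candidate(num_params) for _ in range(size)]
        self.cumulative = [0.0] * size

    def step(self):
        """One clock cycle: accumulate every set's score, return the leader."""
        inst = [self.perf_fn(s) for s in self.sets]
        for i, v in enumerate(inst):
            self.cumulative[i] += v
        best = max(range(len(self.sets)), key=lambda i: self.cumulative[i])
        return self.sets[best]

    def try_newcomer(self, candidate):
        """Admit a random newcomer if it beats the worst instantaneous score."""
        inst = [self.perf_fn(s) for s in self.sets]
        if self.perf_fn(candidate) > min(inst):
            # Delete the set with the least *cumulative* performance value.
            worst = min(range(len(self.sets)), key=lambda i: self.cumulative[i])
            del self.sets[worst]
            del self.cumulative[worst]
            self.sets.append(candidate)
            # Newcomer starts a fixed amount below the current leader.
            self.cumulative.append(max(self.cumulative) - self.handicap)
```

At each clock cycle, `step` returns the set that would be handed to the adaptive filter signal processor, while `try_newcomer` implements the periodic replacement rule.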

Brief Description of The Drawings

Figure 1 is a schematic block diagram of an embodiment of an acoustic signal processor, using two microphones, with which the adaptive filter signal processor of the present invention may be used.

Figure 2 is a schematic block diagram of an embodiment of the direct-signal separator portion, i.e., the first stage of the processor shown in Figure 1.

Figure 3 is a schematic block diagram of an embodiment of the crosstalk remover portion, i.e., the second stage of the processor shown in Figure 1.

Figure 4 is a schematic block diagram of an embodiment of a direct-signal separator suitable for use with an acoustic signal processor employing three microphones.

Figure 5 is a schematic block diagram of an embodiment of a crosstalk remover suitable for use with an acoustic signal processor employing three microphones.

Figure 6 is an overview of the delays in the acoustic waves arriving at the direct-signal separator portion of the signal processor of Figure 1, showing the separation of the signals.

Figure 7 is an overview of a portion of the crosstalk remover of the signal processor of Figure 1 showing the removal of the crosstalk from one of the signal channels.

Figure 8 is a schematic block level diagram of an embodiment of the DOA estimator of the present invention for use in the acoustic signal processor of Figure 1.

Detailed Description of The Invention

The present invention is an embodiment of a direction of arrival estimator portion of an adaptive filter signal processor, and the adaptive filter signal processor, which may be used in a device that mimics the cocktail-party effect using a plurality of microphones with as many output audio channels, and a signal-processing module. When situated in a complicated acoustic environment that contains multiple audio sources with arbitrary spectral characteristics, the processor as a whole supplies output audio signals, each of which contains sound from at most one of the original sources. These separated audio signals can be used in a variety of applications, such as hearing aids or voice-activated devices.

Figure 1 is a schematic diagram of a signal separator processor of one embodiment of the present invention. As previously discussed, the signal separator processor of the present invention can be used with any number of microphones. In the embodiment shown in Figure 1, the signal separator processor receives signals from a first microphone 10 and a second microphone 12, spaced apart by about two centimeters. As used herein, the microphones 10 and 12 include transducers (not shown), their associated pre-amplifiers (not shown), and A/D converters 22 and 24 (shown in Figure 2).

The microphones 10 and 12 in the preferred embodiment are omnidirectional microphones, each of which is capable of receiving acoustic wave signals from the environment and for generating a first and a second acoustic electrical signal 14 and 16 respectively. The microphones 10 and 12 are either selected or calibrated to have matching sensitivity. The use of matched omnidirectional microphones 10 and 12, instead of directional or other microphones leads to simplicity in the direct-signal separator 30, described below. In the preferred embodiment, two Knowles EM-3046 omnidirectional microphones were used, with a separation of 2 centimeters. The pair was mounted at least 25 centimeters from any large surface in order to preserve the omnidirectional nature of the microphones. Matching was achieved by connecting the two microphone outputs to a stereo microphone preamplifier and adjusting the individual channel gains so that the preamplifier outputs were closely matched. The preamplifier outputs were each digitally sampled at 22,050 samples per second, simultaneously. These sampled electrical signals 14 and 16 are supplied to the direct signal separator 30 and to a Direction of Arrival (DOA) estimator 20.

The direct-signal separator 30 employs information from a DOA estimator 20, which derives its estimate from the microphone signals. In a different embodiment of the invention, DOA information could come from a source other than the microphone signals, such as direct input from a user via an input device. The direct signal separator 30 generates a plurality of output signals 40 and 42, as many output signals as there are microphones 10 and 12 supplying input signals 14 and 16 to it. Assuming that there are two sources, A and B, generating acoustic wave signals in the environment in which the signal processor 8 is located, then each of the microphones 10 and 12 would detect acoustic waves from both sources. Hence, each of the electrical signals 14 and 16, generated by the microphones 10 and 12, respectively, contains components of sound from sources A and B.

The direct-signal separator 30 processes the signals 14 and 16 to generate the signals 40 and 42 respectively, such that in anechoic conditions (i.e., the absence of echoes and reverberations), each of the signals 40 and 42 would be of an electrical signal representation of sound from only one source. In the absence of echoes and reverberations, the electrical signal 40 would be of sound only from source A, with electrical signal 42 being of sound only from source B, or vice versa. Thus, under anechoic conditions the direct-signal separator 30 can bring about full separation of the sounds represented in signals 14 and 16. However, when echoes and reverberation are present, the separation is only partial.

The output signals 40 and 42 of the direct signal separator 30 are supplied to the crosstalk remover 50. The crosstalk remover 50 removes the crosstalk between the signals 40 and 42 to bring about fully separated signals 60 and 62 respectively. Thus, the direct-signal separator 30 and the crosstalk remover 50 play complementary roles in the system 8. The direct-signal separator 30 is able to bring about full separation of signals mixed in the absence of echoes and reverberation, but produces only partial separation when echoes and reverberation are present. The crosstalk remover 50 when used alone is often able to bring about full separation of sources mixed in the presence of echoes and reverberation, but is most effective when given inputs 40 and 42 that are partially separated.

After some adaptation time, each output 60 and 62 of the crosstalk remover 50 contains the signal from only one sound source: A or B. Optionally, these outputs 60 and 62 can be connected individually to post filters 70 and 72, respectively, to remove known frequency coloration produced by the direct signal separator 30 or the crosstalk remover 50. Practitioners skilled in the art will recognize that there are many ways to remove this known frequency coloration; these vary in terms of their cost and effectiveness. An inexpensive post filtering method, for example, consists of reducing the treble and boosting the bass.

The filters 70 and 72 generate output signals 80 and 82, respectively, which can be used in a variety of applications. For example, they may be connected to a switch box and then to a hearing aid.

Referring to Figure 2 there is shown one embodiment of the direct signal separator 30 portion of the signal processor 8 of the present invention. The microphone transducers generate input signals 11 and 13, which are sampled and digitized, by clocked sample-and-hold circuits followed by analog-to-digital converters 22 and 24, respectively, to produce sampled digital signals 14 and 16 respectively. The digital signal 14 is supplied to a first delay line 32. In the preferred embodiment, the delay line 32 delays the digitally sampled signal 14 by a non-integral multiple of the sampling interval T, which was 45.35 microseconds given the sampling rate of 22,050 samples per second. The integral portion of the delay was implemented using a digital delay line, while the remaining subsample delay of less than one sample interval was implemented using a non-causal, truncated sinc filter with 41 coefficients. Specifically, to implement a subsample delay of t, given that t < T, the following filter is used:

y(n) = Σ_{k=-20}^{20} w(k) x(n-k)

where x(n) is the signal to be delayed, y(n) is the delayed signal, and w(k), k = -20, -19, ..., 19, 20, are the 41 filter coefficients. The filter coefficients are determined from the subsample delay t as follows:

w(k) = (1/S) sinc{π[(t/T) - k]}

where sinc(a) = sin(a)/a if a is not equal to 0, sinc(0) = 1, and S is a normalization factor given by

S = Σ_{k=-20}^{20} sinc{π[(t/T) - k]}.
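This filter translates directly into Python. The sketch below is illustrative; it uses NumPy's `np.sinc`, which computes sin(πx)/(πx) and therefore matches the sinc{π[(t/T) - k]} convention used in the text.

```python
import numpy as np

def subsample_delay_filter(t, T, half_width=20):
    """41-tap truncated-sinc filter implementing a delay of t seconds (t < T)."""
    k = np.arange(-half_width, half_width + 1)
    # np.sinc(x) = sin(pi*x)/(pi*x), so this is sinc{pi[(t/T) - k]}.
    w = np.sinc((t / T) - k)
    return w / w.sum()                 # the factor 1/S normalizes to unit sum

def apply_subsample_delay(x, t, T):
    """Delay signal x by t seconds with the non-causal truncated-sinc filter."""
    w = subsample_delay_filter(t, T)
    # mode="same" centers the kernel, matching the k = -20..20 indexing.
    return np.convolve(x, w, mode="same")
```

For the 22,050 samples-per-second system of the preferred embodiment, T = 1/22050 ≈ 45.35 microseconds, as stated above.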

The output of the first delay line 32 is supplied to the negative input of a second combiner 38. The first digital signal 14 is also supplied to the positive input of a first combiner 36. Similarly, for the other channel, the second digital signal 16 is supplied to a second delay line 34, which generates a signal which is supplied to the negative input of the first combiner 36.

In the preferred embodiment, the sample-and-hold and A/D operations were implemented by the audio input circuits of a Silicon Graphics Indy workstation, and the delay lines and combiners were implemented in software running on the same machine.

However, other delay lines, such as analog delay lines, surface acoustic wave delays, digital low-pass filters, or digital delay lines with higher sampling rates, may be used in place of the digital delay lines 32 and 34. Similarly, other combiners, such as analog voltage subtractors using operational amplifiers, or special purpose digital hardware, may be used in place of the combiners 36 and 38. Schematically, the function of the direct signal separator 30 may be seen by referring to Figure 6.

Assuming that there are no echoes or reverberations, the acoustic wave signal received by the microphone 12 is the sum of source B and a delayed copy of source A. (For clarity in presentation here and in the forthcoming theory section, the time relationships between the sources A and B and the microphones 10 and 12 are described as if the electrical signal 14 generated by the microphone 10 were simultaneous with source A and the electrical signal 16 generated by the microphone 12 were simultaneous with source B. This determines the two arbitrary additive time constants that one is free to choose in each channel.) Thus, the electrical signal 16 generated by the microphone 12 is an electrical representation of the sound source B plus a delayed copy of source A. Similarly, the electrical signal 14 generated by the microphone 10 is an electrical representation of the sound source A and a delayed copy of sound source B. By delaying the electrical signal 14 an appropriate amount, the electrical signal supplied to the negative input of the combiner 38 would represent a delayed copy of source A plus a further delayed copy of source B. Subtracting the output of the delay line 32 from the digital signal 16 removes the signal component representing the delayed copy of sound source A, leaving only the pure sound B (along with the further delayed copy of B). The amount of delay to be set for each of the digital delay lines 32 and 34 can be supplied from the DOA estimator 20. Numerous methods for estimating the relative time delays have been described in the prior art (for example, Schmidt, 1981; Roy et al., 1988; Morita, 1991; Allen, 1991). Thus, the DOA estimator 20 is well known in the art.

In a different embodiment, omni-directional microphones 10 and 12 could be replaced by directional microphones placed very close together. Then all delays would be replaced by multipliers; in particular, digital delay lines 32 and 34 would be replaced by multipliers. Each multiplier would receive the signal from its respective A/D converter and generate a scaled signal, which can be either positive or negative, in response.

A preferred embodiment of the crosstalk remover 50 is shown in greater detail in Figure 3. The crosstalk remover 50 comprises a third combiner 56 for receiving the first output signal 40 from the direct signal separator 30. The third combiner 56 also receives, at its negative input, the output of a second adaptive filter 54. The output of the third combiner 56 is supplied to a first adaptive filter 52. The output of the first adaptive filter 52 is supplied to the negative input of the fourth combiner 58, to which the second output signal 42 from the direct signal separator 30 is also supplied. The outputs of the third and fourth combiners 56 and 58, respectively, are the output signals 60 and 62, respectively, of the crosstalk remover 50.

Schematically, the function of the crosstalk remover 50 may be seen by referring to Figure 7. The inputs 40 and 42 to the crosstalk remover 50 are the outputs of the direct-signal separator 30. Let us assume that the direct-signal separator 30 has become fully adapted, i.e., (a) that the electrical signal 40 represents the acoustic wave signals of source B and its echoes and reverberation, plus echoes and reverberation of source A, and similarly (b) that the electrical signal 42 represents the acoustic wave signals of source A and its echoes and reverberation, plus echoes and reverberation of source B. Because the crosstalk remover 50 is a feedback network, it is easiest to analyze subject to the assumption that adaptive filters 52 and 54 are fully adapted, so that the electrical signals 60 and 62 already correspond to colored versions of B and A, respectively. The processing of the electrical signal 60 by the adaptive filter 52 will generate an electrical signal equal to the echoes and reverberation of source B present in the electrical signal 42; hence subtraction of the output of adaptive filter 52 from the electrical signal 42 leaves output signal 62 with signal components only from source A. Similarly, the processing of the electrical signal 62 by the adaptive filter 54 will generate an electrical signal equal to the echoes and reverberation of source A present in the electrical signal 40; hence subtraction of the output of adaptive filter 54 from the electrical signal 40 leaves output signal 60 with signal components only from source B.

Theory

It is assumed, solely for the purpose of designing the direct-signal separator 30, that the microphones 10 and 12 are omnidirectional and matched in sensitivity.

Under anechoic conditions, the signals x1(t) and x2(t), which correspond to the input signals received by microphones 10 and 12, respectively, may be modeled as

x1(t) = w1(t) + w2(t - τ2)
x2(t) = w2(t) + w1(t - τ1),

where w1(t) and w2(t) are the original source signals, as they reach microphones 10 and 12, respectively, and τ1 and τ2 are unknown relative time delays, each of which may be positive or negative.

Practitioners experienced in the art will recognize that bounded "negative" time delays can be achieved by adding a net time delay to the entire system. The relative time delays τ1 and τ2 are used to form outputs y1(t) and y2(t), which correspond to signals 40 and 42:

y1(t) = x1(t) - x2(t - τ2) = w1(t) - w1(t - (τ1 + τ2))
y2(t) = x2(t) - x1(t - τ1) = w2(t) - w2(t - (τ1 + τ2))

As depicted in Figure 2, these operations are accomplished by time-delay units 32 and 34, and combiners 36 and 38.

Under anechoic conditions, these outputs 40 and 42 would be fully separated, i.e., each output 40 or 42 would contain contributions from one source alone. However, under echoic conditions, these outputs 40 and 42 are not fully separated.
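The anechoic separation equations above can be checked numerically. The Python sketch below is illustrative only, and restricts the relative delays to whole samples so that each delay is a simple shift; the delay lines described earlier also handle subsample delays.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two independent sources and integer relative delays (in samples).
N, tau1, tau2 = 2000, 3, 5
w1, w2 = rng.standard_normal(N), rng.standard_normal(N)

def delay(s, d):
    """Delay a signal by d whole samples (zero-padded at the start)."""
    return np.concatenate([np.zeros(d), s[:len(s) - d]])

# Anechoic mixing model: x1 = w1 + delayed w2, x2 = w2 + delayed w1.
x1 = w1 + delay(w2, tau2)
x2 = w2 + delay(w1, tau1)

# Direct-signal separation: each output keeps terms from one source only.
y1 = x1 - delay(x2, tau2)   # = w1(t) - w1(t - (tau1 + tau2))
y2 = x2 - delay(x1, tau1)   # = w2(t) - w2(t - (tau1 + tau2))
```

Each output is a comb-filtered ("colored") version of a single source, exactly as the equations predict.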

Under echoic and reverberant conditions, the microphone signals x1(t) and x2(t), which correspond to input signals received by the microphones 10 and 12, respectively, may be modeled as

x1(t) = w1(t) + w2(t - τ2) + k11'(t)*w1(t) + k12'(t)*w2(t)
x2(t) = w2(t) + w1(t - τ1) + k21'(t)*w1(t) + k22'(t)*w2(t),

where the symbol "*" denotes the operation of convolution, and the impulse responses k11'(t), k12'(t), k21'(t), and k22'(t) incorporate the effects of echoes and reverberation. Specifically, k11'(t)*w1(t) represents the echoes and reverberations of source 1 (w1(t)) as received at input 1 (microphone 10), k12'(t)*w2(t) represents the echoes and reverberations of source 2 (w2(t)) as received at input 1 (microphone 10), k21'(t)*w1(t) represents the echoes and reverberations of source 1 (w1(t)) as received at input 2 (microphone 12), and k22'(t)*w2(t) represents the echoes and reverberations of source 2 (w2(t)) as received at input 2 (microphone 12). In consequence of the presence of echoes and reverberation, the outputs 40 and 42 from the direct-signal separator 30 are not fully separated, but instead take the form

y1(t) = x1(t) - x2(t - τ2) = w1(t) - w1(t - (τ1 + τ2)) + k11(t)*w1(t) + k12(t)*w2(t)
y2(t) = x2(t) - x1(t - τ1) = w2(t) - w2(t - (τ1 + τ2)) + k21(t)*w1(t) + k22(t)*w2(t)

where the filters k11(t), k12(t), k21(t), and k22(t) are related to k11'(t), k12'(t), k21'(t), and k22'(t) by time shifts and linear combinations. Specifically, k11(t) = k11'(t) - k21'(t - τ2), k12(t) = k12'(t) - k22'(t - τ2), k21(t) = k21'(t) - k11'(t - τ1), and k22(t) = k22'(t) - k12'(t - τ1). Note that y1(t) is contaminated by the term k12(t)*w2(t), and that y2(t) is contaminated by the term k21(t)*w1(t).

Several possible forms of the crosstalk remover have been described as part of the background of this invention, under the heading of convolutive blind source separation. In the present embodiment, the crosstalk remover forms discrete time sampled outputs 60 and 62 thus:

z1(n) = y1(n) - Σ_{k=1}^{1000} h2(k) z2(n - k)

z2(n) = y2(n) - Σ_{k=1}^{1000} h1(k) z1(n - k)

where the discrete time filters h1 and h2 correspond to elements 52 and 54 in Figure 3 and are estimated adaptively. The filters h1 and h2 are strictly causal, i.e., they operate only on past samples of z1 and z2. This structure was described independently by Jutten et al. (1992) and by Platt and Faggin (1992).

The adaptation rule used for the filter coefficients in the preferred embodiment is a variant of the LMS rule ("Adaptive Signal Processing," Bernard Widrow and Samuel D. Stearns, Prentice-Hall, Englewood Cliffs, N.J., 1985, p. 99). The filter coefficients are updated at every time-step n, after the new values of the outputs z1(n) and z2(n) have been calculated. Specifically, using these new values of the outputs, the filter coefficients are updated as follows:

h1(k)[new] = h1(k)[old] + m z2(n) z1(n - k),   k = 1, 2, ..., 1000
h2(k)[new] = h2(k)[old] + m z1(n) z2(n - k),   k = 1, 2, ..., 1000

where m is a constant that determines the rate of adaptation of the filter coefficients, e.g., 0.15 if the input signals 14 and 16 were normalized to lie in the range -1 < x(t) < +1. One skilled in the art will recognize that the filters h1 and h2 can be implemented in a variety of ways, including FIRs and lattice IIRs.
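The output equations and LMS-style update above translate directly into code. The following Python sketch is illustrative (the function name and loop structure are choices made here, not taken from the text); it keeps only the past output samples needed by the strictly causal filters.

```python
import numpy as np

def crosstalk_remover(y1, y2, num_taps=1000, mu=0.15):
    """Feedback crosstalk remover with strictly causal adaptive filters.

    Per time step n:
      z1(n) = y1(n) - sum_k h2(k) z2(n-k)
      z2(n) = y2(n) - sum_k h1(k) z1(n-k)
    followed by h1(k) += mu*z2(n)*z1(n-k) and h2(k) += mu*z1(n)*z2(n-k).
    """
    N = len(y1)
    h1 = np.zeros(num_taps)       # corresponds to element 52
    h2 = np.zeros(num_taps)       # corresponds to element 54
    z1, z2 = np.zeros(N), np.zeros(N)
    for n in range(N):
        k = min(n, num_taps)              # number of past samples available
        past1 = z1[n - k:n][::-1]         # z1(n-1), z1(n-2), ..., z1(n-k)
        past2 = z2[n - k:n][::-1]         # z2(n-1), z2(n-2), ..., z2(n-k)
        z1[n] = y1[n] - np.dot(h2[:k], past2)
        z2[n] = y2[n] - np.dot(h1[:k], past1)
        h1[:k] += mu * z2[n] * past1      # decorrelating coefficient update
        h2[:k] += mu * z1[n] * past2
    return z1, z2, h1, h2
```

With unit-variance inputs a smaller step size than 0.15 keeps the adaptation smooth; the filters settle where the two outputs are mutually decorrelated.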

As described, the direct-signal separator 30 and crosstalk remover 50 adaptively bring about full separation of two sound sources mixed in an echoic, reverberant acoustic environment. However, the output signals z1(t) and z2(t) may be unsatisfactory in practical applications because they are colored versions of the original sources w1(t) and w2(t), i.e.,

z1(t) = ζ1(t)*w1(t)
z2(t) = ζ2(t)*w2(t)

where ζ1(t) and ζ2(t) represent the combined effects of the echoes and reverberations and of the various known signal transformations performed by the direct-signal separator 30 and crosstalk remover 50.

As an optional cosmetic improvement for certain commercial applications, it may be desirable to append filters 70 and 72 to the network. The purpose of these filters is to undo the effects of filters ζ1(t) and ζ2(t). As those familiar with the art will realize, a large body of techniques for performing this inversion to varying and predictable degrees of accuracy currently exists.

The embodiment of the signal processor 8 has been described in Figures 1-3 and 6-7 as being useful with two microphones 10 and 12 for separating two sound sources, A and B. Clearly, the invention is not so limited. The forthcoming section describes how more than two microphones and sound sources can be accommodated.

General case with M microphones and M sources

The invention is able to separate an arbitrary number M of simultaneous sources, as long as they are statistically independent, if there are at least M microphones. Let wj(t) be the j'th source signal and xi(t) be the i'th microphone (mic) signal. Let tij be the time required for sound to propagate from source j to mic i, and let d(tij) be the impulse response of a filter that delays a signal by tij. Mathematically, d(tij) is the unit impulse delayed by tij, that is

d(tij) = δ(t - tij),

where δ(t) is the unit impulse function ("Circuits, Signals and Systems", by William McC. Siebert, The MIT Press, McGraw Hill Book Company, 1986, p. 319).

In the absence of echoes and reverberation, the i'th mic signal xi(t) can be expressed as a sum of the appropriately delayed source signals:

xi(t) = Σ_{j=1}^{M} d(tij) * wj(t),   i = 1, 2, ..., M

Matrix notation allows a compact representation of this equation for all M mic signals:

X(t) = D(t) * W(t)

where X(t) is an M-element column vector whose i'th element is the i'th mic signal xi(t), D(t) is an MxM element square matrix whose ij'th element (i.e., the element in the i'th row and j'th column) is d(tij), and W(t) is an M-element column vector whose j'th element is the j'th source signal wj(t). Specifically,

[x1(t)]   [d(t11) . . . d(t1M)]   [w1(t)]
[  .  ]   [  .            .  ]   [  .  ]
[  .  ] = [  .            .  ] * [  .  ]
[  .  ]   [  .            .  ]   [  .  ]
[xM(t)]   [d(tM1) . . . d(tMM)]   [wM(t)]

For each source wj(t), if the delays tij for i = 1, 2, ..., M to the M mics are known (up to an arbitrary constant additive factor that can be different for each source), then M signals yj(t), j = 1, 2, ..., M, that each contain energy from a single but different source wj(t), can be constructed from the mic signals xi(t) as follows:

Y(t) = adjD(t) * X(t)

where Y(t) is the M-element column vector whose j'th element is the separated signal yj(t), and adjD(t) is the adjugate matrix of the matrix D(t). The adjugate matrix of a square matrix is the matrix obtained by replacing each element of the original matrix by its cofactor, and then transposing the result ("Linear Systems", by Thomas Kailath, Prentice Hall, Inc., 1980, p. 649). The product of the adjugate matrix and the original matrix is a diagonal matrix, with each element along the diagonal being equal to the determinant of the original matrix. Thus,

Y(t) = adjD(t) * X(t)
     = adjD(t) * D(t) * W(t)
     = |D(t)| * W(t)

where |D(t)| is the determinant of D(t). Thus, yj(t) = |D(t)| * wj(t) for j = 1, 2, ..., M.

yj(t) is a "colored" or filtered version of wj(t) because of the convolution by the filter impulse response |D(t)|. If desired, this coloration can be undone by post filtering the outputs with a filter that is the inverse of |D(t)|. Under certain circumstances, determined by the highest frequency of interest in the source signals and the separation between the mics, the filter |D(t)| may have zeroes at certain frequencies; these make it impossible to exactly realize the inverse of the filter |D(t)|. Under these circumstances, any one of the numerous techniques available for approximating filter inverses (see, for example, "Digital Filters and Signal Processing", by Leland B. Jackson, Kluwer Academic Publishers, 1986, p. 146) may be used to derive an approximate filter with which to do the post filtering.

The delays tij can be estimated from the statistical properties of the mic signals, up to a constant additive factor that can be different for each source. Alternatively, if the position of each microphone and each source is known, then the delays tij can be calculated exactly. For any source that is distant, i.e., many times farther than the greatest separation between the mics, only the direction of the source is needed to calculate its delays to each mic, up to an arbitrary additive constant.

The first stage of the processor 8, namely the direct signal separator 30, uses the estimated delays to construct the adjugate matrix adjD(t), which it applies to the microphone signals X(t) to obtain the outputs Y(t) of the first stage, given by: Y(t) = adjD(t) * X(t).

In the absence of echoes and reverberations, each output yj(t) contains energy from a single but different source wj(t).

When echoes and reverberation are present, each mic receives the direct signals from the sources as well as echoes and reverberations from each source. Thus

xi(t) = Σ_{j=1}^{M} d(tij) * wj(t) + Σ_{j=1}^{M} eij(t) * wj(t),   for i = 1, 2, ..., M

where eij(t) is the impulse response of the echo and reverberation path from the j'th source to the i'th mic. All M of these equations can be represented in compact matrix notation by

X(t) = D(t) * W(t) + E(t) * W(t)

where E(t) is the MxM matrix whose ij'th element is the filter eij(t).

If the mic signals are now convolved with the adjugate matrix of D(t), instead of obtaining separated signals we obtain partially separated signals:

Y(t) = adjD(t) * X(t)
     = |D(t)| * W(t) + adjD(t) * E(t) * W(t)

Notice that each yj(t) contains a colored direct signal from a single source, as in the case with no echoes, and differently colored components from the echoes and reverberations of every source, including the one that supplies its direct signal.

The echoes and reverberations of the other sources are removed by the second stage of the network, namely the crosstalk remover 50, which generates each output as follows:

zj(t) = yj(t) - Σ_{k=1}^{M} hjk(t) * zk(t),   for j = 1, 2, ..., M

where the entities hjk(t) are causal adaptive filters. (The term "causal" means that hjk(t) = 0 for t < 0.) In matrix form these equations are written as

Z(t) = Y(t) - H(t) * Z(t)

where Z(t) is the M-element column vector whose j'th element is zj(t), and H(t) is an MxM element matrix whose diagonal elements are zero and whose off diagonal elements are the causal, adaptive filters hjk(t).

These filters are adapted according to a rule that is similar to the Least Mean Square update rule of adaptive filter theory ("Adaptive Signal Processing," Bernard Widrow and Samuel D. Stearns, Prentice-Hall, Englewood Cliffs, N.J., 1985, p. 99).

This is most easily illustrated in the case of a discrete time system.

Illustrative weight update methodology for use with a discrete time representation

First, we replace the time parameter t by a discrete time index n. Second, we use the notation H(n)[new] to indicate the value of H(n) in effect just before computing new outputs at time n. At each time step n, the outputs Z(n) are computed according to

Z(n) = Y(n) - H(n)[new] * Z(n)

Note that the convolution on the right hand side involves only past values of Z, i.e., Z(n-1), Z(n-2), ..., Z(n-N), because the filters that are the elements of H are causal. (N is defined to be the order of the filters in H.)

Now new values are computed for the coefficients of the filters that are the elements of H. These will be used at the next time step. Specifically, for each j and each k, with j ≠ k, perform the following:

hjk(u)[old] = hjk(u)[new],   u = 1, 2, ..., N
hjk(u)[new] = hjk(u)[old] + μjk zj(n) zk(n - u),   u = 1, 2, ..., N
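The matrix-form computation and update can be sketched as a single per-time-step function. This Python sketch is illustrative; the array layout (row u of `z_hist` holding the past output vector Z(n-u-1), and `H[j, k, u]` holding hjk(u+1)) and the function name are choices made here, not taken from the patent.

```python
import numpy as np

def multichannel_step(y_n, z_hist, H, mu):
    """One time step of the M-channel crosstalk remover (illustrative).

    z_hist[u] holds the past output vector Z(n-u-1), u = 0..N-1.
    H[j, k, u] holds the causal filter coefficient h_jk(u+1); the
    diagonal filters h_jj are kept identically zero.
    """
    M = H.shape[0]
    # Z(n) = Y(n) - H(n)[new] * Z(n): only past outputs enter the convolution.
    feedback = np.einsum('jku,uk->j', H, z_hist)
    z_n = y_n - feedback
    # h_jk(u)[new] = h_jk(u)[old] + mu * z_j(n) * z_k(n-u), for j != k.
    upd = mu * np.einsum('j,uk->jku', z_n, z_hist)
    upd[np.arange(M), np.arange(M), :] = 0.0   # diagonal stays zero
    H += upd
    # Shift the history so the newest output becomes Z(n-1).
    z_hist[1:] = z_hist[:-1].copy()
    z_hist[0] = z_n
    return z_n
```

Here a single scalar step size `mu` stands in for the per-filter constants μjk.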

The easiest way to understand the operation of the second stage is to observe that the off-diagonal elements of H(t) have zero net change per unit time when products like zj(t) zk(t - u) are zero on average. Because the sources in W are taken to be statistically independent of each other, those products are zero on average when each output zj(t) has become a colored version of a different source, say wj(t). (The correspondence between sources and outputs might be permuted so that the numbering of the sources does not match the numbering of the outputs.)

More specifically, let Z(t) = Ψ(t) * W(t). From the preceding paragraph, equilibrium is achieved when Ψ(t) is diagonal. In addition, it is required that

Z(t) = Y(t) - H(t) * Ψ(t) * W(t)
     = |D(t)| * W(t) + adjD(t) * E(t) * W(t) - H(t) * Ψ(t) * W(t)
     = ( |D(t)| I + adjD(t) * E(t) - H(t) * Ψ(t) ) * W(t)

so that

Ψ(t) = |D(t)| I + adjD(t) * E(t) - H(t) * Ψ(t).

This relation determines the coloration produced by the two stages of the system, taken together.

An optional third stage can use any one of numerous techniques available to reduce the amount of coloration on any individual output.

Example of general case with M = 3, i.e. with 3 mics and 3 sources

In the case where there are 3 mics and 3 sources, the general matrix equation

X = D*W

becomes

[ x1(t) ]   [ d(t11) d(t12) d(t13) ]   [ w1(t) ]
[ x2(t) ] = [ d(t21) d(t22) d(t23) ] * [ w2(t) ]
[ x3(t) ]   [ d(t31) d(t32) d(t33) ]   [ w3(t) ]

If the delays tij are known, then the adjugate matrix of D(t) is given by

        [ d(t22+t33) - d(t23+t32)   d(t13+t32) - d(t12+t33)   d(t12+t23) - d(t13+t22) ]
adjD =  [ d(t23+t31) - d(t21+t33)   d(t11+t33) - d(t13+t31)   d(t13+t21) - d(t11+t23) ]
        [ d(t21+t32) - d(t22+t31)   d(t12+t31) - d(t11+t32)   d(t11+t22) - d(t12+t21) ]

Note that adding a constant delay to the delays associated with any column of D(t) leaves the adjugate matrix unchanged. This is why the delays from a source to the three mics need only be estimated up to an arbitrary additive constant.

The output of the first stage, namely the direct signal separator 30, is formed by convolving the mic signals with the adjugate matrix.

The network that accomplishes this is shown in Figure 4.

In the absence of echoes and reverberations, the outputs of the first stage are the individual sources, each colored by the determinant of the delay matrix.

In the general case when echoes and reverberation are present, each output of the first stage also contains echoes and reverberations from each source. The second stage, namely the cross talk remover 50, consisting of a feedback network of adaptive filters, removes the effects of these unwanted echoes and reverberations to produce outputs that each contain energy from only one source, respectively. The matrix equation of the second stage

Z = Y - H*Z

becomes

z1 = y1 - h12*z2 - h13*z3
z2 = y2 - h21*z1 - h23*z3
z3 = y3 - h31*z1 - h32*z2,

where each hjk is a causal adaptive filter. The network that accomplishes this is shown in Figure 5.

It should be noted that the number of microphones and associated channels of signal processing need not be as large as the total number of sources present, as long as the number of sources emitting a significant amount of sound at any given instant in time does not exceed the number of microphones. For example, if during one interval of time only sources A and B emit sound, and in another interval of time only sources B and C emit sound, then during the first interval the output channels will correspond to A and B respectively, and during the second interval the output channels will correspond to B and C respectively.

Referring to Figure 8, there is shown a block-level schematic diagram of a direction-of-arrival estimator 200 of the present invention. The direction-of-arrival estimator 200 may be used with the direct signal separator 30 shown in Figure 1 to form an adaptive filter signal processor.

As previously described, the function of the direct signal separator 30 is to receive the digital acoustic signals 26 and 28 (for the two-microphone embodiment shown in Figure 1), represented as x1(t) and x2(t), respectively, where:

x1(t) = w1(t) + w2(t-τ2)
x2(t) = w2(t) + w1(t-τ1),

and to process them to obtain the processed signals y1(t) and y2(t), which correspond to signals 40 and 42:

y1(t) = x1(t) - x2(t-τ2) = w1(t) - w1(t-(τ1+τ2))
y2(t) = x2(t) - x1(t-τ1) = w2(t) - w2(t-(τ1+τ2))

Under anechoic conditions, i.e., in the absence of echoes and reverberations of said acoustic waves from said plurality of sources, each of said first processed acoustic signals, y1(t) and y2(t), represents acoustic waves from only one source. To achieve such a result, the variables τ1 and τ2 must match the actual delays of the acoustic waves from the sources reaching the microphones 10 and 12, as shown graphically in Figure 6. In the embodiment described heretofore, the variables τ1 and τ2 are supplied from the DOA estimator 20. At best, however, this is only an estimation. The aim of the present invention is to perform this estimation accurately and rapidly.
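The anechoic cancellation described above can be checked numerically. The sketch below assumes hypothetical integer sample delays (τ1 = 3, τ2 = 5 samples) for brevity, whereas the invention permits fractional delays via interpolation:

```cpp
#include <cmath>
#include <cstdlib>
#include <vector>

// Verify that y1(t) = x1(t) - x2(t-tau2) reduces to w1(t) - w1(t-(tau1+tau2)),
// i.e. that source w2 cancels exactly under anechoic mixing with matched delays.
bool check_separation() {
    const int T = 200, tau1 = 3, tau2 = 5;   // illustrative sample delays
    std::vector<double> w1(T), w2(T), x1(T), x2(T);
    std::srand(42);
    for (int t = 0; t < T; ++t) {
        w1[t] = std::rand() / (double)RAND_MAX - 0.5;
        w2[t] = std::rand() / (double)RAND_MAX - 0.5;
    }
    // Anechoic mixing: each mic hears one source directly, the other delayed.
    for (int t = 0; t < T; ++t) {
        x1[t] = w1[t] + (t >= tau2 ? w2[t - tau2] : 0.0);
        x2[t] = w2[t] + (t >= tau1 ? w1[t - tau1] : 0.0);
    }
    for (int t = tau1 + tau2; t < T; ++t) {
        double y1 = x1[t] - x2[t - tau2];               // first-stage output
        double expect = w1[t] - w1[t - (tau1 + tau2)];  // colored w1 only
        if (std::fabs(y1 - expect) > 1e-12) return false;
    }
    return true;
}
```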

In Figure 8, the preferred embodiment of the Direction of Arrival estimator 200 of the present invention is shown as receiving two input sample signals 26 and 28. However, as is clearly described heretofore, the present invention is not so limited, and those skilled in the art will recognize that the present invention may be used to process any number of input signals. As the number of input signals rises, the task of estimating directions of arrival becomes more computationally intensive. Nonetheless, the number of input signals that the present invention can process is limited only by the available data and computational resources, not by a fundamental principle of operation.

The purpose of the direction-of-arrival estimator 200 is to estimate a pair of delay values, namely τ1 and τ2, that are consistent with the input signals x1(t) and x2(t) in the manner described above. For purposes of discussion, a pair of delay values is termed a set of parameter values.

A clock 210 is present in the preferred embodiment and runs at a "fast" rate of, e.g., a multiple of 100 cycles per second. The fast clock rate is supplied to the block 202, which comprises the parameter generator 220, the parameter latch 230, the instantaneous-performance calculator 240, the instantaneous-performance latch 250, the instantaneous-performance comparator 260, and the gate 270.

The "fast" clock rate of a multiple of 100 cycles per second is then divided by a divider 204, which yields a "slow" clock rate of, e.g., 100 cycles per second. The "slow" clock rate is supplied to the components in block 206, each of which executes once per "slow" clock cycle. The order of their execution is apparent from Figure 8. Although the "fast" and "slow" clock rates are described here as running synchronously, one skilled in the art will recognize that their operation could be made asynchronous without departing from the spirit and scope of the invention.

Parameter sets originate at the parameter generator 220. Each time a parameter set is created, it is supplied to the parameter latch 230 for temporary storage. The first N parameter sets that arrive at the parameter latch 230 are stored in the parameter population storage 230(1-N), which holds N parameter sets, where N is 80 to 100 in the preferred embodiment. Thereafter, a parameter set is transferred from the parameter latch 230 to the parameter population storage 230(1-N) (and a parameter set previously in the parameter population storage 230(1-N) is discarded) only at times determined by the state of the gate 270, which in turn is determined by conditions described below.

Once per "slow" clock cycle, one parameter set is copied from the parameter population storage 230(1-N) to the best-parameter latch 300. Which parameter set is copied is determined by the selector 290 using information from the cumulative-performance comparator 310, whose operation is described below. The parameter set in the smoothed-parameter latch 320 is updated via a leaky integrator 330 from the value in the best-parameter latch 300. In the preferred embodiment, the leaky integrator 330 performs the operation:

τi(new) = (1-ε)·τi(old) + ε·τi(best-parameter latch),

where the constant ε is set as follows:

ε = 1 - (1/2)^(1/k),

where k is about 10, so that the leaky integrator has a half-life of approximately 10 clock cycles. At each "slow" clock cycle, the parameter set in the smoothed-parameter latch 320 is supplied to the direct-signal separator 30.

It remains to describe (a) the conditions under which the gate 270 opens, (b) how a parameter set in the parameter population storage 230(1-N) is chosen to be discarded each time the gate 270 opens, and (c) how the selector 290 chooses a parameter set to copy from the parameter population storage 230(1-N), based on information from the cumulative-performance comparator 310.

Item (a). Each time a parameter set, generated by the parameter generator 220, arrives at the parameter latch 230, it is supplied to the instantaneous-performance calculator 240 for operation thereon. Also, at each "slow" clock cycle, the N parameter sets in the parameter population storage 230(1-N) are supplied to the bank of N instantaneous-performance calculators 240(1-N) for operation thereon. The performance value computed by the instantaneous-performance calculator 240 is supplied to the instantaneous-performance latch 250. At the same time, the plurality of performance values computed by the plurality of instantaneous-performance calculators 240(1-N) are supplied to the plurality of instantaneous-performance population storage cells 340(1-N), respectively.

The values in the instantaneous-performance population storage 340(1-N) may change even during a "slow" clock cycle in which the contents of the parameter population storage 230(1-N) do not change, since they depend on the input signals 26 and 28, x1(t) and x2(t), which do change.

Each of the instantaneous-performance calculators 240(1-N) and 240 calculates an instantaneous-performance value for the given parameter set in accordance with:

instantaneous performance value = -protlog[ a0, E{ y1(t) · y2(t) }² ].

The notation E{·} denotes a time-windowed estimate of the expected value of the enclosed expression, as computed by a time average. In the preferred embodiment, this time average is computed over 50 milliseconds. The notation protlog[a0, a] denotes a protected logarithm:

protlog[ a0, a ] = log[a] if a > a0
                 = log[a0] otherwise.
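A minimal implementation of this protected logarithm might read:

```cpp
#include <cmath>

// Protected logarithm as defined above: the argument is floored at a0 so
// that near-zero error values cannot produce an unbounded result.
double protlog(double a0, double a) {
    return std::log(a > a0 ? a : a0);
}
```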

As will be seen, the value of y1(t) · y2(t) is related to the input signals 26 and 28, and also depends on the values of τ1 and τ2 in a parameter set. Parameter sets with higher instantaneous-performance values are deemed more consistent with recent input data x1(t) and x2(t) than are parameter sets with lower instantaneous-performance values. The strategy of the remainder of the invention is to identify those parameter sets in the parameter population storage 230(1-N) that exhibit the highest instantaneous-performance values consistently over time.

The instantaneous-performance calculator 240 and the N instantaneous-performance calculators 240(1-N) employ values supplied by the instantaneous-performance precalculator 350, which performs operations on the input data 26 and 28 that would otherwise constitute effort duplicated across the N+1 instantaneous-performance calculators 240 and 240(1-N). In particular, the quantity E{y1(t)·y2(t)} can be computed without explicitly computing y1(t) and y2(t) (see the theory section, below). Each time a value arrives at the instantaneous-performance latch 250, it is compared by the instantaneous-performance comparator 260 with the least value in the instantaneous-performance population storage 340(1-N). If it is less, the gate 270 remains closed. If it is greater, the gate 270 opens.

Item (b). At each "slow" clock cycle, each value in the instantaneous-performance population storage 340(1-N) is added to the corresponding value in the cumulative-performance population storage 360(1-N) by one of the adders in the bank of adders 370(1-N). Thus, each cell in the cumulative-performance population storage 360(1-N) maintains a cumulative sum of instantaneous-performance values. Other embodiments could employ a weighted sum or other arithmetic operation for incorporating new instantaneous-performance values into the cumulative performance.

Each time the gate 270 opens, the parameter set currently in the parameter latch 230 is copied to the parameter population storage 230(1-N), replacing the parameter set with the least cumulative performance. Its cumulative performance is initialized to a value that falls short of the greatest value in the cumulative-performance population storage by a fixed amount determined in advance. This fixed value is chosen so that the cumulative performance of the parameter set newly introduced into the parameter population storage 230(1-N) cannot exceed the currently greatest cumulative performance in the cumulative-performance population storage 360(1-N) until sufficient time has elapsed for the statistics to be reasonably reliable. In the preferred embodiment, sufficient time is deemed to be 100 milliseconds.

Item (c). At each "slow" clock cycle, the cumulative-performance comparator 310 identifies the greatest value in the cumulative-performance population storage 360(1-N) and supplies the index (1 through N) of that value to the selector 290. The selector 290 copies the corresponding parameter set from the parameter population storage 230(1-N) to the best-parameter latch 300 for further processing as described above.
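Items (a) through (c) can be summarized in a simplified software sketch of one "slow" clock cycle. The scalar parameter, the struct and field names, and the synthetic performance function used below are illustrative assumptions for this sketch, not the patent's hardware:

```cpp
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Sketch of the population mechanism of Figure 8, reduced to one scalar
// parameter per set and an abstract instantaneous-performance function.
struct Population {
    std::vector<double> params;      // parameter population storage
    std::vector<double> cumulative;  // cumulative-performance storage
    double handicap;                 // fixed deficit for newly admitted sets

    Population(int n, double h) : params(n), cumulative(n, 0.0), handicap(h) {}

    // One "slow" clock cycle; inst(p) plays the role of the bank of
    // instantaneous-performance calculators.
    template <class F>
    void cycle(F inst, double candidate) {
        double min_inst = 1e300;
        for (std::size_t i = 0; i < params.size(); ++i) {
            double v = inst(params[i]);
            cumulative[i] += v;                  // bank of adders
            if (v < min_inst) min_inst = v;
        }
        // Gate: admit the candidate only if it beats the worst instantaneous
        // performer; it replaces the worst *cumulative* performer and starts
        // a fixed amount behind the current leader.
        if (inst(candidate) > min_inst) {
            std::size_t worst = 0, best = 0;
            for (std::size_t i = 1; i < params.size(); ++i) {
                if (cumulative[i] < cumulative[worst]) worst = i;
                if (cumulative[i] > cumulative[best])  best = i;
            }
            params[worst] = candidate;
            cumulative[worst] = cumulative[best] - handicap;
        }
    }

    // Selector/comparator: the set with the greatest cumulative performance.
    double best_param() const {
        std::size_t b = 0;
        for (std::size_t i = 1; i < cumulative.size(); ++i)
            if (cumulative[i] > cumulative[b]) b = i;
        return params[b];
    }
};
```

Run with a synthetic performance function peaked at one parameter value and random candidates, the population concentrates on the consistently best-performing set, which is the behavior ascribed to the comparator and gate above.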

The generator 220 can generate τ1 and τ2 in a number of different ways. First, the generator 220 can draw the delay values randomly from a predetermined distribution of τ1 and τ2 values. Second, the values of τ1 and τ2 can be estimated from the incoming data streams using crude methods that exist in the prior art. Alternatively, they can be variants of existing values of τ1 and τ2. The values of τ1 and τ2 can also be based upon prior models of how the sources of the acoustic waves are expected to move in physical space. Further, the values of τ1 and τ2 can be determined by another adaptive filtering method, including well-known conventional methods such as a stochastic-gradient method. Finally, if the number and required resolution of the filter coefficients are low enough that the initial sets of delay values contain all the possible delay values, then clearly the generator 220 can even be omitted.

Theory

Problems in adaptive filtering can generally be described as follows. One or more time-varying input signals xi(t) are fed into a filter structure H with adjustable coefficients hk, i.e., H = H(h1,h2,...). Output signals yi(t) emerge from the filter structure. For example, an adaptive filter with one input x(t) and one output y(t) in which H is a linear, causal FIR filter might take the form:

y(t) = Σk hk x(t - kΔt).
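As a concrete sketch, the FIR form above might be implemented as follows; the helper name and the vector representation of the signal history are illustrative:

```cpp
#include <cstddef>
#include <vector>

// y(t) = sum_k h_k * x(t - k*dt), with the signal held as a sample vector
// and t an integer sample index; taps beyond the start of the signal are
// treated as zero (causal filter, zero initial conditions).
double fir_output(const std::vector<double>& h,
                  const std::vector<double>& x, std::size_t t) {
    double y = 0.0;
    for (std::size_t k = 0; k < h.size() && k <= t; ++k)
        y += h[k] * x[t - k];
    return y;
}
```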

In general, H might be a much more complicated filter structure with feedback, cascades, parallel elements, nonlinear operations, and so forth.

The object is to adjust the unknowns hk such that a performance function C(t) is maximized consistently over time. The performance function is usually a statistic computed over recent values of the outputs yi(t); but in general, it may also depend on the input signals, the filter coefficients, or other quantities such as parameter settings supplied exogenously by a human user.

For example, consider the 2x2 case of the preferred embodiment of the direct-signal separator 30. Each of the input signals, x1(t) or x2(t), comes from one omnidirectional microphone. Each omnidirectional microphone receives energy from two source signals, s1(t) and s2(t). The structure H is a feed-forward network, with time delays, that implements the following relations:

y1(t) = x1(t) - x2(t - τ2), and
y2(t) = x2(t) - x1(t - τ1).

The adjustable coefficients are τ1 and τ2; i.e., H = H(τ1,τ2). Note that even when the signals are sampled, τ1 and τ2 need not be integral multiples of Δt. The goal is to recover versions of s1(t) and s2(t), possibly after some frequency coloration, from x1(t) and x2(t), i.e., to make y1(t) be a filtered version of s1(t) only, and y2(t) a filtered version of s2(t) only, or vice versa.

The pivotal assumption is that s1(t) and s2(t) are statistically independent. For mathematical convenience, it is also assumed that both have (possibly different) probability distributions symmetric about zero. (For example, speech signals have probability distributions that are sufficiently symmetric about zero, though not perfectly so.) These two assumptions imply that

E[f(s1(t))·g(s2(t))] = 0

holds for every possible choice of the two odd functions f and g. (The notation E[·] denotes a time-windowed estimate of the expected value of the enclosed expression, as computed by a time average. In the preferred embodiment, this time average is computed over 50 milliseconds. Typical weighting functions are rectangular or exponential, but many others are also used.)
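This relation can be illustrated by a small Monte Carlo sketch, here with the hypothetical choice f(s) = s³ and g(s) = s and uniformly distributed (hence zero-symmetric) independent sources:

```cpp
#include <random>

// Time average of f(s1)*g(s2) with f(s) = s^3, g(s) = s, for independent
// sources drawn uniformly from [-1, 1]; by the independence relation above
// this average should be near zero.
double avg_fg(int n) {
    std::mt19937 rng(7);
    std::uniform_real_distribution<double> U(-1.0, 1.0);
    double cross = 0.0;
    for (int i = 0; i < n; ++i) {
        double s1 = U(rng), s2 = U(rng);
        cross += s1 * s1 * s1 * s2;
    }
    return cross / n;
}
```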

In the absence of noise, echoes, reverberation, and certain mathematical degeneracies, the desired goal is accomplished when H is set such that the relation of statistical independence holds for y1(t) and y2(t) as well, i.e., E[f(y1(t))·g(y2(t))] = 0 for every possible choice of the two odd functions f and g.

To specify these conditions in a form appropriate for the present invention, it is necessary to define some performance function C(t) that is largest when the conditions of statistical independence are closest to being satisfied. In practice, a number of choices must be made in the definition of C(t). First, although the number of equations that must be satisfied for true statistical independence is infinite, it is generally necessary to choose a small number of pairs of functions f and g for which the equations are to be satisfied. Second, the time window over which the expectation operator E[·] performs averaging must be chosen. Third, the averages used to compute the function C(t) can be defined in one of two ways: direct or retrospective.

Each of these choices can affect the speed at which the system adapts to changes in the configuration of sources, and the computational demands that it places on the underlying hardware. With some combinations of these choices, including those made in the preferred embodiment, the present invention provides advantages not found with competing methods such as LMS or RLS.

We now discuss the three choices, their effects on speed and efficiency, and the conditions under which the present invention provides unique advantages.

The first choice concerns which pairs of f and g functions to use in enforcing the statistical-independence conditions. One familiar with the art will recognize that a variety of ways to enforce the statistical-independence relations exist, and that any one of these could be used without departing from the spirit and scope of the invention. In the preferred embodiment of the present invention, the relations are enforced by incorporating them into a least-squares error function of the following form:

e(t) = Σk ck ek(t),

where ck is a weighting factor,

ek(t) = { E[ fk(y1(t)) gk(y2(t)) ] }²,

and fk and gk are odd functions. Often they are defined to be odd polynomial terms, e.g., fk(y) = y³. The goal is to minimize e(t). The performance function C(t) is then defined thus:

C(t) = -protlog[ a0, e(t) ],

and the goal is then to maximize C(t). The notation protlog[ a0, a ] denotes a protected logarithm:

protlog[ a0, a ] = log[a] if a > a0
                 = log[a0] otherwise.

In the preferred embodiment, the bound a0 is equal to 1e-20.

Given this definition of C(t), the first choice amounts to the selection of odd functions fk and gk and weighting coefficients ck. Each function fk and gk can be linear or nonlinear, and any number of pairs of f and g functions can be used. These choices have been debated extensively in the published literature because they involve a tradeoff between performance and mathematical sufficiency. On one hand, raw performance issues call for the number of pairs of functions f and g to be small, and for the functions themselves to be simple; otherwise C(t) becomes expensive to compute, difficult to optimize (because of multiple competing optima), and sensitive to noise. On the other hand, if those guidelines are carried too far, the performance function C(t) becomes "instantaneously underdetermined." That is, at any given time t, the relation e(t) = 0 does not uniquely determine H; the value of H that causes y1(t) and y2(t) to be filtered versions of s1(t) and s2(t) is only one of a family of values of H that satisfy e(t) = 0.

The second choice pertains to the length of the window over which the expectation operator E[·] performs averaging. This choice gives rise to a tradeoff between adaptation rate and fluctuation. On one hand, if the averaging times are too long, the system will adapt to changes slowly because the statistics that make up C(t) change slowly. On the other hand, if the averaging times are too short, C(t) will be noisy. Moreover, within short time windows it is possible for the pivotal assumption of statistical independence to be violated transiently.

The third choice pertains to precisely how the averaging done by the expectation operator E[·] is defined. This choice has an impact on how quickly and accurately the invention can adapt to the incoming signals. Consider the expression E[fk(y1(t))·gk(y2(t))], which involves averaging over the outputs of the invention. The outputs depend, in turn, on the transfer function H, which varies in time as the invention adapts to the incoming signals. One method is to perform the time average using the actual past values of y1(t) and y2(t). This is the conventional averaging procedure. Although simpler to implement, this method creates a problem: the averaging window may include past times during which the output signals y1(t) and y2(t) were computed with past values of H that were either much better or much worse than the current H at separating the sources. Consequently, with conventional averaging the performance function C(t) is not an accurate measure of how well the invention is currently separating the signals.

The alternative is to perform the averaging retrospectively: compute the time average using the past values that y1(t) and y2(t) would have taken, had the current value of H been used to compute them. This way, the performance function C(t) is a more accurate measure of the performance of the current value of H, and adaptation to movement of the sources can occur more quickly. However, this method of computing the average calls for a somewhat more complicated technique for processing the input signals x1(t) and x2(t). The technique used in the preferred embodiment is described below.

A key advantage of the present invention over existing adaptive filtering techniques is its relative imperviousness to the effects of mathematical insufficiency and fluctuation in C(t). These properties permit the first two choices to be made in a manner that favors performance and adaptation rate.

In particular, consider the behavior of the performance function C(t) in the preferred embodiment. Only one pair of functions f, g is used, and both functions are linear: f(y) = y and g(y) = y, so that C(t) is maximized by driving the expression

E[ y1(t) y2(t) ]

as close as possible to zero. This condition, known as decorrelation, is a much less stringent condition than statistical independence. It has long been recognized in the literature (see, for example, Cardoso, 1989) that decorrelation over a single time window, even a long one, is an insufficient condition for source separation. In the terms used here, the resulting performance function is instantaneously underdetermined. Moreover, in the preferred embodiment, the expectation operator E[·] averages over a very short time window: 50 milliseconds; this causes the performance function C(t) to be sensitive to noise and to transient departures from statistical independence of the sources. It is under conditions of instantaneous underdetermination and rapid fluctuation of C(t) that the greatest advantages of the present invention over the prior art are realized. The so-called Least-Squares methods, of which Recursive Least Squares (RLS) is the archetype, cannot be used because they effectively compute, at each time step, a unique global optimum of the performance function. (RLS performs this computation in an efficient manner by updating a past solution with data from the current time window; nonetheless, the resulting solution is the unique global optimum.) In the presence of instantaneous underdetermination, no unique global optimum exists.

Under conditions of instantaneous underdetermination, LMS and GA solutions exist but are expected to fail. This is because both methods attempt to move toward the global optimum of the performance function at any given instant in time, using the gradient of the performance function. Under the conditions of instantaneous underdetermination presently under discussion, the theoretical global optimum is not a single point on the performance surface, but an infinite collection of points spread over a large region of the parameter space. At any given instant, LMS or GA techniques will attempt to move their parameter estimates toward one of these optimal points. Subsequent linear smoothing will produce a weighted mean of these estimates, which in general may be very different from the ideal value. Instead of producing a weighted mean of estimates, the present invention effectively calculates a weighted mode: it chooses the single estimate that most consistently optimizes the performance surface over time.

The present invention copes well with instantaneous underdetermination only under certain conditions. In particular, the preferred embodiment of the present invention exploits the fact that if the sources s1(t) and s2(t) vary in their relative power, as is true in conversational speech, the relation e(t) = 0 holds true consistently over time only for the correct value of H. Other values of H can satisfy the relation only transiently. The behavior of the present invention is to choose τ1 and τ2 such that C(t) is maximized consistently over time.

The strategy of maximizing C(t) consistently over time is an advantage of the present invention over existing LMS- and GA-based algorithms even when instantaneous underdetermination is absent and a weighted mean does converge to the ideal parameters. Even under such favorable conditions, LMS- and GA-based algorithms may still produce a parameter set that fluctuates rapidly, albeit with values centered about the ideal. For certain applications, these fluctuations would detract from the usefulness or attractiveness of the system. They may be removed by the conventional technique of linear smoothing, but usually at the expense of adaptation speed: the system will react slowly to true changes in the ideal parameters. In contrast, the present invention can produce non-fluctuating parameters while retaining the ability to adapt quickly. This is because noise fluctuations (and fluctuations in C(t) due to transient departures of the sources from statistical independence) are translated into fluctuations in the cumulative performance values, not in the parameters themselves.

An important advantage of the present invention over the so-called Least-Squares methods, of which Recursive Least Squares (RLS) is the archetype, is its generality. Least-Squares methods can be used only when the form of C(t) permits its global maximum to be computed analytically. The present invention has no such requirement.

In the present invention, the third choice (conventional vs. retrospective averaging) is especially critical. The invention relies on being able to obtain a rapid, accurate estimate of how well a given trial value of H would separate the signals, and therefore essentially requires that averaging be performed retrospectively. In addition, unlike conventional averaging, retrospective averaging can permit the bulk of the processing effort to be performed in a precomputation step that is common to all population members.

In the preferred embodiment, this precomputation proceeds as follows. As previously discussed, the purpose of the performance calculator is to calculate C(t) in accordance with:

C(t) = -protlog[ a0, e(t)² ],

where e(t) = E[ y1(t) · y2(t) ]. The expectation operator E[·] is expensive to calculate, but the computation can be structured so that multiple calculations of C(t) for the same time but different τ1 and τ2 can share many operations. Since, as described above, the time delays τ1 and τ2 are implemented by convolution with sinc-shaped filters w1 and w2:

y1(t) = x1(t) - Σk2 w2(τ2,k2) x2(t - k2Δt)
y2(t) = x2(t) - Σk1 w1(τ1,k1) x1(t - k1Δt),

and since the expectation operator E[·] is linear, it follows that the quantity in question can be expressed as follows:

E{ y1(t) y2(t) } = E{ x1(t) x2(t) }
    - Σk1 w1(τ1,k1) e11(0,k1)
    - Σk2 w2(τ2,k2) e22(k2,0)
    + Σk1 Σk2 w1(τ1,k1) w2(τ2,k2) e12(k1,k2),

where w1 and w2 are coefficients that depend on the delays but not the input data, and eij(k1,k2) = E{ xi(t - k1Δt) xj(t - k2Δt) }. These correlations can be precalculated once per clock cycle for each (k1,k2) pair, leaving the sums over k1 and k2 to be calculated once for each (τ1,τ2) parameter set.
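This shared-computation structure can be checked numerically. The sketch below (window length, tap count, and weight values are illustrative assumptions) verifies that the windowed average of y1(t)y2(t) equals the combination of precomputed pairwise correlations, with the middle terms taken over the autocorrelations of x1 and x2 as the expansion of the product requires:

```cpp
#include <cmath>
#include <cstdlib>
#include <vector>

// Compare the direct windowed average of y1*y2 with the precomputed form
// built from pairwise correlations e_ij(a,b) = avg of x_i(t-a)*x_j(t-b).
bool check_precompute() {
    const int T = 400, W = 200, K = 4;   // samples, window, taps
    std::vector<double> x1(T), x2(T);
    std::srand(3);
    for (int t = 0; t < T; ++t) {
        x1[t] = std::rand() / (double)RAND_MAX - 0.5;
        x2[t] = std::rand() / (double)RAND_MAX - 0.5;
    }
    std::vector<double> w1(K + 1, 0.0), w2(K + 1, 0.0);  // delay-filter taps
    for (int k = 1; k <= K; ++k) { w1[k] = 1.0 / (k + 1); w2[k] = 1.0 / (k + 2); }

    // e_ij(a,b): windowed average over the last W samples.
    auto e = [&](const std::vector<double>& xi, const std::vector<double>& xj,
                 int a, int b) {
        double s = 0.0;
        for (int t = T - W; t < T; ++t) s += xi[t - a] * xj[t - b];
        return s / W;
    };

    // Direct: average of y1(t)*y2(t) over the window.
    double direct = 0.0;
    for (int t = T - W; t < T; ++t) {
        double y1 = x1[t], y2 = x2[t];
        for (int k = 1; k <= K; ++k) {
            y1 -= w2[k] * x2[t - k];
            y2 -= w1[k] * x1[t - k];
        }
        direct += y1 * y2;
    }
    direct /= W;

    // Precomputed: e12(0,0) - sum w1*e11(0,k1) - sum w2*e22(k2,0)
    //              + sum sum w1*w2*e12(k1,k2).
    double pre = e(x1, x2, 0, 0);
    for (int k1 = 1; k1 <= K; ++k1) pre -= w1[k1] * e(x1, x1, 0, k1);
    for (int k2 = 1; k2 <= K; ++k2) pre -= w2[k2] * e(x2, x2, k2, 0);
    for (int k1 = 1; k1 <= K; ++k1)
        for (int k2 = 1; k2 <= K; ++k2)
            pre += w1[k1] * w2[k2] * e(x1, x2, k1, k2);

    return std::fabs(direct - pre) < 1e-9;
}
```

Only the e_ij tables depend on the input data, so they can be computed once per cycle and reused by every (τ1,τ2) candidate, as the passage above describes.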

When the number of input channels exceeds two, the quantity eij(ki,kj) = E{ xi(t - kiΔt) xj(t - kjΔt) } must be precalculated for every (i,j) pair and, for each such pair, for every (ki,kj) pair. In the preferred embodiment of the present invention, the invention has been implemented in software, as set forth in the microfiche appendix. The software code is written in the C++ language for execution on a workstation from Silicon Graphics. However, as previously discussed, hardware implementation of the present invention is also contemplated to be within the scope of the present invention. Thus, for example, the direct signal separator 30 and the crosstalk remover 50 can be a part of a digital signal processor, or can be a part of a general purpose computer.

// Crossecho2  -*-C++-*-  $Revision: 12.2 $  $Date: 1994/09/05 23:37:57 $

// Second stage of our two-stage algorithm.
#ifndef _Crossecho2_h
#define _Crossecho2_h

class Crossecho2 {
    // Constructor, destructor
public:
    Crossecho2( int ntap, float mu );
    ~Crossecho2();

    // Settings
private:
    float mu;
    int ntap;

    // State
private:
    float *wa, *wb;
    float *za, *zb;
    int inow;
    int first_time_through;

    // Interface to descendants
public:
    void Process( float& za_now, float& zb_now, float ya, float yb );
    void dump_to_cout();
};

#endif // _Crossecho2_h

// Crossecho2  -*-C++-*-  $Revision: 12.2 $  $Date: 1994/08/05 23:37:57 $
#include "Crossecho2.h"
#include "mystream.h"

Crossecho2::Crossecho2( int ntap_, float mu_ )
{
    mu = mu_;
    ntap = ntap_;
    wa = new float[ntap];
    wb = new float[ntap];
    za = new float[ntap];
    zb = new float[ntap];
    for( int i = 0; i < ntap; i++ )
        wa[i] = wb[i] = 0;
    inow = 0;
    first_time_through = 1;
}

Crossecho2::~Crossecho2()
{
    delete [] wa;
    delete [] wb;
    delete [] za;
    delete [] zb;
}

void Crossecho2::Process( float& za_now_, float& zb_now_, float ya, float yb )
{
    // Initialize output values with (ya,yb)
    float za_now = ya;
    float zb_now = yb;

    if( !first_time_through ) {
        // Update output values with feedback
        int ilag = inow;
        for( int i = 0; i < ntap; i++ ) {
            za_now += wa[i] * zb[ilag];
            zb_now += wb[i] * za[ilag];
            if( ++ilag >= ntap ) ilag = 0;
        }

        // Update the weights (ilag has wrapped back around to inow)
        float mu_za = mu * za_now;
        float mu_zb = mu * zb_now;
        for( int i = 0; i < ntap; i++ ) {
            wa[i] -= mu_za * zb[ilag];
            wb[i] -= mu_zb * za[ilag];
            if( ++ilag >= ntap ) ilag = 0;
        }
    }

    // Update the circular buffer
    za[inow] = za_now_ = za_now;
    zb[inow] = zb_now_ = zb_now;
    if( ++inow >= ntap ) {
        first_time_through = 0;
        inow = 0;
    }
}

void Crossecho2::dump_to_cout()
{
    cout.width(10);
    // cout.setf(ios::stickywidth);
    for( int i = 0; i < ntap; i++ )
        cout << wa[i] << TB << wb[i] << endl;
}

//

// Teat main // -

#lfdβf TEST

Iinclude 'Signal.h' •include "myatrβam.h* nain() ( cout.βatf(ioβi tunitbuf) ;

// Read in signals cout « "Reading in signals\n*;

Signal ya, yb; ya.read_aifc("Croasachoa-a.in*) t yb.rβad_aifc("Croaβechoa-b.in") ; // ya. runcate(50000); // yb.truncate(50000); long βampB « ya.gat_samplea(); float freq > ya.get_sampling_fre<ιuency() ;

// Prepare output signals cout « 'Preparing out signalβXn";

Signal za(aampβ,f βq), zb(βattipβ,f βq);

    // Process the signals, sample by sample
    cout << "Spawning gizmo\n";

    Crossecho2 gizmo( 2000, 0.15 / (32768. * 32768.) );
    float za_max = 1, zb_max = 1;
    cout << "Looping over samples\n";
    for( long i = 0; i < samps; i++ ) {
        float za_, zb_;
        gizmo.Process( za_, zb_, ya[i], yb[i] );
        za[i] = za_;
        zb[i] = zb_;
        if( i % 1000 == 0 ) cout << i TB << za_ TB << zb_ NL;
        // cout << za_ TB << zb_ TB << ya[i] TB << yb[i] NL;
    }
    cout << "Writing output Crossecho2-a.out\n";
    za.write_aifc("Crossecho2-a.out");
    cout << "Writing output Crossecho2-b.out\n";
    zb.write_aifc("Crossecho2-b.out");
    // cout << "Playing output a\n"; za.play();
    // cout << "Playing output b\n"; zb.play();
    return 0;
}
#endif
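The Crossecho2 routines above implement a pair of cross-coupled adaptive FIR filters: each output channel starts from its raw input, accumulates feedback through the other channel's recent outputs, and each weight is pushed opposite the product of the current output and the delayed cross-channel output. A minimal sketch of one processing step, using `std::vector` histories in place of the original circular buffer (the `CrossFilter` name and the shift-based history are mine, not the patent's):

```cpp
#include <vector>
#include <cstddef>
#include <cassert>

// Cross-coupled adaptive filter step, modeled on Crossecho2::Process.
// Each channel's output begins as its raw input, gains feedback from the
// other channel's past outputs, and the weights move opposite the product
// of output and delayed cross-channel output (a decorrelating update).
struct CrossFilter {
    std::vector<float> wa, wb;   // cross-coupling weights
    std::vector<float> za, zb;   // past outputs, most recent first
    float mu;                    // adaptation rate

    CrossFilter(std::size_t ntap, float mu_)
        : wa(ntap, 0), wb(ntap, 0), za(ntap, 0), zb(ntap, 0), mu(mu_) {}

    void process(float ya, float yb, float& za_now, float& zb_now) {
        za_now = ya;
        zb_now = yb;
        for (std::size_t i = 0; i < wa.size(); ++i) {
            za_now += wa[i] * zb[i];     // feedback from other channel
            zb_now += wb[i] * za[i];
        }
        for (std::size_t i = 0; i < wa.size(); ++i) {
            wa[i] -= mu * za_now * zb[i];  // anti-correlation weight update
            wb[i] -= mu * zb_now * za[i];
        }
        // Shift histories (the original keeps a circular buffer instead).
        za.pop_back(); za.insert(za.begin(), za_now);
        zb.pop_back(); zb.insert(zb.begin(), zb_now);
    }
};
```

On the first sample the histories are zero, so the outputs equal the raw inputs; once correlated energy appears in both channels, the weights move away from zero to cancel it.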

#define MAXMICS 10

#include "Signal.h"
#include "SrchSStourn.h"
#include "SrchSSpop.h"
#include "SrchSShc.h"
#include "RepnSS.h"
#include "Crossecho2.h"
#include "mystream.h"
#include "SS2.h"
#include <stdlib.h>
#include <string.h>

class SoSensor;

static SrchSSpop* srch = 0;
char* filename[MAXMICS];
int nmics;

int main( int argc, char** argv )
{
    cout.setf( ios::unitbuf );
    cerr.setf( ios::unitbuf );

    // Parse command-line args

    char *file1 = 0, *file2 = 0;
    int delays = 0;
    int crossecho = 0;
    int nrep = 1;

    Signal ya, yb;
    long samps = 0;
    double freq = 0;
    double lo = 0, hi = 0;

    SS2* ss2 = 0;
    RepnSS* repn = 0;

    SS2::Setup( argc, argv, ss2, repn, ya, yb, file1, file2,
                lo, hi, samps, freq, delays, crossecho, nrep );

    // Figure out sampling frequency and window size
#define WINDOW_TIME 0.01
    const long window_samps = long( freq * WINDOW_TIME + 0.5 );

    // Form outputs

    Signal za( samps, freq ), zb( samps, freq );

    // Start up a new searcher
    RepnSS* suggest = (RepnSS*) repn->Clone();
#define NPOP 30

    SrchSShc* suggester = new SrchSShc(suggest);
    srch = new SrchSSpop( repn, NPOP, suggester );
    if( delays < 2 ) suggester->pop = srch;
    suggester->maxran = NPOP-1;
    suggester->ran_factor = 0.8;

    RepnSS* follower = (RepnSS*) srch->get_follower();

    // Set some parameters
    switch( delays ) {
      case 0:
        srch->min_iterations_to_max_assets = 50;
        srch->max_income_diff_per_iteration = 10;
        srch->iterations_per_suggestion = 10;
        srch->suggestions_per_sample = 10;
        break;
      case 1:
        srch->min_iterations_to_max_assets = 10;
        srch->max_income_diff_per_iteration = 10;
        srch->iterations_per_suggestion = 3;
        srch->suggestions_per_sample = 3;
        break;
      case 2:
        srch->min_iterations_to_max_assets = 50;
        srch->max_income_diff_per_iteration = 10;
        srch->iterations_per_suggestion = 5;
        srch->suggestions_per_sample = 5;
        break;
    }
    srch->replacement_threshold = 0;

    // Do the loop over samples
    long it = 0;
    int irep = 0;
    follower->Install();

    Crossecho2 stage3( 225, 0.15 / (32768. * 32768.) );
    while (1) {

        ss2->Sample2( ya[it], yb[it] );
        if( irep == 0 ) {
            float za_, zb_;
            ss2->FormOut2( za_, zb_ );
            if( crossecho ) stage3.Process( za[it], zb[it], za_, zb_ );
            else { za[it] = za_; zb[it] = zb_; }
        }
        if( it % window_samps == 0 ) {
            srch->Iterate();
            // cout << follower->Evaluate() NL;
            cout << follower->to_string() NL;
        }
        if( ++it >= samps ) {
            it = 0;
            if( ++irep == 1 ) {
                char *infile1 = 0, *infile2 = 0;
                char *outfile1, *outfile2;
                if( file2 ) {
                    outfile1 = new char [strlen(file1)+5];
                    outfile2 = new char [strlen(file2)+5];
                    strcpy( outfile1, "ss2-" );
                    strcpy( outfile2, "ss2-" );
                    strcat( outfile1, file1 );
                    strcat( outfile2, file2 );
                } else {
                    outfile1 = new char [strlen(file1)+8];
                    outfile2 = new char [strlen(file1)+8];
                    strcpy( outfile1, "ss2-za-" );
                    strcpy( outfile2, "ss2-zb-" );
                    strcat( outfile1, file1 );
                    strcat( outfile2, file1 );
                    infile1 = new char [strlen(file1)+8];
                    infile2 = new char [strlen(file1)+8];
                    strcpy( infile1, "ss2-ya-" );
                    strcpy( infile2, "ss2-yb-" );
                    strcat( infile1, file1 );
                    strcat( infile2, file1 );
                }

                cerr << "Writing outputs: " << outfile1 SP << outfile2 NL;
                double sdev = 0x7fff * 0.02;
                double var = sdev * sdev;
                za.normalize_variance( var );
                zb.normalize_variance( var );
                za.write_aifc( outfile1 );
                zb.write_aifc( outfile2 );
                if( infile1 && infile2 ) {
                    cerr << "Writing inputs: " << infile1 SP << infile2 NL;
                    ya.write_aifc( infile1 );
                    yb.write_aifc( infile2 );
                }
            }
            if( irep >= nrep ) break;
        }
    }

    return 0;
}

#ifndef _mystream_h
#define _mystream_h

#include <stream.h>
#include <fstream.h>

#define SP << ' '
#define NL << '\n'
#define TB << '\t'

#endif // _mystream_h

// Signal module -*-c++-*-

// Handles monophonic sound signals

// Is able to do arithmetic like superpose signals with scaling and
// time delay; compute running correlation

#define DEFAULT_SAMPLING_FREQUENCY 44100

#include <audio.h>

class Signal {

    // Constructor and destructor
public:
    Signal( long samples_ =0,
            float sampling_frequency_ = DEFAULT_SAMPLING_FREQUENCY );
    ~Signal();

    // Sampling frequency
private:
    float sampling_frequency;
    float dt;
public:
    void set_sampling_frequency( float );
    float get_sampling_frequency() { return sampling_frequency; }

    // Size and storage
public:
    void truncate( long samples_new );
private:
    long samples;
    float* value;
    void resize( long samples_new );
    void resize( float approx_samples_new );

    // Access
public:
    float& operator [] ( long i ) { return value[i]; }
    const long get_samples() { return samples; }

    // Read or create a signal
public:
    void create_zero( long samples_ =0 );
    void create_random( long samples_ =0 );
    void create_sine( float sinfreq, long samples_ =0 );

    // Arithmetic
public:
    void operator *= ( float );
    void add( Signal& that, float scale =1.0, float delay =0.0 );
    void add_product( Signal& a, Signal& b );
    // void damp( float elife );
    void window_average( long iwindow );
    friend Signal& window_average( Signal& src, long iwindow );
    void exponential_average( Signal& src, long elife );
    void force_zero_mean();
    void normalize_variance( float desired_variance =1.0 );
    void normalize_to_max( float desired_max );
    // void correlate( Signal& a, Signal& b, float elife );

    // Convolution
public:
    void add_timedelay( Signal& src, double tau, double desired_sum =1 );
    // void timedelay( Signal& src, double tau, double desired_sum =1 );
private:
    static double* convolution;
    static int nconvolution, nconv2;

    // Transformed signals
public:
    Signal& operator = ( Signal& that );
    // void AGC( Signal&, float elife );

    // Audio I/O
public:
    void play( float speedup =1.0 );
    void record( float duration );
    void play_async( float speedup =1.0 );
    static long play_async_more();   // returns current sample or -1
private:
    typedef short soundsample;
    static ALport alport;
    static soundsample* alport_buf;  // 0 when buffer empty
    static void alport_buf_resize( long );
    static long alport_buf_size;
    static long alport_buf_cursor;   // -1 when port not in use

    // File I/O
public:
    void write( const char* filename );
    void read( const char* filename );
    void write_aifc( const char* filename );
    void read_aifc( const char* filename, int rightchannel =0 );
    void write_human( const char* filename );
    void read_human( const char* filename );
};

// -*-c++-*-

#include "Signal.h"
#include "mystream.h"
#include <stdlib.h>
#include <string.h>
#include <audiofile.h>
#include <math.h>

#define TWO_PI (2 * M_PI)

extern "C" { extern int sginap(long ticks); }

static double sinc( double x )
{
    if( x == 0 ) return 1;
    return sin(x) / x;
}

// ----------------------------------------------------------------
// Static buffer
// ----------------------------------------------------------------

ALport Signal::alport;

Signal::soundsample* Signal::alport_buf = 0;

void Signal::alport_buf_resize( long newsize )
{
    if( alport_buf_size == newsize ) return;
    if( alport_buf ) delete alport_buf;
    alport_buf = new soundsample [alport_buf_size = newsize];
}

long Signal::alport_buf_size = 0;
long Signal::alport_buf_cursor = -1;
double* Signal::convolution = 0;
int Signal::nconvolution = 0;
int Signal::nconv2 = 0;

// ----------------------------------------------------------------
// Constructor and destructor
// ----------------------------------------------------------------

Signal::Signal( long samples_, float sampling_frequency_ )
    : samples(0), value(0)
{
    set_sampling_frequency( sampling_frequency_ );
    resize( samples_ );
    for( long i = 0; i < samples; i++ ) value[i] = 0.0;
    if( !convolution ) {
        nconv2 = 10;
        nconvolution = nconv2 * 2 + 1;
        convolution = new double [nconvolution];
    }
}

Signal::~Signal() { if( samples > 0 ) delete value; }

void Signal::set_sampling_frequency( float x )
{
    sampling_frequency = x;
    dt = 1.0 / x;
}

void Signal::resize( long samples_new )
{
    if( samples_new != samples ) {
        if( samples > 0 ) delete value;
        value = ((samples = samples_new) > 0) ? new float [samples] : 0;
    }
}

void Signal::resize( float approx_samples )
{
    long samples_new = (long) floor(approx_samples + 0.5);
    resize( samples_new );
}

// ----------------------------------------------------------------
// Truncate
// ----------------------------------------------------------------

void Signal::truncate( long samples_new )
{
    if( samples_new == samples ) return;
    float* value_new = new float [samples_new];
    for( long i = 0; i < samples_new; i++ )
        value_new[i] = (i < samples) ? value[i] : 0;
    delete value;
    value = value_new;
    samples = samples_new;
}

// ----------------------------------------------------------------
// Read or create a signal
// ----------------------------------------------------------------

void Signal::create_sine( float sinfreq, long samples_ )
{
    if( samples_ > 0 ) resize( samples_ );
    for( long i = 0; i < samples; i++ ) {
        float t = i * dt;
        value[i] = sin( TWO_PI * t * sinfreq );
    }
}

void Signal::create_random( long samples_ )
{
    if( samples_ > 0 ) resize( samples_ );
    float mean = 0;
    for( long i = 0; i < samples; i++ )
        mean += value[i] = ((random() % 10000) / 5000.0) - 1.0;
    mean /= samples;
    for( i = 0; i < samples; i++ ) value[i] -= mean;
}

void Signal::create_zero( long samples_ )
{
    if( samples_ > 0 ) resize( samples_ );
    for( long i = 0; i < samples; i++ ) value[i] = 0.0;
}

// ----------------------------------------------------------------
// Arithmetic
// ----------------------------------------------------------------

void Signal::add( Signal& that, float scale, float delay )
{
    if( this->sampling_frequency != that.sampling_frequency )
        cerr << "Warning: Signal::add() encountered unequal sampling freqs\n";

    // Compute delay in terms of number of time samples
    long idelay = long(delay * sampling_frequency + 0.5);
    if( idelay > this->samples ) idelay = this->samples;

    // Copy signal
    long ithis = idelay, ithat = 0;
    while( (ithis < this->samples) && (ithat < that.samples) )
        this->value[ithis++] += that.value[ithat++] * scale;
}

void Signal::add_product( Signal& a, Signal& b )
{
    if( sampling_frequency != a.sampling_frequency )
        cerr << "Warning: Signal::add() encountered unequal sampling freqs\n";
    if( sampling_frequency != b.sampling_frequency )
        cerr << "Warning: Signal::add() encountered unequal sampling freqs\n";
    if( samples != a.samples )
        cerr << "Warning: Signal::add() encountered unequal signal lengths\n";
    if( samples != b.samples )
        cerr << "Warning: Signal::add() encountered unequal signal lengths\n";

    // Copy signal
    for( long i = 0; i < samples; i++ )
        value[i] += a.value[i] * b.value[i];
}

void Signal::window_average( long iwindow )
{
    float iwin_inv = 1.0 / iwindow;
    float sum = 0;
    float* value_new = new float [samples];
    for( long i = 0; i < samples; i++ ) {
        sum += value[i];
        if( i >= iwindow ) sum -= value[i-iwindow];
        value_new[i] = sum * iwin_inv;
    }
    delete value;
    value = value_new;
}

Signal& window_average( Signal& src, long iwindow )
{
    long samples = src.get_samples();

    Signal* dst = new Signal( samples );
    float iwin_inv = 1.0 / iwindow;
    float sum = 0;
    for( long i = 0; i < samples; i++ ) {
        sum += src.value[i];
        if( i >= iwindow ) sum -= src.value[i-iwindow];
        dst->value[i] = sum * iwin_inv;
    }
    return *dst;
}

Signal& exponential_average( Signal& src, long elife )
{
    long samples = src.get_samples();

    Signal* dst = new Signal( samples );
    dst->exponential_average( src, elife );

    return *dst;
}

void Signal::exponential_average( Signal& src, long elife )
{
    float frac = -expm1( -1.0 / elife );
    float avg = 0;
    for( long i = 0; i < samples; i++ ) {
        avg = frac * src.value[i] + (1-frac) * avg;
        value[i] = avg;
    }
}

void Signal::force_zero_mean()
{
    float mean = 0;
    for( long i = 0; i < samples; i++ ) mean += value[i];
    mean /= samples;
    for( i = 0; i < samples; i++ ) value[i] -= mean;
}

void Signal::normalize_variance( float desired_variance )
{
    force_zero_mean();
    float sumsq = 0.0;
    for( long i = 0; i < samples; i++ ) sumsq += value[i] * value[i];
    float scale = (sumsq == 0.0)
        ? 1.0
        : sqrt( desired_variance * (samples/sumsq) );
    for( i = 0; i < samples; i++ ) value[i] *= scale;
}

void Signal::normalize_to_max( float desired_max )
{
    force_zero_mean();
    float actual_max = 0;
    for( long i = 0; i < samples; i++ )
        if( fabs(value[i]) > actual_max ) actual_max = fabs(value[i]);
    float scale = (actual_max == 0.0)
        ? 1.0
        : desired_max / actual_max;
    for( i = 0; i < samples; i++ ) value[i] *= scale;
}

void Signal::operator *= ( float scale )
{
    for( long i = 0; i < samples; i++ ) value[i] *= scale;
}

// ----------------------------------------------------------------
// Time delay
// ----------------------------------------------------------------

void Signal::timedelay( Signal& src, double tau, double desired_sum )
// Tau should be number of time samples, not necessarily integer
{
    // Set convolution
    double sum = 0;
    for( int d = 0; d < nconvolution; d++ )
        sum += (convolution[d] = sinc( M_PI * (d - nconv2 - tau) ));
    double scale = desired_sum / sum;
    for( d = 0; d < nconvolution; d++ ) convolution[d] *= scale;

    // Compute this by convolving src with convolution
    for( long t = 0; t < samples; t++ ) {
        value[ t ] = 0;
        for( d = 0; d < nconvolution; d++ ) {
            long ts = t + d - nconv2;
            if( ts >= 0 && ts < src.samples )
                value[ t ] += convolution[d] * src.value[ ts ];
        }
    }
}

void Signal::add_timedelay( Signal& src, double tau, double desired_sum )
// Tau should be number of time samples, not necessarily integer
{
    // Set convolution
    double sum = 0;
    for( int d = 0; d < nconvolution; d++ )
        sum += (convolution[d] = sinc( M_PI * (d - nconv2 - tau) ));
    double scale = desired_sum / sum;
    for( d = 0; d < nconvolution; d++ ) convolution[d] *= scale;

    // Compute this by convolving src with convolution
    for( long t = 0; t < samples; t++ ) {
        for( d = 0; d < nconvolution; d++ ) {
            long ts = t + nconv2 - d;
            if( ts >= 0 && ts < src.samples )
                value[ t ] += convolution[d] * src.value[ ts ];
        }
    }
}
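The time-delay routines above realize a fractional-sample delay by convolving the source with a renormalized sinc kernel centered at `tau`. A self-contained sketch of just the kernel construction (the helper name `make_sinc_kernel` is mine; the patent code writes into a shared static `convolution` array instead):

```cpp
#include <vector>
#include <cmath>
#include <cassert>

// Build a (2*half+1)-tap sinc interpolation kernel for a delay of `tau`
// samples (not necessarily an integer), scaled so the taps sum to
// `desired_sum`, mirroring the kernel setup in Signal::add_timedelay.
std::vector<double> make_sinc_kernel(double tau, int half,
                                     double desired_sum = 1.0)
{
    const double PI = 3.14159265358979323846;
    int n = 2 * half + 1;
    std::vector<double> k(n);
    double sum = 0;
    for (int d = 0; d < n; ++d) {
        double x = PI * (d - half - tau);
        k[d] = (x == 0) ? 1.0 : std::sin(x) / x;   // sinc(x)
        sum += k[d];
    }
    // Renormalize so the kernel preserves DC gain desired_sum.
    for (int d = 0; d < n; ++d) k[d] *= desired_sum / sum;
    return k;
}
```

For an integer `tau` the kernel collapses (up to rounding) to a shifted unit impulse, so the convolution reduces to a plain sample shift; fractional `tau` interpolates between samples.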

// ----------------------------------------------------------------
// Transformed signals
// ----------------------------------------------------------------

Signal& Signal::operator = ( Signal& that )
{
    resize( that.samples );
    sampling_frequency = that.sampling_frequency;
    dt = that.dt;
    for( long i = 0; i < samples; i++ ) value[i] = that.value[i];
    return *this;
}

#define DEFAULT_GAIN 1.0

// ----------------------------------------------------------------
// Audio I/O
// ----------------------------------------------------------------

#define VOLUME 20000

void Signal::play_async( float speedup )
{
    // Close port if necessary
    if( alport_buf_cursor >= 0 ) ALcloseport(alport);

    // Create buffer into which sound will be copied
    alport_buf_resize( samples );

    // Find maximum absolute value of signal, and from that, scaling
    float value_maxabs = 0.0;
    for( long i = 0; i < alport_buf_size; i++ ) {
        register float value_abs = fabsf(value[i]);
        if( value_abs > value_maxabs ) value_maxabs = value_abs;
    }
    float scale = (value_maxabs == 0) ? 0.0 : (VOLUME / value_maxabs);

    // Copy to buffer
    for( i = 0; i < alport_buf_size; i++ )
        alport_buf[i] = (soundsample)( value[i] * scale + 0.5 );

    // Set correct sampling rate
    static long PVbuffer[2];
    PVbuffer[0] = AL_OUTPUT_RATE;
    PVbuffer[1] = long( sampling_frequency * speedup + 0.5 );
    ALsetparams(AL_DEFAULT_DEVICE, PVbuffer, 2);

    // Set other parameters: monophonic, 16-bit values
    ALconfig config = ALnewconfig();
    ALsetchannels(config,AL_MONO);
    ALsetwidth(config,AL_SAMPLE_16);

    // Open the port, write the buffer, play it
    alport = ALopenport("sound", "w", config);
    alport_buf_cursor = 0;

}

long Signal::play_async_more()
{
    // Return if not currently playing a sound
    if( alport_buf_cursor < 0 ) return -1;

    // Decide how much to write
    long chunk = ALgetfillable(alport);
    if( chunk == 0 )  // port can't take anything
        return alport_buf_cursor - ALgetfilled(alport);
    if( chunk > alport_buf_size - alport_buf_cursor )
        chunk = alport_buf_size - alport_buf_cursor;

    // If there is more to be written, do it
    if( chunk > 0 ) {
        ALwritesamps(alport, alport_buf + alport_buf_cursor, chunk);
        alport_buf_cursor += chunk;
        return alport_buf_cursor - ALgetfilled(alport);
    }

    // Must be done writing
    long stillplaying = ALgetfilled(alport);
    if( stillplaying > 0 ) return alport_buf_cursor - stillplaying;

    // Done playing too
    ALcloseport(alport);
    return alport_buf_cursor = -1;
}

void Signal::play( float speedup )
{
    play_async( speedup );
    while( play_async_more() >= 0 ) sginap(1);
}

void Signal::record( float duration )
{
    resize(duration * sampling_frequency);
    alport_buf_resize( samples );

    // Set correct sampling rate
    static long PVbuffer[2];
    PVbuffer[0] = AL_INPUT_RATE;
    PVbuffer[1] = (long) sampling_frequency;
    ALsetparams(AL_DEFAULT_DEVICE, PVbuffer, 2);

    // Set other parameters: monophonic, 16-bit values
    ALconfig config = ALnewconfig();
    ALsetchannels(config,AL_MONO);
    ALsetwidth(config,AL_SAMPLE_16);

    // Record the signal
    ALport p = ALopenport("sound", "r", config);
    cout << "Starting to record signal " << duration << " seconds long\n";
    long recorded = 0;
    while( recorded < samples )
        recorded += ALreadsamps(p, alport_buf + recorded, samples - recorded);
    ALcloseport(p);

    // Find maximum absolute value of buffer, and from that, scaling
    long maxabs = 0;
    for( alport_buf_cursor = 0; alport_buf_cursor < samples;
         alport_buf_cursor++ ) {
        register long x = labs( (long)alport_buf[alport_buf_cursor] );
        if( x > maxabs ) maxabs = x;
    }
    float scale = (maxabs == 0) ? 0.0 : (1.0 / maxabs);

    // Copy to buffer
    for( alport_buf_cursor = 0; alport_buf_cursor < samples;
         alport_buf_cursor++ )
        value[alport_buf_cursor] =
            (float) alport_buf[alport_buf_cursor] * scale;
    alport_buf_cursor = -1;
}

// ----------------------------------------------------------------
// File I/O
// ----------------------------------------------------------------

void Signal::write( const char* filename )
{
    ofstream out;
    out.open(filename);
    out << "Signal" NL;
    out << samples NL;
    out << sampling_frequency NL;
    out.write( (char*)value, int(sizeof(float)*samples) ) NL;
    out << "End" NL;
}

#define BUFLEN 1000

void Signal::read( const char* filename )
{
    char buf[BUFLEN+2];

    ifstream in;
    in.open(filename);

    // Check that first line says "Signal"
    in.getline(buf,BUFLEN);
    if( strncmp(buf,"Signal",6) ) cerr << "Warning: Not a Signal file\n";

    // Read in number of samples, sampling frequency
    long samples_new;
    in >> samples_new;
    resize(samples_new);
    in >> sampling_frequency;
    set_sampling_frequency(sampling_frequency);

    // Read in the signal
    in.ignore(1);
    in.read( (char*)value, int(sizeof(float)*samples) );

    // Check that last line says "End"
    in.ignore(1);
    in.getline(buf,BUFLEN);
    if( strncmp(buf,"End",3) ) cerr << "Warning: End of file corrupt\n";
}
#undef BUFLEN

#define OUTBITS 16

void Signal::write_aifc( const char* filename )
{
    alport_buf_resize( samples );

    // Configure the output file
    AFfilesetup outsetup = AFnewfilesetup();
    AFinitfilefmt( outsetup, AF_FILE_AIFFC );
    AFinitcompression( outsetup, AF_DEFAULT_TRACK,
        // AF_COMPRESSION_AWARE_DEFAULT_LOSSLESS );
        AF_COMPRESSION_NONE );
    AFinitrate( outsetup, AF_DEFAULT_TRACK, sampling_frequency );
    AFinitsampfmt( outsetup, AF_DEFAULT_TRACK,
        AF_SAMPFMT_TWOSCOMP, sizeof(soundsample)*8 );
    AFinitchannels( outsetup, AF_DEFAULT_TRACK, 1 );

    // Write the data to the output file
    AFfilehandle outhandle = AFopenfile( filename, "w", outsetup );
    for( long i = 0; i < samples; i++ )
        alport_buf[i] = soundsample( value[i] + 0.5 );
    long frameswritten = AFwriteframes( outhandle, AF_DEFAULT_TRACK,
                                        alport_buf, samples );
    if( frameswritten != samples )
        cerr << "Warning: frameswritten = " << frameswritten
             << "; samples = " << samples NL;
    AFclosefile( outhandle );

    // Free setup
    AFfreefilesetup( outsetup );
}

void Signal::read_aifc( const char* filename, int rightchannel )
{

    // Open the input file and find out everything about it
    AFfilesetup  insetup  = AFnewfilesetup();
    AFfilehandle inhandle = AFopenfile( filename, "r", insetup );
    long inframecnt = AFgetframecnt( inhandle, AF_DEFAULT_TRACK );
    long inchannels = AFgetchannels( inhandle, AF_DEFAULT_TRACK );
    long insampcnt  = inframecnt * inchannels;
    long insampfmt, insampwidth;
    AFgetsampfmt( inhandle, AF_DEFAULT_TRACK, &insampfmt, &insampwidth );
    long insampbytewidth = (insampwidth + 7) / 8;
    long inbytes = insampcnt * insampbytewidth;
    int inrate = int(AFgetrate( inhandle, AF_DEFAULT_TRACK ) + .5);
    if( inchannels != 2 ) rightchannel = 0;

    // Read the data into a buffer
    signed char* inbuf = new signed char [inbytes];
    long framesread = AFreadframes( inhandle, AF_DEFAULT_TRACK,
                                    inbuf, inframecnt );
    if( framesread != inframecnt )
        cerr << "Warning: framesread = " << framesread
             << "; inframecnt = " << inframecnt NL;

    // Copy the data to our own member variables
    resize( inframecnt );
    signed char *p = inbuf + rightchannel * insampbytewidth;
    for( long i = 0; i < inframecnt; i++ ) {
        if( insampbytewidth <= 1 ) {
            value[i] = *p;
        } else if( insampbytewidth <= 2 ) {
            short* q = (short *) p;
            value[i] = *q;
        } else if( insampbytewidth <= 4 ) {
            long* q = (long *) p;
            value[i] = *q;
        } else
            cerr << "Warning: insampbytewidth > 4\n";
        p += insampbytewidth * inchannels;
    }
    set_sampling_frequency( inrate );

    // Free stuff
    delete inbuf;
    AFclosefile( inhandle );
    AFfreefilesetup( insetup );

}

void Signal::write_human( const char* filename )
{
    ofstream out;
    out.open(filename);
    out << "Signal" NL;
    out << samples NL;
    out << sampling_frequency NL;
    for( long i = 0; i < samples; i++ ) out << value[i] NL;
    out << "End" NL;
    out << "End 2" NL;
}

#define BUFLEN 1000

void Signal::read_human( const char* filename )
{
    char buf[BUFLEN+2];
    ifstream in;
    in.open(filename);

    // Check that first line says "Signal"
    in.getline(buf,BUFLEN);
    if( strncmp(buf,"Signal",6) ) cerr << "Warning: Not a Signal file\n";

    // Read in number of samples, sampling frequency
    long samples_new;
    in >> samples_new;
    resize(samples_new);
    in >> sampling_frequency;
    set_sampling_frequency(sampling_frequency);

    // Read in the signal
    for( long i = 0; i < samples; i++ ) in >> value[i];

    // Check that last line says "End"
    in.ignore(1);
    in.getline(buf,BUFLEN);
    if( strncmp(buf,"End",3) ) cerr << "Warning: End of file corrupt\n";
}
#undef BUFLEN

// ----------------------------------------------------------------
// Test main
// ----------------------------------------------------------------

#ifdef TEST
int main( int argc, char** argv )
{
    // Test of AGC
    // Signal sin1;
    // sin1.create_sine(1,4*0);
    // sin1 *= 100;
    // Signal sin2;
    // sin2.AGC(sin1,0.1);
    //
    // sin1.write_human("foo1.sig");
    // sin2.write_human("foo2.sig");

    // Test of read_aifc
    // Signal sig;
    // sig.read_aifc(argv[1]);
    // sig.play( (argc >= 3) ? atof(argv[2]) : 1 );
    // sig.write_aifc("foo.aifc");
    return 0;
}
#endif


// Modint -*-c++-*- $Revision: 1.1 $ $Date: 1994/08/04 17:15:16 $

// Integers modulo base

#ifndef _Modint_h
#define _Modint_h

class Modint {
private:
    int i;
public:
    Modint() { i = 0; }
    Modint(int j) { i = j; }
    Modint(Modint& j) { i = j.i; }
    ~Modint() {}
    static int base;
public:
    friend Modint& min(Modint,Modint);
    operator int() { return i; }
    void operator ++ () { if( ++i >= base ) i = 0; }
    void operator -- () { if( --i < 0 ) i = base-1; }
    int operator - (int);
    int operator - (Modint);
};

inline Modint& min(Modint i, Modint j)
{
    return( i.i < j.i ? i : j );
}

inline int Modint::operator - (int j)
{
    j = i - j;
    while( j < 0 ) j += base;
    return j;
}

inline int Modint::operator - (Modint j)
{
    int k = i - j.i;
    if( k < 0 ) k += base;
    return k;
}

#endif // _Modint_h
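Modint wraps an int whose increment, decrement, and subtraction all work modulo a shared static `base`; this is how the circular-buffer indices in SS2FIR wrap around. A standalone version of the same arithmetic as free functions (the names `modinc`, `moddec`, and `moddiff` are mine):

```cpp
#include <cassert>

// Wrap-around increment/decrement for circular-buffer indices,
// matching Modint::operator++ / operator-- against Modint::base.
inline int modinc(int i, int base) { return (++i >= base) ? 0 : i; }
inline int moddec(int i, int base) { return (--i < 0) ? base - 1 : i; }

// Difference reduced into [0, base), like Modint::operator-.
inline int moddiff(int i, int j, int base)
{
    int k = i - j;
    while (k < 0) k += base;
    return k;
}
```

Keeping `base` a single shared value (as Modint does with its static member) means every buffer index in the statistics code wraps at the same history length `T_tot`.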

// Modint -*-c++-*- $Revision: 1.1 $ $Date: 1994/08/04 17:15:16 $

#include "Modint.h"

int Modint::base = 1;

// ----------------------------------------------------------------
// Test main
// ----------------------------------------------------------------

#ifdef TEST
#include "mystream.h"
main( int argc, char** argv )
{
    cout << "Modint has no test main\n";
    return 0;
}
#endif

// SS -*-c++-*- $Revision: 12.1 $ $Date: 1994/08/04 17:16:01 $

// Abstract base class for accumulation of statistics for
// sound-separation problems.

#ifndef _SS_h
#define _SS_h

class Signal;
class RepnSS;

class SS {

    // Constructor, destructor
public:
    SS();
    ~SS();
    virtual RepnSS* new_repn() =0;

    // Leaky integration
public:
    void SetLeak( double esamples = -1 );
protected:
    double nearly_one;
    double nearly_zero;
#define ACCUM_LEAKILY(ex,x) { \
    ex *= nearly_one; \
    ex += nearly_zero * (x); \
}
#define ADD_LEAKILY(ex,exold,x) { \
    ex = (exold) * nearly_one + (x) * nearly_zero; \
}

    // Maintain running averages
public:
    virtual void ClearStats() =0;
    virtual void Sample( double* y ) =0;
    virtual void SampleNoStats( double* y ) =0;

    // Evaluate candidate separation parameters
public:
    virtual double Evaluate( double** ) =0;

    // Step down gradient
public:
    virtual void GoDownhill( double** sepcoeff, double amount ) =0;

    // Apply separation parameters
public:
    virtual void Install( double** ) =0;
    virtual void FormOutput( float* ) =0;

    //
    // Utilities common to many SS applications
    //
public:
    // static void Setup( int argc, char** argv,
    //                    SS*& ss,
    //                    Signal& ya, Signal& yb,
    //                    char*& file1, char*& file2,
    //                    double& lo, double& hi,
    //                    long& samps, double& freq,
    //                    int& delays, int& nrep );
};

#endif // _SS_h

// SS -*-c++-*- $Revision: 12.1 $ $Date: 1994/08/04 17:16:01 $

#include "Signal.h"
#include "SS.h"
#include "RepnSS.h"
#include <math.h>
#include <stdlib.h>

SS::SS() { SetLeak(); }

SS::~SS() {}

void SS::SetLeak( double esamples )
{
    if( esamples > 0 ) {
        nearly_zero = -expm1( -1.0 / esamples );
        nearly_one  = 1.0 - nearly_zero;
    } else {
        // no leak desired
        nearly_zero = 1;
        nearly_one  = 1;
    }
}
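SetLeak turns an averaging time constant `esamples` into complementary coefficients: `nearly_zero = 1 - exp(-1/esamples)` (computed as `-expm1(...)` for accuracy at large `esamples`) and `nearly_one = 1 - nearly_zero`, so that ACCUM_LEAKILY is a one-pole exponential average. A self-contained sketch of that integrator (the `Leaky` struct name is mine):

```cpp
#include <cmath>
#include <cassert>

// One-pole leaky integrator matching SS::SetLeak and ACCUM_LEAKILY:
//   ex = ex * nearly_one + x * nearly_zero
struct Leaky {
    double nearly_one, nearly_zero, ex;
    explicit Leaky(double esamples) : ex(0) {
        // -expm1(-1/e) = 1 - exp(-1/e), without cancellation error
        nearly_zero = -std::expm1(-1.0 / esamples);
        nearly_one  = 1.0 - nearly_zero;
    }
    double accum(double x) { return ex = ex * nearly_one + x * nearly_zero; }
};
```

Feeding a constant input, the average converges to that constant with an effective memory of roughly `esamples` samples, which is why the driver sets the leak to one analysis window (`window_samps`).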

// void SS::Setup( int argc, char** argv,
//                 SS*& ss,
//                 Signal& ya, Signal& yb,
//                 char*& file1, char*& file2,
//                 double& lo, double& hi,
//                 long& samps, double& freq,
//                 int& delays, int& nrep )
// {
//     for( int iarg = 1; iarg < argc; iarg++ ) {
//         if( argv[iarg][0] == '-' ) switch( argv[iarg][1] ) {
//           case 'g': delays = 0; break;
//           case 'd': delays = 1; break;
//           case 'f': delays = 2; break;
//           case 'r': nrep = atoi(argv[iarg]); break;
//         } else {
//             if( !file1 ) file1 = argv[iarg];
//             else if( !file2 ) file2 = argv[iarg];
//         }
//     }
//
//     // Read in signals
//     ya.read_aifc( file1 );
//     if( file2 ) yb.read_aifc( file2 );
//     else yb.read_aifc( file1, 1 );
//     samps = ya.get_samples();
//     if( yb.get_samples() < samps ) samps = yb.get_samples();
//     freq = ya.get_sampling_frequency();
//
//     // Figure out sampling frequency and window size
//     #define WINDOW_TIME 0.01
//     const long window_samps = long( freq * WINDOW_TIME + 0.5 );
//
//     // Do SS-dependent stuff
//     switch( delays ) {
//       case 0:
//         ss = new SSGains;
//         lo = 0;
//         hi = 1;
//         break;
//       case 1:
//         ss = new SSDelays( 10 );
//         lo = -2;
//         hi = 2;
//         RepnSS::force_upper_triangle = 1;
//         RepnSS::mutation_style = RepnSS::MUTATE_ONE_DIMENSION;
//         ((SSDelays *)ss)->SetAlpha( 0.9 );
//         break;
//       case 2:
//         ss = new SSFir2Delays( 10 );
//         lo = -2;
//         hi = 2;
//         RepnSS::force_upper_triangle = 1;
//         RepnSS::mutation_style = RepnSS::MUTATE_ONE_DIMENSION;
//         ((SSDelays *)ss)->SetAlpha( 0.5 );
//         break;
//     }
//
//     ss->SetLeak( window_samps );
//     RepnSS::set_SS( ss );
//     RepnSS::set_bounds( lo, hi, lo, hi );
// }

// SS2 -*-c++-*- $Revision: 12.2 $ $Date: 1994/08/05 23:37:57 $

// Abstract base class for accumulation of statistics for 2x2
// sound-separation problems.

#ifndef _SS2_h
#define _SS2_h

class Signal;
class RepnSS;

#include "SS.h"

class SS2 : public SS {

    // Constructor, destructor
public:
    SS2();
    ~SS2();

    // Utilities common to many SS2 applications
public:
    static void Setup( int argc, char** argv,
                       SS2*& ss, RepnSS*& repn,
                       Signal& ya, Signal& yb,
                       char*& file1, char*& file2,
                       double& lo, double& hi,
                       long& samps, double& freq,
                       int& delays, int& crossecho, int& nrep );

    // Interface to descendants
public:
    virtual void Sample( double* y );
    virtual void Sample2( double ya, double yb ) =0;
    virtual void SampleNoStats( double* y );
    virtual void Sample2NoStats( double ya, double yb ) =0;
    virtual double Evaluate( double** );
    virtual double Eval2( double a, double b ) =0;
    virtual void GoDownhill( double** sepcoeff, double amount );
    virtual void GoDownhill2( double& a, double& b, double amount ) =0;
    virtual void Install( double** sepcoeff );
    virtual void Install2( double a, double b ) =0;
    virtual void FormOutput( float* z );
    virtual void FormOut2( float& za, float& zb ) =0;
};

#endif // _SS2_h

// SS2 -*-c++-*- $Revision: 12.2 $ $Date: 1994/08/05 23:37:57 $

#include "Signal.h"
#include "SS2.h"
#include "SS2FIR.h"
#include "SS2Delays.h"
#include "SS2Gains.h"
// #include "SS2DelGains.h"
#include "RepnSS2.h"
#include "RepnSSFIR2.h"
// #include "RepnSS2DelGains.h"
#include <math.h>
#include <stdlib.h>

SS2::SS2() {}

void SS2::Sample( double* y )
{
    Sample2( y[0], y[1] );
}

void SS2::SampleNoStats( double* y )
{
    Sample2NoStats( y[0], y[1] );
}

double SS2::Evaluate( double** sepcoeff )
{
    return Eval2( sepcoeff[0][1], sepcoeff[1][0] );
}

void SS2::GoDownhill( double** sepcoeff, double amount )
{
    GoDownhill2( sepcoeff[0][1], sepcoeff[1][0], amount );
}

void SS2::Install( double** sepcoeff )
{
    Install2( sepcoeff[0][1], sepcoeff[1][0] );
}

void SS2::FormOutput( float* z )
{
    FormOut2( z[0], z[1] );
}

void SS2::Setup( int argc, char** argv,
                 SS2*& ss, RepnSS*& repn,
                 Signal& ya, Signal& yb,
                 char*& file1, char*& file2,
                 double& lo, double& hi,
                 long& samps, double& freq,
                 int& delays, int& crossecho, int& nrep )

{
    crossecho = 0;
    for( int iarg = 1; iarg < argc; iarg++ ) {
        if( argv[iarg][0] == '-' ) switch( argv[iarg][1] ) {
          case 'x': crossecho = 1; break;
          case 'g': delays = 0; break;
          case 'd': delays = 1; break;
          case 'f': delays = 2; break;
          // case '4': delays = 3; break;
          case 'r': nrep = atoi(argv[iarg]); break;
        } else {
            if( !file1 ) file1 = argv[iarg];
            else if( !file2 ) file2 = argv[iarg];
        }
    }

    // Read in signals
    ya.read_aifc( file1 );
    if( file2 ) yb.read_aifc( file2 );
    else yb.read_aifc( file1, 1 );
    samps = ya.get_samples();
    if( yb.get_samples() < samps ) samps = yb.get_samples();
    freq = ya.get_sampling_frequency();

    // Figure out sampling frequency and window size
#define WINDOW_TIME 0.01
    const long window_samps = long( freq * WINDOW_TIME + 0.5 );

    // Do SS2-dependent stuff
    switch( delays ) {
      case 0:
        ss = new SS2Gains;
        lo = 0;
        hi = 1;
        RepnSS2::Initialize( ss, lo, hi );
        RepnSS2::mutation_style = RepnSS2::MUTATE_GRADIENT_RESTARTS;
        // RepnSS2::mutation_style = RepnSS2::MUTATE_CUBIC;
        break;
      case 1:
        ss = new SS2Delays( 10, 10 );
        lo = -2;
        hi = 2;
        RepnSS2::Initialize( ss, lo, hi );
        RepnSS2::force_upper_triangle = 1;
        RepnSS2::mutation_style = RepnSS2::MUTATE_ONE_DIMENSION;
        break;
      case 2:
        ss = new SS2FIR( 0, 0 );
        lo = -2;
        hi = 2;
        RepnSSFIR2::Initialize( 2, (SS2FIR*)ss, lo, hi );
        RepnSSFIR2::mutation_style = RepnSSFIR2::MUTATE_GRADIENT_RESTARTS;
        break;
      // case 3:
      //   ss = new SS2DelGains( 10, 10 );
      //   lo = hi = 0;
      //   RepnSS2DelGains::Initialize( (SS2DelGains*)ss, 0, -2, 1, 2 );
      //   break;
    }

    ss->SetLeak( window_samps );
}

// SS2FIR -*-c++-*- $Revision: 12.3 $ $Date: 1994/08/05 23:37:00 $

// Class for accumulation of statistics for sound separation based
// on differential FIRs (separated omni mics plus echoes and reverb)

#ifndef _SS2FIR_h
#define _SS2FIR_h

#include "SS2.h"
#include "Modint.h"

class SS2FIR : public SS2 {
    friend class RepnSSFIR2;

    // Constructor, destructor
public:
    SS2FIR( int T_head, int T_tail );
    ~SS2FIR();
    virtual RepnSS* new_repn();
protected:
    int T_head, T_tail, T_tot;
    // int T2, T2_1;  // precomputed quantities

    // Running averages
public:
    virtual void ClearStats();
    virtual void Sample2( double, double );
    virtual void Sample2NoStats( double, double );
private:
    Modint trec;
    double *xbuf, *ybuf;  // Signals in recent past
    double *ex, *ey;      // Decayed signals [time]
    double **exx, **eyy;  // Autocorrelation [time][relative offset]
    double **exy, **eyx;  // Cross-correlation [time][relative offset]

    // Evaluate candidate separation parameters
public:
    virtual double Eval2FIR( double*, double* );
private:
    double eu, ev;
    double numer, f;
    int f_sign;

    // Go down gradient a little bit
public:
    void GoDownhillFIR2( double* fir_a, double* fir_b,
                         double maxamount, double minamount, int maxcount );
private:
    double *grad_a, *grad_b;

    // Apply separation parameters
public:
    virtual void Install2FIR( double*, double* );
    virtual void FormOut2( float&, float& );
private:
    double *installed_a, *installed_b;

    // Solve the binding problem
private:
    double *c;  // correlation scratch buffer
    double correlate_different_solutions( double* a1, double* b1,
                                          double* a2, double* b2,
                                          int u_vs_v_1, int u_vs_v_2 );
};

#endif // _SS2FIR_h

// SS2FIR -*-c++-*- $Revision: 12.3 $ $Date: 1994/08/05 23:37:00 $

// Mics    x, y
// Outputs u = x - b*y, v = y - a*x
//   where a and b are convolving functions
// F = <u v> - <u> <v>

#include "SS2FIR.h"
#include "RepnSSFIR2.h"
#include <math.h>
#include <stdlib.h>
#include "mystream.h"  // db

// ----------------------------------------------------------------
// Utilities
// ----------------------------------------------------------------

inline int min( int a, int b ) { return (a < b) ? a : b; }
inline int max( int a, int b ) { return (a > b) ? a : b; }

// ----------------------------------------------------------------
// Constructor, destructor
// ----------------------------------------------------------------

SS2FIR::SS2FIR( int T_head_, int T_tail_ )
{
    Modint::base = T_tot = (T_head = T_head_) + (T_tail = T_tail_) + 1;
    xbuf = new double [T_tot];
    ybuf = new double [T_tot];
    ex = new double [T_tot];
    ey = new double [T_tot];
    exx = new double* [T_tot];
    eyy = new double* [T_tot];
    exy = new double* [T_tot];
    eyx = new double* [T_tot];
    for( int t = 0; t < T_tot; t++ ) {
        exx[t] = new double [T_tot];
        eyy[t] = new double [T_tot];
        exy[t] = new double [T_tot];
        eyx[t] = new double [T_tot];
    }
    installed_a = new double [T_tot];
    installed_b = new double [T_tot];
    grad_a = new double [T_tot];
    grad_b = new double [T_tot];
    c = new double [T_tot * 2 - 1];

    ClearStats();
}

SS2FIR:t-SS2FIR() ( delete ex; delete ey; fort int t x 0; t < T.tot; t*+ delete exx[t] ; delete eyy(t) ; delete exy(t) ; delete βyx[t] ;

) delete exx; delete eyy; delete exy; delete eyx; delete xbuf; delete ybuf; delete inatallad_a; delete installed_b; delete grad_a; delete gradjb; delete c;

RepnSS* SS2FIR: :new_repn() ( return new RepnSSFIR2; )

// ----------------------------------------------------------------
// Running averages of signal products
// ----------------------------------------------------------------

// #define MODT(t) (((t)+T_tot)%T_tot)

void SS2FIR::ClearStats()
{
    for( int d = 0; d < T_tot; d++ ) xbuf[d] = ybuf[d] = 0;
    for( int t = 0; t < T_tot; t++ ) {
        ex[t] = ey[t] = 0;
        for( d = 0; d < T_tot; d++ ) exx[t][d] = eyy[t][d] = 0;
        for( d = 0; d < T_tot; d++ ) exy[t][d] = eyx[t][d] = 0;
    }
    trec = 0;
}

void SS2FIR::Sample2NoStats( double ya, double yb )
{
    Modint tprev = trec;
    ++trec;
    // trec = MODT( trec + 1 );

    // Update x, y
    xbuf[trec] = ya;
    ybuf[trec] = yb;
}

void SS2FIR::Sample2( double ya, double yb )
{
    Modint tprev = trec;
    // trec = MODT( trec + 1 );
    ++trec;

    double x_ = ya;
    double y_ = yb;

    // Update x, y
    xbuf[trec] = x_;
    ybuf[trec] = y_;

    // Update ex, ey
    ADD_LEAKILY(ex[trec], ex[tprev], x_);
    ADD_LEAKILY(ey[trec], ey[tprev], y_);

    // A minor optimization...
    double* exxnow = exx[trec];
    double* eyynow = eyy[trec];
    double* exynow = exy[trec];
    double* eyxnow = eyx[trec];
    double* exxprev = exx[tprev];
    double* eyyprev = eyy[tprev];
    double* exyprev = exy[tprev];
    double* eyxprev = eyx[tprev];

    // Update exx, eyy, exy, eyx
    Modint td = trec;
    for( int d = 0; d < T_tot; d++ ) {
        ADD_LEAKILY(exxnow[d], exxprev[d], x_*xbuf[td]);
        ADD_LEAKILY(eyynow[d], eyyprev[d], y_*ybuf[td]);
        ADD_LEAKILY(exynow[d], exyprev[d], x_*ybuf[td]);
        ADD_LEAKILY(eyxnow[d], eyxprev[d], y_*xbuf[td]);
        --td;
        // if( --td < 0 ) td = T_tot - 1;
    }
}
```
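Sample2 above maintains exponentially decayed ("leaky") running signal products through an ADD_LEAKILY macro that is not defined in this excerpt. A plausible reading, sketched here as a small helper under that assumption (the name and the first-order recursion are illustrative, not from the listing):

```cpp
// Hypothetical stand-in for the ADD_LEAKILY update used by Sample2:
// the accumulator becomes leak*previous + sample, so older samples
// decay geometrically with factor `leak` in (0,1).
struct LeakyAccumulator {
    double leak;    // decay factor; larger = longer effective memory
    double value;
    explicit LeakyAccumulator(double leak_) : leak(leak_), value(0.0) {}
    double add(double sample) {
        value = leak * value + sample;
        return value;
    }
};
```

With leak = 0.5, feeding a constant 1.0 converges geometrically toward 2.0 (the 1/(1-leak) steady state), which is why the class can treat these accumulators as short-time averages of recent signal products.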

// ----------------------------------------------------------------
// Evaluate candidate separation parameters
// ----------------------------------------------------------------

double SS2FIR::Eval2FIR( double* eval_a, double* eval_b )
{
    Modint tplay = trec - T_head;

    // Compute euv
    register double euv = exy[tplay][0];
    for( int d = 0; d < T_tot; d++ ) {
        Modint td = trec - min(T_head, d);
        int dd = abs(T_head - d);
        euv -= eval_a[d] * exx[td][dd];
        euv -= eval_b[d] * eyy[td][dd];
    }
#if 0
    for( d = 0; d < T_tot; d++ )
        for( int d2 = 0; d2 < T_tot; d2++ ) {
            if( d < d2 )
                euv += eval_a[d] * eval_b[d2] * exy[MODT(trec-d)][d2-d];
            else
                euv += eval_a[d] * eval_b[d2] * eyx[MODT(trec-d2)][d-d2];
        }
#else
    for( int d2 = 0; d2 < T_tot; d2++ ) {
        for( d = 0; d < d2; d++ )
            euv += eval_a[d] * eval_b[d2] * exy[trec-d][d2-d];
        for( d = d2; d < T_tot; d++ )
            euv += eval_a[d] * eval_b[d2] * eyx[trec-d2][d-d2];
    }
#endif

    // Compute numer
    numer = euv - eu * ev;

    // Compute c = convolve(a, b)
    for( int ic = 0; ic <= 2 * T_tot - 2; ic++ ) {
        c[ic] = 0;
        int j_lo = max( 0, ic - (T_tot-1) );
        int j_hi = min( T_tot-1, ic );
        for( int j = j_lo; j <= j_hi; j++ ) {
            c[ic] += eval_a[j] * eval_b[ic-j];
            // cout << j TB << eval_a[j] TB << eval_b[j] NL;
        }
    }
    c[2*T_head] -= 1;
    norm2_c = 0;
    for( ic = 0; ic <= 2 * T_tot - 2; ic++ )
        norm2_c += c[ic] * c[ic];

    // // Compute denom
    // norm2_a = 0;
    // norm2_b = 0;
    // for( d = 0; d < T_tot; d++ ) {
    //     norm2_a += eval_a[d] * eval_a[d];
    //     norm2_b += eval_b[d] * eval_b[d];
    //     // cout << d TB << eval_a[d] TB << eval_b[d] NL;
    // }

    // Return a function of euv
    double invdenom = (norm2_c == 0) ? 0 : 1./sqrt(norm2_c);
    double f = numer * invdenom;
    f_sign = (f >= 0) ? 1 : -1;
    double F = f * f;
    // cout << numer TB << norm2_c TB << invdenom TB << F NL;
#define MIN_F 1e-20
    if( F < MIN_F ) F = MIN_F;
    return log(F);
}

// ----------------------------------------------------------------
// Go downhill along gradient
// Reference: Lab notebook, 7/13/94 and 8/3/94
// ----------------------------------------------------------------

void SS2FIR::GoDownhillFIR2( double* fir_a, double* fir_b,
                             double maxamount, double minamount, int maxcount )
{
    double before = Eval2FIR( fir_a, fir_b );

    // Precompute some quantities
    double norm2 = 0;
    for( int i = 0; i < T_tot; i++ ) {

        // (da,db) will become the gradient of f
        Modint ti = trec - i;
        Modint tm = trec - min(T_head, i);
        int dd = abs(T_head - i);
        register double da = exx[tm][dd];
        register double db = eyy[tm][dd];
        for( int d = 0; d < i; d++ ) {
            Modint td = trec - d;
            da += fir_b[d] * eyx[td][i-d];
            db += fir_a[d] * exy[td][i-d];
        }
        for( d = i; d < T_tot; d++ ) {
            da += fir_b[d] * exy[ti][d-i];
            db += fir_a[d] * eyx[ti][d-i];
        }

        // (da,db) is now d(euv)/d(ai,bi)
        da -= eu * ex[ti];
        db -= ev * ey[ti];

        // (da,db) is now d(numer)/d(ai,bi)
        da *= norm2_c;
        db *= norm2_c;
        register double sumca = 0, sumcb = 0;
        for( int isum = 0; isum < T_tot; isum++ ) {
            sumca += c[isum + i] * fir_a[isum];
            sumcb += c[isum + i] * fir_b[isum];
        }
        da -= numer * sumcb;
        db -= numer * sumca;
        // da *= denom * a * b;
        // db *= denom * a * b;
        // da -= norm2_b * fir_a[d] * numer;
        // db -= norm2_a * fir_b[d] * numer;

        // (da,db) is now d(f')/d(ai,bi)
        da *= f_sign;
        db *= f_sign;

        // (da,db) is now proportional to f_sign(f') * d(f')/d(ai,bi)
        grad_a[i] = da;
        grad_b[i] = db;
        norm2 += da * da + db * db;
    }
    if( norm2 == 0 ) return;
    double scale = 1.0 / sqrt(norm2);
    for( i = 0; i < T_tot; i++ ) {
        grad_a[i] *= scale;
        grad_b[i] *= scale;
    }

    double after = before;
    for( double amount = maxamount; fabs(amount) > minamount; amount *= 0.1 ) {
        for( int sign = -1; sign <= 1; sign += 2 ) {
            while (1) {
                if( maxcount-- <= 0 ) break;
                for( i = 0; i < T_tot; i++ ) {
                    fir_a[i] += amount * sign * grad_a[i];
                    fir_b[i] += amount * sign * grad_b[i];
                }
                // cout << amount * sign TB << after NL;
                double eval2 = Eval2FIR( fir_a, fir_b );
                if( eval2 > after ) {
                    for( i = 0; i < T_tot; i++ ) {
                        fir_a[i] -= amount * sign * grad_a[i];
                        fir_b[i] -= amount * sign * grad_b[i];
                    }
                    break;
                }
                else after = eval2;
            }
            if( maxcount <= 0 ) break;
        }
        if( maxcount <= 0 ) break;
    }
    // cout << after - before TB << before TB << after NL;
}
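GoDownhillFIR2 normalizes the gradient to unit length and then tries steps of geometrically shrinking magnitude in either direction along it, keeping a step only while the score keeps improving. The same control flow on a 1-D toy objective (illustrative names; this is a sketch of the step-size schedule, not the patent's FIR-tap code):

```cpp
#include <cmath>

// Minimize f by trying steps of size amount, amount/10, ... along the
// descent direction, accepting while f decreases; this mirrors the
// shrinking-step loop of GoDownhillFIR2, which does the same over the
// fir_a/fir_b tap vectors with a unit-normalized gradient.
template <class F, class G>
double go_downhill(F f, G grad, double x,
                   double maxamount, double minamount, int maxcount)
{
    double best = f(x);
    for (double amount = maxamount; std::fabs(amount) > minamount; amount *= 0.1) {
        double dir = (grad(x) > 0) ? -1.0 : 1.0;   // unit "gradient" in 1-D
        while (maxcount-- > 0) {
            double trial = x + amount * dir;
            double ft = f(trial);
            if (ft >= best) break;                  // step stopped helping; shrink it
            x = trial;
            best = ft;
        }
        if (maxcount <= 0) break;
    }
    return x;
}
```

On f(x) = (x-3)^2 starting from 0 with maxamount 1, the unit steps walk to the minimum and the smaller step sizes are then rejected immediately, which is exactly the behavior the maxcount budget is there to bound.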

//----------------------------------------------------------------
// Apply separation parameters
//----------------------------------------------------------------

void SS2FIR::Install2FIR( double* a, double* b )
{
    for( int d = 0; d < T_tot; d++ ) {
        installed_a[d] = a[d];
        installed_b[d] = b[d];
    }
}

void SS2FIR::FormOut2( float& za, float& zb )
{
    // Compute (za, zb), i.e., colored outputs
    Modint tplay = trec - T_head;
    za = xbuf[ tplay ];
    zb = ybuf[ tplay ];
    tplay = trec;
    for( int d = 0; d < T_tot; d++ ) {
        za -= installed_b[d] * ybuf[tplay];
        zb -= installed_a[d] * xbuf[tplay];
        --tplay;
        // if( --tplay < 0 ) tplay = T_tot - 1;
    }
}

//----------------------------------------------------------------
// Naive solution of binding problem
//----------------------------------------------------------------

double SS2FIR::correlate_different_solutions( double* a1, double* b1,
                                              double* a2, double* b2,
                                              int u_vs_v_1, int u_vs_v_2 )

//----------------------------------------------------------------
// Test main
//----------------------------------------------------------------

#ifdef TEST
#include "mystream.h"
main( int argc, char** argv )
{
    cout << "SS2FIR has no test main\n";
    return 0;
}
#endif

// SS2Delays -*-c++-*- $Revision: 12.1 $ $Date: 1994/08/04 17:16:01 $

// Class for accumulation of statistics for 2x2 sound separation based
// on differential delays (separated omni mics)

#include "SS2FIR.h"

class SS2Delays : public SS2FIR {

    // Constructor, destructor
public:
    SS2Delays( int T_head, int T_tail );
    ~SS2Delays();
    virtual RepnSS* new_repn();

    // Evaluate candidate separation parameters
public:
    virtual double Eval2( double tau_a, double tau_b );
private:
    void compute_delay_convolution( double*, double tau, double desired_sum = 1 );
    double *sinc_a, *sinc_b;    // Sinc functions

    // Apply separation parameters
public:
    virtual void Install2( double, double );
};

// SS2Delays -*-c++-*- $Revision: 12.1 $ $Date: 1994/08/04 17:16:01 $

#include "SS2Delays.h"
#include "RepnSS2.h"
#include <math.h>
#include <stdlib.h>

// #include "mystream.h" // db

// ----------------------------------------------------------------
// Utilities
// ----------------------------------------------------------------

static double sinc( double x )
{
    if( x == 0 ) return 1;
    return sin(x) / x;
}

// ----------------------------------------------------------------
// Constructor, destructor
// ----------------------------------------------------------------

SS2Delays::SS2Delays( int T_head, int T_tail )
    : SS2FIR( T_head, T_tail )
{
    sinc_a = new double [T_tot];
    sinc_b = new double [T_tot];
}

SS2Delays::~SS2Delays()
{
    delete sinc_a;
    delete sinc_b;
}

RepnSS* SS2Delays::new_repn() { return new RepnSS2; }

// ----------------------------------------------------------------
// Evaluate candidate separation parameters
// ----------------------------------------------------------------

double SS2Delays::Eval2( double tau_a, double tau_b )
{
    // Compute sinc functions
    compute_delay_convolution( sinc_a, tau_a, -1 );
    compute_delay_convolution( sinc_b, tau_b, -1 );
    return Eval2FIR( sinc_a, sinc_b );
}

void SS2Delays::compute_delay_convolution( double* a, double tau,
                                           double desired_sum )
{
    double sum = 0;
    for( int d = 0; d < T_tot; d++ )
        sum += (a[d] = sinc( M_PI * (d - T_tail - tau) ));
    double scale = desired_sum / sum;
    for( d = 0; d < T_tot; d++ ) a[d] *= scale;
}

// ----------------------------------------------------------------
// Apply separation parameters
// ----------------------------------------------------------------

void SS2Delays::Install2( double tau_a, double tau_b )
{
    compute_delay_convolution( sinc_a, tau_a, -1 );
    compute_delay_convolution( sinc_b, tau_b, -1 );
    SS2FIR::Install2FIR( sinc_a, sinc_b );
}

#ifdef TEST
#include "mystream.h"
main( int argc, char** argv )
{
    cout << "SS2Delays has no test main\n";
    return 0;
}
#endif
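compute_delay_convolution builds an FIR approximation of a fractional-sample delay by sampling sinc(pi*(d - T_tail - tau)) on the tap grid and rescaling so the taps sum to desired_sum. A standalone sketch of that construction (same idea; the names and the local PI constant are illustrative, not from the listing):

```cpp
#include <cmath>
#include <vector>

static double sinc(double x) { return x == 0.0 ? 1.0 : std::sin(x) / x; }

// FIR taps approximating a delay of `center + tau` samples: sample the
// sinc kernel on the integer tap grid, then normalize so the taps sum
// to desired_sum (the listing passes -1, matching the u = x - b*y sign).
static std::vector<double> delay_taps(int ntaps, int center, double tau,
                                      double desired_sum)
{
    const double PI = 3.14159265358979323846;
    std::vector<double> a(ntaps);
    double sum = 0.0;
    for (int d = 0; d < ntaps; ++d)
        sum += (a[d] = sinc(PI * (d - center - tau)));
    for (int d = 0; d < ntaps; ++d)
        a[d] *= desired_sum / sum;
    return a;
}
```

For an integer delay (tau = 0) the sampled sinc collapses to a unit impulse at the center tap, so the filter is a pure delay; fractional tau spreads the energy over neighboring taps, which is how a continuous-valued delay parameter can be evaluated with the discrete Eval2FIR machinery.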

// Representation -*-c++-*- $Revision: 12.2 $ $Date: 1994/10/25 17:05:04 $

// Abstract base class for a solution-space representation to be used
// with stochastic optimization.  It is expected that problem-instance
// variables will be static, whereas solution-instance variables will
// not be.

#ifndef _Representation_h
#define _Representation_h

#include "Math.h"

class Representation : public Math {

    // Constructor, destructor
public:
    Representation();
    ~Representation();

    // Virtual methods
public:
    virtual void Randomize() = 0;
    virtual void Mutate( int exploit = 0 ) = 0;
    virtual void UndoMutation() = 0;
    virtual double Evaluate() = 0;
    virtual Representation* Clone() = 0;  // just allocs; doesn't copy contents
    virtual void Copy( const Representation* ) = 0;  // just copies; doesn't new
    virtual const char* to_string() = 0;
    virtual void from_string( char* ) = 0;
};

#endif // _Representation_h

// Representation -*-c++-*- $Revision: 12.2 $ $Date: 1994/10/25 17:05:04 $

#include "Representation.h"

// ----------------------------------------------------------------
// Representation itself
// ----------------------------------------------------------------

Representation::Representation()
{
}

Representation::~Representation()
{
}

// RepnSS -*-c++-*- $Revision: 12.2 $ $Date: 1994/10/25 17:04:54 $

// Abstract base class for representations of 2x2 sound separation.

#ifndef _RepnSS_h
#define _RepnSS_h

#include "Representation.h"

class SS;

class RepnSS : public Representation {

    // Constructor, destructor
public:
    RepnSS();
    ~RepnSS();
    static void Initialize( int nmic, SS*, double lo, double hi );
protected:
    static SS *ss;
    static double lo, hi;
    static double range;
    static int nmic;
    static char* buf;

    // Candidate separation parameters
public:
    double** sepcoeff;
private:
    double** sepcoeff_old;

    // Mutation styles (which affect expectation)
public:
    enum { MUTATE_CUBIC,
           MUTATE_ONE_DIMENSION,
           MUTATE_GRADIENT,
           MUTATE_GRADIENT_RESTARTS };
    static int mutation_style;

    // Virtual methods
public:
    virtual void Randomize();
    virtual void Mutate( int exploit = 0 );
    virtual void SaveForUndo();
    virtual void Undo();
    virtual double Evaluate();
    virtual Representation* Clone();  // just allocs; doesn't copy contents
    virtual void Copy( const Representation* );  // just copies; doesn't new
    virtual const char* to_string();
    virtual void from_string( char* );
    virtual void Follow( RepnSS*, double leak );

    // Form outputs given current separation parameters
public:
    virtual void Install();
    // virtual void FormOutput( float* );
};

#endif // _RepnSS_h

// RepnSS -*-c++-*- $Revision: 12.2 $ $Date: 1994/10/25 17:04:46 $

#include "RepnSS.h"
#include "SS.h"
#include "mystream.h"
#include <math.h>
#include <stdlib.h>
#include <string.h>

#define BUFLEN 1000

SS* RepnSS::ss = 0;
int RepnSS::nmic = 0;
double RepnSS::lo = 0;
double RepnSS::hi = 0;
double RepnSS::range = 0;
char* RepnSS::buf = 0;
int RepnSS::mutation_style = MUTATE_CUBIC;

#define ALL_IMIC for( int imic = 0; imic < nmic; imic++ )
#define ALL_IMIC_PAIRS \
    ALL_IMIC for( int imic2 = 0; imic2 < nmic; imic2++ ) \
        if( imic2 != imic )
#define SEPCOEFF sepcoeff[imic][imic2]
#define SEPCOEFF_OLD sepcoeff_old[imic][imic2]

RepnSS::RepnSS()
{
    sepcoeff = new double* [nmic];
    sepcoeff_old = new double* [nmic];
    ALL_IMIC {
        sepcoeff[imic] = new double [nmic];
        sepcoeff_old[imic] = new double [nmic];
    }
}

RepnSS::~RepnSS()
{
    ALL_IMIC {
        delete sepcoeff[imic];
        delete sepcoeff_old[imic];
    }
    delete sepcoeff;
    delete sepcoeff_old;
}

void RepnSS::Initialize( int nmic_, SS* ss_, double lo_, double hi_ )
{
    if( nmic != 0 ) exit(1);
    nmic = nmic_;
    ss = ss_;
    lo = lo_;
    hi = hi_;
    range = hi - lo;
    buf = new char [BUFLEN+2];
}

void RepnSS::Randomize()
{
    ALL_IMIC_PAIRS
        SEPCOEFF = random_double( lo, hi );
}

void RepnSS::Mutate(int)
{
    // perturb
    switch( mutation_style ) {
    case MUTATE_CUBIC: {
        ALL_IMIC_PAIRS {
            register double r = random_double(-1,1);
            SEPCOEFF += r*r*r * range;
        }
    } break;
    case MUTATE_ONE_DIMENSION: {
        int imic = random_int(0,nmic-1);
        int imic2 = random_int(1,nmic-1);
        if( imic2 == imic ) imic2 = 0;
        if( random_char(0,1) ) {
            register double r = random_double(-1,1);
            SEPCOEFF += r*r*r * range;
        }
        else
            SEPCOEFF = random_double(lo,hi);
    } break;
    case MUTATE_GRADIENT: {
        double amount = 0.1 * range * exp(random_double(-3,0));
        // random_double(0, (hi-lo)*0.01);
        ss->GoDownhill( sepcoeff, amount );
    } break;
    case MUTATE_GRADIENT_RESTARTS:
        if( random_int(0,9) < 1 ) {
            ALL_IMIC_PAIRS {
                register double r = random_double(-1,1);
                SEPCOEFF += r*r*r * range;
            }
        }
        else {
            double amount = 0.1 * range * exp(random_double(-3,0));
            // random_double(0, (hi-lo)*0.01);
            ss->GoDownhill( sepcoeff, amount );
        }
        break;
    }

    ALL_IMIC_PAIRS {
        // wrap once
        if( SEPCOEFF < lo ) SEPCOEFF += range;
        if( SEPCOEFF > hi ) SEPCOEFF -= range;

        // confine
        if( SEPCOEFF < lo ) SEPCOEFF = lo;
        if( SEPCOEFF > hi ) SEPCOEFF = hi;
    }
}

void RepnSS::SaveForUndo()
{
    ALL_IMIC_PAIRS
        SEPCOEFF_OLD = SEPCOEFF;
}

void RepnSS::Undo()
{
    ALL_IMIC_PAIRS
        SEPCOEFF = SEPCOEFF_OLD;
}

double RepnSS::Evaluate()
{
    return ss->Evaluate(sepcoeff);
}

// void RepnSS::FormOutput( float* z )
// {
//     ss->FormOutput(z);
// }

Representation* RepnSS::Clone() { return new RepnSS; }

void RepnSS::Copy( const Representation* that_ )
{
    RepnSS* that = (RepnSS*) that_;
    ALL_IMIC_PAIRS
        SEPCOEFF = that->SEPCOEFF;
}

const char* RepnSS::to_string()
{
    *buf = '\0';
    ALL_IMIC_PAIRS
        strcat( buf, form("%lf ", SEPCOEFF) );
    buf[strlen(buf)-1] = '\0';
    return buf;
}

void RepnSS::from_string( char* )
{
}

void RepnSS::Follow( RepnSS* target, double leak )
{
    ALL_IMIC_PAIRS
        SEPCOEFF = SEPCOEFF * (1-leak) + target->SEPCOEFF * leak;
}

void RepnSS::Install()
{
    ss->Install(sepcoeff);
}
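The MUTATE_CUBIC style perturbs each coefficient by r^3 * range with r uniform on (-1,1); cubing concentrates the distribution near zero, so most mutations are small refinements while occasional large jumps remain possible. A sketch of that perturb-wrap-confine step for one coefficient (the uniform draw is passed in because the listing's random_double helper is not shown here):

```cpp
// One-coefficient version of RepnSS::Mutate's MUTATE_CUBIC case:
// step = r^3 * range for r in (-1,1), then wrap once and confine to
// [lo, hi], matching the post-mutation cleanup in the listing.
static double mutate_cubic(double coeff, double lo, double hi, double r)
{
    double range = hi - lo;
    coeff += r * r * r * range;   // cubic law: small |r| -> tiny step

    // wrap once
    if (coeff < lo) coeff += range;
    if (coeff > hi) coeff -= range;

    // confine
    if (coeff < lo) coeff = lo;
    if (coeff > hi) coeff = hi;
    return coeff;
}
```

With r = 0.5 the step is only 0.125 of the range, while r = 1 moves a full range and is wrapped back into bounds; that balance between exploitation and exploration is the stated point of the cubic law.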

// RepnSS2 -*-c++-*- $Revision: 12.1 $ $Date: 1994/08/04 18:47:36 $

// Abstract base class for representations of 2x2 sound separation.

#ifndef _RepnSS2_h
#define _RepnSS2_h

#include "RepnSS.h"

class SS2;

class RepnSS2 : public RepnSS {

    // Constructor, destructor
public:
    RepnSS2();
    ~RepnSS2();
    static void Initialize( SS*, double lo, double hi );
    // static void set_SS2( SS2* );
    // static void set_bounds( double alo, double ahi,
    //                         double blo, double bhi );
    static int force_upper_triangle;
private:
    // static SS2 *ss2;
    // static double alo, ahi, blo, bhi;
    // static double arange, brange;

    // // Candidate separation parameters
    // public:
    //     double a, b;
    // private:
    //     double a_old, b_old;

    // // Mutation styles (which affect expectation)
    // public:
    //     enum { MUTATE_CUBIC, MUTATE_ONE_DIMENSION };
    //     static int mutation_style;

    // Virtual methods
public:
    // virtual void Randomize();
    // virtual void Mutate();
    // virtual void SaveForUndo();
    // virtual void Undo();
    // virtual double Evaluate();
    // virtual Representation* Clone();  // just allocs; doesn't copy contents
    // virtual void Copy( const Representation* );  // just copies; doesn't new
    virtual const char* to_string();
    // virtual void from_string( char* );
    // virtual void Follow( RepnSS*, double leak );

    // Interface to RepnSS
private:
    // static double** scoeff;
    static float fval[2];
public:
    // double Eval2( double a, double b );
    void FormOut2( float& za, float& zb );

    // // Form outputs given current separation parameters
    // private:
    //     float fval[2];
    // public:
    //     void Install();
    //     void FormOut2( float&, float& );
};

#endif // _RepnSS2_h

// RepnSS2 -*-c++-*- $Revision: 12.1 $ $Date: 1994/08/04 18:47:36 $

#include "RepnSS2.h"
#include "SS2.h"
#include "mystream.h"

int RepnSS2::force_upper_triangle = 0;

#define FORCE_UPPER_TRIANGLE \
    if( force_upper_triangle && a+b < 0 ) \
        { double swap = -a; a = -b; b = swap; }

// SS2 *RepnSS2::ss2 = 0;
// double RepnSS2::alo = 0;
// double RepnSS2::ahi = 0;
// double RepnSS2::blo = 0;
// double RepnSS2::bhi = 0;
// double RepnSS2::arange;
// double RepnSS2::brange;
// int RepnSS2::mutation_style = MUTATE_CUBIC;

RepnSS2::RepnSS2()
{
}

RepnSS2::~RepnSS2()
{
}

// double** RepnSS2::scoeff = 0;
float RepnSS2::fval[2];

void RepnSS2::Initialize( SS* ss_, double lo_, double hi_ )
{
    RepnSS::Initialize( 2, ss_, lo_, hi_ );
    // scoeff = new double* [2];
    // scoeff[0] = new double [2];
    // scoeff[1] = new double [2];
}

// void RepnSS2::set_SS2( SS2* ss2_ ) { ss2 = ss2_; }

// void RepnSS2::set_bounds( double alo_, double ahi_,
//                           double blo_, double bhi_ )
// {
//     alo = alo_;
//     ahi = ahi_;
//     blo = blo_;
//     bhi = bhi_;
//     arange = ahi - alo;
//     brange = bhi - blo;
// }

// void RepnSS2::Randomize()
// {
//     a = random_double( alo, ahi );
//     b = random_double( blo, bhi );
//     FORCE_UPPER_TRIANGLE;
// }

// void RepnSS2::Mutate(int)
// {
//     // perturb
//     switch( mutation_style ) {
//     case MUTATE_CUBIC:
//         register double ra = random_double(-1,1);
//         register double rb = random_double(-1,1);

// double RepnSS2::Evaluate()
// {
//     return ss2->Evaluate(a,b);
// }

// void RepnSS2::Install()
// {
//     ss2->Install( a, b );
// }

// double RepnSS2::Eval2( double a, double b )
// {
//     scoeff[0][1] = a;
//     scoeff[1][0] = b;
//     return RepnSS::Evaluate(scoeff);
// }

// void RepnSS2::FormOut2( float& za, float& zb )
// {
//     RepnSS::FormOutput(fval);
//     za = fval[0];
//     zb = fval[1];
// }

// Representation* RepnSS2::Clone()
// {
//     return new RepnSS2;
// }

// void RepnSS2::Copy( const Representation* that_ )
// {
//     RepnSS2* that = (RepnSS2*) that_;
//     a = that->a;
//     b = that->b;
// }

const char* RepnSS2::to_string()
{
    return form( "%lf %lf", sepcoeff[0][1], sepcoeff[1][0] );
}

// void RepnSS2::from_string( char* )
// {
// }

// void RepnSS2::Follow( RepnSS* target_, double leak )
// {
//     RepnSS2* target = (RepnSS2*) target_;
//     a = a * (1-leak) + target->a * leak;
//     b = b * (1-leak) + target->b * leak;
// }

// Searcher -*-c++-*- $Revision: 12.1 $ $Date: 1994/10/25 17:05:37 $

// Abstract base class for a stochastic search engine.

#ifndef _Searcher_h
#define _Searcher_h

#include "Representation.h"

class Searcher : public Math {

    // Constructor, destructor
public:
    Searcher();
    ~Searcher();

    // Virtual methods
public:
    virtual void Initialize() = 0;
    virtual void Iterate( short niter = 1 ) = 0;
    virtual Representation* get_best() = 0;
    virtual double get_best_score() = 0;
};

#endif // _Searcher_h

// Searcher -*-c++-*- $Revision: 12.1 $ $Date: 1994/10/25 17:05:37 $

#include "Searcher.h"

Searcher::Searcher() { }
Searcher::~Searcher() { }
void Searcher::Initialize() { }
void Searcher::Iterate( short ) { }
Representation* Searcher::get_best() { return 0; }
double Searcher::get_best_score() { return 0; }

// SrchSS -*-c++-*- $Revision: 12.0 $ $Date: 1994/08/04 00:23:25 $

// Abstract base class for an adaptive stochastic search engine for
// sound separation

#ifndef _SrchSS_h
#define _SrchSS_h

#include "Searcher.h"

class SrchSS : public Searcher {

    // Constructor, destructor
public:
    SrchSS();
    ~SrchSS();

    // Virtual methods
public:
    virtual void Initialize() = 0;
    virtual void Iterate( short niter = 1 ) = 0;
    virtual Representation* get_best() = 0;
    virtual double get_best_score() = 0;

    // Some popular random-number generators
protected:
    static short random_short( short lo, short hi );
    static double random_fraction();
};

#endif // _SrchSS_h

// SrchSS -*-c++-*- $Revision: 12.0 $ $Date: 1994/08/04 00:23:25 $

#include "SrchSS.h"

SrchSS::SrchSS() {}
SrchSS::~SrchSS() {}

// SrchSShc -*-c++-*- $Revision: 12.1 $ $Date: 1994/10/25 18:02:25 $

// Hill climber for sound separation

#ifndef _SrchSShc_h
#define _SrchSShc_h

#include "SrchSS.h"

class RepnSS;
class SrchSSpop;

class SrchSShc : public SrchSS {

    // Constructor, destructor
public:
    SrchSShc( RepnSS* repn );
    ~SrchSShc();

    // State
private:
    RepnSS* hc;
    double latest_hc_score;

    // Attached population (optionally used for getting parents)
public:
    SrchSSpop* pop;
    int maxrank;
    double rank_factor;

    // Virtual methods
public:
    virtual void Initialize();
    virtual void Iterate( short niter = 1 );
    virtual Representation* get_best();
    virtual double get_best_score();
};

#endif // _SrchSShc_h

// SrchSShc -*-c++-*- $Revision: 12.1 $ $Date: 1994/10/25 18:02:25 $

#include "SrchSShc.h"
#include "SrchSSpop.h"
#include "RepnSS.h"
#include "mystream.h" // db

SrchSShc::SrchSShc( RepnSS* repn )
{
    hc = repn;
    pop = 0;
    maxrank = 0;
    rank_factor = 0.8;
    Initialize();
}

SrchSShc::~SrchSShc()
{
    delete hc;
}

void SrchSShc::Initialize()
{
    hc->Randomize();
    latest_hc_score = hc->Evaluate();
}

void SrchSShc::Iterate( short niter )
{
    int exploit = 0;
    while( niter-- > 0 ) {
        double old_score = hc->Evaluate();
        hc->SaveForUndo();
        if( pop )
            hc->Copy( pop->get_someone_good( maxrank, rank_factor ) );
        hc->Mutate( exploit );
        exploit = 1;
        double new_score = hc->Evaluate();
        if( new_score > old_score ) {
            hc->Undo();
            latest_hc_score = old_score;
        }
        else {
            latest_hc_score = new_score;
        }
    }
}

Representation* SrchSShc::get_best() { return hc; }
double SrchSShc::get_best_score() { return latest_hc_score; }
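SrchSShc::Iterate follows the classic save/mutate/re-evaluate/undo pattern: a mutation is kept when the score does not worsen and rolled back otherwise (lower scores are better throughout these searchers, since income is defined as -eval). The same skeleton on a plain double state, with the Representation calls replaced by injected callables for illustration:

```cpp
// Minimal hill-climb loop mirroring SrchSShc::Iterate: save the state,
// mutate, re-evaluate, and undo if the move made things worse.
// Lower eval is better, as in the listing.
template <class Eval, class Mutate>
double hill_climb(Eval eval, Mutate mutate, double state, int niter)
{
    double score = eval(state);
    while (niter-- > 0) {
        double saved = state;            // SaveForUndo()
        state = mutate(state);           // Mutate()
        double new_score = eval(state);
        if (new_score > score)
            state = saved;               // Undo(): keep the old state
        else
            score = new_score;
    }
    return state;
}
```

With a deterministic mutation of +0.5 and eval (x-2)^2, the climber steps from 0 to the minimum at 2 and then rejects every further move, which is the same absorbing behavior the searcher relies on between suggestions.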

// ----------------------------------------------------------------
// Test main
// ----------------------------------------------------------------

#ifdef TEST

#define DELAYS 0
#if DELAYS
#include "SS2Delays.h"
#else
#include "SS2Gains.h"
#endif
#include "mystream.h"
#include "InventorSS2.h"
#include "InventorWindow.h"
#include "RepnSS2.h"
#include "Signal.h"

static SrchSShc* srch = 0;
void searcher( long ) { srch->Iterate(); }

main( int argc, char** argv )
{
    cout.setf( ios::unitbuf );

    Signal ya, yb;
    ya.read_aifc( argv[1] );
    yb.read_aifc( argv[2] );

    // Do SS2-dependent stuff
#if DELAYS
    SS2Delays ss2(10);
    double lo = -2;
    double hi = 2;
#else
    SS2Gains ss2;
    double lo = 0;
    double hi = 1;
#endif

    // Start up a new searcher
#define WINDOW_TIME 0.05
    ss2.SetLeak( ya.get_sampling_frequency() * WINDOW_TIME );
    RepnSS2::set_SS2( &ss2 );
    RepnSS2::set_bounds( lo, hi, lo, hi );
    RepnSS2* hc = new RepnSS2();
    srch = new SrchSShc( hc );

    // Start the graphics
#define MARKER_DIM 0.01
#define MARKER_ALT 0.001
    InventorSS2 invss2( &ss2, lo, hi, 10, lo, hi, 20 );
    invss2.allocate_squaresets( 1 );
    invss2.register_squareset( 0, 1,
                               &hc,
                               MARKER_DIM, MARKER_ALT, 0.1, 1, 0.1,
                               SoKeyboardEvent::H );

    // Run!
#if DELAYS
    invss2.max_catchup = 100;
#endif
    invss2.Run( &ya, &yb, searcher, 100, new InventorWindow("SS2") );
    return 0;
}
#endif

// SrchSSpop -*-c++-*- $Revision: 12.1 $ $Date: 1994/08/05 23:40:37 $

// Hypothesis population for sound separation

#ifndef _SrchSSpop_h
#define _SrchSSpop_h

#include "SrchSS.h"

class RepnSS;

class SrchSSpop : public SrchSS {

    // Constructor, destructor
public:
    SrchSSpop( RepnSS* repn, int npop, SrchSS* suggester );
    ~SrchSSpop();
private:
    SrchSS* suggester;

    // Parameters
public:
    double min_iterations_to_max_assets;
    double max_income_diff_per_iteration;
    int iterations_per_suggestion;
    int suggestions_per_sample;
    double replacement_threshold;

    // State
private:
    RepnSS* follower;
    int npop;
    int ipop_median_income;
    int ipop_median_assets;
    double inv_npop;
    RepnSS** pop;
    double* eval;
    double* income;
    double* assets;
    short* income_invrank;
    short* assets_invrank;
    double* colorfrac;
public:
    RepnSS** get_pop() { return pop; }
    double* get_colorfrac() { return colorfrac; }

    // Mostly virtual...
public:
    virtual void Initialize();
    virtual void Iterate( short niter = 1 );
    virtual Representation* get_best();
    virtual double get_best_score();
    virtual Representation* get_someone_good( int maxrank, double rank_factor );
    RepnSS* get_follower() { return follower; }
private:
    int richest;
    double richest_score;

    // Qsort support
private:
    static int invrank_decreasing( const void*, const void* );
    static short* qsort_invrank;
    static double* qsort_value;
};

#endif // _SrchSSpop_h

// SrchSSpop -*-c++-*- $Revision: 12.1 $ $Date: 1994/08/05 23:40:37 $

#include "SrchSSpop.h"
#include "RepnSS2.h"
#include <Inventor/Xt/viewers/SoXtExaminerViewer.h>
#include <math.h>
#include <stdlib.h>
#include "mystream.h" // db

SrchSSpop::SrchSSpop( RepnSS* repn, int npop_, SrchSS* s_ )
{
    suggester = s_;
    follower = repn;
    npop = npop_;
    ipop_median_income = 5;
    ipop_median_assets = npop / 2;
    inv_npop = 1. / npop;
    pop = new RepnSS* [npop];
    pop[0] = repn;
    for( short i = 0; i < npop; i++ ) pop[i] = (RepnSS*) repn->Clone();
    eval = new double [npop];
    income = new double [npop];
    assets = new double [npop];
    income_invrank = new short [npop];
    assets_invrank = new short [npop];
    colorfrac = new double [npop];
    for( i = 0; i < npop; i++ ) income_invrank[i] = assets_invrank[i] = i;

    // default values
    min_iterations_to_max_assets = 10;
    max_income_diff_per_iteration = 10;
    iterations_per_suggestion = 10;
    suggestions_per_sample = 1;
    replacement_threshold = -1;

    Initialize();
}

SrchSSpop::~SrchSSpop()
{
    for( int i = 0; i < npop; i++ ) delete pop[i];
    delete pop;
    delete eval;
    delete income;
    delete assets;
    delete income_invrank;
    delete assets_invrank;
    delete colorfrac;
}

void SrchSSpop::Initialize()
{
    richest = -1;
    double richest_assets = 0;
    for( int i = 0; i < npop; i++ ) {
        pop[i]->Randomize();
#if 0
        assets[i] = -pop[i]->Evaluate();
        if( richest < 0 || assets[i] > richest_assets ) {
            richest = i;
            richest_assets = assets[i];
        }
#else
        assets[i] = 0;
        richest = 0;
        richest_assets = 0;
#endif
    }
    follower->Randomize();
}

#define MAX_ASSETS (min_iterations_to_max_assets * max_income_diff_per_iteration)
#define INV_MAX_ASSETS (1. / MAX_ASSETS)

short* SrchSSpop::qsort_invrank = 0;
double* SrchSSpop::qsort_value = 0;

int SrchSSpop::invrank_decreasing( const void* one_, const void* two_ )
{
    short one = * (short*) one_;
    short two = * (short*) two_;
    if( qsort_value[one] > qsort_value[two] ) return -1;
    if( qsort_value[one] < qsort_value[two] ) return 1;
    return 0;
}

void SrchSSpop::Iterate( short )
{
    // Evaluate population and compute gross income
    for( int ip = 0; ip < npop; ip++ ) {
        eval[ip] = pop[ip]->Evaluate();
        income[ip] = -eval[ip];
    }

    // // Find median income
    // qsort_invrank = income_invrank;
    // qsort_value = income;
    // qsort( qsort_invrank, npop, sizeof(short), SrchSSpop::invrank_decreasing );
    // double median_income = income[income_invrank[ipop_median_income]];

    // // Compute income and mean income
    // double mean_income = 0;
    // for( int ip = 0; ip < npop; ip++ ) {
    //     eval[ip] = pop[ip]->Evaluate();
    //     income[ip] = -eval[ip];
    //     mean_income += income[ip];
    // }
    // mean_income *= inv_npop;

    // Pay income
    for( ip = 0; ip < npop; ip++ ) {
        // income[ip] -= median_income;
        // if( income[ip] < -max_income_diff_per_iteration )
        //     income[ip] = -max_income_diff_per_iteration;
        // if( income[ip] > max_income_diff_per_iteration )
        //     income[ip] = max_income_diff_per_iteration;
        assets[ip] += income[ip];
    }

    // Rank assets
    qsort_invrank = assets_invrank;
    qsort_value = assets;
    qsort( qsort_invrank, npop, sizeof(short), SrchSSpop::invrank_decreasing );
    // double median_assets = assets[assets_invrank[ipop_median_assets]];

    // for( int idb = 0; idb < npop; idb++ )
    //     cout << idb TB
    //          << income[income_invrank[idb]] TB
    //          << assets[assets_invrank[idb]] TB
    //          << eval[assets_invrank[idb]] NL;
    // cout NL;

    // // Compute mean assets
    // double mean_assets = 0;
    // for( ip = 0; ip < npop; ip++ ) mean_assets += assets[ip];
    // mean_assets *= inv_npop;

    // Make richest have MAX_ASSETS
    richest = assets_invrank[0];
    double richest_assets = assets[richest];
    for( ip = 0; ip < npop; ip++ ) {
        assets[ip] += MAX_ASSETS - richest_assets;
    }
    richest_assets = MAX_ASSETS;

    // // Make assets have zero median, and apply limit; find richest and poorest
    // richest_assets = 0;
    // richest = -1;
    // for( ip = 0; ip < npop; ip++ ) {
    //     assets[ip] -= median_assets;
    //     if( assets[ip] > MAX_ASSETS ) assets[ip] = MAX_ASSETS;
    //     if( assets[ip] < -MAX_ASSETS ) assets[ip] = -MAX_ASSETS;
    //     if( richest < 0 || assets[ip] > richest_assets ) {
    //         richest_assets = assets[ip];
    //         richest = ip;
    //     }
    // }

    // Solicit suggestions
    double old_sug_score = 0;
    for( int is = 0; is < suggestions_per_sample; is++ ) {

        // Update suggester
        suggester->Iterate( iterations_per_suggestion );
        double sug_score = suggester->get_best_score();

        // If this suggestion is same as last, ignore it
        if( is > 0 && sug_score == old_sug_score ) continue;
        old_sug_score = sug_score;

        // If suggester not much better than best, ignore it
        if( sug_score - eval[richest] > replacement_threshold ) continue;

        // Replace poorest
        // int poorest = -1;
        // double poorest_amount = 0;
        // for( ip = 0; ip < npop; ip++ ) {
        //     if( poorest < 0 || assets[ip] < poorest_amount ) {
        //         poorest_amount = assets[ip];
        //         poorest = ip;
        //     }
        // }
        int poorest_rank = npop - 1 - is;
        if( poorest_rank < 0 ) poorest_rank = 0;
        int poorest = assets_invrank[ poorest_rank ];
        pop[poorest]->Copy( suggester->get_best() );
        assets[poorest] = 0;
    }

    // Update colorfrac
    for( ip = 0; ip < npop; ip++ )
        colorfrac[ip] = (assets[ip] / MAX_ASSETS) * 0.5 + 0.5;
    // for( ip = 0; ip < npop; ip++ ) cout << colorfrac[ip] NL;
    // cout NL;

    // Follower
#define FOLLOWER_LEAK 0.1
    // cout << richest TB << assets_invrank[0] TB
    //      << pop[richest]->to_string() NL;
    follower->Follow( pop[richest], FOLLOWER_LEAK );
    follower->Install();
}

Representation* SrchSSpop::get_best() { return pop[richest]; }

#define GRAN 100000
Representation* SrchSSpop::get_someone_good( int maxrank, double rank_factor )
{
    if( maxrank == 0 ) return pop[ income_invrank[0] ];
    double prob_unif = rank_factor / (1 + maxrank * (1 - rank_factor));
    long rank = random() % (maxrank + 1);
    if( (rank > random() % maxrank)
        && (random() % GRAN >= prob_unif * GRAN) )
        rank = maxrank - rank;
    return pop[ income_invrank[rank] ];
}
#undef GRAN

double SrchSSpop::get_best_score() { return eval[richest]; }
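get_someone_good draws a rank uniformly and then, with a probability controlled by rank_factor, reflects large draws toward the top of the income ranking, so better-ranked hypotheses are chosen as parents more often while every rank stays reachable. A deterministic sketch of that selection rule, with the listing's three random() calls replaced by injected draws so the behavior is reproducible here:

```cpp
// Rank-selection rule from SrchSSpop::get_someone_good, with the three
// random() draws passed in explicitly. Rank 0 is the best (highest-
// income) hypothesis; the reflection biases selection toward it.
static long biased_rank(int maxrank, double rank_factor,
                        long draw1, long draw2, long draw3)
{
    const long GRAN = 100000;
    if (maxrank == 0) return 0;
    double prob_unif = rank_factor / (1 + maxrank * (1 - rank_factor));
    long rank = draw1 % (maxrank + 1);
    if (rank > draw2 % maxrank && draw3 % GRAN >= prob_unif * GRAN)
        rank = maxrank - rank;   // reflect a poor draw toward the top ranks
    return rank;
}
```

For maxrank = 9 and rank_factor = 0.8, a draw of rank 7 is usually reflected to rank 2 (near the top), but survives unreflected when the third draw lands inside the uniform-selection fraction; that residual uniform mass is what keeps low-ranked hypotheses occasionally selected.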

// Test main

// - tifdef TEST tinclude *SS2Delays.h" tinclude "SS2Gaina.h" tinclude "Inven orSS2.h" tinclude 'InventorWindow.h* tinclude "RepnSS.h" tinclude "SrchSShc.h" tinclude "SrchSStourn.h* tinclude "Signal.h" tinclude "mystream.h"

stβtic SrehSSpop* srch x 0; void searcher( long ) ( srch->Iterate() ; ) main! int argc, char** argv ) ( cout.aetf( ioβ::unitbuf );

// Parae command-line args and βet up soundsep problem char *filel = 0, *file2 x 0; int delays = 0; int nrep x 1;

Signal ya, yb; long aampa x 0; double freq = 0; double lo x 0, hi x 0;

SS2::Setup! argc, argv. ya, yb, filel, filβ2, lo, hi, samps, fraq, delβya, nrep );

RepnSS2* repn = new RepnSS2;

// Read in aignalβ and deal with lengths, frequencies freq « 8000; if! delayβ ) { ya.set.saβpling. requency! freq ); yb.set.βampling.frequency! freq ); ya.normalizβ.varlance() ; yb.normallze.variance() ; )

Signal za( samps, freq ) , zb( samps, fraq ) ; if( yb.get.aamplββ!) < βamps ) samps x yb.get.samples!) ; tdefine WINDOW.TIME 0.01 const long window.samps long( freq * WINDOW.TIME * 0.5 ) ;

// Start up a new searcher

RepnSS2* suggest = (RepnSS2*) repn->Clone() ; tdefine NPOP 30

SrchSShc* βuggeβter = new SrchSShc(suggest) ; srch x new SrehSSpop( repn, NPOP, suggester ); suggester->pop srch; suggaater->maxrank = NPOP-1; βuggeβter->rank_ actor = 0.8;

RepnSS2* follower (RepnSS2*) βrch->get_followβr() ,-

// Start the graphics tdefine MARKER.DIM 0.01 tdefine MARKER- LT 0.001

    InventorWindow* invwin = new InventorWindow( "SS" );
    InventorSS2 invss2( &ss2, lo, hi, 20, lo, hi, 10, Inventor::iCGPea );
    invss2.allocate_squaresets( 3 );
    invss2.register_squareset( 0, 1, &suggest,
                               MARKER_DIM*3, MARKER_ALT * 0.98,
                               SoKeyboardEvent::H, 0,
                               .4, .4, .4 );
    invss2.register_squareset( 1, NPOP,
                               (RepnSS2**) srch->get_pop(),
                               MARKER_DIM, MARKER_ALT,
                               SoKeyboardEvent::P, 1,
                               srch->get_color_frac(), Inventor::BMRY );
    invss2.register_squareset( 2, 1, &follower,
                               MARKER_DIM*2, MARKER_ALT * 0.99,
                               SoKeyboardEvent::F, 1,
                               0, 0, 0 );

    // Set some parameters
    if( delays ) {
        srch->min_iterations_to_max_assets = 10;
        srch->max_income_diff_per_iteration = 10;
        srch->iterations_per_suggestion = 3;
        srch->suggestions_per_sample = 3;
        invss2.max_catchup = 100;
    } else {
        srch->min_iterations_to_max_assets = 50;
        srch->max_income_diff_per_iteration = 10;
        srch->iterations_per_suggestion = 5;
        srch->suggestions_per_sample = 5;
    }
    srch->replacement_threshold = 0;

    // Run!
    invwin->get_viewer()->setViewing( FALSE );
    invss2.Run( &ya, &yb, searcher, window_samps, invwin );
    return 0;

}
#endif

// SrchSStourn -*-c++-*- $Revision: 12.0 $ $Date: 1994/08/04 00:23:30 $

// Tournaments for sound separation

#ifndef _SrchSStourn_h
#define _SrchSStourn_h

#include "SrchSS.h"

class RepnSS;

class SrchSStourn : public SrchSS {

// Constructor, destructor
public:

    SrchSStourn( RepnSS* repn );
    ~SrchSStourn();

// State
private:

    RepnSS* best;
    double best_score;

// Virtual methods
public:
    virtual void Initialize();
    virtual void Iterate( short niter = 1 );
    virtual Representation* get_best();
    virtual double get_best_score();

};

#endif // _SrchSStourn_h

// SrchSStourn -*-c++-*- $Revision: 12.0 $ $Date: 1994/08/04 00:23:30 $

#include "SrchSStourn.h"
#include "RepnSS.h"

// #include "mystream.h" // db

SrchSStourn::SrchSStourn( RepnSS* repn )
{
    best = repn;
    Initialize();
}

SrchSStourn::~SrchSStourn()
{
    delete best;
}

void SrchSStourn::Initialize()
{
    Iterate();
}

void SrchSStourn::Iterate( short niter )
{
    best->Randomize();
    best_score = best->Evaluate();
    for( int iter = 1; iter < niter; iter++ ) {
        best->SaveForUndo();
        best->Randomize();
        double new_score = best->Evaluate();
        if( new_score < best_score )    // worse than current best: revert
            best->Undo();
        else
            best_score = new_score;
    }
}

Representation* SrchSStourn::get_best() { return best; }
double SrchSStourn::get_best_score() { return best_score; }
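`SrchSStourn::Iterate` is a randomize-and-keep-best loop: propose a random parameter set, evaluate it, and undo the move unless the score improves. The same pattern can be sketched as a self-contained function over a single parameter with a toy scoring function (all names here are hypothetical stand-ins, not the classes in this listing; higher score is better, as with the cumulative performance values in the abstract):

```cpp
#include <cassert>
#include <cstdlib>

// Toy stand-in for Evaluate(): peaks at x = 0.25, higher is better.
static double score(double x) { return -(x - 0.25) * (x - 0.25); }

// Randomize-and-keep-best search over one parameter in [0, 1].
double keep_best_search(int niter, unsigned seed) {
    std::srand(seed);
    double best = std::rand() / (double)RAND_MAX;   // Randomize()
    double best_score = score(best);
    for (int iter = 1; iter < niter; ++iter) {
        double saved = best;                        // SaveForUndo()
        best = std::rand() / (double)RAND_MAX;      // Randomize()
        double new_score = score(best);
        if (new_score < best_score)
            best = saved;                           // Undo(): keep old point
        else
            best_score = new_score;
    }
    return best;
}
```

With enough iterations the returned parameter clusters near the score maximum; the listing's version differs only in that `Randomize`, `Evaluate`, and `Undo` act on a full `RepnSS` parameter set.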

//

// Test main

#ifdef TEST
#define DELAYS 0
#if DELAYS
#include "SS2Delays.h"
#else
#include "SS2Gains.h"
#endif
#include "mystream.h"
#include "InventorSS2.h"
#include "InventorWindow.h"

#include "RepnSS2.h"
#include "Signal.h"

static SrchSStourn* srch = 0;

void searcher( long ) { srch->Iterate(); }

int main( int argc, char** argv )
{
    cout.setf( ios::unitbuf );
    if( argc != 3 ) {
        cerr << "Usage: " << argv[0] << " sig1 sig2\n";
        return 1;
    }

    Signal ya, yb;
    ya.read_aifc( argv[1] );
    yb.read_aifc( argv[2] );

    // Do SS2-dependent stuff
#if DELAYS

    SS2Delays ss2(10);
    double lo = -2;
    double hi = 2;
#else

    SS2Gains ss2;
    double lo = 0;
    double hi = 1;
#endif

    // Start up a new searcher
#define WINDOW_TIME 0.05
    ss2.SetLeak( ya.get_sampling_frequency() * WINDOW_TIME );

    RepnSS2::set_SS2( &ss2 );

    RepnSS2::set_bounds( lo, hi, lo, hi );

    RepnSS2* best = new RepnSS2();
    srch = new SrchSStourn( best );

    // Start the graphics
#define MARKER_DIM 0.01
#define MARKER_ALT 0.001

    InventorSS2 invss2( &ss2, lo, hi, 10, lo, hi, 20 );
    invss2.allocate_squaresets( 1 );
    invss2.register_squareset( 0, 1, &best,
                               MARKER_DIM, MARKER_ALT,
                               0.1, 1, 0.1, SoKeyboardEvent::H );

    // Run!
#if DELAYS
    invss2.max_catchup = 100;
#endif
    invss2.Run( &ya, &yb, searcher, 100, new InventorWindow("SS2") );
    return 0;
}
#endif