Title:
SOUND MASKING SYSTEM AND METHOD OF OPERATION THEREFOR
Document Type and Number:
WIPO Patent Application WO/2009/156928
Kind Code:
A1
Abstract:
An audio system comprises a sound emitter array (101) for radiating audio signals and a masking signal generator (105) that generates a masking signal. An array processor (103) generates a masking feed signal for the sound emitter array (101) from the masking signal such that a radiation pattern for the radiated masking signal has a reduced gain in a first angular interval comprising a first direction for a listening location. Optionally, the system comprises a receiver (301) which receives a first signal and the array processor (103) generates a feed signal for the sound emitter array (101) such that the radiation pattern for this radiated signal has an increased gain in at least a second angular interval of the first angular interval. The approach may allow improved privacy in an audio system.

Inventors:
DE BRUIJN WERNER P J (NL)
Application Number:
PCT/IB2009/052646
Publication Date:
December 30, 2009
Filing Date:
June 19, 2009
Assignee:
KONINKL PHILIPS ELECTRONICS NV (NL)
DE BRUIJN WERNER P J (NL)
International Classes:
G10K11/175
Domestic Patent References:
WO2006037587A2 (2006-04-13)
WO2002025631A1 (2002-03-28)
Foreign References:
JP2008103851A (2008-05-01)
JPH02230899A (1990-09-13)
JPH0316499A (1991-01-24)
Attorney, Agent or Firm:
UITTENBOGAARD, Frank et al. (Building 44, AE Eindhoven, NL)
Claims:

CLAIMS:

1. An audio system comprising: a sound emitter array (101) for radiating audio signals; a masking signal generator (105) for generating a masking signal; and first feed means (103) for generating a masking feed signal for the sound emitter array (101) from the masking signal; wherein the first feed means (103) is arranged to generate the masking feed signal to provide a radiation pattern for a radiated masking audio signal having a reduced gain in a first angular interval comprising a first direction for a listening location.

2. The audio system of claim 1 further comprising: a receiver (301) for receiving a first signal; and second feed means (103) for generating a first feed signal for the sound emitter array (101) from the first signal; and wherein the second feed means (103) is arranged to generate the first feed signal to provide a radiation pattern for a radiated audio signal having an increased gain in at least a second angular interval of the first angular interval.

3. The audio system of claim 2 wherein a signal level of the radiated masking audio signal is lower than a signal level of the radiated audio signal in the second angular interval and the signal level of the radiated masking audio signal is higher than the signal level of the radiated audio signal outside the first angular interval.

4. The audio system of claim 2 wherein the second feed means (103) is arranged to generate a main beam of the radiated audio signal in substantially the first direction.

5. The audio system of claim 2 wherein the second feed means (103) is arranged to generate a notch of the radiated audio signal in substantially a second direction corresponding to a masking location.

6. The audio system of claim 2 wherein the masking generator (105) is arranged to measure an averaged frequency spectrum characteristic of the first signal and to adapt an averaged frequency characteristic of the masking signal in response to the averaged frequency spectrum characteristic of the first signal.

7. The audio system of claim 2 wherein the masking generator (105) is arranged to measure a temporal envelope characteristic of the first signal and to adapt a temporal envelope characteristic of the masking signal in response to the temporal envelope characteristic of the first signal.

8. The audio system of claim 2 wherein the masking generator (105) is arranged to segment the first signal into a plurality of signal segments and to generate the masking signal in response to a reordering of the plurality of signal segments.

9. The audio system of claim 1 wherein the first feed means (103) is arranged to generate a notch of the radiated masking audio signal in substantially the first direction.

10. The audio system of claim 1 wherein the first feed means (103) is arranged to generate a main beam in a direction outside the first angular interval corresponding to a masking location.

11. The audio system of claim 1 further comprising a microphone (701) for providing a microphone signal and wherein the masking means (105) is arranged to generate the masking signal in response to the microphone signal.

12. The audio system of claim 1 further comprising a directional display arranged to directionally display an image in a main direction within the first angular interval.

13. The audio system of claim 1 further comprising: means for estimating an angular direction for a user from the sound emitter array; and means for adapting the radiation pattern in response to the angular direction.

14. The audio system of claim 1 further comprising: means for receiving a user input; and means for adapting the radiation pattern in response to the user input.

15. A method of operation for an audio system comprising a sound emitter array for radiating audio signals, the method comprising: generating a masking signal; and generating a masking feed signal for the sound emitter array from the masking signal providing a radiation pattern for a radiated masking audio signal having a reduced gain in a first angular interval comprising a first direction for a listening location.

Description:

SOUND MASKING SYSTEM AND METHOD OF OPERATION THEREFOR

FIELD OF THE INVENTION

The invention relates to an audio system and a method of operation therefor and in particular, but not exclusively, to masking of speech signals to provide improved privacy.

BACKGROUND OF THE INVENTION

The ability to provide audio to specific locations and users while preventing the audio from being intelligible to other users is of increasing interest. Specifically, there are many situations in which privacy-sensitive speech information has to be delivered to a specific person or persons in such a way that other people are not able to overhear what is being said. This may arise in many situations and in various types of environments, including hands-free phone conversations in open office environments or public spaces, laptop/desktop chatting/conferencing sessions in open office environments or public spaces, information services in public spaces (e.g. interactive information or shopping applications, ATM machines, service desks, etc.), medical environments, etc.

In some audio systems, it has been attempted to provide such privacy by directing the sound towards the intended person while reducing the sound radiated in other directions. Examples of such systems include "sound shower" type products which are often used in environments such as museums, etc. Such systems tend to be based on: acoustical parabolic reflectors; ultrasound beams; or directional electrostatic panel loudspeakers.

The systems are usually mounted in the ceiling above the intended position of the user. However, although the directionality of this class of systems can be sufficient to enable the delivery of audio to individual people without disturbing other people substantially, they tend not to be able to provide a sufficient level of acoustic privacy for applications where a high degree of privacy is required and where it is important that reproduced speech is unintelligible to people other than the user. This is because the amount of directivity that can be achieved is not sufficient to completely attenuate the signals outside the listening zone.

Another class of systems aims to provide acoustical privacy to a person who is speaking rather than to prevent other people from overhearing the audio presented to the user. Scenarios that are targeted by these systems include open-space office environments, such as a scenario wherein a person is making a phone call and it is desired to prevent other people from overhearing the phone conversation. In these systems, this is achieved by radiating a masking signal from a loudspeaker mounted at a position relatively far from the person making the phone call and relatively close to the person(s) who should not be able to overhear the conversation.

Although such systems can be effective in masking the speech they also have a number of associated disadvantages. Firstly, they tend to require high levels of the masking sound to ensure effective masking of the speech. However, the masking sound is typically a form of noise which tends to be irritating and disturbing at high levels. Also, the masking sound may be disturbing to the person that the system tries to mask and tends to require substantial spatial separation between the different users. However, this is impractical in many environments, such as office environments, where a high user density is required. Furthermore, the systems tend to be inflexible and rely on the users being located in specific predetermined locations. Hence, an improved audio system would be advantageous and in particular a system allowing increased flexibility, improved privacy, facilitated implementation and/or improved performance would be advantageous.

SUMMARY OF THE INVENTION

Accordingly, the invention seeks to preferably mitigate, alleviate or eliminate one or more of the above mentioned disadvantages singly or in any combination.

According to an aspect of the invention there is provided an audio system comprising: a sound emitter array for radiating audio signals; a masking signal generator for generating a masking signal; and first feed means for generating a masking feed signal for the sound emitter array from the masking signal; wherein the first feed means is arranged to generate the masking feed signal to provide a radiation pattern for a radiated masking audio signal having a reduced gain in a first angular interval comprising a first direction for a listening location.

The invention may allow improved privacy and may in particular allow an improved masking of an audio source. The system may allow facilitated implementation and/or deployment. Specifically, efficient masking may often be achieved in densely populated environments and restrictions and requirements for loudspeaker positioning may in many situations be less stringent.

The reduced gain may be reduced relative to an averaged gain for all angles or for angles outside the first angular interval. Specifically, the reduced gain of the first angular interval may be less than a threshold. The threshold may specifically be an average gain and/or a median gain for all angles or for angles outside the first angular interval. In some embodiments, the gain may be a fraction of the maximum gain of the radiation pattern such as e.g. 3 dB or 6 dB below the maximum gain. In some embodiments, the gain of the radiation pattern is below a threshold for all angles within the first angular interval and above a threshold for all angles outside the first angular interval.

In accordance with an optional feature of the invention, the audio system further comprises: a receiver for receiving a first signal; and second feed means for generating a first feed signal for the sound emitter array from the first signal; and wherein the second feed means is arranged to generate the first feed signal to provide a radiation pattern for a radiated audio signal having an increased gain in at least a second angular interval of the first angular interval.

This may allow a highly efficient system for providing an audio signal to a user while maintaining a high degree of privacy. Furthermore, facilitated implementation and/or deployment may be achieved. In particular, a single loudspeaker box may provide an efficient audio system for providing both a high quality audio signal to a desired user and masking of the signal to other users.

The radiation pattern for the radiated audio signal may specifically have a substantially complementary shape to the radiation pattern for the radiated masking signal. The gain of the radiation pattern for the radiated audio signal may be higher than the gain of the radiation pattern for the radiated masking signal within the first angular interval and lower outside this interval. The gain may be defined as the average gain for a 5 or 10 degree interval.

In accordance with an optional feature of the invention, a signal level of the radiated masking audio signal is lower than a signal level of the radiated audio signal in the second angular interval and the signal level of the radiated masking audio signal is higher than the signal level of the radiated audio signal outside the first angular interval.

This may allow improved performance and in particular may allow efficient sound provision as well as masking.

In accordance with an optional feature of the invention, the second feed means is arranged to generate a main beam of the radiated audio signal in substantially the first direction.

This may allow improved performance and in particular may allow efficient sound provision as well as masking. Alternatively or additionally it may allow facilitated implementation. The approach may allow the provided sound to be specifically targeted to the desired user thereby improving the masking effect in other directions.

In accordance with an optional feature of the invention, the second feed means is arranged to generate a notch of the radiated audio signal in substantially a second direction corresponding to a masking location.

This may allow improved privacy and in particular may allow an improved masking at the masking location. The notch may specifically correspond to a local minimum of the gain and may specifically be a null of the radiation pattern.

In accordance with an optional feature of the invention, the masking generator is arranged to measure an averaged frequency spectrum characteristic of the first signal and to adapt an averaged frequency characteristic of the masking signal in response to the averaged frequency spectrum characteristic of the first signal.

This may allow improved performance and in particular may allow an improved masking effect of the masking signal by this being adapted to the signal to be masked. The averaged frequency characteristic may specifically be a frequency spectrum measure averaged over a suitable time interval. The averaging may be a weighted and/or time variant averaging.

In accordance with an optional feature of the invention, the masking generator is arranged to measure a temporal envelope characteristic of the first signal and to adapt a temporal envelope characteristic of the masking signal in response to the temporal envelope characteristic of the first signal.

This may allow improved performance and in particular may allow an improved masking effect of the masking signal by this being adapted to the signal to be masked. The temporal envelope characteristic may specifically be an envelope amplitude.
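By way of a non-limiting illustration, the temporal envelope adaptation described above might be sketched as follows in Python. The rectify-and-smooth estimator, the `smoothing` coefficient and the function names are assumptions made for this example only; no particular envelope estimator is prescribed here.

```python
def envelope(signal, smoothing=0.9):
    """Envelope amplitude estimate: rectify, then smooth with a one-pole filter.

    The one-pole smoother is an assumed, illustrative choice.
    """
    env, state = [], 0.0
    for sample in signal:
        state = smoothing * state + (1.0 - smoothing) * abs(sample)
        env.append(state)
    return env


def shape_masker(first_signal, noise, smoothing=0.9):
    """Scale each noise sample by the tracked envelope of the first signal."""
    env = envelope(first_signal, smoothing)
    return [e * n for e, n in zip(env, noise)]
```

With this sketch the masker falls silent when the first signal is silent and rises with it, which is one simple way of making the envelope of the masking signal follow that of the signal to be masked.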

In accordance with an optional feature of the invention, the masking generator is arranged to segment the first signal into a plurality of signal segments and to generate the masking signal in response to a reordering of the plurality of signal segments.

This may allow improved performance and in particular may allow an improved masking effect of the masking signal. Alternatively or additionally, a facilitated generation of the masking signal may be achieved.

The reordering may be a time domain reordering. The signal segments may e.g. be time domain segments or time and frequency tile segments. Individual (different) reordering may be performed in different frequency bands.
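A time domain reordering of this kind might, purely as an example, be sketched in Python as below. The fixed segment length, the uniform random shuffle and the name `reordered_masker` are assumptions for illustration; per-band or time and frequency tile reordering is not shown.

```python
import random


def reordered_masker(first_signal, segment_len, seed=0):
    """Build a masking signal by shuffling fixed-length time segments.

    segment_len and seed are hypothetical parameters; a trailing segment
    shorter than segment_len is kept and shuffled with the rest.
    """
    segments = [first_signal[i:i + segment_len]
                for i in range(0, len(first_signal), segment_len)]
    rng = random.Random(seed)
    rng.shuffle(segments)  # reorder the segments in time
    return [sample for segment in segments for sample in segment]
```

Because the masker is assembled from the first signal's own segments, it contains exactly the same samples in a different order, so its long-term level and spectral content remain close to those of the signal being masked.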

In accordance with an optional feature of the invention, the first feed means is arranged to generate a notch of the radiated masking audio signal in substantially the first direction.

This may allow improved audio quality to be provided while maintaining privacy. The notch may specifically correspond to a local minimum of the gain and may specifically be a null of the radiation pattern.

In accordance with an optional feature of the invention, the first feed means is arranged to generate a main beam in a direction outside the first angular interval corresponding to a masking location.

This may allow improved performance and in particular may allow efficient masking that may be focused at a specific masking location. Alternatively or additionally, it may allow facilitated implementation.

In some embodiments, the audio system may comprise a microphone for providing a microphone signal which can be communicated to a remote destination.

This may allow an efficient two-way audio system. In particular, it may allow an improved privacy and may specifically allow an effective masking of the audio environment. Specifically, for a speech communication system, the speaker may be effectively masked to provide privacy to the speaker.

In accordance with an optional feature of the invention, the audio system further comprises a microphone for providing a microphone signal and wherein the masking means is arranged to generate the masking signal in response to the microphone signal.

This may allow an efficient two-way audio system with matching performance. In particular, for a speech communication system improved performance can be achieved with communication in both directions being optimized for a specific user position.

In some embodiments, the microphone may be a directional microphone having an increased sensitivity in a direction within the first angular interval. Specifically, the directional microphone may be implemented as a microphone array with associated beam forming functionality. This may allow particularly advantageous operation with increased privacy.

In accordance with an optional feature of the invention, the audio system further comprises a directional display arranged to directionally display an image in a main direction within the first angular interval.

This may allow an improved privacy and may specifically allow a highly effective masking of the audio environment. Specifically, for a speech communication system, the speaker may be effectively masked to provide privacy to the speaker.

The masking signal may specifically be generated in response to both the microphone signal and the received first audio signal.

In accordance with an optional feature of the invention, the audio system further comprises: means for estimating an angular direction for a user from the sound emitter array; and means for adapting the radiation pattern in response to the angular direction.

This may allow improved performance and may in particular allow the masking effect to be maximized and/or minimized for a given user location.

The radiation pattern for the first signal may alternatively or additionally be adapted in response to the angular direction.

In accordance with an optional feature of the invention, the audio system further comprises: means for receiving a user input; and means for adapting the radiation pattern in response to the user input.

This may allow improved performance and may in particular allow the system to be effectively and easily adapted to different deployment scenarios and environments.

According to an aspect of the invention there is provided a method of operation for an audio system comprising a sound emitter array for radiating audio signals, the method comprising: generating a masking signal; and generating a masking feed signal for the sound emitter array from the masking signal providing a radiation pattern for a radiated masking audio signal having a reduced gain in a first angular interval comprising a first direction for a listening location.

These and other aspects, features and advantages of the invention will be apparent from and elucidated with reference to the embodiment(s) described hereinafter.

BRIEF DESCRIPTION OF THE DRAWINGS

Embodiments of the invention will be described, by way of example only, with reference to the drawings, in which

Fig. 1 illustrates an example of an audio system in accordance with some embodiments of the invention;

Fig. 2 illustrates an example of a radiation pattern for an audio system in accordance with some embodiments of the invention;

Fig. 3 illustrates an example of an audio system in accordance with some embodiments of the invention;

Fig. 4 illustrates an example of a use scenario for an audio system in accordance with some embodiments of the invention;

Fig. 5 illustrates an example of radiation patterns for an audio system in accordance with some embodiments of the invention;

Fig. 6 illustrates an example of radiation patterns for an audio system in accordance with some embodiments of the invention;

Fig. 7 illustrates an example of an audio system in accordance with some embodiments of the invention; and

Fig. 8 illustrates an example of an audio system in accordance with some embodiments of the invention.

DETAILED DESCRIPTION OF THE DRAWINGS

The following description focuses on embodiments of the invention applicable to providing enhanced privacy for speech signals and in particular to providing efficient masking of a speech signal. However, it will be appreciated that the invention is not limited to this application but may be applied to many other systems, applications and signals including for example to music or other audio signals.

Fig. 1 illustrates an example of an audio system in accordance with some embodiments of the invention. The system is arranged to provide a low complexity and easy to deploy system for providing privacy for a speaker at a given location.

The system comprises a sound emitter array which in the specific example is a loudspeaker array 101. Fig. 1 illustrates an example with four loudspeakers in the array 101 but it will be appreciated that in other embodiments, the array may contain other numbers of sound emitting devices and specifically that the loudspeaker array 101 may comprise a larger number of loudspeakers.

The loudspeaker array 101 is coupled to an array processor 103 which is arranged to generate feed signals for the loudspeaker array 101 and specifically to generate an individual feed signal for each of the loudspeakers of the loudspeaker array 101. The array processor 103 is specifically capable of modifying the radiation pattern of the sound signal radiated from the loudspeaker array 101. For example, the array processor 103 may comprise beam forming functionality which can determine relative weights, phase offsets and/or delays for the individual signals being fed to the individual loudspeakers to result in the desired radiation pattern.
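As one possible illustration of how delays may shape the radiation pattern, the following Python sketch applies delay-and-sum steering to a uniform line array and evaluates the resulting far-field gain at a single frequency. The line geometry, the speed of sound value, the sign convention for angles and all function names are assumptions for this example; delay-and-sum is only one of many suitable beam forming approaches.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed value for air at room temperature


def steering_delays(num_speakers, spacing_m, angle_deg):
    """Delays (s) steering a uniform line array toward angle_deg (0 = broadside)."""
    s = math.sin(math.radians(angle_deg))
    raw = [n * spacing_m * s / SPEED_OF_SOUND for n in range(num_speakers)]
    earliest = min(raw)
    return [d - earliest for d in raw]  # shift so every delay is non-negative


def array_gain(delays, spacing_m, freq_hz, angle_deg, weights=None):
    """Normalised far-field magnitude response of the delayed array at one frequency."""
    weights = weights or [1.0] * len(delays)
    omega = 2.0 * math.pi * freq_hz
    s = math.sin(math.radians(angle_deg))
    total = 0j
    for n, (w, tau) in enumerate(zip(weights, delays)):
        # Path-length advance of loudspeaker n toward the observation angle.
        advance = n * spacing_m * s / SPEED_OF_SOUND
        total += w * cmath.exp(1j * omega * (advance - tau))
    return abs(total) / sum(abs(w) for w in weights)
```

For example, with four loudspeakers spaced 0.05 m apart and steered to 30 degrees, `array_gain` evaluates to unity at 30 degrees for a 2 kHz tone and to a substantially lower value at -30 degrees.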

It will be appreciated that any suitable processing for generating desired radiation patterns may be used by the array processor 103. Specifically, it will be appreciated that the person skilled in the art will be aware of many different audio beam forming algorithms and that any suitable algorithm may be used without detracting from the invention.

The array processor 103 is coupled to a masking signal generator 105 which is arranged to generate a masking signal. The masking signal may for example be a noise signal, such as a white noise signal in the audio frequency band (e.g. 20 Hz to 8 kHz).
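A noise-like masking signal confined to such a band might, as a rough sketch, be approximated in Python by summing sinusoids with random frequencies and phases inside the band. This sum-of-sinusoids construction, the number of components and the function name are assumptions chosen to keep the example free of external libraries; filtering white noise would be an equally valid approach.

```python
import math
import random


def band_limited_noise(num_samples, sample_rate_hz, f_low_hz=20.0,
                       f_high_hz=8000.0, num_components=200, seed=0):
    """Approximate band-limited noise as a sum of random-phase sinusoids.

    sample_rate_hz should exceed 2 * f_high_hz to avoid aliasing.
    """
    rng = random.Random(seed)
    freqs = [rng.uniform(f_low_hz, f_high_hz) for _ in range(num_components)]
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(num_components)]
    scale = math.sqrt(2.0 / num_components)  # targets roughly unit average power
    samples = []
    for i in range(num_samples):
        t = i / sample_rate_hz
        samples.append(scale * sum(math.sin(2.0 * math.pi * f * t + p)
                                   for f, p in zip(freqs, phases)))
    return samples
```

Calling `band_limited_noise(3200, 32000.0)` would produce 0.1 s of such a signal; the fixed seed makes the output repeatable, which may be convenient when calibrating a system.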

The masking signal is fed to the array processor 103 which proceeds to generate feed signals for the individual loudspeakers such that the loudspeaker array 101 radiates a masking audio signal that has a desired radiation pattern. The array processor 103 specifically generates a masking feed signal for the loudspeaker array 101 which provides a radiation pattern for the radiated masking audio signal which has a reduced gain in a first angular interval that comprises a first direction towards a listening location.

An example of such a radiation pattern is illustrated in Fig. 2. In the example, the gain of the radiation pattern for the loudspeaker array 101 is reduced within an angular interval, α, that includes a direction 201 towards a listening position A.

Thus, in the example, a masking signal is generated which can prevent speech signals from being understood by users who are not located in the listening area corresponding to the directions of reduced gain. A very flexible and efficient masking of the speech may be achieved using only a single loudspeaker array 101 which specifically can be implemented in a single loudspeaker box. Furthermore the array processor 103 and masking generator 105 may in some embodiments be implemented in the same loudspeaker box thereby providing a single box privacy solution. Also, low cost and/or low complexity may be achieved. In addition, a very flexible and adaptable deployment in practical scenarios is enabled. For example, a single box solution may be located at a suitable location and in a suitable orientation. The exact radiation pattern generated by the array processor 103 may then simply be adapted to the desired listening location(s) by adjusting the parameters used by the array processor 103 when generating the feed signals. This may for example be adjusted as part of an automated calibration process using e.g. a microphone coupled to the array processor 103 and located at the desired listening location. Furthermore, the system may easily be updated to reflect a changed scenario. For example, for an office deployment a change of the office layout may be accommodated simply by recalibrating the system, without requiring any movement or physical alteration of the privacy system.

The listening location may for example correspond to a position of a speaker for whom the privacy is provided. For example, a simple, flexible single box solution may be provided to mask a human speaker at a specific location to listeners in all other locations of the audio environment. For example, a single worker speaking on the telephone may be masked to all other workers in the office environment.

It will be appreciated that the first angular interval in which the gain of the loudspeaker array 101 is reduced need not be a narrow angular interval specifically directed at the listening location. For example, in some scenarios, the radiation pattern for the masking signal may be a relatively narrow beam which is directed towards a specific location in which the masking effect is required. For example, in a medical environment the masking signal may be specifically directed towards a patient while allowing all others to hear the discussion.

The reduced gain within the first angular interval relative to a gain of the radiation pattern outside of the first angular interval may for example be a gain which is below a threshold for all angles within the interval. At the same time, the gains for all angles outside the first angular interval may be above the threshold. Specifically, the threshold may be the lowest gain of the radiation pattern outside the first angular interval. It will be appreciated that the gain for an angle may be calculated as the average gain for an angle interval centered on the specific angle. This angle interval may for example be 5° or 10°. Such an averaged gain consideration may reflect that for some radiation patterns, there may be narrow local minima outside the first angular interval that have a gain less than some local maxima within the first angular interval.
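The averaged gain consideration above might be sketched in Python as follows. Averaging on linear power rather than directly on decibel values is an assumption, as are the function and parameter names; the pattern is taken to be available as sampled angles with gains in dB.

```python
import math


def windowed_gain_db(angles_deg, gains_db, centre_deg, width_deg=10.0):
    """Average gain (dB) over an angle window centred on centre_deg.

    Averaging is performed on linear power, which is an assumed choice.
    """
    half = width_deg / 2.0
    powers = [10.0 ** (g / 10.0)
              for a, g in zip(angles_deg, gains_db)
              if abs(a - centre_deg) <= half]
    if not powers:
        raise ValueError("no pattern samples inside the window")
    return 10.0 * math.log10(sum(powers) / len(powers))
```

With a pattern sampled every degree, a single -40 dB notch surrounded by 0 dB samples averages to less than 1 dB of attenuation over a 10 degree window, so a narrow local minimum of this kind would not be mistaken for a region of reduced gain.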

The reduced gain may specifically be less than a threshold for all angles within the first angular interval where the threshold is given as a filtered or (possibly weighted) averaged gain value for all angles or for all angles outside the first angular interval.

In some embodiments, the first angular interval having a reduced gain may be defined as the angle interval for which a gain of the radiation pattern is below a threshold where the threshold is a function of the averaged gain of all angles. For example, the first angular interval with reduced gain may be an interval in which the gain of the radiation pattern is below an offset average gain of the radiation pattern, wherein the offset may for example be 0 dB, -3 dB, -6 dB, -9 dB or -12 dB.

As another example, the first angular interval having a reduced gain may be defined as the angle interval for which a gain of the radiation pattern is below a threshold where the threshold is a function of the maximum gain of the radiation pattern. For example, the first angular interval with reduced gain may be an interval in which the gain of the radiation pattern is below an offset peak gain of the radiation pattern, wherein the offset may for example be -3 dB, -6 dB, -9 dB, -12 dB, -15 dB or -18 dB.
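The two threshold definitions above may be illustrated by the following Python sketch, which returns the sampled angles whose gain lies below an offset from either the peak gain or the averaged gain. The function name, the `reference` argument and the use of a power average are assumptions for this example.

```python
import math


def reduced_gain_angles(angles_deg, gains_db, offset_db=-6.0, reference="peak"):
    """Angles whose gain is below (reference gain + offset_db).

    reference="peak" uses the maximum gain; reference="average" uses the
    power-averaged gain over all angles (an assumed averaging rule).
    """
    if reference == "peak":
        ref_db = max(gains_db)
    elif reference == "average":
        mean_power = sum(10.0 ** (g / 10.0) for g in gains_db) / len(gains_db)
        ref_db = 10.0 * math.log10(mean_power)
    else:
        raise ValueError("reference must be 'peak' or 'average'")
    threshold_db = ref_db + offset_db
    return [a for a, g in zip(angles_deg, gains_db) if g < threshold_db]
```

For a pattern that is 0 dB everywhere except for -20 dB at -10, 0 and 10 degrees, both the peak reference with a -6 dB offset and the average reference with a 0 dB offset identify the same three angles as the reduced-gain interval.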

In some embodiments, the reduced gain of the first angular interval may correspond to a gain reduction achieved by directing a local gain minimum in the direction of the listening location. Specifically, a notch or null of the loudspeaker array radiation pattern may be directed in the listening location direction.
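One conceivable way of directing a null towards the listening location while keeping a beam towards a masking location is sketched below in Python for a single frequency. The projection-based weight design, the uniform line geometry, the speed of sound value and all names are assumptions for illustration; a practical system would typically repeat or generalise this across the audio band.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed value


def steering_vector(num_speakers, spacing_m, freq_hz, angle_deg):
    """Far-field phase factors of a uniform line array for one direction."""
    k = 2.0 * math.pi * freq_hz / SPEED_OF_SOUND
    s = math.sin(math.radians(angle_deg))
    return [cmath.exp(1j * k * n * spacing_m * s) for n in range(num_speakers)]


def null_steered_weights(num_speakers, spacing_m, freq_hz, null_deg, beam_deg):
    """Beam toward beam_deg with an exact null toward null_deg (narrowband)."""
    a_null = steering_vector(num_speakers, spacing_m, freq_hz, null_deg)
    a_beam = steering_vector(num_speakers, spacing_m, freq_hz, beam_deg)
    w0 = [v.conjugate() / num_speakers for v in a_beam]   # plain steered beam
    leak = sum(a * w for a, w in zip(a_null, w0))         # response at the null angle
    # Remove the component of w0 that radiates toward the null direction.
    return [w - a.conjugate() * leak / num_speakers for w, a in zip(w0, a_null)]


def response(weights, spacing_m, freq_hz, angle_deg):
    """Magnitude of the array response for the given complex weights."""
    a = steering_vector(len(weights), spacing_m, freq_hz, angle_deg)
    return abs(sum(x * w for x, w in zip(a, weights)))
```

For four loudspeakers at 0.05 m spacing and a 2 kHz tone, weights designed with the null at 0 degrees and the beam at 40 degrees give a response that is numerically zero at 0 degrees while retaining most of the gain at 40 degrees.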

Fig. 3 illustrates an audio system for providing a received speech signal to a user while providing enhanced privacy for the signal. The system may be seen as an enhancement of the system of Fig. 1 and also comprises the loudspeaker array 101, the array processor 103 and the masking generator 105. In addition, the system comprises an interface 301 which is capable of interfacing the system to a communication system from which the speech signal to be provided to the user is received. It will be appreciated that in other embodiments, the speech signal may be provided from an internal source such as e.g. a speech signal memory. The interface 301 may for example be a network interface coupling the system to a suitable network, such as the Internet, or may for example be a radio transceiver capable of communicating with elements of a radio communication system, such as a cellular telephone system.

The interface 301 is coupled to a receive signal processor 303 which is arranged to decode and process the received speech signal. For example, the receive signal processor 303 may comprise functionality for performing speech decoding, equalization, volume adjustment etc.

The receive signal processor 303 is coupled to the array processor 103 which is fed the received signal. The array processor 103 is arranged to generate feed signals for the loudspeaker array 101 and specifically to generate an individual feed signal for each of the loudspeakers of the loudspeaker array 101 from the received speech signal. The array processor 103 is specifically capable of modifying the radiation pattern of the sound signal which is radiated from the loudspeaker array 101. For example, the array processor 103 may comprise beam forming functionality which can determine relative weights, phase offsets and/or delays for the individual signals fed to the individual loudspeakers thereby resulting in the desired radiation pattern.

The received signal is thus fed to the array processor 103 which proceeds to generate feed signals for the individual loudspeakers such that the loudspeaker array 101 radiates the speech signal with a desired radiation pattern. In the example, the array processor 103 is thus arranged to both generate a radiation pattern for the received speech signal as well as for the masking signal. Specifically, for each of the speech signal and the masking signal, weights, phase offsets and/or delays may be determined for each loudspeaker. The feed signal for the individual loudspeaker is then generated by combining the compensated signals from the masking signal and the speech signal (e.g. by a simple summation).
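The combination by simple summation described above might be sketched in Python as follows. Representing each loudspeaker's compensation as a (weight, delay in whole samples) pair is a hypothetical parameter layout for this example, as are the function names.

```python
def apply_weight_and_delay(signal, weight, delay_samples):
    """Scale a signal and delay it by a whole number of samples."""
    return [0.0] * delay_samples + [weight * s for s in signal]


def combine_feeds(speech, masking, speech_params, masking_params):
    """Return one feed signal per loudspeaker.

    speech_params and masking_params each hold one (weight, delay_samples)
    pair per loudspeaker; this layout is an illustrative assumption.
    """
    feeds = []
    for (w_s, d_s), (w_m, d_m) in zip(speech_params, masking_params):
        a = apply_weight_and_delay(speech, w_s, d_s)
        b = apply_weight_and_delay(masking, w_m, d_m)
        length = max(len(a), len(b))
        a += [0.0] * (length - len(a))  # zero-pad to a common length
        b += [0.0] * (length - len(b))
        feeds.append([x + y for x, y in zip(a, b)])  # simple summation
    return feeds
```

Because the speech and masking signals receive different weights and delays before being summed, the same loudspeakers can radiate the two signals with different radiation patterns.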

The array processor 103 specifically generates a speech feed signal for the loudspeaker array 101 which provides a radiation pattern for the radiated speech signal that has an increased gain in at least a second angular interval of the first angular interval. The second angular interval may be identical to the first angular interval or may be smaller than the first angular interval.

In some embodiments, the second angular interval having an increased gain may be defined as the angle interval for which a gain of the radiation pattern is above a threshold where the threshold is a function of the averaged gain of the radiation pattern for all angles. For example, the second angular interval with increased gain may be an interval in which the gain of the radiation pattern exceeds the average gain of the radiation pattern by a given offset, wherein the offset may for example be 0 dB, 3 dB, 6 dB, 9 dB or 12 dB. As another example, the second angular interval may be defined as the interval in which the gain of the radiation pattern is within a given offset of the maximum gain of the radiation pattern. For example, it may include the angles for which the gain is higher than e.g. the maximum gain minus 3, 6, 9 or 12 dB.
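The two threshold definitions can be illustrated with a small sketch operating on a sampled radiation pattern. It assumes that the average gain is taken over power (rather than over dB values) and uses made-up pattern values; both choices are assumptions for illustration only.

```python
import math


def interval_above_average(angles_deg, gains_db, offset_db=3.0):
    """Angles whose gain exceeds the (power-)average gain by offset_db."""
    powers = [10.0 ** (g / 10.0) for g in gains_db]
    average_db = 10.0 * math.log10(sum(powers) / len(powers))
    return [a for a, g in zip(angles_deg, gains_db) if g > average_db + offset_db]


def interval_near_maximum(angles_deg, gains_db, offset_db=3.0):
    """Angles whose gain is higher than the maximum gain minus offset_db."""
    peak_db = max(gains_db)
    return [a for a, g in zip(angles_deg, gains_db) if g > peak_db - offset_db]


# Hypothetical sampled pattern with a beam around 0 to 30 degrees.
angles = [-90, -60, -30, 0, 30, 60, 90]
gains = [-20.0, -20.0, -10.0, 0.0, -2.0, -20.0, -20.0]

by_average = interval_above_average(angles, gains, offset_db=3.0)
by_maximum = interval_near_maximum(angles, gains, offset_db=1.0)
```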

In some scenarios, the first and second angular intervals may be defined relative to each other. Specifically, the first and second angular intervals may both be defined as an interval wherein the gain of the radiation pattern for the speech signal is higher than the radiation pattern for the masking signal.

Specifically, the signal level of the radiated masking audio signal may be lower than a signal level of the radiated speech signal in the first angular interval whereas the signal level of the radiated masking audio signal may be higher than the signal level of the radiated speech signal outside the first angular interval.

Thus, in the embodiment of Fig. 3, a single loudspeaker array 101 is used to radiate both a desired speech signal as well as a masking signal. The radiation patterns for the two signals are different such that the desired speech signal is dominant in one or more directions corresponding to a listening location whereas the masking signal is dominant in other directions corresponding to locations wherein the speech signal is effectively masked.

Fig. 4 illustrates an example of a usage scenario for the system of Fig. 3. In the example, a first relatively narrow beam 401 is generated for the desired speech signal. The direction of the beam is in the direction 403 of a desired listening position A. For example, in an office deployment, the speech signal is directed towards the working position of a first worker who is the intended recipient. Furthermore, a second relatively narrow beam 405 is generated for the masking signal. The direction of the beam is in the direction 407 of a desired masking position B. For example, in an office deployment, the masking signal may be directed towards the working position of a co-worker of the first worker.

Thus, in this example, two individual beams are generated by the array processor 103 and the loudspeaker array 101. The first beam 401 carries the desired speech signal and is aimed at person A for whom the signal is intended. The second beam carries the masking signal generated by the masking signal generator 105 and is aimed at person B who is not supposed to be able to listen to the speech signal. The gain of this beam 405 is such that at the position of person B, the speech signal is masked sufficiently by the radiated masking signal. Thus, the masking signal strength is sufficient to render the speech unintelligible for person B.

Fig. 5 illustrates an example of radiation patterns for the desired speech signal beam 401 and the masking beam 405 for a scenario in which person A is located at 0 degrees relative to the loudspeaker array 101 and person B is located at -45 degrees relative thereto. The shown example is for a loudspeaker array comprising 12 loudspeakers with an internal spacing of 5 cm.
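Radiation patterns of this kind can be approximated numerically. The sketch below evaluates the far-field gain of a simple delay-and-sum beam for an array of 12 loudspeakers with 5 cm spacing. The single evaluation frequency of 2 kHz, the speed of sound and the delay-and-sum design are assumptions; the description does not state how the patterns of Fig. 5 were computed.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed


def array_gain_db(look_deg, steer_deg, freq_hz, num=12, spacing_m=0.05):
    """Gain (dB, relative to the beam maximum) of a delay-and-sum beam
    steered to steer_deg, observed from the direction look_deg."""
    k = 2.0 * math.pi * freq_hz / SPEED_OF_SOUND  # wavenumber
    phase_step = k * spacing_m * (math.sin(math.radians(look_deg))
                                  - math.sin(math.radians(steer_deg)))
    array_factor = sum(cmath.exp(1j * n * phase_step) for n in range(num)) / num
    return 20.0 * math.log10(max(abs(array_factor), 1e-12))


FREQ_HZ = 2000.0  # assumed evaluation frequency

# Speech beam aimed at person A (0 degrees), masking beam at person B (-45 degrees).
speech_at_a = array_gain_db(0.0, 0.0, FREQ_HZ)
speech_at_b = array_gain_db(-45.0, 0.0, FREQ_HZ)
masking_at_b = array_gain_db(-45.0, -45.0, FREQ_HZ)
masking_at_a = array_gain_db(0.0, -45.0, FREQ_HZ)
```

Under these assumptions each beam has its maximum towards its own target and a clearly lower gain towards the other person. The separation depends on frequency and is smaller at low frequencies, where the array is short compared to the wavelength.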

Thus, the system may provide improved privacy from a low complexity, low cost and easy to deploy speech communication system. In particular, a single box solution may be used to provide flexible provision of private speech.

It will be appreciated that whereas the scenario of Fig. 4 corresponds to providing a speech signal to a first person (A) while masking the signal for a second person (B), the system may be used flexibly. For example, the same system and loudspeaker array 101 can provide a speech signal to the second person (B) which is kept private for person (A) simply by changing the generated beams. As a specific example, the system of Fig. 3 may be used to provide a flexible Public Address (PA) system allowing the generated audio to be a public announcement (e.g. no masking signal, broad beam for the audio signal) or be a private announcement for an individual (either A or B).

In many embodiments, the scenario of Fig. 4 may be improved by actively minimizing the radiation gain for one beam in the direction of the other. Specifically, the first beam 401 may be generated to have a notch substantially in the direction of the masking direction 407 (say within an interval of ±10-15°). Similarly, the second beam 405 may be generated to have a notch substantially in the direction of the listening direction 403 (say within an interval of ±10-15°). Thus, for both beams, the level in the direction of the other person is sought to be minimized. Specifically, the speech signal is reproduced by the loudspeaker array 101 in such a way that there is a strong beam in the direction 403 of person A, while there is maximum suppression in the direction 407 of person B. For the masking signal, the situation is the reverse, i.e. maximum gain is in the direction 407 of person B and maximum suppression is in the direction 403 of person A. This will tend to increase the acoustic separation between the two signals at the positions of both A and B, resulting in an improved masking and privacy while at the same time improving the audio quality for person A.

Depending on the loudspeaker configuration and use scenario, minimization of the level of each beam can thus be explicitly in the direction of the other person, by the creation of a notch in the corresponding direction.
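One possible way of creating such a notch, shown here only as a sketch, is to project the steering vector of the target direction onto the subspace orthogonal to the steering vector of the direction to be suppressed. This projection technique, the single design frequency of 2 kHz and all names are assumptions and are not taken from the description. The result is a null at one exact angle; a notch covering an interval of several degrees would need additional constraints.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed


def steering_vector(angle_deg, freq_hz, num=12, spacing_m=0.05):
    """Far-field phase factors of a uniform line array for one direction."""
    k = 2.0 * math.pi * freq_hz / SPEED_OF_SOUND
    s = math.sin(math.radians(angle_deg))
    return [cmath.exp(1j * k * spacing_m * n * s) for n in range(num)]


def inner(a, b):
    """Inner product a^H b of two complex vectors."""
    return sum(x.conjugate() * y for x, y in zip(a, b))


def notched_weights(target_deg, notch_deg, freq_hz):
    """Weights with a beam towards target_deg and a null towards notch_deg."""
    a_target = steering_vector(target_deg, freq_hz)
    a_notch = steering_vector(notch_deg, freq_hz)
    coeff = inner(a_notch, a_target) / inner(a_notch, a_notch)
    # Remove the component of the target vector that radiates towards the notch.
    return [t - coeff * n for t, n in zip(a_target, a_notch)]


def response_db(weights, look_deg, freq_hz):
    """Gain (dB) of the weighted array towards look_deg."""
    a = steering_vector(look_deg, freq_hz)
    magnitude = abs(inner(weights, a)) / len(weights)
    return 20.0 * math.log10(max(magnitude, 1e-12))


FREQ_HZ = 2000.0  # assumed design frequency
speech_w = notched_weights(0.0, -45.0, FREQ_HZ)    # beam at A, null at B
masking_w = notched_weights(-45.0, 0.0, FREQ_HZ)   # beam at B, null at A
```

In this sketch the gain towards the target direction drops only slightly (by less than 1 dB at the assumed frequency) while the gain in the notch direction is limited only by numerical precision.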

A scenario such as that of Fig. 4 may provide a highly advantageous system for many applications. A specific example is a hospital application wherein a doctor performing a medical procedure on a conscious patient (e.g. a Cardio-Vascular X-Ray imaging procedure) receives information from other medical staff through a hands-free telephone system without this being disturbing for the patient.

In some embodiments, the two radiation patterns for the speech signal and the masking signal may specifically be substantially complementary. For example, the speech signal may be radiated in a relatively narrow beam in the direction of the listening location whereas the masking signal may be radiated in most directions except for towards the listening location. Thus, a masking signal radiation pattern similar to that of Fig. 2 may be used together with a narrow beam radiation of the speech signal. This may allow a high-quality speech signal to be provided to a listener while effectively masking the signal to listeners in all other directions thereby resulting in improved privacy of the radiated speech signal.

This approach is very suitable for applications in which the main goal is to provide a speech signal only to a specific person A located at a known location, and to make the speech unintelligible for any person other than person A. Such operation is attractive in many applications in public spaces and office situations.

An example of suitable radiation patterns for such a scenario is shown in Fig. 6 where the desired speech signal and the masking signal are radiated using substantially complementary directional radiation patterns. The speech signal is reproduced such that there is a strong beam in the direction of person A, while the level is kept as low as possible in all other directions. The directional radiation pattern for the masking signal is complementary and has a more or less homogeneous level which is high enough to mask the speech signal in every direction except in the direction of person A where the gain of the radiation pattern is reduced. This results in a situation where the speech is unintelligible in every direction, except in the direction of person A in which the masking signal is virtually absent (except for reflected sound) so that the speech can be easily understood.

Examples of applications wherein such an approach may be used include laptops or other portable devices, enabling private chatting in public environments; ATM machines and other service machines that provide spoken information to the user; service desks in public spaces where the user is separated from the service staff by a glass window; and interactive shopping windows providing individual product information to shoppers.

It will be appreciated that in other scenarios, the opposite effect may be achieved, i.e. it may be desired to make the speech signal unintelligible for one or more persons located at specific known positions while keeping it intelligible to everyone else. This may be achieved by adjusting the radiation patterns complementarily to the previous example, i.e. by generating a narrow beam pattern for the masking signal and a broad beam for the audio signal. This scenario may e.g. apply to a hospital situation where only the patient should be unable to understand the speech while all medical staff in the room should be able to understand it.

In the previous examples, the masking signal was specifically a noise signal. However, in some embodiments, the masking signal is generated in response to the received speech signal. Thus the receive signal processor 303 may be coupled to the masking generator 105 which may generate the masking signal based on the characteristics of the received speech signal.

Specifically, the masking generator may measure an averaged frequency spectrum characteristic of the speech signal. For example, a Fast Fourier Transform (FFT) may be applied to the received signal to generate a suitable number of frequency bins. The values of each bin may be low pass filtered in the time domain to provide an averaged signal level for the frequency interval represented by the bin. Thus, the signal levels of the different frequency bins represent a long-term frequency spectrum for the received speech signal.
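A minimal sketch of such a measurement is given below. It assumes non-overlapping frames without a window function, a naive discrete Fourier transform in place of an optimised FFT, and a first-order recursive average as the low pass filter for each bin; the frame length and the smoothing constant `alpha` are arbitrary example values.

```python
import cmath
import math


def dft_magnitudes(frame):
    """Magnitudes of bins 0..N/2 of a real frame (naive DFT, for clarity)."""
    n = len(frame)
    return [abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n)))
            for k in range(n // 2 + 1)]


def long_term_spectrum(signal, frame_len=64, alpha=0.1):
    """Low pass filter each bin magnitude over successive frames.

    Returns None if the signal is shorter than one frame.
    """
    average = None
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        mags = dft_magnitudes(signal[start:start + frame_len])
        if average is None:
            average = mags
        else:
            average = [(1.0 - alpha) * a + alpha * m
                       for a, m in zip(average, mags)]
    return average


# Test tone that falls exactly on bin 8 of a 64-point transform.
tone = [math.sin(2.0 * math.pi * 8 * t / 64) for t in range(640)]
spectrum = long_term_spectrum(tone)
```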

The masking signal may be generated to have an average frequency spectrum characteristic which is similar to or the same as the one determined for the speech signal. For example, the output signal from a white noise generator may be filtered by a filter that has the same frequency response as the frequency spectrum determined for the speech signal. As a specific example, the output of the white noise generator may be converted to the frequency domain using an FFT corresponding to the one applied to the received signal. The value of each bin is then multiplied by a weight proportional to the relative signal level of the corresponding bin determined for the received signal. Such an approach may provide a masking signal that closely resembles the speech signal thereby resulting in an improved masking effect.
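The frequency-domain weighting may be sketched for a single frame as follows. The example assumes Gaussian white noise, a naive transform pair in place of an FFT, and weights normalised to the largest bin; a practical implementation would typically also use overlapping, windowed frames to avoid discontinuities at frame boundaries, which is not shown here.

```python
import cmath
import math
import random


def dft(x):
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(bins):
    n = len(bins)
    return [sum(bins[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


def shape_noise_frame(spectrum, rng):
    """Return one frame of noise whose spectrum follows `spectrum`.

    `spectrum` holds the measured levels for bins 0..N/2 of the speech signal.
    """
    half = len(spectrum) - 1
    n = 2 * half
    peak = max(spectrum) or 1.0
    weights = [s / peak for s in spectrum]           # relative level per bin
    noise = [rng.gauss(0.0, 1.0) for _ in range(n)]  # white noise frame
    bins = dft(noise)
    # Apply the same weight to each bin and its mirror image so the result is real.
    shaped = [bins[k] * weights[k if k <= half else n - k] for k in range(n)]
    return [v.real for v in idft(shaped)]


# Hypothetical speech spectrum with energy in bin 4 only (N = 64).
target = [0.0] * 33
target[4] = 1.0
frame = shape_noise_frame(target, random.Random(0))
```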

However, although it is effective for masking, noise may be considered an unpleasant sound to listen to. Therefore, the masking signal may in some embodiments be generated to more closely resemble a speech signal, i.e. it may be a signal which has the same or similar characteristics as speech, so that it sounds like speech but is unintelligible.

Such a masking signal can be generated from the actual speech input signal by dividing the signal into short (order of a phoneme or a syllable) segments, storing these segments in a buffer, and outputting them in random order. To make the masking properties of this signal even better, the individual segments can be divided into a number of frequency bands and the random re-ordering of the segments can be applied to each frequency band individually (see e.g.: J. Festen, 'Contributions of comodulation masking release and temporal resolution to the speech-reception threshold masked by an interfering voice', J. Acoust. Soc. Am. 94(3), pp. 1295-1300, 1993).

Thus, in some embodiments, the masking generator 105 segments the speech signal into a plurality of signal segments and then generates the masking signal by reordering the plurality of signal segments in the time domain.
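The time-domain reordering can be sketched in a few lines. The example below is a simplification under stated assumptions: it uses fixed-length segments, operates on the full band only (the per-band variant is not shown) and applies no cross-fading at the segment boundaries, so audible discontinuities may occur. The segment length is a free parameter; it would be chosen in the order of a phoneme or syllable.

```python
import random


def scramble_segments(samples, segment_len, rng):
    """Cut the signal into fixed-length segments and output them in random order."""
    segments = [samples[i:i + segment_len]
                for i in range(0, len(samples), segment_len)]
    rng.shuffle(segments)
    return [s for segment in segments for s in segment]


# Stand-in for a buffered speech signal: 100 samples, segments of 10 samples.
speech = list(range(100))
masking = scramble_segments(speech, 10, random.Random(1))
```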

Alternatively or additionally, the generated masking signal may be adapted to follow a temporal envelope characteristic of the speech signal. Specifically, the time-envelope of the masking signal can be modulated such that it follows the time-envelope of the original speech signal. This processing results in an unintelligible speech-like signal that sounds like many people with voices similar to that of the original speech signal talking at the same time (like a cocktail party). This may not only provide a masking signal that is less disturbing or frustrating but may also result in improved masking.
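A simple way of making the masking signal follow the envelope of the speech signal is sketched below. It assumes that the envelope is estimated by rectifying the speech signal and smoothing it with a first-order low pass filter, and that the masking signal has a roughly constant level before modulation; the smoothing constant `alpha` is an arbitrary example value.

```python
def envelope(signal, alpha=0.05):
    """Estimate the time-envelope by rectification and one-pole smoothing."""
    level = 0.0
    result = []
    for s in signal:
        level += alpha * (abs(s) - level)
        result.append(level)
    return result


def impose_envelope(masking, speech, alpha=0.05):
    """Modulate the masking signal with the envelope of the speech signal."""
    return [m * e for m, e in zip(masking, envelope(speech, alpha))]


# Speech that is silent for 50 samples and then active at constant level.
speech = [0.0] * 50 + [1.0] * 200
masking = [1.0] * 250
modulated = impose_envelope(masking, speech)
```

In this example the modulated masking signal stays at zero during the pause and rises smoothly towards full level once the speech becomes active.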

Fig. 7 illustrates an audio system wherein improved privacy is provided in a two-way speech communication application. The system corresponds to that of Fig. 3 but with the addition of functionality that allows locally generated speech to be picked up and communicated to a remote destination. Specifically, the audio system further comprises a microphone 701 which captures the audio environment and specifically generates a microphone signal comprising a speech signal for a user speaking into the microphone. In the example, the microphone is directional and has an increased sensitivity in a direction that falls within the area of reduced gain for the masking signal. Specifically, the direction of the microphone is angled towards the listening position for the desired speech signal radiated by the loudspeaker array 101. Thus, the system is adapted to a specific location for a user of the two way speech communication system.

It will be appreciated that in other embodiments, non-directional microphones may be used. For example, the system may use "close-miking" wherein a microphone is located close to the mouth of the user thereby predominantly picking up the speech of the user. This may e.g. be advantageous in embodiments wherein the user is wearing a headset with built-in microphone.

It will be appreciated that whereas Fig. 7 illustrates only a single microphone 701, this may represent a microphone array and associated beam forming means for generating a directional microphone sensitivity pattern as will be well known to the skilled person. Specifically, the microphone array 701 may be controlled similarly to the loudspeaker array 101 such that a microphone sensitivity beam is controlled to follow the beam generated for the desired speech signal.

The microphone 701 is coupled to a transmit controller 703 which is furthermore coupled to the interface 301. The transmit controller 703 is arranged to receive the microphone signal and to process this for communication. For example, the transmit controller 703 may be arranged to amplify and encode the microphone signal. The resulting signal is then transmitted to a remote destination by the interface 301.

Thus, in the example of Fig. 7, a location targeted two-way speech communication system is provided wherein the masking signal may effectively mask both the speech signal radiated by the loudspeaker array 101 towards the user and the actual speech of the user itself. Accordingly an improved privacy for the two-way speech communication may be achieved. Furthermore, an improved quality of the signal communicated to the remote destination is achieved by providing an improved separation between the microphone sensitivity and the masking signal radiation. Thus, as the microphone 701 picks up a reduced level of the masking signal the signal to noise ratio of the transmitted signal is improved.

In some embodiments, the masking signal generated by the masking generator 105 is furthermore generated in response to the signal from the microphone 701.

For example, referring to the scenario of Fig. 4, the voice of person A can also actively be made unintelligible to person B by generating an additional masking signal based on the speech signal of person A as recorded by the microphone 701. Thus, an additional masking signal can be generated from the microphone signal (e.g. using any of the techniques previously described for generating a masking signal from the received speech signal). This additional masking signal can be combined with the masking signal generated from the received speech signal and the combined signal can be radiated towards person B using beam 405.

Such an approach will allow a masking of both directions of the two way communication and may make the speech unintelligible to other users (e.g. user B).

Furthermore, in such a system, the relative amplitude of the received speech masking signal and the microphone masking signal may be adjusted to reflect the different requirements for the individual masking of the two different sources. Specifically, the level of the microphone masking signal may be set higher than that of the received speech masking signal to reflect that whereas the received speech signal is radiated in a highly directional beam, the speech from person A is likely to radiate in a much broader angle.

Indeed, as illustrated in Fig. 8, the described approach of generating a masking signal from a local microphone may be used in other applications and may specifically be used independently of any other audio output signal being generated by the loudspeaker array 101.

Thus, a system may be implemented wherein a masking signal is generated from a local microphone signal. In this example, only the voice of the user (or users) is picked up and the masking signal is generated from this and radiated from the loudspeaker array 101 using a directional radiation pattern, such as e.g. a narrow beam or a broader radiation pattern such as the one illustrated in Fig. 2. Such an application may e.g. be useful in open office environments where it may facilitate or enable private conversations using e.g. a hand-held telephone or a headset, or to enable a private conversation between two or more people who are all physically present.

In some embodiments, the system may furthermore comprise a directional display which is arranged to directionally display an image in a main direction within the first angular interval. Specifically, the display may be arranged to radiate an image in the direction towards the listening position. For example, for the scenario of Fig. 4, a display may be implemented that radiates the image signal in substantially the same direction 403 as the beam 401. Thus, in the example, the display has a limited viewing angle which is arranged to fall within the first angular interval.

For example, for laptops and other devices with a display, the described audio processing can be combined with visual privacy technology, so as to provide audio-visual privacy to the user. Examples of visual privacy technology that could be used include special foils that can be placed on a Liquid Crystal Display and which limit the viewing angle of the display, and lenticular displays (as are used for autostereoscopic 3D video today), which can be used for visual privacy purposes by displaying the visual content on only a subset of the total number of available views. This may for example enable private audiovisual chatting or conferencing using laptops or other portable audiovisual devices.

In some embodiments, the system may furthermore be operable to automatically detect the user direction and to adapt the generated radiation patterns in response to this direction. For example user direction tracking may be employed using e.g. video-based, acoustic, or RF tag approaches as will be known to the skilled person. The algorithms may track the desired user(s) and/or one or more users for which the signals should be masked. The angular direction to one or more of these users may be input to the array processor 103 which can proceed to modify the radiation patterns accordingly. Specifically, it may change the directions of the speech signal beam(s) and/or masking beam(s) accordingly.

In some embodiments, the audio system may furthermore comprise functionality for receiving a user input and the array processor 103 may be arranged to adapt a radiation pattern of the received speech signal and/or the masking signal in response to the user input. Specifically, the widths and/or directions of the received signal beam(s) and/or the masking signal beam(s) may be user-controllable thereby allowing the user to optimise the performance of the audio system for his personal preferences.

It will be appreciated that the above description for clarity has described embodiments of the invention with reference to different functional units and processors. However, it will be apparent that any suitable distribution of functionality between different functional units or processors may be used without detracting from the invention. For example, functionality illustrated to be performed by separate processors or controllers may be performed by the same processor or controllers. Hence, references to specific functional units are only to be seen as references to suitable means for providing the described functionality rather than indicative of a strict logical or physical structure or organization. The invention can be implemented in any suitable form including hardware, software, firmware or any combination of these. The invention may optionally be implemented at least partly as computer software running on one or more data processors and/or digital signal processors. The elements and components of an embodiment of the invention may be physically, functionally and logically implemented in any suitable way. Indeed the functionality may be implemented in a single unit, in a plurality of units or as part of other functional units. As such, the invention may be implemented in a single unit or may be physically and functionally distributed between different units and processors.

Although the present invention has been described in connection with some embodiments, it is not intended to be limited to the specific form set forth herein. Rather, the scope of the present invention is limited only by the accompanying claims. Additionally, although a feature may appear to be described in connection with particular embodiments, one skilled in the art would recognize that various features of the described embodiments may be combined in accordance with the invention. In the claims, the term comprising does not exclude the presence of other elements or steps. Furthermore, although individually listed, a plurality of means, elements or method steps may be implemented by e.g. a single unit or processor. Additionally, although individual features may be included in different claims, these may possibly be advantageously combined, and the inclusion in different claims does not imply that a combination of features is not feasible and/or advantageous. Also the inclusion of a feature in one category of claims does not imply a limitation to this category but rather indicates that the feature is equally applicable to other claim categories as appropriate. Furthermore, the order of features in the claims does not imply any specific order in which the features must be worked and in particular the order of individual steps in a method claim does not imply that the steps must be performed in this order. Rather, the steps may be performed in any suitable order. In addition, singular references do not exclude a plurality. Thus references to "a", "an", "first", "second" etc. do not preclude a plurality. Reference signs in the claims are provided merely as a clarifying example and shall not be construed as limiting the scope of the claims in any way.