ACOUSTIC SHOCK PROCESSING - ST ERICSSON FRANCE SAS

Title:

ACOUSTIC SHOCK PROCESSING

Document Type and Number:

WIPO Patent Application WO/2011/048101

Kind Code:

Abstract:

The invention proposes a method for processing acoustic shocks in an audio signal, comprising the steps of: - dividing the signal into frequency bands, - detecting a variation in the energy of the audio signal in each frequency band, - making a first comparison in order to compare the variation in energy for at least one frequency band with a first threshold, and - determining an acoustic shock, based on the result of the first comparison, when the variation is an increase greater than the first threshold.

Inventors:

LOUBOUTIN NICOLAS (FR)

Application Number:

PCT/EP2010/065729

Publication Date:

April 28, 2011

Filing Date:

October 19, 2010

Assignee:

ST ERICSSON FRANCE SAS (FR)
ST ERICSSON SA (CH)
LOUBOUTIN NICOLAS (FR)

International Classes:

A61F11/14; H04M1/60; H04R5/04

Domestic Patent References:

WO2007014795A2	2007-02-08

Foreign References:

GB2456296A	2009-07-15
US20050018862A1	2005-01-27
US20050058274A1	2005-03-17

Other References:

CHOY G ET AL: "Subband-Based Acoustic Shock Limiting Algorithm On A Low Resource DSP System", 1 September 2003 (2003-09-01), pages 2869, XP007007046

Attorney, Agent or Firm:

VERDURE, Stéphane et al. (52 rue de la Victoire, Paris Cedex 09, FR)

Claims:

CLAIMS

1. A method for processing acoustic shocks in an audio signal, comprising the steps of:

- dividing (S21) the signal into frequency bands,

- detecting (S23) a variation in the energy of the audio signal in each frequency band,

- making a first comparison (S24) in order to compare the variation in energy for at least one frequency band with a first threshold, and

- determining (S25) an acoustic shock, based on the result of the first comparison when the variation is an increase greater than the first threshold.

2. A method according to claim 1, wherein the threshold corresponds to an average energy of the audio signal.

3. A method according to either of the above claims, wherein the detection is based on several comparisons in several frequency bands.

4. A method according to any one of the above claims, additionally comprising the steps of:

- making a second comparison in order to compare an energy level with a second threshold, in at least one frequency band, and

- determining an acoustic shock in the audio signal, based on the results of the first and second comparisons.

5. A method according to claim 4, wherein the second threshold is representative of one of the following:

- a noise floor level,

- an environmental noise level of an acoustic system intended to reproduce the audio signal, and

- a fixed sound level for the reproduction of the audio signal by an acoustic system intended to reproduce the audio signal.

6. A method according to claim 5, wherein the second threshold is representative of a combination of these levels.

7. A method according to claim 4, wherein the second comparison is made for several thresholds representative of these levels.

8. A method according to any one of the above claims, additionally comprising:

- applying an acoustic compensation (COMPENS) to each frequency band in order to compensate for the effect of the acoustic system intended to reproduce the sound signal.

9. A method according to any one of the above claims, additionally comprising:

- attenuating the audio signal in a frequency band when an acoustic shock has been detected in the audio signal.

10. A method according to claim 9, wherein the audio signal is attenuated in a band in which an acoustic shock has been detected.

11. A computer program comprising instructions for implementing a process according to any one of the above claims when the program is executed by a processor.

12. An audio reproduction system, comprising:

- an acoustic system (SPK) for reproducing an audio signal,

- a memory (MEM) for storing signal processing data, and

- a processing unit (PROC) configured to implement a process according to any one of claims 1 to 10.

13. An integrated circuit comprising a system according to claim 12.

14. A communication terminal comprising a system according to claim 12.

Description:

ACOUSTIC SHOCK PROCESSING

Technical Field

The present invention relates to signal processing. More particularly, it aims to improve the detection and processing of anomalies in audio signals, in order to protect the organs of hearing of a user to whom the audio signal is sent.

Technological Background

Audio systems are subject to a certain number of regulations concerning the protection of users' organs of hearing.

Standards exist that define this protection. For example, there is the 2003/10/EC directive in Europe, the 29CFR1910.95 regulations in the United States, and the ITU-T P.360 recommendations of the International Telecommunications Union (ITU).

In most cases, these standards aim to prevent the risk of acoustic injury to users. Such injury, called acoustic trauma, is an irreversible injury to the organs of hearing due to exposure to sound of too high an intensity.

The common devices therefore already have sound limiters.

These limiters monitor the audio signal emitted over time, and detect peaks of sound intensity in order to cut off the audio signal when an intensity threshold is reached.

For example, directive 2003/10/EC sets the threshold at 137 dB(C) (where dB(C) represents a weighted decibel using a filter representing the frequency response of the human ear for signals of high intensity), the 29CFR1910.95 regulations set the threshold at 140 dB(C), and the ITU-T P.360 recommendations set the threshold at 125 dB(A) and 118 dB(A) for a headset (dB(A) represents a weighted decibel using a filter representing the frequency response of the human ear for signals of average intensity).

However, these measures do not prevent all user exposure to acoustic shock. An acoustic shock can be defined as exposure to a rapid transition of a sound of a given intensity to a sound of a higher intensity. In other words, it is the consequence of a sudden elevation of the sound level, more than the exceeding of an absolute threshold.

An acoustic shock can for example be caused by an audio signal coding error. It can also simply result from the dynamics of the changes in intensity and/or frequency in the reproduced sound.

An acoustic shock can occur in ranges of intensity that are lower than the thresholds recommended by the directives and regulations listed above.

Therefore the devices of the prior art do not always permit the detection of these acoustic shocks.

Depending on the difference in intensity, an acoustic shock can be very unpleasant for the user. In the worst cases, repeated acoustic shocks can irreversibly damage the inner ear.

In fact, because acoustic shocks occur in ranges of sound intensity that are not within the ranges of intensity where trauma can occur, the ear is not prepared for an increase in intensity. The user's body therefore does not have time to protect itself by contracting the muscles concerned, and injury to the inner ear can occur.

Summary of the Invention

A need to improve the acoustic protection for users therefore exists, particularly to protect them from acoustic shocks.

A method for processing acoustic shocks in an audio signal is proposed for this purpose, comprising the steps of:

- dividing the signal into frequency bands,

- determining an evolution in the energy of the audio signal for each frequency band,

- making a first comparison to compare an energy increase in the energy evolution with a first threshold, and

- detecting an acoustic shock based on the result of the first comparison.

The invention therefore proposes processing the signal by frequency bands in order to detect an acoustic shock. This allows acoustic shocks to be detected in a more refined manner, and the processing applied to a detected shock to be adapted in an equally refined manner.

The ear does not have the same sensitivity across the entire audible frequency spectrum. In addition, abrupt changes in the sound reproduced for the user can consist of a transfer of energy from one frequency band to another, without increasing the average energy level. The invention allows adapting the detection to each frequency band selected.

For example, the signal can be divided into three frequency bands: one for the lows, one for the mediums, and one for the highs. One can choose other frequency band subdivisions depending on the case. In particular, the invention is not limited by the number of bands. For example, it is possible to choose even smaller frequency bands when there are significant computational resources.

The method can be implemented in a sound reproduction device such as a portable audio or video player, a mobile telephone, or other device.

The method allows handling acoustic shocks which can result from coding errors of a recording medium during reads or writes, or coding errors in the transmission of an audio signal for example.

The energy of the audio signal can allow measuring the sound intensity of the corresponding acoustic signal.

For example, the threshold is an average energy of the audio signal. The threshold is thus adapted to each audio signal processed. An acoustic shock can have a very different effect depending on the signal being processed, because the effect of the shock can depend on the state of contraction of the muscles in the ear at the moment the shock occurs.

The detection can be based on several comparisons in several frequency bands. Thus one can choose the level of detail in the analysis for detecting the acoustic shock. As mentioned above, the ear does not have the same sensitivity across the entire range of audible frequencies. In addition, depending on the audio signal being processed, acoustic shocks can appear in particular frequency bands.

In some embodiments, there can additionally be the steps consisting of:

- performing a second comparison in order to compare an energy level of the audio signal with a second threshold, in each frequency band, and

- detecting an acoustic shock based on the result of the first and second comparisons.

In this manner, both the increases in energy and the initial level of the sound signal are taken into account.

In fact, if a large increase in the energy (or intensity) is detected while the average level of the audio signal renders it undetectable to the ear, the shock is harmless to the user's ear. It is then possible to refrain from determining whether an acoustic shock is present.

In respective embodiments, the second threshold is representative of one of the following:

- a noise floor level;

- an environmental noise level of an acoustic system intended to reproduce the audio signal; and

- a fixed sound level for the reproduction of the audio signal by an acoustic system intended to reproduce the audio signal.

A noise floor can be noise generated by sources of interference. For example, it can be noise introduced by components of the acoustic signal reproduction device. For example, the crackling of a speaker is part of the noise floor.

Environmental noise can be noise issuing from the environment in which the acoustic signal reproduction device is located. For example, if a mobile telephone is in a concert hall, the sound of the music is environmental noise. Such noise can be captured by a microphone. A fixed sound level for the reproduction of the audio signal can correspond to the volume level as adjusted by the user.

One can also combine the results of several comparisons with more than one threshold from among those presented here.

The second threshold can also be representative of a combination of these levels.

By taking these different thresholds into account, the detection of acoustic shock can be fine-tuned. As has already been mentioned, the effect of an acoustic shock can depend on the state of the user's ear. Thus the noises mentioned above and the volume of the reproduced audio signal can render the determination of acoustic shock more or less relevant.
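By way of illustration only, the combination of these levels into a single second threshold can be sketched as follows. The function name `second_threshold_db`, the use of decibel values, and the choice of taking the highest available level are assumptions made for this sketch; the description does not specify how the levels are combined.

```python
def second_threshold_db(noise_floor_db=None, environment_db=None, volume_db=None):
    """Combine the available levels into one second threshold (in dB).

    Assumption: the highest available level is used, on the reasoning that
    a shock below the loudest of these levels is the least likely to matter.
    """
    levels = [level for level in (noise_floor_db, environment_db, volume_db)
              if level is not None]
    if not levels:
        raise ValueError("at least one level is required")
    return max(levels)
```

Other combinations (a weighted sum, or one comparison per level) would fit the description equally well.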

In some embodiments, a step can be provided consisting of applying, for the comparisons made, an acoustic compensation to each frequency band in order to compensate for the filter effect of an acoustic system intended to reproduce the audio signal.

In fact, the acoustic dynamics of the system can have non-negligible filter effects on the perception of the audio signal and on the effect of an acoustic shock. It can therefore be relevant to take it into account in acoustic shock detection.

Such compensation can for example take into account the mechanics of the system, the type of speaker, the digital filters used, audio gain, and more generally the acoustic response of the system, or some other aspect.

In some embodiments, steps can be provided consisting of attenuating the audio signal in a frequency band when an acoustic shock has been detected in the audio signal.

Thus the appearance of acoustic shock in the reproduced acoustic signal is prevented in order to protect the user's ear.

For example, the audio signal is attenuated only in a frequency band in which an acoustic shock has been detected. Thus the response to the detection of an acoustic shock can be fine-tuned. By restricting the attenuation of the signal to a frequency band, there is less degradation to the user's general perception of the audio signal.

Excessive degradation of a signal in which an acoustic shock would have been erroneously detected can thus be avoided.

The level of detail of the detection can depend on the complexity of the processing (number of frequency bands, size and rate of advancement of the measurement window).

Other aspects of the invention also allow for:

- a computer program comprising instructions for implementing a method according to the invention, when the program is executed by a processor;

- an audio reproduction system according to the invention;

- an integrated circuit comprising a system according to the invention; and

- a communication terminal comprising a system according to the invention.

The computer program, the integrated circuit, the system, as well as the terminal present at least the same advantages as those provided by the method according to the first aspect of the invention.

Brief Description of the Drawings

Other features and advantages of the invention will become apparent from reading the following description. This description is purely illustrative and is to be read in light of the attached drawings, in which:

- Figure 1 illustrates a general context for implementing an embodiment of the invention;

- Figure 2 is a flowchart of the steps of a process according to an embodiment of the invention;

- Figure 3 illustrates different objects implemented in the steps of the flowchart in Figure 2;

- Figure 4 illustrates the compensation of the energy evolution spectrum according to one embodiment;

- Figure 5 illustrates the attenuation of an audio signal according to one embodiment; and

- Figure 6 illustrates a system according to one embodiment of the invention.

Detailed Description of Embodiments

A general context for implementing an embodiment of the invention is described with reference to Figure 1. In this context, an audio signal is sent from a source SRC to a user USR. The audio signal passes through a communication channel TRANS to an audio reproduction device RECEIV. Once the audio signal is received by the device RECEIV, it is emitted for reproduction for the user.

For example, the device RECEIV is a mobile communication terminal, the communication channel TRANS represents a telecommunications network, and the source SRC represents another communication terminal which communicates with the device RECEIV.

As a further example, the source SRC is a means of storing data such as a digital compact disk, a memory card, a hard drive, a USB key, or other device; the device RECEIV is an audio or video player, and the communication channel TRANS represents the decoding and player circuits of the device RECEIV.

In the context of this embodiment, assume that an error ERR occurs in the communication channel TRANS. For example, a decoding error or a transmission error occurs.

As will be apparent to a person skilled in the art, such an error can give rise to an acoustic shock in the audio signal, which is sent to the user USR and which can be unpleasant or even hazardous to his physical well-being.

An embodiment of a process according to the invention is described with reference to Figures 2 and 3. Figure 2 is a flowchart representing steps implemented in this embodiment. Figure 3 represents certain objects used in this embodiment.

During a first step S20, an audio signal SIG to be sent to a user is received. Detection of a possible acoustic shock in this signal is proposed.

For this purpose, its frequency spectrum is established over various intervals of time during the step S21.

As illustrated in Figure 3, for the interval T1 a first spectrum SP1 is obtained, subdivided into three frequency bands B1, B2, and B3. Also obtained for the interval T2 is a second spectrum SP2, subdivided in the same manner as SP1.

For example, the band B1 corresponds to the frequencies of 20 to 500 Hz (low), the band B2 to the frequencies 500 Hz to 5 kHz (medium), and the band B3 from 5 kHz to 20 kHz (high).

As a further example, the bands B1, B2, and B3 are of identical size. A person skilled in the art can choose this size as a function of the computational resources available for implementing the process: the more resources are available, the narrower and more numerous the frequency bands can be.

Using these spectra, Fast Fourier Transforms (FFT) are performed in the step S22 to obtain the energy of the audio signal in the intervals T1 and T2.
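By way of illustration only, the steps S21 and S22 can be sketched in Python as follows. The band edges are those of the example above; the names `BANDS`, `band_index`, and `band_energies` are hypothetical, and a naive discrete Fourier transform stands in for the FFT so that the sketch needs only the standard library.

```python
import cmath
import math

# Band edges in Hz, from the example above (B1, B2, B3).
BANDS = [(20.0, 500.0), (500.0, 5000.0), (5000.0, 20000.0)]

def band_index(freq, bands):
    """Return the index of the band containing freq, or None."""
    for b, (low, high) in enumerate(bands):
        if low <= freq < high:
            return b
    return None

def band_energies(frame, sample_rate, bands=BANDS):
    """Return the spectral energy of one frame in each frequency band."""
    n = len(frame)
    energies = [0.0] * len(bands)
    for k in range(1, n // 2 + 1):          # positive-frequency bins only
        b = band_index(k * sample_rate / n, bands)
        if b is None:
            continue                        # bin lies outside every band
        # Naive DFT coefficient; an FFT would be used in practice.
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(frame))
        energies[b] += abs(coeff) ** 2
    return energies
```

Calling `band_energies` on the frames of the intervals T1 and T2 gives the two sets of band energies compared in the following step.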

Then in the step S23 a variation in the energy between the intervals T1 and T2 is determined, for each frequency band B1, B2, and B3.
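As a minimal sketch of the step S23, the variation can be expressed as a ratio in decibels for each band. Expressing the variation in decibels, the function name `energy_variation_db`, and the small constant `eps` that guards against silent bands are assumptions of this sketch.

```python
import math

def energy_variation_db(energies_t1, energies_t2, eps=1e-12):
    """Per-band energy change from T1 to T2 in dB (positive means increase)."""
    return [10.0 * math.log10((e2 + eps) / (e1 + eps))
            for e1, e2 in zip(energies_t1, energies_t2)]
```

For instance, a band whose energy is multiplied by 1000 between T1 and T2 shows a variation of about 30 dB, while an unchanged band shows 0 dB.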

In the example illustrated by Figure 3, one can see in the graph ΔΕ that in the band B3 the energy of the audio signal has increased more than in the bands B1 and B2.

For each band, the energy increase is compared to a threshold during the step S24.

The result of this comparison allows determining during the step S25 whether the audio signal includes an acoustic shock. In digital applications one can choose values of 40 ms for the intervals T1 and T2. This value is relevant for the detection of acoustic shock because it is the average period required for integration by the human ear.

The duration of the intervals (or window) can be configurable as will be evident to a person skilled in the art. These intervals advance along the signal at a rate which can also be configured as a function of the computational performance.
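A minimal sketch of such configurable intervals is given below. The 40 ms window is the value suggested above; the 20 ms rate of advancement and the name `frame_signal` are assumptions chosen for illustration.

```python
def frame_signal(samples, sample_rate, window_ms=40.0, hop_ms=20.0):
    """Yield successive analysis windows advancing along the signal."""
    window = int(round(sample_rate * window_ms / 1000.0))
    hop = int(round(sample_rate * hop_ms / 1000.0))
    if window <= 0 or hop <= 0:
        raise ValueError("window and hop must span at least one sample")
    # Only complete windows are produced; a trailing partial one is dropped.
    for start in range(0, len(samples) - window + 1, hop):
        yield samples[start:start + window]
```

At a sampling rate of 8 kHz, a 40 ms window holds 320 samples and a 20 ms advance moves it by 160 samples. A larger advance reduces the computational load at the cost of a coarser detection.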

It is also possible to use several windows of different sizes at the same time in order to refine the measurements (for example fast high frequency detection in a small window, with low frequency detection requiring a larger window).

An energy increase threshold can be 50 dB SPL. This threshold allows protecting the ear at rest for all sounds higher than 20 dB SPL because the ear muscle reflex appears at about 70 dB SPL ("Sound Pressure Level").

This muscle reflex appears at about 70 dB SPL, but only briefly: it is a reflex triggered by a sudden increase and is maintained for a short time only. It takes about 40 ms to trigger.
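The comparison of the step S24 and the decision of the step S25 can be sketched as follows, using the example threshold of 50 given above. The names `shock_bands` and `has_shock` are hypothetical, and the per-band variations are assumed to be already expressed in decibels.

```python
def shock_bands(delta_db, threshold_db=50.0):
    """Return the indices of the bands whose energy increase exceeds the threshold."""
    return [b for b, delta in enumerate(delta_db) if delta > threshold_db]

def has_shock(delta_db, threshold_db=50.0):
    """Decide that a shock is present if at least one band exceeds the threshold."""
    return bool(shock_bands(delta_db, threshold_db))
```

A decrease in energy, however large, yields a negative variation and therefore never triggers the detection.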

A dynamic threshold can also be chosen, for example, the average energy of the audio signal measured over a given interval of time (shorter or longer depending on the computational resources available).
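A minimal sketch of such a dynamic threshold is given below. The class name `AverageEnergyThreshold`, the history length of 25 frames, and the comparison in the linear energy domain are assumptions; the description only states that the threshold can be an average energy measured over a given interval.

```python
from collections import deque

class AverageEnergyThreshold:
    """Running average of recent frame energies, used as a threshold."""

    def __init__(self, history_frames=25):
        # A shorter history costs less memory and reacts faster.
        self._history = deque(maxlen=history_frames)

    def update(self, energy):
        self._history.append(energy)

    def value(self):
        if not self._history:
            return float("inf")   # no history yet: never trigger
        return sum(self._history) / len(self._history)

def increase_exceeds_average(previous_energy, new_energy, threshold):
    """Compare the energy increase with the running average energy."""
    return (new_energy - previous_energy) > threshold.value()
```

One such object would be kept per frequency band, and updated with each new frame after the comparison has been made.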

In certain embodiments, the choice of threshold can also take into account the noise attributed to the acoustic system components, or the noise attributed to the environment in which the reproduction of the audio signal occurs. In addition, the volume selected by the user for listening to the audio signal can also be taken into account.

For example, in a very noisy environment, the user's ear is already in a condition that provides better resistance to acoustic shock because the muscles of the ear are already contracted in order to attenuate these noises.

In order to refine the detection of acoustic shocks, one can perform, in addition to the comparison of the energy increase to a threshold, the comparison to another threshold of the intensity (or energy) of the audio signal at the moment the shock was detected. For example, this second comparison is made at the highest level of the acoustic shock.

This can allow ignoring acoustic shocks which are situated at very low levels, and which therefore do not represent a danger or even a true annoyance to the user.
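The two comparisons can be combined as in the following sketch. The name `confirmed_shock` and the level threshold of 40 dB are hypothetical values chosen for illustration; only the increase threshold of 50 comes from the description.

```python
def confirmed_shock(delta_db, peak_level_db,
                    increase_threshold_db=50.0, level_threshold_db=40.0):
    """Report a shock only if the increase is large and its peak level is high enough."""
    large_increase = delta_db > increase_threshold_db       # first comparison
    loud_enough = peak_level_db > level_threshold_db        # second comparison
    return large_increase and loud_enough
```

With these values, an increase of 55 dB peaking at 30 dB is ignored, whereas the same increase peaking at 90 dB is reported.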

Also for the purposes of refining the detection, one can also base the detection of acoustic shocks on the detection of energy increases in several frequency bands. For example, it is possible to select a set of frequency bands sensitive for the user (for example pre-established frequency bands, or frequency bands specific to each user) and to decide on the presence of an acoustic shock only if an increase was detected in each sensitive frequency band.
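This variant can be sketched as follows, assuming per-band variations in decibels and a hypothetical function name `shock_in_sensitive_bands`. An empty set of sensitive bands is treated here as "no shock", which is a choice of this sketch.

```python
def shock_in_sensitive_bands(delta_db, sensitive_bands, threshold_db=50.0):
    """Decide on a shock only if every sensitive band shows a large increase."""
    if not sensitive_bands:
        return False
    return all(delta_db[b] > threshold_db for b in sensitive_bands)
```

The set of sensitive bands can be pre-established or specific to each user, as indicated above.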

In some embodiments, a compensation of the audio signal as a function of the acoustic properties of the audio signal reproduction system is provided.

For example, a filter representative of these properties is applied to the spectrum obtained during the step S21.

As a further example, a filter is applied to the energy evolution spectrum ΔΕ obtained during the step S23.

The compensation notably aims to take into account effects which do not arise from the signal itself, and from any acoustic shocks that it contains, but from effects of the signal reproduction system.

For example, if the system has the effect of greatly reducing the low frequencies (this is the case for example for speakers of small dimensions such as those found in laptop computers), the effect of acoustic shocks will also be greatly reduced in the low frequencies, to the extent that it may be relevant to take this into account in the shock detection.

The acoustic properties of the system can for example comprise the dynamics of the speakers, the audio gain, the configuration and type of materials used to form the cases of the resonance chambers, etc.

Figure 4 illustrates a compensation according to one embodiment. In this Figure 4 the energy evolution spectrum for an audio signal ΔΕ on different frequency bands B1 to B6 is represented, before (the left graph ΔΕ) and after (the right graph LEm) the application of a compensation filter COMPENS.

As illustrated by this Figure 4, the filter COMPENS has a general bell shape that first rises then descends, with a peak in the B5 band. The filter COMPENS represents a system which greatly attenuates the sounds in the bands B1-B3, and to a lesser extent in the band B6.

Thus, on the right of the figure we find the energy evolution spectrum of the audio signal after filtering. One can see that the increases in the bands B1 and B2 have been erased, and those in the bands B3 and B6 greatly lessened.

The application of the filter therefore allowed focusing on the bands B4 and B5 in order to make the comparison with the thresholds and detect an acoustic shock.
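The compensation of Figure 4 can be sketched as a per-band weighting of the energy evolution spectrum. The weights in `COMPENS` below are hypothetical values that merely follow the shape described (zero in B1 and B2, small in B3, peak in B5, reduced in B6); they are not taken from the figure.

```python
# Hypothetical per-band weights for B1..B6, peaking in B5.
COMPENS = [0.0, 0.0, 0.2, 0.9, 1.0, 0.4]

def compensate(delta_e, weights=COMPENS):
    """Weight the energy evolution of each band by the system response."""
    if len(delta_e) != len(weights):
        raise ValueError("one weight is required per frequency band")
    return [delta * weight for delta, weight in zip(delta_e, weights)]
```

With equal increases in all six bands, the compensated spectrum retains significant values only in B4 and B5, which are then the bands compared with the thresholds.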

In another embodiment (not represented), rather than applying a filter, the comparison thresholds are modified as a function of the properties of the acoustic system. This embodiment can economize the calculations to be done.
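A sketch of this alternative is given below, assuming that the properties of the acoustic system are summarized by one positive weight per band (a hypothetical representation). Dividing the threshold by the weight once, in advance, gives the same decision as multiplying the energy evolution by the weight for every frame.

```python
def adjusted_thresholds(base_threshold, weights):
    """Precompute one threshold per band from the system weights.

    A band that the system removes entirely (weight 0) gets an infinite
    threshold, so it can never trigger a detection.
    """
    return [base_threshold / w if w > 0 else float("inf") for w in weights]
```

The thresholds are computed once for a given acoustic system, which removes one multiplication per band and per frame from the detection.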

When an acoustic shock has been detected, one can for example apply an attenuation filter to the audio signal, in the frequency band where the acoustic shock is located.

Figure 5 illustrates a spectrum of an audio signal, divided into three frequency bands B1, B2, and B3 before (the left graph SP) and after (the right graph SPATT) the application of an attenuation filter ATT. Let us assume that an acoustic shock has been detected in the frequency band B3. The filter ATT is then applied which is substantially flat in shape in the bands B1 and B2 and has a bell shape that rises and then falls in the band B3. The filter ATT therefore barely attenuates the signal in the bands B1 and B2 and attenuates the signal in the band B3. The attenuation of the signal can be correlated (for example proportionally) to the energy increase detected in the band B3. In other embodiments, the filter is applied only to the band where the shock was detected.
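An attenuation proportional to the detected increase can be sketched as follows. The name `attenuation_gains`, the factor `db_per_db`, and the ceiling `max_atten_db` are assumptions of this sketch, as is the simplification of a constant gain over each band in place of the bell shape of Figure 5.

```python
def attenuation_gains(delta_db, shock_band_indices, db_per_db=1.0, max_atten_db=60.0):
    """Return one linear amplitude gain per band.

    Bands without a detected shock keep a gain of 1.0; a band with a shock
    is attenuated in proportion to its energy increase (in dB).
    """
    gains = []
    for b, delta in enumerate(delta_db):
        if b in shock_band_indices:
            atten_db = min(max(delta, 0.0) * db_per_db, max_atten_db)
            gains.append(10.0 ** (-atten_db / 20.0))   # dB to amplitude ratio
        else:
            gains.append(1.0)
    return gains
```

For a shock of 40 dB detected in B3 only, the bands B1 and B2 are left untouched and the amplitude in B3 is multiplied by about 0.01.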

One can also provide other attenuations or compensations as they exist in the prior art.

A computer program of the invention can be realized according to a general algorithm deduced from the general flowchart in Figure 2 and from the present description.

An integrated circuit of the invention can be realized by techniques known to a person skilled in the art and be configured to implement a process according to the invention. For example, a system of the invention can be realized in an integrated circuit in the form of a System on Chip (SoC).

An audio reproduction system according to an embodiment of the invention is described with reference to Figure 6.

The system comprises a processing unit PROC configured to implement a process of the invention. For example, the unit PROC comprises an integrated circuit of the invention, and as a further example the unit PROC comprises a processor for implementing a computer program according to the invention.

The unit PROC comprises an acoustic shock detection unit DTCT for detecting an acoustic shock in an audio signal. The unit PROC also comprises an attenuation unit ATT for attenuating the audio signal when an acoustic shock has been detected.

The system also comprises a unit for reproducing the audio signal SPK. For example, the unit for reproducing the audio signal SPK is a speaker or an audio headset.

The system also comprises a memory MEM for storing various data, for example computational data for the processing unit, or a computer program according to the invention so that it can be implemented by a processor of the unit PROC.

For example, the audio signal processed by the processing unit PROC is stored in the memory MEM. As a further example, the audio signal is received by a communication unit COM.

The system can for example be a mobile communications terminal.

The invention has been described and illustrated in the present detailed description and in the Figures. The invention is not limited to the embodiments presented. Other variations and embodiments can be deduced and implemented by a person skilled in the art upon reading the present description and the attached drawings.

In the claims, the term "comprise" does not exclude other elements or other steps. The indefinite article "a" does not exclude the plural. One processor or multiple processing units can be used to implement the invention. The various characteristics presented and/or claimed can advantageously be combined. Their presence in the description or in different dependent claims does not exclude this possibility. The reference labels are not to be understood as limiting the scope of the invention.
