Title:
AN AUDIO SIGNAL PROCESSING APPARATUS AND METHOD
Document Type and Number:
WIPO Patent Application WO/2017/097324
Kind Code:
A1
Abstract:
The invention relates to an audio signal processing apparatus (100) for processing an input audio signal (101) to be transmitted to a listener in such a way that the listener perceives the input audio signal (101) to come from a virtual target position defined by an azimuth angle and an elevation angle relative to the listener, the audio signal processing apparatus (100) comprising a memory (103) configured to store a set of pairs of predefined left ear and right ear transfer functions, which are predefined for a plurality of reference positions relative to the listener, wherein the plurality of reference positions lie in a two-dimensional plane, a determiner (105) configured to determine a pair of left ear and right ear transfer functions on the basis of the set of pairs of predefined left ear and right ear transfer functions for the azimuth angle and the elevation angle of the virtual target position and an adjustment filter (107) configured to filter the input audio signal (101) on the basis of the determined pair of left ear and right ear transfer functions and an adjustment function (109) configured to adjust a delay between the left ear transfer function and the right ear transfer function of the determined pair of left ear and right ear transfer functions and a frequency dependence of the left ear transfer function and the right ear transfer function of the determined pair of left ear and right ear transfer functions as a function of the azimuth angle and/or the elevation angle of the virtual target position in order to obtain a left ear output audio signal (111a) and a right ear output audio signal (111b).

Inventors:
PANG LIYUN (DE)
GROSCHE PETER (DE)
FALLER CHRISTOF (CH)
FAVROT ALEXIS (CH)
Application Number:
PCT/EP2015/078805
Publication Date:
June 15, 2017
Filing Date:
December 07, 2015
Assignee:
HUAWEI TECH CO LTD (CN)
PANG LIYUN (DE)
GROSCHE PETER (DE)
International Classes:
H04S1/00
Domestic Patent References:
WO1999031938A1 1999-06-24
Foreign References:
US20010040968A1 2001-11-15
US5440639A 1995-08-08
Other References:
R. O. DUDA: "Modeling head related transfer functions", 27TH ASILOMAR CONFERENCE ON SIGNALS, SYSTEMS AND COMPUTERS, 1993
V. R. ALGAZI ET AL.: "The use of head-and-torso models for improved spatial sound synthesis", AES 113TH CONVENTION, October 2002 (2002-10-01)
H. GAMPER: "Head-related transfer function interpolation in azimuth, elevation and distance", JASA EXPRESS LETTERS, 2013
Attorney, Agent or Firm:
KREUZ, Georg (DE)
Claims:
CLAIMS

1. An audio signal processing apparatus (100) for processing an input audio signal (101) to be transmitted to a listener in such a way that the listener perceives the input audio signal (101) to come from a virtual target position defined by an azimuth angle and an elevation angle relative to the listener, the audio signal processing apparatus (100) comprising: a memory (103) configured to store a set of pairs of predefined left ear and right ear transfer functions, which are predefined for a plurality of reference positions relative to the listener, wherein the plurality of reference positions lie in a two-dimensional plane; a determiner (105) configured to determine a pair of left ear and right ear transfer functions on the basis of the set of pairs of predefined left ear and right ear transfer functions for the azimuth angle and the elevation angle of the virtual target position; and an adjustment filter (107) configured to filter the input audio signal (101) on the basis of the determined pair of left ear and right ear transfer functions and an adjustment function (109) configured to adjust a delay between the left ear transfer function and the right ear transfer function of the determined pair of left ear and right ear transfer functions and a frequency dependence of the left ear transfer function and the right ear transfer function of the determined pair of left ear and right ear transfer functions as a function of the azimuth angle and/or the elevation angle of the virtual target position in order to obtain a left ear output audio signal (111a) and a right ear output audio signal (111b).

2. The audio signal processing apparatus (100) of claim 1, wherein the adjustment filter (107) is configured to adjust the delay between the left ear transfer function and the right ear transfer function of the determined pair of left ear and right ear transfer functions as a function of the azimuth angle and/or the elevation angle of the virtual target position by compensating for sound travel time differences associated with the distance between the virtual target position and a left ear of the listener and the distance between the virtual target position and a right ear of the listener.

3. The audio signal processing apparatus (100) of claims 1 or 2, wherein the adjustment filter (107) is configured to adjust the delay between the left ear transfer function and the right ear transfer function of the determined pair of left ear and right ear transfer functions as a function of the azimuth angle and/or the elevation angle of the virtual target position on the basis of the following equations:

$\tau_L(\Theta) = \tau(\Theta + \pi/2)$ and

$\tau_R(\Theta) = \tau(\Theta - \pi/2)$,

wherein $\tau_L$ denotes a delay applied to the left ear transfer function, wherein $\tau_R$ denotes a delay applied to the right ear transfer function and wherein τ and Θ are defined on the basis of the following equations:

[equations not legible in the source text]

wherein τ denotes a delay in seconds, c denotes the velocity of sound, a denotes a parameter associated with the head of a listener, Θ denotes the azimuth angle of the virtual target position and φ denotes the elevation angle of the virtual target position.

4. The audio signal processing apparatus (100) of any one of the preceding claims, wherein the adjustment filter (107) is configured to adjust the frequency dependence of the left ear transfer function and the right ear transfer function of the determined pair of left ear and right ear transfer functions as a function of the azimuth angle and/or the elevation angle of the virtual target position on the basis of a plurality of infinite impulse response filters (401a,b, 403a-c), wherein the plurality of infinite impulse response filters (401a,b, 403a-c) are configured to approximate at least a portion of the frequency dependence of a left ear transfer function and a right ear transfer function of a plurality of pairs of measured left ear and right ear transfer functions as a function of the azimuth angle and/or the elevation angle of the virtual target position.

5. The audio signal processing apparatus (100) of claim 4, wherein the frequency dependence of each infinite impulse response filter (401a,b, 403a-c) is defined by a plurality of predefined filter parameters and wherein the plurality of predefined filter parameters are selected such that the frequency dependence of each infinite impulse response filter (401a,b, 403a-c) approximates at least a portion of the frequency dependence of a left ear transfer function or a right ear transfer function of the plurality of pairs of measured left ear and right ear transfer functions as a function of the azimuth angle and/or the elevation angle of the virtual target position.

6. The audio signal processing apparatus (100) of claim 5, wherein the plurality of infinite-impulse-response filters (401a,b, 403a-c) comprises a plurality of biquad filters (401a,b, 403a-c), wherein the plurality of biquad filters can be implemented as parallel filters or cascaded filters.

7. The audio signal processing apparatus (100) of claim 6, wherein the plurality of biquad filters (401a,b, 403a-c) comprises at least one shelving filter (401a,b), wherein the at least one shelving filter (401a,b) is defined by a cut-off frequency parameter f0 and a gain parameter g0, and/or at least one peaking filter (403a-c), wherein the at least one peaking filter (403a-c) is defined by a cut-off frequency parameter f0, a gain parameter g0 and a bandwidth parameter Δ0.

8. The audio signal processing apparatus (100) of claim 7, wherein for at least one infinite impulse response filter (403a-c) of the plurality of infinite impulse response filters the plurality of predefined filter parameters are selected by determining a frequency and an azimuth angle and/or an elevation angle, at which a left ear transfer function or a right ear transfer function of the plurality of pairs of measured left ear and right ear transfer functions has a minimal or maximal magnitude, and by approximating the frequency dependence of the left ear transfer function or the right ear transfer function of the plurality of pairs of measured left ear and right ear transfer functions by the frequency dependence of the at least one infinite impulse response filter (403a-c).

9. The audio signal processing apparatus (100) of claim 7 or 8, wherein the cut-off frequency parameter f0, the gain parameter g0 and/or the bandwidth parameter Δ0 are determined on the basis of the following equations:

[equations not legible in the source text]

wherein $M_{f,g,\Delta}$ and $m_{f,g,\Delta}$ denote maximal and minimal values of f, g, Δ, respectively, and wherein $a_{f,g}$ denote coefficients controlling the speed of changing the corresponding filter parameters.

10. The audio signal processing apparatus (100) of any one of the preceding claims, wherein the adjustment filter (107) is configured to filter the input audio signal (101) on the basis of the determined pair of left ear and right ear transfer functions and the adjustment function (109) by convolving the adjustment function (109) with the left ear transfer function and by convolving the result with the input audio signal (101) in order to obtain the left ear output audio signal (111a) and/or by convolving the adjustment function (109) with the right ear transfer function and by convolving the result with the input audio signal (101) in order to obtain the right ear output audio signal (111b).

11. The audio signal processing apparatus (100) of any one of claims 1 to 10, wherein the adjustment filter (107) is configured to filter the input audio signal (101) on the basis of the determined pair of left ear and right ear transfer functions and the adjustment function (109) by convolving the left ear transfer function with the input audio signal (101) and by convolving the result with the adjustment function (109) in order to obtain the left ear output audio signal (111a) and/or by convolving the right ear transfer function with the input audio signal (101) and by convolving the result with the adjustment function (109) in order to obtain the right ear output audio signal (111b).

12. The audio signal processing apparatus (100) of any one of the preceding claims, wherein the audio signal processing apparatus (100) further comprises a pair of transducers, in particular headphones or loudspeakers using crosstalk cancellation, configured to output the left ear output audio signal (111a) and the right ear output audio signal (111b).

13. The audio signal processing apparatus (100) of any one of the preceding claims, wherein the pairs of predefined left ear and right ear transfer functions are predefined for a plurality of reference positions relative to the listener, which lie in the horizontal plane relative to the listener.

14. The audio signal processing apparatus (100) of any one of the preceding claims, wherein the determiner (105) is configured to determine the pair of left ear and right ear transfer functions on the basis of the set of pairs of predefined left ear and right ear transfer functions for the azimuth angle and the elevation angle of the virtual target position by selecting a pair of left ear and right ear transfer functions from the set of pairs of predefined left ear and right ear transfer functions for the azimuth angle and the elevation angle of the virtual target position and/or by interpolating a pair of left ear and right ear transfer functions on the basis of the set of pairs of predefined left ear and right ear transfer functions for the azimuth angle and the elevation angle of the virtual target position.

15. An audio signal processing method (1000) for processing an input audio signal (101) to be transmitted to a listener in such a way that the listener perceives the input audio signal (101) to come from a virtual target position defined by an azimuth angle and an elevation angle relative to the listener, the audio signal processing method (1000) comprising: determining (1001) a pair of left ear and right ear transfer functions on the basis of a set of pairs of predefined left ear and right ear transfer functions for the azimuth angle and the elevation angle of the virtual target position, wherein the pairs of predefined left ear and right ear transfer functions are predefined for a plurality of reference positions relative to the listener, wherein the plurality of reference positions lie in a two-dimensional plane; and filtering (1003) the input audio signal (101) on the basis of the determined pair of left ear and right ear transfer functions and an adjustment function (109) configured to adjust a delay between the left ear transfer function and the right ear transfer function of the determined pair of left ear and right ear transfer functions and a frequency dependence of the left ear transfer function and the right ear transfer function of the determined pair of left ear and right ear transfer functions as a function of the azimuth angle and/or the elevation angle of the virtual target position in order to obtain a left ear output audio signal (111a) and a right ear output audio signal (111b).

16. A computer program comprising program code for performing the method (1000) of claim 15 when executed on a computer.

Description:
DESCRIPTION

An audio signal processing apparatus and method

TECHNICAL FIELD

Generally, the invention relates to the field of audio signal processing. More specifically, the invention relates to an audio signal processing apparatus and method allowing for generating a binaural audio signal from a virtual target position.

BACKGROUND

Human hearing can locate sounds in three dimensions: in range (distance), above and below (elevation), and in front, to the rear and to either side (azimuth). The properties of sound received by an ear from some point in space can be characterized by head-related transfer functions (HRTFs). Therefore, a pair of HRTFs for the two ears can be used to synthesize a binaural sound that seems to come from a target position, i.e. a virtual target position. Many applications of 3D audio using headphones, such as virtual reality, spatial teleconferencing and virtual surround, require high quality HRTF datasets, which contain transfer functions for all necessary directions. Some forms of HRTF processing have also been included in computer software to simulate surround sound playback from loudspeakers.

However, measuring HRTFs for all azimuth angles is a tedious task, which involves hardware and materials. Moreover, the memory required to store the database of measured HRTFs can be very large. Additionally, using personalized HRTFs can further improve the sound experience, but acquiring them complicates the synthesis of 3D sound.

The idea of a fully parametric model for deriving HRTFs to synthesize binaural sound has been proposed in R. O. Duda, "Modeling head related transfer functions", 27th Asilomar Conference on Signals, Systems and Computers, 1993 and V. R. Algazi et al, "The use of head-and-torso models for improved spatial sound synthesis", AES 113th Convention, Oct. 2002. However, for realistic binaural sound rendering the obtained HRTFs are not accurate enough, since these models strongly deviate from the personalized HRTFs. A lot of research has been conducted to develop a method to obtain HRTFs that would not strongly deviate from personalized (user specific) HRTFs.

3D HRTF interpolation can be used to obtain estimated HRTFs at the desired source position from measured HRTFs, as demonstrated in H. Gamper, "Head-related transfer function interpolation in azimuth, elevation and distance", JASA Express Letters, 2013. This technique requires HRTFs measured at nearby positions, e.g. four measurements forming a tetrahedron enclosing the desired position. Additionally, it is hard to achieve a correct elevation perception with this technique. Thus, there is a need for an improved audio signal processing apparatus and method allowing for generating a binaural audio signal from a virtual target position.

SUMMARY

It is an object of the invention to provide an improved audio signal processing apparatus and method allowing for generating a binaural audio signal from a virtual target position.

This object is achieved by the features of the independent claims. Further implementation forms are apparent from the dependent claims, the description and the figures.

According to a first aspect, the invention relates to an audio signal processing apparatus for processing an input audio signal to be transmitted to a listener in such a way that the listener perceives the input audio signal to come from a virtual target position defined by an azimuth angle and an elevation angle relative to the listener, the audio signal processing apparatus comprising: a memory configured to store a set of pairs of predefined left ear and right ear transfer functions, which are predefined for a plurality of reference positions relative to the listener, wherein the plurality of reference positions lie in a two-dimensional plane, a determiner configured to determine a pair of left ear and right ear transfer functions on the basis of the set of pairs of predefined left ear and right ear transfer functions for the azimuth angle and the elevation angle of the virtual target position and an adjustment filter configured to filter the input audio signal on the basis of the determined pair of left ear and right ear transfer functions and an adjustment function configured to adjust a delay between the left ear transfer function and the right ear transfer function of the determined pair of left ear and right ear transfer functions and a frequency dependence of the left ear transfer function and the right ear transfer function of the determined pair of left ear and right ear transfer functions as a function of the azimuth angle and/or the elevation angle of the virtual target position in order to obtain a left ear output audio signal and a right ear output audio signal.

Thus, an improved audio signal processing apparatus allowing for generating a binaural audio signal from a virtual target position is provided. In particular, the audio signal processing apparatus according to the first aspect allows extending a set of predefined transfer functions defined for virtual target positions in a two-dimensional plane, for instance in the horizontal plane (which for a given scenario are very often already available), relative to the listener, in a computationally efficient manner to the third dimension, i.e. to virtual target positions above or below this plane. This has, for instance, the beneficial effect that the memory required for storing the predefined transfer functions is significantly reduced.

The set of pairs of predefined left ear and right ear transfer functions can comprise pairs of predefined left ear and right ear head related transfer functions.

The set of pairs of predefined left ear and right ear transfer functions can comprise measured left ear and right ear transfer functions and/or modelled left ear and right ear transfer functions. Thus, the audio signal processing apparatus according to the first aspect can use a database of user-specific measured transfer functions for a more realistic sound perception or modelled transfer functions, if user-specific measured transfer functions are not available.

In a first possible implementation form of the audio signal processing apparatus according to the first aspect as such, the adjustment filter is configured to adjust the delay between the left ear transfer function and the right ear transfer function of the determined pair of left ear and right ear transfer functions as a function of the azimuth angle and/or the elevation angle of the virtual target position by compensating for sound travel time differences associated with the distance between the virtual target position and a left ear of the listener and the distance between the virtual target position and a right ear of the listener.

By introducing a delay as a function of the azimuth angle and/or the elevation angle of the virtual target position, sound travel time differences can be compensated, resulting in a more realistic sound perception by the listener.

In a second possible implementation form of the audio signal processing apparatus according to the first aspect as such or the first implementation form thereof, the adjustment filter is configured to adjust the delay between the left ear transfer function and the right ear transfer function of the determined pair of left ear and right ear transfer functions as a function of the azimuth angle and/or the elevation angle of the virtual target position on the basis of the following equations:

$\tau_L(\Theta) = \tau(\Theta + \pi/2)$ and

$\tau_R(\Theta) = \tau(\Theta - \pi/2)$,

wherein $\tau_L$ denotes a delay applied to the left ear transfer function, wherein $\tau_R$ denotes a delay applied to the right ear transfer function and wherein τ and Θ are defined on the basis of the following equations:

[equations not legible in the source text]

wherein τ denotes a delay in seconds, c denotes the velocity of sound, a denotes a parameter associated with the head of a listener, Θ denotes the azimuth angle of the virtual target position and φ denotes the elevation angle of the virtual target position.

Thus, a delay for compensating sound travel time differences as a function of the azimuth angle and/or the elevation angle of the virtual target position can be determined in a computationally efficient way.
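
As a rough illustration of the delay adjustment, the following Python sketch applies separate delays to a left ear and a right ear impulse response by prepending zero samples. It assumes that the two delays are already known in seconds (their derivation from the azimuth and elevation angles is not modelled here), that rounding to whole samples is acceptable, and that the transfer functions are available as time-domain impulse responses. The names `apply_interaural_delay` and `fs` are hypothetical and do not appear in the application.

```python
def apply_interaural_delay(h_left, h_right, tau_left, tau_right, fs):
    """Delay two impulse responses by tau_left / tau_right seconds.

    Assumption: delays are rounded to whole samples, and the smaller delay
    is mapped to zero samples so that only the relative delay is applied.
    """
    n_left = round(tau_left * fs)
    n_right = round(tau_right * fs)
    offset = min(n_left, n_right)  # keeps both sample delays non-negative
    n_left -= offset
    n_right -= offset
    length = max(len(h_left) + n_left, len(h_right) + n_right)

    def shift(h, n):
        # Prepend n zeros, then pad so both outputs share one length.
        out = [0.0] * n + list(h)
        return out + [0.0] * (length - len(out))

    return shift(h_left, n_left), shift(h_right, n_right)


# Hypothetical example: the right ear lags the left ear by 0.5 ms at 48 kHz.
left, right = apply_interaural_delay([1.0, 0.5], [1.0, 0.5], 0.0, 0.0005, 48000)
```

A fractional-delay filter could replace the integer rounding where sub-sample accuracy matters; the sketch keeps whole samples for brevity.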

In a third possible implementation form of the audio signal processing apparatus according to the first aspect as such or the first or second implementation form thereof, the adjustment filter is configured to adjust the frequency dependence of the left ear transfer function and the right ear transfer function of the determined pair of left ear and right ear transfer functions as a function of the azimuth angle and/or the elevation angle of the virtual target position on the basis of a plurality of infinite impulse response filters, wherein the plurality of infinite impulse response filters are configured to approximate at least a portion of the frequency dependence of a left ear transfer function and a right ear transfer function of a plurality of pairs of measured left ear and right ear transfer functions as a function of the azimuth angle and/or the elevation angle of the virtual target position.

By approximating measured transfer functions by IIR filters and considering only the main spectral features thereof, in particular those which are relevant for the perception of azimuth and/or elevation, the computational complexity can be reduced.

In a fourth possible implementation form of the audio signal processing apparatus according to the third implementation form of the first aspect, the frequency dependence of each infinite impulse response filter is defined by a plurality of predefined filter parameters and wherein the plurality of predefined filter parameters are selected such that the frequency dependence of each infinite impulse response filter approximates at least a portion, in particular prominent spectral features, such as a spectral maximum or a spectral minimum, of the frequency dependence of a left ear transfer function or a right ear transfer function of the plurality of pairs of measured left ear and right ear transfer functions as a function of the azimuth angle and/or the elevation angle of the virtual target position.

Defining each infinite impulse response filter by a finite set of filter parameters allows saving memory space, as only the filter parameters have to be saved in order to reconstruct the main spectral features of the measured transfer functions.

In a fifth possible implementation form of the audio signal processing apparatus according to the fourth implementation form of the first aspect, the plurality of infinite-impulse-response filters comprises a plurality of biquad filters, i.e. biquadratic filters. The plurality of biquad filters can be implemented as parallel filters or cascaded filters. The use of cascaded filters is preferred as it approximates the spectral features of the transfer functions better. The order of the plurality of biquad filters can be different.

In a sixth possible implementation form of the audio signal processing apparatus according to the fifth implementation form of the first aspect, the plurality of biquad filters comprises at least one shelving filter, wherein the at least one shelving filter is defined by a cut-off frequency parameter f0 and a gain parameter g0, and/or at least one peaking filter, wherein the at least one peaking filter is defined by a cut-off frequency parameter f0, a gain parameter g0 and a bandwidth parameter Δ0. The frequency dependence of shelving and/or peaking filters provides good approximations to the frequency dependence of the measured transfer functions on the basis of 2 or 3 filter parameters.

In a seventh possible implementation form of the audio signal processing apparatus according to the sixth implementation form of the first aspect, for at least one infinite impulse response filter of the plurality of infinite impulse response filters the plurality of predefined filter parameters are selected by determining a frequency and an azimuth angle and/or an elevation angle, at which a left ear transfer function or a right ear transfer function of the plurality of pairs of measured left ear and right ear transfer functions has a minimal or maximal magnitude, and by approximating the frequency dependence of the left ear transfer function or the right ear transfer function of the plurality of pairs of measured left ear and right ear transfer functions by the frequency dependence of the at least one infinite impulse response filter.

Thus, the predefined filter parameters can be determined in a computationally efficient way.
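
To make the role of the three peaking filter parameters more concrete, the following Python sketch builds peaking biquads from a frequency, a gain and a bandwidth and runs them as cascaded filters. The coefficient formulas are the widely used Audio EQ Cookbook peaking design, which is an assumption here since the application does not prescribe a design formula. Likewise, treating the bandwidth parameter as a width in Hz with Q = f0 / Δ0, and all function names and example values, are hypothetical. Shelving filters are omitted for brevity.

```python
import cmath
import math


def peaking_biquad(f0, g0_db, delta0, fs):
    """Return normalized (b, a) coefficients of a peaking biquad.

    Assumptions: f0 is the centre frequency in Hz, g0_db the gain in dB and
    delta0 a bandwidth in Hz, mapped to a quality factor Q = f0 / delta0.
    """
    amp = 10.0 ** (g0_db / 40.0)
    w0 = 2.0 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2.0 * f0 / delta0)
    a0 = 1.0 + alpha / amp
    b = [(1.0 + alpha * amp) / a0, -2.0 * math.cos(w0) / a0, (1.0 - alpha * amp) / a0]
    a = [1.0, -2.0 * math.cos(w0) / a0, (1.0 - alpha / amp) / a0]
    return b, a


def biquad_filter(b, a, x):
    """Filter the sequence x with one biquad (direct form I)."""
    y = []
    x1 = x2 = y1 = y2 = 0.0
    for xn in x:
        yn = b[0] * xn + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        x2, x1 = x1, xn
        y2, y1 = y1, yn
        y.append(yn)
    return y


def cascade(sections, x):
    """Run x through several biquads in series (cascaded filters)."""
    for b, a in sections:
        x = biquad_filter(b, a, x)
    return x


def gain_db(b, a, f, fs):
    """Magnitude response of one biquad at frequency f, in dB."""
    z = cmath.exp(-1j * 2.0 * math.pi * f / fs)
    h = (b[0] + b[1] * z + b[2] * z * z) / (a[0] + a[1] * z + a[2] * z * z)
    return 20.0 * math.log10(abs(h))


# Hypothetical parameters: a 6 dB dip at 8 kHz and a 4 dB peak at 4 kHz.
sections = [peaking_biquad(8000.0, -6.0, 2000.0, 48000.0),
            peaking_biquad(4000.0, 4.0, 1000.0, 48000.0)]
impulse_response = cascade(sections, [1.0] + [0.0] * 63)
```

With this design each section reaches exactly its gain parameter at its centre frequency and leaves DC untouched, so only two or three numbers per spectral feature need to be stored.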

In an eighth possible implementation form of the audio signal processing apparatus according to the sixth or seventh implementation form of the first aspect, the filter parameters, namely the cut-off frequency parameter f0, the gain parameter g0 and the bandwidth parameter Δ0, are determined on the basis of the following equations:

[equations not legible in the source text]

wherein $M_{f,g,\Delta}$ and $m_{f,g,\Delta}$ denote maximal and minimal values of f, g, Δ, respectively, and wherein $a_{f,g}$ denote coefficients controlling the speed of changing the corresponding filter design parameters.

In a ninth possible implementation form of the audio signal processing apparatus according to the first aspect as such or any one of the first to eighth implementation form thereof, the adjustment filter is configured to filter the input audio signal on the basis of the determined pair of left ear and right ear transfer functions and the adjustment function by convolving the adjustment function with the left ear transfer function and by convolving the result with the input audio signal in order to obtain the left ear output audio signal and/or by convolving the adjustment function with the right ear transfer function and by convolving the result with the input audio signal in order to obtain the right ear output audio signal.

In a tenth possible implementation form of the audio signal processing apparatus according to the first aspect as such or any one of the first to eighth implementation form thereof, the adjustment filter is configured to filter the input audio signal on the basis of the determined pair of left ear and right ear transfer functions and the adjustment function by convolving the left ear transfer function with the input audio signal and by convolving the result with the adjustment function in order to obtain the left ear output audio signal and/or by convolving the right ear transfer function with the input audio signal and by convolving the result with the adjustment function in order to obtain the right ear output audio signal.
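
The ninth and tenth implementation forms differ only in the order of the two convolutions. Because convolution is associative and commutative, both orders yield the same output signal, which the following Python sketch checks on short sequences. It assumes that the transfer function and the adjustment function are available as finite time-domain impulse responses; the sequences and names are hypothetical stand-ins for real data.

```python
def convolve(x, h):
    """Full linear convolution of two finite sequences."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


# Hypothetical short sequences standing in for real data.
signal = [1.0, -0.5, 0.25, 0.0, 0.75]  # input audio signal
hrir_left = [0.9, 0.3, -0.1]           # left ear transfer function (impulse response)
adjustment = [0.0, 1.0, 0.2]           # adjustment function (one-sample delay plus shaping)

# Ninth form: adjustment convolved with the transfer function, then with the input.
out_a = convolve(convolve(adjustment, hrir_left), signal)
# Tenth form: transfer function convolved with the input, then with the adjustment.
out_b = convolve(convolve(hrir_left, signal), adjustment)
```

The first order lets the combined filter be computed once per target position and reused for every input block, while the second leaves the stored transfer function untouched and applies the adjustment as a separate stage.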

In an eleventh possible implementation form of the audio signal processing apparatus according to the first aspect as such or any one of the first to tenth implementation form thereof, the audio signal processing apparatus further comprises a pair of transducers, in particular headphones or loudspeakers using crosstalk cancellation, configured to output the left ear output audio signal and the right ear output audio signal.

In a twelfth possible implementation form of the audio signal processing apparatus according to the first aspect as such or any one of the first to eleventh implementation form thereof, the pairs of predefined left ear and right ear transfer functions are predefined for a plurality of reference positions relative to the listener, which lie in the horizontal plane relative to the listener. That is, the set of pairs of predefined left ear and right ear transfer functions can consist of pairs of predefined left ear and right ear transfer functions for a plurality of different azimuth angles and a fixed zero elevation angle.

In a thirteenth possible implementation form of the audio signal processing apparatus according to the first aspect as such or any one of the first to twelfth implementation form thereof, the determiner is configured to determine the pair of left ear and right ear transfer functions on the basis of the set of pairs of predefined left ear and right ear transfer functions for the azimuth angle and the elevation angle of the virtual target position by selecting a pair of left ear and right ear transfer functions from the set of pairs of predefined left ear and right ear transfer functions for the azimuth angle and the elevation angle of the virtual target position and/or by interpolating a pair of left ear and right ear transfer functions on the basis of the set of pairs of predefined left ear and right ear transfer functions for the azimuth angle and the elevation angle of the virtual target position.
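
The selection and interpolation options of the determiner can be sketched in Python as follows. The sketch assumes that the stored set is a table keyed by azimuth in degrees, that each entry is a pair of equal-length time-domain impulse responses, and that linear interpolation between the two neighbouring reference azimuths is adequate. The application does not specify an interpolation method, so the function `determine_hrir_pair` and the example table are hypothetical.

```python
import bisect


def determine_hrir_pair(table, azimuth_deg):
    """Select or linearly interpolate a (left, right) impulse response pair.

    table maps azimuth in degrees (0 <= az < 360) to a (left, right) pair of
    equal-length lists, i.e. the stored horizontal-plane reference positions.
    """
    grid = sorted(table)
    az = azimuth_deg % 360.0
    if az in table or len(grid) == 1:
        # Exact reference position (or nothing to interpolate with): select.
        return table[az] if az in table else table[grid[0]]
    idx = bisect.bisect_left(grid, az)
    lo = grid[idx - 1]           # idx == 0 wraps around to the last entry
    hi = grid[idx % len(grid)]   # idx == len(grid) wraps around to the first
    span = (hi - lo) % 360.0
    weight = ((az - lo) % 360.0) / span

    def mix(u, v):
        return [(1.0 - weight) * p + weight * q for p, q in zip(u, v)]

    (l0, r0), (l1, r1) = table[lo], table[hi]
    return mix(l0, l1), mix(r0, r1)


# Hypothetical two-entry table with two-tap impulse responses.
table = {0.0: ([1.0, 0.0], [1.0, 0.0]),
         90.0: ([0.0, 1.0], [0.5, 0.0])}
selected = determine_hrir_pair(table, 90.0)      # stored pair, returned as is
interpolated = determine_hrir_pair(table, 45.0)  # halfway between the two entries
```

Only the azimuth is looked up because the reference positions lie in a single plane; the elevation dependence is left to the adjustment function.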

According to a second aspect, the invention relates to an audio signal processing method for processing an input audio signal to be transmitted to a listener in such a way that the listener perceives the input audio signal to come from a virtual target position defined by an azimuth angle and an elevation angle relative to the listener, the audio signal processing method comprising: determining a pair of left ear and right ear transfer functions on the basis of a set of pairs of predefined left ear and right ear transfer functions for the azimuth angle and the elevation angle of the virtual target position, wherein the pairs of predefined left ear and right ear transfer functions are predefined for a plurality of reference positions relative to the listener, wherein the plurality of reference positions lie in a two-dimensional plane, and filtering the input audio signal, e.g. by an adjustment filter, on the basis of the determined pair of left ear and right ear transfer functions and an adjustment function configured to adjust a delay between the left ear transfer function and the right ear transfer function of the determined pair of left ear and right ear transfer functions and a frequency dependence of the left ear transfer function and the right ear transfer function of the determined pair of left ear and right ear transfer functions as a function of the azimuth angle and/or the elevation angle of the virtual target position in order to obtain a left ear output audio signal and a right ear output audio signal.

In a first possible implementation form of the audio signal processing method according to the second aspect as such, the adjustment function is configured to adjust the delay between the left ear transfer function and the right ear transfer function of the determined pair of left ear and right ear transfer functions as a function of the azimuth angle and/or the elevation angle of the virtual target position by compensating for sound travel time differences associated with the distances between the virtual target position and a left ear of the listener and between the virtual target position and a right ear of the listener.

In a second possible implementation form of the audio signal processing method according to the second aspect as such or the first implementation form thereof, the adjustment function is configured to adjust the delay between the left ear transfer function and the right ear transfer function of the determined pair of left ear and right ear transfer functions as a function of the azimuth angle and/or the elevation angle of the virtual target position on the basis of the following equations:

$$\tau_L(\Theta) = \tau\left(\Theta + \frac{\pi}{2}\right) \quad \text{and} \quad \tau_R(\Theta) = \tau\left(\Theta - \frac{\pi}{2}\right),$$

wherein τ_L denotes a delay applied to the left ear transfer function, wherein τ_R denotes a delay applied to the right ear transfer function and wherein τ and Θ are defined on the basis of the following equations:

$$\tau(\Theta) = \frac{a}{c} \sin\Theta, \quad \text{and}$$

[equation defining Θ in terms of θ and φ; not recoverable from this text]

wherein τ denotes a delay in seconds, c denotes the velocity of sound, a denotes a parameter associated with the head of a listener, θ denotes the azimuth angle of the virtual target position and φ denotes the elevation angle of the virtual target position.

In a third possible implementation form of the audio signal processing method according to the second aspect as such or the first or second implementation form thereof, the adjustment function is configured to adjust the frequency dependence of the left ear transfer function and the right ear transfer function of the determined pair of left ear and right ear transfer functions as a function of the azimuth angle and/or the elevation angle of the virtual target position on the basis of a plurality of infinite impulse response filters, wherein the plurality of infinite impulse response filters are configured to approximate at least a portion of the frequency dependence of a left ear transfer function and a right ear transfer function of a plurality of pairs of measured left ear and right ear transfer functions as a function of the azimuth angle and/or the elevation angle of the virtual target position.

In a fourth possible implementation form of the audio signal processing method according to the third implementation form of the second aspect, the frequency dependence of each infinite impulse response filter is defined by a plurality of predefined filter parameters, wherein the plurality of predefined filter parameters are selected such that the frequency dependence of each infinite impulse response filter approximates at least a portion, in particular prominent spectral features, such as a spectral maximum or a spectral minimum, of the frequency dependence of a left ear transfer function or a right ear transfer function of the plurality of pairs of measured left ear and right ear transfer functions as a function of the azimuth angle and/or the elevation angle of the virtual target position.

In a fifth possible implementation form of the audio signal processing method according to the fourth implementation form of the second aspect, the plurality of infinite-impulse-response filters comprises a plurality of biquad filters, i.e. biquadratic filters. The plurality of biquad filters can be implemented as parallel filters or cascaded filters. The use of cascaded filters is preferred as it approximates the spectral features of the transfer functions better. The order of the plurality of biquad filters can be different.

In a sixth possible implementation form of the audio signal processing method according to the fifth implementation form of the second aspect, the plurality of biquad filters comprises at least one shelving filter, wherein the at least one shelving filter is defined by a cut-off frequency parameter f_0 and a gain parameter g_0, and/or at least one peaking filter, wherein the at least one peaking filter is defined by a cut-off frequency parameter f_0, a gain parameter g_0 and a bandwidth parameter Δ_0.

In a seventh possible implementation form of the audio signal processing method according to the sixth implementation form of the second aspect, for at least one infinite impulse response filter of the plurality of infinite impulse response filters the plurality of predefined filter parameters are selected by determining a frequency and an azimuth angle and/or an elevation angle, at which a left ear transfer function or a right ear transfer function of the plurality of pairs of measured left ear and right ear transfer functions has a minimal or maximal magnitude, and by approximating the frequency dependence of the left ear transfer function or the right ear transfer function of the plurality of pairs of measured left ear and right ear transfer functions by the frequency dependence of the at least one infinite impulse response filter.

In an eighth possible implementation form of the audio signal processing method according to the sixth or seventh implementation form of the second aspect, the filter parameters, namely the cut-off frequency parameter f_0, the gain parameter g_0 and the bandwidth parameter Δ_0, are determined on the basis of the following equations:

$$f_0 = \max\left(m_f, \min\left(M_f,\; a_f(\varphi - \varphi_p)^2 + f_p\right)\right),$$

$$g_0 = \max\left(m_g, \min\left(M_g,\; a_g(\varphi - \varphi_p)^2 + g_p\right)\right),$$

$$\Delta_0 = \max\left(m_\Delta, \min\left(M_\Delta,\; a_\Delta(\varphi - \varphi_p)^2 + \Delta_p\right)\right),$$

wherein M_{f,g,Δ} and m_{f,g,Δ} denote maximal and minimal values of f, g, Δ, respectively, and wherein a_{f,g,Δ} denote coefficients controlling the speed of changing the corresponding filter design parameters.

In a ninth possible implementation form of the audio signal processing method according to the second aspect as such or any one of the first to eighth implementation form thereof, the step of filtering the input audio signal on the basis of the determined pair of left ear and right ear transfer functions and the adjustment function comprises the steps of convolving the adjustment function with the left ear transfer function and convolving the result with the input audio signal in order to obtain the left ear output audio signal and/or the steps of convolving the adjustment function with the right ear transfer function and convolving the result with the input audio signal in order to obtain the right ear output audio signal.

In a tenth possible implementation form of the audio signal processing method according to the second aspect as such or any one of the first to eighth implementation form thereof, the step of filtering the input audio signal on the basis of the determined pair of left ear and right ear transfer functions and the adjustment function comprises the steps of convolving the left ear transfer function with the input audio signal and convolving the result with the adjustment function in order to obtain the left ear output audio signal and/or the steps of convolving the right ear transfer function with the input audio signal and convolving the result with the adjustment function in order to obtain the right ear output audio signal.
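The ninth and tenth implementation forms differ only in the order of the convolutions. For linear time-invariant filters convolution is associative and commutative, so both orders give the same output. The sketch below illustrates this with short, made-up impulse responses; the names `h_left`, `m_adjust` and `x` are hypothetical and the values carry no acoustic meaning.

```python
def convolve(a, b):
    """Full linear convolution of two finite sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x_i in enumerate(a):
        for j, y_j in enumerate(b):
            out[i + j] += x_i * y_j
    return out


# Hypothetical short impulse responses, chosen only for illustration.
h_left = [1.0, 0.5, 0.25]     # determined left ear transfer function
m_adjust = [0.0, 1.0, -0.3]   # adjustment function (delay plus spectral shaping)
x = [1.0, 2.0, 3.0, 4.0]      # input audio signal

# Ninth form: (adjustment function * transfer function) * input signal
y_combined_first = convolve(convolve(m_adjust, h_left), x)

# Tenth form: (transfer function * input signal) * adjustment function
y_input_first = convolve(convolve(h_left, x), m_adjust)

max_difference = max(abs(p - q) for p, q in zip(y_combined_first, y_input_first))
```

Combining the adjustment function with the transfer function first can be attractive when the same combined filter is reused for many input blocks, while filtering the input first keeps the stored transfer functions untouched.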

In an eleventh possible implementation form of the audio signal processing method according to the second aspect as such or any one of the first to tenth implementation form thereof, the audio signal processing method further comprises the step of outputting the left ear output audio signal and the right ear output audio signal by means of a pair of transducers, in particular headphones or loudspeakers using crosstalk cancellation.

In a twelfth possible implementation form of the audio signal processing method according to the second aspect as such or any one of the first to eleventh implementation form thereof, the pairs of predefined left ear and right ear transfer functions are predefined for a plurality of reference positions relative to the listener, which lie in the horizontal plane relative to the listener.

In a thirteenth possible implementation form of the audio signal processing method according to the second aspect as such or any one of the first to twelfth implementation form thereof, the step of determining the pair of left ear and right ear transfer functions on the basis of the set of pairs of predefined left ear and right ear transfer functions for the azimuth angle and the elevation angle of the virtual target position comprises the step of selecting a pair of left ear and right ear transfer functions from the set of pairs of predefined left ear and right ear transfer functions for the azimuth angle and the elevation angle of the virtual target position or the step of interpolating a pair of left ear and right ear transfer functions on the basis of the set of pairs of predefined left ear and right ear transfer functions for the azimuth angle and the elevation angle of the virtual target position.

The audio signal processing method according to the second aspect of the invention can be performed by the audio signal processing apparatus according to the first aspect of the invention.

According to a third aspect the invention relates to a computer program comprising program code for performing the audio signal processing method according to the second aspect of the invention or any of its implementation forms when executed on a computer. The invention can be implemented in hardware and/or software.

BRIEF DESCRIPTION OF DRAWINGS

Further embodiments of the invention will be described with respect to the following figures, wherein:

Fig. 1 shows a schematic diagram illustrating an audio signal processing apparatus according to an embodiment;

Fig. 2 shows a schematic diagram illustrating an adjustment filter of an audio signal processing apparatus according to an embodiment;

Fig. 3 shows a diagram illustrating an exemplary frequency magnitude analysis of a database of head related transfer functions as a function of the elevation angle for a fixed azimuth angle;

Fig. 4 shows a schematic diagram illustrating a plurality of biquad filters, including shelving filters and peaking filters, which can be implemented in an adjustment filter of an audio signal processing apparatus according to an embodiment;

Fig. 5 shows schematic diagrams illustrating the frequency dependence of an exemplary shelving filter and the frequency dependence of an exemplary peaking filter, which can be implemented in an adjustment filter of an audio signal processing apparatus according to an embodiment;

Fig. 6 shows a schematic diagram illustrating the selection of filter parameters by an audio signal processing apparatus according to an embodiment;

Fig. 7 shows a schematic diagram illustrating a part of an audio signal processing apparatus according to an embodiment;

Fig. 8 shows a schematic diagram illustrating a part of an audio signal processing apparatus according to an embodiment;

Fig. 9 shows a schematic diagram illustrating an exemplary scenario, where an audio signal processing apparatus according to an embodiment can be used, namely for binaural sound synthesis over headphones simulating a virtual loudspeaker surround system; and

Fig. 10 shows a schematic diagram illustrating an audio signal processing method for processing an input audio signal according to an embodiment.

In the various figures, identical reference signs will be used for identical or at least functionally equivalent features.

DETAILED DESCRIPTION

In the following description, reference is made to the accompanying drawings, which form part of the disclosure, and in which are shown, by way of illustration, specific aspects in which the present invention may be practiced. It is understood that other aspects may be utilized and structural or logical changes may be made without departing from the scope of the present invention. The following detailed description, therefore, is not to be taken in a limiting sense, as the scope of the present invention is defined by the appended claims. For instance, it is understood that a disclosure in connection with a described method may also hold true for a corresponding device or system configured to perform the method and vice versa. For example, if a specific method step is described, a corresponding device may include a unit to perform the described method step, even if such unit is not explicitly described or illustrated in the figures. Further, it is understood that the features of the various exemplary aspects described herein may be combined with each other, unless specifically noted otherwise.

Figure 1 shows a schematic diagram of an audio signal processing apparatus 100 for processing an input audio signal 101 to be transmitted to a listener in such a way that the listener perceives the input audio signal 101 to come from a virtual target position. In a spherical coordinate system the virtual target position (relative to the listener) is defined by a radial distance r, an azimuth angle θ and an elevation angle φ.

The audio signal processing apparatus 100 comprises a memory 103 configured to store a set of pairs of predefined left ear and right ear transfer functions, which are predefined for a plurality of reference positions/directions, wherein the plurality of reference positions define a two-dimensional plane.

Moreover, the audio signal processing apparatus 100 comprises a determiner 105 configured to determine a pair of left ear and right ear transfer functions on the basis of the set of predefined left ear and right ear transfer functions for the azimuth angle and the elevation angle of the virtual target position. The determiner 105 is configured to determine the pair of left ear and right ear transfer functions for a position/direction associated with the virtual target position which lies in the two-dimensional plane defined by the plurality of reference positions. More specifically, the determiner 105 is configured to determine the pair of left ear and right ear transfer functions on the basis of the set of pairs of predefined left ear and right ear transfer functions for the projection of the virtual target position/direction onto the two-dimensional plane defined by the plurality of reference positions.

In an embodiment, the determiner 105 can be configured to determine the pair of left ear and right ear transfer functions on the basis of the set of pairs of predefined left ear and right ear transfer functions for the azimuth angle and the elevation angle of the virtual target position by selecting a pair of left ear and right ear transfer functions from the set of pairs of predefined left ear and right ear transfer functions for the azimuth angle and the elevation angle of the virtual target position.

In an embodiment, the determiner 105 can be configured to determine the pair of left ear and right ear transfer functions on the basis of the set of pairs of predefined left ear and right ear transfer functions for the azimuth angle and the elevation angle of the virtual target position by interpolating, for instance, by means of nearest neighbour interpolation, linear interpolation or the like, a pair of left ear and right ear transfer functions on the basis of the set of pairs of predefined left ear and right ear transfer functions for the azimuth angle and the elevation angle of the virtual target position. In an embodiment, the determiner 105 is configured to use a linear interpolation scheme, a nearest neighbour interpolation scheme or a similar interpolation scheme to determine a pair of left ear and right ear transfer functions on the basis of the set of pairs of predefined left ear and right ear transfer functions for the azimuth angle and the elevation angle of the virtual target position.

Moreover, the audio signal processing apparatus 100 comprises an adjustment filter 107 for extending the pair of left ear and right ear transfer functions, which has been determined by the determiner 105 for the projection of the virtual target position/direction onto the two-dimensional plane defined by the plurality of reference positions, to the "third dimension", i.e. to positions/directions above or below the two-dimensional plane defined by the plurality of reference positions. To this end, the adjustment filter 107 is configured to filter the input audio signal 101 on the basis of the determined pair of left ear and right ear transfer functions and a predefined adjustment function M(r, θ, φ) 109 configured to adjust a delay between the left ear transfer function and the right ear transfer function of the determined pair of left ear and right ear transfer functions and a frequency dependence of the left ear transfer function and the right ear transfer function of the determined pair of left ear and right ear transfer functions as a function of the azimuth angle and/or the elevation angle of the virtual target position in order to obtain a left ear output audio signal 111a and a right ear output audio signal 111b.

In an exemplary embodiment, the set of pairs of predefined left ear and right ear transfer functions comprises four pairs of predefined left ear and right ear transfer functions in the horizontal plane, i.e. for an elevation angle φ = 0°. The four pairs of predefined left ear and right ear transfer functions can be defined for the azimuth angles θ = 0°, 90°, 180°, 270°, respectively. In case an exemplary virtual target position is associated with an azimuth angle θ = 20° and an elevation angle φ = 20°, the determiner 105 can determine the pair of left ear and right ear transfer functions for the azimuth angle θ = 20° and the elevation angle φ = 0° by means of a linear interpolation using the pairs of predefined left ear and right ear transfer functions at θ = 0°, 90°. In an alternative embodiment, the determiner 105 can determine the pair of left ear and right ear transfer functions for the azimuth angle θ = 20° and the elevation angle φ = 0° by selecting the pair of predefined left ear and right ear transfer functions at θ = 0° (which corresponds to a nearest neighbour interpolation). The extension of the determined pair of predefined left ear and right ear transfer functions at the azimuth angle θ = 20° and the elevation angle φ = 0° to the elevation angle φ = 20° is performed by the adjustment filter 107.

The set of predefined left ear and right ear transfer functions can be, for example, a limited set of head related transfer functions (HRTFs). The set of pairs of predefined left ear and right ear transfer functions can be either personalized (measured for a specific user) or obtained from a generalized database (modelled).

As already mentioned above, in an embodiment, the set of pairs of predefined left ear and right ear head related transfer functions can be defined for a plurality of azimuth angles and a fixed elevation angle. For instance, for a fixed elevation angle φ = 0° the set of pairs of predefined left ear and right ear head related transfer functions can be defined as left ear HRTFs h_L(r, θ, 0) and right ear HRTFs h_R(r, θ, 0) parametrized by the azimuth angle θ.
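The selection and interpolation steps of the example above (reference azimuths θ = 0°, 90°, 180°, 270° and a target at θ = 20°) can be sketched as follows. This is a minimal illustration, not the determiner's actual implementation: it assumes the transfer functions are stored as equal-length time-domain impulse responses and interpolates them sample by sample, which the text does not specify. The function names and the two-tap responses are hypothetical.

```python
REFERENCE_AZIMUTHS_DEG = [0.0, 90.0, 180.0, 270.0]


def linear_weights(azimuth_deg, refs=REFERENCE_AZIMUTHS_DEG):
    """Return ((index_lo, weight_lo), (index_hi, weight_hi)) on the circle."""
    azimuth = azimuth_deg % 360.0
    n = len(refs)
    for i in range(n):
        lo, hi = refs[i], refs[(i + 1) % n]
        span = (hi - lo) % 360.0         # handles the wrap-around from 270 to 0
        offset = (azimuth - lo) % 360.0
        if offset < span:
            w_hi = offset / span
            return (i, 1.0 - w_hi), ((i + 1) % n, w_hi)
    raise ValueError("reference azimuths must cover the full circle")


def nearest_index(azimuth_deg, refs=REFERENCE_AZIMUTHS_DEG):
    """Index of the reference azimuth closest to the target (nearest neighbour)."""
    def distance(ref):
        return abs((azimuth_deg - ref + 180.0) % 360.0 - 180.0)
    return min(range(len(refs)), key=lambda i: distance(refs[i]))


def interpolate_pair(azimuth_deg, pairs, refs=REFERENCE_AZIMUTHS_DEG):
    """Linearly interpolate a (left, right) impulse response pair."""
    (i_lo, w_lo), (i_hi, w_hi) = linear_weights(azimuth_deg, refs)
    left = [w_lo * p + w_hi * q for p, q in zip(pairs[i_lo][0], pairs[i_hi][0])]
    right = [w_lo * p + w_hi * q for p, q in zip(pairs[i_lo][1], pairs[i_hi][1])]
    return left, right


# Hypothetical two-tap (left, right) impulse responses, one pair per reference.
pairs = [
    ([1.0, 0.0], [1.0, 0.0]),   # 0 degrees
    ([0.1, 0.0], [0.0, 1.0]),   # 90 degrees
    ([1.0, 0.0], [1.0, 0.0]),   # 180 degrees
    ([0.0, 1.0], [0.1, 0.0]),   # 270 degrees
]
left_20, right_20 = interpolate_pair(20.0, pairs)
selected = pairs[nearest_index(20.0)]
```

For θ = 20° the reference at 0° receives the weight 70/90 and the reference at 90° the weight 20/90, while the nearest neighbour selection returns the pair stored for 0°.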

As already mentioned above, in an embodiment, the set of pairs of predefined left ear and right ear head related transfer functions can be defined for a fixed azimuth angle and a plurality of elevation angles. For instance, for a fixed azimuth angle θ = 0° the set of pairs of predefined left ear and right ear head related transfer functions can be defined as left ear HRTFs h_L(r, 0, φ) and right ear HRTFs h_R(r, 0, φ) parametrized by the elevation angle φ.

Figure 2 shows a schematic diagram illustrating an adjustment function M(r, θ, φ) 109 as used in an adjustment filter of an audio signal processing apparatus according to an embodiment, for instance the adjustment filter 107 of the audio signal processing apparatus 100 shown in figure 1. In the exemplary embodiment shown in figure 2 the set of pairs of predefined left ear and right ear head related transfer functions are horizontal transfer functions h_L(r, θ, 0) and h_R(r, θ, 0), i.e. transfer functions defined for reference positions/directions in the horizontal plane relative to the listener.

The adjustment function M(r, θ, φ) 109 shown in figure 2 comprises a delay block 109a for applying a delay to the horizontal transfer functions h_L(r, θ, 0) and h_R(r, θ, 0) and a frequency adjustment block 109b for applying a frequency adjustment to the horizontal transfer functions h_L(r, θ, 0) and h_R(r, θ, 0).

In an embodiment, the adjustment filter 107 is configured to adjust the delay 109a between the left ear transfer function and the right ear transfer function of the determined pair of left ear and right ear transfer functions as a function of the azimuth angle and/or the elevation angle of the virtual target position on the basis of the adjustment function M(r, θ, φ) 109 by compensating for sound travel time differences associated with the distances between the virtual target position and a left ear of the listener and between the virtual target position and a right ear of the listener. In an embodiment, the adjustment function 109 is configured to determine an additional time delay due to the elevation angle φ for the set of predefined transfer functions h_L(r, θ, 0) and h_R(r, θ, 0) on the basis of a new angle of incidence Θ derived in the constant elevation plane. In an embodiment, the adjustment filter 107 is configured to adjust by means of the adjustment function 109 the delay 109a between the left ear transfer function and the right ear transfer function of the determined pair of left ear and right ear transfer functions as a function of the azimuth angle and/or the elevation angle of the virtual target position on the basis of the following equations:

$$\tau_L(\Theta) = \tau\left(\Theta + \frac{\pi}{2}\right) \quad \text{and} \quad \tau_R(\Theta) = \tau\left(\Theta - \frac{\pi}{2}\right),$$

wherein τ_L denotes a delay applied to the left ear transfer function, wherein τ_R denotes a delay applied to the right ear transfer function and wherein τ and Θ are defined on the basis of the following equations:

$$\tau(\Theta) = \frac{a}{c} \sin\Theta, \quad \text{and}$$

[equation defining Θ in terms of θ and φ; not recoverable from this text]

wherein τ denotes a delay in seconds, c denotes the velocity of sound (i.e. c = 340 m/s), a denotes a parameter associated with the head of a listener (e.g. a = 0.087 m), θ denotes the azimuth angle of the virtual target position and φ denotes the elevation angle of the virtual target position. The above equations for determining the new angle of incidence Θ are based on a projection of the azimuth angle θ of the virtual target position in the horizontal plane into the constant elevation plane.
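As a rough illustration of the delay block, the sketch below evaluates the left and right ear delays for a given angle of incidence Θ. It rests on two assumptions that are a reading of the damaged equations rather than confirmed formulas: that τ(Θ) = (a/c)·sin Θ, and that the left and right ear delays use the offsets +π/2 and −π/2. The angle Θ is taken as an input because its defining equation is not available here. The function names are hypothetical; the values of a and c are those given in the text.

```python
import math

SPEED_OF_SOUND = 340.0   # c in m/s, value given in the text
HEAD_PARAMETER = 0.087   # a in m, value given in the text


def tau(angle_rad, a=HEAD_PARAMETER, c=SPEED_OF_SOUND):
    """Assumed form tau(Theta) = (a / c) * sin(Theta), in seconds."""
    return (a / c) * math.sin(angle_rad)


def ear_delays(incidence_rad):
    """Left and right ear delays for the angle of incidence Theta (radians).

    Assumes tau_L = tau(Theta + pi/2) and tau_R = tau(Theta - pi/2).
    """
    tau_left = tau(incidence_rad + math.pi / 2.0)
    tau_right = tau(incidence_rad - math.pi / 2.0)
    return tau_left, tau_right


tau_left, tau_right = ear_delays(math.radians(30.0))
delay_difference = tau_left - tau_right   # relative delay between the ears
```

Under these assumptions the two delays are always equal in magnitude and opposite in sign, and neither exceeds a/c, which is roughly 0.26 ms for the values above.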

The frequency adjustment block 109b of the adjustment function M(r, θ, φ) 109 shown in figure 2 is configured to apply a frequency adjustment to the horizontal transfer functions h_L(r, θ, 0) and h_R(r, θ, 0), in order to extend the "two-dimensional" set of pairs of predefined horizontal transfer functions by adding the relevant perceptual information related to elevation, i.e. the third dimension.

In an embodiment, the frequency adjustment block 109b of the adjustment function M(r, θ, φ) 109 shown in figure 2 can be based on a spectral analysis of a complete database of transfer functions, which covers all desired positions/directions. This makes it possible, for example, to elevate or adjust the horizontal HRTFs, h_L(r, θ, 0) and h_R(r, θ, 0), which are defined by the azimuth angle θ in the horizontal plane, to an elevation angle φ above or below the horizontal plane.

Figure 3 shows an exemplary frequency magnitude analysis of a database of head related transfer functions as a function of the elevation angle, namely the measured MIT HRTF database using the KEMAR dummy head. The frequency magnitude responses are shown in figure 3 for the left HRTFs h_L as a function of the elevation angle φ for the azimuth angle θ = 0° of the virtual target position. By repeating such spectral analysis for a plurality of azimuth angles of interest, a complete set of transfer functions can be obtained to extend any set of horizontal transfer functions defined only by the azimuth angle, to elevated ones at desired elevation angles.

In an embodiment, the transfer functions derived in the manner described above are replaced by an equalization, i.e. an adjustment of the frequency dependence, of a set of predefined left ear and right ear transfer functions, which preferably takes into account only the main spectral features relevant to the perception of elevation or azimuth angles. By doing so, the data required to generate elevated transfer functions is significantly reduced. The elevation or azimuth angles can then be rendered as a spectral effect, i.e. by applying an equalization or adjustment function, which can be used on any transfer functions.

In an embodiment, the adjustment filter 107 of the audio signal processing apparatus 100 is configured to adjust the frequency dependence of the left ear transfer function and the right ear transfer function of the determined pair of left ear and right ear transfer functions as a function of the azimuth angle θ and/or the elevation angle φ of the virtual target position on the basis of a plurality of infinite impulse response filters, wherein the plurality of infinite impulse response filters are configured to approximate spectrally prominent features, such as a maximum or a minimum, of the frequency dependence of a left ear transfer function and a right ear transfer function of a plurality of pairs of measured left ear and right ear transfer functions as a function of the azimuth angle and/or the elevation angle of the virtual target position.

In an embodiment, the frequency dependence of each infinite impulse response filter is defined by a plurality of predefined filter parameters, wherein the plurality of predefined filter parameters are selected such that the frequency dependence of each infinite impulse response filter approximates at least a portion of the frequency dependence of a left ear transfer function or a right ear transfer function of the plurality of pairs of measured left ear and right ear transfer functions as a function of the azimuth angle and/or the elevation angle of the virtual target position.

In an embodiment, the plurality of infinite-impulse-response filters comprises a plurality of biquad filters. The plurality of biquad filters can be implemented as parallel filters or cascaded filters. The use of cascaded filters is preferred as it approximates the spectral features of the transfer functions better. Figure 4 shows a plurality of biquad filters, including shelving filters 401a, b and peaking filters 403a-c, which can be implemented in the adjustment filter 107 of the audio signal processing apparatus 100 shown in figure 1 for minimizing the distance between the transfer functions obtained from the spectral analysis and the filter magnitude response, as already described above.

Figure 5 shows schematic diagrams illustrating the frequency dependence of an exemplary shelving filter 401a and the frequency dependence of an exemplary peaking filter 403a, which can be implemented in the adjustment filter 107 of the audio signal processing apparatus 100 shown in figure 1. The shelving filter 401a can be defined by two filter parameters, namely the cut-off frequency f_0 defining the frequency range, where the signal is changed, and the gain g_0 defining how much the signal is boosted (or attenuated if g_0 < 0 dB). The peaking filter 403a can be defined by three filter parameters, namely the cut-off frequency f_0, where the peak is located, the gain g_0 defining the height of the peak (or of the notch if g_0 < 0 dB) and the bandwidth Δ_0 of the peak (or notch), directly related to the quality factor Q_0 = f_0/Δ_0.
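A peaking biquad with the three parameters named above (f_0, g_0, Δ_0) can be sketched as follows. The text does not state how the biquad coefficients are derived; this example assumes the widely used Audio EQ Cookbook peaking-filter formulas with Q_0 = f_0/Δ_0, and the sampling rate and parameter values are made up for illustration. The magnitude of a cascade in dB is the sum of the section magnitudes in dB.

```python
import cmath
import math


def peaking_biquad(f0_hz, gain_db, bandwidth_hz, fs_hz):
    """Return normalized (b, a) coefficients of a peaking biquad.

    Assumption: Audio EQ Cookbook design with Q_0 = f_0 / Delta_0.
    """
    q = f0_hz / bandwidth_hz
    amp = 10.0 ** (gain_db / 40.0)
    w0 = 2.0 * math.pi * f0_hz / fs_hz
    alpha = math.sin(w0) / (2.0 * q)
    b = [1.0 + alpha * amp, -2.0 * math.cos(w0), 1.0 - alpha * amp]
    a = [1.0 + alpha / amp, -2.0 * math.cos(w0), 1.0 - alpha / amp]
    return [x / a[0] for x in b], [x / a[0] for x in a]


def magnitude_db(b, a, f_hz, fs_hz):
    """Magnitude response of one biquad section in dB at frequency f_hz."""
    z_inv = cmath.exp(-2j * math.pi * f_hz / fs_hz)
    num = b[0] + b[1] * z_inv + b[2] * z_inv ** 2
    den = a[0] + a[1] * z_inv + a[2] * z_inv ** 2
    return 20.0 * math.log10(abs(num / den))


def cascade_magnitude_db(sections, f_hz, fs_hz):
    """Magnitude of cascaded sections: the dB values of the sections add up."""
    return sum(magnitude_db(b, a, f_hz, fs_hz) for b, a in sections)


FS_HZ = 48000.0                                        # hypothetical sampling rate
peak = peaking_biquad(8000.0, 6.0, 2000.0, FS_HZ)      # +6 dB peak at 8 kHz
notch = peaking_biquad(4000.0, -10.0, 1000.0, FS_HZ)   # -10 dB notch at 4 kHz
gain_at_peak = magnitude_db(*peak, 8000.0, FS_HZ)
gain_of_cascade = cascade_magnitude_db([peak, notch], 8000.0, FS_HZ)
```

With this design the section reaches exactly g_0 at f_0 and leaves the response at 0 Hz unchanged, which makes it convenient for modelling an isolated peak or notch.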

In an embodiment, the filter parameters can be obtained using numerical optimization methods.

However, in an embodiment, which is more memory efficient, an ad-hoc method can be used to derive the filter parameters on the basis of the spectral information provided, for instance, in figure 3. Thus, in an embodiment, for at least one infinite impulse response filter of the plurality of infinite impulse response filters the plurality of predefined filter parameters are computed or selected by determining a frequency and an azimuth angle and/or an elevation angle, at which a left ear transfer function or a right ear transfer function of the plurality of pairs of measured left ear and right ear transfer functions has a minimal or maximal magnitude, and by approximating the frequency dependence of the left ear transfer function or the right ear transfer function of the plurality of pairs of measured left ear and right ear transfer functions by the frequency dependence of the at least one infinite impulse response filter.

Figure 6 shows a schematic diagram illustrating the selection of filter parameters using the data already shown in figure 3, which can be implemented in an audio signal processing apparatus according to an embodiment, for instance, the audio signal processing apparatus 100 shown in figure 1. The derivation of the filter parameters starts with locating the most significant spectral features, namely peaks and notches, in the measured transfer functions. For each of the identified features the relevant feature characteristics are then extracted, namely the corresponding central elevation angle φ_p, which can be read on the horizontal axis, the corresponding central frequency f_p, which can be read on the vertical axis, the maximal corresponding spectral value g_p (with g_p > 0 corresponding to a peak and g_p < 0 to a notch) and the maximal bandwidth Δ_p.
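The first step, locating the most prominent peak or notch in a grid of measured magnitudes, can be sketched as below. The grid layout (rows indexed by frequency, columns by elevation angle), the function name and the sample values are assumptions made for illustration; the bandwidth Δ_p is not estimated here.

```python
def find_extreme_feature(magnitudes_db, elevations_deg, frequencies_hz, kind="peak"):
    """Locate the largest peak (or deepest notch) in a magnitude grid.

    magnitudes_db[i][j] is the magnitude in dB at frequencies_hz[i] and
    elevations_deg[j]. Returns (phi_p, f_p, g_p).
    """
    pick = max if kind == "peak" else min
    g_p, i, j = pick(
        (value, i, j)
        for i, row in enumerate(magnitudes_db)
        for j, value in enumerate(row)
    )
    return elevations_deg[j], frequencies_hz[i], g_p


# Hypothetical magnitude grid in dB: 3 frequencies x 4 elevation angles.
elevations_deg = [-30.0, 0.0, 30.0, 60.0]
frequencies_hz = [1000.0, 4000.0, 8000.0]
magnitudes_db = [
    [0.0, 1.0, 2.0, 1.0],
    [1.0, 3.0, 9.0, 4.0],
    [-2.0, -12.0, -3.0, 0.0],
]

peak_feature = find_extreme_feature(magnitudes_db, elevations_deg, frequencies_hz, "peak")
notch_feature = find_extreme_feature(magnitudes_db, elevations_deg, frequencies_hz, "notch")
```

In this made-up grid the strongest peak sits at φ_p = 30°, f_p = 4 kHz with g_p = 9 dB, and the deepest notch at φ_p = 0°, f_p = 8 kHz with g_p = -12 dB.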

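In an embodiment, the filter parameters, namely the cut-off frequency parameter f_0, the gain parameter g_0 and the bandwidth parameter Δ_0 (defined for the peaking filters 403a-c) are determined on the basis of the following equations:

$$f_0 = \max\left(m_f, \min\left(M_f,\; a_f(\varphi - \varphi_p)^2 + f_p\right)\right),$$

$$g_0 = \max\left(m_g, \min\left(M_g,\; a_g(\varphi - \varphi_p)^2 + g_p\right)\right),$$

$$\Delta_0 = \max\left(m_\Delta, \min\left(M_\Delta,\; a_\Delta(\varphi - \varphi_p)^2 + \Delta_p\right)\right),$$

wherein M_{f,g,Δ} and m_{f,g,Δ} denote maximal and minimal values of f, g, Δ, respectively, and wherein a_{f,g,Δ} denote coefficients controlling the speed of changing the corresponding filter design parameters.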
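Each filter design parameter is thus a quadratic function of the elevation angle φ around the feature centre φ_p, clamped to the range between its minimal and maximal value. A minimal sketch, assuming the form max(m, min(M, a·(φ − φ_p)² + x_p)) for each of f_0, g_0 and Δ_0; the class name and all numeric values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ParameterTrack:
    """One filter design parameter (f_0, g_0 or Delta_0) as a function of elevation."""
    center: float   # value at the feature centre (f_p, g_p or Delta_p)
    a: float        # coefficient controlling how fast the value changes
    lower: float    # m, minimal value
    upper: float    # M, maximal value

    def at(self, phi_deg, phi_p_deg):
        value = self.a * (phi_deg - phi_p_deg) ** 2 + self.center
        return max(self.lower, min(self.upper, value))


# Hypothetical spectral peak centred at an elevation angle of 30 degrees.
PHI_P_DEG = 30.0
gain = ParameterTrack(center=10.0, a=-0.01, lower=0.0, upper=10.0)           # dB
frequency = ParameterTrack(center=8000.0, a=0.5, lower=6000.0, upper=9000.0)  # Hz

gain_at_center = gain.at(30.0, PHI_P_DEG)         # the central value g_p
gain_nearby = gain.at(50.0, PHI_P_DEG)            # reduced 20 degrees away
gain_far = gain.at(90.0, PHI_P_DEG)               # clamped to the minimum
frequency_far = frequency.at(90.0, PHI_P_DEG)     # clamped to the maximum
```

A negative coefficient a makes the feature fade out as the elevation angle moves away from φ_p, and the clamping keeps the filter parameters within a range for which the filter remains well behaved.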

In an embodiment, the parameters M_{f,g,Δ}, m_{f,g,Δ} and a_{f,g,Δ} are set manually for the three filter design parameters f_0, g_0 and Δ_0 to model the selected spectral feature as closely as possible. Subsequently, the parameters M, m and a can be refined for all spectral features in such a way that the magnitude response of the IIR filters matches the transfer functions obtained by the spectral analysis.

In the above described embodiment for determining the filter parameters only thirteen parameters (φ_p, f_p, g_p, Δ_p, M_{f,g,Δ}, m_{f,g,Δ}, a_{f,g,Δ}) have to be stored for each IIR filter, wherein the first four parameters (φ_p, f_p, g_p, Δ_p) can be directly taken from the spectral analysis and the other parameters can be set manually. Thus, given the equations described above the parameters of the filters 401a, b and 403a-c can be directly derived as a function of the desired elevation angle φ.

Given a predefined set of transfer functions measured only in the median plane, i.e. containing information only for certain radial distances r and certain elevation angles φ, i.e. h_L(r, 0, φ) and h_R(r, 0, φ), these transfer functions can be extended to any desired azimuth angle θ, i.e. to the third dimension, in a similar way as described above.

Figure 7 shows a part of an audio signal processing apparatus according to an embodiment, for instance part of the audio signal processing apparatus 100 shown in figure 1. In an embodiment, the adjustment filter 107 of the audio signal processing apparatus 100 is configured to filter the input audio signal 101 on the basis of the determined pair of left ear and right ear transfer functions and the adjustment function 109 by convolving the adjustment function 109 with the left ear transfer function and by convolving the result with the input audio signal 101 in order to obtain the left ear output audio signal 111a and/or by convolving the adjustment function 109 with the right ear transfer function and by convolving the result with the input audio signal 101 in order to obtain the right ear output audio signal 111b.

Figure 8 shows a part of an audio signal processing apparatus according to an embodiment, for instance part of the audio signal processing apparatus 100 shown in figure 1. In an embodiment, the adjustment filter 107 of the audio signal processing apparatus 100 is configured to filter the input audio signal 101 on the basis of the determined pair of left ear and right ear transfer functions and the adjustment function 109 by convolving the left ear transfer function with the input audio signal 101 and by convolving the result with the adjustment function 109 in order to obtain the left ear output audio signal 111a and/or by convolving the right ear transfer function with the input audio signal 101 and by convolving the result with the adjustment function 109 in order to obtain the right ear output audio signal 111b.

Figure 9 shows a schematic diagram illustrating an exemplary scenario, where an audio signal processing apparatus according to an embodiment can be used, for instance, the audio signal processing apparatus 100 shown in figure 1. In the embodiment shown in figure 9, the audio signal processing apparatus 100 is configured to synthesize a binaural sound over headphones simulating a virtual loudspeaker surround system. To this end, the audio signal processing apparatus 100 can comprise at least one transducer, in particular headphones or loudspeakers using crosstalk cancellation, configured to output the binaural sound, i.e. the left ear output audio signal 111a and the right ear output audio signal 111b.

In the example shown in figure 9, the virtual loudspeaker surround system being simulated is a 5.1 sound system setup with front left (FL), front right (FR), front center (FC), rear left (RL), and rear right (RR) loudspeakers. In this example, the five HRTFs corresponding to the five loudspeakers can be stored to synthesize the binaural sound for the virtual loudspeakers. Given the desired height loudspeaker positions, front left height (FLH), front right height (FRH), front center height (FCH), rear left height (RLH), and rear right height (RRH), the audio signal processing apparatus 100 can efficiently extend the five stored horizontal HRTFs to the corresponding elevated ones. Thus, using the audio signal processing apparatus 100, the binaural rendering system for a 5.1 sound system is extended to a 10.2 sound system.

Figure 10 shows a schematic diagram illustrating an audio signal processing method 1000 for processing an input audio signal 101 to be transmitted to a listener in such a way that the listener perceives the input audio signal 101 to come from a virtual target position defined by an azimuth angle and an elevation angle relative to the listener. The audio signal processing method 1000 comprises the steps of determining 1001 a pair of left ear and right ear transfer functions on the basis of a set of pairs of predefined left ear and right ear transfer functions for the azimuth angle and the elevation angle of the virtual target position, wherein the pairs of predefined left ear and right ear transfer functions are predefined for a plurality of reference positions relative to the listener, wherein the plurality of reference positions lie in a two-dimensional plane, and filtering 1003 the input audio signal 101 on the basis of the determined pair of left ear and right ear transfer functions and an adjustment function 109 configured to adjust a delay 109a between the left ear transfer function and the right ear transfer function of the determined pair of left ear and right ear transfer functions and a frequency dependence 109b of the left ear transfer function and the right ear transfer function of the determined pair of left ear and right ear transfer functions as a function of the azimuth angle and/or the elevation angle of the virtual target position in order to obtain a left ear output audio signal 111a and a right ear output audio signal 111b.

Embodiments of the invention realize different advantages. The audio signal processing apparatus 100 and the audio signal processing method 1000 provide means to synthesize binaural sound, i.e. audio signals perceived by a listener as coming from a virtual target position. The audio signal processing apparatus 100 operates on a "two-dimensional" predefined set of transfer functions, which can be either obtained from a generalized database or measured for a specific user. The audio signal processing apparatus 100 can also provide means for reinforcing front-back or elevation effects in synthesized sound. Embodiments of the invention can be applied in different scenarios, for example in media playback, i.e. virtual surround rendering of more than 5.1 channels (e.g. 10.2 or even 22.2), by storing only the 5.1 transfer functions and the parameters needed to obtain all three-dimensional azimuth and elevation angles from the basic two-dimensional set. Embodiments of the invention can also be applied in virtual reality in order to obtain full sphere transfer functions with high resolution based on transfer functions with low resolution.
Embodiments of the invention provide an effective realization of binaural sound synthesis with regard to the memory required and the complexity of the signal processing algorithms.
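
The two steps of method 1000 can be sketched roughly as follows, under strong simplifying assumptions. The stored set holds one pair of impulse responses per horizontal loudspeaker azimuth. The determining step 1001 is reduced to a nearest-azimuth lookup. The adjustment function 109 is replaced by a toy placeholder (an integer delay on one ear for 109a and a first-order spectral tilt for 109b) whose constants are invented for illustration. None of the function names, azimuth values or numeric constants below come from the application, and the real adjustment would be derived from the filter parameters described above.

```python
import numpy as np

def nearest_reference(hrirs, azimuth_deg):
    """Simplified step 1001: pick the stored pair closest in azimuth."""
    def circular_distance(a):
        return abs((a - azimuth_deg + 180.0) % 360.0 - 180.0)
    return hrirs[min(hrirs, key=circular_distance)]

def toy_adjustment(elevation_deg, n_taps=16):
    """Hypothetical stand-in for adjustment function 109.
    Returns one short FIR per ear; both reduce to a unit impulse at 0 degrees."""
    el = np.radians(elevation_deg)
    delay = int(round(4 * abs(np.sin(el))))   # placeholder for delay 109a, in samples
    tilt = 0.3 * abs(np.sin(el))              # placeholder for frequency dependence 109b
    shaping = np.array([1.0, -tilt])          # mild high-frequency emphasis when tilt > 0
    g_left = np.zeros(n_taps)
    g_right = np.zeros(n_taps)
    g_left[0:2] = shaping
    g_right[delay:delay + 2] = shaping        # right ear additionally delayed
    return g_left, g_right

def render(x, hrirs, azimuth_deg, elevation_deg):
    """Simplified step 1003: filter x with the adjusted left and right responses."""
    h_left, h_right = nearest_reference(hrirs, azimuth_deg)
    g_left, g_right = toy_adjustment(elevation_deg)
    y_left = np.convolve(np.convolve(g_left, h_left), x)
    y_right = np.convolve(np.convolve(g_right, h_right), x)
    return y_left, y_right

# Placeholder horizontal set for FC, FL, FR, RL, RR (azimuths are assumptions).
rng = np.random.default_rng(1)
hrirs = {az: (rng.standard_normal(32), rng.standard_normal(32))
         for az in (0, 30, -30, 110, -110)}
x = rng.standard_normal(100)

# Elevated "front left height" position rendered from the stored FL pair.
y_left, y_right = render(x, hrirs, azimuth_deg=30, elevation_deg=35)
print(y_left.shape, y_right.shape)
```

In this sketch only the five horizontal pairs are held in memory, and each elevated position is produced at render time from a stored pair plus a handful of adjustment parameters, which is the memory saving the description refers to.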

While a particular feature or aspect of the disclosure may have been disclosed with respect to only one of several implementations or embodiments, such feature or aspect may be combined with one or more other features or aspects of the other implementations or embodiments as may be desired and advantageous for any given or particular application. Furthermore, to the extent that the terms "include", "have", "with", or other variants thereof are used in either the detailed description or the claims, such terms are intended to be inclusive in a manner similar to the term "comprise". Also, the terms "exemplary", "for example" and "e.g." are merely meant as an example, rather than the best or optimal. The terms "coupled" and "connected", along with derivatives, may have been used. It should be understood that these terms may have been used to indicate that two elements cooperate or interact with each other regardless of whether they are in direct physical or electrical contact, or they are not in direct contact with each other.

Although specific aspects have been illustrated and described herein, it will be appreciated by those of ordinary skill in the art that a variety of alternate and/or equivalent implementations may be substituted for the specific aspects shown and described without departing from the scope of the present disclosure. This application is intended to cover any adaptations or variations of the specific aspects discussed herein. Although the elements in the following claims are recited in a particular sequence with corresponding labeling, unless the claim recitations otherwise imply a particular sequence for implementing some or all of those elements, those elements are not necessarily intended to be limited to being implemented in that particular sequence.

Many alternatives, modifications, and variations will be apparent to those skilled in the art in light of the above teachings. Of course, those skilled in the art readily recognize that there are numerous applications of the invention beyond those described herein. While the present invention has been described with reference to one or more particular embodiments, those skilled in the art recognize that many changes may be made thereto without departing from the scope of the present invention. It is therefore to be understood that within the scope of the appended claims and their equivalents, the invention may be practiced otherwise than as specifically described herein.