

Title:
A SYSTEM FOR AND A METHOD OF MIXING FIRST AUDIO DATA WITH SECOND AUDIO DATA, A PROGRAM ELEMENT AND A COMPUTER-READABLE MEDIUM
Document Type and Number:
WIPO Patent Application WO/2006/085265
Kind Code:
A2
Abstract:
A system (200) of mixing first audio data (201) with second audio data (202) comprising a filter unit (203) adapted to filter the first audio data (201) and the second audio data (202) to generate a component (206) of the first audio data (201) in a first frequency range, a component (207) of the first audio data (201) in a second frequency range, a component (208) of the second audio data (202) in the first frequency range, and a component (209) of the second audio data (202) in the second frequency range, and a determining unit (210) adapted to determine a transition profile between the first audio data (201) and the second audio data (202) in such a manner that transition characteristics for a transition between the components (206, 208) of the first audio data (201) and the second audio data (202) in the first frequency range are determined separately from transition characteristics for a transition between the components (207, 209) of the first audio data (201) and the second audio data (202) in the second frequency range.

Inventors:
LEMMA AWEKE (NL)
VAN DE KERKHOF LEON (NL)
Application Number:
PCT/IB2006/050392
Publication Date:
August 17, 2006
Filing Date:
February 07, 2006
Assignee:
KONINKL PHILIPS ELECTRONICS NV (NL)
LEMMA AWEKE (NL)
VAN DE KERKHOF LEON (NL)
International Classes:
G10L19/00; G10L21/0316; G10L21/034
Domestic Patent References:
WO2001011809A1 (2001-02-15)
Foreign References:
US20020172379A1 (2002-11-21)
EP0158055A1 (1985-10-16)
Attorney, Agent or Firm:
Röggla, Harald (Triester Strasse 64, Vienna, AT)
Claims:
CLAIMS
1. A system (200) for mixing first audio data (201) with second audio data (202), the system (200) comprising a filter unit (203) adapted to filter the first audio data (201) and the second audio data (202) to generate a component (206) of the first audio data (201) in a first frequency range, a component (207) of the first audio data (201) in a second frequency range, a component (208) of the second audio data (202) in the first frequency range, and a component (209) of the second audio data (202) in the second frequency range; a determining unit (210) adapted to determine a transition profile between the first audio data (201) and the second audio data (202) in such a manner that transition characteristics for a transition between the components (206, 208) of the first audio data (201) and the second audio data (202) in the first frequency range are determined separately from transition characteristics for a transition between the components (207, 209) of the first audio data (201) and the second audio data (202) in the second frequency range.
2. The system (200) according to claim 1, wherein the filter unit (203) is adapted to filter the first audio data (201) and the second audio data (202) to generate a component of the first audio data (201) in at least one further frequency range and to generate a component of the second audio data (202) in the at least one further frequency range; wherein the determining unit (210) is adapted to determine transition characteristics for a transition profile between the components of the first audio data (201) and the second audio data (202) in the at least one further frequency range separately from transition characteristics for a transition profile between the components (206 to 209) of the first audio data (201) and the second audio data (202) in the first frequency range and in the second frequency range.
3. The system (200) according to claim 1, wherein the determining unit (210) is adapted to determine the transition profile so that, before the transition, mixed data consist of the first audio data (201); during the transition, mixed data comprise a decreasing contribution of the first audio data (201) and an increasing contribution of the second audio data (202); and, after the transition, mixed data consist of the second audio data (202).
4. The system (200) according to claim 1, wherein the determining unit (210) is adapted to determine the transition profile so that a time interval (304, 312) defining the duration of the transition is longer for the first frequency range than for the second frequency range.
5. The system (200) according to claim 4, wherein the first frequency range includes higher frequencies than the second frequency range.
6. The system (200) according to claim 1, wherein the determining unit (210) is adapted to determine the transition profile so that a center of a time interval (304) defining the duration of the transition for the first frequency range essentially equals a center of a time interval (312) defining the duration of the transition for the second frequency range.
7. The system (200) according to claim 1, wherein the determining unit (210) is adapted to determine an overall amplitude of the mixed audio data to be essentially constant during the transition.
8. The system (200) according to claim 1, wherein the determining unit (210) is adapted to simultaneously determine the transition characteristics in the first frequency range and in the second frequency range.
9. The system (200) according to claim 1, comprising a phase-analyzing unit (701) adapted to analyze a phase relation of the components (206, 208) of the first audio data (201) and the second audio data (202) in the first frequency range and/or a phase relation of the components of the first audio data (201) and the second audio data (202) in the second frequency range; wherein the determining unit (210) is adapted to determine the transition characteristics while considering the analyzed phase relation.
10. The system (200) according to claim 9, wherein the determining unit (210) is adapted to determine the transition characteristics so that phase-destructive interference of the components (206, 208) of the first audio data (201) and the second audio data (202) in the first frequency range and/or phase-destructive interference of the components of the first audio data (201) and the second audio data (202) in the second frequency range is substantially prevented during the transition.
11. The system (200) according to claim 9, wherein the determining unit (210) is adapted to determine the transition characteristics so that phase-destructive interference of the components (206, 208) of the first audio data (201) and the second audio data (202) in the first frequency range and/or phase-destructive interference of the components of the first audio data (201) and the second audio data (202) in the second frequency range is prevented during the transition by selectively delaying or advancing the first audio data (201) and/or the second audio data (202) in the first frequency range and/or in the second frequency range.
12. The system (200) according to claim 1, further comprising a mixing unit (215) adapted to mix the first audio data (201) with the second audio data (202) on the basis of the determined transition characteristics.
13. The system (200) according to claim 1, wherein the determining unit (210) is adapted to determine transition characteristics for a transition between the components of the first audio data (201) and the components of the second audio data (202) in the first frequency range in a non-equal manner as compared to transition characteristics for a transition between the components of the first audio data (201) and the components of the second audio data (202) in the second frequency range.
14. The system (200) according to claim 1, wherein the step of determining transition characteristics for a transition between the components of the first audio data (201) and the second audio data (202) includes determining amplitude properties and/or phase properties of the first audio data (201) and/or of the second audio data (202) in the first frequency range and/or in the second frequency range.
15. The system (200) according to claim 1, realized as an integrated circuit.
16. The system (200) according to claim 1, realized as an automatic disc jockey device (800).
17. The system (200) according to claim 1, realized as at least one of the group consisting of a DVD player, a hard disk-based audio player, a portable audio player, a wearable audio player, an internet radio apparatus, a public entertainment apparatus, and an MP3 player.
18. A method of mixing first audio data (201) with second audio data (202), the method comprising the steps of filtering the first audio data (201) and the second audio data (202) to generate a component (206) of the first audio data (201) in a first frequency range, a component (207) of the first audio data (201) in a second frequency range, a component (208) of the second audio data (202) in the first frequency range, and a component (209) of the second audio data (202) in the second frequency range; determining a transition profile between the first audio data (201) and the second audio data (202) in such a manner that transition characteristics for a transition between the components (206, 208) of the first audio data (201) and the second audio data (202) in the first frequency range are determined separately from transition characteristics for a transition between the components (207, 209) of the first audio data (201) and the second audio data (202) in the second frequency range.
19. A program element, which, when being executed by a processor, is adapted to carry out a method of mixing first audio data (201) with second audio data (202), the method comprising the steps of filtering the first audio data (201) and the second audio data (202) to generate a component (206) of the first audio data (201) in a first frequency range, a component (207) of the first audio data (201) in a second frequency range, a component (208) of the second audio data (202) in the first frequency range, and a component (209) of the second audio data (202) in the second frequency range; determining a transition profile between the first audio data (201) and the second audio data (202) in such a manner that transition characteristics for a transition between the components (206, 208) of the first audio data (201) and the second audio data (202) in the first frequency range are determined separately from transition characteristics for a transition between the components (207, 209) of the first audio data (201) and the second audio data (202) in the second frequency range.
20. A computer-readable medium, in which a computer program is stored which, when being executed by a processor, is adapted to carry out a method of mixing first audio data (201) with second audio data (202), the method comprising the steps of filtering the first audio data (201) and the second audio data (202) to generate a component (206) of the first audio data (201) in a first frequency range, a component (207) of the first audio data (201) in a second frequency range, a component (208) of the second audio data (202) in the first frequency range, and a component (209) of the second audio data (202) in the second frequency range; determining a transition profile between the first audio data (201) and the second audio data (202) in such a manner that transition characteristics for a transition between the components (206, 208) of the first audio data (201) and the second audio data (202) in the first frequency range are determined separately from transition characteristics for a transition between the components (207, 209) of the first audio data (201) and the second audio data (202) in the second frequency range.
Description:
A system for and a method of mixing first audio data with second audio data, a program element and a computer-readable medium

FIELD OF THE INVENTION

The invention relates to a system for mixing first audio data with second audio data.

The invention further relates to a method of mixing first audio data with second audio data.

Moreover, the invention relates to a program element. Furthermore, the invention relates to a computer-readable medium.

BACKGROUND OF THE INVENTION

In the field of electronic entertainment apparatuses, many new applications are currently being developed and introduced on the market. When an audio player plays back different audio items one after the other, it is desirable to have an apparently seamless transition between two subsequent tracks. This may be denoted as "mixing". During a "cross-fade", it is possible to amplify each track during the transition phase from one track to another. In an automated system, in order to provide a seamless transition between tracks, amplification of the outgoing track will typically be reduced at the same rate as the amplification of the incoming track is increased.

A diagram 100 will be described with reference to Fig. 1, illustrating a level-complementary transition scheme in accordance with a prior-art system for mixing first audio data with second audio data.

The diagram 100 comprises an abscissa 101 on which the playback time of audio pieces is plotted. A gain of the different audio pieces is shown in arbitrary values between 0 and 1 on an ordinate 102 of the diagram 100. Fig. 1 shows a level-complementary transition between a first audio piece 103 and a second audio piece 104. In a first portion 105, the first audio piece 103 has a high gain and the second audio piece 104 has a low gain. In a subsequent transition portion 106, the first audio piece 103 is faded out, i.e. the corresponding gain is decreased, while the gain of the second audio piece 104 is increased (faded in) in the transition portion 106. In a second portion 107, the transition is completed and only the second audio piece 104 is played back, while the first audio piece 103 is no longer played back.
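
The level-complementary scheme of Fig. 1 can be illustrated with a minimal sketch. The linear ramp shape, the function name crossfade_gains and the example times are assumptions made for illustration only; the text above specifies only that one gain decreases while the other increases.

```python
def crossfade_gains(t, start, duration):
    """Return (gain_out, gain_in) at time t for a linear level-complementary fade.

    `start` and `duration` describe the transition portion (106 in Fig. 1).
    The linear ramp is an assumed shape, not one prescribed by the text.
    """
    if duration <= 0:
        raise ValueError("duration must be positive")
    # Progress through the transition portion, clamped to the range [0, 1].
    x = min(1.0, max(0.0, (t - start) / duration))
    return 1.0 - x, x


# Hypothetical transition that starts at 10 s and lasts 5 s.
for t in (0.0, 10.0, 12.5, 15.0, 20.0):
    g_out, g_in = crossfade_gains(t, start=10.0, duration=5.0)
    print(f"t={t:5.1f} s  outgoing={g_out:.2f}  incoming={g_in:.2f}")
```

The two gains always sum to 1, which corresponds to the roughly constant overall level of the prior-art scheme.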

During mixing, there are moments when the outgoing audio piece, or song, 103 and the incoming audio piece, or song, 104 are simultaneously played, namely the transition portion 106. In the state-of-the-art implementation shown in Fig. 1, the cross-fading profile according to the diagram 100 is implemented. The cross-fading is performed in such a way that, at any given moment, the overall audio level remains more or less constant (so-called "level-complementary transition"). However, this approach has the shortcoming that, if there is a slight misalignment in the phases of the low-frequency signals, the baseline can add up destructively. In particular, this may be the case when the transition interval 106 is relatively long.
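
The destructive addition described above can be reproduced numerically. The following sketch assumes two pure 50 Hz tones as stand-ins for the low-frequency content of the two songs, mixed with equal gains of 0.5 as at the midpoint of a level-complementary transition; the function name, tone frequency and sample rate are illustrative assumptions.

```python
import math


def peak_of_mix(freq_hz, phase_offset, gain_a=0.5, gain_b=0.5,
                sample_rate=8000, n_samples=8000):
    """Peak absolute value of gain_a*sin(wt) + gain_b*sin(wt + phase_offset)."""
    peak = 0.0
    for n in range(n_samples):
        wt = 2.0 * math.pi * freq_hz * n / sample_rate
        s = gain_a * math.sin(wt) + gain_b * math.sin(wt + phase_offset)
        peak = max(peak, abs(s))
    return peak


aligned = peak_of_mix(50.0, 0.0)           # phases match: full level is kept
quarter = peak_of_mix(50.0, math.pi / 2)   # quarter-period misalignment
opposed = peak_of_mix(50.0, math.pi)       # half-period misalignment: cancels
print(aligned, quarter, opposed)
```

Although the gains are level-complementary in every case, the mixed level drops as the phase misalignment grows, and the longer the transition lasts, the longer this drop is audible.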

US 6,534,700 B2 discloses an automated music compilation system wherein, during mixing of two musical tracks, the variations in combined output volume are reduced by analyzing either the intrinsic amplitude, at which each track was mastered, or the output amplitude, and by modifying either the intrinsic amplitude or amplification during the mixing phase. Musical clashes during mixing are avoided by analyzing intrinsic amplitudes of the two tracks at similar frequencies to detect the likelihood of a clash, and in the event a clash is detected, by reducing the output amplitude of one of the tracks at the relevant frequency. Particularly, an audio signal may be passed through a plurality of parallel signal-processing channels each having a respective frequency passband filter. A processor may determine as to which frequency range is dominant for a pair of tracks over their mutual transition period. The dominant range is then used to provide data necessary for equalizing the net output volume over the transition phase between the two tracks.

However, it is a shortcoming of US 6,534,700 B2 that audible artefacts may occur in a transition interval bridging two tracks.

OBJECT AND SUMMARY OF THE INVENTION

It is an object of the invention to achieve a distortion-free smooth transition between two audio tracks to be played back one after the other.

In order to achieve the object defined above, a system and a method for mixing first audio data with second audio data, a program element and a computer-readable medium have the features as defined in the independent claims.

In one embodiment of the invention, a system for mixing first audio data with second audio data is provided, wherein the system comprises a filter unit adapted to filter the first audio data and the second audio data to generate a component of the first audio data in a first frequency range, a component of the first audio data in a second frequency range, a component of the second audio data in the first frequency range, and a component of the second audio data in the second frequency range. A determining unit may be adapted to determine a transition profile between the first audio data and the second audio data in such a manner that transition characteristics for a transition between the components of the first audio data and the second audio data in the first frequency range are determined separately from transition characteristics for a transition between the components of the first audio data and the second audio data in the second frequency range.

In another embodiment of the invention, a method of mixing first audio data with second audio data is provided, wherein the method comprises the steps of filtering the first audio data and the second audio data to generate a component of the first audio data in a first frequency range, a component of the first audio data in a second frequency range, a component of the second audio data in the first frequency range, and a component of the second audio data in the second frequency range. Furthermore, a transition profile is determined between the first audio data and the second audio data in such a manner that transition characteristics for a transition between the components of the first audio data and the second audio data in the first frequency range are determined separately from transition characteristics for a transition between the components of the first audio data and the second audio data in the second frequency range.

Moreover, in yet another embodiment of the invention, a program element is provided, which, when being executed by a processor, is adapted to carry out a method of mixing first audio data with second audio data in accordance with the above-mentioned method steps.

In a further embodiment of the invention, a computer-readable medium is provided, in which a computer program is stored which, when being executed by a processor, is adapted to carry out a method of mixing first audio data with second audio data in accordance with the above-mentioned method steps.

The mixing of first audio data with second audio data according to the invention can be realized by a computer program, i.e. by means of software, or by using one or more special electronic optimization circuits, i.e. in hardware, or in a hybrid form, i.e. by means of software components and hardware components.

The characterizing features according to the invention particularly have the advantage that a transition profile defining properties or parameters of a transition from a first audio piece to a second audio piece can be determined separately for different frequency sub-bands. By taking this measure, it is possible to take into account frequency-specific frame conditions for a smooth segueing between two subsequent audio pieces, wherein the transition characteristics may be different for different frequency values. For instance, low-frequency components ("bass components") of audio content may be more prone to audible artefacts during a transition than high-frequency audio contributions ("treble components"). Consequently, it may be advantageous for a proper quality of replayed audio content that properties defining shape, length, etc. of a transition range are selected to be different for bass components than for treble components. For instance, it may be advantageous to adjust the amplitude and/or phase in this transition range in a different manner for bass components than for treble components. It may further be advantageous to select a relatively narrow transition range for the bass components to avoid undesired destructive interference of the components, wherein the transition period may be broader (i.e. it may have a longer duration in time) for the treble components. This results in a smoother transition between two audio excerpts.

According to the invention, frequency-equalized controlled audio mixing may be performed. In audio processing, the term "equalization" relates to a process of modifying a frequency envelope of audio content. A distortion-free smooth segueing during audio transition between two subsequent tracks may be achieved according to the invention, particularly by adjusting a frequency band-dependent transition duration between two consecutive audio pieces, like songs. It may further be advantageous to provide simultaneous but unequal mixing profiles for different frequency bands. A phase comparison of bass components in both channels and delay adjustment may be implemented to reduce undesired phase cancellation. Avoiding phase cancellation may result in an improved audio quality in the transition range.

According to an embodiment of the invention, a method is provided to adjust both the phase and the amplitude of sub-band signals in a graceful way. An amplitude overlap is performed in accordance with an adjustable or predefined transition profile. For instance, a short transition overlap may be selected for the bass component, and a longer overlap may be selected for the treble component. The system according to the invention allows lowering audible artefacts, particularly by addressing the issue of compensating a possible phase conflict between two songs. Thus, a mixing profile may be controlled in the sub-band domain.

In auto disc jockey (AutoDJ) implementations, accurate alignment of beats is important for a smooth transition between songs. Slight misalignment in the phases of the low-frequency components of the songs can result in severe destructive interference. According to an embodiment of the invention, a method and a system are disclosed that minimize or suppress such undesired interference effects by mixing the high- and low-frequency components differently in a systematic and controlled manner, using a frequency equalization technique.

According to an aspect of the invention, a simultaneous but non-equal mixing profile for treble and bass components may be carried out. Depending on the anticipated phase relation, a frequency-dependent transition interval may be implemented. When long transitions are unavoidable or preferred, a method can be carried out to appropriately mix the bass components in a way that minimizes the risk of phase-destructive addition.

According to one aspect of the invention, an AutoDJ system is provided, implementing a mechanism to resolve baseline-destructive interference in a transition interval when there is a slight misalignment in the phases of the low-frequency signals of the outgoing and incoming songs. According to the invention, a better control of the transition behavior in AutoDJ applications can be achieved. There is only a minimal chance of phase cancellation left so that audible artefacts are suppressed efficiently. Furthermore, it is possible to introduce pleasant sound effects in the transition interval, if desired.

According to one aspect of the invention, an automatic DJ function for creating a smooth transition between two songs is provided, wherein the treble and bass components of the songs may be mixed simultaneously and/or non-equally in multiple frequency bands. Thus, a multiple transition profile mixing is enabled. The invention thus allows providing a graceful cross-fading profile. This cross-fading may or may not be performed in such a way that at any given moment the overall audio level may be essentially constant ("level-complementary transition"). However, the adjustment of the transition profile may be performed separately for different frequency components. This has the advantage that a disruptive addition of the baseline due to slight misalignment in the phases of low-frequency signals is efficiently prevented, since the transition interval and/or the phase properties of the audio contributions can be adjusted in such a manner that artefacts are suppressed. Examples of application fields of the invention are DVD/HD-players, portable/wearable products, internet-radio applications, public entertainment centers, etc.

Particularly, it may be advantageous within the scope of the invention to provide a relatively short overlap for the low-frequency components, and a relatively long overlap for the high-frequency components to efficiently avoid that bass components are cancelled out in an undesired manner. By preventing such a destructive interference, a bad sound in the overlap region is avoided.

For instance, the phases of the audio contributions to be mixed can be adjusted by delaying or advancing them. A particular advantage of the invention is that it has been recognized that bass components may be more prone to undesired cancellation than treble components, so that an optimized adjustment of bass components has a strong influence on the quality of the resulting sound.

Further preferred embodiments of the invention will be described hereinafter with reference to the dependent claims. Preferred embodiments of the system for mixing first audio data with second audio data will now be described. These embodiments also apply to the method of mixing first audio data with second audio data, the program element and the computer-readable medium.

In the framework of such a system, the filter unit may be adapted to filter the first audio data and the second audio data to generate a component of the first audio data in at least one further frequency range. The determining unit may be adapted to determine transition characteristics for a transition profile between the components of the first audio data and the second audio data in the at least one further frequency range separately from transition characteristics for a transition profile between the components of the first audio data and the second audio data in the first frequency range and in the second frequency range. In other words, the invention is not limited to distinguishing two different frequency bands (particularly a high-frequency treble range and a low-frequency bass range), but can be implemented as well with three or more different frequency bands, for instance, a high-frequency band, a medium-frequency band and a low-frequency band. Filtering these individual components may be realized by using a respective bandpass filter for each frequency range. The higher the number of frequency bands which are distinguished and treated separately concerning the transition characteristics for the transition profile, the better refined is the mixing scheme and the audio quality which can be achieved.
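
A three-band split of the kind described above can be sketched as follows. This is not the bandpass filter bank named in the text: for brevity it swaps in a subtractive crossover built from first-order low-pass filters, so that the three bands sum back to the input. The function names, cut-off frequencies and sample rate are assumptions for illustration.

```python
import math


def one_pole_lowpass(x, cutoff_hz, sample_rate):
    """First-order low-pass filter (a simple stand-in for a real filter design)."""
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    y, state = [], 0.0
    for sample in x:
        state += alpha * (sample - state)
        y.append(state)
    return y


def split_three_bands(x, low_cut_hz, high_cut_hz, sample_rate):
    """Split x into (low, mid, high) bands whose sum reconstructs x."""
    lp_low = one_pole_lowpass(x, low_cut_hz, sample_rate)
    lp_high = one_pole_lowpass(x, high_cut_hz, sample_rate)
    low = lp_low                                       # below low_cut_hz
    mid = [h - l for h, l in zip(lp_high, lp_low)]     # between the cut-offs
    high = [s - h for s, h in zip(x, lp_high)]         # above high_cut_hz
    return low, mid, high


SAMPLE_RATE = 8000  # assumed sample rate in Hz
tone = [math.sin(2.0 * math.pi * 50.0 * n / SAMPLE_RATE) for n in range(2000)]
low, mid, high = split_three_bands(tone, 200.0, 2000.0, SAMPLE_RATE)
rebuilt = [l + m + h for l, m, h in zip(low, mid, high)]
print(max(abs(r - s) for r, s in zip(rebuilt, tone)))  # reconstruction error
```

Applying the same split to both songs yields per-band component pairs, each of which can then be given its own transition characteristics.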

Furthermore, in contrast to a human disc jockey, who is limited to having two hands and two ears and can therefore manually control at most two frequency ranges, the extension to three or more frequency ranges can only be implemented in an automatic manner. The automated mixing of three or more frequency components in an overlap regime thus significantly improves the flexibility and functionality of the system.

The determining unit may be adapted to determine the transition profile so that, before the transition, mixed data consist of first audio data; during the transition, mixed data comprise a decreasing contribution of the first audio data and an increasing contribution of the second audio data, and, after the transition, mixed audio data consist of the second audio data. In other words, the system according to the invention may be implemented in a "cross-fading" manner, wherein, at the end of a first audio excerpt, the respective amplitude is successively decreased, whereas simultaneously the amplitude of a subsequent second audio excerpt is successively increased.

The determining unit may be adapted to determine the transition profile so that a transition time interval is longer for the first frequency range than for the second frequency range. The length of the overlap between the first and the second song to be mixed may be selected separately for each frequency range. Particularly, when the first frequency range includes higher frequencies than the second frequency range, it may be advantageous to have a relatively short transition interval for the low-frequency components, which low-frequency components are more prone to the risk of destructive interference than high-frequency components. Then, a relatively short bass transition duration can be combined with a relatively long treble transition duration so that, simultaneously, a smooth transition and an artefact-free transition may be achieved.

The determining unit may be adapted to determine the transition profile so that a center of a transition time interval for the first frequency range essentially equals a center of a transition time interval for the second frequency range. The width of the transition windows for the different sub-bands may differ, but it is advantageous that these transition ranges are arranged symmetrically around a common audible center. This can help to improve the subjective quality experienced by a human listener listening to the mixed first and second audio data.

The determining unit may further be adapted to determine an amplitude of mixed audio data to be essentially constant during the transition. In other words, when the amplitude of the superposed first and second audio excerpts remains essentially constant during the mixing operation, this may improve the subjective quality experienced by a human listener hearing the mixed audio content.

The determining unit may be adapted to simultaneously determine the transition characteristics in the first frequency range and in the second frequency range. In other words, the determining unit may process the audio data to be mixed in parallel.
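
A frequency-dependent transition profile with a common center can be sketched as below. The linear ramps, the band names, the 8 s and 2 s durations and the 30 s center are hypothetical values chosen for illustration; the text only requires that the treble interval may be longer than the bass interval and that the centers essentially coincide.

```python
def band_gains(t, center, duration):
    """Linear fade centred on `center`; returns (gain_first, gain_second)."""
    start = center - duration / 2.0
    x = min(1.0, max(0.0, (t - start) / duration))
    return 1.0 - x, x


# Hypothetical profile: treble fades over 8 s, bass over 2 s, same centre.
PROFILE = {"treble": 8.0, "bass": 2.0}
CENTER = 30.0  # common centre of both transition intervals, in seconds


def gains_at(t):
    """Gains of both songs in every band at time t."""
    return {band: band_gains(t, CENTER, dur) for band, dur in PROFILE.items()}


for t in (25.0, 27.0, 30.0, 33.0, 35.0):
    print(t, gains_at(t))
```

At 27 s the treble is already one eighth of the way through its fade while the bass of the first song is still at full gain, so the two bass lines overlap only during the short window from 29 s to 31 s. Within each band the two gains sum to 1, which keeps the per-band level complementary.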

Furthermore, the system may comprise a phase-analyzing unit adapted to analyze a phase relation of the components of the first audio data and the second audio data in the first frequency range and/or a phase relation of the components of the first audio data and the second audio data in the second frequency range. The determining unit may be coupled to this phase-analyzing unit and adapted to determine the transition characteristics while considering the analyzed phase relations. By taking into account the frequency-specific phase properties of the different contributions of the audio excerpts to be mixed, the different components can be advanced or delayed in such a manner that audible artefacts are suppressed, which result from an undesired interaction of such components, for instance, destructive interference of bass components. By not only controlling the amplitude in the transition range, but additionally or alternatively controlling the phase properties, the quality of the mixed audio excerpts is improved.

Particularly, the sound quality can be significantly increased. The determining unit may be adapted to determine the transition characteristics so that phase-destructive interference of the components of the first audio data and the second audio data in the first frequency range and/or phase-destructive interference of the components of the first audio data and the second audio data in the second frequency range is prevented during the transition by selectively delaying or advancing the first audio data and/or the second audio data in the first frequency range and/or in the second frequency range. By including respective delaying (or advancing) elements to selectively and adjustably control the phase relations of the contributions to be mixed in each frequency range separately, the danger of artefacts resulting from unfavorable overlap is reduced.
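
One possible way to choose such a delay is sketched below. The text does not say how the phase relation is analyzed; this example assumes a brute-force cross-correlation search over integer sample lags, and the names best_lag and shift, the test tones and the lag limit are all hypothetical.

```python
import math


def best_lag(a, b, max_lag):
    """Lag (in samples) by which to delay b so that it correlates best with a."""
    best, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for n, sample in enumerate(a):
            m = n - lag  # index into b once b is delayed by `lag` samples
            if 0 <= m < len(b):
                score += sample * b[m]
        if score > best_score:
            best, best_score = lag, score
    return best


def shift(x, lag):
    """Delay x by `lag` samples (lag > 0) or advance it (lag < 0), zero-padded."""
    n = len(x)
    return [x[i - lag] if 0 <= i - lag < n else 0.0 for i in range(n)]


# Assumed bass components: same tone (period 20 samples), b runs 8 samples ahead.
theta = 2.0 * math.pi / 20.0
a = [math.sin(theta * n) for n in range(1000)]
b = [math.sin(theta * (n + 8)) for n in range(1000)]

lag = best_lag(a, b, max_lag=9)
peak_before = max(abs(0.5 * x + 0.5 * y) for x, y in zip(a, b))
peak_after = max(abs(0.5 * x + 0.5 * y) for x, y in zip(a, shift(b, lag)))
print(lag, peak_before, peak_after)
```

Before alignment, the equal-gain mix of the two tones is strongly attenuated; after delaying the second tone by the estimated lag, the mix recovers nearly the full level. The search range is kept below half a period so that the best lag is unambiguous.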

The system may further comprise a mixing unit adapted to mix the first audio data with the second audio data on the basis of the determined transition characteristics. The mixing unit may add the separate frequency-specific contributions to produce an output signal, which can be output via loudspeakers, headphones, or the like.

The determining unit may further be adapted to determine transition characteristics for a transition between the components of the first audio data and the second audio data in the first frequency range in a non-equal manner as compared to transition characteristics for a transition between the components of the first audio data and the second audio data in the second frequency range. According to this embodiment, the transition characteristics are different for different frequency bands. Separate parameters and/or parameter values defining the transition in each frequency interval may be defined. Thus, the number of degrees of freedom for an optimization is increased, which allows a sensitive adjustment of the transition characteristics.

The step of determining transition characteristics for a transition between the components of the first audio data and the second audio data may include determining amplitude properties and/or phase properties of the first audio data and/or of the second audio data in the first frequency range and/or in the second frequency range. These two properties in combination are appropriate to accurately define transition characteristics that may be fitted to the boundary conditions of an individual application.

The system according to the invention may be realized as an integrated circuit, particularly as a semiconductor integrated circuit. In particular, the system can be realized as a monolithic IC, which can be manufactured in silicon technology. The system according to the invention may be realized as an automatic disc jockey device, that is to say as a disc jockey device mixing different audio excerpts without the necessity of human user interference.

The system according to the invention may be realized as at least one of the group consisting of a DVD player, a hard disk-based audio player, a portable audio player, a wearable audio player, an internet radio apparatus, a public entertainment apparatus, and an MP3 player. These fields of application are merely given by way of example; the system according to the invention may be implemented for other applications as well.

Furthermore, the invention has been described with reference to pure audio data. However, the audio data processed according to the invention may also include combined audio and visual data, like video data. For instance, different subsequent music videos having visual and acoustical components can be mixed according to the invention, particularly in such a manner that the sound in the transition part is changed smoothly from a first video item to a second video item.

These and other aspects of the invention are apparent from and will be elucidated with reference to the embodiments described hereinafter.

BRIEF DESCRIPTION OF THE DRAWINGS

The invention will be described in more detail hereinafter with reference to non-limiting examples of embodiments.

Fig. 1 shows a diagram illustrating a level-complementary transition scheme according to a system for mixing first audio data with second audio data according to the prior art.

Fig. 2 illustrates a system for mixing first audio data with second audio data according to a first embodiment of the invention.

Fig. 3 shows a diagram illustrating multiple transition profile mixing according to an embodiment of the invention.

Fig. 4 illustrates a system for mixing first audio data with second audio data according to a second embodiment of the invention.

Fig. 5 illustrates a system for mixing first audio data with second audio data according to a third embodiment of the invention.

Fig. 6 illustrates a diagram showing the frequency behavior of a low-pass filter and a high-pass filter implemented in a system for mixing first audio data with second audio data according to the invention.

Fig. 7 illustrates a part of a system for mixing first audio data with second audio data according to a fourth embodiment of the invention.

Fig. 8 illustrates an auto disc jockey device according to an embodiment of the invention.

DESCRIPTION OF EMBODIMENTS

The illustrations in the drawings are schematic. In the different drawings, similar or identical elements are denoted by the same reference numerals.

A system 200 for mixing a first audio piece 201 with a second audio piece 202 according to an embodiment of the invention will now be described with reference to Fig. 2. The system 200 comprises a filter unit 203 including a first filter sub-unit 204 and a second filter sub-unit 205. The first filter sub-unit 204 is adapted to filter the first audio piece 201 to generate a low-frequency component 206 of the first audio piece 201 including the audio contributions with frequencies below a threshold, and a high-frequency component 207 of the first audio piece 201 including the audio contributions with frequencies at or above the threshold. The second filter sub-unit 205 is adapted to generate a low-frequency component 208 from the second audio piece 202 including the audio contributions with frequencies below the threshold and a high-frequency component 209 of the second audio piece 202 including the audio contributions with frequencies at or above the threshold.

Furthermore, a determining unit 210 is provided, which includes a first determining sub-unit 211 and a second determining sub-unit 212. The determining unit 210 is adapted to determine a transition profile between the first audio piece 201 and the second audio piece 202, i.e. characteristics of the transition at the end of the first audio piece 201 and at the beginning of the second audio piece 202. Particularly, the first determining sub-unit 211 determines transition characteristics for a transition between the low-frequency component 206 of the first audio piece 201 and the low-frequency component 208 of the second audio piece 202 in the low-frequency range. Separately from this determination, the second determining sub-unit 212 determines transition characteristics for a transition between the high-frequency component 207 of the first audio piece 201 and the high-frequency component 209 of the second audio piece 202 in the high-frequency range. In other words, the first determining sub-unit 211 determines parameters defining a transition of the bass components of the input audio pieces 201, 202. The second determining sub-unit 212 determines parameters for a smooth transition of the treble components of the audio pieces 201, 202. Thus, the output of the first determining sub-unit 211 is a low-frequency mixed audio piece 213 obtained by mixing the low-frequency component 206 of the input audio piece 201 and the low-frequency component 208 of the input audio piece 202 in accordance with a certain low-frequency transition profile. The output of the second determining sub-unit 212 is a high-frequency mixed audio piece 214 obtained by mixing the high-frequency component 207 of the input audio piece 201 and the high-frequency component 209 of the input audio piece 202 in accordance with a certain high-frequency transition profile.

The low-frequency mixed audio piece 213 and the high-frequency mixed audio piece 214 are input to a mixing unit 215, which merges the different audio contributions in such a manner that mixed audio data 216 are supplied at an output of the mixing unit 215, ready to be output by a loudspeaker, a headphone or the like. The mixing unit 215 thus mixes the first audio piece 201 with the second audio piece 202 on the basis of the determined transition characteristics for the two different frequency ranges.
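The signal flow of Fig. 2 can be sketched as follows, assuming the four band components are already available as equal-length sample lists and that each band is cross-faded with a linear fade-in gain. The helper names mix_band, combine, and ramp, as well as the toy values, are hypothetical illustrations and not part of the described system.

```python
def mix_band(outgoing, incoming, incoming_gain):
    """Determining sub-unit (211 or 212): cross-fade one frequency band.
    incoming_gain[n] is the fade-in gain of the incoming piece at sample n."""
    return [(1.0 - g) * a + g * b
            for a, b, g in zip(outgoing, incoming, incoming_gain)]


def combine(low_mixed, high_mixed):
    """Mixing unit 215: add the band-wise mixes into one output signal."""
    return [lo + hi for lo, hi in zip(low_mixed, high_mixed)]


def ramp(length, start, stop):
    """Hypothetical linear fade-in gain: 0 up to `start`, 1 from `stop` on."""
    return [min(1.0, max(0.0, (n - start) / (stop - start)))
            for n in range(length)]


# Toy band components (constants, so the effect of the gains is easy to see).
N = 8
x_low, x_high = [1.0] * N, [0.5] * N    # components 206, 207 of piece 201
y_low, y_high = [2.0] * N, [0.25] * N   # components 208, 209 of piece 202

z_low = mix_band(x_low, y_low, ramp(N, 3, 5))     # short bass transition
z_high = mix_band(x_high, y_high, ramp(N, 0, 8))  # long treble transition
z = combine(z_low, z_high)
print(z)
```

The two calls to mix_band use different ramps, which is the point of the arrangement: each band gets its own transition characteristics before the bands are summed.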

A first diagram 300 and a second diagram 310 illustrating the mixing performed by the system 200 for mixing audio data will now be described with reference to Fig. 3. In the first diagram 300, the time during playback of the high-frequency component 207 of the first audio piece 201 mixed with the high-frequency component 209 of the second audio piece 202 is plotted on an abscissa 301. The high-frequency component 207 contains the contribution of frequencies of the first audio piece 201 in a range around a frequency f_H. The high-frequency component 209 contains the contribution of frequencies of the second audio piece 202 in a range around the frequency f_H. A gain (that is to say the amplitude of the audio contributions 207, 209) is plotted in arbitrary units between 0 and 1 on an ordinate 302 of the first diagram 300. The high-frequency contribution 207 can also be denoted as the treble profile for the outgoing audio piece, or song, 201. The term "outgoing" denotes an audio piece that has already been played for some time and is to be smoothly reduced in its amplitude to be faded out. The term "incoming" denotes an audio piece which is to be played next and should be smoothly increased in its amplitude to be faded in. The high-frequency component 209 can also be denoted as the treble profile for the incoming song 202.

As can be seen from the first diagram 300, there is a first treble portion 303 in which essentially only the first audio piece 201 is played. In a subsequent treble transition portion 304, an overlap of the outgoing first audio piece 201 with the incoming second audio piece 202 is shown for the high-frequency contributions 207, 209. In this treble transition portion 304, the high-frequency component 207 of the first audio piece 201 decreases in intensity, whereas simultaneously the high-frequency component 209 of the second audio piece 202 increases in intensity. In a subsequent second treble portion 305, essentially only treble components 209 of the second audio piece 202 are played back.

In a similar manner as in the first diagram 300, a second diagram 310 illustrates a multiple transition profile mixing for the low-frequency components 206, 208 of the first and second audio pieces 201, 202. The low-frequency component 206 contains the contribution of frequencies of the first audio piece 201 in a range around a frequency f_L. The low-frequency component 208 contains the contribution of frequencies of the second audio piece 202 in a range around the frequency f_L. The abscissa 301 is divided into three portions, namely a first bass portion 311, a bass transition portion 312 and a second bass portion 313. In the first bass portion 311, only the low-frequency component 206 of the first audio piece 201 is played back, i.e. the first bass portion 311 represents a bass profile for the outgoing song 201. In the bass transition portion 312, there is a bass overlap, that is to say the low-frequency component 206 of the first audio piece 201 is played back with a decreasing amplitude, whereas the amplitude of the low-frequency component 208 of the incoming song 202 increases in the bass transition portion 312. In the second bass portion 313, there is essentially only a contribution originating from the low-frequency component 208 of the second audio piece 202.

As one can gather from Fig. 3, the transition characteristics for the high-frequency components 207, 209 (see first diagram 300) are adjusted independently and separately from the transition characteristics of the low-frequency components 206, 208 (see second diagram 310).

The illustration of Fig. 3 plots the diagrams 300, 310 in some kind of three-dimensional manner, that is to say on a frequency axis 330. Although separate transition ranges are plotted in Fig. 3 only for two frequency ranges around f_L and f_H, it is of course possible to extend this to any desired number of frequency ranges for which the transition profile is adjusted separately.

During the mixing operation performed in the transition periods 304, 312, there are moments when the outgoing song 201 and the incoming song 202 are played simultaneously. A typical duration of such a transition period 304, 312 between two subsequent mixed tracks 201, 202 may be of the order of, for instance, 10 to 30 seconds. According to the invention, a graceful cross-fading profile is realized. As seen in Fig. 3, the treble components 207, 209 and the bass components 206, 208 of the songs 201, 202 are mixed differently and at different moments.

A human user (for instance, a human disc jockey) who mixes the treble components 207, 209 and the bass components 206, 208 at different moments can simultaneously concentrate on at most two sounds and two controls ("two ears, two hands" limitation). Thus, a human disc jockey can properly mix at most two signals at a time. In contrast to this, an automatic disc jockey based on the system 200 illustrated in Fig. 2 and Fig. 3 does not suffer from such limitations. In addition to distinguishing between high and low frequencies (bass and treble) as illustrated in Figs. 2 and 3, it is possible with the system 200 to simultaneously and non-equally mix profiles for treble and bass components and, if desired, for at least one further frequency component. Thus, any desired number of frequency sub-bands can be treated separately where mixing properties are concerned. Depending on anticipated phase relations, frequency-dependent transition intervals 304, 312 may be implemented. When long transition intervals 304, 312 are unavoidable or preferred, it is possible to mix the bass components in a way that reduces or minimizes the risk of phase-destructive addition.

According to the invention, it is possible to separately but preferably simultaneously control the profiles of the transitions in multiple frequency bands, as shown in Fig. 3. In the simple example of Fig. 3, this is illustrated for the case of two frequency bands, namely a treble frequency band and a bass frequency band.

Since the overlap time is small for the bass component (see the relatively narrow time interval of the bass transition portion 312), the risk of phase-destructive mixing is minimal. However, since treble frequencies are less prone to such destructive mixing, the treble transition portion 304 may be broader, which allows a smooth transition from one song 201 to another song 202. According to the invention, it is generally possible to choose several frequency band-dependent transition profiles.

A system for mixing audio data 400 according to a second embodiment of the invention will now be described with reference to Fig. 4.

In the system 400 for mixing audio data, a first audio piece 201 is applied to a first filter bank 401, and a second audio piece 202 is applied to a second filter bank 402. Each filter bank 401, 402 filters the supplied audio piece 201, 202 to separate at least low-frequency components and high-frequency components. Thus, the data x[n] related to the first audio piece 201 are filtered by the first filter bank 401 to produce a low-frequency component x_L[n] 206 and a high-frequency component x_H[n] 207. In a similar manner, the second filter bank 402 filters the data y[n] related to the second audio piece 202 to generate a low-frequency component y_L[n] 208 and a high-frequency component y_H[n] 209. The low-frequency components 206, 208 are supplied at an input of a low-frequency mixer 403. The high-frequency components 207, 209 are supplied at an input of a high-frequency mixer 404. The mixers 403, 404 receive commands from a microprocessor 405 defining how the incoming signals should be mixed, so that the transition between the first audio piece 201 and the second audio piece 202 is performed with good subjective audio quality as perceived by a human listener. A low-frequency signal z_L[n] corresponding to a low-frequency transition profile 213 is supplied at an output of the low-frequency mixer 403. A high-frequency signal z_H[n] corresponding to a high-frequency transition profile 214 is supplied at an output of the high-frequency mixer 404. These signals z_L[n] and z_H[n] are applied to inputs of a synthesis filter 406 synthesizing the different components to generate a signal z[n] representing mixed audio data 216 at an output of the synthesis filter 406.

Still referring to the embodiment shown in Fig. 4, during the transition period, the two identical analysis filter banks 401, 402 decompose the two input signals x[n] and y[n] into two complementary components x_L[n] and x_H[n], and y_L[n] and y_H[n], respectively. These may be low-frequency (bass) and high-frequency (treble) components. Subsequently, the mixers 403, 404 (also denoted as MX1 and MX2) are applied to mix the corresponding frequency components of the two signals. Assuming that x_L[n] and y_L[n] are the low-frequency components, and x_H[n] and y_H[n] are the high-frequency components, the transition profiles of MX1 and MX2 may be similar to those shown in Fig. 3. Thus, Fig. 4 shows a filter bank-based implementation of a system for mixing audio data according to an embodiment of the invention. The outputs z_L[n] and z_H[n] of the mixers 403, 404 are then passed on to the synthesis filter bank 406 to generate the output mix signal z[n]. The synthesis filter 406 is preferably designed in such a way that it forms a perfect reconstruction pair with the filter banks 401, 402. The input from the microprocessor 405 controls the two mixers 403, 404 and preferably conveys information concerning the mix moment and the amount of overlap.
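The role of a perfect reconstruction pair can be illustrated with a very small sketch. It assumes a two-band Haar filter bank, which the description does not prescribe and which is chosen here only because its analysis and synthesis steps fit in a few lines; the function names and the single gain per band are likewise hypothetical.

```python
def analysis(x):
    """Haar analysis bank: even-length input -> (low, high) half-rate bands."""
    pairs = [(x[i], x[i + 1]) for i in range(0, len(x), 2)]
    low = [(a + b) / 2.0 for a, b in pairs]
    high = [(a - b) / 2.0 for a, b in pairs]
    return low, high


def synthesis(low, high):
    """Haar synthesis bank: exact inverse of analysis()."""
    out = []
    for lo, hi in zip(low, high):
        out += [lo + hi, lo - hi]
    return out


def mix(x_band, y_band, incoming_gain):
    """Mixer MX1 or MX2: cross-fade one band with a single gain in [0, 1]."""
    return [(1.0 - incoming_gain) * a + incoming_gain * b
            for a, b in zip(x_band, y_band)]


x = [1.0, 2.0, 3.0, 4.0]   # outgoing piece x[n]
y = [8.0, 6.0, 4.0, 2.0]   # incoming piece y[n]
x_low, x_high = analysis(x)
y_low, y_high = analysis(y)

# Bass already switched to the incoming piece, treble still on the outgoing one.
z = synthesis(mix(x_low, y_low, 1.0), mix(x_high, y_high, 0.0))
print(z)
```

With both gains at 0 the output equals x[n], and with both at 1 it equals y[n], so outside the transition the filter banks leave the audio unchanged. That property is what the perfect reconstruction requirement buys.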

A system 500 for mixing audio data according to a third embodiment of the invention will now be described with reference to Fig. 5.

The system 500 differs from the system 400 in that a first low-pass filter 501 and a first high-pass filter 502 replace the first filter bank 401. A second low-pass filter 503 and a second high-pass filter 504 replace the second filter bank 402. The first low-pass filter 501 extracts the low-frequency component x_L[n] 206 from the first audio piece x[n] 201. The first high-pass filter 502 extracts the high-frequency component x_H[n] 207 from the first audio piece x[n] 201. The second low-pass filter 503 extracts the low-frequency component y_L[n] 208 from the second audio piece y[n] 202. The second high-pass filter 504 extracts the high-frequency component y_H[n] 209 from the second audio piece y[n] 202.

Furthermore, in the embodiment shown in Fig. 5, the synthesis filter 406 of Fig. 4 is replaced by an adding unit 505 for adding up the components z_L[n] and z_H[n] supplied at outputs of the mixers 403, 404. The embodiment shown in Fig. 5 has a pair of complementary filters. Fig. 6 shows a diagram 600 having an abscissa 601 on which the frequency is plotted. Furthermore, the intensity is plotted in arbitrary units on an ordinate 602. Fig. 6 shows a low-pass filter frequency behavior 603 illustrating the frequency response of the low-pass filters 501, 503. Fig. 6 also shows a high-pass filter frequency behavior 604 reflecting the frequency response of the high-pass filters 502, 504. The low-pass filters 501, 503 and the high-pass filters 502, 504 should have such a behavior that the sum LPF + HPF forms an all-pass filter. Examples of the frequency responses of LPF and HPF are shown in Fig. 6.
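A simple way to obtain a complementary pair whose sum passes the input unchanged is to define the high-pass output as the input minus the low-pass output. The sketch below assumes a one-pole low-pass filter with a hypothetical smoothing coefficient alpha; the description does not specify the filter type, so this is only an illustration of the LPF + HPF condition.

```python
def low_pass(x, alpha):
    """One-pole low-pass: y[n] = y[n-1] + alpha * (x[n] - y[n-1])."""
    y, state = [], 0.0
    for sample in x:
        state += alpha * (sample - state)
        y.append(state)
    return y


def split_bands(x, alpha=0.1):
    """Return (low, high) with high defined as x minus low, so low + high = x."""
    low = low_pass(x, alpha)
    high = [s - lo for s, lo in zip(x, low)]
    return low, high


signal = [0.0, 1.0, 0.5, -0.25, 0.75, -1.0, 0.0, 0.3]
low, high = split_bands(signal)
recombined = [lo + hi for lo, hi in zip(low, high)]
print(max(abs(r - s) for r, s in zip(recombined, signal)))
```

The printed value is the largest reconstruction error, which stays at the level of floating-point rounding. An adding unit such as 505 can therefore recombine the bands without a dedicated synthesis filter.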

According to the invention, any desired number of multi-frequency bands, each with its own transition profile, can be chosen. Moreover, the transition profile in each frequency band can independently vary from zero overlap to a very large overlap.

A portion of a system 700 for mixing audio data according to a fourth embodiment of the invention will now be described with reference to Fig. 7.

The part of the system 700 for mixing audio data shown in Fig. 7 addresses the issue of controlling the phases of the bass components of the signals to be mixed so as to minimize or reduce the risk of destructive interference. For this case, the mixer 403 shown in Fig. 4 and Fig. 5 can be realized as shown in Fig. 7.

The phases of the low-frequency components x_L[n] and y_L[n] are first compared in a phase-analyzing unit 701. The output of the phase-analyzing unit 701 serves as a basis for a control signal C that controls a first delay unit 702 and a second delay unit 703 so as to minimize any phase conflict during addition. Thus, the signal x_L[n] is delayed (or advanced) by a particular amount defined by the first delay unit 702 and is then applied to a first gain unit 704. The signal y_L[n] is delayed or advanced, using the second delay unit 703, and is then amplified by a second gain unit 705. The outputs of the gain units 704, 705 are added in an adding unit 706 to generate the signal z_L[n].

The circuit shown in Fig. 7 thus compensates phase differences and shows details of the mixer unit 403 (or MX1) for phase-compensated mixing.

To prevent audible artefacts, the delays Dx and Dy of the delay units 702, 703 may be changed or adjusted in a graceful way. The gains Gx and Gy of the gain units 704, 705 implement a cross-fading profile similar to that shown in the second diagram 310 of Fig. 3.
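The delay, gain, and add structure of Fig. 7 can be sketched as follows. The sketch assumes fixed integer delays with zero padding and per-sample gain sequences; the function names and all numeric values are hypothetical, and the gradual adjustment of Dx and Dy mentioned above is not modelled.

```python
def delay(signal, d):
    """Integer delay by d samples (negative d advances); zero-padded."""
    n = len(signal)
    return [signal[i - d] if 0 <= i - d < n else 0.0 for i in range(n)]


def phase_compensated_mix(x_low, y_low, dx, dy, gx, gy):
    """Delay units 702, 703, gain units 704, 705, adding unit 706.
    gx and gy are per-sample gains with the same length as the signals."""
    xd, yd = delay(x_low, dx), delay(y_low, dy)
    return [gx[n] * xd[n] + gy[n] * yd[n] for n in range(len(xd))]


x_low = [1.0, 2.0, 3.0, 4.0]
y_low = [10.0, 20.0, 30.0, 40.0]
gx = [1.0, 0.5, 0.5, 0.0]   # outgoing bass fades out
gy = [0.0, 0.5, 0.5, 1.0]   # incoming bass fades in

z_low = phase_compensated_mix(x_low, y_low, dx=0, dy=1, gx=gx, gy=gy)
print(z_low)
```

In a complete system the values of dx and dy would come from the control signal C rather than being fixed by hand.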

An automatic disc jockey device 800 according to an embodiment of the invention will now be described with reference to Fig. 8.

The automatic disc jockey device 800 comprises a system for mixing audio signals according to an embodiment of the invention. Using the automatic disc jockey device 800, it is possible to sort contents based on some similarity criteria and play them in a smooth, rhythmically consistent way. The latter procedure is referred to as automatic disc jockey or AutoDJ. The function of an AutoDJ implementing a system according to the invention is shown in Fig. 8. First, songs that are stored in a song database unit 801 (for instance, a hard disk or a CD or DVD) are analyzed to extract representative parameters. This analysis is performed in an automatic disc jockey analysis unit 802. These representative parameters may include, among others, End of Intro, Begin of Outro, Phrase or bar boundaries, Tempo and beat locations (onsets), Harmonic Signature, or the like. These parameters, which may also be denoted as AutoDJ parameters, can be computed offline and stored in a linked database, namely a feature database unit 803 (which is, for instance, a hard disk or the like).

On a parallel path, a playlist using user preferences is generated, wherein a playlist generator unit 805 generates the playlist. Given such a playlist, a so-called transition analyzer and playlist-reordering unit 804 compares the AutoDJ parameters corresponding to the songs in the playlist, determines an optimal order to play and generates a set of commands to be executed by a playback unit 806 (a CD player, a DVD player, or the like).
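How a reordering step might compare AutoDJ parameters can be sketched with a deliberately simple heuristic. The sketch assumes tempo in beats per minute is the only parameter and uses a greedy nearest-neighbour ordering, which is not guaranteed to be optimal and is not stated in the description; the function name and song data are hypothetical.

```python
def reorder_playlist(tempos, first):
    """Greedy ordering: start with `first`, then repeatedly pick the
    remaining song whose tempo is closest to the current one."""
    remaining = dict(tempos)
    order = [first]
    current = remaining.pop(first)
    while remaining:
        name = min(remaining, key=lambda s: abs(remaining[s] - current))
        current = remaining.pop(name)
        order.append(name)
    return order


# Hypothetical feature database entries: song name -> tempo in BPM.
tempos = {"a": 120.0, "b": 90.0, "c": 124.0, "d": 100.0}
playlist = reorder_playlist(tempos, "a")
print(playlist)
```

A fuller implementation would presumably weigh several parameters, such as harmonic signature and beat locations, when scoring each candidate transition.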

Finally, the player streams the songs from the database into an output-rendering device (for instance, loudspeakers 807) executing the sequence of commands dictating how the songs should be mixed and played back. The transitions between two subsequent audio pieces to be played back by the playback unit 806 and the loudspeaker 807 are determined in the transition analyzer and playlist-reordering unit 804 in accordance with the frequency-equalized control audio mixing scheme according to the invention.

It should be noted that use of the verb "comprise" and its conjugations does not exclude the presence of elements or steps other than those stated in the claims, and use of the indefinite article "a" or "an" preceding an element or step does not exclude the presence of a plurality of such elements or steps. Moreover, elements described in association with different embodiments may be combined.

It should also be noted that reference signs in the claims shall not be construed as limiting the scope of the claims.