Document |
Document Title |
WO/2023/183335A1 |
Described are in-ear monitoring (IEM) systems configured for audio performance environments requiring low audio latency and high scalability. IEM systems can include an audio channel allocation device that determines audio channel alloca...
|
WO/2023/183730A1 |
An automated speech recognition (ASR) model (200) includes a first and second encoders (210), (220) and first and second decoders (204), (206). The first encoder receives, as input, a sequence of acoustic frames (104), and generates, at ...
|
WO/2023/182766A1 |
The present disclosure relates to a method, a system, and a non-transitory computer readable recording medium for providing a voice recognition trigger. A method for providing a voice recognition trigger according to an embodiment of the...
|
WO/2023/183663A1 |
Systems and techniques are provided for processing audio data. For example, a dummy prototypical network may be used to perform few-shot open-set keyword spotting (FSOS-KWS). A process can include determining one or more prototype repres...
|
WO/2023/183419A2 |
An electronic door chime can meet safety standards and be compatible with doorbells. A new electronic door chime can safely power doorbells and can be configured in a single gang box, while also being compatible with simple push button...
|
WO/2023/183201A1 |
A computer-implemented method (500) includes receiving a sequence of acoustic frames (110) corresponding to an utterance and generating a reference speaker embedding (342) for the utterance. The method also includes receiving a target sp...
|
WO/2023/179226A1 |
A method and apparatus for voice control of an air conditioner, and an air conditioner and a storage medium, which relate to the technical field of intelligent household appliances. The method for voice control of an air conditioner comp...
|
WO/2023/182765A1 |
Methods for processing and analyzing audio recordings and, in particular, for speech denoising are provided. A method for speech denoising using a fast Fourier convolution operator comprises: splitting channels of input tensor into local...
|
WO/2023/183683A1 |
A method (700) for training a generalized automatic speech recognition model for joint acoustic echo cancellation, speech enhancement, and voice separation includes receiving a plurality of training utterances (532) paired with correspon...
|
WO/2023/182014A1 |
This voice authentication device comprises: an acquisition unit that acquires voice data; a detection unit that detects from the voice data an utterance section in which a speaker is uttering; an extraction unit that extracts an utteranc...
|
WO/2023/183010A1 |
A method includes receiving a set of training utterances each including a non-synthetic speech representation (304), and for each training utterance, generating a corresponding synthetic speech representation (306) by a voice conversion ...
|
WO/2023/182300A1 |
[Problem] To provide a mechanism that is capable of further improving the quality of binaural reproduction. [Solution] A signal processing system comprising a first control unit that: calculates a transmission characteristic correspondin...
|
WO/2023/182605A1 |
The present specification relates to a method by which a terminal trains a mathematics-related artificial intelligence model, and comprises the steps of: collecting mathematical sentences for training the mathematics-related artificial i...
|
WO/2023/179506A1 |
The present disclosure relates to a prosody prediction method and apparatus, and a readable medium and an electronic device, by means of which more appropriate prosody features can be obtained. The method comprises: acquiring a target te...
|
WO/2023/179229A1 |
A method and apparatus for testing an air conditioner, and a test system and a storage medium. The method comprises: controlling a loudspeaker box to play a wake-up instruction (S230); controlling the loudspeaker box to play a test instr...
|
WO/2023/182015A1 |
Provided is a voice authentication device comprising an acquisition unit which acquires voice data, a detection unit which detects, from the acquired voice data, an utterance section in which a speaker is making an utterance and a non-ut...
|
WO/2023/182065A1 |
A content acquisition unit (23) in the present invention extracts a divided block having an evaluation value equal to or greater than a threshold value from among content in which an evaluation value is associated with each of divided bl...
|
WO/2023/183206A1 |
A method (500) of text-only and semi-supervised training for deliberation includes receiving training data (320) including unspoken textual utterances (330) that are each not paired with any corresponding spoken utterance of non-syntheti...
|
WO/2023/182718A1 |
A method of adjusting a predefined listening time of a voice assistant device includes receiving an audio input; extracting at least one of a speech component and a non-speech artifact from the audio input; determining a user breathing p...
|
WO/2023/179098A1 |
The present application provides a noise reduction method and apparatus, a vehicle, etc. Position information is acquired by a camera, etc., the position information indicating the position of the head or ears of a user. The position of ...
|
WO/2023/181889A1 |
The present technology relates to an image-capturing device, an image-capturing method, and a program with which it is possible to easily register the voice of a specific person, together with a portion of sound, as sound data of a movin...
|
WO/2023/183268A1 |
A method (500) includes receiving, as input to a speech recognition model (200), audio data (110) corresponding to a spoken utterance (106). The method also includes performing speech recognition on the audio data by, at each of a plural...
|
WO/2023/182542A1 |
A display device according to an embodiment of the present disclosure generates and provides abbreviated content by selecting preferred content on the basis of a viewing history of a user and processing the content according to a prefere...
|
WO/2023/180855A1 |
Presented herein are techniques for multi-band channel coordination in medical device systems. More specifically, in accordance with certain embodiments presented herein, a plurality of source filter channel signals are generated via a p...
|
WO/2023/183530A1 |
A method (400) includes receiving a sequence of acoustic frames (110) as input to an automatic speech recognition (ASR) model (200). The method also includes generating, by a first encoder (210), a first higher order feature representati...
|
WO/2023/183684A1 |
A multichannel neural frontend speech enhancement model (200) includes a speech cleaner (300), a stack of self-attention blocks (400) each having a multi-headed self attention mechanism, and a masking layer (240). The speech cleaner rece...
|
WO/2023/179800A1 |
A communication receiving method and an apparatus (200) thereof, relating to the technical field of communications. The specific implementation solution comprises: receiving a bit stream transmitted by a channel (S101); parsing the bit s...
|
WO/2023/183666A1 |
A method includes generating an input data state for each data sample in a time series of data samples of a portion of an audio data stream. The method also includes providing at least one input data state to a first bottleneck and at le...
|
WO/2023/182291A1 |
The present invention improves response time for waveform generation and makes it possible to perform detailed processing of a rhythm feature quantity based on overall input before the waveform generation. According to the embodiments, a...
|
WO/2023/182016A1 |
This voice authentication device comprises: a detection unit for detecting, from speech data, a speech segment in which a speaker is speaking and a non-speech segment in which the speaker is not speaking; an extraction unit for extractin...
|
WO/2023/182055A1 |
A player according to one embodiment and equipped with a plurality of sensors which respectively detect the operation of a plurality of player operators which include a number a (a is an integer of 3 or higher, and a>c) of player operators...
|
WO/2023/181571A1 |
A data output method according to one embodiment of the present invention includes: acquiring musical performance data generated by musical performance operation; identifying a musical score performance position in a predetermined musica...
|
WO/2023/182005A1 |
A data output method according to one embodiment includes: sequentially acquiring input data relating to a performance operation; acquiring a plurality of estimation information including first estimation information and second estimatio...
|
WO/2023/181144A1 |
Provided is a noise elimination device capable of responding to a change in noise over time using a single acoustic sensor. The present invention comprises a Fourier transform unit for dividing a single input signal including a compone...
|
WO/2023/181574A1 |
A pad operation part 50 includes a first group GrL including a plurality of pads, and a second group GrR including a plurality of pads disposed at symmetrical positions with respect to the plurality of pads included in the first group Gr...
|
WO/2023/181520A1 |
The present invention provides an air duct with a silencer capable of efficiently reducing sound to be propagated to an air-blowing destination including noise generated in the air duct when air is blown. An air duct with a silencer ac...
|
WO/2023/183370A1 |
A system and method that can be implemented in, among other things, a computer-implemented method for intuitive dictation without or with minimal use of other input devices besides a microphone, and without or with minimal use of keyword...
|
WO/2023/181431A1 |
This electronic musical instrument comprises: a keyboard that accepts a performance operation performed by a user; a sound source unit 41 that generates an acoustic signal S (SL, SR) corresponding to the operation on the keyboard; a reve...
|
WO/2023/181519A1 |
The present invention provides an air duct with a silencer capable of efficiently reducing sound to be propagated to an air-blowing destination in consideration of noise generated in the air duct when air is blown. An air duct with a s...
|
WO/2023/181747A1 |
A sound source system according to the present invention is provided with: a plurality of sound source cores that process musical sound data; a phase control unit that matches, among the plurality of sound source cores, clock phases that...
|
WO/2023/181223A1 |
A speech recognition device (10) according to an embodiment comprises a speech recognition unit (131) and a score calculation unit (132). The speech recognition unit (131) generates a lattice on the basis of the result of performing spee...
|
WO/2023/183292A1 |
A method (400) includes generating, using an audio encoder (210), a higher-order feature representation (212) for each acoustic frame in a sequence of acoustic frames (110); generating, using a decoder (215), based on the higher-order fe...
|
WO/2023/178583A1 |
Systems and methods are provided for generating a pseudo-labeled training dataset by at least one of: (1) extracting a set of intermediate outputs from an automatic speech recognition model based on applying the automatic speech recognit...
|
WO/2023/179828A1 |
The invention relates to a method for actively monitoring sound emissions of turbomachinery, in particular turbomachinery which has an electric motor, preferably a ventilator or a turbomachine. A sound signal, which is produced by superi...
|
WO/2023/181448A1 |
[Problem] To provide an operation display device with which it is possible to effectively utilize space and improve the operability of a touch operation, and an electronic musical instrument provided therewith. [Solution] An operation di...
|
WO/2023/179846A1 |
An apparatus comprising means for: obtaining values for parameters representing an audio signal, the values comprising at least one directional value and at least one energy ratio value for at least one sub-frame of each sub-band of a fr...
|
WO/2023/183680A1 |
A method (900) includes receiving training data that includes unspoken textual utterances (320), un-transcribed non-synthetic speech utterances (306), and transcribed non-synthetic speech utterances (304). Each unspoken textual utterance...
|
WO/2023/181573A1 |
Provided is an input device that includes a first region 10, a second region 20, and a third region 30, which each include at least one pad and to which the tones of a bass drum, snare drum, and cymbals are respectively allocated. In the...
|
WO/2023/182063A1 |
An evaluation value aggregation unit (14) aggregates, for each divided block of content, evaluation values derived by evaluating the divided blocks, and generates an aggregated evaluation value. An electronic ledger management unit (13) ...
|
WO/2023/182008A1 |
In this sound output method, a plurality of directional speakers are arranged side by side in the height direction, detection processing is performed for detecting the position of the head of a user, identification processing is performe...
|