Phoneme-Based Contextualization for Cross-Lingual Speech Recognition in End-to-End Models

Title:

Phoneme-Based Contextualization for Cross-Lingual Speech Recognition in End-to-End Models

Document Type and Number:

Japanese Patent JP7092953

Kind Code:

B2

Abstract:

A method includes receiving audio data encoding an utterance spoken by a native speaker of a first language, and receiving a biasing term list including one or more terms in a second language different than the first language. The method also includes processing, using a speech recognition model, acoustic features derived from the audio data to generate speech recognition scores for both wordpieces and corresponding phoneme sequences in the first language. The method also includes rescoring the speech recognition scores for the phoneme sequences based on the one or more terms in the biasing term list, and executing, using the speech recognition scores for the wordpieces and the rescored speech recognition scores for the phoneme sequences, a decoding graph to generate a transcription for the utterance.
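The flow described in the abstract can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the `Hypothesis` structure, the fixed additive `bonus`, the phoneme symbols, the example scores, and the pronunciation given for the biasing term are all hypothetical, and the decoding graph is reduced to picking the best-scoring candidate from a short list. It is not the patented implementation.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    text: str                # candidate transcription built from wordpieces
    wordpiece_score: float   # log-score for the wordpiece sequence (hypothetical)
    phonemes: tuple          # corresponding first-language phoneme sequence
    phoneme_score: float     # log-score for the phoneme sequence (hypothetical)


def contains(seq, sub):
    """Return True if `sub` occurs as a contiguous run inside `seq`."""
    n = len(sub)
    return any(tuple(seq[i:i + n]) == tuple(sub) for i in range(len(seq) - n + 1))


def rescore(hyp, biasing_phonemes, bonus=2.0):
    """Boost the phoneme score once for each biasing term whose phonemes appear."""
    boost = sum(bonus for p in biasing_phonemes.values() if contains(hyp.phonemes, p))
    return hyp.phoneme_score + boost


def decode(hyps, biasing_phonemes, bonus=2.0):
    """Stand-in for the decoding graph: choose the best combined score."""
    return max(hyps, key=lambda h: h.wordpiece_score + rescore(h, biasing_phonemes, bonus))


# Assumed mapping of a second-language (French) term to first-language
# (English) phonemes; the pronunciation is illustrative only.
biasing_phonemes = {"Creteil": ("k", "r", "eh", "t", "ey")}

hyps = [
    Hypothesis("directions to crete", -1.0, ("k", "r", "iy", "t"), -1.2),
    Hypothesis("directions to Creteil", -1.5, ("k", "r", "eh", "t", "ey"), -1.6),
]

unbiased = decode(hyps, {})
biased = decode(hyps, biasing_phonemes)
print(unbiased.text)
print(biased.text)
```

With an empty biasing list the first candidate has the higher combined score; once the phoneme score of the second candidate is boosted, it overtakes the first. A real system would apply the bias incrementally during beam search rather than to a finished candidate list.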

Inventors:

Hu, Ke
Bruguier, Antoine Jean
Sainath, Tara N.
Prabhavalkar, Rohit Prakash
Pundak, Golan

Application Number:

JP2021564950A

Publication Date:

June 28, 2022

Filing Date:

April 28, 2020

Assignee:

Google LLC

International Classes:

G10L15/32; G10L15/08; G10L15/16

Domestic Patent References:

JP2019507362A

Other References:

PATEL, Ami et al., "Cross-Lingual Phoneme Mapping for Language Robust Contextual Speech Recognition", Proc. of the 2018 IEEE ICASSP, April 15, 2018, pp. 5924-5928

Attorney, Agent or Firm:

Atsushi Honda
