Title:
INFORMATION PROCESSING DEVICE, PROGRAM, AND INFORMATION PROCESSING METHOD
Document Type and Number:
WIPO Patent Application WO/2020/144857
Kind Code:
A1
Abstract:
The present invention is provided with: an audio speech likelihood calculation unit (103) that calculates audio speech likelihood from an audio signal including the voice of a subject person; a video speech likelihood calculation unit (104) that calculates video speech likelihood from a video signal indicating a video including the subject person; an environmental information determination unit (105) that determines audio reliability indicating the reliability of the audio signal and video reliability indicating the reliability of the video signal; a speech section detection unit (108) that adds a heavier weight to the audio speech likelihood when the audio reliability is higher, adds a heavier weight to the video speech likelihood when the video reliability is higher, calculates, by using the audio speech likelihood and the video speech likelihood, speech likelihood indicating a probability that the subject person is uttering speech in the audio signal and in the video signal, and detects, as a section of speech, a section where the calculated speech likelihood is higher than a predetermined threshold; and an audio recognition unit (109) that executes audio recognition on the audio signal in the section of speech.
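The weighting and thresholding performed by the speech section detection unit (108) can be illustrated with a minimal sketch. The abstract does not give the combining formula, so the normalized weighted average below is an assumption, as are the function names `fuse_likelihoods` and `detect_speech_sections`, the per-frame list representation, and the sample values; none of these come from the application itself.

```python
from typing import List, Tuple


def fuse_likelihoods(
    audio_lik: List[float],
    video_lik: List[float],
    audio_rel: float,
    video_rel: float,
) -> List[float]:
    """Combine per-frame likelihoods, weighting each modality by its reliability.

    Assumption: a normalized weighted average. The abstract only states that
    a more reliable signal receives a heavier weight.
    """
    total = audio_rel + video_rel
    if total <= 0:
        raise ValueError("at least one reliability must be positive")
    w_audio = audio_rel / total
    w_video = video_rel / total
    return [w_audio * a + w_video * v for a, v in zip(audio_lik, video_lik)]


def detect_speech_sections(
    likelihood: List[float], threshold: float
) -> List[Tuple[int, int]]:
    """Return (start, end) frame ranges, end exclusive, where likelihood > threshold."""
    sections: List[Tuple[int, int]] = []
    start = None
    for i, p in enumerate(likelihood):
        if p > threshold and start is None:
            start = i  # a section of speech begins
        elif p <= threshold and start is not None:
            sections.append((start, i))  # the section ended at the previous frame
            start = None
    if start is not None:
        sections.append((start, len(likelihood)))  # section runs to the last frame
    return sections


if __name__ == "__main__":
    # Hypothetical values: the audio signal is judged more reliable than the video.
    audio = [0.1, 0.9, 0.8, 0.2]
    video = [0.2, 0.6, 0.9, 0.1]
    fused = fuse_likelihoods(audio, video, audio_rel=0.75, video_rel=0.25)
    print(detect_speech_sections(fused, threshold=0.5))
```

In this sketch, the frame ranges returned by `detect_speech_sections` would be the portions of the audio signal handed to the audio recognition unit (109).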

Inventors:
TSUCHIYA MASATO (JP)
HANAZAWA TOSHIYUKI (JP)
Application Number:
PCT/JP2019/000722
Publication Date:
July 16, 2020
Filing Date:
January 11, 2019
Assignee:
MITSUBISHI ELECTRIC CORP (JP)
International Classes:
G10L15/04
Domestic Patent References:
WO2016006088A1	2016-01-14
Foreign References:
JP2011059186A	2011-03-24
JP2008134572A	2008-06-12
JP2000347688A	2000-12-15
JP2018106359A	2018-07-05
JP2011191423A	2011-09-29
Other References:
FUJIMOTO MASAKIYO, KENTARO ISHIZUKA, HIROKO KATO: "A noise robust voice activity detection with state transition processes of speech and noise", IPSJ SIG TECHNICAL REPORTS, no. 136 (2006-SLP-064), 21 December 2006 (2006-12-21), pages 13-18, XP055724628
Attorney, Agent or Firm:
YAMAGATA Yoichi et al. (JP)