Title:
INFORMATION PROCESSING DEVICE, PROGRAM, AND INFORMATION PROCESSING METHOD
Document Type and Number:
WIPO Patent Application WO/2020/144857
Kind Code:
A1
Abstract:
The present invention includes: an audio speech likelihood calculation unit (103) that calculates an audio speech likelihood from an audio signal including the voice of a subject person; a video speech likelihood calculation unit (104) that calculates a video speech likelihood from a video signal representing video that includes the subject person; an environmental information determination unit (105) that determines an audio reliability indicating the reliability of the audio signal and a video reliability indicating the reliability of the video signal; a speech section detection unit (108) that weights the audio speech likelihood more heavily the higher the audio reliability is, weights the video speech likelihood more heavily the higher the video reliability is, calculates, from the weighted audio speech likelihood and video speech likelihood, a speech likelihood indicating the probability that the subject person is speaking in the audio signal and the video signal, and detects, as a speech section, any section in which the calculated speech likelihood exceeds a predetermined threshold; and an audio recognition unit (109) that executes audio recognition on the audio signal in the speech section.
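The weighting and thresholding performed by the speech section detection unit (108) can be illustrated with a minimal sketch. The abstract states only that a higher reliability gives a heavier weight; the normalized linear weighting below, the per-frame list representation, and the names `fuse_likelihoods` and `detect_speech_sections` are assumptions made for illustration and are not taken from the application.

```python
from typing import List, Sequence, Tuple


def fuse_likelihoods(
    audio_lik: Sequence[float],
    video_lik: Sequence[float],
    audio_rel: float,
    video_rel: float,
) -> List[float]:
    """Combine per-frame audio and video speech likelihoods.

    Assumption: weights are the reliabilities normalized to sum to 1,
    so the more reliable modality contributes more to the result.
    """
    if len(audio_lik) != len(video_lik):
        raise ValueError("audio and video likelihoods must have equal length")
    total = audio_rel + video_rel
    # Fall back to equal weights if neither modality is considered reliable.
    w_audio = audio_rel / total if total > 0 else 0.5
    w_video = 1.0 - w_audio
    return [w_audio * a + w_video * v for a, v in zip(audio_lik, video_lik)]


def detect_speech_sections(
    likelihood: Sequence[float], threshold: float
) -> List[Tuple[int, int]]:
    """Return (start, end) frame index pairs, end exclusive, for every run
    of frames whose likelihood is strictly above the threshold."""
    sections: List[Tuple[int, int]] = []
    start = None
    for i, p in enumerate(likelihood):
        if p > threshold and start is None:
            start = i  # a speech section begins
        elif p <= threshold and start is not None:
            sections.append((start, i))  # the section ended at frame i
            start = None
    if start is not None:
        sections.append((start, len(likelihood)))  # section runs to the end
    return sections


# Hypothetical values: audio is judged more reliable than video.
fused = fuse_likelihoods([0.1, 0.9, 0.8, 0.2], [0.2, 0.4, 0.9, 0.1], 0.75, 0.25)
print(detect_speech_sections(fused, threshold=0.5))  # [(1, 3)]
```

In this sketch the detected frame ranges would then be handed to the recognition step, corresponding to the audio recognition unit (109) operating only on the audio signal inside each speech section.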
Inventors:
TSUCHIYA MASATO (JP)
HANAZAWA TOSHIYUKI (JP)
Application Number:
PCT/JP2019/000722
Publication Date:
July 16, 2020
Filing Date:
January 11, 2019
Assignee:
MITSUBISHI ELECTRIC CORP (JP)
International Classes:
G10L15/04
Domestic Patent References:
WO2016006088A1 | 2016-01-14
Foreign References:
JP2011059186A | 2011-03-24
JP2008134572A | 2008-06-12
JP2000347688A | 2000-12-15
JP2018106359A | 2018-07-05
JP2011191423A | 2011-09-29
Other References:
FUJIMOTO MASAKIYO, KENTARO ISHIZUKA, HIROKO KATO: "A noise robust voice activity detection with state transition processes of speech and noise", IPSJ SIG TECHNICAL REPORTS, no. 136 (2006-SLP-064), 21 December 2006 (2006-12-21), pages 13-18, XP055724628
Attorney, Agent or Firm:
YAMAGATA Yoichi et al. (JP)