To reduce the effort required of a user when starting speech recognition.
The speech recognition device 100 includes an input section 101, a detection section 102, an image recognition section 103, and a speech recognition section 104. The user's speech is input to the input section 101. The detection section 102 detects a part of the user's body that moves when the user speaks. The image recognition section 103 performs image recognition of an action state related to the user's utterance, based on the detection result from the detection section 102. The speech recognition section 104 starts speech recognition on the speech input to the input section 101 after the image recognition section 103 has recognized the action state related to the user's utterance.
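The control flow described above can be illustrated with a minimal Python sketch. It is not the claimed implementation. Every name in it (`Frame`, `mouth_opening`, `is_utterance_action`, `SpeechRecognitionGate`, the 0.3 threshold) is hypothetical, and the sketch assumes the detected body part is the mouth, that the detection result is a single opening score per frame, and that recognition begins with the audio chunk on which the action state is first recognized. Audio chunks are represented by strings in place of real audio, and the sketch does not model how recognition ends.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Frame:
    """One time step: a detection result paired with an audio chunk."""
    mouth_opening: float  # hypothetical output of detection section 102 (0.0 to 1.0)
    audio: str            # stand-in for speech arriving at input section 101


def is_utterance_action(mouth_opening: float, threshold: float = 0.3) -> bool:
    """Stand-in for image recognition section 103 (threshold is an assumption)."""
    return mouth_opening >= threshold


@dataclass
class SpeechRecognitionGate:
    """Stand-in for speech recognition section 104: idle until an action state is seen."""
    started: bool = False
    recognized: List[str] = field(default_factory=list)

    def process(self, frame: Frame) -> None:
        # Start recognition only once the utterance-related action state is recognized.
        if not self.started and is_utterance_action(frame.mouth_opening):
            self.started = True
        # After the start, every incoming audio chunk is passed to recognition.
        if self.started:
            self.recognized.append(frame.audio)


def run_pipeline(frames: List[Frame]) -> List[str]:
    """Feed frames through the gate and return the chunks that were recognized."""
    gate = SpeechRecognitionGate()
    for frame in frames:
        gate.process(frame)
    return gate.recognized


if __name__ == "__main__":
    frames = [
        Frame(0.0, "noise"),
        Frame(0.1, "hum"),
        Frame(0.5, "hello"),  # mouth opens: recognition starts here
        Frame(0.1, "world"),
    ]
    print(run_pipeline(frames))  # ['hello', 'world']
```

In this sketch the user presses no button and speaks no wake word. Audio that arrives before the action state is recognized is simply ignored, which corresponds to the reduction in user effort stated as the aim.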
KATO YOSHIKO
ODA RYO
KOYAMA KEIICHIRO
SHINTO KOJI
MORI KUNIHIKO