Title:
LEARNING DEVICE AND LEARNING METHOD
Document Type and Number:
WIPO Patent Application WO/2018/123606
Kind Code:
A1
Abstract:
The present disclosure relates to a learning device and learning method with which it is possible to easily correct a reinforcement learning model on the basis of a user input. A display control unit causes a display unit to display reinforcement learning model information which relates to a reinforcement learning model. A correction unit corrects the reinforcement learning model on the basis of an input from a user regarding the reinforcement learning model information. The present disclosure may be applied to, for example, a personal computer (PC) which corrects a reinforcement learning model on the basis of an input from a user and which learns, by reinforcement learning, a movement policy of an agent using the corrected reinforcement learning model.
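The display, correct, and learn cycle described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the application: the one-dimensional corridor world, the `RewardModel` class, and the `render`, `correct`, and `learn_policy` names are hypothetical. A deterministic Q-value sweep stands in for the reinforcement learning step, and a reward table stands in for the reinforcement learning model.

```python
from dataclasses import dataclass
from typing import List

ACTIONS = (-1, +1)  # hypothetical action set: step left or step right


@dataclass
class RewardModel:
    """Hypothetical stand-in for the model: one reward per corridor state."""
    rewards: List[float]

    def render(self) -> str:
        """Text a display unit might show so the user can inspect the model."""
        return " ".join(f"{i}:{r:+.1f}" for i, r in enumerate(self.rewards))

    def correct(self, state: int, new_reward: float) -> None:
        """Apply a user's correction to the reward of a single state."""
        self.rewards[state] = new_reward


def next_state(state: int, action: int, n_states: int) -> int:
    """Move along the corridor, staying in place at either wall."""
    return min(max(state + action, 0), n_states - 1)


def learn_policy(model: RewardModel, gamma: float = 0.9, sweeps: int = 100) -> List[int]:
    """Sweep Q-values over all state-action pairs and return the greedy action per state.

    The reward is the one assigned to the state the agent enters.
    """
    n = len(model.rewards)
    q = [[0.0 for _ in ACTIONS] for _ in range(n)]
    for _ in range(sweeps):
        q = [
            [
                model.rewards[next_state(s, a, n)] + gamma * max(q[next_state(s, a, n)])
                for a in ACTIONS
            ]
            for s in range(n)
        ]
    return [ACTIONS[row.index(max(row))] for row in q]


if __name__ == "__main__":
    model = RewardModel([0.0, 0.0, 0.0, 0.0, 1.0])
    print(model.render())        # show the model information to the user
    print(learn_policy(model))   # movement policy before correction
    model.correct(4, 0.0)        # user input: the right end is not the goal
    model.correct(0, 1.0)        # user input: the left end is the goal
    print(learn_policy(model))   # movement policy after correction
```

In this sketch the user's correction moves the rewarded state from one end of the corridor to the other, and the relearned policy reverses direction accordingly.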

Inventors:
NAKADA KENTO (JP)
NARIHIRA TAKUYA (JP)
SUZUKI HIROTAKA (JP)
OSATO AKIHITO (JP)
Application Number:
PCT/JP2017/044839
Publication Date:
July 05, 2018
Filing Date:
December 14, 2017
Assignee:
SONY CORP (JP)
International Classes:
G06N99/00
Foreign References:
JP2013030278A  2013-02-07
Other References:
NOGAWA, HIROSHI ET AL.: "F-045 Learning the meaning of instructions using autonomous action learning", INSTITUTE OF ELECTRONICS, INFORMATION AND COMMUNICATION ENGINEERS IEICE, INFORMATION PROCESSING SOCIETY OF JAPAN IPSJ. FIT 2004, vol. 3, no. 2, 20 August 2004 (2004-08-20), pages 319 - 321, XP009515429
TAMARU, JUNKI ET AL.: "Inverse reinforcement learning to estimate a time-dependent reward function from cyclic sequences of states", THE PAPERS OF TECHNICAL MEETING ON INTELLIGENT TRANSPORT SYSTEMS, vol. ST-13, 24 November 2013 (2013-11-24), pages 7 - 12, XP009515433
BRIAN D. ZIEBART, ANDREW MAAS, J. ANDREW BAGNELL, ANIND K. DEY: "Maximum Entropy Inverse Reinforcement Learning", 13 July 2008, ASSOCIATION FOR THE ADVANCEMENT OF ARTIFICIAL INTELLIGENCE (AAAI)
RIAD AKROUR, MARC SCHOENAUER, MICHÈLE SEBAG: "APRIL: Active Preference-learning based Reinforcement Learning", EUROPEAN CONFERENCE, ECML PKDD 2012, BRISTOL, UK, 24 September 2012 (2012-09-24)
See also references of EP 3561740A4
Attorney, Agent or Firm:
NISHIKAWA Takashi et al. (JP)