Title:
AGENT JOINING DEVICE, METHOD, AND PROGRAM
Document Type and Number:
WIPO Patent Application WO/2020/149172
Kind Code:
A1
Abstract:
The present invention makes it possible to construct an agent capable of handling even a complex task. The whole task is represented as the weighted sum of a plurality of component tasks. For each of the component tasks, a component value function has been learned in advance in order to find a policy for the action of a component agent that solves that component task. A whole value function, used to find a policy for the action of an agent that solves the whole task, is obtained as the weighted sum of the component value functions, using the weight for each of the component tasks. The action of the agent for the whole task is determined using the policy obtained from the whole value function, and the agent is made to act.
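
The combination step described in the abstract can be illustrated with a minimal Python sketch. It is not the method of the application itself: it assumes tabular action-value functions stored as nested dictionaries and a greedy policy over the combined values, neither of which is stated in the abstract. The names `QTable`, `combine_value_functions`, and `greedy_action`, and all numbers, are hypothetical.

```python
from typing import Dict, Hashable, Sequence

# Assumption: a tabular action-value function, Q[state][action] -> value.
QTable = Dict[Hashable, Dict[Hashable, float]]


def combine_value_functions(component_qs: Sequence[QTable],
                            weights: Sequence[float]) -> QTable:
    """Return the weighted sum of previously learned component value functions."""
    if len(component_qs) != len(weights):
        raise ValueError("need exactly one weight per component task")
    whole_q: QTable = {}
    for q, w in zip(component_qs, weights):
        for state, action_values in q.items():
            row = whole_q.setdefault(state, {})
            for action, value in action_values.items():
                # Accumulate weight * component value for this (state, action).
                row[action] = row.get(action, 0.0) + w * value
    return whole_q


def greedy_action(whole_q: QTable, state: Hashable) -> Hashable:
    """Assumed policy: pick the action with the largest whole value."""
    action_values = whole_q[state]
    return max(action_values, key=action_values.get)


# Toy component value functions with made-up numbers, one per component task.
q_task_a: QTable = {"s0": {"left": 1.0, "right": 0.0}}
q_task_b: QTable = {"s0": {"left": 0.0, "right": 2.0}}

# Whole task = 0.25 * task A + 0.75 * task B (weights chosen for illustration).
whole_q = combine_value_functions([q_task_a, q_task_b], weights=[0.25, 0.75])
chosen = greedy_action(whole_q, "s0")
print(whole_q["s0"], chosen)
```

With these toy values the combined function favours `right` in state `s0`; swapping the weights to `[0.75, 0.25]` would favour `left` instead, without retraining either component.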

Inventors:
KOJIMA MASAHIRO (JP)
MATSUBAYASHI TATSUSHI (JP)
TODA HIROYUKI (JP)
Application Number:
PCT/JP2020/000157
Publication Date:
July 23, 2020
Filing Date:
January 07, 2020
Assignee:
NIPPON TELEGRAPH & TELEPHONE (JP)
International Classes:
G06N20/00
Other References:
NATARAJAN, S. ET AL.: "Dynamic preferences in multi-criteria reinforcement learning", PROCEEDINGS OF THE 22ND INTERNATIONAL CONFERENCE ON MACHINE LEARNING (ICML '05), August 2005 (2005-08-01), pages 601 - 608, XP058203961, DOI: 10.1145/1102351.1102427
MAINEGRA HING, M. ET AL.: "Order acceptance with reinforcement learning", BETA-PUBLICATE: WP-66, December 2001 (2001-12-01), pages 18 - 22, XP009100058, ISSN: 1386-9213, Retrieved from the Internet [retrieved on 20200316]
Attorney, Agent or Firm:
TAIYO, NAKAJIMA & KATO (JP)