AGENT JOINING DEVICE, METHOD, AND PROGRAM - NIPPON TELEGRAPH & TELEPHONE

Title:

AGENT JOINING DEVICE, METHOD, AND PROGRAM

Document Type and Number:

WIPO Patent Application WO/2020/149172

Kind Code:

A1

Abstract:

The present invention makes it possible to construct an agent capable of handling even a complex task. The whole task is represented as the weighted sum of a plurality of component tasks. For each component task, a component value function has been learned in advance in order to find a policy for the action of a component agent that solves that component task. A whole value function, used to find a policy for the action of an agent that solves the whole task, is obtained as the weighted sum of the component value functions, using the weight for each of the component tasks. The action of the agent for the whole task is determined using the policy obtained from the whole value function, and the agent is made to act.
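
The weighted-sum construction described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the value functions are represented or how the policy is derived from the whole value function, so the sketch assumes tabular action-value functions and a greedy policy. The names `combine_value_functions`, `greedy_action`, `QTable`, and the toy states, actions, and values are hypothetical and are not taken from the application.

```python
from typing import Dict, Sequence, Tuple

State = str
Action = str
# Assumption: each value function is a table mapping (state, action) to a value.
QTable = Dict[Tuple[State, Action], float]


def combine_value_functions(
    component_qs: Sequence[QTable], weights: Sequence[float]
) -> QTable:
    """Build a whole value function as the weighted sum of component value functions."""
    if len(component_qs) != len(weights):
        raise ValueError("need exactly one weight per component task")
    whole_q: QTable = {}
    for q, w in zip(component_qs, weights):
        for key, value in q.items():
            # Entries missing from a component table are treated as 0.0.
            whole_q[key] = whole_q.get(key, 0.0) + w * value
    return whole_q


def greedy_action(whole_q: QTable, state: State, actions: Sequence[Action]) -> Action:
    """Assumed policy: pick the action with the largest whole value in this state."""
    return max(actions, key=lambda a: whole_q.get((state, a), 0.0))


if __name__ == "__main__":
    # Two previously learned component value functions (toy numbers).
    q_task_a: QTable = {("s0", "left"): 1.0, ("s0", "right"): 0.0}
    q_task_b: QTable = {("s0", "left"): 0.0, ("s0", "right"): 2.0}

    # Weighting task A heavily favours "left"; weighting task B favours "right".
    for weights in ([0.8, 0.2], [0.2, 0.8]):
        whole = combine_value_functions([q_task_a, q_task_b], weights)
        print(weights, greedy_action(whole, "s0", ["left", "right"]))
```

In this sketch, changing only the weights changes the behaviour of the combined agent without retraining either component value function, which is the property the abstract emphasises.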

Inventors:

KOJIMA MASAHIRO (JP)
MATSUBAYASHI TATSUSHI (JP)
TODA HIROYUKI (JP)

Application Number:

PCT/JP2020/000157

Publication Date:

July 23, 2020

Filing Date:

January 07, 2020

Assignee:

NIPPON TELEGRAPH & TELEPHONE (JP)

International Classes:

G06N20/00

Other References:

NATARAJAN, S. ET AL.: "Dynamic preferences in multi-criteria reinforcement learning", PROCEEDINGS OF THE 22ND INTERNATIONAL CONFERENCE ON MACHINE LEARNING (ICML '05), August 2005 (2005-08-01), pages 601 - 608, XP058203961, DOI: 10.1145/1102351.1102427
MAINEGRA HING, M. ET AL.: "Order acceptance with reinforcement learning", BETA-PUBLICATE: WP-66, December 2001 (2001-12-01), pages 18 - 22, XP009100058, ISSN: 1386-9213, Retrieved from the Internet [retrieved on 20200316]

Attorney, Agent or Firm:

TAIYO, NAKAJIMA & KATO (JP)
