Title:
Controlling agents over long time scales using temporal value transport
Document Type and Number:
Japanese Patent JP7139524
Kind Code:
B2
Abstract:
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a neural network system used to control an agent interacting with an environment to perform a specified task. One of the methods includes causing the agent to perform a task episode in which the agent attempts to perform the specified task; for each of one or more particular time steps in the sequence: generating a modified reward for the particular time step from (i) the actual reward at the time step and (ii) value predictions at one or more time steps that are more than a threshold number of time steps after the particular time step in the sequence; and training, through reinforcement learning, the neural network system using at least the modified rewards for the particular time steps.
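
The reward modification described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the particular time steps or the later time steps are selected, nor how the value predictions are combined with the actual reward, so the names `pairs`, `threshold`, and `alpha`, and the simple additive rule below, are assumptions made for illustration only and are not taken from the patent.

```python
from typing import Iterable, List, Tuple


def modified_rewards(
    rewards: List[float],
    values: List[float],
    pairs: Iterable[Tuple[int, int]],
    threshold: int,
    alpha: float = 1.0,
) -> List[float]:
    """Return a copy of `rewards` with later value predictions added in.

    `pairs` is a hypothetical list of (t, t_future) index pairs; how such
    pairs are chosen is not specified by the abstract. A pair contributes
    only when t_future is more than `threshold` steps after t.
    `alpha` is an assumed scaling factor on the transported value.
    """
    out = list(rewards)  # leave the caller's list untouched
    for t, t_future in pairs:
        if t_future - t > threshold:
            # actual reward at t plus a scaled value prediction from later on
            out[t] += alpha * values[t_future]
    return out


# Toy episode: six time steps, with the only actual reward at the end.
rewards = [0.0, 0.0, 0.0, 0.0, 0.0, 1.0]
values = [0.1, 0.2, 0.3, 0.4, 0.9, 0.0]

# (0, 4) is 4 steps apart and exceeds the threshold; (3, 4) does not.
result = modified_rewards(rewards, values, pairs=[(0, 4), (3, 4)],
                          threshold=2, alpha=0.5)
print(result)
```

In this toy episode only time step 0 receives a transported value, because time step 4 is more than `threshold` steps later; the pair (3, 4) is too close together and is skipped. The modified rewards would then take the place of the actual rewards in whatever reinforcement learning update is used to train the neural network system.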

Inventors:
Gregory Duncan Wayne
Timothy Paul Lillicrap
Chia-Chun Hung
Joshua Simon Abramson
Application Number:
JP2021519878A
Publication Date:
September 20, 2022
Filing Date:
October 14, 2019
Assignee:
DeepMind Technologies Limited
International Classes:
G06N3/08; G06N20/00; G06V10/764; G06V10/774
Domestic Patent References:
JP2004068399A1
JP2018083238A
JP2018525759A
Foreign References:
US20150100530
US20170032245
Attorney, Agent or Firm:
Murayama Yasuhiko
Shinya Mihiro
Tatsuhiko Abe