Information Processing Apparatus, Information Processing Method, and Program

Title:

Information Processing Apparatus, Information Processing Method, and Program

Document Type and Number:

Japanese Patent JP5879899

Kind Code:

B2

Abstract:

Provided is an information processing apparatus including: a reward estimator generating unit that uses action history data as learning data to generate, through machine learning, a reward estimator that estimates a reward value from inputted state data and action data, the action history data including state data expressing a state of an agent, action data expressing an action of the agent, and a reward value expressing the reward for that action; an action selecting unit that preferentially selects an action that is not included in the action history data but has a high estimated reward value; and an action history adding unit that causes the agent to perform the selected action and adds to the action history data, in association with each other, the state data and action data for that action and the reward value of that action. The reward estimator is regenerated whenever a set of state data, action data, and a reward value is added to the action history data.
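The loop described in the abstract can be illustrated with a minimal Python sketch. This is not the patented implementation: the inverse-distance-weighted average stands in for whatever machine-learning method the estimator generating unit actually uses, and the names `fit_reward_estimator`, `select_action`, `run_loop`, `toy_env_step`, and the `novelty_bonus` parameter are all hypothetical choices made for illustration only.

```python
from typing import Callable, List, Sequence, Tuple

# One action-history record: (state, action, reward). Hypothetical layout.
Record = Tuple[float, int, float]
Estimator = Callable[[float, int], float]


def fit_reward_estimator(history: Sequence[Record]) -> Estimator:
    """Build a toy reward estimator from the action history.

    Assumption: an inverse-distance-weighted average of past rewards is
    used here as a stand-in for the machine-learned estimator.
    """
    samples = list(history)  # snapshot, so later appends do not leak in

    def estimate(state: float, action: int) -> float:
        if not samples:
            return 0.0  # no data yet: neutral estimate
        num = 0.0
        den = 0.0
        for s, a, r in samples:
            weight = 1.0 / (1.0 + abs(s - state) + abs(a - action))
            num += weight * r
            den += weight
        return num / den

    return estimate


def select_action(estimate: Estimator, history: Sequence[Record],
                  state: float, actions: Sequence[int],
                  novelty_bonus: float = 0.5) -> int:
    """Prefer actions absent from the history that still score well."""
    tried = {(s, a) for s, a, _ in history}

    def score(action: int) -> float:
        bonus = 0.0 if (state, action) in tried else novelty_bonus
        return estimate(state, action) + bonus

    return max(actions, key=score)


def run_loop(env_step: Callable[[float, int], Tuple[float, float]],
             initial_state: float, actions: Sequence[int],
             steps: int) -> Tuple[List[Record], Estimator]:
    """Select, act, record, then regenerate the estimator on every addition."""
    history: List[Record] = []
    estimate = fit_reward_estimator(history)
    state = initial_state
    for _ in range(steps):
        action = select_action(estimate, history, state, actions)
        next_state, reward = env_step(state, action)
        history.append((state, action, reward))
        estimate = fit_reward_estimator(history)  # regenerate after each add
        state = next_state
    return history, estimate


def toy_env_step(state: float, action: int) -> Tuple[float, float]:
    """Toy environment (assumption): the state is fixed and action 3 is best."""
    return state, -float(abs(action - 3))
```

Calling `run_loop(toy_env_step, 0.0, range(5), 10)` runs ten iterations against the toy environment. The novelty bonus is one simple way to express "preferentially selecting an action not included in the action history data"; other exploration schemes would fit the abstract's wording equally well.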

Inventors:

Yoshiyuki Kobayashi

Application Number:

JP2011224639A

Publication Date:

March 08, 2016

Filing Date:

October 12, 2011

Assignee:

Sony Corporation

International Classes:

A63F13/56; A63F13/67; A63F13/822; A63F13/833; G06N3/08; G06N3/12; G06N99/00

Domestic Patent References:

JP2010287027A
JP2006268812A

Foreign References:

US20050245303

Other References:

米井友浩 et al., "Q-learning using Genetic Algorithms in nonstationary environments," IEICE Technical Report NC, Japan, December 12, 1997, Vol. 97, No. 448, pp. 71-78
西川郁子 et al., "Reinforcement Learning Based on Statistical Value Function and Its Application to a Board Game," Transactions of the Society of Instrument and Control Engineers, Japan, July 31, 2003, Vol. 39, No. 7, pp. 670-678

Attorney, Agent or Firm:

Miaki Kametani
Tetsuo Kanamoto
Koji Hagiwara
Kazuki Matsumoto
