Title:
Information Processing Apparatus, Information Processing Method, and Program
Document Type and Number:
Japanese Patent JP5879899
Kind Code:
B2
Abstract:
Provided is an information processing apparatus including: a reward estimator generating unit that uses action history data as learning data to generate, through machine learning, a reward estimator that estimates a reward value from input state data and action data, where the action history data includes state data expressing a state of an agent, action data expressing an action taken by the agent, and a reward value expressing the reward obtained for that action; an action selecting unit that preferentially selects an action that is not included in the action history data but has a high estimated reward value; and an action history adding unit that causes the agent to perform the selected action and adds to the action history data, in association with each other, the state data and action data for that action and the resulting reward value. The reward estimator is regenerated whenever a set of state data, action data, and reward value is added to the action history data.
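
The loop described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the patented implementation: the toy environment (`true_reward`), the k-nearest-neighbour learner standing in for the unspecified machine learning step, the additive `novelty_bonus` used to favor actions absent from the history, and all function names are assumptions introduced here.

```python
import random


def true_reward(state, action):
    # Hypothetical environment: the reward is highest when the action matches the state.
    return -abs(state - action)


def fit_reward_estimator(history, k=3):
    # Stand-in learner (an assumption): k-nearest-neighbour regression over
    # (state, action) pairs. The abstract only says "machine learning".
    samples = list(history)  # snapshot of the learning data at fit time

    def estimate(state, action):
        if not samples:
            return 0.0
        nearest = sorted(
            samples,
            key=lambda rec: (rec[0] - state) ** 2 + (rec[1] - action) ** 2,
        )[:k]
        return sum(rec[2] for rec in nearest) / len(nearest)

    return estimate


def select_action(estimator, history, state, actions, novelty_bonus=0.5):
    # Prefer actions not yet in the history for this state, weighted by their
    # estimated reward. The additive bonus is one possible reading of
    # "preferentially selecting", not the patent's stated rule.
    tried = {(s, a) for s, a, _ in history}

    def score(action):
        bonus = 0.0 if (state, action) in tried else novelty_bonus
        return estimator(state, action) + bonus

    return max(actions, key=score)


def run(steps=30, seed=0):
    rng = random.Random(seed)
    actions = list(range(5))
    history = []
    estimator = fit_reward_estimator(history)
    for _ in range(steps):
        state = rng.randrange(5)
        action = select_action(estimator, history, state, actions)
        reward = true_reward(state, action)  # the agent performs the action
        history.append((state, action, reward))  # add the associated triple
        estimator = fit_reward_estimator(history)  # regenerate on every addition
    return history, estimator


if __name__ == "__main__":
    history, estimator = run()
    print(len(history), "records; estimate for (2, 2):", estimator(2, 2))
```

Refitting after every added record mirrors the last sentence of the abstract; a practical system might batch additions or update the estimator incrementally instead.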

Inventors:
Yoshiyuki Kobayashi
Application Number:
JP2011224639A
Publication Date:
March 08, 2016
Filing Date:
October 12, 2011
Assignee:
Sony Corporation
International Classes:
A63F13/56; A63F13/67; A63F13/822; A63F13/833; G06N3/08; G06N3/12; G06N99/00
Domestic Patent References:
JP2010287027A
JP2006268812A
Foreign References:
US20050245303
Other References:
米井友浩 et al., "Q-learning using Genetic Algorithms in nonstationary environments," IEICE Technical Report NC, Japan, December 12, 1997, Vol. 97, No. 448, Pages 71-78
西川郁子 et al., "Reinforcement Learning Based on Statistical Value Function and Its Application to a Board Game," Transactions of the Society of Instrument and Control Engineers, Japan, July 31, 2003, Vol. 39, No. 7, Pages 670-678
Attorney, Agent or Firm:
Miaki Kametani
Tetsuo Kanamoto
Koji Hagiwara
Kazuki Matsumoto