Using Self-Evaluations for Agent Learning (自己評価により学習するエージェント), Koichi Moriyama (森山 甲一)
| Content Provider | Semantic Scholar |
|---|---|
| Author | Moriyama, Koichi; Numao, Masayuki |
| Copyright Year | 2003 |
| Abstract | In game theory, the combination of the players' actions settles at a Nash equilibrium. Thus, almost all multiagent reinforcement learning algorithms aim to converge to a Nash equilibrium. However, a Nash equilibrium is not desirable in some games, such as the prisoner's dilemma (PD). On the other hand, several methods aim to depart from undesirable Nash equilibria and proceed to a preferred combination of actions by handling rewards in PD-like games, but because they use fixed handling methods, they are unsuitable for non-PD-like games. In this paper, we construct an agent that learns appropriate actions in both PD-like and non-PD-like games through self-evaluations. The agent has two conditions for judging whether or not the game is like PD, and two methods that generate self-evaluations according to the judgement. We conducted experiments in three kinds of games: single-state iterated games, a multiple-state game in which agents know which state they are currently in, and multiple-state games in which agents do not know the state transitions. |
| File Format | PDF HTM / HTML |
| Alternate Webpage(s) | http://www.ai.sanken.osaka-u.ac.jp/files/moriyama03jb.pdf |
| Language | English |
| Access Restriction | Open |
| Content Type | Text |
| Resource Type | Article |
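
The abstract's remark that a Nash equilibrium can be undesirable in the prisoner's dilemma can be checked with a small sketch. The payoff values below (5, 3, 1, 0) are a common textbook choice and are an assumption, not taken from the paper; the function names are hypothetical, and the code does not reproduce the authors' self-evaluation method.

```python
from itertools import product

# Assumed textbook payoffs as (row player, column player).
# Actions: 0 = cooperate, 1 = defect.
PAYOFFS = {
    (0, 0): (3, 3),
    (0, 1): (0, 5),
    (1, 0): (5, 0),
    (1, 1): (1, 1),
}
ACTIONS = (0, 1)


def is_nash(profile, payoffs=PAYOFFS):
    """True if neither player gains by unilaterally changing their action."""
    a, b = profile
    u_a, u_b = payoffs[profile]
    row_ok = all(payoffs[(alt, b)][0] <= u_a for alt in ACTIONS)
    col_ok = all(payoffs[(a, alt)][1] <= u_b for alt in ACTIONS)
    return row_ok and col_ok


def pure_nash_equilibria(payoffs=PAYOFFS):
    """List every pure-strategy action profile that is a Nash equilibrium."""
    return [p for p in product(ACTIONS, repeat=2) if is_nash(p, payoffs)]


def pareto_dominates(p, q, payoffs=PAYOFFS):
    """True if profile p is at least as good as q for both players and better for one."""
    up, uq = payoffs[p], payoffs[q]
    no_worse = all(x >= y for x, y in zip(up, uq))
    better = any(x > y for x, y in zip(up, uq))
    return no_worse and better


if __name__ == "__main__":
    print(pure_nash_equilibria())            # equilibria under the assumed payoffs
    print(pareto_dominates((0, 0), (1, 1)))  # does mutual cooperation dominate?
```

Under these assumed payoffs, mutual defection is the only pure-strategy equilibrium, yet both players would do better under mutual cooperation, which is the situation the abstract calls PD-like.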