Une double approche modulaire de l'apprentissage par renforcement pour des agents intelligents adaptatifs (A dual modular approach to reinforcement learning for adaptive intelligent agents)
| Content Provider | Semantic Scholar |
|---|---|
| Author | Buffet, Olivier |
| Copyright Year | 2010 |
| Abstract | Reinforcement Learning (RL) and Multi-Agent Systems (MAS) are promising tools in the field of Artificial Intelligence: the former allows the design of behaviors for smart entities (agents) through simple rewards (within the framework of Markov Decision Processes), and the latter is based on the idea that a smart behavior may "emerge" from the collaboration of a group of agents. In this thesis, we have explored ways of using both tools jointly. The two main parts of this work present symmetric points of view: (1) how to design cooperating agents through reinforcement learning methods, and (2) how to improve a (reinforcement learning) agent with a distributed internal architecture (such as a MAS). Both aspects have led to incremental approaches, as detailed below. 1. In the multi-agent framework, we address the problem of designing collaborating reactive agents through RL approaches. As various difficult theoretical problems arise, we have proposed the use of shaping methods (progressive learning that begins with simple situations) to help agents learn their behaviors. The experiments show that some local optima of classical on-line RL algorithms can be overcome through this method. Unfortunately, shaping algorithms are difficult to automate in partially observable frameworks, which limits their use. 2. In the second framework, we have worked on the decomposition of a policy into several parallel policies within an agent. This problem lies in the field of action selection, which concerns making a decision when considering several simultaneous goals. A first step was to propose and study algorithms that adaptively combine policies. The results were encouraging, as some local optima were also overcome here: classical RL techniques are usually limited to avoiding only immediate problems. Then, an innovative algorithm was proposed that lets the agent automatically find the basic policies it requires in a given environment. To our knowledge, this is the first work in which the agent autonomously finds and learns the "basic behaviors" necessary for its decision-making: these are usually provided by a human designer. To sum up, both parts of this PhD thesis bring together the idea of decomposition approaches (through a MAS or an action-selection architecture) and Reinforcement Learning, with a progressive building of the agents in both cases. Due to difficult hypotheses (partial observations, no model, ...), both parts of the presented work are based on heuristics, which does not prevent them from being very promising for designing agents. |
| File Format | PDF, HTM/HTML |
| Alternate Webpage(s) | http://www.loria.fr/~buffet/papiers/memoire.pdf |
| Alternate Webpage(s) | https://members.loria.fr/FCharpillet/files/files/TheseOlivierBuffet.pdf |
| Language | English |
| Access Restriction | Open |
| Content Type | Text |
| Resource Type | Article |
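The abstract's first part relies on shaping: learning progressively, starting from simple situations. The thesis's own experiments are not reproduced here; the sketch below only illustrates the general idea on a toy task, with all names and parameters invented for the example. A tabular Q-learning agent is trained on progressively longer corridors while reusing the same Q-table, so each stage starts from the values learned on the easier task.

```python
import random

random.seed(0)  # deterministic run for the illustration

def greedy(q, s):
    # Break ties randomly so an empty Q-table does not lock the agent in place.
    q_left, q_right = q.get((s, 0), 0.0), q.get((s, 1), 0.0)
    if q_left == q_right:
        return random.randrange(2)
    return 0 if q_left > q_right else 1

def q_learning(length, q, episodes=300, alpha=0.5, gamma=0.95, eps=0.1):
    # Tabular Q-learning on a 1-D corridor: states 0..length, goal = length.
    # Action 0 moves left (bounded at 0), action 1 moves right;
    # reward 1 is given only on reaching the goal.
    for _ in range(episodes):
        s = 0
        while s != length:
            a = random.randrange(2) if random.random() < eps else greedy(q, s)
            s2 = max(0, s - 1) if a == 0 else s + 1
            reward = 1.0 if s2 == length else 0.0
            target = reward if s2 == length else gamma * max(
                q.get((s2, b), 0.0) for b in (0, 1))
            q[(s, a)] = q.get((s, a), 0.0) + alpha * (target - q.get((s, a), 0.0))
    return q

# Shaping: train on progressively longer corridors, reusing the same Q-table.
q = {}
for length in (2, 4, 8):
    q_learning(length, q)

# Greedy policy for the final task: 1 ("go right") in every state short of the goal.
policy = [1 if q.get((s, 1), 0.0) > q.get((s, 0), 0.0) else 0 for s in range(8)]
print(policy)
```

Here shaping merely warm-starts the value table; the abstract's point is that such a progressive curriculum can escape local optima that plain on-line RL gets stuck in, though automating the curriculum is hard under partial observability.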
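The second part of the abstract concerns action selection: combining several parallel policies that pursue different simultaneous goals. A common way to merge them is a weighted sum of the sub-policies' Q-values ("greatest-mass" combination); this is only a generic illustration of that idea, not the adaptive algorithm of the thesis, and the behaviors and weights below are hypothetical.

```python
def combine(q_tables, weights, state, actions):
    # Greatest-mass action selection: choose the action that maximizes the
    # weighted sum of the sub-policies' Q-values in the current state.
    def score(a):
        return sum(w * q[(state, a)] for q, w in zip(q_tables, weights))
    return max(actions, key=score)

# Hypothetical sub-behaviors for an agent on a 1-D track:
# "seek food" prefers moving right, "avoid danger" prefers moving left.
ACTIONS = ("left", "right")
q_food   = {("s0", "left"): 0.2, ("s0", "right"): 1.0}
q_danger = {("s0", "left"): 1.0, ("s0", "right"): -0.5}

# When food dominates, the combined policy advances; when danger dominates,
# it retreats -- no single sub-policy decides alone.
print(combine([q_food, q_danger], [1.0, 0.2], "s0", ACTIONS))  # right
print(combine([q_food, q_danger], [0.3, 1.0], "s0", ACTIONS))  # left
```

Adapting the weights on-line (rather than fixing them as here) is what lets such a decomposed agent trade off goals beyond the immediate situation, which is the improvement over purely reactive selection that the abstract highlights.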