Two Heuristics for Solving POMDPs Having a Delayed Need to Observe
| Content Provider | CiteSeerX |
|---|---|
| Author | Zubek, Valentina Bayer; Dietterich, Thomas |
| Abstract | A common heuristic for solving Partially Observable Markov Decision Problems (POMDPs) is to first solve the underlying Markov Decision Process (MDP) and then construct a POMDP policy by performing a fixed-depth lookahead search in the POMDP and evaluating the leaf nodes using the MDP value function. A problem with this approximation is that it does not account for the need to choose actions in order to gain information about the state of the world, particularly when those observation actions are needed at some point in the future. This paper proposes two heuristics that are better than the MDP approximation in POMDPs where there is a delayed need to observe. The first approximation, introduced in [2], is the even-odd POMDP, in which the world is assumed to be fully observable every other time step. The even-odd POMDP can be converted into an equivalent MDP, the even-MDP, whose value function captures some of the sensing costs of the original POMDP. An online policy, consisting of a 2-step lookahead search combined with the value function of the even-MDP, gives an approximation to the POMDP's value function that is at least as good as the method based on the value function of the underlying MDP. The second POMDP approximation is applicable to a special kind of POMDP which we call the Cost Observable Markov Decision Problem (COMDP). In a COMDP, the actions are partitioned into those that change the state of the world and those that are pure observation actions. For such problems, we describe the chain-MDP algorithm, which in many cases is able to capture more of the sensing costs than the even-odd POMDP approximation. We prove that both heuristics compute value functions that are upper bounded by (i.e., better than) the value function of the underlying MDP. |
| Access Restriction | Open |
| Subject Keyword | Even-odd POMDP; Even-odd POMDP Approximation; Common Heuristic; Value Function; MDP Value Function; Observation Action; Original POMDP; Underlying MDP; Equivalent MDP; 2-step Lookahead Search; Chain-MDP Algorithm; Time Step; Online Policy; Sensing Cost; MDP Approximation; Second POMDP Approximation; Delayed Need; Partially Observable Markov Decision Problem; POMDP Policy; Pure Observation Action; Cost Observable Markov Decision Problem; Underlying Markov Decision Process; Fixed-depth Lookahead Search; First Approximation; Special Kind |
| Content Type | Text |
| Resource Type | Article |
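
The abstract describes a scheme that is easy to prototype: a Bayes belief update plus a depth-limited lookahead search whose leaf beliefs are scored by a state value function. The sketch below is a minimal illustration under invented assumptions, not the paper's implementation: the two-state model (`T`, `Z`, `R`), the sensing cost, and the placeholder leaf values `V_MDP` are all hypothetical. Scoring leaves with the underlying MDP's value function gives the baseline approximation the abstract criticizes; the paper's first heuristic instead pairs a 2-step lookahead with the even-MDP's value function.

```python
# Minimal sketch of fixed-depth lookahead in a POMDP with leaf beliefs scored
# by a state value function. The model below (T, Z, R, V_MDP) is invented for
# illustration; it is NOT the paper's benchmark.

STATES = [0, 1]
ACTIONS = ["stay", "probe"]          # "probe" acts as a pure observation action
OBS = ["low", "high"]
GAMMA = 0.95

def T(s, a, s2):
    """Transition probability P(s2 | s, a)."""
    if a == "probe":                 # observing leaves the state unchanged
        return 1.0 if s2 == s else 0.0
    return 0.9 if s2 == s else 0.1   # "stay" mostly preserves the state

def Z(a, s2, o):
    """Observation probability P(o | a, s2)."""
    if a == "probe":                 # probing is informative about the state
        return 0.9 if (o == "high") == (s2 == 1) else 0.1
    return 0.5                       # other actions yield uninformative noise

def R(s, a):
    """Immediate reward; probing pays a sensing cost."""
    return (1.0 if s == 1 else 0.0) - (0.2 if a == "probe" else 0.0)

def belief_update(b, a, o):
    """Bayes filter: returns (posterior belief, P(o | b, a))."""
    unnorm = [Z(a, s2, o) * sum(b[s] * T(s, a, s2) for s in STATES)
              for s2 in STATES]
    p_o = sum(unnorm)
    return ([u / p_o for u in unnorm] if p_o > 0 else None), p_o

def lookahead(b, depth, v_leaf):
    """Depth-limited expectimax over the belief tree.

    At the leaves, the belief is scored by the state value function v_leaf,
    e.g. the underlying MDP's value function (the baseline) or the even-MDP's
    value function with depth=2 (the paper's first heuristic).
    """
    if depth == 0:
        return sum(b[s] * v_leaf[s] for s in STATES), None
    best_q, best_a = float("-inf"), None
    for a in ACTIONS:
        q = sum(b[s] * R(s, a) for s in STATES)       # expected immediate reward
        for o in OBS:
            b2, p_o = belief_update(b, a, o)
            if p_o > 0:
                v, _ = lookahead(b2, depth - 1, v_leaf)
                q += GAMMA * p_o * v                  # expectation over observations
        if q > best_q:
            best_q, best_a = q, a
    return best_q, best_a

# Placeholder values standing in for the MDP (or even-MDP) value function.
V_MDP = [10.0, 12.0]
value, action = lookahead([0.5, 0.5], depth=2, v_leaf=V_MDP)
print(f"2-step lookahead value {value:.3f}, first action: {action}")
```

Because the even-MDP assumes full observability on every other step, its value function already charges for some sensing; swapping it in for `V_MDP` at the leaves is what lets the 2-step lookahead anticipate a delayed need to observe.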