Markov Decision Processes: Monotonicity of Optimal Policy in Exponential and Quasi-Hyperbolic Discounting Parameters
| dc.contributor.author | Kilic, Hakan | |
| dc.contributor.author | Canbolat, Pelin Gulsah | |
| dc.contributor.author | Gunes, Evrim Didem | |
| dc.date.accessioned | 2025-10-15T16:30:46Z | |
| dc.date.available | 2025-10-15T16:30:46Z | |
| dc.date.issued | 2026 | |
| dc.description.abstract | Intertemporal preferences of decision makers, i.e., the way they discount delayed utilities, impact their decisions. Empirical evidence suggests that individuals commonly have hyperbolic discounting preferences. This can result in time-inconsistent behavior, e.g., procrastination, which may be a barrier to adopting preventive behavior such as machine maintenance and patient adherence to treatment. In this paper, we theoretically compare the actions of individuals based on their discounting characteristics. We consider the Hyperbolic Discounting (HD) model, which is more representative of individual behavior than Exponential Discounting (ED). We formulate a discrete-time finite-horizon Markov decision process with Quasi-Hyperbolic Discounting (QHD), an analytically tractable function representing HD, and present sufficient conditions that ensure the monotonicity of the optimal policy in the discounting parameters. We consider submodular or supermodular maximization problems. Our paper is the first to investigate the monotonicity of the optimal policy in QHD parameters for these problems. Moreover, we compare the optimal actions under ED and QHD. We apply our results to the settings of machine maintenance, individual health behavior, and inventory control. We provide numerical examples showing that monotonicity may fail when our sufficient conditions are not met. We also explore the discrepancy between the expected total exponentially-discounted rewards of the actions obtained from QHD and of the actions that are optimal under ED, and observe that this discrepancy is affected mainly by the present bias. | en_US |
| dc.description.sponsorship | AXA Award Grant from the AXA Research Fund; Scientific and Technological Research Council of Turkiye (TUBITAK) [221M581] | en_US |
| dc.description.sponsorship | This research was funded by the AXA Award Grant from the AXA Research Fund and the Scientific and Technological Research Council of Turkiye (TUBITAK) grant 221M581. | en_US |
| dc.identifier.doi | 10.1016/j.ejor.2025.09.013 | |
| dc.identifier.issn | 0377-2217 | |
| dc.identifier.issn | 1872-6860 | |
| dc.identifier.scopus | 2-s2.0-105016809149 | |
| dc.identifier.uri | https://doi.org/10.1016/j.ejor.2025.09.013 | |
| dc.language.iso | en | en_US |
| dc.publisher | Elsevier | en_US |
| dc.relation.ispartof | European Journal of Operational Research | en_US |
| dc.rights | info:eu-repo/semantics/closedAccess | en_US |
| dc.subject | Markov Decision Processes | en_US |
| dc.subject | Optimal Policy Monotonicity | en_US |
| dc.subject | Exponential Discounting | en_US |
| dc.subject | Quasi-Hyperbolic Discounting | en_US |
| dc.title | Markov Decision Processes: Monotonicity of Optimal Policy in Exponential and Quasi-Hyperbolic Discounting Parameters | en_US |
| dc.type | Article | en_US |
| dspace.entity.type | Publication | |
| gdc.author.institutional | Canbolat, Pelin Gülşah | |
| gdc.description.department | Kadir Has University | en_US |
| gdc.description.departmenttemp | [Kilic, Hakan] Univ Toronto Mississauga, Inst Commun Culture Informat & Technol, 3359 Mississauga Rd, Mississauga, ON L5L 1C6, Canada; [Canbolat, Pelin Gulsah] Kadir Has Univ, Fac Engn & Nat Sci, Cibali Mah Kadir Has Cad, TR-34083 Istanbul, Turkiye; [Gunes, Evrim Didem] Koc Univ, Coll Adm Sci & Econ, Rumelifeneri Yolu, TR-34450 Istanbul, Turkiye | en_US |
| gdc.description.endpage | 893 | en_US |
| gdc.description.issue | 3 | en_US |
| gdc.description.publicationcategory | Article - International Refereed Journal - Institutional Faculty Member | en_US |
| gdc.description.scopusquality | Q1 | |
| gdc.description.startpage | 877 | en_US |
| gdc.description.volume | 328 | en_US |
| gdc.description.woscitationindex | Science Citation Index Expanded | |
| gdc.description.wosquality | Q1 | |
| gdc.identifier.wos | WOS:001600462100010 | |
| gdc.opencitations.count | 0 | |
| gdc.plumx.scopuscites | 0 | |
| gdc.scopus.citedcount | 0 | |
| relation.isAuthorOfPublication | c73f8fe4-181a-4cd2-bc7f-b35358e26ee6 | |
| relation.isAuthorOfPublication.latestForDiscovery | c73f8fe4-181a-4cd2-bc7f-b35358e26ee6 | |
| relation.isOrgUnitOfPublication | b20623fc-1264-4244-9847-a4729ca7508c | |
| relation.isOrgUnitOfPublication | 2457b9b3-3a3f-4c17-8674-7f874f030d96 | |
| relation.isOrgUnitOfPublication | 28868d0c-e9a4-4de1-822f-c8df06d2086a | |
| relation.isOrgUnitOfPublication.latestForDiscovery | b20623fc-1264-4244-9847-a4729ca7508c |