Date of Submission
12-2020
Document Type
Thesis
Degree Name
Master of Science in Computer Science
Department
Electrical & Computer Engineering and Computer Science
Advisor
Dr. Vahid Behzadan
Committee Member
Dr. Mohsen Sarraf
Committee Member
Dr. Muhammad Aminul Islam
Keywords
Adaptive discounting, Markov Decision Process, Fixed-rate discounting, State-wise Adaptive Discounting from Experience (SADE), Batch-wise Adaptive Discounting from Experience (BADE), Deep Q-Network
LCSH
Reinforcement learning, Decision making, Discount
Abstract
In Markov Decision Process (MDP) models of sequential decision-making, it is common practice to account for temporal discounting by incorporating a constant discount factor. While the effectiveness of fixed-rate discounting in various Reinforcement Learning (RL) settings is well established, the efficiency of this scheme has been questioned in recent studies. Another notable shortcoming of fixed-rate discounting is that it abstracts away the experiential information of the agent, which has been shown to be a significant component of delay discounting in human cognition. To address this issue, this thesis proposes a novel method for adaptive discounting called State-wise Adaptive Discounting from Experience (SADE). This method leverages the experiential observations of state values in episodic trajectories to iteratively adjust state-specific discount rates. We report experimental evaluations of SADE in Q-learning agents, which demonstrate significant improvements in sample complexity and convergence rate compared to fixed-rate discounting. Additionally, this thesis proposes a second adaptive discounting method for deep RL called Batch-wise Adaptive Discounting from Experience (BADE), and reports experimental analyses of Deep Q-Network (DQN) agents with BADE discounting in an Atari game environment. Finally, the thesis concludes with remarks on future research directions.
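To make the contrast in the abstract concrete, the following is a minimal sketch of how a return is computed under a constant discount factor versus under state-specific discount factors. It is not the SADE or BADE update rule, which the abstract does not specify. The function names, the `gamma_by_state` mapping, and the convention of indexing the discount by the state in which each reward was collected are all assumptions made for illustration.

```python
from typing import Hashable, Mapping, Sequence


def discounted_return(rewards: Sequence[float], gamma: float) -> float:
    """Return under fixed-rate discounting: G_t = r_t + gamma * G_{t+1}."""
    g = 0.0
    # Work backwards from the end of the episode.
    for r in reversed(rewards):
        g = r + gamma * g
    return g


def statewise_discounted_return(
    rewards: Sequence[float],
    states: Sequence[Hashable],
    gamma_by_state: Mapping[Hashable, float],
    default_gamma: float,
) -> float:
    """Return when each state carries its own discount factor.

    Assumed convention (illustrative only): states[t] is the state in which
    rewards[t] was collected, and G_t = r_t + gamma(states[t]) * G_{t+1}.
    States missing from gamma_by_state fall back to default_gamma.
    """
    if len(rewards) != len(states):
        raise ValueError("rewards and states must have the same length")
    g = 0.0
    for r, s in zip(reversed(rewards), reversed(states)):
        g = r + gamma_by_state.get(s, default_gamma) * g
    return g
```

When every state maps to the same discount factor, the second function reduces to the first, which is the sense in which fixed-rate discounting is a special case of state-wise discounting in this sketch.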
Recommended Citation
Zinzuvadiya, Milan, "Adaptive Discounting in Reinforcement Learning" (2020). Master's Theses. 169.
https://digitalcommons.newhaven.edu/masterstheses/169