The environment and its underlying dynamics of all the reinforcement learning problems are typically abstracted as a Markov decision process (MDP). Because MDPs are useful for studying optimisation problems solved via dynamic programming and reinforcement learning. Today, most of the effective existing reinforcement learning (RL) algorithms are rooted in this dynamic programming paradigm. An alternative…

The post Duality: A New Approach to Reinforcement Learning appeared first on Analytics India Magazine.