Dr. Mohamad Kazem Shirani Faradonbeh | University of Georgia

Abstract: The design of decision-making algorithms for dynamic environments is a fundamental problem in artificial intelligence. In many applications, the outcomes of different decisions are unknown, and data-driven algorithms are required to learn them. We study this problem and propose the first set of reinforcement learning algorithms for uncertain environments that evolve as stochastic differential equations. First, we propose fast and effective algorithms for learning to control instabilities such as drug overdoses and infectious outbreaks. For these algorithms, which develop probabilistic beliefs about the unknown environment, we establish performance guarantees. Then, we proceed to the problem of learning to minimize a cost function, which captures applications such as personalized insulin pumps for diabetic patients. We present a novel, easy-to-implement data-driven algorithm that sequentially updates its probabilistic beliefs, prove its efficiency, and perform a regret analysis that fully characterizes the effect of uncertainty. Thus, we address the important exploration-exploitation dilemma: the algorithm successfully explores uncharted territories while simultaneously making good decisions.
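To make the idea of "sequentially updating probabilistic beliefs" concrete, here is a minimal sketch, not the speaker's actual algorithm: posterior-sampling (Thompson-sampling-style) control of a scalar linear diffusion dx = (a·x + u) dt + σ dW whose drift parameter a is unknown, simulated with Euler-Maruyama. The constants (`a_true`, `sigma`, `dt`, `horizon`), the conjugate Gaussian belief, and the gain rule are all illustrative assumptions.

```python
import numpy as np

# Illustrative sketch (not the talk's algorithm): control an unstable scalar
# SDE  dx = (a*x + u) dt + sigma dW  with unknown drift a, by repeatedly
# (1) sampling a model from a Gaussian belief, (2) acting as if it were true,
# (3) updating the belief from the observed increment.
rng = np.random.default_rng(0)

a_true, sigma = 0.5, 0.1      # a_true > 0: the uncontrolled system is unstable
dt, horizon = 0.01, 2000      # Euler-Maruyama step size and number of steps
noise_var = sigma**2 * dt     # variance of each discretized noise increment

# Conjugate Gaussian belief over the unknown a: mean mu, precision lam
mu, lam = 0.0, 1.0

x = 1.0
for _ in range(horizon):
    # Thompson sampling: draw a model from the current belief, act as if true
    a_hat = mu + rng.standard_normal() / np.sqrt(lam)
    u = -(a_hat + 1.0) * x    # places the sampled closed-loop rate at -1

    # One Euler-Maruyama step of the *true* (unknown) dynamics
    dx = (a_true * x + u) * dt + sigma * np.sqrt(dt) * rng.standard_normal()

    # Bayesian linear-regression update: the increment minus the known input
    # effect, y = dx - u*dt, equals a*(x*dt) plus Gaussian noise
    phi, y = x * dt, dx - u * dt
    lam_new = lam + phi * phi / noise_var
    mu = (lam * mu + phi * y / noise_var) / lam_new
    lam = lam_new
    x += dx
```

The exploration-exploitation trade-off shows up in the sampling step: while the belief is diffuse the sampled models (and hence the actions) vary, probing the system; as data accumulates the precision `lam` grows, the samples concentrate, and the controller behaves nearly optimally for the learned model.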