In many disciplines across science and engineering, practitioners must make sequential decisions under uncertainty about the outcome at each intermediate stage of a multistage process. In this project, we are developing and implementing advanced Reinforcement Learning (RL) methods for such sequential control and decision-making tasks. One example is electricity supply, where time-varying demand must be met. Utility companies try to optimize how a set of electrical generators coordinates production to follow fluctuations in consumer demand while minimizing system costs under multiple operational constraints. To address this supply-demand problem, our approach builds on recent advances in RL, such as Deep Q-Networks, Actor-Critic, and Policy Gradient methods, to design algorithms that learn approximately optimal strategies efficiently and robustly. Once developed and refined, these tools and strategies can be applied to a variety of real-world scenarios across a wide array of domains.