Optimizing Industrial Cooling Systems with Hierarchical Reinforcement Learning

Reinforcement learning (RL) techniques have been deployed to optimize industrial cooling systems, offering substantial energy reductions compared to traditional heuristic policies. A major challenge in controlling these systems is learning behaviors that remain feasible in the real world given machinery constraints. For example, certain actions can only be executed every few hours, while others can be taken more frequently. Without extensive reward engineering and experimentation, an RL agent may not learn to operate machinery realistically. To address this, we use hierarchical reinforcement learning with multiple agents, each controlling a subset of actions according to its operating time scale. In a simulated HVAC control environment, our hierarchical approach achieves energy savings over existing baselines while satisfying constraints such as operating chillers within safe bounds.
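
The idea of splitting actions across agents by time scale can be illustrated with a minimal sketch. It is not the implementation evaluated in this work. Everything named here is an assumption made for illustration: the class `TimescaleAgent`, the function `run_episode`, the action names `chillers_on` and `supply_setpoint_c`, the decision periods, and the hand-written rules that stand in for learned policies. A slow agent re-decides only every few steps and holds its action in between, while a fast agent acts at every step and observes the slow agent's current choice.

```python
import random
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class TimescaleAgent:
    """Controls one subset of actions and re-decides only every `period` steps."""
    name: str                                     # hypothetical action name
    period: int                                   # steps between decisions (assumed value)
    policy: Callable[[Dict[str, float]], float]   # stand-in for a learned policy
    last_action: float = 0.0

    def act(self, step: int, obs: Dict[str, float]) -> float:
        # Decide only on this agent's own time scale; otherwise hold the last action.
        if step % self.period == 0:
            self.last_action = self.policy(obs)
        return self.last_action


def run_episode(agents: List[TimescaleAgent], n_steps: int, seed: int = 0) -> List[Dict[str, float]]:
    """Step a toy load signal and let agents act in order, slowest first."""
    rng = random.Random(seed)
    history = []
    for step in range(n_steps):
        obs = {"load": rng.uniform(0.2, 1.0)}     # toy cooling load, not a real HVAC model
        for agent in agents:
            # Each agent's action is added to the observation seen by the agents after it.
            obs[agent.name] = agent.act(step, obs)
        history.append({a.name: obs[a.name] for a in agents})
    return history


# Hand-written placeholder rules; a real system would use trained RL policies here.
agents = [
    TimescaleAgent("chillers_on", period=4,
                   policy=lambda o: 2.0 if o["load"] > 0.6 else 1.0),
    TimescaleAgent("supply_setpoint_c", period=1,
                   policy=lambda o: 7.0 - o["chillers_on"] * o["load"]),
]
history = run_episode(agents, n_steps=12)
for step, actions in enumerate(history):
    print(step, actions)
```

In this sketch the hold-between-decisions rule enforces the slow time scale structurally, so the constraint does not need to be encoded in the reward. Whether the hierarchy described here enforces its constraints the same way is not stated in the text above.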