HVAC control based on reinforcement learning and fuzzy reasoning: Optimizing HVAC supply air temperature, flow rate, and velocity

This paper presents a novel HVAC control framework that combines reinforcement learning (RL) and fuzzy reasoning to optimize supply air temperature, airflow rate, and air velocity simultaneously. Unlike conventional RL-based HVAC systems that mainly regulate temperature, the proposed method incorporates the Predicted Mean Vote (PMV) model and fuzzy logic-based rewards to balance thermal comfort and electricity costs. By considering user preferences and seasonal variations, the RL agent learns intelligent control strategies that improve occupant comfort while reducing energy consumption. Experimental results demonstrate superior thermal comfort and energy-saving performance compared with traditional temperature-only optimization methods.

Fig. 7. Simulation results for winter month setup: (a) HVAC scheduling results. (b) PMV results.

Technology Overview
The proposed technology integrates reinforcement learning, fuzzy logic, and the PMV comfort model for HVAC control. It jointly optimizes supply air temperature, airflow rate, and air velocity, while action masking ensures occupant comfort. Fuzzy reasoning captures complex user preferences regarding energy cost and thermal comfort.
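As a minimal sketch of how action masking could keep an agent inside a comfort band, the Python below filters a discrete action set by predicted PMV and then picks the best remaining action. The masking rule actually used in the paper is not given on this page. The `predict_pmv` callable, the ±0.5 PMV band, the action encoding, and the toy linear PMV stand-in are all assumptions made for illustration.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical action encoding: (supply air temperature in deg C, airflow rate in m^3/s)
Action = Tuple[float, float]


def comfort_mask(actions: Sequence[Action],
                 predict_pmv: Callable[[Action], float],
                 pmv_low: float = -0.5,
                 pmv_high: float = 0.5) -> List[bool]:
    """Return True for each action whose predicted PMV lies inside the band."""
    return [pmv_low <= predict_pmv(a) <= pmv_high for a in actions]


def masked_greedy(q_values: Sequence[float], mask: Sequence[bool]) -> int:
    """Pick the highest-value allowed action; ignore the mask if it blocks everything."""
    allowed = [i for i, ok in enumerate(mask) if ok]
    candidates = allowed if allowed else list(range(len(q_values)))
    return max(candidates, key=lambda i: q_values[i])


def toy_pmv(action: Action) -> float:
    """Toy stand-in for a PMV model (assumption): neutral at 24 deg C, airflow ignored."""
    temp_c, _flow = action
    return 0.3 * (temp_c - 24.0)


actions = [(18.0, 0.5), (24.0, 0.5), (30.0, 0.5)]
q_values = [5.0, 1.0, 3.0]  # made-up action values
mask = comfort_mask(actions, toy_pmv)
best = masked_greedy(q_values, mask)
```

In this toy setup the action with the highest raw value (index 0) is masked out because its predicted PMV falls below the band, so the agent selects the comfortable action instead. A real controller would replace `toy_pmv` with a full PMV calculation.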

Applications & Benefits
The framework can be applied to smart buildings, offices, and energy-efficient HVAC systems. It enhances thermal comfort, reduces electricity costs, adapts to seasonal conditions, and intelligently manages airflow without excessive temperature adjustments. Results show improved PMV performance and higher energy savings compared to conventional RL-based HVAC control.

Abstract:
This paper proposes a heating, ventilation, and air conditioning (HVAC) control approach based on reinforcement learning (RL) and fuzzy reasoning that jointly optimizes HVAC supply air temperature, flow rate, and velocity. Supply air temperature and flow rate form the action space, while air velocity is calculated from the selected air flow rate and certain system parameters using a newly formulated mathematical equation. The Predicted Mean Vote (PMV) model evaluates thermal comfort from these three quantities, which allows thermal comfort and electricity cost to be optimized together. To represent intricate user preferences regarding thermal comfort and electricity cost, fuzzy logic is used to implement the reward function. Experimental results show that the RL agent learns a more intelligent policy: it increases the HVAC supply air velocity to reach the same PMV without lowering the indoor temperature as much. The proposed RL framework achieves on average 6.16% higher energy cost savings and 15.15% better thermal comfort than RL methods that optimize only HVAC supply air temperature.
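To illustrate how a fuzzy-logic reward might trade off comfort against cost, here is a small zero-order Sugeno-style sketch in Python. It is not the reward function from the paper: the membership shapes, the four rules, their output values, the `pmv_limit` parameter, and the assumption that cost is pre-normalized to [0, 1] are all hypothetical choices made for this example.

```python
def clip01(x: float) -> float:
    """Clamp a value to the interval [0, 1]."""
    return max(0.0, min(1.0, x))


def fuzzy_reward(pmv: float, cost_norm: float, pmv_limit: float = 1.0) -> float:
    """Combine comfort and cost into one reward in [-1, 1] (illustrative only).

    pmv       : predicted mean vote (0 is thermally neutral)
    cost_norm : electricity cost scaled to [0, 1] (assumed normalization)
    pmv_limit : |PMV| at which comfort membership reaches zero (assumed)
    """
    # Fuzzification with simple linear membership functions.
    comfortable = clip01(1.0 - abs(pmv) / pmv_limit)
    uncomfortable = 1.0 - comfortable
    expensive = clip01(cost_norm)
    cheap = 1.0 - expensive

    # Rule base: (firing strength via min as fuzzy AND, crisp rule output).
    rules = [
        (min(comfortable, cheap), 1.0),        # comfortable and cheap: best
        (min(comfortable, expensive), 0.3),    # comfortable but costly
        (min(uncomfortable, cheap), -0.3),     # cheap but uncomfortable
        (min(uncomfortable, expensive), -1.0), # uncomfortable and costly: worst
    ]

    # Defuzzification: weighted average of rule outputs.
    total = sum(strength for strength, _ in rules)
    return sum(strength * out for strength, out in rules) / total


r_good = fuzzy_reward(pmv=0.2, cost_norm=0.2)
r_bad = fuzzy_reward(pmv=0.8, cost_norm=0.8)
```

Because the two membership pairs each sum to one, at least one rule always fires, so the weighted average is well defined. Changing the rule outputs (for example, raising the 0.3 entry) is one way such a scheme could encode a user who values comfort more than cost.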

Journal of Building Engineering, Volume 98, 1 December 2024

HVAC control based on reinforcement learning and fuzzy reasoning: Optimizing HVAC supply air temperature, flow rate, and velocity
Author: Yao Leehter, Huang Li-Yu, Teo J.C.
Year: 2025
Source publication: Journal of Building Engineering, Volume 103, June 2025, 112143
Subfield Highest percentage: 99% Architecture #2 / 210

https://www.scopus.com/pages/publications/85217970166
