Heating, Ventilation, and Air Conditioning (HVAC) systems are essential for thermal comfort and account for a significant share of building energy consumption. Recent advances in Machine Learning (ML) enable data-driven control strategies that can improve operational efficiency by learning control patterns directly from sensor data. This paper presents a comparative analysis of four ML algorithms for predicting binary HVAC control states: Random Forest (RF), eXtreme Gradient Boosting (XGBoost), Multi-Layer Perceptron (MLP), and Long Short-Term Memory (LSTM) networks. The study uses a real-world dataset from a multi-zone commercial building, enriched with both historical meteorological observations and simulated forecasts. To address the severe class imbalance, in which the OFF state accounts for only 2.92% of the data, a comprehensive preprocessing pipeline was implemented alongside cost-sensitive learning and threshold optimization. Experimental results demonstrate that tree-based ensemble methods, particularly XGBoost, achieve superior performance in detecting rare transition events, yielding a Macro F1-score of 0.713 while balancing sensitivity and precision and preserving physically interpretable decision structures aligned with HVAC operational dynamics. These findings highlight the value of interpretable ML for intelligent HVAC operational support.
*** Title, author list and abstract as submitted during Camera-Ready version delivery. Small changes that may have occurred during processing by Springer may not appear in this window.
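
The abstract names cost-sensitive learning and threshold optimization as the remedies for the class imbalance. The sketch below illustrates one common way to realize both ideas; it is not the authors' pipeline. The inverse-frequency weighting scheme, the threshold grid, the synthetic scores, and all function names (`class_weights`, `macro_f1`, `best_threshold`) are assumptions made for illustration only.

```python
import random


def class_weights(y):
    """Inverse-frequency weights: n_samples / (n_classes * class_count).

    Assumes both classes (0 = OFF, 1 = ON) occur at least once in y.
    """
    n = len(y)
    return {cls: n / (2 * y.count(cls)) for cls in (0, 1)}


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores for labels 0 and 1."""
    scores = []
    for cls in (0, 1):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p == cls)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p != cls)
        denom = 2 * tp + fp + fn
        # A class with no true or predicted members contributes an F1 of 0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def best_threshold(y_true, p_on, grid=None):
    """Scan thresholds on P(ON) and keep the one with the highest Macro F1."""
    grid = grid or [i / 100 for i in range(1, 100)]
    best_t, best_score = 0.5, -1.0
    for t in grid:
        y_pred = [1 if p >= t else 0 for p in p_on]
        score = macro_f1(y_true, y_pred)
        if score > best_score:
            best_t, best_score = t, score
    return best_t, best_score


if __name__ == "__main__":
    # Synthetic stand-in data (hypothetical): roughly 3% OFF samples, with
    # model scores that overlap between the two classes.
    rng = random.Random(0)
    y = [0 if rng.random() < 0.03 else 1 for _ in range(2000)]
    p_on = [
        min(1.0, max(0.0, rng.gauss(0.9 if label == 1 else 0.6, 0.12)))
        for label in y
    ]
    print("class weights:", class_weights(y))
    print("Macro F1 at 0.5:", macro_f1(y, [1 if p >= 0.5 else 0 for p in p_on]))
    print("tuned (threshold, Macro F1):", best_threshold(y, p_on))
```

In practice the weights would be passed to the classifier during training, and the threshold would be tuned on a held-out validation split rather than on the test data, so that the reported Macro F1 is not optimistically biased.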