Existing high-performance Machine Learning models typically rely on large training datasets with high-quality manual annotations, which are difficult to obtain for energy consumption anomaly detection. This study addresses that gap by evaluating off-the-shelf unsupervised anomaly detection models on real-world building energy datasets from different use cases, varying in size and features. To this end, empirical evaluations and a methodological analysis are conducted to assess how closely fully automated unsupervised labeling agrees with domain experts' manual labeling. We evaluate and discuss the impact of the anomaly contamination ratio, missing values, and the number of features used on fully automated unsupervised labeling and on avoiding false notifications by minimizing the False Positive and False Negative rates. In addition, the performance of combining models through a voting process is evaluated. The results demonstrate the capability of Machine Learning anomaly detection models to distinguish anomalous instances from normal ones. They also highlight the similarity between the labels produced by Machine Learning models and domain experts' manual labels. These findings support the development of anomaly detection in energy consumption scenarios, significantly reducing the time and cost of manual labeling.
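
The voting step and the comparison against expert labels can be illustrated with a small sketch. The abstract does not state which voting rule or which models are used, so the strict-majority rule, the function names `majority_vote` and `error_rates`, and the toy label vectors below are all assumptions made for illustration; they are not the study's implementation or data.

```python
def majority_vote(model_labels):
    """Combine per-model labels (1 = anomaly, 0 = normal) by strict majority.

    Assumption: a sample is flagged only if more than half of the models flag it.
    """
    n_models = len(model_labels)
    return [1 if 2 * sum(votes) > n_models else 0 for votes in zip(*model_labels)]


def error_rates(predicted, expert):
    """Return (false positive rate, false negative rate) against expert labels."""
    fp = sum(1 for p, e in zip(predicted, expert) if p == 1 and e == 0)
    fn = sum(1 for p, e in zip(predicted, expert) if p == 0 and e == 1)
    negatives = sum(1 for e in expert if e == 0)
    positives = sum(1 for e in expert if e == 1)
    fpr = fp / negatives if negatives else 0.0
    fnr = fn / positives if positives else 0.0
    return fpr, fnr


# Hypothetical labels from three unsupervised detectors and one domain expert.
model_a = [0, 1, 0, 1, 0, 0]
model_b = [0, 1, 1, 0, 0, 0]
model_c = [1, 1, 0, 0, 0, 0]
expert = [0, 1, 0, 1, 0, 0]

combined = majority_vote([model_a, model_b, model_c])
fpr, fnr = error_rates(combined, expert)
print(combined, fpr, fnr)
```

In this toy case the vote suppresses the two isolated false alarms raised by single models, at the cost of missing one anomaly that only one model detected, which is the kind of False Positive versus False Negative trade-off the evaluation examines.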