Authors: Stevanović, Sanja 
Dashti, Husain
Milošević, Marko
Al-Yakoob, Salem
Stevanović, Dragan 
Affiliations: Computer Science; Mathematical Institute of the Serbian Academy of Sciences and Arts
Title: Comparison of ANN and XGBoost surrogate models trained on small numbers of building energy simulations
Journal: PLoS ONE
Volume: 19
Issue: 10
First page: e0312573
Editors: Zhang, Jie
Issue Date: 1-Oct-2024
Rank: ~M22
ISSN: 1932-6203
DOI: 10.1371/journal.pone.0312573
Abstract: 
Surrogate optimisation holds great promise for building energy optimisation studies, as it aims to replace lengthy building energy simulations within an optimisation step with expendable local surrogate models that can quickly predict simulation results. To be useful for this purpose, precise surrogate models should be quickly trainable from a small number of simulation results (10-100) obtained from appropriately sampled points in the desired part of the design space. Two sampling methods and two machine learning models are compared here. Latin hypercube sampling (LHS), widely accepted in the building energy community, is compared to an exploratory Monte Carlo-based sequential design method, mc-intersite-proj-th (MIPT). Artificial neural networks (ANN), also widely accepted in the building energy community, are compared to gradient-boosted tree ensembles (XGBoost), the model of choice in many machine learning competitions. To better understand the behaviour of these two sampling methods and two machine learning models, we compare their predictions against a large set of generated synthetic data. For this purpose, a simple case study of an office cell model with a single window and a fixed overhang, whose main input parameters are overhang depth and height, with climate type, presence of obstacles, orientation and heating and cooling set points as additional input parameters, was extensively simulated with EnergyPlus to form a large underlying dataset of 729,000 simulation results. Expendable local surrogate models for predicting simulated heating, cooling and lighting loads and equivalent primary energy needs of the office cell were trained using both LHS and MIPT and both ANN and XGBoost, for several main hyperparameter choices. Results show that XGBoost models are more precise than ANN models, and that for both machine learning models, MIPT sampling leads to more precise surrogates than LHS.
Publisher: Public Library of Science
Project: Science Fund of the Republic of Serbia, grant #6767, Lazy walk counts and spectral radius of threshold graphs—LZWK

Files in This Item:
SStevanovic.pdf (3.92 MB, Adobe PDF)
This item is licensed under a Creative Commons License.