Comparing Transformer-based and gradient boosted decision tree (GBDT) Models on Tabular Data: A Rossmann Case Study
Abstract
Heterogeneous tabular data is a common and important data format. This empirical study investigates how the performance of deep transformer models compares against benchmark gradient boosted decision tree (GBDT) methods, the more typical modelling approach. All models are optimised using a Bayesian hyperparameter optimisation protocol, which provides a stronger comparison than the random grid search hyperparameter optimisation utilised in earlier work. Since feature skewness is typically handled differently for GBDT and transformer-based models, we investigate the effect of a pre-processing step that normalises feature distributions on the model comparison process. Our analysis is based on the Rossmann Store Sales dataset, a widely recognised benchmark for regression tasks.
Citation
Middel, C. & Davel, M. Comparing Transformer-based and gradient boosted decision tree (GBDT) Models on Tabular Data: A Rossmann Case Study.