NWU Institutional Repository

Comparing Transformer-based and gradient boosted decision tree (GBDT) Models on Tabular Data: A Rossmann Case Study

Abstract

Heterogeneous tabular data is a common and important data format. This empirical study investigates how the performance of deep transformer models compares with that of benchmark gradient boosted decision tree (GBDT) methods, the more typical modelling approach. All models are optimised using a Bayesian hyperparameter optimisation protocol, which provides a stronger comparison than the random grid search hyperparameter optimisation used in earlier work. Since feature skewness is typically handled differently for GBDT and transformer-based models, we also investigate how a pre-processing step that normalises feature distributions affects the model comparison. Our analysis is based on the Rossmann Store Sales dataset, a widely recognised benchmark for regression tasks.
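
The abstract does not say which transformation is used to normalise feature distributions. As an illustration only, the sketch below assumes a simple log(1 + x) transform and applies it to a synthetic right-skewed feature (not Rossmann data), then compares skewness before and after. The function names `skewness` and `log1p_transform` are hypothetical helpers introduced for this example.

```python
import math
import random


def skewness(values):
    """Return the population (Fisher-Pearson) skewness of a sequence."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5


def log1p_transform(values):
    """Apply log(1 + x) to non-negative values to compress a long right tail.

    Assumption: this stands in for the unspecified normalisation step.
    """
    return [math.log1p(v) for v in values]


# Hypothetical right-skewed feature: synthetic log-normal draws with a fixed seed.
rng = random.Random(0)
raw = [rng.lognormvariate(0.0, 1.0) for _ in range(5000)]
transformed = log1p_transform(raw)

print(f"skewness before: {skewness(raw):.2f}")
print(f"skewness after:  {skewness(transformed):.2f}")
```

Tree-based models split on feature order, so a monotonic transform like this one typically leaves their splits unchanged, whereas neural models are generally more sensitive to feature scale. This is one plausible reason for the step to affect the two model families differently.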

Citation

Middel, C. & Davel, M. Comparing Transformer-based and gradient boosted decision tree (GBDT) Models on Tabular Data: A Rossmann Case Study.
