categorical-tabnet

License: Apache Code style: black

Code for the paper: categorical boosting based TabNet

Abstract

Recent deep learning models perform well in image and natural language processing. On tabular data, however, they often fail to achieve good performance because of data-level problems. TabNet, a model that overcomes these shortcomings, has recently been widely used for learning on tabular data, but it does not perform particularly well on categorical variables. To address this, we apply the CatBoost encoding method to the categorical variables. With this pre-processing of the categorical data, the model achieved better performance than other models, and CatBoost encoding performed better than other encoding techniques.

Model Architecture

[Figure: model architecture]
The model handles the categorical and numerical parts of the data separately. As the figure illustrates, the categorical data are first converted into numerical values by the CatBoost encoder. The encoded values, together with the batch-normalized numerical data, are then passed to the TabNet encoder to produce the final predictions.
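The pre-processing described above can be sketched in a few lines of plain Python. This is a simplified illustration, not the repository's implementation: `ordered_target_encode` and `standardize` are hypothetical helper names, the encoder follows the general idea of CatBoost's ordered target statistics (without the random permutations CatBoost itself uses), `standardize` mimics batch normalization on a single batch without learned scale and shift, and the toy columns are made up for the example.

```python
from collections import defaultdict


def ordered_target_encode(categories, targets, prior_weight=1.0):
    """Simplified CatBoost-style ordered target statistic for one column.

    Each row is encoded from the targets of earlier rows with the same
    category only, so a row never sees its own label.
    """
    prior = sum(targets) / len(targets)  # global target mean
    sums = defaultdict(float)  # running target sum per category
    counts = defaultdict(int)  # running row count per category
    encoded = []
    for cat, y in zip(categories, targets):
        value = (sums[cat] + prior * prior_weight) / (counts[cat] + prior_weight)
        encoded.append(value)
        sums[cat] += y
        counts[cat] += 1
    return encoded


def standardize(values, eps=1e-5):
    """Zero-mean, unit-variance scaling of one numerical column."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [(v - mean) / (var + eps) ** 0.5 for v in values]


# Hypothetical toy data: one categorical column, one numerical column,
# and a binary target.
city = ["a", "b", "a", "a", "b"]
age = [20.0, 30.0, 40.0, 50.0, 60.0]
label = [1, 0, 1, 0, 1]

# Each row is [encoded categorical feature, normalized numerical feature].
# Rows of this form would be the input to the TabNet encoder.
features = [
    list(row)
    for row in zip(ordered_target_encode(city, label), standardize(age))
]
```

In practice an existing encoder and TabNet implementation would replace these helpers; the sketch only shows how both kinds of columns end up in one numerical feature matrix.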

Experiments

Results
image
The proposed categorical boosting model outperforms TabNet and achieves strong results across diverse datasets.

Interpretability

Mask
image
Whereas TabNet distributes attention almost equally among features, in this approach the attention given to each feature varies across layers.