Python exercise [03_1_encoding_scaling]: xgboost python is about to support categorical data while R hasn't yet. #35

sandylaker · 2025-05-17T01:28:32Z

However, it says "the feature is experimental and has limited features. Only the Python package is fully supported". So it is not supported in R, and AFAIK mlr3 does not support it either as of now. I would therefore propose keeping the exercise as it is for now, but opening an issue to be resolved in the future, once categorical support is no longer experimental. Also, it looks like it is basically doing one-hot encoding internally: "categorical data the split is defined depending on whether partitioning or onehot encoding is used. For partition-based splits, the splits are specified as $value \in categories$, where categories is the set of categories in one feature. If onehot encoding is used instead, then the split is defined as $value == category$"

Originally posted by @giuseppec in #26 (comment)
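
To make the two split rules quoted above concrete, here is a minimal pure-Python sketch. It only illustrates the set-membership test versus the equality test; it is not xgboost's implementation, and the helper names `partition_split` and `onehot_split` as well as the toy `colors` data are hypothetical.

```python
# Illustrative only: these helpers are hypothetical and not part of xgboost.

def partition_split(value, categories):
    """Partition-based rule: the test is `value in categories`."""
    return value in categories


def onehot_split(value, category):
    """One-hot rule: the test is `value == category`."""
    return value == category


# Toy categorical feature (made-up data).
colors = ["red", "green", "blue", "green"]

# Partition-based split on the category set {"red", "blue"}.
partition_mask = [partition_split(c, {"red", "blue"}) for c in colors]

# One-hot style split on the single category "green".
onehot_mask = [onehot_split(c, "green") for c in colors]

print(partition_mask)
print(onehot_mask)
```

Under these definitions a one-hot split is the special case of a partition-based split whose category set has exactly one element, so the partition-based rule is the more general of the two.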

sandylaker mentioned this issue May 17, 2025

Py 03 feature #26

Merged
