-
If not, would it be feasible to run FLAML for LightGBM to get tuned parameters, and then train with those parameters inside a C# application using the Microsoft.ML.LightGBM library?
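A minimal sketch of the Python side of that workflow, assuming a binary classification task (`X_train`, `y_train`, and the time budget are placeholders):

```python
from flaml import AutoML

# Placeholders: X_train / y_train stand in for your data.
automl = AutoML()
automl.fit(
    X_train,
    y_train,
    task="classification",
    estimator_list=["lgbm"],  # restrict the search to LightGBM
    time_budget=600,          # seconds; placeholder budget
)

# The tuned hyperparameters, e.g. n_estimators, num_leaves, learning_rate,
# min_child_samples, colsample_bytree, reg_alpha, reg_lambda.
print(automl.best_config)
```

The values in `best_config` would then be transcribed onto the C# trainer options; for example, `num_leaves`, `n_estimators`, `learning_rate`, and `min_child_samples` correspond to `NumberOfLeaves`, `NumberOfIterations`, `LearningRate`, and `MinimumExampleCountPerLeaf` on the LightGbm trainer's `Options` in Microsoft.ML.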
-
I'm having a hard time getting the same results; the C# model has much lower accuracy. I will continue by comparing the C# and Python defaults. Currently I am setting a few LightGBM variables explicitly and keeping the others at their defaults.
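One way to make that comparison concrete, as a minimal sketch with hypothetical values (`num_leaves` and `learning_rate` below are placeholders, not the actual settings): train on the Python side and print the parameter set the booster was actually trained with, then line it up field by field against the Microsoft.ML.LightGBM trainer options.

```python
import lightgbm as lgb

# Placeholder values for illustration only; substitute the tuned parameters.
params = {
    "objective": "binary",
    "num_leaves": 31,       # hypothetical tuned value
    "learning_rate": 0.1,   # hypothetical tuned value
    "verbosity": -1,
}
train_set = lgb.Dataset(X_train, label=y_train)  # X_train/y_train: your data
booster = lgb.train(params, train_set, num_boost_round=100)

# The parameters this booster was trained with; any field that differs from
# the C# trainer's effective settings (or from LightGBM's documented
# defaults) is a candidate cause of the accuracy gap.
print(booster.params)
```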
-
@sonichi If the random seed makes a big difference on some datasets, that might have implications for the optimal search space. The choice of booster also made quite a big difference. I might try to "tune" the seed and the booster (see the sketch below). Would you or your team have suggestions on the best way to do that?
I have one concern about it: if the tuning results rely on randomly dropped columns, and it matters a lot which columns get dropped, then when the next dataset version removes a column from the beginning of the file, the tuning results are no longer valid and tuning needs to be redone. Not every developer will realize that a small change to the dataset, like removing a column or engineering a new column in the pipeline, might break the optimal values. Therefore, perhaps the default search space should avoid randomness-based parameters altogether? Microsoft.ML does not appear to use them. The tuned results from Microsoft.ML are fairly close to FLAML's, but the tree is much bigger in Microsoft.ML. For experimentation, we may need a public dataset with a high number of columns (and a correspondingly high risk of overfitting) so that colsample_bytree becomes effective. I cannot share the current dataset, but its tuned colsample_bytree values range from 0.67 at the lowest to 0.9 at the highest (depending on the metric), and it has 2500 columns.
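A minimal sketch of one way to fold the seed and the booster into the search, assuming FLAML's `custom_hp` argument (available in recent FLAML versions; the ranges and choices below are placeholders, not recommendations):

```python
from flaml import AutoML, tune

# Hypothetical search-space extension: let the tuner also pick the seed
# and the boosting type for the LightGBM estimator.
custom_hp = {
    "lgbm": {
        "seed": {
            "domain": tune.randint(lower=1, upper=100),
            "init_value": 1,
        },
        "boosting": {
            "domain": tune.choice(["gbdt", "dart"]),
            "init_value": "gbdt",
        },
    }
}

automl = AutoML()
automl.fit(
    X_train,
    y_train,                  # placeholders for your data
    task="classification",
    estimator_list=["lgbm"],
    custom_hp=custom_hp,
    time_budget=600,          # seconds; placeholder budget
)
print(automl.best_config)
```

Fixing a tuned seed at least makes runs reproducible, though, as noted above, the column sampling it drives still depends on column order, so a schema change can still invalidate the result.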
-
Are there plans to create a C# port of the FLAML library? Or do you know of any community projects doing it?