Implement `StandardScaler`; add associated tests #132

cryptodeal · 2022-12-05T00:30:23Z

Implemented BaseScaler abstract class + StandardScaler, which extends the base class. Also added simple unit tests for StandardScaler.

Fixed a few type errors in shumai/tensor/tensor.ts while working on this.

cryptodeal · 2022-12-05T00:36:12Z

Also, I noticed that Bun's GC is eager enough that, when running `bun wiptest` several times in a row, the memory test would occasionally fail: Bun had already garbage collected some tensors before the call to dispose, so memory usage at the end of the test was lower than at the start.

Accordingly, I've updated the test to check that the memory is less than or equal to memory usage at the start of the test.
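For illustration, the relaxed check can be sketched as below. This is a minimal sketch, not the test from this PR: the helper name `memoryDidNotGrow` is hypothetical, and the two readings are passed in as plain numbers because how they are obtained from Shumai is not shown here.

```typescript
// Hypothetical helper, not part of Shumai: compares two memory readings in bytes.
// A strict equality check fails when the GC frees tensors during the test,
// so the end reading may legitimately be lower than the start reading.
function memoryDidNotGrow(startBytes: number, endBytes: number): boolean {
  return endBytes <= startBytes;
}
```

With this form, an end reading below the start reading counts as a pass, and only growth is treated as a leak.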

asilvas

LGTM just a couple suggestions

shumai/util/preprocessing.ts

cryptodeal · 2022-12-05T02:14:03Z

The biggest issue is that CI test runs fail with a dtype mismatch and log warnings about Float16Array usage from this test, even though the tests don't explicitly use Float16Array. I'm going to mark this as a draft pending further investigation. I'm unable to replicate that specific error locally, but I'm finding further bugs where a test passes when run with Float32Array and fails when run with Float64Array, so this definitely needs to be debugged before it's ready to merge.

Tweaked implementation to debug a few errors I was catching locally.

cryptodeal · 2022-12-10T02:53:38Z

Going to implement remaining class methods so that it's at feature parity with the sklearn.preprocessing.StandardScaler implementation, then will flag as ready for review.
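As a reference point for what parity covers, here is a minimal sketch of the core fit/transform/inverse-transform behaviour using plain nested arrays. It is not the Shumai implementation from this PR: the class name `SimpleStandardScaler` is hypothetical, it assumes a non-empty, rectangular input, and it only follows the sklearn conventions of a population standard deviation (ddof = 0) and a scale of 1 for zero-variance columns.

```typescript
// Illustrative only: a plain-array scaler, not the Shumai StandardScaler.
// Assumes `rows` is non-empty and every row has the same length.
class SimpleStandardScaler {
  mean: number[] = [];
  scale: number[] = [];

  // Learn the per-column mean and population standard deviation (ddof = 0).
  fit(rows: number[][]): this {
    const n = rows.length;
    const cols = rows[0].length;
    this.mean = new Array<number>(cols).fill(0);
    const variance = new Array<number>(cols).fill(0);
    for (const row of rows) {
      for (let j = 0; j < cols; j++) this.mean[j] += row[j] / n;
    }
    for (const row of rows) {
      for (let j = 0; j < cols; j++) {
        const d = row[j] - this.mean[j];
        variance[j] += (d * d) / n;
      }
    }
    // Zero-variance columns get a scale of 1 so transform never divides by zero.
    this.scale = variance.map((v) => (v === 0 ? 1 : Math.sqrt(v)));
    return this;
  }

  // Center each column on its mean and divide by its scale.
  transform(rows: number[][]): number[][] {
    return rows.map((row) => row.map((x, j) => (x - this.mean[j]) / this.scale[j]));
  }

  // Undo transform: multiply by the scale and add the mean back.
  inverseTransform(rows: number[][]): number[][] {
    return rows.map((row) => row.map((z, j) => z * this.scale[j] + this.mean[j]));
  }
}
```

A typical call chains the two steps, for example `new SimpleStandardScaler().fit(data).transform(data)`.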

cryptodeal · 2022-12-10T06:06:36Z

At this point, I think there's something I'm missing in the implementation that causes the failure on Linux runs, where transformed.dtype does not match the dtype of the original Tensor.

Feedback on the implementation is welcome; I largely ported it directly from a Golang port of sklearn, as I'm more familiar with Go than Python.

cryptodeal · 2022-12-11T23:17:03Z

Currently, it's failing on GPU when comparing the dtype of the scaled output against the dtype of the pre-transform inputs. The actual values are scaled as expected, but the dtype doesn't match.
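The invariant the test is checking (output dtype equals input dtype) can be illustrated with plain JavaScript typed arrays. This is only a sketch of the property, not a diagnosis of the GPU failure and not Shumai code: the helper `scalePreservingType` and the `FloatArray` alias are hypothetical names introduced for the example.

```typescript
type FloatArray = Float32Array | Float64Array;

// Hypothetical helper: scale a typed array and return the same typed-array kind.
function scalePreservingType(x: FloatArray, mean: number, scale: number): FloatArray {
  // Allocate the output to match the input's element type, so a Float32Array
  // input yields a Float32Array output rather than silently widening.
  const out: FloatArray =
    x instanceof Float32Array ? new Float32Array(x.length) : new Float64Array(x.length);
  for (let i = 0; i < x.length; i++) {
    out[i] = (x[i] - mean) / scale;
  }
  return out;
}
```

A dtype check in this setting reduces to comparing constructors, for example `scalePreservingType(x, 0, 1).constructor === x.constructor`.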

cryptodeal added 2 commits December 4, 2022 18:14

add StandardScaler and accompanying tests

1112245

remove added/unused fn scale + add another test

069b4ce

facebook-github-bot added the CLA Signed label Dec 5, 2022

asilvas approved these changes Dec 5, 2022

shumai/util/preprocessing.ts (outdated, resolved)

shumai/util/preprocessing.ts (outdated, resolved)

cryptodeal marked this pull request as draft December 5, 2022 02:48

cryptodeal added 3 commits December 4, 2022 21:06

remove extra asContiguousTensor call; switch to interface

8146526

remove debug output to console

9d8ef57

fix implementation to feat parity w sklearn

82bdd3f

cryptodeal added 2 commits December 9, 2022 21:27

attempt fix for dtype mismatch in transformed output

a44c9d2

use Float16Array when init empty float for scaler

9d4d85f

cryptodeal marked this pull request as ready for review December 10, 2022 06:00

clean up implementation; GPU runs pending debug

58de406

cleanup implementation; debug test fails

a7e0358

cryptodeal added 2 commits December 12, 2022 23:20

Merge branch 'main' into StandardScaler

bb4bbe8

clean up post-merge; attempt coercing dtype; remove dtype check on 1s…

a590277

…t test
