Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[luci/pass] Bug in ExpandBroadcastConstPass #14554

Closed
dayo09 opened this issue Jan 14, 2025 · 6 comments
Closed

[luci/pass] Bug in ExpandBroadcastConstPass #14554

dayo09 opened this issue Jan 14, 2025 · 6 comments

Comments

@dayo09
Copy link
Contributor

dayo09 commented Jan 14, 2025

What?

I found a bug in ExpandBroadcastConstPass.

It's the passed applied by onecc optimize --expand_broadcast_const.
It found out weird behavior of the optimization pass, it broadcasts a const tensor to wrong dimension.

Example

Image

In the above example, broadcast dimension is 3. Therefore, the broadcasted value must be as below.

[[-0.9289.., -0.9289.., -0.9289.., -0.9289..],
[-0.2297.., -0.2297.., -0.2297.., -0.2297..], ...

Inference results mismatch before/after the optimization

Most clearly, I tried inference on the circle model before/after the optimization.

And the result inference values failed to match.

import circle_interpreter
import numpy as np

intp = circle_interpreter.Interpreter("AddBroadcast.circle")
in_ = np.random.randn(3, 4, 4)
in_ = np.array(in_, dtype=np.float32)
out_ = intp.infer([in_])
print(out_)


intp2 = circle_interpreter.Interpreter("AddBroadcast.opt.circle")
out2_ = intp2.infer([in_])
print(out2_)


print(np.allclose(out_, out2_, atol=1e-3, rtol=1e-3)) # FAIL!
print(np.max(np.abs(out_[0] - out2_[0]))) # 5.7

Broadcasting in pytorch and tensorflow - checked!

To make sure this, I checked on broadcast result of same dimensions from pytorch and tensorflow.

Pytorch
import torch

A = torch.zeros(3, 4, 4)
B = torch.tensor([[[1], [2], [3], [4]]] * 3)

print(B)
print(A + B)
tensor([[[1],
         [2],
         [3],
         [4]],

        [[1],
         [2],
         [3],
         [4]],

        [[1],
         [2],
         [3],
         [4]]])
tensor([[[1., 1., 1., 1.],
         [2., 2., 2., 2.],
         [3., 3., 3., 3.],
         [4., 4., 4., 4.]],

        [[1., 1., 1., 1.],
         [2., 2., 2., 2.],
         [3., 3., 3., 3.],
         [4., 4., 4., 4.]],

        [[1., 1., 1., 1.],
         [2., 2., 2., 2.],
         [3., 3., 3., 3.],
         [4., 4., 4., 4.]]])
Tensorflow ```py import tensorflow as tf

x = tf.constant([[1., 2., 3.], [4., 5., 6.], [7., 8., 9.]])
y = tf.constant([[1.], [2.], [3.]])
y_ = tf.constant([[1., 1., 1.], [2., 2., 2.], [3., 3., 3.]])

subXY = x - y
subXY2 = x - y_

print(subXY)
print(subXY2) # 같다.


</details>
@dayo09
Copy link
Contributor Author

dayo09 commented Jan 14, 2025

Tested Locally

By running #14555, inference value matches

However, it seems value testing pass does not exist and it is required..
(Maybe I just failed to find one..?)
It looked into dredd test, but it doesn't provide value comparison functionality.

import circle_interpreter
import numpy as np

intp = circle_interpreter.Interpreter("AddBroadcast.circle")
in_ = np.random.randn(3, 4, 4)
in_ = np.array(in_, dtype=np.float32)
out_ = intp.infer([in_])
print(out_)


intp2 = circle_interpreter.Interpreter("AddBroadcast.opt.v2.circle") # Generated model from the PR
out2_ = intp2.infer([in_])
print(out2_)


print(np.allclose(out_, out2_, atol=1e-3, rtol=1e-3)) # True
print(np.max(np.abs(out_[0] - out2_[0]))) # 0.0

@jinevening

This comment has been minimized.

@seanshpark
Copy link
Contributor

@dayo09 , please prepare a draft with pass changes and below items for validation

  • a recipe to generate problematic model
  • add to luci-pass-value-py-test (@jinevening , new one has py)

@seanshpark
Copy link
Contributor

@dayo09 one more thing, as it seems calculation branches with number of ranks,
it would be better to prepare several recipes with different ranks.

@dayo09
Copy link
Contributor Author

dayo09 commented Jan 16, 2025

@seanshpark Thanks for the suggestion. I have added recipes with 2, 3, 4 ranks and added them to luci-pass-value-py-test.

Please take a look at PR. If you want them to be split into several PRs, let me know.

@dayo09
Copy link
Contributor Author

dayo09 commented Jan 17, 2025

Done ;-D

@dayo09 dayo09 closed this as completed Jan 17, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants