
PyO3: Always broadcast +, -, /, * #1101

Merged

Conversation

LLukas22
Contributor

Always broadcast magic methods, similarly to PyTorch.

A question I still have: does it have any drawbacks to always use the broadcasting variants of the operations here? Would it be faster to first check the shapes and then perform either the normal or the broadcasted version of the operation?
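For reference, a minimal sketch (not the exact code in this PR) of the idea: the Python magic method dispatches straight to the broadcasting variant. `PyTensor` here is a hypothetical newtype around candle's `Tensor`, used only for illustration.

```rust
// Minimal sketch, not the code from this PR: a PyO3-wrapped tensor whose
// __add__ always dispatches to broadcast_add. `PyTensor` is a hypothetical
// newtype around candle_core::Tensor for illustration purposes.
use candle_core::Tensor;
use pyo3::exceptions::PyValueError;
use pyo3::prelude::*;

#[pyclass(name = "Tensor")]
struct PyTensor(Tensor);

#[pymethods]
impl PyTensor {
    fn __add__(&self, rhs: &PyTensor) -> PyResult<PyTensor> {
        // broadcast_add behaves like a plain element-wise add when the shapes
        // already match, so no explicit shape check is needed up front.
        let out = self
            .0
            .broadcast_add(&rhs.0)
            .map_err(|e| PyValueError::new_err(e.to_string()))?;
        Ok(PyTensor(out))
    }
}
```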

@LaurentMazare
Collaborator

There should be no performance impact, as broadcast_add etc. should do the right thing. The main reason for candle to use non-broadcasting ops by default is that it's very easy to shoot yourself in the foot with broadcasting, so we require broadcasting to be explicit. I'm pretty convinced this is the right call for the Rust API: even if we try to mimic the PyTorch API in Rust, one can still use broadcast_add etc. to opt in to the broadcasting behavior.
I'm a bit torn when it comes to the Python API. On one hand it would be nice to be as close as possible to the PyTorch version, but on the other hand this doesn't seem like a sane default to me, and sticking to non-broadcasting by default certainly has its upsides.
Any thoughts?
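To illustrate the distinction on the Rust side, here is a small sketch (assuming candle_core as the dependency; shapes picked arbitrarily): plain add requires matching shapes, while broadcast_add is the explicit opt-in.

```rust
// Sketch of the explicit opt-in in candle's Rust API (shapes chosen for illustration).
use candle_core::{DType, Device, Result, Tensor};

fn main() -> Result<()> {
    let dev = Device::Cpu;
    let a = Tensor::ones((2, 3), DType::F32, &dev)?;
    let b = Tensor::ones((1, 3), DType::F32, &dev)?;

    // Plain `add` expects identical shapes, so this returns a shape error.
    assert!(a.add(&b).is_err());

    // `broadcast_add` is the explicit opt-in; `b` is broadcast along dim 0.
    let c = a.broadcast_add(&b)?;
    assert_eq!(c.dims(), &[2, 3]);
    Ok(())
}
```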

@LLukas22
Contributor Author

I'm 100% on your side regarding the Rust API: not broadcasting should be the default behavior. Regarding the Python wrapper I'm also a bit torn. The problem is that PyTorch does a lot of things implicitly to make the API more "pythonic", while candle does nearly everything explicitly since it lives in the Rust ecosystem. This puts us in a bit of a lose-lose situation. We basically have two options:

  1. Design the Python wrapper implicitly and close to PyTorch. That way we have minimal friction between the two ecosystems, and it should be easier for PyTorch users to bring their code with them and start hacking around in candle. The drawback is that, similarly to PyTorch, we lose some of the explicit behavior, but that's a problem Python has always had.
  2. Keep the wrapper close to candle's API, meaning we want most things to be explicit. This results in bigger differences and more friction between the ecosystems, as we basically need some sort of "migration guide". The advantage is that we can focus more on the safety and explicit behavior we get from this approach, which would be better for production systems.

In my opinion option 1 would be the way to go, as I see the Python wrapper as something I can play with and try things out in before writing them up in Rust. I'm not expecting it to be safe or explicit; I just want to hack stuff together, and if it explodes in my face, so be it. Regarding option 2, if we want to go that way we can keep the wrapper extremely simple, as we can discard most of the implicit features. But if we already need a "migration guide" to port models from PyTorch to our wrapper, it would probably be advisable to point users directly towards the Rust implementation, because it does type checking, safety, and explicit behavior far better than Python ever could.

I basically like the implicit way more, as it's a better fit for Python and you just have less to think about when using it, but I definitely wouldn't use it for a production system.
What are your thoughts on this?

@LaurentMazare
Collaborator

Ok, let's give it a try and we'll see how it goes. Maybe the best would be to have something a bit more like jax, i.e. one Python API that is very low level and matches the c++ side, and one higher-level Python API that tries to be Python-idiomatic, but that seems like a bit too much effort at this point.

@LaurentMazare LaurentMazare merged commit b355ab4 into huggingface:main Oct 17, 2023
10 of 12 checks passed
EricLBuehler pushed a commit to EricLBuehler/candle that referenced this pull request Oct 25, 2023