Update apache-tvm to v0.15.0 #709

Merged
merged 16 commits into from
Feb 13, 2024

Conversation

mshr-h
Collaborator

@mshr-h mshr-h commented Jun 1, 2023

TVM doesn't provide a stable binary package for Python 3.11, so we'll build it from source in the CI pipeline.
Also, we'll update TVM to the latest stable release.
Related to #643

TODO

  • run pipeline again after the TVM v0.15.0 release
  • cleanup pipeline

@mshr-h mshr-h force-pushed the tvm-0.12.0 branch 4 times, most recently from 2309e06 to 77bc11e Compare June 2, 2023 03:26
@mshr-h
Collaborator Author

mshr-h commented Jun 2, 2023

TVM v0.12.0's PyTorch frontend fails to import HB's PyTorch model in v1.12.0.
But it works with v0.11.1.

@mshr-h
Collaborator Author

mshr-h commented Jun 2, 2023

I ran git bisect and found that the commit below introduced the problem.
[Relay][Frontend] Span Filling PyTorch (#14050) · apache/tvm@e9cf04e
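
A bisect like this can be automated with `git bisect run`, which reads the exit code of a script at each step: 0 marks the commit good, 125 skips it, and any other code from 1 to 127 marks it bad. The following is only a minimal sketch of such a script. The function name `bisect_verdict` and the two command arguments are hypothetical, and the conversation does not show which commands were actually used.

```python
import subprocess
import sys

# Exit codes understood by `git bisect run`.
GOOD = 0    # the commit is good
BAD = 1     # any code in 1..127 except 125 marks the commit bad
SKIP = 125  # the commit cannot be tested (for example, the build failed)


def bisect_verdict(build_cmd, test_cmd):
    """Return the exit code `git bisect run` should see for the current checkout."""
    # A failed build says nothing about the bug, so ask git to skip the commit.
    if subprocess.run(build_cmd).returncode != 0:
        return SKIP
    # Otherwise the reproduction command decides between good and bad.
    return GOOD if subprocess.run(test_cmd).returncode == 0 else BAD


# Hypothetical placeholder commands; replace with the real build and reproduction steps.
build = [sys.executable, "-c", "pass"]
repro = [sys.executable, "-c", "pass"]
print(bisect_verdict(build, repro))
```

A real script would end with `sys.exit(bisect_verdict(build, repro))` so that git can read the verdict.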

@mshr-h
Collaborator Author

mshr-h commented Nov 20, 2023

Seems like there's a bug in the PyTorch frontend. Opened a new issue: apache/tvm#16150.

@mshr-h mshr-h force-pushed the tvm-0.12.0 branch 4 times, most recently from 5a74e85 to 0204661 Compare December 2, 2023 04:43
@ksaur ksaur mentioned this pull request Jan 5, 2024
5 tasks
@mshr-h mshr-h changed the title Update apache-tvm to 0.12.0 Update apache-tvm to 0.15 Jan 9, 2024
@codecov-commenter

codecov-commenter commented Jan 9, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (1aed136) 90.11% compared to head (c51e005) 87.92%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #709      +/-   ##
==========================================
- Coverage   90.11%   87.92%   -2.20%     
==========================================
  Files          80       80              
  Lines        4685     4687       +2     
  Branches      857      857              
==========================================
- Hits         4222     4121     -101     
- Misses        266      368     +102     
- Partials      197      198       +1     
Flag Coverage Δ
unittests 87.92% <100.00%> (-2.20%) ⬇️


@ksaur
Contributor

ksaur commented Jan 9, 2024

Was that last failure transient, or did something new break unrelated to TVM? Please let me know if you are blocked on something and I can try to help!

@mshr-h mshr-h force-pushed the tvm-0.12.0 branch 3 times, most recently from dc15acd to 1cf67c9 Compare January 10, 2024 02:22
@mshr-h
Collaborator Author

mshr-h commented Jan 10, 2024

The CI test has passed but I don't know why...

@mshr-h
Collaborator Author

mshr-h commented Jan 10, 2024

We'll wait for TVM v0.15.0, which is scheduled for release on 22 Jan 2024.
apache/tvm#16277
After the release, we'll run the CI again.

@mshr-h mshr-h changed the title Update apache-tvm to 0.15 Update apache-tvm to v0.15.0 Jan 29, 2024
@mshr-h
Collaborator Author

mshr-h commented Jan 29, 2024

CI is failing again.
When TVM was installed from source, the tests passed. But when it was restored from the cache, the test_extra_trees_tvm_tree_trav_regressor_converter test failed.

Ok log: https://github.com/microsoft/hummingbird/actions/runs/7692350896/job/20959123441
Failed log: https://github.com/microsoft/hummingbird/actions/runs/7693930724/job/20963638667

@mshr-h
Collaborator Author

mshr-h commented Jan 30, 2024

Seems like tests are a bit flaky when TVM is installed from the cache.

Only failed on ubuntu+python3.9: https://github.com/microsoft/hummingbird/actions/runs/7710611530/job/21014292656
Only failed on ubuntu+python3.8: https://github.com/microsoft/hummingbird/actions/runs/7711820622/job/21017938809

@ksaur @interesaaat
Do you have any ideas on this problem?

@ksaur
Contributor

ksaur commented Jan 30, 2024

> Seems like tests are a bit flaky when TVM is installed from the cache.
>
> Only failed on ubuntu+python3.9: https://github.com/microsoft/hummingbird/actions/runs/7710611530/job/21014292656
> Only failed on ubuntu+python3.8: https://github.com/microsoft/hummingbird/actions/runs/7711820622/job/21017938809
>
> @ksaur @interesaaat Do you have any ideas on this problem?

Hmm, the log shows ": line 1: 2222 Aborted (core dumped) pytest". Since it's transient ("a bit flaky"), I wonder if it is a memory issue. Have you ever seen it on your local machine, or could you get the core dump? And does it only happen when TVM is restored from the cache?
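
When a core dump is hard to retrieve from a CI runner, Python's standard `faulthandler` module can at least print the Python-level traceback at the moment the process receives SIGABRT. The sketch below is only an illustration on a POSIX system: `os.abort()` stands in for a native library aborting the interpreter, and the helper name `run_with_faulthandler` is hypothetical. Nothing in the conversation says this approach was used.

```python
import signal
import subprocess
import sys


def run_with_faulthandler(code):
    """Run `code` in a child interpreter with faulthandler enabled."""
    return subprocess.run(
        [sys.executable, "-X", "faulthandler", "-c", code],
        capture_output=True,
        text=True,
    )


# os.abort() raises SIGABRT, imitating an abort inside native code.
result = run_with_faulthandler("import os; os.abort()")

# On POSIX, a negative return code means the child was killed by that signal.
print("killed by SIGABRT:", result.returncode == -signal.SIGABRT)
# The traceback written by faulthandler ends up on stderr.
print(result.stderr)
```

The same effect is available for a whole test run by setting the `PYTHONFAULTHANDLER=1` environment variable before starting Python.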

@mshr-h
Collaborator Author

mshr-h commented Feb 9, 2024

Note for the memory issue investigation

  • When does it happen?
    • When running pytest with extra dependencies on Ubuntu with Python 3.8-3.11 on GitHub Actions.
    • Does not happen on macOS or Windows.
    • Sometimes it passes, sometimes it fails.
    • Haven't seen a similar issue on my local Ubuntu machine.
  • What does the error message look like?
  • What have I tried?
    • smaller estimators (n_estimators=5) for test_random_forest_tvm_tree_trav_classifier_converter.
    • disabled op fusion for TVM (TVM_MAX_FUSE_DEPTH: 0)
    • limited memory usage by systemd e.g. systemd-run --scope -p MemoryMax=8G --user pytest
    • linked LLVM statically when building TVM
    • updated PyTorch to 2.2.0
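
The systemd-run attempt in the list above caps memory from outside the process through a cgroup. A related but different experiment is to cap the virtual address space of a child process with `resource.setrlimit`, which makes oversized allocations fail deterministically instead of crashing intermittently. This is a sketch under assumptions, not what the PR did: it is Unix-only, `RLIMIT_AS` limits address space rather than resident memory, and the helper name `run_with_address_space_cap` and the sizes are hypothetical.

```python
import resource
import subprocess
import sys


def run_with_address_space_cap(cmd, max_bytes):
    """Run cmd in a child process whose virtual address space is capped."""

    def apply_limit():
        # Runs in the child just before exec; never go above the existing hard limit.
        _, hard = resource.getrlimit(resource.RLIMIT_AS)
        cap = max_bytes if hard == resource.RLIM_INFINITY else min(max_bytes, hard)
        resource.setrlimit(resource.RLIMIT_AS, (cap, hard))

    return subprocess.run(cmd, preexec_fn=apply_limit, capture_output=True, text=True)


# Hypothetical demo: a 1 GiB cap and a child that tries to allocate 8 GiB.
one_gib = 1024 ** 3
greedy = [sys.executable, "-c", "bytearray(8 * 1024 ** 3)"]
result = run_with_address_space_cap(greedy, one_gib)
print("return code:", result.returncode)
print(result.stderr)
```

If a test only fails under such a cap, that would support the memory hypothesis; if it still aborts at random, the cause probably lies elsewhere.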

@ksaur
Contributor

ksaur commented Feb 12, 2024

Hi @mshr-h thanks for all your work on this!! (Especially thank you for the nice summary of the memory issue.) Is the PR still a WIP? I like the is_on_github_actions :) Please let us know if you get stuck or want us to take a look!!
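
The implementation of is_on_github_actions is not shown in this conversation. One plausible sketch, assuming it relies on the `GITHUB_ACTIONS` environment variable that GitHub Actions sets to `true` on its runners, is below. The optional `environ` parameter is a hypothetical addition that makes the function easy to test.

```python
import os


def is_on_github_actions(environ=None):
    """Return True when running inside a GitHub Actions job.

    Assumption: detection is based on GITHUB_ACTIONS == "true", which
    GitHub Actions sets for every workflow run.
    """
    env = os.environ if environ is None else environ
    return env.get("GITHUB_ACTIONS") == "true"


print(is_on_github_actions({"GITHUB_ACTIONS": "true"}))
print(is_on_github_actions({}))
```

A helper like this could guard a flaky test with a skip condition such as `unittest.skipIf(is_on_github_actions(), "flaky on CI")`, so the test still runs locally.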

@mshr-h mshr-h marked this pull request as ready for review February 13, 2024 07:37
@mshr-h
Collaborator Author

mshr-h commented Feb 13, 2024

Should be good.

@ksaur @interesaaat
Thank you for your support! Can you review it?

@ksaur ksaur enabled auto-merge (squash) February 13, 2024 18:23
@ksaur
Contributor

ksaur commented Feb 13, 2024

This looks good to me!! @interesaaat take a look and merge if ready

Contributor

@ksaur ksaur left a comment

Thank you so much for all your hard work! 🎊

@ksaur ksaur merged commit c28c06e into microsoft:main Feb 13, 2024
14 checks passed
@mshr-h mshr-h deleted the tvm-0.12.0 branch February 14, 2024 10:47