OpamDownload assertion failure is causing opam-repo-ci builds to fail on arm32-ocaml-4.14 #5971

Open
shonfeder opened this issue May 23, 2024 · 7 comments · May be fixed by #5979

@shonfeder
Contributor

First noticed (afaik) at ocaml/opam-repository#25905 (comment)

The error we're seeing in CI is

/home/opam: (run (network host)
                 (shell "opam init --reinit --config .opamrc-sandbox -ni"))
Fatal error:
File "src/repository/opamDownload.ml", line 140, characters 2-8: Assertion failed
"/usr/bin/linux32" "/bin/sh" "-c" "opam init --reinit --config .opamrc-sandbox -ni" failed with exit status 99

which can be seen in, e.g., this CI log

The failing assertion is at

assert (url.OpamUrl.backend = `http);

@shonfeder shonfeder changed the title OpamDownload assertion failure is causing failures opam-repo-ci on arm32-ocaml-4.14 OpamDownload assertion failure is causing opam-repo-ci builds to fail on arm32-ocaml-4.14 May 23, 2024
@kit-ty-kate
Member

Is it reproducible, or does it only happen from time to time?

@dbuenzli
Contributor

FWIW it also happened on the cmdliner release here.

@shonfeder
Contributor Author

It's reproducible. E.g., every Jane Street package looks to be suffering the same fate currently: https://opam.ci.ocaml.org/github/ocaml/opam-repository/commit/b0fb4f8c144e4e78cd6de1972fc3453a2024d8a8

@rjbou
Collaborator

rjbou commented May 24, 2024

It seems to happen only on arm32 & freebsd images.
If it happens at the repository reloading stage, it shouldn't go through that code, since in the image the repository is defined as a directory (file:///home/opam/opam-repository).
Is it possible to extract a backtrace and some logs (-vv | --debug)?
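
One way to collect what is asked for here, as a rough sketch only: re-run the failing step with `-vv` and `--debug`, and set `OCAMLRUNPARAM=b` so the OCaml runtime prints a backtrace when an exception escapes. The function name `run_opam_init_debug` and the default log file name are hypothetical, chosen for illustration; the opam command line is the one from the CI log above.

```shell
# Hypothetical helper: re-run the failing CI step with extra diagnostics.
# OCAMLRUNPARAM=b enables backtraces for uncaught OCaml exceptions.
# -vv and --debug are the flags requested in the comment above.
run_opam_init_debug() {
  # Optional first argument: where to save the log (illustrative default).
  log_file="${1:-opam-init-debug.log}"
  OCAMLRUNPARAM=b opam init --reinit --config .opamrc-sandbox -ni -vv --debug 2>&1 \
    | tee "$log_file"
}
```

Calling `run_opam_init_debug` inside the affected image would print the output and keep a copy in `opam-init-debug.log` for attaching to the issue.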

@shonfeder
Contributor Author

I'll see about reproducing this next week. I also realized I didn't take container caching into account when I claimed it is reproducible, and all of the CI jobs I've looked at so far are pulling that step from the cache.

@kit-ty-kate
Member

Trying to debug this without access to those machines has so far not produced any results. I've opened #5975 to at least show a more decent error message, which would help debug this further. My instinct tells me it is due to a file that is somehow removed on those arm machines, but I'm still baffled as to why only arm (arm32 and arm64) machines are affected.

@kit-ty-kate
Member

The failure came from the fact that the image got broken somewhere and the $HOME directory was no longer readable, writeable or owned by the proper user.

The error message should be fixed though. I'm planning to open a more lightweight version of #5975 very soon to catch that sooner and display a better error message. I've removed this issue from the 2.2 board as it is no longer urgent.
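
For anyone who wants to check an image for the `$HOME` condition described above, here is a minimal sketch. The function name `check_home` is hypothetical, and this is not the check opam itself performs. It relies on the `-O` test operator, which bash and dash support but which is not part of POSIX; when run as root, the readable and writable tests will pass regardless of the directory's mode.

```shell
# Hypothetical helper: report whether a directory (default: $HOME) is
# readable, writable, and owned by the current effective user.
check_home() {
  dir="${1:-$HOME}"
  status=0
  if [ ! -d "$dir" ]; then
    echo "$dir: not a directory"
    return 1
  fi
  [ -r "$dir" ] || { echo "$dir: not readable"; status=1; }
  [ -w "$dir" ] || { echo "$dir: not writable"; status=1; }
  # -O: the file exists and is owned by the effective user ID.
  [ -O "$dir" ] || { echo "$dir: not owned by $(id -un)"; status=1; }
  if [ "$status" -eq 0 ]; then
    echo "$dir: ok"
  fi
  return "$status"
}
```

Running `check_home /home/opam` as the `opam` user in the image, before `opam init`, would flag a broken home directory without going through opam at all.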

4 participants