PR#21778 - LLVM 14 & LLD based macOS toolchain 👀

A PR by Fanquake to replace the current very “non-standard” toolchain for producing macOS release binaries with a more standard toolchain, something that recently became possible. Participants discussed the current and future approach on macOS cross-compiling and potential changes on the Guix (release) build process.

The current macOS toolchain is pretty homebrew. It's constructed from pre-compiled binaries, 3rd-party sources, we compile our own llinker (ld64), and mush it all together to do macOS builds.
LLVM is "a collection of modular and reusable compiler and toolchain technologies". It's the umbrella project for clang (compiler) and lld (linker) as well as a number of other tools and libraries, including the *SAN libraries that we use in our sanitizer, CIs etc.
- llvm is sort of like what java byte code is to the jvm, but it's supposed to be a general purpose target for many different languages.

Questions

Can you think of a reason why we might want to continue using our current macOS toolchain, rather than switch? 🔗

Our current toolchain works.
The -bind_at_load, which is a linker flag we currently use, is not yet supported by lld. Although it's unclear if that will actually be an issue, as the reason for setting the flag may no-longer be relevant when building "modern" macOS binaries.
- By "modern" we mean targetting a more recent minimum version of macOS. As if you know that your binaries are only running on more recent versions, you can assumed certain behaviour out of the dynamic linker, which might obsolete the thing that passing the flag would achieve.
  - We can enforce that the binaries are only run on recent versions at compile time, by passing minimum version flags to the linker (and sanity check those versions in our symbol check scripts).
We care about who is maintaining the tools we are (sometimes blindly) using. There are concers with with the llvm/clang development process where basically anyone can commit to the tree and they rely on some kind of revert process if something goes afoul.
- This is also the same reason to move away from our current setup as there are far more eyes over the LLVM repos, as opposed to https://github.com/tpoechtrager/cctools-port.

Did you try performing a macOS cross compile? Did it work? If not, what problems did you run into? 🔗

[nobody]

Do these changes effect our Guix (release) build process? 🔗

If so, how? (hint: look for usage of `FORCE_USE_SYSTEM_CLANG`)

Yes, they do. Guix uses it's "own" clang (comes with the guix package), from the clang-toolchain package and then use that clang for the build. But even though we use a Guix installed Clang for the Guix build, we still use the rest of the macOS toolchain ld64, cctools etc. from depends.
That would continue to be the case going forward, even after these changes, however as mentioned in the next question, there is a Guix related change missing from this PR.

Is there a Guix build change you’d expect to see, which is missing from the PR?

The Guix related change that the author believes missing from this PR, is a similar migration to using Clang 14 for the Guix build at the same time we swapped over to using LLVM/Clang 14 in depends. i.e installing clang-toolchain-14 over clang-toolchain-10 as we currently do, to keep the clang versions in sync, so that guix builds would be using clang 14, similar to users cross-compilling on linux would be.
If the version change does not happen together, we'll have Guix building with Clang 10, and the CI, or developers doing cross-compiles would be using 14.

[extra question] Why we don't just install and use everything that Guix provides when performing the Guix build? 🔗

The reason is that we need to maintain a macOS toolchain, that works outside of Guix, as Guix does not run everywhere, and there shouldn't be an expectation that you would need to use it to cross-compile.
The depends system must remain generic, and useable as widely as possible and be able to compile all binaries, hosts etc. that we produce in release builds. We should not relying solely on Guix (supporting only Guix builds) to be able to compile bitcoind for certain OS's as we cannot tell people that they need to install and use Guix if they want to compile Bitcoin Core. Especially given that Guix doesn't work "everywhere", either hardware, or OS wise.
Also using Guix in CI is questionable as it's very resource intensive.

In `native_llvm`’s preprocessing step we `rm -rf lib/libc++abi.so*` 🔗

Why do/did we do this? (remember we target a macOS SDK when building)

The rm was originally added in PR#8210, to remove any LLVM C++ ABI related objects, given at that point we were copying more files out of the lib dirs in the clang tarball. Possibly just a belt-and-suspenders thing.
if one makes build for native_llvm_fetched target, it becomes obvious that rm does nothing.

Is it actually necessary? (double-check what is ultimately copied from the tarball).

Likely irrelevant since PR#19240 (commit), where we stopped copying any c++ libs from the clang tarball.
The code in its current state is pointless / broken for 2 reasons:
- The only things we copy out of lib/ are libLTO.so and headers from lib/clang/clang-version/include, so deleting .so files from /lib in advance of that, doesn't achieve anything.
- The libc++abi.so* objects have actually changed location inside the lib/ dir, so even if we kept the current code, it wouldn't actually remove anything anyways

In `native_llvm.mk`, we copy a number of tools (i.e `llvm-*`, not `clang` or `lld`) out of the tarball 🔗

What is one of them (out of those tools copied) used for?

[no answer]

bonus: If we rename that tool when copying it, why might we do that?

The reason we rename lld to ld, is to make things "simpler" for the build system, as most build systems, especially those using autotools, look for, and expect to use a linker called ld.
- ld is sort of a generic name for a linker whereas lld is LLVM's specific ld-compatible linker.
Given we are in full control of our build system here, we can just rename lld, and have it "pretend" to be ld, for the sake of making everything work, and the build systems expecting GNU ld should mostly be none-the-wiser.
The other reason we might rename tools to have the $(host)- is discussed here somewhat. When cross-compiling, autotools generally looks for native (build) tools that have the target arch in the name. i.e x86_64-apple-darwin-strip. So renaming some of the tools is also a bit of a convenience for autotools, and can prevent warning output like: "configure: WARNING: using cross tools not prefixed with host triplet".
A couple other tools, and why we might rename them:
- llvm-install-name-tool -> install_name_tool as that is its "usual" name, and what other tools / build systems will look for / expect.
- Same for llvm-libtool-darwin -> libtool as build systems / autotools expect libtool, not libtool-darwin.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

2022-03-23-#21778.md

2022-03-23-#21778.md

PR#21778 - LLVM 14 & LLD based macOS toolchain 👀

Questions

Can you think of a reason why we might want to continue using our current macOS toolchain, rather than switch? 🔗

Did you try performing a macOS cross compile? Did it work? If not, what problems did you run into? 🔗

Do these changes effect our Guix (release) build process? 🔗

If so, how? (hint: look for usage of `FORCE_USE_SYSTEM_CLANG`)

Is there a Guix build change you’d expect to see, which is missing from the PR?

[extra question] Why we don't just install and use everything that Guix provides when performing the Guix build? 🔗

In `native_llvm`’s preprocessing step we `rm -rf lib/libc++abi.so*` 🔗

Why do/did we do this? (remember we target a macOS SDK when building)

Is it actually necessary? (double-check what is ultimately copied from the tarball).

In `native_llvm.mk`, we copy a number of tools (i.e `llvm-*`, not `clang` or `lld`) out of the tarball 🔗

What is one of them (out of those tools copied) used for?

bonus: If we rename that tool when copying it, why might we do that?

Files

2022-03-23-#21778.md

Latest commit

History

2022-03-23-#21778.md

File metadata and controls

PR#21778 - LLVM 14 & LLD based macOS toolchain 👀

Questions

Can you think of a reason why we might want to continue using our current macOS toolchain, rather than switch? 🔗

Did you try performing a macOS cross compile? Did it work? If not, what problems did you run into? 🔗

Do these changes effect our Guix (release) build process? 🔗

If so, how? (hint: look for usage of FORCE_USE_SYSTEM_CLANG)

Is there a Guix build change you’d expect to see, which is missing from the PR?

[extra question] Why we don't just install and use everything that Guix provides when performing the Guix build? 🔗

In native_llvm’s preprocessing step we rm -rf lib/libc++abi.so* 🔗

Why do/did we do this? (remember we target a macOS SDK when building)

Is it actually necessary? (double-check what is ultimately copied from the tarball).

In native_llvm.mk, we copy a number of tools (i.e llvm-*, not clang or lld) out of the tarball 🔗

What is one of them (out of those tools copied) used for?

bonus: If we rename that tool when copying it, why might we do that?

If so, how? (hint: look for usage of `FORCE_USE_SYSTEM_CLANG`)

In `native_llvm`’s preprocessing step we `rm -rf lib/libc++abi.so*` 🔗

In `native_llvm.mk`, we copy a number of tools (i.e `llvm-*`, not `clang` or `lld`) out of the tarball 🔗