PR#21778 - LLVM 14 & LLD based macOS toolchain 👀
A PR by Fanquake to replace the current, very "non-standard" toolchain for producing macOS release binaries with a more standard one, something that only recently became possible. Participants discussed the current and future approach to macOS cross-compiling and potential changes to the Guix (release) build process.
- The current macOS toolchain is pretty homebrew. It's constructed from pre-compiled binaries and 3rd-party sources, we compile our own linker (ld64), and we mush it all together to do macOS builds.
- LLVM is "a collection of modular and reusable compiler and toolchain technologies". It's the umbrella project for `clang` (compiler) and `lld` (linker), as well as a number of other tools and libraries, including the *SAN libraries that we use in our sanitizer CIs, etc.
  - LLVM is sort of like what Java bytecode is to the JVM, but it's supposed to be a general-purpose target for many different languages.
Can you think of a reason why we might want to continue using our current macOS toolchain, rather than switch?
- Our current toolchain works.
- `-bind_at_load`, a linker flag we currently use, is not yet supported by `lld`. It's unclear whether that will actually be an issue, though, as the reason for setting the flag may no longer be relevant when building "modern" macOS binaries.
  - By "modern" we mean targeting a more recent minimum version of macOS. If you know that your binaries only run on more recent versions, you can assume certain behaviour from the dynamic linker, which might make the effect of passing the flag obsolete.
    - We can enforce at compile time that the binaries only run on recent versions, by passing minimum version flags to the linker (and sanity checking those versions in our symbol check scripts).
- We care about who is maintaining the tools we are (sometimes blindly) using. There are concerns with the llvm/clang development process, where basically anyone can commit to the tree and they rely on some kind of revert process if something goes wrong.
- This is also a reason to move away from our current setup, as there are far more eyes on the LLVM repos than on https://github.com/tpoechtrager/cctools-port.
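The minimum-version sanity check mentioned above can be sketched roughly as follows. This is only an illustration, not the project's actual symbol check script, which may well do this differently: it reads the `LC_BUILD_VERSION` (or older `LC_VERSION_MIN_MACOSX`) load command from a thin 64-bit Mach-O file and compares the declared minimum OS version against an expected value. The function names and the expected version `(11, 0, 0)` are hypothetical placeholders.

```python
import struct

MH_MAGIC_64 = 0xFEEDFACF        # magic of a thin 64-bit little-endian Mach-O
LC_VERSION_MIN_MACOSX = 0x24    # older load command carrying the minimum OS version
LC_BUILD_VERSION = 0x32         # newer load command carrying platform + minimum OS version


def decode_version(packed):
    """Decode Mach-O's packed X.Y.Z: 16 bits major, 8 bits minor, 8 bits patch."""
    return (packed >> 16, (packed >> 8) & 0xFF, packed & 0xFF)


def macho_min_os(data):
    """Return the declared minimum OS version as (major, minor, patch), or None."""
    magic, _cpu, _sub, _ftype, ncmds, _size, _flags, _res = struct.unpack_from("<8I", data, 0)
    if magic != MH_MAGIC_64:
        raise ValueError("not a thin 64-bit little-endian Mach-O file")
    offset = 32  # sizeof(mach_header_64)
    for _ in range(ncmds):
        cmd, cmdsize = struct.unpack_from("<2I", data, offset)
        if cmdsize < 8:
            raise ValueError("malformed load command")
        if cmd == LC_BUILD_VERSION:
            # Layout: cmd, cmdsize, platform, minos, sdk, ntools
            _platform, minos = struct.unpack_from("<2I", data, offset + 8)
            return decode_version(minos)
        if cmd == LC_VERSION_MIN_MACOSX:
            # Layout: cmd, cmdsize, version, sdk
            (version,) = struct.unpack_from("<I", data, offset + 8)
            return decode_version(version)
        offset += cmdsize
    return None


def check_min_os(data, expected=(11, 0, 0)):
    """True if the binary declares exactly the expected minimum OS version.

    The default is a hypothetical placeholder, not the project's real setting.
    """
    return macho_min_os(data) == expected
```

A release check could call `check_min_os(open(path, "rb").read())` on each produced binary and fail the build on a mismatch. Universal ("fat") binaries would need their slices unpacked first, which this sketch does not handle.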
Did you try performing a macOS cross compile? Did it work? If not, what problems did you run into?
[nobody]
Do these changes affect our Guix (release) build process?
- Yes, they do. Guix uses its "own" Clang, which comes from the Guix clang-toolchain package, for the build. But even though we use a Guix-installed Clang for the Guix build, we still use the rest of the macOS toolchain (ld64, cctools, etc.) from depends.
- That would continue to be the case going forward, even after these changes. However, as mentioned in the next question, there is a Guix-related change missing from this PR.
- The Guix-related change that the author believes is missing from this PR is a matching migration to Clang 14 for the Guix build, made at the same time as we swap over to LLVM/Clang 14 in depends, i.e. installing `clang-toolchain-14` instead of `clang-toolchain-10` (which we currently install). That keeps the Clang versions in sync, so that Guix builds would use Clang 14, just as users cross-compiling on Linux would.
  - If the version changes do not happen together, we'll have Guix building with Clang 10, while the CI, or developers doing cross-compiles, would be using 14.
[extra question] Why don't we just install and use everything that Guix provides when performing the Guix build?
- The reason is that we need to maintain a macOS toolchain that works outside of Guix, as Guix does not run everywhere, and there shouldn't be an expectation that you need to use it to cross-compile.
- The depends system must remain generic and usable as widely as possible, and be able to compile all binaries, hosts, etc. that we produce in release builds. We should not rely solely on Guix (supporting only Guix builds) to be able to compile `bitcoind` for certain OSes, as we cannot tell people that they need to install and use Guix if they want to compile Bitcoin Core, especially given that Guix doesn't work "everywhere", either hardware or OS wise.
  - Also, using Guix in CI is questionable, as it's very resource intensive.
In `native_llvm`'s preprocessing step we `rm -rf lib/libc++abi.so*`
- The `rm` was originally added in PR#8210, to remove any LLVM C++ ABI related objects, given that at that point we were copying more files out of the lib dirs in the clang tarball. Possibly just a belt-and-suspenders thing.
  - If one builds the `native_llvm_fetched` target, it becomes obvious that the `rm` does nothing.
- Likely irrelevant since PR#19240 (commit), where we stopped copying any C++ libs from the clang tarball.
- The code in its current state is pointless / broken for 2 reasons:
  - The only things we copy out of lib/ are `libLTO.so` and headers from lib/clang/clang-version/include, so deleting .so files from lib/ in advance of that doesn't achieve anything.
  - The `libc++abi.so*` objects have actually changed location inside the lib/ dir, so even if we kept the current code, it wouldn't actually remove anything anyway.
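The second reason above can be illustrated with a small simulation. It builds a throwaway directory tree and shows that a glob anchored directly at `lib/` matches nothing once the ABI objects live in a subdirectory. The subdirectory name `some-target-subdir` and the file name `libc++abi.so.1` are made up for the example; the notes do not say where the objects actually moved to.

```python
import glob
import os
import tempfile


def simulate_preprocessing():
    """Return (paths the rm glob would match, paths the copy step would take)."""
    with tempfile.TemporaryDirectory() as root:
        # Hypothetical layout: ABI objects sit below lib/, not directly in it.
        nested = os.path.join(root, "lib", "some-target-subdir")
        os.makedirs(nested)
        open(os.path.join(nested, "libc++abi.so.1"), "w").close()
        open(os.path.join(root, "lib", "libLTO.so"), "w").close()

        # Equivalent of the shell glob in `rm -rf lib/libc++abi.so*`.
        to_remove = glob.glob(os.path.join(root, "lib", "libc++abi.so*"))
        # The only shared object the copy step is described as taking.
        to_copy = glob.glob(os.path.join(root, "lib", "libLTO.so"))

        rel = lambda paths: sorted(os.path.relpath(p, root) for p in paths)
        return rel(to_remove), rel(to_copy)
```

Calling `simulate_preprocessing()` yields an empty first list: the glob does not descend into subdirectories, so the `rm` has nothing to act on, while the `libLTO.so` copy is unaffected either way.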
In `native_llvm.mk`, we copy a number of tools (i.e. `llvm-*`, not `clang` or `lld`) out of the tarball
- [no answer]
- The reason we rename lld to ld is to make things "simpler" for the build system, as most build systems, especially those using autotools, look for, and expect to use, a linker called ld.
  - `ld` is sort of a generic name for a linker, whereas `lld` is LLVM's specific ld-compatible linker.
- Given we are in full control of our build system here, we can just rename lld, and have it "pretend" to be ld, for the sake of making everything work, and the build systems expecting GNU ld should mostly be none-the-wiser.
- The other reason we might rename tools to have the `$(host)-` prefix is discussed here somewhat. When cross-compiling, autotools generally looks for native (build) tools that have the target arch in the name, e.g. `x86_64-apple-darwin-strip`. So renaming some of the tools is also a bit of a convenience for autotools, and can prevent warning output like: "configure: WARNING: using cross tools not prefixed with host triplet".
- A couple of other tools, and why we might rename them:
  - `llvm-install-name-tool` -> `install_name_tool`, as that is its "usual" name, and what other tools / build systems will look for / expect.
  - Same for `llvm-libtool-darwin` -> `libtool`, as build systems / autotools expect libtool, not libtool-darwin.
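The renaming scheme described above can be summarised in a few lines. This is a sketch of the naming idea only, not a transcription of `native_llvm.mk`: the function name `installed_name` is hypothetical, and it assumes every copied tool receives the host-triplet prefix, which the real makefile may not do for each one.

```python
# Special cases named in the discussion; any other llvm-* tool simply loses
# its "llvm-" prefix (e.g. llvm-strip -> strip).
RENAMES = {
    "lld": "ld",
    "llvm-install-name-tool": "install_name_tool",
    "llvm-libtool-darwin": "libtool",
}


def installed_name(tool, host):
    """Map a tool name from the LLVM tarball to the name a build system looks for.

    Assumption: every tool gets the "<host>-" prefix that autotools searches
    for when cross-compiling.
    """
    base = RENAMES.get(tool)
    if base is None:
        base = tool[len("llvm-"):] if tool.startswith("llvm-") else tool
    return f"{host}-{base}"
```

For example, `installed_name("llvm-strip", "x86_64-apple-darwin")` gives `x86_64-apple-darwin-strip`, the prefixed name that keeps autotools from falling back to unprefixed tools and printing the warning quoted above.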