Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SYCL][Offload] Add SYCLBIN format and dump tool #16873

Open
wants to merge 9 commits into
base: sycl
Choose a base branch
from

Conversation

steffenlarsen
Copy link
Contributor

This commit adds a new option --syclbin to the clang-linker-wrapper tool to produce SYCLBIN files from packaged device code binaries. The SYCLBIN format parsing added with these changes follow the format defined in the SYCLBIN design document.

To help both users and the testing framework, a tool (syclbin-dump) for printing information about a SYCLBIN file is added.

This commit adds a new option `--syclbin` to the clang-linker-wrapper
tool to produce SYCLBIN files from packaged device code binaries. The
SYCLBIN format parsing added with these changes follow the format
defined in the SYCLBIN design document.

To help both users and the testing framework, a tool (syclbin-dump) for
printing information about a SYCLBIN file is added.

Signed-off-by: Larsen, Steffen <[email protected]>
@steffenlarsen
Copy link
Contributor Author

Design document: #16872

@@ -37,6 +37,7 @@
#include "llvm/Object/IRObjectFile.h"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hopefully we're planning on upstreaming this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Absolutely! I am told there are some refactorings coming in for ClangLinkerWrapper, so I wanted to get some pieces in for now so development of other parts, like the driver, can go ahead in parallel.

Copy link
Contributor

@sarnex sarnex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just a quick first pass

clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp Outdated Show resolved Hide resolved
clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp Outdated Show resolved Hide resolved
clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp Outdated Show resolved Hide resolved
@@ -18,6 +18,7 @@
#include "llvm/ADT/SetVector.h"
#include "llvm/ADT/StringRef.h"
#include "llvm/IR/Function.h"
#include "llvm/IR/Module.h"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: do we need this instead of forward declaring?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do. Unlike shared_ptr, unique_ptr insists on the contained type not being incomplete.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But why do we need this in a header file which is otherwise unchanged? It seems like either this #include should be moved straight into .cpp file, or into another .h file

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems to be a dependency bug that just wasn't noticed because Module.h was included before ModuleSplitter.h in all other cases.

sycl/test/syclbin/simple_kernel.cpp Outdated Show resolved Hide resolved
sycl/tools/syclbin-dump/syclbin-dump.cpp Outdated Show resolved Hide resolved
sycl/tools/syclbin-dump/syclbin-dump.cpp Show resolved Hide resolved
clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp Outdated Show resolved Hide resolved
namespace {

template <typename T> void BinaryWriteInteger(raw_ostream &OS, T Val) {
static_assert(std::is_integral_v<T>);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: we could consider using SFINAE instead of the assert, but maybe this way the error is more clear

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the name is explicit about it needing to be an integer, I like to think the error from a static assertion is clearer. SFINAE would've been nicer if we wanted it to select between implementations.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we really care about T being an integer? We do reinterpret cast and write raw bytes, the source types shouldn't matter at all, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It may be a bit of a minor optimization, but this function passes Val by value, which for an arbitrary T may be expensive. For integers it may be faster than passing by reference and should never be any more expensive, so that is the reason for the restriction.


namespace object {

class SYCLBIN : public Binary {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main question I have here is why don't we re-use existing OffloadBinary subclass?

class OffloadBinary : public Binary {

If we need a hierarchical structure, then we just make it nested by putting one OffloadBinary into each other, or concatenating a few of them together to represent a list.
We can subclass if we want to provide some custom getters like getArch.

OpenMP offloading emits this data structure and libompoffload (that is going to be turned into liboffload and eventually is what we will use instead of UR once we are in upstream) expects OffloadBinary.

I also think that whatever we produce for regular SYCL app compilation, for online compilation from SYCL sources and for SYCLBIN should have the same format for simplicity and uniformity of handling it in SYCL RT.
From that point of view, I vote for re-using OffloadBinary even more, because whilst I can at least somehow justify a custom format for SYCLBIN, I can't justify us not using OffloadBinary for regular SYCL compilation flow because it provides everything we have in our existing legacy custom format.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, we should eventually switch even the common structure of kernel code over to this new format.

However, I think making this a subclass of OffloadBinary makes the abstraction a little less obvious. The current structure is that the SYCLBIN binary is contained inside the image of an offload binary, so they are not related formats but rather nested formats.

We can subclass if we want to provide some custom getters like getArch.

SYCLBIN does not have architecture at the top-level, as different contained modules can target different device architectures, so I worry that merging the two would make getters like it confusing.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current structure is that the SYCLBIN binary is contained inside the image of an offload binary, so they are not related formats but rather nested formats.

Oh, so the idea is to put SYCLBIN into OffloadBinary::OffloadingImage::Image, right? If so, then I withdraw my objections. In that case we can still have single top-level structure as an interface with SYCLBIN being a specialized version of the underlying content.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should add a comment to this class declaration to describe that it is supposed to be used as a more complicated from of OffloadBinary::OffloadingImage::Image to simplify working with it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added a comment and removed its inheritance from Binary.

Signed-off-by: Larsen, Steffen <[email protected]>
Signed-off-by: Larsen, Steffen <[email protected]>
Signed-off-by: Larsen, Steffen <[email protected]>
Signed-off-by: Larsen, Steffen <[email protected]>
Copy link
Contributor

@AlexeySachkov AlexeySachkov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haven't yet looked at clang-linker-wrapper and SYCLBIN::read/write implementations

llvm/include/llvm/Object/SYCLBIN.h Outdated Show resolved Hide resolved
SmallVector<AbstractModule, 4> AbstractModules;

private:
SYCLBIN(const SYCLBIN &Other) = delete;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you need it to be declared as private to delete it? If we explicitly deleting it, then I think it is better to put it next to other constructors so that they are all next to each other.


namespace object {

class SYCLBIN : public Binary {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should add a comment to this class declaration to describe that it is supposed to be used as a more complicated from of OffloadBinary::OffloadingImage::Image to simplify working with it.

llvm/include/llvm/Object/Binary.h Outdated Show resolved Hide resolved
llvm/include/llvm/Object/SYCLBIN.h Outdated Show resolved Hide resolved
llvm/lib/Object/SYCLBIN.cpp Outdated Show resolved Hide resolved
Comment on lines +163 to +164
for (const ModuleDesc &Desc : ModuleDescs) {
for (const module_split::SplitModule &SM : Desc.SplitModules) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a very long for loop, I suggest that we outline some elements of it into smaller helper functions so that it is easier to grasp high-level logic and keep track of indentation levels

Comment on lines +43 to +45
static Expected<SmallString<0>> write(const SmallVector<ModuleDesc> &);

static Expected<std::unique_ptr<SYCLBIN>> read(MemoryBufferRef Source);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implementation of read/write is quite complex, it has to be exhaustively tested. I'm not saying that all the testing should be here from day 1, but we at least need to lay some groundwork for it. For example what you can do is to have a unit-test which does write, then read and checks that result is the same as the initial input. See e471ba3 for inspiration.

sycl/tools/syclbin-dump/syclbin-dump.cpp Show resolved Hide resolved
sycl/tools/syclbin-dump/syclbin-dump.cpp Outdated Show resolved Hide resolved
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants