Implement `__triton_launcher` as pure DLL #3251

anmyachev · 2025-01-23T17:59:59Z

Test CI:

~~https://github.com/intel/intel-xpu-backend-for-triton/actions/runs/12935201507~~ (wrong branch)
https://github.com/intel/intel-xpu-backend-for-triton/actions/runs/12936829251 (pass rate: 93.35%)

Benchmarks:

https://github.com/intel/intel-xpu-backend-for-triton/actions/runs/12937446706 (I don't see any regression in performance)

Signed-off-by: Anatoly Myachev <[email protected]>

anmyachev · 2025-01-23T21:28:12Z

third_party/intel/backend/driver.py

+class TritonLauncher:
+
+    def __init__(self, cache_path: str):
+        self.shared_library = ctypes.PyDLL(cache_path)


This must use `PyDLL` instead of `CDLL` (as in d7d55b8) for the Python API to work correctly; otherwise I get segfaults.
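For context on the `PyDLL` versus `CDLL` distinction: per the `ctypes` documentation, `CDLL` releases the GIL around each foreign call, while `PyDLL` keeps it held and checks the Python error flag afterwards, which is what code calling Python C API functions needs. The sketch below is not the launcher itself; it uses `ctypes.pythonapi`, the standard library's ready-made `PyDLL` handle to the running interpreter, to call one C API function safely.

```python
import ctypes

# ctypes.pythonapi is a PyDLL instance bound to the running interpreter,
# so calls through it keep the GIL held.
api = ctypes.pythonapi
is_pydll = isinstance(api, ctypes.PyDLL)

# PyLong_FromLong builds a Python int object, so it must run with the GIL held.
py_long_from_long = api.PyLong_FromLong
py_long_from_long.argtypes = [ctypes.c_long]
py_long_from_long.restype = ctypes.py_object

value = py_long_from_long(42)
```

Loading the same kind of function through `CDLL` would run it without the GIL, which is a plausible source of the segfaults mentioned above.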

anmyachev · 2025-01-23T21:32:04Z

third_party/intel/backend/driver.py


    def __call__(self, *args, **kwargs):
        # Serialize KernelArguments for SPIR-V Runner
        serialize_kernel_args = os.getenv('TRITON_XPU_DUMP_SPIRV_KERNEL_ARGS', None)
        if serialize_kernel_args:
            serialize_args(args, self.constants, self.signature)
-        self.launch(*args, **kwargs)


If kwargs were not empty, the launcher would fail, since its C signature accepts only positional args.

Previously, the code unpacked both args and kwargs into a single dictionary and passed that to the launcher - now it looks like we're passing the args dict which seems incorrect as any values in kwargs would be ignored. Could we create a new dictionary that contains both args and kwargs and pass that to maintain the previous behavior?

> Previously, the code unpacked both args and kwargs into a single dictionary and passed that to the launcher - now it looks like we're passing the args dict which seems incorrect as any values in kwargs would be ignored.

I tried passing a dictionary, and the launcher crashed immediately. Are you sure that kwargs were ever used before?

>       self.launch(*args, **{"test": 2})
E       TypeError: launch() takes no keyword arguments

Seems only tuple is expected:

intel-xpu-backend-for-triton/third_party/intel/backend/driver.py

Line 480 in cbbc25f

if(!PyArg_ParseTuple(args, \"{format}\", &gridX, &gridY, &gridZ, &py_obj_stream, &py_kernel,

Perhaps we need to make a cleanup in the triton code...
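The same failure mode can be reproduced without the launcher. As a minimal sketch, the built-in `len` stands in here for any C-implemented callable that accepts positional arguments only (an assumption made for illustration; it is not the Triton launcher): unpacking a non-empty dictionary into the call raises `TypeError` before the function body runs.

```python
def call_positional_only(func, *args, **kwargs):
    """Forward args and kwargs the way the old __call__ did; report the outcome."""
    try:
        return ("ok", func(*args, **kwargs))
    except TypeError:
        return ("type_error", None)

# Empty kwargs are harmless: the callee only sees the positional tuple.
without_kwargs = call_positional_only(len, [1, 2, 3])

# Any keyword argument is rejected by a positional-only C function.
with_kwargs = call_positional_only(len, [1, 2, 3], **{"test": 2})
```

So forwarding `**kwargs` only ever worked while the dictionary was empty.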

It looks like the calling classes support them: https://github.com/intel/intel-xpu-backend-for-triton/blob/main/python/triton/compiler/compiler.py#L432

> It looks like the calling classes support them: https://github.com/intel/intel-xpu-backend-for-triton/blob/main/python/triton/compiler/compiler.py#L432

In this line, all parameters are packed into one tuple, since they are not passed in the form `..., name_param=param, ...`.
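A small sketch of that packing behaviour, using hypothetical stand-ins (`launch` and `run` are illustrative names, not the functions in `compiler.py` or `driver.py`): when every argument is passed positionally, `*args` receives all of them and `**kwargs` stays empty, so dropping kwargs from the forwarded call loses nothing.

```python
def launch(*args):
    """Hypothetical stand-in for a launcher that takes positional args only."""
    return args

def run(*args, **kwargs):
    # Return what would be forwarded, plus whatever ended up in kwargs.
    return launch(*args), kwargs

# Callers pass everything positionally, never as name_param=param.
packed, leftover = run(1, 1, 1, "stream", "kernel")
```

Here `packed` holds all five values in order and `leftover` is an empty dictionary.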

@alexbaden do you mind if I merge it?

Ok, interesting - I am sure your Python knowledge is better than mine! :)
Do you think this is something we should propose changing upstream? Or is this specific to our backend?

> Do you think this is something we should propose changing upstream? Or is this specific to our backend?

Similar code exists only for the AMD backend. I suggested removing the unused code in triton-lang/triton#5694.

Just for reference how it works for NVIDIA backend:

intel-xpu-backend-for-triton/third_party/nvidia/backend/driver.py

Line 548 in b018ed6

self.launch(gridX, gridY, gridZ, stream, function, self.launch_cooperative_grid, global_scratch, *args)

anmyachev · 2025-01-23T21:32:58Z

third_party/intel/backend/driver.py

@@ -635,15 +649,14 @@ def __init__(self, src, metadata):
        self.constants = {arg_idx(idx): value for idx, value in constants.items()}
        self.signature = {idx: value for idx, value in src.signature.items()}
        src = make_launcher(self.constants, self.signature)
-        mod = compile_module_from_src(src, "__triton_launcher")
-        self.launch = mod.launch
+        self.mod = compile_module_from_src(src, "__triton_launcher")


This keeps a reference to the .so/.dll so that its destructor is not called prematurely.
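The lifetime issue can be illustrated with a toy model. Everything below is hypothetical (`FakeLibrary`, `DropsModule`, and `KeepsModule` are not names from the PR), and it assumes the wrapper releases the library in its destructor and that the extracted callable does not itself reference the wrapper.

```python
import gc

events = []

class FakeLibrary:
    """Stand-in for a wrapper that unloads a shared library in __del__."""

    def __init__(self):
        # This callable does not reference self, so it cannot keep the wrapper alive.
        self.launch = lambda *args: ("launched", args)

    def __del__(self):
        events.append("unloaded")

class DropsModule:
    def __init__(self):
        mod = FakeLibrary()
        self.launch = mod.launch  # only the callable survives; mod dies on return

class KeepsModule:
    def __init__(self):
        self.mod = FakeLibrary()  # the wrapper lives as long as this object

dropped = DropsModule()
gc.collect()
unloaded_after_drop = len(events)

kept = KeepsModule()
gc.collect()
unloaded_while_kept = len(events)
```

In the first variant the wrapper is destroyed as soon as `__init__` returns, which for a real library would leave `launch` pointing at unloaded code; in the second, no further unload happens while `kept` is alive.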

With unloaded DLL libraries, these changes are no longer necessary. However, two tests that hold a reference to the compiled kernel need to be adjusted - manually clear the cache (inside `JITFunction` object).

CI:
* https://github.com/intel/intel-xpu-backend-for-triton/actions/runs/12951798922 (passed)
* https://github.com/intel/intel-xpu-backend-for-triton/actions/runs/12955820922 (check status)

Blocked on #3251

Extra refs:
* python/cpython#87319

---------

Signed-off-by: Anatoly Myachev <[email protected]>

Implement __triton_launcher as pure DLL

999e92a

Signed-off-by: Anatoly Myachev <[email protected]>

anmyachev marked this pull request as ready for review January 23, 2025 21:22

anmyachev linked an issue Jan 23, 2025 that may be closed by this pull request

ImportError: DLL load failed while importing __triton_launcher: The parameter is incorrect #3248

Closed

anmyachev commented Jan 23, 2025

anmyachev requested review from whitneywhtsang, vlad-penkin and pbchekin January 23, 2025 21:29

pbchekin approved these changes Jan 23, 2025

anmyachev commented Jan 23, 2025

anmyachev requested a review from alexbaden January 23, 2025 22:12

anmyachev mentioned this pull request Jan 24, 2025

Revert cleaning cache changes on Windows #3217

Merged

alexbaden approved these changes Jan 24, 2025

anmyachev merged commit 303c0ab into main Jan 24, 2025
9 checks passed

anmyachev deleted the amyachev/3248 branch January 24, 2025 18:53
