torch.compile neighbors without graph breaks #305

RaulPPelaez · 2024-03-14T11:03:27Z

PyTorch introduced a new API for handling extensions; it is "documented" here: https://docs.google.com/document/d/1_W62p8WJOQQUzPsJYa7s701JXt0qf2OfLub2sbkHOaU/edit

It makes it possible to write meta registrations for C++ extensions, which I could not do before. With a meta registration, torch.compile is able to understand custom operations. A meta registration is an implementation of the operator for the "meta" device (akin to CPU or CUDA), in which tensors only have shapes (no data) and are referred to as FakeTensor.
PyTorch uses it to gather information about the input/output shapes of an operator for compilation purposes.
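As a rough illustration of what a meta registration looks like, here is a minimal Python-side sketch using `torch.library.Library`. The PR itself does this for a C++ extension; the namespace `meta_demo`, the operator `all_pairs`, and its brute-force CPU implementation are all made up for this example and are not part of torchmd-net.

```python
import torch
from torch.library import Library

# Hypothetical namespace and operator, used only for this illustration.
lib = Library("meta_demo", "DEF")
lib.define("all_pairs(Tensor pos, int max_pairs) -> Tensor")


def all_pairs_cpu(pos, max_pairs):
    # Real implementation: list every (i, j) pair with i > j,
    # padding unused slots with -1 so the output shape is fixed.
    n = pos.shape[0]
    i, j = torch.tril_indices(n, n, offset=-1)
    pairs = torch.full((2, max_pairs), -1, dtype=torch.long)
    k = min(i.numel(), max_pairs)
    pairs[0, :k] = i[:k]
    pairs[1, :k] = j[:k]
    return pairs


def all_pairs_meta(pos, max_pairs):
    # Meta implementation: no computation, only the output's
    # shape and dtype, allocated on the same (meta) device as the input.
    return pos.new_empty((2, max_pairs), dtype=torch.long)


lib.impl("all_pairs", all_pairs_cpu, "CPU")
lib.impl("all_pairs", all_pairs_meta, "Meta")

# Calling the operator with a meta tensor only propagates shapes.
meta_out = torch.ops.meta_demo.all_pairs(torch.empty(4, 3, device="meta"), 10)
print(meta_out.shape, meta_out.device)
```

The key point of the sketch is that the output shape depends only on the input shapes and the integer argument, never on tensor values, which is what lets the shape be computed without any data.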

This makes the following code possible:

    example_pos = 100 * torch.rand(
        50, 3, requires_grad=True, dtype=dtype, device=device
    )
    model = OptimizedDistance(
        return_vecs=True,
        loop=True,
        max_num_pairs=-50,
        include_transpose=True,
        resize_to_fit=False,
        check_errors=False,
    ).to(device)
    for _ in range(25):
        model(example_pos)
    edge_index, edge_vec, edge_distance = model(example_pos)
    model = torch.compile(
        model,
        fullgraph=True,
        backend="inductor",
        mode="reduce-overhead",
    )
    edge_index, edge_vec, edge_distance = model(example_pos)

Prior to this PR, torch.compile had to be instructed to exclude the neighbor extension from the operation graph:

torchmd-net/torchmdnet/extensions/__init__.py

Lines 116 to 118 in fae79bd

    
    if int(torch.__version__.split(".")[0]) >= 2:
        import torch._dynamo as dynamo
        dynamo.disallow_in_graph(torch.ops.torchmdnet_extensions.get_neighbor_pairs)

So it could not be compiled with fullgraph=True.

The new API starts at version 2.2.1, which is not yet in conda-forge. I made it so that the current behavior is unchanged for versions prior to it.
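Gating on a patch-level version needs more than the major-version comparison used in the older check. A minimal sketch of one way to do it, assuming plain string parsing; the helper names `version_tuple` and `supports_meta_registration` are hypothetical and not necessarily what this PR uses:

```python
from itertools import takewhile


def version_tuple(version):
    """Turn a version string such as '2.2.1+cu121' into (2, 2, 1)."""
    core = version.split("+")[0]  # drop a local build tag like '+cu121'
    parts = []
    for field in core.split(".")[:3]:
        # Keep only the leading digits, so '0a0' becomes 0.
        digits = "".join(takewhile(str.isdigit, field))
        parts.append(int(digits) if digits else 0)
    while len(parts) < 3:  # pad short versions such as '2.2'
        parts.append(0)
    return tuple(parts)


def supports_meta_registration(version, minimum=(2, 2, 1)):
    # Tuples compare element by element, so (2, 10, 0) > (2, 2, 1).
    return version_tuple(version) >= minimum


print(supports_meta_registration("2.1.2"))        # older release
print(supports_meta_registration("2.2.1+cu121"))  # first release with the API
```

In real code the argument would be `torch.__version__`; comparing tuples of integers avoids the pitfall of comparing version strings lexicographically.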

Still, torch.compile is not able to handle code like this, in which a particular item from a tensor is accessed:

        if self.check_errors:
            assert (
                num_pairs[0] <= max_pairs
            ), f"Found num_pairs({num_pairs[0]}) > max_num_pairs({max_pairs})"

It can still be compiled, just not with fullgraph=True.
The general rule is: "if you can capture it into a CUDA graph, you can torch.compile it".
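One possible way to keep such a check is to return the count as a tensor from the tensor-only code and read its value afterward in ordinary eager Python. The sketch below only illustrates that separation: the function names are hypothetical, a brute-force pair search stands in for the extension, and nothing here is compiled.

```python
import torch


def count_pairs(pos, cutoff):
    """Tensor-only work: nothing here branches on a tensor's value."""
    n = pos.shape[0]
    i, j = torch.tril_indices(n, n, offset=-1, device=pos.device)
    dist = (pos[i] - pos[j]).norm(dim=1)
    return (dist < cutoff).sum()  # stays a 0-d tensor


def checked_count(pos, cutoff, max_pairs):
    """Eager wrapper: the data-dependent read happens only here."""
    num_pairs = count_pairs(pos, cutoff)
    found = int(num_pairs)  # reads a value out of the tensor
    if found > max_pairs:
        raise RuntimeError(
            f"Found num_pairs({found}) > max_num_pairs({max_pairs})"
        )
    return found


pos = torch.tensor([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [5.0, 0.0, 0.0]])
print(checked_count(pos, cutoff=2.0, max_pairs=5))
```

The intent is that only a function like `count_pairs` would be handed to torch.compile, while the wrapper stays in eager mode; whether that fits a given model depends on where the check needs to live.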

Make fwd and bkwd independent operators. Add meta registrations for forwards and backwards. Define meta registrations only in pytorch>=2.2.0.

RaulPPelaez · 2024-03-14T11:26:09Z

torchmdnet/extensions/neighbors/neighbors_cpu.cpp

+        static auto fwd =
+            torch::Dispatcher::singleton()
+                .findSchemaOrThrow("torchmdnet_extensions::get_neighbor_pairs_fwd", "")
+                .typed<decltype(forward_impl)>();


@peastman, I think we could use this code here to allow any PyTorch model to be used in OpenMM-Torch, such as models produced by torch.compile or torch.jit.trace, or even models not compatible with TorchScript.
The code above makes it possible to call any PyTorch extension from C++, regardless of where or when the extension was registered.
We could have a function that registers the user model as an Autograd extension on the Python side and simply sends TorchForce the name of the extension.
Serialization would not be possible with torch.save though, unless we use something like Pickle in those instances.
If we solve serialization, this would decouple OpenMM-Torch from TorchScript entirely.
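To make the idea concrete, here is a minimal Python-side sketch of registering a model function as an operator and later retrieving it from nothing but its name, which is the Python analogue of the dispatcher lookup in the C++ snippet above. The namespace `user_models_demo`, the operator `harmonic_energy`, and the helper `lookup_op` are hypothetical; how TorchForce would consume the name is not shown.

```python
import torch
from torch.library import Library

# Hypothetical namespace and operator, for illustration only.
_lib = Library("user_models_demo", "DEF")
_lib.define("harmonic_energy(Tensor pos) -> Tensor")


def harmonic_energy(pos):
    # Stand-in "model": a harmonic well centered at the origin.
    return 0.5 * (pos**2).sum()


# A composite implementation is built from differentiable torch ops,
# so autograd works through it without a hand-written backward.
_lib.impl("harmonic_energy", harmonic_energy, "CompositeImplicitAutograd")


def lookup_op(qualified_name):
    """Find a registered operator from a 'namespace::name' string."""
    namespace, name = qualified_name.split("::")
    return getattr(getattr(torch.ops, namespace), name)


op = lookup_op("user_models_demo::harmonic_energy")
pos = torch.ones(4, 3, requires_grad=True)
energy = op(pos)
energy.backward()
print(energy.item(), pos.grad.shape)
```

Only the string crosses the boundary in this sketch; the registration itself lives in the Python process, which is why serialization remains the open question.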

RaulPPelaez added 7 commits March 13, 2024 19:10

Add test for full graph neighbor torch.compile

19eefdf

blacken

275c730

Add missing sqrt default implementation

f2645cc

Update compile test

17977c5

Import torch.Tensor

0aa4cb2

Expose only the extensions

0ad7700

Make CUDA backwards also the operation used in CPU

195a9a4

Make fwd and bkwd independent operators. Add meta registrations for forwards and backwards. Define meta registrations only in pytorch>=2.2.0.

Fix CPU extension potentially allocating with a negative size

eeded26

RaulPPelaez marked this pull request as ready for review March 14, 2024 12:06

RaulPPelaez changed the title ~~Compile neighbors~~ torch.compile neighbors without graph breaks Mar 14, 2024

Fix incorrect type

500a8e4

RaulPPelaez merged commit 72d6e8e into torchmd:main Apr 15, 2024
2 checks passed
