Tracing a model introduces extra aten::Int ops
#1185
-
@narendasan @peri044 When you have time, please take a look. Thanks.
-
watching
-
In my experience
-
I found a blog post about the tracing process. It mentioned that expressions like
-
Background: When I tried to convert the PyTorch model Swin-Transformer to TorchScript, scripting didn't work because some ops are not supported by the script path, so I had to trace the model instead. During tracing, however, I found that extra `aten::Int` ops are introduced, and in some cases these are not handled well when Torch-TensorRT converts the graph. In the Swin-Transformer case I had to change the source code to stop the tracing process from introducing the extra `aten::Int` ops. I wonder if there is a way to keep tracing from introducing `aten::Int` into the TorchScript graph. An example of the case is below.

There is a code snippet that is widely used in Swin-Transformer; a sketch of it is shown below. When I traced the model and fully compiled it with Torch-TensorRT, compilation failed. When I scripted the model and compiled it with Torch-TensorRT, it worked.
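A minimal sketch of the kind of window-partition code I mean (not the exact snippet; the module name, the input shape, and `window_size=7` are assumptions for illustration):

```python
# Sketch of the window-partition pattern common in Swin-Transformer;
# names, shapes, and window_size=7 are assumptions, not the exact snippet.
import torch
import torch.nn as nn


class WindowPartition(nn.Module):
    def __init__(self, window_size: int = 7):
        super().__init__()
        self.window_size = window_size  # a plain Python int in the source

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, H, W, C)
        B, H, W, C = x.shape
        # Under torch.jit.trace the shape values are recorded in the graph,
        # and the // arithmetic with self.window_size is traced as tensor ops,
        # so window_size ends up as a Long constant and the results need
        # aten::Int casts before they can be fed to view().
        x = x.view(B, H // self.window_size, self.window_size,
                   W // self.window_size, self.window_size, C)
        x = x.permute(0, 1, 3, 2, 4, 5).contiguous()
        return x.view(-1, self.window_size, self.window_size, C)
```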
The script process log is here:
script_log.txt.
The trace process log is here:
trace_full_compile.txt
In the script process, `self.window_size` is an int value:
`%self.window_size : int = prim::Constant[value=7]()`
In the trace process, `self.window_size` is a Tensor value:
`%16 : Long(requires_grad=0, device=cpu) = prim::Constant[value={7}]()`
In my opinion, this is what introduces the `aten::Int` op into the graph and leads to the failure on the traced path.

So I wonder if there is a way to avoid `self.window_size` being captured as a Tensor value during the trace process. I think that would avoid a lot of `aten::Int` ops in some cases.
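For reference, here is a quick check (my own sketch, reusing the assumed `WindowPartition` module from above, not the real model) to compare how the two export paths handle this: trace and script the same module and count the `aten::Int` nodes in each TorchScript graph.

```python
# Compare the traced and scripted graphs of the assumed example module;
# the input shape (1, 56, 56, 96) is an assumption for illustration.
import torch


def count_aten_int(graph) -> int:
    # Count aten::Int nodes, i.e. tensor -> int casts, in a TorchScript graph.
    return sum(1 for node in graph.nodes() if node.kind() == "aten::Int")


module = WindowPartition(window_size=7)
example = torch.randn(1, 56, 56, 96)  # assumed (B, H, W, C) input

traced = torch.jit.trace(module, example)
scripted = torch.jit.script(module)

print("aten::Int nodes in traced graph  :", count_aten_int(traced.graph))
print("aten::Int nodes in scripted graph:", count_aten_int(scripted.graph))
```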