Shape Tensor handling in conversion and design for dynamic converters
TL;DR
We recently added support for `aten::size` to output shape tensors (`nvinfer1::ITensor`), which can now pass shape information to the conversion stack. Shape tensors are how TensorRT encodes dynamic shape information, so this is necessary for true dynamic shape support. However, this change introduces violations of the type system which expose the converter and evaluator libraries to intermittent failures in unclear ways.
Background
We recently added support for `aten::size` to output shape tensors (`nvinfer1::ITensor`), which can now pass shape information to the conversion stack. The issue is that the `aten::size` schema is `aten::size(Tensor t) -> int[]`, so a type confusion is introduced. Fundamentally, converters assume that arguments of a particular type will actually be that type. The exceptions are `Tensor`, which can be either a `torch::Tensor` or an `nvinfer1::ITensor`, and `Tensor[]`, which is a list formed of either variant of tensor. The issue with types such as `int`, `int[]`, `float`, etc. is that they are used as compile-time static data, for the most part consumed either in evaluators to calculate other compile-time static data or in converters for layer settings. For example, you might multiply two ints together at compile time to get the window size of a max pooling layer. It therefore does not seem feasible to simply accept this type confusion and add support in converters for cases where an `int` or `int[]` is actually an `ITensor`, since any `int` or `int[]` argument to any converter may in fact not contain static data.
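To make that assumption concrete, here is a rough sketch of what a static converter does today, modeled on the converter library's registration pattern (the exact names, signatures, and namespace here are assumptions for illustration, not a verbatim excerpt): the `int[]` argument is unwrapped into host-side values and baked directly into the layer, which only works if the data is available at compile time.

```cpp
#include "core/conversion/converters/converters.h"
#include "core/util/prelude.h"

namespace example {
using namespace torch_tensorrt::core::conversion::converters;

// Hypothetical static converter: int[] args are assumed to hold compile-time data.
auto max_pool_example = RegisterNodeConversionPatterns().pattern(
    {"aten::max_pool2d(Tensor self, int[2] kernel_size, int[2] stride=[], int[2] padding=0, int[2] dilation=1, bool ceil_mode=False) -> Tensor",
     [](ConversionCtx* ctx, const torch::jit::Node* n, args& args) -> bool {
       auto in = args[0].ITensorOrFreeze(ctx);                 // Tensor: may be an ITensor or a frozen torch::Tensor
       auto kernel = util::toDims(args[1].unwrapToIntList());  // int[]: unwrap assumes static data
       auto* layer = ctx->net->addPoolingNd(*in, nvinfer1::PoolingType::kMAX, kernel);
       ctx->AssociateValueAndTensor(n->outputs()[0], layer->getOutput(0));
       return true;
     }});
} // namespace example
```

If `aten::size` routes a shape tensor into `kernel_size` here, the `unwrapToIntList()` call has nothing static to unwrap and the converter fails at conversion time with an opaque unwrapping error.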
Proposed Solution
Instead, we could respect the types that schemas say they accept, keeping the contract between converters and the system (https://pytorch.org/TensorRT/contributors/writing_converters.html). The proposed way to do this is to introduce a new IR to handle operations in a dynamic shape mode.

The IR will contain placeholder operations with schemas that respect both the expectations of TensorRT and TorchScript. For example, the dynamic version of `aten::size` has a corresponding `trt::size_dyn`. The difference between these two operations is that `aten::size`'s schema is `aten::size(Tensor t) -> int[]` while `trt::size_dyn`'s is `trt::size_dyn(Tensor t) -> (Tensor)` (note the different output type). Obviously, consumers of `aten::size` are expecting an `int[]` and not a `Tensor`, so we will incrementally add `dyn` alternatives to these ops as they are found. For example, `aten::unflatten.int(Tensor self, int dim, int[] sizes) -> (Tensor)` would have a `dyn` variant `trt::unflatten_dyn.int(Tensor self, int dim, Tensor sizes) -> (Tensor)`. (#1808)
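A hedged sketch of how these placeholder schemas might be registered, following the pattern already used for `trt::const` in `core/lowering/register_trt_placeholder_ops.cpp` (the `trt::size_dyn` / `trt::unflatten_dyn` entries are hypothetical additions; the no-op bodies reflect that these nodes are only ever consumed by the conversion stack, never executed):

```cpp
#include <torch/csrc/jit/runtime/custom_operator.h>

namespace torch {
namespace jit {

c10::AliasAnalysisKind aliasAnalysisFromSchema() {
  return c10::AliasAnalysisKind::FROM_SCHEMA;
}

// Hypothetical dyn placeholder ops: same semantics as their aten counterparts,
// but dynamic shape information is typed as Tensor instead of int[].
RegisterOperators trt_dyn_placeholder_ops_reg({
    Operator(
        "trt::size_dyn(Tensor self) -> Tensor",
        [](Stack& stack) { /* placeholder: replaced during conversion */ },
        aliasAnalysisFromSchema()),
    Operator(
        "trt::unflatten_dyn.int(Tensor self, int dim, Tensor sizes) -> Tensor",
        [](Stack& stack) { /* placeholder: replaced during conversion */ },
        aliasAnalysisFromSchema()),
});

} // namespace jit
} // namespace torch
```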
Now the converter for `trt::unflatten_dyn` expects a `Tensor` for the sizes, and the implementation of the converter can operate under this assumption. Converters don't need to handle both static and dynamic cases.
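For intuition, here is a minimal sketch of what the core of such a dyn converter can do once it is guaranteed an `ITensor` for the sizes (plain TensorRT API only; the real converter would be registered in the converter library, and this helper is made up for illustration): the shape tensor is wired directly into the layer instead of being unwrapped into static dims.

```cpp
#include <NvInfer.h>

// Hypothetical core of a dyn reshape-style converter: the target shape arrives
// as a shape tensor (e.g. from trt::size_dyn) and is fed to the layer as a
// runtime input rather than baked in via setReshapeDimensions().
nvinfer1::ITensor* reshape_with_shape_tensor(
    nvinfer1::INetworkDefinition* net,
    nvinfer1::ITensor* self,     // data tensor being reshaped
    nvinfer1::ITensor* sizes) {  // shape tensor holding the target dims
  auto* shuffle = net->addShuffle(*self);
  shuffle->setInput(1, *sizes);  // second input of IShuffleLayer = dynamic reshape dims
  return shuffle->getOutput(0);
}
```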
TorchScript enforces types on the edges (values) between nodes, and this is actually a feature we can leverage. We can have a lowering pass, run conditioned on whether the inputs are dynamic (data already available at lowering time), that replaces static variants with `dyn` variants in place and errors out if the graph cannot be reconstructed using the available `dyn` ops. This gives users a clearer, earlier report of which operations need dynamic support added, versus the opaque unwrapping errors that pop up when the compiler goes to unwrap an `int[]` from a `Var` containing an `ITensor` in a converter.

There is already an effort to add dynamic shape and DDS support to key converters by amending them. Instead, we can simply add new converters, and this proposal would provide the infrastructure to do so in a clear, maintainable fashion, allowing converters to remain simple.
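A minimal sketch of such a lowering pass, assuming a hypothetical `trt::size_dyn` op registered as above (this is illustrative graph surgery with the TorchScript IR API, not the actual Torch-TensorRT pass):

```cpp
#include <torch/csrc/jit/ir/ir.h>

#include <vector>

// Replace aten::size nodes with hypothetical trt::size_dyn nodes whose output is a
// Tensor (shape tensor) instead of an int[]. A real pass would also verify that all
// consumers have dyn variants and raise a descriptive error otherwise.
void ReplaceSizeWithSizeDyn(std::shared_ptr<torch::jit::Graph>& graph) {
  std::vector<torch::jit::Node*> to_replace;
  for (auto* node : graph->nodes()) {
    if (node->kind() == c10::Symbol::fromQualString("aten::size")) {
      to_replace.push_back(node);
    }
  }
  for (auto* node : to_replace) {
    torch::jit::WithInsertPoint guard(node);
    auto* dyn = graph->create(c10::Symbol::fromQualString("trt::size_dyn"), node->inputs());
    dyn->output()->setType(c10::TensorType::get()); // schema change: int[] -> Tensor
    graph->insertNode(dyn);
    node->output()->replaceAllUsesWith(dyn->output());
    node->destroy();
  }
}
```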
Alternative approaches
- We can remove the `aten::size` dynamic (shape tensor) support, since we are moving to Dynamo; this keeps TorchScript more maintainable as resources transfer over.
- We can start patching converters and evaluators to handle shape tensors where these issues pop up today. The problem is that we implicitly throw out the type system, and we will not be able to tell whether data that is required at compile time has been subsumed into shape tensors. This approach makes converters more complicated and will likely introduce more bugs and model failures, with less clear messaging about why the system is failing.
- We can push the TorchScript frontend to do as much as possible in TensorRT at runtime, i.e. automatically freezing data as soon as it is available. This would still require evaluators to be rewritten as, essentially, new converters. It would also massively bloat the size of produced engines by performing intermediate operations that we can currently do at compile time, and would likely cost performance.
Example Workflow
1. For any operation that needs dynamic shape support, define a placeholder op `trt::[op]_dyn(...) -> (...)` where the type of the dynamic shape information is `Tensor`.
2. Register these placeholder ops alongside `trt::const`, with a translation map which defines mappings between static `aten` ops and dynamic `trt::[op]_dyn` ops. (`TensorRT/core/lowering/register_trt_placeholder_ops.cpp`, line 14 in `1d78f43`)
3. If the graph cannot be reconstructed with `dyn` ops, e.g. an `int[]` or `int` argument being provided a `Tensor`, provide an error stating what operations need dynamic alternatives.
4. Each `trt::[op]_dyn` converter should have a separate implementation from its static counterpart, written under the assumption that shape information is provided via an `ITensor`. If necessary, we can amend this so that `dyn` converters handle both `ITensor` and `int[]` cases, but this direction (static -> dynamic) is much easier to handle than the current situation (dynamic -> static): the converter would simply freeze the `int[]` into a shape tensor, much as `ITensorOrFreeze` does. In fact, we can add a convenience function, `ShapeTensorOrFreeze`, which would freeze `torch::Tensor`s or `int`s/`int[]`s and return a shape tensor (see the sketch below).
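A hedged sketch of what such a `ShapeTensorOrFreeze` helper could look like (the function does not exist yet; the signature here is an assumption, written against the raw TensorRT API rather than the converter `Var`/`ConversionCtx` types it would actually take):

```cpp
#include <NvInfer.h>

#include <cstdint>
#include <vector>

// If the argument already carries a shape tensor, pass it through; otherwise freeze
// the compile-time int[] into a 1D Int32 constant so the dyn converter always sees
// a shape tensor. Note: TensorRT keeps a pointer to the weights, so static_sizes
// must outlive engine building (the real helper would copy into managed storage).
nvinfer1::ITensor* ShapeTensorOrFreeze(
    nvinfer1::INetworkDefinition* net,
    nvinfer1::ITensor* maybe_shape,              // non-null when the arg was dynamic
    const std::vector<int32_t>& static_sizes) {  // used when the arg was a static int[]
  if (maybe_shape) {
    return maybe_shape;
  }
  nvinfer1::Weights w{nvinfer1::DataType::kINT32,
                      static_sizes.data(),
                      static_cast<int64_t>(static_sizes.size())};
  nvinfer1::Dims d;
  d.nbDims = 1;
  d.d[0] = static_cast<int32_t>(static_sizes.size());
  auto* const_layer = net->addConstant(d, w);
  return const_layer->getOutput(0);
}
```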
Implementation Phases
- Prototype - Large
- MVP - Large