Shape Tensor handling in conversion and design for dynamic converters
TL;DR
We recently added support for `aten::size` to output shape tensors (`nvinfer1::ITensor`), which can now pass shape information to the conversion stack. Shape tensors are how TensorRT encodes dynamic shape information, so this is necessary for true dynamic shape support. However, this change introduces violations of the type system which expose the converter and evaluator libraries to intermittent failures in unclear ways.
Background
We recently added support for `aten::size` to output shape tensors (`nvinfer1::ITensor`), which can now pass shape information to the conversion stack. The issue is that the `aten::size` schema is `aten::size(Tensor t) -> int[]`, so a type confusion is introduced. Fundamentally, converters assume that arguments of a particular type will actually be that type. The exceptions are `Tensor`, which can be either a `torch::Tensor` or an `nvinfer1::ITensor`, and `Tensor[]`, which is a list formed of either variant of tensor. The issue with types such as `int`, `int[]`, `float`, etc. is that they are used as compile-time static data, for the most part consumed either in evaluators to calculate other compile-time static data or in converters for layer settings. For example, you might multiply two ints together at compile time to get the window size of a max pooling layer. It therefore does not seem feasible to simply accept this type confusion and add support in converters for cases where an `int` or `int[]` is actually an `ITensor`, since any `int` or `int[]` argument to any converter may in fact not contain static data.
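To make that assumption concrete, here is a rough sketch of what a static converter does today, modeled on the converter library's registration pattern (the exact names, signatures, and namespace here are assumptions for illustration, not a verbatim excerpt): the `int[]` argument is unwrapped into host-side values and baked directly into the layer, which only works if the data is available at compile time.

```cpp
#include "core/conversion/converters/converters.h"
#include "core/util/prelude.h"

namespace example {
using namespace torch_tensorrt::core::conversion::converters;

// Hypothetical static converter: int[] args are assumed to hold compile-time data.
auto max_pool_example = RegisterNodeConversionPatterns().pattern(
    {"aten::max_pool2d(Tensor self, int[2] kernel_size, int[2] stride=[], int[2] padding=0, int[2] dilation=1, bool ceil_mode=False) -> Tensor",
     [](ConversionCtx* ctx, const torch::jit::Node* n, args& args) -> bool {
       auto in = args[0].ITensorOrFreeze(ctx);                 // Tensor: may be an ITensor or a frozen torch::Tensor
       auto kernel = util::toDims(args[1].unwrapToIntList());  // int[]: unwrap assumes static data
       auto* layer = ctx->net->addPoolingNd(*in, nvinfer1::PoolingType::kMAX, kernel);
       ctx->AssociateValueAndTensor(n->outputs()[0], layer->getOutput(0));
       return true;
     }});
} // namespace example
```

If `aten::size` routes a shape tensor into `kernel_size` here, the `unwrapToIntList()` call has nothing static to unwrap and the converter fails at conversion time with an opaque unwrapping error.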
Proposed Solution
Instead, we could respect the types that schemas say they accept, keeping the contract between converters and the system (https://pytorch.org/TensorRT/contributors/writing_converters.html). The proposed way to do this is to introduce a new IR to handle operations in a dynamic shape mode.

The IR will contain placeholder operations with schemas that respect both the expectations of TensorRT and TorchScript. For example, the dynamic version of `aten::size` has a corresponding `trt::size_dyn`. The difference between these two operations is that `aten::size`'s schema is `aten::size(Tensor t) -> int[]` while `trt::size_dyn`'s is `trt::size_dyn(Tensor t) -> (Tensor)` (note the different output type). Obviously, consumers of `aten::size` are expecting an `int[]` and not a `Tensor`, so we will incrementally add `dyn` alternatives to these ops as they are found. For example, `aten::unflatten.int(Tensor self, int dim, int[] sizes) -> (Tensor)` would have a `dyn` variant `trt::unflatten_dyn.int(Tensor self, int dim, Tensor sizes) -> (Tensor)`. (#1808)
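A hedged sketch of how these placeholder schemas might be registered, following the pattern already used for `trt::const` in `core/lowering/register_trt_placeholder_ops.cpp` (the `trt::size_dyn` / `trt::unflatten_dyn` entries are hypothetical additions; the no-op bodies reflect that these nodes are only ever consumed by the conversion stack, never executed):

```cpp
#include <torch/csrc/jit/runtime/custom_operator.h>

namespace torch {
namespace jit {

c10::AliasAnalysisKind aliasAnalysisFromSchema() {
  return c10::AliasAnalysisKind::FROM_SCHEMA;
}

// Hypothetical dyn placeholder ops: same semantics as their aten counterparts,
// but dynamic shape information is typed as Tensor instead of int[].
RegisterOperators trt_dyn_placeholder_ops_reg({
    Operator(
        "trt::size_dyn(Tensor self) -> Tensor",
        [](Stack& stack) { /* placeholder: replaced during conversion */ },
        aliasAnalysisFromSchema()),
    Operator(
        "trt::unflatten_dyn.int(Tensor self, int dim, Tensor sizes) -> Tensor",
        [](Stack& stack) { /* placeholder: replaced during conversion */ },
        aliasAnalysisFromSchema()),
});

} // namespace jit
} // namespace torch
```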
Now the converter for `trt::unflatten_dyn` expects a `Tensor` for the sizes, and the implementation of the converter can operate under this assumption. Converters don't need to handle both static and dynamic cases.
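For intuition, here is a minimal sketch of what the core of such a dyn converter can do once it is guaranteed an `ITensor` for the sizes (plain TensorRT API only; the real converter would be registered in the converter library, and this helper is made up for illustration): the shape tensor is wired directly into the layer instead of being unwrapped into static dims.

```cpp
#include <NvInfer.h>

// Hypothetical core of a dyn reshape-style converter: the target shape arrives
// as a shape tensor (e.g. from trt::size_dyn) and is fed to the layer as a
// runtime input rather than baked in via setReshapeDimensions().
nvinfer1::ITensor* reshape_with_shape_tensor(
    nvinfer1::INetworkDefinition* net,
    nvinfer1::ITensor* self,     // data tensor being reshaped
    nvinfer1::ITensor* sizes) {  // shape tensor holding the target dims
  auto* shuffle = net->addShuffle(*self);
  shuffle->setInput(1, *sizes);  // second input of IShuffleLayer = dynamic reshape dims
  return shuffle->getOutput(0);
}
```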
TorchScript enforces types on the edges (values) between nodes, and this is actually a feature we can leverage. We can have a lowering pass, run conditioned on whether the inputs are dynamic (data already available at lowering time), that replaces static variants with `dyn` variants in place and errors out if the graph cannot be reconstructed using the available `dyn` ops. This gives users a clearer, earlier report of which operations need dynamic support added, versus the opaque unwrapping errors that pop up when the compiler goes to unwrap an `int[]` from a `Var` containing an `ITensor` in a converter.

There is already an effort to add dynamic shape and DDS support to key converters by amending them. Instead, we can simply add new converters, and this proposal would provide the infrastructure to do so in a clear, maintainable fashion, allowing converters to remain simple.
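A minimal sketch of such a lowering pass, assuming a hypothetical `trt::size_dyn` op registered as above (this is illustrative graph surgery with the TorchScript IR API, not the actual Torch-TensorRT pass):

```cpp
#include <torch/csrc/jit/ir/ir.h>

#include <vector>

// Replace aten::size nodes with hypothetical trt::size_dyn nodes whose output is a
// Tensor (shape tensor) instead of an int[]. A real pass would also verify that all
// consumers have dyn variants and raise a descriptive error otherwise.
void ReplaceSizeWithSizeDyn(std::shared_ptr<torch::jit::Graph>& graph) {
  std::vector<torch::jit::Node*> to_replace;
  for (auto* node : graph->nodes()) {
    if (node->kind() == c10::Symbol::fromQualString("aten::size")) {
      to_replace.push_back(node);
    }
  }
  for (auto* node : to_replace) {
    torch::jit::WithInsertPoint guard(node);
    auto* dyn = graph->create(c10::Symbol::fromQualString("trt::size_dyn"), node->inputs());
    dyn->output()->setType(c10::TensorType::get()); // schema change: int[] -> Tensor
    graph->insertNode(dyn);
    node->output()->replaceAllUsesWith(dyn->output());
    node->destroy();
  }
}
```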
Alternative approaches
- We can remove the `aten::size` dynamic (shape tensor) support, since we are moving to Dynamo; this keeps TorchScript more maintainable as resources transfer over.
- We can start patching converters and evaluators to handle shape tensors where these issues pop up today. The problem is that we implicitly throw out the type system, and we will not be able to tell whether data that is required at compile time has been subsumed into shape tensors. This approach makes converters more complicated and will likely introduce more bugs and model failures, with less clear messaging about why the system is failing.
- We can push the TorchScript frontend to do as much as possible in TensorRT at runtime, i.e. automatically freezing data as soon as it is available. This would still require evaluators to be rewritten as, essentially, new converters. It would also massively bloat the size of produced engines by performing intermediate operations that we can currently do at compile time, and would likely cost performance.
Example Workflow
1. For any operation that needs dynamic shape support, define a placeholder op `trt::[op]_dyn(...) -> (...)` where the type of the dynamic shape information is `Tensor`.
2. Register these placeholder ops alongside `trt::const`, with a translation map which defines mappings between static `aten` ops and dynamic `trt::[op]_dyn` ops. (`TensorRT/core/lowering/register_trt_placeholder_ops.cpp`, line 14 in `1d78f43`)
3. If the graph cannot be reconstructed with `dyn` ops, e.g. an `int[]` or `int` argument being provided a `Tensor`, provide an error stating what operations need dynamic alternatives.
4. Each `trt::[op]_dyn` converter should have a separate implementation from its static counterpart, written under the assumption that shape information is provided via an `ITensor`. If necessary, we can amend this so that `dyn` converters handle both `ITensor` and `int[]` cases, but this direction (static -> dynamic) is much easier to handle than the current situation (dynamic -> static): the converter would simply freeze the `int[]` into a shape tensor, much as `ITensorOrFreeze` does. In fact, we can add a convenience function, `ShapeTensorOrFreeze`, which would freeze `torch::Tensor`s or `int`s/`int[]`s and return a shape tensor (see the sketch below).
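A hedged sketch of what such a `ShapeTensorOrFreeze` helper could look like (the function does not exist yet; the signature here is an assumption, written against the raw TensorRT API rather than the converter `Var`/`ConversionCtx` types it would actually take):

```cpp
#include <NvInfer.h>

#include <cstdint>
#include <vector>

// If the argument already carries a shape tensor, pass it through; otherwise freeze
// the compile-time int[] into a 1D Int32 constant so the dyn converter always sees
// a shape tensor. Note: TensorRT keeps a pointer to the weights, so static_sizes
// must outlive engine building (the real helper would copy into managed storage).
nvinfer1::ITensor* ShapeTensorOrFreeze(
    nvinfer1::INetworkDefinition* net,
    nvinfer1::ITensor* maybe_shape,              // non-null when the arg was dynamic
    const std::vector<int32_t>& static_sizes) {  // used when the arg was a static int[]
  if (maybe_shape) {
    return maybe_shape;
  }
  nvinfer1::Weights w{nvinfer1::DataType::kINT32,
                      static_sizes.data(),
                      static_cast<int64_t>(static_sizes.size())};
  nvinfer1::Dims d;
  d.nbDims = 1;
  d.d[0] = static_cast<int32_t>(static_sizes.size());
  auto* const_layer = net->addConstant(d, w);
  return const_layer->getOutput(0);
}
```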
Implementation Phases
- Prototype - Large
- MVP - Large