Automatic int32 <=> int64 Datatype Conversion in Fallback
Goal(s)
The goal is to support automatic int64 <=> int32 data type conversion between TensorRT engines and Torch subgraphs in the partitioning system when needed.
Currently, Torch-TensorRT automatically truncates long and double data types to int32 and float32, since these two types are not supported by TensorRT. However, in some cases when partial compilation is enabled, some of the operators that run in Torch require int64 or float64 inputs. In this case, Torch-TensorRT needs to convert the data type from int32 back to int64 automatically.
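As a minimal sketch of the existing truncation rule (plain Python with illustrative dtype strings, not the actual Torch-TensorRT implementation):

```python
# Toy sketch of the existing truncation behavior: 64-bit types are
# narrowed to their 32-bit counterparts before being handed to TensorRT.
# Dtype names here are illustrative strings, not real torch dtypes.
TRUNCATION_MAP = {"int64": "int32", "float64": "float32"}

def truncate_dtype(dtype: str) -> str:
    """Return the dtype TensorRT will actually see for a given input dtype."""
    return TRUNCATION_MAP.get(dtype, dtype)

print(truncate_dtype("int64"))    # -> int32
print(truncate_dtype("float32"))  # -> float32 (unchanged)
```

The problem this RFC addresses is the reverse direction: once a value has been truncated to int32 for TensorRT, an operator falling back to Torch may still need it as int64.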
Usecase
This feature is useful when partial compilation (fallback) is enabled and one or more of the operators that run in Torch require int64 inputs.
Proposed APIs
We expect this datatype conversion to be performed automatically when necessary, so we do not propose new APIs, as such APIs might also introduce complexity here. However, the existing API `truncate_long_and_double` would need to be set to `true` to pass the compilation phase whenever an operator consumes int64 data.
Internals
Design
1. Tracking casting operations
Since Torch-TensorRT only supports 32-bit input data types, there must be a cast operation in the model that casts values from int32 to int64. At the graph level, this cast is performed through `aten::to`. This means that whenever an operator consumes an int64 input, there must be an `aten::to` operator that casts the value to int64 for that operator. So we need a list that stores all the `aten::to` operations, to track whether a value has been cast to int64. Concretely, we would need a dictionary by which we can look up the `aten::to` operator for any int64 value:

```cpp
std::unordered_map<torch::jit::Value*, torch::jit::Node*> casting_lut;
```

Through this LUT we can locate the corresponding cast operation and recover information such as the data type before casting.
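As an illustration, the LUT could be populated by a single pass over the graph, recording every `aten::to` node under the value it produces. The sketch below uses plain Python dictionaries in place of `torch::jit::Value*` / `torch::jit::Node*`; the node record layout is hypothetical:

```python
# Hypothetical node records standing in for torch::jit::Node: each node
# has a kind (e.g. "aten::to"), the value name it produces, and the
# dtype of its input before the cast.
def build_casting_lut(nodes):
    """Map each value produced by an aten::to cast to the node that produced it."""
    lut = {}
    for node in nodes:
        if node["kind"] == "aten::to":
            lut[node["output"]] = node
    return lut

graph = [
    {"kind": "aten::relu", "output": "v0", "dtype_before": None},
    {"kind": "aten::to",   "output": "v1", "dtype_before": "int32"},
]
lut = build_casting_lut(graph)
print(lut["v1"]["dtype_before"])  # -> int32
```

Given any value known to be int64, the LUT then answers both "which cast produced it?" and "what was its dtype before the cast?".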
2. Go through Torch segmented blocks
Since TensorRT segments can only run 32-bit data types, we simply truncate all int64 inputs to int32 for TensorRT segments. For Torch segments, however, there might be operations that only accept int64 inputs. It is not clear whether Torch provides a list or some detection mechanism indicating whether an op only accepts int64 inputs, so there are two scenarios:
2.1 If such a utility exists in Torch, we go through every operator in each Torch segment and check whether any op only consumes int64 data. If there is one, we do not truncate its input to int32; instead we insert an `aten::to` operation that converts the value to int64 for later use.
2.2 If no such utility exists, we can instead go through all Torch segments and check whether any input is int64. If an input is int64 and the value is produced by an `aten::to` in a TensorRT segment, then we insert an `aten::to` at the beginning of the Torch segment to recast the value to int64 for later use within that segment.
3. Insert casting operations for Torch segmented blocks
We need to insert an `aten::to` operator at the beginning of the Torch segment to recast the data type to int64, and we also need to insert another `aten::to` at the end of the Torch segment to recast the data type back to int32 for any int64 value that is used by later segments.
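Putting steps 2.2 and 3 together, here is a simplified sketch in plain Python (hypothetical segment/node records, not the real partitioning data structures): it prepends an int64 recast for each segment input that was produced by a tracked `aten::to`, and appends an int32 recast for each output consumed by later segments.

```python
def insert_recasts(torch_segment, casting_lut, used_later_as_int64):
    """Wrap a Torch segment with recast ops per steps 2.2 and 3.

    torch_segment: dict with "inputs" (value names) and "nodes" (op list).
    casting_lut: values known to have been cast to int64 by an aten::to
                 inside a TensorRT segment (and thus truncated to int32).
    used_later_as_int64: outputs of this segment that later segments
                         consume, which must be recast back to int32.
    """
    # Step 2.2: recast tracked inputs back to int64 at the segment start.
    prologue = [
        {"kind": "aten::to", "input": v, "target_dtype": "int64"}
        for v in torch_segment["inputs"] if v in casting_lut
    ]
    # Step 3: recast int64 values back to int32 at the segment end.
    epilogue = [
        {"kind": "aten::to", "input": v, "target_dtype": "int32"}
        for v in used_later_as_int64
    ]
    return prologue + torch_segment["nodes"] + epilogue

segment = {"inputs": ["v1", "v2"], "nodes": [{"kind": "aten::embedding"}]}
lut = {"v1": {"kind": "aten::to", "dtype_before": "int32"}}
new_nodes = insert_recasts(segment, lut, ["v3"])
print([n["kind"] for n in new_nodes])
# -> ['aten::to', 'aten::embedding', 'aten::to']
```

`aten::embedding` is used here only as an example of an op that requires int64 indices; the real implementation would operate on `torch::jit` graph nodes rather than dictionaries.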