Torch-TensorRT v1.2.0
PyTorch 1.12, Collections based I/O, FX Frontend, torchtrtc custom op support, CMake build system and Community Windows Support
Torch-TensorRT 1.2.0 targets PyTorch 1.12, CUDA 11.6, cuDNN 8.4 and TensorRT 8.4. This release focuses on a couple of key new APIs to handle function I/O that uses collection types, which should enable whole new model classes to be compiled by Torch-TensorRT without source code modification. It also introduces the "FX Frontend", a new frontend for Torch-TensorRT which leverages FX, a high-level IR built into PyTorch with extensive Python APIs. For use cases which do not need to run outside of Python, this may be a strong option to try, as it is easily extensible in a familiar development environment. In Torch-TensorRT 1.2.0, the FX frontend should be considered beta-level in stability.
`torchtrtc` has received improvements which target the ability to handle operators outside of the core PyTorch op set. This includes custom operators from libraries such as `torchvision` and `torchtext`. Similarly, users can provide custom converters to `torchtrtc` to extend the compiler's support from the command line, instead of having to write an application to do so. Finally, Torch-TensorRT introduces community-supported Windows and CMake support.

New Dependencies
nvidia-tensorrt
For previous versions of Torch-TensorRT, users had to install TensorRT via a system package manager and modify their `LD_LIBRARY_PATH` in order to set up Torch-TensorRT. Now users should install the TensorRT Python API as part of the installation procedure. Installing the TensorRT pip package allows Torch-TensorRT to automatically load the TensorRT libraries without any modification to environment variables, and it is also a necessary dependency for the FX frontend. This can be done via the following steps:
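A minimal sketch of those steps, assuming installation through NVIDIA's pip index (the exact version pin is an assumption based on this release targeting TensorRT 8.4):

```sh
# Make NVIDIA's package index visible to pip, then install the
# TensorRT Python package. The 8.4.x pin below is an assumption
# based on this release targeting TensorRT 8.4.
pip install nvidia-pyindex
pip install "nvidia-tensorrt==8.4.*"
```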
torchvision
Some FX frontend converters are designed to target operators from 3rd party libraries like torchvision. As such, you must have torchvision installed in order to use them. However, this dependency is optional for cases where you do not need this support.
Jetson
Starting from this release, we will be distributing precompiled binaries of our NGC release branches for aarch64 (as well as x86_64), beginning with ngc/22.07. These releases are designed to be paired with NVIDIA-distributed builds of PyTorch, including the NGC containers and Jetson builds, and are equivalent to the prepackaged distribution of Torch-TensorRT that comes in the containers. They represent the state of the master branch at the time of branch cutting, so they may lag in features by a month or so. These releases will come separately from minor version releases like this one. Therefore, going forward, these NGC releases should be the primary release channel used on Jetson (including for building from source).
NOTE: NGC PyTorch builds are not identical to builds you might install through normal channels like pytorch.org, and in the past this has caused portability issues between pytorch.org builds and NGC builds. Therefore, in workflows such as exporting a TorchScript module on an x86 machine and then compiling on Jetson, we strongly recommend using the NGC container release on x86 for your host machine operations. More information about Jetson support can be found alongside the 22.07 release (https://github.com/pytorch/TensorRT/releases/tag/v1.2.0a0.nv22.07).
Collections based I/O [Experimental]
Torch-TensorRT has previously operated under the assumption that `nn.Module` forward functions can be trivially reduced to the form `forward([Tensor]) -> [Tensor]`. Typically this implies functions of the form `forward(Tensor, Tensor, ..., Tensor) -> (Tensor, Tensor, ..., Tensor)`. However, as model complexity increases, grouping inputs may make it easier to manage many inputs. Therefore, function signatures similar to `forward([Tensor], (Tensor, Tensor)) -> [Tensor]` or `forward((Tensor, Tensor)) -> (Tensor, (Tensor, Tensor))` may become more common. In Torch-TensorRT 1.2.0, more of these kinds of use cases are supported using the new experimental `input_signature` compile spec API. This API allows users to group Input specs similar to how they might group the input Tensors they would use to call the original module's forward function. This informs Torch-TensorRT how to map a Tensor input from its location in a group to the engine, and from the engine back into its grouping returned to the user.

To make this concrete, consider the following standard case:
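A sketch of what that standard case might look like (the module and shapes here are illustrative, not taken from the release notes):

```python
import torch
import torch.nn as nn
import torch_tensorrt

# Hypothetical module taking two explicit Tensor inputs
class StandardTensorInput(nn.Module):
    def forward(self, x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
        return x + y

module = torch.jit.script(StandardTensorInput().eval().cuda())

# Existing list-based API: one Input spec per positional Tensor argument
trt_module = torch_tensorrt.compile(
    module,
    inputs=[
        torch_tensorrt.Input(shape=[1, 3]),
        torch_tensorrt.Input(shape=[1, 3]),
    ],
    enabled_precisions={torch.float},
)

out = trt_module(torch.randn(1, 3).cuda(), torch.randn(1, 3).cuda())
```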
Here a user has defined two explicit Tensor inputs and used the existing list-based API to define the input specs.
With Torch-TensorRT, the following use cases are now possible using the new `input_signature` API:
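For instance, a sketch of compiling a module whose forward takes a tuple of Tensors (the exact nesting expected by `input_signature` is an assumption based on the description below):

```python
from typing import Tuple

import torch
import torch.nn as nn
import torch_tensorrt

# Hypothetical module taking a single Tuple of Tensors
class TupleInput(nn.Module):
    def forward(self, z: Tuple[torch.Tensor, torch.Tensor]) -> torch.Tensor:
        return z[0] + z[1]

module = torch.jit.script(TupleInput().eval().cuda())

# Group the Input specs the same way the Tensors would be grouped
# when calling forward: one argument, which is a tuple of two specs.
trt_module = torch_tensorrt.compile(
    module,
    input_signature=(
        (torch_tensorrt.Input(shape=[1, 3]), torch_tensorrt.Input(shape=[1, 3])),
    ),
)

a, b = torch.randn(1, 3).cuda(), torch.randn(1, 3).cuda()
out = trt_module((a, b))
```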
Note how the input specs (in this case just example tensors) are provided to the compiler. The `input_signature` argument expects a `Tuple[Union[torch.Tensor, torch_tensorrt.Input, List, Tuple]]` grouped in a format representative of how the function would be called; in these cases it is just a list or tuple of specs. More advanced cases are supported as well:
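For example, mixing a list and a tuple of inputs might look like the following (again a sketch; the grouping format is assumed to mirror the call signature):

```python
from typing import List, Tuple

import torch
import torch.nn as nn
import torch_tensorrt

# Hypothetical module mixing a List and a Tuple of Tensors
class ListTupleInput(nn.Module):
    def forward(
        self, xs: List[torch.Tensor], yz: Tuple[torch.Tensor, torch.Tensor]
    ) -> torch.Tensor:
        return xs[0] + xs[1] + yz[0] + yz[1]

module = torch.jit.script(ListTupleInput().eval().cuda())

# Specs are grouped exactly like the call arguments:
# a list of two specs, then a tuple of two specs.
trt_module = torch_tensorrt.compile(
    module,
    input_signature=(
        [torch_tensorrt.Input(shape=[1, 3]), torch_tensorrt.Input(shape=[1, 3])],
        (torch_tensorrt.Input(shape=[1, 3]), torch_tensorrt.Input(shape=[1, 3])),
    ),
)
```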
These features are also supported in the C++ API.
Currently this feature should be considered experimental; the APIs may be subject to change or be folded into existing APIs. There are also limitations introduced by using this feature, including the following:

- Certain collection types (e.g. `Dict`, `namedtuple`) are not supported
- `require_full_compilation` cannot be used while using this feature

These limitations will be addressed in subsequent versions.
Adding FX frontend to Torch-TensorRT [Beta]
This release adds FX as one of the supported IRs for converting torch models to TensorRT, through the new FX frontend. At a high level, this path transforms the model into (or directly consumes) an FX graph and, similar to the TorchScript frontend, converts the graph to TensorRT through the use of a library of converters. The key difference is that it is implemented purely in Python. The role of this FX frontend is to supplement the TorchScript lowering path and to provide users better ease of use and easier extensibility in use cases where removing Python as a dependency is not strictly necessary. Detailed user instructions can be found in the documentation.
The FX path examples are located under `//examples/fx`, and the FX path unit tests are located under `//py/torch_tensorrt/fx/tests`.
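A minimal sketch of selecting the FX frontend through the top-level compile API (the `resnet18` model and input shape are illustrative):

```python
import torch
import torch_tensorrt
import torchvision.models as models

model = models.resnet18(pretrained=True).eval().cuda()
inputs = [torch.randn(1, 3, 224, 224).cuda()]

# ir="fx" routes compilation through the FX frontend instead of
# the default TorchScript path.
trt_model = torch_tensorrt.compile(
    model,
    ir="fx",
    inputs=inputs,
    enabled_precisions={torch.float},
)

out = trt_model(*inputs)
```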
Custom operators and converters in Torch-TensorRT
While both the C++ API and Python API provide systems to include and convert custom operators in your model (for instance, those implemented in `torchvision`), `torchtrtc` has been limited to the core opset. In Torch-TensorRT 1.2.0, two new flags have been added to `torchtrtc`. These arguments accept paths to `.so` or DLL files which define custom operators for PyTorch or custom converters for Torch-TensorRT. These files will get `dlopen`'d at runtime to extend the op and converter libraries. For example:
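A hedged sketch of such an invocation (the flag spellings and library names below are assumptions based on the description above; consult `torchtrtc --help` for the exact names):

```sh
# Compile a TorchScript module while loading a library of custom
# PyTorch ops and a library of custom Torch-TensorRT converters.
# Flag names here are assumptions, not confirmed by the release notes.
torchtrtc model.ts trt_model.ts "(1,3,224,224)" \
    --custom-torch-ops=libcustom_ops.so \
    --custom-converters=libcustom_converters.so
```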
Community CMake and Windows support
Thanks to the great work of @gcuendet and others, CMake and, consequently, Windows support has been added to the project! Users on Linux and Windows can now build the C++ API using this system, and by using `torch_tensorrt_runtime.dll` they can add support for executing Torch-TensorRT programs on Windows in both Python and C++. Detailed information on how to use this build system can be found here: https://pytorch.org/TensorRT/getting_started/installation.html

Bazel will continue to be the primary build system for the project, and all testing and distributed builds will be built and run with Bazel (including future official Windows support), so users should consider it still the canonical version of Torch-TensorRT. However, we aim to ensure as best we can that the CMake system is able to build the project properly, including on Windows. Contributions to continue growing support for this build system and for Windows as a platform are definitely welcome.
Known Limitations
- Certain collection types (e.g. `Dict`, `namedtuple`) are not supported
- `require_full_compilation` cannot be used while using collection-based I/O

Dependencies
=================================

- PyTorch 1.12
- CUDA 11.6
- cuDNN 8.4
- TensorRT 8.4
Operators Supported (TorchScript)
Operators Currently Supported Through Converters
Operators Currently Supported Through Evaluators
What's Changed
New Contributors
Full Changelog: v1.1.1...v1.2.0