❓ [Question] Lowering pass for PyTorch Linear #529

Answered by peri044
842974287 asked this question in Q&A
https://docs.nvidia.com/deeplearning/tensorrt/best-practices/index.html#optimize-layer
It says "A new development is encouraged to use MatrixMultiply in preference to FullyConnected layers for consistency of interface. Matrix multiplication is generally significantly faster in FP16 Tensor Cores compared to FP32."
Fully connected layers are expressed as convolutions. I'm not sure whether there would be any performance difference for the layer dimensions typically used in final classification layers.
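To illustrate the equivalence mentioned above, here is a small sketch (not from the discussion) showing that an `nn.Linear` computes the same function as a 1x1 `nn.Conv2d` with reshaped weights, which is essentially how a FullyConnected layer gets expressed as a convolution:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# A Linear layer: 64 input features -> 10 classes
linear = nn.Linear(64, 10)

# An equivalent 1x1 convolution: copy the Linear weights, reshaped to
# (out_channels, in_channels, 1, 1), and reuse the same bias.
conv = nn.Conv2d(64, 10, kernel_size=1)
with torch.no_grad():
    conv.weight.copy_(linear.weight.view(10, 64, 1, 1))
    conv.bias.copy_(linear.bias)

x = torch.randn(8, 64)
y_linear = linear(x)
# Treat each sample as a 64-channel 1x1 "image" for the convolution
y_conv = conv(x.view(8, 64, 1, 1)).view(8, 10)

assert torch.allclose(y_linear, y_conv, atol=1e-5)
```

The two outputs match numerically, which is why the choice between the two formulations is a performance question rather than a correctness one.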

One reason to do this in TRTorch is that we usually keep our converter library light. Any operations that can be expressed as a composition of existing converters would be done so, unless we know for su…
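As a rough sketch of the composition idea (an illustration, not TRTorch's actual lowering pass), `aten::linear` can be decomposed into operations that already have converters, a matrix multiply followed by a bias add:

```python
import torch
import torch.nn.functional as F

def lowered_linear(x, weight, bias):
    # linear(x, W, b) rewritten as matmul + add, mirroring how a
    # lowering pass would decompose the op into existing converters
    return torch.matmul(x, weight.t()) + bias

x = torch.randn(4, 16)
w = torch.randn(8, 16)   # (out_features, in_features)
b = torch.randn(8)

assert torch.allclose(lowered_linear(x, w, b), F.linear(x, w, b), atol=1e-5)
```

Keeping the converter library to such primitives means each new composite op only needs a graph rewrite, not a new converter.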

Answer selected by narendasan
Category: Q&A
Labels: question (Further information is requested)
3 participants
This discussion was converted from issue #526 on July 12, 2021 16:14.