Update MarkDequantization transformation #27406
base: master
My main concern is that we would start propagating rt_info that shouldn't be propagated. I have an alternative idea for fixing the described issue.
Currently, MarkDequantizationSubgraph does several markups including:
- disable_constant_folding markup for converts
- enable_keep_const_precision markup for constants
- The rest markup (mark_as_dequantization_node or unmark_as_decompression)
What if we move enable_keep_const_precision markup into a separate matcher pass, and organize the markup pipeline in the following way?
- First pass will have the same matcher as MarkDequantizationSubgraph currently has, and will perform only disable_constant_folding markup.
- Then, ConstantFolding will be called, and all the intermediate layers between weights and convert will be folded. After that, the subgraph will have canonical form (low precision weights -> Convert -> (optional) Subtract -> Multiply).
- At the end, enable_keep_const_precision markup will match the subgraphs with canonical form and mark constant nodes in this pattern.
What do you think?
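The three-stage ordering proposed above can be illustrated with a small self-contained sketch. This is a toy model and not OpenVINO code: `Node`, `make`, `find_convert`, `fold`, `build_example`, and `run_markup_pipeline` are hypothetical names introduced only for illustration, and the markup attributes are modeled as plain strings in a set rather than real rt_info entries.

```cpp
#include <cassert>
#include <memory>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a graph node; this is NOT the ov::Node API.
struct Node {
    std::string type;                           // "Constant", "Reshape", "Convert", ...
    bool low_precision = false;                 // true for i8/u8/i4/u4 constants
    std::vector<std::shared_ptr<Node>> inputs;  // producers of this node's inputs
    std::set<std::string> rt_info;              // markup attached to this node
};
using NodePtr = std::shared_ptr<Node>;

NodePtr make(const std::string& type, std::vector<NodePtr> inputs = {}, bool lp = false) {
    auto n = std::make_shared<Node>();
    n->type = type;
    n->inputs = std::move(inputs);
    n->low_precision = lp;
    return n;
}

// Returns the Convert feeding a Multiply (optionally through a Subtract), or nullptr.
NodePtr find_convert(const NodePtr& multiply) {
    if (multiply->type != "Multiply" || multiply->inputs.empty()) return nullptr;
    NodePtr n = multiply->inputs[0];
    if (n->type == "Subtract" && !n->inputs.empty()) n = n->inputs[0];
    return n->type == "Convert" ? n : nullptr;
}

// Stage 1: only protect the Convert from being folded.
void mark_disable_constant_folding(const NodePtr& root) {
    if (auto convert = find_convert(root)) convert->rt_info.insert("disable_constant_folding");
}

// Stage 2: fold every unmarked node whose inputs are all constants (bottom-up).
void fold(const NodePtr& n) {
    for (auto& in : n->inputs) fold(in);
    if (n->inputs.empty() || n->rt_info.count("disable_constant_folding")) return;
    for (auto& in : n->inputs)
        if (in->type != "Constant") return;
    n->low_precision = n->inputs[0]->low_precision;  // assumption: folded op keeps data precision
    n->type = "Constant";
    n->inputs.clear();
}

// Stage 3: match only the canonical form and mark the low-precision weights.
void mark_keep_const_precision(const NodePtr& root) {
    auto convert = find_convert(root);
    if (!convert || convert->inputs.empty()) return;
    auto& weights = convert->inputs[0];
    if (weights->type == "Constant" && weights->low_precision)
        weights->rt_info.insert("keep_const_precision");
}

// weights -> Reshape(target_shape) -> Convert -> Subtract -> Multiply
NodePtr build_example() {
    auto weights = make("Constant", {}, true);
    auto reshape = make("Reshape", {weights, make("Constant")});
    auto convert = make("Convert", {reshape});
    auto subtract = make("Subtract", {convert, make("Constant")});
    return make("Multiply", {subtract, make("Constant")});
}

void run_markup_pipeline(const NodePtr& root) {
    mark_disable_constant_folding(root);
    fold(root);
    mark_keep_const_precision(root);
}
```

In this toy model the last stage would not match before folding, because the Convert is still fed by a Reshape. After folding, the Reshape and its shape constant collapse into a single low-precision constant, so only that constant receives the mark.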
src/common/low_precision_transformations/tests/mark_dequantization_subgraph_transformation.cpp
src/common/transformations/include/transformations/rt_info/keep_const_precision.hpp
Yes, it's possible. But wouldn't that complicate the already complicated pipeline? e.g.
"target_shape" Const will keep the original precision and won't be affected by the ConvertPrecision transformation. I believe we can expect constant subgraphs to fold, so the issues do not arise.
Theoretically, we can have the subgraphs, which are marked by In this case, the dequantization subgraph has canonical form, and f32->u8 convert allows
It seems like rt_info from Convert ops is not propagated through the graph during LPT. However, LPT is not the only interaction point with such canonical subgraphs: decompression-related passes can also fuse ops from the subgraph into a separate operation, e.g. So In conclusion, we will most likely not get any problems from the copyable
Ok, in this case, I will split the transformation into 2:
…keep_const_precision_attr
…github.com/itikhono/openvino into itikhono/bug_fix/keep_const_precision_attr
CPU functional test failure is not related to this PR. Additional perf validation will be triggered for this PR.
src/common/transformations/src/transformations/low_precision/mark_dequantization_subgraph.cpp
using namespace ov;

TEST_F(TransformationTestsF, MarkDequantizationSubgraphTransformation) {
TEST_F(TransformationTestsF, KeepConstPrecision) {
Can we also compare nodes' rt_info in this test?
Done, but it looks like the rt_info comparison doesn't work, the same as in other tests. I will double-check.
std::vector<ov::element::Type>{ ov::element::i8, ov::element::u8, ov::element::i4, ov::element::u4 });
}

//if (enableInt8) { Why do we need this check? According to the line 378 we did this marking anyway
todo: clarify this with GPU team
need a confirmation from the GPU team
src/common/transformations/src/transformations/common_optimizations/moc_transformations.cpp
No comments left from my side (except minor existing ones). Good job 👍
The fix for the Or pattern was moved to a separate PR: #27721
https://github.com/itikhono/openvino into itikhono/bug_fix/keep_const_precision_attr
Details:
Original issue description:
Changes:
Tickets: