Update MarkDequantization transformation #27406

Open · wants to merge 22 commits into master

Conversation

@itikhono (Contributor) commented Nov 5, 2024

Details:

Original issue description:

Inside MarkDequantizationSubgraph, we try to find the dequantization subgraph and mark it with the DisableConstantFolding and KeepConstPrecision attributes.
The dequantization subgraph has a relaxed structure and allows "any_input" as the input of the Convert operation.

From MarkDequantizationSubgraph:
[image: the pattern definition in MarkDequantizationSubgraph]

In the current case this is:
(any input: Constant -> Reshape) -> (required Convert)

Constant -> Reshape will be constant-folded. ConstantFolding doesn't copy data for a Reshape operation, so this is a valid scenario.

The MarkDequantizationSubgraph transformation marks the Reshape op with the "KeepConstPrecision" attribute.
But after ConstantFolding, the KeepConstPrecision attribute won't be copied to the resulting Constant, because the attribute is non-copyable:
[image: the KeepConstPrecision attribute definition]
so the whole dequantization subgraph will be constant-folded.
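
A minimal sketch of the scenario, assuming the standard OpenVINO opset and the public rt_info helpers (illustrative, not the exact code from this PR):

#include "openvino/core/model.hpp"
#include "openvino/opsets/opset8.hpp"
#include "openvino/pass/constant_folding.hpp"
#include "transformations/rt_info/keep_const_precision.hpp"

using namespace ov;

std::shared_ptr<Model> build_marked_subgraph() {
    // (any input: Constant -> Reshape) -> Convert -> Multiply
    auto weights = opset8::Constant::create(element::u8, Shape{4}, {1, 2, 3, 4});
    auto shape = opset8::Constant::create(element::i64, Shape{2}, {2, 2});
    auto reshape = std::make_shared<opset8::Reshape>(weights, shape, false);
    auto convert = std::make_shared<opset8::Convert>(reshape, element::f32);
    auto scale = opset8::Constant::create(element::f32, Shape{}, {0.1f});
    auto multiply = std::make_shared<opset8::Multiply>(convert, scale);

    // MarkDequantizationSubgraph marks the Convert's input (here: the Reshape)
    // and disables folding of the Convert itself.
    enable_keep_const_precision(reshape);
    disable_constant_folding(convert);
    return std::make_shared<Model>(OutputVector{multiply}, ParameterVector{});
}

// pass::ConstantFolding then folds Constant -> Reshape into a new Constant, but
// the non-copyable KeepConstPrecision attribute is lost in the process, so
// later passes are free to fold the whole dequantization subgraph.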

Changes:

  • MarkDequantizationSubgraph logic was split into two transformations: MarkDequantization and KeepConstPrecision

Tickets:

@itikhono itikhono requested review from a team as code owners November 5, 2024 12:19
@itikhono itikhono requested review from Lyamin-Roman and removed request for a team November 5, 2024 12:19
@github-actions github-actions bot added the "category: transformations" and "category: LP transformations" labels Nov 5, 2024
@itikhono itikhono added this to the 2025.0 milestone Nov 5, 2024
@v-Golubev (Contributor) left a comment:

My main concern is that we start propagating the rt_info which shouldn't be propagated. I have an alternative idea on how the described issue can be fixed.

Currently, MarkDequantizationSubgraph does several markups including:

  1. disable_constant_folding markup for converts
  2. enable_keep_const_precision markup for constants
  3. The rest of the markup (mark_as_dequantization_node or unmark_as_decompression)

What if we move the enable_keep_const_precision markup into a separate matcher pass, and organize the markup pipeline in the following way (a sketch follows the list)?

  1. The first pass will have the same matcher as MarkDequantizationSubgraph currently has, and will perform only the disable_constant_folding markup.
  2. Then ConstantFolding will be called, and all the intermediate layers between the weights and the Convert will be folded. After that, the subgraph will have the canonical form (low-precision weights -> Convert -> (optional) Subtract -> Multiply).
  3. At the end, the enable_keep_const_precision pass will match subgraphs with the canonical form and mark the constant nodes in this pattern.
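
A rough sketch of this ordering (pass names are taken from the Changes section above; header path and constructor signatures are assumptions, not verified against the final PR code):

#include "openvino/pass/constant_folding.hpp"
#include "openvino/pass/manager.hpp"
#include "transformations/low_precision/mark_dequantization_subgraph.hpp"

void run_markup(const std::shared_ptr<ov::Model>& model) {
    const ov::element::TypeVector precisions{ov::element::i8, ov::element::u8,
                                             ov::element::i4, ov::element::u4};
    ov::pass::Manager manager;
    // 1. Only disable constant folding on the dequantization Converts.
    manager.register_pass<ov::pass::MarkDequantization>(precisions);
    // 2. Fold the intermediate ops between the weights and the Convert,
    //    bringing the subgraph to its canonical form.
    manager.register_pass<ov::pass::ConstantFolding>();
    // 3. Match the canonical form (weights -> Convert -> (Subtract) -> Multiply)
    //    and mark only its Constants with KeepConstPrecision.
    manager.register_pass<ov::pass::KeepConstPrecision>(precisions);
    manager.run_passes(model);
}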

What do you think?

@itikhono (Contributor, Author) commented Nov 6, 2024

> [quotes @v-Golubev's comment above in full]

Yes, it's possible. But wouldn't that complicate the already complicated pipeline?
As far as I can see, KeepConstPrecision is used only for Constants in the ConvertPrecision transformation; the flag is ignored on any other marked op. So copying this attribute is unsafe precisely for Constants, not for other ops.

e.g.

  1. Initial subgraph: Parameter -> Reshape (with a "target_shape" Const as an input) -> Convert
  2. The Reshape is marked with the KeepConstPrecision flag
  3. Some transformation replaces the Reshape: Const -> Reshape (the new "target_shape" Const gets the copied KeepConstPrecision)

The "target_shape" Const will keep its original precision and won't be affected by the ConvertPrecision transformation.
This may cause some implicit issues.
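
A hypothetical sketch of that leak (the helper rebuild_reshape and the shapes are illustrative only; the rt_info helpers are the public OpenVINO ones):

#include "openvino/core/graph_util.hpp"  // copy_runtime_info, replace_node
#include "openvino/opsets/opset8.hpp"

// If KeepConstPrecision were copyable, a transformation that rebuilds a marked
// Reshape would leak the attribute onto its new "target_shape" Constant.
void rebuild_reshape(const std::shared_ptr<ov::opset8::Reshape>& old_reshape) {
    auto data = old_reshape->input_value(0);
    auto new_shape = ov::opset8::Constant::create(ov::element::i64, ov::Shape{2}, {2, 2});
    auto new_reshape = std::make_shared<ov::opset8::Reshape>(data, new_shape, false);
    // Copying rt_info from replaced nodes to new ones is standard practice in
    // transformations; a copyable KeepConstPrecision would land on new_shape,
    // so ConvertPrecision would later skip this Constant (e.g. keep i64 where
    // i32 is expected), which is the implicit issue described above.
    ov::copy_runtime_info(old_reshape, {new_reshape, new_shape});
    ov::replace_node(old_reshape, new_reshape);
}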

I believe we can expect constant subgraphs to be folded, so these issues would not arise.
Do we have non-constant (Parameter-dependent) subgraphs matching this pattern in real scenarios?
Can this flag somehow spread beyond the dequantization subgraph?

@v-Golubev (Contributor) replied:

> Do we have non-constant (Parameter-dependent) subgraphs matching this pattern in real scenarios?

Theoretically, subgraphs marked by MarkDequantizationSubgraph can appear on a non-constant path. For example, LPT supports the following subgraphs coming from some frontends (a picture from the LPT docs):
[image: quantize/dequantize subgraph on the data path, from the LPT documentation]

In this case, the dequantization subgraph has the canonical form, and the f32->u8 Convert allows MarkDequantizationSubgraph to match on this Convert as the data node by its precision. Such subgraphs are placed both on the data flow and on the weights.

> Can this flag somehow spread beyond the dequantization subgraph?

It seems that rt_info from Convert ops is not propagated through the graph during LPT. However, LPT is not the only interaction point with such canonical subgraphs: decompression-related passes can also fuse ops from the subgraph into a separate operation, e.g. GatherCompressed. On your branch, we get the following graph after the fusion:
[image: graph after the GatherCompressed fusion]

So the keep_const_precision attribute ends up on the non-constant flow after the fusion. Currently we have no transformations which work with GatherCompressed, but there is no guarantee that such optimizations will not appear in the future, and with such unwanted rt_info propagation we could hit the problems you described in your example.

In conclusion, we will most likely not hit any problems from a copyable KeepConstPrecision attribute in the current pipeline, but such a change lays the foundation for potential future problems.

@itikhono (Contributor, Author) commented Nov 7, 2024

> [quotes @v-Golubev's reply above in full]

Ok, in this case I will split the transformation into two:

  1. disable_constant_folding markup
  2. enable_keep_const_precision markup

@itikhono itikhono requested review from a team as code owners November 13, 2024 11:51
@itikhono itikhono requested review from tadamczx and removed request for a team November 13, 2024 11:51
@itikhono itikhono changed the title from "Make KeepConstPrecision attribute copyable" to "Update MarkDequantization transformation" Nov 13, 2024
@itikhono (Contributor, Author):

The CPU functional test failure is not related to this PR.
The GPU test failure is under investigation.

Additional performance validation will be triggered for this PR.


On the test diff:

using namespace ov;

- TEST_F(TransformationTestsF, MarkDequantizationSubgraphTransformation) {
+ TEST_F(TransformationTestsF, KeepConstPrecision) {
A Contributor commented:

Can we also compare nodes' rt_info in this test?

@itikhono (Contributor, Author) replied:

done
but it looks like the rt_info comparison doesn't work, the same with other tests; I will double-check
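
For reference, a likely way to enable that comparison, assuming TransformationTestsF exposes the standard FunctionsComparator from the common test utils:

// Enables comparison of nodes' rt_info between the actual and reference models.
comparator.enable(FunctionsComparator::CmpValues::RUNTIME_KEYS);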

@github-actions github-actions bot added the "category: Core" label Nov 21, 2024
On the diff where the enableInt8 check was commented out:

std::vector<ov::element::Type>{ov::element::i8, ov::element::u8, ov::element::i4, ov::element::u4});
}

// if (enableInt8) {

Why do we need this check? According to line 378, we did this marking anyway.
@itikhono (Contributor, Author):

todo: clarify this with the GPU team

@itikhono (Contributor, Author):

need a confirmation from the GPU team

@v-Golubev (Contributor) left a comment:

No comments left from my side (except minor existing ones). Good job 👍

@itikhono (Contributor, Author):

The fix for the Or pattern was moved to a separate PR: #27721

@github-actions github-actions bot removed the "category: Core" label Nov 26, 2024
Labels: category: CPU, category: docs, category: GPU, category: LP transformations, category: ONNX FE, category: transformations, under_perf_check