NPUW: Deref #27767

dmatveev · 2024-11-27T03:59:27Z

Details:

item1
...

Tickets:

ticket-id

+        if (proto_comp_model_desc.device_it + 1 == m_dev_list.end()) {
+            LOG_INFO("No fallback expected - clear the OV model for Subgraph[" << idx << "]");
+            proto_comp_model_desc.model.reset();
+            // proto_comp_model_desc.compiled_model = {}; // Shouldn't be here, CPU only


NB: It seems the CPU plugin's compiled models hold a pointer to the ov::Model, and therefore to the original weights. That shouldn't be the case for NPU.

UPD. Also the case for CPU
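
A minimal sketch of the check in the hunk above, under stated assumptions: `Model`, `SubgraphDesc`, and `drop_model_if_no_fallback` are hypothetical stand-ins, not NPUW types. It only illustrates the idea that when a subgraph's device iterator points at the last entry of the device list, no fallback compilation can follow, so the descriptor's reference to the model can be dropped. As the note above implies, the memory is actually freed only if nothing else (for example, a compiled model) still holds a reference.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for ov::Model: just owns some "weights".
struct Model {
    std::vector<float> weights;
};

// Hypothetical stand-in for the per-subgraph descriptor in the hunk above.
struct SubgraphDesc {
    std::vector<std::string>::const_iterator device_it;  // device currently in use
    std::shared_ptr<Model> model;                        // reference to the original model
};

// Drops the descriptor's reference to the model when device_it points at the
// last device in the list, i.e. there is nothing left to fall back to.
// Returns true if the reference was dropped.
inline bool drop_model_if_no_fallback(SubgraphDesc& desc, const std::vector<std::string>& dev_list) {
    assert(desc.device_it != dev_list.end());
    if (desc.device_it + 1 == dev_list.end()) {
        desc.model.reset();
        return true;
    }
    return false;
}
```

In this sketch a `std::weak_ptr` to the model expires after the call only when the descriptor held the last reference, which is one way to check whether some other owner keeps the weights alive.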

dmatveev · 2024-11-27T04:05:48Z

src/plugins/intel_npu/src/plugin/npuw/weights_bank.cpp

+template<typename F>
+void non_parallel_for(std::size_t count, F &&f) {
+    for (std::size_t idx = 0u; idx < count; idx++) {
+        f(idx);
    }
 }


Can be removed (or moved to util as a debug replacement for ov::parallel_for).
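
If the helper is kept as a debug aid, it could look roughly like the sketch below. The namespace name `debug_util` and the `collect_indices` usage helper are assumptions made for illustration; nothing here comes from the OpenVINO tree except the loop shape shown in the hunk. The point is that the body runs serially and in index order, which makes it easier to tell whether a failure comes from the loop body itself or from running it concurrently.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

namespace debug_util {  // hypothetical namespace

// Serial loop with a (count, callable) shape, so a call site written for a
// parallel loop of that shape can be switched over while debugging.
template <typename F>
void non_parallel_for(std::size_t count, F&& f) {
    for (std::size_t idx = 0u; idx < count; idx++) {
        f(idx);
    }
}

// Usage example: record the order in which indices are visited.
inline std::vector<std::size_t> collect_indices(std::size_t count) {
    std::vector<std::size_t> visited;
    non_parallel_for(count, [&](std::size_t idx) {
        visited.push_back(idx);  // safe without locking: the loop is serial
    });
    assert(visited.size() == count);
    return visited;
}

}  // namespace debug_util
```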

dmatveev · 2024-11-27T04:06:26Z

src/plugins/intel_npu/src/plugin/npuw/weights_bank.cpp

+        // FIXME: Uncomment it later (after the CPU copy revert)
+        // const auto &device_str = bank.first;
+        // if (device_str == "CPU") {
+        //     // CPU memory is non-detachable
+        //     continue;
+        // }


Should be uncommented once tested on device
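
For reference, a self-contained sketch of what the commented-out guard would do once enabled. `Tensor`, `DeviceBank`, and `detach_non_cpu` are hypothetical stand-ins, and "detaching" is modelled here as simply clearing a bank's storage, which is an assumption: the actual detach logic is not shown in this hunk. Only the skip on the `"CPU"` key mirrors the commented code.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

using Tensor = std::vector<float>;  // hypothetical stand-in for ov::Tensor

struct DeviceBank {  // hypothetical: per-device tensor storage
    std::map<int, Tensor> storage;
};

// Walks all device banks and releases their tensors, except for the CPU bank.
// Returns the number of tensors released.
inline std::size_t detach_non_cpu(std::map<std::string, DeviceBank>& banks) {
    std::size_t released = 0;
    for (auto& bank : banks) {
        const auto& device_str = bank.first;
        if (device_str == "CPU") {
            // CPU memory is non-detachable
            continue;
        }
        released += bank.second.storage.size();
        bank.second.storage.clear();  // placeholder for the real detach step
        assert(bank.second.storage.empty());
    }
    return released;
}
```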

dmatveev · 2024-11-27T04:06:50Z

src/plugins/intel_npu/src/plugin/npuw/weights_bank.cpp

+        // REVERTME:{{{
+        // Store a copy of the tensor memory even on CPU - to simulate
+        // bank load.
+
+        ov::Tensor new_tensor(transformed_tensor.get_element_type(), transformed_tensor.get_shape());
+        dbank.storage[tensor] = new_tensor;
+        guard.unlock();
+
+        transformed_tensor.copy_to(new_tensor);
+        return new_tensor;
+        // Old code here:
+        // m_device_bank[device_for_alloc][tensor] = transformed_tensor;
+        // return transformed_tensor;
+        // REVERTME:}}}


Should be reverted once tested on-device
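
The locking pattern in the temporary code above (allocate and register under the lock, then copy after unlocking) can be sketched in isolation as follows. This is an illustration under assumptions, not the PR's code: `Buffer`, `DeviceBank`, and `store_copy` are hypothetical, and a `std::shared_ptr` to a vector stands in for ov::Tensor, which is assumed to behave as a shared handle to its memory.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <memory>
#include <mutex>
#include <vector>

using Buffer = std::shared_ptr<std::vector<float>>;  // hypothetical shared handle

struct DeviceBank {  // hypothetical: storage protected by a mutex
    std::mutex mutex;
    std::map<int, Buffer> storage;
};

// Registers a freshly allocated buffer under the lock, then fills it with a
// copy of `transformed` after the lock is released, so the potentially long
// copy does not block other users of the bank.
inline Buffer store_copy(DeviceBank& dbank, int key, const std::vector<float>& transformed) {
    std::unique_lock<std::mutex> guard(dbank.mutex);
    auto new_buf = std::make_shared<std::vector<float>>(transformed.size());
    dbank.storage[key] = new_buf;
    guard.unlock();

    assert(new_buf->size() == transformed.size());
    std::copy(transformed.begin(), transformed.end(), new_buf->begin());
    return new_buf;
}
```

One property of this sketch worth keeping in mind: the entry becomes visible in `storage` before the copy finishes, so a concurrent reader of the same key could observe a partially filled buffer. Whether that matters for the real bank depends on code not shown here.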

dmatveev added 4 commits November 27, 2024 00:39

NPUW: Deref annotations (REMOVE ME!!)

55bb5c3

NPUW Deref: Fix cycling links in Group/Graph (Online partitioning)

64421c8

NPUW Deref: Hotfix the new LazyTensor & Weight Bank issues

783e83b

- LazyTensor wasn't prepared for all cases and was crashing in Unpack
- Weight bank had a race (device bank was not protected)

NPUW Deref: Introduce the detach() routine - draft

c2ad6fe

github-actions bot added labels Nov 27, 2024: category: Core (OpenVINO Core, aka ngraph); category: CPP API (OpenVINO CPP API bindings); category: NPU (OpenVINO NPU plugin); category: NPUW (NPUW plugin)

dmatveev added 3 commits November 27, 2024 11:14

NPUW Deref: clean-up CPU changes, move detach to eval() - draft

ef5fde0

NPUW Deref: clean-up before the NPU plugin changes

970fb64

NPUW Deref: fix clang format in the code which will make it to merge

ab8d1b9
