NPUW: Hotfix - delay the original weight memory deallocation (24.6) (openvinotoolkit#27887)

### Details:
- Port openvinotoolkit#27886 to releases/2024/5

### Tickets:
 - *ticket-id*
dmatveev authored Dec 4, 2024
1 parent 104402a commit 398f703
Showing 1 changed file with 6 additions and 3 deletions.
9 changes: 6 additions & 3 deletions src/plugins/intel_npu/src/plugin/npuw/weights_bank.cpp
@@ -110,9 +110,6 @@ ov::Tensor Bank::eval_and_alloc(const LazyTensor& tensor,
return transformed_tensor;
}

-// Non-CPU case: detach the evaluated LazyTensor from its memory
-const_cast<LazyTensor&>(tensor).detach();
-
ov::SoPtr<ov::ITensor> remote_tensor;
ov::Tensor allocated_tensor;

@@ -124,6 +121,12 @@ ov::Tensor Bank::eval_and_alloc(const LazyTensor& tensor,
guard.unlock(); // Unlock the guard, map update is done - copy can continue in parallel

transformed_tensor.copy_to(allocated_tensor);
+
+// Detach the evaluated LazyTensor from its memory here - when it is 100%
+// not needed anymore (transformations, if any, and copies are done)
+// Note: this is the non-CPU path!
+const_cast<LazyTensor&>(tensor).detach();
+
return allocated_tensor;
}
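
The reordering in this change can be illustrated with a minimal, self-contained sketch. It is not the NPUW implementation: `LazyWeights`, its `eval()`/`detach()`/`attached()` members, and the free function `eval_and_alloc` below are hypothetical stand-ins, and the sketch assumes that the evaluated data may still alias the lazy object's source memory. Under that assumption, the source memory has to stay alive until the copy into the newly allocated buffer has finished, so `detach()` is called last.

```cpp
#include <algorithm>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for a lazily evaluated weight tensor.
// It owns the source memory until detach() is called.
class LazyWeights {
public:
    explicit LazyWeights(std::vector<float> data)
        : m_storage(std::make_shared<std::vector<float>>(std::move(data))) {}

    // Non-owning view of the source memory; valid only while attached.
    const float* eval() const { return m_storage ? m_storage->data() : nullptr; }
    std::size_t size() const { return m_storage ? m_storage->size() : 0; }

    // Drop the reference to the source memory so it can be freed.
    void detach() { m_storage.reset(); }
    bool attached() const { return m_storage != nullptr; }

private:
    std::shared_ptr<std::vector<float>> m_storage;
};

// Hypothetical analogue of the non-CPU path: allocate, copy, then detach.
std::vector<float> eval_and_alloc(LazyWeights& weights) {
    const float* src = weights.eval();   // may alias the source memory
    const std::size_t n = weights.size();

    std::vector<float> allocated(n);     // stand-in for a device allocation
    std::copy(src, src + n, allocated.begin());

    // Detach only now: the view `src` is no longer read after this point.
    // Calling detach() before the copy would leave `src` dangling.
    weights.detach();
    return allocated;
}
```

In this sketch, moving `weights.detach()` above the `std::copy` call would free the vector that `src` points into, which is the kind of early deallocation the hotfix title describes.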
