Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update access modifiers for CPU impls #28286

Open
wants to merge 3 commits into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion src/plugins/intel_gpu/src/graph/impls/cpu/broadcast.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -99,7 +99,7 @@ struct broadcast_impl : public typed_primitive_impl<broadcast> {

auto output_mem_ptr = instance.output_memory_ptr();

cldnn::mem_lock<uint8_t, mem_lock_type::read> output_lock(output_mem_ptr, stream);
cldnn::mem_lock<uint8_t, mem_lock_type::read_write> output_lock(output_mem_ptr, stream);
output_host_tensors.push_back(make_tensor(params->output_layouts[0], output_lock.data()));

OPENVINO_ASSERT(op->evaluate(output_host_tensors, input_host_tensors),
Expand Down
2 changes: 1 addition & 1 deletion src/plugins/intel_gpu/src/graph/impls/cpu/concat.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -80,7 +80,7 @@ struct concatenation_impl : public typed_primitive_impl<concatenation> {

auto output_mem_ptr = instance.output_memory_ptr();

cldnn::mem_lock<uint8_t, mem_lock_type::read> output_lock(output_mem_ptr, stream);
cldnn::mem_lock<uint8_t, mem_lock_type::read_write> output_lock(output_mem_ptr, stream);

output_host_tensors.push_back(make_tensor(params->output_layouts[0], output_lock.data()));

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -74,7 +74,7 @@ struct fake_convert_impl : public typed_primitive_impl<fake_convert> {

auto output_mem_ptr = instance.output_memory_ptr();

cldnn::mem_lock<uint8_t, mem_lock_type::read> output_lock(output_mem_ptr, stream);
cldnn::mem_lock<uint8_t, mem_lock_type::read_write> output_lock(output_mem_ptr, stream);

for (size_t i = 0; i < input_mem_ptrs.size(); i++)
input_host_tensors.push_back(make_tensor(params->input_layouts[i], input_mem_ptrs[i]->lock(stream, mem_lock_type::read)));
Expand Down
2 changes: 1 addition & 1 deletion src/plugins/intel_gpu/src/graph/impls/cpu/gather.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -82,7 +82,7 @@ struct gather_impl : public typed_primitive_impl<gather> {

auto output_mem_ptr = instance.output_memory_ptr();

cldnn::mem_lock<uint8_t, mem_lock_type::read> output_lock(output_mem_ptr, stream);
cldnn::mem_lock<uint8_t, mem_lock_type::read_write> output_lock(output_mem_ptr, stream);

for (size_t i = 0; i < input_mem_ptrs.size(); i++)
input_host_tensors.push_back(make_tensor(params->input_layouts[i], input_mem_ptrs[i]->lock(stream, mem_lock_type::read)));
Expand Down
2 changes: 1 addition & 1 deletion src/plugins/intel_gpu/src/graph/impls/cpu/range.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -55,7 +55,7 @@ struct range_impl : public typed_primitive_impl<range> {

auto output_mem_ptr = instance.output_memory_ptr();

cldnn::mem_lock<uint8_t, mem_lock_type::read> output_lock(output_mem_ptr, stream);
cldnn::mem_lock<uint8_t, mem_lock_type::read_write> output_lock(output_mem_ptr, stream);

for (size_t i = 0; i < input_mem_ptrs.size(); i++)
input_host_tensors.push_back(make_tensor(params->input_layouts[i], input_mem_ptrs[i]->lock(stream, mem_lock_type::read)));
Expand Down
2 changes: 1 addition & 1 deletion src/plugins/intel_gpu/src/graph/impls/cpu/reorder.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -56,7 +56,7 @@ struct reorder_impl : public typed_primitive_impl<reorder> {
auto output_mem_ptr = instance.output_memory_ptr();

cldnn::mem_lock<uint8_t, mem_lock_type::read> input_lock(input_mem_ptr, stream);
cldnn::mem_lock<uint8_t, mem_lock_type::read> output_lock(output_mem_ptr, stream);
cldnn::mem_lock<uint8_t, mem_lock_type::read_write> output_lock(output_mem_ptr, stream);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

output memory will be updated as the reason solved by this PR, could you descript the detail reason?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

who allocate cl_mem output for this node? and why cpu impl is choosen?
btw, this line will break if output is usm_device, it's unsafe to directly change to read_write here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, River, I will add some test case to explain this

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

usm_device

If it is usm_device, it will do copying operation by mem_lock itself, right? But we need logic to write back to usm_device memory if needed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@WeldonWangwang, could you please update access modifiers for the other CPU impls as well?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sshlyapn Sure, will add it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the memory is used for readonly, why change them to read_write?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@riverlijunjie, why are they used for read-only? These buffers are used to store CPU implementations output results, so they should be modifiable


input_host_tensors.push_back(make_tensor(params->input_layouts[0], input_lock.data()));
output_host_tensors.push_back(make_tensor(params->output_layouts[0], output_lock.data()));
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -71,7 +71,7 @@ struct scatter_update_impl : public typed_primitive_impl<scatter_update> {

auto output_mem_ptr = instance.output_memory_ptr();

cldnn::mem_lock<uint8_t, mem_lock_type::read> output_lock(output_mem_ptr, stream);
cldnn::mem_lock<uint8_t, mem_lock_type::read_write> output_lock(output_mem_ptr, stream);

for (size_t i = 0; i < input_mem_ptrs.size(); i++)
input_host_tensors.push_back(make_tensor(params->input_layouts[i], input_mem_ptrs[i]->lock(stream, mem_lock_type::read)));
Expand Down
2 changes: 1 addition & 1 deletion src/plugins/intel_gpu/src/graph/impls/cpu/select.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -76,7 +76,7 @@ struct select_impl : public typed_primitive_impl<select> {

auto output_mem_ptr = instance.output_memory_ptr();

cldnn::mem_lock<uint8_t, mem_lock_type::read> output_lock(output_mem_ptr, stream);
cldnn::mem_lock<uint8_t, mem_lock_type::read_write> output_lock(output_mem_ptr, stream);
output_host_tensors.push_back(make_tensor(params->output_layouts[0], output_lock.data()));

OPENVINO_ASSERT(op->evaluate(output_host_tensors, input_host_tensors),
Expand Down
2 changes: 1 addition & 1 deletion src/plugins/intel_gpu/src/graph/impls/cpu/tile.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -83,7 +83,7 @@ struct tile_impl : public typed_primitive_impl<tile> {

auto output_mem_ptr = instance.output_memory_ptr();

cldnn::mem_lock<int32_t, mem_lock_type::read> output_lock(output_mem_ptr, stream);
cldnn::mem_lock<int32_t, mem_lock_type::read_write> output_lock(output_mem_ptr, stream);
output_host_tensors.push_back(make_tensor(params->output_layouts[0], output_lock.data()));

OPENVINO_ASSERT(op->evaluate(output_host_tensors, input_host_tensors),
Expand Down
3 changes: 3 additions & 0 deletions src/plugins/intel_gpu/src/runtime/ocl/ocl_memory.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -39,6 +39,9 @@ struct gpu_buffer : public lockable_gpu_mem, public memory {
assert(0 == _lock_count);
return _buffer;
}
void* buffer_ptr() const override {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe it's not needed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Sergey, it's not needed, this API just used in debug messages here:

GPU_DEBUG_TRACE_DETAIL << internal_name << " with index " << output_idx << " prepare output: " << output_memory->buffer_ptr() << std::endl;

The ptr would be 0 without this API, so I added it.
such as: GPU_Debug: sync_infer_request.cpp:989:prepare_output: result:Result_51154 with index 9 prepare output: 0

return get_buffer().get();
}

event::ptr copy_from(stream& stream, const void* data_ptr, size_t src_offset, size_t dst_offset, size_t size, bool blocking) override;
event::ptr copy_from(stream& stream, const memory& src_mem, size_t src_offset, size_t dst_offset, size_t size, bool blocking) override;
Expand Down
Loading