Is your feature request related to a problem? Please describe.
With SNAPPY compression, the Parquet writer can emit a mix of uncompressed and compressed pages. A page is written uncompressed when its compression ratio is close to 1, to save decompression work during file reads.
However, the Parquet reader does not currently coalesce IO across compressed and uncompressed pages, which fragments the IO into many small reads instead of a single large read. You can see this effect by using a host buffer data source and inspecting the CUDA HW trace.
```shell
nsys profile -t nvtx,cuda,osrt -f true --cuda-memory-usage=true --cuda-um-cpu-page-faults=true --cuda-um-gpu-page-faults=true --gpu-metrics-device=4 --output=pq_coalesce --env-var CUDA_VISIBLE_DEVICES=4 ./PARQUET_READER_NVBENCH -d 0 -b 1 --profile -a io_type=HOST_BUFFER -a compression_type=[SNAPPY,NONE] -a run_length=1 -a cardinality=[0,1000000,100000,1000]
```
*(screenshot: CUDA HW trace with SNAPPY compression)*
With compression NONE we write all uncompressed pages, and with compression ZSTD we write all compressed pages, so these formats do not show the same non-coalesced IO pattern.
*(screenshots: CUDA HW traces with NONE and ZSTD compression)*
Describe the solution you'd like
We could start with simple solutions:
- Force the Parquet writer to always write compressed pages when SNAPPY is selected.
- Adjust the size threshold used with SNAPPY compression to reduce the number of uncompressed pages.
- Adjust the compression heuristic to avoid switching between compressed and uncompressed pages.
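The first two writer-side options above boil down to a single threshold check per page. A minimal sketch of that decision, where the function name and `min_saving_ratio` parameter are purely illustrative (not libcudf API):

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical writer-side decision: emit the compressed page only when
// compression saves more than `min_saving_ratio` of the page's size.
bool should_write_compressed(std::size_t uncompressed_size,
                             std::size_t compressed_size,
                             double min_saving_ratio = 0.0)
{
  // With a zero threshold, any page that shrinks at all is written
  // compressed, which corresponds to "always compress under SNAPPY".
  // Raising the threshold keeps more near-incompressible pages uncompressed.
  return compressed_size < uncompressed_size * (1.0 - min_saving_ratio);
}
```

Raising or lowering `min_saving_ratio` is the knob the second bullet refers to; setting it to zero (or removing the check) implements the first.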
We can also consider some more complex solutions:
- Change the reader to coalesce IO even when there is a mix of compressed and uncompressed pages. This could increase the memory footprint, but since the number of uncompressed pages is often small, the impact may be low.
- Change the reader to coalesce IO as above, then use a DtoD copy (BatchMemcpy?) to separate the uncompressed and compressed pages.
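The reader-side coalescing idea amounts to merging adjacent (or nearly adjacent) per-page byte ranges into one read, regardless of each page's compression state. An illustrative sketch, not libcudf code; `max_gap` is a hypothetical tuning knob for how much slack to tolerate between ranges:

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

using byte_range = std::pair<std::size_t, std::size_t>;  // {offset, size}

// Merge sorted page ranges whose gap is at most `max_gap` bytes into a
// single larger read, trading a slightly bigger read buffer for fewer IOs.
std::vector<byte_range> coalesce_ranges(std::vector<byte_range> pages,
                                        std::size_t max_gap = 0)
{
  if (pages.empty()) return pages;
  std::sort(pages.begin(), pages.end());
  std::vector<byte_range> merged{pages.front()};
  for (std::size_t i = 1; i < pages.size(); ++i) {
    auto& last                = merged.back();
    auto const [offset, size] = pages[i];
    if (offset <= last.first + last.second + max_gap) {
      // Extend the current read to also cover this page.
      last.second =
        std::max(last.first + last.second, offset + size) - last.first;
    } else {
      merged.push_back(pages[i]);
    }
  }
  return merged;
}
```

After such a coalesced read, the reader would still need per-page offsets into the merged buffer to route compressed pages to decompression and uncompressed pages to the DtoD copy path described above.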
**Performance considerations**
- Forcing SNAPPY compression could result in larger files and longer decompression times; we should check both signals. Hopefully the impact is negligible: the file size increase should be small, and decompression time might stay the same, just with slightly higher warp occupancy.
- An extra DtoD copy might end up slower than just decompressing more low-ratio pages.
How common are uncompressed page writes in other Parquet implementations? TBD
In discussions on 2025-01-08, we came up with the idea of removing the size check altogether. Removing it would also let us simplify the interface between the writer and the nvcomp adapter code.