[FEA] Update chunked parquet reader benchmarks to include pass_read_limit
#15057
Labels: 0 - Backlog, cuIO, feature request, libcudf, Spark
Is your feature request related to a problem? Please describe.
The BM_parquet_read_chunks benchmark in benchmarks/io/parquet/parquet_reader_input.cpp includes a byte_limit nvbench axis. This axis controls the chunk_read_limit. With the new features added in #14360, there is a new chunked_parquet_reader API that exposes both chunk_read_limit and pass_read_limit parameters to control reader behavior. We currently do not have a method for benchmarking pass_read_limit values.
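For context, here is a minimal sketch of driving the reader with both limits. It assumes the cudf::io::chunked_parquet_reader constructor that accepts both chunk_read_limit and pass_read_limit (added with #14360); the helper name and file path are just placeholders.

```cpp
#include <cudf/io/parquet.hpp>

#include <string>
#include <vector>

// Sketch: read a parquet file in chunks while bounding both the size of each
// returned table (chunk_read_limit) and the temporary memory used per pass
// over the row groups (pass_read_limit).
std::vector<cudf::io::table_with_metadata> read_in_chunks(std::string const& path,
                                                          std::size_t chunk_read_limit,
                                                          std::size_t pass_read_limit)
{
  auto const options =
    cudf::io::parquet_reader_options::builder(cudf::io::source_info{path}).build();

  cudf::io::chunked_parquet_reader reader(chunk_read_limit, pass_read_limit, options);

  std::vector<cudf::io::table_with_metadata> chunks;
  while (reader.has_next()) {
    chunks.push_back(reader.read_chunk());
  }
  return chunks;
}
```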
Describe the solution you'd like
- Add a new benchmark, BM_parquet_read_subrowgroup_chunks, that provides nvbench axes for both chunk_read_limit and pass_read_limit (see the sketch after this list).
- Rename byte_limit to chunk_read_limit in BM_parquet_read_chunks for clarity, now that we have both input and output byte limits in chunked parquet reading.
- Increase data_size for at least the chunked parquet reader benchmarks. It would be useful to allow the benchmarks to operate on tables larger than 536 MB.
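As a rough illustration (not the actual implementation), the new benchmark could expose both limits as nvbench axes; the benchmark body is elided and the axis values below are placeholders.

```cpp
#include <nvbench/nvbench.cuh>

#include <cstddef>

// Hypothetical shape of the new benchmark. The real body would generate the
// source table, write it to parquet, and time chunked reads, in the same
// style as the existing BM_parquet_read_chunks.
void BM_parquet_read_subrowgroup_chunks(nvbench::state& state)
{
  auto const chunk_read_limit = static_cast<std::size_t>(state.get_int64("chunk_read_limit"));
  auto const pass_read_limit  = static_cast<std::size_t>(state.get_int64("pass_read_limit"));

  // ... construct a cudf::io::chunked_parquet_reader with both limits and
  // measure the time to read all chunks ...
  (void)chunk_read_limit;
  (void)pass_read_limit;
}

NVBENCH_BENCH(BM_parquet_read_subrowgroup_chunks)
  .set_name("parquet_read_subrowgroup_chunks")
  .add_int64_axis("chunk_read_limit", {0, 500'000})  // output byte limit per chunk
  .add_int64_axis("pass_read_limit", {0, 500'000});  // input byte limit per pass
```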
Describe alternatives you've considered
n/a