DO NOT REVIEW #9413

Closed
wants to merge 38 commits into from
Changes from all commits: 38 commits
ddbd045
Merge pull request #9293 from NVIDIA/branch-23.10
nvauto Sep 25, 2023
110be04
Merge pull request #9295 from NVIDIA/branch-23.10
nvauto Sep 25, 2023
49a277e
Init project version 23.12.0-SNAPSHOT (#9292)
pxLi Sep 26, 2023
5f84302
Merge pull request #9298 from NVIDIA/branch-23.10
nvauto Sep 26, 2023
2dc4d5b
Merge pull request #9307 from NVIDIA/branch-23.10
nvauto Sep 26, 2023
d66697e
Merge pull request #9312 from NVIDIA/branch-23.10
nvauto Sep 27, 2023
4825341
Merge pull request #9315 from NVIDIA/branch-23.10
nvauto Sep 27, 2023
5678328
Merge pull request #9316 from NVIDIA/branch-23.10
nvauto Sep 28, 2023
da17346
Merge pull request #9317 from NVIDIA/branch-23.10
nvauto Sep 28, 2023
6dcc63f
Merge pull request #9319 from NVIDIA/branch-23.10
nvauto Sep 28, 2023
a9403dc
Initiate arm64 CI support [skip ci] (#9308)
pxLi Sep 28, 2023
f3c0c43
Merge pull request #9323 from NVIDIA/branch-23.10
nvauto Sep 28, 2023
c6f165d
Merge pull request #9324 from NVIDIA/branch-23.10
nvauto Sep 28, 2023
c6664c5
Merge pull request #9333 from NVIDIA/branch-23.10
nvauto Sep 28, 2023
9f54c9a
Merge pull request #9336 from NVIDIA/branch-23.10
nvauto Sep 28, 2023
2eb3cce
Merge pull request #9339 from NVIDIA/branch-23.10
nvauto Sep 28, 2023
59a15b2
Merge pull request #9341 from NVIDIA/branch-23.10
nvauto Sep 29, 2023
9be814f
Merge pull request #9345 from NVIDIA/branch-23.10
nvauto Sep 29, 2023
bf6c51f
Merge pull request #9346 from NVIDIA/branch-23.10
nvauto Sep 29, 2023
07df6a2
Merge pull request #9355 from NVIDIA/branch-23.10
nvauto Sep 30, 2023
b7341fb
Merge pull request #9358 from NVIDIA/branch-23.10
nvauto Oct 2, 2023
c7ec6f8
Merge pull request #9360 from NVIDIA/branch-23.10
nvauto Oct 2, 2023
58d517a
Merge pull request #9361 from NVIDIA/branch-23.10
nvauto Oct 2, 2023
fbe81cb
Merge pull request #9363 from NVIDIA/branch-23.10
nvauto Oct 2, 2023
9f4b7fb
Merge pull request #9368 from NVIDIA/branch-23.10
nvauto Oct 3, 2023
d35610b
Merging branch-23.10 into branch-23.12
mattahrens Oct 3, 2023
08f083b
Merge pull request #9373 from mattahrens/fix-auto-merge-conflict-9372
jlowe Oct 4, 2023
a6b5520
Merge pull request #9378 from NVIDIA/branch-23.10
nvauto Oct 4, 2023
8eef296
Merge pull request #9379 from NVIDIA/branch-23.10
nvauto Oct 4, 2023
677dd4a
Improve JSON empty row fix to use less memory (#9369)
andygrove Oct 5, 2023
84937fd
Merge pull request #9389 from NVIDIA/branch-23.10
nvauto Oct 5, 2023
dcb03d5
Add developer documentation about working with data sources [skip ci]…
andygrove Oct 5, 2023
47e62cb
Merge pull request #9391 from NVIDIA/branch-23.10
nvauto Oct 5, 2023
54dccf1
Merge pull request #9402 from NVIDIA/branch-23.10
nvauto Oct 9, 2023
20fa0f3
Merge pull request #9406 from NVIDIA/branch-23.10
nvauto Oct 9, 2023
b68783e
Merge pull request #9410 from NVIDIA/branch-23.10
nvauto Oct 9, 2023
c08119d
test only
pxLi Oct 10, 2023
cd76d9f
revert test update
pxLi Oct 10, 2023
10 changes: 10 additions & 0 deletions CONTRIBUTING.md
Original file line number Diff line number Diff line change
Expand Up @@ -183,6 +183,16 @@ flag if cross-compilation is required.
mvn clean verify -Dbuildver=330 -P<jdk11|jdk17>
```

### Building and Testing with ARM

To build the project for the ARM platform, add `-Parm64` to your Maven commands.
NOTE: The build itself does not require an ARM machine, so if you only want to build the
artifacts on an x86 machine, also add `-DskipTests` to your commands.

```bash
mvn clean verify -Dbuildver=311 -Parm64
```

### Iterative development during local testing

When iterating on changes impacting the `dist` module artifact directly or via
Expand Down
2 changes: 1 addition & 1 deletion README.md
Original file line number Diff line number Diff line change
Expand Up @@ -73,7 +73,7 @@ as a `provided` dependency.
<dependency>
<groupId>com.nvidia</groupId>
<artifactId>rapids-4-spark_2.12</artifactId>
<version>23.10.0-SNAPSHOT</version>
<version>23.12.0-SNAPSHOT</version>
<scope>provided</scope>
</dependency>
```
4 changes: 2 additions & 2 deletions aggregator/pom.xml
Original file line number Diff line number Diff line change
Expand Up @@ -22,12 +22,12 @@
<parent>
<groupId>com.nvidia</groupId>
<artifactId>rapids-4-spark-parent</artifactId>
<version>23.10.0-SNAPSHOT</version>
<version>23.12.0-SNAPSHOT</version>
</parent>
<artifactId>rapids-4-spark-aggregator_2.12</artifactId>
<name>RAPIDS Accelerator for Apache Spark Aggregator</name>
<description>Creates an aggregated shaded package of the RAPIDS plugin for Apache Spark</description>
<version>23.10.0-SNAPSHOT</version>
<version>23.12.0-SNAPSHOT</version>

<properties>
<!--
Expand Down
4 changes: 2 additions & 2 deletions api_validation/pom.xml
Original file line number Diff line number Diff line change
Expand Up @@ -22,10 +22,10 @@
<parent>
<groupId>com.nvidia</groupId>
<artifactId>rapids-4-spark-parent</artifactId>
<version>23.10.0-SNAPSHOT</version>
<version>23.12.0-SNAPSHOT</version>
</parent>
<artifactId>rapids-4-spark-api-validation</artifactId>
<version>23.10.0-SNAPSHOT</version>
<version>23.12.0-SNAPSHOT</version>

<profiles>
<profile>
Expand Down
6 changes: 3 additions & 3 deletions datagen/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -27,12 +27,12 @@ corresponding profile flag `-P<jdk11|jdk17>`

After this the jar should be at
`target/datagen_2.12-$PLUGIN_VERSION-spark$SPARK_VERSION.jar`
for example a Spark 3.3.0 jar for the 23.10.0 release would be
`target/datagen_2.12-23.10.0-spark330.jar`
for example a Spark 3.3.0 jar for the 23.12.0 release would be
`target/datagen_2.12-23.12.0-spark330.jar`

To get a spark shell with this you can run
```shell
spark-shell --jars target/datagen_2.12-23.10.0-spark330.jar
spark-shell --jars target/datagen_2.12-23.12.0-spark330.jar
```

After that you should be good to go.
Expand Down
2 changes: 1 addition & 1 deletion datagen/ScaleTest.md
Original file line number Diff line number Diff line change
Expand Up @@ -44,7 +44,7 @@ $SPARK_HOME/bin/spark-submit \
--conf spark.sql.parquet.datetimeRebaseModeInWrite=CORRECTED \
--class com.nvidia.rapids.tests.scaletest.ScaleTestDataGen \ # the main class
--jars $SPARK_HOME/examples/jars/scopt_2.12-3.7.1.jar \ # one dependency jar just shipped with Spark under $SPARK_HOME
./target/datagen_2.12-23.10.0-SNAPSHOT-spark332.jar \
./target/datagen_2.12-23.12.0-SNAPSHOT-spark332.jar \
1 \
10 \
parquet \
Expand Down
4 changes: 2 additions & 2 deletions datagen/pom.xml
Original file line number Diff line number Diff line change
Expand Up @@ -22,12 +22,12 @@
<parent>
<groupId>com.nvidia</groupId>
<artifactId>rapids-4-spark-parent</artifactId>
<version>23.10.0-SNAPSHOT</version>
<version>23.12.0-SNAPSHOT</version>
</parent>
<artifactId>datagen_2.12</artifactId>
<name>Data Generator</name>
<description>Tools for generating large amounts of data</description>
<version>23.10.0-SNAPSHOT</version>
<version>23.12.0-SNAPSHOT</version>
<properties>
<target.classifier/>
<rapids.default.jar.excludePattern>**/*</rapids.default.jar.excludePattern>
Expand Down
4 changes: 2 additions & 2 deletions delta-lake/delta-20x/pom.xml
Original file line number Diff line number Diff line change
Expand Up @@ -22,14 +22,14 @@
<parent>
<groupId>com.nvidia</groupId>
<artifactId>rapids-4-spark-parent</artifactId>
<version>23.10.0-SNAPSHOT</version>
<version>23.12.0-SNAPSHOT</version>
<relativePath>../../pom.xml</relativePath>
</parent>

<artifactId>rapids-4-spark-delta-20x_2.12</artifactId>
<name>RAPIDS Accelerator for Apache Spark Delta Lake 2.0.x Support</name>
<description>Delta Lake 2.0.x support for the RAPIDS Accelerator for Apache Spark</description>
<version>23.10.0-SNAPSHOT</version>
<version>23.12.0-SNAPSHOT</version>

<properties>
<rapids.compressed.artifact>false</rapids.compressed.artifact>
Expand Down
4 changes: 2 additions & 2 deletions delta-lake/delta-21x/pom.xml
Original file line number Diff line number Diff line change
Expand Up @@ -22,14 +22,14 @@
<parent>
<groupId>com.nvidia</groupId>
<artifactId>rapids-4-spark-parent</artifactId>
<version>23.10.0-SNAPSHOT</version>
<version>23.12.0-SNAPSHOT</version>
<relativePath>../../pom.xml</relativePath>
</parent>

<artifactId>rapids-4-spark-delta-21x_2.12</artifactId>
<name>RAPIDS Accelerator for Apache Spark Delta Lake 2.1.x Support</name>
<description>Delta Lake 2.1.x support for the RAPIDS Accelerator for Apache Spark</description>
<version>23.10.0-SNAPSHOT</version>
<version>23.12.0-SNAPSHOT</version>

<properties>
<rapids.compressed.artifact>false</rapids.compressed.artifact>
Expand Down
4 changes: 2 additions & 2 deletions delta-lake/delta-22x/pom.xml
Original file line number Diff line number Diff line change
Expand Up @@ -22,14 +22,14 @@
<parent>
<groupId>com.nvidia</groupId>
<artifactId>rapids-4-spark-parent</artifactId>
<version>23.10.0-SNAPSHOT</version>
<version>23.12.0-SNAPSHOT</version>
<relativePath>../../pom.xml</relativePath>
</parent>

<artifactId>rapids-4-spark-delta-22x_2.12</artifactId>
<name>RAPIDS Accelerator for Apache Spark Delta Lake 2.2.x Support</name>
<description>Delta Lake 2.2.x support for the RAPIDS Accelerator for Apache Spark</description>
<version>23.10.0-SNAPSHOT</version>
<version>23.12.0-SNAPSHOT</version>

<properties>
<rapids.compressed.artifact>false</rapids.compressed.artifact>
Expand Down
4 changes: 2 additions & 2 deletions delta-lake/delta-24x/pom.xml
Original file line number Diff line number Diff line change
Expand Up @@ -22,14 +22,14 @@
<parent>
<groupId>com.nvidia</groupId>
<artifactId>rapids-4-spark-parent</artifactId>
<version>23.10.0-SNAPSHOT</version>
<version>23.12.0-SNAPSHOT</version>
<relativePath>../../pom.xml</relativePath>
</parent>

<artifactId>rapids-4-spark-delta-24x_2.12</artifactId>
<name>RAPIDS Accelerator for Apache Spark Delta Lake 2.4.x Support</name>
<description>Delta Lake 2.4.x support for the RAPIDS Accelerator for Apache Spark</description>
<version>23.10.0-SNAPSHOT</version>
<version>23.12.0-SNAPSHOT</version>

<properties>
<rapids.compressed.artifact>false</rapids.compressed.artifact>
Expand Down
4 changes: 2 additions & 2 deletions delta-lake/delta-spark321db/pom.xml
Original file line number Diff line number Diff line change
Expand Up @@ -22,14 +22,14 @@
<parent>
<groupId>com.nvidia</groupId>
<artifactId>rapids-4-spark-parent</artifactId>
<version>23.10.0-SNAPSHOT</version>
<version>23.12.0-SNAPSHOT</version>
<relativePath>../../pom.xml</relativePath>
</parent>

<artifactId>rapids-4-spark-delta-spark321db_2.12</artifactId>
<name>RAPIDS Accelerator for Apache Spark Databricks 10.4 Delta Lake Support</name>
<description>Databricks 10.4 Delta Lake support for the RAPIDS Accelerator for Apache Spark</description>
<version>23.10.0-SNAPSHOT</version>
<version>23.12.0-SNAPSHOT</version>

<properties>
<rapids.compressed.artifact>false</rapids.compressed.artifact>
Expand Down
4 changes: 2 additions & 2 deletions delta-lake/delta-spark330db/pom.xml
Original file line number Diff line number Diff line change
Expand Up @@ -22,14 +22,14 @@
<parent>
<groupId>com.nvidia</groupId>
<artifactId>rapids-4-spark-parent</artifactId>
<version>23.10.0-SNAPSHOT</version>
<version>23.12.0-SNAPSHOT</version>
<relativePath>../../pom.xml</relativePath>
</parent>

<artifactId>rapids-4-spark-delta-spark330db_2.12</artifactId>
<name>RAPIDS Accelerator for Apache Spark Databricks 11.3 Delta Lake Support</name>
<description>Databricks 11.3 Delta Lake support for the RAPIDS Accelerator for Apache Spark</description>
<version>23.10.0-SNAPSHOT</version>
<version>23.12.0-SNAPSHOT</version>

<properties>
<rapids.compressed.artifact>false</rapids.compressed.artifact>
Expand Down
4 changes: 2 additions & 2 deletions delta-lake/delta-spark332db/pom.xml
Original file line number Diff line number Diff line change
Expand Up @@ -22,14 +22,14 @@
<parent>
<groupId>com.nvidia</groupId>
<artifactId>rapids-4-spark-parent</artifactId>
<version>23.10.0-SNAPSHOT</version>
<version>23.12.0-SNAPSHOT</version>
<relativePath>../../pom.xml</relativePath>
</parent>

<artifactId>rapids-4-spark-delta-spark332db_2.12</artifactId>
<name>RAPIDS Accelerator for Apache Spark Databricks 12.2 Delta Lake Support</name>
<description>Databricks 12.2 Delta Lake support for the RAPIDS Accelerator for Apache Spark</description>
<version>23.10.0-SNAPSHOT</version>
<version>23.12.0-SNAPSHOT</version>

<properties>
<rapids.compressed.artifact>false</rapids.compressed.artifact>
Expand Down
4 changes: 2 additions & 2 deletions delta-lake/delta-stub/pom.xml
Original file line number Diff line number Diff line change
Expand Up @@ -22,14 +22,14 @@
<parent>
<groupId>com.nvidia</groupId>
<artifactId>rapids-4-spark-parent</artifactId>
<version>23.10.0-SNAPSHOT</version>
<version>23.12.0-SNAPSHOT</version>
<relativePath>../../pom.xml</relativePath>
</parent>

<artifactId>rapids-4-spark-delta-stub_2.12</artifactId>
<name>RAPIDS Accelerator for Apache Spark Delta Lake Stub</name>
<description>Delta Lake stub for the RAPIDS Accelerator for Apache Spark</description>
<version>23.10.0-SNAPSHOT</version>
<version>23.12.0-SNAPSHOT</version>

<properties>
<rapids.compressed.artifact>false</rapids.compressed.artifact>
Expand Down
4 changes: 2 additions & 2 deletions dist/pom.xml
Original file line number Diff line number Diff line change
Expand Up @@ -22,12 +22,12 @@
<parent>
<groupId>com.nvidia</groupId>
<artifactId>rapids-4-spark-parent</artifactId>
<version>23.10.0-SNAPSHOT</version>
<version>23.12.0-SNAPSHOT</version>
</parent>
<artifactId>rapids-4-spark_2.12</artifactId>
<name>RAPIDS Accelerator for Apache Spark Distribution</name>
<description>Creates the distribution package of the RAPIDS plugin for Apache Spark</description>
<version>23.10.0-SNAPSHOT</version>
<version>23.12.0-SNAPSHOT</version>
<dependencies>
<dependency>
<groupId>com.nvidia</groupId>
Expand Down
2 changes: 1 addition & 1 deletion docs/configs.md
Original file line number Diff line number Diff line change
Expand Up @@ -10,7 +10,7 @@ The following is the list of options that `rapids-plugin-4-spark` supports.
On startup use: `--conf [conf key]=[conf value]`. For example:

```
${SPARK_HOME}/bin/spark-shell --jars rapids-4-spark_2.12-23.10.0-SNAPSHOT-cuda11.jar \
${SPARK_HOME}/bin/spark-shell --jars rapids-4-spark_2.12-23.12.0-SNAPSHOT-cuda11.jar \
--conf spark.plugins=com.nvidia.spark.SQLPlugin \
--conf spark.rapids.sql.concurrentGpuTasks=2
```
Expand Down
6 changes: 6 additions & 0 deletions docs/dev/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -13,6 +13,7 @@ following topics:
* [How Spark Executes the Physical Plan](#how-spark-executes-the-physical-plan)
* [How the Plugin Works](#how-the-rapids-plugin-works)
* [Plugin Replacement Rules](#plugin-replacement-rules)
* [Working with Data Sources](#working-with-data-sources)
* [Guidelines for Replacing Catalyst Executors and Expressions](#guidelines-for-replacing-catalyst-executors-and-expressions)
* [Setting Up the Class](#setting-up-the-class)
* [Expressions](#expressions)
Expand Down Expand Up @@ -131,6 +132,11 @@ executor, expression, etc.), and applying the rule that matches. See the
There is a separate guide for working with
[Adaptive Query Execution](adaptive-query.md).

### Working with Data Sources

The plugin supports v1 and v2 data sources for file formats such as CSV,
ORC, JSON, and Parquet. See the [data source guide](data-sources.md) for more information.

## Guidelines for Replacing Catalyst Executors and Expressions
Most development work in the plugin involves translating various Catalyst
executor and expression nodes into new nodes that execute on the GPU. This
Expand Down
68 changes: 68 additions & 0 deletions docs/dev/data-sources.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,68 @@
---
layout: page
title: Working with Spark Data Sources
nav_order: 2
parent: Developer Overview
---

# Working with Spark Data Sources

## Data Source API Versions

Spark has two major versions of its data source APIs, simply known as "v1" and "v2". There is a configuration
property `spark.sql.sources.useV1SourceList` which determines which API version is used when reading from data
sources such as CSV, ORC, and Parquet. The default value for this configuration option (as of Spark 3.4.0)
is `"avro,csv,json,kafka,orc,parquet,text"`, meaning that all of these data sources fall back to v1 by default.
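
As a debugging aid (a sketch, not project guidance), you can force a format onto the v2 code
path by removing it from the v1 list; the value below is the Spark 3.4.0 default list with
`parquet` removed, so Parquet reads appear in the plan as `BatchScanExec` instead of
`FileSourceScanExec`:

```shell
# Sketch: force Parquet onto the v2 DataSource API.
# The list is the Spark 3.4.0 default minus "parquet".
$SPARK_HOME/bin/spark-shell \
  --conf spark.sql.sources.useV1SourceList="avro,csv,json,kafka,orc,text"
```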

When using Spark SQL (including the DataFrame API), the representation of a read in the physical plan will be
different depending on the API version being used, and in the plugin we therefore have different code paths
for tagging and replacing these operations.

## V1 API

In the v1 API, a read from a file-based data source is represented by a `FileSourceScanExec`, which wraps
a `HadoopFsRelation`.

`HadoopFsRelation` is an important component in Apache Spark. It represents a relation based on data stored in the
Hadoop FileSystem. When we talk about the Hadoop FileSystem in this context, it encompasses various distributed
storage systems that are Hadoop-compatible, such as HDFS (Hadoop Distributed FileSystem), Amazon S3, and others.

`HadoopFsRelation` is not tied to a specific file format. Instead, it relies on implementations of the `FileFormat`
interface to read and write data.

This means that various file formats like CSV, Parquet, and ORC can have their implementations of the `FileFormat`
interface, and `HadoopFsRelation` will be able to work with any of them.

When overriding `FileSourceScanExec` in the plugin, there are a number of different places where tagging code can be
placed, depending on the file format. We start in GpuOverrides with a map entry `GpuOverrides.exec[FileSourceScanExec]`,
and then the hierarchical flow is typically as follows, although it may vary between shim versions:

```
FileSourceScanExecMeta.tagPlanForGpu
ScanExecShims.tagGpuFileSourceScanExecSupport
GpuFileSourceScanExec.tagSupport
```

`GpuFileSourceScanExec.tagSupport` will inspect the `FileFormat` and then call into one of the following:

- `GpuReadCSVFileFormat.tagSupport`, which calls `GpuCSVScan.tagSupport`
- `GpuReadOrcFileFormat.tagSupport`, which calls `GpuOrcScan.tagSupport`
- `GpuReadParquetFileFormat.tagSupport`, which calls `GpuParquetScan.tagSupport`

The classes `GpuCSVScan`, `GpuParquetScan`, `GpuOrcScan`, and `GpuJsonScan` are also called
from the v2 API, so this is a good place to put code that is not specific to either API
version. These scan classes also call into `FileFormatChecks.tag`.
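
The dispatch described above can be sketched with stand-in types. The names below mirror the
plugin's classes, but the types themselves are simplified placeholders, not the real Spark or
plugin API:

```java
// Simplified stand-in for the FileFormat dispatch performed by
// GpuFileSourceScanExec.tagSupport; the real code inspects the
// relation's FileFormat and calls the matching GpuRead*FileFormat.
enum FileFormat { CSV, ORC, PARQUET }

class TagSupportDemo {
    // Returns the name of the tagging entry point that would handle
    // this file format.
    static String tagFor(FileFormat format) {
        switch (format) {
            case CSV:     return "GpuCSVScan.tagSupport";
            case ORC:     return "GpuOrcScan.tagSupport";
            case PARQUET: return "GpuParquetScan.tagSupport";
            default:      throw new IllegalArgumentException("unknown format");
        }
    }
}
```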

## V2 API

When using the v2 API, the physical plan will contain a `BatchScanExec`, which wraps a scan that implements
the `org.apache.spark.sql.connector.read.Scan` trait. The scan implementations include `CsvScan`, `ParquetScan`,
and `OrcScan`. The GPU tagging code for these scans is shared with the v1 API, and can be
placed in one of the following methods:

- `GpuCSVScan.tagSupport`
- `GpuOrcScan.tagSupport`
- `GpuParquetScan.tagSupport`

When overriding v2 operators in the plugin, we can override both `BatchScanExec` and the individual scans, such
as `CsvScanExec`.
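
The v2 flow can be sketched the same way, again with placeholder types rather than the real
Spark API: the scan wrapped by `BatchScanExec` determines which GPU tagging method applies,
and an unrecognized scan type falls back to the CPU:

```java
// Stand-in for the v2 dispatch: the Scan wrapped by BatchScanExec
// determines which Gpu*Scan.tagSupport method applies.
interface Scan {}
class CsvScan implements Scan {}
class OrcScan implements Scan {}
class ParquetScan implements Scan {}

class V2TagDemo {
    static String tagFor(Scan scan) {
        if (scan instanceof CsvScan)     return "GpuCSVScan.tagSupport";
        if (scan instanceof OrcScan)     return "GpuOrcScan.tagSupport";
        if (scan instanceof ParquetScan) return "GpuParquetScan.tagSupport";
        return "fallback to CPU";  // no GPU replacement for this scan
    }
}
```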
12 changes: 6 additions & 6 deletions docs/dev/shims.md
Original file line number Diff line number Diff line change
Expand Up @@ -68,17 +68,17 @@ Using JarURLConnection URLs we create a Parallel World of the current version wi
Spark 3.0.2's URLs:

```text
jar:file:/home/spark/rapids-4-spark_2.12-23.10.0.jar!/
jar:file:/home/spark/rapids-4-spark_2.12-23.10.0.jar!/spark3xx-common/
jar:file:/home/spark/rapids-4-spark_2.12-23.10.0.jar!/spark302/
jar:file:/home/spark/rapids-4-spark_2.12-23.12.0.jar!/
jar:file:/home/spark/rapids-4-spark_2.12-23.12.0.jar!/spark3xx-common/
jar:file:/home/spark/rapids-4-spark_2.12-23.12.0.jar!/spark302/
```

Spark 3.2.0's URLs :

```text
jar:file:/home/spark/rapids-4-spark_2.12-23.10.0.jar!/
jar:file:/home/spark/rapids-4-spark_2.12-23.10.0.jar!/spark3xx-common/
jar:file:/home/spark/rapids-4-spark_2.12-23.10.0.jar!/spark320/
jar:file:/home/spark/rapids-4-spark_2.12-23.12.0.jar!/
jar:file:/home/spark/rapids-4-spark_2.12-23.12.0.jar!/spark3xx-common/
jar:file:/home/spark/rapids-4-spark_2.12-23.12.0.jar!/spark320/
```
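
The URL sets above follow a fixed pattern: the jar root, the shared `spark3xx-common/` world,
and the shim-specific world. A minimal sketch of how such a list could be built (the jar path
and shim directory name are illustrative, not the plugin's actual implementation):

```java
import java.util.List;
import java.util.stream.Collectors;

class ParallelWorldUrls {
    // Builds the "parallel world" jar: URLs for one shim: the jar root,
    // the common world, and the shim-specific world, in that order.
    static List<String> urlsFor(String jarPath, String shimDir) {
        return List.of("", "spark3xx-common/", shimDir + "/").stream()
            .map(world -> "jar:file:" + jarPath + "!/" + world)
            .collect(Collectors.toList());
    }
}
```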

### Late Inheritance in Public Classes
Expand Down
6 changes: 3 additions & 3 deletions integration_tests/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -250,7 +250,7 @@ individually, so you don't risk running unit tests along with the integration te
http://www.scalatest.org/user_guide/using_the_scalatest_shell

```shell
spark-shell --jars rapids-4-spark-tests_2.12-23.10.0-SNAPSHOT-tests.jar,rapids-4-spark-integration-tests_2.12-23.10.0-SNAPSHOT-tests.jar,scalatest_2.12-3.0.5.jar,scalactic_2.12-3.0.5.jar
spark-shell --jars rapids-4-spark-tests_2.12-23.12.0-SNAPSHOT-tests.jar,rapids-4-spark-integration-tests_2.12-23.12.0-SNAPSHOT-tests.jar,scalatest_2.12-3.0.5.jar,scalactic_2.12-3.0.5.jar
```

First you import the `scalatest_shell` and tell the tests where they can find the test files you
Expand All @@ -273,7 +273,7 @@ If you just want to verify the SQL replacement is working you will need to add t
assumes CUDA 11.0 is being used.

```
$SPARK_HOME/bin/spark-submit --jars "rapids-4-spark_2.12-23.10.0-SNAPSHOT-cuda11.jar" ./runtests.py
$SPARK_HOME/bin/spark-submit --jars "rapids-4-spark_2.12-23.12.0-SNAPSHOT-cuda11.jar" ./runtests.py
```

You don't have to enable the plugin for this to work, the test framework will do that for you.
Expand Down Expand Up @@ -372,7 +372,7 @@ To run cudf_udf tests, need following configuration changes:
As an example, here is the `spark-submit` command with the cudf_udf parameter on CUDA 11.0:

```
$SPARK_HOME/bin/spark-submit --jars "rapids-4-spark_2.12-23.10.0-SNAPSHOT-cuda11.jar,rapids-4-spark-tests_2.12-23.10.0-SNAPSHOT.jar" --conf spark.rapids.memory.gpu.allocFraction=0.3 --conf spark.rapids.python.memory.gpu.allocFraction=0.3 --conf spark.rapids.python.concurrentPythonWorkers=2 --py-files "rapids-4-spark_2.12-23.10.0-SNAPSHOT-cuda11.jar" --conf spark.executorEnv.PYTHONPATH="rapids-4-spark_2.12-23.10.0-SNAPSHOT-cuda11.jar" ./runtests.py --cudf_udf
$SPARK_HOME/bin/spark-submit --jars "rapids-4-spark_2.12-23.12.0-SNAPSHOT-cuda11.jar,rapids-4-spark-tests_2.12-23.12.0-SNAPSHOT.jar" --conf spark.rapids.memory.gpu.allocFraction=0.3 --conf spark.rapids.python.memory.gpu.allocFraction=0.3 --conf spark.rapids.python.concurrentPythonWorkers=2 --py-files "rapids-4-spark_2.12-23.12.0-SNAPSHOT-cuda11.jar" --conf spark.executorEnv.PYTHONPATH="rapids-4-spark_2.12-23.12.0-SNAPSHOT-cuda11.jar" ./runtests.py --cudf_udf
```

### Enabling fuzz tests
Expand Down