This repository has been archived by the owner on Apr 19, 2023. It is now read-only.

IllegalArgumentExce #43

Open
tristers-at-square opened this issue Sep 20, 2021 · 4 comments

@tristers-at-square

Describe the bug
When trying to train an XGBoost classifier on GPUs, the call to fit fails with the following error:

IllegalArgumentException: features does not exist

Steps/Code to reproduce bug
Calling the fit method as follows:

val xgbClassifier = new XGBoostClassifier(paramMap)
  .setLabelCol(labelName)
  .setFeaturesCols(featureCols)
xgbClassifier.fit(trainDF)
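
A minimal sketch of a pre-flight check that can rule out a plain schema mismatch before calling fit: it lists any required column that is missing from the DataFrame. The helper name missingColumns is hypothetical (it is not part of XGBoost4J-Spark), and the usage assumes featureCols is a sequence of column names and labelName is a string. It does not explain why a column named "features" is being looked up.

```scala
// Hypothetical helper, not part of XGBoost4J-Spark: returns the required
// column names that are absent from the available column names.
def missingColumns(available: Seq[String], required: Seq[String]): Seq[String] =
  required.filterNot(c => available.contains(c))

// Usage with the names from the snippet above (assumed types: featureCols is
// a Seq[String] or Array[String], labelName is a String):
//   val missing = missingColumns(trainDF.columns.toSeq, featureCols.toSeq :+ labelName)
//   require(missing.isEmpty, s"Missing columns: ${missing.mkString(", ")}")
//   trainDF.printSchema()
```

If the list comes back empty, every column passed to setFeaturesCols and setLabelCol is present in trainDF, so the lookup of "features" is happening somewhere other than the input schema.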

Expected behavior
I expected the model to train successfully when running on GPUs.

Environment details (please complete the following information)

Running Spark job on GCP Dataproc with Nvidia Tesla T4 GPU.

The following JARs are on the classpath in /usr/lib/spark/jars/:

  • Rapids-4-Spark: rapids-4-spark_2.12-21.08.0.jar
  • XGBoost4J: xgboost4j_3.0-1.4.2-0.1.0.jar
  • XGBoost4J-Spark: xgboost4j-spark_3.0-1.4.2-0.1.0.jar
  • cuDF (CUDA 11 build): cudf-21.08.2-cuda11.jar

Using the following Dataproc initialization actions to install the GPU drivers and the RAPIDS Accelerator:

  • goog-dataproc-initialization-actions-us-central1/gpu/install_gpu_driver.sh
  • goog-dataproc-initialization-actions-us-central1/rapids/rapids.sh

Using the following Spark parameter configurations:
"spark.executor.resource.gpu.amount": "1"
"spark.task.resource.gpu.amount": "1"
"spark.rapids.sql.explain": "ALL"
"spark.rapids.sql.concurrentGpuTasks": "2"
"spark.rapids.memory.pinnedPool.size": "2G"
"spark.executor.extraJavaOptions": "-Dai.rapids.cudf.prefer-pinned=true"
"spark.locality.wait": "0s"
"spark.plugins": "com.nvidia.spark.SQLPlugin"
"spark.rapids.sql.hasNans": "false"
"spark.rapids.sql.batchSizeBytes": "512M"
"spark.rapids.sql.reader.batchSizeBytes": "768M"
"spark.rapids.sql.variableFloatAgg.enabled": "true"
"spark.rapids.sql.decimalType.enabled": "true"
"spark.rapids.memory.gpu.pooling.enabled": "false"
"spark.executor.resource.gpu.discoveryScript": "/usr/lib/spark/scripts/gpu/getGpusResources.sh"

@tristers-at-square
Author

This is similar to the error described here:

#13

However, none of those steps seemed to fix my issue.

@GaryShen2008
Collaborator

@wbo4958 Hi Bobby, can you help with this issue? What conditions can cause "features does not exist"?

@tristers-at-square
BTW, we have a new repo: https://github.com/NVIDIA/spark-rapids-examples.
I see you're running the latest version (with rapids-4-spark 21.08.0). The steps should be unchanged.
It would be better to file any issues with the latest version in the new repo.
Thanks.

@tristers-at-square
Author

@GaryShen2008 I see, opened the issue in the new repo. Thanks! 🙏

@GaryShen2008
Collaborator

@wbo4958 Hi Bobby, can you check NVIDIA/spark-rapids-examples#21?
