Merge Distributed Qualification Tools CLI #1516

Merged: 7 commits merged into dev from spark-rapids-tools-distributed-base on Feb 5, 2025

parthosa
Collaborator

Fixes #1249.

This PR merges the distributed tools branch into dev. We have set up an internal CI pipeline to test this feature.

Created a follow-up to support recursive processing of event logs (#1515)

* Add arguments for running tools in distributed mode

Signed-off-by: Partho Sarthi <[email protected]>

* Refactor to use tools config file

Signed-off-by: Partho Sarthi <[email protected]>

* Update specification

Signed-off-by: Partho Sarthi <[email protected]>

* Update tools config file

Signed-off-by: Partho Sarthi <[email protected]>

* Update comment

Signed-off-by: Partho Sarthi <[email protected]>

* Add pylint exception

Signed-off-by: Partho Sarthi <[email protected]>

* Include hdfs output dir in tools config file

Signed-off-by: Partho Sarthi <[email protected]>

* Add comment about assumption of Spark JARs

Signed-off-by: Partho Sarthi <[email protected]>

* Revert changes in stats report

Signed-off-by: Partho Sarthi <[email protected]>

* Submission mode Args

Signed-off-by: Partho Sarthi <[email protected]>

* Modify the arguments structure

Signed-off-by: Partho Sarthi <[email protected]>

* Bump up the API version for tools config file

Signed-off-by: Partho Sarthi <[email protected]>

* Update python arg tests

Signed-off-by: Partho Sarthi <[email protected]>

* Remove pylint disable rule in CSPs

Signed-off-by: Partho Sarthi <[email protected]>

---------

Signed-off-by: Partho Sarthi <[email protected]>

* Add implementation for running tools in distributed mode

Signed-off-by: Partho Sarthi <[email protected]>

* Fix temp directory usage and remove automatic spark properties calculation

Signed-off-by: Partho Sarthi <[email protected]>

* Refactor

Signed-off-by: Partho Sarthi <[email protected]>

* Simplify hashing of intermediate output dirs

Signed-off-by: Partho Sarthi <[email protected]>

* Fix typing hint for python 3.8

Signed-off-by: Partho Sarthi <[email protected]>

---------

Signed-off-by: Partho Sarthi <[email protected]>
@parthosa added the labels feature request (New feature or request) and user_tools (Scope the wrapper module running CSP, QualX, and reports (python)) on Jan 28, 2025
@parthosa self-assigned this on Jan 28, 2025
@parthosa marked this pull request as draft on January 28, 2025 22:55
@parthosa
Collaborator Author

This needs the licensing errors fixed (#1518).

This PR fixes the license headers for new files created in the
distributed tools branch to be `2025` and files modified to be `*-2025`

---------

Signed-off-by: Partho Sarthi <[email protected]>
@parthosa marked this pull request as ready for review on February 4, 2025 16:11
amahussein
amahussein previously approved these changes Feb 4, 2025
Collaborator

@amahussein amahussein left a comment

Thanks @parthosa
In general LGTME! one nit question.

user_tools/pyproject.toml (review thread outdated and resolved)
@parthosa merged commit 2b9473b into dev on Feb 5, 2025
13 checks passed
@parthosa deleted the spark-rapids-tools-distributed-base branch on February 5, 2025 15:52
Development

Successfully merging this pull request may close these issues.

[FEA] Distributed processing of Event Logs
2 participants