Fix up and add flexibility to HTCManager #157

Merged (4 commits) on Jan 4, 2021

@aminnj commented Dec 17, 2020

Currently, HTCManager is fairly rigid about the job submission script and the worker node script, and communication between the master and the workers does not work properly, as noted in #150 and #107.

To address the first, I added parameters that let one send extra input files to the worker node, insert extra lines/ClassAds into the job submission script, run extra commands on the worker node before julia is invoked, and change the location/command used in place of telnet. So one can do something like:

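using Distributed, ClusterManagers  # provides addprocs and HTCManager

myparams = Dict(
    :dir => ".",
    :exename => "julia",
    # extra input files sent to the worker node
    :extrainputs => ["/path/to/file1.gz"],
    # extra lines/ClassAds inserted into the job submission script
    :extrajdl => ["RequestCpus = 1"],
    # extra commands run on the worker node before julia is invoked
    :extraenv => ["tar xf file1.gz", "setup_command()"],
)
addprocs(HTCManager(8); myparams...)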

Without any extra parameters, addprocs(HTCManager(8)) behaves as before.
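
For readers unfamiliar with the pattern, here is a minimal sketch of how splatting a Dict of Symbol keys into keyword arguments keeps the no-argument call unchanged. The function resolve_options and its default values are hypothetical stand-ins for illustration, not the code in this PR; only the parameter names are taken from the example above.

```julia
# Hypothetical stand-in for a launcher with optional keyword parameters.
# The defaults here are assumptions made for illustration only.
function resolve_options(; dir=".", exename="julia",
                         extrainputs=String[], extrajdl=String[], extraenv=String[])
    # Return the options as they would be seen after defaults are applied.
    return (dir=dir, exename=exename, extrainputs=extrainputs,
            extrajdl=extrajdl, extraenv=extraenv)
end

# A Dict with Symbol keys can be splatted into keyword arguments.
myparams = Dict(
    :exename => "julia",
    :extrajdl => ["RequestCpus = 1"],
)

custom_opts = resolve_options(; myparams...)  # overrides only the given keys
default_opts = resolve_options()              # every option keeps its default
```

Keys missing from the Dict simply fall back to their defaults, which is why existing callers are unaffected.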

To address the second, I made the master julia process bind to 0.0.0.0, as suggested in #107 (and explained in JuliaParallel/MPI.jl#222). My impression is that most HTCondor worker processes do not run on the submission/login node, so I think the new bind address should be the default, though I defer to others on this.
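
As a rough illustration of why the bind address matters, the sketch below uses only the Sockets standard library to open one listener on the loopback address and one on the wildcard address. It is not the change made in this PR, and the port hints 9000 and 9100 are arbitrary assumptions.

```julia
using Sockets

# Loopback listener: only processes on the same machine can connect.
# listenany picks the first free port at or above the hint.
port_local, local_server = listenany(ip"127.0.0.1", 9000)

# Wildcard listener: accepts connections on every IPv4 interface,
# so a process on another node can reach it (firewalls permitting).
port_any, any_server = listenany(ip"0.0.0.0", 9100)

# getsockname reports the address and port each server is bound to.
addr_local, _ = getsockname(local_server)
addr_any, _ = getsockname(any_server)

println("loopback listener bound to ", addr_local, ":", Int(port_local))
println("wildcard listener bound to ", addr_any, ":", Int(port_any))

close(local_server)
close(any_server)
```

A worker running on a different node can only reach the second kind of listener, which matches the situation described above where workers are not on the submission/login node.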

@tanmaykm tanmaykm merged commit 44fe0de into JuliaParallel:master Jan 4, 2021