`run-pandora` Install and Run Script Extensions for NDLAr Production #69

alexbooth92 · 2025-01-22T13:31:49Z

This PR extends install_pandora.sh and setup_pandora.sh so that run-pandora can handle both 2x2 and NDLAr jobs in a more automated way that's consistent with other steps in the production chain. It also addresses a couple of suggestions from @jback08. These changes have been tested with small scale interactive tests.

install_pandora.sh calls setup_pandora.sh which is detector dependent, the install is therefore detector dependent. Since setup_pandora.sh is also used in the run_* scripts, the flexibility has been added to pull geometry information from environment via, for example, ARCUBE_GEOM in a way that is consistent with how that variable is used in other stages of production. If you are installing and running 2x2, all changes are completely transparent except having to pass 2x2 to the install script. For NDLAr need to pass ndlar to the install script and specify ARCUBE_GEOM and ARCUBE_PANDORA_GEOM as explained in the install script.
Change what's passed to -r when running pandora from AllHitsNu to AllHitsSliceNu.
Bump LArRecoND to v01-01-03, already been used by 2x2 production but not changed here.

… B and recently used for 2x2)

…, install is therefore detector dependent. setup_pandora.sh is also used int the run_* scripts so added flexibility to pull geometry from environment via, for example ARCUBE_GEOM, which is consistent with how that variable is used in other stages of production. If you are installing and running 2x2, all changes are completely transparent except having to pass '2x2' to the install script. For NDLAr need to pass ndlar to the install script and specify ARCUBE_GEOM and ARCUBE_PANDORA_GEOM as explained in the install script.

…ssue we had with a seg fault that could occur when the CaloHit IDs were being set).

noeroy

Thanks for putting that together, and sorry I haven't been more thorough to push my modifications as soon as I had to make them.

While we're looking at additions to run-pandora, you probably want to add export OMP_NUM_THREADS=1 in the setup_pandora.sh script, otherwise flow2root eats too much memory.

run-pandora/setup_pandora.sh

run-pandora/run_pandora.sh

alexbooth92 · 2025-01-24T15:42:02Z

Thanks for putting that together, and sorry I haven't been more thorough to push my modifications as soon as I had to make them.

While we're looking at additions to run-pandora, you probably want to add export OMP_NUM_THREADS=1 in the setup_pandora.sh script, otherwise flow2root eats too much memory.

No worries - things move fast! On the export OMP_NUM_THREADS=1 do we want this for the "actual" pandora too or just the file conversion step?

alexbooth92 · 2025-01-24T15:54:02Z

Thanks for putting that together, and sorry I haven't been more thorough to push my modifications as soon as I had to make them.
While we're looking at additions to run-pandora, you probably want to add export OMP_NUM_THREADS=1 in the setup_pandora.sh script, otherwise flow2root eats too much memory.

No worries - things move fast! On the export OMP_NUM_THREADS=1 do we want this for the "actual" pandora too or just the file conversion step?

I added it as a configurable export in 8093e67

jback08 · 2025-01-24T21:50:06Z

These changes look OK. I have a few other suggestions that we could try to add to this PR or leave for another future PR:

In run_flow2root.sh, the 2nd from last integer parameter 0 for the command

python3 $ARCUBE_PANDORA_INSTALL/LArRecoND/ndlarflow/h5_to_root_ndlarflow.py $inFile 0 $tmpOutDir

needs to be changed to 1 whenever we have data and not MC. So I would like to propose that we have an isData boolean here equal to 0 or 1 depending if the ARCUBE_PANDORA_INPUT_FORMAT environment variable in setup_pandora.sh is set to either SPMC or SP (data), respectively.

I understand from Bruce Howard that the h5_to_root_ndlarflow.py script we use in run_flow2root.sh needs to use different neutrino vertex variable names depending on the detector setup:

MiniRun6 2x2: nuVertexX.append(vtx["vertex"][0]
NDLAr version: nuVertexX.append(vtx["x_vert"])

and similarly for y and z. Should we handle this by using a 2x2 or ndlar parameter in the h5_to_root_ndlarflow.py script which then decides what these vertex variable names should be? If so, we would also need to pass this extra parameter when this script is called in run_flow2root.sh after the LArRecoND script has been updated. Will these vertex variables be unified to be the same for 2x2 and ND-LAr at some point in the future?

In addition to the output file tmpAnaOut=${tmpRunDir}/LArRecoND.root that is moved at the end of the run_pandora.sh script, which is used to create the CAFs, it will also be useful to move the MCHierarchy.root and EventHierarchy.root output files that are made with LArRecoND version v01-01-04, which will allow us to find MC reconstruction efficiencies and other performance plots independent of the CAFs. But this will almost double the output file directory size. So this is a "nice to have" request that is not necessary for the CAF maker. The LArRecoND.root files store the reco PFOs along with any best matched MC (if any), while the other files store the MC particles along with any matched reco PFOs (best to worse), so we know the efficiency denominators.

alexbooth92 · 2025-01-25T10:56:30Z

These changes look OK. I have a few other suggestions that we could try to add to this PR or leave for another future PR:

In run_flow2root.sh, the 2nd from last integer parameter 0 for the command

python3 $ARCUBE_PANDORA_INSTALL/LArRecoND/ndlarflow/h5_to_root_ndlarflow.py $inFile 0 $tmpOutDir

needs to be changed to 1 whenever we have data and not MC. So I would like to propose that we have an isData boolean here equal to 0 or 1 depending if the ARCUBE_PANDORA_INPUT_FORMAT environment variable in setup_pandora.sh is set to either SPMC or SP (data), respectively.

I understand from Bruce Howard that the h5_to_root_ndlarflow.py script we use in run_flow2root.sh needs to use different neutrino vertex variable names depending on the detector setup:

MiniRun6 2x2: nuVertexX.append(vtx["vertex"][0] NDLAr version: nuVertexX.append(vtx["x_vert"])

and similarly for y and z. Should we handle this by using a 2x2 or ndlar parameter in the h5_to_root_ndlarflow.py script which then decides what these vertex variable names should be? If so, we would also need to pass this extra parameter when this script is called in run_flow2root.sh after the LArRecoND script has been updated. Will these vertex variables be unified to be the same for 2x2 and ND-LAr at some point in the future?

In addition to the output file tmpAnaOut=${tmpRunDir}/LArRecoND.root that is moved at the end of the run_pandora.sh script, which is used to create the CAFs, it will also be useful to move the MCHierarchy.root and EventHierarchy.root output files that are made with LArRecoND version v01-01-04, which will allow us to find MC reconstruction efficiencies and other performance plots independent of the CAFs. But this will almost double the output file directory size. So this is a "nice to have" request that is not necessary for the CAF maker. The LArRecoND.root files store the reco PFOs along with any best matched MC (if any), while the other files store the MC particles along with any matched reco PFOs (best to worse), so we know the efficiency denominators.

Thank you the detailed look @jback08! In the interest of getting this particular PR merged, I suggest that we make an issue for your suggestions 1 and 3 - when I have 30 minutes next week, I'd be happy to tackle them. With regards to number 2 - I am ashamed to say that I haven't found out where this difference in naming for the vertex variables is coming from, however in my on-going work to roll forward the NDLAr version of the software stack from "Run 6" like to present day latest-and-greatest, the NDLAr naming is now consistent with 2x2. I strongly suspect that it's just a single commit somewhere that I missed during cherry-picking back in September. My vote would be to live with it for now since it will soon become a non-issue.

alexbooth92 added 4 commits January 22, 2025 05:03

Enable slicing to reconstruct multiple neutrinos (recommended by John…

88710e5

… B and recently used for 2x2)

Bump tag of LArRecoND to v01-01-03, recommended by John B (fixes an i…

0b03411

…ssue we had with a seg fault that could occur when the CaloHit IDs were being set).

Comment out the useful lines for NDLAr.

f67af2e

alexbooth92 marked this pull request as ready for review January 22, 2025 16:12

alexbooth92 requested review from jback08 and noeroy January 22, 2025 16:13

noeroy approved these changes Jan 24, 2025

View reviewed changes

run-pandora/setup_pandora.sh Outdated Show resolved Hide resolved

noeroy reviewed Jan 24, 2025

View reviewed changes

run-pandora/run_pandora.sh Show resolved Hide resolved

Bump LArNDReco again

cd7260b

Add exporting of OMP_NUM_THREADS configurably.

8093e67

jback08 approved these changes Jan 24, 2025

View reviewed changes

alexbooth92 merged commit a59d136 into develop Jan 25, 2025

alexbooth92 deleted the feature/abooth-pandora_for_nd_prod branch January 25, 2025 10:57

alexbooth92 mentioned this pull request Jan 25, 2025

Quality of Life Additions for run-pandora #70

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

`run-pandora` Install and Run Script Extensions for NDLAr Production #69

`run-pandora` Install and Run Script Extensions for NDLAr Production #69

alexbooth92 commented Jan 22, 2025 •

edited

Loading

noeroy left a comment •

edited

Loading

alexbooth92 commented Jan 24, 2025

alexbooth92 commented Jan 24, 2025

jback08 commented Jan 24, 2025 •

edited

Loading

alexbooth92 commented Jan 25, 2025

run-pandora Install and Run Script Extensions for NDLAr Production #69

run-pandora Install and Run Script Extensions for NDLAr Production #69

Conversation

alexbooth92 commented Jan 22, 2025 • edited Loading

noeroy left a comment • edited Loading

Choose a reason for hiding this comment

alexbooth92 commented Jan 24, 2025

alexbooth92 commented Jan 24, 2025

jback08 commented Jan 24, 2025 • edited Loading

alexbooth92 commented Jan 25, 2025

`run-pandora` Install and Run Script Extensions for NDLAr Production #69

`run-pandora` Install and Run Script Extensions for NDLAr Production #69

alexbooth92 commented Jan 22, 2025 •

edited

Loading

noeroy left a comment •

edited

Loading

jback08 commented Jan 24, 2025 •

edited

Loading