vboxwrapper: create the 'virtualbox home directory' in the project dir #6018

Draft
wants to merge 2 commits into
base: master

davidpanderson
Contributor

@davidpanderson davidpanderson commented Jan 22, 2025

The 'VBox home directory' is where VBox writes log files
(which are read by vboxwrapper).
If this is not specified by the env var VBOX_USER_HOME,
we need to create it somewhere and set the env var to point there.

Previously we put it in <datadir>/projects.
That's no good because it's not a project, and the client erased it.

We also tried putting it in the (real) user's home dir.
That's no good because

  1. we shouldn't mess with the home dir
  2. in sandboxed configs we're running as user 'boinc_project',
    and don't have access to the home dir.

According to https://boinc.berkeley.edu/sandbox_design.php,
the only places 'boinc_project' can write are
projects/, slots/, and their subdirectories.
So the logical places to put .VirtualBox are
this job's slot directory, or its project directory.
I chose the latter.
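
The approach described above can be sketched roughly as follows. This is an illustration under assumptions, not the code in this PR: `ensure_vbox_home` and its `project_dir` parameter are hypothetical names, and the sketch assumes the caller already knows the job's project directory.

```cpp
#include <cstdlib>
#include <filesystem>
#include <string>
#include <system_error>

// Hypothetical helper (not taken from vboxwrapper).
// Returns the VBox home directory in use, or an empty string on failure.
std::string ensure_vbox_home(const std::filesystem::path& project_dir) {
    // Respect a value that is already set in the environment.
    if (const char* p = std::getenv("VBOX_USER_HOME")) {
        if (*p) return p;
    }
    // Otherwise create .VirtualBox inside the project directory.
    const std::filesystem::path home = project_dir / ".VirtualBox";
    std::error_code ec;
    std::filesystem::create_directories(home, ec);  // not an error if it already exists
    if (ec) return "";
    const std::string value = home.string();
    // Point VirtualBox at the new directory.
#ifdef _WIN32
    if (_putenv_s("VBOX_USER_HOME", value.c_str()) != 0) return "";
#else
    if (setenv("VBOX_USER_HOME", value.c_str(), 1) != 0) return "";
#endif
    return value;
}
```

Returning an empty string on failure lets the caller log the problem instead of continuing with an unset variable.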

As far as I can tell this dir is used only on Win;
we run VBoxSVC.exe there, and look for a log file later.
Should we remove it for other platforms?

Also: there's some code to create a 'scratch' dir, projects/scratch.
This seems like a bad idea; we shouldn't put random stuff in projects/,
and also if there are multiple VM jobs they share the same dir.
Should we get rid of this?
@AenBleidd requested a review from Copilot on January 22, 2025 21:30


Copilot wasn't able to review any files in this pull request.

Files not reviewed (1)
  • samples/vboxwrapper/vbox_vboxmanage.cpp: Language not supported
@NenTech

NenTech commented Jan 22, 2025

This is the latest result with the 26209 (compared to the code):
`2025-01-22 22:54:38 (13555):
Command: VBoxManage -q --version
Exit Code: 0
Output:
7.1.4r165100

2025-01-22 22:54:38 (13555):
Command: VBoxManage -q list systemproperties
Exit Code: 0
Output:
VBoxManage: error: Failed to initialize COM because the global settings directory '/var/empty/Library/VirtualBox' is not accessible!

2025-01-22 22:54:38 (13555):
Command: VBoxManage -q list hostinfo
Exit Code: 0
Output:
VBoxManage: error: Failed to initialize COM because the global settings directory '/var/empty/Library/VirtualBox' is not accessible!`
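
Note that the commands in this log report Exit Code: 0 even where the output carries an error, so a caller would have to inspect the text itself. A minimal sketch of such a check, assuming the message format shown in this log; `inaccessible_settings_dir` is a hypothetical helper, not a vboxwrapper function:

```cpp
#include <optional>
#include <string>

// Hypothetical helper: returns the directory named in VBoxManage's
// "Failed to initialize COM ... global settings directory '<dir>'" message,
// or std::nullopt if the output does not contain that message.
std::optional<std::string> inaccessible_settings_dir(const std::string& output) {
    static const std::string marker =
        "Failed to initialize COM because the global settings directory '";
    const std::size_t start = output.find(marker);
    if (start == std::string::npos) return std::nullopt;
    // The directory runs from the end of the marker to the closing quote.
    const std::size_t begin = start + marker.size();
    const std::size_t end = output.find('\'', begin);
    if (end == std::string::npos) return std::nullopt;
    return output.substr(begin, end - begin);
}
```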

Can someone supply me this file compiled for testing?

@davidpanderson
Contributor Author

I'm not sure the changes I've been making (involving the 'virtualbox home directory')
are relevant to this problem.

On the Mac, BOINC jobs run as a 'hidden user', boinc_project.
This user has minimal privileges.
It looks like it's unable to run VBoxManage successfully because it can't access
'/var/empty/Library/VirtualBox'.

I'm confused.

  • Did vboxwrapper ever work on Mac? What's changed?
  • why is it trying to initialize COM?

@NenTech

NenTech commented Jan 22, 2025

> I'm not sure the changes I've been making (involving the 'virtualbox home directory') are relevant to this problem.
>
> On the Mac, BOINC jobs run as a 'hidden user', boinc_project. This user has minimal privileges. It looks like it's unable to run VBoxManage successfully because it can't access '/var/empty/Library/VirtualBox'
>
> I'm confused.
>
>   • Did vboxwrapper ever work on Mac? What's changed?

Yes, it did work. When I switch to the 26207b wrapper, tasks run but fail due to a network connection problem.

>   • why is it trying to initialize COM?

Here I have the trace and replay of the task:
https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=6277&postid=51445

The task:
https://lhcathome.cern.ch/lhcathome/result.php?resultid=419017121

@davidpanderson
Contributor Author

I can't see (in those logs) any problems except the cvmfs connect failure,
and the COM errors which don't seem to matter.
Possibly a CERN issue?

@NenTech

NenTech commented Jan 23, 2025

> I can't see (in those logs) any problems except the cvmfs connect failure, and the COM errors which don't seem to matter. Possibly a CERN issue?

That is true. The error I sent earlier here (Command: VBoxManage -q list systemproperties) is with the new wrapper. I'm trying to debug it now on my Mac. I cannot find it. With automatic launch of the wrapper, the project fails immediately after start.

https://lhcathome.cern.ch/lhcathome/result.php?resultid=419017986

New info: the VirtualBox home directory has changed. The folder is created (without the leading '.').
VBoxManage still goes to ~/Virtualbox for the settings.

BOINC is not allowed to create .VirtualBox on Mac since this is system only! (VirtualBox is allowed)

https://lhcathome.cern.ch/lhcathome/result.php?resultid=419018059

The folder for the VirtualBox settings needs to be changed, since VirtualBox places them in ~, which other users are not allowed to access. I'm tired now. Going to bed and will try to continue testing this tomorrow.

@computezrmle
Contributor

I just archived a couple of logs from the LHC@home server before they disappear.
Can provide them if necessary.

  • all of them valids from Apple computers
  • all of them vboxwrapper 26207
  • various BOINC client versions
  • various VirtualBox versions (even the old v5.2.44)
  • various subprojects CMS/Theory

I remember even @NenTech had valids reported by the previous app versions using vboxwrapper 26207 (unfortunately the logs are not available any more).

This recent result from @NenTech failed although vboxwrapper 26207 has been used:
https://lhcathome.cern.ch/lhcathome/result.php?resultid=419022990

Very unlikely that it is caused by the vdi file since

  • this hasn't changed (except the filename and its VirtualBox UUID => both a must to avoid conflicts between old/new app versions if differencing images are used)
  • it is the same that works for Windows and Linux

Host internal networking appears to work since "Mounting the shared directory" succeeded.
External networking fails, hence ntp and CVMFS connections are affected.

The Hypervisor System Log reports lots of this:
00:00:03.831814 DCon01 ERROR [COM]: aRC=E_ACCESSDENIED (0x80070005) aIID={e36a5081-a82a-40bd-9e4e-42a44d6ce50f} aComponent={MachineWrap} aText={The object functionality is limited}, preserve=false aResultDetail=0
Not sure which object is meant.
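
For sorting through many such lines, the result code, component, and text can be pulled out with plain string searches. A minimal sketch, assuming the `aRC=`, `aComponent={...}` and `aText={...}` layout seen in the line above; `field_after`, `ComError` and `parse_com_error` are hypothetical names:

```cpp
#include <string>

// Hypothetical helper: returns the text following `key` up to the next
// occurrence of `terminator` (or to the end of the line if there is none).
// Returns an empty string if `key` is absent.
std::string field_after(const std::string& line, const std::string& key, char terminator) {
    const std::size_t k = line.find(key);
    if (k == std::string::npos) return "";
    const std::size_t begin = k + key.size();
    const std::size_t end = line.find(terminator, begin);
    return line.substr(begin, end == std::string::npos ? std::string::npos : end - begin);
}

struct ComError {
    std::string rc;         // e.g. "E_ACCESSDENIED"
    std::string component;  // e.g. "MachineWrap"
    std::string text;       // human-readable message
};

// Extracts the three fields from one "ERROR [COM]" log line.
ComError parse_com_error(const std::string& line) {
    return {
        field_after(line, "aRC=", ' '),
        field_after(line, "aComponent={", '}'),
        field_after(line, "aText={", '}'),
    };
}
```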

Could it be the app version setup?
To verify this @NenTech please extract the app_version section (Theory 30060) from client_state.xml and post it here.

Another log from @NenTech's computer reports this:
https://lhcathome.cern.ch/lhcathome/result.php?resultid=419016867

VBoxManage -q showvminfo "boinc_a0b98d0ab8ba4e7c" --machinereadable 
Output:
VBoxManage: error: Failed to initialize COM because the global settings directory '/Users/nentech/Library/VirtualBox' is not accessible!

Not sure why this points to the user's directory.
I would expect it to point to the BOINC account's directory.
@NenTech Did you run the test from your normal user account or via BOINC?

@davidpanderson changed the title from "vboxwrapper: actually create the 'virtualbox home directory'" to "vboxwrapper: actually create the 'virtualbox home directory' [WIP]" on Jan 23, 2025
@davidpanderson
Contributor Author

don't merge for now

@NenTech

NenTech commented Jan 23, 2025

@computezrmle
Here are the <app_version> sections of the Theory and CMS apps. Both are failing the same way.

<app_version>
    <app_name>Theory</app_name>
    <version_num>30060</version_num>
    <platform>x86_64-apple-darwin</platform>
    <avg_ncpus>1.000000</avg_ncpus>
    <flops>4727558580.521004</flops>
    <plan_class>vbox64_theory</plan_class>
    <api_version>8.1.0</api_version>
    <file_ref>
        <file_name>vboxwrapper_26208_x86_64-apple-darwin</file_name>
        <main_program/>
        <copy_file/>
    </file_ref>
    <file_ref>
        <file_name>Theory_2025_01_16_prod.xml</file_name>
        <open_name>vbox_job.xml</open_name>
        <copy_file/>
    </file_ref>
    <file_ref>
        <file_name>Theory_2025_01_16_prod.vdi</file_name>
    </file_ref>
    <dont_throttle/>
    <needs_network/>
</app_version>
<app_version>
    <app_name>CMS</app_name>
    <version_num>7060</version_num>
    <platform>x86_64-apple-darwin</platform>
    <avg_ncpus>4.000000</avg_ncpus>
    <flops>18910234322.084015</flops>
    <plan_class>vbox64_mt_mcore_cms</plan_class>
    <api_version>8.1.0</api_version>
    <cmdline>--nthreads 4</cmdline>
    <file_ref>
        <file_name>vboxwrapper_26208_x86_64-apple-darwin</file_name>
        <main_program/>
        <copy_file/>
    </file_ref>
    <file_ref>
        <file_name>CMS_2025_01_16_prod.xml</file_name>
        <open_name>vbox_job.xml</open_name>
        <copy_file/>
    </file_ref>
    <file_ref>
        <file_name>CMS_2025_01_16_prod.vdi</file_name>
    </file_ref>
    <dont_throttle/>
    <needs_network/>
</app_version>
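
Both sections above name vboxwrapper_26208_x86_64-apple-darwin as the main program. To check this quickly across hosts, a rough sketch like the following could extract that file name. It uses naive string scanning rather than BOINC's own XML parser, and `between` and `main_program_file` are hypothetical helpers:

```cpp
#include <string>

// Returns the text between `open` and `close`, searching from `from`.
// Sets `next` to the position just past `close`, or npos if not found.
static std::string between(const std::string& s, const std::string& open,
                           const std::string& close, std::size_t from,
                           std::size_t& next) {
    next = std::string::npos;
    const std::size_t a = s.find(open, from);
    if (a == std::string::npos) return "";
    const std::size_t b = s.find(close, a + open.size());
    if (b == std::string::npos) return "";
    next = b + close.size();
    return s.substr(a + open.size(), b - a - open.size());
}

// Returns the <file_name> of the first <file_ref> that contains
// <main_program/>, or an empty string if there is none.
std::string main_program_file(const std::string& app_version_xml) {
    std::size_t pos = 0;
    while (pos != std::string::npos) {
        std::size_t next;
        const std::string ref =
            between(app_version_xml, "<file_ref>", "</file_ref>", pos, next);
        if (next == std::string::npos) break;  // no further <file_ref> blocks
        if (ref.find("<main_program/>") != std::string::npos) {
            std::size_t ignored;
            return between(ref, "<file_name>", "</file_name>", 0, ignored);
        }
        pos = next;
    }
    return "";
}
```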

I tested both from my normal user account and via BOINC. From the normal user account, all versions of the wrapper give the same results as 26207b does via BOINC.
I got the result below because I changed the wrapper so that it would not exit after failing.

VBoxManage -q showvminfo "boinc_a0b98d0ab8ba4e7c" --machinereadable 
Output:
VBoxManage: error: Failed to initialize COM because the global settings directory '/Users/nentech/Library/VirtualBox' is not accessible!

Found the error and I have it fixed in my version 26210!
https://lhcathome.cern.ch/lhcathome/result.php?resultid=419047588

There is still 1 issue left. User group boinc_project is not allowed to create the directory VirtualBox in BOINC_data.

@CharlieFenton
Contributor

> User group boinc_project is not allowed to create the directory VirtualBox in BOINC_data.

I'm surprised, since I believe all project applications run as user boinc_master and group boinc_project, and are able to traverse the directory tree into the slots and project directories and successfully create files in those subdirectories. I don't see any indication in the referenced result log. How do you know that VBoxWrapper is failing to create the directory due to a permission error?
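
One way to tell a permission error apart from other failures is to report the error code when the directory creation fails. A minimal sketch using std::filesystem; `try_create_dir` is a hypothetical helper, and this is not how vboxwrapper currently reports the failure:

```cpp
#include <filesystem>
#include <string>
#include <system_error>

// Hypothetical helper: tries to create `dir` (parent must already exist)
// and returns a one-line diagnostic suitable for a log.
std::string try_create_dir(const std::filesystem::path& dir) {
    std::error_code ec;
    const bool created = std::filesystem::create_directory(dir, ec);
    if (!ec) return created ? "created" : "already exists";
    // EACCES maps to std::errc::permission_denied.
    if (ec == std::errc::permission_denied) return "permission denied";
    return "failed: " + ec.message();
}
```

Logging this string next to the path would show directly whether the sandbox permissions are the cause.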

@CharlieFenton
Contributor

> believe all project applications run as user boinc_master and group boinc_project,

This is based on my reading of the current setprojectgrp.cpp, which is invoked using switcher in set_to_project_group() in sandbox.cpp which in turn is called from both ACTIVE_TASK::start() and ACTIVE_TASK::setup_file() in app_start.cpp as well as other places.

@davidpanderson
Contributor Author

According to https://boinc.berkeley.edu/sandbox_design.php,
apps run as user boinc_project and group boinc_project,
so they can't write to the top-level data directory.

I'm submitting a PR for vboxwrapper where it creates .VirtualBox
in its project directory, which it's guaranteed to have write access to.

@davidpanderson
Contributor Author

I think this should work, but I can't currently test it.
It's designed to work on:

  • Mac: standard install (sandboxed)
  • Win: regular or service (sandboxed) install
  • Linux: any type of install

@davidpanderson changed the title from "vboxwrapper: actually create the 'virtualbox home directory' [WIP]" to "vboxwrapper: create the 'virtualbox home directory' in the project dir" on Jan 24, 2025