feat: fail if users try to run managed builds that would use a nonfunctional container #720

Open
wants to merge 27 commits into
base: main

Conversation

mattculler
Contributor

  • Have you followed the guidelines for contributing?
  • Have you signed the CLA?
  • Have you successfully run tox?

New LXD versions fix an issue we've seen where a mismatch in cgroup v1 and v2 support between host and guest leaves the container's systemd partially nonfunctional. Add checks that fail with a clear error message if a user's system would launch such a container.

(CRAFT-3096)

Collaborator

@mr-cal mr-cal left a comment

Looks good overall. Seems possible to get this done with or without the loopback executor. I'd like to get @lengau's opinion on that, since he was talking about this when architecting craft-application.

craft_providers/lxd/lxc.py (outdated, resolved)
craft_providers/bases/ubuntu.py (outdated, resolved)
@mattculler
Contributor Author

mattculler commented Jan 28, 2025

> Looks good overall. Seems possible to get this done with or without the loopback executor. I'd like to get @lengau's opinion on that, since he was talking about this when architecting craft-application.

Yeah, you could. Taking a fresh look at it today, I see a good way to refactor Base._get_os_release so the meat of it is reusable without Executor. Going to try that instead.

@mr-cal I think this is better, going this route and removing loopback executor unless you disagree: d2ff086
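
For readers following the refactor idea, here is a minimal sketch of what "the meat" of os-release handling could look like once it is separated from any executor. The name `parse_os_release` is hypothetical and is not taken from craft-providers; the point is only that parsing operates on text, so the same function can serve content read from the host or fetched from an instance.

```python
from __future__ import annotations

import shlex


def parse_os_release(content: str) -> dict[str, str]:
    """Parse os-release text into a dict (hypothetical helper, not the library's API)."""
    result: dict[str, str] = {}
    for raw_line in content.splitlines():
        line = raw_line.strip()
        # Skip blanks, comments, and anything that is not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Values may be shell-quoted; shlex strips the quotes and escapes.
        parts = shlex.split(value)
        result[key] = parts[0] if parts else ""
    return result


sample = 'NAME="Ubuntu"\nVERSION_ID="24.10"\nID=ubuntu\n'
print(parse_os_release(sample)["VERSION_ID"])  # 24.10
```

With the parsing isolated like this, a host-side caller would only need to read `/etc/os-release` itself and pass the text in, which is presumably why no loopback executor is required.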

@mr-cal
Collaborator

mr-cal commented Jan 28, 2025

> @mr-cal I think this is better, going this route and removing loopback executor unless you disagree: d2ff086

I like it. I'm interested in a loopback, but keeping this PR lean will speed up the dev and review process.

@mattculler mattculler marked this pull request as ready for review January 30, 2025 04:53
@mattculler mattculler requested a review from mr-cal January 30, 2025 04:53
# between cgroupv1 and v2 support.
# https://discourse.ubuntu.com/t/lxd-5-0-4-lts-has-been-released/49681#p-123331-support-for-ubuntu-oracular-containers-on-cgroupv2-hosts
if (
    host_base_alias > BuilddBaseAlias.FOCAL
Contributor

I'm not keen on having this be such an Ubuntu-specific solution. We should be able to make the logic generic and just have a data set that we can add to for managing this. For example:

BASE_MINIMUM_REQUIREMENTS = {
    BuilddBaseAlias.ORACULAR: {
        "lxd": [(5, 0, 4), (5, 21, 2)],
        "kernel": [(5, 15)],
    },
}
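
One way such a data set could drive a generic check is sketched below. Everything except the shape of `BASE_MINIMUM_REQUIREMENTS` is an assumption: `BaseAlias` is a stand-in for `BuilddBaseAlias`, the function names are hypothetical, and the matching rule (a minimum applies within its own release series, and versions outside every listed series are compared against the highest listed minimum) is one possible reading of the proposal, not something the comment specifies.

```python
from enum import IntEnum


class BaseAlias(IntEnum):
    """Hypothetical stand-in for BuilddBaseAlias."""
    FOCAL = 2004
    NOBLE = 2404
    ORACULAR = 2410


BASE_MINIMUM_REQUIREMENTS = {
    BaseAlias.ORACULAR: {
        "lxd": [(5, 0, 4), (5, 21, 2)],
        "kernel": [(5, 15)],
    },
}


def meets_minimum(version, minimums):
    """Return True if `version` satisfies a list of per-series minimums.

    Assumption: a minimum's series is everything but its last component.
    """
    for minimum in minimums:
        series = minimum[:-1]
        if version[: len(series)] == series:
            return version >= minimum
    # Outside every listed series: fall back to the highest minimum.
    return version >= max(minimums)


def unmet_requirements(guest, lxd_version, kernel_version):
    """List the names of requirements the host fails for this guest base."""
    found = {"lxd": lxd_version, "kernel": kernel_version}
    requirements = BASE_MINIMUM_REQUIREMENTS.get(guest, {})
    return [
        name
        for name, minimums in requirements.items()
        if not meets_minimum(found[name], minimums)
    ]


print(unmet_requirements(BaseAlias.ORACULAR, (5, 0, 3), (5, 4, 0)))  # ['lxd', 'kernel']
```

Under this reading, adding a new affected base would only mean adding a dictionary entry, which is the maintainability argument being made here.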

Collaborator

Agreed, this design would let us make changes in the future without needing to refactor.

Contributor Author

I can make this change, but it does seem to me like we may be making this too generic.

If I do, the data structure would really need to look something like this:

INVALID_VERSIONS = [
    {
        "host": BuilddBaseAlias.FOCAL,
        "guest": BuilddBaseAlias.ORACULAR,
        "lxd": {
            (4,): (float("inf"), float("inf")),  # everything 4.x and below fails
            (5, 0): (4,),
            (5, 21): (2,),
            (6,): (float("-inf"), float("-inf")),  # everything 6.x and above passes
        },
        "kernel": [(5, 15)],
    },
]

This way, your logic can say something like (pseudocode):

for invalid in INVALID_VERSIONS:
  if host > invalid.host or guest < invalid.guest:
    continue

  lxd_version_in_family = lxd_version_match(lxd_version, invalid.lxd.keys()) 
  if not lxd_version_in_family:
    # Current lxd version doesn't match any of the keys
    continue
  # This is a lxd version that may be affected by the bug
  if lxd_version_fail(lxd_version, lxd_version_in_family):
    raise Ex('You need a newer lxd for this host/guest')
  
  if kernel < invalid.kernel:
    raise Ex('You need a newer kernel for this host/guest')

Since you and Callahan both agreed that it should look something like this, I'll start down this path.
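
The pseudocode above can be turned into something runnable along these lines. This is a sketch under stated assumptions, not the code that landed in the PR: the integer base constants stand in for `BuilddBaseAlias` members, `InvalidVersions`, `IncompatibleHostError`, `lxd_is_affected`, and `ensure_compatible` are hypothetical names, and the infinity sentinels are replaced by the simpler assumption that any major below the oldest listed one fails and any major above the newest listed one passes.

```python
from dataclasses import dataclass

# Hypothetical stand-ins for BuilddBaseAlias members.
FOCAL, JAMMY, ORACULAR = 2004, 2204, 2410


@dataclass(frozen=True)
class InvalidVersions:
    host: int            # rule applies to hosts at or below this base
    guest: int           # and to guests at or above this base
    lxd_min_patch: dict  # (major, minor) -> first patch release with the fix
    kernel_min: tuple    # kernels below this are rejected


INVALID_VERSIONS = [
    InvalidVersions(FOCAL, ORACULAR, {(5, 0): 4, (5, 21): 2}, (5, 15)),
]


class IncompatibleHostError(Exception):
    """Hypothetical error type for this sketch."""


def lxd_is_affected(lxd_version, min_patch):
    major, minor, patch = lxd_version
    if major < min(series[0] for series in min_patch):
        return True   # older majors predate the fix
    if major > max(series[0] for series in min_patch):
        return False  # newer majors are assumed to include it
    first_fixed = min_patch.get((major, minor))
    # Unlisted series within a listed major: nothing is known, so do not fail.
    return first_fixed is not None and patch < first_fixed


def ensure_compatible(host, guest, lxd_version, kernel_version):
    for rule in INVALID_VERSIONS:
        if host > rule.host or guest < rule.guest:
            continue
        if lxd_is_affected(lxd_version, rule.lxd_min_patch):
            raise IncompatibleHostError("You need a newer lxd for this host/guest")
        if kernel_version < rule.kernel_min:
            raise IncompatibleHostError("You need a newer kernel for this host/guest")
```

Even in this trimmed form, the rule needs three different behaviours for LXD versions (below, inside, and above the listed majors), which illustrates the "too generic" concern raised in the comment.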

Contributor Author

*Not exactly like that, I'm refining this design as I implement, but this is the gist.

Collaborator

Yeah, this is getting complex fast. Seems like we need a class or library to handle versioning more cleanly. I'd be fine to leave it as you originally implemented.

Also, this code may not be long-lived. It's going to be increasingly rare for someone to use these older minor versions of lxd.

Contributor Author

@mr-cal The code I've got right now is working out a little more cleanly than the above would suggest, mainly by making some non-generic assumptions about lxd versions. I wasn't sold on this approach before, but I think what I'm settling on now is better - look for an update shortly.

Contributor

Could we forget about 5.0.4 and just check if the version tuple >= (5, 21, 2)?

Collaborator

@lengau, what's your rationale for 5.21 and not 5.0? Both are LTS versions

Contributor Author

These two commits implement a generic solution:
f4c8161
31dce45

But I'm torn: with the generic approach it's harder to reason about what the code is doing, and we're unlikely to need to add to this INVALID_VERSIONS structure. Thoughts, @lengau @mr-cal?

Contributor

@mr-cal my rationale is that anything after 5.21.2 works, but some things greater than 5.0.4 don't work.
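
The trade-off between the two cutoffs follows directly from how Python compares tuples, as the short illustration below shows. The function names are hypothetical, and `(5, 10, 0)` is only a made-up version number sitting between the two LTS series; the sketch does not claim anything about which real releases contain the fix.

```python
# Releases that carry the fix, per the thread: one per LTS series.
FIXED_RELEASES = [(5, 0, 4), (5, 21, 2)]


def passes_single_cutoff(version):
    """The simplification floated above: accept only 5.21.2 and newer."""
    return version >= (5, 21, 2)


def passes_lowest_cutoff(version):
    """A naive alternative: accept anything at or above the oldest fixed release."""
    return version >= min(FIXED_RELEASES)


hypothetical_in_between = (5, 10, 0)
print(passes_lowest_cutoff(hypothetical_in_between))  # True
print(passes_single_cutoff(hypothetical_in_between))  # False
print(passes_single_cutoff((5, 0, 4)))                # False
```

A single `>= (5, 21, 2)` comparison never lets an in-between version through, but it also rejects 5.0.4, which the linked release notes describe as fixed. A single `>= (5, 0, 4)` comparison has the opposite problem, which is why the per-series list exists.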

craft_providers/bases/ubuntu.py (outdated, resolved)
if lxd_major == 5:
    # Major is 5, we care about patch versions given the minor
    if lxd_minor == 0 and lxd_patch < 4:
        raise lxd_exception
Contributor

We can detect both the lxd version issue and the kernel version issue in one run and raise both problems simultaneously.

Contributor Author

Yeah, this is probably the pattern I'll move towards. The error messages will be a little less specific, but I think that's an acceptable tradeoff here.
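
A minimal sketch of the "detect everything, raise once" pattern follows. The exception class, function name, and message wording are hypothetical; the LXD and kernel cutoffs mirror the numbers discussed in this thread (patch minimums for the 5.0 and 5.21 series, majors below 5 rejected, kernel 5.15 as a linear cutoff).

```python
class IncompatibleHostError(Exception):
    """Hypothetical error that carries every detected problem."""

    def __init__(self, problems):
        self.problems = list(problems)
        super().__init__("; ".join(self.problems))


def check_host(lxd_version, kernel_version):
    """Collect all problems first, then raise a single error if any exist."""
    problems = []

    major, minor, patch = lxd_version
    lxd_too_old = major < 5 or (
        major == 5
        and ((minor == 0 and patch < 4) or (minor == 21 and patch < 2))
    )
    if lxd_too_old:
        problems.append("LXD is too old for this host/guest combination")

    if kernel_version < (5, 15):
        problems.append("kernel is older than 5.15")

    if problems:
        raise IncompatibleHostError(problems)


try:
    check_host((5, 0, 3), (5, 4, 0))
except IncompatibleHostError as err:
    print(len(err.problems))  # 2
```

The user sees both issues after one failed run instead of fixing LXD only to hit the kernel error next, at the cost of a less tailored message per problem.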

craft_providers/lxd/lxc.py (resolved)
craft_providers/lxd/lxd_provider.py (resolved)
craft_providers/bases/ubuntu.py (resolved)
@mattculler mattculler linked an issue Jan 30, 2025 that may be closed by this pull request
Collaborator

@mr-cal mr-cal left a comment

Looks good overall. Please re-request me after addressing Alex's comments

        raise lxd_exception
    if lxd_minor == 21 and lxd_patch < 2:
        raise lxd_exception
if lxd_major < 5:
Collaborator

Suggested change
- if lxd_major < 5:
+ elif lxd_major < 5:

Contributor Author

This logic will go away with the generic design Alex suggested above.

@mr-cal mr-cal requested review from mr-cal and lengau February 3, 2025 14:38
Collaborator

@mr-cal mr-cal left a comment

I'm +1 for your new design. It's readable and seems simpler, since it separates this particular cgroups bug from the if/else statements.

"""


INVALID_VERSIONS = [
Collaborator

Nice, this is very readable

        (5, 0, 4),
        (5, 21, 2),
    ],
    "kernel_less_than": (5, 15),
Contributor

How about making this a list of triples as well and using the same function for matching lxd and kernels?

Contributor Author

That symmetry would be sort of nice, but the complexity of the lxd logic isn't necessary for the kernel. We can't reason about any minor lxd versions under 5.x other than those listed, so we don't fail them. The kernel, on the other hand, is just a straight-up linear cutoff at 5.15.

Contributor

It's not necessary now, but it could be necessary in the future, especially given OEM kernels.
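
Treating LXD and kernel versions uniformly as triples presupposes that both can be normalized to the same shape. A small sketch of such a normalizer is below; `parse_triple` is a hypothetical helper, and the sample strings are illustrative release formats rather than values taken from the PR.

```python
import re

# Leading "major[.minor[.patch]]"; any suffix such as "-91-generic" is ignored.
_VERSION_RE = re.compile(r"^(\d+)(?:\.(\d+))?(?:\.(\d+))?")


def parse_triple(text):
    """Normalize a version or kernel release string to (major, minor, patch)."""
    match = _VERSION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"cannot parse version: {text!r}")
    # Missing components default to 0 so that every result is a triple.
    return tuple(int(part) if part is not None else 0 for part in match.groups())


print(parse_triple("5.21.2"))             # (5, 21, 2)
print(parse_triple("5.15.0-91-generic"))  # (5, 15, 0)
print(parse_triple("6.5.0-1014-oem"))     # (6, 5, 0)
```

On Linux, `platform.release()` from the standard library returns the kernel release string that a helper like this could consume. Once both values are triples, whether they share a matching function becomes a question of semantics (per-series minimums versus a linear cutoff) rather than of data shape.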

@mattculler mattculler requested review from lengau and mr-cal February 4, 2025 02:24
Development

Successfully merging this pull request may close these issues.

Set security.nesting=true for LXD projects
3 participants