Rework grub setup #246

Open
wants to merge 10 commits into
base: main

upils
Collaborator

@upils upils commented Sep 17, 2024

Properly install grub and update /boot/efi.

Fixes: FR-8890

@upils upils self-assigned this Sep 17, 2024

codecov bot commented Sep 17, 2024

Codecov Report

Attention: Patch coverage is 92.41379% with 11 lines in your changes missing coverage. Please review.

Project coverage is 93.99%. Comparing base (0100a11) to head (ca4acab).

Files with missing lines Patch % Lines
internal/statemachine/helper.go 88.29% 9 Missing and 2 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #246      +/-   ##
==========================================
- Coverage   93.99%   93.99%   -0.01%     
==========================================
  Files          18       19       +1     
  Lines        3412     3511      +99     
==========================================
+ Hits         3207     3300      +93     
- Misses        132      137       +5     
- Partials       73       74       +1     
Flag Coverage Δ
unittests 93.99% <92.41%> (-0.01%) ⬇️


@kukrimate
Member

This was a massively annoying debugging session, but it turns out the reason is that grub-install needs udev to be installed in order to work.

Executing: /usr/sbin/chroot /tmp/ubuntu-image-7061f556-b461-4c76-94c6-8d061326fa62/scratch/loopback grub-install /dev/loop1 --boot-directory=/boot --efi-directory=/boot/efi --target=x86_64-efi --uefi-secure-boot --no-nvram
Installing for x86_64-efi platform.
Installation finished. No error reported.
Executing: /usr/sbin/chroot /tmp/ubuntu-image-7061f556-b461-4c76-94c6-8d061326fa62/scratch/loopback ls /boot/grub
device.map
fonts
gfxblacklist.txt
grubenv
locale
unicode.pf2
x86_64-efi
Executing: /usr/sbin/chroot /tmp/ubuntu-image-7061f556-b461-4c76-94c6-8d061326fa62/scratch/loopback grub-install /dev/loop1 --target=i386-pc
Installing for i386-pc platform.
Installation finished. No error reported.
Executing: /usr/sbin/chroot /tmp/ubuntu-image-7061f556-b461-4c76-94c6-8d061326fa62/scratch/loopback ls /boot/grub
device.map
fonts
gfxblacklist.txt
grubenv
i386-pc
locale
unicode.pf2
x86_64-efi
Executing: /usr/sbin/chroot /tmp/ubuntu-image-7061f556-b461-4c76-94c6-8d061326fa62/scratch/loopback dpkg-divert --local --divert /etc/grub.d/30_os-prober.dpkg-divert --rename /etc/grub.d/30_os-prober
Adding 'local diversion of /etc/grub.d/30_os-prober to /etc/grub.d/30_os-prober.dpkg-divert'
Executing: /usr/sbin/chroot /tmp/ubuntu-image-7061f556-b461-4c76-94c6-8d061326fa62/scratch/loopback ls /etc/grub.d/
00_header
05_debian_theme
10_linux
10_linux_zfs
20_linux_xen
30_os-prober.dpkg-divert
30_uefi-firmware
40_custom
41_custom
README
Executing: /usr/sbin/chroot /tmp/ubuntu-image-7061f556-b461-4c76-94c6-8d061326fa62/scratch/loopback update-grub
Sourcing file `/etc/default/grub'
Sourcing file `/etc/default/grub.d/init-select.cfg'
Generating grub configuration file ...
done
Executing: /usr/sbin/chroot /tmp/ubuntu-image-7061f556-b461-4c76-94c6-8d061326fa62/scratch/loopback ls /boot/grub
device.map
fonts
gfxblacklist.txt
grub.cfg
grubenv
i386-pc
locale
unicode.pf2
x86_64-efi
Removing 'local diversion of /etc/grub.d/30_os-prober to /etc/grub.d/30_os-prober.dpkg-divert'
duration: 6.960428099s
Build successful
fonts
gfxblacklist.txt
grub.cfg
grubenv
i386-pc
locale
unicode.pf2
x86_64-efi
--- PASS: TestPackStateMachine_SuccessfulRun (64.37s)
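
The command sequence in the log above can be summarized in a small Go sketch. It only builds the argument lists and runs nothing. The flags are copied from the log, while the names `grubInstallPlan`, `chrootDir` and `loopDev` are hypothetical and are not taken from the PR.

```go
package main

import (
	"fmt"
	"strings"
)

// grubInstallPlan returns the commands seen in the log, in order, each
// wrapped in a chroot call. Illustrative only: the function name and its
// parameters are assumptions, not ubuntu-image's actual API.
func grubInstallPlan(chrootDir, loopDev string) [][]string {
	inChroot := func(args ...string) []string {
		return append([]string{"/usr/sbin/chroot", chrootDir}, args...)
	}
	osProber := "/etc/grub.d/30_os-prober"
	return [][]string{
		// First pass: install the UEFI target.
		inChroot("grub-install", loopDev,
			"--boot-directory=/boot", "--efi-directory=/boot/efi",
			"--target=x86_64-efi", "--uefi-secure-boot", "--no-nvram"),
		// Second pass: install the BIOS target.
		inChroot("grub-install", loopDev, "--target=i386-pc"),
		// Move 30_os-prober aside before generating grub.cfg; the log
		// shows the diversion being removed again afterwards.
		inChroot("dpkg-divert", "--local",
			"--divert", osProber+".dpkg-divert", "--rename", osProber),
		inChroot("update-grub"),
	}
}

func main() {
	for _, cmd := range grubInstallPlan("/path/to/chroot", "/dev/loop1") {
		fmt.Println(strings.Join(cmd, " "))
	}
}
```

Note that the log shows two grub-install runs, one per target, which is what makes both `x86_64-efi` and `i386-pc` appear under /boot/grub.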

@upils upils marked this pull request as ready for review September 20, 2024 13:59
…evel

This is needed in late stages when setting up the bootloader.

Signed-off-by: Paul Mars <[email protected]>
Properly install grub and packages according to the image architecture.

Signed-off-by: Paul Mars <[email protected]>
@upils upils requested review from mwhudson and sil2100 September 27, 2024 06:57
Contributor

@mwhudson mwhudson left a comment

So this is moving to a model where we do not expect the gadget to contain boot assets for classic builds, is that correct?

I have some other comments but I guess the above question is kinda important for me to understand!

teardownCmds = append([]*exec.Cmd{
execCommand("udevadm", "settle"),
}, teardownCmds...)

// udev needed to have grub-install properly work
Contributor

This is only in jammy and older, I think

Collaborator Author

@upils upils Sep 27, 2024

Without it, grub-install failed to run in a minbase chroot. See the TestPackStateMachine_SuccessfulRun test and the earlier comment by @kukrimate, who found the fix.
But if you know how we could avoid installing it, I am interested :).

Contributor

That test debootstraps jammy so that tracks. Given that I would expect the majority of ubuntu-image's usage to be for releases newer than jammy (to put it mildly) I would prefer not to have this cruft here...

Collaborator Author

AFAIK some of our users in other teams still build jammy-based images. I would like to discuss our policy regarding support of building old images.

Contributor

Well OK can we say "if release <= jammy" somewhere?

I got distracted halfway through responding to your comments, sorry about that. But I guess we're talking about all this stuff in a few hours anyway.

Collaborator Author

So far I have been doing everything I could to avoid hardcoding series-specific stuff in the tool. I would rather understand the root cause and find a heuristic to determine whether installing udev is needed or not. Do you know why it was needed for jammy and older (and why it is not anymore)?

}
prepareCmds = append(prepareCmds,
// Try to make sure udev is not racing with losetup and briefly
// vanishing device files. See LP: #2045586
Contributor

The upshot of that bug is that udevadm settle is not the solution though. The fix is to run "flock {loopUsed} mount {loopUsed}p{rootfsPartNum} {mountDir}"
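
As a rough illustration of the command quoted above, here is a minimal Go sketch that assembles its argument list. It relies only on the `flock <file> <command> [args...]` form of the util-linux tool. The helper name `lockedMountArgs` and the example paths are hypothetical, and nothing here is taken from the PR or from livecd-rootfs.

```go
package main

import (
	"fmt"
	"strconv"
)

// lockedMountArgs builds the argv for
//
//	flock {loopUsed} mount {loopUsed}p{rootfsPartNum} {mountDir}
//
// so that mount runs while a lock is held on the whole loop device.
// The function name and parameters are assumptions made for this sketch.
func lockedMountArgs(loopUsed string, rootfsPartNum int, mountDir string) []string {
	partDev := loopUsed + "p" + strconv.Itoa(rootfsPartNum)
	return []string{"flock", loopUsed, "mount", partDev, mountDir}
}

func main() {
	// Example values only; a caller would pass argv[0] and argv[1:]
	// to whatever command runner it already uses.
	fmt.Println(lockedMountArgs("/dev/loop1", 3, "/mnt/rootfs"))
}
```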

Collaborator Author

I followed the fix and the discussions about it when it was fixed in livecd-rootfs. I would like to fix it properly everywhere (this is not the only place we are calling udevadm), but this PR was not the place to do it, especially because I need to experiment with flock and check whether it is available on older series. See FR-6372

Contributor

flock has been part of util-linux (which is Essential) for about a million years (I think it might have been new in 5.04?) but ok

internal/statemachine/helper.go (resolved)
internal/statemachine/helper.go (outdated, resolved)
func (stateMachine *StateMachine) confFromArch(architecture string) (string, []string) {
switch architecture {
case arch.AMD64:
return "x86_64-efi", []string{"grub-pc", "shim-signed"}
Contributor

This seems slightly incoherent, and in a way that might possibly even be interesting. grub-pc is only about BIOS boot. If you build an amd64 image and your gadget says the bootloader is grub, do we build an image that can only boot EFI or one that can boot BIOS as well? Ubuntu Core doesn't have this issue because it doesn't boot BIOS ever, but for classic I don't suppose we can get away with that.

Sadly, we probably want to do both BIOS+UEFI and UEFI-only images (he says, desperately hoping there is not a use case for BIOS-only images out there).

For UEFI+BIOS images we'll need to call grub-install twice.

For now, it probably makes sense to build UEFI-only images. In which case you can remove grub-pc here.

Contributor

I think we need to support the UEFI+BIOS hybrid images from the start, sadly. IIRC the default pc classic gadget we always provided as a reference basically was a hybrid one. So not caring from the start I think would regress the behavior of the tool.
For BIOS-only images, I had a chat with Paul earlier and we said we won't explicitly support them just yet, but will wait to see if someone asks for them, since in theory the gadget.yaml allows that.

I'm thinking about what you said about calling grub-install twice, and I think I need to process that for a bit.

Collaborator Author

I understand adding grub-pc and shim-signed will lead to a UEFI+BIOS image, right?

I am thinking we could also decide the "kind" of bootloader conf we build based on the EFI partition number found:

  • If this is 0 (no EFI partition found) then we can/should build a BIOS-only image. In this case we only install grub-pc and we call grub-install without the EFI-specific args (like in a hook of livecd-rootfs)
  • If this is not 0 (an EFI partition was found) then we build a UEFI+BIOS image. In this case we install grub-pc and shim-signed.

That leaves the UEFI-only case, for which I have no solution at the moment. Maybe something in the gadget.yaml could let users express that.
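
The decision rule proposed above could look roughly like the following Go sketch. It is only an illustration of the idea under discussion, not code from the PR: the type `bootMode`, the function `grubSetupFor` and the convention that `efiPartNum == 0` means "no EFI partition found" are assumptions taken from the wording of this comment, and the UEFI-only case is deliberately left out because no way to express it has been agreed on.

```go
package main

import "fmt"

// bootMode names the kind of image the bootloader setup would produce.
type bootMode string

const (
	biosOnly bootMode = "BIOS-only"
	hybrid   bootMode = "UEFI+BIOS"
)

// grubSetupFor mirrors the proposal: an EFI partition number of 0 means no
// EFI partition was found, so only BIOS boot is set up; any other value
// selects the hybrid setup. Names and package lists are illustrative.
func grubSetupFor(efiPartNum int) (bootMode, []string) {
	if efiPartNum == 0 {
		return biosOnly, []string{"grub-pc"}
	}
	return hybrid, []string{"grub-pc", "shim-signed"}
}

func main() {
	for _, n := range []int{0, 1} {
		mode, pkgs := grubSetupFor(n)
		fmt.Printf("EFI partition %d: %s, packages %v\n", n, mode, pkgs)
	}
}
```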

Contributor

UEFI+BIOS from the start > ok. maybe we support only that then?

I understand adding grub-pc and shim-signed will lead to a UEFI+BIOS image, right?

Well, to be pedantic, I'm not sure that having both of those packages installed means that the image will be bootable via both UEFI and BIOS without further steps: the bootloader assets also need to be installed into the ESP / boot sector of the drive. Relatedly, but not the same thing, I don't know that having these packages installed means that upgrading them updates the boot assets in the ESP / boot sector, which rather gets to the heart of the PR. Maybe we can confer with @kukrimate to ensure we have a solid understanding of what is going on -- I certainly find grub's postinst pretty confusing.

Contributor

And yes, I guess the UEFI vs BIOS vs UEFI+BIOS should be communicated via the gadget somehow. But also maybe we only support UEFI+BIOS for now.

case arch.AMD64:
return "x86_64-efi", []string{"grub-pc", "shim-signed"}
case arch.ARM64:
return "arm64-efi", []string{"grub-efi-arm64", "grub-efi-arm64-bin"}
Contributor

We probably just want shim-signed on arm64 too?

Collaborator Author

Do you mean instead of grub-efi-arm64 and grub-efi-arm64-bin? I used the list I found in livecd-rootfs but that is a good occasion to clean things so if shim-signed is enough I will be happy to fix.

Contributor

@sil2100 sil2100 left a comment

I like the general gist of it, and I think most of the code is correct. There's one inline question about no-boot-partition cases handling that I highlighted, open for discussion.
I'm also interested in some of the comments that Michael made. I still need to investigate the 'grub-install twice' suggestion for BIOS+UEFI, since I don't remember what we were doing in, say, livecd-rootfs. Certainly needs investigation.

internal/statemachine/classic_states.go (outdated, resolved)

@sil2100
Contributor

sil2100 commented Sep 27, 2024

Also, Michael's comment about gadget and boot assets is interesting. Certainly something we need to discuss more when we chat about gadgets next week.

@upils
Collaborator Author

upils commented Sep 27, 2024

So this is moving to a model where we do not expect the gadget to contain boot assets for classic builds, is that correct?

I have some other comments but I guess the above question is kinda important for me to understand!

Initially the rework was prompted by a bug some users are facing when trying to update their boot assets. It appears dumping the boot assets at the right place is not enough. Do note though that here we are only dealing with grub. Other bootloaders are handled differently.

In the end I think this rework is indeed a first step toward getting rid of boot assets provided by the gadget in the classic use case. As mentioned by Lukasz this is something we should discuss next week.

This is a first step toward creating a dedicated package for when we want to support more bootloaders and more cases (BIOS-only, UEFI-only)

Signed-off-by: Paul Mars <[email protected]>
…rerequisities

For now we only need the rootfs/bootfs partition numbers to set up grub. Only check these when setting up grub and let other bootloaders run fine.

Signed-off-by: Paul Mars <[email protected]>
@mwhudson
Contributor

OK sorry for the radio silence on this. I think I have two conceptual concerns (haven't reviewed the code again yet):

  1. Do we need to debconf-set "grub-pc grub-{efi,pc}/cloud_style_installation boolean true"? livecd-rootfs does this in disk-image-uefi.binary but I notice it doesn't in disk-image-uefi-non-cloud.binary. grub's postinst confuses me a great deal but I'm not sure that it will actually update grub in one of these images if this is not set.
  2. Do we want to implicitly install the packages required to make the bootloader work? Obviously most (EFI) images we build will have shim-signed and support secure boot, but do we want to support a user who just wants bare grub? And I guess for uboot this will be required so I sort of think that we should expect users to list the boot-related packages in their config
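
For reference, the brace shorthand in point 1 expands to two debconf selections, each in the `owner question type value` layout that debconf-set-selections reads from standard input. The Go sketch below only generates that text. The helper name `cloudStyleSelections` is hypothetical, and whether ubuntu-image should set these values at all is exactly the open question raised here.

```go
package main

import (
	"fmt"
	"strings"
)

// cloudStyleSelections expands
//
//	grub-pc grub-{efi,pc}/cloud_style_installation boolean true
//
// into one line per question, ready to be piped into
// debconf-set-selections. The function name is an assumption made for
// this sketch; it is not part of the PR.
func cloudStyleSelections() string {
	var b strings.Builder
	for _, flavour := range []string{"efi", "pc"} {
		fmt.Fprintf(&b, "grub-pc grub-%s/cloud_style_installation boolean true\n", flavour)
	}
	return b.String()
}

func main() {
	fmt.Print(cloudStyleSelections())
}
```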
