Shim uses wrong TFTP server IP in proxyDHCP mode #165

Open
alkisg opened this issue Jan 28, 2019 · 31 comments

Comments

@alkisg

alkisg commented Jan 28, 2019

This works fine:
UEFI > real DHCP > shimx64.efi over TFTP > grubx64.efi over TFTP.

In the scenario above, if we replace "real" with "proxy", it fails, because shim tries to download grubx64.efi from the real DHCP server instead of the proxy one.

A proxy DHCP server is one that only sends the boot filename and leaves the IP assignments to the real DHCP server. We use that a lot in ltsp.org and in other netbooting projects, as it avoids the need for a special network setup.

I think the problem is in this line:
https://github.com/rhboot/shim/blob/master/netboot.c#L293
memcpy(&tftp_addr.v4, pkt_v4->BootpSiAddr, 4);

There should be an "if proxy ... use that one for tftp ... else use BootpSiAddr" at that point.
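
A rough sketch of the suggested branch, for illustration only. The struct and function names (`cached_offers`, `select_tftp_addr`) are hypothetical stand-ins, not shim's or the firmware's real types. The sketch assumes the PXE stack exposes a flag saying whether a proxy offer was cached, and that the proxy server's address is available separately from BootpSiAddr.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical view of what the PXE stack cached; not shim's real types. */
struct cached_offers {
    bool proxy_offer_received;  /* a proxyDHCP offer was seen */
    uint8_t proxy_server[4];    /* address of the proxyDHCP server */
    uint8_t ack_siaddr[4];      /* siaddr from the real DHCP server's reply */
};

/* Copy the IPv4 address that should be used for TFTP into out[4]. */
void select_tftp_addr(const struct cached_offers *c, uint8_t out[4])
{
    assert(c != NULL && out != NULL);
    if (c->proxy_offer_received)
        memcpy(out, c->proxy_server, 4);  /* if proxy: use that one for tftp */
    else
        memcpy(out, c->ack_siaddr, 4);    /* else: use BootpSiAddr */
}
```

With the addresses from this report, a client that saw the proxy offer would pick the proxy's address rather than the router's.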

Sample dnsmasq.conf for proxy setup, for testing:

enable-tftp
tftp-root=/var/lib/tftpboot
dhcp-range=10.161.254.0,proxy,255.255.255.0
pxe-service=X86-64_EFI,"Boot from network",shimx64.efi

@lcp
Collaborator

lcp commented Jan 31, 2019

I can reproduce the bug. Will see how to fix it.

@lcp
Collaborator

lcp commented Feb 2, 2019

I reproduced the bug with shim 14 and found that it's fixed in git master. I believe the following commit fixes the issue.
5f4fd53

@alkisg
Author

alkisg commented Feb 3, 2019

I tried with version 15+1533136590.3beb971-0ubuntu1 from https://packages.ubuntu.com/disco/shim and the issue is still there.

Is there a more recent binary that I could test with? Thank you!

@alkisg
Author

alkisg commented Feb 3, 2019

I also tried with https://fedora.pkgs.org/29/fedora-x86_64/shim-x64-15-7.x86_64.rpm.html, same issue there as well.

@lcp
Collaborator

lcp commented Feb 6, 2019

I built my own shim binary with b3e4d1f + a simple debug patch to print the IP address of pkt_v4 in parseDhcp4():netboot.c.
My testing environment is one proxy DHCP VM + one client VM while the host is the DHCP server. My own binary printed the proxy DHCP VM IP and loaded grub.efi. I will try to find the related commit later after my vacation.

@cdadmin

cdadmin commented Feb 13, 2019

This is most likely not a Shim issue but a Grub2 problem. The shim has proxy dhcp support but Grub2 does not. I believe it may be in the works for 2.03, but can't say for certain.

@NiKiZe

NiKiZe commented Feb 13, 2019

@cdadmin, the initial issue is that grub isn't even started, because shim tries to download it using the settings from the DHCP server and not those from the proxyDHCP server. As mentioned above, this may have been fixed in 5f4fd53, but either something else has happened since then, or distros are just horribly slow to update.

@alkisg
Author

alkisg commented Feb 13, 2019

Yup, my TFTP server logs prove that shim never tries to load grub from the proxyDHCP server.

I did file an issue about adding proxyDHCP support in Grub though.

apt source shim in Ubuntu 18.04 does show 5f4fd53.

@lcp, when you say "while the host is the DHCP server", maybe that test isn't the same as mine?
It fails for me when the proxyDHCP server is the TFTP server.

@NiKiZe

NiKiZe commented Feb 13, 2019

Please also remember that the TFTP server address should be read from the next-server option, which is not the same as the DHCP server address (even though the two often contain the same data).

@cdadmin

cdadmin commented Feb 13, 2019

@NiKiZe
There isn't really a next-server option; the closest equivalent is siaddr, which RFC 2131 defines as the address of the next server to use. That is what Shim currently checks. In my experience, most DHCP servers handle next-server or option 66 by placing the address in siaddr. However, with option 66 (tftp-server-name), some DHCP servers may not set siaddr at all. That could definitely be a problem and would explain the behavior described above, so good thought on your part.

@alkisg, can you verify whether the proxy is setting siaddr or a different option for the TFTP server?

Furthermore, this gets more complicated because some DHCP servers set siaddr to their own address even when neither next-server nor option 66 is set. That can make proxy support appear to work when TFTP and DHCP run on the same server and the proxy is on another.
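
To make the siaddr discussion concrete, here is a minimal sketch of reading that field from a raw BOOTP/DHCP packet and checking whether it was left empty. The helper names (`read_siaddr`, `siaddr_is_unset`) are made up for this example and are not part of shim or edk2; the offsets follow the fixed header layout in RFC 2131.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fixed BOOTP/DHCP header (RFC 2131): op, htype, hlen, hops (1 byte each),
 * xid (4), secs (2), flags (2), ciaddr (4), yiaddr (4), siaddr (4),
 * giaddr (4), chaddr (16), sname (64), file (128). */
enum {
    BOOTP_SIADDR_OFFSET = 20,
    BOOTP_FIXED_HEADER_LEN = 236,
};

/* Copy siaddr out of a raw packet; returns false if the packet is too short. */
bool read_siaddr(const uint8_t *pkt, size_t len, uint8_t out[4])
{
    assert(pkt != NULL && out != NULL);
    if (len < BOOTP_FIXED_HEADER_LEN)
        return false;
    memcpy(out, pkt + BOOTP_SIADDR_OFFSET, 4);
    return true;
}

/* A siaddr of 0.0.0.0 means the sender did not name a next server. */
bool siaddr_is_unset(const uint8_t siaddr[4])
{
    static const uint8_t zero[4] = {0, 0, 0, 0};
    return memcmp(siaddr, zero, 4) == 0;
}
```

A check like `siaddr_is_unset` is one way a client could tell "no next server given" apart from "next server is this address" before deciding where to fetch the next file.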

@alkisg
Author

alkisg commented Feb 13, 2019

I'm attaching a screenshot of the packet in wireshark.

  • 10.161.254.1 is the router = real DHCP server
  • 10.161.254.11 is my box = proxyDHCP server = TFTP server
  • the client is a VM, where iPXE was loaded and the command dhcp net0 was run. In the iPXE config menu, in the proxydhcp submenu, I get: filename: boot.ipxe (it would be shim if I was testing shim), vendor-class: PXEClient, dhcp-server: 10.161.254.11, and nothing else from the proxydhcp server.

[Screenshot: wireshark]

@cdadmin

cdadmin commented Feb 13, 2019

Thanks. It appears this would not work since next server ip is not set. I think this would require some type of rework to Shim to support a proxy dhcp that does not set next-server. Alternatively, you could use a proxy dhcp server that does set the next server.

@cdadmin

cdadmin commented Feb 13, 2019

By the way which version of dnsmasq are you using?

@cdadmin

cdadmin commented Feb 13, 2019

dnsmasq has another option, I wonder if it would change anything?
dhcp-boot=shimx64.efi,pxeserver,10.161.254.11

@alkisg
Author

alkisg commented Feb 14, 2019

@cdadmin, I already had "dhcp-boot=tag:iPXE,boot.ipxe,10.161.254.11" set in dnsmasq.conf when I took the previous screenshot, but it is not used in proxyDHCP mode; it's only used when dnsmasq functions as a real DHCP server. In proxyDHCP mode only the pxe-service= options are used.
I'm using dnsmasq 2.79-1; it's been working the same way since dnsmasq 2.49 from 10 years ago, and we never had any issues booting 10,000+ clients on different hardware with it, i.e. PXE stacks do know to use option 54 (dhcp-server).

I don't know of any other well-known DHCP server that supports proxyDHCP; which one would you like me to test with?

@alkisg
Author

alkisg commented Feb 14, 2019

Btw, from the pxespec.pdf, page 31:

Boot server IP address (Read from the DHCP option 54 (server identifier), if not found, use the siaddr field.)
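
The rule quoted above could be sketched roughly as follows. This is an illustration, not shim's or edk2's code: `find_dhcp_option` and `boot_server_addr` are hypothetical helpers, and the sketch assumes the caller passes the options area starting just after the 4-byte DHCP magic cookie.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { OPT_PAD = 0, OPT_SERVER_ID = 54, OPT_END = 255 };

/* Find a DHCP option in a code/length/value list (RFC 2132 encoding).
 * Returns a pointer to the option data and stores its length in *opt_len,
 * or returns NULL if the option is absent or the list is truncated. */
const uint8_t *find_dhcp_option(const uint8_t *opts, size_t len,
                                uint8_t code, uint8_t *opt_len)
{
    size_t i = 0;

    assert(opts != NULL && opt_len != NULL);
    while (i < len) {
        uint8_t c = opts[i];

        if (c == OPT_END)
            break;
        if (c == OPT_PAD) {  /* pad is a single byte with no length */
            i++;
            continue;
        }
        if (i + 1 >= len || i + 2 + opts[i + 1] > len)
            return NULL;     /* truncated option */
        if (c == code) {
            *opt_len = opts[i + 1];
            return &opts[i + 2];
        }
        i += 2 + (size_t)opts[i + 1];
    }
    return NULL;
}

/* Boot server address: option 54 if present and 4 bytes long, else siaddr. */
void boot_server_addr(const uint8_t *opts, size_t len,
                      const uint8_t siaddr[4], uint8_t out[4])
{
    uint8_t n = 0;
    const uint8_t *id = find_dhcp_option(opts, len, OPT_SERVER_ID, &n);

    memcpy(out, (id != NULL && n == 4) ? id : siaddr, 4);
}
```

For a proxy offer like the one in the screenshot (option 54 set to 10.161.254.11, no next server), this would yield the proxy's address; for an offer without option 54 it falls back to siaddr.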

@lcp
Collaborator

lcp commented Feb 18, 2019

For reference, here is my testing environment:

HOST [vmbr0] 192.168.110.1 DHCP server
VM1 [tap0] 192.168.110.3 proxyDHCP and tftp server
VM2 [tap1] the client
tap0 and tap1 are bridged with vmbr0.

dnsmasq.conf in VM1:

enable-tftp
tftp-root=/var/lib/tftpboot
dhcp-range=192.168.110.0,proxy,255.255.255.0
pxe-service=X86-64_EFI,"Boot from network",shimx64.efi

The shimx64.efi binary is based on b3e4d1f. I also built another binary after reverting 5f4fd53.

In my test, the original shim loaded grub2 from VM1 and showed the grub2 shell. I also found that the DHCP offer from VM1 contains the next server, i.e. 192.168.110.3.

@alkisg
Author

alkisg commented Feb 18, 2019

@lcp, could you please upload your binary/binaries somewhere for me to test with?

I'll test with secure boot off, so that the signing keys won't matter. Thank you!

@alkisg
Author

alkisg commented Feb 18, 2019

P.S. to be clear, your DHCP server=192.168.110.1 doesn't mention your proxyDHCP server IP=192.168.110.3 anywhere in its configuration, right?

@lcp
Collaborator

lcp commented Feb 18, 2019

Here is my dhcpd.conf in the host:

option routers 192.168.110.1;
ddns-update-style none;
default-lease-time 14400;
subnet 192.168.110.0 netmask 255.255.255.0 {
    range 192.168.110.100 192.168.110.200;
    default-lease-time 14400;
    max-lease-time 172800;
}

So the DHCP server is unlikely to send anything about 192.168.110.3.

I'll build the testing binaries later.

@lcp
Collaborator

lcp commented Feb 18, 2019

I've tested the attached shim binary, and it works for me.
shimx64-b3e4d1f.zip

@cdadmin

cdadmin commented Feb 18, 2019

I typically use Shim with my own implementation of ProxyDHCP, and it has worked since 5f4fd53. I decided to test it with dnsmasq to look for differences, and my results align with @lcp's.

My setup includes:
1 VM Ubuntu w/ Dnsmasq 2.79 as proxy
1 Physical Windows DHCP Server, no boot options specified
1 VM pxe boot test client

I also use a patched version of Grub2 for proxy support.

I tested with Shim14 and Grub2 doesn't load; according to the logs, it is never even requested.
With Shim15, Grub2 loads the boot menu properly.

I confirmed with Wireshark, and can see 2 DHCP offers, one from DHCP and one from Proxy. Only the proxy offer contained the boot info. The strange thing is that my Proxy Offer did include the next server option. So for me, dnsmasq is setting the next-server which explains why it works.

@alkisg
Author

alkisg commented Feb 18, 2019

I tried with shimx64-b3e4d1f.zip and with the aforementioned 4-line dnsmasq.conf (except I also added port=0) and again it didn't work for me.

Tomorrow I'll try with a real client, in case the use of VirtualBox or iPXE somehow interferes.

Would it be possible for me to test some patch or build that can read the proxyDHCP server from option 54?

Thank you for your awesome support!

@lcp
Collaborator

lcp commented Feb 19, 2019

Shim doesn't parse the DHCP packet directly. It relies on the PXE Base Code protocol provided by UEFI firmware.

As you can see here:
https://github.com/rhboot/shim/blob/15/netboot.c#L259-L268
The firmware parses the DHCP offers and caches DhcpAck or ProxyOffer. Shim just reads BootpSiAddr from the cached packets.

Actually, option 54 is parsed in edk2 stable 201811:
https://github.com/tianocore/edk2/blob/edk2-stable201811/NetworkPkg/UefiPxeBcDxe/PxeBcBoot.c#L493-L530

I suspect that your firmware doesn't parse the DHCP offer correctly. Maybe you can try the latest OVMF.

@lcp
Collaborator

lcp commented Feb 19, 2019

Oh! Wait. It seems the result of the option 54 parsing isn't stored. So we have to figure out another way to get option 54 in shim.

@alkisg
Author

alkisg commented Feb 19, 2019

Test results:

  1. My initial test still doesn't work: real UEFI > Ubuntu's shimx64.efi with 5f4fd53 applied.
  2. @lcp's shimx64-b3e4d1f.efi works with real UEFI stack!
  3. The same shimx64-b3e4d1f.efi doesn't work with VirtualBox UEFI > iPXE > shim.

I will try to raise (3) with the iPXE developers as well, but I think that if shim manages to read option 54, it will work with iPXE without any changes there. Also, I don't know whether any real PXE implementations behave the same way as iPXE and would make shim fail too.

Thank you very much all again; if you decide not to work on option 54, you could close this or tell me to close it. Cheers!

@lcp
Collaborator

lcp commented Feb 21, 2019

That sounds strange to me, since there is no real functional change in netboot.c after 5f4fd53, so I don't know why (1) still failed.

Anyway, I'll dig into edk2 more to see how it works.

@Steinliiippp

I came here from https://bugs.launchpad.net/ubuntu/+source/shim/+bug/1813541

This request still seems unresolved. Even the newest Ubuntu releases have no signed grub with a working proxyDHCP implementation. Is there any ongoing work? I could not find any further information. Do you have any updates on this topic?

@xileF1337

xileF1337 commented Apr 27, 2023

This still seems to be an issue, at least for me. Is someone still working on it?

@davispuh

I have the same issue when trying to PXE UEFI boot a Debian installation. I'm using the grubx64.efi provided in the Debian netboot image.

My DHCP server has siaddr / Next server IP address set to 192.168.0.1 (the DHCP server itself), and that's not changeable.
So I'm using proxyDHCP for PXE boot with its own TFTP server at 192.168.0.5, from which the Tiano Core EDK2 UEFI firmware correctly downloads grubx64.efi. But then Grub itself tries to connect over TFTP to 192.168.0.1 instead of 192.168.0.5, which breaks the boot.

@hejin

hejin commented Jan 4, 2025


8 participants