* [PATCH] PCI: hv: Allocate MMIO from above 4GB for the config window
@ 2026-01-22 2:03 Dexuan Cui
2026-01-22 7:10 ` Michael Kelley
0 siblings, 1 reply; 16+ messages in thread
From: Dexuan Cui @ 2026-01-22 2:03 UTC (permalink / raw)
To: kys, haiyangz, wei.liu, decui, longli, lpieralisi, kwilczynski,
mani, robh, bhelgaas, jakeo, linux-hyperv, linux-pci,
linux-kernel
Cc: mhklinux, stable
There has been a longstanding MMIO conflict between the pci_hyperv
driver's config_window (see hv_allocate_config_window()) and the
hyperv_drm (or hyperv_fb) driver (see hyperv_setup_vram()): typically
both get MMIO from the low MMIO range below 4GB. This is not an issue
in the normal kernel, since the VMBus driver reserves the framebuffer
MMIO in vmbus_reserve_fb(), so the drm driver's hyperv_setup_vram() can
always get the reserved framebuffer MMIO. However, a Gen2 VM's kdump
kernel fails to reserve the framebuffer MMIO in vmbus_reserve_fb()
because screen_info.lfb_base is zero in the kdump kernel: screen_info
is not initialized at all there, because the EFI stub code, which
initializes screen_info, doesn't run in the case of kdump.
When vmbus_reserve_fb() fails to reserve the framebuffer MMIO in the
kdump kernel and pci_hyperv loads before hyperv_drm there, pci_hyperv's
vmbus_allocate_mmio() gets the framebuffer MMIO and tries to use it.
Since the host thinks that the MMIO range is still in use by
hyperv_drm, it refuses to accept the range as the config window, and
pci_hyperv's hv_pci_enter_d0() errors out:
"PCI Pass-through VSP failed D0 Entry with status c0370048".
This PCI error in the kdump kernel was not fatal in the past because
the kdump kernel normally doesn't rely on pci_hyperv, and the root
file system is on a VMBus SCSI device.
Now, a VM on Azure can boot from NVMe, i.e. the root FS can be on an
NVMe device, which depends on pci_hyperv. When the PCI error occurs,
the kdump kernel fails to boot up since no root FS is detected.
Fix the MMIO conflict by allocating MMIO above 4GB for the
config_window.
Note: we still need to figure out how to address the possible MMIO
conflict between hyperv_drm and pci_hyperv in the case of 32-bit PCI
MMIO BARs, but that's of low priority because all PCI devices available
to a Linux VM on Azure should use 64-bit BARs and should not use 32-bit
BARs -- I checked Mellanox VFs, MANA VFs, NVMe devices, and GPUs in
Linux VMs on Azure, and found no 32-bit BARs.
Fixes: 4daace0d8ce8 ("PCI: hv: Add paravirtual PCI front-end for Microsoft Hyper-V VMs")
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Cc: stable@vger.kernel.org
---
drivers/pci/controller/pci-hyperv.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/drivers/pci/controller/pci-hyperv.c b/drivers/pci/controller/pci-hyperv.c
index 1e237d3538f9..a6aecb1b5cab 100644
--- a/drivers/pci/controller/pci-hyperv.c
+++ b/drivers/pci/controller/pci-hyperv.c
@@ -3406,9 +3406,13 @@ static int hv_allocate_config_window(struct hv_pcibus_device *hbus)
 
 	/*
 	 * Set up a region of MMIO space to use for accessing configuration
-	 * space.
+	 * space. Use the high MMIO range to not conflict with the hyperv_drm
+	 * driver (which normally gets MMIO from the low MMIO range) in the
+	 * kdump kernel of a Gen2 VM, which fails to reserve the framebuffer
+	 * MMIO range in vmbus_reserve_fb() due to screen_info.lfb_base being
+	 * zero in the kdump kernel.
 	 */
-	ret = vmbus_allocate_mmio(&hbus->mem_config, hbus->hdev, 0, -1,
+	ret = vmbus_allocate_mmio(&hbus->mem_config, hbus->hdev, SZ_4G, -1,
 				  PCI_CONFIG_MMIO_LENGTH, 0x1000, false);
 	if (ret)
 		return ret;
--
2.43.0
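The effect of raising the lower bound from 0 to SZ_4G can be illustrated
with a small userspace model. This is only a sketch under stated
assumptions, not the kernel implementation: toy_allocate_mmio(), struct
mmio_window, struct busy_range, and the first-fit search are all
hypothetical stand-ins, and the real vmbus_allocate_mmio() works on
kernel resource trees that are not reproduced here. Only the lower bound
(0 versus SZ_4G) and the 0x1000 alignment are taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SZ_4G 0x100000000ULL

/* Hypothetical model types: an MMIO window and a reserved (busy) range,
 * both with inclusive bounds. */
struct mmio_window { uint64_t start; uint64_t end; };
struct busy_range  { uint64_t start; uint64_t end; };

/* True if [s, e] intersects any reserved range. */
static bool overlaps(uint64_t s, uint64_t e,
		     const struct busy_range *busy, size_t nbusy)
{
	for (size_t i = 0; i < nbusy; i++)
		if (s <= busy[i].end && e >= busy[i].start)
			return true;
	return false;
}

/*
 * Toy first-fit allocator (an assumption, not the kernel's algorithm):
 * return the lowest aligned address >= min at which 'size' bytes fit in
 * one of the windows without touching a reserved range, or 0 on failure.
 */
static uint64_t toy_allocate_mmio(const struct mmio_window *win, size_t nwin,
				  const struct busy_range *busy, size_t nbusy,
				  uint64_t min, uint64_t size, uint64_t align)
{
	assert(size != 0);
	assert(align != 0 && (align & (align - 1)) == 0); /* power of two */

	for (size_t w = 0; w < nwin; w++) {
		uint64_t lo = win[w].start > min ? win[w].start : min;
		uint64_t start = (lo + align - 1) & ~(align - 1);

		/* 'start >= lo' guards against wrap-around. */
		for (; start >= lo && start <= win[w].end &&
		       size - 1 <= win[w].end - start; start += align) {
			if (!overlaps(start, start + size - 1, busy, nbusy))
				return start;
		}
	}
	return 0;
}
```

With a hypothetical low window at 0xF8000000-0xFFFFFFFF, a high window
starting at 0x1000000000, and an 8MB framebuffer at 0xF8000000, the model
behaves as the commit message describes. If the framebuffer is reserved
and the lower bound is 0, a 0x2000-byte request lands at 0xF8800000, just
past the framebuffer. If the reservation is missing (the kdump case), the
same request returns 0xF8000000, i.e. the framebuffer itself. With the
lower bound set to SZ_4G, the request skips the low window entirely and
returns 0x1000000000, whether or not the framebuffer was reserved.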
^ permalink raw reply related [flat|nested] 16+ messages in thread
* RE: [PATCH] PCI: hv: Allocate MMIO from above 4GB for the config window
2026-01-22 2:03 [PATCH] PCI: hv: Allocate MMIO from above 4GB for the config window Dexuan Cui
@ 2026-01-22 7:10 ` Michael Kelley
2026-01-22 19:14 ` Long Li
2026-04-02 17:09 ` Dexuan Cui
0 siblings, 2 replies; 16+ messages in thread
From: Michael Kelley @ 2026-01-22 7:10 UTC (permalink / raw)
To: Dexuan Cui, kys@microsoft.com, haiyangz@microsoft.com,
wei.liu@kernel.org, longli@microsoft.com, lpieralisi@kernel.org,
kwilczynski@kernel.org, mani@kernel.org, robh@kernel.org,
bhelgaas@google.com, jakeo@microsoft.com,
linux-hyperv@vger.kernel.org, linux-pci@vger.kernel.org,
linux-kernel@vger.kernel.org
Cc: stable@vger.kernel.org

From: Dexuan Cui <decui@microsoft.com> Sent: Wednesday, January 21, 2026 6:04 PM
>
> There has been a longstanding MMIO conflict between the pci_hyperv
> driver's config_window (see hv_allocate_config_window()) and the
> hyperv_drm (or hyperv_fb) driver (see hyperv_setup_vram()): typically
> both get MMIO from the low MMIO range below 4GB; this is not an issue
> in the normal kernel since the VMBus driver reserves the framebuffer
> MMIO in vmbus_reserve_fb(), so the drm driver's hyperv_setup_vram() can
> always get the reserved framebuffer MMIO; however, a Gen2 VM's kdump
> kernel fails to reserve the framebuffer MMIO in vmbus_reserve_fb() because
> the screen_info.lfb_base is zero in the kdump kernel: the screen_info
> is not initialized at all in the kdump kernel, because the EFI stub
> code, which initializes screen_info, doesn't run in the case of kdump.

I don't think this is correct. Yes, the EFI stub doesn't run, but
screen_info should be initialized in the kdump kernel by the code that
loads the kdump kernel into the reserved crash memory. See discussion in
the commit message for commit 304386373007.

I wonder if commit a41e0ab394e4 broke the initialization of screen_info
in the kdump kernel. Or perhaps there is now a rev-lock between the
kernel with this commit and a new version of the user space kexec
command.

There's a parameter to the kexec() command that governs whether it uses
the kexec_file_load() system call or the kexec_load() system call.
I wonder if that parameter makes a difference in the problem described
for this patch.

I can't immediately remember if, when I was working on commit
304386373007, I tested kdump in a Gen 2 VM with an NVMe OS disk to
ensure that MMIO space was properly allocated to the frame buffer driver
(either hyperv_fb or hyperv_drm). I'm thinking I did, but tomorrow I'll
check for any definitive notes on that.

Michael

>
> When vmbus_reserve_fb() fails to reserve the framebuffer MMIO in the
> kdump kernel, if pci_hyperv in the kdump kernel loads before hyperv_drm
> loads, pci_hyperv's vmbus_allocate_mmio() gets the framebuffer MMIO
> and tries to use it, but since the host thinks that the MMIO range is
> still in use by hyperv_drm, the host refuses to accept the MMIO range
> as the config window, and pci_hyperv's hv_pci_enter_d0() errors out:
> "PCI Pass-through VSP failed D0 Entry with status c0370048".
>
> This PCI error in the kdump kernel was not fatal in the past because
> the kdump kernel normally doesn't reply on pci_hyperv, and the root
> file system is on a VMBus SCSI device.
>
> Now, a VM on Azure can boot from NVMe, i.e. the root FS can be on a
> NVMe device, which depends on pci_hyperv. When the PCI error occurs,
> the kdump kernel fails to boot up since no root FS is detected.
>
> Fix the MMIO conflict by allocating MMIO above 4GB for the
> config_window.
>
> Note: we still need to figure out how to address the possible MMIO
> conflict between hyperv_drm and pci_hyperv in the case of 32-bit PCI
> MMIO BARs, but that's of low priority because all PCI devices available
> to a Linux VM on Azure should use 64-bit BARs and should not use 32-bit
> BARs -- I checked Mellanox VFs, MANA VFs, NVMe devices, and GPUs in
> Linux VMs on Azure, and found no 32-bit BARs.
>
> Fixes: 4daace0d8ce8 ("PCI: hv: Add paravirtual PCI front-end for Microsoft Hyper-V VMs")
> Signed-off-by: Dexuan Cui <decui@microsoft.com>
> Cc: stable@vger.kernel.org
> ---
> drivers/pci/controller/pci-hyperv.c | 8 ++++++--
> 1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/pci/controller/pci-hyperv.c b/drivers/pci/controller/pci-hyperv.c
> index 1e237d3538f9..a6aecb1b5cab 100644
> --- a/drivers/pci/controller/pci-hyperv.c
> +++ b/drivers/pci/controller/pci-hyperv.c
> @@ -3406,9 +3406,13 @@ static int hv_allocate_config_window(struct
> hv_pcibus_device *hbus)
>
> /*
> * Set up a region of MMIO space to use for accessing configuration
> - * space.
> + * space. Use the high MMIO range to not conflict with the hyperv_drm
> + * driver (which normally gets MMIO from the low MMIO range) in the
> + * kdump kernel of a Gen2 VM, which fails to reserve the framebuffer
> + * MMIO range in vmbus_reserve_fb() due to screen_info.lfb_base being
> + * zero in the kdump kernel.
> */
> - ret = vmbus_allocate_mmio(&hbus->mem_config, hbus->hdev, 0, -1,
> + ret = vmbus_allocate_mmio(&hbus->mem_config, hbus->hdev, SZ_4G, -1,
> PCI_CONFIG_MMIO_LENGTH, 0x1000, false);
> if (ret)
> return ret;
> --
> 2.43.0
^ permalink raw reply [flat|nested] 16+ messages in thread
* RE: [PATCH] PCI: hv: Allocate MMIO from above 4GB for the config window 2026-01-22 7:10 ` Michael Kelley @ 2026-01-22 19:14 ` Long Li 2026-01-22 20:22 ` Michael Kelley 2026-04-02 17:09 ` Dexuan Cui 1 sibling, 1 reply; 16+ messages in thread From: Long Li @ 2026-01-22 19:14 UTC (permalink / raw) To: Michael Kelley, Dexuan Cui, KY Srinivasan, Haiyang Zhang, wei.liu@kernel.org, lpieralisi@kernel.org, kwilczynski@kernel.org, mani@kernel.org, robh@kernel.org, bhelgaas@google.com, Jake Oshins, linux-hyperv@vger.kernel.org, linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org Cc: stable@vger.kernel.org > From: Dexuan Cui <decui@microsoft.com> Sent: Wednesday, January 21, 2026 > 6:04 PM > > > > There has been a longstanding MMIO conflict between the pci_hyperv > > driver's config_window (see hv_allocate_config_window()) and the > > hyperv_drm (or hyperv_fb) driver (see hyperv_setup_vram()): typically > > both get MMIO from the low MMIO range below 4GB; this is not an issue > > in the normal kernel since the VMBus driver reserves the framebuffer > > MMIO in vmbus_reserve_fb(), so the drm driver's hyperv_setup_vram() > > can always get the reserved framebuffer MMIO; however, a Gen2 VM's > > kdump kernel fails to reserve the framebuffer MMIO in > > vmbus_reserve_fb() because the screen_info.lfb_base is zero in the > > kdump kernel: the screen_info is not initialized at all in the kdump > > kernel, because the EFI stub code, which initializes screen_info, doesn't run in > the case of kdump. > > I don't think this is correct. Yes, the EFI stub doesn't run, but screen_info should > be initialized in the kdump kernel by the code that loads the kdump kernel into > the reserved crash memory. See discussion in the commit message for commit > 304386373007. On AMD64 the screen_info is passed through kexec system call. But this is not the case for ARM64, it relies on EFI to get screen_info. However, Hyper-v guarantees the framebuffer MMIO is below 4GB. 
So, the patch works by allocating PCI MMIO separately from that of the framebuffer. Long > > I wonder if commit a41e0ab394e4 broke the initialization of screen_info in the > kdump kernel. Or perhaps there is now a rev-lock between the kernel with this > commit and a new version of the user space kexec command. > > There's a parameter to the kexec() command that governs whether it uses the > kexec_file_load() system call or the kexec_load() system call. > I wonder if that parameter makes a difference in the problem described for this > patch. > > I can't immediately remember if, when I was working on commit 304386373007, I > tested kdump in a Gen 2 VM with an NVMe OS disk to ensure that MMIO space > was properly allocated to the frame buffer driver (either hyperv_fb or > hyperv_drm). I'm thinking I did, but tomorrow I'll check for any definitive notes on > that. > > Michael > > > > > When vmbus_reserve_fb() fails to reserve the framebuffer MMIO in the > > kdump kernel, if pci_hyperv in the kdump kernel loads before > > hyperv_drm loads, pci_hyperv's vmbus_allocate_mmio() gets the > > framebuffer MMIO and tries to use it, but since the host thinks that > > the MMIO range is still in use by hyperv_drm, the host refuses to > > accept the MMIO range as the config window, and pci_hyperv's > hv_pci_enter_d0() errors out: > > "PCI Pass-through VSP failed D0 Entry with status c0370048". > > > > This PCI error in the kdump kernel was not fatal in the past because > > the kdump kernel normally doesn't reply on pci_hyperv, and the root > > file system is on a VMBus SCSI device. > > > > Now, a VM on Azure can boot from NVMe, i.e. the root FS can be on a > > NVMe device, which depends on pci_hyperv. When the PCI error occurs, > > the kdump kernel fails to boot up since no root FS is detected. > > > > Fix the MMIO conflict by allocating MMIO above 4GB for the > > config_window. 
> > > > Note: we still need to figure out how to address the possible MMIO > > conflict between hyperv_drm and pci_hyperv in the case of 32-bit PCI > > MMIO BARs, but that's of low priority because all PCI devices > > available to a Linux VM on Azure should use 64-bit BARs and should not > > use 32-bit BARs -- I checked Mellanox VFs, MANA VFs, NVMe devices, and > > GPUs in Linux VMs on Azure, and found no 32-bit BARs. > > > > Fixes: 4daace0d8ce8 ("PCI: hv: Add paravirtual PCI front-end for > > Microsoft Hyper-V VMs") > > Signed-off-by: Dexuan Cui <decui@microsoft.com> > > Cc: stable@vger.kernel.org > > --- > > drivers/pci/controller/pci-hyperv.c | 8 ++++++-- > > 1 file changed, 6 insertions(+), 2 deletions(-) > > > > diff --git a/drivers/pci/controller/pci-hyperv.c > > b/drivers/pci/controller/pci-hyperv.c > > index 1e237d3538f9..a6aecb1b5cab 100644 > > --- a/drivers/pci/controller/pci-hyperv.c > > +++ b/drivers/pci/controller/pci-hyperv.c > > @@ -3406,9 +3406,13 @@ static int hv_allocate_config_window(struct > > hv_pcibus_device *hbus) > > > > /* > > * Set up a region of MMIO space to use for accessing configuration > > - * space. > > + * space. Use the high MMIO range to not conflict with the hyperv_drm > > + * driver (which normally gets MMIO from the low MMIO range) in the > > + * kdump kernel of a Gen2 VM, which fails to reserve the framebuffer > > + * MMIO range in vmbus_reserve_fb() due to screen_info.lfb_base being > > + * zero in the kdump kernel. > > */ > > - ret = vmbus_allocate_mmio(&hbus->mem_config, hbus->hdev, 0, -1, > > + ret = vmbus_allocate_mmio(&hbus->mem_config, hbus->hdev, SZ_4G, -1, > > PCI_CONFIG_MMIO_LENGTH, 0x1000, false); > > if (ret) > > return ret; > > -- > > 2.43.0 ^ permalink raw reply [flat|nested] 16+ messages in thread
* RE: [PATCH] PCI: hv: Allocate MMIO from above 4GB for the config window 2026-01-22 19:14 ` Long Li @ 2026-01-22 20:22 ` Michael Kelley 2026-01-23 5:39 ` Matthew Ruffell 0 siblings, 1 reply; 16+ messages in thread From: Michael Kelley @ 2026-01-22 20:22 UTC (permalink / raw) To: Long Li, Dexuan Cui, KY Srinivasan, Haiyang Zhang, wei.liu@kernel.org, lpieralisi@kernel.org, kwilczynski@kernel.org, mani@kernel.org, robh@kernel.org, bhelgaas@google.com, Jake Oshins, linux-hyperv@vger.kernel.org, linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org Cc: stable@vger.kernel.org From: Long Li <longli@microsoft.com> Sent: Thursday, January 22, 2026 11:14 AM > > > From: Dexuan Cui <decui@microsoft.com> Sent: Wednesday, January 21, 2026 6:04 PM > > > > > > There has been a longstanding MMIO conflict between the pci_hyperv > > > driver's config_window (see hv_allocate_config_window()) and the > > > hyperv_drm (or hyperv_fb) driver (see hyperv_setup_vram()): typically > > > both get MMIO from the low MMIO range below 4GB; this is not an issue > > > in the normal kernel since the VMBus driver reserves the framebuffer > > > MMIO in vmbus_reserve_fb(), so the drm driver's hyperv_setup_vram() > > > can always get the reserved framebuffer MMIO; however, a Gen2 VM's > > > kdump kernel fails to reserve the framebuffer MMIO in > > > vmbus_reserve_fb() because the screen_info.lfb_base is zero in the > > > kdump kernel: the screen_info is not initialized at all in the kdump > > > kernel, because the EFI stub code, which initializes screen_info, doesn't run in > > the case of kdump. > > > > I don't think this is correct. Yes, the EFI stub doesn't run, but screen_info should > > be initialized in the kdump kernel by the code that loads the kdump kernel into > > the reserved crash memory. See discussion in the commit message for commit > > 304386373007. > > On AMD64 the screen_info is passed through kexec system call. 
But this is not the case > for ARM64, it relies on EFI to get screen_info. Hmmm. So does the problem described here only happen on arm64? If so, that might be worth noting in the commit message. I found my notes from working on commit 304386373007. I don't remember testing on arm64, and my notes don't mention it. So I'm wondering if the problem fixed by that commit could happen on arm64. That's potentially a separate issue from this one. I'll do some experiments to verify. > > However, Hyper-v guarantees the framebuffer MMIO is below 4GB. So, the patch works > by allocating PCI MMIO separately from that of the framebuffer. Yes, that seems right. But there's still something nagging at me about this, though I can't immediately identify a problem. I'll follow up if something comes to me. :-) Michael > > Long > > > > > I wonder if commit a41e0ab394e4 broke the initialization of screen_info in the > > kdump kernel. Or perhaps there is now a rev-lock between the kernel with this > > commit and a new version of the user space kexec command. > > > > There's a parameter to the kexec() command that governs whether it uses the > > kexec_file_load() system call or the kexec_load() system call. > > I wonder if that parameter makes a difference in the problem described for this > > patch. > > > > I can't immediately remember if, when I was working on commit 304386373007, I > > tested kdump in a Gen 2 VM with an NVMe OS disk to ensure that MMIO space > > was properly allocated to the frame buffer driver (either hyperv_fb or > > hyperv_drm). I'm thinking I did, but tomorrow I'll check for any definitive notes on > > that. 
> > > > Michael > > > > > > > > When vmbus_reserve_fb() fails to reserve the framebuffer MMIO in the > > > kdump kernel, if pci_hyperv in the kdump kernel loads before > > > hyperv_drm loads, pci_hyperv's vmbus_allocate_mmio() gets the > > > framebuffer MMIO and tries to use it, but since the host thinks that > > > the MMIO range is still in use by hyperv_drm, the host refuses to > > > accept the MMIO range as the config window, and pci_hyperv's > > hv_pci_enter_d0() errors out: > > > "PCI Pass-through VSP failed D0 Entry with status c0370048". > > > > > > This PCI error in the kdump kernel was not fatal in the past because > > > the kdump kernel normally doesn't reply on pci_hyperv, and the root > > > file system is on a VMBus SCSI device. > > > > > > Now, a VM on Azure can boot from NVMe, i.e. the root FS can be on a > > > NVMe device, which depends on pci_hyperv. When the PCI error occurs, > > > the kdump kernel fails to boot up since no root FS is detected. > > > > > > Fix the MMIO conflict by allocating MMIO above 4GB for the > > > config_window. > > > > > > Note: we still need to figure out how to address the possible MMIO > > > conflict between hyperv_drm and pci_hyperv in the case of 32-bit PCI > > > MMIO BARs, but that's of low priority because all PCI devices > > > available to a Linux VM on Azure should use 64-bit BARs and should not > > > use 32-bit BARs -- I checked Mellanox VFs, MANA VFs, NVMe devices, and > > > GPUs in Linux VMs on Azure, and found no 32-bit BARs. 
> > > > > > Fixes: 4daace0d8ce8 ("PCI: hv: Add paravirtual PCI front-end for > > > Microsoft Hyper-V VMs") > > > Signed-off-by: Dexuan Cui <decui@microsoft.com> > > > Cc: stable@vger.kernel.org > > > --- > > > drivers/pci/controller/pci-hyperv.c | 8 ++++++-- > > > 1 file changed, 6 insertions(+), 2 deletions(-) > > > > > > diff --git a/drivers/pci/controller/pci-hyperv.c > > > b/drivers/pci/controller/pci-hyperv.c > > > index 1e237d3538f9..a6aecb1b5cab 100644 > > > --- a/drivers/pci/controller/pci-hyperv.c > > > +++ b/drivers/pci/controller/pci-hyperv.c > > > @@ -3406,9 +3406,13 @@ static int hv_allocate_config_window(struct > > > hv_pcibus_device *hbus) > > > > > > /* > > > * Set up a region of MMIO space to use for accessing configuration > > > - * space. > > > + * space. Use the high MMIO range to not conflict with the hyperv_drm > > > + * driver (which normally gets MMIO from the low MMIO range) in the > > > + * kdump kernel of a Gen2 VM, which fails to reserve the framebuffer > > > + * MMIO range in vmbus_reserve_fb() due to screen_info.lfb_base being > > > + * zero in the kdump kernel. > > > */ > > > - ret = vmbus_allocate_mmio(&hbus->mem_config, hbus->hdev, 0, -1, > > > + ret = vmbus_allocate_mmio(&hbus->mem_config, hbus->hdev, SZ_4G, -1, > > > PCI_CONFIG_MMIO_LENGTH, 0x1000, false); > > > if (ret) > > > return ret; > > > -- > > > 2.43.0 ^ permalink raw reply [flat|nested] 16+ messages in thread
* RE: [PATCH] PCI: hv: Allocate MMIO from above 4GB for the config window
2026-01-22 20:22 ` Michael Kelley
@ 2026-01-23 5:39 ` Matthew Ruffell
2026-01-23 6:39 ` Michael Kelley
0 siblings, 1 reply; 16+ messages in thread
From: Matthew Ruffell @ 2026-01-23 5:39 UTC (permalink / raw)
To: mhklinux
Cc: DECUI, bhelgaas, haiyangz, jakeo, kwilczynski, kys, linux-hyperv,
linux-kernel, linux-pci, longli, lpieralisi, mani, robh, stable,
wei.liu

Hi Michael,

> > I wonder if commit a41e0ab394e4 broke the initialization of screen_info in the
> > kdump kernel. Or perhaps there is now a rev-lock between the kernel with this
> > commit and a new version of the user space kexec command.

a41e0ab394e4 isn't a mainline commit. Can you please mention the commit subject
so I can have a read.

> > There's a parameter to the kexec() command that governs whether it uses the
> > kexec_file_load() system call or the kexec_load() system call.
> > I wonder if that parameter makes a difference in the problem described for this
> > patch.

Yes, it does indeed make a difference. I have been debugging this the past few
days, and my colleague Melissa noticed that the problem reproduces when secure
boot is disabled, but it does not reproduce when secure boot is enabled.
Additionally, it reproduces on jammy, but not noble. It turns out that
kexec-tools on jammy defaults to kexec_load() when secure boot is disabled,
and when enabled, it instead uses kexec_file_load(). On noble, it defaults to
first trying kexec_file_load() before falling back to kexec_load(), so the
issue does not reproduce.

> > > /*
> > > * Set up a region of MMIO space to use for accessing configuration
> > > - * space.
> > > + * space. Use the high MMIO range to not conflict with the hyperv_drm
> > > + * driver (which normally gets MMIO from the low MMIO range) in the
> > > + * kdump kernel of a Gen2 VM, which fails to reserve the framebuffer
> > > + * MMIO range in vmbus_reserve_fb() due to screen_info.lfb_base being
> > > + * zero in the kdump kernel.
> > > */
> > > - ret = vmbus_allocate_mmio(&hbus->mem_config, hbus->hdev, 0, -1,
> > > + ret = vmbus_allocate_mmio(&hbus->mem_config, hbus->hdev, SZ_4G, -1,
> > > PCI_CONFIG_MMIO_LENGTH, 0x1000, false);
> > > if (ret)
> > > return ret;
> > > --

Thank you for the patch Dexuan.

This patch fixes the problem on Ubuntu 5.15, and 6.8 based kernels
booting V6 instance types on Azure with Gen 2 images.

Tested-by: Matthew Ruffell <matthew.ruffell@canonical.com>

Thanks,
Matthew
^ permalink raw reply [flat|nested] 16+ messages in thread
* RE: [PATCH] PCI: hv: Allocate MMIO from above 4GB for the config window 2026-01-23 5:39 ` Matthew Ruffell @ 2026-01-23 6:39 ` Michael Kelley 2026-01-23 18:28 ` Michael Kelley ` (2 more replies) 0 siblings, 3 replies; 16+ messages in thread From: Michael Kelley @ 2026-01-23 6:39 UTC (permalink / raw) To: Matthew Ruffell Cc: DECUI@microsoft.com, bhelgaas@google.com, haiyangz@microsoft.com, jakeo@microsoft.com, kwilczynski@kernel.org, kys@microsoft.com, linux-hyperv@vger.kernel.org, linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org, longli@microsoft.com, lpieralisi@kernel.org, mani@kernel.org, robh@kernel.org, stable@vger.kernel.org, wei.liu@kernel.org From: Matthew Ruffell <matthew.ruffell@canonical.com> Sent: Thursday, January 22, 2026 9:39 PM > > Hi Michael, > > > > I wonder if commit a41e0ab394e4 broke the initialization of screen_info in the > > > kdump kernel. Or perhaps there is now a rev-lock between the kernel with this > > > commit and a new version of the user space kexec command. > > a41e0ab394e4 isn't a mainline commit. Can you please mention the commit subject > so I can have a read. It's this patch: https://lore.kernel.org/lkml/20251126160854.553077-5-tzimmermann@suse.de/ which is in linux-next, but not yet in mainline. Since you are dealing with older kernels, it's not the culprit. > > > > There's a parameter to the kexec() command that governs whether it uses the > > > kexec_file_load() system call or the kexec_load() system call. > > > I wonder if that parameter makes a difference in the problem described for this > > > patch. > > Yes, it does indeed make a difference. I have been debugging this the past few > days, and my colleague Melissa noticed that the problem reproduces when secure > boot is disabled, but it does not reproduce when secure boot is enabled. > Additionally, it reproduces on jammy, but not noble. 
It turns out that > kexec-tools on jammy defaults to kexec_load() when secure boot is disabled, > and when enabled, it instead uses kexec_file_load(). On noble, it defaults to > first trying kexec_file_load() before falling back to kexec_load(), so the > issue does not reproduce. This is good info, and definitely a clue. So to be clear, the problem repros only when kexec_load() is used. With kexec_file_load(), it does not repro. Is that right? I saw a similar distinction when working on commit 304386373007, though in the opposite direction! > > > > > /* > > > > * Set up a region of MMIO space to use for accessing configuration > > > > - * space. > > > > + * space. Use the high MMIO range to not conflict with the hyperv_drm > > > > + * driver (which normally gets MMIO from the low MMIO range) in the > > > > + * kdump kernel of a Gen2 VM, which fails to reserve the framebuffer > > > > + * MMIO range in vmbus_reserve_fb() due to screen_info.lfb_base being > > > > + * zero in the kdump kernel. > > > > */ > > > > - ret = vmbus_allocate_mmio(&hbus->mem_config, hbus->hdev, 0, -1, > > > > + ret = vmbus_allocate_mmio(&hbus->mem_config, hbus->hdev, SZ_4G, -1, > > > > PCI_CONFIG_MMIO_LENGTH, 0x1000, false); > > > > if (ret) > > > > return ret; > > > > -- > > Thank you for the patch Dexuan. > > This patch fixes the problem on Ubuntu 5.15, and 6.8 based kernels > booting V6 instance types on Azure with Gen 2 images. Are you seeing the problem on x86/64 or arm64 instances in Azure? "V6 instance types" could be either, I think, but I'm guessing you are on x86/64. And just to confirm: are you seeing the problem with the Hyper-V DRM driver, or the Hyper-V FB driver? This patch mentions the DRM driver, so I assume that's the problematic config. > > Tested-by: Matthew Ruffell <matthew.ruffell@canonical.com> While this patch may solve the observed problem, I'm interested in understanding the root cause of why vmbus_reserve_fb() is seeing screen_info.lfb_base set to zero. 
It may be next week before I can take a look, and I may need follow up with you on more details of the scenario to reproduce the problem. Michael ^ permalink raw reply [flat|nested] 16+ messages in thread
* RE: [PATCH] PCI: hv: Allocate MMIO from above 4GB for the config window 2026-01-23 6:39 ` Michael Kelley @ 2026-01-23 18:28 ` Michael Kelley 2026-01-23 20:21 ` Dexuan Cui 2026-04-02 19:23 ` Dexuan Cui 2026-02-07 1:42 ` Krister Johansen 2026-04-02 18:49 ` Dexuan Cui 2 siblings, 2 replies; 16+ messages in thread From: Michael Kelley @ 2026-01-23 18:28 UTC (permalink / raw) To: Michael Kelley, Matthew Ruffell Cc: DECUI@microsoft.com, bhelgaas@google.com, haiyangz@microsoft.com, jakeo@microsoft.com, kwilczynski@kernel.org, kys@microsoft.com, linux-hyperv@vger.kernel.org, linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org, longli@microsoft.com, lpieralisi@kernel.org, mani@kernel.org, robh@kernel.org, stable@vger.kernel.org, wei.liu@kernel.org From: Michael Kelley <mhklinux@outlook.com> Sent: Thursday, January 22, 2026 10:39 PM > > From: Matthew Ruffell <matthew.ruffell@canonical.com> Sent: Thursday, January 22, 2026 9:39 PM > > > > Hi Michael, > > > > > > I wonder if commit a41e0ab394e4 broke the initialization of screen_info in the > > > > kdump kernel. Or perhaps there is now a rev-lock between the kernel with this > > > > commit and a new version of the user space kexec command. > > > > a41e0ab394e4 isn't a mainline commit. Can you please mention the commit subject > > so I can have a read. > > It's this patch: > > https://lore.kernel.org/lkml/20251126160854.553077-5-tzimmermann@suse.de/ > > which is in linux-next, but not yet in mainline. Since you are dealing with older > kernels, it's not the culprit. > > > > > > > There's a parameter to the kexec() command that governs whether it uses the > > > > kexec_file_load() system call or the kexec_load() system call. > > > > I wonder if that parameter makes a difference in the problem described for this > > > > patch. > > > > Yes, it does indeed make a difference. 
I have been debugging this the past few > > days, and my colleague Melissa noticed that the problem reproduces when secure > > boot is disabled, but it does not reproduce when secure boot is enabled. > > Additionally, it reproduces on jammy, but not noble. It turns out that > > kexec-tools on jammy defaults to kexec_load() when secure boot is disabled, > > and when enabled, it instead uses kexec_file_load(). On noble, it defaults to > > first trying kexec_file_load() before falling back to kexec_load(), so the > > issue does not reproduce. > > This is good info, and definitely a clue. So to be clear, the problem repros > only when kexec_load() is used. With kexec_file_load(), it does not repro. Is that > right? I saw a similar distinction when working on commit 304386373007, > though in the opposite direction! > > > > > > > > /* > > > > > * Set up a region of MMIO space to use for accessing configuration > > > > > - * space. > > > > > + * space. Use the high MMIO range to not conflict with the hyperv_drm > > > > > + * driver (which normally gets MMIO from the low MMIO range) in the > > > > > + * kdump kernel of a Gen2 VM, which fails to reserve the framebuffer > > > > > + * MMIO range in vmbus_reserve_fb() due to screen_info.lfb_base being > > > > > + * zero in the kdump kernel. > > > > > */ > > > > > - ret = vmbus_allocate_mmio(&hbus->mem_config, hbus->hdev, 0, -1, > > > > > + ret = vmbus_allocate_mmio(&hbus->mem_config, hbus->hdev, SZ_4G, -1, > > > > > PCI_CONFIG_MMIO_LENGTH, 0x1000, false); > > > > > if (ret) > > > > > return ret; > > > > > -- > > > > Thank you for the patch Dexuan. > > > > This patch fixes the problem on Ubuntu 5.15, and 6.8 based kernels > > booting V6 instance types on Azure with Gen 2 images. > > Are you seeing the problem on x86/64 or arm64 instances in Azure? > "V6 instance types" could be either, I think, but I'm guessing you > are on x86/64. 
> > And just to confirm: are you seeing the problem with the > Hyper-V DRM driver, or the Hyper-V FB driver? This patch mentions > the DRM driver, so I assume that's the problematic config. > > > > > Tested-by: Matthew Ruffell <matthew.ruffell@canonical.com> > > While this patch may solve the observed problem, I'm interested in > understanding the root cause of why vmbus_reserve_fb() is seeing > screen_info.lfb_base set to zero. It may be next week before I can > take a look, and I may need follow up with you on more details of the > scenario to reproduce the problem. One more thought here: Is commit 96959283a58d relevant? The commit message describes a scenario where vmbus_reserve_fb() doesn't do anything because CONFIG_SYSFB is not set. Looking at the code for vmbus_reserve_fb(), it doing nothing might imply that screen_info.lfb_base is 0. But when CONFIG_SYSFB is not set, screen_info.lfb_base is just ignored, with the same result. This behavior started with the 6.7 kernel due to commit a07b50d80ab6. Note that commit 96959283a58d has a follow-on to correct a problem when CONFIG_EFI is not set. See commit 7b89a44b2e8c. If there's a reason to backport 96959283a58d, also get 7b89a44b2e8c. Michael ^ permalink raw reply [flat|nested] 16+ messages in thread
* RE: [PATCH] PCI: hv: Allocate MMIO from above 4GB for the config window 2026-01-23 18:28 ` Michael Kelley @ 2026-01-23 20:21 ` Dexuan Cui 2026-04-02 19:23 ` Dexuan Cui 1 sibling, 0 replies; 16+ messages in thread From: Dexuan Cui @ 2026-01-23 20:21 UTC (permalink / raw) To: Michael Kelley, Matthew Ruffell Cc: bhelgaas@google.com, Haiyang Zhang, Jake Oshins, kwilczynski@kernel.org, KY Srinivasan, linux-hyperv@vger.kernel.org, linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org, Long Li, lpieralisi@kernel.org, mani@kernel.org, robh@kernel.org, stable@vger.kernel.org, wei.liu@kernel.org Thank you for all the good input! I'll do more research and report back. ^ permalink raw reply [flat|nested] 16+ messages in thread
* RE: [PATCH] PCI: hv: Allocate MMIO from above 4GB for the config window 2026-01-23 18:28 ` Michael Kelley 2026-01-23 20:21 ` Dexuan Cui @ 2026-04-02 19:23 ` Dexuan Cui 2026-04-05 23:13 ` Michael Kelley 1 sibling, 1 reply; 16+ messages in thread From: Dexuan Cui @ 2026-04-02 19:23 UTC (permalink / raw) To: Michael Kelley, Matthew Ruffell Cc: bhelgaas@google.com, Haiyang Zhang, Jake Oshins, kwilczynski@kernel.org, KY Srinivasan, linux-hyperv@vger.kernel.org, linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org, Long Li, lpieralisi@kernel.org, mani@kernel.org, robh@kernel.org, stable@vger.kernel.org, wei.liu@kernel.org > From: Michael Kelley <mhklinux@outlook.com> > Sent: Friday, January 23, 2026 10:28 AM > ... > One more thought here: Is commit 96959283a58d relevant? The > commit message describes a scenario where vmbus_reserve_fb() > doesn't do anything because CONFIG_SYSFB is not set. Looking at > the code for vmbus_reserve_fb(), it doing nothing might imply that > screen_info.lfb_base is 0. But when CONFIG_SYSFB is not set, > screen_info.lfb_base is just ignored, with the same result. This behavior > started with the 6.7 kernel due to commit a07b50d80ab6. > > Note that commit 96959283a58d has a follow-on to correct a > problem when CONFIG_EFI is not set. See commit 7b89a44b2e8c. > If there's a reason to backport 96959283a58d, also get > 7b89a44b2e8c. > > Michael In my opinion, 96959283a58d ("Drivers: hv: Always select CONFIG_SYSFB for Hyper-V guests") is not a good fix for a07b50d80ab6: the commit message of a07b50d80ab6 says "the vmbus_drv code marks the original EFI framebuffer as reserved, but this is not required if there is no sysfb" -- IMO the message is incorrect. Even if CONFIG_SYSFB is not set, we still need to reserve the framebuffer MMIO range, because we need to make sure that hv_pci doesn't allocate MMIO from there. 
96959283a58d adds "select SYSFB if !HYPERV_VTL_MODE", but we can still manually unset CONFIG_SYSFB (I happened to do this when debugging the kdump issue), and hv_pci won't work. IMO vmbus_reserve_fb() should unconditionally reserve the frame buffer MMIO range. I'll post a patch like this: --- a/drivers/hv/vmbus_drv.c +++ b/drivers/hv/vmbus_drv.c @@ -2395,10 +2398,8 @@ static void __maybe_unused vmbus_reserve_fb(void) if (efi_enabled(EFI_BOOT)) { /* Gen2 VM: get FB base from EFI framebuffer */ - if (IS_ENABLED(CONFIG_SYSFB)) { - start = sysfb_primary_display.screen.lfb_base; - size = max_t(__u32, sysfb_primary_display.screen.lfb_size, 0x800000); - } + start = sysfb_primary_display.screen.lfb_base; + size = max_t(__u32, sysfb_primary_display.screen.lfb_size, 0x800000); } else { /* Gen1 VM: get FB base from PCI */ pdev = pci_get_device(PCI_VENDOR_ID_MICROSOFT, diff --git a/drivers/hv/Kconfig b/drivers/hv/Kconfig index 7937ac0cbd0f..78d7f8c66278 100644 --- a/drivers/hv/Kconfig +++ b/drivers/hv/Kconfig @@ -9,7 +9,6 @@ config HYPERV select PARAVIRT select X86_HV_CALLBACK_VECTOR if X86 select OF_EARLY_FLATTREE if OF - select SYSFB if EFI && !HYPERV_VTL_MODE select IRQ_MSI_LIB if X86 help Select this option to run Linux as a Hyper-V client operating Thanks, Dexuan ^ permalink raw reply related [flat|nested] 16+ messages in thread
* RE: [PATCH] PCI: hv: Allocate MMIO from above 4GB for the config window 2026-04-02 19:23 ` Dexuan Cui @ 2026-04-05 23:13 ` Michael Kelley 2026-04-08 6:37 ` Dexuan Cui 0 siblings, 1 reply; 16+ messages in thread From: Michael Kelley @ 2026-04-05 23:13 UTC (permalink / raw) To: Dexuan Cui, Michael Kelley, Matthew Ruffell Cc: bhelgaas@google.com, Haiyang Zhang, Jake Oshins, kwilczynski@kernel.org, KY Srinivasan, linux-hyperv@vger.kernel.org, linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org, Long Li, lpieralisi@kernel.org, mani@kernel.org, robh@kernel.org, stable@vger.kernel.org, wei.liu@kernel.org From: Dexuan Cui <DECUI@microsoft.com> Sent: Thursday, April 2, 2026 12:24 PM > > > From: Michael Kelley <mhklinux@outlook.com> > > Sent: Friday, January 23, 2026 10:28 AM > > ... > > One more thought here: Is commit 96959283a58d relevant? The > > commit message describes a scenario where vmbus_reserve_fb() > > doesn't do anything because CONFIG_SYSFB is not set. Looking at > > the code for vmbus_reserve_fb(), it doing nothing might imply that > > screen_info.lfb_base is 0. But when CONFIG_SYSFB is not set, > > screen_info.lfb_base is just ignored, with the same result. This behavior > > started with the 6.7 kernel due to commit a07b50d80ab6. > > > > Note that commit 96959283a58d has a follow-on to correct a > > problem when CONFIG_EFI is not set. See commit 7b89a44b2e8c. > > If there's a reason to backport 96959283a58d, also get > > 7b89a44b2e8c. > > > > Michael > > In my opinion, > 96959283a58d ("Drivers: hv: Always select CONFIG_SYSFB for Hyper-V guests") > is not a good fix for a07b50d80ab6: the commit message of a07b50d80ab6 > says "the vmbus_drv code marks the original EFI framebuffer as reserved, but > this is not required if there is no sysfb" -- IMO the message is incorrect. > > Even if CONFIG_SYSFB is not set, we still need to reserve the framebuffer > MMIO range, because we need to make sure that hv_pci doesn't allocate > MMIO from there. 
>
> 96959283a58d adds "select SYSFB if !HYPERV_VTL_MODE", but we can
> still manually unset CONFIG_SYSFB (I happened to do this when debugging
> the kdump issue), and hv_pci won't work.

Just curious -- how would you manually unset CONFIG_SYSFB? The kernel
makefile always resyncs .config against the Kconfig rules, which would add
CONFIG_SYSFB back again. The Kconfig files essentially say that removing
CONFIG_SYSFB is an invalid configuration.

>
> IMO vmbus_reserve_fb() should unconditionally reserve the frame buffer
> MMIO range. I'll post a patch like this:
>
> --- a/drivers/hv/vmbus_drv.c
> +++ b/drivers/hv/vmbus_drv.c
> @@ -2395,10 +2398,8 @@ static void __maybe_unused vmbus_reserve_fb(void)
>
>  	if (efi_enabled(EFI_BOOT)) {
>  		/* Gen2 VM: get FB base from EFI framebuffer */
> -		if (IS_ENABLED(CONFIG_SYSFB)) {
> -			start = sysfb_primary_display.screen.lfb_base;
> -			size = max_t(__u32, sysfb_primary_display.screen.lfb_size, 0x800000);
> -		}
> +		start = sysfb_primary_display.screen.lfb_base;
> +		size = max_t(__u32, sysfb_primary_display.screen.lfb_size, 0x800000);

On arm64 the existence of sysfb_primary_display is conditional on
several config variables, including CONFIG_SYSFB and CONFIG_EFI_EARLYCON
(see drivers/firmware/efi/efi-init.c). If you can take away CONFIG_SYSFB, you
could also take away CONFIG_EFI_EARLYCON and end up with a build error on
arm64. So I'm not clear how this approach would be more robust against
invalid .config changes.

Also this recent patch set [1] submitted by Thomas Zimmermann is even
more explicit about sysfb_primary_display being conditional on
CONFIG_SYSFB.
Michael [1] https://lore.kernel.org/linux-hyperv/20260402092305.208728-1-tzimmermann@suse.de/ > } else { > /* Gen1 VM: get FB base from PCI */ > pdev = pci_get_device(PCI_VENDOR_ID_MICROSOFT, > > > diff --git a/drivers/hv/Kconfig b/drivers/hv/Kconfig > index 7937ac0cbd0f..78d7f8c66278 100644 > --- a/drivers/hv/Kconfig > +++ b/drivers/hv/Kconfig > @@ -9,7 +9,6 @@ config HYPERV > select PARAVIRT > select X86_HV_CALLBACK_VECTOR if X86 > select OF_EARLY_FLATTREE if OF > - select SYSFB if EFI && !HYPERV_VTL_MODE > select IRQ_MSI_LIB if X86 > help > Select this option to run Linux as a Hyper-V client operating > > Thanks, > Dexuan ^ permalink raw reply [flat|nested] 16+ messages in thread
* RE: [PATCH] PCI: hv: Allocate MMIO from above 4GB for the config window 2026-04-05 23:13 ` Michael Kelley @ 2026-04-08 6:37 ` Dexuan Cui 0 siblings, 0 replies; 16+ messages in thread From: Dexuan Cui @ 2026-04-08 6:37 UTC (permalink / raw) To: Michael Kelley, Matthew Ruffell Cc: bhelgaas@google.com, Haiyang Zhang, Jake Oshins, kwilczynski@kernel.org, KY Srinivasan, linux-hyperv@vger.kernel.org, linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org, Long Li, lpieralisi@kernel.org, mani@kernel.org, robh@kernel.org, stable@vger.kernel.org, wei.liu@kernel.org > From: Michael Kelley <mhklinux@outlook.com> > Sent: Sunday, April 5, 2026 4:13 PM > > ... > > 96959283a58d adds "select SYSFB if !HYPERV_VTL_MODE", but we can > > still manually unset CONFIG_SYSFB (I happened to do this when debugging > > the kdump issue), and hv_pci won't work. > > Just curious -- how would you manually unset CONFIG_SYSFB? The kernel > makefile always resync's .config against the Kconfig rules, which would add > CONFIG_SYSFB back again. The Kconfig files essentially say that removing > CONFIG_SYSFB is an invalid configuration. Sorry, my description above is wrong: on the mainline kernel that has 96959283a58d ("Drivers: hv: Always select CONFIG_SYSFB for Hyper-V guests"), I'm unable to unset CONFIG_SYSFB. When I was able to unset CONFIG_SYSFB, I was actually on Ubuntu 22.04 (Ubuntu-azure-6.8-6.8.0-1049.55_22.04.1, released in Feb 2026). I thought the kernel has 96959283a58d, but actually it doesn't... > > IMO vmbus_reserve_fb() should unconditionally reserve the frame buffer > > MMIO range. 
I'll post a patch like this: > > > > --- a/drivers/hv/vmbus_drv.c > > +++ b/drivers/hv/vmbus_drv.c > > @@ -2395,10 +2398,8 @@ static void __maybe_unused > vmbus_reserve_fb(void) > > > > if (efi_enabled(EFI_BOOT)) { > > /* Gen2 VM: get FB base from EFI framebuffer */ > > - if (IS_ENABLED(CONFIG_SYSFB)) { > > - start = sysfb_primary_display.screen.lfb_base; > > - size = max_t(__u32, sysfb_primary_display.screen.lfb_size, > 0x800000); > > - } > > + start = sysfb_primary_display.screen.lfb_base; > > + size = max_t(__u32, sysfb_primary_display.screen.lfb_size, > 0x800000); Please ignore the change above. > On arm64 the existence of sysfb_primary_display is conditional on > several config variables, including CONFIG_SYSFB and CONFIG_EFI_EARLYCON. > (see drivers/firmware/efi/efi-init.c) If you can take away CONFIG_SYSFB, you > could also take away CONFIG_EFI_EARLYCON and end up with build error on > arm64. So I'm not clear how this approach would be more robust against > invalid .config changes. Agreed. Then let's keep vmbus_reserve_fb() as is. Thanks, Dexuan ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH] PCI: hv: Allocate MMIO from above 4GB for the config window 2026-01-23 6:39 ` Michael Kelley 2026-01-23 18:28 ` Michael Kelley @ 2026-02-07 1:42 ` Krister Johansen 2026-04-02 18:49 ` Dexuan Cui 2 siblings, 0 replies; 16+ messages in thread From: Krister Johansen @ 2026-02-07 1:42 UTC (permalink / raw) To: Matthew Ruffell, Michael Kelley Cc: DECUI@microsoft.com, bhelgaas@google.com, haiyangz@microsoft.com, jakeo@microsoft.com, kwilczynski@kernel.org, kys@microsoft.com, linux-hyperv@vger.kernel.org, linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org, longli@microsoft.com, lpieralisi@kernel.org, mani@kernel.org, robh@kernel.org, stable@vger.kernel.org, wei.liu@kernel.org Hi Matthew and Michael, On Fri, Jan 23, 2026 at 06:39:24AM +0000, Michael Kelley wrote: > From: Matthew Ruffell <matthew.ruffell@canonical.com> Sent: Thursday, January 22, 2026 9:39 PM > > > > There's a parameter to the kexec() command that governs whether it uses the > > > > kexec_file_load() system call or the kexec_load() system call. > > > > I wonder if that parameter makes a difference in the problem described for this > > > > patch. > > > > Yes, it does indeed make a difference. I have been debugging this the past few > > days, and my colleague Melissa noticed that the problem reproduces when secure > > boot is disabled, but it does not reproduce when secure boot is enabled. > > Additionally, it reproduces on jammy, but not noble. It turns out that > > kexec-tools on jammy defaults to kexec_load() when secure boot is disabled, > > and when enabled, it instead uses kexec_file_load(). On noble, it defaults to > > first trying kexec_file_load() before falling back to kexec_load(), so the > > issue does not reproduce. > > This is good info, and definitely a clue. So to be clear, the problem repros > only when kexec_load() is used. With kexec_file_load(), it does not repro. Is that > right? 
I saw a similar distinction when working on commit 304386373007, > though in the opposite direction! Just to muddy the waters here, I have a team on the Noble 6.8 kernel train that's running into this issue on Standard_D#pds_v6 with secure boot disabled. I've validated via strace(8) that kexec(8) is calling kexec_file_load(2), but in this case the problem Dexuan describes in the cover letter occurs but affects NIC attachment instead of the NVMe storage device. (e.g. pci_hyperv attach of the NIC reports the pass-through error instead of successfully attaching). > > > > > /* > > > > > * Set up a region of MMIO space to use for accessing configuration > > > > > - * space. > > > > > + * space. Use the high MMIO range to not conflict with the hyperv_drm > > > > > + * driver (which normally gets MMIO from the low MMIO range) in the > > > > > + * kdump kernel of a Gen2 VM, which fails to reserve the framebuffer > > > > > + * MMIO range in vmbus_reserve_fb() due to screen_info.lfb_base being > > > > > + * zero in the kdump kernel. > > > > > */ > > > > > - ret = vmbus_allocate_mmio(&hbus->mem_config, hbus->hdev, 0, -1, > > > > > + ret = vmbus_allocate_mmio(&hbus->mem_config, hbus->hdev, SZ_4G, -1, > > > > > PCI_CONFIG_MMIO_LENGTH, 0x1000, false); > > > > > if (ret) > > > > > return ret; > > > > > -- > > > > Thank you for the patch Dexuan. > > > > This patch fixes the problem on Ubuntu 5.15, and 6.8 based kernels > > booting V6 instance types on Azure with Gen 2 images. > > Are you seeing the problem on x86/64 or arm64 instances in Azure? > "V6 instance types" could be either, I think, but I'm guessing you > are on x86/64. > > And just to confirm: are you seeing the problem with the > Hyper-V DRM driver, or the Hyper-V FB driver? This patch mentions > the DRM driver, so I assume that's the problematic config. It's been arm64 and not x86 for the case I've seen. 
They're currently running with the hyperv_drm driver, but they've also tried swapping to the fb driver without any change in results. > > Tested-by: Matthew Ruffell <matthew.ruffell@canonical.com> All of the above said, I also tested Dexuan's fix on these instances and found that with the patch applied kdump did work again. Tested-by: Krister Johansen <johansen@templeofstupid.com> -K ^ permalink raw reply [flat|nested] 16+ messages in thread
* RE: [PATCH] PCI: hv: Allocate MMIO from above 4GB for the config window 2026-01-23 6:39 ` Michael Kelley 2026-01-23 18:28 ` Michael Kelley 2026-02-07 1:42 ` Krister Johansen @ 2026-04-02 18:49 ` Dexuan Cui 2 siblings, 0 replies; 16+ messages in thread From: Dexuan Cui @ 2026-04-02 18:49 UTC (permalink / raw) To: Michael Kelley, Matthew Ruffell Cc: bhelgaas@google.com, Haiyang Zhang, Jake Oshins, kwilczynski@kernel.org, KY Srinivasan, linux-hyperv@vger.kernel.org, linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org, Long Li, lpieralisi@kernel.org, mani@kernel.org, robh@kernel.org, stable@vger.kernel.org, wei.liu@kernel.org > From: Michael Kelley <mhklinux@outlook.com> > Sent: Thursday, January 22, 2026 10:39 PM > ... > This is good info, and definitely a clue. So to be clear, the problem repros > only when kexec_load() is used. With kexec_file_load(), it does not repro. Is > that right? Yes and no. The answer depends on the combination of the version of kdump-tools, and the architecture (x86-64 vs. ARM64), and the hypercall (KEXEC_LOAD vs. KEXEC_FILE_LOAD) and the Linux kernel version (there have been patches fixing and breaking kdump over the past several years...) Please see the reply I posted about 2 hours ago for all the details. > I saw a similar distinction when working on commit 304386373007, > though in the opposite direction! I think this happens because you're using Ubuntu 20.04: > https://lwn.net/ml/linux-kernel/SN6PR02MB41572155B6D139C499814EB7D4F12@SN6PR02MB4157.namprd02.prod.outlook.com/ > To further complicate matters, the kexec on Oracle Linux 9.4 seems to > have a bug when the -c option forces the use of kexec_load() instead > of kexec_file_load(). As an experiment, I modified the kdumpctl shell > script to add the "-c" option to kexec, but in that case the value "0x0" > is passed as the framebuffer address, which is wrong. 
Before commit 304386373007, hyperv_fb relocates the framebuffer MMIO base, so KEXEC_FILE_LOAD doesn't work for you, because the kdump kernel's screen_info.lfb_base still points to the initial MMIO, and hence the kdump kernel's efifb driver fails to work properly; KEXEC_LOAD works for you because the kdump-tools v2.0.18 in Ubuntu 20.04 doesn't have that commit (see the other reply from me) The kexec on Oracle Linux 9.4 has that commit, and it's not buggy -- unless we'd like to claim that all the recent kdump-tools versions are buggy :-) Thanks, Dexuan ^ permalink raw reply [flat|nested] 16+ messages in thread
* RE: [PATCH] PCI: hv: Allocate MMIO from above 4GB for the config window 2026-01-22 7:10 ` Michael Kelley 2026-01-22 19:14 ` Long Li @ 2026-04-02 17:09 ` Dexuan Cui 2026-04-05 23:11 ` Michael Kelley 1 sibling, 1 reply; 16+ messages in thread From: Dexuan Cui @ 2026-04-02 17:09 UTC (permalink / raw) To: Michael Kelley, KY Srinivasan, Haiyang Zhang, wei.liu@kernel.org, Long Li, lpieralisi@kernel.org, kwilczynski@kernel.org, mani@kernel.org, robh@kernel.org, bhelgaas@google.com, Jake Oshins, linux-hyperv@vger.kernel.org, linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org Cc: stable@vger.kernel.org, Matthew Ruffell, Krister Johansen > From: Michael Kelley <mhklinux@outlook.com> > Sent: Wednesday, January 21, 2026 11:11 PM > ... > From: Dexuan Cui <decui@microsoft.com> Sent: Wednesday, January 21, > 2026 6:04 PM > > > > There has been a longstanding MMIO conflict between the pci_hyperv > > driver's config_window (see hv_allocate_config_window()) and the > > hyperv_drm (or hyperv_fb) driver (see hyperv_setup_vram()): typically > > both get MMIO from the low MMIO range below 4GB; this is not an issue > > in the normal kernel since the VMBus driver reserves the framebuffer > > MMIO in vmbus_reserve_fb(), so the drm driver's hyperv_setup_vram() > > can always get the reserved framebuffer MMIO; however, a Gen2 VM's > > kdump kernel fails to reserve the framebuffer MMIO in vmbus_reserve_fb() > > because the screen_info.lfb_base is zero in the kdump kernel: > > the screen_info is not initialized at all in the kdump kernel, because the > > EFI stub code, which initializes screen_info, doesn't run in the case of kdump. > > I don't think this is correct. Yes, the EFI stub doesn't run, but screen_info Hi Michael, sorry for delaying the reply for so long! Now I think I should understand all the details. My earlier statement "the screen_info is not initialized at all in the kdump kernel" is not correct on x86, but I believe it's correct on ARM64. 
Please see my explanation below. > should be initialized in the kdump kernel by the code that loads the > kdump kernel into the reserved crash memory. See discussion in the commit > message for commit 304386373007. > > I wonder if commit a41e0ab394e4 broke the initialization of screen_info > in the kdump kernel. Or perhaps there is now a rev-lock between the kernel > with this commit and a new version of the user space kexec command. The commit a41e0ab394e4 ("sysfb: Replace screen_info with sysfb_primary_display") should be unrelated here. > There's a parameter to the kexec() command that governs whether it > uses the kexec_file_load() system call or the kexec_load() system call. > I wonder if that parameter makes a difference in the problem described > for this patch. > > I can't immediately remember if, when I was working on commit > 304386373007, I tested kdump in a Gen 2 VM with an NVMe OS disk to > ensure that MMIO space was properly allocated to the frame buffer > driver (either hyperv_fb or hyperv_drm). I'm thinking I did, but tomorrow > I'll check for any definitive notes on that. > > Michael If vmbus_reserve_fb() in the kdump kernel fails to reserve the framebuffer MMIO range due to a Gen2 VM's screen_info.lfb_base being 0, the MMIO conflict between hyperv_fb/hyperv_drm and hv_pci happens -- this is especially an issue if hv_pci is built-in and hyperv_fb/hyperv_drm is built as modules. vmbus_reserve_fb() should always succeed for a Gen1 VM, since it can always get the framebuffer MMIO base from the legacy PCI graphics device, so we only need to discuss Gen2 VMs here. When kdump-tools loads the kdump kernel into memory, the tool can accept any of the 3 parameters (e.g. I got the below via "man kexec" in Ubuntu 24.04): -s (--kexec-file-syscall) Specify that the new KEXEC_FILE_LOAD syscall should be used exclusively. -c (--kexec-syscall) Specify that the old KEXEC_LOAD syscall should be used exclusively. 
-a (--kexec-syscall-auto) Try the new KEXEC_FILE_LOAD syscall first and when it is not supported or the kernel does not understand the supplied im‐ age fall back to the old KEXEC_LOAD interface. There is no one single interface that always works, so this is the default. KEXEC_FILE_LOAD is required on systems that use locked-down secure boot to verify the kernel signature. KEXEC_LOAD may be also disabled in the kernel configuration. KEXEC_LOAD is required for some kernel image formats and on architectures that do not implement KEXEC_FILE_LOAD. If none of the parameters are specified, the default may be -c, or -s or -a, depending on the distro and the version in use. We can run strace -f kdump-config reload 2>&1 | egrep 'kexec_file_load|kexec_load' to tell which syscall is being used. Old distro versions seem to use KEXEC_LOAD by default, and new distro versions tend to use KEXEC_FILE_LOAD by default, especially when Secure Boot is enabled (e.g. see /usr/sbin/kdump-config: kdump_load() in Ubuntu). In Ubuntu, we can explicitly specify one of the parameters in "/etc/default/kdump-tools", e.g. KDUMP_KEXEC_ARGS="-c -d". The -d is for debugging. I found it very useful: when we run "kdump-config show" or "kdump-config reload", we get very useful debug info with -d. On x86-64, with -c: The kdump-tools gets the framebuffer's MMIO base using ioctl(fd, FBIOGET_FSCREENINFO, ....): see the end of the email for an example program; kdump-tools then uses the KEXEC_LOAD syscall to set up the screen_info.lfb_base for the kdump kernel. The function in kdump-tools that gets the framebuffer MMIO base is kexec/arch/i386/x86-linux-setup.c: setup_linux_vesafb(): https://git.kernel.org/pub/scm/utils/kernel/kexec/kexec-tools.git/tree/kexec/arch/i386/x86-linux-setup.c?h=v2.0.32#n133 Unluckily, setup_linux_vesafb() only recognizes the vesafb driver in Linux kernel ("VESA VGA") and the efifb driver ("EFI VGA"). It looks like normally arch_options.reuse_video_type is always 0. 
This means the kdump kernel's screen_info.lfb_base is 0, if hyperv_fb or
hyperv_drm loads.

In the past, for an Ubuntu kernel with CONFIG_FB_EFI=y, our workaround was
to blacklist hyperv_fb or hyperv_drm, so /dev/fb0 is backed by efifb, and
the screen_info.lfb_base is correctly set for kdump. However, now
CONFIG_FB_EFI is not set in recent Ubuntu kernels:

$ egrep 'CONFIG_FB_EFI|CONFIG_SYSFB|CONFIG_SYSFB_SIMPLEFB|CONFIG_DRM_SIMPLEDRM|CONFIG_DRM_HYPERV' /boot/config-6.8.0-1051-azure
CONFIG_SYSFB=y
CONFIG_SYSFB_SIMPLEFB=y
CONFIG_DRM_SIMPLEDRM=y
CONFIG_DRM_HYPERV=m
# CONFIG_FB_EFI is not set

So, with Ubuntu 22.04/24.04, -c can't avoid the MMIO conflict for Gen2
x86-64 VMs now, even if we blacklist hyperv_fb/hyperv_drm.

Note: Ubuntu 20.04 uses an old version of the kdump-tools, so the
statement is different there (see the later discussion below).

hyperv_fb has been removed in the mainline kernel: see
commit 40227f2efcfb ("fbdev: hyperv_fb: Remove hyperv_fb driver")
so we no longer need to worry about it.

Even if we modify setup_linux_vesafb() to support hyperv_drm, it still
won't work, because the MMIO base is hidden by
commit da6c7707caf3 ("fbdev: Add FBINFO_HIDE_SMEM_START flag")

On x86-64, with -s:
The KEXEC_FILE_LOAD syscall sets the kdump kernel's screen_info.lfb_base
in the kernel: see "arch/x86/kernel/kexec-bzimage64.c"
  bzImage64_load
    setup_boot_parameters
      memcpy(&params->screen_info, &screen_info, sizeof(struct screen_info));

so, as long as the first kernel's hyperv_drm doesn't relocate the
MMIO base, kdump should work fine; if the MMIO base is relocated,
currently hyperv_drm doesn't update the screen_info.lfb_base, so the
kdump kernel's efifb driver and hv_pci driver won't work. Normally hyperv_drm
doesn't relocate the MMIO base, unless the user specifies a very high
resolution and the required MMIO size exceeds the default 8MB reserved
by vmbus_reserve_fb() -- let's ignore that scenario for now.
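
To illustrate the -c case on x86-64 described above, here is a simplified
model of the decision. It is not the kexec-tools source. The function name
model_lfb_base() is made up for illustration, and it assumes that
reuse_video_type is 0, as it normally seems to be. Only the two id strings
named above ("VESA VGA" and "EFI VGA") are treated as recognized.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical model, not code from kexec-tools: given the id string and
 * smem_start reported by FBIOGET_FSCREENINFO for /dev/fb0, return the
 * framebuffer base that would end up in the kdump kernel's
 * screen_info.lfb_base when KEXEC_LOAD (-c) is used.
 */
static unsigned long model_lfb_base(const char *fb_id, unsigned long smem_start)
{
	assert(fb_id != NULL);

	/* Only the vesafb and efifb drivers are recognized. */
	if (strcmp(fb_id, "VESA VGA") == 0 || strcmp(fb_id, "EFI VGA") == 0)
		return smem_start;

	/* Any other driver: no framebuffer info is passed on. */
	return 0;
}
```

With an id of "EFI VGA" the model passes smem_start through. With any
other id (the string "hyperv_drm" would do as a placeholder, the real id
reported by the driver is not shown in this thread) it returns 0, which
is the value vmbus_reserve_fb() then sees in the kdump kernel.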
On ARM64, with -c:
The kdump-tools doesn't even open /dev/fb0 (we can confirm this by using
strace or bpftrace), so the kdump kernel's screen_info.lfb_base is always 0.

On ARM64, with -s:
"arch/arm64/kernel/kexec_image.c": image_load() doesn't set the
params->screen_info, so the kdump kernel's screen_info.lfb_base is always 0.

To recap, with a recent mainline kernel (or the linux-azure kernels) that
has 304386373007, my observation on Ubuntu 22.04 and 24.04 is:
  on x86-64, -c fails, but -s works.
  on ARM64, -c fails, and -s also fails.

Note: the kdump-tools v2.0.18 in Ubuntu 20.04 doesn't have this commit:
https://git.kernel.org/pub/scm/utils/kernel/kexec/kexec-tools.git/commit/?id=fb5a8792e6e4ee7de7ae3e06d193ea5beaaececc
(Note the "return 0;" in setup_linux_vesafb())
so, on x86-64, -c also works in Ubuntu 20.04, if hyperv_fb is used
(-c still doesn't work if hyperv_drm is used due to da6c7707caf3).

With this patch "PCI: hv: Allocate MMIO from above 4GB for the config window",
both -c and -s work on x86-64 and ARM64 due to no MMIO conflict, as long
as there are no 32-bit PCI BARs (which should be true on Azure and on
modern hosts).

With the patch, even if hyperv_drm relocates the framebuffer MMIO base,
there would still be no MMIO conflict because typically hyperv_drm gets
its MMIO from below 4GB: it seems like vmbus_walk_resources() always finds
the low MMIO range first and adds it to the beginning of the MMIO
resources "hyperv_mmio", so presumably hyperv_drm would get MMIO from the
low MMIO range.

I'll update the commit message, add Matthew's and Krister's Tested-by's
and post v2.
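
The effect of raising the minimum address to SZ_4G can be sketched with a
toy first-fit allocator. This is only an illustration under stated
assumptions, not the vmbus_allocate_mmio() implementation, which takes
further arguments and does more. The names toy_allocate_mmio() and
ranges_overlap() are invented here, and the example addresses in the note
after the code are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SZ_4G 0x100000000ULL

/* One MMIO range offered to the guest; 'end' is inclusive. */
struct mmio_range {
	uint64_t start;
	uint64_t end;
};

/*
 * Toy first-fit allocator (hypothetical): walk the ranges in order and
 * return the lowest aligned address >= min_addr where 'size' bytes fit
 * inside one range without going above max_addr. Returns 0 if nothing fits.
 */
static uint64_t toy_allocate_mmio(const struct mmio_range *ranges, size_t n,
				  uint64_t min_addr, uint64_t max_addr,
				  uint64_t size, uint64_t align)
{
	size_t i;

	assert(size != 0);
	assert(align != 0 && (align & (align - 1)) == 0);

	for (i = 0; i < n; i++) {
		uint64_t lo = ranges[i].start > min_addr ? ranges[i].start : min_addr;
		uint64_t hi = ranges[i].end < max_addr ? ranges[i].end : max_addr;
		uint64_t addr;

		/* Skip if rounding up to the alignment would overflow. */
		if (lo > UINT64_MAX - (align - 1))
			continue;
		addr = (lo + align - 1) & ~(align - 1);
		if (addr > hi || hi - addr < size - 1)
			continue;
		return addr;
	}
	return 0;
}

/* Return nonzero if [a_start, a_start + a_size) and [b_start, b_start + b_size) overlap. */
static int ranges_overlap(uint64_t a_start, uint64_t a_size,
			  uint64_t b_start, uint64_t b_size)
{
	return a_start < b_start + b_size && b_start < a_start + a_size;
}
```

Assume a low range 0xf8000000-0xffffffff, a high range starting at
0x1000000000, and an unreserved 8MB framebuffer at 0xf8000000. A request
for an assumed 0x2000-byte window with 0x1000 alignment and a minimum of 0
returns 0xf8000000, which overlaps the framebuffer. The same request with
a minimum of SZ_4G skips the low range and returns 0x1000000000, so the
overlap goes away regardless of what vmbus_reserve_fb() managed to reserve.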
Thanks, Dexuan //Print the info of the frame buffer for /dev/fb0: #include <stdio.h> #include <stdlib.h> #include <fcntl.h> #include <unistd.h> #include <sys/ioctl.h> #include <linux/fb.h> static void print_bitfield(const char *name, const struct fb_bitfield *bf) { printf("%s:\n", name); printf(" offset : %u\n", bf->offset); printf(" length : %u\n", bf->length); printf(" msb_right : %u\n", bf->msb_right); } static void print_fix_screeninfo(const struct fb_fix_screeninfo *fix) { printf("struct fb_fix_screeninfo:\n"); printf(" id : %.16s\n", fix->id); printf(" smem_start : 0x%lx\n", fix->smem_start); printf(" smem_len : %u\n", fix->smem_len); printf(" type : %u\n", fix->type); printf(" type_aux : %u\n", fix->type_aux); printf(" visual : %u\n", fix->visual); printf(" xpanstep : %u\n", fix->xpanstep); printf(" ypanstep : %u\n", fix->ypanstep); printf(" ywrapstep : %u\n", fix->ywrapstep); printf(" line_length : %u\n", fix->line_length); printf(" mmio_start : %lu\n", fix->mmio_start); printf(" mmio_len : %u\n", fix->mmio_len); printf(" accel : %u\n", fix->accel); printf(" capabilities : %u\n", fix->capabilities); printf(" reserved[0] : %u\n", fix->reserved[0]); printf(" reserved[1] : %u\n", fix->reserved[1]); } static void print_var_screeninfo(const struct fb_var_screeninfo *var) { printf("struct fb_var_screeninfo:\n"); printf(" xres : %u\n", var->xres); printf(" yres : %u\n", var->yres); printf(" xres_virtual : %u\n", var->xres_virtual); printf(" yres_virtual : %u\n", var->yres_virtual); printf(" xoffset : %u\n", var->xoffset); printf(" yoffset : %u\n", var->yoffset); printf(" bits_per_pixel: %u\n", var->bits_per_pixel); printf(" grayscale : %u\n", var->grayscale); print_bitfield(" red", &var->red); print_bitfield(" green", &var->green); print_bitfield(" blue", &var->blue); print_bitfield(" transp", &var->transp); printf(" nonstd : %u\n", var->nonstd); printf(" activate : %u\n", var->activate); printf(" height : %u\n", var->height); printf(" width : %u\n", var->width); 
printf(" accel_flags : %u\n", var->accel_flags); printf(" pixclock : %u\n", var->pixclock); printf(" left_margin : %u\n", var->left_margin); printf(" right_margin : %u\n", var->right_margin); printf(" upper_margin : %u\n", var->upper_margin); printf(" lower_margin : %u\n", var->lower_margin); printf(" hsync_len : %u\n", var->hsync_len); printf(" vsync_len : %u\n", var->vsync_len); printf(" sync : %u\n", var->sync); printf(" vmode : %u\n", var->vmode); printf(" rotate : %u\n", var->rotate); printf(" colorspace : %u\n", var->colorspace); printf(" reserved[0] : %u\n", var->reserved[0]); printf(" reserved[1] : %u\n", var->reserved[1]); printf(" reserved[2] : %u\n", var->reserved[2]); printf(" reserved[3] : %u\n", var->reserved[3]); } int main(void) { int fd; struct fb_fix_screeninfo fix; struct fb_var_screeninfo var; fd = open("/dev/fb0", O_RDONLY); if (fd == -1) { perror("open"); return EXIT_FAILURE; } if (ioctl(fd, FBIOGET_FSCREENINFO, &fix) == -1) { perror("ioctl(FBIOGET_FSCREENINFO)"); close(fd); return EXIT_FAILURE; } if (ioctl(fd, FBIOGET_VSCREENINFO, &var) == -1) { perror("ioctl(FBIOGET_VSCREENINFO)"); close(fd); return EXIT_FAILURE; } print_fix_screeninfo(&fix); printf("\n"); print_var_screeninfo(&var); close(fd); return EXIT_SUCCESS; } ^ permalink raw reply [flat|nested] 16+ messages in thread
* RE: [PATCH] PCI: hv: Allocate MMIO from above 4GB for the config window 2026-04-02 17:09 ` Dexuan Cui @ 2026-04-05 23:11 ` Michael Kelley 2026-04-08 6:15 ` Dexuan Cui 0 siblings, 1 reply; 16+ messages in thread From: Michael Kelley @ 2026-04-05 23:11 UTC (permalink / raw) To: Dexuan Cui, Michael Kelley, KY Srinivasan, Haiyang Zhang, wei.liu@kernel.org, Long Li, lpieralisi@kernel.org, kwilczynski@kernel.org, mani@kernel.org, robh@kernel.org, bhelgaas@google.com, Jake Oshins, linux-hyperv@vger.kernel.org, linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org Cc: stable@vger.kernel.org, Matthew Ruffell, Krister Johansen From: Dexuan Cui <DECUI@microsoft.com> Sent: Thursday, April 2, 2026 10:10 AM > > > From: Michael Kelley <mhklinux@outlook.com> > > Sent: Wednesday, January 21, 2026 11:11 PM > > ... > > From: Dexuan Cui <decui@microsoft.com> Sent: Wednesday, January 21, > > 2026 6:04 PM > > > > > > There has been a longstanding MMIO conflict between the pci_hyperv > > > driver's config_window (see hv_allocate_config_window()) and the > > > hyperv_drm (or hyperv_fb) driver (see hyperv_setup_vram()): typically > > > both get MMIO from the low MMIO range below 4GB; this is not an issue > > > in the normal kernel since the VMBus driver reserves the framebuffer > > > MMIO in vmbus_reserve_fb(), so the drm driver's hyperv_setup_vram() > > > can always get the reserved framebuffer MMIO; however, a Gen2 VM's > > > kdump kernel fails to reserve the framebuffer MMIO in vmbus_reserve_fb() > > > because the screen_info.lfb_base is zero in the kdump kernel: > > > the screen_info is not initialized at all in the kdump kernel, because the > > > EFI stub code, which initializes screen_info, doesn't run in the case of kdump. > > > > I don't think this is correct. Yes, the EFI stub doesn't run, but screen_info > > Hi Michael, sorry for delaying the reply for so long! Now I think I should > understand all the details. 
>
> My earlier statement "the screen_info is not initialized at all in the kdump
> kernel" is not correct on x86, but I believe it's correct on ARM64. Please see
> my explanation below.

Sadly, I must agree. It's surprising, because it affects kexec scenarios
that don't include Hyper-V. On arm64 bare metal, if you kexec to a kernel
configured to run the efifb frame buffer driver, the driver won't load.

>
> > should be initialized in the kdump kernel by the code that loads the
> > kdump kernel into the reserved crash memory. See discussion in the commit
> > message for commit 304386373007.
> >
> > I wonder if commit a41e0ab394e4 broke the initialization of screen_info
> > in the kdump kernel. Or perhaps there is now a rev-lock between the kernel
> > with this commit and a new version of the user space kexec command.
>
> The commit
> a41e0ab394e4 ("sysfb: Replace screen_info with sysfb_primary_display")
> should be unrelated here.

Agreed.

>
> > There's a parameter to the kexec() command that governs whether it
> > uses the kexec_file_load() system call or the kexec_load() system call.
> > I wonder if that parameter makes a difference in the problem described
> > for this patch.
> >
> > I can't immediately remember if, when I was working on commit
> > 304386373007, I tested kdump in a Gen 2 VM with an NVMe OS disk to
> > ensure that MMIO space was properly allocated to the frame buffer
> > driver (either hyperv_fb or hyperv_drm). I'm thinking I did, but tomorrow
> > I'll check for any definitive notes on that.
> >
> > Michael

Evidently, I did not fully test an arm64 VM, or I would have seen that
screen_info wasn't being populated for the kdump kernel.

>
> If vmbus_reserve_fb() in the kdump kernel fails to reserve the framebuffer
> MMIO range due to a Gen2 VM's screen_info.lfb_base being 0, the MMIO
> conflict between hyperv_fb/hyperv_drm and hv_pci happens -- this is
> especially an issue if hv_pci is built-in and hyperv_fb/hyperv_drm is built
> as modules.
> vmbus_reserve_fb() should always succeed for a Gen1 VM, since
> it can always get the framebuffer MMIO base from the legacy PCI graphics
> device, so we only need to discuss Gen2 VMs here.

Agreed.

>
> When kdump-tools loads the kdump kernel into memory, the tool can
> accept any of the 3 parameters (e.g. I got the below via "man kexec" in
> Ubuntu 24.04):
>
> -s (--kexec-file-syscall)
>        Specify that the new KEXEC_FILE_LOAD syscall should be used exclusively.
>
> -c (--kexec-syscall)
>        Specify that the old KEXEC_LOAD syscall should be used exclusively.
>
> -a (--kexec-syscall-auto)
>        Try the new KEXEC_FILE_LOAD syscall first and when it is not supported or
>        the kernel does not understand the supplied image fall back to the old
>        KEXEC_LOAD interface.
>
>        There is no one single interface that always works, so this is the default.
>
>        KEXEC_FILE_LOAD is required on systems that use locked-down secure boot to
>        verify the kernel signature. KEXEC_LOAD may be also disabled in the kernel
>        configuration.
>
>        KEXEC_LOAD is required for some kernel image formats and on architectures
>        that do not implement KEXEC_FILE_LOAD.
>
> If none of the parameters are specified, the default may be -c, or -s
> or -a, depending on the distro and the version in use. We can run
>   strace -f kdump-config reload 2>&1 | egrep 'kexec_file_load|kexec_load'
> to tell which syscall is being used.
>
> Old distro versions seem to use KEXEC_LOAD by default, and new distro
> versions tend to use KEXEC_FILE_LOAD by default, especially when
> Secure Boot is enabled (e.g. see /usr/sbin/kdump-config: kdump_load()
> in Ubuntu).

Agreed. I think I had seen that previously.

>
> In Ubuntu, we can explicitly specify one of the parameters in
> "/etc/default/kdump-tools", e.g. KDUMP_KEXEC_ARGS="-c -d".
>
> The -d is for debugging. I found it very useful: when we run
> "kdump-config show" or "kdump-config reload", we get very useful
> debug info with -d.
>
> On x86-64, with -c:
> The kdump-tools gets the framebuffer's MMIO base using
> ioctl(fd, FBIOGET_FSCREENINFO, ....): see the end of the email for
> an example program; kdump-tools then uses the KEXEC_LOAD syscall
> to set up the screen_info.lfb_base for the kdump kernel.

Thanks. While redoing some experiments yesterday, I found a similar
program that I had written a year ago to dump the ioctl results.

>
> The function in kdump-tools that gets the framebuffer MMIO base
> is kexec/arch/i386/x86-linux-setup.c: setup_linux_vesafb():
> https://git.kernel.org/pub/scm/utils/kernel/kexec/kexec-tools.git/tree/kexec/arch/i386/x86-linux-setup.c?h=v2.0.32#n133
>
> Unluckily, setup_linux_vesafb() only recognizes the vesafb
> driver in Linux kernel ("VESA VGA") and the efifb driver ("EFI VGA").
> It looks like normally arch_options.reuse_video_type is always 0.
>
> This means the kdump kernel's screen_info.lfb_base is 0, if
> hyperv_fb or hyperv_drm loads. In the past, for a Ubuntu kernel
> with CONFIG_FB_EFI=y, our workaround is blacklisting
> hyperv_fb or hyperv_drm, so /dev/fb0 is backed by efifb, and
> the screen_info.lfb_base is correctly set for kdump.

Hmmm. This is worse than I thought for x86/x64. In fact, it means a
part of my commit message for 304386373007 is now wrong. I had
described everything as working when using the kexec_load() system
call because the FBIOGET_FSCREENINFO ioctl was returning a good
value for smem_start (at least with the hyperv_fb driver). But as you
point out further down, newer versions of the kexec user space program
are ignoring that smem_start value unless the driver is vesafb or efifb.

Was blacklisting hyperv_fb or hyperv_drm in the kdump kernel
a workaround we had promulgated in the past? My recollection
is vague. But no matter.
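
The driver-name check discussed above can be pictured with a minimal C sketch. This is not the kexec-tools code: the helper name fb_id_is_reusable() and the table name are hypothetical, and the only facts taken from this thread are the two recognized IDs, "VESA VGA" and "EFI VGA". The sketch also assumes an exact string match on the "id" field of struct fb_fix_screeninfo.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* IDs that, per this thread, setup_linux_vesafb() recognizes. */
static const char *const reusable_fb_ids[] = { "VESA VGA", "EFI VGA" };

/*
 * Hypothetical helper: return true if the fbdev driver ID is one whose
 * smem_start would be carried into the kdump kernel's
 * screen_info.lfb_base. Any other ID leaves lfb_base at 0.
 */
static bool fb_id_is_reusable(const char *id)
{
    size_t i;
    size_t n = sizeof(reusable_fb_ids) / sizeof(reusable_fb_ids[0]);

    assert(id != NULL);
    for (i = 0; i < n; i++) {
        if (strcmp(id, reusable_fb_ids[i]) == 0)
            return true;
    }
    return false;
}
```

With this helper, fb_id_is_reusable("EFI VGA") is true, while an ID reported by a Hyper-V framebuffer driver (for example the assumed string "hyperv_fb") is rejected, which matches the lfb_base == 0 outcome described above.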
>
> However, now CONFIG_FB_EFI is not set in recent Ubuntu kernels:
> $ egrep 'CONFIG_FB_EFI|CONFIG_SYSFB|CONFIG_SYSFB_SIMPLEFB|CONFIG_DRM_SIMPLEDRM|CONFIG_DRM_HYPERV' \
>     /boot/config-6.8.0-1051-azure
> CONFIG_SYSFB=y
> CONFIG_SYSFB_SIMPLEFB=y
> CONFIG_DRM_SIMPLEDRM=y
> CONFIG_DRM_HYPERV=m
> # CONFIG_FB_EFI is not set
>
> So, with Ubuntu 22.04/24.04, -c can't avoid the MMIO conflict
> for Gen2 x86-64 VMs now, even if we blacklist hyperv_fb/hyperv_drm.
> Note: Ubuntu 20.04 uses an old version of the kdump-tools, so
> the statement is different there (see the later discussion below).
>
> hyperv_fb has been removed in the mainline kernel: see
> commit 40227f2efcfb ("fbdev: hyperv_fb: Remove hyperv_fb driver")
> so we no longer need to worry about it.
>
> Even if we modify setup_linux_vesafb() to support hyperv_drm,
> it still won't work, because the MMIO base is hidden by commit
> da6c7707caf3 ("fbdev: Add FBINFO_HIDE_SMEM_START flag")

Agreed.

>
> On x86-64, with -s:
> The KEXEC_FILE_LOAD syscall sets the kdump kernel's
> screen_info.lfb_base in the kernel: see
>
> "arch/x86/kernel/kexec-bzimage64.c"
>   bzImage64_load
>     setup_boot_parameters
>       memcpy(&params->screen_info, &screen_info, sizeof(struct screen_info));
>
> so, as long as the first kernel's hyperv_drm doesn't relocate the
> MMIO base, kdump should work fine; if the MMIO base is relocated,
> currently hyperv_drm doesn't update the screen_info.lfb_base,
> so the kdump's efifb driver and hv_pci driver won't work. Normally
> hyperv_drm doesn't relocate the MMIO base, unless the user
> specifies a very high resolution and the required MMIO size
> exceeds the default 8MB reserved by vmbus_reserve_fb() -- let's
> ignore that scenario for now.
>

Agreed.

>
> On ARM64, with -c:
> The kdump-tools doesn't even open /dev/fb0 (we can confirm this by using
> strace or bpftrace), so the kdump kernel's screen_info.lfb_base is always 0.

Agreed.
>
> On ARM64, with -s:
> "arch/arm64/kernel/kexec_image.c": image_load() doesn't set the
> params->screen_info, so the kdump kernel's screen_info.lfb_base is always 0.

Agreed.

>
> To recap, with a recent mainline kernel (or the linux-azure kernels) that
> has 304386373007, my observation on Ubuntu 22.04 and 24.04 is:
> on x86-64, -c fails, but -s works.
> on ARM64, -c fails, and -s also fails.
>
> Note: the kdump-tools v2.0.18 in Ubuntu 20.04 doesn't have this commit:
> https://git.kernel.org/pub/scm/utils/kernel/kexec/kexec-tools.git/commit/?id=fb5a8792e6e4ee7de7ae3e06d193ea5beaaececc
> (Note the "return 0;" in setup_linux_vesafb())
> so, on x86-64, -c also works in Ubuntu 20.04, if hyperv_fb is used
> (-c still doesn't work if hyperv_drm is used due to da6c7707caf3).

Ah. That explains why I thought x86/x64 kdump was working with
hyperv_fb when working on commit 304386373007. I was testing with
kexec user space utility v2.0.18, which *does* propagate smem_start
from the ioctl to the loaded kdump image.

>
> With this patch
> "PCI: hv: Allocate MMIO from above 4GB for the config window",
> both -c and -s work on x86-64 and ARM64 due to no MMIO conflict,
> as long as there are no 32-bit PCI BARs (which should be true on
> Azure and on modern hosts.)
>
> With the patch, even if hyperv_drm relocates the framebuffer MMIO
> base, there would still be no MMIO conflict because typically hyperv_drm
> gets its MMIO from below 4GB: it seems like vmbus_walk_resources()
> always finds the low MMIO range first and adds it to the beginning of the
> MMIO resources "hyperv_mmio", so presumably hyperv_drm would
> get MMIO from the low MMIO range.
>
> I'll update the commit message, add Matthew's and Krister's
> Tested-by's and post v2.

See my comments on v2 of your patch. I have a thought for a slightly
different approach to solve the problem.

Michael

>
> Thanks,
> Dexuan

^ permalink raw reply [flat|nested] 16+ messages in thread
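
To make the "no MMIO conflict" argument in the message above concrete, here is a small C sketch of the range reasoning. It is only an illustration and not code from pci_hyperv or the VMBus driver: struct mmio_range, mmio_ranges_overlap() and mmio_range_is_high() are hypothetical names, and any addresses used with them are made-up example values. The only inputs taken from the thread are the 4GB boundary and the 8MB default framebuffer reservation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SZ_4G 0x100000000ULL

/* Hypothetical inclusive MMIO range, loosely modeled on struct resource. */
struct mmio_range {
    uint64_t start;
    uint64_t end;   /* inclusive */
};

/* True if the two inclusive ranges share at least one address. */
static bool mmio_ranges_overlap(struct mmio_range a, struct mmio_range b)
{
    assert(a.start <= a.end && b.start <= b.end);
    return a.start <= b.end && b.start <= a.end;
}

/* True if the whole range lies at or above the 4GB boundary. */
static bool mmio_range_is_high(struct mmio_range r)
{
    return r.start >= SZ_4G;
}
```

For example, an 8MB framebuffer range assumed to sit at 0xf8000000 overlaps a config window allocated at the same low address, but cannot overlap any window for which mmio_range_is_high() is true, since the framebuffer range ends below 4GB.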
* RE: [PATCH] PCI: hv: Allocate MMIO from above 4GB for the config window
2026-04-05 23:11 ` Michael Kelley
@ 2026-04-08 6:15 ` Dexuan Cui
0 siblings, 0 replies; 16+ messages in thread
From: Dexuan Cui @ 2026-04-08 6:15 UTC (permalink / raw)
To: Michael Kelley, KY Srinivasan, Haiyang Zhang, wei.liu@kernel.org,
Long Li, lpieralisi@kernel.org, kwilczynski@kernel.org,
mani@kernel.org, robh@kernel.org, bhelgaas@google.com, Jake Oshins,
linux-hyperv@vger.kernel.org, linux-pci@vger.kernel.org,
linux-kernel@vger.kernel.org
Cc: stable@vger.kernel.org, Matthew Ruffell, Krister Johansen

> From: Michael Kelley <mhklinux@outlook.com>
> Sent: Sunday, April 5, 2026 4:11 PM
> > ...
> > Unluckily, setup_linux_vesafb() only recognizes the vesafb
> > driver in Linux kernel ("VESA VGA") and the efifb driver ("EFI VGA").
> > It looks like normally arch_options.reuse_video_type is always 0.
> >
> > This means the kdump kernel's screen_info.lfb_base is 0, if
> > hyperv_fb or hyperv_drm loads. In the past, for a Ubuntu kernel
> > with CONFIG_FB_EFI=y, our workaround is blacklisting
> > hyperv_fb or hyperv_drm, so /dev/fb0 is backed by efifb, and
> > the screen_info.lfb_base is correctly set for kdump.
>
> Hmmm. This is worse than I thought for x86/x64. In fact, it means
> a part of my commit message for 304386373007 is now wrong. I had
> described everything as working when using the kexec_load() system
> call because the FBIOGET_FSCREENINFO ioctl was returning a good
> value for smem_start (at least with the hyperv_fb driver). But as you
> point out further down, newer versions of the kexec user space program
> are ignoring that smem_start value unless the driver is vesafb or efifb.
>
> Was blacklisting hyperv_fb or hyperv_drm in the kdump kernel
> a workaround we had promulgated in the past? My recollection
> is vague. But no matter.

Blacklisting hyperv_fb or hyperv_drm in the *first* kernel was an
internal workaround, which no longer works since CONFIG_FB_EFI is
not set in the linux-azure kernels.

Thanks,
Dexuan

^ permalink raw reply [flat|nested] 16+ messages in thread
end of thread, other threads:[~2026-04-08 6:37 UTC | newest]

Thread overview: 16+ messages
2026-01-22 2:03 [PATCH] PCI: hv: Allocate MMIO from above 4GB for the config window Dexuan Cui
2026-01-22 7:10 ` Michael Kelley
2026-01-22 19:14 ` Long Li
2026-01-22 20:22 ` Michael Kelley
2026-01-23 5:39 ` Matthew Ruffell
2026-01-23 6:39 ` Michael Kelley
2026-01-23 18:28 ` Michael Kelley
2026-01-23 20:21 ` Dexuan Cui
2026-04-02 19:23 ` Dexuan Cui
2026-04-05 23:13 ` Michael Kelley
2026-04-08 6:37 ` Dexuan Cui
2026-02-07 1:42 ` Krister Johansen
2026-04-02 18:49 ` Dexuan Cui
2026-04-02 17:09 ` Dexuan Cui
2026-04-05 23:11 ` Michael Kelley
2026-04-08 6:15 ` Dexuan Cui