* kernel BUG around vmap/vfree - xen_enter_lazy_mmu()/xen_leave_lazy_mmu() - Linux 7.0-rc1
From: Marek Marczykowski-Górecki @ 2026-02-26 13:17 UTC
To: xen-devel; +Cc: Juergen Gross, Boris Ostrovsky
Hi,
When testing Linux 7.0-rc1 in PV dom0, I sometimes hit the following
panic:
[ 436.849614] ------------[ cut here ]------------
[ 436.849669] kernel BUG at arch/x86/include/asm/xen/hypervisor.h:78!
[ 436.849693] Oops: invalid opcode: 0000 [#1] SMP NOPTI
[ 436.849710] CPU: 3 UID: 0 PID: 4021 Comm: kworker/u25:1 Not tainted 7.0.0-0.rc1.1.qubes.1001.fc41.x86_64 #1 PREEMPT(full)
[ 436.849729] Hardware name: Star Labs StarBook/StarBook, BIOS 8.97 10/03/2023
[ 436.849743] Workqueue: i915_flip intel_atomic_commit_work [i915]
[ 436.850226] RIP: e030:xen_enter_lazy_mmu+0x24/0x30
[ 436.850245] Code: 90 90 90 90 90 90 f3 0f 1e fa 0f 1f 44 00 00 65 8b 05 b8 e5 02 03 85 c0 75 10 65 c7 05 a9 e5 02 03 01 00 00 00 c3 cc cc cc cc <0f> 0b 66 2e 0f 1f 84 00 00 00 00 00 90 90 90 90 90 90 90 90 90 90
[ 436.850270] RSP: e02b:ffffc90045727a68 EFLAGS: 00010202
[ 436.850283] RAX: 0000000000000001 RBX: ffff8881042fa6d0 RCX: 000fffffffe00000
[ 436.850296] RDX: 0000000000000001 RSI: ffff88810a5a2980 RDI: 0000000000000000
[ 436.850308] RBP: ffffc90049eda000 R08: ffffc90049edc000 R09: ffffc90049edc000
[ 436.850320] R10: ffffc90049edc000 R11: ffffc90049edbfff R12: ffffc90049edc000
[ 436.850332] R13: ffffc90045727bb0 R14: ffffc90045727b28 R15: 800000000000006b
[ 436.850356] FS: 0000000000000000(0000) GS:ffff888201e6e000(0000) knlGS:0000000000000000
[ 436.850371] CS: e030 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 436.850383] CR2: 00006543dbade250 CR3: 0000000115ef1000 CR4: 0000000000050660
[ 436.850401] Call Trace:
[ 436.850410] <TASK>
[ 436.850420] vmap_pages_pud_range+0x47c/0x530
[ 436.850439] vmap_small_pages_range_noflush+0x1f1/0x2b0
[ 436.850451] ? __get_vm_area_node+0x10a/0x170
[ 436.850465] vmap+0x79/0xd0
[ 436.850476] i915_gem_object_map_page+0x13b/0x210 [i915]
[ 436.850812] i915_gem_object_pin_map+0x1e2/0x210 [i915]
[ 436.851123] i915_gem_object_pin_map_unlocked+0x2d/0xa0 [i915]
[ 436.851424] intel_dsb_buffer_create+0xed/0x1a0 [i915]
[ 436.851778] intel_dsb_prepare+0xca/0x1a0 [i915]
[ 436.852110] intel_atomic_dsb_finish+0x92/0x350 [i915]
[ 436.852456] intel_atomic_commit_tail+0x326/0xd40 [i915]
[ 436.852769] process_one_work+0x18d/0x380
[ 436.852779] worker_thread+0x196/0x300
[ 436.852787] ? __pfx_worker_thread+0x10/0x10
[ 436.852796] kthread+0xe3/0x120
[ 436.852805] ? __pfx_kthread+0x10/0x10
[ 436.852815] ret_from_fork+0x19e/0x260
[ 436.852824] ? __pfx_kthread+0x10/0x10
[ 436.852832] ret_from_fork_asm+0x1a/0x30
[ 436.852842] </TASK>
[ 436.852847] Modules linked in: snd_seq_dummy snd_hrtimer snd_hda_codec_intelhdmi snd_hda_codec_hdmi snd_hda_codec_alc269 snd_hda_codec_realtek_lib snd_hda_scodec_component snd_hda_codec_generic snd_hda_intel snd_sof_pci_intel_tgl snd_sof_pci_intel_cnl snd_sof_intel_hda_generic soundwire_intel snd_sof_intel_hda_sdw_bpt snd_sof_intel_hda_common snd_soc_hdac_hda snd_sof_intel_hda_mlink snd_sof_intel_hda soundwire_cadence snd_sof_pci snd_sof_xtensa_dsp snd_sof snd_sof_utils snd_soc_acpi_intel_match snd_soc_acpi_intel_sdca_quirks soundwire_generic_allocation snd_soc_sdw_utils snd_soc_acpi crc8 intel_rapl_msr soundwire_bus intel_rapl_common snd_soc_sdca snd_soc_avs snd_soc_hda_codec snd_hda_ext_core snd_hda_codec vfat intel_uncore_frequency_common fat snd_hda_core snd_intel_dspcfg snd_intel_sdw_acpi snd_hwdep intel_powerclamp snd_soc_core iwlwifi snd_compress spi_nor iTCO_wdt ac97_bus intel_pmc_bxt ee1004 mtd snd_pcm_dmaengine snd_seq cfg80211 snd_seq_device pcspkr spi_intel_pci snd_pcm rfkill spi_intel snd_timer snd
[ 436.852939] i2c_i801 soundcore i2c_smbus idma64 intel_pmc_core pmt_telemetry pmt_discovery pmt_class intel_hid intel_pmc_ssram_telemetry intel_scu_pltdrv sparse_keymap joydev loop fuse xenfs nfnetlink vsock_loopback vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport vsock zram vmw_vmci lz4hc_compress lz4_compress dm_thin_pool dm_persistent_data dm_bio_prison dm_crypt xe drm_ttm_helper drm_suballoc_helper gpu_sched drm_gpuvm drm_exec drm_gpusvm_helper i915 i2c_algo_bit drm_buddy hid_multitouch i2c_hid_acpi ghash_clmulni_intel video nvme wmi ttm i2c_hid nvme_core nvme_keyring drm_display_helper nvme_auth xhci_pci pinctrl_tigerlake thunderbolt hkdf cec xhci_hcd intel_vsec serio_raw xen_acpi_processor xen_privcmd xen_pciback xen_blkback xen_gntalloc xen_gntdev xen_evtchn scsi_dh_rdac scsi_dh_emc scsi_dh_alua uinput i2c_dev
[ 436.853183] ---[ end trace 0000000000000000 ]---
or this:
[ 548.736884] ------------[ cut here ]------------
[ 548.736907] kernel BUG at arch/x86/include/asm/xen/hypervisor.h:85!
[ 548.736923] Oops: invalid opcode: 0000 [#1] SMP NOPTI
[ 548.736935] CPU: 0 UID: 0 PID: 206 Comm: kworker/0:2 Not tainted 7.0.0-0.rc1.1.qubes.1001.fc41.x86_64 #1 PREEMPT(full)
[ 548.736949] Hardware name: LENOVO 2347A45/2347A45, BIOS CBET4000 Nitrokey-v0.2.0-2608-ga649597 01/01/1970
[ 548.736962] Workqueue: events delayed_vfree_work
[ 548.736976] RIP: e030:xen_leave_lazy_mmu+0x44/0x50
[ 548.736989] Code: 02 03 83 f8 01 75 23 65 c7 05 6c e4 02 03 00 00 00 00 65 ff 0d 7d b8 02 03 74 05 c3 cc cc cc cc e8 61 5d fd ff c3 cc cc cc cc <0f> 0b 66 2e 0f 1f 84 00 00 00 00 00 90 90 90 90 90 90 90 90 90 90
[ 548.737010] RSP: e02b:ffffc90040607cf0 EFLAGS: 00010297
[ 548.737018] RAX: 0000000000000000 RBX: ffff888164a70408 RCX: 0000000000000000
[ 548.737029] RDX: 0000000000000000 RSI: 000ffffffffff000 RDI: ffff8881069c0000
[ 548.737039] RBP: ffffc90049681000 R08: ffffc90049681000 R09: 0000000000000027
[ 548.737050] R10: 0000000000000027 R11: fefefefefefefeff R12: ffffc90049681000
[ 548.737060] R13: ffff8881002fd258 R14: 0000000000000000 R15: ffffc90040607dac
[ 548.737079] FS: 0000000000000000(0000) GS:ffff8881f88ee000(0000) knlGS:0000000000000000
[ 548.737090] CS: e030 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 548.737099] CR2: 000055576c2e6058 CR3: 000000010d47b000 CR4: 0000000000050660
[ 548.737115] Call Trace:
[ 548.737123] <TASK>
[ 548.737128] vunmap_pmd_range.isra.0+0x1f1/0x2e0
[ 548.737142] vunmap_p4d_range+0x17d/0x290
[ 548.737151] __vunmap_range_noflush+0x182/0x1d0
[ 548.737161] ? _raw_spin_unlock+0xe/0x30
[ 548.737171] remove_vm_area+0x40/0x70
[ 548.737180] vfree.part.0+0x1b/0x290
[ 548.737189] delayed_vfree_work+0x35/0x50
[ 548.737198] process_one_work+0x18d/0x380
[ 548.737207] worker_thread+0x196/0x300
[ 548.737215] ? __pfx_worker_thread+0x10/0x10
[ 548.737224] kthread+0xe3/0x120
[ 548.737233] ? __pfx_kthread+0x10/0x10
[ 548.737242] ret_from_fork+0x19e/0x260
[ 548.737250] ? __pfx_kthread+0x10/0x10
[ 548.737258] ret_from_fork_asm+0x1a/0x30
[ 548.737269] </TASK>
[ 548.737274] Modules linked in: vfat fat snd_seq_dummy snd_hrtimer ath9k ath9k_common snd_hda_codec_intelhdmi snd_hda_codec_hdmi ath9k_hw snd_hda_codec_alc269 snd_hda_codec_realtek_lib snd_hda_scodec_component snd_hda_codec_generic snd_hda_intel snd_hda_codec mac80211 snd_hda_core snd_intel_dspcfg snd_intel_sdw_acpi snd_hwdep ath snd_seq snd_seq_device snd_ctl_led cfg80211 snd_pcm at24 thinkpad_acpi intel_rapl_msr i2c_i801 snd_timer sparse_keymap iTCO_wdt intel_rapl_common platform_profile intel_powerclamp intel_pmc_bxt pcspkr i2c_smbus rfkill libarc4 snd soundcore mei_me e1000e mei joydev lpc_ich loop fuse xenfs nfnetlink vsock_loopback vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport vsock zram vmw_vmci lz4hc_compress lz4_compress dm_thin_pool dm_persistent_data dm_bio_prison dm_crypt i915 i2c_algo_bit drm_buddy ghash_clmulni_intel ttm sdhci_pci drm_display_helper sdhci_uhs2 sdhci video xhci_pci cqhci wmi cec xhci_hcd ehci_pci mmc_core ehci_hcd serio_raw xen_acpi_processor xen_privcmd xen_pciback
[ 548.737348] xen_blkback xen_gntalloc xen_gntdev xen_evtchn scsi_dh_rdac scsi_dh_emc scsi_dh_alua uinput i2c_dev
[ 548.737469] ---[ end trace 0000000000000000 ]---
I don't have a clear pattern for when this happens: one was during host
suspend, but the other was during a "normal" test run (starting/stopping
domUs and running stuff around them). Note also that one of those is Intel
and the other AMD, so it isn't really hardware specific.
Slightly more details with links (especially serial0.txt in the logs
tab) at
https://github.com/QubesOS/qubes-linux-kernel/pull/662#issuecomment-3963326188
Any idea?
--
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
* Re: kernel BUG around vmap/vfree - xen_enter_lazy_mmu()/xen_leave_lazy_mmu() - Linux 7.0-rc1
From: Andrew Cooper @ 2026-02-26 13:27 UTC
To: Marek Marczykowski-Górecki, xen-devel
Cc: Andrew Cooper, Juergen Gross, Boris Ostrovsky
On 26/02/2026 1:17 pm, Marek Marczykowski-Górecki wrote:
> Hi,
>
> When testing Linux 7.0-rc1 in PV dom0, I hit the following panic
> sometimes:
>
> [...]
>
> I don't have clear pattern when this happens, one was during host
> suspend, but the other was during "normal" test run (starting/stopping
> domUs and running stuff around them). Note also one of those is Intel
> and the other AMD, so it isn't really hardware specific.
>
> Slightly more details with links (especially serial0.txt in the logs
> tab) at
> https://github.com/QubesOS/qubes-linux-kernel/pull/662#issuecomment-3963326188
>
> Any idea?
>
That looks like the issue Juergen fixed with:
https://lore.kernel.org/xen-devel/20260220123715.834848-1-jgross@suse.com/
~Andrew
* Re: kernel BUG around vmap/vfree - xen_enter_lazy_mmu()/xen_leave_lazy_mmu() - Linux 7.0-rc1
From: Marek Marczykowski-Górecki @ 2026-02-26 13:36 UTC
To: Andrew Cooper; +Cc: xen-devel, Juergen Gross, Boris Ostrovsky
On Thu, Feb 26, 2026 at 01:27:17PM +0000, Andrew Cooper wrote:
> On 26/02/2026 1:17 pm, Marek Marczykowski-Górecki wrote:
> > Hi,
> >
> > When testing Linux 7.0-rc1 in PV dom0, I hit the following panic
> > sometimes:
> >
> > [...]
> >
> > I don't have clear pattern when this happens, one was during host
> > suspend, but the other was during "normal" test run (starting/stopping
> > domUs and running stuff around them). Note also one of those is Intel
> > and the other AMD, so it isn't really hardware specific.
> >
> > Slightly more details with links (especially serial0.txt in the logs
> > tab) at
> > https://github.com/QubesOS/qubes-linux-kernel/pull/662#issuecomment-3963326188
> >
> > Any idea?
> >
>
> That looks like the issue Juergen fixed with:
>
> https://lore.kernel.org/xen-devel/20260220123715.834848-1-jgross@suse.com/
The commit message says it's about booting a PV guest; here the crash
happens significantly later. But I'll test with the patch included anyway.
--
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
* Re: kernel BUG around vmap/vfree - xen_enter_lazy_mmu()/xen_leave_lazy_mmu() - Linux 7.0-rc1
From: Jürgen Groß @ 2026-02-26 13:41 UTC
To: Andrew Cooper, Marek Marczykowski-Górecki, xen-devel; +Cc: Boris Ostrovsky
On 26.02.26 14:27, Andrew Cooper wrote:
> On 26/02/2026 1:17 pm, Marek Marczykowski-Górecki wrote:
>> Hi,
>>
>> When testing Linux 7.0-rc1 in PV dom0, I hit the following panic
>> sometimes:
>>
>> [...]
>>
>> or this:
>>
>> [ 548.736884] ------------[ cut here ]------------
>> [ 548.736907] kernel BUG at arch/x86/include/asm/xen/hypervisor.h:85!
>> [ 548.736923] Oops: invalid opcode: 0000 [#1] SMP NOPTI
>> [ 548.736935] CPU: 0 UID: 0 PID: 206 Comm: kworker/0:2 Not tainted 7.0.0-0.rc1.1.qubes.1001.fc41.x86_64 #1 PREEMPT(full)
>> [ 548.736949] Hardware name: LENOVO 2347A45/2347A45, BIOS CBET4000 Nitrokey-v0.2.0-2608-ga649597 01/01/1970
>> [ 548.736962] Workqueue: events delayed_vfree_work
>> [ 548.736976] RIP: e030:xen_leave_lazy_mmu+0x44/0x50
>> [ 548.736989] Code: 02 03 83 f8 01 75 23 65 c7 05 6c e4 02 03 00 00 00 00 65 ff 0d 7d b8 02 03 74 05 c3 cc cc cc cc e8 61 5d fd ff c3 cc cc cc cc <0f> 0b 66 2e 0f 1f 84 00 00 00 00 00 90 90 90 90 90 90 90 90 90 90
>> [ 548.737010] RSP: e02b:ffffc90040607cf0 EFLAGS: 00010297
>> [ 548.737018] RAX: 0000000000000000 RBX: ffff888164a70408 RCX: 0000000000000000
>> [ 548.737029] RDX: 0000000000000000 RSI: 000ffffffffff000 RDI: ffff8881069c0000
>> [ 548.737039] RBP: ffffc90049681000 R08: ffffc90049681000 R09: 0000000000000027
>> [ 548.737050] R10: 0000000000000027 R11: fefefefefefefeff R12: ffffc90049681000
>> [ 548.737060] R13: ffff8881002fd258 R14: 0000000000000000 R15: ffffc90040607dac
>> [ 548.737079] FS: 0000000000000000(0000) GS:ffff8881f88ee000(0000) knlGS:0000000000000000
>> [ 548.737090] CS: e030 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [ 548.737099] CR2: 000055576c2e6058 CR3: 000000010d47b000 CR4: 0000000000050660
>> [ 548.737115] Call Trace:
>> [ 548.737123] <TASK>
>> [ 548.737128] vunmap_pmd_range.isra.0+0x1f1/0x2e0
>> [ 548.737142] vunmap_p4d_range+0x17d/0x290
>> [ 548.737151] __vunmap_range_noflush+0x182/0x1d0
>> [ 548.737161] ? _raw_spin_unlock+0xe/0x30
>> [ 548.737171] remove_vm_area+0x40/0x70
>> [ 548.737180] vfree.part.0+0x1b/0x290
>> [ 548.737189] delayed_vfree_work+0x35/0x50
>> [ 548.737198] process_one_work+0x18d/0x380
>> [ 548.737207] worker_thread+0x196/0x300
>> [ 548.737215] ? __pfx_worker_thread+0x10/0x10
>> [ 548.737224] kthread+0xe3/0x120
>> [ 548.737233] ? __pfx_kthread+0x10/0x10
>> [ 548.737242] ret_from_fork+0x19e/0x260
>> [ 548.737250] ? __pfx_kthread+0x10/0x10
>> [ 548.737258] ret_from_fork_asm+0x1a/0x30
>> [ 548.737269] </TASK>
>> [ 548.737274] Modules linked in: vfat fat snd_seq_dummy snd_hrtimer ath9k ath9k_common snd_hda_codec_intelhdmi snd_hda_codec_hdmi ath9k_hw snd_hda_codec_alc269 snd_hda_codec_realtek_lib snd_hda_scodec_component snd_hda_codec_generic snd_hda_intel snd_hda_codec mac80211 snd_hda_core snd_intel_dspcfg snd_intel_sdw_acpi snd_hwdep ath snd_seq snd_seq_device snd_ctl_led cfg80211 snd_pcm at24 thinkpad_acpi intel_rapl_msr i2c_i801 snd_timer sparse_keymap iTCO_wdt intel_rapl_common platform_profile intel_powerclamp intel_pmc_bxt pcspkr i2c_smbus rfkill libarc4 snd soundcore mei_me e1000e mei joydev lpc_ich loop fuse xenfs nfnetlink vsock_loopback vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport vsock zram vmw_vmci lz4hc_compress lz4_compress dm_thin_pool dm_persistent_data dm_bio_prison dm_crypt i915 i2c_algo_bit drm_buddy ghash_clmulni_intel ttm sdhci_pci drm_display_helper sdhci_uhs2 sdhci video xhci_pci cqhci wmi cec xhci_hcd ehci_pci mmc_core ehci_hcd serio_raw xen_acpi_processor xen_privcmd xen_pciback
>> [ 548.737348] xen_blkback xen_gntalloc xen_gntdev xen_evtchn scsi_dh_rdac scsi_dh_emc scsi_dh_alua uinput i2c_dev
>> [ 548.737469] ---[ end trace 0000000000000000 ]---
>>
>> I don't have clear pattern when this happens, one was during host
>> suspend, but the other was during "normal" test run (starting/stopping
>> domUs and running stuff around them). Note also one of those is Intel
>> and the other AMD, so it isn't really hardware specific.
>>
>> Slightly more details with links (especially serial0.txt in the logs
>> tab) at
>> https://github.com/QubesOS/qubes-linux-kernel/pull/662#issuecomment-3963326188
>>
>> Any idea?
>>
>
> That looks like the issue Juergen fixed with:
>
> https://lore.kernel.org/xen-devel/20260220123715.834848-1-jgross@suse.com/
No, it doesn't. The fix is already in rc1, and the crash was quite early during
boot (before any secondary CPUs were brought up).

I guess this problem is related to the lazy_mmu_state series [1].

Juergen

[1]: https://lore.kernel.org/lkml/20251215150323.2218608-1-kevin.brodsky@arm.com/
* Re: kernel BUG around vmap/vfree - xen_enter_lazy_mmu()/xen_leave_lazy_mmu() - Linux 7.0-rc1
2026-02-26 13:41 ` Jürgen Groß
@ 2026-04-05 9:41 ` Marek Marczykowski-Górecki
2026-04-07 9:23 ` Kevin Brodsky
0 siblings, 1 reply; 14+ messages in thread
From: Marek Marczykowski-Górecki @ 2026-04-05 9:41 UTC (permalink / raw)
To: Jürgen Groß, Kevin Brodsky
Cc: Andrew Cooper, xen-devel, Boris Ostrovsky
On Thu, Feb 26, 2026 at 02:41:12PM +0100, Jürgen Groß wrote:
> On 26.02.26 14:27, Andrew Cooper wrote:
> > On 26/02/2026 1:17 pm, Marek Marczykowski-Górecki wrote:
> > > Hi,
> > >
> > > When testing Linux 7.0-rc1 in PV dom0, I hit the following panic
> > > sometimes:
> > >
> > > [ 436.849614] ------------[ cut here ]------------
> > > [ 436.849669] kernel BUG at arch/x86/include/asm/xen/hypervisor.h:78!
> > > [ 436.849693] Oops: invalid opcode: 0000 [#1] SMP NOPTI
> > > [ 436.849710] CPU: 3 UID: 0 PID: 4021 Comm: kworker/u25:1 Not tainted 7.0.0-0.rc1.1.qubes.1001.fc41.x86_64 #1 PREEMPT(full)
> > > [ 436.849729] Hardware name: Star Labs StarBook/StarBook, BIOS 8.97 10/03/2023
> > > [ 436.849743] Workqueue: i915_flip intel_atomic_commit_work [i915]
> > > [ 436.850226] RIP: e030:xen_enter_lazy_mmu+0x24/0x30
> > > [ 436.850245] Code: 90 90 90 90 90 90 f3 0f 1e fa 0f 1f 44 00 00 65 8b 05 b8 e5 02 03 85 c0 75 10 65 c7 05 a9 e5 02 03 01 00 00 00 c3 cc cc cc cc <0f> 0b 66 2e 0f 1f 84 00 00 00 00 00 90 90 90 90 90 90 90 90 90 90
> > > [ 436.850270] RSP: e02b:ffffc90045727a68 EFLAGS: 00010202
> > > [ 436.850283] RAX: 0000000000000001 RBX: ffff8881042fa6d0 RCX: 000fffffffe00000
> > > [ 436.850296] RDX: 0000000000000001 RSI: ffff88810a5a2980 RDI: 0000000000000000
> > > [ 436.850308] RBP: ffffc90049eda000 R08: ffffc90049edc000 R09: ffffc90049edc000
> > > [ 436.850320] R10: ffffc90049edc000 R11: ffffc90049edbfff R12: ffffc90049edc000
> > > [ 436.850332] R13: ffffc90045727bb0 R14: ffffc90045727b28 R15: 800000000000006b
> > > [ 436.850356] FS: 0000000000000000(0000) GS:ffff888201e6e000(0000) knlGS:0000000000000000
> > > [ 436.850371] CS: e030 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > [ 436.850383] CR2: 00006543dbade250 CR3: 0000000115ef1000 CR4: 0000000000050660
> > > [ 436.850401] Call Trace:
> > > [ 436.850410] <TASK>
> > > [ 436.850420] vmap_pages_pud_range+0x47c/0x530
> > > [ 436.850439] vmap_small_pages_range_noflush+0x1f1/0x2b0
> > > [ 436.850451] ? __get_vm_area_node+0x10a/0x170
> > > [ 436.850465] vmap+0x79/0xd0
> > > [ 436.850476] i915_gem_object_map_page+0x13b/0x210 [i915]
> > > [ 436.850812] i915_gem_object_pin_map+0x1e2/0x210 [i915]
> > > [ 436.851123] i915_gem_object_pin_map_unlocked+0x2d/0xa0 [i915]
> > > [ 436.851424] intel_dsb_buffer_create+0xed/0x1a0 [i915]
> > > [ 436.851778] intel_dsb_prepare+0xca/0x1a0 [i915]
> > > [ 436.852110] intel_atomic_dsb_finish+0x92/0x350 [i915]
> > > [ 436.852456] intel_atomic_commit_tail+0x326/0xd40 [i915]
> > > [ 436.852769] process_one_work+0x18d/0x380
> > > [ 436.852779] worker_thread+0x196/0x300
> > > [ 436.852787] ? __pfx_worker_thread+0x10/0x10
> > > [ 436.852796] kthread+0xe3/0x120
> > > [ 436.852805] ? __pfx_kthread+0x10/0x10
> > > [ 436.852815] ret_from_fork+0x19e/0x260
> > > [ 436.852824] ? __pfx_kthread+0x10/0x10
> > > [ 436.852832] ret_from_fork_asm+0x1a/0x30
> > > [ 436.852842] </TASK>
> > > [ 436.852847] Modules linked in: snd_seq_dummy snd_hrtimer snd_hda_codec_intelhdmi snd_hda_codec_hdmi snd_hda_codec_alc269 snd_hda_codec_realtek_lib snd_hda_scodec_component snd_hda_codec_generic snd_hda_intel snd_sof_pci_intel_tgl snd_sof_pci_intel_cnl snd_sof_intel_hda_generic soundwire_intel snd_sof_intel_hda_sdw_bpt snd_sof_intel_hda_common snd_soc_hdac_hda snd_sof_intel_hda_mlink snd_sof_intel_hda soundwire_cadence snd_sof_pci snd_sof_xtensa_dsp snd_sof snd_sof_utils snd_soc_acpi_intel_match snd_soc_acpi_intel_sdca_quirks soundwire_generic_allocation snd_soc_sdw_utils snd_soc_acpi crc8 intel_rapl_msr soundwire_bus intel_rapl_common snd_soc_sdca snd_soc_avs snd_soc_hda_codec snd_hda_ext_core snd_hda_codec vfat intel_uncore_frequency_common fat snd_hda_core snd_intel_dspcfg snd_intel_sdw_acpi snd_hwdep intel_powerclamp snd_soc_core iwlwifi snd_compress spi_nor iTCO_wdt ac97_bus intel_pmc_bxt ee1004 mtd snd_pcm_dmaengine snd_seq cfg80211 snd_seq_device pcspkr spi_intel_pci snd_pcm rfkill spi_intel snd_timer snd
> > > [ 436.852939] i2c_i801 soundcore i2c_smbus idma64 intel_pmc_core pmt_telemetry pmt_discovery pmt_class intel_hid intel_pmc_ssram_telemetry intel_scu_pltdrv sparse_keymap joydev loop fuse xenfs nfnetlink vsock_loopback vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport vsock zram vmw_vmci lz4hc_compress lz4_compress dm_thin_pool dm_persistent_data dm_bio_prison dm_crypt xe drm_ttm_helper drm_suballoc_helper gpu_sched drm_gpuvm drm_exec drm_gpusvm_helper i915 i2c_algo_bit drm_buddy hid_multitouch i2c_hid_acpi ghash_clmulni_intel video nvme wmi ttm i2c_hid nvme_core nvme_keyring drm_display_helper nvme_auth xhci_pci pinctrl_tigerlake thunderbolt hkdf cec xhci_hcd intel_vsec serio_raw xen_acpi_processor xen_privcmd xen_pciback xen_blkback xen_gntalloc xen_gntdev xen_evtchn scsi_dh_rdac scsi_dh_emc scsi_dh_alua uinput i2c_dev
> > > [ 436.853183] ---[ end trace 0000000000000000 ]---
> > >
> > > or this:
> > >
> > > [ 548.736884] ------------[ cut here ]------------
> > > [ 548.736907] kernel BUG at arch/x86/include/asm/xen/hypervisor.h:85!
> > > [ 548.736923] Oops: invalid opcode: 0000 [#1] SMP NOPTI
> > > [ 548.736935] CPU: 0 UID: 0 PID: 206 Comm: kworker/0:2 Not tainted 7.0.0-0.rc1.1.qubes.1001.fc41.x86_64 #1 PREEMPT(full)
> > > [ 548.736949] Hardware name: LENOVO 2347A45/2347A45, BIOS CBET4000 Nitrokey-v0.2.0-2608-ga649597 01/01/1970
> > > [ 548.736962] Workqueue: events delayed_vfree_work
> > > [ 548.736976] RIP: e030:xen_leave_lazy_mmu+0x44/0x50
> > > [ 548.736989] Code: 02 03 83 f8 01 75 23 65 c7 05 6c e4 02 03 00 00 00 00 65 ff 0d 7d b8 02 03 74 05 c3 cc cc cc cc e8 61 5d fd ff c3 cc cc cc cc <0f> 0b 66 2e 0f 1f 84 00 00 00 00 00 90 90 90 90 90 90 90 90 90 90
> > > [ 548.737010] RSP: e02b:ffffc90040607cf0 EFLAGS: 00010297
> > > [ 548.737018] RAX: 0000000000000000 RBX: ffff888164a70408 RCX: 0000000000000000
> > > [ 548.737029] RDX: 0000000000000000 RSI: 000ffffffffff000 RDI: ffff8881069c0000
> > > [ 548.737039] RBP: ffffc90049681000 R08: ffffc90049681000 R09: 0000000000000027
> > > [ 548.737050] R10: 0000000000000027 R11: fefefefefefefeff R12: ffffc90049681000
> > > [ 548.737060] R13: ffff8881002fd258 R14: 0000000000000000 R15: ffffc90040607dac
> > > [ 548.737079] FS: 0000000000000000(0000) GS:ffff8881f88ee000(0000) knlGS:0000000000000000
> > > [ 548.737090] CS: e030 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > [ 548.737099] CR2: 000055576c2e6058 CR3: 000000010d47b000 CR4: 0000000000050660
> > > [ 548.737115] Call Trace:
> > > [ 548.737123] <TASK>
> > > [ 548.737128] vunmap_pmd_range.isra.0+0x1f1/0x2e0
> > > [ 548.737142] vunmap_p4d_range+0x17d/0x290
> > > [ 548.737151] __vunmap_range_noflush+0x182/0x1d0
> > > [ 548.737161] ? _raw_spin_unlock+0xe/0x30
> > > [ 548.737171] remove_vm_area+0x40/0x70
> > > [ 548.737180] vfree.part.0+0x1b/0x290
> > > [ 548.737189] delayed_vfree_work+0x35/0x50
> > > [ 548.737198] process_one_work+0x18d/0x380
> > > [ 548.737207] worker_thread+0x196/0x300
> > > [ 548.737215] ? __pfx_worker_thread+0x10/0x10
> > > [ 548.737224] kthread+0xe3/0x120
> > > [ 548.737233] ? __pfx_kthread+0x10/0x10
> > > [ 548.737242] ret_from_fork+0x19e/0x260
> > > [ 548.737250] ? __pfx_kthread+0x10/0x10
> > > [ 548.737258] ret_from_fork_asm+0x1a/0x30
> > > [ 548.737269] </TASK>
> > > [ 548.737274] Modules linked in: vfat fat snd_seq_dummy snd_hrtimer ath9k ath9k_common snd_hda_codec_intelhdmi snd_hda_codec_hdmi ath9k_hw snd_hda_codec_alc269 snd_hda_codec_realtek_lib snd_hda_scodec_component snd_hda_codec_generic snd_hda_intel snd_hda_codec mac80211 snd_hda_core snd_intel_dspcfg snd_intel_sdw_acpi snd_hwdep ath snd_seq snd_seq_device snd_ctl_led cfg80211 snd_pcm at24 thinkpad_acpi intel_rapl_msr i2c_i801 snd_timer sparse_keymap iTCO_wdt intel_rapl_common platform_profile intel_powerclamp intel_pmc_bxt pcspkr i2c_smbus rfkill libarc4 snd soundcore mei_me e1000e mei joydev lpc_ich loop fuse xenfs nfnetlink vsock_loopback vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport vsock zram vmw_vmci lz4hc_compress lz4_compress dm_thin_pool dm_persistent_data dm_bio_prison dm_crypt i915 i2c_algo_bit drm_buddy ghash_clmulni_intel ttm sdhci_pci drm_display_helper sdhci_uhs2 sdhci video xhci_pci cqhci wmi cec xhci_hcd ehci_pci mmc_core ehci_hcd serio_raw xen_acpi_processor xen_privcmd xen_pciback
> > > [ 548.737348] xen_blkback xen_gntalloc xen_gntdev xen_evtchn scsi_dh_rdac scsi_dh_emc scsi_dh_alua uinput i2c_dev
> > > [ 548.737469] ---[ end trace 0000000000000000 ]---
> > >
> > > I don't have clear pattern when this happens, one was during host
> > > suspend, but the other was during "normal" test run (starting/stopping
> > > domUs and running stuff around them). Note also one of those is Intel
> > > and the other AMD, so it isn't really hardware specific.
> > >
> > > Slightly more details with links (especially serial0.txt in the logs
> > > tab) at
> > > https://github.com/QubesOS/qubes-linux-kernel/pull/662#issuecomment-3963326188
> > >
> > > Any idea?
> > >
> >
> > That looks like the issue Juergen fixed with:
> >
> > https://lore.kernel.org/xen-devel/20260220123715.834848-1-jgross@suse.com/
>
> No, it doesn't. The fix is already in rc1, and the crash was quite early during
> boot (before any secondary CPUs were brought up).
>
> I guess this problem is related to the lazy_mmu_state series [1].
FWIW, the issue still happens on 7.0-rc6.
> Juergen
>
> [1]: https://lore.kernel.org/lkml/20251215150323.2218608-1-kevin.brodsky@arm.com/
--
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
* Re: kernel BUG around vmap/vfree - xen_enter_lazy_mmu()/xen_leave_lazy_mmu() - Linux 7.0-rc1
2026-04-05 9:41 ` Marek Marczykowski-Górecki
@ 2026-04-07 9:23 ` Kevin Brodsky
2026-04-08 2:47 ` Marek Marczykowski-Górecki
2026-05-07 16:31 ` Jürgen Groß
0 siblings, 2 replies; 14+ messages in thread
From: Kevin Brodsky @ 2026-04-07 9:23 UTC (permalink / raw)
To: Marek Marczykowski-Górecki, Jürgen Groß
Cc: Andrew Cooper, xen-devel, Boris Ostrovsky
On 05/04/2026 11:41, Marek Marczykowski-Górecki wrote:
> On Thu, Feb 26, 2026 at 02:41:12PM +0100, Jürgen Groß wrote:
>> On 26.02.26 14:27, Andrew Cooper wrote:
>>> On 26/02/2026 1:17 pm, Marek Marczykowski-Górecki wrote:
>>>> Hi,
>>>>
>>>> When testing Linux 7.0-rc1 in PV dom0, I hit the following panic
>>>> sometimes:
>>>>
>>>> [ 436.849614] ------------[ cut here ]------------
>>>> [ 436.849669] kernel BUG at arch/x86/include/asm/xen/hypervisor.h:78!
>>>> [ 436.849693] Oops: invalid opcode: 0000 [#1] SMP NOPTI
>>>> [ 436.849710] CPU: 3 UID: 0 PID: 4021 Comm: kworker/u25:1 Not tainted 7.0.0-0.rc1.1.qubes.1001.fc41.x86_64 #1 PREEMPT(full)
>>>> [ 436.849729] Hardware name: Star Labs StarBook/StarBook, BIOS 8.97 10/03/2023
>>>> [ 436.849743] Workqueue: i915_flip intel_atomic_commit_work [i915]
>>>> [ 436.850226] RIP: e030:xen_enter_lazy_mmu+0x24/0x30
>>>> [ 436.850245] Code: 90 90 90 90 90 90 f3 0f 1e fa 0f 1f 44 00 00 65 8b 05 b8 e5 02 03 85 c0 75 10 65 c7 05 a9 e5 02 03 01 00 00 00 c3 cc cc cc cc <0f> 0b 66 2e 0f 1f 84 00 00 00 00 00 90 90 90 90 90 90 90 90 90 90
>>>> [ 436.850270] RSP: e02b:ffffc90045727a68 EFLAGS: 00010202
>>>> [ 436.850283] RAX: 0000000000000001 RBX: ffff8881042fa6d0 RCX: 000fffffffe00000
>>>> [ 436.850296] RDX: 0000000000000001 RSI: ffff88810a5a2980 RDI: 0000000000000000
>>>> [ 436.850308] RBP: ffffc90049eda000 R08: ffffc90049edc000 R09: ffffc90049edc000
>>>> [ 436.850320] R10: ffffc90049edc000 R11: ffffc90049edbfff R12: ffffc90049edc000
>>>> [ 436.850332] R13: ffffc90045727bb0 R14: ffffc90045727b28 R15: 800000000000006b
>>>> [ 436.850356] FS: 0000000000000000(0000) GS:ffff888201e6e000(0000) knlGS:0000000000000000
>>>> [ 436.850371] CS: e030 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>> [ 436.850383] CR2: 00006543dbade250 CR3: 0000000115ef1000 CR4: 0000000000050660
>>>> [ 436.850401] Call Trace:
>>>> [ 436.850410] <TASK>
>>>> [ 436.850420] vmap_pages_pud_range+0x47c/0x530
>>>> [ 436.850439] vmap_small_pages_range_noflush+0x1f1/0x2b0
>>>> [ 436.850451] ? __get_vm_area_node+0x10a/0x170
>>>> [ 436.850465] vmap+0x79/0xd0
>>>> [ 436.850476] i915_gem_object_map_page+0x13b/0x210 [i915]
>>>> [ 436.850812] i915_gem_object_pin_map+0x1e2/0x210 [i915]
>>>> [ 436.851123] i915_gem_object_pin_map_unlocked+0x2d/0xa0 [i915]
>>>> [ 436.851424] intel_dsb_buffer_create+0xed/0x1a0 [i915]
>>>> [ 436.851778] intel_dsb_prepare+0xca/0x1a0 [i915]
>>>> [ 436.852110] intel_atomic_dsb_finish+0x92/0x350 [i915]
>>>> [ 436.852456] intel_atomic_commit_tail+0x326/0xd40 [i915]
>>>> [ 436.852769] process_one_work+0x18d/0x380
>>>> [ 436.852779] worker_thread+0x196/0x300
>>>> [ 436.852787] ? __pfx_worker_thread+0x10/0x10
>>>> [ 436.852796] kthread+0xe3/0x120
>>>> [ 436.852805] ? __pfx_kthread+0x10/0x10
>>>> [ 436.852815] ret_from_fork+0x19e/0x260
>>>> [ 436.852824] ? __pfx_kthread+0x10/0x10
>>>> [ 436.852832] ret_from_fork_asm+0x1a/0x30
>>>> [ 436.852842] </TASK>
>>>> [ 436.852847] Modules linked in: snd_seq_dummy snd_hrtimer snd_hda_codec_intelhdmi snd_hda_codec_hdmi snd_hda_codec_alc269 snd_hda_codec_realtek_lib snd_hda_scodec_component snd_hda_codec_generic snd_hda_intel snd_sof_pci_intel_tgl snd_sof_pci_intel_cnl snd_sof_intel_hda_generic soundwire_intel snd_sof_intel_hda_sdw_bpt snd_sof_intel_hda_common snd_soc_hdac_hda snd_sof_intel_hda_mlink snd_sof_intel_hda soundwire_cadence snd_sof_pci snd_sof_xtensa_dsp snd_sof snd_sof_utils snd_soc_acpi_intel_match snd_soc_acpi_intel_sdca_quirks soundwire_generic_allocation snd_soc_sdw_utils snd_soc_acpi crc8 intel_rapl_msr soundwire_bus intel_rapl_common snd_soc_sdca snd_soc_avs snd_soc_hda_codec snd_hda_ext_core snd_hda_codec vfat intel_uncore_frequency_common fat snd_hda_core snd_intel_dspcfg snd_intel_sdw_acpi snd_hwdep intel_powerclamp snd_soc_core iwlwifi snd_compress spi_nor iTCO_wdt ac97_bus intel_pmc_bxt ee1004 mtd snd_pcm_dmaengine snd_seq cfg80211 snd_seq_device pcspkr spi_intel_pci snd_pcm rfkill spi_intel snd_timer snd
>>>> [ 436.852939] i2c_i801 soundcore i2c_smbus idma64 intel_pmc_core pmt_telemetry pmt_discovery pmt_class intel_hid intel_pmc_ssram_telemetry intel_scu_pltdrv sparse_keymap joydev loop fuse xenfs nfnetlink vsock_loopback vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport vsock zram vmw_vmci lz4hc_compress lz4_compress dm_thin_pool dm_persistent_data dm_bio_prison dm_crypt xe drm_ttm_helper drm_suballoc_helper gpu_sched drm_gpuvm drm_exec drm_gpusvm_helper i915 i2c_algo_bit drm_buddy hid_multitouch i2c_hid_acpi ghash_clmulni_intel video nvme wmi ttm i2c_hid nvme_core nvme_keyring drm_display_helper nvme_auth xhci_pci pinctrl_tigerlake thunderbolt hkdf cec xhci_hcd intel_vsec serio_raw xen_acpi_processor xen_privcmd xen_pciback xen_blkback xen_gntalloc xen_gntdev xen_evtchn scsi_dh_rdac scsi_dh_emc scsi_dh_alua uinput i2c_dev
>>>> [ 436.853183] ---[ end trace 0000000000000000 ]---
>>>>
>>>> or this:
>>>>
>>>> [ 548.736884] ------------[ cut here ]------------
>>>> [ 548.736907] kernel BUG at arch/x86/include/asm/xen/hypervisor.h:85!
>>>> [ 548.736923] Oops: invalid opcode: 0000 [#1] SMP NOPTI
>>>> [ 548.736935] CPU: 0 UID: 0 PID: 206 Comm: kworker/0:2 Not tainted 7.0.0-0.rc1.1.qubes.1001.fc41.x86_64 #1 PREEMPT(full)
>>>> [ 548.736949] Hardware name: LENOVO 2347A45/2347A45, BIOS CBET4000 Nitrokey-v0.2.0-2608-ga649597 01/01/1970
>>>> [ 548.736962] Workqueue: events delayed_vfree_work
>>>> [ 548.736976] RIP: e030:xen_leave_lazy_mmu+0x44/0x50
>>>> [ 548.736989] Code: 02 03 83 f8 01 75 23 65 c7 05 6c e4 02 03 00 00 00 00 65 ff 0d 7d b8 02 03 74 05 c3 cc cc cc cc e8 61 5d fd ff c3 cc cc cc cc <0f> 0b 66 2e 0f 1f 84 00 00 00 00 00 90 90 90 90 90 90 90 90 90 90
>>>> [ 548.737010] RSP: e02b:ffffc90040607cf0 EFLAGS: 00010297
>>>> [ 548.737018] RAX: 0000000000000000 RBX: ffff888164a70408 RCX: 0000000000000000
>>>> [ 548.737029] RDX: 0000000000000000 RSI: 000ffffffffff000 RDI: ffff8881069c0000
>>>> [ 548.737039] RBP: ffffc90049681000 R08: ffffc90049681000 R09: 0000000000000027
>>>> [ 548.737050] R10: 0000000000000027 R11: fefefefefefefeff R12: ffffc90049681000
>>>> [ 548.737060] R13: ffff8881002fd258 R14: 0000000000000000 R15: ffffc90040607dac
>>>> [ 548.737079] FS: 0000000000000000(0000) GS:ffff8881f88ee000(0000) knlGS:0000000000000000
>>>> [ 548.737090] CS: e030 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>> [ 548.737099] CR2: 000055576c2e6058 CR3: 000000010d47b000 CR4: 0000000000050660
>>>> [ 548.737115] Call Trace:
>>>> [ 548.737123] <TASK>
>>>> [ 548.737128] vunmap_pmd_range.isra.0+0x1f1/0x2e0
>>>> [ 548.737142] vunmap_p4d_range+0x17d/0x290
>>>> [ 548.737151] __vunmap_range_noflush+0x182/0x1d0
>>>> [ 548.737161] ? _raw_spin_unlock+0xe/0x30
>>>> [ 548.737171] remove_vm_area+0x40/0x70
>>>> [ 548.737180] vfree.part.0+0x1b/0x290
>>>> [ 548.737189] delayed_vfree_work+0x35/0x50
>>>> [ 548.737198] process_one_work+0x18d/0x380
>>>> [ 548.737207] worker_thread+0x196/0x300
>>>> [ 548.737215] ? __pfx_worker_thread+0x10/0x10
>>>> [ 548.737224] kthread+0xe3/0x120
>>>> [ 548.737233] ? __pfx_kthread+0x10/0x10
>>>> [ 548.737242] ret_from_fork+0x19e/0x260
>>>> [ 548.737250] ? __pfx_kthread+0x10/0x10
>>>> [ 548.737258] ret_from_fork_asm+0x1a/0x30
>>>> [ 548.737269] </TASK>
>>>> [ 548.737274] Modules linked in: vfat fat snd_seq_dummy snd_hrtimer ath9k ath9k_common snd_hda_codec_intelhdmi snd_hda_codec_hdmi ath9k_hw snd_hda_codec_alc269 snd_hda_codec_realtek_lib snd_hda_scodec_component snd_hda_codec_generic snd_hda_intel snd_hda_codec mac80211 snd_hda_core snd_intel_dspcfg snd_intel_sdw_acpi snd_hwdep ath snd_seq snd_seq_device snd_ctl_led cfg80211 snd_pcm at24 thinkpad_acpi intel_rapl_msr i2c_i801 snd_timer sparse_keymap iTCO_wdt intel_rapl_common platform_profile intel_powerclamp intel_pmc_bxt pcspkr i2c_smbus rfkill libarc4 snd soundcore mei_me e1000e mei joydev lpc_ich loop fuse xenfs nfnetlink vsock_loopback vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport vsock zram vmw_vmci lz4hc_compress lz4_compress dm_thin_pool dm_persistent_data dm_bio_prison dm_crypt i915 i2c_algo_bit drm_buddy ghash_clmulni_intel ttm sdhci_pci drm_display_helper sdhci_uhs2 sdhci video xhci_pci cqhci wmi cec xhci_hcd ehci_pci mmc_core ehci_hcd serio_raw xen_acpi_processor xen_privcmd xen_pciback
>>>> [ 548.737348] xen_blkback xen_gntalloc xen_gntdev xen_evtchn scsi_dh_rdac scsi_dh_emc scsi_dh_alua uinput i2c_dev
>>>> [ 548.737469] ---[ end trace 0000000000000000 ]---
>>>>
>>>> I don't have clear pattern when this happens, one was during host
>>>> suspend, but the other was during "normal" test run (starting/stopping
>>>> domUs and running stuff around them). Note also one of those is Intel
>>>> and the other AMD, so it isn't really hardware specific.
>>>>
>>>> Slightly more details with links (especially serial0.txt in the logs
>>>> tab) at
>>>> https://github.com/QubesOS/qubes-linux-kernel/pull/662#issuecomment-3963326188
>>>>
>>>> Any idea?
>>>>
>>> That looks like the issue Juergen fixed with:
>>>
>>> https://lore.kernel.org/xen-devel/20260220123715.834848-1-jgross@suse.com/
>> No, it doesn't. The fix is already in rc1, and the crash was quite early during
>> boot (before any secondary CPUs were brought up).
>>
>> I guess this problem is related to the lazy_mmu_state series [1].
That may well be the case - it seems that xen_enter_lazy_mmu() is called
while already in lazy MMU mode (first splat), and xen_leave_lazy_mmu()
is called without being in lazy MMU mode (second splat). I expect this
is something specific to Xen, which I didn't get the chance to test.
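As an illustration of the two failure modes described above (this is not code
from the kernel tree), here is a minimal userspace sketch. It assumes the BUG
sites at hypervisor.h:78 and hypervisor.h:85 are consistency checks on a
per-CPU lazy mode variable. Every name prefixed with model_ is hypothetical,
and the functions return false where the real kernel would hit BUG():

```c
#include <stdbool.h>

/* Hypothetical stand-in for the lazy mode that Xen PV tracks per CPU. */
enum model_lazy_mode { MODEL_LAZY_NONE, MODEL_LAZY_MMU };

/* A single simulated CPU; the real state would be a per-CPU variable. */
static enum model_lazy_mode model_cpu_mode = MODEL_LAZY_NONE;

/*
 * Enter lazy MMU mode. Returns false if the CPU is already in a lazy
 * mode, which is the situation the first splat appears to report.
 */
static bool model_enter_lazy_mmu(void)
{
	if (model_cpu_mode != MODEL_LAZY_NONE)
		return false;
	model_cpu_mode = MODEL_LAZY_MMU;
	return true;
}

/*
 * Leave lazy MMU mode. Returns false if the CPU is not currently in
 * lazy MMU mode, which is the situation the second splat appears to
 * report.
 */
static bool model_leave_lazy_mmu(void)
{
	if (model_cpu_mode != MODEL_LAZY_MMU)
		return false;
	model_cpu_mode = MODEL_LAZY_NONE;
	return true;
}
```

In this model, calling model_enter_lazy_mmu() twice in a row fails on the
second call, and calling model_leave_lazy_mmu() without a matching enter fails
as well. Any path that lets the tracked state and the actual enter/leave calls
get out of step would show up as one of these two failures.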
Looking at the series again I don't see anything obviously wrong, but I
think the riskiest change is commit 291b3abed657 ("x86/xen: use
lazy_mmu_state when context-switching") - worth trying to revert it. If
that doesn't help, I'd suggest bisecting the following range:
58852f24f956..291b3abed657
Sorry for the trouble!
- Kevin
> FWIW, the issue still happens on 7.0-rc6.
>
>> Juergen
>>
>> [1]: https://lore.kernel.org/lkml/20251215150323.2218608-1-kevin.brodsky@arm.com/
* Re: kernel BUG around vmap/vfree - xen_enter_lazy_mmu()/xen_leave_lazy_mmu() - Linux 7.0-rc1
2026-04-07 9:23 ` Kevin Brodsky
@ 2026-04-08 2:47 ` Marek Marczykowski-Górecki
2026-04-08 10:38 ` Kevin Brodsky
2026-05-07 16:31 ` Jürgen Groß
1 sibling, 1 reply; 14+ messages in thread
From: Marek Marczykowski-Górecki @ 2026-04-08 2:47 UTC (permalink / raw)
To: Kevin Brodsky
Cc: Jürgen Groß, Andrew Cooper, xen-devel, Boris Ostrovsky
On Tue, Apr 07, 2026 at 11:23:17AM +0200, Kevin Brodsky wrote:
> On 05/04/2026 11:41, Marek Marczykowski-Górecki wrote:
> > On Thu, Feb 26, 2026 at 02:41:12PM +0100, Jürgen Groß wrote:
> >> On 26.02.26 14:27, Andrew Cooper wrote:
> >>> On 26/02/2026 1:17 pm, Marek Marczykowski-Górecki wrote:
> >>>> Hi,
> >>>>
> >>>> When testing Linux 7.0-rc1 in PV dom0, I hit the following panic
> >>>> sometimes:
> >>>>
> >>>> [ 436.849614] ------------[ cut here ]------------
> >>>> [ 436.849669] kernel BUG at arch/x86/include/asm/xen/hypervisor.h:78!
> >>>> [ 436.849693] Oops: invalid opcode: 0000 [#1] SMP NOPTI
> >>>> [ 436.849710] CPU: 3 UID: 0 PID: 4021 Comm: kworker/u25:1 Not tainted 7.0.0-0.rc1.1.qubes.1001.fc41.x86_64 #1 PREEMPT(full)
> >>>> [ 436.849729] Hardware name: Star Labs StarBook/StarBook, BIOS 8.97 10/03/2023
> >>>> [ 436.849743] Workqueue: i915_flip intel_atomic_commit_work [i915]
> >>>> [ 436.850226] RIP: e030:xen_enter_lazy_mmu+0x24/0x30
> >>>> [ 436.850245] Code: 90 90 90 90 90 90 f3 0f 1e fa 0f 1f 44 00 00 65 8b 05 b8 e5 02 03 85 c0 75 10 65 c7 05 a9 e5 02 03 01 00 00 00 c3 cc cc cc cc <0f> 0b 66 2e 0f 1f 84 00 00 00 00 00 90 90 90 90 90 90 90 90 90 90
> >>>> [ 436.850270] RSP: e02b:ffffc90045727a68 EFLAGS: 00010202
> >>>> [ 436.850283] RAX: 0000000000000001 RBX: ffff8881042fa6d0 RCX: 000fffffffe00000
> >>>> [ 436.850296] RDX: 0000000000000001 RSI: ffff88810a5a2980 RDI: 0000000000000000
> >>>> [ 436.850308] RBP: ffffc90049eda000 R08: ffffc90049edc000 R09: ffffc90049edc000
> >>>> [ 436.850320] R10: ffffc90049edc000 R11: ffffc90049edbfff R12: ffffc90049edc000
> >>>> [ 436.850332] R13: ffffc90045727bb0 R14: ffffc90045727b28 R15: 800000000000006b
> >>>> [ 436.850356] FS: 0000000000000000(0000) GS:ffff888201e6e000(0000) knlGS:0000000000000000
> >>>> [ 436.850371] CS: e030 DS: 0000 ES: 0000 CR0: 0000000080050033
> >>>> [ 436.850383] CR2: 00006543dbade250 CR3: 0000000115ef1000 CR4: 0000000000050660
> >>>> [ 436.850401] Call Trace:
> >>>> [ 436.850410] <TASK>
> >>>> [ 436.850420] vmap_pages_pud_range+0x47c/0x530
> >>>> [ 436.850439] vmap_small_pages_range_noflush+0x1f1/0x2b0
> >>>> [ 436.850451] ? __get_vm_area_node+0x10a/0x170
> >>>> [ 436.850465] vmap+0x79/0xd0
> >>>> [ 436.850476] i915_gem_object_map_page+0x13b/0x210 [i915]
> >>>> [ 436.850812] i915_gem_object_pin_map+0x1e2/0x210 [i915]
> >>>> [ 436.851123] i915_gem_object_pin_map_unlocked+0x2d/0xa0 [i915]
> >>>> [ 436.851424] intel_dsb_buffer_create+0xed/0x1a0 [i915]
> >>>> [ 436.851778] intel_dsb_prepare+0xca/0x1a0 [i915]
> >>>> [ 436.852110] intel_atomic_dsb_finish+0x92/0x350 [i915]
> >>>> [ 436.852456] intel_atomic_commit_tail+0x326/0xd40 [i915]
> >>>> [ 436.852769] process_one_work+0x18d/0x380
> >>>> [ 436.852779] worker_thread+0x196/0x300
> >>>> [ 436.852787] ? __pfx_worker_thread+0x10/0x10
> >>>> [ 436.852796] kthread+0xe3/0x120
> >>>> [ 436.852805] ? __pfx_kthread+0x10/0x10
> >>>> [ 436.852815] ret_from_fork+0x19e/0x260
> >>>> [ 436.852824] ? __pfx_kthread+0x10/0x10
> >>>> [ 436.852832] ret_from_fork_asm+0x1a/0x30
> >>>> [ 436.852842] </TASK>
> >>>> [ 436.852847] Modules linked in: snd_seq_dummy snd_hrtimer snd_hda_codec_intelhdmi snd_hda_codec_hdmi snd_hda_codec_alc269 snd_hda_codec_realtek_lib snd_hda_scodec_component snd_hda_codec_generic snd_hda_intel snd_sof_pci_intel_tgl snd_sof_pci_intel_cnl snd_sof_intel_hda_generic soundwire_intel snd_sof_intel_hda_sdw_bpt snd_sof_intel_hda_common snd_soc_hdac_hda snd_sof_intel_hda_mlink snd_sof_intel_hda soundwire_cadence snd_sof_pci snd_sof_xtensa_dsp snd_sof snd_sof_utils snd_soc_acpi_intel_match snd_soc_acpi_intel_sdca_quirks soundwire_generic_allocation snd_soc_sdw_utils snd_soc_acpi crc8 intel_rapl_msr soundwire_bus intel_rapl_common snd_soc_sdca snd_soc_avs snd_soc_hda_codec snd_hda_ext_core snd_hda_codec vfat intel_uncore_frequency_common fat snd_hda_core snd_intel_dspcfg snd_intel_sdw_acpi snd_hwdep intel_powerclamp snd_soc_core iwlwifi snd_compress spi_nor iTCO_wdt ac97_bus intel_pmc_bxt ee1004 mtd snd_pcm_dmaengine snd_seq cfg80211 snd_seq_device pcspkr spi_intel_pci snd_pcm rfkill spi_intel snd_timer snd
> >>>> [ 436.852939] i2c_i801 soundcore i2c_smbus idma64 intel_pmc_core pmt_telemetry pmt_discovery pmt_class intel_hid intel_pmc_ssram_telemetry intel_scu_pltdrv sparse_keymap joydev loop fuse xenfs nfnetlink vsock_loopback vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport vsock zram vmw_vmci lz4hc_compress lz4_compress dm_thin_pool dm_persistent_data dm_bio_prison dm_crypt xe drm_ttm_helper drm_suballoc_helper gpu_sched drm_gpuvm drm_exec drm_gpusvm_helper i915 i2c_algo_bit drm_buddy hid_multitouch i2c_hid_acpi ghash_clmulni_intel video nvme wmi ttm i2c_hid nvme_core nvme_keyring drm_display_helper nvme_auth xhci_pci pinctrl_tigerlake thunderbolt hkdf cec xhci_hcd intel_vsec serio_raw xen_acpi_processor xen_privcmd xen_pciback xen_blkback xen_gntalloc xen_gntdev xen_evtchn scsi_dh_rdac scsi_dh_emc scsi_dh_alua uinput i2c_dev
> >>>> [ 436.853183] ---[ end trace 0000000000000000 ]---
> >>>>
> >>>> or this:
> >>>>
> >>>> [ 548.736884] ------------[ cut here ]------------
> >>>> [ 548.736907] kernel BUG at arch/x86/include/asm/xen/hypervisor.h:85!
> >>>> [ 548.736923] Oops: invalid opcode: 0000 [#1] SMP NOPTI
> >>>> [ 548.736935] CPU: 0 UID: 0 PID: 206 Comm: kworker/0:2 Not tainted 7.0.0-0.rc1.1.qubes.1001.fc41.x86_64 #1 PREEMPT(full)
> >>>> [ 548.736949] Hardware name: LENOVO 2347A45/2347A45, BIOS CBET4000 Nitrokey-v0.2.0-2608-ga649597 01/01/1970
> >>>> [ 548.736962] Workqueue: events delayed_vfree_work
> >>>> [ 548.736976] RIP: e030:xen_leave_lazy_mmu+0x44/0x50
> >>>> [ 548.736989] Code: 02 03 83 f8 01 75 23 65 c7 05 6c e4 02 03 00 00 00 00 65 ff 0d 7d b8 02 03 74 05 c3 cc cc cc cc e8 61 5d fd ff c3 cc cc cc cc <0f> 0b 66 2e 0f 1f 84 00 00 00 00 00 90 90 90 90 90 90 90 90 90 90
> >>>> [ 548.737010] RSP: e02b:ffffc90040607cf0 EFLAGS: 00010297
> >>>> [ 548.737018] RAX: 0000000000000000 RBX: ffff888164a70408 RCX: 0000000000000000
> >>>> [ 548.737029] RDX: 0000000000000000 RSI: 000ffffffffff000 RDI: ffff8881069c0000
> >>>> [ 548.737039] RBP: ffffc90049681000 R08: ffffc90049681000 R09: 0000000000000027
> >>>> [ 548.737050] R10: 0000000000000027 R11: fefefefefefefeff R12: ffffc90049681000
> >>>> [ 548.737060] R13: ffff8881002fd258 R14: 0000000000000000 R15: ffffc90040607dac
> >>>> [ 548.737079] FS: 0000000000000000(0000) GS:ffff8881f88ee000(0000) knlGS:0000000000000000
> >>>> [ 548.737090] CS: e030 DS: 0000 ES: 0000 CR0: 0000000080050033
> >>>> [ 548.737099] CR2: 000055576c2e6058 CR3: 000000010d47b000 CR4: 0000000000050660
> >>>> [ 548.737115] Call Trace:
> >>>> [ 548.737123] <TASK>
> >>>> [ 548.737128] vunmap_pmd_range.isra.0+0x1f1/0x2e0
> >>>> [ 548.737142] vunmap_p4d_range+0x17d/0x290
> >>>> [ 548.737151] __vunmap_range_noflush+0x182/0x1d0
> >>>> [ 548.737161] ? _raw_spin_unlock+0xe/0x30
> >>>> [ 548.737171] remove_vm_area+0x40/0x70
> >>>> [ 548.737180] vfree.part.0+0x1b/0x290
> >>>> [ 548.737189] delayed_vfree_work+0x35/0x50
> >>>> [ 548.737198] process_one_work+0x18d/0x380
> >>>> [ 548.737207] worker_thread+0x196/0x300
> >>>> [ 548.737215] ? __pfx_worker_thread+0x10/0x10
> >>>> [ 548.737224] kthread+0xe3/0x120
> >>>> [ 548.737233] ? __pfx_kthread+0x10/0x10
> >>>> [ 548.737242] ret_from_fork+0x19e/0x260
> >>>> [ 548.737250] ? __pfx_kthread+0x10/0x10
> >>>> [ 548.737258] ret_from_fork_asm+0x1a/0x30
> >>>> [ 548.737269] </TASK>
> >>>> [ 548.737274] Modules linked in: vfat fat snd_seq_dummy snd_hrtimer ath9k ath9k_common snd_hda_codec_intelhdmi snd_hda_codec_hdmi ath9k_hw snd_hda_codec_alc269 snd_hda_codec_realtek_lib snd_hda_scodec_component snd_hda_codec_generic snd_hda_intel snd_hda_codec mac80211 snd_hda_core snd_intel_dspcfg snd_intel_sdw_acpi snd_hwdep ath snd_seq snd_seq_device snd_ctl_led cfg80211 snd_pcm at24 thinkpad_acpi intel_rapl_msr i2c_i801 snd_timer sparse_keymap iTCO_wdt intel_rapl_common platform_profile intel_powerclamp intel_pmc_bxt pcspkr i2c_smbus rfkill libarc4 snd soundcore mei_me e1000e mei joydev lpc_ich loop fuse xenfs nfnetlink vsock_loopback vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport vsock zram vmw_vmci lz4hc_compress lz4_compress dm_thin_pool dm_persistent_data dm_bio_prison dm_crypt i915 i2c_algo_bit drm_buddy ghash_clmulni_intel ttm sdhci_pci drm_display_helper sdhci_uhs2 sdhci video xhci_pci cqhci wmi cec xhci_hcd ehci_pci mmc_core ehci_hcd serio_raw xen_acpi_processor xen_privcmd xen_pciback
> >>>> [ 548.737348] xen_blkback xen_gntalloc xen_gntdev xen_evtchn scsi_dh_rdac scsi_dh_emc scsi_dh_alua uinput i2c_dev
> >>>> [ 548.737469] ---[ end trace 0000000000000000 ]---
> >>>>
> >>>> I don't have a clear pattern for when this happens: one was during host
> >>>> suspend, but the other was during a "normal" test run (starting/stopping
> >>>> domUs and running stuff around them). Note also that one of those is Intel
> >>>> and the other AMD, so it isn't really hardware specific.
> >>>>
> >>>> Slightly more details with links (especially serial0.txt in the logs
> >>>> tab) at
> >>>> https://github.com/QubesOS/qubes-linux-kernel/pull/662#issuecomment-3963326188
> >>>>
> >>>> Any idea?
> >>>>
> >>> That looks like the issue Juergen fixed with:
> >>>
> >>> https://lore.kernel.org/xen-devel/20260220123715.834848-1-jgross@suse.com/
> >> No, it doesn't. The fix is already in rc1, and the crash was quite early during
> >> boot (before any secondary CPUs were brought up).
> >>
> >> I guess this problem is related to the lazy_mmu_state series [1].
>
> That may well be the case - it seems that xen_enter_lazy_mmu() is called
> while already in lazy MMU mode (first splat), and xen_leave_lazy_mmu()
> is called without being in lazy MMU mode (second splat). I expect this
> is something specific to Xen, which I didn't get the chance to test.
>
> Looking at the series again I don't see anything obviously wrong, but I
> think the riskiest change is commit 291b3abed657 ("x86/xen: use
> lazy_mmu_state when context-switching") - worth trying to revert it.
With that reverted (on top of 7.0-rc6; I haven't updated to rc7 yet), I
still got a panic, although possibly a slightly different one:
[ 8.099973] BUG: unable to handle page fault for address: ffff888008000670
[ 8.100004] #PF: supervisor write access in kernel mode
[ 8.100021] #PF: error_code(0x0003) - permissions violation
[ 8.100037] PGD 3a00067 P4D 3a00067 PUD 3a01067 PMD 7cd7063 PTE 8000000008000021
[ 8.100063] Oops: Oops: 0003 [#1] SMP PTI
[ 8.100079] CPU: 0 UID: 0 PID: 226 Comm: kworker/0:2 Not tainted 7.0.0-0.rc6.1.qubes.1001.fc41.x86_64 #1 PREEMPT(full)
[ 8.100110] Workqueue: events do_free_init
[ 8.100126] RIP: 0010:native_set_pte+0x4/0x10
[ 8.100145] Code: 00 03 c3 cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa <48> 89 37 c3 cc cc cc cc 0f 1f 40 00 90 90 90 90 90 90 90 90 90 90
[ 8.100195] RSP: 0018:ffffc90000c97c48 EFLAGS: 00010287
[ 8.100212] RAX: e00c4f3d8b48c03e RBX: ffff888008000670 RCX: e00000000000003e
[ 8.100234] RDX: e00c4f3d8b48c13e RSI: e00c4f3d8b48c03e RDI: ffff888008000670
[ 8.100260] RBP: e00c4f3d8b48c13e R08: 0000000000000000 R09: 0000000000000001
[ 8.100282] R10: 0000003b0c274b73 R11: e00000000000013e R12: ffffc90000c97cf0
[ 8.100304] R13: ffffffffc04ce000 R14: fffc4f3d8b48cfff R15: e00000000000013e
[ 8.100327] FS: 0000000000000000(0000) GS:ffff888094e81000(0000) knlGS:0000000000000000
[ 8.100350] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 8.100369] CR2: ffff888008000670 CR3: 000000000242e003 CR4: 00000000001706f0
[ 8.100394] Call Trace:
[ 8.100404] <TASK>
[ 8.100413] __change_page_attr+0x24f/0x350
[ 8.100429] __change_page_attr_set_clr+0x61/0xd0
[ 8.100446] change_page_attr_set_clr+0x103/0x1a0
[ 8.100467] set_memory_nx+0x39/0x50
[ 8.100481] __execmem_cache_free+0x35/0xb0
[ 8.100496] execmem_free+0x9f/0x180
[ 8.100510] ? nft_chain_nat_exit+0xe70/0xe70 [nft_chain_nat]
[ 8.100531] do_free_init+0x2e/0x60
[ 8.100545] process_one_work+0x198/0x390
[ 8.100559] worker_thread+0x1af/0x320
[ 8.100573] ? __pfx_worker_thread+0x10/0x10
[ 8.103338] kthread+0xe3/0x120
[ 8.103355] ? __pfx_kthread+0x10/0x10
[ 8.103369] ret_from_fork+0x19e/0x260
[ 8.103384] ? __pfx_kthread+0x10/0x10
[ 8.103397] ret_from_fork_asm+0x1a/0x30
[ 8.103412] </TASK>
[ 8.103421] Modules linked in: xenfs nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_redir nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables binfmt_misc intel_rapl_msr intel_rapl_common ghash_clmulni_intel xen_netfront xen_privcmd xen_gntdev xen_gntalloc xen_blkback xen_evtchn fuse loop nfnetlink ip_tables overlay xen_blkfront
[ 8.103529] CR2: ffff888008000670
[ 8.103542] ---[ end trace 0000000000000000 ]---
[ 8.103558] RIP: 0010:native_set_pte+0x4/0x10
[ 8.103576] Code: 00 03 c3 cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa <48> 89 37 c3 cc cc cc cc 0f 1f 40 00 90 90 90 90 90 90 90 90 90 90
[ 8.103625] RSP: 0018:ffffc90000c97c48 EFLAGS: 00010287
[ 8.103641] RAX: e00c4f3d8b48c03e RBX: ffff888008000670 RCX: e00000000000003e
[ 8.103664] RDX: e00c4f3d8b48c13e RSI: e00c4f3d8b48c03e RDI: ffff888008000670
[ 8.103686] RBP: e00c4f3d8b48c13e R08: 0000000000000000 R09: 0000000000000001
[ 8.103708] R10: 0000003b0c274b73 R11: e00000000000013e R12: ffffc90000c97cf0
[ 8.103730] R13: ffffffffc04ce000 R14: fffc4f3d8b48cfff R15: e00000000000013e
[ 8.103753] FS: 0000000000000000(0000) GS:ffff888094e81000(0000) knlGS:0000000000000000
[ 8.103775] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 8.103794] CR2: ffff888008000670 CR3: 000000000242e003 CR4: 00000000001706f0
[ 8.103820] Kernel panic - not syncing: Fatal exception
[ 8.103929] Kernel Offset: disabled
> If
> that doesn't help, I'd suggest bisecting the following range:
> 58852f24f956..291b3abed657
It will take some time, as the issue doesn't happen every time.
> Sorry for the trouble!
>
> - Kevin
>
> > FWIW, the issue still happens on 7.0-rc6.
> >
> >> Juergen
> >>
> >> [1]: https://lore.kernel.org/lkml/20251215150323.2218608-1-kevin.brodsky@arm.com/
--
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
* Re: kernel BUG around vmap/vfree - xen_enter_lazy_mmu()/xen_leave_lazy_mmu() - Linux 7.0-rc1
2026-04-08 2:47 ` Marek Marczykowski-Górecki
@ 2026-04-08 10:38 ` Kevin Brodsky
0 siblings, 0 replies; 14+ messages in thread
From: Kevin Brodsky @ 2026-04-08 10:38 UTC (permalink / raw)
To: Marek Marczykowski-Górecki
Cc: Jürgen Groß, Andrew Cooper, xen-devel, Boris Ostrovsky
On 08/04/2026 04:47, Marek Marczykowski-Górecki wrote:
>> That may well be the case - it seems that xen_enter_lazy_mmu() is called
>> while already in lazy MMU mode (first splat), and xen_leave_lazy_mmu()
>> is called without being in lazy MMU mode (second splat). I expect this
>> is something specific to Xen, which I didn't get the chance to test.
>>
>> Looking at the series again I don't see anything obviously wrong, but I
>> think the riskiest change is commit 291b3abed657 ("x86/xen: use
>> lazy_mmu_state when context-switching") - worth trying to revert it.
> With that reverted (on top of 7.0-rc6; I haven't updated to rc7 yet), I
> still got a panic, although possibly a slightly different one:
>
> [ 8.099973] BUG: unable to handle page fault for address: ffff888008000670
> [ 8.100004] #PF: supervisor write access in kernel mode
> [ 8.100021] #PF: error_code(0x0003) - permissions violation
> [ 8.100037] PGD 3a00067 P4D 3a00067 PUD 3a01067 PMD 7cd7063 PTE 8000000008000021
> [ 8.100063] Oops: Oops: 0003 [#1] SMP PTI
> [ 8.100079] CPU: 0 UID: 0 PID: 226 Comm: kworker/0:2 Not tainted 7.0.0-0.rc6.1.qubes.1001.fc41.x86_64 #1 PREEMPT(full)
> [ 8.100110] Workqueue: events do_free_init
> [ 8.100126] RIP: 0010:native_set_pte+0x4/0x10
> [ 8.100145] Code: 00 03 c3 cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa <48> 89 37 c3 cc cc cc cc 0f 1f 40 00 90 90 90 90 90 90 90 90 90 90
> [ 8.100195] RSP: 0018:ffffc90000c97c48 EFLAGS: 00010287
> [ 8.100212] RAX: e00c4f3d8b48c03e RBX: ffff888008000670 RCX: e00000000000003e
> [ 8.100234] RDX: e00c4f3d8b48c13e RSI: e00c4f3d8b48c03e RDI: ffff888008000670
> [ 8.100260] RBP: e00c4f3d8b48c13e R08: 0000000000000000 R09: 0000000000000001
> [ 8.100282] R10: 0000003b0c274b73 R11: e00000000000013e R12: ffffc90000c97cf0
> [ 8.100304] R13: ffffffffc04ce000 R14: fffc4f3d8b48cfff R15: e00000000000013e
> [ 8.100327] FS: 0000000000000000(0000) GS:ffff888094e81000(0000) knlGS:0000000000000000
> [ 8.100350] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 8.100369] CR2: ffff888008000670 CR3: 000000000242e003 CR4: 00000000001706f0
> [ 8.100394] Call Trace:
> [ 8.100404] <TASK>
> [ 8.100413] __change_page_attr+0x24f/0x350
> [ 8.100429] __change_page_attr_set_clr+0x61/0xd0
> [ 8.100446] change_page_attr_set_clr+0x103/0x1a0
> [ 8.100467] set_memory_nx+0x39/0x50
> [ 8.100481] __execmem_cache_free+0x35/0xb0
> [ 8.100496] execmem_free+0x9f/0x180
> [ 8.100510] ? nft_chain_nat_exit+0xe70/0xe70 [nft_chain_nat]
> [ 8.100531] do_free_init+0x2e/0x60
> [ 8.100545] process_one_work+0x198/0x390
> [ 8.100559] worker_thread+0x1af/0x320
> [ 8.100573] ? __pfx_worker_thread+0x10/0x10
> [ 8.103338] kthread+0xe3/0x120
> [ 8.103355] ? __pfx_kthread+0x10/0x10
> [ 8.103369] ret_from_fork+0x19e/0x260
> [ 8.103384] ? __pfx_kthread+0x10/0x10
> [ 8.103397] ret_from_fork_asm+0x1a/0x30
> [ 8.103412] </TASK>
> [ 8.103421] Modules linked in: xenfs nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_redir nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables binfmt_misc intel_rapl_msr intel_rapl_common ghash_clmulni_intel xen_netfront xen_privcmd xen_gntdev xen_gntalloc xen_blkback xen_evtchn fuse loop nfnetlink ip_tables overlay xen_blkfront
> [ 8.103529] CR2: ffff888008000670
> [ 8.103542] ---[ end trace 0000000000000000 ]---
> [ 8.103558] RIP: 0010:native_set_pte+0x4/0x10
> [ 8.103576] Code: 00 03 c3 cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa <48> 89 37 c3 cc cc cc cc 0f 1f 40 00 90 90 90 90 90 90 90 90 90 90
> [ 8.103625] RSP: 0018:ffffc90000c97c48 EFLAGS: 00010287
> [ 8.103641] RAX: e00c4f3d8b48c03e RBX: ffff888008000670 RCX: e00000000000003e
> [ 8.103664] RDX: e00c4f3d8b48c13e RSI: e00c4f3d8b48c03e RDI: ffff888008000670
> [ 8.103686] RBP: e00c4f3d8b48c13e R08: 0000000000000000 R09: 0000000000000001
> [ 8.103708] R10: 0000003b0c274b73 R11: e00000000000013e R12: ffffc90000c97cf0
> [ 8.103730] R13: ffffffffc04ce000 R14: fffc4f3d8b48cfff R15: e00000000000013e
> [ 8.103753] FS: 0000000000000000(0000) GS:ffff888094e81000(0000) knlGS:0000000000000000
> [ 8.103775] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 8.103794] CR2: ffff888008000670 CR3: 000000000242e003 CR4: 00000000001706f0
> [ 8.103820] Kernel panic - not syncing: Fatal exception
> [ 8.103929] Kernel Offset: disabled
That is probably the same root cause indeed (lazy MMU mode appearing disabled
in __xen_set_pte() when it should be enabled).
>> If
>> that doesn't help, I'd suggest bisecting the following range:
>> 58852f24f956..291b3abed657
> It will take some time, as the issue doesn't happen every time.
Understood. Here are the commits that are expected to have a functional
effect on x86 (in reverse chronological order):
- 291b3abed657 ("x86/xen: use lazy_mmu_state when context-switching")
- 5ab246749569 ("mm: enable lazy_mmu sections to nest")
- 9273dfaeaca8 ("mm: bail out of lazy_mmu_mode_* in interrupt context")
- 66bdd779d344 ("x86/xen: simplify flush_lazy_mmu()")
Hope that helps, let me know if you have any further information. It
would be worth enabling CONFIG_DEBUG_VM and then checking if any WARN()
splat appears in the log.
- Kevin
* Re: kernel BUG around vmap/vfree - xen_enter_lazy_mmu()/xen_leave_lazy_mmu() - Linux 7.0-rc1
2026-04-07 9:23 ` Kevin Brodsky
2026-04-08 2:47 ` Marek Marczykowski-Górecki
@ 2026-05-07 16:31 ` Jürgen Groß
2026-05-08 8:53 ` Juergen Gross
1 sibling, 1 reply; 14+ messages in thread
From: Jürgen Groß @ 2026-05-07 16:31 UTC (permalink / raw)
To: Kevin Brodsky, Marek Marczykowski-Górecki
Cc: Andrew Cooper, xen-devel, Boris Ostrovsky
[-- Attachment #1.1.1: Type: text/plain, Size: 10762 bytes --]
On 07.04.26 11:23, Kevin Brodsky wrote:
> On 05/04/2026 11:41, Marek Marczykowski-Górecki wrote:
>> On Thu, Feb 26, 2026 at 02:41:12PM +0100, Jürgen Groß wrote:
>>> On 26.02.26 14:27, Andrew Cooper wrote:
>>>> On 26/02/2026 1:17 pm, Marek Marczykowski-Górecki wrote:
>>>>> Hi,
>>>>>
>>>>> When testing Linux 7.0-rc1 in PV dom0, I hit the following panic
>>>>> sometimes:
>>>>>
>>>>> [ 436.849614] ------------[ cut here ]------------
>>>>> [ 436.849669] kernel BUG at arch/x86/include/asm/xen/hypervisor.h:78!
>>>>> [ 436.849693] Oops: invalid opcode: 0000 [#1] SMP NOPTI
>>>>> [ 436.849710] CPU: 3 UID: 0 PID: 4021 Comm: kworker/u25:1 Not tainted 7.0.0-0.rc1.1.qubes.1001.fc41.x86_64 #1 PREEMPT(full)
>>>>> [ 436.849729] Hardware name: Star Labs StarBook/StarBook, BIOS 8.97 10/03/2023
>>>>> [ 436.849743] Workqueue: i915_flip intel_atomic_commit_work [i915]
>>>>> [ 436.850226] RIP: e030:xen_enter_lazy_mmu+0x24/0x30
>>>>> [ 436.850245] Code: 90 90 90 90 90 90 f3 0f 1e fa 0f 1f 44 00 00 65 8b 05 b8 e5 02 03 85 c0 75 10 65 c7 05 a9 e5 02 03 01 00 00 00 c3 cc cc cc cc <0f> 0b 66 2e 0f 1f 84 00 00 00 00 00 90 90 90 90 90 90 90 90 90 90
>>>>> [ 436.850270] RSP: e02b:ffffc90045727a68 EFLAGS: 00010202
>>>>> [ 436.850283] RAX: 0000000000000001 RBX: ffff8881042fa6d0 RCX: 000fffffffe00000
>>>>> [ 436.850296] RDX: 0000000000000001 RSI: ffff88810a5a2980 RDI: 0000000000000000
>>>>> [ 436.850308] RBP: ffffc90049eda000 R08: ffffc90049edc000 R09: ffffc90049edc000
>>>>> [ 436.850320] R10: ffffc90049edc000 R11: ffffc90049edbfff R12: ffffc90049edc000
>>>>> [ 436.850332] R13: ffffc90045727bb0 R14: ffffc90045727b28 R15: 800000000000006b
>>>>> [ 436.850356] FS: 0000000000000000(0000) GS:ffff888201e6e000(0000) knlGS:0000000000000000
>>>>> [ 436.850371] CS: e030 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>>> [ 436.850383] CR2: 00006543dbade250 CR3: 0000000115ef1000 CR4: 0000000000050660
>>>>> [ 436.850401] Call Trace:
>>>>> [ 436.850410] <TASK>
>>>>> [ 436.850420] vmap_pages_pud_range+0x47c/0x530
>>>>> [ 436.850439] vmap_small_pages_range_noflush+0x1f1/0x2b0
>>>>> [ 436.850451] ? __get_vm_area_node+0x10a/0x170
>>>>> [ 436.850465] vmap+0x79/0xd0
>>>>> [ 436.850476] i915_gem_object_map_page+0x13b/0x210 [i915]
>>>>> [ 436.850812] i915_gem_object_pin_map+0x1e2/0x210 [i915]
>>>>> [ 436.851123] i915_gem_object_pin_map_unlocked+0x2d/0xa0 [i915]
>>>>> [ 436.851424] intel_dsb_buffer_create+0xed/0x1a0 [i915]
>>>>> [ 436.851778] intel_dsb_prepare+0xca/0x1a0 [i915]
>>>>> [ 436.852110] intel_atomic_dsb_finish+0x92/0x350 [i915]
>>>>> [ 436.852456] intel_atomic_commit_tail+0x326/0xd40 [i915]
>>>>> [ 436.852769] process_one_work+0x18d/0x380
>>>>> [ 436.852779] worker_thread+0x196/0x300
>>>>> [ 436.852787] ? __pfx_worker_thread+0x10/0x10
>>>>> [ 436.852796] kthread+0xe3/0x120
>>>>> [ 436.852805] ? __pfx_kthread+0x10/0x10
>>>>> [ 436.852815] ret_from_fork+0x19e/0x260
>>>>> [ 436.852824] ? __pfx_kthread+0x10/0x10
>>>>> [ 436.852832] ret_from_fork_asm+0x1a/0x30
>>>>> [ 436.852842] </TASK>
>>>>> [ 436.852847] Modules linked in: snd_seq_dummy snd_hrtimer snd_hda_codec_intelhdmi snd_hda_codec_hdmi snd_hda_codec_alc269 snd_hda_codec_realtek_lib snd_hda_scodec_component snd_hda_codec_generic snd_hda_intel snd_sof_pci_intel_tgl snd_sof_pci_intel_cnl snd_sof_intel_hda_generic soundwire_intel snd_sof_intel_hda_sdw_bpt snd_sof_intel_hda_common snd_soc_hdac_hda snd_sof_intel_hda_mlink snd_sof_intel_hda soundwire_cadence snd_sof_pci snd_sof_xtensa_dsp snd_sof snd_sof_utils snd_soc_acpi_intel_match snd_soc_acpi_intel_sdca_quirks soundwire_generic_allocation snd_soc_sdw_utils snd_soc_acpi crc8 intel_rapl_msr soundwire_bus intel_rapl_common snd_soc_sdca snd_soc_avs snd_soc_hda_codec snd_hda_ext_core snd_hda_codec vfat intel_uncore_frequency_common fat snd_hda_core snd_intel_dspcfg snd_intel_sdw_acpi snd_hwdep intel_powerclamp snd_soc_core iwlwifi snd_compress spi_nor iTCO_wdt ac97_bus intel_pmc_bxt ee1004 mtd snd_pcm_dmaengine snd_seq cfg80211 snd_seq_device pcspkr spi_intel_pci snd_pcm rfkill spi_intel snd_timer snd
>>>>> [ 436.852939] i2c_i801 soundcore i2c_smbus idma64 intel_pmc_core pmt_telemetry pmt_discovery pmt_class intel_hid intel_pmc_ssram_telemetry intel_scu_pltdrv sparse_keymap joydev loop fuse xenfs nfnetlink vsock_loopback vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport vsock zram vmw_vmci lz4hc_compress lz4_compress dm_thin_pool dm_persistent_data dm_bio_prison dm_crypt xe drm_ttm_helper drm_suballoc_helper gpu_sched drm_gpuvm drm_exec drm_gpusvm_helper i915 i2c_algo_bit drm_buddy hid_multitouch i2c_hid_acpi ghash_clmulni_intel video nvme wmi ttm i2c_hid nvme_core nvme_keyring drm_display_helper nvme_auth xhci_pci pinctrl_tigerlake thunderbolt hkdf cec xhci_hcd intel_vsec serio_raw xen_acpi_processor xen_privcmd xen_pciback xen_blkback xen_gntalloc xen_gntdev xen_evtchn scsi_dh_rdac scsi_dh_emc scsi_dh_alua uinput i2c_dev
>>>>> [ 436.853183] ---[ end trace 0000000000000000 ]---
>>>>>
>>>>> or this:
>>>>>
>>>>> [ 548.736884] ------------[ cut here ]------------
>>>>> [ 548.736907] kernel BUG at arch/x86/include/asm/xen/hypervisor.h:85!
>>>>> [ 548.736923] Oops: invalid opcode: 0000 [#1] SMP NOPTI
>>>>> [ 548.736935] CPU: 0 UID: 0 PID: 206 Comm: kworker/0:2 Not tainted 7.0.0-0.rc1.1.qubes.1001.fc41.x86_64 #1 PREEMPT(full)
>>>>> [ 548.736949] Hardware name: LENOVO 2347A45/2347A45, BIOS CBET4000 Nitrokey-v0.2.0-2608-ga649597 01/01/1970
>>>>> [ 548.736962] Workqueue: events delayed_vfree_work
>>>>> [ 548.736976] RIP: e030:xen_leave_lazy_mmu+0x44/0x50
>>>>> [ 548.736989] Code: 02 03 83 f8 01 75 23 65 c7 05 6c e4 02 03 00 00 00 00 65 ff 0d 7d b8 02 03 74 05 c3 cc cc cc cc e8 61 5d fd ff c3 cc cc cc cc <0f> 0b 66 2e 0f 1f 84 00 00 00 00 00 90 90 90 90 90 90 90 90 90 90
>>>>> [ 548.737010] RSP: e02b:ffffc90040607cf0 EFLAGS: 00010297
>>>>> [ 548.737018] RAX: 0000000000000000 RBX: ffff888164a70408 RCX: 0000000000000000
>>>>> [ 548.737029] RDX: 0000000000000000 RSI: 000ffffffffff000 RDI: ffff8881069c0000
>>>>> [ 548.737039] RBP: ffffc90049681000 R08: ffffc90049681000 R09: 0000000000000027
>>>>> [ 548.737050] R10: 0000000000000027 R11: fefefefefefefeff R12: ffffc90049681000
>>>>> [ 548.737060] R13: ffff8881002fd258 R14: 0000000000000000 R15: ffffc90040607dac
>>>>> [ 548.737079] FS: 0000000000000000(0000) GS:ffff8881f88ee000(0000) knlGS:0000000000000000
>>>>> [ 548.737090] CS: e030 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>>> [ 548.737099] CR2: 000055576c2e6058 CR3: 000000010d47b000 CR4: 0000000000050660
>>>>> [ 548.737115] Call Trace:
>>>>> [ 548.737123] <TASK>
>>>>> [ 548.737128] vunmap_pmd_range.isra.0+0x1f1/0x2e0
>>>>> [ 548.737142] vunmap_p4d_range+0x17d/0x290
>>>>> [ 548.737151] __vunmap_range_noflush+0x182/0x1d0
>>>>> [ 548.737161] ? _raw_spin_unlock+0xe/0x30
>>>>> [ 548.737171] remove_vm_area+0x40/0x70
>>>>> [ 548.737180] vfree.part.0+0x1b/0x290
>>>>> [ 548.737189] delayed_vfree_work+0x35/0x50
>>>>> [ 548.737198] process_one_work+0x18d/0x380
>>>>> [ 548.737207] worker_thread+0x196/0x300
>>>>> [ 548.737215] ? __pfx_worker_thread+0x10/0x10
>>>>> [ 548.737224] kthread+0xe3/0x120
>>>>> [ 548.737233] ? __pfx_kthread+0x10/0x10
>>>>> [ 548.737242] ret_from_fork+0x19e/0x260
>>>>> [ 548.737250] ? __pfx_kthread+0x10/0x10
>>>>> [ 548.737258] ret_from_fork_asm+0x1a/0x30
>>>>> [ 548.737269] </TASK>
>>>>> [ 548.737274] Modules linked in: vfat fat snd_seq_dummy snd_hrtimer ath9k ath9k_common snd_hda_codec_intelhdmi snd_hda_codec_hdmi ath9k_hw snd_hda_codec_alc269 snd_hda_codec_realtek_lib snd_hda_scodec_component snd_hda_codec_generic snd_hda_intel snd_hda_codec mac80211 snd_hda_core snd_intel_dspcfg snd_intel_sdw_acpi snd_hwdep ath snd_seq snd_seq_device snd_ctl_led cfg80211 snd_pcm at24 thinkpad_acpi intel_rapl_msr i2c_i801 snd_timer sparse_keymap iTCO_wdt intel_rapl_common platform_profile intel_powerclamp intel_pmc_bxt pcspkr i2c_smbus rfkill libarc4 snd soundcore mei_me e1000e mei joydev lpc_ich loop fuse xenfs nfnetlink vsock_loopback vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport vsock zram vmw_vmci lz4hc_compress lz4_compress dm_thin_pool dm_persistent_data dm_bio_prison dm_crypt i915 i2c_algo_bit drm_buddy ghash_clmulni_intel ttm sdhci_pci drm_display_helper sdhci_uhs2 sdhci video xhci_pci cqhci wmi cec xhci_hcd ehci_pci mmc_core ehci_hcd serio_raw xen_acpi_processor xen_privcmd xen_pciback
>>>>> [ 548.737348] xen_blkback xen_gntalloc xen_gntdev xen_evtchn scsi_dh_rdac scsi_dh_emc scsi_dh_alua uinput i2c_dev
>>>>> [ 548.737469] ---[ end trace 0000000000000000 ]---
>>>>>
>>>>> I don't have a clear pattern for when this happens: one was during host
>>>>> suspend, but the other was during a "normal" test run (starting/stopping
>>>>> domUs and running stuff around them). Note also that one of those is Intel
>>>>> and the other AMD, so it isn't really hardware specific.
>>>>>
>>>>> Slightly more details with links (especially serial0.txt in the logs
>>>>> tab) at
>>>>> https://github.com/QubesOS/qubes-linux-kernel/pull/662#issuecomment-3963326188
>>>>>
>>>>> Any idea?
>>>>>
>>>> That looks like the issue Juergen fixed with:
>>>>
>>>> https://lore.kernel.org/xen-devel/20260220123715.834848-1-jgross@suse.com/
>>> No, it doesn't. The fix is already in rc1, and the crash was quite early during
>>> boot (before any secondary CPUs were brought up).
>>>
>>> I guess this problem is related to the lazy_mmu_state series [1].
>
> That may well be the case - it seems that xen_enter_lazy_mmu() is called
> while already in lazy MMU mode (first splat), and xen_leave_lazy_mmu()
> is called without being in lazy MMU mode (second splat). I expect this
> is something specific to Xen, which I didn't get the chance to test.
Looking into this again.
I think the main problem is the call of arch_end_context_switch() in
__switch_to(). For Xen this is xen_end_context_switch(), which does:
    if (__task_lazy_mmu_mode_active(next))
        arch_enter_lazy_mmu_mode();
But this is wrong here, as current hasn't been switched to "next" yet.
I don't think we can just move the call of arch_end_context_switch(), as
it is needed for issuing the context switch related hypercall for switching
all the needed non-MMU settings.
What we probably really want is to call lazy_mmu_mode_pause() before the
call of arch_start_context_switch() and later call lazy_mmu_mode_resume()
after switching context to next. In xen_start_context_switch() and
xen_end_context_switch() the lazy mmu mode handling should be removed.
I will test that tomorrow, unless someone talks me out of it. :-)
Juergen
* Re: kernel BUG around vmap/vfree - xen_enter_lazy_mmu()/xen_leave_lazy_mmu() - Linux 7.0-rc1
2026-05-07 16:31 ` Jürgen Groß
@ 2026-05-08 8:53 ` Juergen Gross
2026-05-08 9:54 ` Kevin Brodsky
0 siblings, 1 reply; 14+ messages in thread
From: Juergen Gross @ 2026-05-08 8:53 UTC (permalink / raw)
To: Kevin Brodsky, Marek Marczykowski-Górecki
Cc: Andrew Cooper, xen-devel, Boris Ostrovsky
[-- Attachment #1.1.1: Type: text/plain, Size: 12711 bytes --]
On 07.05.26 18:31, Jürgen Groß wrote:
> On 07.04.26 11:23, Kevin Brodsky wrote:
>> On 05/04/2026 11:41, Marek Marczykowski-Górecki wrote:
>>> On Thu, Feb 26, 2026 at 02:41:12PM +0100, Jürgen Groß wrote:
>>>> On 26.02.26 14:27, Andrew Cooper wrote:
>>>>> On 26/02/2026 1:17 pm, Marek Marczykowski-Górecki wrote:
>>>>>> Hi,
>>>>>>
>>>>>> When testing Linux 7.0-rc1 in PV dom0, I hit the following panic
>>>>>> sometimes:
>>>>>>
>>>>>> [ 436.849614] ------------[ cut here ]------------
>>>>>> [ 436.849669] kernel BUG at arch/x86/include/asm/xen/hypervisor.h:78!
>>>>>> [ 436.849693] Oops: invalid opcode: 0000 [#1] SMP NOPTI
>>>>>> [ 436.849710] CPU: 3 UID: 0 PID: 4021 Comm: kworker/u25:1 Not tainted
>>>>>> 7.0.0-0.rc1.1.qubes.1001.fc41.x86_64 #1 PREEMPT(full)
>>>>>> [ 436.849729] Hardware name: Star Labs StarBook/StarBook, BIOS 8.97
>>>>>> 10/03/2023
>>>>>> [ 436.849743] Workqueue: i915_flip intel_atomic_commit_work [i915]
>>>>>> [ 436.850226] RIP: e030:xen_enter_lazy_mmu+0x24/0x30
>>>>>> [ 436.850245] Code: 90 90 90 90 90 90 f3 0f 1e fa 0f 1f 44 00 00 65 8b 05
>>>>>> b8 e5 02 03 85 c0 75 10 65 c7 05 a9 e5 02 03 01 00 00 00 c3 cc cc cc cc
>>>>>> <0f> 0b 66 2e 0f 1f 84 00 00 00 00 00 90 90 90 90 90 90 90 90 90 90
>>>>>> [ 436.850270] RSP: e02b:ffffc90045727a68 EFLAGS: 00010202
>>>>>> [ 436.850283] RAX: 0000000000000001 RBX: ffff8881042fa6d0 RCX:
>>>>>> 000fffffffe00000
>>>>>> [ 436.850296] RDX: 0000000000000001 RSI: ffff88810a5a2980 RDI:
>>>>>> 0000000000000000
>>>>>> [ 436.850308] RBP: ffffc90049eda000 R08: ffffc90049edc000 R09:
>>>>>> ffffc90049edc000
>>>>>> [ 436.850320] R10: ffffc90049edc000 R11: ffffc90049edbfff R12:
>>>>>> ffffc90049edc000
>>>>>> [ 436.850332] R13: ffffc90045727bb0 R14: ffffc90045727b28 R15:
>>>>>> 800000000000006b
>>>>>> [ 436.850356] FS: 0000000000000000(0000) GS:ffff888201e6e000(0000)
>>>>>> knlGS:0000000000000000
>>>>>> [ 436.850371] CS: e030 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>>>> [ 436.850383] CR2: 00006543dbade250 CR3: 0000000115ef1000 CR4:
>>>>>> 0000000000050660
>>>>>> [ 436.850401] Call Trace:
>>>>>> [ 436.850410] <TASK>
>>>>>> [ 436.850420] vmap_pages_pud_range+0x47c/0x530
>>>>>> [ 436.850439] vmap_small_pages_range_noflush+0x1f1/0x2b0
>>>>>> [ 436.850451] ? __get_vm_area_node+0x10a/0x170
>>>>>> [ 436.850465] vmap+0x79/0xd0
>>>>>> [ 436.850476] i915_gem_object_map_page+0x13b/0x210 [i915]
>>>>>> [ 436.850812] i915_gem_object_pin_map+0x1e2/0x210 [i915]
>>>>>> [ 436.851123] i915_gem_object_pin_map_unlocked+0x2d/0xa0 [i915]
>>>>>> [ 436.851424] intel_dsb_buffer_create+0xed/0x1a0 [i915]
>>>>>> [ 436.851778] intel_dsb_prepare+0xca/0x1a0 [i915]
>>>>>> [ 436.852110] intel_atomic_dsb_finish+0x92/0x350 [i915]
>>>>>> [ 436.852456] intel_atomic_commit_tail+0x326/0xd40 [i915]
>>>>>> [ 436.852769] process_one_work+0x18d/0x380
>>>>>> [ 436.852779] worker_thread+0x196/0x300
>>>>>> [ 436.852787] ? __pfx_worker_thread+0x10/0x10
>>>>>> [ 436.852796] kthread+0xe3/0x120
>>>>>> [ 436.852805] ? __pfx_kthread+0x10/0x10
>>>>>> [ 436.852815] ret_from_fork+0x19e/0x260
>>>>>> [ 436.852824] ? __pfx_kthread+0x10/0x10
>>>>>> [ 436.852832] ret_from_fork_asm+0x1a/0x30
>>>>>> [ 436.852842] </TASK>
>>>>>> [ 436.852847] Modules linked in: snd_seq_dummy snd_hrtimer
>>>>>> snd_hda_codec_intelhdmi snd_hda_codec_hdmi snd_hda_codec_alc269
>>>>>> snd_hda_codec_realtek_lib snd_hda_scodec_component snd_hda_codec_generic
>>>>>> snd_hda_intel snd_sof_pci_intel_tgl snd_sof_pci_intel_cnl
>>>>>> snd_sof_intel_hda_generic soundwire_intel snd_sof_intel_hda_sdw_bpt
>>>>>> snd_sof_intel_hda_common snd_soc_hdac_hda snd_sof_intel_hda_mlink
>>>>>> snd_sof_intel_hda soundwire_cadence snd_sof_pci snd_sof_xtensa_dsp snd_sof
>>>>>> snd_sof_utils snd_soc_acpi_intel_match snd_soc_acpi_intel_sdca_quirks
>>>>>> soundwire_generic_allocation snd_soc_sdw_utils snd_soc_acpi crc8
>>>>>> intel_rapl_msr soundwire_bus intel_rapl_common snd_soc_sdca snd_soc_avs
>>>>>> snd_soc_hda_codec snd_hda_ext_core snd_hda_codec vfat
>>>>>> intel_uncore_frequency_common fat snd_hda_core snd_intel_dspcfg
>>>>>> snd_intel_sdw_acpi snd_hwdep intel_powerclamp snd_soc_core iwlwifi
>>>>>> snd_compress spi_nor iTCO_wdt ac97_bus intel_pmc_bxt ee1004 mtd
>>>>>> snd_pcm_dmaengine snd_seq cfg80211 snd_seq_device pcspkr spi_intel_pci
>>>>>> snd_pcm rfkill spi_intel snd_timer snd
>>>>>> [ 436.852939] i2c_i801 soundcore i2c_smbus idma64 intel_pmc_core
>>>>>> pmt_telemetry pmt_discovery pmt_class intel_hid intel_pmc_ssram_telemetry
>>>>>> intel_scu_pltdrv sparse_keymap joydev loop fuse xenfs nfnetlink
>>>>>> vsock_loopback vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport
>>>>>> vsock zram vmw_vmci lz4hc_compress lz4_compress dm_thin_pool
>>>>>> dm_persistent_data dm_bio_prison dm_crypt xe drm_ttm_helper
>>>>>> drm_suballoc_helper gpu_sched drm_gpuvm drm_exec drm_gpusvm_helper i915
>>>>>> i2c_algo_bit drm_buddy hid_multitouch i2c_hid_acpi ghash_clmulni_intel
>>>>>> video nvme wmi ttm i2c_hid nvme_core nvme_keyring drm_display_helper
>>>>>> nvme_auth xhci_pci pinctrl_tigerlake thunderbolt hkdf cec xhci_hcd
>>>>>> intel_vsec serio_raw xen_acpi_processor xen_privcmd xen_pciback
>>>>>> xen_blkback xen_gntalloc xen_gntdev xen_evtchn scsi_dh_rdac scsi_dh_emc
>>>>>> scsi_dh_alua uinput i2c_dev
>>>>>> [ 436.853183] ---[ end trace 0000000000000000 ]---
>>>>>>
>>>>>> or this:
>>>>>>
>>>>>> [ 548.736884] ------------[ cut here ]------------
>>>>>> [ 548.736907] kernel BUG at arch/x86/include/asm/xen/hypervisor.h:85!
>>>>>> [ 548.736923] Oops: invalid opcode: 0000 [#1] SMP NOPTI
>>>>>> [ 548.736935] CPU: 0 UID: 0 PID: 206 Comm: kworker/0:2 Not tainted
>>>>>> 7.0.0-0.rc1.1.qubes.1001.fc41.x86_64 #1 PREEMPT(full)
>>>>>> [ 548.736949] Hardware name: LENOVO 2347A45/2347A45, BIOS CBET4000
>>>>>> Nitrokey-v0.2.0-2608-ga649597 01/01/1970
>>>>>> [ 548.736962] Workqueue: events delayed_vfree_work
>>>>>> [ 548.736976] RIP: e030:xen_leave_lazy_mmu+0x44/0x50
>>>>>> [ 548.736989] Code: 02 03 83 f8 01 75 23 65 c7 05 6c e4 02 03 00 00 00 00
>>>>>> 65 ff 0d 7d b8 02 03 74 05 c3 cc cc cc cc e8 61 5d fd ff c3 cc cc cc cc
>>>>>> <0f> 0b 66 2e 0f 1f 84 00 00 00 00 00 90 90 90 90 90 90 90 90 90 90
>>>>>> [ 548.737010] RSP: e02b:ffffc90040607cf0 EFLAGS: 00010297
>>>>>> [ 548.737018] RAX: 0000000000000000 RBX: ffff888164a70408 RCX:
>>>>>> 0000000000000000
>>>>>> [ 548.737029] RDX: 0000000000000000 RSI: 000ffffffffff000 RDI:
>>>>>> ffff8881069c0000
>>>>>> [ 548.737039] RBP: ffffc90049681000 R08: ffffc90049681000 R09:
>>>>>> 0000000000000027
>>>>>> [ 548.737050] R10: 0000000000000027 R11: fefefefefefefeff R12:
>>>>>> ffffc90049681000
>>>>>> [ 548.737060] R13: ffff8881002fd258 R14: 0000000000000000 R15:
>>>>>> ffffc90040607dac
>>>>>> [ 548.737079] FS: 0000000000000000(0000) GS:ffff8881f88ee000(0000)
>>>>>> knlGS:0000000000000000
>>>>>> [ 548.737090] CS: e030 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>>>> [ 548.737099] CR2: 000055576c2e6058 CR3: 000000010d47b000 CR4:
>>>>>> 0000000000050660
>>>>>> [ 548.737115] Call Trace:
>>>>>> [ 548.737123] <TASK>
>>>>>> [ 548.737128] vunmap_pmd_range.isra.0+0x1f1/0x2e0
>>>>>> [ 548.737142] vunmap_p4d_range+0x17d/0x290
>>>>>> [ 548.737151] __vunmap_range_noflush+0x182/0x1d0
>>>>>> [ 548.737161] ? _raw_spin_unlock+0xe/0x30
>>>>>> [ 548.737171] remove_vm_area+0x40/0x70
>>>>>> [ 548.737180] vfree.part.0+0x1b/0x290
>>>>>> [ 548.737189] delayed_vfree_work+0x35/0x50
>>>>>> [ 548.737198] process_one_work+0x18d/0x380
>>>>>> [ 548.737207] worker_thread+0x196/0x300
>>>>>> [ 548.737215] ? __pfx_worker_thread+0x10/0x10
>>>>>> [ 548.737224] kthread+0xe3/0x120
>>>>>> [ 548.737233] ? __pfx_kthread+0x10/0x10
>>>>>> [ 548.737242] ret_from_fork+0x19e/0x260
>>>>>> [ 548.737250] ? __pfx_kthread+0x10/0x10
>>>>>> [ 548.737258] ret_from_fork_asm+0x1a/0x30
>>>>>> [ 548.737269] </TASK>
>>>>>> [ 548.737274] Modules linked in: vfat fat snd_seq_dummy snd_hrtimer ath9k
>>>>>> ath9k_common snd_hda_codec_intelhdmi snd_hda_codec_hdmi ath9k_hw
>>>>>> snd_hda_codec_alc269 snd_hda_codec_realtek_lib snd_hda_scodec_component
>>>>>> snd_hda_codec_generic snd_hda_intel snd_hda_codec mac80211 snd_hda_core
>>>>>> snd_intel_dspcfg snd_intel_sdw_acpi snd_hwdep ath snd_seq snd_seq_device
>>>>>> snd_ctl_led cfg80211 snd_pcm at24 thinkpad_acpi intel_rapl_msr i2c_i801
>>>>>> snd_timer sparse_keymap iTCO_wdt intel_rapl_common platform_profile
>>>>>> intel_powerclamp intel_pmc_bxt pcspkr i2c_smbus rfkill libarc4 snd
>>>>>> soundcore mei_me e1000e mei joydev lpc_ich loop fuse xenfs nfnetlink
>>>>>> vsock_loopback vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport
>>>>>> vsock zram vmw_vmci lz4hc_compress lz4_compress dm_thin_pool
>>>>>> dm_persistent_data dm_bio_prison dm_crypt i915 i2c_algo_bit drm_buddy
>>>>>> ghash_clmulni_intel ttm sdhci_pci drm_display_helper sdhci_uhs2 sdhci
>>>>>> video xhci_pci cqhci wmi cec xhci_hcd ehci_pci mmc_core ehci_hcd serio_raw
>>>>>> xen_acpi_processor xen_privcmd xen_pciback
>>>>>> [ 548.737348] xen_blkback xen_gntalloc xen_gntdev xen_evtchn
>>>>>> scsi_dh_rdac scsi_dh_emc scsi_dh_alua uinput i2c_dev
>>>>>> [ 548.737469] ---[ end trace 0000000000000000 ]---
>>>>>>
>>>>>> I don't have clear pattern when this happens, one was during host
>>>>>> suspend, but the other was during "normal" test run (starting/stopping
>>>>>> domUs and running stuff around them). Note also one of those is Intel
>>>>>> and the other AMD, so it isn't really hardware specific.
>>>>>>
>>>>>> Slightly more details with links (especially serial0.txt in the logs
>>>>>> tab) at
>>>>>> https://github.com/QubesOS/qubes-linux-kernel/
>>>>>> pull/662#issuecomment-3963326188
>>>>>>
>>>>>> Any idea?
>>>>>>
>>>>> That looks like the issue Juergen fixed with:
>>>>>
>>>>> https://lore.kernel.org/xen-devel/20260220123715.834848-1-jgross@suse.com/
>>>> No, it doesn't. The fix is already in rc1, and the crash was quite early during
>>>> boot (before any secondary CPUs were brought up).
>>>>
>>>> I guess this problem is related to the lazy_mmu_state series [1].
>>
>> That may well be the case - it seems that xen_enter_lazy_mmu() is called
>> while already in lazy MMU mode (first splat), and xen_leave_lazy_mmu()
>> is called without being in lazy MMU mode (second splat). I expect this
>> is something specific to Xen, which I didn't get the chance to test.
>
> Looking into this again.
>
> I think the main problem is the call of arch_end_context_switch() in
> __switch_to(). For xen this is xen_end_context_switch() and it is doing:
>
> if (__task_lazy_mmu_mode_active(next))
> arch_enter_lazy_mmu_mode()
>
> But this is wrong here, as current hasn't been switched to "next" yet.
>
> I don't think we can just move the call of arch_end_context_switch(), as
> it is needed for issuing the context switch related hypercall for switching
> all the needed non-MMU settings.
>
> What we probably really want is to call lazy_mmu_mode_pause() before the
> call of arch_start_context_switch() and later call lazy_mmu_mode_resume()
> after switching context to next. In xen_start_context_switch() and
> xen_end_context_switch() the lazy mmu mode handling should be removed.
>
> I will test that tomorrow, unless someone talks me out of it. :-)
That wasn't it, as the reasoning was wrong.

But now I think I have found the real culprit in lazy_mmu_mode_enable():

static inline void lazy_mmu_mode_enable(void)
{
	struct lazy_mmu_state *state = &current->lazy_mmu_state;

	if (in_interrupt() || state->pause_count > 0)
		return;

	VM_WARN_ON_ONCE(state->enable_count == U8_MAX);

	if (state->enable_count++ == 0)
		arch_enter_lazy_mmu_mode();
}

Consider a preemption just before calling arch_enter_lazy_mmu_mode(). The
enable_count will be 1 now, but there was no switch to lazy mode yet.

When the task becomes active again, context switch handling will see lazy
mode enabled (enable_count > 0), so it will call arch_enter_lazy_mmu_mode().
And then the task resumes and is calling arch_enter_lazy_mmu_mode() another
time.

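To make the window concrete, here is a minimal userspace model of that sequence. It is only a sketch: every name in it (task_model, cpu_lazy, model_arch_enter(), model_context_switch_roundtrip(), simulate_enable()) is made up for illustration and is not kernel code. It just counts how often the "enter" hook runs while the CPU is already in lazy mode, which is the condition the BUG in the first splat appears to be catching.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model, not kernel code. */
struct task_model {
	unsigned int enable_count;	/* models current->lazy_mmu_state.enable_count */
};

static bool cpu_lazy;		/* models the per-CPU "in lazy MMU mode" state */
static int nested_enters;	/* enters seen while already in lazy mode */

/* Models arch_enter_lazy_mmu_mode(); a nested enter is only counted here. */
static void model_arch_enter(void)
{
	if (cpu_lazy)
		nested_enters++;
	cpu_lazy = true;
}

/* Models the task being switched out and later switched back in. */
static void model_context_switch_roundtrip(struct task_model *t)
{
	cpu_lazy = false;		/* switch out: the CPU leaves lazy mode */
	if (t->enable_count > 0)	/* switch in: the task looks lazy */
		model_arch_enter();
}

/* Runs one enable call and returns the number of nested enters. */
int simulate_enable(bool preempt_in_window)
{
	struct task_model t = { 0 };

	cpu_lazy = false;
	nested_enters = 0;

	if (t.enable_count++ == 0) {
		if (preempt_in_window)
			model_context_switch_roundtrip(&t);	/* preempted here */
		model_arch_enter();
	}

	assert(cpu_lazy);	/* either way the task ends up in lazy mode */
	return nested_enters;
}
```

In this model simulate_enable(false) returns 0, while simulate_enable(true) returns 1: the counter increment is visible to the switch-in path before the task itself has entered lazy mode.
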
The only chance I'm seeing to avoid that would be to disable preemption
around all instances of testing a condition and then enabling or disabling
lazy mmu mode.
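
A rough userspace sketch of that idea, again with purely hypothetical names (preempt_count here is a plain counter standing in for preempt_disable()/preempt_enable(), and model_preemption_point() stands in for the scheduler). It is not a proposed patch, only an illustration that the test and the mode switch become one unit once preemption cannot happen in between.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model, not kernel code. */
static unsigned int enable_count;	/* models state->enable_count */
static unsigned int preempt_count;	/* models the preempt-disable depth */
static bool cpu_lazy;			/* models the per-CPU lazy MMU state */
static int nested_enters;		/* enters seen while already lazy */

static void model_arch_enter(void)
{
	if (cpu_lazy)
		nested_enters++;
	cpu_lazy = true;
}

/* A point where the scheduler would like to preempt the task. */
static void model_preemption_point(void)
{
	if (preempt_count > 0)
		return;			/* preemption disabled: nothing happens */
	cpu_lazy = false;		/* switch out */
	if (enable_count > 0)
		model_arch_enter();	/* switch in re-enters lazy mode */
}

/* Enable with the test and the mode switch under "preempt_disable()". */
int simulate_enable_guarded(void)
{
	enable_count = 0;
	preempt_count = 0;
	cpu_lazy = false;
	nested_enters = 0;

	preempt_count++;			/* models preempt_disable() */
	if (enable_count++ == 0) {
		model_preemption_point();	/* the former race window */
		model_arch_enter();
	}
	preempt_count--;			/* models preempt_enable() */

	model_preemption_point();		/* a later preemption is harmless */

	assert(preempt_count == 0);
	return nested_enters;
}
```

With the guard in place the model reports no nested enter, whether or not the scheduler tries to preempt inside the former window.
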
Juergen
* Re: kernel BUG around vmap/vfree - xen_enter_lazy_mmu()/xen_leave_lazy_mmu() - Linux 7.0-rc1
2026-05-08 8:53 ` Juergen Gross
@ 2026-05-08 9:54 ` Kevin Brodsky
2026-05-08 10:09 ` Jürgen Groß
0 siblings, 1 reply; 14+ messages in thread
From: Kevin Brodsky @ 2026-05-08 9:54 UTC (permalink / raw)
To: Juergen Gross, Marek Marczykowski-Górecki
Cc: Andrew Cooper, xen-devel, Boris Ostrovsky
On 08/05/2026 10:53, Juergen Gross wrote:
> [...]
>
> But now I think I have found the real culprit in lazy_mmu_mode_enable():
>
> static inline void lazy_mmu_mode_enable(void)
> {
> struct lazy_mmu_state *state = &current->lazy_mmu_state;
>
> if (in_interrupt() || state->pause_count > 0)
> return;
>
> VM_WARN_ON_ONCE(state->enable_count == U8_MAX);
>
> if (state->enable_count++ == 0)
> arch_enter_lazy_mmu_mode();
> }
>
> Consider a preemption just before calling arch_enter_lazy_mmu_mode(). The
> enable_count will be 1 now, but there was no switch to lazy mode yet.
>
> When the task becomes active again, context switch handling will see lazy
> mode enabled (enable_count > 0), so it will call
> arch_enter_lazy_mmu_mode().
> And then the task resumes and is calling arch_enter_lazy_mmu_mode()
> another
> time.
Agreed, this must be the problem. I did wonder whether the lack of
atomicity would cause trouble...

arm64 isn't impacted because it tracks related state in task_struct
only. powerpc and sparc do use percpu variables but that shouldn't
matter as they disable preemption in the entire lazy MMU section.

>
> The only chance I'm seeing to avoid that would be to disable preemption
> around all instances of testing a condition and then enabling or
> disabling
> lazy mmu mode.
I don't immediately see why we would need such a big hammer. If we
revert commit 291b3abed657 ("x86/xen: use lazy_mmu_state when
context-switching"), then arch_{start,end}_context_switch() should once
again do the right thing for Xen since the TIF_LAZY_MMU_UPDATES flag is
separate from lazy_mmu_state. I think it looks like this:

lazy_mmu_mode_enable()
  state->enable_count++
  <PREEMPT>
    arch_start_context_switch()
      xen_lazy_mode == XEN_LAZY_NONE -> do nothing

    <other task runs; this task is scheduled again>

    arch_end_context_switch()
      TIF_LAZY_MMU_UPDATES not set -> do nothing

  <exception return>
  enter_lazy(XEN_LAZY_MMU)

Nothing else should be checking lazy MMU state during the context switch.

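The flow above can be modelled in userspace roughly as follows. This is a sketch under assumptions: the names (model_task, cpu_mode, model_start_context_switch(), model_end_context_switch(), simulate_flag_based_switch()) are hypothetical, and the switch hooks are simplified to the one behaviour described here, namely that they look only at the per-CPU mode and a per-task flag and never at enable_count.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model, not kernel code. */
enum model_lazy_mode { MODEL_LAZY_NONE, MODEL_LAZY_MMU };

struct model_task {
	unsigned int enable_count;	/* generic counter, ignored by the switch hooks */
	bool tif_lazy_mmu_updates;	/* models TIF_LAZY_MMU_UPDATES */
};

static enum model_lazy_mode cpu_mode;	/* models the per-CPU xen_lazy_mode */
static int nested_enters;		/* enters seen while already lazy */

static void model_enter_lazy_mmu(void)
{
	if (cpu_mode != MODEL_LAZY_NONE)
		nested_enters++;
	cpu_mode = MODEL_LAZY_MMU;
}

/* Switch out: remember lazy mode in the flag only if it was really active. */
static void model_start_context_switch(struct model_task *prev)
{
	if (cpu_mode == MODEL_LAZY_MMU) {
		cpu_mode = MODEL_LAZY_NONE;
		prev->tif_lazy_mmu_updates = true;
	}
}

/* Switch in: re-enter lazy mode only if the flag was set at switch out. */
static void model_end_context_switch(struct model_task *next)
{
	if (next->tif_lazy_mmu_updates) {
		next->tif_lazy_mmu_updates = false;
		model_enter_lazy_mmu();
	}
}

int simulate_flag_based_switch(bool preempt_in_window)
{
	struct model_task t = { 0, false };

	cpu_mode = MODEL_LAZY_NONE;
	nested_enters = 0;

	if (t.enable_count++ == 0) {
		if (preempt_in_window) {	/* preempted before the enter */
			model_start_context_switch(&t);
			model_end_context_switch(&t);
		}
		model_enter_lazy_mmu();
	}

	/* A later preemption, while lazy mode really is active. */
	model_start_context_switch(&t);
	model_end_context_switch(&t);

	assert(cpu_mode == MODEL_LAZY_MMU);
	return nested_enters;
}
```

In this model both simulate_flag_based_switch(false) and simulate_flag_based_switch(true) return 0, because a preemption inside the window leaves the flag clear and the switch-in path has nothing to restore.
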
Does that make sense?
- Kevin
* Re: kernel BUG around vmap/vfree - xen_enter_lazy_mmu()/xen_leave_lazy_mmu() - Linux 7.0-rc1
2026-05-08 9:54 ` Kevin Brodsky
@ 2026-05-08 10:09 ` Jürgen Groß
2026-05-08 10:28 ` Andrew Cooper
2026-05-08 11:34 ` Kevin Brodsky
0 siblings, 2 replies; 14+ messages in thread
From: Jürgen Groß @ 2026-05-08 10:09 UTC (permalink / raw)
To: Kevin Brodsky, Marek Marczykowski-Górecki
Cc: Andrew Cooper, xen-devel, Boris Ostrovsky
On 08.05.26 11:54, Kevin Brodsky wrote:
> On 08/05/2026 10:53, Juergen Gross wrote:
>> [...]
>>
>> But now I think I have found the real culprit in lazy_mmu_mode_enable():
>>
>> static inline void lazy_mmu_mode_enable(void)
>> {
>> struct lazy_mmu_state *state = &current->lazy_mmu_state;
>>
>> if (in_interrupt() || state->pause_count > 0)
>> return;
>>
>> VM_WARN_ON_ONCE(state->enable_count == U8_MAX);
>>
>> if (state->enable_count++ == 0)
>> arch_enter_lazy_mmu_mode();
>> }
>>
>> Consider a preemption just before calling arch_enter_lazy_mmu_mode(). The
>> enable_count will be 1 now, but there was no switch to lazy mode yet.
>>
>> When the task becomes active again, context switch handling will see lazy
>> mode enabled (enable_count > 0), so it will call
>> arch_enter_lazy_mmu_mode().
>> And then the task resumes and is calling arch_enter_lazy_mmu_mode()
>> another
>> time.
>
> Agreed, this must be the problem. I did wonder whether the lack of
> atomicity would cause trouble...
>
> arm64 isn't impacted because it tracks related state in task_struct
> only. powerpc and sparc do use percpu variables but that shouldn't
> matter as they disable preemption in the entire lazy MMU section.
>
>>
>> The only chance I'm seeing to avoid that would be to disable preemption
>> around all instances of testing a condition and then enabling or
>> disabling
>> lazy mmu mode.
>
> I don't immediately see why we would need such a big hammer. If we
> revert commit 291b3abed657 ("x86/xen: use lazy_mmu_state when
> context-switching"), then arch_{start,end}_context_switch() should once
> again do the right thing for Xen since the TIF_LAZY_MMU_UPDATES flag is
> separate from lazy_mmu_state. I think it looks like this:
>
> lazy_mmu_mode_enable()
> state->enable_count++
> <PREEMPT>
> arch_start_context_switch()
> xen_lazy_mode == XEN_LAZY_NONE -> do nothing
>
> <other task runs; this task is scheduled again>
>
> arch_end_context_switch()
> TIF_LAZY_MMU_UPDATES not set -> do nothing
>
> <exception return>
> enter_lazy(XEN_LAZY_MMU)
>
> Nothing else should be checking lazy MMU state during the context switch.
>
> Does that make sense?
This would work, yes.

OTOH I don't like the multiple conditions used for testing (state->enable_count,
TIF_LAZY_MMU_UPDATES, xen_lazy_mode).

Another variant would be to just let the Xen specific code tolerate the double
calls by disabling preemption in the Xen code and checking via
__task_lazy_mmu_mode_active() if anything needs to be done.

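One possible reading of this variant, as a userspace sketch only: the enter hook becomes idempotent, so a second call is a no-op instead of a BUG(). All names are hypothetical (model_tolerant_enter(), simulate_tolerant_double_enter()), preempt_count is a plain counter standing in for preempt_disable()/preempt_enable(), and the check against the task state is reduced to a single boolean, which is an assumption about how such a check might look rather than a description of it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model, not kernel code. */
static bool cpu_lazy;			/* models the per-CPU Xen lazy state */
static unsigned int preempt_count;	/* models the preempt-disable depth */
static int real_transitions;		/* enters that actually changed state */

/* Tolerant enter: do the work only if lazy mode is not active yet. */
static void model_tolerant_enter(void)
{
	preempt_count++;		/* models preempt_disable() */
	if (!cpu_lazy) {
		cpu_lazy = true;
		real_transitions++;
	}
	preempt_count--;		/* models preempt_enable() */
}

int simulate_tolerant_double_enter(void)
{
	cpu_lazy = false;
	preempt_count = 0;
	real_transitions = 0;

	model_tolerant_enter();		/* issued by the switch-in path */
	model_tolerant_enter();		/* issued again by the resumed task */

	assert(preempt_count == 0);
	return real_transitions;
}
```

Here the two back-to-back enters from the racy sequence collapse into a single real transition.
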
I'd really like to get rid of xen_lazy_mode completely.
Juergen
* Re: kernel BUG around vmap/vfree - xen_enter_lazy_mmu()/xen_leave_lazy_mmu() - Linux 7.0-rc1
2026-05-08 10:09 ` Jürgen Groß
@ 2026-05-08 10:28 ` Andrew Cooper
2026-05-08 11:34 ` Kevin Brodsky
1 sibling, 0 replies; 14+ messages in thread
From: Andrew Cooper @ 2026-05-08 10:28 UTC (permalink / raw)
To: Jürgen Groß, Kevin Brodsky,
Marek Marczykowski-Górecki
Cc: Andrew Cooper, xen-devel, Boris Ostrovsky
On 08/05/2026 11:09 am, Jürgen Groß wrote:
> On 08.05.26 11:54, Kevin Brodsky wrote:
>> On 08/05/2026 10:53, Juergen Gross wrote:
>>> [...]
>>>
>>> But now I think I have found the real culprit in
>>> lazy_mmu_mode_enable():
>>>
>>> static inline void lazy_mmu_mode_enable(void)
>>> {
>>> struct lazy_mmu_state *state = &current->lazy_mmu_state;
>>>
>>> if (in_interrupt() || state->pause_count > 0)
>>> return;
>>>
>>> VM_WARN_ON_ONCE(state->enable_count == U8_MAX);
>>>
>>> if (state->enable_count++ == 0)
>>> arch_enter_lazy_mmu_mode();
>>> }
>>>
>>> Consider a preemption just before calling
>>> arch_enter_lazy_mmu_mode(). The
>>> enable_count will be 1 now, but there was no switch to lazy mode yet.
>>>
>>> When the task becomes active again, context switch handling will see
>>> lazy
>>> mode enabled (enable_count > 0), so it will call
>>> arch_enter_lazy_mmu_mode().
>>> And then the task resumes and is calling arch_enter_lazy_mmu_mode()
>>> another
>>> time.
>>
>> Agreed, this must be the problem. I did wonder whether the lack of
>> atomicity would cause trouble...
>>
>> arm64 isn't impacted because it tracks related state in task_struct
>> only. powerpc and sparc do use percpu variables but that shouldn't
>> matter as they disable preemption in the entire lazy MMU section.
>>
>>>
>>> The only chance I'm seeing to avoid that would be to disable preemption
>>> around all instances of testing a condition and then enabling or
>>> disabling
>>> lazy mmu mode.
>>
>> I don't immediately see why we would need such a big hammer. If we
>> revert commit 291b3abed657 ("x86/xen: use lazy_mmu_state when
>> context-switching"), then arch_{start,end}_context_switch() should once
>> again do the right thing for Xen since the TIF_LAZY_MMU_UPDATES flag is
>> separate from lazy_mmu_state. I think it looks like this:
>>
>> lazy_mmu_mode_enable()
>> state->enable_count++
>> <PREEMPT>
>> arch_start_context_switch()
>> xen_lazy_mode == XEN_LAZY_NONE -> do nothing
>> <other task runs; this task is scheduled again>
>>
>> arch_end_context_switch()
>> TIF_LAZY_MMU_UPDATES not set -> do nothing
>>
>> <exception return>
>> enter_lazy(XEN_LAZY_MMU)
>>
>> Nothing else should be checking lazy MMU state during the context
>> switch.
>>
>> Does that make sense?
>
> This would work, yes.
>
> OTOH I don't like the multiple conditions used for testing
> (state->enable_count,
> TIF_LAZY_MMU_UPDATES, xen_lazy_mode).
>
> Another variant would be to just let the Xen specific code tolerate
> the double
> calls by disabling preemption in the Xen code and checking via
> __task_lazy_mmu_mode_active() if anything needs to be done.
>
> I'd really like to get rid of xen_lazy_mode completely.
Without wishing to interrupt the flow too much.

In XenServer, work on migration performance[1] has demonstrated that a
very large number of multicalls issued by Linux are single-op multicalls.

(I blindly assert) these must be coming from the lazy_mode logic, and
they're even less efficient than making the hypercall normally, owing to
the need to marshal it through the multicall ABI.

There's a possibility that you can simply delete lazy mode and stuff
gets faster. (Although it's far more likely that the difference is in
the noise).

~Andrew

[1] The dominating perf problem for migration is ptwr emulation and
Linux not using a hypercall, which IIRC accounts for 40% of wallclock
time during live migration.
* Re: kernel BUG around vmap/vfree - xen_enter_lazy_mmu()/xen_leave_lazy_mmu() - Linux 7.0-rc1
2026-05-08 10:09 ` Jürgen Groß
2026-05-08 10:28 ` Andrew Cooper
@ 2026-05-08 11:34 ` Kevin Brodsky
1 sibling, 0 replies; 14+ messages in thread
From: Kevin Brodsky @ 2026-05-08 11:34 UTC (permalink / raw)
To: Jürgen Groß, Marek Marczykowski-Górecki
Cc: Andrew Cooper, xen-devel, Boris Ostrovsky
On 08/05/2026 12:09, Jürgen Groß wrote:
>
> OTOH I don't like the multiple conditions used for testing
> (state->enable_count,
> TIF_LAZY_MMU_UPDATES, xen_lazy_mode).
>
> Another variant would be to just let the Xen specific code tolerate
> the double
> calls by disabling preemption in the Xen code and checking via
> __task_lazy_mmu_mode_active() if anything needs to be done.
>
> I'd really like to get rid of xen_lazy_mode completely.
That certainly crossed my mind, but I didn't feel qualified to perform
that kind of surgery, especially considering XEN_LAZY_CPU. There is
presumably a good reason to track this one via a percpu variable, but
for the MMU side it feels like this creates more problems than it
solves. Maybe it is possible to keep XEN_LAZY_CPU untouched while
removing XEN_LAZY_MMU and using is_lazy_mmu_mode_active() instead? If we
do that, I don't think preemption is a concern - the lazy MMU mode is
only relevant for current and cannot be used in interrupt context.
- Kevin
Thread overview: 14+ messages
2026-02-26 13:17 kernel BUG around vmap/vfree - xen_enter_lazy_mmu()/xen_leave_lazy_mmu() - Linux 7.0-rc1 Marek Marczykowski-Górecki
2026-02-26 13:27 ` Andrew Cooper
2026-02-26 13:36 ` Marek Marczykowski-Górecki
2026-02-26 13:41 ` Jürgen Groß
2026-04-05 9:41 ` Marek Marczykowski-Górecki
2026-04-07 9:23 ` Kevin Brodsky
2026-04-08 2:47 ` Marek Marczykowski-Górecki
2026-04-08 10:38 ` Kevin Brodsky
2026-05-07 16:31 ` Jürgen Groß
2026-05-08 8:53 ` Juergen Gross
2026-05-08 9:54 ` Kevin Brodsky
2026-05-08 10:09 ` Jürgen Groß
2026-05-08 10:28 ` Andrew Cooper
2026-05-08 11:34 ` Kevin Brodsky