From: bugzilla-daemon@bugzilla.kernel.org
To: kvm@vger.kernel.org
Subject: [Bug 207489] New: Kernel panic due to Lazy update IOAPIC EOI on an x86_64 *host*, when two (or more) PCI devices from different IOMMU groups are passed to Windows 10 guest, upon guest boot into Windows, with more than 4 VCPUs
Date: Tue, 28 Apr 2020 20:55:17 +0000
X-Mailing-List: kvm@vger.kernel.org

https://bugzilla.kernel.org/show_bug.cgi?id=207489

Bug ID: 207489
Summary: Kernel panic due to Lazy update IOAPIC EOI on an x86_64 *host*, when two (or more) PCI devices from different IOMMU groups are passed to Windows 10 guest, upon guest boot into Windows, with more than 4 VCPUs
Product: Virtualization
Version: unspecified
Kernel Version: 5.5.0-07987-gf458d039db7e
Hardware: Intel
OS: Linux
Tree: Mainline
Status: NEW
Severity: high
Priority: P1
Component: kvm
Assignee: virtualization_kvm@kernel-bugs.osdl.org
Reporter: linux-kernel@polvanaubel.com
CC: kvm@vger.kernel.org
Regression: Yes

Created attachment 288795
  --> https://bugzilla.kernel.org/attachment.cgi?id=288795&action=edit
.config used to build the bugged kernel

### Summary
Kernel panic due to Lazy update IOAPIC EOI on an x86_64 *host*, when two (or more) PCI devices from different IOMMU groups are passed to Windows 10 guest, upon guest boot into Windows, with more than 4 VCPUs.

Commit introducing problem: I've bisected this to commit f458d039db7e8518041db4169d657407e3217008

### Full description
I run Windows 10 (64 bit Education edition) virtualized in KVM/Qemu/Libvirt, with three PCI(e) devices passed through:
- One AMD RX590 graphics card (GPU & audio card together in one IOMMU group, both passed through, no other devices in IOMMU group)
- One Intel C610/X99 chipset HD audio controller (mainboard on-board sound device, alone in one IOMMU group)
- One ASMedia Technology ASM1142 USB 3.1 Host controller (PCIe USB3.1 card, alone in one IOMMU group)

The host CPU is an Intel i7-5820k; 6 cores with hyperthreading enabled. Mainboard is Asrock X99 Extreme6/3.1.
The guest gets 8 to 10 VCPUs, pinned to certain CPU cores, in a single-socket, 2-threads-per-core topology. I want to emphasize that this setup has worked for years and is confirmed to work fine.

On kernel 5.6, my Windows 10 guest VMs (tested on two separate installations with different VM configs) panic the host during guest boot. It's a kernel panic where literally everything freezes. X does not crash with a kernel panic visible afterwards; i3, showing the clock and CPU utilization bars, freezes completely. Switching to another virtual console shows the kernel panic. The panic is not written to the log, because the kernel detects a corrupted stack. I've therefore transcribed it to the best of my ability, but the top of the panic does not show on my screens and I don't have access to a serial console.

This happens consistently, every time, slightly after the "spinning dots in a circle" appear that indicate the guest boot has moved from TianoCore to the actual booting of Windows 10.

Further investigation reveals that the guests boot fine if I remove all USB and PCI devices passed to the VM. Re-adding each PCI device individually (with two devices for the AMD GPU & soundcard in the same IOMMU group) does not cause the host to lock up. Adding any combination of two PCI devices in separate IOMMU groups caused the host to lock up on kernels 5.6.7, 5.6.2, and 5.6. Kernels 5.4.35-lts, 5.5.5, and 5.5.13 do not exhibit this problem.

I have bisected between 5.5.13 and 5.6 (which went through the common 5.5 ancestor) using a config known to lock up on bad kernels, with just the HD audio controller and the ASM1142 USB controller passed through. If the VM locks up the host, the revision is bad; if the VM boots without locking up the host, the revision is good.
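For illustration only, the good/bad classification described above amounts to a binary search for the first bad revision. The sketch below assumes a linear, oldest-to-newest list of revisions; `git bisect` itself works on the full commit graph. The names `first_bad`, `is_bad`, and the `revN` labels are hypothetical; `is_bad` stands in for "boot the test VM and see whether the host locks up".

```python
from typing import Callable, Sequence


def first_bad(revisions: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the first revision for which is_bad() is true.

    Assumes revisions are ordered oldest to newest, the first one is good,
    the last one is bad, and the behaviour never flips back to good.
    """
    lo, hi = 0, len(revisions) - 1  # lo is known good, hi is known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(revisions[mid]):
            hi = mid  # the first bad revision is at mid or earlier
        else:
            lo = mid  # the first bad revision is after mid
    return revisions[hi]


# Hypothetical linear history in which everything from rev11 onwards is bad.
history = [f"rev{i}" for i in range(16)]
culprit = first_bad(history, lambda rev: int(rev[3:]) >= 11)
print(culprit)
```

Each test halves the remaining range, which is why a range of thousands of commits can be narrowed down in roughly a dozen boots.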
The commit introducing this behaviour is

f458d039db7e8518041db4169d657407e3217008 kvm: ioapic: Lazy update IOAPIC EOI

What I absolutely don't get is how this commit, which seems specific to AMD chip(set)s, somehow triggers a kernel panic on my *entirely Intel* stack.

Further testing on the kernel built from this commit also revealed that I can only reproduce the issue if I enable more than 4 VCPUs (i.e. more than 2 hyperthreaded cores); the smallest topology that triggers the panic is 1 socket, 3 cores, 2 threads per core. See the included libvirt XML for that topology. The bug is also triggered with 4 and 5 cores, and with VCPU pinning.

### Keywords
KVM, virtualization, kernel, lazy update IOAPIC EOI

### Kernel Information:
Kernel version from /proc/version:
Linux version 5.5.0-07987-gf458d039db7e (pol@victorinox) (gcc version 9.3.0 (Arch Linux 9.3.0-1)) #16 SMP PREEMPT Tue Apr 28 19:43:19 CEST 2020

Kernel .config: See attached .config

### Most recent known good kernels & commit to blame
Kernels 5.4.35-lts, 5.5.5, and 5.5.13 do not exhibit this problem.
Bisected the problem to f458d039db7e8518041db4169d657407e3217008, between v5.5 and v5.6

Bisect log:
git bisect start
# good: [fe5ae687d01e74854ed33666c932a9c11e22139c] Linux 5.5.13
git bisect good fe5ae687d01e74854ed33666c932a9c11e22139c
# bad: [7111951b8d4973bda27ff663f2cf18b663d15b48] Linux 5.6
git bisect bad 7111951b8d4973bda27ff663f2cf18b663d15b48
# good: [d5226fa6dbae0569ee43ecfc08bdcd6770fc4755] Linux 5.5
git bisect good d5226fa6dbae0569ee43ecfc08bdcd6770fc4755
# good: [9f68e3655aae6d49d6ba05dd263f99f33c2567af] Merge tag 'drm-next-2020-01-30' of git://anongit.freedesktop.org/drm/drm
git bisect good 9f68e3655aae6d49d6ba05dd263f99f33c2567af
# bad: [469030d454bd1620c7b2651d9ec8cdcbaa74deb9] Merge tag 'armsoc-soc' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
git bisect bad 469030d454bd1620c7b2651d9ec8cdcbaa74deb9
# good: [f4a6365ae88d38528b4eec717326dab877b515ea] Merge tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux
git bisect good f4a6365ae88d38528b4eec717326dab877b515ea
# good: [e310396bb8d7db977a0e10ef7b5040e98b89c34c] Merge tag 'trace-v5.6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
git bisect good e310396bb8d7db977a0e10ef7b5040e98b89c34c
# bad: [ed39ba0ec1156407040e7509cb19299b5dda3815] Merge tag 'acpi-5.6-rc1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
git bisect bad ed39ba0ec1156407040e7509cb19299b5dda3815
# bad: [9b7fa2880fe716a30d2359d40d12ec4bc69ec7b5] Merge tag 'xtensa-20200206' of git://github.com/jcmvbkbc/linux-xtensa
git bisect bad 9b7fa2880fe716a30d2359d40d12ec4bc69ec7b5
# good: [750ce8ccd8a875ed9410fab01a3f468dab692eb4] Merge tag 'sound-fix-5.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
git bisect good 750ce8ccd8a875ed9410fab01a3f468dab692eb4
# bad: [a83502314ce303c6341b249c41121759c7477ba1] x86/kvm/hyper-v: don't allow to turn on unsupported VMX controls for nested guests
git bisect bad a83502314ce303c6341b249c41121759c7477ba1
# good: [1ec2405c7cbf3afa7598c6b7546c81aa0cac78dc] kvm: ioapic: Refactor kvm_ioapic_update_eoi()
git bisect good 1ec2405c7cbf3afa7598c6b7546c81aa0cac78dc
# bad: [7df003c85218b5f5b10a7f6418208f31e813f38f] KVM: fix overflow of zero page refcount with ksm running
git bisect bad 7df003c85218b5f5b10a7f6418208f31e813f38f
# bad: [33aabd029ffbafe314dad4763dadbc23d71296eb] KVM: nVMX: delete meaningless nested_vmx_run() declaration
git bisect bad 33aabd029ffbafe314dad4763dadbc23d71296eb
# bad: [e8ef2a19a051b755b0b9973ef1b3f81e895e2bce] KVM: SVM: allow AVIC without split irqchip
git bisect bad e8ef2a19a051b755b0b9973ef1b3f81e895e2bce
# bad: [f458d039db7e8518041db4169d657407e3217008] kvm: ioapic: Lazy update IOAPIC EOI
git bisect bad f458d039db7e8518041db4169d657407e3217008
# first bad commit: [f458d039db7e8518041db4169d657407e3217008] kvm: ioapic: Lazy update IOAPIC EOI

### Output of kernel panic:
--- The top is, unfortunately, impossible to see or reliably catch on camera, and the panic is not written out ---
--- I have a single, potentially useful, camera frame from a recording, attached as kernelpanictop.jpg ---
--- But that is rather shifted and unclear ---
--- part that was static on-screen from here ---
[ 545.??????] irqfd_resampler_ack+0x32/0x90 [kvm]
[ 545.??????] kvm_notify_acked_irq+0x61/0xf0 [kvm]
[ 545.??????] kvm_ioapic_update_eoi_one.isra.0+0x3b/0x150 [kvm]
[ 545.??????] ioapic_set_irq+0x1fc/0x220 [kvm]
[ 545.??????] kvm_ioapic_set_irq+0x62/0x90 [kvm]
[ 545.??????] kvm_set_irq+0xc8/0x180 [kvm]
[ 545.??????] ? kvm_hv_set_sint+0x20/0x20 [kvm]
[ 545.??????] ? kvm_set_ioapic_irq+0x20/0x20 [kvm]
[ 545.??????] irqfd_resampler_ack+0x32/0x90 [kvm]
[ 545.??????] kvm_notify_acked_irq+0x61/0xf0 [kvm]
[ 545.??????] kvm_ioapic_update_eoi_one.isra.0+0x3b/0x150 [kvm]
[ 545.??????] ioapic_set_irq+0x1fc/0x220 [kvm]
[ 545.??????] kvm_ioapic_set_irq+0x62/0x90 [kvm]
[ 545.??????] kvm_set_irq+0xc8/0x180 [kvm]
[ 545.??????] ? kvm_hv_set_sint+0x20/0x20 [kvm]
[ 545.??????] ? kvm_set_ioapic_irq+0x20/0x20 [kvm]
[ 545.??????] irqfd_resampler_ack+0x32/0x90 [kvm]
[ 545.??????] kvm_notify_acked_irq+0x61/0xf0 [kvm]
[ 545.??????] kvm_ioapic_update_eoi_one.isra.0+0x3b/0x150 [kvm]
[ 545.??????] ioapic_set_irq+0x1fc/0x220 [kvm]
[ 545.??????] kvm_ioapic_set_irq+0x62/0x90 [kvm]
[ 545.??????] kvm_set_irq+0xc8/0x180 [kvm]
[ 545.??????] ? kvm_hv_set_sint+0x20/0x20 [kvm]
[ 545.??????] ? kvm_set_ioapic_irq+0x20/0x20 [kvm]
[ 545.??????] irqfd_resampler_ack+0x32/0x90 [kvm]
[ 545.??????] kvm_notify_acked_irq+0x61/0xf0 [kvm]
[ 545.??????] kvm_ioapic_update_eoi_one.isra.0+0x3b/0x150 [kvm]
[ 545.??????] ioapic_set_irq+0x1fc/0x220 [kvm]
[ 545.??????] kvm_ioapic_set_irq+0x62/0x90 [kvm]
[ 545.??????] kvm_set_irq+0xc8/0x180 [kvm]
[ 545.??????] ? kvm_hv_set_sint+0x20/0x20 [kvm]
[ 545.??????] ? kvm_set_ioapic_irq+0x20/0x20 [kvm]
[ 545.??????] irqfd_resampler_ack+0x32/0x90 [kvm]
[ 545.??????] kvm_notify_acked_irq+0x61/0xf0 [kvm]
[ 545.??????] kvm_ioapic_update_eoi_one.isra.0+0x3b/0x150 [kvm]
[ 545.??????] ioapic_set_irq+0x1fc/0x220 [kvm]
[ 545.??????] kvm_ioapic_set_irq+0x62/0x90 [kvm]
[ 545.??????] kvm_set_irq+0xc8/0x180 [kvm]
[ 545.??????] ? kvm_hv_set_sint+0x20/0x20 [kvm]
[ 545.??????] ? kvm_set_ioapic_irq+0x20/0x20 [kvm]
[ 545.??????] irqfd_resampler_ack+0x32/0x90 [kvm]
[ 545.??????] kvm_notify_acked_irq+0x61/0xf0 [kvm]
[ 545.??????] kvm_ioapic_update_eoi_one.isra.0+0x3b/0x150 [kvm]
[ 545.??????] ioapic_set_irq+0x1fc/0x220 [kvm]
[ 545.??????] kvm_ioapic_set_irq+0x62/0x90 [kvm]
[ 545.??????] kvm_set_irq+0xc8/0x180 [kvm]
[ 545.??????] ? kvm_hv_set_sint+0x20/0x20 [kvm]
[ 545.??????] ? kvm_set_ioapic_irq+0x20/0x20 [kvm]
[ 545.??????] irqfd_resampler_ack+0x32/0x90 [kvm]
[ 545.??????] kvm_notify_acked_irq+0x61/0xf0 [kvm]
[ 545.??????] kvm_ioapic_update_eoi_one.isra.0+0x3b/0x150 [kvm]
[ 545.??????] ioapic_set_irq+0x1fc/0x220 [kvm]
[ 545.??????] kvm_ioapic_set_irq+0x62/0x90 [kvm]
[ 545.??????] kvm_set_irq+0xc8/0x180 [kvm]
[ 545.??????] ? kvm_hv_set_sint+0x20/0x20 [kvm]
[ 545.??????] ? kvm_set_ioapic_irq+0x20/0x20 [kvm]
[ 545.??????] kvm_vm_ioctl_irq_line+0x23/0x30 [kvm]
[ 545.??????] kvm_vm_ioctl+0x28a/0xc10 [kvm]
[ 545.??????] ksys_ioctl+0x87/0xc0
[ 545.??????] __x64_sys_ioctl+0x16/0x20
[ 545.??????] do_syscall_64+0x4e/0x150
[ 545.??????] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 545.??????] RIP: 0033:0x7f990c1642eb
[ 545.??????] Code: 0f 1e fa 48 8b 05 a5 8b 0c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 75 8b 0c 00 f7 d8 64 89 01 48
[ 545.??????] RSP: 002b:00007f9908979c08 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 545.??????] RAX: ffffffffffffffda RBX: 00000000c008ae67 RCX: 00007f990c1642eb
[ 545.??????] RDX: 00007f9908979ca0 RSI: ffffffffc008ae67 RDI: 0000000000000011
[ 545.??????] RBP: 00007f990a9fc800 R08: 0000000000000000 R09: 000000000000002c
[ 545.??????] R10: 0000000000000001 R11: 0000000000000246 R12: 00007f9908979ca0
[ 545.??????] R13: 00000000000000c8 R14: 0000000000000004 R15: 00007f9501850d40
[ 545.??????] Modules linked in: macvtap tap macvlan wireguard(E) ip6_udp_tunnel udp_tunnel ipt_REJECT nf_reject_ipv4 nct6775 xt_tcpudp msr hwmon_vid iptable_filter nls_iso8859_1 nls_cp437 intel_rapl_msr iTCO_wdt iTCO_vendor_support intel_rapl_common intel_wmi_thunderbolt mxm_wmi x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm intel_cstate intel_uncore raid10 intel_rapl_perf pcspkr snd_ctxfi alx i2c_i801 lpc_ich mdio amdgpu uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_common snd_usb_audio videodev snd_usbmidi_lib snd_hda_codec_realtek snd_rawmidi snd_seq_device snd_hda_codec_generic mc md_mod ledtrig_audio snd_hda_codec_hdmi joydev mousedev input_leds snd_hda_intel snd_intel_dspcfg snd_hda_codec snd_hda_core snd_hwdep snd_pcm gpu_sched snd_timer snd mei_me e1000e soundcore mei wmi evdev mac_hid ip_tables x_tables dm_crypt hid_lg_g15 hid_logitech ff_memless hid_steam hid_generic usbhid hid dm_mod crct10dif_pclmul crc32_pclmul ghash_clmulni_intel
[ 545.??????] aesni_intel crypto_simd cryptd glue_helper xhci_pci ehci_pci ehci_hcd xhci_hcd radeon i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops cec ttm drm agpgart vfio_pci irqbypass vfio_virqfd vfio_iommu_type1 vfio ext4 crc32c_generic crc32c_intel crc16 mbcache jbd2 vfat fat
[ 545.??????] ---[ end trace 1e45c808e45db214 ]---
[ 545.??????] RIP:0010:__srcu_read_lock+0x21/0x30
[ 545.??????] Code: cc cc cc cc cc cc cc cc cc 0f 1f 44 00 00 8b 87 40 08 00 00 48 8b 97 68 08 00 00 41 89 c0 83 e0 01 64 48 ff 04 c2 83 44 24 fc 00 44 89 c0 c3 0f 1f 44 00 00 0f 1f 44 00 00 f0 83
[ 545.??????] RSP: 0018:ffffb061c119c000 EFLAGS: 00010206
[ 545.??????] RAX: 0000000000000001 RBX: 0000000000000001 RCX: 0000000000000000
[ 545.??????] RDX: 000038a6bf8281c0 RSI: 0000000000000002 RDI: ffffb061c11970b8
[ 545.??????] RBP: 0000000000000002 R08: 0000000000000001 R09: ffff97baf7f72400
[ 545.??????] R10: 0000000000000000 R11: 0000000000000000 R12: ffffb061c118d000
[ 545.??????] R13: ffff97baf347b1b0 R14: ffffb061c11970b8 R15: 000000000000000b
[ 545.??????] FS: 00007f990897c700(0000) GS:ffff97baffa80000(0000) knlGS:0000000000000000
[ 545.??????] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 545.??????] CR2: ffffb061c119bff8 CR3: 0000000879578002 CR4: 00000000001626e0
[ 545.??????] note: CPU 0/KVM[2479] exited with preempt_count 1
[ 545.??????] Kernel panic - not syncing: corrupted stack end detected inside scheduler
[ 545.??????] Kernel Offset: 0x24c00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[ 545.??????] ---[ end Kernel panic - not syncing: corrupted stack end detected inside scheduler ]---

### Reproduction
I cannot provide a reproduction environment that is not several GiBs of a Windows 10 installation and very specific hardware underlying it. I'd be happy to provide additional information and test patches, however.
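The transcribed call trace above appears to repeat the same group of frames several times, which would be consistent with unbounded recursion exhausting the kernel stack (the panic message itself reports a corrupted stack end). A minimal sketch for checking that repetition mechanically is shown here. The helper names `reliable_frames` and `shortest_period` are made up for illustration, the embedded excerpt holds only two repetitions plus the tail of the trace, and frames prefixed with `?` are skipped because the kernel marks them as unreliable.

```python
import re

# Excerpt of the transcribed trace: two repetitions plus the syscall tail.
TRACE = """\
[ 545.??????] irqfd_resampler_ack+0x32/0x90 [kvm]
[ 545.??????] kvm_notify_acked_irq+0x61/0xf0 [kvm]
[ 545.??????] kvm_ioapic_update_eoi_one.isra.0+0x3b/0x150 [kvm]
[ 545.??????] ioapic_set_irq+0x1fc/0x220 [kvm]
[ 545.??????] kvm_ioapic_set_irq+0x62/0x90 [kvm]
[ 545.??????] kvm_set_irq+0xc8/0x180 [kvm]
[ 545.??????] ? kvm_hv_set_sint+0x20/0x20 [kvm]
[ 545.??????] ? kvm_set_ioapic_irq+0x20/0x20 [kvm]
[ 545.??????] irqfd_resampler_ack+0x32/0x90 [kvm]
[ 545.??????] kvm_notify_acked_irq+0x61/0xf0 [kvm]
[ 545.??????] kvm_ioapic_update_eoi_one.isra.0+0x3b/0x150 [kvm]
[ 545.??????] ioapic_set_irq+0x1fc/0x220 [kvm]
[ 545.??????] kvm_ioapic_set_irq+0x62/0x90 [kvm]
[ 545.??????] kvm_set_irq+0xc8/0x180 [kvm]
[ 545.??????] ? kvm_hv_set_sint+0x20/0x20 [kvm]
[ 545.??????] ? kvm_set_ioapic_irq+0x20/0x20 [kvm]
[ 545.??????] kvm_vm_ioctl_irq_line+0x23/0x30 [kvm]
[ 545.??????] kvm_vm_ioctl+0x28a/0xc10 [kvm]
[ 545.??????] ksys_ioctl+0x87/0xc0
[ 545.??????] __x64_sys_ioctl+0x16/0x20
[ 545.??????] do_syscall_64+0x4e/0x150
[ 545.??????] entry_SYSCALL_64_after_hwframe+0x44/0xa9
"""

# "] " after the timestamp, an optional "? " marker, then symbol+offset/size.
FRAME_RE = re.compile(r"\]\s+(\?\s+)?([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+")


def reliable_frames(trace: str) -> list[str]:
    """Return the function names of frames not marked with '?'."""
    frames = []
    for line in trace.splitlines():
        match = FRAME_RE.search(line)
        if match and not match.group(1):
            frames.append(match.group(2))
    return frames


def shortest_period(frames: list[str]):
    """Smallest p such that the first p frames repeat immediately, else None."""
    for p in range(1, len(frames) // 2 + 1):
        if frames[:p] == frames[p:2 * p]:
            return p
    return None


frames = reliable_frames(TRACE)
period = shortest_period(frames)
print(period, frames[:period] if period else None)
```

On this excerpt the repeating group starts at `irqfd_resampler_ack` and ends at `kvm_set_irq`, after which the same group begins again.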
#### libvirt domain xml:
lockuptest
309ac794-298a-4e9b-a7a6-660dd50b96ba
16777216
16777216
6
hvm
/usr/share/ovmf/x64/OVMF_CODE.fd
/var/lib/libvirt/qemu/nvram/wenger_VARS.fd
Haswell-noTSX
destroy
restart
restart
/usr/sbin/qemu-system-x86_64