* [REGRESSION][BISECTED] from bd9bbc96e835: cannot boot Win11 KVM guest @ 2024-12-14 6:32 Ranguvar 2024-12-14 18:52 ` Peter Zijlstra 0 siblings, 1 reply; 9+ messages in thread From: Ranguvar @ 2024-12-14 6:32 UTC (permalink / raw) To: Peter Zijlstra, regressions@lists.linux.dev Cc: regressions@leemhuis.info, linux-kernel@vger.kernel.org, kvm@vger.kernel.org Hello, all, Any assistance with proper format and process is appreciated as I am new to these lists. After the commit bd9bbc96e835 "sched: Rework dl_server" I am no longer able to boot my Windows 11 23H2 guest using pinned/exclusive CPU cores and passing a PCIe graphics card. This setup worked for me since at least 5.10, likely earlier, with minimal changes. Most or all cores assigned to guest VM report 100% usage, and many tasks on the host hang indefinitely (10min+) until the guest is forcibly stopped. This happens only once the Windows kernel begins loading - its spinner appears and freezes. Still broken on 6.13-rc2, as well as 6.12.4 from Arch's repository. When testing these, the failure is similar, but tasks on the host are slow to execute instead of stalling indefinitely, and hung tasks are not reported in dmesg. Only one guest core may show 100% utilization instead of many or all of them. This seems to be due to a separate regression which also impacts my usecase [0]. After patching it [1], I then find the same behavior as bd9bbc96e835, with hung tasks on host. git bisect log: [2] dmesg from 6.11.0-rc1-1-git-00057-gbd9bbc96e835, with decoded hung task backtraces: [3] dmesg from arch 6.12.4: [4] dmesg from arch 6.12.4 patched for svm.c regression, has hung tasks, backtraces could not be decoded: [5] config for 6.11.0-rc1-1-git-00057-gbd9bbc96e835: [6] config for arch 6.12.4: [7] If it helps, my host uses an AMD Ryzen 5950X CPU with latest UEFI and AMD WX 5100 (Polaris, GCN 4.0) PCIe graphics. 
I use libvirt 10.10 and qemu 9.1.2, and I am passing three PCIe devices each from dedicated IOMMU groups: NVIDIA RTX 3090 graphics, a Renesas uPD720201 USB controller, and a Samsung 970 EVO NVMe disk. I have in kernel cmdline `iommu=pt isolcpus=1-7,17-23 rcu_nocbs=1-7,17-23 nohz_full=1-7,17-23`. Removing iommu=pt does not produce a change, and dropping the core isolation freezes the host on VM startup. Enabling/disabling kvm_amd.nested or kvm.enable_virt_at_load did not produce a change. Thank you for your attention. - Devin #regzbot introduced: bd9bbc96e8356886971317f57994247ca491dbf1 [0]: https://lore.kernel.org/regressions/52914da7-a97b-45ad-86a0-affdf8266c61@mailbox.org/ [1]: https://lore.kernel.org/regressions/376c445a-9437-4bdd-9b67-e7ce786ae2c4@mailbox.org/ [2]: https://ranguvar.io/pub/paste/linux-6.12-vm-regression/bisect.log [3]: https://ranguvar.io/pub/paste/linux-6.12-vm-regression/dmesg-6.11.0-rc1-1-git-00057-gbd9bbc96e835-decoded.log [4]: https://ranguvar.io/pub/paste/linux-6.12-vm-regression/dmesg-6.12.4-arch1-1.log [5]: https://ranguvar.io/pub/paste/linux-6.12-vm-regression/dmesg-6.12.4-arch1-1-patched.log [6]: https://ranguvar.io/pub/paste/linux-6.12-vm-regression/config-6.11.0-rc1-1-git-00057-gbd9bbc96e835 [7]: https://gitlab.archlinux.org/archlinux/packaging/packages/linux/-/raw/6.12.4.arch1-1/config ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [REGRESSION][BISECTED] from bd9bbc96e835: cannot boot Win11 KVM guest 2024-12-14 6:32 [REGRESSION][BISECTED] from bd9bbc96e835: cannot boot Win11 KVM guest Ranguvar @ 2024-12-14 18:52 ` Peter Zijlstra 2024-12-16 15:23 ` Juri Lelli 0 siblings, 1 reply; 9+ messages in thread From: Peter Zijlstra @ 2024-12-14 18:52 UTC (permalink / raw) To: Ranguvar, Juri Lelli Cc: regressions@lists.linux.dev, regressions@leemhuis.info, linux-kernel@vger.kernel.org, kvm@vger.kernel.org On Sat, Dec 14, 2024 at 06:32:57AM +0000, Ranguvar wrote: > Hello, all, > > Any assistance with proper format and process is appreciated as I am new to these lists. > After the commit bd9bbc96e835 "sched: Rework dl_server" I am no longer able to boot my Windows 11 23H2 guest using pinned/exclusive CPU cores and passing a PCIe graphics card. > This setup worked for me since at least 5.10, likely earlier, with minimal changes. > > Most or all cores assigned to guest VM report 100% usage, and many tasks on the host hang indefinitely (10min+) until the guest is forcibly stopped. > This happens only once the Windows kernel begins loading - its spinner appears and freezes. > > Still broken on 6.13-rc2, as well as 6.12.4 from Arch's repository. > When testing these, the failure is similar, but tasks on the host are slow to execute instead of stalling indefinitely, and hung tasks are not reported in dmesg. Only one guest core may show 100% utilization instead of many or all of them. This seems to be due to a separate regression which also impacts my usecase [0]. > After patching it [1], I then find the same behavior as bd9bbc96e835, with hung tasks on host. 
> > git bisect log: [2] > dmesg from 6.11.0-rc1-1-git-00057-gbd9bbc96e835, with decoded hung task backtraces: [3] > dmesg from arch 6.12.4: [4] > dmesg from arch 6.12.4 patched for svm.c regression, has hung tasks, backtraces could not be decoded: [5] > config for 6.11.0-rc1-1-git-00057-gbd9bbc96e835: [6] > config for arch 6.12.4: [7] > > If it helps, my host uses an AMD Ryzen 5950X CPU with latest UEFI and AMD WX 5100 (Polaris, GCN 4.0) PCIe graphics. > I use libvirt 10.10 and qemu 9.1.2, and I am passing three PCIe devices each from dedicated IOMMU groups: NVIDIA RTX 3090 graphics, a Renesas uPD720201 USB controller, and a Samsung 970 EVO NVMe disk. > > I have in kernel cmdline `iommu=pt isolcpus=1-7,17-23 rcu_nocbs=1-7,17-23 nohz_full=1-7,17-23`. > Removing iommu=pt does not produce a change, and dropping the core isolation freezes the host on VM startup. > Enabling/disabling kvm_amd.nested or kvm.enable_virt_at_load did not produce a change. > > Thank you for your attention. > - Devin > > #regzbot introduced: bd9bbc96e8356886971317f57994247ca491dbf1 > > [0]: https://lore.kernel.org/regressions/52914da7-a97b-45ad-86a0-affdf8266c61@mailbox.org/ > [1]: https://lore.kernel.org/regressions/376c445a-9437-4bdd-9b67-e7ce786ae2c4@mailbox.org/ > [2]: https://ranguvar.io/pub/paste/linux-6.12-vm-regression/bisect.log > [3]: https://ranguvar.io/pub/paste/linux-6.12-vm-regression/dmesg-6.11.0-rc1-1-git-00057-gbd9bbc96e835-decoded.log Hmm, this has: [ 978.035637] sched: DL replenish lagged too much Juri, have we seen that before? > [4]: https://ranguvar.io/pub/paste/linux-6.12-vm-regression/dmesg-6.12.4-arch1-1.log > [5]: https://ranguvar.io/pub/paste/linux-6.12-vm-regression/dmesg-6.12.4-arch1-1-patched.log > [6]: https://ranguvar.io/pub/paste/linux-6.12-vm-regression/config-6.11.0-rc1-1-git-00057-gbd9bbc96e835 > [7]: https://gitlab.archlinux.org/archlinux/packaging/packages/linux/-/raw/6.12.4.arch1-1/config ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [REGRESSION][BISECTED] from bd9bbc96e835: cannot boot Win11 KVM guest 2024-12-14 18:52 ` Peter Zijlstra @ 2024-12-16 15:23 ` Juri Lelli 2024-12-16 16:50 ` Sean Christopherson 0 siblings, 1 reply; 9+ messages in thread From: Juri Lelli @ 2024-12-16 15:23 UTC (permalink / raw) To: Peter Zijlstra Cc: Ranguvar, Juri Lelli, regressions@lists.linux.dev, regressions@leemhuis.info, linux-kernel@vger.kernel.org, kvm@vger.kernel.org, juri.lelli On 14/12/24 19:52, Peter Zijlstra wrote: > On Sat, Dec 14, 2024 at 06:32:57AM +0000, Ranguvar wrote: > > Hello, all, > > > > Any assistance with proper format and process is appreciated as I am new to these lists. > > After the commit bd9bbc96e835 "sched: Rework dl_server" I am no longer able to boot my Windows 11 23H2 guest using pinned/exclusive CPU cores and passing a PCIe graphics card. > > This setup worked for me since at least 5.10, likely earlier, with minimal changes. > > > > Most or all cores assigned to guest VM report 100% usage, and many tasks on the host hang indefinitely (10min+) until the guest is forcibly stopped. > > This happens only once the Windows kernel begins loading - its spinner appears and freezes. > > > > Still broken on 6.13-rc2, as well as 6.12.4 from Arch's repository. > > When testing these, the failure is similar, but tasks on the host are slow to execute instead of stalling indefinitely, and hung tasks are not reported in dmesg. Only one guest core may show 100% utilization instead of many or all of them. This seems to be due to a separate regression which also impacts my usecase [0]. > > After patching it [1], I then find the same behavior as bd9bbc96e835, with hung tasks on host. 
> > > > git bisect log: [2] > > dmesg from 6.11.0-rc1-1-git-00057-gbd9bbc96e835, with decoded hung task backtraces: [3] > > dmesg from arch 6.12.4: [4] > > dmesg from arch 6.12.4 patched for svm.c regression, has hung tasks, backtraces could not be decoded: [5] > > config for 6.11.0-rc1-1-git-00057-gbd9bbc96e835: [6] > > config for arch 6.12.4: [7] > > > > If it helps, my host uses an AMD Ryzen 5950X CPU with latest UEFI and AMD WX 5100 (Polaris, GCN 4.0) PCIe graphics. > > I use libvirt 10.10 and qemu 9.1.2, and I am passing three PCIe devices each from dedicated IOMMU groups: NVIDIA RTX 3090 graphics, a Renesas uPD720201 USB controller, and a Samsung 970 EVO NVMe disk. > > > > I have in kernel cmdline `iommu=pt isolcpus=1-7,17-23 rcu_nocbs=1-7,17-23 nohz_full=1-7,17-23`. > > Removing iommu=pt does not produce a change, and dropping the core isolation freezes the host on VM startup. > > Enabling/disabling kvm_amd.nested or kvm.enable_virt_at_load did not produce a change. > > > > Thank you for your attention. > > - Devin > > > > #regzbot introduced: bd9bbc96e8356886971317f57994247ca491dbf1 > > > > [0]: https://lore.kernel.org/regressions/52914da7-a97b-45ad-86a0-affdf8266c61@mailbox.org/ > > [1]: https://lore.kernel.org/regressions/376c445a-9437-4bdd-9b67-e7ce786ae2c4@mailbox.org/ > > [2]: https://ranguvar.io/pub/paste/linux-6.12-vm-regression/bisect.log > > [3]: https://ranguvar.io/pub/paste/linux-6.12-vm-regression/dmesg-6.11.0-rc1-1-git-00057-gbd9bbc96e835-decoded.log > > Hmm, this has: > > [ 978.035637] sched: DL replenish lagged too much > > Juri, have we seen that before? Not in the context of dl_server. Hummm, looks like replenishment wasn't able to catch up with the clock or something like that (e.g. replenishment didn't happen for a long time). ^ permalink raw reply [flat|nested] 9+ messages in thread
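[Editor's illustration] For readers unfamiliar with the constant-bandwidth-server mechanics Juri refers to, here is a toy sketch of how replenishment lag is detected. This is plain Python, not kernel code; the function, numbers, and structure are illustrative only, loosely modeled on the replenishment loop in kernel/sched/deadline.c:

```python
# Hypothetical toy model of CBS replenishment (not the kernel code):
# a deadline server has (runtime, period); replenishment pushes the
# deadline forward one period at a time until runtime is positive.
# If the clock has advanced far past the deadline -- e.g. the timer
# could not fire because the CPU was monopolized -- the kernel gives
# up catching up period-by-period and resets, logging
# "sched: DL replenish lagged too much".

def replenish(deadline, runtime, now, dl_runtime, period):
    """Return (new_deadline, new_runtime, lagged)."""
    while runtime <= 0:
        deadline += period
        runtime += dl_runtime
    if deadline < now:  # deadline still in the past: lagged too much
        return now + period, dl_runtime, True
    return deadline, runtime, False

# Timer fired roughly on time: normal replenishment, no lag.
print(replenish(deadline=100, runtime=-1, now=101, dl_runtime=5, period=10))
# Timer delayed by a long stall: replenishment has lagged.
print(replenish(deadline=100, runtime=-1, now=10_000, dl_runtime=5, period=10))
```

In this model the warning condition corresponds to a replenishment that, even after topping up, leaves the deadline behind the current clock, which matches Juri's "replenishment didn't happen for a long time" reading.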
* Re: [REGRESSION][BISECTED] from bd9bbc96e835: cannot boot Win11 KVM guest 2024-12-16 15:23 ` Juri Lelli @ 2024-12-16 16:50 ` Sean Christopherson 2024-12-16 20:40 ` Ranguvar 0 siblings, 1 reply; 9+ messages in thread From: Sean Christopherson @ 2024-12-16 16:50 UTC (permalink / raw) To: Juri Lelli Cc: Peter Zijlstra, Ranguvar, Juri Lelli, regressions@lists.linux.dev, regressions@leemhuis.info, linux-kernel@vger.kernel.org, kvm@vger.kernel.org On Mon, Dec 16, 2024, Juri Lelli wrote: > On 14/12/24 19:52, Peter Zijlstra wrote: > > On Sat, Dec 14, 2024 at 06:32:57AM +0000, Ranguvar wrote: > > > Hello, all, > > > > > > Any assistance with proper format and process is appreciated as I am new > > > to these lists. After the commit bd9bbc96e835 "sched: Rework dl_server" > > > I am no longer able to boot my Windows 11 23H2 guest using > > > pinned/exclusive CPU cores and passing a PCIe graphics card. This setup > > > worked for me since at least 5.10, likely earlier, with minimal changes. > > > > > > Most or all cores assigned to guest VM report 100% usage, and many tasks > > > on the host hang indefinitely (10min+) until the guest is forcibly > > > stopped. This happens only once the Windows kernel begins loading - its > > > spinner appears and freezes. > > > > > > Still broken on 6.13-rc2, as well as 6.12.4 from Arch's repository. When > > > testing these, the failure is similar, but tasks on the host are slow to > > > execute instead of stalling indefinitely, and hung tasks are not reported > > > in dmesg. Only one guest core may show 100% utilization instead of many > > > or all of them. This seems to be due to a separate regression which also > > > impacts my usecase [0]. After patching it [1], I then find the same > > > behavior as bd9bbc96e835, with hung tasks on host. 
> > > > > > git bisect log: [2] > > > dmesg from 6.11.0-rc1-1-git-00057-gbd9bbc96e835, with decoded hung task backtraces: [3] > > > dmesg from arch 6.12.4: [4] > > > dmesg from arch 6.12.4 patched for svm.c regression, has hung tasks, backtraces could not be decoded: [5] > > > config for 6.11.0-rc1-1-git-00057-gbd9bbc96e835: [6] > > > config for arch 6.12.4: [7] > > > > > > If it helps, my host uses an AMD Ryzen 5950X CPU with latest UEFI and AMD > > > WX 5100 (Polaris, GCN 4.0) PCIe graphics. I use libvirt 10.10 and qemu > > > 9.1.2, and I am passing three PCIe devices each from dedicated IOMMU > > > groups: NVIDIA RTX 3090 graphics, a Renesas uPD720201 USB controller, and > > > a Samsung 970 EVO NVMe disk. > > > > > > I have in kernel cmdline `iommu=pt isolcpus=1-7,17-23 rcu_nocbs=1-7,17-23 > > > nohz_full=1-7,17-23`. Removing iommu=pt does not produce a change, and > > > dropping the core isolation freezes the host on VM startup. As in, dropping all of isolcpus, rcu_nocbs, and nohz_full? Or just dropping isolcpus? > > > Enabling/disabling kvm_amd.nested or kvm.enable_virt_at_load did not > > > produce a change. > > > > > > Thank you for your attention. > > > - Devin > > > > > > #regzbot introduced: bd9bbc96e8356886971317f57994247ca491dbf1 > > > > > > [0]: https://lore.kernel.org/regressions/52914da7-a97b-45ad-86a0-affdf8266c61@mailbox.org/ > > > [1]: https://lore.kernel.org/regressions/376c445a-9437-4bdd-9b67-e7ce786ae2c4@mailbox.org/ > > > [2]: https://ranguvar.io/pub/paste/linux-6.12-vm-regression/bisect.log > > > [3]: https://ranguvar.io/pub/paste/linux-6.12-vm-regression/dmesg-6.11.0-rc1-1-git-00057-gbd9bbc96e835-decoded.log > > > > Hmm, this has: > > > > [ 978.035637] sched: DL replenish lagged too much > > > > Juri, have we seen that before? > > Not in the context of dl_server. Hummm, looks like replenishment wasn't > able to catch up with the clock or something like that (e.g. > replenishment didn't happen for a long time). 
I don't see anything in the logs that suggests KVM is doing something funky. My guess is that the issue is related to isolcpus+rcu_nocbs+nohz_full, and that KVM setups are one of the more common use cases for such configurations. But that's just a wild guess on my part. The hang from [4] occurs because KVM can't complete a memslot update. Given that this shows up with GPU passthrough, odds are good the guest is trying to relocate a GPU bar and the relocation hangs because the KVM-side update hangs. There are some interesting/unique paths in KVM's memslot code, but this is a simple hang on SRCU synchronization. INFO: task CPU 0/KVM:2134 blocked for more than 122 seconds. Not tainted 6.11.0-rc1-1-git-00057-gbd9bbc96e835 #12 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. task:CPU 0/KVM state:D stack:0 pid:2134 tgid:2114 ppid:1 flags:0x00004002 Call Trace: <TASK> __schedule (kernel/sched/core.c:5258 kernel/sched/core.c:6594) schedule (./arch/x86/include/asm/preempt.h:84 (discriminator 13) kernel/sched/core.c:6672 (discriminator 13) kernel/sched/core.c:6686 (discriminator 13)) schedule_timeout (kernel/time/timer.c:2558) wait_for_completion (kernel/sched/completion.c:96 kernel/sched/completion.c:116 kernel/sched/completion.c:127 kernel/sched/completion.c:148) __synchronize_srcu (kernel/rcu/srcutree.c:1408) kvm_swap_active_memslots+0x133/0x180 kvm kvm_set_memslot+0x3de/0x680 kvm kvm_vm_ioctl+0x11da/0x18d0 kvm __x64_sys_ioctl (fs/ioctl.c:52 fs/ioctl.c:907 fs/ioctl.c:893 fs/ioctl.c:893) do_syscall_64 (arch/x86/entry/common.c:52 (discriminator 1) arch/x86/entry/common.c:83 (discriminator 1)) And in [5], the host hang that first pops is also on wait_for_completion(), in code that is potentially trying to queue work on all CPUs (I've no idea if cpu_needs_drain() can be true on the isolated CPUs). 
cpumask_clear(&has_work); for_each_online_cpu(cpu) { struct work_struct *work = &per_cpu(lru_add_drain_work, cpu); if (cpu_needs_drain(cpu)) { INIT_WORK(work, lru_add_drain_per_cpu); queue_work_on(cpu, mm_percpu_wq, work); __cpumask_set_cpu(cpu, &has_work); } } for_each_cpu(cpu, &has_work) flush_work(&per_cpu(lru_add_drain_work, cpu)); sched: DL replenish lagged too much systemd[1]: Starting Cleanup of Temporary Directories... systemd[1]: systemd-tmpfiles-clean.service: Deactivated successfully. systemd[1]: Finished Cleanup of Temporary Directories. systemd[1]: systemd-journald.service: State 'stop-watchdog' timed out. Killing. systemd[1]: systemd-journald.service: Killing process 647 (systemd-journal) with signal SIGKILL. systemd[1]: Starting system activity accounting tool... systemd[1]: sysstat-collect.service: Deactivated successfully. systemd[1]: Finished system activity accounting tool. INFO: task khugepaged:263 blocked for more than 122 seconds. Not tainted 6.12.4-arch1-1 #1 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. task:khugepaged state:D stack:0 pid:263 tgid:263 ppid:2 flags:0x00004000 Call Trace: <TASK> __schedule+0x3b0/0x12b0 schedule+0x27/0xf0 schedule_timeout+0x12f/0x160 wait_for_completion+0x86/0x170 __flush_work+0x1bf/0x2c0 __lru_add_drain_all+0x13e/0x1e0 khugepaged+0x66/0x930 kthread+0xd2/0x100 ret_from_fork+0x34/0x50 ret_from_fork_asm+0x1a/0x30 </TASK> ^ permalink raw reply [flat|nested] 9+ messages in thread
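[Editor's illustration] The SRCU hang described above can be pictured with a minimal userspace analogue. This is pure Python with hypothetical names; real SRCU uses per-CPU counters and grace-period state machinery, not a single condition variable, but the blocking shape is the same: a writer cannot make progress until every in-flight reader finishes.

```python
import threading

# Toy analogue (not SRCU itself): a writer that must wait for all
# current readers to finish, like kvm_swap_active_memslots() waiting
# in __synchronize_srcu(). If one "reader" (standing in for work on a
# CPU that never reaches a quiescent point) stays in its read-side
# section, the writer blocks indefinitely.

class GracePeriod:
    def __init__(self):
        self.readers = 0
        self.cond = threading.Condition()

    def read_lock(self):
        with self.cond:
            self.readers += 1

    def read_unlock(self):
        with self.cond:
            self.readers -= 1
            self.cond.notify_all()

    def synchronize(self, timeout=None):
        """Block until no readers remain; returns False on timeout."""
        with self.cond:
            return self.cond.wait_for(lambda: self.readers == 0, timeout)

gp = GracePeriod()
gp.read_lock()                       # a reader that never exits
print(gp.synchronize(timeout=0.2))   # writer gives up: False
gp.read_unlock()
print(gp.synchronize(timeout=0.2))   # now completes: True
```

The kernel's writer has no timeout, so in the reported traces the memslot update (and khugepaged's lru_add_drain flush) simply waits in D state until the guest is killed.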
* Re: [REGRESSION][BISECTED] from bd9bbc96e835: cannot boot Win11 KVM guest 2024-12-16 16:50 ` Sean Christopherson @ 2024-12-16 20:40 ` Ranguvar 2024-12-17 8:57 ` Juri Lelli 0 siblings, 1 reply; 9+ messages in thread From: Ranguvar @ 2024-12-16 20:40 UTC (permalink / raw) To: Sean Christopherson Cc: Juri Lelli, Peter Zijlstra, Juri Lelli, regressions@lists.linux.dev, regressions@leemhuis.info, linux-kernel@vger.kernel.org, kvm@vger.kernel.org On Monday, December 16th, 2024 at 16:50, Sean Christopherson <seanjc@google.com> wrote: > > On Mon, Dec 16, 2024, Juri Lelli wrote: > > > On 14/12/24 19:52, Peter Zijlstra wrote: > > > > > On Sat, Dec 14, 2024 at 06:32:57AM +0000, Ranguvar wrote: > > > > > > > I have in kernel cmdline `iommu=pt isolcpus=1-7,17-23 rcu_nocbs=1-7,17-23 nohz_full=1-7,17-23`. Removing iommu=pt does not produce a change, and > > > > dropping the core isolation freezes the host on VM startup. > > As in, dropping all of isolcpus, rcu_nocbs, and nohz_full? Or just dropping > isolcpus? Thanks for looking. I had dropped all three, but not altered the VM guest config, which is: <cputune> <vcpupin vcpu='0' cpuset='2'/> <vcpupin vcpu='1' cpuset='18'/> ... <vcpupin vcpu='11' cpuset='23'/> <emulatorpin cpuset='1,17'/> <iothreadpin iothread='1' cpuset='1,17'/> <vcpusched vcpus='0' scheduler='fifo' priority='95'/> ... <iothreadsched iothreads='1' scheduler='fifo' priority='50'/> </cputune> CPU mode is host-passthrough, cache mode is passthrough. The 24GB VRAM did cause trouble when setting up resizeable BAR months ago as well. It necessitated a special qemu config: <qemu:commandline> <qemu:arg value='-fw_cfg'/> <qemu:arg value='opt/ovmf/PciMmio64Mb,string=65536'/> </qemu:commandline> ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [REGRESSION][BISECTED] from bd9bbc96e835: cannot boot Win11 KVM guest 2024-12-16 20:40 ` Ranguvar @ 2024-12-17 8:57 ` Juri Lelli 2024-12-18 6:21 ` Ranguvar 0 siblings, 1 reply; 9+ messages in thread From: Juri Lelli @ 2024-12-17 8:57 UTC (permalink / raw) To: Ranguvar Cc: Sean Christopherson, Peter Zijlstra, Juri Lelli, regressions@lists.linux.dev, regressions@leemhuis.info, linux-kernel@vger.kernel.org, kvm@vger.kernel.org On 16/12/24 20:40, Ranguvar wrote: > On Monday, December 16th, 2024 at 16:50, Sean Christopherson <seanjc@google.com> wrote: > > > > On Mon, Dec 16, 2024, Juri Lelli wrote: > > > > > On 14/12/24 19:52, Peter Zijlstra wrote: > > > > > > > On Sat, Dec 14, 2024 at 06:32:57AM +0000, Ranguvar wrote: > > > > > > > > > I have in kernel cmdline `iommu=pt isolcpus=1-7,17-23 rcu_nocbs=1-7,17-23 nohz_full=1-7,17-23`. Removing iommu=pt does not produce a change, and > > > > > dropping the core isolation freezes the host on VM startup. > > > > As in, dropping all of isolcpus, rcu_nocbs, and nohz_full? Or just dropping > > isolcpus? > > Thanks for looking. > I had dropped all three, but not altered the VM guest config, which is: > > <cputune> > <vcpupin vcpu='0' cpuset='2'/> > <vcpupin vcpu='1' cpuset='18'/> > ... > <vcpupin vcpu='11' cpuset='23'/> > <emulatorpin cpuset='1,17'/> > <iothreadpin iothread='1' cpuset='1,17'/> > <vcpusched vcpus='0' scheduler='fifo' priority='95'/> > ... > <iothreadsched iothreads='1' scheduler='fifo' priority='50'/> Are you disabling/enabling/configuring RT throttling (sched_rt_{runtime, period}_us) in your configuration? > </cputune> > > CPU mode is host-passthrough, cache mode is passthrough. > > The 24GB VRAM did cause trouble when setting up resizeable BAR months ago as well. It necessitated a special qemu config: > <qemu:commandline> > <qemu:arg value='-fw_cfg'/> > <qemu:arg value='opt/ovmf/PciMmio64Mb,string=65536'/> > </qemu:commandline> > ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [REGRESSION][BISECTED] from bd9bbc96e835: cannot boot Win11 KVM guest 2024-12-17 8:57 ` Juri Lelli @ 2024-12-18 6:21 ` Ranguvar 0 siblings, 0 replies; 9+ messages in thread From: Ranguvar @ 2024-12-18 6:21 UTC (permalink / raw) To: Juri Lelli, Sean Christopherson Cc: Peter Zijlstra, Juri Lelli, regressions@lists.linux.dev, regressions@leemhuis.info, linux-kernel@vger.kernel.org, kvm@vger.kernel.org The bug is caused by Windows kernel as a KVM guest. Cannot reproduce with Ubuntu 24.10 install iso and nouveau driver. Windows 11 23H2 install iso reproduces reliably. Two [0] more [1] kernel logs below. Decode worked only on the first - spent too long trying to fix it. On Tuesday, December 17th, 2024 at 08:57, Juri Lelli <juri.lelli@redhat.com> wrote: > > On 16/12/24 20:40, Ranguvar wrote: > > > On Monday, December 16th, 2024 at 16:50, Sean Christopherson seanjc@google.com wrote: > > > > > On Mon, Dec 16, 2024, Juri Lelli wrote: > > > > > > > On 14/12/24 19:52, Peter Zijlstra wrote: > > > > > > > > > On Sat, Dec 14, 2024 at 06:32:57AM +0000, Ranguvar wrote: > > > > > > > > > > > I have in kernel cmdline `iommu=pt isolcpus=1-7,17-23 rcu_nocbs=1-7,17-23 nohz_full=1-7,17-23`. Removing iommu=pt does not produce a change, and > > > > > > dropping the core isolation freezes the host on VM startup. > > > > > > As in, dropping all of isolcpus, rcu_nocbs, and nohz_full? Or just dropping > > > isolcpus? > > > > Thanks for looking. > > I had dropped all three, but not altered the VM guest config, which is: > > > > <cputune> > > <vcpupin vcpu='0' cpuset='2'/> > > <vcpupin vcpu='1' cpuset='18'/> > > ... > > <vcpupin vcpu='11' cpuset='23'/> > > <emulatorpin cpuset='1,17'/> > > <iothreadpin iothread='1' cpuset='1,17'/> > > <vcpusched vcpus='0' scheduler='fifo' priority='95'/> > > ... > > <iothreadsched iothreads='1' scheduler='fifo' priority='50'/> > > > Are you disabling/enabling/configuring RT throttling (sched_rt_{runtime, > period}_us) in your configuration? 
> I don't touch these. [ranguvar@khufu ~]$ cat /proc/sys/kernel/sched_rt_period_us 1000000 [ranguvar@khufu ~]$ cat /proc/sys/kernel/sched_rt_runtime_us 950000 I also removed myself from the realtime group (used by PipeWire), but the breakage is the same. > > </cputune> > > > > CPU mode is host-passthrough, cache mode is passthrough. > > > > The 24GB VRAM did cause trouble when setting up resizeable BAR months ago as well. It necessitated a special qemu config: > > <qemu:commandline> > > <qemu:arg value='-fw_cfg'/> > > <qemu:arg value='opt/ovmf/PciMmio64Mb,string=65536'/> > > </qemu:commandline> I removed this config block as it appears unnecessary now. No impact on this issue. I also tried changing the size of the BAR from 32GB to 256MB manually before running the guest. lspci: Region 1: Memory at 7000000000 (64-bit, prefetchable) [size=32G] Region 3: Memory at 7800000000 (64-bit, prefetchable) [size=32M] after unbinding vfio_pci, writing '8' to resource1_resize, and rebinding: Region 1: Memory at 1040000000 (64-bit, prefetchable) [size=256M] Region 3: Memory at 1050000000 (64-bit, prefetchable) [size=32M] No impact. [0]: https://ranguvar.io/pub/paste/linux-6.12-vm-regression/dmesg-6.11.0-rc1-1-git-00057-gbd9bbc96e835-20241216-decoded.log [1]: https://ranguvar.io/pub/paste/linux-6.12-vm-regression/dmesg-6.11.0-rc1-1-git-00057-gbd9bbc96e835-20241217.log ^ permalink raw reply [flat|nested] 9+ messages in thread
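[Editor's note] To put the reported sysctl values in context: with the defaults above, SCHED_FIFO/SCHED_RR tasks may consume sched_rt_runtime_us out of every sched_rt_period_us on each CPU. A quick sanity check of what that ratio means (hardcoded here to the values from the message, so it runs anywhere):

```shell
# Default RT throttling budget: 950000us of runtime per 1000000us
# period, i.e. RT tasks are capped at 95% of each CPU. Writing -1 to
# /proc/sys/kernel/sched_rt_runtime_us disables throttling entirely --
# for experiments only, since a spinning SCHED_FIFO vCPU could then
# starve the host on that CPU completely.
period_us=1000000
runtime_us=950000
awk -v r="$runtime_us" -v p="$period_us" \
    'BEGIN { printf "RT bandwidth: %d%%\n", r * 100 / p }'
```

So a FIFO-95 vCPU thread spinning at 100% still leaves the host a nominal 5% on that core under the defaults, which is consistent with the "slow to execute" rather than fully stalled behavior seen in some of the runs.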
* Re: [REGRESSION][BISECTED] from bd9bbc96e835: cannot boot Win11 KVM guest [not found] <nscDY8Zl-c9zxKZ0qGQX8eqpyHf-84yh3mPJWUUWkaNsx5A06rvv6tBOQSXXFjZzXeQl_ZVUbgGvK9yjH6avpoOwmZZkm3FSILtaz2AHgLk=@ranguvar.io> @ 2024-12-14 18:39 ` Peter Zijlstra 2024-12-15 18:50 ` Ranguvar 0 siblings, 1 reply; 9+ messages in thread From: Peter Zijlstra @ 2024-12-14 18:39 UTC (permalink / raw) To: Ranguvar Cc: regressions@lists.linux.dev, regressions@leemhuis.info, linux-kernel@vger.kernel.org, kvm@vger.kernel.org On Sat, Dec 14, 2024 at 06:30:11AM +0000, Ranguvar wrote: > Hello, all, > > Any assistance with proper format and process is appreciated as I am > new to these lists. After the commit bd9bbc96e835 "sched: Rework > dl_server" I am no longer able to boot my Windows 11 23H2 guest using > pinned/exclusive CPU cores and passing a PCIe graphics card. This > setup worked for me since at least 5.10, likely earlier, with minimal > changes. > > Most or all cores assigned to guest VM report 100% usage, and many > tasks on the host hang indefinitely (10min+) until the guest is > forcibly stopped. This happens only once the Windows kernel begins > loading - its spinner appears and freezes. Do the patches here: https://lkml.kernel.org/r/20241213032244.877029-1-vineeth@bitbyteword.org help? I'm not really skilled with the whole virt thing, and I definitely do not have Windows guests at hand. If the above patches do not work; would it be possible to share an image or something? ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [REGRESSION][BISECTED] from bd9bbc96e835: cannot boot Win11 KVM guest 2024-12-14 18:39 ` Peter Zijlstra @ 2024-12-15 18:50 ` Ranguvar 0 siblings, 0 replies; 9+ messages in thread From: Ranguvar @ 2024-12-15 18:50 UTC (permalink / raw) To: Peter Zijlstra Cc: regressions@lists.linux.dev, regressions@leemhuis.info, linux-kernel@vger.kernel.org, kvm@vger.kernel.org On Saturday, December 14th, 2024 at 18:39, Peter Zijlstra <peterz@infradead.org> wrote: > Do the patches here: > > https://lkml.kernel.org/r/20241213032244.877029-1-vineeth@bitbyteword.org > > help? I was able to apply both patches to Arch's 6.12.4 along with the svm.c regression patch. Unfortunately it's the same broken behavior and kernel messages as bd9bbc96e835. > I'm not really skilled with the whole virt thing, and I definitely do > not have Windows guests at hand. If the above patches do not work; would > it be possible to share an image or something? I'm not sure if the fault is triggered by the pinned CPU cores, the PCIe passthrough, or the NVIDIA graphics driver on guest side. I'm also unsure if then a Windows installation ISO or Linux live ISO might suffice to trigger the bug. Unfortunately this guest cannot be taken down very often during this month. As time allows though I will attempt to create a reduced and generified copy of my domain xml file such that you could import it using `virsh define` and then add some install or OS image. Just let me know if I should test anything else or if you have any ideas about approaching with debug tools. Thank you again. - Devin ^ permalink raw reply [flat|nested] 9+ messages in thread