qemu-devel.nongnu.org archive mirror
* dozens of qemu/kvm VMs getting into stuck states since kernel ~5.13
@ 2021-12-07 19:44 Chris Murphy
  2021-12-07 22:25 ` Sean Christopherson
From: Chris Murphy @ 2021-12-07 19:44 UTC
  To: kvm; +Cc: qemu-devel

cc: qemu-devel

Hi,

I'm trying to help progress a very troublesome and so far elusive bug
we're seeing in Fedora infrastructure. When dozens of qemu-kvm VMs run
simultaneously, they eventually become unresponsive, and so do any new
processes we start on the host to find out what has gone wrong.

Systems (Fedora openQA worker hosts) on kernel 5.12.12+ wind up in a
state where forking does not work correctly, breaking most things
https://bugzilla.redhat.com/show_bug.cgi?id=2009585

In subsequent testing, we used newer kernels with lockdep and other
debug stuff enabled, and managed to capture a hung task with a bunch
of locks listed, including kvm and qemu processes. But I can't parse
it.

5.15-rc7
https://bugzilla-attachments.redhat.com/attachment.cgi?id=1840941
5.15+
https://bugzilla-attachments.redhat.com/attachment.cgi?id=1840939

If anyone can take a glance at those kernel messages, and/or give
hints on how we can extract more information for debugging, it'd be
appreciated. Maybe all of that is normal and the actual problem isn't
in any of these traces.

Thanks,

--
Chris Murphy



Thread overview: 3+ messages
2021-12-07 19:44 dozens of qemu/kvm VMs getting into stuck states since kernel ~5.13 Chris Murphy
2021-12-07 22:25 ` Sean Christopherson
2021-12-08 17:09   ` Chris Murphy
