From: "David Hildenbrand (Arm)" <david@kernel.org>
To: "Denis V. Lunev" <den@openvz.org>, mst@redhat.com
Cc: virtualization@lists.linux.dev, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 0/2] virtio_balloon: quiesce balloon work on device shutdown
Date: Mon, 22 Jun 2026 16:29:54 +0200
Message-ID: <de197e27-e56e-48e1-ac2d-e76ba0c3e6b2@kernel.org>
In-Reply-To: <20260622133715.3707707-1-den@openvz.org>

On 6/22/26 15:37, Denis V. Lunev wrote:
> Since commit 8bd2fa086a04 ("virtio: break and reset virtio devices on
> device_shutdown()") the virtio bus breaks and resets every virtio device
> during device_shutdown(), i.e. on reboot and kexec. virtio_balloon has no
> .shutdown of its own, so that generic path runs while the balloon's
> asynchronous work is still armed: the free page reporting worker, the
> inflate/deflate and stats workers, the OOM notifier and the free page
> shrinker.
>
> Once the device has been broken, virtqueue_add_inbuf() in
> virtballoon_free_page_report() returns -EIO and trips its WARN_ON_ONCE().
> On a kernel booted with panic_on_warn that turns an ordinary reboot into a
> fatal panic in the middle of device_shutdown(), so the machine never
> reaches the new kernel. The inflate/deflate and OOM paths do not warn but
> are no better off: they call wait_event(vb->acked, ...) and would block
> forever on a queue that can no longer complete.
>
> This was hit in the field as an intermittent failure of a virtualization
> cluster upgrade: guest storage nodes were rebooted via kexec into the new
> kernel, and the ones whose free page reporting happened to run during
> device_shutdown() panicked (the guests run with panic_on_warn) and never
> came back, stalling the rolling upgrade. The crash dump showed the WARN at
> virtio_balloon.c:216 in a page_reporting kworker, with all the balloon
> virtqueues already broken.
>
> Patch 1 factors the teardown out of virtballoon_remove() into a
> virtballoon_quiesce() helper (no functional change). Patch 2 adds a
> virtio_balloon .shutdown handler that quiesces via that helper while the
> device is still alive, then breaks and resets it the way the generic
> virtio_dev_shutdown() would.
>
> Relaxing the single WARN_ON_ONCE() instead was considered and rejected: it
> would silence the panic but leave the inflate/deflate and OOM paths
> hanging on the broken device. The device has to be quiesced, not just kept
> quiet.

Do you have a link to that discussion you could add?

--
Cheers,
David