Linux virtualization list
 help / color / mirror / Atom feed
From: Alison Schofield <alison.schofield@intel.com>
To: Li Chen <me@linux.beauty>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>,
	Dan Williams <dan.j.williams@intel.com>,
	Vishal Verma <vishal.l.verma@intel.com>,
	"Dave Jiang" <dave.jiang@intel.com>,
	Ira Weiny <ira.weiny@intel.com>, <virtualization@lists.linux.dev>,
	<nvdimm@lists.linux.dev>, <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v4 0/7] nvdimm: virtio_pmem: fix request lifetime and converge broken queue failures
Date: Fri, 12 Jun 2026 11:42:29 -0700	[thread overview]
Message-ID: <aixTFVGZVDaCxMis@aschofie-mobl2.lan> (raw)
In-Reply-To: <20260609120726.1714780-1-me@linux.beauty>

On Tue, Jun 09, 2026 at 08:07:14PM +0800, Li Chen wrote:
> Hi,

Hi Li Chen,

In case you missed it, a Sashiko AI review of this set has posted
feedback. Please take a look.

https://sashiko.dev/#/patchset/20260609120726.1714780-1-me%40linux.beauty

-- Alison

> 
> The nvdimm flush helper currently converts any non-zero provider flush
> callback error to -EIO. That hides useful errno values from providers. For
> example, virtio-pmem may fail child flush bio allocation with -ENOMEM, but
> that is currently reported as -EIO by nvdimm_flush().
> 
> The raw failure seen in the local mkfs sanity test was:
> 
>   wipefs: /dev/pmem0: cannot flush modified buffers: Input/output error
>   mkfs.ext4: Input/output error while writing out and closing file system
>   nd_region region0: dbg: nvdimm_flush rc=-5
> 
> The first two patches keep provider flush errors intact and make the
> virtio-pmem child flush bio allocation use GFP_NOIO. async_pmem_flush() can
> allocate a child flush bio from filesystem flush and writeback paths;
> GFP_NOIO is a better fit than GFP_ATOMIC there because the allocation may
> sleep but must not recurse into filesystem I/O reclaim. With these changes,
> the same mkfs sanity test reached mkfs_rc=0, mount_rc=0, and umount_rc=0.
> 
> The rest of the series addresses virtio-pmem request lifetime and broken
> virtqueue handling. The virtio-pmem flush path uses a virtqueue cookie/token
> to carry a per-request context through completion. Under broken virtqueue /
> notify failure conditions, the submitter can return and free the request
> object while the host/backend may still complete the published request. The
> IRQ completion handler then dereferences freed memory when waking waiters,
> which is reported by KASAN as a slab-use-after-free and may manifest as lock
> corruption (e.g. "BUG: spinlock already unlocked") without KASAN.
> 
> In addition, the flush path has two wait sites: one for virtqueue descriptor
> availability (-ENOSPC from virtqueue_add_sgs()) and one for request
> completion. If the virtqueue becomes broken, forward progress is no longer
> guaranteed and these waiters may sleep indefinitely unless the driver
> converges the failure and wakes all wait sites.
> 
> This series addresses these issues:
> 
> 1/7 nvdimm: preserve flush callback errors
> Return provider flush callback errors directly from nvdimm_flush().
> 
> 2/7 nvdimm: virtio_pmem: use GFP_NOIO for child flush bio
> Use GFP_NOIO for the child flush bio allocation.
> 
> 3/7 nvdimm: virtio_pmem: always wake -ENOSPC waiters
> Wake one -ENOSPC waiter for each reclaimed used buffer, decoupled from
> token completion.
> 
> 4/7 nvdimm: virtio_pmem: use READ_ONCE()/WRITE_ONCE() for wait flags
> Use READ_ONCE()/WRITE_ONCE() for the wait_event() flags (done and
> wq_buf_avail).
> 
> 5/7 nvdimm: virtio_pmem: refcount requests for token lifetime
> Refcount request objects so the token lifetime spans the window where it is
> reachable through the virtqueue until completion/drain drops the virtqueue
> reference.
> 
> 6/7 nvdimm: virtio_pmem: converge broken virtqueue to -EIO
> Track a device-level broken state to converge broken/notify failures to -EIO:
> wake all waiters and drain/detach outstanding requests to complete them with
> an error, and fail-fast new requests.
> 
> 7/7 nvdimm: virtio_pmem: drain requests in freeze
> Drain outstanding requests in freeze() before tearing down virtqueues so
> waiters do not sleep indefinitely.
> 
> Testing was done on QEMU x86_64 with a virtio-pmem device exported as
> /dev/pmem0. This v4 series applies to v7.1-rc7, builds with
> CONFIG_VIRTIO_PMEM=m, passes checkpatch, and passed the local repro checks
> with a local-only virtqueue_kick() fault injection. I also checked that it
> applies cleanly to next/master at 6e845bcb78c9 ("Add linux-next specific
> files for 20260605").
> 
> Thanks,
> Li Chen
> 
> Changelog:
> v3->v4:
> - Rebased the series onto v7.1-rc7 so it applies cleanly to Linux 7.1-rc7.
> - Update the allocation site in 6/7 from kmalloc(sizeof(*req_data),
>   GFP_KERNEL) to kmalloc_obj(*req_data) to match current nvdimm code.
> - Add 1/7 to preserve provider flush callback errors in nvdimm_flush().
> - Include the GFP_NOIO child flush bio allocation fix as 2/7.
> - Renumber the old request lifetime and broken virtqueue fixes after the two
>   new flush error patches.
> v2->v3:
> - Split patch 1 as suggested by Pankaj Gupta: keep the waiter wakeup
>   ordering change in 1/5 and move READ_ONCE()/WRITE_ONCE() updates to
>   2/5 (no functional change intended).
> - Add log report to commit msg
> - Fold the export fix into 4/5 to keep the series bisectable when
>   CONFIG_VIRTIO_PMEM=m.
> v1->v2: add the export patch to fix compile issue.
> 
> Links:
> v3: https://lore.kernel.org/all/20260226025712.2236279-1-me@linux.beauty/#t
> v2: https://lore.kernel.org/all/20251225042915.334117-1-me@linux.beauty/
> v1: https://www.spinics.net/lists/kernel/msg5974818.html
> 
> Li Chen (7):
>   nvdimm: preserve flush callback errors
>   nvdimm: virtio_pmem: use GFP_NOIO for child flush bio
>   nvdimm: virtio_pmem: always wake -ENOSPC waiters
>   nvdimm: virtio_pmem: use READ_ONCE()/WRITE_ONCE() for wait flags
>   nvdimm: virtio_pmem: refcount requests for token lifetime
>   nvdimm: virtio_pmem: converge broken virtqueue to -EIO
>   nvdimm: virtio_pmem: drain requests in freeze
> 
>  drivers/nvdimm/nd_virtio.c   | 139 +++++++++++++++++++++++++++++------
>  drivers/nvdimm/region_devs.c |   6 +-
>  drivers/nvdimm/virtio_pmem.c |  14 ++++
>  drivers/nvdimm/virtio_pmem.h |   6 ++
>  4 files changed, 139 insertions(+), 26 deletions(-)
> -- 
> 2.52.0

      parent reply	other threads:[~2026-06-12 18:42 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-09 12:07 [PATCH v4 0/7] nvdimm: virtio_pmem: fix request lifetime and converge broken queue failures Li Chen
2026-06-09 12:07 ` [PATCH v4 1/7] nvdimm: preserve flush callback errors Li Chen
2026-06-09 12:07 ` [PATCH v4 2/7] nvdimm: virtio_pmem: use GFP_NOIO for child flush bio Li Chen
2026-06-09 12:07 ` [PATCH v4 3/7] nvdimm: virtio_pmem: always wake -ENOSPC waiters Li Chen
2026-06-09 12:07 ` [PATCH v4 4/7] nvdimm: virtio_pmem: use READ_ONCE()/WRITE_ONCE() for wait flags Li Chen
2026-06-09 12:07 ` [PATCH v4 5/7] nvdimm: virtio_pmem: refcount requests for token lifetime Li Chen
2026-06-09 12:07 ` [PATCH v4 6/7] nvdimm: virtio_pmem: converge broken virtqueue to -EIO Li Chen
2026-06-09 12:07 ` [PATCH v4 7/7] nvdimm: virtio_pmem: drain requests in freeze Li Chen
2026-06-12 18:42 ` Alison Schofield [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aixTFVGZVDaCxMis@aschofie-mobl2.lan \
    --to=alison.schofield@intel.com \
    --cc=dan.j.williams@intel.com \
    --cc=dave.jiang@intel.com \
    --cc=ira.weiny@intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=me@linux.beauty \
    --cc=nvdimm@lists.linux.dev \
    --cc=pankaj.gupta.linux@gmail.com \
    --cc=virtualization@lists.linux.dev \
    --cc=vishal.l.verma@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox