Re: virtio-blk/ext4 error handling for host-side ENOSPC

From: Stefan Hajnoczi <stefanha@redhat.com>
To: Keiichi Watanabe <keiichiw@chromium.org>
Cc: dverkamp@chromium.org, linux-fsdevel@vger.kernel.org,
	takayas@chromium.org, tytso@mit.edu, uekawa@chromium.org
Subject: Re: virtio-blk/ext4 error handling for host-side ENOSPC
Date: Thu, 11 Jul 2024 08:02:46 +0200
Message-ID: <20240711060246.GA563880@dynamic-pd01.res.v6.highway.a1.net>
In-Reply-To: <CAD90VcbVFm7YVsrubQs_B_baDHp432v4BuaAZ382VfT2XQ-hHQ@mail.gmail.com>

On Fri, Jun 28, 2024 at 12:29:05PM +0900, Keiichi Watanabe wrote:
> Hi Stefan,
> 
> Thanks for sharing QEMU's approach!
> We also have a similar early notification mechanism to avoid low-disk
> conditions.
> However, the approach I would like to propose is to prevent pausing
> the guest by allowing the guest to retry requests after a while.
> 
> On Wed, Jun 19, 2024 at 10:57 PM Stefan Hajnoczi <stefanha@redhat.com> wrote:
> >
> > > What do you think of this idea? Also, has anything similar been attempted yet?
> >
> > Hi Keiichi,
> > Yes, there is an existing approach that is related but not identical to
> > what you are exploring:
> >
> > QEMU has an option to pause the guest and raise a notification to the
> > management tool that ENOSPC has been reached. The guest is unable to
> > resolve ENOSPC itself and guest applications are likely to fail when the
> > disk becomes unavailable, hence the guest is simply paused.
> >
> > In systems that expect to hit this condition, this pause behavior can be
> > combined with an early notification when a free space watermark is hit.
> > This way guests are almost never paused because free space can be added
> > before ENOSPC is reached. QEMU has a write watermark feature that works
> > well on top of qcow2 images (they grow incrementally so it's trivial to
> > monitor how much space is being consumed).
> >
> > I wanted to share this existing approach in case you think it would work
> > nicely for your use case.
> >
> > The other thought I had was: how does the new ENOSPC error fit into the
> > block device model? Hopefully this behavior is not virtio-blk-specific
> > behavior but rather something general that other storage protocols like
> > NVMe and SCSI support too. That way file systems can handle this in a
> > generic fashion.
> >
> > The place I would check is Logical Block Provisioning in SCSI and NVMe.
> > Perhaps there are features in these protocols for reporting low
> > resources? (Sorry, I didn't have time to check.)
> 
> For SCSI, THIN_PROVISIONING_SOFT_THRESHOLD_REACHED looks like the one.
> For NVMe, NVME_SC_CAPACITY_EXCEEDED looks like this.
> 
> I guess we can add a new error state in the ext4 layer. Let's say it's
> "HOST_NOSPACE" in ext4. This should be used when virtio-blk returns
> ENOSPC or virtio-scsi returns
> THIN_PROVISIONING_SOFT_THRESHOLD_REACHED. I'm not sure if there is a
> case where NVME_SC_CAPACITY_EXCEEDED is translated to this state
> because we don't have virtio-nvme.
> If ext4 is in the state of HOST_NOSPACE, ext4 will periodically try to
> write to the disk (= virtio-blk or virtio-scsi) several times. If this
> fails a certain number of times, the guest will report a disk error.
> What do you think?
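
The retry-then-fail behavior proposed in the quoted text could be sketched
roughly as below. This is only an illustration in plain userspace C, not
ext4 or virtio code: the names fs_state, nospace_tracker, max_failures and
nospace_on_write_result are hypothetical, and the retry limit is an assumed
tunable that the thread does not specify.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical states; "HOST_NOSPACE" is the name floated in the thread. */
enum fs_state { FS_OK, FS_HOST_NOSPACE, FS_ERROR };

struct nospace_tracker {
    enum fs_state state;
    unsigned int failures;      /* consecutive ENOSPC results seen so far */
    unsigned int max_failures;  /* assumed tunable, not from the thread */
};

/*
 * Record the result of one write attempt (0 or a negative errno).
 * Returns true if the caller should retry the write later.
 */
static bool nospace_on_write_result(struct nospace_tracker *t, int err)
{
    if (t->state == FS_ERROR)
        return false;               /* a reported disk error is sticky */

    if (err == 0) {
        t->state = FS_OK;           /* the host freed space; recover */
        t->failures = 0;
        return false;
    }

    if (err != -ENOSPC) {
        t->state = FS_ERROR;        /* unrelated I/O error: fail at once */
        return false;
    }

    t->failures++;
    if (t->failures >= t->max_failures) {
        t->state = FS_ERROR;        /* retries exhausted: report disk error */
        return false;
    }

    t->state = FS_HOST_NOSPACE;     /* keep the data and retry after a delay */
    return true;
}
```

How long to wait between retries, and what happens to dirty pages in the
meantime, is left open here, as it is in the proposal.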

I'm sure virtio-blk can be extended if you can work with the file system
maintainers to introduce the concept of logical block exhaustion. There
might be complications for fsync and memory pressure if pages cannot be
written back to exhausted devices, but I'm not an expert.

Stefan
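
For readers unfamiliar with the write watermark idea quoted earlier in this
message, here is a minimal sketch of the check, assuming a single threshold
offset that fires one notification when a write first extends past it. The
struct and function names are hypothetical and this is not QEMU's
implementation.

```c
#include <stdbool.h>
#include <stdint.h>

struct watermark {
    uint64_t threshold;  /* byte offset that should trigger a notification */
    bool fired;          /* set once the notification has been raised */
};

/*
 * Call for every write of len bytes at offset. Returns true exactly once,
 * on the first write whose end goes past the threshold, so the management
 * tool can add space before ENOSPC is actually reached.
 */
static bool watermark_check(struct watermark *w, uint64_t offset, uint64_t len)
{
    if (w->fired)
        return false;

    /* Overflow-safe form of: offset + len <= threshold */
    if (len <= w->threshold && offset <= w->threshold - len)
        return false;

    w->fired = true;
    return true;
}
```

Because a qcow2 image grows incrementally, the highest written offset is a
reasonable proxy for consumed host space, which is why a check this simple
can be enough.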

Thread overview: 5+ messages
2024-06-17  3:34 virtio-blk/ext4 error handling for host-side ENOSPC Keiichi Watanabe
2024-06-18  8:33 ` Keiichi Watanabe
2024-06-19 13:57   ` Stefan Hajnoczi
2024-06-28  3:29     ` Keiichi Watanabe
2024-07-11  6:02       ` Stefan Hajnoczi [this message]
