Re: [PATCH for-10.2] file-posix: Handle suspended dm-multipath better for SG_IO

qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed

From: Hanna Czenczek <hreitz@redhat.com>
To: Kevin Wolf <kwolf@redhat.com>, qemu-block@nongnu.org
Cc: qinwang@redhat.com, bmarzins@redhat.com, qemu-devel@nongnu.org,
	qemu-stable@nongnu.org
Subject: Re: [PATCH for-10.2] file-posix: Handle suspended dm-multipath better for SG_IO
Date: Wed, 3 Dec 2025 18:27:20 +0100	[thread overview]
Message-ID: <1c7784d0-8952-4b4f-9a5e-923bdaef9f2f@redhat.com> (raw)
In-Reply-To: <20251128221440.89125-1-kwolf@redhat.com>

On 28.11.25 23:14, Kevin Wolf wrote:
> When introducing DM_MPATH_PROBE_PATHS, we already anticipated that
> dm-multipath devices might be suspended for a short time when the DM
> tables are reloaded and that they return -EAGAIN in this case. The
> behaviour promised in the comment wasn't actually implemented, though:
> We don't get SG_IO_MAX_RETRIES in practice, because after the first
> 1ms sleep, DM_MPATH_PROBE_PATHS is called and if that still fails with
> -EAGAIN, we error out immediately without any retry.

How so?  `hdev_co_ioctl_sgio_retry()` is what issues 
`DM_MPATH_PROBE_PATHS`, and if it gets `-EAGAIN` it will return `true`, 
requesting a retry.

> However, meanwhile it has also turned out that libmpathpersist (which is
> used by qemu-pr-helper) may need to perform more complex recovery
> operations to get reservations back to expected state if a path failure
> happened in the middle of a PR operation. In this case, the device is
> suspended for a longer time compared to the case we originally expected.

In any case, this does warrant the change.

> This patch changes hdev_co_ioctl() to treat -EAGAIN separately so that
> it doesn't result in an immediate failure if the device is suspended for
> more than 1ms, and moves to incremental backoff to cover both quick and
> slow cases without excessive delays.
>
> Buglink: https://issues.redhat.com/browse/RHEL-121543
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
>   block/file-posix.c | 56 ++++++++++++++++++++++++++++------------------
>   1 file changed, 34 insertions(+), 22 deletions(-)

Reviewed-by: Hanna Czenczek <hreitz@redhat.com>

> diff --git a/block/file-posix.c b/block/file-posix.c
> index c9e367a222..6265d2e248 100644
> --- a/block/file-posix.c
> +++ b/block/file-posix.c
> @@ -4288,25 +4288,8 @@ hdev_open_Mac_error:
>   static bool coroutine_fn sgio_path_error(int ret, sg_io_hdr_t *io_hdr)
>   {
>       if (ret < 0) {
> -        switch (ret) {
> -        case -ENODEV:
> -            return true;
> -        case -EAGAIN:
> -            /*
> -             * The device is probably suspended. This happens while the dm table
> -             * is reloaded, e.g. because a path is added or removed. This is an
> -             * operation that should complete within 1ms, so just wait a bit and
> -             * retry.
> -             *
> -             * If the device was suspended for another reason, we'll wait and
> -             * retry SG_IO_MAX_RETRIES times. This is a tolerable delay before
> -             * we return an error and potentially stop the VM.
> -             */
> -            qemu_co_sleep_ns(QEMU_CLOCK_REALTIME, 1000000);
> -            return true;
> -        default:
> -            return false;
> -        }
> +        /* Path errors sometimes result in -ENODEV */
> +        return ret == -ENODEV;
>       }
>   
>       if (io_hdr->host_status != SCSI_HOST_OK) {
> @@ -4375,6 +4358,7 @@ hdev_co_ioctl(BlockDriverState *bs, unsigned long int req, void *buf)
>   {
>       BDRVRawState *s = bs->opaque;
>       RawPosixAIOData acb;
> +    uint64_t eagain_sleep_ns = 1 * SCALE_MS;
>       int retries = SG_IO_MAX_RETRIES;
>       int ret;
>   
> @@ -4403,9 +4387,37 @@ hdev_co_ioctl(BlockDriverState *bs, unsigned long int req, void *buf)
>           },
>       };
>   
> -    do {
> -        ret = raw_thread_pool_submit(handle_aiocb_ioctl, &acb);
> -    } while (req == SG_IO && retries-- && hdev_co_ioctl_sgio_retry(&acb, ret));
> +retry:
> +    ret = raw_thread_pool_submit(handle_aiocb_ioctl, &acb);
> +    if (req == SG_IO && s->use_mpath) {
> +        if (ret == -EAGAIN && eagain_sleep_ns < NANOSECONDS_PER_SECOND) {
> +            /*
> +             * If this is a multipath device, it is probably suspended.
> +             *
> +             * This can happen while the dm table is reloaded, e.g. because a
> +             * path is added or removed. This is an operation that should
> +             * complete within 1ms, so just wait a bit and retry.
> +             *
> +             * There are also some cases in which libmpathpersist must recover
> +             * from path failure during its operation, which can leave the
> +             * device suspended for a bit longer while the library brings back
> +             * reservations into the expected state.
> +             *
> +             * Use increasing delays to cover both cases without waiting
> +             * excessively, and stop after a bit more than a second (1023 ms).
> +             * This is a tolerable delay before we return an error and
> +             * potentially stop the VM.
> +             */
> +            qemu_co_sleep_ns(QEMU_CLOCK_REALTIME, eagain_sleep_ns);
> +            eagain_sleep_ns *= 2;
> +            goto retry;
> +        }
> +
> +        /* Even for ret == 0, the SG_IO header can contain an error */
> +        if (retries-- && hdev_co_ioctl_sgio_retry(&acb, ret)) {
> +            goto retry;
> +        }
> +    }
>   
>       return ret;
>   }

next prev parent reply	other threads:[~2025-12-03 17:28 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-11-28 22:14 [PATCH for-10.2] file-posix: Handle suspended dm-multipath better for SG_IO Kevin Wolf
2025-12-03 17:27 ` Hanna Czenczek [this message]
2025-12-04 10:44   ` Kevin Wolf

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1c7784d0-8952-4b4f-9a5e-923bdaef9f2f@redhat.com \
    --to=hreitz@redhat.com \
    --cc=bmarzins@redhat.com \
    --cc=kwolf@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=qemu-stable@nongnu.org \
    --cc=qinwang@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).