From: Peter Xu <peterx@redhat.com>
To: "Cédric Le Goater" <clg@redhat.com>
Cc: qemu-devel@nongnu.org, Alex Williamson <alex@shazbot.org>,
Avihai Horon <avihaih@nvidia.com>
Subject: Re: [PATCH] vfio/migration: Detect and report overflow in migration size queries
Date: Wed, 13 May 2026 13:09:15 -0400
Message-ID: <agSwO7PjtjovanZl@x1.local>
In-Reply-To: <20260513094522.346314-1-clg@redhat.com>
On Wed, May 13, 2026 at 11:45:22AM +0200, Cédric Le Goater wrote:
> VFIO migration ioctls (VFIO_DEVICE_FEATURE_MIG_DATA_SIZE and
> VFIO_MIG_GET_PRECOPY_INFO) return device-estimated migration sizes as
> uint64_t values. A misbehaving kernel driver could return values that
> are unreasonably large, which would corrupt the size accounting used
> to decide migration convergence.
>
> This misbehavior occurred a few times when testing migration of a VM
> with an assigned NVIDIA vGPU and an MLX5 VF. In some of the save
> iterations, the reported precopy and stopcopy sizes were unreasonably
> large (close to UINT64_MAX):
>
> vfio_state_pending (4fbce62c-8ce2-4cc9-b429-41635bc94f24) stopcopy size 0 precopy initial size 18446744073708667040 precopy dirty size 0
> vfio_save_iterate (4fbce62c-8ce2-4cc9-b429-41635bc94f24) precopy initial size 18446744073707618464 precopy dirty size 0
> vfio_state_pending (4fbce62c-8ce2-4cc9-b429-41635bc94f24) stopcopy size 18446744073708503040 precopy initial size 18446744073707618464 precopy dirty size 0
> vfio_state_pending (4fbce62c-8ce2-4cc9-b429-41635bc94f24) stopcopy size 0 precopy initial size 18446744073707618464 precopy dirty size 0
> vfio_state_pending (0000:b1:01.0) stopcopy size 18446744073709543408 precopy initial size 0 precopy dirty size 1008
>
> This had the effect of corrupting the migration convergence estimates,
> as reported by the HMP "info migrate" command:
>
> (qemu) info migrate
> Status: active
> Time (ms): total=21140, setup=86, exp_down=152455434886355
> Remaining: 16 EiB
> RAM info:
> Throughput (Mbps): 967.98
> Sizes: pagesize=4 KiB, total=4 GiB
> Transfers: transferred=2.29 GiB, remain=4.7 MiB
> Channels: precopy=1.91 GiB, multifd=0 B, postcopy=0 B, vfio=387 MiB
> Page Types: normal=499427, zero=559708
> Page Rates (pps): transfer=0, dirty=1892
> Others: dirty_syncs=3
>
> Add a helper to detect values that exceed INT64_MAX, which is far
> beyond any realistic device state size, and report them with an error
> message. Return -ERANGE from the query functions so callers can abort
> the migration rather than proceeding with corrupted estimates.
> However, the callers don't yet check the return value to actually stop
> the migration.
>
> Cc: Avihai Horon <avihaih@nvidia.com>
> Cc: Peter Xu <peterx@redhat.com>
> Signed-off-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
--
Peter Xu
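
For readers following the thread, here is a minimal sketch in C of the kind
of range check the commit message describes. It is not the helper added by
the patch: the names mig_size_check() and bytes_below_wrap() are
hypothetical, and the error is printed with fprintf() rather than QEMU's
error reporting. The second function only illustrates arithmetic on the
traced values: 18446744073708667040 is 884576 below 2^64, which is
consistent with (though not proof of) a small negative quantity being
stored in a uint64_t.

```c
#include <errno.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper (assumption, not the patch's code): reject a
 * device-reported size that exceeds INT64_MAX.
 * Returns 0 if the value is plausible, -ERANGE otherwise.
 */
static int mig_size_check(const char *what, uint64_t size)
{
    if (size > (uint64_t)INT64_MAX) {
        fprintf(stderr, "%s: implausible size %" PRIu64 "\n", what, size);
        return -ERANGE;
    }
    return 0;
}

/*
 * How far a value sits below 2^64. Unsigned arithmetic is modulo 2^64,
 * so 0 - size yields exactly that distance.
 */
static uint64_t bytes_below_wrap(uint64_t size)
{
    return UINT64_C(0) - size;
}
```

With the values from the trace above, mig_size_check() would return -ERANGE
for the precopy initial size 18446744073708667040 and 0 for the precopy
dirty size 1008. As the commit message notes, returning the error is only
useful once callers check it.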