QEMU-Devel Archive on lore.kernel.org
 help / color / mirror / Atom feed
From: Peter Xu <peterx@redhat.com>
To: Peter Maydell <peter.maydell@linaro.org>
Cc: qemu-devel@nongnu.org, Juraj Marcin <jmarcin@redhat.com>,
	Fabiano Rosas <farosas@suse.de>
Subject: Re: [PATCH v2] migration: Fix possible division by zero on calc expected downtime
Date: Mon, 11 May 2026 14:21:35 -0400	[thread overview]
Message-ID: <agIeL1qYpqmDppab@x1.local> (raw)
In-Reply-To: <CAFEAcA-Tup_NPTTi3x0xk_thyku8yN3qNGvzJRt8A0_vJbQZ7g@mail.gmail.com>

On Mon, May 11, 2026 at 07:03:08PM +0100, Peter Maydell wrote:
> On Mon, 11 May 2026 at 18:47, Peter Xu <peterx@redhat.com> wrote:
> >
> > On Mon, May 11, 2026 at 04:47:22PM +0100, Peter Maydell wrote:
> > > On Mon, 11 May 2026 at 16:20, Peter Xu <peterx@redhat.com> wrote:
> > > >
> > > > Commit dd4fe8844b changed the reporting of expected downtime behavior, so
> > > > that the value will be calculated on-demand.  One side effect on the change
> > > > is QEMU will allow the calculation to happen anytime even if there's no
> > > > transfer happening for a short while.
> > > >
> > > > PeterM reported an ubsan report from clang when running migration-test with
> > > > aarch64 binary on x86_64 hosts.  I can also reproduce if I run the test
> > > > concurrently so some of the src QEMU may not get chance to push any data,
> > > > causing mbps to be 0:
> > > >
> > > > ../migration/migration.c:1051:12: runtime error: -nan is outside the range of representable values of type 'long'
> > > >
> > > > Fix it by properly handle both Inf and Nan.  One note is we can't use
> > > > ">"/">=" check here otherwise we cannot cover Nan.
> > > >
> > > > Link: https://lore.kernel.org/r/CAFEAcA-MYH6C39xO0OLx4-M5pKurJpurwRsMqZe9q=W-NShAbw@mail.gmail.com
> > > > Reported-by: Peter Maydell <peter.maydell@linaro.org>
> > > > Fixes: dd4fe8844b ("migration: Calculate expected downtime on demand")
> > > > Signed-off-by: Peter Xu <peterx@redhat.com>
> > > > ---
> > > >  migration/migration.c | 18 +++++++++++++++++-
> > > >  1 file changed, 17 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/migration/migration.c b/migration/migration.c
> > > > index b6f78eb3ac..e4103cd3f0 100644
> > > > --- a/migration/migration.c
> > > > +++ b/migration/migration.c
> > > > @@ -1044,12 +1044,28 @@ static bool migrate_show_downtime(MigrationState *s)
> > > >  /* Return expected downtime (unit: milliseconds) */
> > > >  int64_t migration_downtime_calc_expected(MigrationState *s)
> > > >  {
> > > > +    double expected_ms;
> > > > +
> > > >      if (mig_stats.dirty_sync_count <= 1) {
> > > >          return migrate_downtime_limit();
> > > >      }
> > > >
> > > > -    return mig_stats.dirty_bytes_last_sync /
> > > > +    expected_ms = mig_stats.dirty_bytes_last_sync /
> > > >          migration_get_switchover_bw(s) * 1000;
> > > > +
> > > > +    /*
> > > > +     * This "<" check covers two cases where we want to fallback to
> > > > +     * INT64_MAX, the 1st case is obvious, but the 2nd is not:
> > > > +     *
> > > > +     * (1) when expected_ms is Inf, or anything too big for int64_t
> > > > +     * (2) when expected_ms is Nan (division by zero), evaluation of this
> > >
> > > This should say "zero divided by zero" -- general division by
> > > zero gives Inf, and it's only 0 / 0 that runs into NaN.
> > >
> > > > +     *     if clause will be FALSE
> > > > +     */
> > > > +    if (expected_ms < (double)INT64_MAX) {
> > >
> > > This works, but maybe we should write it out
> > >       if (isnan(expected_ms) || expected_ms < (double)INT64_MAX) {
> >
> > I agree using isnan() is better than comment. Though code in the patch for
> > the next line here is:
> >
> > +           return (int64_t) expected_ms;
> 
> Oops, yes, I got the sense of the condition wrong.
> 
> > Do you think below should work?
> >
> >     expected_ms = ...;
> >
> >     /* For isnan() (0/0) case, we can return anything; return MAX too */
> >     if (isnan(expected_ms) || expected_ms >= (double)INT64_MAX) {
> >         return INT64_MAX;
> >     }
> 
> Yes, this will work. But I think rather than "return anything"
> we ought to say why what we're returning is a sensible value
> for the use case we have. How about:

Just to mention, here when I mentioned "anything", what actually in my mind
is the previous valid value we reported, like before the change of commit
dd4fe8844b5, here we used to have a cache value and only update if we
transferred more than 10k bytes (which itself is a magic value).

But I'm not sure if we need to keep that behavior either..

> 
> /*
>  * If we haven't been able to transfer any data, the result here
>  * could be NaN (for 0 / 0) or infinity (something else / 0).

Theoretically, we can also come to affinity if we sent something small but
the total dirty data is rediculously large, but yeah, I'm OK with this
wording; even if it may not be accurate, it's clear enough to me as a
comment to help reading.

>  * Return INT64_MAX as our best approximation to "this will
>  * take forever to complete". If the problem is transient
>  * (e.g. we just haven't started to transfer yet) we'll
>  * recalculate to a more accurate figure later.
>  */
> 
> ?

I'll use the comment suggested, thanks.

-- 
Peter Xu



      reply	other threads:[~2026-05-11 18:22 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-11 15:20 [PATCH v2] migration: Fix possible division by zero on calc expected downtime Peter Xu
2026-05-11 15:27 ` Peter Maydell
2026-05-11 15:47 ` Peter Maydell
2026-05-11 17:47   ` Peter Xu
2026-05-11 18:03     ` Peter Maydell
2026-05-11 18:21       ` Peter Xu [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=agIeL1qYpqmDppab@x1.local \
    --to=peterx@redhat.com \
    --cc=farosas@suse.de \
    --cc=jmarcin@redhat.com \
    --cc=peter.maydell@linaro.org \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox