All of lore.kernel.org
 help / color / mirror / Atom feed
From: Peter Xu <peterx@redhat.com>
To: Peter Maydell <peter.maydell@linaro.org>
Cc: qemu-devel@nongnu.org, Juraj Marcin <jmarcin@redhat.com>,
	Fabiano Rosas <farosas@suse.de>
Subject: Re: [PATCH v2] migration: Fix possible division by zero on calc expected downtime
Date: Mon, 11 May 2026 14:21:35 -0400	[thread overview]
Message-ID: <agIeL1qYpqmDppab@x1.local> (raw)
In-Reply-To: <CAFEAcA-Tup_NPTTi3x0xk_thyku8yN3qNGvzJRt8A0_vJbQZ7g@mail.gmail.com>

On Mon, May 11, 2026 at 07:03:08PM +0100, Peter Maydell wrote:
> On Mon, 11 May 2026 at 18:47, Peter Xu <peterx@redhat.com> wrote:
> >
> > On Mon, May 11, 2026 at 04:47:22PM +0100, Peter Maydell wrote:
> > > On Mon, 11 May 2026 at 16:20, Peter Xu <peterx@redhat.com> wrote:
> > > >
> > > > Commit dd4fe8844b changed the reporting of expected downtime behavior, so
> > > > that the value will be calculated on-demand.  One side effect on the change
> > > > is QEMU will allow the calculation to happen anytime even if there's no
> > > > transfer happening for a short while.
> > > >
> > > > PeterM reported an ubsan report from clang when running migration-test with
> > > > aarch64 binary on x86_64 hosts.  I can also reproduce if I run the test
> > > > concurrently so some of the src QEMU may not get chance to push any data,
> > > > causing mbps to be 0:
> > > >
> > > > ../migration/migration.c:1051:12: runtime error: -nan is outside the range of representable values of type 'long'
> > > >
> > > > Fix it by properly handle both Inf and Nan.  One note is we can't use
> > > > ">"/">=" check here otherwise we cannot cover Nan.
> > > >
> > > > Link: https://lore.kernel.org/r/CAFEAcA-MYH6C39xO0OLx4-M5pKurJpurwRsMqZe9q=W-NShAbw@mail.gmail.com
> > > > Reported-by: Peter Maydell <peter.maydell@linaro.org>
> > > > Fixes: dd4fe8844b ("migration: Calculate expected downtime on demand")
> > > > Signed-off-by: Peter Xu <peterx@redhat.com>
> > > > ---
> > > >  migration/migration.c | 18 +++++++++++++++++-
> > > >  1 file changed, 17 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/migration/migration.c b/migration/migration.c
> > > > index b6f78eb3ac..e4103cd3f0 100644
> > > > --- a/migration/migration.c
> > > > +++ b/migration/migration.c
> > > > @@ -1044,12 +1044,28 @@ static bool migrate_show_downtime(MigrationState *s)
> > > >  /* Return expected downtime (unit: milliseconds) */
> > > >  int64_t migration_downtime_calc_expected(MigrationState *s)
> > > >  {
> > > > +    double expected_ms;
> > > > +
> > > >      if (mig_stats.dirty_sync_count <= 1) {
> > > >          return migrate_downtime_limit();
> > > >      }
> > > >
> > > > -    return mig_stats.dirty_bytes_last_sync /
> > > > +    expected_ms = mig_stats.dirty_bytes_last_sync /
> > > >          migration_get_switchover_bw(s) * 1000;
> > > > +
> > > > +    /*
> > > > +     * This "<" check covers two cases where we want to fallback to
> > > > +     * INT64_MAX, the 1st case is obvious, but the 2nd is not:
> > > > +     *
> > > > +     * (1) when expected_ms is Inf, or anything too big for int64_t
> > > > +     * (2) when expected_ms is Nan (division by zero), evaluation of this
> > >
> > > This should say "zero divided by zero" -- general division by
> > > zero gives Inf, and it's only 0 / 0 that runs into NaN.
> > >
> > > > +     *     if clause will be FALSE
> > > > +     */
> > > > +    if (expected_ms < (double)INT64_MAX) {
> > >
> > > This works, but maybe we should write it out
> > >       if (isnan(expected_ms) || expected_ms < (double)INT64_MAX) {
> >
> > I agree using isnan() is better than comment. Though code in the patch for
> > the next line here is:
> >
> > +           return (int64_t) expected_ms;
> 
> Oops, yes, I got the sense of the condition wrong.
> 
> > Do you think below should work?
> >
> >     expected_ms = ...;
> >
> >     /* For isnan() (0/0) case, we can return anything; return MAX too */
> >     if (isnan(expected_ms) || expected_ms >= (double)INT64_MAX) {
> >         return INT64_MAX;
> >     }
> 
> Yes, this will work. But I think rather than "return anything"
> we ought to say why what we're returning is a sensible value
> for the use case we have. How about:

Just to mention, here when I mentioned "anything", what actually in my mind
is the previous valid value we reported, like before the change of commit
dd4fe8844b5, here we used to have a cache value and only update if we
transferred more than 10k bytes (which itself is a magic value).

But I'm not sure if we need to keep that behavior either..

> 
> /*
>  * If we haven't been able to transfer any data, the result here
>  * could be NaN (for 0 / 0) or infinity (something else / 0).

Theoretically, we can also come to affinity if we sent something small but
the total dirty data is rediculously large, but yeah, I'm OK with this
wording; even if it may not be accurate, it's clear enough to me as a
comment to help reading.

>  * Return INT64_MAX as our best approximation to "this will
>  * take forever to complete". If the problem is transient
>  * (e.g. we just haven't started to transfer yet) we'll
>  * recalculate to a more accurate figure later.
>  */
> 
> ?

I'll use the comment suggested, thanks.

-- 
Peter Xu



      reply	other threads:[~2026-05-11 18:22 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-11 15:20 [PATCH v2] migration: Fix possible division by zero on calc expected downtime Peter Xu
2026-05-11 15:27 ` Peter Maydell
2026-05-11 15:47 ` Peter Maydell
2026-05-11 17:47   ` Peter Xu
2026-05-11 18:03     ` Peter Maydell
2026-05-11 18:21       ` Peter Xu [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=agIeL1qYpqmDppab@x1.local \
    --to=peterx@redhat.com \
    --cc=farosas@suse.de \
    --cc=jmarcin@redhat.com \
    --cc=peter.maydell@linaro.org \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.