From: Peter Xu <peterx@redhat.com>
To: Joao Martins <joao.m.martins@oracle.com>
Cc: qemu-devel@nongnu.org, Fabiano Rosas <farosas@suse.de>,
	Juan Quintela <quintela@redhat.com>
Subject: Re: [PATCH 0/3] migration: Downtime tracepoints
Date: Thu, 26 Oct 2023 14:18:49 -0400
Message-ID: <ZTqtieZo/VaSscp5@x1n>
In-Reply-To: <ZTqb/XDnwhkTUL3s@x1n>

On Thu, Oct 26, 2023 at 01:03:57PM -0400, Peter Xu wrote:
> On Thu, Oct 26, 2023 at 05:06:37PM +0100, Joao Martins wrote:
> > On 26/10/2023 16:53, Peter Xu wrote:
> > > This small series (really just the last patch; the first two are cleanups)
> > > aims to improve QEMU's downtime analysis, similar to what Joao proposed
> > > here:
> > > 
> > >   https://lore.kernel.org/r/20230926161841.98464-1-joao.m.martins@oracle.com
> > > 
> > Thanks for following up on the idea; it's been hard to find enough bandwidth
> > for everything over the past few weeks :(
> 
> Yeah, totally understood.  I think our QE team pushed me towards some
> series like this, while my plan was waiting for your new version. :)
> 
> Then when I started I decided to go per-device.  I was also thinking of
> persisting that information, but then I remembered some ppc guests can
> have ~40,000 vmstates..  and the memory to maintain that may or may not
> regress a ppc user.  So I figured I should first keep it simple with
> tracepoints.
> 
> > 
> > > But with a few differences:
> > > 
> > >   - Nothing exported yet to qapi, all tracepoints so far
> > > 
> > >   - Instead of major checkpoints (stop, iterable, non-iterable, resume-rp),
> > >     finer granularity: downtime measurements for each vmstate (in
> > >     microseconds, to be accurate).  So far it seems iterable /
> > >     non-iterable is the core of the problem, and I want to nail it down
> > >     per-device.
> > > 
> > >   - Trace dest QEMU too
> > > 
> > > For the last bullet: consider the case where a device save() can be super
> > > fast, while load() can actually be super slow.  Both of them will
> > > contribute to the ultimate downtime, but not as a simple sum: when src
> > > QEMU is save()ing device1, dst QEMU can be load()ing device2, so
> > > they can run in parallel.  The only way to figure out all the components
> > > of the downtime is to record both.
> > > 
> > > Please have a look, thanks.
> > >
> > 
> > I like your series, as it allows a user to pinpoint one particular bad device,
> > while covering the load side too. The migration checkpoints, on the other hand,
> > were useful -- while also a bit ugly -- for the big picture of how
> > downtime breaks down. Perhaps we could add those /also/ as tracepoints without
> > specifically committing to exposing them in QAPI.
> > 
> > More fundamentally, how can one capture the 'stop' part? There's also time spent
> > there, e.g. quiescing/stopping vhost-net workers, or suspending the VF
> > device. All of that is likely as bad as what those tracepoints cover for
> > device-state/RAM (the iterable and non-iterable portions).
> 
> Yeah that's a good point.  I didn't cover "stop" yet because I think it's
> just more tricky and I didn't think it all through, yet.
> 
> The first question is, when stopping some backends, the vCPUs are still
> running, so it's not 100% clear to me which parts should be counted as
> part of the real downtime.

I was wrong.. we always stop vcpus first.

If you don't mind, I can add some tracepoints for all those spots in this
series to cover your other series.  I'll also make sure I do that for both
sides.
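
As a rough illustration of why both sides need to be recorded (the
save()/load() overlap quoted above), here is a toy model.  It is not QEMU
code: the function name pipeline_downtime_us, the device names and all the
numbers are made up.  It assumes the source saves devices one after another,
the destination loads them in the same order, a load can only start once that
device's save has finished, and transfer time is ignored.

```python
# Toy model only, not QEMU code.  All names and timings are hypothetical.

def pipeline_downtime_us(devices):
    """Return the time (in microseconds) at which the last load() finishes.

    devices: list of (name, save_us, load_us) tuples, processed in order.
    """
    save_done = 0  # when the source finishes saving the current device
    load_done = 0  # when the destination finishes loading the current device
    for _name, save_us, load_us in devices:
        save_done += save_us
        # The destination waits for both the data and its own previous load.
        load_done = max(save_done, load_done) + load_us
    return load_done


devices = [
    ("device1", 100, 5000),  # fast save(), slow load()
    ("device2", 3000, 200),  # slow save(), fast load()
]

naive_sum = sum(save_us + load_us for _name, save_us, load_us in devices)
overlapped = pipeline_downtime_us(devices)
print(naive_sum, overlapped)  # prints "8300 5300"
```

In this made-up case the overlapped figure is well below the plain sum, and
neither side's numbers alone would tell you which device dominates.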

Thanks,

> 
> Meanwhile that'll be another angle besides vmstates: we need to keep an eye
> on the state change handlers, and that can be a device, or something else.
> 
> Did you measure the stop process in some way before?  Do you have some
> rough number or anything surprising you already observed?
> 
> Thanks,
> 
> -- 
> Peter Xu

-- 
Peter Xu


