RE: [PATCH 0/2] Migration time prediction using calc-dirty-rate

From: Gudkov Andrei via <qemu-devel@nongnu.org>
To: "Daniel P. Berrangé" <berrange@redhat.com>
Cc: "qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
	"quintela@redhat.com" <quintela@redhat.com>,
	"dgilbert@redhat.com" <dgilbert@redhat.com>
Subject: RE: [PATCH 0/2] Migration time prediction using calc-dirty-rate
Date: Thu, 27 Apr 2023 13:51:52 +0000
Message-ID: <ace97633672345e78a9325cd72be48e2@huawei.com>
In-Reply-To: <ZD7QrDJLrxjKehD3@redhat.com>

Thank you for the review. I have submitted a new version of the patch:
https://patchew.org/QEMU/cover.1682598010.git.gudkov.andrei@huawei.com/

> -----Original Message-----
> From: Daniel P. Berrangé [mailto:berrange@redhat.com]
> Sent: Tuesday, April 18, 2023 20:18
> To: Gudkov Andrei <gudkov.andrei@huawei.com>
> Cc: qemu-devel@nongnu.org; quintela@redhat.com; dgilbert@redhat.com
> Subject: Re: [PATCH 0/2] Migration time prediction using calc-dirty-rate
> 
> On Tue, Feb 28, 2023 at 04:16:01PM +0300, Andrei Gudkov via wrote:
> > Summary of calc-dirty-rate changes:
> >
> > 1. The most important change is that calc-dirty-rate now produces
> >    a *vector* of dirty page measurements for progressively increasing time
> >    periods: 125ms, 250, 500, 750, 1000, 1500, .., up to the specified calc-time.
> >    The motivation behind this change is that the number of dirtied pages as
> >    a function of time starting from a "clean state" (new migration iteration)
> >    is far from linear. The shape of this function depends on the workload type
> >    and intensity. Measuring the number of dirty pages at progressively
> >    increasing periods allows reconstructing this function using piece-wise
> >    interpolation.
> >
> > 2. New metric added -- number of all-zero pages.
> >    The predictor needs to distinguish between zero and non-zero pages
> >    because during migration only an 8-byte header is placed on the wire
> >    for an all-zero page.
> >
> > 3. Hashing function was changed from CRC32 to xxHash.
> >    This reduces overhead of sampling by ~10 times, which is important since
> >    now some of the measurement periods are sub-second.
> 
> Very good !
> 
> >
> > 4. Other trivial metrics were added for convenience: total number
> >    of VM pages, number of sampled pages, page size.
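To illustrate why the zero-page count in item 2 matters, here is a minimal
sketch of how the wire size of one full pass over guest RAM could be estimated
from the new metrics. It is not taken from the patch. The function name and the
cost model for non-zero pages (page payload plus the same 8-byte header) are
assumptions made for illustration; only the 8-byte cost of an all-zero page
comes from the text above.

```python
# Sketch only: the per-page cost model below is an assumption, not the
# predictor's actual code.
ZERO_PAGE_WIRE_BYTES = 8  # header-only cost of an all-zero page (from item 2)


def estimate_wire_bytes(n_total_pages, n_sampled_pages, n_zero_pages,
                        page_size, header_bytes=ZERO_PAGE_WIRE_BYTES):
    """Estimate bytes sent for one full pass over guest RAM.

    The zero-page fraction seen in the sample is extrapolated to the
    whole VM. Non-zero pages are assumed (hypothetically) to cost
    page_size + header_bytes each.
    """
    if n_sampled_pages <= 0:
        raise ValueError("n_sampled_pages must be positive")
    zero_fraction = n_zero_pages / n_sampled_pages
    est_zero = round(n_total_pages * zero_fraction)
    est_nonzero = n_total_pages - est_zero
    return est_zero * header_bytes + est_nonzero * (page_size + header_bytes)
```

For example, the values reported by calc-dirty-rate would be passed as
estimate_wire_bytes(n_total_pages, n_sampled_pages, n_zero_pages, page_size),
using the "n-total-pages", "n-sampled-pages", "n-zero-pages" and "page-size"
fields respectively.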
> >
> >
> > After these changes output from calc-dirty-rate looks like this:
> >
> > {
> >   "page-size": 4096,
> >   "periods": [125, 250, 375, 500, 750, 1000, 1500,
> >               2000, 3000, 4001, 6000, 8000, 10000,
> >               15000, 20000, 25000, 30000, 35000,
> >               40000, 45000, 50000, 60000],
> >   "status": "measured",
> >   "sample-pages": 512,
> >   "dirty-rate": 98,
> >   "mode": "page-sampling",
> >   "n-dirty-pages": [33, 78, 119, 151, 217, 236, 293, 336,
> >                     425, 505, 620, 756, 898, 1204, 1457,
> >                     1723, 1934, 2141, 2328, 2522, 2675, 2958],
> >   "n-sampled-pages": 16392,
> >   "n-zero-pages": 10060,
> >   "n-total-pages": 8392704,
> >   "start-time": 2916750,
> >   "calc-time": 60
> > }
> 
> Ok, so the "periods" and "n-dirty-pages" arrays correlate with
> each other.
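Since the two arrays are paired element by element, the piece-wise
interpolation mentioned in the cover letter can be sketched roughly as follows.
This is an illustration under stated assumptions, not the prediction script
itself: the function names are hypothetical, the curve is assumed to pass
through (0, 0), values beyond the last period are clamped rather than
extrapolated, and "n-dirty-pages" is assumed to count dirtied pages among the
sampled ones.

```python
from bisect import bisect_left


def dirty_pages_at(t_ms, periods, n_dirty_pages):
    """Piecewise-linear estimate of sampled pages dirtied t_ms after a
    clean state. periods must be sorted in increasing order."""
    if not periods or len(periods) != len(n_dirty_pages):
        raise ValueError("periods and n_dirty_pages must be non-empty "
                         "and of equal length")
    if t_ms <= 0:
        return 0.0
    # Assumption: no pages are dirty at t = 0, so prepend the origin.
    xs = [0] + list(periods)
    ys = [0] + list(n_dirty_pages)
    if t_ms >= xs[-1]:
        return float(ys[-1])  # clamp, no extrapolation past calc-time
    i = bisect_left(xs, t_ms)  # xs[i-1] < t_ms <= xs[i]
    x0, x1 = xs[i - 1], xs[i]
    y0, y1 = ys[i - 1], ys[i]
    return y0 + (y1 - y0) * (t_ms - x0) / (x1 - x0)


def scale_to_vm(sampled_dirty, n_sampled_pages, n_total_pages):
    """Extrapolate a count over sampled pages to the whole VM."""
    return sampled_dirty * n_total_pages / n_sampled_pages
```

With the output quoted above, dirty_pages_at(125, ...) returns the first
measurement, and scale_to_vm() multiplies it by the ratio of "n-total-pages"
to "n-sampled-pages" to get a whole-VM estimate.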
> 
> >
> > Passing this data into prediction script, we get the following estimations:
> >
> > Downtime> |    125ms |    250ms |    500ms |   1000ms |   5000ms |    unlim
> > ---------------------------------------------------------------------------
> >  100 Mbps |        - |        - |        - |        - |        - |   16m59s
> >    1 Gbps |        - |        - |        - |        - |        - |    1m40s
> >    2 Gbps |        - |        - |        - |        - |    1m41s |      50s
> >  2.5 Gbps |        - |        - |        - |        - |    1m07s |      40s
> >    5 Gbps |      48s |      46s |      31s |      28s |      25s |      20s
> >   10 Gbps |      13s |      12s |      12s |      12s |      12s |      10s
> >   25 Gbps |       5s |       5s |       5s |       5s |       4s |       4s
> >   40 Gbps |       3s |       3s |       3s |       3s |       3s |       3s
> 
> This is fascinating and really helpful as an idea. It shows so nicely
> when it is not even worth bothering to try to start the migration
> unless you're willing to put up with a large (5 sec) downtime,
> or use auto-converge/post-copy.
> 
> I wonder if the calc-dirty-rate measurements also give enough info
> to predict the likely number/duration of async page fetches needed
> during the post-copy phase? Or does this give enough info to predict
> how far down auto-converge should throttle the guest to enable
> convergence?

I was also thinking about supporting more migration features.
Currently my understanding is the following:

1. It *should* be possible to support throttling directly inside the
   prediction script without any changes to calc-dirty-rate. Maybe we can
   suggest the level of throttling required to achieve target downtime.

2. Support for compression would be harder because we would have to know
   the average compression ratio and compression speed. This would require
   more changes to calc-dirty-rate.

3. To support post-copy, we would need to know network characteristics, namely
   latency and jitter. Both can be quite unstable unless source and target
   hosts are located very close in network topology.

> 
> > Quality of prediction was tested with YCSB benchmark. Memcached instance
> > was installed into 32GiB VM, and a client generated a stream of requests.
> > Between experiments we varied request size distribution, number of threads,
> > and location of the client (inside or outside the VM).
> > After a short preheat phase, we measured calc-dirty-rate:
> > 1. {"execute": "calc-dirty-rate", "arguments":{"calc-time":60}}
> > 2. Wait 60 seconds
> > 3. Collect results with {"execute": "query-dirty-rate"}
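The three steps above could be scripted against a QMP UNIX socket roughly as
shown below. This is a minimal sketch, not part of the patch: the helper names
and the socket path argument are hypothetical, and it assumes the monitor sends
one JSON message per line. The greeting and the qmp_capabilities handshake are
standard QMP; the two dirty-rate commands are the ones quoted above.

```python
import json
import socket
import time


def qmp_command(name, arguments=None):
    """Build a QMP command dictionary."""
    cmd = {"execute": name}
    if arguments:
        cmd["arguments"] = arguments
    return cmd


def measure_dirty_rate(sock_path, calc_time=60):
    """Start a measurement, wait calc_time seconds, fetch the result.

    sock_path is the (hypothetical) path of a QMP UNIX socket.
    """
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as s:
        s.connect(sock_path)
        f = s.makefile("rw")

        def call(cmd):
            f.write(json.dumps(cmd) + "\n")
            f.flush()
            while True:
                reply = json.loads(f.readline())
                # Skip asynchronous events; stop at the command reply.
                if "return" in reply or "error" in reply:
                    return reply

        json.loads(f.readline())               # QMP greeting
        call(qmp_command("qmp_capabilities"))  # leave negotiation mode
        call(qmp_command("calc-dirty-rate", {"calc-time": calc_time}))
        time.sleep(calc_time)
        return call(qmp_command("query-dirty-rate"))
```

If the reply still reports a status other than "measured", the query would
need to be repeated after a short delay.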
> >
> > Afterwards we tried to migrate the VM after randomly selecting max downtime
> > and bandwidth limit. Typical prediction error is 6-7%, with only 180 out
> > of 5779 experiments failing badly: prediction error >=25% or incorrectly
> > predicting migration success when in fact it didn't converge.
> 
> Nice results
> 
> 
> With regards,
> Daniel
> --
> |: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org         -o-            https://fstop138.berrange.com :|
> |: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
