From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: "Denis V. Lunev" <den@openvz.org>
Cc: John Snow <jsnow@redhat.com>,
	Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>,
	qemu-devel@nongnu.org, qemu-block@nongnu.org,
	quintela@redhat.com, stefanha@redhat.com, famz@redhat.com,
	mreitz@redhat.com, kwolf@redhat.com,
	Eric Blake <eblake@redhat.com>
Subject: Re: [Qemu-devel] [PATCH 4/6] dirty-bitmaps: clean-up bitmaps loading and migration logic
Date: Thu, 2 Aug 2018 10:29:08 +0100	[thread overview]
Message-ID: <20180802092907.GA2523@work-vm> (raw)
In-Reply-To: <6bfdc952-fb17-feae-f367-be710853d829@openvz.org>

* Denis V. Lunev (den@openvz.org) wrote:
> On 08/01/2018 09:55 PM, Dr. David Alan Gilbert wrote:
> > * Denis V. Lunev (den@openvz.org) wrote:
> >> On 08/01/2018 08:40 PM, Dr. David Alan Gilbert wrote:
> >>> * John Snow (jsnow@redhat.com) wrote:
> >>>> On 08/01/2018 06:20 AM, Dr. David Alan Gilbert wrote:
> >>>>> * John Snow (jsnow@redhat.com) wrote:
> >>>>>
> >>>>> <snip>
> >>>>>
> >>>>>> I'd rather do something like this:
> >>>>>> - Always flush bitmaps to disk on inactivate.
> >>>>> Does that increase the time taken by the inactivate measurably?
> >>>>> If it's small relative to everything else that's fine; it's just I
> >>>>> always worry a little since I think this happens after we've stopped the
> >>>>> CPU on the source, so is part of the 'downtime'.
> >>>>>
> >>>>> Dave
> >>>>> --
> >>>>> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> >>>>>
> >>>> I'm worried that if we don't, we're leaving unusable, partially
> >>>> complete files behind us. That's a bad design and we shouldn't push
> >>>> for it just because it's theoretically faster.
> >>> Oh I don't care about theoretical speed; but if it's actually unusably
> >>> slow in practice then it needs fixing.
> >>>
> >>> Dave
> >> This is not "theoretical" speed. This is real practical speed and
> >> instability.
> >> EACH IO operation can be unpredictably slow, and thus with
> >> IO operations in mind you cannot even calculate or predict the
> >> downtime, which should be done according to the migration protocol.
> > We end up doing some IO anyway, even ignoring these new bitmaps:
> > at the end of the migration, when we pause the CPU, we do a
> > bdrv_inactivate_all to flush any outstanding writes, so we've already
> > got that unpredictable slowness.
> >
> > So, not being a block person, but with some interest in making sure
> > downtime doesn't increase, I just wanted to understand whether the
> > amount of writes we're talking about here is comparable to that
> > which already exists or a lot smaller or a lot larger.
> > If the amount of IO you're talking about is much smaller than what
> > we typically already do, then John has a point and you may as well
> > do the write.
> > If the amount of IO for the bitmap is much larger and would increase
> > the downtime a lot, then you've got a point and that would be unworkable.
> >
> > Dave
> This is not a theoretical difference.
> 
> For a 1 Tb drive and 64 kb bitmap granularity the size of the bitmap is
> 2 Mb + some metadata (64 Kb). Thus we will have to write
> 2 Mb of data per bitmap.

OK, this was about my starting point; I think your Mb here is bytes, not
bits; so assuming a drive that writes at 200 MByte/s, that's 2/200 =
1/100th of a second = 10ms. Now 10ms I'd say is small enough not to worry
about for downtime increases, since the number we normally hope for is in
the 300ms-ish range.
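
(Spelling that arithmetic out, purely as a back-of-the-envelope sketch;
the 200 MByte/s write speed is just my assumption:)

    # Rough numbers, not measurements: 1 TiB disk, 64 KiB granularity,
    # and an assumed 200 MiB/s of sequential write bandwidth.
    disk_size   = 1 * 1024**4        # 1 TiB in bytes
    granularity = 64 * 1024          # one bit per 64 KiB cluster
    write_speed = 200 * 1024**2      # 200 MiB/s, assumed

    bitmap_bytes = disk_size // granularity // 8   # -> 2 MiB per bitmap
    downtime_s   = bitmap_bytes / write_speed      # -> 0.01 s = 10 ms
    print(bitmap_bytes // 1024**2, "MiB,", round(downtime_s * 1000), "ms")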

> For some cases there are 2-3-5 bitmaps,
> thus we will have 10 Mb of data.

OK, remembering I'm not a block person, can you just explain why
you need 5 bitmaps?
But with 5 bitmaps that's 50ms, which is starting to get worrying.

> With a 16 Tb drive the amount of
> data to write is multiplied by 16, which gives 160 Mb to
> write. The more disks and the bigger the size, the more data to write.

Yeah, and that's getting on for a second, which is way too big.

(Although it feels like you could fix that by adding bitmaps on your
bitmaps hierarchically, so you didn't have to write them all; but that's
getting way more complex.)
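
(To illustrate what I mean by bitmaps-on-bitmaps, here is a rough sketch
of one possible two-level scheme; this is purely illustrative and not
anything the current bitmap code does, and the chunk size is just an
assumption:)

    # Illustrative only: a coarse dirty-set tracks which fixed-size chunks
    # of the fine bitmap changed since the last flush, so at migration
    # time only the modified chunks need writing out, not the whole bitmap.
    CHUNK = 4096  # bytes of fine bitmap covered per coarse entry (assumed)

    class TwoLevelBitmap:
        def __init__(self, fine_size):
            self.fine = bytearray(fine_size)
            self.dirty_chunks = set()

        def set_bit(self, bit):
            self.fine[bit // 8] |= 1 << (bit % 8)
            self.dirty_chunks.add((bit // 8) // CHUNK)

        def flush(self, write_at):
            # write_at(offset, data) persists one chunk of the fine bitmap.
            for idx in sorted(self.dirty_chunks):
                off = idx * CHUNK
                write_at(off, bytes(self.fine[off:off + CHUNK]))
            self.dirty_chunks.clear()

The win only materialises when most of the fine bitmap is clean at
migration time, which is part of the extra complexity.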

> The above amount should be multiplied by 2: x Mb to be written
> on the source and x Mb to be read on the target, which gives 320 Mb
> in total.
> 
> That is why this is not good: the cost increases linearly with the
> size and number of disks.
> 
> There are also some thoughts on normal guest IO. Theoretically
> we could replay IO on the target and close the file
> immediately, or block writes to changed areas and notify the
> target upon IO completion, or invent other fancy dances.
> At least we are thinking right now about these optimizations for
> regular migration paths.
> 
> The problem right now is that such things are not needed for CBT
> but will become necessary and pretty much useless upon
> introducing this stuff.

I don't quite understand the last two paragraphs.

However, coming back to my question: it was really saying that
normal guest IO at the end of the migration will cause
a delay anyway; I'm expecting that to be fairly unrelated to the size
of the disk and more to do with the workload; so I guess in your case
the worry is that big disks give big bitmaps.
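
(As a rough model, with made-up numbers: the flush term depends on the
workload, while the bitmap term grows with disk size and bitmap count:)

    # Rough model, not a measurement: total downtime is roughly the
    # workload-dependent flush of outstanding guest writes plus the
    # size-dependent write-out of the persistent bitmaps.
    def downtime_s(outstanding_io_bytes, bitmap_bytes,
                   speed=200 * 1024**2):            # assumed 200 MiB/s
        return (outstanding_io_bytes + bitmap_bytes) / speed

    print(downtime_s(10 * 1024**2, 160 * 1024**2))  # ~0.85 s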

Dave

> Den
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

Thread overview: 45+ messages
2018-06-26 13:50 [Qemu-devel] [PATCH 0/6] fix persistent bitmaps migration logic Vladimir Sementsov-Ogievskiy
2018-06-26 13:50 ` [Qemu-devel] [PATCH 1/6] iotests: 169: drop deprecated 'autoload' parameter Vladimir Sementsov-Ogievskiy
2018-07-09 22:36   ` John Snow
2018-06-26 13:50 ` [Qemu-devel] [PATCH 2/6] block/qcow2: improve error message in qcow2_inactivate Vladimir Sementsov-Ogievskiy
2018-06-28 12:16   ` Eric Blake
2018-07-09 22:38     ` John Snow
2018-06-26 13:50 ` [Qemu-devel] [PATCH 3/6] bloc/qcow2: drop dirty_bitmaps_loaded state variable Vladimir Sementsov-Ogievskiy
2018-07-09 23:25   ` John Snow
2018-07-10  7:43     ` Vladimir Sementsov-Ogievskiy
2018-07-17 19:10       ` John Snow
2018-06-26 13:50 ` [Qemu-devel] [PATCH 4/6] dirty-bitmaps: clean-up bitmaps loading and migration logic Vladimir Sementsov-Ogievskiy
2018-07-21  2:41   ` John Snow
2018-08-01 10:20     ` Dr. David Alan Gilbert
2018-08-01 17:34       ` John Snow
2018-08-01 17:40         ` Dr. David Alan Gilbert
2018-08-01 18:42           ` Denis V. Lunev
2018-08-01 18:55             ` Dr. David Alan Gilbert
2018-08-01 20:25               ` Denis V. Lunev
2018-08-02  9:29                 ` Dr. David Alan Gilbert [this message]
2018-08-02  9:38                   ` Denis V. Lunev
2018-08-02  9:50                     ` Dr. David Alan Gilbert
2018-08-02 19:05                       ` Denis V. Lunev
2018-08-02 19:10                         ` John Snow
     [not found]                           ` <6d8ed319-9b63-5a7b-fcfe-20cd37cf8c7c@virtuozzo.com>
     [not found]                             ` <d2538432-be74-99bc-72d1-94f8abaa2f9b@redhat.com>
     [not found]                               ` <26c0e008-898d-924a-214e-68ab9fedf1ea@virtuozzo.com>
2018-10-15  9:42                                 ` [Qemu-devel] ping " Vladimir Sementsov-Ogievskiy
2018-10-29 17:52                                 ` [Qemu-devel] ping2 " Vladimir Sementsov-Ogievskiy
2018-10-29 18:06                                   ` John Snow
2018-08-03  8:33                         ` [Qemu-devel] " Dr. David Alan Gilbert
2018-08-03  8:44                           ` Vladimir Sementsov-Ogievskiy
2018-08-03  8:49                             ` Dr. David Alan Gilbert
2018-08-03  8:59                           ` Denis V. Lunev
2018-08-03  9:10                             ` Dr. David Alan Gilbert
2018-08-01 18:56             ` John Snow
2018-08-01 20:31               ` Denis V. Lunev
2018-08-01 20:47               ` Denis V. Lunev
2018-08-01 22:28                 ` John Snow
2018-08-02 10:23                   ` Vladimir Sementsov-Ogievskiy
2018-08-01 12:24     ` Vladimir Sementsov-Ogievskiy
2018-06-26 13:50 ` [Qemu-devel] [PATCH 5/6] iotests: improve 169 Vladimir Sementsov-Ogievskiy
2018-06-26 13:50 ` [Qemu-devel] [PATCH 6/6] iotests: 169: add cases for source vm resuming Vladimir Sementsov-Ogievskiy
2018-06-26 18:22 ` [Qemu-devel] [PATCH 0/6] fix persistent bitmaps migration logic John Snow
2018-06-28 12:04   ` Vladimir Sementsov-Ogievskiy
2018-06-26 18:36 ` John Snow
2018-07-12 19:00 ` Vladimir Sementsov-Ogievskiy
2018-07-12 20:25   ` John Snow
2018-07-13  6:46     ` Vladimir Sementsov-Ogievskiy
