From: Anthony Liguori <anthony@codemonkey.ws>
To: Avi Kivity <avi@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
Juan Quintela <quintela@redhat.com>,
qemu-devel@nongnu.org, Juan Quintela <quintela@trasno.org>,
kvm-devel <kvm@vger.kernel.org>
Subject: Re: [PATCH 09/10] Exit loop if we have been there too long
Date: Tue, 30 Nov 2010 08:50:20 -0600
Message-ID: <4CF50F2C.7090503@codemonkey.ws>
In-Reply-To: <4CF509C1.9@redhat.com>

On 11/30/2010 08:27 AM, Avi Kivity wrote:
> On 11/30/2010 04:17 PM, Anthony Liguori wrote:
>>> What's the problem with burning that cpu? per guest page,
>>> compressing takes less than sending. Is it just an issue of qemu
>>> mutex hold time?
>>
>>
>> If you have a 512GB guest, then you have a 16MB dirty bitmap which
>> ends up being a 128MB dirty bitmap in QEMU because we represent
>> each page's dirty state with 8 bits.
>
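For reference, the figures above follow from simple arithmetic. The sketch below assumes 4KB guest pages; the helper name dirty_map_bytes is hypothetical and exists only for this illustration.

```c
#include <stdint.h>

/* Assumption: 4KB guest pages (the usual x86 base page size). */
#define GUEST_PAGE_SIZE 4096ULL

/*
 * Bytes needed for a dirty map covering ram_bytes of guest RAM when
 * each page carries bits_per_page bits of dirty state.
 * Hypothetical helper, not part of QEMU or kvm.ko.
 */
static uint64_t dirty_map_bytes(uint64_t ram_bytes, unsigned bits_per_page)
{
    uint64_t pages = ram_bytes / GUEST_PAGE_SIZE;
    /* Round up to a whole number of bytes. */
    return (pages * bits_per_page + 7) / 8;
}
```

With ram_bytes = 512ULL << 30, one bit per page gives 16MB and eight bits per page gives 128MB, matching the numbers quoted above.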
> Was there not a patchset to split each bit into its own bitmap? And
> then copy the kvm or qemu master bitmap into each client bitmap as it
> became needed?
>
>> Walking 16MB (or 128MB) of memory just to find a few pages to send
>> over the wire is a big waste of CPU time. If kvm.ko used a
>> multi-level table to represent dirty info, we could walk the memory
>> mapping in 2MB chunks, allowing us to skip a large number of the
>> comparisons.
>
> There's no reason to assume dirty pages would be clustered. If 0.2%
> of memory were dirty, but scattered uniformly, there would be no win
> from the two-level bitmap. A loss, in fact: 2MB can be represented as
> 512 bits or 64 bytes, just one cache line. Any two-level thing will
> need more.
>
> We might have a more compact encoding for sparse bitmaps, like
> run-length encoding.
>
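To make the flat-bitmap cost concrete, here is a minimal sketch of a scan that tests 64 pages per comparison, so the 512 bits covering 2MB take eight word compares within one cache line. The function name is hypothetical, and __builtin_ctzll is a GCC/Clang builtin, so this assumes one of those compilers.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Scan a flat one-bit-per-page dirty bitmap, 64 pages per comparison.
 * Writes up to max dirty page indices into out and returns how many
 * were found. Hypothetical helper for illustration only.
 */
static size_t collect_dirty_pages(const uint64_t *bitmap, size_t nwords,
                                  uint64_t *out, size_t max)
{
    size_t found = 0;

    for (size_t i = 0; i < nwords && found < max; i++) {
        uint64_t w = bitmap[i];   /* zero words fall straight through */

        while (w != 0 && found < max) {
            /* Index of the lowest set bit (GCC/Clang builtin). */
            unsigned bit = (unsigned)__builtin_ctzll(w);
            out[found++] = (uint64_t)i * 64 + bit;
            w &= w - 1;           /* clear the lowest set bit */
        }
    }
    return found;
}
```

A clean word costs one load and one compare, which is the baseline any two-level or run-length scheme would have to beat when dirty pages are scattered.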
>>
>>>> In the short term, fixing (2) by accounting zero pages as full
>>>> sized pages should "fix" the problem.
>>>>
>>>> In the long term, we need a new dirty bit interface from kvm.ko
>>>> that uses a multi-level table. That should dramatically improve
>>>> scan performance.
>>>
>>> Why would a multi-level table help? (or rather, please explain what
>>> you mean by a multi-level table).
>>>
>>> Something we could do is divide memory into more slots, and poll
>>> each slot when we start to scan its page range. That reduces the
>>> time between sampling a page's dirtiness and sending it off, and
>>> reduces the latency incurred by the sampling. There are also
>>> non-interface-changing ways to reduce this latency, like O(1) write
>>> protection, or using dirty bits instead of write protection when
>>> available.
>>
>> BTW, we should also refactor qemu to use the kvm dirty bitmap
>> directly instead of mapping it to the main dirty bitmap.
>
> That's what the patch set I was alluding to did. Or maybe I imagined
> the whole thing.
No, it just split the main bitmap into three bitmaps. I'm suggesting
that the dirty interface have two implementations: one that refers to
the 8-bit bitmap when TCG is in use and another that uses the KVM
representation.

TCG really needs multiple dirty bits but KVM doesn't. A shared
implementation really can't be optimal.
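As a rough illustration of that split, the sketch below puts two backends behind one small interface: a byte-per-page map for TCG and a bit-per-page map for KVM. Every name here (DirtyLog, dirty_log_tcg, dirty_log_kvm, DEMO_MIGRATION_FLAG) is hypothetical, and the flag value and the set-all-flags-on-write behaviour are assumptions for the example, not QEMU's actual API.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical interface: these names are illustrative, not QEMU's API. */
typedef struct DirtyLog DirtyLog;
struct DirtyLog {
    bool (*test)(const DirtyLog *log, uint64_t page);
    void (*set)(DirtyLog *log, uint64_t page);
    void *map;   /* backend-specific storage, owned by the caller */
};

/* TCG-style backend: one byte per page, one flag bit per client. */
#define DEMO_MIGRATION_FLAG 0x08u   /* assumed flag value, for illustration */

static bool bytemap_test(const DirtyLog *log, uint64_t page)
{
    const uint8_t *bytes = log->map;
    return (bytes[page] & DEMO_MIGRATION_FLAG) != 0;
}

static void bytemap_set(DirtyLog *log, uint64_t page)
{
    uint8_t *bytes = log->map;
    bytes[page] = 0xff;   /* assumed: a write dirties the page for every client */
}

/* KVM-style backend: one bit per page. */
static bool bitmap_test(const DirtyLog *log, uint64_t page)
{
    const uint64_t *words = log->map;
    return ((words[page / 64] >> (page % 64)) & 1u) != 0;
}

static void bitmap_set(DirtyLog *log, uint64_t page)
{
    uint64_t *words = log->map;
    words[page / 64] |= (uint64_t)1 << (page % 64);
}

/* Constructors: pick the backend once, callers use log.test/log.set. */
static DirtyLog dirty_log_tcg(uint8_t *bytes)
{
    DirtyLog log = { bytemap_test, bytemap_set, bytes };
    return log;
}

static DirtyLog dirty_log_kvm(uint64_t *words)
{
    DirtyLog log = { bitmap_test, bitmap_set, words };
    return log;
}
```

The migration code would call log.test() and log.set() without caring which backend is behind them, so the KVM path never has to expand its bitmap to eight bits per page.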
>
>>>> We also need to implement live migration in a separate thread that
>>>> doesn't carry qemu_mutex while it runs.
>>>
>>> IMO that's the biggest hit currently.
>>
>> Yup. That's the Correct solution to the problem.
>
> Then let's just Do it.
>
Yup.

Regards,

Anthony Liguori