From: Anthony Liguori <anthony@codemonkey.ws>
To: Avi Kivity <avi@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
Juan Quintela <quintela@trasno.org>,
qemu-devel@nongnu.org, kvm-devel <kvm@vger.kernel.org>,
Juan Quintela <quintela@redhat.com>
Subject: [Qemu-devel] Re: [PATCH 09/10] Exit loop if we have been there too long
Date: Tue, 30 Nov 2010 08:17:39 -0600
Message-ID: <4CF50783.90402@codemonkey.ws>
In-Reply-To: <4CF5030B.40703@redhat.com>
On 11/30/2010 07:58 AM, Avi Kivity wrote:
> On 11/30/2010 03:47 PM, Anthony Liguori wrote:
>> On 11/30/2010 01:15 AM, Paolo Bonzini wrote:
>>> On 11/30/2010 03:11 AM, Anthony Liguori wrote:
>>>>
>>>> BufferedFile should hit the qemu_file_rate_limit check when the socket
>>>> buffer gets filled up.
>>>
>>> The problem is that the file rate limit is not hit because work is
>>> done elsewhere. The rate can limit the bandwidth used and makes
>>> QEMU aware that socket operations may block (because that's what the
>>> buffered file freeze/unfreeze logic does); but it cannot be used to
>>> limit the _time_ spent in the migration code.
>>
>> Yes, it can, if you set the rate limit sufficiently low.
>>
>> The caveats are 1) the kvm.ko interface for dirty bits doesn't scale
>> for large memory guests so we spend a lot more CPU time walking it
>> than we should 2) zero pages cause us to burn a lot more CPU time
>> than we otherwise would because compressing them is so effective.
>
> What's the problem with burning that cpu? per guest page, compressing
> takes less than sending. Is it just an issue of qemu mutex hold time?
If you have a 512GB guest, then you have a 16MB dirty bitmap which ends
up being a 128MB dirty bitmap in QEMU because we represent each page's
dirty state with 8 bits.
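The sizes above can be checked with a little arithmetic. This is only a sketch: the 4KB page size and the helper name dirty_bitmap_bytes are assumptions for illustration, not anything taken from kvm.ko or QEMU.

```python
# Sketch of the bitmap-size arithmetic; PAGE_SIZE and the helper name
# are assumptions for illustration, not QEMU or kvm.ko identifiers.
PAGE_SIZE = 4096  # assumed 4KB guest pages
MiB = 1 << 20
GiB = 1 << 30


def dirty_bitmap_bytes(guest_bytes, bits_per_page):
    """Return the bitmap size in bytes for a guest of the given size."""
    pages = guest_bytes // PAGE_SIZE
    return pages * bits_per_page // 8


guest = 512 * GiB
kvm_bitmap = dirty_bitmap_bytes(guest, 1)   # one bit per page
qemu_bitmap = dirty_bitmap_bytes(guest, 8)  # eight bits per page

print(kvm_bitmap // MiB, "MB with 1 bit per page")
print(qemu_bitmap // MiB, "MB with 8 bits per page")
```

With 4KB pages a 512GB guest has 2^27 pages, which is where the factor of eight between the two bitmaps comes from.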
Walking 16MB (or 128MB) of memory just to find a few pages to send
over the wire is a big waste of CPU time. If kvm.ko used a multi-level
table to represent dirty info, we could walk the memory mapping in 2MB
chunks, allowing us to skip a large number of the comparisons.
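To make the idea concrete, here is a rough user-space sketch of a two-level dirty table. It is not the kvm.ko interface (no such interface exists in this thread); the class name, the method names, and the 512-pages-per-chunk figure (2MB chunks of assumed 4KB pages) are all hypothetical.

```python
# Hypothetical two-level dirty table: one flag per 2MB chunk on top of
# one flag per page. All names here are assumptions for illustration.
PAGES_PER_CHUNK = 512  # 2MB chunk / assumed 4KB pages


class TwoLevelDirtyMap:
    def __init__(self, num_pages):
        self.num_pages = num_pages
        num_chunks = (num_pages + PAGES_PER_CHUNK - 1) // PAGES_PER_CHUNK
        self.chunk_dirty = [False] * num_chunks  # top level
        self.page_dirty = bytearray(num_pages)   # leaf level

    def mark_dirty(self, page):
        """Record a write to a page at both levels."""
        self.page_dirty[page] = 1
        self.chunk_dirty[page // PAGES_PER_CHUNK] = True

    def collect_and_clear(self):
        """Return (dirty pages, entries visited), skipping clean chunks."""
        dirty = []
        visited = 0
        for chunk, is_dirty in enumerate(self.chunk_dirty):
            visited += 1
            if not is_dirty:
                continue  # the whole 2MB chunk is clean
            start = chunk * PAGES_PER_CHUNK
            end = min(start + PAGES_PER_CHUNK, self.num_pages)
            for page in range(start, end):
                visited += 1
                if self.page_dirty[page]:
                    dirty.append(page)
                    self.page_dirty[page] = 0
            self.chunk_dirty[chunk] = False
        return dirty, visited


dmap = TwoLevelDirtyMap(1 << 20)  # 4GB guest with 4KB pages
for page in (3, 700, 1_000_000):
    dmap.mark_dirty(page)
pages, visited = dmap.collect_and_clear()
print(pages, visited)
```

With three dirty pages spread over three chunks, the scan touches the 2048 top-level entries plus three chunks of 512 pages, rather than all 2^20 per-page entries.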
>> In the short term, fixing (2) by accounting zero pages as full sized
>> pages should "fix" the problem.
>>
>> In the long term, we need a new dirty bit interface from kvm.ko that
>> uses a multi-level table. That should dramatically improve scan
>> performance.
>
> Why would a multi-level table help? (or rather, please explain what
> you mean by a multi-level table).
>
> Something we could do is divide memory into more slots, and poll
> each slot when we start to scan its page range. That reduces the time
> between sampling a page's dirtiness and sending it off, and reduces
> the latency incurred by the sampling. There are also
> non-interface-changing ways to reduce this latency, like O(1) write
> protection, or using dirty bits instead of write protection when
> available.
BTW, we should also refactor qemu to use the kvm dirty bitmap directly
instead of mapping it to the main dirty bitmap.
>> We also need to implement live migration in a separate thread that
>> doesn't carry qemu_mutex while it runs.
>
> IMO that's the biggest hit currently.
Yup. That's the Correct solution to the problem.
Regards,
Anthony Liguori