From: Anthony Liguori <aliguori@us.ibm.com>
To: Avi Kivity <avi@redhat.com>
Cc: qemu-devel <qemu-devel@nongnu.org>
Subject: Re: [Qemu-devel] Re: [Qemu-commits] [COMMIT 3086844] Instead of writing a zero page, madvise it away
Date: Mon, 22 Jun 2009 12:03:25 -0500
Message-ID: <4A3FB95D.3060404@us.ibm.com>
In-Reply-To: <4A3FB390.4060809@redhat.com>
Avi Kivity wrote:
> On 06/22/2009 07:25 PM, Anthony Liguori wrote:
>> Avi Kivity wrote:
>>> On 06/22/2009 06:51 PM, Anthony Liguori wrote:
>>>> From: Anthony Liguori<aliguori@us.ibm.com>
>>>>
>>>> Otherwise, after migration, we end up with a much larger RSS size
>>>> than we ought to have.
>>>>
>>>
>>> We have the same issue on the migration source node. I don't see a
>>> simple way to solve it, though.
>>
>> I don't follow. In this case, the issue is:
>>
>> 1) Start a guest with 1024MB, balloon down to 128MB. RSS size is now
>> ~128MB
>> 2) Live migrate to a different node
>> 3) RSS on different node jumps to ~1GB
>
> 3.5) RSS on source node jumps to ~1GB, since reading the page
> instantiates the pte

Surely we can do better here...

For TCG, we always know when memory is dirty and we can check it
atomically. So we know whether a page has changed since we last knew it
was zero. We basically need a ZERO_DIRTY bit. All memory initially
carries this bit, and ballooning also sets it. During live migration,
we can check this bit first, before reading the page.

For KVM, we would have to keep dirty tracking enabled at all times to
keep ZERO_DIRTY up to date. Since write faults are going to happen
anyway at start up, perhaps the cost of doing this wouldn't be so bad?
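
To make the ZERO_DIRTY idea concrete, here is a rough sketch in C. All
of the names (ZeroTracker, zt_new, zt_on_write, zt_on_balloon,
zt_needs_send) are hypothetical and are not QEMU's API. It assumes
something else, such as TCG's dirty tracking or KVM's dirty log, calls
zt_on_write() whenever a page is written.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical tracker: one flag byte per guest page. A real
 * implementation would more likely pack these into a bitmap. */
typedef struct {
    uint8_t *zero_dirty; /* 1 = known zero and not written since */
    size_t npages;
} ZeroTracker;

ZeroTracker *zt_new(size_t npages)
{
    ZeroTracker *zt = malloc(sizeof(*zt));
    if (zt == NULL) {
        return NULL;
    }
    zt->zero_dirty = malloc(npages ? npages : 1);
    if (zt->zero_dirty == NULL) {
        free(zt);
        return NULL;
    }
    /* All memory initially carries the bit. */
    memset(zt->zero_dirty, 1, npages);
    zt->npages = npages;
    return zt;
}

void zt_free(ZeroTracker *zt)
{
    if (zt != NULL) {
        free(zt->zero_dirty);
        free(zt);
    }
}

/* Assumed to be driven by the dirty tracking machinery: any write
 * means the page can no longer be assumed zero. */
void zt_on_write(ZeroTracker *zt, size_t page)
{
    assert(page < zt->npages);
    zt->zero_dirty[page] = 0;
}

/* Ballooning a page out sets the bit again. */
void zt_on_balloon(ZeroTracker *zt, size_t page)
{
    assert(page < zt->npages);
    zt->zero_dirty[page] = 1;
}

/* Migration checks the bit first. false means the page can be sent as
 * a zero marker without reading it, so the source never faults it in. */
bool zt_needs_send(const ZeroTracker *zt, size_t page)
{
    assert(page < zt->npages);
    return zt->zero_dirty[page] == 0;
}
```

The point of checking the bit before touching the page is that the
source never has to read a ballooned page just to find out that it is
zero.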
>
> Right. I'd love to do madvise() on the source node as well if we
> fault in a page and find out it's zero, but the guest (and aio) is
> still running and we might drop live data. We need a
> madvise(MADV_DONTNEED_IFZERO), or a mincore() flag that tells us if
> the page exists (vs. swapped). ksm would also do this, but it is
> overkill for some applications.

For KVM, we could just add a KVM_IOCTL_MADVISE_IF_NOT_DIRTY, but
that's a bad solution. It does have more or less the desired
semantics, though.
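
For reference, a rough sketch of the non-atomic version, using only
calls that exist today on Linux: madvise(MADV_DONTNEED) and mincore().
The helper names (buffer_is_zero, drop_if_zero, page_is_resident) are
made up for illustration. The check and the drop are two separate
steps, so this is only safe when nothing else can write the page in
between, which is exactly what cannot be guaranteed on the source while
the guest and aio are still running.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Return true if all len bytes at p are zero. Note that reading the
 * page is itself what faults it in. */
bool buffer_is_zero(const unsigned char *p, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (p[i] != 0) {
            return false;
        }
    }
    return true;
}

/* Drop the backing for one page if it currently reads as zero.
 * Returns 1 if dropped, 0 if the page holds data, -1 on error.
 * RACY: a write between the check and madvise() would be lost. */
int drop_if_zero(void *page, size_t page_size)
{
    assert((uintptr_t)page % page_size == 0);
    if (!buffer_is_zero(page, page_size)) {
        return 0;
    }
    /* For private anonymous memory on Linux, later reads of the
     * dropped range return zero-filled pages. */
    if (madvise(page, page_size, MADV_DONTNEED) != 0) {
        return -1;
    }
    return 1;
}

/* Ask the kernel whether a page is resident in memory.
 * Returns 1 if resident, 0 if not, -1 on error. */
int page_is_resident(void *page, size_t page_size)
{
    unsigned char vec = 0;
    if (mincore(page, page_size, &vec) != 0) {
        return -1;
    }
    return vec & 1;
}
```

An atomic "drop only if still zero" operation would fold the check and
the madvise() into one step on the kernel side, which is what both
proposals above are after.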
--
Regards,
Anthony Liguori