From: Avi Kivity <avi@redhat.com>
To: Andrea Arcangeli <aarcange@redhat.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>,
kvm@vger.kernel.org, Glauber Costa <glommer@redhat.com>
Subject: Re: [PATCH 2/2] KVM: Prevent internal slots from being COWed
Date: Tue, 06 Jul 2010 17:53:43 +0300
Message-ID: <4C334377.7080506@redhat.com>
In-Reply-To: <20100706144519.GC16195@random.random>
On 07/06/2010 05:45 PM, Andrea Arcangeli wrote:
>
>> Ouch, corrected and applied.
>>
> I think I tracked down the corruption during swapping with THP enabled
> to this bug. The real bug is that the mmu notifier fires (it's not
> like fork isn't covered by the mmu notifier) but KVM ignores it and
> keeps writing to the old location. Shared pages can also be swapped
> out and if the dirty bit on the spte isn't set faster than the time it
> takes to write the page, the page can be relocated. Basically if
> do_swap_page decides to make a copy of the page (like in ksm-swapin
> case, erratically triggered now even for non-ksm pages in current
> upstream by a bug in the new anon-vma code which I fixed already in
> aa.git) and the dirty bit on the spte is ignored because of lumpy
> reclaim (which also I removed now and that makes the bug stop
> triggering too), eventually what happens is that the page is unmapped
> and during swapin it is relocated to a different page.
>
> The bug really is in KVM that ignores the mmu_notifier_invalidate_page
> and keeps using the old page.
>
Right. It's the same problem as O_DIRECT + fork(): kvm did a
get_user_pages() on this page long ago, the process forked, and on the
next write the mm duplicated the page and assigned qemu the new copy,
which kvm ignored.
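To make the failure mode concrete, here is a minimal toy model in Python, not kernel code. `ToyMM`, `pin()`, and `stale_pin_demo()` are hypothetical names invented for illustration; `pin()` only stands in for a long-lived get_user_pages() reference, and the model assumes the writer receives the new copy when COW is broken, as described above.

```python
PAGE_SIZE = 8  # toy page size, not a real one


class ToyMM:
    """Toy address space: virtual address -> backing page (a bytearray)."""

    def __init__(self):
        self.pages = {}   # vaddr -> bytearray standing in for a physical page
        self.cow = set()  # vaddrs that must be copied on the next write

    def map_page(self, vaddr):
        self.pages[vaddr] = bytearray(PAGE_SIZE)

    def pin(self, vaddr):
        """Stand-in for get_user_pages(): a long-lived reference to the page."""
        return self.pages[vaddr]

    def fork(self):
        """Mark every mapped page copy-on-write, as fork() does for private memory."""
        self.cow.update(self.pages)

    def write(self, vaddr, data):
        """Write through the mapping, breaking COW first if needed."""
        if vaddr in self.cow:
            # The writer gets a fresh copy; the old page stays with other holders.
            self.pages[vaddr] = bytearray(self.pages[vaddr])
            self.cow.discard(vaddr)
        self.pages[vaddr][:len(data)] = data

    def read(self, vaddr, n):
        return bytes(self.pages[vaddr][:n])


def stale_pin_demo():
    mm = ToyMM()
    mm.map_page(0x1000)
    pinned = mm.pin(0x1000)    # reference taken "long ago"
    mm.fork()
    mm.write(0x1000, b"qemu")  # COW break: the mapping now points at a new page
    pinned[:3] = b"kvm"        # lands in the old page, invisible via the mapping
    return mm.read(0x1000, 4), pinned is mm.pages[0x1000]
```

In this sketch the write made through the stale reference never shows up in what the mapping reads back, which is the toy analogue of kvm continuing to write to the old location.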
> It should have rung a bell that fork was breaking anything... fork
> must not break anything since KVM is mmu notifier
> capable. MADV_DONTFORK must only be a performance optimization
> now. And the above change should be unnecessary (and I doubt the above
> really fixes the swapping case as tmpfs can also be swapped out, at
> least unless the page is pinned).
>
That particular page is pinned.
> The way I'd like to fix it is to allocate those magic pages by hand
> and not add them to lru and have page->mapping null. Then they will
> remain pinned in the pte, and all problems will go away.
>
Yes, that's the correct solution. It shouldn't be a user page in the
first place. The problem is that this is a very intrusive change.
> The other way would be to have a lookup hashtable that when mmu
> notifier invalidate fires, we lookup the hash and we call a method to
> have kvm stop using the page. And then something is needed during the
> page fault, if the gfn in the hash is paged-in another method is
> called to set the magic host user address to point the new pfn.
>
> I think pinning the pages and allocating them by hand is simpler,
> hopefully we can do it in a way that munmap will collect them
> automatically like now.
>
I'd like to remove it completely from the memslot mechanism;
unfortunately, that may affect many code paths.
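For comparison, the lookup-table alternative quoted above (stop using the page when the notifier fires, repoint it when the page is faulted back in) might look roughly like the following toy Python sketch. It is not KVM code: `MagicPageTracker` and all of its methods are hypothetical names, and plain integers stand in for host virtual addresses and pfns.

```python
class MagicPageTracker:
    """Toy lookup table: host virtual address -> cached pfn of an internal page."""

    def __init__(self):
        self.users = {}  # hva -> pfn, or None while the page is invalidated

    def register(self, hva, pfn):
        """Start tracking an internal page that some user holds a pfn for."""
        self.users[hva] = pfn

    def invalidate_page(self, hva):
        """Toy notifier callback: the user must stop touching the old page."""
        if hva in self.users:
            self.users[hva] = None

    def page_in(self, hva, new_pfn):
        """Toy fault-path hook: the page is resident again at a new pfn."""
        if hva in self.users:
            self.users[hva] = new_pfn

    def current_pfn(self, hva):
        """Return the pfn to use now, or None if it must be faulted in first."""
        return self.users[hva]
```

The point of the sketch is only the shape of the bookkeeping: every invalidation needs a matching repoint on page-in, which is the extra machinery that pinning hand-allocated pages would avoid.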
--
error compiling committee.c: too many arguments to function
Thread overview:
2010-06-21 8:18 [PATCH 0/2] Fix failures caused by fork() interaction with internal slots Avi Kivity
2010-06-21 8:18 ` [PATCH 1/2] KVM: Keep slot ID in memory slot structure Avi Kivity
2010-06-21 8:18 ` [PATCH 2/2] KVM: Prevent internal slots from being COWed Avi Kivity
2010-06-21 20:23 ` Marcelo Tosatti
2010-06-22 11:17 ` Avi Kivity
2010-07-06 14:45 ` Andrea Arcangeli
2010-07-06 14:53 ` Avi Kivity [this message]