From: Avi Kivity <avi@redhat.com>
To: Andrea Arcangeli <aarcange@redhat.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>,
	kvm@vger.kernel.org, Glauber Costa <glommer@redhat.com>
Subject: Re: [PATCH 2/2] KVM: Prevent internal slots from being COWed
Date: Tue, 06 Jul 2010 17:53:43 +0300
Message-ID: <4C334377.7080506@redhat.com>
In-Reply-To: <20100706144519.GC16195@random.random>

On 07/06/2010 05:45 PM, Andrea Arcangeli wrote:
>
>> Ouch, corrected and applied.
>>      
> I think I tracked down the corruption during swapping with THP enabled
> to this bug. The real bug is that the mmu notifier fires (it's not
> like fork isn't covered by the mmu notifier) but KVM ignores it and
> keeps writing to the old location. Shared pages can also be swapped
> out and if the dirty bit on the spte isn't set faster than the time it
> takes to write the page, the page can be relocated. Basically if
> do_swap_page decides to make a copy of the page (like in ksm-swapin
> case, erratically triggered now even for non-ksm pages in current
> upstream by a bug in the new anon-vma code which I fixed already in
> aa.git)  and the dirty bit on the spte is ignored because of lumpy
> reclaim (which also I removed now and that makes the bug stop
> triggering too), eventually what happens is that the page is unmapped
> and during swapin it is relocated to a different page.
>
> The bug really is in KVM that ignores the mmu_notifier_invalidate_page
> and keeps using the old page.
>    

Right.  It's the same problem as O_DIRECT + fork(): kvm did a
get_user_pages() on this page long ago, the process then forked, and on
the next write the mm duplicated the page and assigned qemu the new
page, which kvm ignored.
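
The sequence above can be sketched as a toy model in Python. This is not
kernel code: ToyMM, fault_in(), fork(), write() and the frame numbers are
all hypothetical names invented for illustration. The sketch only shows why
a frame number cached from a one-time lookup goes stale once fork() makes
the page copy-on-write and the next write moves the writer to a new frame.

```python
from itertools import count

_next_frame = count(100)  # hypothetical physical frame allocator


class ToyMM:
    """Toy address space: virtual page number -> physical frame number."""

    def __init__(self, pte=None, cow=None):
        self.pte = dict(pte or {})
        self.cow = set(cow or ())  # pages currently shared copy-on-write

    def fault_in(self, vpn):
        """Return the frame backing vpn, allocating one on first touch."""
        if vpn not in self.pte:
            self.pte[vpn] = next(_next_frame)
        return self.pte[vpn]

    def fork(self):
        """Share every frame with a child and mark both sides COW."""
        self.cow = set(self.pte)
        return ToyMM(self.pte, self.pte)

    def write(self, vpn):
        """Write to vpn; a COW page is first moved to a fresh private frame."""
        self.fault_in(vpn)
        if vpn in self.cow:
            self.pte[vpn] = next(_next_frame)
            self.cow.discard(vpn)
        return self.pte[vpn]


qemu = ToyMM()
pinned_pfn = qemu.fault_in(7)  # "kvm" resolves the page once and caches the frame
child = qemu.fork()            # fork() makes the page copy-on-write
current_pfn = qemu.write(7)    # the next write gives qemu a new frame

print("cached frame:", pinned_pfn, "frame qemu now maps:", current_pfn)
```

In this model the cached frame is still the one the child maps, so any
writer that keeps using it no longer touches the page qemu sees.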

> It should have rung a bell that fork was breaking anything... fork
> must not break anything since KVM is mmu notifier
> capable. MADV_DONTFORK must only be a performance optimization
> now. And the above change should be unnecessary (and I doubt the above
> really fixes the swapping case as tmpfs can also be swapped out, at
> least unless the page is pinned).
>    

That particular page is pinned.

> The way I'd like to fix it is to allocate those magic pages by hand
> and not add them to lru and have page->mapping null. Then they will
> remain pinned in the pte, and all problems will go away.
>    

Yes, that's the correct solution.  It shouldn't be a user page in the
first place.  The problem is that this is a very intrusive change.

> The other way would be to have a lookup hashtable that when mmu
> notifier invalidate fires, we lookup the hash and we call a method to
> have kvm stop using the page. And then something is needed during the
> page fault, if the gfn in the hash is paged-in another method is
> called to set the magic host user address to point the new pfn.
>
> I think pinning the pages and allocating them by hand is simpler,
> hopefully we can do it in a way that munmap will collect them
> automatically like now.
>    
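
The lookup-table alternative quoted above could look roughly like the
following toy sketch. It is Python rather than kernel code, and
MagicPageTable, track(), on_invalidate(), on_page_in() and current_pfn()
are hypothetical names, not KVM interfaces. The assumption is that the
invalidate callback arrives with a host virtual address and the page-fault
path knows the gfn, so the table is indexed both ways.

```python
class MagicPageTable:
    """Toy registry of internal pages that must follow page relocation."""

    def __init__(self):
        self._pfn_by_gfn = {}  # gfn -> frame in use, or None while invalidated
        self._gfn_by_hva = {}  # host virtual address -> gfn

    def track(self, gfn, hva, pfn):
        """Start tracking an internal page."""
        self._pfn_by_gfn[gfn] = pfn
        self._gfn_by_hva[hva] = gfn

    def on_invalidate(self, hva):
        """Notifier path: stop using the old frame if hva is tracked."""
        gfn = self._gfn_by_hva.get(hva)
        if gfn is None:
            return False  # not one of the tracked pages
        self._pfn_by_gfn[gfn] = None
        return True

    def on_page_in(self, gfn, new_pfn):
        """Page-fault path: point a tracked gfn at its new frame."""
        if gfn not in self._pfn_by_gfn:
            return False
        self._pfn_by_gfn[gfn] = new_pfn
        return True

    def current_pfn(self, gfn):
        """Frame currently safe to use for gfn, or None."""
        return self._pfn_by_gfn.get(gfn)


table = MagicPageTable()
table.track(gfn=0x10, hva=0x7F0000001000, pfn=100)
table.on_invalidate(0x7F0000001000)  # page is being unmapped or relocated
stale = table.current_pfn(0x10)      # None: the old frame must not be used
table.on_page_in(0x10, 205)          # swap-in placed the data in a new frame
fresh = table.current_pfn(0x10)
```

Compared with pinning hand-allocated pages, this needs two extra hooks and
a window in which the page is unusable, which is part of why the pinning
approach is described as simpler.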

I'd like to remove it completely from the memslot mechanism;
unfortunately, that may affect many paths.

-- 
error compiling committee.c: too many arguments to function



Thread overview: 7+ messages
2010-06-21  8:18 [PATCH 0/2] Fix failures caused by fork() interaction with internal slots Avi Kivity
2010-06-21  8:18 ` [PATCH 1/2] KVM: Keep slot ID in memory slot structure Avi Kivity
2010-06-21  8:18 ` [PATCH 2/2] KVM: Prevent internal slots from being COWed Avi Kivity
2010-06-21 20:23   ` Marcelo Tosatti
2010-06-22 11:17     ` Avi Kivity
2010-07-06 14:45       ` Andrea Arcangeli
2010-07-06 14:53         ` Avi Kivity [this message]
