All of lore.kernel.org
 help / color / mirror / Atom feed
From: Andrea Arcangeli <andrea@qumranet.com>
To: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Avi Kivity <avi@qumranet.com>, kvm-devel <kvm@vger.kernel.org>
Subject: Re: KVM: MMU: rmap_write_protect() hugepage iteration bug
Date: Sun, 8 Jun 2008 03:54:52 +0200	[thread overview]
Message-ID: <20080608015452.GC8321@duo.random> (raw)
In-Reply-To: <20080608002736.GA25582@dmt.cnet>

On Sat, Jun 07, 2008 at 09:27:36PM -0300, Marcelo Tosatti wrote:
> 
> rmap_next() expects the "spte" argument to be NULL whenever there's only
> one remaining entry in the descriptor. That is, it was not designed to
> handle changes in the chain while iterating.
> 
> This bug cripples rmap_write_protect() so that it won't nuke all
> writable large mappings to a particular gfn.

Better than before but the patched version will still not nuke all
writeable sptes (I did exactly the same mistake once).

I found out the trouble the hard way with a trace of ksm kprobe
wp_notifier faults, and that's this reordering here:

	   desc->shadow_ptes[i] = desc->shadow_ptes[j];
	   desc->shadow_ptes[j] = NULL;

In short there's no way to mix rmap_next and rmap_remove in the same
loop as rmap_remove will reorder stuff in the array, so using the
next_spte as pointer will lead to missing the sptes after the
next_spte that have been reordered in place of the spte just before
the next_spte. This was quite subtle issue as it's not immediately
evident the first time you read rmap_remove internals.

rmap_remove invocation requires to restart from scratch (or
alternatively to avoid restarting from scratch, we'd need to extend
the rmap methods to provide that functionality, something like
rmap_next_safe that would be robust against rmap_remove if used in the
same way you did now with rmap_next).

For now this will fix it for real.

Signed-off-by: Andrea Arcangeli <andrea@qumranet.com>

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index aaccc40..9e4622c 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -641,8 +641,9 @@ static void rmap_write_protect(struct kvm *kvm, u64 gfn)
 			--kvm->stat.lpages;
 			set_shadow_pte(spte, shadow_trap_nonpresent_pte);
 			write_protected = 1;
-		}
-		spte = rmap_next(kvm, rmapp, spte);
+			spte = rmap_next(kvm, rmapp, NULL);
+		} else
+			spte = rmap_next(kvm, rmapp, spte);
 	}
 
 	if (write_protected)

  reply	other threads:[~2008-06-08  1:54 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-06-08  0:27 KVM: MMU: rmap_write_protect() hugepage iteration bug Marcelo Tosatti
2008-06-08  1:54 ` Andrea Arcangeli [this message]
2008-06-08  4:48   ` Marcelo Tosatti
2008-06-08  7:30     ` Avi Kivity
2008-06-08  8:03       ` Avi Kivity
2008-06-08  8:04         ` Avi Kivity
2008-06-08 18:31     ` Andrea Arcangeli
2008-06-08 19:52       ` Marcelo Tosatti
2008-06-08 20:30         ` Andrea Arcangeli
2008-06-09  0:20           ` Marcelo Tosatti
2008-06-09  0:55             ` Andrea Arcangeli

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20080608015452.GC8321@duo.random \
    --to=andrea@qumranet.com \
    --cc=avi@qumranet.com \
    --cc=kvm@vger.kernel.org \
    --cc=mtosatti@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.