[PATCH v5 2/9] KVM: MMU: fix race between 'walk_addr' and 'fetch'

From: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
To: Avi Kivity <avi@redhat.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>,
	LKML <linux-kernel@vger.kernel.org>,
	KVM list <kvm@vger.kernel.org>
Subject: [PATCH v5 2/9] KVM: MMU: fix race between 'walk_addr' and 'fetch'
Date: Tue, 06 Jul 2010 18:45:28 +0800
Message-ID: <4C330948.1070305@cn.fujitsu.com>
In-Reply-To: <4C330918.6040709@cn.fujitsu.com>

'walk_addr' runs outside mmu_lock's protection, so by the time we handle
'fetch', the guest's mapping may already have been modified by another
vcpu's write path, such as invlpg, pte_write, or another fetch path.

Fix this by rechecking the guest mapping at every level.

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
---
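Note: the recheck described above can be sketched in isolation as below.
This is a simplified userspace model, not the kernel code. The names
walker_snapshot, guest_tables, gpte_unchanged and walk_still_valid are
hypothetical stand-ins for the guest walker's cached ptes and for the
guest memory read that the patch performs with kvm_read_guest_atomic().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_LEVEL 4	/* assumed depth of the guest page table */

/* Hypothetical model of the gptes cached by the lockless guest walk. */
struct walker_snapshot {
	int level;			/* level where the guest walk ended */
	uint64_t ptes[MAX_LEVEL];	/* ptes[level - 1]: cached gpte */
};

/* Hypothetical model of the gptes currently stored in guest memory. */
struct guest_tables {
	uint64_t ptes[MAX_LEVEL];
};

/*
 * Compare the cached gpte for one level with the current one. Levels
 * below the guest's last level have no cached gpte, so they pass.
 */
static bool gpte_unchanged(const struct walker_snapshot *gw,
			   const struct guest_tables *gt, int level)
{
	assert(level >= 1 && level <= MAX_LEVEL);

	if (level < gw->level)
		return true;

	return gt->ptes[level - 1] == gw->ptes[level - 1];
}

/*
 * Walk from the top level down to the level being mapped and report
 * whether every cached gpte still matches guest memory.
 */
static bool walk_still_valid(const struct walker_snapshot *gw,
			     const struct guest_tables *gt,
			     int top_level, int hlevel)
{
	int level;

	for (level = top_level; level >= hlevel; level--)
		if (!gpte_unchanged(gw, gt, level))
			return false;

	return true;
}
```

In the patch itself the comparison happens inside the shadow walk under
mmu_lock, and a mismatch at any level makes 'fetch' release the pfn and
return NULL instead of installing a stale mapping.
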
 arch/x86/kvm/paging_tmpl.h |   73 ++++++++++++++++++++++++++------------------
 1 files changed, 43 insertions(+), 30 deletions(-)

diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h
index 19f0077..f58a5c4 100644
--- a/arch/x86/kvm/paging_tmpl.h
+++ b/arch/x86/kvm/paging_tmpl.h
@@ -300,7 +300,7 @@ static u64 *FNAME(fetch)(struct kvm_vcpu *vcpu, gva_t addr,
 			 int *ptwrite, pfn_t pfn)
 {
 	unsigned access = gw->pt_access;
-	struct kvm_mmu_page *sp;
+	struct kvm_mmu_page *sp = NULL;
 	u64 spte, *sptep = NULL;
 	int direct;
 	gfn_t table_gfn;
@@ -319,22 +319,23 @@ static u64 *FNAME(fetch)(struct kvm_vcpu *vcpu, gva_t addr,
 		direct_access &= ~ACC_WRITE_MASK;
 
 	for_each_shadow_entry(vcpu, addr, iterator) {
+		bool nonpresent = false, last_mapping = false;
+
 		level = iterator.level;
 		sptep = iterator.sptep;
-		if (iterator.level == hlevel) {
-			mmu_set_spte(vcpu, sptep, access,
-				     gw->pte_access & access,
-				     user_fault, write_fault,
-				     dirty, ptwrite, level,
-				     gw->gfn, pfn, false, true);
-			break;
+
+		if (level == hlevel) {
+			last_mapping = true;
+			goto check_set_spte;
 		}
 
-		if (is_shadow_present_pte(*sptep) && !is_large_pte(*sptep)) {
-			struct kvm_mmu_page *child;
+		if (is_large_pte(*sptep)) {
+			drop_spte(vcpu->kvm, sptep, shadow_trap_nonpresent_pte);
+			kvm_flush_remote_tlbs(vcpu->kvm);
+		}
 
-			if (level != gw->level)
-				continue;
+		if (is_shadow_present_pte(*sptep) && level == gw->level) {
+			struct kvm_mmu_page *child;
 
 			/*
 			 * For the direct sp, if the guest pte's dirty bit
@@ -344,19 +345,17 @@ static u64 *FNAME(fetch)(struct kvm_vcpu *vcpu, gva_t addr,
 			 * a new sp with the correct access.
 			 */
 			child = page_header(*sptep & PT64_BASE_ADDR_MASK);
-			if (child->role.access == direct_access)
-				continue;
-
-			mmu_page_remove_parent_pte(child, sptep);
-			__set_spte(sptep, shadow_trap_nonpresent_pte);
-			kvm_flush_remote_tlbs(vcpu->kvm);
+			if (child->role.access != direct_access) {
+				mmu_page_remove_parent_pte(child, sptep);
+				__set_spte(sptep, shadow_trap_nonpresent_pte);
+				kvm_flush_remote_tlbs(vcpu->kvm);
+			}
 		}
 
-		if (is_large_pte(*sptep)) {
-			drop_spte(vcpu->kvm, sptep, shadow_trap_nonpresent_pte);
-			kvm_flush_remote_tlbs(vcpu->kvm);
-		}
+		if (is_shadow_present_pte(*sptep))
+			goto check_set_spte;
 
+		nonpresent = true;
 		if (level <= gw->level) {
 			direct = 1;
 			access = direct_access;
@@ -374,22 +373,36 @@ static u64 *FNAME(fetch)(struct kvm_vcpu *vcpu, gva_t addr,
 		}
 		sp = kvm_mmu_get_page(vcpu, table_gfn, addr, level-1,
 					       direct, access, sptep);
-		if (!direct) {
+check_set_spte:
+		if (level >= gw->level) {
 			r = kvm_read_guest_atomic(vcpu->kvm,
-						  gw->pte_gpa[level - 2],
+						  gw->pte_gpa[level - 1],
 						  &curr_pte, sizeof(curr_pte));
-			if (r || curr_pte != gw->ptes[level - 2]) {
-				kvm_mmu_put_page(sp, sptep);
+			if (r || curr_pte != gw->ptes[level - 1]) {
+				if (nonpresent)
+					kvm_mmu_put_page(sp, sptep);
 				kvm_release_pfn_clean(pfn);
 				sptep = NULL;
 				break;
 			}
 		}
 
-		spte = __pa(sp->spt)
-			| PT_PRESENT_MASK | PT_ACCESSED_MASK
-			| PT_WRITABLE_MASK | PT_USER_MASK;
-		*sptep = spte;
+		if (nonpresent) {
+			spte = __pa(sp->spt)
+				| PT_PRESENT_MASK | PT_ACCESSED_MASK
+				| PT_WRITABLE_MASK | PT_USER_MASK;
+			*sptep = spte;
+			continue;
+		}
+
+		if (last_mapping) {
+			mmu_set_spte(vcpu, sptep, access,
+				     gw->pte_access & access,
+				     user_fault, write_fault,
+				     dirty, ptwrite, level,
+				     gw->gfn, pfn, false, true);
+			break;
+		}
 	}
 
 	return sptep;
-- 
1.6.1.2
