From: Sean Christopherson <seanjc@google.com>
To: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: linux-kernel@vger.kernel.org,
Lai Jiangshan <laijs@linux.alibaba.com>,
Paolo Bonzini <pbonzini@redhat.com>,
Vitaly Kuznetsov <vkuznets@redhat.com>,
Wanpeng Li <wanpengli@tencent.com>,
Jim Mattson <jmattson@google.com>, Joerg Roedel <joro@8bytes.org>,
Thomas Gleixner <tglx@linutronix.de>,
Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
x86@kernel.org, "H. Peter Anvin" <hpa@zytor.com>,
kvm@vger.kernel.org
Subject: Re: [PATCH 1/2] KVM: X86: Check pte present first in __shadow_walk_next()
Date: Thu, 12 Aug 2021 16:03:47 +0000 [thread overview]
Message-ID: <YRVGY1ZK8wl9ybBH@google.com> (raw)
In-Reply-To: <20210812043630.2686-1-jiangshanlai@gmail.com>
On Thu, Aug 12, 2021, Lai Jiangshan wrote:
> From: Lai Jiangshan <laijs@linux.alibaba.com>
>
> So far, the loop bodies already ensure the pte is present before calling
> __shadow_walk_next(). But checking pte present in __shadow_walk_next()
> is a more prudent way of programing and loop bodies will not need
> to always check it. It allows us removing is_shadow_present_pte()
> in the loop bodies.
There needs to be more analysis in the changelog, as there are many more callers
to __shadow_walk_next() than the three that are modified in the next patch. It
might even make sense to squash the two patches together, i.e. make it a "move"
instead of an "add + remove", and then explicitly explain why it's ok to add the
check for paths that do _not_ currently have a !is_shadow_present_pte() in the
loop body.
Specifically, FNAME(fetch) via shadow_walk_next() and __direct_map() via
for_each_shadow_entry() do not currently terminate their walks with a !PRESENT,
but they get away with it because they install present non-leaf SPTEs in the loop
itself.
The other argument for the audit (changelog+patch) of all users is that the next
patch misses FNAME(invlpg), e.g.
@@ -977,7 +980,7 @@ static void FNAME(invlpg)(struct kvm_vcpu *vcpu, gva_t gva, hpa_t root_hpa)
FNAME(update_pte)(vcpu, sp, sptep, &gpte);
}
- if (!is_shadow_present_pte(*sptep) || !sp->unsync_children)
+ if (!sp->unsync_children)
break;
}
write_unlock(&vcpu->kvm->mmu_lock);
It would also be worthwhile to document via the changelog that terminating on
!is_shadow_present_pte() is 100% the correct behavior, as walking past a !PRESENT
SPTE would lead to attempting to read a the next level SPTE from a garbage
iter->shadow_addr.
And for clarity and safety, I think it would be worth adding the patch below as
a prep patch to document and enforce that walking the non-leaf SPTEs when faulting
in a page should never terminate early.
From 1c202a7e82b1931e4eb37b23aa9d7108340a6cd2 Mon Sep 17 00:00:00 2001
From: Sean Christopherson <seanjc@google.com>
Date: Thu, 12 Aug 2021 08:38:35 -0700
Subject: [PATCH] KVM: x86/mmu: Verify shadow walk doesn't terminate early in
page faults
WARN and bail if the shadow walk for faulting in a SPTE terminates early,
i.e. doesn't reach the expected level because the walk encountered a
terminal SPTE. The shadow walks for page faults are subtle in that they
install non-leaf SPTEs (zapping leaf SPTEs if necessary!) in the loop
body, and consume the newly created non-leaf SPTE in the loop control,
e.g. __shadow_walk_next(). In other words, the walks guarantee that the
walk will stop if and only if the target level is reached by installing
non-leaf SPTEs to guarantee the walk remains valid.
Opportunistically use fault->goal-level instead of it.level in
FNAME(fetch) to further clarify that KVM always installs the leaf SPTE at
the target level.
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/kvm/mmu/mmu.c | 3 +++
arch/x86/kvm/mmu/paging_tmpl.h | 7 +++++--
2 files changed, 8 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index a272ccbddfa1..2a243ae1d64c 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -2992,6 +2992,9 @@ static int __direct_map(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
account_huge_nx_page(vcpu->kvm, sp);
}
+ if (WARN_ON_ONCE(it.level != fault->goal_level))
+ return -EFAULT;
+
ret = mmu_set_spte(vcpu, it.sptep, ACC_ALL,
fault->write, fault->goal_level, base_gfn, fault->pfn,
fault->prefault, fault->map_writable);
diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
index f70afecbf3a2..3a8a7b2f9979 100644
--- a/arch/x86/kvm/mmu/paging_tmpl.h
+++ b/arch/x86/kvm/mmu/paging_tmpl.h
@@ -749,9 +749,12 @@ static int FNAME(fetch)(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault,
}
}
+ if (WARN_ON_ONCE(it.level != fault->goal_level))
+ return -EFAULT;
+
ret = mmu_set_spte(vcpu, it.sptep, gw->pte_access, fault->write,
- it.level, base_gfn, fault->pfn, fault->prefault,
- fault->map_writable);
+ fault->goal_level, base_gfn, fault->pfn,
+ fault->prefault, fault->map_writable);
if (ret == RET_PF_SPURIOUS)
return ret;
--
2.33.0.rc1.237.g0d66db33f3-goog
next prev parent reply other threads:[~2021-08-12 16:03 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-08-12 4:36 [PATCH 1/2] KVM: X86: Check pte present first in __shadow_walk_next() Lai Jiangshan
2021-08-12 4:36 ` [PATCH 2/2] KVM: X86: Remove the present check from for_each_shadow_entry* loop body Lai Jiangshan
2021-08-12 16:03 ` Sean Christopherson [this message]
2021-08-13 3:16 ` [PATCH V2] KVM: X86: Move PTE present check from loop body to __shadow_walk_next() Lai Jiangshan
2021-08-24 8:30 ` Lai Jiangshan
2021-09-02 20:43 ` Sean Christopherson
2021-09-06 12:25 ` [PATCH V3 1/2] KVM: x86/mmu: Verify shadow walk doesn't terminate early in page faults Lai Jiangshan
2021-09-06 12:25 ` [PATCH V3 2/2] KVM: X86: Move PTE present check from loop body to __shadow_walk_next() Lai Jiangshan
2021-09-24 9:56 ` Paolo Bonzini
2021-09-24 9:33 ` [PATCH V3 1/2] KVM: x86/mmu: Verify shadow walk doesn't terminate early in page faults Paolo Bonzini
2021-08-14 9:47 ` [PATCH 1/2] KVM: X86: Check pte present first in __shadow_walk_next() Lai Jiangshan
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=YRVGY1ZK8wl9ybBH@google.com \
--to=seanjc@google.com \
--cc=bp@alien8.de \
--cc=hpa@zytor.com \
--cc=jiangshanlai@gmail.com \
--cc=jmattson@google.com \
--cc=joro@8bytes.org \
--cc=kvm@vger.kernel.org \
--cc=laijs@linux.alibaba.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@redhat.com \
--cc=pbonzini@redhat.com \
--cc=tglx@linutronix.de \
--cc=vkuznets@redhat.com \
--cc=wanpengli@tencent.com \
--cc=x86@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.