* [RFC PATCH 0/2] KVM: MMU: return page fault error code from permission_fault @ 2016-03-08 11:45 Paolo Bonzini 2016-03-08 11:45 ` [RFC PATCH 1/2] KVM: MMU: precompute page fault error code Paolo Bonzini 2016-03-08 11:45 ` [RFC PATCH 2/2] KVM: MMU: return page fault error code from permission_fault Paolo Bonzini 0 siblings, 2 replies; 7+ messages in thread From: Paolo Bonzini @ 2016-03-08 11:45 UTC (permalink / raw) To: linux-kernel, kvm; +Cc: guangrong.xiao, huaitong.han Intel's new PKU extension introduces a bit of the page fault error code that must be dynamically computed after the PTE lookup (unlike I/D, U/S and W/R). To ease this, these two patches make permission_fault return the page fault error code. Right now it only adds the P bit to the input parameter "pfec", but PKU can change that. Paolo Paolo Bonzini (2): KVM: MMU: precompute page fault error code KVM: MMU: return page fault error code from permission_fault arch/x86/kvm/mmu.c | 2 +- arch/x86/kvm/mmu.h | 15 ++++++++++----- arch/x86/kvm/paging_tmpl.h | 31 +++++++++++++++++-------------- 3 files changed, 28 insertions(+), 20 deletions(-) -- 1.8.3.1 ^ permalink raw reply [flat|nested] 7+ messages in thread
* [RFC PATCH 1/2] KVM: MMU: precompute page fault error code 2016-03-08 11:45 [RFC PATCH 0/2] KVM: MMU: return page fault error code from permission_fault Paolo Bonzini @ 2016-03-08 11:45 ` Paolo Bonzini 2016-03-10 14:01 ` Xiao Guangrong 2016-03-08 11:45 ` [RFC PATCH 2/2] KVM: MMU: return page fault error code from permission_fault Paolo Bonzini 1 sibling, 1 reply; 7+ messages in thread From: Paolo Bonzini @ 2016-03-08 11:45 UTC (permalink / raw) To: linux-kernel, kvm; +Cc: guangrong.xiao, huaitong.han For the next patch, we will want to filter PFERR_FETCH_MASK away early, and not pass it to permission_fault if neither NX nor SMEP are enabled. Prepare for the change. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> --- arch/x86/kvm/mmu.c | 2 +- arch/x86/kvm/paging_tmpl.h | 26 +++++++++++++++----------- 2 files changed, 16 insertions(+), 12 deletions(-) diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c index 2463de0b935c..e57f7be061e3 100644 --- a/arch/x86/kvm/mmu.c +++ b/arch/x86/kvm/mmu.c @@ -3883,7 +3883,7 @@ static void update_permission_bitmask(struct kvm_vcpu *vcpu, u = bit & ACC_USER_MASK; if (!ept) { - /* Not really needed: !nx will cause pte.nx to fault */ + /* Not really needed: if !nx, ff will always be zero */ x |= !mmu->nx; /* Allow supervisor writes if !cr0.wp */ w |= !is_write_protection(vcpu) && !uf; diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h index 6013f3685ef4..285858d3223b 100644 --- a/arch/x86/kvm/paging_tmpl.h +++ b/arch/x86/kvm/paging_tmpl.h @@ -272,13 +272,24 @@ static int FNAME(walk_addr_generic)(struct guest_walker *walker, gpa_t pte_gpa; int offset; const int write_fault = access & PFERR_WRITE_MASK; - const int user_fault = access & PFERR_USER_MASK; - const int fetch_fault = access & PFERR_FETCH_MASK; - u16 errcode = 0; + u16 errcode; gpa_t real_gpa; gfn_t gfn; trace_kvm_mmu_pagetable_walk(addr, access); + + /* + * Do not modify PFERR_FETCH_MASK in access. 
It is used later in the call to + * mmu->translate_gpa and, when nested virtualization is in use, the X or NX + * bit of nested page tables always applies---even if NX and SMEP are disabled + * in the guest. + * + * TODO: cache the result of the NX and SMEP test in struct kvm_mmu? + */ + errcode = access; + if (!(mmu->nx || kvm_read_cr4_bits(vcpu, X86_CR4_SMEP))) + errcode &= ~PFERR_FETCH_MASK; + retry_walk: walker->level = mmu->root_level; pte = mmu->get_cr3(vcpu); @@ -389,9 +400,7 @@ retry_walk: if (unlikely(!accessed_dirty)) { ret = FNAME(update_accessed_dirty_bits)(vcpu, mmu, walker, write_fault); - if (unlikely(ret < 0)) - goto error; - else if (ret) + if (ret > 0) goto retry_walk; } @@ -402,11 +411,6 @@ retry_walk: return 1; error: - errcode |= write_fault | user_fault; - if (fetch_fault && (mmu->nx || - kvm_read_cr4_bits(vcpu, X86_CR4_SMEP))) - errcode |= PFERR_FETCH_MASK; - walker->fault.vector = PF_VECTOR; walker->fault.error_code_valid = true; walker->fault.error_code = errcode; -- 1.8.3.1 ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [RFC PATCH 1/2] KVM: MMU: precompute page fault error code 2016-03-08 11:45 ` [RFC PATCH 1/2] KVM: MMU: precompute page fault error code Paolo Bonzini @ 2016-03-10 14:01 ` Xiao Guangrong 2016-03-10 14:07 ` Paolo Bonzini 0 siblings, 1 reply; 7+ messages in thread From: Xiao Guangrong @ 2016-03-10 14:01 UTC (permalink / raw) To: Paolo Bonzini, linux-kernel, kvm; +Cc: huaitong.han On 03/08/2016 07:45 PM, Paolo Bonzini wrote: > For the next patch, we will want to filter PFERR_FETCH_MASK away early, > and not pass it to permission_fault if neither NX nor SMEP are enabled. > Prepare for the change. Why is it needed? It is much easier to drop PFEC.F in update_permission_bitmask(). > > Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> > --- > arch/x86/kvm/mmu.c | 2 +- > arch/x86/kvm/paging_tmpl.h | 26 +++++++++++++++----------- > 2 files changed, 16 insertions(+), 12 deletions(-) > > diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c > index 2463de0b935c..e57f7be061e3 100644 > --- a/arch/x86/kvm/mmu.c > +++ b/arch/x86/kvm/mmu.c > @@ -3883,7 +3883,7 @@ static void update_permission_bitmask(struct kvm_vcpu *vcpu, > u = bit & ACC_USER_MASK; > > if (!ept) { > - /* Not really needed: !nx will cause pte.nx to fault */ > + /* Not really needed: if !nx, ff will always be zero */ > x |= !mmu->nx; > /* Allow supervisor writes if !cr0.wp */ > w |= !is_write_protection(vcpu) && !uf; > diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h > index 6013f3685ef4..285858d3223b 100644 > --- a/arch/x86/kvm/paging_tmpl.h > +++ b/arch/x86/kvm/paging_tmpl.h > @@ -272,13 +272,24 @@ static int FNAME(walk_addr_generic)(struct guest_walker *walker, > gpa_t pte_gpa; > int offset; > const int write_fault = access & PFERR_WRITE_MASK; > - const int user_fault = access & PFERR_USER_MASK; > - const int fetch_fault = access & PFERR_FETCH_MASK; > - u16 errcode = 0; > + u16 errcode; > gpa_t real_gpa; > gfn_t gfn; > > trace_kvm_mmu_pagetable_walk(addr, access); > + > + /* > + * Do not 
modify PFERR_FETCH_MASK in access. It is used later in the call to > + * mmu->translate_gpa and, when nested virtualization is in use, the X or NX > + * bit of nested page tables always applies---even if NX and SMEP are disabled > + * in the guest. > + * > + * TODO: cache the result of the NX and SMEP test in struct kvm_mmu? > + */ > + errcode = access; > + if (!(mmu->nx || kvm_read_cr4_bits(vcpu, X86_CR4_SMEP))) > + errcode &= ~PFERR_FETCH_MASK; > + PFEC.P might be set since it is copied from @access. And the workload is moved from rare path to the common path... > retry_walk: > walker->level = mmu->root_level; > pte = mmu->get_cr3(vcpu); > @@ -389,9 +400,7 @@ retry_walk: > > if (unlikely(!accessed_dirty)) { > ret = FNAME(update_accessed_dirty_bits)(vcpu, mmu, walker, write_fault); > - if (unlikely(ret < 0)) > - goto error; > - else if (ret) > + if (ret > 0) > goto retry_walk; So it returns normally even if update_accessed_dirty_bits() failed. ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [RFC PATCH 1/2] KVM: MMU: precompute page fault error code 2016-03-10 14:01 ` Xiao Guangrong @ 2016-03-10 14:07 ` Paolo Bonzini 0 siblings, 0 replies; 7+ messages in thread From: Paolo Bonzini @ 2016-03-10 14:07 UTC (permalink / raw) To: Xiao Guangrong, linux-kernel, kvm; +Cc: huaitong.han On 10/03/2016 15:01, Xiao Guangrong wrote: > > > On 03/08/2016 07:45 PM, Paolo Bonzini wrote: >> For the next patch, we will want to filter PFERR_FETCH_MASK away early, >> and not pass it to permission_fault if neither NX nor SMEP are enabled. >> Prepare for the change. > > Why is it needed? It is much easier to drop PFEC.F in > update_permission_bitmask(). It's just how I structured the patches. It's unrelated to update_permission_bitmask. I wanted permission_fault to return a fault code, and while it's easy to add bits (such as PFERR_PK_MASK) it's harder to remove bits from the PFEC that permission_fault receives. So I'm doing it here. Feel free to instead drop PFEC.F in permission_fault() or even add a new member to kvm_mmu with the valid bits of a PFEC. This is just an RFC for you and Huaitong to adapt in the PK patchset. Sometimes things are easier to describe in patches than in English. 
:) Paolo >> >> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> >> --- >> arch/x86/kvm/mmu.c | 2 +- >> arch/x86/kvm/paging_tmpl.h | 26 +++++++++++++++----------- >> 2 files changed, 16 insertions(+), 12 deletions(-) >> >> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c >> index 2463de0b935c..e57f7be061e3 100644 >> --- a/arch/x86/kvm/mmu.c >> +++ b/arch/x86/kvm/mmu.c >> @@ -3883,7 +3883,7 @@ static void update_permission_bitmask(struct >> kvm_vcpu *vcpu, >> u = bit & ACC_USER_MASK; >> >> if (!ept) { >> - /* Not really needed: !nx will cause pte.nx to fault */ >> + /* Not really needed: if !nx, ff will always be zero */ >> x |= !mmu->nx; >> /* Allow supervisor writes if !cr0.wp */ >> w |= !is_write_protection(vcpu) && !uf; >> diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h >> index 6013f3685ef4..285858d3223b 100644 >> --- a/arch/x86/kvm/paging_tmpl.h >> +++ b/arch/x86/kvm/paging_tmpl.h >> @@ -272,13 +272,24 @@ static int FNAME(walk_addr_generic)(struct >> guest_walker *walker, >> gpa_t pte_gpa; >> int offset; >> const int write_fault = access & PFERR_WRITE_MASK; >> - const int user_fault = access & PFERR_USER_MASK; >> - const int fetch_fault = access & PFERR_FETCH_MASK; >> - u16 errcode = 0; >> + u16 errcode; >> gpa_t real_gpa; >> gfn_t gfn; >> >> trace_kvm_mmu_pagetable_walk(addr, access); >> + >> + /* >> + * Do not modify PFERR_FETCH_MASK in access. It is used later in >> the call to >> + * mmu->translate_gpa and, when nested virtualization is in use, >> the X or NX >> + * bit of nested page tables always applies---even if NX and SMEP >> are disabled >> + * in the guest. >> + * >> + * TODO: cache the result of the NX and SMEP test in struct kvm_mmu? >> + */ >> + errcode = access; >> + if (!(mmu->nx || kvm_read_cr4_bits(vcpu, X86_CR4_SMEP))) >> + errcode &= ~PFERR_FETCH_MASK; >> + > > PFEC.P might be set since it is copied from @access. > > And the workload is moved from rare path to the common path... 
> >> retry_walk: >> walker->level = mmu->root_level; >> pte = mmu->get_cr3(vcpu); >> @@ -389,9 +400,7 @@ retry_walk: >> >> if (unlikely(!accessed_dirty)) { >> ret = FNAME(update_accessed_dirty_bits)(vcpu, mmu, walker, >> write_fault); >> - if (unlikely(ret < 0)) >> - goto error; >> - else if (ret) >> + if (ret > 0) >> goto retry_walk; > > So it returns normally even if update_accessed_dirty_bits() failed. ^ permalink raw reply [flat|nested] 7+ messages in thread
* [RFC PATCH 2/2] KVM: MMU: return page fault error code from permission_fault 2016-03-08 11:45 [RFC PATCH 0/2] KVM: MMU: return page fault error code from permission_fault Paolo Bonzini 2016-03-08 11:45 ` [RFC PATCH 1/2] KVM: MMU: precompute page fault error code Paolo Bonzini @ 2016-03-08 11:45 ` Paolo Bonzini 2016-03-10 14:03 ` Xiao Guangrong 1 sibling, 1 reply; 7+ messages in thread From: Paolo Bonzini @ 2016-03-08 11:45 UTC (permalink / raw) To: linux-kernel, kvm; +Cc: guangrong.xiao, huaitong.han Intel's new PKU extension introduces a bit of the page fault error code that must be dynamically computed after the PTE lookup (unlike I/D, U/S and W/R). To ease this, these two patches make permission_fault return the page fault error code. Right now it only adds the P bit to the input parameter "pfec", but PKU can change that. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> --- arch/x86/kvm/mmu.h | 15 ++++++++++----- arch/x86/kvm/paging_tmpl.h | 5 ++--- 2 files changed, 12 insertions(+), 8 deletions(-) diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h index 58fe98a0a526..8b2b3dfca1ae 100644 --- a/arch/x86/kvm/mmu.h +++ b/arch/x86/kvm/mmu.h @@ -141,11 +141,15 @@ static inline bool is_write_protection(struct kvm_vcpu *vcpu) } /* - * Will a fault with a given page-fault error code (pfec) cause a permission - * fault with the given access (in ACC_* format)? + * Check if a given access (described through the I/D, W/R and U/S bits of a + * page fault error code pfec) causes a permission fault with the given PTE + * access rights (in ACC_* format). + * + * Return zero if the access does not fault; return the page fault error code + * if the access faults. 
*/ -static inline bool permission_fault(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu, - unsigned pte_access, unsigned pfec) +static inline u8 permission_fault(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu, + unsigned pte_access, unsigned pfec) { int cpl = kvm_x86_ops->get_cpl(vcpu); unsigned long rflags = kvm_x86_ops->get_rflags(vcpu); @@ -169,7 +173,8 @@ static inline bool permission_fault(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu, WARN_ON(pfec & PFERR_RSVD_MASK); - return (mmu->permissions[index] >> pte_access) & 1; + pfec |= PFERR_PRESENT_MASK; + return -((mmu->permissions[index] >> pte_access) & 1) & pfec; } void kvm_mmu_invalidate_zap_all_pages(struct kvm *kvm); diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h index 285858d3223b..4a6866dab848 100644 --- a/arch/x86/kvm/paging_tmpl.h +++ b/arch/x86/kvm/paging_tmpl.h @@ -370,10 +370,9 @@ retry_walk: walker->ptes[walker->level - 1] = pte; } while (!is_last_gpte(mmu, walker->level, pte)); - if (unlikely(permission_fault(vcpu, mmu, pte_access, access))) { - errcode |= PFERR_PRESENT_MASK; + errcode = permission_fault(vcpu, mmu, pte_access, errcode); + if (unlikely(errcode)) goto error; - } gfn = gpte_to_gfn_lvl(pte, walker->level); gfn += (addr & PT_LVL_OFFSET_MASK(walker->level)) >> PAGE_SHIFT; -- 1.8.3.1 ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [RFC PATCH 2/2] KVM: MMU: return page fault error code from permission_fault 2016-03-08 11:45 ` [RFC PATCH 2/2] KVM: MMU: return page fault error code from permission_fault Paolo Bonzini @ 2016-03-10 14:03 ` Xiao Guangrong 2016-03-10 14:04 ` Paolo Bonzini 0 siblings, 1 reply; 7+ messages in thread From: Xiao Guangrong @ 2016-03-10 14:03 UTC (permalink / raw) To: Paolo Bonzini, linux-kernel, kvm; +Cc: huaitong.han On 03/08/2016 07:45 PM, Paolo Bonzini wrote: > Intel's new PKU extension introduces a bit of the page fault error code > that must be dynamically computed after the PTE lookup (unlike I/D, U/S > and W/R). To ease this, these two patches make permission_fault return > the page fault error code. > > Right now it only adds the P bit to the input parameter "pfec", but PKU > can change that. Yep, I got the same idea when I reviewed the pkey patchset. This patch looks good to me. Reviewed-by: Xiao Guangrong <guangrong.xiao@linux.intel.com> ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [RFC PATCH 2/2] KVM: MMU: return page fault error code from permission_fault 2016-03-10 14:03 ` Xiao Guangrong @ 2016-03-10 14:04 ` Paolo Bonzini 0 siblings, 0 replies; 7+ messages in thread From: Paolo Bonzini @ 2016-03-10 14:04 UTC (permalink / raw) To: Xiao Guangrong, linux-kernel, kvm; +Cc: huaitong.han On 10/03/2016 15:03, Xiao Guangrong wrote: > > > On 03/08/2016 07:45 PM, Paolo Bonzini wrote: >> Intel's new PKU extension introduces a bit of the page fault error code >> that must be dynamically computed after the PTE lookup (unlike I/D, U/S >> and W/R). To ease this, these two patches make permission_fault return >> the page fault error code. >> >> Right now it only adds the P bit to the input parameter "pfec", but PKU >> can change that. > > Yep, I got the same idea when I reviewed the pkey patchset. This patch > looks good to me. > > Reviewed-by: Xiao Guangrong <guangrong.xiao@linux.intel.com> Great, feel free to pick it up in the pkey patchset. I'm answering 1/2 now. Paolo ^ permalink raw reply [flat|nested] 7+ messages in thread