From mboxrd@z Thu Jan 1 00:00:00 1970 From: Xiao Guangrong Subject: Re: [PATCH V4 3/7] KVM, pkeys: update memeory permission bitmask for pkeys Date: Wed, 9 Mar 2016 13:03:33 +0800 Message-ID: <56DFAEA5.9080604@linux.intel.com> References: <1457177252-7577-1-git-send-email-huaitong.han@intel.com> <1457177252-7577-4-git-send-email-huaitong.han@intel.com> <56DBDF57.4030607@linux.intel.com> <56DCB9BF.4020904@redhat.com> <56DE80B8.40900@linux.intel.com> <56DE8D7D.5010302@redhat.com> <56DE991E.8070007@linux.intel.com> <56DEA2FA.70302@redhat.com> Mime-Version: 1.0 Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 7bit Cc: kvm@vger.kernel.org To: Paolo Bonzini , Huaitong Han , gleb@kernel.org Return-path: Received: from mga14.intel.com ([192.55.52.115]:24832 "EHLO mga14.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750710AbcCIFD7 (ORCPT ); Wed, 9 Mar 2016 00:03:59 -0500 In-Reply-To: <56DEA2FA.70302@redhat.com> Sender: kvm-owner@vger.kernel.org List-ID: On 03/08/2016 06:01 PM, Paolo Bonzini wrote: > > > On 08/03/2016 10:19, Xiao Guangrong wrote: >>>> >>>> As PKEY is not enabled on softmmu, the gva_to_gpa mostly comes from >>>> internal >>>> KVM, that means we should always set PFEC.PKEY for all the gva_to_gpa >>>> request. >>>> Wasting a bit is really unnecessary. >>>> >>>> And it is always better to move more workload from permission_fault() to >>>> update_permission_bitmask() as the former is much hotter than the >>>> latter. >>> >>> I agree, but I'm not sure why you say that adding a bits adds more work >>> to permission_fault(). >> >> A branch to check PFEC.PKEY, which is not well predictable on soft mmu. (It >> should always be set in EPT as the page table walking is done by software, >> however, if we only consider EPT we can assume it is always true). > > Ok, I agree we shouldn't add such a branch. > > It's really messy that PKERR_PK_MASK is computed at runtime. It means > we need to check U=1 in permission_fault, and we need to rethink all of > the handling of PKERR_PK_MASK in fact. I think we should extend the > algorithm from earlier in the thread to handle the AD bit too: > > /* Bit 0: should PKRU.AD be considered if PFEC=0 or 1? > * Bit 1: should PKRU.WD be considered if PFEC=0 or 1? > * Bit 2: should PKRU.AD be considered if PFEC=2 or 3? > * Bit 3: should PKRU.WD be considered if PFEC=2 or 3? > * Ignore PKRU if U=0. > * etc. > */ > u32 pkru_mask; > > In update_permission_bitmask, the bits of mmu->pkru_mask only depend on > the byte, not on the bit. > > Bits 0, 2, ... are computed from pku, i.e. cr4_pku && long_mode && !ff. > Bits 1, 3, ... are computed from pku && (wf && > (is_write_protection(vcpu) || uf)). > > The permissions[] array can keep 16 entries, because PFERR_PK_MASK is > never passed to permission_fault (like PFERR_RSVD_MASK). The > permission_fault code is something like: > > WARN_ON(pfec & PFERR_PK_MASK); > pkru_mask = pte_access & ACC_USER_MASK ? mmu->pkru_mask : 0; > pkru_bits &= pkru_mask >> (pfec & ~1); ^ it should be pfec >> 1? As we never take PFEC.P into account? Actually pkru_mask can be u16 since only three bits that are PFEC.W, PFEC.U and PFEC.F need to be taken into account and every offset occupies 2 bits. > pfec |= -pkru_bits & PFERR_PK_MASK; > > What do you think? Very good. :) > > I still believe that the above checks should be wrapped in a > > if (unlikely(mmu->pkru_mask) { > ... > } > > It should be predicted well and removes the potential performance cost > of branchless code on !PKE OSes. I am considering we can use 'pte_access & ACC_USER_MASK' replacing PFEC.RSVD in your algorithm (then pkru_mask is 32 bit), permission_fault() is like this: WARN_ON(pfec & PFERR_PK_MASK); offset = pfec & (PFERR_WRITE_MASK | PFERR_USER_MASK); offset |= !!(pte_access & ACC_USER_MASK) << PFERR_RSVD_BIT; offset >>= 1; pkru_bits &= pkru_mask >> offset; pfec |= -pkru_bits & PFERR_PK_MASK; Now, this is no branch and no much overload introduced in update_permission_bitmask().