From mboxrd@z Thu Jan 1 00:00:00 1970 From: Paolo Bonzini Subject: Re: [PATCH V4 3/7] KVM, pkeys: update memeory permission bitmask for pkeys Date: Tue, 8 Mar 2016 11:01:30 +0100 Message-ID: <56DEA2FA.70302@redhat.com> References: <1457177252-7577-1-git-send-email-huaitong.han@intel.com> <1457177252-7577-4-git-send-email-huaitong.han@intel.com> <56DBDF57.4030607@linux.intel.com> <56DCB9BF.4020904@redhat.com> <56DE80B8.40900@linux.intel.com> <56DE8D7D.5010302@redhat.com> <56DE991E.8070007@linux.intel.com> Mime-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit Cc: kvm@vger.kernel.org To: Xiao Guangrong , Huaitong Han , gleb@kernel.org Return-path: Received: from mail-lb0-f194.google.com ([209.85.217.194]:35474 "EHLO mail-lb0-f194.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S932828AbcCHKBh (ORCPT ); Tue, 8 Mar 2016 05:01:37 -0500 Received: by mail-lb0-f194.google.com with SMTP id h2so1040341lbs.2 for ; Tue, 08 Mar 2016 02:01:36 -0800 (PST) In-Reply-To: <56DE991E.8070007@linux.intel.com> Sender: kvm-owner@vger.kernel.org List-ID: On 08/03/2016 10:19, Xiao Guangrong wrote: >>> >>> As PKEY is not enabled on softmmu, the gva_to_gpa mostly comes from >>> internal >>> KVM, that means we should always set PFEC.PKEY for all the gva_to_gpa >>> request. >>> Wasting a bit is really unnecessary. >>> >>> And it is always better to move more workload from permission_fault() to >>> update_permission_bitmask() as the former is much hotter than the >>> latter. >> >> I agree, but I'm not sure why you say that adding a bits adds more work >> to permission_fault(). > > A branch to check PFEC.PKEY, which is not well predictable on soft mmu. (It > should always be set in EPT as the page table walking is done by software, > however, if we only consider EPT we can assume it is always true). Ok, I agree we shouldn't add such a branch. It's really messy that PKERR_PK_MASK is computed at runtime. It means we need to check U=1 in permission_fault, and we need to rethink all of the handling of PKERR_PK_MASK in fact. I think we should extend the algorithm from earlier in the thread to handle the AD bit too: /* Bit 0: should PKRU.AD be considered if PFEC=0 or 1? * Bit 1: should PKRU.WD be considered if PFEC=0 or 1? * Bit 2: should PKRU.AD be considered if PFEC=2 or 3? * Bit 3: should PKRU.WD be considered if PFEC=2 or 3? * Ignore PKRU if U=0. * etc. */ u32 pkru_mask; In update_permission_bitmask, the bits of mmu->pkru_mask only depend on the byte, not on the bit. Bits 0, 2, ... are computed from pku, i.e. cr4_pku && long_mode && !ff. Bits 1, 3, ... are computed from pku && (wf && (is_write_protection(vcpu) || uf)). The permissions[] array can keep 16 entries, because PFERR_PK_MASK is never passed to permission_fault (like PFERR_RSVD_MASK). The permission_fault code is something like: WARN_ON(pfec & PFERR_PK_MASK); pkru_mask = pte_access & ACC_USER_MASK ? mmu->pkru_mask : 0; pkru_bits &= pkru_mask >> (pfec & ~1); pfec |= -pkru_bits & PFERR_PK_MASK; What do you think? I still believe that the above checks should be wrapped in a if (unlikely(mmu->pkru_mask) { ... } It should be predicted well and removes the potential performance cost of branchless code on !PKE OSes. >> Adding a bit lets us skip CR4.PKU and EFER.LMA checks in >> permission_fault() and in all gva_to_gpa() callers. > > The point is when we can clear this bit to skip these checks. We should > _always_ check PKEY even if PFEC.PKEY = 0, because: > 1) all gva_to_gpa()s issued by KVM should always check PKEY. This is the > case of ept only. > > 2) if the feature is enabled in softmmu, shadow page table may change its > behavior, for example, the mmio-access causes a reserved PF which > may clear PFEC.PKEY. > > And skipping these checks is not really necessary as we can take them into > account when we update the bitmask. Got it. Does the above pseudocode match your idea? It disregards the pagefault's PFERR_PK_MASK completely. Paolo