Re: [PATCH V4 3/7] KVM, pkeys: update memeory permission bitmask for pkeys


From: Paolo Bonzini <pbonzini@redhat.com>
To: Xiao Guangrong <guangrong.xiao@linux.intel.com>,
	Huaitong Han <huaitong.han@intel.com>,
	gleb@kernel.org
Cc: kvm@vger.kernel.org
Subject: Re: [PATCH V4 3/7] KVM, pkeys: update memeory permission bitmask for pkeys
Date: Tue, 8 Mar 2016 09:29:49 +0100
Message-ID: <56DE8D7D.5010302@redhat.com>
In-Reply-To: <56DE80B8.40900@linux.intel.com>



On 08/03/2016 08:35, Xiao Guangrong wrote:
>> well-predicted branches are _faster_ than branchless code.
> 
> Er, i do not understand this. If these two case have the same cache hit,
> how can a branch be faster?

Because branchy code typically executes fewer instructions.

Take the same example here:

>>     do {
>>     } while (level > PT_PAGE_TABLE_LEVEL &&
>>          (!(gpte & PT_PAGE_SIZE_MASK) ||
>>           level == mmu->root_level));

The assembly looks like this (assuming %level, %gpte and %mmu are registers):

	cmp $1, %level
	jbe 1f
	test $128, %gpte
	jz beginning_of_loop
	cmpb ROOT_LEVEL_OFFSET(%mmu), %level
	je beginning_of_loop
1:

That is two to six instructions with no dependencies between them, which
the processor can fuse into one to three macro-ops.  For the branchless
code (I posted a patch to implement this algorithm yesterday):

	lea -2(%level), %temp1
	orl %temp1, %gpte
	movzbl LAST_NONLEAF_LEVEL_OFFSET(%mmu), %temp1
	movl %level, %temp2
	subl %temp1, %temp2
	andl %temp2, %gpte
	test $128, %gpte
	jz beginning_of_loop

These are eight instructions, with some dependencies between them too.
In some cases branchless code throws away the result of 10-15
instructions (because in the end it's ANDed with 0, for example).  If it
weren't for mispredictions, the branchy code would be faster.
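
For readers who prefer C to pseudo-assembly, the two forms can be
sketched roughly as below.  This is an illustration, not the patch
itself: struct mmu_sketch is a hypothetical stand-in, its
last_nonleaf_level field is inferred from LAST_NONLEAF_LEVEL_OFFSET
above, and the self-check assumes last_nonleaf_level equals root_level
(no large pages at the root level).  The constants 1 and 128 are the
ones used in the assembly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PT_PAGE_TABLE_LEVEL 1u         /* lowest paging level */
#define PT_PAGE_SIZE_MASK   (1u << 7)  /* the $128 tested in the assembly */

/* Hypothetical stand-in for the two mmu fields used in this sketch. */
struct mmu_sketch {
	uint8_t root_level;         /* e.g. 4 for 4-level paging */
	uint8_t last_nonleaf_level; /* assumed equal to root_level here */
};

/* Branchy form: returns true when the walk stops at this level. */
static bool is_last_gpte_branchy(const struct mmu_sketch *mmu,
				 unsigned level, unsigned gpte)
{
	if (level <= PT_PAGE_TABLE_LEVEL)
		return true;
	if (!(gpte & PT_PAGE_SIZE_MASK))
		return false;
	return level != mmu->root_level;
}

/* Branchless form: same answer, computed with arithmetic on bit 7. */
static bool is_last_gpte_branchless(const struct mmu_sketch *mmu,
				    unsigned level, unsigned gpte)
{
	/* Unsigned wrap-around sets bit 7 only when level == 1. */
	gpte |= level - PT_PAGE_TABLE_LEVEL - 1;
	/* Bit 7 survives only when level < last_nonleaf_level. */
	gpte &= level - mmu->last_nonleaf_level;
	return gpte & PT_PAGE_SIZE_MASK;
}

/* Compare both forms for every level and low gpte bits, 4-level paging. */
static void self_check(void)
{
	const struct mmu_sketch mmu = { .root_level = 4, .last_nonleaf_level = 4 };

	for (unsigned level = 1; level <= mmu.root_level; level++)
		for (unsigned gpte = 0; gpte < 512; gpte++)
			assert(is_last_gpte_branchy(&mmu, level, gpte) ==
			       is_last_gpte_branchless(&mmu, level, gpte));
}
```

The branchless version always pays for the subtract/or/and chain, while
the branchy version often returns after the first compare; that is the
instruction-count difference described above.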

>> Here none of the branches is easily predicted, so we want to get rid of
>> them.
>>
>> The next patch adds three branches, and they are not all equal:
>>
>> - is_long_vcpu is well predicted to true (or even for 32-bit OSes it
>> should be well predicted if the host is not overcommitted).
> 
> But, in the production, cpu over-commit is the normal case...

It depends on the workload.  I would guess that 32-bit OSes are more
common where you have a single legacy guest because e.g. it doesn't have
drivers for recent hardware.

>>> However, i do not think we need a new byte index for PK. The conditions
>>> detecting PK enablement
>>> can be fully found in current vcpu content (i.e, CR4, EFER and U/S
>>> access).
>>
>> Adding a new byte index lets you cache CR4.PKE (and actually EFER.LMA
>> too, though Huaitong's patch doesn't do that).  It's a good thing to do.
>>   U/S is also handled by adding a new byte index, see Huaitong's
> 
> It is not on the same page, the U/S is the type of memory access which
> is depended on vCPU runtime.

Do you mean the type of page (ACC_USER_MASK)?  Only U=1 pages are
subject to PKRU, even in the kernel.  The processor CPL
(PFERR_USER_MASK) only matters if CR0.WP=0.

> But the condition whether PKEY is enabled or not
> is fully depended on the envorment of CPU and we should _always_
> check PKEY even if PFEC_PKEY is not set.
> 
> As PKEY is not enabled on softmmu, the gva_to_gpa mostly comes from internal
> KVM, that means we should always set PFEC.PKEY for all the gva_to_gpa request.
> Wasting a bit is really unnecessary.
> 
> And it is always better to move more workload from permission_fault() to
> update_permission_bitmask() as the former is much hotter than the latter.

I agree, but I'm not sure why you say that adding a bit adds more work
to permission_fault().

Adding a bit lets us skip the CR4.PKE and EFER.LMA checks in
permission_fault() and in all gva_to_gpa() callers.

So my proposal is to compute the "effective" PKRU bits (i.e. extract the
relevant AD and WD bits, and mask away WD if irrelevant) in
update_permission_bitmask(), and add PFERR_PK_MASK to the error code if
they are nonzero.

PFERR_PK_MASK must be computed in permission_fault().  It's a runtime
condition that is not known beforehand.
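
As a rough sketch of what "effective" PKRU bits means here, written
with plain conditionals and no precomputed bitmap: the function names
are hypothetical, and the sketch assumes the caller has already
established that protection keys are enabled for the guest.  It also
assumes the architectural PKRU layout (two bits per key, AD in the low
bit and WD in the high bit) and the usual x86 page-fault error code bit
positions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* x86 page-fault error code bits used in this sketch. */
#define PFERR_WRITE_MASK (1u << 1)
#define PFERR_USER_MASK  (1u << 2)
#define PFERR_FETCH_MASK (1u << 4)
#define PFERR_PK_MASK    (1u << 5)

/* Bits within one key's 2-bit PKRU field. */
#define PKRU_AD 1u /* access disable */
#define PKRU_WD 2u /* write disable */

/*
 * Hypothetical helper: returns the AD/WD bits that actually apply to
 * this access.  A nonzero result means a protection-key fault.
 */
static uint32_t effective_pkru_bits(uint32_t pkru, unsigned pkey,
				    uint32_t pfec, bool page_is_user,
				    bool cr0_wp)
{
	uint32_t bits = (pkru >> (pkey * 2)) & 3;
	bool wf = pfec & PFERR_WRITE_MASK;
	bool uf = pfec & PFERR_USER_MASK;

	/* Only data accesses to U=1 pages are subject to PKRU. */
	if (!page_is_user || (pfec & PFERR_FETCH_MASK))
		return 0;

	/* WD is irrelevant for reads and for supervisor writes with CR0.WP=0. */
	if (!wf || (!uf && !cr0_wp))
		bits &= ~PKRU_WD;

	return bits;
}

/* Hypothetical helper: add PFERR_PK_MASK when any effective bit is set. */
static uint32_t add_pk_to_pfec(uint32_t pfec, uint32_t bits)
{
	return pfec | (bits ? PFERR_PK_MASK : 0);
}
```

For example, with WD set for key 1 (pkru == 0x8), a user-mode write to a
U=1 page tagged with key 1 gets PFERR_PK_MASK added, while a read of the
same page does not.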

>> I don't like the idea of making permissions[] four times larger.
> 
> Okay, then lets introduce a new field for PKEY separately. Your approach
> , fault_u1w0, looks good to me.
>
>> I think I even prefer if update_permission_bitmask sets up a separate
>> bitmask:
>>
>>         mmu->fault_u1w0 |= (wf && !w) << byte;
>>
>> and then this other bitmap can be tested in permission_fault:
>>
>> -        if (!wf || (!uf && !is_write_protection(vcpu)))
>> -            pkru_bits &= ~(1 << PKRU_WRITE);
>> +        /*
>> +         * fault_u1w0 ignores SMAP and PKRU, so use the
>> +         * partially-computed PFEC that we were given.
>> +         */
>> +        fault_uw = (mmu->fault_u1w0 >> (pfec >> 1)) & 1;
>> +        pkru_bits &= ~(1 << PKRU_WRITE) |
>> +            (fault_uw << PKRU_WRITE);
> 
> It looks good to me!

Good. Thanks for reviewing the idea!
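
To make the fault_u1w0 idea concrete, here is a standalone sketch of
the quoted snippets.  It is only an approximation under stated
assumptions: the 16-entry indexing by (pfec >> 1) and PKRU_WRITE == 1
are taken from the quoted code, build_fault_u1w0() and mask_wd() are
hypothetical names, "w" is the write permission of a U=1, W=0 page
(granted only to supervisor accesses when CR0.WP=0), and pfec is
assumed to contain no bits above the fetch bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PFERR_WRITE_MASK (1u << 1)
#define PFERR_USER_MASK  (1u << 2)
#define PKRU_WRITE       1 /* bit index of WD within a key's 2-bit field */

/*
 * Done once per mode change (the update_permission_bitmask() side):
 * bit N is set when an access with pfec >> 1 == N would take a write
 * fault on a U=1, W=0 page, i.e. exactly when WD is relevant.
 */
static uint32_t build_fault_u1w0(bool cr0_wp)
{
	uint32_t map = 0;

	for (unsigned byte = 0; byte < 16; byte++) {
		unsigned pfec = byte << 1;
		bool wf = pfec & PFERR_WRITE_MASK;
		bool uf = pfec & PFERR_USER_MASK;
		/* A W=0 page is writable only by the supervisor with CR0.WP=0. */
		bool w = !uf && !cr0_wp;

		map |= (uint32_t)(wf && !w) << byte;
	}
	return map;
}

/* Done on every access (the permission_fault() side): one shift, no branch. */
static uint32_t mask_wd(uint32_t fault_u1w0, uint32_t pfec, uint32_t pkru_bits)
{
	uint32_t fault_uw = (fault_u1w0 >> (pfec >> 1)) & 1;

	return pkru_bits & (~(1u << PKRU_WRITE) | (fault_uw << PKRU_WRITE));
}
```

The point of the split is that the CR0.WP and user/supervisor tests run
only when the bitmap is rebuilt; the hot path is a shift and a mask.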

Paolo
