* [PATCH v2] x86/vdso: Use RDPID in preference to LSL when available
From: Andy Lutomirski @ 2016-04-21  1:16 UTC
  To: x86; +Cc: linux-kernel@vger.kernel.org, Borislav Petkov, Andy Lutomirski

RDPID is a new instruction that reads MSR_TSC_AUX quickly.  This should
be considerably faster than reading the GDT.  Add a cpufeature for it
and use it from __vdso_getcpu when available.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
---

I don't have a Cannonlake CPU (or whatever CPU I'd need for this).
Could someone who has such a beast give this a try?

Boris, could you double-check me?  You're a lot more familiar with
CPUID stuff and the instruction tables than I am, and I can't fall
back to a real disassembler because the instruction is too new.

Also, it's time for someone to do UMIP.  I'll see if I can convince
someone in KVM land to emulate it to make it easier to test.

Changes from v1:
 - Remove rdpid() from special_instructions.h.  (Was a leftover -- sorry.)

 arch/x86/include/asm/cpufeatures.h | 1 +
 arch/x86/include/asm/vgtod.h       | 7 ++++++-
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 7bfb6b70c745..beaf2fb601ee 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -279,6 +279,7 @@
 /* Intel-defined CPU features, CPUID level 0x00000007:0 (ecx), word 16 */
 #define X86_FEATURE_PKU		(16*32+ 3) /* Protection Keys for Userspace */
 #define X86_FEATURE_OSPKE	(16*32+ 4) /* OS Protection Keys Enable */
+#define X86_FEATURE_RDPID	(16*32+ 22) /* RDPID instruction */
 
 /*
  * BUG word(s)
diff --git a/arch/x86/include/asm/vgtod.h b/arch/x86/include/asm/vgtod.h
index e728699db774..3a01996db58f 100644
--- a/arch/x86/include/asm/vgtod.h
+++ b/arch/x86/include/asm/vgtod.h
@@ -89,8 +89,13 @@ static inline unsigned int __getcpu(void)
 	 * works on all CPUs.  This is volatile so that it orders
 	 * correctly wrt barrier() and to keep gcc from cleverly
 	 * hoisting it out of the calling function.
+	 *
+	 * If RDPID is available, use it.
 	 */
-	asm volatile ("lsl %1,%0" : "=r" (p) : "r" (__PER_CPU_SEG));
+	alternative_io ("lsl %[p],%[seg]",
+			".byte 0xf3,0x0f,0xc7,0xf8", /* RDPID %eax/rax */
+			X86_FEATURE_RDPID,
+			[p] "=a" (p), [seg] "r" (__PER_CPU_SEG));
 
 	return p;
 }
-- 
2.5.5
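
The new define in the patch above packs two things into one number: a word index and a bit position within that word. For X86_FEATURE_RDPID that is word 16 (which the patch's own comment ties to CPUID level 0x00000007:0, ECX) and bit 22. The sketch below only illustrates that arithmetic; feature_word() and feature_bit() are hypothetical helper names made up for this example, not kernel functions.

```c
/* Same encoding as in the patch: word index * 32 + bit position. */
#define X86_FEATURE_RDPID	(16*32 + 22)

/* Hypothetical helper: which 32-bit capability word holds the feature. */
static unsigned int feature_word(unsigned int feature)
{
	return feature / 32;
}

/* Hypothetical helper: bit position of the feature inside that word. */
static unsigned int feature_bit(unsigned int feature)
{
	return feature % 32;
}
```

Calling feature_word(X86_FEATURE_RDPID) and feature_bit(X86_FEATURE_RDPID) recovers the word/bit pair written in the define.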
* Re: [PATCH v2] x86/vdso: Use RDPID in preference to LSL when available
From: Borislav Petkov @ 2016-04-21 12:16 UTC
  To: Andy Lutomirski; +Cc: x86, linux-kernel@vger.kernel.org

On Wed, Apr 20, 2016 at 06:16:01PM -0700, Andy Lutomirski wrote:
> Also, it's time for someone to do UMIP.  I'll see if I can convince
> someone in KVM land to emulate it to make it easier to test.

That'll be fun - we can simply set that bit in CR4 and see who screams
:-P

> Changes from v1:
>  - Remove rdpid() from special_instructions.h.  (Was a leftover -- sorry.)
>
>  arch/x86/include/asm/cpufeatures.h | 1 +
>  arch/x86/include/asm/vgtod.h       | 7 ++++++-
>  2 files changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
> index 7bfb6b70c745..beaf2fb601ee 100644
> --- a/arch/x86/include/asm/cpufeatures.h
> +++ b/arch/x86/include/asm/cpufeatures.h
> @@ -279,6 +279,7 @@
>  /* Intel-defined CPU features, CPUID level 0x00000007:0 (ecx), word 16 */
>  #define X86_FEATURE_PKU		(16*32+ 3) /* Protection Keys for Userspace */
>  #define X86_FEATURE_OSPKE	(16*32+ 4) /* OS Protection Keys Enable */
> +#define X86_FEATURE_RDPID	(16*32+ 22) /* RDPID instruction */
>
>  /*
>   * BUG word(s)
> diff --git a/arch/x86/include/asm/vgtod.h b/arch/x86/include/asm/vgtod.h
> index e728699db774..3a01996db58f 100644
> --- a/arch/x86/include/asm/vgtod.h
> +++ b/arch/x86/include/asm/vgtod.h
> @@ -89,8 +89,13 @@ static inline unsigned int __getcpu(void)
>  	 * works on all CPUs.  This is volatile so that it orders
>  	 * correctly wrt barrier() and to keep gcc from cleverly
>  	 * hoisting it out of the calling function.
> +	 *
> +	 * If RDPID is available, use it.
>  	 */
> -	asm volatile ("lsl %1,%0" : "=r" (p) : "r" (__PER_CPU_SEG));
> +	alternative_io ("lsl %[p],%[seg]",
> +			".byte 0xf3,0x0f,0xc7,0xf8", /* RDPID %eax/rax */

AFAICT, 0xf8 is correct, if I'm reading the SDM right:

bits [7:6] must be 11b for opcode group 9 and RDPID is in the 11b row,
bits [5:3] are ModRM.reg and they need to be 111b for RDPID (0x7 column)
and the last three [2:0] select the register and they must be 000b for
rAX.

HOWEVER, you need to make the asm output register constraint "=a"
because you're specifying rAX as a destination register for RDPID.

Also, I'm wondering: should we supply that alternative in a separate
inline function in special_instructions.h for wider use? I.e., something
like read_cpu_num() or so...

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
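
The ModRM breakdown in the message above can be checked mechanically. The sketch below splits a byte into the three fields Boris names (bits [7:6], bits [5:3] and bits [2:0]) and builds one back from them. The struct and function names are hypothetical, chosen for this illustration only; nothing here comes from the kernel tree.

```c
#include <stdint.h>

/* The three fields of an x86 ModRM byte. */
struct modrm {
	uint8_t mod;	/* bits [7:6]: 11b means a register operand */
	uint8_t reg;	/* bits [5:3]: opcode extension within the group */
	uint8_t rm;	/* bits [2:0]: register number, 000b is rAX */
};

/* Split a ModRM byte into its fields. */
static struct modrm decode_modrm(uint8_t byte)
{
	struct modrm m = {
		.mod = (uint8_t)(byte >> 6),
		.reg = (uint8_t)((byte >> 3) & 0x7),
		.rm  = (uint8_t)(byte & 0x7),
	};
	return m;
}

/* Build a ModRM byte from its fields (inverse of decode_modrm). */
static uint8_t encode_modrm(uint8_t mod, uint8_t reg, uint8_t rm)
{
	return (uint8_t)(((mod & 0x3) << 6) | ((reg & 0x7) << 3) | (rm & 0x7));
}
```

Feeding 0xf8 to decode_modrm() gives mod = 11b, reg = 111b and rm = 000b, which matches the reading of the last byte of ".byte 0xf3,0x0f,0xc7,0xf8" given in the review.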
* Re: [PATCH v2] x86/vdso: Use RDPID in preference to LSL when available
From: Andy Lutomirski @ 2016-04-21 15:25 UTC
  To: Borislav Petkov; +Cc: Andy Lutomirski, X86 ML, linux-kernel@vger.kernel.org

On Thu, Apr 21, 2016 at 5:16 AM, Borislav Petkov <bp@alien8.de> wrote:
> On Wed, Apr 20, 2016 at 06:16:01PM -0700, Andy Lutomirski wrote:
>> Also, it's time for someone to do UMIP.  I'll see if I can convince
>> someone in KVM land to emulate it to make it easier to test.
>
> That'll be fun - we can simply set that bit in CR4 and see who screams
> :-P
>
>> Changes from v1:
>>  - Remove rdpid() from special_instructions.h.  (Was a leftover -- sorry.)
>>
>>  arch/x86/include/asm/cpufeatures.h | 1 +
>>  arch/x86/include/asm/vgtod.h       | 7 ++++++-
>>  2 files changed, 7 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
>> index 7bfb6b70c745..beaf2fb601ee 100644
>> --- a/arch/x86/include/asm/cpufeatures.h
>> +++ b/arch/x86/include/asm/cpufeatures.h
>> @@ -279,6 +279,7 @@
>>  /* Intel-defined CPU features, CPUID level 0x00000007:0 (ecx), word 16 */
>>  #define X86_FEATURE_PKU		(16*32+ 3) /* Protection Keys for Userspace */
>>  #define X86_FEATURE_OSPKE	(16*32+ 4) /* OS Protection Keys Enable */
>> +#define X86_FEATURE_RDPID	(16*32+ 22) /* RDPID instruction */
>>
>>  /*
>>   * BUG word(s)
>> diff --git a/arch/x86/include/asm/vgtod.h b/arch/x86/include/asm/vgtod.h
>> index e728699db774..3a01996db58f 100644
>> --- a/arch/x86/include/asm/vgtod.h
>> +++ b/arch/x86/include/asm/vgtod.h
>> @@ -89,8 +89,13 @@ static inline unsigned int __getcpu(void)
>>  	 * works on all CPUs.  This is volatile so that it orders
>>  	 * correctly wrt barrier() and to keep gcc from cleverly
>>  	 * hoisting it out of the calling function.
>> +	 *
>> +	 * If RDPID is available, use it.
>>  	 */
>> -	asm volatile ("lsl %1,%0" : "=r" (p) : "r" (__PER_CPU_SEG));
>> +	alternative_io ("lsl %[p],%[seg]",
>> +			".byte 0xf3,0x0f,0xc7,0xf8", /* RDPID %eax/rax */
>
> AFAICT, 0xf8 is correct, if I'm reading the SDM right:
>
> bits [7:6] must be 11b for opcode group 9 and RDPID is in the 11b row,
> bits [5:3] are ModRM.reg and they need to be 111b for RDPID (0x7 column)
> and the last three [2:0] select the register and they must be 000b for
> rAX.
>
> HOWEVER, you need to make the asm output register constraint "=a"
> because you're specifying rAX as a destination register for RDPID.

Didn't I?

>
> Also, I'm wondering: should we supply that alternative in a separate
> inline function in special_instructions.h for wider use? I.e., something
> like read_cpu_num() or so...
>

I thought about it, and there were two reasons:

1. I don't think we want to use __getcpu in the kernel.  LSL is fairly
slow, and we'd still need to mask off the node number.
raw_smp_processor_id(), in contrast, is a single load.

2. I have no way to benchmark this thing.  I'm assuming the RDPID will
be faster than LSL, but that doesn't mean it's faster than a load.
(It could be -- it will save a cache line.)

So we might actually want something that does an alternative where the
two choices are the percpu load and RDPID ; AND, but that wouldn't end
up sharing code.  But I'll leave that to someone with an actual
RDPID-supporting CPU :)

--Andy
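
Point 1 above notes that the value returned by __getcpu() still has the node number in it and would need an AND to yield a bare CPU number. A minimal sketch of that unpacking follows. It assumes the CPU number sits in the low 12 bits with the node number above it, which is how I understand the vDSO getcpu code of that era to split the value; treat the constants and the helper names cpu_of() and node_of() as assumptions for illustration rather than the kernel's definitions.

```c
/* Assumed layout: CPU number in bits [11:0], node number in the bits above. */
#define CPU_MASK	0xfffu
#define NODE_SHIFT	12

/* Hypothetical helper: the "AND" step that masks off the node number. */
static unsigned int cpu_of(unsigned int p)
{
	return p & CPU_MASK;
}

/* Hypothetical helper: recover the node number from the same value. */
static unsigned int node_of(unsigned int p)
{
	return p >> NODE_SHIFT;
}
```

With a packed value of (1 << 12) | 5, cpu_of() gives 5 and node_of() gives 1. This extra mask is the cost Andy weighs against the single load done by raw_smp_processor_id().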
* Re: [PATCH v2] x86/vdso: Use RDPID in preference to LSL when available
From: Borislav Petkov @ 2016-04-21 16:26 UTC
  To: Andy Lutomirski; +Cc: Andy Lutomirski, X86 ML, linux-kernel@vger.kernel.org

On Thu, Apr 21, 2016 at 08:25:45AM -0700, Andy Lutomirski wrote:
> Didn't I?

Bah, I cut off the line which has the "=a" and then did the commenting.
Sorry about the noise.

> I thought about it, and there were two reasons:
>
> 1. I don't think we want to use __getcpu in the kernel.  LSL is fairly
> slow, and we'd still need to mask off the node number.
> raw_smp_processor_id(), in contrast, is a single load.

Right.

> 2. I have no way to benchmark this thing.  I'm assuming the RDPID will
> be faster than LSL, but that doesn't mean it's faster than a load.
> (It could be -- it will save a cache line.)

But the RDPID reads an MSR. So it probably is microcode and thus slower
than a load...

I guess one of the reasons for the RDPID is to avoid the serialization
cost of RDTSCP.

> So we might actually want something that does an alternative where the
> two choices are the percpu load and RDPID ; AND, but that wouldn't end
> up sharing code.  But I'll leave that to someone with an actual
> RDPID-supporting CPU :)

Right.

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.