From: Ingo Molnar <mingo@kernel.org>
To: Kyle Huey <me@kylehuey.com>
Cc: "Robert O'Callahan" <robert@ocallahan.org>,
"Thomas Gleixner" <tglx@linutronix.de>,
"Andy Lutomirski" <luto@kernel.org>,
"Ingo Molnar" <mingo@redhat.com>,
"H. Peter Anvin" <hpa@zytor.com>,
x86@kernel.org, "Paolo Bonzini" <pbonzini@redhat.com>,
"Radim Krčmář" <rkrcmar@redhat.com>,
"Jeff Dike" <jdike@addtoit.com>,
"Richard Weinberger" <richard@nod.at>,
"Alexander Viro" <viro@zeniv.linux.org.uk>,
"Shuah Khan" <shuah@kernel.org>,
"Dave Hansen" <dave.hansen@linux.intel.com>,
"Borislav Petkov" <bp@suse.de>,
"Peter Zijlstra" <peterz@infradead.org>,
"Boris Ostrovsky" <boris.ostrovsky@oracle.com>,
"Len Brown" <len.brown@intel.com>,
"Rafael J. Wysocki" <rafael.j.wysocki@intel.com>,
"Dmitry Safonov" <dsafonov@virtuozzo.com>,
"David Matlack" <dmatlack@google.com>,
"Nadav Amit" <nadav.amit@gmail.com>,
linux-kernel@vger.kernel.org,
user-mode-linux-devel@lists.sourceforge.net,
user-mode-linux-user@lists.sourceforge.net,
linux-fsdevel@vger.kernel.org, linux-kselftest@vger.kernel.org,
kvm@vger.kernel.org
Subject: Re: [PATCH v12 6/7] x86/arch_prctl: Add ARCH_[GET|SET]_CPUID
Date: Fri, 18 Nov 2016 09:14:44 +0100 [thread overview]
Message-ID: <20161118081444.GC15912@gmail.com> (raw)
In-Reply-To: <20161117020610.5302-7-khuey@kylehuey.com>
* Kyle Huey <me@kylehuey.com> wrote:
> Intel supports faulting on the CPUID instruction beginning with Ivy Bridge.
> When enabled, the processor will fault on attempts to execute the CPUID
> instruction with CPL>0. Exposing this feature to userspace will allow a
> ptracer to trap and emulate the CPUID instruction.
>
> When supported, this feature is controlled by toggling bit 0 of
> MSR_MISC_FEATURES_ENABLES. It is documented in detail in Section 2.3.2 of
> https://bugzilla.kernel.org/attachment.cgi?id=243991
>
> Implement a new pair of arch_prctls, available on both x86-32 and x86-64.
>
> ARCH_GET_CPUID: Returns the current CPUID faulting state, either
> ARCH_CPUID_ENABLE or ARCH_CPUID_SIGSEGV. arg2 must be 0.
>
> ARCH_SET_CPUID: Set the CPUID faulting state to arg2, which must be either
> ARCH_CPUID_ENABLE or ARCH_CPUID_SIGSEGV. Returns EINVAL if arg2 is
> another value or CPUID faulting is not supported on this system.
So the interface is:
> +#define ARCH_GET_CPUID 0x1005
> +#define ARCH_SET_CPUID 0x1006
> +#define ARCH_CPUID_ENABLE 1
> +#define ARCH_CPUID_SIGSEGV 2
Which maps to:
prctl(ARCH_SET_CPUID, 0); /* -EINVAL */
prctl(ARCH_SET_CPUID, 1); /* enable CPUID [i.e. make it work without faulting] */
prctl(ARCH_SET_CPUID, 2); /* disable CPUID [i.e. make it fault] */
ret = prctl(ARCH_GET_CPUID, 0); /* return current state: 1==on, 2==off */
This is a very broken interface that makes very little sense.
It would be much better to use a more natural interface where 1/0 means on/off and
where ARCH_GET_CPUID returns the current natural state:
prctl(ARCH_SET_CPUID, 0); /* disable CPUID [i.e. make it fault] */
prctl(ARCH_SET_CPUID, 1); /* enable CPUID [i.e. make it work without faulting] */
ret = prctl(ARCH_GET_CPUID); /* 1==enabled, 0==disabled */
See how natural it is? The use of the ARCH_CPUID_SIGSEGV/ENABLED symbols can be
avoided altogether. This will cut down on some of the ugliness in the kernel code
as well - and clean up the argument name as well: instead of naming it 'int arg2'
it can be named the more natural 'int cpuid_enabled'.
> The state of the CPUID faulting flag is propagated across forks, but reset
> upon exec.
I don't think this is the natural API for propagating settings across exec().
We should reset the flag on exec() only if security considerations require it -
i.e. like perf events are cleared.
If binaries that assume a working CPUID are exec()-ed then CPUID can be enabled
explicitly.
Clearing it automatically loses the ability of a pure no-CPUID environment to
exec() a CPUID-safe binary.
> Signed-off-by: Kyle Huey <khuey@kylehuey.com>
> ---
> arch/x86/include/asm/msr-index.h | 3 +
> arch/x86/include/asm/processor.h | 2 +
> arch/x86/include/asm/thread_info.h | 6 +-
> arch/x86/include/uapi/asm/prctl.h | 6 +
> arch/x86/kernel/cpu/intel.c | 7 +
> arch/x86/kernel/process.c | 84 ++++++++++
> fs/exec.c | 1 +
> include/linux/thread_info.h | 4 +
> tools/testing/selftests/x86/Makefile | 2 +-
> tools/testing/selftests/x86/cpuid-fault.c | 254 ++++++++++++++++++++++++++++++
> 10 files changed, 367 insertions(+), 2 deletions(-)
> create mode 100644 tools/testing/selftests/x86/cpuid-fault.c
Please put the self-test into a separate patch.
> static void init_intel_misc_features_enables(struct cpuinfo_x86 *c)
> {
> u64 msr;
>
> + if (rdmsrl_safe(MSR_MISC_FEATURES_ENABLES, &msr))
> + return;
> +
> + msr = 0;
> + wrmsrl(MSR_MISC_FEATURES_ENABLES, msr);
> + this_cpu_write(msr_misc_features_enables_shadow, msr);
> +
> if (!rdmsrl_safe(MSR_PLATFORM_INFO, &msr)) {
> if (msr & MSR_PLATFORM_INFO_CPUID_FAULT)
> set_cpu_cap(c, X86_FEATURE_CPUID_FAULT);
> }
> }
Sigh, so the Intel MSR index itself is grossly misnamed: MSR_MISC_FEATURES_ENABLES
- plain reading of 'enables' suggests it's a verb, but in wants to be a noun. A
better name would be MSR_MISC_FEATURES or so.
So while for the MSR index we want to keep the Intel name, please drop that
_enables() postfix from the kernel C function names such as this one - and from
the shadow value name as well.
> +DEFINE_PER_CPU(u64, msr_misc_features_enables_shadow);
> +
> +static void set_cpuid_faulting(bool on)
> +{
> + u64 msrval;
> +
> + DEBUG_LOCKS_WARN_ON(!irqs_disabled());
> +
> + msrval = this_cpu_read(msr_misc_features_enables_shadow);
> + msrval &= ~MSR_MISC_FEATURES_ENABLES_CPUID_FAULT;
> + msrval |= (on << MSR_MISC_FEATURES_ENABLES_CPUID_FAULT_BIT);
> + this_cpu_write(msr_misc_features_enables_shadow, msrval);
> + wrmsrl(MSR_MISC_FEATURES_ENABLES, msrval);
This gets called from the context switch path and this looks pretty suboptimal,
especially when combined with the TIF flag check:
> void __switch_to_xtra(struct task_struct *prev_p, struct task_struct *next_p,
> struct tss_struct *tss)
> {
> struct thread_struct *prev, *next;
>
> prev = &prev_p->thread;
> next = &next_p->thread;
>
> @@ -206,16 +278,21 @@ void __switch_to_xtra(struct task_struct *prev_p, struct task_struct *next_p,
>
> debugctl &= ~DEBUGCTLMSR_BTF;
> if (test_tsk_thread_flag(next_p, TIF_BLOCKSTEP))
> debugctl |= DEBUGCTLMSR_BTF;
>
> update_debugctlmsr(debugctl);
> }
>
> + if (test_tsk_thread_flag(prev_p, TIF_NOCPUID) ^
> + test_tsk_thread_flag(next_p, TIF_NOCPUID)) {
> + set_cpuid_faulting(test_tsk_thread_flag(next_p, TIF_NOCPUID));
> + }
> +
Why not cache the required MSR value in the task struct instead?
That would allow something much more obvious and much faster, like:
if (prev_p->thread.misc_features_val != next_p->thread.misc_features_val)
wrmsrl(MSR_MISC_FEATURES_ENABLES, next_p->thread.misc_features_val);
(The TIF flag maintenance is still required to get into __switch_to_xtra().)
It would also be easy to extend without extra overhead, should any other feature
bit be added to the MSR in the future.
Thanks,
Ingo
next prev parent reply other threads:[~2016-11-18 8:14 UTC|newest]
Thread overview: 22+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-11-17 2:06 [PATCH v12 0/7] x86/arch_prctl Add ARCH_[GET|SET]_CPUID for controlling the CPUID instruction Kyle Huey
2016-11-17 2:06 ` [PATCH v12 1/7] x86/arch_prctl/64: Use SYSCALL_DEFINE2 to define sys_arch_prctl Kyle Huey
2016-11-17 2:06 ` [PATCH v12 2/7] x86/arch_prctl/64: Rename do_arch_prctl to do_arch_prctl_64 Kyle Huey
2016-11-18 7:27 ` Ingo Molnar
2016-11-18 7:28 ` Thomas Gleixner
2016-11-18 8:16 ` Ingo Molnar
2016-11-18 16:39 ` Kyle Huey
2016-11-29 9:26 ` Ingo Molnar
2016-11-17 2:06 ` [PATCH v12 3/7] x86/arch_prctl: Add do_arch_prctl_common Kyle Huey
2016-11-17 2:06 ` [PATCH v12 4/7] x86/syscalls/32: Wire up arch_prctl on x86-32 Kyle Huey
2016-11-18 7:30 ` Ingo Molnar
2016-11-17 2:06 ` [PATCH v12 5/7] x86/cpufeature: Detect CPUID faulting support Kyle Huey
2016-11-17 16:51 ` Borislav Petkov
2016-11-17 2:06 ` [PATCH v12 6/7] x86/arch_prctl: Add ARCH_[GET|SET]_CPUID Kyle Huey
2016-11-18 8:14 ` Ingo Molnar [this message]
2016-11-18 8:49 ` Thomas Gleixner
2016-11-21 8:27 ` Ingo Molnar
2016-11-22 17:26 ` Andy Lutomirski
2016-11-18 15:55 ` Kyle Huey
2016-11-18 17:32 ` Andy Lutomirski
2016-11-17 2:06 ` [PATCH v12 7/7] KVM: x86: virtualize cpuid faulting Kyle Huey
2016-11-17 12:31 ` Paolo Bonzini
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20161118081444.GC15912@gmail.com \
--to=mingo@kernel.org \
--cc=boris.ostrovsky@oracle.com \
--cc=bp@suse.de \
--cc=dave.hansen@linux.intel.com \
--cc=dmatlack@google.com \
--cc=dsafonov@virtuozzo.com \
--cc=hpa@zytor.com \
--cc=jdike@addtoit.com \
--cc=kvm@vger.kernel.org \
--cc=len.brown@intel.com \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-kselftest@vger.kernel.org \
--cc=luto@kernel.org \
--cc=me@kylehuey.com \
--cc=mingo@redhat.com \
--cc=nadav.amit@gmail.com \
--cc=pbonzini@redhat.com \
--cc=peterz@infradead.org \
--cc=rafael.j.wysocki@intel.com \
--cc=richard@nod.at \
--cc=rkrcmar@redhat.com \
--cc=robert@ocallahan.org \
--cc=shuah@kernel.org \
--cc=tglx@linutronix.de \
--cc=user-mode-linux-devel@lists.sourceforge.net \
--cc=user-mode-linux-user@lists.sourceforge.net \
--cc=viro@zeniv.linux.org.uk \
--cc=x86@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).