All of lore.kernel.org
 help / color / mirror / Atom feed
From: Ingo Molnar <mingo@kernel.org>
To: Kyle Huey <me@kylehuey.com>
Cc: "Robert O'Callahan" <robert@ocallahan.org>,
	"Thomas Gleixner" <tglx@linutronix.de>,
	"Andy Lutomirski" <luto@kernel.org>,
	"Ingo Molnar" <mingo@redhat.com>,
	"H. Peter Anvin" <hpa@zytor.com>,
	x86@kernel.org, "Paolo Bonzini" <pbonzini@redhat.com>,
	"Radim Krčmář" <rkrcmar@redhat.com>,
	"Jeff Dike" <jdike@addtoit.com>,
	"Richard Weinberger" <richard@nod.at>,
	"Alexander Viro" <viro@zeniv.linux.org.uk>,
	"Shuah Khan" <shuah@kernel.org>,
	"Dave Hansen" <dave.hansen@linux.intel.com>,
	"Borislav Petkov" <bp@suse.de>,
	"Peter Zijlstra" <peterz@infradead.org>,
	"Boris Ostrovsky" <boris.ostrovsky@oracle.com>,
	"Len Brown" <len.brown@intel.com>,
	"Rafael J. Wysocki" <rafael.j.wysocki@intel.com>,
	"Dmitry Safonov" <dsafonov@virtuozzo.com>,
	"David Matlack" <dmatlack@google.com>,
	"Nadav Amit" <nadav.amit@gmail.com>,
	linux-kernel@vger.kernel.org,
	user-mode-linux-devel@lists.sourceforge.net,
	user-mode-linux-user@lists.sourceforge.net,
	linux-fsdevel@vger.kernel.org, linux-kselftest@vger.kernel.org,
	kvm@vger.kernel.org
Subject: Re: [PATCH v12 6/7] x86/arch_prctl: Add ARCH_[GET|SET]_CPUID
Date: Fri, 18 Nov 2016 09:14:44 +0100	[thread overview]
Message-ID: <20161118081444.GC15912@gmail.com> (raw)
In-Reply-To: <20161117020610.5302-7-khuey@kylehuey.com>


* Kyle Huey <me@kylehuey.com> wrote:

> Intel supports faulting on the CPUID instruction beginning with Ivy Bridge.
> When enabled, the processor will fault on attempts to execute the CPUID
> instruction with CPL>0. Exposing this feature to userspace will allow a
> ptracer to trap and emulate the CPUID instruction.
> 
> When supported, this feature is controlled by toggling bit 0 of
> MSR_MISC_FEATURES_ENABLES. It is documented in detail in Section 2.3.2 of
> https://bugzilla.kernel.org/attachment.cgi?id=243991
> 
> Implement a new pair of arch_prctls, available on both x86-32 and x86-64.
> 
> ARCH_GET_CPUID: Returns the current CPUID faulting state, either
>   ARCH_CPUID_ENABLE or ARCH_CPUID_SIGSEGV. arg2 must be 0.
> 
> ARCH_SET_CPUID: Set the CPUID faulting state to arg2, which must be either
>   ARCH_CPUID_ENABLE or ARCH_CPUID_SIGSEGV. Returns EINVAL if arg2 is
>   another value or CPUID faulting is not supported on this system.

So the interface is:

> +#define ARCH_GET_CPUID 0x1005
> +#define ARCH_SET_CPUID 0x1006
> +#define ARCH_CPUID_ENABLE 1
> +#define ARCH_CPUID_SIGSEGV 2

Which maps to:

   prctl(ARCH_SET_CPUID, 0); /* -EINVAL */
   prctl(ARCH_SET_CPUID, 1); /* enable CPUID [i.e. make it work without faulting] */
   prctl(ARCH_SET_CPUID, 2); /* disable CPUID [i.e. make it fault] */

   ret = prctl(ARCH_GET_CPUID, 0); /* return current state: 1==on, 2==off */

This is a very broken interface that makes very little sense.

It would be much better to use a more natural interface where 1/0 means on/off and 
where ARCH_GET_CPUID returns the current natural state:

   prctl(ARCH_SET_CPUID, 0); /* disable CPUID [i.e. make it fault] */
   prctl(ARCH_SET_CPUID, 1); /* enable CPUID [i.e. make it work without faulting] */

   ret = prctl(ARCH_GET_CPUID); /* 1==enabled, 0==disabled */

See how natural it is? The use of the ARCH_CPUID_SIGSEGV/ENABLED symbols can be 
avoided altogether. This will cut down on some of the ugliness in the kernel code 
as well - and clean up the argument name as well: instead of naming it 'int arg2' 
it can be named the more natural 'int cpuid_enabled'.

> The state of the CPUID faulting flag is propagated across forks, but reset
> upon exec.

I don't think this is the natural API for propagating settings across exec().
We should reset the flag on exec() only if security considerations require it - 
i.e. like perf events are cleared.

If binaries that assume a working CPUID are exec()-ed then CPUID can be enabled 
explicitly.

Clearing it automatically loses the ability of a pure no-CPUID environment to 
exec() a CPUID-safe binary.

> Signed-off-by: Kyle Huey <khuey@kylehuey.com>
> ---
>  arch/x86/include/asm/msr-index.h          |   3 +
>  arch/x86/include/asm/processor.h          |   2 +
>  arch/x86/include/asm/thread_info.h        |   6 +-
>  arch/x86/include/uapi/asm/prctl.h         |   6 +
>  arch/x86/kernel/cpu/intel.c               |   7 +
>  arch/x86/kernel/process.c                 |  84 ++++++++++
>  fs/exec.c                                 |   1 +
>  include/linux/thread_info.h               |   4 +
>  tools/testing/selftests/x86/Makefile      |   2 +-
>  tools/testing/selftests/x86/cpuid-fault.c | 254 ++++++++++++++++++++++++++++++
>  10 files changed, 367 insertions(+), 2 deletions(-)
>  create mode 100644 tools/testing/selftests/x86/cpuid-fault.c

Please put the self-test into a separate patch.

>  static void init_intel_misc_features_enables(struct cpuinfo_x86 *c)
>  {
>  	u64 msr;
>  
> +	if (rdmsrl_safe(MSR_MISC_FEATURES_ENABLES, &msr))
> +		return;
> +
> +	msr = 0;
> +	wrmsrl(MSR_MISC_FEATURES_ENABLES, msr);
> +	this_cpu_write(msr_misc_features_enables_shadow, msr);
> +
>  	if (!rdmsrl_safe(MSR_PLATFORM_INFO, &msr)) {
>  		if (msr & MSR_PLATFORM_INFO_CPUID_FAULT)
>  			set_cpu_cap(c, X86_FEATURE_CPUID_FAULT);
>  	}
>  }

Sigh, so the Intel MSR index itself is grossly misnamed: MSR_MISC_FEATURES_ENABLES 
- plain reading of 'enables' suggests it's a verb, but in wants to be a noun. A 
better name would be MSR_MISC_FEATURES or so.

So while for the MSR index we want to keep the Intel name, please drop that 
_enables() postfix from the kernel C function names such as this one - and from 
the shadow value name as well.

> +DEFINE_PER_CPU(u64, msr_misc_features_enables_shadow);
> +
> +static void set_cpuid_faulting(bool on)
> +{
> +	u64 msrval;
> +
> +	DEBUG_LOCKS_WARN_ON(!irqs_disabled());
> +
> +	msrval = this_cpu_read(msr_misc_features_enables_shadow);
> +	msrval &= ~MSR_MISC_FEATURES_ENABLES_CPUID_FAULT;
> +	msrval |= (on << MSR_MISC_FEATURES_ENABLES_CPUID_FAULT_BIT);
> +	this_cpu_write(msr_misc_features_enables_shadow, msrval);
> +	wrmsrl(MSR_MISC_FEATURES_ENABLES, msrval);

This gets called from the context switch path and this looks pretty suboptimal, 
especially when combined with the TIF flag check:

>  void __switch_to_xtra(struct task_struct *prev_p, struct task_struct *next_p,
>  		      struct tss_struct *tss)
>  {
>  	struct thread_struct *prev, *next;
>  
>  	prev = &prev_p->thread;
>  	next = &next_p->thread;
>  
> @@ -206,16 +278,21 @@ void __switch_to_xtra(struct task_struct *prev_p, struct task_struct *next_p,
>  
>  		debugctl &= ~DEBUGCTLMSR_BTF;
>  		if (test_tsk_thread_flag(next_p, TIF_BLOCKSTEP))
>  			debugctl |= DEBUGCTLMSR_BTF;
>  
>  		update_debugctlmsr(debugctl);
>  	}
>  
> +	if (test_tsk_thread_flag(prev_p, TIF_NOCPUID) ^
> +	    test_tsk_thread_flag(next_p, TIF_NOCPUID)) {
> +		set_cpuid_faulting(test_tsk_thread_flag(next_p, TIF_NOCPUID));
> +	}
> +

Why not cache the required MSR value in the task struct instead?

That would allow something much more obvious and much faster, like:

	if (prev_p->thread.misc_features_val != next_p->thread.misc_features_val)
		wrmsrl(MSR_MISC_FEATURES_ENABLES, next_p->thread.misc_features_val);

(The TIF flag maintenance is still required to get into __switch_to_xtra().)

It would also be easy to extend without extra overhead, should any other feature 
bit be added to the MSR in the future.

Thanks,

	Ingo


WARNING: multiple messages have this Message-ID (diff)
From: Ingo Molnar <mingo@kernel.org>
To: Kyle Huey <me@kylehuey.com>
Cc: "Robert O'Callahan" <robert@ocallahan.org>,
	"Thomas Gleixner" <tglx@linutronix.de>,
	"Andy Lutomirski" <luto@kernel.org>,
	"Ingo Molnar" <mingo@redhat.com>,
	"H. Peter Anvin" <hpa@zytor.com>,
	x86@kernel.org, "Paolo Bonzini" <pbonzini@redhat.com>,
	"Radim Krčmář" <rkrcmar@redhat.com>,
	"Jeff Dike" <jdike@addtoit.com>,
	"Richard Weinberger" <richard@nod.at>,
	"Alexander Viro" <viro@zeniv.linux.org.uk>,
	"Shuah Khan" <shuah@kernel.org>,
	"Dave Hansen" <dave.hansen@linux.intel.com>,
	"Borislav Petkov" <bp@suse.de>,
	"Peter Zijlstra" <peterz@infradead.org>,
	"Boris Ostrovsky" <boris.ostrovsky@oracle.com>,
	"Len Brown" <len.brown@intel.com>,
	"Rafael J. Wysocki" <rafael.j.wysocki@intel.com>,
	"Dmitry Safonov" <dsafonov@virtuozzo.com>,
	"David Matlack" <dmatlack@google.com>
Subject: Re: [PATCH v12 6/7] x86/arch_prctl: Add ARCH_[GET|SET]_CPUID
Date: Fri, 18 Nov 2016 09:14:44 +0100	[thread overview]
Message-ID: <20161118081444.GC15912@gmail.com> (raw)
In-Reply-To: <20161117020610.5302-7-khuey@kylehuey.com>


* Kyle Huey <me@kylehuey.com> wrote:

> Intel supports faulting on the CPUID instruction beginning with Ivy Bridge.
> When enabled, the processor will fault on attempts to execute the CPUID
> instruction with CPL>0. Exposing this feature to userspace will allow a
> ptracer to trap and emulate the CPUID instruction.
> 
> When supported, this feature is controlled by toggling bit 0 of
> MSR_MISC_FEATURES_ENABLES. It is documented in detail in Section 2.3.2 of
> https://bugzilla.kernel.org/attachment.cgi?id=243991
> 
> Implement a new pair of arch_prctls, available on both x86-32 and x86-64.
> 
> ARCH_GET_CPUID: Returns the current CPUID faulting state, either
>   ARCH_CPUID_ENABLE or ARCH_CPUID_SIGSEGV. arg2 must be 0.
> 
> ARCH_SET_CPUID: Set the CPUID faulting state to arg2, which must be either
>   ARCH_CPUID_ENABLE or ARCH_CPUID_SIGSEGV. Returns EINVAL if arg2 is
>   another value or CPUID faulting is not supported on this system.

So the interface is:

> +#define ARCH_GET_CPUID 0x1005
> +#define ARCH_SET_CPUID 0x1006
> +#define ARCH_CPUID_ENABLE 1
> +#define ARCH_CPUID_SIGSEGV 2

Which maps to:

   prctl(ARCH_SET_CPUID, 0); /* -EINVAL */
   prctl(ARCH_SET_CPUID, 1); /* enable CPUID [i.e. make it work without faulting] */
   prctl(ARCH_SET_CPUID, 2); /* disable CPUID [i.e. make it fault] */

   ret = prctl(ARCH_GET_CPUID, 0); /* return current state: 1==on, 2==off */

This is a very broken interface that makes very little sense.

It would be much better to use a more natural interface where 1/0 means on/off and 
where ARCH_GET_CPUID returns the current natural state:

   prctl(ARCH_SET_CPUID, 0); /* disable CPUID [i.e. make it fault] */
   prctl(ARCH_SET_CPUID, 1); /* enable CPUID [i.e. make it work without faulting] */

   ret = prctl(ARCH_GET_CPUID); /* 1==enabled, 0==disabled */

See how natural it is? The use of the ARCH_CPUID_SIGSEGV/ENABLED symbols can be 
avoided altogether. This will cut down on some of the ugliness in the kernel code 
as well - and clean up the argument name as well: instead of naming it 'int arg2' 
it can be named the more natural 'int cpuid_enabled'.

> The state of the CPUID faulting flag is propagated across forks, but reset
> upon exec.

I don't think this is the natural API for propagating settings across exec().
We should reset the flag on exec() only if security considerations require it - 
i.e. like perf events are cleared.

If binaries that assume a working CPUID are exec()-ed then CPUID can be enabled 
explicitly.

Clearing it automatically loses the ability of a pure no-CPUID environment to 
exec() a CPUID-safe binary.

> Signed-off-by: Kyle Huey <khuey@kylehuey.com>
> ---
>  arch/x86/include/asm/msr-index.h          |   3 +
>  arch/x86/include/asm/processor.h          |   2 +
>  arch/x86/include/asm/thread_info.h        |   6 +-
>  arch/x86/include/uapi/asm/prctl.h         |   6 +
>  arch/x86/kernel/cpu/intel.c               |   7 +
>  arch/x86/kernel/process.c                 |  84 ++++++++++
>  fs/exec.c                                 |   1 +
>  include/linux/thread_info.h               |   4 +
>  tools/testing/selftests/x86/Makefile      |   2 +-
>  tools/testing/selftests/x86/cpuid-fault.c | 254 ++++++++++++++++++++++++++++++
>  10 files changed, 367 insertions(+), 2 deletions(-)
>  create mode 100644 tools/testing/selftests/x86/cpuid-fault.c

Please put the self-test into a separate patch.

>  static void init_intel_misc_features_enables(struct cpuinfo_x86 *c)
>  {
>  	u64 msr;
>  
> +	if (rdmsrl_safe(MSR_MISC_FEATURES_ENABLES, &msr))
> +		return;
> +
> +	msr = 0;
> +	wrmsrl(MSR_MISC_FEATURES_ENABLES, msr);
> +	this_cpu_write(msr_misc_features_enables_shadow, msr);
> +
>  	if (!rdmsrl_safe(MSR_PLATFORM_INFO, &msr)) {
>  		if (msr & MSR_PLATFORM_INFO_CPUID_FAULT)
>  			set_cpu_cap(c, X86_FEATURE_CPUID_FAULT);
>  	}
>  }

Sigh, so the Intel MSR index itself is grossly misnamed: MSR_MISC_FEATURES_ENABLES 
- plain reading of 'enables' suggests it's a verb, but in wants to be a noun. A 
better name would be MSR_MISC_FEATURES or so.

So while for the MSR index we want to keep the Intel name, please drop that 
_enables() postfix from the kernel C function names such as this one - and from 
the shadow value name as well.

> +DEFINE_PER_CPU(u64, msr_misc_features_enables_shadow);
> +
> +static void set_cpuid_faulting(bool on)
> +{
> +	u64 msrval;
> +
> +	DEBUG_LOCKS_WARN_ON(!irqs_disabled());
> +
> +	msrval = this_cpu_read(msr_misc_features_enables_shadow);
> +	msrval &= ~MSR_MISC_FEATURES_ENABLES_CPUID_FAULT;
> +	msrval |= (on << MSR_MISC_FEATURES_ENABLES_CPUID_FAULT_BIT);
> +	this_cpu_write(msr_misc_features_enables_shadow, msrval);
> +	wrmsrl(MSR_MISC_FEATURES_ENABLES, msrval);

This gets called from the context switch path and this looks pretty suboptimal, 
especially when combined with the TIF flag check:

>  void __switch_to_xtra(struct task_struct *prev_p, struct task_struct *next_p,
>  		      struct tss_struct *tss)
>  {
>  	struct thread_struct *prev, *next;
>  
>  	prev = &prev_p->thread;
>  	next = &next_p->thread;
>  
> @@ -206,16 +278,21 @@ void __switch_to_xtra(struct task_struct *prev_p, struct task_struct *next_p,
>  
>  		debugctl &= ~DEBUGCTLMSR_BTF;
>  		if (test_tsk_thread_flag(next_p, TIF_BLOCKSTEP))
>  			debugctl |= DEBUGCTLMSR_BTF;
>  
>  		update_debugctlmsr(debugctl);
>  	}
>  
> +	if (test_tsk_thread_flag(prev_p, TIF_NOCPUID) ^
> +	    test_tsk_thread_flag(next_p, TIF_NOCPUID)) {
> +		set_cpuid_faulting(test_tsk_thread_flag(next_p, TIF_NOCPUID));
> +	}
> +

Why not cache the required MSR value in the task struct instead?

That would allow something much more obvious and much faster, like:

	if (prev_p->thread.misc_features_val != next_p->thread.misc_features_val)
		wrmsrl(MSR_MISC_FEATURES_ENABLES, next_p->thread.misc_features_val);

(The TIF flag maintenance is still required to get into __switch_to_xtra().)

It would also be easy to extend without extra overhead, should any other feature 
bit be added to the MSR in the future.

Thanks,

	Ingo

  reply	other threads:[~2016-11-18  8:14 UTC|newest]

Thread overview: 44+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-11-17  2:06 [PATCH v12 0/7] x86/arch_prctl Add ARCH_[GET|SET]_CPUID for controlling the CPUID instruction Kyle Huey
2016-11-17  2:06 ` Kyle Huey
2016-11-17  2:06 ` [PATCH v12 1/7] x86/arch_prctl/64: Use SYSCALL_DEFINE2 to define sys_arch_prctl Kyle Huey
2016-11-17  2:06   ` Kyle Huey
2016-11-17  2:06 ` [PATCH v12 2/7] x86/arch_prctl/64: Rename do_arch_prctl to do_arch_prctl_64 Kyle Huey
2016-11-17  2:06   ` Kyle Huey
2016-11-18  7:27   ` Ingo Molnar
2016-11-18  7:27     ` Ingo Molnar
2016-11-18  7:28     ` Thomas Gleixner
2016-11-18  7:28       ` Thomas Gleixner
2016-11-18  8:16       ` Ingo Molnar
2016-11-18  8:16         ` Ingo Molnar
2016-11-18 16:39     ` Kyle Huey
2016-11-18 16:39       ` Kyle Huey
2016-11-29  9:26       ` Ingo Molnar
2016-11-29  9:26         ` Ingo Molnar
2016-11-17  2:06 ` [PATCH v12 3/7] x86/arch_prctl: Add do_arch_prctl_common Kyle Huey
2016-11-17  2:06   ` Kyle Huey
2016-11-17  2:06 ` [PATCH v12 4/7] x86/syscalls/32: Wire up arch_prctl on x86-32 Kyle Huey
2016-11-17  2:06   ` Kyle Huey
2016-11-18  7:30   ` Ingo Molnar
2016-11-18  7:30     ` Ingo Molnar
2016-11-17  2:06 ` [PATCH v12 5/7] x86/cpufeature: Detect CPUID faulting support Kyle Huey
2016-11-17  2:06   ` Kyle Huey
2016-11-17 16:51   ` Borislav Petkov
2016-11-17 16:51     ` Borislav Petkov
2016-11-17  2:06 ` [PATCH v12 6/7] x86/arch_prctl: Add ARCH_[GET|SET]_CPUID Kyle Huey
2016-11-17  2:06   ` Kyle Huey
2016-11-18  8:14   ` Ingo Molnar [this message]
2016-11-18  8:14     ` Ingo Molnar
2016-11-18  8:49     ` Thomas Gleixner
2016-11-18  8:49       ` Thomas Gleixner
2016-11-21  8:27       ` Ingo Molnar
2016-11-21  8:27         ` Ingo Molnar
2016-11-22 17:26         ` Andy Lutomirski
2016-11-22 17:26           ` Andy Lutomirski
2016-11-18 15:55     ` Kyle Huey
2016-11-18 15:55       ` Kyle Huey
2016-11-18 17:32     ` Andy Lutomirski
2016-11-18 17:32       ` Andy Lutomirski
2016-11-17  2:06 ` [PATCH v12 7/7] KVM: x86: virtualize cpuid faulting Kyle Huey
2016-11-17  2:06   ` Kyle Huey
2016-11-17 12:31   ` Paolo Bonzini
2016-11-17 12:31     ` Paolo Bonzini

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20161118081444.GC15912@gmail.com \
    --to=mingo@kernel.org \
    --cc=boris.ostrovsky@oracle.com \
    --cc=bp@suse.de \
    --cc=dave.hansen@linux.intel.com \
    --cc=dmatlack@google.com \
    --cc=dsafonov@virtuozzo.com \
    --cc=hpa@zytor.com \
    --cc=jdike@addtoit.com \
    --cc=kvm@vger.kernel.org \
    --cc=len.brown@intel.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-kselftest@vger.kernel.org \
    --cc=luto@kernel.org \
    --cc=me@kylehuey.com \
    --cc=mingo@redhat.com \
    --cc=nadav.amit@gmail.com \
    --cc=pbonzini@redhat.com \
    --cc=peterz@infradead.org \
    --cc=rafael.j.wysocki@intel.com \
    --cc=richard@nod.at \
    --cc=rkrcmar@redhat.com \
    --cc=robert@ocallahan.org \
    --cc=shuah@kernel.org \
    --cc=tglx@linutronix.de \
    --cc=user-mode-linux-devel@lists.sourceforge.net \
    --cc=user-mode-linux-user@lists.sourceforge.net \
    --cc=viro@zeniv.linux.org.uk \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.