All of lore.kernel.org
 help / color / mirror / Atom feed
From: Ingo Molnar <mingo@kernel.org>
To: Kyle Huey <me@kylehuey.com>
Cc: "Robert O'Callahan" <robert@ocallahan.org>,
	"Thomas Gleixner" <tglx@linutronix.de>,
	"Andy Lutomirski" <luto@kernel.org>,
	"Ingo Molnar" <mingo@redhat.com>,
	"H. Peter Anvin" <hpa@zytor.com>,
	x86@kernel.org, "Paolo Bonzini" <pbonzini@redhat.com>,
	"Radim Krčmář" <rkrcmar@redhat.com>,
	"Jeff Dike" <jdike@addtoit.com>,
	"Richard Weinberger" <richard@nod.at>,
	"Alexander Viro" <viro@zeniv.linux.org.uk>,
	"Shuah Khan" <shuah@kernel.org>,
	"Dave Hansen" <dave.hansen@linux.intel.com>,
	"Borislav Petkov" <bp@suse.de>,
	"Peter Zijlstra" <peterz@infradead.org>,
	"Boris Ostrovsky" <boris.ostrovsky@oracle.com>,
	"Len Brown" <len.brown@intel.com>,
	"Rafael J. Wysocki" <rafael.j.wysocki@intel.com>,
	"Dmitry Safonov" <dsafonov@virtuozzo.com>,
	"David Matlack" <dmatlack@google.com>,
	"Nadav Amit" <nadav.amit@gmail.com>,
	"Andi Kleen" <andi@firstfloor.org>,
	linux-kernel@vger.kernel.org,
	user-mode-linux-devel@lists.sourceforge.net,
	user-mode-linux-user@lists.sourceforge.net,
	linux-fsdevel@vger.kernel.org, linux-kselftest@vger.kernel.org,
	kvm@vger.kernel.org
Subject: Re: [PATCH v13 0/8] x86/arch_prctl Add ARCH_[GET|SET]_CPUID for controlling the CPUID instruction
Date: Fri, 2 Dec 2016 11:29:42 +0100	[thread overview]
Message-ID: <20161202102942.GA17332@gmail.com> (raw)
In-Reply-To: <20161128030521.4423-1-khuey@kylehuey.com>


* Kyle Huey <me@kylehuey.com> wrote:

> rr (http://rr-project.org/), a userspace record-and-replay reverse-
> execution debugger, would like to trap and emulate the CPUID instruction.
> This would allow us to a) mask away certain hardware features that rr does
> not support (e.g. RDRAND) and b) enable trace portability across machines
> by providing constant results.
> 
> Newer Intel CPUs (Ivy Bridge and later) can fault when CPUID is executed at
> CPL > 0. Expose this capability to userspace as a new pair of arch_prctls,
> ARCH_GET_CPUID and ARCH_SET_CPUID.
> 
> Since v12:
> Patch 4: x86/syscalls/32: Wire up arch_prctl on x86-32
> - compat_sys_arch_prctl prototype has argument names.

So while I am fine with the feature, I'm still unconvinced about the 
implementation:

1)

As I pointed out before, the arbitrary 'code' argument name x86-ism should be 
changed to 'option' like the canonical core kernel option name is for prctls(). 

This is still unfixed.

2)

As I complained about in my first review, TIF_NOCPUID flag is too far removed from 
the value what will be written into the MSR.

The result is poor code generation on 64-bit defconfig+CONFIG_PREEMPT=y:

        if (test_tsk_thread_flag(prev_p, TIF_NOCPUID) ^
            test_tsk_thread_flag(next_p, TIF_NOCPUID)) {
		set_cpuid_faulting(test_tsk_thread_flag(next_p, TIF_NOCPUID));

is compiled as:

  476:   49 8b 06                mov    (%r14),%rax
  479:   49 8b 55 00             mov    0x0(%r13),%rdx
  47d:   48 c1 e8 0f             shr    $0xf,%rax
  481:   48 c1 ea 0f             shr    $0xf,%rdx
  485:   83 e2 01                and    $0x1,%edx
  488:   83 e0 01                and    $0x1,%eax
  48b:   38 c2                   cmp    %al,%dl
  48d:   74 10                   je     49f <__switch_to_xtra+0x9f>
  48f:   49 8b 7d 00             mov    0x0(%r13),%rdi
  493:   48 c1 ef 0f             shr    $0xf,%rdi
  497:   83 e7 01                and    $0x1,%edi
  49a:   e8 61 fb ff ff          callq  0 <set_cpuid_faulting>

... the first 7 instructions burdens all __switch_to_xtra() users, not just the 
faulting-CPUID users.

The set_cpuid_faulting() call is also unnecessary and the set_cpuid_faulting() 
call generates into an obscene sequence of:

0000000000000000 <set_cpuid_faulting>:
   0:	8b 15 00 00 00 00    	mov    0x0(%rip),%edx        # 6 <set_cpuid_faulting+0x6>
   6:	55                   	push   %rbp
   7:	48 89 e5             	mov    %rsp,%rbp
   a:	53                   	push   %rbx
   b:	40 0f b6 df          	movzbl %dil,%ebx
   f:	85 d2                	test   %edx,%edx
  11:	75 07                	jne    1a <set_cpuid_faulting+0x1a>
  13:	9c                   	pushfq 
  14:	58                   	pop    %rax
  15:	f6 c4 02             	test   $0x2,%ah
  18:	75 48                	jne    62 <set_cpuid_faulting+0x62>
  1a:	65 48 8b 05 00 00 00 	mov    %gs:0x0(%rip),%rax        # 22 <set_cpuid_faulting+0x22>
  21:	00 
  22:	48 83 e0 fe          	and    $0xfffffffffffffffe,%rax
  26:	b9 40 01 00 00       	mov    $0x140,%ecx
  2b:	48 09 d8             	or     %rbx,%rax
  2e:	48 89 c2             	mov    %rax,%rdx
  31:	65 48 89 05 00 00 00 	mov    %rax,%gs:0x0(%rip)        # 39 <set_cpuid_faulting+0x39>
  38:	00 
  39:	48 c1 ea 20          	shr    $0x20,%rdx
  3d:	0f 30                	wrmsr  
  3f:	0f 1f 44 00 00       	nopl   0x0(%rax,%rax,1)
  44:	5b                   	pop    %rbx
  45:	5d                   	pop    %rbp
  46:	c3                   	retq   
  47:	48 c1 e2 20          	shl    $0x20,%rdx
  4b:	89 c0                	mov    %eax,%eax
  4d:	bf 40 01 00 00       	mov    $0x140,%edi
  52:	48 09 d0             	or     %rdx,%rax
  55:	31 d2                	xor    %edx,%edx
  57:	48 89 c6             	mov    %rax,%rsi
  5a:	e8 00 00 00 00       	callq  5f <set_cpuid_faulting+0x5f>
  5f:	5b                   	pop    %rbx
  60:	5d                   	pop    %rbp
  61:	c3                   	retq   
  62:	e8 00 00 00 00       	callq  67 <set_cpuid_faulting+0x67>
  67:	85 c0                	test   %eax,%eax
  69:	74 af                	je     1a <set_cpuid_faulting+0x1a>
  6b:	8b 05 00 00 00 00    	mov    0x0(%rip),%eax        # 71 <set_cpuid_faulting+0x71>
  71:	85 c0                	test   %eax,%eax
  73:	75 a5                	jne    1a <set_cpuid_faulting+0x1a>
  75:	48 c7 c1 00 00 00 00 	mov    $0x0,%rcx
  7c:	48 c7 c2 00 00 00 00 	mov    $0x0,%rdx
  83:	be b9 00 00 00       	mov    $0xb9,%esi
  88:	48 c7 c7 00 00 00 00 	mov    $0x0,%rdi
  8f:	e8 00 00 00 00       	callq  94 <set_cpuid_faulting+0x94>
  94:	eb 84                	jmp    1a <set_cpuid_faulting+0x1a>
  96:	66 2e 0f 1f 84 00 00 	nopw   %cs:0x0(%rax,%rax,1)
  9d:	00 00 00 

The affected object file code size blows up as well, by 17%:

 arch/x86/kernel/process.o:
    text    data     bss     dec     hex filename
    3325    8577      32   11934    2e9e process.o.before
    3889    8609      32   12530    30f2 process.o.after

A good deal of this overhead and complexity comes from the implementation 
inefficiency I pointed out, and all this can be avoided with the method I 
suggested in my previous review, by caching the per task MSR value in the thread 
struct.

So sorry, NAK for this implementation - especially considering how relatively 
straightforward the changes I suggested are to implement.

Thanks,

	Ingo


WARNING: multiple messages have this Message-ID (diff)
From: Ingo Molnar <mingo@kernel.org>
To: Kyle Huey <me@kylehuey.com>
Cc: "Robert O'Callahan" <robert@ocallahan.org>,
	"Thomas Gleixner" <tglx@linutronix.de>,
	"Andy Lutomirski" <luto@kernel.org>,
	"Ingo Molnar" <mingo@redhat.com>,
	"H. Peter Anvin" <hpa@zytor.com>,
	x86@kernel.org, "Paolo Bonzini" <pbonzini@redhat.com>,
	"Radim Krčmář" <rkrcmar@redhat.com>,
	"Jeff Dike" <jdike@addtoit.com>,
	"Richard Weinberger" <richard@nod.at>,
	"Alexander Viro" <viro@zeniv.linux.org.uk>,
	"Shuah Khan" <shuah@kernel.org>,
	"Dave Hansen" <dave.hansen@linux.intel.com>,
	"Borislav Petkov" <bp@suse.de>,
	"Peter Zijlstra" <peterz@infradead.org>,
	"Boris Ostrovsky" <boris.ostrovsky@oracle.com>,
	"Len Brown" <len.brown@intel.com>,
	"Rafael J. Wysocki" <rafael.j.wysocki@intel.com>,
	"Dmitry Safonov" <dsafonov@virtuozzo.com>,
	"David Matlack" <dmatlack@google.com>
Subject: Re: [PATCH v13 0/8] x86/arch_prctl Add ARCH_[GET|SET]_CPUID for controlling the CPUID instruction
Date: Fri, 2 Dec 2016 11:29:42 +0100	[thread overview]
Message-ID: <20161202102942.GA17332@gmail.com> (raw)
In-Reply-To: <20161128030521.4423-1-khuey@kylehuey.com>


* Kyle Huey <me@kylehuey.com> wrote:

> rr (http://rr-project.org/), a userspace record-and-replay reverse-
> execution debugger, would like to trap and emulate the CPUID instruction.
> This would allow us to a) mask away certain hardware features that rr does
> not support (e.g. RDRAND) and b) enable trace portability across machines
> by providing constant results.
> 
> Newer Intel CPUs (Ivy Bridge and later) can fault when CPUID is executed at
> CPL > 0. Expose this capability to userspace as a new pair of arch_prctls,
> ARCH_GET_CPUID and ARCH_SET_CPUID.
> 
> Since v12:
> Patch 4: x86/syscalls/32: Wire up arch_prctl on x86-32
> - compat_sys_arch_prctl prototype has argument names.

So while I am fine with the feature, I'm still unconvinced about the 
implementation:

1)

As I pointed out before, the arbitrary 'code' argument name x86-ism should be 
changed to 'option' like the canonical core kernel option name is for prctls(). 

This is still unfixed.

2)

As I complained about in my first review, TIF_NOCPUID flag is too far removed from 
the value what will be written into the MSR.

The result is poor code generation on 64-bit defconfig+CONFIG_PREEMPT=y:

        if (test_tsk_thread_flag(prev_p, TIF_NOCPUID) ^
            test_tsk_thread_flag(next_p, TIF_NOCPUID)) {
		set_cpuid_faulting(test_tsk_thread_flag(next_p, TIF_NOCPUID));

is compiled as:

  476:   49 8b 06                mov    (%r14),%rax
  479:   49 8b 55 00             mov    0x0(%r13),%rdx
  47d:   48 c1 e8 0f             shr    $0xf,%rax
  481:   48 c1 ea 0f             shr    $0xf,%rdx
  485:   83 e2 01                and    $0x1,%edx
  488:   83 e0 01                and    $0x1,%eax
  48b:   38 c2                   cmp    %al,%dl
  48d:   74 10                   je     49f <__switch_to_xtra+0x9f>
  48f:   49 8b 7d 00             mov    0x0(%r13),%rdi
  493:   48 c1 ef 0f             shr    $0xf,%rdi
  497:   83 e7 01                and    $0x1,%edi
  49a:   e8 61 fb ff ff          callq  0 <set_cpuid_faulting>

... the first 7 instructions burdens all __switch_to_xtra() users, not just the 
faulting-CPUID users.

The set_cpuid_faulting() call is also unnecessary and the set_cpuid_faulting() 
call generates into an obscene sequence of:

0000000000000000 <set_cpuid_faulting>:
   0:	8b 15 00 00 00 00    	mov    0x0(%rip),%edx        # 6 <set_cpuid_faulting+0x6>
   6:	55                   	push   %rbp
   7:	48 89 e5             	mov    %rsp,%rbp
   a:	53                   	push   %rbx
   b:	40 0f b6 df          	movzbl %dil,%ebx
   f:	85 d2                	test   %edx,%edx
  11:	75 07                	jne    1a <set_cpuid_faulting+0x1a>
  13:	9c                   	pushfq 
  14:	58                   	pop    %rax
  15:	f6 c4 02             	test   $0x2,%ah
  18:	75 48                	jne    62 <set_cpuid_faulting+0x62>
  1a:	65 48 8b 05 00 00 00 	mov    %gs:0x0(%rip),%rax        # 22 <set_cpuid_faulting+0x22>
  21:	00 
  22:	48 83 e0 fe          	and    $0xfffffffffffffffe,%rax
  26:	b9 40 01 00 00       	mov    $0x140,%ecx
  2b:	48 09 d8             	or     %rbx,%rax
  2e:	48 89 c2             	mov    %rax,%rdx
  31:	65 48 89 05 00 00 00 	mov    %rax,%gs:0x0(%rip)        # 39 <set_cpuid_faulting+0x39>
  38:	00 
  39:	48 c1 ea 20          	shr    $0x20,%rdx
  3d:	0f 30                	wrmsr  
  3f:	0f 1f 44 00 00       	nopl   0x0(%rax,%rax,1)
  44:	5b                   	pop    %rbx
  45:	5d                   	pop    %rbp
  46:	c3                   	retq   
  47:	48 c1 e2 20          	shl    $0x20,%rdx
  4b:	89 c0                	mov    %eax,%eax
  4d:	bf 40 01 00 00       	mov    $0x140,%edi
  52:	48 09 d0             	or     %rdx,%rax
  55:	31 d2                	xor    %edx,%edx
  57:	48 89 c6             	mov    %rax,%rsi
  5a:	e8 00 00 00 00       	callq  5f <set_cpuid_faulting+0x5f>
  5f:	5b                   	pop    %rbx
  60:	5d                   	pop    %rbp
  61:	c3                   	retq   
  62:	e8 00 00 00 00       	callq  67 <set_cpuid_faulting+0x67>
  67:	85 c0                	test   %eax,%eax
  69:	74 af                	je     1a <set_cpuid_faulting+0x1a>
  6b:	8b 05 00 00 00 00    	mov    0x0(%rip),%eax        # 71 <set_cpuid_faulting+0x71>
  71:	85 c0                	test   %eax,%eax
  73:	75 a5                	jne    1a <set_cpuid_faulting+0x1a>
  75:	48 c7 c1 00 00 00 00 	mov    $0x0,%rcx
  7c:	48 c7 c2 00 00 00 00 	mov    $0x0,%rdx
  83:	be b9 00 00 00       	mov    $0xb9,%esi
  88:	48 c7 c7 00 00 00 00 	mov    $0x0,%rdi
  8f:	e8 00 00 00 00       	callq  94 <set_cpuid_faulting+0x94>
  94:	eb 84                	jmp    1a <set_cpuid_faulting+0x1a>
  96:	66 2e 0f 1f 84 00 00 	nopw   %cs:0x0(%rax,%rax,1)
  9d:	00 00 00 

The affected object file code size blows up as well, by 17%:

 arch/x86/kernel/process.o:
    text    data     bss     dec     hex filename
    3325    8577      32   11934    2e9e process.o.before
    3889    8609      32   12530    30f2 process.o.after

A good deal of this overhead and complexity comes from the implementation 
inefficiency I pointed out, and all this can be avoided with the method I 
suggested in my previous review, by caching the per task MSR value in the thread 
struct.

So sorry, NAK for this implementation - especially considering how relatively 
straightforward the changes I suggested are to implement.

Thanks,

	Ingo

  parent reply	other threads:[~2016-12-02 10:29 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-11-28  3:05 [PATCH v13 0/8] x86/arch_prctl Add ARCH_[GET|SET]_CPUID for controlling the CPUID instruction Kyle Huey
2016-11-28  3:05 ` Kyle Huey
2016-11-28  3:05 ` [PATCH v13 1/8] x86/arch_prctl/64: Use SYSCALL_DEFINE2 to define sys_arch_prctl Kyle Huey
2016-11-28  3:05   ` Kyle Huey
2016-11-28  3:05 ` [PATCH v13 2/8] x86/arch_prctl/64: Rename do_arch_prctl to do_arch_prctl_64 Kyle Huey
2016-11-28  3:05   ` Kyle Huey
2016-11-28  3:05 ` [PATCH v13 3/8] x86/arch_prctl: Add do_arch_prctl_common Kyle Huey
2016-11-28  3:05   ` Kyle Huey
2016-11-28  3:05 ` [PATCH v13 4/8] x86/syscalls/32: Wire up arch_prctl on x86-32 Kyle Huey
2016-11-28  3:05   ` Kyle Huey
2016-11-28  3:05 ` [PATCH v13 5/8] x86/cpufeature: Detect CPUID faulting support Kyle Huey
2016-11-28  3:05   ` Kyle Huey
2016-11-28  3:05 ` [PATCH v13 6/8] x86/arch_prctl: Add ARCH_[GET|SET]_CPUID Kyle Huey
2016-11-28  3:05   ` Kyle Huey
2016-11-28 14:06   ` Thomas Gleixner
2016-11-28 14:06     ` Thomas Gleixner
2016-11-28 16:13     ` Kyle Huey
2016-11-28 16:13       ` Kyle Huey
2016-11-28  3:05 ` [PATCH v13 7/8] x86/arch_prctl: Selftest for ARCH_[GET|SET]_CPUID Kyle Huey
2016-11-28  3:05   ` Kyle Huey
2016-11-28  3:05 ` [PATCH v13 8/8] KVM: x86: virtualize cpuid faulting Kyle Huey
2016-11-28  3:05   ` Kyle Huey
2016-12-02 10:29 ` Ingo Molnar [this message]
2016-12-02 10:29   ` [PATCH v13 0/8] x86/arch_prctl Add ARCH_[GET|SET]_CPUID for controlling the CPUID instruction Ingo Molnar
2016-12-03 15:37   ` Kyle Huey
2016-12-03 15:37     ` Kyle Huey

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20161202102942.GA17332@gmail.com \
    --to=mingo@kernel.org \
    --cc=andi@firstfloor.org \
    --cc=boris.ostrovsky@oracle.com \
    --cc=bp@suse.de \
    --cc=dave.hansen@linux.intel.com \
    --cc=dmatlack@google.com \
    --cc=dsafonov@virtuozzo.com \
    --cc=hpa@zytor.com \
    --cc=jdike@addtoit.com \
    --cc=kvm@vger.kernel.org \
    --cc=len.brown@intel.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-kselftest@vger.kernel.org \
    --cc=luto@kernel.org \
    --cc=me@kylehuey.com \
    --cc=mingo@redhat.com \
    --cc=nadav.amit@gmail.com \
    --cc=pbonzini@redhat.com \
    --cc=peterz@infradead.org \
    --cc=rafael.j.wysocki@intel.com \
    --cc=richard@nod.at \
    --cc=rkrcmar@redhat.com \
    --cc=robert@ocallahan.org \
    --cc=shuah@kernel.org \
    --cc=tglx@linutronix.de \
    --cc=user-mode-linux-devel@lists.sourceforge.net \
    --cc=user-mode-linux-user@lists.sourceforge.net \
    --cc=viro@zeniv.linux.org.uk \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.