public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
To: Ingo Molnar <mingo@elte.hu>, Thomas Gleixner <tglx@linutronix.de>,
	"H. Peter Anvin" <hpa@zytor.com>
Cc: linux-kernel@vger.kernel.org
Subject: [RFC PATCH -tip 0/4] x86: signal handler improvement
Date: Mon, 22 Sep 2008 18:45:14 -0700	[thread overview]
Message-ID: <48D84A2A.2030009@ct.jp.nec.com> (raw)

This patch series is experimental.
I made it against tip/master.

I noticed there are inefficient codes in x86 signals.
For example, disassembled 32-bit setup_sigcontext();

0000007c <setup_sigcontext>:
  7c:	55                   	push   %ebp
  7d:	89 e5                	mov    %esp,%ebp
  7f:	57                   	push   %edi
  80:	31 ff                	xor    %edi,%edi
  82:	56                   	push   %esi
  83:	89 c6                	mov    %eax,%esi
  85:	53                   	push   %ebx
  86:	83 ec 58             	sub    $0x58,%esp
  89:	89 55 a4             	mov    %edx,-0x5c(%ebp)
  8c:	89 fa                	mov    %edi,%edx
  8e:	8b 41 24             	mov    0x24(%ecx),%eax
  91:	89 46 04             	mov    %eax,0x4(%esi)
  94:	89 55 a8             	mov    %edx,-0x58(%ebp)
...
 184:	8b 5d ac             	mov    -0x54(%ebp),%ebx
 187:	0b 5d a8             	or     -0x58(%ebp),%ebx
 18a:	0b 5d b0             	or     -0x50(%ebp),%ebx
 18d:	0b 5d b4             	or     -0x4c(%ebp),%ebx
 190:	0b 5d b8             	or     -0x48(%ebp),%ebx
 193:	0b 5d bc             	or     -0x44(%ebp),%ebx
 196:	0b 5d c0             	or     -0x40(%ebp),%ebx
 199:	0b 5d c4             	or     -0x3c(%ebp),%ebx
 19c:	0b 5d c8             	or     -0x38(%ebp),%ebx
 19f:	0b 5d cc             	or     -0x34(%ebp),%ebx
 1a2:	0b 5d d0             	or     -0x30(%ebp),%ebx
 1a5:	0b 5d d4             	or     -0x2c(%ebp),%ebx
 1a8:	0b 5d d8             	or     -0x28(%ebp),%ebx
 1ab:	0b 5d dc             	or     -0x24(%ebp),%ebx
 1ae:	0b 5d e0             	or     -0x20(%ebp),%ebx
 1b1:	0b 5d e4             	or     -0x1c(%ebp),%ebx
 1b4:	0b 5d e8             	or     -0x18(%ebp),%ebx
 1b7:	0b 5d ec             	or     -0x14(%ebp),%ebx
 1ba:	0b 5d f0             	or     -0x10(%ebp),%ebx
 1bd:	09 fb                	or     %edi,%ebx
...
 1dc:	09 d8                	or     %ebx,%eax
 1de:	5b                   	pop    %ebx
 1df:	09 c8                	or     %ecx,%eax
 1e1:	5e                   	pop    %esi
 1e2:	5f                   	pop    %edi
 1e3:	5d                   	pop    %ebp
 1e4:	c3                   	ret    

there is a lot of "or" operation with stack, and it came from a set of
following lines;

err |= __put_user(x, ptr);

The above line compiled to like this;
  a0:	89 fa                	mov    %edi,%edx
  a2:	8b 41 20             	mov    0x20(%ecx),%eax
  a5:	89 46 08             	mov    %eax,0x8(%esi)
  a8:	89 55 b0             	mov    %edx,-0x50(%ebp)

and errors in __put_user is stored to stack.
At the end of function, all errors in stack are calculated. If access to user
space is failed, %edx is set to -EFAULT in exception handler and stored to
stack for later operation.
It seems inefficient.

This patch series introduce __{put|get}_user_cerr for cumulative error
handling. So, the line;
err |= __put_user(x, ptr);
changes to
__put_user_cerr(x, ptr, err);

and it compiled to like this;
  92:	8b 41 20             	mov    0x20(%ecx),%eax
  95:	89 47 08             	mov    %eax,0x8(%edi)

The cumulative error is kept in the other register, in this example,
%esi is used for this, and returns it. If access to user space causes fault,
%esi is set to the value (%esi | -EFAULT) in exception handler.

 137:	89 f0                	mov    %esi,%eax
 139:	5b                   	pop    %ebx
 13a:	5e                   	pop    %esi
 13b:	5f                   	pop    %edi
 13c:	5d                   	pop    %ebp
 13d:	c3                   	ret    

The results of this, I got a little performance improvement in signal
handling. Here is a result of lmbench.
I've tried 64-bit only at this time. Will measure on 32-bit.

Processor, Processes - times in microseconds - smaller is better
------------------------------------------------------------------------------
Host                 OS  Mhz null null      open slct sig  sig  fork exec sh
                             call  I/O stat clos TCP  inst hndl proc proc proc
--------- ------------- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ----
tip64-sig Linux 2.6.27- 3788 0.17 0.27 1.67 2.65 3.20 0.32 2.97 180. 524. 1806
tip64-sig Linux 2.6.27- 3788 0.17 0.27 1.68 2.65 3.20 0.32 2.95 180. 520. 1796
tip64-sig Linux 2.6.27- 3788 0.17 0.27 1.68 2.72 3.21 0.32 2.96 177. 528. 1812
tip64     Linux 2.6.27- 3788 0.17 0.27 1.68 2.74 3.22 0.32 3.01 174. 523. 1853
tip64     Linux 2.6.27- 3788 0.17 0.27 1.68 2.73 3.22 0.32 3.01 175. 523. 1806
tip64     Linux 2.6.27- 3788 0.17 0.27 1.68 2.74 3.20 0.32 3.01 181. 523. 1791

This patch series also reduces stack usages and code size.

$ size signal_*
   text	   data	    bss	    dec	    hex	filename
   4507	      0	      0	   4507	   119b	signal_32.o.new
   5031	      0	      0	   5031	   13a7	signal_32.o.old
   3827	      0	      0	   3827	    ef3	signal_64.o.new
   4652	      0	      0	   4652	   122c	signal_64.o.old

Comments are welcome.
I'll handle this patch series if it's good.

thanks,
Hiroshi Shimamoto


             reply	other threads:[~2008-09-23  1:45 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-09-23  1:45 Hiroshi Shimamoto [this message]
2008-09-23  1:50 ` [RFC PATCH -tip 1/4] x86: uaccess: rename __put_user_u64 to __put_user_asm_u64 Hiroshi Shimamoto
2008-09-23  1:51 ` [RFC PATCH -tip 2/4] x86: uaccess: introduce __{put|get}_user_asm_eop Hiroshi Shimamoto
2008-09-23  1:51 ` [RFC PATCH -tip 3/4] x86: uaccess: introduce __{put|get}_user_cerr Hiroshi Shimamoto
2008-09-23  1:51 ` [RFC PATCH -tip 4/4] x86: signal: use __{put|get}_user_cerr Hiroshi Shimamoto
2008-09-23  8:50 ` [RFC PATCH -tip 0/4] x86: signal handler improvement Ingo Molnar
2008-09-23  8:55   ` Ingo Molnar
2008-09-23 17:00     ` Hiroshi Shimamoto
2008-09-23 16:58   ` Hiroshi Shimamoto

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=48D84A2A.2030009@ct.jp.nec.com \
    --to=h-shimamoto@ct.jp.nec.com \
    --cc=hpa@zytor.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox