From: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
To: Ingo Molnar <mingo@elte.hu>, Thomas Gleixner <tglx@linutronix.de>,
"H. Peter Anvin" <hpa@zytor.com>
Cc: linux-kernel@vger.kernel.org
Subject: [RFC PATCH -tip 0/4] x86: signal handler improvement
Date: Mon, 22 Sep 2008 18:45:14 -0700 [thread overview]
Message-ID: <48D84A2A.2030009@ct.jp.nec.com> (raw)
This patch series is experimental.
I made it against tip/master.
I noticed there are inefficient codes in x86 signals.
For example, disassembled 32-bit setup_sigcontext();
0000007c <setup_sigcontext>:
7c: 55 push %ebp
7d: 89 e5 mov %esp,%ebp
7f: 57 push %edi
80: 31 ff xor %edi,%edi
82: 56 push %esi
83: 89 c6 mov %eax,%esi
85: 53 push %ebx
86: 83 ec 58 sub $0x58,%esp
89: 89 55 a4 mov %edx,-0x5c(%ebp)
8c: 89 fa mov %edi,%edx
8e: 8b 41 24 mov 0x24(%ecx),%eax
91: 89 46 04 mov %eax,0x4(%esi)
94: 89 55 a8 mov %edx,-0x58(%ebp)
...
184: 8b 5d ac mov -0x54(%ebp),%ebx
187: 0b 5d a8 or -0x58(%ebp),%ebx
18a: 0b 5d b0 or -0x50(%ebp),%ebx
18d: 0b 5d b4 or -0x4c(%ebp),%ebx
190: 0b 5d b8 or -0x48(%ebp),%ebx
193: 0b 5d bc or -0x44(%ebp),%ebx
196: 0b 5d c0 or -0x40(%ebp),%ebx
199: 0b 5d c4 or -0x3c(%ebp),%ebx
19c: 0b 5d c8 or -0x38(%ebp),%ebx
19f: 0b 5d cc or -0x34(%ebp),%ebx
1a2: 0b 5d d0 or -0x30(%ebp),%ebx
1a5: 0b 5d d4 or -0x2c(%ebp),%ebx
1a8: 0b 5d d8 or -0x28(%ebp),%ebx
1ab: 0b 5d dc or -0x24(%ebp),%ebx
1ae: 0b 5d e0 or -0x20(%ebp),%ebx
1b1: 0b 5d e4 or -0x1c(%ebp),%ebx
1b4: 0b 5d e8 or -0x18(%ebp),%ebx
1b7: 0b 5d ec or -0x14(%ebp),%ebx
1ba: 0b 5d f0 or -0x10(%ebp),%ebx
1bd: 09 fb or %edi,%ebx
...
1dc: 09 d8 or %ebx,%eax
1de: 5b pop %ebx
1df: 09 c8 or %ecx,%eax
1e1: 5e pop %esi
1e2: 5f pop %edi
1e3: 5d pop %ebp
1e4: c3 ret
there is a lot of "or" operation with stack, and it came from a set of
following lines;
err |= __put_user(x, ptr);
The above line compiled to like this;
a0: 89 fa mov %edi,%edx
a2: 8b 41 20 mov 0x20(%ecx),%eax
a5: 89 46 08 mov %eax,0x8(%esi)
a8: 89 55 b0 mov %edx,-0x50(%ebp)
and errors in __put_user is stored to stack.
At the end of function, all errors in stack are calculated. If access to user
space is failed, %edx is set to -EFAULT in exception handler and stored to
stack for later operation.
It seems inefficient.
This patch series introduce __{put|get}_user_cerr for cumulative error
handling. So, the line;
err |= __put_user(x, ptr);
changes to
__put_user_cerr(x, ptr, err);
and it compiled to like this;
92: 8b 41 20 mov 0x20(%ecx),%eax
95: 89 47 08 mov %eax,0x8(%edi)
The cumulative error is kept in the other register, in this example,
%esi is used for this, and returns it. If access to user space causes fault,
%esi is set to the value (%esi | -EFAULT) in exception handler.
137: 89 f0 mov %esi,%eax
139: 5b pop %ebx
13a: 5e pop %esi
13b: 5f pop %edi
13c: 5d pop %ebp
13d: c3 ret
The results of this, I got a little performance improvement in signal
handling. Here is a result of lmbench.
I've tried 64-bit only at this time. Will measure on 32-bit.
Processor, Processes - times in microseconds - smaller is better
------------------------------------------------------------------------------
Host OS Mhz null null open slct sig sig fork exec sh
call I/O stat clos TCP inst hndl proc proc proc
--------- ------------- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ----
tip64-sig Linux 2.6.27- 3788 0.17 0.27 1.67 2.65 3.20 0.32 2.97 180. 524. 1806
tip64-sig Linux 2.6.27- 3788 0.17 0.27 1.68 2.65 3.20 0.32 2.95 180. 520. 1796
tip64-sig Linux 2.6.27- 3788 0.17 0.27 1.68 2.72 3.21 0.32 2.96 177. 528. 1812
tip64 Linux 2.6.27- 3788 0.17 0.27 1.68 2.74 3.22 0.32 3.01 174. 523. 1853
tip64 Linux 2.6.27- 3788 0.17 0.27 1.68 2.73 3.22 0.32 3.01 175. 523. 1806
tip64 Linux 2.6.27- 3788 0.17 0.27 1.68 2.74 3.20 0.32 3.01 181. 523. 1791
This patch series also reduces stack usages and code size.
$ size signal_*
text data bss dec hex filename
4507 0 0 4507 119b signal_32.o.new
5031 0 0 5031 13a7 signal_32.o.old
3827 0 0 3827 ef3 signal_64.o.new
4652 0 0 4652 122c signal_64.o.old
Comments are welcome.
I'll handle this patch series if it's good.
thanks,
Hiroshi Shimamoto
next reply other threads:[~2008-09-23 1:45 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2008-09-23 1:45 Hiroshi Shimamoto [this message]
2008-09-23 1:50 ` [RFC PATCH -tip 1/4] x86: uaccess: rename __put_user_u64 to __put_user_asm_u64 Hiroshi Shimamoto
2008-09-23 1:51 ` [RFC PATCH -tip 2/4] x86: uaccess: introduce __{put|get}_user_asm_eop Hiroshi Shimamoto
2008-09-23 1:51 ` [RFC PATCH -tip 3/4] x86: uaccess: introduce __{put|get}_user_cerr Hiroshi Shimamoto
2008-09-23 1:51 ` [RFC PATCH -tip 4/4] x86: signal: use __{put|get}_user_cerr Hiroshi Shimamoto
2008-09-23 8:50 ` [RFC PATCH -tip 0/4] x86: signal handler improvement Ingo Molnar
2008-09-23 8:55 ` Ingo Molnar
2008-09-23 17:00 ` Hiroshi Shimamoto
2008-09-23 16:58 ` Hiroshi Shimamoto
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=48D84A2A.2030009@ct.jp.nec.com \
--to=h-shimamoto@ct.jp.nec.com \
--cc=hpa@zytor.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@elte.hu \
--cc=tglx@linutronix.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.