From: Eric Dumazet <dada1@cosmosbay.com>
To: Ingo Molnar <mingo@elte.hu>
Cc: Steven Rostedt <rostedt@goodmis.org>,
Pekka Enberg <penberg@cs.helsinki.fi>,
Christoph Lameter <christoph@lameter.com>,
Alok N Kataria <alokk@calsoftinc.com>,
Shobhit Dayal <shobhit@calsoftinc.com>,
Shai Fultheim <shai@scalex86.org>, Matt Mackall <mpm@selenic.com>,
Andrew Morton <akpm@osdl.org>, john stultz <johnstul@us.ibm.com>,
Gunter Ohrner <G.Ohrner@post.rwth-aachen.de>,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH RT 00/02] SLOB optimizations
Date: Wed, 21 Dec 2005 09:02:15 +0100
Message-ID: <43A90C07.4000003@cosmosbay.com>
In-Reply-To: <20051221074346.GA2398@elte.hu>
Ingo Molnar wrote:
> * Eric Dumazet <dada1@cosmosbay.com> wrote:
>
>
>>>while it could possibly be cleaned up a bit, it's one of the
>>>best-optimized subsystems Linux has. Most of the "unnecessary
>>>complexity" in SLAB is related to a performance or a debugging feature.
>>>Many times i have looked at the SLAB code in a disassembler, right next
>>>to profile output from some hot workload, and have concluded: 'I couldnt
>>>do this any better even with hand-coded assembly'.
>>
>>Well, I miss a version of kmem_cache_alloc()/kmem_cache_free() that
>>wont play with IRQ masking.
>
>
> sure, but adding this sure wont reduce complexity ;)
>
>
>>The local_irq_save()/local_irq_restore() pair is quite expensive and
>>could be avoided for several caches that are exclusively used in
>>process context.
>
>
> in any case, on sane platforms (i386, x86_64) an irq-disable is
> well-optimized in hardware, and is just as fast as a preempt_disable().
>
I'm afraid that's not the case on current hardware.
The irq disable/enable pair accounts for more than 50% of the CPU time spent
in kmem_cache_alloc()/kmem_cache_free()/kfree().

oprofile results on a dual Opteron 246:
You can see the high sample counts right after the cli and popf (sti)
instructions, popf being VERY expensive.
CPU: Hammer, speed 1993.39 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Cycles outside of halt state) with a unit
mask of 0x00 (No unit mask) count 50000
29993 1.9317 kfree
18654 1.2014 kmem_cache_alloc
12962 0.8348 kmem_cache_free
ffffffff8015c370 <kfree>: /* kfree total: 30334 1.9335 */
770 0.0491 :ffffffff8015c370: push %rbp
2477 0.1579 :ffffffff8015c371: mov %rdi,%rbp
:ffffffff8015c374: push %rbx
:ffffffff8015c375: sub $0x8,%rsp
1792 0.1142 :ffffffff8015c379: test %rdi,%rdi
:ffffffff8015c37c: je ffffffff8015c452 <kfree+0xe2>
122 0.0078 :ffffffff8015c382: pushfq
1001 0.0638 :ffffffff8015c383: popq (%rsp)
1456 0.0928 :ffffffff8015c386: cli
2489 0.1586 :ffffffff8015c387: mov $0xffffffff7fffffff,%rax <<
...
72 0.0046 :ffffffff8015c44e: pushq (%rsp)
1080 0.0688 :ffffffff8015c451: popfq
13934 0.8882 :ffffffff8015c452: add $0x8,%rsp << HERE >>
290 0.0185 :ffffffff8015c456: pop %rbx
:ffffffff8015c457: pop %rbp
124 0.0079 :ffffffff8015c458: retq
ffffffff8015c460 <kmem_cache_free>: /* kmem_cache_free total: 13084 0.8340 */
388 0.0247 :ffffffff8015c460: sub $0x18,%rsp
365 0.0233 :ffffffff8015c464: mov %rbp,0x10(%rsp)
:ffffffff8015c469: mov %rbx,0x8(%rsp)
121 0.0077 :ffffffff8015c46e: mov %rsi,%rbp
262 0.0167 :ffffffff8015c471: pushfq
549 0.0350 :ffffffff8015c472: popq (%rsp)
351 0.0224 :ffffffff8015c475: cli
2478 0.1579 :ffffffff8015c476: mov %gs:0x34,%eax
592 0.0377 :ffffffff8015c47e: cltq
:ffffffff8015c480: mov (%rdi,%rax,8),%rbx
7 4.5e-04 :ffffffff8015c484: mov (%rbx),%eax
200 0.0127 :ffffffff8015c486: cmp 0x4(%rbx),%eax
:ffffffff8015c489: jae ffffffff8015c48f <kmem_cache_free+0x2f>
:ffffffff8015c48b: mov %eax,%eax
766 0.0488 :ffffffff8015c48d: jmp ffffffff8015c4a0 <kmem_cache_free+0x40>
:ffffffff8015c48f: mov %rbx,%rsi
71 0.0045 :ffffffff8015c492: callq ffffffff8015c810 <cache_flusharray>
:ffffffff8015c497: mov (%rbx),%eax
1 6.4e-05 :ffffffff8015c499: data16
:ffffffff8015c49a: data16
:ffffffff8015c49b: data16
:ffffffff8015c49c: nop
:ffffffff8015c49d: data16
:ffffffff8015c49e: data16
:ffffffff8015c49f: nop
:ffffffff8015c4a0: mov %rbp,0x10(%rbx,%rax,8)
20 0.0013 :ffffffff8015c4a5: incl (%rbx)
176 0.0112 :ffffffff8015c4a7: pushq (%rsp)
7 4.5e-04 :ffffffff8015c4aa: popfq
6187 0.3944 :ffffffff8015c4ab: mov 0x8(%rsp),%rbx << HERE>>
543 0.0346 :ffffffff8015c4b0: mov 0x10(%rsp),%rbp
:ffffffff8015c4b5: add $0x18,%rsp
:ffffffff8015c4b9: retq
ffffffff8015bd70 <kmem_cache_alloc>: /* kmem_cache_alloc total: 18803 1.1985 */
549 0.0350 :ffffffff8015bd70: sub $0x8,%rsp
700 0.0446 :ffffffff8015bd74: pushfq
1427 0.0910 :ffffffff8015bd75: popq (%rsp)
226 0.0144 :ffffffff8015bd78: cli
2399 0.1529 :ffffffff8015bd79: mov %gs:0x34,%eax <<HERE>>
416 0.0265 :ffffffff8015bd81: cltq
:ffffffff8015bd83: mov (%rdi,%rax,8),%rdx
21 0.0013 :ffffffff8015bd87: mov (%rdx),%eax
172 0.0110 :ffffffff8015bd89: test %eax,%eax
:ffffffff8015bd8b: je ffffffff8015bda1 <kmem_cache_alloc+0x31>
8 5.1e-04 :ffffffff8015bd8d: dec %eax
1338 0.0853 :ffffffff8015bd8f: movl $0x1,0xc(%rdx)
9 5.7e-04 :ffffffff8015bd96: mov %eax,(%rdx)
9 5.7e-04 :ffffffff8015bd98: mov %eax,%eax
1146 0.0730 :ffffffff8015bd9a: mov 0x10(%rdx,%rax,8),%rax
4 2.5e-04 :ffffffff8015bd9f: jmp ffffffff8015bda6 <kmem_cache_alloc+0x36>
:ffffffff8015bda1: callq ffffffff8015c160 <cache_alloc_refill>
154 0.0098 :ffffffff8015bda6: pushq (%rsp)
241 0.0154 :ffffffff8015bda9: popfq
9222 0.5878 :ffffffff8015bdaa: prefetchw (%rax) <<HERE>>
758 0.0483 :ffffffff8015bdad: add $0x8,%rsp
4 2.5e-04 :ffffffff8015bdb1: retq
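
As a rough cross-check of the "more than 50%" figure, the sample counts from
the listing above can be added up. This is only a sketch under two
assumptions: that the samples recorded on the instruction immediately
following cli and popfq really belong to those instructions (the usual
profiler skid), and that nothing else contributes to those addresses. For
kfree the second assumption is shaky, since ffffffff8015c452 is also the
target of the early-exit je and part of the listing is elided. The names
PROFILE, irq_share and combined_share are made up for this illustration.

```python
# Sample counts (first oprofile column) copied from the listing above.
# "after_cli"   = samples on the instruction right after cli
# "after_popfq" = samples on the instruction right after popfq
# "total"       = per-function total from the "/* ... total: */" annotation
PROFILE = {
    "kfree":            {"total": 30334, "after_cli": 2489, "after_popfq": 13934},
    "kmem_cache_free":  {"total": 13084, "after_cli": 2478, "after_popfq": 6187},
    "kmem_cache_alloc": {"total": 18803, "after_cli": 2399, "after_popfq": 9222},
}


def irq_share(entry):
    """Fraction of one function's samples landing right after cli or popfq."""
    return (entry["after_cli"] + entry["after_popfq"]) / entry["total"]


def combined_share(profile):
    """Same fraction, computed over all listed functions together."""
    irq = sum(e["after_cli"] + e["after_popfq"] for e in profile.values())
    total = sum(e["total"] for e in profile.values())
    return irq / total


if __name__ == "__main__":
    for name, entry in PROFILE.items():
        print(f"{name:18s} {irq_share(entry):6.1%}")
    print(f"{'combined':18s} {combined_share(PROFILE):6.1%}")
```

Under those assumptions the share comes out at roughly 54% for kfree, 66%
for kmem_cache_free and 62% for kmem_cache_alloc, about 59% overall.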
Eric