* Re: Large inlines in include/linux/skbuff.h
[not found] ` <1NAJr-61F-3@gated-at.bofh.it>
@ 2004-04-22 14:59 ` Andi Kleen
2004-04-22 15:15 ` James Morris
[not found] ` <1NRqX-2GI-15@gated-at.bofh.it>
1 sibling, 1 reply; 13+ messages in thread
From: Andi Kleen @ 2004-04-22 14:59 UTC (permalink / raw)
To: James Morris; +Cc: linux-kernel, vda
James Morris <jmorris@redhat.com> writes:
> On Wed, 21 Apr 2004, Denis Vlasenko wrote:
>
>> What shall be done with this? I'll make patch to move locking functions
>> into net/core/skbuff.c unless there is some reason not to do it.
>
> How will these changes impact performance? I asked this last time you
> posted about inlines and didn't see any response.
I don't think it will be an issue. The optimization guidelines
of AMD and Intel recommend to move functions that generate
more than 30-40 instructions out of line. 100 instructions
is certainly enough to amortize the call overhead, and you
safe some icache too so it may be even faster.
-Andi
^ permalink raw reply [flat|nested] 13+ messages in thread* Re: Large inlines in include/linux/skbuff.h
2004-04-22 14:59 ` Large inlines in include/linux/skbuff.h Andi Kleen
@ 2004-04-22 15:15 ` James Morris
2004-04-22 15:37 ` Eric Dumazet
2004-04-22 22:23 ` Andi Kleen
0 siblings, 2 replies; 13+ messages in thread
From: James Morris @ 2004-04-22 15:15 UTC (permalink / raw)
To: Andi Kleen; +Cc: linux-kernel, vda
On Thu, 22 Apr 2004, Andi Kleen wrote:
> > How will these changes impact performance? I asked this last time you
> > posted about inlines and didn't see any response.
>
> I don't think it will be an issue. The optimization guidelines
> of AMD and Intel recommend to move functions that generate
> more than 30-40 instructions out of line. 100 instructions
> is certainly enough to amortize the call overhead, and you
> safe some icache too so it may be even faster.
Of course, but it would be good to see some measurements.
- James
--
James Morris
<jmorris@redhat.com>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Large inlines in include/linux/skbuff.h
2004-04-22 15:15 ` James Morris
@ 2004-04-22 15:37 ` Eric Dumazet
[not found] ` <200404221927.34717.vda@port.imtp.ilyichevsk.odessa.ua>
2004-04-22 22:23 ` Andi Kleen
1 sibling, 1 reply; 13+ messages in thread
From: Eric Dumazet @ 2004-04-22 15:37 UTC (permalink / raw)
To: James Morris; +Cc: Andi Kleen, linux-kernel, vda
James Morris wrote:
>On Thu, 22 Apr 2004, Andi Kleen wrote:
>
>
>
>>>How will these changes impact performance? I asked this last time you
>>>posted about inlines and didn't see any response.
>>>
>>>
>>I don't think it will be an issue. The optimization guidelines
>>of AMD and Intel recommend to move functions that generate
>>more than 30-40 instructions out of line. 100 instructions
>>is certainly enough to amortize the call overhead, and you
>>safe some icache too so it may be even faster.
>>
>>
>
>Of course, but it would be good to see some measurements.
>
>
>- James
>
>
It depends a lot of the workload of the machine.
If this a specialized machine, with a small program that mostly uses
recv() & send() syscalls, then, inlining functions is a gain, since
icache may have a 100% hit ratio. Optimization guidelines are good for
the common cases, not every cases.
Eric Dumazet
^ permalink raw reply [flat|nested] 13+ messages in thread* Re: Large inlines in include/linux/skbuff.h
2004-04-22 15:15 ` James Morris
2004-04-22 15:37 ` Eric Dumazet
@ 2004-04-22 22:23 ` Andi Kleen
2004-04-22 22:25 ` David S. Miller
1 sibling, 1 reply; 13+ messages in thread
From: Andi Kleen @ 2004-04-22 22:23 UTC (permalink / raw)
To: James Morris; +Cc: Andi Kleen, linux-kernel, vda
On Thu, Apr 22, 2004 at 11:15:47AM -0400, James Morris wrote:
> On Thu, 22 Apr 2004, Andi Kleen wrote:
>
> > > How will these changes impact performance? I asked this last time you
> > > posted about inlines and didn't see any response.
> >
> > I don't think it will be an issue. The optimization guidelines
> > of AMD and Intel recommend to move functions that generate
> > more than 30-40 instructions out of line. 100 instructions
> > is certainly enough to amortize the call overhead, and you
> > safe some icache too so it may be even faster.
>
> Of course, but it would be good to see some measurements.
It's useless in this case. networking is dominated by cache misses
and locks and a modern CPU can do hundreds of function calls in a
single cache miss.
-Andi
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Large inlines in include/linux/skbuff.h
2004-04-22 22:23 ` Andi Kleen
@ 2004-04-22 22:25 ` David S. Miller
2004-04-22 22:34 ` Andi Kleen
0 siblings, 1 reply; 13+ messages in thread
From: David S. Miller @ 2004-04-22 22:25 UTC (permalink / raw)
To: Andi Kleen; +Cc: jmorris, ak, linux-kernel, vda
On 23 Apr 2004 00:23:26 +0200
Andi Kleen <ak@muc.de> wrote:
> > Of course, but it would be good to see some measurements.
>
> It's useless in this case. networking is dominated by cache misses
> and locks and a modern CPU can do hundreds of function calls in a
> single cache miss.
Indeed, this change does influence the I-cache footprint of the
networking, so I conclude that it is relevant in this case.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Large inlines in include/linux/skbuff.h
2004-04-22 22:25 ` David S. Miller
@ 2004-04-22 22:34 ` Andi Kleen
2004-04-22 22:35 ` David S. Miller
0 siblings, 1 reply; 13+ messages in thread
From: Andi Kleen @ 2004-04-22 22:34 UTC (permalink / raw)
To: David S. Miller; +Cc: Andi Kleen, jmorris, linux-kernel, vda
On Thu, Apr 22, 2004 at 03:25:25PM -0700, David S. Miller wrote:
> On 23 Apr 2004 00:23:26 +0200
> Andi Kleen <ak@muc.de> wrote:
>
> > > Of course, but it would be good to see some measurements.
> >
> > It's useless in this case. networking is dominated by cache misses
> > and locks and a modern CPU can do hundreds of function calls in a
> > single cache miss.
>
> Indeed, this change does influence the I-cache footprint of the
> networking, so I conclude that it is relevant in this case.
All it does it is to lower the cache foot print, not enlarge it.
So even if it made a difference it could only get better.
It's extremly unlikely to make it detectable worse.
-Andi
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Large inlines in include/linux/skbuff.h
2004-04-22 22:34 ` Andi Kleen
@ 2004-04-22 22:35 ` David S. Miller
0 siblings, 0 replies; 13+ messages in thread
From: David S. Miller @ 2004-04-22 22:35 UTC (permalink / raw)
To: Andi Kleen; +Cc: ak, jmorris, linux-kernel, vda
On 23 Apr 2004 00:34:01 +0200
Andi Kleen <ak@muc.de> wrote:
> All it does it is to lower the cache foot print, not enlarge it.
> So even if it made a difference it could only get better.
> It's extremly unlikely to make it detectable worse.
That's a good argument. I would still like to see some
numbers, you know to help me sleep at night :-)
^ permalink raw reply [flat|nested] 13+ messages in thread
[parent not found: <1NRqX-2GI-15@gated-at.bofh.it>]
* Large inlines in include/linux/skbuff.h
@ 2004-04-21 19:26 Denis Vlasenko
2004-04-22 0:47 ` James Morris
0 siblings, 1 reply; 13+ messages in thread
From: Denis Vlasenko @ 2004-04-21 19:26 UTC (permalink / raw)
To: David S. Miller; +Cc: Jeff Garzik, linux-kernel
Let's skip stuff which is smaller than 100 bytes.
Size Uses Wasted Name and definition
===== ==== ====== ================================================
58 20 722 __skb_dequeue include/linux/skbuff.h
122 119 12036 skb_dequeue include/linux/skbuff.h
103 10 747 __skb_queue_purge include/linux/skbuff.h
164 78 11088 skb_queue_purge include/linux/skbuff.h
46 26 650 __skb_queue_tail include/linux/skbuff.h
109 101 8900 skb_queue_tail include/linux/skbuff.h
47 11 270 __skb_queue_head include/linux/skbuff.h
110 44 3870 skb_queue_head include/linux/skbuff.h
51 16 465 __skb_unlink include/linux/skbuff.h
132 9 896 skb_unlink include/linux/skbuff.h
46 2 26 __skb_append include/linux/skbuff.h
114 5 376 skb_append include/linux/skbuff.h
'Uses' was counted over .c files only, .h were excluded, so
no double count here.
As you can see here, non-locking functions are more or less sanely
sized (except __skb_queue_purge). When spin_lock/unlock code gets added,
we cross 100 byte mark.
What shall be done with this? I'll make patch to move locking functions
into net/core/skbuff.c unless there is some reason not to do it.
Just to make you feel what is 100+ bytes in assembly:
void inline_skb_dequeue_1() { skb_dequeqe(0); }
gets compiled into:
00003513 <inline_skb_dequeue_1>:
3513: 55 push %ebp
3514: 89 e5 mov %esp,%ebp
3516: 53 push %ebx
3517: 9c pushf
3518: 5b pop %ebx
3519: fa cli
351a: b9 00 e0 ff ff mov $0xffffe000,%ecx
351f: 21 e1 and %esp,%ecx
3521: ff 41 14 incl 0x14(%ecx)
3524: f0 fe 0d 0c 00 00 00 lock decb 0xc
352b: 78 02 js 352f <inline_skb_dequeue_1+0x1c>
352d: eb 0d jmp 353c <inline_skb_dequeue_1+0x29>
352f: f3 90 repz nop
3531: 80 3d 0c 00 00 00 00 cmpb $0x0,0xc
3538: 7e f5 jle 352f <inline_skb_dequeue_1+0x1c>
353a: eb e8 jmp 3524 <inline_skb_dequeue_1+0x11>
353c: 8b 15 00 00 00 00 mov 0x0,%edx
3542: 85 d2 test %edx,%edx
3544: 74 2b je 3571 <inline_skb_dequeue_1+0x5e>
3546: 89 d0 mov %edx,%eax
3548: 8b 12 mov (%edx),%edx
354a: 89 15 00 00 00 00 mov %edx,0x0
3550: ff 0d 08 00 00 00 decl 0x8
3556: c7 42 04 00 00 00 00 movl $0x0,0x4(%edx)
355d: c7 00 00 00 00 00 movl $0x0,(%eax)
3563: c7 40 04 00 00 00 00 movl $0x0,0x4(%eax)
356a: c7 40 08 00 00 00 00 movl $0x0,0x8(%eax)
3571: b0 01 mov $0x1,%al
3573: 86 05 0c 00 00 00 xchg %al,0xc
3579: 53 push %ebx
357a: 9d popf
357b: 8b 41 08 mov 0x8(%ecx),%eax
357e: ff 49 14 decl 0x14(%ecx)
3581: a8 08 test $0x8,%al
3583: 74 05 je 358a <inline_skb_dequeue_1+0x77>
3585: e8 fc ff ff ff call 3586 <inline_skb_dequeue_1+0x73>
358a: 5b pop %ebx
358b: c9 leave
358c: c3 ret
--
vda
^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2004-04-22 22:36 UTC | newest]
Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <1NvJL-1QO-9@gated-at.bofh.it>
[not found] ` <1NAJr-61F-3@gated-at.bofh.it>
2004-04-22 14:59 ` Large inlines in include/linux/skbuff.h Andi Kleen
2004-04-22 15:15 ` James Morris
2004-04-22 15:37 ` Eric Dumazet
[not found] ` <200404221927.34717.vda@port.imtp.ilyichevsk.odessa.ua>
2004-04-22 19:24 ` Eric Dumazet
2004-04-22 22:23 ` Andi Kleen
2004-04-22 22:25 ` David S. Miller
2004-04-22 22:34 ` Andi Kleen
2004-04-22 22:35 ` David S. Miller
[not found] ` <1NRqX-2GI-15@gated-at.bofh.it>
[not found] ` <1NRqX-2GI-13@gated-at.bofh.it>
2004-04-22 19:51 ` Andi Kleen
2004-04-21 19:26 Denis Vlasenko
2004-04-22 0:47 ` James Morris
[not found] ` <200404221756.46240.vda@port.imtp.ilyichevsk.odessa.ua>
2004-04-22 18:36 ` David S. Miller
[not found] ` <200404222239.34760.vda@port.imtp.ilyichevsk.odessa.ua>
2004-04-22 22:10 ` David S. Miller
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox