From: Eric Dumazet <dada1@cosmosbay.com>
To: Thomas Graf <tgraf@suug.ch>
Cc: "David S. Miller" <davem@davemloft.net>, netdev@oss.sgi.com
Subject: Re: [PATCH] loop unrolling in net/sched/sch_generic.c
Date: Tue, 05 Jul 2005 17:58:39 +0200
Message-ID: <42CAAE2F.5070807@cosmosbay.com>
In-Reply-To: <20050705134805.GH16076@postel.suug.ch>

Thomas Graf wrote:

>>OK. At least my compiler (gcc-3.3.1) does NOT unroll the loop :
> 
> 
> Because you don't specify -funroll-loop

I'm using vanilla 2.6.12: there is no -funroll-loops in it. Maybe your tree has it, but 99.9% of 2.6.12 trees do not.

Are you suggesting everybody should use this compiler flag?
Something like:

net/sched/Makefile:

CFLAGS_sch_generic.o := -funroll-loops

?

> 
> [...]
> 
> 
>>Please give us the code your compiler produces,
> 
> 
> Unrolled version:
> 
> pfifo_fast_dequeue:
> 	pushl	%esi
> 	xorl	%edx, %edx
> 	pushl	%ebx
> 	movl	12(%esp), %esi
> 	movl	128(%esi), %eax
> 	leal	128(%esi), %ecx
> 	cmpl	%ecx, %eax
> 	je	.L132
> 	movl	%eax, %edx
> 	movl	(%eax), %eax
> 	decl	8(%ecx)
> 	movl	$0, 8(%edx)
> 	movl	%ecx, 4(%eax)
> 	movl	%eax, 128(%esi)
> 	movl	$0, 4(%edx)
> 	movl	$0, (%edx)
> .L132:
> 	testl	%edx, %edx
> 	je	.L131
> 	movl	96(%edx), %ebx
> 	movl	80(%esi), %eax
> 	decl	40(%esi)
> 	subl	%ebx, %eax
> 	movl	%eax, 80(%esi)
> 	movl	%edx, %eax
> .L117:
> 	popl	%ebx
> 	popl	%esi
> 	ret
> .L131:
> 	movl	20(%ecx), %eax
> 	leal	20(%ecx), %edx
> 	xorl	%ebx, %ebx
> 	cmpl	%edx, %eax
> 	je	.L137
> 	movl	%eax, %ebx
> 	movl	(%eax), %eax
> 	decl	8(%edx)
> 	movl	$0, 8(%ebx)
> 	movl	%edx, 4(%eax)
> 	movl	%eax, 20(%ecx)
> 	movl	$0, 4(%ebx)
> 	movl	$0, (%ebx)
> .L137:
> 	testl	%ebx, %ebx
> 	je	.L147
> .L146:
> 	movl	96(%ebx), %ecx
> 	movl	80(%esi), %eax
> 	decl	40(%esi)
> 	subl	%ecx, %eax
> 	movl	%eax, 80(%esi)
> 	movl	%ebx, %eax
> 	jmp	.L117
> .L147:
> 	movl	40(%ecx), %eax
> 	leal	40(%ecx), %edx
> 	xorl	%ebx, %ebx
> 	cmpl	%edx, %eax
> 	je	.L142
> 	movl	%eax, %ebx
> 	movl	(%eax), %eax
> 	decl	8(%edx)
> 	movl	$0, 8(%ebx)
> 	movl	%edx, 4(%eax)
> 	movl	%eax, 40(%ecx)
> 	movl	$0, 4(%ebx)
> 	movl	$0, (%ebx)
> .L142:
> 	xorl	%eax, %eax
> 	testl	%ebx, %ebx
> 	jne	.L146
> 	jmp	.L117
> 

OK thanks, but you don't give the code for my version :) It is shorter and unrolled, as you can see, with nicely predicted branches.

00000fc0 <pfifo_fast_dequeue>:
      fc0:       56                      push   %esi
      fc1:       89 c1                   mov    %eax,%ecx
      fc3:       53                      push   %ebx
      fc4:       8d 98 a0 00 00 00       lea    0xa0(%eax),%ebx
      fca:       39 98 a0 00 00 00       cmp    %ebx,0xa0(%eax)
      fd0:       89 da                   mov    %ebx,%edx
      fd2:       75 22                   jne    ff6 <pfifo_fast_dequeue+0x36>
      fd4:       8d 90 c4 00 00 00       lea    0xc4(%eax),%edx
      fda:       39 90 c4 00 00 00       cmp    %edx,0xc4(%eax)
      fe0:       89 d3                   mov    %edx,%ebx
      fe2:       75 12                   jne    ff6 <pfifo_fast_dequeue+0x36>
      fe4:       8d 98 e8 00 00 00       lea    0xe8(%eax),%ebx
      fea:       31 f6                   xor    %esi,%esi
      fec:       39 98 e8 00 00 00       cmp    %ebx,0xe8(%eax)
      ff2:       89 da                   mov    %ebx,%edx
      ff4:       74 27                   je     101d <pfifo_fast_dequeue+0x5d>
      ff6:       8b 32                   mov    (%edx),%esi
      ff8:       39 d6                   cmp    %edx,%esi
      ffa:       74 26                   je     1022 <pfifo_fast_dequeue+0x62>
      ffc:       8b 06                   mov    (%esi),%eax
      ffe:       ff 4b 08                decl   0x8(%ebx)
     1001:       c7 46 08 00 00 00 00    movl   $0x0,0x8(%esi)
     1008:       89 50 04                mov    %edx,0x4(%eax)
     100b:       89 02                   mov    %eax,(%edx)
     100d:       c7 46 04 00 00 00 00    movl   $0x0,0x4(%esi)
     1014:       c7 06 00 00 00 00       movl   $0x0,(%esi)
     101a:       ff 49 28                decl   0x28(%ecx)
     101d:       5b                      pop    %ebx
     101e:       89 f0                   mov    %esi,%eax
     1020:       5e                      pop    %esi
     1021:       c3                      ret
     1022:       ff 49 28                decl   0x28(%ecx)
     1025:       31 f6                   xor    %esi,%esi
     1027:       eb f4                   jmp    101d <pfifo_fast_dequeue+0x5d>
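
For readers who prefer C to disassembly, here is a rough sketch of the two shapes being compared: a loop over the three bands versus a hand-unrolled chain of emptiness checks that falls into one shared dequeue path. This is not the posted patch. `struct pkt`, `struct band`, `struct fifo3` and all function names are hypothetical stand-ins for the kernel's sk_buff lists, and it uses a simple singly linked list rather than the kernel's list implementation.

```c
#include <assert.h>
#include <stddef.h>

#define BANDS 3  /* assumption: three priority bands, as in pfifo_fast */

/* Hypothetical stand-ins for sk_buff and sk_buff_head. */
struct pkt {
    struct pkt *next;
    int id;
};

struct band {
    struct pkt *head;
    struct pkt *tail;
};

struct fifo3 {
    struct band bands[BANDS];
    unsigned int qlen;  /* packets queued across all bands */
};

/* Append a packet to the tail of the given priority band. */
static void fifo3_enqueue(struct fifo3 *q, int prio, struct pkt *p)
{
    struct band *b;

    assert(q != NULL && p != NULL && prio >= 0 && prio < BANDS);
    b = &q->bands[prio];
    p->next = NULL;
    if (b->tail)
        b->tail->next = p;
    else
        b->head = p;
    b->tail = p;
    q->qlen++;
}

/* Remove and return the head of a band known to be non-empty. */
static struct pkt *band_pop(struct band *b)
{
    struct pkt *p = b->head;

    b->head = p->next;
    if (!b->head)
        b->tail = NULL;
    p->next = NULL;
    return p;
}

/* Loop form: the compiler decides whether to unroll. */
static struct pkt *dequeue_loop(struct fifo3 *q)
{
    int prio;

    for (prio = 0; prio < BANDS; prio++) {
        struct band *b = &q->bands[prio];

        if (b->head) {
            q->qlen--;
            return band_pop(b);
        }
    }
    return NULL;
}

/* Hand-unrolled form: three checks, then one shared dequeue path. */
static struct pkt *dequeue_unrolled(struct fifo3 *q)
{
    struct band *b = &q->bands[0];

    if (!b->head) {
        b = &q->bands[1];
        if (!b->head) {
            b = &q->bands[2];
            if (!b->head)
                return NULL;
        }
    }
    q->qlen--;
    return band_pop(b);
}
```

Both functions return packets in the same order. The only difference is whether the unrolling is spelled out in the source or left to compiler flags.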


> 
> I just noticed that this is a local modification of my own, so in
> the vanilla tree it indeed doesn't have any impact on the code
> generated.
> 
> Still, your patch does not make sense to me. The latest tree
> also includes my pfifo_fast changes which modified the code to
> maintain a backlog and made it easy to add more fifos at compile
> time.  If you want the loop unrolled then let the compiler do it
> via -funroll-loops. These kinds of optimizations seem as unnecessary
> to me as all the loopback optimizations.
> 

I don't want to change compiler flags in my tree and lose this optimization when 2.6.13 is released.

I don't know about the loopback optimizations; I am not involved with that work. Maybe you are mistaking me for someone else?

It seems to me you are giving unrelated arguments.
I don't know what your plans are, but mine were not to say that you are writing bad code.
I just wanted to give my performance analysis and feedback. I'm sorry if it hurts you.


Eric Dumazet
