* [PATCH 1/2] ipmr: delete redundant variable
From: Wang Chen @ 2008-07-23  1:45 UTC
  To: David S. Miller; +Cc: NETDEV

*v can be removed, as this patch shows.

Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com>
---
diff --git a/net/ipv4/ipmr.c b/net/ipv4/ipmr.c
index c519b8d..6e715c7 100644
--- a/net/ipv4/ipmr.c
+++ b/net/ipv4/ipmr.c
@@ -1121,7 +1121,6 @@ int ipmr_ioctl(struct sock *sk, int cmd, void __user *arg)
 static int ipmr_device_event(struct notifier_block *this, unsigned long event, void *ptr)
 {
 	struct net_device *dev = ptr;
-	struct vif_device *v;
 	int ct;
 
 	if (!net_eq(dev_net(dev), &init_net))
@@ -1129,9 +1128,9 @@ static int ipmr_device_event(struct notifier_block *this, unsigned long event, v
 
 	if (event != NETDEV_UNREGISTER)
 		return NOTIFY_DONE;
-	v=&vif_table[0];
-	for (ct=0;ct<maxvif;ct++,v++) {
-		if (v->dev==dev)
+
+	for (ct = 0; ct < maxvif; ct++) {
+		if (vif_table[ct].dev == dev)
 			vif_delete(ct, 1);
 	}
 	return NOTIFY_DONE;

* Re: [PATCH 1/2] ipmr: delete redundant variable
From: Ingo Oeser @ 2008-07-23  8:03 UTC
  To: Wang Chen; +Cc: David S. Miller, NETDEV

Hi Wang Chen,

Wang Chen wrote:
> *v can be removed, as this patch shows.

You are right, but did you check the resulting asm?

> Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com>
> ---
> diff --git a/net/ipv4/ipmr.c b/net/ipv4/ipmr.c
> index c519b8d..6e715c7 100644
> --- a/net/ipv4/ipmr.c
> +++ b/net/ipv4/ipmr.c
> @@ -1129,9 +1128,9 @@ static int ipmr_device_event(struct notifier_block *this, unsigned long event, v
>  
>  	if (event != NETDEV_UNREGISTER)
>  		return NOTIFY_DONE;
> -	v=&vif_table[0];
> -	for (ct=0;ct<maxvif;ct++,v++) {
> -		if (v->dev==dev)

This is ptr += sizeof(vif_table[0])

> +
> +	for (ct = 0; ct < maxvif; ct++) {
> +		if (vif_table[ct].dev == dev)

This is ptr + ct * sizeof(vif_table[0])

On architectures where the second addressing variant is
not supported, it spills a register for the multiply/shift.

But the second variant could easily be auto-vectorized
if we had no if.

So just check the asm on a CISC and a RISC architecture
with a cross compile before you transform these patterns.

Maybe GCC even transforms one into the other these days :-)


Best Regards

Ingo Oeser

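The two addressing patterns contrasted in this message can be put side by side in a small user-space sketch. Everything here is an assumption made for illustration: `struct vif_like`, `table`, `MAXVIF`, `find_by_pointer` and `find_by_index` are hypothetical names, not the kernel's `vif_device` or `vif_table`.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct vif_device; not the kernel definition. */
struct vif_like {
	const void *dev;
	int link;
};

#define MAXVIF 8
struct vif_like table[MAXVIF];

/* Pointer-walk form: the address advances by sizeof(table[0]) per pass. */
int find_by_pointer(const void *dev)
{
	const struct vif_like *v = &table[0];
	int ct;

	for (ct = 0; ct < MAXVIF; ct++, v++)
		if (v->dev == dev)
			return ct;
	return -1;
}

/* Indexed form: the address is base + ct * sizeof(table[0]). */
int find_by_index(const void *dev)
{
	int ct;

	for (ct = 0; ct < MAXVIF; ct++)
		if (table[ct].dev == dev)
			return ct;
	return -1;
}
```

Both functions visit the same elements in the same order, so they return the same index; whether they also compile to the same instructions depends on the compiler and target, which is the question raised here.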
* Re: [PATCH 1/2] ipmr: delete redundant variable
From: Wang Chen @ 2008-07-23  9:35 UTC
  To: Ingo Oeser; +Cc: David S. Miller, NETDEV

Ingo Oeser said the following on 2008-7-23 16:03:
> Hi Wang Chen,
>
> Wang Chen wrote:
>> *v can be removed, as this patch shows.
>
> You are right, but did you check the resulting asm?
>
>> Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com>
>> ---
>> diff --git a/net/ipv4/ipmr.c b/net/ipv4/ipmr.c
>> index c519b8d..6e715c7 100644
>> --- a/net/ipv4/ipmr.c
>> +++ b/net/ipv4/ipmr.c
>> @@ -1129,9 +1128,9 @@ static int ipmr_device_event(struct notifier_block *this, unsigned long event, v
>>  
>>  	if (event != NETDEV_UNREGISTER)
>>  		return NOTIFY_DONE;
>> -	v=&vif_table[0];
>> -	for (ct=0;ct<maxvif;ct++,v++) {
>> -		if (v->dev==dev)
>
> This is ptr += sizeof(vif_table[0])
>
>> +
>> +	for (ct = 0; ct < maxvif; ct++) {
>> +		if (vif_table[ct].dev == dev)
>
> This is ptr + ct * sizeof(vif_table[0])
>
> On architectures where the second addressing variant is
> not supported, it spills a register for the multiply/shift.
>

But "accessing an entry of a table by index" is always allowed,
right?
If the compiler generates an address computation that spills a
register for the multiply/shift, then simple code like the
following would be a bug too:
i = table[100].field;
But it shouldn't be, right? :)

> But the second variant could easily be auto-vectorized
> if we had no if.
>
> So just check the asm on a CISC and a RISC architecture
> with a cross compile before you transform these patterns.
>
> Maybe GCC even transforms one into the other these days :-)

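For the constant-index case mentioned in this message, the byte offset of `table[100].field` is known at compile time, so no runtime multiply is needed in principle. A minimal sketch, assuming a hypothetical `struct entry` and `table` that are not taken from the kernel:

```c
#include <stddef.h>

/* Hypothetical table element, used only for this illustration. */
struct entry {
	long pad[4];
	int field;
};

struct entry table[128];

/* Constant index: base address plus a fixed byte offset. */
int read_constant(void)
{
	return table[100].field;
}

/* The same fixed byte offset, written out by hand. */
size_t constant_offset(void)
{
	return 100 * sizeof(struct entry) + offsetof(struct entry, field);
}

/* Reads the same int through the hand-computed offset. */
int read_via_offset(void)
{
	return *(const int *)((const char *)table + constant_offset());
}
```

This covers only the constant-index case; a loop variable as index is what the pointer versus index comparison in this thread is about.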
* Re: [PATCH 1/2] ipmr: delete redundant variable
From: Ingo Oeser @ 2008-07-23 12:05 UTC
  To: Wang Chen; +Cc: David S. Miller, NETDEV

Hi Wang Chen,

Wang Chen wrote:
> But "accessing an entry of a table by index" is always allowed,
> right?
> If the compiler generates an address computation that spills a
> register for the multiply/shift, then simple code like the
> following would be a bug too:
> i = table[100].field;
> But it shouldn't be, right? :)

I'm NOT telling you that your transformation introduces a BUG.
It is semantically perfectly equivalent.

I'm trying to tell you that it might not lead to the same or better
performance and might thus not be worth it.

But please check the generated assembly yourself on a CISC and a RISC
machine to get an idea of the effects. It will be a nice learning
experience, one I have already enjoyed myself.


Best Regards

Ingo Oeser

* Re: [PATCH 1/2] ipmr: delete redundant variable
From: Wang Chen @ 2008-07-23 15:16 UTC
  To: Ingo Oeser; +Cc: David S. Miller, NETDEV

Ingo Oeser said the following on 2008-7-23 20:05:
> Hi Wang Chen,
>
> Wang Chen wrote:
>> But "accessing an entry of a table by index" is always allowed,
>> right?
>> If the compiler generates an address computation that spills a
>> register for the multiply/shift, then simple code like the
>> following would be a bug too:
>> i = table[100].field;
>> But it shouldn't be, right? :)
>
> I'm NOT telling you that your transformation introduces a BUG.
> It is semantically perfectly equivalent.
>
> I'm trying to tell you that it might not lead to the same or better
> performance and might thus not be worth it.
>

Agreed. I also think that access by index might lead to worse performance.
But in this code we don't care about performance, since it is only called
when a device is unregistered. :)

> But please check the generated assembly yourself on a CISC and a RISC
> machine to get an idea of the effects. It will be a nice learning
> experience, one I have already enjoyed myself.
>

Sure. I am doing it.

* Re: [PATCH 1/2] ipmr: delete redundant variable
From: Wang Chen @ 2008-07-24  7:37 UTC
  To: Ingo Oeser; +Cc: David S. Miller, NETDEV

Ingo Oeser said the following on 2008-7-23 20:05:
> But please check the generated assembly yourself on a CISC and a RISC
> machine to get an idea of the effects. It will be a nice learning
> experience, one I have already enjoyed myself.
>

I did the experiment.
I used the following C code to compare the two approaches, and the
result is that they perform the same.

----main.c
#define maxvif 32
struct vif {
	int *dev;
	unsigned long bytes_in, bytyes_out;
	unsigned long pkt_in, pkt_out;
	unsigned long rate_limit;
	unsigned char threshhold;
	unsigned short flags;
	int local, remote;
	int link;
};

struct vif vif_table[maxvif];

int main()
{
	struct vif *v;
	int ct;

	v = &vif_table[0];
	for (ct = 0; ct < maxvif; ct++, v++)
		if(v->link==1)
			break;
	return 0;
}
---

---main2.c
#define maxvif 32
struct vif {
	int *dev;
	unsigned long bytes_in, bytyes_out;
	unsigned long pkt_in, pkt_out;
	unsigned long rate_limit;
	unsigned char threshhold;
	unsigned short flags;
	int local, remote;
	int link;
};

struct vif vif_table[maxvif];

int main()
{
	struct vif *v;
	int ct;

	v = &vif_table[0];
	for (ct = 0; ct < maxvif; ct++)
		if(vif_table[ct].link==1)
			break;
	return 0;
}
---

Use gcc -S -O2 to compile:

---x86 asm main.s
	.file	"main.c"
	.text
	.p2align 4,,15
.globl main
	.type	main, @function
main:
	leal	4(%esp), %ecx
	andl	$-16, %esp
	pushl	-4(%ecx)
	movl	$vif_table, %eax
	pushl	%ebp
	movl	%esp, %ebp
	pushl	%ecx
	jmp	.L2
	.p2align 4,,7
.L8:
	cmpl	$vif_table+1240, %eax
	je	.L3
	addl	$40, %eax
.L2:
	cmpl	$1, 36(%eax)
	jne	.L8
.L3:
	popl	%ecx
	xorl	%eax, %eax
	popl	%ebp
	leal	-4(%ecx), %esp
	ret
	.size	main, .-main
	.comm	vif_table,1280,32
	.ident	"GCC: (GNU) 4.1.2 20070115 (prerelease) (SUSE Linux)"
	.section	.note.GNU-stack,"",@progbits
---

---x86 asm main2.s
	.file	"main2.c"
	.text
	.p2align 4,,15
.globl main
	.type	main, @function
main:
	leal	4(%esp), %ecx
	andl	$-16, %esp
	pushl	-4(%ecx)
	xorl	%eax, %eax
	pushl	%ebp
	movl	%esp, %ebp
	pushl	%ecx
	jmp	.L2
	.p2align 4,,7
.L8:
	addl	$40, %eax
	cmpl	$1280, %eax
	je	.L3
.L2:
	cmpl	$1, vif_table+36(%eax)
	jne	.L8
.L3:
	popl	%ecx
	xorl	%eax, %eax
	popl	%ebp
	leal	-4(%ecx), %esp
	ret
	.size	main, .-main
	.comm	vif_table,1280,32
	.ident	"GCC: (GNU) 4.1.2 20070115 (prerelease) (SUSE Linux)"
	.section	.note.GNU-stack,"",@progbits
---

In the loop, main.s and main2.s differ only in the following instructions:
main.s :
	cmpl	$vif_table+1240, %eax
	cmpl	$1, 36(%eax)
main2.s:
	cmpl	$1280, %eax
	cmpl	$1, vif_table+36(%eax)
This difference should not affect performance.

OK. Here is the asm on SPARC (compiled natively, not cross-compiled):

---main.s
	.global main
main:
/* 000000	  21 */		sethi	%hi(vif_table),%o5
/* 0x0004	  22 */		or	%g0,0,%o4
/* 0x0008	  21 */		add	%o5,%lo(vif_table),%o3
/* 0x000c	  23 */		ld	[%o3+36],%o5
.L900000106:
/* 0x0010	  23 */		cmp	%o5,1
/* 0x0014	     */		be,pn	%icc,.L77000028
/* 0x0018	  22 */		add	%o4,1,%o4
.L77000025:
/* 0x001c	  22 */		add	%o3,40,%o3
/* 0x0020	     */		cmp	%o4,32
/* 0x0024	     */		bl,a,pt	%icc,.L900000106
/* 0x0028	  23 */		ld	[%o3+36],%o5
.L77000028:
/* 0x002c	  22 */		retl	! Result = %o0
/* 0x0030	     */		or	%g0,0,%o0
/* 0x0034	   0 */		.type	main,2
/* 0x0034	   0 */		.size	main,(.-main)
/* 0x0034	   0 */		.global	__fsr_init_value
/* 0x0034	     */		__fsr_init_value=0
---

---main2.s
	.global main
main:
/* 000000	  22 */		sethi	%hi(vif_table+36),%o5
/* 0x0004	     */		or	%g0,0,%o3
/* 0x0008	     */		add	%o5,%lo(vif_table+36),%o4
/* 0x000c	  23 */		ld	[%o5+%lo(vif_table+36)],%o5
.L900000106:
/* 0x0010	  23 */		cmp	%o5,1
/* 0x0014	     */		be,pn	%icc,.L77000028
/* 0x0018	  22 */		add	%o4,40,%o4
.L77000025:
/* 0x001c	  22 */		add	%o3,1,%o3
/* 0x0020	     */		cmp	%o3,32
/* 0x0024	     */		bl,a,pt	%icc,.L900000106
/* 0x0028	  23 */		ld	[%o4],%o5
.L77000028:
/* 0x002c	  22 */		retl	! Result = %o0
/* 0x0030	     */		or	%g0,0,%o0
/* 0x0034	   0 */		.type	main,2
/* 0x0034	   0 */		.size	main,(.-main)
/* 0x0034	   0 */		.global	__fsr_init_value
/* 0x0034	     */		__fsr_init_value=0
---

In the loop, both versions use ptr += sizeof(struct).

So we can conclude that current compilers do optimize indexed access.
:)

Ingo, if you have a different opinion, I would appreciate it if you shared it. :)

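Read back into C, the x86 main2.s loop above steps a byte offset by 40 and compares it against 1280 instead of multiplying the index on every pass. The sketch below is a hand-written approximation of that strength reduction, not compiler output. The function names `find_indexed` and `find_strength_reduced` are hypothetical, the struct field names are spelled slightly differently from the experiment (same types and order), and returning the index instead of breaking out of `main` is a change made so the two loops can be compared.

```c
#include <stddef.h>

#define MAXVIF 32

/* Same member types and order as the struct used in the experiment. */
struct vif {
	int *dev;
	unsigned long bytes_in, bytes_out;
	unsigned long pkt_in, pkt_out;
	unsigned long rate_limit;
	unsigned char threshold;
	unsigned short flags;
	int local, remote;
	int link;
};

struct vif vif_table[MAXVIF];

/* Indexed loop as written in main2.c: first index whose link is 1. */
int find_indexed(void)
{
	int ct;

	for (ct = 0; ct < MAXVIF; ct++)
		if (vif_table[ct].link == 1)
			return ct;
	return -1;
}

/*
 * Hand-written equivalent: ct * sizeof(struct vif) becomes a byte
 * offset that grows by sizeof(struct vif) per pass and is compared
 * against the total table size.
 */
int find_strength_reduced(void)
{
	const char *base = (const char *)vif_table;
	size_t off;

	for (off = 0; off != sizeof(vif_table); off += sizeof(struct vif)) {
		const struct vif *v = (const struct vif *)(base + off);

		if (v->link == 1)
			return (int)(off / sizeof(struct vif));
	}
	return -1;
}
```

On the 32-bit targets shown in the experiment, `sizeof(struct vif)` is 40 and `sizeof(vif_table)` is 1280, matching the constants in the listing; on other ABIs the sizes differ but the two loops still agree.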
* Re: [PATCH 1/2] ipmr: delete redundant variable
From: Ingo Oeser @ 2008-07-25 17:36 UTC
  To: Wang Chen; +Cc: David S. Miller, NETDEV

Hi Wang Chen,

Wang Chen wrote:
> Ingo Oeser said the following on 2008-7-23 20:05:
> > But please check the generated assembly yourself on a CISC and a RISC
> > machine to get an idea of the effects. It will be a nice learning
> > experience, one I have already enjoyed myself.
> >
>
> I did the experiment.
[..]
> In the loop, both versions use ptr += sizeof(struct).
>
> So we can conclude that current compilers do optimize indexed access.
> :)
>
> Ingo, if you have a different opinion, I would appreciate it if you shared it. :)

Great! Compilers have improved a lot here :-)

Many thanks for doing this experiment. Now you and others can
point anyone who questions this fact to your experiment
and take it as a reference for similar changes.

That is a great help for the community, I think!


Best Regards

Ingo Oeser
