* e1000: include <net/ip6_checksum.h> for IA64
From: Auke Kok @ 2006-11-08 17:48 UTC
To: Jeff Garzik
Cc: Andrew Morton, NetDev, Linux Kernel Mailing List, Jesse Brandeburg

Here's a slightly better patch to fix ia64 not building atm.

Jeff, please apply this to netdev-2.6#upstream instead of akpm's patch that I acked earlier.

Of course, someone really should come up with an asm version for ia64 of the missing
function ;)

Cheers,

Auke

---

e1000: include <net/ip6_checksum.h> for IA64

IA64 does not have an optimized asm version for ipv6 csum magic. Fall
back to generic implementation.

Signed-off-by: Auke Kok <auke-jan.h.kok@intel.com>

diff --git a/drivers/net/e1000/e1000.h b/drivers/net/e1000/e1000.h
index f091042..26e7506 100644
--- a/drivers/net/e1000/e1000.h
+++ b/drivers/net/e1000/e1000.h
@@ -61,6 +61,7 @@
 #include <linux/ip.h>
 #ifdef NETIF_F_TSO6
 #include <linux/ipv6.h>
+#include <net/ip6_checksum.h>
 #endif
 #include <linux/tcp.h>
 #include <linux/udp.h>
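[Editorial sketch, not part of the original message.] The "fall back to generic implementation" wording relies on a compile-time guard: the generic C routine is compiled only when the architecture header has not announced its own version. The macro name `_HAVE_ARCH_IPV6_CSUM` is taken from the ia64 `checksum.h` hunk posted later in this thread. The function `csum_impl_name` is a hypothetical stand-in used only to make the selection visible, and the exact layout of `<net/ip6_checksum.h>` is assumed here rather than quoted.

```c
/*
 * An arch header would define this macro when it supplies its own routine.
 * It is left undefined here, so the generic branch below is the one compiled.
 */
/* #define _HAVE_ARCH_IPV6_CSUM 1 */

#ifndef _HAVE_ARCH_IPV6_CSUM
/* Stand-in for the generic C fallback (hypothetical name). */
static const char *csum_impl_name(void)
{
	return "generic";
}
#else
/* Stand-in for an arch-provided routine (hypothetical name). */
static const char *csum_impl_name(void)
{
	return "arch";
}
#endif
```

With the macro undefined, as on ia64 before an asm version existed, `csum_impl_name()` returns "generic". Uncommenting the `#define` flips the selection.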
* Re: e1000: include <net/ip6_checksum.h> for IA64
From: Andrew Morton @ 2006-11-08 19:06 UTC
To: Auke Kok
Cc: Jeff Garzik, NetDev, Linux Kernel Mailing List, Jesse Brandeburg

On Wed, 08 Nov 2006 09:48:35 -0800 Auke Kok <auke-jan.h.kok@intel.com> wrote:

> Here's a slightly better patch to fix ia64 not building atm.

fsvo "better".

> Jeff, please apply this to netdev-2.6#upstream instead of akpm's patch that I acked earlier.
>
> Of course, someone really should come up with an asm version for ia64 of the missing
> function ;)
>
> Cheers,
>
> Auke
>
> ---
>
> e1000: include <net/ip6_checksum.h> for IA64
>
> IA64 does not have an optimized asm version for ipv6 csum magic. Fall
> back to generic implementation.
>
> Signed-off-by: Auke Kok <auke-jan.h.kok@intel.com>
>
> diff --git a/drivers/net/e1000/e1000.h b/drivers/net/e1000/e1000.h
> index f091042..26e7506 100644
> --- a/drivers/net/e1000/e1000.h
> +++ b/drivers/net/e1000/e1000.h
> @@ -61,6 +61,7 @@
>  #include <linux/ip.h>
>  #ifdef NETIF_F_TSO6
>  #include <linux/ipv6.h>
> +#include <net/ip6_checksum.h>
>  #endif
>  #include <linux/tcp.h>
>  #include <linux/udp.h>

It is noxious of e1000 to do a #include <everything.h> from its driver-wide
header file and I refused to be a party to such a thing!

Jeff probably won't be able to apply this because, like your other patches,
it is space-stuffed. Then again, maybe git understands format=flowed, dunno.
* RE: e1000: include <net/ip6_checksum.h> for IA64
From: Chen, Kenneth W @ 2006-11-09 0:10 UTC
To: Kok, Auke-jan H, Jeff Garzik
Cc: Andrew Morton, NetDev, Linux Kernel Mailing List, Brandeburg, Jesse, Luck, Tony, linux-ia64

Auke Kok wrote on Wednesday, November 08, 2006 9:49 AM
> Of course, someone really should come up with an asm version for ia64 of the
> missing function ;)

Sure, absolutely. Here is an implementation for ia64. Tested heavily. Tony, please
merge.

- Ken

[patch] implement csum_ipv6_magic for ia64. The asm version is 3.4 times faster than
the generic version.

--- ./arch/ia64/lib/ip_fast_csum.S.orig	2006-11-08 12:26:28.000000000 -0800
+++ ./arch/ia64/lib/ip_fast_csum.S	2006-11-08 16:39:24.000000000 -0800
@@ -8,8 +8,8 @@
  * in0: address of buffer to checksum (char *)
  * in1: length of the buffer (int)
  *
- * Copyright (C) 2002 Intel Corp.
- * Copyright (C) 2002 Ken Chen <kenneth.w.chen@intel.com>
+ * Copyright (C) 2002, 2006 Intel Corp.
+ * Copyright (C) 2002, 2006 Ken Chen <kenneth.w.chen@intel.com>
  */

 #include <asm/asmmacro.h>
@@ -25,6 +25,9 @@

 #define in0 r32
 #define in1 r33
+#define in2 r34
+#define in3 r35
+#define in4 r36
 #define ret0 r8

 GLOBAL_ENTRY(ip_fast_csum)
@@ -88,3 +91,51 @@ GLOBAL_ENTRY(ip_fast_csum)
 	mov b0=r34
 	br.ret.sptk.many b0
 END(ip_fast_csum)
+
+GLOBAL_ENTRY(csum_ipv6_magic)
+	ld4 r20=[in0],4
+	ld4 r21=[in1],4
+	dep r15=in2,in3,16,16
+	;;
+	ld4 r22=[in0],4
+	ld4 r23=[in1],4
+	mux1 r15=r15,@rev
+	;;
+	ld4 r24=[in0],4
+	ld4 r25=[in1],4
+	shr.u r15=r15,32
+	add r16=r20,r21
+	add r17=r22,r23
+	;;
+	ld4 r26=[in0],4
+	ld4 r27=[in1],4
+	add r18=r24,r25
+	add r8=r16,r17
+	;;
+	add r19=r26,r27
+	add r8=r8,r18
+	;;
+	add r8=r8,r19
+	add r15=r15,in4
+	;;
+	add r8=r8,r15
+	;;
+	shr.u r10=r8,16	// now fold sum into short
+	zxt2 r11=r8
+	;;
+	add r8=r10,r11
+	;;
+	shr.u r10=r8,16	// yeah, keep it rolling
+	zxt2 r11=r8
+	;;
+	add r8=r10,r11
+	;;
+	shr.u r10=r8,16	// three times lucky
+	zxt2 r11=r8
+	;;
+	add r8=r10,r11
+	mov r9=0xffff
+	;;
+	andcm r8=r9,r8
+	br.ret.sptk.many b0
+END(csum_ipv6_magic)
--- ./include/asm-ia64/checksum.h.orig	2006-11-08 16:52:16.000000000 -0800
+++ ./include/asm-ia64/checksum.h	2006-11-08 17:01:09.000000000 -0800
@@ -73,4 +73,10 @@ csum_fold (unsigned int sum)
 	return ~sum;
 }

+#define _HAVE_ARCH_IPV6_CSUM 1
+struct in6_addr;
+extern unsigned short int csum_ipv6_magic(struct in6_addr *saddr,
+	struct in6_addr *daddr, __u32 len, unsigned short proto,
+	unsigned int csum);
+
 #endif /* _ASM_IA64_CHECKSUM_H */
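[Editorial sketch, not part of the original message.] For readers who do not read ia64 assembly, the arithmetic in the patch above can be approximated in portable C: sum the source address, destination address, length, protocol and incoming partial sum, fold the wide accumulator into 16 bits, then complement. The names `ipv6_pseudo_csum`, `sum_addr` and `fold16` are hypothetical. Unlike the asm, which does native 32-bit loads and returns a value in network byte order, this version works on big-endian 16-bit words and returns the checksum as a plain host integer, so it illustrates the technique and is not a drop-in replacement. It also adds both halves of `len`, whereas the `dep ... 16,16` instruction, as read here, takes only the low 16 bits.

```c
#include <stddef.h>
#include <stdint.h>

/* Fold a wide one's-complement accumulator down to 16 bits. */
static uint16_t fold16(uint64_t sum)
{
	while (sum >> 16)
		sum = (sum >> 16) + (sum & 0xffff);
	return (uint16_t)sum;
}

/* Add a 16-byte IPv6 address as eight big-endian 16-bit words. */
static uint64_t sum_addr(const uint8_t addr[16])
{
	uint64_t sum = 0;

	for (size_t i = 0; i < 16; i += 2)
		sum += ((uint64_t)addr[i] << 8) | addr[i + 1];
	return sum;
}

/*
 * Checksum over the IPv6 pseudo-header fields.
 * csum is a partial sum of 16-bit words already accumulated by the caller
 * (0 if there is none). Hypothetical interface, for illustration only.
 */
uint16_t ipv6_pseudo_csum(const uint8_t saddr[16], const uint8_t daddr[16],
			  uint32_t len, uint8_t proto, uint32_t csum)
{
	uint64_t sum = csum;

	sum += sum_addr(saddr);
	sum += sum_addr(daddr);
	sum += (len >> 16) + (len & 0xffff);	/* 32-bit upper-layer length */
	sum += proto;				/* next-header value */

	return (uint16_t)~fold16(sum);
}
```

As a worked case, for source and destination both `::1`, a length of 8, protocol 17 (UDP) and no partial sum, the words add up to 1 + 1 + 8 + 17 = 27 (0x1b), so the function returns 0xffe4.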
* RE: e1000: include <net/ip6_checksum.h> for IA64 @ 2006-11-09 0:32 ` Chen, Kenneth W 0 siblings, 0 replies; 8+ messages in thread From: Chen, Kenneth W @ 2006-11-09 0:32 UTC (permalink / raw) To: Kok, Auke-jan H, 'Jeff Garzik' Cc: 'Andrew Morton', 'NetDev', 'Linux Kernel Mailing List', Brandeburg, Jesse, Luck, Tony, linux-ia64 Chen, Kenneth wrote on Wednesday, November 08, 2006 4:10 PM > Auke Kok wrote on Wednesday, November 08, 2006 9:49 AM > > Of course, someone really should come up with an asm version for ia64 of the > > missing function ;) > > Sure, absolutely. Here is an implementation for ia64. Tested heavily. Tony, please > merge. Hmm. Forgot about the signed-off line. Here it is: Signed-off-by: Ken Chen <kenneth.w.chen@intel.com> [patch] implement csum_ipv6_magic for ia64. The asm version is 3.4 times faster than the generic version. --- ./arch/ia64/lib/ip_fast_csum.S.orig 2006-11-08 12:26:28.000000000 -0800 +++ ./arch/ia64/lib/ip_fast_csum.S 2006-11-08 16:39:24.000000000 -0800 @@ -8,8 +8,8 @@ * in0: address of buffer to checksum (char *) * in1: length of the buffer (int) * - * Copyright (C) 2002 Intel Corp. - * Copyright (C) 2002 Ken Chen <kenneth.w.chen@intel.com> + * Copyright (C) 2002, 2006 Intel Corp. 
+ * Copyright (C) 2002, 2006 Ken Chen <kenneth.w.chen@intel.com> */ #include <asm/asmmacro.h> @@ -25,6 +25,9 @@ #define in0 r32 #define in1 r33 +#define in2 r34 +#define in3 r35 +#define in4 r36 #define ret0 r8 GLOBAL_ENTRY(ip_fast_csum) @@ -88,3 +91,51 @@ GLOBAL_ENTRY(ip_fast_csum) mov b0=r34 br.ret.sptk.many b0 END(ip_fast_csum) + +GLOBAL_ENTRY(csum_ipv6_magic) + ld4 r20=[in0],4 + ld4 r21=[in1],4 + dep r15=in2,in3,16,16 + ;; + ld4 r22=[in0],4 + ld4 r23=[in1],4 + mux1 r15=r15,@rev + ;; + ld4 r24=[in0],4 + ld4 r25=[in1],4 + shr.u r15=r15,32 + add r16=r20,r21 + add r17=r22,r23 + ;; + ld4 r26=[in0],4 + ld4 r27=[in1],4 + add r18=r24,r25 + add r8=r16,r17 + ;; + add r19=r26,r27 + add r8=r8,r18 + ;; + add r8=r8,r19 + add r15=r15,in4 + ;; + add r8=r8,r15 + ;; + shr.u r10=r8,16 // now fold sum into short + zxt2 r11=r8 + ;; + add r8=r10,r11 + ;; + shr.u r10=r8,16 // yeah, keep it rolling + zxt2 r11=r8 + ;; + add r8=r10,r11 + ;; + shr.u r10=r8,16 // three times lucky + zxt2 r11=r8 + ;; + add r8=r10,r11 + mov r9=0xffff + ;; + andcm r8=r9,r8 + br.ret.sptk.many b0 +END(csum_ipv6_magic) --- ./include/asm-ia64/checksum.h.orig 2006-11-08 16:52:16.000000000 -0800 +++ ./include/asm-ia64/checksum.h 2006-11-08 17:01:09.000000000 -0800 @@ -73,4 +73,10 @@ csum_fold (unsigned int sum) return ~sum; } +#define _HAVE_ARCH_IPV6_CSUM 1 +struct in6_addr; +extern unsigned short int csum_ipv6_magic(struct in6_addr *saddr, + struct in6_addr *daddr, __u32 len, unsigned short proto, + unsigned int csum); + #endif /* _ASM_IA64_CHECKSUM_H */ ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: e1000: include <net/ip6_checksum.h> for IA64 2006-11-09 0:32 ` Chen, Kenneth W @ 2006-11-09 2:07 ` Auke Kok -1 siblings, 0 replies; 8+ messages in thread From: Auke Kok @ 2006-11-09 2:07 UTC (permalink / raw) To: Chen, Kenneth W Cc: Kok, Auke-jan H, 'Jeff Garzik', 'Andrew Morton', 'NetDev', 'Linux Kernel Mailing List', Brandeburg, Jesse, Luck, Tony, linux-ia64 Chen, Kenneth W wrote: > Chen, Kenneth wrote on Wednesday, November 08, 2006 4:10 PM >> Auke Kok wrote on Wednesday, November 08, 2006 9:49 AM >>> Of course, someone really should come up with an asm version for ia64 of the >>> missing function ;) >> Sure, absolutely. Here is an implementation for ia64. Tested heavily. Tony, please > merge. Applauded-by: Auke Kok <auke-jan.h.kok@intel.com> ;) Cheers, Auke > > > Hmm. Forgot about the signed-off line. Here it is: > > Signed-off-by: Ken Chen <kenneth.w.chen@intel.com> > > > > [patch] implement csum_ipv6_magic for ia64. The asm version is 3.4 times faster than > the generic version. > > --- ./arch/ia64/lib/ip_fast_csum.S.orig 2006-11-08 12:26:28.000000000 -0800 > +++ ./arch/ia64/lib/ip_fast_csum.S 2006-11-08 16:39:24.000000000 -0800 > @@ -8,8 +8,8 @@ > * in0: address of buffer to checksum (char *) > * in1: length of the buffer (int) > * > - * Copyright (C) 2002 Intel Corp. > - * Copyright (C) 2002 Ken Chen <kenneth.w.chen@intel.com> > + * Copyright (C) 2002, 2006 Intel Corp. 
> + * Copyright (C) 2002, 2006 Ken Chen <kenneth.w.chen@intel.com> > */ > > #include <asm/asmmacro.h> > @@ -25,6 +25,9 @@ > > #define in0 r32 > #define in1 r33 > +#define in2 r34 > +#define in3 r35 > +#define in4 r36 > #define ret0 r8 > > GLOBAL_ENTRY(ip_fast_csum) > @@ -88,3 +91,51 @@ GLOBAL_ENTRY(ip_fast_csum) > mov b0=r34 > br.ret.sptk.many b0 > END(ip_fast_csum) > + > +GLOBAL_ENTRY(csum_ipv6_magic) > + ld4 r20=[in0],4 > + ld4 r21=[in1],4 > + dep r15=in2,in3,16,16 > + ;; > + ld4 r22=[in0],4 > + ld4 r23=[in1],4 > + mux1 r15=r15,@rev > + ;; > + ld4 r24=[in0],4 > + ld4 r25=[in1],4 > + shr.u r15=r15,32 > + add r16=r20,r21 > + add r17=r22,r23 > + ;; > + ld4 r26=[in0],4 > + ld4 r27=[in1],4 > + add r18=r24,r25 > + add r8=r16,r17 > + ;; > + add r19=r26,r27 > + add r8=r8,r18 > + ;; > + add r8=r8,r19 > + add r15=r15,in4 > + ;; > + add r8=r8,r15 > + ;; > + shr.u r10=r8,16 // now fold sum into short > + zxt2 r11=r8 > + ;; > + add r8=r10,r11 > + ;; > + shr.u r10=r8,16 // yeah, keep it rolling > + zxt2 r11=r8 > + ;; > + add r8=r10,r11 > + ;; > + shr.u r10=r8,16 // three times lucky > + zxt2 r11=r8 > + ;; > + add r8=r10,r11 > + mov r9=0xffff > + ;; > + andcm r8=r9,r8 > + br.ret.sptk.many b0 > +END(csum_ipv6_magic) > --- ./include/asm-ia64/checksum.h.orig 2006-11-08 16:52:16.000000000 -0800 > +++ ./include/asm-ia64/checksum.h 2006-11-08 17:01:09.000000000 -0800 > @@ -73,4 +73,10 @@ csum_fold (unsigned int sum) > return ~sum; > } > > +#define _HAVE_ARCH_IPV6_CSUM 1 > +struct in6_addr; > +extern unsigned short int csum_ipv6_magic(struct in6_addr *saddr, > + struct in6_addr *daddr, __u32 len, unsigned short proto, > + unsigned int csum); > + > #endif /* _ASM_IA64_CHECKSUM_H */ > - > To unsubscribe from this list: send the line "unsubscribe netdev" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 8+ messages in thread