From: w@1wt.eu (Willy Tarreau)
To: linux-arm-kernel@lists.infradead.org
Subject: gcc miscompiles csum_tcpudp_magic() on ARMv5
Date: Thu, 12 Dec 2013 18:35:37 +0100
Message-ID: <20131212173537.GB2337@1wt.eu>
In-Reply-To: <20131212172049.GU4360@n2100.arm.linux.org.uk>
On Thu, Dec 12, 2013 at 05:20:49PM +0000, Russell King - ARM Linux wrote:
> On Thu, Dec 12, 2013 at 06:11:08PM +0100, Willy Tarreau wrote:
> > Another thing that can be done to improve the folding of the 16-bit
> > checksum is to swap the values to be added, sum them and only keep
> > the high half integer which already contains the carry. At least on
> > x86 I save some cycles doing this :
> >
> >           31:24  23:16  15:8  7:0
> >   sum32 = D      C      B     A
> >
> > To fold this into 16-bit at a time, I just do this :
> >
> >                   31:24  23:16               15:8  7:0
> >   sum32           D      C                   B     A
> > + sum32swapped    B      A                   D     C
> > =                 B+D    C+A+carry(B+D/C+A)  B+D   C+A
> >
> > so just take the upper result and you get the final 16-bit word at
> > once.
> >
> > In C it does :
> >
> > fold16 = (((sum32 >> 16) | (sum32 << 16)) + sum32) >> 16
> >
> > When the CPU has a rotate instruction, it's fast :-)
>
> Indeed - and if your CPU can do the rotate and add at the same time,
> it's just a single instruction, and it ends up looking remarkably
> similar to this:
>
> static inline __sum16 csum_fold(__wsum sum)
> {
> 	__asm__(
> 	"add	%0, %1, %1, ror #16	@ csum_fold"
> 	: "=r" (sum)
> 	: "r" (sum)
> 	: "cc");
> 	return (__force __sum16)(~(__force u32)sum >> 16);
> }
Marvelous :-)
Willy
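
For anyone who wants to try the swap-and-add fold outside the kernel,
here is a minimal portable C sketch. It is only an illustration: the
names swap_halves(), fold16(), fold16_ref(), csum_fold_portable() and
check_fold_equivalence() are made up for this example and are not
kernel functions, and plain uint32_t/uint16_t stand in for the
kernel's __wsum/__sum16. fold16_ref() is the usual two-step fold,
included only so the two approaches can be compared.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 32-bit word by 16 bits, i.e. swap its two 16-bit halves. */
static inline uint32_t swap_halves(uint32_t x)
{
	return (x >> 16) | (x << 16);
}

/*
 * Swap-and-add fold: after the addition, the upper half holds
 * hi + lo plus the carry out of the lower half, which is exactly
 * the end-around carry of a ones'-complement sum.
 */
static inline uint16_t fold16(uint32_t sum32)
{
	return (uint16_t)((swap_halves(sum32) + sum32) >> 16);
}

/* Conventional fold for comparison: add the halves, then fold the carry. */
static inline uint16_t fold16_ref(uint32_t sum32)
{
	uint32_t s = (sum32 & 0xffff) + (sum32 >> 16);

	s = (s & 0xffff) + (s >> 16);
	return (uint16_t)s;
}

/* Complemented fold, mirroring what the quoted csum_fold() returns. */
static inline uint16_t csum_fold_portable(uint32_t sum)
{
	return (uint16_t)(~(swap_halves(sum) + sum) >> 16);
}

/* Spot-check that both folds agree on a few edge and sample values. */
static inline void check_fold_equivalence(void)
{
	static const uint32_t samples[] = {
		0x00000000u, 0x00010002u, 0xffff0001u,
		0xffffffffu, 0x12345678u, 0x8000ffffu,
	};
	unsigned int i;

	for (i = 0; i < sizeof(samples) / sizeof(samples[0]); i++)
		assert(fold16(samples[i]) == fold16_ref(samples[i]));
}
```

On a compiler that recognizes the shift/or pattern as a rotate,
fold16() should come down to a rotate, an add and a shift, but that
depends on the compiler and target, so check the generated code.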