From: linux@arm.linux.org.uk (Russell King - ARM Linux)
To: linux-arm-kernel@lists.infradead.org
Subject: gcc miscompiles csum_tcpudp_magic() on ARMv5
Date: Thu, 12 Dec 2013 17:20:49 +0000 [thread overview]
Message-ID: <20131212172049.GU4360@n2100.arm.linux.org.uk> (raw)
In-Reply-To: <20131212171108.GA2337@1wt.eu>
On Thu, Dec 12, 2013 at 06:11:08PM +0100, Willy Tarreau wrote:
> Another thing that can be done to improve the folding of the 16-bit
> checksum is to swap the values to be added, sum them and only keep
> the high half integer which already contains the carry. At least on
> x86 I save some cycles doing this :
>
> 31:24 23:16 15:8 7:0
> sum32 = D C B A
>
> To fold this into 16-bit at a time, I just do this :
>
> 31:24 23:16 15:8 7:0
> sum32 D C B A
> + sum32swapped B A D C
> = A+B C+A+carry(B+D/C+A) B+D C+A
>
> so just take the upper result and you get the final 16-bit word at
> once.
>
> In C it does :
>
> fold16 = (((sum32 >> 16) | (sum32 << 16)) + sum32) >> 16
>
> When the CPU has a rotate instruction, it's fast :-)
Indeed - and if your CPU can do the rotate and add at the same time,
it's just a singe instruction, and it ends up looking remarkably
similar to this:
static inline __sum16 csum_fold(__wsum sum)
{
__asm__(
"add %0, %1, %1, ror #16 @ csum_fold"
: "=r" (sum)
: "r" (sum)
: "cc");
return (__force __sum16)(~(__force u32)sum >> 16);
}
next prev parent reply other threads:[~2013-12-12 17:20 UTC|newest]
Thread overview: 37+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-12-12 12:14 gcc miscompiles csum_tcpudp_magic() on ARMv5 Maxime Bizon
2013-12-12 12:40 ` Russell King - ARM Linux
2013-12-12 13:36 ` Maxime Bizon
2013-12-12 13:48 ` Måns Rullgård
2013-12-12 14:10 ` Maxime Bizon
2013-12-12 14:19 ` Willy Tarreau
2013-12-12 14:28 ` Maxime Bizon
2013-12-12 14:42 ` Måns Rullgård
2013-12-12 14:52 ` Maxime Bizon
2013-12-12 14:58 ` Måns Rullgård
2013-12-12 15:00 ` Russell King - ARM Linux
2013-12-12 15:26 ` Maxime Bizon
2013-12-12 15:07 ` Willy Tarreau
2013-12-12 15:18 ` Måns Rullgård
2013-12-12 15:28 ` Willy Tarreau
2013-12-12 15:43 ` Russell King - ARM Linux
2013-12-12 15:50 ` Måns Rullgård
2013-12-12 14:37 ` Måns Rullgård
2013-12-12 14:40 ` Maxime Bizon
2013-12-12 14:47 ` Måns Rullgård
2013-12-12 14:26 ` Måns Rullgård
2013-12-12 14:48 ` Russell King - ARM Linux
2013-12-12 15:00 ` Måns Rullgård
2013-12-12 15:04 ` Maxime Bizon
2013-12-12 15:41 ` Russell King - ARM Linux
2013-12-12 16:04 ` Måns Rullgård
2013-12-12 16:04 ` Willy Tarreau
2013-12-12 16:47 ` Russell King - ARM Linux
2013-12-12 17:11 ` Willy Tarreau
2013-12-12 17:20 ` Russell King - ARM Linux [this message]
2013-12-12 17:35 ` Willy Tarreau
2013-12-12 18:07 ` Nicolas Pitre
2013-12-12 22:30 ` Maxime Bizon
2013-12-12 22:36 ` Russell King - ARM Linux
2013-12-12 22:44 ` Maxime Bizon
2013-12-12 22:48 ` Russell King - ARM Linux
2013-12-12 17:34 ` Maxime Bizon
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20131212172049.GU4360@n2100.arm.linux.org.uk \
--to=linux@arm.linux.org.uk \
--cc=linux-arm-kernel@lists.infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).