From: David Laight <David.Laight@ACULAB.COM>
To: David Laight <David.Laight@ACULAB.COM>,
'Noah Goldstein' <goldstein.w.n@gmail.com>,
'Eric Dumazet' <edumazet@google.com>
Cc: "'tglx@linutronix.de'" <tglx@linutronix.de>,
"'mingo@redhat.com'" <mingo@redhat.com>,
'Borislav Petkov' <bp@alien8.de>,
"'dave.hansen@linux.intel.com'" <dave.hansen@linux.intel.com>,
'X86 ML' <x86@kernel.org>, "'hpa@zytor.com'" <hpa@zytor.com>,
"'peterz@infradead.org'" <peterz@infradead.org>,
"'alexanderduyck@fb.com'" <alexanderduyck@fb.com>,
'open list' <linux-kernel@vger.kernel.org>,
'netdev' <netdev@vger.kernel.org>
Subject: RE: [PATCH] lib/x86: Optimise csum_partial of buffers that are not multiples of 8 bytes.
Date: Tue, 14 Dec 2021 12:36:07 +0000
Message-ID: <3107b1e365f34df080feefb68be8a422@AcuMS.aculab.com>
In-Reply-To: <f1cd1a19878248f09e2e7cffe88c8191@AcuMS.aculab.com>
From: David Laight <David.Laight@ACULAB.COM>
> Sent: 13 December 2021 18:01
>
> Add in the trailing bytes first so that there is no need to worry
> about the sum exceeding 64 bits.
This is an alternate version that (mostly) compiles to reasonable code.
I've also booted a kernel with it - networking still works!
https://godbolt.org/z/K6vY31Gqs
I changed the while (len >= 64) loop into an
if (len >= 64) do { ... } while (len >= 64) one.
But gcc makes a pig's breakfast of compiling it - it optimises
it so that it is while (ptr < lim) but adds a lot of code.
So I've done that by hand.
Then it still makes a meal of it because it refuses to take
'buff' from the final loop iteration.
An assignment to the limit helps.
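For readers following along, the hand-converted loop shape can be sketched roughly as below. This is only an illustration, not the code from the godbolt link: the names add64_carry, sum_blocks, lim and tail are made up here, and the final "buff = lim" is just one guess at what the assignment to the limit looks like.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Add b into a, folding the carry out of bit 63 back into bit 0. */
static uint64_t add64_carry(uint64_t a, uint64_t b)
{
    a += b;
    return a + (a < b);
}

/*
 * Hypothetical helper: sum all whole 64-byte blocks of buff.
 * On return *tail points at the first byte that was not summed.
 */
static uint64_t sum_blocks(const unsigned char *buff, size_t len,
                           const unsigned char **tail)
{
    uint64_t sum = 0;

    if (len >= 64) {
        /* The limit is computed once; the loop only compares pointers. */
        const unsigned char *lim = buff + (len & ~(size_t)63);

        do {
            for (int i = 0; i < 8; i++) {
                uint64_t w;

                memcpy(&w, buff + 8 * i, sizeof(w));
                sum = add64_carry(sum, w);
            }
            buff += 64;
        } while (buff < lim);

        /* Assumption: tell the compiler where buff ended up. */
        buff = lim;
    }

    *tail = buff;
    return sum;
}
```

The caller would carry on from *tail with the remaining (len & 63) bytes.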
Then there is the calculation of (8 - (len & 7)) * 8.
gcc prior to 9.2 just negates (len & 7) then uses leal 56(,%rsi,8),%rcx.
But later versions fail to notice.
Even given (64 + 8 * -(len & 7)) clang fails to use leal.
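A minimal sketch of what that shift is for, assuming a little-endian machine and len >= 8 so the 8-byte read stays inside the buffer. The names tail_shift and trailing_bytes are invented for this example and are not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Number of bits to shift out, for n = len & 7 in the range 1..7. */
static unsigned int tail_shift(unsigned int n)
{
    return (8 - n) * 8;     /* same value as 64 + 8 * -n */
}

/*
 * Return the trailing (len & 7) bytes of buff as a zero-extended
 * 64-bit value. Assumes little-endian and len >= 8.
 */
static uint64_t trailing_bytes(const unsigned char *buff, size_t len)
{
    unsigned int n = len & 7;
    uint64_t w;

    if (n == 0)
        return 0;           /* a shift by 64 is undefined in C */

    /* Read the 8 bytes that end at the end of the buffer. */
    memcpy(&w, buff + len - 8, sizeof(w));

    /* Drop the low bytes that belong to the previous 8-byte word. */
    return w >> tail_shift(n);
}
```

The explicit n == 0 test is there only to keep the C well defined; whether the real code needs it depends on how the zero-length tail is handled elsewhere.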
I'm not even sure the code clang generates is right:
(%rsi is (len & 7))
movq -8(%rsi,%rax), %rdx
leal (,%rsi,8), %ecx
andb $56, %cl
negb %cl
shrq %cl, %rdx
The 'negb' is on the wrong side of the 'andb'.
It might be ok if it is assuming the cpu ignores the high 2 bits of %cl.
But that is a horrid assumption to be making.
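The count arithmetic can be checked in plain C. The sketch below models the three instructions on an 8-bit value and then applies the 6-bit count mask that x86 uses for 64-bit shifts; clang_count is a made-up name, and this only models the count, not the load or the shift itself.

```c
#include <assert.h>
#include <stdint.h>

/* Model of the shift count clang computes, for n = len & 7. */
static unsigned int clang_count(unsigned int n)
{
    uint8_t cl = (uint8_t)(n * 8);  /* leal (,%rsi,8), %ecx */

    cl &= 56;                       /* andb $56, %cl */
    cl = (uint8_t)-cl;              /* negb %cl, sets the high 2 bits */

    /* A 64-bit shrq only looks at the low 6 bits of %cl. */
    return cl & 63;
}
```

With that mask applied the result is 64 - 8 * n for n in 1..7, and 0 for n == 0.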
David
-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)