From: David Laight <David.Laight@ACULAB.COM>
To: 'Eric Dumazet' <edumazet@google.com>
Cc: Alexander Duyck <alexander.duyck@gmail.com>,
Eric Dumazet <eric.dumazet@gmail.com>,
"David S . Miller" <davem@davemloft.net>,
"Jakub Kicinski" <kuba@kernel.org>,
netdev <netdev@vger.kernel.org>,
"the arch/x86 maintainers" <x86@kernel.org>,
Peter Zijlstra <peterz@infradead.org>
Subject: RE: [PATCH v1] x86/csum: rewrite csum_partial()
Date: Sun, 14 Nov 2021 19:09:54 +0000
Message-ID: <31bd81df79c4488c92c6a149eeceee3c@AcuMS.aculab.com>
In-Reply-To: <CANn89iJtqTGuJL6JgfOAuHxbkej9faURhj3yf2a9Y43Uh_4+Kg@mail.gmail.com>
From: Eric Dumazet
> Sent: 14 November 2021 15:04
>
> On Sun, Nov 14, 2021 at 6:44 AM David Laight <David.Laight@aculab.com> wrote:
> >
> > From: Eric Dumazet
> > > Sent: 11 November 2021 22:31
> > ..
> > > That requires an extra add32_with_carry(), which unfortunately made
> > > the thing slower for me.
> > >
> > > I even hardcoded an inline fast_csum_40bytes() and got best results
> > > with the 10+1 addl,
> > > instead of
> > > (5 + 1) acql + mov (needing one extra register) + shift + addl + adcl
> >
> > Did you try something like:
> > sum = buf[0];
> > val = buf[1];
> > asm(
> > add64 sum, val
> > adc64 sum, buf[2]
> > adc64 sum, buf[3]
> > adc64 sum, buf[4]
> > adc64 sum, 0
> > )
> > sum_hi = sum >> 32;
> > asm(
> > add32 sum, sum_hi
> > adc32 sum, 0
> > )
>
> This is what I tried, but the last part was using add32_with_carry(),
> and clang was adding a stupid mov to a temp variable on the stack,
> killing the perf.
Persuading the compiler to generate the required assembler is an art!
I also ended up using __builtin_bswap32(sum) when the alignment
was 'odd' - the shift expression didn't always get converted
to a rotate. Byteswap32 DTRT.
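
Why a byte swap can stand in for the rotate: the checksum is a ones' complement sum, so rotating the 32-bit partial sum by 8 bits and byte-swapping it fold down to the same 16-bit value. A minimal sketch, assuming gcc or clang; fold16(), rol32_by8() and swap32() are hypothetical helper names, not kernel functions:

```c
#include <stdint.h>

/* Fold a 32-bit partial sum to 16 bits with end-around carry. */
static uint16_t fold16(uint32_t sum)
{
	sum = (sum & 0xffff) + (sum >> 16);
	sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/* The shift expression one hopes the compiler turns into a rotate. */
static uint32_t rol32_by8(uint32_t sum)
{
	return (sum << 8) | (sum >> 24);
}

/* The alternative: a compiler builtin (hypothetical wrapper name). */
static uint32_t swap32(uint32_t sum)
{
	return __builtin_bswap32(sum);
}
```

Both forms differ as 32-bit values but agree once folded, which is all the checksum needs.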
I also noticed that any initial checksum was being added in at the end.
The 64bit code can almost always handle a 32 bit (or maybe 56bit!)
input value and add it in 'for free' into the code that does the
initial alignment.
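
A portable C sketch of that idea, not the patch itself: the 64-bit accumulator starts out holding the 32-bit input checksum, so no separate add is needed at the end. The names add64_with_carry(), fold64() and csum_words() are hypothetical, and the buffer length is assumed to be a whole number of 8-byte words:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* 64-bit add with end-around carry, the C analogue of add + adc 0. */
static uint64_t add64_with_carry(uint64_t a, uint64_t b)
{
	a += b;
	return a + (a < b);
}

/* Fold 64 bits to 32, again with end-around carry. */
static uint32_t fold64(uint64_t sum)
{
	sum = (sum & 0xffffffffu) + (sum >> 32);
	sum = (sum & 0xffffffffu) + (sum >> 32);
	return (uint32_t)sum;
}

/* Sum nwords 64-bit words, starting the accumulator at sum_in. */
static uint32_t csum_words(const void *buf, size_t nwords, uint32_t sum_in)
{
	const unsigned char *p = buf;
	uint64_t sum = sum_in;	/* seed: the input checksum rides along */
	uint64_t val;

	while (nwords--) {
		memcpy(&val, p, sizeof(val));	/* valid for any alignment */
		sum = add64_with_carry(sum, val);
		p += sizeof(val);
	}
	return fold64(sum);
}
```

Calling csum_words(buf, 5, sum_in) covers the 40-byte case discussed above. Whether a compiler turns the carry handling into an adc chain is a separate question.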
I don't remember testing misaligned buffers.
But I think it doesn't matter (on any cpu anyone cares about!).
Even Sandy Bridge can do two memory reads in one clock,
so it should be able to do a single misaligned read every clock.
Which almost certainly means that aligning the addresses is pointless.
(Given you're not trying to do the adcx/adox loop.)
(Page spanning shouldn't matter.)
For buffers that aren't a multiple of 8 bytes it might be best to
read the last 8 bytes first and shift left to discard the ones that
would get added in twice.
This value can be added to the 32bit 'input' checksum.
Something like:
sum_in += buf[length - 8] << (64 - (length & 7) * 8);
Annoyingly a special case is needed for buffers shorter than 8 bytes
to avoid falling off the start of a page.
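
A hedged sketch of the tail read, assuming little-endian x86 and a buffer of at least 8 bytes; tail_word() is a hypothetical helper. On little-endian the bytes the main loop has already summed land in the low-order end of the loaded word, so this sketch drops them with a right shift; the shift direction in the one-liner above should be checked against that.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Return the trailing (len & 7) bytes of buf as a 64-bit value,
 * zero-extended. Little-endian only; the caller must guarantee
 * len >= 8 so the read cannot start before the buffer.
 */
static uint64_t tail_word(const unsigned char *buf, size_t len)
{
	unsigned int k = len & 7;	/* bytes not covered by whole words */
	uint64_t last;

	if (!k)
		return 0;		/* also avoids an undefined 64-bit shift */
	memcpy(&last, buf + len - 8, sizeof(last));
	return last >> (64 - 8 * k);	/* discard the bytes already summed */
}
```

The result is at most 56 bits wide, so it can be added to a 32-bit input checksum in a 64-bit register without overflow. Buffers shorter than 8 bytes still need their own path.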
David
-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)