linux-sctp.vger.kernel.org archive mirror
From: Eric Biggers <ebiggers@kernel.org>
To: Ard Biesheuvel <ardb@kernel.org>
Cc: Andrew Lunn <andrew@lunn.ch>,
	netdev@vger.kernel.org, linux-nvme@lists.infradead.org,
	linux-sctp@vger.kernel.org, linux-rdma@vger.kernel.org,
	linux-kernel@vger.kernel.org,
	Daniel Borkmann <daniel@iogearbox.net>,
	Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>,
	Sagi Grimberg <sagi@grimberg.me>
Subject: Re: [PATCH net-next 00/10] net: faster and simpler CRC32C computation
Date: Sun, 11 May 2025 16:07:50 -0700
Message-ID: <20250511230750.GA87326@sol>
In-Reply-To: <CAMj1kXFSm9-5+uBoF3mBbZKRU6wK9jmmyh=L538FoGvZ1XVShQ@mail.gmail.com>

On Sun, May 11, 2025 at 11:45:14PM +0200, Ard Biesheuvel wrote:
> On Sun, 11 May 2025 at 23:22, Andrew Lunn <andrew@lunn.ch> wrote:
> >
> > On Sun, May 11, 2025 at 10:29:29AM -0700, Eric Biggers wrote:
> > > On Sun, May 11, 2025 at 06:30:25PM +0200, Andrew Lunn wrote:
> > > > On Sat, May 10, 2025 at 05:41:00PM -0700, Eric Biggers wrote:
> > > > > Update networking code that computes the CRC32C of packets to just call
> > > > > crc32c() without unnecessary abstraction layers.  The result is faster
> > > > > and simpler code.
> > > >
> > > > Hi Eric
> > > >
> > > > Do you have some benchmarks for these changes?
> > > >
> > > >     Andrew
> > >
> > > Do you want benchmarks that show that removing the indirect calls makes things
> > > faster?  I think that should be fairly self-evident by now after dealing with
> > > retpoline for years, but I can provide more details if you need them.
> >
> > I was thinking more like iperf before/after? Show the CPU load has gone
> > down without the bandwidth also going down.
> >
> > Eric Dumazet has a T-Shirt with a commit message on the back which
> > increased network performance by X%. At the moment, there is nothing
> > T-Shirt quotable here.
> >
> 
> I think that removing layers of redundant code to ultimately call the
> same core CRC-32 implementation is a rather obvious win, especially
> when indirect calls are involved. The diffstat speaks for itself, so
> maybe you can print that on a T-shirt.

Agreed with Ard.  I did try doing some SCTP benchmarks with iperf3 earlier, but
they were very noisy and the CRC32C checksumming seemed to be lost in the noise.
There probably are some tricks to running reliable networking benchmarks; I'm
not a networking developer.  Regardless, this series is a clear win for the
CRC32C code, from both a simplicity and a performance perspective.  It also
fixes the Kconfig dependency issues.  That should be good enough, IMO.
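
To make the indirect-call point concrete, here is a minimal userspace sketch.
It is not the kernel code: crc32c_update() is a slow bitwise stand-in for
crc32c(), and struct csum_ops, struct frag, and both walker functions are
hypothetical names made up for illustration.  It only shows that chaining the
CRC state through direct calls gives the same result as going through a
callback table, one fragment at a time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Bitwise CRC32C (Castagnoli, reflected polynomial 0x82F63B78).
 * It takes and returns the raw CRC state with no inversion, so the
 * result for one buffer can be fed straight into the next call.
 */
static uint32_t crc32c_update(uint32_t crc, const void *data, size_t len)
{
	const uint8_t *p = data;

	while (len--) {
		crc ^= *p++;
		for (int i = 0; i < 8; i++)
			crc = (crc >> 1) ^ (0x82F63B78u & -(crc & 1u));
	}
	return crc;
}

/* Hypothetical callback table, standing in for an abstraction layer. */
struct csum_ops {
	uint32_t (*update)(uint32_t crc, const void *data, size_t len);
};

/* Hypothetical fragment descriptor (assumption, not a kernel type). */
struct frag {
	const void *data;
	size_t len;
};

/* Callback style: one indirect call per fragment. */
static uint32_t csum_frags_indirect(const struct csum_ops *ops,
				    const struct frag *frags, size_t n,
				    uint32_t crc)
{
	for (size_t i = 0; i < n; i++)
		crc = ops->update(crc, frags[i].data, frags[i].len);
	return crc;
}

/* Direct style: same loop, but the callee is known at compile time. */
static uint32_t crc32c_frags_direct(const struct frag *frags, size_t n,
				    uint32_t crc)
{
	for (size_t i = 0; i < n; i++)
		crc = crc32c_update(crc, frags[i].data, frags[i].len);
	return crc;
}

/* Self-check against the standard CRC-32C check value for "123456789". */
static void crc32c_selftest(void)
{
	const struct frag frags[2] = { { "1234", 4 }, { "56789", 5 } };
	const struct csum_ops ops = { .update = crc32c_update };
	uint32_t whole = crc32c_update(~0u, "123456789", 9);

	assert((whole ^ ~0u) == 0xE3069283u);
	assert(crc32c_frags_direct(frags, 2, ~0u) == whole);
	assert(csum_frags_indirect(&ops, frags, 2, ~0u) == whole);
}
```

Calling crc32c_selftest() from a test program exercises both paths.  The two
walkers compute the same value; the only difference is whether each fragment
costs an indirect call.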

In case it's helpful, here are some microbenchmarks of __skb_checksum (old) vs
skb_crc32c (new):

    Linear sk_buffs

        Length in bytes    __skb_checksum cycles    skb_crc32c cycles
        ===============    =====================    =================
                     64                       43                   18
                   1420                      204                  161
                  16384                     1735                 1642

    Nonlinear sk_buffs (even split between head and one fragment)

        Length in bytes    __skb_checksum cycles    skb_crc32c cycles
        ===============    =====================    =================
                     64                      579                   22
                   1420                     1506                  194
                  16384                     4365                 1682

So 1420-byte linear buffers (roughly the most common case) take 21% fewer
cycles, and the other cases range from about 5% fewer cycles (16384-byte linear)
to about 26x faster (64-byte nonlinear).  This was on an AMD Zen 5 processor,
where the kernel defaults to using IBRS instead of retpoline; I understand that
an even larger improvement may be seen when retpoline is enabled.
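
For anyone who wants to recompute the figures from the tables, here is a small
sketch of the arithmetic.  The helper names are made up for illustration.  It
expresses each old/new pair two ways: as a percentage reduction in cycles and
as a speedup factor.

```c
#include <assert.h>
#include <stdio.h>

/* Percentage of cycles saved going from old to new. */
static double pct_fewer_cycles(unsigned int old_cycles, unsigned int new_cycles)
{
	assert(old_cycles > 0);
	return 100.0 * ((double)old_cycles - (double)new_cycles) / old_cycles;
}

/* How many times faster new is than old (old / new). */
static double speedup_factor(unsigned int old_cycles, unsigned int new_cycles)
{
	assert(new_cycles > 0);
	return (double)old_cycles / new_cycles;
}

/* Print one table row in both forms. */
static void print_row(const char *kind, unsigned int len,
		      unsigned int old_cycles, unsigned int new_cycles)
{
	printf("%-9s %6u bytes: %5.1f%% fewer cycles, %5.2fx\n",
	       kind, len,
	       pct_fewer_cycles(old_cycles, new_cycles),
	       speedup_factor(old_cycles, new_cycles));
}
```

For example, print_row("linear", 1420, 204, 161) reports the 1420-byte linear
case, and print_row("nonlinear", 64, 579, 22) reports the 64-byte nonlinear
case.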

But again this is just the CRC32C checksumming performance.  I'm not claiming
measurable improvements to overall SCTP (or NVME-TLS) latency or throughput,
though it's possible that there are.

- Eric

Thread overview: 37+ messages
2025-05-11  0:41 [PATCH net-next 00/10] net: faster and simpler CRC32C computation Eric Biggers
2025-05-11  0:41 ` [PATCH net-next 01/10] net: introduce CONFIG_NET_CRC32C Eric Biggers
2025-05-11  0:41 ` [PATCH net-next 02/10] net: add skb_crc32c() Eric Biggers
2025-05-11  0:41 ` [PATCH net-next 03/10] net: use skb_crc32c() in skb_crc32c_csum_help() Eric Biggers
2025-05-11  0:41 ` [PATCH net-next 04/10] RDMA/siw: use skb_crc32c() instead of __skb_checksum() Eric Biggers
2025-05-15 20:02   ` Bart Van Assche
2025-05-15 20:12     ` Eric Biggers
2025-05-16 10:42       ` Bernard Metzler
2025-05-19  9:04   ` Bernard Metzler
2025-05-20 13:18     ` Leon Romanovsky
2025-05-20 15:18       ` Eric Biggers
2025-05-21 10:38         ` Leon Romanovsky
2025-05-11  0:41 ` [PATCH net-next 05/10] sctp: " Eric Biggers
2025-05-11  0:41 ` [PATCH net-next 06/10] net: fold __skb_checksum() into skb_checksum() Eric Biggers
2025-05-11  0:41 ` [PATCH net-next 07/10] lib/crc32: remove unused support for CRC32C combination Eric Biggers
2025-05-11  0:41 ` [PATCH net-next 08/10] net: add skb_copy_and_crc32c_datagram_iter() Eric Biggers
2025-05-13 21:41   ` Jakub Kicinski
2025-05-15 18:09     ` Eric Biggers
2025-05-11  0:41 ` [PATCH net-next 09/10] nvme-tcp: use crc32c() and skb_copy_and_crc32c_datagram_iter() Eric Biggers
2025-05-16  4:36   ` Christoph Hellwig
2025-05-16  5:31     ` Eric Biggers
2025-05-16  6:06       ` Christoph Hellwig
2025-05-17 17:45         ` Eric Biggers
2025-05-17 20:32       ` Sagi Grimberg
2025-05-17  9:58   ` Sagi Grimberg
2025-05-17 17:29     ` Eric Biggers
2025-05-17 20:30       ` Sagi Grimberg
2025-05-11  0:41 ` [PATCH net-next 10/10] net: remove skb_copy_and_hash_datagram_iter() Eric Biggers
2025-05-11 16:30 ` [PATCH net-next 00/10] net: faster and simpler CRC32C computation Andrew Lunn
2025-05-11 17:29   ` Eric Biggers
2025-05-11 21:22     ` Andrew Lunn
2025-05-11 21:45       ` Ard Biesheuvel
2025-05-11 23:07         ` Eric Biggers [this message]
2025-05-15 19:21           ` David Laight
2025-05-15 19:50             ` Eric Biggers
2025-05-13 21:40 ` Jakub Kicinski
2025-05-15 18:10   ` Eric Biggers
