From: Jesper Dangaard Brouer <brouer@redhat.com>
To: Lorenzo Bianconi <lorenzo@kernel.org>
Cc: bpf@vger.kernel.org, netdev@vger.kernel.org, davem@davemloft.net,
kuba@kernel.org, ast@kernel.org, daniel@iogearbox.net,
toshiaki.makita1@gmail.com, lorenzo.bianconi@redhat.com,
toke@redhat.com, Stefano Brivio <sbrivio@redhat.com>,
brouer@redhat.com
Subject: Re: [PATCH v2 bpf-next] net: veth: alloc skb in bulk for ndo_xdp_xmit
Date: Mon, 1 Feb 2021 11:06:34 +0100 [thread overview]
Message-ID: <20210201110634.70be00ba@carbon> (raw)
In-Reply-To: <20210129214927.GC20729@lore-desk>
On Fri, 29 Jan 2021 22:49:27 +0100
Lorenzo Bianconi <lorenzo@kernel.org> wrote:
> On Jan 29, Lorenzo Bianconi wrote:
> > On Jan 29, Jesper Dangaard Brouer wrote:
> > > On Fri, 29 Jan 2021 17:02:16 +0100
> > > Jesper Dangaard Brouer <brouer@redhat.com> wrote:
> > >
> > > > > + for (i = 0; i < n_skb; i++) {
> > > > > + struct sk_buff *skb = skbs[i];
> > > > > +
> > > > > + memset(skb, 0, offsetof(struct sk_buff, tail));
> > > >
> > > > It is very subtle, but the memset operation on Intel CPU translates
> > > > into a "rep stos" (repeated store) operation. This operation need to
> > > > save CPU-flags (to support being interrupted) thus it is actually
> > > > expensive (and in my experience cause side effects on pipeline
> > > > efficiency). I have a kernel module for testing memset here[1].
> > > >
> > > > In CPUMAP I have moved the clearing outside this loop. But via asking
> > > > the MM system to clear the memory via gfp_t flag __GFP_ZERO. This
> > > > cause us to clear more memory 256 bytes, but it is aligned. Above
> > > > offsetof(struct sk_buff, tail) is 188 bytes, which is unaligned making
> > > > the rep-stos more expensive in setup time. It is below 3-cachelines,
> > > > which is actually interesting and an improvement since last I checked.
> > > > I actually have to re-test with time_bench_memset[1], to know that is
> > > > better now.
> > >
> > > After much testing (with [1]), yes please use gfp_t flag __GFP_ZERO.
> >
> > I run some comparison tests using memset and __GFP_ZERO and with VETH_XDP_BATCH
> > set to 8 and 16. Results are pretty close so not completely sure the delta is
> > just a noise:
> >
> > - VETH_XDP_BATCH= 8 + __GFP_ZERO: ~3.737Mpps
> > - VETH_XDP_BATCH= 16 + __GFP_ZERO: ~3.79Mpps
> > - VETH_XDP_BATCH= 8 + memset: ~3.766Mpps
> > - VETH_XDP_BATCH= 16 + __GFP_ZERO: ~3.765Mpps
>
> Sorry last line is:
> - VETH_XDP_BATCH= 16 + memset: ~3.765Mpps
Thanks for doing these benchmarks.
From my memset benchmarks we are looking for a 1.66 ns difference(10.463-8.803),
which is VERY hard to measure accurately (anything below 2 ns is
extremely hard due to OS noise).
VETH_XDP_BATCH=8 __GFP_ZERO (3.737Mpps) -> memset (3.766Mpps)
- __GFP_ZERO loosing 0.029Mpps and 2.06 ns slower
VETH_XDP_BATCH=16 __GFP_ZERO (3.79Mpps) -> memset (3.765Mpps)
- __GFP_ZERO gaining 0.025Mpps and 1.75 ns faster
I would say this is noise in the measurements. Even-though batch=16
match the expected improvement, batch=8 goes in the other direction.
--
Best regards,
Jesper Dangaard Brouer
MSc.CS, Principal Kernel Engineer at Red Hat
LinkedIn: http://www.linkedin.com/in/brouer
next prev parent reply other threads:[~2021-02-01 10:09 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <415937741661ac331be09c0e59b4ff1eacfee782.1611861943.git.lorenzo@kernel.org>
[not found] ` <20210129170216.6a879619@carbon>
2021-01-29 19:17 ` [PATCH v2 bpf-next] net: veth: alloc skb in bulk for ndo_xdp_xmit Jesper Dangaard Brouer
2021-01-29 21:46 ` Lorenzo Bianconi
2021-01-29 21:49 ` Lorenzo Bianconi
2021-02-01 10:06 ` Jesper Dangaard Brouer [this message]
[not found] ` <36298336-72f9-75a5-781e-7cb01dac1702@gmail.com>
2021-01-29 21:41 ` Lorenzo Bianconi
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20210201110634.70be00ba@carbon \
--to=brouer@redhat.com \
--cc=ast@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=daniel@iogearbox.net \
--cc=davem@davemloft.net \
--cc=kuba@kernel.org \
--cc=lorenzo.bianconi@redhat.com \
--cc=lorenzo@kernel.org \
--cc=netdev@vger.kernel.org \
--cc=sbrivio@redhat.com \
--cc=toke@redhat.com \
--cc=toshiaki.makita1@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.