From: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
To: Jason Xing <kerneljasonxing@gmail.com>
Cc: <davem@davemloft.net>, <edumazet@google.com>, <kuba@kernel.org>,
	<pabeni@redhat.com>, <bjorn@kernel.org>,
	<magnus.karlsson@intel.com>, <jonathan.lemon@gmail.com>,
	<sdf@fomichev.me>, <ast@kernel.org>, <daniel@iogearbox.net>,
	<hawk@kernel.org>, <john.fastabend@gmail.com>, <joe@dama.to>,
	<willemdebruijn.kernel@gmail.com>, <bpf@vger.kernel.org>,
	<netdev@vger.kernel.org>, Jason Xing <kernelxing@tencent.com>
Subject: Re: [PATCH net-next v6] net: xsk: introduce XDP_MAX_TX_BUDGET set/getsockopt
Date: Mon, 30 Jun 2025 13:46:20 +0200
Message-ID: <aGJ5DDtFAZ/IsE0B@boxer>
In-Reply-To: <CAL+tcoCCM+m6eJ1VNoeF2UMdFOhMjJ1z2FVUoMJk=js++hk0RQ@mail.gmail.com>

On Sun, Jun 29, 2025 at 06:43:05PM +0800, Jason Xing wrote:
> On Sun, Jun 29, 2025 at 10:51 AM Jason Xing <kerneljasonxing@gmail.com> wrote:
> >
> > On Fri, Jun 27, 2025 at 7:01 PM Jason Xing <kerneljasonxing@gmail.com> wrote:
> > >
> > > From: Jason Xing <kernelxing@tencent.com>
> > >
> > > > This patch provides a setsockopt option that lets applications adjust
> > > > the maximum number of descs handled in one send syscall. It mitigates
> > > > the situation where the default value (32) is too small and leads to
> > > > send syscalls being triggered more frequently.
> > >
> > > > Given the variety/complexity of applications, there is no single ideal
> > > > value that fits all cases. So keep 32 as the default value, as
> > > > before.
> > >
> > > The patch does the following things:
> > > - Add XDP_MAX_TX_BUDGET socket option.
> > > - Convert TX_BATCH_SIZE to tx_budget_spent.
> > > - Set tx_budget_spent to 32 by default in the initialization phase as a
> > >   per-socket granular control. 32 is also the min value for
> > >   tx_budget_spent.
> > > - Set the range of tx_budget_spent as [32, xs->tx->nentries].
> > >
> > > > The idea behind this comes out of real workloads in production. We use a
> > > > user-level stack with xsk support to accelerate sending packets and
> > > > minimize triggering syscalls. When the packets are aggregated, it's not
> > > > hard to hit the upper bound (namely, 32). The moment the user-space stack
> > > > gets the -EAGAIN error returned by sendto(), it loops and tries again
> > > > until all the expected descs from the tx ring are sent out to the driver.
> > > > Enlarging the XDP_MAX_TX_BUDGET value results in fewer sendto() calls
> > > > and higher throughput/PPS.
> > >
> > > > Here is what I did in production, along with some numbers:
> > > > For one application I saw lately, I suggested using 128 as max_tx_budget
> > > > because I saw two limitations with the default configuration unchanged:
> > > > 1) XDP_MAX_TX_BUDGET, 2) socket sndbuf, which is 212992 as decided by
> > > > net.core.wmem_default. As to XDP_MAX_TX_BUDGET, I counted how many
> > > > descs are transmitted to the driver per sendto() call, based on the
> > > > [1] patch, and then calculated the probability of hitting the upper
> > > > bound. Finally I chose 128 as a suitable value because 1) it covers
> > > > most of the cases, 2) a higher number would not bring evident results.
> > > > After tweaking the parameters, a stable improvement of around 4% for
> > > > both PPS and throughput and lower resource consumption were observed
> > > > with strace -c -p xxx:
> > > 1) %time was decreased by 7.8%
> > > 2) error counter was decreased from 18367 to 572
> >
> > More interesting numbers are arriving here as I run some benchmarks
> > from xdp-project/bpf-examples/AF_XDP-example/ in my VM.
> >
> > Running "sudo taskset -c 2 ./xdpsock -i eth0 -q 1 -l -N -t -b 256"

Do you have a patch against xdpsock that does the setsockopt you're
introducing here?

-B -b 256 is for enabling busy polling and giving it a budget of 256, which
is not what you wanted to achieve.

> >
> > Using the default value of 32 as the max budget:
> >  sock0@eth0:1 txonly xdp-drv
> >                    pps            pkts           1.01
> > rx                 0              0
> > tx                 48,574         49,152
> >
> > Enlarging the value to 256:
> >  sock0@eth0:1 txonly xdp-drv
> >                    pps            pkts           1.00
> > rx                 0              0
> > tx                 148,277        148,736
> >
> > Enlarging the value to 512:
> >  sock0@eth0:1 txonly xdp-drv
> >                    pps            pkts           1.00
> > rx                 0              0
> > tx                 226,306        227,072
> >
> > The performance of pps goes up by 365% (with max budget set as 512)
> > which is an incredible number :)
> 
> Weird thing. I purchased another VM and didn't manage to see such a
> huge improvement.... Luckily, I still have the original machine, where
> the result is still reproducible, and I'm still digging into it. So
> please ignore this noise for now :|
> 
> Thanks,
> Jason
