netdev.vger.kernel.org archive mirror
From: Stanislav Fomichev <stfomichev@gmail.com>
To: Samiullah Khawaja <skhawaja@google.com>
Cc: Martin Karsten <mkarsten@uwaterloo.ca>,
	Jakub Kicinski <kuba@kernel.org>,
	"David S . Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Paolo Abeni <pabeni@redhat.com>,
	almasrymina@google.com, willemb@google.com, jdamato@fastly.com,
	netdev@vger.kernel.org
Subject: Re: [PATCH net-next v4 0/4] Add support to do threaded napi busy poll
Date: Wed, 26 Mar 2025 14:57:37 -0700
Message-ID: <Z-R4UUzeuplbdQTy@mini-arch>
In-Reply-To: <CAAywjhRuJYakS4=zqtB7QzthJE+1UQfcaqT2bcj6sWPN_6Akeg@mail.gmail.com>

On 03/26, Samiullah Khawaja wrote:
> On Tue, Mar 25, 2025 at 10:47 AM Martin Karsten <mkarsten@uwaterloo.ca> wrote:
> >
> > On 2025-03-25 12:40, Samiullah Khawaja wrote:
> > > On Sun, Mar 23, 2025 at 7:38 PM Martin Karsten <mkarsten@uwaterloo.ca> wrote:
> > >>
> > >> On 2025-03-20 22:15, Samiullah Khawaja wrote:
> > >>> Extend the already existing support of threaded napi poll to do continuous
> > >>> busy polling.
> > >>>
> > >>> This is used for doing continuous polling of napi to fetch descriptors
> > >>> from backing RX/TX queues for low latency applications. Allow enabling
> > >>> of threaded busypoll using netlink so this can be enabled on a set of
> > >>> dedicated napis for low latency applications.
> > >>>
> > >>> Once enabled user can fetch the PID of the kthread doing NAPI polling
> > >>> and set affinity, priority and scheduler for it depending on the
> > >>> low-latency requirements.
> > >>>
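
A minimal sketch of the affinity/scheduler step described above, in Python for brevity. The series reports the kthread PID itself; the /proc scan for a "napi/" name prefix, the helper names, and the CPU and priority values are assumptions made for illustration only.

```python
import os


def find_napi_kthreads(proc_root="/proc"):
    """Return sorted [(pid, comm)] for tasks whose comm starts with "napi/".

    Assumption: threaded NAPI kthreads carry a "napi/" name prefix.
    """
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-PID entries such as "self"
        try:
            with open(os.path.join(proc_root, entry, "comm")) as f:
                comm = f.read().strip()
        except OSError:
            continue  # task exited while we were scanning
        if comm.startswith("napi/"):
            found.append((int(entry), comm))
    return sorted(found)


def pin_and_prioritize(pid, cpu, rt_priority=None):
    """Pin pid to one CPU and optionally switch it to SCHED_FIFO."""
    os.sched_setaffinity(pid, {cpu})
    if rt_priority is not None:
        os.sched_setscheduler(pid, os.SCHED_FIFO, os.sched_param(rt_priority))
```

A call could look like pin_and_prioritize(pid, cpu=3, rt_priority=50); changing the scheduler of another task normally requires CAP_SYS_NICE.
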
> > >>> Currently threaded napi is only enabled at device level using sysfs. Add
> > >>> support to enable/disable threaded mode for a napi individually. This
> > >>> can be done using the netlink interface. Extend `napi-set` op in netlink
> > >>> spec that allows setting the `threaded` attribute of a napi.
> > >>>
> > >>> Extend the threaded attribute in napi struct to add an option to enable
> > >>> continuous busy polling. Extend the netlink and sysfs interface to allow
> > >>> enabling/disabling threaded busypolling at device or individual napi
> > >>> level.
> > >>>
> > >>> We use this for our AF_XDP based hard low-latency usecase with usecs
> > >>> level latency requirement. For our usecase we want low jitter and stable
> > >>> latency at P99.
> > >>>
> > >>> Following is an analysis and comparison of available (and compatible)
> > >>> busy poll interfaces for a low latency usecase with stable P99. Please
> > >>> note that the throughput and cpu efficiency is a non-goal.
> > >>>
> > >>> For analysis we use an AF_XDP based benchmarking tool `xdp_rr`. The tool
> > >>> and how it tries to simulate the real workload are described below:
> > >>>
> > >>> - It sends UDP packets between 2 machines.
> > >>> - The client machine sends packets at a fixed frequency. To maintain the
> > >>>     frequency of the packet being sent, we use open-loop sampling. That is
> > >>>     the packets are sent in a separate thread.
> > >>> - The server replies to the packet inline by reading the pkt from the
> > >>>     recv ring and replies using the tx ring.
> > >>> - To simulate the application processing time, we use a configurable
> > >>>     delay in usecs on the client side after a reply is received from the
> > >>>     server.
> > >>>
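
The open-loop sending pattern in the list above can be sketched as follows. This is not xdp_rr; it is a small Python illustration with hypothetical helper names, where send() stands in for whatever actually transmits a packet.

```python
import threading
import time


def open_loop_deadlines(start, interval, count):
    """Absolute send times, fixed up front and independent of replies."""
    return [start + i * interval for i in range(count)]


def run_sender(send, interval, count, clock=time.monotonic, sleep=time.sleep):
    """Call send(seq) at a fixed rate; a late send does not shift later ones."""
    deadlines = open_loop_deadlines(clock(), interval, count)
    for seq, deadline in enumerate(deadlines):
        delay = deadline - clock()
        if delay > 0:
            sleep(delay)
        send(seq)


def start_sender_thread(send, interval, count):
    """Run the sender in its own thread so reply handling cannot stall it."""
    t = threading.Thread(target=run_sender, args=(send, interval, count),
                         daemon=True)
    t.start()
    return t
```

Because the deadlines are absolute, a slow reply or a late wakeup delays one packet but does not lower the overall send rate, which is the point of open-loop measurement.
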
> > >>> The xdp_rr tool is posted separately as an RFC for tools/testing/selftest.
> > >>
> > >> Thanks very much for sending the benchmark program and these specific
> > >> experiments. I am able to build the tool and run the experiments in
> > >> principle. While I don't have a complete picture yet, one observation
> > >> seems already clear, so I want to report back on it.
> > > Thanks for reproducing this Martin. Really appreciate you reviewing
> > > this and your interest in this.
> > >>
> > >>> We use this tool with following napi polling configurations,
> > >>>
> > >>> - Interrupts only
> > >>> - SO_BUSYPOLL (inline in the same thread where the client receives the
> > >>>     packet).
> > >>> - SO_BUSYPOLL (separate thread and separate core)
> > >>> - Threaded NAPI busypoll
> > >>
> > >> The configurations that you describe as SO_BUSYPOLL here are not using
> > >> the best busy-polling configuration. The best busy-polling strictly
> > >> alternates between application processing and network polling. No
> > >> asynchronous processing due to hardware irq delivery or softirq
> > >> processing should happen.
> > >>
> > >> A high-level check is making sure that no softirq processing is reported
> > >> for the relevant cores (see, e.g., "%soft" in sar -P <cores> -u ALL 1).
> > >> In addition, interrupts can be counted in /proc/stat or /proc/interrupts.
> > >>
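
The /proc/stat check mentioned above can be sketched as below: take two snapshots and compare the per-CPU irq and softirq time. The function names are hypothetical; the field order (user, nice, system, idle, iowait, irq, softirq) follows proc(5).

```python
def parse_cpu_irq_times(stat_text):
    """Map cpu index -> (irq, softirq) ticks from /proc/stat contents."""
    times = {}
    for line in stat_text.splitlines():
        fields = line.split()
        # Keep only per-CPU lines ("cpu0", "cpu1", ...), not the "cpu" total.
        if not fields or not fields[0].startswith("cpu") or fields[0] == "cpu":
            continue
        cpu = int(fields[0][3:])
        # After the label: user nice system idle iowait irq softirq ...
        times[cpu] = (int(fields[6]), int(fields[7]))
    return times


def irq_time_delta(before, after, cpus):
    """Per-CPU (irq, softirq) tick increase between two snapshots."""
    return {
        c: (after[c][0] - before[c][0], after[c][1] - before[c][1])
        for c in cpus
    }
```

Reading /proc/stat before and after a run and passing both snapshots to irq_time_delta() for the polling cores should show near-zero softirq growth if busy polling is doing all the work.
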
> > >> Unfortunately it is not always straightforward to enter this pattern. In
> > >> this particular case, it seems that two pieces are missing:
> > >>
> > >> 1) Because the XDP socket is created with XDP_COPY, it is never marked
> > >> with its corresponding napi_id. Without the socket being marked with a
> > >> valid napi_id, sk_busy_loop (called from __xsk_recvmsg) never invokes
> > >> napi_busy_loop. Instead the gro_flush_timeout/napi_defer_hard_irqs
> > >> softirq loop controls packet delivery.
> > > Nice catch. It seems a recent change broke busy polling for AF_XDP.
> > > There was a fix for XDP_ZEROCOPY, but XDP_COPY remained broken, and it
> > > seems I didn't pick that up in my experiments. During my
> > > experimentation I confirmed through perf traces that all experiment
> > > modes were invoking busypoll and not going through softirqs. I sent out
> > > a fix for XDP_COPY busy polling at the link below. I will resend it for
> > > net since the original commit has already landed in 6.13.
> > > https://lore.kernel.org/netdev/CAAywjhSEjaSgt7fCoiqJiMufGOi=oxa164_vTfk+3P43H60qwQ@mail.gmail.com/T/#t

In general, when sending the patches and numbers, try running everything
against the latest net-next. Otherwise, it is very confusing to reason
about.
