From: Stanislav Fomichev <stfomichev@gmail.com>
To: Samiullah Khawaja <skhawaja@google.com>
Cc: Martin Karsten <mkarsten@uwaterloo.ca>,
	Jakub Kicinski <kuba@kernel.org>,
	"David S . Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Paolo Abeni <pabeni@redhat.com>,
	almasrymina@google.com, willemb@google.com, jdamato@fastly.com,
	netdev@vger.kernel.org
Subject: Re: [PATCH net-next v4 0/4] Add support to do threaded napi busy poll
Date: Wed, 26 Mar 2025 14:57:37 -0700
Message-ID: <Z-R4UUzeuplbdQTy@mini-arch>
In-Reply-To: <CAAywjhRuJYakS4=zqtB7QzthJE+1UQfcaqT2bcj6sWPN_6Akeg@mail.gmail.com>

On 03/26, Samiullah Khawaja wrote:
> On Tue, Mar 25, 2025 at 10:47 AM Martin Karsten <mkarsten@uwaterloo.ca> wrote:
> >
> > On 2025-03-25 12:40, Samiullah Khawaja wrote:
> > > On Sun, Mar 23, 2025 at 7:38 PM Martin Karsten <mkarsten@uwaterloo.ca> wrote:
> > >>
> > >> On 2025-03-20 22:15, Samiullah Khawaja wrote:
> > >>> Extend the already existing support of threaded napi poll to do continuous
> > >>> busy polling.
> > >>>
> > >>> This is used for doing continuous polling of napi to fetch descriptors
> > >>> from backing RX/TX queues for low latency applications. Allow enabling
> > >>> of threaded busypoll using netlink so this can be enabled on a set of
> > >>> dedicated napis for low latency applications.
> > >>>
> > >>> Once enabled user can fetch the PID of the kthread doing NAPI polling
> > >>> and set affinity, priority and scheduler for it depending on the
> > >>> low-latency requirements.
> > >>>
> > >>> Currently threaded napi is only enabled at device level using sysfs. Add
> > >>> support to enable/disable threaded mode for a napi individually. This
> > >>> can be done using the netlink interface. Extend `napi-set` op in netlink
> > >>> spec that allows setting the `threaded` attribute of a napi.
> > >>>
> > >>> Extend the threaded attribute in napi struct to add an option to enable
> > >>> continuous busy polling. Extend the netlink and sysfs interface to allow
> > >>> enabling/disabling threaded busypolling at device or individual napi
> > >>> level.
> > >>>
> > >>> We use this for our AF_XDP based hard low-latency usecase with usecs
> > >>> level latency requirement. For our usecase we want low jitter and stable
> > >>> latency at P99.
> > >>>
> > >>> Following is an analysis and comparison of available (and compatible)
> > >>> busy poll interfaces for a low latency usecase with stable P99. Please
> > >>> note that the throughput and cpu efficiency is a non-goal.
> > >>>
> > >>> For analysis we use an AF_XDP based benchmarking tool `xdp_rr`. The
> > >>> description of the tool and how it tries to simulate the real workload
> > >>> is following,
> > >>>
> > >>> - It sends UDP packets between 2 machines.
> > >>> - The client machine sends packets at a fixed frequency. To maintain the
> > >>>     frequency of the packet being sent, we use open-loop sampling. That is
> > >>>     the packets are sent in a separate thread.
> > >>> - The server replies to the packet inline by reading the pkt from the
> > >>>     recv ring and replies using the tx ring.
> > >>> - To simulate the application processing time, we use a configurable
> > >>>     delay in usecs on the client side after a reply is received from the
> > >>>     server.
> > >>>
> > >>> The xdp_rr tool is posted separately as an RFC for tools/testing/selftest.
> > >>
> > >> Thanks very much for sending the benchmark program and these specific
> > >> experiments. I am able to build the tool and run the experiments in
> > >> principle. While I don't have a complete picture yet, one observation
> > >> seems already clear, so I want to report back on it.
> > > Thanks for reproducing this Martin. Really appreciate you reviewing
> > > this and your interest in this.
> > >>
> > >>> We use this tool with following napi polling configurations,
> > >>>
> > >>> - Interrupts only
> > >>> - SO_BUSYPOLL (inline in the same thread where the client receives the
> > >>>     packet).
> > >>> - SO_BUSYPOLL (separate thread and separate core)
> > >>> - Threaded NAPI busypoll
> > >>
> > >> The configurations that you describe as SO_BUSYPOLL here are not using
> > >> the best busy-polling configuration. The best busy-polling strictly
> > >> alternates between application processing and network polling. No
> > >> asynchronous processing due to hardware irq delivery or softirq
> > >> processing should happen.
> > >>
> > >> A high-level check is making sure that no softirq processing is reported
> > >> for the relevant cores (see, e.g., "%soft" in sar -P <cores> -u ALL 1).
> > >> In addition, interrupts can be counted in /proc/stat or /proc/interrupts.
> > >>
> > >> Unfortunately it is not always straightforward to enter this pattern. In
> > >> this particular case, it seems that two pieces are missing:
> > >>
> > >> 1) Because the XDP socket is created with XDP_COPY, it is never marked
> > >> with its corresponding napi_id. Without the socket being marked with a
> > >> valid napi_id, sk_busy_loop (called from __xsk_recvmsg) never invokes
> > >> napi_busy_loop. Instead the gro_flush_timeout/napi_defer_hard_irqs
> > >> softirq loop controls packet delivery.
> > > Nice catch. It seems a recent change broke the busy polling for AF_XDP
> > > and there was a fix for the XDP_ZEROCOPY but the XDP_COPY remained
> > > broken and seems in my experiments I didn't pick that up. During my
> > > experimentation I confirmed that all experiment modes are invoking the
> > > busypoll and not going through softirqs. I confirmed this through perf
> > > traces. I sent out a fix for XDP_COPY busy polling here in the link
> > > below. I will resend this for the net since the original commit has
> > > already landed in 6.13.
> > > https://lore.kernel.org/netdev/CAAywjhSEjaSgt7fCoiqJiMufGOi=oxa164_vTfk+3P43H60qwQ@mail.gmail.com/T/#t

In general, when sending patches and numbers, try running everything
against the latest net-next. Otherwise, the results are very confusing
to reason about.
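
As an aside on the check Martin describes in the quoted text (no softirq
processing on the relevant cores), here is a minimal sketch of how the
per-core softirq time could be compared between two samples of
/proc/stat. It assumes the usual per-CPU field order in /proc/stat
(user, nice, system, idle, iowait, irq, softirq, ...). The function
names are hypothetical and not part of any tool mentioned in this
thread.

```python
def softirq_jiffies(proc_stat_text):
    """Return {cpu_index: softirq_jiffies} parsed from /proc/stat-style text."""
    counts = {}
    for line in proc_stat_text.splitlines():
        fields = line.split()
        # Only per-CPU lines ("cpu0", "cpu1", ...); skip the aggregate "cpu" line
        # and unrelated lines such as "intr" or "softirq".
        if not fields or not fields[0].startswith("cpu"):
            continue
        suffix = fields[0][3:]
        if not suffix.isdigit():
            continue
        # Assumed layout: name user nice system idle iowait irq softirq ...
        counts[int(suffix)] = int(fields[7])
    return counts


def softirq_delta(before_text, after_text, cores):
    """Softirq jiffies accumulated on each core between two samples."""
    before = softirq_jiffies(before_text)
    after = softirq_jiffies(after_text)
    return {core: after[core] - before[core] for core in cores}


def read_proc_stat(path="/proc/stat"):
    """Read one sample; call twice around the benchmark run."""
    with open(path) as f:
        return f.read()
```

A non-zero delta on the polling cores during a run would suggest that
softirq processing is still happening there. Interrupt counts from
/proc/interrupts would need a separate check.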
