From: Lorenzo Bianconi <lorenzo@kernel.org>
To: Filip Bakreski <phial@phiality.com>
Cc: nbd@nbd.name, ryder.lee@mediatek.com, shayne.chen@mediatek.com,
sean.wang@mediatek.com, linux-wireless@vger.kernel.org
Subject: Re: [PATCH v2] wifi: mt76: mt76u: use a threaded NAPI for the RX path
Date: Tue, 9 Jun 2026 11:45:53 +0200
Message-ID: <aifg0WOxWD6x7no4@lore-desk>
In-Reply-To: <20260609003224.132191-1-phial@phiality.com>
> The USB RX path delivers frames to the stack via mt76_rx_complete() with
> a NULL napi pointer, taking the netif_receive_skb_list() path, so it never
> benefits from GRO -- unlike the DMA-based mt76 drivers, which pass a real
> napi and use napi_gro_receive(). For bulk TCP traffic this is costly, as
> every segment traverses the stack individually.
>
> Service the MT_RXQ_MAIN queue from a threaded NAPI, reusing mt76_dev's
> existing napi_dev and napi[] rather than adding new fields. The URB
> completion handler schedules the napi; its poll drains the URBs, builds
> the skbs, resubmits and delivers them through napi_gro_receive(). The MCU
> queue stays on the existing RX worker. This enables GRO and moves RX
> processing into its own kernel thread, parallelising the datapath.
>
> On mt7921u at HE-MCS 11 (2x2, 80 MHz; fast.com, multiple streams) this
> averages ~588 Mbit/s, versus ~424 Mbit/s when the same napi is instead
> driven manually from the RX worker, and ~380 Mbit/s for the unmodified
> driver.
>
> Suggested-by: Lorenzo Bianconi <lorenzo@kernel.org>
> Assisted-by: Claude:claude-opus-4-8
> Signed-off-by: Filip Bakreski <phial@phiality.com>
> ---
> v2:
> - Service MT_RXQ_MAIN from a threaded NAPI instead of a NAPI driven
> manually from the RX worker; on mt7921u the threaded variant measured
> ~39% faster (~588 vs ~424 Mbit/s, fast.com) (Lorenzo Bianconi).
> - Reuse mt76_dev's existing napi_dev/napi[] instead of adding new fields
> to struct mt76_usb (Lorenzo Bianconi).
>
> v1: https://lore.kernel.org/linux-wireless/20260608044109.31730-1-phial@phiality.com/
Hi Filip,
I guess the patch is fine, just a couple of nits inline.
Regards,
Lorenzo
>
> drivers/net/wireless/mediatek/mt76/usb.c | 56 +++++++++++++++++++++---
> 1 file changed, 49 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/net/wireless/mediatek/mt76/usb.c b/drivers/net/wireless/mediatek/mt76/usb.c
> index d9638a9b7..aef8f855f 100644
> --- a/drivers/net/wireless/mediatek/mt76/usb.c
> +++ b/drivers/net/wireless/mediatek/mt76/usb.c
> @@ -580,7 +580,10 @@ static void mt76u_complete_rx(struct urb *urb)
>
> q->head = (q->head + 1) % q->ndesc;
> q->queued++;
> - mt76_worker_schedule(&dev->usb.rx_worker);
nit: new-line here.
> + if (q == &dev->q_rx[MT_RXQ_MAIN])
> + napi_schedule(&dev->napi[MT_RXQ_MAIN]);
> + else
> + mt76_worker_schedule(&dev->usb.rx_worker);
> out:
> spin_unlock_irqrestore(&q->lock, flags);
> }
> @@ -618,11 +621,23 @@ mt76u_process_rx_queue(struct mt76_dev *dev, struct mt76_queue *q)
> }
> mt76u_submit_rx_buf(dev, qid, urb);
> }
> - if (qid == MT_RXQ_MAIN) {
> - local_bh_disable();
> - mt76_rx_poll_complete(dev, MT_RXQ_MAIN, NULL);
> - local_bh_enable();
> - }
> +}
> +
> +/* Threaded NAPI poll for the MAIN RX queue: drain URBs, build skbs, resubmit,
> + * then deliver through napi_gro_receive() and let napi_complete() flush GRO.
> + */
> +static int mt76u_napi_poll(struct napi_struct *napi, int budget)
> +{
> + struct mt76_dev *dev = mt76_priv(napi->dev);
> +
> + rcu_read_lock();
> + mt76u_process_rx_queue(dev, &dev->q_rx[MT_RXQ_MAIN]);
> + mt76_rx_poll_complete(dev, MT_RXQ_MAIN, napi);
> + rcu_read_unlock();
> +
> + napi_complete(napi);
> +
> + return 0;
> }
>
> static void mt76u_rx_worker(struct mt76_worker *w)
> @@ -632,8 +647,12 @@ static void mt76u_rx_worker(struct mt76_worker *w)
> int i;
>
> rcu_read_lock();
> - mt76_for_each_q_rx(dev, i)
> + mt76_for_each_q_rx(dev, i) {
> + /* MT_RXQ_MAIN is serviced by the threaded NAPI poll */
> + if (i == MT_RXQ_MAIN)
> + continue;
nit: new-line here.
> mt76u_process_rx_queue(dev, &dev->q_rx[i]);
> + }
> rcu_read_unlock();
> }
>
> @@ -723,6 +742,8 @@ void mt76u_stop_rx(struct mt76_dev *dev)
> int i;
>
> mt76_worker_disable(&dev->usb.rx_worker);
> + if (dev->napi_dev)
> + napi_disable(&dev->napi[MT_RXQ_MAIN]);
>
> mt76_for_each_q_rx(dev, i) {
> struct mt76_queue *q = &dev->q_rx[i];
> @@ -751,6 +772,8 @@ int mt76u_resume_rx(struct mt76_dev *dev)
> }
>
> mt76_worker_enable(&dev->usb.rx_worker);
> + if (dev->napi_dev)
> + napi_enable(&dev->napi[MT_RXQ_MAIN]);
>
> return 0;
> }
> @@ -1051,6 +1074,13 @@ void mt76u_queues_deinit(struct mt76_dev *dev)
> mt76u_stop_rx(dev);
> mt76u_stop_tx(dev);
>
> + /* mt76u_stop_rx() (above) already napi_disable()d the MAIN queue */
> + if (dev->napi_dev) {
> + netif_napi_del(&dev->napi[MT_RXQ_MAIN]);
> + free_netdev(dev->napi_dev);
> + dev->napi_dev = NULL;
> + }
> +
> mt76u_free_rx(dev);
> mt76u_free_tx(dev);
> }
> @@ -1115,6 +1145,18 @@ int __mt76u_init(struct mt76_dev *dev, struct usb_interface *intf,
> sched_set_fifo_low(usb->rx_worker.task);
> sched_set_fifo_low(usb->status_worker.task);
>
> + /* threaded NAPI on a dummy netdev (reusing mt76_dev's napi_dev/napi[])
> + * services the MAIN RX queue and gives the RX path GRO
> + */
> + dev->napi_dev = alloc_netdev_dummy(sizeof(struct mt76_dev *));
> + if (!dev->napi_dev)
> + return -ENOMEM;
nit: new-line here.
> + *(struct mt76_dev **)netdev_priv(dev->napi_dev) = dev;
To make the code more readable, I guess you can define a priv pointer, similar to mt76_dma_init().
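Something along these lines, as a rough sketch only. It is written so that it builds in userspace: struct net_device, struct mt76_dev, alloc_netdev_dummy() and netdev_priv() below are simplified stand-ins for the kernel ones, and mt76u_napi_dev_init() is a hypothetical helper name used just for illustration. The part that matters is the local priv pointer, which I believe is roughly the shape used in mt76_dma_init().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Simplified userspace stand-ins (assumptions, not the kernel definitions). */
struct net_device {
	char name[16];
	void *priv[];	/* private area, pointer-aligned */
};

struct mt76_dev {
	struct net_device *napi_dev;
};

/* Stand-in: allocate a zeroed netdev with sizeof_priv bytes of private data. */
static struct net_device *alloc_netdev_dummy(int sizeof_priv)
{
	return calloc(1, sizeof(struct net_device) + (size_t)sizeof_priv);
}

/* Stand-in: return the start of the private area. */
static void *netdev_priv(struct net_device *ndev)
{
	return ndev->priv;
}

/* Read the back-pointer stored in the private area. */
static struct mt76_dev *mt76_priv(struct net_device *ndev)
{
	struct mt76_dev **priv = netdev_priv(ndev);

	return *priv;
}

/* Hypothetical helper showing the suggested pattern. */
static int mt76u_napi_dev_init(struct mt76_dev *dev)
{
	struct mt76_dev **priv;

	dev->napi_dev = alloc_netdev_dummy(sizeof(struct mt76_dev *));
	if (!dev->napi_dev)
		return -ENOMEM;

	/* napi_dev private data points back to the parent mt76_dev, so
	 * mt76_dev can be retrieved from napi->dev in the poll callback
	 */
	priv = netdev_priv(dev->napi_dev);
	*priv = dev;

	return 0;
}
```

With a named priv pointer the assignment reads as two plain statements instead of a cast-and-dereference on one line, and it matches how mt76_priv() reads the pointer back.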
> + strscpy(dev->napi_dev->name, "mt76u-rx", sizeof(dev->napi_dev->name));
> + dev->napi_dev->threaded = 1;
> + netif_napi_add(dev->napi_dev, &dev->napi[MT_RXQ_MAIN], mt76u_napi_poll);
> + napi_enable(&dev->napi[MT_RXQ_MAIN]);
> +
> return 0;
> }
> EXPORT_SYMBOL_GPL(__mt76u_init);
>
> base-commit: 5f6099446d1ddb888e36cdf93b6a0551f05c1267
> --
> 2.54.0
>