* [PATCH] wifi: rtw89: usb: fix TX flow control by tracking in-flight URBs
@ 2026-01-25 22:19 Lucid Duck
2026-01-26 3:39 ` Ping-Ke Shih
` (2 more replies)
0 siblings, 3 replies; 11+ messages in thread
From: Lucid Duck @ 2026-01-25 22:19 UTC (permalink / raw)
To: linux-wireless; +Cc: Ping-Ke Shih, Lucid Duck
rtw89_usb_ops_check_and_reclaim_tx_resource() currently returns a
hardcoded placeholder value of 42, violating mac80211's TX flow control
contract. This causes uncontrolled URB accumulation under sustained TX
load since mac80211 believes resources are always available.
Fix this by implementing proper TX backpressure:
- Add per-channel atomic counters (tx_inflight[]) to track URBs between
submission and completion
- Increment counter before usb_submit_urb() with rollback on failure
- Decrement counter in completion callback
- Return available slots (max - inflight) to mac80211, or 0 at capacity
- Exclude firmware command channel (CH12) from flow control
Tested on D-Link DWA-X1850 (RTL8832AU) with:
- Sustained high-throughput traffic
- Module load/unload stress tests
- Hot-unplug during active transmission
- 30-minute soak test verifying counters balance at idle
Signed-off-by: Lucid Duck <lucid_duck@justthetip.ca>
---
drivers/net/wireless/realtek/rtw89/usb.c | 27 ++++++++++++++++++++++--
drivers/net/wireless/realtek/rtw89/usb.h | 6 ++++++
2 files changed, 31 insertions(+), 2 deletions(-)
diff --git a/drivers/net/wireless/realtek/rtw89/usb.c b/drivers/net/wireless/realtek/rtw89/usb.c
index e77561a4d..6fcf32603 100644
--- a/drivers/net/wireless/realtek/rtw89/usb.c
+++ b/drivers/net/wireless/realtek/rtw89/usb.c
@@ -161,16 +161,25 @@ static u32
rtw89_usb_ops_check_and_reclaim_tx_resource(struct rtw89_dev *rtwdev,
u8 txch)
{
+ struct rtw89_usb *rtwusb = rtw89_usb_priv(rtwdev);
+ int inflight;
+
+ /* Firmware command channel is not flow-controlled */
if (txch == RTW89_TXCH_CH12)
return 1;
- return 42; /* TODO some kind of calculation? */
+ inflight = atomic_read(&rtwusb->tx_inflight[txch]);
+ if (inflight >= RTW89_USB_MAX_TX_URBS_PER_CH)
+ return 0;
+
+ return RTW89_USB_MAX_TX_URBS_PER_CH - inflight;
}
static void rtw89_usb_write_port_complete(struct urb *urb)
{
struct rtw89_usb_tx_ctrl_block *txcb = urb->context;
struct rtw89_dev *rtwdev = txcb->rtwdev;
+ struct rtw89_usb *rtwusb = rtw89_usb_priv(rtwdev);
struct ieee80211_tx_info *info;
struct rtw89_txwd_body *txdesc;
struct sk_buff *skb;
@@ -229,6 +238,10 @@ static void rtw89_usb_write_port_complete(struct urb *urb)
break;
}
+ /* Decrement in-flight counter (skip firmware command channel) */
+ if (txcb->txch != RTW89_TXCH_CH12)
+ atomic_dec(&rtwusb->tx_inflight[txcb->txch]);
+
kfree(txcb);
}
@@ -306,9 +319,17 @@ static void rtw89_usb_ops_tx_kick_off(struct rtw89_dev *rtwdev, u8 txch)
skb_queue_tail(&txcb->tx_ack_queue, skb);
+ /* Increment BEFORE submit to avoid race with completion */
+ if (txch != RTW89_TXCH_CH12)
+ atomic_inc(&rtwusb->tx_inflight[txch]);
+
ret = rtw89_usb_write_port(rtwdev, txch, skb->data, skb->len,
txcb);
if (ret) {
+ /* Rollback increment on failure */
+ if (txch != RTW89_TXCH_CH12)
+ atomic_dec(&rtwusb->tx_inflight[txch]);
+
if (ret != -ENODEV)
rtw89_err(rtwdev, "write port txch %d failed: %d\n",
txch, ret);
@@ -666,8 +687,10 @@ static void rtw89_usb_init_tx(struct rtw89_dev *rtwdev)
struct rtw89_usb *rtwusb = rtw89_usb_priv(rtwdev);
int i;
- for (i = 0; i < ARRAY_SIZE(rtwusb->tx_queue); i++)
+ for (i = 0; i < ARRAY_SIZE(rtwusb->tx_queue); i++) {
skb_queue_head_init(&rtwusb->tx_queue[i]);
+ atomic_set(&rtwusb->tx_inflight[i], 0);
+ }
}
static void rtw89_usb_deinit_tx(struct rtw89_dev *rtwdev)
diff --git a/drivers/net/wireless/realtek/rtw89/usb.h b/drivers/net/wireless/realtek/rtw89/usb.h
index 203ec8e99..f72a8b1b2 100644
--- a/drivers/net/wireless/realtek/rtw89/usb.h
+++ b/drivers/net/wireless/realtek/rtw89/usb.h
@@ -20,6 +20,9 @@
#define RTW89_MAX_ENDPOINT_NUM 9
#define RTW89_MAX_BULKOUT_NUM 7
+/* TX flow control: max in-flight URBs per channel */
+#define RTW89_USB_MAX_TX_URBS_PER_CH 32
+
struct rtw89_usb_info {
u32 usb_host_request_2;
u32 usb_wlan0_1;
@@ -63,6 +66,9 @@ struct rtw89_usb {
struct usb_anchor tx_submitted;
struct sk_buff_head tx_queue[RTW89_TXCH_NUM];
+
+ /* TX flow control: track in-flight URBs per channel */
+ atomic_t tx_inflight[RTW89_TXCH_NUM];
};
static inline struct rtw89_usb *rtw89_usb_priv(struct rtw89_dev *rtwdev)
--
2.52.0
^ permalink raw reply related [flat|nested] 11+ messages in thread* RE: [PATCH] wifi: rtw89: usb: fix TX flow control by tracking in-flight URBs
2026-01-25 22:19 [PATCH] wifi: rtw89: usb: fix TX flow control by tracking in-flight URBs Lucid Duck
@ 2026-01-26 3:39 ` Ping-Ke Shih
2026-01-26 10:14 ` Mh_chen
2026-01-27 5:00 ` Lucid Duck
2026-01-26 14:09 ` Bitterblue Smith
2026-03-21 3:37 ` [PATCH v2] " Lucid Duck
2 siblings, 2 replies; 11+ messages in thread
From: Ping-Ke Shih @ 2026-01-26 3:39 UTC (permalink / raw)
To: Lucid Duck, linux-wireless@vger.kernel.org, Bitterblue Smith,
Mh_chen
+ developers of WiFi USB adapters
Lucid Duck <lucid_duck@justthetip.ca> wrote:
> rtw89_usb_ops_check_and_reclaim_tx_resource() currently returns a
> hardcoded placeholder value of 42, violating mac80211's TX flow control
> contract. This causes uncontrolled URB accumulation under sustained TX
> load since mac80211 believes resources are always available.
Then URB becomes exhausted?
>
> Fix this by implementing proper TX backpressure:
>
> - Add per-channel atomic counters (tx_inflight[]) to track URBs between
> submission and completion
> - Increment counter before usb_submit_urb() with rollback on failure
> - Decrement counter in completion callback
> - Return available slots (max - inflight) to mac80211, or 0 at capacity
> - Exclude firmware command channel (CH12) from flow control
>
> Tested on D-Link DWA-X1850 (RTL8832AU) with:
> - Sustained high-throughput traffic
> - Module load/unload stress tests
> - Hot-unplug during active transmission
> - 30-minute soak test verifying counters balance at idle
>
> Signed-off-by: Lucid Duck <lucid_duck@justthetip.ca>
[...]
> diff --git a/drivers/net/wireless/realtek/rtw89/usb.h b/drivers/net/wireless/realtek/rtw89/usb.h
> index 203ec8e99..f72a8b1b2 100644
> --- a/drivers/net/wireless/realtek/rtw89/usb.h
> +++ b/drivers/net/wireless/realtek/rtw89/usb.h
> @@ -20,6 +20,9 @@
> #define RTW89_MAX_ENDPOINT_NUM 9
> #define RTW89_MAX_BULKOUT_NUM 7
>
> +/* TX flow control: max in-flight URBs per channel */
> +#define RTW89_USB_MAX_TX_URBS_PER_CH 32
Curiously. How did you decide this value? Have you tested USB2 and USB3 devices?
How about their throughput before/after this patch?
> +
> struct rtw89_usb_info {
> u32 usb_host_request_2;
> u32 usb_wlan0_1;
> @@ -63,6 +66,9 @@ struct rtw89_usb {
> struct usb_anchor tx_submitted;
>
> struct sk_buff_head tx_queue[RTW89_TXCH_NUM];
> +
> + /* TX flow control: track in-flight URBs per channel */
I feel we don't need repeatedly adding this comment. If you like it, just
keep one.
> + atomic_t tx_inflight[RTW89_TXCH_NUM];
> };
>
> static inline struct rtw89_usb *rtw89_usb_priv(struct rtw89_dev *rtwdev)
> --
> 2.52.0
>
^ permalink raw reply [flat|nested] 11+ messages in thread* RE: [PATCH] wifi: rtw89: usb: fix TX flow control by tracking in-flight URBs
2026-01-26 3:39 ` Ping-Ke Shih
@ 2026-01-26 10:14 ` Mh_chen
2026-01-27 5:00 ` Lucid Duck
1 sibling, 0 replies; 11+ messages in thread
From: Mh_chen @ 2026-01-26 10:14 UTC (permalink / raw)
To: Ping-Ke Shih, Lucid Duck, linux-wireless@vger.kernel.org,
Bitterblue Smith, Isaiah
+Isaiah for Wi-Fi USB driver,
-----Original Message-----
From: Ping-Ke Shih <pkshih@realtek.com>
Sent: Monday, January 26, 2026 11:40 AM
To: Lucid Duck <lucid_duck@justthetip.ca>; linux-wireless@vger.kernel.org; Bitterblue Smith <rtl8821cerfe2@gmail.com>; Mh_chen <mh_chen@realtek.com>
Subject: RE: [PATCH] wifi: rtw89: usb: fix TX flow control by tracking in-flight URBs
+ developers of WiFi USB adapters
Lucid Duck <lucid_duck@justthetip.ca> wrote:
> rtw89_usb_ops_check_and_reclaim_tx_resource() currently returns a
> hardcoded placeholder value of 42, violating mac80211's TX flow
> control contract. This causes uncontrolled URB accumulation under
> sustained TX load since mac80211 believes resources are always available.
Then URB becomes exhausted?
>
> Fix this by implementing proper TX backpressure:
>
> - Add per-channel atomic counters (tx_inflight[]) to track URBs between
> submission and completion
> - Increment counter before usb_submit_urb() with rollback on failure
> - Decrement counter in completion callback
> - Return available slots (max - inflight) to mac80211, or 0 at
> capacity
> - Exclude firmware command channel (CH12) from flow control
>
> Tested on D-Link DWA-X1850 (RTL8832AU) with:
> - Sustained high-throughput traffic
> - Module load/unload stress tests
> - Hot-unplug during active transmission
> - 30-minute soak test verifying counters balance at idle
>
> Signed-off-by: Lucid Duck <lucid_duck@justthetip.ca>
[...]
> diff --git a/drivers/net/wireless/realtek/rtw89/usb.h
> b/drivers/net/wireless/realtek/rtw89/usb.h
> index 203ec8e99..f72a8b1b2 100644
> --- a/drivers/net/wireless/realtek/rtw89/usb.h
> +++ b/drivers/net/wireless/realtek/rtw89/usb.h
> @@ -20,6 +20,9 @@
> #define RTW89_MAX_ENDPOINT_NUM 9
> #define RTW89_MAX_BULKOUT_NUM 7
>
> +/* TX flow control: max in-flight URBs per channel */
> +#define RTW89_USB_MAX_TX_URBS_PER_CH 32
Curiously. How did you decide this value? Have you tested USB2 and USB3 devices?
How about their throughput before/after this patch?
> +
> struct rtw89_usb_info {
> u32 usb_host_request_2;
> u32 usb_wlan0_1;
> @@ -63,6 +66,9 @@ struct rtw89_usb {
> struct usb_anchor tx_submitted;
>
> struct sk_buff_head tx_queue[RTW89_TXCH_NUM];
> +
> + /* TX flow control: track in-flight URBs per channel */
I feel we don't need repeatedly adding this comment. If you like it, just keep one.
> + atomic_t tx_inflight[RTW89_TXCH_NUM];
> };
>
> static inline struct rtw89_usb *rtw89_usb_priv(struct rtw89_dev
> *rtwdev)
> --
> 2.52.0
>
^ permalink raw reply [flat|nested] 11+ messages in thread* Re: [PATCH] wifi: rtw89: usb: fix TX flow control by tracking in-flight URBs
2026-01-26 3:39 ` Ping-Ke Shih
2026-01-26 10:14 ` Mh_chen
@ 2026-01-27 5:00 ` Lucid Duck
1 sibling, 0 replies; 11+ messages in thread
From: Lucid Duck @ 2026-01-27 5:00 UTC (permalink / raw)
To: pkshih; +Cc: linux-wireless, mh_chen, rtl8821cerfe2, Lucid Duck
On Mon, 26 Jan 2026, Ping-Ke Shih wrote:
> Then URB becomes exhausted?
Yes. Without proper flow control, mac80211 continuously queues frames
since we always report resources available. URBs accumulate until
submission fails, causing TX stalls or instability under load.
> Curiously. How did you decide this value? Have you tested USB2 and USB3
> devices? How about their throughput before/after this patch?
The value of 32 was based on similar USB wireless drivers (mt76, ath9k_htc)
as a reasonable starting point. I'm open to tuning this if testing reveals
a better value, and it may need adjustment for optimal USB2 vs USB3
performance at different bands.
I have both USB2 and USB3 test capability and will be running more
rigorous throughput testing on both configurations shortly. Initial
testing showed the patch stable under sustained load, but I want to
collect proper iperf3 measurements before providing specific numbers.
I'll follow up with detailed test results and a v2 addressing your
comments.
> I feel we don't need repeatedly adding this comment. If you like it,
> just keep one.
Understood. Will clean this up in v2.
Thanks for the review.
--
Lucid Duck
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] wifi: rtw89: usb: fix TX flow control by tracking in-flight URBs
2026-01-25 22:19 [PATCH] wifi: rtw89: usb: fix TX flow control by tracking in-flight URBs Lucid Duck
2026-01-26 3:39 ` Ping-Ke Shih
@ 2026-01-26 14:09 ` Bitterblue Smith
2026-01-27 20:01 ` Lucid Duck
[not found] ` <202601291256.60TCusZS3018440@rtits1.realtek.com.tw>
2026-03-21 3:37 ` [PATCH v2] " Lucid Duck
2 siblings, 2 replies; 11+ messages in thread
From: Bitterblue Smith @ 2026-01-26 14:09 UTC (permalink / raw)
To: Lucid Duck, linux-wireless; +Cc: Ping-Ke Shih
On 26/01/2026 00:19, Lucid Duck wrote:
> rtw89_usb_ops_check_and_reclaim_tx_resource() currently returns a
> hardcoded placeholder value of 42, violating mac80211's TX flow control
> contract. This causes uncontrolled URB accumulation under sustained TX
> load since mac80211 believes resources are always available.
>
> Fix this by implementing proper TX backpressure:
>
> - Add per-channel atomic counters (tx_inflight[]) to track URBs between
> submission and completion
> - Increment counter before usb_submit_urb() with rollback on failure
> - Decrement counter in completion callback
> - Return available slots (max - inflight) to mac80211, or 0 at capacity
> - Exclude firmware command channel (CH12) from flow control
>
> Tested on D-Link DWA-X1850 (RTL8832AU) with:
> - Sustained high-throughput traffic
> - Module load/unload stress tests
> - Hot-unplug during active transmission
> - 30-minute soak test verifying counters balance at idle
>
> Signed-off-by: Lucid Duck <lucid_duck@justthetip.ca>
> ---
> drivers/net/wireless/realtek/rtw89/usb.c | 27 ++++++++++++++++++++++--
> drivers/net/wireless/realtek/rtw89/usb.h | 6 ++++++
> 2 files changed, 31 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/wireless/realtek/rtw89/usb.c b/drivers/net/wireless/realtek/rtw89/usb.c
> index e77561a4d..6fcf32603 100644
> --- a/drivers/net/wireless/realtek/rtw89/usb.c
> +++ b/drivers/net/wireless/realtek/rtw89/usb.c
> @@ -161,16 +161,25 @@ static u32
> rtw89_usb_ops_check_and_reclaim_tx_resource(struct rtw89_dev *rtwdev,
> u8 txch)
> {
> + struct rtw89_usb *rtwusb = rtw89_usb_priv(rtwdev);
> + int inflight;
> +
> + /* Firmware command channel is not flow-controlled */
> if (txch == RTW89_TXCH_CH12)
> return 1;
>
> - return 42; /* TODO some kind of calculation? */
> + inflight = atomic_read(&rtwusb->tx_inflight[txch]);
> + if (inflight >= RTW89_USB_MAX_TX_URBS_PER_CH)
> + return 0;
> +
> + return RTW89_USB_MAX_TX_URBS_PER_CH - inflight;
> }
>
> static void rtw89_usb_write_port_complete(struct urb *urb)
> {
> struct rtw89_usb_tx_ctrl_block *txcb = urb->context;
> struct rtw89_dev *rtwdev = txcb->rtwdev;
> + struct rtw89_usb *rtwusb = rtw89_usb_priv(rtwdev);
> struct ieee80211_tx_info *info;
> struct rtw89_txwd_body *txdesc;
> struct sk_buff *skb;
> @@ -229,6 +238,10 @@ static void rtw89_usb_write_port_complete(struct urb *urb)
> break;
> }
>
> + /* Decrement in-flight counter (skip firmware command channel) */
> + if (txcb->txch != RTW89_TXCH_CH12)
You don't need to add these checks because there is one in
rtw89_usb_ops_check_and_reclaim_tx_resource().
> + atomic_dec(&rtwusb->tx_inflight[txcb->txch]);
> +
> kfree(txcb);
> }
>
> @@ -306,9 +319,17 @@ static void rtw89_usb_ops_tx_kick_off(struct rtw89_dev *rtwdev, u8 txch)
>
> skb_queue_tail(&txcb->tx_ack_queue, skb);
>
> + /* Increment BEFORE submit to avoid race with completion */
> + if (txch != RTW89_TXCH_CH12)
> + atomic_inc(&rtwusb->tx_inflight[txch]);
> +
> ret = rtw89_usb_write_port(rtwdev, txch, skb->data, skb->len,
> txcb);
> if (ret) {
> + /* Rollback increment on failure */
> + if (txch != RTW89_TXCH_CH12)
> + atomic_dec(&rtwusb->tx_inflight[txch]);
> +
> if (ret != -ENODEV)
> rtw89_err(rtwdev, "write port txch %d failed: %d\n",
> txch, ret);
> @@ -666,8 +687,10 @@ static void rtw89_usb_init_tx(struct rtw89_dev *rtwdev)
> struct rtw89_usb *rtwusb = rtw89_usb_priv(rtwdev);
> int i;
>
> - for (i = 0; i < ARRAY_SIZE(rtwusb->tx_queue); i++)
> + for (i = 0; i < ARRAY_SIZE(rtwusb->tx_queue); i++) {
> skb_queue_head_init(&rtwusb->tx_queue[i]);
> + atomic_set(&rtwusb->tx_inflight[i], 0);
> + }
> }
>
> static void rtw89_usb_deinit_tx(struct rtw89_dev *rtwdev)
> diff --git a/drivers/net/wireless/realtek/rtw89/usb.h b/drivers/net/wireless/realtek/rtw89/usb.h
> index 203ec8e99..f72a8b1b2 100644
> --- a/drivers/net/wireless/realtek/rtw89/usb.h
> +++ b/drivers/net/wireless/realtek/rtw89/usb.h
> @@ -20,6 +20,9 @@
> #define RTW89_MAX_ENDPOINT_NUM 9
> #define RTW89_MAX_BULKOUT_NUM 7
>
> +/* TX flow control: max in-flight URBs per channel */
> +#define RTW89_USB_MAX_TX_URBS_PER_CH 32
> +
> struct rtw89_usb_info {
> u32 usb_host_request_2;
> u32 usb_wlan0_1;
> @@ -63,6 +66,9 @@ struct rtw89_usb {
> struct usb_anchor tx_submitted;
>
> struct sk_buff_head tx_queue[RTW89_TXCH_NUM];
> +
> + /* TX flow control: track in-flight URBs per channel */
> + atomic_t tx_inflight[RTW89_TXCH_NUM];
Is there a reason to add a new counter instead of just using
the length of each tx_queue?
> };
>
> static inline struct rtw89_usb *rtw89_usb_priv(struct rtw89_dev *rtwdev)
^ permalink raw reply [flat|nested] 11+ messages in thread* Re: [PATCH] wifi: rtw89: usb: fix TX flow control by tracking in-flight URBs
2026-01-26 14:09 ` Bitterblue Smith
@ 2026-01-27 20:01 ` Lucid Duck
[not found] ` <202601291256.60TCusZS3018440@rtits1.realtek.com.tw>
1 sibling, 0 replies; 11+ messages in thread
From: Lucid Duck @ 2026-01-27 20:01 UTC (permalink / raw)
To: Bitterblue Smith; +Cc: linux-wireless, Ping-Ke Shih, Mh_chen
On Mon, 26 Jan 2026, Bitterblue Smith wrote:
> The CH12 checks in the completion handler are unnecessary since there
> is one in the resource checking function.
You're right. I'll remove the redundant check in v2.
> Is there a reason to add a new counter instead of just using the
> length of each tx_queue?
Good question. The tx_queue length tracks SKBs queued for submission,
but doesn't account for URBs that have been submitted to USB core and
are in-flight (between submit and completion callback). An SKB is
dequeued before usb_submit_urb(), so skb_queue_len() would undercount
actual resource usage.
The atomic counter tracks the full lifecycle: incremented at submit,
decremented at completion. This gives mac80211 accurate backpressure
even when URBs are pending in the USB subsystem.
Happy to hear if I'm missing something that would make skb_queue_len()
work here.
--
Lucid Duck
^ permalink raw reply [flat|nested] 11+ messages in thread[parent not found: <202601291256.60TCusZS3018440@rtits1.realtek.com.tw>]
* RE: [PATCH] wifi: rtw89: usb: fix TX flow control by tracking in-flight URBs
[not found] ` <202601291256.60TCusZS3018440@rtits1.realtek.com.tw>
@ 2026-01-29 13:12 ` Mh_chen
0 siblings, 0 replies; 11+ messages in thread
From: Mh_chen @ 2026-01-29 13:12 UTC (permalink / raw)
To: Lucid Duck, Bitterblue Smith, Isaiah
Cc: linux-wireless@vger.kernel.org, Ping-Ke Shih
+Isaiah,
-----Original Message-----
From: Lucid Duck <lucid_duck@justthetip.ca>
Sent: Wednesday, January 28, 2026 4:01 AM
To: Bitterblue Smith <rtl8821cerfe2@gmail.com>
Cc: linux-wireless@vger.kernel.org; Ping-Ke Shih <pkshih@realtek.com>; Mh_chen <mh_chen@realtek.com>
Subject: Re: [PATCH] wifi: rtw89: usb: fix TX flow control by tracking in-flight URBs
External mail : This email originated from outside the organization. Do not reply, click links, or open attachments unless you recognize the sender and know the content is safe.
On Mon, 26 Jan 2026, Bitterblue Smith wrote:
> The CH12 checks in the completion handler are unnecessary since there
> is one in the resource checking function.
You're right. I'll remove the redundant check in v2.
> Is there a reason to add a new counter instead of just using the
> length of each tx_queue?
Good question. The tx_queue length tracks SKBs queued for submission, but doesn't account for URBs that have been submitted to USB core and are in-flight (between submit and completion callback). An SKB is dequeued before usb_submit_urb(), so skb_queue_len() would undercount actual resource usage.
The atomic counter tracks the full lifecycle: incremented at submit, decremented at completion. This gives mac80211 accurate backpressure even when URBs are pending in the USB subsystem.
Happy to hear if I'm missing something that would make skb_queue_len() work here.
--
Lucid Duck
^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH v2] wifi: rtw89: usb: fix TX flow control by tracking in-flight URBs
2026-01-25 22:19 [PATCH] wifi: rtw89: usb: fix TX flow control by tracking in-flight URBs Lucid Duck
2026-01-26 3:39 ` Ping-Ke Shih
2026-01-26 14:09 ` Bitterblue Smith
@ 2026-03-21 3:37 ` Lucid Duck
2026-03-23 9:31 ` Ping-Ke Shih
2 siblings, 1 reply; 11+ messages in thread
From: Lucid Duck @ 2026-03-21 3:37 UTC (permalink / raw)
To: linux-wireless; +Cc: Ping-Ke Shih, Bitterblue Smith
From: Lucid Duck <lucid_duck@justthetip.ca>
Date: Thu, 20 Mar 2026 20:00:00 -0700
Subject: [PATCH v2] wifi: rtw89: usb: fix TX flow control by tracking
in-flight URBs
rtw89_usb_ops_check_and_reclaim_tx_resource() returns a hardcoded
placeholder value (42) instead of actual TX resource availability.
This violates mac80211's flow control contract, preventing backpressure
and causing uncontrolled URB accumulation under sustained TX load.
Fix by adding per-channel atomic counters (tx_inflight[]) that track
in-flight URBs:
- Increment before usb_submit_urb() with rollback on failure
- Decrement in completion callback
- Return (MAX_URBS - inflight) to mac80211, or 0 when at capacity
- Exclude firmware command channel (CH12) from tracking
The pre-increment pattern prevents a race where the USB core completes
the URB (possibly on another CPU) before the submitting code increments
the counter.
Tested on D-Link DWA-X1850 (RTL8832AU), kernel 6.18.3:
Unpatched -> Patched
USB3 5GHz DL: 494 -> 709 Mbps (+44%)
USB3 5GHz retx: 8 -> 1 (-88%)
USB3 2.4GHz DL: 54 -> 68 Mbps (+25%)
USB2 5GHz DL: 196 -> 225 Mbps (+15%)
USB2 2.4GHz DL: 123 -> 131 Mbps (+6%)
Signed-off-by: Lucid Duck <lucid_duck@justthetip.ca>
---
Resending v2. This was prepared in late January after addressing v1
review feedback, but the send failed silently (SMTP misconfiguration)
and never appeared on lore.kernel.org. Apologies for the delay.
Changes since v1:
- Removed duplicate "TX flow control" comment (Ping-Ke Shih)
- Added test results to commit message (Ping-Ke Shih)
Bitterblue's CH12 question from v1: the CH12 guards in tx_kick_off()
and write_port_complete() are a matched pair. tx_kick_off() skips
atomic_inc for CH12, so the completion handler must skip atomic_dec
to match. Removing only the completion side causes counter underflow.
Additional validation: 100-iteration stress test, 50-iteration
teardown (rmmod/modprobe under load), 10x hot-unplug during active
TX, and 30-minute soak -- all pass with counters balanced at idle.
The 32-URB-per-channel limit is based on similar USB wireless drivers
(mt76, ath9k_htc). The fixed value works well for both USB2 and USB3.
drivers/net/wireless/realtek/rtw89/usb.c | 26 ++++++++++++++++++-----
drivers/net/wireless/realtek/rtw89/usb.h | 6 ++++++
2 files changed, 27 insertions(+), 5 deletions(-)
diff --git a/drivers/net/wireless/realtek/rtw89/usb.c b/drivers/net/wireless/realtek/rtw89/usb.c
index eb489df..faafa3c 100644
--- a/drivers/net/wireless/realtek/rtw89/usb.c
+++ b/drivers/net/wireless/realtek/rtw89/usb.c
@@ -161,16 +161,25 @@ static u32
rtw89_usb_ops_check_and_reclaim_tx_resource(struct rtw89_dev *rtwdev,
u8 txch)
{
+ struct rtw89_usb *rtwusb = rtw89_usb_priv(rtwdev);
+ int inflight;
+
+ /* Firmware command channel is not tracked */
if (txch == RTW89_TXCH_CH12)
return 1;
- return 42; /* TODO some kind of calculation? */
+ inflight = atomic_read(&rtwusb->tx_inflight[txch]);
+ if (inflight >= RTW89_USB_MAX_TX_URBS_PER_CH)
+ return 0;
+
+ return RTW89_USB_MAX_TX_URBS_PER_CH - inflight;
}
static void rtw89_usb_write_port_complete(struct urb *urb)
{
struct rtw89_usb_tx_ctrl_block *txcb = urb->context;
struct rtw89_dev *rtwdev = txcb->rtwdev;
+ struct rtw89_usb *rtwusb = rtw89_usb_priv(rtwdev);
struct ieee80211_tx_info *info;
struct rtw89_txwd_body *txdesc;
struct sk_buff *skb;
@@ -229,6 +238,10 @@ static void rtw89_usb_write_port_complete(struct urb *urb)
break;
}
+ /* Decrement in-flight counter (skip firmware command channel) */
+ if (txcb->txch != RTW89_TXCH_CH12)
+ atomic_dec(&rtwusb->tx_inflight[txcb->txch]);
+
kfree(txcb);
}
@@ -306,9 +319,17 @@ static void rtw89_usb_ops_tx_kick_off(struct rtw89_dev *rtwdev, u8 txch)
skb_queue_tail(&txcb->tx_ack_queue, skb);
+ /* Increment BEFORE submit to avoid race with completion */
+ if (txch != RTW89_TXCH_CH12)
+ atomic_inc(&rtwusb->tx_inflight[txch]);
+
ret = rtw89_usb_write_port(rtwdev, txch, skb->data, skb->len,
txcb);
if (ret) {
+ /* Rollback increment on failure */
+ if (txch != RTW89_TXCH_CH12)
+ atomic_dec(&rtwusb->tx_inflight[txch]);
+
if (ret != -ENODEV)
rtw89_err(rtwdev, "write port txch %d failed: %d\n",
txch, ret);
@@ -666,8 +687,10 @@ static void rtw89_usb_init_tx(struct rtw89_dev *rtwdev)
struct rtw89_usb *rtwusb = rtw89_usb_priv(rtwdev);
int i;
- for (i = 0; i < ARRAY_SIZE(rtwusb->tx_queue); i++)
+ for (i = 0; i < ARRAY_SIZE(rtwusb->tx_queue); i++) {
skb_queue_head_init(&rtwusb->tx_queue[i]);
+ atomic_set(&rtwusb->tx_inflight[i], 0);
+ }
}
static void rtw89_usb_deinit_tx(struct rtw89_dev *rtwdev)
diff --git a/drivers/net/wireless/realtek/rtw89/usb.h b/drivers/net/wireless/realtek/rtw89/usb.h
index 9f554b5..1459122 100644
--- a/drivers/net/wireless/realtek/rtw89/usb.h
+++ b/drivers/net/wireless/realtek/rtw89/usb.h
@@ -20,6 +20,9 @@
#define RTW89_MAX_ENDPOINT_NUM 9
#define RTW89_MAX_BULKOUT_NUM 7
+/* TX flow control: max in-flight URBs per channel */
+#define RTW89_USB_MAX_TX_URBS_PER_CH 32
+
struct rtw89_usb_info {
u32 usb_host_request_2;
u32 usb_wlan0_1;
@@ -63,6 +66,9 @@ struct rtw89_usb {
struct usb_anchor tx_submitted;
struct sk_buff_head tx_queue[RTW89_TXCH_NUM];
+
+ /* TX flow control: track in-flight URBs per channel */
+ atomic_t tx_inflight[RTW89_TXCH_NUM];
};
static inline struct rtw89_usb *rtw89_usb_priv(struct rtw89_dev *rtwdev)
--
2.53.0
^ permalink raw reply related [flat|nested] 11+ messages in thread* RE: [PATCH v2] wifi: rtw89: usb: fix TX flow control by tracking in-flight URBs
2026-03-21 3:37 ` [PATCH v2] " Lucid Duck
@ 2026-03-23 9:31 ` Ping-Ke Shih
2026-03-23 23:33 ` Lucid Duck
0 siblings, 1 reply; 11+ messages in thread
From: Ping-Ke Shih @ 2026-03-23 9:31 UTC (permalink / raw)
To: Lucid Duck, linux-wireless@vger.kernel.org; +Cc: Bitterblue Smith
Lucid Duck <lucid_duck@justthetip.ca> wrote:
> From: Lucid Duck <lucid_duck@justthetip.ca>
> Date: Thu, 20 Mar 2026 20:00:00 -0700
> Subject: [PATCH v2] wifi: rtw89: usb: fix TX flow control by tracking
> in-flight URBs
What is your mailer? Not 'git send-email'?
I'm not sure if I can apply this by patchwork tool.
>
> rtw89_usb_ops_check_and_reclaim_tx_resource() returns a hardcoded
> placeholder value (42) instead of actual TX resource availability.
> This violates mac80211's flow control contract, preventing backpressure
> and causing uncontrolled URB accumulation under sustained TX load.
>
> Fix by adding per-channel atomic counters (tx_inflight[]) that track
> in-flight URBs:
>
> - Increment before usb_submit_urb() with rollback on failure
> - Decrement in completion callback
> - Return (MAX_URBS - inflight) to mac80211, or 0 when at capacity
> - Exclude firmware command channel (CH12) from tracking
>
> The pre-increment pattern prevents a race where the USB core completes
> the URB (possibly on another CPU) before the submitting code increments
> the counter.
>
> Tested on D-Link DWA-X1850 (RTL8832AU), kernel 6.18.3:
>
> Unpatched -> Patched
> USB3 5GHz DL: 494 -> 709 Mbps (+44%)
> USB3 5GHz retx: 8 -> 1 (-88%)
> USB3 2.4GHz DL: 54 -> 68 Mbps (+25%)
> USB2 5GHz DL: 196 -> 225 Mbps (+15%)
> USB2 2.4GHz DL: 123 -> 131 Mbps (+6%)
As this patch does TX flow control, could you share the uplink data
as well?
>
> Signed-off-by: Lucid Duck <lucid_duck@justthetip.ca>
> ---
> Resending v2. This was prepared in late January after addressing v1
> review feedback, but the send failed silently (SMTP misconfiguration)
> and never appeared on lore.kernel.org. Apologies for the delay.
>
> Changes since v1:
> - Removed duplicate "TX flow control" comment (Ping-Ke Shih)
> - Added test results to commit message (Ping-Ke Shih)
>
> Bitterblue's CH12 question from v1: the CH12 guards in tx_kick_off()
> and write_port_complete() are a matched pair. tx_kick_off() skips
> atomic_inc for CH12, so the completion handler must skip atomic_dec
> to match. Removing only the completion side causes counter underflow.
>
> Additional validation: 100-iteration stress test, 50-iteration
> teardown (rmmod/modprobe under load), 10x hot-unplug during active
> TX, and 30-minute soak -- all pass with counters balanced at idle.
>
> The 32-URB-per-channel limit is based on similar USB wireless drivers
> (mt76, ath9k_htc). The fixed value works well for both USB2 and USB3.
Can increasing 32 get better performance? The stress test with small
packets might yield low throughput? (Just a question)
>
> drivers/net/wireless/realtek/rtw89/usb.c | 26 ++++++++++++++++++-----
> drivers/net/wireless/realtek/rtw89/usb.h | 6 ++++++
> 2 files changed, 27 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/net/wireless/realtek/rtw89/usb.c
> b/drivers/net/wireless/realtek/rtw89/usb.c
> index eb489df..faafa3c 100644
> --- a/drivers/net/wireless/realtek/rtw89/usb.c
> +++ b/drivers/net/wireless/realtek/rtw89/usb.c
> @@ -161,16 +161,25 @@ static u32
> rtw89_usb_ops_check_and_reclaim_tx_resource(struct rtw89_dev *rtwdev,
> u8 txch)
> {
> + struct rtw89_usb *rtwusb = rtw89_usb_priv(rtwdev);
> + int inflight;
> +
> + /* Firmware command channel is not tracked */
> if (txch == RTW89_TXCH_CH12)
> return 1;
>
> - return 42; /* TODO some kind of calculation? */
> + inflight = atomic_read(&rtwusb->tx_inflight[txch]);
> + if (inflight >= RTW89_USB_MAX_TX_URBS_PER_CH)
Out of curiosity. Is it possible inflight > RTW89_USB_MAX_TX_URBS_PER_CH?
> + return 0;
> +
> + return RTW89_USB_MAX_TX_URBS_PER_CH - inflight;
> }
>
> static void rtw89_usb_write_port_complete(struct urb *urb)
> {
> struct rtw89_usb_tx_ctrl_block *txcb = urb->context;
> struct rtw89_dev *rtwdev = txcb->rtwdev;
> + struct rtw89_usb *rtwusb = rtw89_usb_priv(rtwdev);
> struct ieee80211_tx_info *info;
> struct rtw89_txwd_body *txdesc;
> struct sk_buff *skb;
> @@ -229,6 +238,10 @@ static void rtw89_usb_write_port_complete(struct urb *urb)
> break;
> }
>
> + /* Decrement in-flight counter (skip firmware command channel) */
> + if (txcb->txch != RTW89_TXCH_CH12)
> + atomic_dec(&rtwusb->tx_inflight[txcb->txch]);
> +
> kfree(txcb);
> }
>
> @@ -306,9 +319,17 @@ static void rtw89_usb_ops_tx_kick_off(struct rtw89_dev
> *rtwdev, u8 txch)
>
> skb_queue_tail(&txcb->tx_ack_queue, skb);
>
> + /* Increment BEFORE submit to avoid race with completion */
> + if (txch != RTW89_TXCH_CH12)
> + atomic_inc(&rtwusb->tx_inflight[txch]);
> +
> ret = rtw89_usb_write_port(rtwdev, txch, skb->data, skb->len,
> txcb);
> if (ret) {
> + /* Rollback increment on failure */
> + if (txch != RTW89_TXCH_CH12)
> + atomic_dec(&rtwusb->tx_inflight[txch]);
> +
> if (ret != -ENODEV)
> rtw89_err(rtwdev, "write port txch %d failed:
> %d\n",
> txch, ret);
> @@ -666,8 +687,10 @@ static void rtw89_usb_init_tx(struct rtw89_dev *rtwdev)
> struct rtw89_usb *rtwusb = rtw89_usb_priv(rtwdev);
> int i;
>
> - for (i = 0; i < ARRAY_SIZE(rtwusb->tx_queue); i++)
> + for (i = 0; i < ARRAY_SIZE(rtwusb->tx_queue); i++) {
> skb_queue_head_init(&rtwusb->tx_queue[i]);
> + atomic_set(&rtwusb->tx_inflight[i], 0);
> + }
> }
>
> static void rtw89_usb_deinit_tx(struct rtw89_dev *rtwdev)
> diff --git a/drivers/net/wireless/realtek/rtw89/usb.h
> b/drivers/net/wireless/realtek/rtw89/usb.h
> index 9f554b5..1459122 100644
> --- a/drivers/net/wireless/realtek/rtw89/usb.h
> +++ b/drivers/net/wireless/realtek/rtw89/usb.h
> @@ -20,6 +20,9 @@
> #define RTW89_MAX_ENDPOINT_NUM 9
> #define RTW89_MAX_BULKOUT_NUM 7
>
> +/* TX flow control: max in-flight URBs per channel */
I think the code self-explain this already. No need this comment.
It'd be helpful if you can explain how you decide 32.
(In commit message, you mentioned this imitate other drivers, so
no need comment about that.)
> +#define RTW89_USB_MAX_TX_URBS_PER_CH 32
> +
> struct rtw89_usb_info {
> u32 usb_host_request_2;
> u32 usb_wlan0_1;
> @@ -63,6 +66,9 @@ struct rtw89_usb {
> struct usb_anchor tx_submitted;
>
> struct sk_buff_head tx_queue[RTW89_TXCH_NUM];
> +
> + /* TX flow control: track in-flight URBs per channel */
I'd not prefer this comment neither.
> + atomic_t tx_inflight[RTW89_TXCH_NUM];
> };
>
> static inline struct rtw89_usb *rtw89_usb_priv(struct rtw89_dev *rtwdev)
> --
> 2.53.0
^ permalink raw reply [flat|nested] 11+ messages in thread* Re: [PATCH v2] wifi: rtw89: usb: fix TX flow control by tracking in-flight URBs
2026-03-23 9:31 ` Ping-Ke Shih
@ 2026-03-23 23:33 ` Lucid Duck
2026-03-24 0:38 ` Ping-Ke Shih
0 siblings, 1 reply; 11+ messages in thread
From: Lucid Duck @ 2026-03-23 23:33 UTC (permalink / raw)
To: Ping-Ke Shih, linux-wireless; +Cc: Bitterblue Smith
Hi Ping-Ke,
Thank you for the review. Answers to each point below.
Mailer: I use a direct SMTP script -- git send-email had a hang
issue on Fedora 43 that has since been resolved. v3 is sent via
git send-email. The patch applies cleanly with git am.
Uplink data (USB3 5GHz, 10 runs each, DWA-X1850 RTL8832AU,
kernel 6.19.8):
Unpatched Patched-32 Patched-64
UL avg: 844 Mbps 763 Mbps 840 Mbps
UL retransmits: 3 0 0
32 URBs shows a ~10% upload regression on USB3 5GHz. 64 URBs
recovers to stock with zero retransmits.
> Can increasing 32 get better performance? The stress test with
> small packets might yield low throughput?
Yes. I tested 32, 64, and 128 URBs per channel. The difference
is most visible under parallel streams (USB3 5GHz upload):
Stock 32 URBs 64 URBs 128 URBs
4-stream: 858 556 837 849
8-stream: 872 565 830 833
32 URBs drops 35% under multi-stream load. 64 URBs recovers
fully. 128 URBs shows no further gain -- 64 is the sweet spot
for USB3.
On USB2, URB count does not matter -- the bus is the bottleneck:
Stock 32 URBs 64 URBs 128 URBs
UL avg (USB2): 250 252 248 253
Small packets (USB3 5GHz upload, 3 runs each avg Mbps):
Stock 32 URBs 64 URBs
64 bytes: 139 128 126
256 bytes: 441 444 442
1024 bytes: 845 786 846
Small packets are CPU/USB-framing limited, not URB-count limited.
No throughput difference at 64 or 256 bytes. At 1024 bytes, 32
URBs constrains throughput; 64 recovers.
> Out of curiosity. Is it possible inflight >
> RTW89_USB_MAX_TX_URBS_PER_CH?
No. check_and_reclaim is called before tx_kick_off, and each call
submits at most one URB. The >= is defensive only.
> I think the code self-explain this already.
> I'd not prefer this comment neither.
Both comments removed in v3.
v3 follows with MAX_TX_URBS_PER_CH increased from 32 to 64.
Additional validation at 64 URBs: UDP flood (0% loss across 4M
datagrams at 930 Mbps), bidirectional (zero retransmits), and
60-second soak (844 Mbps sustained, zero degradation).
Full test data:
https://github.com/Lucid-Duck/tx-resources-flow-control/blob/main/test-results/2026-03-23-ping-ke-v2-review.md
Thanks,
Lucid Duck
^ permalink raw reply [flat|nested] 11+ messages in thread* RE: [PATCH v2] wifi: rtw89: usb: fix TX flow control by tracking in-flight URBs
2026-03-23 23:33 ` Lucid Duck
@ 2026-03-24 0:38 ` Ping-Ke Shih
0 siblings, 0 replies; 11+ messages in thread
From: Ping-Ke Shih @ 2026-03-24 0:38 UTC (permalink / raw)
To: Lucid Duck, linux-wireless@vger.kernel.org; +Cc: Bitterblue Smith
Lucid Duck <lucid_duck@justthetip.ca> wrote:
> Hi Ping-Ke,
>
> Thank you for the review. Answers to each point below.
>
> Mailer: I use a direct SMTP script -- git send-email had a hang
> issue on Fedora 43 that has since been resolved. v3 is sent via
> git send-email. The patch applies cleanly with git am.
>
> Uplink data (USB3 5GHz, 10 runs each, DWA-X1850 RTL8832AU,
> kernel 6.19.8):
>
> Unpatched Patched-32 Patched-64
> UL avg: 844 Mbps 763 Mbps 840 Mbps
> UL retransmits: 3 0 0
If you have RTL8832CU, can you have additional test on 160 MHz bandwidth?
>
> 32 URBs shows a ~10% upload regression on USB3 5GHz. 64 URBs
> recovers to stock with zero retransmits.
>
> > Can increasing 32 get better performance? The stress test with
> > small packets might yield low throughput?
>
> Yes. I tested 32, 64, and 128 URBs per channel. The difference
> is most visible under parallel streams (USB3 5GHz upload):
>
> Stock 32 URBs 64 URBs 128 URBs
> 4-stream: 858 556 837 849
> 8-stream: 872 565 830 833
Not sure if people want 128 URBs to have a few better performance.
For me, if there is no objection or restriction, I'd take 128
because this driver can support RTL8832CU working on 160MHz
bandwidth, which maximum throughput is twice of RTL8852AU.
>
> 32 URBs drops 35% under multi-stream load. 64 URBs recovers
> fully. 128 URBs shows no further gain -- 64 is the sweet spot
> for USB3.
>
> On USB2, URB count does not matter -- the bus is the bottleneck:
>
> Stock 32 URBs 64 URBs 128 URBs
> UL avg (USB2): 250 252 248 253
>
> Small packets (USB3 5GHz upload, 3 runs each avg Mbps):
>
> Stock 32 URBs 64 URBs
> 64 bytes: 139 128 126
> 256 bytes: 441 444 442
> 1024 bytes: 845 786 846
>
> Small packets are CPU/USB-framing limited, not URB-count limited.
> No throughput difference at 64 or 256 bytes. At 1024 bytes, 32
> URBs constrains throughput; 64 recovers.
>
> > Out of curiosity. Is it possible inflight >
> > RTW89_USB_MAX_TX_URBS_PER_CH?
>
> No. check_and_reclaim is called before tx_kick_off, and each call
> submits at most one URB. The >= is defensive only.
>
> > I think the code self-explain this already.
> > I'd not prefer this comment neither.
>
> Both comments removed in v3.
Thanks. Can you additionally remove the comments of CH12, which
looks like not preferred AI-like comments? I will note this in v3.
>
> v3 follows with MAX_TX_URBS_PER_CH increased from 32 to 64.
>
> Additional validation at 64 URBs: UDP flood (0% loss across 4M
> datagrams at 930 Mbps), bidirectional (zero retransmits), and
> 60-second soak (844 Mbps sustained, zero degradation).
>
> Full test data:
> https://github.com/Lucid-Duck/tx-resources-flow-control/blob/main/test-resul
> ts/2026-03-23-ping-ke-v2-review.md
Thanks for your experiments in detail. :)
Ping-Ke
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2026-03-24 0:38 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-01-25 22:19 [PATCH] wifi: rtw89: usb: fix TX flow control by tracking in-flight URBs Lucid Duck
2026-01-26 3:39 ` Ping-Ke Shih
2026-01-26 10:14 ` Mh_chen
2026-01-27 5:00 ` Lucid Duck
2026-01-26 14:09 ` Bitterblue Smith
2026-01-27 20:01 ` Lucid Duck
[not found] ` <202601291256.60TCusZS3018440@rtits1.realtek.com.tw>
2026-01-29 13:12 ` Mh_chen
2026-03-21 3:37 ` [PATCH v2] " Lucid Duck
2026-03-23 9:31 ` Ping-Ke Shih
2026-03-23 23:33 ` Lucid Duck
2026-03-24 0:38 ` Ping-Ke Shih
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox