linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH rtw v3 0/5] wifi: fixes for rtw89
@ 2025-08-28 21:11 Fedor Pchelkin
  2025-08-28 21:11 ` [PATCH rtw v3 1/5] wifi: rtw89: fix use-after-free in rtw89_core_tx_kick_off_and_wait() Fedor Pchelkin
                   ` (4 more replies)
  0 siblings, 5 replies; 11+ messages in thread
From: Fedor Pchelkin @ 2025-08-28 21:11 UTC (permalink / raw)
  To: Ping-Ke Shih, Zong-Zhe Yang, Bitterblue Smith
  Cc: Fedor Pchelkin, Po-Hao Huang, linux-wireless, linux-kernel,
	lvc-project

v3: - further improvements for fix for use-after-free in
      rtw89_core_tx_kick_off_and_wait(), see changelog below --- in the patch
    - add patch 3/5 for USB case

v2: https://lore.kernel.org/linux-wireless/20250827120603.723548-1-pchelkin@ispras.ru/
    - rework fix for use-after-free in rtw89_core_tx_kick_off_and_wait() (Zong-Zhe)
    - add patches 2/4 and 3/4 for another issues discovered while reworking
    - use rtw tree instead of rtw-next

v1: https://lore.kernel.org/linux-wireless/20250820141441.106156-1-pchelkin@ispras.ru/ 

Fedor Pchelkin (5):
  wifi: rtw89: fix use-after-free in rtw89_core_tx_kick_off_and_wait()
  wifi: rtw89: fix tx_wait initialization race
  wifi: rtw89: perform tx_wait completions for USB part
  wifi: rtw89: fix leak in rtw89_core_send_nullfunc()
  wifi: rtw89: avoid circular locking dependency in ser_state_run()

 drivers/net/wireless/realtek/rtw89/core.c | 57 ++++++++++++++++-------
 drivers/net/wireless/realtek/rtw89/core.h | 38 +++++++++++++++-
 drivers/net/wireless/realtek/rtw89/pci.c  |  5 +-
 drivers/net/wireless/realtek/rtw89/ser.c  |  5 +-
 drivers/net/wireless/realtek/rtw89/usb.c  | 36 ++++++++------
 5 files changed, 105 insertions(+), 36 deletions(-)

-- 
2.51.0


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH rtw v3 1/5] wifi: rtw89: fix use-after-free in rtw89_core_tx_kick_off_and_wait()
  2025-08-28 21:11 [PATCH rtw v3 0/5] wifi: fixes for rtw89 Fedor Pchelkin
@ 2025-08-28 21:11 ` Fedor Pchelkin
  2025-08-29  7:42   ` Zong-Zhe Yang
  2025-08-28 21:11 ` [PATCH rtw v3 2/5] wifi: rtw89: fix tx_wait initialization race Fedor Pchelkin
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 11+ messages in thread
From: Fedor Pchelkin @ 2025-08-28 21:11 UTC (permalink / raw)
  To: Ping-Ke Shih, Zong-Zhe Yang, Bitterblue Smith
  Cc: Fedor Pchelkin, Po-Hao Huang, linux-wireless, linux-kernel,
	lvc-project, stable

There is a bug observed when rtw89_core_tx_kick_off_and_wait() tries to
access already freed skb_data:

 BUG: KFENCE: use-after-free write in rtw89_core_tx_kick_off_and_wait drivers/net/wireless/realtek/rtw89/core.c:1110

 CPU: 6 UID: 0 PID: 41377 Comm: kworker/u64:24 Not tainted  6.17.0-rc1+ #1 PREEMPT(lazy)
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS edk2-20250523-14.fc42 05/23/2025
 Workqueue: events_unbound cfg80211_wiphy_work [cfg80211]

 Use-after-free write at 0x0000000020309d9d (in kfence-#251):
 rtw89_core_tx_kick_off_and_wait drivers/net/wireless/realtek/rtw89/core.c:1110
 rtw89_core_scan_complete drivers/net/wireless/realtek/rtw89/core.c:5338
 rtw89_hw_scan_complete_cb drivers/net/wireless/realtek/rtw89/fw.c:7979
 rtw89_chanctx_proceed_cb drivers/net/wireless/realtek/rtw89/chan.c:3165
 rtw89_chanctx_proceed drivers/net/wireless/realtek/rtw89/chan.h:141
 rtw89_hw_scan_complete drivers/net/wireless/realtek/rtw89/fw.c:8012
 rtw89_mac_c2h_scanofld_rsp drivers/net/wireless/realtek/rtw89/mac.c:5059
 rtw89_fw_c2h_work drivers/net/wireless/realtek/rtw89/fw.c:6758
 process_one_work kernel/workqueue.c:3241
 worker_thread kernel/workqueue.c:3400
 kthread kernel/kthread.c:463
 ret_from_fork arch/x86/kernel/process.c:154
 ret_from_fork_asm arch/x86/entry/entry_64.S:258

 kfence-#251: 0x0000000056e2393d-0x000000009943cb62, size=232, cache=skbuff_head_cache

 allocated by task 41377 on cpu 6 at 77869.159548s (0.009551s ago):
 __alloc_skb net/core/skbuff.c:659
 __netdev_alloc_skb net/core/skbuff.c:734
 ieee80211_nullfunc_get net/mac80211/tx.c:5844
 rtw89_core_send_nullfunc drivers/net/wireless/realtek/rtw89/core.c:3431
 rtw89_core_scan_complete drivers/net/wireless/realtek/rtw89/core.c:5338
 rtw89_hw_scan_complete_cb drivers/net/wireless/realtek/rtw89/fw.c:7979
 rtw89_chanctx_proceed_cb drivers/net/wireless/realtek/rtw89/chan.c:3165
 rtw89_chanctx_proceed drivers/net/wireless/realtek/rtw89/chan.c:3194
 rtw89_hw_scan_complete drivers/net/wireless/realtek/rtw89/fw.c:8012
 rtw89_mac_c2h_scanofld_rsp drivers/net/wireless/realtek/rtw89/mac.c:5059
 rtw89_fw_c2h_work drivers/net/wireless/realtek/rtw89/fw.c:6758
 process_one_work kernel/workqueue.c:3241
 worker_thread kernel/workqueue.c:3400
 kthread kernel/kthread.c:463
 ret_from_fork arch/x86/kernel/process.c:154
 ret_from_fork_asm arch/x86/entry/entry_64.S:258

 freed by task 1045 on cpu 9 at 77869.168393s (0.001557s ago):
 ieee80211_tx_status_skb net/mac80211/status.c:1117
 rtw89_pci_release_txwd_skb drivers/net/wireless/realtek/rtw89/pci.c:564
 rtw89_pci_release_tx_skbs.isra.0 drivers/net/wireless/realtek/rtw89/pci.c:651
 rtw89_pci_release_tx drivers/net/wireless/realtek/rtw89/pci.c:676
 rtw89_pci_napi_poll drivers/net/wireless/realtek/rtw89/pci.c:4238
 __napi_poll net/core/dev.c:7495
 net_rx_action net/core/dev.c:7557 net/core/dev.c:7684
 handle_softirqs kernel/softirq.c:580
 do_softirq.part.0 kernel/softirq.c:480
 __local_bh_enable_ip kernel/softirq.c:407
 rtw89_pci_interrupt_threadfn drivers/net/wireless/realtek/rtw89/pci.c:927
 irq_thread_fn kernel/irq/manage.c:1133
 irq_thread kernel/irq/manage.c:1257
 kthread kernel/kthread.c:463
 ret_from_fork arch/x86/kernel/process.c:154
 ret_from_fork_asm arch/x86/entry/entry_64.S:258

It is a consequence of a race between the waiting and the signaling side
of the completion:

            Waiting thread                            Completing thread

rtw89_core_tx_kick_off_and_wait()
  rcu_assign_pointer(skb_data->wait, wait)
  /* start waiting */
  wait_for_completion_timeout()
                                                rtw89_pci_tx_status()
                                                  rtw89_core_tx_wait_complete()
                                                    rcu_read_lock()
                                                    /* signals completion and
                                                     * proceeds further
                                                     */
                                                    complete(&wait->completion)
                                                    rcu_read_unlock()
                                                  ...
                                                  /* frees skb_data */
                                                  ieee80211_tx_status_ni()
  /* returns (exit status doesn't matter) */
  wait_for_completion_timeout()
  ...
  /* accesses the already freed skb_data */
  rcu_assign_pointer(skb_data->wait, NULL)

The completing side might proceed and free the underlying skb even before
the waiting side is fully awoken and run to execution.  Actually the race
happens regardless of wait_for_completion_timeout() exit status, e.g.
the waiting side may hit a timeout and the concurrent completing side is
still able to free the skb.

Skbs which are sent by rtw89_core_tx_kick_off_and_wait() are owned by the
driver.  They don't come from core ieee80211 stack so no need to pass them
to ieee80211_tx_status_ni() on completing side.

Introduce a work function which will act as a garbage collector for
rtw89_tx_wait_info objects and the associated skbs.  Thus no potentially
heavy locks are required on the completing side.

Found by Linux Verification Center (linuxtesting.org).

Fixes: 1ae5ca615285 ("wifi: rtw89: add function to wait for completion of TX skbs")
Cc: stable@vger.kernel.org
Suggested-by: Zong-Zhe Yang <kevin_yang@realtek.com>
Signed-off-by: Fedor Pchelkin <pchelkin@ispras.ru>
---

v3: - decrease waiting timeout in rtw89_tx_wait_work() (Zong-Zhe)
    - clear tx_waits list from rtw89_hci_reset(), too (Zong-Zhe)
    - keep RCU updating for skb_data->wait (Zong-Zhe)
    - keep the old order of calls in rtw89_pci_tx_status() (Zong-Zhe)
    - drop wait->finished as complete_all() would make the completion be
      done permanently (me)

v2: - use a work function to manage release of tx_waits and associated skbs (Zong-Zhe)


FWIW, I think RCU locking becomes unnecessary after the next patch
[PATCH rtw v3 2/5] wifi: rtw89: fix tx_wait initialization race
but let's keep it if you feel that's more safe.


 drivers/net/wireless/realtek/rtw89/core.c | 28 ++++++++++++++---
 drivers/net/wireless/realtek/rtw89/core.h | 38 +++++++++++++++++++++--
 drivers/net/wireless/realtek/rtw89/pci.c  |  3 +-
 drivers/net/wireless/realtek/rtw89/ser.c  |  2 ++
 4 files changed, 63 insertions(+), 8 deletions(-)

diff --git a/drivers/net/wireless/realtek/rtw89/core.c b/drivers/net/wireless/realtek/rtw89/core.c
index 57590f5577a3..5c964283cd67 100644
--- a/drivers/net/wireless/realtek/rtw89/core.c
+++ b/drivers/net/wireless/realtek/rtw89/core.c
@@ -1073,6 +1073,14 @@ rtw89_core_tx_update_desc_info(struct rtw89_dev *rtwdev,
 	}
 }
 
+static void rtw89_tx_wait_work(struct wiphy *wiphy, struct wiphy_work *work)
+{
+	struct rtw89_dev *rtwdev = container_of(work, struct rtw89_dev,
+						tx_wait_work);
+
+	rtw89_tx_wait_list_clear(rtwdev);
+}
+
 void rtw89_core_tx_kick_off(struct rtw89_dev *rtwdev, u8 qsel)
 {
 	u8 ch_dma;
@@ -1090,6 +1098,8 @@ int rtw89_core_tx_kick_off_and_wait(struct rtw89_dev *rtwdev, struct sk_buff *sk
 	unsigned long time_left;
 	int ret = 0;
 
+	lockdep_assert_wiphy(rtwdev->hw->wiphy);
+
 	wait = kzalloc(sizeof(*wait), GFP_KERNEL);
 	if (!wait) {
 		rtw89_core_tx_kick_off(rtwdev, qsel);
@@ -1098,17 +1108,22 @@ int rtw89_core_tx_kick_off_and_wait(struct rtw89_dev *rtwdev, struct sk_buff *sk
 
 	init_completion(&wait->completion);
 	rcu_assign_pointer(skb_data->wait, wait);
+	wait->skb = skb;
 
 	rtw89_core_tx_kick_off(rtwdev, qsel);
 	time_left = wait_for_completion_timeout(&wait->completion,
 						msecs_to_jiffies(timeout));
-	if (time_left == 0)
+
+	if (time_left == 0) {
 		ret = -ETIMEDOUT;
-	else if (!wait->tx_done)
-		ret = -EAGAIN;
+		list_add_tail(&wait->list, &rtwdev->tx_waits);
+		wiphy_work_queue(rtwdev->hw->wiphy, &rtwdev->tx_wait_work);
 
-	rcu_assign_pointer(skb_data->wait, NULL);
-	kfree_rcu(wait, rcu_head);
+	} else {
+		if (!wait->tx_done)
+			ret = -EAGAIN;
+		rtw89_tx_wait_release(&wait->rcu_head);
+	}
 
 	return ret;
 }
@@ -4972,6 +4987,7 @@ void rtw89_core_stop(struct rtw89_dev *rtwdev)
 	clear_bit(RTW89_FLAG_RUNNING, rtwdev->flags);
 
 	wiphy_work_cancel(wiphy, &rtwdev->c2h_work);
+	wiphy_work_cancel(wiphy, &rtwdev->tx_wait_work);
 	wiphy_work_cancel(wiphy, &rtwdev->cancel_6ghz_probe_work);
 	wiphy_work_cancel(wiphy, &btc->eapol_notify_work);
 	wiphy_work_cancel(wiphy, &btc->arp_notify_work);
@@ -5203,6 +5219,7 @@ int rtw89_core_init(struct rtw89_dev *rtwdev)
 		INIT_LIST_HEAD(&rtwdev->scan_info.pkt_list[band]);
 	}
 	INIT_LIST_HEAD(&rtwdev->scan_info.chan_list);
+	INIT_LIST_HEAD(&rtwdev->tx_waits);
 	INIT_WORK(&rtwdev->ba_work, rtw89_core_ba_work);
 	INIT_WORK(&rtwdev->txq_work, rtw89_core_txq_work);
 	INIT_DELAYED_WORK(&rtwdev->txq_reinvoke_work, rtw89_core_txq_reinvoke_work);
@@ -5233,6 +5250,7 @@ int rtw89_core_init(struct rtw89_dev *rtwdev)
 	wiphy_work_init(&rtwdev->c2h_work, rtw89_fw_c2h_work);
 	wiphy_work_init(&rtwdev->ips_work, rtw89_ips_work);
 	wiphy_work_init(&rtwdev->cancel_6ghz_probe_work, rtw89_cancel_6ghz_probe_work);
+	wiphy_work_init(&rtwdev->tx_wait_work, rtw89_tx_wait_work);
 	INIT_WORK(&rtwdev->load_firmware_work, rtw89_load_firmware_work);
 
 	skb_queue_head_init(&rtwdev->c2h_queue);
diff --git a/drivers/net/wireless/realtek/rtw89/core.h b/drivers/net/wireless/realtek/rtw89/core.h
index 43e10278e14d..9b22bb116af9 100644
--- a/drivers/net/wireless/realtek/rtw89/core.h
+++ b/drivers/net/wireless/realtek/rtw89/core.h
@@ -3506,9 +3506,12 @@ struct rtw89_phy_rate_pattern {
 	bool enable;
 };
 
+#define RTW89_TX_WAIT_DEFAULT_TIMEOUT 10
 struct rtw89_tx_wait_info {
 	struct rcu_head rcu_head;
+	struct list_head list;
 	struct completion completion;
+	struct sk_buff *skb;
 	bool tx_done;
 };
 
@@ -5925,6 +5928,9 @@ struct rtw89_dev {
 	/* used to protect rpwm */
 	spinlock_t rpwm_lock;
 
+	struct list_head tx_waits;
+	struct wiphy_work tx_wait_work;
+
 	struct rtw89_cam_info cam_info;
 
 	struct sk_buff_head c2h_queue;
@@ -6181,6 +6187,30 @@ rtw89_assoc_link_rcu_dereference(struct rtw89_dev *rtwdev, u8 macid)
 	list_first_entry_or_null(&p->dlink_pool, typeof(*p->links_inst), dlink_schd); \
 })
 
+static inline void rtw89_tx_wait_release(struct rcu_head *rcu_head)
+{
+	struct rtw89_tx_wait_info *wait =
+		container_of(rcu_head, struct rtw89_tx_wait_info, rcu_head);
+
+	dev_kfree_skb_any(wait->skb);
+	kfree(wait);
+}
+
+static inline void rtw89_tx_wait_list_clear(struct rtw89_dev *rtwdev)
+{
+	struct rtw89_tx_wait_info *wait, *tmp;
+
+	lockdep_assert_wiphy(rtwdev->hw->wiphy);
+
+	list_for_each_entry_safe(wait, tmp, &rtwdev->tx_waits, list) {
+		if (!wait_for_completion_timeout(&wait->completion,
+						 RTW89_TX_WAIT_DEFAULT_TIMEOUT))
+			continue;
+		list_del(&wait->list);
+		call_rcu(&wait->rcu_head, rtw89_tx_wait_release);
+	}
+}
+
 static inline int rtw89_hci_tx_write(struct rtw89_dev *rtwdev,
 				     struct rtw89_core_tx_request *tx_req)
 {
@@ -6190,6 +6220,7 @@ static inline int rtw89_hci_tx_write(struct rtw89_dev *rtwdev,
 static inline void rtw89_hci_reset(struct rtw89_dev *rtwdev)
 {
 	rtwdev->hci.ops->reset(rtwdev);
+	rtw89_tx_wait_list_clear(rtwdev);
 }
 
 static inline int rtw89_hci_start(struct rtw89_dev *rtwdev)
@@ -7258,11 +7289,12 @@ static inline struct sk_buff *rtw89_alloc_skb_for_rx(struct rtw89_dev *rtwdev,
 	return dev_alloc_skb(length);
 }
 
-static inline void rtw89_core_tx_wait_complete(struct rtw89_dev *rtwdev,
+static inline bool rtw89_core_tx_wait_complete(struct rtw89_dev *rtwdev,
 					       struct rtw89_tx_skb_data *skb_data,
 					       bool tx_done)
 {
 	struct rtw89_tx_wait_info *wait;
+	bool ret = false;
 
 	rcu_read_lock();
 
@@ -7270,11 +7302,13 @@ static inline void rtw89_core_tx_wait_complete(struct rtw89_dev *rtwdev,
 	if (!wait)
 		goto out;
 
+	ret = true;
 	wait->tx_done = tx_done;
-	complete(&wait->completion);
+	complete_all(&wait->completion);
 
 out:
 	rcu_read_unlock();
+	return ret;
 }
 
 static inline bool rtw89_is_mlo_1_1(struct rtw89_dev *rtwdev)
diff --git a/drivers/net/wireless/realtek/rtw89/pci.c b/drivers/net/wireless/realtek/rtw89/pci.c
index a669f2f843aa..4e3034b44f56 100644
--- a/drivers/net/wireless/realtek/rtw89/pci.c
+++ b/drivers/net/wireless/realtek/rtw89/pci.c
@@ -464,7 +464,8 @@ static void rtw89_pci_tx_status(struct rtw89_dev *rtwdev,
 	struct rtw89_tx_skb_data *skb_data = RTW89_TX_SKB_CB(skb);
 	struct ieee80211_tx_info *info;
 
-	rtw89_core_tx_wait_complete(rtwdev, skb_data, tx_status == RTW89_TX_DONE);
+	if (rtw89_core_tx_wait_complete(rtwdev, skb_data, tx_status == RTW89_TX_DONE))
+		return;
 
 	info = IEEE80211_SKB_CB(skb);
 	ieee80211_tx_info_clear_status(info);
diff --git a/drivers/net/wireless/realtek/rtw89/ser.c b/drivers/net/wireless/realtek/rtw89/ser.c
index bb39fdbcba0d..fe7beff8c424 100644
--- a/drivers/net/wireless/realtek/rtw89/ser.c
+++ b/drivers/net/wireless/realtek/rtw89/ser.c
@@ -502,7 +502,9 @@ static void ser_reset_trx_st_hdl(struct rtw89_ser *ser, u8 evt)
 		}
 
 		drv_stop_rx(ser);
+		wiphy_lock(wiphy);
 		drv_trx_reset(ser);
+		wiphy_unlock(wiphy);
 
 		/* wait m3 */
 		hal_send_m2_event(ser);
-- 
2.51.0


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH rtw v3 2/5] wifi: rtw89: fix tx_wait initialization race
  2025-08-28 21:11 [PATCH rtw v3 0/5] wifi: fixes for rtw89 Fedor Pchelkin
  2025-08-28 21:11 ` [PATCH rtw v3 1/5] wifi: rtw89: fix use-after-free in rtw89_core_tx_kick_off_and_wait() Fedor Pchelkin
@ 2025-08-28 21:11 ` Fedor Pchelkin
  2025-08-29  9:40   ` Zong-Zhe Yang
  2025-08-28 21:11 ` [PATCH rtw v3 3/5] wifi: rtw89: perform tx_wait completions for USB part Fedor Pchelkin
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 11+ messages in thread
From: Fedor Pchelkin @ 2025-08-28 21:11 UTC (permalink / raw)
  To: Ping-Ke Shih, Zong-Zhe Yang, Bitterblue Smith
  Cc: Fedor Pchelkin, Po-Hao Huang, linux-wireless, linux-kernel,
	lvc-project, stable

Now that nullfunc skbs are recycled in a separate work item in the driver,
the following race during initialization and processing of those skbs
might lead to noticeable bugs:

      Waiting thread                          Completing thread

rtw89_core_send_nullfunc()
  rtw89_core_tx_write_link()
    ...
    rtw89_pci_txwd_submit()
      skb_data->wait = NULL
      /* add skb to the queue */
      skb_queue_tail(&txwd->queue, skb)
                                            rtw89_pci_napi_poll()
                                            ...
                                              rtw89_pci_release_txwd_skb()
                                                /* get skb from the queue */
                                                skb_unlink(skb, &txwd->queue)
                                                rtw89_pci_tx_status()
                                                  rtw89_core_tx_wait_complete()
                                                  /* use incorrect skb_data->wait */
  rtw89_core_tx_kick_off_and_wait()
  /* assign skb_data->wait but too late */

The value of skb_data->wait indicates whether skb is passed on to the
core ieee80211 stack or released by the driver itself.  So assure that by
the time skb is added to txwd queue and becomes visible to the completing
side, it has already allocated tx_wait-related data (in case it's needed).

Found by Linux Verification Center (linuxtesting.org).

Fixes: 1ae5ca615285 ("wifi: rtw89: add function to wait for completion of TX skbs")
Cc: stable@vger.kernel.org
Signed-off-by: Fedor Pchelkin <pchelkin@ispras.ru>
---
 drivers/net/wireless/realtek/rtw89/core.c | 30 ++++++++++++++---------
 drivers/net/wireless/realtek/rtw89/pci.c  |  2 --
 2 files changed, 18 insertions(+), 14 deletions(-)

diff --git a/drivers/net/wireless/realtek/rtw89/core.c b/drivers/net/wireless/realtek/rtw89/core.c
index 5c964283cd67..57f20559dfde 100644
--- a/drivers/net/wireless/realtek/rtw89/core.c
+++ b/drivers/net/wireless/realtek/rtw89/core.c
@@ -1094,20 +1094,12 @@ int rtw89_core_tx_kick_off_and_wait(struct rtw89_dev *rtwdev, struct sk_buff *sk
 				    int qsel, unsigned int timeout)
 {
 	struct rtw89_tx_skb_data *skb_data = RTW89_TX_SKB_CB(skb);
-	struct rtw89_tx_wait_info *wait;
+	struct rtw89_tx_wait_info *wait = skb_data->wait;
 	unsigned long time_left;
 	int ret = 0;
 
 	lockdep_assert_wiphy(rtwdev->hw->wiphy);
 
-	wait = kzalloc(sizeof(*wait), GFP_KERNEL);
-	if (!wait) {
-		rtw89_core_tx_kick_off(rtwdev, qsel);
-		return 0;
-	}
-
-	init_completion(&wait->completion);
-	rcu_assign_pointer(skb_data->wait, wait);
 	wait->skb = skb;
 
 	rtw89_core_tx_kick_off(rtwdev, qsel);
@@ -1172,10 +1164,12 @@ int rtw89_h2c_tx(struct rtw89_dev *rtwdev,
 static int rtw89_core_tx_write_link(struct rtw89_dev *rtwdev,
 				    struct rtw89_vif_link *rtwvif_link,
 				    struct rtw89_sta_link *rtwsta_link,
-				    struct sk_buff *skb, int *qsel, bool sw_mld)
+				    struct sk_buff *skb, int *qsel, bool sw_mld,
+				    struct rtw89_tx_wait_info *wait)
 {
 	struct ieee80211_sta *sta = rtwsta_link_to_sta_safe(rtwsta_link);
 	struct ieee80211_vif *vif = rtwvif_link_to_vif(rtwvif_link);
+	struct rtw89_tx_skb_data *skb_data = RTW89_TX_SKB_CB(skb);
 	struct rtw89_vif *rtwvif = rtwvif_link->rtwvif;
 	struct rtw89_core_tx_request tx_req = {};
 	int ret;
@@ -1192,6 +1186,8 @@ static int rtw89_core_tx_write_link(struct rtw89_dev *rtwdev,
 	rtw89_core_tx_update_desc_info(rtwdev, &tx_req);
 	rtw89_core_tx_wake(rtwdev, &tx_req);
 
+	rcu_assign_pointer(skb_data->wait, wait);
+
 	ret = rtw89_hci_tx_write(rtwdev, &tx_req);
 	if (ret) {
 		rtw89_err(rtwdev, "failed to transmit skb to HCI\n");
@@ -1228,7 +1224,8 @@ int rtw89_core_tx_write(struct rtw89_dev *rtwdev, struct ieee80211_vif *vif,
 		}
 	}
 
-	return rtw89_core_tx_write_link(rtwdev, rtwvif_link, rtwsta_link, skb, qsel, false);
+	return rtw89_core_tx_write_link(rtwdev, rtwvif_link, rtwsta_link, skb, qsel, false,
+					NULL);
 }
 
 static __le32 rtw89_build_txwd_body0(struct rtw89_tx_desc_info *desc_info)
@@ -3426,6 +3423,7 @@ int rtw89_core_send_nullfunc(struct rtw89_dev *rtwdev, struct rtw89_vif_link *rt
 	struct ieee80211_vif *vif = rtwvif_link_to_vif(rtwvif_link);
 	int link_id = ieee80211_vif_is_mld(vif) ? rtwvif_link->link_id : -1;
 	struct rtw89_sta_link *rtwsta_link;
+	struct rtw89_tx_wait_info *wait;
 	struct ieee80211_sta *sta;
 	struct ieee80211_hdr *hdr;
 	struct rtw89_sta *rtwsta;
@@ -3435,6 +3433,12 @@ int rtw89_core_send_nullfunc(struct rtw89_dev *rtwdev, struct rtw89_vif_link *rt
 	if (vif->type != NL80211_IFTYPE_STATION || !vif->cfg.assoc)
 		return 0;
 
+	wait = kzalloc(sizeof(*wait), GFP_KERNEL);
+	if (!wait)
+		return -ENOMEM;
+
+	init_completion(&wait->completion);
+
 	rcu_read_lock();
 	sta = ieee80211_find_sta(vif, vif->cfg.ap_addr);
 	if (!sta) {
@@ -3459,7 +3463,8 @@ int rtw89_core_send_nullfunc(struct rtw89_dev *rtwdev, struct rtw89_vif_link *rt
 		goto out;
 	}
 
-	ret = rtw89_core_tx_write_link(rtwdev, rtwvif_link, rtwsta_link, skb, &qsel, true);
+	ret = rtw89_core_tx_write_link(rtwdev, rtwvif_link, rtwsta_link, skb, &qsel, true,
+				       wait);
 	if (ret) {
 		rtw89_warn(rtwdev, "nullfunc transmit failed: %d\n", ret);
 		dev_kfree_skb_any(skb);
@@ -3472,6 +3477,7 @@ int rtw89_core_send_nullfunc(struct rtw89_dev *rtwdev, struct rtw89_vif_link *rt
 					       timeout);
 out:
 	rcu_read_unlock();
+	kfree(wait);
 
 	return ret;
 }
diff --git a/drivers/net/wireless/realtek/rtw89/pci.c b/drivers/net/wireless/realtek/rtw89/pci.c
index 4e3034b44f56..cb9682f306a6 100644
--- a/drivers/net/wireless/realtek/rtw89/pci.c
+++ b/drivers/net/wireless/realtek/rtw89/pci.c
@@ -1372,7 +1372,6 @@ static int rtw89_pci_txwd_submit(struct rtw89_dev *rtwdev,
 	struct pci_dev *pdev = rtwpci->pdev;
 	struct sk_buff *skb = tx_req->skb;
 	struct rtw89_pci_tx_data *tx_data = RTW89_PCI_TX_SKB_CB(skb);
-	struct rtw89_tx_skb_data *skb_data = RTW89_TX_SKB_CB(skb);
 	bool en_wd_info = desc_info->en_wd_info;
 	u32 txwd_len;
 	u32 txwp_len;
@@ -1388,7 +1387,6 @@ static int rtw89_pci_txwd_submit(struct rtw89_dev *rtwdev,
 	}
 
 	tx_data->dma = dma;
-	rcu_assign_pointer(skb_data->wait, NULL);
 
 	txwp_len = sizeof(*txwp_info);
 	txwd_len = chip->txwd_body_size;
-- 
2.51.0


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH rtw v3 3/5] wifi: rtw89: perform tx_wait completions for USB part
  2025-08-28 21:11 [PATCH rtw v3 0/5] wifi: fixes for rtw89 Fedor Pchelkin
  2025-08-28 21:11 ` [PATCH rtw v3 1/5] wifi: rtw89: fix use-after-free in rtw89_core_tx_kick_off_and_wait() Fedor Pchelkin
  2025-08-28 21:11 ` [PATCH rtw v3 2/5] wifi: rtw89: fix tx_wait initialization race Fedor Pchelkin
@ 2025-08-28 21:11 ` Fedor Pchelkin
  2025-08-29 19:57   ` Bitterblue Smith
  2025-08-28 21:12 ` [PATCH rtw v3 4/5] wifi: rtw89: fix leak in rtw89_core_send_nullfunc() Fedor Pchelkin
  2025-08-28 21:12 ` [PATCH rtw v3 5/5] wifi: rtw89: avoid circular locking dependency in ser_state_run() Fedor Pchelkin
  4 siblings, 1 reply; 11+ messages in thread
From: Fedor Pchelkin @ 2025-08-28 21:11 UTC (permalink / raw)
  To: Ping-Ke Shih, Zong-Zhe Yang, Bitterblue Smith
  Cc: Fedor Pchelkin, Po-Hao Huang, linux-wireless, linux-kernel,
	lvc-project

There is no completion signaling for tx_wait skbs on USB part. This means
rtw89_core_tx_kick_off_and_wait() always returns with a timeout.

Moreover, recent rework of tx_wait objects lifecycle handling made the
driver be responsible for freeing the associated skbs, not the core
ieee80211 stack. Lack of completion signaling would cause those objects
being kept in driver internal tx_waits queue until rtw89_hci_reset()
occurs, and then a double free would happen.

Extract TX status handling into a separate function, like its
rtw89_pci_tx_status() counterpart. Signal completion from there.

Found by Linux Verification Center (linuxtesting.org).

Fixes: 2135c28be6a8 ("wifi: rtw89: Add usb.{c,h}")
Signed-off-by: Fedor Pchelkin <pchelkin@ispras.ru>
---

New series iteration -> new nuances found.

It seems the two previous patches from the series would not be too great
in USB case because there is no completion signaling for tx_wait skbs
there.

I don't have this hardware so *the patch is compile tested only*. It'd
be nice if someone gave it a run on top of two previous patches of the
series, thanks!

 drivers/net/wireless/realtek/rtw89/usb.c | 36 +++++++++++++++---------
 1 file changed, 23 insertions(+), 13 deletions(-)

diff --git a/drivers/net/wireless/realtek/rtw89/usb.c b/drivers/net/wireless/realtek/rtw89/usb.c
index 6cf89aee252e..10fe19bd5166 100644
--- a/drivers/net/wireless/realtek/rtw89/usb.c
+++ b/drivers/net/wireless/realtek/rtw89/usb.c
@@ -188,11 +188,32 @@ static u8 rtw89_usb_get_bulkout_id(u8 ch_dma)
 	}
 }
 
+static void rtw89_usb_tx_status(struct rtw89_dev *rtwdev, struct sk_buff *skb,
+				int status)
+{
+	struct rtw89_tx_skb_data *skb_data = RTW89_TX_SKB_CB(skb);
+	struct ieee80211_tx_info *info;
+
+	if (rtw89_core_tx_wait_complete(rtwdev, skb_data, status == 0))
+		return;
+
+	info = IEEE80211_SKB_CB(skb);
+	ieee80211_tx_info_clear_status(info);
+
+	if (status == 0) {
+		if (info->flags & IEEE80211_TX_CTL_NO_ACK)
+			info->flags |= IEEE80211_TX_STAT_NOACK_TRANSMITTED;
+		else
+			info->flags |= IEEE80211_TX_STAT_ACK;
+	}
+
+	ieee80211_tx_status_irqsafe(rtwdev->hw, skb);
+}
+
 static void rtw89_usb_write_port_complete(struct urb *urb)
 {
 	struct rtw89_usb_tx_ctrl_block *txcb = urb->context;
 	struct rtw89_dev *rtwdev = txcb->rtwdev;
-	struct ieee80211_tx_info *info;
 	struct rtw89_txwd_body *txdesc;
 	struct sk_buff *skb;
 	u32 txdesc_size;
@@ -214,18 +235,7 @@ static void rtw89_usb_write_port_complete(struct urb *urb)
 			txdesc_size += rtwdev->chip->txwd_info_size;
 
 		skb_pull(skb, txdesc_size);
-
-		info = IEEE80211_SKB_CB(skb);
-		ieee80211_tx_info_clear_status(info);
-
-		if (urb->status == 0) {
-			if (info->flags & IEEE80211_TX_CTL_NO_ACK)
-				info->flags |= IEEE80211_TX_STAT_NOACK_TRANSMITTED;
-			else
-				info->flags |= IEEE80211_TX_STAT_ACK;
-		}
-
-		ieee80211_tx_status_irqsafe(rtwdev->hw, skb);
+		rtw89_usb_tx_status(rtwdev, skb, urb->status);
 	}
 
 	switch (urb->status) {
-- 
2.51.0


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH rtw v3 4/5] wifi: rtw89: fix leak in rtw89_core_send_nullfunc()
  2025-08-28 21:11 [PATCH rtw v3 0/5] wifi: fixes for rtw89 Fedor Pchelkin
                   ` (2 preceding siblings ...)
  2025-08-28 21:11 ` [PATCH rtw v3 3/5] wifi: rtw89: perform tx_wait completions for USB part Fedor Pchelkin
@ 2025-08-28 21:12 ` Fedor Pchelkin
  2025-08-28 21:12 ` [PATCH rtw v3 5/5] wifi: rtw89: avoid circular locking dependency in ser_state_run() Fedor Pchelkin
  4 siblings, 0 replies; 11+ messages in thread
From: Fedor Pchelkin @ 2025-08-28 21:12 UTC (permalink / raw)
  To: Ping-Ke Shih, Zong-Zhe Yang, Bitterblue Smith
  Cc: Fedor Pchelkin, Po-Hao Huang, linux-wireless, linux-kernel,
	lvc-project

If there is no rtwsta_link found in rtw89_core_send_nullfunc(), allocated
skb is leaked.  Free it on the error handling path.

Found by Linux Verification Center (linuxtesting.org).

Fixes: a8ba4acab7db ("wifi: rtw89: send nullfunc based on the given link")
Signed-off-by: Fedor Pchelkin <pchelkin@ispras.ru>
---

v3: - use dev_kfree_skb_any() directly on error handling path without
      extra goto complexity  (Zong-Zhe)

 drivers/net/wireless/realtek/rtw89/core.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/wireless/realtek/rtw89/core.c b/drivers/net/wireless/realtek/rtw89/core.c
index 57f20559dfde..0076dae4572c 100644
--- a/drivers/net/wireless/realtek/rtw89/core.c
+++ b/drivers/net/wireless/realtek/rtw89/core.c
@@ -3460,6 +3460,7 @@ int rtw89_core_send_nullfunc(struct rtw89_dev *rtwdev, struct rtw89_vif_link *rt
 	rtwsta_link = rtwsta->links[rtwvif_link->link_id];
 	if (unlikely(!rtwsta_link)) {
 		ret = -ENOLINK;
+		dev_kfree_skb_any(skb);
 		goto out;
 	}
 
-- 
2.51.0


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH rtw v3 5/5] wifi: rtw89: avoid circular locking dependency in ser_state_run()
  2025-08-28 21:11 [PATCH rtw v3 0/5] wifi: fixes for rtw89 Fedor Pchelkin
                   ` (3 preceding siblings ...)
  2025-08-28 21:12 ` [PATCH rtw v3 4/5] wifi: rtw89: fix leak in rtw89_core_send_nullfunc() Fedor Pchelkin
@ 2025-08-28 21:12 ` Fedor Pchelkin
  4 siblings, 0 replies; 11+ messages in thread
From: Fedor Pchelkin @ 2025-08-28 21:12 UTC (permalink / raw)
  To: Ping-Ke Shih, Zong-Zhe Yang, Bitterblue Smith
  Cc: Fedor Pchelkin, Po-Hao Huang, linux-wireless, linux-kernel,
	lvc-project

Lockdep gives a splat [1] when ser_hdl_work item is executed.  It is
scheduled at mac80211 workqueue via ieee80211_queue_work() and takes a
wiphy lock inside.  However, this workqueue can be flushed when e.g.
closing the interface and wiphy lock is already taken in that case.

Choosing wiphy_work_queue() for SER is likely not suitable.  Back on to
the global workqueue.

[1]:

 WARNING: possible circular locking dependency detected
 6.17.0-rc2 #17 Not tainted
 ------------------------------------------------------
 kworker/u32:1/61 is trying to acquire lock:
 ffff88811bc00768 (&rdev->wiphy.mtx){+.+.}-{4:4}, at: ser_state_run+0x5e/0x180 [rtw89_core]

 but task is already holding lock:
 ffffc9000048fd30 ((work_completion)(&ser->ser_hdl_work)){+.+.}-{0:0}, at: process_one_work+0x7b5/0x1450

 which lock already depends on the new lock.

 the existing dependency chain (in reverse order) is:

 -> #2 ((work_completion)(&ser->ser_hdl_work)){+.+.}-{0:0}:
        process_one_work+0x7c6/0x1450
        worker_thread+0x49e/0xd00
        kthread+0x313/0x640
        ret_from_fork+0x221/0x300
        ret_from_fork_asm+0x1a/0x30

 -> #1 ((wq_completion)phy0){+.+.}-{0:0}:
        touch_wq_lockdep_map+0x8e/0x180
        __flush_workqueue+0x129/0x10d0
        ieee80211_stop_device+0xa8/0x110
        ieee80211_do_stop+0x14ce/0x2880
        ieee80211_stop+0x13a/0x2c0
        __dev_close_many+0x18f/0x510
        __dev_change_flags+0x25f/0x670
        netif_change_flags+0x7b/0x160
        do_setlink.isra.0+0x1640/0x35d0
        rtnl_newlink+0xd8c/0x1d30
        rtnetlink_rcv_msg+0x700/0xb80
        netlink_rcv_skb+0x11d/0x350
        netlink_unicast+0x49a/0x7a0
        netlink_sendmsg+0x759/0xc20
        ____sys_sendmsg+0x812/0xa00
        ___sys_sendmsg+0xf7/0x180
        __sys_sendmsg+0x11f/0x1b0
        do_syscall_64+0xbb/0x360
        entry_SYSCALL_64_after_hwframe+0x77/0x7f

 -> #0 (&rdev->wiphy.mtx){+.+.}-{4:4}:
        __lock_acquire+0x124c/0x1d20
        lock_acquire+0x154/0x2e0
        __mutex_lock+0x17b/0x12f0
        ser_state_run+0x5e/0x180 [rtw89_core]
        rtw89_ser_hdl_work+0x119/0x220 [rtw89_core]
        process_one_work+0x82d/0x1450
        worker_thread+0x49e/0xd00
        kthread+0x313/0x640
        ret_from_fork+0x221/0x300
        ret_from_fork_asm+0x1a/0x30

 other info that might help us debug this:

 Chain exists of:
   &rdev->wiphy.mtx --> (wq_completion)phy0 --> (work_completion)(&ser->ser_hdl_work)

  Possible unsafe locking scenario:

        CPU0                    CPU1
        ----                    ----
   lock((work_completion)(&ser->ser_hdl_work));
                                lock((wq_completion)phy0);
                                lock((work_completion)(&ser->ser_hdl_work));
   lock(&rdev->wiphy.mtx);

  *** DEADLOCK ***

 2 locks held by kworker/u32:1/61:
  #0: ffff888103835148 ((wq_completion)phy0){+.+.}-{0:0}, at: process_one_work+0xefa/0x1450
  #1: ffffc9000048fd30 ((work_completion)(&ser->ser_hdl_work)){+.+.}-{0:0}, at: process_one_work+0x7b5/0x1450

 stack backtrace:
 CPU: 0 UID: 0 PID: 61 Comm: kworker/u32:1 Not tainted 6.17.0-rc2 #17 PREEMPT(voluntary)
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS edk2-20250523-14.fc42 05/23/2025
 Workqueue: phy0 rtw89_ser_hdl_work [rtw89_core]
 Call Trace:
  <TASK>
  dump_stack_lvl+0x5d/0x80
  print_circular_bug.cold+0x178/0x1be
  check_noncircular+0x14c/0x170
  __lock_acquire+0x124c/0x1d20
  lock_acquire+0x154/0x2e0
  __mutex_lock+0x17b/0x12f0
  ser_state_run+0x5e/0x180 [rtw89_core]
  rtw89_ser_hdl_work+0x119/0x220 [rtw89_core]
  process_one_work+0x82d/0x1450
  worker_thread+0x49e/0xd00
  kthread+0x313/0x640
  ret_from_fork+0x221/0x300
  ret_from_fork_asm+0x1a/0x30
  </TASK>

Found by Linux Verification Center (linuxtesting.org).

Fixes: ebfc9199df05 ("wifi: rtw89: add wiphy_lock() to work that isn't held wiphy_lock() yet")
Signed-off-by: Fedor Pchelkin <pchelkin@ispras.ru>
---
 drivers/net/wireless/realtek/rtw89/ser.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/net/wireless/realtek/rtw89/ser.c b/drivers/net/wireless/realtek/rtw89/ser.c
index fe7beff8c424..f99e179f7ff9 100644
--- a/drivers/net/wireless/realtek/rtw89/ser.c
+++ b/drivers/net/wireless/realtek/rtw89/ser.c
@@ -205,7 +205,6 @@ static void rtw89_ser_hdl_work(struct work_struct *work)
 
 static int ser_send_msg(struct rtw89_ser *ser, u8 event)
 {
-	struct rtw89_dev *rtwdev = container_of(ser, struct rtw89_dev, ser);
 	struct ser_msg *msg = NULL;
 
 	if (test_bit(RTW89_SER_DRV_STOP_RUN, ser->flags))
@@ -221,7 +220,7 @@ static int ser_send_msg(struct rtw89_ser *ser, u8 event)
 	list_add(&msg->list, &ser->msg_q);
 	spin_unlock_irq(&ser->msg_q_lock);
 
-	ieee80211_queue_work(rtwdev->hw, &ser->ser_hdl_work);
+	schedule_work(&ser->ser_hdl_work);
 	return 0;
 }
 
-- 
2.51.0


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* RE: [PATCH rtw v3 1/5] wifi: rtw89: fix use-after-free in rtw89_core_tx_kick_off_and_wait()
  2025-08-28 21:11 ` [PATCH rtw v3 1/5] wifi: rtw89: fix use-after-free in rtw89_core_tx_kick_off_and_wait() Fedor Pchelkin
@ 2025-08-29  7:42   ` Zong-Zhe Yang
  0 siblings, 0 replies; 11+ messages in thread
From: Zong-Zhe Yang @ 2025-08-29  7:42 UTC (permalink / raw)
  To: pchelkin@ispras.ru, Ping-Ke Shih, Bitterblue Smith
  Cc: Bernie Huang, linux-wireless@vger.kernel.org,
	linux-kernel@vger.kernel.org, lvc-project@linuxtesting.org,
	stable@vger.kernel.org

Fedor Pchelkin <pchelkin@ispras.ru> wrote:
> There is a bug observed when rtw89_core_tx_kick_off_and_wait() tries to access already
> freed skb_data:
> 
>  BUG: KFENCE: use-after-free write in rtw89_core_tx_kick_off_and_wait
> drivers/net/wireless/realtek/rtw89/core.c:1110
> 
>  CPU: 6 UID: 0 PID: 41377 Comm: kworker/u64:24 Not tainted  6.17.0-rc1+ #1 PREEMPT(lazy)
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS edk2-20250523-14.fc42
> 05/23/2025
>  Workqueue: events_unbound cfg80211_wiphy_work [cfg80211]
> 
>  Use-after-free write at 0x0000000020309d9d (in kfence-#251):
>  rtw89_core_tx_kick_off_and_wait drivers/net/wireless/realtek/rtw89/core.c:1110
>  rtw89_core_scan_complete drivers/net/wireless/realtek/rtw89/core.c:5338
>  rtw89_hw_scan_complete_cb drivers/net/wireless/realtek/rtw89/fw.c:7979
>  rtw89_chanctx_proceed_cb drivers/net/wireless/realtek/rtw89/chan.c:3165
>  rtw89_chanctx_proceed drivers/net/wireless/realtek/rtw89/chan.h:141
>  rtw89_hw_scan_complete drivers/net/wireless/realtek/rtw89/fw.c:8012
>  rtw89_mac_c2h_scanofld_rsp drivers/net/wireless/realtek/rtw89/mac.c:5059
>  rtw89_fw_c2h_work drivers/net/wireless/realtek/rtw89/fw.c:6758
>  process_one_work kernel/workqueue.c:3241  worker_thread kernel/workqueue.c:3400
> kthread kernel/kthread.c:463  ret_from_fork arch/x86/kernel/process.c:154
> ret_from_fork_asm arch/x86/entry/entry_64.S:258
> 
>  kfence-#251: 0x0000000056e2393d-0x000000009943cb62, size=232,
> cache=skbuff_head_cache
> 
>  allocated by task 41377 on cpu 6 at 77869.159548s (0.009551s ago):
>  __alloc_skb net/core/skbuff.c:659
>  __netdev_alloc_skb net/core/skbuff.c:734  ieee80211_nullfunc_get
> net/mac80211/tx.c:5844  rtw89_core_send_nullfunc
> drivers/net/wireless/realtek/rtw89/core.c:3431
>  rtw89_core_scan_complete drivers/net/wireless/realtek/rtw89/core.c:5338
>  rtw89_hw_scan_complete_cb drivers/net/wireless/realtek/rtw89/fw.c:7979
>  rtw89_chanctx_proceed_cb drivers/net/wireless/realtek/rtw89/chan.c:3165
>  rtw89_chanctx_proceed drivers/net/wireless/realtek/rtw89/chan.c:3194
>  rtw89_hw_scan_complete drivers/net/wireless/realtek/rtw89/fw.c:8012
>  rtw89_mac_c2h_scanofld_rsp drivers/net/wireless/realtek/rtw89/mac.c:5059
>  rtw89_fw_c2h_work drivers/net/wireless/realtek/rtw89/fw.c:6758
>  process_one_work kernel/workqueue.c:3241  worker_thread kernel/workqueue.c:3400
> kthread kernel/kthread.c:463  ret_from_fork arch/x86/kernel/process.c:154
> ret_from_fork_asm arch/x86/entry/entry_64.S:258
> 
>  freed by task 1045 on cpu 9 at 77869.168393s (0.001557s ago):
>  ieee80211_tx_status_skb net/mac80211/status.c:1117  rtw89_pci_release_txwd_skb
> drivers/net/wireless/realtek/rtw89/pci.c:564
>  rtw89_pci_release_tx_skbs.isra.0 drivers/net/wireless/realtek/rtw89/pci.c:651
>  rtw89_pci_release_tx drivers/net/wireless/realtek/rtw89/pci.c:676
>  rtw89_pci_napi_poll drivers/net/wireless/realtek/rtw89/pci.c:4238
>  __napi_poll net/core/dev.c:7495
>  net_rx_action net/core/dev.c:7557 net/core/dev.c:7684  handle_softirqs
> kernel/softirq.c:580
>  do_softirq.part.0 kernel/softirq.c:480
>  __local_bh_enable_ip kernel/softirq.c:407  rtw89_pci_interrupt_threadfn
> drivers/net/wireless/realtek/rtw89/pci.c:927
>  irq_thread_fn kernel/irq/manage.c:1133
>  irq_thread kernel/irq/manage.c:1257
>  kthread kernel/kthread.c:463
>  ret_from_fork arch/x86/kernel/process.c:154  ret_from_fork_asm
> arch/x86/entry/entry_64.S:258
> 
> It is a consequence of a race between the waiting and the signaling side of the completion:
> 
>             Waiting thread                            Completing thread
> 
> rtw89_core_tx_kick_off_and_wait()
>   rcu_assign_pointer(skb_data->wait, wait)
>   /* start waiting */
>   wait_for_completion_timeout()
>                                                 rtw89_pci_tx_status()
>                                                   rtw89_core_tx_wait_complete()
>                                                     rcu_read_lock()
>                                                     /* signals completion and
>                                                      * proceeds further
>                                                      */
> 
> complete(&wait->completion)
>                                                     rcu_read_unlock()
>                                                   ...
>                                                   /* frees skb_data */
>                                                   ieee80211_tx_status_ni()
>   /* returns (exit status doesn't matter) */
>   wait_for_completion_timeout()
>   ...
>   /* accesses the already freed skb_data */
>   rcu_assign_pointer(skb_data->wait, NULL)
> 
> The completing side might proceed and free the underlying skb even before the waiting side
> is fully awoken and run to execution.  Actually the race happens regardless of
> wait_for_completion_timeout() exit status, e.g.
> the waiting side may hit a timeout and the concurrent completing side is still able to free the
> skb.
> 
> Skbs which are sent by rtw89_core_tx_kick_off_and_wait() are owned by the driver.  They
> don't come from core ieee80211 stack so no need to pass them to ieee80211_tx_status_ni()
> on completing side.
> 
> Introduce a work function which will act as a garbage collector for rtw89_tx_wait_info objects
> and the associated skbs.  Thus no potentially heavy locks are required on the completing
> side.
> 
> Found by Linux Verification Center (linuxtesting.org).
> 
> Fixes: 1ae5ca615285 ("wifi: rtw89: add function to wait for completion of TX skbs")
> Cc: stable@vger.kernel.org
> Suggested-by: Zong-Zhe Yang <kevin_yang@realtek.com>
> Signed-off-by: Fedor Pchelkin <pchelkin@ispras.ru>
> ---
> 
> v3: - decrease waiting timeout in rtw89_tx_wait_work() (Zong-Zhe)
>     - clear tx_waits list from rtw89_hci_reset(), too (Zong-Zhe)
>     - keep RCU updating for skb_data->wait (Zong-Zhe)
>     - keep the old order of calls in rtw89_pci_tx_status() (Zong-Zhe)
>     - drop wait->finished as complete_all() would make the completion be
>       done permanently (me)
> 
> v2: - use a work function to manage release of tx_waits and associated skbs (Zong-Zhe)
> 
> 
> FWIW, I think RCU locking becomes unnecessary after the next patch [PATCH rtw v3 2/5] wifi:
> rtw89: fix tx_wait initialization race but let's keep it if you feel that's more safe.
> 

Thanks. My thoughts are simple.
1. If it doesn't lead to bugs (i.e. not really necessary changes), keep it.
2. Keeping RCU gives us some room, if someday somehow we need to extend flow.
    Otherwise, we need to add it back again at that time.
3. Keep free side safe (kfree_rcu(@wait)) and doesn't depend on how complete() or
    wait_for_completion*() are implemented. (where completion is in @wait)

(If I change my mind after reading [PATCH v3 2/5], I will send my new comments.)

> 
>  drivers/net/wireless/realtek/rtw89/core.c | 28 ++++++++++++++---
> drivers/net/wireless/realtek/rtw89/core.h | 38 +++++++++++++++++++++--
> drivers/net/wireless/realtek/rtw89/pci.c  |  3 +-  drivers/net/wireless/realtek/rtw89/ser.c
> |  2 ++
>  4 files changed, 63 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/net/wireless/realtek/rtw89/core.c
> b/drivers/net/wireless/realtek/rtw89/core.c
> index 57590f5577a3..5c964283cd67 100644
> --- a/drivers/net/wireless/realtek/rtw89/core.c
> +++ b/drivers/net/wireless/realtek/rtw89/core.c
> @@ -1073,6 +1073,14 @@ rtw89_core_tx_update_desc_info(struct rtw89_dev *rtwdev,
>  	}
>  }
> 
> +static void rtw89_tx_wait_work(struct wiphy *wiphy, struct wiphy_work
> +*work) {
> +	struct rtw89_dev *rtwdev = container_of(work, struct rtw89_dev,
> +						tx_wait_work);
> +
> +	rtw89_tx_wait_list_clear(rtwdev);
> +}
> +
>  void rtw89_core_tx_kick_off(struct rtw89_dev *rtwdev, u8 qsel)  {
>  	u8 ch_dma;
> @@ -1090,6 +1098,8 @@ int rtw89_core_tx_kick_off_and_wait(struct rtw89_dev *rtwdev,
> struct sk_buff *sk
>  	unsigned long time_left;
>  	int ret = 0;
> 
> +	lockdep_assert_wiphy(rtwdev->hw->wiphy);
> +
>  	wait = kzalloc(sizeof(*wait), GFP_KERNEL);
>  	if (!wait) {
>  		rtw89_core_tx_kick_off(rtwdev, qsel); @@ -1098,17 +1108,22 @@ int
> rtw89_core_tx_kick_off_and_wait(struct rtw89_dev *rtwdev, struct sk_buff *sk
> 
>  	init_completion(&wait->completion);
>  	rcu_assign_pointer(skb_data->wait, wait);
> +	wait->skb = skb;
> 

Generally, fill wait's fields before publishing it.
(i.e. before rcu_assign_pointer(...))

>  	rtw89_core_tx_kick_off(rtwdev, qsel);
>  	time_left = wait_for_completion_timeout(&wait->completion,
>  						msecs_to_jiffies(timeout));
> -	if (time_left == 0)
> +
> +	if (time_left == 0) {
>  		ret = -ETIMEDOUT;
> -	else if (!wait->tx_done)
> -		ret = -EAGAIN;
> +		list_add_tail(&wait->list, &rtwdev->tx_waits);
> +		wiphy_work_queue(rtwdev->hw->wiphy, &rtwdev->tx_wait_work);
> 
> -	rcu_assign_pointer(skb_data->wait, NULL);
> -	kfree_rcu(wait, rcu_head);
> +	} else {
> +		if (!wait->tx_done)
> +			ret = -EAGAIN;
> +		rtw89_tx_wait_release(&wait->rcu_head);
> +	}
> 
>  	return ret;
>  }
> @@ -4972,6 +4987,7 @@ void rtw89_core_stop(struct rtw89_dev *rtwdev)
>  	clear_bit(RTW89_FLAG_RUNNING, rtwdev->flags);
> 
>  	wiphy_work_cancel(wiphy, &rtwdev->c2h_work);
> +	wiphy_work_cancel(wiphy, &rtwdev->tx_wait_work);
>  	wiphy_work_cancel(wiphy, &rtwdev->cancel_6ghz_probe_work);
>  	wiphy_work_cancel(wiphy, &btc->eapol_notify_work);
>  	wiphy_work_cancel(wiphy, &btc->arp_notify_work); @@ -5203,6 +5219,7 @@ int
> rtw89_core_init(struct rtw89_dev *rtwdev)
>  		INIT_LIST_HEAD(&rtwdev->scan_info.pkt_list[band]);
>  	}
>  	INIT_LIST_HEAD(&rtwdev->scan_info.chan_list);
> +	INIT_LIST_HEAD(&rtwdev->tx_waits);
>  	INIT_WORK(&rtwdev->ba_work, rtw89_core_ba_work);
>  	INIT_WORK(&rtwdev->txq_work, rtw89_core_txq_work);
>  	INIT_DELAYED_WORK(&rtwdev->txq_reinvoke_work, rtw89_core_txq_reinvoke_work);
> @@ -5233,6 +5250,7 @@ int rtw89_core_init(struct rtw89_dev *rtwdev)
>  	wiphy_work_init(&rtwdev->c2h_work, rtw89_fw_c2h_work);
>  	wiphy_work_init(&rtwdev->ips_work, rtw89_ips_work);
>  	wiphy_work_init(&rtwdev->cancel_6ghz_probe_work, rtw89_cancel_6ghz_probe_work);
> +	wiphy_work_init(&rtwdev->tx_wait_work, rtw89_tx_wait_work);
>  	INIT_WORK(&rtwdev->load_firmware_work, rtw89_load_firmware_work);
> 
>  	skb_queue_head_init(&rtwdev->c2h_queue);
> diff --git a/drivers/net/wireless/realtek/rtw89/core.h
> b/drivers/net/wireless/realtek/rtw89/core.h
> index 43e10278e14d..9b22bb116af9 100644
> --- a/drivers/net/wireless/realtek/rtw89/core.h
> +++ b/drivers/net/wireless/realtek/rtw89/core.h
> @@ -3506,9 +3506,12 @@ struct rtw89_phy_rate_pattern {
>  	bool enable;
>  };
> 
> +#define RTW89_TX_WAIT_DEFAULT_TIMEOUT 10
>  struct rtw89_tx_wait_info {
>  	struct rcu_head rcu_head;
> +	struct list_head list;
>  	struct completion completion;
> +	struct sk_buff *skb;
>  	bool tx_done;
>  };
> 
> @@ -5925,6 +5928,9 @@ struct rtw89_dev {
>  	/* used to protect rpwm */
>  	spinlock_t rpwm_lock;
> 
> +	struct list_head tx_waits;
> +	struct wiphy_work tx_wait_work;
> +
>  	struct rtw89_cam_info cam_info;
> 
>  	struct sk_buff_head c2h_queue;
> @@ -6181,6 +6187,30 @@ rtw89_assoc_link_rcu_dereference(struct rtw89_dev *rtwdev, u8
> macid)
>  	list_first_entry_or_null(&p->dlink_pool, typeof(*p->links_inst), dlink_schd); \
>  })
> 
> +static inline void rtw89_tx_wait_release(struct rcu_head *rcu_head) {
> +	struct rtw89_tx_wait_info *wait =
> +		container_of(rcu_head, struct rtw89_tx_wait_info, rcu_head);
> +
> +	dev_kfree_skb_any(wait->skb);
> +	kfree(wait);
> +}
> +
> +static inline void rtw89_tx_wait_list_clear(struct rtw89_dev *rtwdev) {
> +	struct rtw89_tx_wait_info *wait, *tmp;
> +
> +	lockdep_assert_wiphy(rtwdev->hw->wiphy);
> +
> +	list_for_each_entry_safe(wait, tmp, &rtwdev->tx_waits, list) {
> +		if (!wait_for_completion_timeout(&wait->completion,
> +						 RTW89_TX_WAIT_DEFAULT_TIMEOUT))
> +			continue;
> +		list_del(&wait->list);
> +		call_rcu(&wait->rcu_head, rtw89_tx_wait_release);
> +	}
> +}
> +

I feel dev_kfree_skb_any + kfree_rcu is fine at least for now.

For call_rcu, there is one problem to me.
Can rtw89_tx_wait_release be invoked after rmmod ?
e.g. one tx is never completed yet unfortunately -> user rmmod driver
    -> release pending skb -> rtw89_tx_wait_list_clear -> ...

So, how about the follow-up ?

    rtw89_tx_wait_release(struct rtw89_tx_wait_info *wait):
        dev_kfree_skb_any(wait->skb)
        kfree_rcu(wait)

    rtw89_tx_wait_list_clear(...):
        ...
        for each wait in list:
            ...
            list_del
            rtw89_tx_wait_release(wait)

>  static inline int rtw89_hci_tx_write(struct rtw89_dev *rtwdev,
>  				     struct rtw89_core_tx_request *tx_req)  { @@ -6190,6 +6220,7
> @@ static inline int rtw89_hci_tx_write(struct rtw89_dev *rtwdev,  static inline void
> rtw89_hci_reset(struct rtw89_dev *rtwdev)  {
>  	rtwdev->hci.ops->reset(rtwdev);
> +	rtw89_tx_wait_list_clear(rtwdev);
>  }
> 
>  static inline int rtw89_hci_start(struct rtw89_dev *rtwdev) @@ -7258,11 +7289,12 @@
> static inline struct sk_buff *rtw89_alloc_skb_for_rx(struct rtw89_dev *rtwdev,
>  	return dev_alloc_skb(length);
>  }
> 
> -static inline void rtw89_core_tx_wait_complete(struct rtw89_dev *rtwdev,
> +static inline bool rtw89_core_tx_wait_complete(struct rtw89_dev
> +*rtwdev,
>  					       struct rtw89_tx_skb_data *skb_data,
>  					       bool tx_done)
>  {
>  	struct rtw89_tx_wait_info *wait;
> +	bool ret = false;
> 
>  	rcu_read_lock();
> 
> @@ -7270,11 +7302,13 @@ static inline void rtw89_core_tx_wait_complete(struct
> rtw89_dev *rtwdev,
>  	if (!wait)
>  		goto out;
> 
> +	ret = true;
>  	wait->tx_done = tx_done;
> -	complete(&wait->completion);
> +	complete_all(&wait->completion);
> 

Maybe add a comment to hint.
For example, /* Don't access skb anymore after completing */

>  out:
>  	rcu_read_unlock();
> +	return ret;
>  }
> 
>  static inline bool rtw89_is_mlo_1_1(struct rtw89_dev *rtwdev) diff --git
> a/drivers/net/wireless/realtek/rtw89/pci.c b/drivers/net/wireless/realtek/rtw89/pci.c
> index a669f2f843aa..4e3034b44f56 100644
> --- a/drivers/net/wireless/realtek/rtw89/pci.c
> +++ b/drivers/net/wireless/realtek/rtw89/pci.c
> @@ -464,7 +464,8 @@ static void rtw89_pci_tx_status(struct rtw89_dev *rtwdev,
>  	struct rtw89_tx_skb_data *skb_data = RTW89_TX_SKB_CB(skb);
>  	struct ieee80211_tx_info *info;
> 
> -	rtw89_core_tx_wait_complete(rtwdev, skb_data, tx_status == RTW89_TX_DONE);
> +	if (rtw89_core_tx_wait_complete(rtwdev, skb_data, tx_status == RTW89_TX_DONE))
> +		return;
> 
>  	info = IEEE80211_SKB_CB(skb);
>  	ieee80211_tx_info_clear_status(info);
> diff --git a/drivers/net/wireless/realtek/rtw89/ser.c b/drivers/net/wireless/realtek/rtw89/ser.c
> index bb39fdbcba0d..fe7beff8c424 100644
> --- a/drivers/net/wireless/realtek/rtw89/ser.c
> +++ b/drivers/net/wireless/realtek/rtw89/ser.c
> @@ -502,7 +502,9 @@ static void ser_reset_trx_st_hdl(struct rtw89_ser *ser, u8 evt)
>  		}
> 
>  		drv_stop_rx(ser);
> +		wiphy_lock(wiphy);
>  		drv_trx_reset(ser);
> +		wiphy_unlock(wiphy);
> 
>  		/* wait m3 */
>  		hal_send_m2_event(ser);
> --
> 2.51.0


^ permalink raw reply	[flat|nested] 11+ messages in thread

* RE: [PATCH rtw v3 2/5] wifi: rtw89: fix tx_wait initialization race
  2025-08-28 21:11 ` [PATCH rtw v3 2/5] wifi: rtw89: fix tx_wait initialization race Fedor Pchelkin
@ 2025-08-29  9:40   ` Zong-Zhe Yang
  0 siblings, 0 replies; 11+ messages in thread
From: Zong-Zhe Yang @ 2025-08-29  9:40 UTC (permalink / raw)
  To: Fedor Pchelkin, Ping-Ke Shih, Bitterblue Smith
  Cc: Bernie Huang, linux-wireless@vger.kernel.org,
	linux-kernel@vger.kernel.org, lvc-project@linuxtesting.org,
	stable@vger.kernel.org

Fedor Pchelkin <pchelkin@ispras.ru> wrote:
> Now that nullfunc skbs are recycled in a separate work item in the driver, the following race
> during initialization and processing of those skbs might lead to noticeable bugs:
> 
>       Waiting thread                          Completing thread
> 
> rtw89_core_send_nullfunc()
>   rtw89_core_tx_write_link()
>     ...
>     rtw89_pci_txwd_submit()
>       skb_data->wait = NULL
>       /* add skb to the queue */
>       skb_queue_tail(&txwd->queue, skb)
>                                             rtw89_pci_napi_poll()
>                                             ...
>                                               rtw89_pci_release_txwd_skb()
>                                                 /* get skb from the queue */
>                                                 skb_unlink(skb, &txwd->queue)
>                                                 rtw89_pci_tx_status()
>                                                   rtw89_core_tx_wait_complete()
>                                                   /* use incorrect skb_data->wait
> */
>   rtw89_core_tx_kick_off_and_wait()
>   /* assign skb_data->wait but too late */
> 
> The value of skb_data->wait indicates whether skb is passed on to the core ieee80211 stack
> or released by the driver itself.  So assure that by the time skb is added to txwd queue and
> becomes visible to the completing side, it has already allocated tx_wait-related data (in case
> it's needed).
> 
> Found by Linux Verification Center (linuxtesting.org).
> 
> Fixes: 1ae5ca615285 ("wifi: rtw89: add function to wait for completion of TX skbs")
> Cc: stable@vger.kernel.org
> Signed-off-by: Fedor Pchelkin <pchelkin@ispras.ru>
> ---
>  drivers/net/wireless/realtek/rtw89/core.c | 30 ++++++++++++++---------
> drivers/net/wireless/realtek/rtw89/pci.c  |  2 --
>  2 files changed, 18 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/net/wireless/realtek/rtw89/core.c
> b/drivers/net/wireless/realtek/rtw89/core.c
> index 5c964283cd67..57f20559dfde 100644
> --- a/drivers/net/wireless/realtek/rtw89/core.c
> +++ b/drivers/net/wireless/realtek/rtw89/core.c
> @@ -1094,20 +1094,12 @@ int rtw89_core_tx_kick_off_and_wait(struct rtw89_dev *rtwdev,
> struct sk_buff *sk
>                                     int qsel, unsigned int timeout)  {
>         struct rtw89_tx_skb_data *skb_data = RTW89_TX_SKB_CB(skb);
> -       struct rtw89_tx_wait_info *wait;
> +       struct rtw89_tx_wait_info *wait = skb_data->wait;

wiphy_dereference(rtwdev->hw->wiphy, skb_data->wait)

>         unsigned long time_left;
>         int ret = 0;
> 
>         lockdep_assert_wiphy(rtwdev->hw->wiphy);
> 
> -       wait = kzalloc(sizeof(*wait), GFP_KERNEL);
> -       if (!wait) {
> -               rtw89_core_tx_kick_off(rtwdev, qsel);
> -               return 0;
> -       }
> -
> -       init_completion(&wait->completion);
> -       rcu_assign_pointer(skb_data->wait, wait);
>         wait->skb = skb;

Also, move "wait->skb = skb" to where wait is initialized.

> 
>         rtw89_core_tx_kick_off(rtwdev, qsel); @@ -1172,10 +1164,12 @@ int
> rtw89_h2c_tx(struct rtw89_dev *rtwdev,  static int rtw89_core_tx_write_link(struct
> rtw89_dev *rtwdev,
>                                     struct rtw89_vif_link *rtwvif_link,
>                                     struct rtw89_sta_link *rtwsta_link,
> -                                   struct sk_buff *skb, int *qsel, bool sw_mld)
> +                                   struct sk_buff *skb, int *qsel, bool sw_mld,
> +                                   struct rtw89_tx_wait_info *wait)
>  {
>         struct ieee80211_sta *sta = rtwsta_link_to_sta_safe(rtwsta_link);
>         struct ieee80211_vif *vif = rtwvif_link_to_vif(rtwvif_link);
> +       struct rtw89_tx_skb_data *skb_data = RTW89_TX_SKB_CB(skb);
>         struct rtw89_vif *rtwvif = rtwvif_link->rtwvif;
>         struct rtw89_core_tx_request tx_req = {};
>         int ret;
> @@ -1192,6 +1186,8 @@ static int rtw89_core_tx_write_link(struct rtw89_dev *rtwdev,
>         rtw89_core_tx_update_desc_info(rtwdev, &tx_req);
>         rtw89_core_tx_wake(rtwdev, &tx_req);
> 
> +       rcu_assign_pointer(skb_data->wait, wait);
> +
>         ret = rtw89_hci_tx_write(rtwdev, &tx_req);
>         if (ret) {
>                 rtw89_err(rtwdev, "failed to transmit skb to HCI\n"); @@ -1228,7 +1224,8
> @@ int rtw89_core_tx_write(struct rtw89_dev *rtwdev, struct ieee80211_vif *vif,
>                 }
>         }
> 
> -       return rtw89_core_tx_write_link(rtwdev, rtwvif_link, rtwsta_link, skb, qsel, false);
> +       return rtw89_core_tx_write_link(rtwdev, rtwvif_link, rtwsta_link, skb, qsel, false,
> +                                       NULL);
>  }
> 
>  static __le32 rtw89_build_txwd_body0(struct rtw89_tx_desc_info *desc_info) @@ -3426,6
> +3423,7 @@ int rtw89_core_send_nullfunc(struct rtw89_dev *rtwdev, struct rtw89_vif_link
> *rt
>         struct ieee80211_vif *vif = rtwvif_link_to_vif(rtwvif_link);
>         int link_id = ieee80211_vif_is_mld(vif) ? rtwvif_link->link_id : -1;
>         struct rtw89_sta_link *rtwsta_link;
> +       struct rtw89_tx_wait_info *wait;
>         struct ieee80211_sta *sta;
>         struct ieee80211_hdr *hdr;
>         struct rtw89_sta *rtwsta;
> @@ -3435,6 +3433,12 @@ int rtw89_core_send_nullfunc(struct rtw89_dev *rtwdev, struct
> rtw89_vif_link *rt
>         if (vif->type != NL80211_IFTYPE_STATION || !vif->cfg.assoc)
>                 return 0;
> 
> +       wait = kzalloc(sizeof(*wait), GFP_KERNEL);
> +       if (!wait)
> +               return -ENOMEM;
> +
> +       init_completion(&wait->completion);
> +
>         rcu_read_lock();
>         sta = ieee80211_find_sta(vif, vif->cfg.ap_addr);
>         if (!sta) {
> @@ -3459,7 +3463,8 @@ int rtw89_core_send_nullfunc(struct rtw89_dev *rtwdev, struct
> rtw89_vif_link *rt
>                 goto out;
>         }
> 

Fill wait->skb as mentioned above.

> -       ret = rtw89_core_tx_write_link(rtwdev, rtwvif_link, rtwsta_link, skb, &qsel, true);
> +       ret = rtw89_core_tx_write_link(rtwdev, rtwvif_link, rtwsta_link, skb, &qsel, true,
> +                                      wait);
>         if (ret) {
>                 rtw89_warn(rtwdev, "nullfunc transmit failed: %d\n", ret);
>                 dev_kfree_skb_any(skb);
> @@ -3472,6 +3477,7 @@ int rtw89_core_send_nullfunc(struct rtw89_dev *rtwdev, struct
> rtw89_vif_link *rt
>                                                timeout);
>  out:
>         rcu_read_unlock();
> +       kfree(wait);
> 
>         return ret;
>  }
> diff --git a/drivers/net/wireless/realtek/rtw89/pci.c b/drivers/net/wireless/realtek/rtw89/pci.c
> index 4e3034b44f56..cb9682f306a6 100644
> --- a/drivers/net/wireless/realtek/rtw89/pci.c
> +++ b/drivers/net/wireless/realtek/rtw89/pci.c
> @@ -1372,7 +1372,6 @@ static int rtw89_pci_txwd_submit(struct rtw89_dev *rtwdev,
>         struct pci_dev *pdev = rtwpci->pdev;
>         struct sk_buff *skb = tx_req->skb;
>         struct rtw89_pci_tx_data *tx_data = RTW89_PCI_TX_SKB_CB(skb);
> -       struct rtw89_tx_skb_data *skb_data = RTW89_TX_SKB_CB(skb);
>         bool en_wd_info = desc_info->en_wd_info;
>         u32 txwd_len;
>         u32 txwp_len;
> @@ -1388,7 +1387,6 @@ static int rtw89_pci_txwd_submit(struct rtw89_dev *rtwdev,
>         }
> 
>         tx_data->dma = dma;
> -       rcu_assign_pointer(skb_data->wait, NULL);
> 
>         txwp_len = sizeof(*txwp_info);
>         txwd_len = chip->txwd_body_size;
> --
> 2.51.0
> 


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH rtw v3 3/5] wifi: rtw89: perform tx_wait completions for USB part
  2025-08-28 21:11 ` [PATCH rtw v3 3/5] wifi: rtw89: perform tx_wait completions for USB part Fedor Pchelkin
@ 2025-08-29 19:57   ` Bitterblue Smith
  2025-09-01 17:45     ` Fedor Pchelkin
  0 siblings, 1 reply; 11+ messages in thread
From: Bitterblue Smith @ 2025-08-29 19:57 UTC (permalink / raw)
  To: Fedor Pchelkin, Ping-Ke Shih, Zong-Zhe Yang
  Cc: Po-Hao Huang, linux-wireless, linux-kernel, lvc-project

On 29/08/2025 00:11, Fedor Pchelkin wrote:
> There is no completion signaling for tx_wait skbs on USB part. This means
> rtw89_core_tx_kick_off_and_wait() always returns with a timeout.
> 
> Moreover, recent rework of tx_wait objects lifecycle handling made the
> driver be responsible for freeing the associated skbs, not the core
> ieee80211 stack. Lack of completion signaling would cause those objects
> being kept in driver internal tx_waits queue until rtw89_hci_reset()
> occurs, and then a double free would happen.
> 
> Extract TX status handling into a separate function, like its
> rtw89_pci_tx_status() counterpart. Signal completion from there.
> 
> Found by Linux Verification Center (linuxtesting.org).
> 
> Fixes: 2135c28be6a8 ("wifi: rtw89: Add usb.{c,h}")
> Signed-off-by: Fedor Pchelkin <pchelkin@ispras.ru>
> ---
> 
> New series iteration -> new nuances found.
> 
> It seems the two previous patches from the series would not be too great
> in USB case because there is no completion signaling for tx_wait skbs
> there.
> 
> I don't have this hardware so *the patch is compile tested only*. It'd
> be nice if someone gave it a run on top of two previous patches of the
> series, thanks!
> 

I tested your first three patches with RTL8851BU for a few hours. It's
looking good, no explosion yet.

The USB side doesn't have real TX ACK status reporting yet. I only
learned recently how to do that. It looks like it will work about the
same as in rtw88.

>  drivers/net/wireless/realtek/rtw89/usb.c | 36 +++++++++++++++---------
>  1 file changed, 23 insertions(+), 13 deletions(-)
> 
> diff --git a/drivers/net/wireless/realtek/rtw89/usb.c b/drivers/net/wireless/realtek/rtw89/usb.c
> index 6cf89aee252e..10fe19bd5166 100644
> --- a/drivers/net/wireless/realtek/rtw89/usb.c
> +++ b/drivers/net/wireless/realtek/rtw89/usb.c
> @@ -188,11 +188,32 @@ static u8 rtw89_usb_get_bulkout_id(u8 ch_dma)
>  	}
>  }
>  
> +static void rtw89_usb_tx_status(struct rtw89_dev *rtwdev, struct sk_buff *skb,
> +				int status)
> +{
> +	struct rtw89_tx_skb_data *skb_data = RTW89_TX_SKB_CB(skb);
> +	struct ieee80211_tx_info *info;
> +
> +	if (rtw89_core_tx_wait_complete(rtwdev, skb_data, status == 0))
> +		return;
> +
> +	info = IEEE80211_SKB_CB(skb);
> +	ieee80211_tx_info_clear_status(info);
> +
> +	if (status == 0) {
> +		if (info->flags & IEEE80211_TX_CTL_NO_ACK)
> +			info->flags |= IEEE80211_TX_STAT_NOACK_TRANSMITTED;
> +		else
> +			info->flags |= IEEE80211_TX_STAT_ACK;
> +	}
> +
> +	ieee80211_tx_status_irqsafe(rtwdev->hw, skb);
> +}
> +
>  static void rtw89_usb_write_port_complete(struct urb *urb)
>  {
>  	struct rtw89_usb_tx_ctrl_block *txcb = urb->context;
>  	struct rtw89_dev *rtwdev = txcb->rtwdev;
> -	struct ieee80211_tx_info *info;
>  	struct rtw89_txwd_body *txdesc;
>  	struct sk_buff *skb;
>  	u32 txdesc_size;
> @@ -214,18 +235,7 @@ static void rtw89_usb_write_port_complete(struct urb *urb)
>  			txdesc_size += rtwdev->chip->txwd_info_size;
>  
>  		skb_pull(skb, txdesc_size);
> -
> -		info = IEEE80211_SKB_CB(skb);
> -		ieee80211_tx_info_clear_status(info);
> -
> -		if (urb->status == 0) {
> -			if (info->flags & IEEE80211_TX_CTL_NO_ACK)
> -				info->flags |= IEEE80211_TX_STAT_NOACK_TRANSMITTED;
> -			else
> -				info->flags |= IEEE80211_TX_STAT_ACK;
> -		}
> -
> -		ieee80211_tx_status_irqsafe(rtwdev->hw, skb);
> +		rtw89_usb_tx_status(rtwdev, skb, urb->status);
>  	}
>  
>  	switch (urb->status) {


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH rtw v3 3/5] wifi: rtw89: perform tx_wait completions for USB part
  2025-08-29 19:57   ` Bitterblue Smith
@ 2025-09-01 17:45     ` Fedor Pchelkin
  2025-09-01 18:46       ` Bitterblue Smith
  0 siblings, 1 reply; 11+ messages in thread
From: Fedor Pchelkin @ 2025-09-01 17:45 UTC (permalink / raw)
  To: Bitterblue Smith
  Cc: Ping-Ke Shih, Zong-Zhe Yang, Po-Hao Huang, linux-wireless,
	linux-kernel, lvc-project

On Fri, 29. Aug 22:57, Bitterblue Smith wrote:
> On 29/08/2025 00:11, Fedor Pchelkin wrote:
> > There is no completion signaling for tx_wait skbs on USB part. This means
> > rtw89_core_tx_kick_off_and_wait() always returns with a timeout.
> > 
> > Moreover, recent rework of tx_wait objects lifecycle handling made the
> > driver be responsible for freeing the associated skbs, not the core
> > ieee80211 stack. Lack of completion signaling would cause those objects
> > being kept in driver internal tx_waits queue until rtw89_hci_reset()
> > occurs, and then a double free would happen.
> > 
> > Extract TX status handling into a separate function, like its
> > rtw89_pci_tx_status() counterpart. Signal completion from there.
> > 
> > Found by Linux Verification Center (linuxtesting.org).
> > 
> > Fixes: 2135c28be6a8 ("wifi: rtw89: Add usb.{c,h}")
> > Signed-off-by: Fedor Pchelkin <pchelkin@ispras.ru>
> > ---
> > 
> > New series iteration -> new nuances found.
> > 
> > It seems the two previous patches from the series would not be too great
> > in USB case because there is no completion signaling for tx_wait skbs
> > there.
> > 
> > I don't have this hardware so *the patch is compile tested only*. It'd
> > be nice if someone gave it a run on top of two previous patches of the
> > series, thanks!
> > 
> 
> I tested your first three patches with RTL8851BU for a few hours. It's
> looking good, no explosion yet.

Hello Bitterblue,

thank you!  Just in case, rtw89_core_send_nullfunc() has to be called in
order to trigger all the tx_wait activity touched with the patches, please
make sure it's called during the tests - it should be done after scan
complete, 47a498b84f01 ("wifi: rtw89: TX nulldata 0 after scan complete").

There is one more issue we'd also need to solve: perform tx_wait
completion signaling inside rtw89_usb_ops_reset() (driver shutdown stage
should probably also be handled with the case).  This'd require having an
ability to track TX URBs and kill them.  I'm just throwing these thoughts
now, maybe you have some ideas.  I'm still exploring the USB-part source
code and hopefully will have a chance to get hands on the USB chip soon.

> 
> The USB side doesn't have real TX ACK status reporting yet. I only
> learned recently how to do that. It looks like it will work about the
> same as in rtw88.

Do you mean similar pattern already exists in rtw88?  Could you give a
hint on how USB side TX ACK status reporting works there?  At a quick
glance, I don't see how those TX URB complete callbacks differ from what
rtw89 has.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH rtw v3 3/5] wifi: rtw89: perform tx_wait completions for USB part
  2025-09-01 17:45     ` Fedor Pchelkin
@ 2025-09-01 18:46       ` Bitterblue Smith
  0 siblings, 0 replies; 11+ messages in thread
From: Bitterblue Smith @ 2025-09-01 18:46 UTC (permalink / raw)
  To: Fedor Pchelkin
  Cc: Ping-Ke Shih, Zong-Zhe Yang, Po-Hao Huang, linux-wireless,
	linux-kernel, lvc-project

On 01/09/2025 20:45, Fedor Pchelkin wrote:
> On Fri, 29. Aug 22:57, Bitterblue Smith wrote:
>> On 29/08/2025 00:11, Fedor Pchelkin wrote:
>>> There is no completion signaling for tx_wait skbs on USB part. This means
>>> rtw89_core_tx_kick_off_and_wait() always returns with a timeout.
>>>
>>> Moreover, recent rework of tx_wait objects lifecycle handling made the
>>> driver be responsible for freeing the associated skbs, not the core
>>> ieee80211 stack. Lack of completion signaling would cause those objects
>>> being kept in driver internal tx_waits queue until rtw89_hci_reset()
>>> occurs, and then a double free would happen.
>>>
>>> Extract TX status handling into a separate function, like its
>>> rtw89_pci_tx_status() counterpart. Signal completion from there.
>>>
>>> Found by Linux Verification Center (linuxtesting.org).
>>>
>>> Fixes: 2135c28be6a8 ("wifi: rtw89: Add usb.{c,h}")
>>> Signed-off-by: Fedor Pchelkin <pchelkin@ispras.ru>
>>> ---
>>>
>>> New series iteration -> new nuances found.
>>>
>>> It seems the two previous patches from the series would not be too great
>>> in USB case because there is no completion signaling for tx_wait skbs
>>> there.
>>>
>>> I don't have this hardware so *the patch is compile tested only*. It'd
>>> be nice if someone gave it a run on top of two previous patches of the
>>> series, thanks!
>>>
>>
>> I tested your first three patches with RTL8851BU for a few hours. It's
>> looking good, no explosion yet.
> 
> Hello Bitterblue,
> 
> thank you!  Just in case, rtw89_core_send_nullfunc() has to be called in
> order to trigger all the tx_wait activity touched with the patches, please
> make sure it's called during the tests - it should be done after scan
> complete, 47a498b84f01 ("wifi: rtw89: TX nulldata 0 after scan complete").
> 
> There is one more issue we'd also need to solve: perform tx_wait
> completion signaling inside rtw89_usb_ops_reset() (driver shutdown stage
> should probably also be handled with the case).  This'd require having an
> ability to track TX URBs and kill them.  I'm just throwing these thoughts
> now, maybe you have some ideas.  I'm still exploring the USB-part source
> code and hopefully will have a chance to get hands on the USB chip soon.
> 
>>
>> The USB side doesn't have real TX ACK status reporting yet. I only
>> learned recently how to do that. It looks like it will work about the
>> same as in rtw88.
> 
> Do you mean similar pattern already exists in rtw88?  Could you give a
> hint on how USB side TX ACK status reporting works there?  At a quick
> glance, I don't see how those TX URB complete callbacks differ from what
> rtw89 has.

Well, I assume we are talking about ACK status reporting. For example,
when mac80211 detects beacon loss it sends a null frame, or a probe
request (I'm not sure which is used when). It flags the frame with
IEEE80211_TX_CTL_REQ_TX_STATUS, which means the driver has to report
whether the AP sent ACK for the null frame/probe request or not. If
the AP doesn't reply for a while, the connection is considered lost.

A URB status of 0 only means that the URB was submitted successfully.
It doesn't mean the chip actually transmitted anything, and it doesn't
mean the chip received ACK from the AP.

In order to receive these ACK status reports rtw89 will have to set
a bit in the TX descriptor and place the skb in a queue to wait for
a message from the firmware. ieee80211_tx_status_irqsafe() can be
called when the firmware sends the message. 

This is for USB. It seems to work differently for PCIE in rtw89
(rtw89_pci_release_rpp()). In rtw88 it's one mechanism for PCIE, USB,
and SDIO.

These are some functions to look at in rtw88:

rtw_tx_report_enable()
rtw_usb_write_port_tx_complete()
rtw_tx_report_enqueue()

rtw_usb_rx_handler()
rtw_fw_c2h_cmd_rx_irqsafe()
rtw_fw_c2h_cmd_handle() / rtw_fw_c2h_cmd_handle_ext()
rtw_tx_report_handle()

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2025-09-01 18:46 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-08-28 21:11 [PATCH rtw v3 0/5] wifi: fixes for rtw89 Fedor Pchelkin
2025-08-28 21:11 ` [PATCH rtw v3 1/5] wifi: rtw89: fix use-after-free in rtw89_core_tx_kick_off_and_wait() Fedor Pchelkin
2025-08-29  7:42   ` Zong-Zhe Yang
2025-08-28 21:11 ` [PATCH rtw v3 2/5] wifi: rtw89: fix tx_wait initialization race Fedor Pchelkin
2025-08-29  9:40   ` Zong-Zhe Yang
2025-08-28 21:11 ` [PATCH rtw v3 3/5] wifi: rtw89: perform tx_wait completions for USB part Fedor Pchelkin
2025-08-29 19:57   ` Bitterblue Smith
2025-09-01 17:45     ` Fedor Pchelkin
2025-09-01 18:46       ` Bitterblue Smith
2025-08-28 21:12 ` [PATCH rtw v3 4/5] wifi: rtw89: fix leak in rtw89_core_send_nullfunc() Fedor Pchelkin
2025-08-28 21:12 ` [PATCH rtw v3 5/5] wifi: rtw89: avoid circular locking dependency in ser_state_run() Fedor Pchelkin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).