From: Zong-Zhe Yang <kevin_yang@realtek.com>
To: Fedor Pchelkin <pchelkin@ispras.ru>, Ping-Ke Shih <pkshih@realtek.com>
Cc: Bernie Huang <phhuang@realtek.com>,
"linux-wireless@vger.kernel.org" <linux-wireless@vger.kernel.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"lvc-project@linuxtesting.org" <lvc-project@linuxtesting.org>,
"stable@vger.kernel.org" <stable@vger.kernel.org>
Subject: RE: [PATCH rtw-next 1/2] wifi: rtw89: fix use-after-free in rtw89_core_tx_kick_off_and_wait()
Date: Thu, 21 Aug 2025 04:01:21 +0000 [thread overview]
Message-ID: <b4ec58864e544b0295ddb02ed408199b@realtek.com> (raw)
In-Reply-To: <20250820141441.106156-2-pchelkin@ispras.ru>
Fedor Pchelkin <pchelkin@ispras.ru> wrote:
>
> There is a bug observed when rtw89_core_tx_kick_off_and_wait() tries to access already
> freed skb_data:
>
> BUG: KFENCE: use-after-free write in rtw89_core_tx_kick_off_and_wait
> drivers/net/wireless/realtek/rtw89/core.c:1110
>
> CPU: 6 UID: 0 PID: 41377 Comm: kworker/u64:24 Not tainted 6.17.0-rc1+ #1 PREEMPT(lazy)
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS edk2-20250523-14.fc42
> 05/23/2025
> Workqueue: events_unbound cfg80211_wiphy_work [cfg80211]
>
> Use-after-free write at 0x0000000020309d9d (in kfence-#251):
> rtw89_core_tx_kick_off_and_wait drivers/net/wireless/realtek/rtw89/core.c:1110
> rtw89_core_scan_complete drivers/net/wireless/realtek/rtw89/core.c:5338
> rtw89_hw_scan_complete_cb drivers/net/wireless/realtek/rtw89/fw.c:7979
> rtw89_chanctx_proceed_cb drivers/net/wireless/realtek/rtw89/chan.c:3165
> rtw89_chanctx_proceed drivers/net/wireless/realtek/rtw89/chan.h:141
> rtw89_hw_scan_complete drivers/net/wireless/realtek/rtw89/fw.c:8012
> rtw89_mac_c2h_scanofld_rsp drivers/net/wireless/realtek/rtw89/mac.c:5059
> rtw89_fw_c2h_work drivers/net/wireless/realtek/rtw89/fw.c:6758
> process_one_work kernel/workqueue.c:3241 worker_thread kernel/workqueue.c:3400
> kthread kernel/kthread.c:463 ret_from_fork arch/x86/kernel/process.c:154
> ret_from_fork_asm arch/x86/entry/entry_64.S:258
>
> kfence-#251: 0x0000000056e2393d-0x000000009943cb62, size=232,
> cache=skbuff_head_cache
>
> allocated by task 41377 on cpu 6 at 77869.159548s (0.009551s ago):
> __alloc_skb net/core/skbuff.c:659
> __netdev_alloc_skb net/core/skbuff.c:734 ieee80211_nullfunc_get
> net/mac80211/tx.c:5844 rtw89_core_send_nullfunc
> drivers/net/wireless/realtek/rtw89/core.c:3431
> rtw89_core_scan_complete drivers/net/wireless/realtek/rtw89/core.c:5338
> rtw89_hw_scan_complete_cb drivers/net/wireless/realtek/rtw89/fw.c:7979
> rtw89_chanctx_proceed_cb drivers/net/wireless/realtek/rtw89/chan.c:3165
> rtw89_chanctx_proceed drivers/net/wireless/realtek/rtw89/chan.c:3194
> rtw89_hw_scan_complete drivers/net/wireless/realtek/rtw89/fw.c:8012
> rtw89_mac_c2h_scanofld_rsp drivers/net/wireless/realtek/rtw89/mac.c:5059
> rtw89_fw_c2h_work drivers/net/wireless/realtek/rtw89/fw.c:6758
> process_one_work kernel/workqueue.c:3241 worker_thread kernel/workqueue.c:3400
> kthread kernel/kthread.c:463 ret_from_fork arch/x86/kernel/process.c:154
> ret_from_fork_asm arch/x86/entry/entry_64.S:258
>
> freed by task 1045 on cpu 9 at 77869.168393s (0.001557s ago):
> ieee80211_tx_status_skb net/mac80211/status.c:1117 rtw89_pci_release_txwd_skb
> drivers/net/wireless/realtek/rtw89/pci.c:564
> rtw89_pci_release_tx_skbs.isra.0 drivers/net/wireless/realtek/rtw89/pci.c:651
> rtw89_pci_release_tx drivers/net/wireless/realtek/rtw89/pci.c:676
> rtw89_pci_napi_poll drivers/net/wireless/realtek/rtw89/pci.c:4238
> __napi_poll net/core/dev.c:7495
> net_rx_action net/core/dev.c:7557 net/core/dev.c:7684 handle_softirqs
> kernel/softirq.c:580
> do_softirq.part.0 kernel/softirq.c:480
> __local_bh_enable_ip kernel/softirq.c:407 rtw89_pci_interrupt_threadfn
> drivers/net/wireless/realtek/rtw89/pci.c:927
> irq_thread_fn kernel/irq/manage.c:1133
> irq_thread kernel/irq/manage.c:1257
> kthread kernel/kthread.c:463
> ret_from_fork arch/x86/kernel/process.c:154 ret_from_fork_asm
> arch/x86/entry/entry_64.S:258
>
> It is a consequence of a race between the waiting and the signalling side of the completion:
>
> Thread 1 Thread 2
> rtw89_core_tx_kick_off_and_wait()
> rcu_assign_pointer(skb_data->wait, wait)
> /* start waiting */
> wait_for_completion_timeout()
> rtw89_pci_tx_status()
> rtw89_core_tx_wait_complete()
> rcu_read_lock()
> /* signals completion and
> * proceeds further
> */
>
> complete(&wait->completion)
> rcu_read_unlock()
> ...
> /* frees skb_data */
> ieee80211_tx_status_ni()
> /* returns (exit status doesn't matter) */
> wait_for_completion_timeout()
> ...
> /* accesses the already freed skb_data */
> rcu_assign_pointer(skb_data->wait, NULL)
>
> The signalling side might proceed and free the underlying skb even before the waiting side is
> fully awoken and run to execution.
>
> RCU synchronization here would work well if the signalling side didn't go on and release skb
> on its own. Thus the waiting side should be told somehow about what is happening on the
> completion side.
I reread the flow and am thinking about it a bit.
Actually, only when signaling side doesn't run on time, waiting side should update skb_data->wait.
(see code comments below)
>
> It seems the only correct way is to use standard locking primitives with owner tracking, like
> was originally published in one [1] of the versions of the patch mentioned in Fixes.
>
> [1]: https://lore.kernel.org/linux-wireless/20230404025259.15503-3-pkshih@realtek.com/
>
> Found by Linux Verification Center (linuxtesting.org).
>
> Fixes: 1ae5ca615285 ("wifi: rtw89: add function to wait for completion of TX skbs")
> Cc: stable@vger.kernel.org
> Signed-off-by: Fedor Pchelkin <pchelkin@ispras.ru>
> ---
>
> The bug is tricky because the waiter-completer interaction isn't simple here. I've tried to
> come up with something that wouldn't require taking additional locks at
> rtw89_core_tx_wait_complete() but these ideas don't eliminate the possible race entirely, to
> my mind.
Thank you for finding the potential race condition.
>
> Though one solution that _works_ currently is to get rid of 'struct rtw89_tx_wait_info' and
> replace it with the only field it is used for - 'bool tx_done'. Then it can be stored at 'struct
> ieee80211_tx_info::status::status_driver_data' directly without the need for allocating an
> extra dynamic object and tracking its lifecycle.
> I didn't post this since then the structure won't be expandable for new fields and that's
> probably the reason for why it wasn't done in this manner initially.
With a busy waiting on tx waiting side ?
If so, it would be unacceptable.
>
> drivers/net/wireless/realtek/rtw89/core.c | 15 ++++++++---
> drivers/net/wireless/realtek/rtw89/core.h | 32 ++++++++++++++---------
> drivers/net/wireless/realtek/rtw89/pci.c | 6 +++--
> 3 files changed, 36 insertions(+), 17 deletions(-)
>
> diff --git a/drivers/net/wireless/realtek/rtw89/core.c
> b/drivers/net/wireless/realtek/rtw89/core.c
> index 57590f5577a3..826540319027 100644
> --- a/drivers/net/wireless/realtek/rtw89/core.c
> +++ b/drivers/net/wireless/realtek/rtw89/core.c
> @@ -1088,6 +1088,7 @@ int rtw89_core_tx_kick_off_and_wait(struct rtw89_dev *rtwdev,
> struct sk_buff *sk
> struct rtw89_tx_skb_data *skb_data = RTW89_TX_SKB_CB(skb);
> struct rtw89_tx_wait_info *wait;
> unsigned long time_left;
> + bool free_wait = true;
> int ret = 0;
>
> wait = kzalloc(sizeof(*wait), GFP_KERNEL); @@ -1097,7 +1098,8 @@ int
> rtw89_core_tx_kick_off_and_wait(struct rtw89_dev *rtwdev, struct sk_buff *sk
> }
>
> init_completion(&wait->completion);
> - rcu_assign_pointer(skb_data->wait, wait);
> + spin_lock_init(&wait->owner_lock);
> + skb_data->wait = wait;
>
> rtw89_core_tx_kick_off(rtwdev, qsel);
> time_left = wait_for_completion_timeout(&wait->completion,
> @@ -1107,8 +1109,15 @@ int rtw89_core_tx_kick_off_and_wait(struct rtw89_dev *rtwdev,
> struct sk_buff *sk
> else if (!wait->tx_done)
> ret = -EAGAIN;
>
> - rcu_assign_pointer(skb_data->wait, NULL);
> - kfree_rcu(wait, rcu_head);
Please consider the following.
(moving "rcu_assign_pointer(skb_data->wait, NULL)" to be under "if (time_left == 0)")
if (time_left == 0) {
rcu_assign_pointer(skb_data->wait, NULL);
ret = -ETIMEDOUT;
} else if (!wait->tx_done) {
ret = -EAGAIN;
}
kfree_rcu(wait, rcu_head);
If completing side does run as expected (potential racing mentioned in this patch),
there is no real need to assign NULL back.
next prev parent reply other threads:[~2025-08-21 4:01 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-08-20 14:13 [PATCH rtw-next 0/2] a couple of fixes for rtw89 Fedor Pchelkin
2025-08-20 14:13 ` [PATCH rtw-next 1/2] wifi: rtw89: fix use-after-free in rtw89_core_tx_kick_off_and_wait() Fedor Pchelkin
2025-08-21 4:01 ` Zong-Zhe Yang [this message]
2025-08-21 9:11 ` Fedor Pchelkin
2025-08-22 7:52 ` Zong-Zhe Yang
2025-08-20 14:13 ` [PATCH rtw-next 2/2] wifi: rtw89: avoid circular locking dependency in ser_state_run() Fedor Pchelkin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=b4ec58864e544b0295ddb02ed408199b@realtek.com \
--to=kevin_yang@realtek.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-wireless@vger.kernel.org \
--cc=lvc-project@linuxtesting.org \
--cc=pchelkin@ispras.ru \
--cc=phhuang@realtek.com \
--cc=pkshih@realtek.com \
--cc=stable@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.