From: Luka Gejak <luka.gejak@linux.dev>
To: Bitterblue Smith <rtl8821cerfe2@gmail.com>,
	Ping-Ke Shih <pkshih@realtek.com>, Kalle Valo <kvalo@kernel.org>
Cc: Yan-Hsuan Chuang <yhchuang@realtek.com>,
	Brian Norris <briannorris@chromium.org>,
	Stanislaw Gruszka <sgruszka@redhat.com>,
	linux-wireless@vger.kernel.org, linux-kernel@vger.kernel.org,
	stable@vger.kernel.org, luka.gejak@linux.dev
Subject: Re: [PATCH] wifi: rtw88: increase TX report timeout to fix race condition
Date: Fri, 01 May 2026 22:46:51 +0200
Message-ID: <6CD170FE-CAED-4B91-AEED-A1AFB98FFE8A@linux.dev>
In-Reply-To: <72f6fffd-bd77-437f-a9d9-6a542a8b365b@gmail.com>

On May 1, 2026 9:26:30 PM GMT+02:00, Bitterblue Smith <rtl8821cerfe2@gmail.com> wrote:
>On 01/05/2026 18:04, luka.gejak@linux.dev wrote:
>> From: Luka Gejak <luka.gejak@linux.dev>
>> 
>> The driver expects the firmware to report TX status within 500ms.
>> However, a race condition exists when the hardware is under heavy TX
>> load and is simultaneously interrupted by background scans or
>> power-saving state transitions. During these events, the firmware may
>> go off-channel for longer than 500ms, delaying the TX reports.
>> 
Hi Bitterblue,
thanks for the review.
>
>But power saving state transitions should not happen during heavy TX load.
>
You are absolutely right that power save transitions don't happen 
during heavy TX. The issue is strictly tied to off-channel dwell time.
I reliably trigger this on my rtl8723du (USB) by forcing background 
scans (iw dev wlanX scan) while under heavy iperf3 load. The firmware 
goes off-channel to scan, which delays the TX report well beyond the 
current 500ms threshold.

>> When this happens, the purge timer fires prematurely, dropping the
>> tracking skbs from the queue and spamming the kernel log with:
>> "failed to get tx report from firmware". Dropping these tracking skbs
>> prevents the driver from reporting TX status back to mac80211, which
>> breaks rate control accounting and degrades performance.
>> 
>
>But mac80211 doesn't handle rate control for these chips. How much does
>performance degrade?
>

I understand the firmware handles that internally. The performance 
degradation I am actually seeing is TCP window collapse, as the host 
stack interprets the dropped tracking skbs as packet loss. In my 
testing with iperf3, throughput drops from a steady 80-90 Mbps to 
near-zero for nearly 2 seconds following the scan before recovery 
begins.

>> Increase RTW_TX_PROBE_TIMEOUT to 2500ms. This timeout is large enough
>> to comfortably accommodate the duration of full WiFi background scans
>> and sleep transitions without incorrectly tripping the purge timer,
>> while still eventually catching true firmware lockups.
>> 
>
>rtw88 supports many chips. Which one are you using?
>
>Perhaps provide a full description of the problem you encountered.
>

...

I also realize now that globally changing RTW_TX_PROBE_TIMEOUT to 
2500ms is too heavy-handed. Since this impacts all rtw88 chips, 
including PCIe variants where 500ms might be exactly what is needed to
catch a real firmware lockup, the blast radius is too large. How would
you prefer I handle this for the v2 patch? I can either implement a 
more conservative global bump, or make the timeout dynamic based on 
the HCI interface so USB devices get a longer timeout to accommodate 
the bus latency during scans.
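
To make the second option concrete, here is a rough standalone sketch in
plain C. It is not driver code: the enum, macro, and function names are
made up for illustration, the 2500ms USB value is simply carried over
from v1, and leaving every non-USB interface at 500ms is an assumption.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the driver's HCI type; not the rtw88 enum. */
enum hci_kind { HCI_PCIE, HCI_USB, HCI_SDIO };

/* Illustrative names. 500 ms is the current timeout discussed in this
 * thread; 2500 ms is the value proposed in the v1 patch. */
#define TX_REPORT_TIMEOUT_DEFAULT_MS 500
#define TX_REPORT_TIMEOUT_USB_MS     2500

/* Pick the TX report timeout from the bus type. Only USB is extended;
 * everything else keeps the default (an assumption, not a measurement). */
static unsigned int tx_report_timeout_ms(enum hci_kind hci)
{
	switch (hci) {
	case HCI_USB:
		return TX_REPORT_TIMEOUT_USB_MS;
	default:
		return TX_REPORT_TIMEOUT_DEFAULT_MS;
	}
}

/* Returns true when a report that arrives delay_ms after the frame was
 * queued would arrive only after the purge timer has already fired. */
static bool report_would_be_purged(enum hci_kind hci, unsigned int delay_ms)
{
	return delay_ms > tx_report_timeout_ms(hci);
}
```

With these numbers, a report delayed by 1200ms would still be purged on
PCIe but would be accepted on USB. The 1200ms figure is only an example
input, not something I have measured.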

Best regards,
Luka Gejak
