From: LB F <goainwo@gmail.com>
To: Ping-Ke Shih <pkshih@realtek.com>
Cc: "linux-wireless@vger.kernel.org" <linux-wireless@vger.kernel.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [BUG] wifi: rtw88: Hard system freeze on RTL8821CE when power_save is enabled (LPS/ASPM conflict)
Date: Tue, 10 Mar 2026 17:12:19 +0200
Message-ID: <CALdGYqSz3SNzoSjUQvK6FgTc2Xkac52=T5A7Lt=d+nxAXGgJVw@mail.gmail.com>
In-Reply-To: <CALdGYqQb=Vt0jjqW7k8RGMV1gczL0cg-26cHgCm3MmzBjezGMQ@mail.gmail.com>
Hi Ping-Ke,
Thank you for your guidance. To provide you with the cleanest possible
diagnostic data, we devised a strict testing environment:
1. **Live USB Environment:** We booted a completely fresh Live USB of
CachyOS (Kernel 6.19.6) to eliminate any potential interference from
installed software, TLP profiles, or custom NetworkManager
configurations.
2. **Aggressive Local Logging:** Because the hard freeze appears to
lock the PCIe bus and takes the Wi-Fi adapter down with it, using
`netconsole` was not an option (the network link drops before the
freeze).
To work around this, we wrote an "aggressive logger" script that writes
`dmesg -w` output to a separate FAT32 USB drive while issuing a `sync`
command twice a second. This flushes the page cache to the drive, so
the log on disk is current to within about half a second of the hard
freeze. The script we used was:
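```bash
#!/bin/bash
# Follow the kernel log into a file on the separate USB drive.
LOG_FILE="/run/media/liveuser/LOGS/kernel_freeze.log"
dmesg -w > "$LOG_FILE" &
# Flush cached writes to disk twice a second.
while true; do
    sync
    sleep 0.5
done
```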
3. **No Workarounds:** Neither workaround was active in this test
(`disable_aspm=n`, `disable_lps_deep=n`). We manually enabled power
saving (`iw dev wlan0 set power_save on`) and triggered the freeze via
typical web browsing.
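For anyone reproducing this, the state described in item 3 can be
confirmed before a run with something like the sketch below. It was not
part of the test above; the helper name `show_rtw88_params` is
hypothetical, and it assumes the two module parameters are exposed under
`/sys/module/<module>/parameters/`, as is usual for loaded modules.

```shell
#!/bin/bash
# Hypothetical helper: print the rtw88 workaround parameters found under
# a sysfs module root. The root is an argument (default /sys/module) so
# the function can also be pointed at a fake tree.
show_rtw88_params() {
    local root="${1:-/sys/module}"
    local entry file
    for entry in rtw88_pci/disable_aspm rtw88_core/disable_lps_deep; do
        # entry is "<module>/<parameter>"; build the sysfs path from it.
        file="$root/${entry%/*}/parameters/${entry#*/}"
        if [ -r "$file" ]; then
            printf '%s=%s\n' "$entry" "$(cat "$file")"
        else
            printf '%s=unavailable\n' "$entry"
        fi
    done
}
```

On the test machine this would be run as `show_rtw88_params`, followed
by `iw dev wlan0 get power_save` to confirm the power-save setting.
Boolean module parameters read back as `Y` or `N`.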
Here are the relevant log excerpts, showing the adapter connecting to
the network, sitting idle for about 10 seconds (presumably entering
power-saving states), and then failing to accept H2C commands in what
looks like a firmware lockup, shortly before the system froze:
```
[ 304.709201] audit: type=1111 ... op=connection-add-activate ...
name="Andrey_5G" ...
[ 305.617785] wlan0: authenticate with 6c:68:a4:1c:97:5b ...
[ 305.660333] wlan0: authenticated
[ 305.661661] wlan0: associate with 6c:68:a4:1c:97:5b (try 1/3)
[ 305.663404] wlan0: associated
[ 305.719997] wlan0: Limiting TX power to 30 (30 - 0) dBm as
advertised by 6c:68:a4:1c:97:5b
... (~10 seconds of idle network time) ...
[ 316.907114] rtw88_8821ce 0000:13:00.0: failed to send h2c command
[ 316.911190] rtw88_8821ce 0000:13:00.0: failed to send h2c command
[ 316.921504] rtw88_8821ce 0000:13:00.0: coex request time out
...
[ 349.630952] rtw88_8821ce 0000:13:00.0: failed to send h2c command
[ 349.635023] rtw88_8821ce 0000:13:00.0: failed to send h2c command
[ 357.811235] rtw88_8821ce 0000:13:00.0: firmware failed to leave lps state
[ 359.797238] rtw88_8821ce 0000:13:00.0: firmware failed to leave lps state
... (repeats indefinitely until hard reset) ...
```
As the logs show, the adapter authenticates and associates normally.
About 11 seconds later, after a brief idle period, H2C commands start
failing and the coex request times out. Roughly 40 seconds after that,
the firmware repeatedly fails to leave the LPS state, shortly before
the system-wide hard freeze begins.
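To pull the same timing out of the full log, a minimal sketch like the
following could be used. It was not part of our test setup; it assumes
the `dmesg` line format shown in the excerpt above, and the function
names `first_ts` and `idle_gap` are made up for illustration.

```shell
#!/bin/bash
# Print the timestamp (seconds since boot) of the first log line that
# contains a fixed string. $1 = string to search for, $2 = log file.
first_ts() {
    grep -F -m1 -- "$1" "$2" | sed -n 's/^\[ *\([0-9][0-9.]*\)\].*/\1/p'
}

# Print the time in seconds between association and the first failed
# H2C command. $1 = log file. Returns 1 if either event is missing.
idle_gap() {
    local log="$1" assoc fail
    assoc=$(first_ts 'wlan0: associated' "$log")
    fail=$(first_ts 'failed to send h2c command' "$log")
    [ -n "$assoc" ] && [ -n "$fail" ] || return 1
    awk -v a="$assoc" -v f="$fail" 'BEGIN { printf "%.3f\n", f - a }'
}
```

Usage would be `idle_gap /run/media/liveuser/LOGS/kernel_freeze.log`.
Matching on fixed strings keeps the `associate with ...` line from being
mistaken for the `associated` line.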
We will upload the full, unabridged `.log` file to our Bugzilla thread
(Bug 221195) momentarily, but we wanted to provide you with this exact
'smoking gun' trace right away to help identify the root cause.
Please let us know if this information is helpful or if there are any
specific module patches or further tests you would like us to perform
to assist with debugging.
Best regards,
Oleksandr
On Tue, 10 Mar 2026 at 13:01, LB F <goainwo@gmail.com> wrote:
>
> Hi Ping-Ke,
>
> Thank you for the incredibly fast response and assistance!
>
> > Can you dig kernel log (by netconsole or ramoops) if something useful?
> > I'd like to know this is hardware level freeze or kernel can capture something wrong.
>
> I managed to pull a call trace from a historic journald log just
> before the system hung. The kernel gets trapped in an IRQ thread
> inside `rtw_pci_interrupt_threadfn`, calling up into `mac80211`
> `ieee80211_rx_list` before everything freezes. Here is the relevant
> snippet:
>
> ```text
> Call Trace:
> <IRQ>
> ? __alloc_skb+0x23a/0x2a0
> ? __alloc_skb+0x10c/0x2a0
> ? __pfx_irq_thread_fn+0x10/0x10
> [ ... truncated module list ... ]
> Tainted: G W I 6.19.6-2-cachyos #1 PREEMPT(full)
> Hardware name: HP HP Notebook/81F0, BIOS F.50 11/20/2020
> RIP: 0010:ieee80211_rx_list+0x1012/0x1020 [mac80211]
> CPU: 2 UID: 0 PID: 765 Comm: irq/56-rtw88_pc
> rtw_pci_interrupt_threadfn+0x239/0x310 [rtw88_pci]
> ```
>
> It behaves exactly like a PCIe bus deadlock or a hardware fault that
> eventually brings down the CPU handling the IRQ.
>
> > Are these totally needed to workaround the problem? Or disable_aspm is enough?
> > I'd list them in order of power consumption impact:
> > 1. disable_aspm=y
> > 2. disable_lps_deep=y
> > 3. disable WiFi power save
>
> To verify which parameters are strictly necessary, I performed
> isolated testing today. I ensured no other modprobe configs were
> active, rebuilt the initramfs, and manually enforced that
> `wifi.powersave` was active via `iw dev wlan0 set power_save on`
> during all tests (as the OS power management profiles were defaulting
> it to off, which initially masked the issue).
>
> I tested each workaround individually across multiple sleep/wake
> cycles and active usage:
>
> **Test 1 (ASPM Disabled, LPS Deep Enabled):**
> - Module parameters: `rtw88_pci disable_aspm=y` (and `rtw88_core
> disable_lps_deep=n`)
> - Result: Stable. No freezes were observed during usage or transitions
> into/out of S3 sleep while power saving was enforced.
>
> **Test 2 (ASPM Enabled, LPS Deep Disabled):**
> - Module parameters: `rtw88_core disable_lps_deep=y` (and `rtw88_pci
> disable_aspm=n`)
> - Result: Stable. No freezes were observed under the same forced power
> save conditions.
>
> **Conclusion:** It appears we do not need both workarounds
> simultaneously for this specific hardware. Using only `disable_aspm=y`
> seems to be sufficient to prevent the system freeze. Given your note
> about the power consumption impact ranking, this looks like the
> optimal path forward.
>
> > But what does 'deadlock' mean? As I know NAPI poll is scheduled by ISR,
> > and going to receive packets. The rx_no_aspm workaround is to forcely turn
> > off ASPM during this period.
>
> By "deadlock" I meant a hardware-level bus lockup. It seems the
> physical RTL8821CE chip itself crashes or hangs the system's PCIe bus
> when trying to negotiate waking up from ASPM L1 while simultaneously
> existing in `LPS_DEEP_MODE_LCLK`. The `rx_no_aspm` workaround in NAPI
> helps during active Rx decoding, but the laptop often freezes while
> completely idle, presumably when the AP sends a basic beacon, the chip
> attempts to leave LPS Deep + L1, and the hardware simply gives up and
> halts the system.
>
> > We have not modified RTL8821CE for a long time, so I'd add workaround
> > to specific platform as mentioned above.
>
> Adding a DMI/platform quirk specifically for this laptop to disable
> ASPM would be wonderful and deeply appreciated. I agree it is safer
> than touching the global flags for hardware that is functioning
> correctly out in the wild.
>
> Here is the exact identifying information for my system:
>
> System Vendor: HP
> Product Name: HP Notebook
> SKU Number: P3S95EA#ACB
> Family: 103C_5335KV
> PCI ID: 10ec:c821
> Subsystem ID: 103c:831a
>
> I am completely ready to test any patch or quirk you send my way.
> Thank you so much for your time and helping track this down!
>
> Best regards,
> Oleksandr