public inbox for linux-wireless@vger.kernel.org
From: LB F <goainwo@gmail.com>
To: Ping-Ke Shih <pkshih@realtek.com>
Cc: "linux-wireless@vger.kernel.org" <linux-wireless@vger.kernel.org>,
	 "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [BUG] wifi: rtw88: Hard system freeze on RTL8821CE when power_save is enabled (LPS/ASPM conflict)
Date: Wed, 25 Mar 2026 22:38:36 +0200
Message-ID: <CALdGYqQ5U2USCqVEixoDda1Xd2ugBakh1K1QkaKAU7HPSTTNWg@mail.gmail.com>
In-Reply-To: <ba9790526e4e42c386642a05fcbc2f34@realtek.com>

Subject: Cross-platform analysis: RTL8821CE ASPM/LPS instability
        affects multiple OEM platforms beyond HP P3S95EA#ACB

Hi Ping-Ke,

First of all, thank you very much for your work on the rtw88 driver
and for the time you spent helping us resolve the issues on our HP
laptop. Both patches -- the v2 DMI quirk (ASPM + LPS Deep) and the
v2 RX rate validation (rx.c) -- have been tested and verified stable
on our system across two kernel updates (6.19.9-1 and 6.19.9-2),
sustained network load (~1.9 GB), and multiple suspend/resume cycles.
The system is now completely free of freezes, h2c errors, and
mac80211 warnings. Your patches genuinely solved every issue we had.

While working through this, I noticed that many other users across
different hardware platforms appear to be experiencing the same
problems that your patches resolved for us. I decided to collect
and organize these observations in case they might be useful to you.

Please note that this is an amateur analysis, not a professional
one -- I am just a user trying to help. It is entirely possible
that I have missed nuances or made incorrect assumptions. My only
goal is to share what I found, in case it provides useful data
points or sparks ideas for broader improvements. If any of this
is not relevant or not useful, please feel free to disregard it.


1. KERNEL BUGZILLA: OPEN RTL8821CE REPORTS
==========================================

I reviewed the six open RTL8821CE bugs in kernel.org Bugzilla. Three
of them show symptoms that directly match the root causes
addressed by your patches (ASPM deadlock and LPS Deep h2c failures).

--- Directly correlated with ASPM/LPS patches ---

Bug 215131 - System freeze (ASPM L1 deadlock)
  Title:    "Realtek 8821CE makes the system freeze after 9e2fd29864c5
             (rtw88: add napi support)"
  Reporter: Kai-Heng Feng (Canonical)
  Kernel:   5.15+
  Symptoms: Hard freeze preceded by "pci bus timeout, check dma status"
            warnings. RX tag mismatch in rtw_pci_dma_check().
  Workaround confirmed by reporter: rtw88_pci.disable_aspm=1
  Reporter note: "disable_aspm=1 is not a viable workaround because
                  it increases power consumption significantly"
  Status:   OPEN since 2021-11-24.
  Link:     https://bugzilla.kernel.org/show_bug.cgi?id=215131
  Relevance: Appears to share its root cause with Bug 221195 (our HP
             report). The reporter's confirmed workaround
             (disable_aspm=1) is exactly what the DMI quirk implements.

Bug 219830 - h2c/LPS failures + BT crackling
  Title:    "rtw88_8821ce (RTL8821CE) slow performance and error
             messages in dmesg"
  Reporter: Bmax Y14 laptop, Fedora 41, kernel 6.13.5
  Symptoms: - "failed to send h2c command" (periodic)
            - "firmware failed to leave lps state" (periodic)
            - Lower signal strength vs Windows
            - Bluetooth crackling during audio playback
  Cross-ref: https://github.com/lwfinger/rtw88/issues/306
  Status:   OPEN since 2025-03-02.
  Link:     https://bugzilla.kernel.org/show_bug.cgi?id=219830
  Relevance: The h2c/lps errors are the same messages we observed
             before the DMI quirk disabled LPS Deep Mode. The BT
             crackling may correlate with the invalid RX rate
             condition addressed by your rx.c validation patch.

Bug 218697 - TX queue flush timeout during scan
  Title:    "rtw88_8821ce timed out to flush queue 2"
  Reporter: Arch Linux, kernel 6.8.4 / 6.8.5
  Symptoms: - "timed out to flush queue 2" every ~30 seconds
            - "failed to get tx report from firmware"
            - Stack trace: ieee80211_scan_work -> rtw_ops_flush ->
              rtw_mac_flush_queues timeout
  Status:   OPEN since 2024-04-08.
  Link:     https://bugzilla.kernel.org/show_bug.cgi?id=218697
  Relevance: The flush timeout occurs when the firmware cannot
             respond to TX queue operations -- consistent with
             firmware being stuck in LPS Deep during scan.

--- Potentially related (no confirmed workaround data) ---

Bug 217491 - "timed out to flush queue 1" regression (kernel 6.3)
  Manjaro user. Floods of "timed out to flush queue 1/2".
  Similar pattern to Bug 218697.
  Link: https://bugzilla.kernel.org/show_bug.cgi?id=217491

Bug 217781 - Random sudden dropouts
  Arch user. Random disconnections during streaming/transfers.
  Reproduced on Ubuntu and Fedora (kernels 5.15 to 6.4).
  Link: https://bugzilla.kernel.org/show_bug.cgi?id=217781

Bug 216685 - Low wireless speed
  Reduced throughput vs expected 802.11ac performance.
  Link: https://bugzilla.kernel.org/show_bug.cgi?id=216685


2. SYMPTOM-TO-PATCH MAPPING
=============================

dmesg signature                    Patch that resolves it
--------------------------         ----------------------
Hard system freeze                 pci.c DMI quirk (disable ASPM)
"pci bus timeout, check dma"       pci.c DMI quirk (disable ASPM)
"firmware failed to leave lps"     pci.c DMI quirk (disable LPS Deep)
"failed to send h2c command"       pci.c DMI quirk (disable LPS Deep)
"timed out to flush queue N"       pci.c DMI quirk (disable LPS Deep) [1]
"failed to get tx report"          pci.c DMI quirk (disable LPS Deep) [1]
VHT NSS=0 mac80211 WARNING         rx.c rate validation (v2)

Signatures observed in bugs: 215131, 219830, 218697, 221195. The
patches themselves were tested only on the Bug 221195 platform.
[1] Inferred: flush timeout occurs when firmware cannot exit LPS
    to process TX queue operations.
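
As a rough illustration of how the table above could be applied to a
dmesg capture, here is a minimal sketch. The function name
classify_rtw88_log is hypothetical, and the labels it prints are only
shorthand for the table rows. A hard freeze leaves no log line to
match, and the exact text of the VHT NSS=0 warning is not quoted in
this mail, so neither is covered.

```shell
#!/bin/sh
# Hypothetical helper: tag rtw88 log lines read from stdin with the
# quirk area suggested by the symptom-to-patch table.
# Lines that match none of the known signatures are dropped.
classify_rtw88_log() {
    while IFS= read -r line; do
        case "$line" in
            *"pci bus timeout, check dma"*)
                echo "aspm: $line" ;;
            *"firmware failed to leave lps"*|*"failed to send h2c command"*)
                echo "lps_deep: $line" ;;
            *"timed out to flush queue"*|*"failed to get tx report"*)
                # Footnote [1]: this mapping is inferred, not confirmed.
                echo "lps_deep (inferred): $line" ;;
        esac
    done
}
```

Typical use would be "dmesg | classify_rtw88_log" on an affected
machine.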


3. AFFECTED HARDWARE
=====================

Current DMI quirk coverage: HP P3S95EA#ACB only.

Platforms confirmed affected in Bugzilla:
  Bug 221195: HP Notebook 81F0 (P3S95EA#ACB)
  Bug 215131: unknown (Canonical upstream testing)
  Bug 219830: Bmax Y14
  Bug 218697: unknown (Arch Linux user)

Platforms reported in community forums as requiring
disable_aspm=Y and/or disable_lps_deep=Y for stable RTL8821CE:
  - HP 17-by4063CL
  - Lenovo IdeaPad S145-15AST, IdeaPad 3, IdeaPad 330S
  - ASUS VivoBook X series
  - Acer Aspire 3/5 series

All use PCI ID 10ec:c821 (vendor:device) with different Subsystem IDs.
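
For reference, the module-parameter workaround named above can be
written as a modprobe.d fragment. This is a minimal sketch: the
function name and the target file name are hypothetical, and placing
disable_lps_deep under rtw88_core is an assumption based on the
mainline driver layout ("modinfo rtw88_core" and "modinfo rtw88_pci"
show which parameters a given kernel actually provides).

```shell
#!/bin/sh
# Hypothetical helper: print a modprobe.d fragment that applies the
# community workaround (ASPM off, LPS Deep Mode off) for rtw88.
print_rtw88_workaround() {
    cat <<'EOF'
# Workaround for RTL8821CE freezes / h2c errors (costs extra power)
options rtw88_pci disable_aspm=Y
# Assumption: disable_lps_deep is a parameter of rtw88_core
options rtw88_core disable_lps_deep=Y
EOF
}
```

One way to install it would be
"print_rtw88_workaround | sudo tee /etc/modprobe.d/rtw88-workaround.conf"
followed by a reboot or a reload of the rtw88 modules.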


4. LPS_DEEP_MODE_LCLK IN THE rtw88 TREE
=========================================

I checked in the source which chips set the same
lps_deep_mode_supported flag:

Chip file       lps_deep_mode_supported            PCIe variant
---------       -----------------------            ------------
rtw8821c.c      BIT(LPS_DEEP_MODE_LCLK)            rtw8821ce ✓
rtw8822c.c      BIT(LPS_DEEP_MODE_LCLK) | PG       rtw8822ce ✓
rtw8822b.c      BIT(LPS_DEEP_MODE_LCLK)            rtw8822be ✓
rtw8814a.c      BIT(LPS_DEEP_MODE_LCLK)            rtw8814ae ✓
rtw8723d.c      0                                  rtw8723de ✗
rtw8703b.c      0                                  (SDIO)     -
rtw8821a.c      0                                  (legacy)   -

Source references:
  rtw8821c.c:2002  rtw8822c.c:5365  rtw8822b.c:2545
  rtw8814a.c:2211  rtw8723d.c:2144
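
The line numbers above will drift between kernel versions, so the
check may be easier to repeat with a search than by line number. A
minimal sketch, assuming a kernel source checkout; the function name
is hypothetical and the -H option (always print the file name) is
assumed to be available, as it is in GNU and BSD grep.

```shell
#!/bin/sh
# Hypothetical helper: list every lps_deep_mode_supported assignment
# in the rtw88 chip files, with file name and line number.
# $1: rtw88 source directory (defaults to the in-tree path).
list_lps_deep_flags() {
    dir=${1:-drivers/net/wireless/realtek/rtw88}
    grep -Hn 'lps_deep_mode_supported' "$dir"/rtw8*.c
}
```

Run from the top of the kernel tree as "list_lps_deep_flags".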

RTL8822CE community reports (Manjaro, Arch forums) confirm the
same disable_aspm=Y + disable_lps_deep=Y workaround is effective,
consistent with rtw8822c.c having LCLK enabled.


5. COMMUNITY WORKAROUND REFERENCES
====================================

The following are concrete examples of forums and wikis where the
same modprobe workarounds are actively recommended:

Arch Wiki - RTW88 section:
  https://wiki.archlinux.org/title/Network_configuration/Wireless
  (section "RTW88" and "rtl8821ce" under Troubleshooting/Realtek)
  Recommends rtw88-dkms-git and documents the rtw88_8821ce issues.

Arch Wiki - RTW89 section (same page):
  Documents the identical ASPM pattern for the newer RTW89 driver:
    options rtw89_pci disable_aspm_l1=y disable_aspm_l1ss=y
    options rtw89_core disable_ps_mode=y
  This suggests the ASPM/LPS interaction is a systemic Realtek
  design issue, not specific to rtw88 or the 8821CE chip.

Arch Linux Forum - RTL8821CE thread:
  https://bbs.archlinux.org/viewtopic.php?id=273440
  Referenced by the Arch Wiki as the primary rtl8821ce discussion.

lwfinger/rtw88 GitHub:
  https://github.com/lwfinger/rtw88/issues/306
  Directly cross-referenced by Bug 219830. The reporter describes h2c
  errors and investigated the antenna hardware (no fault found).

lwfinger/rtw89 GitHub:
  https://github.com/lwfinger/rtw89/issues/275#issuecomment-1784155449
  Same ASPM-disable pattern documented for the newer RTW89 driver
  on HP and Lenovo laptops.


6. OBSERVATIONS
================

a) Three open Bugzilla reports (215131, 219830, 218697) show
   symptoms matching those resolved by your patches, but have
   no upstream fix available because the DMI quirk covers only
   the HP P3S95EA#ACB.

b) Bug 215131 reporter (Kai-Heng Feng, Canonical) explicitly
   confirmed disable_aspm=1 as a workaround and called it
   "not viable" due to power cost. A DMI quirk for their
   platform would give them a proper fix.

c) The Arch Wiki documents the same ASPM-disable pattern for
   both RTW88 and RTW89 drivers, suggesting the underlying
   hardware/firmware limitation is shared across generations.

d) Asking Bugzilla reporters to provide their DMI data
   (dmidecode -t 1,2) could allow extending the quirk table
   with minimal effort. Because a DMI match applies only to the
   listed models, unaffected platforms would see no change.

e) The rx.c rate validation patch is chip-agnostic and requires
   no platform-specific consideration.
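
For observation (d), reporters may prefer not to post serial numbers
or UUIDs. A minimal sketch of a filter that keeps only the fields
likely to matter for a quirk entry; the function name is hypothetical
and the choice of fields is my assumption, not something the quirk
code is known to require.

```shell
#!/bin/sh
# Hypothetical helper: read "dmidecode -t 1,2" output on stdin and
# keep the section titles plus the identifying fields, dropping
# Serial Number, UUID and everything else.
dmi_quirk_fields() {
    grep -E '^(System Information|Base Board Information)$|^[[:space:]]+(Manufacturer|Product Name|Version|SKU Number|Family):'
}
```

Usage would be "sudo dmidecode -t 1,2 | dmi_quirk_fields", with the
output pasted into the Bugzilla report.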


7. REFERENCES
==============

Kernel Bugzilla:
  https://bugzilla.kernel.org/show_bug.cgi?id=215131
  https://bugzilla.kernel.org/show_bug.cgi?id=219830
  https://bugzilla.kernel.org/show_bug.cgi?id=218697
  https://bugzilla.kernel.org/show_bug.cgi?id=217491
  https://bugzilla.kernel.org/show_bug.cgi?id=217781
  https://bugzilla.kernel.org/show_bug.cgi?id=216685

GitHub:
  https://github.com/lwfinger/rtw88/issues/306
  https://github.com/lwfinger/rtw89/issues/275

Arch Wiki:
  https://wiki.archlinux.org/title/Network_configuration/Wireless

Arch Linux Forum:
  https://bbs.archlinux.org/viewtopic.php?id=273440

Source code (lps_deep_mode_supported):
  drivers/net/wireless/realtek/rtw88/rtw8821c.c:2002
  drivers/net/wireless/realtek/rtw88/rtw8822c.c:5365
  drivers/net/wireless/realtek/rtw88/rtw8822b.c:2545
  drivers/net/wireless/realtek/rtw88/rtw8814a.c:2211
  drivers/net/wireless/realtek/rtw88/rtw8723d.c:2144


Best regards,
Oleksandr Havrylov <goainwo@gmail.com>
