public inbox for linux-wireless@vger.kernel.org
From: Ping-Ke Shih <pkshih@realtek.com>
To: Christian Hewitt <christianshewitt@gmail.com>
Cc: Bitterblue Smith <rtl8821cerfe2@gmail.com>,
	"linux-wireless@vger.kernel.org" <linux-wireless@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: RE: [PATCH] wifi: rtw89: retry efuse physical map dump on transient failure
Date: Thu, 12 Mar 2026 08:28:08 +0000
Message-ID: <3dbffdd1086f48b58eff048c3fa99db9@realtek.com>
In-Reply-To: <70E90B9D-4C33-46B0-92B7-46969F6AF7B0@gmail.com>

Christian Hewitt <christianshewitt@gmail.com> wrote:
> 
> > On 12 Mar 2026, at 11:39 am, Ping-Ke Shih <pkshih@realtek.com> wrote:
> >
> > Christian Hewitt <christianshewitt@gmail.com> wrote:
> >>> On 12 Mar 2026, at 6:22 am, Ping-Ke Shih <pkshih@realtek.com> wrote:
> >>>
> >>> Christian Hewitt <christianshewitt@gmail.com> wrote:
> >>>>> On 11 Mar 2026, at 7:05 am, Ping-Ke Shih <pkshih@realtek.com> wrote:
> >>>>>
> >>>>> Christian Hewitt <christianshewitt@gmail.com> wrote:
> >>>>>>
> >>>>>>> On 9 Mar 2026, at 6:35 am, Ping-Ke Shih <pkshih@realtek.com> wrote:
> >>>>>>>
> >>>>>>> Christian Hewitt <christianshewitt@gmail.com> wrote:
> >>>>>>>>
> >>>>>>>>> On 2 Mar 2026, at 10:04 am, Ping-Ke Shih <pkshih@realtek.com> wrote:
> >>>>>>>>>
> >>>>>>>>> Christian Hewitt <christianshewitt@gmail.com> wrote:
> >>>>>>>>>>> On 2 Mar 2026, at 9:47 am, Ping-Ke Shih <pkshih@realtek.com> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>> Christian Hewitt <christianshewitt@gmail.com> wrote:
> >>>>>>>>>>>> On Radxa Rock 5B with an RTL8852BE combo WiFi/BT card, the efuse
> >>>>>>>>>>>> physical map dump intermittently fails with -EBUSY during probe.
> >>>>>>>>>>>> The failure occurs in rtw89_dump_physical_efuse_map_ddv() where
> >>>>>>>>>>>> read_poll_timeout_atomic() times out waiting for the B_AX_EF_RDY
> >>>>>>>>>>>> bit after 1 second.
> >>>>>>>>>>>
> >>>>>>>>>>> I'm checking internally how we handle this case.
> >>>>>>>
> >>>>>>> Sorry for the late reply.
> >>>>>>>
> >>>>>>> We have seen WiFi and BT reading efuse at the same time cause a
> >>>>>>> problem similar to yours. The workaround is like yours, which
> >>>>>>> increases the timeout.
> >>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> [...]
> >>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> For context, firmware also fails (and recovers) sometimes:
> >>>>>>>>>>>
> >>>>>>>>>>> Did you mean this happens only sometimes, not always?
> >>>>>>>>>>
> >>>>>>>>>> It’s another intermittent behaviour observed on this board (and not
> >>>>>>>>>> related to the issue this patch targets). It occurs less frequently
> >>>>>>>>>> than the efuse issue and the existing retry mechanism in the driver
> >>>>>>>>>> ensures firmware load always succeeds.
> >>>>>>>
> >>>>>>> This might have the same cause, i.e. the firmware reading efuse.
> >>>>>>>
> >>>>>>> Though we can increase the timeout and retry count as a workaround, I
> >>>>>>> wonder if you can control when the WiFi and BT kernel modules load?
> >>>>>>>
> >>>>>>> Also, can you run an experiment where you load the BT module first, and
> >>>>>>> then load the WiFi module after 10 seconds (a large number chosen
> >>>>>>> intentionally, or even larger)?
> >>>>>>
> >>>>>> https://paste.libreelec.tv/charmed-turkey.sh
> >>>>>>
> >>>>>> I’ve run the above script ^ which removes the wifi and bt modules in
> >>>>>> sequence then reloads them in the reverse order with a delay between
> >>>>>> bt and wifi modules loading, then checks for error messages. Over 200
> >>>>>> test cycles with a 10s delay all were clean (no errors). I also ran
> >>>>>> cycles with a 2 second delay and 0 second delay before starting wifi
> >>>>>> module load and those were clear too. I guess that proves sequencing
> >>>>>> avoids the efuse contention issue? - although it’s not possible in
> >>>>>> the real-world so not sure there’s huge value in knowing that :)
> >>>>>
> >>>>> Thanks for the experiments.
> >>>>>
> >>>>> I would still like to know whether it is possible to change the
> >>>>> sequence/timing of kernel module loading at boot time at the system
> >>>>> level. I mean, can you adjust the sequence on the Rock 5B board?
> >>>>
> >>>> I’m not a kernel expert, but I’ve always understood module probe and
> >>>> load ordering to not be guaranteed; as many things run in parallel and
> >>>> are highly dependent on the specific hardware capabilities and kernel
> >>>> config being used.
> >>>
> >>> I have heard of people changing the sequence/timing of kernel module
> >>> loading, so I'd like you to try this method.
> >>>
> >>> I asked an AI, and it said it is possible to create a .conf file under
> >>> /etc/modprobe.d/ and use the `softdep` syntax to enforce a loading
> >>> sequence. Could you try this?
> >>
> >> I can test this, but even if it works it’s not a fix because modprobe
> >> confs configured in userspace are only used with loadable modules that
> >> have been compiled with =m, not built-in modules that are resident in
> >> kernel memory and compiled with =y; and distros are free to choose how
> >> their kernel is configured. NB: I’m not sure if there are any general
> >> kernel rules for this, but I’d expect there to be a general principle of
> >> modules being resilient to transient host states and not depending on
> >> userspace packaging to load correctly?
> >
> > I think built-in modules will be loaded sequentially (not in parallel)
> > by device_initcall(), so BT and WiFi drivers will not read efuse
> > at the same time.
> 
> Even if built-in modules are loaded sequentially, the kernel still has
> many dynamically loaded modules; and distros can configure that mix as
> they like, so you still cannot predict or guarantee the outcome. That
> could be changed by requiring rtw89 modules to be =y, but that goes
> against the principles of a modular kernel and I’d expect appropriately
> rude comments to the idea if submitted :)

As far as I know, dynamically loaded modules are loaded after the init
process starts, but that's not your case. Let's clarify whether
/etc/modprobe.d/ with the `softdep` option can resolve your problem. I'd
like to know the result. :)
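
For reference, a minimal sketch of such a file. The file name is
hypothetical, the WiFi module name is taken from the log prefix quoted
in this thread, and btusb is only an assumption about which module
drives the Bluetooth side of the card (the thread does not name it):

```
# /etc/modprobe.d/rtw89-order.conf  (hypothetical file name)
# Ask modprobe to load btusb before rtw89_8852be.
# Assumption: the card's Bluetooth function is handled by btusb.
softdep rtw89_8852be pre: btusb
```

Note that `softdep` only influences the order in which modprobe loads
modules. It adds no delay, so it may not by itself guarantee that the
BT side has finished its efuse access before the WiFi driver probes.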

> 
> >>>>> In addition, did the messages below not appear in these experiments?
> >>>>>
> >>>>> [    7.864148] rtw89_8852be 0002:21:00.0: fw security fail
> >>>>> [    7.864154] rtw89_8852be 0002:21:00.0: download firmware fail
> >>>>
> >>>> No, because even if we have a 0s delay between each group of modules
> >>>> being loaded, they are loaded in series, so we work around the issue.
> >>>> Tweaking the script to background the module load loops so both run
> >>>> in parallel would be closer to normal conditions, and I would expect
> >>>> to start seeing failures and the retry mechanisms within the modules
> >>>> (as added in this patch) being triggered.
> >>>
> >>> One more question about downloading firmware. When you initially
> >>> reported this issue (modules loaded in parallel at boot time), this
> >>> message seemed to appear by chance. Since this driver retries the
> >>> firmware download, does it eventually succeed? Or does it still fail
> >>> after 5 retries?
> >>
> >> I have only seen firmware load fail a handful of times in many hundreds
> >> of boots and each time one retry attempt resulted in success. To be
> >> clear; I am not reporting firmware loading as a problem, it is not
> >> an issue for me. I’ve mentioned it only for context, i.e. it shows that
> >> a simple retry mechanism is effective at handling the similar issue with
> >> efuse map.
> >
> > I ask because I wonder whether the firmware download issue might also be
> > an efuse read issue. If so, a retry might resolve it as well.
> 
> Hard to know, but it's an infrequent event and the existing retry mechanism
> appears to work fine.
> 
> > Based on your results, it looks like retrying the efuse read can resolve
> > all the issues you found. What do you think?
> 
> The patch submitted resolves the efuse map dump for me. If there are more
> efuse accesses that need to be addressed I haven’t seen them in tests. If
> you are hinting to abstract things further I’d ask you to please propose
> an alternative patch that I can test for you; I’m firmly at the novice end
> of kernel contributors and unlikely to spot where changes might be needed
> without being spoon-fed rather explicit instructions :)

I will start reviewing this patch in detail and consider whether there is
an alternative method.

Ping-Ke
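
For readers following the thread, the retry-on-transient-failure idea
under discussion can be sketched in plain userspace C. This is not the
submitted patch, which is not shown in this message. The struct, the
function names and the attempt limit are all hypothetical, and the
efuse read is replaced by a stand-in that reports -EBUSY a fixed number
of times before succeeding:

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for the hardware: not driver code. */
struct fake_efuse {
	int busy_calls;	/* calls that will still report -EBUSY */
	int calls;	/* total calls made so far */
};

/* Simulated efuse dump: -EBUSY while contended, 0 once free. */
static int fake_dump_efuse(struct fake_efuse *e)
{
	e->calls++;
	if (e->busy_calls > 0) {
		e->busy_calls--;
		return -EBUSY;
	}
	return 0;
}

/*
 * Retry only on -EBUSY, up to max_attempts calls in total.
 * Any other result (success or a different error) is returned
 * immediately. The last result is returned if attempts run out.
 */
static int dump_efuse_with_retry(struct fake_efuse *e, int max_attempts)
{
	int ret = -EINVAL;
	int attempt;

	assert(max_attempts > 0);

	for (attempt = 0; attempt < max_attempts; attempt++) {
		ret = fake_dump_efuse(e);
		if (ret != -EBUSY)
			break;
		/* A real driver would likely sleep briefly here. */
	}

	return ret;
}
```

Calling dump_efuse_with_retry() on a fake_efuse with busy_calls = 2 and
a limit of 5 returns 0 on the third call. With busy_calls above the
limit it returns -EBUSY, so a persistent failure is still reported to
the caller rather than hidden.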



Thread overview: 18+ messages
2026-03-01  4:24 [PATCH] wifi: rtw89: retry efuse physical map dump on transient failure Christian Hewitt
2026-03-02  5:47 ` Ping-Ke Shih
2026-03-02  5:55   ` Christian Hewitt
2026-03-02  6:04     ` Ping-Ke Shih
2026-03-02  6:17       ` Christian Hewitt
2026-03-09  2:35         ` Ping-Ke Shih
2026-03-10 17:16           ` Christian Hewitt
2026-03-11  3:05             ` Ping-Ke Shih
2026-03-11  4:20               ` Christian Hewitt
2026-03-12  2:22                 ` Ping-Ke Shih
2026-03-12  5:58                   ` Christian Hewitt
2026-03-12  7:39                     ` Ping-Ke Shih
2026-03-12  8:11                       ` Christian Hewitt
2026-03-12  8:28                         ` Ping-Ke Shih [this message]
2026-03-16  5:32 ` Ping-Ke Shih
2026-03-16 11:03   ` Christian Hewitt
2026-03-17  1:37     ` Ping-Ke Shih
2026-03-17  6:15       ` Christian Hewitt
