* [PATCH] wifi: rtw89: retry efuse physical map dump on transient failure
@ 2026-03-01 4:24 Christian Hewitt
  2026-03-02 5:47 ` Ping-Ke Shih
  2026-03-16 5:32 ` Ping-Ke Shih
  0 siblings, 2 replies; 18+ messages in thread
From: Christian Hewitt @ 2026-03-01 4:24 UTC
  To: Ping-Ke Shih, Bitterblue Smith, linux-wireless, linux-kernel

On Radxa Rock 5B with a RTL8852BE combo WiFi/BT card, the efuse
physical map dump intermittently fails with -EBUSY during probe.
The failure occurs in rtw89_dump_physical_efuse_map_ddv() where
read_poll_timeout_atomic() times out waiting for the B_AX_EF_RDY
bit after 1 second.

The root cause is a timing race during boot: the WiFi driver's chip
initialization (firmware download via PCIe) overlaps with the
Bluetooth firmware download to the same combo chip over USB. This can
leave the efuse controller temporarily unavailable when the WiFi
driver attempts to read the efuse map.

Add a retry loop (up to 3 attempts with 500ms delays) around the
physical efuse map dump in rtw89_parse_efuse_map_ax(). The firmware
download path already retries up to 5 times, but the efuse read that
follows has no retry logic, making it the weak link in the probe
sequence.

Signed-off-by: Christian Hewitt <christianshewitt@gmail.com>
---
The LibreELEC distro is minimalist and fast-booting with some history
of exposing racy probing and timing issues that don't show up on the
mainstream distros. Below are some before/after dmesg prints to
support the patch:

Before:

ROCK5B:~ # dmesg | grep rtw89
[    6.575383] rtw89_8852be 0002:21:00.0: loaded firmware rtw89/rtw8852b_fw-1.bin
[    6.575538] rtw89_8852be 0002:21:00.0: enabling device (0000 -> 0003)
[    6.585763] rtw89_8852be 0002:21:00.0: Firmware version 0.29.29.15 (6fb3ec41), cmd version 0, type 5
[    6.585779] rtw89_8852be 0002:21:00.0: Firmware version 0.29.29.15 (6fb3ec41), cmd version 0, type 3
[   10.174946] rtw89_8852be 0002:21:00.0: failed to dump efuse physical map
[   10.176584] rtw89_8852be 0002:21:00.0: failed to setup chip information
[   10.178173] rtw89_8852be 0002:21:00.0: probe with driver rtw89_8852be failed with error -16

After:

ROCK5B:~ # dmesg | grep rtw89
[    7.393558] rtw89_8852be 0002:21:00.0: loaded firmware rtw89/rtw8852b_fw-1.bin
[    7.393729] rtw89_8852be 0002:21:00.0: enabling device (0000 -> 0003)
[    7.406341] rtw89_8852be 0002:21:00.0: Firmware version 0.29.29.15 (6fb3ec41), cmd version 0, type 5
[    7.406363] rtw89_8852be 0002:21:00.0: Firmware version 0.29.29.15 (6fb3ec41), cmd version 0, type 3
[   11.041563] rtw89_8852be 0002:21:00.0: efuse dump failed, retrying (1)
[   11.798390] rtw89_8852be 0002:21:00.0: chip info CID: 0, CV: 1, AID: 0, ACV: 1, RFE: 1
[   11.801013] rtw89_8852be 0002:21:00.0: rfkill hardware state changed to enable

For context, firmware also fails (and recovers) sometimes:

ROCK5B:~ # dmesg | grep rtw89
[    6.436873] rtw89_8852be 0002:21:00.0: loaded firmware rtw89/rtw8852b_fw-1.bin
[    6.437165] rtw89_8852be 0002:21:00.0: enabling device (0000 -> 0003)
[    6.450228] rtw89_8852be 0002:21:00.0: Firmware version 0.29.29.15 (6fb3ec41), cmd version 0, type 5
[    6.450239] rtw89_8852be 0002:21:00.0: Firmware version 0.29.29.15 (6fb3ec41), cmd version 0, type 3
[    7.864148] rtw89_8852be 0002:21:00.0: fw security fail
[    7.864154] rtw89_8852be 0002:21:00.0: download firmware fail
[    7.864160] rtw89_8852be 0002:21:00.0: [ERR]fwdl 0x1E0 = 0x62
[    7.864165] rtw89_8852be 0002:21:00.0: [ERR]fwdl 0x83F0 = 0x80011
[    7.864173] rtw89_8852be 0002:21:00.0: [ERR]fw PC = 0xb8931150
[    7.864188] rtw89_8852be 0002:21:00.0: [ERR]fw PC = 0xb8931150
[    7.864203] rtw89_8852be 0002:21:00.0: [ERR]fw PC = 0xb8931154
[    7.864219] rtw89_8852be 0002:21:00.0: [ERR]fw PC = 0xb8931154
[    7.864234] rtw89_8852be 0002:21:00.0: [ERR]fw PC = 0xb8931154
[    7.864250] rtw89_8852be 0002:21:00.0: [ERR]fw PC = 0xb8931150
[    7.864265] rtw89_8852be 0002:21:00.0: [ERR]fw PC = 0xb8931150
[    7.864281] rtw89_8852be 0002:21:00.0: [ERR]fw PC = 0xb8931150
[    7.864296] rtw89_8852be 0002:21:00.0: [ERR]fw PC = 0xb8931154
[    7.864311] rtw89_8852be 0002:21:00.0: [ERR]fw PC = 0xb8931154
[    7.864327] rtw89_8852be 0002:21:00.0: [ERR]fw PC = 0xb8931154
[    7.864342] rtw89_8852be 0002:21:00.0: [ERR]fw PC = 0xb8931150
[    7.864358] rtw89_8852be 0002:21:00.0: [ERR]fw PC = 0xb8931154
[    7.864373] rtw89_8852be 0002:21:00.0: [ERR]fw PC = 0xb8931154
[    7.864387] rtw89_8852be 0002:21:00.0: [ERR]fw PC = 0xb8931154
[    8.181342] rtw89_8852be 0002:21:00.0: chip info CID: 0, CV: 1, AID: 0, ACV: 1, RFE: 1
[    8.184322] rtw89_8852be 0002:21:00.0: rfkill hardware state changed to enable

 drivers/net/wireless/realtek/rtw89/efuse.c | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/drivers/net/wireless/realtek/rtw89/efuse.c b/drivers/net/wireless/realtek/rtw89/efuse.c
index a2757a88d55d..d506f04ffd6c 100644
--- a/drivers/net/wireless/realtek/rtw89/efuse.c
+++ b/drivers/net/wireless/realtek/rtw89/efuse.c
@@ -270,6 +270,7 @@ int rtw89_parse_efuse_map_ax(struct rtw89_dev *rtwdev)
 	u8 *log_map = NULL;
 	u8 *dav_phy_map = NULL;
 	u8 *dav_log_map = NULL;
+	int retry;
 	int ret;
 
 	if (rtw89_read16(rtwdev, R_AX_SYS_WL_EFUSE_CTRL) & B_AX_AUTOLOAD_SUS)
@@ -289,7 +290,17 @@ int rtw89_parse_efuse_map_ax(struct rtw89_dev *rtwdev)
 		goto out_free;
 	}
 
-	ret = rtw89_dump_physical_efuse_map(rtwdev, phy_map, 0, phy_size, false);
+	for (retry = 0; retry < 3; retry++) {
+		if (retry) {
+			rtw89_warn(rtwdev, "efuse dump failed, retrying (%d)\n",
+				   retry);
+			fsleep(500000);
+		}
+		ret = rtw89_dump_physical_efuse_map(rtwdev, phy_map, 0,
+						    phy_size, false);
+		if (!ret)
+			break;
+	}
 	if (ret) {
 		rtw89_warn(rtwdev, "failed to dump efuse physical map\n");
 		goto out_free;
-- 
2.43.0

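The control flow of the retry loop in the patch can be exercised outside the kernel. What follows is only a user-space sketch, not driver code: `struct fake_efuse`, `fake_dump_efuse()` and `dump_efuse_with_retry()` are hypothetical names introduced for illustration, the fake device simply reports -EBUSY a fixed number of times, and the delay is an optional callback standing in for the `fsleep(500000)` call in the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Hypothetical stand-in for the efuse controller (not an rtw89 type). */
struct fake_efuse {
	int busy_reads_left;	/* dumps that will still report "not ready" */
	int reads_issued;	/* total dump attempts seen so far */
};

/* Stand-in for the physical map dump: 0 on success, -EBUSY if not ready. */
static int fake_dump_efuse(struct fake_efuse *hw)
{
	hw->reads_issued++;
	if (hw->busy_reads_left > 0) {
		hw->busy_reads_left--;
		return -EBUSY;
	}
	return 0;
}

/*
 * Bounded retry with the same shape as the loop in the patch: warn and
 * delay before every attempt except the first, stop on the first success,
 * and hand the last error back to the caller if all attempts fail.
 */
static int dump_efuse_with_retry(struct fake_efuse *hw, int max_attempts,
				 void (*delay_fn)(void))
{
	int ret = -EBUSY;
	int retry;

	assert(max_attempts > 0);

	for (retry = 0; retry < max_attempts; retry++) {
		if (retry) {
			fprintf(stderr, "efuse dump failed, retrying (%d)\n",
				retry);
			if (delay_fn)
				delay_fn();	/* the patch sleeps 500 ms here */
		}
		ret = fake_dump_efuse(hw);
		if (!ret)
			break;
	}
	return ret;
}
```

With `max_attempts` set to 3, a device that is busy once succeeds on the second attempt, while a device that stays busy still ends in -EBUSY after the third attempt, so the existing "failed to dump efuse physical map" error path is unchanged for persistent failures.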
* RE: [PATCH] wifi: rtw89: retry efuse physical map dump on transient failure
  2026-03-01 4:24 [PATCH] wifi: rtw89: retry efuse physical map dump on transient failure Christian Hewitt
@ 2026-03-02 5:47 ` Ping-Ke Shih
  2026-03-02 5:55 ` Christian Hewitt
  2026-03-16 5:32 ` Ping-Ke Shih
  1 sibling, 1 reply; 18+ messages in thread
From: Ping-Ke Shih @ 2026-03-02 5:47 UTC
  To: Christian Hewitt, Bitterblue Smith, linux-wireless@vger.kernel.org, linux-kernel@vger.kernel.org

Christian Hewitt <christianshewitt@gmail.com> wrote:
> On Radxa Rock 5B with a RTL8852BE combo WiFi/BT card, the efuse
> physical map dump intermittently fails with -EBUSY during probe.
> The failure occurs in rtw89_dump_physical_efuse_map_ddv() where
> read_poll_timeout_atomic() times out waiting for the B_AX_EF_RDY
> bit after 1 second.

I'm checking internally how we handle this case.

[...]

>
> For context, firmware also fails (and recovers) sometimes:

Do you mean this doesn't always happen, only sometimes?

We have seen similar logs because of 36-bit DMA. Try the change below to force
32- or 36-bit DMA to see if it resolves the problem on your platform.

diff --git a/drivers/net/wireless/realtek/rtw89/pci.c b/drivers/net/wireless/realtek/rtw89/pci.c
index 43c61b3dc969..9d003ab93c85 100644
--- a/drivers/net/wireless/realtek/rtw89/pci.c
+++ b/drivers/net/wireless/realtek/rtw89/pci.c
@@ -3305,6 +3305,8 @@ static bool rtw89_pci_is_dac_compatible_bridge(struct rtw89_dev *rtwdev)
 	if (!bridge)
 		return false;
 
+	return true; // or force to return false;
+
 	switch (bridge->vendor) {
 	case PCI_VENDOR_ID_INTEL:
 		return true;

>
> ROCK5B:~ # dmesg | grep rtw89
> [    6.436873] rtw89_8852be 0002:21:00.0: loaded firmware rtw89/rtw8852b_fw-1.bin
> [    6.437165] rtw89_8852be 0002:21:00.0: enabling device (0000 -> 0003)
> [    6.450228] rtw89_8852be 0002:21:00.0: Firmware version 0.29.29.15 (6fb3ec41), cmd version 0, type 5
> [    6.450239] rtw89_8852be 0002:21:00.0: Firmware version 0.29.29.15 (6fb3ec41), cmd version 0, type 3
> [    7.864148] rtw89_8852be 0002:21:00.0: fw security fail
> [    7.864154] rtw89_8852be 0002:21:00.0: download firmware fail
> [    7.864160] rtw89_8852be 0002:21:00.0: [ERR]fwdl 0x1E0 = 0x62
> [    7.864165] rtw89_8852be 0002:21:00.0: [ERR]fwdl 0x83F0 = 0x80011
> [    7.864173] rtw89_8852be 0002:21:00.0: [ERR]fw PC = 0xb8931150
> [    7.864188] rtw89_8852be 0002:21:00.0: [ERR]fw PC = 0xb8931150
> [    7.864203] rtw89_8852be 0002:21:00.0: [ERR]fw PC = 0xb8931154
> [    7.864219] rtw89_8852be 0002:21:00.0: [ERR]fw PC = 0xb8931154
> [    7.864234] rtw89_8852be 0002:21:00.0: [ERR]fw PC = 0xb8931154
> [    7.864250] rtw89_8852be 0002:21:00.0: [ERR]fw PC = 0xb8931150
> [    7.864265] rtw89_8852be 0002:21:00.0: [ERR]fw PC = 0xb8931150
> [    7.864281] rtw89_8852be 0002:21:00.0: [ERR]fw PC = 0xb8931150
> [    7.864296] rtw89_8852be 0002:21:00.0: [ERR]fw PC = 0xb8931154
> [    7.864311] rtw89_8852be 0002:21:00.0: [ERR]fw PC = 0xb8931154
> [    7.864327] rtw89_8852be 0002:21:00.0: [ERR]fw PC = 0xb8931154
> [    7.864342] rtw89_8852be 0002:21:00.0: [ERR]fw PC = 0xb8931150
> [    7.864358] rtw89_8852be 0002:21:00.0: [ERR]fw PC = 0xb8931154
> [    7.864373] rtw89_8852be 0002:21:00.0: [ERR]fw PC = 0xb8931154
> [    7.864387] rtw89_8852be 0002:21:00.0: [ERR]fw PC = 0xb8931154
> [    8.181342] rtw89_8852be 0002:21:00.0: chip info CID: 0, CV: 1, AID: 0, ACV: 1, RFE: 1
> [    8.184322] rtw89_8852be 0002:21:00.0: rfkill hardware state changed to enable
>

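The test change above forces the result of rtw89_pci_is_dac_compatible_bridge(), which, judging from this reply, selects between 32-bit and 36-bit DMA addressing. For readers unfamiliar with what the two widths mean in practice, here is a small user-space sketch of the mask arithmetic. `dma_bit_mask()` mirrors the kernel's `DMA_BIT_MASK(n)` macro rewritten as a function; `dma_addr_fits()` is a hypothetical helper introduced only for illustration and is not part of rtw89 or the DMA API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Same arithmetic as the kernel's DMA_BIT_MASK(n), written as a function. */
static uint64_t dma_bit_mask(unsigned int bits)
{
	assert(bits >= 1 && bits <= 64);
	return bits == 64 ? ~0ULL : (1ULL << bits) - 1;
}

/*
 * Hypothetical helper: does the buffer [addr, addr + len) lie entirely
 * within what a device limited to `bits` address bits can reach?
 * Assumes addr + len does not wrap around 64 bits.
 */
static bool dma_addr_fits(uint64_t addr, uint64_t len, unsigned int bits)
{
	assert(len > 0);
	return addr + len - 1 <= dma_bit_mask(bits);
}
```

A 32-bit mask covers the first 4 GiB of bus addresses and a 36-bit mask covers the first 64 GiB, so a buffer placed just above 4 GiB is reachable only with the wider mask. Forcing one setting or the other is a way to test whether the platform's PCIe bridge handles the wider addresses correctly.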
* Re: [PATCH] wifi: rtw89: retry efuse physical map dump on transient failure
  2026-03-02 5:47 ` Ping-Ke Shih
@ 2026-03-02 5:55 ` Christian Hewitt
  2026-03-02 6:04 ` Ping-Ke Shih
  0 siblings, 1 reply; 18+ messages in thread
From: Christian Hewitt @ 2026-03-02 5:55 UTC
  To: Ping-Ke Shih
  Cc: Bitterblue Smith, linux-wireless@vger.kernel.org, linux-kernel@vger.kernel.org

> On 2 Mar 2026, at 9:47 am, Ping-Ke Shih <pkshih@realtek.com> wrote:
>
> Christian Hewitt <christianshewitt@gmail.com> wrote:
>> On Radxa Rock 5B with a RTL8852BE combo WiFi/BT card, the efuse
>> physical map dump intermittently fails with -EBUSY during probe.
>> The failure occurs in rtw89_dump_physical_efuse_map_ddv() where
>> read_poll_timeout_atomic() times out waiting for the B_AX_EF_RDY
>> bit after 1 second.
>
> I'm checking internally how we handle this case.
>
> [...]
>
>>
>> For context, firmware also fails (and recovers) sometimes:
>
> Do you mean this doesn't always happen, only sometimes?

It’s another intermittent behaviour observed on this board (and not
related to the issue this patch targets). It occurs less frequently
than the efuse issue and the existing retry mechanism in the driver
ensures firmware load always succeeds.

> We have seen similar logs because of 36-bit DMA. Try the change below to force
> 32- or 36-bit DMA to see if it resolves the problem on your platform.

I can experiment but this doesn’t happen often so I probably can’t
provide meaningful feedback.

Christian

> diff --git a/drivers/net/wireless/realtek/rtw89/pci.c b/drivers/net/wireless/realtek/rtw89/pci.c
> index 43c61b3dc969..9d003ab93c85 100644
> --- a/drivers/net/wireless/realtek/rtw89/pci.c
> +++ b/drivers/net/wireless/realtek/rtw89/pci.c
> @@ -3305,6 +3305,8 @@ static bool rtw89_pci_is_dac_compatible_bridge(struct rtw89_dev *rtwdev)
>  	if (!bridge)
>  		return false;
>
> +	return true; // or force to return false;
> +
>  	switch (bridge->vendor) {
>  	case PCI_VENDOR_ID_INTEL:
>  		return true;
>
>>
>> ROCK5B:~ # dmesg | grep rtw89
>> [    6.436873] rtw89_8852be 0002:21:00.0: loaded firmware rtw89/rtw8852b_fw-1.bin
>> [    6.437165] rtw89_8852be 0002:21:00.0: enabling device (0000 -> 0003)
>> [    6.450228] rtw89_8852be 0002:21:00.0: Firmware version 0.29.29.15 (6fb3ec41), cmd version 0, type 5
>> [    6.450239] rtw89_8852be 0002:21:00.0: Firmware version 0.29.29.15 (6fb3ec41), cmd version 0, type 3
>> [    7.864148] rtw89_8852be 0002:21:00.0: fw security fail
>> [    7.864154] rtw89_8852be 0002:21:00.0: download firmware fail
>> [    7.864160] rtw89_8852be 0002:21:00.0: [ERR]fwdl 0x1E0 = 0x62
>> [    7.864165] rtw89_8852be 0002:21:00.0: [ERR]fwdl 0x83F0 = 0x80011
>> [    7.864173] rtw89_8852be 0002:21:00.0: [ERR]fw PC = 0xb8931150
>> [    7.864188] rtw89_8852be 0002:21:00.0: [ERR]fw PC = 0xb8931150
>> [    7.864203] rtw89_8852be 0002:21:00.0: [ERR]fw PC = 0xb8931154
>> [    7.864219] rtw89_8852be 0002:21:00.0: [ERR]fw PC = 0xb8931154
>> [    7.864234] rtw89_8852be 0002:21:00.0: [ERR]fw PC = 0xb8931154
>> [    7.864250] rtw89_8852be 0002:21:00.0: [ERR]fw PC = 0xb8931150
>> [    7.864265] rtw89_8852be 0002:21:00.0: [ERR]fw PC = 0xb8931150
>> [    7.864281] rtw89_8852be 0002:21:00.0: [ERR]fw PC = 0xb8931150
>> [    7.864296] rtw89_8852be 0002:21:00.0: [ERR]fw PC = 0xb8931154
>> [    7.864311] rtw89_8852be 0002:21:00.0: [ERR]fw PC = 0xb8931154
>> [    7.864327] rtw89_8852be 0002:21:00.0: [ERR]fw PC = 0xb8931154
>> [    7.864342] rtw89_8852be 0002:21:00.0: [ERR]fw PC = 0xb8931150
>> [    7.864358] rtw89_8852be 0002:21:00.0: [ERR]fw PC = 0xb8931154
>> [    7.864373] rtw89_8852be 0002:21:00.0: [ERR]fw PC = 0xb8931154
>> [    7.864387] rtw89_8852be 0002:21:00.0: [ERR]fw PC = 0xb8931154
>> [    8.181342] rtw89_8852be 0002:21:00.0: chip info CID: 0, CV: 1, AID: 0, ACV: 1, RFE: 1
>> [    8.184322] rtw89_8852be 0002:21:00.0: rfkill hardware state changed to enable
>>
>

* RE: [PATCH] wifi: rtw89: retry efuse physical map dump on transient failure
  2026-03-02 5:55 ` Christian Hewitt
@ 2026-03-02 6:04 ` Ping-Ke Shih
  2026-03-02 6:17 ` Christian Hewitt
  0 siblings, 1 reply; 18+ messages in thread
From: Ping-Ke Shih @ 2026-03-02 6:04 UTC
  To: Christian Hewitt
  Cc: Bitterblue Smith, linux-wireless@vger.kernel.org, linux-kernel@vger.kernel.org

Christian Hewitt <christianshewitt@gmail.com> wrote:
> > On 2 Mar 2026, at 9:47 am, Ping-Ke Shih <pkshih@realtek.com> wrote:
> >
> > Christian Hewitt <christianshewitt@gmail.com> wrote:
> >> On Radxa Rock 5B with a RTL8852BE combo WiFi/BT card, the efuse
> >> physical map dump intermittently fails with -EBUSY during probe.
> >> The failure occurs in rtw89_dump_physical_efuse_map_ddv() where
> >> read_poll_timeout_atomic() times out waiting for the B_AX_EF_RDY
> >> bit after 1 second.
> >
> > I'm checking internally how we handle this case.
> >
> > [...]
> >
> >>
> >> For context, firmware also fails (and recovers) sometimes:
> >
> > Do you mean this doesn't always happen, only sometimes?
>
> It’s another intermittent behaviour observed on this board (and not
> related to the issue this patch targets). It occurs less frequently
> than the efuse issue and the existing retry mechanism in the driver
> ensures firmware load always succeeds.

Since this is intermittent, it might not be worth trying the DMA change.

Recently, we added some patches related to PCI hardware settings. Please
use the latest driver, including patch [1], to see if it becomes stable.

[1] af1e82232b98 ("wifi: rtw89: pci: restore LDO setting after device resume")

>
> > We have seen similar logs because of 36-bit DMA. Try the change below to force
> > 32- or 36-bit DMA to see if it resolves the problem on your platform.
>
> I can experiment but this doesn’t happen often so I probably can’t
> provide meaningful feedback.

* Re: [PATCH] wifi: rtw89: retry efuse physical map dump on transient failure
  2026-03-02 6:04 ` Ping-Ke Shih
@ 2026-03-02 6:17 ` Christian Hewitt
  2026-03-09 2:35 ` Ping-Ke Shih
  0 siblings, 1 reply; 18+ messages in thread
From: Christian Hewitt @ 2026-03-02 6:17 UTC
  To: Ping-Ke Shih
  Cc: Bitterblue Smith, linux-wireless@vger.kernel.org, linux-kernel@vger.kernel.org

> On 2 Mar 2026, at 10:04 am, Ping-Ke Shih <pkshih@realtek.com> wrote:
>
> Christian Hewitt <christianshewitt@gmail.com> wrote:
>>> On 2 Mar 2026, at 9:47 am, Ping-Ke Shih <pkshih@realtek.com> wrote:
>>>
>>> Christian Hewitt <christianshewitt@gmail.com> wrote:
>>>> On Radxa Rock 5B with a RTL8852BE combo WiFi/BT card, the efuse
>>>> physical map dump intermittently fails with -EBUSY during probe.
>>>> The failure occurs in rtw89_dump_physical_efuse_map_ddv() where
>>>> read_poll_timeout_atomic() times out waiting for the B_AX_EF_RDY
>>>> bit after 1 second.
>>>
>>> I'm checking internally how we handle this case.
>>>
>>> [...]
>>>
>>>>
>>>> For context, firmware also fails (and recovers) sometimes:
>>>
>>> Do you mean this doesn't always happen, only sometimes?
>>
>> It’s another intermittent behaviour observed on this board (and not
>> related to the issue this patch targets). It occurs less frequently
>> than the efuse issue and the existing retry mechanism in the driver
>> ensures firmware load always succeeds.
>
> Since this is intermittent, it might not be worth trying the DMA change.
>
> Recently, we added some patches related to PCI hardware settings. Please
> use the latest driver, including patch [1], to see if it becomes stable.
>
> [1] af1e82232b98 ("wifi: rtw89: pci: restore LDO setting after device resume")

The efuse fail snippet that I posted alongside the patch was from a
Linux 7.0-rc1 kernel so that patch was already present.

>>> We have seen similar logs because of 36-bit DMA. Try the change below to force
>>> 32- or 36-bit DMA to see if it resolves the problem on your platform.
>>
>> I can experiment but this doesn’t happen often so I probably can’t
>> provide meaningful feedback.

* RE: [PATCH] wifi: rtw89: retry efuse physical map dump on transient failure
  2026-03-02 6:17 ` Christian Hewitt
@ 2026-03-09 2:35 ` Ping-Ke Shih
  2026-03-10 17:16 ` Christian Hewitt
  0 siblings, 1 reply; 18+ messages in thread
From: Ping-Ke Shih @ 2026-03-09 2:35 UTC
  To: Christian Hewitt
  Cc: Bitterblue Smith, linux-wireless@vger.kernel.org, linux-kernel@vger.kernel.org

Christian Hewitt <christianshewitt@gmail.com> wrote:
>
> > On 2 Mar 2026, at 10:04 am, Ping-Ke Shih <pkshih@realtek.com> wrote:
> >
> > Christian Hewitt <christianshewitt@gmail.com> wrote:
> >>> On 2 Mar 2026, at 9:47 am, Ping-Ke Shih <pkshih@realtek.com> wrote:
> >>>
> >>> Christian Hewitt <christianshewitt@gmail.com> wrote:
> >>>> On Radxa Rock 5B with a RTL8852BE combo WiFi/BT card, the efuse
> >>>> physical map dump intermittently fails with -EBUSY during probe.
> >>>> The failure occurs in rtw89_dump_physical_efuse_map_ddv() where
> >>>> read_poll_timeout_atomic() times out waiting for the B_AX_EF_RDY
> >>>> bit after 1 second.
> >>>
> >>> I'm checking internally how we handle this case.

Sorry for the late reply.

We have encountered WiFi and BT reading efuse at the same time, causing a
problem similar to yours. The workaround is similar to yours: it extends
the timeout.

> >>>
> >>> [...]
> >>>
> >>>>
> >>>> For context, firmware also fails (and recovers) sometimes:
> >>>
> >>> Do you mean this doesn't always happen, only sometimes?
> >>
> >> It’s another intermittent behaviour observed on this board (and not
> >> related to the issue this patch targets). It occurs less frequently
> >> than the efuse issue and the existing retry mechanism in the driver
> >> ensures firmware load always succeeds.

This might have the same cause, due to efuse being read in firmware.

Though we can add more timeout and retries as a workaround, I wonder
if you can control the load timing of the WiFi and BT kernel modules?

Also, can you run an experiment where you load the BT module first and then
load the WiFi module after 10 seconds (an intentionally large number, or
even larger)?

Ping-Ke

* Re: [PATCH] wifi: rtw89: retry efuse physical map dump on transient failure
  2026-03-09 2:35 ` Ping-Ke Shih
@ 2026-03-10 17:16 ` Christian Hewitt
  2026-03-11 3:05 ` Ping-Ke Shih
  0 siblings, 1 reply; 18+ messages in thread
From: Christian Hewitt @ 2026-03-10 17:16 UTC
  To: Ping-Ke Shih
  Cc: Bitterblue Smith, linux-wireless@vger.kernel.org, linux-kernel@vger.kernel.org

> On 9 Mar 2026, at 6:35 am, Ping-Ke Shih <pkshih@realtek.com> wrote:
>
> Christian Hewitt <christianshewitt@gmail.com> wrote:
>>
>>> On 2 Mar 2026, at 10:04 am, Ping-Ke Shih <pkshih@realtek.com> wrote:
>>>
>>> Christian Hewitt <christianshewitt@gmail.com> wrote:
>>>>> On 2 Mar 2026, at 9:47 am, Ping-Ke Shih <pkshih@realtek.com> wrote:
>>>>>
>>>>> Christian Hewitt <christianshewitt@gmail.com> wrote:
>>>>>> On Radxa Rock 5B with a RTL8852BE combo WiFi/BT card, the efuse
>>>>>> physical map dump intermittently fails with -EBUSY during probe.
>>>>>> The failure occurs in rtw89_dump_physical_efuse_map_ddv() where
>>>>>> read_poll_timeout_atomic() times out waiting for the B_AX_EF_RDY
>>>>>> bit after 1 second.
>>>>>
>>>>> I'm checking internally how we handle this case.
>
> Sorry for the late reply.
>
> We have encountered WiFi and BT reading efuse at the same time, causing a
> problem similar to yours. The workaround is similar to yours: it extends
> the timeout.
>
>>>>>
>>>>> [...]
>>>>>
>>>>>>
>>>>>> For context, firmware also fails (and recovers) sometimes:
>>>>>
>>>>> Do you mean this doesn't always happen, only sometimes?
>>>>
>>>> It’s another intermittent behaviour observed on this board (and not
>>>> related to the issue this patch targets). It occurs less frequently
>>>> than the efuse issue and the existing retry mechanism in the driver
>>>> ensures firmware load always succeeds.
>
> This might have the same cause, due to efuse being read in firmware.
>
> Though we can add more timeout and retries as a workaround, I wonder
> if you can control the load timing of the WiFi and BT kernel modules?
>
> Also, can you run an experiment where you load the BT module first and then
> load the WiFi module after 10 seconds (an intentionally large number, or
> even larger)?

https://paste.libreelec.tv/charmed-turkey.sh

I’ve run the above script ^ which removes the wifi and bt modules in
sequence then reloads them in the reverse order with a delay between
bt and wifi modules loading, then checks for error messages. Over 200
test cycles with a 10s delay all were clean (no errors). I also ran
cycles with a 2 second delay and 0 second delay before starting wifi
module load and those were clear too. I guess that proves sequencing
avoids the efuse contention issue? - although it’s not possible in
the real-world so not sure there’s huge value in knowing that :)

Christian

* RE: [PATCH] wifi: rtw89: retry efuse physical map dump on transient failure
  2026-03-10 17:16 ` Christian Hewitt
@ 2026-03-11 3:05 ` Ping-Ke Shih
  2026-03-11 4:20 ` Christian Hewitt
  0 siblings, 1 reply; 18+ messages in thread
From: Ping-Ke Shih @ 2026-03-11 3:05 UTC
  To: Christian Hewitt
  Cc: Bitterblue Smith, linux-wireless@vger.kernel.org, linux-kernel@vger.kernel.org

Christian Hewitt <christianshewitt@gmail.com> wrote:
>
> > On 9 Mar 2026, at 6:35 am, Ping-Ke Shih <pkshih@realtek.com> wrote:
> >
> > Christian Hewitt <christianshewitt@gmail.com> wrote:
> >>
> >>> On 2 Mar 2026, at 10:04 am, Ping-Ke Shih <pkshih@realtek.com> wrote:
> >>>
> >>> Christian Hewitt <christianshewitt@gmail.com> wrote:
> >>>>> On 2 Mar 2026, at 9:47 am, Ping-Ke Shih <pkshih@realtek.com> wrote:
> >>>>>
> >>>>> Christian Hewitt <christianshewitt@gmail.com> wrote:
> >>>>>> On Radxa Rock 5B with a RTL8852BE combo WiFi/BT card, the efuse
> >>>>>> physical map dump intermittently fails with -EBUSY during probe.
> >>>>>> The failure occurs in rtw89_dump_physical_efuse_map_ddv() where
> >>>>>> read_poll_timeout_atomic() times out waiting for the B_AX_EF_RDY
> >>>>>> bit after 1 second.
> >>>>>
> >>>>> I'm checking internally how we handle this case.
> >
> > Sorry for the late reply.
> >
> > We have encountered WiFi and BT reading efuse at the same time, causing a
> > problem similar to yours. The workaround is similar to yours: it extends
> > the timeout.
> >
> >>>>>
> >>>>> [...]
> >>>>>
> >>>>>>
> >>>>>> For context, firmware also fails (and recovers) sometimes:
> >>>>>
> >>>>> Do you mean this doesn't always happen, only sometimes?
> >>>>
> >>>> It’s another intermittent behaviour observed on this board (and not
> >>>> related to the issue this patch targets). It occurs less frequently
> >>>> than the efuse issue and the existing retry mechanism in the driver
> >>>> ensures firmware load always succeeds.
> >
> > This might have the same cause, due to efuse being read in firmware.
> >
> > Though we can add more timeout and retries as a workaround, I wonder
> > if you can control the load timing of the WiFi and BT kernel modules?
> >
> > Also, can you run an experiment where you load the BT module first and then
> > load the WiFi module after 10 seconds (an intentionally large number, or
> > even larger)?
>
> https://paste.libreelec.tv/charmed-turkey.sh
>
> I’ve run the above script ^ which removes the wifi and bt modules in
> sequence then reloads them in the reverse order with a delay between
> bt and wifi modules loading, then checks for error messages. Over 200
> test cycles with a 10s delay all were clean (no errors). I also ran
> cycles with a 2 second delay and 0 second delay before starting wifi
> module load and those were clear too. I guess that proves sequencing
> avoids the efuse contention issue? - although it’s not possible in
> the real-world so not sure there’s huge value in knowing that :)

Thanks for the experiments.

I'd still like to know whether it is possible to change the sequence/timing
of kernel module loading at boot time from the system level. I mean, can you
adjust the sequence on the Rock 5B board?

In addition, did the messages below not appear in these experiments?

[    7.864148] rtw89_8852be 0002:21:00.0: fw security fail
[    7.864154] rtw89_8852be 0002:21:00.0: download firmware fail

Ping-Ke

* Re: [PATCH] wifi: rtw89: retry efuse physical map dump on transient failure
  2026-03-11 3:05 ` Ping-Ke Shih
@ 2026-03-11 4:20 ` Christian Hewitt
  2026-03-12 2:22 ` Ping-Ke Shih
  0 siblings, 1 reply; 18+ messages in thread
From: Christian Hewitt @ 2026-03-11 4:20 UTC
  To: Ping-Ke Shih
  Cc: Bitterblue Smith, linux-wireless@vger.kernel.org, linux-kernel@vger.kernel.org

> On 11 Mar 2026, at 7:05 am, Ping-Ke Shih <pkshih@realtek.com> wrote:
>
> Christian Hewitt <christianshewitt@gmail.com> wrote:
>>
>>> On 9 Mar 2026, at 6:35 am, Ping-Ke Shih <pkshih@realtek.com> wrote:
>>>
>>> Christian Hewitt <christianshewitt@gmail.com> wrote:
>>>>
>>>>> On 2 Mar 2026, at 10:04 am, Ping-Ke Shih <pkshih@realtek.com> wrote:
>>>>>
>>>>> Christian Hewitt <christianshewitt@gmail.com> wrote:
>>>>>>> On 2 Mar 2026, at 9:47 am, Ping-Ke Shih <pkshih@realtek.com> wrote:
>>>>>>>
>>>>>>> Christian Hewitt <christianshewitt@gmail.com> wrote:
>>>>>>>> On Radxa Rock 5B with a RTL8852BE combo WiFi/BT card, the efuse
>>>>>>>> physical map dump intermittently fails with -EBUSY during probe.
>>>>>>>> The failure occurs in rtw89_dump_physical_efuse_map_ddv() where
>>>>>>>> read_poll_timeout_atomic() times out waiting for the B_AX_EF_RDY
>>>>>>>> bit after 1 second.
>>>>>>>
>>>>>>> I'm checking internally how we handle this case.
>>>
>>> Sorry for the late reply.
>>>
>>> We have encountered WiFi and BT reading efuse at the same time, causing a
>>> problem similar to yours. The workaround is similar to yours: it extends
>>> the timeout.
>>>
>>>>>>>
>>>>>>> [...]
>>>>>>>
>>>>>>>>
>>>>>>>> For context, firmware also fails (and recovers) sometimes:
>>>>>>>
>>>>>>> Do you mean this doesn't always happen, only sometimes?
>>>>>>
>>>>>> It’s another intermittent behaviour observed on this board (and not
>>>>>> related to the issue this patch targets). It occurs less frequently
>>>>>> than the efuse issue and the existing retry mechanism in the driver
>>>>>> ensures firmware load always succeeds.
>>>
>>> This might have the same cause, due to efuse being read in firmware.
>>>
>>> Though we can add more timeout and retries as a workaround, I wonder
>>> if you can control the load timing of the WiFi and BT kernel modules?
>>>
>>> Also, can you run an experiment where you load the BT module first and then
>>> load the WiFi module after 10 seconds (an intentionally large number, or
>>> even larger)?
>>
>> https://paste.libreelec.tv/charmed-turkey.sh
>>
>> I’ve run the above script ^ which removes the wifi and bt modules in
>> sequence then reloads them in the reverse order with a delay between
>> bt and wifi modules loading, then checks for error messages. Over 200
>> test cycles with a 10s delay all were clean (no errors). I also ran
>> cycles with a 2 second delay and 0 second delay before starting wifi
>> module load and those were clear too. I guess that proves sequencing
>> avoids the efuse contention issue? - although it’s not possible in
>> the real-world so not sure there’s huge value in knowing that :)
>
> Thanks for the experiments.
>
> I'd still like to know whether it is possible to change the sequence/timing
> of kernel module loading at boot time from the system level. I mean, can you
> adjust the sequence on the Rock 5B board?

I’m not a kernel expert, but I’ve always understood module probe and
load ordering to not be guaranteed, as many things run in parallel and
are highly dependent on the specific hardware capabilities and kernel
config being used.

> In addition, did the messages below not appear in these experiments?
>
> [    7.864148] rtw89_8852be 0002:21:00.0: fw security fail
> [    7.864154] rtw89_8852be 0002:21:00.0: download firmware fail

No, because even if we have a 0s delay between each group of modules
being loaded, they are loaded in series, so we work around the issue.
Tweaking the script to background the module load loops so both run
in parallel would be closer to normal conditions, and I would expect
to start seeing failures and the retry mechanisms within the modules
(as added in this patch) being triggered.

Christian

* RE: [PATCH] wifi: rtw89: retry efuse physical map dump on transient failure 2026-03-11 4:20 ` Christian Hewitt @ 2026-03-12 2:22 ` Ping-Ke Shih 2026-03-12 5:58 ` Christian Hewitt 0 siblings, 1 reply; 18+ messages in thread From: Ping-Ke Shih @ 2026-03-12 2:22 UTC (permalink / raw) To: Christian Hewitt Cc: Bitterblue Smith, linux-wireless@vger.kernel.org, linux-kernel@vger.kernel.org Christian Hewitt <christianshewitt@gmail.com> wrote: > > On 11 Mar 2026, at 7:05 am, Ping-Ke Shih <pkshih@realtek.com> wrote: > > > > Christian Hewitt <christianshewitt@gmail.com> wrote: > >> > >>> On 9 Mar 2026, at 6:35 am, Ping-Ke Shih <pkshih@realtek.com> wrote: > >>> > >>> Christian Hewitt <christianshewitt@gmail.com> wrote: > >>>> > >>>>> On 2 Mar 2026, at 10:04 am, Ping-Ke Shih <pkshih@realtek.com> wrote: > >>>>> > >>>>> Christian Hewitt <christianshewitt@gmail.com> wrote: > >>>>>>> On 2 Mar 2026, at 9:47 am, Ping-Ke Shih <pkshih@realtek.com> wrote: > >>>>>>> > >>>>>>> Christian Hewitt <christianshewitt@gmail.com> wrote: > >>>>>>>> On Radxa Rock 5B with a RTL8852BE combo WiFi/BT card, the efuse > >>>>>>>> physical map dump intermittently fails with -EBUSY during probe. > >>>>>>>> The failure occurs in rtw89_dump_physical_efuse_map_ddv() where > >>>>>>>> read_poll_timeout_atomic() times out waiting for the B_AX_EF_RDY > >>>>>>>> bit after 1 second. > >>>>>>> > >>>>>>> I'm checking internally how we handle this case. > >>> > >>> Sorry for the late. > >>> > >>> We encountered WiFi/BT reading efuse at the same time causing similar > >>> problem as yours. The workaround is like yours, which adds timeout > >>> time. > >>> > >>>>>>> > >>>>>>> [...] > >>>>>>> > >>>>>>>> > >>>>>>>> For context, firmware also fails (and recovers) sometimes: > >>>>>>> > >>>>>>> Did you mean this doesn't always happen? sometimes? > >>>>>> > >>>>>> It’s another intermittent behaviour observed on this board (and not > >>>>>> related to the issue this patch targets). 
It occurs less frequently > >>>>>> than the efuse issue and the existing retry mechanism in the driver > >>>>>> ensures firmware load always succeeds. > >>> > >>> This might be the same cause due to reading efuse in firmware. > >>> > >>> Though we can add more timeout and retry times as workaround, I wonder > >>> if you can control loading time of WiFi and BT kernel modules? > >>> > >>> More, can you do experiment that you load BT module first, and then load > >>> WiFi module after 10 seconds (choose a large number intentionally, or > >>> even larger)? > >> > >> https://paste.libreelec.tv/charmed-turkey.sh > >> > >> I’ve run the above script ^ which removes the wifi and bt modules in > >> sequence then reloads them in the reverse order with a delay between > >> bt and wifi modules loading, then checks for error messages. Over 200 > >> test cycles with a 10s delay all were clean (no errors). I also ran > >> cycles with a 2 second delay and 0 second delay before starting wifi > >> module load and those were clear too. I guess that proves sequencing > >> avoids the efuse contention issue? - although it’s not possible in > >> the real-world so not sure there’s huge value in knowing that :) > > > > Thanks for the experiments. > > > > Still want to know is it possible to change sequence/time of loading > > kernel modules at boot time from system level? I mean can you adjust > > the sequence in the Rock 5B board? > > I’m not a kernel expert, but I’ve always understood module probe and > load ordering to not be guaranteed; as many things run in parallel and > are highly subjective to the specific hardware capabilities and kernel > config being used. I have heard people about changing sequence/time of kernel modules, so I'd like you can try this method. I did ask AI, it said it is possible to create a .conf file under /etc/modprobe.d/ and use `softdep` syntax to ensure loading sequence. Could you try this? 
> > > In addition, did below messages not appear in these experiments? > > > > [ 7.864148] rtw89_8852be 0002:21:00.0: fw security fail > > [ 7.864154] rtw89_8852be 0002:21:00.0: download firmware fail > > No, because even if we have a 0s delay between each group of modules > being loaded, they are loaded in series, so we workaround the issue. > Tweaking the script to background the module load loops so both run > in parallel would be closer to normal conditions, and I would expect > to start seeing failures and the retry mechanisms within the modules > (as added in this patch) being triggered. An additional question about the firmware download. When you first reported this issue (modules loaded in parallel at boot time), this message seemed to appear only occasionally. Since the driver retries the firmware download, does the download eventually succeed? Or does it still fail after 5 retries? Ping-Ke ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] wifi: rtw89: retry efuse physical map dump on transient failure 2026-03-12 2:22 ` Ping-Ke Shih @ 2026-03-12 5:58 ` Christian Hewitt 2026-03-12 7:39 ` Ping-Ke Shih 0 siblings, 1 reply; 18+ messages in thread From: Christian Hewitt @ 2026-03-12 5:58 UTC (permalink / raw) To: Ping-Ke Shih Cc: Bitterblue Smith, linux-wireless@vger.kernel.org, linux-kernel@vger.kernel.org > On 12 Mar 2026, at 6:22 am, Ping-Ke Shih <pkshih@realtek.com> wrote: > > Christian Hewitt <christianshewitt@gmail.com> wrote: >>> On 11 Mar 2026, at 7:05 am, Ping-Ke Shih <pkshih@realtek.com> wrote: >>> >>> Christian Hewitt <christianshewitt@gmail.com> wrote: >>>> >>>>> On 9 Mar 2026, at 6:35 am, Ping-Ke Shih <pkshih@realtek.com> wrote: >>>>> >>>>> Christian Hewitt <christianshewitt@gmail.com> wrote: >>>>>> >>>>>>> On 2 Mar 2026, at 10:04 am, Ping-Ke Shih <pkshih@realtek.com> wrote: >>>>>>> >>>>>>> Christian Hewitt <christianshewitt@gmail.com> wrote: >>>>>>>>> On 2 Mar 2026, at 9:47 am, Ping-Ke Shih <pkshih@realtek.com> wrote: >>>>>>>>> >>>>>>>>> Christian Hewitt <christianshewitt@gmail.com> wrote: >>>>>>>>>> On Radxa Rock 5B with a RTL8852BE combo WiFi/BT card, the efuse >>>>>>>>>> physical map dump intermittently fails with -EBUSY during probe. >>>>>>>>>> The failure occurs in rtw89_dump_physical_efuse_map_ddv() where >>>>>>>>>> read_poll_timeout_atomic() times out waiting for the B_AX_EF_RDY >>>>>>>>>> bit after 1 second. >>>>>>>>> >>>>>>>>> I'm checking internally how we handle this case. >>>>> >>>>> Sorry for the late. >>>>> >>>>> We encountered WiFi/BT reading efuse at the same time causing similar >>>>> problem as yours. The workaround is like yours, which adds timeout >>>>> time. >>>>> >>>>>>>>> >>>>>>>>> [...] >>>>>>>>> >>>>>>>>>> >>>>>>>>>> For context, firmware also fails (and recovers) sometimes: >>>>>>>>> >>>>>>>>> Did you mean this doesn't always happen? sometimes? 
>>>>>>>> >>>>>>>> It’s another intermittent behaviour observed on this board (and not >>>>>>>> related to the issue this patch targets). It occurs less frequently >>>>>>>> than the efuse issue and the existing retry mechanism in the driver >>>>>>>> ensures firmware load always succeeds. >>>>> >>>>> This might be the same cause due to reading efuse in firmware. >>>>> >>>>> Though we can add more timeout and retry times as workaround, I wonder >>>>> if you can control loading time of WiFi and BT kernel modules? >>>>> >>>>> More, can you do experiment that you load BT module first, and then load >>>>> WiFi module after 10 seconds (choose a large number intentionally, or >>>>> even larger)? >>>> >>>> https://paste.libreelec.tv/charmed-turkey.sh >>>> >>>> I’ve run the above script ^ which removes the wifi and bt modules in >>>> sequence then reloads them in the reverse order with a delay between >>>> bt and wifi modules loading, then checks for error messages. Over 200 >>>> test cycles with a 10s delay all were clean (no errors). I also ran >>>> cycles with a 2 second delay and 0 second delay before starting wifi >>>> module load and those were clear too. I guess that proves sequencing >>>> avoids the efuse contention issue? - although it’s not possible in >>>> the real-world so not sure there’s huge value in knowing that :) >>> >>> Thanks for the experiments. >>> >>> Still want to know is it possible to change sequence/time of loading >>> kernel modules at boot time from system level? I mean can you adjust >>> the sequence in the Rock 5B board? >> >> I’m not a kernel expert, but I’ve always understood module probe and >> load ordering to not be guaranteed; as many things run in parallel and >> are highly subjective to the specific hardware capabilities and kernel >> config being used. > > I have heard people about changing sequence/time of kernel modules, so > I'd like you can try this method. 
> > I did ask AI, it said it is possible to create a .conf file under > /etc/modprobe.d/ and use `softdep` syntax to ensure loading sequence. > Could you try this? I can test this, but even if it works it’s not a fix because modprobe confs configured in userspace are only used with loadable modules that have been compiled with =m, not built-in modules that are resident in kernel memory and compiled with =y; and distros are free to choose how their kernel is configured. NB: I’m not sure if there are any general kernel rules for this, but I’d expect there to be a general principle of modules being resilient to transient host states and not depending on userspace packaging to load correctly? >> In addition, did below messages not appear in these experiments? >>> >>> [ 7.864148] rtw89_8852be 0002:21:00.0: fw security fail >>> [ 7.864154] rtw89_8852be 0002:21:00.0: download firmware fail >> >> No, because even if we have a 0s delay between each group of modules >> being loaded, they are loaded in series, so we workaround the issue. >> Tweaking the script to background the module load loops so both run >> in parallel would be closer to normal conditions, and I would expect >> to start seeing failures and the retry mechanisms within the modules >> (as added in this patch) being triggered. > > Additional question for downloading firmware. As you reported this > issue initially (load modules at boot time in parallel), it seems > appear this message by chance. Since this driver will retry to download > firmware, will it successfully downloads firmware finally? Or it still > fails to download after 5 times retry? I have only seen firmware load fail a handful of times in many hundreds of boots and each time one retry attempt resulted in success. To be clear: I am not reporting firmware loading as a problem, it is not an issue for me. I’ve mentioned it only for context, i.e. it shows that a simple retry mechanism is effective at handling the similar issue with the efuse map.
Christian ^ permalink raw reply [flat|nested] 18+ messages in thread
* RE: [PATCH] wifi: rtw89: retry efuse physical map dump on transient failure 2026-03-12 5:58 ` Christian Hewitt @ 2026-03-12 7:39 ` Ping-Ke Shih 2026-03-12 8:11 ` Christian Hewitt 0 siblings, 1 reply; 18+ messages in thread From: Ping-Ke Shih @ 2026-03-12 7:39 UTC (permalink / raw) To: Christian Hewitt Cc: Bitterblue Smith, linux-wireless@vger.kernel.org, linux-kernel@vger.kernel.org Christian Hewitt <christianshewitt@gmail.com> wrote: > > On 12 Mar 2026, at 6:22 am, Ping-Ke Shih <pkshih@realtek.com> wrote: > > > > Christian Hewitt <christianshewitt@gmail.com> wrote: > >>> On 11 Mar 2026, at 7:05 am, Ping-Ke Shih <pkshih@realtek.com> wrote: > >>> > >>> Christian Hewitt <christianshewitt@gmail.com> wrote: > >>>> > >>>>> On 9 Mar 2026, at 6:35 am, Ping-Ke Shih <pkshih@realtek.com> wrote: > >>>>> > >>>>> Christian Hewitt <christianshewitt@gmail.com> wrote: > >>>>>> > >>>>>>> On 2 Mar 2026, at 10:04 am, Ping-Ke Shih <pkshih@realtek.com> wrote: > >>>>>>> > >>>>>>> Christian Hewitt <christianshewitt@gmail.com> wrote: > >>>>>>>>> On 2 Mar 2026, at 9:47 am, Ping-Ke Shih <pkshih@realtek.com> wrote: > >>>>>>>>> > >>>>>>>>> Christian Hewitt <christianshewitt@gmail.com> wrote: > >>>>>>>>>> On Radxa Rock 5B with a RTL8852BE combo WiFi/BT card, the efuse > >>>>>>>>>> physical map dump intermittently fails with -EBUSY during probe. > >>>>>>>>>> The failure occurs in rtw89_dump_physical_efuse_map_ddv() where > >>>>>>>>>> read_poll_timeout_atomic() times out waiting for the B_AX_EF_RDY > >>>>>>>>>> bit after 1 second. > >>>>>>>>> > >>>>>>>>> I'm checking internally how we handle this case. > >>>>> > >>>>> Sorry for the late. > >>>>> > >>>>> We encountered WiFi/BT reading efuse at the same time causing similar > >>>>> problem as yours. The workaround is like yours, which adds timeout > >>>>> time. > >>>>> > >>>>>>>>> > >>>>>>>>> [...] 
> >>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> For context, firmware also fails (and recovers) sometimes: > >>>>>>>>> > >>>>>>>>> Did you mean this doesn't always happen? sometimes? > >>>>>>>> > >>>>>>>> It’s another intermittent behaviour observed on this board (and not > >>>>>>>> related to the issue this patch targets). It occurs less frequently > >>>>>>>> than the efuse issue and the existing retry mechanism in the driver > >>>>>>>> ensures firmware load always succeeds. > >>>>> > >>>>> This might be the same cause due to reading efuse in firmware. > >>>>> > >>>>> Though we can add more timeout and retry times as workaround, I wonder > >>>>> if you can control loading time of WiFi and BT kernel modules? > >>>>> > >>>>> More, can you do experiment that you load BT module first, and then load > >>>>> WiFi module after 10 seconds (choose a large number intentionally, or > >>>>> even larger)? > >>>> > >>>> https://paste.libreelec.tv/charmed-turkey.sh > >>>> > >>>> I’ve run the above script ^ which removes the wifi and bt modules in > >>>> sequence then reloads them in the reverse order with a delay between > >>>> bt and wifi modules loading, then checks for error messages. Over 200 > >>>> test cycles with a 10s delay all were clean (no errors). I also ran > >>>> cycles with a 2 second delay and 0 second delay before starting wifi > >>>> module load and those were clear too. I guess that proves sequencing > >>>> avoids the efuse contention issue? - although it’s not possible in > >>>> the real-world so not sure there’s huge value in knowing that :) > >>> > >>> Thanks for the experiments. > >>> > >>> Still want to know is it possible to change sequence/time of loading > >>> kernel modules at boot time from system level? I mean can you adjust > >>> the sequence in the Rock 5B board? 
> >> > >> I’m not a kernel expert, but I’ve always understood module probe and > >> load ordering to not be guaranteed; as many things run in parallel and > >> are highly subjective to the specific hardware capabilities and kernel > >> config being used. > > > > I have heard people about changing sequence/time of kernel modules, so > > I'd like you can try this method. > > > > I did ask AI, it said it is possible to create a .conf file under > > /etc/modprobe.d/ and use `softdep` syntax to ensure loading sequence. > > Could you try this? > > I can test this, but even if it works it’s not a fix because modprobe > confs configured in userspace are only used with loadable modules that > have been compiled with =m, not build-in modules that are resident in > kernel memory and compiled with =y; and distros are free to choose how > their kernel is configured. NB: I’m not sure if there are any general > kernel rules for this, but I’d expect there to be general principle of > modules being resilient to transient host states and not depending on > userspace packaging to load correctly? I think built-in modules are initialized sequentially (not in parallel) via device_initcall(), so the BT and WiFi drivers will not read the efuse at the same time. > > >> In addition, did below messages not appear in these experiments? > >>> > >>> [ 7.864148] rtw89_8852be 0002:21:00.0: fw security fail > >>> [ 7.864154] rtw89_8852be 0002:21:00.0: download firmware fail > >> > >> No, because even if we have a 0s delay between each group of modules > >> being loaded, they are loaded in series, so we workaround the issue. > >> Tweaking the script to background the module load loops so both run > >> in parallel would be closer to normal conditions, and I would expect > >> to start seeing failures and the retry mechanisms within the modules > >> (as added in this patch) being triggered. > > > > Additional question for downloading firmware.
As you reported this > > issue initially (load modules at boot time in parallel), it seems > > appear this message by chance. Since this driver will retry to download > > firmware, will it successfully downloads firmware finally? Or it still > > fails to download after 5 times retry? > > I have only seen firmware load fail a handful of times in many hundreds > of boots and each time one retry attempt resulted in success. To be > clear; I have am not reporting firwmare loading as a problem, it is not > an issue for me. I’ve mentioned it only for context, i.e. it shows that > a simple retry mechanism is effective at handling the similar issue with > efuse map. I ask because I wonder whether the firmware download failure might also be caused by an efuse read. If so, a retry might resolve it as well. From your results, it looks like retrying the efuse read can resolve all the issues you found. What do you think? Ping-Ke ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] wifi: rtw89: retry efuse physical map dump on transient failure 2026-03-12 7:39 ` Ping-Ke Shih @ 2026-03-12 8:11 ` Christian Hewitt 2026-03-12 8:28 ` Ping-Ke Shih 0 siblings, 1 reply; 18+ messages in thread From: Christian Hewitt @ 2026-03-12 8:11 UTC (permalink / raw) To: Ping-Ke Shih Cc: Bitterblue Smith, linux-wireless@vger.kernel.org, linux-kernel@vger.kernel.org > On 12 Mar 2026, at 11:39 am, Ping-Ke Shih <pkshih@realtek.com> wrote: > > Christian Hewitt <christianshewitt@gmail.com> wrote: >>> On 12 Mar 2026, at 6:22 am, Ping-Ke Shih <pkshih@realtek.com> wrote: >>> >>> Christian Hewitt <christianshewitt@gmail.com> wrote: >>>>> On 11 Mar 2026, at 7:05 am, Ping-Ke Shih <pkshih@realtek.com> wrote: >>>>> >>>>> Christian Hewitt <christianshewitt@gmail.com> wrote: >>>>>> >>>>>>> On 9 Mar 2026, at 6:35 am, Ping-Ke Shih <pkshih@realtek.com> wrote: >>>>>>> >>>>>>> Christian Hewitt <christianshewitt@gmail.com> wrote: >>>>>>>> >>>>>>>>> On 2 Mar 2026, at 10:04 am, Ping-Ke Shih <pkshih@realtek.com> wrote: >>>>>>>>> >>>>>>>>> Christian Hewitt <christianshewitt@gmail.com> wrote: >>>>>>>>>>> On 2 Mar 2026, at 9:47 am, Ping-Ke Shih <pkshih@realtek.com> wrote: >>>>>>>>>>> >>>>>>>>>>> Christian Hewitt <christianshewitt@gmail.com> wrote: >>>>>>>>>>>> On Radxa Rock 5B with a RTL8852BE combo WiFi/BT card, the efuse >>>>>>>>>>>> physical map dump intermittently fails with -EBUSY during probe. >>>>>>>>>>>> The failure occurs in rtw89_dump_physical_efuse_map_ddv() where >>>>>>>>>>>> read_poll_timeout_atomic() times out waiting for the B_AX_EF_RDY >>>>>>>>>>>> bit after 1 second. >>>>>>>>>>> >>>>>>>>>>> I'm checking internally how we handle this case. >>>>>>> >>>>>>> Sorry for the late. >>>>>>> >>>>>>> We encountered WiFi/BT reading efuse at the same time causing similar >>>>>>> problem as yours. The workaround is like yours, which adds timeout >>>>>>> time. >>>>>>> >>>>>>>>>>> >>>>>>>>>>> [...] 
>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> For context, firmware also fails (and recovers) sometimes: >>>>>>>>>>> >>>>>>>>>>> Did you mean this doesn't always happen? sometimes? >>>>>>>>>> >>>>>>>>>> It’s another intermittent behaviour observed on this board (and not >>>>>>>>>> related to the issue this patch targets). It occurs less frequently >>>>>>>>>> than the efuse issue and the existing retry mechanism in the driver >>>>>>>>>> ensures firmware load always succeeds. >>>>>>> >>>>>>> This might be the same cause due to reading efuse in firmware. >>>>>>> >>>>>>> Though we can add more timeout and retry times as workaround, I wonder >>>>>>> if you can control loading time of WiFi and BT kernel modules? >>>>>>> >>>>>>> More, can you do experiment that you load BT module first, and then load >>>>>>> WiFi module after 10 seconds (choose a large number intentionally, or >>>>>>> even larger)? >>>>>> >>>>>> https://paste.libreelec.tv/charmed-turkey.sh >>>>>> >>>>>> I’ve run the above script ^ which removes the wifi and bt modules in >>>>>> sequence then reloads them in the reverse order with a delay between >>>>>> bt and wifi modules loading, then checks for error messages. Over 200 >>>>>> test cycles with a 10s delay all were clean (no errors). I also ran >>>>>> cycles with a 2 second delay and 0 second delay before starting wifi >>>>>> module load and those were clear too. I guess that proves sequencing >>>>>> avoids the efuse contention issue? - although it’s not possible in >>>>>> the real-world so not sure there’s huge value in knowing that :) >>>>> >>>>> Thanks for the experiments. >>>>> >>>>> Still want to know is it possible to change sequence/time of loading >>>>> kernel modules at boot time from system level? I mean can you adjust >>>>> the sequence in the Rock 5B board? 
>>>> >>>> I’m not a kernel expert, but I’ve always understood module probe and >>>> load ordering to not be guaranteed; as many things run in parallel and >>>> are highly subjective to the specific hardware capabilities and kernel >>>> config being used. >>> >>> I have heard people about changing sequence/time of kernel modules, so >>> I'd like you can try this method. >>> >>> I did ask AI, it said it is possible to create a .conf file under >>> /etc/modprobe.d/ and use `softdep` syntax to ensure loading sequence. >>> Could you try this? >> >> I can test this, but even if it works it’s not a fix because modprobe >> confs configured in userspace are only used with loadable modules that >> have been compiled with =m, not build-in modules that are resident in >> kernel memory and compiled with =y; and distros are free to choose how >> their kernel is configured. NB: I’m not sure if there are any general >> kernel rules for this, but I’d expect there to be general principle of >> modules being resilient to transient host states and not depending on >> userspace packaging to load correctly? > > I think built-in modules will be loaded sequentially (not in parallel) > by device_initicall(), so BT and WiFi drivers will not read efuse > at the same time. Even if built-in modules are loaded sequentially, the kernel still has many dynamically loaded modules; and distros can configure that mix as they like, so you still cannot predict or guarantee the outcome. That could be changed by requiring rtw89 modules to be =y, but that goes against the principles of a modular kernel and I’d expect appropriately rude comments to the idea if submitted :) >>>> In addition, did below messages not appear in these experiments? 
>>>>> >>>>> [ 7.864148] rtw89_8852be 0002:21:00.0: fw security fail >>>>> [ 7.864154] rtw89_8852be 0002:21:00.0: download firmware fail >>>> >>>> No, because even if we have a 0s delay between each group of modules >>>> being loaded, they are loaded in series, so we workaround the issue. >>>> Tweaking the script to background the module load loops so both run >>>> in parallel would be closer to normal conditions, and I would expect >>>> to start seeing failures and the retry mechanisms within the modules >>>> (as added in this patch) being triggered. >>> >>> Additional question for downloading firmware. As you reported this >>> issue initially (load modules at boot time in parallel), it seems >>> appear this message by chance. Since this driver will retry to download >>> firmware, will it successfully downloads firmware finally? Or it still >>> fails to download after 5 times retry? >> >> I have only seen firmware load fail a handful of times in many hundreds >> of boots and each time one retry attempt resulted in success. To be >> clear; I have am not reporting firwmare loading as a problem, it is not >> an issue for me. I’ve mentioned it only for context, i.e. it shows that >> a simple retry mechanism is effective at handling the similar issue with >> efuse map. > > I have this question because I wonder downloading firmware issue might be > also a reading efuse issue. If so, retry might resolve as well. Hard to know, but it's an infrequent event and the existing retry mechanism appears to work fine. > As your results, it looks like to retry reading efuse can resolve all > issues you found. What do you think? The patch submitted resolves the efuse map dump for me. If there are more efuse accesses that need to be addressed I haven’t seen them in tests. 
If you are hinting to abstract things further I’d ask you to please propose an alternative patch that I can test for you; I’m firmly at the novice end of kernel contributors and unlikely to spot where changes might be needed without being spoon-fed rather explicit instructions :) Christian ^ permalink raw reply [flat|nested] 18+ messages in thread
* RE: [PATCH] wifi: rtw89: retry efuse physical map dump on transient failure 2026-03-12 8:11 ` Christian Hewitt @ 2026-03-12 8:28 ` Ping-Ke Shih 0 siblings, 0 replies; 18+ messages in thread From: Ping-Ke Shih @ 2026-03-12 8:28 UTC (permalink / raw) To: Christian Hewitt Cc: Bitterblue Smith, linux-wireless@vger.kernel.org, linux-kernel@vger.kernel.org Christian Hewitt <christianshewitt@gmail.com> wrote: > > > On 12 Mar 2026, at 11:39 am, Ping-Ke Shih <pkshih@realtek.com> wrote: > > > > Christian Hewitt <christianshewitt@gmail.com> wrote: > >>> On 12 Mar 2026, at 6:22 am, Ping-Ke Shih <pkshih@realtek.com> wrote: > >>> > >>> Christian Hewitt <christianshewitt@gmail.com> wrote: > >>>>> On 11 Mar 2026, at 7:05 am, Ping-Ke Shih <pkshih@realtek.com> wrote: > >>>>> > >>>>> Christian Hewitt <christianshewitt@gmail.com> wrote: > >>>>>> > >>>>>>> On 9 Mar 2026, at 6:35 am, Ping-Ke Shih <pkshih@realtek.com> wrote: > >>>>>>> > >>>>>>> Christian Hewitt <christianshewitt@gmail.com> wrote: > >>>>>>>> > >>>>>>>>> On 2 Mar 2026, at 10:04 am, Ping-Ke Shih <pkshih@realtek.com> wrote: > >>>>>>>>> > >>>>>>>>> Christian Hewitt <christianshewitt@gmail.com> wrote: > >>>>>>>>>>> On 2 Mar 2026, at 9:47 am, Ping-Ke Shih <pkshih@realtek.com> wrote: > >>>>>>>>>>> > >>>>>>>>>>> Christian Hewitt <christianshewitt@gmail.com> wrote: > >>>>>>>>>>>> On Radxa Rock 5B with a RTL8852BE combo WiFi/BT card, the efuse > >>>>>>>>>>>> physical map dump intermittently fails with -EBUSY during probe. > >>>>>>>>>>>> The failure occurs in rtw89_dump_physical_efuse_map_ddv() where > >>>>>>>>>>>> read_poll_timeout_atomic() times out waiting for the B_AX_EF_RDY > >>>>>>>>>>>> bit after 1 second. > >>>>>>>>>>> > >>>>>>>>>>> I'm checking internally how we handle this case. > >>>>>>> > >>>>>>> Sorry for the late. > >>>>>>> > >>>>>>> We encountered WiFi/BT reading efuse at the same time causing similar > >>>>>>> problem as yours. The workaround is like yours, which adds timeout > >>>>>>> time. 
> >>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> [...] > >>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>>> For context, firmware also fails (and recovers) sometimes: > >>>>>>>>>>> > >>>>>>>>>>> Did you mean this doesn't always happen? sometimes? > >>>>>>>>>> > >>>>>>>>>> It’s another intermittent behaviour observed on this board (and not > >>>>>>>>>> related to the issue this patch targets). It occurs less frequently > >>>>>>>>>> than the efuse issue and the existing retry mechanism in the driver > >>>>>>>>>> ensures firmware load always succeeds. > >>>>>>> > >>>>>>> This might be the same cause due to reading efuse in firmware. > >>>>>>> > >>>>>>> Though we can add more timeout and retry times as workaround, I wonder > >>>>>>> if you can control loading time of WiFi and BT kernel modules? > >>>>>>> > >>>>>>> More, can you do experiment that you load BT module first, and then load > >>>>>>> WiFi module after 10 seconds (choose a large number intentionally, or > >>>>>>> even larger)? > >>>>>> > >>>>>> https://paste.libreelec.tv/charmed-turkey.sh > >>>>>> > >>>>>> I’ve run the above script ^ which removes the wifi and bt modules in > >>>>>> sequence then reloads them in the reverse order with a delay between > >>>>>> bt and wifi modules loading, then checks for error messages. Over 200 > >>>>>> test cycles with a 10s delay all were clean (no errors). I also ran > >>>>>> cycles with a 2 second delay and 0 second delay before starting wifi > >>>>>> module load and those were clear too. I guess that proves sequencing > >>>>>> avoids the efuse contention issue? - although it’s not possible in > >>>>>> the real-world so not sure there’s huge value in knowing that :) > >>>>> > >>>>> Thanks for the experiments. > >>>>> > >>>>> Still want to know is it possible to change sequence/time of loading > >>>>> kernel modules at boot time from system level? I mean can you adjust > >>>>> the sequence in the Rock 5B board? 
> >>>> > >>>> I’m not a kernel expert, but I’ve always understood module probe and > >>>> load ordering to not be guaranteed; as many things run in parallel and > >>>> are highly subjective to the specific hardware capabilities and kernel > >>>> config being used. > >>> > >>> I have heard people about changing sequence/time of kernel modules, so > >>> I'd like you can try this method. > >>> > >>> I did ask AI, it said it is possible to create a .conf file under > >>> /etc/modprobe.d/ and use `softdep` syntax to ensure loading sequence. > >>> Could you try this? > >> > >> I can test this, but even if it works it’s not a fix because modprobe > >> confs configured in userspace are only used with loadable modules that > >> have been compiled with =m, not build-in modules that are resident in > >> kernel memory and compiled with =y; and distros are free to choose how > >> their kernel is configured. NB: I’m not sure if there are any general > >> kernel rules for this, but I’d expect there to be general principle of > >> modules being resilient to transient host states and not depending on > >> userspace packaging to load correctly? > > > > I think built-in modules will be loaded sequentially (not in parallel) > > by device_initicall(), so BT and WiFi drivers will not read efuse > > at the same time. > > Even if built-in modules are loaded sequentially, the kernel still has > many dynamically loaded modules; and distros can configure that mix as > they like, so you still cannot predict or guarantee the outcome. That > could be changed by requiring rtw89 modules to be =y, but that goes > against the principles of a modular kernel and I’d expect appropriately > rude comments to the idea if submitted :) As far as I know, dynamically loaded modules are loaded after the init process starts, but that's not your case. Let's check whether a `softdep` entry under /etc/modprobe.d/ can resolve your problem. I'd like to know the result. :) > > >>>> In addition, did below messages not appear in these experiments?
> >>>>> > >>>>> [ 7.864148] rtw89_8852be 0002:21:00.0: fw security fail > >>>>> [ 7.864154] rtw89_8852be 0002:21:00.0: download firmware fail > >>>> > >>>> No, because even if we have a 0s delay between each group of modules > >>>> being loaded, they are loaded in series, so we workaround the issue. > >>>> Tweaking the script to background the module load loops so both run > >>>> in parallel would be closer to normal conditions, and I would expect > >>>> to start seeing failures and the retry mechanisms within the modules > >>>> (as added in this patch) being triggered. > >>> > >>> Additional question for downloading firmware. As you reported this > >>> issue initially (load modules at boot time in parallel), it seems > >>> appear this message by chance. Since this driver will retry to download > >>> firmware, will it successfully downloads firmware finally? Or it still > >>> fails to download after 5 times retry? > >> > >> I have only seen firmware load fail a handful of times in many hundreds > >> of boots and each time one retry attempt resulted in success. To be > >> clear; I have am not reporting firwmare loading as a problem, it is not > >> an issue for me. I’ve mentioned it only for context, i.e. it shows that > >> a simple retry mechanism is effective at handling the similar issue with > >> efuse map. > > > > I have this question because I wonder downloading firmware issue might be > > also a reading efuse issue. If so, retry might resolve as well. > > Hard to know, but it's an infrequent event and the existing retry mechanism > appears to work fine. > > > As your results, it looks like to retry reading efuse can resolve all > > issues you found. What do you think? > > The patch submitted resolves the efuse map dump for me. If there are more > efuse accesses that need to be addressed I haven’t seen them in tests. 
If > you are hinting to abstract things further I’d ask you to please propose > an alternative patch that I can test for you; I’m firmly at the novice end > of kernel contributors and unlikely to spot where changes might be needed > without being spoon-fed rather explicit instructions :) I will start reviewing this patch in detail and consider whether there is an alternative method. Ping-Ke ^ permalink raw reply [flat|nested] 18+ messages in thread
* RE: [PATCH] wifi: rtw89: retry efuse physical map dump on transient failure 2026-03-01 4:24 [PATCH] wifi: rtw89: retry efuse physical map dump on transient failure Christian Hewitt 2026-03-02 5:47 ` Ping-Ke Shih @ 2026-03-16 5:32 ` Ping-Ke Shih 2026-03-16 11:03 ` Christian Hewitt 1 sibling, 1 reply; 18+ messages in thread From: Ping-Ke Shih @ 2026-03-16 5:32 UTC (permalink / raw) To: Christian Hewitt, Bitterblue Smith, linux-wireless@vger.kernel.org, linux-kernel@vger.kernel.org Christian Hewitt <christianshewitt@gmail.com> wrote: > On Radxa Rock 5B with a RTL8852BE combo WiFi/BT card, the efuse > physical map dump intermittently fails with -EBUSY during probe. > The failure occurs in rtw89_dump_physical_efuse_map_ddv() where > read_poll_timeout_atomic() times out waiting for the B_AX_EF_RDY > bit after 1 second. > > The root cause is a timing race during boot: the WiFi driver's > chip initialization (firmware download via PCIe) overlaps with the > Bluetooth firmware download to the same combo chip over USB. This > can leave the efuse controller temporarily unavailable when the > WiFi driver attempts to read the efuse map. > > Add a retry loop (up to 3 attempts with 500ms delays) around the > physical efuse map dump in rtw89_parse_efuse_map_ax(). The firmware > download path already retries up to 5 times, but the efuse read > that follows has no retry logic, making it the weak link in the > probe sequence. I'd prefer adding a wrapper that retries 5 times without delay; see the changes at the bottom for reference. If you want to limit the retry to the 'dav == false' case only, that is also fine with me. > > Signed-off-by: Christian Hewitt <christianshewitt@gmail.com> [...]
>
>  drivers/net/wireless/realtek/rtw89/efuse.c | 13 ++++++++++++-
>  1 file changed, 12 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/wireless/realtek/rtw89/efuse.c b/drivers/net/wireless/realtek/rtw89/efuse.c
> index a2757a88d55d..d506f04ffd6c 100644
> --- a/drivers/net/wireless/realtek/rtw89/efuse.c
> +++ b/drivers/net/wireless/realtek/rtw89/efuse.c
> @@ -270,6 +270,7 @@ int rtw89_parse_efuse_map_ax(struct rtw89_dev *rtwdev)
>  	u8 *log_map = NULL;
>  	u8 *dav_phy_map = NULL;
>  	u8 *dav_log_map = NULL;
> +	int retry;
>  	int ret;
>
>  	if (rtw89_read16(rtwdev, R_AX_SYS_WL_EFUSE_CTRL) & B_AX_AUTOLOAD_SUS)
> @@ -289,7 +290,17 @@ int rtw89_parse_efuse_map_ax(struct rtw89_dev *rtwdev)
>  		goto out_free;
>  	}
>
> -	ret = rtw89_dump_physical_efuse_map(rtwdev, phy_map, 0, phy_size, false);
> +	for (retry = 0; retry < 3; retry++) {
> +		if (retry) {
> +			rtw89_warn(rtwdev, "efuse dump failed, retrying (%d)\n",
> +				   retry);
> +			fsleep(500000);
> +		}
> +		ret = rtw89_dump_physical_efuse_map(rtwdev, phy_map, 0,
> +						    phy_size, false);
> +		if (!ret)
> +			break;
> +	}
>  	if (ret) {
>  		rtw89_warn(rtwdev, "failed to dump efuse physical map\n");
>  		goto out_free;
> --
> 2.43.0

How about retrying 5 times without fsleep(500000)?

diff --git a/drivers/net/wireless/realtek/rtw89/efuse.c b/drivers/net/wireless/realtek/rtw89/efuse.c index a2757a88d55d..89d4b1b865f8 100644 --- a/drivers/net/wireless/realtek/rtw89/efuse.c +++ b/drivers/net/wireless/realtek/rtw89/efuse.c @@ -185,8 +185,8 @@ static int rtw89_dump_physical_efuse_map_dav(struct rtw89_dev *rtwdev, u8 *map, return 0; } -static int rtw89_dump_physical_efuse_map(struct rtw89_dev *rtwdev, u8 *map, - u32 dump_addr, u32 dump_size, bool dav) +static int __rtw89_dump_physical_efuse_map(struct rtw89_dev *rtwdev, u8 *map, + u32 dump_addr, u32 dump_size, bool dav) { int ret; @@ -208,6 +208,25 @@ static int rtw89_dump_physical_efuse_map(struct rtw89_dev *rtwdev, u8 *map, return 0; } +static int rtw89_dump_physical_efuse_map(struct rtw89_dev *rtwdev, u8 *map, + u32 dump_addr, u32 dump_size, bool dav) +{ + int retry; + int ret; + + for (retry = 0; retry < 5; retry++) { + ret = __rtw89_dump_physical_efuse_map(rtwdev, map, dump_addr, + dump_size, dav); + if (!ret) + return 0; + + rtw89_warn(rtwdev, "efuse dump (dav=%d) failed, retrying (%d)\n", + dav, retry); + } + + return ret; +} + #define invalid_efuse_header(hdr1, hdr2) \ ((hdr1) == 0xff || (hdr2) == 0xff) #define invalid_efuse_content(word_en, i) \ ^ permalink raw reply related [flat|nested] 18+ messages in thread
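For readers following the thread, the retry-wrapper idea in the diff above can be sketched in user space. This is only an illustration under stated assumptions, not the driver code: `struct fake_efuse`, `dump_once()` and `dump_with_retry()` are hypothetical names, and the fake controller simply reports -EBUSY a fixed number of times before succeeding. Like the proposed wrapper, it retries immediately with no sleep between attempts.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Hypothetical stand-in for the efuse controller: it reports -EBUSY for
 * the first `busy_reads` attempts and succeeds afterwards. */
struct fake_efuse {
	int busy_reads;	/* remaining attempts that will fail */
	int attempts;	/* total attempts made so far */
};

/* Stand-in for the single-shot dump, i.e. the role played by
 * __rtw89_dump_physical_efuse_map() in the proposal. */
static int dump_once(struct fake_efuse *ef)
{
	ef->attempts++;
	if (ef->busy_reads > 0) {
		ef->busy_reads--;
		return -EBUSY;
	}
	return 0;
}

/* Retry wrapper: call the single-shot dump up to max_attempts times,
 * with no delay in between, and return the last error if every
 * attempt fails. */
static int dump_with_retry(struct fake_efuse *ef, int max_attempts)
{
	int ret = -EINVAL;
	int retry;

	assert(max_attempts > 0);

	for (retry = 0; retry < max_attempts; retry++) {
		ret = dump_once(ef);
		if (!ret)
			return 0;

		fprintf(stderr, "efuse dump failed, retrying (%d)\n", retry);
	}

	return ret;
}
```

Keeping the retry in a wrapper with the original function name means existing callers pick up the new behaviour without being changed, which appears to be the intent of the rename in the diff.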
* Re: [PATCH] wifi: rtw89: retry efuse physical map dump on transient failure 2026-03-16 5:32 ` Ping-Ke Shih @ 2026-03-16 11:03 ` Christian Hewitt 2026-03-17 1:37 ` Ping-Ke Shih 0 siblings, 1 reply; 18+ messages in thread From: Christian Hewitt @ 2026-03-16 11:03 UTC (permalink / raw) To: Ping-Ke Shih Cc: Bitterblue Smith, linux-wireless@vger.kernel.org, linux-kernel@vger.kernel.org > On 16 Mar 2026, at 9:32 am, Ping-Ke Shih <pkshih@realtek.com> wrote: > > Christian Hewitt <christianshewitt@gmail.com> wrote: >> On Radxa Rock 5B with a RTL8852BE combo WiFi/BT card, the efuse >> physical map dump intermittently fails with -EBUSY during probe. >> The failure occurs in rtw89_dump_physical_efuse_map_ddv() where >> read_poll_timeout_atomic() times out waiting for the B_AX_EF_RDY >> bit after 1 second. >> >> The root cause is a timing race during boot: the WiFi driver's >> chip initialization (firmware download via PCIe) overlaps with the >> Bluetooth firmware download to the same combo chip over USB. This >> can leave the efuse controller temporarily unavailable when the >> WiFi driver attempts to read the efuse map. >> >> Add a retry loop (up to 3 attempts with 500ms delays) around the >> physical efuse map dump in rtw89_parse_efuse_map_ax(). The firmware >> download path already retries up to 5 times, but the efuse read >> that follows has no retry logic, making it the weak link in the >> probe sequence. > > I'd prefer adding a wrapper to retry 5 times without delay as bottom > changes for reference. If you want to limit retry only for > 'dav == false' case, it is also fine to me. > >> >> Signed-off-by: Christian Hewitt <christianshewitt@gmail.com> > > [...] 
> >> >> drivers/net/wireless/realtek/rtw89/efuse.c | 13 ++++++++++++- >> 1 file changed, 12 insertions(+), 1 deletion(-) >> >> diff --git a/drivers/net/wireless/realtek/rtw89/efuse.c >> b/drivers/net/wireless/realtek/rtw89/efuse.c >> index a2757a88d55d..d506f04ffd6c 100644 >> --- a/drivers/net/wireless/realtek/rtw89/efuse.c >> +++ b/drivers/net/wireless/realtek/rtw89/efuse.c >> @@ -270,6 +270,7 @@ int rtw89_parse_efuse_map_ax(struct rtw89_dev *rtwdev) >> u8 *log_map = NULL; >> u8 *dav_phy_map = NULL; >> u8 *dav_log_map = NULL; >> + int retry; >> int ret; >> >> if (rtw89_read16(rtwdev, R_AX_SYS_WL_EFUSE_CTRL) & B_AX_AUTOLOAD_SUS) >> @@ -289,7 +290,17 @@ int rtw89_parse_efuse_map_ax(struct rtw89_dev *rtwdev) >> goto out_free; >> } >> >> - ret = rtw89_dump_physical_efuse_map(rtwdev, phy_map, 0, phy_size, >> false); >> + for (retry = 0; retry < 3; retry++) { >> + if (retry) { >> + rtw89_warn(rtwdev, "efuse dump failed, retrying >> (%d)\n", >> + retry); >> + fsleep(500000); >> + } >> + ret = rtw89_dump_physical_efuse_map(rtwdev, phy_map, 0, >> + phy_size, false); >> + if (!ret) >> + break; >> + } >> if (ret) { >> rtw89_warn(rtwdev, "failed to dump efuse physical map\n"); >> goto out_free; >> -- >> 2.43.0 > > How about retrying 5 times without fsleep(500000)? 
> > diff --git a/drivers/net/wireless/realtek/rtw89/efuse.c b/drivers/net/wireless/realtek/rtw89/efuse.c > index a2757a88d55d..89d4b1b865f8 100644 > --- a/drivers/net/wireless/realtek/rtw89/efuse.c > +++ b/drivers/net/wireless/realtek/rtw89/efuse.c > @@ -185,8 +185,8 @@ static int rtw89_dump_physical_efuse_map_dav(struct rtw89_dev *rtwdev, u8 *map, > return 0; > } > > -static int rtw89_dump_physical_efuse_map(struct rtw89_dev *rtwdev, u8 *map, > - u32 dump_addr, u32 dump_size, bool dav) > +static int __rtw89_dump_physical_efuse_map(struct rtw89_dev *rtwdev, u8 *map, > + u32 dump_addr, u32 dump_size, bool dav) > { > int ret; > > @@ -208,6 +208,25 @@ static int rtw89_dump_physical_efuse_map(struct rtw89_dev *rtwdev, u8 *map, > return 0; > } > > +static int rtw89_dump_physical_efuse_map(struct rtw89_dev *rtwdev, u8 *map, > + u32 dump_addr, u32 dump_size, bool dav) > +{ > + int retry; > + int ret; > + > + for (retry = 0; retry < 5; retry++) { > + ret = __rtw89_dump_physical_efuse_map(rtwdev, map, dump_addr, > + dump_size, dav); > + if (!ret) > + return 0; > + > + rtw89_warn(rtwdev, "efuse dump (dav=%d) failed, retrying (%d)\n", > + dav, retry); > + } > + > + return ret; > +} > + > #define invalid_efuse_header(hdr1, hdr2) \ > ((hdr1) == 0xff || (hdr2) == 0xff) > #define invalid_efuse_content(word_en, i) \ I’ve run some boot tests and this also resolves my efuse map use-case, e.g. 
ROCK5B:~ # dmesg | grep rtw89 [ 6.506375] rtw89_8852be 0002:21:00.0: loaded firmware rtw89/rtw8852b_fw-1.bin [ 6.506539] rtw89_8852be 0002:21:00.0: enabling device (0000 -> 0003) [ 6.516069] rtw89_8852be 0002:21:00.0: Firmware version 0.29.29.15 (6fb3ec41), cmd version 0, type 5 [ 6.516083] rtw89_8852be 0002:21:00.0: Firmware version 0.29.29.15 (6fb3ec41), cmd version 0, type 3 [ 10.153731] rtw89_8852be 0002:21:00.0: efuse dump (dav=0) failed, retrying (0) [ 10.405347] rtw89_8852be 0002:21:00.0: chip info CID: 0, CV: 1, AID: 0, ACV: 1, RFE: 1 [ 10.408311] rtw89_8852be 0002:21:00.0: rfkill hardware state changed to enable So far I haven’t observed more than 1x retry being required, and there are no issues with loading the BT module. Would you like me to send a v2 using your revised version? - or? Christian ^ permalink raw reply [flat|nested] 18+ messages in thread
* RE: [PATCH] wifi: rtw89: retry efuse physical map dump on transient failure 2026-03-16 11:03 ` Christian Hewitt @ 2026-03-17 1:37 ` Ping-Ke Shih 2026-03-17 6:15 ` Christian Hewitt 0 siblings, 1 reply; 18+ messages in thread From: Ping-Ke Shih @ 2026-03-17 1:37 UTC (permalink / raw) To: Christian Hewitt Cc: Bitterblue Smith, linux-wireless@vger.kernel.org, linux-kernel@vger.kernel.org Christian Hewitt <christianshewitt@gmail.com> wrote: > > On 16 Mar 2026, at 9:32 am, Ping-Ke Shih <pkshih@realtek.com> wrote: > > > > Christian Hewitt <christianshewitt@gmail.com> wrote: > >> On Radxa Rock 5B with a RTL8852BE combo WiFi/BT card, the efuse > >> physical map dump intermittently fails with -EBUSY during probe. > >> The failure occurs in rtw89_dump_physical_efuse_map_ddv() where > >> read_poll_timeout_atomic() times out waiting for the B_AX_EF_RDY > >> bit after 1 second. > >> > >> The root cause is a timing race during boot: the WiFi driver's > >> chip initialization (firmware download via PCIe) overlaps with the > >> Bluetooth firmware download to the same combo chip over USB. This > >> can leave the efuse controller temporarily unavailable when the > >> WiFi driver attempts to read the efuse map. > >> > >> Add a retry loop (up to 3 attempts with 500ms delays) around the > >> physical efuse map dump in rtw89_parse_efuse_map_ax(). The firmware > >> download path already retries up to 5 times, but the efuse read > >> that follows has no retry logic, making it the weak link in the > >> probe sequence. > > > > I'd prefer adding a wrapper to retry 5 times without delay as bottom > > changes for reference. If you want to limit retry only for > > 'dav == false' case, it is also fine to me. > > > >> > >> Signed-off-by: Christian Hewitt <christianshewitt@gmail.com> > > > > [...] 
> > > >> > >> drivers/net/wireless/realtek/rtw89/efuse.c | 13 ++++++++++++- > >> 1 file changed, 12 insertions(+), 1 deletion(-) > >> > >> diff --git a/drivers/net/wireless/realtek/rtw89/efuse.c > >> b/drivers/net/wireless/realtek/rtw89/efuse.c > >> index a2757a88d55d..d506f04ffd6c 100644 > >> --- a/drivers/net/wireless/realtek/rtw89/efuse.c > >> +++ b/drivers/net/wireless/realtek/rtw89/efuse.c > >> @@ -270,6 +270,7 @@ int rtw89_parse_efuse_map_ax(struct rtw89_dev *rtwdev) > >> u8 *log_map = NULL; > >> u8 *dav_phy_map = NULL; > >> u8 *dav_log_map = NULL; > >> + int retry; > >> int ret; > >> > >> if (rtw89_read16(rtwdev, R_AX_SYS_WL_EFUSE_CTRL) & > B_AX_AUTOLOAD_SUS) > >> @@ -289,7 +290,17 @@ int rtw89_parse_efuse_map_ax(struct rtw89_dev *rtwdev) > >> goto out_free; > >> } > >> > >> - ret = rtw89_dump_physical_efuse_map(rtwdev, phy_map, 0, phy_size, > >> false); > >> + for (retry = 0; retry < 3; retry++) { > >> + if (retry) { > >> + rtw89_warn(rtwdev, "efuse dump failed, retrying > >> (%d)\n", > >> + retry); > >> + fsleep(500000); > >> + } > >> + ret = rtw89_dump_physical_efuse_map(rtwdev, phy_map, 0, > >> + phy_size, false); > >> + if (!ret) > >> + break; > >> + } > >> if (ret) { > >> rtw89_warn(rtwdev, "failed to dump efuse physical map\n"); > >> goto out_free; > >> -- > >> 2.43.0 > > > > How about retrying 5 times without fsleep(500000)? 
> > > > diff --git a/drivers/net/wireless/realtek/rtw89/efuse.c > b/drivers/net/wireless/realtek/rtw89/efuse.c > > index a2757a88d55d..89d4b1b865f8 100644 > > --- a/drivers/net/wireless/realtek/rtw89/efuse.c > > +++ b/drivers/net/wireless/realtek/rtw89/efuse.c > > @@ -185,8 +185,8 @@ static int rtw89_dump_physical_efuse_map_dav(struct > rtw89_dev *rtwdev, u8 *map, > > return 0; > > } > > > > -static int rtw89_dump_physical_efuse_map(struct rtw89_dev *rtwdev, u8 *map, > > - u32 dump_addr, u32 dump_size, bool > dav) > > +static int __rtw89_dump_physical_efuse_map(struct rtw89_dev *rtwdev, u8 > *map, > > + u32 dump_addr, u32 dump_size, > bool dav) > > { > > int ret; > > > > @@ -208,6 +208,25 @@ static int rtw89_dump_physical_efuse_map(struct rtw89_dev > *rtwdev, u8 *map, > > return 0; > > } > > > > +static int rtw89_dump_physical_efuse_map(struct rtw89_dev *rtwdev, u8 *map, > > + u32 dump_addr, u32 dump_size, bool > dav) > > +{ > > + int retry; > > + int ret; > > + > > + for (retry = 0; retry < 5; retry++) { > > + ret = __rtw89_dump_physical_efuse_map(rtwdev, map, > dump_addr, > > + dump_size, dav); > > + if (!ret) > > + return 0; > > + > > + rtw89_warn(rtwdev, "efuse dump (dav=%d) failed, retrying > (%d)\n", > > + dav, retry); > > + } > > + > > + return ret; > > +} > > + > > #define invalid_efuse_header(hdr1, hdr2) \ > > ((hdr1) == 0xff || (hdr2) == 0xff) > > #define invalid_efuse_content(word_en, i) \ > > I’ve run some boot tests and this also resolves my efuse map use-case, e.g. 
>
> ROCK5B:~ # dmesg | grep rtw89
> [ 6.506375] rtw89_8852be 0002:21:00.0: loaded firmware rtw89/rtw8852b_fw-1.bin
> [ 6.506539] rtw89_8852be 0002:21:00.0: enabling device (0000 -> 0003)
> [ 6.516069] rtw89_8852be 0002:21:00.0: Firmware version 0.29.29.15 (6fb3ec41), cmd version 0, type 5
> [ 6.516083] rtw89_8852be 0002:21:00.0: Firmware version 0.29.29.15 (6fb3ec41), cmd version 0, type 3
> [ 10.153731] rtw89_8852be 0002:21:00.0: efuse dump (dav=0) failed, retrying (0)
> [ 10.405347] rtw89_8852be 0002:21:00.0: chip info CID: 0, CV: 1, AID: 0, ACV: 1, RFE: 1
> [ 10.408311] rtw89_8852be 0002:21:00.0: rfkill hardware state changed to enable
>
> So far I haven’t observed more than 1x retry being required, and there are no
> issues with loading the BT module.

My change retries 5 times because your patch retries 3 times with an
additional 500 ms delay. I think you want around 5 seconds for the BT module
to load.

Do you mean that, for now, you can't reproduce the long loading time of the
BT module? (But it took a long time a few days ago?)

>
> Would you like me to send a v2 using your revised version? - or?

Yes, please send v2.

Ping-Ke
* Re: [PATCH] wifi: rtw89: retry efuse physical map dump on transient failure 2026-03-17 1:37 ` Ping-Ke Shih @ 2026-03-17 6:15 ` Christian Hewitt 0 siblings, 0 replies; 18+ messages in thread From: Christian Hewitt @ 2026-03-17 6:15 UTC (permalink / raw) To: Ping-Ke Shih Cc: Bitterblue Smith, linux-wireless@vger.kernel.org, linux-kernel@vger.kernel.org > On 17 Mar 2026, at 5:37 am, Ping-Ke Shih <pkshih@realtek.com> wrote: > > Christian Hewitt <christianshewitt@gmail.com> wrote: >>> On 16 Mar 2026, at 9:32 am, Ping-Ke Shih <pkshih@realtek.com> wrote: >>> >>> Christian Hewitt <christianshewitt@gmail.com> wrote: >>>> On Radxa Rock 5B with a RTL8852BE combo WiFi/BT card, the efuse >>>> physical map dump intermittently fails with -EBUSY during probe. >>>> The failure occurs in rtw89_dump_physical_efuse_map_ddv() where >>>> read_poll_timeout_atomic() times out waiting for the B_AX_EF_RDY >>>> bit after 1 second. >>>> >>>> The root cause is a timing race during boot: the WiFi driver's >>>> chip initialization (firmware download via PCIe) overlaps with the >>>> Bluetooth firmware download to the same combo chip over USB. This >>>> can leave the efuse controller temporarily unavailable when the >>>> WiFi driver attempts to read the efuse map. >>>> >>>> Add a retry loop (up to 3 attempts with 500ms delays) around the >>>> physical efuse map dump in rtw89_parse_efuse_map_ax(). The firmware >>>> download path already retries up to 5 times, but the efuse read >>>> that follows has no retry logic, making it the weak link in the >>>> probe sequence. >>> >>> I'd prefer adding a wrapper to retry 5 times without delay as bottom >>> changes for reference. If you want to limit retry only for >>> 'dav == false' case, it is also fine to me. >>> >>>> >>>> Signed-off-by: Christian Hewitt <christianshewitt@gmail.com> >>> >>> [...] 
>>> >>>> >>>> drivers/net/wireless/realtek/rtw89/efuse.c | 13 ++++++++++++- >>>> 1 file changed, 12 insertions(+), 1 deletion(-) >>>> >>>> diff --git a/drivers/net/wireless/realtek/rtw89/efuse.c >>>> b/drivers/net/wireless/realtek/rtw89/efuse.c >>>> index a2757a88d55d..d506f04ffd6c 100644 >>>> --- a/drivers/net/wireless/realtek/rtw89/efuse.c >>>> +++ b/drivers/net/wireless/realtek/rtw89/efuse.c >>>> @@ -270,6 +270,7 @@ int rtw89_parse_efuse_map_ax(struct rtw89_dev *rtwdev) >>>> u8 *log_map = NULL; >>>> u8 *dav_phy_map = NULL; >>>> u8 *dav_log_map = NULL; >>>> + int retry; >>>> int ret; >>>> >>>> if (rtw89_read16(rtwdev, R_AX_SYS_WL_EFUSE_CTRL) & >> B_AX_AUTOLOAD_SUS) >>>> @@ -289,7 +290,17 @@ int rtw89_parse_efuse_map_ax(struct rtw89_dev *rtwdev) >>>> goto out_free; >>>> } >>>> >>>> - ret = rtw89_dump_physical_efuse_map(rtwdev, phy_map, 0, phy_size, >>>> false); >>>> + for (retry = 0; retry < 3; retry++) { >>>> + if (retry) { >>>> + rtw89_warn(rtwdev, "efuse dump failed, retrying >>>> (%d)\n", >>>> + retry); >>>> + fsleep(500000); >>>> + } >>>> + ret = rtw89_dump_physical_efuse_map(rtwdev, phy_map, 0, >>>> + phy_size, false); >>>> + if (!ret) >>>> + break; >>>> + } >>>> if (ret) { >>>> rtw89_warn(rtwdev, "failed to dump efuse physical map\n"); >>>> goto out_free; >>>> -- >>>> 2.43.0 >>> >>> How about retrying 5 times without fsleep(500000)? 
>>> >>> diff --git a/drivers/net/wireless/realtek/rtw89/efuse.c >> b/drivers/net/wireless/realtek/rtw89/efuse.c >>> index a2757a88d55d..89d4b1b865f8 100644 >>> --- a/drivers/net/wireless/realtek/rtw89/efuse.c >>> +++ b/drivers/net/wireless/realtek/rtw89/efuse.c >>> @@ -185,8 +185,8 @@ static int rtw89_dump_physical_efuse_map_dav(struct >> rtw89_dev *rtwdev, u8 *map, >>> return 0; >>> } >>> >>> -static int rtw89_dump_physical_efuse_map(struct rtw89_dev *rtwdev, u8 *map, >>> - u32 dump_addr, u32 dump_size, bool >> dav) >>> +static int __rtw89_dump_physical_efuse_map(struct rtw89_dev *rtwdev, u8 >> *map, >>> + u32 dump_addr, u32 dump_size, >> bool dav) >>> { >>> int ret; >>> >>> @@ -208,6 +208,25 @@ static int rtw89_dump_physical_efuse_map(struct rtw89_dev >> *rtwdev, u8 *map, >>> return 0; >>> } >>> >>> +static int rtw89_dump_physical_efuse_map(struct rtw89_dev *rtwdev, u8 *map, >>> + u32 dump_addr, u32 dump_size, bool >> dav) >>> +{ >>> + int retry; >>> + int ret; >>> + >>> + for (retry = 0; retry < 5; retry++) { >>> + ret = __rtw89_dump_physical_efuse_map(rtwdev, map, >> dump_addr, >>> + dump_size, dav); >>> + if (!ret) >>> + return 0; >>> + >>> + rtw89_warn(rtwdev, "efuse dump (dav=%d) failed, retrying >> (%d)\n", >>> + dav, retry); >>> + } >>> + >>> + return ret; >>> +} >>> + >>> #define invalid_efuse_header(hdr1, hdr2) \ >>> ((hdr1) == 0xff || (hdr2) == 0xff) >>> #define invalid_efuse_content(word_en, i) \ >> >> I’ve run some boot tests and this also resolves my efuse map use-case, e.g. 
>>
>> ROCK5B:~ # dmesg | grep rtw89
>> [ 6.506375] rtw89_8852be 0002:21:00.0: loaded firmware rtw89/rtw8852b_fw-1.bin
>> [ 6.506539] rtw89_8852be 0002:21:00.0: enabling device (0000 -> 0003)
>> [ 6.516069] rtw89_8852be 0002:21:00.0: Firmware version 0.29.29.15 (6fb3ec41), cmd version 0, type 5
>> [ 6.516083] rtw89_8852be 0002:21:00.0: Firmware version 0.29.29.15 (6fb3ec41), cmd version 0, type 3
>> [ 10.153731] rtw89_8852be 0002:21:00.0: efuse dump (dav=0) failed, retrying (0)
>> [ 10.405347] rtw89_8852be 0002:21:00.0: chip info CID: 0, CV: 1, AID: 0, ACV: 1, RFE: 1
>> [ 10.408311] rtw89_8852be 0002:21:00.0: rfkill hardware state changed to enable
>>
>> So far I haven’t observed more than 1x retry being required, and there are no
>> issues with loading the BT module.
>
> My change retries 5 times because your patch retries 3 times with an
> additional 500 ms delay. I think you want around 5 seconds for the BT module
> to load.

Understood.

> Do you mean that, for now, you can't reproduce the long loading time of the
> BT module? (But it took a long time a few days ago?)

I’ve never noticed a long loading time for the BT module, and timings seem to
be consistent both with and without the patch (and are not a problem).

>> Would you like me to send a v2 using your revised version? - or?
>
> Yes, please send v2.

Thanks, will send v2 later today.

Christian
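The timing trade-off discussed in this exchange can be made concrete with a little arithmetic. The sketch below is an estimate under one loud assumption: every failed attempt costs exactly one poll timeout of 1 second (the figure given in the original commit message), and nothing else in the attempt takes measurable time. `worst_case_ms()` is a hypothetical helper, not a kernel function. Under that assumption the original proposal (3 attempts, 500 ms sleeps) waits at most about 4 seconds, and the revised wrapper (5 attempts, no sleep) at most about 5 seconds.

```c
#include <assert.h>

/* Hypothetical helper: upper bound, in milliseconds, on the time spent
 * when every attempt fails. Assumes each failed attempt costs one full
 * poll timeout and that a delay is inserted only between attempts. */
static unsigned int worst_case_ms(unsigned int attempts,
				  unsigned int poll_timeout_ms,
				  unsigned int delay_ms)
{
	assert(attempts > 0);

	return attempts * poll_timeout_ms + (attempts - 1) * delay_ms;
}

/* Original patch:  worst_case_ms(3, 1000, 500) -> 4000 ms
 * Revised wrapper: worst_case_ms(5, 1000, 0)   -> 5000 ms */
```

These are upper bounds only. In the dmesg output quoted above, the chip info line appears roughly 0.25 s after the single "retrying (0)" warning, so the common case is far cheaper than the worst case.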
end of thread, other threads:[~2026-03-17 6:15 UTC | newest]

Thread overview: 18+ messages
2026-03-01 4:24 [PATCH] wifi: rtw89: retry efuse physical map dump on transient failure Christian Hewitt
2026-03-02 5:47 ` Ping-Ke Shih
2026-03-02 5:55 ` Christian Hewitt
2026-03-02 6:04 ` Ping-Ke Shih
2026-03-02 6:17 ` Christian Hewitt
2026-03-09 2:35 ` Ping-Ke Shih
2026-03-10 17:16 ` Christian Hewitt
2026-03-11 3:05 ` Ping-Ke Shih
2026-03-11 4:20 ` Christian Hewitt
2026-03-12 2:22 ` Ping-Ke Shih
2026-03-12 5:58 ` Christian Hewitt
2026-03-12 7:39 ` Ping-Ke Shih
2026-03-12 8:11 ` Christian Hewitt
2026-03-12 8:28 ` Ping-Ke Shih
2026-03-16 5:32 ` Ping-Ke Shih
2026-03-16 11:03 ` Christian Hewitt
2026-03-17 1:37 ` Ping-Ke Shih
2026-03-17 6:15 ` Christian Hewitt