[PATCH iwl v3] ice: retry reading NVM if admin queue returns EBUSY

Netdev List
 help / color / mirror / Atom feed

* [PATCH iwl v3] ice: retry reading NVM if admin queue returns EBUSY
@ 2026-06-17 12:07 Robert Malz
  2026-06-18 15:20 ` Simon Horman
  0 siblings, 1 reply; 4+ messages in thread
From: Robert Malz @ 2026-06-17 12:07 UTC (permalink / raw)
  To: anthony.l.nguyen, przemyslaw.kitszel; +Cc: intel-wired-lan, netdev

When the admin queue command to read NVM returns EBUSY, the driver
currently treats it as a fatal error and aborts the entire read
operation. This can cause spurious NVM read failures during periods of
high firmware activity.

Add retry logic to ice_read_flat_nvm() that handles EBUSY responses
from the admin queue. When an EBUSY error is encountered, release the
NVM resource lock, wait for ICE_SQ_SEND_DELAY_TIME_MS, re-acquire it,
and retry the failed read. The retry is attempted up to
ICE_SQ_SEND_MAX_EXECUTE times before giving up.

Code was extracted from OOT ice driver 1.15.4 release. Additional
change was made to reset last_cmd in case of retry to make sure that
all commands are retried properly.

Fixes: e94509906d6b ("ice: create function to read a section of the NVM and Shadow RAM")
Signed-off-by: Robert Malz <robert.malz@canonical.com>
---
Changes in v2:
- change ICE_AQ_RC_EBUSY -> LIBIE_AQ_RC_EBUSY
Changes in v3:
- resending to comply with the netdev 24-hour rule. No code changes since v2.

 drivers/net/ethernet/intel/ice/ice_nvm.c | 25 +++++++++++++++++++-----
 1 file changed, 20 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/intel/ice/ice_nvm.c b/drivers/net/ethernet/intel/ice/ice_nvm.c
index 7e187a804dfa..b3120605d66f 100644
--- a/drivers/net/ethernet/intel/ice/ice_nvm.c
+++ b/drivers/net/ethernet/intel/ice/ice_nvm.c
@@ -67,6 +67,7 @@ ice_read_flat_nvm(struct ice_hw *hw, u32 offset, u32 *length, u8 *data,
 {
 	u32 inlen = *length;
 	u32 bytes_read = 0;
+	int retry_cnt = 0;
 	bool last_cmd;
 	int status;
 
@@ -96,11 +97,25 @@ ice_read_flat_nvm(struct ice_hw *hw, u32 offset, u32 *length, u8 *data,
 					 offset, read_size,
 					 data + bytes_read, last_cmd,
 					 read_shadow_ram, NULL);
-		if (status)
-			break;
-
-		bytes_read += read_size;
-		offset += read_size;
+		if (status) {
+			if (hw->adminq.sq_last_status != LIBIE_AQ_RC_EBUSY ||
+			    retry_cnt > ICE_SQ_SEND_MAX_EXECUTE)
+				break;
+			ice_debug(hw, ICE_DBG_NVM,
+				  "NVM read EBUSY error, retry %d\n",
+				  retry_cnt + 1);
+			last_cmd = false;
+			ice_release_nvm(hw);
+			msleep(ICE_SQ_SEND_DELAY_TIME_MS);
+			status = ice_acquire_nvm(hw, ICE_RES_READ);
+			if (status)
+				break;
+			retry_cnt++;
+		} else {
+			bytes_read += read_size;
+			offset += read_size;
+			retry_cnt = 0;
+		}
 	} while (!last_cmd);
 
 	*length = bytes_read;
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH iwl v3] ice: retry reading NVM if admin queue returns EBUSY
  2026-06-17 12:07 [PATCH iwl v3] ice: retry reading NVM if admin queue returns EBUSY Robert Malz
@ 2026-06-18 15:20 ` Simon Horman
  2026-06-24 11:15   ` Robert Malz
  0 siblings, 1 reply; 4+ messages in thread
From: Simon Horman @ 2026-06-18 15:20 UTC (permalink / raw)
  To: robert.malz
  Cc: Simon Horman, anthony.l.nguyen, przemyslaw.kitszel,
	intel-wired-lan, netdev

This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
Full review at: https://sashiko.dev
---
ice: retry reading NVM if admin queue returns EBUSY

When the admin queue command to read NVM returns EBUSY, the driver
currently treats it as a fatal error and aborts the entire read operation.
This patch adds retry logic to handle this condition.

> diff --git a/drivers/net/ethernet/intel/ice/ice_nvm.c b/drivers/net/ethernet/intel/ice/ice_nvm.c
> --- a/drivers/net/ethernet/intel/ice/ice_nvm.c
> +++ b/drivers/net/ethernet/intel/ice/ice_nvm.c

[ ... ]

> @@ -96,11 +97,25 @@ ice_read_flat_nvm(struct ice_hw *hw, u32 offset, u32 *length, u8 *data,
>  					 offset, read_size,
>  					 data + bytes_read, last_cmd,
>  					 read_shadow_ram, NULL);
> -		if (status)
> -			break;
> -
> -		bytes_read += read_size;
> -		offset += read_size;
> +		if (status) {
> +			if (hw->adminq.sq_last_status != LIBIE_AQ_RC_EBUSY ||
> +			    retry_cnt > ICE_SQ_SEND_MAX_EXECUTE)

[Severity: Low]
Does this check allow an extra retry execution? Since retry_cnt starts at 0, 
using a strictly greater-than check allows the loop to retry 4 times
(5 total executions) instead of the 3 executions intended by
ICE_SQ_SEND_MAX_EXECUTE.

> +				break;
> +			ice_debug(hw, ICE_DBG_NVM,
> +				  "NVM read EBUSY error, retry %d\n",
> +				  retry_cnt + 1);
> +			last_cmd = false;
> +			ice_release_nvm(hw);
> +			msleep(ICE_SQ_SEND_DELAY_TIME_MS);

[Severity: High]
Does dropping the NVM lock here violate the caller's assumption of mutual
exclusion?

Callers of ice_read_flat_nvm() typically acquire the NVM lock beforehand to
ensure an atomic read of multiple sectors without interleaved writes. By
explicitly dropping the lock during the msleep() window, another thread, VF,
or PF could acquire the NVM lock and modify the NVM before the read resumes,
potentially resulting in reading torn or inconsistent NVM data.

> +			status = ice_acquire_nvm(hw, ICE_RES_READ);
> +			if (status)
> +				break;

[Severity: High]
Can this lead to an unbalanced NVM resource lock release and potential data
corruption?

If ice_acquire_nvm() fails here, the loop breaks and returns an error with the
lock currently unheld. However, callers like ice_read_flash_module() are
unaware of this internal lock state change and will unconditionally call
ice_release_nvm() upon failure:

drivers/net/ethernet/intel/ice/ice_nvm.c:ice_read_flash_module() {
	status = ice_acquire_nvm(hw, ICE_RES_READ);
	if (status)
		return status;

	status = ice_read_flat_nvm(hw, start + offset, &length, data, false);

	ice_release_nvm(hw);
    ...

Because firmware tracks resource locks per-PF rather than per-OS-thread, this
errant second release could forcefully strip the lock from another thread on
the same PF that successfully acquired it, exposing the NVM to concurrent
modification.

> +			retry_cnt++;
> +		} else {
> +			bytes_read += read_size;
> +			offset += read_size;
> +			retry_cnt = 0;
> +		}
>  	} while (!last_cmd);
>  
>  	*length = bytes_read;

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH iwl v3] ice: retry reading NVM if admin queue returns EBUSY
  2026-06-18 15:20 ` Simon Horman
@ 2026-06-24 11:15   ` Robert Malz
  2026-06-25  7:53     ` Przemek Kitszel
  0 siblings, 1 reply; 4+ messages in thread
From: Robert Malz @ 2026-06-24 11:15 UTC (permalink / raw)
  To: Simon Horman
  Cc: anthony.l.nguyen, przemyslaw.kitszel, intel-wired-lan, netdev

Hey Simon,
Thanks for the review. I'll try to address these comments one by one.

On Thu, Jun 18, 2026 at 5:20 PM Simon Horman <horms@kernel.org> wrote:
>
> This is an AI-generated review of your patch. The human sending this
> email has considered the AI review valid, or at least plausible.
> Full review at: https://sashiko.dev
> ---
> ice: retry reading NVM if admin queue returns EBUSY
>
> When the admin queue command to read NVM returns EBUSY, the driver
> currently treats it as a fatal error and aborts the entire read operation.
> This patch adds retry logic to handle this condition.
>
> > diff --git a/drivers/net/ethernet/intel/ice/ice_nvm.c b/drivers/net/ethernet/intel/ice/ice_nvm.c
> > --- a/drivers/net/ethernet/intel/ice/ice_nvm.c
> > +++ b/drivers/net/ethernet/intel/ice/ice_nvm.c
>
> [ ... ]
>
> > @@ -96,11 +97,25 @@ ice_read_flat_nvm(struct ice_hw *hw, u32 offset, u32 *length, u8 *data,
> >                                        offset, read_size,
> >                                        data + bytes_read, last_cmd,
> >                                        read_shadow_ram, NULL);
> > -             if (status)
> > -                     break;
> > -
> > -             bytes_read += read_size;
> > -             offset += read_size;
> > +             if (status) {
> > +                     if (hw->adminq.sq_last_status != LIBIE_AQ_RC_EBUSY ||
> > +                         retry_cnt > ICE_SQ_SEND_MAX_EXECUTE)
>
> [Severity: Low]
> Does this check allow an extra retry execution? Since retry_cnt starts at 0,
> using a strictly greater-than check allows the loop to retry 4 times
> (5 total executions) instead of the 3 executions intended by
> ICE_SQ_SEND_MAX_EXECUTE.
>
> > +                             break;
> > +                     ice_debug(hw, ICE_DBG_NVM,
> > +                               "NVM read EBUSY error, retry %d\n",
> > +                               retry_cnt + 1);
> > +                     last_cmd = false;
> > +                     ice_release_nvm(hw);
> > +                     msleep(ICE_SQ_SEND_DELAY_TIME_MS);
>
The review is correct that this allows one extra execution. However, I
intentionally kept this verbatim from the OOT driver (1.15.4) to
retain the significant test exposure it has there. The absolute worst
case is one harmless extra retry before giving up. If you strongly
prefer strict adherence to ICE_SQ_SEND_MAX_EXECUTE, I can update the
condition in a v4.

> [Severity: High]
> Does dropping the NVM lock here violate the caller's assumption of mutual
> exclusion?
>
> Callers of ice_read_flat_nvm() typically acquire the NVM lock beforehand to
> ensure an atomic read of multiple sectors without interleaved writes. By
> explicitly dropping the lock during the msleep() window, another thread, VF,
> or PF could acquire the NVM lock and modify the NVM before the read resumes,
> potentially resulting in reading torn or inconsistent NVM data.
>
> > +                     status = ice_acquire_nvm(hw, ICE_RES_READ);
> > +                     if (status)
> > +                             break;
>

I don't think this is an issue. The NVM resource (req_res 0x0008) is
an ownership lease, not a content snapshot. The FW grants it with a
hold timeout (see the timeout field handling in ice_aq_req_res). It
never guaranteed a point-in-time view of the flash.
Very similar behavior already exists in ice_devlink_nvm_snapshot(),
which reads the NVM in chunks via ice_read_flat_nvm() and deliberately
acquires/releases the lock between chunks (with a comment explaining
the FW reclaim).
Additionally, the lock is only dropped on the failure path, where
neither bytes_read nor offset have been advanced. The driver simply
re-reads the failed chunk after re-acquiring. So no data that is
returned to the caller is committed across the drop window.
Overall the lock drop does not weaken any guarantee callers actually
have. If we wanted to introduce a true snapshot guarantee, that would
probably require different approach and separate patch.

> [Severity: High]
> Can this lead to an unbalanced NVM resource lock release and potential data
> corruption?
>
> If ice_acquire_nvm() fails here, the loop breaks and returns an error with the
> lock currently unheld. However, callers like ice_read_flash_module() are
> unaware of this internal lock state change and will unconditionally call
> ice_release_nvm() upon failure:
>
> drivers/net/ethernet/intel/ice/ice_nvm.c:ice_read_flash_module() {
>         status = ice_acquire_nvm(hw, ICE_RES_READ);
>         if (status)
>                 return status;
>
>         status = ice_read_flat_nvm(hw, start + offset, &length, data, false);
>
>         ice_release_nvm(hw);
>     ...
>
> Because firmware tracks resource locks per-PF rather than per-OS-thread, this
> errant second release could forcefully strip the lock from another thread on
> the same PF that successfully acquired it, exposing the NVM to concurrent
> modification.
>

Agreed, this might be a real bug, and the one of the three I think is
worth investigating.
If ice_acquire_nvm() fails after the drop, ice_read_flat_nvm() returns
with the lock unheld while callers unconditionally call
ice_release_nvm(), so a stray release is issued.

On probability, though, the window is very small. Reaching it requires
sustained EBUSY across the retry budget plus a failed re-acquire
(which itself polls up to ICE_NVM_TIMEOUT), and concurrently another
requester taking the lock. Most reads happen during init (ice_probe,
and reset/rebuild via ice_init_nvm), and NVM writes only happen on an
already initialized driver. The devlink/ethtool nvm_read paths are
also exposed, but hitting this race would require precise timing
against a concurrent NVM owner on the device.

I'd prefer to keep the scope of this patch limited to the EBUSY retry
path and not take on the unbalanced-release fix here. A proper fix
should change the lock-ownership contract of ice_read_flat_nvm() (on
error, the lock must be released by ice_read_flat_nvm(), callers
release only on success) and update all callers. Code change sould be
simple for all callers but ice_discover_flash_size(), it intentionally
holds one lease across a read loop and would need to re-acquire after
each expected boundary failure.

Given how small the original window is, I'd rather not trade tested
OOT behavior for the risk of a complex unbalanced NVM lock fix. I
actually have a patch mostly ready that fixes the lock-ownership
contract, but I really don't like it. It changes the design of
ice_read_flat_nvm(), making it less intuitive for callers. More
importantly, I just don't have the resources or test coverage right
now to properly verify such change.

However, I can modify the failure path for ice_acquire_nvm inside
ice_read_flat_nvm. Instead of bailing out immediately, we can just
retry it within the existing retry budget. In this case, the
probability of leaving ice_read_flat_nvm without holding the lock is
reduced even further without needing a refactor.

Please let me know what you think about my thought process on this.

> > +                     retry_cnt++;
> > +             } else {
> > +                     bytes_read += read_size;
> > +                     offset += read_size;
> > +                     retry_cnt = 0;
> > +             }
> >       } while (!last_cmd);
> >
> >       *length = bytes_read;

Thanks,
Robert

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH iwl v3] ice: retry reading NVM if admin queue returns EBUSY
  2026-06-24 11:15   ` Robert Malz
@ 2026-06-25  7:53     ` Przemek Kitszel
  0 siblings, 0 replies; 4+ messages in thread
From: Przemek Kitszel @ 2026-06-25  7:53 UTC (permalink / raw)
  To: Robert Malz, Simon Horman, Grzegorz Nitka
  Cc: anthony.l.nguyen, intel-wired-lan, netdev

>> This is an AI-generated review of your patch. The human sending this


>> [Severity: Low]
>> Does this check allow an extra retry execution? Since retry_cnt starts at 0,
>> using a strictly greater-than check allows the loop to retry 4 times
>> (5 total executions) instead of the 3 executions intended by
>> ICE_SQ_SEND_MAX_EXECUTE.
>>
>>> +                             break;
>>> +                     ice_debug(hw, ICE_DBG_NVM,
>>> +                               "NVM read EBUSY error, retry %d\n",
>>> +                               retry_cnt + 1);
>>> +                     last_cmd = false;
>>> +                     ice_release_nvm(hw);
>>> +                     msleep(ICE_SQ_SEND_DELAY_TIME_MS);
>>
> The review is correct that this allows one extra execution. However, I
> intentionally kept this verbatim from the OOT driver (1.15.4) to

This OOT change was made entirely by Grzegorz, for E825C. Would be great 
to acknowledge his authorship (+ SoB ofc, below).
Signed-off-by: Grzegorz Nitka <grzegorz.nitka@intel.com>

What was the platform that you have encountered the bug on?

> retain the significant test exposure it has there. The absolute worst
> case is one harmless extra retry before giving up. If you strongly
> prefer strict adherence to ICE_SQ_SEND_MAX_EXECUTE, I can update the
> condition in a v4.

Please update (to have one attempt less).

> 
>> [Severity: High]
>> Does dropping the NVM lock here violate the caller's assumption of mutual
>> exclusion?
>>
>> Callers of ice_read_flat_nvm() typically acquire the NVM lock beforehand to
>> ensure an atomic read of multiple sectors without interleaved writes. By
>> explicitly dropping the lock during the msleep() window, another thread, VF,
>> or PF could acquire the NVM lock and modify the NVM before the read resumes,
>> potentially resulting in reading torn or inconsistent NVM data.
>>
>>> +                     status = ice_acquire_nvm(hw, ICE_RES_READ);
>>> +                     if (status)
>>> +                             break;
>>
> 
> I don't think this is an issue. The NVM resource (req_res 0x0008) is
> an ownership lease, not a content snapshot. The FW grants it with a
> hold timeout (see the timeout field handling in ice_aq_req_res). It
> never guaranteed a point-in-time view of the flash.
> Very similar behavior already exists in ice_devlink_nvm_snapshot(),
> which reads the NVM in chunks via ice_read_flat_nvm() and deliberately
> acquires/releases the lock between chunks (with a comment explaining
> the FW reclaim).
> Additionally, the lock is only dropped on the failure path, where
> neither bytes_read nor offset have been advanced. The driver simply
> re-reads the failed chunk after re-acquiring. So no data that is
> returned to the caller is committed across the drop window.
> Overall the lock drop does not weaken any guarantee callers actually
> have. If we wanted to introduce a true snapshot guarantee, that would
> probably require different approach and separate patch.
> 
> 
>> [Severity: High]
>> Can this lead to an unbalanced NVM resource lock release and potential data
>> corruption?
>>
>> If ice_acquire_nvm() fails here, the loop breaks and returns an error with the
>> lock currently unheld. However, callers like ice_read_flash_module() are
>> unaware of this internal lock state change and will unconditionally call
>> ice_release_nvm() upon failure:
>>
>> drivers/net/ethernet/intel/ice/ice_nvm.c:ice_read_flash_module() {
>>          status = ice_acquire_nvm(hw, ICE_RES_READ);
>>          if (status)
>>                  return status;
>>
>>          status = ice_read_flat_nvm(hw, start + offset, &length, data, false);
>>
>>          ice_release_nvm(hw);
>>      ...
>>
>> Because firmware tracks resource locks per-PF rather than per-OS-thread, this
>> errant second release could forcefully strip the lock from another thread on
>> the same PF that successfully acquired it, exposing the NVM to concurrent
>> modification.
>>
> 
> Agreed, this might be a real bug, and the one of the three I think is
> worth investigating.
> If ice_acquire_nvm() fails after the drop, ice_read_flat_nvm() returns
> with the lock unheld while callers unconditionally call
> ice_release_nvm(), so a stray release is issued.
> 
> On probability, though, the window is very small. Reaching it requires
> sustained EBUSY across the retry budget plus a failed re-acquire
> (which itself polls up to ICE_NVM_TIMEOUT), and concurrently another
> requester taking the lock. Most reads happen during init (ice_probe,
> and reset/rebuild via ice_init_nvm), and NVM writes only happen on an
> already initialized driver. The devlink/ethtool nvm_read paths are
> also exposed, but hitting this race would require precise timing
> against a concurrent NVM owner on the device.
> 
> I'd prefer to keep the scope of this patch limited to the EBUSY retry
> path and not take on the unbalanced-release fix here. A proper fix
> should change the lock-ownership contract of ice_read_flat_nvm() (on
> error, the lock must be released by ice_read_flat_nvm(), callers
> release only on success) and update all callers. Code change sould be
> simple for all callers but ice_discover_flash_size(), it intentionally
> holds one lease across a read loop and would need to re-acquire after
> each expected boundary failure.
> 
> Given how small the original window is, I'd rather not trade tested
> OOT behavior for the risk of a complex unbalanced NVM lock fix. I
> actually have a patch mostly ready that fixes the lock-ownership
> contract, but I really don't like it. It changes the design of
> ice_read_flat_nvm(), making it less intuitive for callers. More
> importantly, I just don't have the resources or test coverage right
> now to properly verify such change.
> 
> However, I can modify the failure path for ice_acquire_nvm inside
> ice_read_flat_nvm. Instead of bailing out immediately, we can just
> retry it within the existing retry budget. In this case, the
> probability of leaving ice_read_flat_nvm without holding the lock is
> reduced even further without needing a refactor.
> 
> Please let me know what you think about my thought process on this.

I think that both AI-reported issues against the lock are valid
concerns.

I think that sleep was the actual fix, and re-locking were merely
a necessity due to their expiration (as you said).

A proper fix would be to just increase lock-timeout to accommodate all
attempts (and still do the retries&sleep, but without unlocking).


> 
> 
> 
>>> +                     retry_cnt++;
>>> +             } else {
>>> +                     bytes_read += read_size;
>>> +                     offset += read_size;
>>> +                     retry_cnt = 0;
>>> +             }
>>>        } while (!last_cmd);
>>>
>>>        *length = bytes_read;
> 
> Thanks,
> Robert


^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2026-06-25  7:51 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-06-17 12:07 [PATCH iwl v3] ice: retry reading NVM if admin queue returns EBUSY Robert Malz
2026-06-18 15:20 ` Simon Horman
2026-06-24 11:15   ` Robert Malz
2026-06-25  7:53     ` Przemek Kitszel

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox