From: Alexander Lobakin <aleksander.lobakin@intel.com>
To: Jakub Kicinski <kuba@kernel.org>
Cc: <anthony.l.nguyen@intel.com>, <netdev@vger.kernel.org>,
<sdf@fomichev.me>, <andrew+netdev@lunn.ch>, <ast@kernel.org>,
<sx.rinitha@intel.com>, <horms@kernel.org>,
<yury.norov@gmail.com>, <john.fastabend@gmail.com>,
<kohei@enjuk.jp>, <przemyslaw.kitszel@intel.com>,
<richardcochran@gmail.com>, <alexander.nowlin@intel.com>,
<daniel@iogearbox.net>, <maciej.fijalkowski@intel.com>,
<nxne.cnse.osdt.itp.upstreaming@intel.com>, <edumazet@google.com>,
<aleksandr.loktionov@intel.com>, <marcin.szycik@linux.intel.com>,
<hawk@kernel.org>, <jacob.e.keller@intel.com>,
<magnus.karlsson@intel.com>, <pmenzel@molgen.mpg.de>,
<pabeni@redhat.com>, <bpf@vger.kernel.org>, <davem@davemloft.net>,
<andriy.shevchenko@linux.intel.com>
Subject: Re: [net-next,3/9] ice: migrate to netdev ops lock
Date: Wed, 11 Feb 2026 14:51:56 +0100
Message-ID: <2a300bf0-3975-4f79-a3e0-5cb9b8159d62@intel.com>
In-Reply-To: <20260211042458.555701-1-kuba@kernel.org>
From: Jakub Kicinski <kuba@kernel.org>
Date: Tue, 10 Feb 2026 20:24:58 -0800
Hi,
> This is an AI-generated review of your patch. The human sending this
> email has considered the AI review valid, or at least plausible.
>
> For local reproduction steps see https://netdev-ai.bots.linux.dev/ai-local.html
> ---
> ice: migrate to netdev ops lock
>
> This commit migrates the ice driver to use the netdev ops lock by
> requesting ops locking and using _locked counterparts of NAPI
> configuration functions to avoid deadlocks during suspend/resume, queue
> rebuild and resets.
>
>> diff --git a/drivers/net/ethernet/intel/ice/ice_lib.c b/drivers/net/ethernet/intel/ice/ice_lib.c
>> --- a/drivers/net/ethernet/intel/ice/ice_lib.c
>> +++ b/drivers/net/ethernet/intel/ice/ice_lib.c
>
> [ ... ]
>
>> @@ -2308,10 +2308,14 @@ static int ice_vsi_cfg_tc_lan(struct ice_pf *pf, struct ice_vsi *vsi)
>> }
>>
>> /**
>> - * ice_vsi_cfg_def - configure default VSI based on the type
>> + * ice_vsi_cfg_def_locked - configure default VSI based on the type
>> * @vsi: pointer to VSI
>> + *
>> + * Should be called only with the netdev lock taken.
>> + *
>> + * Return: 0 on success, -errno on failure.
>> */
>> -static int ice_vsi_cfg_def(struct ice_vsi *vsi)
>> +static int ice_vsi_cfg_def_locked(struct ice_vsi *vsi)
>
> [ ... ]
>
>> +unroll_vector_base:
>> + /* reclaim SW interrupts back to the common pool */
>> +unroll_alloc_q_vector:
>> + ice_vsi_free_q_vectors(vsi);
>
> Should this be ice_vsi_free_q_vectors_locked(vsi)?
>
> ice_vsi_cfg_def_locked() is called with the netdev lock held (either by
> ice_vsi_cfg_def() wrapper or directly from callers like
> ice_vsi_rebuild_locked()). The error path here calls ice_vsi_free_q_vectors()
> which tries to acquire the netdev lock:
>
> ice_vsi_cfg_def_locked() [netdev lock held]
> -> ice_vsi_free_q_vectors()
> -> netdev_lock(dev) /* deadlock - already held */
>
> This would deadlock when an error triggers unroll_vector_base or
> unroll_alloc_q_vector paths (e.g., memory allocation failure in
> ice_vsi_alloc_rings() or ice_vsi_alloc_ring_stats()).
>
> [ ... ]
>
>> @@ -2703,7 +2748,7 @@ void ice_vsi_close(struct ice_vsi *vsi)
>> {
>> if (!test_and_set_bit(ICE_VSI_DOWN, vsi->state))
>> ice_down(vsi);
>>
>> - ice_vsi_clear_napi_queues(vsi);
>> + ice_vsi_clear_napi_queues_locked(vsi);
>
> ice_vsi_clear_napi_queues_locked() expects the netdev lock to be held.
> Are all callers of ice_vsi_close() holding the netdev lock?
>
> Looking at ice_sf_dev_remove():
>
> ice_sf_dev_remove()
> -> ice_vsi_close(vsi) /* no netdev_lock held */
> -> ice_sf_decfg_netdev() /* unregisters netdev later */
>
> And ice_dis_vsi() when !netif_running(vsi->netdev):
>
> ice_dis_vsi(vsi, false)
> -> ice_vsi_close(vsi) /* rtnl_lock only taken when netif_running */
>
> Both paths call ice_vsi_close() without holding the netdev lock while
> the netdev is still registered, which could trigger
> netdev_assert_locked_or_invisible warnings.
>
> [ ... ]
>
>> @@ -3069,16 +3152,17 @@
>> }
>>
>> /**
>> - * ice_vsi_rebuild - Rebuild VSI after reset
>> + * ice_vsi_rebuild_locked - Rebuild VSI after reset
>> * @vsi: VSI to be rebuild
>> * @vsi_flags: flags used for VSI rebuild flow
>> *
>> * Set vsi_flags to ICE_VSI_FLAG_INIT to initialize a new VSI, or
>> * ICE_VSI_FLAG_NO_INIT to rebuild an existing VSI in hardware.
>> + * Should be called only under the netdev lock.
>> *
>> * Returns 0 on success and negative value on failure
>> */
>> -int ice_vsi_rebuild(struct ice_vsi *vsi, u32 vsi_flags)
>> +int ice_vsi_rebuild_locked(struct ice_vsi *vsi, u32 vsi_flags)
>
> [ ... ]
>
>> +free_coalesce:
>> + kfree(coalesce);
>> +decfg:
>> + if (ret)
>> + ice_vsi_decfg(vsi);
>
> Should this be ice_vsi_decfg_locked(vsi)?
>
> ice_vsi_rebuild_locked() is called with the netdev lock already held
> (either by the ice_vsi_rebuild() wrapper or by callers like
> ice_vsi_recfg_qs()). The error path at label 'decfg:' calls ice_vsi_decfg()
> which tries to acquire the lock again:
>
> ice_vsi_rebuild_locked() [netdev lock held]
> -> ice_vsi_decfg()
> -> netdev_lock(dev) /* deadlock - already held */
>
> This would deadlock when an error occurs after ice_vsi_cfg_def_locked()
> succeeds but a later operation fails.
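
The double-acquire pattern the review describes can be reproduced in a small userspace sketch. Everything in it is hypothetical: the fake_* names are not ice functions, and a pthread error-checking mutex stands in for the netdev lock, so the second acquire returns EDEADLK instead of hanging the way a plain non-recursive lock would.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for a net_device and its per-device lock. */
struct fake_netdev {
	pthread_mutex_t lock;
};

static int fake_netdev_init(struct fake_netdev *dev)
{
	pthread_mutexattr_t attr;
	int ret;

	ret = pthread_mutexattr_init(&attr);
	if (ret)
		return ret;

	/* Error-checking type: relocking by the owner returns EDEADLK. */
	ret = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
	if (!ret)
		ret = pthread_mutex_init(&dev->lock, &attr);

	pthread_mutexattr_destroy(&attr);
	return ret;
}

/* Cleanup variant that assumes the caller already holds dev->lock. */
static int fake_free_vectors_locked(struct fake_netdev *dev)
{
	(void)dev;
	return 0;
}

/* Wrapper that takes the lock itself; unsafe to call with it held. */
static int fake_free_vectors(struct fake_netdev *dev)
{
	int ret = pthread_mutex_lock(&dev->lock);

	if (ret)
		return ret;	/* EDEADLK when this thread owns the lock */

	ret = fake_free_vectors_locked(dev);
	pthread_mutex_unlock(&dev->lock);
	return ret;
}

/*
 * Error path of a *_locked function. use_wrapper != 0 selects the
 * variant that calls the lock-taking wrapper while the lock is held.
 */
static int fake_cfg_unroll_locked(struct fake_netdev *dev, int use_wrapper)
{
	if (use_wrapper)
		return fake_free_vectors(dev);

	return fake_free_vectors_locked(dev);
}

/* Unlocked entry point: takes the lock, then runs the *_locked body. */
static int fake_cfg(struct fake_netdev *dev, int use_wrapper)
{
	int ret = pthread_mutex_lock(&dev->lock);

	if (ret)
		return ret;

	ret = fake_cfg_unroll_locked(dev, use_wrapper);
	pthread_mutex_unlock(&dev->lock);
	return ret;
}
```

With a device set up by fake_netdev_init(), fake_cfg(&dev, 0) goes through the _locked cleanup and returns 0, while fake_cfg(&dev, 1) reports EDEADLK from the nested lock attempt. Whether the real error paths behave this way depends on what ice_vsi_free_q_vectors() and ice_vsi_decfg() do in the patch, which this sketch does not model.
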
Tony also fed the series to AI, two times, and each time he got a
different answer.
The series was on iwl-next for 1.5 months and only one bug was reported,
which I fixed immediately.
I can take a look into this, but wouldn't it be better if we take the
series now and then have 2 months to fix bugs if any appear?
Thanks,
Olek