From: Przemek Kitszel <przemyslaw.kitszel@intel.com>
To: Petr Oros <poros@redhat.com>, <netdev@vger.kernel.org>
Cc: <ivecera@redhat.com>, <aleksandr.loktionov@intel.com>,
<shaojijie@huawei.com>, Jacob Keller <jacob.e.keller@intel.com>,
Tony Nguyen <anthony.l.nguyen@intel.com>,
Andrew Lunn <andrew+netdev@lunn.ch>,
"David S. Miller" <davem@davemloft.net>,
Eric Dumazet <edumazet@google.com>,
"Jakub Kicinski" <kuba@kernel.org>,
Paolo Abeni <pabeni@redhat.com>,
"Stanislav Fomichev" <sdf@fomichev.me>,
<intel-wired-lan@lists.osuosl.org>,
<linux-kernel@vger.kernel.org>
Subject: Re: [Intel-wired-lan] [PATCH net v3] iavf: fix incorrect reset handling in callbacks
Date: Fri, 13 Feb 2026 11:29:05 +0100 [thread overview]
Message-ID: <0f76a71d-fae5-4b1e-a693-8c55d91be977@intel.com> (raw)
In-Reply-To: <20260211191855.1532226-1-poros@redhat.com>
On 2/11/26 20:18, Petr Oros wrote:
> Three driver callbacks schedule a reset and wait for its completion:
> ndo_change_mtu(), ethtool set_ringparam(), and ethtool set_channels().
>
> Waiting for reset in ndo_change_mtu() and set_ringparam() was added by
> commit c2ed2403f12c ("iavf: Wait for reset in callbacks which trigger
> it") to fix a race condition where adding an interface to bonding
> immediately after MTU or ring parameter change failed because the
> interface was still in __RESETTING state. The same commit also added
> waiting in iavf_set_priv_flags(), which was later removed by commit
> 53844673d555 ("iavf: kill "legacy-rx" for good").
>
> Waiting in set_channels() was introduced earlier by commit 4e5e6b5d9d13
> ("iavf: Fix return of set the new channel count") to ensure the PF has
> enough time to complete the VF reset when changing channel count, and to
> return correct error codes to userspace.
>
> Commit ef490bbb2267 ("iavf: Add net_shaper_ops support") added
> net_shaper_ops to iavf, which required reset_task to use _locked NAPI
> variants (napi_enable_locked, napi_disable_locked) that need the netdev
> instance lock.
>
> Later, commit 7e4d784f5810 ("net: hold netdev instance lock during
> rtnetlink operations") and commit 2bcf4772e45a ("net: ethtool: try to
> protect all callback with netdev instance lock") started holding the
> netdev instance lock during ndo and ethtool callbacks for drivers with
> net_shaper_ops.
>
> Finally, commit 120f28a6f314 ("iavf: get rid of the crit lock")
> replaced the driver's crit_lock with netdev_lock in reset_task, causing
> incorrect behavior: the callback holds netdev_lock and waits for
> reset_task, but reset_task needs the same lock:
>
> Thread 1 (callback) Thread 2 (reset_task)
> ------------------- ---------------------
> netdev_lock() [blocked on workqueue]
> ndo_change_mtu() or ethtool op
> iavf_schedule_reset()
> iavf_wait_for_reset() iavf_reset_task()
> waiting... netdev_lock() <- blocked
>
> This does not strictly deadlock because iavf_wait_for_reset() uses
> wait_event_interruptible_timeout() with a 5-second timeout. The wait
> eventually times out, the callback returns an error to userspace, and
> after the lock is released reset_task completes the reset. This leads to
> incorrect behavior: userspace sees an error even though the configuration
> change silently takes effect after the timeout.
>
> Fix this by extracting the reset logic from iavf_reset_task() into a new
> iavf_reset_step() function that expects netdev_lock to be already held.
> The three callbacks now call iavf_reset_step() directly instead of
> scheduling the work and waiting, performing the reset synchronously in
> the caller's context which already holds netdev_lock. This eliminates
> both the incorrect error reporting and the need for
> iavf_wait_for_reset(), which is removed along with the now-unused
> reset_waitqueue.
>
> The workqueue-based iavf_reset_task() becomes a thin wrapper that
> acquires netdev_lock and calls iavf_reset_step(), preserving its use
> for PF-initiated resets.
>
> The callbacks may block for several seconds while iavf_reset_step()
> polls hardware registers, but this is acceptable since netdev_lock is a
> per-device mutex and only serializes operations on the same interface.
>
> v3:
> - Remove netif_running() guard from iavf_set_channels(). Unlike
> set_ringparam where descriptor counts are picked up by iavf_open()
> directly, num_req_queues is only consumed during
> iavf_reinit_interrupt_scheme() in the reset path. Skipping the reset
> on a down device would silently discard the channel count change.
> - Remove dead reset_waitqueue code (struct field, init, and all
> wake_up calls) since iavf_wait_for_reset() was the only consumer.
>
> Fixes: 120f28a6f314 ("iavf: get rid of the crit lock")
> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
> Signed-off-by: Petr Oros <poros@redhat.com>
> ---
> drivers/net/ethernet/intel/iavf/iavf.h | 3 +-
> .../net/ethernet/intel/iavf/iavf_ethtool.c | 19 ++---
> drivers/net/ethernet/intel/iavf/iavf_main.c | 77 ++++++-------------
> .../net/ethernet/intel/iavf/iavf_virtchnl.c | 1 -
> 4 files changed, 31 insertions(+), 69 deletions(-)
>
thank you for improving on your v2,
next time please post as top-level (instead of In-reply-to), to get more
traction earlier
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
next prev parent reply other threads:[~2026-02-13 10:29 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-02-07 10:22 [Intel-wired-lan] [PATCH net v2] iavf: fix deadlock in reset handling Petr Oros
2026-02-09 7:15 ` Loktionov, Aleksandr
2026-02-09 23:59 ` Jacob Keller
2026-02-10 14:58 ` Przemek Kitszel
2026-02-11 11:50 ` Paolo Abeni
2026-02-11 18:50 ` Jacob Keller
2026-02-11 19:18 ` [Intel-wired-lan] [PATCH net v3] iavf: fix incorrect reset handling in callbacks Petr Oros
2026-02-13 10:29 ` Przemek Kitszel [this message]
2026-02-23 16:10 ` Romanowski, Rafal
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=0f76a71d-fae5-4b1e-a693-8c55d91be977@intel.com \
--to=przemyslaw.kitszel@intel.com \
--cc=aleksandr.loktionov@intel.com \
--cc=andrew+netdev@lunn.ch \
--cc=anthony.l.nguyen@intel.com \
--cc=davem@davemloft.net \
--cc=edumazet@google.com \
--cc=intel-wired-lan@lists.osuosl.org \
--cc=ivecera@redhat.com \
--cc=jacob.e.keller@intel.com \
--cc=kuba@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=poros@redhat.com \
--cc=sdf@fomichev.me \
--cc=shaojijie@huawei.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox