Date: Mon, 6 Apr 2026 21:29:50 +0900
From: Kohei Enju
To: Jose Ignacio Tornos Martinez
Cc: netdev@vger.kernel.org, intel-wired-lan@lists.osuosl.org,
    jesse.brandeburg@intel.com, anthony.l.nguyen@intel.com,
    davem@davemloft.net, edumazet@google.com, kuba@kernel.org,
    pabeni@redhat.com, stable@vger.kernel.org
Subject: Re: [PATCH net 3/3] iavf: drop netdev lock while waiting for MAC change completion
References: <20260406112057.906685-1-jtornosm@redhat.com>
    <20260406112057.906685-4-jtornosm@redhat.com>
In-Reply-To: <20260406112057.906685-4-jtornosm@redhat.com>
X-Mailing-List: netdev@vger.kernel.org

On 04/06 13:20, Jose Ignacio Tornos Martinez wrote:
> After commit ad7c7b2172c3 ("net: hold netdev instance lock during sysfs
> operations"), iavf_set_mac() is called with the netdev instance lock
> already held.
>
> The function queues a MAC address change request and then waits for
> completion while holding this lock. However, the watchdog task that
> processes admin queue commands (including MAC changes) also needs to
> acquire the netdev lock to run.
>
> This creates a lock contention scenario:
> 1. iavf_set_mac() holds netdev lock and waits for MAC change
> 2. Watchdog needs netdev lock to process the MAC change request
> 3. Watchdog blocks waiting for lock
> 4. MAC change times out after 2.5 seconds
> 5. iavf_set_mac() returns -EAGAIN
>
> This particularly affects VFs during initialization when enslaved to a
> bond. The first VF typically succeeds as it's already fully initialized,
> but subsequent VFs fail as they're still progressing through their state
> machine and need the watchdog to advance.
>
> Fix by temporarily dropping the netdev lock before waiting for MAC change
> completion, allowing the watchdog to run and process the request, then
> re-acquiring the lock before returning.
>
> This is safe because:
> - The MAC change request is already queued before we drop the lock
> - iavf_is_mac_set_handled() just checks filter state, doesn't modify it
> - We re-acquire the lock before checking results and returning
>
> Fixes: ad7c7b2172c3 ("net: hold netdev instance lock during sysfs operations")
> cc: stable@vger.kernel.org
> Signed-off-by: Jose Ignacio Tornos Martinez
> ---
>  drivers/net/ethernet/intel/iavf/iavf_main.c | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/drivers/net/ethernet/intel/iavf/iavf_main.c b/drivers/net/ethernet/intel/iavf/iavf_main.c
> index dad001abc908..6281858e6f3c 100644
> --- a/drivers/net/ethernet/intel/iavf/iavf_main.c
> +++ b/drivers/net/ethernet/intel/iavf/iavf_main.c
> @@ -1068,10 +1068,14 @@ static int iavf_set_mac(struct net_device *netdev, void *p)
>  	if (ret)
>  		return ret;
>
> +	netdev_unlock(netdev);
> +
>  	ret = wait_event_interruptible_timeout(adapter->vc_waitqueue,
>  					       iavf_is_mac_set_handled(netdev, addr->sa_data),
>  					       msecs_to_jiffies(2500));
>
> +	netdev_lock(netdev);
> +

Hi Jose, thank you for the fix and detailed explanation.

I don't have a great solution for this issue, but dropping the netdev
lock taken by the networking core in the driver callback might not look
acceptable.

FYI, Petr reported the same type of locking issue in ndo_change_mtu(),
and the v1 approach was really similar to this one.
https://lore.kernel.org/intel-wired-lan/20260202155813.3f8fbc27@kernel.org/

IIUC, the issue was eventually fixed by completing the reset
synchronously in the same context as ndo_change_mtu(), instead of
dropping the netdev lock and waiting for reset_task.
https://lore.kernel.org/intel-wired-lan/20260211191855.1532226-1-poros@redhat.com/

If that applies here as well, maybe iavf_set_mac() needs a similar
approach, e.g. progressing the relevant virtchnl request/completion
synchronously with the netdev lock held, rather than dropping the lock
here?

>  	/* If ret < 0 then it means wait was interrupted.
>  	 * If ret == 0 then it means we got a timeout.
>  	 * else it means we got response for set MAC from PF,
> --
> 2.53.0
>