Message-ID: <1e28083b-691a-480b-bcf6-cd57222e08a1@redhat.com>
Date: Mon, 2 Feb 2026 14:30:52 +0100
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
To: Petr Oros, netdev@vger.kernel.org
References: <20260202084820.260033-1-poros@redhat.com>
Content-Language: en-US
From: Ivan Vecera
In-Reply-To: <20260202084820.260033-1-poros@redhat.com>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [Intel-wired-lan] [PATCH net] iavf: fix deadlock in reset handling
List-Id: Intel Wired Ethernet Linux Kernel Driver Development
Cc: Przemek Kitszel, linux-kernel@vger.kernel.org, Andrew Lunn,
 Eric Dumazet, Stanislav Fomichev, Tony Nguyen,
 intel-wired-lan@lists.osuosl.org, Jakub Kicinski, Paolo Abeni,
 "David S. Miller"

On 2/2/26 9:48 AM, Petr Oros wrote:
> Three driver callbacks schedule a reset and wait for its completion:
> ndo_change_mtu(), ethtool set_ringparam(), and ethtool set_channels().
>
> Waiting for reset in ndo_change_mtu() and set_ringparam() was added by
> commit c2ed2403f12c74 ("iavf: Wait for reset in callbacks which trigger
> it") to fix a race condition where adding an interface to bonding
> immediately after MTU or ring parameter change failed because the
> interface was still in __RESETTING state. The same commit also added
> waiting in iavf_set_priv_flags(), which was later removed by commit
> 53844673d55529 ("iavf: kill "legacy-rx" for good").
>
> Waiting in set_channels() was introduced earlier by commit 4e5e6b5d9d13
> ("iavf: Fix return of set the new channel count") to ensure the PF has
> enough time to complete the VF reset when changing channel count, and to
> return correct error codes to userspace.
>
> Commit ef490bbb226702 ("iavf: Add net_shaper_ops support") added
> net_shaper_ops to iavf, which required reset_task to use _locked NAPI
> variants (napi_enable_locked, napi_disable_locked) that need the netdev
> instance lock.
>
> Later, commit 7e4d784f5810 ("net: hold netdev instance lock during
> rtnetlink operations") and commit 2bcf4772e45adb ("net: ethtool: try to
> protect all callback with netdev instance lock") started holding the
> netdev instance lock during ndo and ethtool callbacks for drivers with
> net_shaper_ops.
>
> The combination of waiting for reset and the new locking requirements
> creates a deadlock: the callback holds the lock and waits for reset_task,
> but reset_task is blocked waiting for the same lock:
>
>   Thread 1 (callback)                     Thread 2 (reset_task)
>   -------------------                     ---------------------
>   netdev_lock()
>   ndo_change_mtu() or ethtool op
>     iavf_schedule_reset()
>     iavf_wait_for_reset()                 iavf_reset_task()
>       waiting...                            netdev_lock() <- DEADLOCK
>
> Reproducer:
>
>   # echo 16 > /sys/class/net/$PF/device/sriov_numvfs
>   # ip link set $VF up
>   # ip link set $VF mtu 5000
>   RTNETLINK answers: Device or resource busy
>
>   # dmesg | tail -1
>   iavf: MTU change interrupted waiting for reset
>
> Fix this by temporarily releasing the lock while waiting for reset to
> complete. This follows the pattern used elsewhere in the kernel (e.g.,
> do_set_master() releases rtnl_lock before calling ndo_add_slave()).
>
> Fixes: 7e4d784f5810 ("net: hold netdev instance lock during rtnetlink operations")
> Signed-off-by: Petr Oros
> ---
>  drivers/net/ethernet/intel/iavf/iavf_main.c | 15 ++++++++++++---
>  1 file changed, 12 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/net/ethernet/intel/iavf/iavf_main.c b/drivers/net/ethernet/intel/iavf/iavf_main.c
> index 8aa6e92c16431f..d7738fb8fa60bc 100644
> --- a/drivers/net/ethernet/intel/iavf/iavf_main.c
> +++ b/drivers/net/ethernet/intel/iavf/iavf_main.c
> @@ -189,13 +189,22 @@ static bool iavf_is_reset_in_progress(struct iavf_adapter *adapter)
>    * iavf_wait_for_reset - Wait for reset to finish.
>    * @adapter: board private structure
>    *
> + * The iavf driver selects NET_SHAPER, so callbacks that trigger reset are
> + * always called with netdev instance lock held, while reset_task also needs
> + * this lock. Release the lock while waiting to avoid deadlock.
> + *
>    * Returns 0 if reset finished successfully, negative on timeout or interrupt.
>    */
>   int iavf_wait_for_reset(struct iavf_adapter *adapter)
>   {
> -	int ret = wait_event_interruptible_timeout(adapter->reset_waitqueue,
> -					!iavf_is_reset_in_progress(adapter),
> -					msecs_to_jiffies(5000));
> +	struct net_device *netdev = adapter->netdev;
> +	int ret;
> +
> +	netdev_unlock(netdev);
> +	ret = wait_event_interruptible_timeout(adapter->reset_waitqueue,
> +					       !iavf_is_reset_in_progress(adapter),
> +					       msecs_to_jiffies(5000));
> +	netdev_lock(netdev);
>
>   	/* If ret < 0 then it means wait was interrupted.
>   	 * If ret == 0 then it means we got a timeout while waiting

Reviewed-by: Ivan Vecera
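
For readers who want to see the locking pattern outside the driver, here is
a minimal userspace sketch using pthreads. It is only an analogy and not the
iavf code: every name in it (demo_adapter, demo_reset_task,
demo_wait_for_reset, demo_change_mtu) is hypothetical, instance_lock stands
in for the netdev instance lock, and a condition variable stands in for
reset_waitqueue. The 200 ms timeout is an arbitrary choice for the demo.

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical stand-in for the adapter state; not the driver's struct. */
struct demo_adapter {
	pthread_mutex_t instance_lock;	/* plays the netdev instance lock */
	pthread_mutex_t wait_lock;	/* protects reset_in_progress */
	pthread_cond_t reset_cond;	/* plays reset_waitqueue */
	bool reset_in_progress;
};

/* Like reset_task, the worker needs the instance lock to finish the reset. */
static void *demo_reset_task(void *arg)
{
	struct demo_adapter *a = arg;

	pthread_mutex_lock(&a->instance_lock);
	pthread_mutex_lock(&a->wait_lock);
	a->reset_in_progress = false;
	pthread_cond_broadcast(&a->reset_cond);
	pthread_mutex_unlock(&a->wait_lock);
	pthread_mutex_unlock(&a->instance_lock);
	return NULL;
}

/* Caller holds instance_lock. With release_lock set, the lock is dropped
 * for the duration of the wait and re-taken before returning.
 * Returns 0 if the reset finished, -EBUSY on timeout.
 */
static int demo_wait_for_reset(struct demo_adapter *a, bool release_lock,
			       long timeout_ms)
{
	struct timespec deadline;
	int ret;

	timespec_get(&deadline, TIME_UTC);
	deadline.tv_sec += timeout_ms / 1000;
	deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
	if (deadline.tv_nsec >= 1000000000L) {
		deadline.tv_sec++;
		deadline.tv_nsec -= 1000000000L;
	}

	if (release_lock)
		pthread_mutex_unlock(&a->instance_lock);

	pthread_mutex_lock(&a->wait_lock);
	while (a->reset_in_progress) {
		if (pthread_cond_timedwait(&a->reset_cond, &a->wait_lock,
					   &deadline) == ETIMEDOUT)
			break;
	}
	ret = a->reset_in_progress ? -EBUSY : 0;
	pthread_mutex_unlock(&a->wait_lock);

	if (release_lock)
		pthread_mutex_lock(&a->instance_lock);
	return ret;
}

/* Models a callback that is entered with the lock held, schedules a reset
 * and waits for it, as ndo_change_mtu() does in the description above.
 */
static int demo_change_mtu(bool release_lock)
{
	struct demo_adapter a = { .reset_in_progress = true };
	pthread_t worker;
	int ret;

	pthread_mutex_init(&a.instance_lock, NULL);
	pthread_mutex_init(&a.wait_lock, NULL);
	pthread_cond_init(&a.reset_cond, NULL);

	pthread_mutex_lock(&a.instance_lock);	/* core takes the lock */
	pthread_create(&worker, NULL, demo_reset_task, &a);
	ret = demo_wait_for_reset(&a, release_lock, 200);
	pthread_mutex_unlock(&a.instance_lock);	/* core drops the lock */

	pthread_join(worker, NULL);
	pthread_cond_destroy(&a.reset_cond);
	pthread_mutex_destroy(&a.wait_lock);
	pthread_mutex_destroy(&a.instance_lock);
	return ret;
}
```

In this sketch demo_change_mtu(false) models the behaviour before the patch:
the worker cannot take instance_lock until the wait has timed out, so the
call returns -EBUSY, which matches the "Device or resource busy" seen in the
reproducer. demo_change_mtu(true) models the patched behaviour and returns 0
once the worker has run.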