From: Simon Horman
To: Ian Ray
Cc: Tony Nguyen, Przemek Kitszel, Andrew Lunn, "David S. Miller",
    Eric Dumazet, Jakub Kicinski, Paolo Abeni,
    brian.ruley@gehealthcare.com, intel-wired-lan@lists.osuosl.org,
    netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
    Toke Høiland-Jørgensen
Date: Tue, 29 Apr 2025 16:20:21 +0100
Subject: Re: [Intel-wired-lan] [PATCH] igb: Fix watchdog_task race with shutdown
Message-ID: <20250429152021.GP3339421@horms.kernel.org>
In-Reply-To: <20250428115450.639-1-ian.ray@gehealthcare.com>
References: <20250428115450.639-1-ian.ray@gehealthcare.com>
List-Id: Intel Wired Ethernet Linux Kernel Driver Development

+ Toke

On Mon, Apr 28, 2025 at 02:54:49PM +0300, Ian Ray wrote:
> A rare [1] race condition is observed between the igb_watchdog_task and
> shutdown on a dual-core i.MX6 based system with two I210 controllers.
>
> Using printk, the igb_watchdog_task is hung in igb_read_phy_reg because
> __igb_shutdown has already called __igb_close.
>
> Fix this by locking in igb_watchdog_task (in the same way as is done in
> igb_reset_task).
>
>   reboot             kworker
>
>   __igb_shutdown
>     rtnl_lock
>     __igb_close
>     :                igb_watchdog_task
>     :                :
>     :                igb_read_phy_reg (hung)
>     rtnl_unlock
>
> [1] Note that this is easier to reproduce with 'initcall_debug' logging
> and additional and printk logging in igb_main.
>
> Signed-off-by: Ian Ray

Hi Ian,

Thanks for your patch.

While I think that the simplicity of this approach may well be appropriate
as a fix for the problem described, I do have a concern.

I am worried that taking RTNL each time the watchdog task runs will create
unnecessary lock contention. That may manifest in weird and wonderful
ways in future. Maybe this patch doesn't make things materially worse in
that regard. But it would be nice to have a plan to move away from using
RTNL, as is happening elsewhere.

...
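
For illustration only, the serialization discussed above can be modelled in
userspace. This is a sketch under stated assumptions, not the igb code and not
the patch: struct fake_adapter, fake_shutdown(), fake_watchdog() and
reads_after_shutdown() are hypothetical names, a pthread mutex stands in for
RTNL, and a counter stands in for the PHY register read. The point it tries to
show is that once the watchdog takes the same lock as the shutdown path and
checks a "down" flag under it, it can no longer touch the device after close.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdbool.h>

/* Hypothetical stand-in for per-adapter driver state (not struct igb_adapter). */
struct fake_adapter {
	pthread_mutex_t lock;	/* plays the role of RTNL in this sketch */
	bool down;		/* set once the device has been closed */
	int phy_reads;		/* counts simulated PHY register reads */
};

/* Simulated shutdown path: mark the device closed while holding the lock. */
static void fake_shutdown(struct fake_adapter *a)
{
	pthread_mutex_lock(&a->lock);
	a->down = true;
	pthread_mutex_unlock(&a->lock);
}

/*
 * Simulated watchdog: take the same lock, bail out if the device is down,
 * otherwise do the simulated PHY read. Returns true if the read happened.
 */
static bool fake_watchdog(struct fake_adapter *a)
{
	bool ran = false;

	pthread_mutex_lock(&a->lock);
	if (!a->down) {
		a->phy_reads++;
		ran = true;
	}
	pthread_mutex_unlock(&a->lock);
	return ran;
}

/* Worker thread: keep running the watchdog until it sees the device down. */
static void *watchdog_thread(void *arg)
{
	struct fake_adapter *a = arg;

	while (fake_watchdog(a))
		sched_yield();
	return NULL;
}

/*
 * Race the watchdog against shutdown and return how many simulated PHY
 * reads happened after shutdown returned, or -1 if the thread could not
 * be created.
 */
int reads_after_shutdown(void)
{
	struct fake_adapter a = { .down = false, .phy_reads = 0 };
	pthread_t tid;
	int snapshot, after;

	pthread_mutex_init(&a.lock, NULL);
	if (pthread_create(&tid, NULL, watchdog_thread, &a) != 0) {
		pthread_mutex_destroy(&a.lock);
		return -1;
	}

	fake_shutdown(&a);

	/* Reads counted up to this point all happened before the close. */
	pthread_mutex_lock(&a.lock);
	snapshot = a.phy_reads;
	pthread_mutex_unlock(&a.lock);

	pthread_join(tid, NULL);
	after = a.phy_reads - snapshot;
	pthread_mutex_destroy(&a.lock);
	return after;
}
```

In this model reads_after_shutdown() should return 0 however the two threads
interleave, because the flag is only written and read under the lock. The
mutex here is per adapter rather than global; whether a narrower lock of that
kind would suit igb, and so avoid the RTNL contention raised above, is an open
question that this sketch does not answer.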