Date: Fri, 30 May 2025 18:31:40 -0700
From: Jakub Kicinski
To: Joe Damato
Cc: Stanislav Fomichev, netdev@vger.kernel.org, john.cs.hey@gmail.com,
 jacob.e.keller@intel.com,
 syzbot+846bb38dc67fe62cc733@syzkaller.appspotmail.com, Tony Nguyen,
 Przemek Kitszel, Andrew Lunn, "David S. Miller", Eric Dumazet,
 Paolo Abeni, "moderated list:INTEL ETHERNET DRIVERS", open list
Subject: Re: [PATCH iwl-net] e1000: Move cancel_work_sync to avoid deadlock
Message-ID: <20250530183140.6cfad3ae@kernel.org>
In-Reply-To:
References: <20250530014949.215112-1-jdamato@fastly.com>

On Fri, 30 May 2025 12:45:13 -0700 Joe Damato wrote:
> > nit: as Jakub mentioned in another thread, it seems more about the
> > flush_work waiting for the reset_task to complete rather than
> > wq mutexes (which are fake)?
>
> Hm, I probably misunderstood something. Also, not sure what you
> meant by the wq mutexes being fake?
>
> My understanding (which is prob wrong) from the syzbot and user
> report was that the order of wq mutex and rtnl are inverted in the
> two paths, which can cause a deadlock if both paths run.

Take a look at touch_work_lockdep_map(), there's no such thing as a wq
mutex.
It's just a lockdep "annotation" that helps lockdep connect the dots
between the waiting thread and the work item, not a real mutex.

So the commit msg may be better phrased like this (modulo the lines in
front):

CPU 0:
 ,   - RTNL is held
 /     - e1000_close
 |       - e1000_down
 +-        - cancel_work_sync (cancel / wait for e1000_reset_task())
 |
 | CPU 1:
 |  - process_one_work
 \    - e1000_reset_task
  `-   - take RTNL
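
To make the diagram concrete, here is a minimal C sketch of the two
paths it describes; fake_adapter, fake_close() and fake_reset_task()
are made-up stand-ins for illustration, not the actual e1000 code:

/* Simplified illustration of the deadlock shape in the diagram above. */
#include <linux/workqueue.h>
#include <linux/rtnetlink.h>

struct fake_adapter {
        struct work_struct reset_task;
};

/* CPU 1: runs from process_one_work() and needs RTNL to do the reset. */
static void fake_reset_task(struct work_struct *work)
{
        struct fake_adapter *adapter =
                container_of(work, struct fake_adapter, reset_task);

        rtnl_lock();            /* blocks for as long as CPU 0 holds RTNL */
        /* ... reinitialize adapter's hardware here ... */
        rtnl_unlock();
        (void)adapter;          /* unused in this stripped-down sketch */
}

/* CPU 0: close path, entered with RTNL already held. */
static void fake_close(struct fake_adapter *adapter)
{
        /*
         * Waits for fake_reset_task() to finish if it is already running,
         * but that task is itself waiting for the RTNL this thread holds,
         * so neither side can make progress.
         */
        cancel_work_sync(&adapter->reset_task);
}

With both paths running, CPU 0 sleeps in cancel_work_sync() while CPU 1
sleeps in rtnl_lock(), which is the deadlock the patch avoids by moving
cancel_work_sync() out of the RTNL-protected section.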