From mboxrd@z Thu Jan 1 00:00:00 1970 Authentication-Results: smtp.subspace.kernel.org; dkim=none X-Greylist: delayed 761 seconds by postgrey-1.37 at lindbergh.monkeyblade.net; Wed, 29 Nov 2023 03:30:23 PST Received: from bmailout3.hostsharing.net (bmailout3.hostsharing.net [176.9.242.62]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E9927DA; Wed, 29 Nov 2023 03:30:23 -0800 (PST) Received: from h08.hostsharing.net (h08.hostsharing.net [83.223.95.28]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256 client-signature RSA-PSS (4096 bits) client-digest SHA256) (Client CN "*.hostsharing.net", Issuer "RapidSSL Global TLS RSA4096 SHA256 2022 CA1" (verified OK)) by bmailout3.hostsharing.net (Postfix) with ESMTPS id 4AE99100D9438; Wed, 29 Nov 2023 12:30:22 +0100 (CET) Received: by h08.hostsharing.net (Postfix, from userid 100393) id 24486372DE; Wed, 29 Nov 2023 12:30:22 +0100 (CET) Date: Wed, 29 Nov 2023 12:30:22 +0100 From: Lukas Wunner To: Shinichiro Kawasaki Cc: Heiner Kallweit , Andy Shevchenko , Keith Busch , Wolfram Sang , Jean Delvare , "linux-pci@vger.kernel.org" , "linux-i2c@vger.kernel.org" , Bjorn Helgaas Subject: Re: [bug report] lockdep WARN at PCI device rescan Message-ID: <20231129113022.GB14152@wunner.de> References: <6xb24fjmptxxn5js2fjrrddjae6twex5bjaftwqsuawuqqqydx@7cl3uik5ef6j> <20231114155701.GA27547@wunner.de> Precedence: bulk X-Mailing-List: linux-i2c@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.10.1 (2018-07-13) On Tue, Nov 28, 2023 at 10:16:28AM +0000, Shinichiro Kawasaki wrote: > Here are my three observations: > > A) pci drivers should be able to call p2sb_bar() in probe() without failure. > B) when /sys/bus/pci/rescan is written, pci_rescan_remove_lock is locked then > probe() is called. > C) p2sb_bar() locks pci_rescan_remove_lock. > > These results in the deadlock. To avoid the deadlock, one of the three needs > to change. A) is not to change. I guess changing B) will be too much. So, I > would like to question if we can change C). It is possible to allow recursive acquisition of a mutex by doing two things: * You need to store a pointer to the task_struct which is holding the lock. This allows you to identify upon a recursive acquisition that you're already holding the lock. The acquire operation becomes a no-op in this case. * You need a counter of how many times you've acquired the lock recursively. This allows you to determine upon lock release whether the release operation should be a no-op (due to previous recursive acquisition) or whether it should result in actual lock release (no previous recursive locking, or recursive locking has ended). Actually struct mutex already stores the owner of the lock, but that's only available internally. While it would be possible to allow recursive acquisition of pci_rescan_remove_lock in this way, doing so merely because of a vendor-specific platform quirk will likely be considered dodgy by the upstream community. So Andy's proposal to stash the struct resource on affected platforms seems more viable from an upstream acceptability point of view. Thanks, Lukas