From: Jason Gunthorpe <jgg@nvidia.com>
To: Joerg Roedel <jroedel@suse.de>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>,
iommu@lists.linux.dev, Joerg Roedel <joro@8bytes.org>,
Len Brown <lenb@kernel.org>,
linux-acpi@vger.kernel.org,
"Rafael J. Wysocki" <rafael@kernel.org>,
Robin Murphy <robin.murphy@arm.com>,
Will Deacon <will@kernel.org>,
Lu Baolu <baolu.lu@linux.intel.com>,
Kevin Tian <kevin.tian@intel.com>,
Chen-Yu Tsai <wenst@chromium.org>
Subject: Re: [PATCH v2 0/4] Fix device_lock deadlock on two probe() paths
Date: Fri, 18 Aug 2023 16:18:35 -0300
Message-ID: <ZN/EC4ULF84CZUmh@nvidia.com>
In-Reply-To: <ZN-3Qfp3CyNiwJBK@suse.de>
On Fri, Aug 18, 2023 at 08:24:01PM +0200, Joerg Roedel wrote:
> On Fri, Aug 18, 2023 at 01:06:43PM -0300, Jason Gunthorpe wrote:
> > On Fri, Aug 18, 2023 at 05:56:20PM +0200, Joerg Roedel wrote:
> > > On Thu, Aug 17, 2023 at 03:33:16PM -0300, Jason Gunthorpe wrote:
> > > > Basically.. Yikes!
> > >
> > > Hmm, that is a difficult situation. Even if the problem is a misuse of
> > > the APIs we can not just blindly break other drivers by our core
> > > changes.
> >
> > They are not broken, they just throw a lockdep warning and keep going
> > as before. This is what triggers:
> >
> > static inline void device_lock_assert(struct device *dev)
> > {
> > 	lockdep_assert_held(&dev->mutex);
> > }
> >
> > So non-debug builds won't even see anything.
>
> But this still means that a function is called without holding the
> proper lock.
It has always been like that. of_dma_configure_id() always required the
device_lock to be held!
As one example:
  of_iommu_configure
   of_iommu_configure_device
    of_iommu_configure_dev
     of_iommu_xlate
      iommu_fwspec_init
       dev_iommu_get
It is subtle, but the device_lock is what protects the store to
dev->iommu inside dev_iommu_get(). In v6.5-rc1 many callers held the
device lock here, and after this series only these broken drivers
don't.
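To illustrate the rule, here is a minimal userspace sketch, not the
kernel code. Everything prefixed fake_ is a hypothetical stand-in: a
pthread mutex plays the role of dev->mutex, a plain flag is a crude
substitute for lockdep's tracking, and assert() aborts where the kernel
would only warn.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-device iommu data. */
struct fake_iommu {
	int placeholder;
};

/* Hypothetical stand-in for struct device. */
struct fake_device {
	pthread_mutex_t mutex;		/* plays the role of dev->mutex */
	bool mutex_held;		/* crude substitute for lockdep tracking */
	struct fake_iommu *iommu;	/* plays the role of dev->iommu */
};

static void fake_device_init(struct fake_device *dev)
{
	pthread_mutex_init(&dev->mutex, NULL);
	dev->mutex_held = false;
	dev->iommu = NULL;
}

static void fake_device_lock(struct fake_device *dev)
{
	pthread_mutex_lock(&dev->mutex);
	dev->mutex_held = true;
}

static void fake_device_unlock(struct fake_device *dev)
{
	dev->mutex_held = false;
	pthread_mutex_unlock(&dev->mutex);
}

/*
 * Lazily allocate dev->iommu. The check-then-store below is only safe
 * if every caller holds the device mutex, so the requirement is
 * asserted on entry.
 */
static struct fake_iommu *fake_dev_iommu_get(struct fake_device *dev)
{
	assert(dev->mutex_held);

	if (!dev->iommu)
		dev->iommu = calloc(1, sizeof(*dev->iommu));
	return dev->iommu;
}
```

A caller that takes fake_device_lock() around fake_dev_iommu_get() gets
the same pointer back on every call. A caller that skips the lock trips
the assertion even when nothing else is touching the device at the time.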
The driver assumes it has exclusive use of the platform device it
steals. Not just for this call, but in general it does other stuff
that rests on this assumption. Since it is exclusive it doesn't
actually need any locking - this is why it works reliably as is today.
Again, there is no practical bug here, the driver works fine. On
non-debug kernels there is no warning or functional issue. Debug
kernels rightly highlight that the API is being used wrong.
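As a rough userspace sketch of why only debug builds notice: the check
below compiles to nothing unless a debug option is defined, much like
lockdep_assert_held() without lockdep. FAKE_DEBUG_LOCKING, fake_warnings
and fake_configure() are hypothetical names made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Counts how many times the (hypothetical) debug check has fired. */
static int fake_warnings;

#ifdef FAKE_DEBUG_LOCKING
/* Debug build: record a warning, then carry on as before. */
#define fake_lock_assert_held(held) \
	do { if (!(held)) fake_warnings++; } while (0)
#else
/* Non-debug build: the check evaluates its argument and does nothing. */
#define fake_lock_assert_held(held) \
	do { (void)(held); } while (0)
#endif

/*
 * Stand-in for a function that requires the caller to hold a lock.
 * The work itself proceeds identically whether or not the check fires.
 */
static int fake_configure(bool lock_held)
{
	fake_lock_assert_held(lock_held);
	return 0;
}
```

Built without -DFAKE_DEBUG_LOCKING, calling fake_configure(false)
returns 0 and leaves fake_warnings untouched. Built with it, the same
call still returns 0 but the counter goes up.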
It is a wrong use of the APIs because someone could go and use sysfs to
attach, say, VFIO to the stolen platform_device and cause all kinds of
kernel problems.
> I can't send anything with known problems upstream.
In my view this is not creating a new problem. It is exposing existing
problems with a debugging message only on debug kernels.
However, if you view the debug message as a problem then I suggest we
simply comment out the device_lock_assert() in the iommu code, with a
note. This would be sad because the vast majority of systems don't use
these badly designed drivers.
This puts it back to the v6.5-rc1 behavior where of_dma_configure_id()
won't print anything if it is called with the wrong locking.
Please let me know.
Jason
Thread overview: 22+ messages
[not found] <CGME20230809144403eucas1p1345aec6ec34440f1794594426e0402ab@eucas1p1.samsung.com>
2023-08-09 14:43 ` [PATCH v2 0/4] Fix device_lock deadlock on two probe() paths Jason Gunthorpe
2023-08-09 14:43 ` [PATCH v2 1/4] iommu: Provide iommu_probe_device_locked() Jason Gunthorpe
2023-08-09 14:43 ` [PATCH v2 2/4] iommu: Pass in the iommu_device to probe for in bus_iommu_probe() Jason Gunthorpe
2023-08-09 14:43 ` [PATCH v2 3/4] iommu: Do not attempt to re-lock the iommu device when probing Jason Gunthorpe
2023-08-10 2:37 ` Tian, Kevin
2023-08-09 14:43 ` [PATCH v2 4/4] iommu: dev->iommu->iommu_dev must be set before ops->device_group() Jason Gunthorpe
2023-08-10 2:37 ` Tian, Kevin
2023-08-09 15:49 ` [PATCH v2 0/4] Fix device_lock deadlock on two probe() paths Joerg Roedel
2023-08-09 15:55 ` Jason Gunthorpe
2023-08-10 16:15 ` Jeffrey Hugo
2023-08-17 8:31 ` Marek Szyprowski
2023-08-17 18:33 ` Jason Gunthorpe
2023-08-18 15:56 ` Joerg Roedel
2023-08-18 16:06 ` Jason Gunthorpe
2023-08-18 18:00 ` Eric Farman
2023-08-18 18:15 ` Jason Gunthorpe
2023-08-18 18:32 ` Eric Farman
2023-08-18 18:24 ` Joerg Roedel
2023-08-18 18:50 ` Robin Murphy
2023-08-18 19:19 ` Jason Gunthorpe
2023-08-21 11:35 ` Robin Murphy
2023-08-18 19:18 ` Jason Gunthorpe [this message]