public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Brian Norris <briannorris@chromium.org>
To: Joerg Roedel <joro@8bytes.org>, Will Deacon <will@kernel.org>,
	Robin Murphy <robin.murphy@arm.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>,
	Krishna Reddy <vdumpa@nvidia.com>,
	Kevin Tian <kevin.tian@intel.com>,
	Matthew Rosato <mjrosato@linux.ibm.com>,
	Niklas Schnelle <schnelle@linux.ibm.com>,
	Robin Murphy <robin.murphy@arm.com>,
	Joerg Roedel <jroedel@suse.de>,
	iommu@lists.linux.dev, linux-kernel@vger.kernel.org
Subject: 6.1-rc1 regression: bisected to 57365a04c921 iommu: Move bus setup to IOMMU device registration
Date: Wed, 19 Oct 2022 16:25:59 -0700	[thread overview]
Message-ID: <Y1CHh2oM5wyHs06J@google.com> (raw)

Hi all,

I'm testing out asynchronous probe (that's, kernel cmdline
'driver_async_probe=*' or similar), and I've identified a regression in
v6.1-rc1 due to this:

commit 57365a04c92126525a58bf7a1599ddfa832415e9
Author: Robin Murphy <robin.murphy@arm.com>
Date:   Mon Aug 15 17:20:06 2022 +0100

    iommu: Move bus setup to IOMMU device registration

In particular, I'm testing a Rockchip RK3399 system with
'driver_async_probe=rk_iommu', and finding I crash like this:

[    0.180480] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000030
...
[    0.180583] CPU: 2 PID: 49 Comm: kworker/u12:1 Not tainted 6.1.0-rc1 #57
[    0.180593] Hardware name: Google Scarlet (DT)
[    0.180602] Workqueue: events_unbound async_run_entry_fn
[    0.180622] pstate: 20000005 (nzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[    0.180632] pc : dev_iommu_free+0x24/0x54
[    0.180644] lr : __iommu_probe_device+0x110/0x180
...
[    0.180785] Call trace:
[    0.180791]  dev_iommu_free+0x24/0x54
[    0.180800]  __iommu_probe_device+0x110/0x180
[    0.180807]  probe_iommu_group+0x40/0x58
[    0.180816]  bus_for_each_dev+0x8c/0xd8
[    0.180829]  bus_iommu_probe+0x5c/0x2d0
[    0.180840]  iommu_device_register+0xbc/0x104
[    0.180851]  rk_iommu_probe+0x260/0x354
[    0.180861]  platform_probe+0xb4/0xd4
[    0.180872]  really_probe+0xfc/0x284
[    0.180884]  __driver_probe_device+0xc0/0xec
[    0.180894]  driver_probe_device+0x4c/0xd4
[    0.180905]  __driver_attach_async_helper+0x3c/0x60
[    0.180915]  async_run_entry_fn+0x34/0xd4
[    0.180926]  process_one_work+0x1e0/0x3b4
[    0.180936]  worker_thread+0x120/0x404
[    0.180942] [drm] Initialized vgem 1.0.0 20120112 for vgem on minor 0
[    0.180944]  kthread+0xf4/0x14c
[    0.180953]  ret_from_fork+0x10/0x20
[    0.180968] Code: f9000bf3 910003fd f9416c13 f9016c1f (f9401a68) 
[    0.180981] ---[ end trace 0000000000000000 ]---

I find if I revert the above commit (and 29e932295bfa ("iommu: Clean up
bus_set_iommu()"), to keep the reverts clean), things start working again.

I haven't worked out exactly what's going wrong, but the patch looks like it
isn't async safe at all, due to the way each device is poking (without locking)
at the global iommu_buses[].

Is this a regression worth tracking (e.g., potentially reverting commit
57365a04c921)? Or should we just accept that iommu drivers cannot probe
asynchronously (and thus set PROBE_FORCE_SYNCHRONOUS for all iommu drivers)?

In the meantime, I'll see if I can figure out a reasonable non-revert solution
that is async safe, but I wanted to report this now, in case I don't
have much luck or enough time.

Thanks,
Brian

             reply	other threads:[~2022-10-19 23:26 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-10-19 23:25 Brian Norris [this message]
2022-10-20 10:02 ` 6.1-rc1 regression: bisected to 57365a04c921 iommu: Move bus setup to IOMMU device registration Robin Murphy
2022-10-20 16:21   ` Brian Norris

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Y1CHh2oM5wyHs06J@google.com \
    --to=briannorris@chromium.org \
    --cc=iommu@lists.linux.dev \
    --cc=joro@8bytes.org \
    --cc=jroedel@suse.de \
    --cc=kevin.tian@intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=m.szyprowski@samsung.com \
    --cc=mjrosato@linux.ibm.com \
    --cc=robin.murphy@arm.com \
    --cc=schnelle@linux.ibm.com \
    --cc=vdumpa@nvidia.com \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox