public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Jason Gunthorpe <jgg@ziepe.ca>
To: Baolu Lu <baolu.lu@linux.intel.com>
Cc: Zhangfei Gao <zhangfei.gao@linaro.org>,
	Joerg Roedel <joro@8bytes.org>, Will Deacon <will@kernel.org>,
	Robin Murphy <robin.murphy@arm.com>,
	Kevin Tian <kevin.tian@intel.com>,
	Fenghua Yu <fenghua.yu@intel.com>,
	Dave Jiang <dave.jiang@intel.com>, Vinod Koul <vkoul@kernel.org>,
	Zhou Wang <wangzhou1@hisilicon.com>,
	iommu@lists.linux.dev, linux-kernel@vger.kernel.org,
	Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>
Subject: Re: [PATCH 00/12] iommu: Remove IOMMU_DEV_FEAT_SVA/_IOPF
Date: Tue, 18 Feb 2025 09:57:51 -0400	[thread overview]
Message-ID: <20250218135751.GH3696814@ziepe.ca> (raw)
In-Reply-To: <59998dcc-9452-4efd-be69-d95754217633@linux.intel.com>

On Tue, Feb 18, 2025 at 10:57:30AM +0800, Baolu Lu wrote:
> > > > [  304.961340]  __iommu_attach_device+0x2c/0x110
> > > > [  304.961343]  __iommu_device_set_domain.isra.0+0x78/0xe0
> > > > [  304.961345]  __iommu_group_set_domain_internal+0x78/0x160
> > > > [  304.961347]  iommu_replace_group_handle+0x9c/0x150
> > > > [  304.961350]  iommufd_fault_domain_replace_dev+0x88/0x120
> > > > [  304.961353]  iommufd_device_do_replace+0x190/0x3c0
> > > > [  304.961355]  iommufd_device_change_pt+0x270/0x688
> > > > [  304.961357]  iommufd_device_replace+0x20/0x38
> > > > [  304.961359]  vfio_iommufd_physical_attach_ioas+0x30/0x78
> > > > [  304.961363]  vfio_df_ioctl_attach_pt+0xa8/0x188
> > > > [  304.961366]  vfio_device_fops_unl_ioctl+0x310/0x990
> > > > 
> > > > 
> > > > When page fault triggers:
> > > > 
> > > > [ 1016.383578] ------------[ cut here ]-----------
> > > > [ 1016.388184] WARNING: CPU: 35 PID: 717 at
> > > > drivers/iommu/io-pgfault.c:231 iommu_report_device_fault+0x2c8/0x470
> > > It's likely that iopf_queue_add_device() was not called for this device.
> > iopf_queue_add_device is called, but quickly iopf_queue_remove_device
> > is called during guest bootup.
> > Then fault_param is set to NULL.
> > 
> > arm_smmu_attach_commit
> > arm_smmu_remove_master_domain
> > // newly added in the first patch
> >         if (master_domain) {
> >                    if (master_domain->using_iopf)
> 
> It seems the above check is incorrect. We only need to disable iopf when
> an iopf-capable domain is about to be removed. Will the following
> additional change make any difference?

The check looks right, it should only disable if it was enabled? The
refcounting is what keep track of the 'about to be removed' and it
should  be that using_iopf and domain->iopf_handler are mostly the
same.

Hmm, I think the issue is related to nested

to_smmu_domain_devices() returns the S2 parent for the nesting domain
always

Which means the smmu_domain->devices list (on the s2) will end up with
two entries for the same SID during the replace operation at VM boot,
one with faulting and one without.

I think that arm_smmu_remove_master_domain() will end up removing the
wrong master_domain because arm_smmu_find_master_domain() can't tell
the two apart.

When I wrote this there was no nested and the list devices list was
unique to each domain, so everything inside was the same.

Like below?

Jason

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index b14f1d0ee7076b..dc8708b414468e 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2710,10 +2710,9 @@ static void arm_smmu_disable_pasid(struct arm_smmu_master *master)
 	pci_disable_pasid(pdev);
 }
 
-static struct arm_smmu_master_domain *
-arm_smmu_find_master_domain(struct arm_smmu_domain *smmu_domain,
-			    struct arm_smmu_master *master,
-			    ioasid_t ssid, bool nested_ats_flush)
+static struct arm_smmu_master_domain *arm_smmu_find_master_domain(
+	struct arm_smmu_domain *smmu_domain, struct iommu_domain *domain,
+	struct arm_smmu_master *master, ioasid_t ssid, bool nested_ats_flush)
 {
 	struct arm_smmu_master_domain *master_domain;
 
@@ -2722,6 +2721,7 @@ arm_smmu_find_master_domain(struct arm_smmu_domain *smmu_domain,
 	list_for_each_entry(master_domain, &smmu_domain->devices,
 			    devices_elm) {
 		if (master_domain->master == master &&
+		    master_domain->domain == domain &&
 		    master_domain->ssid == ssid &&
 		    master_domain->nested_ats_flush == nested_ats_flush)
 			return master_domain;
@@ -2812,8 +2812,8 @@ static void arm_smmu_remove_master_domain(struct arm_smmu_master *master,
 		nested_ats_flush = to_smmu_nested_domain(domain)->enable_ats;
 
 	spin_lock_irqsave(&smmu_domain->devices_lock, flags);
-	master_domain = arm_smmu_find_master_domain(smmu_domain, master, ssid,
-						    nested_ats_flush);
+	master_domain = arm_smmu_find_master_domain(smmu_domain, domain, master,
+						    ssid, nested_ats_flush);
 	if (master_domain) {
 		list_del(&master_domain->devices_elm);
 		if (master->ats_enabled)
@@ -2889,6 +2889,7 @@ int arm_smmu_attach_prepare(struct arm_smmu_attach_state *state,
 		master_domain = kzalloc(sizeof(*master_domain), GFP_KERNEL);
 		if (!master_domain)
 			return -ENOMEM;
+		master_domain->domain = new_domain;
 		master_domain->master = master;
 		master_domain->ssid = state->ssid;
 		if (new_domain->type == IOMMU_DOMAIN_NESTED)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 5653d7417db7d9..fe6b88affa4a60 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -907,6 +907,11 @@ void arm_smmu_make_sva_cd(struct arm_smmu_cd *target,
 struct arm_smmu_master_domain {
 	struct list_head devices_elm;
 	struct arm_smmu_master *master;
+	/*
+	 * For nested domains the master_domain is threaded onto the S2 parent,
+	 * this points to the IOMMU_DOMAIN_NESTED to disambiguate the masters.
+	 */
+	struct iommu_domain *domain;
 	ioasid_t ssid;
 	bool nested_ats_flush : 1;
 	bool using_iopf : 1;

  parent reply	other threads:[~2025-02-18 13:57 UTC|newest]

Thread overview: 44+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-02-14  6:10 [PATCH 00/12] iommu: Remove IOMMU_DEV_FEAT_SVA/_IOPF Lu Baolu
2025-02-14  6:10 ` [PATCH 01/12] iommu/arm-smmu-v3: Put iopf enablement in the domain attach path Lu Baolu
2025-02-14  6:10 ` [PATCH 02/12] iommu/vt-d: Check if SVA is supported when attaching the SVA domain Lu Baolu
2025-02-14  6:10 ` [PATCH 03/12] iommu: Remove IOMMU_DEV_FEAT_SVA Lu Baolu
2025-02-14  6:10 ` [PATCH 04/12] iommu/vt-d: Move scalable mode ATS enablement to probe path Lu Baolu
2025-02-14  6:10 ` [PATCH 05/12] iommu/vt-d: Move PRI enablement in " Lu Baolu
2025-02-14  6:10 ` [PATCH 06/12] iommu/vt-d: Cleanup intel_context_flush_present() Lu Baolu
2025-02-14  6:10 ` [PATCH 07/12] iommu/vt-d: Put iopf enablement in domain attach path Lu Baolu
2025-02-14  6:11 ` [PATCH 08/12] iommufd/selftest: " Lu Baolu
2025-02-20  1:02   ` Jason Gunthorpe
2025-02-20  7:03     ` Baolu Lu
2025-02-20 18:00       ` Jason Gunthorpe
2025-02-21  1:31         ` Baolu Lu
2025-02-21 15:04           ` Jason Gunthorpe
2025-02-22  7:25             ` Baolu Lu
2025-02-24 19:23               ` Jason Gunthorpe
2025-02-14  6:11 ` [PATCH 09/12] dmaengine: idxd: Remove unnecessary IOMMU_DEV_FEAT_IOPF Lu Baolu
2025-02-14 11:22   ` Vinod Koul
2025-02-14 16:25   ` Dave Jiang
2025-02-18 22:55   ` Fenghua Yu
2025-02-19  6:02     ` Baolu Lu
2025-02-20  1:03   ` Jason Gunthorpe
2025-02-14  6:11 ` [PATCH 10/12] uacce: " Lu Baolu
2025-02-20  1:03   ` Jason Gunthorpe
2025-02-14  6:11 ` [PATCH 11/12] iommufd: " Lu Baolu
2025-02-14  7:06   ` Nicolin Chen
2025-02-15  6:32     ` Baolu Lu
2025-02-18 13:06       ` Jason Gunthorpe
2025-02-19  5:59         ` Baolu Lu
2025-02-20  1:04   ` Jason Gunthorpe
2025-02-14  6:11 ` [PATCH 12/12] iommu: Remove iommu_dev_enable/disable_feature() Lu Baolu
2025-02-20  1:04   ` Jason Gunthorpe
2025-02-14  8:43 ` [PATCH 00/12] iommu: Remove IOMMU_DEV_FEAT_SVA/_IOPF Zhangfei Gao
2025-02-14  9:24   ` Zhangfei Gao
2025-02-14 12:56   ` Jason Gunthorpe
2025-02-15  8:11     ` Zhangfei Gao
2025-02-15 10:06       ` Baolu Lu
2025-02-15 11:35         ` Zhangfei Gao
2025-02-18  2:57           ` Baolu Lu
2025-02-18  6:13             ` Zhangfei Gao
2025-02-18 13:57             ` Jason Gunthorpe [this message]
2025-02-18 15:25               ` Zhangfei Gao
2025-02-18 16:53                 ` Jason Gunthorpe
2025-02-19  6:06                   ` Baolu Lu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250218135751.GH3696814@ziepe.ca \
    --to=jgg@ziepe.ca \
    --cc=baolu.lu@linux.intel.com \
    --cc=dave.jiang@intel.com \
    --cc=fenghua.yu@intel.com \
    --cc=iommu@lists.linux.dev \
    --cc=joro@8bytes.org \
    --cc=kevin.tian@intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=robin.murphy@arm.com \
    --cc=shameerali.kolothum.thodi@huawei.com \
    --cc=vkoul@kernel.org \
    --cc=wangzhou1@hisilicon.com \
    --cc=will@kernel.org \
    --cc=zhangfei.gao@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox