From: Nicolin Chen <nicolinc@nvidia.com>
To: "Tian, Kevin" <kevin.tian@intel.com>
Cc: "joro@8bytes.org" <joro@8bytes.org>,
	"jgg@nvidia.com" <jgg@nvidia.com>,
	"suravee.suthikulpanit@amd.com" <suravee.suthikulpanit@amd.com>,
	"will@kernel.org" <will@kernel.org>,
	"robin.murphy@arm.com" <robin.murphy@arm.com>,
	"sven@kernel.org" <sven@kernel.org>,
	"j@jannau.net" <j@jannau.net>,
	"jean-philippe@linaro.org" <jean-philippe@linaro.org>,
	"robin.clark@oss.qualcomm.com" <robin.clark@oss.qualcomm.com>,
	"dwmw2@infradead.org" <dwmw2@infradead.org>,
	"baolu.lu@linux.intel.com" <baolu.lu@linux.intel.com>,
	"yong.wu@mediatek.com" <yong.wu@mediatek.com>,
	"matthias.bgg@gmail.com" <matthias.bgg@gmail.com>,
	"angelogioacchino.delregno@collabora.com"
	<angelogioacchino.delregno@collabora.com>,
	"tjeznach@rivosinc.com" <tjeznach@rivosinc.com>,
	"pjw@kernel.org" <pjw@kernel.org>,
	"palmer@dabbelt.com" <palmer@dabbelt.com>,
	"aou@eecs.berkeley.edu" <aou@eecs.berkeley.edu>,
	"heiko@sntech.de" <heiko@sntech.de>,
	"schnelle@linux.ibm.com" <schnelle@linux.ibm.com>,
	"mjrosato@linux.ibm.com" <mjrosato@linux.ibm.com>,
	"wens@csie.org" <wens@csie.org>,
	"jernej.skrabec@gmail.com" <jernej.skrabec@gmail.com>,
	"samuel@sholland.org" <samuel@sholland.org>,
	"thierry.reding@gmail.com" <thierry.reding@gmail.com>,
	"jonathanh@nvidia.com" <jonathanh@nvidia.com>,
	"iommu@lists.linux.dev" <iommu@lists.linux.dev>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"asahi@lists.linux.dev" <asahi@lists.linux.dev>,
	"linux-arm-kernel@lists.infradead.org"
	<linux-arm-kernel@lists.infradead.org>,
	"linux-arm-msm@vger.kernel.org" <linux-arm-msm@vger.kernel.org>,
	"linux-mediatek@lists.infradead.org"
	<linux-mediatek@lists.infradead.org>,
	"linux-riscv@lists.infradead.org"
	<linux-riscv@lists.infradead.org>,
	"linux-rockchip@lists.infradead.org"
	<linux-rockchip@lists.infradead.org>,
	"linux-s390@vger.kernel.org" <linux-s390@vger.kernel.org>,
	"linux-sunxi@lists.linux.dev" <linux-sunxi@lists.linux.dev>,
	"linux-tegra@vger.kernel.org" <linux-tegra@vger.kernel.org>,
	"virtualization@lists.linux.dev" <virtualization@lists.linux.dev>,
	"patches@lists.linux.dev" <patches@lists.linux.dev>
Subject: Re: [PATCH v1 02/20] iommu: Introduce a test_dev domain op and an internal helper
Date: Thu, 30 Oct 2025 12:43:59 -0700
Message-ID: <aQO//+6/B/WbdK2h@Asurada-Nvidia>
In-Reply-To: <BN9PR11MB5276D10BD480FE66881870B08CFBA@BN9PR11MB5276.namprd11.prod.outlook.com>

On Thu, Oct 30, 2025 at 08:47:18AM +0000, Tian, Kevin wrote:
> It might need more work to meet this requirement. e.g. after patch4
> I could still spot other errors easily in the attach path:
> 
> intel_iommu_attach_device()
>   iopf_for_domain_set()
>     intel_iommu_enable_iopf():
> 
>         if (!info->pri_enabled)
>                 return -ENODEV;

Yea, I missed that.

> intel_iommu_attach_device()
>   dmar_domain_attach_device()
>     domain_attach_iommu():
>       
>        curr = xa_cmpxchg(&domain->iommu_array, iommu->seq_id,
>                           NULL, info, GFP_KERNEL);
>         if (curr) {
>                 ret = xa_err(curr) ? : -EBUSY;
>                 goto err_clear;
>         }

There is actually an xa_load() in this function:

	curr = xa_load(&domain->iommu_array, iommu->seq_id);
	if (curr) {
		curr->refcnt++;
		kfree(info);
		return 0;
	}

	[...]

	info->refcnt	= 1;
	info->did	= num;
	info->iommu	= iommu;
	curr = xa_cmpxchg(&domain->iommu_array, iommu->seq_id,
			  NULL, info, GFP_KERNEL);
	if (curr) {
		ret = xa_err(curr) ? : -EBUSY;
		goto err_clear;
	}

It seems that this xa_cmpxchg could be just xa_store()?
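
To make the question concrete, here is a small userspace sketch. It is
not the kernel xarray API: struct slot, slot_load(), slot_cmpxchg(),
slot_store() and attach_once() are made-up stand-ins, and it does not
model allocation failure. It assumes nothing else can touch the slot
between the load and the insert; under that assumption the conditional
and the unconditional insert leave the same state:

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a single xarray slot; NOT the kernel xarray API. */
struct slot {
	void *entry;
};

/* Like xa_load(): return the current entry, or NULL if the slot is empty. */
static void *slot_load(struct slot *s)
{
	return s->entry;
}

/* Like xa_cmpxchg(): store only if the slot still holds 'old'. */
static void *slot_cmpxchg(struct slot *s, void *old, void *new_entry)
{
	void *curr = s->entry;

	if (curr == old)
		s->entry = new_entry;
	return curr;	/* previous entry */
}

/* Like xa_store(): store unconditionally. */
static void *slot_store(struct slot *s, void *new_entry)
{
	void *curr = s->entry;

	s->entry = new_entry;
	return curr;	/* previous entry */
}

/* Models the flow quoted above: load first, insert only if empty. */
static int attach_once(struct slot *s, void *info, int use_store)
{
	void *curr = slot_load(s);

	if (curr)
		return 0;	/* already present; the real code bumps refcnt */

	curr = use_store ? slot_store(s, info)
			 : slot_cmpxchg(s, NULL, info);
	/* Holds only if nothing ran between the load and the insert. */
	assert(curr == NULL);
	return 0;
}
```

If a second writer could get in between the load and the insert, only
the cmpxchg variant would notice it (it returns the other entry and
leaves the slot alone), so whether the swap is safe depends on how the
callers are serialized.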

> intel_iommu_attach_device()
>   dmar_domain_attach_device()
>     domain_setup_first_level()
>       __domain_setup_first_level()
>         intel_pasid_setup_first_level():

Yea. There are a few others along that path as well.

>         pte = intel_pasid_get_entry(dev, pasid);
>         if (!pte) {
>                 spin_unlock(&iommu->lock);
>                 return -ENODEV;
>         }
> 
>         if (pasid_pte_is_present(pte)) {
>                 spin_unlock(&iommu->lock);
>                 return -EBUSY;
>         }

Hmm, this is fenced by iommu->lock and can race with non-attach_dev
callbacks. It might be difficult to move these checks into test_dev.

> On the other hand, how do we communicate whatever errors returned
> by attach_dev in the reset_done path back to userspace? As noted above
> resource allocation failures could still occur in attach_dev, but userspace
> may think the requested attach in middle of a reset has succeeded as
> long as it passes the test_dev check.

That's a legit point. Jason pointed out at the SMMUv3 patch that we
would also end up with some inconsistency between the driver and the
core. So test_dev doesn't seem to solve our problem very well.

> Does it work better to block the attaching process upon ongoing reset
> and wake it up later upon reset_done to resume attach?

Yea, I think returning -EBUSY would be the simplest solution like
we did in the previous version.

But the concern is that a VF might not be aware of a PF reset, so it
can still race with an attachment, which would get -EBUSY as well.
Then, if its driver doesn't retry or defer the attach, this might
break it?

FWIW, I am thinking of another design based on Jason's remarks:
https://lore.kernel.org/linux-iommu/aQBopHFub8wyQh5C@Asurada-Nvidia/

So, instead of core initiating the round trip between the blocking
domain and group->domain, it forwards dev_reset_prepare/done to the
driver where it does a low-level attachment that wouldn't fail:
  For SMMUv3, it's an STE update.
  For intel_iommu, it seems to be the context table update?

Then, any concurrent attachment would be allowed to go through all
the compatibility/sanity checks as usual, but it would bypass the
final step: the STE or context table update.
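
Roughly, as a userspace toy model of the idea above. Everything here
is hypothetical (struct toy_dev, struct toy_domain, toy_reset_prepare(),
toy_attach(), toy_reset_done()); it is not a proposed kernel interface,
and it assumes the low-level switch to blocking cannot fail:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_domain {
	int id;
	bool compatible;	/* stands in for all the sanity checks */
};

struct toy_dev {
	const struct toy_domain *attached;	/* what the core tracks */
	const struct toy_domain *hw;		/* STE/context entry; NULL = blocked */
	bool resetting;
};

/* Driver-level prepare: block DMA with an update that is assumed not to fail. */
static void toy_reset_prepare(struct toy_dev *dev)
{
	dev->resetting = true;
	dev->hw = NULL;
}

/* Attach runs every check as usual, but skips the final update during reset. */
static int toy_attach(struct toy_dev *dev, const struct toy_domain *d)
{
	if (!d->compatible)
		return -EINVAL;

	dev->attached = d;
	if (!dev->resetting)
		dev->hw = d;	/* final step: STE or context table update */
	return 0;
}

/* Driver-level done: program whatever domain is attached by now. */
static void toy_reset_done(struct toy_dev *dev)
{
	assert(dev->resetting);
	dev->resetting = false;
	dev->hw = dev->attached;
}
```

In this model an attach that lands in the middle of a reset still
returns its real error code to the caller, and reset_done only has to
replay the deferred low-level update.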

Thanks
Nicolin

