Linux IOMMU Development
From: Jason Gunthorpe <jgg@ziepe.ca>
To: Vasant Hegde <vasant.hegde@amd.com>
Cc: Baolu Lu <baolu.lu@linux.intel.com>,
	iommu@lists.linux.dev, joro@8bytes.org, will@kernel.org,
	robin.murphy@arm.com, suravee.suthikulpanit@amd.com,
	yi.l.liu@intel.com, kevin.tian@intel.com,
	jacob.pan@linux.microsoft.com
Subject: Re: [PATCH v2 0/8] iommu: Domain allocation enhancements
Date: Wed, 9 Oct 2024 09:15:48 -0300
Message-ID: <20241009121548.GE762027@ziepe.ca>
In-Reply-To: <9d0101b4-af13-4bf6-94a5-43a79f4d9989@amd.com>

On Wed, Oct 09, 2024 at 03:23:16PM +0530, Vasant Hegde wrote:

> > This change might cause a functional regression when it comes to nested
> > translation. In nested translation mode, the user page table (e.g.,
> > created and managed by a guest VM for guest kernel DMA) must be in the
> > first-stage page table format. Then, it can be nested on a second-stage
> > page table managed by the host kernel.
> > 
> > Currently, the kernel automatically selects the page table formats. For
> > example, the Intel IOMMU driver always uses the first-stage page table
> > for guest kernel DMA. After this change, this assumption no longer holds
> > true. This means the kernel might use a second-stage page table for
> > guest kernel DMA, breaking nested translation.
> 
> Hmmm. I assumed after the discussion in the v1 series that you were fine
> with this. Looks like I misread it?

It is a bug in the Intel driver's implementation.

intel_iommu_domain_alloc_user() should always allocate a page table
that works. If you are in a guest context, it must allocate a
first-level/guest page table; otherwise nested VFIO will be broken.

For this reason the Intel driver should always allocate the
guest-compatible page table unless IOMMU_HWPT_ALLOC_NEST_PARENT is
specified.

Something like this:

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 9f6b0780f2ef5e..cfae6c2973e0ee 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -1440,7 +1440,7 @@ static void free_dmar_iommu(struct intel_iommu *iommu)
  * Check and return whether first level is used by default for
  * DMA translation.
  */
-static bool first_level_by_default(unsigned int type)
+static bool first_level_by_default(void)
 {
 	/* Only SL is available in legacy mode */
 	if (!scalable_mode_support())
@@ -1450,8 +1450,8 @@ static bool first_level_by_default(unsigned int type)
 	if (intel_cap_flts_sanity() ^ intel_cap_slts_sanity())
 		return intel_cap_flts_sanity();
 
-	/* Both levels are available, decide it based on domain type */
-	return type != IOMMU_DOMAIN_UNMANAGED;
+	/* Both levels are available, use FL */
+	return true;
 }
 
 static struct dmar_domain *alloc_domain(unsigned int type)
@@ -1463,7 +1463,7 @@ static struct dmar_domain *alloc_domain(unsigned int type)
 		return NULL;
 
 	domain->nid = NUMA_NO_NODE;
-	if (first_level_by_default(type))
+	if (first_level_by_default())
 		domain->use_first_level = true;
 	INIT_LIST_HEAD(&domain->devices);
 	INIT_LIST_HEAD(&domain->dev_pasids);
@@ -3287,8 +3287,7 @@ int __init intel_iommu_init(void)
 		 * is likely to be much lower than the overhead of synchronizing
 		 * the virtual and physical IOMMU page-tables.
 		 */
-		if (cap_caching_mode(iommu->cap) &&
-		    !first_level_by_default(IOMMU_DOMAIN_DMA)) {
+		if (cap_caching_mode(iommu->cap) && !first_level_by_default()) {
 			pr_info_once("IOMMU batching disallowed due to virtualization\n");
 			iommu_set_dma_strict();
 		}
@@ -3530,6 +3529,7 @@ intel_iommu_domain_alloc_user(struct device *dev, u32 flags,
 	struct intel_iommu *iommu = info->iommu;
 	struct dmar_domain *dmar_domain;
 	struct iommu_domain *domain;
+	bool first_level;
 
 	/* Must be NESTING domain */
 	if (parent) {
@@ -3541,13 +3541,18 @@ intel_iommu_domain_alloc_user(struct device *dev, u32 flags,
 	if (flags &
 	    (~(IOMMU_HWPT_ALLOC_NEST_PARENT | IOMMU_HWPT_ALLOC_DIRTY_TRACKING)))
 		return ERR_PTR(-EOPNOTSUPP);
+	if ((flags & IOMMU_HWPT_ALLOC_NEST_PARENT) && !intel_cap_slts_sanity())
+		return ERR_PTR(-EOPNOTSUPP);
 	if (nested_parent && !nested_supported(iommu))
 		return ERR_PTR(-EOPNOTSUPP);
 	if (user_data || (dirty_tracking && !ssads_supported(iommu)))
 		return ERR_PTR(-EOPNOTSUPP);
 
-	/* Do not use first stage for user domain translation. */
-	dmar_domain = paging_domain_alloc(dev, false);
+	if (flags & IOMMU_HWPT_ALLOC_NEST_PARENT)
+		first_level = false;
+	else
+		first_level = first_level_by_default();
+	dmar_domain = paging_domain_alloc(dev, first_level);
 	if (IS_ERR(dmar_domain))
 		return ERR_CAST(dmar_domain);
 	domain = &dmar_domain->domain;
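
To sanity-check the selection rule in the diff above outside the kernel,
here is a minimal user-space sketch in C. It is only a model of the
decision, not driver code. struct model_caps, MODEL_ALLOC_NEST_PARENT,
model_first_level_by_default() and model_pick_stage() are hypothetical
names made up for this illustration. The three booleans stand in for
scalable_mode_support(), intel_cap_flts_sanity() and
intel_cap_slts_sanity(), and the flag value is arbitrary rather than the
real uAPI value. The "return false" for legacy mode is assumed from the
"Only SL is available in legacy mode" comment, since the hunk does not
show that line.

```c
/* User-space model of the page-table stage selection; not kernel code. */
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the hardware capability checks. */
struct model_caps {
	bool scalable_mode;	/* models scalable_mode_support() */
	bool flts;		/* models intel_cap_flts_sanity() */
	bool slts;		/* models intel_cap_slts_sanity() */
};

/* Illustrative value only; the real flag is defined by the iommufd uAPI. */
#define MODEL_ALLOC_NEST_PARENT (1u << 0)

/* Mirrors first_level_by_default(void) as proposed in the diff. */
static bool model_first_level_by_default(const struct model_caps *caps)
{
	/* Only second level is available in legacy mode (assumed). */
	if (!caps->scalable_mode)
		return false;

	/* Exactly one of the two levels is usable: take that one. */
	if (caps->flts != caps->slts)
		return caps->flts;

	/* Otherwise prefer first level, as the diff does. */
	return true;
}

/*
 * Mirrors the choice made in intel_iommu_domain_alloc_user():
 * a nest parent must be second level, everything else follows
 * the default. Returns 0 and sets *first_level, or -EOPNOTSUPP.
 */
static int model_pick_stage(const struct model_caps *caps, uint32_t flags,
			    bool *first_level)
{
	bool nest_parent = (flags & MODEL_ALLOC_NEST_PARENT) != 0;

	assert(caps && first_level);

	/* A nest parent needs a usable second-level table. */
	if (nest_parent && !caps->slts)
		return -EOPNOTSUPP;

	*first_level = nest_parent ? false : model_first_level_by_default(caps);
	return 0;
}
```

With both levels available, a plain allocation (flags == 0) picks the
first level, which is what a guest needs for nesting to work, while a
request with the nest-parent flag picks the second level. On hardware
modelled as first-level only, the nest-parent request fails with
-EOPNOTSUPP.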

Thread overview: 75+ messages
2024-09-11 10:19 [PATCH v2 0/8] iommu: Domain allocation enhancements Vasant Hegde
2024-09-11 10:19 ` [PATCH v2 1/8] iommu: Refactor __iommu_domain_alloc() Vasant Hegde
2024-09-12  1:50   ` Baolu Lu
2024-09-13  4:02   ` Jacob Pan
2024-09-26 10:17     ` Vasant Hegde
2024-09-30 17:55       ` Jacob Pan
2024-10-01  4:31         ` Vasant Hegde
2024-10-02  5:11           ` Jacob Pan
     [not found]       ` <66fae60d.170a0220.280357.3d11SMTPIN_ADDED_BROKEN@mx.google.com>
2024-10-02 14:19         ` Jason Gunthorpe
2024-10-02 16:16           ` Jacob Pan
2024-10-02 19:09   ` Jason Gunthorpe
2024-10-15  8:12   ` Tian, Kevin
2024-09-11 10:19 ` [PATCH v2 2/8] iommu: Introduce iommu_paging_domain_alloc_flags() Vasant Hegde
2024-09-12  4:04   ` Baolu Lu
2024-09-26 10:43     ` Vasant Hegde
2024-10-15  8:24     ` Tian, Kevin
2024-10-15 12:31       ` Jason Gunthorpe
2024-10-16  2:44         ` Tian, Kevin
2024-10-02 19:12   ` Jason Gunthorpe
2024-10-09 21:14   ` Jacob Pan
2024-10-16 10:14     ` Vasant Hegde
2024-09-11 10:19 ` [PATCH v2 3/8] iommu: Add new flag to explictly request PASID capable domain Vasant Hegde
2024-09-12  4:14   ` Baolu Lu
2024-09-26 10:29     ` Vasant Hegde
2024-09-26 11:01       ` Vasant Hegde
2024-10-02 14:23         ` Jason Gunthorpe
2024-10-02 19:02           ` Jacob Pan
     [not found]           ` <66fd98e3.170a0220.23d7ae.c2a9SMTPIN_ADDED_BROKEN@mx.google.com>
2024-10-02 19:07             ` Jason Gunthorpe
2024-10-03 16:00               ` Jacob Pan
2024-10-02 19:23   ` Jason Gunthorpe
2024-10-04  8:12     ` Vasant Hegde
2024-10-04 12:46       ` Jason Gunthorpe
2024-10-15  8:31   ` Tian, Kevin
2024-09-11 10:19 ` [PATCH v2 4/8] iommu/amd: Separate page table setup from domain allocation Vasant Hegde
2024-09-13 17:08   ` Jacob Pan
2024-10-02 19:24   ` Jason Gunthorpe
2024-09-11 10:19 ` [PATCH v2 5/8] iommu/amd: Pass page table type as param to pdom_setup_pgtable() Vasant Hegde
2024-09-13 21:39   ` Jacob Pan
2024-09-26 10:25     ` Vasant Hegde
2024-09-30 17:57       ` Jacob Pan
     [not found]   ` <66e4b125.170a0220.2fa213.1e2cSMTPIN_ADDED_BROKEN@mx.google.com>
2024-09-20 13:02     ` Jason Gunthorpe
2024-09-11 10:19 ` [PATCH v2 6/8] iommu/amd: Enhance domain_alloc_user() to allocate PASID capable domain Vasant Hegde
2024-10-02 19:31   ` Jason Gunthorpe
2024-10-04  8:18     ` Vasant Hegde
2024-10-04 12:48       ` Jason Gunthorpe
2024-10-04 14:32         ` Vasant Hegde
2024-10-15  8:41   ` Tian, Kevin
2024-10-15 12:40     ` Jason Gunthorpe
2024-10-16  2:48       ` Tian, Kevin
2024-10-16 15:28         ` Jason Gunthorpe
2024-10-17  6:11           ` Tian, Kevin
2024-10-17 11:03             ` Vasant Hegde
2024-09-11 10:19 ` [PATCH v2 7/8] iommu/amd: Add iommu_ops->domain_alloc_paging support Vasant Hegde
2024-10-02 19:33   ` Jason Gunthorpe
2024-10-04 11:55     ` Vasant Hegde
2024-10-04 12:56       ` Jason Gunthorpe
2024-10-04 14:30         ` Vasant Hegde
2024-10-04 15:31           ` Jason Gunthorpe
2024-10-08 10:08             ` Vasant Hegde
2024-09-11 10:19 ` [PATCH v2 8/8] iommu/amd: Implement global identity domain Vasant Hegde
2024-10-02 19:36   ` Jason Gunthorpe
2024-10-04 11:42     ` Vasant Hegde
2024-10-02  5:30 ` [PATCH v2 0/8] iommu: Domain allocation enhancements Vasant Hegde
2024-10-02 14:24   ` Jason Gunthorpe
2024-10-04  6:11     ` Vasant Hegde
2024-10-09  2:47   ` Baolu Lu
2024-10-09  9:53     ` Vasant Hegde
2024-10-09 12:15       ` Jason Gunthorpe [this message]
2024-10-10  6:40         ` Baolu Lu
2024-10-10  6:48           ` Baolu Lu
2024-10-10 11:38             ` Jason Gunthorpe
2024-10-10 14:06               ` Baolu Lu
2024-10-11  5:06         ` Tian, Kevin
2024-10-11 11:39           ` Jason Gunthorpe
2024-10-15  8:10             ` Tian, Kevin
