From: Baolu Lu <baolu.lu@linux.intel.com>
To: Jason Gunthorpe <jgg@ziepe.ca>
Cc: baolu.lu@linux.intel.com, Vasant Hegde <vasant.hegde@amd.com>,
iommu@lists.linux.dev, joro@8bytes.org, will@kernel.org,
robin.murphy@arm.com, suravee.suthikulpanit@amd.com,
yi.l.liu@intel.com, kevin.tian@intel.com,
jacob.pan@linux.microsoft.com
Subject: Re: [PATCH v2 0/8] iommu: Domain allocation enhancements
Date: Thu, 10 Oct 2024 22:06:06 +0800
Message-ID: <3e83bc8a-c827-413d-b652-893a154d6dc9@linux.intel.com>
In-Reply-To: <20241010113817.GG762027@ziepe.ca>
On 2024/10/10 19:38, Jason Gunthorpe wrote:
> On Thu, Oct 10, 2024 at 02:48:09PM +0800, Baolu Lu wrote:
>> On 2024/10/10 14:40, Baolu Lu wrote:
>>> On 2024/10/9 20:15, Jason Gunthorpe wrote:
>>>> On Wed, Oct 09, 2024 at 03:23:16PM +0530, Vasant Hegde wrote:
>>>>
>>>>>> This change might cause a functional regression when it comes to nested
>>>>>> translation. In nested translation mode, the user page table (e.g.,
>>>>>> created and managed by a guest VM for guest kernel DMA) must be in the
>>>>>> first-stage page table format. Then, it can be nested on a second-stage
>>>>>> page table managed by the host kernel.
>>>>>>
>>>>>> Currently, the kernel automatically selects the page table formats. For
>>>>>> example, the Intel IOMMU driver always uses the first-stage page table
>>>>>> for guest kernel DMA. After this change, this assumption no longer
>>>>>> holds true. This means the kernel might use a second-stage page table
>>>>>> for guest kernel DMA, breaking nested translation.
>>>>> Hmmm. I assumed after discussion in v1 series you are fine.
>>>>> Looks like I misread it?
>>>> It is a bug in the implementation in the intel driver.
>>>>
>>>> intel_iommu_domain_alloc_user() should always allocate a page table
>>>> that works, if you are in a guest context it must allocate a first
>>>> level/guest page table otherwise nested VFIO will be broken.
>>>>
>>>> For this reason Intel driver should always allocate the guest
>>>> compatible page table unless IOMMU_HWPT_ALLOC_NEST_PARENT is
>>>> specified.
>>>>
>>>> Something like this:
>>> This will break the existing dirty page tracking functionality. Intel
>>> IOMMU only supports enabling or disabling dirty page tracking at the
>>> second-stage page table.
>> So, perhaps something like below?
>>
>> diff --git a/drivers/iommu/intel/iommu.h b/drivers/iommu/intel/iommu.h
>> index 1497f3112b12..e11dde259afa 100644
>> --- a/drivers/iommu/intel/iommu.h
>> +++ b/drivers/iommu/intel/iommu.h
>> @@ -544,6 +544,8 @@ enum {
>> ecap_slads((iommu)->ecap))
>> #define nested_supported(iommu) (sm_supported(iommu) && \
>> ecap_nest((iommu)->ecap))
>> +#define flts_supported(iommu) (sm_supported(iommu) && \
>> + ecap_flts((iommu)->ecap))
>>
>> struct pasid_entry;
>> struct pasid_state_entry;
>> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
>> index 9f6b0780f2ef..d1a7378489a4 100644
>> --- a/drivers/iommu/intel/iommu.c
>> +++ b/drivers/iommu/intel/iommu.c
>> @@ -3530,6 +3530,7 @@ intel_iommu_domain_alloc_user(struct device *dev, u32 flags,
>> struct intel_iommu *iommu = info->iommu;
>> struct dmar_domain *dmar_domain;
>> struct iommu_domain *domain;
>> + bool first_stage;
>>
>> /* Must be NESTING domain */
>> if (parent) {
>> @@ -3546,8 +3547,14 @@ intel_iommu_domain_alloc_user(struct device *dev, u32 flags,
>> if (user_data || (dirty_tracking && !ssads_supported(iommu)))
>> return ERR_PTR(-EOPNOTSUPP);
>>
>> - /* Do not use first stage for user domain translation. */
>> - dmar_domain = paging_domain_alloc(dev, false);
>> + /*
>> + * Always allocate the guest compatible page table unless
>> + * IOMMU_HWPT_ALLOC_NEST_PARENT or IOMMU_HWPT_ALLOC_DIRTY_TRACKING
>> + * is specified.
>> + */
>> + first_stage = (nested_parent || dirty_tracking) ?
>> + false : flts_supported(iommu);
> That makes sense, but these flags still need to be rejected if the
> second level is not supported in the HW.
>
> + if ((flags & (IOMMU_HWPT_ALLOC_NEST_PARENT | IOMMU_HWPT_ALLOC_DIRTY_TRACKING)) && !intel_cap_slts_sanity())
> + return ERR_PTR(-EOPNOTSUPP);
>
> I also think my version was cleaner as we should be using
> first_level_by_default() consistently to make that decision.
>
> Also don't care for ternary expressions 🙂
Okay, it seems that we are on the same page now. :-)

I have a series to add domain_alloc_paging support in the Intel iommu
driver. I will convert the change into a formal patch and put it in that
series for further review.

Thanks,
baolu
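
For readers following the thread, the stage-selection rule agreed on above can be sketched as a small userspace model. This is not the kernel code and not the eventual patch: struct model_iommu_caps, the MODEL_* flags, and model_pick_stage() are hypothetical names made up for illustration. They stand in for the driver's capability checks (flts_supported(), ssads_supported(), second-stage support) and for IOMMU_HWPT_ALLOC_NEST_PARENT / IOMMU_HWPT_ALLOC_DIRTY_TRACKING. The fallback to second stage when first stage is unavailable, and the final rejection when neither stage exists, are assumptions of this sketch. The extra nesting-capability check for nest parents is left out.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical flag bits for this model only. */
#define MODEL_ALLOC_NEST_PARENT    (1u << 0)
#define MODEL_ALLOC_DIRTY_TRACKING (1u << 1)

/* Hypothetical summary of what one IOMMU can do. */
struct model_iommu_caps {
	bool first_stage;    /* stands in for flts_supported(iommu) */
	bool second_stage;   /* stands in for second-stage translation support */
	bool dirty_tracking; /* stands in for ssads_supported(iommu) */
};

enum model_stage { MODEL_STAGE_FIRST, MODEL_STAGE_SECOND };

/*
 * Pick the page table stage for a user-allocated paging domain.
 * Returns 0 and stores the choice in *stage, or -EOPNOTSUPP if the
 * hardware cannot satisfy the request.
 */
static int model_pick_stage(const struct model_iommu_caps *caps,
			    unsigned int flags, enum model_stage *stage)
{
	bool nested_parent = (flags & MODEL_ALLOC_NEST_PARENT) != 0;
	bool dirty_tracking = (flags & MODEL_ALLOC_DIRTY_TRACKING) != 0;

	/* Both flags force a second-stage table, so reject them without it. */
	if (nested_parent || dirty_tracking) {
		if (!caps->second_stage)
			return -EOPNOTSUPP;
		if (dirty_tracking && !caps->dirty_tracking)
			return -EOPNOTSUPP;
		*stage = MODEL_STAGE_SECOND;
		return 0;
	}

	/* Otherwise prefer the guest compatible (first-stage) format. */
	if (caps->first_stage) {
		*stage = MODEL_STAGE_FIRST;
		return 0;
	}

	/* Assumption of this sketch: fall back to second stage if present. */
	if (caps->second_stage) {
		*stage = MODEL_STAGE_SECOND;
		return 0;
	}

	return -EOPNOTSUPP;
}
```

With no flags set, a model IOMMU that supports both stages gets a first-stage table. Either flag selects second stage, and the request fails with -EOPNOTSUPP when the second stage is missing.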