From: Baolu Lu <baolu.lu@linux.intel.com>
To: "Tian, Kevin" <kevin.tian@intel.com>, Jason Gunthorpe <jgg@ziepe.ca>
Cc: baolu.lu@linux.intel.com, Vasant Hegde <vasant.hegde@amd.com>,
"iommu@lists.linux.dev" <iommu@lists.linux.dev>,
"joro@8bytes.org" <joro@8bytes.org>,
"will@kernel.org" <will@kernel.org>,
"robin.murphy@arm.com" <robin.murphy@arm.com>,
"suravee.suthikulpanit@amd.com" <suravee.suthikulpanit@amd.com>,
"Liu, Yi L" <yi.l.liu@intel.com>
Subject: Re: [PATCH 1/5] iommu: Enhance domain allocation code to take additional flags
Date: Mon, 26 Aug 2024 16:34:10 +0800
Message-ID: <7ca97716-fe84-426e-bc73-808f8182ddbf@linux.intel.com>
In-Reply-To: <BN9PR11MB5276296DD7C9503378C9227B8C8B2@BN9PR11MB5276.namprd11.prod.outlook.com>
On 2024/8/26 16:08, Tian, Kevin wrote:
>> From: Baolu Lu <baolu.lu@linux.intel.com>
>> Sent: Friday, August 23, 2024 10:48 AM
>>
>> On 8/22/24 8:43 PM, Jason Gunthorpe wrote:
>>> On Thu, Aug 22, 2024 at 09:50:57AM +0800, Baolu Lu wrote:
>>>> On 8/22/24 12:31 AM, Jason Gunthorpe wrote:
>>>>>> I think instead of having separate function it may be better to
>>>>>> enhance __iommu_domain_alloc() such that:
>>>>>> - Keep below changes from this patch
>>>>>> - iommu_domain_init()
>>>>>> - iommu_get_dma_cookie call inside iommu_setup_default_domain()
>>>>>> - modify __iommu_domain_alloc() to additional param (flags)
>>>>>> - iommu_paging_domain_alloc_flags() will call __iommu_domain_alloc()
>>>>> My expectation was to basically remove iommu_domain_alloc() entirely
>>>>> once Lu's work is merged.
>>>>>
>>>>> Instead we'd have these direct APIs:
>>>>> iommu_domain_alloc_paging_flags()
>>>> Is it possible to use different domain allocation APIs for kernel DMA
>>>> and user-space DMA? Right now, we differentiate between these two types
>>>> of domains using IOMMU_DOMAIN_DMA and IOMMU_DOMAIN_UNMANAGED.
>>> I really don't want to have such a distinction.
>>>
>>>> I'm thinking about this because the Intel iommu driver has a similar
>>>> need to AMD. They both recommend using different page table formats for
>>>> IOMMU_DOMAIN_DMA and IOMMU_DOMAIN_UNMANAGED, which is currently stopping
> Where is such recommendation coming from?
>
>>>> us from implementing domain_alloc_paging in the Intel iommu driver.
>>> Why? What exactly is the issue?
>>>
>>> It is inherently wrong to behave differently based on DMA API or VFIO.
>>> They are not different things.
>>>
>>> If you have different behaviors and different properties, like AMD's
>>> PASID, then they should be described and mapped to some kind of flag.
>>>
>>> Otherwise the driver should always allocate a paging domain that gives
>>> the highest performance.
>> It relates to Intel VT-d's nested translation. Intel VT-d has two types
>> of page table formats for DMA translation: first level and second level.
>> In nested translation, the first level page table is used for first-
>> stage translation, and the second level page table is used for second-
>> stage translation.
>>
>> The iommu driver for vIOMMU in the guest kernel must use the first level
>> page table format for kernel DMA. This page table will then be nested on
>> a second level page table in the VMM host kernel.
> If a 'legacy-only' vIOMMU is exposed the guest kernel will certainly
> use the 2nd stage page table.
>
> nested is an optimization manageable by VMM. Not something that
> the kernel driver needs to restrict.
>
>> Our current design uses the first level page table for both the host and
>> guest kernel for simplicity. This is why we use different page table
>> formats for IOMMU_DOMAIN_DMA and IOMMU_DOMAIN_UNMANAGED.
> As you said it's the current 'design', not an arch limitation. 😊
>
>> We considered determining the page table format based on whether the
>> iommu has caching mode capability. This would result in the first level
>> page table format being used for guest kernel DMA and the second level
>> page table format being used for host kernel DMA. However, this approach
>> creates an inconsistency between the host and guest kernels.
> Why would one care about such consistency between the host and the guest?
>
> In the end the iommu driver needs to decide the format based on the
> requested hwpt type and the available iommu cap.
>
> e.g. is there a problem with a simple policy below?
>
> - Default use stage-1 for both DMA/UNMANAGED, if nesting_parent is
> not specified and stage-1 is supported by hw
> - Otherwise use stage-2
First-stage has some limitations for an UNMANAGED domain. For example,

- No separate controls for the Access/Dirty page tracking;
- No Write-only permission support;
- No page-level control for forcing snoop.

Thanks,
baolu
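
For illustration only, the policy Kevin proposes and the first-stage limitations listed in the reply can be combined into one selection routine. The sketch below is plain userspace C, not code from the Intel iommu driver: every name in it (pick_stage, struct hw_caps, struct alloc_request and its fields) is hypothetical, and treating the three listed limitations as request bits that force stage-2 is an assumption made for the example, not something either mail states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* All names below are hypothetical; none come from the kernel tree. */
enum pgtable_stage { STAGE_1, STAGE_2 };

struct hw_caps {
    bool stage1_supported;
    bool stage2_supported;
};

struct alloc_request {
    bool nesting_parent;  /* domain will be the parent of a nested domain */
    bool dirty_tracking;  /* assumed: needs separate Access/Dirty control */
    bool write_only;      /* assumed: needs write-only mappings */
    bool per_page_snoop;  /* assumed: needs page-level force-snoop control */
};

/*
 * Pick a page table stage for a paging domain.
 * Returns 0 and stores the choice in *out, or -1 if the hardware
 * cannot satisfy the request.
 */
static int pick_stage(const struct hw_caps *hw,
                      const struct alloc_request *req,
                      enum pgtable_stage *out)
{
    assert(hw != NULL && req != NULL && out != NULL);

    /* Features that, per the list above, stage-1 does not offer. */
    bool needs_stage2 = req->nesting_parent || req->dirty_tracking ||
                        req->write_only || req->per_page_snoop;

    /* Default: stage-1 when nothing rules it out and hw supports it. */
    if (!needs_stage2 && hw->stage1_supported) {
        *out = STAGE_1;
        return 0;
    }

    /* Otherwise fall back to stage-2, if the hardware has it. */
    if (hw->stage2_supported) {
        *out = STAGE_2;
        return 0;
    }

    return -1;
}
```

In this sketch the caller never says whether the domain is for the DMA API or for VFIO; only the requested properties and the hardware capabilities influence the result, which is the kind of flag-based description Jason asks for earlier in the thread.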