From: Jason Gunthorpe <jgg@ziepe.ca>
To: Baolu Lu <baolu.lu@linux.intel.com>, Vasant Hegde <vasant.hegde@amd.com>
Cc: iommu@lists.linux.dev, joro@8bytes.org, will@kernel.org,
	robin.murphy@arm.com, suravee.suthikulpanit@amd.com,
	yi.l.liu@intel.com, kevin.tian@intel.com
Subject: Re: [PATCH 1/5] iommu: Enhance domain allocation code to take additional flags
Date: Mon, 26 Aug 2024 10:45:41 -0300
Message-ID: <20240826134541.GF3468552@ziepe.ca>
In-Reply-To: <ec556b8e-4a8c-45ed-a3d2-319ef9fdd04f@linux.intel.com>

On Fri, Aug 23, 2024 at 10:47:46AM +0800, Baolu Lu wrote:
> > Why? What exactly is the issue?
> > 
> > It is inherently wrong to behave differently based on DMA API or VFIO.
> > They are not different things.
> > 
> > If you have different behaviors and different properties, like AMD's
> > PASID, then they should be described and mapped to some kind of flag.
> > 
> > Otherwise the driver should always allocate a paging domain that gives
> > the highest performance.
> 
> It relates to Intel VT-d's nested translation. Intel VT-d has two types
> of page table formats for DMA translation: first level and second level.
> In nested translation, the first level page table is used for first-
> stage translation, and the second level page table is used for second-
> stage translation.
> 
> The iommu driver for vIOMMU in the guest kernel must use the first level
> page table format for kernel DMA. This page table will then be nested on
> a second level page table in the VMM host kernel.

The guest kernel must know that the IOMMU can only support the first
level format in this case! There is no other option!

> Our current design uses the first level page table for both the host and
> guest kernel for simplicity. This is why we use different page table
> formats for IOMMU_DOMAIN_DMA and IOMMU_DOMAIN_UNMANAGED.

You should just always use the first level format for every paging
domain except when IOMMU_HWPT_ALLOC_NEST_PARENT is set.

IOMMU_HWPT_ALLOC_NEST_PARENT was added specifically to solve this
problem.
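
The rule above can be sketched in a few lines of standalone C. This is
only an illustration, not the VT-d driver's code: the enum, the
function name and the SKETCH_ prefix are hypothetical, and the flag
value is a local stand-in for the uAPI flag of the same name.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-in for IOMMU_HWPT_ALLOC_NEST_PARENT; the value is an
 * assumption made for this sketch. */
#define SKETCH_HWPT_ALLOC_NEST_PARENT (1u << 0)

/* Hypothetical names for the two VT-d page table formats. */
enum sketch_pgtable_fmt {
	SKETCH_FMT_FIRST_LEVEL,
	SKETCH_FMT_SECOND_LEVEL,
};

/*
 * Every paging domain gets the first level format, unless the caller
 * asked for a nesting parent, which needs the second level format.
 */
static enum sketch_pgtable_fmt sketch_pick_format(uint32_t flags)
{
	if (flags & SKETCH_HWPT_ALLOC_NEST_PARENT)
		return SKETCH_FMT_SECOND_LEVEL;
	return SKETCH_FMT_FIRST_LEVEL;
}

/* Small self-check of the two cases described in the text. */
static void sketch_selfcheck(void)
{
	assert(sketch_pick_format(0) == SKETCH_FMT_FIRST_LEVEL);
	assert(sketch_pick_format(SKETCH_HWPT_ALLOC_NEST_PARENT) ==
	       SKETCH_FMT_SECOND_LEVEL);
}
```

Note that the decision depends only on the allocation flags, never on
whether the caller is the DMA API or VFIO.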

Ideally the guest kernel will reject IOMMU_HWPT_ALLOC_NEST_PARENT as
not supported at allocation or attach time. ARM does.

> I am not sure about other architectures, like AMD, ARM and RISC-V.
> Perhaps all of them have a similar need?

ARM/AMD/VTD all have the two page table problem.

ARM informs the guest whether S1/S2 is available via IDR, which is like
VT-d's ECAP. Only S1 can be used in the nested guest VM.

AMD has a global amd_iommu_pgtable, which must be set to AMD_IOMMU_V2
within a nested guest VM. The only way I saw for that to happen was
via a kernel command line parameter? I was expecting some ACPI or
FEATURE_XXX thing. Vasant?

VT-d should add some discoverability that the second stage is not
supported and behave as above.
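
As a rough sketch of what "discoverable and behave as above" could
look like, in standalone C: the capability struct, its field names and
the function are hypothetical, standing in for whatever the driver
reads from its capability registers, and the flag value is a local
stand-in for the uAPI flag.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Local stand-in for IOMMU_HWPT_ALLOC_NEST_PARENT; the value is an
 * assumption made for this sketch. */
#define SKETCH_HWPT_ALLOC_NEST_PARENT (1u << 0)

/* Hypothetical view of what the (v)IOMMU advertises to the kernel. */
struct sketch_iommu_caps {
	bool first_stage;	/* first level format usable */
	bool second_stage;	/* second level format usable */
};

/*
 * Returns 0 if a domain with these flags can be allocated, or
 * -EOPNOTSUPP if the advertised capabilities rule it out. A nested
 * guest that only sees the first stage rejects nesting parents here.
 */
static int sketch_check_alloc(const struct sketch_iommu_caps *caps,
			      uint32_t flags)
{
	if (flags & SKETCH_HWPT_ALLOC_NEST_PARENT)
		return caps->second_stage ? 0 : -EOPNOTSUPP;

	/* Plain paging domain: either format will do. */
	return (caps->first_stage || caps->second_stage) ? 0 : -EOPNOTSUPP;
}
```

With caps = { .first_stage = true, .second_stage = false }, as a nested
guest would see, a plain paging domain is accepted and a nesting
parent fails with -EOPNOTSUPP.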

Jason
