kvm.vger.kernel.org archive mirror
From: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>
To: Nicolin Chen <nicolinc@nvidia.com>,
	"will@kernel.org" <will@kernel.org>,
	"robin.murphy@arm.com" <robin.murphy@arm.com>,
	"jgg@nvidia.com" <jgg@nvidia.com>,
	"kevin.tian@intel.com" <kevin.tian@intel.com>,
	"tglx@linutronix.de" <tglx@linutronix.de>,
	"maz@kernel.org" <maz@kernel.org>,
	"alex.williamson@redhat.com" <alex.williamson@redhat.com>
Cc: "joro@8bytes.org" <joro@8bytes.org>,
	"shuah@kernel.org" <shuah@kernel.org>,
	"reinette.chatre@intel.com" <reinette.chatre@intel.com>,
	"eric.auger@redhat.com" <eric.auger@redhat.com>,
	"yebin (H)" <yebin10@huawei.com>,
	"apatel@ventanamicro.com" <apatel@ventanamicro.com>,
	"shivamurthy.shastri@linutronix.de"
	<shivamurthy.shastri@linutronix.de>,
	"bhelgaas@google.com" <bhelgaas@google.com>,
	"anna-maria@linutronix.de" <anna-maria@linutronix.de>,
	"yury.norov@gmail.com" <yury.norov@gmail.com>,
	"nipun.gupta@amd.com" <nipun.gupta@amd.com>,
	"iommu@lists.linux.dev" <iommu@lists.linux.dev>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-arm-kernel@lists.infradead.org"
	<linux-arm-kernel@lists.infradead.org>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	"linux-kselftest@vger.kernel.org"
	<linux-kselftest@vger.kernel.org>,
	"patches@lists.linux.dev" <patches@lists.linux.dev>,
	"jean-philippe@linaro.org" <jean-philippe@linaro.org>,
	"mdf@kernel.org" <mdf@kernel.org>,
	"mshavit@google.com" <mshavit@google.com>,
	"smostafa@google.com" <smostafa@google.com>,
	"ddutile@redhat.com" <ddutile@redhat.com>
Subject: RE: [PATCH RFCv2 00/13] iommu: Add MSI mapping support with nested SMMU
Date: Thu, 23 Jan 2025 09:06:49 +0000
Message-ID: <4946ea266bdc4b1e8796dee1b228bd8f@huawei.com>
In-Reply-To: <cover.1736550979.git.nicolinc@nvidia.com>

Hi Nicolin,

> -----Original Message-----
> From: Nicolin Chen <nicolinc@nvidia.com>
> Sent: Saturday, January 11, 2025 3:32 AM
> To: will@kernel.org; robin.murphy@arm.com; jgg@nvidia.com;
> kevin.tian@intel.com; tglx@linutronix.de; maz@kernel.org;
> alex.williamson@redhat.com
> Cc: joro@8bytes.org; shuah@kernel.org; reinette.chatre@intel.com;
> eric.auger@redhat.com; yebin (H) <yebin10@huawei.com>;
> apatel@ventanamicro.com; shivamurthy.shastri@linutronix.de;
> bhelgaas@google.com; anna-maria@linutronix.de; yury.norov@gmail.com;
> nipun.gupta@amd.com; iommu@lists.linux.dev;
> linux-kernel@vger.kernel.org; linux-arm-kernel@lists.infradead.org;
> kvm@vger.kernel.org; linux-kselftest@vger.kernel.org;
> patches@lists.linux.dev; jean-philippe@linaro.org; mdf@kernel.org;
> mshavit@google.com; Shameerali Kolothum Thodi
> <shameerali.kolothum.thodi@huawei.com>; smostafa@google.com;
> ddutile@redhat.com
> Subject: [PATCH RFCv2 00/13] iommu: Add MSI mapping support with nested SMMU
> 
> [ Background ]
> On ARM GIC systems and others, the target address of the MSI is translated
> by the IOMMU. For GIC, the MSI address page is called "ITS" page. When the
> IOMMU is disabled, the MSI address is programmed to the physical location
> of the GIC ITS page (e.g. 0x20200000). When the IOMMU is enabled, the ITS
> page is behind the IOMMU, so the MSI address is programmed to an allocated
> IO virtual address (a.k.a IOVA), e.g. 0xFFFF0000, which must be mapped to
> the physical ITS page: IOVA (0xFFFF0000) ===> PA (0x20200000).
> When a 2-stage translation is enabled, IOVA will be still used to program
> the MSI address, though the mappings will be in two stages:
>   IOVA (0xFFFF0000) ===> IPA (e.g. 0x80900000) ===> PA (0x20200000)
> (IPA stands for Intermediate Physical Address).
> 
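
To make the two-stage case above concrete, here is a minimal sketch of
the IOVA ===> IPA ===> PA walk, using the example addresses from the
cover letter. This is a toy model, not kernel code: the 4 KiB page size,
the dict-based tables, and every name in it (map_page, translate,
translate_nested) are assumptions made purely for illustration.

```python
PAGE_SHIFT = 12                      # assumed 4 KiB pages
PAGE_MASK = (1 << PAGE_SHIFT) - 1

# Example addresses taken from the cover letter.
ITS_PA = 0x20200000                  # physical ITS page
VITS_IPA = 0x80900000                # vITS page as seen by the guest (IPA)
MSI_IOVA = 0xFFFF0000                # IOVA programmed into the device


def map_page(table, src, dst):
    """Record a one-page mapping src -> dst in a toy page table (a dict)."""
    table[src >> PAGE_SHIFT] = dst >> PAGE_SHIFT


def translate(table, addr):
    """Translate addr through one stage, keeping the in-page offset."""
    page = addr >> PAGE_SHIFT
    if page not in table:
        raise LookupError(f"translation fault at {addr:#x}")
    return (table[page] << PAGE_SHIFT) | (addr & PAGE_MASK)


stage1 = {}                          # guest-owned: IOVA -> IPA
stage2 = {}                          # host-owned:  IPA  -> PA
map_page(stage1, MSI_IOVA, VITS_IPA)
map_page(stage2, VITS_IPA, ITS_PA)


def translate_nested(iova):
    """Apply stage 1, then stage 2, as a nested translation would."""
    return translate(stage2, translate(stage1, iova))


print(hex(translate_nested(MSI_IOVA)))
```

An MSI write only reaches the ITS page if both stages hold a mapping;
dropping either one makes translate() raise in this model, which stands
in for a translation fault.
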
> If the device that generates MSI is attached to an IOMMU_DOMAIN_DMA, the
> IOVA is dynamically allocated from the top of the IOVA space. If attached
> to an IOMMU_DOMAIN_UNMANAGED (e.g. a VFIO passthrough device), the IOVA is
> fixed to an MSI window reported by the IOMMU driver via IOMMU_RESV_SW_MSI,
> which is hardwired to MSI_IOVA_BASE (IOVA==0x8000000) for ARM IOMMUs.
> 
> So far, this IOMMU_RESV_SW_MSI works well as kernel is entirely in charge
> of the IOMMU translation (1-stage translation), since the IOVA for the ITS
> page is fixed and known by kernel. However, with virtual machine enabling
> a nested IOMMU translation (2-stage), a guest kernel directly controls the
> stage-1 translation with an IOMMU_DOMAIN_DMA, mapping a vITS page (at an
> IPA 0x80900000) onto its own IOVA space (e.g. 0xEEEE0000). Then, the host
> kernel can't know that guest-level IOVA to program the MSI address.
> 
> There have been two approaches to solve this problem:
> 1. Create an identity mapping in the stage-1. VMM could insert a few RMRs
>    (Reserved Memory Regions) in guest's IORT. Then the guest kernel would
>    fetch these RMR entries from the IORT and create an IOMMU_RESV_DIRECT
>    region per iommu group for a direct mapping. Eventually, the mappings
>    would look like: IOVA (0x8000000) === IPA (0x8000000) ===> 0x20200000
>    This requires an IOMMUFD ioctl for kernel and VMM to agree on the IPA.
> 2. Forward the guest-level MSI IOVA captured by VMM to the host-level GIC
>    driver, to program the correct MSI IOVA. Forward the VMM-defined vITS
>    page location (IPA) to the kernel for the stage-2 mapping. Eventually:
>    IOVA (0xFFFF0000) ===> IPA (0x80900000) ===> PA (0x20200000)
>    This requires a VFIO ioctl (for IOVA) and an IOMMUFD ioctl (for IPA).
> 
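
For comparison, the two approaches listed above can be sketched side by
side. Again this is only a toy model with assumed names (build, walk)
and an assumed 4 KiB page size; the addresses are the examples from the
cover letter. It shows which MSI address ends up programmed into the
device in each case and that both walks land on the same physical ITS
page.

```python
PAGE_SHIFT = 12                      # assumed 4 KiB pages
PAGE_MASK = (1 << PAGE_SHIFT) - 1

ITS_PA = 0x20200000                  # physical ITS page
SW_MSI_IOVA = 0x8000000              # fixed MSI window (MSI_IOVA_BASE)
GUEST_IOVA = 0xFFFF0000              # IOVA chosen by the guest kernel
VITS_IPA = 0x80900000                # vITS location chosen by the VMM


def build(pairs):
    """Build a toy one-stage table from (src, dst) page-aligned pairs."""
    return {src >> PAGE_SHIFT: dst >> PAGE_SHIFT for src, dst in pairs}


def walk(tables, addr):
    """Translate addr through each stage in order."""
    for table in tables:
        page, offset = addr >> PAGE_SHIFT, addr & PAGE_MASK
        addr = (table[page] << PAGE_SHIFT) | offset
    return addr


# Approach (1): stage 1 is an identity mapping (RMR / IOMMU_RESV_DIRECT),
# so the host can keep programming the fixed window address.
a1_tables = [build([(SW_MSI_IOVA, SW_MSI_IOVA)]),   # IOVA === IPA
             build([(SW_MSI_IOVA, ITS_PA)])]        # IPA ===> PA
a1_msi_addr = SW_MSI_IOVA

# Approach (2): the guest picks the IOVA, so the VMM must forward both
# the guest IOVA and the vITS IPA to the host kernel.
a2_tables = [build([(GUEST_IOVA, VITS_IPA)]),       # IOVA ===> IPA
             build([(VITS_IPA, ITS_PA)])]           # IPA ===> PA
a2_msi_addr = GUEST_IOVA

for name, tables, msi_addr in (("approach 1", a1_tables, a1_msi_addr),
                               ("approach 2", a2_tables, a2_msi_addr)):
    print(name, hex(msi_addr), "->", hex(walk(tables, msi_addr)))
```

In this model the only difference the host sees is where the MSI address
comes from: a constant in approach (1), a value forwarded by the VMM in
approach (2).
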
> Worth mentioning that when Eric Auger was working on the same topic with
> the VFIO iommu uAPI, he had the approach (2) first, and then switched to
> the approach (1), suggested by Jean-Philippe for reduction of complexity.
> 
> The approach (1) basically feels like the existing VFIO passthrough that
> has a 1-stage mapping for the unmanaged domain, yet only by shifting the
> MSI mapping from stage 1 (guest-has-no-iommu case) to stage 2
> (guest-has-iommu case). So, it could reuse the existing IOMMU_RESV_SW_MSI
> piece, by sharing the same idea of "VMM leaving everything to the kernel".
> 
> The approach (2) is an ideal solution, yet it requires additional effort
> for kernel to be aware of the 1-stage gIOVA(s) and 2-stage IPAs for vITS
> page(s), which demands VMM to closely cooperate.
>  * It also brings some complicated use cases to the table where the host
>    or/and guest system(s) has/have multiple ITS pages.

I did some basic sanity tests with this series and the QEMU branches you
provided on HiSilicon hardware. Basic device assignment works fine. I will
rebase my QEMU smmuv3-accel branch on top of this and do some more tests.

One thing I am unsure about in the text above: do we still plan to support
approach (1) (using RMRs in QEMU), or are you just mentioning it here because
it is still possible to make use of it? I think the argument in previous
discussions was to adopt a more dedicated MSI pass-through model, which I
think is approach (2) here. Could you please confirm?

Thanks,
Shameer



