From: Eric Auger <eric.auger@redhat.com>
To: "Duan, Zhenzhong" <zhenzhong.duan@intel.com>,
	"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>
Cc: "alex@shazbot.org" <alex@shazbot.org>,
	"clg@redhat.com" <clg@redhat.com>,
	 "mst@redhat.com" <mst@redhat.com>,
	"jasowang@redhat.com" <jasowang@redhat.com>,
	"peterx@redhat.com" <peterx@redhat.com>,
	"ddutile@redhat.com" <ddutile@redhat.com>,
	"jgg@nvidia.com" <jgg@nvidia.com>,
	"nicolinc@nvidia.com" <nicolinc@nvidia.com>,
	"skolothumtho@nvidia.com" <skolothumtho@nvidia.com>,
	"joao.m.martins@oracle.com" <joao.m.martins@oracle.com>,
	"clement.mathieu--drif@eviden.com"
	<clement.mathieu--drif@eviden.com>,
	"Tian, Kevin" <kevin.tian@intel.com>,
	"Liu, Yi L" <yi.l.liu@intel.com>,
	"Peng, Chao P" <chao.p.peng@intel.com>
Subject: Re: [PATCH v8 09/23] intel_iommu_accel: Check for compatibility with IOMMUFD backed device when x-flts=on
Date: Thu, 11 Dec 2025 08:09:03 +0100
Message-ID: <ef46c714-28f6-413c-9270-6119a92a2849@redhat.com>
In-Reply-To: <IA3PR11MB9136A94E3A6B29A2F35D57EA92A1A@IA3PR11MB9136.namprd11.prod.outlook.com>



On 12/11/25 7:49 AM, Duan, Zhenzhong wrote:
> Hi Eric,
>
>> -----Original Message-----
>> From: Eric Auger <eric.auger@redhat.com>
>> Subject: Re: [PATCH v8 09/23] intel_iommu_accel: Check for compatibility
>> with IOMMUFD backed device when x-flts=on
>>
>> Hi Zhenzhong,
>> On 11/17/25 10:37 AM, Zhenzhong Duan wrote:
>>> When vIOMMU is configured with x-flts=on in scalable mode, the first stage
>>> page table is passed to the host to construct a nested page table for
>>> passthrough devices.
>>>
>>> We need to check compatibility of some critical IOMMU capabilities between
>>> vIOMMU and host IOMMU to ensure the guest first stage page table can be used
>>> by the host.
>>>
>>> For instance, if vIOMMU supports first stage 1GB large page mapping but the
>>> host does not, then attaching this IOMMUFD backed device should fail.
>>>
>>> Even if the checks pass, for now we deliberately reject the association
>>> because not all the bits are there yet; this will be relaxed at the end of
>>> this series.
>>>
>>> Note that vIOMMU has exposed the IOMMU_HWPT_ALLOC_NEST_PARENT flag to force
>>> VFIO core to create a nesting parent HWPT. If the host doesn't support nested
>>> translation, the creation will fail, so there is no need to check the nested
>>> capability here.
>>>
>>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
>>> ---
>>>  MAINTAINERS                 |  1 +
>>>  hw/i386/intel_iommu_accel.h | 28 +++++++++++++++++++++++++
>>>  hw/i386/intel_iommu.c       |  5 ++---
>>>  hw/i386/intel_iommu_accel.c | 42 +++++++++++++++++++++++++++++++++++++
>>>  hw/i386/Kconfig             |  5 +++++
>>>  hw/i386/meson.build         |  1 +
>>>  6 files changed, 79 insertions(+), 3 deletions(-)
>>>  create mode 100644 hw/i386/intel_iommu_accel.h
>>>  create mode 100644 hw/i386/intel_iommu_accel.c
>>>
>>> diff --git a/MAINTAINERS b/MAINTAINERS
>>> index f4a30c126b..bc1d2b6261 100644
>>> --- a/MAINTAINERS
>>> +++ b/MAINTAINERS
>>> @@ -3929,6 +3929,7 @@ R: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com>
>>>  S: Supported
>>>  F: hw/i386/intel_iommu.c
>>>  F: hw/i386/intel_iommu_internal.h
>>> +F: hw/i386/intel_iommu_accel.*
>>>  F: include/hw/i386/intel_iommu.h
>>>  F: tests/functional/x86_64/test_intel_iommu.py
>>>  F: tests/qtest/intel-iommu-test.c
>>> diff --git a/hw/i386/intel_iommu_accel.h b/hw/i386/intel_iommu_accel.h
>>> new file mode 100644
>>> index 0000000000..c5274e342c
>>> --- /dev/null
>>> +++ b/hw/i386/intel_iommu_accel.h
>>> @@ -0,0 +1,28 @@
>>> +/*
>>> + * Intel IOMMU acceleration with nested translation
>>> + *
>>> + * Copyright (C) 2025 Intel Corporation.
>>> + *
>>> + * Authors: Zhenzhong Duan <zhenzhong.duan@intel.com>
>>> + *
>>> + * SPDX-License-Identifier: GPL-2.0-or-later
>>> + */
>>> +
>>> +#ifndef HW_I386_INTEL_IOMMU_ACCEL_H
>>> +#define HW_I386_INTEL_IOMMU_ACCEL_H
>>> +#include CONFIG_DEVICES
> Here to address Cédric's suggestion.
>
>>> +
>>> +#ifdef CONFIG_VTD_ACCEL
>>> +bool vtd_check_hiod_accel(IntelIOMMUState *s, HostIOMMUDevice *hiod,
>>> +                          Error **errp);
>>> +#else
>>> +static inline bool vtd_check_hiod_accel(IntelIOMMUState *s,
>>> +                                        HostIOMMUDevice *hiod,
>>> +                                        Error **errp)
>>> +{
>>> +    error_setg(errp,
>>> +               "host IOMMU is incompatible with guest first stage translation");
>> I would rather change the error message to
>>
>> host IOMMU cannot be checked!
>>
>> and append a hint through error_append_hint(), e.g. "CONFIG_VTD_ACCEL
>> is not enabled" or something similar.
> Will do.
>
>>> +    return false;
>>> +}
>>> +#endif
>>> +#endif
>>> diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
>>> index 3095d78321..d3c8a75878 100644
>>> --- a/hw/i386/intel_iommu.c
>>> +++ b/hw/i386/intel_iommu.c
>>> @@ -26,6 +26,7 @@
>>>  #include "hw/sysbus.h"
>>>  #include "hw/iommu.h"
>>>  #include "intel_iommu_internal.h"
>>> +#include "intel_iommu_accel.h"
>>>  #include "hw/pci/pci.h"
>>>  #include "hw/pci/pci_bus.h"
>>>  #include "hw/qdev-properties.h"
>>> @@ -4596,9 +4597,7 @@ static bool vtd_check_hiod(IntelIOMMUState *s, HostIOMMUDevice *hiod,
>>>          return true;
>>>      }
>>>
>>> -    error_setg(errp,
>>> -               "host device is uncompatible with first stage translation");
>>> -    return false;
>>> +    return vtd_check_hiod_accel(s, hiod, errp);
>>>  }
>>>
>>>  static bool vtd_dev_set_iommu_device(PCIBus *bus, void *opaque, int devfn,
>>> diff --git a/hw/i386/intel_iommu_accel.c b/hw/i386/intel_iommu_accel.c
>>> new file mode 100644
>>> index 0000000000..6846c6ec4d
>>> --- /dev/null
>>> +++ b/hw/i386/intel_iommu_accel.c
>>> @@ -0,0 +1,42 @@
>>> +/*
>>> + * Intel IOMMU acceleration with nested translation
>>> + *
>>> + * Copyright (C) 2025 Intel Corporation.
>>> + *
>>> + * Authors: Zhenzhong Duan <zhenzhong.duan@intel.com>
>>> + *
>>> + * SPDX-License-Identifier: GPL-2.0-or-later
>>> + */
>>> +
>>> +#include "qemu/osdep.h"
>>> +#include "system/iommufd.h"
>>> +#include "intel_iommu_internal.h"
>>> +#include "intel_iommu_accel.h"
>>> +
>>> +bool vtd_check_hiod_accel(IntelIOMMUState *s, HostIOMMUDevice *hiod,
>>> +                          Error **errp)
>>> +{
>>> +    struct HostIOMMUDeviceCaps *caps = &hiod->caps;
>>> +    struct iommu_hw_info_vtd *vtd = &caps->vendor_caps.vtd;
>>> +
>>> +    if (!object_dynamic_cast(OBJECT(hiod), TYPE_HOST_IOMMU_DEVICE_IOMMUFD)) {
>>> +        error_setg(errp, "Need IOMMUFD backend when x-flts=on");
>>> +        return false;
>>> +    }
>>> +
>>> +    if (caps->type != IOMMU_HW_INFO_TYPE_INTEL_VTD) {
>>> +        error_setg(errp, "Incompatible host platform IOMMU type %d",
>>> +                   caps->type);
>>> +        return false;
>>> +    }
>>> +
>>> +    if (s->fs1gp && !(vtd->cap_reg & VTD_CAP_FS1GP)) {
>>> +        error_setg(errp,
>>> +                   "First stage 1GB large page is unsupported by host IOMMU");
>>> +        return false;
>>> +    }
>>> +
>>> +    error_setg(errp,
>>> +               "host IOMMU is incompatible with guest first stage translation");
>>> +    return false;
>>> +}
>>> diff --git a/hw/i386/Kconfig b/hw/i386/Kconfig
>>> index 6a0ab54bea..12473acaa7 100644
>>> --- a/hw/i386/Kconfig
>>> +++ b/hw/i386/Kconfig
>>> @@ -150,8 +150,13 @@ config X86_IOMMU
>>>
>>>  config VTD
>>>      bool
>>> +    imply VTD_ACCEL
>>>      select X86_IOMMU
>>>
>>> +config VTD_ACCEL
>>> +    bool
>>> +    depends on VTD && IOMMUFD
>>> +
>>>  config AMD_IOMMU
>>>      bool
>>>      select X86_IOMMU
>>> diff --git a/hw/i386/meson.build b/hw/i386/meson.build
>>> index 436b3ce52d..63ae57baa5 100644
>>> --- a/hw/i386/meson.build
>>> +++ b/hw/i386/meson.build
>>> @@ -21,6 +21,7 @@ i386_ss.add(when: 'CONFIG_Q35', if_true: files('pc_q35.c'))
>>>  i386_ss.add(when: 'CONFIG_VMMOUSE', if_true: files('vmmouse.c'))
>>>  i386_ss.add(when: 'CONFIG_VMPORT', if_true: files('vmport.c'))
>>>  i386_ss.add(when: 'CONFIG_VTD', if_true: files('intel_iommu.c'))
>>> +i386_ss.add(when: 'CONFIG_VTD_ACCEL', if_true: files('intel_iommu_accel.c'))
>>>  i386_ss.add(when: 'CONFIG_SGX', if_true: files('sgx-epc.c','sgx.c'),
>>>                                  if_false: files('sgx-stub.c'))
>>>
>> With regard to the comments made by Cédric in
>> https://lore.kernel.org/all/IA3PR11MB9136B13C0C48EF293D3B599D92FAA@IA3PR11MB9136.namprd11.prod.outlook.com/
>> I see you kept the original approach. I have no strong opinion on that.
>> I will let Cédric comment if he strongly disagrees.
> Guess you mean adding '#include CONFIG_DEVICES'?
> I added it in hw/i386/intel_iommu_accel.h, see above. There is reference to
> CONFIG_VTD_ACCEL in intel_iommu_accel.h, I thought it's better to add it
> there instead of intel_iommu_accel.c

No, I rather meant Cédric's comment on extending HostIOMMUDeviceClass
instead of using iommufd directly.

Eric
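
For readers following the thread, here is a rough, self-contained sketch of
the general idea of querying capabilities through a class callback rather
than checking for the IOMMUFD backend type in the vIOMMU code. It is not the
change Cédric proposed in detail, and every name in it (HostDev,
HostDevClass, get_vtd_cap, SKETCH_CAP_FS1GP, check_fs1gp) is hypothetical
and does not exist in QEMU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct HostDev HostDev;

/* Hypothetical per-backend class: a backend fills in the hooks it supports. */
typedef struct HostDevClass {
    const char *name;
    /* Hypothetical hook: returns true and fills *cap_reg when available. */
    bool (*get_vtd_cap)(const HostDev *dev, uint64_t *cap_reg);
} HostDevClass;

struct HostDev {
    const HostDevClass *klass;
    uint64_t cap_reg;           /* capability register reported by the host */
};

/* Illustrative capability bit for first stage 1GB pages (assumption). */
#define SKETCH_CAP_FS1GP (1ULL << 56)

/* Backend that can report VT-d capabilities. */
static bool iommufd_get_vtd_cap(const HostDev *dev, uint64_t *cap_reg)
{
    *cap_reg = dev->cap_reg;
    return true;
}

static const HostDevClass iommufd_class = { "iommufd", iommufd_get_vtd_cap };
static const HostDevClass legacy_class  = { "legacy", NULL };

/*
 * vIOMMU-side check: it depends only on the class hook, so it never needs
 * to know which backend type it is talking to.
 */
static bool check_fs1gp(const HostDev *dev, bool guest_fs1gp)
{
    uint64_t cap = 0;

    assert(dev && dev->klass);
    if (!dev->klass->get_vtd_cap || !dev->klass->get_vtd_cap(dev, &cap)) {
        return false;   /* backend cannot report VT-d capabilities */
    }
    /* Fail only when the guest wants 1GB pages and the host lacks them. */
    return !guest_fs1gp || (cap & SKETCH_CAP_FS1GP) != 0;
}
```

With this shape, a backend without the hook is rejected in the same code
path as one that lacks the capability, and the caller needs no dynamic cast.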


>
> Thanks
> Zhenzhong
>
>> With my comment taken into account feel free to grab my
>>
>> Reviewed-by: Eric Auger <eric.auger@redhat.com>
>>
>> Thanks
>>
>> Eric
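
As an illustration of the error message change requested above, the
!CONFIG_VTD_ACCEL fallback could look roughly like the sketch below. The
Error type and the error_setg()/error_append_hint() helpers here are
simplified stand-ins written for this example so that it compiles on its
own; they are not QEMU's implementations, and the function name
vtd_check_hiod_accel_stub and its reduced argument list are assumptions. In
QEMU itself, as far as I know, a function that calls error_append_hint() on
its errp argument is also expected to start with ERRP_GUARD().

```c
#include <assert.h>
#include <stdarg.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Stand-in for QEMU's Error object: NOT the real implementation. */
typedef struct Error {
    char msg[128];
    char hint[128];
} Error;

/* Simplified stand-in for error_setg(): stores the primary message. */
static void error_setg(Error **errp, const char *fmt, ...)
{
    static Error err;   /* one static slot is enough for this sketch */
    va_list ap;

    assert(fmt != NULL);
    if (!errp) {
        return;
    }
    memset(&err, 0, sizeof(err));
    va_start(ap, fmt);
    vsnprintf(err.msg, sizeof(err.msg), fmt, ap);
    va_end(ap);
    *errp = &err;
}

/* Simplified stand-in for error_append_hint(): appends to the hint text. */
static void error_append_hint(Error **errp, const char *fmt, ...)
{
    va_list ap;
    size_t used;

    if (!errp || !*errp) {
        return;
    }
    used = strlen((*errp)->hint);
    va_start(ap, fmt);
    vsnprintf((*errp)->hint + used, sizeof((*errp)->hint) - used, fmt, ap);
    va_end(ap);
}

/*
 * Fallback used when the accel code is compiled out. The IntelIOMMUState
 * and HostIOMMUDevice arguments are dropped here for brevity.
 */
static bool vtd_check_hiod_accel_stub(Error **errp)
{
    error_setg(errp, "host IOMMU cannot be checked!");
    error_append_hint(errp, "CONFIG_VTD_ACCEL is not enabled\n");
    return false;
}
```

The point of the split is that the primary message says what failed, while
the hint tells the user why the check could not run in this build.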


