From: "Michael S. Tsirkin" <mst@redhat.com>
To: "Duan, Zhenzhong" <zhenzhong.duan@intel.com>
Cc: "Cédric Le Goater" <clg@redhat.com>,
"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
"alex.williamson@redhat.com" <alex.williamson@redhat.com>,
"eric.auger@redhat.com" <eric.auger@redhat.com>,
"peterx@redhat.com" <peterx@redhat.com>,
"jasowang@redhat.com" <jasowang@redhat.com>,
"jgg@nvidia.com" <jgg@nvidia.com>,
"nicolinc@nvidia.com" <nicolinc@nvidia.com>,
"joao.m.martins@oracle.com" <joao.m.martins@oracle.com>,
"Tian, Kevin" <kevin.tian@intel.com>,
"Liu, Yi L" <yi.l.liu@intel.com>,
"Peng, Chao P" <chao.p.peng@intel.com>,
"Yi Sun" <yi.y.sun@linux.intel.com>,
"Marcel Apfelbaum" <marcel.apfelbaum@gmail.com>,
"Paolo Bonzini" <pbonzini@redhat.com>,
"Richard Henderson" <richard.henderson@linaro.org>,
"Eduardo Habkost" <eduardo@habkost.net>
Subject: Re: [PATCH v2 3/5] intel_iommu: Add a framework to do compatibility check with host IOMMU cap/ecap
Date: Sun, 2 Jun 2024 08:56:00 -0400
Message-ID: <20240602085542-mutt-send-email-mst@kernel.org>
In-Reply-To: <SJ0PR11MB67441F19CD04FF98AD2BFA3C92162@SJ0PR11MB6744.namprd11.prod.outlook.com>
On Fri, Apr 26, 2024 at 03:10:14AM +0000, Duan, Zhenzhong wrote:
>
>
> >-----Original Message-----
> >From: Cédric Le Goater <clg@redhat.com>
> >Subject: Re: [PATCH v2 3/5] intel_iommu: Add a framework to do
> >compatibility check with host IOMMU cap/ecap
> >
> >On 4/25/24 10:46, Duan, Zhenzhong wrote:
> >> Hi Cédric,
> >>
> >>> -----Original Message-----
> >>> From: Cédric Le Goater <clg@redhat.com>
> >>> Subject: Re: [PATCH v2 3/5] intel_iommu: Add a framework to do
> >>> compatibility check with host IOMMU cap/ecap
> >>>
> >>> Hello Zhenzhong,
> >>>
> >>> On 4/18/24 10:42, Duan, Zhenzhong wrote:
> >>>> Hi Cédric,
> >>>>
> >>>>> -----Original Message-----
> >>>>> From: Cédric Le Goater <clg@redhat.com>
> >>>>> Subject: Re: [PATCH v2 3/5] intel_iommu: Add a framework to do
> >>>>> compatibility check with host IOMMU cap/ecap
> >>>>>
> >>>>> Hello Zhenzhong
> >>>>>
> >>>>> On 4/17/24 11:24, Duan, Zhenzhong wrote:
> >>>>>>
> >>>>>>
> >>>>>>> -----Original Message-----
> >>>>>>> From: Cédric Le Goater <clg@redhat.com>
> >>>>>>> Subject: Re: [PATCH v2 3/5] intel_iommu: Add a framework to do
> >>>>>>> compatibility check with host IOMMU cap/ecap
> >>>>>>>
> >>>>>>> On 4/17/24 06:21, Duan, Zhenzhong wrote:
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>> -----Original Message-----
> >>>>>>>>> From: Cédric Le Goater <clg@redhat.com>
> >>>>>>>>> Subject: Re: [PATCH v2 3/5] intel_iommu: Add a framework to do
> >>>>>>>>> compatibility check with host IOMMU cap/ecap
> >>>>>>>>>
> >>>>>>>>> Hello,
> >>>>>>>>>
> >>>>>>>>> On 4/16/24 09:09, Duan, Zhenzhong wrote:
> >>>>>>>>>> Hi Cédric,
> >>>>>>>>>>
> >>>>>>>>>>> -----Original Message-----
> >>>>>>>>>>> From: Cédric Le Goater <clg@redhat.com>
> >>>>>>>>>>> Subject: Re: [PATCH v2 3/5] intel_iommu: Add a framework to do
> >>>>>>>>>>> compatibility check with host IOMMU cap/ecap
> >>>>>>>>>>>
> >>>>>>>>>>> On 4/8/24 10:44, Zhenzhong Duan wrote:
> >>>>>>>>>>>> From: Yi Liu <yi.l.liu@intel.com>
> >>>>>>>>>>>>
> >>>>>>>>>>>> If check fails, the host side device (either vfio or vdpa device) should not
> >>>>>>>>>>>> be passed to guest.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Implementation details for different backends will be in following patches.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Signed-off-by: Yi Liu <yi.l.liu@intel.com>
> >>>>>>>>>>>> Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
> >>>>>>>>>>>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
> >>>>>>>>>>>> ---
> >>>>>>>>>>>> hw/i386/intel_iommu.c | 35 +++++++++++++++++++++++++++++++++++
> >>>>>>>>>>>> 1 file changed, 35 insertions(+)
> >>>>>>>>>>>>
> >>>>>>>>>>>> diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
> >>>>>>>>>>>> index 4f84e2e801..a49b587c73 100644
> >>>>>>>>>>>> --- a/hw/i386/intel_iommu.c
> >>>>>>>>>>>> +++ b/hw/i386/intel_iommu.c
> >>>>>>>>>>>> @@ -35,6 +35,7 @@
> >>>>>>>>>>>> #include "sysemu/kvm.h"
> >>>>>>>>>>>> #include "sysemu/dma.h"
> >>>>>>>>>>>> #include "sysemu/sysemu.h"
> >>>>>>>>>>>> +#include "sysemu/iommufd.h"
> >>>>>>>>>>>> #include "hw/i386/apic_internal.h"
> >>>>>>>>>>>> #include "kvm/kvm_i386.h"
> >>>>>>>>>>>> #include "migration/vmstate.h"
> >>>>>>>>>>>> @@ -3819,6 +3820,32 @@ VTDAddressSpace *vtd_find_add_as(IntelIOMMUState *s, PCIBus *bus,
> >>>>>>>>>>>>      return vtd_dev_as;
> >>>>>>>>>>>>  }
> >>>>>>>>>>>>
> >>>>>>>>>>>> +static int vtd_check_legacy_hdev(IntelIOMMUState *s,
> >>>>>>>>>>>> +                                 HostIOMMUDevice *hiod,
> >>>>>>>>>>>> +                                 Error **errp)
> >>>>>>>>>>>> +{
> >>>>>>>>>>>> +    return 0;
> >>>>>>>>>>>> +}
> >>>>>>>>>>>> +
> >>>>>>>>>>>> +static int vtd_check_iommufd_hdev(IntelIOMMUState *s,
> >>>>>>>>>>>> +                                  HostIOMMUDevice *hiod,
> >>>>>>>>>>>> +                                  Error **errp)
> >>>>>>>>>>>> +{
> >>>>>>>>>>>> +    return 0;
> >>>>>>>>>>>> +}
> >>>>>>>>>>>> +
> >>>>>>>>>>>> +static int vtd_check_hdev(IntelIOMMUState *s, VTDHostIOMMUDevice *vtd_hdev,
> >>>>>>>>>>>> +                          Error **errp)
> >>>>>>>>>>>> +{
> >>>>>>>>>>>> +    HostIOMMUDevice *hiod = vtd_hdev->dev;
> >>>>>>>>>>>> +
> >>>>>>>>>>>> +    if (object_dynamic_cast(OBJECT(hiod), TYPE_HIOD_IOMMUFD)) {
> >>>>>>>>>>>> +        return vtd_check_iommufd_hdev(s, hiod, errp);
> >>>>>>>>>>>> +    }
> >>>>>>>>>>>> +
> >>>>>>>>>>>> +    return vtd_check_legacy_hdev(s, hiod, errp);
> >>>>>>>>>>>> +}
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> I think we should be using the .get_host_iommu_info() class handler
> >>>>>>>>>>> instead. Can we refactor the code slightly to avoid this check on
> >>>>>>>>>>> the type?
> >>>>>>>>>>
> >>>>>>>>>> There is some difficulty in avoiding this check: the behaviors of
> >>>>>>>>>> vtd_check_legacy_hdev() and vtd_check_iommufd_hdev() differ, especially
> >>>>>>>>>> after nesting support is introduced.
> >>>>>>>>>> vtd_check_iommufd_hdev() checks a much wider set of cap/ecap bits
> >>>>>>>>>> besides aw_bits.
> >>>>>>>>>
> >>>>>>>>> I think it is important to fully separate the vIOMMU model from the
> >>>>>>>>> host IOMMU backing device.
> >>>>>
> >>>>> This comment is true for the structures also.
> >>>>>
> >>>>>>>>> Could we introduce a new HostIOMMUDeviceClass .check_hdev()
> >>>>>>>>> handler, which would call .get_host_iommu_info()?
> >>>>>
> >>>>> This means that HIOD_LEGACY_INFO and HIOD_IOMMUFD_INFO should be
> >>>>> a common structure 'HostIOMMUDeviceInfo' holding all attributes
> >>>>> for the different backends. Each .get_host_iommu_info() implementation
> >>>>> would translate the specific host IOMMU device data representation
> >>>>> into the common 'HostIOMMUDeviceInfo'. This is true for host_aw_bits.
> >>>>
> >>>> I see, it's just not easy to define the unified elements in HostIOMMUDeviceInfo
> >>>> so that they map to bits or fields in the IOMMU info returned by the host.
> >>>
> >>> The proposal is adding a vIOMMU <-> HostIOMMUDevice interface and a new
> >>> API needs to be completely defined for it. The IOMMU backend implementation
> >>> could be anything, legacy, iommufd, iommufd v2, some other framework, and
> >>> the vIOMMU shouldn't be aware of its implementation.
> >>>
> >>> Exposing the kernel structures as done below should be avoided because
> >>> they are part of the QEMU <-> kernel IOMMUFD interface.
> >>>
> >>>
> >>>> The host IOMMU info returned differs per platform; it is platform specific.
> >>>> For vtd and smmuv3:
> >>>>
> >>>> struct iommu_hw_info_vtd {
> >>>>     __u32 flags;
> >>>>     __u32 __reserved;
> >>>>     __aligned_u64 cap_reg;
> >>>>     __aligned_u64 ecap_reg;
> >>>> };
> >>>>
> >>>> struct iommu_hw_info_arm_smmuv3 {
> >>>>     __u32 flags;
> >>>>     __u32 __reserved;
> >>>>     __u32 idr[6];
> >>>>     __u32 iidr;
> >>>>     __u32 aidr;
> >>>> };
> >>>>
> >>>> I can think of two kinds of declaration of HostIOMMUDeviceInfo:
> >>>>
> >>>> struct HostIOMMUDeviceInfo {
> >>>>     uint8_t aw_bits;
> >>>>     enum iommu_hw_info_type type;
> >>>>     union {
> >>>>         struct iommu_hw_info_vtd vtd;
> >>>>         struct iommu_hw_info_arm_smmuv3 smmuv3;
> >>>>         ......
> >>>>     } data;
> >>>> };
> >>>>
> >>>> or
> >>>>
> >>>> struct HostIOMMUDeviceInfo {
> >>>>     uint8_t aw_bits;
> >>>>     enum iommu_hw_info_type type;
> >>>>     __u32 flags;
> >>>>     __aligned_u64 cap_reg;
> >>>>     __aligned_u64 ecap_reg;
> >>>>     __u32 idr[6];
> >>>>     __u32 iidr;
> >>>>     __u32 aidr;
> >>>>     ......
> >>>> };
> >>>>
> >>>> It's not clear whether either of these is the format you expected.
> >>>>
> >>>>> 'type' could be handled the same way, with a 'HostIOMMUDeviceInfo'
> >>>>> type attribute and host iommu device type definitions, or as you
> >>>>> suggested with a QOM interface. This is more complex however. In
> >>>>> this case, I would suggest to implement a .compatible() handler to
> >>>>> compare the host iommu device type with the vIOMMU type.
> >>>>>
> >>>>> The resulting check_hdev routine would look something like :
> >>>>>
> >>>>> static int vtd_check_hdev(IntelIOMMUState *s, VTDHostIOMMUDevice *vtd_hdev,
> >>>>>                           Error **errp)
> >>>>> {
> >>>>>     HostIOMMUDevice *hiod = vtd_hdev->dev;
> >>>>>     HostIOMMUDeviceClass *hiodc = HOST_IOMMU_DEVICE_GET_CLASS(hiod);
> >>>>>     HostIOMMUDeviceInfo info;
> >>>>>     int ret;
> >>>>>
> >>>>>     /* Fetch the backend-independent host IOMMU info. */
> >>>>>     ret = hiodc->get_host_iommu_info(hiod, &info, sizeof(info), errp);
> >>>>>     if (ret) {
> >>>>>         return ret;
> >>>>>     }
> >>>>>
> >>>>>     /* Compare the host IOMMU device type with the vIOMMU type. */
> >>>>>     ret = hiodc->is_compatible(hiod, VIOMMU_INTERFACE(s));
> >>>>>     if (ret) {
> >>>>>         return ret;
> >>>>>     }
> >>>>>
> >>>>>     /* The vIOMMU address width must not exceed the host's. */
> >>>>>     if (s->aw_bits > info.aw_bits) {
> >>>>>         error_setg(errp, "aw-bits %d > host aw-bits %d",
> >>>>>                    s->aw_bits, info.aw_bits);
> >>>>>         return -EINVAL;
> >>>>>     }
> >>>>>
> >>>>>     return 0;
> >>>>> }
> >>>>>
> >>>>> and the HostIOMMUDeviceClass::is_compatible() handler would call a
> >>>>> vIOMMUInterface::compatible() handler simply returning
> >>>>> IOMMU_HW_INFO_TYPE_INTEL_VTD. How does that sound ?
> >>>>
> >>>> I don't quite get what HostIOMMUDeviceClass::is_compatible() does.
> >>>
> >>> HostIOMMUDeviceClass::is_compatible() calls into the host IOMMU backend
> >>> to determine which IOMMU types are exposed by the host, then calls the
> >>> vIOMMUInterface::compatible() handler to do the comparison. The API is
> >>> to be defined.
> >>>
> >>> As a refinement, we could introduce in the vIOMMU <-> HostIOMMUDevice
> >>> interface capabilities, or features, to check more precisely the level
> >>> of compatibility between the vIOMMU and the host IOMMU device. This is
> >>> similar to what is done between QEMU and KVM.
> >>>
> >>> If you think this is too complex, include type in HostIOMMUDeviceInfo.
> >>>
> >>>> Currently the legacy and IOMMUFD host devices have different check logic. How
> >>>> can it help in merging vtd_check_legacy_hdev() and vtd_check_iommufd_hdev()
> >>>> into a single vtd_check_hdev()?
> >>>
> >>> IMHO, the vIOMMU shouldn't be aware of the IOMMU backend implementation,
> >>> but if you think the Intel vIOMMU should directly access the iommufd backend
> >>> when available, then we should drop this proposal and revisit the design
> >>> to take a different approach.
> >>
> >> I implemented a draft following your suggestions so we could explore further.
> >> See https://github.com/yiliu1765/qemu/tree/zhenzhong/iommufd_nesting_preq_v3_tmp
> >>
> >> In this draft, it uses .check_cap() to query HOST_IOMMU_DEVICE_CAP_xxx
> >> just like KVM CAPs.
> >> A common HostIOMMUDeviceCaps structure is introduced to be used by
> >> both legacy and iommufd backend.
> >>
> >> It is indeed cleaner. The only problem is that I failed to implement .compatible(),
> >> as all the checks can go ahead by just calling check_cap().
> >> Could you take a quick look to see if I misunderstood any of your suggestions?
> >
> >Thanks for the changes. It looks cleaner and simpler ! Some comments,
> >
> >* HostIOMMUDeviceIOMMUFDClass seems useless as it is empty. I don't
> > remember if you told me already you had plans for future changes.
> > Sorry about that if this is the case. I forgot.
>
> Never mind😊, the reason is:
>
> In the nesting series
> https://github.com/yiliu1765/qemu/commits/zhenzhong/iommufd_nesting_rfcv2/
> this commit
> https://github.com/yiliu1765/qemu/commit/581fc900aa296988eaa48abee6d68d3670faf8c9
> implements the [at|de]tach_hwpt handlers.
>
> So I added an extra abstract layer, HostIOMMUDeviceIOMMUFDClass, to define
> the [at|de]tach_hwpt handlers.
>
> >
> >* I would use the 'host_iommu_device_' prefix for external routines
> > which are part of the HostIOMMUDevice API and use 'hiod_' for
> > internal routines where it makes sense, to limit the name length for
> > instance.
>
> Good idea, will do.
>
> >
> >* I would rename HOST_IOMMU_DEVICE_CAP_IOMMUFD_V1 to
> > HOST_IOMMU_DEVICE_CAP_IOMMUFD. I mentioned IOMMUFD v2 as a
> > theoretical example of a different IOMMU interface. I don't think we
> > need to anticipate yet :)
>
> Will do.
>
> >
> >* HostIOMMUDeviceCaps is using 'enum iommu_hw_info_type' from
> > 'linux/iommufd.h', that's not my preferred choice but I won't
> > object. The result looks good.
>
> Ok, will keep it for now. We can change it whenever you want in the future.
>
> >
> >* HostIOMMUDevice now has a realize() routine to query the host IOMMU
> > capability for later usage. This is a good idea. However, you could
> > change the return value to bool and avoid the ERRP_GUARD() prologue.
>
> Will do.
>
> >
> >* Beware of :
> >
> >    struct Range {
> >        /*
> >         * Do not access members directly, use the functions!
> >         * A non-empty range has @lob <= @upb.
> >         * An empty range has @lob == @upb + 1.
> >         */
> >        uint64_t lob; /* inclusive lower bound */
> >        uint64_t upb; /* inclusive upper bound */
> >    };
>
> I remember😊, will add the change in formal version.
>
> >
> >
> >* I think you could introduce a new VFIOIOMMUClass attribute. Let's
> > call it VFIOIOMMUClass::hiod_typename. The creation of HostIOMMUDevice
> > would become generic and you could move :
> >
> >    hiod = HOST_IOMMU_DEVICE(object_new(TYPE_HOST_IOMMU_DEVICE_LEGACY_VFIO));
> >    HOST_IOMMU_DEVICE_GET_CLASS(hiod)->realize(hiod, vbasedev, errp);
> >    if (*errp) {
> >        object_unref(hiod);
> >        return -EINVAL;
> >    }
> >    vbasedev->hiod = hiod;
> >
> > at the end of vfio_attach_device().
>
> Good suggestion! Will do.
>
> Thanks
> Zhenzhong
So I'm expecting v3 of this.
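
For reference, the capability-query shape discussed above (a check_cap()-style handler that the vIOMMU calls without knowing which backend sits behind it) could look roughly like the standalone sketch below. All names in it (HostIOMMUDeviceSketch, hiod_cap, sketch_get_cap, viommu_check_hdev) are hypothetical and are not QEMU's actual API; it only illustrates the idea under those assumptions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical capability identifiers, loosely modelled on the
 * HOST_IOMMU_DEVICE_CAP_xxx idea mentioned in the thread. */
enum hiod_cap {
    HIOD_CAP_IOMMU_TYPE,
    HIOD_CAP_AW_BITS,
};

/* Hypothetical host IOMMU types (not the linux/iommufd.h enum). */
enum hiod_iommu_type {
    HIOD_TYPE_NONE,
    HIOD_TYPE_INTEL_VTD,
};

typedef struct HostIOMMUDeviceSketch HostIOMMUDeviceSketch;

struct HostIOMMUDeviceSketch {
    /* Backend-provided query; returns the value or a negative errno. */
    int (*get_cap)(const HostIOMMUDeviceSketch *hiod, enum hiod_cap cap);
    int iommu_type;   /* cached when the device is realized */
    int aw_bits;      /* cached host address width */
};

/* One possible backend implementation: answer from cached values. */
static int sketch_get_cap(const HostIOMMUDeviceSketch *hiod, enum hiod_cap cap)
{
    switch (cap) {
    case HIOD_CAP_IOMMU_TYPE:
        return hiod->iommu_type;
    case HIOD_CAP_AW_BITS:
        return hiod->aw_bits;
    default:
        return -EINVAL;
    }
}

/* vIOMMU-side check: uses only get_cap(), never the backend type. */
static int viommu_check_hdev(int viommu_aw_bits,
                             const HostIOMMUDeviceSketch *hiod)
{
    int type, host_aw_bits;

    assert(hiod != NULL && hiod->get_cap != NULL);

    type = hiod->get_cap(hiod, HIOD_CAP_IOMMU_TYPE);
    if (type < 0) {
        return type;
    }
    if (type != HIOD_TYPE_INTEL_VTD) {
        return -ENODEV;
    }

    host_aw_bits = hiod->get_cap(hiod, HIOD_CAP_AW_BITS);
    if (host_aw_bits < 0) {
        return host_aw_bits;
    }
    if (viommu_aw_bits > host_aw_bits) {
        return -EINVAL;
    }
    return 0;
}
```

With this shape, a legacy and an iommufd backend would each supply their own get_cap(), and the vIOMMU check stays a single function.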
Thread overview: 21+ messages
2024-04-08 8:43 [PATCH v2 0/5] Check host IOMMU compatilibity with vIOMMU Zhenzhong Duan
2024-04-08 8:44 ` [PATCH v2 1/5] intel_iommu: Extract out vtd_cap_init() to initialize cap/ecap Zhenzhong Duan
2024-04-08 8:44 ` [PATCH v2 2/5] intel_iommu: Implement set/unset_iommu_device() callback Zhenzhong Duan
2024-04-08 8:44 ` [PATCH v2 3/5] intel_iommu: Add a framework to do compatibility check with host IOMMU cap/ecap Zhenzhong Duan
2024-04-15 15:31 ` Cédric Le Goater
2024-04-16 7:09 ` Duan, Zhenzhong
2024-04-16 14:17 ` Cédric Le Goater
2024-04-17 4:21 ` Duan, Zhenzhong
2024-04-17 8:30 ` Cédric Le Goater
2024-04-17 9:24 ` Duan, Zhenzhong
2024-04-18 6:42 ` Cédric Le Goater
2024-04-18 8:42 ` Duan, Zhenzhong
2024-04-19 6:20 ` Cédric Le Goater
2024-04-19 9:49 ` Duan, Zhenzhong
2024-04-25 8:46 ` Duan, Zhenzhong
2024-04-25 12:40 ` Cédric Le Goater
2024-04-26 3:10 ` Duan, Zhenzhong
2024-06-02 12:56 ` Michael S. Tsirkin [this message]
2024-06-03 6:25 ` Duan, Zhenzhong
2024-04-08 8:44 ` [PATCH v2 4/5] intel_iommu: Check for compatibility with legacy device Zhenzhong Duan
2024-04-08 8:44 ` [PATCH v2 5/5] intel_iommu: Check for compatibility with iommufd backed device Zhenzhong Duan