From: "Michael S. Tsirkin" <mst@redhat.com>
To: "Duan, Zhenzhong" <zhenzhong.duan@intel.com>
Cc: "Liu, Yi L" <yi.l.liu@intel.com>,
CLEMENT MATHIEU--DRIF <clement.mathieu--drif@eviden.com>,
"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
"alex.williamson@redhat.com" <alex.williamson@redhat.com>,
"clg@redhat.com" <clg@redhat.com>,
"eric.auger@redhat.com" <eric.auger@redhat.com>,
"peterx@redhat.com" <peterx@redhat.com>,
"jasowang@redhat.com" <jasowang@redhat.com>,
"jgg@nvidia.com" <jgg@nvidia.com>,
"nicolinc@nvidia.com" <nicolinc@nvidia.com>,
"joao.m.martins@oracle.com" <joao.m.martins@oracle.com>,
"Tian, Kevin" <kevin.tian@intel.com>,
"Peng, Chao P" <chao.p.peng@intel.com>,
Paolo Bonzini <pbonzini@redhat.com>,
Richard Henderson <richard.henderson@linaro.org>,
Eduardo Habkost <eduardo@habkost.net>,
Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Subject: Re: [PATCH v4 04/17] intel_iommu: Flush stage-2 cache in PASID-selective PASID-based iotlb invalidation
Date: Mon, 4 Nov 2024 07:01:19 -0500 [thread overview]
Message-ID: <20241104070102-mutt-send-email-mst@kernel.org> (raw)
In-Reply-To: <SJ0PR11MB6744765C7F68ABF7449E73F492512@SJ0PR11MB6744.namprd11.prod.outlook.com>
On Mon, Nov 04, 2024 at 11:55:39AM +0000, Duan, Zhenzhong wrote:
>
>
> >-----Original Message-----
> >From: Michael S. Tsirkin <mst@redhat.com>
> >Sent: Monday, November 4, 2024 7:51 PM
> >Subject: Re: [PATCH v4 04/17] intel_iommu: Flush stage-2 cache in PASID-
> >selective PASID-based iotlb invalidation
> >
> >On Mon, Nov 04, 2024 at 11:46:00AM +0000, Duan, Zhenzhong wrote:
> >>
> >>
> >> >-----Original Message-----
> >> >From: Liu, Yi L <yi.l.liu@intel.com>
> >> >Sent: Monday, November 4, 2024 4:45 PM
> >> >Subject: Re: [PATCH v4 04/17] intel_iommu: Flush stage-2 cache in PASID-
> >> >selective PASID-based iotlb invalidation
> >> >
> >> >On 2024/11/4 15:37, CLEMENT MATHIEU--DRIF wrote:
> >> >>
> >> >>
> >> >> On 04/11/2024 03:49, Yi Liu wrote:
> >> >>> Caution: External email. Do not open attachments or click links, unless
> >> >>> this email comes from a known sender and you know the content is safe.
> >> >>>
> >> >>>
> >> >>> On 2024/9/30 17:26, Zhenzhong Duan wrote:
> >> >>>> Per spec 6.5.2.4, PADID-selective PASID-based iotlb invalidation will
> >> >>>> flush stage-2 iotlb entries with matching domain id and pasid.
> >> >>>
> >> >>> Also, call out it's per table Table 21. PASID-based-IOTLB Invalidation of
> >> >>> VT-d spec 4.1.
> >> >>>
> >> >>>> With scalable modern mode introduced, guest could send PASID-selective
> >> >>>> PASID-based iotlb invalidation to flush both stage-1 and stage-2 entries.
> >> >>>>
> >> >>>> By this chance, remove old IOTLB related definitions which were unused.
> >> >>>
> >> >>>
> >> >>>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
> >> >>>> Reviewed-by: Clément Mathieu--Drif<clement.mathieu--drif@eviden.com>
> >> >>>> Acked-by: Jason Wang <jasowang@redhat.com>
> >> >>>> ---
> >> >>>> hw/i386/intel_iommu_internal.h | 14 ++++--
> >> >>>> hw/i386/intel_iommu.c | 88
> >+++++++++++++++++++++++++++++++++-
> >> >>>> 2 files changed, 96 insertions(+), 6 deletions(-)
> >> >>>>
> >> >>>> diff --git a/hw/i386/intel_iommu_internal.h b/hw/i386/
> >> >>>> intel_iommu_internal.h
> >> >>>> index d0f9d4589d..eec8090190 100644
> >> >>>> --- a/hw/i386/intel_iommu_internal.h
> >> >>>> +++ b/hw/i386/intel_iommu_internal.h
> >> >>>> @@ -403,11 +403,6 @@ typedef union VTDInvDesc VTDInvDesc;
> >> >>>> #define VTD_INV_DESC_IOTLB_AM(val) ((val) & 0x3fULL)
> >> >>>> #define VTD_INV_DESC_IOTLB_RSVD_LO 0xffffffff0000f100ULL
> >> >>>> #define VTD_INV_DESC_IOTLB_RSVD_HI 0xf80ULL
> >> >>>> -#define VTD_INV_DESC_IOTLB_PASID_PASID (2ULL << 4)
> >> >>>> -#define VTD_INV_DESC_IOTLB_PASID_PAGE (3ULL << 4)
> >> >>>> -#define VTD_INV_DESC_IOTLB_PASID(val) (((val) >> 32) &
> >> >>>> VTD_PASID_ID_MASK)
> >> >>>> -#define
> >VTD_INV_DESC_IOTLB_PASID_RSVD_LO 0xfff00000000001c0ULL
> >> >>>> -#define VTD_INV_DESC_IOTLB_PASID_RSVD_HI 0xf80ULL
> >> >>>>
> >> >>>> /* Mask for Device IOTLB Invalidate Descriptor */
> >> >>>> #define VTD_INV_DESC_DEVICE_IOTLB_ADDR(val) ((val) &
> >> >>>> 0xfffffffffffff000ULL)
> >> >>>> @@ -433,6 +428,15 @@ typedef union VTDInvDesc VTDInvDesc;
> >> >>>> #define VTD_SPTE_LPAGE_L3_RSVD_MASK(aw) \
> >> >>>> (0x3ffff800ULL | ~(VTD_HAW_MASK(aw) | VTD_SL_IGN_COM))
> >> >>>>
> >> >>>> +/* Masks for PIOTLB Invalidate Descriptor */
> >> >>>> +#define VTD_INV_DESC_PIOTLB_G (3ULL << 4)
> >> >>>> +#define VTD_INV_DESC_PIOTLB_ALL_IN_PASID (2ULL << 4)
> >> >>>> +#define VTD_INV_DESC_PIOTLB_PSI_IN_PASID (3ULL << 4)
> >> >>>> +#define VTD_INV_DESC_PIOTLB_DID(val) (((val) >> 16) &
> >> >>>> VTD_DOMAIN_ID_MASK)
> >> >>>> +#define VTD_INV_DESC_PIOTLB_PASID(val) (((val) >> 32) & 0xfffffULL)
> >> >>>> +#define VTD_INV_DESC_PIOTLB_RSVD_VAL0 0xfff000000000f1c0ULL
> >> >>>> +#define VTD_INV_DESC_PIOTLB_RSVD_VAL1 0xf80ULL
> >> >>>> +
> >> >>>> /* Information about page-selective IOTLB invalidate */
> >> >>>> struct VTDIOTLBPageInvInfo {
> >> >>>> uint16_t domain_id;
> >> >>>> diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
> >> >>>> index 9e6ef0cb99..72c9c91d4f 100644
> >> >>>> --- a/hw/i386/intel_iommu.c
> >> >>>> +++ b/hw/i386/intel_iommu.c
> >> >>>> @@ -2656,6 +2656,86 @@ static bool
> >> >>>> vtd_process_iotlb_desc(IntelIOMMUState *s, VTDInvDesc *inv_desc)
> >> >>>> return true;
> >> >>>> }
> >> >>>>
> >> >>>> +static gboolean vtd_hash_remove_by_pasid(gpointer key, gpointer value,
> >> >>>> + gpointer user_data)
> >> >>>> +{
> >> >>>> + VTDIOTLBEntry *entry = (VTDIOTLBEntry *)value;
> >> >>>> + VTDIOTLBPageInvInfo *info = (VTDIOTLBPageInvInfo *)user_data;
> >> >>>> +
> >> >>>> + return ((entry->domain_id == info->domain_id) &&
> >> >>>> + (entry->pasid == info->pasid));
> >> >>>> +}
> >> >>>> +
> >> >>>> +static void vtd_piotlb_pasid_invalidate(IntelIOMMUState *s,
> >> >>>> + uint16_t domain_id, uint32_t
> >> >>>> pasid)
> >> >>>> +{
> >> >>>> + VTDIOTLBPageInvInfo info;
> >> >>>> + VTDAddressSpace *vtd_as;
> >> >>>> + VTDContextEntry ce;
> >> >>>> +
> >> >>>> + info.domain_id = domain_id;
> >> >>>> + info.pasid = pasid;
> >> >>>> +
> >> >>>> + vtd_iommu_lock(s);
> >> >>>> + g_hash_table_foreach_remove(s->iotlb, vtd_hash_remove_by_pasid,
> >> >>>> + &info);
> >> >>>> + vtd_iommu_unlock(s);
> >> >>>> +
> >> >>>> + QLIST_FOREACH(vtd_as, &s->vtd_as_with_notifiers, next) {
> >> >>>> + if (!vtd_dev_to_context_entry(s, pci_bus_num(vtd_as->bus),
> >> >>>> + vtd_as->devfn, &ce) &&
> >> >>>> + domain_id == vtd_get_domain_id(s, &ce, vtd_as->pasid)) {
> >> >>>> + uint32_t rid2pasid = VTD_CE_GET_RID2PASID(&ce);
> >> >>>> +
> >> >>>> + if ((vtd_as->pasid != PCI_NO_PASID || pasid != rid2pasid) &&
> >> >>>> + vtd_as->pasid != pasid) {
> >> >>>> + continue;
> >> >>>> + }
> >> >>>> +
> >> >>>> + if (!s->scalable_modern) {
> >> >>>> + vtd_address_space_sync(vtd_as);
> >> >>>> + }
> >> >>>> + }
> >> >>>> + }
> >> >>>> +}
> >> >>>> +
> >> >>>> +static bool vtd_process_piotlb_desc(IntelIOMMUState *s,
> >> >>>> + VTDInvDesc *inv_desc)
> >> >>>> +{
> >> >>>> + uint16_t domain_id;
> >> >>>> + uint32_t pasid;
> >> >>>> +
> >> >>>> + if ((inv_desc->val[0] & VTD_INV_DESC_PIOTLB_RSVD_VAL0) ||
> >> >>>> + (inv_desc->val[1] & VTD_INV_DESC_PIOTLB_RSVD_VAL1) ||
> >> >>>> + inv_desc->val[2] || inv_desc->val[3]) {
> >> >>>> + error_report_once("%s: invalid piotlb inv desc val[3]=0x%"PRIx64
> >> >>>> + " val[2]=0x%"PRIx64" val[1]=0x%"PRIx64
> >> >>>> + " val[0]=0x%"PRIx64" (reserved bits unzero)",
> >> >>>> + __func__, inv_desc->val[3], inv_desc->val[2],
> >> >>>> + inv_desc->val[1], inv_desc->val[0]);
> >> >>>> + return false;
> >> >>>> + }
> >> >>>
> >> >>> Need to consider the below behaviour as well.
> >> >>>
> >> >>> "
> >> >>> This
> >> >>> descriptor is a 256-bit descriptor and will result in an invalid descriptor
> >> >>> error if submitted in an IQ that
> >> >>> is setup to provide hardware with 128-bit descriptors (IQA_REG.DW=0)
> >> >>> "
> >> >>>
> >> >>> Also there are descriptions about the old inv desc types (e.g.
> >> >>> iotlb_inv_desc) that can be either 128bits or 256bits.
> >> >>>
> >> >>> "If a 128-bit
> >> >>> version of this descriptor is submitted into an IQ that is setup to provide
> >> >>> hardware with 256-bit
> >> >>> descriptors or vice-versa it will result in an invalid descriptor error.
> >> >>> "
> >> >>>
> >> >>> If DW==1, vIOMMU fetches 32 bytes per desc. In such case, if the guest
> >> >>> submits 128bits desc, then the high 128bits would be non-zero if there is
> >> >>> more than one desc. But if there is only one desc in the queue, then the
> >> >>> high 128bits would be zero as well. While, it may be captured by the
> >> >>> tail register update. Bit4 is reserved when DW==1, and guest would use
> >> >>> bit4 when it only submits one desc.
> >> >>>
> >> >>> If DW==0, vIOMMU fetchs 16bytes per desc. If guest submits 256bits desc,
> >> >>> it would appear to be two descs from vIOMMU p.o.v. The first 128bits
> >> >>> can be identified as valid except for the types that does not requires
> >> >>> 256bits. The higher 128bits would be subjected to the desc sanity check
> >> >>> as well.
> >> >>>
> >> >>> Based on the above, I think you may need to add two more checks. If
> >DW==0,
> >> >>> vIOMMU should fail the inv types that requires 256bits; If DW==1, you
> >> >>> should check the inv_desc->val[2] and inv_desc->val[3]. You've already
> >> >>> done it in this patch.
> >> >>>
> >> >>> Thoughts are welcomed here.
> >> >>
> >> >> Good catch,
> >> >> I think we should write the check in vtd_process_inv_desc
> >> >> rather than updating the handlers.
> >> >>
> >> >> What are your thoughts?
> >> >
> >> >the first check can be done in vtd_process_inv_desc(). The second may
> >> >be better in the handlers as the handlers have the reserved bits check.
> >> >But given that none of the inv types use the high 128bits, so it is also
> >> >acceptable to do it in vtd_process_inv_desc(). Do add proper comment.
> >>
> >> Thanks Yi and Clement's suggestion, I'll send a small series to fix that
> >> for upstream.
> >>
> >> BRs.
> >> Zhenzhong
> >
> >Ok so you will send v5?
>
> No, what Yi pointed out is an upstream issue, I'll send a small series(3 patches)
> to fix that issue for upstream.
>
> Thanks
> Zhenzhong
Also ok. There's still gonnu be v5 because of other comments, right?
next prev parent reply other threads:[~2024-11-04 12:02 UTC|newest]
Thread overview: 67+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-09-30 9:26 [PATCH v4 00/17] intel_iommu: Enable stage-1 translation for emulated device Zhenzhong Duan
2024-09-30 9:26 ` [PATCH v4 01/17] intel_iommu: Use the latest fault reasons defined by spec Zhenzhong Duan
2024-09-30 9:26 ` [PATCH v4 02/17] intel_iommu: Make pasid entry type check accurate Zhenzhong Duan
2024-09-30 9:26 ` [PATCH v4 03/17] intel_iommu: Add a placeholder variable for scalable modern mode Zhenzhong Duan
2024-10-04 5:22 ` CLEMENT MATHIEU--DRIF
2024-11-03 14:21 ` Yi Liu
2024-09-30 9:26 ` [PATCH v4 04/17] intel_iommu: Flush stage-2 cache in PASID-selective PASID-based iotlb invalidation Zhenzhong Duan
2024-11-04 2:49 ` Yi Liu
2024-11-04 7:37 ` CLEMENT MATHIEU--DRIF
2024-11-04 8:45 ` Yi Liu
2024-11-04 11:46 ` Duan, Zhenzhong
2024-11-04 11:50 ` Michael S. Tsirkin
2024-11-04 11:55 ` Duan, Zhenzhong
2024-11-04 12:01 ` Michael S. Tsirkin [this message]
2024-11-04 12:03 ` Duan, Zhenzhong
2024-09-30 9:26 ` [PATCH v4 05/17] intel_iommu: Rename slpte to pte Zhenzhong Duan
2024-09-30 9:26 ` [PATCH v4 06/17] intel_iommu: Implement stage-1 translation Zhenzhong Duan
2024-11-03 14:21 ` Yi Liu
2024-11-04 3:05 ` Duan, Zhenzhong
2024-11-04 7:02 ` Yi Liu
2024-09-30 9:26 ` [PATCH v4 07/17] intel_iommu: Check if the input address is canonical Zhenzhong Duan
2024-11-03 14:22 ` Yi Liu
2024-09-30 9:26 ` [PATCH v4 08/17] intel_iommu: Set accessed and dirty bits during first stage translation Zhenzhong Duan
2024-11-04 2:49 ` Yi Liu
2024-11-08 3:15 ` Jason Wang
2024-09-30 9:26 ` [PATCH v4 09/17] intel_iommu: Flush stage-1 cache in iotlb invalidation Zhenzhong Duan
2024-11-04 2:50 ` Yi Liu
2024-11-04 3:38 ` Duan, Zhenzhong
2024-11-04 7:36 ` Yi Liu
2024-09-30 9:26 ` [PATCH v4 10/17] intel_iommu: Process PASID-based " Zhenzhong Duan
2024-11-04 2:50 ` Yi Liu
2024-11-04 5:40 ` Duan, Zhenzhong
2024-11-04 7:05 ` Yi Liu
2024-09-30 9:26 ` [PATCH v4 11/17] intel_iommu: Add an internal API to find an address space with PASID Zhenzhong Duan
2024-11-04 2:50 ` Yi Liu
2024-11-04 5:47 ` Duan, Zhenzhong
2024-09-30 9:26 ` [PATCH v4 12/17] intel_iommu: Add support for PASID-based device IOTLB invalidation Zhenzhong Duan
2024-11-04 2:51 ` Yi Liu
2024-09-30 9:26 ` [PATCH v4 13/17] intel_iommu: piotlb invalidation should notify unmap Zhenzhong Duan
2024-11-04 3:05 ` Yi Liu
2024-11-04 8:15 ` Duan, Zhenzhong
2024-11-05 6:29 ` Yi Liu
2024-11-05 7:25 ` Duan, Zhenzhong
2024-11-08 4:39 ` Jason Wang
2024-09-30 9:26 ` [PATCH v4 14/17] intel_iommu: Set default aw_bits to 48 in scalable modern mode Zhenzhong Duan
2024-11-04 3:16 ` Yi Liu
2024-11-04 3:19 ` Duan, Zhenzhong
2024-11-04 7:25 ` Yi Liu
2024-11-08 4:41 ` Jason Wang
2024-11-08 5:30 ` Duan, Zhenzhong
2024-11-11 1:24 ` Jason Wang
2024-11-11 2:58 ` Duan, Zhenzhong
2024-11-11 3:03 ` Jason Wang
2024-09-30 9:26 ` [PATCH v4 15/17] intel_iommu: Introduce a property x-fls for " Zhenzhong Duan
2024-11-04 4:25 ` Yi Liu
2024-11-04 6:25 ` Duan, Zhenzhong
2024-11-04 7:23 ` Yi Liu
2024-11-05 3:11 ` Duan, Zhenzhong
2024-11-05 5:56 ` Yi Liu
2024-11-05 6:03 ` Duan, Zhenzhong
2024-11-05 6:26 ` Yi Liu
2024-09-30 9:26 ` [PATCH v4 16/17] intel_iommu: Introduce a property to control FS1GP cap bit setting Zhenzhong Duan
2024-11-04 7:00 ` Yi Liu
2024-11-08 4:45 ` Jason Wang
2024-09-30 9:26 ` [PATCH v4 17/17] tests/qtest: Add intel-iommu test Zhenzhong Duan
2024-09-30 9:52 ` Duan, Zhenzhong
2024-10-25 6:32 ` [PATCH v4 00/17] intel_iommu: Enable stage-1 translation for emulated device Duan, Zhenzhong
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20241104070102-mutt-send-email-mst@kernel.org \
--to=mst@redhat.com \
--cc=alex.williamson@redhat.com \
--cc=chao.p.peng@intel.com \
--cc=clement.mathieu--drif@eviden.com \
--cc=clg@redhat.com \
--cc=eduardo@habkost.net \
--cc=eric.auger@redhat.com \
--cc=jasowang@redhat.com \
--cc=jgg@nvidia.com \
--cc=joao.m.martins@oracle.com \
--cc=kevin.tian@intel.com \
--cc=marcel.apfelbaum@gmail.com \
--cc=nicolinc@nvidia.com \
--cc=pbonzini@redhat.com \
--cc=peterx@redhat.com \
--cc=qemu-devel@nongnu.org \
--cc=richard.henderson@linaro.org \
--cc=yi.l.liu@intel.com \
--cc=zhenzhong.duan@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).