From: CLEMENT MATHIEU--DRIF <clement.mathieu--drif@bull.com>
To: "Duan, Zhenzhong" <zhenzhong.duan@intel.com>,
CLEMENT MATHIEU--DRIF <clement.mathieu--drif@bull.com>,
"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>
Cc: "alex@shazbot.org" <alex@shazbot.org>,
"clg@redhat.com" <clg@redhat.com>,
"eric.auger@redhat.com" <eric.auger@redhat.com>,
"mst@redhat.com" <mst@redhat.com>,
"jasowang@redhat.com" <jasowang@redhat.com>,
"jgg@nvidia.com" <jgg@nvidia.com>,
"nicolinc@nvidia.com" <nicolinc@nvidia.com>,
"skolothumtho@nvidia.com" <skolothumtho@nvidia.com>,
"joao.m.martins@oracle.com" <joao.m.martins@oracle.com>,
"Tian, Kevin" <kevin.tian@intel.com>,
"Liu, Yi L" <yi.l.liu@intel.com>,
"Hao, Xudong" <xudong.hao@intel.com>
Subject: Re: [PATCH v1 11/13] intel_iommu_accel: drop _lock suffix in vtd_flush_host_piotlb_all_locked()
Date: Thu, 2 Apr 2026 08:05:19 +0000 [thread overview]
Message-ID: <e2ef69de16fc315fdf0f1badd0119e47aeedc42e.camel@eviden.com> (raw)
In-Reply-To: <IA3PR11MB913614688F3DE7F70A09089A9250A@IA3PR11MB9136.namprd11.prod.outlook.com>
On Wed, 2026-04-01 at 09:23 +0000, Duan, Zhenzhong wrote:
> Hi Clement,
>
>
> > -----Original Message-----
> > From: CLEMENT MATHIEU--DRIF <[clement.mathieu--drif@bull.com](mailto:clement.mathieu--drif@bull.com)>
> > Subject: Re: [PATCH v1 11/13] intel_iommu_accel: drop _lock suffix in
> > vtd_flush_host_piotlb_all_locked()
> >
> >
> > On Tue, 2026-03-31 at 07:18 +0000, Duan, Zhenzhong wrote:
> >
> > >
> > >
> > >
> > >
> > > > -----Original Message-----
> > > > From: CLEMENT MATHIEU--DRIF <[clement.mathieu--
> > >
> >
> > [drif@bull.com](mailto:drif@bull.com)](mailto:[clement.mathieu--drif@bull.com](mailto:clement.mathieu--drif@bull.com))>
> >
> > >
> > > > Subject: Re: [PATCH v1 11/13] intel_iommu_accel: drop _lock suffix in
> > > > vtd_flush_host_piotlb_all_locked()
> > > >
> > > >
> > > > On Fri, 2026-03-20 at 04:04 +0000, Duan, Zhenzhong wrote:
> > > >
> > > >
> > > > > Hi Clement,
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: Duan, Zhenzhong
> > > > > > Subject: RE: [PATCH v1 11/13] intel_iommu_accel: drop _lock suffix in
> > > > > > vtd_flush_host_piotlb_all_locked()
> > > > > >
> > > > > > Hi Clement,
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > > -----Original Message-----
> > > > > > > From: CLEMENT MATHIEU--DRIF <[clement.mathieu--
> > > > > >
> > > > > >
> > > > >
> > > > >
> > > >
> > > >
> > > > [[drif@bull.com](mailto:drif@bull.com)](mailto:[drif@bull.com](mailto:drif@bull.com))](mailto:[clement.mathieu--
> > >
> >
> > [drif@bull.com](mailto:drif@bull.com)](mailto:[clement.mathieu--drif@bull.com](mailto:clement.mathieu--drif@bull.com)))>
> >
> > >
> > > >
> > > >
> > > > >
> > > > >
> > > > > >
> > > > > >
> > > > > > > Subject: Re: [PATCH v1 11/13] intel_iommu_accel: drop _lock suffix in
> > > > > > > vtd_flush_host_piotlb_all_locked()
> > > > > > >
> > > > > > > vtd_piotlb_page_invalidate can be called from
> > > > > > > vtd_pri_perform_implicit_invalidation without any lock. Thus we can call
> > > > > > > vtd_flush_host_piotlb_all_accel unlocked. Is there a risk in case of
> > > > > >
> > > > > >
> > > > >
> > > > >
> > > >
> > > >
> > > > concurrent hot
> > > >
> > > >
> > > > >
> > > > >
> > > > > >
> > > > > >
> > > > > > > plugging of another device?
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > Good catch, hotplug is protected by BQL, but if an iothread could send PRI
> > > > >
> > > > >
> > > >
> > > >
> > > > request,
> > > >
> > > >
> > > > >
> > > > >
> > > > > > we suffer from the race.
> > > > > >
> > > > > > I think we can bypass vtd_pri_perform_implicit_invalidation() for
> > > > >
> > > >
> > >
> >
> > passthrough
> >
> > >
> > > >
> > > > >
> > > > > > device
> > > > > > which doesn't cache iotlb entry in QEMU but in host, like:
> > > > > >
> > > > > > @@ -5401,7 +5401,8 @@ static int vtd_pri_request_page(PCIBus *bus,
> > > > >
> > > >
> > >
> >
> > void
> >
> > >
> > > >
> > > > >
> > > > > > *opaque, int devfn,
> > > > > > return -ENOSPC;
> > > > > > }
> > > > > >
> > > > > > - if (vtd_pri_perform_implicit_invalidation(vtd_as, addr)) {
> > > > > > + if (!vtd_find_hiod_iommufd(vtd_as) &&
> > > > > > + vtd_pri_perform_implicit_invalidation(vtd_as, addr)) {
> > > > > > return -EINVAL;
> > > > > > }
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > After further thinking, I found above change doesn’t resolve the race you
> > > >
> > > >
> > > > mentioned.
> > > >
> > > >
> > > > >
> > > > > But I think the iothread should take BQL before calling ::pri_request_page(),
> > > > > because it is a function changing vtd device context, e.g., PQT and PRS
> > > >
> > >
> >
> > register,
> >
> > >
> > > >
> > > > > page invalidation queue and iotlb entries. We should always take BQL before
> > > >
> > > >
> > > > changing
> > > >
> > > >
> > > > > context of any device. Let me know if you have different opinions.
> > > >
> > > >
> > > >
> > > > Do you mean that every emulated device that triggers a PRI request should
> > >
> >
> > hold
> >
> > >
> > > > the BQL?
> > > > I'm not sure this is a viable pattern as the initiator probably already holds an
> > > > internal lock
> > > > and thus cannot take the BQL without risking deadlocks with concurrent
> > >
> >
> > MMIO :/
> >
> > >
> > >
> > > IIUC, qemu doesn't support concurrent MMIO emulations yet, they are serialized
> >
> > by BQL. If we don't take BQL in call site, then we need it in callee, like:
> >
> > >
> > > @@ -5351,6 +5351,8 @@ static int vtd_pri_request_page(PCIBus *bus, void
> >
> > *opaque, int devfn,
> >
> > > IntelIOMMUState *s = opaque;
> > > VTDAddressSpace *vtd_as;
> > >
> > > + BQL_LOCK_GUARD();
> > > +
> > > vtd_as = vtd_find_add_as(s, bus, devfn, pasid);
> > >
> > > uint64_t queue_addr_reg = vtd_get_quad(s, DMAR_PQA_REG);
> > >
> > >
> > >
> > >
> > > > Otherwise, it means that the device always takes the BQL before taking it's
> > >
> >
> > own
> >
> > >
> > > > lock.
> > >
> > >
> > >
> > > I think it's true because we always take BQL before emulation start.
> >
> >
> > This is pretty dangerous IMHO.
> >
> > The emulated device that ends up calling this function probably has a lock on its
> > side already, let's call it dev_lock.
> >
> > I the device does:
> > - lock(&dev_lock)
> > - start a PRI request
> > - lock(&bql)
>
>
> I have two questions:
>
> 1. which thread may have above sequence?
> 2. do we have to protect PRI request with dev_lock?
>
> PRI state is protected by BQL, I think no need to be guarded by dev_lock. Maybe unlock dev_lock during PRI?
Devices that perform svm operations instead of regular dma end up performing this kind of sequence.
The issue is that device state is guarded by the device lock. Thus, taking the bql in the pri callback
of the iommu requires the device to unlock its own lock before starting the PRI operation or to rely on
the bql all the time instead of its own lock.
I'm just trying to make sure svm devices will not become impossible to implement just because they
have to break their flow and unlock every time they trigger a memory operation.
>
> Thanks
> Zhenzhong
>
>
>
> > and if the host triggers an MMIO operation on the device:
> > - lock(&bql)
> > - lock(&dev_lock)
> >
> > We can take dev_lock and bql in reverse order, leading to potential deadlocks o.O
> >
> >
> > >
> > > Thanks
> > > Zhenzhong
> >
>
next prev parent reply other threads:[~2026-04-02 8:06 UTC|newest]
Thread overview: 52+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-03-06 3:43 [PATCH v1 00/13] intel_iommu: Enable PASID support for passthrough device Zhenzhong Duan
2026-03-06 3:43 ` [PATCH v1 01/13] vfio/iommufd: Extend attach/detach_hwpt callback implementations with pasid Zhenzhong Duan
2026-03-18 11:55 ` Yi Liu
2026-03-19 7:43 ` Duan, Zhenzhong
2026-03-06 3:43 ` [PATCH v1 02/13] iommufd: Extend attach/detach_hwpt callbacks to support pasid Zhenzhong Duan
2026-03-18 12:03 ` Yi Liu
2026-03-18 12:15 ` Yi Liu
2026-03-19 7:47 ` Duan, Zhenzhong
2026-03-06 3:43 ` [PATCH v1 03/13] vfio/iommufd: Create nesting parent hwpt with IOMMU_HWPT_ALLOC_PASID flag Zhenzhong Duan
2026-03-18 12:15 ` Yi Liu
2026-03-19 7:54 ` Duan, Zhenzhong
2026-03-06 3:43 ` [PATCH v1 04/13] intel_iommu: Create the nested " Zhenzhong Duan
2026-03-06 7:27 ` CLEMENT MATHIEU--DRIF
2026-03-18 12:18 ` Yi Liu
2026-03-06 3:43 ` [PATCH v1 05/13] intel_iommu: Change pasid property from bool to uint8 Zhenzhong Duan
2026-03-18 12:20 ` Yi Liu
2026-03-19 8:08 ` Duan, Zhenzhong
2026-03-06 3:44 ` [PATCH v1 06/13] intel_iommu: Export some functions Zhenzhong Duan
2026-03-18 12:21 ` Yi Liu
2026-03-06 3:44 ` [PATCH v1 07/13] intel_iommu: Handle PASID entry addition for pc_inv_dsc request Zhenzhong Duan
2026-03-18 12:42 ` Yi Liu
2026-03-19 8:26 ` Duan, Zhenzhong
2026-03-20 10:13 ` Yi Liu
2026-03-23 5:59 ` Duan, Zhenzhong
2026-03-20 10:08 ` Yi Liu
2026-03-23 5:50 ` Duan, Zhenzhong
2026-03-23 7:38 ` Yi Liu
2026-03-23 8:11 ` Duan, Zhenzhong
2026-03-06 3:44 ` [PATCH v1 08/13] intel_iommu: Handle PASID entry removal " Zhenzhong Duan
2026-03-20 10:08 ` Yi Liu
2026-03-23 6:08 ` Duan, Zhenzhong
2026-03-23 7:40 ` Yi Liu
2026-03-23 8:12 ` Duan, Zhenzhong
2026-03-23 7:43 ` Yi Liu
2026-03-23 8:41 ` Duan, Zhenzhong
2026-03-06 3:44 ` [PATCH v1 09/13] intel_iommu: Handle PASID entry removal for system reset Zhenzhong Duan
2026-03-06 3:44 ` [PATCH v1 10/13] intel_iommu_accel: Support pasid binding/unbinding and PIOTLB flushing Zhenzhong Duan
2026-03-06 3:44 ` [PATCH v1 11/13] intel_iommu_accel: drop _lock suffix in vtd_flush_host_piotlb_all_locked() Zhenzhong Duan
2026-03-19 8:02 ` CLEMENT MATHIEU--DRIF
2026-03-19 9:07 ` Duan, Zhenzhong
2026-03-20 4:04 ` Duan, Zhenzhong
2026-03-31 6:49 ` CLEMENT MATHIEU--DRIF
2026-03-31 7:18 ` Duan, Zhenzhong
2026-04-01 8:18 ` CLEMENT MATHIEU--DRIF
2026-04-01 9:23 ` Duan, Zhenzhong
2026-04-02 8:05 ` CLEMENT MATHIEU--DRIF [this message]
2026-04-02 9:30 ` Duan, Zhenzhong
2026-04-02 18:27 ` CLEMENT MATHIEU--DRIF
2026-03-06 3:44 ` [PATCH v1 12/13] intel_iommu_accel: Add pasid bits size check Zhenzhong Duan
2026-03-06 7:27 ` CLEMENT MATHIEU--DRIF
2026-03-09 2:16 ` Duan, Zhenzhong
2026-03-06 3:44 ` [PATCH v1 13/13] intel_iommu: Expose flag VIOMMU_FLAG_PASID_SUPPORTED when configured Zhenzhong Duan
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=e2ef69de16fc315fdf0f1badd0119e47aeedc42e.camel@eviden.com \
--to=clement.mathieu--drif@bull.com \
--cc=alex@shazbot.org \
--cc=clg@redhat.com \
--cc=eric.auger@redhat.com \
--cc=jasowang@redhat.com \
--cc=jgg@nvidia.com \
--cc=joao.m.martins@oracle.com \
--cc=kevin.tian@intel.com \
--cc=mst@redhat.com \
--cc=nicolinc@nvidia.com \
--cc=qemu-devel@nongnu.org \
--cc=skolothumtho@nvidia.com \
--cc=xudong.hao@intel.com \
--cc=yi.l.liu@intel.com \
--cc=zhenzhong.duan@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.