From: Jean-Philippe Brucker <jean-philippe@linaro.org>
To: Baolu Lu <baolu.lu@linux.intel.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>, Kevin Tian <kevin.tian@intel.com>,
Joerg Roedel <joro@8bytes.org>, Will Deacon <will@kernel.org>,
Robin Murphy <robin.murphy@arm.com>,
Nicolin Chen <nicolinc@nvidia.com>, Yi Liu <yi.l.liu@intel.com>,
Jacob Pan <jacob.jun.pan@linux.intel.com>,
iommu@lists.linux.dev, linux-kselftest@vger.kernel.org,
virtualization@lists.linux-foundation.org,
linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCHES 00/17] IOMMUFD: Deliver IO page faults to user space
Date: Mon, 26 Jun 2023 11:51:48 +0200 [thread overview]
Message-ID: <ZJlftJF9ufnBMjHi@larix> (raw)
In-Reply-To: <bc963573-f4d3-1467-daaf-2d85f6befe3f@linux.intel.com>
On Mon, Jun 19, 2023 at 11:35:50AM +0800, Baolu Lu wrote:
> > Another outstanding issue was what to do for PASID stop. When the guest
> > device driver stops using a PASID it issues a PASID stop request to the
> > device (a device-specific mechanism). If the device is not using PRI stop
> > markers it waits for pending PRs to complete and we're fine. Otherwise it
> > sends a stop marker which is flushed to the PRI queue, but does not wait
> > for pending PRs.
> >
> > Handling stop markers is annoying. If the device issues one, then the PRI
> > queue contains stale faults, a stop marker, followed by valid faults for
> > the next address space bound to this PASID. The next address space will
> > get all the spurious faults because the fault handler doesn't know that
> > there is a stop marker coming. Linux is probably alright with spurious
> > faults, though maybe not in all cases, and other guests may not support
> > them at all.
> >
> > We might need to revisit supporting stop markers: request that each device
> > driver declares whether their device uses stop markers on unbind() ("This
> > mechanism must indicate that a Stop Marker Message will be generated."
> > says the spec, but doesn't say if the function always uses one or the
> > other mechanism so it's per-unbind). Then we still have to synchronize
> > unbind() with the fault handler to deal with the pending stop marker,
> > which might have already gone through or be generated later.
>
> I don't quite follow here. Once a PASID is unbound from the device, the
> device driver should be free to release the PASID. The PASID could then
> be used for any other purpose. The device driver has no idea when the
> pending page requests are flushed after unbind(), so it cannot decide
> how long should the PASID be delayed for reuse. Therefore, I understand
> that a successful return from the unbind() function denotes that all
> pending page requests have been flushed and the PASID is viable for
> other use.
Yes that's the contract for unbind() at the moment
>
> >
> > Currently we ignore all that and just flush the PRI queue, followed by the
> > IOPF queue, to get rid of any stale fault before reassigning the PASID. A
> > guest however would also need to first flush the HW PRI queue, but doesn't
> > have a direct way to do that. If we want to support guests that don't deal
> > with stop markers, the host needs to flush the PRI queue when a PASID is
> > detached. I guess on Intel detaching the PASID goes through the host which
> > can flush the host queue. On Arm we'll probably need to flush the queue
> > when receiving a PASID cache invalidation, which the guest issues after
> > clearing a PASID table entry.
>
> The Intel VT-d driver follows below steps to drain pending page requests
> when a PASID is unbound from a device.
>
> - Tear down the device's pasid table entry for the stopped pasid.
> This ensures that ATS/PRI will stop putting more page requests for the
> pasid in VT-d PRQ.
Oh that's interesting, I didn't know about the implicit TLB invalidations
on page requests for VT-d.
For Arm SMMU, clearing the PASID table entry does cause ATS Translation
Requests to return with Completer Abort, but does not affect PRI. The SMMU
pushes page requests directly into the PRI queue without reading any table
(unless the queue overflows).
We're counting on the device driver to perform the PASID stop request
before calling unbind(), described in PCIe 6.20.1 (Managing PASID Usage)
and 10.4.1.2 (Managing PASID Usage on PRG Requests). This ensures that
when unbind() is called, no more page request for the PASID is pushed into
the PRI queue. But some may still be in the queue if the device uses stop
markers.
> - Sync with the PRQ handling thread until all related page requests in
> PRQ have been delivered.
This is what I'm concerned about. For VT-d this happens in the host which
is in charge of modifying the PASID table. For SMMU, the guest writes the
PASID table. It flushes its virtual PRI queue, but not the physical queue
that is managed by the host.
One synchronization point where the host could flush the physical PRI
queue is the PASID config invalidation (CMD_CFGI_CD). As Jason pointed out
the host may not be able to observe those if a command queue is assigned
directly to the guest (a theoretical SMMU extension), though in that case
the guest may also have direct access to a PRI queue (like the AMD vIOMMU
extension) and be able to flush the queue directly.
But we can just wait for PRI implementations and see what the drivers
need. Maybe no device will implement stop markers.
Thanks,
Jean
> - Flush the iopf queue with iopf_queue_flush_dev().
> - Follow the steps defined in VT-d spec section 7.10 to drain all page
> requests and responses between VT-d and the endpoint device.
next prev parent reply other threads:[~2023-06-26 9:52 UTC|newest]
Thread overview: 37+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-05-30 5:37 [RFC PATCHES 00/17] IOMMUFD: Deliver IO page faults to user space Lu Baolu
2023-05-30 5:37 ` [RFC PATCHES 01/17] iommu: Move iommu fault data to linux/iommu.h Lu Baolu
2023-05-30 5:37 ` [RFC PATCHES 02/17] iommu: Support asynchronous I/O page fault response Lu Baolu
2023-05-30 5:37 ` [RFC PATCHES 03/17] iommu: Add helper to set iopf handler for domain Lu Baolu
2023-05-30 5:37 ` [RFC PATCHES 04/17] iommu: Pass device parameter to iopf handler Lu Baolu
2023-05-30 5:37 ` [RFC PATCHES 05/17] iommu: Split IO page fault handling from SVA Lu Baolu
2023-05-30 5:37 ` [RFC PATCHES 06/17] iommu: Add iommu page fault cookie helpers Lu Baolu
2023-05-30 5:37 ` [RFC PATCHES 07/17] iommufd: Add iommu page fault data Lu Baolu
2023-05-30 5:37 ` [RFC PATCHES 08/17] iommufd: IO page fault delivery initialization and release Lu Baolu
2023-05-30 5:37 ` [RFC PATCHES 09/17] iommufd: Add iommufd hwpt iopf handler Lu Baolu
2023-05-30 5:37 ` [RFC PATCHES 10/17] iommufd: Add IOMMU_HWPT_ALLOC_FLAGS_USER_PASID_TABLE for hwpt_alloc Lu Baolu
2023-05-30 5:37 ` [RFC PATCHES 11/17] iommufd: Deliver fault messages to user space Lu Baolu
2023-05-30 5:37 ` [RFC PATCHES 12/17] iommufd: Add io page fault response support Lu Baolu
2023-05-30 5:37 ` [RFC PATCHES 13/17] iommufd: Add a timer for each iommufd fault data Lu Baolu
2023-05-30 5:37 ` [RFC PATCHES 14/17] iommufd: Drain all pending faults when destroying hwpt Lu Baolu
2023-05-30 5:37 ` [RFC PATCHES 15/17] iommufd: Allow new hwpt_alloc flags Lu Baolu
2023-05-30 5:37 ` [RFC PATCHES 16/17] iommufd/selftest: Add IOPF feature for mock devices Lu Baolu
2023-05-30 5:37 ` [RFC PATCHES 17/17] iommufd/selftest: Cover iopf-capable nested hwpt Lu Baolu
2023-05-30 18:50 ` [RFC PATCHES 00/17] IOMMUFD: Deliver IO page faults to user space Nicolin Chen
2023-05-31 2:10 ` Baolu Lu
2023-05-31 4:12 ` Nicolin Chen
2023-06-25 6:30 ` Baolu Lu
2023-06-25 19:21 ` Nicolin Chen
2023-06-26 3:10 ` Baolu Lu
2023-06-26 18:02 ` Nicolin Chen
2023-06-26 18:33 ` Jason Gunthorpe
2023-06-28 2:00 ` Baolu Lu
2023-06-28 12:49 ` Jason Gunthorpe
2023-06-29 1:07 ` Baolu Lu
2023-05-31 0:33 ` Jason Gunthorpe
2023-05-31 3:17 ` Baolu Lu
2023-06-23 6:18 ` Baolu Lu
2023-06-23 13:50 ` Jason Gunthorpe
2023-06-16 11:32 ` Jean-Philippe Brucker
2023-06-19 3:35 ` Baolu Lu
2023-06-26 9:51 ` Jean-Philippe Brucker [this message]
2023-06-19 12:58 ` Jason Gunthorpe
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZJlftJF9ufnBMjHi@larix \
--to=jean-philippe@linaro.org \
--cc=baolu.lu@linux.intel.com \
--cc=iommu@lists.linux.dev \
--cc=jacob.jun.pan@linux.intel.com \
--cc=jgg@ziepe.ca \
--cc=joro@8bytes.org \
--cc=kevin.tian@intel.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-kselftest@vger.kernel.org \
--cc=nicolinc@nvidia.com \
--cc=robin.murphy@arm.com \
--cc=virtualization@lists.linux-foundation.org \
--cc=will@kernel.org \
--cc=yi.l.liu@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox