From: Nicolin Chen <nicolinc@nvidia.com>
To: Robin Murphy <robin.murphy@arm.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>, <kevin.tian@intel.com>,
<yi.l.liu@intel.com>, <eric.auger@redhat.com>,
<baolu.lu@linux.intel.com>,
<shameerali.kolothum.thodi@huawei.com>,
<jean-philippe@linaro.org>, <iommu@lists.linux.dev>
Subject: Re: Cache Invalidation Solution for Nested IOMMU
Date: Mon, 3 Apr 2023 17:02:09 -0700 [thread overview]
Message-ID: <ZCtpAYn76u499p6G@Asurada-Nvidia> (raw)
In-Reply-To: <ea6fba01-b483-94e6-8c84-d096dcb64b5a@arm.com>
Hi Robin,
On Mon, Apr 03, 2023 at 08:15:03PM +0100, Robin Murphy wrote:
>
> On 2023-04-03 15:51, Nicolin Chen wrote:
> > On Mon, Apr 03, 2023 at 11:08:23AM -0300, Jason Gunthorpe wrote:
> > > On Sun, Apr 02, 2023 at 05:33:35PM -0700, Nicolin Chen wrote:
> > > > The first version simply forwards the entire command
> > > > individually. This can save a few CPU cycles of packing and
> > > > unpacking the invalidation fields of the commands via a data
> > > > structure, vs. the structure in v1[2].
> > >
> > > The kernel must validate the SID for the ATS invalidations, we can't
> > > just blindly pass it through.
> >
> > Yes. I didn't go further with the first version, but left a
> > comment in the handler: we'd need set/unset_rid_user to
> > validate the SID field of INV_ATC commands, as we discussed.
> >
> > > And this simple path needs an explanation how errors are properly
> > > handled, eg by making execution synchronous, or someone guaranteeing
> > > that errors are impossible.
> >
> > Yes. Both versions here execute in a synchronous fashion. The
> > error code will be returned in the cache_invalidate_user data
> > structure.
> >
> > > > Then I added a new mmap interface to share kernel page(s) from the
> > > > Driver, to allow QEMU to write all TLBI commands as a single batch.
> > > > Then it can initiate the batch invalidation via another synchronous
> > > > hypercall.
> > >
> > > I don't think a mmap is really needed for simple batching, just
> > > passing a larger buffer to ioctl is probably good enough
> >
> > It wouldn't be a must, yet it could avoid a copy_from_user()
> > at each hypercall. And it also eases VCMDQ support a bit.
> >
> > > If a SW side is built it should mirror the HW vCMDQ path, not be
> > > different.
> >
> > The host kernel has the host queue, while the hypervisor fills
> > in a guest TLBI queue. Switching between the two queues on a
> > single SMMU CMDQ (HW) would require a very complicated locking
> > mechanism, vs. simply inserting the batch into the existing host
> > queue. And it probably wouldn't bring a big perf improvement
> > anyway?
> >
> > If the SMMU has an ECMDQ, it could allocate a free CMDQ upon
> > availability, calling arm_smmu_init_one_queue() and mmapping
> > q->base; then it could execute the guest TLBI queue directly,
> > passing that q pointer.
>
> FWIW I don't think that should be visible in a userspace interface. When
> the VMM is just requesting some invalidations in order to emulate some
> commands, it's up to the SMMU driver, or at best between the driver
> and IOMMUFD, to decide exactly how those requests get executed as
> physical commands - that should not make any difference to the requester
> other than how quickly the requests are processed.
>
> AFAICS this interface can't look like the proper hardware vCMDQ path,
> because the whole point of that will be to configure it in advance, map
> the queue controls directly into the guest, and avoid trapping
> invalidations to the VMM at all. This invalidation request interface is
> a large part of precisely what that path is intended to bypass. I don't
> see much benefit in supporting an additional slightly-accelerated slow
> path where the host avoids a tiny bit of housekeeping by maintaining a
> real vCMDQ *on behalf of* the guest and forwarding trapped commands into
> it instead of just processing them normally as host commands. Or,
I tend to agree with most of your points. A SW-emulated VCMDQ
implementation might be overcomplicated, since it would have to
cooperate with both the kernel driver and the QEMU vSMMU code.
And if the SMMU HW (in most cases) has only one CMDQ, it is hard
to switch between host and guest commands for any performance
gain. VCMDQ can simply do that because it has multiple CMDQs by
nature.

What is the normal processing approach that you'd suggest? Do
you agree that having a batch invalidation would be nicer? We
could go with the mmap'd page approach in my draft, or with the
ioctl that Jason pointed out.

My preference is the mmap'd page, so the interface could be
reused later by VCMDQ too. Performance-wise, it should be good
enough, since it does batching, IMHO.
> conversely, emulating a vCMDQ *in* the host kernel, in a way which still
> requires traps to bounce through the VMM and back - that just seems
> objectively worse than keeping all the emulation together in one place
> (however, I would concur that a "vhost-style" emulation, using all the
> same interfaces for configuration and error/irq/etc. reporting as the
> real hardware would, might be viable if performance really demands it).
I am not very familiar with the vhost style. From a glance at
its doc, it seems to be one interface for all hypercalls? That
would be very different from what we're designing here..
Thanks
Nicolin
Thread overview: 35+ messages
2023-04-03 0:33 Cache Invalidation Solution for Nested IOMMU Nicolin Chen
2023-04-03 7:26 ` Liu, Yi L
2023-04-03 8:39 ` Tian, Kevin
2023-04-03 15:24 ` Nicolin Chen
2023-04-04 2:42 ` Tian, Kevin
2023-04-04 3:12 ` Nicolin Chen
2023-04-03 12:23 ` Jason Gunthorpe
2023-04-03 8:00 ` Tian, Kevin
2023-04-03 14:29 ` Nicolin Chen
2023-04-04 2:15 ` Tian, Kevin
2023-04-04 2:47 ` Nicolin Chen
2023-04-03 14:08 ` Jason Gunthorpe
2023-04-03 14:51 ` Nicolin Chen
2023-04-03 19:15 ` Robin Murphy
2023-04-04 0:02 ` Nicolin Chen [this message]
2023-04-04 16:20 ` Jason Gunthorpe
2023-04-04 16:50 ` Shameerali Kolothum Thodi
2023-04-05 11:57 ` Jason Gunthorpe
2023-04-06 6:23 ` Zhangfei Gao
2023-04-06 6:39 ` Nicolin Chen
2023-04-06 11:40 ` Jason Gunthorpe
2023-04-10 1:08 ` Nicolin Chen
2023-04-11 9:07 ` Jean-Philippe Brucker
2023-04-11 11:57 ` Jason Gunthorpe
2023-04-11 18:39 ` Nicolin Chen
2023-04-11 18:41 ` Jason Gunthorpe
2023-04-11 19:02 ` Nicolin Chen
2023-04-11 18:43 ` Nicolin Chen
2023-04-12 2:47 ` Zhangfei Gao
2023-04-12 5:47 ` Nicolin Chen
2023-05-03 15:14 ` Shameerali Kolothum Thodi
2023-05-03 23:44 ` Nicolin Chen
2023-04-05 5:45 ` Nicolin Chen
2023-04-05 11:37 ` Jason Gunthorpe
2023-04-05 15:34 ` Nicolin Chen