From: "Liu, Yi L" <yi.l.liu@linux.intel.com>
To: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Cc: tianyu.lan@intel.com, kevin.tian@intel.com, "Liu,
Yi L" <yi.l.liu@intel.com>,
ashok.raj@intel.com, kvm@vger.kernel.org, jasowang@redhat.com,
Will Deacon <Will.Deacon@arm.com>,
alex.williamson@redhat.com, peterx@redhat.com,
qemu-devel@nongnu.org, iommu@lists.linux-foundation.org,
jacob.jun.pan@intel.com
Subject: Re: [Qemu-devel] [RFC PATCH 7/8] VFIO: Add new IOCTL for IOMMU TLB invalidate propagation
Date: Mon, 3 Jul 2017 18:31:15 +0800 [thread overview]
Message-ID: <20170703103115.GB22053@gmail.com> (raw)
In-Reply-To: <0e4f2dd4-d553-b1b7-7bec-fe0ff5242c54@arm.com>
Hi Jean,
On Mon, Jul 03, 2017 at 12:52:52PM +0100, Jean-Philippe Brucker wrote:
> Hi Yi,
>
> On 02/07/17 11:06, Liu, Yi L wrote:
> > On Fri, May 12, 2017 at 01:11:02PM +0100, Jean-Philippe Brucker wrote:
> >
> > Hi Jean,
> >
> > As we've got a few discussions on it. I'd like to have a conclusion and
> > make it as a reference for future discussion.
> >
> > Currently, we are inclined to have a hybrid format for the iommu tlb
> > invalidation from userspace(vIOMMU or userspace driver).
> >
> > Based on the previous discussion, may the below work?
> >
> > 1. Add a IOCTL for iommu tlb invalidation.
> >
> > VFIO_IOMMU_TLB_INVALIDATE
> >
> > struct vfio_iommu_tlb_invalidate {
> > __u32 argsz;
> > __u32 length;
>
> Wouldn't argsz be exactly length + 8? Might be redundant in this case.
yes, it is. we may not use it in future version. but yes, if we still use it.
I think we can make it easier.
> > __u8 data[];
> > };
> >
> > comments from Alex William: is it more suitable to add a new flag bit on
> > vfio_device_svm(a structure defined in patch 5 of this patchset), the data
> > structure is so similar.
> >
> > Personally, I'm ok with it. Pls let me know your thoughts. However, the
> > precondition is we accept the whole definition in this email. If not, the
> > vfio_iommu_tlb_invalidate would be defined differently.
>
> With this proposal sharing the structure makes sense. As I understand it
> we're keeping the VFIO_IOMMU_TLB_INVALIDATE ioctl? In which case adding a
> flag bit would be redundant.
yes, it seems to be strange if we share vfio_device_svm structure but use
a separate IOCTL cmd. Maybe it's more reasonable to share IOCTL cmd and just
add a new flag. Then all the svm related operations share the IOCTL. However,
need to check if there would be any non-svm related iommu tlb invalidation.
Then vfio_device_svm should be renamed to be non-svm specific.
>
> > 2. Define a structure in include/uapi/linux/iommu.h(newly added header file)
> >
> > struct iommu_tlb_invalidate {
> > __u32 scope;
> > /* pasid-selective invalidation described by @pasid */
> > #define IOMMU_INVALIDATE_PASID (1 << 0)
> > /* address-selevtive invalidation described by (@vaddr, @size) */
> > #define IOMMU_INVALIDATE_VADDR (1 << 1)
> > __u32 flags;
> > /* targets non-pasid mappings, @pasid is not valid */
> > #define IOMMU_INVALIDATE_NO_PASID (1 << 0)
>
> Although it was my proposal, I don't like this flag. In ARM SMMU, we're
> using a special mode where PASID 0 is reserved and any traffic without
> PASID uses entry 0 of the PASID table. So I proposed the "NO_PASID" flag
> to invalidate that special context explicitly. But this means that
> invalidation packet targeted at that context will have "scope = PASID" and
> "flags = NO_PASID", which is utterly confusing.
>
> I now think that we should get rid of the IOMMU_INVALIDATE_NO_PASID flag
> and just use PASID 0 to invalidate this context on ARM. I don't think
> other architectures would use the NO_PASID flag anyway, but might be mistaken.
I may suggest to keep it so far. On VT-d, we may pass some data in opaque, so
we may work without it. But if other vendor want to issue non-PASID tagged
cache, then may encounter problem.
> > /* indicating that the pIOMMU doesn't need to invalidate
> > all intermediate tables cached as part of the PTE for
> > vaddr, only the last-level entry (pte). This is a hint. */
> > #define IOMMU_INVALIDATE_VADDR_LEAF (1 << 1)
> > __u32 pasid;
> > __u64 vaddr;
> > __u64 size;
> > __u8 data[];
> > };
> >
> > For this part, the scope and flags are basically aligned with your previous
> > email. I renamed the prefix to be "IOMMU_". In my opinion, the scope and flags
> > would be filled by vIOMMU emulator and be parsed by underlying iommu driver,
> > it is much more suitable to be defined in a uapi header file.
>
> I tend to agree, defining a single structure in a new IOMMU UAPI file is
> better than having identical structures both in uapi/linux/vfio.h and
> linux/iommu.h. This way we avoid VFIO having to copy the same structure
> field by field. Arch-specific structures that go in
> iommu_tlb_invalidate.data also ought to be defined in uapi/linux/iommu.h
yes, it is.
> > Besides the reason above, I don't want VFIO engae too much on the data parsing.
> > If we move the scope,flags,pasid,vaddr,size fields to vfio_iommu_tlb_invalidate,
> > then both kernel space vfio and user space vfio needs to do much parsing. So I
> > may prefer the way above.
>
> Would the entire validation of struct iommu_tlb_invalidate be offloaded to
> the IOMMU subsystem? Checking the structure sizes, copying from user, and
> validating the flags?
no, the copying from user and flag validation is still in VFIO. Basic idea is
still passing the iommu_tlb_invalidate.data to iommu sub-system.
> I guess it's mostly an implementation detail, but it might be better to
> keep this code in VFIO as well, even for the validation of
> iommu_tlb_invalidate.data (which would require VFIO to keep track of the
> model used during bind). This way VFIO sanitizes what comes from
> userspace, whilst the IOMMU subsystem only deals with valid kernel
> structures, and another subsystem could easily reuse the
> iommu_tlb_invalidate API.
it's a good idae. may think about it. VFIO should also be able to parse the
generic part of the iommu_tlb_invalidate.data.
Thanks,
Yi L
> The IOMMU subsystem would still validate the meaning of the fields, for
> instance whether a given combination of flag is allowed or if the PASID
> exists.
>
> Thanks,
> Jean
>
> > If you've got any other idea, pls feel free to post it. It's welcomed.
> >
> > Thanks,
> > Yi L
> >
> >> Hi Yi,
> >>
> >> On 26/04/17 11:12, Liu, Yi L wrote:
> >>> From: "Liu, Yi L" <yi.l.liu@linux.intel.com>
> >>>
> >>> This patch adds VFIO_IOMMU_TLB_INVALIDATE to propagate IOMMU TLB
> >>> invalidate request from guest to host.
> >>>
> >>> In the case of SVM virtualization on VT-d, host IOMMU driver has
> >>> no knowledge of caching structure updates unless the guest
> >>> invalidation activities are passed down to the host. So a new
> >>> IOCTL is needed to propagate the guest cache invalidation through
> >>> VFIO.
> >>>
> >>> Signed-off-by: Liu, Yi L <yi.l.liu@linux.intel.com>
> >>> ---
> >>> include/uapi/linux/vfio.h | 9 +++++++++
> >>> 1 file changed, 9 insertions(+)
> >>>
> >>> diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
> >>> index 6b97987..50c51f8 100644
> >>> --- a/include/uapi/linux/vfio.h
> >>> +++ b/include/uapi/linux/vfio.h
> >>> @@ -564,6 +564,15 @@ struct vfio_device_svm {
> >>>
> >>> #define VFIO_IOMMU_SVM_BIND_TASK _IO(VFIO_TYPE, VFIO_BASE + 22)
> >>>
> >>> +/* For IOMMU TLB Invalidation Propagation */
> >>> +struct vfio_iommu_tlb_invalidate {
> >>> + __u32 argsz;
> >>> + __u32 length;
> >>> + __u8 data[];
> >>> +};
> >>
> >> We initially discussed something a little more generic than this, with
> >> most info explicitly described and only pIOMMU-specific quirks and hints
> >> in an opaque structure. Out of curiosity, why the change? I'm not against
> >> a fully opaque structure, but there seem to be a large overlap between TLB
> >> invalidations across architectures.
> >>
> >>
> >> For what it's worth, when prototyping the paravirtualized IOMMU I came up
> >> with the following.
> >>
> >> (From the paravirtualized POV, the SMMU also has to swizzle endianess
> >> after unpacking an opaque structure, since userspace doesn't know what's
> >> in it and guest might use a different endianess. So we need to force all
> >> opaque data to be e.g. little-endian.)
> >>
> >> struct vfio_iommu_tlb_invalidate {
> >> __u32 argsz;
> >> __u32 scope;
> >> __u32 flags;
> >> __u32 pasid;
> >> __u64 vaddr;
> >> __u64 size;
> >> __u8 data[];
> >> };
> >>
> >> Scope is a bitfields restricting the invalidation scope. By default
> >> invalidate the whole container (all PASIDs and all VAs). @pasid, @vaddr
> >> and @size are unused.
> >>
> >> Adding VFIO_IOMMU_INVALIDATE_PASID (1 << 0) restricts the invalidation
> >> scope to the pasid described by @pasid.
> >> Adding VFIO_IOMMU_INVALIDATE_VADDR (1 << 1) restricts the invalidation
> >> scope to the address range described by (@vaddr, @size).
> >>
> >> So setting scope = VFIO_IOMMU_INVALIDATE_VADDR would invalidate the VA
> >> range for *all* pasids (as well as no_pasid). Setting scope =
> >> (VFIO_IOMMU_INVALIDATE_VADDR|VFIO_IOMMU_INVALIDATE_PASID) would invalidate
> >> the VA range only for @pasid.
> >>
> >> Flags depend on the selected scope:
> >>
> >> VFIO_IOMMU_INVALIDATE_NO_PASID, indicating that invalidation (either
> >> without scope or with INVALIDATE_VADDR) targets non-pasid mappings
> >> exclusively (some architectures, e.g. SMMU, allow this)>>
> >> VFIO_IOMMU_INVALIDATE_VADDR_LEAF, indicating that the pIOMMU doesn't need
> >> to invalidate all intermediate tables cached as part of the PTW for vaddr,
> >> only the last-level entry (pte). This is a hint.
> >>
> >> I guess what's missing for Intel IOMMU and would go in @data is the
> >> "global" hint (which we don't have in SMMU invalidations). Do you see
> >> anything else, that the pIOMMU cannot deduce from this structure?
> >>
> >> Thanks,
> >> Jean
> >>
> >>
> >>> +#define VFIO_IOMMU_TLB_INVALIDATE _IO(VFIO_TYPE, VFIO_BASE + 23)
> >>> +
> >>> /* -------- Additional API for SPAPR TCE (Server POWERPC) IOMMU -------- */
> >>>
> >>> /*
> >>>
> >>
> >>
>
>
next prev parent reply other threads:[~2017-07-04 10:45 UTC|newest]
Thread overview: 61+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-04-26 10:11 [Qemu-devel] [RFC PATCH 0/8] Shared Virtual Memory virtualization for VT-d Liu, Yi L
2017-04-26 10:11 ` [Qemu-devel] [RFC PATCH 1/8] iommu: Introduce bind_pasid_table API function Liu, Yi L
2017-04-26 16:56 ` Jean-Philippe Brucker
2017-04-26 18:29 ` jacob pan
2017-04-26 18:59 ` Jean-Philippe Brucker
2017-04-27 6:36 ` Liu, Yi L
2017-04-27 10:12 ` Jean-Philippe Brucker
2017-04-28 7:59 ` Liu, Yi L
2017-04-28 9:04 ` Liu, Yi L
2017-04-28 12:51 ` Jean-Philippe Brucker
2017-05-23 7:50 ` Liu, Yi L
2017-05-25 12:33 ` Jean-Philippe Brucker
2017-05-12 21:59 ` Alex Williamson
2017-05-14 10:56 ` Liu, Yi L
2017-04-26 10:11 ` [Qemu-devel] [RFC PATCH 2/8] iommu/vt-d: add bind_pasid_table function Liu, Yi L
2017-05-12 21:59 ` Alex Williamson
2017-05-15 13:14 ` jacob pan
2017-04-26 10:12 ` [Qemu-devel] [RFC PATCH 3/8] iommu: Introduce iommu do invalidate API function Liu, Yi L
2017-05-12 21:59 ` Alex Williamson
2017-05-17 10:23 ` Liu, Yi L
2017-04-26 10:12 ` [Qemu-devel] [RFC PATCH 4/8] iommu/vt-d: Add iommu do invalidate function Liu, Yi L
2017-05-12 21:59 ` Alex Williamson
2017-05-17 10:24 ` Liu, Yi L
2017-04-26 10:12 ` [Qemu-devel] [RFC PATCH 5/8] VFIO: Add new IOTCL for PASID Table bind propagation Liu, Yi L
2017-04-26 16:56 ` Jean-Philippe Brucker
2017-04-27 5:43 ` Liu, Yi L
2017-05-11 10:29 ` Liu, Yi L
2017-05-12 21:58 ` Alex Williamson
2017-05-17 10:27 ` Liu, Yi L
2017-05-18 11:29 ` Jean-Philippe Brucker
2017-04-26 10:12 ` [Qemu-devel] [RFC PATCH 6/8] VFIO: do pasid table binding Liu, Yi L
2017-05-09 7:55 ` Xiao Guangrong
2017-05-11 10:29 ` Liu, Yi L
2017-05-12 21:59 ` Alex Williamson
2017-04-26 10:12 ` [Qemu-devel] [RFC PATCH 7/8] VFIO: Add new IOCTL for IOMMU TLB invalidate propagation Liu, Yi L
2017-05-12 12:11 ` Jean-Philippe Brucker
2017-05-14 10:12 ` Liu, Yi L
2017-05-15 12:14 ` Jean-Philippe Brucker
2017-07-02 10:06 ` Liu, Yi L
2017-07-03 11:52 ` Jean-Philippe Brucker
2017-07-03 10:31 ` Liu, Yi L [this message]
2017-07-05 6:45 ` Tian, Kevin
2017-07-05 12:42 ` Jean-Philippe Brucker
2017-07-05 17:28 ` Alex Williamson
2017-07-05 22:26 ` Tian, Kevin
2017-07-14 8:58 ` Liu, Yi L
2017-07-14 18:15 ` Alex Williamson
2017-07-17 10:58 ` Liu, Yi L
2017-07-17 22:45 ` Alex Williamson
2017-07-18 9:38 ` Jean-Philippe Brucker
2017-07-18 14:29 ` Alex Williamson
2017-07-18 15:03 ` Jean-Philippe Brucker
2017-07-19 10:45 ` Liu, Yi L
2017-07-19 21:50 ` Jacob Pan
2017-07-05 22:31 ` Tian, Kevin
2017-05-12 21:58 ` Alex Williamson
2017-05-14 10:55 ` Liu, Yi L
2017-07-05 5:32 ` Tian, Kevin
2017-04-26 10:12 ` [Qemu-devel] [RFC PATCH 8/8] VFIO: do IOMMU TLB invalidation from guest Liu, Yi L
2017-05-08 4:09 ` [Qemu-devel] [RFC PATCH 0/8] Shared Virtual Memory virtualization for VT-d Xiao Guangrong
2017-05-07 7:33 ` Liu, Yi L
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20170703103115.GB22053@gmail.com \
--to=yi.l.liu@linux.intel.com \
--cc=Will.Deacon@arm.com \
--cc=alex.williamson@redhat.com \
--cc=ashok.raj@intel.com \
--cc=iommu@lists.linux-foundation.org \
--cc=jacob.jun.pan@intel.com \
--cc=jasowang@redhat.com \
--cc=jean-philippe.brucker@arm.com \
--cc=kevin.tian@intel.com \
--cc=kvm@vger.kernel.org \
--cc=peterx@redhat.com \
--cc=qemu-devel@nongnu.org \
--cc=tianyu.lan@intel.com \
--cc=yi.l.liu@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).