From: Linu Cherian <linu.cherian-YGCgFSpz5w/QT0dZR+AlfA@public.gmane.org>
To: Jean-Philippe Brucker
<jean-philippe.brucker-5wv7dgnIgG8@public.gmane.org>
Cc: "virtio-dev-sDuHXQ4OtrM4h7I2RyI4rWD2FQJk+8+b@public.gmane.org"
<virtio-dev-sDuHXQ4OtrM4h7I2RyI4rWD2FQJk+8+b@public.gmane.org>,
"kvm-u79uwXL29TY76Z2rM5mHXA@public.gmane.org"
<kvm-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
"mst-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org"
<mst-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
Marc Zyngier <Marc.Zyngier-5wv7dgnIgG8@public.gmane.org>,
"jasowang-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org"
<jasowang-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
Will Deacon <Will.Deacon-5wv7dgnIgG8@public.gmane.org>,
"Jayachandran.Nair-YGCgFSpz5w/QT0dZR+AlfA@public.gmane.org"
<Jayachandran.Nair-YGCgFSpz5w/QT0dZR+AlfA@public.gmane.org>,
"virtualization-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org"
<virtualization-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org>,
"iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org"
<iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org>,
"sunil.goutham-YGCgFSpz5w/QT0dZR+AlfA@public.gmane.org"
<sunil.goutham-YGCgFSpz5w/QT0dZR+AlfA@public.gmane.org>,
"eric.auger.pro-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org"
<eric.auger.pro-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Subject: Re: [RFC] virtio-iommu version 0.5
Date: Wed, 25 Oct 2017 16:35:48 +0530 [thread overview]
Message-ID: <20171025110548.GA20548@virtx40>
In-Reply-To: <6e5c3a23-9e00-1936-f80c-085faf42c420-5wv7dgnIgG8@public.gmane.org>
Hi Jean,
On Wed Oct 25, 2017 at 10:07:53AM +0100, Jean-Philippe Brucker wrote:
> On 25/10/17 08:07, Linu Cherian wrote:
> > Hi Jean,
> >
> > On Tue Oct 24, 2017 at 10:28:59PM +0530, Linu Cherian wrote:
> >> Hi Jean,
> >> Thanks for your reply.
> >>
> >> On Tue Oct 24, 2017 at 09:37:12AM +0100, Jean-Philippe Brucker wrote:
> >>> Hi Linu,
> >>>
> >>> On 24/10/17 07:27, Linu Cherian wrote:
> >>>> Hi Jean,
> >>>>
> >>>> On Mon Oct 23, 2017 at 10:32:41AM +0100, Jean-Philippe Brucker wrote:
> >>>>> This is version 0.5 of the virtio-iommu specification, the paravirtualized
> >>>>> IOMMU. This version addresses feedback from v0.4 and adds an event virtqueue.
> >>>>> Please find the specification, LaTeX sources and pdf, at:
> >>>>> git://linux-arm.org/virtio-iommu.git viommu/v0.5
> >>>>> http://linux-arm.org/git?p=virtio-iommu.git;a=blob;f=dist/v0.5/virtio-iommu-v0.5.pdf
> >>>>>
> >>>>> A detailed changelog since v0.4 follows. You can find the pdf diff at:
> >>>>> http://linux-arm.org/git?p=virtio-iommu.git;a=blob;f=dist/diffs/virtio-iommu-pdf-diff-v0.4-v0.5.pdf
> >>>>>
> >>>>> * Add an event virtqueue for the device to report translation faults to
> >>>>> the driver. For the moment only unrecoverable faults are available but
> >>>>> future versions will extend it.
> >>>>> * Simplify PROBE request by removing the ack part, and flattening RESV
> >>>>> properties.
> >>>>> * Rename "address space" to "domain". The change might seem futile but
> >>>>> allows us to introduce PASIDs and other features cleanly in the next
> >>>>> versions. In the same vein, the few remaining "device" occurrences were
> >>>>> replaced by "endpoint", to avoid any confusion with "the device"
> >>>>> referring to the virtio device across the document.
> >>>>> * Add implementation notes for RESV_MEM properties.
> >>>>> * Update ACPI table definition.
> >>>>> * Fix typos and clarify a few things.
> >>>>>
> >>>>> I will publish the Linux driver for v0.5 shortly. Then for next versions
> >>>>> I'll focus on optimizations and adding support for hardware acceleration.
> >>>>>
> >>>>> Existing implementations are simple and can certainly be optimized, even
> >>>>> without architectural changes. But the architecture itself can also be
> >>>>> improved in a number of ways. Currently it is designed to work well with
> >>>>> VFIO. However, having explicit MAP requests is less efficient* than page
> >>>>> tables for emulated and PV endpoints, and the current architecture doesn't
> >>>>> address this. Binding page tables is an obvious way to improve throughput
> >>>>> in that case, but we can explore cleverer (and possibly simpler) ways to
> >>>>> do it.
> >>>>>
> >>>>> So first we'll work on getting the base device and driver merged, then
> >>>>> we'll analyze and compare several ideas for improving performance.
> >>>>>
> >>>>> Thanks,
> >>>>> Jean
> >>>>>
> >>>>> * I have yet to study this behaviour, and would be interested in any
> >>>>> prior art on the subject of analyzing devices' DMA patterns (virtio and
> >>>>> others)
> >>>>
> >>>>
> >>>> From the spec,
> >>>> Under future extensions.
> >>>>
> >>>> "Page Table Handover, to allow guests to manage their own page tables and share them with the MMU"
> >>>>
> >>>> Had few questions on this.
> >>>>
> >>>> 1. Did you mean SVM support for vfio-pci devices attached to guest processes here?
> >>>
> >>> Yes, using the VFIO BIND and INVALIDATE ioctls that Intel is working on,
> >>> and adding requests in pretty much the same format to virtio-iommu.
> >>>
> >>>> 2. Can you give some hints on how this is going to work, since the virtio-iommu guest kernel
> >>>> driver needs to create the stage-1 page table as required by hardware, which is not the case now.
> >>>> CMIIW.
> >>>
> >>> The virtio-iommu device advertises which PASID/page table format is
> >>> supported by the host (obtained via sysfs and communicated in the PROBE
> >>> request), then the guest binds page tables or PASID tables to a domain and
> >>> populates it. Binding page tables alone is easy because we already have
> >>> the required drivers in the guest (io-pgtable or arch/* for SVM) and code
> >>> in the host to manage PASID tables. But since the PASID table pointer is
> >>> translated by stage-2, it would require a little more work in the host
> >>> for obtaining GPA buffers from the guest on demand.
> >> Is this for resolving PCI PRI requests?
> >> IIUC, PCI PRI requests for devices owned by the guest need to be resolved
> >> by the guest itself.
>
> Supporting PCI PRI is a separate problem, that will be implemented by
> extending the event queue proposed in v0.5. Once the guest has bound the PASID
> table and created the page tables, it will start some DMA job in the
> device. If a page isn't mapped, the pIOMMU sends a PRI Request (a page
> fault) to its driver, which is relayed to userspace by VFIO, then to the
> guest via virtio-iommu. The guest handles the fault, then sends a PRI
> response on the virtio-iommu request queue, relayed to the pIOMMU driver
> via VFIO and the device retries the access.
>
> >> In addition the BIND
> >>> ioctl is different from the one used by VT-d, so this solution didn't get
> >>> much appreciation.
> >>
> >> Could you please share the links on this ?
>
> Please find the latest discussion at
> https://www.mail-archive.com/iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org/msg20189.html
>
> >>> The alternative is to bind PASID tables.
> >>
> >> Sorry, I didn't get the difference here.
>
> PASID table is what we call Context Table in SMMU, it's the array
> associating a PASID (SSID) to a context descriptor. In the SMMUv3 the
> stream table entry (device descriptor) points to a PASID table. Each
> context descriptor in the PASID table points to a page directory (pgd).
>
> So the first solution was for the guest to send a BIND with pasid+pgd, and
> let the host deal with the context tables. The second solution is to send
> a BIND with a PASID table pointer, and have the guest handle the context
> table.
>
> > Also does this solution intend to cover the page table sharing of non SVM
> > cases. For example, if we need to share the IOMMU page table for
> > a device used in guest kernel, so that map/unmap gets directly handled by the guest
> > and only TLB invalidates happens through a virtio-iommu channel.
>
> Yes, for non-SVM in SMMUv3, you still have a context table but with a
> single descriptor, so the interface stays the same.
So for the non-SVM case,
the guest virtio-iommu driver will program the context descriptor such that
the ASID is not in the shared set (ASET = 1b), and hence physical IOMMU TLB invalidates would be triggered
from software for every viommu_unmap (in the guest kernel) through QEMU (using VFIO ioctls)?
And for the SVM case, the ASID would be in the shared set and explicit TLB invalidates
are not required from software?
> But with the second
> solution, nested with SMMUv2 isn't supported since it doesn't have context
> tables. The second solution was considered simpler to implement, so we'll
> first go with this one.
>
> Thanks,
> Jean
>
> >>> It requires factoring the guest
> >>> PASID handling code into a library, which is difficult for SMMU. Luckily
> >>> I'm still working on adding PASID code for SMMUv3, so extracting it out of
> >>> the driver isn't a big overhead. The good thing about this solution is
> >>> that it reuses any specification work done for VFIO (and vice versa) and
> >>> any host driver change made for vSMMU/VT-d emulations.
> >>>
> >>> Thanks,
> >>> Jean
> >>
> >> --
> >> Linu cherian
> >
--
Linu cherian