public inbox for kvm@vger.kernel.org
 help / color / mirror / Atom feed
From: "Lan, Tianyu" <tianyu.lan@intel.com>
To: Alex Williamson <alex.williamson@redhat.com>
Cc: kvm@vger.kernel.org, kevin.tian@intel.com, mst@redhat.com,
	jan.kiszka@siemens.com, jasowang@redhat.com, peterx@redhat.com,
	david@gibson.dropbear.id.au, yi.l.liu@intel.com,
	Jean-Philippe.Brucker@arm.com
Subject: Re: [RFC PATCH 0/3] VFIO: Report IOMMU fault event to userspace
Date: Tue, 28 Feb 2017 23:58:10 +0800	[thread overview]
Message-ID: <5bc604ff-7616-56b3-60dc-213b2d72ec47@intel.com> (raw)
In-Reply-To: <20170220222943.5e48de71@t450s.home>

Hi Alex:
Does following comments make sense to you? In the previous discussion,
the type1 IOMMU driver isn't suitable for dynamic map/umap and we should
extend type1 or introduce type2. For fault event reporting or future
IOMMU related function, we need to figure out they should be in the
vfio-pci, vfio-IOMMU driver or something else. SVM support in VM also
will face such kind of choice. As Jean-Philippe posted SVM support for
ARM, I think most platforms have such requirement. Thanks.



On 2017年02月21日 13:29, Alex Williamson wrote:
> On Tue, 21 Feb 2017 12:49:18 +0800
> Lan Tianyu <tianyu.lan@intel.com> wrote:
>
>> On 2017年02月21日 04:53, Alex Williamson wrote:
>>> On Sun, 19 Feb 2017 22:47:06 +0800
>>> Lan Tianyu <tianyu.lan@intel.com> wrote:
>>>
>>>> This patchset proposes a solution to report hardware IOMMU fault
>>>> event to userspace. Motivation is to make vIOMMU in Qemu get physical
>>>> fault event and info from IOMMU hardware. vIOMMU is in charge of transforming
>>>> fault info and inject to guest. The feature is also very important to
>>>> support first level translation(Translation request with PASID) in VM
>>>> which requires vIOMMU to inject device page request to VM.
>>>
>>> Is the type1 IOMMU backend even a valid target here? vIOMMU support
>>> with type1 is functional, but not really usable except for very
>>> specific cases.  Type1 is not designed for and not well suited to
>>> dynamic DMA mapping in a VM, which means that anything other than
>>> iommu=pt or assigning the device to a VM user or l2 guest is going to
>>> have horrible performance. Faulting in mappings, as seems to be
>>> enabled here, sort of implies dynmaic mapping, right?
>>
>> Peter's patches have enabled vIOMMU IOVA with assigned device and guest
>> may change vIOMMU's IOVA page table dynamically. For example, if we
>> assign NIC and enable DMA protection in VM, NIC driver will dynamically
>> map/unmap DMA memory when receive/send packages and change vIOMMU IOVA
>> page table.
>
> Yes, it's functional, but does it have sufficient performance to bother
> to further extend it for the vIOMMU use case?  The benefit of the
> current level of vfio/viommu integration is that we can a) make use of
> DPDK with assigned devices in a secure mode within the guest, and b)
> use nested device assignment (though the usefulness of this this
> versus simple novelty is is questionable).  Both of these make use of
> relatively static mappings through vfio and therefore should not
> experience much mapping overhead aside from setup and teardown.


The fault event introduced by this patchset only works for vIOMMU's
IOVA(request without PASID) and it targets use cases just like you
mentioned(static mapping). IOVA doesn't support device page request and
so it doesn't introduce more dynamic mapping. These fault events are
triggered by misoperation of IO page table in VM.

>
> As I understand your use case, particularly with PASID, you're trying
> to enable the device to fault in mappings, which implies a dynamic
> mapping use case.  Run a 10GBps NIC (or 40 or 100) in a viommu guest
> _without_ using iommu=pt.  How close do you get to native performance?
> How close can a virtio device get in the same configuration?  What is
> the usefulness in extending type1 to support piommu faults if it
> doesn't have worthwhile performance?

For SVM(called first level translation in VTD spec), it doesn't require
to shadow all first level page table. We just need to shadow gPASID
table pointer and put into extend context entry of pIOMMU. VTD hardware
can read guest first level page table directly because it supports
nested translation for request with PASID which helps to translate GPA
to HPA during reading guest page table. So for SVM, we don't need to go 
through typ1 map/umap and dynamic map/umap will just happen in the guest.

For detail, please see Yi's [RFC Design Doc v2] Enable Shared Virtual
Memory feature
https://lists.gnu.org/archive/html/qemu-devel/2016-11/msg03617.html


>
>>> Our container
>>> based accounting falls apart for type1 backing a vIOMMU as well.  Each
>>> device lives in its own address space and therefore vfio container,
>>> which means that management of vIOMMU assigned device VMs needs to
>>> account for locked memory limits per device.  This quickly opens a
>>> large vulnerability if a VM can lock multiples of the VM memory size.
>>
>> This seems we have already this issue if we use assigned device with
>> vIOMMU regardless of whether we report pIOMMU fault event to vIOMMU or
>> not, right?
>
> Yes, so why continue to push viommu features into an iommu model that
> has these limitations?  Perhaps type1 can be extended to avoid them,
> perhaps a new "type2" iommu backend is a better solution.  Either way I
> don't see how adding iommu fault handling to type1 as it exists today
> is more than just a proof of concept exercise.

As mentioned above, these fault events won't make the situation worse.
I am not sure whether we still need to extend type1 or introduce type2?
If yes, could you give more guides how to do that. Thanks.

>
>>>
>>> So is what you're developing here just a proof of concept using type1,
>>> or is this something that you feel actually has value long term?
>>
>> We want to pass physical IOMMU fault event to vIOMMU and then inject
>> virtual fault event to VM. It's necessary to create the channel between
>> pIOMMU driver and vIOMMU. Currently, IOMMU map/umap requests go through
>> VFIO IOMMU type1 driver and so I supposed VFIO IOMMU type1 was better
>> place. This patchset is to show the idea. Another option is to go though
>> VFIO-PCI driver we have considered. It's very helpful if you can give
>> some guides here :)
>
> I understand, but is this any more than a functional proof of concept?
> What are your performance requirements?  Does type1 meet them?  Have
> you tested?  My expectation is that dynamic DMA mapping performance
> through type1 with a viommu configured for dynamic mappings is abysmal
> and it's therefore a poor target on which to base additional features
> such as fault handling.  Do you have data to suggest otherwise?  Going
> through the device rather than the container doesn't solve my assertion
> that type1 is merely functional and does not provide worthwhile
> performance when used in conjunction with dynamic mappings.  Thanks,

Above.

>
> Alex
>
>>>> Introduce two new VFIO cmd for VFIO IOMMU type1 driver
>>>> - VFIO_IOMMU_SET_FAULT_EVENT_FD
>>>>   Receive eventfd from user space and trigger eventfd when get fault event
>>>> from IOMMU.
>>>>
>>>> - VFIO_IOMMU_GET_FAULT_INFO
>>>>   Return IOMMU fault info to userspace.
>>>>
>>>> VFIO IOMMU Type1 driver needs to register IOMMU fault event notifier
>>>> to IOMMU driver for assigned devices when they are attached to VFIO
>>>> container. IOMMU driver will run VFIO fault event callback when hardware
>>>> IOMMU triggers fault events for devices assigned to VFIO container. Then,
>>>> type1 driver signals eventfd provided by userspace and usrspace gets fault
>>>> info via new cmd.
>>>>
>>>> IOMMU interface are still in design stage and so we still can't register
>>>> IOMMU fault event notifier to IOMMU driver. This patchset is prototype
>>>> code to check whether VFIO IOMMU type1 driver is suitable place to deal
>>>> with IOMMU fault event and just passes build test.
>>>>
>>>> Very appreciate for comments.
>>>>
>>>> Lan Tianyu (3):
>>>>   VFIO: Add new cmd to receive eventfd from userspace to notify IOMMU
>>>>     fault event
>>>>   VFIO: Add IOMMU fault notifier callback
>>>>   VFIO: Add new cmd for user space to get IOMMU fault info
>>>>
>>>>  drivers/vfio/vfio_iommu_type1.c | 106 ++++++++++++++++++++++++++++++++++++++++
>>>>  include/linux/iommu.h           |   7 +++
>>>>  include/uapi/linux/vfio.h       |  37 ++++++++++++++
>>>>  3 files changed, 150 insertions(+)
>>>>
>>>
>>
>>
>


-- 
Best regards
Tianyu Lan

  parent reply	other threads:[~2017-02-28 16:02 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-02-19 14:47 [RFC PATCH 0/3] VFIO: Report IOMMU fault event to userspace Lan Tianyu
2017-02-19 14:47 ` [RFC PATCH 1/3] VFIO: Add new cmd to receive eventfd from userspace to notify IOMMU fault event Lan Tianyu
2017-02-20 20:53   ` Alex Williamson
2017-02-21  5:29     ` Lan Tianyu
2017-02-21  5:48   ` Michael S. Tsirkin
2017-02-21  6:05     ` Alex Williamson
2017-02-21  6:11     ` Liu, Yi L
2017-02-19 14:47 ` [RFC PATCH 2/3] VFIO: Add IOMMU fault notifier callback Lan Tianyu
2017-02-20  2:58   ` Liu, Yi L
2017-02-20 20:53   ` Alex Williamson
2017-02-21  6:05     ` Lan Tianyu
2017-02-21  5:55   ` Michael S. Tsirkin
2017-02-21  6:13     ` Lan Tianyu
2017-02-19 14:47 ` [RFC PATCH 3/3] VFIO: Add new cmd for user space to get IOMMU fault info Lan Tianyu
2017-02-20 20:53   ` Alex Williamson
2017-02-20 20:53 ` [RFC PATCH 0/3] VFIO: Report IOMMU fault event to userspace Alex Williamson
2017-02-21  4:49   ` Lan Tianyu
2017-02-21  5:29     ` Alex Williamson
2017-02-21 15:18       ` Lan Tianyu
2017-02-21 15:21         ` Lan, Tianyu
2017-02-28 15:58       ` Lan, Tianyu [this message]
2017-03-15  6:17         ` Liu, Yi L
2017-03-15 19:52           ` Alex Williamson
2017-03-16  1:42             ` Lan Tianyu
2017-03-16  3:32               ` Jason Wang
2017-03-16  5:22                 ` Lan Tianyu
2017-03-21 23:57               ` Liu, Yi L

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5bc604ff-7616-56b3-60dc-213b2d72ec47@intel.com \
    --to=tianyu.lan@intel.com \
    --cc=Jean-Philippe.Brucker@arm.com \
    --cc=alex.williamson@redhat.com \
    --cc=david@gibson.dropbear.id.au \
    --cc=jan.kiszka@siemens.com \
    --cc=jasowang@redhat.com \
    --cc=kevin.tian@intel.com \
    --cc=kvm@vger.kernel.org \
    --cc=mst@redhat.com \
    --cc=peterx@redhat.com \
    --cc=yi.l.liu@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox