* [Qemu-devel] [RFC Design Doc v3] Enable Shared Virtual Memory feature in pass-through scenarios
From: Liu, Yi L @ 2016-11-30 8:49 UTC
To: qemu-devel@nongnu.org
Cc: kvm@vger.kernel.org, iommu@lists.linux-foundation.org,
Tian, Kevin, Peng, Chao P, Lan, Tianyu, Raj, Ashok,
Pan, Jacob jun, Sun, Yi Y, 'Alex Williamson',
'Peter Xu', Hao, Xudong
What's changed from v2:
a) Detailed feature description
b) Refined the description in "Address translation in virtual SVM"
c) Added a "Terms" section
Content
===============================================
1. Feature description
2. Why use it?
3. How to enable it
4. How to test
5. Terms
Details
===============================================
1. Feature description
Shared virtual memory (SVM) lets an application program share its virtual
address space with SVM-capable devices.
Shared virtual memory details (a usage sketch follows this list):
a) The SVM feature requires ATS/PRQ/PASID support on both the device side
and the IOMMU side.
b) An SVM-capable device can send DMA requests tagged with a PASID; the
address in such a request is a virtual address within a program's virtual
address space.
c) The IOMMU uses the first level page table to translate the address in
the request.
d) On bare metal, the first level page table provides the HVA->HPA mapping.
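Below is a minimal bare-metal usage sketch. The device node, ioctl number and
command layout are invented placeholders (there is no /dev/svm_dev or
SVM_SUBMIT in any real driver); the only point is that the application hands
the device plain pointers from its own address space:

    /* Sketch only: /dev/svm_dev, SVM_SUBMIT and struct svm_cmd are
     * hypothetical, not a real driver ABI. */
    #include <fcntl.h>
    #include <stdlib.h>
    #include <string.h>
    #include <sys/ioctl.h>

    struct svm_cmd {
        void  *src;    /* plain process VAs, no registration or pinning */
        void  *dst;
        size_t len;
    };
    #define SVM_SUBMIT _IOW('s', 1, struct svm_cmd)

    int main(void)
    {
        size_t len = 1 << 20;
        char *src = malloc(len), *dst = malloc(len);
        memset(src, 0xab, len);

        /* The device DMAs with the process's PASID; the IOMMU translates
         * src/dst through the first level (process) page table. */
        int fd = open("/dev/svm_dev", O_RDWR);
        struct svm_cmd cmd = { .src = src, .dst = dst, .len = len };
        ioctl(fd, SVM_SUBMIT, &cmd);
        return 0;
    }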
The Shared Virtual Memory feature in pass-through scenarios is essentially SVM
virtualization. It lets application programs (running in a guest) share their
virtual address space with an assigned device (e.g. a graphics processor or
accelerator).
In virtualization, SVM:
a) Requires a vIOMMU exposed to the guest.
b) Lets an assigned SVM-capable device send DMA requests tagged with a PASID;
the address in such a request is a virtual address within a guest program's
virtual address space (a GVA).
c) Requires the physical IOMMU to do GVA->GPA->HPA translation. Nested mode
is enabled: the first level page table provides the GVA->GPA mapping, while
the second level page table provides the GPA->HPA translation (a toy
two-stage lookup is sketched below).
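The toy model below flattens each stage into a single-level lookup purely to
make the GVA->GPA->HPA chain concrete; real VT-d first and second level tables
are multi-level paging structures, and the first level walk itself is subject
to second level translation:

    #include <stdio.h>
    #include <stdint.h>

    /* Toy nested translation: one flat "page table" per stage,
     * 16 pages of 4 KiB each (not the VT-d format). */
    #define PAGE_SHIFT 12
    #define OFF_MASK   0xfffULL

    static uint64_t first_level[16];   /* GVA page -> GPA page (guest-owned) */
    static uint64_t second_level[16];  /* GPA page -> HPA page (host-owned)  */

    static uint64_t nested_translate(uint64_t gva)
    {
        uint64_t gpa = (first_level[gva >> PAGE_SHIFT] << PAGE_SHIFT)
                       | (gva & OFF_MASK);
        return (second_level[gpa >> PAGE_SHIFT] << PAGE_SHIFT)
               | (gpa & OFF_MASK);
    }

    int main(void)
    {
        first_level[0x3]  = 0x7;   /* GVA page 3 -> GPA page 7  */
        second_level[0x7] = 0xb;   /* GPA page 7 -> HPA page 11 */
        printf("GVA 0x3a10 -> HPA 0x%llx\n",   /* prints 0xba10 */
               (unsigned long long)nested_translate(0x3a10));
        return 0;
    }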
For more SVM detail, you may want to refer to section 2.5.1.1 of the Intel
VT-d spec and section 5.6 of the OpenCL spec. For details about SVM address
translation, please refer to section 3 of the Intel VT-d spec.
Discussion directly in this thread is also welcome.
Link to related specs:
http://www.intel.com/content/dam/www/public/us/en/documents/product-specifications/vt-directed-io-spec.pdf
https://www.khronos.org/registry/cl/specs/opencl-2.0.pdf
2. Why use it?
It is common to pass devices through to a guest and to expect performance
close to what is achieved on the host. With this feature enabled, application
programs in the guest can share data structures with assigned devices without
unnecessary overhead.
3. How to enable it
As mentioned above, SVM virtualization requires a vIOMMU exposed to the guest.
Since an IOMMU emulator already exists in host user space (QEMU), it is more
acceptable to extend that emulator to support SVM for assigned devices. So
far, the vIOMMU exposed to the guest covers only emulated devices. This design
focuses on virtual SVM for assigned devices; virtual IOVA and virtual
interrupt remapping are not covered here.
The enabling work includes the following items.
a) IOMMU Register Access Emulation
Already exists in QEMU; needs some extensions to support SVM, e.g. support
for the page request service related registers (such as PQA_REG).
b) vIOMMU Capability
Report SVM-related capabilities (PASID, PRS, DT, PT, ECS, etc.) in the
extended capability register, and Caching Mode, DWD and DRD in the capability
register. A sketch of the reporting follows.
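A minimal sketch of how the extra bits could be folded into the vIOMMU
capability values at init time. The macro names and bit positions below are
illustrative placeholders, not existing QEMU definitions; the authoritative
layouts are in the VT-d spec:

    #include <stdint.h>

    /* Placeholder bit positions -- consult the VT-d spec for the real
     * CAP_REG/ECAP_REG layouts before relying on any of these. */
    #define VTD_ECAP_ECS    (1ULL << 24)   /* extended context support */
    #define VTD_ECAP_PASID  (1ULL << 28)   /* PASID support            */
    #define VTD_ECAP_PRS    (1ULL << 29)   /* page request support     */
    #define VTD_CAP_CM      (1ULL << 7)    /* caching mode             */
    #define VTD_CAP_DWD     (1ULL << 54)   /* DMA write draining       */
    #define VTD_CAP_DRD     (1ULL << 55)   /* DMA read draining        */

    static void vtd_report_svm_caps(uint64_t *cap, uint64_t *ecap)
    {
        *ecap |= VTD_ECAP_ECS | VTD_ECAP_PASID | VTD_ECAP_PRS;
        *cap  |= VTD_CAP_CM | VTD_CAP_DWD | VTD_CAP_DRD;
    }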
c) QI Handling Emulation
Already exists in QEMU; the QIs related to assigned devices need to be
shadowed to the physical IOMMU (a shadowing sketch follows this list):
i. extended context entry cache invalidation (nested mode setting, guest
PASID table pointer shadowing)
ii. 1st level translation cache invalidation
iii. responses for recoverable faults
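A sketch of the shadowing loop. vtd_fetch_inv_desc(), vtd_emulate_invalidate(),
desc_targets_assigned_device() and vfio_forward_invalidate() are placeholder
names for interfaces this design would add, not existing QEMU or VFIO APIs:

    #include <stdbool.h>
    #include <stdint.h>

    struct vtd_inv_desc { uint64_t lo, hi; };      /* 128-bit QI descriptor */

    /* Placeholder helpers assumed to exist elsewhere in the emulator. */
    extern bool vtd_fetch_inv_desc(void *viommu, struct vtd_inv_desc *d);
    extern void vtd_emulate_invalidate(void *viommu, struct vtd_inv_desc *d);
    extern bool desc_targets_assigned_device(void *viommu, struct vtd_inv_desc *d);
    extern void vfio_forward_invalidate(void *container, struct vtd_inv_desc *d);

    static void vtd_process_guest_qi(void *viommu, void *vfio_container)
    {
        struct vtd_inv_desc desc;

        while (vtd_fetch_inv_desc(viommu, &desc)) {
            /* Emulate the invalidation against the vIOMMU's own caches. */
            vtd_emulate_invalidate(viommu, &desc);

            /* Extended context and 1st level (PASID) invalidations that hit
             * an assigned device must also reach the physical IOMMU. */
            if (desc_targets_assigned_device(viommu, &desc)) {
                vfio_forward_invalidate(vfio_container, &desc);
            }
        }
    }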
d) Address translation in virtual SVM
In virtualization, for requests with PASID from an assigned device, the
address is translated through the first level page table and then through the
second level page table; this is called nested mode. Extended context mode
must be supported by the hardware. DMA remapping in SVM virtualization works
as follows (a configuration sketch follows this list):
i. For requests with PASID, the related extended context entry must have
the NESTE bit set.
ii. The guest PASID table pointer must be shadowed to the host IOMMU driver.
The PASID table pointer field in the extended context entry holds a GPA,
since nested mode is on.
The first level page table is maintained by the guest IOMMU driver; the
second level page table is maintained by the host IOMMU driver.
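A sketch of what the host IOMMU driver would program for an assigned
SVM-capable device. The structure layout and field names are simplified
placeholders, not the exact VT-d extended context entry format:

    #include <stdbool.h>
    #include <stdint.h>

    struct ext_context_entry {            /* simplified, not the HW layout */
        bool     present;
        bool     neste;                   /* nested translation enable     */
        uint64_t pasid_table_ptr;         /* GPA of guest PASID table (ii) */
        uint64_t slptptr;                 /* HPA of host 2nd level table   */
    };

    static void setup_nested_entry(struct ext_context_entry *e,
                                   uint64_t guest_pasid_table_gpa,
                                   uint64_t host_slpt_hpa)
    {
        e->neste = true;                            /* (i)  enable nesting  */
        e->pasid_table_ptr = guest_pasid_table_gpa; /* (ii) shadowed gPASID */
        e->slptptr = host_slpt_hpa;                 /* GPA->HPA stage       */
        e->present = true;
    }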
e) Recoverable Address Translation Faults Handling Emulation
Recoverable faults are serviced through page requests when the device
supports PRS. For assigned devices, the host IOMMU driver gets page requests
from the pIOMMU. Here, we need a mechanism to drain the page requests from
devices which are assigned to a guest. In this design this is done through
VFIO: page request descriptors are propagated to user space and then exposed
to the guest IOMMU driver. This requires the following support (a
notification sketch follows this list):
i. a mechanism to notify the vIOMMU emulator to fetch PRQ descriptors
ii. a notification framework in QEMU to trigger the PRQ descriptor fetch when
notified by the pIOMMU
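A sketch of the notification path, using a plain eventfd-style read loop to
stand in for whatever VFIO interface is eventually defined.
vfio_device_get_prq_fd() and vtd_fetch_prq_descriptors() are hypothetical
names used only for illustration:

    #include <stdint.h>
    #include <unistd.h>

    /* Hypothetical: a descriptor the host IOMMU driver signals via VFIO when
     * it queues PRQ descriptors for this device, and the vIOMMU-side fetch. */
    extern int  vfio_device_get_prq_fd(int device_fd);
    extern void vtd_fetch_prq_descriptors(void *viommu);

    static void prq_event_loop(int device_fd, void *viommu)
    {
        int prq_fd = vfio_device_get_prq_fd(device_fd);
        uint64_t count;

        /* Each signal means one or more PRQ descriptors are ready; QEMU
         * pulls them and injects them into the vIOMMU's page request queue
         * so the guest IOMMU driver can service the fault and respond. */
        while (read(prq_fd, &count, sizeof(count)) == sizeof(count)) {
            vtd_fetch_prq_descriptors(viommu);
        }
    }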
f) Non-Recoverable Address Translation Handling Emulation
Non-recoverable fault propagation is similar to that of recoverable faults.
In this design, fault data is propagated to user space (QEMU) through VFIO,
and the vIOMMU emulator then emulates the fault: it either fills the data
into the vIOMMU fault recording registers or into the memory-resident fault
log region, depending on the fault reporting type (primary fault logging or
advanced fault logging). A sketch of that choice follows.
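A sketch of the reporting-type dispatch. The structure and helper names are
placeholders, not existing QEMU code:

    #include <stdbool.h>
    #include <stdint.h>

    struct vtd_fault {
        uint16_t sid;      /* source-id of the faulting device */
        uint64_t addr;     /* faulting address                 */
        uint8_t  reason;   /* fault reason code                */
    };

    /* Placeholder helpers assumed to exist in the vIOMMU emulator. */
    extern bool vtd_advanced_fault_logging(void *viommu);
    extern void vtd_fill_fault_record_regs(void *viommu, const struct vtd_fault *f);
    extern void vtd_append_fault_log(void *viommu, const struct vtd_fault *f);

    static void vtd_inject_fault(void *viommu, const struct vtd_fault *f)
    {
        if (vtd_advanced_fault_logging(viommu)) {
            vtd_append_fault_log(viommu, f);        /* advanced fault logging */
        } else {
            vtd_fill_fault_record_regs(viommu, f);  /* primary fault logging  */
        }
        /* In both cases a fault event interrupt would also be raised to the
         * guest; that part is omitted here. */
    }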
g) SVM Virtualization Architecture
**********************************************************************
 Guest                 +--------------------+
                 +---->|   vIOMMU driver    |
                 |     +-------+------+-----+
                 |             |      |
                 |(1)          |(2)   |(3)
                 |             |      |
*****************|*************|******|*******************************
 Host User       |             v      v
 Space         +-+--------------------------------+
               |           Qemu vIOMMU            |
               +-+------+------+------+------+----+
                 ^      |      |      |      |
*****************|******|******|******|******|************************
 Host Kernel     |(1)   |(2)   |(4)   |(5)   |(6)
 Space           |      v      v      v      v
               +-+--------------------------------+
               |               VFIO               |
               +-+------+------+------+------+----+
                 ^      |      |      |      |
                 |      v      v      v      v
               +-+--------------------------------+
               |  IOMMU driver (fault reporting)  |
               +-+------+------+------+------+----+
                 ^      |      |      |      |
*****************|******|******|******|******|************************
 HW              |      v      v      v      v
               +-+--------------------------------+
               |              pIOMMU              |
               +----------------------------------+
**********************************************************************
(1) Fault reporting, including recoverable and non-recoverable faults
(2) PRQ response
(3) Translation cache invalidation (QI)
(4) Set nested mode in the pIOMMU extended context entry
(5) Shadow the guest PASID table pointer to the pIOMMU extended context entry
(6) Cache invalidation for 1st level translation
<if the diagram appears disordered, view it with a fixed-width font, e.g. in a
plain-text mail client>
4. How to test
Testing would be done with devices that have SVM capability. Here, the Intel
i915 GPU is chosen for the verification. Intel provides three tools for SVM
verification:
i) intel-gpu-tools/tests/gem_svm_sanity
ii) intel-gpu-tools/tests/gem_svm_fault
iii) intel-gpu-tools/tests/gem_svm_storedw_loop_render
The following scenarios would have to be covered:
a) Test case 1 - SVM usage in host
i) Requires a physical machine which has at least one SVM capable device.
ii) Run the test tools on the host.
iii) Expect: enabling vSVM support should not affect SVM usage on the host.
b) Test case 2 - SVM usage in guest
i) Requires a physical machine which has at least one SVM capable device.
ii) Create a guest and assign an SVM capable device to it.
iii) Run the test tools in the guest.
iv) Expect: with vSVM enabled and a device assigned, the guest should be able
to use SVM with the assigned device.
c) Test case 3 - SVM usage in multi-guest scenario
i) Requires a physical machine which has at least two SVM capable devices.
ii) Create two guests and assign an SVM capable device to each of them.
iii) Run the test tools in both guests.
iv) Expect: each guest should be able to use SVM with its assigned device
without affecting the other.
d) Test case 4 - SVM usage in host/guest scenario
i) Requires a physical machine which has at least two SVM capable devices.
ii) Create a guest and assign an SVM capable device to the guest.
iii) Run the test tools on both the host and the guest.
iv) Expect: the host and the guest should not affect each other.
5. Terms:
SVM: Shared Virtual Memory
CSR: the IOMMU registers referred to in this document
IOVA: IO Virtual Address
PRQ: Page Request
vIOMMU: Virtual IOMMU emulated by QEMU
FLPT: First Level Page Table
SLPT: Second Level Page Table
QI: Queued Invalidation, a mechanism used to invalidate caches in VT-d
PASID: Process Address Space ID
Thanks,
Best Wishes,
Yi Liu
* Re: [Qemu-devel] [RFC Design Doc v3] Enable Shared Virtual Memory feature in pass-through scenarios
From: Konrad Rzeszutek Wilk @ 2017-02-28 22:07 UTC
To: Liu, Yi L
Cc: qemu-devel@nongnu.org, Lan, Tianyu, Tian, Kevin, Peng, Chao P,
kvm@vger.kernel.org, Hao, Xudong, Sun, Yi Y, 'Peter Xu',
iommu@lists.linux-foundation.org, Pan, Jacob jun
On Wed, Nov 30, 2016 at 08:49:24AM +0000, Liu, Yi L wrote:
> What's changed from v2:
> a) Detailed feature description
> b) refine description in "Address translation in virtual SVM"
> b) "Terms" is added
>
> Content
> ===============================================
> 1. Feature description
> 2. Why use it?
> 3. How to enable it
> 4. How to test
> 5. Terms
>
> Details
> ===============================================
> 1. Feature description
> Shared virtual memory(SVM) is to let application program share its virtual
> address with SVM capable devices.
>
> Shared virtual memory details:
> a) SVM feature requires ATS/PRQ/PASID support on both device side and
> IOMMU side.
> b) SVM capable devices could send DMA requests with PASID, the address
> in the request would be a virtual address within a program's virtual address
> space.
> c) IOMMU would use first level page table to translate the address in the
> request.
> d) First level page table is a HVA->HPA mapping on bare metal.
>
> Shared Virtual Memory feature in pass-through scenarios is actually SVM
> virtualization. It is to let application programs(running in guest)share their
> virtual address with assigned device(e.g. graphics processors or accelerators).
I think I am missing something obvious, but the current way that DRM
works is that the kernel sets up its VA addresses for the GPU and it uses
that for its ring. It also sets up a user level mapping for the GPU if the
application (Xorg) really wants it - but most of the time the kernel is
in charge of poking at the ring, and the memory that is shared with
Xorg is normal RAM allocated via alloc_pages (see drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
and drivers/gpu/drm/ttm/ttm_page_alloc.c).
So are we talking about the guest applications having access to the
ring of the GPU?
* Re: [Qemu-devel] [RFC Design Doc v3] Enable Shared Virtual Memory feature in pass-through scenarios
From: Tian, Kevin @ 2017-03-01 6:51 UTC
To: Konrad Rzeszutek Wilk, Liu, Yi L
Cc: qemu-devel@nongnu.org, Lan, Tianyu, Peng, Chao P,
kvm@vger.kernel.org, Hao, Xudong, Sun, Yi Y, 'Peter Xu',
iommu@lists.linux-foundation.org, Pan, Jacob jun
> From: Konrad Rzeszutek Wilk [mailto:konrad.wilk@oracle.com]
> Sent: Wednesday, March 01, 2017 6:07 AM
>
> On Wed, Nov 30, 2016 at 08:49:24AM +0000, Liu, Yi L wrote:
> > What's changed from v2:
> > a) Detailed feature description
> > b) refine description in "Address translation in virtual SVM"
> > b) "Terms" is added
> >
> > Content
> > ===============================================
> > 1. Feature description
> > 2. Why use it?
> > 3. How to enable it
> > 4. How to test
> > 5. Terms
> >
> > Details
> > ===============================================
> > 1. Feature description
> > Shared virtual memory(SVM) is to let application program share its virtual
> > address with SVM capable devices.
> >
> > Shared virtual memory details:
> > a) SVM feature requires ATS/PRQ/PASID support on both device side and
> > IOMMU side.
> > b) SVM capable devices could send DMA requests with PASID, the address
> > in the request would be a virtual address within a program's virtual address
> > space.
> > c) IOMMU would use first level page table to translate the address in the
> > request.
> > d) First level page table is a HVA->HPA mapping on bare metal.
> >
> > Shared Virtual Memory feature in pass-through scenarios is actually SVM
> > virtualization. It is to let application programs(running in guest)share their
> > virtual address with assigned device(e.g. graphics processors or accelerators).
>
> I think I am missing something obvious, but the current way that DRM
> works is that the kernel sets up its VA addresses for the GPU and it uses
> that for its ring. It also setups an user level mapping for the GPU if the
> application (Xorg) really wants it - but most of the time the kernel is
> in charge of poking at the ring, and the memory that is shared with the
> Xorg is normal RAM allocated via alloc_pages (see
> drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
> and drivers/gpu/drm/ttm/ttm_page_alloc.c).
>
> So are talking about the guest applications having access to the
> ring of the GPU?
No. SVM is purely about sharing the CPU address space with the device.
Command submission still goes through the kernel driver, which controls the
rings (with SVM you can then put VAs into those commands). There are other
vendor-specific features that enable direct user space submission, which is
orthogonal to SVM.
Thanks
Kevin
* Re: [Qemu-devel] [RFC Design Doc v3] Enable Shared Virtual Memory feature in pass-through scenarios
From: Konrad Rzeszutek Wilk @ 2017-03-01 21:09 UTC
To: Tian, Kevin
Cc: Liu, Yi L, qemu-devel@nongnu.org, Lan, Tianyu, Peng, Chao P,
kvm@vger.kernel.org, Hao, Xudong, Sun, Yi Y, 'Peter Xu',
iommu@lists.linux-foundation.org, Pan, Jacob jun
.snip..
> > > Shared Virtual Memory feature in pass-through scenarios is actually SVM
> > > virtualization. It is to let application programs(running in guest)share their
> > > virtual address with assigned device(e.g. graphics processors or accelerators).
> >
> > I think I am missing something obvious, but the current way that DRM
> > works is that the kernel sets up its VA addresses for the GPU and it uses
> > that for its ring. It also setups an user level mapping for the GPU if the
> > application (Xorg) really wants it - but most of the time the kernel is
> > in charge of poking at the ring, and the memory that is shared with the
> > Xorg is normal RAM allocated via alloc_pages (see
> > drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
> > and drivers/gpu/drm/ttm/ttm_page_alloc.c).
> >
> > So are talking about the guest applications having access to the
> > ring of the GPU?
>
> No. SVM is purely about sharing CPU address space with device. Command
> submission is still through kernel driver which controls rings (with SVM then
> you can put VA into those commands). There are other vendor specific
> features to enable direct user space submission which is orthogonal to SVM.
Apologies for my ignorance but how is this beneficial? As in
currently you would put in bus addresses on the ring, but now you
can put VA addresses.
The obvious benefit I see is that you omit the DMA ops, which means there is
less of a 'lookup' (VA->bus address) in software - but I would have thought
this would have a negligible performance impact? And now the IOMMU, alongside
the CPU, would do this lookup.
Or are there some other improvements in this?
>
> Thanks
> Kevin
* Re: [Qemu-devel] [RFC Design Doc v3] Enable Shared Virtual Memory feature in pass-through scenarios
From: Raj, Ashok @ 2017-03-01 21:30 UTC
To: Konrad Rzeszutek Wilk
Cc: Tian, Kevin, Lan, Tianyu, Peng, Chao P, kvm@vger.kernel.org,
iommu@lists.linux-foundation.org, Hao, Xudong,
qemu-devel@nongnu.org, Sun, Yi Y, Pan, Jacob jun, Ashok Raj
On Wed, Mar 01, 2017 at 04:09:38PM -0500, Konrad Rzeszutek Wilk wrote:
> .snip..
> >
> > No. SVM is purely about sharing CPU address space with device. Command
> > submission is still through kernel driver which controls rings (with SVM then
> > you can put VA into those commands). There are other vendor specific
> > features to enable direct user space submission which is orthogonal to SVM.
>
> Apologies for my ignorance but how is this beneficial? As in
> currently you would put in bus addresses on the ring, but now you
> can put VA addresses.
>
> The obvious benefit I see is that you omit the DMA ops which means there is
> less of 'lookup' (VA->bus address) in software - but I would have thought this
> would be negligible performance impact? And now the IOMMU alongside with
> the CPU would do this lookup.
>
> Or are there some other improvements in this?
Other benefits include:
- An application can simply pass its pointers to the SVM capable devices,
which means no memory registration overhead to get IO virtual addresses.
- No need to pin memory for DMA, since the devices can handle faults and
can request pages to be paged in on demand.
>
> >
> > Thanks
> > Kevin