From: Jason Wang <jasowang@redhat.com>
To: "Liu, Yi L" <yi.l.liu@intel.com>, 'Peter Xu' <peterx@redhat.com>
Cc: "Lan, Tianyu" <tianyu.lan@intel.com>,
"Tian, Kevin" <kevin.tian@intel.com>,
"'mst@redhat.com'" <mst@redhat.com>,
"'jan.kiszka@siemens.com'" <jan.kiszka@siemens.com>,
"'bd.aviv@gmail.com'" <bd.aviv@gmail.com>,
"'qemu-devel@nongnu.org'" <qemu-devel@nongnu.org>,
"'alex.williamson@redhat.com'" <alex.williamson@redhat.com>,
'David Gibson' <david@gibson.dropbear.id.au>
Subject: Re: [Qemu-devel] [PATCH v7 14/17] memory: add MemoryRegionIOMMUOps.replay() callback
Date: Fri, 31 Mar 2017 15:16:44 +0800
Message-ID: <81e79982-4af0-3b46-552c-eea4db05a362@redhat.com>
In-Reply-To: <A2975661238FB949B60364EF0F2C257439036D0F@shsmsx102.ccr.corp.intel.com>
On 2017-03-31 13:34, Liu, Yi L wrote:
>> -----Original Message-----
>> From: Jason Wang [mailto:jasowang@redhat.com]
>> Sent: Thursday, March 30, 2017 7:58 PM
>> To: Liu, Yi L <yi.l.liu@intel.com>; 'Peter Xu' <peterx@redhat.com>
>> Cc: 'alex.williamson@redhat.com' <alex.williamson@redhat.com>; Lan, Tianyu
>> <tianyu.lan@intel.com>; Tian, Kevin <kevin.tian@intel.com>; 'mst@redhat.com'
>> <mst@redhat.com>; 'jan.kiszka@siemens.com' <jan.kiszka@siemens.com>;
>> 'bd.aviv@gmail.com' <bd.aviv@gmail.com>; 'David Gibson'
>> <david@gibson.dropbear.id.au>; 'qemu-devel@nongnu.org' <qemu-devel@nongnu.org>
>> Subject: Re: [Qemu-devel] [PATCH v7 14/17] memory: add
>> MemoryRegionIOMMUOps.replay() callback
>>
>>
>>
>> On 2017-03-30 19:06, Liu, Yi L wrote:
>>>> -----Original Message-----
>>>> From: Liu, Yi L
>>>> Sent: Monday, March 27, 2017 5:22 PM
>>>> To: Peter Xu <peterx@redhat.com>
>>>> Cc: alex.williamson@redhat.com; Lan, Tianyu <tianyu.lan@intel.com>;
>>>> Tian, Kevin <kevin.tian@intel.com>; mst@redhat.com;
>>>> jan.kiszka@siemens.com; jasowang@redhat.com; bd.aviv@gmail.com; David
>>>> Gibson <david@gibson.dropbear.id.au>; qemu-devel@nongnu.org
>>>> Subject: RE: [Qemu-devel] [PATCH v7 14/17] memory: add
>>>> MemoryRegionIOMMUOps.replay() callback
>>>>
>>>>> -----Original Message-----
>>>>> From: Peter Xu [mailto:peterx@redhat.com]
>>>>> Sent: Monday, March 27, 2017 5:12 PM
>>>>> To: Liu, Yi L <yi.l.liu@intel.com>
>>>>> Cc: alex.williamson@redhat.com; Lan, Tianyu <tianyu.lan@intel.com>;
>>>>> Tian, Kevin <kevin.tian@intel.com>; mst@redhat.com;
>>>>> jan.kiszka@siemens.com; jasowang@redhat.com; bd.aviv@gmail.com;
>>>>> David Gibson <david@gibson.dropbear.id.au>; qemu-devel@nongnu.org
>>>>> Subject: Re: [Qemu-devel] [PATCH v7 14/17] memory: add
>>>>> MemoryRegionIOMMUOps.replay() callback
>>>>>
>>>>> On Mon, Mar 27, 2017 at 08:35:05AM +0000, Liu, Yi L wrote:
>>>>>>> -----Original Message-----
>>>>>>> From: Qemu-devel
>>>>>>> [mailto:qemu-devel-bounces+yi.l.liu=intel.com@nongnu.org] On
>>>>>>> Behalf Of Peter Xu
>>>>>>> Sent: Tuesday, February 7, 2017 4:28 PM
>>>>>>> To: qemu-devel@nongnu.org
>>>>>>> Cc: Lan, Tianyu <tianyu.lan@intel.com>; Tian, Kevin
>>>>>>> <kevin.tian@intel.com>; mst@redhat.com; jan.kiszka@siemens.com;
>>>>>>> jasowang@redhat.com; peterx@redhat.com;
>>>>>>> alex.williamson@redhat.com; bd.aviv@gmail.com; David Gibson
>>>>>>> <david@gibson.dropbear.id.au>
>>>>>>> Subject: [Qemu-devel] [PATCH v7 14/17] memory: add
>>>>>>> MemoryRegionIOMMUOps.replay() callback
>>>>>>>
>>>>>>> Originally we have one memory_region_iommu_replay() function,
>>>>>>> which is the default behavior to replay the translations of the
>>>>>>> whole IOMMU region. However, on some platforms like x86, we may
>>>>>>> want our own replay logic for IOMMU regions.
>>>>>>> This patch adds one more hook to IOMMUOps for the callback, and
>>>>>>> it'll override the default if set.
>>>>>>>
>>>>>>> Signed-off-by: Peter Xu <peterx@redhat.com>
>>>>>>> ---
>>>>>>> include/exec/memory.h | 2 ++
>>>>>>> memory.c | 6 ++++++
>>>>>>> 2 files changed, 8 insertions(+)
>>>>>>>
>>>>>>> diff --git a/include/exec/memory.h b/include/exec/memory.h
>>>>>>> index 0767888..30b2a74 100644
>>>>>>> --- a/include/exec/memory.h
>>>>>>> +++ b/include/exec/memory.h
>>>>>>> @@ -191,6 +191,8 @@ struct MemoryRegionIOMMUOps {
>>>>>>>      void (*notify_flag_changed)(MemoryRegion *iommu,
>>>>>>>                                  IOMMUNotifierFlag old_flags,
>>>>>>>                                  IOMMUNotifierFlag new_flags);
>>>>>>> +    /* Set this up to provide customized IOMMU replay function */
>>>>>>> +    void (*replay)(MemoryRegion *iommu, IOMMUNotifier *notifier);
>>>>>>>  };
>>>>>>>
>>>>>>>  typedef struct CoalescedMemoryRange CoalescedMemoryRange;
>>>>>>> diff --git a/memory.c b/memory.c
>>>>>>> index 7a4f2f9..9c253cc 100644
>>>>>>> --- a/memory.c
>>>>>>> +++ b/memory.c
>>>>>>> @@ -1630,6 +1630,12 @@ void memory_region_iommu_replay(MemoryRegion *mr, IOMMUNotifier *n,
>>>>>>>      hwaddr addr, granularity;
>>>>>>>      IOMMUTLBEntry iotlb;
>>>>>>> +    /* If the IOMMU has its own replay callback, override */
>>>>>>> +    if (mr->iommu_ops->replay) {
>>>>>>> +        mr->iommu_ops->replay(mr, n);
>>>>>>> +        return;
>>>>>>> +    }
>>>>>> Hi Alex, Peter,
>>>>>>
>>>>>> Will all the other vendors (e.g. PPC, s390, ARM) add their own
>>>>>> replay callback as well? I guess it depends on whether the original
>>>>>> replay algorithm works well for them. Do you know?
>>>>> I guess so. At least for VT-d we had this callback since the default
>>>>> replay mechanism did not work well on x86 due to its extremely large
>>>>> memory region size. Thanks,
>>>> Thanks, that makes sense.
>>> Peter,
>>>
>>> It just came to mind that there may be a corner case here.
>>>
>>> Intel VT-d actually has a "pt" mode which allows a device to use physical
>>> addresses even when VT-d is enabled. In the kernel, there is an iommu_identity_mapping.
>>> If a device is in this map, it uses "pt" mode, so the IOMMU
>>> driver does not build a second-level page table for it.
>> Yes, but QEMU does not support ECAP_PT now, so the guest will still have a page table in
>> this case.
> That's true. Without ECAP_PT, the IOMMU driver would create a 1:1 mapping, so this solution
> works even if a device is in the identity map.
>
>>> Back to the virtual IOVA implementation: if an assigned device is in
>>> the iommu_identity_mapping (e.g. a VGA controller), it uses GPAs directly for DMA,
>>> so it requires a GPA->HPA mapping in the host. However,
>>> iommu->ops.replay cannot build that mapping when the guest SL page table is empty.
>>>
>>> So I think building an entire guest PA->HPA mapping before the guest
>>> kernel boots would be recommended. Any thoughts?
>> We plan to add PT in 2.10. A rough idea is to disable the IOMMU DMAR region and
>> use another region without iommu_ops. Then
>> vfio_listener_region_add() will just do the correct mappings.
> Good to know. Actually, I also need to expose ECAP_PT for vSVM, and I just came to
> realize that the current replay solution may not work well once I expose ECAP_PT to the guest.
> I also have a rough idea here. With a virtual VT-d present, the current listener in the container
> listens to the address space named after the devfn. How about adding one more listener on the
> memory address space, so that it can build the entire guest PA->HPA mapping?
This is only needed for PT, so I think the current code is already sufficient
for this. See the else branch of the if (memory_region_is_iommu()) check in
vfio_listener_region_add().
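
To make the two branches concrete, here is a minimal, self-contained C
sketch of the dispatch described above. It is a model, not QEMU code: every
Sketch* type and sketch_* function is a hypothetical stand-in, and the real
vfio_listener_region_add() does considerably more (alignment checks,
notifier registration, the actual DMA map call). The assumption it
illustrates is only this: a region with IOMMU ops gets its mappings from
replay/notify, while a plain RAM region gets one static mapping for the
whole section.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for QEMU's MemoryRegion and MemoryRegionIOMMUOps. */
typedef struct SketchRegion SketchRegion;

typedef struct SketchIOMMUOps {
    /* Optional vendor-specific replay hook, modelled on patch 14/17. */
    void (*replay)(SketchRegion *mr, unsigned *nr_mapped);
} SketchIOMMUOps;

struct SketchRegion {
    const SketchIOMMUOps *iommu_ops; /* NULL: region has no IOMMU ops */
    bool is_ram;
    uint64_t gpa_start;
    uint64_t size;
};

typedef enum {
    SKETCH_SKIPPED,        /* neither an IOMMU region nor RAM */
    SKETCH_IOMMU_NOTIFIER, /* mappings follow the guest page table */
    SKETCH_STATIC_MAP      /* whole section mapped once, GPA used as IOVA */
} SketchAction;

bool sketch_region_is_iommu(const SketchRegion *mr)
{
    return mr->iommu_ops != NULL;
}

/* A replay that finds nothing to map, like an empty guest SL page table. */
void sketch_empty_replay(SketchRegion *mr, unsigned *nr_mapped)
{
    (void)mr;
    (void)nr_mapped;
}

/* Models the if/else in the listener; counts mappings in *nr_mapped. */
SketchAction sketch_listener_region_add(SketchRegion *mr, unsigned *nr_mapped)
{
    assert(mr != NULL && nr_mapped != NULL);

    if (sketch_region_is_iommu(mr)) {
        /* A real listener would also register a notifier here. */
        if (mr->iommu_ops->replay) {
            mr->iommu_ops->replay(mr, nr_mapped);
        }
        return SKETCH_IOMMU_NOTIFIER;
    }

    if (!mr->is_ram) {
        return SKETCH_SKIPPED;
    }

    /* The else branch: one static mapping covering the section. */
    *nr_mapped += 1;
    return SKETCH_STATIC_MAP;
}
```

With this model, a PT-mode device whose address space is switched to a
region without iommu_ops takes the static-map path, so no replay is needed
to build its GPA->HPA mapping.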
Thanks
> Also,
> the vfio notifier is registered when changes happen in the device address space. However, I
> didn’t check whether all the layout changes in the memory address space happen before the first
> dynamic map/unmap request from the guest. If not, this solution is not practical.
>
> Thanks,
> Yi L