From: Avi Kivity <avi@redhat.com>
To: "Han, Weidong" <weidong.han@intel.com>
Cc: Joerg Roedel <joerg.roedel@amd.com>,
kvm@vger.kernel.org, Amit Shah <amit.shah@redhat.com>,
"Kay, Allen M" <allen.m.kay@intel.com>,
"Yang, Sheng" <sheng.yang@intel.com>,
benami@il.ibm.com, muli@il.ibm.com
Subject: Re: [PATCH] [RESEND] VT-d: Support multiple device assignment to one guest
Date: Wed, 08 Oct 2008 21:49:37 +0200
Message-ID: <48ED0ED1.6040308@redhat.com>
In-Reply-To: <0122C7C995D32147B66BF4F440D3016301D2998F@pdsmsx415.ccr.corp.intel.com>
Han, Weidong wrote:
>> If we devolve this to the iommu API, the same io page table can be
>> shared by all iommus, so long as they all use the same page table
>> format.
>>
>
> I don't understand how to handle this by iommu API. Let me explain my
> thoughts more clearly:
>
> VT-d spec says:
> Context-entries programmed with the same domain identifier must
> always reference the same address translation structure (through the ASR
> field). Similarly, context-entries referencing the same address
> translation structure must be programmed with the same domain id.
>
> In native VT-d driver, dmar_domain is per device, and has its own VT-d
> page table, which is dynamically set up before each DMA. So it is
> impossible that the same VT-d page table is shared by all iommus.
> Moreover, different iommus in the system may have different page table
> levels.

Right.  The native driver use case is in essence about preventing
unintended sharing.  It is also likely to have a low page table height,
since DMA sizes are relatively small.

> I think it's enough that iommu API tells us its iommu of a
> device.
>
While this is tangential to our conversation, why? Even for the device
driver use case, this only makes the API more complex. If the API hides
the existence of multiple iommus, it's easier to use and harder to make
a mistake.
> Whereas in KVM side, the same VT-d page table can be shared by the
> devices which are under the same iommu and assigned to the same guest,
> because all of the guest's memory are statically mapped in VT-d page
> table. But it needs to wrap dmar_domain, this patch wraps it with a
> reference count for multiple devices relate to same dmar_domain.
>
> This patch already adds an API (intel_iommu_device_get_iommu()) in
> intel-iommu.c, which returns its iommu of a device.
There is a missed optimization here. Suppose we have two devices each
under a different iommu. With the patch, each will be in a different
dmar_domain and so will have a different page table. The amount of
memory used is doubled.
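
To put a rough number on the doubling, here is a back-of-the-envelope
sketch.  It assumes 4 KiB mappings, 512 eight-byte entries per
page-table page, a contiguous guest address range starting at 0, and a
three-level table.  The helper name pt_bytes() and the guest size used
below are made up for illustration; real VT-d tables may use a different
height or larger pages.

```c
#include <stdint.h>

#define PT_PAGE_SIZE 4096ULL  /* bytes in one page-table page (assumed) */
#define PT_ENTRIES   512ULL   /* 8-byte entries per page-table page */

/*
 * Bytes of page-table pages needed to map `mem` bytes of contiguous
 * address space with 4 KiB pages, for a table with `levels` levels.
 */
static uint64_t pt_bytes(uint64_t mem, int levels)
{
    /* Number of 4 KiB pages that need a leaf entry. */
    uint64_t units = (mem + PT_PAGE_SIZE - 1) / PT_PAGE_SIZE;
    uint64_t tables = 0;

    for (int i = 0; i < levels; i++) {
        /* Tables needed at this level to cover the level below. */
        units = (units + PT_ENTRIES - 1) / PT_ENTRIES;
        tables += units;
    }
    return tables * PT_PAGE_SIZE;
}
```

Under these assumptions, pt_bytes(4ULL << 30, 3) comes to 2053
page-table pages (about 8 MiB) for a hypothetical 4 GiB guest.  Two
devices under different iommus, each with its own dmar_domain, would
then cost about 16 MiB where a single shared table would cost 8 MiB.
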
Suppose the iommu API hides the existence of multiple iommus. You
allocate a translation and add devices to it. When you add a device,
the iommu API checks which iommu is needed and programs it accordingly,
but only one io page table is used.

The other benefit is that iommu developers understand these issues while
kvm developers don't, so it's best managed by the iommu API.  This way,
if things change (as usual, becoming more complicated), the iommu
developers can make the changes in their code and hide the complexity
from kvm or other users.

I'm probably (badly) duplicating Joerg's iommu API here, but this is how
it could go:

  iommu_translation_create() - creates an iommu translation object; this
      allocates the page tables but doesn't do anything with them
  iommu_translation_map() - adds pages to the translation
  iommu_translation_attach() - attaches a device to the translation; this
      locates the iommu and programs it

_detach(), _unmap(), and _free() undo these operations.
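
A toy user-space model of that interface might look like the following.
Only the three function names (plus _detach() and _free()) come from the
list above.  The signatures, the struct layouts, the flat page table,
the ctx_root field standing in for a context entry, and the
iommu_programmed[] counters are all assumptions made for illustration;
this is not kernel code and not Joerg's actual API.

```c
#include <stdint.h>
#include <stdlib.h>

#define MAX_IOMMUS 4                   /* hypothetical limit for this model */

/* How many devices have been programmed through each iommu (model only). */
static int iommu_programmed[MAX_IOMMUS];

struct device {
    int iommu_index;                   /* which iommu the device sits under */
    const uint64_t *ctx_root;          /* stand-in for its context entry */
};

/* One translation: a single io page table shared by all attached devices. */
struct iommu_translation {
    uint64_t *page_table;              /* toy flat table, indexed by io pfn */
    size_t nr_pages;
};

/* Allocate the page table; no iommu is touched yet. */
struct iommu_translation *iommu_translation_create(size_t nr_pages)
{
    struct iommu_translation *t = malloc(sizeof(*t));

    if (!t)
        return NULL;
    t->page_table = calloc(nr_pages, sizeof(*t->page_table));
    if (!t->page_table) {
        free(t);
        return NULL;
    }
    t->nr_pages = nr_pages;
    return t;
}

/* Add one io pfn -> host pfn mapping to the shared table. */
int iommu_translation_map(struct iommu_translation *t, size_t io_pfn,
                          uint64_t host_pfn)
{
    if (io_pfn >= t->nr_pages)
        return -1;
    t->page_table[io_pfn] = host_pfn;
    return 0;
}

/* Locate the device's iommu and point its context entry at the table. */
int iommu_translation_attach(struct iommu_translation *t, struct device *dev)
{
    if (dev->iommu_index < 0 || dev->iommu_index >= MAX_IOMMUS)
        return -1;
    dev->ctx_root = t->page_table;
    iommu_programmed[dev->iommu_index]++;
    return 0;
}

/* Undo attach for a device that is currently attached to `t`. */
void iommu_translation_detach(struct iommu_translation *t, struct device *dev)
{
    if (dev->ctx_root != t->page_table)
        return;                        /* not attached to this translation */
    dev->ctx_root = NULL;
    iommu_programmed[dev->iommu_index]--;
}

/* Release the page table; callers detach all devices first. */
void iommu_translation_free(struct iommu_translation *t)
{
    free(t->page_table);
    free(t);
}
```

In this model a caller creates one translation, maps the guest's pages
once, and attaches each assigned device.  Two devices under different
iommus end up with the same ctx_root, and the caller never sees which
iommu was programmed.
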
--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.