From: Jan Kiszka <jan.kiszka@siemens.com>
To: "Michael S. Tsirkin" <mst@redhat.com>
Cc: valentine.sinitsyn@gmail.com, marcel@redhat.com,
David Kiarie <davidkiarie4@gmail.com>,
qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [V6 0/4] AMD IOMMU
Date: Tue, 1 Mar 2016 15:35:22 +0100
Message-ID: <56D5A8AA.5020409@siemens.com>
In-Reply-To: <20160301162142-mutt-send-email-mst@redhat.com>

On 2016-03-01 15:30, Michael S. Tsirkin wrote:
> On Tue, Mar 01, 2016 at 03:18:05PM +0100, Jan Kiszka wrote:
>> On 2016-03-01 15:12, Jan Kiszka wrote:
>>> On 2016-03-01 14:55, Michael S. Tsirkin wrote:
>>>> On Tue, Mar 01, 2016 at 02:48:19PM +0100, Jan Kiszka wrote:
>>>>> On 2016-03-01 14:07, Michael S. Tsirkin wrote:
>>>>>> On Sun, Feb 21, 2016 at 09:10:56PM +0300, David Kiarie wrote:
>>>>>>> Hello there,
>>>>>>>
>>>>>>> Repost, AMD IOMMU patches version 6.
>>>>>>>
>>>>>>> Changes since version 5
>>>>>>> -Fixed macro formatting issues
>>>>>>> -changed occurrences of IO MMU to IOMMU for consistency
>>>>>>> -Fixed capability registers duplication
>>>>>>> -Rebased to current master
>>>>>>>
>>>>>>> David Kiarie (4):
>>>>>>> hw/i386: Introduce AMD IOMMU
>>>>>>> hw/core: Add AMD IOMMU to machine properties
>>>>>>> hw/i386: ACPI table for AMD IOMMU
>>>>>>> hw/pci-host: Emulate AMD IOMMU
>>>>>>
>>>>>> I went over AMD IOMMU spec.
>>>>>> I'm concerned that it appears that there's no chance for it to
>>>>>> work correctly if host caches invalid PTE entries.
>>>>>>
>>>>>> The spec vaguely discusses write-protecting such PTEs but
>>>>>> that would be very complex if it can be made to work at all.
>>>>>>
>>>>>> This means that this can't work with e.g. VFIO.
>>>>>> It can only work with emulated devices.
>>>>>
>>>>> You mean it can't work if we program a real IOMMU (for VFIO) with
>>>>> translated data from the emulated one but cannot track any updates of
>>>>> the related page tables because the guest is not required to issue
>>>>> traceable flush requests? Hmm, too bad.
>>>>>
>>>>>>
>>>>>> OTOH VTD can easily support PTE shadowing by setting a flag.
>>>>>
>>>>> Do you mean RWBF=1 in the CAP register? Given that "Newer hardware
>>>>> implementations are expected to NOT require explicit software flushing
>>>>> of write buffers and report RWBF=0 in the Capability register", we may
>>>>> eventually run into guests that no longer check that flag if we expose
>>>>> something that looks like a "newer" implementation.
>>>>
>>>> Hopefully not, if that happens we'll have to do a PV IOMMU :)
>>>
>>> Please not.
>>>
>>>>
>>>>> However, this flag is not set right now in our VT-d model.
>>>>>>
>>>>>> I'd like us to find some way to avoid possibility
>>>>>> of user error creating a configuration mixing e.g.
>>>>>> vfio with the amd iommu.
>>>>>>
>>>>>> I'm not sure how to do this.
>>>>>>
>>>>>> Any idea?
>>>>>
>>>>> There is likely no way around write-protecting the IOMMU page tables (in
>>>>> KVM mode) once we evaluated and cached them somewhere.
>>>>
>>>> Well for one, it's possible to use vt-d and not amd iommu.
>>>
>>> That would lead to nice combos of AMD CPUs with VT-d IOMMU. While it may
>>> be possible, I wouldn't rely on guests having tested that combination
>>> very well.
>>
>> To make the concern more concrete: I'm playing with code that will reuse
>> the MMU page tables for the IOMMU - the AMD architecture is designed for
>> that optimization (in contrast to Intel's). So, if the guest is not
>> foreseeing that artificial combo above (ours will definitely not)
>> because it is designed around the reuse, it will at least fail to run.
>>
>> Jan
>
> So if you have an AMD iommu on the host and that is capable
> of 2-level translation, then the flushing problem
> can be fixed by a kind of iommu pass-through
> where you point the host's iommu to guest's page tables.

Yes, right, that could be another approach, provided the tables have
compatible entries. I haven't looked at the details of either so far,
but I wouldn't be overly optimistic. Usually, hardware is not designed
very well for interesting nesting purposes.
>
> So maybe what you need to do is make it possible
> for device to query iommu and ask whether it
> supports devices caching invalid PTEs.
> If not, vfio could fail.

Makes sense.
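
To make the proposed query a bit more concrete, here is a rough sketch in
C. It is only an illustration under stated assumptions: EmulatedIommu,
invalidates_nonpresent_ptes and passthrough_check_iommu() are hypothetical
names made up for this mail and not existing QEMU or VFIO interfaces.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of an emulated IOMMU; not a QEMU type. */
typedef struct EmulatedIommu {
    const char *name;
    /*
     * True if the guest is required to issue an invalidation even after
     * changing a previously invalid (not-present) PTE, so that the host
     * can track every update of the guest's IOMMU page tables.
     */
    bool invalidates_nonpresent_ptes;
} EmulatedIommu;

/*
 * Decide whether a device that caches translations, including invalid
 * ones, may be placed behind the given emulated IOMMU.
 * Returns 0 if the combination is acceptable, -1 if it must be rejected.
 */
int passthrough_check_iommu(const EmulatedIommu *iommu)
{
    if (iommu == NULL) {
        /* No emulated IOMMU in the path: nothing has to be shadowed. */
        return 0;
    }
    assert(iommu->name != NULL);
    return iommu->invalidates_nonpresent_ptes ? 0 : -1;
}
```

A model that cannot guarantee such invalidations (the AMD IOMMU case
discussed above) would report false here, and a VFIO-style device would
then fail to attach instead of silently working on stale mappings.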
Jan
--
Siemens AG, Corporate Technology, CT RDA ITP SES-DE
Corporate Competence Center Embedded Linux