Re: [Qemu-devel] [V6 0/4] AMD IOMMU


From: Jan Kiszka <jan.kiszka@siemens.com>
To: "Michael S. Tsirkin" <mst@redhat.com>
Cc: valentine.sinitsyn@gmail.com, marcel@redhat.com,
	David Kiarie <davidkiarie4@gmail.com>,
	qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [V6 0/4] AMD IOMMU
Date: Tue, 1 Mar 2016 15:35:22 +0100
Message-ID: <56D5A8AA.5020409@siemens.com>
In-Reply-To: <20160301162142-mutt-send-email-mst@redhat.com>

On 2016-03-01 15:30, Michael S. Tsirkin wrote:
> On Tue, Mar 01, 2016 at 03:18:05PM +0100, Jan Kiszka wrote:
>> On 2016-03-01 15:12, Jan Kiszka wrote:
>>> On 2016-03-01 14:55, Michael S. Tsirkin wrote:
>>>> On Tue, Mar 01, 2016 at 02:48:19PM +0100, Jan Kiszka wrote:
>>>>> On 2016-03-01 14:07, Michael S. Tsirkin wrote:
>>>>>> On Sun, Feb 21, 2016 at 09:10:56PM +0300, David Kiarie wrote:
>>>>>>> Hello there,
>>>>>>>
>>>>>>> Repost, AMD IOMMU patches version 6.
>>>>>>>
>>>>>>> Changes since version 5
>>>>>>>  -Fixed macro formatting issues
>>>>>>>  -changed occurrences of IO MMU to IOMMU for consistency
>>>>>>>  -Fixed capability registers duplication
>>>>>>>  -Rebased to current master
>>>>>>>
>>>>>>> David Kiarie (4):
>>>>>>>   hw/i386: Introduce AMD IOMMU
>>>>>>>   hw/core: Add AMD IOMMU to machine properties
>>>>>>>   hw/i386: ACPI table for AMD IOMMU
>>>>>>>   hw/pci-host: Emulate AMD IOMMU
>>>>>>
>>>>>> I went over the AMD IOMMU spec.
>>>>>> I'm concerned that it appears that there's no chance for it to
>>>>>> work correctly if the host caches invalid PTEs.
>>>>>>
>>>>>> The spec vaguely discusses write-protecting such PTEs but
>>>>>> that would be very complex if it can be made to work at all.
>>>>>>
>>>>>> This means that this can't work with e.g. VFIO.
>>>>>> It can only work with emulated devices.
>>>>>
>>>>> You mean it can't work if we program a real IOMMU (for VFIO) with
>>>>> translated data from the emulated one but cannot track any updates of
>>>>> the related page tables because the guest is not required to issue
>>>>> traceable flush requests? Hmm, too bad.
>>>>>
>>>>>>
>>>>>> OTOH VTD can easily support PTE shadowing by setting a flag.
>>>>>
>>>>> Do you mean RWBF=1 in the CAP register? Given that "Newer hardware
>>>>> implementations are expected to NOT require explicit software flushing
>>>>> of write buffers and report RWBF=0 in the Capability register", we may
>>>>> eventually run into guests that no longer check that flag if we expose
>>>>> something that looks like a "newer" implementation.
>>>>
>>>> Hopefully not, if that happens we'll have to do a PV IOMMU :)
>>>
>>> Please not.
>>>
>>>>
>>>>> However, this flag is not set right now in our VT-d model.
>>>>>>
>>>>>> I'd like us to find some way to avoid possibility
>>>>>> of user error creating a configuration mixing e.g.
>>>>>> vfio with the amd iommu.
>>>>>>
>>>>>> I'm not sure how to do this.
>>>>>>
>>>>>> Any idea?
>>>>>
>>>>> There is likely no way around write-protecting the IOMMU page tables (in
>>>>> KVM mode) once we evaluated and cached them somewhere.
>>>>
>>>> Well for one, it's possible to use vt-d and not amd iommu.
>>>
>>> That would lead to nice combos of AMD CPUs with VT-d IOMMU. While it may
>>> be possible, I wouldn't rely on guests having tested that combination
>>> very well.
>>
>> To make the concern more concrete: I'm playing with code that will reuse
>> the MMU page tables for the IOMMU - the AMD architecture is designed for
>> that optimization (in contrast to Intel's). So, if the guest is not
>> foreseeing that artificial combo above (ours will definitely not)
>> because it is designed around the reuse, it will at least fail to run.
>>
>> Jan
> 
> So if you have an AMD iommu on the host and that is capable
> of 2-level translation, then the flushing problem
> can be fixed by a kind of iommu pass-through
> where you point the host's iommu to guest's page tables.

Yes, right, that could be another approach - provided the tables have
compatible entries. I haven't looked at the details of either so far, but I
wouldn't be overly optimistic. Usually, hardware is not very well
designed for interesting nesting purposes.

> 
> So maybe what you need to do is make it possible
> for device to query iommu and ask whether it
> supports devices caching invalid PTEs.
> If not, vfio could fail.

Makes sense.
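
To make the query idea concrete, here is a minimal sketch in C. It is not
QEMU code: EmuIommu, emu_iommu_can_cache_invalid_ptes() and attach_device()
are hypothetical names made up for illustration, and the single boolean is an
assumed stand-in for whatever capability the emulated IOMMU would actually
report. It only shows the shape of the check: a passthrough device refuses to
attach when the emulated IOMMU cannot guarantee that changes to invalid PTEs
are signalled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical model of an emulated IOMMU; not an actual QEMU type. */
typedef struct EmuIommu {
    const char *name;
    /*
     * Assumption: true if the guest is required to issue a traceable
     * flush even after changing a previously invalid PTE, so a shadow
     * copy (including cached invalid entries) can be kept in sync.
     */
    bool notifies_invalid_pte_changes;
} EmuIommu;

typedef enum {
    ATTACH_OK = 0,
    ATTACH_ERR_NO_SHADOWING = -1
} AttachResult;

/* The query a device would make before relying on cached translations. */
static bool emu_iommu_can_cache_invalid_ptes(const EmuIommu *iommu)
{
    assert(iommu != NULL);
    return iommu->notifies_invalid_pte_changes;
}

/*
 * Hypothetical attach path. A passthrough (VFIO-like) device needs the
 * host IOMMU programmed from shadowed guest tables, so it fails early
 * if the emulated IOMMU cannot support that. Emulated devices translate
 * on every access and are always accepted.
 */
static AttachResult attach_device(const EmuIommu *iommu, bool is_passthrough)
{
    if (is_passthrough && !emu_iommu_can_cache_invalid_ptes(iommu)) {
        fprintf(stderr,
                "%s: cannot shadow page tables, refusing passthrough device\n",
                iommu->name);
        return ATTACH_ERR_NO_SHADOWING;
    }
    return ATTACH_OK;
}
```

With such a check, the misconfiguration would be rejected when the device is
attached instead of surfacing later as stale translations.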

Jan

-- 
Siemens AG, Corporate Technology, CT RDA ITP SES-DE
Corporate Competence Center Embedded Linux

