virtualization.lists.linux-foundation.org archive mirror
From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
To: Andy Lutomirski <luto@amacapital.net>
Cc: "linux-s390@vger.kernel.org" <linux-s390@vger.kernel.org>,
	Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	Jan Kiszka <jan.kiszka@siemens.com>,
	xen-devel <Xen-devel@lists.xen.org>,
	Christian Borntraeger <borntraeger@de.ibm.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	"linux390@de.ibm.com" <linux390@de.ibm.com>,
	Linux Virtualization <virtualization@lists.linux-foundation.org>
Subject: Re: [PATCH v4 0/4] virtio: Clean up scatterlists and use the DMA API
Date: Wed, 29 Jul 2015 10:54:10 +1000
Message-ID: <1438131250.7562.191.camel@kernel.crashing.org>
In-Reply-To: <CALCETrWrLypYdF3Yhzj=ku9i=EmBJm4Ux5ouJ33MuXf7tEgJ0w@mail.gmail.com>

On Tue, 2015-07-28 at 17:47 -0700, Andy Lutomirski wrote:

> Yes, virtio flag.  I dislike having a virtio flag at all, but so far
> no one has come up with any better ideas.  If there was a reliable,
> cross-platform mechanism for per-device PCI bus properties, I'd be all
> for using that instead.

There isn't that I know of, so I think it's the best approach we have.

 .../...

> >  - The kernel should just honor what qemu says, ie, whether the qemu
> > device honors or bypasses the iommu.
> 
> Except for vfio, which maybe just needs a special case: vfio checks if
> the device claims to be virtio and doesn't set the flag, in which case
> vfio just refuses to bind the device.

Right. Passing virtio through isn't the highest priority on the
radar, but yes, indeed, vfio should identify such devices and reject them.
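
For illustration only, the vfio check described above could be sketched
as follows. This is a minimal sketch, not actual vfio code: the struct,
the function name and the feature bit are all hypothetical, since no bit
has been assigned in this thread. The only real constant is 0x1af4, the
PCI vendor ID used by virtio PCI devices.

```c
#include <stdbool.h>
#include <stdint.h>

/* PCI vendor ID used by virtio PCI devices. */
#define VIRTIO_PCI_VENDOR_ID 0x1af4

/*
 * HYPOTHETICAL feature bit meaning "this device honors the IOMMU".
 * The bit position is a placeholder chosen for this sketch only.
 */
#define SKETCH_F_RESPECTS_IOMMU (1ULL << 0)

/* Hypothetical, simplified view of a device as seen by the policy check. */
struct sketch_dev {
    uint16_t vendor_id;  /* PCI vendor ID */
    uint64_t features;   /* feature bits offered by the device */
};

/*
 * Policy from the discussion: refuse to bind a device that claims to be
 * virtio but does not advertise that it respects the IOMMU. Any other
 * device is left to the normal rules.
 */
static bool sketch_vfio_may_bind(const struct sketch_dev *dev)
{
    bool is_virtio = dev->vendor_id == VIRTIO_PCI_VENDOR_ID;
    bool respects_iommu = (dev->features & SKETCH_F_RESPECTS_IOMMU) != 0;

    return !is_virtio || respects_iommu;
}
```

With this shape, a virtio device without the flag is rejected, while a
virtio device with the flag and any non-virtio device pass the check.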

> >  - Qemu default behaviour should be set via a machine attribute which
> > can be overridden both globally (the machine one) or per-device.
> >
> >> I think that, in an ideal world, there would be no feature flag and
> >> all virtio devices would always respect the IOMMU.  Unfortunately we
> >> have existing practice in the form of PPC and Q35 iommu=on that
> >> conflict with that.
> >
> > And possibly more, since this is how the qemu virtio devices are written
> > today: they do not use the proper DMA accessors and always bypass,
> > whatever the platform is (so sparc would be in the same boat, for
> > example).
> 
> Except that AFAIK Q35 is the only QEMU platform that supports a
> nontrivial IOMMU in the first place.  Are there pseries hosts that
> have a working IOMMU?  Maybe I've just misunderstood.

You may well be correct. I remember that we actually created the iommu
infrastructure in qemu to a large extent for ppc/pseries, and then it got
extended when q35 came in.

> >> >>   New QEMU
> >> >> always advertises this feature flag.  If iommu=on, QEMU's virtio
> >> >> devices refuse to work unless the driver acknowledges the flag.
> >> >
> >> > This should be configurable.
> >>
> >> Would any non-PPC user ever configure it differently?  I suppose if
> >> you want to support old kernels on new QEMU, you'd flip the switch.
> >
> > Possibly. Have we looked at what ia64, sparc, arm, ... do? At least
> > sparc has iommus as well.
> 
> I think (I hope!) that ia64 is irrelevant, and last I checked ARM
> didn't have a QEMU-emulated IOMMU.  Maybe things have changed.

Not yet...

 .../...
> >
> > On new machine types, we shouldn't change the behaviour of an existing
> > machine type, and we should keep the default to 0 on ppc/pseries because
> > of backward compatibility issues. But that should be the only place that
> > is "ppc specific", i.e., a default value in a machine def structure.
> 
> Fair enough, except I still think we should change the default to be
> "respect IOMMU" on machine types that don't have an IOMMU in the first
> place. 

Ok, but do it in a separate patch because it *is* a behaviour change to
some extent.
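
To make the "machine default, overridable globally or per-device" scheme
discussed above concrete, here is a minimal sketch of how the effective
setting might be resolved. All names are hypothetical and this is not
QEMU code. It assumes the most specific setting wins, which is one
plausible reading of the proposal.

```c
#include <stdbool.h>

/* Hypothetical tri-state for an optional override. */
enum sketch_opt {
    SKETCH_OPT_UNSET = -1,  /* no override given */
    SKETCH_OPT_OFF   = 0,   /* device bypasses the IOMMU */
    SKETCH_OPT_ON    = 1    /* device respects the IOMMU */
};

/*
 * Resolve whether a virtio device should respect the IOMMU.
 * Assumed precedence: per-device override, then global override,
 * then the default stored in the machine definition (which would
 * stay 0 on ppc/pseries for backward compatibility).
 */
static bool sketch_respect_iommu(bool machine_default,
                                 enum sketch_opt global_override,
                                 enum sketch_opt device_override)
{
    if (device_override != SKETCH_OPT_UNSET)
        return device_override == SKETCH_OPT_ON;
    if (global_override != SKETCH_OPT_UNSET)
        return global_override == SKETCH_OPT_ON;
    return machine_default;
}
```

Changing the default for a machine type then only means changing the
value passed as machine_default, which is why it fits in its own patch.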

>  That way Xen works with old machine types, and I don't think
> we lose anything.
> 
> >
> >> That's the setting that will work in all cases on new guest + new
> >> host, and it's the setting that's safest.  vfio will probably always
> >> malfunction if given a device that looks like it's behind an IOMMU but
> >> doesn't respect it.  For people who need the last bit of performance,
> >> they should use bus-level controls where available (they should be
> >> available everywhere except PPC and maybe arm64) and, ideally, someone
> >> would teach PPC how to exclude devices from the IOMMU cleanly if
> >> possible.  If that can't be done, then there can be an option to
> >> bypass the IOMMU the way it's currently done and no one except PPC
> >> would do it.
> >>
> >> PPC really is different from everything except x86 Q35 iommu=on, and
> >> the latter is experimental.  AFAIK in all other cases, the IOMMU is
> >> respected by virtio, but there is no non-1:1 IOMMU.
> >
> > What about sparc? I thought it was pretty similar to PPC in that
> > regard...
> 
> No clue, honestly.  I could be wrong about the set of existing QEMU
> machine types.

Ok.

Cheers,
Ben.
