From: "Michael S. Tsirkin" <mst@redhat.com>
To: David Woodhouse <dwmw2@infradead.org>
Cc: Wei Liu <wei.liu2@citrix.com>,
	Stefan Hajnoczi <stefanha@redhat.com>,
	qemu-block@nongnu.org,
	Stefano Stabellini <stefano.stabellini@eu.citrix.com>,
	Joerg Roedel <joro@8bytes.org>,
	qemu-devel@nongnu.org, peterx@redhat.com,
	linux-kernel@vger.kernel.org,
	Christian Borntraeger <borntraeger@de.ibm.com>,
	iommu@lists.linux-foundation.org,
	Andy Lutomirski <luto@kernel.org>,
	kvm@vger.kernel.org, Amit Shah <amit.shah@redhat.com>,
	pbonzini@redhat.com, virtualization@lists.linux-foundation.org,
	Anthony PERARD <anthony.perard@citrix.com>
Subject: Re: [PATCH V2 RFC] fixup! virtio: convert to use DMA api
Date: Sun, 1 May 2016 13:37:49 +0300
Message-ID: <20160501132934-mutt-send-email-mst@redhat.com>
In-Reply-To: <1461858505.33870.108.camel@infradead.org>

On Thu, Apr 28, 2016 at 04:48:25PM +0100, David Woodhouse wrote:
> On Thu, 2016-04-28 at 18:37 +0300, Michael S. Tsirkin wrote:
> > OK, so for intel, it seems that it's enough to set
> > 	pdev->dev.archdata.iommu = DUMMY_DEVICE_DOMAIN_INFO;
> > for the device.
> 
> Yes, currently. Although that's vile. In fact what we *want* to happen
> is for the intel-iommu code simply to decline to provide DMA ops for
> this device, and let it fall back to the swiotlb or no-op DMA ops, as
> appropriate.
> 
> As it is, we have the intel-iommu DMA ops *unconditionally*, and they
> have a hack to manually fall back to calling swiotlb. It's all just
> horrid, which is why I want to clean it up with nice per-device DMA ops
> and discovery thereof :)
> 
> > Do I have to poke at each iommu implementation to find
> > a way to do this, or is there some way to do it
> > portably?
> 
> There *will* be.... Christoph has already done some of the cleanup in
> this space, and I need to take stock of what he's already done, and
> finish off the parts I want to build on top of it.
> 
> > Not exactly - I think that future versions of qemu might lie
> > about some devices but not others.
> 
> Can we keep this simple?
> 
> QEMU currently lies about some devices. Let's implement a heuristic for
> the guest OS to know about that, and react accordingly.
> 
> Then let's fix QEMU to tell the truth. All the time, unconditionally.
> Even on POWER/ARM where there's no obvious *way* for it to tell the
> truth (because you don't have the flexibility that DMAR tables do), and
> we need to devise a way to put it in the device-tree or fwcfg or
> something else.

Right.  Unfortunately none of these is easy to implement.
So I'm inclined to go the "something else" route.
It has the added benefit of giving us a heuristic for free.

> And only once QEMU consistently tells the *truth*, then we can start to
> do new stuff and let it actually change its behaviour.
> 
> > DMAR is unfortunately not a good match for what people do with QEMU.
> > 
> > There is a patchset on list fixing translation of assigned
> > devices. So the fix for these will simply be to do translation for
> > all assigned devices. It's harder for virtio as it isn't always
> > processed in QEMU - there's vhost in kernel and an out of process
> > vhost-user plugin. So we can end up e.g. with modern QEMU which
> > does translate in-process virtio but not out of process one.
> 
> Right... just stop. Fix QEMU to tell the truth first, and *then* once
> we can trust it, we can start to change its behaviour. :)
> 
> > Unfortunately people got used to being able to put any device
> > in any slot, and built external tools around that ability.
> > It's rather painful to break this assumption.
> 
> Well, if you just said you have a patch set which allows translation of
> assigned devices then you are most of the way there, aren't you? We
> just need to fix the out-of-process virtio case, and everything can be
> either translated or untranslated?

Absolutely. But that "just" will take a while.  With out-of-process
backends there's always a chance that the remote side doesn't implement
translation, e.g. a new QEMU running on an old host kernel.
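
To make the out-of-process problem concrete, here is a minimal sketch of
the per-device decision the hypervisor would have to make before it could
describe a device truthfully. All names are hypothetical; this is neither
QEMU nor kernel code, and it assumes a QEMU that does translate in-process
virtio.

```c
#include <stdbool.h>

/* Hypothetical types for illustration only; not existing QEMU or kernel API. */
enum virtio_backend {
    BACKEND_IN_PROCESS,   /* virtio processed inside QEMU itself */
    BACKEND_VHOST_KERNEL, /* vhost in the host kernel */
    BACKEND_VHOST_USER,   /* out-of-process vhost-user backend */
};

struct virtio_dev_cfg {
    enum virtio_backend backend;
    bool behind_iommu;       /* device is placed behind the emulated IOMMU */
    bool backend_translates; /* remote backend reported translation support */
};

/*
 * Return true only if DMA from this device really goes through IOMMU
 * translation, i.e. only if describing it as "translated" would be true.
 */
static bool dma_is_translated(const struct virtio_dev_cfg *cfg)
{
    if (!cfg->behind_iommu)
        return false;
    if (cfg->backend == BACKEND_IN_PROCESS)
        return true; /* assumption: this QEMU translates in-process virtio */
    /* An old host kernel or old vhost-user backend may not translate. */
    return cfg->backend_translates;
}
```

With a new QEMU on an old host kernel, a vhost device behind the IOMMU
would come out as untranslated here, which is the mixed case described
above.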

> -- 
> dwmw2
> 

