* virtio DMA API?
@ 2014-08-25 17:18 Andy Lutomirski
2014-08-25 18:54 ` Konrad Rzeszutek Wilk
2014-08-27 11:10 ` Rusty Russell
0 siblings, 2 replies; 10+ messages in thread
From: Andy Lutomirski @ 2014-08-25 17:18 UTC (permalink / raw)
To: Rusty Russell, Michael S. Tsirkin, virtualization,
Konrad Rzeszutek Wilk
Currently, a lot of the virtio code assumes that bus (i.e. hypervisor)
addresses are the same as physical addresses. This is false on Xen, so
virtio is completely broken. I wouldn't be surprised if it also
becomes a problem the first time that someone sticks a physical
"virtio" device on a 32-bit bus on an ARM SOC with more than 4G RAM.
Would you accept patches to convert virtio_ring and virtio_pci to use
the DMA APIs? I think that the only real catch will be that
virtio_ring's approach to freeing indirect blocks is currently
incompatible with the DMA API -- it assumes that knowing the bus
address is enough to call kfree, and I don't think that the DMA API
provides a reverse mapping like that.
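To make the indirect-block concern concrete, here is a minimal userspace sketch (not the virtio_ring code) of one possible way around the missing reverse mapping: the driver keeps its own CPU pointer for each indirect table next to the descriptor, so freeing never has to derive a pointer from the bus address. Every name here (mock_desc, mock_ring, add_indirect, detach_indirect, MOCK_BUS_OFFSET) is hypothetical, and the fixed offset is only an assumed stand-in for a platform where bus and physical addresses differ.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef uint64_t dma_addr_t;

#define RING_SIZE 4
/* Assumption for illustration: bus address = CPU address + fixed offset. */
#define MOCK_BUS_OFFSET 0x40000000ULL

/* Hypothetical descriptor as the device would see it. */
struct mock_desc {
    dma_addr_t addr;
    uint32_t len;
};

struct mock_ring {
    struct mock_desc desc[RING_SIZE]; /* device-visible part */
    void *indirect[RING_SIZE];        /* driver-private CPU pointers */
};

/* Allocate an indirect table of n entries and publish it at slot head. */
static int add_indirect(struct mock_ring *r, unsigned int head, unsigned int n)
{
    struct mock_desc *table;

    if (head >= RING_SIZE || r->indirect[head])
        return -1;
    table = calloc(n, sizeof(*table));
    if (!table)
        return -1;
    r->desc[head].addr = (dma_addr_t)(uintptr_t)table + MOCK_BUS_OFFSET;
    r->desc[head].len = (uint32_t)(n * sizeof(*table));
    r->indirect[head] = table; /* remember the pointer we will free later */
    return 0;
}

/* Free the indirect table without any bus-to-virtual reverse lookup. */
static void detach_indirect(struct mock_ring *r, unsigned int head)
{
    assert(head < RING_SIZE);
    free(r->indirect[head]);
    r->indirect[head] = NULL;
}
```

The cost of this approach is one extra pointer per ring slot; whether that is acceptable for the real driver is a separate question.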
--Andy
--
Andy Lutomirski
AMA Capital Management, LLC
* Re: virtio DMA API?
2014-08-25 17:18 virtio DMA API? Andy Lutomirski
@ 2014-08-25 18:54 ` Konrad Rzeszutek Wilk
2014-08-25 19:20 ` Andy Lutomirski
2014-08-27 11:10 ` Rusty Russell
1 sibling, 1 reply; 10+ messages in thread
From: Konrad Rzeszutek Wilk @ 2014-08-25 18:54 UTC (permalink / raw)
To: Andy Lutomirski; +Cc: virtualization, Michael S. Tsirkin
On Mon, Aug 25, 2014 at 10:18:46AM -0700, Andy Lutomirski wrote:
> Currently, a lot of the virtio code assumes that bus (i.e. hypervisor)
> addresses are the same as physical address. This is false on Xen, so
> virtio is completely broken. I wouldn't be surprised if it also
> becomes a problem the first time that someone sticks a physical
> "virtio" device on a 32-bit bus on an ARM SOC with more than 4G RAM.
>
> Would you accept patches to convert virtio_ring and virtio_pci to use
> the DMA APIs? I think that the only real catch will be that
> virtio_ring's approach to freeing indirect blocks is currently
> incompatible with the DMA API -- it assumes that knowing the bus
> address is enough to call kfree, and I don't think that the DMA API
> provides a reverse mapping like that.
If you use the dma_map/unmap_sg all of that ends up being stuck in the
sg structure (sg->dma_address ends with the DMA addr, sg_phys(sg) gives
you the physical address).
>
> --Andy
>
> --
> Andy Lutomirski
> AMA Capital Management, LLC
* Re: virtio DMA API?
2014-08-25 18:54 ` Konrad Rzeszutek Wilk
@ 2014-08-25 19:20 ` Andy Lutomirski
0 siblings, 0 replies; 10+ messages in thread
From: Andy Lutomirski @ 2014-08-25 19:20 UTC (permalink / raw)
To: Konrad Rzeszutek Wilk; +Cc: virtualization, Michael S. Tsirkin
On Mon, Aug 25, 2014 at 11:54 AM, Konrad Rzeszutek Wilk
<konrad.wilk@oracle.com> wrote:
> On Mon, Aug 25, 2014 at 10:18:46AM -0700, Andy Lutomirski wrote:
>> Currently, a lot of the virtio code assumes that bus (i.e. hypervisor)
>> addresses are the same as physical address. This is false on Xen, so
>> virtio is completely broken. I wouldn't be surprised if it also
>> becomes a problem the first time that someone sticks a physical
>> "virtio" device on a 32-bit bus on an ARM SOC with more than 4G RAM.
>>
>> Would you accept patches to convert virtio_ring and virtio_pci to use
>> the DMA APIs? I think that the only real catch will be that
>> virtio_ring's approach to freeing indirect blocks is currently
>> incompatible with the DMA API -- it assumes that knowing the bus
>> address is enough to call kfree, and I don't think that the DMA API
>> provides a reverse mapping like that.
>
> If you use the dma_map/unmap_sg all of that ends up being stuck in the
> sg structure (sg->dma_address ends with the DMA addr, sg_phys(sg) gives
> you the physical address).
Unfortunately, virtio_ring doesn't hang on to the sg structure until
completion. I don't think it can, either -- if I read it right, the
virtio_net driver uses one scatterlist per queue instead of one
scatterlist per pending skb, so the sg entries could be overwritten by
the time virtio_ring should unmap it. Fortunately, I think that
dma_unmap_single can handle this case just fine.
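As a rough illustration of the point above, the following userspace sketch records the bus address and length per ring slot at submission time, so completion can unmap without the caller's scatterlist still being intact. This is not the WIP code; mock_map_single and mock_unmap_single are hypothetical stand-ins for dma_map_single and dma_unmap_single, and the other names (desc_state, ring, ring_add_buf, ring_complete) are likewise made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t dma_addr_t;

#define RING_SIZE 8
/* Assumption for illustration: bus address = CPU address + fixed offset. */
#define MOCK_BUS_OFFSET 0x100000000ULL

/* Hypothetical per-slot bookkeeping owned by the ring, not by the caller. */
struct desc_state {
    dma_addr_t addr; /* bus address handed to the device */
    size_t len;      /* length of the mapping */
    int mapped;      /* nonzero while the mapping is live */
};

struct ring {
    struct desc_state state[RING_SIZE];
    unsigned int live_mappings; /* counts mappings not yet unmapped */
};

/* Stand-in for dma_map_single(). */
static dma_addr_t mock_map_single(struct ring *r, const void *buf)
{
    r->live_mappings++;
    return (dma_addr_t)(uintptr_t)buf + MOCK_BUS_OFFSET;
}

/* Stand-in for dma_unmap_single(): needs only the address and length. */
static void mock_unmap_single(struct ring *r, dma_addr_t addr, size_t len)
{
    (void)addr;
    (void)len;
    r->live_mappings--;
}

/* Submission: map the buffer and remember what unmap will need. */
static void ring_add_buf(struct ring *r, unsigned int head, void *buf, size_t len)
{
    assert(head < RING_SIZE);
    assert(!r->state[head].mapped);
    r->state[head].addr = mock_map_single(r, buf);
    r->state[head].len = len;
    r->state[head].mapped = 1;
}

/* Completion: unmap from the saved state; the scatterlist may be long gone. */
static void ring_complete(struct ring *r, unsigned int head)
{
    assert(head < RING_SIZE);
    assert(r->state[head].mapped);
    mock_unmap_single(r, r->state[head].addr, r->state[head].len);
    r->state[head].mapped = 0;
}
```

In this model a missing unmap call shows up as live_mappings never returning to zero, which mirrors the leak described for the iommu and swiotlb cases.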
I have a WIP here:
https://git.kernel.org/cgit/linux/kernel/git/luto/linux.git/log/?h=virtio_ring_xen
It works, but it's mostly missing unmap calls. If there's no iommu or
swiotlb, then there's nothing to leak, so it's okay. If there is one, then
this driver will eventually explode. I'll send patches once I have it
fixed up.
--Andy
* Re: virtio DMA API?
2014-08-25 17:18 virtio DMA API? Andy Lutomirski
2014-08-25 18:54 ` Konrad Rzeszutek Wilk
@ 2014-08-27 11:10 ` Rusty Russell
2014-08-27 11:52 ` Michael S. Tsirkin
` (2 more replies)
1 sibling, 3 replies; 10+ messages in thread
From: Rusty Russell @ 2014-08-27 11:10 UTC (permalink / raw)
To: Andy Lutomirski, Michael S. Tsirkin, virtualization,
Konrad Rzeszutek Wilk, Benjamin Herrenschmidt
Andy Lutomirski <luto@amacapital.net> writes:
> Currently, a lot of the virtio code assumes that bus (i.e. hypervisor)
> addresses are the same as physical address. This is false on Xen, so
> virtio is completely broken. I wouldn't be surprised if it also
> becomes a problem the first time that someone sticks a physical
> "virtio" device on a 32-bit bus on an ARM SOC with more than 4G RAM.
>
> Would you accept patches to convert virtio_ring and virtio_pci to use
> the DMA APIs? I think that the only real catch will be that
> virtio_ring's approach to freeing indirect blocks is currently
> incompatible with the DMA API -- it assumes that knowing the bus
> address is enough to call kfree, and I don't think that the DMA API
> provides a reverse mapping like that.
Hi Andy,
This has long been a source of contention. virtio assumes that
the hypervisor can decode guest-physical addresses.
PowerPC, in particular, doesn't want to pay the cost of IOMMU
manipulations, and all arguments presented so far for using an IOMMU for
a virtio device are weak. And changing to use DMA APIs would break them
anyway.
Of course, it's Just A Matter of Code, so it's possible to
create a Xen-specific variant which uses the DMA APIs. I'm not sure
what that would look like in the virtio standard, however.
Cheers,
Rusty.
* Re: virtio DMA API?
2014-08-27 11:10 ` Rusty Russell
@ 2014-08-27 11:52 ` Michael S. Tsirkin
2014-08-27 19:49 ` Konrad Rzeszutek Wilk
2014-08-27 21:32 ` Benjamin Herrenschmidt
2014-08-27 14:55 ` Andy Lutomirski
2014-08-27 21:31 ` Benjamin Herrenschmidt
2 siblings, 2 replies; 10+ messages in thread
From: Michael S. Tsirkin @ 2014-08-27 11:52 UTC (permalink / raw)
To: Rusty Russell
Cc: Benjamin Herrenschmidt, virtualization, Konrad Rzeszutek Wilk,
Andy Lutomirski
On Wed, Aug 27, 2014 at 08:40:51PM +0930, Rusty Russell wrote:
> Andy Lutomirski <luto@amacapital.net> writes:
> > Currently, a lot of the virtio code assumes that bus (i.e. hypervisor)
> > addresses are the same as physical address. This is false on Xen, so
> > virtio is completely broken. I wouldn't be surprised if it also
> > becomes a problem the first time that someone sticks a physical
> > "virtio" device on a 32-bit bus on an ARM SOC with more than 4G RAM.
> >
> > Would you accept patches to convert virtio_ring and virtio_pci to use
> > the DMA APIs? I think that the only real catch will be that
> > virtio_ring's approach to freeing indirect blocks is currently
> > incompatible with the DMA API -- it assumes that knowing the bus
> > address is enough to call kfree, and I don't think that the DMA API
> > provides a reverse mapping like that.
>
> Hi Andy,
>
> This has long been a source of contention. virtio assumes that
> the hypervisor can decode guest-physical addresses.
>
> PowerPC, in particular, doesn't want to pay the cost of IOMMU
> manipulations, and all arguments presented so far for using an IOMMU for
> a virtio device are weak. And changing to use DMA APIs would break them
> anyway.
>
> Of course, it's Just A Matter of Code, so it's possible to
> create a Xen-specific variant which uses the DMA APIs. I'm not sure
> what that would look like in the virtio standard, however.
>
> Cheers,
> Rusty.
For x86 as of QEMU 2.0 there's no iommu.
So a reasonable thing to do for that platform
might be to always use iommu *if it's there*.
My understanding is this isn't the case for powerpc?
--
MST
* Re: virtio DMA API?
2014-08-27 11:10 ` Rusty Russell
2014-08-27 11:52 ` Michael S. Tsirkin
@ 2014-08-27 14:55 ` Andy Lutomirski
2014-08-27 21:31 ` Benjamin Herrenschmidt
2 siblings, 0 replies; 10+ messages in thread
From: Andy Lutomirski @ 2014-08-27 14:55 UTC (permalink / raw)
To: Rusty Russell
Cc: Benjamin Herrenschmidt, virtualization, Konrad Rzeszutek Wilk,
Michael S. Tsirkin
On Aug 27, 2014 4:30 AM, "Rusty Russell" <rusty@rustcorp.com.au> wrote:
>
> Andy Lutomirski <luto@amacapital.net> writes:
> > Currently, a lot of the virtio code assumes that bus (i.e. hypervisor)
> > addresses are the same as physical address. This is false on Xen, so
> > virtio is completely broken. I wouldn't be surprised if it also
> > becomes a problem the first time that someone sticks a physical
> > "virtio" device on a 32-bit bus on an ARM SOC with more than 4G RAM.
> >
> > Would you accept patches to convert virtio_ring and virtio_pci to use
> > the DMA APIs? I think that the only real catch will be that
> > virtio_ring's approach to freeing indirect blocks is currently
> > incompatible with the DMA API -- it assumes that knowing the bus
> > address is enough to call kfree, and I don't think that the DMA API
> > provides a reverse mapping like that.
>
> Hi Andy,
>
> This has long been a source of contention. virtio assumes that
> the hypervisor can decode guest-physical addresses.
>
> PowerPC, in particular, doesn't want to pay the cost of IOMMU
> manipulations, and all arguments presented so far for using an IOMMU for
> a virtio device are weak. And changing to use DMA APIs would break them
> anyway.
>
> Of course, it's Just A Matter of Code, so it's possible to
> create a Xen-specific variant which uses the DMA APIs. I'm not sure
> what that would look like in the virtio standard, however.
I'll reply in the other thread to keep everything in one place.
>
> Cheers,
> Rusty.
* Re: virtio DMA API?
2014-08-27 11:52 ` Michael S. Tsirkin
@ 2014-08-27 19:49 ` Konrad Rzeszutek Wilk
2014-08-27 21:32 ` Benjamin Herrenschmidt
1 sibling, 0 replies; 10+ messages in thread
From: Konrad Rzeszutek Wilk @ 2014-08-27 19:49 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: Benjamin Herrenschmidt, virtualization, Andy Lutomirski
On Wed, Aug 27, 2014 at 01:52:50PM +0200, Michael S. Tsirkin wrote:
> On Wed, Aug 27, 2014 at 08:40:51PM +0930, Rusty Russell wrote:
> > Andy Lutomirski <luto@amacapital.net> writes:
> > > Currently, a lot of the virtio code assumes that bus (i.e. hypervisor)
> > > addresses are the same as physical address. This is false on Xen, so
> > > virtio is completely broken. I wouldn't be surprised if it also
> > > becomes a problem the first time that someone sticks a physical
> > > "virtio" device on a 32-bit bus on an ARM SOC with more than 4G RAM.
> > >
> > > Would you accept patches to convert virtio_ring and virtio_pci to use
> > > the DMA APIs? I think that the only real catch will be that
> > > virtio_ring's approach to freeing indirect blocks is currently
> > > incompatible with the DMA API -- it assumes that knowing the bus
> > > address is enough to call kfree, and I don't think that the DMA API
> > > provides a reverse mapping like that.
> >
> > Hi Andy,
> >
> > This has long been a source of contention. virtio assumes that
> > the hypervisor can decode guest-physical addresses.
> >
> > PowerPC, in particular, doesn't want to pay the cost of IOMMU
> > manipulations, and all arguments presented so far for using an IOMMU for
> > a virtio device are weak. And changing to use DMA APIs would break them
> > anyway.
> >
> > Of course, it's Just A Matter of Code, so it's possible to
> > create a Xen-specific variant which uses the DMA APIs. I'm not sure
> > what that would look like in the virtio standard, however.
> >
> > Cheers,
> > Rusty.
>
> For x86 as of QEMU 2.0 there's no iommu.
> So a reasonable thing to do for that platform
> might be to always use iommu *if it's there*.
> My understanding is this isn't the case for powerpc?
Wasn't there some implementation of AMD IOMMU code on QEMU
mailing list floating around? Aha:
https://www.mail-archive.com/kvm@vger.kernel.org/msg40516.html
* Re: virtio DMA API?
2014-08-27 11:10 ` Rusty Russell
2014-08-27 11:52 ` Michael S. Tsirkin
2014-08-27 14:55 ` Andy Lutomirski
@ 2014-08-27 21:31 ` Benjamin Herrenschmidt
2014-08-29 15:06 ` Konrad Rzeszutek Wilk
2 siblings, 1 reply; 10+ messages in thread
From: Benjamin Herrenschmidt @ 2014-08-27 21:31 UTC (permalink / raw)
To: Rusty Russell
Cc: Michael S. Tsirkin, virtualization, Konrad Rzeszutek Wilk,
Andy Lutomirski
On Wed, 2014-08-27 at 20:40 +0930, Rusty Russell wrote:
> Hi Andy,
>
> This has long been a source of contention. virtio assumes that
> the hypervisor can decode guest-physical addresses.
>
> PowerPC, in particular, doesn't want to pay the cost of IOMMU
> manipulations, and all arguments presented so far for using an IOMMU for
> a virtio device are weak. And changing to use DMA APIs would break them
> anyway.
>
> Of course, it's Just A Matter of Code, so it's possible to
> create a Xen-specific variant which uses the DMA APIs. I'm not sure
> what that would look like in the virtio standard, however.
So this has popped up in the past a few times already from people who
want to use virtio as a transport between physical systems connected
via a bus like PCI using non-transparent bridges for example.
There's a way to get both here that isn't too nasty... we can make the
virtio drivers use the dma_map_* APIs and just switch the dma_ops in
the struct device based on the hypervisor requirements. I.e., for KVM we
could attach a set of ops that basically just return the physical
address, real PCI transport would use the normal callbacks etc...
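The idea of switching ops per device can be sketched in plain userspace C as follows. This is only a model under stated assumptions, not the kernel's struct dma_map_ops: mock_device, mock_dma_ops, direct_ops, xlate_ops and the driver_* helpers are all hypothetical names, and the translating variant simply adds an assumed fixed bus offset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t dma_addr_t;
typedef uint64_t phys_addr_t;

struct mock_device;

/* Hypothetical, much-reduced analogue of a DMA ops table. */
struct mock_dma_ops {
    dma_addr_t (*map)(struct mock_device *dev, phys_addr_t pa, size_t len);
    void (*unmap)(struct mock_device *dev, dma_addr_t da, size_t len);
};

struct mock_device {
    const struct mock_dma_ops *ops; /* chosen once, per transport/hypervisor */
    uint64_t bus_offset;            /* used only by the translating ops */
    unsigned int live;              /* outstanding translated mappings */
};

/* Passthrough ops: the bus address is the physical address, no bookkeeping. */
static dma_addr_t direct_map(struct mock_device *dev, phys_addr_t pa, size_t len)
{
    (void)dev;
    (void)len;
    return pa;
}

static void direct_unmap(struct mock_device *dev, dma_addr_t da, size_t len)
{
    (void)dev;
    (void)da;
    (void)len;
}

/* Translating ops: model a platform where bus != physical. */
static dma_addr_t xlate_map(struct mock_device *dev, phys_addr_t pa, size_t len)
{
    (void)len;
    dev->live++;
    return pa + dev->bus_offset;
}

static void xlate_unmap(struct mock_device *dev, dma_addr_t da, size_t len)
{
    (void)da;
    (void)len;
    dev->live--;
}

static const struct mock_dma_ops direct_ops = { .map = direct_map, .unmap = direct_unmap };
static const struct mock_dma_ops xlate_ops = { .map = xlate_map, .unmap = xlate_unmap };

/* Driver-side code is identical for both kinds of device. */
static dma_addr_t driver_map_buffer(struct mock_device *dev, phys_addr_t pa, size_t len)
{
    return dev->ops->map(dev, pa, len);
}

static void driver_unmap_buffer(struct mock_device *dev, dma_addr_t da, size_t len)
{
    dev->ops->unmap(dev, da, len);
}
```

The passthrough path here costs one indirect call per map; whether that is cheap enough for the virtio-on-KVM fast path would need measuring.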
The only problem at the moment is that the dma_map_ops, while
defined generically, aren't plumbed into the generic struct device
but instead, on some architectures, into dev_archdata. This includes
powerpc, ARM and x86 (under a CONFIG option for the latter which
is only enabled on x86_64 and some oddball i386 variant).
So either we switch to have all architectures we care about always
use the generic DMA ops and move the pointer to struct device, or
we create another inline "indirection" to deal with the cases
without the dma_map_ops...
Cheers,
Ben.
* Re: virtio DMA API?
2014-08-27 11:52 ` Michael S. Tsirkin
2014-08-27 19:49 ` Konrad Rzeszutek Wilk
@ 2014-08-27 21:32 ` Benjamin Herrenschmidt
1 sibling, 0 replies; 10+ messages in thread
From: Benjamin Herrenschmidt @ 2014-08-27 21:32 UTC (permalink / raw)
To: Michael S. Tsirkin; +Cc: virtualization, Konrad Rzeszutek Wilk, Andy Lutomirski
On Wed, 2014-08-27 at 13:52 +0200, Michael S. Tsirkin wrote:
> For x86 as of QEMU 2.0 there's no iommu.
> So a reasonable thing to do for that platform
> might be to always use iommu *if it's there*.
> My understanding is this isn't the case for powerpc?
All 64-bit powerpc have an iommu but not all 32-bit ones.
Also using/emulating one has a cost. Whatever we do shouldn't
impair the fast path of virtio-on-kvm doing direct physical
access.
Cheers,
Ben.
* Re: virtio DMA API?
2014-08-27 21:31 ` Benjamin Herrenschmidt
@ 2014-08-29 15:06 ` Konrad Rzeszutek Wilk
0 siblings, 0 replies; 10+ messages in thread
From: Konrad Rzeszutek Wilk @ 2014-08-29 15:06 UTC (permalink / raw)
To: Benjamin Herrenschmidt
Cc: Michael S. Tsirkin, virtualization, Andy Lutomirski
On Thu, Aug 28, 2014 at 07:31:16AM +1000, Benjamin Herrenschmidt wrote:
> On Wed, 2014-08-27 at 20:40 +0930, Rusty Russell wrote:
>
> > Hi Andy,
> >
> > This has long been a source of contention. virtio assumes that
> > the hypervisor can decode guest-physical addresses.
> >
> > PowerPC, in particular, doesn't want to pay the cost of IOMMU
> > manipulations, and all arguments presented so far for using an IOMMU for
> > a virtio device are weak. And changing to use DMA APIs would break them
> > anyway.
> >
> > Of course, it's Just A Matter of Code, so it's possible to
> > create a Xen-specific variant which uses the DMA APIs. I'm not sure
> > what that would look like in the virtio standard, however.
>
> So this has popped up in the past a few times already from people who
> want to use virtio as a transport between physical systems connected
> via a bus like PCI using non-transparent bridges for example.
>
> There's a way to get both here that isn't too nasty... we can make the
> virtio drivers use the dma_map_* APIs and just switch the dma_ops in
> the struct device based on the hypervisor requirements. IE. For KVM we
> could attach a set of ops that basically just return the physical
> address, real PCI transport would use the normal callbacks etc...
Right.
>
> The only problem at the moment is that the dma_map_ops, while
> defined generically, aren't plumbed into the generic struct device
> but instead on some architectures dev_archdata. This includes
> powerpc, ARM and x86 (under a CONFIG option for the latter which
> is only enabled on x86_64 and some oddball i386 variant).
I am not following the interaction between 'struct device', 'struct
dev_archdata' and 'struct dma_map_ops'. The 'struct dma_map_ops' should
be able to exist without having to exist in the other structures?
Naturally the implementation of 'struct dma_map_ops' has to use
'struct device', otherwise it can't get details such as dma_mapping.
>
> So either we switch to have all architectures we care about always
> use the generic DMA ops and move the pointer to struct device, or
> we create another inline "indirection" to deal with the cases
> without the dma_map_ops...
Or you implement a passthrough 'dma_map_ops' like the one you suggested?
Though I feel I am not grokking something from your email. Hmm, time
to get some more coffee.
>
> Cheers,
> Ben.
>
>