All of lore.kernel.org
 help / color / mirror / Atom feed
* RE: USB virt status
@ 2005-11-11 23:23 Ian Pratt
  2005-11-13 18:28 ` USB virt status --- Help please!!! Harry Butterworth
  0 siblings, 1 reply; 15+ messages in thread
From: Ian Pratt @ 2005-11-11 23:23 UTC (permalink / raw)
  To: harry, sanjay.kushwaha, mark.williamson, xen-devel

 
> I've spent a few days testing and fixing bugs and it's now 
> possible to create and mount a filesystem on a USB key in the 
> FE using this code.

Excellent, sounds promising.  
 
> I'm in the middle of debugging a problem where dom0 reboots 
> when flushing a big file to the USB key.  I didn't manage to 
> find a serial cable for a serial console this evening so I 
> don't have much to go on yet.

You could try 'noreboot' and stick to a text mode.

It'll be good to get USB functionality back in the tree.

Thanks,
Ian

^ permalink raw reply	[flat|nested] 15+ messages in thread

* RE: USB virt status --- Help please!!!
  2005-11-11 23:23 USB virt status Ian Pratt
@ 2005-11-13 18:28 ` Harry Butterworth
  2005-11-14  9:56   ` Keir Fraser
  0 siblings, 1 reply; 15+ messages in thread
From: Harry Butterworth @ 2005-11-13 18:28 UTC (permalink / raw)
  To: Ian Pratt; +Cc: xen-devel, mark.williamson, keir.fraser, sanjay.kushwaha

> > I'm in the middle of debugging a problem where dom0 reboots 
> > when flushing a big file to the USB key.  I didn't manage to 
> > find a serial cable for a serial console this evening so I 
> > don't have much to go on yet.
> 
> You could try 'noreboot' and stick to a text mode.

Thanks, yes this worked.

It turns out that dom0 is bombing out in dma_map_single
(arch/xen/i386/kernel/pci-dma.c) with a failure of
range_straddles_page_boundary and if I use swiotlb=force on the command
line as suggested by the printk then it all seems to work fine.

Looking at the kernel documentation for the DMA api I get the impression
that dma_map_single ought to work across page boundaries provided the
pages are contiguous in physical memory.  Given that the code seems to
work writing and verifying large random files with swiotlb=force I think
I'm probably not getting the addresses wrong and I'm assuming that if
the memory was good enough for the USB driver in the FE it ought to be
good enough for the USB driver in the BE.

So, to me this looks like a bug in the xen implementation of
dma_map_single which I think ought to not fail just because of crossing
a page boundary but only if the memory is discontiguous.

But, I don't really know enough about the memory model to know if this
is right.

So, I need to know what I ought to be doing with the USB memory mapping.

Currently, I'm taking the FE pointer to the memory buffer and creating a
grant reference for each page spanned by the request then mapping them
into the BE address space in an empty page range taken from the balloon
driver.  This is the same as the block driver where I copied the code
from.

Then I pass the address of the start of the buffer in the empty page
range in the URB to the usb core.

Is this the right approach? Is the check in dma_map_single overzealous?

Harry.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: USB virt status --- Help please!!!
  2005-11-13 18:28 ` USB virt status --- Help please!!! Harry Butterworth
@ 2005-11-14  9:56   ` Keir Fraser
  2005-11-14 10:12     ` harry
  0 siblings, 1 reply; 15+ messages in thread
From: Keir Fraser @ 2005-11-14  9:56 UTC (permalink / raw)
  To: Harry Butterworth; +Cc: Ian Pratt, mark.williamson, xen-devel, sanjay.kushwaha


On 13 Nov 2005, at 18:28, Harry Butterworth wrote:

> Is this the right approach? Is the check in dma_map_single overzealous?

When running on Xen, just because Linux has allocated adjacent pages 
from its 'physical' memory map, doesn't mean they really are physically 
contiguous. For I/O we have to go to extra effort to really allocate 
truly contiguous multi-page extents.

So your code is correct. If you can preallocate buffers then you could 
do that with dma_alloc_coherent and that would guarantee contigous 
buffers....

  -- Keir

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: USB virt status --- Help please!!!
  2005-11-14  9:56   ` Keir Fraser
@ 2005-11-14 10:12     ` harry
  2005-11-14 10:27       ` harry
  2005-11-14 10:33       ` Keir Fraser
  0 siblings, 2 replies; 15+ messages in thread
From: harry @ 2005-11-14 10:12 UTC (permalink / raw)
  To: Keir Fraser; +Cc: Ian Pratt, xen-devel, mark.williamson, sanjay.kushwaha

On Mon, 2005-11-14 at 09:56 +0000, Keir Fraser wrote:
> On 13 Nov 2005, at 18:28, Harry Butterworth wrote:
> 
> > Is this the right approach? Is the check in dma_map_single overzealous?
> 
> When running on Xen, just because Linux has allocated adjacent pages 
> from its 'physical' memory map, doesn't mean they really are physically 
> contiguous. For I/O we have to go to extra effort to really allocate 
> truly contiguous multi-page extents.
> 
> So your code is correct. If you can preallocate buffers then you could 
> do that with dma_alloc_coherent and that would guarantee contigous 
> buffers....

I'm given the buffers by the assorted USB drivers.  I could stage the
data using memcpy into my own buffer allocated using dma_alloc_coherent.
Or should I go through all the USB drivers and change the buffer
allocation?

> 
>   -- Keir
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xensource.com
> http://lists.xensource.com/xen-devel
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: USB virt status --- Help please!!!
  2005-11-14 10:12     ` harry
@ 2005-11-14 10:27       ` harry
  2005-11-14 10:33       ` Keir Fraser
  1 sibling, 0 replies; 15+ messages in thread
From: harry @ 2005-11-14 10:27 UTC (permalink / raw)
  To: Keir Fraser; +Cc: Ian Pratt, xen-devel, mark.williamson, sanjay.kushwaha

On Mon, 2005-11-14 at 10:12 +0000, harry wrote:
> On Mon, 2005-11-14 at 09:56 +0000, Keir Fraser wrote:
> > On 13 Nov 2005, at 18:28, Harry Butterworth wrote:
> > 
> > > Is this the right approach? Is the check in dma_map_single overzealous?
> > 
> > When running on Xen, just because Linux has allocated adjacent pages 
> > from its 'physical' memory map, doesn't mean they really are physically 
> > contiguous. For I/O we have to go to extra effort to really allocate 
> > truly contiguous multi-page extents.
> > 
> > So your code is correct. If you can preallocate buffers then you could 
> > do that with dma_alloc_coherent and that would guarantee contigous 
> > buffers....
> 
> I'm given the buffers by the assorted USB drivers.  I could stage the
> data using memcpy into my own buffer allocated using dma_alloc_coherent.
> Or should I go through all the USB drivers and change the buffer
> allocation?

It looks like there is an operation usb_buffer_alloc which I can
implement to give the right kind of buffers to clients.  Any buffers not
allocated from here can be staged.  Should hopefully be OK.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: USB virt status --- Help please!!!
  2005-11-14 10:12     ` harry
  2005-11-14 10:27       ` harry
@ 2005-11-14 10:33       ` Keir Fraser
  2005-11-14 17:30         ` Stephen C. Tweedie
  1 sibling, 1 reply; 15+ messages in thread
From: Keir Fraser @ 2005-11-14 10:33 UTC (permalink / raw)
  To: harry; +Cc: Ian Pratt, xen-devel, mark.williamson, sanjay.kushwaha


On 14 Nov 2005, at 10:12, harry wrote:

>> When running on Xen, just because Linux has allocated adjacent pages
>> from its 'physical' memory map, doesn't mean they really are 
>> physically
>> contiguous. For I/O we have to go to extra effort to really allocate
>> truly contiguous multi-page extents.
>>
>> So your code is correct. If you can preallocate buffers then you could
>> do that with dma_alloc_coherent and that would guarantee contigous
>> buffers....
>
> I'm given the buffers by the assorted USB drivers.  I could stage the
> data using memcpy into my own buffer allocated using 
> dma_alloc_coherent.
> Or should I go through all the USB drivers and change the buffer
> allocation?

Your first approach would be equivalent to just using swiotlb. But  it 
is easy to implement and at least avoids 'use swiotlb=force' crashes. 
One possibility is to switch to having swiotlb default to 'on' and then 
need to force it off.

The second approach would never get past the kernel maintainers. This 
is a very xen-specific problem and the maintainers won't allow us to 
solve it by hacking at generic driver code.

  -- Keir

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: USB virt status --- Help please!!!
  2005-11-14 10:33       ` Keir Fraser
@ 2005-11-14 17:30         ` Stephen C. Tweedie
  2005-11-14 18:42           ` Keir Fraser
  0 siblings, 1 reply; 15+ messages in thread
From: Stephen C. Tweedie @ 2005-11-14 17:30 UTC (permalink / raw)
  To: Keir Fraser; +Cc: Ian Pratt, harry, xen-devel, sanjay.kushwaha, mark.williamson

Hi,

On Mon, 2005-11-14 at 10:33 +0000, Keir Fraser wrote:

> Your first approach would be equivalent to just using swiotlb. But  it 
> is easy to implement and at least avoids 'use swiotlb=force' crashes. 
> One possibility is to switch to having swiotlb default to 'on' and then 
> need to force it off.
> 
> The second approach would never get past the kernel maintainers. This 
> is a very xen-specific problem and the maintainers won't allow us to 
> solve it by hacking at generic driver code.

Having swiotlb the default would also have dealt with the network DMA
problem we had recently, with pci_map_single() refusing to work over
page boundaries.

That got fixed that by adding an arch-specific __alloc_skb which avoided
crossing page boundaries by disabling CONFIG_DEBUG_SLAB for the skb
caches.  But that fix is also something that I doubt maintainers will
allow to go upstream.  I wonder if we'll find enough such special cases
that we really want to have a swiotlb available, just in case, at all
times?  

In cases where we expect most IO not to need it, we can reduce the
default size of it at boot time.  swiotlb_map_single() already tests
address_needs_mapping() as almost the first thing it does, so hopefully
the fast path for cases where the swiotlb is present but not needed will
be good enough to have it always available.

--Stephen

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: USB virt status --- Help please!!!
  2005-11-14 17:30         ` Stephen C. Tweedie
@ 2005-11-14 18:42           ` Keir Fraser
  2005-11-14 18:50             ` Stephen C. Tweedie
  0 siblings, 1 reply; 15+ messages in thread
From: Keir Fraser @ 2005-11-14 18:42 UTC (permalink / raw)
  To: Stephen C. Tweedie
  Cc: Ian Pratt, harry, xen-devel, mark.williamson, sanjay.kushwaha


On 14 Nov 2005, at 17:30, Stephen C. Tweedie wrote:

> Having swiotlb the default would also have dealt with the network DMA
> problem we had recently, with pci_map_single() refusing to work over
> page boundaries.
>
> That got fixed that by adding an arch-specific __alloc_skb which 
> avoided
> crossing page boundaries by disabling CONFIG_DEBUG_SLAB for the skb
> caches.  But that fix is also something that I doubt maintainers will
> allow to go upstream.  I wonder if we'll find enough such special cases
> that we really want to have a swiotlb available, just in case, at all
> times?

swiotlb performance is pretty poor since it involves memcpy. The 
alloc_skb fix was pretty clean, if we're allowed an arch hook for 
alloc_skb. We didn;t need to hack the slab allocator itself.

  -- Keir

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: USB virt status --- Help please!!!
  2005-11-14 18:42           ` Keir Fraser
@ 2005-11-14 18:50             ` Stephen C. Tweedie
  2005-11-14 20:36               ` Harry Butterworth
  2005-11-14 21:46               ` Muli Ben-Yehuda
  0 siblings, 2 replies; 15+ messages in thread
From: Stephen C. Tweedie @ 2005-11-14 18:50 UTC (permalink / raw)
  To: Keir Fraser; +Cc: Ian Pratt, harry, xen-devel, sanjay.kushwaha, mark.williamson

Hi,

On Mon, 2005-11-14 at 18:42 +0000, Keir Fraser wrote:

> > That got fixed that by adding an arch-specific __alloc_skb which 
> > avoided
> > crossing page boundaries by disabling CONFIG_DEBUG_SLAB for the skb
> > caches.  But that fix is also something that I doubt maintainers will
> > allow to go upstream.  I wonder if we'll find enough such special cases
> > that we really want to have a swiotlb available, just in case, at all
> > times?
> 
> swiotlb performance is pretty poor since it involves memcpy. The 
> alloc_skb fix was pretty clean, if we're allowed an arch hook for 
> alloc_skb. We didn;t need to hack the slab allocator itself.

There are two different questions --- whether we should rely on swiotlb
in the general case, and whether we should have it available
just-in-case for edge cases.

I'm not suggesting that we fix the networking problem via swiotlb.  But
that problem does make me wonder what other edge-cases remain lurking
that may bite users later; and if they exist, panicing the kernel when
we get an IO we can't translate directly is a lot worse than falling
back to a slow swiotlb path.

The fact is, on 2.6 the slab allocator can hand out objects (even
sub-pagesized ones) that cross page boundaries.  If some driver happens
to allocate its own objects from a slab cache and run pci_map_single()
on them, it works fine on 2.6 mainline but breaks on Xen w/o swiotlb.

The alloc_skb is just avoiding one special case of this.  It's an
important special case, sure, but to be robust, might we not want to
have a minimal swiotlb cache available at all times as fallback?

--Stephen

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: USB virt status --- Help please!!!
  2005-11-14 18:50             ` Stephen C. Tweedie
@ 2005-11-14 20:36               ` Harry Butterworth
  2005-11-14 21:00                 ` Muli Ben-Yehuda
  2005-11-14 21:46               ` Muli Ben-Yehuda
  1 sibling, 1 reply; 15+ messages in thread
From: Harry Butterworth @ 2005-11-14 20:36 UTC (permalink / raw)
  To: Stephen C. Tweedie; +Cc: Ian Pratt, xen-devel, mark.williamson, sanjay.kushwaha

On Mon, 2005-11-14 at 13:50 -0500, Stephen C. Tweedie wrote:
> Hi,
> 
> On Mon, 2005-11-14 at 18:42 +0000, Keir Fraser wrote:
> 
> > > That got fixed that by adding an arch-specific __alloc_skb which 
> > > avoided
> > > crossing page boundaries by disabling CONFIG_DEBUG_SLAB for the skb
> > > caches.  But that fix is also something that I doubt maintainers will
> > > allow to go upstream.  I wonder if we'll find enough such special cases
> > > that we really want to have a swiotlb available, just in case, at all
> > > times?
> > 
> > swiotlb performance is pretty poor since it involves memcpy. The 
> > alloc_skb fix was pretty clean, if we're allowed an arch hook for 
> > alloc_skb. We didn;t need to hack the slab allocator itself.
> 
> There are two different questions --- whether we should rely on swiotlb
> in the general case, and whether we should have it available
> just-in-case for edge cases.
> 
> I'm not suggesting that we fix the networking problem via swiotlb.  But
> that problem does make me wonder what other edge-cases remain lurking
> that may bite users later; and if they exist, panicing the kernel when
> we get an IO we can't translate directly is a lot worse than falling
> back to a slow swiotlb path.
> 
> The fact is, on 2.6 the slab allocator can hand out objects (even
> sub-pagesized ones) that cross page boundaries.  If some driver happens
> to allocate its own objects from a slab cache and run pci_map_single()
> on them, it works fine on 2.6 mainline but breaks on Xen w/o swiotlb.
> 
> The alloc_skb is just avoiding one special case of this.  It's an
> important special case, sure, but to be robust, might we not want to
> have a minimal swiotlb cache available at all times as fallback?
> 
> --Stephen

I agree:  there has to be code in the BE somewhere to check that the
pages used by the FE are good otherwise there is a big security hole:
any FE can toast dom0 by invoking the BUG in pci_map_single.

Having swiotlb on by default would be convenient because it would
implement the checking once for all the drivers that need it.

If it's off, I'm going to at least have to implement extra checking in
the BE to avoid the security hole and my own staging if I want to
guarantee that the USB driver will work reliably.

So, I'd like it to be on as a fallback.

-- 
Harry Butterworth <harry@hebutterworth.freeserve.co.uk>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: USB virt status --- Help please!!!
  2005-11-14 20:36               ` Harry Butterworth
@ 2005-11-14 21:00                 ` Muli Ben-Yehuda
  0 siblings, 0 replies; 15+ messages in thread
From: Muli Ben-Yehuda @ 2005-11-14 21:00 UTC (permalink / raw)
  To: Harry Butterworth; +Cc: xen-devel, mark.williamson, sanjay.kushwaha, Ian Pratt

On Mon, Nov 14, 2005 at 08:36:07PM +0000, Harry Butterworth wrote:

> So, I'd like it to be on as a fallback.

I've been pondering an extension of my dma_ops patch[0] to allow
setting the dma_ops dynamically on a per device / subsystem, somewhat
similar to how ppc does it. The objective would be to use nommu where
possible and safe, and fallback to swiotlb for devices or subsystems
that aren't safe otherwise. So for net and block we could have nommu
and fall back to swiotlb for e.g. USB. What do people think about this
approach?

Cheers,
Muli
-- 
Muli Ben-Yehuda
http://www.mulix.org | http://mulix.livejournal.com/

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: USB virt status --- Help please!!!
  2005-11-14 18:50             ` Stephen C. Tweedie
  2005-11-14 20:36               ` Harry Butterworth
@ 2005-11-14 21:46               ` Muli Ben-Yehuda
  2005-11-15 10:23                 ` Keir Fraser
  1 sibling, 1 reply; 15+ messages in thread
From: Muli Ben-Yehuda @ 2005-11-14 21:46 UTC (permalink / raw)
  To: Stephen C. Tweedie
  Cc: Ian Pratt, harry, xen-devel, sanjay.kushwaha, mark.williamson

On Mon, Nov 14, 2005 at 01:50:39PM -0500, Stephen C. Tweedie wrote:

> The alloc_skb is just avoiding one special case of this.  It's an
> important special case, sure, but to be robust, might we not want to
> have a minimal swiotlb cache available at all times as fallback?

I think we certainly do. swiotlb will already avoid bouncing where
possible via the checks for address_needs_mapping(). Is swiotlb
prohibitively expensive even when no bouncing is necessary? if yes, we
could either trim it down or just use it selectively as I outlined
earlier on a per bus / device / subsystem basis.

Cheers,
Muli
-- 
Muli Ben-Yehuda
http://www.mulix.org | http://mulix.livejournal.com/

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: USB virt status --- Help please!!!
  2005-11-14 21:46               ` Muli Ben-Yehuda
@ 2005-11-15 10:23                 ` Keir Fraser
  2005-11-15 19:14                   ` Muli Ben-Yehuda
  0 siblings, 1 reply; 15+ messages in thread
From: Keir Fraser @ 2005-11-15 10:23 UTC (permalink / raw)
  To: Muli Ben-Yehuda
  Cc: Ian Pratt, harry, xen-devel, sanjay.kushwaha, mark.williamson


On 14 Nov 2005, at 21:46, Muli Ben-Yehuda wrote:

>> The alloc_skb is just avoiding one special case of this.  It's an
>> important special case, sure, but to be robust, might we not want to
>> have a minimal swiotlb cache available at all times as fallback?
>
> I think we certainly do. swiotlb will already avoid bouncing where
> possible via the checks for address_needs_mapping(). Is swiotlb
> prohibitively expensive even when no bouncing is necessary? if yes, we
> could either trim it down or just use it selectively as I outlined
> earlier on a per bus / device / subsystem basis.

Not expensive in time, but it does waste memory if it's unused. 
Currently we allocate a 64MB aperture if the machine has memory mapped 
above 2GB (this catches devices like aacraid that can only address 31 
bits). I could also allocate an aperture of, say, 2MB for smaller 
systems. There is of course a tension between not wasting memory yet 
having a big enough aperture that the system will not run out of iommu 
space and crash even when stressed.

  -- Keir

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: USB virt status --- Help please!!!
  2005-11-15 10:23                 ` Keir Fraser
@ 2005-11-15 19:14                   ` Muli Ben-Yehuda
  0 siblings, 0 replies; 15+ messages in thread
From: Muli Ben-Yehuda @ 2005-11-15 19:14 UTC (permalink / raw)
  To: Keir Fraser; +Cc: Ian Pratt, harry, xen-devel, sanjay.kushwaha, mark.williamson

On Tue, Nov 15, 2005 at 10:23:40AM +0000, Keir Fraser wrote:

> Not expensive in time, but it does waste memory if it's unused.
> Currently we allocate a 64MB aperture if the machine has memory mapped 
> above 2GB (this catches devices like aacraid that can only address 31 
> bits). I could also allocate an aperture of, say, 2MB for smaller 
> systems. There is of course a tension between not wasting memory yet 
> having a big enough aperture that the system will not run out of iommu 
> space and crash even when stressed.

Since Andi's ZONE_DMA32 patches went in last night, how about
foregoing the up-front allocation of bounce buffers and allocating
them as necessary from ZONE_DMA32? once we have that, we should be
able to switch to swiotlb by default.

Cheers,
Muli
-- 
Muli Ben-Yehuda
http://www.mulix.org | http://mulix.livejournal.com/

^ permalink raw reply	[flat|nested] 15+ messages in thread

* RE: USB virt status --- Help please!!!
@ 2005-11-15 22:36 Ian Pratt
  0 siblings, 0 replies; 15+ messages in thread
From: Ian Pratt @ 2005-11-15 22:36 UTC (permalink / raw)
  To: Muli Ben-Yehuda, Keir Fraser
  Cc: harry, xen-devel, sanjay.kushwaha, mark.williamson

> Since Andi's ZONE_DMA32 patches went in last night, how about 
> foregoing the up-front allocation of bounce buffers and 
> allocating them as necessary from ZONE_DMA32? once we have 
> that, we should be able to switch to swiotlb by default.

It's not quite that simple: We need memory that is machine contiguous,
which means we need to call into Xen with xen_alloc_contig.

Xen is unlikely to have any memory to allocate that is machine
contiguous to the current slab, which means we'd have to introduce
swiotlb to the notion of having multiple apertures. This wouldn't be a
bad thing, but would complicate the code.

Having multiple apertures does rather increase the chances that we'd
avoid the fatal problem of dma_alloc_single not being able to return
failure.

Ian

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2005-11-15 22:36 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2005-11-11 23:23 USB virt status Ian Pratt
2005-11-13 18:28 ` USB virt status --- Help please!!! Harry Butterworth
2005-11-14  9:56   ` Keir Fraser
2005-11-14 10:12     ` harry
2005-11-14 10:27       ` harry
2005-11-14 10:33       ` Keir Fraser
2005-11-14 17:30         ` Stephen C. Tweedie
2005-11-14 18:42           ` Keir Fraser
2005-11-14 18:50             ` Stephen C. Tweedie
2005-11-14 20:36               ` Harry Butterworth
2005-11-14 21:00                 ` Muli Ben-Yehuda
2005-11-14 21:46               ` Muli Ben-Yehuda
2005-11-15 10:23                 ` Keir Fraser
2005-11-15 19:14                   ` Muli Ben-Yehuda
  -- strict thread matches above, loose matches on Subject: below --
2005-11-15 22:36 Ian Pratt

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.