* Workaround for USB DMA bugs
@ 2002-04-03 2:43 David Gibson
2002-04-03 8:17 ` Armin
` (2 more replies)
0 siblings, 3 replies; 24+ messages in thread
From: David Gibson @ 2002-04-03 2:43 UTC (permalink / raw)
To: linuxppc-embedded
Well, I'm not sure I'd want to pollute the tree with this sort of
hack, but on the other hand it makes things work that wouldn't
otherwise.
Does anyone think it's a good idea to commit the following patch:
diff -urN /home/dgibson/kernel/linuxppc_2_4_devel/arch/ppc/mm/cachemap.c linux-grinch/arch/ppc/mm/cachemap.c
--- /home/dgibson/kernel/linuxppc_2_4_devel/arch/ppc/mm/cachemap.c Thu Mar 14 13:49:01 2002
+++ linux-grinch/arch/ppc/mm/cachemap.c Wed Apr 3 11:51:30 2002
@@ -137,7 +137,18 @@
case PCI_DMA_NONE:
BUG();
case PCI_DMA_FROMDEVICE: /* invalidate only */
+#ifdef CONFIG_USB
+ /* The USB stack is broken in that it uses DMA buffers
+ which are on the stack and not
+ cacheline aligned. That means cache invalidates
+ before DMA transfers corrupt the stack on machines
+ without DMA-consistent cache. This is a nasty
+ workaround until the USB layer is fixed (apparently
+ done in 2.5). */
+ flush_dcache_range(start, end);
+#else
invalidate_dcache_range(start, end);
+#endif
break;
case PCI_DMA_TODEVICE: /* writeback only */
clean_dcache_range(start, end);
--
David Gibson | For every complex problem there is a
david@gibson.dropbear.id.au | solution which is simple, neat and
| wrong. -- H.L. Mencken
http://www.ozlabs.org/people/dgibson
** Sent via the linuxppc-embedded mail list. See http://lists.linuxppc.org/
* Re: Workaround for USB DMA bugs
2002-04-03 2:43 David Gibson
@ 2002-04-03 8:17 ` Armin
2002-04-03 17:39 ` Frank Rowand
2002-04-03 17:49 ` Dan Malek
2002-04-03 9:45 ` "David Müller (ELSOFT AG)"
2002-04-03 17:42 ` Dan Malek
2 siblings, 2 replies; 24+ messages in thread
From: Armin @ 2002-04-03 8:17 UTC (permalink / raw)
To: David Gibson; +Cc: linuxppc-embedded
David Gibson wrote:
> Well, I'm not sure I'd want to pollute the tree with this sort of
> hack, but on the other hand it makes things work that wouldn't
> otherwise.
>
> Does anyone think it's a good idea to commit the following patch:
>
> diff -urN /home/dgibson/kernel/linuxppc_2_4_devel/arch/ppc/mm/cachemap.c linux-grinch/arch/ppc/mm/cachemap.c
> --- /home/dgibson/kernel/linuxppc_2_4_devel/arch/ppc/mm/cachemap.c Thu Mar 14 13:49:01 2002
> +++ linux-grinch/arch/ppc/mm/cachemap.c Wed Apr 3 11:51:30 2002
> @@ -137,7 +137,18 @@
> case PCI_DMA_NONE:
> BUG();
> case PCI_DMA_FROMDEVICE: /* invalidate only */
> +#ifdef CONFIG_USB
> + /* The USB stack is broken in that it uses DMA buffers
> + on the stack which are on the stack and not
> + cacheline aligned. That means cache invalidates
> + before DMA transfers corrupt the stack on machines
> + without DMA-consistent cache. This is a nasty
> + workaround until the USB layer is fixed (apparently
> + done in 2.5). */
> + flush_dcache_range(start, end);
> +#else
> invalidate_dcache_range(start, end);
> +#endif
> break;
> case PCI_DMA_TODEVICE: /* writeback only */
> clean_dcache_range(start, end);
>
>
> --
> David Gibson | For every complex problem there is a
> david@gibson.dropbear.id.au | solution which is simple, neat and
> | wrong. -- H.L. Mencken
> http://www.ozlabs.org/people/dgibson
>
>
>
>
>
Ah yes, we have done something similar (internally) and had the same
reservations. This will help me with the OCP USB drivers I am working
on. I would like to see some way of reminding us that this was done,
let's say a config option for 4xx_usb_dma_workaround.
-armin
* Re: Workaround for USB DMA bugs
2002-04-03 2:43 David Gibson
2002-04-03 8:17 ` Armin
@ 2002-04-03 9:45 ` "David Müller (ELSOFT AG)"
2002-04-03 23:35 ` David Gibson
2002-04-03 17:42 ` Dan Malek
2 siblings, 1 reply; 24+ messages in thread
From: "David Müller (ELSOFT AG)" @ 2002-04-03 9:45 UTC (permalink / raw)
To: David Gibson; +Cc: linuxppc-embedded
Hello
David Gibson wrote:
> Well, I'm not sure I'd want to pollute the tree with this sort of
> hack, but on the other hand it makes things work that wouldn't
> otherwise.
>
> Does anyone think it's a good idea to commit the following patch:
>
> diff -urN /home/dgibson/kernel/linuxppc_2_4_devel/arch/ppc/mm/cachemap.c linux-grinch/arch/ppc/mm/cachemap.c
> --- /home/dgibson/kernel/linuxppc_2_4_devel/arch/ppc/mm/cachemap.c Thu Mar 14 13:49:01 2002
> +++ linux-grinch/arch/ppc/mm/cachemap.c Wed Apr 3 11:51:30 2002
> @@ -137,7 +137,18 @@
> case PCI_DMA_NONE:
> BUG();
> case PCI_DMA_FROMDEVICE: /* invalidate only */
> +#ifdef CONFIG_USB
> + /* The USB stack is broken in that it uses DMA buffers
> + on the stack which are on the stack and not
> + cacheline aligned. That means cache invalidates
> + before DMA transfers corrupt the stack on machines
> + without DMA-consistent cache. This is a nasty
> + workaround until the USB layer is fixed (apparently
> + done in 2.5). */
> + flush_dcache_range(start, end);
> +#else
> invalidate_dcache_range(start, end);
> +#endif
> break;
> case PCI_DMA_TODEVICE: /* writeback only */
> clean_dcache_range(start, end);
>
>
Cool. This hack fixes a problem I have been seeing with the USB stack on
our boards for quite some time.
What about replacing "#ifdef CONFIG_USB" with
"#if defined(CONFIG_USB) || defined(CONFIG_USB_MODULE)" so that this
workaround also works if the USB stack is built as a module?
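A minimal sketch of the suggested condition, compiled outside the kernel. It assumes the usual kbuild behaviour that a tristate option built as a module defines CONFIG_USB_MODULE rather than CONFIG_USB; the macro USB_DMA_WORKAROUND and the helper function are hypothetical names used only for illustration.

```c
#include <assert.h>

/* Simulate a kernel configuration with USB built as a module:
 * CONFIG_USB is left undefined, CONFIG_USB_MODULE is defined. */
#define CONFIG_USB_MODULE 1

/* Hypothetical helper macro: true when USB is built in or modular. */
#if defined(CONFIG_USB) || defined(CONFIG_USB_MODULE)
#define USB_DMA_WORKAROUND 1
#else
#define USB_DMA_WORKAROUND 0
#endif

/* Hypothetical accessor so the result can be checked at run time. */
static inline int usb_dma_workaround_enabled(void)
{
    return USB_DMA_WORKAROUND;
}
```

With a plain "#ifdef CONFIG_USB" the same configuration would leave the workaround disabled, which is the case the suggestion is meant to cover.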
BTW
Do you know in which version of the 2.5 kernel this buffer
alignment problem was fixed?
Dave
* Re: Workaround for USB DMA bugs
2002-04-03 8:17 ` Armin
@ 2002-04-03 17:39 ` Frank Rowand
2002-04-03 17:49 ` Dan Malek
1 sibling, 0 replies; 24+ messages in thread
From: Frank Rowand @ 2002-04-03 17:39 UTC (permalink / raw)
To: Armin; +Cc: David Gibson, linuxppc-embedded
Armin wrote:
>
> David Gibson wrote:
> > Well, I'm not sure I'd want to pollute the tree with this sort of
> > hack, but on the other hand it makes things work that wouldn't
> > otherwise.
> >
> > Does anyone think it's a good idea to commit the following patch:
> >
> > diff -urN /home/dgibson/kernel/linuxppc_2_4_devel/arch/ppc/mm/cachemap.c linux-grinch/arch/ppc/mm/cachemap.c
> > --- /home/dgibson/kernel/linuxppc_2_4_devel/arch/ppc/mm/cachemap.c Thu Mar 14 13:49:01 2002
> > +++ linux-grinch/arch/ppc/mm/cachemap.c Wed Apr 3 11:51:30 2002
> > @@ -137,7 +137,18 @@
> > case PCI_DMA_NONE:
> > BUG();
> > case PCI_DMA_FROMDEVICE: /* invalidate only */
> > +#ifdef CONFIG_USB
> > + /* The USB stack is broken in that it uses DMA buffers
> > + on the stack which are on the stack and not
> > + cacheline aligned. That means cache invalidates
> > + before DMA transfers corrupt the stack on machines
> > + without DMA-consistent cache. This is a nasty
> > + workaround until the USB layer is fixed (apparently
> > + done in 2.5). */
> > + flush_dcache_range(start, end);
> > +#else
> > invalidate_dcache_range(start, end);
> > +#endif
> > break;
> > case PCI_DMA_TODEVICE: /* writeback only */
> > clean_dcache_range(start, end);
> >
> >
> > --
> > David Gibson | For every complex problem there is a
> > david@gibson.dropbear.id.au | solution which is simple, neat and
> > | wrong. -- H.L. Mencken
> > http://www.ozlabs.org/people/dgibson
> >
> >
> >
> >
> >
>
> Ah yes, we have done something similar (internally) and had the same
> reservations. This will help me with the OCP USB drivers I am working
> on. I would like to see some way of reminding us that this was done,
> let's say a config option for 4xx_usb_dma_workaround.
Not just 4xx processors - this applies to all processors that require
CONFIG_NOT_COHERENT_CACHE.
-Frank
--
Frank Rowand <frank_rowand@mvista.com>
MontaVista Software, Inc
* Re: Workaround for USB DMA bugs
2002-04-03 2:43 David Gibson
2002-04-03 8:17 ` Armin
2002-04-03 9:45 ` "David Müller (ELSOFT AG)"
@ 2002-04-03 17:42 ` Dan Malek
2002-04-03 23:40 ` David Gibson
2 siblings, 1 reply; 24+ messages in thread
From: Dan Malek @ 2002-04-03 17:42 UTC (permalink / raw)
To: David Gibson; +Cc: linuxppc-embedded
David Gibson wrote:
> Well, I'm not sure I'd want to pollute the tree with this sort of
> hack, but on the other hand it makes things work that wouldn't
> otherwise.
This has been floating around for a long time while we were searching
for the proper solution. As you said, if you want things to work, you
have to do this. This still isn't logically correct, it just makes
the window of error smaller.
This is one of several problems in the Linux kernel that have to be
solved for processors that don't have bus snooping. All we have been
doing over time is making the problems less apparent, not solving them.
There are similar hacks for DMA into sk buffers, which still don't
work in forwarding/bridging applications (you won't notice the bugs
in endpoint-only situations).
The real problem to be solved is that, on incoherent processors, any
memory object that can be the target of DMA has to be cache line aligned
and modulo cache line size (a whole multiple of it in length). Since few
people work with this type of processor, it is hard to get maintainers
interested in making these general changes (and polluting the Linux
kernel with knowledge of cache line size).
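The two conditions above can be written as a small check. This is only a user-space sketch: CACHE_LINE_SIZE is an assumed value of 32 bytes (the real size is CPU specific) and dma_buffer_is_line_exclusive() is a hypothetical name, not a kernel function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed cache line size in bytes; the real value depends on the CPU. */
#define CACHE_LINE_SIZE 32u

/* Hypothetical check: true only if [addr, addr + len) starts on a cache
 * line boundary and covers whole cache lines, so that the buffer shares
 * no cache line with unrelated data. */
static inline bool dma_buffer_is_line_exclusive(uintptr_t addr, size_t len)
{
    bool start_aligned = (addr % CACHE_LINE_SIZE) == 0;
    bool whole_lines = len != 0 && (len % CACHE_LINE_SIZE) == 0;

    return start_aligned && whole_lines;
}
```

A 20-byte buffer on the stack fails this check even when it happens to start on a line boundary, because the rest of its last line belongs to other stack data.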
-- Dan
* Re: Workaround for USB DMA bugs
2002-04-03 8:17 ` Armin
2002-04-03 17:39 ` Frank Rowand
@ 2002-04-03 17:49 ` Dan Malek
2002-04-03 20:43 ` David Blythe
1 sibling, 1 reply; 24+ messages in thread
From: Dan Malek @ 2002-04-03 17:49 UTC (permalink / raw)
To: Armin; +Cc: David Gibson, linuxppc-embedded
Armin wrote:
> Ah yes, we have done something similar (internally) and had the same
> reservations. This will help me with the OCP USB drivers I am working
> on. I would like to see some way of reminding us that this was done,
> let's say a config option for 4xx_usb_dma_workaround.
It isn't unique to 4xx. This was originally done for the 8xx USB (actually,
probably for StrongARM and I just copied it over). These functions are
only used for incoherent processors, so there isn't any need to further
define this with another config option. Just leave a comment here, change
the #ifdef CONFIG_USB to #if 1 (it should be done all of the time).
Just remember this is not a solution, and simply makes the problem less
likely to appear. If you happen to perform this operation, and then the
processor happens to touch the cache line before the DMA is complete,
your DMA buffer will still be corrupted. So far, we have just been lucky
the DMA completes before anything else in the cache line is touched by
the processor. This only guarantees the processor will never get incorrect
information when it shares a cache line with a DMA.
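The difference between invalidate and flush, and the remaining window of error, can be illustrated with a toy model of one write-back cache line. This is a simulation under simplifying assumptions (a single 32-byte line, no hardware involved); every name in it is hypothetical.

```c
#include <assert.h>
#include <string.h>

#define LINE 32 /* assumed cache line size in bytes */

/* Toy model: one line of RAM and the CPU's cached copy of it. */
struct toy_line {
    unsigned char ram[LINE];
    unsigned char cache[LINE];
    int valid; /* cache[] holds a copy of the line */
    int dirty; /* cache[] has changes not yet written to ram[] */
};

static inline void cpu_write(struct toy_line *l, int off, unsigned char v)
{
    if (!l->valid) { memcpy(l->cache, l->ram, LINE); l->valid = 1; }
    l->cache[off] = v;
    l->dirty = 1;
}

static inline unsigned char cpu_read(struct toy_line *l, int off)
{
    if (!l->valid) { memcpy(l->cache, l->ram, LINE); l->valid = 1; l->dirty = 0; }
    return l->cache[off];
}

/* Invalidate: drop the cached copy; dirty data is lost. */
static inline void toy_invalidate(struct toy_line *l) { l->valid = 0; l->dirty = 0; }

/* Flush: write dirty data back to RAM, then drop the cached copy. */
static inline void toy_flush(struct toy_line *l)
{
    if (l->valid && l->dirty) memcpy(l->ram, l->cache, LINE);
    l->valid = 0;
    l->dirty = 0;
}

/* The device writes straight to RAM, bypassing the cache. */
static inline void dma_write(struct toy_line *l, int off, unsigned char v) { l->ram[off] = v; }

/* A dirty neighbour (offset 0) shares the line with a DMA buffer
 * (offset 16). Returns what the CPU reads back from the neighbour. */
static inline unsigned char neighbour_after_dma(int use_flush)
{
    struct toy_line l = {0};

    cpu_write(&l, 0, 0xAA);
    if (use_flush) toy_flush(&l); else toy_invalidate(&l);
    dma_write(&l, 16, 0x55);
    return cpu_read(&l, 0);
}

/* The CPU touches the shared line after the flush but before the DMA
 * lands. Returns what the CPU then reads from the DMA buffer. */
static inline unsigned char buffer_after_early_touch(void)
{
    struct toy_line l = {0};

    toy_flush(&l);
    cpu_write(&l, 0, 0xBB);  /* pulls the old line back into the cache */
    dma_write(&l, 16, 0x55); /* device data reaches RAM only */
    return cpu_read(&l, 16); /* served from the stale cached copy */
}
```

In this model the invalidate path loses the neighbour's value, the flush path preserves it, and the early-touch case still returns stale buffer contents, which matches the point that flushing narrows the window without closing it.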
-- Dan
* Re: Workaround for USB DMA bugs
2002-04-03 17:49 ` Dan Malek
@ 2002-04-03 20:43 ` David Blythe
2002-04-03 23:34 ` David Gibson
0 siblings, 1 reply; 24+ messages in thread
From: David Blythe @ 2002-04-03 20:43 UTC (permalink / raw)
To: linuxppc-embedded
What he said! We ended up making all of the changes to move buffers,
align them, etc for both USB and skbuffs, as well as fixing the overly
aggressive prefetches in memcpy that overlapped previously
cache-invalidated memory, etc. The end result is an ugly set of
changes, but at least the board works reliably now. We also found one
nasty problem with an Epson USB printer: after the request buffers were
padded to a cache-line-aligned length and the padded length was passed
to the device, the device did not like the padded length. So we
hacked up the control path a little to get at the original length too.
This sort of stuff is a real pain to fix correctly to deal nicely with
non-coherent systems.
david
Dan Malek wrote:
>
> Armin wrote:
>
>
>
>> Ah yes, we have done something similar (internally) and had the same
>> reservations. This will help me with the OCP USB drivers I am working
>> on. I would like to see some way of reminding us that this was done,
>> let's say a config option for 4xx_usb_dma_workaround.
>
>
> It isn't unique to 4xx. This was originally done for the 8xx USB
> (actually,
> probably for StrongARM and I just copied it over). These functions are
> only used for incoherent processors, so there isn't any need to further
> define this with another config option. Just leave a comment here, change
> the #ifdef CONFIG_USB to #if 1 (it should be done all of the time).
>
> Just remember this is not a solution, and simply makes the problem less
> likely to appear. If you happen to perform this operation, and then the
> processor happens to touch the cache line before the DMA is complete,
> your DMA buffer will still be corrupted. So far, we have just been lucky
> the DMA completes before anything else in the cache line is touched by
> the processor. This only guarantees the processor will never get incorrect
> information when it shares a cache line with a DMA.
>
>
> -- Dan
>
>
>
>
* Re: Workaround for USB DMA bugs
2002-04-03 20:43 ` David Blythe
@ 2002-04-03 23:34 ` David Gibson
0 siblings, 0 replies; 24+ messages in thread
From: David Gibson @ 2002-04-03 23:34 UTC (permalink / raw)
To: linuxppc-embedded
On Wed, Apr 03, 2002 at 12:43:01PM -0800, David Blythe wrote:
>
> What he said! We ended up making all of the changes to move buffers,
> align them, etc for both USB and skbuffs, as well as fixing the overly
> aggressive prefetches in memcpy that overlapped previously
> cache-invalidated memory, etc. The end result is an ugly set of
> changes, but at least the board works reliably now. We also found one
> nasty problem with an Epson USB printer where after request buffers were
> bloated to cache line aligned length and the bloated length passed to
> the device, that the device doesn't like the bloated length. So we
> hacked up the control path a little to get at the original length too.
> This sort of stuff is a real pain to fix correctly to deal nicely with
> non-coherent systems.
That sounds closer to the correct way of fixing this. The USB folks
do consider this a bug, and I heard murmurs that suggested this had
already been fixed in 2.5.
--
David Gibson | For every complex problem there is a
david@gibson.dropbear.id.au | solution which is simple, neat and
| wrong. -- H.L. Mencken
http://www.ozlabs.org/people/dgibson
* Re: Workaround for USB DMA bugs
2002-04-03 9:45 ` "David Müller (ELSOFT AG)"
@ 2002-04-03 23:35 ` David Gibson
0 siblings, 0 replies; 24+ messages in thread
From: David Gibson @ 2002-04-03 23:35 UTC (permalink / raw)
To: linuxppc-embedded
On Wed, Apr 03, 2002 at 11:45:41AM +0200, "David Müller (ELSOFT AG)" wrote:
> Hello
>
> David Gibson wrote:
> >Well, I'm not sure I'd want to pollute the tree with this sort of
> >hack, but on the other hand it makes things work that wouldn't
> >otherwise.
> >
> >Does anyone think it's a good idea to commit the following patch:
> >
> >diff -urN /home/dgibson/kernel/linuxppc_2_4_devel/arch/ppc/mm/cachemap.c linux-grinch/arch/ppc/mm/cachemap.c
> >--- /home/dgibson/kernel/linuxppc_2_4_devel/arch/ppc/mm/cachemap.c Thu Mar 14 13:49:01 2002
> >+++ linux-grinch/arch/ppc/mm/cachemap.c Wed Apr 3 11:51:30 2002
> >@@ -137,7 +137,18 @@
> > case PCI_DMA_NONE:
> > BUG();
> > case PCI_DMA_FROMDEVICE: /* invalidate only */
> >+#ifdef CONFIG_USB
> >+ /* The USB stack is broken in that it uses DMA buffers
> >+ on the stack which are on the stack and not
> >+ cacheline aligned. That means cache invalidates
> >+ before DMA transfers corrupt the stack on machines
> >+ without DMA-consistent cache. This is a nasty
> >+ workaround until the USB layer is fixed (apparently
> >+ done in 2.5). */
> >+ flush_dcache_range(start, end);
> >+#else
> > invalidate_dcache_range(start, end);
> >+#endif
> > break;
> > case PCI_DMA_TODEVICE: /* writeback only */
> > clean_dcache_range(start, end);
>
> Cool. This hack fixes a problem I have been seeing with the USB stack on
> our boards for quite some time.
>
> What about replacing "#ifdef CONFIG_USB" with
> "#if defined(CONFIG_USB) || defined(CONFIG_USB_MODULE)" so that this
> workaround also works if the USB stack is built as a module?
It might be safer / cleaner just to unconditionally enable it, as Dan
Malek suggested.
> BTW
> Do you know in which version of the 2.5 kernel this buffer
> alignment problem was fixed?
No, I just happened to see a message from one of the USB people (Greg
K-H IIRC) saying he thought it was fixed in 2.5 and that they'd like
people to report any remaining non-cache-aligned buffers.
--
David Gibson | For every complex problem there is a
david@gibson.dropbear.id.au | solution which is simple, neat and
| wrong. -- H.L. Mencken
http://www.ozlabs.org/people/dgibson
* Re: Workaround for USB DMA bugs
2002-04-03 17:42 ` Dan Malek
@ 2002-04-03 23:40 ` David Gibson
2002-04-04 2:54 ` Dan Malek
0 siblings, 1 reply; 24+ messages in thread
From: David Gibson @ 2002-04-03 23:40 UTC (permalink / raw)
To: linuxppc-embedded
On Wed, Apr 03, 2002 at 12:42:35PM -0500, Dan Malek wrote:
>
> David Gibson wrote:
>
> >Well, I'm not sure I'd want to pollute the tree with this sort of
> >hack, but on the other hand it makes things work that wouldn't
> >otherwise.
>
> This has been floating around for a long time while we were searching
> for the proper solution. As you said, if you want things to work, you
> have to do this. This still isn't logically correct, it just makes
> the window of error smaller.
I'm well aware of that. In fact I think for most cases in the USB
stack it is safe because the buffers in question are quite a long way
up the stack by the time the DMA is initiated, and the code waits for
the transfer to complete before it pops enough stack frames to touch
this area. Nonetheless it's certainly a hack and a workaround, not a
solution.
> This is one of several problems in the Linux kernel that has to be
> solved for processors that don't have bus snooping. All we have been
> doing over time is making the problems less apparent, not solving them.
> There are similar hacks for DMA into sk buffers, which still doesn't
> work in forwarding/bridging applications (you won't notice the bugs
> in endpoint only situations).
Indeed, and the USB people realise that. Well, amongst other things.
DMA buffers on the stack have other problems as well (I believe the
stack isn't guaranteed physically contiguous on all architectures).
> The real problem to be solved is any memory object that can be the target
> for DMA has to be cache line aligned, and modulo cache line size for
> incoherent processors. Since few people work with this type of processor,
> it is hard to get maintainers interested in making these general changes
> (and polluting the Linux kernel with knowledge of cache line size).
It's becoming easier though, with the __cacheline_aligned macro. Plus
I think if the buffer is individually kmalloc()ed it will be properly
aligned.
--
David Gibson | For every complex problem there is a
david@gibson.dropbear.id.au | solution which is simple, neat and
| wrong. -- H.L. Mencken
http://www.ozlabs.org/people/dgibson
* Re: Workaround for USB DMA bugs
2002-04-03 23:40 ` David Gibson
@ 2002-04-04 2:54 ` Dan Malek
2002-04-04 3:48 ` David Gibson
2002-04-04 12:35 ` Brad Parker
0 siblings, 2 replies; 24+ messages in thread
From: Dan Malek @ 2002-04-04 2:54 UTC (permalink / raw)
To: David Gibson; +Cc: linuxppc-embedded
David Gibson wrote:
> It's becoming easier though, with the __cacheline_aligned macro. Plus
> I think if the buffer is individually kmalloc()ed it will be properly
> aligned.
Alignment is only half the problem. You have to ensure the object is
modulo cache line size as well. You can't be sharing cache lines subject
to DMA with other DMA or processor core data.
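One way to satisfy both halves in user-space C is to round the length up to whole cache lines and allocate on a line boundary. The sketch below assumes a 32-byte line and uses C11 aligned_alloc(); struct dma_buf and its helpers are hypothetical names, and a kernel driver would use its own allocator instead. Keeping the caller's original length next to the padded one follows the printer problem reported earlier in the thread, where the device objected to being given the padded length.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define CACHE_LINE_SIZE 32u /* assumed; the real value is CPU specific */

/* Hypothetical descriptor for a buffer that owns its cache lines. */
struct dma_buf {
    void  *data;       /* starts on a cache line boundary */
    size_t len;        /* length requested by the caller */
    size_t padded_len; /* len rounded up to whole cache lines */
};

/* Round len up to the next multiple of the cache line size. */
static inline size_t round_up_to_line(size_t len)
{
    return (len + CACHE_LINE_SIZE - 1) & ~(size_t)(CACHE_LINE_SIZE - 1);
}

/* Allocate a buffer that is aligned at both ends. Returns 0 on success
 * and -1 on failure. The caller releases b->data with free(). */
static inline int dma_buf_alloc(struct dma_buf *b, size_t len)
{
    if (len == 0)
        return -1;
    b->len = len;
    b->padded_len = round_up_to_line(len);
    /* aligned_alloc() requires the size to be a multiple of the alignment. */
    b->data = aligned_alloc(CACHE_LINE_SIZE, b->padded_len);
    return b->data ? 0 : -1;
}
```

The cost is the memory wasted in the padding, which is the trade-off raised against making every allocation behave this way.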
-- Dan
* Re: Workaround for USB DMA bugs
2002-04-04 2:54 ` Dan Malek
@ 2002-04-04 3:48 ` David Gibson
2002-04-04 4:09 ` Dan Malek
2002-04-04 12:35 ` Brad Parker
1 sibling, 1 reply; 24+ messages in thread
From: David Gibson @ 2002-04-04 3:48 UTC (permalink / raw)
To: linuxppc-embedded
On Wed, Apr 03, 2002 at 09:54:29PM -0500, Dan Malek wrote:
>
> David Gibson wrote:
>
>
> >It's becoming easier though, with the __cacheline_aligned macro. Plus
> >I think if the buffer is individually kmalloc()ed it will be properly
> >aligned.
>
> Alignment is only half the problem. You have to ensure the object is
> modulo cache line size as well. You can't be sharing cache lines subject
> to DMA with other DMA or processor core data.
Oh, right, yes. That was actually what I was meaning when I said
"aligned" (sort of aligned at both ends), forgetting that the normal
meaning only applies to the start address.
--
David Gibson | For every complex problem there is a
david@gibson.dropbear.id.au | solution which is simple, neat and
| wrong. -- H.L. Mencken
http://www.ozlabs.org/people/dgibson
* Re: Workaround for USB DMA bugs
2002-04-04 3:48 ` David Gibson
@ 2002-04-04 4:09 ` Dan Malek
2002-04-04 5:23 ` David Gibson
0 siblings, 1 reply; 24+ messages in thread
From: Dan Malek @ 2002-04-04 4:09 UTC (permalink / raw)
To: David Gibson; +Cc: linuxppc-embedded
David Gibson wrote:
> Oh, right, yes. That was actually what I was meaning when I said
> "aligned" (sort of aligned at both ends), forgetting that the normal
> meaning only applies to the start address.
I guess if we can ensure a DMA-only pool, simple alignment will work.
You just can't kmalloc() because you don't know what may follow something
that isn't modulo cache line size. You also have to be careful of code
that allocates a large object, then uses part of it for DMA and other
parts for processor core data. I'm not sure we want to require kmalloc()
to always cache align due to the potential for wasted memory space.
Thanks.
-- Dan
* Re: Workaround for USB DMA bugs
@ 2002-04-04 4:19 Jeremy Rosen
2002-04-04 6:16 ` Dan Malek
0 siblings, 1 reply; 24+ messages in thread
From: Jeremy Rosen @ 2002-04-04 4:19 UTC (permalink / raw)
To: linuxppc-embedded
>I guess if we can ensure a DMA-only pool, simple alignment will work.
>You just can't kmalloc() because you don't know what may follow something
>that isn't modulo cache line size. You also have to be careful of code
>that allocates a large object, then uses part of it for DMA and other
>parts for processor core data. I'm not sure we want to require kmalloc()
>to always cache align due to the potential for wasted memory space.
Can't we use the __GFP_DMA flag to specify that the area must have
alignment enforced?
Or did I misunderstand the use of this flag?
* Re: Workaround for USB DMA bugs
2002-04-04 4:09 ` Dan Malek
@ 2002-04-04 5:23 ` David Gibson
2002-04-04 6:08 ` Dan Malek
0 siblings, 1 reply; 24+ messages in thread
From: David Gibson @ 2002-04-04 5:23 UTC (permalink / raw)
To: linuxppc-embedded
On Wed, Apr 03, 2002 at 11:09:04PM -0500, Dan Malek wrote:
>
> David Gibson wrote:
>
> >Oh, right, yes. That was actually what I was meaning when I said
> >"aligned" (sort of aligned at both ends), forgetting that the normal
> >meaning only applies to the start address.
>
> I guess if we can ensure a DMA-only pool, simple alignment will work.
> You just can't kmalloc() because you don't know what may follow something
> that isn't modulo cache line size. You also have to be careful of code
> that allocates a large object, then uses part of it for DMA and other
> parts for processor core data. I'm not sure we want to require kmalloc()
> to always cache align due to the potential for wasted memory space.
kmalloc() already guarantees cacheline alignment (at the beginning).
Which means it is safe as long as the allocation is a multiple of the
cacheline size (which comes for free on allocations of a largish power
of 2, a common case).
Possibly a better approach would be to make the consistent_sync()
function be more careful and flush rather than invalidate cachelines
which are only partially covered by the region given. At the moment
it invalidates everything and hence is only safe for regions which are
cacheline aligned, and of size a multiple of cacheline size.
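A sketch of how such a more careful sync could split a region, assuming a 32-byte cache line. It only computes the plan: which edge lines are partially covered and would need a flush (flush_dcache_range() in the patch) and which span of fully covered lines could be invalidated (invalidate_dcache_range()). struct sync_plan and plan_fromdevice_sync() are hypothetical names, not existing kernel interfaces.

```c
#include <assert.h>
#include <stdint.h>

#define CACHE_LINE_SIZE 32u /* assumed; the real value is CPU specific */

/* Hypothetical result of splitting [start, end) by cache line coverage. */
struct sync_plan {
    int flush_first;       /* the line containing start is only partly covered */
    int flush_last;        /* the line containing end - 1 is only partly covered */
    uintptr_t inval_start; /* fully covered lines: [inval_start, inval_end) */
    uintptr_t inval_end;
};

static inline struct sync_plan plan_fromdevice_sync(uintptr_t start, uintptr_t end)
{
    const uintptr_t mask = CACHE_LINE_SIZE - 1;
    struct sync_plan p;

    assert(start < end);
    p.flush_first = (start & mask) != 0;
    p.flush_last = (end & mask) != 0;
    p.inval_start = (start + mask) & ~mask; /* round start up to a line */
    p.inval_end = end & ~mask;              /* round end down to a line */
    if (p.inval_end < p.inval_start)        /* region lies inside one line */
        p.inval_end = p.inval_start;
    return p;
}
```

For a region that is already aligned at both ends the plan degenerates to a plain invalidate, so well-behaved callers would pay no extra write-back cost. When the region sits inside a single line both flags refer to the same line, and flushing it twice is harmless.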
--
David Gibson | For every complex problem there is a
david@gibson.dropbear.id.au | solution which is simple, neat and
| wrong. -- H.L. Mencken
http://www.ozlabs.org/people/dgibson
* Re: Workaround for USB DMA bugs
2002-04-04 5:23 ` David Gibson
@ 2002-04-04 6:08 ` Dan Malek
2002-04-04 6:30 ` David Gibson
0 siblings, 1 reply; 24+ messages in thread
From: Dan Malek @ 2002-04-04 6:08 UTC (permalink / raw)
To: David Gibson; +Cc: linuxppc-embedded
David Gibson wrote:
> Possibly a better approach would be to make the consistent_sync()
> function be more careful and flush rather than invalidate cachelines
That's basically what this patch does, with the overhead of always
writing instead of just invalidating. We just went the whole circle
here, as this isn't a logically correct solution :-).
-- Dan
* Re: Workaround for USB DMA bugs
2002-04-04 4:19 Workaround for USB DMA bugs Jeremy Rosen
@ 2002-04-04 6:16 ` Dan Malek
2002-04-04 6:32 ` David Gibson
0 siblings, 1 reply; 24+ messages in thread
From: Dan Malek @ 2002-04-04 6:16 UTC (permalink / raw)
To: Jeremy Rosen; +Cc: linuxppc-embedded
Jeremy Rosen wrote:
> can't we use the __GFP_DMA flag to specify that the area must have
> alignment enforced ?
Maybe, but that isn't its intended use.
> or did I misunderstand the use of this flag ?
This flag is used on architectures that have memory only accessible
from the processor, not from a DMA device, to ensure a DMA device
gets a memory bank that it can access.
-- Dan
* Re: Workaround for USB DMA bugs
2002-04-04 6:08 ` Dan Malek
@ 2002-04-04 6:30 ` David Gibson
2002-04-04 20:21 ` David Blythe
0 siblings, 1 reply; 24+ messages in thread
From: David Gibson @ 2002-04-04 6:30 UTC (permalink / raw)
To: linuxppc-embedded
On Thu, Apr 04, 2002 at 01:08:41AM -0500, Dan Malek wrote:
>
> David Gibson wrote:
>
> >Possibly a better approach would be to make the consistent_sync()
> >function be more careful and flush rather than invalidate cachelines
>
> That's basically what this patch does, with the overhead of always
> writing instead of just invalidating. We just went the whole circle
> here, as this isn't a logically correct solution :-).
No, it's not, but:
a) It reduces the impact of DMA bugs - this way code which fails to
properly align its DMA buffers runs the risk of corrupting its DMA
transfers, but it won't corrupt other random data structures. In
particular DMA buffers on the stack won't cause evil stack corruption
which is a complete PITA to debug (I spent a full day of extreme
frustration tracking this one down).
b) It means that consistent_sync() itself has a safe interface - it
won't damage data outside the region specified. Which means it can be
used by code which does use non-aligned DMA buffers safely -
i.e. where it knows the surrounding memory won't be touched near the
transfer itself (between the consistent_sync() and the completion of
DMA). That could have an application in a driver which allocated a
bunch of DMA buffers each with some attached bookkeeping information.
--
David Gibson | For every complex problem there is a
david@gibson.dropbear.id.au | solution which is simple, neat and
| wrong. -- H.L. Mencken
http://www.ozlabs.org/people/dgibson
* Re: Workaround for USB DMA bugs
2002-04-04 6:16 ` Dan Malek
@ 2002-04-04 6:32 ` David Gibson
0 siblings, 0 replies; 24+ messages in thread
From: David Gibson @ 2002-04-04 6:32 UTC (permalink / raw)
To: linuxppc-embedded
On Thu, Apr 04, 2002 at 01:16:46AM -0500, Dan Malek wrote:
>
> Jeremy Rosen wrote:
>
>
> >can't we use the __GFP_DMA flag to specify that the area must have
> >alignment enforced ?
>
> Maybe, but that isn't its intended use.
>
> >or did I misunderstand the use of this flag ?
>
> This flag is used on architectures that have memory only accessible
> from the processor, not from a DMA device, to ensure a DMA device
> gets a memory bank that it can access.
More specifically it's used on x86 for old ISA devices.
--
David Gibson | For every complex problem there is a
david@gibson.dropbear.id.au | solution which is simple, neat and
| wrong. -- H.L. Mencken
http://www.ozlabs.org/people/dgibson
* Re: Workaround for USB DMA bugs
@ 2002-04-04 6:40 Rosen Jeremy
2002-04-04 6:52 ` David Gibson
0 siblings, 1 reply; 24+ messages in thread
From: Rosen Jeremy @ 2002-04-04 6:40 UTC (permalink / raw)
To: linuxppc-embedded
>
> This flag is used on architectures that have memory only accessible
> from the processor, not from a DMA device, to ensure a DMA device
> gets a memory bank that it can access.
So it specifies that the kmalloc()ed region must be usable for DMA
transfer... which is exactly what we want.
Except that for us it has an impact on boundaries instead of location.
TIA
Jeremy
* Re: Workaround for USB DMA bugs
2002-04-04 6:40 Rosen Jeremy
@ 2002-04-04 6:52 ` David Gibson
0 siblings, 0 replies; 24+ messages in thread
From: David Gibson @ 2002-04-04 6:52 UTC (permalink / raw)
To: linuxppc-embedded
On Thu, Apr 04, 2002 at 04:40:00PM +1000, Rosen Jeremy wrote:
>
> >
> > This flag is used on architectures that have memory only accessible
> > from the processor, not from a DMA device, to ensure a DMA device
> > gets a memory bank that it can access.
>
> so it specifies that the kmalloc()ed region must be usable for DMA
> transfer.... which is exactly what we want.
Not really - IIRC it really is specific to ISA DMA. PCI devices on
i386 won't generally need it and probably won't specify it. Nor would
we want to, since that would put everything in the ludicrously small
ISA DMA region (only 16MB AFAIR) which could rapidly be exhausted.
--
David Gibson | For every complex problem there is a
david@gibson.dropbear.id.au | solution which is simple, neat and
| wrong. -- H.L. Mencken
http://www.ozlabs.org/people/dgibson
* Re: Workaround for USB DMA bugs
2002-04-04 2:54 ` Dan Malek
2002-04-04 3:48 ` David Gibson
@ 2002-04-04 12:35 ` Brad Parker
2002-04-04 14:12 ` Dan Malek
1 sibling, 1 reply; 24+ messages in thread
From: Brad Parker @ 2002-04-04 12:35 UTC (permalink / raw)
To: Dan Malek; +Cc: David Gibson, linuxppc-embedded
Dan Malek wrote:
>
>David Gibson wrote:
>
>
>> It's becoming easier though, with the __cacheline_aligned macro. Plus
>> I think if the buffer is individually kmalloc()ed it will be properly
>> aligned.
>
>Alignment is only half the problem. You have to ensure the object is
>modulo cache line size as well. You can't be sharing cache lines subject
>to DMA with other DMA or processor core data.
Maybe I need to go drink some coffee, but I'm confused by this thread.
The real problem, as I recall, was that the cache invalidate routine
in the 405 port didn't respect the addresses handed to it and instead
invalidated the entire cache line, even if it went beyond the address
range provided.
This behavior was different from other ports, specifically the ARM
port.
I don't think the patch was needed on the ARM. It was just needed on
the 405.
Fixing the usb code is a good idea, but it would just hide the real
problem.
The original problem was that the USB code was doing DMA off the
stack. I don't recommend that, but I can't find any technical reason
why it should not work or not be allowed. (And listen, if you argue
that it's philosophically bad, I agree - note I said *technical*
reason.)
When the ohci driver did a cache invalidate on the buffer - in
preparation for dma into the buffer - it destroyed part of the stack,
the part which had not yet been flushed.
On other ports, like the ARM, the cache invalidate routine 'notices'
if the address range is not cache line aligned and flushes the parts
which are not.
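A minimal sketch of the behaviour described above, not the actual ARM or 405 code. The 32-byte line size, the helper names (flush_line, invalidate_line, careful_invalidate_range) and the counters are all assumptions made for illustration; the per-line operations are stubs that only count calls so the logic can be checked in user space.

```c
#include <stdint.h>

#define CACHE_LINE 32u  /* assumed line size, adjust per CPU */

/* Hypothetical stand-ins for the real per-line cache operations.
 * Here they only count how often each one would be issued. */
static unsigned flushed_lines, invalidated_lines;
static void flush_line(uintptr_t addr)      { (void)addr; flushed_lines++; }
static void invalidate_line(uintptr_t addr) { (void)addr; invalidated_lines++; }

/* Invalidate [start, end). Any cache line that the range covers only
 * partly is written back and invalidated (flushed) instead, so dirty
 * data belonging to a neighbour in the same line reaches memory. */
static void careful_invalidate_range(uintptr_t start, uintptr_t end)
{
    uintptr_t mask  = CACHE_LINE - 1;
    uintptr_t first = start & ~mask;          /* line holding start */
    uintptr_t last  = (end + mask) & ~mask;   /* one past the last line */
    uintptr_t line;

    for (line = first; line < last; line += CACHE_LINE) {
        int partial = (line < start) || (line + CACHE_LINE > end);

        if (partial)
            flush_line(line);
        else
            invalidate_line(line);
    }
}
```

For a range such as 0x1008 to 0x1048 this flushes the two edge lines and invalidates only the one line that is fully covered. It narrows the damage from an unaligned buffer; it does not make sharing a line with other data safe.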
I argued long ago (to parties which are present but probably sleeping
right now :-)) that this should be changed.
I'll make this argument again, now.
Also, I think the comment in the code is incorrect. I think the comment
should say that the cache_invalidate routine is broken on some ports.
off to make some coffee...
-brad
* Re: Workaround for USB DMA bugs
2002-04-04 12:35 ` Brad Parker
@ 2002-04-04 14:12 ` Dan Malek
0 siblings, 0 replies; 24+ messages in thread
From: Dan Malek @ 2002-04-04 14:12 UTC (permalink / raw)
To: Brad Parker; +Cc: David Gibson, linuxppc-embedded
Brad Parker wrote:
> Maybe I need to go drink some coffee, but I'm confused by this thread.
I'll try to explain it in more detail.
> The real problem, as I recall, was that the cache invalidate routine
> in the 405 port didn't respect the addresses handed to it and instead
> invalidated the entire cache line, even if it went beyond the address
> range provided.
You can't invalidate, flush, manage anything less than a cache line.
The problem is the same on all processors, it doesn't matter how the
DMA object is allocated either. Suppose you have a processor with
something larger than an 8 byte cache line. Suppose you have some DMA buffer
(kmalloc'ed or just statically declared), that is only 8 bytes for example,
and it starts at memory location 0x1000. Suppose the processor (or
another DMA) had some data that starts at location 0x1008. The logically
correct thing to do for an incoming DMA is to first invalidate the
cache line, then do the DMA, then when the processor accesses the
DMA buffer it will re-load the cache line. In this example, if the
processor has written some variable at location 0x1008 and that write
is still sitting in the cache, the invalidate will also invalidate
this processor data, which is wrong. The next access by the processor
to this variable will read some old piece of data from memory, not
the last thing it wrote.
So, the solution was to first flush this line, then invalidate it.
Well, that works better but still isn't correct. If the processor
accesses the memory at location 0x1008 before the DMA is complete,
then the cache line is filled with the old stale DMA buffer, and
when the processor accesses the DMA buffer you don't get the
data that is the result of the DMA.
The only reason this patch appears to work is because we are lucky,
and the processor doesn't access anything in the cache line prior
to the DMA completion.
It's the same for all incoherent processors. Memory is accessed
in cache line chunks, you can't invalidate, flush or otherwise
manage part of a cache line. The stack is a weird place to put
a DMA buffer, but that has nothing to do with this problem.
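The two failure cases above can be reproduced with a toy model. This is only a sketch under stated assumptions: a single 32-byte write-back cache line over one line of memory, with hypothetical names (toy_cache, cpu_write, dma_write and so on) that do not correspond to any kernel interface. Offset 0 stands for the DMA buffer at 0x1000 and offset 8 for the processor's variable at 0x1008.

```c
#include <stdint.h>
#include <string.h>

#define LINE 32u  /* assumed cache line size */

/* Toy model: one line of memory and one write-back cache line over it. */
struct toy_cache {
    uint8_t mem[LINE];   /* what a DMA device reads and writes */
    uint8_t line[LINE];  /* what the CPU sees while the line is valid */
    int valid, dirty;
};

static void cpu_fill(struct toy_cache *c)
{
    if (!c->valid) {
        memcpy(c->line, c->mem, LINE);
        c->valid = 1;
        c->dirty = 0;
    }
}

static void cpu_write(struct toy_cache *c, unsigned off, uint8_t v)
{
    cpu_fill(c);
    c->line[off] = v;
    c->dirty = 1;
}

static uint8_t cpu_read(struct toy_cache *c, unsigned off)
{
    cpu_fill(c);
    return c->line[off];
}

/* Drop the line without writing it back. */
static void invalidate(struct toy_cache *c) { c->valid = 0; c->dirty = 0; }

/* Write the line back if dirty, then drop it. */
static void flush(struct toy_cache *c)
{
    if (c->valid && c->dirty)
        memcpy(c->mem, c->line, LINE);
    c->valid = 0;
    c->dirty = 0;
}

static void dma_write(struct toy_cache *c, unsigned off, uint8_t v)
{
    c->mem[off] = v;  /* bypasses the cache entirely */
}

/* Case 1: invalidate only. The CPU's write at offset 8 is lost. */
static uint8_t lost_write_demo(void)
{
    struct toy_cache c = {0};

    cpu_write(&c, 8, 0xAA);   /* neighbour variable, dirty in cache */
    invalidate(&c);           /* prepare the DMA buffer at offset 0 */
    dma_write(&c, 0, 0x55);
    return cpu_read(&c, 8);   /* reads old memory contents */
}

/* Case 2: flush first, but the CPU touches the neighbour before the
 * DMA completes, so the line is refilled with the stale buffer. */
static uint8_t stale_buffer_demo(void)
{
    struct toy_cache c = {0};

    cpu_write(&c, 8, 0xAA);
    flush(&c);
    (void)cpu_read(&c, 8);    /* refills the line too early */
    dma_write(&c, 0, 0x55);
    return cpu_read(&c, 0);   /* served from the stale cached line */
}
```

In this model both functions return the old zero value rather than 0xAA or 0x55, which is the behaviour described above: neither invalidate nor flush helps once a DMA buffer shares a line with other live data.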
-- Dan
* Re: Workaround for USB DMA bugs
2002-04-04 6:30 ` David Gibson
@ 2002-04-04 20:21 ` David Blythe
0 siblings, 0 replies; 24+ messages in thread
From: David Blythe @ 2002-04-04 20:21 UTC (permalink / raw)
To: linuxppc-embedded
What we ended up doing was adding a debug wrapper around the invalidate
routine that squealed whenever a non-aligned invalidation was attempted.
This was/is a useful tool for finding potential problems - but can
create lots of false positives. Unfortunately, for something like
this to work you need a consistent philosophy for handling DMA-able
buffers: when a buffer is aligned and padded out, the aligned address
and padded length are what must be passed to invalidation operations.
This gets into the problem of sometimes wanting to know the true
length and the requested length and maybe not having both available.
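Roughly, such a wrapper could look like the sketch below. This is not the code we used: the 32-byte line size, the names debug_invalidate_range and real_invalidate_range, the caller string and the report counter are all assumptions for illustration, and the real invalidate is replaced by a stub so the check can be exercised in user space.

```c
#include <stdint.h>
#include <stdio.h>

#define CACHE_LINE 32u  /* assumed line size */

static unsigned unaligned_reports;  /* how many times the wrapper squealed */

/* Hypothetical stand-in for the platform's real invalidate routine. */
static void real_invalidate_range(uintptr_t start, uintptr_t end)
{
    (void)start;
    (void)end;
}

/* Complain when either end of [start, end) is not on a cache line
 * boundary, then carry on with the invalidate as usual. */
static void debug_invalidate_range(uintptr_t start, uintptr_t end,
                                   const char *caller)
{
    if ((start | end) & (CACHE_LINE - 1)) {
        unaligned_reports++;
        fprintf(stderr, "%s: unaligned invalidate %#lx-%#lx\n",
                caller, (unsigned long)start, (unsigned long)end);
    }
    real_invalidate_range(start, end);
}
```

A caller that passes the true length of a padded buffer instead of the padded length will trip this check even though nothing else shares the line, which is one source of the false positives.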
It would be nice if there was more structure to make drivers and
other subsystems do things right, mostly because things like
inadvertent line sharing are a real pain to track down.
david
David Gibson wrote:
> On Thu, Apr 04, 2002 at 01:08:41AM -0500, Dan Malek wrote:
>
>>David Gibson wrote:
>>
>>
>>>Possibly a better approach would be to make the consistent_sync()
>>>function be more careful and flush rather than invalidate cachelines
>>>
>>That's basically what this patch does, with the overhead of always
>>writing instead of just invalidating. We just went the whole circle
>>here, as this isn't a logically correct solution :-).
>>
Thread overview: 24+ messages
2002-04-04 4:19 Workaround for USB DMA bugs Jeremy Rosen
2002-04-04 6:16 ` Dan Malek
2002-04-04 6:32 ` David Gibson
-- strict thread matches above, loose matches on Subject: below --
2002-04-04 6:40 Rosen Jeremy
2002-04-04 6:52 ` David Gibson
2002-04-03 2:43 David Gibson
2002-04-03 8:17 ` Armin
2002-04-03 17:39 ` Frank Rowand
2002-04-03 17:49 ` Dan Malek
2002-04-03 20:43 ` David Blythe
2002-04-03 23:34 ` David Gibson
2002-04-03 9:45 ` "David Müller (ELSOFT AG)"
2002-04-03 23:35 ` David Gibson
2002-04-03 17:42 ` Dan Malek
2002-04-03 23:40 ` David Gibson
2002-04-04 2:54 ` Dan Malek
2002-04-04 3:48 ` David Gibson
2002-04-04 4:09 ` Dan Malek
2002-04-04 5:23 ` David Gibson
2002-04-04 6:08 ` Dan Malek
2002-04-04 6:30 ` David Gibson
2002-04-04 20:21 ` David Blythe
2002-04-04 12:35 ` Brad Parker
2002-04-04 14:12 ` Dan Malek