* ALSA vs. non coherent DMA
@ 2008-05-06 0:08 Benjamin Herrenschmidt
2008-05-06 11:01 ` Takashi Iwai
0 siblings, 1 reply; 6+ messages in thread
From: Benjamin Herrenschmidt @ 2008-05-06 0:08 UTC (permalink / raw)
To: Takashi Iwai; +Cc: linuxppc-dev list, alsa-devel, Linux Kernel list
Hi Takashi !
I'm bringing up an old thread as I'm just discovering that the problem
still hasn't been fixed.
There seem to be a few issues with ALSA current usage of mmap vs. non
cache coherent architecture, such as embedded PowerPC's.
I can see at least two with a quick look to pcm-native.c, one I don't
understand and one I think I do:
- The control/status mapping. Can you elaborate a bit on what this is
actually doing and why it shouldn't be done on "non coherent"
architectures ? Currently this -is- done on all powerpc's, whether they
are coherent or not and I want to understand what the underlying issue
is.
- The mmap of DMA pages. Here, the problem appears two fold:
* Use of virt_to_page() on virtual addresses returned by
dma_alloc_coherent().
* Not using the appropriate page protection for a DMA coherent mapping
to userspace.
It seems like you have solved that in part with implementing a generic
dma_mmap_coherent() in the past that for some reason you never merged
upstream (I can track that to about 2 years ago). Is there a reason ?
I think we need to at least apply a band-aid today as it's becoming a
nasty issue for several non-coherent powerpc platforms. It could be in
the form of implementing dma_mmap_coherent() and changing Alsa to use it
with the appropriate ifdef, or just adding an ifdef CONFIG_PPC with the
right code in there for now until a better solution is found.
It should be trivial though. Getting the PFN from the DMA address is
easy if we have the dma handle and the virtual address, though that -is-
definitely platform specific. I can implement a function for that if you
need. As for the pgprot, we can come up with something like
pgprot_mmap_dma(). Either that or I can fold it all in a powerpc wide
implementation of a dma_mmap_coherent() like we envisioned initially.
Let me know what approach is preferred here and I'll come up with
patches ASAP. As far as I'm concerned, this is a bug and thus must be
fixed now for .26 (and possibly even backported to stable if we can come
up with a non-invasive solution). I'm annoyed because it represents a
trivial amount of code; this problem should have been fixed a long time
ago.
Cheers,
Ben.
* Re: ALSA vs. non coherent DMA
2008-05-06 0:08 ALSA vs. non coherent DMA Benjamin Herrenschmidt
@ 2008-05-06 11:01 ` Takashi Iwai
2008-05-07 14:22 ` Timur Tabi
0 siblings, 1 reply; 6+ messages in thread
From: Takashi Iwai @ 2008-05-06 11:01 UTC (permalink / raw)
To: benh; +Cc: linuxppc-dev list, alsa-devel, Linux Kernel list
Hi Ben,
thanks for signaling this long-standing issue again.
At Tue, 06 May 2008 10:08:28 +1000,
Benjamin Herrenschmidt wrote:
>
> Hi Takashi !
>
> I'm bringing up an old thread as I'm just discovering that the problem
> still hasn't been fixed.
>
> There seem to be a few issues with ALSA current usage of mmap vs. non
> cache coherent architecture, such as embedded PowerPC's.
Yep. And on MIPS, obviously.
> I can see at least two with a quick look to pcm-native.c, one I don't
> understand and one I think I do:
>
> - The control/status mapping. Can you elaborate a bit on what this is
> actually doing and why it shouldn't be done on "non coherent"
> architectures ?
This is a mmap of the data record to be shared in realtime with apps.
The app updates its data pointer (appl_ptr) on the mmapped buffer
while the driver updates the data (e.g. DMA position, called hwptr) on
the fly on the mmapped record. Due to its real-time nature, it has to
be coherent -- at least, it was a problem on ARM.
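To make the shared record concrete, here is a minimal userspace sketch of the idea, not the actual ALSA structures. The struct names, `playback_avail()`, and the `boundary` parameter are assumptions made for illustration: the driver side advances `hw_ptr`, the application advances `appl_ptr`, and each side reads the other's pointer to work out how much room is left in the ring buffer. If either side reads a stale copy of the other's pointer, the result is simply wrong, which is why the mapping has to be kept consistent between the kernel and user views.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the mmapped records. */
struct pcm_status_rec  { volatile uint64_t hw_ptr;   };  /* written by the driver */
struct pcm_control_rec { volatile uint64_t appl_ptr; };  /* written by the app    */

/*
 * Frames the application may still write into a playback ring buffer.
 * Both pointers are assumed to be free-running and to wrap at `boundary`,
 * which is assumed to be a multiple of `buffer_size`.
 */
static uint64_t playback_avail(const struct pcm_status_rec *st,
                               const struct pcm_control_rec *ctl,
                               uint64_t buffer_size, uint64_t boundary)
{
    uint64_t hw, appl, filled;

    assert(buffer_size > 0 && boundary % buffer_size == 0);

    hw = st->hw_ptr;
    appl = ctl->appl_ptr;

    /* Frames queued but not yet played, modulo the wrap boundary. */
    filled = (appl >= hw) ? appl - hw : appl + boundary - hw;
    assert(filled <= buffer_size);  /* this sketch does not model xruns */

    return buffer_size - filled;
}
```

With `hw_ptr` at 100 and `appl_ptr` at 356 in a 1024-frame buffer, 256 frames are queued and 768 remain writable. A reader that still sees an old `hw_ptr` would report less free space than there really is.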
> Currently this -is- done on all powerpc's, whether they
> are coherent or not and I want to understand what the underlying issue
> is.
It's actually buggy. Should check more precisely.
> - The mmap of DMA pages. Here, the problem appears two fold:
>
> * Use of virt_to_page() on virtual addresses returned by
> dma_alloc_coherent().
>
> * Not using the appropriate page protection for a DMA coherent mapping
> to userspace.
>
> It seems like you have solved that in part with implementing a generic
> dma_mmap_coherent() in the past that for some reason you never merged
> upstream (I can track that to about 2 years ago). Is there a reason ?
IIRC, dma_mmap_coherent() cannot be implemented properly on some
architectures. This is no big problem for ALSA as long as it returns
an error or is compiled out via ifdef. But the fact that this API cannot
be implemented for all archs discouraged arch maintainers, and the idea
faded out again.
> I think we need to at least apply a band-aid today as it's becoming a
> nasty issue for several non-coherent powerpc platforms. It could be in
> the form of implementing dma_mmap_coherent() and changing Alsa to use it
> with the appropriate ifdef, or just adding an ifdef CONFIG_PPC with the
> right code in there for now until a better solution is found.
Agreed.
> It should be trivial though. Getting the PFN from the DMA address is
> easy if we have the dma handle and the virtual address, though that -is-
> definitely platform specific. I can implement a function for that if you
> need.
That'll be great. dma_mmap_coherent() and friends would be then
really helpful to solve this issue.
> As for the pgprot, we can come up with something like
> pgprot_mmap_dma(). Either that or I can fold it all in a powerpc wide
> implementation of a dma_mmap_coherent() like we envisioned initially.
In principle, pgprot_*() isn't actually needed on the driver side at
all. We use pgprot_noncached() in one place, as a hacky way to
mmap the ioremapped pages. It's not available on all architectures,
and I'm not sure whether it works on all PPC models although it's
enabled right now: in include/sound/pcm.h,
/* mmap for io-memory area */
#if defined(CONFIG_X86) || defined(CONFIG_PPC) || defined(CONFIG_ALPHA)
#define SNDRV_PCM_INFO_MMAP_IOMEM SNDRV_PCM_INFO_MMAP
int snd_pcm_lib_mmap_iomem(struct snd_pcm_substream *substream, struct vm_area_struct *area);
#else
#define SNDRV_PCM_INFO_MMAP_IOMEM 0
#define snd_pcm_lib_mmap_iomem NULL
#endif
We most likely need to fix this, too. The easiest way is to disable
this except on X86...
> Let me know what approach is preferred here and I'll come up with
> patches ASAP. As far as I'm concerned, this is a bug and thus must be
> fixed now for .26 (and possibly even backported to stable if we can come
> up with a non-invasive solution). I'm annoyed because it represents a
> trivial amount of code; this problem should have been fixed a long time
> ago.
As a pragmatic solution, as you mentioned above, we can disable
or change the problematic code with ifdefs. At best, use
dma_mmap_coherent() if it's available. If not, and if the arch is
known to have not-simply-mappable DMA pages (like MIPS), we can simply
disable the mmap feature.
Once we have dma_mmap_*() generally available, we can clean up the code.
thanks,
Takashi
* Re: ALSA vs. non coherent DMA
2008-05-06 11:01 ` Takashi Iwai
@ 2008-05-07 14:22 ` Timur Tabi
2008-05-07 15:44 ` Grant Likely
2008-05-07 21:53 ` Benjamin Herrenschmidt
0 siblings, 2 replies; 6+ messages in thread
From: Timur Tabi @ 2008-05-07 14:22 UTC (permalink / raw)
To: Takashi Iwai; +Cc: alsa-devel, Linux Kernel list, linuxppc-dev list
Takashi Iwai wrote:
> This is a mmap of the data record to be shared in realtime with apps.
> The app updates its data pointer (appl_ptr) on the mmapped buffer
> while the driver updates the data (e.g. DMA position, called hwptr) on
> the fly on the mmapped record. Due to its real-time nature, it has to
> be coherent -- at least, it was a problem on ARM.
This doesn't sound like a coherency problem to me, at least not one you'd find
on PowerPC. Both the driver and the application run on the host CPU, so there
shouldn't be any coherency problem. My understanding is that a "non coherent"
platform is one where the host CPU isn't aware when a *hardware device* writes
directly to memory, e.g. via DMA.
--
Timur Tabi
Linux kernel developer at Freescale
* Re: ALSA vs. non coherent DMA
2008-05-07 14:22 ` Timur Tabi
@ 2008-05-07 15:44 ` Grant Likely
2008-05-07 21:53 ` Benjamin Herrenschmidt
1 sibling, 0 replies; 6+ messages in thread
From: Grant Likely @ 2008-05-07 15:44 UTC (permalink / raw)
To: Timur Tabi; +Cc: Takashi Iwai, alsa-devel, Linux Kernel list, linuxppc-dev list
On Wed, May 7, 2008 at 8:22 AM, Timur Tabi <timur@freescale.com> wrote:
> Takashi Iwai wrote:
>
> > This is a mmap of the data record to be shared in realtime with apps.
> > The app updates its data pointer (appl_ptr) on the mmapped buffer
> > while the driver updates the data (e.g. DMA position, called hwptr) on
> > the fly on the mmapped record. Due to its real-time nature, it has to
> > be coherent -- at least, it was a problem on ARM.
>
> This doesn't sound like a coherency problem to me, at least not one you'd find
> on PowerPC. Both the driver and the application run on the host CPU, so there
> shouldn't be any coherency problem. My understanding is that a "non coherent"
> platform is one where the host CPU isn't aware when a *hardware device* writes
> directly to memory, e.g. via DMA.
IIRC, some ARMs have a different situation because the dcache is
virtually instead of physically tagged. Therefore, the kernel mapping
may not see data that has not been flushed out of the user space
mappings. (Someone please correct me if I'm wrong).
Cheers,
g.
--
Grant Likely, B.Sc., P.Eng.
Secret Lab Technologies Ltd.
* Re: ALSA vs. non coherent DMA
2008-05-07 14:22 ` Timur Tabi
2008-05-07 15:44 ` Grant Likely
@ 2008-05-07 21:53 ` Benjamin Herrenschmidt
2008-05-08 15:41 ` Takashi Iwai
1 sibling, 1 reply; 6+ messages in thread
From: Benjamin Herrenschmidt @ 2008-05-07 21:53 UTC (permalink / raw)
To: Timur Tabi; +Cc: Takashi Iwai, linuxppc-dev list, alsa-devel, Linux Kernel list
On Wed, 2008-05-07 at 09:22 -0500, Timur Tabi wrote:
> Takashi Iwai wrote:
>
> > This is a mmap of the data record to be shared in realtime with apps.
> > The app updates its data pointer (appl_ptr) on the mmapped buffer
> > while the driver updates the data (e.g. DMA position, called hwptr) on
> > the fly on the mmapped record. Due to its real-time nature, it has to
> > be coherent -- at least, it was a problem on ARM.
>
> This doesn't sound like a coherency problem to me, at least not one you'd find
> on PowerPC. Both the driver and the application run on the host CPU, so there
> shouldn't be any coherency problem. My understanding is that a "non coherent"
> platform is one where the host CPU isn't aware when a *hardware device* writes
> directly to memory, e.g. via DMA.
Yes, precisely. I was about to make a reply here. There is some
confusion at least in terminology, in Alsa. This is not DMA coherency,
though it is a problem with virtually tagged data caches that some archs
such as ARM have.
So this is ok for all PowerPC since they all have a physically tagged
data cache.
The real problem -is- still the DMA coherency issue and as I see it, is
two fold:
- mmap'ing of the result of dma_alloc_coherent() doesn't work. There
are two issues at play here, one is the pgprot that -must- be set to
uncached for such a mapping on non coherent architectures (and non
coherent architectures only), and the other is our virt_to_page() that
will puke on virtual addresses coming from dma_alloc_coherent().
- mmap'ing of SG lists for non coherent DMA. There the problem is a
mixture of how Alsa allocates the SG buffers and the previous
problem.
I think it's never valid to create an SG list with the output of
dma_alloc_coherent though. We would need a dma_alloc_sg() for that...
sglists are made of pages, thus allocated with GFP, and later DMA mapped
with dma_map_*, however this brings a whole other set of issues/constraints
such as bounce buffering on some MMU-less platforms if the memory
happens to come out of the wrong place. Also, such mapped buffers are
-not- coherent as they must not be modified via their virtual address
while mapped, -unless- they are also mapped in kernel and/or user space
(vmap & mmap) using some kind of "coherent" attributes such as
pgprot_noncached (and provided that is possible at all in kernel space
for archs like MIPS).
I don't have an easy answer there, it seems the bogosity roots deep in
alsa, at least for the SG bits. For the non-SG bits, we can probably
work around with an accessor to get the right pgprot and maybe some
variant of virt_to_page() (dma_virt_to_page() ?) that would walk the
kernel page tables to obtain the pfn.
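As a rough illustration of the non-SG workaround, here is a userspace model of the two decisions involved: which page frame the user mapping should start at, and which protection it should carry. Everything here is an assumption made for the sketch: the names, the 4 KiB page size, a physically contiguous buffer, and a platform whose DMA-to-physical translation is a fixed offset (on real platforms this step is platform specific). In kernel code the result would presumably feed something like remap_pfn_range(), with the protection derived via pgprot_noncached() on non coherent platforms; the sketch calls neither.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12u                          /* assumes 4 KiB pages */
#define SKETCH_PAGE_SIZE  ((uint64_t)1 << SKETCH_PAGE_SHIFT)

enum sketch_prot { SKETCH_PROT_CACHED, SKETCH_PROT_UNCACHED };

/* What a userspace mapping of the DMA buffer would need to look like. */
struct mmap_plan {
    uint64_t first_pfn;      /* first page frame to map          */
    uint64_t npages;         /* number of pages to map           */
    enum sketch_prot prot;   /* cache attribute for the mapping  */
};

/*
 * Derive the mapping from the DMA handle instead of the kernel virtual
 * address, so nothing like virt_to_page() is needed.
 * `dma_offset` is a hypothetical fixed bus-to-physical offset.
 */
static struct mmap_plan plan_dma_mmap(uint64_t dma_handle, uint64_t dma_offset,
                                      uint64_t size, bool arch_is_coherent)
{
    struct mmap_plan plan;
    uint64_t phys;

    assert(dma_handle >= dma_offset);
    phys = dma_handle - dma_offset;
    assert((phys & (SKETCH_PAGE_SIZE - 1)) == 0);      /* page aligned */

    plan.first_pfn = phys >> SKETCH_PAGE_SHIFT;
    plan.npages = (size + SKETCH_PAGE_SIZE - 1) >> SKETCH_PAGE_SHIFT;
    /* Uncached on non coherent architectures, and only on those. */
    plan.prot = arch_is_coherent ? SKETCH_PROT_CACHED : SKETCH_PROT_UNCACHED;
    return plan;
}
```

The point of keying everything off the DMA handle is that the kernel virtual address returned by dma_alloc_coherent() may live in a remapped region where virt_to_page() gives nonsense.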
Ben.
* Re: ALSA vs. non coherent DMA
2008-05-07 21:53 ` Benjamin Herrenschmidt
@ 2008-05-08 15:41 ` Takashi Iwai
0 siblings, 0 replies; 6+ messages in thread
From: Takashi Iwai @ 2008-05-08 15:41 UTC (permalink / raw)
To: benh; +Cc: linuxppc-dev list, alsa-devel, Linux Kernel list
At Thu, 08 May 2008 07:53:11 +1000,
Benjamin Herrenschmidt wrote:
>
> On Wed, 2008-05-07 at 09:22 -0500, Timur Tabi wrote:
> > Takashi Iwai wrote:
> >
> > > This is a mmap of the data record to be shared in realtime with apps.
> > > The app updates its data pointer (appl_ptr) on the mmapped buffer
> > > while the driver updates the data (e.g. DMA position, called hwptr) on
> > > the fly on the mmapped record. Due to its real-time nature, it has to
> > > be coherent -- at least, it was a problem on ARM.
> >
> > This doesn't sound like a coherency problem to me, at least not one you'd find
> > on PowerPC. Both the driver and the application run on the host CPU, so there
> > shouldn't be any coherency problem. My understanding is that a "non coherent"
> > platform is one where the host CPU isn't aware when a *hardware device* writes
> > directly to memory, e.g. via DMA.
>
> Yes, precisely. I was about to make a reply here. There is some
> confusion at least in terminology, in Alsa. This is not DMA coherency,
> though it is a problem with virtually tagged data caches that some archs
> such as ARM have.
Right. The words should be corrected.
Since the only way to get a certain non-cached map was the (ab-)use of
dma_mmap_coherent(), such a confusing wording was chosen.
> So this is ok for all PowerPC since they all have a physically tagged
> data cache.
OK, so that part should work as is for PPC.
> The real problem -is- still the DMA coherency issue and as I see it, is
> two fold:
>
> - mmap'ing of the result of dma_alloc_coherent() doesn't work. There
> are two issues at play here, one is the pgprot that -must- be set to
> uncached for such a mapping on non coherent architectures (and non
> coherent architectures only), and the other is our virt_to_page() that
> will puke on virtual addresses coming from dma_alloc_coherent().
And dma_mmap_coherent() would be a solution for it, I suppose.
> - mmap'ing of SG lists for non coherent DMA. There the problem is a
> mixture of how Alsa allocates the SG buffers and the previous
> problem.
Yes.
> I think it's never valid to create an SG list with the output of
> dma_alloc_coherent though. We would need a dma_alloc_sg() for that...
>
> sglists are made of pages, thus allocated with GFP, and later DMA mapped
> with dma_map_*, however this brings a whole other set of issues/constraints
> such as bounce buffering on some MMU-less platforms if the memory
> happens to come out of the wrong place. Also, such mapped buffers are
> -not- coherent as they must not be modified via their virtual address
> while mapped, -unless- they are also mapped in kernel and/or user space
> (vmap & mmap) using some kind of "coherent" attributes such as
> pgprot_noncached (and provided that is possible at all in kernel space
> for archs like MIPS).
>
> I don't have an easy answer there, it seems the bogosity roots deep in
> alsa, at least for the SG bits. For the non-SG bits, we can probably
> work around with an accessor to get the right pgprot and maybe some
> variant of virt_to_page() (dma_virt_to_page() ?) that would walk the
> kernel page tables to obtain the pfn.
The vmap() in sound/core/sgbuf.c can be omitted by adding proper PCM
callbacks (copy and silence) to handle SG-buffers. These are the only
ones that access the linear buffer runtime->area.
Then we'll just need a proper mmap PCM callback that just calls
dma_mmap_coherent() for each SG page. Also, the default PCM mmap
should be fixed to use dma_mmap_coherent() appropriately. That's
all.
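A minimal userspace sketch of what such copy and silence handlers have to do, assuming the SG buffer is just an array of separately allocated pages. The struct, the function names, and the 4 KiB page size are hypothetical and not taken from sound/core/sgbuf.c. The essential part is splitting a linear (position, length) request at page boundaries so that no contiguous virtual mapping of the whole buffer is needed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SG_PAGE_SIZE 4096u   /* assumed page size */

/* Hypothetical SG buffer: an array of independently allocated pages. */
struct sg_buf {
    unsigned char **pages;
    size_t npages;
};

/* Copy `len` bytes from `src` into the buffer at byte offset `pos`. */
static void sg_copy_in(struct sg_buf *sg, size_t pos, const void *src, size_t len)
{
    const unsigned char *s = src;

    assert(pos + len <= sg->npages * SG_PAGE_SIZE);
    while (len > 0) {
        size_t page = pos / SG_PAGE_SIZE;
        size_t off = pos % SG_PAGE_SIZE;
        size_t chunk = SG_PAGE_SIZE - off;   /* room left in this page */

        if (chunk > len)
            chunk = len;
        memcpy(sg->pages[page] + off, s, chunk);
        s += chunk;
        pos += chunk;
        len -= chunk;
    }
}

/* Fill `len` bytes at `pos` with zero, assuming zero means silence. */
static void sg_silence(struct sg_buf *sg, size_t pos, size_t len)
{
    assert(pos + len <= sg->npages * SG_PAGE_SIZE);
    while (len > 0) {
        size_t page = pos / SG_PAGE_SIZE;
        size_t off = pos % SG_PAGE_SIZE;
        size_t chunk = SG_PAGE_SIZE - off;

        if (chunk > len)
            chunk = len;
        memset(sg->pages[page] + off, 0, chunk);
        pos += chunk;
        len -= chunk;
    }
}
```

Zero is silence only for signed sample formats; an unsigned format would need a different fill value, which this sketch does not handle.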
So, what we really need is dma_mmap_coherent() implementations...
thanks,
Takashi