* ALSA vs. non coherent DMA
From: Benjamin Herrenschmidt @ 2008-05-06  0:08 UTC
To: Takashi Iwai; +Cc: linuxppc-dev list, alsa-devel, Linux Kernel list

Hi Takashi !

I'm bringing up an old thread as I'm just discovering that the problem
still hasn't been fixed.

There seem to be a few issues with ALSA's current usage of mmap vs. non
cache coherent architectures, such as embedded PowerPC's.

I can see at least two with a quick look at pcm_native.c, one I don't
understand and one I think I do:

 - The control/status mapping. Can you elaborate a bit on what this is
actually doing and why it shouldn't be done on "non coherent"
architectures ? Currently this -is- done on all powerpc's, whether they
are coherent or not and I want to understand what the underlying issue
is.

 - The mmap of DMA pages. Here, the problem appears twofold:

   * Use of virt_to_page() on virtual addresses returned by
dma_alloc_coherent().

   * Not using the appropriate page protection for a DMA coherent mapping
to userspace.

It seems like you have solved that in part by implementing a generic
dma_mmap_coherent() in the past that for some reason you never merged
upstream (I can track that to about 2 years ago). Is there a reason ?

I think we need to at least apply a band-aid today as it's becoming a
nasty issue for several non-coherent powerpc platforms. It could be in
the form of implementing dma_mmap_coherent() and changing ALSA to use it
with the appropriate ifdef, or just adding an ifdef CONFIG_PPC with the
right code in there for now until a better solution is found.

It should be trivial though. Getting the PFN from the DMA address is
easy if we have the dma handle and the virtual address, though that -is-
definitely platform specific. I can implement a function for that if you
need.

As for the pgprot, we can come up with something like
pgprot_mmap_dma(). Either that or I can fold it all in a powerpc wide
implementation of a dma_mmap_coherent() like we envisioned initially.

Let me know what approach is preferred here and I'll come up with
patches ASAP. As far as I'm concerned, this is a bug and thus must be
fixed now for .26 and possibly backported to stable (even if we can come
up with a non invasive solution). I'm annoyed because it represents a
trivial amount of code; this problem should have been fixed a long time
ago.

Cheers,
Ben.
* Re: ALSA vs. non coherent DMA
From: Takashi Iwai @ 2008-05-06 11:01 UTC
To: benh; +Cc: linuxppc-dev list, alsa-devel, Linux Kernel list

Hi Ben,

thanks for signaling this long-standing issue again.

At Tue, 06 May 2008 10:08:28 +1000,
Benjamin Herrenschmidt wrote:
>
> Hi Takashi !
>
> I'm bringing up an old thread as I'm just discovering that the problem
> still hasn't been fixed.
>
> There seem to be a few issues with ALSA's current usage of mmap vs. non
> cache coherent architectures, such as embedded PowerPC's.

Yep. And on MIPS, obviously.

> I can see at least two with a quick look at pcm_native.c, one I don't
> understand and one I think I do:
>
>  - The control/status mapping. Can you elaborate a bit on what this is
> actually doing and why it shouldn't be done on "non coherent"
> architectures ?

This is a mmap of the data record to be shared in realtime with apps.
The app updates its data pointer (appl_ptr) on the mmapped buffer
while the driver updates the data (e.g. DMA position, called hwptr) on
the fly on the mmapped record. Due to its real-time nature, it has to
be coherent -- at least, it was a problem on ARM.

> Currently this -is- done on all powerpc's, whether they
> are coherent or not and I want to understand what the underlying issue
> is.

It's actually buggy. Should check more precisely.

>  - The mmap of DMA pages. Here, the problem appears twofold:
>
>    * Use of virt_to_page() on virtual addresses returned by
> dma_alloc_coherent().
>
>    * Not using the appropriate page protection for a DMA coherent mapping
> to userspace.
>
> It seems like you have solved that in part by implementing a generic
> dma_mmap_coherent() in the past that for some reason you never merged
> upstream (I can track that to about 2 years ago). Is there a reason ?

IIRC, dma_mmap_coherent() cannot be implemented properly on some
architectures. This is no big problem for ALSA as long as it returns
an error or is masked out via ifdef. But the fact that this API cannot
be done for all archs discouraged arch maintainers, and the idea faded
out again.

> I think we need to at least apply a band-aid today as it's becoming a
> nasty issue for several non-coherent powerpc platforms. It could be in
> the form of implementing dma_mmap_coherent() and changing ALSA to use it
> with the appropriate ifdef, or just adding an ifdef CONFIG_PPC with the
> right code in there for now until a better solution is found.

Agreed.

> It should be trivial though. Getting the PFN from the DMA address is
> easy if we have the dma handle and the virtual address, though that -is-
> definitely platform specific. I can implement a function for that if you
> need.

That'll be great. dma_mmap_coherent() and friends would then be really
helpful to solve this issue.

> As for the pgprot, we can come up with something like
> pgprot_mmap_dma(). Either that or I can fold it all in a powerpc wide
> implementation of a dma_mmap_coherent() like we envisioned initially.

In principle, pgprot_*() isn't actually needed on the driver side at
all. We use pgprot_noncached() in one part, and it's for a hacky way
to mmap the ioremapped pages. It's not available on all
architectures, and I'm not sure whether it works on all PPC models
although it's enabled right now: in include/sound/pcm.h,

/* mmap for io-memory area */
#if defined(CONFIG_X86) || defined(CONFIG_PPC) || defined(CONFIG_ALPHA)
#define SNDRV_PCM_INFO_MMAP_IOMEM	SNDRV_PCM_INFO_MMAP
int snd_pcm_lib_mmap_iomem(struct snd_pcm_substream *substream, struct vm_area_struct *area);
#else
#define SNDRV_PCM_INFO_MMAP_IOMEM	0
#define snd_pcm_lib_mmap_iomem	NULL
#endif

Highly likely we need to fix this, too. The easiest way is to
disable this except for X86...

> Let me know what approach is preferred here and I'll come up with
> patches ASAP. As far as I'm concerned, this is a bug and thus must be
> fixed now for .26 and possibly backported to stable (even if we can come
> up with a non invasive solution). I'm annoyed because it represents a
> trivial amount of code; this problem should have been fixed a long time
> ago.

As a pragmatic solution, as you mentioned above, we can disable or
change the problematic code with ifdefs. At best, use
dma_mmap_coherent() if it's available. If not, and if the arch is
known to have not-simply-mappable DMA pages (like MIPS), we can simply
disable the mmap feature.

Once we have dma_mmap_*() generally available, we can clean up the
code.


thanks,

Takashi
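
The control/status record described in the message above can be pictured with a small userspace sketch. This is only an illustration of why both sides must see each other's latest pointer value: the struct and function names are hypothetical simplifications, not the real ALSA definitions, and the wrap-around arithmetic assumes both pointers count frames and wrap at a `boundary` that is a multiple of the buffer size.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the mmapped records;
 * the real ALSA structures carry more fields. */
struct sketch_status  { volatile uint64_t hw_ptr;   /* written by the driver */ };
struct sketch_control { volatile uint64_t appl_ptr; /* written by the app */ };

/* Frames the application may still write into a playback ring buffer.
 * Both pointers count frames and wrap at 'boundary'. */
static int64_t sketch_playback_avail(const struct sketch_status *st,
                                     const struct sketch_control *ct,
                                     uint64_t buffer_size, uint64_t boundary)
{
    assert(buffer_size > 0 && boundary >= buffer_size);

    int64_t avail = (int64_t)st->hw_ptr + (int64_t)buffer_size
                    - (int64_t)ct->appl_ptr;
    if (avail < 0)
        avail += (int64_t)boundary;       /* hw_ptr has wrapped, appl_ptr has not */
    else if ((uint64_t)avail >= boundary)
        avail -= (int64_t)boundary;       /* appl_ptr has wrapped, hw_ptr has not */
    return avail;
}
```

If the application reads a stale hw_ptr, or the driver a stale appl_ptr, the computed free space is wrong, which is why a mapping that can serve stale cache lines is a problem for this record.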
* Re: ALSA vs. non coherent DMA
From: Timur Tabi @ 2008-05-07 14:22 UTC
To: Takashi Iwai; +Cc: alsa-devel, Linux Kernel list, linuxppc-dev list

Takashi Iwai wrote:

> This is a mmap of the data record to be shared in realtime with apps.
> The app updates its data pointer (appl_ptr) on the mmapped buffer
> while the driver updates the data (e.g. DMA position, called hwptr) on
> the fly on the mmapped record. Due to its real-time nature, it has to
> be coherent -- at least, it was a problem on ARM.

This doesn't sound like a coherency problem to me, at least not one you'd find
on PowerPC. Both the driver and the application run on the host CPU, so there
shouldn't be any coherency problem. My understanding is that a "non coherent"
platform is one where the host CPU isn't aware when a *hardware device* writes
directly to memory, e.g. via DMA.

-- 
Timur Tabi
Linux kernel developer at Freescale
* Re: ALSA vs. non coherent DMA
From: Grant Likely @ 2008-05-07 15:44 UTC
To: Timur Tabi; +Cc: Takashi Iwai, alsa-devel, Linux Kernel list, linuxppc-dev list

On Wed, May 7, 2008 at 8:22 AM, Timur Tabi <timur@freescale.com> wrote:
> Takashi Iwai wrote:
>
> > This is a mmap of the data record to be shared in realtime with apps.
> > The app updates its data pointer (appl_ptr) on the mmapped buffer
> > while the driver updates the data (e.g. DMA position, called hwptr) on
> > the fly on the mmapped record. Due to its real-time nature, it has to
> > be coherent -- at least, it was a problem on ARM.
>
> This doesn't sound like a coherency problem to me, at least not one you'd find
> on PowerPC. Both the driver and the application run on the host CPU, so there
> shouldn't be any coherency problem. My understanding is that a "non coherent"
> platform is one where the host CPU isn't aware when a *hardware device* writes
> directly to memory, e.g. via DMA.

IIRC, some ARMs have a different situation because the dcache is
virtually instead of physically tagged. Therefore, the kernel mapping
may not see data that has not been flushed out of the user space
mappings. (Someone please correct me if I'm wrong).

Cheers,
g.

-- 
Grant Likely, B.Sc., P.Eng.
Secret Lab Technologies Ltd.
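
The aliasing effect mentioned above can be illustrated with a rough model. The sketch below assumes a virtually *indexed* cache whose way size is a power of two and larger than the page size, which is a simplification of the "virtually tagged" case in the message; all names and the geometry are hypothetical. Two virtual mappings of the same physical page can then land in different cache sets when their index bits above the page offset (their "colour") differ, so a write through one mapping is not visible through the other until it is flushed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Cache "colour" of a virtual address: the cache index bits that lie
 * above the page offset. way_size must be a power of two. */
static unsigned sketch_cache_colour(uintptr_t vaddr, unsigned page_shift,
                                    uintptr_t way_size)
{
    assert(way_size != 0 && (way_size & (way_size - 1)) == 0);
    return (unsigned)((vaddr & (way_size - 1)) >> page_shift);
}

/* True when a kernel and a user mapping of the same physical page would
 * use different cache sets in this model, i.e. could hold stale copies. */
static bool sketch_may_alias(uintptr_t kernel_va, uintptr_t user_va,
                             unsigned page_shift, uintptr_t way_size)
{
    return sketch_cache_colour(kernel_va, page_shift, way_size) !=
           sketch_cache_colour(user_va, page_shift, way_size);
}
```

With 4 KiB pages and a 16 KiB way there are four colours in this model; with a way no larger than a page every address has colour 0 and the two mappings always share a set, which is analogous to the physically tagged situation discussed for PowerPC.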
* Re: ALSA vs. non coherent DMA
From: Benjamin Herrenschmidt @ 2008-05-07 21:53 UTC
To: Timur Tabi; +Cc: Takashi Iwai, linuxppc-dev list, alsa-devel, Linux Kernel list

On Wed, 2008-05-07 at 09:22 -0500, Timur Tabi wrote:
> Takashi Iwai wrote:
>
> > This is a mmap of the data record to be shared in realtime with apps.
> > The app updates its data pointer (appl_ptr) on the mmapped buffer
> > while the driver updates the data (e.g. DMA position, called hwptr) on
> > the fly on the mmapped record. Due to its real-time nature, it has to
> > be coherent -- at least, it was a problem on ARM.
>
> This doesn't sound like a coherency problem to me, at least not one you'd find
> on PowerPC. Both the driver and the application run on the host CPU, so there
> shouldn't be any coherency problem. My understanding is that a "non coherent"
> platform is one where the host CPU isn't aware when a *hardware device* writes
> directly to memory, e.g. via DMA.

Yes, precisely. I was about to make a reply here. There is some
confusion, at least in terminology, in ALSA. This is not DMA coherency,
though it is a problem with virtually tagged data caches that some archs
such as ARM have.

So this is ok for all PowerPC since they all have a physically tagged
data cache.

The real problem -is- still the DMA coherency issue and, as I see it, it
is twofold:

 - mmap'ing of the result of dma_alloc_coherent() doesn't work. There
are two issues at play here, one is the pgprot that -must- be set to
uncached for such a mapping on non coherent architectures (and non
coherent architectures only), and the other is our virt_to_page() that
will puke on virtual addresses coming from dma_alloc_coherent().

 - mmap'ing of SG lists for non coherent DMA. There the problem is a
mixture of how ALSA allocates the SG buffers and the previous
problem.

I think it's never valid to create an SG list with the output of
dma_alloc_coherent though. We would need a dma_alloc_sg() for that...

sglists are made of pages, thus allocated with GFP, and later DMA mapped
with dma_map_*, however this brings a whole other set of
issues/constraints such as bounce buffering on some MMU less platforms
if the memory happens to come out of the wrong place. Also, such mapped
buffers are -not- coherent as they must not be modified via their
virtual address while mapped, -unless- they are also mapped in kernel
and/or user space (vmap & mmap) using some kind of "coherent" attributes
such as pgprot_noncached (and provided that is possible at all in kernel
space for archs like MIPS).

I don't have an easy answer there, it seems the bogosity is rooted deep
in ALSA, at least for the SG bits. For the non-SG bits, we can probably
work around with an accessor to get the right pgprot and maybe some
variant of virt_to_page() (dma_virt_to_page() ?) that would walk the
kernel page tables to obtain the pfn.

Ben.
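
The two pieces of the non-SG workaround discussed above (finding the PFN behind a DMA buffer and choosing the page protection) can be sketched as plain arithmetic. This is a userspace model under loud assumptions: 4 KiB pages, a linear bus-to-physical offset (the thread stresses this step is platform specific, and it does not replace the page-table walk suggested for dma_virt_to_page()), and a single per-platform coherency flag. None of these names exist in the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12u   /* assumption: 4 KiB pages */

/* Hypothetical platform description. */
struct sketch_platform {
    uint64_t dma_offset;    /* assumed: bus address = physical address + dma_offset */
    bool     dma_coherent;  /* does the CPU snoop device writes? */
};

enum sketch_prot { SKETCH_PROT_CACHED, SKETCH_PROT_UNCACHED };

/* Page frame number behind a DMA handle, under the linear-offset assumption. */
static uint64_t sketch_dma_to_pfn(const struct sketch_platform *p,
                                  uint64_t dma_handle)
{
    assert(dma_handle >= p->dma_offset);
    return (dma_handle - p->dma_offset) >> SKETCH_PAGE_SHIFT;
}

/* Protection for a userspace mapping of a coherent DMA buffer:
 * uncached on non coherent platforms, and only there. */
static enum sketch_prot sketch_mmap_dma_prot(const struct sketch_platform *p)
{
    return p->dma_coherent ? SKETCH_PROT_CACHED : SKETCH_PROT_UNCACHED;
}
```

A real implementation would derive both values from the architecture's DMA code rather than from a table like this; the sketch only shows that the PFN comes from the DMA side, not from virt_to_page() on the returned virtual address.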
* Re: ALSA vs. non coherent DMA
From: Takashi Iwai @ 2008-05-08 15:41 UTC
To: benh; +Cc: linuxppc-dev list, alsa-devel, Linux Kernel list

At Thu, 08 May 2008 07:53:11 +1000,
Benjamin Herrenschmidt wrote:
>
> On Wed, 2008-05-07 at 09:22 -0500, Timur Tabi wrote:
> > Takashi Iwai wrote:
> >
> > > This is a mmap of the data record to be shared in realtime with apps.
> > > The app updates its data pointer (appl_ptr) on the mmapped buffer
> > > while the driver updates the data (e.g. DMA position, called hwptr) on
> > > the fly on the mmapped record. Due to its real-time nature, it has to
> > > be coherent -- at least, it was a problem on ARM.
> >
> > This doesn't sound like a coherency problem to me, at least not one you'd find
> > on PowerPC. Both the driver and the application run on the host CPU, so there
> > shouldn't be any coherency problem. My understanding is that a "non coherent"
> > platform is one where the host CPU isn't aware when a *hardware device* writes
> > directly to memory, e.g. via DMA.
>
> Yes, precisely. I was about to make a reply here. There is some
> confusion, at least in terminology, in ALSA. This is not DMA coherency,
> though it is a problem with virtually tagged data caches that some archs
> such as ARM have.

Right. The words should be corrected. Since the only way to get a
certain non-cached map was the (ab-)use of dma_mmap_coherent(), such
a confusing wording was chosen.

> So this is ok for all PowerPC since they all have a physically tagged
> data cache.

OK, so that part should work as is for PPC.

> The real problem -is- still the DMA coherency issue and, as I see it, it
> is twofold:
>
>  - mmap'ing of the result of dma_alloc_coherent() doesn't work. There
> are two issues at play here, one is the pgprot that -must- be set to
> uncached for such a mapping on non coherent architectures (and non
> coherent architectures only), and the other is our virt_to_page() that
> will puke on virtual addresses coming from dma_alloc_coherent().

And dma_mmap_coherent() would be a solution for it, I suppose.

>  - mmap'ing of SG lists for non coherent DMA. There the problem is a
> mixture of how ALSA allocates the SG buffers and the previous
> problem.

Yes.

> I think it's never valid to create an SG list with the output of
> dma_alloc_coherent though. We would need a dma_alloc_sg() for that...
>
> sglists are made of pages, thus allocated with GFP, and later DMA mapped
> with dma_map_*, however this brings a whole other set of
> issues/constraints such as bounce buffering on some MMU less platforms
> if the memory happens to come out of the wrong place. Also, such mapped
> buffers are -not- coherent as they must not be modified via their
> virtual address while mapped, -unless- they are also mapped in kernel
> and/or user space (vmap & mmap) using some kind of "coherent" attributes
> such as pgprot_noncached (and provided that is possible at all in kernel
> space for archs like MIPS).
>
> I don't have an easy answer there, it seems the bogosity is rooted deep
> in ALSA, at least for the SG bits. For the non-SG bits, we can probably
> work around with an accessor to get the right pgprot and maybe some
> variant of virt_to_page() (dma_virt_to_page() ?) that would walk the
> kernel page tables to obtain the pfn.

The vmap() in sound/core/sgbuf.c can be omitted by adding proper PCM
callbacks (copy and silence) to handle SG-buffers. These are the only
ones that access the linear buffer runtime->area.

Then we'll just need a proper mmap PCM callback just calling
dma_mmap_coherent() for each SG page. Also, the default PCM mmap
should be fixed to use dma_mmap_coherent() appropriately. That's
all.

So, what we really need is dma_mmap_coherent() implementations...


thanks,

Takashi
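
The per-page mmap callback proposed above can be modelled in userspace as a loop that hands each SG page to a mapping primitive at successive offsets. This is a sketch of the control flow only: the callback type stands in for a dma_mmap_coherent()-like primitive whose real signature is not reproduced here, pages are assumed to be 4 KiB, and every name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u   /* assumption: 4 KiB pages */
#define SKETCH_MAX_PAGES 8u

struct sketch_sg_page { uint64_t dma_addr; };   /* one page-sized DMA chunk */

/* Hypothetical per-page mapping primitive; returns 0 on success. */
typedef int (*sketch_map_page_fn)(void *ctx, uint64_t user_offset,
                                  uint64_t dma_addr);

/* Map 'length' bytes of an SG buffer page by page, in user-offset order. */
static int sketch_mmap_sg(const struct sketch_sg_page *pages, size_t npages,
                          uint64_t length, sketch_map_page_fn map_page,
                          void *ctx)
{
    assert(pages != NULL && map_page != NULL);

    uint64_t offset = 0;
    for (size_t i = 0; i < npages && offset < length;
         i++, offset += SKETCH_PAGE_SIZE) {
        int err = map_page(ctx, offset, pages[i].dma_addr);
        if (err)
            return err;   /* stop at the first page that cannot be mapped */
    }
    return offset >= length ? 0 : -1;   /* -1: buffer shorter than request */
}

/* Recording callback, used here in place of a real mapping call. */
struct sketch_mapping { uint64_t dma_addr[SKETCH_MAX_PAGES]; size_t count; };

static int sketch_record_page(void *ctx, uint64_t user_offset,
                              uint64_t dma_addr)
{
    struct sketch_mapping *m = ctx;
    size_t slot = (size_t)(user_offset / SKETCH_PAGE_SIZE);
    if (slot >= SKETCH_MAX_PAGES)
        return -1;
    m->dma_addr[slot] = dma_addr;
    m->count = slot + 1;
    return 0;
}
```

The point of the shape is that userspace sees one contiguous range while each page is mapped individually from its own DMA address, so no vmap() of the whole buffer is needed for the mmap path.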