* 2.6.4-rc1: ALSA makes invalid assumptions about memory types
@ 2004-02-29 12:22 Russell King
2004-02-29 12:48 ` [Alsa-devel] " Jaroslav Kysela
0 siblings, 1 reply; 4+ messages in thread
From: Russell King @ 2004-02-29 12:22 UTC (permalink / raw)
To: Alsa Devel list
1) Memory types.
In snd_dma_alloc_pages(), ALSA does this:
case SNDRV_DMA_TYPE_PCI:
dmab->area = snd_malloc_pci_pages(dev->dev.pci, size, &dmab->addr);
which eventually comes down to this:
res = pci_alloc_consistent(pci, PAGE_SIZE * (1 << pg), dma_addr);
IOW:
dmab->area = pci_alloc_consistent(dev->dev.pci, size, &dmab->addr);
and in snd_pcm_lib_malloc_pages() we copy these into the runtime, thusly:
runtime->dma_area = dmab.area;
runtime->dma_addr = dmab.addr;
runtime->dma_private = dmab.private_data;
runtime->dma_bytes = size;
However, in snd_pcm_mmap_data_nopage(), ALSA does this:
vaddr = runtime->dma_area + offset;
page = virt_to_page(vaddr);
virt_to_page may _only_ be used on memory returned by get_free_page()
and kmalloc(), and certainly not on memory returned by
pci_alloc_consistent(). The only reason it works on x86 is because
x86 is a fully cache coherent architecture, so pci_alloc_consistent()
_just happens_ to be equivalent to get_free_pages() on that platform.
Note that the same applies to dma_alloc_coherent().
In other words, the above code will not work on non-cache coherent
architectures without modification.
I believe this needs discussing with the DMA API authors on LKML since
AFAIK the kernel currently doesn't have a clear API to translate memory
returned by either pci_alloc_consistent() or dma_alloc_coherent() back
to it's consituent struct page pointers.
Secondly the user space mapping will be marked as cacheable on some
architectures, which would be Real Bad(tm) on architectures which
are not DMA coherent.
The way architectures mark their mappings uncacheable and/or only
writecombining is architecture specific... see drivers/video/fbmem.c
as an example.
2) PCM mmap control/status mappings
These suffer from a similar cache coherency problem - you can not
assume that two different mappings of the same page will not alias
in the CPUs caches.
In my case on ARM, not only must the user space mappings of these
structures be marked uncacheable, but also the kernel space mappings
of the same to ensure that accesses via both mappings always return
up to date information.
I have hacks in my tree which work around this using ARM specific
functionality, but this is very much architecture specific at the
moment, and I suspect requires a new kernel API for creating memory
(it's similar to the DMA case above.)
--
Russell King
Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/
maintainer of: 2.6 PCMCIA - http://pcmcia.arm.linux.org.uk/
2.6 Serial core
-------------------------------------------------------
SF.Net is sponsored by: Speed Start Your Linux Apps Now.
Build and deploy apps & Web services for Linux with
a free DVD software kit from IBM. Click Now!
http://ads.osdn.com/?ad_id=1356&alloc_id=3438&op=click
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: 2.6.4-rc1: ALSA makes invalid assumptions about memory types
2004-02-29 12:22 2.6.4-rc1: ALSA makes invalid assumptions about memory types Russell King
@ 2004-02-29 12:48 ` Jaroslav Kysela
0 siblings, 0 replies; 4+ messages in thread
From: Jaroslav Kysela @ 2004-02-29 12:48 UTC (permalink / raw)
To: Russell King; +Cc: Alsa Devel list, LKML
On Sun, 29 Feb 2004, Russell King wrote:
> 1) Memory types.
>
> In snd_dma_alloc_pages(), ALSA does this:
>
> case SNDRV_DMA_TYPE_PCI:
> dmab->area = snd_malloc_pci_pages(dev->dev.pci, size, &dmab->addr);
>
> which eventually comes down to this:
>
> res = pci_alloc_consistent(pci, PAGE_SIZE * (1 << pg), dma_addr);
>
> IOW:
>
> dmab->area = pci_alloc_consistent(dev->dev.pci, size, &dmab->addr);
>
> and in snd_pcm_lib_malloc_pages() we copy these into the runtime, thusly:
>
> runtime->dma_area = dmab.area;
> runtime->dma_addr = dmab.addr;
> runtime->dma_private = dmab.private_data;
> runtime->dma_bytes = size;
>
> However, in snd_pcm_mmap_data_nopage(), ALSA does this:
>
> vaddr = runtime->dma_area + offset;
> page = virt_to_page(vaddr);
>
> virt_to_page may _only_ be used on memory returned by get_free_page()
> and kmalloc(), and certainly not on memory returned by
> pci_alloc_consistent(). The only reason it works on x86 is because
> x86 is a fully cache coherent architecture, so pci_alloc_consistent()
> _just happens_ to be equivalent to get_free_pages() on that platform.
> Note that the same applies to dma_alloc_coherent().
>
> In other words, the above code will not work on non-cache coherent
> architectures without modification.
>
> I believe this needs discussing with the DMA API authors on LKML since
> AFAIK the kernel currently doesn't have a clear API to translate memory
> returned by either pci_alloc_consistent() or dma_alloc_coherent() back
> to it's consituent struct page pointers.
We can do some #ifdef hacks in our code, of course, but a function
doing this abstraction in kernel for given architecture would be much
better of course.
> Secondly the user space mapping will be marked as cacheable on some
> architectures, which would be Real Bad(tm) on architectures which
> are not DMA coherent.
>
> The way architectures mark their mappings uncacheable and/or only
> writecombining is architecture specific... see drivers/video/fbmem.c
> as an example.
It's real mess. I think that this setup should be moved to linux/arch
tree, too.
> 2) PCM mmap control/status mappings
>
> These suffer from a similar cache coherency problem - you can not
> assume that two different mappings of the same page will not alias
> in the CPUs caches.
>
> In my case on ARM, not only must the user space mappings of these
> structures be marked uncacheable, but also the kernel space mappings
> of the same to ensure that accesses via both mappings always return
> up to date information.
>
> I have hacks in my tree which work around this using ARM specific
> functionality, but this is very much architecture specific at the
> moment, and I suspect requires a new kernel API for creating memory
> (it's similar to the DMA case above.)
It's same problem. We need that a function in kernel API will do this
for us to avoid #ifdefs for all architectures.
I'm not sure, if I'm the right person to do these changes in kernel.
Any volunteer? Thanks.
Until this change is done, I accept the #ifdef hacks to fix support for
all architectures in ALSA.
Jaroslav
-----
Jaroslav Kysela <perex@suse.cz>
Linux Kernel Sound Maintainer
ALSA Project, SuSE Labs
-------------------------------------------------------
SF.Net is sponsored by: Speed Start Your Linux Apps Now.
Build and deploy apps & Web services for Linux with
a free DVD software kit from IBM. Click Now!
http://ads.osdn.com/?ad_id=1356&alloc_id=3438&op=click
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [Alsa-devel] 2.6.4-rc1: ALSA makes invalid assumptions about memory types
@ 2004-02-29 12:48 ` Jaroslav Kysela
0 siblings, 0 replies; 4+ messages in thread
From: Jaroslav Kysela @ 2004-02-29 12:48 UTC (permalink / raw)
To: Russell King; +Cc: Alsa Devel list, LKML
On Sun, 29 Feb 2004, Russell King wrote:
> 1) Memory types.
>
> In snd_dma_alloc_pages(), ALSA does this:
>
> case SNDRV_DMA_TYPE_PCI:
> dmab->area = snd_malloc_pci_pages(dev->dev.pci, size, &dmab->addr);
>
> which eventually comes down to this:
>
> res = pci_alloc_consistent(pci, PAGE_SIZE * (1 << pg), dma_addr);
>
> IOW:
>
> dmab->area = pci_alloc_consistent(dev->dev.pci, size, &dmab->addr);
>
> and in snd_pcm_lib_malloc_pages() we copy these into the runtime, thusly:
>
> runtime->dma_area = dmab.area;
> runtime->dma_addr = dmab.addr;
> runtime->dma_private = dmab.private_data;
> runtime->dma_bytes = size;
>
> However, in snd_pcm_mmap_data_nopage(), ALSA does this:
>
> vaddr = runtime->dma_area + offset;
> page = virt_to_page(vaddr);
>
> virt_to_page may _only_ be used on memory returned by get_free_page()
> and kmalloc(), and certainly not on memory returned by
> pci_alloc_consistent(). The only reason it works on x86 is because
> x86 is a fully cache coherent architecture, so pci_alloc_consistent()
> _just happens_ to be equivalent to get_free_pages() on that platform.
> Note that the same applies to dma_alloc_coherent().
>
> In other words, the above code will not work on non-cache coherent
> architectures without modification.
>
> I believe this needs discussing with the DMA API authors on LKML since
> AFAIK the kernel currently doesn't have a clear API to translate memory
> returned by either pci_alloc_consistent() or dma_alloc_coherent() back
> to it's consituent struct page pointers.
We can do some #ifdef hacks in our code, of course, but a function
doing this abstraction in kernel for given architecture would be much
better of course.
> Secondly the user space mapping will be marked as cacheable on some
> architectures, which would be Real Bad(tm) on architectures which
> are not DMA coherent.
>
> The way architectures mark their mappings uncacheable and/or only
> writecombining is architecture specific... see drivers/video/fbmem.c
> as an example.
It's real mess. I think that this setup should be moved to linux/arch
tree, too.
> 2) PCM mmap control/status mappings
>
> These suffer from a similar cache coherency problem - you can not
> assume that two different mappings of the same page will not alias
> in the CPUs caches.
>
> In my case on ARM, not only must the user space mappings of these
> structures be marked uncacheable, but also the kernel space mappings
> of the same to ensure that accesses via both mappings always return
> up to date information.
>
> I have hacks in my tree which work around this using ARM specific
> functionality, but this is very much architecture specific at the
> moment, and I suspect requires a new kernel API for creating memory
> (it's similar to the DMA case above.)
It's same problem. We need that a function in kernel API will do this
for us to avoid #ifdefs for all architectures.
I'm not sure, if I'm the right person to do these changes in kernel.
Any volunteer? Thanks.
Until this change is done, I accept the #ifdef hacks to fix support for
all architectures in ALSA.
Jaroslav
-----
Jaroslav Kysela <perex@suse.cz>
Linux Kernel Sound Maintainer
ALSA Project, SuSE Labs
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [Alsa-devel] 2.6.4-rc1: ALSA makes invalid assumptions about memory types
2004-02-29 12:48 ` [Alsa-devel] " Jaroslav Kysela
(?)
@ 2004-02-29 15:26 ` Russell King
-1 siblings, 0 replies; 4+ messages in thread
From: Russell King @ 2004-02-29 15:26 UTC (permalink / raw)
To: Jaroslav Kysela; +Cc: Alsa Devel list, LKML
On Sun, Feb 29, 2004 at 01:48:52PM +0100, Jaroslav Kysela wrote:
> On Sun, 29 Feb 2004, Russell King wrote:
> > I believe this needs discussing with the DMA API authors on LKML since
> > AFAIK the kernel currently doesn't have a clear API to translate memory
> > returned by either pci_alloc_consistent() or dma_alloc_coherent() back
> > to it's consituent struct page pointers.
>
> We can do some #ifdef hacks in our code, of course, but a function
> doing this abstraction in kernel for given architecture would be much
> better of course.
I've been thinking about something like:
struct page *dma_map_to_page(struct device *dev, void *cpu_addr,
dma_addr_t dma_addr, int page_offset);
Where:
- dev is the struct device which was passed to dma_alloc_coherent()
- cpu_addr is the address returned from dma_alloc_coherent()
- dma_addr is the DMA address returned from dma_alloc_coherent()
- page_offset is the offset into the mapping.
This allows an architecture to define the above as a macro. If the
architecture is coherent (ie, as x86 is), then they only need to do
this in their asm/dma-mapping.h header:
#define dma_map_to_page(dev,cpu,dma,off) virt_to_page((cpu) + (off) << PAGE_SHIFT)
This reflects essentially what we have today in snd_pcm_mmap_data_nopage().
However, in the non-coherent case, this becomes very much architecture
dependent, and an architecture probably wants to translate either the
DMA address or CPU address to a struct page by some method.
> > The way architectures mark their mappings uncacheable and/or only
> > writecombining is architecture specific... see drivers/video/fbmem.c
> > as an example.
>
> It's real mess. I think that this setup should be moved to linux/arch
> tree, too.
I think we just need to make sure that architectures define
pgprot_writecombine() and pgprot_uncached(), like ARM and ia64 both
do.
> > 2) PCM mmap control/status mappings
> >
> > These suffer from a similar cache coherency problem - you can not
> > assume that two different mappings of the same page will not alias
> > in the CPUs caches.
> >
> > In my case on ARM, not only must the user space mappings of these
> > structures be marked uncacheable, but also the kernel space mappings
> > of the same to ensure that accesses via both mappings always return
> > up to date information.
> >
> > I have hacks in my tree which work around this using ARM specific
> > functionality, but this is very much architecture specific at the
> > moment, and I suspect requires a new kernel API for creating memory
> > (it's similar to the DMA case above.)
>
> It's same problem. We need that a function in kernel API will do this
> for us to avoid #ifdefs for all architectures.
I think for the moment we'll leave this issue alone for now (and I'll
put up with the hacks.) Once we've sorted (1) out, we should have a
better idea how to sort (2).
--
Russell King
Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/
maintainer of: 2.6 PCMCIA - http://pcmcia.arm.linux.org.uk/
2.6 Serial core
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2004-02-29 15:26 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2004-02-29 12:22 2.6.4-rc1: ALSA makes invalid assumptions about memory types Russell King
2004-02-29 12:48 ` Jaroslav Kysela
2004-02-29 12:48 ` [Alsa-devel] " Jaroslav Kysela
2004-02-29 15:26 ` Russell King
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.