* consistent_free re-revisited
@ 2002-09-11 21:45 Todd Poynor
2002-09-12 3:26 ` David Gibson
0 siblings, 1 reply; 10+ messages in thread
From: Todd Poynor @ 2002-09-11 21:45 UTC (permalink / raw)
To: linuxppc-dev
The APIs for consistent_free have diverged between PPC and ARM. It has,
at least in the past, been a goal to keep the APIs in sync.
Revisiting this is prompted by a couple of new platforms recently added
(Xilinx Virtex II Pro and IBM 405LP/Beech) that use consistent_alloc for
framebuffers mmap'ed to X servers. It would be nice (but is hardly a
critical need) if consistent_alloc/free would set/clear the reserved bit
on the struct page's allocated, such that the returned memory can be
mmap'ed to userspace without the driver explicitly setting these bits, a
la ARM.
On PPC this is complicated by the fact that consistent_free does not
take a size parameter, describing the size of the allocated area, nor a
dma_handle parameter, which ARM uses to derive struct page's.
Furthermore, consistent_alloc does not fill out vm_struct fields for the
VM area that map it to the struct pages, so consistent_free can't use
those to derive the physical pages. And neither can vfree(), which is
what consistent_free() calls to do its work. And so consistent_free()
does not free the physical pages allocated by consistent_alloc(), which
is a potentially more serious matter (although I suppose consistent_free
isn't normally in heavy use, but hi-res framebuffers can get large).
Setting/clearing reserved bits and init'ing the vm_struct fields such
that physical pages are reclaimed by vfree() can be accomplished without
API changes, at the expense of intruding into private VM data structures
and inefficiency of allocating an array of struct page pointers to
describe a contiguous chunk of memory. So it seems preferable to add
the size and dma_handle params to consistent_free and do the physical
page freeing there (and update the drivers that call it). Comments?
Thanks,
--
Todd
** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: consistent_free re-revisited
2002-09-11 21:45 consistent_free re-revisited Todd Poynor
@ 2002-09-12 3:26 ` David Gibson
2002-09-12 14:29 ` Tom Rini
2002-09-12 21:33 ` Todd Poynor
0 siblings, 2 replies; 10+ messages in thread
From: David Gibson @ 2002-09-12 3:26 UTC (permalink / raw)
To: Todd Poynor; +Cc: linuxppc-dev
On Wed, Sep 11, 2002 at 02:45:31PM -0700, Todd Poynor wrote:
> The APIs for consistent_free have diverged between PPC and ARM. It has,
> at least in the past, been a goal to keep the APIs in sync.
That's the least of our worries - the current consistent_alloc() is
badly broken. It breaks the DMA-mapping.txt rules, because it can't
safely be called from interrupt context. I think we need generic
changes (gfp masks to various functions) to fix this sanely.
It would be nice to have the same interface as ARM, though - on the
other hand the consistent_*() interface as such may be obsoleted by
unified device model interfaces: I believe per-bus consistent memory
interfaces are supposed to be in there.
> Revisiting this is prompted by a couple of new platforms recently added
> (Xilinx Virtex II Pro and IBM 405LP/Beech) that use consistent_alloc for
> framebuffers mmap'ed to X servers. It would be nice (but is hardly a
> critical need) if consistent_alloc/free would set/clear the reserved bit
> on the struct page's allocated, such that the returned memory can be
> mmap'ed to userspace without the driver explicitly setting these bits, a
> la ARM.
I've thought about this before, but on the whole I don't think it's a
good idea. I think setting VM_IO on the vma when it is mapped into
userspace is a better idea.
In PCI space, the assumption appears to be that PageReserved is *not*
set by pci_alloc_consistent() - a couple of the sound drivers assume
this, because they explicitly set and clear the Reserved bit when they
remap various DMA buffers.
Incidentally I think there are also cases where consistent_alloc()
(both ours and ARM's, IIRC) could leak memory if failures occur at
just the right point.
> On PPC this is complicated by the fact that consistent_free does not
> take a size parameter, describing the size of the allocated area, nor a
> dma_handle parameter, which ARM uses to derive struct page's.
Yes, on the other hand having a single "handle" on the memory in order
to free it is nice.
> Furthermore, consistent_alloc does not fill out vm_struct fields for the
> VM area that map it to the struct pages, so consistent_free can't use
> those to derive the physical pages. And neither can vfree(), which is
> what consistent_free() calls to do its work. And so consistent_free()
> does not free the physical pages allocated by consistent_alloc(), which
> is a potentially more serious matter (although I suppose consistent_free
> isn't normally in heavy use, but hi-res framebuffers can get large).
Erm... vfree() walks the page tables to find the pages to free, which
should work fine.
> Setting/clearing reserved bits and init'ing the vm_struct fields such
> that physical pages are reclaimed by vfree() can be accomplished without
> API changes, at the expense of intruding into private VM data structures
> and inefficiency of allocating an array of struct page pointers to
> describe a contiguous chunk of memory. So it seems preferable to add
> the size and dma_handle params to consistent_free and do the physical
> page freeing there (and update the drivers that call it). Comments?
> Thanks,
>
>
--
David Gibson | For every complex problem there is a
david@gibson.dropbear.id.au | solution which is simple, neat and
| wrong.
http://www.ozlabs.org/people/dgibson
** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: consistent_free re-revisited
2002-09-12 14:29 ` Tom Rini
@ 2002-09-12 7:23 ` Benjamin Herrenschmidt
2002-09-12 14:51 ` Tom Rini
2002-09-12 17:05 ` Matt Porter
1 sibling, 1 reply; 10+ messages in thread
From: Benjamin Herrenschmidt @ 2002-09-12 7:23 UTC (permalink / raw)
To: Tom Rini, Todd Poynor, linuxppc-dev
>> That's the least of our worries - the current consistent_alloc() is
>> badly broken. It breaks the DMA-mapping.txt rules, because it can't
>> safely be called from interrupt context. I think we need generic
>> changes (gfp masks to various functions) to fix this sanely.
>
>IIRC, to fix the interrupt context bit, we need something like the
>following, which was Paul's idea, and he said he would talk to Dave M.
>about it. Did anything ever happen there?
I tend to hate anything that relies on in_interrupt() as they
are other contexts that will have in_interrupt() cleared but still
have the same limitations. Typically, anything on the VM path must
do either GFP_ATOMIC or GFP_NOIO allocations, wether it's running
at interrupt time or not.
Ben.
** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: consistent_free re-revisited
2002-09-12 14:51 ` Tom Rini
@ 2002-09-12 7:49 ` Benjamin Herrenschmidt
2002-09-12 15:23 ` Tom Rini
0 siblings, 1 reply; 10+ messages in thread
From: Benjamin Herrenschmidt @ 2002-09-12 7:49 UTC (permalink / raw)
To: Tom Rini; +Cc: Todd Poynor, linuxppc-dev
>> I tend to hate anything that relies on in_interrupt() as they
>> are other contexts that will have in_interrupt() cleared but still
>> have the same limitations. Typically, anything on the VM path must
>> do either GFP_ATOMIC or GFP_NOIO allocations, wether it's running
>> at interrupt time or not.
>
>The problem is that the atomic pool is limited, iirc.
Well, ATOMIC can fail, sure, but if you do GFP_KERNEL within a
VM code path, then be prepared for deadlocks.
A driver that would need such allocations during it's request
handling path would suck anyway and should pre-allocate ;) But
if this is the case, then it must be able to say that it wants
ATOMIC or NOIO allocations, and if the driver is done properly,
it should be able to deal gracefully with failure, typically
for a block driver by triggering a timer to try the request
again later or just fail the request (but that is generally
bad for a block driver).
Ben.
** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: consistent_free re-revisited
2002-09-12 15:23 ` Tom Rini
@ 2002-09-12 8:08 ` Benjamin Herrenschmidt
0 siblings, 0 replies; 10+ messages in thread
From: Benjamin Herrenschmidt @ 2002-09-12 8:08 UTC (permalink / raw)
To: Tom Rini; +Cc: Todd Poynor, linuxppc-dev
>Er, so you're objecting to existing GFP_KERNEL's then?
>
>Or is it too early and I need more coffee? :)
Existing GFP_KERNEL is no good at all, sure ;)
Ben.
** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: consistent_free re-revisited
2002-09-12 3:26 ` David Gibson
@ 2002-09-12 14:29 ` Tom Rini
2002-09-12 7:23 ` Benjamin Herrenschmidt
2002-09-12 17:05 ` Matt Porter
2002-09-12 21:33 ` Todd Poynor
1 sibling, 2 replies; 10+ messages in thread
From: Tom Rini @ 2002-09-12 14:29 UTC (permalink / raw)
To: Todd Poynor, linuxppc-dev
On Thu, Sep 12, 2002 at 01:26:40PM +1000, David Gibson wrote:
>
> On Wed, Sep 11, 2002 at 02:45:31PM -0700, Todd Poynor wrote:
> > The APIs for consistent_free have diverged between PPC and ARM. It has,
> > at least in the past, been a goal to keep the APIs in sync.
>
> That's the least of our worries - the current consistent_alloc() is
> badly broken. It breaks the DMA-mapping.txt rules, because it can't
> safely be called from interrupt context. I think we need generic
> changes (gfp masks to various functions) to fix this sanely.
IIRC, to fix the interrupt context bit, we need something like the
following, which was Paul's idea, and he said he would talk to Dave M.
about it. Did anything ever happen there?
But on the other hand, IIRC, the allocation can fail, and the driver has
to be able to deal with that. But almost none of them do.
> It would be nice to have the same interface as ARM, though - on the
> other hand the consistent_*() interface as such may be obsoleted by
> unified device model interfaces: I believe per-bus consistent memory
> interfaces are supposed to be in there.
Well, if we can have the same interface as ARM, wouldn't that go a ways
towards having the code already done for the unified device model? And
this is a 2.4 problem too.
[snip]
> Incidentally I think there are also cases where consistent_alloc()
> (both ours and ARM's, IIRC) could leak memory if failures occur at
> just the right point.
Do you remember what that point was?
--
Tom Rini (TR1265)
http://gate.crashing.org/~trini/
--- arch/ppc/mm/cachemap.c 2002/05/10 16:12:32 1.2
+++ arch/ppc/mm/cachemap.c 2002/07/12 19:01:00
@@ -68,9 +68,6 @@
struct vm_struct *area;
void *ret;
- if (in_interrupt())
- BUG();
-
/* Only allocate page size areas.
*/
size = PAGE_ALIGN(size);
--- mm/vmalloc.c 2002/05/09 00:26:12 1.1.1.1
+++ mm/vmalloc.c 2002/07/12 19:01:00
@@ -12,6 +12,7 @@
#include <linux/spinlock.h>
#include <linux/highmem.h>
#include <linux/smp_lock.h>
+#include <linux/interrupt.h>
#include <asm/uaccess.h>
#include <asm/pgalloc.h>
@@ -169,15 +170,16 @@
struct vm_struct * get_vm_area(unsigned long size, unsigned long flags)
{
- unsigned long addr;
+ unsigned long addr, lflags;
struct vm_struct **p, *tmp, *area;
- area = (struct vm_struct *) kmalloc(sizeof(*area), GFP_KERNEL);
+ area = (struct vm_struct *) kmalloc(sizeof(*area),
+ (in_interrupt() ? GFP_ATOMIC : GFP_KERNEL));
if (!area)
return NULL;
size += PAGE_SIZE;
addr = VMALLOC_START;
- write_lock(&vmlist_lock);
+ write_lock_irqsave(&vmlist_lock, lflags);
for (p = &vmlist; (tmp = *p) ; p = &tmp->next) {
if ((size + addr) < addr)
goto out;
@@ -192,11 +194,11 @@
area->size = size;
area->next = *p;
*p = area;
- write_unlock(&vmlist_lock);
+ write_unlock_irqrestore(&vmlist_lock, lflags);
return area;
out:
- write_unlock(&vmlist_lock);
+ write_unlock_irqrestore(&vmlist_lock, lflags);
kfree(area);
return NULL;
}
--- include/asm-ppc/pgalloc.h.orig Wed Aug 7 11:04:12 2002
+++ include/asm-ppc/pgalloc.h Wed Aug 7 11:09:02 2002
@@ -114,7 +114,8 @@
extern void *early_get_page(void);
if (mem_init_done)
- pte = (pte_t *) __get_free_page(GFP_KERNEL);
+ pte = (pte_t *) __get_free_page(
+ (in_interrupt() ? GFP_ATOMIC : GFP_KERNEL));
else
pte = (pte_t *) early_get_page();
if (pte != NULL)
** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: consistent_free re-revisited
2002-09-12 7:23 ` Benjamin Herrenschmidt
@ 2002-09-12 14:51 ` Tom Rini
2002-09-12 7:49 ` Benjamin Herrenschmidt
0 siblings, 1 reply; 10+ messages in thread
From: Tom Rini @ 2002-09-12 14:51 UTC (permalink / raw)
To: Benjamin Herrenschmidt; +Cc: Todd Poynor, linuxppc-dev
On Thu, Sep 12, 2002 at 09:23:15AM +0200, Benjamin Herrenschmidt wrote:
> >> That's the least of our worries - the current consistent_alloc() is
> >> badly broken. It breaks the DMA-mapping.txt rules, because it can't
> >> safely be called from interrupt context. I think we need generic
> >> changes (gfp masks to various functions) to fix this sanely.
> >
> >IIRC, to fix the interrupt context bit, we need something like the
> >following, which was Paul's idea, and he said he would talk to Dave M.
> >about it. Did anything ever happen there?
>
> I tend to hate anything that relies on in_interrupt() as they
> are other contexts that will have in_interrupt() cleared but still
> have the same limitations. Typically, anything on the VM path must
> do either GFP_ATOMIC or GFP_NOIO allocations, wether it's running
> at interrupt time or not.
The problem is that the atomic pool is limited, iirc.
--
Tom Rini (TR1265)
http://gate.crashing.org/~trini/
** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: consistent_free re-revisited
2002-09-12 7:49 ` Benjamin Herrenschmidt
@ 2002-09-12 15:23 ` Tom Rini
2002-09-12 8:08 ` Benjamin Herrenschmidt
0 siblings, 1 reply; 10+ messages in thread
From: Tom Rini @ 2002-09-12 15:23 UTC (permalink / raw)
To: Benjamin Herrenschmidt; +Cc: Todd Poynor, linuxppc-dev
On Thu, Sep 12, 2002 at 09:49:07AM +0200, Benjamin Herrenschmidt wrote:
> >> I tend to hate anything that relies on in_interrupt() as they
> >> are other contexts that will have in_interrupt() cleared but still
> >> have the same limitations. Typically, anything on the VM path must
> >> do either GFP_ATOMIC or GFP_NOIO allocations, wether it's running
> >> at interrupt time or not.
> >
> >The problem is that the atomic pool is limited, iirc.
>
> Well, ATOMIC can fail, sure, but if you do GFP_KERNEL within a
> VM code path, then be prepared for deadlocks.
Er, so you're objecting to existing GFP_KERNEL's then?
Or is it too early and I need more coffee? :)
--
Tom Rini (TR1265)
http://gate.crashing.org/~trini/
** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: consistent_free re-revisited
2002-09-12 14:29 ` Tom Rini
2002-09-12 7:23 ` Benjamin Herrenschmidt
@ 2002-09-12 17:05 ` Matt Porter
1 sibling, 0 replies; 10+ messages in thread
From: Matt Porter @ 2002-09-12 17:05 UTC (permalink / raw)
To: Tom Rini; +Cc: Todd Poynor, linuxppc-dev
On Thu, Sep 12, 2002 at 07:29:45AM -0700, Tom Rini wrote:
> IIRC, to fix the interrupt context bit, we need something like the
> following, which was Paul's idea, and he said he would talk to Dave M.
> about it. Did anything ever happen there?
<snip>
In addition to that, you need to scrub the entire consistent_alloc
path...that includes map_page()->pte_alloc_kernel() which may
do a GFP_KERNEL allocation.
Then it's more complex yet with Book E's need to have those possible
page table allocations from the pinned TLB region.
Regards,
--
Matt Porter
porter@cox.net
This is Linux Country. On a quiet night, you can hear Windows reboot.
** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: consistent_free re-revisited
2002-09-12 3:26 ` David Gibson
2002-09-12 14:29 ` Tom Rini
@ 2002-09-12 21:33 ` Todd Poynor
1 sibling, 0 replies; 10+ messages in thread
From: Todd Poynor @ 2002-09-12 21:33 UTC (permalink / raw)
To: David Gibson; +Cc: linuxppc-dev
David Gibson wrote:
>>Furthermore, consistent_alloc does not fill out vm_struct fields for the
>>VM area that map it to the struct pages, so consistent_free can't use
>>those to derive the physical pages. And neither can vfree(), which is
>>what consistent_free() calls to do its work. And so consistent_free()
>>does not free the physical pages allocated by consistent_alloc()...
>
> Erm... vfree() walks the page tables to find the pages to free, which
> should work fine.
Been curious about this... vfree() walks the page tables to find PTE
entries to unmap (modifying the PTE entry). But its code to call
__free_page() on the physical pages for the area (the deallocate_pages
condition in __vunmap()) is not being executed (because area->nr_pages
== 0), and it seems likely this is needed.
After modifying consistent_alloc() to fill out the area->pages array and
set area->nr_pages, __vunmap() will then call __free_page() on those
pages. Whereupon:
vaddr = consistent_alloc(gfp, size, &physaddr);
consistent_free(vaddr);
vaddr = consistent_alloc(gfp, size, &physaddr);
then allocates the same physaddr both times, which is not currently the
case. There is also an overall difference in memory usage in the
neighborhood of the allocated buffer size between the 2 schemes (per
free(1) at a shell prompt), but I wouldn't swear to it...
--
Todd
** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2002-09-12 21:33 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2002-09-11 21:45 consistent_free re-revisited Todd Poynor
2002-09-12 3:26 ` David Gibson
2002-09-12 14:29 ` Tom Rini
2002-09-12 7:23 ` Benjamin Herrenschmidt
2002-09-12 14:51 ` Tom Rini
2002-09-12 7:49 ` Benjamin Herrenschmidt
2002-09-12 15:23 ` Tom Rini
2002-09-12 8:08 ` Benjamin Herrenschmidt
2002-09-12 17:05 ` Matt Porter
2002-09-12 21:33 ` Todd Poynor
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).