linuxppc-dev.lists.ozlabs.org archive mirror
 help / color / mirror / Atom feed
* consistent_free()
@ 2002-06-14  4:29 David Gibson
  2002-06-14  5:57 ` consistent_free() David Gibson
  2002-06-14 15:39 ` consistent_free() Tom Rini
  0 siblings, 2 replies; 20+ messages in thread
From: David Gibson @ 2002-06-14  4:29 UTC (permalink / raw)
  To: linuxppc-embedded; +Cc: Paul Mackerras


In attempting to make consistent_alloc/free() work sensibly on
processors which are cache coherent I ran into a problem.
consistent_free() doesn't take a size argument.  We don't need it in
the case of not cache coherent processors - in that case
consistent_alloc() sets up a vm_area() so there's enough information
to get the size.  However for cache coherent processors we probably
want consistent_alloc() to degenerate to __get_free_pages(), in which
case consistent_free() must degenerate to free_pages(), which takes a
size argument.

I suggest we change consistent_free() to take the virtual addresss,
size and the physical address (dma_addr_t), which will make our
consistent_free() match the one on ARM.  I know we don't need the
third argument in any existing situation.

Patch coming...

--
David Gibson			| For every complex problem there is a
david@gibson.dropbear.id.au	| solution which is simple, neat and
				| wrong.  -- H.L. Mencken
http://www.ozlabs.org/people/dgibson

** Sent via the linuxppc-embedded mail list. See http://lists.linuxppc.org/

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: consistent_free()
  2002-06-14  4:29 consistent_free() David Gibson
@ 2002-06-14  5:57 ` David Gibson
  2002-06-24  2:15   ` consistent_free() David Gibson
  2002-06-14 15:39 ` consistent_free() Tom Rini
  1 sibling, 1 reply; 20+ messages in thread
From: David Gibson @ 2002-06-14  5:57 UTC (permalink / raw)
  To: linuxppc-embedded, Paul Mackerras


On Fri, Jun 14, 2002 at 02:29:28PM +1000, David Gibson wrote:
>
> In attempting to make consistent_alloc/free() work sensibly on
> processors which are cache coherent I ran into a problem.
> consistent_free() doesn't take a size argument.  We don't need it in
> the case of not cache coherent processors - in that case
> consistent_alloc() sets up a vm_area() so there's enough information
> to get the size.  However for cache coherent processors we probably
> want consistent_alloc() to degenerate to __get_free_pages(), in which
> case consistent_free() must degenerate to free_pages(), which takes a
> size argument.
>
> I suggest we change consistent_free() to take the virtual addresss,
> size and the physical address (dma_addr_t), which will make our
> consistent_free() match the one on ARM.  I know we don't need the
> third argument in any existing situation.
>
> Patch coming...

As promised...

This boots up fine on my EP405PC board, and I'm sending this mail from
my TiBook running 2_4_devel with this patch and also the 40x large
page PMD patch.

diff -urN /home/dgibson/kernel/linuxppc_2_4_devel/arch/ppc/kernel/pci-dma.c linux-grinch/arch/ppc/kernel/pci-dma.c
--- /home/dgibson/kernel/linuxppc_2_4_devel/arch/ppc/kernel/pci-dma.c	Fri Jun 14 07:57:36 2002
+++ linux-grinch/arch/ppc/kernel/pci-dma.c	Fri Jun 14 14:30:36 2002
@@ -25,28 +25,16 @@

 	if (hwdev == NULL || hwdev->dma_mask != 0xffffffff)
 		gfp |= GFP_DMA;
-
-#ifdef CONFIG_NOT_COHERENT_CACHE
 	ret = consistent_alloc(gfp, size, dma_handle);
-#else
-	ret = (void *)__get_free_pages(gfp, get_order(size));
-#endif

-	if (ret != NULL) {
+	if (ret != NULL)
 		memset(ret, 0, size);
-#ifndef CONFIG_NOT_COHERENT_CACHE
-		*dma_handle = virt_to_bus(ret);
-#endif
-	}
+
 	return ret;
 }

 void pci_free_consistent(struct pci_dev *hwdev, size_t size,
 			 void *vaddr, dma_addr_t dma_handle)
 {
-#ifdef CONFIG_NOT_COHERENT_CACHE
-	consistent_free(vaddr);
-#else
-	free_pages((unsigned long)vaddr, get_order(size));
-#endif
+	consistent_free(vaddr, size, dma_handle);
 }
diff -urN /home/dgibson/kernel/linuxppc_2_4_devel/arch/ppc/kernel/ppc4xx_dma.c linux-grinch/arch/ppc/kernel/ppc4xx_dma.c
--- /home/dgibson/kernel/linuxppc_2_4_devel/arch/ppc/kernel/ppc4xx_dma.c	Fri Jun  7 15:26:47 2002
+++ linux-grinch/arch/ppc/kernel/ppc4xx_dma.c	Fri Jun 14 14:36:12 2002
@@ -457,6 +457,7 @@
 free_dma_handle(sgl_handle_t handle)
 {
 	sgl_list_info_t *psgl = (sgl_list_info_t *) handle;
+	dma_addr_t dma_addr;

 	if (!handle) {
 #ifdef DEBUG_4xxDMA
@@ -475,7 +476,8 @@
 		return;
 	}

-	consistent_free((void *) psgl);
+	dma_addr = psql->dma_addr;
+	consistent_free((void *) psgl, DMA_PPC4xx_SIZE, dma_addr);
 }

 EXPORT_SYMBOL(hw_init_dma_channel);
diff -urN /home/dgibson/kernel/linuxppc_2_4_devel/arch/ppc/mm/cachemap.c linux-grinch/arch/ppc/mm/cachemap.c
--- /home/dgibson/kernel/linuxppc_2_4_devel/arch/ppc/mm/cachemap.c	Thu Jun 13 13:40:03 2002
+++ linux-grinch/arch/ppc/mm/cachemap.c	Fri Jun 14 15:54:36 2002
@@ -121,7 +121,7 @@
 /*
  * free page(s) as defined by the above mapping.
  */
-void consistent_free(void *vaddr)
+void consistent_free(void *vaddr, size_t size, dma_addr_t handle)
 {
 	if (in_interrupt())
 		BUG();
diff -urN /home/dgibson/kernel/linuxppc_2_4_devel/drivers/net/ibm_ocp/ibm_ocp_enet.c linux-grinch/drivers/net/ibm_ocp/ibm_ocp_enet.c
--- /home/dgibson/kernel/linuxppc_2_4_devel/drivers/net/ibm_ocp/ibm_ocp_enet.c	Thu Jun 13 10:18:47 2002
+++ linux-grinch/drivers/net/ibm_ocp/ibm_ocp_enet.c	Fri Jun 14 14:39:09 2002
@@ -1185,8 +1185,9 @@
 	/*
 	 * Unmap the non cached memory space.
 	 */
-	consistent_free((void *) tx_virt_addr);
-	consistent_free((void *) rx_virt_addr);
+
+	consistent_free(tx_virt_addr, PAGE_SIZE * emac_max, tx_phys_addr);
+	consistent_free(rx_virt_addr, PAGE_SIZE * emac_max, rx_phys_addr);

 }

diff -urN /home/dgibson/kernel/linuxppc_2_4_devel/include/asm-ppc/io.h linux-grinch/include/asm-ppc/io.h
--- /home/dgibson/kernel/linuxppc_2_4_devel/include/asm-ppc/io.h	Sat May 11 02:02:08 2002
+++ linux-grinch/include/asm-ppc/io.h	Fri Jun 14 14:54:20 2002
@@ -436,7 +436,7 @@
  * to ensure it is consistent.
  */
 extern void *consistent_alloc(int gfp, size_t size, dma_addr_t *handle);
-extern void consistent_free(void *vaddr);
+extern void consistent_free(void *vaddr, size_t size, dma_addr_t handle);
 extern void consistent_sync(void *vaddr, size_t size, int rw);
 extern void consistent_sync_page(struct page *page, unsigned long offset,
 				 size_t size, int rw);
@@ -458,8 +458,17 @@
 #define dma_cache_wback(_start,_size)		do { } while (0)
 #define dma_cache_wback_inv(_start,_size)	do { } while (0)

-#define consistent_alloc(gfp, size, handle)	NULL
-#define consistent_free(addr, size)		do { } while (0)
+static inline void *consistent_alloc(int gfp, size_t size, dma_addr_t *dma_handle)
+{
+	void *vaddr;
+
+	vaddr = (void *)__get_free_pages(gfp, get_order(size));
+	if (vaddr)
+		*dma_handle = virt_to_bus(vaddr);
+	return vaddr;
+}
+
+#define consistent_free(addr, size, dmaddr)	free_pages((unsigned long)addr, get_order(size))
 #define consistent_sync(addr, size, rw)		do { } while (0)
 #define consistent_sync_page(pg, off, sz, rw)	do { } while (0)




--
David Gibson			| For every complex problem there is a
david@gibson.dropbear.id.au	| solution which is simple, neat and
				| wrong.  -- H.L. Mencken
http://www.ozlabs.org/people/dgibson

** Sent via the linuxppc-embedded mail list. See http://lists.linuxppc.org/

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: consistent_free()
  2002-06-14  4:29 consistent_free() David Gibson
  2002-06-14  5:57 ` consistent_free() David Gibson
@ 2002-06-14 15:39 ` Tom Rini
  2002-06-14 16:44   ` consistent_free() Dan Malek
  2002-06-15  6:02   ` consistent_free() Paul Mackerras
  1 sibling, 2 replies; 20+ messages in thread
From: Tom Rini @ 2002-06-14 15:39 UTC (permalink / raw)
  To: linuxppc-embedded, Paul Mackerras, Dan Malek, David Gibson


On Fri, Jun 14, 2002 at 02:29:28PM +1000, David Gibson wrote:

> In attempting to make consistent_alloc/free() work sensibly on
> processors which are cache coherent I ran into a problem.

My only concern is that is it a good idea to make a 'consistent_alloc'
and 'consistent_free' functions for  cache-cohernet processors, which
aren't doing what the name implies?  The only possible caller of these
are pci_alloc_consistent and pci_free_consistent, in a cache-cohernet
processor.

[snip]
> I suggest we change consistent_free() to take the virtual addresss,
> size and the physical address (dma_addr_t), which will make our
> consistent_free() match the one on ARM.  I know we don't need the
> third argument in any existing situation.

I wonder if ARM couldn't just call vfree() like we do..

In fact, I wonder why consistent_alloc/free seem to have some minor
differences (__get_free_pages vs alloc_pages seems to be the only real
difference aside from style things).  Dan?  Can we get some more insight
into the workings of your mind? :)

--
Tom Rini (TR1265)
http://gate.crashing.org/~trini/

** Sent via the linuxppc-embedded mail list. See http://lists.linuxppc.org/

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: consistent_free()
  2002-06-14 15:39 ` consistent_free() Tom Rini
@ 2002-06-14 16:44   ` Dan Malek
  2002-06-14 17:10     ` consistent_free() Tom Rini
  2002-06-15  6:11     ` consistent_free() Paul Mackerras
  2002-06-15  6:02   ` consistent_free() Paul Mackerras
  1 sibling, 2 replies; 20+ messages in thread
From: Dan Malek @ 2002-06-14 16:44 UTC (permalink / raw)
  To: Tom Rini; +Cc: linuxppc-embedded, Paul Mackerras, David Gibson


Tom Rini wrote:


> My only concern is that is it a good idea to make a 'consistent_alloc'
> and 'consistent_free' functions for  cache-cohernet processors, which
> aren't doing what the name implies?

I don't like it either, but that doesn't seem to matter these days.
In light of the recent discussions about clarity of code, I think the
changes did nothing but make it harder to understand and added nothing
to the value of the functions.

The original purpose of the consistent_alloc() functions was to provide
uncached memory on noncoherent processors that have integrated (or non-PCI)
peripherals.  It just so happened to work (mostly) with the same type of
processors that had a PCI bus, so it was convenient to adapt the PCI
functions to use these.


> I wonder if ARM couldn't just call vfree() like we do..

I've been tracking that, and it may be due to a useful enhancement
they made to consistent_alloc().

The other issue here is these functions are somewhat architecture
specific, but they are expected to work the same way across all platforms.
By keeping similar code in all of these functions, and the same
calling parameters, it is easy to maintain this compatibility.

These functions are not replacements for the pci_* functions, but they
can be used when appropriate.  I thought the way I originally implemented
these functions made it obvious, especially on PowerPC where we have
both coherent and noncoherent caches.

> In fact, I wonder why consistent_alloc/free seem to have some minor
> differences (__get_free_pages vs alloc_pages seems to be the only real
> difference aside from style things).

The newer ARM version only allocates the number of pages necessary to
cover the request.  The old version, and the one in PowerPC right now,
would just use __get_free_pages, which was a power-of-two allocator.
You could end up wasting lots of memory if you weren't allocating power of
two sized buffers.  There were a couple of implementation iterations
to make this work, and I guess this is where ARM ended up.  I think
one version really wanted that page struct, but now it doesn't really matter.

The original use of these functions was for the integrated USB controllers,
so I suspect if anyone tries that on the 8xx today, they will break.  I
don't know how many other peripherals use these functions, except for the
4xx and 8xx that we use locally (and will obviously still work).

> ....  Dan?  Can we get some more insight
> into the workings of your mind? :)

I've been watching the ARM updates and plan to move their changes into
the PowerPC sources.  I don't know why they don't call vfree() directly,
I thought it did the proper clean up.  In any case we will have to adopt
the number of calling parameters, and just ignore the ones we don't need.


	-- Dan


** Sent via the linuxppc-embedded mail list. See http://lists.linuxppc.org/

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: consistent_free()
  2002-06-14 16:44   ` consistent_free() Dan Malek
@ 2002-06-14 17:10     ` Tom Rini
  2002-06-14 21:34       ` consistent_free() Dan Malek
  2002-06-15  6:11     ` consistent_free() Paul Mackerras
  1 sibling, 1 reply; 20+ messages in thread
From: Tom Rini @ 2002-06-14 17:10 UTC (permalink / raw)
  To: Dan Malek; +Cc: linuxppc-embedded, Paul Mackerras, David Gibson


On Fri, Jun 14, 2002 at 12:44:03PM -0400, Dan Malek wrote:

[snip]
> The newer ARM version only allocates the number of pages necessary to
> cover the request.  The old version, and the one in PowerPC right now,
> would just use __get_free_pages, which was a power-of-two allocator.
> You could end up wasting lots of memory if you weren't allocating power of
> two sized buffers.  There were a couple of implementation iterations
> to make this work, and I guess this is where ARM ended up.  I think
> one version really wanted that page struct, but now it doesn't really
> matter.

I ended up noticing that myself, and have things almost right, I think
(patch shortly, if so..)

> >....  Dan?  Can we get some more insight
> >into the workings of your mind? :)
>
> I've been watching the ARM updates and plan to move their changes into
> the PowerPC sources.  I don't know why they don't call vfree() directly,
> I thought it did the proper clean up.

I'm not sure either.  __iounmap on ARM just is vfree() with the casts.

--
Tom Rini (TR1265)
http://gate.crashing.org/~trini/

** Sent via the linuxppc-embedded mail list. See http://lists.linuxppc.org/

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: consistent_free()
  2002-06-14 17:10     ` consistent_free() Tom Rini
@ 2002-06-14 21:34       ` Dan Malek
  0 siblings, 0 replies; 20+ messages in thread
From: Dan Malek @ 2002-06-14 21:34 UTC (permalink / raw)
  To: Tom Rini; +Cc: linuxppc-embedded, Paul Mackerras, David Gibson


Tom Rini wrote:

> I'm not sure either.  __iounmap on ARM just is vfree() with the casts.

The issue is in consistent_free() they forcibly reset the Reserved
indicator on the pages.  I don't know why this is necessary.  It's
either due to the way they remap the pages during the allocation or
some change in the Linux VM that I haven't yet discovered :-)


	-- Dan


** Sent via the linuxppc-embedded mail list. See http://lists.linuxppc.org/

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: consistent_free()
  2002-06-14 15:39 ` consistent_free() Tom Rini
  2002-06-14 16:44   ` consistent_free() Dan Malek
@ 2002-06-15  6:02   ` Paul Mackerras
  2002-06-15  6:27     ` consistent_free() Dan Malek
  1 sibling, 1 reply; 20+ messages in thread
From: Paul Mackerras @ 2002-06-15  6:02 UTC (permalink / raw)
  To: Tom Rini; +Cc: linuxppc-embedded, Dan Malek, David Gibson


Tom Rini writes:

> My only concern is that is it a good idea to make a 'consistent_alloc'
> and 'consistent_free' functions for  cache-cohernet processors, which
> aren't doing what the name implies?  The only possible caller of these
> are pci_alloc_consistent and pci_free_consistent, in a cache-cohernet
> processor.

"Consistent" means "the cpu and the device see the same stuff in
memory".  On a cache-coherent system that is true of all normal
memory.  Surely that is obvious?

Paul.

** Sent via the linuxppc-embedded mail list. See http://lists.linuxppc.org/

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: consistent_free()
  2002-06-14 16:44   ` consistent_free() Dan Malek
  2002-06-14 17:10     ` consistent_free() Tom Rini
@ 2002-06-15  6:11     ` Paul Mackerras
  2002-06-15  6:42       ` consistent_free() Dan Malek
  1 sibling, 1 reply; 20+ messages in thread
From: Paul Mackerras @ 2002-06-15  6:11 UTC (permalink / raw)
  To: Dan Malek; +Cc: Tom Rini, linuxppc-embedded, David Gibson


Dan Malek writes:

> The newer ARM version only allocates the number of pages necessary to
> cover the request.  The old version, and the one in PowerPC right now,
> would just use __get_free_pages, which was a power-of-two allocator.

Interesting.  Using __get_free_pages guarantees that the pages are
physically contiguous.  Does the new ARM version do something clever
to get physically contiguous pages?  Or do the callers have to do a
scatter/gather type of operation if they are using more than 1 page?

Paul.

** Sent via the linuxppc-embedded mail list. See http://lists.linuxppc.org/

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: consistent_free()
  2002-06-15  6:02   ` consistent_free() Paul Mackerras
@ 2002-06-15  6:27     ` Dan Malek
  2002-06-15  6:57       ` consistent_free() David Gibson
  0 siblings, 1 reply; 20+ messages in thread
From: Dan Malek @ 2002-06-15  6:27 UTC (permalink / raw)
  To: Paul Mackerras; +Cc: Tom Rini, linuxppc-embedded, David Gibson


Paul Mackerras wrote:

> "Consistent" means "the cpu and the device see the same stuff in
> memory".  On a cache-coherent system that is true of all normal
> memory.  Surely that is obvious?

Not really.  You have seen enough messages about people wanting to
get "uncached" memory, even though this isn't necessary.

We also don't want to be remapping 'consistent' space on those
processors that don't require it, one of the primary functions
of consistent_alloc() so it works properly on noncoherent processors.

Basically, David Gibson moved an 'ifdef' from one place to another,
and then had to write a consistent_alloc function for processors that
didn't need it before, placing more code in an include file no less.
How does that add value to the kernel?  It just complicated the
maintenance of a function that is similar across multiple architectures.

A more logical thing to do would be to have pci_consistent() always
allocate the pages, and in the case of noncoherent processors perform
the additional mapping step.  Then, have consistent_alloc() just call
pci_consistent() with a NULL hardware device pointer.


	-- Dan


** Sent via the linuxppc-embedded mail list. See http://lists.linuxppc.org/

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: consistent_free()
  2002-06-15  6:11     ` consistent_free() Paul Mackerras
@ 2002-06-15  6:42       ` Dan Malek
  2002-06-15 10:02         ` consistent_free() Paul Mackerras
  0 siblings, 1 reply; 20+ messages in thread
From: Dan Malek @ 2002-06-15  6:42 UTC (permalink / raw)
  To: Paul Mackerras; +Cc: Tom Rini, linuxppc-embedded, David Gibson


Paul Mackerras wrote:

> Interesting.  Using __get_free_pages guarantees that the pages are
> physically contiguous.  Does the new ARM version do something clever
> to get physically contiguous pages?

The  __get_free_pages just does an alloc_pages().  The only difference
is __get_free_pages returns the page address of the first page.  I
think there was a previous version that used page structs to return
unused pages so it was just more convenient to call alloc_pages() directly.
In fact, I guess this is the current implementation.

Oh, I see now........they set the Reserved indicator on pages they allocate.
This is why they can't just vfree() the space anymore.  Pretty sneaky, this
way an application can also mmap() the space and driver can do DMA from
user space.  What a f**king hack when you don't have the features you need
to do it correctly.

 > ....  Or do the callers have to do a
> scatter/gather type of operation if they are using more than 1 page?

No, you still get consecutive physical pages based on the order parameter.



	-- Dan


** Sent via the linuxppc-embedded mail list. See http://lists.linuxppc.org/

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: consistent_free()
  2002-06-15  6:27     ` consistent_free() Dan Malek
@ 2002-06-15  6:57       ` David Gibson
  0 siblings, 0 replies; 20+ messages in thread
From: David Gibson @ 2002-06-15  6:57 UTC (permalink / raw)
  To: Dan Malek; +Cc: Paul Mackerras, Tom Rini, linuxppc-embedded


On Sat, Jun 15, 2002 at 02:27:53AM -0400, Dan Malek wrote:
>
> Paul Mackerras wrote:
>
> >"Consistent" means "the cpu and the device see the same stuff in
> >memory".  On a cache-coherent system that is true of all normal
> >memory.  Surely that is obvious?
>
> Not really.  You have seen enough messages about people wanting to
> get "uncached" memory, even though this isn't necessary.
>
> We also don't want to be remapping 'consistent' space on those
> processors that don't require it, one of the primary functions
> of consistent_alloc() so it works properly on noncoherent processors.

We don't.

> Basically, David Gibson moved an 'ifdef' from one place to another,
> and then had to write a consistent_alloc function for processors that
> didn't need it before, placing more code in an include file no less.
> How does that add value to the kernel?  It just complicated the
> maintenance of a function that is similar across multiple architectures.

Oh fer cryin' out loud.  Before any non-PCI driver that wanted DMA
consistent memory and wanted to work on both cache-coherent and
non-cache coherent would have to have an ifdef on
CONFIG_NOT_COHERENT_CACHE.  If it was on, it would have to use
consistent_alloc(), other wise it could just allocate the memory
normally (with kmalloc() or get_free_pages()) since consistency comes
for free on cache coherent processors.

I just made consistent_alloc() do for cache coherent processors
exactly what the driver would have to do anyway if
CONFIG_NOT_COHERENT_CACHE were not set.  The cache coherent
implementation of consistent_alloc() is utterly trivial, and does not
remapping.  If we really don't want to be calling consistent_alloc()
on cache coherent processors it should BUG() or cause a link error,
rather than just failing.

Admittedly pci_alloc_consistent() was (so far) the only place this
happened, because every other user of consistent_alloc() was basically
in a driver which would only ever be used on non cache coherent
processors.

--
David Gibson			| For every complex problem there is a
david@gibson.dropbear.id.au	| solution which is simple, neat and
				| wrong.  -- H.L. Mencken
http://www.ozlabs.org/people/dgibson

** Sent via the linuxppc-embedded mail list. See http://lists.linuxppc.org/

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: consistent_free()
  2002-06-15  6:42       ` consistent_free() Dan Malek
@ 2002-06-15 10:02         ` Paul Mackerras
  2002-06-15 13:51           ` consistent_free() Dan Malek
  0 siblings, 1 reply; 20+ messages in thread
From: Paul Mackerras @ 2002-06-15 10:02 UTC (permalink / raw)
  To: Dan Malek; +Cc: Tom Rini, linuxppc-embedded, David Gibson


Dan Malek writes:

> > Interesting.  Using __get_free_pages guarantees that the pages are
> > physically contiguous.  Does the new ARM version do something clever
> > to get physically contiguous pages?
>
> The  __get_free_pages just does an alloc_pages().  The only difference
> is __get_free_pages returns the page address of the first page.  I
> think there was a previous version that used page structs to return
> unused pages so it was just more convenient to call alloc_pages() directly.
> In fact, I guess this is the current implementation.

So it still allocates 2^n pages for some n?  That makes sense to me.
Your original comment:

> The newer ARM version only allocates the number of pages necessary to
> cover the request.  The old version, and the one in PowerPC right now,
> would just use __get_free_pages, which was a power-of-two allocator.

implied to me that if you wanted 40kB it only allocated 10 pages, not
16, but in fact it sounds like the ARM implementation will actually
allocate 16 pages too.

Paul.

** Sent via the linuxppc-embedded mail list. See http://lists.linuxppc.org/

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: consistent_free()
  2002-06-15 10:02         ` consistent_free() Paul Mackerras
@ 2002-06-15 13:51           ` Dan Malek
  0 siblings, 0 replies; 20+ messages in thread
From: Dan Malek @ 2002-06-15 13:51 UTC (permalink / raw)
  To: Paul Mackerras; +Cc: Tom Rini, linuxppc-embedded, David Gibson


Paul Mackerras wrote:

> implied to me that if you wanted 40kB it only allocated 10 pages, not
> 16, but in fact it sounds like the ARM implementation will actually
> allocate 16 pages too.

The current ARM implementation (that I want to copy again) :-) will
first allocate the 16 pages using alloc_pages().  It will then return
six pages from the end of the allocation back to the free pool.  This
is the only way you can get an arbitrary number of contiguous pages.


	-- Dan


** Sent via the linuxppc-embedded mail list. See http://lists.linuxppc.org/

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: consistent_free()
  2002-06-14  5:57 ` consistent_free() David Gibson
@ 2002-06-24  2:15   ` David Gibson
  2002-06-25 14:39     ` consistent_free() Tom Rini
  0 siblings, 1 reply; 20+ messages in thread
From: David Gibson @ 2002-06-24  2:15 UTC (permalink / raw)
  To: linuxppc-embedded, Paul Mackerras


On Fri, Jun 14, 2002 at 03:57:11PM +1000, David Gibson wrote:
>
> On Fri, Jun 14, 2002 at 02:29:28PM +1000, David Gibson wrote:
> >
> > In attempting to make consistent_alloc/free() work sensibly on
> > processors which are cache coherent I ran into a problem.
> > consistent_free() doesn't take a size argument.  We don't need it in
> > the case of not cache coherent processors - in that case
> > consistent_alloc() sets up a vm_area() so there's enough information
> > to get the size.  However for cache coherent processors we probably
> > want consistent_alloc() to degenerate to __get_free_pages(), in which
> > case consistent_free() must degenerate to free_pages(), which takes a
> > size argument.
> >
> > I suggest we change consistent_free() to take the virtual addresss,
> > size and the physical address (dma_addr_t), which will make our
> > consistent_free() match the one on ARM.  I know we don't need the
> > third argument in any existing situation.
> >
> > Patch coming...
>
> As promised...
>
> This boots up fine on my EP405PC board, and I'm sending this mail from
> my TiBook running 2_4_devel with this patch and also the 40x large
> page PMD patch.

Again, silence reigns.  Anyone who would object to this being applied
to the linuxppc-2.5 tree, speak up now.

--
David Gibson			| For every complex problem there is a
david@gibson.dropbear.id.au	| solution which is simple, neat and
				| wrong.  -- H.L. Mencken
http://www.ozlabs.org/people/dgibson

** Sent via the linuxppc-embedded mail list. See http://lists.linuxppc.org/

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: consistent_free()
  2002-06-24  2:15   ` consistent_free() David Gibson
@ 2002-06-25 14:39     ` Tom Rini
  2002-06-26  5:17       ` consistent_free() David Gibson
  0 siblings, 1 reply; 20+ messages in thread
From: Tom Rini @ 2002-06-25 14:39 UTC (permalink / raw)
  To: linuxppc-embedded, Paul Mackerras, David Gibson


On Mon, Jun 24, 2002 at 12:15:38PM +1000, David Gibson wrote:
>
> On Fri, Jun 14, 2002 at 03:57:11PM +1000, David Gibson wrote:
> >
> > On Fri, Jun 14, 2002 at 02:29:28PM +1000, David Gibson wrote:
> > >
> > > In attempting to make consistent_alloc/free() work sensibly on
> > > processors which are cache coherent I ran into a problem.
> > > consistent_free() doesn't take a size argument.  We don't need it in
> > > the case of not cache coherent processors - in that case
> > > consistent_alloc() sets up a vm_area() so there's enough information
> > > to get the size.  However for cache coherent processors we probably
> > > want consistent_alloc() to degenerate to __get_free_pages(), in which
> > > case consistent_free() must degenerate to free_pages(), which takes a
> > > size argument.
> > >
> > > I suggest we change consistent_free() to take the virtual addresss,
> > > size and the physical address (dma_addr_t), which will make our
> > > consistent_free() match the one on ARM.  I know we don't need the
> > > third argument in any existing situation.
> > >
> > > Patch coming...
> >
> > As promised...
> >
> > This boots up fine on my EP405PC board, and I'm sending this mail from
> > my TiBook running 2_4_devel with this patch and also the 40x large
> > page PMD patch.
>
> Again, silence reigns.  Anyone who would object to this being applied
> to the linuxppc-2.5 tree, speak up now.

If we're going to touch this, can you look at the current ARM version of
it and grab all of the goodies they've got now?  I almost had this going
(and I can shoot you the patch off the list if you like) but haven't had
time to go back to it.

--
Tom Rini (TR1265)
http://gate.crashing.org/~trini/

** Sent via the linuxppc-embedded mail list. See http://lists.linuxppc.org/

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: consistent_free()
  2002-06-25 14:39     ` consistent_free() Tom Rini
@ 2002-06-26  5:17       ` David Gibson
  2002-06-26  5:33         ` consistent_free() Dan Malek
  2002-06-26 14:32         ` consistent_free() Paul Mackerras
  0 siblings, 2 replies; 20+ messages in thread
From: David Gibson @ 2002-06-26  5:17 UTC (permalink / raw)
  To: Tom Rini; +Cc: linuxppc-embedded, Paul Mackerras


On Tue, Jun 25, 2002 at 07:39:37AM -0700, Tom Rini wrote:
>
> On Mon, Jun 24, 2002 at 12:15:38PM +1000, David Gibson wrote:
> >
> > On Fri, Jun 14, 2002 at 03:57:11PM +1000, David Gibson wrote:
> > >
> > > On Fri, Jun 14, 2002 at 02:29:28PM +1000, David Gibson wrote:
> > > >
> > > > In attempting to make consistent_alloc/free() work sensibly on
> > > > processors which are cache coherent I ran into a problem.
> > > > consistent_free() doesn't take a size argument.  We don't need it in
> > > > the case of not cache coherent processors - in that case
> > > > consistent_alloc() sets up a vm_area() so there's enough information
> > > > to get the size.  However for cache coherent processors we probably
> > > > want consistent_alloc() to degenerate to __get_free_pages(), in which
> > > > case consistent_free() must degenerate to free_pages(), which takes a
> > > > size argument.
> > > >
> > > > I suggest we change consistent_free() to take the virtual addresss,
> > > > size and the physical address (dma_addr_t), which will make our
> > > > consistent_free() match the one on ARM.  I know we don't need the
> > > > third argument in any existing situation.
> > > >
> > > > Patch coming...
> > >
> > > As promised...
> > >
> > > This boots up fine on my EP405PC board, and I'm sending this mail from
> > > my TiBook running 2_4_devel with this patch and also the 40x large
> > > page PMD patch.
> >
> > Again, silence reigns.  Anyone who would object to this being applied
> > to the linuxppc-2.5 tree, speak up now.
>
> If we're going to touch this, can you look at the current ARM version of
> it and grab all of the goodies they've got now?  I almost had this going
> (and I can shoot you the patch off the list if you like) but haven't had
> time to go back to it.

Well, I had a look at the ARM version and now I'm a bit confused.  As
far as I can tell it does two things differently from us:
	1) After making the allocation rounded up to the appropriate
order it frees any leftover pages.  That certainly seems worthwhile.
	2) It sets PageReserved on each struct page that it
allocates/remaps.

(2) seems very strange.  It doesn't seem in keeping with the meaning
of PageReserved (well my best guess at the meaning from only slightly
illuminating comments in page.h). Apparently it's so that
remap_page_range() works - as indeed it wouldn't without this because
of a test in remap_pte_range() which that calls.  But that test looks
to be precisely inverted from what it should be.  Which makes me
wonder how the hell anything works now, since remap_page_range() is
apparently called from several places.

--
David Gibson			| For every complex problem there is a
david@gibson.dropbear.id.au	| solution which is simple, neat and
				| wrong.  -- H.L. Mencken
http://www.ozlabs.org/people/dgibson

** Sent via the linuxppc-embedded mail list. See http://lists.linuxppc.org/

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: consistent_free()
  2002-06-26  5:17       ` consistent_free() David Gibson
@ 2002-06-26  5:33         ` Dan Malek
  2002-06-26  5:59           ` consistent_free() David Gibson
  2002-06-26 14:32         ` consistent_free() Paul Mackerras
  1 sibling, 1 reply; 20+ messages in thread
From: Dan Malek @ 2002-06-26  5:33 UTC (permalink / raw)
  To: David Gibson; +Cc: Tom Rini, linuxppc-embedded, Paul Mackerras


David Gibson wrote:

> (2) seems very strange.  It doesn't seem in keeping with the meaning
> of PageReserved (well my best guess at the meaning from only slightly
> illuminating comments in page.h).

It looks strange to me as well, and wouldn't do that part.  I called
map_page() directly to avoid doing this, and in return we can just
call vfree() to get rid of the space.

> .... Which makes me
> wonder how the hell anything works now, since remap_page_range() is
> apparently called from several places.

I could only find a few places where it is called, and it seems to always
be called on pages that have been previously reserved (kernel ram or I/O space).
One of the advantages of making the page reserved in this case is you can
then mmap() it from a user application and get DMA to/from user space to work.
Normally, an mmap() from user space on memory gets you new, zeroed pages.

My only concern about not marking the pages reserved is ensuring they are not
eligible for swapping.  That would kind of suck if it happened :-)


	-- Dan


** Sent via the linuxppc-embedded mail list. See http://lists.linuxppc.org/

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: consistent_free()
  2002-06-26  5:33         ` consistent_free() Dan Malek
@ 2002-06-26  5:59           ` David Gibson
  0 siblings, 0 replies; 20+ messages in thread
From: David Gibson @ 2002-06-26  5:59 UTC (permalink / raw)
  To: Dan Malek; +Cc: Tom Rini, linuxppc-embedded, Paul Mackerras


On Wed, Jun 26, 2002 at 01:33:55AM -0400, Dan Malek wrote:
> David Gibson wrote:
>
> >(2) seems very strange.  It doesn't seem in keeping with the meaning
> >of PageReserved (well my best guess at the meaning from only slightly
> >illuminating comments in page.h).
>
> It looks strange to me as well, and wouldn't do that part.  I called
> map_page() directly to avoid doing this, and in return we can just
> call vfree() to get rid of the space.

Well that's the doubly odd thing.  After marking the pages reserved
the ARM consistent_alloc() then uses its own functions, not
remap_page_range() to do the remapping.  From the looks of it they
have the same structure as the remap_page_range(), but are without the
(broken?) test and I think have little effective difference to our
direct calling of map_page().

So the setting of PG_reserved will only affect attempts to
remap_page_range() on the physical pages after they have already been
consistent_alloc()ed.  I can't see why you'd want to do that.

> >.... Which makes me
> >wonder how the hell anything works now, since remap_page_range() is
> >apparently called from several places.
>
> I could only find a few places where it is called, and it seems to always
> be called on pages that have been previously reserved (kernel ram or I/O
> space).

Hmm.. ok.  That does seem a bit odd - the comment above
remap_page_range() says that it will create null mappings if the page
doesn't exist.  But it seems from the code that it will create null
mappings if the page *does* exist and will only create real mappings
if the page doesn't exist or is reserved.

AFAICT most kernel ram is not PageReserved, only leftover bootmem.  IO
memory won't usually be reserved - in general it won't even have a
struct page at all.

> One of the advantages of making the page reserved in this case is you can
> then mmap() it from a user application and get DMA to/from user space to
> work.
> Normally, an mmap() from user space on memory gets you new, zeroed pages.

I don't see how PG_reserved helps this particularly.  As long as the
driver implementing the mmap() allocates the physical memory (so the
VM won't take it away from us) before mapping it into user space.
It's probably also a good idea for the driver to maintain its own
virtual (uncached) mapping as well as the userspace one.  So using
consistent_alloc() and then remapping the physical addresses into
userspace should suffice.

> My only concern about not marking the pages reserved is ensuring they are
> not
> eligible for swapping.  That would kind of suck if it happened :-)

It certainly would.  I don't think pages that we've allocated with
get_free_pages() will be swapped (just like kmalloc()ed pages won't
be), but I don't know enough about the VM to be sure.

--
David Gibson			| For every complex problem there is a
david@gibson.dropbear.id.au	| solution which is simple, neat and
				| wrong.  -- H.L. Mencken
http://www.ozlabs.org/people/dgibson

** Sent via the linuxppc-embedded mail list. See http://lists.linuxppc.org/

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: consistent_free()
  2002-06-26  5:17       ` consistent_free() David Gibson
  2002-06-26  5:33         ` consistent_free() Dan Malek
@ 2002-06-26 14:32         ` Paul Mackerras
  2002-06-27  2:42           ` consistent_free() David Gibson
  1 sibling, 1 reply; 20+ messages in thread
From: Paul Mackerras @ 2002-06-26 14:32 UTC (permalink / raw)
  To: David Gibson; +Cc: Tom Rini, linuxppc-embedded


David Gibson writes:

> Well, I had a look at the ARM version and now I'm a bit confused.  As
> far as I can tell it does two things differently from us:
> 	1) After making the allocation rounded up to the appropriate
> order it frees any leftover pages.  That certainly seems worthwhile.
> 	2) It sets PageReserved on each struct page that it
> allocates/remaps.
>
> (2) seems very strange.  It doesn't seem in keeping with the meaning
> of PageReserved (well my best guess at the meaning from only slightly

PageReserved means that the VM system shouldn't try to manage this
page.  In particular, it shouldn't try to do the page aging/move to
swap cache/swap out stuff that it will do on normal pages.

> illuminating comments in page.h). Apparently it's so that
> remap_page_range() works - as indeed it wouldn't without this because
> of a test in remap_pte_range() which that calls.  But that test looks
> to be precisely inverted from what it should be.  Which makes me

No, it's right.  remap_page_range() is only meant to be used for pages
that are either memory-mapped I/O (the !VALID_PAGE(page) test) or for
pages that are "special" (the PageReserved(page) test).  You aren't
supposed to use remap_page_range on normal RAM that the VM is
managing.

> wonder how the hell anything works now, since remap_page_range() is
> apparently called from several places.

It's only called from drivers (such as /dev/mem) or from arch-specific
code AFAICS.

Paul.

** Sent via the linuxppc-embedded mail list. See http://lists.linuxppc.org/

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: consistent_free()
  2002-06-26 14:32         ` consistent_free() Paul Mackerras
@ 2002-06-27  2:42           ` David Gibson
  0 siblings, 0 replies; 20+ messages in thread
From: David Gibson @ 2002-06-27  2:42 UTC (permalink / raw)
  To: Paul Mackerras; +Cc: Tom Rini, linuxppc-embedded


On Wed, Jun 26, 2002 at 10:32:47AM -0400, Paul Mackerras wrote:
>
> David Gibson writes:
>
> > Well, I had a look at the ARM version and now I'm a bit confused.  As
> > far as I can tell it does two things differently from us:
> > 	1) After making the allocation rounded up to the appropriate
> > order it frees any leftover pages.  That certainly seems worthwhile.
> > 	2) It sets PageReserved on each struct page that it
> > allocates/remaps.
> >
> > (2) seems very strange.  It doesn't seem in keeping with the meaning
> > of PageReserved (well my best guess at the meaning from only slightly
>
> PageReserved means that the VM system shouldn't try to manage this
> page.  In particular, it shouldn't try to do the page aging/move to
> swap cache/swap out stuff that it will do on normal pages.

Ok, it's still not clear to me whether we need to set PageReserved or
not.  Is memory we've allocated with get_free_pages() safe to use
without fear of it being swapped from under us or not?

Surely it couldn't be swapped, since it resides in the kernel lowmem
mapping.

> > illuminating comments in page.h). Apparently it's so that
> > remap_page_range() works - as indeed it wouldn't without this because
> > of a test in remap_pte_range() which that calls.  But that test looks
> > to be precisely inverted from what it should be.  Which makes me
>
> No, it's right.  remap_page_range() is only meant to be used for pages
> that are either memory-mapped I/O (the !VALID_PAGE(page) test) or for
> pages that are "special" (the PageReserved(page) test).  You aren't
> supposed to use remap_page_range on normal RAM that the VM is
> managing.

Ah, ok that makes some sense.  I think the comment above
remap_page_range() needs rewriting, though.

> > wonder how the hell anything works now, since remap_page_range() is
> > apparently called from several places.
>
> It's only called from drivers (such as /dev/mem) or from arch-specific
> code AFAICS.

--
David Gibson			| For every complex problem there is a
david@gibson.dropbear.id.au	| solution which is simple, neat and
				| wrong.  -- H.L. Mencken
http://www.ozlabs.org/people/dgibson

** Sent via the linuxppc-embedded mail list. See http://lists.linuxppc.org/

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2002-06-27  2:42 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2002-06-14  4:29 consistent_free() David Gibson
2002-06-14  5:57 ` consistent_free() David Gibson
2002-06-24  2:15   ` consistent_free() David Gibson
2002-06-25 14:39     ` consistent_free() Tom Rini
2002-06-26  5:17       ` consistent_free() David Gibson
2002-06-26  5:33         ` consistent_free() Dan Malek
2002-06-26  5:59           ` consistent_free() David Gibson
2002-06-26 14:32         ` consistent_free() Paul Mackerras
2002-06-27  2:42           ` consistent_free() David Gibson
2002-06-14 15:39 ` consistent_free() Tom Rini
2002-06-14 16:44   ` consistent_free() Dan Malek
2002-06-14 17:10     ` consistent_free() Tom Rini
2002-06-14 21:34       ` consistent_free() Dan Malek
2002-06-15  6:11     ` consistent_free() Paul Mackerras
2002-06-15  6:42       ` consistent_free() Dan Malek
2002-06-15 10:02         ` consistent_free() Paul Mackerras
2002-06-15 13:51           ` consistent_free() Dan Malek
2002-06-15  6:02   ` consistent_free() Paul Mackerras
2002-06-15  6:27     ` consistent_free() Dan Malek
2002-06-15  6:57       ` consistent_free() David Gibson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).