linux-mm.kvack.org archive mirror
* [RFC/PATCH] prepare_unmapped_area
       [not found]   ` <20070206044516.GA16647@wotan.suse.de>
@ 2007-02-06  5:04     ` Benjamin Herrenschmidt
  2007-02-06  5:31       ` Andrew Morton
                         ` (2 more replies)
  0 siblings, 3 replies; 17+ messages in thread
From: Benjamin Herrenschmidt @ 2007-02-06  5:04 UTC (permalink / raw)
  To: Nick Piggin; +Cc: akpm, hugh, Linux Memory Management

Hi folks !

On Cell, for performance reasons, I need to create special mappings of
SPEs that use a page size different from both the system base page size
_and_ the huge page size.

Due to the way PowerPC memory management works, however, I can only
have one page size per "segment" of 256MB (or 1T). Thus, once such a
mapping has been created in its own segment, I need to constrain
-other- vma's to stay out of that area.
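
To make the constraint concrete, here is a minimal userspace sketch of the
bookkeeping it implies: one page size per 256MB segment, and a check that a
candidate range only touches segments of the wanted page size. The names
(struct seg_map, range_matches_psize, the psize enum) and the toy 16-segment
address space are assumptions made for illustration; they are not taken from
the patch or from the powerpc code.

```c
#include <stdbool.h>
#include <stdint.h>

#define SEGMENT_SHIFT 28		/* 256MB segments */
#define NUM_SEGMENTS  16		/* toy 4GB address space (assumption) */

enum psize { PSIZE_BASE, PSIZE_HUGE, PSIZE_SPECIAL };

/* Hypothetical per-mm table: the page size each segment is locked to. */
struct seg_map {
	enum psize psize[NUM_SEGMENTS];
};

/*
 * Return true if every segment touched by [addr, addr + len) uses the
 * page size the caller wants. Empty, wrapping, or out-of-range requests
 * are rejected.
 */
static bool range_matches_psize(const struct seg_map *map, uint64_t addr,
				uint64_t len, enum psize want)
{
	uint64_t first, last, seg;

	if (len == 0 || addr + len < addr)
		return false;

	first = addr >> SEGMENT_SHIFT;
	last = (addr + len - 1) >> SEGMENT_SHIFT;
	if (last >= NUM_SEGMENTS)
		return false;

	for (seg = first; seg <= last; seg++)
		if (map->psize[seg] != want)
			return false;
	return true;
}
```

A range that straddles a segment boundary has to pass the test for both
segments, which is why a normal mapping cannot be allowed to spill into a
segment that has been switched to the special page size.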

This currently cannot be done with the existing arch hooks (because of
MAP_FIXED). However, the hugetlbfs code already has a hack in there to
do the exact same thing for huge pages. Thus, this patch moves that hack
into something that can be overridden by the architectures. This approach
was chosen as the least ugly of the uglies after discussing with Nick
Piggin. If somebody has a better idea, I'd love to hear it.

If it doesn't shock anybody to death, I'd like to see that in -mm (and
possibly upstream; I don't know yet if my code using that will make
2.6.21 or not, but it would be nice if the list of "dependent" patches
wasn't 3 pages long anyway :-)

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---

Index: linux-cell/mm/mmap.c
===================================================================
--- linux-cell.orig/mm/mmap.c	2007-02-06 15:56:42.000000000 +1100
+++ linux-cell/mm/mmap.c	2007-02-06 15:59:23.000000000 +1100
@@ -1353,6 +1353,28 @@ void arch_unmap_area_topdown(struct mm_s
 		mm->free_area_cache = mm->mmap_base;
 }
 
+#ifndef HAVE_ARCH_PREPARE_UNMAPPED_AREA
+int arch_prepare_unmapped_area(struct file *file, unsigned long addr,
+			       unsigned long len, unsigned long pgoff,
+			       unsigned long flags)
+{
+	if (file && is_file_hugepages(file))  {
+		/*
+		 * Check if the given range is hugepage aligned, and
+		 * can be made suitable for hugepages.
+		 */
+		return prepare_hugepage_range(addr, len, pgoff);
+	} else {
+		/*
+		 * Ensure that a normal request is not falling in a
+		 * reserved hugepage range.  For some archs like IA-64,
+		 * there is a separate region for hugepages.
+		 */
+		return is_hugepage_only_range(current->mm, addr, len);
+	}
+}
+#endif /* HAVE_ARCH_PREPARE_UNMAPPED_AREA */
+
 unsigned long
 get_unmapped_area(struct file *file, unsigned long addr, unsigned long len,
 		unsigned long pgoff, unsigned long flags)
@@ -1374,20 +1396,7 @@ get_unmapped_area(struct file *file, uns
 		return -ENOMEM;
 	if (addr & ~PAGE_MASK)
 		return -EINVAL;
-	if (file && is_file_hugepages(file))  {
-		/*
-		 * Check if the given range is hugepage aligned, and
-		 * can be made suitable for hugepages.
-		 */
-		ret = prepare_hugepage_range(addr, len, pgoff);
-	} else {
-		/*
-		 * Ensure that a normal request is not falling in a
-		 * reserved hugepage range.  For some archs like IA-64,
-		 * there is a separate region for hugepages.
-		 */
-		ret = is_hugepage_only_range(current->mm, addr, len);
-	}
+	ret = arch_prepare_unmapped_area(file, addr, len, pgoff, flags);
 	if (ret)
 		return -EINVAL;
 	return addr;


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>


* Re: [RFC/PATCH] prepare_unmapped_area
  2007-02-06  5:04     ` [RFC/PATCH] prepare_unmapped_area Benjamin Herrenschmidt
@ 2007-02-06  5:31       ` Andrew Morton
  2007-02-06  5:46         ` Benjamin Herrenschmidt
  2007-02-06  9:55       ` Christoph Hellwig
  2007-02-06 15:56       ` Adam Litke
  2 siblings, 1 reply; 17+ messages in thread
From: Andrew Morton @ 2007-02-06  5:31 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: Nick Piggin, hugh, Linux Memory Management

On Tue, 06 Feb 2007 16:04:56 +1100 Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:

> +#ifndef HAVE_ARCH_PREPARE_UNMAPPED_AREA
> +int arch_prepare_unmapped_area(struct file *file, unsigned long addr,

__attribute__((weak)), please.


* Re: [RFC/PATCH] prepare_unmapped_area
  2007-02-06  5:31       ` Andrew Morton
@ 2007-02-06  5:46         ` Benjamin Herrenschmidt
  2007-02-06  5:58           ` Andrew Morton
  0 siblings, 1 reply; 17+ messages in thread
From: Benjamin Herrenschmidt @ 2007-02-06  5:46 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Nick Piggin, hugh, Linux Memory Management

On Mon, 2007-02-05 at 21:31 -0800, Andrew Morton wrote:
> On Tue, 06 Feb 2007 16:04:56 +1100 Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:
> 
> > +#ifndef HAVE_ARCH_PREPARE_UNMAPPED_AREA
> > +int arch_prepare_unmapped_area(struct file *file, unsigned long addr,
> 
> __attribute__((weak)), please.

Not sure about that ... it will usually be inline, in fact, should be
static inline...

Ben.



* Re: [RFC/PATCH] prepare_unmapped_area
  2007-02-06  5:46         ` Benjamin Herrenschmidt
@ 2007-02-06  5:58           ` Andrew Morton
  2007-02-06  6:02             ` Benjamin Herrenschmidt
  2007-02-06  6:12             ` Nick Piggin
  0 siblings, 2 replies; 17+ messages in thread
From: Andrew Morton @ 2007-02-06  5:58 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: Nick Piggin, hugh, Linux Memory Management

On Tue, 06 Feb 2007 16:46:00 +1100 Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:

> On Mon, 2007-02-05 at 21:31 -0800, Andrew Morton wrote:
> > On Tue, 06 Feb 2007 16:04:56 +1100 Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:
> > 
> > > +#ifndef HAVE_ARCH_PREPARE_UNMAPPED_AREA
> > > +int arch_prepare_unmapped_area(struct file *file, unsigned long addr,
> > 
> > __attribute__((weak)), please.
> 
> Not sure about that ... it will usually be inline, in fact, should be
> static inline...
> 

Bah.  function calls are fast, mmap() is slow and ARCH_HAVE_FOO is fugly.

Alternative: implement include/asm-*/arch-mmap.h and put the implementation
in there.  That way, we can lose HAVE_ARCH_UNMAPPED_AREA and maybe a few other
things too.


* Re: [RFC/PATCH] prepare_unmapped_area
  2007-02-06  5:58           ` Andrew Morton
@ 2007-02-06  6:02             ` Benjamin Herrenschmidt
  2007-02-06  6:08               ` Andrew Morton
  2007-02-06  6:12             ` Nick Piggin
  1 sibling, 1 reply; 17+ messages in thread
From: Benjamin Herrenschmidt @ 2007-02-06  6:02 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Nick Piggin, hugh, Linux Memory Management

On Mon, 2007-02-05 at 21:58 -0800, Andrew Morton wrote:
> On Tue, 06 Feb 2007 16:46:00 +1100 Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:
> 
> > On Mon, 2007-02-05 at 21:31 -0800, Andrew Morton wrote:
> > > On Tue, 06 Feb 2007 16:04:56 +1100 Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:
> > > 
> > > > +#ifndef HAVE_ARCH_PREPARE_UNMAPPED_AREA
> > > > +int arch_prepare_unmapped_area(struct file *file, unsigned long addr,
> > > 
> > > __attribute__((weak)), please.
> > 
> > Not sure about that ... it will usually be inline, in fact, should be
> > static inline...
> > 
> 
> Bah.  function calls are fast, mmap() is slow and ARCH_HAVE_FOO is fugly.
> 
> Alternative: implement include/asm-*/arch-mmap.h and put the implementation
> in there.  That way, we can lose HAVE_ARCH_UNMAPPED_AREA and maybe a few other
> things too.

Yeah, I could have the two versions in there become
generic_get_unmapped_area{_topdown} and have arch inlines for all
archs... probably a good idea. I'll look into it tomorrow.

Regarding using weak symbols, I'm not sure what you had in mind... you
can use those to have a symbol in arch overriding a symbol elsewhere ?

Ben.



* Re: [RFC/PATCH] prepare_unmapped_area
  2007-02-06  6:02             ` Benjamin Herrenschmidt
@ 2007-02-06  6:08               ` Andrew Morton
  0 siblings, 0 replies; 17+ messages in thread
From: Andrew Morton @ 2007-02-06  6:08 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: Nick Piggin, hugh, Linux Memory Management

On Tue, 06 Feb 2007 17:02:37 +1100 Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:

> Regarding using weak symbols, I'm not sure what you had in mind... you
> can use those to have a symbol in arch overriding a symbol elsewhere ?

yup.  See printk_clock() and arch_vma_name() for examples.

It's quite nice, when it fits.
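
For readers unfamiliar with the mechanism, a minimal sketch of a weak default
follows. It assumes GCC or Clang (the attribute is a compiler extension), and
the two-argument signature and the check_range() caller are simplified
stand-ins for illustration, not the interface from the patch.

```c
/*
 * Generic default, marked weak. If another object file in the link
 * defines a non-weak arch_prepare_unmapped_area(), the linker uses
 * that definition instead of this one; no #ifdef is needed.
 */
__attribute__((weak))
int arch_prepare_unmapped_area(unsigned long addr, unsigned long len)
{
	(void)addr;
	(void)len;
	return 0;	/* default: accept any range */
}

/* Hypothetical caller: sees whichever definition the linker kept. */
int check_range(unsigned long addr, unsigned long len)
{
	return arch_prepare_unmapped_area(addr, len) ? -1 : 0;
}
```

The trade-off raised in this thread still applies: the override is resolved
at link time, so the call cannot be inlined the way a static inline in a
header can.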


* Re: [RFC/PATCH] prepare_unmapped_area
  2007-02-06  5:58           ` Andrew Morton
  2007-02-06  6:02             ` Benjamin Herrenschmidt
@ 2007-02-06  6:12             ` Nick Piggin
  2007-02-06  6:37               ` Andrew Morton
  1 sibling, 1 reply; 17+ messages in thread
From: Nick Piggin @ 2007-02-06  6:12 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Benjamin Herrenschmidt, hugh, Linux Memory Management

On Mon, Feb 05, 2007 at 09:58:27PM -0800, Andrew Morton wrote:
> On Tue, 06 Feb 2007 16:46:00 +1100 Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:
> 
> > On Mon, 2007-02-05 at 21:31 -0800, Andrew Morton wrote:
> > > On Tue, 06 Feb 2007 16:04:56 +1100 Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:
> > > 
> > > > +#ifndef HAVE_ARCH_PREPARE_UNMAPPED_AREA
> > > > +int arch_prepare_unmapped_area(struct file *file, unsigned long addr,
> > > 
> > > __attribute__((weak)), please.
> > 
> > Not sure about that ... it will usually be inline, in fact, should be
> > static inline...
> > 
> 
> Bah.  function calls are fast, mmap() is slow and ARCH_HAVE_FOO is fugly.

It still costs a whole nother cacheline, for just an empty function on
!hugepage kernels.

> Alternative: implement include/asm-*/arch-mmap.h and put the implementation
> in there.  That way, we can lose HAVE_ARCH_UNMAPPED_AREA and maybe a few other
> things too.

Yes please.


* Re: [RFC/PATCH] prepare_unmapped_area
  2007-02-06  6:12             ` Nick Piggin
@ 2007-02-06  6:37               ` Andrew Morton
  2007-02-06  6:40                 ` Nick Piggin
  0 siblings, 1 reply; 17+ messages in thread
From: Andrew Morton @ 2007-02-06  6:37 UTC (permalink / raw)
  To: Nick Piggin; +Cc: Benjamin Herrenschmidt, hugh, Linux Memory Management

On Tue, 6 Feb 2007 07:12:11 +0100 Nick Piggin <npiggin@suse.de> wrote:

> On Mon, Feb 05, 2007 at 09:58:27PM -0800, Andrew Morton wrote:
> > On Tue, 06 Feb 2007 16:46:00 +1100 Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:
> > 
> > > On Mon, 2007-02-05 at 21:31 -0800, Andrew Morton wrote:
> > > > On Tue, 06 Feb 2007 16:04:56 +1100 Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:
> > > > 
> > > > > +#ifndef HAVE_ARCH_PREPARE_UNMAPPED_AREA
> > > > > +int arch_prepare_unmapped_area(struct file *file, unsigned long addr,
> > > > 
> > > > __attribute__((weak)), please.
> > > 
> > > Not sure about that ... it will usually be inline, in fact, should be
> > > static inline...
> > > 
> > 
> > Bah.  function calls are fast, mmap() is slow and ARCH_HAVE_FOO is fugly.
> 
> It still costs a whole nother cacheline, for just an empty function on
> !hugepage kernels.

Doubtful, especially with CONFIG_CC_OPTIMIZE_FOR_SIZE=y.

> > Alternative: implement include/asm-*/arch-mmap.h and put the implementation
> > in there.  That way, we can lose HAVE_ARCH_UNMAPPED_AREA and maybe a few other
> > things too.
> 
> Yes please.

It's a heck of a lot of fuss, adding 20-odd new files.  If we're sure that
we can use it for other things then maybe..


* Re: [RFC/PATCH] prepare_unmapped_area
  2007-02-06  6:37               ` Andrew Morton
@ 2007-02-06  6:40                 ` Nick Piggin
  2007-02-06  6:54                   ` Andrew Morton
  0 siblings, 1 reply; 17+ messages in thread
From: Nick Piggin @ 2007-02-06  6:40 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Benjamin Herrenschmidt, hugh, Linux Memory Management

On Mon, Feb 05, 2007 at 10:37:47PM -0800, Andrew Morton wrote:
> On Tue, 6 Feb 2007 07:12:11 +0100 Nick Piggin <npiggin@suse.de> wrote:
> > 
> > It still costs a whole nother cacheline, for just an empty function on
> > !hugepage kernels.
> 
> Doubtful, especially with CONFIG_CC_OPTIMIZE_FOR_SIZE=y.

Oh, does the function call get stripped out in that case? Why does it
get left in with OPTIMIZE_FOR_SIZE=n, I wonder?

> > > Alternative: implement include/asm-*/arch-mmap.h and put the implementation
> > > in there.  That way, we can lose HAVE_ARCH_UNMAPPED_AREA and maybe a few other
> > > things too.
> > 
> > Yes please.
> 
> It's a heck of a lot of fuss, adding 20-odd new files.  If we're sure that
> we can use it for other things then maybe..


* Re: [RFC/PATCH] prepare_unmapped_area
  2007-02-06  6:40                 ` Nick Piggin
@ 2007-02-06  6:54                   ` Andrew Morton
  0 siblings, 0 replies; 17+ messages in thread
From: Andrew Morton @ 2007-02-06  6:54 UTC (permalink / raw)
  To: Nick Piggin; +Cc: Benjamin Herrenschmidt, hugh, Linux Memory Management

On Tue, 6 Feb 2007 07:40:34 +0100 Nick Piggin <npiggin@suse.de> wrote:

> On Mon, Feb 05, 2007 at 10:37:47PM -0800, Andrew Morton wrote:
> > On Tue, 6 Feb 2007 07:12:11 +0100 Nick Piggin <npiggin@suse.de> wrote:
> > > 
> > > It still costs a whole nother cacheline, for just an empty function on
> > > !hugepage kernels.
> > 
> > Doubtful, especially with CONFIG_CC_OPTIMIZE_FOR_SIZE=y.
> 
> Oh, does the function call get stripped out in that case? Why does it
> get left in with OPTIMIZE_FOR_SIZE=n, I wonder?

It's still there, but is probably sharing a cacheline with the callee,
assuming the linker leaves functions in the programmer-specified order,
which it usually does.

Some of the fancy linker options might subvert that, dunno.


* Re: [RFC/PATCH] prepare_unmapped_area
  2007-02-06  5:04     ` [RFC/PATCH] prepare_unmapped_area Benjamin Herrenschmidt
  2007-02-06  5:31       ` Andrew Morton
@ 2007-02-06  9:55       ` Christoph Hellwig
  2007-02-06 10:07         ` Benjamin Herrenschmidt
  2007-02-06 15:56       ` Adam Litke
  2 siblings, 1 reply; 17+ messages in thread
From: Christoph Hellwig @ 2007-02-06  9:55 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: Nick Piggin, akpm, hugh, Linux Memory Management

On Tue, Feb 06, 2007 at 04:04:56PM +1100, Benjamin Herrenschmidt wrote:
> Hi folks !
> 
> On Cell, for performance reasons, I need to create special mappings of
> SPEs that use a page size different from both the system base page size
> _and_ the huge page size.
> 
> Due to the way PowerPC memory management works, however, I can only
> have one page size per "segment" of 256MB (or 1T). Thus, once such a
> mapping has been created in its own segment, I need to constrain
> -other- vma's to stay out of that area.
> 
> This currently cannot be done with the existing arch hooks (because of
> MAP_FIXED). However, the hugetlbfs code already has a hack in there to
> do the exact same thing for huge pages. Thus, this patch moves that hack
> into something that can be overridden by the architectures. This approach
> was chosen as the least ugly of the uglies after discussing with Nick
> Piggin. If somebody has a better idea, I'd love to hear it.
> 
> If it doesn't shock anybody to death, I'd like to see that in -mm (and
> possibly upstream; I don't know yet if my code using that will make
> 2.6.21 or not, but it would be nice if the list of "dependent" patches
> wasn't 3 pages long anyway :-)

Eeek, this is more than fugly.  Dave Hansen suggested to move these
checks into a file operation in response to Adam Litke's hugetlb cleanups,
and this patch shows he was right :)


* Re: [RFC/PATCH] prepare_unmapped_area
  2007-02-06  9:55       ` Christoph Hellwig
@ 2007-02-06 10:07         ` Benjamin Herrenschmidt
  2007-02-06 10:23           ` Christoph Hellwig
  0 siblings, 1 reply; 17+ messages in thread
From: Benjamin Herrenschmidt @ 2007-02-06 10:07 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Nick Piggin, akpm, hugh, Linux Memory Management

> Eeek, this is more than fugly.  Dave Hansen suggested to move these
> checks into a file operation in response to Adam Litke's hugetlb cleanups,
> and this patch shows he was right :)

No, you don't understand... There is a fops for get_unmapped_area for
the "special" file. It's currently not called for MAP_FIXED but that can
be fixed easily enough (in fact, I have a few ideas to clean up some of
that code, it's already horrible today).

The problem is to prevent something -else- from being mapped into one of
those 256MB areas once it's been switched to a different page size.

Right now, this is done via this hugetlbfs specific hack. I want to
have instead some way to have the arch "validate" the address after
get_unmapped_area(), in addition, hugetlbfs wants to "prepare" but that
could indeed be done in hugetlbfs provided fops->get_unmapped_area() if
we call it for MAP_FIXED as well.
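
The ordering described here can be sketched in a few lines of userspace C:
pick an address (or take the caller's for a fixed mapping), then run one
validation step on both paths. Everything in the sketch is an assumption made
for illustration: TOY_MAP_FIXED stands in for MAP_FIXED, the reserved range is
a single hard-coded 256MB segment, and validate_area(), search_area() and
get_area_checked() are hypothetical names.

```c
#include <errno.h>

#define TOY_MAP_FIXED	0x10UL		/* stand-in for MAP_FIXED */
#define RESERVED_START	0x20000000UL	/* one 256MB segment ... */
#define RESERVED_END	0x30000000UL	/* ... with an exclusive end */

/* Hypothetical arch hook: non-zero means the range is not allowed. */
static int validate_area(unsigned long addr, unsigned long len)
{
	return addr < RESERVED_END && addr + len > RESERVED_START;
}

/* Toy search: always proposes the same free address. */
static unsigned long search_area(unsigned long len)
{
	(void)len;
	return 0x40000000UL;
}

static long get_area_checked(unsigned long addr, unsigned long len,
			     unsigned long flags)
{
	if (!(flags & TOY_MAP_FIXED))
		addr = search_area(len);

	/* Runs on both paths, so a fixed mapping cannot bypass the check. */
	if (validate_area(addr, len))
		return -EINVAL;
	return (long)addr;
}
```

Because the check sits after address selection rather than inside the search,
it does not matter which search routine produced the address.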

Ben.



* Re: [RFC/PATCH] prepare_unmapped_area
  2007-02-06 10:07         ` Benjamin Herrenschmidt
@ 2007-02-06 10:23           ` Christoph Hellwig
  0 siblings, 0 replies; 17+ messages in thread
From: Christoph Hellwig @ 2007-02-06 10:23 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Christoph Hellwig, Nick Piggin, akpm, hugh,
	Linux Memory Management

On Tue, Feb 06, 2007 at 09:07:22PM +1100, Benjamin Herrenschmidt wrote:
> 
> > Eeek, this is more than fugly.  Dave Hansen suggested to move these
> > checks into a file operation in response to Adam Litke's hugetlb cleanups,
> > and this patch shows he was right :)
> 
> No, you don't understand... There is a fops for get_unmapped_area for
> the "special" file. It's currently not called for MAP_FIXED but that can
> be fixed easily enough (in fact, I have a few ideas to clean up some of
> that code, it's already horrible today).
> 
> The problem is to prevent something -else- from being mapped into one of
> those 256MB areas once it's been switched to a different page size.
> 
> Right now, this is done via this hugetlbfs specific hack. I want to
> have instead some way to have the arch "validate" the address after
> get_unmapped_area(), in addition, hugetlbfs wants to "prepare" but that
> could indeed be done in hugetlbfs provided fops->get_unmapped_area() if
> we call it for MAP_FIXED as well.

Can we extend mm->get_unmapped_area for that if it's called for !MAP_FIXED
as well instead of adding yet another arch hook?

This area is getting a little bit too messy with all the pseudo-generic code
and lots of arch hooks. Personally I'd prefer to let get_unmapped_area
look like the following:


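unsigned long
get_unmapped_area(struct file *file, unsigned long addr, unsigned long len,
		  unsigned long pgoff, unsigned long flags)
{
	unsigned long (*get_area)(struct file *, unsigned long, unsigned long,
				  unsigned long, unsigned long);

	/* Default to the mandatory mm hook; a file operation overrides it. */
	get_area = current->mm->get_unmapped_area;
	if (file && file->f_op && file->f_op->get_unmapped_area)
		get_area = file->f_op->get_unmapped_area;
	addr = get_area(file, addr, len, pgoff, flags);
	if (IS_ERR_VALUE(addr))
		return addr;
	return addr;
}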

aka mm->get_unmapped_area is mandatory, and all arch specific code
is moved into it.  We'd provide a default mm->get_unmapped_area that
doesn't even deal with hugetlb for all the trivial architectures,
and any arch that wants to do their own work can do all this through
a single hook.
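
The dispatch rule in the proposal above (the mm hook is the default, a file
operation wins when present) can be tried out in userspace with mock types.
The struct layouts, pick_area(), and the two toy hooks with their fixed return
addresses are assumptions made for this sketch; only the selection logic
mirrors the function quoted above.

```c
#include <stddef.h>

struct file;
typedef unsigned long (*get_area_fn)(struct file *file, unsigned long addr,
				     unsigned long len, unsigned long pgoff,
				     unsigned long flags);

/* Mock stand-ins for the kernel structures (illustration only). */
struct file_ops { get_area_fn get_unmapped_area; };
struct file     { const struct file_ops *f_op; };
struct mm       { get_area_fn get_unmapped_area; };	/* must be set */

static unsigned long pick_area(struct mm *mm, struct file *file,
			       unsigned long addr, unsigned long len,
			       unsigned long pgoff, unsigned long flags)
{
	get_area_fn get_area = mm->get_unmapped_area;

	if (file && file->f_op && file->f_op->get_unmapped_area)
		get_area = file->f_op->get_unmapped_area;
	return get_area(file, addr, len, pgoff, flags);
}

/* Toy hooks that return fixed addresses so the choice is visible. */
static unsigned long mm_default_area(struct file *file, unsigned long addr,
				     unsigned long len, unsigned long pgoff,
				     unsigned long flags)
{
	(void)file; (void)addr; (void)len; (void)pgoff; (void)flags;
	return 0x10000000UL;
}

static unsigned long hugetlb_area(struct file *file, unsigned long addr,
				  unsigned long len, unsigned long pgoff,
				  unsigned long flags)
{
	(void)file; (void)addr; (void)len; (void)pgoff; (void)flags;
	return 0x80000000UL;
}
```

With a single mandatory hook there is exactly one place per architecture where
placement policy lives, which is the simplification being argued for.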


* Re: [RFC/PATCH] prepare_unmapped_area
  2007-02-06  5:04     ` [RFC/PATCH] prepare_unmapped_area Benjamin Herrenschmidt
  2007-02-06  5:31       ` Andrew Morton
  2007-02-06  9:55       ` Christoph Hellwig
@ 2007-02-06 15:56       ` Adam Litke
  2007-02-06 20:12         ` Benjamin Herrenschmidt
  2 siblings, 1 reply; 17+ messages in thread
From: Adam Litke @ 2007-02-06 15:56 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Nick Piggin, akpm, hugh, Linux Memory Management, hch,
	David C. Hansen [imap]

On Tue, 2007-02-06 at 16:04 +1100, Benjamin Herrenschmidt wrote:
> Hi folks !
> 
> On Cell, for performance reasons, I need to create special mappings of
> SPEs that use a page size different from both the system base page size
> _and_ the huge page size.
> 
> Due to the way PowerPC memory management works, however, I can only
> have one page size per "segment" of 256MB (or 1T). Thus, once such a
> mapping has been created in its own segment, I need to constrain
> -other- vma's to stay out of that area.
> 
> This currently cannot be done with the existing arch hooks (because of
> MAP_FIXED). However, the hugetlbfs code already has a hack in there to
> do the exact same thing for huge pages. Thus, this patch moves that hack
> into something that can be overridden by the architectures. This approach
> was chosen as the least ugly of the uglies after discussing with Nick
> Piggin. If somebody has a better idea, I'd love to hear it.

Hi Ben.  Would my patch from last Jan 31 entitled "[PATCH 5/6] Abstract
is_hugepage_only_range" (attached for your convenience) solve this
problem?

commit ef36c6c859d37ac40f0bd12d08f41f103ab76657
Author: litke@us.ibm.com <aglitke@kernel.localdomain>
Date:   Tue Jan 16 08:57:16 2007 -0800

    Abstract is_hugepage_only_range
    
    Some architectures define regions of the address space that can be used
    exclusively for either normal pages or hugetlb pages.  Currently,
    prepare_hugepage_range() is used to validate an unmapped_area for use with
    hugepages and is_hugepage_only_range() is used to validate an unmapped_area for
    normal pages.
    
    Introduce a prepare_unmapped_area() file operation to abstract the validation
    of unmapped areas.  If prepare_unmapped_area() is not specified, the default
    behavior is to require the area to not overlap any "special" areas.
    
    Buh-bye to another is_file_hugepages() call.
    
    Signed-off-by: Adam Litke <agl@us.ibm.com>

diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index b61592f..3eea7a5 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -561,6 +561,7 @@ const struct file_operations hugetlbfs_file_operations = {
 	.mmap			= hugetlbfs_file_mmap,
 	.fsync			= simple_sync_file,
 	.get_unmapped_area	= hugetlb_get_unmapped_area,
+	.prepare_unmapped_area	= prepare_hugepage_range,
 };
 
 static struct inode_operations hugetlbfs_dir_inode_operations = {
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 1410e53..853a4f4 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1094,6 +1094,7 @@ struct file_operations {
 	ssize_t (*sendfile) (struct file *, loff_t *, size_t, read_actor_t, void *);
 	ssize_t (*sendpage) (struct file *, struct page *, int, size_t, loff_t *, int);
 	unsigned long (*get_unmapped_area)(struct file *, unsigned long, unsigned long, unsigned long, unsigned long);
+	int (*prepare_unmapped_area)(unsigned long addr, unsigned long len, pgoff_t pgoff);
 	int (*check_flags)(int);
 	int (*dir_notify)(struct file *filp, unsigned long arg);
 	int (*flock) (struct file *, int, struct file_lock *);
diff --git a/mm/mmap.c b/mm/mmap.c
index a5cb0a5..f8e0bd0 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1374,20 +1374,17 @@ get_unmapped_area(struct file *file, unsigned long addr, unsigned long len,
 		return -ENOMEM;
 	if (addr & ~PAGE_MASK)
 		return -EINVAL;
-	if (file && is_file_hugepages(file))  {
-		/*
-		 * Check if the given range is hugepage aligned, and
-		 * can be made suitable for hugepages.
-		 */
-		ret = prepare_hugepage_range(addr, len, pgoff);
-	} else {
-		/*
-		 * Ensure that a normal request is not falling in a
-		 * reserved hugepage range.  For some archs like IA-64,
-		 * there is a separate region for hugepages.
-		 */
+	/*
+	 * This file may only be able to be mapped into special areas of the
+	 * addess space (eg. hugetlb pages).  If prepare_unmapped_area() is
+	 * specified, use it to validate the selected range.  If not, just
+	 * make sure the range does not overlap any special ranges.
+	 */
+	if (file && file->f_op && file->f_op->prepare_unmapped_area)
+		ret = file->f_op->prepare_unmapped_area(addr, len, pgoff);
+	else
 		ret = is_hugepage_only_range(current->mm, addr, len);
-	}
+
 	if (ret)
 		return -EINVAL;
 	return addr;

-- 
Adam Litke - (agl at us.ibm.com)
IBM Linux Technology Center


* Re: [RFC/PATCH] prepare_unmapped_area
  2007-02-06 15:56       ` Adam Litke
@ 2007-02-06 20:12         ` Benjamin Herrenschmidt
  2007-02-06 20:52           ` Adam Litke
  0 siblings, 1 reply; 17+ messages in thread
From: Benjamin Herrenschmidt @ 2007-02-06 20:12 UTC (permalink / raw)
  To: Adam Litke
  Cc: Nick Piggin, akpm, hugh, Linux Memory Management, hch,
	David C. Hansen [imap]

On Tue, 2007-02-06 at 09:56 -0600, Adam Litke wrote:
> On Tue, 2007-02-06 at 16:04 +1100, Benjamin Herrenschmidt wrote:
> > Hi folks !
> > 
> > On Cell, for performance reasons, I need to create special mappings of
> > SPEs that use a page size different from both the system base page size
> > _and_ the huge page size.
> > 
> > Due to the way PowerPC memory management works, however, I can only
> > have one page size per "segment" of 256MB (or 1T). Thus, once such a
> > mapping has been created in its own segment, I need to constrain
> > -other- vma's to stay out of that area.
> > 
> > This currently cannot be done with the existing arch hooks (because of
> > MAP_FIXED). However, the hugetlbfs code already has a hack in there to
> > do the exact same thing for huge pages. Thus, this patch moves that hack
> > into something that can be overridden by the architectures. This approach
> > was chosen as the least ugly of the uglies after discussing with Nick
> > Piggin. If somebody has a better idea, I'd love to hear it.
> 
> Hi Ben.  Would my patch from last Jan 31 entitled "[PATCH 5/6] Abstract
> is_hugepage_only_range" (attached for your convenience) solve this
> problem?

I don't see how your patch abstracts is_hugepage_only_range tho... you
still call it at the same spot, you abstracted prepare_hugepage_range.

I was talking to hch and arjan yesterday on irc and we thought about
having an mm hook validate_area() that could replace the
is_hugepage_only_range() hack and deal with my issue as well. As for
having prepare in the fops, do we need it at all if we call fops->g_u_a
in the MAP_FIXED case ?

Ben.

> commit ef36c6c859d37ac40f0bd12d08f41f103ab76657
> Author: litke@us.ibm.com <aglitke@kernel.localdomain>
> Date:   Tue Jan 16 08:57:16 2007 -0800
> 
>     Abstract is_hugepage_only_range
>     
>     Some architectures define regions of the address space that can be used
>     exclusively for either normal pages or hugetlb pages.  Currently,
>     prepare_hugepage_range() is used to validate an unmapped_area for use with
>     hugepages and is_hugepage_only_range() is used to validate an unmapped_area for
>     normal pages.
>     
>     Introduce a prepare_unmapped_area() file operation to abstract the validation
>     of unmapped areas.  If prepare_unmapped_area() is not specified, the default
>     behavior is to require the area to not overlap any "special" areas.
>     
>     Buh-bye to another is_file_hugepages() call.
>     
>     Signed-off-by: Adam Litke <agl@us.ibm.com>
> 
> diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
> index b61592f..3eea7a5 100644
> --- a/fs/hugetlbfs/inode.c
> +++ b/fs/hugetlbfs/inode.c
> @@ -561,6 +561,7 @@ const struct file_operations hugetlbfs_file_operations = {
>  	.mmap			= hugetlbfs_file_mmap,
>  	.fsync			= simple_sync_file,
>  	.get_unmapped_area	= hugetlb_get_unmapped_area,
> +	.prepare_unmapped_area	= prepare_hugepage_range,
>  };
>  
>  static struct inode_operations hugetlbfs_dir_inode_operations = {
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index 1410e53..853a4f4 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -1094,6 +1094,7 @@ struct file_operations {
>  	ssize_t (*sendfile) (struct file *, loff_t *, size_t, read_actor_t, void *);
>  	ssize_t (*sendpage) (struct file *, struct page *, int, size_t, loff_t *, int);
>  	unsigned long (*get_unmapped_area)(struct file *, unsigned long, unsigned long, unsigned long, unsigned long);
> +	int (*prepare_unmapped_area)(unsigned long addr, unsigned long len, pgoff_t pgoff);
>  	int (*check_flags)(int);
>  	int (*dir_notify)(struct file *filp, unsigned long arg);
>  	int (*flock) (struct file *, int, struct file_lock *);
> diff --git a/mm/mmap.c b/mm/mmap.c
> index a5cb0a5..f8e0bd0 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -1374,20 +1374,17 @@ get_unmapped_area(struct file *file, unsigned long addr, unsigned long len,
>  		return -ENOMEM;
>  	if (addr & ~PAGE_MASK)
>  		return -EINVAL;
> -	if (file && is_file_hugepages(file))  {
> -		/*
> -		 * Check if the given range is hugepage aligned, and
> -		 * can be made suitable for hugepages.
> -		 */
> -		ret = prepare_hugepage_range(addr, len, pgoff);
> -	} else {
> -		/*
> -		 * Ensure that a normal request is not falling in a
> -		 * reserved hugepage range.  For some archs like IA-64,
> -		 * there is a separate region for hugepages.
> -		 */
> +	/*
> +	 * This file may only be able to be mapped into special areas of the
> +	 * address space (eg. hugetlb pages).  If prepare_unmapped_area() is
> +	 * specified, use it to validate the selected range.  If not, just
> +	 * make sure the range does not overlap any special ranges.
> +	 */
> +	if (file && file->f_op && file->f_op->prepare_unmapped_area)
> +		ret = file->f_op->prepare_unmapped_area(addr, len, pgoff);
> +	else
>  		ret = is_hugepage_only_range(current->mm, addr, len);
> -	}
> +
>  	if (ret)
>  		return -EINVAL;
>  	return addr;
> 
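
For readers following the quoted diff, the dispatch it adds to
get_unmapped_area() can be modelled in userspace. This is only a sketch
under stated assumptions: the struct names, the 256MB "special" window,
the 16MB page size and the alignment-only prepare hook are all
hypothetical stand-ins, not kernel code.

```c
#include <errno.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel structures; all names here are
 * illustrative and not part of the patch. */
struct fops_model {
	/* Optional hook: returns 0 when [addr, addr + len) is acceptable. */
	int (*prepare_unmapped_area)(unsigned long addr, unsigned long len,
				     unsigned long pgoff);
};

struct file_model {
	const struct fops_model *f_op;
};

/* Assumed layout: one 256MB window reserved for special mappings. */
#define SPECIAL_START	0x10000000UL
#define SPECIAL_END	0x20000000UL
#define SPECIAL_PAGE	0x01000000UL	/* assumed 16MB special page size */

/* Stand-in for is_hugepage_only_range(): non-zero on overlap. */
int overlaps_special_range(unsigned long addr, unsigned long len)
{
	return addr < SPECIAL_END && addr + len > SPECIAL_START;
}

/* Simplified stand-in for prepare_hugepage_range(): alignment check only. */
int prepare_special_range(unsigned long addr, unsigned long len,
			  unsigned long pgoff)
{
	(void)pgoff;
	return ((addr | len) & (SPECIAL_PAGE - 1)) ? -EINVAL : 0;
}

const struct fops_model special_fops = {
	.prepare_unmapped_area = prepare_special_range,
};

/* Mirrors the branch added to get_unmapped_area() in the quoted diff:
 * use the file's hook if it has one, otherwise refuse any overlap with
 * a special range. */
int validate_chosen_area(const struct file_model *file, unsigned long addr,
			 unsigned long len, unsigned long pgoff)
{
	int ret;

	if (file && file->f_op && file->f_op->prepare_unmapped_area)
		ret = file->f_op->prepare_unmapped_area(addr, len, pgoff);
	else
		ret = overlaps_special_range(addr, len);

	return ret ? -EINVAL : 0;
}
```

In this model a file without the hook (or no file at all) is rejected
whenever its range touches the special window, while a file that
supplies the hook is judged only by that hook.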

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

* Re: [RFC/PATCH] prepare_unmapped_area
  2007-02-06 20:12         ` Benjamin Herrenschmidt
@ 2007-02-06 20:52           ` Adam Litke
  2007-02-06 21:02             ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 17+ messages in thread
From: Adam Litke @ 2007-02-06 20:52 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Nick Piggin, akpm, hugh, Linux Memory Management, hch,
	David C. Hansen [imap]

On Wed, 2007-02-07 at 07:12 +1100, Benjamin Herrenschmidt wrote:
> On Tue, 2007-02-06 at 09:56 -0600, Adam Litke wrote:
> > On Tue, 2007-02-06 at 16:04 +1100, Benjamin Herrenschmidt wrote:
> > > Hi folks !
> > > 
> > > On Cell, I have, for performance reasons, a need to create special
> > > mappings of SPEs that use a different page size than the system base page
> > > size _and_ the huge page size.
> > > 
> > > Due to the way the PowerPC memory management works, however, I can only
> > > have one page size per "segment" of 256MB (or 1T) and thus after such a
> > > mapping has been created in its own segment, I need to constrain
> > > -other- vma's to stay out of that area.
> > > 
> > > This currently cannot be done with the existing arch hooks (because of
> > > MAP_FIXED). However, the hugetlbfs code already has a hack in there to
> > > do the exact same thing for huge pages. Thus, this patch moves that hack
> > > into something that can be overridden by the architectures. This approach
> > > was chosen as the least ugly of the uglies after discussing with Nick
> > > Piggin. If somebody has a better idea, I'd love to hear it.
> > 
> > Hi Ben.  Would my patch from last Jan 31 entitled "[PATCH 5/6] Abstract
> > is_hugepage_only_range" (attached for your convenience) solve this
> > problem?
> 
> I don't see how your patch abstracts is_hugepage_only_range tho... you
> still call it at the same spot, you abstracted prepare_hugepage_range.

Yeah, you're right... Former revisions of the patch created a function
called is_special_range() which for the moment only called
is_hugepage_only_range().  The thought was that other types of "special
ranges" could be checked for in this function.  I guess that's basically
the same idea as validate_area() below.  That would work for me.

> I was talking to hch and arjan yesterday on irc and we thought about
> having an mm hook validate_area() that could replace the
> is_hugepage_only_range() hack and deal with my issue as well. As for
> having prepare in the fops, do we need it at all if we call fops->g_u_a
> in the MAP_FIXED case ?

Nah, if we cleaned up g_u_a() so that it is always called, away goes the
need for f_ops->prepare_unmapped_area().
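
As a rough illustration of what a validate_area()-style mm hook might
check for the Cell case described earlier in the thread, here is a
userspace sketch. Everything in it is an assumption made for the
example: the mm_model structure, the per-segment page-size tags, the
4GB address space and both function names are hypothetical, and no
such interface is defined in this thread.

```c
#include <errno.h>

#define SEGMENT_SHIFT	28	/* 256MB segments, as described for PowerPC */
#define NR_SEGMENTS	16	/* assumption: model a 4GB address space */

enum seg_psize { SEG_UNUSED = 0, SEG_BASE, SEG_HUGE, SEG_SPECIAL };

/* Hypothetical per-mm state: the page size in use in each segment. */
struct mm_model {
	enum seg_psize seg[NR_SEGMENTS];
};

/* Hypothetical validate_area(): returns 0 when every segment touched by
 * [addr, addr + len) is either unused or already uses psize. */
int validate_area_model(const struct mm_model *mm, unsigned long addr,
			unsigned long len, enum seg_psize psize)
{
	unsigned long first, last, i;

	if (len == 0 || addr + len - 1 < addr)	/* empty or wrapping range */
		return -EINVAL;

	first = addr >> SEGMENT_SHIFT;
	last = (addr + len - 1) >> SEGMENT_SHIFT;
	if (last >= NR_SEGMENTS)
		return -ENOMEM;

	for (i = first; i <= last; i++)
		if (mm->seg[i] != SEG_UNUSED && mm->seg[i] != psize)
			return -EINVAL;
	return 0;
}

/* Model of a fixed mapping: validate first, then claim the segments. */
int map_fixed_model(struct mm_model *mm, unsigned long addr,
		    unsigned long len, enum seg_psize psize)
{
	unsigned long i;
	int ret = validate_area_model(mm, addr, len, psize);

	if (ret)
		return ret;
	for (i = addr >> SEGMENT_SHIFT;
	     i <= (addr + len - 1) >> SEGMENT_SHIFT; i++)
		mm->seg[i] = psize;
	return 0;
}
```

The point of the sketch is that the check keys off the state of the mm
rather than the type of the file, so the same hook would cover hugetlb
ranges and the special Cell mappings alike.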

-- 
Adam Litke - (agl at us.ibm.com)
IBM Linux Technology Center

* Re: [RFC/PATCH] prepare_unmapped_area
  2007-02-06 20:52           ` Adam Litke
@ 2007-02-06 21:02             ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 17+ messages in thread
From: Benjamin Herrenschmidt @ 2007-02-06 21:02 UTC (permalink / raw)
  To: Adam Litke
  Cc: Nick Piggin, akpm, hugh, Linux Memory Management, hch,
	David C. Hansen [imap]

> Yeah, you're right... Former revisions of the patch created a function
> called is_special_range() which for the moment only called
> is_hugepage_only_range().  The thought was that other types of "special
> ranges" could be checked for in this function.  I guess that's basically
> the same idea as validate_area() below.  That would work for me.
> 
> > I was talking to hch and arjan yesterday on irc and we thought about
> > having an mm hook validate_area() that could replace the
> > is_hugepage_only_range() hack and deal with my issue as well. As for
> > having prepare in the fops, do we need it at all if we call fops->g_u_a
> > in the MAP_FIXED case ?
> 
> Nah, if we cleaned up g_u_a() so that it is always called, away goes the
> need for f_ops->prepare_unmapped_area().

Ok, I'll cook up a patch along those lines, possibly next week.

Ben.


Thread overview: 17+ messages
     [not found] <200702060405.l1645R7G009668@shell0.pdx.osdl.net>
     [not found] ` <1170736938.2620.213.camel@localhost.localdomain>
     [not found]   ` <20070206044516.GA16647@wotan.suse.de>
2007-02-06  5:04     ` [RFC/PATCH] prepare_unmapped_area Benjamin Herrenschmidt
2007-02-06  5:31       ` Andrew Morton
2007-02-06  5:46         ` Benjamin Herrenschmidt
2007-02-06  5:58           ` Andrew Morton
2007-02-06  6:02             ` Benjamin Herrenschmidt
2007-02-06  6:08               ` Andrew Morton
2007-02-06  6:12             ` Nick Piggin
2007-02-06  6:37               ` Andrew Morton
2007-02-06  6:40                 ` Nick Piggin
2007-02-06  6:54                   ` Andrew Morton
2007-02-06  9:55       ` Christoph Hellwig
2007-02-06 10:07         ` Benjamin Herrenschmidt
2007-02-06 10:23           ` Christoph Hellwig
2007-02-06 15:56       ` Adam Litke
2007-02-06 20:12         ` Benjamin Herrenschmidt
2007-02-06 20:52           ` Adam Litke
2007-02-06 21:02             ` Benjamin Herrenschmidt
