* [PATCH] hugetlb: powerpc: Actively close unused htlb regions on vma close
@ 2006-06-02 14:08 Adam Litke
  2006-06-02 15:17 ` Dave Hansen
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Adam Litke @ 2006-06-02 14:08 UTC
  To: linuxppc-dev; +Cc: linux-mm, linux-kernel, David Gibson

[PATCH] powerpc: Close hugetlb regions when unmapping VMAs

On powerpc, each segment can contain pages of only one size.  When a
hugetlb mapping is requested, a segment is located and marked for use
with huge pages.  This is a uni-directional operation -- hugetlb
segments are never marked for use again with normal pages.  For long
running processes which make use of a combination of normal and hugetlb
mappings, this behavior can unduly constrain the virtual address space.
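
(For concreteness, the marking itself is just a bitmask update in the
mm context.  A simplified sketch of what open_low_hpage_areas() does in
arch/powerpc/mm/hugetlbpage.c, with the SLB flushing elided:

	/* one bit of low_htlb_areas per 256MB segment below 4GB */
	mm->context.low_htlb_areas |= newareas;

Nothing ever clears these bits again, which is the problem addressed
here.)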

The following patch introduces an architecture-specific vm_ops.close()
hook.  For all architectures besides powerpc, this is a no-op.  On
powerpc, the low and high segments are scanned to locate empty hugetlb
segments which can be made available for normal mappings.  Comments?

Signed-off-by: Adam Litke <agl@us.ibm.com>
---
 arch/powerpc/mm/hugetlbpage.c |   39 ++++++++++++++++++++++++++++++++++++++-
 include/asm-powerpc/pgtable.h |    1 +
 include/linux/hugetlb.h       |    6 ++++++
 mm/hugetlb.c                  |    1 +
 4 files changed, 46 insertions(+), 1 deletion(-)
diff -upN reference/arch/powerpc/mm/hugetlbpage.c current/arch/powerpc/mm/hugetlbpage.c
--- reference/arch/powerpc/mm/hugetlbpage.c
+++ current/arch/powerpc/mm/hugetlbpage.c
@@ -297,7 +297,6 @@ void hugetlb_free_pgd_range(struct mmu_g
 	start = addr;
 	pgd = pgd_offset((*tlb)->mm, addr);
 	do {
-		BUG_ON(! in_hugepage_area((*tlb)->mm->context, addr));
 		next = pgd_addr_end(addr, end);
 		if (pgd_none_or_clear_bad(pgd))
 			continue;
@@ -494,6 +493,44 @@ static int open_high_hpage_areas(struct 
 	return 0;
 }
 
+/*
+ * Called when tearing down a hugetlb vma.  See if we can free up any
+ * htlb areas so normal pages can be mapped there again.
+ */
+void arch_hugetlb_close_vma(struct vm_area_struct *vma)
+{
+	struct mm_struct *mm = vma->vm_mm;
+	unsigned long i;
+	struct slb_flush_info fi;
+	u16 inuse, hiflush, loflush;
+
+	if (!mm)
+		return;
+
+	inuse = mm->context.low_htlb_areas;
+	for (i = 0; i < NUM_LOW_AREAS; i++)
+		if (prepare_low_area_for_htlb(mm, i) == 0)
+			inuse &= ~(1 << i);
+	loflush = inuse ^ mm->context.low_htlb_areas;
+	mm->context.low_htlb_areas = inuse;
+
+	inuse = mm->context.high_htlb_areas;
+	for (i = 0; i < NUM_HIGH_AREAS; i++)
+		if (prepare_high_area_for_htlb(mm, i) == 0)
+			inuse &= ~(1 << i);
+	hiflush = inuse ^ mm->context.high_htlb_areas;
+	mm->context.high_htlb_areas = inuse;
+
+	/* the context changes must make it to memory before the flush,
+	 * so that further SLB misses do the right thing. */
+	mb();
+	fi.mm = mm;
+	if ((fi.newareas = loflush))
+		on_each_cpu(flush_low_segments, &fi, 0, 1);
+	if ((fi.newareas = hiflush))
+		on_each_cpu(flush_high_segments, &fi, 0, 1);
+}
+
 int prepare_hugepage_range(unsigned long addr, unsigned long len)
 {
 	int err = 0;
diff -upN reference/include/asm-powerpc/pgtable.h current/include/asm-powerpc/pgtable.h
--- reference/include/asm-powerpc/pgtable.h
+++ current/include/asm-powerpc/pgtable.h
@@ -155,6 +155,7 @@ extern unsigned long empty_zero_page[PAG
 
 #define HAVE_ARCH_UNMAPPED_AREA
 #define HAVE_ARCH_UNMAPPED_AREA_TOPDOWN
+#define ARCH_HAS_HUGETLB_CLOSE_VMA
 
 #endif
 
diff -upN reference/include/linux/hugetlb.h current/include/linux/hugetlb.h
--- reference/include/linux/hugetlb.h
+++ current/include/linux/hugetlb.h
@@ -85,6 +85,12 @@ pte_t huge_ptep_get_and_clear(struct mm_
 void hugetlb_prefault_arch_hook(struct mm_struct *mm);
 #endif
 
+#ifndef ARCH_HAS_HUGETLB_CLOSE_VMA
+#define arch_hugetlb_close_vma	NULL
+#else
+void arch_hugetlb_close_vma(struct vm_area_struct *vma);
+#endif
+
 #else /* !CONFIG_HUGETLB_PAGE */
 
 static inline int is_vm_hugetlb_page(struct vm_area_struct *vma)
diff -upN reference/mm/hugetlb.c current/mm/hugetlb.c
--- reference/mm/hugetlb.c
+++ current/mm/hugetlb.c
@@ -399,6 +399,7 @@ static struct page *hugetlb_nopage(struc
 
 struct vm_operations_struct hugetlb_vm_ops = {
 	.nopage = hugetlb_nopage,
+	.close = arch_hugetlb_close_vma,
 };
 
 static pte_t make_huge_pte(struct vm_area_struct *vma, struct page *page,

-- 
Adam Litke - (agl at us.ibm.com)
IBM Linux Technology Center

* Re: [PATCH] hugetlb: powerpc: Actively close unused htlb regions on vma close
  2006-06-02 14:08 [PATCH] hugetlb: powerpc: Actively close unused htlb regions on vma close Adam Litke
@ 2006-06-02 15:17 ` Dave Hansen
  2006-06-02 16:47   ` Adam Litke
  2006-06-02 16:43 ` Hugh Dickins
  2006-06-02 20:06 ` Christoph Lameter
  2 siblings, 1 reply; 9+ messages in thread
From: Dave Hansen @ 2006-06-02 15:17 UTC
  To: Adam Litke; +Cc: linuxppc-dev, David Gibson, linux-kernel, linux-mm

On Fri, 2006-06-02 at 09:08 -0500, Adam Litke wrote:
>  #define HAVE_ARCH_UNMAPPED_AREA
>  #define HAVE_ARCH_UNMAPPED_AREA_TOPDOWN
> +#define ARCH_HAS_HUGETLB_CLOSE_VMA
>  
>  #endif
>  
> diff -upN reference/include/linux/hugetlb.h current/include/linux/hugetlb.h
> --- reference/include/linux/hugetlb.h
> +++ current/include/linux/hugetlb.h
> @@ -85,6 +85,12 @@ pte_t huge_ptep_get_and_clear(struct mm_
>  void hugetlb_prefault_arch_hook(struct mm_struct *mm);
>  #endif
>  
> +#ifndef ARCH_HAS_HUGETLB_CLOSE_VMA
> +#define arch_hugetlb_close_vma      NULL
> +#else
> +void arch_hugetlb_close_vma(struct vm_area_struct *vma);
> +#endif

Please don't do this ARCH_HAS stuff.  Use Kconfig at the very least.
You could also have an arch-specific htlb vma init function that could
be used for other things in the future. 
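
Roughly, and with all names here being hypothetical, the shape would be:

	# arch/powerpc/Kconfig
	config HUGETLB_CLOSE_VMA
		bool
		default y if HUGETLB_PAGE

	/* include/linux/hugetlb.h */
	#ifdef CONFIG_HUGETLB_CLOSE_VMA
	void arch_hugetlb_close_vma(struct vm_area_struct *vma);
	#else
	static inline void arch_hugetlb_close_vma(struct vm_area_struct *vma)
	{
	}
	#endif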

> @@ -297,7 +297,6 @@ void hugetlb_free_pgd_range(struct mmu_g
>         start = addr;
>         pgd = pgd_offset((*tlb)->mm, addr);
>         do {
> -               BUG_ON(! in_hugepage_area((*tlb)->mm->context, addr));
>                 next = pgd_addr_end(addr, end);
>                 if (pgd_none_or_clear_bad(pgd))
>                         continue;

Why does this BUG() go away?

> +/*
> + * Called when tearing down a hugetlb vma.  See if we can free up any
> + * htlb areas so normal pages can be mapped there again.
> + */
> +void arch_hugetlb_close_vma(struct vm_area_struct *vma)
> +{
> +       struct mm_struct *mm = vma->vm_mm;
> +       unsigned long i;
> +       struct slb_flush_info fi;
> +       u16 inuse, hiflush, loflush;
> +
> +       if (!mm)
> +               return;

Why is this check necessary?  Do kernel threads use vmas? ;)

> +       inuse = mm->context.low_htlb_areas;
> +       for (i = 0; i < NUM_LOW_AREAS; i++)
> +               if (prepare_low_area_for_htlb(mm, i) == 0)
> +                       inuse &= ~(1 << i);

Why check _all_ the areas?  Shouldn't the check just be for the current
VMA's area?  Also, prepare_low_area_for_htlb() is a pretty silly
function name for its use here, since you are tearing down an htlb
area.  low_area_contains_vma() would be a bit more apt.

My first thought about this function is that it should probably be
asking the question, "is the VMA that I'm closing right now the last
one in this segment?"
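
Something like this, say, where low_area_contains_vma() is the
hypothetical helper named above and LOW_ESID_MASK is what this file
already uses elsewhere:

	u16 mask = LOW_ESID_MASK(vma->vm_start, vma->vm_end - vma->vm_start);

	for (i = 0; i < NUM_LOW_AREAS; i++)
		if ((mask & (1 << i)) && !low_area_contains_vma(mm, i))
			inuse &= ~(1 << i);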

> +       loflush = inuse ^ mm->context.low_htlb_areas;
> +       mm->context.low_htlb_areas = inuse;

This bit fiddling should really be done in some helper functions.  It
isn't immediately and completely obvious what this is doing.  
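
E.g. something like this (name made up):

	/* drop areas no longer in use from *areas; return bits to flush */
	static u16 htlb_shrink_areas(u16 *areas, u16 still_inuse)
	{
		u16 flush = *areas & ~still_inuse;

		*areas = still_inuse;
		return flush;
	}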

> +       inuse = mm->context.high_htlb_areas;

Are you re-using "inuse"?  How about a different variable name for a
different use?

> +       for (i = 0; i < NUM_HIGH_AREAS; i++)
> +               if (prepare_high_area_for_htlb(mm, i) == 0)
> +                       inuse &= ~(1 << i);
> +       hiflush = inuse ^ mm->context.high_htlb_areas;
> +       mm->context.high_htlb_areas = inuse;

This, combined with the other loop, completely rebuilds the mm->context's
view of htlb state, right?  Isn't that a bit excessive?

> +       /* the context changes must make it to memory before the flush,
> +        * so that further SLB misses do the right thing. */
> +       mb();
> +       fi.mm = mm;
> +       if ((fi.newareas = loflush))
> +               on_each_cpu(flush_low_segments, &fi, 0, 1);
> +       if ((fi.newareas = hiflush))
> +               on_each_cpu(flush_high_segments, &fi, 0, 1);
> +}

Yikes!  Think about a pathological program here.  It mmap()s 1 htlb
area, then unmaps it quickly, over and over.  What will that perform
like here?  
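
Concretely, I'm thinking of something like this (sketch only, headers
and error handling omitted; assumes a hugetlbfs mount at /mnt/huge with
huge page size hpage_size):

	int fd = open("/mnt/huge/x", O_CREAT|O_RDWR, 0600);
	for (;;) {
		void *p = mmap(NULL, hpage_size, PROT_READ|PROT_WRITE,
			       MAP_SHARED, fd, 0);
		munmap(p, hpage_size);	/* area scan + cross-CPU flushes */
	}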

-- Dave

* Re: [PATCH] hugetlb: powerpc: Actively close unused htlb regions on vma close
  2006-06-02 14:08 [PATCH] hugetlb: powerpc: Actively close unused htlb regions on vma close Adam Litke
  2006-06-02 15:17 ` Dave Hansen
@ 2006-06-02 16:43 ` Hugh Dickins
  2006-06-02 16:49   ` Adam Litke
  2006-06-02 20:06 ` Christoph Lameter
  2 siblings, 1 reply; 9+ messages in thread
From: Hugh Dickins @ 2006-06-02 16:43 UTC
  To: Adam Litke; +Cc: linuxppc-dev, David Gibson, linux-kernel, linux-mm

On Fri, 2 Jun 2006, Adam Litke wrote:
> 
> On powerpc, each segment can contain pages of only one size.  When a
> hugetlb mapping is requested, a segment is located and marked for use
> with huge pages.  This is a uni-directional operation -- hugetlb
> segments are never marked for use again with normal pages.  For long
> running processes which make use of a combination of normal and hugetlb
> mappings, this behavior can unduly constrain the virtual address space.
> 
> The following patch introduces an architecture-specific vm_ops.close()
> hook.  For all architectures besides powerpc, this is a no-op.  On
> powerpc, the low and high segments are scanned to locate empty hugetlb
> segments which can be made available for normal mappings.  Comments?

Wouldn't hugetlb_free_pgd_range be a better place to do that kind of
thing, all within arch/powerpc, no need for arch_hugetlb_close_vma etc?

Hugh

* Re: [PATCH] hugetlb: powerpc: Actively close unused htlb regions on vma close
  2006-06-02 15:17 ` Dave Hansen
@ 2006-06-02 16:47   ` Adam Litke
  0 siblings, 0 replies; 9+ messages in thread
From: Adam Litke @ 2006-06-02 16:47 UTC
  To: Dave Hansen; +Cc: linuxppc-dev, David Gibson, linux-kernel, linux-mm

On Fri, 2006-06-02 at 08:17 -0700, Dave Hansen wrote:
> On Fri, 2006-06-02 at 09:08 -0500, Adam Litke wrote:
> >  #define HAVE_ARCH_UNMAPPED_AREA
> >  #define HAVE_ARCH_UNMAPPED_AREA_TOPDOWN
> > +#define ARCH_HAS_HUGETLB_CLOSE_VMA
> >  
> >  #endif
> >  
> > diff -upN reference/include/linux/hugetlb.h current/include/linux/hugetlb.h
> > --- reference/include/linux/hugetlb.h
> > +++ current/include/linux/hugetlb.h
> > @@ -85,6 +85,12 @@ pte_t huge_ptep_get_and_clear(struct mm_
> >  void hugetlb_prefault_arch_hook(struct mm_struct *mm);
> >  #endif
> >  
> > +#ifndef ARCH_HAS_HUGETLB_CLOSE_VMA
> > +#define arch_hugetlb_close_vma      NULL
> > +#else
> > +void arch_hugetlb_close_vma(struct vm_area_struct *vma);
> > +#endif
> 
> Please don't do this ARCH_HAS stuff.  Use Kconfig at the very least.
> You could also have an arch-specific htlb vma init function that could
> be used for other things in the future. 

That's how the rest of the hugetlb arch hooks are implemented.

> > @@ -297,7 +297,6 @@ void hugetlb_free_pgd_range(struct mmu_g
> >         start = addr;
> >         pgd = pgd_offset((*tlb)->mm, addr);
> >         do {
> > -               BUG_ON(! in_hugepage_area((*tlb)->mm->context, addr));
> >                 next = pgd_addr_end(addr, end);
> >                 if (pgd_none_or_clear_bad(pgd))
> >                         continue;
> 
> Why does this BUG() go away?

Since the area is 'closed' to huge pages before the page tables are torn
down, it is no longer a bug to have huge ptes in a non-hugetlb region.

> > +/*
> > + * Called when tearing down a hugetlb vma.  See if we can free up any
> > + * htlb areas so normal pages can be mapped there again.
> > + */
> > +void arch_hugetlb_close_vma(struct vm_area_struct *vma)
> > +{
> > +       struct mm_struct *mm = vma->vm_mm;
> > +       unsigned long i;
> > +       struct slb_flush_info fi;
> > +       u16 inuse, hiflush, loflush;
> > +
> > +       if (!mm)
> > +               return;
> 
> Why is this check necessary?  Do kernel threads use vmas? ;)

Paranoia got the best of me here.  I have a habit of checking for null
before dereferencing pointers.  But as you suggest, it should be safe to
remove.

> > +       inuse = mm->context.low_htlb_areas;
> > +       for (i = 0; i < NUM_LOW_AREAS; i++)
> > +               if (prepare_low_area_for_htlb(mm, i) == 0)
> > +                       inuse &= ~(1 << i);
> 
> Why check _all_ the areas?  Shouldn't the check just be for the current
> VMA's area?  Also, prepare_low_area_for_htlb() is a pretty silly
> function name for its use here, since you are tearing down an htlb
> area.  low_area_contains_vma() would be a bit more apt.

Checking all the areas does make the code simpler (if a fair bit less
efficient).  I suppose I could only check htlb-enabled areas as a simple
optimization.  But checking only those regions affected by this vma
might not be that bad.

Yes I agree about the function names.  Originally I was planning to
rename these in a different patch, but I suppose those changes can be
folded into this already small patch.

> My first thought about this function is that it should probably be
> asking the question, "is the VMA that I'm closing right now the last
> one in this segment?"
> 
> > +       loflush = inuse ^ mm->context.low_htlb_areas;
> > +       mm->context.low_htlb_areas = inuse;
> 
> This bit fiddling should really be done in some helper functions.  It
> isn't immediately and completely obvious what this is doing.  
> 
> > +       inuse = mm->context.high_htlb_areas;
> 
> Are you re-using "inuse"?  How about a different variable name for a
> different use?
> 
> > +       for (i = 0; i < NUM_HIGH_AREAS; i++)
> > +               if (prepare_high_area_for_htlb(mm, i) == 0)
> > +                       inuse &= ~(1 << i);
> > +       hiflush = inuse ^ mm->context.high_htlb_areas;
> > +       mm->context.high_htlb_areas = inuse;
> 
> This, combined with the other loop, completely rebuilds the mm->context's
> view of htlb state, right?  Isn't that a bit excessive?

Ok.  These bit flipping operations might benefit from some abstraction
to share more code with the 'open' cases.  Point conceded.

> > +       /* the context changes must make it to memory before the flush,
> > +        * so that further SLB misses do the right thing. */
> > +       mb();
> > +       fi.mm = mm;
> > +       if ((fi.newareas = loflush))
> > +               on_each_cpu(flush_low_segments, &fi, 0, 1);
> > +       if ((fi.newareas = hiflush))
> > +               on_each_cpu(flush_high_segments, &fi, 0, 1);
> > +}
> 
> Yikes!  Think about a pathological program here.  It mmap()s 1 htlb
> area, then unmaps it quickly, over and over.  What will that perform
> like here?  

Well, it will only flush segments on cpus currently executing on the
same mm.  So said pathological program would only be slowing itself down
(with the exception of the interrupt overhead).

-- 
Adam Litke - (agl at us.ibm.com)
IBM Linux Technology Center

* Re: [PATCH] hugetlb: powerpc: Actively close unused htlb regions on vma close
  2006-06-02 16:43 ` Hugh Dickins
@ 2006-06-02 16:49   ` Adam Litke
  0 siblings, 0 replies; 9+ messages in thread
From: Adam Litke @ 2006-06-02 16:49 UTC
  To: Hugh Dickins; +Cc: linuxppc-dev, David Gibson, linux-kernel, linux-mm

On Fri, 2006-06-02 at 17:43 +0100, Hugh Dickins wrote:
> On Fri, 2 Jun 2006, Adam Litke wrote:
> > 
> > On powerpc, each segment can contain pages of only one size.  When a
> > hugetlb mapping is requested, a segment is located and marked for use
> > with huge pages.  This is a uni-directional operation -- hugetlb
> > segments are never marked for use again with normal pages.  For long
> > running processes which make use of a combination of normal and hugetlb
> > mappings, this behavior can unduly constrain the virtual address space.
> > 
> > The following patch introduces an architecture-specific vm_ops.close()
> > hook.  For all architectures besides powerpc, this is a no-op.  On
> > powerpc, the low and high segments are scanned to locate empty hugetlb
> > segments which can be made available for normal mappings.  Comments?
> 
> Wouldn't hugetlb_free_pgd_range be a better place to do that kind of
> thing, all within arch/powerpc, no need for arch_hugetlb_close_vma etc?

Hmm.  Interesting idea.  I'll take a look.

-- 
Adam Litke - (agl at us.ibm.com)
IBM Linux Technology Center

* Re: [PATCH] hugetlb: powerpc: Actively close unused htlb regions on vma close
  2006-06-02 14:08 [PATCH] hugetlb: powerpc: Actively close unused htlb regions on vma close Adam Litke
  2006-06-02 15:17 ` Dave Hansen
  2006-06-02 16:43 ` Hugh Dickins
@ 2006-06-02 20:06 ` Christoph Lameter
  2006-06-02 20:57   ` Adam Litke
  2 siblings, 1 reply; 9+ messages in thread
From: Christoph Lameter @ 2006-06-02 20:06 UTC
  To: Adam Litke; +Cc: linuxppc-dev, David Gibson, linux-kernel, linux-mm

On Fri, 2 Jun 2006, Adam Litke wrote:

> The following patch introduces an architecture-specific vm_ops.close()
> hook.  For all architectures besides powerpc, this is a no-op.  On
> powerpc, the low and high segments are scanned to locate empty hugetlb
> segments which can be made available for normal mappings.  Comments?

IA64 has similar issues and uses the hook suggested by Hugh. However, we 
have a permanently reserved memory area. I am a bit surprised about the 
need to make address space available for normal mappings. Is this for 32 
bit powerpc support?
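
For reference, IA64's hugetlb_free_pgd_range() looks like this: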

void hugetlb_free_pgd_range(struct mmu_gather **tlb,
                        unsigned long addr, unsigned long end,
                        unsigned long floor, unsigned long ceiling)
{
        /*
         * This is called to free hugetlb page tables.
         *
         * The offset of these addresses from the base of the hugetlb
         * region must be scaled down by HPAGE_SIZE/PAGE_SIZE so that
         * the standard free_pgd_range will free the right page tables.
         *
         * If floor and ceiling are also in the hugetlb region, they
         * must likewise be scaled down; but if outside, left unchanged.
         */

        addr = htlbpage_to_page(addr);
        end  = htlbpage_to_page(end);
        if (REGION_NUMBER(floor) == RGN_HPAGE)
                floor = htlbpage_to_page(floor);
        if (REGION_NUMBER(ceiling) == RGN_HPAGE)
                ceiling = htlbpage_to_page(ceiling);

        free_pgd_range(tlb, addr, end, floor, ceiling);
}

* Re: [PATCH] hugetlb: powerpc: Actively close unused htlb regions on vma close
  2006-06-02 20:06 ` Christoph Lameter
@ 2006-06-02 20:57   ` Adam Litke
  2006-06-02 21:08     ` Christoph Lameter
  0 siblings, 1 reply; 9+ messages in thread
From: Adam Litke @ 2006-06-02 20:57 UTC
  To: Christoph Lameter; +Cc: linuxppc-dev, David Gibson, linux-kernel, linux-mm

On Fri, 2006-06-02 at 13:06 -0700, Christoph Lameter wrote:
> On Fri, 2 Jun 2006, Adam Litke wrote:
> 
> > The following patch introduces an architecture-specific vm_ops.close()
> > hook.  For all architectures besides powerpc, this is a no-op.  On
> > powerpc, the low and high segments are scanned to locate empty hugetlb
> > segments which can be made available for normal mappings.  Comments?
> 
> IA64 has similar issues and uses the hook suggested by Hugh. However, we 
> have a permanently reserved memory area. I am a bit surprised about the 
> need to make address space available for normal mappings. Is this for 32 
> bit powerpc support?

I now have a working implementation using Hugh's approach and
incorporating some of Dave Hansen's suggestions (attached below for
reference).

The real reason I want to "close" hugetlb regions (even on 64bit
platforms) is so a process can replace a previous hugetlb mapping with
normal pages when huge pages become scarce.  An example would be the
hugetlb morecore (malloc) feature in libhugetlbfs :)
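
(That feature gets switched on from the environment; roughly, going
from memory of current libhugetlbfs, something like:

	LD_PRELOAD=libhugetlbfs.so HUGETLB_MORECORE=yes ./app

which backs malloc() with huge pages while they last and falls back to
normal pages when they run out.)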

[PATCH] powerpc: Close hugetlb regions when unmapping VMAs

On powerpc, each segment can contain pages of only one size.  When a hugetlb
mapping is requested, a segment is located and marked for use with huge pages.
This is a uni-directional operation -- hugetlb segments are never marked for
use again with normal pages.  For long running processes which make use of a
combination of normal and hugetlb mappings, this behavior can unduly constrain
the virtual address space.

Changes since V1:
 * Modifications limited to arch-specific code (hugetlb_free_pgd_range)
 * Only scan segments covered by the range to be unmapped

Signed-off-by: Adam Litke <agl@us.ibm.com>
---
 hugetlbpage.c |   49 +++++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 49 insertions(+)
diff -upN reference/arch/powerpc/mm/hugetlbpage.c current/arch/powerpc/mm/hugetlbpage.c
--- reference/arch/powerpc/mm/hugetlbpage.c
+++ current/arch/powerpc/mm/hugetlbpage.c
@@ -52,6 +52,7 @@
 typedef struct { unsigned long pd; } hugepd_t;
 
 #define hugepd_none(hpd)	((hpd).pd == 0)
+void close_hugetlb_areas(struct mm_struct *mm, unsigned long, unsigned long);
 
 static inline pte_t *hugepd_page(hugepd_t hpd)
 {
@@ -303,6 +304,8 @@ void hugetlb_free_pgd_range(struct mmu_g
 			continue;
 		hugetlb_free_pud_range(*tlb, pgd, addr, next, floor, ceiling);
 	} while (pgd++, addr = next, addr != end);
+
+	close_hugetlb_areas((*tlb)->mm, start, end);
 }
 
 void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
@@ -518,6 +521,52 @@ int prepare_hugepage_range(unsigned long
 	return 0;
 }
 
+void close_hugetlb_areas(struct mm_struct *mm, unsigned long start,
+		unsigned long end)
+{
+	unsigned long i;
+	struct slb_flush_info fi;
+	u16 inuse, hiflush = 0, loflush = 0, mask;
+
+	if (!mm)
+		return;
+
+	if (start < 0x100000000UL) {	/* low areas live below 4GB */
+		mask = LOW_ESID_MASK(start, end - start);
+		inuse = mm->context.low_htlb_areas;
+		for (i = 0; i < NUM_LOW_AREAS; i++) {
+			if (!(mask & (1 << i)))
+				continue;
+			if (prepare_low_area_for_htlb(mm, i) == 0)
+				inuse &= ~(1 << i);
+		}
+		loflush = inuse ^ mm->context.low_htlb_areas;
+		mm->context.low_htlb_areas = inuse;
+	}
+
+	if (end > 0x100000000UL) {	/* high areas live above 4GB */
+		mask = HTLB_AREA_MASK(start, end - start);
+		inuse = mm->context.high_htlb_areas;
+		for (i = 0; i < NUM_HIGH_AREAS; i++) {
+			if (!(mask & (1 << i)))
+				continue;
+			if (prepare_high_area_for_htlb(mm, i) == 0)
+				inuse &= ~(1 << i);
+		}
+		hiflush = inuse ^ mm->context.high_htlb_areas;
+		mm->context.high_htlb_areas = inuse;
+	}
+
+	/* the context changes must make it to memory before the flush,
+	 * so that further SLB misses do the right thing. */
+	mb();
+	fi.mm = mm;
+	if ((fi.newareas = loflush))
+		on_each_cpu(flush_low_segments, &fi, 0, 1);
+	if ((fi.newareas = hiflush))
+		on_each_cpu(flush_high_segments, &fi, 0, 1);
+}
+
 struct page *
 follow_huge_addr(struct mm_struct *mm, unsigned long address, int write)
 {

-- 
Adam Litke - (agl at us.ibm.com)
IBM Linux Technology Center

* Re: [PATCH] hugetlb: powerpc: Actively close unused htlb regions on vma close
  2006-06-02 20:57   ` Adam Litke
@ 2006-06-02 21:08     ` Christoph Lameter
  2006-06-09  8:49       ` David Gibson
  0 siblings, 1 reply; 9+ messages in thread
From: Christoph Lameter @ 2006-06-02 21:08 UTC
  To: Adam Litke; +Cc: linuxppc-dev, David Gibson, linux-kernel, linux-mm

On Fri, 2 Jun 2006, Adam Litke wrote:

> The real reason I want to "close" hugetlb regions (even on 64bit
> platforms) is so a process can replace a previous hugetlb mapping with
> normal pages when huge pages become scarce.  An example would be the
> hugetlb morecore (malloc) feature in libhugetlbfs :)

Well, that approach won't work on IA64, it seems.

* Re: [PATCH] hugetlb: powerpc: Actively close unused htlb regions on vma close
  2006-06-02 21:08     ` Christoph Lameter
@ 2006-06-09  8:49       ` David Gibson
  0 siblings, 0 replies; 9+ messages in thread
From: David Gibson @ 2006-06-09  8:49 UTC
  To: Christoph Lameter; +Cc: linux-mm, linuxppc-dev, linux-kernel


On Fri, Jun 02, 2006 at 02:08:27PM -0700, Christoph Lameter wrote:
> On Fri, 2 Jun 2006, Adam Litke wrote:
> 
> > The real reason I want to "close" hugetlb regions (even on 64bit
> > platforms) is so a process can replace a previous hugetlb mapping with
> > normal pages when huge pages become scarce.  An example would be the
> > hugetlb morecore (malloc) feature in libhugetlbfs :)
> 
> Well, that approach won't work on IA64, it seems.

Yes, but there's not much that can be done about that.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson
