Re: [PATCH 1/1] hugetlbfs: handle pages higher order than MAX_ORDER

From: Mel Gorman <mel@csn.ul.ie>
To: Andy Whitcroft <apw@shadowen.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Jon Tollefson <kniht@linux.vnet.ibm.com>,
	Nick Piggin <nickpiggin@yahoo.com.au>
Subject: Re: [PATCH 1/1] hugetlbfs: handle pages higher order than MAX_ORDER
Date: Wed, 8 Oct 2008 15:57:00 +0100
Message-ID: <20081008145659.GA13816@csn.ul.ie>
In-Reply-To: <1223458431-12640-2-git-send-email-apw@shadowen.org>

On (08/10/08 10:33), Andy Whitcroft didst pronounce:
> When working with hugepages, hugetlbfs assumes that those hugepages
> are smaller than MAX_ORDER.  Specifically it assumes that the mem_map
> is contiguous and uses that to optimise access to the elements of the
> mem_map that represent the hugepage.  Gigantic pages (such as 16GB pages
> on powerpc) by definition are of greater order than MAX_ORDER (larger
> than MAX_ORDER_NR_PAGES in size).  This means that we can no longer make
> use of the buddy allocator guarantees for the contiguity of the mem_map,
> which ensure that the mem_map is at least contiguous for maximally
> aligned areas of MAX_ORDER_NR_PAGES pages.
> 
> This patch adds new mem_map accessors and iterator helpers which handle
> any discontiguity at MAX_ORDER_NR_PAGES boundaries.  It then uses these
> within copy_huge_page, clear_huge_page, and follow_hugetlb_page to allow
> these to handle gigantic pages.
> 
> Signed-off-by: Andy Whitcroft <apw@shadowen.org>

Acked-by: Mel Gorman <mel@csn.ul.ie>

> ---
>  mm/hugetlb.c  |   15 ++++++++++-----
>  mm/internal.h |   28 ++++++++++++++++++++++++++++
>  2 files changed, 38 insertions(+), 5 deletions(-)
> 
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 67a7119..bb5cf81 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -357,11 +357,12 @@ static void clear_huge_page(struct page *page,
>  			unsigned long addr, unsigned long sz)
>  {
>  	int i;
> +	struct page *p = page;
>  
>  	might_sleep();
> -	for (i = 0; i < sz/PAGE_SIZE; i++) {
> +	for (i = 0; i < sz/PAGE_SIZE; i++, p = mem_map_next(p, page, i)) {
>  		cond_resched();
> -		clear_user_highpage(page + i, addr + i * PAGE_SIZE);
> +		clear_user_highpage(p, addr + i * PAGE_SIZE);
>  	}
>  }
>  
> @@ -370,11 +371,15 @@ static void copy_huge_page(struct page *dst, struct page *src,
>  {
>  	int i;
>  	struct hstate *h = hstate_vma(vma);
> +	struct page *dst_base = dst;
> +	struct page *src_base = src;
>  
>  	might_sleep();
> -	for (i = 0; i < pages_per_huge_page(h); i++) {
> +	for (i = 0; i < pages_per_huge_page(h); i++,
> +				dst = mem_map_next(dst, dst_base, i),
> +				src = mem_map_next(src, src_base, i)) {
>  		cond_resched();
> -		copy_user_highpage(dst + i, src + i, addr + i*PAGE_SIZE, vma);
> +		copy_user_highpage(dst, src, addr + i*PAGE_SIZE, vma);
>  	}
>  }
>  
> @@ -2103,7 +2108,7 @@ int follow_hugetlb_page(struct mm_struct *mm, struct vm_area_struct *vma,
>  same_page:
>  		if (pages) {
>  			get_page(page);
> -			pages[i] = page + pfn_offset;
> +			pages[i] = mem_map_offset(page, pfn_offset);
>  		}
>  
>  		if (vmas)
> diff --git a/mm/internal.h b/mm/internal.h
> index 1f43f74..08b8dea 100644
> --- a/mm/internal.h
> +++ b/mm/internal.h
> @@ -53,6 +53,34 @@ static inline unsigned long page_order(struct page *page)
>  }
>  
>  /*
> + * Return the mem_map entry representing the 'offset' subpage within
> + * the maximally aligned gigantic page 'base'.  Handle any discontiguity
> + * in the mem_map at MAX_ORDER_NR_PAGES boundaries.
> + */
> +static inline struct page *mem_map_offset(struct page *base, int offset)
> +{
> +	if (unlikely(offset >= MAX_ORDER_NR_PAGES))
> +		return pfn_to_page(page_to_pfn(base) + offset);
> +	return base + offset;
> +}
> +
> +/*
> + * Iterator over all subpages within the maximally aligned gigantic
> + * page 'base'.  Handle any discontiguity in the mem_map.
> + */
> +static inline struct page *mem_map_next(struct page *iter,
> +						struct page *base, int offset)
> +{
> +	if (unlikely((offset & (MAX_ORDER_NR_PAGES - 1)) == 0)) {
> +		unsigned long pfn = page_to_pfn(base) + offset;
> +		if (!pfn_valid(pfn))
> +			return NULL;
> +		return pfn_to_page(pfn);
> +	}
> +	return iter + 1;
> +}
> +
> +/*
>   * FLATMEM and DISCONTIGMEM configurations use alloc_bootmem_node,
>   * so all functions starting at paging_init should be marked __init
>   * in those cases. SPARSEMEM, however, allows for memory hotplug,
> -- 
> 1.6.0.1.451.gc8d31
> 
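
The discontiguity the helpers guard against can be reproduced in user
space.  The following is a minimal sketch, not kernel code: every sim_*
name, the four-page block size, and the swapped-block layout are
assumptions made for illustration.  Only the control flow of the two
helpers mirrors the patch above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Toy values for illustration only; the kernel's are far larger. */
#define SIM_MAX_ORDER_NR_PAGES 4
#define SIM_NR_BLOCKS 3
#define SIM_NR_PAGES (SIM_MAX_ORDER_NR_PAGES * SIM_NR_BLOCKS)

struct sim_page { char pad; };	/* hypothetical stand-in for struct page */

/*
 * Toy mem_map.  Block k (pfns 4k..4k+3) is stored in slot sim_slot[k],
 * so blocks 0 and 1 are swapped: entries are contiguous inside a block
 * but not across the boundary between blocks.
 */
static struct sim_page sim_store[SIM_NR_PAGES];
static const unsigned long sim_slot[SIM_NR_BLOCKS] = { 1, 0, 2 };

static int sim_pfn_valid(unsigned long pfn)
{
	return pfn < SIM_NR_PAGES;
}

static struct sim_page *sim_pfn_to_page(unsigned long pfn)
{
	unsigned long slot = sim_slot[pfn / SIM_MAX_ORDER_NR_PAGES];

	return &sim_store[slot * SIM_MAX_ORDER_NR_PAGES +
			  pfn % SIM_MAX_ORDER_NR_PAGES];
}

static unsigned long sim_page_to_pfn(struct sim_page *page)
{
	unsigned long idx = (unsigned long)(page - sim_store);
	/* The permutation is its own inverse, so sim_slot maps back too. */
	unsigned long block = sim_slot[idx / SIM_MAX_ORDER_NR_PAGES];

	return block * SIM_MAX_ORDER_NR_PAGES + idx % SIM_MAX_ORDER_NR_PAGES;
}

/* Same shape as mem_map_offset() in the patch. */
static struct sim_page *sim_mem_map_offset(struct sim_page *base, int offset)
{
	if (offset >= SIM_MAX_ORDER_NR_PAGES)
		return sim_pfn_to_page(sim_page_to_pfn(base) + offset);
	return base + offset;
}

/* Same shape as mem_map_next() in the patch. */
static struct sim_page *sim_mem_map_next(struct sim_page *iter,
					 struct sim_page *base, int offset)
{
	if ((offset & (SIM_MAX_ORDER_NR_PAGES - 1)) == 0) {
		unsigned long pfn = sim_page_to_pfn(base) + offset;

		if (!sim_pfn_valid(pfn))
			return NULL;
		return sim_pfn_to_page(pfn);
	}
	return iter + 1;
}

/* Walk nr_pages subpages the way clear_huge_page() does after the patch. */
static int sim_walk_matches(unsigned long base_pfn, int nr_pages)
{
	struct sim_page *base = sim_pfn_to_page(base_pfn);
	struct sim_page *p = base;
	int i;

	for (i = 0; i < nr_pages; i++, p = sim_mem_map_next(p, base, i)) {
		if (p == NULL || sim_page_to_pfn(p) != base_pfn + i)
			return 0;
	}
	return 1;
}

#ifdef SIM_DEMO_MAIN
int main(void)
{
	struct sim_page *base = sim_pfn_to_page(0);

	printf("base + 4      -> pfn %lu\n", sim_page_to_pfn(base + 4));
	printf("offset helper -> pfn %lu\n",
	       sim_page_to_pfn(sim_mem_map_offset(base, 4)));
	assert(sim_walk_matches(0, 8));
	return 0;
}
#endif
```

Built with -DSIM_DEMO_MAIN, the plain pointer addition lands on pfn 8
in this layout while the helper lands on pfn 4, which is the subpage
that was actually asked for.  The iterator pays for the pfn lookup only
once per block and uses iter + 1 everywhere else.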

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab

