From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
To: Christophe Leroy <christophe.leroy@c-s.fr>,
	Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	Paul Mackerras <paulus@samba.org>,
	Michael Ellerman <mpe@ellerman.id.au>,
	Scott Wood <oss@buserror.net>
Cc: linux-kernel@vger.kernel.org, linuxppc-dev@lists.ozlabs.org
Subject: Re: [PATCH v2 2/3] powerpc: get hugetlbpage handling more generic
Date: Mon, 19 Sep 2016 11:15:44 +0530
Message-ID: <87poo0xx13.fsf@linux.vnet.ibm.com>
In-Reply-To: <dab1fde59f711dd7d3f1996dee72f61c46033f47.1474009019.git.christophe.leroy@c-s.fr>

Christophe Leroy <christophe.leroy@c-s.fr> writes:

> Today there are two implementations of hugetlbpages which are managed
> by exclusive #ifdefs:
> * FSL_BOOKE: several directory entries point to the same single hugepage
> * BOOK3S: one upper level directory entry points to a table of hugepages
>
> In preparation for implementing hugepage support on the 8xx, we need
> a mix of the two above solutions, because the 8xx needs both cases
> depending on the page size:
> * In 4k page size mode, each PGD entry covers a 4M bytes area. It means
> that 2 PGD entries will be necessary to cover an 8M hugepage while a
> single PGD entry will cover 8x 512k hugepages.
> * In 16k page size mode, each PGD entry covers a 64M bytes area. It means
> that 8x 8M hugepages will be covered by one PGD entry and 64x 512k
> hugepages will be covered by one PGD entry.
>
> This patch:
> * removes #ifdefs in favor of if/else based on the range sizes
> * merges the two huge_pte_alloc() functions as they are pretty similar
> * merges the two hugetlbpage_init() functions as they are pretty similar
>
> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
> ---
> v2: This part is new and results from splitting the last patch of the
> v1 series in two parts
>
>  arch/powerpc/mm/hugetlbpage.c | 189 +++++++++++++++++-------------------------
>  1 file changed, 77 insertions(+), 112 deletions(-)
>
> diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c
> index 8a512b1..2119f00 100644
> --- a/arch/powerpc/mm/hugetlbpage.c
> +++ b/arch/powerpc/mm/hugetlbpage.c
> @@ -64,14 +64,16 @@ static int __hugepte_alloc(struct mm_struct *mm, hugepd_t *hpdp,
>  {
>  	struct kmem_cache *cachep;
>  	pte_t *new;
> -
> -#ifdef CONFIG_PPC_FSL_BOOK3E
>  	int i;
> -	int num_hugepd = 1 << (pshift - pdshift);
> -	cachep = hugepte_cache;
> -#else
> -	cachep = PGT_CACHE(pdshift - pshift);
> -#endif
> +	int num_hugepd;
> +
> +	if (pshift >= pdshift) {
> +		cachep = hugepte_cache;
> +		num_hugepd = 1 << (pshift - pdshift);
> +	} else {
> +		cachep = PGT_CACHE(pdshift - pshift);
> +		num_hugepd = 1;
> +	}
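
If I take the 8xx 4k mode numbers from the commit message as an
example: an 8M hugepage has pshift = 23 while the PGD level has
pdshift = 22 (one entry covers 4M), so it takes the first branch with
num_hugepd = 1 << (23 - 22) = 2; a 512k hugepage has pshift = 19 < 22
and takes the second branch, with PGT_CACHE(22 - 19) providing a table
of 2^3 = 8 hugepage entries, matching the "8x 512k hugepages" above.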

Is there a way to hint the likely/unlikely branch based on the page
size selected at build time?
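
Something like the following is what I have in mind -- only an untested
sketch, assuming CONFIG_PPC_FSL_BOOK3E (or one of the page size Kconfig
options) is the right build-time signal and that IS_ENABLED() plus
likely()/unlikely() are acceptable here:

	if (IS_ENABLED(CONFIG_PPC_FSL_BOOK3E) ? likely(pshift >= pdshift)
					      : unlikely(pshift >= pdshift)) {
		/* several directory entries point to the same hugepte table */
		cachep = hugepte_cache;
		num_hugepd = 1 << (pshift - pdshift);
	} else {
		/* one directory entry points to a table of hugeptes */
		cachep = PGT_CACHE(pdshift - pshift);
		num_hugepd = 1;
	}

IS_ENABLED() folds to a constant 0 or 1 at compile time, so each
configuration ends up with a single, appropriately hinted branch.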



>  
>  	new = kmem_cache_zalloc(cachep, GFP_KERNEL);
>  
> @@ -89,7 +91,7 @@ static int __hugepte_alloc(struct mm_struct *mm, hugepd_t *hpdp,
>  	smp_wmb();
>  
>  	spin_lock(&mm->page_table_lock);
> -#ifdef CONFIG_PPC_FSL_BOOK3E
> +
>  	/*
>  	 * We have multiple higher-level entries that point to the same
>  	 * actual pte location.  Fill in each as we go and backtrack on error.
> @@ -100,8 +102,13 @@ static int __hugepte_alloc(struct mm_struct *mm, hugepd_t *hpdp,
>  		if (unlikely(!hugepd_none(*hpdp)))
>  			break;
>  		else

....

> -#ifdef CONFIG_PPC_FSL_BOOK3E
>  struct kmem_cache *hugepte_cache;
>  static int __init hugetlbpage_init(void)
>  {
>  	int psize;
>  
> -	for (psize = 0; psize < MMU_PAGE_COUNT; ++psize) {
> -		unsigned shift;
> -
> -		if (!mmu_psize_defs[psize].shift)
> -			continue;
> -
> -		shift = mmu_psize_to_shift(psize);
> -
> -		/* Don't treat normal page sizes as huge... */
> -		if (shift != PAGE_SHIFT)
> -			if (add_huge_page_size(1ULL << shift) < 0)
> -				continue;
> -	}
> -
> -	/*
> -	 * Create a kmem cache for hugeptes.  The bottom bits in the pte have
> -	 * size information encoded in them, so align them to allow this
> -	 */
> -	hugepte_cache =  kmem_cache_create("hugepte-cache", sizeof(pte_t),
> -					   HUGEPD_SHIFT_MASK + 1, 0, NULL);
> -	if (hugepte_cache == NULL)
> -		panic("%s: Unable to create kmem cache for hugeptes\n",
> -		      __func__);
> -
> -	/* Default hpage size = 4M */
> -	if (mmu_psize_defs[MMU_PAGE_4M].shift)
> -		HPAGE_SHIFT = mmu_psize_defs[MMU_PAGE_4M].shift;
> -	else
> -		panic("%s: Unable to set default huge page size\n", __func__);
> -
> -
> -	return 0;
> -}
> -#else
> -static int __init hugetlbpage_init(void)
> -{
> -	int psize;
> -
> +#if !defined(CONFIG_PPC_FSL_BOOK3E)
>  	if (!radix_enabled() && !mmu_has_feature(MMU_FTR_16M_PAGE))
>  		return -ENODEV;
> -
> +#endif

Do we need that #if? radix_enabled() should become 0 and that if
condition should be removed at compile time, shouldn't it? Or are you
finding errors with that?
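
(To spell out what I have in mind -- this is only an illustration, not
the actual headers: on a configuration without radix I would expect
something along the lines of

	/* hypothetical non-radix definition */
	static inline bool radix_enabled(void) { return false; }

i.e. a compile-time constant, so the !radix_enabled() part costs
nothing at runtime; hence the question whether the #if is really there
to avoid a build error.)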


>  	for (psize = 0; psize < MMU_PAGE_COUNT; ++psize) {
>  		unsigned shift;
>  		unsigned pdshift;
> @@ -860,16 +807,31 @@ static int __init hugetlbpage_init(void)
>  		 * if we have pdshift and shift value same, we don't
>  		 * use pgt cache for hugepd.
>  		 */
> -		if (pdshift != shift) {
> +		if (pdshift > shift) {
>  			pgtable_cache_add(pdshift - shift, NULL);
>  			if (!PGT_CACHE(pdshift - shift))
>  				panic("hugetlbpage_init(): could not create "
>  				      "pgtable cache for %d bit pagesize\n", shift);
> +		} else if (!hugepte_cache) {
> +			/*
> +			 * Create a kmem cache for hugeptes.  The bottom bits in
> +			 * the pte have size information encoded in them, so
> +			 * align them to allow this
> +			 */
> +			hugepte_cache = kmem_cache_create("hugepte-cache",
> +							  sizeof(pte_t),
> +							  HUGEPD_SHIFT_MASK + 1,
> +							  0, NULL);
> +			if (hugepte_cache == NULL)
> +				panic("%s: Unable to create kmem cache "
> +				      "for hugeptes\n", __func__);
> +


We don't need hugepte_cache for book3s 64K. I guess we will end up
creating one here?
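
If so, maybe the cache creation could be gated on the configurations
that can actually take the pshift >= pdshift path -- just a thought,
untested, and assuming FSL_BOOK3E and the 8xx are the only ones:

	} else if ((IS_ENABLED(CONFIG_PPC_FSL_BOOK3E) ||
		    IS_ENABLED(CONFIG_PPC_8xx)) && !hugepte_cache) {
		/* same cache creation as in the hunk above */
		hugepte_cache = kmem_cache_create("hugepte-cache",
						  sizeof(pte_t),
						  HUGEPD_SHIFT_MASK + 1,
						  0, NULL);
		if (hugepte_cache == NULL)
			panic("%s: Unable to create kmem cache for hugeptes\n",
			      __func__);
	}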

>  		}
>  	}
>  
>  	/* Set default large page size. Currently, we pick 16M or 1M
>  	 * depending on what is available
> +	 * We select 4M on other ones.
>  	 */
>  	if (mmu_psize_defs[MMU_PAGE_16M].shift)
>  		HPAGE_SHIFT = mmu_psize_defs[MMU_PAGE_16M].shift;
> @@ -877,11 +839,14 @@ static int __init hugetlbpage_init(void)
>  		HPAGE_SHIFT = mmu_psize_defs[MMU_PAGE_1M].shift;
>  	else if (mmu_psize_defs[MMU_PAGE_2M].shift)
>  		HPAGE_SHIFT = mmu_psize_defs[MMU_PAGE_2M].shift;
> -
> +	else if (mmu_psize_defs[MMU_PAGE_4M].shift)
> +		HPAGE_SHIFT = mmu_psize_defs[MMU_PAGE_4M].shift;
> +	else
> +		panic("%s: Unable to set default huge page size\n", __func__);
>  
>  	return 0;
>  }
> -#endif
> +
>  arch_initcall(hugetlbpage_init);
>  
>  void flush_dcache_icache_hugepage(struct page *page)
> -- 
> 2.1.0

Thread overview: 14+ messages
2016-09-16  7:40 [PATCH v2 0/3] powerpc: implementation of huge pages for 8xx Christophe Leroy
2016-09-16  7:40 ` [PATCH v2 1/3] powerpc: port 64 bits pgtable_cache to 32 bits Christophe Leroy
2016-09-19  5:22   ` Aneesh Kumar K.V
2016-09-19 18:46     ` christophe leroy
2016-09-16  7:40 ` [PATCH v2 2/3] powerpc: get hugetlbpage handling more generic Christophe Leroy
2016-09-19  5:45   ` Aneesh Kumar K.V [this message]
2016-09-19 18:32     ` christophe leroy
2016-09-20  2:45       ` Aneesh Kumar K.V
2016-09-21  6:13     ` Christophe Leroy
2016-09-19  5:50   ` Aneesh Kumar K.V
2016-09-19 18:36     ` christophe leroy
2016-09-20  2:28       ` Aneesh Kumar K.V
2016-09-20  5:22         ` Christophe Leroy
2016-09-16  7:40 ` [PATCH v2 3/3] powerpc/8xx: Implement support of hugepages Christophe Leroy
