From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
To: Christophe Leroy <christophe.leroy@c-s.fr>,
	Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	Paul Mackerras <paulus@samba.org>,
	Michael Ellerman <mpe@ellerman.id.au>,
	Scott Wood <oss@buserror.net>
Cc: linux-kernel@vger.kernel.org, linuxppc-dev@lists.ozlabs.org
Subject: Re: [PATCH v2 2/3] powerpc: get hugetlbpage handling more generic
Date: Mon, 19 Sep 2016 11:15:44 +0530
Message-ID: <87poo0xx13.fsf@linux.vnet.ibm.com>
In-Reply-To: <dab1fde59f711dd7d3f1996dee72f61c46033f47.1474009019.git.christophe.leroy@c-s.fr>

Christophe Leroy <christophe.leroy@c-s.fr> writes:

> Today there are two implementations of hugetlbpages which are managed
> by exclusive #ifdefs:
> * FSL_BOOKE: several directory entries point to the same single hugepage
> * BOOK3S: one upper level directory entry points to a table of hugepages
>
> In preparation of implementation of hugepage support on the 8xx, we
> need a mix of the two above solutions, because the 8xx needs both cases
> depending on the size of pages:
> * In 4k page size mode, each PGD entry covers a 4M bytes area. It means
> that 2 PGD entries will be necessary to cover an 8M hugepage while a
> single PGD entry will cover 8x 512k hugepages.
> * In 16k page size mode, each PGD entry covers a 64M bytes area. It means
> that 8x 8M hugepages will be covered by one PGD entry and 128x 512k
> hugepages will be covered by one PGD entry.
>
> This patch:
> * removes #ifdefs in favor of if/else based on the range sizes
> * merges the two huge_pte_alloc() functions as they are pretty similar
> * merges the two hugetlbpage_init() functions as they are pretty similar
>
> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
> ---
> v2: This part is new and results from a split of the last patch of the v1
> series in two parts
>
>  arch/powerpc/mm/hugetlbpage.c | 189 +++++++++++++++++-------------------------
>  1 file changed, 77 insertions(+), 112 deletions(-)
>
> diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c
> index 8a512b1..2119f00 100644
> --- a/arch/powerpc/mm/hugetlbpage.c
> +++ b/arch/powerpc/mm/hugetlbpage.c
> @@ -64,14 +64,16 @@ static int __hugepte_alloc(struct mm_struct *mm, hugepd_t *hpdp,
>  {
>  	struct kmem_cache *cachep;
>  	pte_t *new;
> -
> -#ifdef CONFIG_PPC_FSL_BOOK3E
>  	int i;
> -	int num_hugepd = 1 << (pshift - pdshift);
> -	cachep = hugepte_cache;
> -#else
> -	cachep = PGT_CACHE(pdshift - pshift);
> -#endif
> +	int num_hugepd;
> +
> +	if (pshift >= pdshift) {
> +		cachep = hugepte_cache;
> +		num_hugepd = 1 << (pshift - pdshift);
> +	} else {
> +		cachep = PGT_CACHE(pdshift - pshift);
> +		num_hugepd = 1;
> +	}

Is there a way to hint the likely/unlikely branch based on the page size
selected at build time?
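
For illustration, here is a minimal standalone sketch of the arithmetic behind
that branch and of one way a build-time page size could feed a branch hint.
Every name prefixed with sketch_ or SKETCH_ is hypothetical and not taken from
the patch; the shift values follow the 8xx figures in the commit message, and
__builtin_expect assumes GCC or Clang.

```c
#include <assert.h>

/* Hypothetical build-time page size: 12 = 4k base pages, 14 = 16k. */
#ifndef SKETCH_PAGE_SHIFT
#define SKETCH_PAGE_SHIFT 12
#endif

/* Per the commit message: a PGD entry covers 4M (4k mode) or 64M (16k mode). */
#define SKETCH_PGDIR_SHIFT (SKETCH_PAGE_SHIFT == 12 ? 22 : 26)

#define SKETCH_SHIFT_512K 19
#define SKETCH_SHIFT_8M   23

/*
 * In 16k mode no 8xx hugepage reaches the size of a PGD entry, so the
 * "several entries per hugepage" case could be hinted as unlikely.
 * In 4k mode both cases occur, so no hint is given.
 */
#if SKETCH_PAGE_SHIFT == 14
#define sketch_multi_pd(x) __builtin_expect(!!(x), 0)
#else
#define sketch_multi_pd(x) (x)
#endif

/* Number of directory entries that must point at one hugepage. */
static inline int sketch_num_hugepd(unsigned int pshift, unsigned int pdshift)
{
	assert(pshift < 31 && pdshift < 31);

	if (sketch_multi_pd(pshift >= pdshift))
		return 1 << (pshift - pdshift);
	return 1;
}

/* Number of hugepages that fit under a single directory entry. */
static inline int sketch_hugepages_per_pd(unsigned int pshift,
					  unsigned int pdshift)
{
	assert(pshift < 31 && pdshift < 31);

	if (pdshift > pshift)
		return 1 << (pdshift - pshift);
	return 1;
}
```

With the default 4k setting, sketch_num_hugepd(SKETCH_SHIFT_8M,
SKETCH_PGDIR_SHIFT) gives the 2 PGD entries per 8M hugepage mentioned in the
commit message, and sketch_hugepages_per_pd(SKETCH_SHIFT_512K,
SKETCH_PGDIR_SHIFT) gives the 8 hugepages per entry.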



>  
>  	new = kmem_cache_zalloc(cachep, GFP_KERNEL);
>  
> @@ -89,7 +91,7 @@ static int __hugepte_alloc(struct mm_struct *mm, hugepd_t *hpdp,
>  	smp_wmb();
>  
>  	spin_lock(&mm->page_table_lock);
> -#ifdef CONFIG_PPC_FSL_BOOK3E
> +
>  	/*
>  	 * We have multiple higher-level entries that point to the same
>  	 * actual pte location.  Fill in each as we go and backtrack on error.
> @@ -100,8 +102,13 @@ static int __hugepte_alloc(struct mm_struct *mm, hugepd_t *hpdp,
>  		if (unlikely(!hugepd_none(*hpdp)))
>  			break;
>  		else

....

> -#ifdef CONFIG_PPC_FSL_BOOK3E
>  struct kmem_cache *hugepte_cache;
>  static int __init hugetlbpage_init(void)
>  {
>  	int psize;
>  
> -	for (psize = 0; psize < MMU_PAGE_COUNT; ++psize) {
> -		unsigned shift;
> -
> -		if (!mmu_psize_defs[psize].shift)
> -			continue;
> -
> -		shift = mmu_psize_to_shift(psize);
> -
> -		/* Don't treat normal page sizes as huge... */
> -		if (shift != PAGE_SHIFT)
> -			if (add_huge_page_size(1ULL << shift) < 0)
> -				continue;
> -	}
> -
> -	/*
> -	 * Create a kmem cache for hugeptes.  The bottom bits in the pte have
> -	 * size information encoded in them, so align them to allow this
> -	 */
> -	hugepte_cache =  kmem_cache_create("hugepte-cache", sizeof(pte_t),
> -					   HUGEPD_SHIFT_MASK + 1, 0, NULL);
> -	if (hugepte_cache == NULL)
> -		panic("%s: Unable to create kmem cache for hugeptes\n",
> -		      __func__);
> -
> -	/* Default hpage size = 4M */
> -	if (mmu_psize_defs[MMU_PAGE_4M].shift)
> -		HPAGE_SHIFT = mmu_psize_defs[MMU_PAGE_4M].shift;
> -	else
> -		panic("%s: Unable to set default huge page size\n", __func__);
> -
> -
> -	return 0;
> -}
> -#else
> -static int __init hugetlbpage_init(void)
> -{
> -	int psize;
> -
> +#if !defined(CONFIG_PPC_FSL_BOOK3E)
>  	if (!radix_enabled() && !mmu_has_feature(MMU_FTR_16M_PAGE))
>  		return -ENODEV;
> -
> +#endif

Do we need that #if? radix_enabled() should evaluate to 0 there, so that
if condition should be removed at compile time, shouldn't it? Or are you
seeing errors with that?
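
As a rough standalone sketch of how the check would evaluate without the #if:
the helpers below are hypothetical stand-ins for radix_enabled() and
mmu_has_feature(MMU_FTR_16M_PAGE), and the assumption is that the radix helper
is a compile-time constant 0 on the platform in question.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in: radix is never enabled on this platform. */
static inline bool sketch_radix_enabled(void)
{
	return false;
}

/*
 * Mirrors the shape of the check in hugetlbpage_init(). has_16m_page
 * stands in for the result of mmu_has_feature(MMU_FTR_16M_PAGE).
 */
static inline int sketch_init_check(bool has_16m_page)
{
	/*
	 * With sketch_radix_enabled() constant false, the compiler can
	 * reduce this to "if (!has_16m_page)"; the test itself remains.
	 */
	if (!sketch_radix_enabled() && !has_16m_page)
		return -ENODEV;
	return 0;
}
```

Under these assumptions only the radix half of the condition folds away, so
the outcome still depends on whether the 16M feature bit is set.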


>  	for (psize = 0; psize < MMU_PAGE_COUNT; ++psize) {
>  		unsigned shift;
>  		unsigned pdshift;
> @@ -860,16 +807,31 @@ static int __init hugetlbpage_init(void)
>  		 * if we have pdshift and shift value same, we don't
>  		 * use pgt cache for hugepd.
>  		 */
> -		if (pdshift != shift) {
> +		if (pdshift > shift) {
>  			pgtable_cache_add(pdshift - shift, NULL);
>  			if (!PGT_CACHE(pdshift - shift))
>  				panic("hugetlbpage_init(): could not create "
>  				      "pgtable cache for %d bit pagesize\n", shift);
> +		} else if (!hugepte_cache) {
> +			/*
> +			 * Create a kmem cache for hugeptes.  The bottom bits in
> +			 * the pte have size information encoded in them, so
> +			 * align them to allow this
> +			 */
> +			hugepte_cache = kmem_cache_create("hugepte-cache",
> +							  sizeof(pte_t),
> +							  HUGEPD_SHIFT_MASK + 1,
> +							  0, NULL);
> +			if (hugepte_cache == NULL)
> +				panic("%s: Unable to create kmem cache "
> +				      "for hugeptes\n", __func__);
> +


We don't need hugepte_cache for book3s 64K. I guess we will end up
creating one here?
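
A minimal sketch of the branch selection as posted, to make the equal-shift
case explicit. The enum and function names are hypothetical, and the example
shift value of 24 is only an illustrative assumption for a configuration where
the hugepage size matches the directory entry size.

```c
/* Which cache the loop body would set up for one page size. */
enum sketch_cache {
	SKETCH_PGT_CACHE,	/* pgtable_cache_add(pdshift - shift, ...) */
	SKETCH_HUGEPTE_CACHE	/* kmem_cache_create("hugepte-cache", ...) */
};

static inline enum sketch_cache sketch_cache_for(unsigned int shift,
						 unsigned int pdshift)
{
	if (pdshift > shift)
		return SKETCH_PGT_CACHE;

	/* Reached for pdshift < shift and also for pdshift == shift. */
	return SKETCH_HUGEPTE_CACHE;
}
```

With shift and pdshift both equal to 24, sketch_cache_for() returns
SKETCH_HUGEPTE_CACHE, which is the situation the question above is about.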

>  		}
>  	}
>  
>  	/* Set default large page size. Currently, we pick 16M or 1M
>  	 * depending on what is available
> +	 * We select 4M on other ones.
>  	 */
>  	if (mmu_psize_defs[MMU_PAGE_16M].shift)
>  		HPAGE_SHIFT = mmu_psize_defs[MMU_PAGE_16M].shift;
> @@ -877,11 +839,14 @@ static int __init hugetlbpage_init(void)
>  		HPAGE_SHIFT = mmu_psize_defs[MMU_PAGE_1M].shift;
>  	else if (mmu_psize_defs[MMU_PAGE_2M].shift)
>  		HPAGE_SHIFT = mmu_psize_defs[MMU_PAGE_2M].shift;
> -
> +	else if (mmu_psize_defs[MMU_PAGE_4M].shift)
> +		HPAGE_SHIFT = mmu_psize_defs[MMU_PAGE_4M].shift;
> +	else
> +		panic("%s: Unable to set default huge page size\n", __func__);
>  
>  	return 0;
>  }
> -#endif
> +
>  arch_initcall(hugetlbpage_init);
>  
>  void flush_dcache_icache_hugepage(struct page *page)
> -- 
> 2.1.0

Thread overview: 15+ messages
2016-09-16  7:40 [PATCH v2 0/3] powerpc: implementation of huge pages for 8xx Christophe Leroy
2016-09-16  7:40 ` [PATCH v2 1/3] powerpc: port 64 bits pgtable_cache to 32 bits Christophe Leroy
2016-09-19  5:22   ` Aneesh Kumar K.V
2016-09-19 18:46     ` christophe leroy
2016-09-16  7:40 ` [PATCH v2 2/3] powerpc: get hugetlbpage handling more generic Christophe Leroy
2016-09-19  5:45   ` Aneesh Kumar K.V [this message]
2016-09-19 18:32     ` christophe leroy
2016-09-20  2:45       ` Aneesh Kumar K.V
2016-09-21  6:13     ` Christophe Leroy
2016-09-19  5:50   ` Aneesh Kumar K.V
2016-09-19 18:36     ` christophe leroy
2016-09-20  2:28       ` Aneesh Kumar K.V
2016-09-20  2:28         ` Aneesh Kumar K.V
2016-09-20  5:22         ` Christophe Leroy
2016-09-16  7:40 ` [PATCH v2 3/3] powerpc/8xx: Implement support of hugepages Christophe Leroy
