Re: [2/5] Cleanup management of kmem_caches for pagetables

linuxppc-dev.lists.ozlabs.org archive mirror
 help / color / mirror / Atom feed

From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
To: David Gibson <david@gibson.dropbear.id.au>
Cc: linuxppc-dev@ozlabs.org
Subject: Re: [2/5] Cleanup management of kmem_caches for pagetables
Date: Mon, 14 Sep 2009 08:28:11 +1000	[thread overview]
Message-ID: <1252880891.8375.94.camel@pasglop> (raw)
In-Reply-To: <20090909055916.58003B7B60@bilbo.ozlabs.org>

On Wed, 2009-09-09 at 15:59 +1000, David Gibson wrote:

> 6 files changed, 73 insertions(+), 108 deletions(-)

That's a pretty good start :-)

> +struct kmem_cache *pgtable_cache[PGF_SHIFT_MASK];
> +
> +void pgtable_cache_add(unsigned shift, void (*ctor)(void *))
> +{
> +	char *name;
> +	unsigned long table_size = sizeof(void *) << shift;
> +	struct kmem_cache *new;
> +
> +	BUG_ON((shift < 1) || (shift > PGF_SHIFT_MASK));
> +	if (PGT_CACHE(shift))
> +		return; /* Already have a cache of this size */
> +	name = kasprintf(GFP_KERNEL, "pgtable-2^%d", shift);
> +	new = kmem_cache_create(name, table_size, table_size, 0, ctor);
> +	PGT_CACHE(shift) = new;
> +}

I'm getting partial to verbose boot nowadays :-) At least a pr_debug if
not a pr_info up there "Allocated pgtable for order %d" or something
like that might end up being of some use when debugging things.

>  void pgtable_cache_init(void)
>  {
> -	pgtable_cache[0] = kmem_cache_create(pgtable_cache_name[0], PGD_TABLE_SIZE, PGD_TABLE_SIZE, SLAB_PANIC, pgd_ctor);
> -	pgtable_cache[1] = kmem_cache_create(pgtable_cache_name[1], PMD_TABLE_SIZE, PMD_TABLE_SIZE, SLAB_PANIC, pmd_ctor);
> +	pgtable_cache_add(PGD_INDEX_SIZE, pgd_ctor);
> +	pgtable_cache_add(PMD_INDEX_SIZE, pmd_ctor);
> +	if (!PGT_CACHE(PGD_INDEX_SIZE) || !PGT_CACHE(PMD_INDEX_SIZE))
> +		panic("Couldn't allocate pgtable caches");
> +	BUG_ON(!PGT_CACHE(PUD_INDEX_SIZE));

What if PUD_INDEX_SIZE is 0 ? (64k pages)

Couldn't we just do a

	if (PUD_INDEX_SIZE)
		pgtable_cache_add(PUD_INDEX_SIZE...)

If it's the same size as another cache it would just not do anything... 

> -static inline void pgtable_free(pgtable_free_t pgf)
> +static inline void pgtable_free(void *table, unsigned index_size)
>  {
> -	void *p = (void *)(pgf.val & ~PGF_CACHENUM_MASK);
> -	int cachenum = pgf.val & PGF_CACHENUM_MASK;
> -
> -	if (cachenum == PTE_NONCACHE_NUM)
> -		free_page((unsigned long)p);
> -	else
> -		kmem_cache_free(pgtable_cache[cachenum], p);
> +	if (!index_size)
> +		free_page((unsigned long)table);
> +	else {
> +		BUG_ON(index_size > PGF_SHIFT_MASK);
> +		kmem_cache_free(PGT_CACHE(index_size), table);
> +	}
>  }

Out of curiosity, what is the index_size == 0 case for ? Do we still use
it ? Agh... found it ... we don't put PTE pages in a cache, just use
gfp. I suppose that's lower overhead.

>  /* This needs to be big enough to allow for MMU_PAGE_COUNT + 2 to be stored
>   * and small enough to fit in the low bits of any naturally aligned page
>   * table cache entry. Arbitrarily set to 0x1f, that should give us some
>   * room to grow
>   */

The comment above will need updating (don't just remove it, please -do-
explain what it's all about :-)

> -#define PGF_CACHENUM_MASK	0x1f
> -
> -static inline pgtable_free_t pgtable_free_cache(void *p, int cachenum,
> -						unsigned long mask)
> -{
> -	BUG_ON(cachenum > PGF_CACHENUM_MASK);
> -
> -	return (pgtable_free_t){.val = ((unsigned long) p & ~mask) | cachenum};
> -}
> +#define PGF_SHIFT_MASK		0xf

That does bring one question tho... You still fill the batch by sticking
the shift into the low bits of the table right ? Which means that your
table must have at least 4 bits 0 at the bottom, ie, it must at least
have 2 pointers on 64-bit and 4 on 32-bit. Maybe you should add some
runtime test for the later (your comparisons to 1 do the job for 64-bit
but not for 32-bit or did I miss something ?)

Overall, a really nice cleanup... the previous stuff was hairy and prone
to breakage (see how it broke it twice due to misaligned cache names
array during the build-up of the book3e support). I'm actually tempted,
with a bit more testing, to sneak that into .32 despite arriving a bit
late, because the current code is really fragile.

Cheers,
Ben.

next prev parent reply	other threads:[~2009-09-13 22:28 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-09-09  5:55 [0/5] Assorted hugepage cleanups David Gibson
2009-09-09  5:59 ` [2/5] Cleanup management of kmem_caches for pagetables David Gibson
2009-09-13 22:28   ` Benjamin Herrenschmidt [this message]
2009-09-14  4:56     ` David Gibson
2009-09-09  5:59 ` [5/5] Split hash MMU specific hugepage code into a new file David Gibson
2009-09-09  5:59 ` [1/5] Make hpte_need_flush() correctly mask for multiple page sizes David Gibson
2009-09-09  5:59 ` [3/5] Allow more flexible layouts for hugepage pagetables David Gibson
2009-09-09  5:59 ` [4/5] Cleanup initialization of hugepages on powerpc David Gibson
  -- strict thread matches above, loose matches on Subject: below --
2009-09-15  6:41 [0/5] Assorted hugepage cleanups (v2) David Gibson
2009-09-15  6:43 ` [2/5] Cleanup management of kmem_caches for pagetables David Gibson
2009-09-28  4:39 [0/5] Assorted hugepage cleanups (v3) David Gibson
2009-09-28  4:41 ` [2/5] Cleanup management of kmem_caches for pagetables David Gibson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1252880891.8375.94.camel@pasglop \
    --to=benh@kernel.crashing.org \
    --cc=david@gibson.dropbear.id.au \
    --cc=linuxppc-dev@ozlabs.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).