linuxppc-dev.lists.ozlabs.org archive mirror
From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
To: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org, paulus@samba.org
Subject: Re: [PATCH 2/2] powerpc: thp: invalidate old 64K based hash page mapping before insert
Date: Tue, 22 Jul 2014 15:32:03 +1000	[thread overview]
Message-ID: <1406007123.22200.11.camel@pasglop> (raw)
In-Reply-To: <1405435927-24027-2-git-send-email-aneesh.kumar@linux.vnet.ibm.com>

On Tue, 2014-07-15 at 20:22 +0530, Aneesh Kumar K.V wrote:
> If we change the base page size of the segment, either via sub_page_protect
> or via remap_4k_pfn, we do a demote_segment, which doesn't flush the hash
> table entries. We do that when inserting a new hash pte by checking the
> _PAGE_COMBO flag. We missed doing that when inserting the hash entries for
> a new 16MB page; add the same check there. This patch marks a 16MB hugepage
> with a 4K base page size via _PAGE_COMBO.

please improve the above, I don't understand it.

> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> ---
>  arch/powerpc/mm/hugepage-hash64.c | 66 +++++++++++++++++++++++++++++++++++++++
>  1 file changed, 66 insertions(+)
> 
> diff --git a/arch/powerpc/mm/hugepage-hash64.c b/arch/powerpc/mm/hugepage-hash64.c
> index 826893fcb3a7..28d1b8b93674 100644
> --- a/arch/powerpc/mm/hugepage-hash64.c
> +++ b/arch/powerpc/mm/hugepage-hash64.c
> @@ -18,6 +18,56 @@
>  #include <linux/mm.h>
>  #include <asm/machdep.h>
>  
> +static void flush_hash_hugepage(unsigned long vsid, unsigned long addr,
> +				pmd_t *pmdp, unsigned int psize, int ssize)
> +{

What does that function do? From its name, it would be used
whenever one wants to flush a huge page out of the hash, and thus would
be rather generic, but you only use it in a fairly narrow special
case...

> +	int i, max_hpte_count, valid;
> +	unsigned long s_addr = addr;
> +	unsigned char *hpte_slot_array;
> +	unsigned long hidx, shift, vpn, hash, slot;
> +
> +	hpte_slot_array = get_hpte_slot_array(pmdp);
> +	/*
> +	 * If we try to do a huge PTE update after a withdraw is done,
> +	 * we will find the slot array below NULL. This happens when we
> +	 * do split_huge_page_pmd.
> +	 */
> +	if (!hpte_slot_array)
> +		return;

Can I assume we have proper synchronization here? (Interrupts off vs. IPIs
on the withdraw side, or something similar?)

> +	if (ppc_md.hugepage_invalidate)
> +		return ppc_md.hugepage_invalidate(vsid, addr, hpte_slot_array,
> +						  psize, ssize);
> +	/*
> +	 * No bulk hpte removal support, invalidate each entry
> +	 */
> +	shift = mmu_psize_defs[psize].shift;
> +	max_hpte_count = HPAGE_PMD_SIZE >> shift;
> +	for (i = 0; i < max_hpte_count; i++) {
> +		/*
> +		 * 8 bits per hpte entry:
> +		 * 000| [ secondary group (one bit) | hidx (3 bits) | valid bit]
> +		 */
> +		valid = hpte_valid(hpte_slot_array, i);
> +		if (!valid)
> +			continue;
> +		hidx =  hpte_hash_index(hpte_slot_array, i);
> +
> +		/* get the vpn */
> +		addr = s_addr + (i * (1ul << shift));
> +		vpn = hpt_vpn(addr, vsid, ssize);
> +		hash = hpt_hash(vpn, shift, ssize);
> +		if (hidx & _PTEIDX_SECONDARY)
> +			hash = ~hash;
> +
> +		slot = (hash & htab_hash_mask) * HPTES_PER_GROUP;
> +		slot += hidx & _PTEIDX_GROUP_IX;
> +		ppc_md.hpte_invalidate(slot, vpn, psize,
> +				       MMU_PAGE_16M, ssize, 0);
> +	}
> +}
> +
> +
>  int __hash_page_thp(unsigned long ea, unsigned long access, unsigned long vsid,
>  		    pmd_t *pmdp, unsigned long trap, int local, int ssize,
>  		    unsigned int psize)
> @@ -85,6 +135,15 @@ int __hash_page_thp(unsigned long ea, unsigned long access, unsigned long vsid,
>  	vpn = hpt_vpn(ea, vsid, ssize);
>  	hash = hpt_hash(vpn, shift, ssize);
>  	hpte_slot_array = get_hpte_slot_array(pmdp);
> +	if (psize == MMU_PAGE_4K) {
> +		/*
> +		 * Invalidate the old hpte entry if we had it mapped with a
> +		 * 64K base page size, because demote_segment won't flush
> +		 * the hash page table entries.
> +		 */

Please provide a better explanation of the scenario; this is really not
clear to me.

> +		if (!(old_pmd & _PAGE_COMBO))
> +			flush_hash_hugepage(vsid, ea, pmdp, MMU_PAGE_64K, ssize);
> +	}
>  
>  	valid = hpte_valid(hpte_slot_array, index);
>  	if (valid) {
> @@ -172,6 +231,13 @@ repeat:
>  		mark_hpte_slot_valid(hpte_slot_array, index, slot);
>  	}
>  	/*
> +	 * Mark the pte with _PAGE_COMBO if we are hashing it with a
> +	 * 4K base page size.
> +	 */
> +	if (psize == MMU_PAGE_4K)
> +		new_pmd |= _PAGE_COMBO;
> +
> +

Why ? Please explain.

Ben.

> 	/*
>  	 * No need to use ldarx/stdcx here
>  	 */
>  	*pmdp = __pmd(new_pmd & ~_PAGE_BUSY);

Thread overview: 4+ messages
2014-07-15 14:52 [PATCH 1/2] powerpc: thp: don't recompute vsid and ssize in loop on invalidate Aneesh Kumar K.V
2014-07-15 14:52 ` [PATCH 2/2] powerpc: thp: invalidate old 64K based hash page mapping before insert Aneesh Kumar K.V
2014-07-22  5:32   ` Benjamin Herrenschmidt [this message]
2014-07-22 18:55     ` Aneesh Kumar K.V
