linuxppc-dev.lists.ozlabs.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v9 0/8] powerpc: Free up RPAGE_RSV bits
@ 2017-11-06  8:50 Ram Pai
  2017-11-06  8:50 ` [PATCH v9 1/8] powerpc: introduce pte_set_hidx() helper Ram Pai
                   ` (7 more replies)
  0 siblings, 8 replies; 10+ messages in thread
From: Ram Pai @ 2017-11-06  8:50 UTC (permalink / raw)
  To: mpe
  Cc: linuxppc-dev, benh, paulus, khandual, aneesh.kumar, bsingharora,
	hbabu, mhocko, bauerman, ebiederm, linuxram

RPAGE_RSV0..4 PTE bits are currently used for HPTE slot tracking.
We need these bits for memory-protection keys. Luckily these four
bits are relatively easier to move among all the other candidate
bits.

For 64K linux-PTEs backed by 4k HPTEs, these bits are used for
tracking the validity of the slot value stored in the
second-part-of-the-PTE. We device a new mechanism for tracking
the validity without using those bits. The mechanism is
explained in the first patch.

For 64K linux-PTE backed by 64K HPTEs, we simply move the slot
tracking bits to the second-part-of-the-PTE.

The above mechanism is also used to free the bits for hugetlb
linux-PTEs.

For 4k linux-PTE, we have only 3 free bits available. We swizzle
around the bits and release RPAGE_RSV{2,3,4} for memory protection
keys.

Testing:
--------
has survived kernel compilation on multiple platforms
p8 powernv hash-mode, p9 powernv hash-mode, p7 powervm,
p8-powervm, p8-kvm-guest.

History:
-------
This patchset is a spin-off from the memkey patchset.

version v10:
	(1) hidx value is rotated by one to save
		initialization of the secondary-PTE.
	(2) comments are not anymore right justified.
		-- comment by mpe

version v9:
	(1) rearranged the patch order. First the helper
	routines are defined followed by the
	patches that make use of the helpers.

version v8:
	(1) an additional patch added to free up
	RSV{2,3,4} on 4K linux-PTE.

version v7:
	(1) GIX bit reset change moved to the second
		patch -- noticed by Aneesh.
	(2) Separated this patches from memkey patchset
	(3) merged a bunch of patches, that used the
		helper function, into one.
version v6:
	(1) No changes related to PTE.

version v5:
	(1) No changes related to PTE.

version v4:
	(1) No changes related to PTE.

version v3:
	(1) split the patches into smaller consumable
		patches.
	(2) A bug fix while invalidating a HPTE slot
		in __hash_page_4K()
		-- noticed by Aneesh
version v2:
 	(1) fixed a bug in 4k HPTE backed 64k PTE
		where page invalidation was not
		done correctly, and initialization
		of second-part-of-the-PTE was not
		done correctly if the PTE was not
		yet Hashed with a HPTE.
		--	Reported by Aneesh.
	
version v1: Initial version

Ram Pai (8):
  powerpc: introduce pte_set_hidx() helper
  powerpc: introduce pte_get_hash_gslot() helper
  powerpc: Free up four 64K PTE bits in 4K backed HPTE pages
  powerpc: Free up four 64K PTE bits in 64K backed HPTE pages
  powerpc: shifted-by-one hidx value
  powerpc: Swizzle around 4K PTE bits to free up bit 5 and bit 6
  powerpc: use helper functions to get and set hash slots
  powerpc: capture the PTE format changes in the dump pte report

 arch/powerpc/include/asm/book3s/64/hash-4k.h  |   20 ++++
 arch/powerpc/include/asm/book3s/64/hash-64k.h |   71 ++++++++++----
 arch/powerpc/include/asm/book3s/64/hash.h     |    8 +-
 arch/powerpc/mm/dump_linuxpagetables.c        |    3 +-
 arch/powerpc/mm/hash64_4k.c                   |   14 +--
 arch/powerpc/mm/hash64_64k.c                  |  123 +++++++++++++------------
 arch/powerpc/mm/hash_utils_64.c               |   35 +++++--
 arch/powerpc/mm/hugetlbpage-hash64.c          |   16 +--
 8 files changed, 175 insertions(+), 115 deletions(-)

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH v9 1/8] powerpc: introduce pte_set_hidx() helper
  2017-11-06  8:50 [PATCH v9 0/8] powerpc: Free up RPAGE_RSV bits Ram Pai
@ 2017-11-06  8:50 ` Ram Pai
  2018-01-17 13:30   ` [v9,1/8] " Michael Ellerman
  2017-11-06  8:50 ` [PATCH v9 2/8] powerpc: introduce pte_get_hash_gslot() helper Ram Pai
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 10+ messages in thread
From: Ram Pai @ 2017-11-06  8:50 UTC (permalink / raw)
  To: mpe
  Cc: linuxppc-dev, benh, paulus, khandual, aneesh.kumar, bsingharora,
	hbabu, mhocko, bauerman, ebiederm, linuxram

Introduce pte_set_hidx().It sets the (H_PAGE_F_SECOND|H_PAGE_F_GIX) bits
at the appropriate location in the PTE of 4K PTE. For 64K PTE, it sets
the bits in the second part of the PTE. Though the implementation for
the former just needs the slot parameter, it does take some additional
parameters to keep the prototype consistent.

This function will be handy as we work towards re-arranging the bits in
the subsequent patches.

Acked-by: Balbir Singh <bsingharora@gmail.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
 arch/powerpc/include/asm/book3s/64/hash-4k.h  |   14 ++++++++++++++
 arch/powerpc/include/asm/book3s/64/hash-64k.h |   25 +++++++++++++++++++++++++
 2 files changed, 39 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/hash-4k.h b/arch/powerpc/include/asm/book3s/64/hash-4k.h
index 197ced1..2975fc1 100644
--- a/arch/powerpc/include/asm/book3s/64/hash-4k.h
+++ b/arch/powerpc/include/asm/book3s/64/hash-4k.h
@@ -49,6 +49,20 @@ static inline int hash__hugepd_ok(hugepd_t hpd)
 }
 #endif
 
+/*
+ * 4K PTE format is different from 64K PTE format. Saving the hash_slot is just
+ * a matter of returning the PTE bits that need to be modified. On 64K PTE,
+ * things are a little more involved and hence needs many more parameters to
+ * accomplish the same. However we want to abstract this out from the caller by
+ * keeping the prototype consistent across the two formats.
+ */
+static inline unsigned long pte_set_hidx(pte_t *ptep, real_pte_t rpte,
+			unsigned int subpg_index, unsigned long hidx)
+{
+	return (hidx << H_PAGE_F_GIX_SHIFT) &
+		(H_PAGE_F_SECOND | H_PAGE_F_GIX);
+}
+
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
 
 static inline char *get_hpte_slot_array(pmd_t *pmdp)
diff --git a/arch/powerpc/include/asm/book3s/64/hash-64k.h b/arch/powerpc/include/asm/book3s/64/hash-64k.h
index 8d40cf0..c1bd258 100644
--- a/arch/powerpc/include/asm/book3s/64/hash-64k.h
+++ b/arch/powerpc/include/asm/book3s/64/hash-64k.h
@@ -68,6 +68,8 @@ static inline real_pte_t __real_pte(pte_t pte, pte_t *ptep)
 	return rpte;
 }
 
+#define HIDX_BITS(x, index)  (x << (index << 2))
+
 static inline unsigned long __rpte_to_hidx(real_pte_t rpte, unsigned long index)
 {
 	if ((pte_val(rpte.pte) & H_PAGE_COMBO))
@@ -75,6 +77,29 @@ static inline unsigned long __rpte_to_hidx(real_pte_t rpte, unsigned long index)
 	return (pte_val(rpte.pte) >> H_PAGE_F_GIX_SHIFT) & 0xf;
 }
 
+/*
+ * Commit the hidx and return PTE bits that needs to be modified. The caller is
+ * expected to modify the PTE bits accordingly and commit the PTE to memory.
+ */
+static inline unsigned long pte_set_hidx(pte_t *ptep, real_pte_t rpte,
+		unsigned int subpg_index, unsigned long hidx)
+{
+	unsigned long *hidxp = (unsigned long *)(ptep + PTRS_PER_PTE);
+
+	rpte.hidx &= ~HIDX_BITS(0xfUL, subpg_index);
+	*hidxp = rpte.hidx  | HIDX_BITS(hidx, subpg_index);
+
+	/*
+	 * Anyone reading PTE must ensure hidx bits are read after reading the
+	 * PTE by using the read-side barrier smp_rmb(). __real_pte() can be
+	 * used for that.
+	 */
+	smp_wmb();
+
+	/* No PTE bits to be modified, return 0x0UL */
+	return 0x0UL;
+}
+
 #define __rpte_to_pte(r)	((r).pte)
 extern bool __rpte_sub_valid(real_pte_t rpte, unsigned long index);
 /*
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v9 2/8] powerpc: introduce pte_get_hash_gslot() helper
  2017-11-06  8:50 [PATCH v9 0/8] powerpc: Free up RPAGE_RSV bits Ram Pai
  2017-11-06  8:50 ` [PATCH v9 1/8] powerpc: introduce pte_set_hidx() helper Ram Pai
@ 2017-11-06  8:50 ` Ram Pai
  2017-11-06  8:50 ` [PATCH v9 3/8] powerpc: Free up four 64K PTE bits in 4K backed HPTE pages Ram Pai
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 10+ messages in thread
From: Ram Pai @ 2017-11-06  8:50 UTC (permalink / raw)
  To: mpe
  Cc: linuxppc-dev, benh, paulus, khandual, aneesh.kumar, bsingharora,
	hbabu, mhocko, bauerman, ebiederm, linuxram

Introduce pte_get_hash_gslot()() which returns the global slot number of
the HPTE in the global hash table.

This function will come in handy as we work towards re-arranging the PTE
bits in the later patches.

Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
 arch/powerpc/include/asm/book3s/64/hash.h |    3 +++
 arch/powerpc/mm/hash_utils_64.c           |   18 ++++++++++++++++++
 2 files changed, 21 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/hash.h b/arch/powerpc/include/asm/book3s/64/hash.h
index ecb1239..9099c1f 100644
--- a/arch/powerpc/include/asm/book3s/64/hash.h
+++ b/arch/powerpc/include/asm/book3s/64/hash.h
@@ -167,6 +167,9 @@ static inline int hash__pte_none(pte_t pte)
 	return (pte_val(pte) & ~H_PTE_NONE_MASK) == 0;
 }
 
+unsigned long pte_get_hash_gslot(unsigned long vpn, unsigned long shift,
+		int ssize, real_pte_t rpte, unsigned int subpg_index);
+
 /* This low level function performs the actual PTE insertion
  * Setting the PTE depends on the MMU type and other factors. It's
  * an horrible mess that I'm not going to try to clean up now but
diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index 67ec2e9..48b8968 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -1591,6 +1591,24 @@ static inline void tm_flush_hash_page(int local)
 }
 #endif
 
+/*
+ * Return the global hash slot, corresponding to the given PTE, which contains
+ * the HPTE.
+ */
+unsigned long pte_get_hash_gslot(unsigned long vpn, unsigned long shift,
+		int ssize, real_pte_t rpte, unsigned int subpg_index)
+{
+	unsigned long hash, gslot, hidx;
+
+	hash = hpt_hash(vpn, shift, ssize);
+	hidx = __rpte_to_hidx(rpte, subpg_index);
+	if (hidx & _PTEIDX_SECONDARY)
+		hash = ~hash;
+	gslot = (hash & htab_hash_mask) * HPTES_PER_GROUP;
+	gslot += hidx & _PTEIDX_GROUP_IX;
+	return gslot;
+}
+
 /* WARNING: This is called from hash_low_64.S, if you change this prototype,
  *          do not forget to update the assembly call site !
  */
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v9 3/8] powerpc: Free up four 64K PTE bits in 4K backed HPTE pages
  2017-11-06  8:50 [PATCH v9 0/8] powerpc: Free up RPAGE_RSV bits Ram Pai
  2017-11-06  8:50 ` [PATCH v9 1/8] powerpc: introduce pte_set_hidx() helper Ram Pai
  2017-11-06  8:50 ` [PATCH v9 2/8] powerpc: introduce pte_get_hash_gslot() helper Ram Pai
@ 2017-11-06  8:50 ` Ram Pai
  2017-11-06  8:50 ` [PATCH v9 4/8] powerpc: Free up four 64K PTE bits in 64K " Ram Pai
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 10+ messages in thread
From: Ram Pai @ 2017-11-06  8:50 UTC (permalink / raw)
  To: mpe
  Cc: linuxppc-dev, benh, paulus, khandual, aneesh.kumar, bsingharora,
	hbabu, mhocko, bauerman, ebiederm, linuxram

Rearrange 64K PTE bits to free up bits 3, 4, 5 and 6,
in the 4K backed HPTE pages.These bits continue to be used
for 64K backed HPTE pages in this patch, but will be freed
up in the next patch. The bit numbers are big-endian as
defined in the ISA3.0

The patch does the following change to the 4k HTPE backed
64K PTE's format.

H_PAGE_BUSY moves from bit 3 to bit 9 (B bit in the figure
		below)
V0 which occupied bit 4 is not used anymore.
V1 which occupied bit 5 is not used anymore.
V2 which occupied bit 6 is not used anymore.
V3 which occupied bit 7 is not used anymore.

Before the patch, the 4k backed 64k PTE format was as follows

 0 1 2 3 4  5  6  7  8 9 10...........................63
 : : : : :  :  :  :  : : :                            :
 v v v v v  v  v  v  v v v                            v

,-,-,-,-,--,--,--,--,-,-,-,-,-,------------------,-,-,-,
|x|x|x|B|V0|V1|V2|V3|x| | |x|x|................|x|x|x|x| <- primary pte
'_'_'_'_'__'__'__'__'_'_'_'_'_'________________'_'_'_'_'
|S|G|I|X|S |G |I |X |S|G|I|X|..................|S|G|I|X| <- secondary pte
'_'_'_'_'__'__'__'__'_'_'_'_'__________________'_'_'_'_'

After the patch, the 4k backed 64k PTE format is as follows

 0 1 2 3 4  5  6  7  8 9 10...........................63
 : : : : :  :  :  :  : : :                            :
 v v v v v  v  v  v  v v v                            v

,-,-,-,-,--,--,--,--,-,-,-,-,-,------------------,-,-,-,
|x|x|x| |  |  |  |  |x|B| |x|x|................|.|.|.|.| <- primary pte
'_'_'_'_'__'__'__'__'_'_'_'_'_'________________'_'_'_'_'
|S|G|I|X|S |G |I |X |S|G|I|X|..................|S|G|I|X| <- secondary pte
'_'_'_'_'__'__'__'__'_'_'_'_'__________________'_'_'_'_'

the four bits S,G,I,X (one quadruplet per 4k HPTE) that
cache the hash-bucket slot value, is initialized to
1,1,1,1 indicating -- an invalid slot. If a HPTE gets
cached in a 1111 slot(i.e 7th slot of secondary hash
bucket), it is released immediately. In other words,
even though 1111 is a valid slot value in the hash
bucket, we consider it invalid and release the slot and
the HPTE. This gives us the opportunity to determine
the validity of S,G,I,X bits based on its contents and
not on any of the bits V0,V1,V2 or V3 in the primary PTE

When we release a HPTE cached in the 1111 slot
we also release a legitimate slot in the primary
hash bucket and unmap its corresponding HPTE. This
is to ensure that we do get a HPTE cached in a slot
of the primary hash bucket, the next time we retry.

Though treating 1111 slot as invalid, reduces the
number of available slots in the hash bucket and may
have an effect on the performance, the probabilty of
hitting a 1111 slot is extermely low.

Compared to the current scheme, the above scheme
reduces the number of false hash table updates
significantly and has the added advantage of releasing
four valuable PTE bits for other purpose.

NOTE:even though bits 3, 4, 5, 6, 7 are not used when
the 64K PTE is backed by 4k HPTE, they continue to be
used if the PTE gets backed by 64k HPTE. The next
patch will decouple that aswell, and truely release the
bits.

This idea was jointly developed by Paul Mackerras,
Aneesh, Michael Ellermen and myself.

4K PTE format remains unchanged currently.

The patch does the following code changes
a) PTE flags are split between 64k and 4k header files.
b) __hash_page_4K() is reimplemented to reflect the
 above logic.

Acked-by: Balbir Singh <bsingharora@gmail.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
 arch/powerpc/include/asm/book3s/64/hash-4k.h  |    2 +
 arch/powerpc/include/asm/book3s/64/hash-64k.h |    9 +--
 arch/powerpc/include/asm/book3s/64/hash.h     |    1 -
 arch/powerpc/mm/hash64_64k.c                  |  102 +++++++++++++------------
 arch/powerpc/mm/hash_utils_64.c               |    4 +-
 5 files changed, 61 insertions(+), 57 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/hash-4k.h b/arch/powerpc/include/asm/book3s/64/hash-4k.h
index 2975fc1..e0130c3 100644
--- a/arch/powerpc/include/asm/book3s/64/hash-4k.h
+++ b/arch/powerpc/include/asm/book3s/64/hash-4k.h
@@ -17,6 +17,8 @@
 #define H_PUD_TABLE_SIZE	(sizeof(pud_t) << H_PUD_INDEX_SIZE)
 #define H_PGD_TABLE_SIZE	(sizeof(pgd_t) << H_PGD_INDEX_SIZE)
 
+#define H_PAGE_BUSY	_RPAGE_RSV1     /* software: PTE & hash are busy */
+
 /* PTE flags to conserve for HPTE identification */
 #define _PAGE_HPTEFLAGS (H_PAGE_BUSY | H_PAGE_HASHPTE | \
 			 H_PAGE_F_SECOND | H_PAGE_F_GIX)
diff --git a/arch/powerpc/include/asm/book3s/64/hash-64k.h b/arch/powerpc/include/asm/book3s/64/hash-64k.h
index c1bd258..c929e64 100644
--- a/arch/powerpc/include/asm/book3s/64/hash-64k.h
+++ b/arch/powerpc/include/asm/book3s/64/hash-64k.h
@@ -13,18 +13,14 @@
  */
 #define H_PAGE_COMBO	_RPAGE_RPN0 /* this is a combo 4k page */
 #define H_PAGE_4K_PFN	_RPAGE_RPN1 /* PFN is for a single 4k page */
+#define H_PAGE_BUSY	_RPAGE_RPN42     /* software: PTE & hash are busy */
+
 /*
  * We need to differentiate between explicit huge page and THP huge
  * page, since THP huge page also need to track real subpage details
  */
 #define H_PAGE_THP_HUGE  H_PAGE_4K_PFN
 
-/*
- * Used to track subpage group valid if H_PAGE_COMBO is set
- * This overloads H_PAGE_F_GIX and H_PAGE_F_SECOND
- */
-#define H_PAGE_COMBO_VALID	(H_PAGE_F_GIX | H_PAGE_F_SECOND)
-
 /* PTE flags to conserve for HPTE identification */
 #define _PAGE_HPTEFLAGS (H_PAGE_BUSY | H_PAGE_F_SECOND | \
 			 H_PAGE_F_GIX | H_PAGE_HASHPTE | H_PAGE_COMBO)
@@ -69,6 +65,7 @@ static inline real_pte_t __real_pte(pte_t pte, pte_t *ptep)
 }
 
 #define HIDX_BITS(x, index)  (x << (index << 2))
+#define INVALID_RPTE_HIDX  ~(0x0UL)
 
 static inline unsigned long __rpte_to_hidx(real_pte_t rpte, unsigned long index)
 {
diff --git a/arch/powerpc/include/asm/book3s/64/hash.h b/arch/powerpc/include/asm/book3s/64/hash.h
index 9099c1f..4e19a61 100644
--- a/arch/powerpc/include/asm/book3s/64/hash.h
+++ b/arch/powerpc/include/asm/book3s/64/hash.h
@@ -10,7 +10,6 @@
  */
 #define H_PTE_NONE_MASK		_PAGE_HPTEFLAGS
 #define H_PAGE_F_GIX_SHIFT	56
-#define H_PAGE_BUSY		_RPAGE_RSV1 /* software: PTE & hash are busy */
 #define H_PAGE_F_SECOND		_RPAGE_RSV2	/* HPTE is in 2ndary HPTEG */
 #define H_PAGE_F_GIX		(_RPAGE_RSV3 | _RPAGE_RSV4 | _RPAGE_RPN44)
 #define H_PAGE_HASHPTE		_RPAGE_RPN43	/* PTE has associated HPTE */
diff --git a/arch/powerpc/mm/hash64_64k.c b/arch/powerpc/mm/hash64_64k.c
index 1a68cb1..b79f50c 100644
--- a/arch/powerpc/mm/hash64_64k.c
+++ b/arch/powerpc/mm/hash64_64k.c
@@ -15,34 +15,22 @@
 #include <linux/mm.h>
 #include <asm/machdep.h>
 #include <asm/mmu.h>
+
 /*
- * index from 0 - 15
+ * Return true, if the entry has a slot value which
+ * the software considers as invalid.
  */
-bool __rpte_sub_valid(real_pte_t rpte, unsigned long index)
+static inline bool hpte_soft_invalid(unsigned long hidx)
 {
-	unsigned long g_idx;
-	unsigned long ptev = pte_val(rpte.pte);
-
-	g_idx = (ptev & H_PAGE_COMBO_VALID) >> H_PAGE_F_GIX_SHIFT;
-	index = index >> 2;
-	if (g_idx & (0x1 << index))
-		return true;
-	else
-		return false;
+	return ((hidx & 0xfUL) == 0xfUL);
 }
+
 /*
  * index from 0 - 15
  */
-static unsigned long mark_subptegroup_valid(unsigned long ptev, unsigned long index)
+bool __rpte_sub_valid(real_pte_t rpte, unsigned long index)
 {
-	unsigned long g_idx;
-
-	if (!(ptev & H_PAGE_COMBO))
-		return ptev;
-	index = index >> 2;
-	g_idx = 0x1 << index;
-
-	return ptev | (g_idx << H_PAGE_F_GIX_SHIFT);
+	return !(hpte_soft_invalid(__rpte_to_hidx(rpte, index)));
 }
 
 int __hash_page_4K(unsigned long ea, unsigned long access, unsigned long vsid,
@@ -50,12 +38,11 @@ int __hash_page_4K(unsigned long ea, unsigned long access, unsigned long vsid,
 		   int ssize, int subpg_prot)
 {
 	real_pte_t rpte;
-	unsigned long *hidxp;
 	unsigned long hpte_group;
 	unsigned int subpg_index;
-	unsigned long rflags, pa, hidx;
+	unsigned long rflags, pa;
 	unsigned long old_pte, new_pte, subpg_pte;
-	unsigned long vpn, hash, slot;
+	unsigned long vpn, hash, slot, gslot;
 	unsigned long shift = mmu_psize_defs[MMU_PAGE_4K].shift;
 
 	/*
@@ -126,18 +113,14 @@ int __hash_page_4K(unsigned long ea, unsigned long access, unsigned long vsid,
 	if (__rpte_sub_valid(rpte, subpg_index)) {
 		int ret;
 
-		hash = hpt_hash(vpn, shift, ssize);
-		hidx = __rpte_to_hidx(rpte, subpg_index);
-		if (hidx & _PTEIDX_SECONDARY)
-			hash = ~hash;
-		slot = (hash & htab_hash_mask) * HPTES_PER_GROUP;
-		slot += hidx & _PTEIDX_GROUP_IX;
-
-		ret = mmu_hash_ops.hpte_updatepp(slot, rflags, vpn,
+		gslot = pte_get_hash_gslot(vpn, shift, ssize, rpte,
+					   subpg_index);
+		ret = mmu_hash_ops.hpte_updatepp(gslot, rflags, vpn,
 						 MMU_PAGE_4K, MMU_PAGE_4K,
 						 ssize, flags);
+
 		/*
-		 *if we failed because typically the HPTE wasn't really here
+		 * If we failed because typically the HPTE wasn't really here
 		 * we try an insertion.
 		 */
 		if (ret == -1)
@@ -148,6 +131,14 @@ int __hash_page_4K(unsigned long ea, unsigned long access, unsigned long vsid,
 	}
 
 htab_insert_hpte:
+
+	/*
+	 * Initialize all hidx entries to invalid value, the first time
+	 * the PTE is about to allocate a 4K HPTE.
+	 */
+	if (!(old_pte & H_PAGE_COMBO))
+		rpte.hidx = INVALID_RPTE_HIDX;
+
 	/*
 	 * handle H_PAGE_4K_PFN case
 	 */
@@ -172,15 +163,39 @@ int __hash_page_4K(unsigned long ea, unsigned long access, unsigned long vsid,
 	 * Primary is full, try the secondary
 	 */
 	if (unlikely(slot == -1)) {
+		bool soft_invalid;
+
 		hpte_group = ((~hash & htab_hash_mask) * HPTES_PER_GROUP) & ~0x7UL;
 		slot = mmu_hash_ops.hpte_insert(hpte_group, vpn, pa,
 						rflags, HPTE_V_SECONDARY,
 						MMU_PAGE_4K, MMU_PAGE_4K,
 						ssize);
-		if (slot == -1) {
-			if (mftb() & 0x1)
+
+		soft_invalid = hpte_soft_invalid(slot);
+		if (unlikely(soft_invalid)) {
+			/*
+			 * We got a valid slot from a hardware point of view.
+			 * but we cannot use it, because we use this special
+			 * value; as defined by hpte_soft_invalid(), to track
+			 * invalid slots. We cannot use it. So invalidate it.
+			 */
+			gslot = slot & _PTEIDX_GROUP_IX;
+			mmu_hash_ops.hpte_invalidate(hpte_group + gslot, vpn,
+						     MMU_PAGE_4K, MMU_PAGE_4K,
+						     ssize, 0);
+		}
+
+		if (unlikely(slot == -1 || soft_invalid)) {
+			/*
+			 * For soft invalid slot, let's ensure that we release a
+			 * slot from the primary, with the hope that we will
+			 * acquire that slot next time we try. This will ensure
+			 * that we do not get the same soft-invalid slot.
+			 */
+			if (soft_invalid || (mftb() & 0x1))
 				hpte_group = ((hash & htab_hash_mask) *
 					      HPTES_PER_GROUP) & ~0x7UL;
+
 			mmu_hash_ops.hpte_remove(hpte_group);
 			/*
 			 * FIXME!! Should be try the group from which we removed ?
@@ -198,21 +213,10 @@ int __hash_page_4K(unsigned long ea, unsigned long access, unsigned long vsid,
 				   MMU_PAGE_4K, MMU_PAGE_4K, old_pte);
 		return -1;
 	}
-	/*
-	 * Insert slot number & secondary bit in PTE second half,
-	 * clear H_PAGE_BUSY and set appropriate HPTE slot bit
-	 * Since we have H_PAGE_BUSY set on ptep, we can be sure
-	 * nobody is undating hidx.
-	 */
-	hidxp = (unsigned long *)(ptep + PTRS_PER_PTE);
-	rpte.hidx &= ~(0xfUL << (subpg_index << 2));
-	*hidxp = rpte.hidx  | (slot << (subpg_index << 2));
-	new_pte = mark_subptegroup_valid(new_pte, subpg_index);
-	new_pte |=  H_PAGE_HASHPTE;
-	/*
-	 * check __real_pte for details on matching smp_rmb()
-	 */
-	smp_wmb();
+
+	new_pte |= pte_set_hidx(ptep, rpte, subpg_index, slot);
+	new_pte |= H_PAGE_HASHPTE;
+
 	*ptep = __pte(new_pte & ~H_PAGE_BUSY);
 	return 0;
 }
diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index 48b8968..4bf6ea7 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -978,8 +978,9 @@ void __init hash__early_init_devtree(void)
 
 void __init hash__early_init_mmu(void)
 {
+#ifndef CONFIG_PPC_64K_PAGES
 	/*
-	 * We have code in __hash_page_64K() and elsewhere, which assumes it can
+	 * We have code in __hash_page_4K() and elsewhere, which assumes it can
 	 * do the following:
 	 *   new_pte |= (slot << H_PAGE_F_GIX_SHIFT) & (H_PAGE_F_SECOND | H_PAGE_F_GIX);
 	 *
@@ -990,6 +991,7 @@ void __init hash__early_init_mmu(void)
 	 * with a BUILD_BUG_ON().
 	 */
 	BUILD_BUG_ON(H_PAGE_F_SECOND != (1ul  << (H_PAGE_F_GIX_SHIFT + 3)));
+#endif /* CONFIG_PPC_64K_PAGES */
 
 	htab_init_page_sizes();
 
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v9 4/8] powerpc: Free up four 64K PTE bits in 64K backed HPTE pages
  2017-11-06  8:50 [PATCH v9 0/8] powerpc: Free up RPAGE_RSV bits Ram Pai
                   ` (2 preceding siblings ...)
  2017-11-06  8:50 ` [PATCH v9 3/8] powerpc: Free up four 64K PTE bits in 4K backed HPTE pages Ram Pai
@ 2017-11-06  8:50 ` Ram Pai
  2017-11-06  8:50 ` [PATCH v9 5/8] powerpc: shifted-by-one hidx value Ram Pai
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 10+ messages in thread
From: Ram Pai @ 2017-11-06  8:50 UTC (permalink / raw)
  To: mpe
  Cc: linuxppc-dev, benh, paulus, khandual, aneesh.kumar, bsingharora,
	hbabu, mhocko, bauerman, ebiederm, linuxram

Rearrange 64K PTE bits to free up bits 3, 4, 5 and 6
in the 64K backed HPTE pages. This along with the earlier
patch will entirely free up the four bits from 64K PTE.
The bit numbers are big-endian as defined in the ISA3.0

This patch does the following change to 64K PTE backed
by 64K HPTE.

H_PAGE_F_SECOND (S) which occupied bit 4 moves to the
	second part of the pte to bit 60.
H_PAGE_F_GIX (G,I,X) which occupied bit 5, 6 and 7 also
	moves to the second part of the pte to bit 61,
 	62, 63, 64 respectively

since bit 7 is now freed up, we move H_PAGE_BUSY (B) from
bit 9 to bit 7.

The second part of the PTE will hold
(H_PAGE_F_SECOND|H_PAGE_F_GIX) at bit 60,61,62,63.
NOTE: None of the bits in the secondary PTE were not used
by 64k-HPTE backed PTE.

Before the patch, the 64K HPTE backed 64k PTE format was
as follows

 0 1 2 3 4  5  6  7  8 9 10...........................63
 : : : : :  :  :  :  : : :                            :
 v v v v v  v  v  v  v v v                            v

,-,-,-,-,--,--,--,--,-,-,-,-,-,------------------,-,-,-,
|x|x|x| |S |G |I |X |x|B| |x|x|................|x|x|x|x| <- primary pte
'_'_'_'_'__'__'__'__'_'_'_'_'_'________________'_'_'_'_'
| | | | |  |  |  |  | | | | |..................| | | | | <- secondary pte
'_'_'_'_'__'__'__'__'_'_'_'_'__________________'_'_'_'_'

After the patch, the 64k HPTE backed 64k PTE format is
as follows

 0 1 2 3 4  5  6  7  8 9 10...........................63
 : : : : :  :  :  :  : : :                            :
 v v v v v  v  v  v  v v v                            v

,-,-,-,-,--,--,--,--,-,-,-,-,-,------------------,-,-,-,
|x|x|x| |  |  |  |B |x| | |x|x|................|.|.|.|.| <- primary pte
'_'_'_'_'__'__'__'__'_'_'_'_'_'________________'_'_'_'_'
| | | | |  |  |  |  | | | | |..................|S|G|I|X| <- secondary pte
'_'_'_'_'__'__'__'__'_'_'_'_'__________________'_'_'_'_'

The above PTE changes is applicable to hugetlbpages aswell.

The patch does the following code changes:

a) moves the H_PAGE_F_SECOND and H_PAGE_F_GIX to 4k PTE
	header since it is no more needed b the 64k PTEs.
b) abstracts out __real_pte() and __rpte_to_hidx() so the
	caller need not know the bit location of the slot.
c) moves the slot bits to the secondary pte.

Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
 arch/powerpc/include/asm/book3s/64/hash-4k.h  |    3 ++
 arch/powerpc/include/asm/book3s/64/hash-64k.h |   30 +++++++++++-------------
 arch/powerpc/include/asm/book3s/64/hash.h     |    3 --
 arch/powerpc/mm/hash64_64k.c                  |   21 ++++++++---------
 arch/powerpc/mm/hugetlbpage-hash64.c          |   16 +++++--------
 5 files changed, 33 insertions(+), 40 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/hash-4k.h b/arch/powerpc/include/asm/book3s/64/hash-4k.h
index e0130c3..ed0d1cd 100644
--- a/arch/powerpc/include/asm/book3s/64/hash-4k.h
+++ b/arch/powerpc/include/asm/book3s/64/hash-4k.h
@@ -17,6 +17,9 @@
 #define H_PUD_TABLE_SIZE	(sizeof(pud_t) << H_PUD_INDEX_SIZE)
 #define H_PGD_TABLE_SIZE	(sizeof(pgd_t) << H_PGD_INDEX_SIZE)
 
+#define H_PAGE_F_GIX_SHIFT	56
+#define H_PAGE_F_SECOND	_RPAGE_RSV2	/* HPTE is in 2ndary HPTEG */
+#define H_PAGE_F_GIX	(_RPAGE_RSV3 | _RPAGE_RSV4 | _RPAGE_RPN44)
 #define H_PAGE_BUSY	_RPAGE_RSV1     /* software: PTE & hash are busy */
 
 /* PTE flags to conserve for HPTE identification */
diff --git a/arch/powerpc/include/asm/book3s/64/hash-64k.h b/arch/powerpc/include/asm/book3s/64/hash-64k.h
index c929e64..73ed988 100644
--- a/arch/powerpc/include/asm/book3s/64/hash-64k.h
+++ b/arch/powerpc/include/asm/book3s/64/hash-64k.h
@@ -13,7 +13,7 @@
  */
 #define H_PAGE_COMBO	_RPAGE_RPN0 /* this is a combo 4k page */
 #define H_PAGE_4K_PFN	_RPAGE_RPN1 /* PFN is for a single 4k page */
-#define H_PAGE_BUSY	_RPAGE_RPN42     /* software: PTE & hash are busy */
+#define H_PAGE_BUSY	_RPAGE_RPN44     /* software: PTE & hash are busy */
 
 /*
  * We need to differentiate between explicit huge page and THP huge
@@ -22,8 +22,7 @@
 #define H_PAGE_THP_HUGE  H_PAGE_4K_PFN
 
 /* PTE flags to conserve for HPTE identification */
-#define _PAGE_HPTEFLAGS (H_PAGE_BUSY | H_PAGE_F_SECOND | \
-			 H_PAGE_F_GIX | H_PAGE_HASHPTE | H_PAGE_COMBO)
+#define _PAGE_HPTEFLAGS (H_PAGE_BUSY | H_PAGE_HASHPTE | H_PAGE_COMBO)
 /*
  * we support 16 fragments per PTE page of 64K size.
  */
@@ -51,27 +50,26 @@ static inline real_pte_t __real_pte(pte_t pte, pte_t *ptep)
 	unsigned long *hidxp;
 
 	rpte.pte = pte;
-	rpte.hidx = 0;
-	if (pte_val(pte) & H_PAGE_COMBO) {
-		/*
-		 * Make sure we order the hidx load against the H_PAGE_COMBO
-		 * check. The store side ordering is done in __hash_page_4K
-		 */
-		smp_rmb();
-		hidxp = (unsigned long *)(ptep + PTRS_PER_PTE);
-		rpte.hidx = *hidxp;
-	}
+
+	/*
+	 * Ensure that we do not read the hidx before we read the PTE. Because
+	 * the writer side is expected to finish writing the hidx first followed
+	 * by the PTE, by using smp_wmb(). pte_set_hash_slot() ensures that.
+	 */
+	smp_rmb();
+
+	hidxp = (unsigned long *)(ptep + PTRS_PER_PTE);
+	rpte.hidx = *hidxp;
 	return rpte;
 }
 
 #define HIDX_BITS(x, index)  (x << (index << 2))
+#define BITS_TO_HIDX(x, index)  ((x >> (index << 2)) & 0xfUL)
 #define INVALID_RPTE_HIDX  ~(0x0UL)
 
 static inline unsigned long __rpte_to_hidx(real_pte_t rpte, unsigned long index)
 {
-	if ((pte_val(rpte.pte) & H_PAGE_COMBO))
-		return (rpte.hidx >> (index<<2)) & 0xf;
-	return (pte_val(rpte.pte) >> H_PAGE_F_GIX_SHIFT) & 0xf;
+	return BITS_TO_HIDX(rpte.hidx, index);
 }
 
 /*
diff --git a/arch/powerpc/include/asm/book3s/64/hash.h b/arch/powerpc/include/asm/book3s/64/hash.h
index 4e19a61..e72474f 100644
--- a/arch/powerpc/include/asm/book3s/64/hash.h
+++ b/arch/powerpc/include/asm/book3s/64/hash.h
@@ -9,9 +9,6 @@
  *
  */
 #define H_PTE_NONE_MASK		_PAGE_HPTEFLAGS
-#define H_PAGE_F_GIX_SHIFT	56
-#define H_PAGE_F_SECOND		_RPAGE_RSV2	/* HPTE is in 2ndary HPTEG */
-#define H_PAGE_F_GIX		(_RPAGE_RSV3 | _RPAGE_RSV4 | _RPAGE_RPN44)
 #define H_PAGE_HASHPTE		_RPAGE_RPN43	/* PTE has associated HPTE */
 
 #ifdef CONFIG_PPC_64K_PAGES
diff --git a/arch/powerpc/mm/hash64_64k.c b/arch/powerpc/mm/hash64_64k.c
index b79f50c..2253bbc 100644
--- a/arch/powerpc/mm/hash64_64k.c
+++ b/arch/powerpc/mm/hash64_64k.c
@@ -103,8 +103,8 @@ int __hash_page_4K(unsigned long ea, unsigned long access, unsigned long vsid,
 		 * On hash insert failure we use old pte value and we don't
 		 * want slot information there if we have a insert failure.
 		 */
-		old_pte &= ~(H_PAGE_HASHPTE | H_PAGE_F_GIX | H_PAGE_F_SECOND);
-		new_pte &= ~(H_PAGE_HASHPTE | H_PAGE_F_GIX | H_PAGE_F_SECOND);
+		old_pte &= ~H_PAGE_HASHPTE;
+		new_pte &= ~H_PAGE_HASHPTE;
 		goto htab_insert_hpte;
 	}
 	/*
@@ -225,6 +225,7 @@ int __hash_page_64K(unsigned long ea, unsigned long access,
 		    unsigned long vsid, pte_t *ptep, unsigned long trap,
 		    unsigned long flags, int ssize)
 {
+	real_pte_t rpte;
 	unsigned long hpte_group;
 	unsigned long rflags, pa;
 	unsigned long old_pte, new_pte;
@@ -261,6 +262,7 @@ int __hash_page_64K(unsigned long ea, unsigned long access,
 	} while (!pte_xchg(ptep, __pte(old_pte), __pte(new_pte)));
 
 	rflags = htab_convert_pte_flags(new_pte);
+	rpte = __real_pte(__pte(old_pte), ptep);
 
 	if (cpu_has_feature(CPU_FTR_NOEXECUTE) &&
 	    !cpu_has_feature(CPU_FTR_COHERENT_ICACHE))
@@ -268,16 +270,13 @@ int __hash_page_64K(unsigned long ea, unsigned long access,
 
 	vpn  = hpt_vpn(ea, vsid, ssize);
 	if (unlikely(old_pte & H_PAGE_HASHPTE)) {
+		unsigned long gslot;
+
 		/*
 		 * There MIGHT be an HPTE for this pte
 		 */
-		hash = hpt_hash(vpn, shift, ssize);
-		if (old_pte & H_PAGE_F_SECOND)
-			hash = ~hash;
-		slot = (hash & htab_hash_mask) * HPTES_PER_GROUP;
-		slot += (old_pte & H_PAGE_F_GIX) >> H_PAGE_F_GIX_SHIFT;
-
-		if (mmu_hash_ops.hpte_updatepp(slot, rflags, vpn, MMU_PAGE_64K,
+		gslot = pte_get_hash_gslot(vpn, shift, ssize, rpte, 0);
+		if (mmu_hash_ops.hpte_updatepp(gslot, rflags, vpn, MMU_PAGE_64K,
 					       MMU_PAGE_64K, ssize,
 					       flags) == -1)
 			old_pte &= ~_PAGE_HPTEFLAGS;
@@ -326,9 +325,9 @@ int __hash_page_64K(unsigned long ea, unsigned long access,
 					   MMU_PAGE_64K, MMU_PAGE_64K, old_pte);
 			return -1;
 		}
+
 		new_pte = (new_pte & ~_PAGE_HPTEFLAGS) | H_PAGE_HASHPTE;
-		new_pte |= (slot << H_PAGE_F_GIX_SHIFT) &
-			(H_PAGE_F_SECOND | H_PAGE_F_GIX);
+		new_pte |= pte_set_hidx(ptep, rpte, 0, slot);
 	}
 	*ptep = __pte(new_pte & ~H_PAGE_BUSY);
 	return 0;
diff --git a/arch/powerpc/mm/hugetlbpage-hash64.c b/arch/powerpc/mm/hugetlbpage-hash64.c
index 0c2a91d..12511f5 100644
--- a/arch/powerpc/mm/hugetlbpage-hash64.c
+++ b/arch/powerpc/mm/hugetlbpage-hash64.c
@@ -23,6 +23,7 @@ int __hash_page_huge(unsigned long ea, unsigned long access, unsigned long vsid,
 		     pte_t *ptep, unsigned long trap, unsigned long flags,
 		     int ssize, unsigned int shift, unsigned int mmu_psize)
 {
+	real_pte_t rpte;
 	unsigned long vpn;
 	unsigned long old_pte, new_pte;
 	unsigned long rflags, pa, sz;
@@ -62,6 +63,7 @@ int __hash_page_huge(unsigned long ea, unsigned long access, unsigned long vsid,
 	} while(!pte_xchg(ptep, __pte(old_pte), __pte(new_pte)));
 
 	rflags = htab_convert_pte_flags(new_pte);
+	rpte = __real_pte(__pte(old_pte), ptep);
 
 	sz = ((1UL) << shift);
 	if (!cpu_has_feature(CPU_FTR_COHERENT_ICACHE))
@@ -72,15 +74,10 @@ int __hash_page_huge(unsigned long ea, unsigned long access, unsigned long vsid,
 	/* Check if pte already has an hpte (case 2) */
 	if (unlikely(old_pte & H_PAGE_HASHPTE)) {
 		/* There MIGHT be an HPTE for this pte */
-		unsigned long hash, slot;
+		unsigned long gslot;
 
-		hash = hpt_hash(vpn, shift, ssize);
-		if (old_pte & H_PAGE_F_SECOND)
-			hash = ~hash;
-		slot = (hash & htab_hash_mask) * HPTES_PER_GROUP;
-		slot += (old_pte & H_PAGE_F_GIX) >> H_PAGE_F_GIX_SHIFT;
-
-		if (mmu_hash_ops.hpte_updatepp(slot, rflags, vpn, mmu_psize,
+		gslot = pte_get_hash_gslot(vpn, shift, ssize, rpte, 0);
+		if (mmu_hash_ops.hpte_updatepp(gslot, rflags, vpn, mmu_psize,
 					       mmu_psize, ssize, flags) == -1)
 			old_pte &= ~_PAGE_HPTEFLAGS;
 	}
@@ -107,8 +104,7 @@ int __hash_page_huge(unsigned long ea, unsigned long access, unsigned long vsid,
 			return -1;
 		}
 
-		new_pte |= (slot << H_PAGE_F_GIX_SHIFT) &
-			(H_PAGE_F_SECOND | H_PAGE_F_GIX);
+		new_pte |= pte_set_hidx(ptep, rpte, 0, slot);
 	}
 
 	/*
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v9 5/8] powerpc: shifted-by-one hidx value
  2017-11-06  8:50 [PATCH v9 0/8] powerpc: Free up RPAGE_RSV bits Ram Pai
                   ` (3 preceding siblings ...)
  2017-11-06  8:50 ` [PATCH v9 4/8] powerpc: Free up four 64K PTE bits in 64K " Ram Pai
@ 2017-11-06  8:50 ` Ram Pai
  2017-11-06  8:50 ` [PATCH v9 6/8] powerpc: Swizzle around 4K PTE bits to free up bit 5 and bit 6 Ram Pai
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 10+ messages in thread
From: Ram Pai @ 2017-11-06  8:50 UTC (permalink / raw)
  To: mpe
  Cc: linuxppc-dev, benh, paulus, khandual, aneesh.kumar, bsingharora,
	hbabu, mhocko, bauerman, ebiederm, linuxram

0xf is considered invalid hidx value. It indicates absence of a backing
HPTE. A PTE is initialized to 0xf either
a) when it is new it is newly allocated to hold 4k-backing-HPTE
	or
b) Any time it gets demoted to a 4k-backing-HPTE

This patch shifts the representation by one-modulo-0xf; i.e hidx 0 is
represented as 1, 1 as 2,... , and 0xf as 0. This convention lets us
initialize the secondary-part of the PTE to all zeroes. PTEs are anyway
zero'd when allocated. We do not have to zero them again; thus saving on
the initialization.

Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
 arch/powerpc/include/asm/book3s/64/hash-64k.h |   14 +++++++++++---
 1 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/hash-64k.h b/arch/powerpc/include/asm/book3s/64/hash-64k.h
index 73ed988..b78d94e 100644
--- a/arch/powerpc/include/asm/book3s/64/hash-64k.h
+++ b/arch/powerpc/include/asm/book3s/64/hash-64k.h
@@ -63,13 +63,21 @@ static inline real_pte_t __real_pte(pte_t pte, pte_t *ptep)
 	return rpte;
 }
 
+/*
+ * shift the hidx representation by one-modulo-0xf; i.e hidx 0 is respresented
+ * as 1, 1 as 2,... , and 0xf as 0.  This convention lets us represent a
+ * invalid hidx 0xf with a 0x0 bit value. PTEs are anyway zero'd when
+ * allocated. We dont have to zero them gain; thus save on the initialization.
+ */
+#define HIDX_UNSHIFT_BY_ONE(x) ((x + 0xfUL) & 0xfUL) /* shift backward by one */
+#define HIDX_SHIFT_BY_ONE(x) ((x + 0x1UL) & 0xfUL)   /* shift forward by one */
 #define HIDX_BITS(x, index)  (x << (index << 2))
 #define BITS_TO_HIDX(x, index)  ((x >> (index << 2)) & 0xfUL)
-#define INVALID_RPTE_HIDX  ~(0x0UL)
+#define INVALID_RPTE_HIDX  0x0UL
 
 static inline unsigned long __rpte_to_hidx(real_pte_t rpte, unsigned long index)
 {
-	return BITS_TO_HIDX(rpte.hidx, index);
+	return HIDX_UNSHIFT_BY_ONE(BITS_TO_HIDX(rpte.hidx, index));
 }
 
 /*
@@ -82,7 +90,7 @@ static inline unsigned long pte_set_hidx(pte_t *ptep, real_pte_t rpte,
 	unsigned long *hidxp = (unsigned long *)(ptep + PTRS_PER_PTE);
 
 	rpte.hidx &= ~HIDX_BITS(0xfUL, subpg_index);
-	*hidxp = rpte.hidx  | HIDX_BITS(hidx, subpg_index);
+	*hidxp = rpte.hidx  | HIDX_BITS(HIDX_SHIFT_BY_ONE(hidx), subpg_index);
 
 	/*
 	 * Anyone reading PTE must ensure hidx bits are read after reading the
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v9 6/8] powerpc: Swizzle around 4K PTE bits to free up bit 5 and bit 6
  2017-11-06  8:50 [PATCH v9 0/8] powerpc: Free up RPAGE_RSV bits Ram Pai
                   ` (4 preceding siblings ...)
  2017-11-06  8:50 ` [PATCH v9 5/8] powerpc: shifted-by-one hidx value Ram Pai
@ 2017-11-06  8:50 ` Ram Pai
  2017-11-06  8:50 ` [PATCH v9 7/8] powerpc: use helper functions to get and set hash slots Ram Pai
  2017-11-06  8:50 ` [PATCH v9 8/8] powerpc: capture the PTE format changes in the dump pte report Ram Pai
  7 siblings, 0 replies; 10+ messages in thread
From: Ram Pai @ 2017-11-06  8:50 UTC (permalink / raw)
  To: mpe
  Cc: linuxppc-dev, benh, paulus, khandual, aneesh.kumar, bsingharora,
	hbabu, mhocko, bauerman, ebiederm, linuxram

We need PTE bits 3 ,4, 5, 6 and 57 to support protection-keys,
because these are the bits we want to consolidate on across all
configuration to support protection keys.

Bit 3,4,5 and 6 are currently used on 4K-pte kernels. But bit 9
and 10 are available. Hence we use the two available bits and
free up bit 5 and 6. We will still not be able to free up bit 3
and 4. In the absence of any other free bits, we will have to
stay satisfied with what we have :-(. This means we will not
be able to support 32 protection keys, but only 8. The bit
numbers are big-endian as defined in the ISA3.0

This patch does the following change to 4K PTE.

H_PAGE_F_SECOND (S) which occupied bit 4 moves to bit 7.
H_PAGE_F_GIX (G,I,X) which occupied bit 5, 6 and 7 also moves
to bit 8,9, 10 respectively.
H_PAGE_HASHPTE (H) which occupied bit 8 moves to bit 4.

Before the patch, the 4k PTE format was as follows

 0 1 2 3 4  5  6  7  8 9 10....................57.....63
 : : : : :  :  :  :  : : :                      :     :
 v v v v v  v  v  v  v v v                      v     v
,-,-,-,-,--,--,--,--,-,-,-,-,-,------------------,-,-,-,
|x|x|x|B|S |G |I |X |H| | |x|x|................| |x|x|x|
'_'_'_'_'__'__'__'__'_'_'_'_'_'________________'_'_'_'_'

After the patch, the 4k PTE format is as follows

 0 1 2 3 4  5  6  7  8 9 10....................57.....63
 : : : : :  :  :  :  : : :                      :     :
 v v v v v  v  v  v  v v v                      v     v
,-,-,-,-,--,--,--,--,-,-,-,-,-,------------------,-,-,-,
|x|x|x|B|H |  |  |S |G|I|X|x|x|................| |.|.|.|
'_'_'_'_'__'__'__'__'_'_'_'_'_'________________'_'_'_'_'

The patch has no code changes; just swizzles around bits.

Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
 arch/powerpc/include/asm/book3s/64/hash-4k.h  |    7 ++++---
 arch/powerpc/include/asm/book3s/64/hash-64k.h |    1 +
 arch/powerpc/include/asm/book3s/64/hash.h     |    1 -
 3 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/hash-4k.h b/arch/powerpc/include/asm/book3s/64/hash-4k.h
index ed0d1cd..8a91f24 100644
--- a/arch/powerpc/include/asm/book3s/64/hash-4k.h
+++ b/arch/powerpc/include/asm/book3s/64/hash-4k.h
@@ -17,10 +17,11 @@
 #define H_PUD_TABLE_SIZE	(sizeof(pud_t) << H_PUD_INDEX_SIZE)
 #define H_PGD_TABLE_SIZE	(sizeof(pgd_t) << H_PGD_INDEX_SIZE)
 
-#define H_PAGE_F_GIX_SHIFT	56
-#define H_PAGE_F_SECOND	_RPAGE_RSV2	/* HPTE is in 2ndary HPTEG */
-#define H_PAGE_F_GIX	(_RPAGE_RSV3 | _RPAGE_RSV4 | _RPAGE_RPN44)
+#define H_PAGE_F_GIX_SHIFT	53
+#define H_PAGE_F_SECOND	_RPAGE_RPN44	/* HPTE is in 2ndary HPTEG */
+#define H_PAGE_F_GIX	(_RPAGE_RPN43 | _RPAGE_RPN42 | _RPAGE_RPN41)
 #define H_PAGE_BUSY	_RPAGE_RSV1     /* software: PTE & hash are busy */
+#define H_PAGE_HASHPTE	_RPAGE_RSV2     /* software: PTE & hash are busy */
 
 /* PTE flags to conserve for HPTE identification */
 #define _PAGE_HPTEFLAGS (H_PAGE_BUSY | H_PAGE_HASHPTE | \
diff --git a/arch/powerpc/include/asm/book3s/64/hash-64k.h b/arch/powerpc/include/asm/book3s/64/hash-64k.h
index b78d94e..62e101b 100644
--- a/arch/powerpc/include/asm/book3s/64/hash-64k.h
+++ b/arch/powerpc/include/asm/book3s/64/hash-64k.h
@@ -14,6 +14,7 @@
 #define H_PAGE_COMBO	_RPAGE_RPN0 /* this is a combo 4k page */
 #define H_PAGE_4K_PFN	_RPAGE_RPN1 /* PFN is for a single 4k page */
 #define H_PAGE_BUSY	_RPAGE_RPN44     /* software: PTE & hash are busy */
+#define H_PAGE_HASHPTE	_RPAGE_RPN43	/* PTE has associated HPTE */
 
 /*
  * We need to differentiate between explicit huge page and THP huge
diff --git a/arch/powerpc/include/asm/book3s/64/hash.h b/arch/powerpc/include/asm/book3s/64/hash.h
index e72474f..0920eff 100644
--- a/arch/powerpc/include/asm/book3s/64/hash.h
+++ b/arch/powerpc/include/asm/book3s/64/hash.h
@@ -9,7 +9,6 @@
  *
  */
 #define H_PTE_NONE_MASK		_PAGE_HPTEFLAGS
-#define H_PAGE_HASHPTE		_RPAGE_RPN43	/* PTE has associated HPTE */
 
 #ifdef CONFIG_PPC_64K_PAGES
 #include <asm/book3s/64/hash-64k.h>
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v9 7/8] powerpc: use helper functions to get and set hash slots
  2017-11-06  8:50 [PATCH v9 0/8] powerpc: Free up RPAGE_RSV bits Ram Pai
                   ` (5 preceding siblings ...)
  2017-11-06  8:50 ` [PATCH v9 6/8] powerpc: Swizzle around 4K PTE bits to free up bit 5 and bit 6 Ram Pai
@ 2017-11-06  8:50 ` Ram Pai
  2017-11-06  8:50 ` [PATCH v9 8/8] powerpc: capture the PTE format changes in the dump pte report Ram Pai
  7 siblings, 0 replies; 10+ messages in thread
From: Ram Pai @ 2017-11-06  8:50 UTC (permalink / raw)
  To: mpe
  Cc: linuxppc-dev, benh, paulus, khandual, aneesh.kumar, bsingharora,
	hbabu, mhocko, bauerman, ebiederm, linuxram

replace redundant code in __hash_page_4K() and flush_hash_page()
with helper functions pte_get_hash_gslot() and pte_set_hidx()

Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
 arch/powerpc/mm/hash64_4k.c     |   14 ++++++--------
 arch/powerpc/mm/hash_utils_64.c |   13 ++++---------
 2 files changed, 10 insertions(+), 17 deletions(-)

diff --git a/arch/powerpc/mm/hash64_4k.c b/arch/powerpc/mm/hash64_4k.c
index 6fa450c..5a69b51 100644
--- a/arch/powerpc/mm/hash64_4k.c
+++ b/arch/powerpc/mm/hash64_4k.c
@@ -20,6 +20,7 @@ int __hash_page_4K(unsigned long ea, unsigned long access, unsigned long vsid,
 		   pte_t *ptep, unsigned long trap, unsigned long flags,
 		   int ssize, int subpg_prot)
 {
+	real_pte_t rpte;
 	unsigned long hpte_group;
 	unsigned long rflags, pa;
 	unsigned long old_pte, new_pte;
@@ -54,6 +55,7 @@ int __hash_page_4K(unsigned long ea, unsigned long access, unsigned long vsid,
 	 * need to add in 0x1 if it's a read-only user page
 	 */
 	rflags = htab_convert_pte_flags(new_pte);
+	rpte = __real_pte(__pte(old_pte), ptep);
 
 	if (cpu_has_feature(CPU_FTR_NOEXECUTE) &&
 	    !cpu_has_feature(CPU_FTR_COHERENT_ICACHE))
@@ -64,13 +66,10 @@ int __hash_page_4K(unsigned long ea, unsigned long access, unsigned long vsid,
 		/*
 		 * There MIGHT be an HPTE for this pte
 		 */
-		hash = hpt_hash(vpn, shift, ssize);
-		if (old_pte & H_PAGE_F_SECOND)
-			hash = ~hash;
-		slot = (hash & htab_hash_mask) * HPTES_PER_GROUP;
-		slot += (old_pte & H_PAGE_F_GIX) >> H_PAGE_F_GIX_SHIFT;
+		unsigned long gslot = pte_get_hash_gslot(vpn, shift, ssize,
+							 rpte, 0);
 
-		if (mmu_hash_ops.hpte_updatepp(slot, rflags, vpn, MMU_PAGE_4K,
+		if (mmu_hash_ops.hpte_updatepp(gslot, rflags, vpn, MMU_PAGE_4K,
 					       MMU_PAGE_4K, ssize, flags) == -1)
 			old_pte &= ~_PAGE_HPTEFLAGS;
 	}
@@ -118,8 +117,7 @@ int __hash_page_4K(unsigned long ea, unsigned long access, unsigned long vsid,
 			return -1;
 		}
 		new_pte = (new_pte & ~_PAGE_HPTEFLAGS) | H_PAGE_HASHPTE;
-		new_pte |= (slot << H_PAGE_F_GIX_SHIFT) &
-			(H_PAGE_F_SECOND | H_PAGE_F_GIX);
+		new_pte |= pte_set_hidx(ptep, rpte, 0, slot);
 	}
 	*ptep = __pte(new_pte & ~H_PAGE_BUSY);
 	return 0;
diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index 4bf6ea7..578d5a3 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -1617,23 +1617,18 @@ unsigned long pte_get_hash_gslot(unsigned long vpn, unsigned long shift,
 void flush_hash_page(unsigned long vpn, real_pte_t pte, int psize, int ssize,
 		     unsigned long flags)
 {
-	unsigned long hash, index, shift, hidx, slot;
+	unsigned long index, shift, gslot;
 	int local = flags & HPTE_LOCAL_UPDATE;
 
 	DBG_LOW("flush_hash_page(vpn=%016lx)\n", vpn);
 	pte_iterate_hashed_subpages(pte, psize, vpn, index, shift) {
-		hash = hpt_hash(vpn, shift, ssize);
-		hidx = __rpte_to_hidx(pte, index);
-		if (hidx & _PTEIDX_SECONDARY)
-			hash = ~hash;
-		slot = (hash & htab_hash_mask) * HPTES_PER_GROUP;
-		slot += hidx & _PTEIDX_GROUP_IX;
-		DBG_LOW(" sub %ld: hash=%lx, hidx=%lx\n", index, slot, hidx);
+		gslot = pte_get_hash_gslot(vpn, shift, ssize, pte, index);
+		DBG_LOW(" sub %ld: gslot=%lx\n", index, gslot);
 		/*
 		 * We use same base page size and actual psize, because we don't
 		 * use these functions for hugepage
 		 */
-		mmu_hash_ops.hpte_invalidate(slot, vpn, psize, psize,
+		mmu_hash_ops.hpte_invalidate(gslot, vpn, psize, psize,
 					     ssize, local);
 	} pte_iterate_hashed_end();
 
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v9 8/8] powerpc: capture the PTE format changes in the dump pte report
  2017-11-06  8:50 [PATCH v9 0/8] powerpc: Free up RPAGE_RSV bits Ram Pai
                   ` (6 preceding siblings ...)
  2017-11-06  8:50 ` [PATCH v9 7/8] powerpc: use helper functions to get and set hash slots Ram Pai
@ 2017-11-06  8:50 ` Ram Pai
  7 siblings, 0 replies; 10+ messages in thread
From: Ram Pai @ 2017-11-06  8:50 UTC (permalink / raw)
  To: mpe
  Cc: linuxppc-dev, benh, paulus, khandual, aneesh.kumar, bsingharora,
	hbabu, mhocko, bauerman, ebiederm, linuxram

The H_PAGE_F_SECOND,H_PAGE_F_GIX are not in the 64K main-PTE.
capture these changes in the dump pte report.

Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
 arch/powerpc/mm/dump_linuxpagetables.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/mm/dump_linuxpagetables.c b/arch/powerpc/mm/dump_linuxpagetables.c
index c9282d2..0323dc4 100644
--- a/arch/powerpc/mm/dump_linuxpagetables.c
+++ b/arch/powerpc/mm/dump_linuxpagetables.c
@@ -213,7 +213,7 @@ struct flag_info {
 		.val	= H_PAGE_4K_PFN,
 		.set	= "4K_pfn",
 	}, {
-#endif
+#else /* CONFIG_PPC_64K_PAGES */
 		.mask	= H_PAGE_F_GIX,
 		.val	= H_PAGE_F_GIX,
 		.set	= "f_gix",
@@ -224,6 +224,7 @@ struct flag_info {
 		.val	= H_PAGE_F_SECOND,
 		.set	= "f_second",
 	}, {
+#endif /* CONFIG_PPC_64K_PAGES */
 #endif
 		.mask	= _PAGE_SPECIAL,
 		.val	= _PAGE_SPECIAL,
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [v9,1/8] powerpc: introduce pte_set_hidx() helper
  2017-11-06  8:50 ` [PATCH v9 1/8] powerpc: introduce pte_set_hidx() helper Ram Pai
@ 2018-01-17 13:30   ` Michael Ellerman
  0 siblings, 0 replies; 10+ messages in thread
From: Michael Ellerman @ 2018-01-17 13:30 UTC (permalink / raw)
  To: Ram Pai
  Cc: ebiederm, linuxram, mhocko, paulus, aneesh.kumar, bauerman,
	linuxppc-dev, khandual

On Mon, 2017-11-06 at 08:50:45 UTC, Ram Pai wrote:
> Introduce pte_set_hidx().It sets the (H_PAGE_F_SECOND|H_PAGE_F_GIX) bits
> at the appropriate location in the PTE of 4K PTE. For 64K PTE, it sets
> the bits in the second part of the PTE. Though the implementation for
> the former just needs the slot parameter, it does take some additional
> parameters to keep the prototype consistent.
> 
> This function will be handy as we work towards re-arranging the bits in
> the subsequent patches.
> 
> Acked-by: Balbir Singh <bsingharora@gmail.com>
> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> Signed-off-by: Ram Pai <linuxram@us.ibm.com>

Series applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/59aa31fd6f964ac63ac9cb90c8468f

cheers

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2018-01-17 13:30 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-11-06  8:50 [PATCH v9 0/8] powerpc: Free up RPAGE_RSV bits Ram Pai
2017-11-06  8:50 ` [PATCH v9 1/8] powerpc: introduce pte_set_hidx() helper Ram Pai
2018-01-17 13:30   ` [v9,1/8] " Michael Ellerman
2017-11-06  8:50 ` [PATCH v9 2/8] powerpc: introduce pte_get_hash_gslot() helper Ram Pai
2017-11-06  8:50 ` [PATCH v9 3/8] powerpc: Free up four 64K PTE bits in 4K backed HPTE pages Ram Pai
2017-11-06  8:50 ` [PATCH v9 4/8] powerpc: Free up four 64K PTE bits in 64K " Ram Pai
2017-11-06  8:50 ` [PATCH v9 5/8] powerpc: shifted-by-one hidx value Ram Pai
2017-11-06  8:50 ` [PATCH v9 6/8] powerpc: Swizzle around 4K PTE bits to free up bit 5 and bit 6 Ram Pai
2017-11-06  8:50 ` [PATCH v9 7/8] powerpc: use helper functions to get and set hash slots Ram Pai
2017-11-06  8:50 ` [PATCH v9 8/8] powerpc: capture the PTE format changes in the dump pte report Ram Pai

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).