From: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
To: linuxppc-dev@lists.ozlabs.org, mpe@ellerman.id.au,
linux-mm@kvack.org, linux-kernel@vger.kernel.org,
kvm-ppc@vger.kernel.org
Cc: npiggin@gmail.com, paulus@ozlabs.org, leonardo@linux.ibm.com,
kirill@shutemov.name,
"Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
Subject: [PATCH v3 03/22] powerpc/mm/hash64: use _PAGE_PTE when checking for pte_present
Date: Mon, 20 Apr 2020 12:56:15 +0000
Message-ID: <20200420124434.47330-4-aneesh.kumar@linux.ibm.com>
In-Reply-To: <20200420124434.47330-1-aneesh.kumar@linux.ibm.com>
This makes the pte_present check stricter by also checking for the _PAGE_PTE
bit. A level 1 pte pointer (THP pte) can be switched to a pointer to a level 0
pte page table page by the following two operations:
1) THP split.
2) madvise(MADV_DONTNEED) in parallel with a page fault.
A lockless page table walk needs to make sure it can handle such changes
gracefully.
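
As a rough illustration of the stricter check described above, the following
user-space sketch models the old and new predicates on plain 64-bit values.
It is not kernel code: the MODEL_* bit values and all model_* names are
assumptions made for this example, and the byte-order handling done by
cpu_to_be64() in the patch is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit positions; assumptions for this sketch only. */
#define MODEL_PAGE_PRESENT 0x8000000000000000ULL
#define MODEL_PAGE_PTE     0x4000000000000000ULL
#define MODEL_PAGE_INVALID 0x2000000000000000ULL

/* True only if every bit in mask is set in val. */
static bool model_all_set(uint64_t val, uint64_t mask)
{
	return (val & mask) == mask;
}

/* Old behaviour: present if either PRESENT or INVALID is set. */
static bool model_pte_present_old(uint64_t pte)
{
	return (pte & (MODEL_PAGE_PRESENT | MODEL_PAGE_INVALID)) != 0;
}

/* New behaviour: hardware-valid needs PRESENT and PTE together. */
static bool model_pte_hw_valid(uint64_t pte)
{
	return model_all_set(pte, MODEL_PAGE_PRESENT | MODEL_PAGE_PTE);
}

/* New behaviour: present needs PTE plus either PRESENT or INVALID. */
static bool model_pte_present_new(uint64_t pte)
{
	if (model_pte_hw_valid(pte))
		return true;
	return model_all_set(pte, MODEL_PAGE_INVALID | MODEL_PAGE_PTE);
}

/* Hypothetical self-check contrasting the two predicates. */
static void model_self_check(void)
{
	/* Stand-in for an entry that is no longer a leaf pte. */
	uint64_t not_a_pte = MODEL_PAGE_PRESENT;

	assert(model_pte_present_old(not_a_pte));
	assert(!model_pte_present_new(not_a_pte));
	assert(model_pte_present_new(MODEL_PAGE_PRESENT | MODEL_PAGE_PTE));
	assert(model_pte_present_new(MODEL_PAGE_INVALID | MODEL_PAGE_PTE));
}
```

The point of the sketch is the "all bits set" comparison: an entry that has
lost _PAGE_PTE is rejected by the new predicate even when its present bit is
still set, which the old "any bit set" test would have accepted.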
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
---
arch/powerpc/include/asm/book3s/64/pgtable.h | 15 ++++++++++-----
arch/powerpc/mm/book3s64/hash_utils.c | 11 +++++++++--
2 files changed, 19 insertions(+), 7 deletions(-)
diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
index 368b136517e0..03521a8b0292 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -553,6 +553,12 @@ static inline pte_t pte_clear_savedwrite(pte_t pte)
}
#endif /* CONFIG_NUMA_BALANCING */
+static inline bool pte_hw_valid(pte_t pte)
+{
+ return (pte_raw(pte) & cpu_to_be64(_PAGE_PRESENT | _PAGE_PTE)) ==
+ cpu_to_be64(_PAGE_PRESENT | _PAGE_PTE);
+}
+
static inline int pte_present(pte_t pte)
{
/*
@@ -561,12 +567,11 @@ static inline int pte_present(pte_t pte)
* invalid during ptep_set_access_flags. Hence we look for _PAGE_INVALID
* if we find _PAGE_PRESENT cleared.
*/
- return !!(pte_raw(pte) & cpu_to_be64(_PAGE_PRESENT | _PAGE_INVALID));
-}
-static inline bool pte_hw_valid(pte_t pte)
-{
- return !!(pte_raw(pte) & cpu_to_be64(_PAGE_PRESENT));
+ if (pte_hw_valid(pte))
+ return true;
+ return (pte_raw(pte) & cpu_to_be64(_PAGE_INVALID | _PAGE_PTE)) ==
+ cpu_to_be64(_PAGE_INVALID | _PAGE_PTE);
}
#ifdef CONFIG_PPC_MEM_KEYS
diff --git a/arch/powerpc/mm/book3s64/hash_utils.c b/arch/powerpc/mm/book3s64/hash_utils.c
index e951e87a974d..525eac4ee2c2 100644
--- a/arch/powerpc/mm/book3s64/hash_utils.c
+++ b/arch/powerpc/mm/book3s64/hash_utils.c
@@ -1350,8 +1350,15 @@ int hash_page_mm(struct mm_struct *mm, unsigned long ea,
goto bail;
}
- /* Add _PAGE_PRESENT to the required access perm */
- access |= _PAGE_PRESENT;
+ /*
+ * Add _PAGE_PRESENT to the required access perm. If there are parallel
+ * updates to the pte that can possibly clear _PAGE_PTE, catch that too.
+ *
+ * We can safely use the return pte address in rest of the function
+ * because we do set H_PAGE_BUSY which prevents further updates to pte
+ * from generic code.
+ */
+ access |= _PAGE_PRESENT | _PAGE_PTE;
/*
* Pre-check access permissions (will be re-checked atomically
--
2.25.3
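
The hash_utils.c hunk above adds _PAGE_PTE to the required access mask. A
minimal sketch of why that catches an entry whose _PAGE_PTE bit was cleared
in parallel, assuming the later permission check requires every bit of the
access mask to be set in the pte value. That check is not part of this
patch, so the helper names and SKETCH_* bit values below are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit positions; assumptions for this sketch only. */
#define SKETCH_PAGE_PRESENT 0x8000000000000000ULL
#define SKETCH_PAGE_PTE     0x4000000000000000ULL
#define SKETCH_PAGE_READ    0x0000000000000004ULL

/* Assumed shape of the check: no required bit may be missing from ptev. */
static bool sketch_required_bits_set(uint64_t access, uint64_t ptev)
{
	return (access & ~ptev) == 0;
}

/* Mirrors "access |= _PAGE_PRESENT | _PAGE_PTE" from the patch. */
static uint64_t sketch_required_access(uint64_t access)
{
	return access | SKETCH_PAGE_PRESENT | SKETCH_PAGE_PTE;
}

/* Hypothetical self-check comparing the old and new masks. */
static void sketch_self_check(void)
{
	uint64_t leaf = SKETCH_PAGE_PRESENT | SKETCH_PAGE_PTE | SKETCH_PAGE_READ;
	uint64_t not_leaf = SKETCH_PAGE_PRESENT | SKETCH_PAGE_READ;
	uint64_t old_want = SKETCH_PAGE_READ | SKETCH_PAGE_PRESENT;
	uint64_t new_want = sketch_required_access(SKETCH_PAGE_READ);

	assert(sketch_required_bits_set(new_want, leaf));
	/* The old mask lets the non-leaf value through; the new one does not. */
	assert(sketch_required_bits_set(old_want, not_leaf));
	assert(!sketch_required_bits_set(new_want, not_leaf));
}
```

Folding _PAGE_PTE into the mask means the same comparison that rejects a
missing permission bit also rejects a value that is no longer a leaf pte,
without adding a separate branch.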