* [PATCH v2] powerpc/64s/radix: do not flush TLB when relaxing access
@ 2018-05-20 12:28 Nicholas Piggin
2018-05-21 6:02 ` Aneesh Kumar K.V
2018-05-21 22:34 ` kbuild test robot
0 siblings, 2 replies; 3+ messages in thread
From: Nicholas Piggin @ 2018-05-20 12:28 UTC (permalink / raw)
To: linuxppc-dev; +Cc: Nicholas Piggin
Radix flushes the TLB when updating ptes to increase permissiveness
of protection (increase access authority). Book3S does not require
TLB flushing in this case, and it is not done on hash. This patch
avoids the flush for radix.
From Power ISA v3.0B, p.1090:
Setting a Reference or Change Bit or Upgrading Access Authority
(PTE Subject to Atomic Hardware Updates)
If the only change being made to a valid PTE that is subject to
atomic hardware updates is to set the Reference or Change bit to 1
or to add access authorities, a simpler sequence suffices because
the translation hardware will refetch the PTE if an access is
attempted for which the only problems were reference and/or change
bits needing to be set or insufficient access authority.
The nest MMU on POWER9 does not re-fetch the PTE after such an access
attempt before faulting, so address spaces with a coprocessor
attached will continue to flush in these cases.
This reduces tlbies for a kernel compile workload from 1.28M to 0.95M,
tlbiels from 20.17M to 19.68M.
fork --fork --exec benchmark improved 2.77% (12000->12300).
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
Oops, I missed this patch; it's supposed to go as the first patch in
the "Various TLB and PTE improvements" series.
arch/powerpc/mm/pgtable-book3s64.c | 10 +++++++---
arch/powerpc/mm/pgtable.c | 29 ++++++++++++++++++++++++++---
2 files changed, 33 insertions(+), 6 deletions(-)
diff --git a/arch/powerpc/mm/pgtable-book3s64.c b/arch/powerpc/mm/pgtable-book3s64.c
index 518518fb7c45..994492453f0e 100644
--- a/arch/powerpc/mm/pgtable-book3s64.c
+++ b/arch/powerpc/mm/pgtable-book3s64.c
@@ -31,16 +31,20 @@ int (*register_process_table)(unsigned long base, unsigned long page_size,
int pmdp_set_access_flags(struct vm_area_struct *vma, unsigned long address,
pmd_t *pmdp, pmd_t entry, int dirty)
{
+ struct mm_struct *mm = vma->vm_mm;
int changed;
#ifdef CONFIG_DEBUG_VM
WARN_ON(!pmd_trans_huge(*pmdp) && !pmd_devmap(*pmdp));
- assert_spin_locked(&vma->vm_mm->page_table_lock);
+ assert_spin_locked(&mm->page_table_lock);
#endif
changed = !pmd_same(*(pmdp), entry);
if (changed) {
- __ptep_set_access_flags(vma->vm_mm, pmdp_ptep(pmdp),
+ __ptep_set_access_flags(mm, pmdp_ptep(pmdp),
pmd_pte(entry), address);
- flush_pmd_tlb_range(vma, address, address + HPAGE_PMD_SIZE);
+ /* See ptep_set_access_flags comments */
+ if (atomic_read(&mm->context.copros) > 0)
+ flush_pmd_tlb_range(vma, address,
+ address + HPAGE_PMD_SIZE);
}
return changed;
}
diff --git a/arch/powerpc/mm/pgtable.c b/arch/powerpc/mm/pgtable.c
index 9f361ae571e9..525ec4656a55 100644
--- a/arch/powerpc/mm/pgtable.c
+++ b/arch/powerpc/mm/pgtable.c
@@ -217,14 +217,37 @@ void set_pte_at(struct mm_struct *mm, unsigned long addr, pte_t *ptep,
int ptep_set_access_flags(struct vm_area_struct *vma, unsigned long address,
pte_t *ptep, pte_t entry, int dirty)
{
+ struct mm_struct *mm = vma->vm_mm;
int changed;
+
entry = set_access_flags_filter(entry, vma, dirty);
changed = !pte_same(*(ptep), entry);
if (changed) {
if (!is_vm_hugetlb_page(vma))
- assert_pte_locked(vma->vm_mm, address);
- __ptep_set_access_flags(vma->vm_mm, ptep, entry, address);
- flush_tlb_page(vma, address);
+ assert_pte_locked(mm, address);
+ __ptep_set_access_flags(mm, ptep, entry, address);
+ if (IS_ENABLED(CONFIG_PPC_BOOK3S_64)) {
+ /*
+ * Book3S does not require a TLB flush when relaxing
+ * access restrictions because the core MMU will reload
+ * the pte after taking an access fault. However the
+ * NMMU on POWER9 does not re-load the pte, so flush
+ * if we have a coprocessor attached to this address
+ * space.
+ *
+ * This could be further refined and pushed out to
+ * NMMU drivers so TLBIEs are only done for NMMU
+ * faults, but this is a more minimal fix. The NMMU
+ * fault handler does a get_user_pages_remote or
+ * similar to bring the page tables in, and this
+ * flush_tlb_page will do a global TLBIE because the
+ * coprocessor is attached to the address space.
+ */
+ if (atomic_read(&mm->context.copros) > 0)
+ flush_tlb_page(vma, address);
+ } else {
+ flush_tlb_page(vma, address);
+ }
}
return changed;
}
--
2.17.0
* Re: [PATCH v2] powerpc/64s/radix: do not flush TLB when relaxing access
2018-05-20 12:28 [PATCH v2] powerpc/64s/radix: do not flush TLB when relaxing access Nicholas Piggin
@ 2018-05-21 6:02 ` Aneesh Kumar K.V
2018-05-21 22:34 ` kbuild test robot
1 sibling, 0 replies; 3+ messages in thread
From: Aneesh Kumar K.V @ 2018-05-21 6:02 UTC (permalink / raw)
To: Nicholas Piggin, linuxppc-dev; +Cc: Nicholas Piggin
Nicholas Piggin <npiggin@gmail.com> writes:
> Radix flushes the TLB when updating ptes to increase permissiveness
> of protection (increase access authority). Book3S does not require
> TLB flushing in this case, and it is not done on hash. This patch
> avoids the flush for radix.
>
> From Power ISA v3.0B, p.1090:
>
> Setting a Reference or Change Bit or Upgrading Access Authority
> (PTE Subject to Atomic Hardware Updates)
>
> If the only change being made to a valid PTE that is subject to
> atomic hardware updates is to set the Reference or Change bit to 1
> or to add access authorities, a simpler sequence suffices because
> the translation hardware will refetch the PTE if an access is
> attempted for which the only problems were reference and/or change
> bits needing to be set or insufficient access authority.
>
> The nest MMU on POWER9 does not re-fetch the PTE after such an access
> attempt before faulting, so address spaces with a coprocessor
> attached will continue to flush in these cases.
>
> This reduces tlbies for a kernel compile workload from 1.28M to 0.95M,
> tlbiels from 20.17M to 19.68M.
>
> fork --fork --exec benchmark improved 2.77% (12000->12300).
>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
> Oops, I missed this patch; it's supposed to go as the first patch in
> the "Various TLB and PTE improvements" series.
>
> arch/powerpc/mm/pgtable-book3s64.c | 10 +++++++---
> arch/powerpc/mm/pgtable.c | 29 ++++++++++++++++++++++++++---
> 2 files changed, 33 insertions(+), 6 deletions(-)
>
> diff --git a/arch/powerpc/mm/pgtable-book3s64.c b/arch/powerpc/mm/pgtable-book3s64.c
> index 518518fb7c45..994492453f0e 100644
> --- a/arch/powerpc/mm/pgtable-book3s64.c
> +++ b/arch/powerpc/mm/pgtable-book3s64.c
> @@ -31,16 +31,20 @@ int (*register_process_table)(unsigned long base, unsigned long page_size,
> int pmdp_set_access_flags(struct vm_area_struct *vma, unsigned long address,
> pmd_t *pmdp, pmd_t entry, int dirty)
> {
> + struct mm_struct *mm = vma->vm_mm;
> int changed;
> #ifdef CONFIG_DEBUG_VM
> WARN_ON(!pmd_trans_huge(*pmdp) && !pmd_devmap(*pmdp));
> - assert_spin_locked(&vma->vm_mm->page_table_lock);
> + assert_spin_locked(&mm->page_table_lock);
> #endif
> changed = !pmd_same(*(pmdp), entry);
> if (changed) {
> - __ptep_set_access_flags(vma->vm_mm, pmdp_ptep(pmdp),
> + __ptep_set_access_flags(mm, pmdp_ptep(pmdp),
> pmd_pte(entry), address);
> - flush_pmd_tlb_range(vma, address, address + HPAGE_PMD_SIZE);
> + /* See ptep_set_access_flags comments */
> + if (atomic_read(&mm->context.copros) > 0)
> + flush_pmd_tlb_range(vma, address,
> + address + HPAGE_PMD_SIZE);
> }
> return changed;
> }
> diff --git a/arch/powerpc/mm/pgtable.c b/arch/powerpc/mm/pgtable.c
> index 9f361ae571e9..525ec4656a55 100644
> --- a/arch/powerpc/mm/pgtable.c
> +++ b/arch/powerpc/mm/pgtable.c
> @@ -217,14 +217,37 @@ void set_pte_at(struct mm_struct *mm, unsigned long addr, pte_t *ptep,
> int ptep_set_access_flags(struct vm_area_struct *vma, unsigned long address,
> pte_t *ptep, pte_t entry, int dirty)
> {
> + struct mm_struct *mm = vma->vm_mm;
> int changed;
> +
> entry = set_access_flags_filter(entry, vma, dirty);
> changed = !pte_same(*(ptep), entry);
> if (changed) {
> if (!is_vm_hugetlb_page(vma))
> - assert_pte_locked(vma->vm_mm, address);
> - __ptep_set_access_flags(vma->vm_mm, ptep, entry, address);
> - flush_tlb_page(vma, address);
> + assert_pte_locked(mm, address);
> + __ptep_set_access_flags(mm, ptep, entry, address);
> + if (IS_ENABLED(CONFIG_PPC_BOOK3S_64)) {
> + /*
> + * Book3S does not require a TLB flush when relaxing
> + * access restrictions because the core MMU will reload
> + * the pte after taking an access fault. However the
> + * NMMU on POWER9 does not re-load the pte, so flush
> + * if we have a coprocessor attached to this address
> + * space.
> + *
> + * This could be further refined and pushed out to
> + * NMMU drivers so TLBIEs are only done for NMMU
> + * faults, but this is a more minimal fix. The NMMU
> + * fault handler does a get_user_pages_remote or
> + * similar to bring the page tables in, and this
> + * flush_tlb_page will do a global TLBIE because the
> + * coprocessor is attached to the address space.
> + */
> + if (atomic_read(&mm->context.copros) > 0)
> + flush_tlb_page(vma, address);
> + } else {
> + flush_tlb_page(vma, address);
> + }
> }
> return changed;
> }
> --
> 2.17.0
* Re: [PATCH v2] powerpc/64s/radix: do not flush TLB when relaxing access
2018-05-20 12:28 [PATCH v2] powerpc/64s/radix: do not flush TLB when relaxing access Nicholas Piggin
2018-05-21 6:02 ` Aneesh Kumar K.V
@ 2018-05-21 22:34 ` kbuild test robot
1 sibling, 0 replies; 3+ messages in thread
From: kbuild test robot @ 2018-05-21 22:34 UTC (permalink / raw)
To: Nicholas Piggin; +Cc: kbuild-all, linuxppc-dev, Nicholas Piggin
[-- Attachment #1: Type: text/plain, Size: 3180 bytes --]
Hi Nicholas,
I love your patch! Yet something to improve:
[auto build test ERROR on v4.17-rc6]
[cannot apply to powerpc/next next-20180517]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
url: https://github.com/0day-ci/linux/commits/Nicholas-Piggin/powerpc-64s-radix-do-not-flush-TLB-when-relaxing-access/20180522-024317
config: powerpc-bamboo_defconfig (attached as .config)
compiler: powerpc-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0
reproduce:
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
make.cross ARCH=powerpc
All errors (new ones prefixed by >>):
arch/powerpc/mm/pgtable.c: In function 'ptep_set_access_flags':
>> arch/powerpc/mm/pgtable.c:246:32: error: 'mm_context_t {aka struct <anonymous>}' has no member named 'copros'
if (atomic_read(&mm->context.copros) > 0)
^
vim +246 arch/powerpc/mm/pgtable.c
209
210 /*
211 * This is called when relaxing access to a PTE. It's also called in the page
212 * fault path when we don't hit any of the major fault cases, ie, a minor
213 * update of _PAGE_ACCESSED, _PAGE_DIRTY, etc... The generic code will have
214 * handled those two for us, we additionally deal with missing execute
215 * permission here on some processors
216 */
217 int ptep_set_access_flags(struct vm_area_struct *vma, unsigned long address,
218 pte_t *ptep, pte_t entry, int dirty)
219 {
220 struct mm_struct *mm = vma->vm_mm;
221 int changed;
222
223 entry = set_access_flags_filter(entry, vma, dirty);
224 changed = !pte_same(*(ptep), entry);
225 if (changed) {
226 if (!is_vm_hugetlb_page(vma))
227 assert_pte_locked(mm, address);
228 __ptep_set_access_flags(mm, ptep, entry, address);
229 if (IS_ENABLED(CONFIG_PPC_BOOK3S_64)) {
230 /*
231 * Book3S does not require a TLB flush when relaxing
232 * access restrictions because the core MMU will reload
233 * the pte after taking an access fault. However the
234 * NMMU on POWER9 does not re-load the pte, so flush
235 * if we have a coprocessor attached to this address
236 * space.
237 *
238 * This could be further refined and pushed out to
239 * NMMU drivers so TLBIEs are only done for NMMU
240 * faults, but this is a more minimal fix. The NMMU
241 * fault handler does a get_user_pages_remote or
242 * similar to bring the page tables in, and this
243 * flush_tlb_page will do a global TLBIE because the
244 * coprocessor is attached to the address space.
245 */
> 246 if (atomic_read(&mm->context.copros) > 0)
247 flush_tlb_page(vma, address);
248 } else {
249 flush_tlb_page(vma, address);
250 }
251 }
252 return changed;
253 }
254
---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation
[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 10986 bytes --]