* Re: [PATCH 2/2] s390/mm: use ipte range to invalidate multiple page table entries
From: Hillf Danton @ 2016-07-06  4:03 UTC
To: 'Martin Schwidefsky'; +Cc: linux-kernel, linux-mm

> 
> +void ptep_invalidate_range(struct mm_struct *mm, unsigned long start,
> +			   unsigned long end, pte_t *ptep)
> +{
> +	unsigned long nr;
> +
> +	if (!MACHINE_HAS_IPTE_RANGE || mm_has_pgste(mm))
> +		return;
> +	preempt_disable();
> +	nr = (end - start) >> PAGE_SHIFT;
> +	/* If the flush is likely to be local skip the ipte range */
> +	if (nr && !cpumask_equal(mm_cpumask(mm),
> +				 cpumask_of(smp_processor_id())))

s/smp/raw_smp/ to avoid adding schedule entry with page table
lock held?

> +		__ptep_ipte_range(start, nr - 1, ptep);
> +	preempt_enable();
> +}
> +EXPORT_SYMBOL(ptep_invalidate_range);
> +

thanks
Hillf

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

* Re: [PATCH 2/2] s390/mm: use ipte range to invalidate multiple page table entries
From: Martin Schwidefsky @ 2016-07-06  6:23 UTC
To: Hillf Danton; +Cc: linux-kernel, linux-mm

On Wed, 06 Jul 2016 12:03:28 +0800
"Hillf Danton" <hillf.zj@alibaba-inc.com> wrote:

> > 
> > +void ptep_invalidate_range(struct mm_struct *mm, unsigned long start,
> > +			   unsigned long end, pte_t *ptep)
> > +{
> > +	unsigned long nr;
> > +
> > +	if (!MACHINE_HAS_IPTE_RANGE || mm_has_pgste(mm))
> > +		return;
> > +	preempt_disable();
> > +	nr = (end - start) >> PAGE_SHIFT;
> > +	/* If the flush is likely to be local skip the ipte range */
> > +	if (nr && !cpumask_equal(mm_cpumask(mm),
> > +				 cpumask_of(smp_processor_id())))
> 
> s/smp/raw_smp/ to avoid adding schedule entry with page table
> lock held?

There can not be a schedule entry with either the page table lock held
or the preempt_disable() a few lines above.

> > +		__ptep_ipte_range(start, nr - 1, ptep);
> > +	preempt_enable();
> > +}
> > +EXPORT_SYMBOL(ptep_invalidate_range);
> > +
> 
> thanks
> Hillf
> 

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.

* Re: [PATCH 2/2] s390/mm: use ipte range to invalidate multiple page table entries
From: Hillf Danton @ 2016-07-06  6:42 UTC
To: 'Martin Schwidefsky'; +Cc: 'linux-kernel', linux-mm

> > > 
> > > +void ptep_invalidate_range(struct mm_struct *mm, unsigned long start,
> > > +			   unsigned long end, pte_t *ptep)
> > > +{
> > > +	unsigned long nr;
> > > +
> > > +	if (!MACHINE_HAS_IPTE_RANGE || mm_has_pgste(mm))
> > > +		return;
> > > +	preempt_disable();
> > > +	nr = (end - start) >> PAGE_SHIFT;
> > > +	/* If the flush is likely to be local skip the ipte range */
> > > +	if (nr && !cpumask_equal(mm_cpumask(mm),
> > > +				 cpumask_of(smp_processor_id())))
> > 
> > s/smp/raw_smp/ to avoid adding schedule entry with page table
> > lock held?
> 
> There can not be a schedule entry with either the page table lock held
> or the preempt_disable() a few lines above.
> 
Yes, Sir.

> > > +		__ptep_ipte_range(start, nr - 1, ptep);
> > > +	preempt_enable();

Then would you please, Sir, take a look at another case where
preempt is enabled?

> > > +}
> > > +EXPORT_SYMBOL(ptep_invalidate_range);
> > > +
> > 

thanks
Hillf

* Re: [PATCH 2/2] s390/mm: use ipte range to invalidate multiple page table entries
From: Martin Schwidefsky @ 2016-07-06  8:47 UTC
To: Hillf Danton; +Cc: 'linux-kernel', linux-mm

On Wed, 06 Jul 2016 14:42:16 +0800
"Hillf Danton" <hillf.zj@alibaba-inc.com> wrote:

> > > > 
> > > > +void ptep_invalidate_range(struct mm_struct *mm, unsigned long start,
> > > > +			   unsigned long end, pte_t *ptep)
> > > > +{
> > > > +	unsigned long nr;
> > > > +
> > > > +	if (!MACHINE_HAS_IPTE_RANGE || mm_has_pgste(mm))
> > > > +		return;
> > > > +	preempt_disable();
> > > > +	nr = (end - start) >> PAGE_SHIFT;
> > > > +	/* If the flush is likely to be local skip the ipte range */
> > > > +	if (nr && !cpumask_equal(mm_cpumask(mm),
> > > > +				 cpumask_of(smp_processor_id())))
> > > 
> > > s/smp/raw_smp/ to avoid adding schedule entry with page table
> > > lock held?
> > 
> > There can not be a schedule entry with either the page table lock held
> > or the preempt_disable() a few lines above.
> > 
> Yes, Sir.
> 
> > > > +		__ptep_ipte_range(start, nr - 1, ptep);
> > > > +	preempt_enable();
> 
> Then would you please, Sir, take a look at another case where
> preempt is enabled?

You are still a bit cryptic, are you trying to tell me that your hint is
about trying to avoid the preempt_enable() call?

The reason why I added the preempt_disable()/preempt_enable() pair to
ptep_invalidate_range is that I recently got bitten by a preempt problem
in the ptep_xchg_lazy() function which is used for ptep_get_and_clear().
Now ptep_get_and_clear() is used in vunmap_pte_range() which is called
while preemption is allowed. To keep things symmetrical it seems sensible
to explicitly disable preemption on all ptep_xxx code paths with cpu mask
checks, no?

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.

* Re: [PATCH 2/2] s390/mm: use ipte range to invalidate multiple page table entries
From: Hillf Danton @ 2016-07-06  9:26 UTC
To: 'Martin Schwidefsky'; +Cc: 'linux-kernel', linux-mm

> 
> You are still a bit cryptic,
> 
Sorry, Sir, simply because I'm not native English speaker.

> are you trying to tell me that your hint is
> about trying to avoid the preempt_enable() call?
> 
Yes, since we are already in the context with page table lock held.

thanks
Hillf

* Re: [PATCH 2/2] s390/mm: use ipte range to invalidate multiple page table entries
From: Martin Schwidefsky @ 2016-07-06 10:51 UTC
To: Hillf Danton; +Cc: 'linux-kernel', linux-mm

On Wed, 06 Jul 2016 17:26:08 +0800
"Hillf Danton" <hillf.zj@alibaba-inc.com> wrote:

> > 
> > You are still a bit cryptic,
> > 
> Sorry, Sir, simply because I'm not native English speaker.
> 
> > are you trying to tell me that your hint is
> > about trying to avoid the preempt_enable() call?
> > 
> Yes, since we are already in the context with page table lock held.

Ok, got it. An option would be to drop the preempt_disable/preempt_enable,
add "BUG_ON(preemptible())" and use raw_smp_processor_id. But I wonder if
it is worth the effort.

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.
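
Editorial sketch (not part of the thread): the option floated in the last reply, dropping the preempt_disable()/preempt_enable() pair, asserting non-preemptibility, and reading the CPU number with raw_smp_processor_id(), might look roughly like the userland mock below. Everything here is an assumption made for illustration: the mock_* variables, the single-word CPU mask, the function name ptep_invalidate_range_nopreempt, and the stand-ins for preemptible(), raw_smp_processor_id() and BUG_ON(). It is not kernel code and was never posted as a patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userland stand-ins for kernel state (illustration only). */
static int mock_preempt_count;         /* > 0 means preemption is disabled */
static int mock_cpu;                   /* CPU the caller currently runs on */
static unsigned long mock_mm_cpumask;  /* bit n set: the mm is active on CPU n */
static int ipte_range_calls;           /* number of range invalidations issued */
static unsigned long last_nr_minus_1;  /* last count operand passed down */

#define PAGE_SHIFT 12
#define BUG_ON(cond) assert(!(cond))   /* mock: the kernel version oopses instead */

static bool preemptible(void) { return mock_preempt_count == 0; }
static int raw_smp_processor_id(void) { return mock_cpu; }

/* Mock of __ptep_ipte_range(): records the call instead of issuing IPTE. */
static void mock_ptep_ipte_range(unsigned long start, unsigned long nr_minus_1)
{
	(void)start;
	last_nr_minus_1 = nr_minus_1;
	ipte_range_calls++;
}

/*
 * Variant without its own preempt_disable()/preempt_enable(): it relies on
 * the caller already being non-preemptible (e.g. holding the page table
 * lock) and checks that assumption instead of enforcing it.
 */
static void ptep_invalidate_range_nopreempt(unsigned long start,
					    unsigned long end)
{
	unsigned long nr;

	BUG_ON(preemptible());
	nr = (end - start) >> PAGE_SHIFT;
	/* Skip the range flush if the mm is only active on this CPU. */
	if (nr && mock_mm_cpumask != (1UL << raw_smp_processor_id()))
		mock_ptep_ipte_range(start, nr - 1);
}
```

With the mock preempt count at zero the BUG_ON fires, which is the point of the variant: a caller that is still preemptible is reported as a bug rather than silently protected.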

* [PATCH 0/2][RFC] mm callback for batched pte updates
From: Martin Schwidefsky @ 2016-07-05 12:00 UTC
To: linux-mm; +Cc: Martin Schwidefsky

Hello,

there is another peculiarity on s390 I would like to exploit,
the range option of the IPTE instruction. This is an extension that
allows setting the invalid bit and clearing the associated TLB entry
for multiple page table entries with a single instruction instead of
doing an IPTE for each pte. Each IPTE or IPTE-range is a quiescing
operation, basically an IPI to all other CPUs to coordinate the pte
invalidation.

The IPTE-range is useful in multi-threaded programs for a fork or a
mprotect/munmap/mremap affecting large memory areas where s390 may
not just do the pte update and clear the TLBs later.

In order to add the IPTE range optimization another mm callback
is needed in copy_page_range, unmap_page_range, move_page_tables,
and change_protection_range. The name is 'ptep_prepare_range',
suggestions for a better name are welcome.

With the two patches the update for the ptes inside a single page
table is done in two steps. First the ptep_prepare_range invalidates
all ptes, this makes the address range inaccessible for all CPUs.
The pages are still marked as present and could be revalidated again
if the page table lock is released, but this does not happen with
the current code. The second step is the usual update loop over
all single ptes.

Given a multi-threaded program, a fork or a mprotect/munmap/mremap of
a large address range now needs fewer IPTEs / IPIs by a factor of up
to 256. My mprotect stress test runs faster by an order of magnitude.

Martin Schwidefsky (2):
  mm: add callback to prepare the update of multiple page table entries
  s390/mm: use ipte range to invalidate multiple page table entries

 arch/s390/include/asm/pgtable.h | 25 +++++++++++++++++++++++++
 arch/s390/include/asm/setup.h   |  2 ++
 arch/s390/kernel/early.c        |  2 ++
 arch/s390/mm/pageattr.c         |  2 +-
 arch/s390/mm/pgtable.c          | 17 +++++++++++++++++
 include/asm-generic/pgtable.h   |  4 ++++
 mm/memory.c                     |  2 ++
 mm/mprotect.c                   |  1 +
 mm/mremap.c                     |  1 +
 9 files changed, 55 insertions(+), 1 deletion(-)

-- 
2.6.6
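
Editorial sketch (not part of the thread): the "factor of up to 256" in the cover letter can be checked with a little arithmetic. The helpers below are hypothetical and run in userland. They assume 4 KB pages and 256 entries per s390 page table, so one page table maps 1 MB, and they assume one IPTE-range is issued per page table touched, as the cover letter describes for "the ptes inside a single page table".

```c
#include <assert.h>

#define PAGE_SHIFT	12UL
#define PTRS_PER_PTE	256UL				/* entries per s390 page table */
#define PT_SPAN		(PTRS_PER_PTE << PAGE_SHIFT)	/* 1 MB mapped per page table */

/* One IPTE per pte: the number of quiescing operations without the range option. */
static unsigned long ipte_single_count(unsigned long start, unsigned long end)
{
	assert(end >= start);
	return (end - start) >> PAGE_SHIFT;
}

/* One IPTE-range per page table touched by [start, end). */
static unsigned long ipte_range_count(unsigned long start, unsigned long end)
{
	assert(end >= start);
	if (start == end)
		return 0;
	return (end - 1) / PT_SPAN - start / PT_SPAN + 1;
}
```

For a 64 MB range aligned to 1 MB this gives 16384 single IPTEs against 64 range operations, which is the full factor of 256. A 1 MB range that straddles a page table boundary needs 2 range operations for its 256 ptes, so the gain there is only 128.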

* [PATCH 2/2] s390/mm: use ipte range to invalidate multiple page table entries
From: Martin Schwidefsky @ 2016-07-05 12:00 UTC
To: linux-mm; +Cc: Martin Schwidefsky

The IPTE instruction with the range option can invalidate up to 256 page
table entries at once. This speeds up the mprotect, munmap, mremap and
fork operations for multi-threaded programs.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
---
 arch/s390/include/asm/pgtable.h | 25 +++++++++++++++++++++++++
 arch/s390/include/asm/setup.h   |  2 ++
 arch/s390/kernel/early.c        |  2 ++
 arch/s390/mm/pageattr.c         |  2 +-
 arch/s390/mm/pgtable.c          | 17 +++++++++++++++++
 5 files changed, 47 insertions(+), 1 deletion(-)

diff --git a/arch/s390/include/asm/pgtable.h b/arch/s390/include/asm/pgtable.h
index 20e5f7d..2caf726 100644
--- a/arch/s390/include/asm/pgtable.h
+++ b/arch/s390/include/asm/pgtable.h
@@ -997,6 +997,31 @@ static inline int ptep_set_access_flags(struct vm_area_struct *vma,
 	return 1;
 }
 
+void ptep_invalidate_range(struct mm_struct *mm, unsigned long start,
+			   unsigned long end, pte_t *ptep);
+
+static inline void ptep_prepare_range(struct mm_struct *mm,
+				      unsigned long start,
+				      unsigned long end,
+				      pte_t *ptep, int full)
+{
+	if (!full)
+		ptep_invalidate_range(mm, start, end, ptep);
+}
+#define ptep_prepare_range ptep_prepare_range
+
+#define __HAVE_ARCH_MOVE_PTE
+static inline pte_t move_pte(pte_t pte, pgprot_t prot,
+			     unsigned long old_addr,
+			     unsigned long new_addr)
+{
+	if ((pte_val(pte) & _PAGE_PRESENT) &&
+	    (pte_val(pte) & _PAGE_READ) &&
+	    (pte_val(pte) & _PAGE_YOUNG))
+		pte_val(pte) &= ~_PAGE_INVALID;
+	return pte;
+}
+
 /*
  * Additional functions to handle KVM guest page tables
  */
diff --git a/arch/s390/include/asm/setup.h b/arch/s390/include/asm/setup.h
index c0f0efb..58b13e0 100644
--- a/arch/s390/include/asm/setup.h
+++ b/arch/s390/include/asm/setup.h
@@ -30,6 +30,7 @@
 #define MACHINE_FLAG_TLB_LC	_BITUL(12)
 #define MACHINE_FLAG_VX		_BITUL(13)
 #define MACHINE_FLAG_CAD	_BITUL(14)
+#define MACHINE_FLAG_IPTE_RANGE	_BITUL(15)
 
 #define LPP_MAGIC		_BITUL(31)
 #define LPP_PFAULT_PID_MASK	_AC(0xffffffff, UL)
@@ -71,6 +72,7 @@ extern void detect_memory_memblock(void);
 #define MACHINE_HAS_TLB_LC	(S390_lowcore.machine_flags & MACHINE_FLAG_TLB_LC)
 #define MACHINE_HAS_VX		(S390_lowcore.machine_flags & MACHINE_FLAG_VX)
 #define MACHINE_HAS_CAD		(S390_lowcore.machine_flags & MACHINE_FLAG_CAD)
+#define MACHINE_HAS_IPTE_RANGE	(S390_lowcore.machine_flags & MACHINE_FLAG_IPTE_RANGE)
 
 /*
  * Console mode. Override with conmode=
diff --git a/arch/s390/kernel/early.c b/arch/s390/kernel/early.c
index 717b03a..ebf69c4 100644
--- a/arch/s390/kernel/early.c
+++ b/arch/s390/kernel/early.c
@@ -339,6 +339,8 @@ static __init void detect_machine_facilities(void)
 		S390_lowcore.machine_flags |= MACHINE_FLAG_EDAT1;
 		__ctl_set_bit(0, 23);
 	}
+	if (test_facility(13))
+		S390_lowcore.machine_flags |= MACHINE_FLAG_IPTE_RANGE;
 	if (test_facility(78))
 		S390_lowcore.machine_flags |= MACHINE_FLAG_EDAT2;
 	if (test_facility(3))
diff --git a/arch/s390/mm/pageattr.c b/arch/s390/mm/pageattr.c
index 7104ffb..91809d9 100644
--- a/arch/s390/mm/pageattr.c
+++ b/arch/s390/mm/pageattr.c
@@ -306,7 +306,7 @@ static void ipte_range(pte_t *pte, unsigned long address, int nr)
 {
 	int i;
 
-	if (test_facility(13)) {
+	if (MACHINE_HAS_IPTE_RANGE) {
 		__ptep_ipte_range(address, nr - 1, pte);
 		return;
 	}
diff --git a/arch/s390/mm/pgtable.c b/arch/s390/mm/pgtable.c
index 74f8f2a..3dd85ec 100644
--- a/arch/s390/mm/pgtable.c
+++ b/arch/s390/mm/pgtable.c
@@ -283,6 +283,23 @@ void ptep_modify_prot_commit(struct mm_struct *mm, unsigned long addr,
 }
 EXPORT_SYMBOL(ptep_modify_prot_commit);
 
+void ptep_invalidate_range(struct mm_struct *mm, unsigned long start,
+			   unsigned long end, pte_t *ptep)
+{
+	unsigned long nr;
+
+	if (!MACHINE_HAS_IPTE_RANGE || mm_has_pgste(mm))
+		return;
+	preempt_disable();
+	nr = (end - start) >> PAGE_SHIFT;
+	/* If the flush is likely to be local skip the ipte range */
+	if (nr && !cpumask_equal(mm_cpumask(mm),
+				 cpumask_of(smp_processor_id())))
+		__ptep_ipte_range(start, nr - 1, ptep);
+	preempt_enable();
+}
+EXPORT_SYMBOL(ptep_invalidate_range);
+
 static inline pmd_t pmdp_flush_direct(struct mm_struct *mm,
 				      unsigned long addr, pmd_t *pmdp)
 {
-- 
2.6.6
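
Editorial sketch (not part of the thread): the line "#define ptep_prepare_range ptep_prepare_range" in the pgtable.h hunk is the usual way for an architecture to tell generic code that it supplies its own hook. Patch 1/2, which touches include/asm-generic/pgtable.h, is not shown in this thread, so the no-op fallback below is only a guess at its shape. The simplified signatures, the arch_hook_calls counter and update_range() are hypothetical and exist only to make the example a single self-contained userland file.

```c
#include <assert.h>

static int arch_hook_calls;	/* hypothetical: counts invalidations the arch hook would do */

/* --- stands in for the architecture header --- */
static inline void ptep_prepare_range(unsigned long start, unsigned long end,
				      int full)
{
	(void)start;
	(void)end;
	if (!full)		/* same condition as the s390 hook in the patch */
		arch_hook_calls++;
}
/* Self-referential define: makes the hook visible to the preprocessor. */
#define ptep_prepare_range ptep_prepare_range

/* --- stands in for the generic header (assumed shape of the fallback) --- */
#ifndef ptep_prepare_range
#define ptep_prepare_range(start, end, full) do { } while (0)
#endif

/* --- stands in for a generic caller such as a page table walker --- */
static void update_range(unsigned long start, unsigned long end, int full)
{
	ptep_prepare_range(start, end, full);
	/* the usual per-pte update loop would follow here */
}
```

Because the macro expands to its own name, the call in update_range() still reaches the inline function, while the #ifndef test in the generic part sees the name as defined and skips the fallback. Removing the architecture part would make the same caller compile to nothing.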

Thread overview: 7+ messages
     [not found] <014201d1d738$744c8f90$5ce5aeb0$@alibaba-inc.com>
2016-07-06  4:03 ` [PATCH 2/2] s390/mm: use ipte range to invalidate multiple page table entries Hillf Danton
2016-07-06  6:23   ` Martin Schwidefsky
2016-07-06  6:42     ` Hillf Danton
2016-07-06  8:47       ` Martin Schwidefsky
2016-07-06  9:26         ` Hillf Danton
2016-07-06 10:51           ` Martin Schwidefsky
2016-07-05 12:00 [PATCH 0/2][RFC] mm callback for batched pte updates Martin Schwidefsky
2016-07-05 12:00 ` [PATCH 2/2] s390/mm: use ipte range to invalidate multiple page table entries Martin Schwidefsky