linuxppc-dev.lists.ozlabs.org archive mirror
* [RFC PATCH 0/7] powerpc/64s/radix TLB flush performance improvements
@ 2017-10-31  6:44 Nicholas Piggin
  2017-10-31  6:44 ` [RFC PATCH 1/7] powerpc/64s/radix: optimize TLB range flush barriers Nicholas Piggin
                   ` (5 more replies)
  0 siblings, 6 replies; 10+ messages in thread
From: Nicholas Piggin @ 2017-10-31  6:44 UTC
  To: linuxppc-dev; +Cc: Nicholas Piggin, Aneesh Kumar K . V

Here's a random mix of performance improvements for radix TLB flushing
code. The main aims are to reduce the amount of translation that gets
invalidated, and to reduce global flushes where we can do local.

To that end, a parallel kernel compile benchmark using the powerpc:tlbie
tracepoint shows a reduction in tlbie instructions from about 290,000
to 80,000, and a reduction in tlbiel instructions from 49,500,000 to
15,000,000. Looks great, but unfortunately does not translate to a
statistically significant performance improvement! The needle on TLB
misses does not move much, I suspect because a lot of the flushing is
done at startup and shutdown, and because a significant cost of TLB
flushing itself is in the barriers.
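
For a sense of scale, the approximate counts quoted above work out to
roughly a 70% drop for each instruction. A minimal sketch of that
arithmetic follows; the helper name reduction_pct is made up for
illustration, and the inputs are the rounded figures from this message,
not raw tracepoint data.

```python
def reduction_pct(before: int, after: int) -> float:
    """Return the percentage drop going from `before` to `after`."""
    return 100.0 * (before - after) / before


# Approximate instruction counts from the parallel kernel compile run.
tlbie = reduction_pct(290_000, 80_000)
tlbiel = reduction_pct(49_500_000, 15_000_000)

print(f"tlbie:  {tlbie:.1f}% fewer instructions")
print(f"tlbiel: {tlbiel:.1f}% fewer instructions")
```

Both counts fall by a similar fraction, even though tlbiel is issued
far more often than tlbie in absolute terms.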

I have some microbenchmarks in the individual patches, and should
start looking around for some more interesting workloads. I think
most of this series is pretty obviously the right thing to do though.

This goes on top of the 3 radix TLB fixes I sent out earlier.

Thanks,
Nick

Nicholas Piggin (7):
  powerpc/64s/radix: optimize TLB range flush barriers
  powerpc/64s/radix: Implement _tlbie(l)_va_range flush functions
  powerpc/64s/radix: Optimize flush_tlb_range
  powerpc/64s/radix: Introduce local single page ceiling for TLB range
    flush
  powerpc/64s/radix: Improve TLB flushing for page table freeing
  powerpc/64s/radix: reset mm_cpumask for single thread process when
    possible
  powerpc/64s/radix: Only flush local TLB for spurious fault flushes

 .../powerpc/include/asm/book3s/64/tlbflush-radix.h |   5 +
 arch/powerpc/include/asm/book3s/64/tlbflush.h      |  11 +
 arch/powerpc/include/asm/mmu_context.h             |  19 ++
 arch/powerpc/mm/pgtable-book3s64.c                 |   5 +-
 arch/powerpc/mm/pgtable.c                          |   2 +-
 arch/powerpc/mm/tlb-radix.c                        | 363 ++++++++++++++++-----
 6 files changed, 325 insertions(+), 80 deletions(-)

-- 
2.15.0.rc2



Thread overview: 10+ messages
2017-10-31  6:44 [RFC PATCH 0/7] powerpc/64s/radix TLB flush performance improvements Nicholas Piggin
2017-10-31  6:44 ` [RFC PATCH 1/7] powerpc/64s/radix: optimize TLB range flush barriers Nicholas Piggin
2017-10-31  6:44 ` [RFC PATCH 2/7] powerpc/64s/radix: Implement _tlbie(l)_va_range flush functions Nicholas Piggin
2017-10-31  6:45 ` [RFC PATCH 3/7] powerpc/64s/radix: Optimize flush_tlb_range Nicholas Piggin
2017-10-31  6:45 ` [RFC PATCH 4/7] powerpc/64s/radix: Introduce local single page ceiling for TLB range flush Nicholas Piggin
2017-10-31  6:45 ` [RFC PATCH 5/7] powerpc/64s/radix: Improve TLB flushing for page table freeing Nicholas Piggin
2017-11-01 12:05 ` [RFC PATCH 0/7] powerpc/64s/radix TLB flush performance improvements Anshuman Khandual
2017-11-01 13:39   ` Nicholas Piggin
2017-11-02  3:19     ` Anshuman Khandual
2017-11-02  3:27       ` Nicholas Piggin
