linuxppc-dev.lists.ozlabs.org archive mirror
* [PATCH 00/13 v3] idle performance improvements
@ 2017-06-13 13:05 Nicholas Piggin
  2017-06-13 13:05 ` [PATCH 01/13] powerpc/64s: idle move soft interrupt mask logic into C code Nicholas Piggin
                   ` (12 more replies)
  0 siblings, 13 replies; 33+ messages in thread
From: Nicholas Piggin @ 2017-06-13 13:05 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Nicholas Piggin, Gautham R . Shenoy, Vaidyanathan Srinivasan

Since last time, I have addressed the various review
comments, most importantly fixing the miscalculation of
the SRR1 bit for the wakeup interrupt. I verified that it
does the right thing and replays the correct wakeup
interrupt (e.g., decrementer) from __replay_interrupt by
stepping through the instructions in the simulator.

I've found that performance testing is a little difficult
because the ping-pong test cases sometimes seem to fall
into synchronization and run concurrently without sleeping.
Looking at context switch rates in vmstat, I ran the
context-switch ping-pong test with snooze idle disabled
on a POWER8, and the numbers before and after this series
are:

         different thread   different core
vanilla  600K/s             470K/s
patched  780K/s             550K/s

(these are 2x what's reported by the context_switch selftest
because each step switches to and from the idle thread)
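
For reference, the relative gains implied by the table can be
worked out with a short sketch like the one below. The function
names are illustrative only and are not part of the series or of
the selftest; the rates are copied from the table above.

```python
# Context-switch rates from the table above, in switches per second
# as shown by vmstat.
RATES = {
    "different thread": {"vanilla": 600_000, "patched": 780_000},
    "different core": {"vanilla": 470_000, "patched": 550_000},
}


def speedup_percent(vanilla, patched):
    """Relative improvement of patched over vanilla, in percent."""
    return (patched - vanilla) / vanilla * 100.0


def selftest_equivalent(vmstat_rate):
    """Halve the vmstat rate: each ping-pong step switches to and
    from the idle thread, so vmstat sees two switches per step."""
    return vmstat_rate / 2


if __name__ == "__main__":
    for case, r in RATES.items():
        gain = speedup_percent(r["vanilla"], r["patched"])
        print(f"{case}: {gain:.1f}% more context switches per second")
```

This works out to roughly 30% for the different-thread case and
roughly 17% for the different-core case.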

It's still not a perfect measurement because if there is
some unwanted concurrency happening then you can get a
different amount of userspace work per context switch,
but there seems to be a decent speedup here.
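
To illustrate the kind of test meant here, a minimal ping-pong
sketch follows. It is an assumption-laden stand-in, not the
context_switch selftest: two processes bounce one byte over a
pair of pipes, so each side has to sleep until the other writes.
The name ping_pong and the iteration count are hypothetical.

```python
import os
import time


def ping_pong(iterations):
    """Bounce one byte between parent and child over two pipes.

    Returns round trips per second as seen by the parent.
    """
    p2c_r, p2c_w = os.pipe()  # parent -> child
    c2p_r, c2p_w = os.pipe()  # child -> parent
    pid = os.fork()
    if pid == 0:
        # Child: wait for a byte, then echo one back.
        os.close(p2c_w)
        os.close(c2p_r)
        for _ in range(iterations):
            os.read(p2c_r, 1)
            os.write(c2p_w, b"x")
        os._exit(0)

    # Parent: send a byte, then block until the echo arrives.
    os.close(p2c_r)
    os.close(c2p_w)
    start = time.perf_counter()
    for _ in range(iterations):
        os.write(p2c_w, b"x")
        os.read(c2p_r, 1)
    elapsed = time.perf_counter() - start
    os.waitpid(pid, 0)
    os.close(p2c_w)
    os.close(c2p_r)
    return iterations / elapsed


if __name__ == "__main__":
    print(f"{ping_pong(10000):.0f} round trips/s")
```

The sketch does not pin the two processes, so it cannot by itself
reproduce the different-thread and different-core columns; that
would need explicit CPU affinity for each side.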

Thanks,
Nick

Nicholas Piggin (13):
  powerpc/64s: idle move soft interrupt mask logic into C code
  powerpc/64s: idle hotplug lazy-irq simplification
  powerpc/64s: idle process interrupts from system reset wakeup
  powerpc/64s: msgclr when handling doorbell exceptions
  powerpc/64s: interrupt replay balance the return branch predictor
  powerpc/64s: idle branch to handler with virtual mode offset
  powerpc/64s: idle avoid SRR usage in idle sleep/wake paths
  powerpc/64s: idle hmi wakeup is unlikely
  powerpc/64s: cpuidle set polling before enabling irqs
  powerpc/64s: cpuidle read mostly for common globals
  powerpc/64s: cpuidle no memory barrier after break from idle
  powerpc/64: runlatch CTRL[RUN] set optimisation
  powerpc/64s: idle runlatch switch is done with MSR[EE]=0

 arch/powerpc/include/asm/dbell.h         |  13 +++
 arch/powerpc/include/asm/exception-64s.h |  13 +++
 arch/powerpc/include/asm/hw_irq.h        |   5 ++
 arch/powerpc/include/asm/machdep.h       |   1 +
 arch/powerpc/include/asm/ppc-opcode.h    |   3 +
 arch/powerpc/include/asm/processor.h     |  10 +--
 arch/powerpc/kernel/asm-offsets.c        |   1 +
 arch/powerpc/kernel/exceptions-64s.S     |  38 +++++++--
 arch/powerpc/kernel/idle_book3s.S        | 135 +++++++++----------------------
 arch/powerpc/kernel/irq.c                |  62 +++++++++++++-
 arch/powerpc/kernel/process.c            |  12 +--
 arch/powerpc/kvm/book3s_hv_rmhandlers.S  |   8 +-
 arch/powerpc/platforms/powernv/idle.c    |  88 ++++++++++++++++++--
 arch/powerpc/platforms/powernv/smp.c     |  31 ++++---
 arch/powerpc/platforms/powernv/subcore.c |   3 +-
 drivers/cpuidle/cpuidle-powernv.c        |  37 +++++----
 drivers/cpuidle/cpuidle-pseries.c        |  22 +++--
 17 files changed, 315 insertions(+), 167 deletions(-)

-- 
2.11.0

end of thread, newest message 2017-06-19 12:25 UTC

Thread overview: 33+ messages
2017-06-13 13:05 [PATCH 00/13 v3] idle performance improvements Nicholas Piggin
2017-06-13 13:05 ` [PATCH 01/13] powerpc/64s: idle move soft interrupt mask logic into C code Nicholas Piggin
2017-06-19 12:25   ` [01/13] " Michael Ellerman
2017-06-13 13:05 ` [PATCH 02/13] powerpc/64s: idle hotplug lazy-irq simplification Nicholas Piggin
2017-06-19 12:25   ` [02/13] " Michael Ellerman
2017-06-13 13:05 ` [PATCH 03/13] powerpc/64s: idle process interrupts from system reset wakeup Nicholas Piggin
2017-06-13 13:28   ` Nicholas Piggin
2017-06-14 11:29     ` Michael Ellerman
2017-06-19 12:25   ` [03/13] " Michael Ellerman
2017-06-13 13:05 ` [PATCH 04/13] powerpc/64s: msgclr when handling doorbell exceptions Nicholas Piggin
2017-06-19 12:25   ` [04/13] " Michael Ellerman
2017-06-13 13:05 ` [PATCH 05/13] powerpc/64s: interrupt replay balance the return branch predictor Nicholas Piggin
2017-06-19 12:25   ` [05/13] " Michael Ellerman
2017-06-13 13:05 ` [PATCH 06/13] powerpc/64s: idle branch to handler with virtual mode offset Nicholas Piggin
2017-06-19 12:25   ` [06/13] " Michael Ellerman
2017-06-13 13:05 ` [PATCH 07/13] powerpc/64s: idle avoid SRR usage in idle sleep/wake paths Nicholas Piggin
2017-06-15 12:11   ` Nicholas Piggin
2017-06-19 12:25   ` [07/13] " Michael Ellerman
2017-06-13 13:05 ` [PATCH 08/13] powerpc/64s: idle hmi wakeup is unlikely Nicholas Piggin
2017-06-19 12:25   ` [08/13] " Michael Ellerman
2017-06-13 13:05 ` [PATCH 09/13] powerpc/64s: cpuidle set polling before enabling irqs Nicholas Piggin
2017-06-14 11:40   ` Michael Ellerman
2017-06-14 11:50     ` Nicholas Piggin
2017-06-14 13:05       ` Michael Ellerman
2017-06-13 13:05 ` [PATCH 10/13] powerpc/64s: cpuidle read mostly for common globals Nicholas Piggin
2017-06-13 13:05 ` [PATCH 11/13] powerpc/64s: cpuidle no memory barrier after break from idle Nicholas Piggin
2017-06-13 13:05 ` [PATCH 12/13] powerpc/64: runlatch CTRL[RUN] set optimisation Nicholas Piggin
2017-06-14 11:38   ` Michael Ellerman
2017-06-14 13:44     ` Nicholas Piggin
2017-06-15  9:35       ` Michael Ellerman
2017-06-13 13:05 ` [PATCH 13/13] powerpc/64s: idle runlatch switch is done with MSR[EE]=0 Nicholas Piggin
2017-06-19 12:25   ` [13/13] " Michael Ellerman