* [PATCH 00/13 v3] idle performance improvements
@ 2017-06-13 13:05 Nicholas Piggin
From: Nicholas Piggin @ 2017-06-13 13:05 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Nicholas Piggin, Gautham R . Shenoy, Vaidyanathan Srinivasan

Since last time, I have addressed the various review
comments, most importantly fixing the miscalculation of
the SRR1 bit for the wakeup interrupt. I verified that it
does the right thing and replays the right wakeup
interrupt (e.g., decrementer) from __replay_interrupt by
stepping through the instructions in the simulator.

I've found that performance testing is a little difficult
because the ping-pong test cases evidently sometimes fall
into synchronization and run concurrently without sleeping.
Looking at context switch rates in vmstat, I ran the
context-switch ping-pong test with snooze idle disabled
on a POWER8, and the numbers before and after this series
are:

         different thread   different core
vanilla  600K/s             470K/s
patched  780K/s             550K/s

(These are 2x what the context_switch selftest reports,
because each step switches to and from the idle thread.)

It's still not a perfect measurement: if some unwanted
concurrency happens, you can get a different amount of
userspace work per context switch. Still, there seems to
be a decent speedup here.
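
As a rough check on the table above, the relative gain can
be worked out directly from the four vmstat figures. The
sketch below is only illustrative arithmetic: the function
names are made up for this example, and the halving step
simply applies the 2x relationship to the context_switch
selftest noted above.

```python
# Context-switch rates from the table above, in thousands per second (vmstat).
rates = {
    "different thread": {"vanilla": 600, "patched": 780},
    "different core": {"vanilla": 470, "patched": 550},
}


def speedup_percent(vanilla, patched):
    """Relative gain of patched over vanilla, as a percentage."""
    return (patched - vanilla) / vanilla * 100.0


def selftest_equivalent(vmstat_rate):
    """Halve the vmstat rate, which also counts the switch to/from idle."""
    return vmstat_rate / 2


for placement, r in rates.items():
    gain = speedup_percent(r["vanilla"], r["patched"])
    equiv = selftest_equivalent(r["patched"])
    print(f"{placement}: {gain:.1f}% faster, ~{equiv:.0f}K/s in selftest terms")
```

This gives roughly 30% for the different-thread case and
roughly 17% for the different-core case, subject to the
measurement caveats above.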

Thanks,
Nick

Nicholas Piggin (13):
  powerpc/64s: idle move soft interrupt mask logic into C code
  powerpc/64s: idle hotplug lazy-irq simplification
  powerpc/64s: idle process interrupts from system reset wakeup
  powerpc/64s: msgclr when handling doorbell exceptions
  powerpc/64s: interrupt replay balance the return branch predictor
  powerpc/64s: idle branch to handler with virtual mode offset
  powerpc/64s: idle avoid SRR usage in idle sleep/wake paths
  powerpc/64s: idle hmi wakeup is unlikely
  powerpc/64s: cpuidle set polling before enabling irqs
  powerpc/64s: cpuidle read mostly for common globals
  powerpc/64s: cpuidle no memory barrier after break from idle
  powerpc/64: runlatch CTRL[RUN] set optimisation
  powerpc/64s: idle runlatch switch is done with MSR[EE]=0

 arch/powerpc/include/asm/dbell.h         |  13 +++
 arch/powerpc/include/asm/exception-64s.h |  13 +++
 arch/powerpc/include/asm/hw_irq.h        |   5 ++
 arch/powerpc/include/asm/machdep.h       |   1 +
 arch/powerpc/include/asm/ppc-opcode.h    |   3 +
 arch/powerpc/include/asm/processor.h     |  10 +--
 arch/powerpc/kernel/asm-offsets.c        |   1 +
 arch/powerpc/kernel/exceptions-64s.S     |  38 +++++++--
 arch/powerpc/kernel/idle_book3s.S        | 135 +++++++++----------------------
 arch/powerpc/kernel/irq.c                |  62 +++++++++++++-
 arch/powerpc/kernel/process.c            |  12 +--
 arch/powerpc/kvm/book3s_hv_rmhandlers.S  |   8 +-
 arch/powerpc/platforms/powernv/idle.c    |  88 ++++++++++++++++++--
 arch/powerpc/platforms/powernv/smp.c     |  31 ++++---
 arch/powerpc/platforms/powernv/subcore.c |   3 +-
 drivers/cpuidle/cpuidle-powernv.c        |  37 +++++----
 drivers/cpuidle/cpuidle-pseries.c        |  22 +++--
 17 files changed, 315 insertions(+), 167 deletions(-)

-- 
2.11.0

end of thread, other threads:[~2017-06-19 12:25 UTC | newest]

Thread overview: 33+ messages
2017-06-13 13:05 [PATCH 00/13 v3] idle performance improvements Nicholas Piggin
2017-06-13 13:05 ` [PATCH 01/13] powerpc/64s: idle move soft interrupt mask logic into C code Nicholas Piggin
2017-06-19 12:25   ` [01/13] " Michael Ellerman
2017-06-13 13:05 ` [PATCH 02/13] powerpc/64s: idle hotplug lazy-irq simplification Nicholas Piggin
2017-06-19 12:25   ` [02/13] " Michael Ellerman
2017-06-13 13:05 ` [PATCH 03/13] powerpc/64s: idle process interrupts from system reset wakeup Nicholas Piggin
2017-06-13 13:28   ` Nicholas Piggin
2017-06-14 11:29     ` Michael Ellerman
2017-06-19 12:25   ` [03/13] " Michael Ellerman
2017-06-13 13:05 ` [PATCH 04/13] powerpc/64s: msgclr when handling doorbell exceptions Nicholas Piggin
2017-06-19 12:25   ` [04/13] " Michael Ellerman
2017-06-13 13:05 ` [PATCH 05/13] powerpc/64s: interrupt replay balance the return branch predictor Nicholas Piggin
2017-06-19 12:25   ` [05/13] " Michael Ellerman
2017-06-13 13:05 ` [PATCH 06/13] powerpc/64s: idle branch to handler with virtual mode offset Nicholas Piggin
2017-06-19 12:25   ` [06/13] " Michael Ellerman
2017-06-13 13:05 ` [PATCH 07/13] powerpc/64s: idle avoid SRR usage in idle sleep/wake paths Nicholas Piggin
2017-06-15 12:11   ` Nicholas Piggin
2017-06-19 12:25   ` [07/13] " Michael Ellerman
2017-06-13 13:05 ` [PATCH 08/13] powerpc/64s: idle hmi wakeup is unlikely Nicholas Piggin
2017-06-19 12:25   ` [08/13] " Michael Ellerman
2017-06-13 13:05 ` [PATCH 09/13] powerpc/64s: cpuidle set polling before enabling irqs Nicholas Piggin
2017-06-14 11:40   ` Michael Ellerman
2017-06-14 11:50     ` Nicholas Piggin
2017-06-14 13:05       ` Michael Ellerman
2017-06-13 13:05 ` [PATCH 10/13] powerpc/64s: cpuidle read mostly for common globals Nicholas Piggin
2017-06-13 13:05 ` [PATCH 11/13] powerpc/64s: cpuidle no memory barrier after break from idle Nicholas Piggin
2017-06-13 13:05 ` [PATCH 12/13] powerpc/64: runlatch CTRL[RUN] set optimisation Nicholas Piggin
2017-06-14 11:38   ` Michael Ellerman
2017-06-14 13:44     ` Nicholas Piggin
2017-06-15  9:35       ` Michael Ellerman
2017-06-13 13:05 ` [PATCH 13/13] powerpc/64s: idle runlatch switch is done with MSR[EE]=0 Nicholas Piggin
2017-06-19 12:25   ` [13/13] " Michael Ellerman
