* [PATCH 00/13] idle and soft-irq improvements and POWER9 idle optimisation
@ 2017-08-05 17:02 Nicholas Piggin
2017-08-05 17:02 ` [PATCH 01/13] powerpc/64s: masked interrupt avoid branch Nicholas Piggin
` (12 more replies)
0 siblings, 13 replies; 29+ messages in thread
From: Nicholas Piggin @ 2017-08-05 17:02 UTC (permalink / raw)
To: linuxppc-dev; +Cc: Nicholas Piggin
The first 9 patches are a bunch of independent small cleanups
and improvements I've collected. The next 4 are improvements to
POWER9 idle state entry and exit. They significantly change how
we enter idle, now in virtual mode and without ptesync. EC=0 idle
is significantly simplified from there, and does not save anything.
This runs fine on a DD1 test machine, but the hardware has some
problem with performance all over the place, so I'm not able to get
anything meaningful there yet.
Thanks,
Nick
Nicholas Piggin (13):
powerpc/64s: masked interrupt avoid branch
powerpc/64s: masked interrupt avoid instruction
powerpc/64s: masked interrupt returns to kernel so avoid r13 restore
powerpc/64: cleanup __check_irq_replay
powerpc/64s: irq replay merge HV and non-HV paths for doorbell replay
powerpc/64s: irq replay external use the HV handler in HV mode on
POWER9
powerpc/64: remove redundant instruction in interrupt replay
powerpc/64s: irq replay remove spurious irq reason
powerpc/64: runlatch CTRL[RUN] set optimisation
powerpc/64s: idle simplify KVM idle on POWER9
powerpc/64s: idle POWER9 can execute stop without ptesync
powerpc/64s: idle POWER9 can execute stop in virtual mode
powerpc/64s: idle ESL=0 stop can avoid all save/restore overhead
arch/powerpc/include/asm/kvm_book3s_asm.h | 4 +++
arch/powerpc/kernel/entry_64.S | 7 +---
arch/powerpc/kernel/exceptions-64s.S | 19 +++++-----
arch/powerpc/kernel/idle_book3s.S | 60 ++++++++++++++-----------------
arch/powerpc/kernel/irq.c | 47 ++++++++++++------------
arch/powerpc/kernel/process.c | 35 +++++++++++++-----
arch/powerpc/kvm/book3s_hv.c | 37 ++++++++++++++++---
arch/powerpc/kvm/book3s_hv_rmhandlers.S | 32 ++++++++++-------
8 files changed, 143 insertions(+), 98 deletions(-)
--
2.11.0
^ permalink raw reply [flat|nested] 29+ messages in thread
* [PATCH 01/13] powerpc/64s: masked interrupt avoid branch
2017-08-05 17:02 [PATCH 00/13] idle and soft-irq improvements and POWER9 idle optimisation Nicholas Piggin
@ 2017-08-05 17:02 ` Nicholas Piggin
2017-08-05 17:02 ` [PATCH 02/13] powerpc/64s: masked interrupt avoid instruction Nicholas Piggin
` (11 subsequent siblings)
12 siblings, 0 replies; 29+ messages in thread
From: Nicholas Piggin @ 2017-08-05 17:02 UTC (permalink / raw)
To: linuxppc-dev; +Cc: Nicholas Piggin
Interrupts which do not require EE to be cleared can all
be tested with a single bitwise test.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
arch/powerpc/kernel/exceptions-64s.S | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)
diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index f14f3c04ec7e..f8ad3f0eb383 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -1370,10 +1370,8 @@ masked_##_H##interrupt: \
ori r10,r10,0xffff; \
mtspr SPRN_DEC,r10; \
b MASKED_DEC_HANDLER_LABEL; \
-1: cmpwi r10,PACA_IRQ_DBELL; \
- beq 2f; \
- cmpwi r10,PACA_IRQ_HMI; \
- beq 2f; \
+1: andi. r10,r10,(PACA_IRQ_DBELL|PACA_IRQ_HMI); \
+ bne 2f; \
mfspr r10,SPRN_##_H##SRR1; \
rldicl r10,r10,48,1; /* clear MSR_EE */ \
rotldi r10,r10,16; \
--
2.11.0
^ permalink raw reply related [flat|nested] 29+ messages in thread
* [PATCH 02/13] powerpc/64s: masked interrupt avoid instruction
2017-08-05 17:02 [PATCH 00/13] idle and soft-irq improvements and POWER9 idle optimisation Nicholas Piggin
2017-08-05 17:02 ` [PATCH 01/13] powerpc/64s: masked interrupt avoid branch Nicholas Piggin
@ 2017-08-05 17:02 ` Nicholas Piggin
2017-08-05 17:02 ` [PATCH 03/13] powerpc/64s: masked interrupt returns to kernel so avoid r13 restore Nicholas Piggin
` (10 subsequent siblings)
12 siblings, 0 replies; 29+ messages in thread
From: Nicholas Piggin @ 2017-08-05 17:02 UTC (permalink / raw)
To: linuxppc-dev; +Cc: Nicholas Piggin
EE is always enabled in SRR1 for masked interrupts, so clearing
it can use xor.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
arch/powerpc/kernel/exceptions-64s.S | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index f8ad3f0eb383..c4f50a9e2ab5 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -1373,8 +1373,7 @@ masked_##_H##interrupt: \
1: andi. r10,r10,(PACA_IRQ_DBELL|PACA_IRQ_HMI); \
bne 2f; \
mfspr r10,SPRN_##_H##SRR1; \
- rldicl r10,r10,48,1; /* clear MSR_EE */ \
- rotldi r10,r10,16; \
+ xori r10,r10,MSR_EE; /* clear MSR_EE */ \
mtspr SPRN_##_H##SRR1,r10; \
2: mtcrf 0x80,r9; \
ld r9,PACA_EXGEN+EX_R9(r13); \
--
2.11.0
^ permalink raw reply related [flat|nested] 29+ messages in thread
* [PATCH 03/13] powerpc/64s: masked interrupt returns to kernel so avoid r13 restore
2017-08-05 17:02 [PATCH 00/13] idle and soft-irq improvements and POWER9 idle optimisation Nicholas Piggin
2017-08-05 17:02 ` [PATCH 01/13] powerpc/64s: masked interrupt avoid branch Nicholas Piggin
2017-08-05 17:02 ` [PATCH 02/13] powerpc/64s: masked interrupt avoid instruction Nicholas Piggin
@ 2017-08-05 17:02 ` Nicholas Piggin
2017-08-05 17:02 ` [PATCH 04/13] powerpc/64: cleanup __check_irq_replay Nicholas Piggin
` (9 subsequent siblings)
12 siblings, 0 replies; 29+ messages in thread
From: Nicholas Piggin @ 2017-08-05 17:02 UTC (permalink / raw)
To: linuxppc-dev; +Cc: Nicholas Piggin
Places in the kernel where r13 is not the PACA pointer must have
maskable interrupts disabled, so r13 does not have to be restored
when returning from a soft-masked interrupt.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
arch/powerpc/kernel/exceptions-64s.S | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index c4f50a9e2ab5..67321be3122c 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -1379,7 +1379,7 @@ masked_##_H##interrupt: \
ld r9,PACA_EXGEN+EX_R9(r13); \
ld r10,PACA_EXGEN+EX_R10(r13); \
ld r11,PACA_EXGEN+EX_R11(r13); \
- GET_SCRATCH0(r13); \
+ /* returns to kernel where r13 must be set up, so don't restore it */ \
##_H##rfid; \
b .; \
MASKED_DEC_HANDLER(_H)
--
2.11.0
^ permalink raw reply related [flat|nested] 29+ messages in thread
* [PATCH 04/13] powerpc/64: cleanup __check_irq_replay
2017-08-05 17:02 [PATCH 00/13] idle and soft-irq improvements and POWER9 idle optimisation Nicholas Piggin
` (2 preceding siblings ...)
2017-08-05 17:02 ` [PATCH 03/13] powerpc/64s: masked interrupt returns to kernel so avoid r13 restore Nicholas Piggin
@ 2017-08-05 17:02 ` Nicholas Piggin
2017-08-05 17:02 ` [PATCH 05/13] powerpc/64s: irq replay merge HV and non-HV paths for doorbell replay Nicholas Piggin
` (8 subsequent siblings)
12 siblings, 0 replies; 29+ messages in thread
From: Nicholas Piggin @ 2017-08-05 17:02 UTC (permalink / raw)
To: linuxppc-dev; +Cc: Nicholas Piggin
Move the clearing of irq_happened bits into the condition where
they were found to be set. This reduces instruction count slightly,
and reduces stores into irq_happened.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
arch/powerpc/kernel/irq.c | 45 +++++++++++++++++++++++----------------------
1 file changed, 23 insertions(+), 22 deletions(-)
diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
index f291f7826abc..7c46e0cce054 100644
--- a/arch/powerpc/kernel/irq.c
+++ b/arch/powerpc/kernel/irq.c
@@ -143,9 +143,10 @@ notrace unsigned int __check_irq_replay(void)
*/
unsigned char happened = local_paca->irq_happened;
- /* Clear bit 0 which we wouldn't clear otherwise */
- local_paca->irq_happened &= ~PACA_IRQ_HARD_DIS;
if (happened & PACA_IRQ_HARD_DIS) {
+ /* Clear bit 0 which we wouldn't clear otherwise */
+ local_paca->irq_happened &= ~PACA_IRQ_HARD_DIS;
+
/*
* We may have missed a decrementer interrupt if hard disabled.
* Check the decrementer register in case we had a rollover
@@ -173,39 +174,39 @@ notrace unsigned int __check_irq_replay(void)
* This is a higher priority interrupt than the others, so
* replay it first.
*/
- local_paca->irq_happened &= ~PACA_IRQ_HMI;
- if (happened & PACA_IRQ_HMI)
+ if (happened & PACA_IRQ_HMI) {
+ local_paca->irq_happened &= ~PACA_IRQ_HMI;
return 0xe60;
+ }
- /*
- * We may have missed a decrementer interrupt. We check the
- * decrementer itself rather than the paca irq_happened field
- * in case we also had a rollover while hard disabled
- */
- local_paca->irq_happened &= ~PACA_IRQ_DEC;
- if (happened & PACA_IRQ_DEC)
+ if (happened & PACA_IRQ_DEC) {
+ local_paca->irq_happened &= ~PACA_IRQ_DEC;
return 0x900;
+ }
- /* Finally check if an external interrupt happened */
- local_paca->irq_happened &= ~PACA_IRQ_EE;
- if (happened & PACA_IRQ_EE)
+ if (happened & PACA_IRQ_EE) {
+ local_paca->irq_happened &= ~PACA_IRQ_EE;
return 0x500;
+ }
#ifdef CONFIG_PPC_BOOK3E
- /* Finally check if an EPR external interrupt happened
- * this bit is typically set if we need to handle another
- * "edge" interrupt from within the MPIC "EPR" handler
+ /*
+ * Check if an EPR external interrupt happened this bit is typically
+ * set if we need to handle another "edge" interrupt from within the
+ * MPIC "EPR" handler.
*/
- local_paca->irq_happened &= ~PACA_IRQ_EE_EDGE;
- if (happened & PACA_IRQ_EE_EDGE)
+ if (happened & PACA_IRQ_EE_EDGE) {
+ local_paca->irq_happened &= ~PACA_IRQ_EE_EDGE;
return 0x500;
+ }
- local_paca->irq_happened &= ~PACA_IRQ_DBELL;
- if (happened & PACA_IRQ_DBELL)
+ if (happened & PACA_IRQ_DBELL) {
+ local_paca->irq_happened &= ~PACA_IRQ_DBELL;
return 0x280;
+ }
#else
- local_paca->irq_happened &= ~PACA_IRQ_DBELL;
if (happened & PACA_IRQ_DBELL) {
+ local_paca->irq_happened &= ~PACA_IRQ_DBELL;
if (cpu_has_feature(CPU_FTR_HVMODE))
return 0xe80;
return 0xa00;
--
2.11.0
^ permalink raw reply related [flat|nested] 29+ messages in thread
* [PATCH 05/13] powerpc/64s: irq replay merge HV and non-HV paths for doorbell replay
2017-08-05 17:02 [PATCH 00/13] idle and soft-irq improvements and POWER9 idle optimisation Nicholas Piggin
` (3 preceding siblings ...)
2017-08-05 17:02 ` [PATCH 04/13] powerpc/64: cleanup __check_irq_replay Nicholas Piggin
@ 2017-08-05 17:02 ` Nicholas Piggin
2017-08-05 17:02 ` [PATCH 06/13] powerpc/64s: irq replay external use the HV handler in HV mode on POWER9 Nicholas Piggin
` (7 subsequent siblings)
12 siblings, 0 replies; 29+ messages in thread
From: Nicholas Piggin @ 2017-08-05 17:02 UTC (permalink / raw)
To: linuxppc-dev; +Cc: Nicholas Piggin
This results in smaller code, and fewer branches.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
arch/powerpc/kernel/entry_64.S | 6 +-----
arch/powerpc/kernel/exceptions-64s.S | 2 +-
arch/powerpc/kernel/irq.c | 2 --
3 files changed, 2 insertions(+), 8 deletions(-)
diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
index 49d8422767b4..ec67f67dafab 100644
--- a/arch/powerpc/kernel/entry_64.S
+++ b/arch/powerpc/kernel/entry_64.S
@@ -990,11 +990,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
#ifdef CONFIG_PPC_BOOK3E
cmpwi cr0,r3,0x280
#else
- BEGIN_FTR_SECTION
- cmpwi cr0,r3,0xe80
- FTR_SECTION_ELSE
- cmpwi cr0,r3,0xa00
- ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE)
+ cmpwi cr0,r3,0xa00
#endif /* CONFIG_PPC_BOOK3E */
bne 1f
addi r3,r1,STACK_FRAME_OVERHEAD;
diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index 67321be3122c..f9d0796fb2c9 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -1674,7 +1674,7 @@ _GLOBAL(__replay_interrupt)
cmpwi r3,0x500
beq hardware_interrupt_common
BEGIN_FTR_SECTION
- cmpwi r3,0xe80
+ cmpwi r3,0xa00
beq h_doorbell_common_msgclr
cmpwi r3,0xea0
beq h_virt_irq_common
diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
index 7c46e0cce054..60ee6d7251b8 100644
--- a/arch/powerpc/kernel/irq.c
+++ b/arch/powerpc/kernel/irq.c
@@ -207,8 +207,6 @@ notrace unsigned int __check_irq_replay(void)
#else
if (happened & PACA_IRQ_DBELL) {
local_paca->irq_happened &= ~PACA_IRQ_DBELL;
- if (cpu_has_feature(CPU_FTR_HVMODE))
- return 0xe80;
return 0xa00;
}
#endif /* CONFIG_PPC_BOOK3E */
--
2.11.0
^ permalink raw reply related [flat|nested] 29+ messages in thread
* [PATCH 06/13] powerpc/64s: irq replay external use the HV handler in HV mode on POWER9
2017-08-05 17:02 [PATCH 00/13] idle and soft-irq improvements and POWER9 idle optimisation Nicholas Piggin
` (4 preceding siblings ...)
2017-08-05 17:02 ` [PATCH 05/13] powerpc/64s: irq replay merge HV and non-HV paths for doorbell replay Nicholas Piggin
@ 2017-08-05 17:02 ` Nicholas Piggin
2017-08-05 17:02 ` [PATCH 07/13] powerpc/64: remove redundant instruction in interrupt replay Nicholas Piggin
` (6 subsequent siblings)
12 siblings, 0 replies; 29+ messages in thread
From: Nicholas Piggin @ 2017-08-05 17:02 UTC (permalink / raw)
To: linuxppc-dev; +Cc: Nicholas Piggin
POWER9 host external interrupts use the h_virt_irq_common handler, so
use that to replay them rather than using the hardware_interrupt_common
handler. Both call do_IRQ, but using the correct handler reduces i-cache
footprint.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
arch/powerpc/kernel/exceptions-64s.S | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index f9d0796fb2c9..29253cecf713 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -1672,7 +1672,11 @@ _GLOBAL(__replay_interrupt)
cmpwi r3,0x900
beq decrementer_common
cmpwi r3,0x500
+BEGIN_FTR_SECTION
+ beq h_virt_irq_common
+FTR_SECTION_ELSE
beq hardware_interrupt_common
+ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE | CPU_FTR_ARCH_300)
BEGIN_FTR_SECTION
cmpwi r3,0xa00
beq h_doorbell_common_msgclr
--
2.11.0
^ permalink raw reply related [flat|nested] 29+ messages in thread
* [PATCH 07/13] powerpc/64: remove redundant instruction in interrupt replay
2017-08-05 17:02 [PATCH 00/13] idle and soft-irq improvements and POWER9 idle optimisation Nicholas Piggin
` (5 preceding siblings ...)
2017-08-05 17:02 ` [PATCH 06/13] powerpc/64s: irq replay external use the HV handler in HV mode on POWER9 Nicholas Piggin
@ 2017-08-05 17:02 ` Nicholas Piggin
2017-08-05 17:02 ` [PATCH 08/13] powerpc/64s: irq replay remove spurious irq reason Nicholas Piggin
` (5 subsequent siblings)
12 siblings, 0 replies; 29+ messages in thread
From: Nicholas Piggin @ 2017-08-05 17:02 UTC (permalink / raw)
To: linuxppc-dev; +Cc: Nicholas Piggin
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
arch/powerpc/kernel/entry_64.S | 1 -
1 file changed, 1 deletion(-)
diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
index ec67f67dafab..3f2666d24a7e 100644
--- a/arch/powerpc/kernel/entry_64.S
+++ b/arch/powerpc/kernel/entry_64.S
@@ -995,7 +995,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
bne 1f
addi r3,r1,STACK_FRAME_OVERHEAD;
bl doorbell_exception
- b ret_from_except
#endif /* CONFIG_PPC_DOORBELL */
1: b ret_from_except /* What else to do here ? */
--
2.11.0
^ permalink raw reply related [flat|nested] 29+ messages in thread
* [PATCH 08/13] powerpc/64s: irq replay remove spurious irq reason
2017-08-05 17:02 [PATCH 00/13] idle and soft-irq improvements and POWER9 idle optimisation Nicholas Piggin
` (6 preceding siblings ...)
2017-08-05 17:02 ` [PATCH 07/13] powerpc/64: remove redundant instruction in interrupt replay Nicholas Piggin
@ 2017-08-05 17:02 ` Nicholas Piggin
2017-08-05 23:00 ` Benjamin Herrenschmidt
2017-08-05 17:02 ` [PATCH 09/13] powerpc/64: runlatch CTRL[RUN] set optimisation Nicholas Piggin
` (4 subsequent siblings)
12 siblings, 1 reply; 29+ messages in thread
From: Nicholas Piggin @ 2017-08-05 17:02 UTC (permalink / raw)
To: linuxppc-dev; +Cc: Nicholas Piggin
HVI interrupts have always used 0x500, so remove the dead branch.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
arch/powerpc/kernel/exceptions-64s.S | 2 --
1 file changed, 2 deletions(-)
diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index 29253cecf713..566cf126c13b 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -1680,8 +1680,6 @@ ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE | CPU_FTR_ARCH_300)
BEGIN_FTR_SECTION
cmpwi r3,0xa00
beq h_doorbell_common_msgclr
- cmpwi r3,0xea0
- beq h_virt_irq_common
cmpwi r3,0xe60
beq hmi_exception_common
FTR_SECTION_ELSE
--
2.11.0
^ permalink raw reply related [flat|nested] 29+ messages in thread
* [PATCH 09/13] powerpc/64: runlatch CTRL[RUN] set optimisation
2017-08-05 17:02 [PATCH 00/13] idle and soft-irq improvements and POWER9 idle optimisation Nicholas Piggin
` (7 preceding siblings ...)
2017-08-05 17:02 ` [PATCH 08/13] powerpc/64s: irq replay remove spurious irq reason Nicholas Piggin
@ 2017-08-05 17:02 ` Nicholas Piggin
2017-08-05 17:02 ` [PATCH 10/13] powerpc/64s: idle simplify KVM idle on POWER9 Nicholas Piggin
` (3 subsequent siblings)
12 siblings, 0 replies; 29+ messages in thread
From: Nicholas Piggin @ 2017-08-05 17:02 UTC (permalink / raw)
To: linuxppc-dev; +Cc: Nicholas Piggin
The CTRL register is read-only except bit 63 which is the run latch
control. This means it can be updated with a mtspr rather than
mfspr/mtspr.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
arch/powerpc/kernel/process.c | 35 +++++++++++++++++++++++++++--------
1 file changed, 27 insertions(+), 8 deletions(-)
diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index 9f3e2c932dcc..75306b6e1812 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -1994,11 +1994,25 @@ void show_stack(struct task_struct *tsk, unsigned long *stack)
void notrace __ppc64_runlatch_on(void)
{
struct thread_info *ti = current_thread_info();
- unsigned long ctrl;
- ctrl = mfspr(SPRN_CTRLF);
- ctrl |= CTRL_RUNLATCH;
- mtspr(SPRN_CTRLT, ctrl);
+ if (cpu_has_feature(CPU_FTR_ARCH_206)) {
+ /*
+ * Least significant bit (RUN) is the only writable bit of
+ * the CTRL register, so we can avoid mfspr. 2.06 is not the
+ * earliest ISA where this is the case, but it's convenient.
+ */
+ mtspr(SPRN_CTRLT, CTRL_RUNLATCH);
+ } else {
+ unsigned long ctrl;
+
+ /*
+ * Some architectures (e.g., Cell) have writable fields other
+ * than RUN, so do the read-modify-write.
+ */
+ ctrl = mfspr(SPRN_CTRLF);
+ ctrl |= CTRL_RUNLATCH;
+ mtspr(SPRN_CTRLT, ctrl);
+ }
ti->local_flags |= _TLF_RUNLATCH;
}
@@ -2007,13 +2021,18 @@ void notrace __ppc64_runlatch_on(void)
void notrace __ppc64_runlatch_off(void)
{
struct thread_info *ti = current_thread_info();
- unsigned long ctrl;
ti->local_flags &= ~_TLF_RUNLATCH;
- ctrl = mfspr(SPRN_CTRLF);
- ctrl &= ~CTRL_RUNLATCH;
- mtspr(SPRN_CTRLT, ctrl);
+ if (cpu_has_feature(CPU_FTR_ARCH_206)) {
+ mtspr(SPRN_CTRLT, 0);
+ } else {
+ unsigned long ctrl;
+
+ ctrl = mfspr(SPRN_CTRLF);
+ ctrl &= ~CTRL_RUNLATCH;
+ mtspr(SPRN_CTRLT, ctrl);
+ }
}
#endif /* CONFIG_PPC64 */
--
2.11.0
^ permalink raw reply related [flat|nested] 29+ messages in thread
* [PATCH 10/13] powerpc/64s: idle simplify KVM idle on POWER9
2017-08-05 17:02 [PATCH 00/13] idle and soft-irq improvements and POWER9 idle optimisation Nicholas Piggin
` (8 preceding siblings ...)
2017-08-05 17:02 ` [PATCH 09/13] powerpc/64: runlatch CTRL[RUN] set optimisation Nicholas Piggin
@ 2017-08-05 17:02 ` Nicholas Piggin
2017-08-08 10:36 ` Gautham R Shenoy
2017-08-10 13:14 ` Michael Ellerman
2017-08-05 17:02 ` [PATCH 11/13] powerpc/64s: idle POWER9 can execute stop without ptesync Nicholas Piggin
` (2 subsequent siblings)
12 siblings, 2 replies; 29+ messages in thread
From: Nicholas Piggin @ 2017-08-05 17:02 UTC (permalink / raw)
To: linuxppc-dev; +Cc: Nicholas Piggin, Gautham R . Shenoy, Paul Mackerras, kvm-ppc
POWER9 CPUs have independent MMU contexts per thread so KVM
does not have to bring sibling threads into real-mode when
switching MMU mode to guest. This can simplify POWER9 sleep/wake
paths and avoids hwsyncs.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
arch/powerpc/include/asm/kvm_book3s_asm.h | 4 ++++
arch/powerpc/kernel/idle_book3s.S | 8 ++-----
arch/powerpc/kvm/book3s_hv.c | 37 ++++++++++++++++++++++++++-----
arch/powerpc/kvm/book3s_hv_rmhandlers.S | 8 +++++++
4 files changed, 46 insertions(+), 11 deletions(-)
diff --git a/arch/powerpc/include/asm/kvm_book3s_asm.h b/arch/powerpc/include/asm/kvm_book3s_asm.h
index 7cea76f11c26..83596f32f50b 100644
--- a/arch/powerpc/include/asm/kvm_book3s_asm.h
+++ b/arch/powerpc/include/asm/kvm_book3s_asm.h
@@ -104,6 +104,10 @@ struct kvmppc_host_state {
u8 napping;
#ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE
+ /*
+ * hwthread_req/hwthread_state pair is used to pull sibling threads
+ * out of guest on pre-ISAv3.0B CPUs where threads share MMU.
+ */
u8 hwthread_req;
u8 hwthread_state;
u8 host_ipi;
diff --git a/arch/powerpc/kernel/idle_book3s.S b/arch/powerpc/kernel/idle_book3s.S
index e6252c5a57a4..3ab73f9223e4 100644
--- a/arch/powerpc/kernel/idle_book3s.S
+++ b/arch/powerpc/kernel/idle_book3s.S
@@ -243,12 +243,6 @@ enter_winkle:
* r3 - PSSCR value corresponding to the requested stop state.
*/
power_enter_stop:
-#ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE
- /* Tell KVM we're entering idle */
- li r4,KVM_HWTHREAD_IN_IDLE
- /* DO THIS IN REAL MODE! See comment above. */
- stb r4,HSTATE_HWTHREAD_STATE(r13)
-#endif
/*
* Check if we are executing the lite variant with ESL=EC=0
*/
@@ -435,6 +429,7 @@ ALT_FTR_SECTION_END_IFSET(CPU_FTR_ARCH_300)
mr r3,r12
#ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE
+BEGIN_FTR_SECTION
li r0,KVM_HWTHREAD_IN_KERNEL
stb r0,HSTATE_HWTHREAD_STATE(r13)
/* Order setting hwthread_state vs. testing hwthread_req */
@@ -444,6 +439,7 @@ ALT_FTR_SECTION_END_IFSET(CPU_FTR_ARCH_300)
beq 1f
b kvm_start_guest
1:
+END_FTR_SECTION_IFCLR(CPU_FTR_ARCH_300)
#endif
/* Return SRR1 from power7_nap() */
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 359c79cdf0cc..bb1ab14f963a 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -2111,6 +2111,16 @@ static int kvmppc_grab_hwthread(int cpu)
struct paca_struct *tpaca;
long timeout = 10000;
+ /*
+ * ISA v3.0 idle routines do not set hwthread_state or test
+ * hwthread_req, so they can not grab idle threads.
+ */
+ if (cpu_has_feature(CPU_FTR_ARCH_300)) {
+ WARN_ON(1);
+ pr_err("KVM: can not control sibling threads\n");
+ return -EBUSY;
+ }
+
tpaca = &paca[cpu];
/* Ensure the thread won't go into the kernel if it wakes */
@@ -2145,12 +2155,26 @@ static void kvmppc_release_hwthread(int cpu)
struct paca_struct *tpaca;
tpaca = &paca[cpu];
- tpaca->kvm_hstate.hwthread_req = 0;
tpaca->kvm_hstate.kvm_vcpu = NULL;
tpaca->kvm_hstate.kvm_vcore = NULL;
tpaca->kvm_hstate.kvm_split_mode = NULL;
}
+static void kvmppc_release_hwthread_secondary(int cpu)
+{
+ struct paca_struct *tpaca;
+
+ if (cpu_has_feature(CPU_FTR_ARCH_300)) {
+ WARN_ON(1);
+ return;
+ }
+
+ tpaca = &paca[cpu];
+ tpaca->kvm_hstate.hwthread_req = 0;
+ kvmppc_release_hwthread(cpu);
+}
+
+
static void radix_flush_cpu(struct kvm *kvm, int cpu, struct kvm_vcpu *vcpu)
{
int i;
@@ -2274,7 +2298,7 @@ static int on_primary_thread(void)
if (kvmppc_grab_hwthread(cpu + thr)) {
/* Couldn't grab one; let the others go */
do {
- kvmppc_release_hwthread(cpu + thr);
+ kvmppc_release_hwthread_secondary(cpu + thr);
} while (--thr > 0);
return 0;
}
@@ -2702,8 +2726,9 @@ static noinline void kvmppc_run_core(struct kvmppc_vcore *vc)
kvmppc_vcore_preempt(pvc);
spin_unlock(&pvc->lock);
}
- for (i = 0; i < controlled_threads; ++i)
- kvmppc_release_hwthread(pcpu + i);
+ for (i = 1; i < controlled_threads; ++i)
+ kvmppc_release_hwthread_secondary(pcpu + i);
+ kvmppc_release_hwthread(pcpu);
return;
}
@@ -2858,11 +2883,13 @@ static noinline void kvmppc_run_core(struct kvmppc_vcore *vc)
/* Let secondaries go back to the offline loop */
for (i = 0; i < controlled_threads; ++i) {
- kvmppc_release_hwthread(pcpu + i);
if (sip && sip->napped[i])
kvmppc_ipi_thread(pcpu + i);
cpumask_clear_cpu(pcpu + i, &vc->kvm->arch.cpu_in_guest);
}
+ for (i = 1; i < controlled_threads; ++i)
+ kvmppc_release_hwthread_secondary(pcpu + i);
+ kvmppc_release_hwthread(pcpu);
spin_unlock(&vc->lock);
diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
index c52184a8efdf..3e024fd71fe8 100644
--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
@@ -149,9 +149,11 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S)
subf r4, r4, r3
mtspr SPRN_DEC, r4
+BEGIN_FTR_SECTION
/* hwthread_req may have got set by cede or no vcpu, so clear it */
li r0, 0
stb r0, HSTATE_HWTHREAD_REQ(r13)
+END_FTR_SECTION_IFCLR(CPU_FTR_ARCH_300)
/*
* For external interrupts we need to call the Linux
@@ -314,6 +316,7 @@ kvm_novcpu_exit:
* Relocation is off and most register values are lost.
* r13 points to the PACA.
* r3 contains the SRR1 wakeup value, SRR1 is trashed.
+ * This is not used by ISAv3.0B processors.
*/
.globl kvm_start_guest
kvm_start_guest:
@@ -432,6 +435,9 @@ kvm_secondary_got_guest:
* While waiting we also need to check if we get given a vcpu to run.
*/
kvm_no_guest:
+BEGIN_FTR_SECTION
+ twi 31,0,0
+END_FTR_SECTION_IFSET(CPU_FTR_ARCH_300)
lbz r3, HSTATE_HWTHREAD_REQ(r13)
cmpwi r3, 0
bne 53f
@@ -2509,8 +2515,10 @@ kvm_do_nap:
clrrdi r0, r0, 1
mtspr SPRN_CTRLT, r0
+BEGIN_FTR_SECTION
li r0,1
stb r0,HSTATE_HWTHREAD_REQ(r13)
+END_FTR_SECTION_IFCLR(CPU_FTR_ARCH_300)
mfspr r5,SPRN_LPCR
ori r5,r5,LPCR_PECE0 | LPCR_PECE1
BEGIN_FTR_SECTION
--
2.11.0
^ permalink raw reply related [flat|nested] 29+ messages in thread
* [PATCH 11/13] powerpc/64s: idle POWER9 can execute stop without ptesync
2017-08-05 17:02 [PATCH 00/13] idle and soft-irq improvements and POWER9 idle optimisation Nicholas Piggin
` (9 preceding siblings ...)
2017-08-05 17:02 ` [PATCH 10/13] powerpc/64s: idle simplify KVM idle on POWER9 Nicholas Piggin
@ 2017-08-05 17:02 ` Nicholas Piggin
2017-08-08 10:37 ` Gautham R Shenoy
2017-08-10 13:15 ` Michael Ellerman
2017-08-05 17:02 ` [PATCH 12/13] powerpc/64s: idle POWER9 can execute stop in virtual mode Nicholas Piggin
2017-08-05 17:02 ` [PATCH 13/13] powerpc/64s: idle ESL=0 stop can avoid all save/restore overhead Nicholas Piggin
12 siblings, 2 replies; 29+ messages in thread
From: Nicholas Piggin @ 2017-08-05 17:02 UTC (permalink / raw)
To: linuxppc-dev; +Cc: Nicholas Piggin, Gautham R . Shenoy
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
arch/powerpc/kernel/idle_book3s.S | 7 ++++---
arch/powerpc/kvm/book3s_hv_rmhandlers.S | 24 ++++++++++++------------
2 files changed, 16 insertions(+), 15 deletions(-)
diff --git a/arch/powerpc/kernel/idle_book3s.S b/arch/powerpc/kernel/idle_book3s.S
index 3ab73f9223e4..75746111e2c4 100644
--- a/arch/powerpc/kernel/idle_book3s.S
+++ b/arch/powerpc/kernel/idle_book3s.S
@@ -249,7 +249,7 @@ power_enter_stop:
andis. r4,r3,PSSCR_EC_ESL_MASK_SHIFTED
clrldi r3,r3,60 /* r3 = Bits[60:63] = Requested Level (RL) */
bne .Lhandle_esl_ec_set
- IDLE_STATE_ENTER_SEQ(PPC_STOP)
+ PPC_STOP
li r3,0 /* Since we didn't lose state, return 0 */
/*
@@ -282,7 +282,8 @@ power_enter_stop:
ld r4,ADDROFF(pnv_first_deep_stop_state)(r5)
cmpd r3,r4
bge .Lhandle_deep_stop
- IDLE_STATE_ENTER_SEQ_NORET(PPC_STOP)
+ PPC_STOP /* Does not return (system reset interrupt) */
+
.Lhandle_deep_stop:
/*
* Entering deep idle state.
@@ -304,7 +305,7 @@ lwarx_loop_stop:
bl save_sprs_to_stack
- IDLE_STATE_ENTER_SEQ_NORET(PPC_STOP)
+ PPC_STOP /* Does not return (system reset interrupt) */
/*
* Entered with MSR[EE]=0 and no soft-masked interrupts pending.
diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
index 3e024fd71fe8..edb47738a686 100644
--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
@@ -2527,7 +2527,17 @@ BEGIN_FTR_SECTION
END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S)
kvm_nap_sequence: /* desired LPCR value in r5 */
-BEGIN_FTR_SECTION
+BEGIN_FTR_SECTION /* nap sequence */
+ mtspr SPRN_LPCR,r5
+ isync
+ li r0, 0
+ std r0, HSTATE_SCRATCH0(r13)
+ ptesync
+ ld r0, HSTATE_SCRATCH0(r13)
+1: cmpd r0, r0
+ bne 1b
+ nap
+FTR_SECTION_ELSE /* stop sequence */
/*
* PSSCR bits: exit criterion = 1 (wakeup based on LPCR at sreset)
* enable state loss = 1 (allow SMT mode switch)
@@ -2539,18 +2549,8 @@ BEGIN_FTR_SECTION
li r4, LPCR_PECE_HVEE@higher
sldi r4, r4, 32
or r5, r5, r4
-END_FTR_SECTION_IFSET(CPU_FTR_ARCH_300)
mtspr SPRN_LPCR,r5
- isync
- li r0, 0
- std r0, HSTATE_SCRATCH0(r13)
- ptesync
- ld r0, HSTATE_SCRATCH0(r13)
-1: cmpd r0, r0
- bne 1b
-BEGIN_FTR_SECTION
- nap
-FTR_SECTION_ELSE
+
PPC_STOP
ALT_FTR_SECTION_END_IFCLR(CPU_FTR_ARCH_300)
b .
--
2.11.0
^ permalink raw reply related [flat|nested] 29+ messages in thread
* [PATCH 12/13] powerpc/64s: idle POWER9 can execute stop in virtual mode
2017-08-05 17:02 [PATCH 00/13] idle and soft-irq improvements and POWER9 idle optimisation Nicholas Piggin
` (10 preceding siblings ...)
2017-08-05 17:02 ` [PATCH 11/13] powerpc/64s: idle POWER9 can execute stop without ptesync Nicholas Piggin
@ 2017-08-05 17:02 ` Nicholas Piggin
2017-08-05 22:57 ` Benjamin Herrenschmidt
2017-08-08 10:45 ` Gautham R Shenoy
2017-08-05 17:02 ` [PATCH 13/13] powerpc/64s: idle ESL=0 stop can avoid all save/restore overhead Nicholas Piggin
12 siblings, 2 replies; 29+ messages in thread
From: Nicholas Piggin @ 2017-08-05 17:02 UTC (permalink / raw)
To: linuxppc-dev; +Cc: Nicholas Piggin, Gautham R . Shenoy, Paul Mackerras
The hardware can execute stop in any context, and KVM does not
require real mode. This saves a switch to real-mode when going
idle.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
arch/powerpc/kernel/idle_book3s.S | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/arch/powerpc/kernel/idle_book3s.S b/arch/powerpc/kernel/idle_book3s.S
index 75746111e2c4..8ac366a51bb5 100644
--- a/arch/powerpc/kernel/idle_book3s.S
+++ b/arch/powerpc/kernel/idle_book3s.S
@@ -141,7 +141,16 @@ pnv_powersave_common:
std r5,_CCR(r1)
std r1,PACAR1(r13)
+BEGIN_FTR_SECTION
+ /*
+ * POWER9 does not require real mode to stop, and does not set
+ * hwthread_state for KVM (threads don't share MMU context), so
+ * we can remain in virtual mode for this.
+ */
+ bctr
+END_FTR_SECTION_IFSET(CPU_FTR_ARCH_300)
/*
+ * POWER8
* Go to real mode to do the nap, as required by the architecture.
* Also, we need to be in real mode before setting hwthread_state,
* because as soon as we do that, another thread can switch
--
2.11.0
^ permalink raw reply related [flat|nested] 29+ messages in thread
* [PATCH 13/13] powerpc/64s: idle ESL=0 stop can avoid all save/restore overhead
2017-08-05 17:02 [PATCH 00/13] idle and soft-irq improvements and POWER9 idle optimisation Nicholas Piggin
` (11 preceding siblings ...)
2017-08-05 17:02 ` [PATCH 12/13] powerpc/64s: idle POWER9 can execute stop in virtual mode Nicholas Piggin
@ 2017-08-05 17:02 ` Nicholas Piggin
2017-08-05 17:11 ` Nicholas Piggin
2017-08-08 10:49 ` Gautham R Shenoy
12 siblings, 2 replies; 29+ messages in thread
From: Nicholas Piggin @ 2017-08-05 17:02 UTC (permalink / raw)
To: linuxppc-dev; +Cc: Nicholas Piggin, Gautham R . Shenoy
When stop is executed with EC=ESL=0, it appears to execute like a
normal instruction (resuming from NIP when woken by interrupt).
So all the save/restore handling can be avoided completely. In
particular NV GPRs do not have to be saved, and MSR does not have
to be switched back to kernel MSR.
So move the test for "lite" sleep states out to power9_idle_stop.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
arch/powerpc/kernel/idle_book3s.S | 38 +++++++++++++-------------------------
1 file changed, 13 insertions(+), 25 deletions(-)
diff --git a/arch/powerpc/kernel/idle_book3s.S b/arch/powerpc/kernel/idle_book3s.S
index 8ac366a51bb5..9eb47c99dc39 100644
--- a/arch/powerpc/kernel/idle_book3s.S
+++ b/arch/powerpc/kernel/idle_book3s.S
@@ -251,31 +251,8 @@ enter_winkle:
/*
* r3 - PSSCR value corresponding to the requested stop state.
*/
-power_enter_stop:
-/*
- * Check if we are executing the lite variant with ESL=EC=0
- */
- andis. r4,r3,PSSCR_EC_ESL_MASK_SHIFTED
+power_enter_stop_esl:
clrldi r3,r3,60 /* r3 = Bits[60:63] = Requested Level (RL) */
- bne .Lhandle_esl_ec_set
- PPC_STOP
- li r3,0 /* Since we didn't lose state, return 0 */
-
- /*
- * pnv_wakeup_noloss() expects r12 to contain the SRR1 value so
- * it can determine if the wakeup reason is an HMI in
- * CHECK_HMI_INTERRUPT.
- *
- * However, when we wakeup with ESL=0, SRR1 will not contain the wakeup
- * reason, so there is no point setting r12 to SRR1.
- *
- * Further, we clear r12 here, so that we don't accidentally enter the
- * HMI in pnv_wakeup_noloss() if the value of r12[42:45] == WAKE_HMI.
- */
- li r12, 0
- b pnv_wakeup_noloss
-
-.Lhandle_esl_ec_set:
/*
* POWER9 DD2 can incorrectly set PMAO when waking up after a
* state-loss idle. Saving and restoring MMCR0 over idle is a
@@ -348,9 +325,20 @@ ALT_FTR_SECTION_END_NESTED_IFSET(CPU_FTR_ARCH_207S, 66); \
* r3 contains desired PSSCR register value.
*/
_GLOBAL(power9_idle_stop)
+ /*
+ * Check if we are executing the lite variant with ESL=EC=0
+ * This case resumes execution after the stop instruction without
+ * losing any state, so nothing has to be saved.
+ */
+ andis. r4,r3,PSSCR_EC_ESL_MASK_SHIFTED
+ bne 1f
+ PPC_STOP
+ li r3,0 /* Since we didn't lose state, return 0 */
+ blr
+1: /* state-loss idle */
std r3, PACA_REQ_PSSCR(r13)
mtspr SPRN_PSSCR,r3
- LOAD_REG_ADDR(r4,power_enter_stop)
+ LOAD_REG_ADDR(r4,power_enter_stop_esl)
b pnv_powersave_common
/* No return */
--
2.11.0
^ permalink raw reply related [flat|nested] 29+ messages in thread
* Re: [PATCH 13/13] powerpc/64s: idle ESL=0 stop can avoid all save/restore overhead
2017-08-05 17:02 ` [PATCH 13/13] powerpc/64s: idle ESL=0 stop can avoid all save/restore overhead Nicholas Piggin
@ 2017-08-05 17:11 ` Nicholas Piggin
2017-08-08 10:49 ` Gautham R Shenoy
1 sibling, 0 replies; 29+ messages in thread
From: Nicholas Piggin @ 2017-08-05 17:11 UTC (permalink / raw)
To: linuxppc-dev; +Cc: Gautham R . Shenoy
On Sun, 6 Aug 2017 03:02:41 +1000
Nicholas Piggin <npiggin@gmail.com> wrote:
> When stop is executed with EC=ESL=0, it appears to execute like a
> normal instruction (resuming from NIP when woken by interrupt).
> So all the save/restore handling can be avoided completely. In
> particular NV GPRs do not have to be saved, and MSR does not have
> to be switched back to kernel MSR.
>
> So move the test for "lite" sleep states out to power9_idle_stop.
I forgot to mention this actually breaks in the simulator because
it sets MSR as though it's taken a system reset interrupt when
resuming from EC=0 stop. It works on real hardware.
This might have to be quirked away by firmware or something, but
I've reported the bug so waiting to hear back first.
Thanks,
Nick
>
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
> arch/powerpc/kernel/idle_book3s.S | 38 +++++++++++++-------------------------
> 1 file changed, 13 insertions(+), 25 deletions(-)
>
> diff --git a/arch/powerpc/kernel/idle_book3s.S b/arch/powerpc/kernel/idle_book3s.S
> index 8ac366a51bb5..9eb47c99dc39 100644
> --- a/arch/powerpc/kernel/idle_book3s.S
> +++ b/arch/powerpc/kernel/idle_book3s.S
> @@ -251,31 +251,8 @@ enter_winkle:
> /*
> * r3 - PSSCR value corresponding to the requested stop state.
> */
> -power_enter_stop:
> -/*
> - * Check if we are executing the lite variant with ESL=EC=0
> - */
> - andis. r4,r3,PSSCR_EC_ESL_MASK_SHIFTED
> +power_enter_stop_esl:
> clrldi r3,r3,60 /* r3 = Bits[60:63] = Requested Level (RL) */
> - bne .Lhandle_esl_ec_set
> - PPC_STOP
> - li r3,0 /* Since we didn't lose state, return 0 */
> -
> - /*
> - * pnv_wakeup_noloss() expects r12 to contain the SRR1 value so
> - * it can determine if the wakeup reason is an HMI in
> - * CHECK_HMI_INTERRUPT.
> - *
> - * However, when we wakeup with ESL=0, SRR1 will not contain the wakeup
> - * reason, so there is no point setting r12 to SRR1.
> - *
> - * Further, we clear r12 here, so that we don't accidentally enter the
> - * HMI in pnv_wakeup_noloss() if the value of r12[42:45] == WAKE_HMI.
> - */
> - li r12, 0
> - b pnv_wakeup_noloss
> -
> -.Lhandle_esl_ec_set:
> /*
> * POWER9 DD2 can incorrectly set PMAO when waking up after a
> * state-loss idle. Saving and restoring MMCR0 over idle is a
> @@ -348,9 +325,20 @@ ALT_FTR_SECTION_END_NESTED_IFSET(CPU_FTR_ARCH_207S, 66); \
> * r3 contains desired PSSCR register value.
> */
> _GLOBAL(power9_idle_stop)
> + /*
> + * Check if we are executing the lite variant with ESL=EC=0
> + * This case resumes execution after the stop instruction without
> + * losing any state, so nothing has to be saved.
> + */
> + andis. r4,r3,PSSCR_EC_ESL_MASK_SHIFTED
> + bne 1f
> + PPC_STOP
> + li r3,0 /* Since we didn't lose state, return 0 */
> + blr
> +1: /* state-loss idle */
> std r3, PACA_REQ_PSSCR(r13)
> mtspr SPRN_PSSCR,r3
> - LOAD_REG_ADDR(r4,power_enter_stop)
> + LOAD_REG_ADDR(r4,power_enter_stop_esl)
> b pnv_powersave_common
> /* No return */
>
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 12/13] powerpc/64s: idle POWER9 can execute stop in virtual mode
2017-08-05 17:02 ` [PATCH 12/13] powerpc/64s: idle POWER9 can execute stop in virtual mode Nicholas Piggin
@ 2017-08-05 22:57 ` Benjamin Herrenschmidt
2017-08-08 10:45 ` Gautham R Shenoy
1 sibling, 0 replies; 29+ messages in thread
From: Benjamin Herrenschmidt @ 2017-08-05 22:57 UTC (permalink / raw)
To: Nicholas Piggin, linuxppc-dev; +Cc: Gautham R . Shenoy, Paul Mackerras
On Sun, 2017-08-06 at 03:02 +1000, Nicholas Piggin wrote:
> The hardware can execute stop in any context, and KVM does not
> require real mode. This saves a switch to real-mode when going
> idle.
+Paulus.
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
> arch/powerpc/kernel/idle_book3s.S | 9 +++++++++
> 1 file changed, 9 insertions(+)
>
> diff --git a/arch/powerpc/kernel/idle_book3s.S b/arch/powerpc/kernel/idle_book3s.S
> index 75746111e2c4..8ac366a51bb5 100644
> --- a/arch/powerpc/kernel/idle_book3s.S
> +++ b/arch/powerpc/kernel/idle_book3s.S
> @@ -141,7 +141,16 @@ pnv_powersave_common:
> std r5,_CCR(r1)
> std r1,PACAR1(r13)
>
> +BEGIN_FTR_SECTION
> + /*
> + * POWER9 does not require real mode to stop, and does not set
> + * hwthread_state for KVM (threads don't share MMU context), so
> + * we can remain in virtual mode for this.
> + */
> + bctr
> +END_FTR_SECTION_IFSET(CPU_FTR_ARCH_300)
> /*
> + * POWER8
> * Go to real mode to do the nap, as required by the architecture.
> * Also, we need to be in real mode before setting hwthread_state,
> * because as soon as we do that, another thread can switch
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 08/13] powerpc/64s: irq replay remove spurious irq reason
2017-08-05 17:02 ` [PATCH 08/13] powerpc/64s: irq replay remove spurious irq reason Nicholas Piggin
@ 2017-08-05 23:00 ` Benjamin Herrenschmidt
2017-08-06 0:51 ` Nicholas Piggin
0 siblings, 1 reply; 29+ messages in thread
From: Benjamin Herrenschmidt @ 2017-08-05 23:00 UTC (permalink / raw)
To: Nicholas Piggin, linuxppc-dev
On Sun, 2017-08-06 at 03:02 +1000, Nicholas Piggin wrote:
> HVI interrupts have always used 0x500, so remove the dead branch.
Maybe we should fix that and "catch" in incorrect entry via 0x500
which would mean the XIVE is trying to deliver guest irqs to the OS...
That can happen if some LPCR bits aren't set properly and/or KVM
doesn't pull the guest in time. I had bugs like that in my early
dev so I've been running with a b . at 0x500 for a while :-)
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
> arch/powerpc/kernel/exceptions-64s.S | 2 --
> 1 file changed, 2 deletions(-)
>
> diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
> index 29253cecf713..566cf126c13b 100644
> --- a/arch/powerpc/kernel/exceptions-64s.S
> +++ b/arch/powerpc/kernel/exceptions-64s.S
> @@ -1680,8 +1680,6 @@ ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE | CPU_FTR_ARCH_300)
> BEGIN_FTR_SECTION
> cmpwi r3,0xa00
> beq h_doorbell_common_msgclr
> - cmpwi r3,0xea0
> - beq h_virt_irq_common
> cmpwi r3,0xe60
> beq hmi_exception_common
> FTR_SECTION_ELSE
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 08/13] powerpc/64s: irq replay remove spurious irq reason
2017-08-05 23:00 ` Benjamin Herrenschmidt
@ 2017-08-06 0:51 ` Nicholas Piggin
2017-08-10 13:12 ` Michael Ellerman
0 siblings, 1 reply; 29+ messages in thread
From: Nicholas Piggin @ 2017-08-06 0:51 UTC (permalink / raw)
To: Benjamin Herrenschmidt; +Cc: linuxppc-dev
On Sun, 06 Aug 2017 09:00:32 +1000
Benjamin Herrenschmidt <benh@au1.ibm.com> wrote:
> On Sun, 2017-08-06 at 03:02 +1000, Nicholas Piggin wrote:
> > HVI interrupts have always used 0x500, so remove the dead branch.
>
> Maybe we should fix that and "catch" in incorrect entry via 0x500
> which would mean the XIVE is trying to deliver guest irqs to the OS...
I should be more clear, when I say 0x500, it is only in reference to
the constant used by the soft-irq replay. After patch 6 the replay is
sent to the 0xea0 common handler.
> That can happen if some LPCR bits aren't set properly and/or KVM
> doesn't pull the guest in time. I had bugs like that in my early
> dev so I've been running with a b . at 0x500 for a while :-)
So that's a separate issue of hardware actually doing a 0x500. I
had this
http://patchwork.ozlabs.org/patch/750033/
>
> > Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> > ---
> > arch/powerpc/kernel/exceptions-64s.S | 2 --
> > 1 file changed, 2 deletions(-)
> >
> > diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
> > index 29253cecf713..566cf126c13b 100644
> > --- a/arch/powerpc/kernel/exceptions-64s.S
> > +++ b/arch/powerpc/kernel/exceptions-64s.S
> > @@ -1680,8 +1680,6 @@ ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE | CPU_FTR_ARCH_300)
> > BEGIN_FTR_SECTION
> > cmpwi r3,0xa00
> > beq h_doorbell_common_msgclr
> > - cmpwi r3,0xea0
> > - beq h_virt_irq_common
> > cmpwi r3,0xe60
> > beq hmi_exception_common
> > FTR_SECTION_ELSE
>
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 10/13] powerpc/64s: idle simplify KVM idle on POWER9
2017-08-05 17:02 ` [PATCH 10/13] powerpc/64s: idle simplify KVM idle on POWER9 Nicholas Piggin
@ 2017-08-08 10:36 ` Gautham R Shenoy
2017-08-08 12:42 ` Nicholas Piggin
2017-08-10 13:14 ` Michael Ellerman
1 sibling, 1 reply; 29+ messages in thread
From: Gautham R Shenoy @ 2017-08-08 10:36 UTC (permalink / raw)
To: Nicholas Piggin; +Cc: linuxppc-dev, Gautham R . Shenoy, Paul Mackerras, kvm-ppc
Hi Nicholas,
On Sun, Aug 06, 2017 at 03:02:38AM +1000, Nicholas Piggin wrote:
> POWER9 CPUs have independent MMU contexts per thread so KVM
> does not have to bring sibling threads into real-mode when
> switching MMU mode to guest. This can simplify POWER9 sleep/wake
> paths and avoids hwsyncs.
>
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
> arch/powerpc/include/asm/kvm_book3s_asm.h | 4 ++++
> arch/powerpc/kernel/idle_book3s.S | 8 ++-----
> arch/powerpc/kvm/book3s_hv.c | 37 ++++++++++++++++++++++++++-----
> arch/powerpc/kvm/book3s_hv_rmhandlers.S | 8 +++++++
> 4 files changed, 46 insertions(+), 11 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/kvm_book3s_asm.h b/arch/powerpc/include/asm/kvm_book3s_asm.h
> index 7cea76f11c26..83596f32f50b 100644
> --- a/arch/powerpc/include/asm/kvm_book3s_asm.h
> +++ b/arch/powerpc/include/asm/kvm_book3s_asm.h
> @@ -104,6 +104,10 @@ struct kvmppc_host_state {
> u8 napping;
>
> #ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE
> + /*
> + * hwthread_req/hwthread_state pair is used to pull sibling threads
> + * out of guest on pre-ISAv3.0B CPUs where threads share MMU.
> + */
> u8 hwthread_req;
> u8 hwthread_state;
> u8 host_ipi;
> diff --git a/arch/powerpc/kernel/idle_book3s.S b/arch/powerpc/kernel/idle_book3s.S
> index e6252c5a57a4..3ab73f9223e4 100644
> --- a/arch/powerpc/kernel/idle_book3s.S
> +++ b/arch/powerpc/kernel/idle_book3s.S
> @@ -243,12 +243,6 @@ enter_winkle:
> * r3 - PSSCR value corresponding to the requested stop state.
> */
> power_enter_stop:
> -#ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE
> - /* Tell KVM we're entering idle */
> - li r4,KVM_HWTHREAD_IN_IDLE
> - /* DO THIS IN REAL MODE! See comment above. */
> - stb r4,HSTATE_HWTHREAD_STATE(r13)
> -#endif
> /*
> * Check if we are executing the lite variant with ESL=EC=0
> */
> @@ -435,6 +429,7 @@ ALT_FTR_SECTION_END_IFSET(CPU_FTR_ARCH_300)
> mr r3,r12
>
> #ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE
> +BEGIN_FTR_SECTION
> li r0,KVM_HWTHREAD_IN_KERNEL
> stb r0,HSTATE_HWTHREAD_STATE(r13)
> /* Order setting hwthread_state vs. testing hwthread_req */
> @@ -444,6 +439,7 @@ ALT_FTR_SECTION_END_IFSET(CPU_FTR_ARCH_300)
> beq 1f
> b kvm_start_guest
> 1:
> +END_FTR_SECTION_IFCLR(CPU_FTR_ARCH_300)
This would be 7 nops on power9. Should we move this to a different
function and do a bl to that?
> #endif
>
> /* Return SRR1 from power7_nap() */
> diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
> index 359c79cdf0cc..bb1ab14f963a 100644
> --- a/arch/powerpc/kvm/book3s_hv.c
> +++ b/arch/powerpc/kvm/book3s_hv.c
> @@ -2111,6 +2111,16 @@ static int kvmppc_grab_hwthread(int cpu)
> struct paca_struct *tpaca;
> long timeout = 10000;
>
> + /*
> + * ISA v3.0 idle routines do not set hwthread_state or test
> + * hwthread_req, so they can not grab idle threads.
> + */
> + if (cpu_has_feature(CPU_FTR_ARCH_300)) {
> + WARN_ON(1);
> + pr_err("KVM: can not control sibling threads\n");
> + return -EBUSY;
> + }
> +
> tpaca = &paca[cpu];
>
> /* Ensure the thread won't go into the kernel if it wakes */
> @@ -2145,12 +2155,26 @@ static void kvmppc_release_hwthread(int cpu)
> struct paca_struct *tpaca;
>
> tpaca = &paca[cpu];
> - tpaca->kvm_hstate.hwthread_req = 0;
> tpaca->kvm_hstate.kvm_vcpu = NULL;
> tpaca->kvm_hstate.kvm_vcore = NULL;
> tpaca->kvm_hstate.kvm_split_mode = NULL;
> }
>
> +static void kvmppc_release_hwthread_secondary(int cpu)
> +{
> + struct paca_struct *tpaca;
> +
> + if (cpu_has_feature(CPU_FTR_ARCH_300)) {
> + WARN_ON(1);
> + return;
> + }
> +
> + tpaca = &paca[cpu];
> + tpaca->kvm_hstate.hwthread_req = 0;
> + kvmppc_release_hwthread(cpu);
> +}
> +
> +
Extra blank line not needed.
> static void radix_flush_cpu(struct kvm *kvm, int cpu, struct kvm_vcpu *vcpu)
> {
> int i;
> @@ -2274,7 +2298,7 @@ static int on_primary_thread(void)
> if (kvmppc_grab_hwthread(cpu + thr)) {
> /* Couldn't grab one; let the others go */
> do {
> - kvmppc_release_hwthread(cpu + thr);
> + kvmppc_release_hwthread_secondary(cpu + thr);
> } while (--thr > 0);
> return 0;
> }
> @@ -2702,8 +2726,9 @@ static noinline void kvmppc_run_core(struct kvmppc_vcore *vc)
> kvmppc_vcore_preempt(pvc);
> spin_unlock(&pvc->lock);
> }
> - for (i = 0; i < controlled_threads; ++i)
> - kvmppc_release_hwthread(pcpu + i);
> + for (i = 1; i < controlled_threads; ++i)
> + kvmppc_release_hwthread_secondary(pcpu + i);
> + kvmppc_release_hwthread(pcpu);
> return;
> }
>
> @@ -2858,11 +2883,13 @@ static noinline void kvmppc_run_core(struct kvmppc_vcore *vc)
>
> /* Let secondaries go back to the offline loop */
> for (i = 0; i < controlled_threads; ++i) {
> - kvmppc_release_hwthread(pcpu + i);
> if (sip && sip->napped[i])
> kvmppc_ipi_thread(pcpu + i);
> cpumask_clear_cpu(pcpu + i, &vc->kvm->arch.cpu_in_guest);
> }
We are sending an IPI to the thread that has exited the guest and is
currently napping. The IPI wakes it up so that it can executes
offline loop. But we haven't released the hwthread yet, which means
that hwthread_req for this thread is still set.
The thread wakes up from nap, executes the pnv_powersave_wakeup code
where it can enter kvm_start_guest. Is this a legitimate race or am I
missing something?
> + for (i = 1; i < controlled_threads; ++i)
> + kvmppc_release_hwthread_secondary(pcpu + i);
> + kvmppc_release_hwthread(pcpu);
>
> spin_unlock(&vc->lock);
>
> diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> index c52184a8efdf..3e024fd71fe8 100644
> --- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> +++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> @@ -149,9 +149,11 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S)
> subf r4, r4, r3
> mtspr SPRN_DEC, r4
>
> +BEGIN_FTR_SECTION
> /* hwthread_req may have got set by cede or no vcpu, so clear it */
> li r0, 0
> stb r0, HSTATE_HWTHREAD_REQ(r13)
> +END_FTR_SECTION_IFCLR(CPU_FTR_ARCH_300)
>
> /*
> * For external interrupts we need to call the Linux
> @@ -314,6 +316,7 @@ kvm_novcpu_exit:
> * Relocation is off and most register values are lost.
> * r13 points to the PACA.
> * r3 contains the SRR1 wakeup value, SRR1 is trashed.
> + * This is not used by ISAv3.0B processors.
> */
> .globl kvm_start_guest
> kvm_start_guest:
> @@ -432,6 +435,9 @@ kvm_secondary_got_guest:
> * While waiting we also need to check if we get given a vcpu to run.
> */
> kvm_no_guest:
> +BEGIN_FTR_SECTION
> + twi 31,0,0
> +END_FTR_SECTION_IFSET(CPU_FTR_ARCH_300)
> lbz r3, HSTATE_HWTHREAD_REQ(r13)
> cmpwi r3, 0
> bne 53f
> @@ -2509,8 +2515,10 @@ kvm_do_nap:
> clrrdi r0, r0, 1
> mtspr SPRN_CTRLT, r0
>
> +BEGIN_FTR_SECTION
> li r0,1
> stb r0,HSTATE_HWTHREAD_REQ(r13)
> +END_FTR_SECTION_IFCLR(CPU_FTR_ARCH_300)
> mfspr r5,SPRN_LPCR
> ori r5,r5,LPCR_PECE0 | LPCR_PECE1
> BEGIN_FTR_SECTION
> --
> 2.11.0
>
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 11/13] powerpc/64s: idle POWER9 can execute stop without ptesync
2017-08-05 17:02 ` [PATCH 11/13] powerpc/64s: idle POWER9 can execute stop without ptesync Nicholas Piggin
@ 2017-08-08 10:37 ` Gautham R Shenoy
2017-08-10 13:15 ` Michael Ellerman
1 sibling, 0 replies; 29+ messages in thread
From: Gautham R Shenoy @ 2017-08-08 10:37 UTC (permalink / raw)
To: Nicholas Piggin; +Cc: linuxppc-dev, Gautham R . Shenoy
On Sun, Aug 06, 2017 at 03:02:39AM +1000, Nicholas Piggin wrote:
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
--
Thanks and Regards
gautham.
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 12/13] powerpc/64s: idle POWER9 can execute stop in virtual mode
2017-08-05 17:02 ` [PATCH 12/13] powerpc/64s: idle POWER9 can execute stop in virtual mode Nicholas Piggin
2017-08-05 22:57 ` Benjamin Herrenschmidt
@ 2017-08-08 10:45 ` Gautham R Shenoy
1 sibling, 0 replies; 29+ messages in thread
From: Gautham R Shenoy @ 2017-08-08 10:45 UTC (permalink / raw)
To: Nicholas Piggin; +Cc: linuxppc-dev, Gautham R . Shenoy, Paul Mackerras
On Sun, Aug 06, 2017 at 03:02:40AM +1000, Nicholas Piggin wrote:
> The hardware can execute stop in any context, and KVM does not
> require real mode. This saves a switch to real-mode when going
> idle.
>
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Acked-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
> ---
> arch/powerpc/kernel/idle_book3s.S | 9 +++++++++
> 1 file changed, 9 insertions(+)
>
> diff --git a/arch/powerpc/kernel/idle_book3s.S b/arch/powerpc/kernel/idle_book3s.S
> index 75746111e2c4..8ac366a51bb5 100644
> --- a/arch/powerpc/kernel/idle_book3s.S
> +++ b/arch/powerpc/kernel/idle_book3s.S
> @@ -141,7 +141,16 @@ pnv_powersave_common:
> std r5,_CCR(r1)
> std r1,PACAR1(r13)
>
> +BEGIN_FTR_SECTION
> + /*
> + * POWER9 does not require real mode to stop, and does not set
> + * hwthread_state for KVM (threads don't share MMU context), so
> + * we can remain in virtual mode for this.
> + */
> + bctr
> +END_FTR_SECTION_IFSET(CPU_FTR_ARCH_300)
> /*
> + * POWER8
> * Go to real mode to do the nap, as required by the architecture.
> * Also, we need to be in real mode before setting hwthread_state,
> * because as soon as we do that, another thread can switch
> --
> 2.11.0
>
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 13/13] powerpc/64s: idle ESL=0 stop can avoid all save/restore overhead
2017-08-05 17:02 ` [PATCH 13/13] powerpc/64s: idle ESL=0 stop can avoid all save/restore overhead Nicholas Piggin
2017-08-05 17:11 ` Nicholas Piggin
@ 2017-08-08 10:49 ` Gautham R Shenoy
1 sibling, 0 replies; 29+ messages in thread
From: Gautham R Shenoy @ 2017-08-08 10:49 UTC (permalink / raw)
To: Nicholas Piggin; +Cc: linuxppc-dev, Gautham R . Shenoy
On Sun, Aug 06, 2017 at 03:02:41AM +1000, Nicholas Piggin wrote:
> When stop is executed with EC=ESL=0, it appears to execute like a
> normal instruction (resuming from NIP when woken by interrupt).
> So all the save/restore handling can be avoided completely. In
> particular NV GPRs do not have to be saved, and MSR does not have
> to be switched back to kernel MSR.
>
> So move the test for "lite" sleep states out to power9_idle_stop.
Nice optimization!
Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
>
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
> arch/powerpc/kernel/idle_book3s.S | 38 +++++++++++++-------------------------
> 1 file changed, 13 insertions(+), 25 deletions(-)
>
> diff --git a/arch/powerpc/kernel/idle_book3s.S b/arch/powerpc/kernel/idle_book3s.S
> index 8ac366a51bb5..9eb47c99dc39 100644
> --- a/arch/powerpc/kernel/idle_book3s.S
> +++ b/arch/powerpc/kernel/idle_book3s.S
> @@ -251,31 +251,8 @@ enter_winkle:
> /*
> * r3 - PSSCR value corresponding to the requested stop state.
> */
> -power_enter_stop:
> -/*
> - * Check if we are executing the lite variant with ESL=EC=0
> - */
> - andis. r4,r3,PSSCR_EC_ESL_MASK_SHIFTED
> +power_enter_stop_esl:
> clrldi r3,r3,60 /* r3 = Bits[60:63] = Requested Level (RL) */
> - bne .Lhandle_esl_ec_set
> - PPC_STOP
> - li r3,0 /* Since we didn't lose state, return 0 */
> -
> - /*
> - * pnv_wakeup_noloss() expects r12 to contain the SRR1 value so
> - * it can determine if the wakeup reason is an HMI in
> - * CHECK_HMI_INTERRUPT.
> - *
> - * However, when we wakeup with ESL=0, SRR1 will not contain the wakeup
> - * reason, so there is no point setting r12 to SRR1.
> - *
> - * Further, we clear r12 here, so that we don't accidentally enter the
> - * HMI in pnv_wakeup_noloss() if the value of r12[42:45] == WAKE_HMI.
> - */
> - li r12, 0
> - b pnv_wakeup_noloss
> -
> -.Lhandle_esl_ec_set:
> /*
> * POWER9 DD2 can incorrectly set PMAO when waking up after a
> * state-loss idle. Saving and restoring MMCR0 over idle is a
> @@ -348,9 +325,20 @@ ALT_FTR_SECTION_END_NESTED_IFSET(CPU_FTR_ARCH_207S, 66); \
> * r3 contains desired PSSCR register value.
> */
> _GLOBAL(power9_idle_stop)
> + /*
> + * Check if we are executing the lite variant with ESL=EC=0
> + * This case resumes execution after the stop instruction without
> + * losing any state, so nothing has to be saved.
> + */
> + andis. r4,r3,PSSCR_EC_ESL_MASK_SHIFTED
> + bne 1f
> + PPC_STOP
> + li r3,0 /* Since we didn't lose state, return 0 */
> + blr
> +1: /* state-loss idle */
> std r3, PACA_REQ_PSSCR(r13)
> mtspr SPRN_PSSCR,r3
> - LOAD_REG_ADDR(r4,power_enter_stop)
> + LOAD_REG_ADDR(r4,power_enter_stop_esl)
> b pnv_powersave_common
> /* No return */
>
> --
> 2.11.0
>
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 10/13] powerpc/64s: idle simplify KVM idle on POWER9
2017-08-08 10:36 ` Gautham R Shenoy
@ 2017-08-08 12:42 ` Nicholas Piggin
2017-08-10 6:24 ` Gautham R Shenoy
0 siblings, 1 reply; 29+ messages in thread
From: Nicholas Piggin @ 2017-08-08 12:42 UTC (permalink / raw)
To: Gautham R Shenoy; +Cc: linuxppc-dev, Paul Mackerras, kvm-ppc
On Tue, 8 Aug 2017 16:06:43 +0530
Gautham R Shenoy <ego@linux.vnet.ibm.com> wrote:
> Hi Nicholas,
>
> On Sun, Aug 06, 2017 at 03:02:38AM +1000, Nicholas Piggin wrote:
> > POWER9 CPUs have independent MMU contexts per thread so KVM
> > does not have to bring sibling threads into real-mode when
> > switching MMU mode to guest. This can simplify POWER9 sleep/wake
> > paths and avoids hwsyncs.
> >
> > @@ -444,6 +439,7 @@ ALT_FTR_SECTION_END_IFSET(CPU_FTR_ARCH_300)
> > beq 1f
> > b kvm_start_guest
> > 1:
> > +END_FTR_SECTION_IFCLR(CPU_FTR_ARCH_300)
>
> This would be 7 nops on power9. Should we move this to a different
> function and do a bl to that?
Yes that's a good idea.
> > +static void kvmppc_release_hwthread_secondary(int cpu)
> > +{
> > + struct paca_struct *tpaca;
> > +
> > + if (cpu_has_feature(CPU_FTR_ARCH_300)) {
> > + WARN_ON(1);
> > + return;
> > + }
> > +
> > + tpaca = &paca[cpu];
> > + tpaca->kvm_hstate.hwthread_req = 0;
> > + kvmppc_release_hwthread(cpu);
> > +}
> > +
> > +
>
> Extra blank line not needed.
Sure.
> > @@ -2858,11 +2883,13 @@ static noinline void kvmppc_run_core(struct kvmppc_vcore *vc)
> >
> > /* Let secondaries go back to the offline loop */
> > for (i = 0; i < controlled_threads; ++i) {
> > - kvmppc_release_hwthread(pcpu + i);
> > if (sip && sip->napped[i])
> > kvmppc_ipi_thread(pcpu + i);
> > cpumask_clear_cpu(pcpu + i, &vc->kvm->arch.cpu_in_guest);
> > }
>
> We are sending an IPI to the thread that has exited the guest and is
> currently napping. The IPI wakes it up so that it can executes
> offline loop. But we haven't released the hwthread yet, which means
> that hwthread_req for this thread is still set.
>
> The thread wakes up from nap, executes the pnv_powersave_wakeup code
> where it can enter kvm_start_guest. Is this a legitimate race or am I
> missing something?
Oh I think it's just a silly mistake in my patch, good catch.
Would moving this loop below the one below solve it? I wasn't
completely happy with uglifying these loops by making the
primary release different than secondary... maybe I will just
move the difference into kvmppc_release_hwthread and which is
less intrusive to callers.
Thanks,
Nick
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 10/13] powerpc/64s: idle simplify KVM idle on POWER9
2017-08-08 12:42 ` Nicholas Piggin
@ 2017-08-10 6:24 ` Gautham R Shenoy
0 siblings, 0 replies; 29+ messages in thread
From: Gautham R Shenoy @ 2017-08-10 6:24 UTC (permalink / raw)
To: Nicholas Piggin; +Cc: Gautham R Shenoy, linuxppc-dev, Paul Mackerras, kvm-ppc
On Tue, Aug 08, 2017 at 10:42:57PM +1000, Nicholas Piggin wrote:
> On Tue, 8 Aug 2017 16:06:43 +0530
> Gautham R Shenoy <ego@linux.vnet.ibm.com> wrote:
>
> > Hi Nicholas,
> >
> > On Sun, Aug 06, 2017 at 03:02:38AM +1000, Nicholas Piggin wrote:
> > > POWER9 CPUs have independent MMU contexts per thread so KVM
> > > does not have to bring sibling threads into real-mode when
> > > switching MMU mode to guest. This can simplify POWER9 sleep/wake
> > > paths and avoids hwsyncs.
> > > @@ -2858,11 +2883,13 @@ static noinline void kvmppc_run_core(struct kvmppc_vcore *vc)
> > >
> > > /* Let secondaries go back to the offline loop */
> > > for (i = 0; i < controlled_threads; ++i) {
> > > - kvmppc_release_hwthread(pcpu + i);
> > > if (sip && sip->napped[i])
> > > kvmppc_ipi_thread(pcpu + i);
> > > cpumask_clear_cpu(pcpu + i, &vc->kvm->arch.cpu_in_guest);
> > > }
> >
> > We are sending an IPI to the thread that has exited the guest and is
> > currently napping. The IPI wakes it up so that it can executes
> > offline loop. But we haven't released the hwthread yet, which means
> > that hwthread_req for this thread is still set.
> >
> > The thread wakes up from nap, executes the pnv_powersave_wakeup code
> > where it can enter kvm_start_guest. Is this a legitimate race or am I
> > missing something?
>
> Oh I think it's just a silly mistake in my patch, good catch.
Ah,np!
> Would moving this loop below the one below solve it? I wasn't
> completely happy with uglifying these loops by making the
> primary release different than secondary... maybe I will just
> move the difference into kvmppc_release_hwthread and which is
> less intrusive to callers.
I think moving it to kvmppc_release_hwthread is a good idea.
>
> Thanks,
> Nick
>
--
Thanks and Regards
gautham.
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 08/13] powerpc/64s: irq replay remove spurious irq reason
2017-08-06 0:51 ` Nicholas Piggin
@ 2017-08-10 13:12 ` Michael Ellerman
0 siblings, 0 replies; 29+ messages in thread
From: Michael Ellerman @ 2017-08-10 13:12 UTC (permalink / raw)
To: Nicholas Piggin, Benjamin Herrenschmidt; +Cc: linuxppc-dev
Nicholas Piggin <npiggin@gmail.com> writes:
> On Sun, 06 Aug 2017 09:00:32 +1000
> Benjamin Herrenschmidt <benh@au1.ibm.com> wrote:
>
>> On Sun, 2017-08-06 at 03:02 +1000, Nicholas Piggin wrote:
>> > HVI interrupts have always used 0x500, so remove the dead branch.
>>
>> Maybe we should fix that and "catch" in incorrect entry via 0x500
>> which would mean the XIVE is trying to deliver guest irqs to the OS...
>
> I should be more clear, when I say 0x500, it is only in reference to
> the constant used by the soft-irq replay. After patch 6 the replay is
> sent to the 0xea0 common handler.
>
>> That can happen if some LPCR bits aren't set properly and/or KVM
>> doesn't pull the guest in time. I had bugs like that in my early
>> dev so I've been running with a b . at 0x500 for a while :-)
>
> So that's a separate issue of hardware actually doing a 0x500. I
> had this
>
> http://patchwork.ozlabs.org/patch/750033/
Hmm, yeah I was going to merge that.
But I got side tracked by HVICE, ie. the "interrupts don't go to 0x500"
is only true if we set HVICE.
Which we currently always do, in __setup_cpu_power9(), but then we have
a DT CPU feature for it, so we really shouldn't be always enabling
HVICE, we should leave it up to the DT CPU feature code.
And then it becomes controlled by the device tree, which makes your
patch potentially wrong depending on the device tree CPU features.
Ugh.
cheers
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 10/13] powerpc/64s: idle simplify KVM idle on POWER9
2017-08-05 17:02 ` [PATCH 10/13] powerpc/64s: idle simplify KVM idle on POWER9 Nicholas Piggin
2017-08-08 10:36 ` Gautham R Shenoy
@ 2017-08-10 13:14 ` Michael Ellerman
2017-08-10 13:53 ` Nicholas Piggin
1 sibling, 1 reply; 29+ messages in thread
From: Michael Ellerman @ 2017-08-10 13:14 UTC (permalink / raw)
To: Nicholas Piggin, linuxppc-dev
Cc: Gautham R . Shenoy, kvm-ppc, Nicholas Piggin
Nicholas Piggin <npiggin@gmail.com> writes:
> POWER9 CPUs have independent MMU contexts per thread so KVM
> does not have to bring sibling threads into real-mode when
> switching MMU mode to guest. This can simplify POWER9 sleep/wake
> paths and avoids hwsyncs.
>
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
> arch/powerpc/include/asm/kvm_book3s_asm.h | 4 ++++
> arch/powerpc/kernel/idle_book3s.S | 8 ++-----
> arch/powerpc/kvm/book3s_hv.c | 37 ++++++++++++++++++++++++++-----
> arch/powerpc/kvm/book3s_hv_rmhandlers.S | 8 +++++++
This will need to go via, or at least be shared with Paul's tree.
So if it's possible, splitting it out of this series would be easier.
cheers
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 11/13] powerpc/64s: idle POWER9 can execute stop without ptesync
2017-08-05 17:02 ` [PATCH 11/13] powerpc/64s: idle POWER9 can execute stop without ptesync Nicholas Piggin
2017-08-08 10:37 ` Gautham R Shenoy
@ 2017-08-10 13:15 ` Michael Ellerman
2017-08-10 14:03 ` Nicholas Piggin
1 sibling, 1 reply; 29+ messages in thread
From: Michael Ellerman @ 2017-08-10 13:15 UTC (permalink / raw)
To: Nicholas Piggin, linuxppc-dev; +Cc: Gautham R . Shenoy, Nicholas Piggin
Nicholas Piggin <npiggin@gmail.com> writes:
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
> arch/powerpc/kernel/idle_book3s.S | 7 ++++---
> arch/powerpc/kvm/book3s_hv_rmhandlers.S | 24 ++++++++++++------------
If you can split this into a KVM and non-KVM patch that would be
helpful.
cheers
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 10/13] powerpc/64s: idle simplify KVM idle on POWER9
2017-08-10 13:14 ` Michael Ellerman
@ 2017-08-10 13:53 ` Nicholas Piggin
0 siblings, 0 replies; 29+ messages in thread
From: Nicholas Piggin @ 2017-08-10 13:53 UTC (permalink / raw)
To: Michael Ellerman; +Cc: linuxppc-dev, Gautham R . Shenoy, kvm-ppc
On Thu, 10 Aug 2017 23:14:46 +1000
Michael Ellerman <mpe@ellerman.id.au> wrote:
> Nicholas Piggin <npiggin@gmail.com> writes:
>
> > POWER9 CPUs have independent MMU contexts per thread so KVM
> > does not have to bring sibling threads into real-mode when
> > switching MMU mode to guest. This can simplify POWER9 sleep/wake
> > paths and avoids hwsyncs.
> >
> > Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> > ---
> > arch/powerpc/include/asm/kvm_book3s_asm.h | 4 ++++
> > arch/powerpc/kernel/idle_book3s.S | 8 ++-----
> > arch/powerpc/kvm/book3s_hv.c | 37 ++++++++++++++++++++++++++-----
> > arch/powerpc/kvm/book3s_hv_rmhandlers.S | 8 +++++++
>
> This will need to go via, or at least be shared with Paul's tree.
>
> So if it's possible, splitting it out of this series would be easier.
I agree it's really a KVM patch, but patch 12 depends on this,
it is a Linux patch. Not sure how you want to handle that?
Thanks,
Nick
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 11/13] powerpc/64s: idle POWER9 can execute stop without ptesync
2017-08-10 13:15 ` Michael Ellerman
@ 2017-08-10 14:03 ` Nicholas Piggin
0 siblings, 0 replies; 29+ messages in thread
From: Nicholas Piggin @ 2017-08-10 14:03 UTC (permalink / raw)
To: Michael Ellerman; +Cc: linuxppc-dev, Gautham R . Shenoy
On Thu, 10 Aug 2017 23:15:50 +1000
Michael Ellerman <mpe@ellerman.id.au> wrote:
> Nicholas Piggin <npiggin@gmail.com> writes:
>
> > Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> > ---
> > arch/powerpc/kernel/idle_book3s.S | 7 ++++---
> > arch/powerpc/kvm/book3s_hv_rmhandlers.S | 24 ++++++++++++------------
>
> If you can split this into a KVM and non-KVM patch that would be
> helpful.
Yes I can do that.
^ permalink raw reply [flat|nested] 29+ messages in thread
end of thread, other threads:[~2017-08-10 14:03 UTC | newest]
Thread overview: 29+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-08-05 17:02 [PATCH 00/13] idle and soft-irq improvements and POWER9 idle optimisation Nicholas Piggin
2017-08-05 17:02 ` [PATCH 01/13] powerpc/64s: masked interrupt avoid branch Nicholas Piggin
2017-08-05 17:02 ` [PATCH 02/13] powerpc/64s: masked interrupt avoid instruction Nicholas Piggin
2017-08-05 17:02 ` [PATCH 03/13] powerpc/64s: masked interrupt returns to kernel so avoid r13 restore Nicholas Piggin
2017-08-05 17:02 ` [PATCH 04/13] powerpc/64: cleanup __check_irq_replay Nicholas Piggin
2017-08-05 17:02 ` [PATCH 05/13] powerpc/64s: irq replay merge HV and non-HV paths for doorbell replay Nicholas Piggin
2017-08-05 17:02 ` [PATCH 06/13] powerpc/64s: irq replay external use the HV handler in HV mode on POWER9 Nicholas Piggin
2017-08-05 17:02 ` [PATCH 07/13] powerpc/64: remove redundant instruction in interrupt replay Nicholas Piggin
2017-08-05 17:02 ` [PATCH 08/13] powerpc/64s: irq replay remove spurious irq reason Nicholas Piggin
2017-08-05 23:00 ` Benjamin Herrenschmidt
2017-08-06 0:51 ` Nicholas Piggin
2017-08-10 13:12 ` Michael Ellerman
2017-08-05 17:02 ` [PATCH 09/13] powerpc/64: runlatch CTRL[RUN] set optimisation Nicholas Piggin
2017-08-05 17:02 ` [PATCH 10/13] powerpc/64s: idle simplify KVM idle on POWER9 Nicholas Piggin
2017-08-08 10:36 ` Gautham R Shenoy
2017-08-08 12:42 ` Nicholas Piggin
2017-08-10 6:24 ` Gautham R Shenoy
2017-08-10 13:14 ` Michael Ellerman
2017-08-10 13:53 ` Nicholas Piggin
2017-08-05 17:02 ` [PATCH 11/13] powerpc/64s: idle POWER9 can execute stop without ptesync Nicholas Piggin
2017-08-08 10:37 ` Gautham R Shenoy
2017-08-10 13:15 ` Michael Ellerman
2017-08-10 14:03 ` Nicholas Piggin
2017-08-05 17:02 ` [PATCH 12/13] powerpc/64s: idle POWER9 can execute stop in virtual mode Nicholas Piggin
2017-08-05 22:57 ` Benjamin Herrenschmidt
2017-08-08 10:45 ` Gautham R Shenoy
2017-08-05 17:02 ` [PATCH 13/13] powerpc/64s: idle ESL=0 stop can avoid all save/restore overhead Nicholas Piggin
2017-08-05 17:11 ` Nicholas Piggin
2017-08-08 10:49 ` Gautham R Shenoy
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).