* [Qemu-devel] [PATCH 0/2] disable the decrementer interrupt when a CPU is unplugged
@ 2017-10-05 16:49 Cédric Le Goater
2017-10-05 16:49 ` [Qemu-devel] [PATCH 1/2] spapr/rtas: " Cédric Le Goater
` (2 more replies)
0 siblings, 3 replies; 16+ messages in thread
From: Cédric Le Goater @ 2017-10-05 16:49 UTC (permalink / raw)
To: qemu-ppc, qemu-devel, David Gibson, Nikunj A Dadhania,
Benjamin Herrenschmidt, Alexey Kardashevskiy
Cc: Cédric Le Goater
Hello,
When a CPU is stopped with the 'stop-self' RTAS call, its state
'halted' is switched to 1 and, in this case, the MSR is not taken into
account anymore in the cpu_has_work() routine. Only the pending
hardware interrupts are checked with their LPCR:PECE* enablement bit.
If the DECR timer fires after 'stop-self' is called and before the CPU
'stop' state is reached, the nearly-dead CPU will have some work to do
and the guest will crash. This case happens very frequently with the
not yet upstream P9 XIVE exploitation mode. In XICS mode, the DECR is
occasionally fired but after 'stop' state, so no work is to be done
and the guest survives.
I suspect there is a race between the QEMU mainloop triggering the
timers and the TCG CPU thread but I could not quite identify the root
cause. To be safe, let's disable the decrementer interrupt in the LPCR
when the CPU is halted and reenable it when the CPU is restarted.
Resetting the MSR is now pointless, so remove this dubious workaround.
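To illustrate the wake-up check described above, here is a minimal, self-contained sketch. It is a simplified model and not the QEMU code: the struct, the function and the MODEL_* bit values are all hypothetical stand-ins. It only shows that, once 'halted' is set, the MSR is ignored and a pending interrupt counts as work if its LPCR wake-enable bit is set.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins; these are NOT the QEMU definitions. */
#define MODEL_IRQ_DECR   (1u << 0)   /* a decrementer interrupt is pending */
#define MODEL_IRQ_EXT    (1u << 1)   /* an external interrupt is pending */
#define MODEL_PECE_DECR  (1u << 0)   /* LPCR wake-enable bit for DECR */
#define MODEL_PECE_EXT   (1u << 1)   /* LPCR wake-enable bit for external */
#define MODEL_MSR_EE     (1u << 15)  /* MSR external-interrupt enable */

struct model_cpu {
    bool halted;       /* set to true by 'stop-self' */
    uint32_t msr;
    uint32_t lpcr;
    uint32_t pending;  /* mask of pending MODEL_IRQ_* */
};

/* Returns true when the modelled CPU has work to do. */
static bool model_cpu_has_work(const struct model_cpu *cpu)
{
    if (cpu->halted) {
        /* The MSR is not consulted: only the LPCR wake bits gate wake-up. */
        if ((cpu->pending & MODEL_IRQ_DECR) && (cpu->lpcr & MODEL_PECE_DECR)) {
            return true;
        }
        if ((cpu->pending & MODEL_IRQ_EXT) && (cpu->lpcr & MODEL_PECE_EXT)) {
            return true;
        }
        return false;
    }
    /* Running CPU: a pending interrupt only matters with MSR[EE] set. */
    return (cpu->msr & MODEL_MSR_EE) && cpu->pending != 0;
}
```

In this model, clearing the MSR has no effect on a halted CPU with a pending decrementer, whereas clearing MODEL_PECE_DECR makes model_cpu_has_work() return false. That is the behaviour the series relies on.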
Thanks,
C.
Cédric Le Goater (2):
spapr/rtas: disable the decrementer interrupt when a CPU is unplugged
spapr/rtas: do not reset the MSR in stop-self command
hw/ppc/spapr_rtas.c | 26 ++++++++++++++++----------
1 file changed, 16 insertions(+), 10 deletions(-)
--
2.13.6
* [Qemu-devel] [PATCH 1/2] spapr/rtas: disable the decrementer interrupt when a CPU is unplugged
2017-10-05 16:49 [Qemu-devel] [PATCH 0/2] disable the decrementer interrupt when a CPU is unplugged Cédric Le Goater
@ 2017-10-05 16:49 ` Cédric Le Goater
2017-10-06 9:07 ` David Gibson
2017-10-05 16:49 ` [Qemu-devel] [PATCH 2/2] spapr/rtas: do not reset the MSR in stop-self command Cédric Le Goater
2017-10-06 6:10 ` [Qemu-devel] [PATCH 0/2] disable the decrementer interrupt when a CPU is unplugged Nikunj A Dadhania
2 siblings, 1 reply; 16+ messages in thread
From: Cédric Le Goater @ 2017-10-05 16:49 UTC (permalink / raw)
To: qemu-ppc, qemu-devel, David Gibson, Nikunj A Dadhania,
Benjamin Herrenschmidt, Alexey Kardashevskiy
Cc: Cédric Le Goater
When a CPU is stopped with the 'stop-self' RTAS call, its state
'halted' is switched to 1 and, in this case, the MSR is not taken into
account anymore in the cpu_has_work() routine. Only the pending
hardware interrupts are checked with their LPCR:PECE* enablement bit.
If the DECR timer fires after 'stop-self' is called and before the CPU
'stop' state is reached, the nearly-dead CPU will have some work to do
and the guest will crash. This case happens very frequently with the
not yet upstream P9 XIVE exploitation mode. In XICS mode, the DECR is
occasionally fired but after 'stop' state, so no work is to be done
and the guest survives.
I suspect there is a race between the QEMU mainloop triggering the
timers and the TCG CPU thread but I could not quite identify the root
cause. To be safe, let's disable the decrementer interrupt in the LPCR
when the CPU is halted and reenable it when the CPU is restarted.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
---
hw/ppc/spapr_rtas.c | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
diff --git a/hw/ppc/spapr_rtas.c b/hw/ppc/spapr_rtas.c
index cdf0b607a0a0..2389220c9738 100644
--- a/hw/ppc/spapr_rtas.c
+++ b/hw/ppc/spapr_rtas.c
@@ -174,6 +174,15 @@ static void rtas_start_cpu(PowerPCCPU *cpu_, sPAPRMachineState *spapr,
kvm_cpu_synchronize_state(cs);
env->msr = (1ULL << MSR_SF) | (1ULL << MSR_ME);
+
+ /* Enable DECR interrupt */
+ if (env->mmu_model == POWERPC_MMU_3_00) {
+ env->spr[SPR_LPCR] |= LPCR_DEE;
+ } else {
+ /* P7 and P8 both have same bit for DECR */
+ env->spr[SPR_LPCR] |= LPCR_P8_PECE3;
+ }
+
env->nip = start;
env->gpr[3] = r3;
cs->halted = 0;
@@ -210,6 +219,13 @@ static void rtas_stop_self(PowerPCCPU *cpu, sPAPRMachineState *spapr,
* no need to bother with specific bits, we just clear it.
*/
env->msr = 0;
+
+ if (env->mmu_model == POWERPC_MMU_3_00) {
+ env->spr[SPR_LPCR] &= ~LPCR_DEE;
+ } else {
+ /* P7 and P8 both have same bit for DECR */
+ env->spr[SPR_LPCR] &= ~LPCR_P8_PECE3;
+ }
}
static inline int sysparm_st(target_ulong addr, target_ulong len,
--
2.13.6
* [Qemu-devel] [PATCH 2/2] spapr/rtas: do not reset the MSR in stop-self command
2017-10-05 16:49 [Qemu-devel] [PATCH 0/2] disable the decrementer interrupt when a CPU is unplugged Cédric Le Goater
2017-10-05 16:49 ` [Qemu-devel] [PATCH 1/2] spapr/rtas: " Cédric Le Goater
@ 2017-10-05 16:49 ` Cédric Le Goater
2017-10-06 9:08 ` David Gibson
2017-10-06 6:10 ` [Qemu-devel] [PATCH 0/2] disable the decrementer interrupt when a CPU is unplugged Nikunj A Dadhania
2 siblings, 1 reply; 16+ messages in thread
From: Cédric Le Goater @ 2017-10-05 16:49 UTC (permalink / raw)
To: qemu-ppc, qemu-devel, David Gibson, Nikunj A Dadhania,
Benjamin Herrenschmidt, Alexey Kardashevskiy
Cc: Cédric Le Goater
When a CPU is stopped with the 'stop-self' RTAS call, its state
'halted' is switched to 1 and, in this case, the MSR is not taken into
account anymore in the cpu_has_work() routine. Only the pending
hardware interrupts are checked with their LPCR:PECE* enablement bit.
The CPU is now also protected from the decrementer interrupt by the
LPCR:PECE* bits which are disabled in the 'stop-self' RTAS
call. Resetting the MSR is pointless.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
---
hw/ppc/spapr_rtas.c | 10 ----------
1 file changed, 10 deletions(-)
diff --git a/hw/ppc/spapr_rtas.c b/hw/ppc/spapr_rtas.c
index 2389220c9738..7f5ddce89ef2 100644
--- a/hw/ppc/spapr_rtas.c
+++ b/hw/ppc/spapr_rtas.c
@@ -209,16 +209,6 @@ static void rtas_stop_self(PowerPCCPU *cpu, sPAPRMachineState *spapr,
cs->halted = 1;
qemu_cpu_kick(cs);
- /*
- * While stopping a CPU, the guest calls H_CPPR which
- * effectively disables interrupts on XICS level.
- * However decrementer interrupts in TCG can still
- * wake the CPU up so here we disable interrupts in MSR
- * as well.
- * As rtas_start_cpu() resets the whole MSR anyway, there is
- * no need to bother with specific bits, we just clear it.
- */
- env->msr = 0;
if (env->mmu_model == POWERPC_MMU_3_00) {
env->spr[SPR_LPCR] &= ~LPCR_DEE;
--
2.13.6
* Re: [Qemu-devel] [PATCH 0/2] disable the decrementer interrupt when a CPU is unplugged
2017-10-05 16:49 [Qemu-devel] [PATCH 0/2] disable the decrementer interrupt when a CPU is unplugged Cédric Le Goater
2017-10-05 16:49 ` [Qemu-devel] [PATCH 1/2] spapr/rtas: " Cédric Le Goater
2017-10-05 16:49 ` [Qemu-devel] [PATCH 2/2] spapr/rtas: do not reset the MSR in stop-self command Cédric Le Goater
@ 2017-10-06 6:10 ` Nikunj A Dadhania
2017-10-06 6:14 ` Cédric Le Goater
` (2 more replies)
2 siblings, 3 replies; 16+ messages in thread
From: Nikunj A Dadhania @ 2017-10-06 6:10 UTC (permalink / raw)
To: Cédric Le Goater, qemu-ppc, qemu-devel, David Gibson,
Benjamin Herrenschmidt, Alexey Kardashevskiy
Cédric Le Goater <clg@kaod.org> writes:
> Hello,
>
> When a CPU is stopped with the 'stop-self' RTAS call, its state
> 'halted' is switched to 1 and, in this case, the MSR is not taken into
> account anymore in the cpu_has_work() routine. Only the pending
> hardware interrupts are checked with their LPCR:PECE* enablement bit.
>
> If the DECR timer fires after 'stop-self' is called and before the CPU
> 'stop' state is reached, the nearly-dead CPU will have some work to do
> and the guest will crash. This case happens very frequently with the
> not yet upstream P9 XIVE exploitation mode. In XICS mode, the DECR is
> occasionally fired but after 'stop' state, so no work is to be done
> and the guest survives.
>
> I suspect there is a race between the QEMU mainloop triggering the
> timers and the TCG CPU thread but I could not quite identify the root
> cause. To be safe, let's disable the decrementer interrupt in the LPCR
> when the CPU is halted and reenable it when the CPU is restarted.
Moreover, disabling the DECR in the reset path solves the TCG multi-CPU
reboot case, as the reboot path does not go through the 'stop-self' RTAS call.
diff --git a/hw/ppc/spapr_cpu_core.c b/hw/ppc/spapr_cpu_core.c
index 3e20b1d886..c5150ee590 100644
--- a/hw/ppc/spapr_cpu_core.c
+++ b/hw/ppc/spapr_cpu_core.c
@@ -86,6 +86,15 @@ static void spapr_cpu_reset(void *opaque)
cs->halted = 1;
env->spr[SPR_HIOR] = 0;
+ /* Disable DECR for secondary cpus */
+ if (cs != first_cpu) {
+ if (env->mmu_model == POWERPC_MMU_3_00) {
+ env->spr[SPR_LPCR] &= ~LPCR_DEE;
+ } else {
+ /* P7 and P8 both have same bit for DECR */
+ env->spr[SPR_LPCR] &= ~LPCR_P8_PECE3;
+ }
+ }
}
static void spapr_cpu_destroy(PowerPCCPU *cpu)
Regards
Nikunj
* Re: [Qemu-devel] [PATCH 0/2] disable the decrementer interrupt when a CPU is unplugged
2017-10-06 6:10 ` [Qemu-devel] [PATCH 0/2] disable the decrementer interrupt when a CPU is unplugged Nikunj A Dadhania
@ 2017-10-06 6:14 ` Cédric Le Goater
2017-10-06 7:46 ` Benjamin Herrenschmidt
2017-10-06 9:09 ` David Gibson
2 siblings, 0 replies; 16+ messages in thread
From: Cédric Le Goater @ 2017-10-06 6:14 UTC (permalink / raw)
To: Nikunj A Dadhania, qemu-ppc, qemu-devel, David Gibson,
Benjamin Herrenschmidt, Alexey Kardashevskiy
On 10/06/2017 08:10 AM, Nikunj A Dadhania wrote:
> Cédric Le Goater <clg@kaod.org> writes:
>
>> Hello,
>>
>> When a CPU is stopped with the 'stop-self' RTAS call, its state
>> 'halted' is switched to 1 and, in this case, the MSR is not taken into
>> account anymore in the cpu_has_work() routine. Only the pending
>> hardware interrupts are checked with their LPCR:PECE* enablement bit.
>>
>> If the DECR timer fires after 'stop-self' is called and before the CPU
>> 'stop' state is reached, the nearly-dead CPU will have some work to do
>> and the guest will crash. This case happens very frequently with the
>> not yet upstream P9 XIVE exploitation mode. In XICS mode, the DECR is
>> occasionally fired but after 'stop' state, so no work is to be done
>> and the guest survives.
>>
>> I suspect there is a race between the QEMU mainloop triggering the
>> timers and the TCG CPU thread but I could not quite identify the root
>> cause. To be safe, let's disable the decrementer interrupt in the LPCR
>> when the CPU is halted and reenable it when the CPU is restarted.
>
> Moreover, disabling the DECR in the reset path solves the TCG multi cpu
> reboot case, as reboot path does not call stop-cpu rtas call.
Yes. I was going to restart the thread on that topic.
Let's see how these two little patches are discussed. Then we/you can
resend the missing hunk in reset, which is needed to perform a TCG
reboot.
Thanks,
C.
> diff --git a/hw/ppc/spapr_cpu_core.c b/hw/ppc/spapr_cpu_core.c
> index 3e20b1d886..c5150ee590 100644
> --- a/hw/ppc/spapr_cpu_core.c
> +++ b/hw/ppc/spapr_cpu_core.c
> @@ -86,6 +86,15 @@ static void spapr_cpu_reset(void *opaque)
> cs->halted = 1;
>
> env->spr[SPR_HIOR] = 0;
> + /* Disable DECR for secondary cpus */
> + if (cs != first_cpu) {
> + if (env->mmu_model == POWERPC_MMU_3_00) {
> + env->spr[SPR_LPCR] &= ~LPCR_DEE;
> + } else {
> + /* P7 and P8 both have same bit for DECR */
> + env->spr[SPR_LPCR] &= ~LPCR_P8_PECE3;
> + }
> + }
> }
>
> static void spapr_cpu_destroy(PowerPCCPU *cpu)
>
>
> Regards
> Nikunj
>
* Re: [Qemu-devel] [PATCH 0/2] disable the decrementer interrupt when a CPU is unplugged
2017-10-06 6:10 ` [Qemu-devel] [PATCH 0/2] disable the decrementer interrupt when a CPU is unplugged Nikunj A Dadhania
2017-10-06 6:14 ` Cédric Le Goater
@ 2017-10-06 7:46 ` Benjamin Herrenschmidt
2017-10-06 7:53 ` Cédric Le Goater
2017-10-06 8:11 ` Nikunj A Dadhania
2017-10-06 9:09 ` David Gibson
2 siblings, 2 replies; 16+ messages in thread
From: Benjamin Herrenschmidt @ 2017-10-06 7:46 UTC (permalink / raw)
To: Nikunj A Dadhania, Cédric Le Goater, qemu-ppc, qemu-devel,
David Gibson, Alexey Kardashevskiy
On Fri, 2017-10-06 at 11:40 +0530, Nikunj A Dadhania wrote:
> Cédric Le Goater <clg@kaod.org> writes:
>
> > Hello,
> >
> > When a CPU is stopped with the 'stop-self' RTAS call, its state
> > 'halted' is switched to 1 and, in this case, the MSR is not taken into
> > account anymore in the cpu_has_work() routine. Only the pending
> > hardware interrupts are checked with their LPCR:PECE* enablement bit.
> >
> > If the DECR timer fires after 'stop-self' is called and before the CPU
> > 'stop' state is reached, the nearly-dead CPU will have some work to do
> > and the guest will crash. This case happens very frequently with the
> > not yet upstream P9 XIVE exploitation mode. In XICS mode, the DECR is
> > occasionally fired but after 'stop' state, so no work is to be done
> > and the guest survives.
> >
> > I suspect there is a race between the QEMU mainloop triggering the
> > timers and the TCG CPU thread but I could not quite identify the root
> > cause. To be safe, let's disable the decrementer interrupt in the LPCR
> > when the CPU is halted and reenable it when the CPU is restarted.
>
> Moreover, disabling the DECR in the reset path solves the TCG multi cpu
> reboot case, as reboot path does not call stop-cpu rtas call.
Shouldn't we do it in set_papr too, and only turn it on for the boot CPU
and in the start-cpu RTAS call? Same with the other PECEs, in fact...
> diff --git a/hw/ppc/spapr_cpu_core.c b/hw/ppc/spapr_cpu_core.c
> index 3e20b1d886..c5150ee590 100644
> --- a/hw/ppc/spapr_cpu_core.c
> +++ b/hw/ppc/spapr_cpu_core.c
> @@ -86,6 +86,15 @@ static void spapr_cpu_reset(void *opaque)
> cs->halted = 1;
>
> env->spr[SPR_HIOR] = 0;
> + /* Disable DECR for secondary cpus */
> + if (cs != first_cpu) {
> + if (env->mmu_model == POWERPC_MMU_3_00) {
> + env->spr[SPR_LPCR] &= ~LPCR_DEE;
> + } else {
> + /* P7 and P8 both have same bit for DECR */
> + env->spr[SPR_LPCR] &= ~LPCR_P8_PECE3;
> + }
> + }
> }
>
> static void spapr_cpu_destroy(PowerPCCPU *cpu)
>
>
> Regards
> Nikunj
* Re: [Qemu-devel] [PATCH 0/2] disable the decrementer interrupt when a CPU is unplugged
2017-10-06 7:46 ` Benjamin Herrenschmidt
@ 2017-10-06 7:53 ` Cédric Le Goater
2017-10-06 8:11 ` Nikunj A Dadhania
1 sibling, 0 replies; 16+ messages in thread
From: Cédric Le Goater @ 2017-10-06 7:53 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Nikunj A Dadhania, qemu-ppc, qemu-devel,
David Gibson, Alexey Kardashevskiy
On 10/06/2017 09:46 AM, Benjamin Herrenschmidt wrote:
> On Fri, 2017-10-06 at 11:40 +0530, Nikunj A Dadhania wrote:
>> Cédric Le Goater <clg@kaod.org> writes:
>>
>>> Hello,
>>>
>>> When a CPU is stopped with the 'stop-self' RTAS call, its state
>>> 'halted' is switched to 1 and, in this case, the MSR is not taken into
>>> account anymore in the cpu_has_work() routine. Only the pending
>>> hardware interrupts are checked with their LPCR:PECE* enablement bit.
>>>
>>> If the DECR timer fires after 'stop-self' is called and before the CPU
>>> 'stop' state is reached, the nearly-dead CPU will have some work to do
>>> and the guest will crash. This case happens very frequently with the
>>> not yet upstream P9 XIVE exploitation mode. In XICS mode, the DECR is
>>> occasionally fired but after 'stop' state, so no work is to be done
>>> and the guest survives.
>>>
>>> I suspect there is a race between the QEMU mainloop triggering the
>>> timers and the TCG CPU thread but I could not quite identify the root
>>> cause. To be safe, let's disable the decrementer interrupt in the LPCR
>>> when the CPU is halted and reenable it when the CPU is restarted.
>>
>> Moreover, disabling the DECR in the reset path solves the TCG multi cpu
>> reboot case, as reboot path does not call stop-cpu rtas call.
>
> Shouldn't we do it in set_papr too, and only turn it on for the boot CPU
> and in the start-cpu RTAS call? Same with the other PECEs, in fact...
Yes, I agree.
In cpu_ppc_set_papr(), we should set the PECE* bits only for the boot
CPU and then let the RTAS calls start-cpu and stop-self do the enablement
and disablement.
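As a rough sketch of that lifecycle, assuming a simplified model: the struct, the helper names and the bit value below are hypothetical and do not come from QEMU. The idea is that only the boot CPU starts with the decrementer wake bit set, start-cpu sets it and stop-self clears it.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit; the real LPCR bit depends on the CPU family. */
#define MODEL_LPCR_WAKE_DECR (1ull << 0)

struct model_vcpu {
    bool halted;
    uint64_t lpcr;
};

/* Initial state: only the boot CPU runs and has the wake bit enabled. */
static void model_vcpu_init(struct model_vcpu *cpu, bool is_boot_cpu)
{
    cpu->halted = !is_boot_cpu;
    cpu->lpcr = is_boot_cpu ? MODEL_LPCR_WAKE_DECR : 0;
}

/* Modelled 'start-cpu': enable the decrementer wake bit, then run. */
static void model_start_cpu(struct model_vcpu *cpu)
{
    cpu->lpcr |= MODEL_LPCR_WAKE_DECR;
    cpu->halted = false;
}

/* Modelled 'stop-self': halt and disable the decrementer wake bit. */
static void model_stop_self(struct model_vcpu *cpu)
{
    cpu->halted = true;
    cpu->lpcr &= ~MODEL_LPCR_WAKE_DECR;
}

/* True if a decrementer interrupt could wake this halted CPU. */
static bool model_decr_can_wake(const struct model_vcpu *cpu)
{
    return cpu->halted && (cpu->lpcr & MODEL_LPCR_WAKE_DECR) != 0;
}
```

With this arrangement a secondary CPU can never be woken by the decrementer before start-cpu or after stop-self, whichever path (boot, hotplug or reset) brought it to the halted state.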
I will respin the patchset.
C.
>> diff --git a/hw/ppc/spapr_cpu_core.c b/hw/ppc/spapr_cpu_core.c
>> index 3e20b1d886..c5150ee590 100644
>> --- a/hw/ppc/spapr_cpu_core.c
>> +++ b/hw/ppc/spapr_cpu_core.c
>> @@ -86,6 +86,15 @@ static void spapr_cpu_reset(void *opaque)
>> cs->halted = 1;
>>
>> env->spr[SPR_HIOR] = 0;
>> + /* Disable DECR for secondary cpus */
>> + if (cs != first_cpu) {
>> + if (env->mmu_model == POWERPC_MMU_3_00) {
>> + env->spr[SPR_LPCR] &= ~LPCR_DEE;
>> + } else {
>> + /* P7 and P8 both have same bit for DECR */
>> + env->spr[SPR_LPCR] &= ~LPCR_P8_PECE3;
>> + }
>> + }
>> }
>>
>> static void spapr_cpu_destroy(PowerPCCPU *cpu)
>>
>>
>> Regards
>> Nikunj
* Re: [Qemu-devel] [PATCH 0/2] disable the decrementer interrupt when a CPU is unplugged
2017-10-06 7:46 ` Benjamin Herrenschmidt
2017-10-06 7:53 ` Cédric Le Goater
@ 2017-10-06 8:11 ` Nikunj A Dadhania
1 sibling, 0 replies; 16+ messages in thread
From: Nikunj A Dadhania @ 2017-10-06 8:11 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Cédric Le Goater, qemu-ppc,
qemu-devel, David Gibson, Alexey Kardashevskiy
Benjamin Herrenschmidt <benh@kernel.crashing.org> writes:
> On Fri, 2017-10-06 at 11:40 +0530, Nikunj A Dadhania wrote:
>> Cédric Le Goater <clg@kaod.org> writes:
>>
>> > Hello,
>> >
>> > When a CPU is stopped with the 'stop-self' RTAS call, its state
>> > 'halted' is switched to 1 and, in this case, the MSR is not taken into
>> > account anymore in the cpu_has_work() routine. Only the pending
>> > hardware interrupts are checked with their LPCR:PECE* enablement bit.
>> >
>> > If the DECR timer fires after 'stop-self' is called and before the CPU
>> > 'stop' state is reached, the nearly-dead CPU will have some work to do
>> > and the guest will crash. This case happens very frequently with the
>> > not yet upstream P9 XIVE exploitation mode. In XICS mode, the DECR is
>> > occasionally fired but after 'stop' state, so no work is to be done
>> > and the guest survives.
>> >
>> > I suspect there is a race between the QEMU mainloop triggering the
>> > timers and the TCG CPU thread but I could not quite identify the root
>> > cause. To be safe, let's disable the decrementer interrupt in the LPCR
>> > when the CPU is halted and reenable it when the CPU is restarted.
>>
>> Moreover, disabling the DECR in the reset path solves the TCG multi cpu
>> reboot case, as reboot path does not call stop-cpu rtas call.
>
> Shouldn't we do it in set_papr too, and only turn it on for the boot CPU
> and in the start-cpu RTAS call? Same with the other PECEs, in fact...
Yes, +1 for that
Regards
Nikunj
* Re: [Qemu-devel] [PATCH 1/2] spapr/rtas: disable the decrementer interrupt when a CPU is unplugged
2017-10-05 16:49 ` [Qemu-devel] [PATCH 1/2] spapr/rtas: " Cédric Le Goater
@ 2017-10-06 9:07 ` David Gibson
2017-10-06 9:53 ` Benjamin Herrenschmidt
2017-10-06 21:15 ` Cédric Le Goater
0 siblings, 2 replies; 16+ messages in thread
From: David Gibson @ 2017-10-06 9:07 UTC (permalink / raw)
To: Cédric Le Goater
Cc: qemu-ppc, qemu-devel, Nikunj A Dadhania, Benjamin Herrenschmidt,
Alexey Kardashevskiy
On Thu, Oct 05, 2017 at 06:49:58PM +0200, Cédric Le Goater wrote:
> When a CPU is stopped with the 'stop-self' RTAS call, its state
> 'halted' is switched to 1 and, in this case, the MSR is not taken into
> account anymore in the cpu_has_work() routine. Only the pending
> hardware interrupts are checked with their LPCR:PECE* enablement bit.
>
> If the DECR timer fires after 'stop-self' is called and before the CPU
> 'stop' state is reached, the nearly-dead CPU will have some work to do
> and the guest will crash. This case happens very frequently with the
> not yet upstream P9 XIVE exploitation mode. In XICS mode, the DECR is
> occasionally fired but after 'stop' state, so no work is to be done
> and the guest survives.
>
> I suspect there is a race between the QEMU mainloop triggering the
> timers and the TCG CPU thread but I could not quite identify the root
> cause. To be safe, let's disable the decrementer interrupt in the LPCR
> when the CPU is halted and reenable it when the CPU is restarted.
>
> Signed-off-by: Cédric Le Goater <clg@kaod.org>
> ---
> hw/ppc/spapr_rtas.c | 16 ++++++++++++++++
> 1 file changed, 16 insertions(+)
>
> diff --git a/hw/ppc/spapr_rtas.c b/hw/ppc/spapr_rtas.c
> index cdf0b607a0a0..2389220c9738 100644
> --- a/hw/ppc/spapr_rtas.c
> +++ b/hw/ppc/spapr_rtas.c
> @@ -174,6 +174,15 @@ static void rtas_start_cpu(PowerPCCPU *cpu_, sPAPRMachineState *spapr,
> kvm_cpu_synchronize_state(cs);
>
> env->msr = (1ULL << MSR_SF) | (1ULL << MSR_ME);
> +
> + /* Enable DECR interrupt */
> + if (env->mmu_model == POWERPC_MMU_3_00) {
Hm. Checking mmu_model doesn't seem right to me. I mean, it'll get
the right answer in practice, but the LPCR programming has nothing
whatsoever to do with the MMU.
I think explicitly checking if cpu_ is a POWER9 instance with
object_dynamic_cast would be a better option.
> + env->spr[SPR_LPCR] |= LPCR_DEE;
> + } else {
> + /* P7 and P8 both have same bit for DECR */
> + env->spr[SPR_LPCR] |= LPCR_P8_PECE3;
> + }
> +
> env->nip = start;
> env->gpr[3] = r3;
> cs->halted = 0;
> @@ -210,6 +219,13 @@ static void rtas_stop_self(PowerPCCPU *cpu, sPAPRMachineState *spapr,
> * no need to bother with specific bits, we just clear it.
> */
> env->msr = 0;
> +
> + if (env->mmu_model == POWERPC_MMU_3_00) {
> + env->spr[SPR_LPCR] &= ~LPCR_DEE;
> + } else {
> + /* P7 and P8 both have same bit for DECR */
> + env->spr[SPR_LPCR] &= ~LPCR_P8_PECE3;
> + }
> }
>
> static inline int sysparm_st(target_ulong addr, target_ulong len,
--
David Gibson | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson
* Re: [Qemu-devel] [PATCH 2/2] spapr/rtas: do not reset the MSR in stop-self command
2017-10-05 16:49 ` [Qemu-devel] [PATCH 2/2] spapr/rtas: do not reset the MSR in stop-self command Cédric Le Goater
@ 2017-10-06 9:08 ` David Gibson
0 siblings, 0 replies; 16+ messages in thread
From: David Gibson @ 2017-10-06 9:08 UTC (permalink / raw)
To: Cédric Le Goater
Cc: qemu-ppc, qemu-devel, Nikunj A Dadhania, Benjamin Herrenschmidt,
Alexey Kardashevskiy
On Thu, Oct 05, 2017 at 06:49:59PM +0200, Cédric Le Goater wrote:
> When a CPU is stopped with the 'stop-self' RTAS call, its state
> 'halted' is switched to 1 and, in this case, the MSR is not taken into
> account anymore in the cpu_has_work() routine. Only the pending
> hardware interrupts are checked with their LPCR:PECE* enablement bit.
>
> The CPU is now also protected from the decrementer interrupt by the
> LPCR:PECE* bits which are disabled in the 'stop-self' RTAS
> call. Resetting the MSR is pointless.
>
> Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
> ---
> hw/ppc/spapr_rtas.c | 10 ----------
> 1 file changed, 10 deletions(-)
>
> diff --git a/hw/ppc/spapr_rtas.c b/hw/ppc/spapr_rtas.c
> index 2389220c9738..7f5ddce89ef2 100644
> --- a/hw/ppc/spapr_rtas.c
> +++ b/hw/ppc/spapr_rtas.c
> @@ -209,16 +209,6 @@ static void rtas_stop_self(PowerPCCPU *cpu, sPAPRMachineState *spapr,
>
> cs->halted = 1;
> qemu_cpu_kick(cs);
> - /*
> - * While stopping a CPU, the guest calls H_CPPR which
> - * effectively disables interrupts on XICS level.
> - * However decrementer interrupts in TCG can still
> - * wake the CPU up so here we disable interrupts in MSR
> - * as well.
> - * As rtas_start_cpu() resets the whole MSR anyway, there is
> - * no need to bother with specific bits, we just clear it.
> - */
> - env->msr = 0;
>
> if (env->mmu_model == POWERPC_MMU_3_00) {
> env->spr[SPR_LPCR] &= ~LPCR_DEE;
* Re: [Qemu-devel] [PATCH 0/2] disable the decrementer interrupt when a CPU is unplugged
2017-10-06 6:10 ` [Qemu-devel] [PATCH 0/2] disable the decrementer interrupt when a CPU is unplugged Nikunj A Dadhania
2017-10-06 6:14 ` Cédric Le Goater
2017-10-06 7:46 ` Benjamin Herrenschmidt
@ 2017-10-06 9:09 ` David Gibson
2 siblings, 0 replies; 16+ messages in thread
From: David Gibson @ 2017-10-06 9:09 UTC (permalink / raw)
To: Nikunj A Dadhania
Cc: Cédric Le Goater, qemu-ppc, qemu-devel,
Benjamin Herrenschmidt, Alexey Kardashevskiy
On Fri, Oct 06, 2017 at 11:40:02AM +0530, Nikunj A Dadhania wrote:
> Cédric Le Goater <clg@kaod.org> writes:
>
> > Hello,
> >
> > When a CPU is stopped with the 'stop-self' RTAS call, its state
> > 'halted' is switched to 1 and, in this case, the MSR is not taken into
> > account anymore in the cpu_has_work() routine. Only the pending
> > hardware interrupts are checked with their LPCR:PECE* enablement bit.
> >
> > If the DECR timer fires after 'stop-self' is called and before the CPU
> > 'stop' state is reached, the nearly-dead CPU will have some work to do
> > and the guest will crash. This case happens very frequently with the
> > not yet upstream P9 XIVE exploitation mode. In XICS mode, the DECR is
> > occasionally fired but after 'stop' state, so no work is to be done
> > and the guest survives.
> >
> > I suspect there is a race between the QEMU mainloop triggering the
> > timers and the TCG CPU thread but I could not quite identify the root
> > cause. To be safe, let's disable the decrementer interrupt in the LPCR
> > when the CPU is halted and reenable it when the CPU is restarted.
>
> Moreover, disabling the DECR in the reset path solves the TCG multi cpu
> reboot case, as reboot path does not call stop-cpu rtas call.
>
> diff --git a/hw/ppc/spapr_cpu_core.c b/hw/ppc/spapr_cpu_core.c
> index 3e20b1d886..c5150ee590 100644
> --- a/hw/ppc/spapr_cpu_core.c
> +++ b/hw/ppc/spapr_cpu_core.c
> @@ -86,6 +86,15 @@ static void spapr_cpu_reset(void *opaque)
> cs->halted = 1;
>
> env->spr[SPR_HIOR] = 0;
> + /* Disable DECR for secondary cpus */
> + if (cs != first_cpu) {
> + if (env->mmu_model == POWERPC_MMU_3_00) {
> + env->spr[SPR_LPCR] &= ~LPCR_DEE;
> + } else {
> + /* P7 and P8 both have same bit for DECR */
> + env->spr[SPR_LPCR] &= ~LPCR_P8_PECE3;
> + }
> + }
> }
This seems reasonable.
>
> static void spapr_cpu_destroy(PowerPCCPU *cpu)
>
>
> Regards
> Nikunj
>
* Re: [Qemu-devel] [PATCH 1/2] spapr/rtas: disable the decrementer interrupt when a CPU is unplugged
2017-10-06 9:07 ` David Gibson
@ 2017-10-06 9:53 ` Benjamin Herrenschmidt
2017-10-06 10:10 ` David Gibson
2017-10-06 21:15 ` Cédric Le Goater
1 sibling, 1 reply; 16+ messages in thread
From: Benjamin Herrenschmidt @ 2017-10-06 9:53 UTC (permalink / raw)
To: David Gibson, Cédric Le Goater
Cc: qemu-ppc, qemu-devel, Nikunj A Dadhania, Alexey Kardashevskiy
On Fri, 2017-10-06 at 20:07 +1100, David Gibson wrote:
> Hm. Checking mmu_model doesn't seem right to me. I mean, it'll get
> the right answer in practice, but the LPCR programming has nothing
> whatsoever to do with the MMU.
>
> I think explicitly checking if cpu_ is a POWER9 instance with
> object_dynamic_cast would be a better option.
Best is ARCH 300 ... do we have arch versions outside of the MMU model
these days?
Ben.
* Re: [Qemu-devel] [PATCH 1/2] spapr/rtas: disable the decrementer interrupt when a CPU is unplugged
2017-10-06 9:53 ` Benjamin Herrenschmidt
@ 2017-10-06 10:10 ` David Gibson
2017-10-09 14:28 ` Cédric Le Goater
0 siblings, 1 reply; 16+ messages in thread
From: David Gibson @ 2017-10-06 10:10 UTC (permalink / raw)
To: Benjamin Herrenschmidt
Cc: Cédric Le Goater, qemu-ppc, qemu-devel, Nikunj A Dadhania,
Alexey Kardashevskiy
On Fri, Oct 06, 2017 at 11:53:30AM +0200, Benjamin Herrenschmidt wrote:
> On Fri, 2017-10-06 at 20:07 +1100, David Gibson wrote:
> > Hm. Checking mmu_model doesn't seem right to me. I mean, it'll get
> > the right answer in practice, but the LPCR programming has nothing
> > whatsoever to do with the MMU.
> >
> > I think explicitly checking if cpu_ is a POWER9 instance with
> > object_dynamic_cast would be a better option.
>
> Best is ARCH 300 ... do we have arch versions outside of the MMU model
> these days?
Not that I could spot easily. Apart from implicitly in the cpu
family.
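One way to keep the family test in a single place, whatever it ends up being keyed on, is a small helper that returns the decrementer wake bit. The sketch below is only an illustration under assumptions: the enum, the helper names and the MODEL_* values are hypothetical placeholders, not the QEMU definitions of LPCR_DEE and LPCR_P8_PECE3, and in QEMU the family would presumably come from the CPU object (e.g. via object_dynamic_cast) rather than from an enum.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical family tag, standing in for a check on the CPU class. */
enum model_cpu_family { MODEL_POWER7, MODEL_POWER8, MODEL_POWER9 };

/* Placeholder values, NOT the real LPCR bit positions. */
#define MODEL_LPCR_DEE       (1ull << 10)
#define MODEL_LPCR_P8_PECE3  (1ull << 12)

/* Pick the LPCR bit that lets the decrementer wake a halted CPU. */
static uint64_t model_decr_wake_bit(enum model_cpu_family family)
{
    switch (family) {
    case MODEL_POWER9:
        return MODEL_LPCR_DEE;
    case MODEL_POWER7:
    case MODEL_POWER8:
    default:
        /* P7 and P8 both use the same bit for DECR */
        return MODEL_LPCR_P8_PECE3;
    }
}

/* Enable or disable decrementer wake-up in an LPCR value. */
static void model_set_decr_wake(uint64_t *lpcr, enum model_cpu_family family,
                                bool enable)
{
    if (enable) {
        *lpcr |= model_decr_wake_bit(family);
    } else {
        *lpcr &= ~model_decr_wake_bit(family);
    }
}
```

The start-cpu, stop-self and reset paths could then each make one call instead of repeating the if/else on mmu_model.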
* Re: [Qemu-devel] [PATCH 1/2] spapr/rtas: disable the decrementer interrupt when a CPU is unplugged
2017-10-06 9:07 ` David Gibson
2017-10-06 9:53 ` Benjamin Herrenschmidt
@ 2017-10-06 21:15 ` Cédric Le Goater
2017-10-07 5:16 ` David Gibson
1 sibling, 1 reply; 16+ messages in thread
From: Cédric Le Goater @ 2017-10-06 21:15 UTC (permalink / raw)
To: David Gibson
Cc: qemu-ppc, qemu-devel, Nikunj A Dadhania, Benjamin Herrenschmidt,
Alexey Kardashevskiy
On 10/06/2017 11:07 AM, David Gibson wrote:
> On Thu, Oct 05, 2017 at 06:49:58PM +0200, Cédric Le Goater wrote:
>> When a CPU is stopped with the 'stop-self' RTAS call, its state
>> 'halted' is switched to 1 and, in this case, the MSR is not taken into
>> account anymore in the cpu_has_work() routine. Only the pending
>> hardware interrupts are checked with their LPCR:PECE* enablement bit.
>>
>> If the DECR timer fires after 'stop-self' is called and before the CPU
>> 'stop' state is reached, the nearly-dead CPU will have some work to do
>> and the guest will crash. This case happens very frequently with the
>> not yet upstream P9 XIVE exploitation mode. In XICS mode, the DECR is
>> occasionally fired but after 'stop' state, so no work is to be done
>> and the guest survives.
>>
>> I suspect there is a race between the QEMU mainloop triggering the
>> timers and the TCG CPU thread but I could not quite identify the root
>> cause. To be safe, let's disable the decrementer interrupt in the LPCR
>> when the CPU is halted and reenable it when the CPU is restarted.
>>
>> Signed-off-by: Cédric Le Goater <clg@kaod.org>
>> ---
>> hw/ppc/spapr_rtas.c | 16 ++++++++++++++++
>> 1 file changed, 16 insertions(+)
>>
>> diff --git a/hw/ppc/spapr_rtas.c b/hw/ppc/spapr_rtas.c
>> index cdf0b607a0a0..2389220c9738 100644
>> --- a/hw/ppc/spapr_rtas.c
>> +++ b/hw/ppc/spapr_rtas.c
>> @@ -174,6 +174,15 @@ static void rtas_start_cpu(PowerPCCPU *cpu_, sPAPRMachineState *spapr,
>> kvm_cpu_synchronize_state(cs);
>>
>> env->msr = (1ULL << MSR_SF) | (1ULL << MSR_ME);
>> +
>> + /* Enable DECR interrupt */
>> + if (env->mmu_model == POWERPC_MMU_3_00) {
>
> Hm. Checking mmu_model doesn't seem right to me. I mean, it'll get
> the right answer in practice, but the LPCR programming has nothing
> whatsoever to do with the MMU.
>
> I think explicitly checking if cpu_ is a POWER9 instance with
> object_dynamic_cast would be a better option.
OK. So I guess we should change the switch statement in cpu_ppc_set_papr()
also.
C.
>
>> + env->spr[SPR_LPCR] |= LPCR_DEE;
>> + } else {
>> + /* P7 and P8 both have same bit for DECR */
>> + env->spr[SPR_LPCR] |= LPCR_P8_PECE3;
>> + }
>> +
>> env->nip = start;
>> env->gpr[3] = r3;
>> cs->halted = 0;
>> @@ -210,6 +219,13 @@ static void rtas_stop_self(PowerPCCPU *cpu, sPAPRMachineState *spapr,
>> * no need to bother with specific bits, we just clear it.
>> */
>> env->msr = 0;
>> +
>> + if (env->mmu_model == POWERPC_MMU_3_00) {
>> + env->spr[SPR_LPCR] &= ~LPCR_DEE;
>> + } else {
>> + /* P7 and P8 both have same bit for DECR */
>> + env->spr[SPR_LPCR] &= ~LPCR_P8_PECE3;
>> + }
>> }
>>
>> static inline int sysparm_st(target_ulong addr, target_ulong len,
>
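
For readers following the thread, the wake-up behaviour described in the commit message can be sketched as a small standalone model. This is not QEMU code: `ToyCPU`, the `toy_*` helpers and the bit values are hypothetical names invented for illustration, and the running-CPU branch is a deliberate simplification. It only shows why a pending decrementer can wake a halted CPU while its LPCR wake-up bit is still set, and why clearing that bit in stop-self avoids it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stddef.h>

/* Hypothetical bit values, chosen only for this sketch. */
#define TOY_LPCR_DECR_WAKE (1ull << 0) /* stands in for LPCR_DEE / LPCR_P8_PECE3 */
#define TOY_IRQ_DECR       (1u << 0)   /* pending decrementer interrupt */

typedef struct ToyCPU {
    bool halted;           /* set by stop-self, cleared by start-cpu */
    bool msr_ee;           /* external interrupts enabled in the MSR */
    uint64_t lpcr;         /* wake-up enable bits */
    uint32_t pending_irqs; /* pending hardware interrupts */
} ToyCPU;

/*
 * While halted, the MSR is ignored: only pending interrupts whose
 * wake-up bit is enabled in the LPCR count as work.
 */
static inline bool toy_cpu_has_work(const ToyCPU *cpu)
{
    assert(cpu != NULL);
    if (!cpu->halted) {
        /* Simplified: a running CPU takes interrupts only with EE set. */
        return cpu->msr_ee && cpu->pending_irqs != 0;
    }
    return (cpu->pending_irqs & TOY_IRQ_DECR) &&
           (cpu->lpcr & TOY_LPCR_DECR_WAKE);
}

/* Models the patched stop-self: halt and drop the DECR wake-up bit. */
static inline void toy_stop_self(ToyCPU *cpu)
{
    assert(cpu != NULL);
    cpu->halted = true;
    cpu->lpcr &= ~TOY_LPCR_DECR_WAKE;
}

/* Models the patched start-cpu: restore the DECR wake-up bit and run. */
static inline void toy_start_cpu(ToyCPU *cpu)
{
    assert(cpu != NULL);
    cpu->lpcr |= TOY_LPCR_DECR_WAKE;
    cpu->halted = false;
}
```

In this model, a CPU that is halted with the wake-up bit still set reports work as soon as the decrementer fires, which corresponds to the crash scenario; after `toy_stop_self()` the same pending interrupt is ignored.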
* Re: [Qemu-devel] [PATCH 1/2] spapr/rtas: disable the decrementer interrupt when a CPU is unplugged
2017-10-06 21:15 ` Cédric Le Goater
@ 2017-10-07 5:16 ` David Gibson
0 siblings, 0 replies; 16+ messages in thread
From: David Gibson @ 2017-10-07 5:16 UTC (permalink / raw)
To: Cédric Le Goater
Cc: qemu-ppc, qemu-devel, Nikunj A Dadhania, Benjamin Herrenschmidt,
Alexey Kardashevskiy
On Fri, Oct 06, 2017 at 11:15:31PM +0200, Cédric Le Goater wrote:
> On 10/06/2017 11:07 AM, David Gibson wrote:
> > On Thu, Oct 05, 2017 at 06:49:58PM +0200, Cédric Le Goater wrote:
> >> When a CPU is stopped with the 'stop-self' RTAS call, its state
> >> 'halted' is switched to 1 and, in this case, the MSR is not taken into
> >> account anymore in the cpu_has_work() routine. Only the pending
> >> hardware interrupts are checked with their LPCR:PECE* enablement bit.
> >>
> >> If the DECR timer fires after 'stop-self' is called and before the CPU
> >> 'stop' state is reached, the nearly-dead CPU will have some work to do
> >> and the guest will crash. This case happens very frequently with the
> >> not yet upstream P9 XIVE exploitation mode. In XICS mode, the DECR is
> >> occasionally fired but after 'stop' state, so no work is to be done
> >> and the guest survives.
> >>
> >> I suspect there is a race between the QEMU mainloop triggering the
> >> timers and the TCG CPU thread but I could not quite identify the root
> >> cause. To be safe, let's disable the decrementer interrupt in the LPCR
> >> when the CPU is halted and reenable it when the CPU is restarted.
> >>
> >> Signed-off-by: Cédric Le Goater <clg@kaod.org>
> >> ---
> >> hw/ppc/spapr_rtas.c | 16 ++++++++++++++++
> >> 1 file changed, 16 insertions(+)
> >>
> >> diff --git a/hw/ppc/spapr_rtas.c b/hw/ppc/spapr_rtas.c
> >> index cdf0b607a0a0..2389220c9738 100644
> >> --- a/hw/ppc/spapr_rtas.c
> >> +++ b/hw/ppc/spapr_rtas.c
> >> @@ -174,6 +174,15 @@ static void rtas_start_cpu(PowerPCCPU *cpu_, sPAPRMachineState *spapr,
> >> kvm_cpu_synchronize_state(cs);
> >>
> >> env->msr = (1ULL << MSR_SF) | (1ULL << MSR_ME);
> >> +
> >> + /* Enable DECR interrupt */
> >> + if (env->mmu_model == POWERPC_MMU_3_00) {
> >
> > Hm. Checking mmu_model doesn't seem right to me. I mean, it'll get
> > the right answer in practice, but the LPCR programming has nothing
> > whatsoever to do with the MMU.
> >
> > I think explicitly checking if cpu_ is a POWER9 instance with
> > object_dynamic_cast would be a better option.
>
> OK. So I guess we should change the switch statement in cpu_ppc_set_papr()
> also.
Yeah, I guess so. No rush.
--
David Gibson | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson
* Re: [Qemu-devel] [PATCH 1/2] spapr/rtas: disable the decrementer interrupt when a CPU is unplugged
2017-10-06 10:10 ` David Gibson
@ 2017-10-09 14:28 ` Cédric Le Goater
0 siblings, 0 replies; 16+ messages in thread
From: Cédric Le Goater @ 2017-10-09 14:28 UTC (permalink / raw)
To: David Gibson, Benjamin Herrenschmidt
Cc: qemu-ppc, qemu-devel, Nikunj A Dadhania, Alexey Kardashevskiy
On 10/06/2017 12:10 PM, David Gibson wrote:
> On Fri, Oct 06, 2017 at 11:53:30AM +0200, Benjamin Herrenschmidt wrote:
>> On Fri, 2017-10-06 at 20:07 +1100, David Gibson wrote:
>>> Hm. Checking mmu_model doesn't seem right to me. I mean, it'll get
>>> the right answer in practice, but the LPCR programming has nothing
>>> whatsoever to do with the MMU.
>>>
>>> I think explicitly checking if cpu_ is a POWER9 instance with
>>> object_dynamic_cast would be a better option.
>>
>> Best is ARCH 300 ... do we have arch versions outside of MMU model
>> these days?
>
> Not that I could spot easily. Apart from implicitly in the cpu
> family.
>
How about:
pcc->pvr_match(pcc, CPU_POWERPC_LOGICAL_3_00);
C.
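
The idea of keying the LPCR bit on the architecture level rather than on the MMU model can be sketched as a standalone model. Everything here is an assumption made for illustration: `ToyCpuClass`, the `toy_*` helpers and all constants are hypothetical and do not reproduce QEMU's definitions. Only the shape of the check, a class-level `pvr_match` callback asked about a logical ISA v3.00 PVR, follows the suggestion above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stddef.h>

/* Hypothetical values for this sketch; not QEMU's real definitions. */
#define TOY_LOGICAL_PVR_3_00 0x300u
#define TOY_LPCR_DEE         (1ull << 1) /* ISA v3.00 style DECR enable */
#define TOY_LPCR_PECE3       (1ull << 0) /* P7/P8 style DECR wake-up */

typedef struct ToyCpuClass ToyCpuClass;
struct ToyCpuClass {
    uint32_t logical_pvr;
    /* Mirrors the shape of a class-level PVR matching callback. */
    bool (*pvr_match)(const ToyCpuClass *pcc, uint32_t pvr);
};

/* Simplest possible match: exact comparison with the class's logical PVR. */
static inline bool toy_pvr_match(const ToyCpuClass *pcc, uint32_t pvr)
{
    return pcc->logical_pvr == pvr;
}

/* Pick the DECR wake-up bit from the architecture level, not the MMU. */
static inline uint64_t toy_decr_wake_mask(const ToyCpuClass *pcc)
{
    assert(pcc != NULL && pcc->pvr_match != NULL);
    return pcc->pvr_match(pcc, TOY_LOGICAL_PVR_3_00) ? TOY_LPCR_DEE
                                                     : TOY_LPCR_PECE3;
}

/* One helper shared by the start-cpu and stop-self paths. */
static inline void toy_set_decr_wake(const ToyCpuClass *pcc, uint64_t *lpcr,
                                     bool enable)
{
    uint64_t mask = toy_decr_wake_mask(pcc);

    assert(lpcr != NULL);
    if (enable) {
        *lpcr |= mask;
    } else {
        *lpcr &= ~mask;
    }
}
```

Routing both paths through one helper keeps the choice of bit in a single place, so a later change to how the architecture level is detected would not have to touch start-cpu and stop-self separately.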
end of thread, other threads:[~2017-10-09 14:28 UTC | newest]
Thread overview: 16+ messages
2017-10-05 16:49 [Qemu-devel] [PATCH 0/2] disable the decrementer interrupt when a CPU is unplugged Cédric Le Goater
2017-10-05 16:49 ` [Qemu-devel] [PATCH 1/2] spapr/rtas: " Cédric Le Goater
2017-10-06 9:07 ` David Gibson
2017-10-06 9:53 ` Benjamin Herrenschmidt
2017-10-06 10:10 ` David Gibson
2017-10-09 14:28 ` Cédric Le Goater
2017-10-06 21:15 ` Cédric Le Goater
2017-10-07 5:16 ` David Gibson
2017-10-05 16:49 ` [Qemu-devel] [PATCH 2/2] spapr/rtas: do not reset the MSR in stop-self command Cédric Le Goater
2017-10-06 9:08 ` David Gibson
2017-10-06 6:10 ` [Qemu-devel] [PATCH 0/2] disable the decrementer interrupt when a CPU is unplugged Nikunj A Dadhania
2017-10-06 6:14 ` Cédric Le Goater
2017-10-06 7:46 ` Benjamin Herrenschmidt
2017-10-06 7:53 ` Cédric Le Goater
2017-10-06 8:11 ` Nikunj A Dadhania
2017-10-06 9:09 ` David Gibson