Re: Boot flakiness with QEMU 3.1.0 and Clang built kernels

qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed

* Re: Boot flakiness with QEMU 3.1.0 and Clang built kernels
       [not found]     ` <1586597161.xyshvdbjo6.astroid@bobo.none>
@ 2020-04-11 13:57       ` Nicholas Piggin
  2020-04-11 23:35         ` Nathan Chancellor
                           ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Nicholas Piggin @ 2020-04-11 13:57 UTC (permalink / raw)
  To: Nathan Chancellor
  Cc: qemu-devel, clang-built-linux, Cédric Le Goater, qemu-ppc,
	linuxppc-dev, David Gibson

Nicholas Piggin's on April 11, 2020 7:32 pm:
> Nathan Chancellor's on April 11, 2020 10:53 am:
>> The tt.config values are needed to reproduce but I did not verify that
>> ONLY tt.config was needed. Other than that, no, we are just building
>> either pseries_defconfig or powernv_defconfig with those configs and
>> letting it boot up with a simple initramfs, which prints the version
>> string then shuts the machine down.
>> 
>> Let me know if you need any more information, cheers!
> 
> Okay I can reproduce it. Sometimes it eventually recovers after a long
> pause, and some keyboard input often helps it along. So that seems like 
> it might be a lost interrupt.
> 
> POWER8 vs POWER9 might just be a timing thing if P9 is still hanging
> sometimes. I wasn't able to reproduce it with defconfig+tt.config, I
> needed your other config with various other debug options.
> 
> Thanks for the very good report. I'll let you know what I find.

It looks like a qemu bug. Booting with '-d int' shows the decrementer 
simply stops firing at the point of the hang, even though MSR[EE]=1 and 
the DEC register is wrapping. Linux appears to be doing the right thing 
as far as I can tell (not losing interrupts).

This qemu patch fixes the boot hang for me. I don't know that qemu 
really has the right idea of "context synchronizing" as defined in the
powerpc architecture -- mtmsrd L=1 is not context synchronizing but that
does not mean it can avoid looking at exceptions until the next such
event. It looks like the decrementer exception goes high but the
execution of mtmsrd L=1 is ignoring it.

Prior to the Linux patch 3282a3da25b you bisected to, interrupt replay
code would return with an 'rfi' instruction as part of interrupt return,
which probably helped to get things moving along a bit. However it would
not be foolproof, and Cedric did say he encountered some mysterious
lockups under load with qemu powernv before that patch was merged, so
maybe it's the same issue?

Thanks,
Nick

The patch is a bit of a hack, but if you can run it and verify it fixes
your boot hang would be good.
---

diff --git a/target/ppc/translate.c b/target/ppc/translate.c
index b207fb5386..1d997f5c32 100644
--- a/target/ppc/translate.c
+++ b/target/ppc/translate.c
@@ -4364,12 +4364,21 @@ static void gen_mtmsrd(DisasContext *ctx)
     if (ctx->opcode & 0x00010000) {
         /* Special form that does not need any synchronisation */
         TCGv t0 = tcg_temp_new();
+        TCGv t1 = tcg_temp_new();
         tcg_gen_andi_tl(t0, cpu_gpr[rS(ctx->opcode)],
                         (1 << MSR_RI) | (1 << MSR_EE));
-        tcg_gen_andi_tl(cpu_msr, cpu_msr,
+        tcg_gen_andi_tl(t1, cpu_msr,
                         ~(target_ulong)((1 << MSR_RI) | (1 << MSR_EE)));
-        tcg_gen_or_tl(cpu_msr, cpu_msr, t0);
+        tcg_gen_or_tl(t1, t1, t0);
+
+        gen_update_nip(ctx, ctx->base.pc_next);
+        gen_helper_store_msr(cpu_env, t1);
         tcg_temp_free(t0);
+        tcg_temp_free(t1);
+        /* Must stop the translation as machine state (may have) changed */
+        /* Note that mtmsr is not always defined as context-synchronizing */
+        gen_stop_exception(ctx);
+
     } else {
         /*
          * XXX: we need to update nip before the store if we enter


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: Boot flakiness with QEMU 3.1.0 and Clang built kernels
  2020-04-11 13:57       ` Boot flakiness with QEMU 3.1.0 and Clang built kernels Nicholas Piggin
@ 2020-04-11 23:35         ` Nathan Chancellor
  2020-04-12 12:03         ` Cédric Le Goater
  2020-04-14  2:05         ` David Gibson
  2 siblings, 0 replies; 6+ messages in thread
From: Nathan Chancellor @ 2020-04-11 23:35 UTC (permalink / raw)
  To: Nicholas Piggin
  Cc: qemu-devel, clang-built-linux, Cédric Le Goater, qemu-ppc,
	linuxppc-dev, David Gibson

On Sat, Apr 11, 2020 at 11:57:23PM +1000, Nicholas Piggin wrote:
> Nicholas Piggin's on April 11, 2020 7:32 pm:
> > Nathan Chancellor's on April 11, 2020 10:53 am:
> >> The tt.config values are needed to reproduce but I did not verify that
> >> ONLY tt.config was needed. Other than that, no, we are just building
> >> either pseries_defconfig or powernv_defconfig with those configs and
> >> letting it boot up with a simple initramfs, which prints the version
> >> string then shuts the machine down.
> >> 
> >> Let me know if you need any more information, cheers!
> > 
> > Okay I can reproduce it. Sometimes it eventually recovers after a long
> > pause, and some keyboard input often helps it along. So that seems like 
> > it might be a lost interrupt.
> > 
> > POWER8 vs POWER9 might just be a timing thing if P9 is still hanging
> > sometimes. I wasn't able to reproduce it with defconfig+tt.config, I
> > needed your other config with various other debug options.
> > 
> > Thanks for the very good report. I'll let you know what I find.
> 
> It looks like a qemu bug. Booting with '-d int' shows the decrementer 
> simply stops firing at the point of the hang, even though MSR[EE]=1 and 
> the DEC register is wrapping. Linux appears to be doing the right thing 
> as far as I can tell (not losing interrupts).
> 
> This qemu patch fixes the boot hang for me. I don't know that qemu 
> really has the right idea of "context synchronizing" as defined in the
> powerpc architecture -- mtmsrd L=1 is not context synchronizing but that
> does not mean it can avoid looking at exceptions until the next such
> event. It looks like the decrementer exception goes high but the
> execution of mtmsrd L=1 is ignoring it.
> 
> Prior to the Linux patch 3282a3da25b you bisected to, interrupt replay
> code would return with an 'rfi' instruction as part of interrupt return,
> which probably helped to get things moving along a bit. However it would
> not be foolproof, and Cedric did say he encountered some mysterious
> lockups under load with qemu powernv before that patch was merged, so
> maybe it's the same issue?
> 
> Thanks,
> Nick
> 
> The patch is a bit of a hack, but if you can run it and verify it fixes
> your boot hang would be good.

Yes, with this patch applied on top of 5.0.0-rc2 and using the
pseries-3.1 and powernv8 machines, I do not see any hangs with a clang
built kernel at b032227c62939b5481bcd45442b36dfa263f4a7c across 100
boots.

If you happen to send it upstream:

Tested-by: Nathan Chancellor <natechancellor@gmail.com>

Thanks for looking into it!

> ---
> 
> diff --git a/target/ppc/translate.c b/target/ppc/translate.c
> index b207fb5386..1d997f5c32 100644
> --- a/target/ppc/translate.c
> +++ b/target/ppc/translate.c
> @@ -4364,12 +4364,21 @@ static void gen_mtmsrd(DisasContext *ctx)
>      if (ctx->opcode & 0x00010000) {
>          /* Special form that does not need any synchronisation */
>          TCGv t0 = tcg_temp_new();
> +        TCGv t1 = tcg_temp_new();
>          tcg_gen_andi_tl(t0, cpu_gpr[rS(ctx->opcode)],
>                          (1 << MSR_RI) | (1 << MSR_EE));
> -        tcg_gen_andi_tl(cpu_msr, cpu_msr,
> +        tcg_gen_andi_tl(t1, cpu_msr,
>                          ~(target_ulong)((1 << MSR_RI) | (1 << MSR_EE)));
> -        tcg_gen_or_tl(cpu_msr, cpu_msr, t0);
> +        tcg_gen_or_tl(t1, t1, t0);
> +
> +        gen_update_nip(ctx, ctx->base.pc_next);
> +        gen_helper_store_msr(cpu_env, t1);
>          tcg_temp_free(t0);
> +        tcg_temp_free(t1);
> +        /* Must stop the translation as machine state (may have) changed */
> +        /* Note that mtmsr is not always defined as context-synchronizing */
> +        gen_stop_exception(ctx);
> +
>      } else {
>          /*
>           * XXX: we need to update nip before the store if we enter

Cheers,
Nathan


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Boot flakiness with QEMU 3.1.0 and Clang built kernels
  2020-04-11 13:57       ` Boot flakiness with QEMU 3.1.0 and Clang built kernels Nicholas Piggin
  2020-04-11 23:35         ` Nathan Chancellor
@ 2020-04-12 12:03         ` Cédric Le Goater
  2020-04-14  2:05         ` David Gibson
  2 siblings, 0 replies; 6+ messages in thread
From: Cédric Le Goater @ 2020-04-12 12:03 UTC (permalink / raw)
  To: Nicholas Piggin, Nathan Chancellor
  Cc: Michael Neuling, qemu-devel, clang-built-linux, qemu-ppc,
	Anton Blanchard, linuxppc-dev, David Gibson

[-- Attachment #1: Type: text/plain, Size: 2205 bytes --]

On 4/11/20 3:57 PM, Nicholas Piggin wrote:
> Nicholas Piggin's on April 11, 2020 7:32 pm:
>> Nathan Chancellor's on April 11, 2020 10:53 am:
>>> The tt.config values are needed to reproduce but I did not verify that
>>> ONLY tt.config was needed. Other than that, no, we are just building
>>> either pseries_defconfig or powernv_defconfig with those configs and
>>> letting it boot up with a simple initramfs, which prints the version
>>> string then shuts the machine down.
>>>
>>> Let me know if you need any more information, cheers!
>>
>> Okay I can reproduce it. Sometimes it eventually recovers after a long
>> pause, and some keyboard input often helps it along. So that seems like 
>> it might be a lost interrupt.
>>
>> POWER8 vs POWER9 might just be a timing thing if P9 is still hanging
>> sometimes. I wasn't able to reproduce it with defconfig+tt.config, I
>> needed your other config with various other debug options.
>>
>> Thanks for the very good report. I'll let you know what I find.
> 
> It looks like a qemu bug. Booting with '-d int' shows the decrementer 
> simply stops firing at the point of the hang, even though MSR[EE]=1 and 
> the DEC register is wrapping. Linux appears to be doing the right thing 
> as far as I can tell (not losing interrupts).
> 
> This qemu patch fixes the boot hang for me. I don't know that qemu 
> really has the right idea of "context synchronizing" as defined in the
> powerpc architecture -- mtmsrd L=1 is not context synchronizing but that
> does not mean it can avoid looking at exceptions until the next such
> event. It looks like the decrementer exception goes high but the
> execution of mtmsrd L=1 is ignoring it.
> 
> Prior to the Linux patch 3282a3da25b you bisected to, interrupt replay
> code would return with an 'rfi' instruction as part of interrupt return,
> which probably helped to get things moving along a bit. However it would
> not be foolproof, and Cedric did say he encountered some mysterious
> lockups under load with qemu powernv before that patch was merged, so
> maybe it's the same issue?

Nope :/ but this is a fix for an important problem reported by Anton in 
November. Attached is the test case.  

Thanks,

C. 


 

[-- Attachment #2: test.S --]
[-- Type: text/plain, Size: 2844 bytes --]

/*

Mikey and I noticed that the decrementer isn't firing when
it should. If a decrementer is pending and an mtmsrd(MSR_EE) is
executed then we should take the decrementer exception. From the PPC AS:

  If MSR EE = 0 and an External, Decrementer, or Per-
  formance Monitor exception is pending, executing
  an mtmsrd instruction that sets MSR EE to 1 will
  cause the interrupt to occur before the next instruc-
  tion is executed, if no higher priority exception
  exists

A test case is below. r31 is incremented for every decrementer
exception.

powerpc64le-linux-gcc -c test.S
powerpc64le-linux-ld -Ttext=0x0 -o test.elf test.o
powerpc64le-linux-objcopy -O binary test.elf test.bin

qemu-system-ppc64 -M powernv -cpu POWER9 -nographic -bios test.bin

"info registers" shows it looping in the lower loop, ie the
decrementer exception was never taken.

r31 never moves. If I build with:

powerpc64le-linux-gcc -DFIX_BROKEN -c test.S

I see r31 move.

*/

#include <ppc-asm.h>

/* Load an immediate 64-bit value into a register */
#define LOAD_IMM64(r, e)			\
	lis	r,(e)@highest;			\
	ori	r,r,(e)@higher;			\
	rldicr	r,r, 32, 31;			\
	oris	r,r, (e)@h;			\
	ori	r,r, (e)@l;

#define FIXUP_ENDIAN						   \
	tdi   0,0,0x48;	  /* Reverse endian of b . + 8		*/ \
	b     191f;	  /* Skip trampoline if endian is good	*/ \
	.long 0xa600607d; /* mfmsr r11				*/ \
	.long 0x01006b69; /* xori r11,r11,1			*/ \
	.long 0x05009f42; /* bcl 20,31,$+4			*/ \
	.long 0xa602487d; /* mflr r10				*/ \
	.long 0x14004a39; /* addi r10,r10,20			*/ \
	.long 0xa64b5a7d; /* mthsrr0 r10			*/ \
	.long 0xa64b7b7d; /* mthsrr1 r11			*/ \
	.long 0x2402004c; /* hrfid				*/ \
191:

	.= 0x0
.globl _start
_start:
	b	1f

	.= 0x10
	FIXUP_ENDIAN
	b	1f

	.= 0x100
1:
	FIXUP_ENDIAN
	b	__initialize

#define EXCEPTION(nr)		\
	.= nr			;\
	b	.

	/* More exception stubs */
	EXCEPTION(0x300)
	EXCEPTION(0x380)
	EXCEPTION(0x400)
	EXCEPTION(0x480)
	EXCEPTION(0x500)
	EXCEPTION(0x600)
	EXCEPTION(0x700)
	EXCEPTION(0x800)

	.= 0x900
	LOAD_IMM64(r0, 0x1000000)
	mtdec	r0
	addi	r31,r31,1
	rfid

	EXCEPTION(0x980)
	EXCEPTION(0xa00)
	EXCEPTION(0xb00)
	EXCEPTION(0xc00)
	EXCEPTION(0xd00)
	EXCEPTION(0xe00)
	EXCEPTION(0xe20)
	EXCEPTION(0xe40)
	EXCEPTION(0xe60)
	EXCEPTION(0xe80)
	EXCEPTION(0xf00)
	EXCEPTION(0xf20)
	EXCEPTION(0xf40)
	EXCEPTION(0xf60)
	EXCEPTION(0xf80)
	EXCEPTION(0x1000)
	EXCEPTION(0x1100)
	EXCEPTION(0x1200)
	EXCEPTION(0x1300)
	EXCEPTION(0x1400)
	EXCEPTION(0x1500)
	EXCEPTION(0x1600)

__initialize:
	/* SF, HV, EE, RI, LE */
	LOAD_IMM64(r0, 0x9000000000008003)
	mtmsrd	r0
	
	/* HID0: HILE */
	LOAD_IMM64(r0, 0x800000000000000)
	mtspr	0x3f0,r0

	LOAD_IMM64(r0, 0x1000000)
	mtdec r0

1:	LOAD_IMM64(r30,0x8000)
	mtmsrd	r30,1

	/* We should take the decrementer here */
#ifdef FIX_BROKEN
	LOAD_IMM64(r29,0x100000000)
	mtctr	r29
2:	bdnz	2b
#endif

	li	r30,0x0
	mtmsrd	r30,1
	b	1b

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Boot flakiness with QEMU 3.1.0 and Clang built kernels
  2020-04-11 13:57       ` Boot flakiness with QEMU 3.1.0 and Clang built kernels Nicholas Piggin
  2020-04-11 23:35         ` Nathan Chancellor
  2020-04-12 12:03         ` Cédric Le Goater
@ 2020-04-14  2:05         ` David Gibson
  2020-04-14  4:05           ` Nathan Chancellor
  2 siblings, 1 reply; 6+ messages in thread
From: David Gibson @ 2020-04-14  2:05 UTC (permalink / raw)
  To: Nicholas Piggin
  Cc: qemu-devel, clang-built-linux, Cédric Le Goater, qemu-ppc,
	Nathan Chancellor, linuxppc-dev

[-- Attachment #1: Type: text/plain, Size: 4056 bytes --]

On Sat, Apr 11, 2020 at 11:57:23PM +1000, Nicholas Piggin wrote:
> Nicholas Piggin's on April 11, 2020 7:32 pm:
> > Nathan Chancellor's on April 11, 2020 10:53 am:
> >> The tt.config values are needed to reproduce but I did not verify that
> >> ONLY tt.config was needed. Other than that, no, we are just building
> >> either pseries_defconfig or powernv_defconfig with those configs and
> >> letting it boot up with a simple initramfs, which prints the version
> >> string then shuts the machine down.
> >> 
> >> Let me know if you need any more information, cheers!
> > 
> > Okay I can reproduce it. Sometimes it eventually recovers after a long
> > pause, and some keyboard input often helps it along. So that seems like 
> > it might be a lost interrupt.
> > 
> > POWER8 vs POWER9 might just be a timing thing if P9 is still hanging
> > sometimes. I wasn't able to reproduce it with defconfig+tt.config, I
> > needed your other config with various other debug options.
> > 
> > Thanks for the very good report. I'll let you know what I find.
> 
> It looks like a qemu bug. Booting with '-d int' shows the decrementer 
> simply stops firing at the point of the hang, even though MSR[EE]=1 and 
> the DEC register is wrapping. Linux appears to be doing the right thing 
> as far as I can tell (not losing interrupts).
> 
> This qemu patch fixes the boot hang for me. I don't know that qemu 
> really has the right idea of "context synchronizing" as defined in the
> powerpc architecture -- mtmsrd L=1 is not context synchronizing but that
> does not mean it can avoid looking at exceptions until the next such
> event. It looks like the decrementer exception goes high but the
> execution of mtmsrd L=1 is ignoring it.
> 
> Prior to the Linux patch 3282a3da25b you bisected to, interrupt replay
> code would return with an 'rfi' instruction as part of interrupt return,
> which probably helped to get things moving along a bit. However it would
> not be foolproof, and Cedric did say he encountered some mysterious
> lockups under load with qemu powernv before that patch was merged, so
> maybe it's the same issue?
> 
> Thanks,
> Nick
> 
> The patch is a bit of a hack, but if you can run it and verify it fixes
> your boot hang would be good.

So a bug in this handling wouldn't surprise me at all.  However a
report against QEMU 3.1 isn't particularly useful.

 * Does the problem occur with current upstream master qemu?
 * Does the problem occur with qemu-2.12 (a pretty widely deployed
   "stable" qemu, e.g. in RHEL)?

> ---
> 
> diff --git a/target/ppc/translate.c b/target/ppc/translate.c
> index b207fb5386..1d997f5c32 100644
> --- a/target/ppc/translate.c
> +++ b/target/ppc/translate.c
> @@ -4364,12 +4364,21 @@ static void gen_mtmsrd(DisasContext *ctx)
>      if (ctx->opcode & 0x00010000) {
>          /* Special form that does not need any synchronisation */
>          TCGv t0 = tcg_temp_new();
> +        TCGv t1 = tcg_temp_new();
>          tcg_gen_andi_tl(t0, cpu_gpr[rS(ctx->opcode)],
>                          (1 << MSR_RI) | (1 << MSR_EE));
> -        tcg_gen_andi_tl(cpu_msr, cpu_msr,
> +        tcg_gen_andi_tl(t1, cpu_msr,
>                          ~(target_ulong)((1 << MSR_RI) | (1 << MSR_EE)));
> -        tcg_gen_or_tl(cpu_msr, cpu_msr, t0);
> +        tcg_gen_or_tl(t1, t1, t0);
> +
> +        gen_update_nip(ctx, ctx->base.pc_next);
> +        gen_helper_store_msr(cpu_env, t1);
>          tcg_temp_free(t0);
> +        tcg_temp_free(t1);
> +        /* Must stop the translation as machine state (may have) changed */
> +        /* Note that mtmsr is not always defined as context-synchronizing */
> +        gen_stop_exception(ctx);
> +
>      } else {
>          /*
>           * XXX: we need to update nip before the store if we enter
> 

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Boot flakiness with QEMU 3.1.0 and Clang built kernels
  2020-04-14  2:05         ` David Gibson
@ 2020-04-14  4:05           ` Nathan Chancellor
  2020-04-14  4:40             ` David Gibson
  0 siblings, 1 reply; 6+ messages in thread
From: Nathan Chancellor @ 2020-04-14  4:05 UTC (permalink / raw)
  To: David Gibson
  Cc: qemu-devel, Nicholas Piggin, clang-built-linux,
	Cédric Le Goater, qemu-ppc, linuxppc-dev

On Tue, Apr 14, 2020 at 12:05:53PM +1000, David Gibson wrote:
> On Sat, Apr 11, 2020 at 11:57:23PM +1000, Nicholas Piggin wrote:
> > Nicholas Piggin's on April 11, 2020 7:32 pm:
> > > Nathan Chancellor's on April 11, 2020 10:53 am:
> > >> The tt.config values are needed to reproduce but I did not verify that
> > >> ONLY tt.config was needed. Other than that, no, we are just building
> > >> either pseries_defconfig or powernv_defconfig with those configs and
> > >> letting it boot up with a simple initramfs, which prints the version
> > >> string then shuts the machine down.
> > >> 
> > >> Let me know if you need any more information, cheers!
> > > 
> > > Okay I can reproduce it. Sometimes it eventually recovers after a long
> > > pause, and some keyboard input often helps it along. So that seems like 
> > > it might be a lost interrupt.
> > > 
> > > POWER8 vs POWER9 might just be a timing thing if P9 is still hanging
> > > sometimes. I wasn't able to reproduce it with defconfig+tt.config, I
> > > needed your other config with various other debug options.
> > > 
> > > Thanks for the very good report. I'll let you know what I find.
> > 
> > It looks like a qemu bug. Booting with '-d int' shows the decrementer 
> > simply stops firing at the point of the hang, even though MSR[EE]=1 and 
> > the DEC register is wrapping. Linux appears to be doing the right thing 
> > as far as I can tell (not losing interrupts).
> > 
> > This qemu patch fixes the boot hang for me. I don't know that qemu 
> > really has the right idea of "context synchronizing" as defined in the
> > powerpc architecture -- mtmsrd L=1 is not context synchronizing but that
> > does not mean it can avoid looking at exceptions until the next such
> > event. It looks like the decrementer exception goes high but the
> > execution of mtmsrd L=1 is ignoring it.
> > 
> > Prior to the Linux patch 3282a3da25b you bisected to, interrupt replay
> > code would return with an 'rfi' instruction as part of interrupt return,
> > which probably helped to get things moving along a bit. However it would
> > not be foolproof, and Cedric did say he encountered some mysterious
> > lockups under load with qemu powernv before that patch was merged, so
> > maybe it's the same issue?
> > 
> > Thanks,
> > Nick
> > 
> > The patch is a bit of a hack, but if you can run it and verify it fixes
> > your boot hang would be good.
> 
> So a bug in this handling wouldn't surprise me at all.  However a
> report against QEMU 3.1 isn't particularly useful.
> 
>  * Does the problem occur with current upstream master qemu?

Yes, I can reproduce the hang on 5.0.0-rc2.

>  * Does the problem occur with qemu-2.12 (a pretty widely deployed
>    "stable" qemu, e.g. in RHEL)?

No idea but I would assume so. I might have time later this week to test
but I assume it is kind of irrelevant if it is reproducible at ToT.

> > ---
> > 
> > diff --git a/target/ppc/translate.c b/target/ppc/translate.c
> > index b207fb5386..1d997f5c32 100644
> > --- a/target/ppc/translate.c
> > +++ b/target/ppc/translate.c
> > @@ -4364,12 +4364,21 @@ static void gen_mtmsrd(DisasContext *ctx)
> >      if (ctx->opcode & 0x00010000) {
> >          /* Special form that does not need any synchronisation */
> >          TCGv t0 = tcg_temp_new();
> > +        TCGv t1 = tcg_temp_new();
> >          tcg_gen_andi_tl(t0, cpu_gpr[rS(ctx->opcode)],
> >                          (1 << MSR_RI) | (1 << MSR_EE));
> > -        tcg_gen_andi_tl(cpu_msr, cpu_msr,
> > +        tcg_gen_andi_tl(t1, cpu_msr,
> >                          ~(target_ulong)((1 << MSR_RI) | (1 << MSR_EE)));
> > -        tcg_gen_or_tl(cpu_msr, cpu_msr, t0);
> > +        tcg_gen_or_tl(t1, t1, t0);
> > +
> > +        gen_update_nip(ctx, ctx->base.pc_next);
> > +        gen_helper_store_msr(cpu_env, t1);
> >          tcg_temp_free(t0);
> > +        tcg_temp_free(t1);
> > +        /* Must stop the translation as machine state (may have) changed */
> > +        /* Note that mtmsr is not always defined as context-synchronizing */
> > +        gen_stop_exception(ctx);
> > +
> >      } else {
> >          /*
> >           * XXX: we need to update nip before the store if we enter
> > 
> 
> -- 
> David Gibson			| I'll have my music baroque, and my code
> david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
> 				| _way_ _around_!
> http://www.ozlabs.org/~dgibson

Cheers,
Nathan


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Boot flakiness with QEMU 3.1.0 and Clang built kernels
  2020-04-14  4:05           ` Nathan Chancellor
@ 2020-04-14  4:40             ` David Gibson
  0 siblings, 0 replies; 6+ messages in thread
From: David Gibson @ 2020-04-14  4:40 UTC (permalink / raw)
  To: Nathan Chancellor
  Cc: qemu-devel, Nicholas Piggin, clang-built-linux,
	Cédric Le Goater, qemu-ppc, linuxppc-dev

[-- Attachment #1: Type: text/plain, Size: 4831 bytes --]

On Mon, Apr 13, 2020 at 09:05:15PM -0700, Nathan Chancellor wrote:
> On Tue, Apr 14, 2020 at 12:05:53PM +1000, David Gibson wrote:
> > On Sat, Apr 11, 2020 at 11:57:23PM +1000, Nicholas Piggin wrote:
> > > Nicholas Piggin's on April 11, 2020 7:32 pm:
> > > > Nathan Chancellor's on April 11, 2020 10:53 am:
> > > >> The tt.config values are needed to reproduce but I did not verify that
> > > >> ONLY tt.config was needed. Other than that, no, we are just building
> > > >> either pseries_defconfig or powernv_defconfig with those configs and
> > > >> letting it boot up with a simple initramfs, which prints the version
> > > >> string then shuts the machine down.
> > > >> 
> > > >> Let me know if you need any more information, cheers!
> > > > 
> > > > Okay I can reproduce it. Sometimes it eventually recovers after a long
> > > > pause, and some keyboard input often helps it along. So that seems like 
> > > > it might be a lost interrupt.
> > > > 
> > > > POWER8 vs POWER9 might just be a timing thing if P9 is still hanging
> > > > sometimes. I wasn't able to reproduce it with defconfig+tt.config, I
> > > > needed your other config with various other debug options.
> > > > 
> > > > Thanks for the very good report. I'll let you know what I find.
> > > 
> > > It looks like a qemu bug. Booting with '-d int' shows the decrementer 
> > > simply stops firing at the point of the hang, even though MSR[EE]=1 and 
> > > the DEC register is wrapping. Linux appears to be doing the right thing 
> > > as far as I can tell (not losing interrupts).
> > > 
> > > This qemu patch fixes the boot hang for me. I don't know that qemu 
> > > really has the right idea of "context synchronizing" as defined in the
> > > powerpc architecture -- mtmsrd L=1 is not context synchronizing but that
> > > does not mean it can avoid looking at exceptions until the next such
> > > event. It looks like the decrementer exception goes high but the
> > > execution of mtmsrd L=1 is ignoring it.
> > > 
> > > Prior to the Linux patch 3282a3da25b you bisected to, interrupt replay
> > > code would return with an 'rfi' instruction as part of interrupt return,
> > > which probably helped to get things moving along a bit. However it would
> > > not be foolproof, and Cedric did say he encountered some mysterious
> > > lockups under load with qemu powernv before that patch was merged, so
> > > maybe it's the same issue?
> > > 
> > > Thanks,
> > > Nick
> > > 
> > > The patch is a bit of a hack, but if you can run it and verify it fixes
> > > your boot hang would be good.
> > 
> > So a bug in this handling wouldn't surprise me at all.  However a
> > report against QEMU 3.1 isn't particularly useful.
> > 
> >  * Does the problem occur with current upstream master qemu?
> 
> Yes, I can reproduce the hang on 5.0.0-rc2.

Ok.

Nick, can you polish up your fix shortly and submit upstream in the
usual fashion?

> >  * Does the problem occur with qemu-2.12 (a pretty widely deployed
> >    "stable" qemu, e.g. in RHEL)?
> 
> No idea but I would assume so. I might have time later this week to test
> but I assume it is kind of irrelevant if it is reproducible at ToT.
> 
> > > ---
> > > 
> > > diff --git a/target/ppc/translate.c b/target/ppc/translate.c
> > > index b207fb5386..1d997f5c32 100644
> > > --- a/target/ppc/translate.c
> > > +++ b/target/ppc/translate.c
> > > @@ -4364,12 +4364,21 @@ static void gen_mtmsrd(DisasContext *ctx)
> > >      if (ctx->opcode & 0x00010000) {
> > >          /* Special form that does not need any synchronisation */
> > >          TCGv t0 = tcg_temp_new();
> > > +        TCGv t1 = tcg_temp_new();
> > >          tcg_gen_andi_tl(t0, cpu_gpr[rS(ctx->opcode)],
> > >                          (1 << MSR_RI) | (1 << MSR_EE));
> > > -        tcg_gen_andi_tl(cpu_msr, cpu_msr,
> > > +        tcg_gen_andi_tl(t1, cpu_msr,
> > >                          ~(target_ulong)((1 << MSR_RI) | (1 << MSR_EE)));
> > > -        tcg_gen_or_tl(cpu_msr, cpu_msr, t0);
> > > +        tcg_gen_or_tl(t1, t1, t0);
> > > +
> > > +        gen_update_nip(ctx, ctx->base.pc_next);
> > > +        gen_helper_store_msr(cpu_env, t1);
> > >          tcg_temp_free(t0);
> > > +        tcg_temp_free(t1);
> > > +        /* Must stop the translation as machine state (may have) changed */
> > > +        /* Note that mtmsr is not always defined as context-synchronizing */
> > > +        gen_stop_exception(ctx);
> > > +
> > >      } else {
> > >          /*
> > >           * XXX: we need to update nip before the store if we enter
> > > 
> > 
> 

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2020-04-14  6:26 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <20200410205932.GA880@ubuntu-s3-xlarge-x86>
     [not found] ` <1586564375.zt8lm9finh.astroid@bobo.none>
     [not found]   ` <20200411005354.GA24145@ubuntu-s3-xlarge-x86>
     [not found]     ` <1586597161.xyshvdbjo6.astroid@bobo.none>
2020-04-11 13:57       ` Boot flakiness with QEMU 3.1.0 and Clang built kernels Nicholas Piggin
2020-04-11 23:35         ` Nathan Chancellor
2020-04-12 12:03         ` Cédric Le Goater
2020-04-14  2:05         ` David Gibson
2020-04-14  4:05           ` Nathan Chancellor
2020-04-14  4:40             ` David Gibson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).