From: Nicholas Piggin <npiggin@gmail.com>
To: qemu-devel@nongnu.org
Cc: qemu-stable@nongnu.org, "Nicholas Piggin" <npiggin@gmail.com>,
"Cédric Le Goater" <clg@fr.ibm.com>,
qemu-ppc@nongnu.org,
"Nathan Chancellor" <natechancellor@gmail.com>,
linuxppc-dev@lists.ozlabs.org,
"David Gibson" <david@gibson.dropbear.id.au>
Subject: [PATCH] target/ppc: Fix mtmsr(d) L=1 variant that loses interrupts
Date: Tue, 14 Apr 2020 21:11:31 +1000
Message-ID: <20200414111131.465560-1-npiggin@gmail.com>
If mtmsr L=1 sets MSR[EE] while there is a maskable exception pending,
it does not cause an interrupt. This causes the test case to hang:
https://lists.gnu.org/archive/html/qemu-ppc/2019-10/msg00826.html
More recently, Linux reduced the occurrence of operations (e.g., rfi)
which stop translation and allow pending interrupts to be processed.
This started causing hangs in Linux boot in long-running kernel tests;
running with '-d int' shows the decrementer stops firing despite DEC
wrapping and MSR[EE]=1.
https://lists.ozlabs.org/pipermail/linuxppc-dev/2020-April/208301.html
The cause is the broken mtmsr L=1 behaviour, which is contrary to the
architecture. The Programming Note for Move To Machine State Register
in Power ISA v3.0B, p.977, states:
    If MSR[EE]=0 and an External, Decrementer, or Performance Monitor
    exception is pending, executing an mtmsrd instruction that sets
    MSR[EE] to 1 will cause the interrupt to occur before the next
    instruction is executed, if no higher priority exception exists
Fix this by handling L=1 exactly the same way as L=0, modulo the MSR
bits altered.
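
As a rough illustration of "modulo the MSR bits altered", the value the
L=1 form stores can be sketched in plain C. This is a standalone sketch,
not QEMU code: mtmsr_l1_value() and L1_MASK are hypothetical names, and
the bit numbers are assumed to match QEMU's MSR_EE and MSR_RI.

```c
#include <stdint.h>

/* Assumed bit numbers (LSB-0), mirroring QEMU's MSR_EE and MSR_RI. */
#define MSR_EE 15
#define MSR_RI 1

/* The only MSR bits the L=1 form is allowed to change. */
#define L1_MASK ((UINT64_C(1) << MSR_EE) | (UINT64_C(1) << MSR_RI))

/*
 * Hypothetical helper: take EE and RI from the source register rs and
 * every other bit from the current MSR.
 */
static uint64_t mtmsr_l1_value(uint64_t old_msr, uint64_t rs)
{
    return (old_msr & ~L1_MASK) | (rs & L1_MASK);
}
```

The merged value is then stored through the same path as the L=0 form,
instead of being written into the MSR directly.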
The confusion arises from L=0 being "context synchronizing" whereas L=1
is "execution synchronizing", which is a weaker semantic. However, this
is not a relaxation of the requirement that these exceptions cause
interrupts when MSR[EE]=1 (e.g., when mtmsr executes to completion as
TCG is doing here); rather, it specifies how a pipelined processor can
have multiple instructions in flight where one may influence how another
behaves.
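
The lost-interrupt behaviour can be modelled with a toy example. This is
a deliberately simplified sketch and not how QEMU is structured: the
toy_* names are hypothetical, and the only assumption carried over from
the description above is that pending interrupts get processed when
translation stops.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_MSR_EE (UINT64_C(1) << 15)  /* assumed EE bit position */

struct toy_cpu {
    uint64_t msr;
    bool dec_pending;   /* a decrementer exception is waiting */
    unsigned taken;     /* interrupts delivered so far */
};

/* Stand-in for the point where pending interrupts get processed. */
static void toy_process_pending(struct toy_cpu *cpu)
{
    if (cpu->dec_pending && (cpu->msr & TOY_MSR_EE)) {
        cpu->dec_pending = false;
        cpu->taken++;
    }
}

/*
 * Execute "mtmsr L=1" setting EE with a decrementer exception pending,
 * and report how many interrupts were taken before the next instruction.
 */
static unsigned toy_mtmsr_l1_then_next_insn(bool stop_translation)
{
    struct toy_cpu cpu = { .msr = 0, .dec_pending = true, .taken = 0 };

    cpu.msr |= TOY_MSR_EE;          /* the L=1 form sets MSR[EE] */
    if (stop_translation) {
        toy_process_pending(&cpu);  /* behaviour with this fix */
    }
    /* the next instruction would execute here */
    return cpu.taken;
}
```

With stop_translation false (the old L=1 handling) the exception is
still pending when the next instruction runs; with it true, the
interrupt is taken first, as the Programming Note requires.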
Cc: qemu-stable@nongnu.org
Reported-by: Anton Blanchard <anton@ozlabs.org>
Reported-by: Nathan Chancellor <natechancellor@gmail.com>
Tested-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
Thanks very much to Nathan for reporting and testing it. I added his
Tested-by tag despite this being a more polished patch, as the basics
are still the same (and it still fixes his test case here).
This bug possibly goes back to early v2.04 / mtmsrd L=1 support around
2007, and the code has been changed several times since then, so it may
require some backporting.
32-bit / mtmsr is untested at the moment; I don't have an environment
handy.
target/ppc/translate.c | 46 +++++++++++++++++++++++++-----------------
1 file changed, 27 insertions(+), 19 deletions(-)
diff --git a/target/ppc/translate.c b/target/ppc/translate.c
index b207fb5386..9959259dba 100644
--- a/target/ppc/translate.c
+++ b/target/ppc/translate.c
@@ -4361,30 +4361,34 @@ static void gen_mtmsrd(DisasContext *ctx)
     CHK_SV;
 #if !defined(CONFIG_USER_ONLY)
+    if (tb_cflags(ctx->base.tb) & CF_USE_ICOUNT) {
+        gen_io_start();
+    }
     if (ctx->opcode & 0x00010000) {
-        /* Special form that does not need any synchronisation */
+        /* L=1 form only updates EE and RI */
         TCGv t0 = tcg_temp_new();
+        TCGv t1 = tcg_temp_new();
         tcg_gen_andi_tl(t0, cpu_gpr[rS(ctx->opcode)],
                         (1 << MSR_RI) | (1 << MSR_EE));
-        tcg_gen_andi_tl(cpu_msr, cpu_msr,
+        tcg_gen_andi_tl(t1, cpu_msr,
                         ~(target_ulong)((1 << MSR_RI) | (1 << MSR_EE)));
-        tcg_gen_or_tl(cpu_msr, cpu_msr, t0);
+        tcg_gen_or_tl(t1, t1, t0);
+
+        gen_helper_store_msr(cpu_env, t1);
         tcg_temp_free(t0);
+        tcg_temp_free(t1);
+
     } else {
         /*
          * XXX: we need to update nip before the store if we enter
          * power saving mode, we will exit the loop directly from
          * ppc_store_msr
          */
-        if (tb_cflags(ctx->base.tb) & CF_USE_ICOUNT) {
-            gen_io_start();
-        }
         gen_update_nip(ctx, ctx->base.pc_next);
         gen_helper_store_msr(cpu_env, cpu_gpr[rS(ctx->opcode)]);
-        /* Must stop the translation as machine state (may have) changed */
-        /* Note that mtmsr is not always defined as context-synchronizing */
-        gen_stop_exception(ctx);
     }
+    /* Must stop the translation as machine state (may have) changed */
+    gen_stop_exception(ctx);
 #endif /* !defined(CONFIG_USER_ONLY) */
 }
 #endif /* defined(TARGET_PPC64) */
@@ -4394,15 +4398,23 @@ static void gen_mtmsr(DisasContext *ctx)
     CHK_SV;
 #if !defined(CONFIG_USER_ONLY)
-    if (ctx->opcode & 0x00010000) {
-        /* Special form that does not need any synchronisation */
+    if (tb_cflags(ctx->base.tb) & CF_USE_ICOUNT) {
+        gen_io_start();
+    }
+    if (ctx->opcode & 0x00010000) {
+        /* L=1 form only updates EE and RI */
         TCGv t0 = tcg_temp_new();
+        TCGv t1 = tcg_temp_new();
         tcg_gen_andi_tl(t0, cpu_gpr[rS(ctx->opcode)],
                         (1 << MSR_RI) | (1 << MSR_EE));
-        tcg_gen_andi_tl(cpu_msr, cpu_msr,
+        tcg_gen_andi_tl(t1, cpu_msr,
                         ~(target_ulong)((1 << MSR_RI) | (1 << MSR_EE)));
-        tcg_gen_or_tl(cpu_msr, cpu_msr, t0);
+        tcg_gen_or_tl(t1, t1, t0);
+
+        gen_helper_store_msr(cpu_env, t1);
         tcg_temp_free(t0);
+        tcg_temp_free(t1);
+
     } else {
         TCGv msr = tcg_temp_new();
@@ -4411,9 +4423,6 @@ static void gen_mtmsr(DisasContext *ctx)
          * power saving mode, we will exit the loop directly from
          * ppc_store_msr
          */
-        if (tb_cflags(ctx->base.tb) & CF_USE_ICOUNT) {
-            gen_io_start();
-        }
         gen_update_nip(ctx, ctx->base.pc_next);
 #if defined(TARGET_PPC64)
         tcg_gen_deposit_tl(msr, cpu_msr, cpu_gpr[rS(ctx->opcode)], 0, 32);
@@ -4422,10 +4431,9 @@ static void gen_mtmsr(DisasContext *ctx)
 #endif
         gen_helper_store_msr(cpu_env, msr);
         tcg_temp_free(msr);
-        /* Must stop the translation as machine state (may have) changed */
-        /* Note that mtmsr is not always defined as context-synchronizing */
-        gen_stop_exception(ctx);
     }
+    /* Must stop the translation as machine state (may have) changed */
+    gen_stop_exception(ctx);
 #endif
 }
--
2.23.0