* Re: [Qemu-devel] [PATCH 0/6] Drop the irredeemably racy cpu_unlink_tb()
[not found] <1361556605-21963-1-git-send-email-peter.maydell@linaro.org>
@ 2013-03-03 13:23 ` Peter Maydell
2013-03-03 15:50 ` Blue Swirl
2013-05-09 8:09 ` Michael Tokarev
2 siblings, 0 replies; 10+ messages in thread
From: Peter Maydell @ 2013-03-03 13:23 UTC (permalink / raw)
To: qemu-devel
Cc: Anthony Liguori, patches, Alexander Graf, Blue Swirl, Paul Brook,
Andreas Färber, Richard Henderson
Ping!
thanks
-- PMM
On 23 February 2013 02:09, Peter Maydell <peter.maydell@linaro.org> wrote:
> This patch series gets rid of cpu_unlink_tb(), which is irredeemably
> racy, since it modifies the TB graph with no locking from other
> threads, signal handlers, etc etc. (The signal handler case is
> why you can't just fix this with more locks.) Instead we take the
> much simpler approach of setting a flag for the CPU when we want
> it to stop executing TBs, and generate code to check the flag at
> the start of every TB. The raciness is easiest to provoke with
> multithreaded linux-user guests but it is I think also a risk
> in system emulation mode.
>
> This fixes the crashes seen in LP:668799; however there are another
> class of crashes described in LP:1098729 which stem from the fact
> that in linux-user with a multithreaded guest all threads will
> use and modify the same global TCG date structures (including the
> generated code buffer) without any kind of locking. This means that
> multithreaded guest binaries are still in the "unsupported" category.
>
> Patch 1 has been on the list before, but I improved the comment
> a bit [no code changes from the previously posted version] which
> is why I haven't given it rth's reviewed-by tag.
>
> The ENV_OFFSET macros patch is one from an old patch series
> of Andreas'; we need some way in generic code to find the offset
> of a CPUState field from the CPUArchState pointer. (The commit
> message is mine, though.)
>
> I've tested this in various ways, and also ran a popular embedded
> benchmark. This is the worst case for this change, since it is
> compute intensive code in a situation that shouldn't have too
> many interrupts; the slowdown varied from benchmark to benchmark
> but was generally something like 3-5%. I think that's an acceptably
> small hit to gain actual correctness and non-crashiness :-)
>
>
> Andreas Färber (1):
> cpu: Introduce ENV_OFFSET macros
>
> Peter Maydell (5):
> tcg: Document tcg_qemu_tb_exec() and provide constants for low bit uses
> cpu-exec: wrap tcg_qemu_tb_exec() in a fn to restore the PC
> Handle CPU interrupts by inline checking of a flag
> translate-all.c: Remove cpu_unlink_tb()
> gen-icount.h: Rename gen_icount_start/end to gen_tb_start/end
>
> cpu-exec.c | 59 +++++++++++++++++++++++++--------
> exec.c | 2 +-
> include/exec/gen-icount.h | 18 ++++++++--
> include/qom/cpu.h | 3 ++
> target-alpha/cpu-qom.h | 1 +
> target-alpha/translate.c | 4 +--
> target-arm/cpu-qom.h | 2 ++
> target-arm/translate.c | 4 +--
> target-cris/cpu-qom.h | 1 +
> target-cris/translate.c | 4 +--
> target-i386/cpu-qom.h | 1 +
> target-i386/translate.c | 4 +--
> target-lm32/cpu-qom.h | 1 +
> target-lm32/translate.c | 4 +--
> target-m68k/cpu-qom.h | 1 +
> target-m68k/translate.c | 4 +--
> target-microblaze/cpu-qom.h | 1 +
> target-microblaze/translate.c | 4 +--
> target-mips/cpu-qom.h | 1 +
> target-mips/translate.c | 4 +--
> target-openrisc/cpu.h | 2 ++
> target-openrisc/translate.c | 4 +--
> target-ppc/cpu-qom.h | 3 +-
> target-ppc/translate.c | 4 +--
> target-s390x/cpu-qom.h | 1 +
> target-s390x/translate.c | 4 +--
> target-sh4/cpu-qom.h | 1 +
> target-sh4/translate.c | 4 +--
> target-sparc/cpu-qom.h | 1 +
> target-sparc/translate.c | 4 +--
> target-unicore32/cpu-qom.h | 1 +
> target-unicore32/translate.c | 4 +--
> target-xtensa/cpu-qom.h | 1 +
> target-xtensa/translate.c | 4 +--
> tcg/tcg.h | 49 ++++++++++++++++++++++++++-
> translate-all.c | 73 ++---------------------------------------
> 36 files changed, 162 insertions(+), 121 deletions(-)
>
> --
> 1.7.9.5
>
>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [Qemu-devel] [PATCH 0/6] Drop the irredeemably racy cpu_unlink_tb()
[not found] <1361556605-21963-1-git-send-email-peter.maydell@linaro.org>
2013-03-03 13:23 ` [Qemu-devel] [PATCH 0/6] Drop the irredeemably racy cpu_unlink_tb() Peter Maydell
@ 2013-03-03 15:50 ` Blue Swirl
2013-05-09 8:09 ` Michael Tokarev
2 siblings, 0 replies; 10+ messages in thread
From: Blue Swirl @ 2013-03-03 15:50 UTC (permalink / raw)
To: Peter Maydell
Cc: Anthony Liguori, patches, qemu-devel, Alexander Graf, Paul Brook,
Andreas Färber, Richard Henderson
Thanks, applied all.
On Fri, Feb 22, 2013 at 6:09 PM, Peter Maydell <peter.maydell@linaro.org> wrote:
> This patch series gets rid of cpu_unlink_tb(), which is irredeemably
> racy, since it modifies the TB graph with no locking from other
> threads, signal handlers, etc etc. (The signal handler case is
> why you can't just fix this with more locks.) Instead we take the
> much simpler approach of setting a flag for the CPU when we want
> it to stop executing TBs, and generate code to check the flag at
> the start of every TB. The raciness is easiest to provoke with
> multithreaded linux-user guests but it is I think also a risk
> in system emulation mode.
>
> This fixes the crashes seen in LP:668799; however there are another
> class of crashes described in LP:1098729 which stem from the fact
> that in linux-user with a multithreaded guest all threads will
> use and modify the same global TCG date structures (including the
> generated code buffer) without any kind of locking. This means that
> multithreaded guest binaries are still in the "unsupported" category.
>
> Patch 1 has been on the list before, but I improved the comment
> a bit [no code changes from the previously posted version] which
> is why I haven't given it rth's reviewed-by tag.
>
> The ENV_OFFSET macros patch is one from an old patch series
> of Andreas'; we need some way in generic code to find the offset
> of a CPUState field from the CPUArchState pointer. (The commit
> message is mine, though.)
>
> I've tested this in various ways, and also ran a popular embedded
> benchmark. This is the worst case for this change, since it is
> compute intensive code in a situation that shouldn't have too
> many interrupts; the slowdown varied from benchmark to benchmark
> but was generally something like 3-5%. I think that's an acceptably
> small hit to gain actual correctness and non-crashiness :-)
>
>
> Andreas Färber (1):
> cpu: Introduce ENV_OFFSET macros
>
> Peter Maydell (5):
> tcg: Document tcg_qemu_tb_exec() and provide constants for low bit uses
> cpu-exec: wrap tcg_qemu_tb_exec() in a fn to restore the PC
> Handle CPU interrupts by inline checking of a flag
> translate-all.c: Remove cpu_unlink_tb()
> gen-icount.h: Rename gen_icount_start/end to gen_tb_start/end
>
> cpu-exec.c | 59 +++++++++++++++++++++++++--------
> exec.c | 2 +-
> include/exec/gen-icount.h | 18 ++++++++--
> include/qom/cpu.h | 3 ++
> target-alpha/cpu-qom.h | 1 +
> target-alpha/translate.c | 4 +--
> target-arm/cpu-qom.h | 2 ++
> target-arm/translate.c | 4 +--
> target-cris/cpu-qom.h | 1 +
> target-cris/translate.c | 4 +--
> target-i386/cpu-qom.h | 1 +
> target-i386/translate.c | 4 +--
> target-lm32/cpu-qom.h | 1 +
> target-lm32/translate.c | 4 +--
> target-m68k/cpu-qom.h | 1 +
> target-m68k/translate.c | 4 +--
> target-microblaze/cpu-qom.h | 1 +
> target-microblaze/translate.c | 4 +--
> target-mips/cpu-qom.h | 1 +
> target-mips/translate.c | 4 +--
> target-openrisc/cpu.h | 2 ++
> target-openrisc/translate.c | 4 +--
> target-ppc/cpu-qom.h | 3 +-
> target-ppc/translate.c | 4 +--
> target-s390x/cpu-qom.h | 1 +
> target-s390x/translate.c | 4 +--
> target-sh4/cpu-qom.h | 1 +
> target-sh4/translate.c | 4 +--
> target-sparc/cpu-qom.h | 1 +
> target-sparc/translate.c | 4 +--
> target-unicore32/cpu-qom.h | 1 +
> target-unicore32/translate.c | 4 +--
> target-xtensa/cpu-qom.h | 1 +
> target-xtensa/translate.c | 4 +--
> tcg/tcg.h | 49 ++++++++++++++++++++++++++-
> translate-all.c | 73 ++---------------------------------------
> 36 files changed, 162 insertions(+), 121 deletions(-)
>
> --
> 1.7.9.5
>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [Qemu-devel] [PATCH 0/6] Drop the irredeemably racy cpu_unlink_tb()
[not found] <1361556605-21963-1-git-send-email-peter.maydell@linaro.org>
2013-03-03 13:23 ` [Qemu-devel] [PATCH 0/6] Drop the irredeemably racy cpu_unlink_tb() Peter Maydell
2013-03-03 15:50 ` Blue Swirl
@ 2013-05-09 8:09 ` Michael Tokarev
2013-05-09 8:13 ` [Qemu-devel] [PATCH 1/4] tcg: Document tcg_qemu_tb_exec() and provide constants for low bit uses Michael Tokarev
` (5 more replies)
2 siblings, 6 replies; 10+ messages in thread
From: Michael Tokarev @ 2013-05-09 8:09 UTC (permalink / raw)
To: Peter Maydell
Cc: Anthony Liguori, qemu-stable, qemu-devel, Alexander Graf,
Blue Swirl, Paul Brook, Andreas Färber, Richard Henderson
[Rehashing a relatively old thread.
It is available, in particular, at http://thread.gmane.org/gmane.comp.emulators.qemu/196629]
22.02.2013 22:09, Peter Maydell wrote:
> This patch series gets rid of cpu_unlink_tb(), which is irredeemably
> racy, since it modifies the TB graph with no locking from other
> threads, signal handlers, etc etc. (The signal handler case is
> why you can't just fix this with more locks.) Instead we take the
> much simpler approach of setting a flag for the CPU when we want
> it to stop executing TBs, and generate code to check the flag at
> the start of every TB. The raciness is easiest to provoke with
> multithreaded linux-user guests but it is I think also a risk
> in system emulation mode.
>
> This fixes the crashes seen in LP:668799; however there are another
> class of crashes described in LP:1098729 which stem from the fact
> that in linux-user with a multithreaded guest all threads will
> use and modify the same global TCG date structures (including the
> generated code buffer) without any kind of locking. This means that
> multithreaded guest binaries are still in the "unsupported" category.
Now when Debian Wheezy has been released with qemu 1.2, users started
to file bugreports about multithreaded apps crashing. So, while I
understand these are "unsupported" as per above, I think it is better
to fix as much as we can, and allow some more usage cases, than to
describe that it "does not work". (And yes, I also understand it's
better to fix that before a release, but oh well).
So I tried to backport the series to 1.2 branch, to see how it goes.
But there were many changes since 1.2, so it ended up in quite some
changes, -- not exactly huge and complicated, but I need some help
or additional pair (or two) of eyes to ensure the sanity of the
resulting code.
The backported patchset is smaller than the original.
> Andreas Färber (1):
> cpu: Introduce ENV_OFFSET macros
This change isn't needed in 1.2, because all CPU state is in single
place there, it hasn't been split between env and cpu states there.
> Peter Maydell (5):
> tcg: Document tcg_qemu_tb_exec() and provide constants for low bit uses
Just a minor context difference in the header, no questions.
> cpu-exec: wrap tcg_qemu_tb_exec() in a fn to restore the PC
This needed a "back merge" of env+cpu states back to cpu.
Maybe we should somehow revisit the whole concept of the
two, because it's sorta fun: at some point all functions
has been converted to accept `cpu' instead of `env', but
in many places the first thing a function does is to
get `env' pointer out of `cpu'. The backport of this
change reverts this piggy-backing and uses `env' everywhere
consistently again.
> Handle CPU interrupts by inline checking of a flag
The main patch in the series.
The same simplification from `cpu' back to `env' as above.
I had to add `tcg_exit_req' field into CPU_COMMON macro in
cpu-defs.h, instead of adding it to CPUState struct in
include/qom/cpu.h.
Plus the corresponding changes in gen-icount.h -- that's
where ENV_OFFSET was used, but due to the same env to cpu
split, not needed anymore. My main question is actually
about this place, I don't exactly understand how the
code generation works, so am not sure if I backported
it correctly.
Plus minor code shuffling - for one, bits in translate-all.c
was in exec.c before, so I had to modify it there.
> translate-all.c: Remove cpu_unlink_tb()
That's easy, but again the bits being removed are in
exec.c in 1.2, not in translate-all.c (so the resulting
backported patch is now misnamed).
Technically this patch isn't needed for a backport,
since the function(s) are really unused, but gcc
generates a warning about unused static function
and the compilation fails due to -Werror.
> gen-icount.h: Rename gen_icount_start/end to gen_tb_start/end
And I didn't try to backport this one, since this is
just mechanical s/icount/tb/g without any code changes.
Now, the resulting thing compiles (ha!), but I'm not
really sure how to test it. I ran a few random apps
using qemu-i386 and qemu-arm, it appears to work.
If all goes well, I think this series needs to be
included in whatever -stable qemu series are in use.
I'll post the 4 resulting patches in reply to this
message.
Thank you!
/mjt
^ permalink raw reply [flat|nested] 10+ messages in thread
* [Qemu-devel] [PATCH 1/4] tcg: Document tcg_qemu_tb_exec() and provide constants for low bit uses
2013-05-09 8:09 ` Michael Tokarev
@ 2013-05-09 8:13 ` Michael Tokarev
2013-05-09 8:13 ` [Qemu-devel] [PATCH 2/4] cpu-exec: wrap tcg_qemu_tb_exec() in a fn to restore the PC Michael Tokarev
` (4 subsequent siblings)
5 siblings, 0 replies; 10+ messages in thread
From: Michael Tokarev @ 2013-05-09 8:13 UTC (permalink / raw)
To: Peter Maydell
Cc: Anthony Liguori, qemu-stable, Michael Tokarev, qemu-devel,
Alexander Graf, Blue Swirl, Paul Brook, Andreas Färber,
Richard Henderson
From: Peter Maydell <peter.maydell@linaro.org>
Document tcg_qemu_tb_exec(). In particular, its return value is a
combination of a pointer to the next translation block and some
extra information in the low two bits. Provide some #defines for
the values passed in these bits to improve code clarity.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
(cherry picked from commit 0980011b4f66482d2733ab2dd0f2f61747772c6b)
Conflicts:
tcg/tcg.h
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
---
cpu-exec.c | 9 +++++----
gen-icount.h | 2 +-
tcg/tcg.h | 44 +++++++++++++++++++++++++++++++++++++++++++-
3 files changed, 49 insertions(+), 6 deletions(-)
diff --git a/cpu-exec.c b/cpu-exec.c
index 6db32cd..18da91e 100644
--- a/cpu-exec.c
+++ b/cpu-exec.c
@@ -71,7 +71,7 @@ static void cpu_exec_nocache(CPUArchState *env, int max_cycles,
next_tb = tcg_qemu_tb_exec(env, tb->tc_ptr);
env->current_tb = NULL;
- if ((next_tb & 3) == 2) {
+ if ((next_tb & TB_EXIT_MASK) == TB_EXIT_ICOUNT_EXPIRED) {
/* Restore PC. This may happen if async event occurs before
the TB starts executing. */
cpu_pc_from_tb(env, tb);
@@ -555,7 +555,8 @@ int cpu_exec(CPUArchState *env)
spans two pages, we cannot safely do a direct
jump. */
if (next_tb != 0 && tb->page_addr[1] == -1) {
- tb_add_jump((TranslationBlock *)(next_tb & ~3), next_tb & 3, tb);
+ tb_add_jump((TranslationBlock *)(next_tb & ~TB_EXIT_MASK),
+ next_tb & TB_EXIT_MASK, tb);
}
spin_unlock(&tb_lock);
@@ -569,10 +570,10 @@ int cpu_exec(CPUArchState *env)
tc_ptr = tb->tc_ptr;
/* execute the generated code */
next_tb = tcg_qemu_tb_exec(env, tc_ptr);
- if ((next_tb & 3) == 2) {
+ if ((next_tb & TB_EXIT_MASK) == TB_EXIT_ICOUNT_EXPIRED) {
/* Instruction counter expired. */
int insns_left;
- tb = (TranslationBlock *)(next_tb & ~3);
+ tb = (TranslationBlock *)(next_tb & ~TB_EXIT_MASK);
/* Restore PC. */
cpu_pc_from_tb(env, tb);
insns_left = env->icount_decr.u32;
diff --git a/gen-icount.h b/gen-icount.h
index 430cb44..3d5c131 100644
--- a/gen-icount.h
+++ b/gen-icount.h
@@ -29,7 +29,7 @@ static void gen_icount_end(TranslationBlock *tb, int num_insns)
if (use_icount) {
*icount_arg = num_insns;
gen_set_label(icount_label);
- tcg_gen_exit_tb((tcg_target_long)tb + 2);
+ tcg_gen_exit_tb((tcg_target_long)tb + TB_EXIT_ICOUNT_EXPIRED);
}
}
diff --git a/tcg/tcg.h b/tcg/tcg.h
index a83bddd..af06eb5 100644
--- a/tcg/tcg.h
+++ b/tcg/tcg.h
@@ -581,7 +581,49 @@ TCGv_i64 tcg_const_local_i64(int64_t val);
extern uint8_t code_gen_prologue[];
-/* TCG targets may use a different definition of tcg_qemu_tb_exec. */
+/**
+ * tcg_qemu_tb_exec:
+ * @env: CPUArchState * for the CPU
+ * @tb_ptr: address of generated code for the TB to execute
+ *
+ * Start executing code from a given translation block.
+ * Where translation blocks have been linked, execution
+ * may proceed from the given TB into successive ones.
+ * Control eventually returns only when some action is needed
+ * from the top-level loop: either control must pass to a TB
+ * which has not yet been directly linked, or an asynchronous
+ * event such as an interrupt needs handling.
+ *
+ * The return value is a pointer to the next TB to execute
+ * (if known; otherwise zero). This pointer is assumed to be
+ * 4-aligned, and the bottom two bits are used to return further
+ * information:
+ * 0, 1: the link between this TB and the next is via the specified
+ * TB index (0 or 1). That is, we left the TB via (the equivalent
+ * of) "goto_tb <index>". The main loop uses this to determine
+ * how to link the TB just executed to the next.
+ * 2: we are using instruction counting code generation, and we
+ * did not start executing this TB because the instruction counter
+ * would hit zero midway through it. In this case the next-TB pointer
+ * returned is the TB we were about to execute, and the caller must
+ * arrange to execute the remaining count of instructions.
+ *
+ * If the bottom two bits indicate an exit-via-index then the CPU
+ * state is correctly synchronised and ready for execution of the next
+ * TB (and in particular the guest PC is the address to execute next).
+ * Otherwise, we gave up on execution of this TB before it started, and
+ * the caller must fix up the CPU state by calling cpu_pc_from_tb()
+ * with the next-TB pointer we return.
+ *
+ * Note that TCG targets may use a different definition of tcg_qemu_tb_exec
+ * to this default (which just calls the prologue.code emitted by
+ * tcg_target_qemu_prologue()).
+ */
+#define TB_EXIT_MASK 3
+#define TB_EXIT_IDX0 0
+#define TB_EXIT_IDX1 1
+#define TB_EXIT_ICOUNT_EXPIRED 2
+
#if !defined(tcg_qemu_tb_exec)
# define tcg_qemu_tb_exec(env, tb_ptr) \
((tcg_target_ulong (*)(void *, void *))code_gen_prologue)(env, tb_ptr)
--
1.7.10.4
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [Qemu-devel] [PATCH 2/4] cpu-exec: wrap tcg_qemu_tb_exec() in a fn to restore the PC
2013-05-09 8:09 ` Michael Tokarev
2013-05-09 8:13 ` [Qemu-devel] [PATCH 1/4] tcg: Document tcg_qemu_tb_exec() and provide constants for low bit uses Michael Tokarev
@ 2013-05-09 8:13 ` Michael Tokarev
2013-05-09 8:13 ` [Qemu-devel] [PATCH 3/4] Handle CPU interrupts by inline checking of a flag Michael Tokarev
` (3 subsequent siblings)
5 siblings, 0 replies; 10+ messages in thread
From: Michael Tokarev @ 2013-05-09 8:13 UTC (permalink / raw)
To: Peter Maydell
Cc: Anthony Liguori, qemu-stable, Michael Tokarev, qemu-devel,
Alexander Graf, Blue Swirl, Paul Brook, Andreas Färber,
Richard Henderson
From: Peter Maydell <peter.maydell@linaro.org>
If tcg_qemu_tb_exec() returns a value whose low bits don't indicate a
link to an indexed next TB, this means that the TB execution never
started (eg because the instruction counter hit zero). In this case the
guest PC has to be reset to the address of the start of the TB.
Refactor the cpu-exec code to make all tcg_qemu_tb_exec() calls pass
through a wrapper function which does this restoration if necessary.
Note that the apparent change in cpu_exec_nocache() from calling
cpu_pc_from_tb() with the old TB to calling it with the TB returned by
do_tcg_qemu_tb_exec() is safe, because in the nocache case we can
guarantee that the TB we try to execute is not linked to any others,
so the only possible returned TB is the one we started at. That is,
we should arguably previously have included in cpu_exec_nocache() an
assert(next_tb & ~TB_EXIT_MASK) == tb), since the API requires restore
from next_tb but we were using tb.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
(cherry picked from commit 77211379d73ea0c89c0b5bb6eee74b17cb06f9a8)
Conflicts:
cpu-exec.c
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
---
cpu-exec.c | 28 +++++++++++++++++-----------
1 file changed, 17 insertions(+), 11 deletions(-)
diff --git a/cpu-exec.c b/cpu-exec.c
index 18da91e..edef121 100644
--- a/cpu-exec.c
+++ b/cpu-exec.c
@@ -51,12 +51,26 @@ void cpu_resume_from_signal(CPUArchState *env, void *puc)
}
#endif
+/* Execute a TB, and fix up the CPU state afterwards if necessary */
+static inline tcg_target_ulong cpu_tb_exec(CPUArchState *env, uint8_t *tb_ptr)
+{
+ tcg_target_ulong next_tb = tcg_qemu_tb_exec(env, tb_ptr);
+ if ((next_tb & TB_EXIT_MASK) > TB_EXIT_IDX1) {
+ /* We didn't start executing this TB (eg because the instruction
+ * counter hit zero); we must restore the guest PC to the address
+ * of the start of the TB.
+ */
+ TranslationBlock *tb = (TranslationBlock *)(next_tb & ~TB_EXIT_MASK);
+ cpu_pc_from_tb(env, tb);
+ }
+ return next_tb;
+}
+
/* Execute the code without caching the generated code. An interpreter
could be used if available. */
static void cpu_exec_nocache(CPUArchState *env, int max_cycles,
TranslationBlock *orig_tb)
{
- tcg_target_ulong next_tb;
TranslationBlock *tb;
/* Should never happen.
@@ -68,14 +82,8 @@ static void cpu_exec_nocache(CPUArchState *env, int max_cycles,
max_cycles);
env->current_tb = tb;
/* execute the generated code */
- next_tb = tcg_qemu_tb_exec(env, tb->tc_ptr);
+ cpu_tb_exec(env, tb->tc_ptr);
env->current_tb = NULL;
-
- if ((next_tb & TB_EXIT_MASK) == TB_EXIT_ICOUNT_EXPIRED) {
- /* Restore PC. This may happen if async event occurs before
- the TB starts executing. */
- cpu_pc_from_tb(env, tb);
- }
tb_phys_invalidate(tb, -1);
tb_free(tb);
}
@@ -569,13 +577,11 @@ int cpu_exec(CPUArchState *env)
if (likely(!env->exit_request)) {
tc_ptr = tb->tc_ptr;
/* execute the generated code */
- next_tb = tcg_qemu_tb_exec(env, tc_ptr);
+ next_tb = cpu_tb_exec(env, tc_ptr);
if ((next_tb & TB_EXIT_MASK) == TB_EXIT_ICOUNT_EXPIRED) {
/* Instruction counter expired. */
int insns_left;
tb = (TranslationBlock *)(next_tb & ~TB_EXIT_MASK);
- /* Restore PC. */
- cpu_pc_from_tb(env, tb);
insns_left = env->icount_decr.u32;
if (env->icount_extra && insns_left >= 0) {
/* Refill decrementer and continue execution. */
--
1.7.10.4
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [Qemu-devel] [PATCH 3/4] Handle CPU interrupts by inline checking of a flag
2013-05-09 8:09 ` Michael Tokarev
2013-05-09 8:13 ` [Qemu-devel] [PATCH 1/4] tcg: Document tcg_qemu_tb_exec() and provide constants for low bit uses Michael Tokarev
2013-05-09 8:13 ` [Qemu-devel] [PATCH 2/4] cpu-exec: wrap tcg_qemu_tb_exec() in a fn to restore the PC Michael Tokarev
@ 2013-05-09 8:13 ` Michael Tokarev
2013-05-09 8:13 ` [Qemu-devel] [PATCH 4/4] translate-all.c: Remove cpu_unlink_tb() Michael Tokarev
` (2 subsequent siblings)
5 siblings, 0 replies; 10+ messages in thread
From: Michael Tokarev @ 2013-05-09 8:13 UTC (permalink / raw)
To: Peter Maydell
Cc: Anthony Liguori, qemu-stable, Michael Tokarev, qemu-devel,
Alexander Graf, Blue Swirl, Paul Brook, Andreas Färber,
Richard Henderson
From: Peter Maydell <peter.maydell@linaro.org>
Fix some of the nasty TCG race conditions and crashes by implementing
cpu_exit() as setting a flag which is checked at the start of each TB.
This avoids crashes if a thread or signal handler calls cpu_exit()
while the execution thread is itself modifying the TB graph (which
may happen in system emulation mode as well as in linux-user mode
with a multithreaded guest binary).
This fixes the crashes seen in LP:668799; however there are another
class of crashes described in LP:1098729 which stem from the fact
that in linux-user with a multithreaded guest all threads will
use and modify the same global TCG date structures (including the
generated code buffer) without any kind of locking. This means that
multithreaded guest binaries are still in the "unsupported"
category.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
(cherry picked from commit 378df4b23753a11be650af7664ca76bc75cb9f01)
Conflicts:
cpu-exec.c
exec.c
include/qom/cpu.h
translate-all.c (original code was in exec.c, changed there)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
---
cpu-defs.h | 1 +
cpu-exec.c | 25 ++++++++++++++++++++++++-
exec.c | 6 +++---
gen-icount.h | 11 +++++++++++
tcg/tcg.h | 5 +++++
5 files changed, 44 insertions(+), 4 deletions(-)
diff --git a/cpu-defs.h b/cpu-defs.h
index f49e950..75c8e40 100644
--- a/cpu-defs.h
+++ b/cpu-defs.h
@@ -173,6 +173,7 @@ typedef struct CPUWatchpoint {
uint32_t halted; /* Nonzero if the CPU is in suspend state */ \
uint32_t interrupt_request; \
volatile sig_atomic_t exit_request; \
+ volatile sig_atomic_t tcg_exit_req; \
CPU_COMMON_TLB \
struct TranslationBlock *tb_jmp_cache[TB_JMP_CACHE_SIZE]; \
/* buffer for temporaries in the code generator */ \
diff --git a/cpu-exec.c b/cpu-exec.c
index edef121..3049e46 100644
--- a/cpu-exec.c
+++ b/cpu-exec.c
@@ -63,6 +63,12 @@ static inline tcg_target_ulong cpu_tb_exec(CPUArchState *env, uint8_t *tb_ptr)
TranslationBlock *tb = (TranslationBlock *)(next_tb & ~TB_EXIT_MASK);
cpu_pc_from_tb(env, tb);
}
+ if ((next_tb & TB_EXIT_MASK) == TB_EXIT_REQUESTED) {
+ /* We were asked to stop executing TBs (probably a pending
+ * interrupt. We've now stopped, so clear the flag.
+ */
+ env->tcg_exit_req = 0;
+ }
return next_tb;
}
@@ -578,7 +584,20 @@ int cpu_exec(CPUArchState *env)
tc_ptr = tb->tc_ptr;
/* execute the generated code */
next_tb = cpu_tb_exec(env, tc_ptr);
- if ((next_tb & TB_EXIT_MASK) == TB_EXIT_ICOUNT_EXPIRED) {
+ switch (next_tb & TB_EXIT_MASK) {
+ case TB_EXIT_REQUESTED:
+ /* Something asked us to stop executing
+ * chained TBs; just continue round the main
+ * loop. Whatever requested the exit will also
+ * have set something else (eg exit_request or
+ * interrupt_request) which we will handle
+ * next time around the loop.
+ */
+ tb = (TranslationBlock *)(next_tb & ~TB_EXIT_MASK);
+ next_tb = 0;
+ break;
+ case TB_EXIT_ICOUNT_EXPIRED:
+ {
/* Instruction counter expired. */
int insns_left;
tb = (TranslationBlock *)(next_tb & ~TB_EXIT_MASK);
@@ -602,6 +621,10 @@ int cpu_exec(CPUArchState *env)
next_tb = 0;
cpu_loop_exit(env);
}
+ break;
+ }
+ default:
+ break;
}
}
env->current_tb = NULL;
diff --git a/exec.c b/exec.c
index 0a67f07..54a37b5 100644
--- a/exec.c
+++ b/exec.c
@@ -1756,7 +1756,7 @@ static void tcg_handle_interrupt(CPUArchState *env, int mask)
cpu_abort(env, "Raised interrupt while not in I/O function");
}
} else {
- cpu_unlink_tb(env);
+ env->tcg_exit_req = 1;
}
}
@@ -1767,7 +1767,7 @@ CPUInterruptHandler cpu_interrupt_handler = tcg_handle_interrupt;
void cpu_interrupt(CPUArchState *env, int mask)
{
env->interrupt_request |= mask;
- cpu_unlink_tb(env);
+ env->tcg_exit_req = 1;
}
#endif /* CONFIG_USER_ONLY */
@@ -1779,7 +1779,7 @@ void cpu_reset_interrupt(CPUArchState *env, int mask)
void cpu_exit(CPUArchState *env)
{
env->exit_request = 1;
- cpu_unlink_tb(env);
+ env->tcg_exit_req = 1;
}
const CPULogItem cpu_log_items[] = {
diff --git a/gen-icount.h b/gen-icount.h
index 3d5c131..8037e53 100644
--- a/gen-icount.h
+++ b/gen-icount.h
@@ -4,10 +4,18 @@
static TCGArg *icount_arg;
static int icount_label;
+static int exitreq_label;
static inline void gen_icount_start(void)
{
TCGv_i32 count;
+ TCGv_i32 flag;
+
+ exitreq_label = gen_new_label();
+ flag = tcg_temp_local_new_i32();
+ tcg_gen_ld_i32(flag, cpu_env, offsetof(CPUArchState, tcg_exit_req));
+ tcg_gen_brcondi_i32(TCG_COND_NE, flag, 0, exitreq_label);
+ tcg_temp_free_i32(flag);
if (!use_icount)
return;
@@ -26,6 +34,9 @@ static inline void gen_icount_start(void)
static void gen_icount_end(TranslationBlock *tb, int num_insns)
{
+ gen_set_label(exitreq_label);
+ tcg_gen_exit_tb((tcg_target_long)tb + TB_EXIT_REQUESTED);
+
if (use_icount) {
*icount_arg = num_insns;
gen_set_label(icount_label);
diff --git a/tcg/tcg.h b/tcg/tcg.h
index af06eb5..9b16add 100644
--- a/tcg/tcg.h
+++ b/tcg/tcg.h
@@ -607,6 +607,10 @@ extern uint8_t code_gen_prologue[];
* would hit zero midway through it. In this case the next-TB pointer
* returned is the TB we were about to execute, and the caller must
* arrange to execute the remaining count of instructions.
+ * 3: we stopped because the CPU's exit_request flag was set
+ * (usually meaning that there is an interrupt that needs to be
+ * handled). The next-TB pointer returned is the TB we were
+ * about to execute when we noticed the pending exit request.
*
* If the bottom two bits indicate an exit-via-index then the CPU
* state is correctly synchronised and ready for execution of the next
@@ -623,6 +627,7 @@ extern uint8_t code_gen_prologue[];
#define TB_EXIT_IDX0 0
#define TB_EXIT_IDX1 1
#define TB_EXIT_ICOUNT_EXPIRED 2
+#define TB_EXIT_REQUESTED 3
#if !defined(tcg_qemu_tb_exec)
# define tcg_qemu_tb_exec(env, tb_ptr) \
--
1.7.10.4
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [Qemu-devel] [PATCH 4/4] translate-all.c: Remove cpu_unlink_tb()
2013-05-09 8:09 ` Michael Tokarev
` (2 preceding siblings ...)
2013-05-09 8:13 ` [Qemu-devel] [PATCH 3/4] Handle CPU interrupts by inline checking of a flag Michael Tokarev
@ 2013-05-09 8:13 ` Michael Tokarev
2013-05-09 9:01 ` [Qemu-devel] [PATCH 0/6] Drop the irredeemably racy cpu_unlink_tb() Peter Maydell
2013-05-09 10:26 ` Andreas Färber
5 siblings, 0 replies; 10+ messages in thread
From: Michael Tokarev @ 2013-05-09 8:13 UTC (permalink / raw)
To: Peter Maydell
Cc: Anthony Liguori, qemu-stable, Michael Tokarev, qemu-devel,
Alexander Graf, Blue Swirl, Paul Brook, Andreas Färber,
Richard Henderson
From: Peter Maydell <peter.maydell@linaro.org>
The (unsafe) function cpu_unlink_tb() is now unused, so we can simply
remove it and any code that was only used by it.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
(cherry picked from commit 3a808cc407744c30daa7470b5f191cde1fbc1aae)
Conflicts:
translate-all.c
(original code was in exec.c, so the patch title is now misleading)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
---
exec.c | 67 ----------------------------------------------------------------
1 file changed, 67 deletions(-)
diff --git a/exec.c b/exec.c
index 54a37b5..fb72d74 100644
--- a/exec.c
+++ b/exec.c
@@ -1421,53 +1421,6 @@ TranslationBlock *tb_find_pc(uintptr_t tc_ptr)
return &tbs[m_max];
}
-static void tb_reset_jump_recursive(TranslationBlock *tb);
-
-static inline void tb_reset_jump_recursive2(TranslationBlock *tb, int n)
-{
- TranslationBlock *tb1, *tb_next, **ptb;
- unsigned int n1;
-
- tb1 = tb->jmp_next[n];
- if (tb1 != NULL) {
- /* find head of list */
- for(;;) {
- n1 = (uintptr_t)tb1 & 3;
- tb1 = (TranslationBlock *)((uintptr_t)tb1 & ~3);
- if (n1 == 2)
- break;
- tb1 = tb1->jmp_next[n1];
- }
- /* we are now sure now that tb jumps to tb1 */
- tb_next = tb1;
-
- /* remove tb from the jmp_first list */
- ptb = &tb_next->jmp_first;
- for(;;) {
- tb1 = *ptb;
- n1 = (uintptr_t)tb1 & 3;
- tb1 = (TranslationBlock *)((uintptr_t)tb1 & ~3);
- if (n1 == n && tb1 == tb)
- break;
- ptb = &tb1->jmp_next[n1];
- }
- *ptb = tb->jmp_next[n];
- tb->jmp_next[n] = NULL;
-
- /* suppress the jump to next tb in generated code */
- tb_reset_jump(tb, n);
-
- /* suppress jumps in the tb on which we could have jumped */
- tb_reset_jump_recursive(tb_next);
- }
-}
-
-static void tb_reset_jump_recursive(TranslationBlock *tb)
-{
- tb_reset_jump_recursive2(tb, 0);
- tb_reset_jump_recursive2(tb, 1);
-}
-
#if defined(TARGET_HAS_ICE)
#if defined(CONFIG_USER_ONLY)
static void breakpoint_invalidate(CPUArchState *env, target_ulong pc)
@@ -1711,26 +1664,6 @@ void cpu_set_log_filename(const char *filename)
cpu_set_log(loglevel);
}
-static void cpu_unlink_tb(CPUArchState *env)
-{
- /* FIXME: TB unchaining isn't SMP safe. For now just ignore the
- problem and hope the cpu will stop of its own accord. For userspace
- emulation this often isn't actually as bad as it sounds. Often
- signals are used primarily to interrupt blocking syscalls. */
- TranslationBlock *tb;
- static spinlock_t interrupt_lock = SPIN_LOCK_UNLOCKED;
-
- spin_lock(&interrupt_lock);
- tb = env->current_tb;
- /* if the cpu is currently executing code, we must unlink it and
- all the potentially executing TB */
- if (tb) {
- env->current_tb = NULL;
- tb_reset_jump_recursive(tb);
- }
- spin_unlock(&interrupt_lock);
-}
-
#ifndef CONFIG_USER_ONLY
/* mask must never be zero, except for A20 change call */
static void tcg_handle_interrupt(CPUArchState *env, int mask)
--
1.7.10.4
^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [Qemu-devel] [PATCH 0/6] Drop the irredeemably racy cpu_unlink_tb()
2013-05-09 8:09 ` Michael Tokarev
` (3 preceding siblings ...)
2013-05-09 8:13 ` [Qemu-devel] [PATCH 4/4] translate-all.c: Remove cpu_unlink_tb() Michael Tokarev
@ 2013-05-09 9:01 ` Peter Maydell
2013-05-09 10:05 ` Michael Tokarev
2013-05-09 10:26 ` Andreas Färber
5 siblings, 1 reply; 10+ messages in thread
From: Peter Maydell @ 2013-05-09 9:01 UTC (permalink / raw)
To: Michael Tokarev
Cc: Anthony Liguori, qemu-stable, qemu-devel, Alexander Graf,
Blue Swirl, Paul Brook, Andreas Färber, Richard Henderson
On 9 May 2013 09:09, Michael Tokarev <mjt@tls.msk.ru> wrote:
> Now, the resulting thing compiles (ha!), but I'm not
> really sure how to test it. I ran a few random apps
> using qemu-i386 and qemu-arm, it appears to work.
You need to test TCG system emulation too, and in
particular something with multiple guest CPU cores.
-- PMM
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [Qemu-devel] [PATCH 0/6] Drop the irredeemably racy cpu_unlink_tb()
2013-05-09 9:01 ` [Qemu-devel] [PATCH 0/6] Drop the irredeemably racy cpu_unlink_tb() Peter Maydell
@ 2013-05-09 10:05 ` Michael Tokarev
0 siblings, 0 replies; 10+ messages in thread
From: Michael Tokarev @ 2013-05-09 10:05 UTC (permalink / raw)
To: Peter Maydell
Cc: Anthony Liguori, qemu-stable, qemu-devel, Alexander Graf,
Blue Swirl, Paul Brook, Andreas Färber, Richard Henderson
09.05.2013 13:01, Peter Maydell wrote:
> On 9 May 2013 09:09, Michael Tokarev <mjt@tls.msk.ru> wrote:
>> Now, the resulting thing compiles (ha!), but I'm not
>> really sure how to test it. I ran a few random apps
>> using qemu-i386 and qemu-arm, it appears to work.
>
> You need to test TCG system emulation too, and in
> particular something with multiple guest CPU cores.
As suggested on IRC, I ran a proposed omp testcase (in
LP:668799) in a armel chroot created by the same
patched qemu-arm-static, with 6 and 60 threads, and
tried running a few java binaries I've found -- that
works well from creating system to building and running
testcases. I also tried the same with qemu-system-i386
(without kvm), installing a system using patched qemu,
and running multi-threaded apps in it.
At least I don't see any obvious new breakages due to
my backport, but I can definitely go further than with
unpatched qemu.
So this is at least promising, so far... ;)
Thanks,
/mjt
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [Qemu-devel] [PATCH 0/6] Drop the irredeemably racy cpu_unlink_tb()
2013-05-09 8:09 ` Michael Tokarev
` (4 preceding siblings ...)
2013-05-09 9:01 ` [Qemu-devel] [PATCH 0/6] Drop the irredeemably racy cpu_unlink_tb() Peter Maydell
@ 2013-05-09 10:26 ` Andreas Färber
5 siblings, 0 replies; 10+ messages in thread
From: Andreas Färber @ 2013-05-09 10:26 UTC (permalink / raw)
To: Michael Tokarev
Cc: Peter Maydell, Anthony Liguori, qemu-stable, qemu-devel,
Alexander Graf, Blue Swirl, Paul Brook, Richard Henderson
Am 09.05.2013 10:09, schrieb Michael Tokarev:
> 22.02.2013 22:09, Peter Maydell wrote:
>> cpu-exec: wrap tcg_qemu_tb_exec() in a fn to restore the PC
>
> This needed a "back merge" of env+cpu states back to cpu.
> Maybe we should somehow revisit the whole concept of the
> two, because it's sorta fun: at some point all functions
> has been converted to accept `cpu' instead of `env', but
> in many places the first thing a function does is to
> get `env' pointer out of `cpu'.
The concept is really easy: There is so much CPU code around that for
many years no one dared to touch it, ;) so changes need to be done
incrementally - not only to identify any fallout! If one function is
converted to no longer rely on env, then rather likely in some caller it
still needs to convert from env -> cpu. Once that caller is converted
too, it goes on moving the conversion "outwards" until the only
remaining env functions are TCG-related. env access from a specific *CPU
type is cheap, thus only talking about common code here. You will find
more background on the big "QOM CPUState part X" series. CPU_COMMON is
definitely not the way for the future, it should be okay for backporting
though in this case.
Regards,
Andreas
--
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2013-05-09 10:26 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <1361556605-21963-1-git-send-email-peter.maydell@linaro.org>
2013-03-03 13:23 ` [Qemu-devel] [PATCH 0/6] Drop the irredeemably racy cpu_unlink_tb() Peter Maydell
2013-03-03 15:50 ` Blue Swirl
2013-05-09 8:09 ` Michael Tokarev
2013-05-09 8:13 ` [Qemu-devel] [PATCH 1/4] tcg: Document tcg_qemu_tb_exec() and provide constants for low bit uses Michael Tokarev
2013-05-09 8:13 ` [Qemu-devel] [PATCH 2/4] cpu-exec: wrap tcg_qemu_tb_exec() in a fn to restore the PC Michael Tokarev
2013-05-09 8:13 ` [Qemu-devel] [PATCH 3/4] Handle CPU interrupts by inline checking of a flag Michael Tokarev
2013-05-09 8:13 ` [Qemu-devel] [PATCH 4/4] translate-all.c: Remove cpu_unlink_tb() Michael Tokarev
2013-05-09 9:01 ` [Qemu-devel] [PATCH 0/6] Drop the irredeemably racy cpu_unlink_tb() Peter Maydell
2013-05-09 10:05 ` Michael Tokarev
2013-05-09 10:26 ` Andreas Färber
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).