* [Qemu-devel] Counting barrier instructions in ARM
@ 2014-10-16 1:45 Pranith Kumar
2014-10-16 1:54 ` Max Filippov
2014-10-16 8:05 ` Peter Maydell
0 siblings, 2 replies; 8+ messages in thread
From: Pranith Kumar @ 2014-10-16 1:45 UTC (permalink / raw)
To: qemu-devel
Hello,
I am trying to count the number of barrier instructions (dmb) which are being
executed in an multi-threaded ARM executable. I am running the executable using
qemu user mode with the following patch applied.
Basically I created two counters in the ARM cpu state and incrementing them by
generating a TCG instruction whenever a barrier instruction is translated. I am
doing something similar even for counting the total instructions executed.
The problem I am facing is that this seems to be crashing when run with a
multi-threaded executable. Also the statistics gathered are not really accurate.
Is there something obviously wrong with what I am trying to do? Any help is
highly appreciated.
Thanks!
---
linux-user/main.c | 10 +++++++++-
target-arm/cpu.h | 2 ++
target-arm/translate.c | 9 +++++++++
3 files changed, 20 insertions(+), 1 deletion(-)
diff --git a/linux-user/main.c b/linux-user/main.c
index b453a39..7984027 100644
--- a/linux-user/main.c
+++ b/linux-user/main.c
@@ -816,7 +816,15 @@ void cpu_loop(CPUARMState *env)
break;
}
} else {
- env->regs[0] = do_syscall(env,
+ // print stats on exit
+ if (n == 248) {
+ FILE *icount_file = fopen("inscount.out", "w");
+ unsigned long total = (unsigned long)env->fence_count;
+ unsigned long icount = (unsigned long)env->insn_count;
+ fprintf(icount_file, "%lu, %lu, %f\n", icount, total, total * 1000.0/icount);
+ fclose(icount_file);
+ }
+ env->regs[0] = do_syscall(env,
n,
env->regs[0],
env->regs[1],
diff --git a/target-arm/cpu.h b/target-arm/cpu.h
index 369d472..be38574 100644
--- a/target-arm/cpu.h
+++ b/target-arm/cpu.h
@@ -304,6 +304,8 @@ typedef struct CPUARMState {
uint64_t exclusive_test;
uint32_t exclusive_info;
#endif
+ uint32_t fence_count;
+ uint32_t insn_count;
/* iwMMXt coprocessor state. */
struct {
diff --git a/target-arm/translate.c b/target-arm/translate.c
index cf4e767..4d4ceb1 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -68,6 +68,8 @@ static TCGv_i64 cpu_exclusive_val;
static TCGv_i64 cpu_exclusive_test;
static TCGv_i32 cpu_exclusive_info;
#endif
+static TCGv_i32 cpu_fence_count;
+static TCGv_i32 cpu_insn_count;
/* FIXME: These should be removed. */
static TCGv_i32 cpu_F0s, cpu_F1s;
@@ -106,6 +108,10 @@ void arm_translate_init(void)
cpu_exclusive_info = tcg_global_mem_new_i32(TCG_AREG0,
offsetof(CPUARMState, exclusive_info), "exclusive_info");
#endif
+ cpu_fence_count = tcg_global_mem_new_i32(TCG_AREG0,
+ offsetof(CPUARMState, fence_count), "fence_count");
+ cpu_insn_count = tcg_global_mem_new_i32(TCG_AREG0,
+ offsetof(CPUARMState, insn_count), "insn_count");
a64_translate_init();
}
@@ -7568,6 +7574,7 @@ static void disas_arm_insn(CPUARMState * env, DisasContext *s)
case 5: /* dmb */
case 6: /* isb */
ARCH(7);
+ tcg_gen_add_i32(cpu_fence_count, cpu_fence_count, 1);
/* We don't emulate caches so these are a no-op. */
return;
default:
@@ -9740,6 +9747,7 @@ static int disas_thumb2_insn(CPUARMState *env, DisasContext *s, uint16_t insn_hw
case 4: /* dsb */
case 5: /* dmb */
case 6: /* isb */
+ tcg_gen_add_i32(cpu_fence_count, cpu_fence_count, 1);
/* These execute as NOPs. */
break;
default:
@@ -11022,6 +11030,7 @@ static inline void gen_intermediate_code_internal(ARMCPU *cpu,
tcg_gen_debug_insn_start(dc->pc);
}
+ tcg_gen_add_i32(cpu_insn_count, cpu_insn_count, 1);
if (dc->thumb) {
disas_thumb_insn(env, dc);
if (dc->condexec_mask) {
--
1.9.1
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [Qemu-devel] Counting barrier instructions in ARM
2014-10-16 1:45 [Qemu-devel] Counting barrier instructions in ARM Pranith Kumar
@ 2014-10-16 1:54 ` Max Filippov
2014-10-16 2:00 ` Pranith Kumar
2014-10-16 8:05 ` Peter Maydell
1 sibling, 1 reply; 8+ messages in thread
From: Max Filippov @ 2014-10-16 1:54 UTC (permalink / raw)
To: Pranith Kumar; +Cc: qemu-devel
Hi,
On Thu, Oct 16, 2014 at 5:45 AM, Pranith Kumar <bobby.prani@gmail.com> wrote:
> Is there something obviously wrong with what I am trying to do? Any help is
> highly appreciated.
> --- a/target-arm/translate.c
> +++ b/target-arm/translate.c
> @@ -7568,6 +7574,7 @@ static void disas_arm_insn(CPUARMState * env, DisasContext *s)
> case 5: /* dmb */
> case 6: /* isb */
> ARCH(7);
> + tcg_gen_add_i32(cpu_fence_count, cpu_fence_count, 1);
tcg_gen_addi_i32, as you're adding an immediate.
Enabling TCG debug would catch such errors.
Not sure if that will fix crashing, but it should improve accuracy (:
> /* We don't emulate caches so these are a no-op. */
> return;
> default:
> @@ -9740,6 +9747,7 @@ static int disas_thumb2_insn(CPUARMState *env, DisasContext *s, uint16_t insn_hw
> case 4: /* dsb */
> case 5: /* dmb */
> case 6: /* isb */
> + tcg_gen_add_i32(cpu_fence_count, cpu_fence_count, 1);
Here as well.
> /* These execute as NOPs. */
> break;
> default:
> @@ -11022,6 +11030,7 @@ static inline void gen_intermediate_code_internal(ARMCPU *cpu,
> tcg_gen_debug_insn_start(dc->pc);
> }
>
> + tcg_gen_add_i32(cpu_insn_count, cpu_insn_count, 1);
And here.
--
Thanks.
-- Max
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [Qemu-devel] Counting barrier instructions in ARM
2014-10-16 1:54 ` Max Filippov
@ 2014-10-16 2:00 ` Pranith Kumar
2014-10-16 2:12 ` Max Filippov
0 siblings, 1 reply; 8+ messages in thread
From: Pranith Kumar @ 2014-10-16 2:00 UTC (permalink / raw)
To: Max Filippov; +Cc: qemu-devel
Hi Max,
On Wed, Oct 15, 2014 at 9:54 PM, Max Filippov <jcmvbkbc@gmail.com> wrote:
> Hi,
>
> On Thu, Oct 16, 2014 at 5:45 AM, Pranith Kumar <bobby.prani@gmail.com> wrote:
>> Is there something obviously wrong with what I am trying to do? Any help is
>> highly appreciated.
>> --- a/target-arm/translate.c
>> +++ b/target-arm/translate.c
>> @@ -7568,6 +7574,7 @@ static void disas_arm_insn(CPUARMState * env, DisasContext *s)
>> case 5: /* dmb */
>> case 6: /* isb */
>> ARCH(7);
>> + tcg_gen_add_i32(cpu_fence_count, cpu_fence_count, 1);
>
> tcg_gen_addi_i32, as you're adding an immediate.
> Enabling TCG debug would catch such errors.
> Not sure if that will fix crashing, but it should improve accuracy (:
I had that before the change to tcg_gen_add_i32 and the crashes were
still there. But yes, this needs to be changed to use addi.
How do I enable the TCG debug build?
Thanks!
>
>> /* We don't emulate caches so these are a no-op. */
>> return;
>> default:
>> @@ -9740,6 +9747,7 @@ static int disas_thumb2_insn(CPUARMState *env, DisasContext *s, uint16_t insn_hw
>> case 4: /* dsb */
>> case 5: /* dmb */
>> case 6: /* isb */
>> + tcg_gen_add_i32(cpu_fence_count, cpu_fence_count, 1);
>
> Here as well.
>
>> /* These execute as NOPs. */
>> break;
>> default:
>> @@ -11022,6 +11030,7 @@ static inline void gen_intermediate_code_internal(ARMCPU *cpu,
>> tcg_gen_debug_insn_start(dc->pc);
>> }
>>
>> + tcg_gen_add_i32(cpu_insn_count, cpu_insn_count, 1);
>
> And here.
>
> --
> Thanks.
> -- Max
--
Pranith
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [Qemu-devel] Counting barrier instructions in ARM
2014-10-16 2:00 ` Pranith Kumar
@ 2014-10-16 2:12 ` Max Filippov
2014-10-16 16:05 ` Pranith Kumar
0 siblings, 1 reply; 8+ messages in thread
From: Max Filippov @ 2014-10-16 2:12 UTC (permalink / raw)
To: Pranith Kumar; +Cc: qemu-devel
On Thu, Oct 16, 2014 at 6:00 AM, Pranith Kumar <bobby.prani@gmail.com> wrote:
>>> + tcg_gen_add_i32(cpu_fence_count, cpu_fence_count, 1);
>>
>> tcg_gen_addi_i32, as you're adding an immediate.
>> Enabling TCG debug would catch such errors.
>> Not sure if that will fix crashing, but it should improve accuracy (:
>
> I had that before the change to tcg_gen_add_i32 and the crashes were
> still there. But yes, this needs to be changed to use addi.
Got backtraces of your crashes?
> How do I enable the TCG debug build?
configure with --enable-debug-tcg swithch.
--
Thanks.
-- Max
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [Qemu-devel] Counting barrier instructions in ARM
2014-10-16 2:12 ` Max Filippov
@ 2014-10-16 16:05 ` Pranith Kumar
0 siblings, 0 replies; 8+ messages in thread
From: Pranith Kumar @ 2014-10-16 16:05 UTC (permalink / raw)
To: Max Filippov; +Cc: qemu-devel
On Wed, Oct 15, 2014 at 10:12 PM, Max Filippov <jcmvbkbc@gmail.com> wrote:
> On Thu, Oct 16, 2014 at 6:00 AM, Pranith Kumar <bobby.prani@gmail.com> wrote:
>>>> + tcg_gen_add_i32(cpu_fence_count, cpu_fence_count, 1);
>>>
>>> tcg_gen_addi_i32, as you're adding an immediate.
>>> Enabling TCG debug would catch such errors.
>>> Not sure if that will fix crashing, but it should improve accuracy (:
>>
>> I had that before the change to tcg_gen_add_i32 and the crashes were
>> still there. But yes, this needs to be changed to use addi.
>
> Got backtraces of your crashes?
>
These bugs seem to be Heisenbugs. As Peter Maydell pointed out,
apparently multi-threaded user mode execution is not supported.
I will try to use system mode.
Thanks!
--
Pranith
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [Qemu-devel] Counting barrier instructions in ARM
2014-10-16 1:45 [Qemu-devel] Counting barrier instructions in ARM Pranith Kumar
2014-10-16 1:54 ` Max Filippov
@ 2014-10-16 8:05 ` Peter Maydell
2014-10-16 16:01 ` Pranith Kumar
1 sibling, 1 reply; 8+ messages in thread
From: Peter Maydell @ 2014-10-16 8:05 UTC (permalink / raw)
To: Pranith Kumar; +Cc: QEMU Developers
On 16 October 2014 03:45, Pranith Kumar <bobby.prani@gmail.com> wrote:
> The problem I am facing is that this seems to be crashing when run with a
> multi-threaded executable.
This is nothing to do with your changes -- user-mode QEMU does not
support multi-threaded guest executables. QEMU may crash, hang,
or stop with an assertion failure, fairly randomly. Don't try
to run multithreaded guests :-)
> Also the statistics gathered are not really accurate.
This will be because you're using add rather than addi,
so you're adding effectively a random number to the count
every time (whatever TCG's "value in temporary 1" happens
to be, I expect).
thanks
-- PMM
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [Qemu-devel] Counting barrier instructions in ARM
2014-10-16 8:05 ` Peter Maydell
@ 2014-10-16 16:01 ` Pranith Kumar
2014-10-17 10:43 ` Alex Bennée
0 siblings, 1 reply; 8+ messages in thread
From: Pranith Kumar @ 2014-10-16 16:01 UTC (permalink / raw)
To: Peter Maydell; +Cc: QEMU Developers
Hi Peter,
On Thu, Oct 16, 2014 at 4:05 AM, Peter Maydell <peter.maydell@linaro.org> wrote:
> On 16 October 2014 03:45, Pranith Kumar <bobby.prani@gmail.com> wrote:
>> The problem I am facing is that this seems to be crashing when run with a
>> multi-threaded executable.
>
> This is nothing to do with your changes -- user-mode QEMU does not
> support multi-threaded guest executables. QEMU may crash, hang,
> or stop with an assertion failure, fairly randomly. Don't try
> to run multithreaded guests :-)
OK, I will try to gather the statistics in system mode. Is there any
way to indicate from within the system to qemu to start collecting the stats?
I dont want to collect the stats for bootup and other unrelated code paths.
>
>> Also the statistics gathered are not really accurate.
>
> This will be because you're using add rather than addi,
> so you're adding effectively a random number to the count
> every time (whatever TCG's "value in temporary 1" happens
> to be, I expect).
OK, I updated this when Max Filippov pointed it out. Thanks!
--
Pranith
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [Qemu-devel] Counting barrier instructions in ARM
2014-10-16 16:01 ` Pranith Kumar
@ 2014-10-17 10:43 ` Alex Bennée
0 siblings, 0 replies; 8+ messages in thread
From: Alex Bennée @ 2014-10-17 10:43 UTC (permalink / raw)
To: Pranith Kumar; +Cc: Peter Maydell, QEMU Developers
Pranith Kumar <bobby.prani@gmail.com> writes:
> Hi Peter,
>
> On Thu, Oct 16, 2014 at 4:05 AM, Peter Maydell <peter.maydell@linaro.org> wrote:
>> On 16 October 2014 03:45, Pranith Kumar <bobby.prani@gmail.com> wrote:
>>> The problem I am facing is that this seems to be crashing when run with a
>>> multi-threaded executable.
>>
>> This is nothing to do with your changes -- user-mode QEMU does not
>> support multi-threaded guest executables. QEMU may crash, hang,
>> or stop with an assertion failure, fairly randomly. Don't try
>> to run multithreaded guests :-)
>
> OK, I will try to gather the statistics in system mode. Is there any
> way to indicate from within the system to qemu to start collecting the stats?
>
> I dont want to collect the stats for bootup and other unrelated code
> paths.
You could enable CONTEXTIDR in the kernel and then make your counts
against an array indexed by it. I've used that technique before to
profile guest hot-blocks and work out which guest PID was responsible.
--
Alex Bennée
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2014-10-17 10:42 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-10-16 1:45 [Qemu-devel] Counting barrier instructions in ARM Pranith Kumar
2014-10-16 1:54 ` Max Filippov
2014-10-16 2:00 ` Pranith Kumar
2014-10-16 2:12 ` Max Filippov
2014-10-16 16:05 ` Pranith Kumar
2014-10-16 8:05 ` Peter Maydell
2014-10-16 16:01 ` Pranith Kumar
2014-10-17 10:43 ` Alex Bennée
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).