* [Qemu-devel] [PATCH v2 1/3] trace: teach lttng backend to use format strings
2014-07-15 11:42 [Qemu-devel] [PATCH v2 0/3] some TCG related trace patches Alex Bennée
@ 2014-07-15 11:42 ` Alex Bennée
2014-08-01 9:05 ` Alex Bennée
2014-07-15 11:42 ` [Qemu-devel] [PATCH v2 2/3] trace: add some tcg tracing support Alex Bennée
2014-07-15 11:42 ` [Qemu-devel] [PATCH v2 3/3] trace: instrument and trace tcg tb flush activity Alex Bennée
2 siblings, 1 reply; 16+ messages in thread
From: Alex Bennée @ 2014-07-15 11:42 UTC (permalink / raw)
To: stefanha; +Cc: Alex Bennée, qemu-devel, mohamad.gebai
This makes the UST backend pay attention to the format specifiers in
each event's format string when defining payload data. With this you
can now ensure integers are reported in hex if you want.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
---
v2
- remove silly debug statements
v3
- fix spelling
- rebase to latest tracetool
diff --git a/scripts/tracetool/__init__.py b/scripts/tracetool/__init__.py
index eccf552..0a65d6a 100644
--- a/scripts/tracetool/__init__.py
+++ b/scripts/tracetool/__init__.py
@@ -122,13 +122,16 @@ class Event(object):
Properties of the event.
args : Arguments
The event arguments.
+ arg_fmts : str
+ The format strings for each argument.
"""
_CRE = re.compile("((?P<props>.*)\s+)?(?P<name>[^(\s]+)\((?P<args>[^)]*)\)\s*(?P<fmt>\".*)?")
+ _FMT = re.compile("(%\w+|%.*PRI\S+)")
_VALID_PROPS = set(["disable"])
- def __init__(self, name, props, fmt, args):
+ def __init__(self, name, props, fmt, args, arg_fmts):
"""
Parameters
----------
@@ -140,11 +143,15 @@ class Event(object):
Event printing format.
args : Arguments
Event arguments.
+ arg_fmts : list of str
+ Format strings for each argument.
+
"""
self.name = name
self.properties = props
self.fmt = fmt
self.args = args
+ self.arg_fmts = arg_fmts
unknown_props = set(self.properties) - self._VALID_PROPS
if len(unknown_props) > 0:
@@ -173,8 +180,9 @@ class Event(object):
props = groups["props"].split()
fmt = groups["fmt"]
args = Arguments.build(groups["args"])
+ arg_fmts = Event._FMT.findall(fmt)
- return Event(name, props, fmt, args)
+ return Event(name, props, fmt, args, arg_fmts)
def __repr__(self):
"""Evaluable string representation for this object."""
diff --git a/scripts/tracetool/format/ust_events_h.py b/scripts/tracetool/format/ust_events_h.py
index 5102565..d189899 100644
--- a/scripts/tracetool/format/ust_events_h.py
+++ b/scripts/tracetool/format/ust_events_h.py
@@ -63,13 +63,20 @@ def generate(events, backend):
name=e.name,
args=", ".join(", ".join(i) for i in e.args))
- for t, n in e.args:
- if ('int' in t) or ('long' in t) or ('unsigned' in t) or ('size_t' in t):
+ types = e.args.types()
+ names = e.args.names()
+ fmts = e.arg_fmts
+ for t,n,f in zip(types, names, fmts):
+ if ('char *' in t) or ('char*' in t):
+ out(' ctf_string(' + n + ', ' + n + ')')
+ elif ("%p" in f) or ("x" in f) or ("PRIx" in f):
+ out(' ctf_integer_hex('+ t + ', ' + n + ', ' + n + ')')
+ elif ("ptr" in t) or ("*" in t):
+ out(' ctf_integer_hex('+ t + ', ' + n + ', ' + n + ')')
+ elif ('int' in t) or ('long' in t) or ('unsigned' in t) or ('size_t' in t):
out(' ctf_integer(' + t + ', ' + n + ', ' + n + ')')
elif ('double' in t) or ('float' in t):
out(' ctf_float(' + t + ', ' + n + ', ' + n + ')')
- elif ('char *' in t) or ('char*' in t):
- out(' ctf_string(' + n + ', ' + n + ')')
elif ('void *' in t) or ('void*' in t):
out(' ctf_integer_hex(unsigned long, ' + n + ', ' + n + ')')
--
2.0.1
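
The extraction and dispatch in this patch can be tried outside tracetool.
What follows is a minimal sketch, not the tracetool code itself: the
pattern is copied from Event._FMT, while ctf_macro() is a hypothetical
helper that mirrors the order of checks in ust_events_h.py and returns
the macro text instead of calling out().

```python
import re

# Same pattern as Event._FMT in the patch above.
FMT = re.compile(r"(%\w+|%.*PRI\S+)")


def ctf_macro(c_type, name, fmt):
    """Hypothetical helper: pick a CTF field macro in the patch's check order."""
    if "char *" in c_type or "char*" in c_type:
        return "ctf_string(%s, %s)" % (name, name)
    if "%p" in fmt or "x" in fmt or "PRIx" in fmt:
        return "ctf_integer_hex(%s, %s, %s)" % (c_type, name, name)
    if "ptr" in c_type or "*" in c_type:
        return "ctf_integer_hex(%s, %s, %s)" % (c_type, name, name)
    if any(k in c_type for k in ("int", "long", "unsigned", "size_t")):
        return "ctf_integer(%s, %s, %s)" % (c_type, name, name)
    if "double" in c_type or "float" in c_type:
        return "ctf_float(%s, %s, %s)" % (c_type, name, name)
    return None


# Event: exec_tb(void *tb, uintptr_t pc) "tb:%p pc=0x%x"
args = [("void *", "tb"), ("uintptr_t", "pc")]
fmts = FMT.findall('"tb:%p pc=0x%x"')
fields = [ctf_macro(t, n, f) for (t, n), f in zip(args, fmts)]
for field in fields:
    print(field)

# Two PRI macros in one format string: the greedy ".*" spans both of them.
merged = FMT.findall('"mr %p addr %#"PRIx64" value %#"PRIx64" size %u"')
print(merged)
```

In this sketch the memory_region_ops_read format string yields three
matches for four arguments, so a zip() over types, names and formats
would stop before the last argument. That may be worth checking against
the generated header.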
* [Qemu-devel] [PATCH v2 2/3] trace: add some tcg tracing support
2014-07-15 11:42 [Qemu-devel] [PATCH v2 0/3] some TCG related trace patches Alex Bennée
2014-07-15 11:42 ` [Qemu-devel] [PATCH v2 1/3] trace: teach lttng backend to use format strings Alex Bennée
@ 2014-07-15 11:42 ` Alex Bennée
2014-07-15 11:42 ` [Qemu-devel] [PATCH v2 3/3] trace: instrument and trace tcg tb flush activity Alex Bennée
2 siblings, 0 replies; 16+ messages in thread
From: Alex Bennée @ 2014-07-15 11:42 UTC (permalink / raw)
To: stefanha; +Cc: Alex Bennée, qemu-devel, mohamad.gebai
This adds a couple of tcg specific trace-events which are useful for
tracing execution through tcg generated blocks. It's been tested with
lttng user space tracing but is generic enough for all systems. The tcg
events are:
* translate_block - when a subject block is translated
* exec_tb - when a translated block is entered
* exec_tb_exit - when we exit the translated code
* exec_tb_nocache - special case translations
Of course we can only trace the entrance to the first block of a chain
as each block will jump directly to the next when it can. See the -d
nochain patch to allow more complete tracing at the expense of
performance.
---
v2
- rebase
diff --git a/cpu-exec.c b/cpu-exec.c
index 38e5f02..45ef77b 100644
--- a/cpu-exec.c
+++ b/cpu-exec.c
@@ -18,6 +18,7 @@
*/
#include "config.h"
#include "cpu.h"
+#include "trace.h"
#include "disas/disas.h"
#include "tcg.h"
#include "qemu/atomic.h"
@@ -65,6 +66,9 @@ static inline tcg_target_ulong cpu_tb_exec(CPUState *cpu, uint8_t *tb_ptr)
#endif /* DEBUG_DISAS */
next_tb = tcg_qemu_tb_exec(env, tb_ptr);
+ trace_exec_tb_exit( (void *) (next_tb & ~TB_EXIT_MASK),
+ next_tb & TB_EXIT_MASK);
+
if ((next_tb & TB_EXIT_MASK) > TB_EXIT_IDX1) {
/* We didn't start executing this TB (eg because the instruction
* counter hit zero); we must restore the guest PC to the address
@@ -105,6 +109,7 @@ static void cpu_exec_nocache(CPUArchState *env, int max_cycles,
max_cycles);
cpu->current_tb = tb;
/* execute the generated code */
+ trace_exec_tb_nocache(tb, tb->pc);
cpu_tb_exec(cpu, tb->tc_ptr);
cpu->current_tb = NULL;
tb_phys_invalidate(tb, -1);
@@ -637,6 +642,7 @@ int cpu_exec(CPUArchState *env)
cpu->current_tb = tb;
barrier();
if (likely(!cpu->exit_request)) {
+ trace_exec_tb(tb, tb->pc);
tc_ptr = tb->tc_ptr;
/* execute the generated code */
next_tb = cpu_tb_exec(cpu, tc_ptr);
diff --git a/trace-events b/trace-events
index 709de68..f8cc35f 100644
--- a/trace-events
+++ b/trace-events
@@ -1237,6 +1237,14 @@ kvm_failed_spr_get(int str, const char *msg) "Warning: Unable to retrieve SPR %d
kvm_failed_reg_get(uint64_t id, const char *msg) "Warning: Unable to retrieve ONEREG %" PRIu64 " from KVM: %s"
kvm_failed_reg_set(uint64_t id, const char *msg) "Warning: Unable to set ONEREG %" PRIu64 " to KVM: %s"
+# cpu-exec.c
+exec_tb(void *tb, uintptr_t pc) "tb:%p pc=0x%x"
+exec_tb_nocache(void *tb, uintptr_t pc) "tb:%p pc=0x%x"
+exec_tb_exit(void *next_tb, unsigned int flags) "tb:%p flags=%x"
+
+# translate-all.c
+translate_block(void *tb, uintptr_t pc, uint8_t *tb_code) "tb:%p, pc:0x%x, tb_code:%p"
+
# memory.c
memory_region_ops_read(void *mr, uint64_t addr, uint64_t value, unsigned size) "mr %p addr %#"PRIx64" value %#"PRIx64" size %u"
memory_region_ops_write(void *mr, uint64_t addr, uint64_t value, unsigned size) "mr %p addr %#"PRIx64" value %#"PRIx64" size %u"
diff --git a/translate-all.c b/translate-all.c
index 11d3f28..a11c083 100644
--- a/translate-all.c
+++ b/translate-all.c
@@ -34,6 +34,7 @@
#include "qemu-common.h"
#define NO_CPU_IO_DEFS
#include "cpu.h"
+#include "trace.h"
#include "disas/disas.h"
#include "tcg.h"
#if defined(CONFIG_USER_ONLY)
@@ -177,6 +178,8 @@ int cpu_gen_code(CPUArchState *env, TranslationBlock *tb, int *gen_code_size_ptr
gen_intermediate_code(env, tb);
+ trace_translate_block(tb, tb->pc, tb->tc_ptr);
+
/* generate machine code */
gen_code_buf = tb->tc_ptr;
tb->tb_next_offset[0] = 0xffff;
--
2.0.1
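
The exec_tb_exit tracepoint reports the value returned by
tcg_qemu_tb_exec() as two fields: the block pointer with the low bits
cleared, and the low bits themselves as exit flags. A minimal sketch of
that split, assuming a two-bit mask (the value 3 here is an assumption
for illustration; the real TB_EXIT_MASK comes from QEMU's headers) and a
hypothetical helper name:

```python
# Assumed value for illustration only; see QEMU's headers for the real one.
TB_EXIT_MASK = 3


def split_next_tb(next_tb):
    """Return (tb_address, exit_flags) from a tagged next_tb value."""
    return next_tb & ~TB_EXIT_MASK, next_tb & TB_EXIT_MASK


tb_addr, flags = split_next_tb(0x7F0012345672)
print("tb:%#x flags=%x" % (tb_addr, flags))
```

The same two expressions appear as the arguments to trace_exec_tb_exit()
in the cpu-exec.c hunk.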
* [Qemu-devel] [PATCH v2 3/3] trace: instrument and trace tcg tb flush activity
2014-07-15 11:42 [Qemu-devel] [PATCH v2 0/3] some TCG related trace patches Alex Bennée
2014-07-15 11:42 ` [Qemu-devel] [PATCH v2 1/3] trace: teach lttng backend to use format strings Alex Bennée
2014-07-15 11:42 ` [Qemu-devel] [PATCH v2 2/3] trace: add some tcg tracing support Alex Bennée
@ 2014-07-15 11:42 ` Alex Bennée
2014-07-15 12:15 ` Andreas Färber
` (2 more replies)
2 siblings, 3 replies; 16+ messages in thread
From: Alex Bennée @ 2014-07-15 11:42 UTC (permalink / raw)
To: stefanha; +Cc: Alex Bennée, qemu-devel, Andreas Färber, mohamad.gebai
The tb_find_fast path is important for quickly moving from one block to
the next. However we need to flush it when tlb changes occur so it's
important to know how well we are doing with the cache.
This patch adds some basic hit/miss profiling to the tb_find_fast
tracepoint as well as a number of other tb_ related areas. I've also
added a trace_inc_counter() helper which gets inlined away when tracing
is disabled.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
diff --git a/cpu-exec.c b/cpu-exec.c
index 45ef77b..771272f 100644
--- a/cpu-exec.c
+++ b/cpu-exec.c
@@ -187,7 +187,10 @@ static inline TranslationBlock *tb_find_fast(CPUArchState *env)
tb = cpu->tb_jmp_cache[tb_jmp_cache_hash_func(pc)];
if (unlikely(!tb || tb->pc != pc || tb->cs_base != cs_base ||
tb->flags != flags)) {
+ trace_inc_counter(&cpu->tb_jmp_cache_stats.misses);
tb = tb_find_slow(env, pc, cs_base, flags);
+ } else {
+ trace_inc_counter(&cpu->tb_jmp_cache_stats.hits);
}
return tb;
}
diff --git a/cputlb.c b/cputlb.c
index 7bd3573..672656a 100644
--- a/cputlb.c
+++ b/cputlb.c
@@ -58,7 +58,7 @@ void tlb_flush(CPUState *cpu, int flush_global)
cpu->current_tb = NULL;
memset(env->tlb_table, -1, sizeof(env->tlb_table));
- memset(cpu->tb_jmp_cache, 0, sizeof(cpu->tb_jmp_cache));
+ tb_flush_all_jmp_cache(cpu);
env->tlb_flush_addr = -1;
env->tlb_flush_mask = 0;
diff --git a/include/qom/cpu.h b/include/qom/cpu.h
index df977c8..8376678 100644
--- a/include/qom/cpu.h
+++ b/include/qom/cpu.h
@@ -243,6 +243,10 @@ struct CPUState {
void *env_ptr; /* CPUArchState */
struct TranslationBlock *current_tb;
struct TranslationBlock *tb_jmp_cache[TB_JMP_CACHE_SIZE];
+ struct {
+ int hits;
+ int misses;
+ } tb_jmp_cache_stats;
struct GDBRegisterState *gdb_regs;
int gdb_num_regs;
int gdb_num_g_regs;
@@ -584,6 +588,15 @@ void cpu_exit(CPUState *cpu);
*/
void cpu_resume(CPUState *cpu);
+
+/**
+ * tb_flush_all_jmp_cache:
+ * @cpu: The CPU jmp cache to flush
+ *
+ * Flush all the entries from the cpu fast jump cache
+ */
+void tb_flush_all_jmp_cache(CPUState *cpu);
+
/**
* qemu_init_vcpu:
* @cpu: The vCPU to initialize.
diff --git a/include/trace.h b/include/trace.h
index c15f498..7a9c0dc 100644
--- a/include/trace.h
+++ b/include/trace.h
@@ -3,4 +3,14 @@
#include "trace/generated-tracers.h"
+#ifndef CONFIG_TRACE_NOP
+static inline void trace_inc_counter(int *counter) {
+ int cnt = *counter;
+ cnt++;
+ *counter = cnt;
+}
+#else
+static inline void trace_inc_counter(int *counter) { /* do nothing */ }
+#endif
+
#endif /* TRACE_H */
diff --git a/qom/cpu.c b/qom/cpu.c
index fada2d4..956b36d 100644
--- a/qom/cpu.c
+++ b/qom/cpu.c
@@ -244,7 +244,7 @@ static void cpu_common_reset(CPUState *cpu)
cpu->icount_extra = 0;
cpu->icount_decr.u32 = 0;
cpu->can_do_io = 0;
- memset(cpu->tb_jmp_cache, 0, TB_JMP_CACHE_SIZE * sizeof(void *));
+ tb_flush_all_jmp_cache(cpu);
}
static bool cpu_common_has_work(CPUState *cs)
diff --git a/trace-events b/trace-events
index f8cc35f..5a58a11 100644
--- a/trace-events
+++ b/trace-events
@@ -1244,6 +1244,9 @@ exec_tb_exit(void *next_tb, unsigned int flags) "tb:%p flags=%x"
# translate-all.c
translate_block(void *tb, uintptr_t pc, uint8_t *tb_code) "tb:%p, pc:0x%x, tb_code:%p"
+tb_flush(void) ""
+tb_flush_jump_cache(uintptr_t pc) "pc:0x%x"
+tb_flush_all_jump_cache(int hits, int misses) "hits:%d misses:%d"
# memory.c
memory_region_ops_read(void *mr, uint64_t addr, uint64_t value, unsigned size) "mr %p addr %#"PRIx64" value %#"PRIx64" size %u"
diff --git a/translate-all.c b/translate-all.c
index a11c083..8e7bbcc 100644
--- a/translate-all.c
+++ b/translate-all.c
@@ -714,12 +714,22 @@ static void page_flush_tb(void)
}
}
+void tb_flush_all_jmp_cache(CPUState *cpu)
+{
+ trace_tb_flush_all_jump_cache(cpu->tb_jmp_cache_stats.hits,
+ cpu->tb_jmp_cache_stats.misses);
+ memset(cpu->tb_jmp_cache, 0, sizeof(cpu->tb_jmp_cache));
+ memset((void *) &cpu->tb_jmp_cache_stats, 0, sizeof(cpu->tb_jmp_cache_stats));
+}
+
/* flush all the translation blocks */
/* XXX: tb_flush is currently not thread safe */
void tb_flush(CPUArchState *env1)
{
CPUState *cpu = ENV_GET_CPU(env1);
+ trace_tb_flush();
+
#if defined(DEBUG_FLUSH)
printf("qemu: flush code_size=%ld nb_tbs=%d avg_tb_size=%ld\n",
(unsigned long)(tcg_ctx.code_gen_ptr - tcg_ctx.code_gen_buffer),
@@ -734,7 +744,7 @@ void tb_flush(CPUArchState *env1)
tcg_ctx.tb_ctx.nb_tbs = 0;
CPU_FOREACH(cpu) {
- memset(cpu->tb_jmp_cache, 0, sizeof(cpu->tb_jmp_cache));
+ tb_flush_all_jmp_cache(cpu);
}
memset(tcg_ctx.tb_ctx.tb_phys_hash, 0, sizeof(tcg_ctx.tb_ctx.tb_phys_hash));
@@ -1520,6 +1530,8 @@ void tb_flush_jmp_cache(CPUState *cpu, target_ulong addr)
i = tb_jmp_cache_hash_page(addr);
memset(&cpu->tb_jmp_cache[i], 0,
TB_JMP_PAGE_SIZE * sizeof(TranslationBlock *));
+
+ trace_tb_flush_jump_cache(addr);
}
void dump_exec_info(FILE *f, fprintf_function cpu_fprintf)
--
2.0.1
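
The accounting this patch adds can be modelled in a few lines. This is a
toy sketch rather than QEMU code: JumpCacheModel and its methods are
hypothetical names, the modulo hash stands in for
tb_jmp_cache_hash_func(), and refilling the slot stands in for
tb_find_slow(). It shows the counters being reported and then reset on a
full flush, as tb_flush_all_jmp_cache() does.

```python
class JumpCacheModel:
    """Toy direct-mapped cache keyed by pc, with hit/miss counters."""

    def __init__(self, size=4096):
        self.size = size
        self.entries = [None] * size
        self.hits = 0
        self.misses = 0

    def find_fast(self, pc):
        slot = pc % self.size            # stand-in for the real hash function
        if self.entries[slot] != pc:
            self.misses += 1
            self.entries[slot] = pc      # stand-in for the slow-path refill
        else:
            self.hits += 1
        return self.entries[slot]

    def flush_all(self):
        """Report the counters, then clear both the cache and the counters."""
        stats = (self.hits, self.misses)
        self.entries = [None] * self.size
        self.hits = self.misses = 0
        return stats


cache = JumpCacheModel(size=8)
for pc in [0x100, 0x104, 0x100, 0x104, 0x108]:
    cache.find_fast(pc)
stats = cache.flush_all()
print("hits:%d misses:%d" % stats)
```

Here 0x108 maps to the same slot as 0x100 and evicts it, so the run ends
with two hits and three misses before the flush zeroes the counters.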
* Re: [Qemu-devel] [PATCH v2 3/3] trace: instrument and trace tcg tb flush activity
2014-07-15 11:42 ` [Qemu-devel] [PATCH v2 3/3] trace: instrument and trace tcg tb flush activity Alex Bennée
@ 2014-07-15 12:15 ` Andreas Färber
2014-07-15 13:12 ` Alex Bennée
2014-07-15 12:23 ` Peter Maydell
2014-07-15 13:19 ` Paolo Bonzini
2 siblings, 1 reply; 16+ messages in thread
From: Andreas Färber @ 2014-07-15 12:15 UTC (permalink / raw)
To: Alex Bennée; +Cc: qemu-devel, stefanha, mohamad.gebai
Hi,
On 15.07.2014 13:42, Alex Bennée wrote:
> The tb_find_fast path is important for quickly moving from one block to
> the next. However we need to flush it when tlb changes occur so it's
> important to know how well we are doing with the cache.
>
> This patch adds some basic hit/miss profiling to the tb_find_fast
> tracepoint as well as a number of other tb_ related areas. I've also
> added a trace_inc_counter() helper which gets inlined away when tracing
> is disabled.
>
> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
>
> diff --git a/cpu-exec.c b/cpu-exec.c
> index 45ef77b..771272f 100644
> --- a/cpu-exec.c
> +++ b/cpu-exec.c
> @@ -187,7 +187,10 @@ static inline TranslationBlock *tb_find_fast(CPUArchState *env)
> tb = cpu->tb_jmp_cache[tb_jmp_cache_hash_func(pc)];
> if (unlikely(!tb || tb->pc != pc || tb->cs_base != cs_base ||
> tb->flags != flags)) {
> + trace_inc_counter(&cpu->tb_jmp_cache_stats.misses);
> tb = tb_find_slow(env, pc, cs_base, flags);
> + } else {
> + trace_inc_counter(&cpu->tb_jmp_cache_stats.hits);
> }
> return tb;
> }
> diff --git a/cputlb.c b/cputlb.c
> index 7bd3573..672656a 100644
> --- a/cputlb.c
> +++ b/cputlb.c
> @@ -58,7 +58,7 @@ void tlb_flush(CPUState *cpu, int flush_global)
> cpu->current_tb = NULL;
>
> memset(env->tlb_table, -1, sizeof(env->tlb_table));
> - memset(cpu->tb_jmp_cache, 0, sizeof(cpu->tb_jmp_cache));
> + tb_flush_all_jmp_cache(cpu);
>
> env->tlb_flush_addr = -1;
> env->tlb_flush_mask = 0;
> diff --git a/include/qom/cpu.h b/include/qom/cpu.h
> index df977c8..8376678 100644
> --- a/include/qom/cpu.h
> +++ b/include/qom/cpu.h
> @@ -243,6 +243,10 @@ struct CPUState {
> void *env_ptr; /* CPUArchState */
> struct TranslationBlock *current_tb;
> struct TranslationBlock *tb_jmp_cache[TB_JMP_CACHE_SIZE];
> + struct {
> + int hits;
> + int misses;
Is anything else going to be added here? If not, the indentation can be
dropped.
> + } tb_jmp_cache_stats;
This is lacking documentation. Should be trivial to add for this field
(not here, above the struct). To document the subfields we may need to
name the struct.
> struct GDBRegisterState *gdb_regs;
> int gdb_num_regs;
> int gdb_num_g_regs;
> @@ -584,6 +588,15 @@ void cpu_exit(CPUState *cpu);
> */
> void cpu_resume(CPUState *cpu);
>
> +
> +/**
> + * tb_flush_all_jmp_cache:
> + * @cpu: The CPU jmp cache to flush
> + *
> + * Flush all the entries from the cpu fast jump cache
"CPU" for consistency
> + */
> +void tb_flush_all_jmp_cache(CPUState *cpu);
> +
> /**
> * qemu_init_vcpu:
> * @cpu: The vCPU to initialize.
> diff --git a/include/trace.h b/include/trace.h
> index c15f498..7a9c0dc 100644
> --- a/include/trace.h
> +++ b/include/trace.h
> @@ -3,4 +3,14 @@
>
> #include "trace/generated-tracers.h"
>
> +#ifndef CONFIG_TRACE_NOP
> +static inline void trace_inc_counter(int *counter) {
> + int cnt = *counter;
> + cnt++;
> + *counter = cnt;
> +}
> +#else
> +static inline void trace_inc_counter(int *counter) { /* do nothing */ }
> +#endif
> +
> #endif /* TRACE_H */
Coding Style issues with the first function. For simplicity just keep
the first implementation but with the proper brace placement, and then
just put the #ifdef into the function body. That avoids the signatures
getting out of sync.
> diff --git a/qom/cpu.c b/qom/cpu.c
> index fada2d4..956b36d 100644
> --- a/qom/cpu.c
> +++ b/qom/cpu.c
> @@ -244,7 +244,7 @@ static void cpu_common_reset(CPUState *cpu)
> cpu->icount_extra = 0;
> cpu->icount_decr.u32 = 0;
> cpu->can_do_io = 0;
> - memset(cpu->tb_jmp_cache, 0, TB_JMP_CACHE_SIZE * sizeof(void *));
> + tb_flush_all_jmp_cache(cpu);
> }
>
> static bool cpu_common_has_work(CPUState *cs)
> diff --git a/trace-events b/trace-events
> index f8cc35f..5a58a11 100644
> --- a/trace-events
> +++ b/trace-events
> @@ -1244,6 +1244,9 @@ exec_tb_exit(void *next_tb, unsigned int flags) "tb:%p flags=%x"
>
> # translate-all.c
> translate_block(void *tb, uintptr_t pc, uint8_t *tb_code) "tb:%p, pc:0x%x, tb_code:%p"
> +tb_flush(void) ""
> +tb_flush_jump_cache(uintptr_t pc) "pc:0x%x"
> +tb_flush_all_jump_cache(int hits, int misses) "hits:%d misses:%d"
>
> # memory.c
> memory_region_ops_read(void *mr, uint64_t addr, uint64_t value, unsigned size) "mr %p addr %#"PRIx64" value %#"PRIx64" size %u"
> diff --git a/translate-all.c b/translate-all.c
> index a11c083..8e7bbcc 100644
> --- a/translate-all.c
> +++ b/translate-all.c
> @@ -714,12 +714,22 @@ static void page_flush_tb(void)
> }
> }
>
> +void tb_flush_all_jmp_cache(CPUState *cpu)
> +{
> + trace_tb_flush_all_jump_cache(cpu->tb_jmp_cache_stats.hits,
> + cpu->tb_jmp_cache_stats.misses);
> + memset(cpu->tb_jmp_cache, 0, sizeof(cpu->tb_jmp_cache));
> + memset((void *) &cpu->tb_jmp_cache_stats, 0, sizeof(cpu->tb_jmp_cache_stats));
> +}
> +
> /* flush all the translation blocks */
> /* XXX: tb_flush is currently not thread safe */
> void tb_flush(CPUArchState *env1)
> {
> CPUState *cpu = ENV_GET_CPU(env1);
>
> + trace_tb_flush();
> +
> #if defined(DEBUG_FLUSH)
> printf("qemu: flush code_size=%ld nb_tbs=%d avg_tb_size=%ld\n",
> (unsigned long)(tcg_ctx.code_gen_ptr - tcg_ctx.code_gen_buffer),
> @@ -734,7 +744,7 @@ void tb_flush(CPUArchState *env1)
> tcg_ctx.tb_ctx.nb_tbs = 0;
>
> CPU_FOREACH(cpu) {
> - memset(cpu->tb_jmp_cache, 0, sizeof(cpu->tb_jmp_cache));
> + tb_flush_all_jmp_cache(cpu);
> }
>
> memset(tcg_ctx.tb_ctx.tb_phys_hash, 0, sizeof(tcg_ctx.tb_ctx.tb_phys_hash));
> @@ -1520,6 +1530,8 @@ void tb_flush_jmp_cache(CPUState *cpu, target_ulong addr)
> i = tb_jmp_cache_hash_page(addr);
> memset(&cpu->tb_jmp_cache[i], 0,
> TB_JMP_PAGE_SIZE * sizeof(TranslationBlock *));
Can this one be dropped, too?
> +
> + trace_tb_flush_jump_cache(addr);
> }
>
> void dump_exec_info(FILE *f, fprintf_function cpu_fprintf)
Cheers,
Andreas
--
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg
* Re: [Qemu-devel] [PATCH v2 3/3] trace: instrument and trace tcg tb flush activity
2014-07-15 12:15 ` Andreas Färber
@ 2014-07-15 13:12 ` Alex Bennée
0 siblings, 0 replies; 16+ messages in thread
From: Alex Bennée @ 2014-07-15 13:12 UTC (permalink / raw)
To: Andreas Färber; +Cc: qemu-devel, stefanha, mohamad.gebai
Andreas Färber writes:
> Hi,
>
> On 15.07.2014 13:42, Alex Bennée wrote:
<snip>
>> index df977c8..8376678 100644
>> --- a/include/qom/cpu.h
>> +++ b/include/qom/cpu.h
>> @@ -243,6 +243,10 @@ struct CPUState {
>> void *env_ptr; /* CPUArchState */
>> struct TranslationBlock *current_tb;
>> struct TranslationBlock *tb_jmp_cache[TB_JMP_CACHE_SIZE];
>> + struct {
>> + int hits;
>> + int misses;
>
> Is anything else going to be added here? If not, the indentation can be
> dropped.
At the moment probably not.
>
>> + } tb_jmp_cache_stats;
>
> This is lacking documentation. Should be trivial to add for this field
> (not here, above the struct). To document the subfields we may need to
> name the struct.
Would not a simple comment be enough?
>
>> struct GDBRegisterState *gdb_regs;
>> int gdb_num_regs;
>> int gdb_num_g_regs;
>> @@ -584,6 +588,15 @@ void cpu_exit(CPUState *cpu);
>> */
>> void cpu_resume(CPUState *cpu);
>>
>> +
>> +/**
>> + * tb_flush_all_jmp_cache:
>> + * @cpu: The CPU jmp cache to flush
>> + *
>> + * Flush all the entries from the cpu fast jump cache
>
> "CPU" for consistency
OK
>
<snip>
>> +#ifndef CONFIG_TRACE_NOP
>> +static inline void trace_inc_counter(int *counter) {
>> + int cnt = *counter;
>> + cnt++;
>> + *counter = cnt;
>> +}
>> +#else
>> +static inline void trace_inc_counter(int *counter) { /* do nothing */ }
>> +#endif
>> +
>> #endif /* TRACE_H */
>
> Coding Style issues with the first function. For simplicity just keep
> the first implementation but with the proper brace placement, and then
> just put the #ifdef into the function body. That avoids the signatures
> getting out of sync.
Mea culpa, I forgot to run checkpatch.pl.
>>
>> memset(tcg_ctx.tb_ctx.tb_phys_hash, 0, sizeof(tcg_ctx.tb_ctx.tb_phys_hash));
>> @@ -1520,6 +1530,8 @@ void tb_flush_jmp_cache(CPUState *cpu, target_ulong addr)
>> i = tb_jmp_cache_hash_page(addr);
>> memset(&cpu->tb_jmp_cache[i], 0,
>> TB_JMP_PAGE_SIZE * sizeof(TranslationBlock *));
>
> Can this one be dropped, too?
No, this is only a partial invalidation. I did toy with instrumenting
how many entries are flushed but it didn't seem to be worth it, given
that with the current architecture there is not much we can do.
>
>> +
>> + trace_tb_flush_jump_cache(addr);
>> }
>>
>> void dump_exec_info(FILE *f, fprintf_function cpu_fprintf)
>
> Cheers,
> Andreas
--
Alex Bennée
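
The difference between the partial invalidation in tb_flush_jmp_cache()
and the full flush can be sketched with a small model. Everything here
is hypothetical and much simpler than QEMU's hashing: the constants, the
page-group layout and the helper names are assumptions chosen only to
show that a per-page flush clears one group of slots and leaves the rest
of the cache alone.

```python
PAGE_BITS = 12        # hypothetical 4 KiB guest pages
SLOTS_PER_PAGE = 4    # hypothetical; plays the role of TB_JMP_PAGE_SIZE
NUM_PAGES = 4         # hypothetical number of page groups in the cache


def page_base(addr):
    """Index of the first slot in the group that addr's page maps to."""
    return ((addr >> PAGE_BITS) % NUM_PAGES) * SLOTS_PER_PAGE


def slot_for(pc):
    """Slot for pc: its page group plus an offset within the group."""
    return page_base(pc) + (pc >> 2) % SLOTS_PER_PAGE


def flush_page(cache, addr):
    """Partial invalidation: clear only the slots of one page group."""
    base = page_base(addr)
    cache[base:base + SLOTS_PER_PAGE] = [None] * SLOTS_PER_PAGE


def flush_all(cache):
    """Full invalidation: clear every slot."""
    cache[:] = [None] * len(cache)


cache = [None] * (NUM_PAGES * SLOTS_PER_PAGE)
for pc in (0x1000, 0x1004, 0x2000):
    cache[slot_for(pc)] = pc

flush_page(cache, 0x1000)
remaining = [pc for pc in cache if pc is not None]
print([hex(pc) for pc in remaining])
```

After the per-page flush only the entry for 0x2000 survives, which is
why that memset cannot simply be replaced by a call to the full flush
helper.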
* Re: [Qemu-devel] [PATCH v2 3/3] trace: instrument and trace tcg tb flush activity
2014-07-15 11:42 ` [Qemu-devel] [PATCH v2 3/3] trace: instrument and trace tcg tb flush activity Alex Bennée
2014-07-15 12:15 ` Andreas Färber
@ 2014-07-15 12:23 ` Peter Maydell
2014-07-15 13:07 ` Peter Maydell
2014-07-15 13:10 ` Alex Bennée
2014-07-15 13:19 ` Paolo Bonzini
2 siblings, 2 replies; 16+ messages in thread
From: Peter Maydell @ 2014-07-15 12:23 UTC (permalink / raw)
To: Alex Bennée
Cc: mohamad.gebai, QEMU Developers, Stefan Hajnoczi,
Andreas Färber
On 15 July 2014 12:42, Alex Bennée <alex.bennee@linaro.org> wrote:
> +#ifndef CONFIG_TRACE_NOP
> +static inline void trace_inc_counter(int *counter) {
> + int cnt = *counter;
> + cnt++;
> + *counter = cnt;
> +}
...why isn't this just "*counter++;" ?
-- PMM
* Re: [Qemu-devel] [PATCH v2 3/3] trace: instrument and trace tcg tb flush activity
2014-07-15 12:23 ` Peter Maydell
@ 2014-07-15 13:07 ` Peter Maydell
2014-07-15 13:10 ` Alex Bennée
1 sibling, 0 replies; 16+ messages in thread
From: Peter Maydell @ 2014-07-15 13:07 UTC (permalink / raw)
To: Alex Bennée
Cc: mohamad.gebai, QEMU Developers, Stefan Hajnoczi,
Andreas Färber
On 15 July 2014 13:23, Peter Maydell <peter.maydell@linaro.org> wrote:
> On 15 July 2014 12:42, Alex Bennée <alex.bennee@linaro.org> wrote:
>> +#ifndef CONFIG_TRACE_NOP
>> +static inline void trace_inc_counter(int *counter) {
>> + int cnt = *counter;
>> + cnt++;
>> + *counter = cnt;
>> +}
>
> ...why isn't this just "*counter++;" ?
Derp.
(*counter)++;
I leave for the reader to decide whether this constitutes an
argument in favour of the way you originally phrased it...
-- PMM
* Re: [Qemu-devel] [PATCH v2 3/3] trace: instrument and trace tcg tb flush activity
2014-07-15 12:23 ` Peter Maydell
2014-07-15 13:07 ` Peter Maydell
@ 2014-07-15 13:10 ` Alex Bennée
1 sibling, 0 replies; 16+ messages in thread
From: Alex Bennée @ 2014-07-15 13:10 UTC (permalink / raw)
To: Peter Maydell
Cc: mohamad.gebai, QEMU Developers, Stefan Hajnoczi,
Andreas Färber
Peter Maydell writes:
> On 15 July 2014 12:42, Alex Bennée <alex.bennee@linaro.org> wrote:
>> +#ifndef CONFIG_TRACE_NOP
>> +static inline void trace_inc_counter(int *counter) {
>> + int cnt = *counter;
>> + cnt++;
>> + *counter = cnt;
>> +}
>
> ...why isn't this just "*counter++;" ?
You of course mean:
(*counter)++;
I'll fix that up for the next iteration...
--
Alex Bennée
* Re: [Qemu-devel] [PATCH v2 3/3] trace: instrument and trace tcg tb flush activity
2014-07-15 11:42 ` [Qemu-devel] [PATCH v2 3/3] trace: instrument and trace tcg tb flush activity Alex Bennée
2014-07-15 12:15 ` Andreas Färber
2014-07-15 12:23 ` Peter Maydell
@ 2014-07-15 13:19 ` Paolo Bonzini
2014-07-15 14:16 ` Alex Bennée
2 siblings, 1 reply; 16+ messages in thread
From: Paolo Bonzini @ 2014-07-15 13:19 UTC (permalink / raw)
To: Alex Bennée, stefanha; +Cc: mohamad.gebai, qemu-devel, Andreas Färber
On 15/07/2014 13:42, Alex Bennée wrote:
> + trace_inc_counter(&cpu->tb_jmp_cache_stats.misses);
> tb = tb_find_slow(env, pc, cs_base, flags);
> + } else {
> + trace_inc_counter(&cpu->tb_jmp_cache_stats.hits);
> }
I think this is premature optimization...
Paolo
* Re: [Qemu-devel] [PATCH v2 3/3] trace: instrument and trace tcg tb flush activity
2014-07-15 13:19 ` Paolo Bonzini
@ 2014-07-15 14:16 ` Alex Bennée
2014-07-15 20:11 ` Paolo Bonzini
0 siblings, 1 reply; 16+ messages in thread
From: Alex Bennée @ 2014-07-15 14:16 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: mohamad.gebai, qemu-devel, stefanha, Andreas Färber
Paolo Bonzini writes:
> On 15/07/2014 13:42, Alex Bennée wrote:
>> + trace_inc_counter(&cpu->tb_jmp_cache_stats.misses);
>> tb = tb_find_slow(env, pc, cs_base, flags);
>> + } else {
>> + trace_inc_counter(&cpu->tb_jmp_cache_stats.hits);
>> }
>
> I think this is premature optimization...
How do you mean? It's not really an optimization as much as an
instrumentation. It should compile away to nothing if you don't have
tracing enabled in your build.
OTOH the numbers I'm seeing are very interesting in so far as the fast
path could be a potential waste of code in a lot of cases.
--
Alex Bennée
* Re: [Qemu-devel] [PATCH v2 3/3] trace: instrument and trace tcg tb flush activity
2014-07-15 14:16 ` Alex Bennée
@ 2014-07-15 20:11 ` Paolo Bonzini
2014-07-15 20:29 ` Peter Maydell
0 siblings, 1 reply; 16+ messages in thread
From: Paolo Bonzini @ 2014-07-15 20:11 UTC (permalink / raw)
To: Alex Bennée; +Cc: mohamad.gebai, qemu-devel, stefanha, Andreas Färber
On 15/07/2014 16:16, Alex Bennée wrote:
>> > I think this is premature optimization...
> How do you mean? It's not really an optimization as much as an
> instrumentation. It should compile away to nothing if you don't have
> tracing enabled in your build.
I think it's not a big deal if you always enable the counting, and
perhaps show them in "info jit".
Paolo
> OTOH the numbers I'm seeing are very interesting in so far as the fast
> path could be a potential waste of code in a lot of cases.
* Re: [Qemu-devel] [PATCH v2 3/3] trace: instrument and trace tcg tb flush activity
2014-07-15 20:11 ` Paolo Bonzini
@ 2014-07-15 20:29 ` Peter Maydell
2014-07-15 20:38 ` Paolo Bonzini
0 siblings, 1 reply; 16+ messages in thread
From: Peter Maydell @ 2014-07-15 20:29 UTC (permalink / raw)
To: Paolo Bonzini
Cc: Andreas Färber, Alex Bennée, QEMU Developers,
Stefan Hajnoczi, mohamad.gebai
On 15 July 2014 21:11, Paolo Bonzini <pbonzini@redhat.com> wrote:
> On 15/07/2014 16:16, Alex Bennée wrote:
>
>>> > I think this is premature optimization...
>>
>> How do you mean? It's not really an optimization as much as an
>> instrumentation. It should compile away to nothing if you don't have
>> tracing enabled in your build.
>
>
> I think it's not a big deal if you always enable the counting, and perhaps
> show them in "info jit".
We don't enable any other tracepoints by default; why would
we want to enable just this one which is in a hot codepath??
-- PMM
* Re: [Qemu-devel] [PATCH v2 3/3] trace: instrument and trace tcg tb flush activity
2014-07-15 20:29 ` Peter Maydell
@ 2014-07-15 20:38 ` Paolo Bonzini
0 siblings, 0 replies; 16+ messages in thread
From: Paolo Bonzini @ 2014-07-15 20:38 UTC (permalink / raw)
To: Peter Maydell
Cc: Andreas Färber, Alex Bennée, QEMU Developers,
Stefan Hajnoczi, mohamad.gebai
On 15/07/2014 22:29, Peter Maydell wrote:
>> >
>> > I think it's not a big deal if you always enable the counting, and perhaps
>> > show them in "info jit".
> We don't enable any other tracepoints by default; why would
> we want to enable just this one which is in a hot codepath??
I'm not referring to the tracepoint, only to trace_inc_counter instead
of just "x++". The overhead is probably not measurable. There are a few
low-hanging fruit optimizations in cpu-exec.c that probably would give
more measurable benefit, for example trying to replace cpu_loop_exit
with a goto whenever possible.
Paolo