qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Richard Henderson <richard.henderson@linaro.org>
To: "Alex Bennée" <alex.bennee@linaro.org>, qemu-devel@nongnu.org
Cc: "Vanderson M. do Rosario" <vandersonmr2@gmail.com>,
	"Dr . David Alan Gilbert" <dgilbert@redhat.com>,
	Markus Armbruster <armbru@redhat.com>,
	cota@braap.org, Paolo Bonzini <pbonzini@redhat.com>,
	Richard Henderson <rth@twiddle.net>
Subject: Re: [PATCH v9 09/13] Adding info [tb-list|tb] commands to HMP (WIP)
Date: Tue, 8 Oct 2019 14:50:16 -0400	[thread overview]
Message-ID: <57a84842-2e85-6c39-2abe-b0dd3b3750ca@linaro.org> (raw)
In-Reply-To: <20191007152839.30804-10-alex.bennee@linaro.org>

On 10/7/19 11:28 AM, Alex Bennée wrote:
> From: "Vanderson M. do Rosario" <vandersonmr2@gmail.com>
> 
> These commands allow the exploration of TBs generated by the TCG.
> Understand which one hotter, with more guest/host instructions... and
> examine their guest, host and IR code.
> 
> The goal of this command is to allow the dynamic exploration of TCG
> behavior and code quality. Therefore, for now, a corresponding QMP
> command is not worthwhile.
> 
> [AJB: WIP - we still can't be safely sure a translation will succeed]
> 
> Example of output:
> 
> TB id:1 | phys:0x34d54 virt:0x0000000000034d54 flags:0x0000f0
> 	| exec:4828932/0 guest inst cov:16.38%
> 	| trans:1 ints: g:3 op:82 op_opt:34 spills:3
> 	| h/g (host bytes / guest insts): 90.666664
> 	| time to gen at 2.4GHz => code:3150.83(ns) IR:712.08(ns)
> 	| targets: 0x0000000000034d5e (id:11), 0x0000000000034d0d (id:2)
> 
> TB id:2 | phys:0x34d0d virt:0x0000000000034d0d flags:0x0000f0
> 	| exec:4825842/0 guest inst cov:21.82%
> 	| trans:1 ints: g:4 op:80 op_opt:38 spills:2
> 	| h/g (host bytes / guest insts): 84.000000
> 	| time to gen at 2.4GHz => code:3362.92(ns) IR:793.75(ns)
> 	| targets: 0x0000000000034d19 (id:12), 0x0000000000034d54 (id:1)
> 
> TB id:2 | phys:0x34d0d virt:0x0000000000034d0d flags:0x0000f0
> 	| exec:6956495/0  guest inst cov:21.82%
> 	| trans:2 ints: g:2 op:40 op_opt:19 spills:1
> 	| h/g (host bytes / guest insts): 84.000000
> 	| time to gen at 2.4GHz => code:3130.83(ns) IR:722.50(ns)
> 	| targets: 0x0000000000034d19 (id:12), 0x0000000000034d54 (id:1)
> 
> ----------------
> IN:
> 0x00034d0d:  89 de                    movl     %ebx, %esi
> 0x00034d0f:  26 8b 0e                 movl     %es:(%esi), %ecx
> 0x00034d12:  26 f6 46 08 80           testb    $0x80, %es:8(%esi)
> 0x00034d17:  75 3b                    jne      0x34d54
> 
> ------------------------------
> 
> TB id:1 | phys:0x34d54 virt:0x0000000000034d54 flags:0x0000f0
> 	| exec:5202686/0 guest inst cov:11.28%
> 	| trans:1 ints: g:3 op:82 op_opt:34 spills:3
> 	| h/g (host bytes / guest insts): 90.666664
> 	| time to gen at 2.4GHz => code:2793.75(ns) IR:614.58(ns)
> 	| targets: 0x0000000000034d5e (id:3), 0x0000000000034d0d (id:2)
> 
> TB id:2 | phys:0x34d0d virt:0x0000000000034d0d flags:0x0000f0
> 	| exec:5199468/0 guest inst cov:15.03%
> 	| trans:1 ints: g:4 op:80 op_opt:38 spills:2
> 	| h/g (host bytes / guest insts): 84.000000
> 	| time to gen at 2.4GHz => code:2958.75(ns) IR:719.58(ns)
> 	| targets: 0x0000000000034d19 (id:4), 0x0000000000034d54 (id:1)
> 
> ------------------------------
> 2 TBs to reach 25% of guest inst exec coverage
> Total of guest insts exec: 138346727
> 

Is there too much cut-and-paste in this commit message?
I certainly hope that identical information about TB id:2
is not output 3 times within the same report...

Or, alternately, that we are not generating multiple TBs
for the { phys:0x34d0d virt:0x34d0d flags:0xf0 } tuple.

Also, I think you probably need to output cs_base.  Depending
on the target, that might have been different, and so it
might make sense that you have 3 copies of the above.

> +static gint
> +inverse_sort_tbs(gconstpointer p1, gconstpointer p2, gpointer psort_by)

What about this makes the sort "inverse"?

> +    int sort_by = *((int *) psort_by);

Why is this not enum SortBy?

> +    if (likely(sort_by == SORT_BY_SPILLS)) {
...
> +    } else if (likely(sort_by == SORT_BY_HOTNESS)) {
...
> +    } else if (likely(sort_by == SORT_BY_HG)) {

Surely these options are not all "likely".

> +        float a =
> +            (float) stat_per_translation(tbs1, code.out_len) / tbs1->code.num_guest_inst;
> +        float b =
> +            (float) stat_per_translation(tbs2, code.out_len) / tbs2->code.num_guest_inst;

I don't understand why we're suddenly introducing floats, when the division
hidden within stat_per_translation is integer.

Think about what units being compared here, because I don't think that "average
host code length / sum of guest insn count" makes sense.  Certainly as time
progresses, average / sum -> ((sum / sum) / sum) -> (sum / sum**2) -> (1 / sum)
is going to approach 0.

I think you actually want "sum of host code length / sum of guest insn length",
which could be stated as "host/guest code ratio" or "jit code expansion factor".

> +        c1 = a <= b ? 0 : 1;
> +        c2 = a <= b ? 1 : 0;

Please do recall that (x < y ? 1 : 0) => (x < y).

In addition, there's no point in having these comparisons feed...

> +    return c1 < c2 ? 1 : c1 == c2 ? 0 : -1;

... these comparisions.

> +    for (i = last_search; i; i = i->next) {
> +        TBStatistics *tbs = (TBStatistics *) i->data;
> +        uint64_t tb_total_execs =
> +            (tbs->executions.atomic + tbs->executions.normal) * tbs->code.num_guest_inst;
> +        tbs->executions.coverage = (10000 * tb_total_execs) / (total_exec_count + 1);

If ever there was a time to want to use float, this is it.

I assume the total_exec_count + 1 is to avoid divide by zero?
Surely we can do better than this...

Indeed, given that we've already checked...

> +    if (!last_search) {
> +        qemu_printf("No data collected yet!\n");
> +        return;
> +    }

... surely we can either assert total_exec_count != 0, or don't and just let
the divide-by-zero signal do the same thing.  (I don't see the value of
replacing one signal with another in most cases.)

> +/*
> + * We cannot always re-generate the code even if we know there are
> + * valid translations still in the cache. The reason being the guest
> + * may have un-mapped the page code.

Um... unless I mistake what's being described here, that wouldn't be a valid
translation.  Or do you just mean that the page mapping isn't present within
the TLB?  Which is not quite the same thing as "unmapping".

> + * TODO: can we do this safely? We need to
> + *  a) somehow recover the mmu_idx for this translation

We could add an interface for this, yes.  The value *must* be able to be
derived from tb->flags, but of course in a target-dependent way.

> + *  b) probe MMU_INST_FETCH to know it will succeed

We *do* have this now, sort of: tlb_vaddr_to_host.

So far all use of this function originates from target/foo/,
and so some targets have not been updated to work with this.
I've marked these with asserts within foo_cpu_tlb_fill.

Notable targets for which it won't work: i386, sparc.


> +static GString *get_code_string(TBStatistics *tbs, int log_flags)
> +{
> +    int old_log_flags = qemu_loglevel;
> +
> +    CPUState *cpu = first_cpu;
> +    uint32_t cflags = curr_cflags() | CF_NOCACHE;
> +    TranslationBlock *tb = NULL;
> +
> +    GString *code_s = g_string_new(NULL);
> +    qemu_log_to_string(true, code_s);
> +
> +    qemu_set_log(log_flags);
> +
> +    if (sigsetjmp(cpu->jmp_env, 0) == 0) {
> +        mmap_lock();
> +        tb = tb_gen_code(cpu, tbs->pc, tbs->cs_base, tbs->flags, cflags);
> +        tb_phys_invalidate(tb, -1);
> +        mmap_unlock();

Ew.  No.

Let us not go through tb_gen_code, just to get logging output from the
translator.  What are we really after here?  Input assembly?

> @@ -86,7 +91,6 @@ struct TBStatistics {
>  
>      struct {
>          unsigned long total;
> -        unsigned long uncached;
>          unsigned long spanning;
>      } translations;
>  

Vanishing unused variable that maybe shouldn't have existed?



r~


  reply	other threads:[~2019-10-08 18:51 UTC|newest]

Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-10-07 15:28 [PATCH v9 00/13] TCG code quality tracking and perf integration Alex Bennée
2019-10-07 15:28 ` [PATCH v9 01/13] accel/tcg: introduce TBStatistics structure Alex Bennée
2019-10-08 12:35   ` Richard Henderson
2019-12-13 11:14     ` Alex Bennée
2019-10-07 15:28 ` [PATCH v9 02/13] accel: collecting TB execution count Alex Bennée
2019-10-08 13:10   ` Richard Henderson
2019-10-07 15:28 ` [PATCH v9 03/13] accel: collecting JIT statistics Alex Bennée
2019-10-08 13:38   ` Richard Henderson
2019-12-13 11:51     ` Alex Bennée
2019-10-07 15:28 ` [PATCH v9 04/13] accel: replacing part of CONFIG_PROFILER with TBStats Alex Bennée
2019-10-08 13:58   ` Richard Henderson
2019-10-07 15:28 ` [PATCH v9 05/13] accel: adding TB_JIT_TIME and full replacing CONFIG_PROFILER Alex Bennée
2019-10-08 15:25   ` Richard Henderson
2019-12-13 21:49     ` Alex Bennée
2019-12-16 20:34       ` Richard Henderson
2019-10-07 15:28 ` [PATCH v9 06/13] debug: add -d tb_stats to control TBStatistics collection: Alex Bennée
2019-10-08 15:34   ` Richard Henderson
2019-10-08 15:49     ` Alex Bennée
2019-10-07 15:28 ` [PATCH v9 07/13] monitor: adding tb_stats hmp command Alex Bennée
2019-10-08 15:48   ` Richard Henderson
2019-10-07 15:28 ` [PATCH v9 08/13] tb-stats: reset the tracked TBs on a tb_flush Alex Bennée
2019-10-08 18:00   ` Richard Henderson
2019-10-08 19:18     ` Alex Bennée
2019-10-07 15:28 ` [PATCH v9 09/13] Adding info [tb-list|tb] commands to HMP (WIP) Alex Bennée
2019-10-08 18:50   ` Richard Henderson [this message]
2019-10-08 19:36     ` Alex Bennée
2019-10-09  9:44   ` Dr. David Alan Gilbert
2019-10-07 15:28 ` [PATCH v9 10/13] tb-stats: dump hot TBs at the end of the execution Alex Bennée
2019-10-08 19:05   ` Richard Henderson
2019-10-07 15:28 ` [PATCH v9 11/13] accel/tcg: adding integration with linux perf Alex Bennée
2019-10-08 19:33   ` Richard Henderson
2019-10-07 15:28 ` [PATCH v9 12/13] tb-stats: adding TBStatistics info into perf dump Alex Bennée
2019-10-08 19:46   ` Richard Henderson
2019-10-07 15:28 ` [PATCH v9 13/13] configure: remove the final bits of --profiler support Alex Bennée
2019-10-08 19:39   ` Richard Henderson
2019-10-07 18:14 ` [PATCH v9 00/13] TCG code quality tracking and perf integration no-reply
2019-10-07 18:47 ` no-reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=57a84842-2e85-6c39-2abe-b0dd3b3750ca@linaro.org \
    --to=richard.henderson@linaro.org \
    --cc=alex.bennee@linaro.org \
    --cc=armbru@redhat.com \
    --cc=cota@braap.org \
    --cc=dgilbert@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=rth@twiddle.net \
    --cc=vandersonmr2@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).