From: "Alex Bennée" <alex.bennee@linaro.org>
To: Richard Henderson <rth@twiddle.net>
Cc: qemu-devel@nongnu.org, peter.maydell@linaro.org,
"Emilio G. Cota" <cota@braap.org>
Subject: Re: [Qemu-devel] [PULL 22/26] target/aarch64: optimize indirect branches
Date: Wed, 07 Jun 2017 16:52:15 +0100 [thread overview]
Message-ID: <87d1afdccw.fsf@linaro.org> (raw)
In-Reply-To: <87efuvddux.fsf@linaro.org>
Alex Bennée <alex.bennee@linaro.org> writes:
> Alex Bennée <alex.bennee@linaro.org> writes:
>
>> Alex Bennée <alex.bennee@linaro.org> writes:
>>
>>> Richard Henderson <rth@twiddle.net> writes:
>>>
>>>> From: "Emilio G. Cota" <cota@braap.org>
>>>>
>>>> Measurements:
>>>>
>>>> [Baseline performance is that before applying this and the previous
>>>> commit]
>>>
>>> Sadly this has regressed my qemu-system-aarch64 EL2 run. It was slightly
>>> masked by an unrelated assertion breakage which I had to fix. However
>>> with this patch my boot hangs spinning all 4 threads. Once reverted
>>> things work again.
>>>
>>> My command line:
>>>
>>> timeout -k 1s --foreground 120s ./aarch64-softmmu/qemu-system-aarch64 -machine type=virt -display none -m 4096 -cpu cortex-a57 -serial mon:stdio -netdev user,id=unet -device virtio-net-device,netdev=unet -drive file=/home/alex/lsrc/qemu/images/jessie-arm64.qcow2,id=myblock,index=0,if=none -device virtio-blk-device,drive=myblock -append "console=ttyAMA0 root=/dev/vda1 systemd.unit=benchmark.service" -kernel /home/alex/lsrc/qemu/images/aarch64-current-linux-kernel-only.img -smp 4 -machine gic-version=3 -machine virtualization=true -name debug-threads=on
>>>
>>> My tree with fix and revert:
>>>
>>> https://github.com/stsquad/qemu/tree/debug/aarch64-hang
>>>
>>> I'm investigating now.
>>
>> Well this seems to be a case of hangs with -smp > 1 (which I guess was
>> obvious seeing as the TCG threads seem to be spinning against each
>> other).
>
> So the minimum fix I've found to get things working again is:
>
> void *HELPER(lookup_tb_ptr)(CPUArchState *env, target_ulong addr)
> {
> CPUState *cpu = ENV_GET_CPU(env);
> TranslationBlock *tb;
> target_ulong cs_base, pc;
> uint32_t flags;
>
> tb = atomic_rcu_read(&cpu->tb_jmp_cache[tb_jmp_cache_hash_func(addr)]);
> if (likely(tb)) {
> cpu_get_tb_cpu_state(env, &pc, &cs_base, &flags);
> if (likely(tb->pc == addr && tb->cs_base == cs_base &&
> tb->flags == flags)) {
> goto found;
> }
> tb = tb_htable_lookup(cpu, addr, cs_base, flags);
> if (likely(tb)) {
> atomic_set(&cpu->tb_jmp_cache[tb_jmp_cache_hash_func(addr)], tb);
> /* goto found; */
> }
> }
> return tcg_ctx.code_gen_epilogue;
> found:
> qemu_log_mask_and_addr(CPU_LOG_EXEC, addr,
> "Chain %p [%d: " TARGET_FMT_lx "] %s\n",
> tb->tc_ptr, cpu->cpu_index, addr,
> lookup_symbol(addr));
> return tb->tc_ptr;
> }
>
> Which I find very odd. It would imply that tb_htable_lookup is giving us
> a bad result, except I would find it very unlikely what ever funky value
> we stored in the jmp cache would never get hit again.
>
> /me is very confused.
OK I think we need to think about when we can run code, c.f. comment in
tb_gen_code:
/* As long as consistency of the TB stuff is provided by tb_lock in user
* mode and is implicit in single-threaded softmmu emulation, no explicit
* memory barrier is required before tb_link_page() makes the TB visible
* through the physical hash table and physical page list.
*/
tb_link_page(tb, phys_pc, phys_page2);
As tb_link_page does the qht_insert I think what is happening is we are
attempting to run code that has just been generated but is not
completely visible to other cores yet. By commenting out the goto found
we slip it enough that next time around it is complete.
However I'm still confused because qht_insert and qht_lookup have
implied barriers in them that should mean we don't pull a partially
complete TB out of the hash.
I did add some debug that saved the tb_htable_lookup's in an cpu indexed
array and they certainly get hits on the tb_jmp_cache lookup so I don't
think my commenting of found means those htable looked up TBs never get
executed.
--
Alex Bennée
next prev parent reply other threads:[~2017-06-07 15:51 UTC|newest]
Thread overview: 35+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-06-05 16:52 [Qemu-devel] [PULL 00/26] tcg queued patches Richard Henderson
2017-06-05 16:52 ` [Qemu-devel] [PULL 01/26] target/nios2: Fix 64-bit ilp32 compilation Richard Henderson
2017-06-05 16:52 ` [Qemu-devel] [PULL 02/26] tcg/sparc: Use the proper compilation flags for 32-bit Richard Henderson
2017-06-05 16:52 ` [Qemu-devel] [PULL 03/26] qemu/atomic: Loosen restrictions for 64-bit ILP32 hosts Richard Henderson
2017-06-05 16:52 ` [Qemu-devel] [PULL 04/26] tcg: Introduce goto_ptr opcode and tcg_gen_lookup_and_goto_ptr Richard Henderson
2017-06-05 16:52 ` [Qemu-devel] [PULL 05/26] tcg/i386: implement goto_ptr Richard Henderson
2017-06-05 16:52 ` [Qemu-devel] [PULL 06/26] target/arm: optimize cross-page direct jumps in softmmu Richard Henderson
2017-06-05 16:52 ` [Qemu-devel] [PULL 07/26] target/arm: optimize indirect branches Richard Henderson
2017-06-05 16:52 ` [Qemu-devel] [PULL 08/26] target/i386: introduce gen_jr helper to generate lookup_and_goto_ptr Richard Henderson
2017-06-05 16:52 ` [Qemu-devel] [PULL 09/26] target/i386: optimize cross-page direct jumps in softmmu Richard Henderson
2017-06-05 16:52 ` [Qemu-devel] [PULL 10/26] target/i386: optimize indirect branches Richard Henderson
2017-06-05 16:52 ` [Qemu-devel] [PULL 11/26] tb-hash: improve tb_jmp_cache hash function in user mode Richard Henderson
2017-06-05 16:52 ` [Qemu-devel] [PULL 12/26] tcg/ppc: Implement goto_ptr Richard Henderson
2017-06-05 16:52 ` [Qemu-devel] [PULL 13/26] tcg/aarch64: " Richard Henderson
2017-06-05 16:52 ` [Qemu-devel] [PULL 14/26] tcg/sparc: " Richard Henderson
2017-06-05 16:52 ` [Qemu-devel] [PULL 15/26] tcg/s390: " Richard Henderson
2017-06-05 16:52 ` [Qemu-devel] [PULL 16/26] tcg/arm: Clarify tcg_out_bx for arm4 host Richard Henderson
2017-06-05 16:52 ` [Qemu-devel] [PULL 17/26] tcg/arm: Implement goto_ptr Richard Henderson
2017-06-05 16:52 ` [Qemu-devel] [PULL 19/26] target/s390: Use tcg_gen_lookup_and_goto_ptr Richard Henderson
2017-06-05 16:52 ` [Qemu-devel] [PULL 20/26] target/hppa: " Richard Henderson
2017-06-05 16:52 ` [Qemu-devel] [PULL 21/26] target/aarch64: optimize cross-page direct jumps in softmmu Richard Henderson
2017-06-05 16:52 ` [Qemu-devel] [PULL 22/26] target/aarch64: optimize indirect branches Richard Henderson
2017-06-07 14:11 ` Alex Bennée
2017-06-07 14:22 ` Alex Bennée
2017-06-07 15:19 ` Alex Bennée
2017-06-07 15:52 ` Alex Bennée [this message]
2017-06-07 20:22 ` Emilio G. Cota
2017-06-08 10:48 ` Alex Bennée
2017-06-07 20:38 ` Richard Henderson
2017-06-08 8:38 ` Alex Bennée
2017-06-05 16:52 ` [Qemu-devel] [PULL 23/26] target/mips: optimize cross-page direct jumps in softmmu Richard Henderson
2017-06-05 16:52 ` [Qemu-devel] [PULL 24/26] target/mips: optimize indirect branches Richard Henderson
2017-06-05 16:52 ` [Qemu-devel] [PULL 25/26] target/alpha: Implement WTINT inline Richard Henderson
2017-06-05 16:52 ` [Qemu-devel] [PULL 26/26] target/alpha: Use goto_tb for fallthru between TBs Richard Henderson
2017-06-06 8:56 ` [Qemu-devel] [PULL 00/26] tcg queued patches Peter Maydell
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87d1afdccw.fsf@linaro.org \
--to=alex.bennee@linaro.org \
--cc=cota@braap.org \
--cc=peter.maydell@linaro.org \
--cc=qemu-devel@nongnu.org \
--cc=rth@twiddle.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).