From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:38611) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1f4G6S-0001r0-ER for qemu-devel@nongnu.org; Thu, 05 Apr 2018 21:23:45 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1f4G6O-0006KB-Cf for qemu-devel@nongnu.org; Thu, 05 Apr 2018 21:23:44 -0400 Received: from out5-smtp.messagingengine.com ([66.111.4.29]:48213) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1f4G6O-0006JD-4n for qemu-devel@nongnu.org; Thu, 05 Apr 2018 21:23:40 -0400 Date: Thu, 5 Apr 2018 21:23:38 -0400 From: "Emilio G. Cota" Message-ID: <20180406012338.GB25706@flamenco> References: <1519709965-29833-1-git-send-email-cota@braap.org> <1519709965-29833-13-git-send-email-cota@braap.org> <87po3m7vuc.fsf@linaro.org> MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <87po3m7vuc.fsf@linaro.org> Subject: Re: [Qemu-devel] [PATCH 12/16] translate-all: discard TB when tb_link_page returns an existing matching TB List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Alex =?iso-8859-1?Q?Benn=E9e?= Cc: qemu-devel@nongnu.org, Paolo Bonzini , Richard Henderson On Thu, Mar 29, 2018 at 16:19:39 +0100, Alex Bennée wrote: > > Emilio G. Cota writes: > > > Use the recently-gained QHT feature of returning the matching TB if it > > already exists. This allows us to get rid of the lookup we perform > > right after acquiring tb_lock. > > > > Suggested-by: Richard Henderson > > Signed-off-by: Emilio G. Cota > > --- > > accel/tcg/cpu-exec.c | 14 ++------------ > > accel/tcg/translate-all.c | 47 ++++++++++++++++++++++++++++++++++++++--------- > > 2 files changed, 40 insertions(+), 21 deletions(-) > > > > diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c > > index 7c83887..8aed38c 100644 > > --- a/accel/tcg/cpu-exec.c > > +++ b/accel/tcg/cpu-exec.c > > @@ -243,10 +243,7 @@ void cpu_exec_step_atomic(CPUState *cpu) > > if (tb == NULL) { > > mmap_lock(); > > tb_lock(); > > - tb = tb_htable_lookup(cpu, pc, cs_base, flags, cf_mask); > > - if (likely(tb == NULL)) { > > - tb = tb_gen_code(cpu, pc, cs_base, flags, cflags); > > - } > > + tb = tb_gen_code(cpu, pc, cs_base, flags, cflags); > > tb_gen_code needs to be renamed to reflect it's semantics. > tb_get_or_gen_code? Or maybe tb_get_code with a sub-helper to do the > generation. I think it can remain as tb_gen_code. The caller still gets a TB, and whether that TB has been generated by this thread or any other thread is irrelevant. (snip) > > --- a/accel/tcg/translate-all.c > > +++ b/accel/tcg/translate-all.c > > @@ -1503,12 +1503,16 @@ static inline void tb_page_add(PageDesc *p, TranslationBlock *tb, > > * (-1) to indicate that only one page contains the TB. > > * > > * Called with mmap_lock held for user-mode emulation. > > + * > > + * Returns @tb or an existing TB that matches @tb. > > That's just confusing to read. So this returns a TB like the @tb we > passed in but actually a different one matching the same conditions? Good point. Here tb_link_page is not a great name, but instead of adding a long name such as tb_link_page_or_get_existing, in v2 I've expanded the above comment. It now looks as follows: * Returns a pointer @tb, or a pointer to an existing TB that matches @tb. * Note that in !user-mode, another thread might have already added a TB * for the same block of guest code that @tb corresponds to. In that case, * the caller should discard the original @tb, and use instead the returned TB. > > @@ -1706,7 +1727,15 @@ TranslationBlock *tb_gen_code(CPUState *cpu, > > * memory barrier is required before tb_link_page() makes the TB visible > > * through the physical hash table and physical page list. > > */ > > - tb_link_page(tb, phys_pc, phys_page2); > > + existing_tb = tb_link_page(tb, phys_pc, phys_page2); > > + /* if the TB already exists, discard what we just translated */ > > So are we in the position now that we could potentially do a translation > but be beaten by another thread generating the same code? Exactly. > I suspect we could > do with a bit of explanatory commentary for the tb_gen_code functions. As I said above I don't think tb_gen_code changes at all to its callers, since the caller still gets a TB pointer that it did not have before. tb_link_page is the key here -- I hope the updated comment I quoted above is enough to make things clear. > Also I think the "Translation Blocks" section needs updating in the > MTTCG design document to make this clear. I've added a comment at the bottom of that section: Parallel code generation is supported. QHT is used at insertion time as the synchronization point across threads, thereby ensuring that we only keep track of a single TranslationBlock for each guest code block. > I'm curious if we should be counting unused translations somewhere in > the JIT stats. I'm guessing you need to work at a pathalogical case to > hit this much? This should be extremely rare on most workloads. Given that and the fact that we won't have unused translated code (we discard it by resetting code_gen_ptr), I wouldn't worry too much about this. In the unlikely case that it ever became a problem, TCG profiling time would account for it, and on a perf profile we'd see the slow path in tb_link_page being taken. Thanks, Emilio