From: Paolo Bonzini
Message-ID: <56FA5ADB.7030103@redhat.com>
Date: Tue, 29 Mar 2016 12:37:15 +0200
In-Reply-To: <56FA52E3.3000900@gmail.com>
Subject: Re: [Qemu-devel] [PATCH 4/5] tcg: reorder removal from lists in tb_phys_invalidate
To: Sergey Fedorov, sergey.fedorov@linaro.org, qemu-devel@nongnu.org
Cc: Peter Crosthwaite, Richard Henderson

On 29/03/2016 12:03, Sergey Fedorov wrote:
>>> >> [...] I would suggest the following solution:
>>> >> (1) Use 'tb->pc' as an indicator of whether TB is valid; check for it
>>> >>     in cpu_exec() when deciding on whether to patch the last executed
>>> >>     TB or not
>>> >> (2) Use 'tcg_ctx.tb_ctx.tb_flush_count' to check for translation buffer
>>> >>     flushes; capture it before calling tb_gen_code() and compare to it
>>> >>     afterwards to check if tb_flush() has been called in between
>> > Of course that would work, but it would be slower.
>
> What's going to be slower?
Checking tb->pc (or something similar) *and* tb_flush_count in
tb_find_physical.

> > I think it is
> > unnecessary for two reasons:
> >
> > 1) There are two calls to cpu_exec_nocache. One exits immediately with
> > "break;", the other always sets "next_tb = 0;". Therefore it is safe in
> > both cases for cpu_exec_nocache to hijack cpu->tb_invalidated_flag.
> >
> > 2) if it were broken, it would _also_ be broken before these patches
> > because cpu_exec_nocache always runs with tb_lock taken.
>
> I can't see how cpu_exec_nocache() always runs with tb_lock taken.

It takes the lock itself. :)

From Fred's "tcg: protect TBContext with tb_lock", as it is in my mttcg
branch:

@@ -194,17 +194,23 @@ static void cpu_exec_nocache(CPUState *cpu, int max_cycles,
     if (max_cycles > CF_COUNT_MASK)
         max_cycles = CF_COUNT_MASK;
 
+    tb_lock();
     cpu->tb_invalidated_flag = 0;
     tb = tb_gen_code(cpu, orig_tb->pc, orig_tb->cs_base, orig_tb->flags,
                      max_cycles | CF_NOCACHE);
     tb->orig_tb = cpu->tb_invalidated_flag ? NULL : orig_tb;
     cpu->current_tb = tb;
+    tb_unlock();
+
     /* execute the generated code */
     trace_exec_tb_nocache(tb, tb->pc);
     cpu_tb_exec(cpu, tb->tc_ptr);
+
+    tb_lock();
     cpu->current_tb = NULL;
     tb_phys_invalidate(tb, -1);
     tb_free(tb);
+    tb_unlock();
 }
 #endif

It takes the lock before resetting tb_invalidated_flag. cpu_exec_nocache
is not used in user-mode emulation, so it's okay if qemu.git doesn't take
the lock yet.

(This kind of misunderstanding about which code is thread-safe is going
to be common until we have MTTCG. This was the reason for the patch
"cpu-exec: elide more icount code if CONFIG_USER_ONLY".)

> > So I think
> > documenting the assumptions is better than changing them at the same
> > time as doing other changes.
>
> I'm not sure I understand you here exactly, but if implementing my
> proposal, it'd rather be a separate patch/series, I think.

Exactly.
For the purpose of these 5 patches, I would just document above
cpu_exec_nocache that callers should ensure that next_tb is zero.

Alternatively, you could add a patch that does

    old_tb_invalidated_flag = cpu->tb_invalidated_flag;
    cpu->tb_invalidated_flag = 0;
    ...
    cpu->tb_invalidated_flag |= old_tb_invalidated_flag;

It could use the single global flag (and then it would go between patch 4
and patch 5) or it could use the CPU-specific one (and then it would go
after patch 5). However, I think documenting the requirements is fine.

>> > Your observation that tb->pc == -1 is not necessarily safe still holds of
>> > course. Probably the best thing is an inline that can do one of:
>> >
>> > 1) set cs_base to an invalid value (anything nonzero is enough except on
>> > x86 and SPARC; SPARC can use all-ones)
>> >
>> > 2) sets the flags to an invalid combination (x86 can use all ones)
>> >
>> > 3) sets the PC to an invalid value (no one really needs it)
>
> It's a bit tricky. Is it really worth doing so instead of using a
> separate dedicated flag? Mainly, it should cost one extra compare on TB
> look-up. I suppose it's a kind of trade-off between performance and code
> clarity.

I think a new inline function cpu_make_tb_invalid would not be too tricky.
Just setting "tb->cs_base = -1;" is pretty much obvious for all the targets
that do not use cs_base at all, and for SPARC, which sets it to a PC (and
thus a multiple of four). x86 is the odd one out.

Thanks,

Paolo
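[Editorial illustration: the inline discussed above might look like the
following minimal sketch. TranslationBlock here is a cut-down stand-in for
QEMU's real structure, tb_match is a hypothetical lookup predicate in the
spirit of tb_find_physical, and cpu_make_tb_invalid is the name Paolo
proposes; none of this is actual QEMU code.]

```c
#include <stdint.h>

/* Cut-down stand-in for QEMU's TranslationBlock: only the three fields
 * that a lookup compares against. */
typedef struct TranslationBlock {
    uint64_t pc;
    uint64_t cs_base;
    uint32_t flags;
} TranslationBlock;

/* Mark a TB so that no (pc, cs_base, flags) lookup can match it again.
 * All-ones is impossible for targets that leave cs_base zero, and for
 * SPARC, whose cs_base holds a PC (a multiple of four); x86, where
 * cs_base is a real segment base, is the odd one out. */
static inline void cpu_make_tb_invalid(TranslationBlock *tb)
{
    tb->cs_base = (uint64_t)-1;
}

/* Hypothetical match predicate as tb_find_physical might apply it:
 * no extra "valid" flag, hence no extra compare per lookup. */
static inline int tb_match(const TranslationBlock *tb,
                           uint64_t pc, uint64_t cs_base, uint32_t flags)
{
    return tb->pc == pc && tb->cs_base == cs_base && tb->flags == flags;
}
```

After cpu_make_tb_invalid(), a TB simply stops matching any legitimate
lookup key, which is the code-clarity/performance trade-off Sergey raises:
the invalidation is free at lookup time but implicit rather than spelled
out as a dedicated flag.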