From: Sergey Fedorov
Date: Thu, 31 Mar 2016 17:35:52 +0300
Subject: Re: [Qemu-devel] tcg: reworking tb_invalidated_flag
To: Paolo Bonzini, Sergey Fedorov, QEMU Developers, Richard Henderson, Peter Crosthwaite
Cc: Alex Bennée
Message-ID: <56FD35C8.6060900@gmail.com>
In-Reply-To: <56FD28BB.6030305@redhat.com>
References: <56FC0818.10002@linaro.org> <56FC174A.6070906@redhat.com> <56FD22A5.10501@gmail.com> <56FD28BB.6030305@redhat.com>

On 31/03/16 16:40, Paolo Bonzini wrote:
>
> On 31/03/2016 15:14, Sergey Fedorov wrote:
>> On 30/03/16 21:13, Paolo Bonzini wrote:
>>> On 30/03/2016 19:08, Sergey Fedorov wrote:
>>>> The second approach is to make 'tb_invalidated_flag' per-CPU. This
>>>> would be conceptually similar to what we have, but would give us thread
>>>> safety. With this approach, we need to be careful to correctly clear and
>>>> set the flag.
>>> You can just ensure that setting and clearing it is done under tb_lock.
>> So it could remain sitting in 'tcg_ctx.tb_ctx'. I'm just wondering what
>> could be real benefits for making it per-CPU then?
> All CPUs need to observe it in order to clear their own local next_tb
> variable. It is not enough to do that once, so it has to be per-CPU.

So each vCPU thread gets its own flag, which it can clear safely. Got
it, thanks.

>
>>> Because TranslationBlocks live in tcg_ctx.tb_ctx.tbs you need
>>> special code to exit all CPUs at tb_flush time, otherwise you risk that
>>> a tb_alloc reuses a TranslationBlock while it is in use by a VCPU.
>> Looks like no matter which approach we use, it's ultimately necessary to
>> ensure all CPUs have exited from translated code before the translation
>> buffer may be safely flushed.
> My plan was to use some kind of double buffering, where only half of
> code_gen_buffer is in use. At the end of tb_flush you call cpu_exit()
> on all CPUs, so that CPUs stop executing chained TBs from the old half
> before they can see one from the new half.
>
> If code_gen_buffer is static you have to preallocate two buffers (and
> two tbs arrays) and waste one of them; while it is theoretically
> possible to have CPUs still executing from the old half while you finish
> the new half, it can be more or less ignored.
>
> If it is dynamic, the previously used areas can be freed with call_rcu,
> and you can safely allocate a new code_gen_buffer and tbs array.
>
> I haven't thought much about it; it might require keeping a cache of the
> tbs array per CPU, and possibly changing the code under "if
> (tcg_ctx.tb_ctx.tb_invalidated_flag)" to simply exit cpu_exec.

Maybe save this idea for later? :) We'd better start with a simpler
approach and optimize afterwards. BTW, a few years ago I came across an
interesting paper on code cache eviction granularities [1].

[1] http://www.cs.virginia.edu/kim/courses/cs851/papers/hazelwood04mediumgrained.pdf

Kind regards,
Sergey
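
P.S. To make the per-CPU flag point concrete, here is a rough,
standalone sketch of the idea, not QEMU code. All names (ToyCPU,
toy_tb_lock, toy_tb_flush, toy_cpu_check_invalidated) are hypothetical
stand-ins, and a plain pthread mutex plays the role of tb_lock. The flush
path sets the flag for every vCPU under the lock, and each vCPU clears
only its own flag when it drops its local next_tb. With a single shared
flag, the first vCPU to clear it would hide the flush from the others.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NR_CPUS 4

/* Hypothetical stand-in for a vCPU; this is not QEMU's CPUState. */
typedef struct ToyCPU {
    bool tb_invalidated_flag;  /* per-CPU flag, protected by toy_tb_lock */
    void *next_tb;             /* local chaining hint owned by this vCPU */
} ToyCPU;

/* Plays the role of tb_lock in this sketch. */
static pthread_mutex_t toy_tb_lock = PTHREAD_MUTEX_INITIALIZER;
static ToyCPU toy_cpus[TOY_NR_CPUS];

/* Flush path: mark every vCPU so that each one drops its own next_tb. */
static void toy_tb_flush(void)
{
    pthread_mutex_lock(&toy_tb_lock);
    for (size_t i = 0; i < TOY_NR_CPUS; i++) {
        toy_cpus[i].tb_invalidated_flag = true;
    }
    pthread_mutex_unlock(&toy_tb_lock);
}

/*
 * Execution-loop side: a vCPU consumes only its own flag.
 * Returns true if next_tb had to be dropped.
 */
static bool toy_cpu_check_invalidated(ToyCPU *cpu)
{
    bool dropped = false;

    assert(cpu != NULL);
    pthread_mutex_lock(&toy_tb_lock);
    if (cpu->tb_invalidated_flag) {
        cpu->tb_invalidated_flag = false;
        cpu->next_tb = NULL;   /* never chain into a flushed TB */
        dropped = true;
    }
    pthread_mutex_unlock(&toy_tb_lock);
    return dropped;
}
```

Each vCPU thread would call toy_cpu_check_invalidated() on its own
ToyCPU before trying to chain to next_tb. This only covers the flag
handling; it does nothing to guarantee that all CPUs have left translated
code before the buffer is reused.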