From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:55437)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <serge.fdrv@gmail.com>) id 1aldhY-0007ms-3v
	for qemu-devel@nongnu.org; Thu, 31 Mar 2016 10:36:01 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <serge.fdrv@gmail.com>) id 1aldhS-0001Ou-Ui
	for qemu-devel@nongnu.org; Thu, 31 Mar 2016 10:36:00 -0400
Received: from mail-lb0-x243.google.com ([2a00:1450:4010:c04::243]:36214)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <serge.fdrv@gmail.com>) id 1aldhS-0001Om-LR
	for qemu-devel@nongnu.org; Thu, 31 Mar 2016 10:35:54 -0400
Received: by mail-lb0-x243.google.com with SMTP id q4so7102966lbq.3
	for <qemu-devel@nongnu.org>; Thu, 31 Mar 2016 07:35:54 -0700 (PDT)
References: <56FC0818.10002@linaro.org> <56FC174A.6070906@redhat.com>
	<56FD22A5.10501@gmail.com> <56FD28BB.6030305@redhat.com>
From: Sergey Fedorov <serge.fdrv@gmail.com>
Message-ID: <56FD35C8.6060900@gmail.com>
Date: Thu, 31 Mar 2016 17:35:52 +0300
MIME-Version: 1.0
In-Reply-To: <56FD28BB.6030305@redhat.com>
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] tcg: reworking tb_invalidated_flag
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: Paolo Bonzini <pbonzini@redhat.com>, Sergey Fedorov <sergey.fedorov@linaro.org>, QEMU Developers <qemu-devel@nongnu.org>, Richard Henderson <rth@twiddle.net>, Peter Crosthwaite <crosthwaite.peter@gmail.com>
Cc: =?UTF-8?Q?Alex_Benn=c3=a9e?= <alex.bennee@linaro.org>

On 31/03/16 16:40, Paolo Bonzini wrote:
>
> On 31/03/2016 15:14, Sergey Fedorov wrote:
>> On 30/03/16 21:13, Paolo Bonzini wrote:
>>> On 30/03/2016 19:08, Sergey Fedorov wrote:
>>>> The second approach is to make 'tb_invalidated_flag' per-CPU. This
>>>> would be conceptually similar to what we have, but would give us thread
>>>> safety. With this approach, we need to be careful to correctly clear and
>>>> set the flag.
>>> You can just ensure that setting and clearing it is done under tb_lock.
>> So it could remain sitting in 'tcg_ctx.tb_ctx'. I'm just wondering what
>> could be real benefits for making it per-CPU then?
> All CPUs need to observe it in order to clear their own local next_tb
> variable.  It is not enough to do that once, so it has to be per-CPU.

So each vCPU thread gets its own flag, which it can clear safely once it
has reset its local next_tb. Got it, thanks.
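
Just to check my understanding, here is a minimal standalone sketch of
the per-CPU flag scheme, not actual QEMU code. All names in it
(VCPUSketch, NR_VCPUS, sketch_tb_invalidate_all, sketch_check_invalidated)
are hypothetical, 'tb_lock' is modelled as a plain pthread mutex, and
'next_tb' is kept in the per-CPU struct only for illustration (in
cpu_exec() it is a local variable):

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_VCPUS 4                  /* assumption: fixed number of vCPUs */

/* Hypothetical per-vCPU state; not the real CPUState layout. */
typedef struct VCPUSketch {
    bool tb_invalidated_flag;       /* per-CPU flag, protected by tb_lock */
    uintptr_t next_tb;              /* chaining hint; 0 means "none" */
} VCPUSketch;

/* Stand-in for QEMU's tb_lock, modelled as a pthread mutex. */
static pthread_mutex_t tb_lock = PTHREAD_MUTEX_INITIALIZER;
static VCPUSketch vcpus[NR_VCPUS];

/* Invalidating side: set the flag of every vCPU under tb_lock. */
static void sketch_tb_invalidate_all(void)
{
    pthread_mutex_lock(&tb_lock);
    for (int i = 0; i < NR_VCPUS; i++) {
        vcpus[i].tb_invalidated_flag = true;
    }
    pthread_mutex_unlock(&tb_lock);
}

/*
 * vCPU side: each thread checks and clears only its own flag, also
 * under tb_lock, and drops its stale next_tb when the flag was set.
 * Returns true if next_tb was dropped.
 */
static bool sketch_check_invalidated(VCPUSketch *cpu)
{
    bool was_set;

    pthread_mutex_lock(&tb_lock);
    was_set = cpu->tb_invalidated_flag;
    if (was_set) {
        cpu->tb_invalidated_flag = false;
        cpu->next_tb = 0;
    }
    pthread_mutex_unlock(&tb_lock);
    return was_set;
}
```

With a single global flag, the first vCPU to clear it would hide the
invalidation from all the others. Here one vCPU clearing its flag leaves
the other flags set, so every vCPU drops its own next_tb exactly once
per invalidation.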

>
>>> Because TranslationBlocks live in tcg_ctx.tb_ctx.tbs you need
>>> special code to exit all CPUs at tb_flush time, otherwise you risk that
>>> a tb_alloc reuses a TranslationBlock while it is in use by a VCPU.
>> Looks like no matter which approach we use, it's ultimately necessary to
>> ensure all CPUs have exited from translated code before the translation
>> buffer may be safely flushed.
> My plan was to use some kind of double buffering, where only half of
> code_gen_buffer is in use.  At the end of tb_flush you call cpu_exit()
> on all CPUs, so that CPUs stop executing chained TBs from the old half
> before they can see one from the new half.
>
> If code_gen_buffer is static you have to preallocate two buffers (and
> two tbs arrays) and waste one of them; while it is theoretically
> possible to have CPUs still executing from the old half while you finish
> the new half, it can be more or less ignored.
>
> If it is dynamic, the previously used areas can be freed with call_rcu,
> and you can safely allocate a new code_gen_buffer and tbs array.
>
> I haven't thought much about it; it might require keeping a cache of the
> tbs array per CPU, and possibly changing the code under "if
> (tcg_ctx.tb_ctx.tb_invalidated_flag)" to simply exit cpu_exec.

Maybe save this idea for later? :) We'd better start with a simpler
approach and optimize afterwards. BTW, a few years ago I came across an
interesting paper on code cache eviction granularities [1].

[1]
http://www.cs.virginia.edu/kim/courses/cs851/papers/hazelwood04mediumgrained.pdf

Kind regards,
Sergey