From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:59984)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <alex.bennee@linaro.org>) id 1aYyiM-0006pI-Rc
	for qemu-devel@nongnu.org; Thu, 25 Feb 2016 11:24:31 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <alex.bennee@linaro.org>) id 1aYyiJ-0004qo-M3
	for qemu-devel@nongnu.org; Thu, 25 Feb 2016 11:24:30 -0500
Received: from mail-wm0-x22f.google.com ([2a00:1450:400c:c09::22f]:34429)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <alex.bennee@linaro.org>) id 1aYyiJ-0004qc-9u
	for qemu-devel@nongnu.org; Thu, 25 Feb 2016 11:24:27 -0500
Received: by mail-wm0-x22f.google.com with SMTP id b205so38687043wmb.1
	for <qemu-devel@nongnu.org>; Thu, 25 Feb 2016 08:24:27 -0800 (PST)
References: <874mcy818t.fsf@linaro.org> <56CEC235.3090603@greensocs.com>
From: Alex =?utf-8?Q?Benn=C3=A9e?= <alex.bennee@linaro.org>
In-reply-to: <56CEC235.3090603@greensocs.com>
Date: Thu, 25 Feb 2016 16:24:24 +0000
Message-ID: <87y4a87o7b.fsf@linaro.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
Subject: Re: [Qemu-devel] Making all TB invalidation asynchronous (MTTCG
	safe_work)?
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: Frederic Konrad <fred.konrad@greensocs.com>
Cc: MTTCG Devel <mttcg@listserver.greensocs.com>, Paolo Bonzini <pbonzini@redhat.com>, "Emilio G. Cota" <cota@braap.org>, "qemu-devel@nongnu.org" <qemu-devel@nongnu.org>, Sergey Fedorov <serge.fdrv@gmail.com>


Frederic Konrad <fred.konrad@greensocs.com> writes:

> Hi Alex,
>
> We decided in Seattle to make this flag per tb (eg move it to the tb
> struct).

Indeed and looking at the state of Emilo's branch I see he did exactly
that and solved a lot of the lock contention problems.

>
> On 24/02/2016 18:30, Alex Bennée wrote:
>> Hi,
>>
>> So I've been working on reducing MTTCG tb_lock contention and currently
>> have a tb_lock around the following code (in my cpu_exec):
>>
>>     /* Note: we do it here to avoid a gcc bug on Mac OS X when
>>        doing it in tb_find_slow */
>>     tb_lock();
>>     if (tcg_ctx.tb_ctx.tb_invalidated_flag) {
>>         /* as some TB could have been invalidated because
>>            of memory exceptions while generating the code, we
>>            must recompute the hash index here */
>>         next_tb = 0;
>>         tcg_ctx.tb_ctx.tb_invalidated_flag = 0;
>>     }
>>     /* see if we can patch the calling TB. When the TB
>>        spans two pages, we cannot safely do a direct
>>        jump. */
>>     if (next_tb != 0 && tb->page_addr[1] == -1
>>         && !qemu_loglevel_mask(CPU_LOG_TB_NOCHAIN)) {
>>         tb_add_jump((TranslationBlock *)(next_tb & ~TB_EXIT_MASK),
>>                     next_tb & TB_EXIT_MASK, tb);
>>     }
>>     tb_unlock();
>>
>> And this started me down the rabbit hole of the meaning of
>> tcg_ctx.tb_ctx.tb_invalidated_flag. So as far as I follow there are two
>> places this is set:
>>
>>  * We've run out of translation memory and we are throwing everything
>>    away (tb_alloc == NULL)
>>  * We've invalidated the physical pages of some TranslationBlocks
>>
>> The first case there is a slightly convoluted buffer overflow handing
>> code (tb_gen_code):
>>
>>     if (unlikely(!tb)) {
>>  buffer_overflow:
>>         /* flush must be done */
>>         tb_flush_safe(cpu);
>>         /* Don't forget to invalidate previous TB info.  */
>>         tcg_ctx.tb_ctx.tb_invalidated_flag = 1;
>>         tb_unlock();
>>         cpu_loop_exit(cpu);
>>     }
>>
>> Which I'm sure could be more simply handled by just queuing the safe
>> tb_flush and returning a NULL tb and letting the execution loop unwind
>> before resetting the translation buffers.
>>
>> The second case has been partially asynced by Fred:
>>
>>     /* invalidate one TB */
>>     void tb_phys_invalidate(TranslationBlock *tb, tb_page_addr_t page_addr)
>>     {
>>         CPUState *cpu;
>>         PageDesc *p;
>>         unsigned int h;
>>         tb_page_addr_t phys_pc;
>>         struct CPUDiscardTBParams *params;
>>
>>         assert_tb_lock(); /* added by me because of bellow */
>>
>>         /* remove the TB from the hash list */
>>         phys_pc = tb->page_addr[0] + (tb->pc & ~TARGET_PAGE_MASK);
>>         h = tb_phys_hash_func(phys_pc);
>>         tb_hash_remove(&tcg_ctx.tb_ctx.tb_phys_hash[h], tb);
>>
>>         /* remove the TB from the page list */
>>         if (tb->page_addr[0] != page_addr) {
>>             p = page_find(tb->page_addr[0] >> TARGET_PAGE_BITS);
>>             tb_page_remove(&p->first_tb, tb);
>>             invalidate_page_bitmap(p);
>>         }
>>         if (tb->page_addr[1] != -1 && tb->page_addr[1] != page_addr) {
>>             p = page_find(tb->page_addr[1] >> TARGET_PAGE_BITS);
>>             tb_page_remove(&p->first_tb, tb);
>>             invalidate_page_bitmap(p);
>>         }
>>
>>         tcg_ctx.tb_ctx.tb_invalidated_flag = 1;
>>
>>         CPU_FOREACH(cpu) {
>>             params = g_malloc(sizeof(struct CPUDiscardTBParams));
>>             params->cpu = cpu;
>>             params->tb = tb;
>>             async_run_on_cpu(cpu, cpu_discard_tb_from_jmp_cache, params);
>>         }
>>         async_run_safe_work_on_cpu(first_cpu, tb_invalidate_jmp_remove, tb);
>>
>>         tcg_ctx.tb_ctx.tb_phys_invalidate_count++;
>>     }
>>
>> But I'm wondering why we can't defer all the page invalidation to safe
>> work?
>>
>> I don't think it matters to the invalidating vCPU as it has to get
>> to the end of its block anyway. For other vCPUs as there is no strict
>> synchronisation can we not pretend what ever the operation was that
>> triggered the invalidation happened just as the block ended?
>>
>> The final case I don't quite follow is the avoiding invalidation of
>> tb_next in cpu_exec_nocache() if we have already caused a tb
>> invalidation event:
>>
>>     tb->orig_tb = tcg_ctx.tb_ctx.tb_invalidated_flag ? NULL : orig_tb;
>>
>> Which is later (in cpu_io_recompile):
>>
>>     if (tb->cflags & CF_NOCACHE) {
>>         if (tb->orig_tb) {
>>             /* Invalidate original TB if this TB was generated in
>>              * cpu_exec_nocache() */
>>             tb_phys_invalidate(tb->orig_tb, -1);
>>         }
>>         tb_free(tb);
>>     }
>>
>> My aim in all of this is to see if we can remove another flag from
>> tb_ctx (one less thing to mutex access to) and make the code flow easier
>> to follow. So remaining question:
>>
>> * Are there cases where not immediately invalidating the tb_page
>>   structures would cause problems for the emulation?
>
> Is that the same issue we might have with the memory barriers?
>
> Fred
>>
>> Thanks in advance for any elucidation ;-)
>>
>> --
>> Alex Bennée
>>


--
Alex Bennée