From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:48402)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <pbonzini@redhat.com>) id 1Yq1bj-0004f7-NJ
	for qemu-devel@nongnu.org; Wed, 06 May 2015 11:51:36 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <pbonzini@redhat.com>) id 1Yq1bc-0000a8-Nq
	for qemu-devel@nongnu.org; Wed, 06 May 2015 11:51:35 -0400
Received: from mx1.redhat.com ([209.132.183.28]:56992)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <pbonzini@redhat.com>) id 1Yq1bc-0000Zv-Bn
	for qemu-devel@nongnu.org; Wed, 06 May 2015 11:51:28 -0400
Message-ID: <554A386F.9030804@redhat.com>
Date: Wed, 06 May 2015 17:51:11 +0200
From: Paolo Bonzini <pbonzini@redhat.com>
MIME-Version: 1.0
References: <1430926687-25875-1-git-send-email-a.rigo@virtualopensystems.com>
In-Reply-To: <1430926687-25875-1-git-send-email-a.rigo@virtualopensystems.com>
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [RFC 0/5] Slow-path for atomic instruction
	translation
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: Alvise Rigo <a.rigo@virtualopensystems.com>, qemu-devel@nongnu.org
Cc: mttcg@greensocs.com, jani.kokkonen@huawei.com, tech@virtualopensystems.com, claudio.fontana@huawei.com

On 06/05/2015 17:38, Alvise Rigo wrote:
> This patch series provides an infrastructure for atomic
> instruction implementation in QEMU, paving the way for TCG multi-threading.
> The adopted design does not rely on host atomic
> instructions and is intended to propose a 'legacy' solution for
> translating guest atomic instructions.
> 
> The underlying idea is to provide new TCG instructions that guarantee
> atomicity to some memory accesses or in general a way to define memory
> transactions. More specifically, a new pair of TCG instructions is
> implemented, qemu_ldlink_i32 and qemu_stcond_i32, that behave as
> LoadLink and StoreConditional primitives (only the 32-bit variant is
> implemented). To achieve this, a new bitmap is added to the
> ram_list structure (always unique) which flags all memory pages that
> cannot be accessed directly through the fast-path, due to previous
> exclusive operations. This new bitmap is coupled with a new TLB flag
> which forces slow-path execution. Any store by another vCPU to the same
> memory page that takes place between a LoadLink and its
> StoreConditional will make that StoreConditional fail.
> 
> In theory, the provided implementation of TCG LoadLink/StoreConditional
> can be used to properly handle atomic instructions on any architecture.
> 
> The new slow-path is implemented such that:
> - the LoadLink behaves as a normal load slow-path, except that it clears
>   the dirty flag in the bitmap. The TLB entries created from then on will
>   force the slow-path. To ensure this, we flush the TLB cache for the
>   other vCPUs
> - the StoreConditional behaves as a normal store slow-path, except that
>   it checks the state of the dirty bitmap and returns 0 or 1 depending on
>   whether the StoreConditional succeeded (0 when no vCPU has touched the
>   same memory in the meantime).
> 
> All write accesses that are forced onto the 'legacy' slow-path mark the
> accessed memory page as dirty.
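
The slow-path rules quoted above can be illustrated with a small toy
model. This is only a sketch, not code from the series: the class
ExclusiveMemory, its method names, the helper atomic_add and the 4096-byte
PAGE_SIZE are all hypothetical, the per-page flag stands in for the new
ram_list bitmap, and TLB flushing and real vCPU threads are left out
entirely. It also assumes that a successful StoreConditional marks the
page dirty again, which the cover letter does not state explicitly.

```python
PAGE_SIZE = 4096  # assumed page size, for illustration only


class ExclusiveMemory:
    """Toy model of the LL/SC slow path (hypothetical, not QEMU code)."""

    def __init__(self, size):
        self.mem = [0] * size  # one integer per guest "address"
        n_pages = (size + PAGE_SIZE - 1) // PAGE_SIZE
        # Per-page dirty flag; True means no exclusive access is pending.
        self.dirty = [True] * n_pages

    def load_link(self, addr):
        # Like a normal load, but clears the page's dirty flag so that
        # later stores to this page are forced onto the slow path.
        self.dirty[addr // PAGE_SIZE] = False
        return self.mem[addr]

    def store(self, addr, value):
        # Ordinary store on the slow path: marks the page dirty.
        self.dirty[addr // PAGE_SIZE] = True
        self.mem[addr] = value

    def store_conditional(self, addr, value):
        # Returns 0 on success and 1 on failure, as in the description.
        page = addr // PAGE_SIZE
        if self.dirty[page]:
            return 1  # the page was written since the LoadLink
        self.mem[addr] = value
        self.dirty[page] = True  # assumption: success ends the exclusive state
        return 0


def atomic_add(memory, addr, delta):
    # Typical guest-side retry loop built on LoadLink/StoreConditional.
    while True:
        old = memory.load_link(addr)
        if memory.store_conditional(addr, old + delta) == 0:
            return old
```

Because the flag is tracked per page, an unrelated store to a different
address in the same page also makes the StoreConditional fail in this
model, which matches the page granularity described in the cover letter.
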
> 
> In this series only the ARM ldrex/strex instructions are implemented.
> The code was tested with bare-metal test cases and with Linux, using
> upstream QEMU.
> 
> This work has been sponsored by Huawei Technologies Dusseldorf GmbH.
> 
> Alvise Rigo (5):
>   exec: Add new exclusive bitmap to ram_list
>   Add new TLB_EXCL flag
>   softmmu: Add helpers for a new slow-path
>   tcg-op: create new TCG qemu_ldlink and qemu_stcond instructions
>   target-arm: translate: implement qemu_ldlink and qemu_stcond ops

That's pretty cool.

Paolo