From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:48402)
    by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
    id 1Yq1bj-0004f7-NJ for qemu-devel@nongnu.org;
    Wed, 06 May 2015 11:51:36 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
    (envelope-from ) id 1Yq1bc-0000a8-Nq for qemu-devel@nongnu.org;
    Wed, 06 May 2015 11:51:35 -0400
Received: from mx1.redhat.com ([209.132.183.28]:56992)
    by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
    id 1Yq1bc-0000Zv-Bn for qemu-devel@nongnu.org;
    Wed, 06 May 2015 11:51:28 -0400
Message-ID: <554A386F.9030804@redhat.com>
Date: Wed, 06 May 2015 17:51:11 +0200
From: Paolo Bonzini
MIME-Version: 1.0
References: <1430926687-25875-1-git-send-email-a.rigo@virtualopensystems.com>
In-Reply-To: <1430926687-25875-1-git-send-email-a.rigo@virtualopensystems.com>
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [RFC 0/5] Slow-path for atomic instruction translation
To: Alvise Rigo , qemu-devel@nongnu.org
Cc: mttcg@greensocs.com, jani.kokkonen@huawei.com,
    tech@virtualopensystems.com, claudio.fontana@huawei.com

On 06/05/2015 17:38, Alvise Rigo wrote:
> This patch series provides an infrastructure for atomic
> instruction implementation in QEMU, paving the way for TCG multi-threading.
> The adopted design does not rely on host atomic
> instructions and is intended to propose a 'legacy' solution for
> translating guest atomic instructions.
>
> The underlying idea is to provide new TCG instructions that guarantee
> atomicity for some memory accesses or, more generally, a way to define
> memory transactions. More specifically, a new pair of TCG instructions is
> implemented, qemu_ldlink_i32 and qemu_stcond_i32, that behave as
> LoadLink and StoreConditional primitives (only the 32-bit variant is
> implemented).
> In order to achieve this, a new bitmap is added to the
> ram_list structure (always unique) which flags all memory pages that
> cannot be accessed directly through the fast-path, due to previous
> exclusive operations. This new bitmap is coupled with a new TLB flag
> which forces the slow-path execution. Any store by another vCPU to the
> same memory page that takes place between a LoadLink and its
> StoreConditional will make the subsequent StoreConditional fail.
>
> In theory, the provided implementation of TCG LoadLink/StoreConditional
> can be used to properly handle atomic instructions on any architecture.
>
> The new slow-path is implemented such that:
> - the LoadLink behaves as a normal load slow-path, except for clearing
>   the dirty flag in the bitmap. The TLB entries created from now on will
>   force the slow-path. To ensure this, we flush the TLB of the
>   other vCPUs
> - the StoreConditional behaves as a normal store slow-path, except for
>   checking the state of the dirty bitmap and returning 0 or 1 depending
>   on whether the StoreConditional succeeded (0 when no vCPU has touched
>   the same memory in the meantime).
>
> All those write accesses that are forced to follow the 'legacy'
> slow-path will set the accessed memory page to dirty.
>
> In this series only the ARM ldrex/strex instructions are implemented.
> The code was tested with bare-metal test cases and with Linux, using
> upstream QEMU.
>
> This work has been sponsored by Huawei Technologies Dusseldorf GmbH.
>
> Alvise Rigo (5):
>   exec: Add new exclusive bitmap to ram_list
>   Add new TLB_EXCL flag
>   softmmu: Add helpers for a new slow-path
>   tcg-op: create new TCG qemu_ldlink and qemu_stcond instructions
>   target-arm: translate: implement qemu_ldlink and qemu_stcond ops

That's pretty cool.

Paolo
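
For readers following the thread, the page-granular LoadLink/StoreConditional
scheme described in the quoted cover letter can be sketched as a small toy
model. This is not code from the patches and does not use any QEMU API: the
names ToyMachine, load_link, store and store_cond, the 4 KiB page size, and
the per-page dictionary standing in for the exclusive bitmap are all
hypothetical and chosen only for illustration. It also assumes that a
successful StoreConditional, being a slow-path store, marks the page dirty
again; the cover letter suggests this but does not state it outright.

```python
PAGE_SIZE = 4096  # assumed page size; the cover letter does not specify one


class ToyMachine:
    """Toy model of the described slow-path; not QEMU code."""

    def __init__(self, num_cpus):
        self.num_cpus = num_cpus
        self.mem = {}  # address -> 32-bit value
        # Stand-in for the exclusive bitmap: a page appears here once it has
        # been the target of a LoadLink; the value is its dirty flag.
        self.dirty = {}
        # Counts how often each vCPU's TLB would have been flushed.
        self.tlb_flushes = [0] * num_cpus

    def load_link(self, cpu, addr):
        page = addr // PAGE_SIZE
        self.dirty[page] = False  # clear the dirty flag for this page
        # Flush the other vCPUs' TLBs so their new entries force the slow-path.
        for other in range(self.num_cpus):
            if other != cpu:
                self.tlb_flushes[other] += 1
        return self.mem.get(addr, 0)

    def store(self, cpu, addr, value):
        page = addr // PAGE_SIZE
        if page in self.dirty:
            # Slow-path store: mark the accessed page dirty.
            self.dirty[page] = True
        self.mem[addr] = value & 0xFFFFFFFF

    def store_cond(self, cpu, addr, value):
        page = addr // PAGE_SIZE
        if self.dirty.get(page, True):
            return 1  # page touched since the LoadLink (or no LoadLink): fail
        self.mem[addr] = value & 0xFFFFFFFF
        self.dirty[page] = True  # assumption: the SC is itself a slow-path store
        return 0  # success


# Usage: vCPU 0 attempts an atomic increment while vCPU 1 writes to the same page.
machine = ToyMachine(num_cpus=2)
old = machine.load_link(0, 0x1000)
machine.store(1, 0x1004, 9)                    # different word, same page
status = machine.store_cond(0, 0x1000, old + 1)  # status is 1: the SC fails
```

Because tracking is per page rather than per address, the store to 0x1004
is enough to fail the StoreConditional on 0x1000. A guest retry loop would
simply repeat the LoadLink/StoreConditional pair until it gets 0 back.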