From: Richard Henderson
Date: Mon, 27 Jun 2016 13:07:42 -0700
Subject: Re: [Qemu-devel] [RFC 02/30] tcg: add tcg_cmpxchg_lock
Message-ID: <84eab369-ea37-3262-c433-e57175fc366f@twiddle.net>
In-Reply-To: <1467054136-10430-3-git-send-email-cota@braap.org>
References: <1467054136-10430-1-git-send-email-cota@braap.org> <1467054136-10430-3-git-send-email-cota@braap.org>
To: "Emilio G. Cota", QEMU Developers, MTTCG Devel
Cc: Peter Maydell, Alvise Rigo, Sergey Fedorov, Paolo Bonzini, Alex Bennée

On 06/27/2016 12:01 PM, Emilio G. Cota wrote:
> This set of locks will allow us to correctly emulate cmpxchg16
> in a parallel TCG. The key observation is that no architecture
> supports 16-byte regular atomic load/stores; only "locked" accesses
> (e.g. via cmpxchg16b on x86) are allowed, and therefore we can emulate
> them by using locks.
>
> We use a small array of locks so that we can have some scalability.
> Further improvements are possible (e.g. using a radix tree); but
> we should have a workload to benchmark in order to justify the
> additional complexity.
>
> Signed-off-by: Emilio G. Cota
> ---
>  cpu-exec.c        |  1 +
>  linux-user/main.c |  1 +
>  tcg/tcg.h         |  5 +++++
>  translate-all.c   | 39 +++++++++++++++++++++++++++++++++++++++
>  4 files changed, 46 insertions(+)

As formulated, this doesn't work. In order to support cmpxchg16 without
a native one, you have to use locks on *all* operations, lest a 4-byte
atomic operation and a 16-byte operation be simultaneous in the same
address range.

Thankfully, the most common hosts (x86_64, aarch64, power7, s390x) do
have a 16-byte cmpxchg, so this shouldn't really matter much in
practice.

It would be nice to continue to support the other hosts (arm32, mips,
ppc32, sparc, i686) without locks when the guest doesn't require wider
atomics than the host supports.

r~
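
To make the objection concrete, here is a minimal sketch of what "locks on
*all* operations" would look like on a host without a native 16-byte
cmpxchg. It is not the code from the patch: every name in it (NR_LOCKS,
lock_for, emu_cmpxchg16, emu_store4, u128) is hypothetical, and it uses
plain pthread mutexes rather than QEMU's own primitives. The point is only
that the narrow 4-byte store has to take the same lock as the emulated
16-byte cmpxchg, because both can touch the same 16-byte block.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sketch; none of these names come from the patch. */
#define NR_LOCKS 16 /* small power-of-two array of locks */

static pthread_mutex_t locks[NR_LOCKS];

typedef struct {
    uint64_t lo, hi;
} u128;

/* Must be called once before any emulated access. */
static void locks_init(void)
{
    for (int i = 0; i < NR_LOCKS; i++) {
        pthread_mutex_init(&locks[i], NULL);
    }
}

/* Select a lock from the 16-byte-aligned block that contains addr, so
 * every access inside one block maps to the same lock. */
static pthread_mutex_t *lock_for(const void *addr)
{
    uintptr_t block = (uintptr_t)addr >> 4;
    return &locks[block & (NR_LOCKS - 1)];
}

/* Emulated 16-byte compare-and-swap. On failure, *expected is updated
 * with the value found in memory. */
static bool emu_cmpxchg16(void *addr, u128 *expected, u128 desired)
{
    pthread_mutex_t *l = lock_for(addr);
    u128 cur;
    bool ok;

    assert(((uintptr_t)addr & 15) == 0); /* 16-byte alignment assumed */

    pthread_mutex_lock(l);
    memcpy(&cur, addr, sizeof(cur));
    ok = cur.lo == expected->lo && cur.hi == expected->hi;
    if (ok) {
        memcpy(addr, &desired, sizeof(desired));
    } else {
        *expected = cur;
    }
    pthread_mutex_unlock(l);
    return ok;
}

/* A 4-byte store that takes the same lock. If this were a bare host
 * store instead, it could land between the compare and the write in
 * emu_cmpxchg16 and be silently overwritten. */
static void emu_store4(void *addr, uint32_t val)
{
    pthread_mutex_t *l = lock_for(addr);

    pthread_mutex_lock(l);
    memcpy(addr, &val, sizeof(val));
    pthread_mutex_unlock(l);
}
```

On a host that does have a 16-byte cmpxchg, none of this would be needed
and the narrow accesses could stay lock-free, which is the case worth
keeping fast.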