From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1753053AbbJFVjT (ORCPT );
	Tue, 6 Oct 2015 17:39:19 -0400
Received: from terminus.zytor.com ([198.137.202.10]:39147 "EHLO
	mail.zytor.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1752574AbbJFVjR (ORCPT );
	Tue, 6 Oct 2015 17:39:17 -0400
Subject: Re: [PATCH] x86: cmpxchg_double: Add missing memory clobber
To: Pranith Kumar
References: <1444157670-19951-1-git-send-email-bobby.prani@gmail.com>
	<56142C2E.20308@zytor.com>
Cc: Thomas Gleixner, Ingo Molnar,
	"maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT)",
	"open list:X86 ARCHITECTURE (32-BIT AND 64-BIT)"
From: "H. Peter Anvin"
Message-ID: <56143F76.2010902@zytor.com>
Date: Tue, 6 Oct 2015 14:39:02 -0700
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101
	Thunderbird/38.2.0
MIME-Version: 1.0
In-Reply-To:
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On 10/06/2015 01:29 PM, Pranith Kumar wrote:
> On Tue, Oct 6, 2015 at 4:16 PM, H. Peter Anvin wrote:
>>
>> NAK.  We already have the "+m" for exactly this reason; adding an
>> explicit memory clobber should only be used to prevent movement of
>> *other* memory operations around this one (i.e. a barrier).
>>
>
> OK. If that is so, can you please explain why we need it in the
> __raw_cmpxchg() case? I think it is a good idea to make cmpxchg() and
> cmpxchg_double() have similar barrier semantics.
>

OK, it is a bit of a mess.  We use the same macros for locked operations
(__cmpxchg and __sync_cmpxchg) and unlocked operations
(__cmpxchg_local).  For locked operations we generally want a compiler
barrier, although there are exceptions.  I'm wondering if it would be
better to add an explicit barrier(); to the locked versions.
However, I think one of the major uses for cmpxchg_double() is for page
table manipulation, and for that it isn't clear that a compiler barrier
is needed or desired.

On the other hand, perhaps all of this is false optimization and we
should just add the memory clobber.  The real issue is the impact on the
_local variants.

	-hpa
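
For readers following the thread, the distinction between the "+m" operand and the "memory" clobber can be sketched roughly as below. This is a minimal illustration, not the kernel's actual macros: the demo_* names are hypothetical, and it assumes GCC or Clang extended asm on x86 (a builtin-based fallback is included only so the sketch still compiles elsewhere). The unlocked variant declares only *ptr as modified; the locked variant additionally carries a "memory" clobber, so the compiler treats it as a barrier for other memory accesses.

```c
#include <assert.h>

#if defined(__x86_64__) || defined(__i386__)
/*
 * Hypothetical unlocked helper. "+m"(*ptr) tells the compiler that the
 * asm reads and writes *ptr; no other memory is declared as touched, so
 * unrelated loads and stores may still be moved across the statement.
 */
static inline unsigned long demo_cmpxchg_local(unsigned long *ptr,
					       unsigned long old,
					       unsigned long newval)
{
	unsigned long prev = old;	/* accumulator: expected value in, previous value out */

	__asm__ __volatile__("cmpxchg %2, %1"
			     : "+a" (prev), "+m" (*ptr)
			     : "r" (newval)
			     : "cc");
	return prev;
}

/*
 * Hypothetical locked helper. Same operands, plus the lock prefix and a
 * "memory" clobber; the clobber makes the statement a compiler barrier
 * for all memory accesses, not only the access to *ptr.
 */
static inline unsigned long demo_cmpxchg(unsigned long *ptr,
					 unsigned long old,
					 unsigned long newval)
{
	unsigned long prev = old;

	__asm__ __volatile__("lock cmpxchg %2, %1"
			     : "+a" (prev), "+m" (*ptr)
			     : "r" (newval)
			     : "cc", "memory");
	return prev;
}
#else
/* Assumption: non-x86 fallback using compiler builtins, build convenience only. */
static inline unsigned long demo_cmpxchg_local(unsigned long *ptr,
					       unsigned long old,
					       unsigned long newval)
{
	__atomic_compare_exchange_n(ptr, &old, newval, 0,
				    __ATOMIC_RELAXED, __ATOMIC_RELAXED);
	return old;	/* holds the previous value on success and on failure */
}

static inline unsigned long demo_cmpxchg(unsigned long *ptr,
					 unsigned long old,
					 unsigned long newval)
{
	__atomic_compare_exchange_n(ptr, &old, newval, 0,
				    __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
	return old;
}
#endif

/* Small usage check: returns 1 if both helpers behave as compare-and-exchange. */
int demo_check(void)
{
	unsigned long v = 5;
	unsigned long prev;

	prev = demo_cmpxchg(&v, 5, 7);		/* expected value matches: v becomes 7 */
	assert(prev == 5 && v == 7);

	prev = demo_cmpxchg_local(&v, 5, 9);	/* mismatch: v is left at 7 */
	assert(prev == 7 && v == 7);
	return 1;
}
```

Adding an explicit barrier() to the locked versions only, as floated above, would correspond to dropping "memory" from the shared asm and placing the compiler barrier in the locked wrapper, leaving the _local variant free of it.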