From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1755515AbcHCEgs (ORCPT <rfc822;w@1wt.eu>);
	Wed, 3 Aug 2016 00:36:48 -0400
Received: from mx1.redhat.com ([209.132.183.28]:57532 "EHLO mx1.redhat.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1751351AbcHCEgj (ORCPT <rfc822;linux-kernel@vger.kernel.org>);
	Wed, 3 Aug 2016 00:36:39 -0400
Date: Wed, 3 Aug 2016 07:36:34 +0300
From: "Michael S. Tsirkin" <mst@redhat.com>
To: "H. Peter Anvin" <hpa@zytor.com>
Cc: Peter Zijlstra <peterz@infradead.org>, Ingo Molnar <mingo@kernel.org>,
        Dexuan Cui <decui@microsoft.com>,
        "linux-x86_64@vger.kernel.org" <linux-x86_64@vger.kernel.org>,
        Thomas Gleixner <tglx@linutronix.de>, Ingo Molnar <mingo@redhat.com>,
        David Howells <dhowells@redhat.com>,
        "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
        "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: x86 memory barrier: why does Linux prefer MFENCE to Locked ADD?
Message-ID: <20160803073134-mutt-send-email-mst@kernel.org>
References: <BLUPR03MB1410A48DDA4C0A4902A8E163BFBD0@BLUPR03MB1410.namprd03.prod.outlook.com>
 <20160303152739.GA16303@gmail.com>
 <20160303153453.GR6356@twins.programming.kicks-ass.net>
 <20160303203414-mutt-send-email-mst@redhat.com>
 <66C0F0F8-5D2C-47DB-8C7A-EF8A15F263DB@zytor.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To: <66C0F0F8-5D2C-47DB-8C7A-EF8A15F263DB@zytor.com>
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.25]); Wed, 03 Aug 2016 04:36:39 +0000 (UTC)
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Mar 03, 2016 at 11:05:43AM -0800, H. Peter Anvin wrote:
> On March 3, 2016 10:35:50 AM PST, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> >On Thu, Mar 03, 2016 at 04:34:53PM +0100, Peter Zijlstra wrote:
> >> On Thu, Mar 03, 2016 at 04:27:39PM +0100, Ingo Molnar wrote:
> >> > 
> >> > * Dexuan Cui <decui@microsoft.com> wrote:
> >> > 
> >> > > Hi,
> >> > > My understanding about arch/x86/include/asm/barrier.h is: obviously Linux
> >> > > more likes {L,S,M}FENCE -- Locked ADD is only used in x86_32 platforms that
> >> > > don't support XMM2.
> >> > > 
> >> > > However, it looks people say Locked Add is much faster than the FENCE
> >> > > instructions, even on modern Intel CPUs like Haswell, e.g., please see
> >> > > the three sources:
> >> > > 
> >> > > " 11.5.1 Locked Instructions as Memory Barriers
> >> > > Optimization
> >> > > Use locked instructions to implement Store/Store and Store/Load barriers.
> >> > > "
> >> > > http://support.amd.com/TechDocs/47414_15h_sw_opt_guide.pdf
> >> > > 
> >> > > "lock addl %(rsp), 0 is a better solution for StoreLoad barrier":
> >> > > http://shipilev.net/blog/2014/on-the-fence-with-dependencies/
> >> > > 
> >> > > "...locked instruction are more efficient barriers...":
> >> > > http://www.pvk.ca/Blog/2014/10/19/performance-optimisation-~-writing-an-essay/
> >> > > 
> >> > > I also found that FreeBSD prefers Locked Add.
> >> > > 
> >> > > So, I'm curious why Linux prefers MFENCE.
> >> > > I guess I may be missing something.
> >> > > 
> >> > > I tried to google the question, but didn't find an answer.
> >> > 
> >> > It's being worked on, see this thread on lkml from a few weeks ago:
> >> > 
> >> >    C Jan 13 Michael S. Tsir    | [PATCH v3 0/4] x86: faster mb()+documentation tweaks
> >> >    C Jan 13 Michael S. Tsir    | ├─>[PATCH v3 1/4] x86: add cc clobber for addl
> >> >    C Jan 13 Michael S. Tsir    | ├─>[PATCH v3 2/4] x86: drop a comment left over from X86_OOSTORE
> >> >    C Jan 13 Michael S. Tsir    | ├─>[PATCH v3 3/4] x86: tweak the comment about use of wmb for IO
> >> >    C Jan 13 Michael S. Tsir    | ├─>[PATCH v3 4/4] x86: drop mfence in favor of lock+addl
> >> > 
> >> > The 4th patch changes MFENCE to a LOCK ADDL locked instruction.
> >> 
> >> Lots of additional chatter here:
> >> 
> >>   lkml.kernel.org/r/20160112150032-mutt-send-email-mst@redhat.com
> >> 
> >> And some useful bits here:
> >> 
> >>   lkml.kernel.org/r/56957D54.5000602@zytor.com
> >> 
> >> latest version here:
> >> 
> >>   lkml.kernel.org/r/1453921746-16178-1-git-send-email-mst@redhat.com
> >
> >It's ready as far as I am concerned.
> >Basically we are just waiting for ack from hpa.
> 
> And I'm still discussing this with the hardware people.  It seems we
> can do this for *most* things, but not all; the question is where
> exactly we need to do something different.

I'm guessing there's still no update?

There's a decent chance that, without documentation, a bunch of current
uses are actually broken. See for example
http://marc.info/?l=linux-kernel&m=145400059304553&w=2
which, going by the manual, fixes an smp_mb() misuse around clflush - or maybe not?
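
For anyone skimming the thread, here is a rough user-space sketch of the
two barrier flavours being compared. This is not the kernel's code: the
helper names (mb_mfence, mb_lock_addl, store_then_load) are made up for
illustration, the -4(%rsp) operand is an assumption about what a locked
no-op add could look like, and the sketch assumes GCC or Clang inline
asm on x86-64 (other targets fall back to a compiler builtin).

```c
/* Illustration only; hypothetical helper names, not the kernel macros. */
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
static inline void mb_mfence(void)
{
	/* Full fence via the dedicated instruction. */
	__asm__ __volatile__("mfence" ::: "memory");
}

static inline void mb_lock_addl(void)
{
	/*
	 * Locked add of 0 to a stack slot: the memory value is unchanged,
	 * but the locked read-modify-write acts as a store-load barrier
	 * for ordinary memory accesses. It modifies flags, hence "cc".
	 * Whether it also orders clflush is the open question above.
	 */
	__asm__ __volatile__("lock; addl $0,-4(%%rsp)" ::: "memory", "cc");
}
#else
/* Non-x86-64 fallback so the sketch still builds. */
static inline void mb_mfence(void)    { __sync_synchronize(); }
static inline void mb_lock_addl(void) { __sync_synchronize(); }
#endif

static volatile int flag_a, flag_b;

/* Store, full barrier, then load: the store-load pattern in question. */
static int store_then_load(void (*barrier)(void))
{
	flag_a = 1;
	barrier();
	return flag_b;
}
```

Calling store_then_load(mb_mfence) and store_then_load(mb_lock_addl)
exercises the same pattern with either barrier; the sketch says nothing
about relative cost, which would need a proper benchmark.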
