From: Ingo Molnar <mingo@kernel.org>
To: Dexuan Cui <decui@microsoft.com>
Cc: "linux-x86_64@vger.kernel.org" <linux-x86_64@vger.kernel.org>,
Thomas Gleixner <tglx@linutronix.de>,
Ingo Molnar <mingo@redhat.com>, "H. Peter Anvin" <hpa@zytor.com>,
David Howells <dhowells@redhat.com>,
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"Michael S. Tsirkin" <mst@redhat.com>,
Peter Zijlstra <a.p.zijlstra@chello.nl>
Subject: Re: x86 memory barrier: why does Linux prefer MFENCE to Locked ADD?
Date: Thu, 3 Mar 2016 16:27:39 +0100
Message-ID: <20160303152739.GA16303@gmail.com>
In-Reply-To: <BLUPR03MB1410A48DDA4C0A4902A8E163BFBD0@BLUPR03MB1410.namprd03.prod.outlook.com>

* Dexuan Cui <decui@microsoft.com> wrote:
> Hi,
> My understanding of arch/x86/include/asm/barrier.h is that Linux clearly
> prefers {L,S,M}FENCE; a locked ADD is used only on x86_32 platforms that
> don't support XMM2 (i.e. SSE2).
>
> However, people seem to say that a locked ADD is much faster than the FENCE
> instructions, even on modern Intel CPUs like Haswell; e.g., please see
> these three sources:
>
> " 11.5.1 Locked Instructions as Memory Barriers
> Optimization
> Use locked instructions to implement Store/Store and Store/Load barriers.
> "
> http://support.amd.com/TechDocs/47414_15h_sw_opt_guide.pdf
>
> "lock addl %(rsp), 0 is a better solution for StoreLoad barrier ":
> http://shipilev.net/blog/2014/on-the-fence-with-dependencies/
>
> "...locked instruction are more efficient barriers...":
> http://www.pvk.ca/Blog/2014/10/19/performance-optimisation-~-writing-an-essay/
>
> I also found that FreeBSD prefers Locked Add.
>
> So, I'm curious why Linux prefers MFENCE.
> I guess I may be missing something.
>
> I tried to google the question, but didn't find an answer.

It's being worked on; see this thread on lkml from a few weeks ago:
C Jan 13 Michael S. Tsir | [PATCH v3 0/4] x86: faster mb()+documentation tweaks
C Jan 13 Michael S. Tsir | ├─>[PATCH v3 1/4] x86: add cc clobber for addl
C Jan 13 Michael S. Tsir | ├─>[PATCH v3 2/4] x86: drop a comment left over from X86_OOSTORE
C Jan 13 Michael S. Tsir | ├─>[PATCH v3 3/4] x86: tweak the comment about use of wmb for IO
C Jan 13 Michael S. Tsir | ├─>[PATCH v3 4/4] x86: drop mfence in favor of lock+addl
The 4th patch replaces MFENCE with a LOCK-prefixed ADDL instruction.
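
For readers who have not seen the two forms side by side, here is a rough
sketch of what each barrier looks like as GCC/Clang inline asm. It is an
illustration only, not the code from the patch series: the names
mb_mfence(), mb_lock_addl() and announce_then_check() are made up for this
example, the 0(%rsp) operand is an assumption, and the non-x86-64 fallback
exists only so that the snippet builds elsewhere.

```c
/* Illustrative sketch only; helper names are hypothetical. */
#if defined(__x86_64__)

/* Full barrier via the dedicated fence instruction. */
static inline void mb_mfence(void)
{
	__asm__ __volatile__("mfence" ::: "memory");
}

/*
 * Full barrier via a locked read-modify-write that adds zero to the
 * word at the top of the stack. The memory contents are unchanged;
 * the "cc" clobber is needed because ADD rewrites the flags.
 */
static inline void mb_lock_addl(void)
{
	__asm__ __volatile__("lock; addl $0,0(%%rsp)" ::: "memory", "cc");
}

#else

/* Assumed fallback so the sketch compiles on other architectures. */
static inline void mb_mfence(void)
{
	__atomic_thread_fence(__ATOMIC_SEQ_CST);
}

static inline void mb_lock_addl(void)
{
	__atomic_thread_fence(__ATOMIC_SEQ_CST);
}

#endif

static volatile int flag_self, flag_other;

/*
 * Store/Load pattern: publish our flag, then read the other side's
 * flag. The barrier keeps the load from being satisfied before the
 * store is globally visible.
 */
static int announce_then_check(void (*barrier)(void))
{
	flag_self = 1;
	barrier();
	return flag_other;
}
```

Either helper can be passed to announce_then_check(); the caller sees the
same ordering guarantee for ordinary loads and stores, and only the cost of
the barrier instruction differs.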
Thanks,
Ingo
Thread overview: 10+ messages
2016-03-03 14:33 x86 memory barrier: why does Linux prefer MFENCE to Locked ADD? Dexuan Cui
2016-03-03 15:27 ` Ingo Molnar [this message]
2016-03-03 15:34 ` Peter Zijlstra
2016-03-03 18:35 ` Michael S. Tsirkin
2016-03-03 19:05 ` H. Peter Anvin
2016-06-03 13:39 ` Peter Zijlstra
2016-08-03 4:36 ` Michael S. Tsirkin
2016-08-03 12:50 ` Henrique de Moraes Holschuh
2016-08-03 13:04 ` Michael S. Tsirkin
2016-08-03 23:19 ` Henrique de Moraes Holschuh