Date: Tue, 12 Jan 2016 19:45:27 +0200
From: "Michael S. Tsirkin"
To: Linus Torvalds
Cc: Davidlohr Bueso, Peter Zijlstra, Ingo Molnar, Thomas Gleixner,
	"Paul E. McKenney", Linux Kernel Mailing List,
	the arch/x86 maintainers, "H. Peter Anvin", virtualization
Subject: Re: [PATCH 3/4] x86,asm: Re-work smp_store_mb()

On Tue, Jan 12, 2016 at 09:20:06AM -0800, Linus Torvalds wrote:
> On Tue, Jan 12, 2016 at 5:57 AM, Michael S. Tsirkin wrote:
> > #ifdef xchgrz
> > /* same as xchg but poking at gcc red zone */
> > #define barrier() do { int ret; asm volatile ("xchgl %0, -4(%%" SP ");": "=r"(ret) :: "memory", "cc"); } while (0)
> > #endif
>
> That's not safe in general. gcc might be using its redzone, so doing
> xchg into it is unsafe.
>
> But..
>
> > Is this a good way to test it?
>
> .. it's fine for some basic testing. It doesn't show any subtle
> interactions (ie some operations may have different dynamic behavior
> when the write buffers are busy etc), but as a baseline for "how fast
> can things go" the stupid raw loop is fine. And while the xchg into
> the redzone wouldn't be acceptable as a real implementation, for
> timing testing it's likely fine (ie you aren't hitting the problem it
> can cause).
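
Yes - and for anyone who wants to redo the measurement, a raw loop along
these lines is all it takes. This is just a minimal sketch for
illustration (the macro names and iteration count are made up here; it is
not the exact code from my earlier mail):

#include <stdio.h>
#include <stdint.h>

/* Pick the barrier flavor at build time: -DUSE_MFENCE, -DUSE_LOCKADD,
 * or nothing at all for the empty-loop baseline. */
#ifdef USE_MFENCE
#define test_barrier() asm volatile("mfence" ::: "memory")
#elif defined(USE_LOCKADD)
/* locked op on a dummy local, so we stay out of the gcc red zone */
#define test_barrier() \
	do { int dummy = 0; asm volatile("lock; addl $0, %0" : "+m"(dummy) :: "memory", "cc"); } while (0)
#else
#define test_barrier() asm volatile("" ::: "memory")
#endif

static inline uint64_t rdtsc(void)
{
	uint32_t lo, hi;

	asm volatile("rdtsc" : "=a"(lo), "=d"(hi));
	return ((uint64_t)hi << 32) | lo;
}

int main(void)
{
	enum { ITERS = 100000000 };
	uint64_t start = rdtsc();
	long i;

	for (i = 0; i < ITERS; i++)
		test_barrier();

	printf("%.2f cycles/iter\n", (double)(rdtsc() - start) / ITERS);
	return 0;
}

Build it three ways and compare cycles/iter against the baseline; the
absolute numbers only mean anything on the same box.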
> > So mfence is more expensive than locked instructions/xchg, but sfence/lfence
> > are slightly faster, and xchg and locked instructions are very close if
> > not the same.
>
> Note that we never actually *use* lfence/sfence. They are pointless
> instructions when looking at CPU memory ordering, because for pure CPU
> memory ordering stores and loads are already ordered.
>
> The only reason to use lfence/sfence is after you've used nontemporal
> stores for IO.

By the way, the comment in barrier.h says:

/*
 * Some non-Intel clones support out of order store. wmb() ceases to be
 * a nop for these.
 */

and while the first sentence may well be true, if you have an SMP system
with out of order stores, making wmb() not a nop will not help.

Additionally, as you point out, wmb() is not a nop even for regular Intel
CPUs because of these weird use cases.

Drop this comment?

> That's very very rare in the kernel. So I wouldn't
> worry about those.

Right - I'll leave these alone; whoever wants to optimize
this path will have to do the necessary research.

> But yes, it does sound like mfence is just a bad idea too.
>
> > There isn't any extra magic behind mfence, is there?
>
> No.
>
> I think the only issue is that there has never been any real reason
> for CPU designers to try to make mfence go particularly fast. Nobody
> uses it, again with the exception of some odd loops that use
> nontemporal stores, and for those the cost tends to always be about
> the nontemporal accesses themselves (often to things like GPU memory
> over PCIe), and the mfence cost of a few extra cycles is negligible.
>
> The reason "lock ; add $0" has generally been the fastest we've found
> is simply that locked ops have been important for CPU designers.
>
> So I think the patch is fine, and we should likely drop the use of mfence..
>
> Linus

OK, so should I repost after a bit more testing?
I don't believe this will affect the kernel build benchmark, but I'll try :)

-- 
MST
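
P.S. To make the direction concrete for anyone skimming the thread: the
idea is roughly the sketch below. Illustration only - the name is made up,
only the 64-bit flavor is shown, and this is not the actual patch:

/*
 * Sketch: a full barrier built on a locked op instead of mfence.
 * The 64-bit kernel is built with -mno-red-zone, so poking just below
 * %rsp is not the problem here that it would be in userspace code.
 */
#define lock_mb() \
	asm volatile("lock; addl $0,-4(%%rsp)" ::: "memory", "cc")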