From mboxrd@z Thu Jan 1 00:00:00 1970
From: Linus Torvalds
Subject: Re: Q: smp.c && barriers (Was: [PATCH 1/4] generic-smp: remove single ipi fallback for smp_call_function_many())
Date: Wed, 18 Feb 2009 08:09:21 -0800 (PST)
Message-ID:
References: <1234823097.30178.406.camel@laptop>
 <20090216231946.GA12009@redhat.com>
 <1234862974.4744.31.camel@laptop>
 <20090217101130.GA8660@wotan.suse.de>
 <1234866453.4744.58.camel@laptop>
 <20090217112657.GE26402@wotan.suse.de>
 <20090217192810.GA4980@redhat.com>
 <20090217213256.GJ6761@linux.vnet.ibm.com>
 <20090217214518.GA13189@redhat.com>
 <20090217223910.GM6761@linux.vnet.ibm.com>
 <20090218135212.GB23125@wotan.suse.de>
Mime-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Return-path:
Received: from smtp1.linux-foundation.org ([140.211.169.13]:51251 "EHLO
 smtp1.linux-foundation.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
 with ESMTP id S1751526AbZBRQKr (ORCPT ); Wed, 18 Feb 2009 11:10:47 -0500
In-Reply-To: <20090218135212.GB23125@wotan.suse.de>
Sender: linux-arch-owner@vger.kernel.org
List-ID:
To: Nick Piggin
Cc: "Paul E. McKenney" , Oleg Nesterov , Peter Zijlstra , Jens Axboe ,
 Suresh Siddha , Ingo Molnar , Rusty Russell , Steven Rostedt ,
 linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org

On Wed, 18 Feb 2009, Nick Piggin wrote:
>
> I agree with you both that we *should* make arch interrupt code
> do the ordering, but given the subtle lockups on some architectures
> in this new code, I didn't want to make it significantly weaker...
>
> Though perhaps it appears that I have, if I have removed an smp_mb
> that x86 was relying on to emit an mfence to serialise the apic.

The thing is, if the architecture doesn't order IPI wrt cache coherency,
then the "smp_mb()" doesn't really do so _either_.

It might hide some architecture-specific implementation issue, of course,
so random amounts of "smp_mb()"s sprinkled around might well make some
architecture "work", but it's in no way guaranteed.
A smp_mb() does not guarantee that some separate IPI network is ordered -
that may well take some random machine-specific IO cycle.

That said, at least on x86, taking an interrupt should be a serializing
event, so there should be no reason for anything on the receiving side.
The _sending_ side might need to make sure that there is serialization
when generating the IPI (so that the IPI cannot happen while the writes
are still in some per-CPU write buffer and haven't become part of the
cache coherency domain).

And at least on x86 it's actually pretty hard to generate out-of-order
accesses to begin with (_regardless_ of any issues external to the CPU).
You have to work at it, and use a WC memory area, and I'm pretty sure we
use UC for the apic accesses.

		Linus
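
For illustration only, the sending-side concern above can be sketched in
userspace C11. This is an analogy, not kernel code and not anything from
the thread: every name here (call_data, doorbell, send_request,
try_receive) is hypothetical, and the "IPI" is modelled as an ordinary
atomic flag. That is exactly the simplification the mail warns about: a
real IPI may travel over a separate, machine-specific path that a memory
barrier alone does not order, so the sketch only shows the "publish the
data before ringing the doorbell" half of the problem.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical names throughout; a userspace analogy, not kernel code. */
struct call_data {
	int payload;		/* plain data the receiver will read */
	atomic_bool doorbell;	/* stands in for the IPI (assumption) */
};

static struct call_data cd;	/* static storage: zero-initialised */

/*
 * Sending side: the payload write must be globally visible before the
 * doorbell is raised, otherwise the receiver could run while the write
 * still sits in a per-CPU write buffer.
 */
static void send_request(int value)
{
	cd.payload = value;
	/* Full fence, playing the role smp_mb() plays in the discussion. */
	atomic_thread_fence(memory_order_seq_cst);
	atomic_store_explicit(&cd.doorbell, true, memory_order_relaxed);
}

/*
 * Receiving side: consuming the doorbell with acquire semantics pairs
 * with the sender's fence, so the payload read sees the sender's write.
 * Returns false if no request is pending.
 */
static bool try_receive(int *out)
{
	if (!atomic_exchange_explicit(&cd.doorbell, false,
				      memory_order_acquire))
		return false;
	*out = cd.payload;
	return true;
}
```

Because the doorbell here lives in cache-coherent memory, the fence is
sufficient in this model. On real hardware the equivalent guarantee for
an IPI has to come from the architecture's interrupt-sending code, which
is the point being argued in the mail.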