From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from gate.crashing.org (gate.crashing.org [63.228.1.57])
	(using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits))
	(Client did not present a certificate)
	by ozlabs.org (Postfix) with ESMTPS id 33E4DDDED1
	for ; Tue, 14 Oct 2008 19:35:40 +1100 (EST)
Subject: Re: [patch] mutex: optimise generic mutex implementations
From: Benjamin Herrenschmidt
To: Nick Piggin
In-Reply-To: <20081012054634.GA12535@wotan.suse.de>
References: <20081012054634.GA12535@wotan.suse.de>
Content-Type: text/plain
Date: Tue, 14 Oct 2008 19:35:19 +1100
Message-Id: <1223973319.8157.333.camel@pasglop>
Mime-Version: 1.0
Cc: linux-arch@vger.kernel.org, linuxppc-dev@ozlabs.org, Ingo Molnar, paulus@samba.org
Reply-To: benh@kernel.crashing.org
List-Id: Linux on PowerPC Developers Mail List

On Sun, 2008-10-12 at 07:46 +0200, Nick Piggin wrote:
> Speed up generic mutex implementations.
>
> - atomic operations which both modify the variable and return something imply
>   full smp memory barriers before and after the memory operations involved
>   (failing atomic_cmpxchg, atomic_add_unless, etc don't imply a barrier because
>   they don't modify the target). See Documentation/atomic_ops.txt.
>   So remove extra barriers and branches.
>
> - All architectures support atomic_cmpxchg. This has no relation to
>   __HAVE_ARCH_CMPXCHG. We can just take the atomic_cmpxchg path unconditionally
>
> This reduces a simple single threaded fastpath lock+unlock test from 590 cycles
> to 203 cycles on a ppc970 system.
>
> Signed-off-by: Nick Piggin

Looks ok.

Cheers,
Ben.
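
For readers following the thread, here is a rough user-space sketch of the idea in the quoted description: a fastpath built only from atomic read-modify-write operations that return a value, with no separate barrier calls around them. It is not the patch and not kernel code. It uses C11 <stdatomic.h> instead of the kernel's atomic_t API, every sketch_* name is hypothetical, and memory_order_seq_cst is only an assumed approximation of the full barriers the kernel implies for such operations. The count convention (1 unlocked, 0 locked, negative when contended) is also an assumption of the sketch, and the slow path is left out entirely.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a mutex count (assumed convention):
 * 1 = unlocked, 0 = locked, negative = locked and contended. */
typedef struct {
	atomic_int count;
} sketch_mutex;

static inline void sketch_mutex_init(sketch_mutex *m)
{
	atomic_init(&m->count, 1);
}

/* Trylock fastpath: a single compare-and-exchange 1 -> 0.
 * A successful exchange is ordered with seq_cst (assumed analogue of
 * the implied full barriers). A failed exchange modifies nothing, so
 * it is given relaxed ordering here. */
static inline bool sketch_trylock(sketch_mutex *m)
{
	int expected = 1;

	return atomic_compare_exchange_strong_explicit(
		&m->count, &expected, 0,
		memory_order_seq_cst, memory_order_relaxed);
}

/* Lock fastpath, decrement style: the value-returning atomic is the
 * only ordering used. Returns true when the lock was taken; false
 * means a slow path (not shown) would have to run. */
static inline bool sketch_lock_fastpath(sketch_mutex *m)
{
	/* atomic_fetch_sub returns the old value; old == 1 means free. */
	return atomic_fetch_sub(&m->count, 1) == 1;
}

/* Unlock fastpath: returns true when nobody contended, i.e. the old
 * value was 0 and the count is back to 1. */
static inline bool sketch_unlock_fastpath(sketch_mutex *m)
{
	return atomic_fetch_add(&m->count, 1) == 0;
}
```

In a single-threaded run, sketch_trylock() on a fresh mutex succeeds, a second sketch_trylock() fails, and sketch_unlock_fastpath() then returns the count to 1.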