Date: Mon, 20 Feb 2017 12:00:09 +0100
From: Peter Zijlstra
To: Andrea Parri
Cc: Waiman Long, Ingo Molnar, Pan Xinhui, Boqun Feng, linux-kernel@vger.kernel.org, Will Deacon
Subject: Re: [PATCH v3] locking/pvqspinlock: Relax cmpxchg's to improve performance on some archs
Message-ID: <20170220110009.GN6500@twins.programming.kicks-ass.net>
References: <1487364220-11641-1-git-send-email-longman@redhat.com> <20170220042052.GA4529@andrea>
In-Reply-To: <20170220042052.GA4529@andrea>

On Mon, Feb 20, 2017 at 05:20:52AM +0100, Andrea Parri wrote:
> On Fri, Feb 17, 2017 at 03:43:40PM -0500, Waiman Long wrote:
> > @@ -361,6 +361,9 @@ static void pv_kick_node(struct qspinlock *lock, struct mcs_spinlock *node)
> >  	 * observe its next->locked value and advance itself.
> >  	 *
> >  	 * Matches with smp_store_mb() and cmpxchg() in pv_wait_node()
> > +	 *
> > +	 * We can't use the relaxed form of cmpxchg() here, as the load of
> > +	 * pn->state could happen before the store to next->locked on some archs.
> >  	 */
> >  	if (cmpxchg(&pn->state, vcpu_halted, vcpu_hashed) != vcpu_halted)
> 
> Hi Waiman.
> cmpxchg() does not guarantee the (here implied) smp_mb(), in general;
> cf., e.g., arch/arm64/include/asm/atomic_ll_sc.h, where cmpxchg() gets
> "compiled" to something like:
>
>   _loop: ldxr; eor; cbnz _exit; stlxr; cbnz _loop; dmb ish; _exit:

So cmpxchg() should imply an effective smp_mb() both before and after the
operation, _except_ on failure, in which case it need not imply any barrier.

And that arm64 sequence provides exactly that: on success, the store is a
store-release (STLXR), which orders all prior memory accesses before the
store, and the trailing DMB ISH then provides the required barrier after the
operation. On the failure path, both the STLXR and the DMB are skipped, so no
ordering is implied.