From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Tue, 17 Oct 2017 13:37:59 -0700
From: "Paul E. McKenney"
To: Will Deacon
Cc: Boqun Feng, stern@rowland.harvard.edu, parri.andrea@gmail.com,
	peterz@infradead.org, npiggin@gmail.com, dhowells@redhat.com,
	j.alglave@ucl.ac.uk, luc.maranget@inria.fr,
	linux-kernel@vger.kernel.org
Subject: Re: Linux-kernel examples for LKMM recipes
Reply-To: paulmck@linux.vnet.ibm.com
References: <20171011223229.GA31650@linux.vnet.ibm.com>
	<20171012012359.yrz5dhqmfp7nyq37@tardis>
	<20171012112718.GA31036@arm.com>
In-Reply-To: <20171012112718.GA31036@arm.com>
Message-Id: <20171017203759.GZ3521@linux.vnet.ibm.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Oct 12, 2017 at 12:27:19PM +0100, Will Deacon wrote:
> On Thu, Oct 12, 2017 at 09:23:59AM +0800, Boqun Feng wrote:
> > On Wed, Oct 11, 2017 at 10:32:30PM +0000, Paul E. McKenney wrote:
> > > I am not aware of any three-CPU release-acquire chains in the
> > > Linux kernel.  There are three-CPU lock-based chains in RCU,
> > > but these are not at all simple, either.
> > >
> >
> > The "Program-Order guarantees" case in scheduler? See the comments
> > written by Peter above try_to_wake_up():
> >
> >  * The basic program-order guarantee on SMP systems is that when a task [t]
> >  * migrates, all its activity on its old CPU [c0] happens-before any subsequent
> >  * execution on its new CPU [c1].
> > ...
> >  * For blocking we (obviously) need to provide the same guarantee as for
> >  * migration. However the means are completely different as there is no lock
> >  * chain to provide order. Instead we do:
> >  *
> >  *   1) smp_store_release(X->on_cpu, 0)
> >  *   2) smp_cond_load_acquire(!X->on_cpu)
> >  *
> >  * Example:
> >  *
> >  *   CPU0 (schedule)  CPU1 (try_to_wake_up) CPU2 (schedule)
> >  *
> >  *   LOCK rq(0)->lock LOCK X->pi_lock
> >  *   dequeue X
> >  *   sched-out X
> >  *   smp_store_release(X->on_cpu, 0);
> >  *
> >  *                    smp_cond_load_acquire(&X->on_cpu, !VAL);
> >  *                    X->state = WAKING
> >  *                    set_task_cpu(X,2)
> >  *
> >  *                    LOCK rq(2)->lock
> >  *                    enqueue X
> >  *                    X->state = RUNNING
> >  *                    UNLOCK rq(2)->lock
> >  *
> >  *                                          LOCK rq(2)->lock // orders against CPU1
> >  *                                          sched-out Z
> >  *                                          sched-in X
> >  *                                          UNLOCK rq(2)->lock
> >  *
> >  *                    UNLOCK X->pi_lock
> >  *   UNLOCK rq(0)->lock
> >
> > This is a chain mixed with lock and acquire-release(maybe even better?).
> >
> >
> > And another example would be osq_{lock,unlock}() on multiple(more than
> > three) CPUs.
>
> I think the qrwlock also has something similar with the writer fairness
> issue fixed:
>
> CPU0: (writer doing an unlock)
> smp_store_release(&lock->wlocked, 0); // Bottom byte of lock->cnts
>
>
> CPU1: (waiting writer on slowpath)
> atomic_cond_read_acquire(&lock->cnts, VAL == _QW_WAITING);
> ...
> arch_spin_unlock(&lock->wait_lock);
>
>
> CPU2: (reader on slowpath)
> arch_spin_lock(&lock->wait_lock);
>
> and there's mixed-size accesses here too. Fun stuff!

You had me going there until you mentioned the mixed-size accesses.  ;-)

							Thanx, Paul
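
For anyone wanting to poke at the shape of the three-CPU chain in the quoted try_to_wake_up() comment, here is a rough userspace sketch using C11 atomics and pthreads instead of the kernel primitives. Every name in it (on_cpu, handoff, task_data, run_chain) is made up for illustration, and the second link is a plain release/acquire pair standing in for the rq(2)->lock handoff, so it is an analogy at best and not what the scheduler actually does. It also uses same-size accesses only, so it says nothing about the qrwlock case.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* All names below are hypothetical; this is not kernel code. */
static int task_data;          /* plain data written by "CPU0" */
static atomic_int on_cpu;      /* first link: released by CPU0, acquired by CPU1 */
static atomic_int handoff;     /* second link: stands in for the rq(2)->lock handoff */
static int observed;           /* what "CPU2" finally reads from task_data */

static void *cpu0(void *arg)
{
	(void)arg;
	task_data = 42;                                   /* activity on the old CPU */
	atomic_store_explicit(&on_cpu, 0, memory_order_release);
	return NULL;
}

static void *cpu1(void *arg)
{
	(void)arg;
	/* Rough analogue of smp_cond_load_acquire(&X->on_cpu, !VAL). */
	while (atomic_load_explicit(&on_cpu, memory_order_acquire))
		;
	atomic_store_explicit(&handoff, 1, memory_order_release);
	return NULL;
}

static void *cpu2(void *arg)
{
	(void)arg;
	while (!atomic_load_explicit(&handoff, memory_order_acquire))
		;
	observed = task_data;      /* ordered after CPU0's write via the chain */
	return NULL;
}

/* Runs the three threads once and returns the value CPU2 observed. */
int run_chain(void)
{
	void *(*fn[3])(void *) = { cpu2, cpu1, cpu0 };
	pthread_t th[3];
	int i, rc;

	task_data = 0;
	observed = -1;
	atomic_store(&on_cpu, 1);
	atomic_store(&handoff, 0);

	for (i = 0; i < 3; i++) {
		rc = pthread_create(&th[i], NULL, fn[i], NULL);
		assert(rc == 0);
		(void)rc;
	}
	for (i = 0; i < 3; i++)
		pthread_join(th[i], NULL);

	return observed;
}
```

Calling run_chain() from a test harness should return the value that the cpu0 thread stored, because each acquire load that sees the preceding release store extends happens-before one hop further along the chain. Build with -pthread where the toolchain needs it.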