From: "Paul E. McKenney"
Date: Wed, 11 Oct 2017 15:32:30 -0700
To: stern@rowland.harvard.edu, parri.andrea@gmail.com, will.deacon@arm.com, peterz@infradead.org, boqun.feng@gmail.com, npiggin@gmail.com, dhowells@redhat.com, j.alglave@ucl.ac.uk, luc.maranget@inria.fr
Cc: linux-kernel@vger.kernel.org
Subject: Linux-kernel examples for LKMM recipes
Reply-To: paulmck@linux.vnet.ibm.com
Message-Id: <20171011223229.GA31650@linux.vnet.ibm.com>

Hello!

At Linux Plumbers Conference, we got requests for a recipes document,
and a further request to point to actual code in the Linux kernel.
I have pulled together some examples for various litmus-test families,
as shown below.  The decoder ring for the abbreviations (ISA2, LB, SB,
MP, ...) is here:

	https://www.cl.cam.ac.uk/~pes20/ppc-supplemental/test6.pdf

This document is also checked into the memory-models git archive:

	https://github.com/aparri/memory-model.git

I would be especially interested in simpler examples in general, and
of course any example at all for the cases where I was unable to find
any.  Thoughts?

							Thanx, Paul

------------------------------------------------------------------------

This document lists the litmus-test patterns that we have been discussing,
along with examples from the Linux kernel.  This is intended to feed into
the recipes document.  All examples are from v4.13.

0.	Single-variable SC.

	a.	Within a single CPU, the use of the ->dynticks_nmi_nesting
		counter by rcu_nmi_enter() and rcu_nmi_exit() qualifies
		(see kernel/rcu/tree.c).  The counter is accessed by
		interrupts and NMIs as well as by process-level code.
		This counter can be accessed by other CPUs, but only
		for debug output.

	b.	Between CPUs, I would put forward the ->dflags
		updates, but this is anything but simple.  But maybe
		OK for an illustration?

1.	MP (see test6.pdf for nickname translation)

	a.	smp_store_release() / smp_load_acquire()

		init_stack_slab() in lib/stackdepot.c uses release-acquire
		to handle initialization of a slab of the stack.  Working
		out the mutual-exclusion design is left as an exercise for
		the reader.

	b.	rcu_assign_pointer() / rcu_dereference()

		expand_to_next_prime() does the rcu_assign_pointer(),
		and next_prime_number() does the rcu_dereference().
		This mediates access to a bit vector that is expanded
		as additional primes are needed.  These two functions
		are in lib/prime_numbers.c.

	c.	smp_wmb() / smp_rmb()

		xlog_state_switch_iclogs() contains the following:

			log->l_curr_block -= log->l_logBBsize;
			ASSERT(log->l_curr_block >= 0);
			smp_wmb();
			log->l_curr_cycle++;

		And xlog_valid_lsn() contains the following:

			cur_cycle = ACCESS_ONCE(log->l_curr_cycle);
			smp_rmb();
			cur_block = ACCESS_ONCE(log->l_curr_block);

	d.	Replacing either of the above with smp_mb()

		Holding off on this one for the moment...

2.	Release-acquire chains, AKA ISA2, Z6.2, LB, and 3.LB

	Lots of variety here, can in some cases substitute:

	a.	READ_ONCE() for smp_load_acquire()
	b.	WRITE_ONCE() for smp_store_release()
	c.	Dependencies for both smp_load_acquire() and
		smp_store_release().
	d.	smp_wmb() for smp_store_release() in first thread
		of ISA2 and Z6.2.
	e.	smp_rmb() for smp_load_acquire() in last thread of ISA2.

	The canonical illustration of LB involves the various memory
	allocators, where you don't want a load from about-to-be-freed
	memory to see a store initializing a later incarnation of that
	same memory area.  But the per-CPU caches make this a very
	long and complicated example.

	I am not aware of any three-CPU release-acquire chains in the
	Linux kernel.  There are three-CPU lock-based chains in RCU,
	but these are not at all simple, either.

	Thoughts?

3.	SB

	a.	smp_mb(), as in lockless wait-wakeup coordination.
		And as in sys_membarrier()-scheduler coordination,
		for that matter.

		Examples seem to be lacking.  Most cases use locking.
		Here is one rather strange one from RCU:

		void call_rcu_tasks(struct rcu_head *rhp, rcu_callback_t func)
		{
			unsigned long flags;
			bool needwake;
			bool havetask = READ_ONCE(rcu_tasks_kthread_ptr);

			rhp->next = NULL;
			rhp->func = func;
			raw_spin_lock_irqsave(&rcu_tasks_cbs_lock, flags);
			needwake = !rcu_tasks_cbs_head;
			*rcu_tasks_cbs_tail = rhp;
			rcu_tasks_cbs_tail = &rhp->next;
			raw_spin_unlock_irqrestore(&rcu_tasks_cbs_lock, flags);
			/* We can't create the thread unless interrupts are enabled. */
			if ((needwake && havetask) ||
			    (!havetask && !irqs_disabled_flags(flags))) {
				rcu_spawn_tasks_kthread();
				wake_up(&rcu_tasks_cbs_wq);
			}
		}

		And for the wait side, using synchronize_sched() to supply
		the barrier for both ends, with the preemption disabling
		due to raw_spin_lock_irqsave() serving as the read-side
		critical section:

			if (!list) {
				wait_event_interruptible(rcu_tasks_cbs_wq,
							 rcu_tasks_cbs_head);
				if (!rcu_tasks_cbs_head) {
					WARN_ON(signal_pending(current));
					schedule_timeout_interruptible(HZ/10);
				}
				continue;
			}

			synchronize_sched();

		-----------------

		Here is another one that uses atomic_cmpxchg() as a
		full memory barrier:

		if (!wait_event_timeout(*wait, !atomic_read(stopping),
					msecs_to_jiffies(1000))) {
			atomic_set(stopping, 0);
			smp_mb();
			return -ETIMEDOUT;
		}

		int omap3isp_module_sync_is_stopping(wait_queue_head_t *wait,
						     atomic_t *stopping)
		{
			if (atomic_cmpxchg(stopping, 1, 0)) {
				wake_up(wait);
				return 1;
			}

			return 0;
		}
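
For anyone who wants to see why the SB examples in section 3 need a
full barrier on both sides, here is a minimal userspace sketch in plain
C.  It is not kernel code and not a substitute for the memory model:
it simply enumerates global orderings of the four SB operations
(T0: store x, load y; T1: store y, load x).  All names in it
(count_sb_failures() and friends) are hypothetical and made up for
this illustration.  It rests on one loud simplifying assumption: a
missing full barrier is modeled as "the store and the load within a
thread may execute in either order", and a full barrier as "the store
executes before the load".  Real store buffers are more subtle.

```c
#include <assert.h>
#include <stdbool.h>

/* The four operations of the SB litmus test (hypothetical names). */
enum op { T0_STORE_X, T0_LOAD_Y, T1_STORE_Y, T1_LOAD_X, NR_OPS };

/* Position of operation "o" within the global execution order. */
static int pos(const enum op *order, enum op o)
{
	for (int i = 0; i < NR_OPS; i++)
		if (order[i] == o)
			return i;
	return -1;
}

/* Execute one global order; return true if both loads observed 0. */
static bool both_loads_see_zero(const enum op *order)
{
	int x = 0, y = 0, r0 = -1, r1 = -1;

	for (int i = 0; i < NR_OPS; i++) {
		switch (order[i]) {
		case T0_STORE_X: x = 1;  break;
		case T0_LOAD_Y:  r0 = y; break;
		case T1_STORE_Y: y = 1;  break;
		case T1_LOAD_X:  r1 = x; break;
		default:                 break;
		}
	}
	return r0 == 0 && r1 == 0;
}

/*
 * Generate every permutation of the four operations.  With
 * full_barriers set, discard orders in which a thread's load
 * precedes its own store (the assumed effect of a full barrier
 * between them).  Count the orders in which both loads see 0.
 */
static int enumerate(enum op *order, int k, bool full_barriers)
{
	int count = 0;

	assert(k <= NR_OPS);
	if (k == NR_OPS) {
		if (full_barriers &&
		    (pos(order, T0_STORE_X) > pos(order, T0_LOAD_Y) ||
		     pos(order, T1_STORE_Y) > pos(order, T1_LOAD_X)))
			return 0;
		return both_loads_see_zero(order) ? 1 : 0;
	}
	for (int i = k; i < NR_OPS; i++) {
		enum op tmp = order[k];

		order[k] = order[i];
		order[i] = tmp;
		count += enumerate(order, k + 1, full_barriers);
		tmp = order[k];
		order[k] = order[i];
		order[i] = tmp;
	}
	return count;
}

/* Number of enumerated orders ending with r0 == 0 && r1 == 0. */
int count_sb_failures(bool full_barriers)
{
	enum op order[NR_OPS] = {
		T0_STORE_X, T0_LOAD_Y, T1_STORE_Y, T1_LOAD_X
	};

	return enumerate(order, 0, full_barriers);
}
```

Under these assumptions, count_sb_failures(true) returns 0, because
"store x before load y before store y before load x before store x"
would be a cycle.  count_sb_failures(false) returns 6 of the 24
orders, which is the lost-wakeup outcome that the smp_mb() and
atomic_cmpxchg() in the examples above are there to rule out.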