All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Boqun Feng <boqun.feng@gmail.com>
Cc: stern@rowland.harvard.edu, parri.andrea@gmail.com,
	will.deacon@arm.com, peterz@infradead.org, npiggin@gmail.com,
	dhowells@redhat.com, j.alglave@ucl.ac.uk, luc.maranget@inria.fr,
	linux-kernel@vger.kernel.org
Subject: Re: Linux-kernel examples for LKMM recipes
Date: Tue, 17 Oct 2017 13:56:50 -0700	[thread overview]
Message-ID: <20171017205650.GA3521@linux.vnet.ibm.com> (raw)
In-Reply-To: <20171012012359.yrz5dhqmfp7nyq37@tardis>

On Thu, Oct 12, 2017 at 09:23:59AM +0800, Boqun Feng wrote:
> On Wed, Oct 11, 2017 at 10:32:30PM +0000, Paul E. McKenney wrote:
> > Hello!
> > 
> > At Linux Plumbers Conference, we got requests for a recipes document,
> > and a further request to point to actual code in the Linux kernel.
> > I have pulled together some examples for various litmus-test families,
> > as shown below.  The decoder ring for the abbreviations (ISA2, LB, SB,
> > MP, ...) is here:
> > 
> > 	https://www.cl.cam.ac.uk/~pes20/ppc-supplemental/test6.pdf
> > 
> > This document is also checked into the memory-models git archive:
> > 
> > 	https://github.com/aparri/memory-model.git
> > 
> > I would be especially interested in simpler examples in general, and
> > of course any example at all for the cases where I was unable to find
> > any.  Thoughts?
> > 
> > 							Thanx, Paul
> > 
> > ------------------------------------------------------------------------
> > 
> > This document lists the litmus-test patterns that we have been discussing,
> > along with examples from the Linux kernel.  This is intended to feed into
> > the recipes document.  All examples are from v4.13.
> > 
> > 0.	Single-variable SC.
> > 
> > 	a.	Within a single CPU, the use of the ->dynticks_nmi_nesting
> > 		counter by rcu_nmi_enter() and rcu_nmi_exit() qualifies
> > 		(see kernel/rcu/tree.c).  The counter is accessed by
> > 		interrupts and NMIs as well as by process-level code.
> > 		This counter can be accessed by other CPUs, but only
> > 		for debug output.
> > 
> > 	b.	Between CPUs, I would put forward the ->dflags
> > 		updates, but this is anything but simple.  But maybe
> > 		OK for an illustration?
> > 
> > 1.	MP (see test6.pdf for nickname translation)
> > 
> > 	a.	smp_store_release() / smp_load_acquire()
> > 
> > 		init_stack_slab() in lib/stackdepot.c uses release-acquire
> > 		to handle initialization of a slab of the stack.  Working
> > 		out the mutual-exclusion design is left as an exercise for
> > 		the reader.
> > 
> > 	b.	rcu_assign_pointer() / rcu_dereference()
> > 
> > 		expand_to_next_prime() does the rcu_assign_pointer(),
> > 		and next_prime_number() does the rcu_dereference().
> > 		This mediates access to a bit vector that is expanded
> > 		as additional primes are needed.  These two functions
> > 		are in lib/prime_numbers.c.
> > 
> > 	c.	smp_wmb() / smp_rmb()
> > 
> > 		xlog_state_switch_iclogs() contains the following:
> > 
> > 			log->l_curr_block -= log->l_logBBsize;
> > 			ASSERT(log->l_curr_block >= 0);
> > 			smp_wmb();
> > 			log->l_curr_cycle++;
> > 
> > 		And xlog_valid_lsn() contains the following:
> > 
> > 			cur_cycle = ACCESS_ONCE(log->l_curr_cycle);
> > 			smp_rmb();
> > 			cur_block = ACCESS_ONCE(log->l_curr_block);
> > 
> > 	d.	Replacing either of the above with smp_mb()
> > 
> > 		Holding off on this one for the moment...
> > 
> > 2.	Release-acquire chains, AKA ISA2, Z6.2, LB, and 3.LB
> > 
> > 	Lots of variety here, can in some cases substitute:
> > 	
> > 	a.	READ_ONCE() for smp_load_acquire()
> > 	b.	WRITE_ONCE() for smp_store_release()
> > 	c.	Dependencies for both smp_load_acquire() and
> > 		smp_store_release().
> > 	d.	smp_wmb() for smp_store_release() in first thread
> > 		of ISA2 and Z6.2.
> > 	e.	smp_rmb() for smp_load_acquire() in last thread of ISA2.
> > 
> > 	The canonical illustration of LB involves the various memory
> > 	allocators, where you don't want a load from about-to-be-freed
> > 	memory to see a store initializing a later incarnation of that
> > 	same memory area.  But the per-CPU caches make this a very
> > 	long and complicated example.
> > 
> > 	I am not aware of any three-CPU release-acquire chains in the
> > 	Linux kernel.  There are three-CPU lock-based chains in RCU,
> > 	but these are not at all simple, either.
> > 
> 
> The "Program-Order guarantees" case in scheduler? See the comments
> written by Peter above try_to_wake_up():
> 
>  * The basic program-order guarantee on SMP systems is that when a task [t]
>  * migrates, all its activity on its old CPU [c0] happens-before any subsequent
>  * execution on its new CPU [c1].
> ...
>  * For blocking we (obviously) need to provide the same guarantee as for
>  * migration. However the means are completely different as there is no lock
>  * chain to provide order. Instead we do:
>  *
>  *   1) smp_store_release(X->on_cpu, 0)
>  *   2) smp_cond_load_acquire(!X->on_cpu)
>  *
>  * Example:
>  *
>  *   CPU0 (schedule)  CPU1 (try_to_wake_up) CPU2 (schedule)
>  *
>  *   LOCK rq(0)->lock LOCK X->pi_lock
>  *   dequeue X
>  *   sched-out X
>  *   smp_store_release(X->on_cpu, 0);
>  *
>  *                    smp_cond_load_acquire(&X->on_cpu, !VAL);
>  *                    X->state = WAKING
>  *                    set_task_cpu(X,2)
>  *
>  *                    LOCK rq(2)->lock
>  *                    enqueue X
>  *                    X->state = RUNNING
>  *                    UNLOCK rq(2)->lock
>  *
>  *                                          LOCK rq(2)->lock // orders against CPU1
>  *                                          sched-out Z
>  *                                          sched-in X
>  *                                          UNLOCK rq(2)->lock
>  *
>  *                    UNLOCK X->pi_lock
>  *   UNLOCK rq(0)->lock
> 
> This is a chain mixed with lock and acquire-release(maybe even better?).

I added this one, though it might be outside of the scope of recipes.

						Thanx, Paul

> And another example would be osq_{lock,unlock}() on multiple(more than
> three) CPUs. 
> 
> Regards,
> Boqun
> 
> > 	Thoughts?
> > 
> > 3.	SB
> > 
> > 	a.	smp_mb(), as in lockless wait-wakeup coordination.
> > 		And as in sys_membarrier()-scheduler coordination,
> > 		for that matter.
> > 
> > 		Examples seem to be lacking.  Most cases use locking.
> > 		Here is one rather strange one from RCU:
> > 
> > 		void call_rcu_tasks(struct rcu_head *rhp, rcu_callback_t func)
> > 		{
> > 			unsigned long flags;
> > 			bool needwake;
> > 			bool havetask = READ_ONCE(rcu_tasks_kthread_ptr);
> > 
> > 			rhp->next = NULL;
> > 			rhp->func = func;
> > 			raw_spin_lock_irqsave(&rcu_tasks_cbs_lock, flags);
> > 			needwake = !rcu_tasks_cbs_head;
> > 			*rcu_tasks_cbs_tail = rhp;
> > 			rcu_tasks_cbs_tail = &rhp->next;
> > 			raw_spin_unlock_irqrestore(&rcu_tasks_cbs_lock, flags);
> > 			/* We can't create the thread unless interrupts are enabled. */
> > 			if ((needwake && havetask) ||
> > 			    (!havetask && !irqs_disabled_flags(flags))) {
> > 				rcu_spawn_tasks_kthread();
> > 				wake_up(&rcu_tasks_cbs_wq);
> > 			}
> > 		}
> > 
> > 		And for the wait side, using synchronize_sched() to supply
> > 		the barrier for both ends, with the preemption disabling
> > 		due to raw_spin_lock_irqsave() serving as the read-side
> > 		critical section:
> > 
> > 		if (!list) {
> > 			wait_event_interruptible(rcu_tasks_cbs_wq,
> > 						 rcu_tasks_cbs_head);
> > 			if (!rcu_tasks_cbs_head) {
> > 				WARN_ON(signal_pending(current));
> > 				schedule_timeout_interruptible(HZ/10);
> > 			}
> > 			continue;
> > 		}
> > 		synchronize_sched();
> > 
> > 		-----------------
> > 
> > 		Here is another one that uses atomic_cmpxchg() as a
> > 		full memory barrier:
> > 
> > 		if (!wait_event_timeout(*wait, !atomic_read(stopping),
> > 					msecs_to_jiffies(1000))) {
> > 			atomic_set(stopping, 0);
> > 			smp_mb();
> > 			return -ETIMEDOUT;
> > 		}
> > 
> > 		int omap3isp_module_sync_is_stopping(wait_queue_head_t *wait,
> > 						     atomic_t *stopping)
> > 		{
> > 			if (atomic_cmpxchg(stopping, 1, 0)) {
> > 				wake_up(wait);
> > 				return 1;
> > 			}
> > 
> > 			return 0;
> > 		}
> > 

  parent reply	other threads:[~2017-10-17 20:57 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-10-11 22:32 Linux-kernel examples for LKMM recipes Paul E. McKenney
2017-10-12  1:23 ` Boqun Feng
2017-10-12 11:27   ` Will Deacon
2017-10-17 20:37     ` Paul E. McKenney
2017-10-17 20:56   ` Paul E. McKenney [this message]
2017-10-12 13:27 ` Andrea Parri
2017-10-17 20:18   ` Paul E. McKenney
2017-10-13 19:44 ` Alan Stern
2017-10-13 20:00   ` Paul E. McKenney
2017-10-13 20:09     ` Alan Stern
2017-10-17 18:56       ` Paul E. McKenney
2017-10-17 19:38         ` Alan Stern
2017-10-17 20:33           ` Paul E. McKenney
2017-10-17 21:03             ` Alan Stern
2017-10-17 21:55               ` Paul E. McKenney
2017-10-18 14:43                 ` Alan Stern
2017-10-18 20:28                   ` Paul E. McKenney
2017-10-18 21:18                     ` Alan Stern
2017-10-18 22:57                       ` Paul E. McKenney

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20171017205650.GA3521@linux.vnet.ibm.com \
    --to=paulmck@linux.vnet.ibm.com \
    --cc=boqun.feng@gmail.com \
    --cc=dhowells@redhat.com \
    --cc=j.alglave@ucl.ac.uk \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luc.maranget@inria.fr \
    --cc=npiggin@gmail.com \
    --cc=parri.andrea@gmail.com \
    --cc=peterz@infradead.org \
    --cc=stern@rowland.harvard.edu \
    --cc=will.deacon@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.