From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Wed, 18 Oct 2017 13:48:47 -0700
From: "Paul E. McKenney"
To: Andrea Parri
Cc: j.alglave@ucl.ac.uk, luc.maranget@inria.fr, stern@rowland.harvard.edu,
	dhowells@redhat.com, peterz@infradead.org, will.deacon@arm.com,
	boqun.feng@gmail.com, npiggin@gmail.com, linux-kernel@vger.kernel.org
Subject: Re: Memory-ordering recipes
Reply-To: paulmck@linux.vnet.ibm.com
References: <20170917230509.GA21394@linux.vnet.ibm.com>
	<20171017210137.GA12700@linux.vnet.ibm.com>
	<20171018010748.GA4017@andrea>
In-Reply-To: <20171018010748.GA4017@andrea>
Message-Id: <20171018204847.GS3521@linux.vnet.ibm.com>

On Wed, Oct 18, 2017 at 03:07:48AM +0200, Andrea Parri wrote:
> On Tue, Oct 17, 2017 at 02:01:37PM -0700, Paul E. McKenney wrote:
> > On Sun, Sep 17, 2017 at 04:05:09PM -0700, Paul E. McKenney wrote:
> > > Hello!
> > >
> > > The topic of memory-ordering recipes came up at the Linux Plumbers
> > > Conference microconference on Friday, so I thought that I should
> > > summarize what is currently "out there":
> >
> > And here is an updated list of potential Linux-kernel examples for a
> > "recipes" document, and thank you for the feedback.  Please feel free
> > to counterpropose better examples.  In addition, if there is some other
> > pattern that is commonplace and important enough to be included in a
> > recipes document, please point it out.
> >
> >                                                         Thanx, Paul
> >
> > ------------------------------------------------------------------------
> >
> > This document lists the litmus-test patterns that we have been discussing,
> > along with examples from the Linux kernel.  This is intended to feed into
> > the recipes document.  All examples are from v4.13.
> >
> > 0.      Simple special cases
> >
> >         If there is only one CPU on the one hand or only one variable
> >         on the other, the code will execute in order.  There are (as
> >         usual) some things to be careful of:
> >
> >         a.      Some aspects of the C language are unordered.  For
> >                 example, in the expression "f(x) + g(y)", the order
> >                 in which f and g are called is not defined; the
> >                 object code is allowed to use either order or even
> >                 to interleave the computations.
> >
> >         b.      Compilers are permitted to use the "as-if" rule.
> >                 That is, a compiler can emit whatever code it likes,
> >                 as long as the results of a single-threaded execution
> >                 appear just as if the compiler had followed all the
> >                 relevant rules.  To see this, compile with a high
> >                 level of optimization and run the debugger on the
> >                 resulting binary.
> >
> >         c.      If there is only one variable but multiple CPUs, that
> >                 variable must be properly aligned and all accesses to
> >                 it must be full sized.  Variables that straddle
> >                 cachelines or pages void your full-ordering warranty,
> >                 as do undersized accesses that load from or store to
> >                 only part of the variable.
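> >
> >         To make 0.a concrete, here is a minimal single-threaded
> >         sketch (hypothetical code, not from the kernel).  The
> >         compiler is free to call f() and g() in either order, so
> >         r1 may end up either 1 or 2:
> >
> >                 int a;
> >
> >                 static int f(void) { a = 1; return 0; }
> >                 static int g(void) { a = 2; return 0; }
> >
> >                 void example(void)
> >                 {
> >                         int r0 = f() + g(); /* call order unspecified */
> >                         int r1 = a;         /* 1 or 2, either is legal */
> >                 }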
> >
> > 1.      Another simple case: Locking.  [ Assuming you don't think too
> >         hard about it, that is! ]
> >
> >         Any CPU that has acquired a given lock sees any changes made
> >         by any other CPU prior to that CPU having released that same
> >         lock.
> >
> >         [ Should we discuss chaining back through different locks,
> >           sort of like release-acquire chains? ]
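> >
> >         As an (untested) litmus-test sketch, the lock-based analog
> >         of MP looks something like the following: if P1() acquires
> >         the lock after P0() has released it, then P1() is guaranteed
> >         to also see P0()'s store to x, so the "exists" clause is
> >         forbidden:
> >
> >                 C MP+locks
> >
> >                 {
> >                 }
> >
> >                 P0(int *x, int *y, spinlock_t *l)
> >                 {
> >                         spin_lock(l);
> >                         WRITE_ONCE(*x, 1);
> >                         WRITE_ONCE(*y, 1);
> >                         spin_unlock(l);
> >                 }
> >
> >                 P1(int *x, int *y, spinlock_t *l)
> >                 {
> >                         int r0;
> >                         int r1;
> >
> >                         spin_lock(l);
> >                         r0 = READ_ONCE(*y);
> >                         r1 = READ_ONCE(*x);
> >                         spin_unlock(l);
> >                 }
> >
> >                 exists (1:r0=1 /\ 1:r1=0)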
> >
> > 2.      MP (see test6.pdf for nickname translation; a baseline litmus
> >         sketch appears at the end of this section)
> >
> >         a.      smp_store_release() / smp_load_acquire()
> >
> >                 init_stack_slab() in lib/stackdepot.c uses release-acquire
> >                 to handle initialization of a slab of the stack.  Working
> >                 out the mutual-exclusion design is left as an exercise for
> >                 the reader.
> >
> >         b.      rcu_assign_pointer() / rcu_dereference()
> >
> >                 expand_to_next_prime() does the rcu_assign_pointer(),
> >                 and next_prime_number() does the rcu_dereference().
> >                 This mediates access to a bit vector that is expanded
> >                 as additional primes are needed.  These two functions
> >                 are in lib/prime_numbers.c.
> >
> >         c.      smp_wmb() / smp_rmb()
> >
> >                 xlog_state_switch_iclogs() contains the following:
> >
> >                         log->l_curr_block -= log->l_logBBsize;
> >                         ASSERT(log->l_curr_block >= 0);
> >                         smp_wmb();
> >                         log->l_curr_cycle++;
> >
> >                 And xlog_valid_lsn() contains the following:
> >
> >                         cur_cycle = ACCESS_ONCE(log->l_curr_cycle);
> >                         smp_rmb();
> >                         cur_block = ACCESS_ONCE(log->l_curr_block);
> >
> >                 Alternatively, from the comment in perf_output_put_handle()
> >                 in kernel/events/ring_buffer.c:
> >
> >                  *   kernel                           user
> >                  *
> >                  *   if (LOAD ->data_tail) {          LOAD ->data_head
> >                  *                      (A)           smp_rmb()       (C)
> >                  *      STORE $data                   LOAD $data
> >                  *      smp_wmb()       (B)           smp_mb()        (D)
> >                  *      STORE ->data_head             STORE ->data_tail
> >                  *   }
> >                  *
> >                  * Where A pairs with D, and B pairs with C.
> >
> >                 The B/C pairing is MP with smp_wmb() and smp_rmb().
> >
> >         d.      Replacing either of the above with smp_mb()
> >
> >                 Holding off on this one for the moment...
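> >
> >         The heart of all the variants above is the following
> >         (untested) MP litmus-test sketch; the "exists" clause is
> >         forbidden because the acquire load that returns 1 guarantees
> >         that the subsequent read of x also returns 1:
> >
> >                 C MP+rel+acq
> >
> >                 {
> >                 }
> >
> >                 P0(int *x, int *y)
> >                 {
> >                         WRITE_ONCE(*x, 1);
> >                         smp_store_release(y, 1);
> >                 }
> >
> >                 P1(int *x, int *y)
> >                 {
> >                         int r0;
> >                         int r1;
> >
> >                         r0 = smp_load_acquire(y);
> >                         r1 = READ_ONCE(*x);
> >                 }
> >
> >                 exists (1:r0=1 /\ 1:r1=0)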
> >
> > 3.      LB
> >
> >         a.      LB+ctrl+mb
> >
> >                 Again, from the comment in perf_output_put_handle()
> >                 in kernel/events/ring_buffer.c:
> >
> >                  *   kernel                           user
> >                  *
> >                  *   if (LOAD ->data_tail) {          LOAD ->data_head
> >                  *                      (A)           smp_rmb()       (C)
> >                  *      STORE $data                   LOAD $data
> >                  *      smp_wmb()       (B)           smp_mb()        (D)
> >                  *      STORE ->data_head             STORE ->data_tail
> >                  *   }
> >                  *
> >                  * Where A pairs with D, and B pairs with C.
> >
> >                 The A/D pairing covers this one.
> >
> > 4.      Release-acquire chains, AKA ISA2, Z6.2, LB, and 3.LB
> >
> >         Lots of variety here; one can in some cases substitute
> >         (a baseline sketch appears at the end of this section):
> >
> >         a.      READ_ONCE() for smp_load_acquire()
> >         b.      WRITE_ONCE() for smp_store_release()
> >         c.      Dependencies for both smp_load_acquire() and
> >                 smp_store_release().
> >         d.      smp_wmb() for smp_store_release() in first thread
> >                 of ISA2 and Z6.2.
> >         e.      smp_rmb() for smp_load_acquire() in last thread of
> >                 ISA2.
> >
> >         The canonical illustration of LB involves the various memory
> >         allocators, where you don't want a load from about-to-be-freed
> >         memory to see a store initializing a later incarnation of that
> >         same memory area.  But the per-CPU caches make this a very
> >         long and complicated example.
> >
> >         I am not aware of any three-CPU release-acquire chains in the
> >         Linux kernel.  There are three-CPU lock-based chains in RCU,
> >         but these are not at all simple, either.
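> >
> >         Still, as a baseline for the recipes document, here is an
> >         (untested) sketch of the three-CPU ISA2 release-acquire
> >         chain; ordering propagates down the chain, so the "exists"
> >         clause is forbidden:
> >
> >                 C ISA2+rel+acq
> >
> >                 {
> >                 }
> >
> >                 P0(int *x, int *y)
> >                 {
> >                         WRITE_ONCE(*x, 1);
> >                         smp_store_release(y, 1);
> >                 }
> >
> >                 P1(int *y, int *z)
> >                 {
> >                         int r0;
> >
> >                         r0 = smp_load_acquire(y);
> >                         smp_store_release(z, 1);
> >                 }
> >
> >                 P2(int *x, int *z)
> >                 {
> >                         int r0;
> >                         int r1;
> >
> >                         r0 = smp_load_acquire(z);
> >                         r1 = READ_ONCE(*x);
> >                 }
> >
> >                 exists (1:r0=1 /\ 2:r0=1 /\ 2:r1=0)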
> >
> >         Thoughts?
> >
> > 5.      SB
> >
> >         a.      smp_mb(), as in lockless wait-wakeup coordination.
> >                 And as in sys_membarrier()-scheduler coordination,
> >                 for that matter.
> >
> >                 Examples seem to be lacking.  Most cases use locking.
> >                 Here is one rather strange one from RCU:
> >
> >                 void call_rcu_tasks(struct rcu_head *rhp, rcu_callback_t func)
> >                 {
> >                         unsigned long flags;
> >                         bool needwake;
> >                         bool havetask = READ_ONCE(rcu_tasks_kthread_ptr);
> >
> >                         rhp->next = NULL;
> >                         rhp->func = func;
> >                         raw_spin_lock_irqsave(&rcu_tasks_cbs_lock, flags);
> >                         needwake = !rcu_tasks_cbs_head;
> >                         *rcu_tasks_cbs_tail = rhp;
> >                         rcu_tasks_cbs_tail = &rhp->next;
> >                         raw_spin_unlock_irqrestore(&rcu_tasks_cbs_lock, flags);
> >                         /* We can't create the thread unless interrupts are enabled. */
> >                         if ((needwake && havetask) ||
> >                             (!havetask && !irqs_disabled_flags(flags))) {
> >                                 rcu_spawn_tasks_kthread();
> >                                 wake_up(&rcu_tasks_cbs_wq);
> >                         }
> >                 }
> >
> >                 And for the wait side, using synchronize_sched() to supply
> >                 the barrier for both ends, with the preemption disabling
> >                 due to raw_spin_lock_irqsave() serving as the read-side
> >                 critical section:
> >
> >                 if (!list) {
> >                         wait_event_interruptible(rcu_tasks_cbs_wq,
> >                                                  rcu_tasks_cbs_head);
> >                         if (!rcu_tasks_cbs_head) {
> >                                 WARN_ON(signal_pending(current));
> >                                 schedule_timeout_interruptible(HZ/10);
> >                         }
> >                         continue;
> >                 }
> >                 synchronize_sched();
> >
> >                 -----------------
> >
> >                 Here is another one that uses atomic_cmpxchg() as a
> >                 full memory barrier:
> >
> >                 if (!wait_event_timeout(*wait, !atomic_read(stopping),
> >                                         msecs_to_jiffies(1000))) {
> >                         atomic_set(stopping, 0);
> >                         smp_mb();
> >                         return -ETIMEDOUT;
> >                 }
> >
> >                 int omap3isp_module_sync_is_stopping(wait_queue_head_t *wait,
> >                                                      atomic_t *stopping)
> >                 {
> >                         if (atomic_cmpxchg(stopping, 1, 0)) {
> >                                 wake_up(wait);
> >                                 return 1;
> >                         }
> >
> >                         return 0;
> >                 }
> >
> >                 -----------------
> >
> >                 And here is the generic pattern for the above two examples
> >                 taken from waitqueue_active() in include/linux/wait.h:
> >
> >                  *      CPU0 - waker                    CPU1 - waiter
> >                  *
> >                  *                                      for (;;) {
> >                  *      @cond = true;                     prepare_to_wait(&wq_head, &wait, state);
> >                  *      smp_mb();                         // smp_mb() from set_current_state()
> >                  *      if (waitqueue_active(wq_head))    if (@cond)
> >                  *        wake_up(wq_head);                 break;
> >                  *                                        schedule();
> >                  *                                      }
> >                  *                                      finish_wait(&wq_head, &wait);
> >
> >                 Note that prepare_to_wait() does both the write and
> >                 the set_current_state() that contains the smp_mb().
> >                 The read is the waitqueue_active() on the one hand and
> >                 the "if (@cond)" on the other.
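> >
> >         Stripped of the kernel details, the underlying pattern is the
> >         following (untested) SB sketch; the "exists" clause is
> >         forbidden, so at least one of the two reads must see the
> >         other side's store:
> >
> >                 C SB+mbs
> >
> >                 {
> >                 }
> >
> >                 P0(int *x, int *y)
> >                 {
> >                         int r0;
> >
> >                         WRITE_ONCE(*x, 1);
> >                         smp_mb();
> >                         r0 = READ_ONCE(*y);
> >                 }
> >
> >                 P1(int *x, int *y)
> >                 {
> >                         int r0;
> >
> >                         WRITE_ONCE(*y, 1);
> >                         smp_mb();
> >                         r0 = READ_ONCE(*x);
> >                 }
> >
> >                 exists (0:r0=0 /\ 1:r0=0)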
> >
> > 6.      W+RWC+porel+mb+mb
> >
> >         See recipes-LKcode-63cae12bce986.txt.
> >
> >         Mostly of historical interest -- as far as I know, this commit
> >         was the first to contain a litmus test.
> >
> > 7.      Context switch and migration.  A bit specialized, so might
> >         leave this one out.
> >
> >         When a thread moves from one CPU to another, the scheduler is
> >         required to do whatever is necessary for the thread to see any
> >         prior accesses that it executed on other CPUs.  This includes
> >         "interesting" interactions with wake_up() shown in the
> >         following comment from try_to_wake_up() in kernel/sched/core.c:
> >
> >          * Notes on Program-Order guarantees on SMP systems.
> >          *
> >          *  MIGRATION
> >          *
> >          * The basic program-order guarantee on SMP systems is that when a task [t]
> >          * migrates, all its activity on its old CPU [c0] happens-before any subsequent
> >          * execution on its new CPU [c1].
> >          *
> >          * For migration (of runnable tasks) this is provided by the following means:
> >          *
> >          *  A) UNLOCK of the rq(c0)->lock scheduling out task t
> >          *  B) migration for t is required to synchronize *both* rq(c0)->lock and
> >          *     rq(c1)->lock (if not at the same time, then in that order).
> >          *  C) LOCK of the rq(c1)->lock scheduling in task
> >          *
> >          * Transitivity guarantees that B happens after A and C after B.
> >          * Note: we only require RCpc transitivity.
> >          * Note: the CPU doing B need not be c0 or c1
> >          *
> >          * Example:
> >          *
> >          *   CPU0            CPU1            CPU2
> >          *
> >          *   LOCK rq(0)->lock
> >          *   sched-out X
> >          *   sched-in Y
> >          *   UNLOCK rq(0)->lock
> >          *
> >          *                                   LOCK rq(0)->lock // orders against CPU0
> >          *                                   dequeue X
> >          *                                   UNLOCK rq(0)->lock
> >          *
> >          *                                   LOCK rq(1)->lock
> >          *                                   enqueue X
> >          *                                   UNLOCK rq(1)->lock
> >          *
> >          *                   LOCK rq(1)->lock // orders against CPU2
> >          *                   sched-out Z
> >          *                   sched-in X
> >          *                   UNLOCK rq(1)->lock
>
> We might want to "specialize" this snippet to emphasize (or illustrate)
> a particular _cycle_; one possible way of "closing the chain" would be:
>
>   CPU0
>
>   smp_store_release(&X->on_cpu, 0); /* from sched-out X */
>   UNLOCK rq(0)->lock
>
>
>   CPU2
>
>   LOCK rq(0)->lock
>   UNLOCK rq(1)->lock
>
>
>   CPU1
>
>   LOCK rq(1)->lock
>   X->on_cpu = 1; /* from sched-in X */
>
>
>   BUG_ON("each LOCK reads from its UNLOCK" && X->on_cpu == 0);
>
> (cf. Z6.2 in test6.pdf)

I added this as a summary, pretty much in the form you have above.

> >          *
> >          *
> >          *  BLOCKING -- aka. SLEEP + WAKEUP
> >          *
> >          * For blocking we (obviously) need to provide the same guarantee as for
> >          * migration.  However the means are completely different as there is no lock
> >          * chain to provide order.  Instead we do:
> >          *
> >          *   1) smp_store_release(X->on_cpu, 0)
> >          *   2) smp_cond_load_acquire(!X->on_cpu)
> >          *
> >          * Example:
> >          *
> >          *   CPU0 (schedule)  CPU1 (try_to_wake_up) CPU2 (schedule)
> >          *
> >          *   LOCK rq(0)->lock LOCK X->pi_lock
> >          *   dequeue X
> >          *   sched-out X
> >          *   smp_store_release(X->on_cpu, 0);
> >          *
> >          *                    smp_cond_load_acquire(&X->on_cpu, !VAL);
> >          *                    X->state = WAKING
> >          *                    set_task_cpu(X,2)
> >          *
> >          *                    LOCK rq(2)->lock
> >          *                    enqueue X
> >          *                    X->state = RUNNING
> >          *                    UNLOCK rq(2)->lock
> >          *
> >          *                                          LOCK rq(2)->lock // orders against CPU1
> >          *                                          sched-out Z
> >          *                                          sched-in X
> >          *                                          UNLOCK rq(2)->lock
> >          *
> >          *                    UNLOCK X->pi_lock
> >          *   UNLOCK rq(0)->lock
>
> Similarly:
>
>   CPU0
>
>   smp_store_release(&X->on_cpu, 0); /* from sched-out X */
>
>
>   CPU1
>
>   smp_cond_load_acquire(&X->on_cpu, !VAL);
>   UNLOCK rq(2)->lock
>
>
>   CPU2
>
>   LOCK rq(2)->lock
>   X->on_cpu = 1; /* from sched-in X */
>
>
>   BUG_ON("each LOCK reads from its UNLOCK" && X->on_cpu == 0);
>
> (cf. WWC in test6.pdf)

And same here.

                                                        Thanx, Paul

> Andrea
>
>
> >          *
> >          *
> >          * However, for wakeups there is a second guarantee we must provide, namely we
> >          * must observe the state that led to our wakeup.  That is, not only must our
> >          * task observe its own prior state, it must also observe the stores prior to
> >          * its wakeup.
> >          *
> >          * This means that any means of doing remote wakeups must order the CPU doing
> >          * the wakeup against the CPU the task is going to end up running on.  This,
> >          * however, is already required for the regular Program-Order guarantee above,
> >          * since the waking CPU is the one issuing the ACQUIRE (smp_cond_load_acquire).
> >
> > commit 63cae12bce9861cec309798d34701cf3da20bc71
> > Author: Peter Zijlstra
> > Date:   Fri Dec 9 14:59:00 2016 +0100
> >
> >     perf/core: Fix sys_perf_event_open() vs. hotplug
> >
> >     There is a problem with installing an event in a task that is
> >     'stuck' on an offline CPU.
> >
> >     Blocked tasks are not disassociated from offlined CPUs; after all,
> >     a blocked task doesn't run and doesn't require a CPU etc..  Only on
> >     wakeup do we amend the situation and place the task on an available
> >     CPU.
> >
> >     If we hit such a task with perf_install_in_context(), we'll loop
> >     until either that task wakes up or the CPU comes back online.  If
> >     the task waking depends on the event being installed, we're stuck.
> >
> >     While looking into this issue, I also spotted another problem: if
> >     we hit a task with perf_install_in_context() that is in the middle
> >     of being migrated, that is, we observe the old CPU before sending
> >     the IPI but run the IPI (on the old CPU) while the task is already
> >     running on the new CPU, things also go sideways.
> >
> >     Rework things to rely on task_curr() -- outside of rq->lock --
> >     which is rather tricky.  Imagine the following scenario where we're
> >     trying to install the first event into our task 't':
> >
> >     CPU0            CPU1                    CPU2
> >
> >                     (current == t)
> >
> >     t->perf_event_ctxp[] = ctx;
> >     smp_mb();
> >     cpu = task_cpu(t);
> >
> >                     switch(t, n);
> >                                             migrate(t, 2);
> >                     switch(p, t);
> >
> >                                             ctx = t->perf_event_ctxp[]; // must not be NULL
> >
> >     smp_function_call(cpu, ..);
> >
> >                     generic_exec_single()
> >                       func();
> >                         spin_lock(ctx->lock);
> >                         if (task_curr(t)) // false
> >
> >                         add_event_to_ctx();
> >                         spin_unlock(ctx->lock);
> >
> >                                             perf_event_context_sched_in();
> >                                               spin_lock(ctx->lock);
> >                                               // sees event
> >
> >     So it's CPU0's store of t->perf_event_ctxp[] that must not go
> >     'missing'.  Because if CPU2's load of that variable were to observe
> >     NULL, it would not try to schedule the ctx and we'd have a task
> >     running without its counter, which would be 'bad'.
> >
> >     As long as we observe !NULL, we'll acquire ctx->lock.  If we
> >     acquire it first and do not see the event yet, then CPU0 must
> >     observe task_curr() and retry.  If the install happens first, then
> >     we must see the event on sched-in and all is well.
> >
> >     I think we can translate the first part (until the 'must not be
> >     NULL') of the scenario to a litmus test like:
> >
> >       C C-peterz
> >
> >       {
> >       }
> >
> >       P0(int *x, int *y)
> >       {
> >               int r1;
> >
> >               WRITE_ONCE(*x, 1);
> >               smp_mb();
> >               r1 = READ_ONCE(*y);
> >       }
> >
> >       P1(int *y, int *z)
> >       {
> >               WRITE_ONCE(*y, 1);
> >               smp_store_release(z, 1);
> >       }
> >
> >       P2(int *x, int *z)
> >       {
> >               int r1;
> >               int r2;
> >
> >               r1 = smp_load_acquire(z);
> >               smp_mb();
> >               r2 = READ_ONCE(*x);
> >       }
> >
> >       exists
> >       (0:r1=0 /\ 2:r1=1 /\ 2:r2=0)
> >
> >     Where:
> >       x is perf_event_ctxp[],
> >       y is our task's CPU, and
> >       z is our task being placed on the rq of CPU2.
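> >
> >     [ Aside for the recipes document: assuming the herd7 tool and a
> >       Linux-kernel memory-model configuration file (called, say,
> >       linux-kernel.cfg), the output quoted below should be
> >       reproducible with something like:
> >
> >         herd7 -conf linux-kernel.cfg C-peterz.litmus
> >
> >       Treat both file names as placeholders. ]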
> >
> >     The P0 smp_mb() is the one added by this patch, ordering the store
> >     to perf_event_ctxp[] from find_get_context() and the load of
> >     task_cpu() in task_function_call().
> >
> >     The smp_store_release()/smp_load_acquire() pair models the RCpc
> >     locking of the rq->lock, and the smp_mb() of P2 is the context
> >     switch switching from whatever CPU2 was running to our task 't'.
> >
> >     This litmus test evaluates into:
> >
> >       Test C-peterz Allowed
> >       States 7
> >       0:r1=0; 2:r1=0; 2:r2=0;
> >       0:r1=0; 2:r1=0; 2:r2=1;
> >       0:r1=0; 2:r1=1; 2:r2=1;
> >       0:r1=1; 2:r1=0; 2:r2=0;
> >       0:r1=1; 2:r1=0; 2:r2=1;
> >       0:r1=1; 2:r1=1; 2:r2=0;
> >       0:r1=1; 2:r1=1; 2:r2=1;
> >       No
> >       Witnesses
> >       Positive: 0 Negative: 7
> >       Condition exists (0:r1=0 /\ 2:r1=1 /\ 2:r2=0)
> >       Observation C-peterz Never 0 7
> >       Hash=e427f41d9146b2a5445101d3e2fcaa34
> >
> >     And the strong and weak model agree.
> >
> >     Reported-by: Mark Rutland
> >     Tested-by: Mark Rutland
> >     Signed-off-by: Peter Zijlstra (Intel)
> >     Cc: Alexander Shishkin
> >     Cc: Arnaldo Carvalho de Melo
> >     Cc: Arnaldo Carvalho de Melo
> >     Cc: Jiri Olsa
> >     Cc: Linus Torvalds
> >     Cc: Peter Zijlstra
> >     Cc: Sebastian Andrzej Siewior
> >     Cc: Stephane Eranian
> >     Cc: Thomas Gleixner
> >     Cc: Vince Weaver
> >     Cc: Will Deacon
> >     Cc: jeremy.linton@arm.com
> >     Link: http://lkml.kernel.org/r/20161209135900.GU3174@twins.programming.kicks-ass.net
> >     Signed-off-by: Ingo Molnar
> >
> > diff --git a/kernel/events/core.c b/kernel/events/core.c
> > index ab15509fab8c..72ce7d63e561 100644
> > --- a/kernel/events/core.c
> > +++ b/kernel/events/core.c
> > @@ -2249,7 +2249,7 @@ static int __perf_install_in_context(void *info)
> >  	struct perf_event_context *ctx = event->ctx;
> >  	struct perf_cpu_context *cpuctx = __get_cpu_context(ctx);
> >  	struct perf_event_context *task_ctx = cpuctx->task_ctx;
> > -	bool activate = true;
> > +	bool reprogram = true;
> >  	int ret = 0;
> >  
> >  	raw_spin_lock(&cpuctx->ctx.lock);
> > @@ -2257,27 +2257,26 @@ static int __perf_install_in_context(void *info)
> >  		raw_spin_lock(&ctx->lock);
> >  		task_ctx = ctx;
> >  
> > -		/* If we're on the wrong CPU, try again */
> > -		if (task_cpu(ctx->task) != smp_processor_id()) {
> > -			ret = -ESRCH;
> > -			goto unlock;
> > -		}
> > +		reprogram = (ctx->task == current);
> >  
> >  		/*
> > -		 * If we're on the right CPU, see if the task we target is
> > -		 * current, if not we don't have to activate the ctx, a future
> > -		 * context switch will do that for us.
> > +		 * If the task is running, it must be running on this CPU,
> > +		 * otherwise we cannot reprogram things.
> > +		 *
> > +		 * If its not running, we don't care, ctx->lock will
> > +		 * serialize against it becoming runnable.
> >  		 */
> > -		if (ctx->task != current)
> > -			activate = false;
> > -		else
> > -			WARN_ON_ONCE(cpuctx->task_ctx && cpuctx->task_ctx != ctx);
> > +		if (task_curr(ctx->task) && !reprogram) {
> > +			ret = -ESRCH;
> > +			goto unlock;
> > +		}
> >  
> > +		WARN_ON_ONCE(reprogram && cpuctx->task_ctx && cpuctx->task_ctx != ctx);
> >  	} else if (task_ctx) {
> >  		raw_spin_lock(&task_ctx->lock);
> >  	}
> >  
> > -	if (activate) {
> > +	if (reprogram) {
> >  		ctx_sched_out(ctx, cpuctx, EVENT_TIME);
> >  		add_event_to_ctx(event, ctx);
> >  		ctx_resched(cpuctx, task_ctx);
> > @@ -2328,13 +2327,36 @@ perf_install_in_context(struct perf_event_context *ctx,
> >  	/*
> >  	 * Installing events is tricky because we cannot rely on ctx->is_active
> >  	 * to be set in case this is the nr_events 0 -> 1 transition.
> > +	 *
> > +	 * Instead we use task_curr(), which tells us if the task is running.
> > +	 * However, since we use task_curr() outside of rq::lock, we can race
> > +	 * against the actual state. This means the result can be wrong.
> > +	 *
> > +	 * If we get a false positive, we retry, this is harmless.
> > +	 *
> > +	 * If we get a false negative, things are complicated. If we are after
> > +	 * perf_event_context_sched_in() ctx::lock will serialize us, and the
> > +	 * value must be correct. If we're before, it doesn't matter since
> > +	 * perf_event_context_sched_in() will program the counter.
> > +	 *
> > +	 * However, this hinges on the remote context switch having observed
> > +	 * our task->perf_event_ctxp[] store, such that it will in fact take
> > +	 * ctx::lock in perf_event_context_sched_in().
> > +	 *
> > +	 * We do this by task_function_call(), if the IPI fails to hit the task
> > +	 * we know any future context switch of task must see the
> > +	 * perf_event_ctpx[] store.
> >  	 */
> > -again:
> > +
> >  	/*
> > -	 * Cannot use task_function_call() because we need to run on the task's
> > -	 * CPU regardless of whether its current or not.
> > +	 * This smp_mb() orders the task->perf_event_ctxp[] store with the
> > +	 * task_cpu() load, such that if the IPI then does not find the task
> > +	 * running, a future context switch of that task must observe the
> > +	 * store.
> >  	 */
> > -	if (!cpu_function_call(task_cpu(task), __perf_install_in_context, event))
> > +	smp_mb();
> > +again:
> > +	if (!task_function_call(task, __perf_install_in_context, event))
> >  		return;
> >  
> >  	raw_spin_lock_irq(&ctx->lock);
> > @@ -2348,12 +2370,16 @@ again:
> >  		raw_spin_unlock_irq(&ctx->lock);
> >  		return;
> >  	}
> > -	raw_spin_unlock_irq(&ctx->lock);
> >  	/*
> > -	 * Since !ctx->is_active doesn't mean anything, we must IPI
> > -	 * unconditionally.
> > +	 * If the task is not running, ctx->lock will avoid it becoming so,
> > +	 * thus we can safely install the event.
> >  	 */
> > -	goto again;
> > +	if (task_curr(task)) {
> > +		raw_spin_unlock_irq(&ctx->lock);
> > +		goto again;
> > +	}
> > +	add_event_to_ctx(event, ctx);
> > +	raw_spin_unlock_irq(&ctx->lock);
> >  }
> >  
> >  /*
>